Landscape Review & Decision Framework
The AI agent memory landscape has matured rapidly. Ten significant frameworks exist, three academic papers provide theoretical grounding, and expert opinion is genuinely divided on key architectural questions. After reviewing 50+ sources, the core finding is this: Luci's current markdown-based memory system is architecturally sound and aligned with the emerging expert consensus (Karpathy, Willison, Harrison Chase all converge on file-based, human-readable memory). The gaps are not in the storage model but in three specific capabilities: (1) entity extraction and compiled people-pages, (2) overnight consolidation/dream cycles, and (3) hybrid retrieval beyond grep.
Recommendation: Keep markdown as source of truth. Add a lightweight dream cycle for consolidation. Add compiled entity pages. Optionally add SQLite FTS5 for hybrid search. Do not adopt GBrain wholesale — steal the "compiled truth + timeline" page pattern and the dream cycle concept instead.
| System | Architecture | Storage | Retrieval | Open Source | Production-Ready |
|---|---|---|---|---|---|
| GBrain (Garry Tan) | Markdown + pgvector hybrid | Git-backed .md files + Postgres | Hybrid: vector + keyword + RRF | Yes (MIT) | Personal use |
| MemGPT / Letta | OS-inspired tiered memory | Message buffer + core + recall + archival | Agent-managed paging | Yes | Yes (Letta platform) |
| Mem0 | Universal memory layer | Multi-scope (user/agent/session/org) | Graph + vector + keyword | Yes | Yes (186M API calls/quarter) |
| LangMem | SDK for agent learning | LangGraph BaseStore | Semantic + episodic + procedural | Yes | LangChain ecosystem |
| LlamaIndex | Composable memory modules | Pluggable backends | Vector + summary + composable | Yes | RAG-heavy workflows |
| Obsidian + MCP | Vault as memory backend | .md files in folders | BM25 or grep | Community | Personal use |
| Karpathy LLM Wiki | Compiled markdown knowledge base | raw/ + wiki/ + schema | Optional BM25/hybrid | Concept (gist) | Concept |
| OpenAI Assistants | Server-managed threads | OpenAI servers | Full-thread re-processing | No | Being deprecated (Aug 2026) |
| Anthropic Memory | File-based, client-side | /memories directory | Glob/grep + compaction | Partial (MCP) | Yes (Claude Code) |
| Devin (Cognition) | DeepWiki + proprietary | Proprietary | Code-specialized indexing | No | Enterprise ($500/mo) |
Convergence on files: Three independent high-profile systems (Claude Code, GBrain, Manus) converged on markdown files as the memory substrate. Harrison Chase (LangChain) explicitly stated: "I very, very strongly believe that if you're building a long-horizon agent, you need to give it access to a file system." This is not a coincidence — files are inspectable, versioned, portable, and work with every tool.
MemGPT's conceptual influence: Even systems that don't use MemGPT's code adopt its mental model — tiered memory with working/short-term/long-term layers and agent-managed promotion between tiers. This is the dominant conceptual framework in the field.
Mem0 is the production benchmark: With 186M API calls/quarter and published benchmarks (26% higher accuracy than OpenAI's memory, plus 91% lower p95 latency and 90% token cost savings relative to full-context prompting), Mem0 is the system to beat on quantitative metrics. Its graph memory variant adds entity relationship tracking.
GBrain's specific contribution: The "compiled truth + timeline" page model is genuinely novel. Each entity page has a rewritable top section (current understanding) and an append-only bottom section (evidence trail with dates). This separates the compiled state from the raw evidence — you can always trace why the system believes something.
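To make the two-part page concrete, here is a minimal sketch of the pattern in Python. It is not GBrain's implementation: the `EntityPage` class, its method names, and the "Compiled truth" / "Timeline" section headings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class EntityPage:
    """Hypothetical model of a 'compiled truth + timeline' entity page."""
    name: str
    compiled_truth: str = ""  # rewritable: the current understanding
    timeline: list[tuple[date, str]] = field(default_factory=list)  # append-only evidence

    def add_evidence(self, when: date, note: str) -> None:
        """Append a dated observation; existing entries are never edited."""
        self.timeline.append((when, note))

    def recompile(self, summary: str) -> None:
        """Overwrite the current understanding; the timeline is left untouched."""
        self.compiled_truth = summary

    def to_markdown(self) -> str:
        """Render the page with the compiled section above the evidence trail."""
        lines = [f"# {self.name}", "", "## Compiled truth", self.compiled_truth, "", "## Timeline"]
        lines += [f"- {when.isoformat()}: {note}" for when, note in self.timeline]
        return "\n".join(lines) + "\n"


page = EntityPage("Example Person")
page.add_evidence(date(2025, 1, 10), "Mentioned a move to a new city.")
page.recompile("Recently relocated; details still unconfirmed.")
print(page.to_markdown())
```

Because `recompile` never touches `timeline`, any statement in the top section can be checked against the dated entries below it.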
Luci currently uses grep + glob (keyword search over markdown files). This is equivalent to basic keyword retrieval with no semantic matching and no ranking. It works because:
The gap: as the corpus grows, grep won't surface semantically related memories that don't share exact keywords.
Add SQLite FTS5 as a second retrieval path. This is the highest-value, lowest-cost upgrade:
CREATE VIRTUAL TABLE memory_fts USING fts5(...)
Skip vector search for now. At Luci's scale (<1,000 memory entries), FTS5 + the existing grep/glob is sufficient. Vector search adds value at 10K+ entries. Revisit if the corpus grows.
Skip graph memory. Entity relationships can be captured in compiled entity pages (markdown links) rather than a graph database. The maintenance overhead of Neo4j/Graphiti is not justified for a single-user agent.
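As a rough illustration of the FTS5 retrieval path recommended above, the sketch below builds an in-memory index with Python's standard `sqlite3` module and ranks matches with FTS5's built-in `bm25()` function. The column names `path` and `body`, the helper names, and the sample documents are assumptions for the example; the source elides the real schema. It also assumes the local SQLite build has FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3


def build_index(docs):
    """Create an in-memory FTS5 index from (path, body) pairs."""
    conn = sqlite3.connect(":memory:")
    # Column names are illustrative; the source only shows fts5(...).
    conn.execute("CREATE VIRTUAL TABLE memory_fts USING fts5(path, body)")
    conn.executemany("INSERT INTO memory_fts (path, body) VALUES (?, ?)", docs)
    return conn


def search(conn, query, limit=5):
    """Return matching paths, best BM25 match first (lower bm25() is better)."""
    rows = conn.execute(
        "SELECT path FROM memory_fts WHERE memory_fts MATCH ? "
        "ORDER BY bm25(memory_fts) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [path for (path,) in rows]


# Made-up sample memories, standing in for markdown files on disk.
docs = [
    ("people/alice.md", "Alice prefers email over phone calls."),
    ("projects/garden.md", "Garden redesign project: order plants in spring."),
    ("people/bob.md", "Bob calls on Fridays about the garden project."),
]
conn = build_index(docs)
print(search(conn, "garden"))
```

Note that the `MATCH` argument uses FTS5 query syntax, so free-form user input containing quotes or operators may need escaping before it is passed to `search`.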
Keep markdown-as-source-of-truth. Add compiled entity pages as a second layer.
The compiled entity page pattern from GBrain/Karpathy is the single highest-value addition to Luci's memory. Here's why:
~/.claude/memory/entities/ directory.
Use Claude itself for entity extraction during the dream cycle. The pipeline:
This is Tier 2 work — clear requirements, known patterns. The dream cycle script could be a Python scheduler task that calls Claude via the API for extraction.
| System | What Happens | When | Measured Impact |
|---|---|---|---|
| Stanford Generative Agents | Reflection: synthesize observations into abstract statements | Inline (importance threshold) | "Single biggest contributor to believable behavior" |
| MemGPT/Letta Sleep-time Compute | Pre-process context, anticipate queries, pre-compute reasoning | Between sessions (idle time) | ~5x compute reduction, up to 13% accuracy improvement |
| GBrain | Scan conversations, enrich entities, fix citations, consolidate | Overnight cron | Not formally measured |
| MemoryBank | Ebbinghaus forgetting curves — decay unaccessed memories | Continuous | Personality adaptation over time |
| A-MEM | Memory evolution — new memories trigger updates to old ones | On new memory arrival | SOTA on 6 models |
| Claude Code Auto-Dream | Prune, merge, refresh memory files | Idle time | Not formally measured |
The sleep-time compute work from Letta is the strongest academic justification for dream cycles. Key findings:
The insight: decompose the prompt into static context (pre-processable between sessions) and dynamic query (real-time). Sleep-time compute enriches the static context during idle periods.
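The static/dynamic split can be sketched in a few lines. This is only an illustration of the decomposition, not the method from the paper: `sleep_time_prepare` stands in for an offline LLM call that would condense or enrich the notes, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PreparedContext:
    """Static context compiled during idle time (hypothetical structure)."""
    summary: str


def sleep_time_prepare(raw_notes: list[str]) -> PreparedContext:
    """Offline step: a placeholder for an LLM call that condenses raw notes."""
    summary = " ".join(note.strip() for note in raw_notes)
    return PreparedContext(summary=summary)


def build_prompt(prepared: PreparedContext, query: str) -> str:
    """Online step: combine the precomputed context with the live query."""
    return f"Context:\n{prepared.summary}\n\nQuestion:\n{query}"


# The expensive work happens once, between sessions...
prepared = sleep_time_prepare(["  Ticket 12 closed.  ", "New contact added."])
# ...and each real-time query reuses the result.
print(build_prompt(prepared, "What changed since yesterday?"))
```

The design point is that `sleep_time_prepare` runs once per idle period, while `build_prompt` runs per query and does no heavy processing.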
A lightweight dream cycle as a scheduler task (nightly or every 6 hours):
Phase 1 — Scan: Read recent activity_log entries, new emails, new WhatsApp messages, completed ticket work.
Phase 2 — Extract: Use Claude API to extract entities, facts, and relationships from new data.
Phase 3 — Compile: Update entity pages in ~/.claude/memory/entities/. Update/merge existing memory files. Flag contradictions for human review.
Phase 4 — Prune: Apply forgetting curves — memories not accessed in 30+ days get moved to an archive. Stale project memories (completed projects) get marked as historical.
Phase 5 — Index: Rebuild FTS5 index from all memory files. Update MEMORY.md index.
Phase 6 — Report: Log what changed to activity_log. Optionally send a Telegram summary: "Dream cycle complete: 3 entity pages updated, 2 memories pruned, 1 contradiction flagged."
This is the GBrain dream cycle pattern adapted for Luci's infrastructure — no Postgres, no pgvector, just markdown + SQLite + Claude API.
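Of the six phases, Phase 4 is the most mechanical, so it makes a reasonable first sketch. The code below assumes last-access dates are already tracked somewhere (here a plain dict keyed by file name, which is an assumption, as are the function names); it selects memories untouched for 30 or more days and moves them to an archive directory.

```python
import shutil
from datetime import date, timedelta
from pathlib import Path


def select_stale(last_access: dict[str, date], today: date, max_age_days: int = 30) -> list[str]:
    """Return file names last accessed at least max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, seen in last_access.items() if seen <= cutoff)


def archive(memory_dir: Path, archive_dir: Path, names: list[str]) -> None:
    """Move the named memory files into the archive directory."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    for name in names:
        shutil.move(str(memory_dir / name), str(archive_dir / name))


# Example selection with made-up access dates.
today = date(2025, 3, 31)
access_log = {
    "old_project.md": date(2025, 1, 5),
    "active_contact.md": date(2025, 3, 28),
}
print(select_stale(access_log, today))
```

Keeping selection separate from the file move means the dream cycle can log or report the candidate list (Phase 6) before anything is actually archived.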
| Position | Expert |
|---|---|
| RAG is dead — use compiled markdown wikis | Karpathy |
| RAG is NOT dead — economics and context rot | Hamel Husain |
| Context expands to fill limits — RAG stays relevant | Chip Huyen |
| RAG is a good hack, fine-tuning may matter more long-term | Jerry Liu |
| Long inputs > short prompts, but keep infra simple | Simon Willison |
| Position | Expert |
|---|---|
| Single-vector embeddings lose critical info; use ColBERT | Hamel Husain |
| No vector DB needed — grep, file trees work better for code | MindStudio analysis |
| Faceted search with structured extraction > top-k vector | Jason Liu |
| Hybrid is fine, optional not required | Karpathy |
| Position | Expert |
|---|---|
| Graph databases unnecessary — CSV or Postgres suffice | Hamel Husain |
| Significant operational overhead, not every query benefits | Mem0 analysis |
| Graph adds value for multi-hop entity reasoning | Neo4j/Graphiti advocates |
| Zep/Graphiti: 94.8% DMR, 18.5% accuracy improvement | Rasmussen et al. |
Three strong arguments:
Counter-arguments:
| Criterion | Keep Current Markdown | Add Entity Pages + Dream Cycle | Adopt GBrain | Adopt Mem0 | Build Custom (pgvector + graph) |
|---|---|---|---|---|---|
| Setup cost | Zero (already working) | Low (Python script + scheduler task) | Medium (Postgres, PGLite, or Supabase) | Medium (API integration or self-host) | High (Postgres + pgvector + schema) |
| Retrieval latency | <1ms (grep) | <1ms (grep + FTS5) | 5-50ms (hybrid pgvector + FTS) | ~1s median (hosted), faster self-hosted | 5-50ms |
| Recall quality | Good for exact, poor for semantic | Good for exact + BM25 ranked | Very good (hybrid + RRF) | Best measured (26% over OpenAI) | Depends on implementation |
| Maintenance | Manual MEMORY.md updates | Dream cycle automates most | Must maintain Postgres or PGLite | API dependency or self-host complexity | High ongoing effort |
| Integration with Claude Code | Native (Read/Write/Grep tools) | Native + scheduled task | Requires MCP bridge or API wrapper | Requires API integration | Custom MCP server |
| Inspectability | Excellent (plain markdown) | Excellent (still markdown) | Good (markdown + Postgres) | Poor (opaque memory store) | Depends on implementation |
| Entity tracking | Manual memory files | Auto-compiled entity pages | Built-in (compiled truth + timeline) | Built-in (graph memory) | Custom entity extraction |
| Dream cycles | None | Nightly consolidation script | Built-in (overnight crons) | Not documented | Custom implementation |
| Portability | git clone | git clone | Postgres dependency | API dependency | Postgres dependency |
| Elmar can edit | Yes (markdown) | Yes (markdown) | Partially (markdown, not Postgres) | No | Partially |
The evidence strongly supports keeping markdown as the foundation and adding three specific capabilities:
~/.claude/memory/entities/
What NOT to build:
| Pattern | Source | How to Apply |
|---|---|---|
| Compiled truth + timeline pages | GBrain | Entity page format with rewritable top + append-only bottom |
| Sleep-time compute / dream cycles | Letta paper + GBrain crons | Nightly scheduler task for consolidation |
| Ebbinghaus forgetting curves | MemoryBank | Decay weight on memories by last-access date |
| Memory evolution | A-MEM | New facts trigger updates to existing entity pages |
| Reflection | Stanford Generative Agents | Dream cycle synthesizes observations into higher-level insights |
| Zettelkasten linking | A-MEM | Entity pages cross-link to related entities |
| Context engineering | Anthropic | Keep memory lean — "smallest set of high-signal tokens" |
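The "decay weight by last-access date" row can be illustrated with the usual exponential form of the Ebbinghaus curve, R = exp(-t / S), where t is the time since last access and S is a memory-strength constant. The sketch below is an assumption-laden example rather than MemoryBank's exact formulation: the function name and the default of 30 days for S are hypothetical choices.

```python
import math
from datetime import date


def retention(last_access: date, today: date, strength_days: float = 30.0) -> float:
    """Exponential forgetting-curve weight R = exp(-t / S), with t in days."""
    elapsed = max((today - last_access).days, 0)  # clamp future dates to zero
    return math.exp(-elapsed / strength_days)


today = date(2025, 3, 31)
for last in (date(2025, 3, 31), date(2025, 3, 1), date(2024, 12, 31)):
    print(last.isoformat(), round(retention(last, today), 3))
```

A weight like this could be multiplied into a retrieval score, or compared against a threshold to decide which memories the dream cycle archives.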
RAG vs long context? At Luci's scale (<1,000 memory entries), long context wins. The entire memory corpus fits in the 1M token window. But Chip Huyen's Context Expansion Law applies — as Luci grows, we'll need retrieval. FTS5 is the cheapest hedge.
Vector DBs necessary? No, not yet. pgvectorscale is impressive (471 QPS at 99% recall), but FTS5 at <1ms for keyword search covers Luci's needs. Revisit at 10K+ entries.
Graph memory? No. Entity relationships are better captured as markdown links between compiled entity pages. The overhead of Neo4j/Graphiti is not justified for a single-user agent.
GBrain? Respect the ideas, don't adopt the stack. GBrain's best concepts (compiled truth + timeline, dream cycles, hybrid retrieval) can be implemented on Luci's existing markdown + SQLite infrastructure without Postgres.
Is markdown enough? Yes, with enhancements. Karpathy, Willison, Chase, and Anthropic all validate this approach. The failures reported in production (Boschi 2026) are at scales (500+ files, multi-hop cross-file reasoning) that Luci hasn't hit yet. When we do, FTS5 + entity pages + dream cycles provide the safety net.