Memory Systems for AI Agents — Landscape Review & Decision Framework

Prepared for: Luci (self-assessment) and Elmar (decision-maker)
Date: 2026-04-11
Triggered by: GBrain release (Garry Tan, 2026-04-10)
NotebookLM notebook: "Memory Systems for AI Agents — Landscape Review" (51 sources)

Executive Summary

The AI agent memory landscape has matured rapidly. Ten significant frameworks exist, three academic papers provide theoretical grounding, and expert opinion is genuinely divided on key architectural questions. After reviewing 50+ sources, the core finding is this: Luci's current markdown-based memory system is architecturally sound and aligned with the emerging expert consensus (Karpathy, Willison, Harrison Chase all converge on file-based, human-readable memory). The gaps are not in the storage model but in three specific capabilities: (1) entity extraction and compiled people-pages, (2) overnight consolidation/dream cycles, and (3) hybrid retrieval beyond grep.

Recommendation: Keep markdown as source of truth. Add a lightweight dream cycle for consolidation. Add compiled entity pages. Optionally add SQLite FTS5 for hybrid search. Do not adopt GBrain wholesale — steal the "compiled truth + timeline" page pattern and the dream cycle concept instead.

1. Current Landscape

The Ten Systems

System	Architecture	Storage	Retrieval	Open Source	Production-Ready
GBrain (Garry Tan)	Markdown + pgvector hybrid	Git-backed .md files + Postgres	Hybrid: vector + keyword + RRF	Yes (MIT)	Personal use
MemGPT / Letta	OS-inspired tiered memory	Message buffer + core + recall + archival	Agent-managed paging	Yes	Yes (Letta platform)
Mem0	Universal memory layer	Multi-scope (user/agent/session/org)	Graph + vector + keyword	Yes	Yes (186M API calls/quarter)
LangMem	SDK for agent learning	LangGraph BaseStore	Semantic + episodic + procedural	Yes	LangChain ecosystem
LlamaIndex	Composable memory modules	Pluggable backends	Vector + summary + composable	Yes	RAG-heavy workflows
Obsidian + MCP	Vault as memory backend	.md files in folders	BM25 or grep	Community	Personal use
Karpathy LLM Wiki	Compiled markdown knowledge base	raw/ + wiki/ + schema	Optional BM25/hybrid	Concept (gist)	Concept
OpenAI Assistants	Server-managed threads	OpenAI servers	Full-thread re-processing	No	Being deprecated (Aug 2026)
Anthropic Memory	File-based, client-side	/memories directory	Glob/grep + compaction	Partial (MCP)	Yes (Claude Code)
Devin (Cognition)	DeepWiki + proprietary	Proprietary	Code-specialized indexing	No	Enterprise ($500/mo)

Key Observations

Convergence on files: Three independent high-profile systems (Claude Code, GBrain, Manus) converged on markdown files as the memory substrate. Harrison Chase (LangChain) explicitly stated: "I very, very strongly believe that if you're building a long-horizon agent, you need to give it access to a file system." This is not a coincidence — files are inspectable, versioned, portable, and work with every tool.

MemGPT's conceptual influence: Even systems that don't use MemGPT's code adopt its mental model — tiered memory with working/short-term/long-term layers and agent-managed promotion between tiers. This is the dominant conceptual framework in the field.

Mem0 is the production benchmark: With 186M API calls/quarter and published benchmarks (26% accuracy improvement over OpenAI, 91% lower p95 latency, 90% token cost savings), Mem0 is the system to beat on quantitative metrics. Its graph memory variant adds entity relationship tracking.

GBrain's specific contribution: The "compiled truth + timeline" page model is genuinely novel. Each entity page has a rewritable top section (current understanding) and an append-only bottom section (evidence trail with dates). This separates the compiled state from the raw evidence — you can always trace why the system believes something.

2. Retrieval Architectures

The Four Approaches

Vector-only (pgvector, Pinecone, Weaviate, Chroma):
- pgvector: 5-8ms HNSW queries at <5M vectors; pgvectorscale hit 471 QPS at 99% recall on 50M vectors
- Best for semantic similarity ("planned parenthood" → "reproductive rights")
- Worst for exact matches (entity names, error messages, project codes)
- Cost: managed vector DBs run $50-200/GB/month vs $0.02/GB/month for local disk

Keyword-only (FTS5, Elasticsearch):
- SQLite FTS5: sub-millisecond BM25 search, zero dependencies, single-file deployment
- Best for exact intent: "what you search is what you get"
- A contrarian benchmark showed FTS5 beating Pinecone on 4,300 agent memories: <1ms vs 50-200ms
- Worst for fuzzy semantic matching

Hybrid (RRF + reranking):
- BM25 + dense retrieval fused via Reciprocal Rank Fusion (RRF), then cross-encoder reranking
- TREC iKAT 2025: nDCG@10 improved from 0.4218 to 0.4425 with RRF before reranking
- The consensus "best pipeline": BM25 + vector → RRF → cross-encoder reranker → top-5
- GBrain uses this pattern natively with pgvector + Postgres FTS
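The fusion step in the hybrid pipeline can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion, not GBrain's implementation: the file names are hypothetical, and k=60 is the constant commonly used for RRF rather than a value taken from any system reviewed here.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]

# Hypothetical result lists from a keyword search and a vector search.
bm25_hits = ["stephan.md", "nmg.md", "budget.md"]
vector_hits = ["nmg.md", "roadmap.md", "stephan.md"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

RRF only looks at rank positions, so the BM25 and vector scores never need to be put on a common scale. A cross-encoder reranker, if used, would then reorder the fused list.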

Graph-based (Neo4j, GraphRAG, Graphiti):
- Microsoft GraphRAG: an LLM builds an entity knowledge graph and pregenerates community summaries
- Zep/Graphiti: 94.8% on the DMR benchmark (vs MemGPT's 93.4%), bi-temporal model
- Best for multi-hop reasoning across entities
- Worst for setup cost and maintenance; Hamel Husain: "Graph databases are unnecessary — simpler solutions like CSV or Postgres typically suffice"

What Luci Uses Today

Luci currently uses grep + glob (keyword search over markdown files). This is equivalent to basic keyword retrieval with no semantic matching and no ranking. It works because:
1. The corpus is small (~200 memory files, ~50 vault files)
2. Entity names are explicit and greppable
3. The MEMORY.md index provides a manual routing layer

The gap: as the corpus grows, grep won't surface semantically related memories that don't share exact keywords.

Recommendation for Luci

Add SQLite FTS5 as a second retrieval path. This is the highest-value, lowest-cost upgrade:
- Zero new dependencies (SQLite is already used for mc.db, vault.db, email.db)
- Sub-millisecond BM25 ranking
- Can index all memory files, vault entries, email subjects, and WhatsApp messages
- Trivial to implement: CREATE VIRTUAL TABLE memory_fts USING fts5(...)
- Does NOT require a vector database, embedding model, or GPU
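A minimal sketch of what the FTS5 path could look like, using Python's standard sqlite3 module. The table name memory_fts comes from the recommendation above; the column layout, sample documents, and helper names are assumptions for illustration, and the sketch presumes the local SQLite build has FTS5 compiled in.

```python
import sqlite3

# Hypothetical sample documents: (path, body). Real input would be the memory files.
DOCS = [
    ("memory/dream-cycle.md", "Nightly consolidation merges and prunes memory files."),
    ("memory/retrieval.md", "Hybrid retrieval fuses keyword and vector results."),
    ("memory/entities/stephan.md", "Stephan works on the NMG project."),
]

def build_index(docs):
    """Create an in-memory FTS5 index; path is stored but not searched."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE memory_fts USING fts5(path UNINDEXED, body)")
    conn.executemany("INSERT INTO memory_fts (path, body) VALUES (?, ?)", docs)
    return conn

def search(conn, query, limit=5):
    """Return (path, score) pairs, best match first.

    bm25() returns lower values for better matches, so ascending order
    puts the most relevant rows first.
    """
    return conn.execute(
        "SELECT path, bm25(memory_fts) AS score "
        "FROM memory_fts WHERE memory_fts MATCH ? "
        "ORDER BY score LIMIT ?",
        (query, limit),
    ).fetchall()

conn = build_index(DOCS)
hits = [path for path, _ in search(conn, "consolidation")]
entity_hits = [path for path, _ in search(conn, "nmg")]
```

MATCH parses its own query syntax, so free-form input containing hyphens, colons, or quotes would need to be wrapped as an FTS5 string literal before being passed to search().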

Skip vector search for now. At Luci's scale (<1,000 memory entries), FTS5 + the existing grep/glob is sufficient. Vector search adds value at 10K+ entries. Revisit if the corpus grows.

Skip graph memory. Entity relationships can be captured in compiled entity pages (markdown links) rather than a graph database. The maintenance overhead of Neo4j/Graphiti is not justified for a single-user agent.

3. Data Model Patterns

The Four Models

Markdown-as-source-of-truth (Luci's current model, Karpathy's LLM Wiki, Claude Code):
- Files are the canonical store. The LLM reads, writes, and maintains them.
- Git provides versioning and an audit trail.
- Human-readable and editable: Elmar can inspect and correct any memory.
- Scales poorly beyond ~10K notes without indexing. Temporal queries ("what happened last week?") require additional metadata.

Append-only event logs (MemGPT/Letta):
- Every interaction is an immutable event. The agent pages relevant events into working context.
- Preserves full history; nothing is lost.
- Requires active management (the agent must decide what to promote/archive).
- Can grow unbounded without consolidation.

Compiled entity pages (GBrain, Karpathy Wiki):
- Auto-generated pages for people, projects, and concepts that aggregate all mentions.
- Top section: rewritable current understanding (compiled truth).
- Bottom section: append-only evidence trail with dates and sources.
- The knowledge compounds: each new source enriches existing pages.

Conversation threads (OpenAI Assistants, Memoria):
- Raw conversation history as the memory substrate.
- Simple but expensive (re-processes full thread per query).
- OpenAI's approach scored worst on the LOCOMO benchmark (52.9% accuracy).

What Luci Should Adopt

Keep markdown-as-source-of-truth. Add compiled entity pages as a second layer.

The compiled entity page pattern from GBrain/Karpathy is the single highest-value addition to Luci's memory. Here's why:

1. Luci already has the raw data. vault.db has entities, edges, and file references. email.db has contacts. whatsapp-messages.db has conversation history. The data exists but isn't compiled.
2. Entity pages solve the "who is X?" problem. When Elmar mentions "Stephan" or "Chazelle" or "NMG", Luci should be able to pull up a compiled page: who they are, recent interactions, key facts, contact details, linked projects.
3. The compilation can run as a dream cycle. A nightly script scans vault.db, email.db, and whatsapp-messages.db, extracts entity mentions, and compiles/updates markdown pages in a ~/.claude/memory/entities/ directory.
4. Markdown stays the source of truth. The compiled pages are derived artifacts, regenerable from the raw data. Git tracks changes. Elmar can inspect and correct.
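To make the page shape concrete, here is a minimal sketch of a compiled entity page with a rewritable top section and an append-only timeline. The class name, heading layout, and sample facts are hypothetical; GBrain's actual file format may differ.

```python
from dataclasses import dataclass, field

@dataclass
class EntityPage:
    """One compiled page: a rewritable summary plus an append-only evidence trail."""
    name: str
    compiled_truth: str = ""
    timeline: list = field(default_factory=list)  # (iso_date, source, note) tuples

    def rewrite_truth(self, text):
        # The top section is replaced wholesale whenever understanding changes.
        self.compiled_truth = text

    def add_evidence(self, iso_date, source, note):
        # The bottom section only ever grows; existing entries are never edited.
        self.timeline.append((iso_date, source, note))

    def render(self):
        lines = [f"# {self.name}", "", "## Compiled truth", "",
                 self.compiled_truth, "", "## Timeline", ""]
        lines += [f"- {d} [{src}] {note}" for d, src, note in self.timeline]
        return "\n".join(lines) + "\n"

# Illustrative data only; none of these facts come from a real source.
page = EntityPage("Stephan")
page.add_evidence("2026-04-09", "email.db", "Sent the NMG budget draft.")
page.add_evidence("2026-04-10", "whatsapp-messages.db", "Moved the review call to Friday.")
page.rewrite_truth("Works on the NMG project; review call scheduled for Friday.")
markdown = page.render()
```

Because the timeline is never rewritten, any statement in the compiled section can be checked against the dated entries beneath it.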

4. Entity Extraction and Enrichment

Current State of the Art

Few-shot LLM extraction now matches or beats supervised NER models without training data. GPT-4/Claude can extract entities, relationships, and temporal expressions with high accuracy.
Auto-disambiguation remains hard. "Apple" in tech vs food contexts. Joint NER + entity disambiguation (2025 state of art) helps but isn't perfect.
Temporal validity is an open problem. MemoryBank uses Ebbinghaus forgetting curves (unaccessed memories decay). A-MEM uses memory evolution (new facts update old memories). No system handles both gracefully.

Who Does It Well

Mem0: Graph memory variant with entity relationship tracking. 26% accuracy improvement.
A-MEM: Zettelkasten-inspired linked notes with auto-updating. SOTA on 6 foundation models.
Zep/Graphiti: Bi-temporal entity tracking (when it happened vs when it was ingested). 94.8% on DMR.

What Luci Should Do

Use Claude itself for entity extraction during the dream cycle. The pipeline:

1. Scan recent activity (email.db, whatsapp-messages.db, vault.db activity_log)
2. Extract entities: people, organizations, projects, with context
3. For each entity, check if a compiled page exists
   - If yes: update the compiled truth section, append to evidence trail
   - If no: create a new entity page from accumulated evidence
4. Cross-link entity pages (person → project, person → organization)

This is Tier 2 work — clear requirements, known patterns. The dream cycle script could be a Python scheduler task that calls Claude via the API for extraction.
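The create-or-append branch of that pipeline could look roughly like the sketch below. The extraction step is deliberately a stub: extract_entities() uses a fixed lookup as a hypothetical stand-in for the Claude call, and the activity rows, file layout, and function names are assumptions. Rewriting the compiled-truth section and cross-linking are left out because both depend on the LLM step.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the LLM call: a fixed lookup of names to entity types.
KNOWN_ENTITIES = {"Stephan": "person", "NMG": "organization"}

def extract_entities(text):
    """Return (name, kind) pairs mentioned in text. A real run would call an LLM here."""
    return [(name, kind) for name, kind in KNOWN_ENTITIES.items() if name in text]

def update_pages(entities_dir, activity):
    """Create or extend one markdown page per entity found in the activity rows."""
    entities_dir.mkdir(parents=True, exist_ok=True)
    stats = {"created": 0, "appended": 0}
    for iso_date, source, text in activity:
        for name, kind in extract_entities(text):
            page = entities_dir / f"{name.lower()}.md"
            evidence = f"- {iso_date} [{source}] {text}\n"
            if page.exists():
                # Existing page: append to the evidence trail only.
                with page.open("a", encoding="utf-8") as fh:
                    fh.write(evidence)
                stats["appended"] += 1
            else:
                # New page: seed the header and start the evidence trail.
                page.write_text(
                    f"# {name}\n\nType: {kind}\n\n## Timeline\n\n{evidence}",
                    encoding="utf-8",
                )
                stats["created"] += 1
    return stats

# Illustrative activity rows: (date, source, text).
activity = [
    ("2026-04-10", "email", "Stephan sent the NMG budget draft."),
    ("2026-04-11", "whatsapp", "Call with Stephan moved to Friday."),
]
with tempfile.TemporaryDirectory() as tmp:
    stats = update_pages(Path(tmp) / "entities", activity)
    stephan_page = (Path(tmp) / "entities" / "stephan.md").read_text(encoding="utf-8")
```

Keeping the extractor behind a single function makes it straightforward to swap the stub for an API call and to test the file handling without network access.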

5. Dream Cycles / Overnight Consolidation

Who Does It

System	What Happens	When	Measured Impact
Stanford Generative Agents	Reflection: synthesize observations into abstract statements	Inline (importance threshold)	"Single biggest contributor to believable behavior"
MemGPT/Letta Sleep-time Compute	Pre-process context, anticipate queries, pre-compute reasoning	Between sessions (idle time)	~5x compute reduction, up to 13% accuracy improvement
GBrain	Scan conversations, enrich entities, fix citations, consolidate	Overnight cron	Not formally measured
MemoryBank	Ebbinghaus forgetting curves — decay unaccessed memories	Continuous	Personality adaptation over time
A-MEM	Memory evolution — new memories trigger updates to old ones	On new memory arrival	SOTA on 6 models
Claude Code Auto-Dream	Prune, merge, refresh memory files	Idle time	Not formally measured

The Sleep-time Compute Paper (Lin et al., 2025)

This is the strongest academic justification for dream cycles. Key findings:
- ~5x reduction in test-time compute for equivalent accuracy
- Up to 13% accuracy improvement on Stateful GSM-Symbolic, 18% on Stateful AIME
- 2.5x cost reduction per query through amortization
- Effectiveness correlates with query predictability, so consolidation should focus on information the user is likely to ask about again

The insight: decompose the prompt into static context (pre-processable between sessions) and dynamic query (real-time). Sleep-time compute enriches the static context during idle periods.

What Luci Should Build

A lightweight dream cycle as a scheduler task (nightly or every 6 hours):

Phase 1 — Scan: Read recent activity_log entries, new emails, new WhatsApp messages, completed ticket work.

Phase 2 — Extract: Use Claude API to extract entities, facts, and relationships from new data.

Phase 3 — Compile: Update entity pages in ~/.claude/memory/entities/. Update/merge existing memory files. Flag contradictions for human review.

Phase 4 — Prune: Apply forgetting curves — memories not accessed in 30+ days get moved to an archive. Stale project memories (completed projects) get marked as historical.

Phase 5 — Index: Rebuild FTS5 index from all memory files. Update MEMORY.md index.

Phase 6 — Report: Log what changed to activity_log. Optionally send a Telegram summary: "Dream cycle complete: 3 entity pages updated, 2 memories pruned, 1 contradiction flagged."

This is the GBrain dream cycle pattern adapted for Luci's infrastructure — no Postgres, no pgvector, just markdown + SQLite + Claude API.
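Phase 4 can be sketched with a simple exponential forgetting curve. This is an illustration inspired by MemoryBank's use of Ebbinghaus decay, not its actual algorithm: the record fields, the 30-day base strength, and the rule that each access multiplies the strength are all assumptions. The threshold of exp(-1) is chosen so that a memory accessed only once is archived after more than 30 idle days, matching the rule above.

```python
import math
from datetime import date

def retention(days_idle, strength_days):
    """Exponential forgetting curve: 1.0 when just accessed, decaying toward 0."""
    return math.exp(-days_idle / strength_days)

def plan_prune(memories, today, base_strength_days=30.0, threshold=math.exp(-1)):
    """Split memory paths into (keep, archive) by estimated retention.

    Assumption: each recorded access multiplies the strength, so frequently
    used memories decay more slowly.
    """
    keep, archive = [], []
    for m in memories:
        days_idle = (today - m["last_access"]).days
        strength = base_strength_days * max(1, m["access_count"])
        if retention(days_idle, strength) < threshold:
            archive.append(m["path"])
        else:
            keep.append(m["path"])
    return keep, archive

# Hypothetical memory records; paths and dates are illustrative.
memories = [
    {"path": "memory/active-project.md", "last_access": date(2026, 4, 1), "access_count": 1},
    {"path": "memory/old-trip.md", "last_access": date(2026, 2, 1), "access_count": 1},
    {"path": "memory/core-preferences.md", "last_access": date(2026, 2, 1), "access_count": 3},
]
keep, archive = plan_prune(memories, today=date(2026, 4, 11))
```

The function only produces a plan. Moving files into an archive directory and logging the result would be separate steps, which keeps the pruning decision easy to review before anything is touched.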

6. Academic Foundations

The Three Essential Papers

1. Generative Agents (Park et al., Stanford, 2023)
- Introduced the memory stream + reflection + planning architecture
- Reflection was the single biggest quality contributor
- Retrieval scoring: a weighted sum of recency, importance, and relevance, each normalized to [0, 1] (all three needed)
- Limitation: no forgetting, memory grows unbounded

2. Sleep-time Compute (Lin et al., Letta, 2025)
- Academic foundation for dream cycles
- ~5x compute reduction, up to 13-18% accuracy improvement
- Key insight: amortize reasoning across queries during idle time
- Authors include the MemGPT founders, so the lineage is direct

3. Episodic Memory Position Paper (Pink et al., 2025)
- Argues Tulving's episodic/semantic distinction is critical for agents
- Episodic: raw events with timestamps ("Elmar said X on 2026-04-10")
- Semantic: extracted facts ("Elmar prefers direct communication")
- The consolidation pathway (episodic → semantic) is the dream cycle
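The Generative Agents retrieval score is easy to reproduce in miniature. The sketch below combines min-max normalized recency, importance, and relevance as a weighted sum with equal weights. The hourly decay factor of 0.995 is an assumption based on the paper's reported setting, relevance is passed in as a precomputed number rather than an embedding similarity, and the memory records are hypothetical.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def rank_memories(memories, decay=0.995, weights=(1.0, 1.0, 1.0)):
    """Order memory ids by a weighted sum of normalized recency, importance, relevance."""
    recency = min_max([decay ** m["hours_since_access"] for m in memories])
    importance = min_max([m["importance"] for m in memories])
    relevance = min_max([m["relevance"] for m in memories])
    w_rec, w_imp, w_rel = weights
    scored = [
        (w_rec * r + w_imp * i + w_rel * v, m["id"])
        for m, r, i, v in zip(memories, recency, importance, relevance)
    ]
    # Highest combined score first.
    return [mem_id for _, mem_id in sorted(scored, key=lambda s: -s[0])]

# Hypothetical memories: importance on a 1-10 scale, relevance already computed.
memories = [
    {"id": "fresh-and-relevant", "hours_since_access": 1, "importance": 3, "relevance": 0.9},
    {"id": "old-but-important", "hours_since_access": 100, "importance": 9, "relevance": 0.2},
    {"id": "stale", "hours_since_access": 500, "importance": 2, "relevance": 0.1},
]
ranking = rank_memories(memories)
```

Normalizing each component before summing keeps any one signal from dominating just because of its scale, which is why all three are needed for a balanced ranking.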

Supporting Papers

MemoryBank (Zhong et al., 2023): Ebbinghaus forgetting curves for AI memory. Unaccessed memories decay.
A-MEM (Xu et al., 2025): Zettelkasten method for agent memory. Memory evolution: new memories update old ones.
Recursively Summarizing (Wu et al., 2023): Hierarchical compression of conversation history. The algorithm for the dream cycle's "merge" operation.
Memory Survey (Zhang et al., 2024): 39-page taxonomy. Three dimensions: sources, forms, operations.
Memory in the Age of AI Agents (Hu et al., 2025): Most comprehensive survey. 47 authors.
Zep Paper (Rasmussen et al., 2025): Temporal knowledge graphs. Bi-temporal model (event time vs ingestion time).

7. Expert Commentary and Contested Questions

Where Experts Agree

Memory is essential for agents — every named expert agrees on this
Human-readable, inspectable memory is preferable — Karpathy, Willison, Chase, Anthropic all converge
File systems are a legitimate memory substrate — Chase: "give it access to a file system"; three independent projects converged on markdown
Consolidation/dream cycles add measurable value — Stanford reflections, Letta sleep-time compute, GBrain dream cycles

Where Experts Disagree

RAG vs Long Context:

Position	Expert
RAG is dead — use compiled markdown wikis	Karpathy
RAG is NOT dead — economics and context rot	Hamel Husain
Context expands to fill limits — RAG stays relevant	Chip Huyen
RAG is a good hack, fine-tuning may matter more long-term	Jerry Liu
Long inputs > short prompts, but keep infra simple	Simon Willison

Vector DBs — Necessary?

Position	Expert
Single-vector embeddings lose critical info; use ColBERT	Hamel Husain
No vector DB needed — grep, file trees work better for code	MindStudio analysis
Faceted search with structured extraction > top-k vector	Jason Liu
Hybrid is fine, optional not required	Karpathy

Graph Memory — Worth the Overhead?

Position	Expert
Graph databases unnecessary — CSV or Postgres suffice	Hamel Husain
Significant operational overhead, not every query benefits	Mem0 analysis
Graph adds value for multi-hop entity reasoning	Neo4j/Graphiti advocates
Zep/Graphiti: 94.8% on DMR, up to 18.5% accuracy improvement on LongMemEval	Rasmussen et al.

The Contrarian Case: "Just Use Markdown Files"

Three strong arguments:
1. Cost: $0.02/GB/month local disk vs $50-200/GB/month managed vector DBs
2. Inspectability: A human can read, edit, and correct the memory. No black-box embeddings.
3. Convergence: Claude Code, GBrain, and Manus all independently chose markdown. This is a strong signal.

Counter-arguments:
- Markdown doesn't scale past ~10K files without indexing
- No semantic search (grep misses conceptual matches)
- Temporal queries require additional metadata
- Multi-hop reasoning across files requires iterative LLM calls

Production Failure Lessons

Context rot: Models struggle with information buried in the middle of large contexts — a "dead zone" that expands with larger datasets (Boschi, 2026)
Manus context budget: ~30% overhead from managing state files
Retrieval misses are catastrophic for coding agents: Wrong chunks → broken code → cascading failures (MindStudio)
No implicit memory extraction: Systems capture explicit statements but miss behavioral patterns (swyx)

8. Tradeoffs Matrix for Luci

Criterion	Keep Current Markdown	Add Entity Pages + Dream Cycle	Adopt GBrain	Adopt Mem0	Build Custom (pgvector + graph)
Setup cost	Zero (already working)	Low (Python script + scheduler task)	Medium (Postgres, PGLite, or Supabase)	Medium (API integration or self-host)	High (Postgres + pgvector + schema)
Retrieval latency	<1ms (grep)	<1ms (grep + FTS5)	5-50ms (hybrid pgvector + FTS)	~1s median (hosted), faster self-hosted	5-50ms
Recall quality	Good for exact, poor for semantic	Good for exact + BM25 ranked	Very good (hybrid + RRF)	Best measured (26% over OpenAI)	Depends on implementation
Maintenance	Manual MEMORY.md updates	Dream cycle automates most	Must maintain Postgres + PGLite	API dependency or self-host complexity	High ongoing effort
Integration with Claude Code	Native (Read/Write/Grep tools)	Native + scheduled task	Requires MCP bridge or API wrapper	Requires API integration	Custom MCP server
Inspectability	Excellent (plain markdown)	Excellent (still markdown)	Good (markdown + Postgres)	Poor (opaque memory store)	Depends on implementation
Entity tracking	Manual memory files	Auto-compiled entity pages	Built-in (compiled truth + timeline)	Built-in (graph memory)	Custom entity extraction
Dream cycles	None	Nightly consolidation script	Built-in (overnight crons)	Not documented	Custom implementation
Portability	git clone	git clone	Postgres dependency	API dependency	Postgres dependency
Elmar can edit	Yes (markdown)	Yes (markdown)	Partially (markdown, not Postgres)	No	Partially

9. Decision Framework

The Four Options

Option A: Keep current markdown approach (do nothing)
- Pro: Working, simple, aligned with expert consensus
- Con: No entity compilation, no consolidation, no ranked retrieval
- Risk: Memory quality degrades as corpus grows
- Cost: Zero

Option B: Add entity pages + dream cycle + FTS5 (recommended)
- Pro: Highest value per effort, stays on markdown, adds the three missing capabilities
- Con: Requires building the dream cycle script and entity extraction pipeline
- Risk: Low; it's additive and doesn't change existing infra
- Cost: ~1 week of dev time (Tier 2 ticket)
- This is the "compiled people-pages from existing infra" option from the ticket

Option C: Adopt GBrain prototype
- Pro: Battle-tested by Garry Tan, comprehensive feature set, MIT licensed
- Con: Requires Postgres (PGLite or full), opinionated toward Garry's workflow, tightly coupled to OpenClaw ecosystem
- Risk: Medium; introduces a Postgres dependency and may not integrate cleanly with Claude Code's native file tools
- Cost: ~2-3 weeks including migration and adaptation

Option D: Adopt a different pattern (Mem0, Letta, etc.)
- Pro: Most mature production systems with benchmarks
- Con: API dependency (Mem0 hosted) or significant self-hosting complexity (Letta), reduced inspectability
- Risk: Medium-high; vendor lock-in or maintenance burden
- Cost: ~2-4 weeks

Recommendation: Option B — Enhanced Markdown

The evidence strongly supports keeping markdown as the foundation and adding three specific capabilities:

1. Compiled entity pages (steal from GBrain's "compiled truth + timeline" pattern)
   - Auto-generated markdown pages for people, projects, and organizations
   - Top section: current understanding (rewritable)
   - Bottom section: evidence trail with dates (append-only)
   - Stored in ~/.claude/memory/entities/
2. Dream cycle (steal from Letta's sleep-time compute + GBrain's overnight crons)
   - Nightly scheduler task
   - Scans recent activity across all data sources
   - Extracts entities, updates entity pages, prunes stale memories
   - Logs changes and optionally sends Telegram summary
3. SQLite FTS5 index (steal from Willison's SQLite-everything philosophy)
   - Index all memory files, entity pages, and MEMORY.md
   - BM25-ranked search as a retrieval path alongside grep
   - Zero new dependencies

What NOT to build:
- No vector database (not needed at Luci's scale)
- No graph database (entity relationships captured in markdown links)
- No external API dependencies (everything local)
- No Postgres (SQLite is already the standard on Luci)

Patterns to Steal Without Adopting the Whole Stack

Pattern	Source	How to Apply
Compiled truth + timeline pages	GBrain	Entity page format with rewritable top + append-only bottom
Sleep-time compute / dream cycles	Letta paper + GBrain crons	Nightly scheduler task for consolidation
Ebbinghaus forgetting curves	MemoryBank	Decay weight on memories by last-access date
Memory evolution	A-MEM	New facts trigger updates to existing entity pages
Reflection	Stanford Generative Agents	Dream cycle synthesizes observations into higher-level insights
Zettelkasten linking	A-MEM	Entity pages cross-link to related entities
Context engineering	Anthropic	Keep memory lean — "smallest set of high-signal tokens"

10. Contested Questions — Where I (Luci) Stand

RAG vs long context? At Luci's scale (<1,000 memory entries), long context wins. The entire memory corpus fits in the 1M token window. But Chip Huyen's observation that context expands to fill the available limit still applies: as Luci's corpus grows, retrieval will be needed. FTS5 is the cheapest hedge.
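Whether the corpus still fits can be checked with back-of-envelope arithmetic. The sketch below uses the rough heuristic of about four characters per token, which varies by tokenizer and language. The file count loosely follows the ~250 files mentioned earlier, while the 8,000-character average file size and the 25% reserve for instructions and output are purely hypothetical.

```python
def estimate_tokens(texts, chars_per_token=4.0):
    """Rough token estimate from character counts; not a real tokenizer."""
    total_chars = sum(len(t) for t in texts)
    return int(total_chars / chars_per_token)

def fits_in_window(texts, window_tokens=1_000_000, reserved_fraction=0.25):
    """True if the estimate fits after reserving room for prompts and output."""
    budget = int(window_tokens * (1.0 - reserved_fraction))
    return estimate_tokens(texts) <= budget

# Hypothetical corpus: 250 files of 8,000 characters each.
corpus = ["x" * 8_000] * 250
estimated = estimate_tokens(corpus)   # 2,000,000 chars / 4 = 500,000 tokens
fits = fits_in_window(corpus)         # budget is 750,000 tokens
```

Under these assumptions the corpus fits with room to spare, but doubling the file count would push it past the budget, which is the point at which a retrieval path starts to matter.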

Vector DBs necessary? No, not yet. pgvectorscale is impressive (471 QPS at 99% recall), but FTS5 at <1ms for keyword search covers Luci's needs. Revisit at 10K+ entries.

Graph memory? No. Entity relationships are better captured as markdown links between compiled entity pages. The overhead of Neo4j/Graphiti is not justified for a single-user agent.

GBrain? Respect the ideas, don't adopt the stack. GBrain's best concepts (compiled truth + timeline, dream cycles, hybrid retrieval) can be implemented on Luci's existing markdown + SQLite infrastructure without Postgres.

Is markdown enough? Yes, with enhancements. Karpathy, Willison, Chase, and Anthropic all validate this approach. The failures reported in production (Boschi 2026) are at scales (500+ files, multi-hop cross-file reasoning) that Luci hasn't hit yet. When we do, FTS5 + entity pages + dream cycles provide the safety net.

Sources

Academic Papers

Park et al. (2023). "Generative Agents: Interactive Simulacra of Human Behavior." https://arxiv.org/abs/2304.03442
Packer et al. (2023). "MemGPT: Towards LLMs as Operating Systems." https://arxiv.org/abs/2310.08560
Zhong et al. (2023). "MemoryBank: Enhancing Large Language Models with Long-Term Memory." https://arxiv.org/abs/2305.10250
Wu et al. (2023). "Recursively Summarizing Enables Long-Term Dialogue Memory." https://arxiv.org/abs/2308.15022
Xu et al. (2025). "A-MEM: Agentic Memory for LLM Agents." https://arxiv.org/abs/2502.12110
Pink et al. (2025). "Episodic Memory is the Missing Piece for Long-Term LLM Agents." https://arxiv.org/abs/2502.06975
Lin et al. (2025). "Sleep-time Compute: Beyond Inference Scaling at Test-time." https://arxiv.org/abs/2504.13171
Chhikara et al. (2025). "Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory." https://arxiv.org/abs/2504.19413
Zhang et al. (2024). "A Survey on the Memory Mechanism of Large Language Model based Agents." https://arxiv.org/abs/2404.13501
Hu et al. (2025). "Memory in the Age of AI Agents: A Survey." https://arxiv.org/abs/2512.13564
Edge et al. (2024). "From Local to Global: A Graph RAG Approach." https://arxiv.org/abs/2404.16130
Rasmussen et al. (2025). "Zep: A Temporal Knowledge Graph Architecture." https://arxiv.org/abs/2501.13956
Li et al. (2025). "Long Context vs. RAG for LLMs." https://arxiv.org/abs/2501.01880

Frameworks & Tools

GBrain. https://github.com/garrytan/gbrain
Letta. https://github.com/letta-ai/letta
Mem0. https://github.com/mem0ai/mem0
LangMem. https://github.com/langchain-ai/langmem
A-MEM. https://github.com/agiresearch/A-mem
Graphiti (Zep). https://github.com/getzep/graphiti
Microsoft GraphRAG. https://github.com/microsoft/graphrag
MCP Knowledge Graph Memory Server. https://github.com/modelcontextprotocol/servers/tree/main/src/memory

Expert Commentary

Karpathy. "LLM Wiki" gist. https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
Willison. "Things we learned about LLMs in 2024." https://simonw.substack.com/p/things-we-learned-about-llms-in-2024
Husain. "Stop Saying RAG Is Dead." https://hamel.dev/notes/llm/rag/not_dead.html
Jason Liu. "Beyond Chunks: Context Engineering." https://jxnl.co/writing/2025/08/27/facets-context-engineering/
Harrison Chase. Sequoia podcast on context engineering. https://sequoiacap.com/podcast/context-engineering-our-way-to-long-horizon-agents-langchains-harrison-chase/
Anthropic. "Effective context engineering for AI agents." https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
Mem0. "State of AI Agent Memory 2026." https://mem0.ai/blog/state-of-ai-agent-memory-2026
Letta. "Agent Memory" deep-dive. https://www.letta.com/blog/agent-memory
Chip Huyen. "Agents." https://huyenchip.com/2025/01/07/agents.html
Cognition. "Devin Annual Performance Review 2025." https://cognition.ai/blog/devin-annual-performance-review-2025

Technical References

Alex Garcia. "Hybrid full-text search and vector search with SQLite." https://alexgarcia.xyz/blog/2024/sqlite-vec-hybrid-search/index.html
RAGFlow. "2025 year-end review." https://ragflow.io/blog/rag-review-2025-from-rag-to-context
Guillaume Laforge. "Understanding RRF in Hybrid Search." https://glaforge.dev/posts/2026/02/10/advanced-rag-understanding-reciprocal-rank-fusion-in-hybrid-search/
"AI Agent Memory Management: When Markdown Files Are All You Need." https://dev.to/imaginex/ai-agent-memory-management-when-markdown-files-are-all-you-need-5ekk

Report prepared by Luci. 51 sources loaded in NotebookLM notebook "Memory Systems for AI Agents — Landscape Review." NotebookLM Deep Research report generated separately as a complementary artifact.