
Memory Systems for AI Agents — Seed Dossier

Generated: 2026-04-11 15:36 UTC

Triggered by Garry Tan's GBrain announcement 2026-04-10. Luci currently uses:
- Claude Code built-in memory (MEMORY.md + individual md files in ~/.claude/projects/.../memory/)
- PKA Vault (vault.db knowledge graph + markdown memories)
- SecondBrain Obsidian vault (thousands of notes)
- MC wiki (~12 articles)
- email.db + whatsapp.db with FTS

Question: adopt GBrain, build lightweight compiled entity pages from existing infra, or keep current markdown approach?

Gemini CLI Research Output

I will begin by researching the current landscape of AI agent memory systems, focusing on the specific implementations and architectures requested, while prioritizing insights from named experts and recent academic papers.

I've established the high-level landscape of AI agent memory in 2026, focusing on the "LLM as OS" (Letta) and "Compiled Knowledge" (GBrain/Karpathy) paradigms. Now, I will dive deeper into the retrieval architectures, data models, and entity extraction loops, while seeking specific insights and contrarian views from the requested analysts.

I've gathered a strong understanding of retrieval architectures and the current RAG vs. long-context debate. My next steps will focus on the practical implementation details of GBrain's "Dream Cycle," Karpathy's "LLM Wiki" pattern, and the academic foundations like MemoryBank and A-MEM. I will also specifically look for insights from Jason Liu, Hamel Husain, and the Latent Space community to round out the expert perspective.

I'll break down the previous query to find more specific information. I'll start by searching for details on "GBrain" and Garry Tan's approach to memory, followed by Karpathy's "LLM Wiki" pattern and then the academic papers.

I have a comprehensive understanding of GBrain and the "Dream Cycle." My next focus is Karpathy's "LLM Wiki" pattern and its architecture, followed by a search for the specified academic papers (MemoryBank, Recursively Summarising, A-MEM). I also want to find more specific expert insights and contrarian views, particularly from Jason Liu, Hamel Husain, and the Latent Space community.
I've built a strong foundation, covering specific implementations, retrieval architectures, data model patterns, academic foundations, and the "Dream Cycle." To round out the report, I'll now focus on the memory approaches of OpenAI, Anthropic, Cognition (Devin), and Google Gemini. I'll also seek more explicit insights and contrarian views from Jason Liu, Hamel Husain, and the Latent Space community, specifically regarding entity extraction and the recommended stack for single-user personal agents.

Based on the landscape in April 2026, the architecture of AI agent memory has transitioned from "stateless retrieval" (basic RAG) to Stateful Cognitive Architectures. The following report synthesizes the current implementation patterns, academic foundations, and expert debates defining the field.


1. Specific Implementations & Architectures

The current market is divided into "Operating System" models and "Compiled Knowledge" models.

2. Retrieval Architectures

3. Data Model Patterns

4. Entity Extraction & Enrichment Loops

The "Best-in-Class" systems (like GBrain and Mem0) use a Multi-Step Extraction Protocol:

  1. Disambiguation: Does "Sam" refer to Sam Altman or the user's friend Sam? (Uses global brain context to resolve.)
  2. Temporal Validity: Handles "stale" facts (e.g., "The user used to work at Google, now at Vercel").
  3. Structured Extraction: Using libraries like Jason Liu’s instructor, agents extract data into strict Pydantic schemas to ensure machine-readability.

* Source: jxnl.github.io/instructor / Jason Liu. Summary: Advocates for "Structured Outputs" as the only reliable way to build programmable agent memory.
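The extraction steps above can be sketched as a strict record type plus a rule for retiring stale facts. This is a minimal illustration only: it uses standard-library dataclasses as a stand-in for Pydantic models, and every name in it (`ExtractedFact`, `supersede`, `current_facts`, the `person/...` ID format) is hypothetical rather than taken from GBrain, Mem0, or instructor.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ExtractedFact:
    """One fact about an entity with an explicit validity window (hypothetical schema)."""
    entity_id: str                   # disambiguated ID, e.g. "person/sam-altman"
    predicate: str                   # e.g. "works_at"
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still believed true"

    def __post_init__(self) -> None:
        # Reject malformed records at construction time, as a strict schema would.
        if not self.entity_id or not self.predicate:
            raise ValueError("entity_id and predicate are required")
        if self.valid_to is not None and self.valid_to < self.valid_from:
            raise ValueError("valid_to must not precede valid_from")


def supersede(old: ExtractedFact, new: ExtractedFact) -> ExtractedFact:
    """Close the old fact's validity window when a newer fact replaces it."""
    if (old.entity_id, old.predicate) != (new.entity_id, new.predicate):
        raise ValueError("facts describe different entity/predicate pairs")
    return ExtractedFact(old.entity_id, old.predicate, old.value,
                         old.valid_from, valid_to=new.valid_from)


def current_facts(facts: list[ExtractedFact], on: date) -> list[ExtractedFact]:
    """Return the facts that are valid on the given day."""
    return [f for f in facts
            if f.valid_from <= on and (f.valid_to is None or on < f.valid_to)]
```

Keeping the superseded fact with a closed window, instead of deleting it, is one way to answer both "where does the user work now?" and "where did the user work in 2021?" from the same store.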

5. Academic Foundations

6. Dream Cycles (Overnight Consolidation)

Pioneered by GBrain and OpenClaw, the Dream Cycle is a background process that runs while the user is away.

* Mechanism: The agent scans the day's logs, detects new entities, resolves contradictions between new notes and old notes, and generates a DREAMS.md report.
* Impact: Measurable improvement in "Recall Precision" (up to 40% in benchmarks like LoCoMo) because the data is "cleaned" and "indexed" before the user asks a question.
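A toy version of the consolidation pass described above might look like the following. It is a sketch under simplifying assumptions, not GBrain's implementation: the day's log is assumed to be already parsed into (entity, attribute, value) triples, contradictions are resolved by letting the newer value win, and all names (`ConsolidationReport`, `consolidate`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConsolidationReport:
    new_entities: list[str] = field(default_factory=list)
    contradictions: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the report body that could be written to a DREAMS.md file."""
        lines = ["# Dream cycle report", "", "## New entities"]
        lines += [f"- {name}" for name in self.new_entities] or ["- (none)"]
        lines += ["", "## Contradictions resolved"]
        lines += [f"- {c}" for c in self.contradictions] or ["- (none)"]
        return "\n".join(lines) + "\n"


def consolidate(memory: dict[str, dict[str, str]],
                day_log: list[tuple[str, str, str]]) -> ConsolidationReport:
    """Merge one day's (entity, attribute, value) observations into memory.

    Newer observations win; every overwritten value is recorded in the report.
    """
    report = ConsolidationReport()
    for entity, attribute, value in day_log:
        if entity not in memory:
            memory[entity] = {}
            report.new_entities.append(entity)
        previous = memory[entity].get(attribute)
        if previous is not None and previous != value:
            report.contradictions.append(
                f"{entity}.{attribute}: '{previous}' replaced by '{value}'")
        memory[entity][attribute] = value
    return report
```

A scheduler (cron, a systemd timer, or similar) would call `consolidate` once per night and write `report.to_markdown()` to disk, so contradictions are surfaced for review rather than silently overwritten.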

7. Tradeoffs for Personal Agents (Recommended Stack)

Factor      Recommended Stack (2026)
Storage     Git-based Markdown (for transparency) + pgvector (for search).
Retrieval   Hybrid (BM25 + vector search, fused with RRF) with a fast Reranker (Latency < 500 ms).
Complexity  High (Self-hosted GBrain) vs. Low (OpenAI Assistants).
Privacy     Local-First (Obsidian MCP) is the only credible choice for sensitive data.
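Reciprocal rank fusion (RRF), named in the Retrieval row above, merges a keyword ranking and a vector ranking without needing their raw scores to be comparable. The sketch below shows the standard formula; the function name is illustrative, and the inputs are assumed to be lists of document IDs already ordered best-first by each retriever.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is the commonly used default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]      # hypothetical keyword results
vector_hits = ["b", "d", "a"]    # hypothetical embedding results
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))  # ['b', 'a', 'd', 'c']
```

Documents that both retrievers rank highly rise to the top, and a reranker would then reorder only the first few fused results.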

8. The Contested: Expert Disagreements


Credible Sources List

Papers:
* Generative Agents: Park et al. 2023 - The original "Sims" agent memory architecture.
* MemoryBank: Zhong et al. 2024 - Biologically-inspired long-term persistence.
* Recursive Summarizing: Wu et al. 2021 - Foundational NLP work on compression.

Blog Posts / Repos:
* GBrain: github.com/garrytan/gbrain - Garry Tan's "World Knowledge" architecture.
* LLM Wiki: karpathy.ai/blog - Andrej Karpathy on the "Compilation" vs "Retrieval" debate.
* Memory Engineering: huyenchip.com - Chip Huyen on the production realities of memory.
* Agent TDD: simonwillison.net - Simon Willison on making agent memory verifiable.

Podcasts / Frameworks:
* Latent Space: latent.space (Swyx/Alessio) - The definitive source for "Latent Memory" and "Agent Harness" discussions.
* Mem0: mem0.ai - The lead "Memory-as-a-Service" framework.
* Letta: letta.com - The evolution of the MemGPT project into an Agent OS.

Codex CLI Research Output

OpenAI Codex v0.116.0 (research preview)

workdir: /home/lucienne/workspace/luci-home
model: gpt-5.4
provider: openai
approval: never
sandbox: read-only
reasoning effort: none
reasoning summaries: none
session id: 019d7d2b-f0ed-7953-b98b-fcce2186ab0f


user
Deep research on the CURRENT (2026) landscape of memory systems for AI agents. Cover:

  1. SPECIFIC implementations with their architecture: MemGPT / Letta, Mem0, LangMem, LlamaIndex memory, GBrain (Garry Tan github.com/garrytan/gbrain), Obsidian + MCP patterns, Karpathy's LLM wiki pattern, OpenAI Assistants API memory, Anthropic's agent memory approach, Cognition / Devin's long-term memory, Google's Gemini long-context as memory alternative.

  2. RETRIEVAL ARCHITECTURES: vector-only (pgvector, Pinecone, Weaviate, Chroma), keyword-only (Postgres FTS, Tantivy), hybrid (RRF + reranking, ColBERT), graph-based (Neo4j, knowledge graphs), and long-context windows as an alternative to retrieval entirely.

  3. DATA MODEL PATTERNS: markdown-as-source-of-truth (GBrain, Obsidian), append-only event logs (MemGPT conversation memory), compiled entity pages (people/companies/concepts), conversation threads, episodic vs semantic split.

  4. ENTITY EXTRACTION and enrichment loops: how the best systems auto-extract people/companies/topics from conversations, disambiguate, handle temporal validity, and compile dossiers. Who's doing it well.

  5. ACADEMIC FOUNDATIONS: Generative Agents (Park et al. Stanford 2023), MemoryBank (Zhong et al.), Recursively Summarising (Wu et al.), A-MEM, episodic memory literature, cognitive-science-inspired designs.

  6. DREAM CYCLES / overnight consolidation: which systems do it (GBrain does), what they do at night, and whether there's measurable impact on quality.

  7. TRADEOFFS: for a single-user personal agent, what's the recommended stack? Setup cost vs retrieval latency vs recall quality vs maintenance vs integration complexity.

  8. CONTESTED: where do experts disagree? RAG vs long-context debate, vector DBs vs keyword+reranking, graph memory necessity, markdown-as-truth vs database-of-record.

Credibility requirements — cite named analysts: Andrej Karpathy, Simon Willison, Chip Huyen, Jason Liu, Hamel Husain, Swyx / Latent Space, Garry Tan, Eugene Yan, Shawn @swyx. Papers over blog posts where possible. Code + benchmarks over claims. Explicit contrarian views.

For EVERY claim, output URL + author/publication + one-sentence summary. List all credible sources at the end grouped by type (paper / blog post / repo / podcast / framework doc).

mcp startup: no servers

codex
I’m gathering current sources first: official docs/repos for each memory system, then papers and named analyst commentary so the comparison is grounded in implementations rather than recycled summaries.

codex
I have the core framework docs and papers. I’m filling the gaps now around vendor-specific systems that are less formally documented, especially Devin/Cognition and the exact boundary between hosted “conversation state” and true memory in OpenAI/Anthropic.

codex
Snapshot

As of April 11, 2026, the field has converged on one practical idea: good agent memory is usually a layered system, not a single database. The strongest production designs combine some mix of working memory in-context, retrieval over external stores, and periodic consolidation into durable summaries, entity pages, or graphs. The biggest split is no longer “memory or no memory,” but what should be compiled ahead of time vs. retrieved on demand vs. left in a giant context window.

1. Current implementations

- MemGPT / Letta is the clearest “memory hierarchy” implementation: a small always-visible core memory plus recall memory for prior conversation and archival memory for long-term semantic storage, with the agent itself deciding when to edit or search memory.
Source: https://docs.letta.com/guides/agents/architectures/memgpt — Letta Docs, “Agent memory & architecture” — describes core memory, recall memory, archival memory, and tool-mediated self-editing; https://docs.letta.com/concepts/letta — Letta Docs, “Research Background” — ties Letta’s architecture directly to the MemGPT paper’s OS-style memory hierarchy.
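The three-tier hierarchy described above can be illustrated with a small in-memory model. This is a conceptual sketch, not Letta's API: the class and method names (`TieredMemory`, `core_replace`, `archival_insert`, `search`), the character limit, and the substring search are all assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    core_limit_chars: int = 200                          # assumed per-block size cap
    core: dict[str, str] = field(default_factory=dict)   # always in the prompt
    recall: list[str] = field(default_factory=list)      # past conversation turns
    archival: list[str] = field(default_factory=list)    # long-term notes

    def core_replace(self, label: str, text: str) -> None:
        """Tool the agent calls to edit its always-visible memory."""
        if len(text) > self.core_limit_chars:
            raise ValueError("core memory block too large; archive it instead")
        self.core[label] = text

    def archival_insert(self, text: str) -> None:
        """Tool the agent calls to store a long-term note outside the prompt."""
        self.archival.append(text)

    def search(self, tier: str, query: str) -> list[str]:
        """Naive case-insensitive substring search over one external tier."""
        store = {"recall": self.recall, "archival": self.archival}[tier]
        return [item for item in store if query.lower() in item.lower()]

    def render_prompt(self, user_message: str) -> str:
        """Only core memory is injected; other tiers are reached via search."""
        blocks = "\n".join(f"[{k}] {v}" for k, v in self.core.items())
        return f"{blocks}\n\nUser: {user_message}"
```

The point of the design is that the prompt stays small and bounded while the agent, through tool calls, decides what to promote into core memory and what to look up on demand.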

2. Retrieval architectures

- Vector-only retrieval remains the default baseline for agent memory because it is simple and widely integrated, but it is weak on exact strings, IDs, names, and sparse symbolic facts.
Source: https://docs.mem0.ai/features/contextual-add — Mem0 Docs — describes vector-based storage as the base layer; https://developers.llamaindex.ai/python/framework/module_guides/deploying/agents/memory/ — LlamaIndex Docs — describes vector memory blocks retrieving flushed chat batches.

3. Data model patterns

- Markdown-as-source-of-truth is the fastest-growing pattern in personal-agent memory because it is human-auditable, git-friendly, and easy for agents to patch. GBrain and Karpathy’s LLM wiki are the canonical 2026 examples.
Source: https://github.com/garrytan/gbrain — Garry Tan, repo README — says markdown files in git are the source of truth; https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f — Andrej Karpathy, gist — describes a wiki layer of markdown pages maintained by the LLM.
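To make the markdown-as-source-of-truth pattern concrete, here is a minimal sketch of compiling one entity page to a file that can be committed to git. The page layout, directory scheme, and function names are assumptions for illustration; neither GBrain's nor Karpathy's actual page format is reproduced here.

```python
from pathlib import Path


def render_entity_page(name: str, kind: str, facts: list[tuple[str, str]]) -> str:
    """Render one entity page; facts are (statement, source) pairs."""
    lines = [f"# {name}", "", f"Type: {kind}", "", "## Facts"]
    lines += [f"- {statement} (source: {source})" for statement, source in facts]
    return "\n".join(lines) + "\n"


def write_entity_page(root: Path, name: str, kind: str,
                      facts: list[tuple[str, str]]) -> Path:
    """Write the page under root/<kind>/<slug>.md and return its path."""
    slug = "-".join(name.lower().split())
    path = root / kind / f"{slug}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_entity_page(name, kind, facts), encoding="utf-8")
    return path
```

Because each fact carries its source, a human reviewing the git diff can trace every line of a page back to the note it came from, which is the auditability argument made above.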

4. Entity extraction and enrichment loops

- The strongest 2026 systems do not wait for user-written schemas; they auto-extract entities, relations, timestamps, and facts from conversation on write. Mem0 does this directly; GBrain treats entity detection/enrichment as a core loop.
Source: https://docs.mem0.ai/open-source/graph_memory/overview — Mem0 Docs — says the extraction LLM identifies entities, relationships, and timestamps on memory.add; https://github.com/garrytan/gbrain — Garry Tan, repo README — says the agent detects entities, updates pages, and syncs indexes in a compounding loop.

5. Academic foundations

- Generative Agents (Park et al., 2023) established the modern reflection loop: store observations, retrieve by relevance/recency/importance, then periodically reflect to create higher-level inferences.
Source: https://huggingface.co/papers/2304.03442 — Park et al., “Generative Agents” — introduces believable agents driven by observation, planning, and reflection.
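The relevance/recency/importance retrieval step can be sketched as a weighted sum of three normalized scores. The hourly decay of 0.995 and the equal weighting are assumptions based on a reading of the paper and should be checked against it; the embeddings are plain lists of floats here, and all names are illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    importance: float          # e.g. a 1-10 rating assigned when the memory is stored
    hours_since_access: float
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def min_max(values: list[float]) -> list[float]:
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def retrieve(memories: list[Memory], query_embedding: list[float],
             top_k: int = 3, decay: float = 0.995) -> list[Memory]:
    """Rank memories by normalized recency + importance + relevance."""
    if not memories:
        return []
    recency = min_max([decay ** m.hours_since_access for m in memories])
    importance = min_max([m.importance for m in memories])
    relevance = min_max([cosine(m.embedding, query_embedding) for m in memories])
    scored = sorted(zip(memories, recency, importance, relevance),
                    key=lambda t: t[1] + t[2] + t[3], reverse=True)
    return [m for m, *_ in scored[:top_k]]
```

An old but important and on-topic memory can outrank a fresh but trivial one, which is the behavior that distinguishes this scheme from pure similarity search.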

6. Dream cycles / overnight consolidation

- Yes, “dream cycles” are now a real implementation pattern. GBrain explicitly schedules nightly scans that enrich entities, repair citations, and consolidate memory. LangMem’s background memory manager is the same idea without the branding.
Source: https://github.com/garrytan/gbrain — Garry Tan, repo README — describes an overnight dream cycle; https://github.com/langchain-ai/langmem — LangMem README — describes a background memory manager that extracts, consolidates, and updates knowledge.

7. Tradeoffs and recommended stack for a single-user personal agent

- Best default stack in 2026: markdown repo as source of truth, Postgres with pgvector plus FTS for hybrid retrieval, optional reranker, lightweight entity graph, and periodic consolidation jobs.
Source: https://github.com/garrytan/gbrain — Garry Tan, repo README — is the cleanest public embodiment of this stack; https://jxnl.co/writing/2025/09/11/lexical-search-in-rag-applications/ — Jason Liu — supports lexical+semantic combination; https://hamel.dev/notes/llm/rag/not_dead.html — Hamel Husain and Ben Clavié — argues retrieval quality still matters.

8. Contested questions

- RAG vs long context: the current expert consensus is “both, routed by workload,” not total victory for either side. Google’s docs highlight use cases where long context replaces retrieval; Hamel Husain explicitly argues “RAG is not dead”; Simon Willison and practitioners keep documenting context-rot failure modes.
Source: https://ai.google.dev/gemini-api/docs/long-context — Google Docs — long context changes design space; https://hamel.dev/notes/llm/rag/not_dead.html — Hamel Husain and Ben Clavié — retrieval still matters; https://simonwillison.net/2025/Jun/18/context-rot/ — Simon Willison — long context degrades when it accumulates distracting baggage.
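One simple reading of "both, routed by workload" is a size check: send the whole corpus when it fits comfortably in the context window, and fall back to retrieval otherwise. The sketch below is only that heuristic; the four-characters-per-token estimate, the reserve size, and the function names are assumptions, and a real router would also weigh cost, latency, and context-rot risk.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token for English text."""
    return max(1, len(text) // 4)


def choose_strategy(documents: list[str], context_budget_tokens: int,
                    reserve_tokens: int = 4000) -> str:
    """Return "long_context" when the whole corpus fits, else "retrieval".

    reserve_tokens leaves room for instructions and the model's answer.
    """
    total = sum(estimate_tokens(d) for d in documents)
    if total + reserve_tokens <= context_budget_tokens:
        return "long_context"
    return "retrieval"
```

For a bounded task such as reviewing a handful of files, this picks long context; for a vault of thousands of notes, it picks retrieval.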

Bottom line

For a single-user personal agent, I would build:

  1. Markdown or append-only files as truth
  2. Postgres + pgvector + Postgres FTS for hybrid retrieval
  3. Entity pages for people, companies, projects, and concepts
  4. Optional graph edges only for relation-heavy use cases
  5. Nightly consolidation jobs to update dossiers and summaries
  6. Long-context mode for bounded research/code tasks, not as the only memory layer

That stack gives the best balance today across setup cost, recall quality, inspectability, maintenance, and portability.

Notable gaps

I did not find equally strong primary-source material in this pass for Chip Huyen or Eugene Yan specifically on agent memory architectures, so I excluded them from claim-bearing sections rather than cite weaker secondary summaries. Public detail on Devin/Cognition long-term memory is also still thin relative to Letta, Mem0, or LangMem.

Sources by type

Papers
- Generative Agents: https://huggingface.co/papers/2304.03442
- MemoryBank: https://huggingface.co/papers/2305.10250
- Recursively Summarizing: https://huggingface.co/papers/2308.15022
- LoCoMo: https://huggingface.co/papers/2402.17753
- A-MEM: https://huggingface.co/papers/2502.12110
- BEAM / LIGHT: https://openreview.net/forum?id=y59hf5lrMn

Framework docs / official docs
- Letta MemGPT architecture: https://docs.letta.com/guides/agents/architectures/memgpt
- Letta memory overview: https://docs.letta.com/guides/agents/memory
- Letta research background: https://docs.letta.com/concepts/letta
- Mem0 overview: https://docs.mem0.ai/features/contextual-add
- Mem0 graph memory: https://docs.mem0.ai/open-source/graph_memory/overview
- LlamaIndex memory: https://developers.llamaindex.ai/python/framework/module_guides/deploying/agents/memory/
- OpenAI conversation state: https://platform.openai.com/docs/guides/conversation-state?api-mode=responses
- OpenAI file search: https://platform.openai.com/docs/guides/tools-file-search
- OpenAI Assistants migration: https://platform.openai.com/docs/assistants
- Google Gemini long context: https://ai.google.dev/gemini-api/docs/long-context
- Anthropic Claude memory: https://www.anthropic.com/news/memory
- Anthropic MCP memory README: https://github.com/modelcontextprotocol/servers/blob/main/src/memory/README.md

Repos / gists
- LangMem: https://github.com/langchain-ai/langmem
- GBrain: https://github.com/garrytan/gbrain
- Karpathy LLM Wiki: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- Obsidian Memory MCP: https://github.com/YuNaga224/obsidian-memory-mcp
- MCP Obsidian: https://github.com/bitbonsai/mcp-obsidian

Analyst / practitioner posts
- Simon Willison on context rot: https://simonwillison.net/2025/Jun/18/context-rot/
- Simon Willison on long context tooling: https://simonwillison.net/2025/Apr/7/long-context-llm/
- Hamel Husain + Ben Clavié, “Stop Saying RAG Is Dead”: https://hamel.dev/notes/llm/rag/not_dead.html
- Jason Liu on lexical search: https://jxnl.co/writing/2025/09/11/lexical-search-in-rag-applications/
- Jason Liu on structured extraction in RAG: https://jxnl.co/writing/2025/01/24/systematically-improving-rag-applications/

Product/help docs
- ChatGPT memory product post: https://openai.com/index/memory-and-new-controls-for-chatgpt
- ChatGPT memory help: https://help.openai.com/en/articles/8983136
- GPTs and memory limitation: https://help.openai.com/en/articles/8983148-does-memory-function-with-gpts

If you want, I can turn this into a decision matrix spreadsheet or a concrete reference architecture for your own personal-agent stack.

tokens used 118,187