Top-line verdict: This is a sophisticated, well-designed system with excellent architectural discipline — high marks for memory model, agent separation, and strategic dispatch. The main risk is over-engineering at the margins — GBrain is powerful but adds moving parts, and some memory/rule guidance is contradictory or stale. Fix the contradictions first; the moving parts are fine if GBrain holds.
Lucienne is a multi-machine Chief of Staff system running across four hosts:
.claude/settings.local.json pins auto-memories to ./Vault/memory/auto/mac/.
elmar@46.225.208.1. Dispatched by Luci for code work via SSH.
Knowledge layers (most to least important):
~/.claude/vault/ (shared personal data) + SecondBrain (from index.py scans).
./Vault/memory/auto/mac/). Claude's memory system appends here per session; gets indexed into vault.db automatically.
Team model: Permanent Chief of Staff (Lucienne) + permanent PM (Luci) + permanent specialists (Atlas, Tessa, Scott) + lightweight auto-dispatch agents (architect, developer, reviewer, qa, security) + retained consultant (Larry). No persona briefs: roles are defined in agent YAML plus short team patterns.
Markdown-first discipline. CLAUDE.md lines 279-285 declare it crisply: "Markdown files are the source of truth. vault.db is a query index over them. ... Only activity_log is authoritative in the DB" (line 85 SETUP.md clarified this on 2026-04-18 — activity_log was demoted from authoritative to derived). This is a correct and defensible choice.
Auto-memory routing per device. Boot sequence (CLAUDE.md line 5) verifies .claude/settings.local.json contains "autoMemoryDirectory": "./Vault/memory/auto/mac/". Each machine writes to its own subdirectory. This solves the "which device did Claude write this from?" ambiguity that killed Echo's memory system.
Frontmatter discipline. Memory files use consistent YAML: name, description, type (user|feedback|project|reference), lifecycle (active|archived), created, last_reviewed, stale_after. Schema v2 (Vault/schema.sql) includes a v_memory_health view that flags stale memories by checking stale_after dates. This is solid.
Activity log as derived index. The decision to rebuild activity_log from text feeds (skills_log.jsonl, git log, task frontmatter, session transcripts) rather than writing to it live is clever — avoids sync conflicts across machines and keeps markdown as canonical.
139 memory files indexed. Not bloated; each has a clear purpose.
"invoke memory-manager skill before answering... Applies to DECLARATIONS too: 'doesn't happen in our setup', 'we haven't done X'... check memory BEFORE denying, not after Elmar corrects you."
But MEMORY.md line 46 (the actual memory index entry for this) is:
[Load Skills First](feedback_load_skills_first.md) — Always load the relevant skill before Luci tasks; don't wing it from memory.
And CLAUDE.md line 351 says:
Fall back to vault.db graph queries only if GBrain is unreachable or the topic is clearly structured-ops (activity_log, tasks, sessions).
Contradiction: Should Lucienne use memory-manager skill, or query vault.db, or GBrain? All three reference memories exist. The CLAUDE.md dispatch order is: GBrain first (line 351) → vault memory search for agent outcomes (line 352) → Atlas/wiki for system questions (line 350). But the memory file layout doesn't distinguish "agent outcome memories" from "reference memories" — they're all in Vault/memory/, mixed together, with some living in auto/ and others not.
Fix required: Rewrite MEMORY.md frontmatter to tag each file with query tier (gbrain_first | vault_first | check_before_denying). Update CLAUDE.md dispatch logic to match. As-is, a session will try all three methods in the wrong order and waste context.
stale_after dates and a v_memory_health view, but I found no automated job that sweeps old memories into an archive. CLAUDE.md mentions "Do not pre-read specialist files... memory files at boot" (line 26) but doesn't say what happens to a memory that reaches its stale_after date. Does it get archived? Deleted? Flagged in the dashboard?
Symptom: project_mc601_cancelled.md (MEMORY.md line 24) is marked cancelled 2026-04-20 but still indexed. If Elmar asks "should we do PostgreSQL migration?", vault.db will surface the "cancelled" note, which is correct behavior, but only if the note is reviewed regularly. No evidence that happens.
Fix required: Add a monthly cycle: "Atlas reports on stale memories" or a dashboard card flagging v_memory_health rows where is_stale = 1.
lifecycle: active|archived (SETUP.md line 125) but vault_mcp.py never checks it. A memory tagged lifecycle: archived still appears in searches and gets indexed into vault.db. Archival is only a semantic signal, not a query filter.
Fix required: Either make lifecycle a real query filter in vault_mcp.py (skip archived in search results), or remove the field and use subdirectories (./Vault/memory/_archived/) instead.
feedback_auto_dispatch.md says "auto-dispatch agents autonomously without being asked" but CLAUDE.md lines 77-79 reference "memory-manager skill" to check before answering. Where is memory-manager defined? I found it in the skills list at the top of this review but no MEMORY.md entry pointing to its behavior.
Risk: Session may invoke memory-manager skill when a simpler vault search would suffice, or vice versa.
GBrain adoption is well-executed. Decision doc (2026-04-17-gbrain-phase5-decision.md) is rigorous: 30-query A/B across 5 classes, scored on Recall@5 and nDCG@5. GBrain wins on 4 of 5 classes (person +51%, financial +46%, temporal -18% but still GBrain, entity +40%) and loses only on the "decision" class, where vault.db's LLM synthesis edge is noted but deemed trivially replaceable (Lucienne naturally synthesizes from chunks). Latency is roughly 1000× better (20 ms vs 22 s).
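For readers unfamiliar with the two scores, the sketch below shows one common way to compute Recall@5 and nDCG@5 for a single query. It is an illustration only: the function names, the document ids, and the graded-relevance dictionary are assumptions, and the decision doc may have used a different formulation.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def ndcg_at_k(ranked_ids, relevance, k=5):
    """Normalised discounted cumulative gain over the top-k results.

    `relevance` maps a document id to a graded relevance score
    (missing ids count as 0, i.e. irrelevant).
    """
    def dcg(gains):
        # Rank 0 is discounted by log2(2) = 1, rank 1 by log2(3), and so on.
        return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

    actual = dcg([relevance.get(doc_id, 0) for doc_id in ranked_ids[:k]])
    ideal = dcg(sorted(relevance.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0
```

Averaging each score over the queries in a class gives the per-class numbers that an A/B comparison like the one described would report.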
Phased rollout is sensible. Phase 5 (decision) → Phase 6 (wiring) with clear Week 1/Week 2 breakdown. GBrain MCP is already in .mcp.json. Luci parallel install planned. CONFIG file (atlas.md:47) correctly documents the three scopes: SecondBrain + PKA wiki + PKA docs.
wiki-compiler is NOT retiring. (project_gbrain_operational.md lines 36-40). The confusion is real but addressed: "compilation script (compile.py) is NOT retiring — it still produces the wiki pages GBrain indexes. What IS retiring is using wiki-compiler as a retrieval mechanism (the wiki_lookup and semantic_search MCP tools)". This distinction is explicit and correct.
Skill thin-adapter pattern. Retiring wiki-query / search-brain by rewriting their bodies to call gbrain.query() (not retiring the skill names) means existing agent calls don't break. Clever and low-risk.
Platform-aware paths. vault_mcp.py lines 44-48 detect platform.system() == "Darwin" to set WIKI_ROOT correctly. Cross-machine knowledge flows through feedback_cross_machine_knowledge.md (2026-04-19 padel-racket gap) which explicitly tells Lucienne to query GBrain before claiming "no prior work on X".
project_gbrain_operational.md lines 42-52 document it; feedback_gbrain_stale_pid.md has quick-fix (rm the file). But atlas.md (which every PKA system dispatch reads) has zero mention of GBrain as a tool chain entry point. If Atlas needs to query GBrain during a brief and GBrain fails to connect (e.g., stale PID), Atlas will silently skip it and not consult the wiki, violating the 2026-04-19 hard rule in atlas.md lines 47-49 ("FIRST tool call must be mcp__gbrain__query").
Fix: Add a "GBrain health check" to the PreToolUse hook (currently only fires on Glob|Grep). Have it remove stale postmaster.pid on session start.
Risk: Session may hang waiting for GBrain, or silently degrade to wrong results if the hybrid retriever returns partial data.
index.py lines 78-90 scan ~/cowork/SecondBrain/wiki/ daily via cron (implied by "daily" in CLAUDE.md line 196). But there's no evidence of a validation that vault.db's SecondBrain index is up to date. If SecondBrain changes and git pulls don't sync immediately, vault.db can be out of date by up to 15 min. GBrain has ~/.gbrain/pka-sync.sh running every 15 min (project_gbrain_operational.md line 57), but vault.db has no parallel sync guarantee.
Risk: Two different indexing schedules (vault.db daily, GBrain every 15 min) can diverge. If Lucienne queries vault.db and gets stale results while GBrain is fresh, she won't know.
Fix: Create a dashboard card listing source files in SecondBrain/sources/ so Lucienne can check at a glance before building a new extractor.
tools: line restricting what they can call. Examples:
This is correct. Reviewer can't edit because that violates the review gate.
PreToolUse hook for Graphify. .claude/settings.json lines 37-47 inject a reminder to consult graphify-out/GRAPH_REPORT.md before using Glob|Grep on codebase questions. This is smart — prevents re-deriving god nodes from source when the AST analysis already has it.
UserPromptSubmit hook injects datetime. Lines 27-35 inject "Current local datetime: <formatted>". This prevents the "what time is it" ambiguity. Documented in feedback line 70: "Clock now injected via UserPromptSubmit hook; read the injected datetime, don't guess".
Vault MCP server has graceful fallback. vault_mcp.py lines 20-28 try to import salience re-ranker but continue if unavailable. FTS5 ordering is the fallback. This is good defensive coding.
Two-DB split is architecturally sound. vault.db (Lucienne, markdown-first) vs mc.db (Luci, operational). Vault MCP server exposes both transparently via HTTP (project_vault_db_split.md). No write conflicts because they own different domains.
exco-ingest function migration from Luci to Mac LaunchAgent. The rule says: before declaring something missing, check wiki grep (name AND function), Mac LaunchAgents, sources-watch.yaml, and Luci scheduler tasks.
The rule is clear but reactive. It was added AFTER a failure. There's no automated check preventing this pattern. If Atlas reasons from incomplete search, they'll make the same mistake again (now with a written rule to consult, but still prone to human skip). This is a soft control, not a hard gate.
Fix: Create a check_multi_host Bash wrapper that checks all four locations in parallel when Atlas invokes it. Make it part of the pre-brief checklist.
- wiki-compiler: "Triggers on 'wiki compile', 'compile wiki', 'update wiki'..." ✓ Clear
- search-brain: [Not readable in output, but MEMORY.md line 46 references it]; likely vague overlap with GBrain now that it's a thin adapter
- brain: [Available but not read]; possible name collision with "brain explorer"
- autoplan: Triggers on "auto review", "autoplan", "run all reviews" ✓ Clear but only works if the user says those words
- memory-manager: [Expected to exist from CLAUDE.md line 14 but not found in skills list] ⚠️ Missing or misnamed
Risk: If memory-manager is missing, CLAUDE.md line 14's "call mcp__vault__memory_search or invoke the memory-manager skill" is a false instruction.
Fix: Verify memory-manager exists at ~/.claude/skills/memory-manager/. If not, either create it or remove the instruction from CLAUDE.md.
wiki-compiler/SKILL.md has no tools: line in frontmatter. If this skill is dispatched as a subagent, it will inherit the default tool set, which may be broader than needed.
Fix: Add tools: frontmatter to all skills. Recommend: Bash, Read, Write, Edit, Glob, Grep, WebFetch, Skill (no MCP tools except if skill-specific).
Risk: Long sessions with high query frequency could exhaust connection pools.
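Fix: Add a connection pool to vault_mcp.py::_db_connection(). Note that the standard-library sqlite3 module has no StaticPool; that class belongs to SQLAlchemy (sqlalchemy.pool.StaticPool), so either go through SQLAlchemy or write a small bounded pool. Document the limit in the vault_mcp.py docstring.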
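A minimal sketch of a hand-rolled bounded pool, using only the standard library, might look like the following. The class name SQLitePool, the default size, and the timeout are assumptions for illustration; how vault_mcp.py actually opens connections was not visible in this review.

```python
import queue
import sqlite3
from contextlib import contextmanager

class SQLitePool:
    """Fixed-size pool of SQLite connections (hypothetical helper)."""

    def __init__(self, db_path, size=4, timeout=5.0):
        self._timeout = timeout
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection move between threads;
            # the queue guarantees only one borrower holds it at a time.
            self._idle.put(sqlite3.connect(db_path, check_same_thread=False))

    @contextmanager
    def connection(self):
        # Blocks for up to `timeout` seconds, then raises queue.Empty,
        # which makes pool exhaustion visible instead of hanging forever.
        conn = self._idle.get(timeout=self._timeout)
        try:
            yield conn
        finally:
            # Discard anything the borrower left uncommitted, then return
            # the connection to the pool. Callers must commit explicitly.
            conn.rollback()
            self._idle.put(conn)

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()
```

The pool size is the documented limit: a request that arrives while all connections are borrowed waits, then fails loudly with queue.Empty rather than opening an unbounded number of handles.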
Risk: If a skill fails silently, Lucienne won't know. No dashboard visibility into skill outcomes.
Fix: Add optional PostToolUse logging that writes tool outcomes to a skill_invoked.jsonl. Use "matcher": "Bash|Skill|Agent"; this is not an always-true matcher, it fires only for those three tools, which is the set worth logging here.
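One possible shape for the logging command is sketched below. It assumes the hook passes a JSON event on stdin with tool_name, session_id, and tool_response keys, and that the log lives at skill_invoked.jsonl in the working directory; verify both against the hook documentation and the repo layout before relying on it.

```python
import json
import sys
from datetime import datetime, timezone
from pathlib import Path

LOG_PATH = Path("skill_invoked.jsonl")  # assumed location, adjust as needed

def build_record(event, now=None):
    """Reduce a hook event (assumed shape) to one compact log record."""
    now = now or datetime.now(timezone.utc)
    return {
        "ts": now.isoformat(),
        "tool": event.get("tool_name"),
        "session": event.get("session_id"),
        # Store only a flag; full tool responses can be very large.
        "has_response": event.get("tool_response") is not None,
    }

def append_record(record, path=LOG_PATH):
    """Append one JSON object per line so the file stays greppable."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")

def main():
    # Entry point for the hook command: read the event, log it, exit quietly.
    event = json.load(sys.stdin)
    append_record(build_record(event))
```

Calling main() at the bottom of the script would turn it into a hook command. A dashboard card could then tail the file and surface tools that never report a response.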
memory-manager skill is referenced in CLAUDE.md line 14 but not found in ~/.claude/skills/. This is a false instruction. Either the skill is missing or misnamed. If missing, CLAUDE.md's core rule ("check memory before denying") cannot be executed. Severity: CRITICAL. Blocks CLAUDE.md boot sequence.
Fix: Search for the skill. If it doesn't exist, either remove the line from CLAUDE.md or create the skill. If it exists with a different name (e.g., vault-search or memory-search), update the reference.
Fix: Write a single, definitive query order: (1) GBrain for knowledge questions, (2) vault.mcp for agent-outcome questions, (3) Atlas/wiki for system architecture questions. Update all three sections to refer to this unified order.
postmaster.pid but fix is not automated. Documented in feedback_gbrain_stale_pid.md (2026-04-20). Session-end.sh and mcp-serve.sh have been fixed, but existing sessions that fail to connect won't auto-recover. The fix requires manual rm -f ~/.gbrain/brain.pglite/postmaster.pid. Severity: MEDIUM (session-start issue, not production-blocking, but frustrating).
Fix: Add a PreToolUse hook that checks for stale postmaster.pid and removes it silently if no gbrain process is running. Or add a gbrain health-check command that runs at session start.
Fix: Add a sqlite3 trigger (BEFORE INSERT/UPDATE on activity_log) that prevents manual writes. Only index.py can write via the rebuild_activity_log() function.
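A trigger cannot tell index.py apart from any other writer on its own, so one way to sketch this is a guard flag that only the rebuild path opens. Everything below is an assumption for illustration: the activity_log columns, the rebuild_guard table, and the rebuild_window helper are hypothetical, and the real schema in Vault/schema.sql will differ.

```python
import sqlite3
from contextlib import contextmanager

GUARD_SQL = """
CREATE TABLE IF NOT EXISTS activity_log (
    id INTEGER PRIMARY KEY, source TEXT, summary TEXT
);
-- Single-row flag table: is_open = 1 only while a rebuild is running.
CREATE TABLE IF NOT EXISTS rebuild_guard (
    id INTEGER PRIMARY KEY CHECK (id = 1), is_open INTEGER NOT NULL
);
INSERT OR IGNORE INTO rebuild_guard (id, is_open) VALUES (1, 0);

CREATE TRIGGER IF NOT EXISTS activity_log_no_insert
BEFORE INSERT ON activity_log
WHEN (SELECT is_open FROM rebuild_guard WHERE id = 1) = 0
BEGIN
    SELECT RAISE(ABORT, 'activity_log is derived; rebuild it instead');
END;

CREATE TRIGGER IF NOT EXISTS activity_log_no_update
BEFORE UPDATE ON activity_log
WHEN (SELECT is_open FROM rebuild_guard WHERE id = 1) = 0
BEGIN
    SELECT RAISE(ABORT, 'activity_log is derived; rebuild it instead');
END;
"""

@contextmanager
def rebuild_window(conn):
    """Open the guard for the duration of one rebuild transaction."""
    conn.execute("UPDATE rebuild_guard SET is_open = 1 WHERE id = 1")
    try:
        yield conn
        conn.execute("UPDATE rebuild_guard SET is_open = 0 WHERE id = 1")
        conn.commit()
    except Exception:
        # The flag change was never committed, so rollback closes it again.
        conn.rollback()
        raise
```

This guards against accidental writes, not hostile ones: anything with write access to the database can flip the flag. A rebuild_activity_log() function would wrap its inserts in `with rebuild_window(conn):`.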
lifecycle: archived or stale_after dates in the past, but no job sweeps them into an archive. Dashboard has no card showing stale memories. Severity: LOW-MEDIUM (functional but accumulates cruft).
Fix: Create a monthly task: run python index.py --archive-stale to move files with stale_after date in the past to Vault/memory/_archived/. Add a dashboard card showing v_memory_health rows where is_stale = 1.
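The --archive-stale flag does not exist yet, so the following is only a sketch of what its core could do. It assumes each memory file opens with a `---` delimited frontmatter block containing a stale_after line in ISO date form, and it parses that line by hand rather than pulling in a YAML library; the function names are hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path

def read_stale_after(path):
    """Return the stale_after date from a file's frontmatter, or None."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter, stop before the body
        key, _, value = line.partition(":")
        if key.strip() == "stale_after":
            try:
                return date.fromisoformat(value.strip().strip("'\""))
            except ValueError:
                return None  # unparseable date: leave the file alone
    return None

def archive_stale(memory_dir, today=None):
    """Move memories whose stale_after is in the past into _archived/."""
    today = today or date.today()
    memory_dir = Path(memory_dir)
    archive_dir = memory_dir / "_archived"
    moved = []
    # Non-recursive glob, so files already in _archived/ are not rescanned.
    for path in sorted(memory_dir.glob("*.md")):
        stale_after = read_stale_after(path)
        if stale_after is not None and stale_after < today:
            archive_dir.mkdir(exist_ok=True)
            shutil.move(str(path), str(archive_dir / path.name))
            moved.append(path.name)
    return moved
```

Returning the list of moved file names gives the monthly task something to report, and files without a stale_after date are never touched.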
Fix: Align sync schedules. Make vault.db hourly or make GBrain 15-min. Document in CLAUDE.md which index is fresher.
Fix: Add a ls SecondBrain/sources/ dashboard card or a search-sources skill to query that folder.
Fix: Audit all skill descriptions for collisions. Rewrite ambiguous ones (e.g., "wiki-compiler" vs "wiki-query" both involve wiki). Add a "Skills Index" dashboard card that lists all skills and their exact trigger phrases.
lifecycle: archived but vault_mcp.py ignores it. Severity: LOW. Semantic signal only; easy to add filter, but not blocking.
Fix: Make vault_mcp.py skip lifecycle: archived files in search results, OR use subdirectories instead and remove the field.
What: Write a single, unified query order for all knowledge questions. Remove the three conflicting rules (lines 77-79, 351-353, memory-manager skill). Replace with one definitive flowchart:
Query Order for Knowledge Questions:
1. Is this a system architecture question?
→ Dispatch Atlas (Explain mode). He has wiki loaded.
2. Is this a project/person/decision question from Elmar's work?
→ Query GBrain (mcp__gbrain__query). Covers SecondBrain + PKA wiki + docs.
3. Is this about agent outcomes or recent session work?
→ Query vault.mcp (mcp__vault__memory_search). Covers agent-generated memories + activity_log.
4. Is this about tasks, deadlines, or Lucienne's local ops?
→ Query vault.db v_upcoming_tasks / v_active_projects views.
5. Fallback: raw file reads as last resort.
Update all three CLAUDE.md sections to reference this flowchart. Remove memory-manager skill instruction if it doesn't exist, or document how it fits.
Why: Unambiguous dispatch order prevents wrong-tool selection, saves context, speeds answers. Currently Lucienne has three conflicting instructions.
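The flowchart above can be expressed as a first-match-wins table, which makes the order explicit and testable. This is a sketch under stated assumptions: the question-kind labels are hypothetical, and how a question gets classified into them is left out entirely.

```python
from enum import Enum

class Route(Enum):
    ATLAS = "dispatch Atlas (Explain mode)"
    GBRAIN = "mcp__gbrain__query"
    VAULT_MEMORY = "mcp__vault__memory_search"
    VAULT_VIEWS = "vault.db v_upcoming_tasks / v_active_projects"
    RAW_FILES = "raw file reads"

# Order mirrors steps 1-4 of the flowchart; the first matching kind wins.
# The kind labels are illustrative names, not identifiers from the system.
QUERY_ORDER = [
    ("system_architecture", Route.ATLAS),
    ("project_person_decision", Route.GBRAIN),
    ("agent_outcome", Route.VAULT_MEMORY),
    ("tasks_ops", Route.VAULT_VIEWS),
]

def choose_route(question_kinds):
    """Pick the route for a question tagged with one or more kinds."""
    for kind, route in QUERY_ORDER:
        if kind in question_kinds:
            return route
    # Step 5: nothing matched, fall back to raw file reads.
    return Route.RAW_FILES
```

A question tagged with several kinds resolves to the earliest step in the order, which is the property the three conflicting rules currently lack.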
What: Confirm whether ~/.claude/skills/memory-manager/ exists. If yes, read SKILL.md and add it to MEMORY.md as an entry under "Reference" or "Procedures". If no, either:
- Create a minimal skill (Read, Glob, Grep over Vault/memory/) that finds memories by name/description, OR
- Remove the reference from CLAUDE.md and use vault.mcp tools directly.
Why: Currently CLAUDE.md references a tool that may not exist, making the boot sequence unreliable.
What: Add a session-start hook that:
1. Checks if ~/.gbrain/brain.pglite/postmaster.pid exists
2. Runs pgrep -f "gbrain serve" to see if gbrain is actually running
3. If pid file exists but process is gone, remove the file silently
4. On next claude mcp list, GBrain should reconnect
Also: Add PreToolUse hook to verify GBrain is connected before running Atlas brief. If GBrain fails, log it and fall back to wiki files (not silently).
Why: GBrain is the speed layer for knowledge; if it fails silently, Lucienne reverts to 22-second waterfall without realizing it.
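Steps 1 to 3 of the session-start hook could be sketched as follows. The pid file path and the pgrep pattern come from this report; the function names and the returned status strings are assumptions, and the process check is injectable so the logic can be exercised without a running gbrain.

```python
import subprocess
from pathlib import Path

PID_FILE = Path.home() / ".gbrain" / "brain.pglite" / "postmaster.pid"

def gbrain_running():
    """True if pgrep finds a command line matching 'gbrain serve'.

    pgrep exits 0 when at least one process matches and 1 when none do.
    Raises FileNotFoundError on systems without pgrep.
    """
    result = subprocess.run(["pgrep", "-f", "gbrain serve"], capture_output=True)
    return result.returncode == 0

def clear_stale_pid(pid_file=PID_FILE, is_running=gbrain_running):
    """Remove the pid file only when no gbrain process owns it."""
    if not pid_file.exists():
        return "no-pid-file"
    if is_running():
        return "running"  # the pid file is legitimate, leave it alone
    pid_file.unlink()
    return "removed"
```

Returning a status rather than staying fully silent lets the hook log a "removed" outcome, which keeps the cleanup automatic without hiding how often it is needed.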
PKA is a well-engineered system with excellent architecture discipline, clean agent separation, and rigorous decision-making (GBrain A/B testing, council review gates). The memory model is sound; the two-DB split is correct; the dispatch model is lean.
The main work is housekeeping: fix contradictory rules, verify missing tools, automate health checks. None of these are architectural; they're maintenance and clarity.
Do not over-refactor. The system works. The improvements above are force multipliers (faster answers, fewer wrong turns, better reliability), not foundational fixes.
Audit conducted 2026-04-20. Report by outside-eyes reviewer.