Vault & SecondBrain

An interactive explainer — three stores, one index, a compiled wiki. Updated 2026-05-24 (post-v5). Tap the cards, rungs, and buttons below — they react.

The one-sentence version.

Everything Elmar captures and everything Lucienne needs to remember lives as markdown files in the PKA repo and is indexed into one searchable graph (vault.db); the high-value parts are then compiled by AI into a browsable wiki. "Vault" is how Lucienne behaves and remembers; "SecondBrain" is Elmar's projects and life.

Live counters (values not captured in this text): files indexed in vault.db · source markdown in data/ · compiled wiki pages · scan roots, one DB

1. Three stores — tap to explore

"Vault" is overloaded. There are three distinct stores plus a derived cache. Tap a card to see what's inside and what it's for:

Store	Path
🔐 Personal Vault	`~/.claude/vault/`
📒 PKA Vault	`~/PKA/Vault/`
🧠 SecondBrain	`~/PKA/SecondBrain/`
🗂️ data/ cache	`~/PKA/data/`

Rule of thumb: if it tells Lucienne how to behave or what to remember, it goes in a Vault. If it tells Elmar what's happening in his projects and life, it goes in SecondBrain.

2. The v5 change — sources left Obsidian

Until May 2026, thousands of converted source markdown files lived inside the Obsidian vault. That bloated it (~160 MB) and stalled mobile Obsidian on indexing.

before

5,000+ derived markdowns in SecondBrain/sources/ — bloating the vault, syncing to the phone, mixing machine cache with Elmar's own notes.

after (v5)

Sources moved to ~/PKA/data/sources/: out of Obsidian, gitignored, and still fully indexed via the data scan root. The Obsidian vault shrank from ~160 MB to ~4 MB.

SecondBrain/sources/ no longer exists. Paths resolve through pka_paths.resolve_source_path(). The vault is now just the brain (wiki + notes); data/ is the cache.
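To make the path change concrete, here is a minimal sketch of how a pre-v5 reference could be mapped to its new home. It is not the actual pka_paths.resolve_source_path(), whose signature isn't shown here; the function name resolve_legacy_source_path, the constants, and the repo-relative input format are all assumptions for illustration.

```python
from pathlib import Path

# Hypothetical layout constants; the real pka_paths module may differ.
PKA_ROOT = Path.home() / "PKA"
LEGACY_PREFIX = Path("SecondBrain") / "sources"  # pre-v5 location
DATA_SOURCES = Path("data") / "sources"          # v5 location


def resolve_legacy_source_path(ref: str, root: Path = PKA_ROOT) -> Path:
    """Map a repo-relative reference to its v5 location under data/sources/."""
    rel = Path(ref)
    try:
        # Succeeds only when the reference sits under the old sources folder.
        tail = rel.relative_to(LEGACY_PREFIX)
    except ValueError:
        # Not a legacy sources path: leave it where it is.
        return root / rel
    return root / DATA_SOURCES / tail
```

A reference such as SecondBrain/sources/exco/a.md would resolve under data/sources/exco/, while wiki pages and notes keep their original location.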

3. The three layers — how knowledge is built

SecondBrain is a Karpathy-style LLM wiki: capture once, index into a graph, compile into synthesized pages.

The hard work happens at ingest time, so queries are fast: the synthesis already exists. That is the difference from plain RAG, which re-discovers everything on every query.

4. The indexer — one DB, four roots

index.py (schema v4) scans four roots into the single vault.db. Markdown is source of truth; the DB rebuilds in seconds.

Root	Path	What's there
`pka`	`PKA/`	Memory, tasks, notes, projects, CLAUDE.md, team briefs, and `reports/` (commissioned analysis).
`personal`	`~/.claude/vault/`	Identity, contacts, entities, preferences, work, infrastructure (secrets excluded).
`secondbrain`	`SecondBrain/`	Meetings, ideas, docs, recipes, and the compiled `wiki/`.
`data` (v5)	`~/PKA/data/`	Out-of-vault source cache. Keeps moved sources indexed without bloating Obsidian.

Reports are indexed too (2026-05-24): HTML stripped to text — CSS/JS dropped, SVG diagram labels kept — so a commissioned report stays retrievable by search, not just viewable.
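The stripping described above can be approximated with the standard library alone. This is a sketch under assumptions, not the indexer's actual code: the class and function names are hypothetical, and it simply drops the bodies of style and script elements while keeping all other text, which includes SVG text labels.

```python
from html.parser import HTMLParser


class ReportTextExtractor(HTMLParser):
    """Collect visible text, skipping <style> and <script> bodies."""

    SKIPPED = {"style", "script"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside CSS/JS; SVG <text> labels pass through.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return "\n".join(self._chunks)


def report_to_text(html: str) -> str:
    """Return the searchable text of an HTML report."""
    parser = ReportTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The resulting text is what would be handed to the full-text index, so a diagram label is as findable as a paragraph.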

How a file is processed (4 passes)

1. hash & skip: unchanged files are skipped
2. frontmatter parse (YAML)
3. edges: typed frontmatter fields plus body wikilinks, with code blocks stripped so bash [[ ]] isn't mistaken for a link
4. FTS + tags

Rebuild from scratch any time: python index.py --rebuild.
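Two of the passes above are easy to illustrate: hash & skip, and wikilink extraction with code stripped first. The sketch below is an assumption about how they might look, not the contents of index.py; the function names, the SHA-256 choice, and the in-memory dict standing in for stored hashes are all hypothetical.

```python
import hashlib
import re

FENCE_RE = re.compile(r"```.*?```", re.DOTALL)  # fenced code blocks
INLINE_CODE_RE = re.compile(r"`[^`\n]*`")       # inline code spans
# [[target]], [[target#heading]] or [[target|alias]]; group 1 is the target.
WIKILINK_RE = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reindex(path: str, text: str, seen: dict) -> bool:
    """Return True (and record the new hash) only when the content changed."""
    digest = content_hash(text)
    if seen.get(path) == digest:
        return False
    seen[path] = digest
    return True


def extract_wikilinks(body: str) -> list:
    """Return wikilink targets, ignoring anything inside code."""
    stripped = FENCE_RE.sub("", body)
    stripped = INLINE_CODE_RE.sub("", stripped)
    return [m.group(1).strip() for m in WIKILINK_RE.finditer(stripped)]
```

Stripping code before matching is what keeps a shell test like `[[ -f x ]]` from becoming a bogus edge in the graph.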

5. How a question gets answered — tap a rung

CLAUDE.md Rule 15. Start at the top, stop when you have the answer. Tap each rung for when to use it.

GBrain (the old semantic layer) was retired 2026-05-01 — a 100%-reliable wiki+FTS combo beat its lock-prone synthesis.

6. Capture → compile

You only ever do two things: record meetings and handle email normally. Everything downstream is automatic.

1. CAPTURE meeting → Drive → Luci transcribes (Gemini) → meeting-notes skill writes a note
or email / Exco / PDF / DOCX → convert_to_md.py (markitdown, Gemini-vision fallback)
2. → markdown lands in data/sources/<type>/ (or SecondBrain notes for meetings)
3. → index.py rebuilds vault.db → searchable
4. → dream cycle (nightly, Luci) detects stale wiki pages
5. → dedicated compilers refresh curated pages
6. → next question answered from the page in seconds, with citations

Ingest channels & schedules

Channel	Where / when	Lands in
Exco sync	Mac LaunchAgent, daily 10:00 SAST	`data/sources/exco/`
Sources-sync (watched folders)	Mac LaunchAgent, daily 10:15 SAST	`data/sources/<watch>/`
Email sync	Luci, daily 07:00 SAST	`email.db` + OneDrive
"ingest this" / Team Inbox	on demand	converted, indexed
Meeting recorded	overnight	SecondBrain note → index.py

7. Your playbook — pick what you want to do

You almost never pick a folder. Tap what you're trying to do — see where it goes and whether it stays searchable:

The two inboxes — don't mix them up

📥 Team Inbox/ (you → team)

Drop files for the team to process. Lucienne triages whatever lands here.

📤 Elmar Inbox/ (team → you)

Finished work for you to read — reports, research, handoffs. Don't dump scratch here.

Read-only — generated, don't hand-edit

Don't touch	Why	Edit instead
`data/sources/**`	Machine-converted copies, rebuilt from the originals	the original in Dropbox / GDrive
`vault.db` · `graphify-out/`	Generated indexes — rebuilt from markdown	nothing — they regenerate
A wiki page above its `MANUAL OVERRIDES` line	Auto-compiled, overwritten on recompile	only below the marker
`_archived/` · `_deleted_/`	Historical snapshots	nothing — leave as history

Everything else under SecondBrain/ — ideas/, inbox/, Scratchpad/, docs/, your notes — is yours to write freely.
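The MANUAL OVERRIDES rule in the table above amounts to a simple merge on recompile. The following sketch shows one way it could work; the function name is hypothetical, and it assumes the marker text sits on a line of its own, which this document doesn't confirm.

```python
MARKER = "MANUAL OVERRIDES"  # assumed to appear on a single line in the page


def recompile_page(existing: str, compiled: str, marker: str = MARKER) -> str:
    """Replace everything above the marker line; keep the marker and all below it."""
    lines = existing.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if marker in line:
            manual_tail = "".join(lines[i:])
            return compiled.rstrip("\n") + "\n\n" + manual_tail
    # No marker yet: there is nothing hand-written to preserve.
    return compiled
```

Anything typed above the marker is discarded on the next compile, which is why hand edits belong only below it.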

8. The compiler & project registry

📄 Dedicated compilers

Three Mac-side scripts: compile.py, compile_priorities.py (Tue 12:00), and compile_board_brief.py (Tue 12:30). Each writes a synthesized page with a sources_hash for staleness detection.
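Staleness detection via sources_hash can be sketched as follows. How the real compilers compute the hash isn't stated here, so this is one plausible scheme under assumptions: the digest covers each source's path and content in sorted order, and the page's frontmatter is passed in as a plain dict.

```python
import hashlib


def sources_hash(sources: dict) -> str:
    """Hash source paths and contents in sorted order so the digest is stable."""
    h = hashlib.sha256()
    for path in sorted(sources):
        h.update(path.encode("utf-8"))
        h.update(b"\0")  # separator so path and content cannot run together
        h.update(sources[path].encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()


def is_stale(page_frontmatter: dict, sources: dict) -> bool:
    """A page is stale when its recorded hash no longer matches its sources."""
    return page_frontmatter.get("sources_hash") != sources_hash(sources)
```

With a check like this, a nightly pass only needs to recompute hashes to decide which pages deserve a recompile.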

🗒️ Project registry (v5)

wiki/projects/_registry.yaml is Elmar-owned. Each entry carries status, flavor, seeds, and parent. Seeds run as FTS queries, so a project finds its sources even when filenames don't contain the project name.
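Here is a small sketch of seeds acting as full-text queries, using SQLite's FTS5 through Python's sqlite3 module. The table name docs, its columns, the sample rows, and the seed phrases are all hypothetical; the real vault.db schema isn't shown in this document.

```python
import sqlite3


def fts_phrase(seed: str) -> str:
    """Quote a seed as an FTS5 phrase, doubling any embedded double quotes."""
    return '"' + seed.replace('"', '""') + '"'


def sources_for_project(conn: sqlite3.Connection, seeds: list) -> list:
    """Return paths whose body matches any seed phrase."""
    if not seeds:
        return []
    query = " OR ".join(fts_phrase(s) for s in seeds)
    rows = conn.execute(
        "SELECT path FROM docs WHERE docs MATCH ? ORDER BY path", (query,)
    ).fetchall()
    return [r[0] for r in rows]


def demo_index() -> sqlite3.Connection:
    """Build a tiny in-memory index; path is stored but not searched."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path UNINDEXED, body)")
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?)",
        [
            ("data/sources/exco/2026-05-minutes.md",
             "Discussion of the warehouse relocation budget and timeline."),
            ("data/sources/email/thread-17.md",
             "Follow-up on relocation budget approvals."),
            ("data/sources/exco/2026-04-minutes.md",
             "Quarterly hiring plan review."),
        ],
    )
    return conn
```

Because the match runs against the body rather than the path, generically named files such as meeting minutes still attach to the right project.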

9. Across machines — what syncs, what doesn't

Git moves only the markdown. Each machine keeps its own vault.db + data/ cache (both gitignored, rebuilt locally). Search is local and fast; the truth is the shared markdown.

10. Related & outstanding

Memory architecture — the auto-memory + symlink + promotion story: reports/memory-architecture.html.
System wiki — canonical detail in wiki/vault.md, wiki/secondbrain-how-it-works.md, wiki/memory-system.md.
Diagram review pending — the hand-laid Excalidraw flow diagrams still predate v5; flagged for a separate pass.
Salience instrumentation — needed to unblock the wiki promotion pipeline (still idle).

Generated 2026-05-24 · interactive explainer · PKA vault + SecondBrain · counts from live vault.db · all interactions are vanilla JS, no external dependencies