PKA Command Centre — Deep Review (2026-06-12)

Scope: dashboard.py (generator, 4,864 lines), serve_dashboard.py (Flask server, 2,873 lines), the generated dashboard.html (~2.5MB), and the LIVE app at http://localhost:8787 — tested in a real browser (all 6 tabs, screenshots in .scratch/wf-review/ux/).

Method: ultracode multi-agent workflow (run wf_ab1d9721-e04, 79 agents): 6 dimension finders (generator correctness, server correctness, data accuracy, dead/duplicated code, performance, live-browser UX) → every finding independently reproduced by an adversarial verifier → second independent verifier for high-severity items. 62 findings confirmed, 4 refuted. Read-only review; nothing fixed.

Headline: the command centre has one architectural keystone problem (viewing the page triggers the full data pipeline, and the page re-downloads itself every 30s) and a cluster of silently-wrong numbers on the Home tab (session counts contradict themselves three ways; the activity feed shows March as 'recent'). Several health cards show green for things that are actually broken — the exact failure the dashboard exists to catch.

Findings (priority order)

Severity: HIGH

1. Session counting: page shows three contradictory numbers (baked 106/43-active war strip vs live API 0), driven by get_sessions counting every transcript .jsonl — claude-mem observer files flood it

Kind: broken · Effort: M · Dimension: gen-correctness
Where: /Users/elmar/PKA/dashboard.py (742-871 (get_sessions), 2134-2184 (build_session_cards), 4046-4050 (baked hero), 4231-4261 (hydrateSessionBar))
Impact: Home shows '0 sessions' seconds after load while the war strip directly above shows ~107 session cards; the 'active session' signal is unusable. Plan task 5 (docs/plans/2026-06-05-command-centre-6tab-restructure.md: 'replace baked counts with client-side fetch of /api/sessions') was only half-implemented — war strip + alerts are still baked from a different counting method than the API, and the audit's 'session counts frozen+self-inconsistent' finding persists.
Evidence: Live-served dashboard.html (GET :8787/dashboard.html, regenerated 22:44): war strip = 107 baked cards, top cards labelled 'Claude Mem Observer Sessions · -Users-elmar--claude-mem-observer-sessions'; baked hero id=home-open-sessions = 106 ('43 active · 63 idle · 0 stale'). GET /api/sessions at the same moment → {counts:{active:0,idle:0,stale:0,closed:0}, sessions:[]} — hydrateSessionBar (line 4251-4254) then overwrites the hero to 0. Filesystem confirms 46 .jsonl transcripts modified <5min, mostly under ~/.claude/projects/-Users-elmar--claude-mem-observer-sessions/ — get_sessions (lines 769-818) counts each .jsonl file in every project dir as one session, with no exclusion of the claude-mem observer pseudo-project or subagent sidechains.
Suggested action: Pick one source of truth: exclude the claude-mem-observer project dir (and ideally sidechain .jsonl) in get_sessions, render the war strip client-side from /api/sessions, or at minimum reconcile the generator's counting rules with serve_dashboard's /api/sessions rules.
Verifier: Independently reproduced every element on the live server (2026-06-12 13:18). GET :8787/api/sessions returned {sessions:[], counts:{active:0,idle:0,stale:0,closed:0}} — handler serve_dashboard.py:1887-1922 reads only the manual sessions.json registry. Simultaneously, the live-served dashboard.html had baked hero id=home-open-sessions=19 and 20 war-cards, including 6 labelled 'Claude Mem Observer S
Second verifier: Re-reproduced live (2026-06-12 13:32, GET-only): /api/sessions returned all-zero counts with empty sessions[], while the simultaneously served dashboard.html had baked hero home-open-sessions=119 ('57 active · 62 idle · 0 stale') and ~123 war-cards, 50 labelled 'Claude Mem Observer Sessions'. Read s
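
Sketch for the first option in the suggested action (illustrative only, not the actual get_sessions code). It assumes transcripts sit at <projects root>/<project dir>/<name>.jsonl and that the observer pseudo-project can be identified by its directory name, as the war-strip labels suggest. The names scan_transcripts and EXCLUDED_PROJECT_DIRS are hypothetical; sidechain filtering is left out because the review does not say how sidechain files are marked.

```python
import time
from pathlib import Path
from typing import Optional

# Assumption: the observer pseudo-project is recognisable by directory name.
EXCLUDED_PROJECT_DIRS = {"-Users-elmar--claude-mem-observer-sessions"}


def scan_transcripts(projects_root: Path,
                     max_age_min: float = 240.0,
                     now: Optional[float] = None) -> list:
    """Return one record per recent transcript, skipping excluded project dirs."""
    now = time.time() if now is None else now
    sessions = []
    if not projects_root.is_dir():
        return sessions
    for project_dir in sorted(projects_root.iterdir()):
        if not project_dir.is_dir() or project_dir.name in EXCLUDED_PROJECT_DIRS:
            continue
        for transcript in sorted(project_dir.glob("*.jsonl")):
            age_min = (now - transcript.stat().st_mtime) / 60.0
            if age_min <= max_age_min:
                sessions.append({
                    "project": project_dir.name,
                    "file": transcript.name,
                    "age_min": age_min,
                })
    return sessions
```

If both the generator and /api/sessions called one helper like this, the baked and hydrated numbers could no longer come from different counting rules.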

2. KYC unverified-fields card scans a non-existent path and always renders a green 'All KYC fields verified' all-clear

Kind: broken · Effort: S · Dimension: gen-correctness
Where: /Users/elmar/PKA/dashboard.py (510-519)
Impact: The Memory-tab KYC card permanently shows a green 0/'All KYC fields verified' while 4 fields are actually unverified — Elmar believes verification work is done when it isn't. Silent false positive since the scan directory vanished.
Evidence: Line 510: sb_wiki = Path.home() / "CoWork" / "SecondBrain" / "wiki" — ls /Users/elmar/CoWork/SecondBrain/wiki → 'No such file or directory'. SecondBrain actually lives at ~/PKA/SecondBrain (per repo + pka_paths.secondbrain_wiki_root(), which this same file imports at line 62 and uses in get_auto_compile_health but NOT here). Running the identical scan logic against the correct root (/Users/elmar/PKA/SecondBrain/wiki) yields 4 unverified fields (yellownickel.md: 4). Generated and live-served dashboard.html both contain 'kyc-big-num zero">0<' and 'All KYC fields verified ✓' (verified via GET http://localhost:8787/dashboard.html).
Suggested action: Replace the hardcoded ~/CoWork path with the already-imported secondbrain_wiki_root() (or ROOT/'SecondBrain'/'wiki'), and make get_kyc_unverified return an error marker when neither kyc/ nor entities/ exists instead of total=0.
Verifier: Independently reproduced every claim. (1) Read dashboard.py:489-563 — get_kyc_unverified() hardcodes Path.home()/'CoWork'/'SecondBrain'/'wiki' at line 510; ls confirms /Users/elmar/CoWork/SecondBrain/wiki does not exist, so the .exists() guards at 514/518 silently skip both scan dirs and the function returns total=0 with no error. (2) Confirmed it is live code: called at dashboard.py:3423, rendere
Second verifier: Second-pass verification with a severity-stress lens, all probes re-run independently. (1) ls confirms /Users/elmar/CoWork/SecondBrain/wiki does not exist while /Users/elmar/PKA/SecondBrain/wiki/{kyc,entities} are populated. (2) Re-ran the exact scan logic from dashboard.py:489-563 against the real
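
Sketch of the error-marker half of the suggested action, assuming the scan only needs the kyc/ and entities/ subdirectories of the wiki root. The function name kyc_scan_dirs and the returned dict shape are hypothetical; the real get_kyc_unverified may return something different.

```python
from pathlib import Path


def kyc_scan_dirs(wiki_root: Path) -> dict:
    """Resolve the scan dirs; report an error instead of a silent zero."""
    candidates = [wiki_root / "kyc", wiki_root / "entities"]
    present = [d for d in candidates if d.is_dir()]
    if not present:
        # Neither dir exists: the card must not render a green all-clear.
        return {
            "ok": False,
            "error": f"no kyc/ or entities/ under {wiki_root}",
            "dirs": [],
        }
    return {"ok": True, "error": None, "dirs": present}
```

The caller would render an amber or red 'scan path missing' state whenever ok is False, so a vanished directory can never look like completed verification work.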

3. Home 'Open Sessions' is overwritten to 0 on page load — live API reads an empty registry while dozens of sessions are active

Kind: broken · Effort: S · Dimension: data-accuracy
Where: /Users/elmar/PKA/serve_dashboard.py (1887-1923 (api_sessions); dashboard.py:743-830 (get_sessions))
Impact: The single most prominent number on the Home tab is wrong twice over: baked value uses one definition (transcript files), the hydrated value uses another (manual registry, currently empty). Elmar sees '0 sessions' while ~44 sessions are live, making the war-room view useless for spotting running/stuck work.
Evidence: GET http://localhost:8787/api/sessions returned {"sessions": [], "counts": {"active": 0, "idle": 0, "stale": 0, "closed": 0}} (saved at .scratch/wf-review/sessions.json). ~/.claude/vault/sessions.json is 20 bytes (mtime Jun 10 22:14). The baked HTML simultaneously shows id="home-open-sessions">34 (22:23 bake), then 126 (22:50), then 148 '33 active · 90 idle · 25 stale' (22:58). hydrateSessionBar() in the generated page replaces the baked value with the API total on load and every 30s, so the user sees '0 sessions · 0 active'. Ground truth: 44 transcript .jsonl files under ~/.claude/projects modified within 5 min at 22:38 (this review session included).
Suggested action: Make /api/sessions use the same transcript-autodetect logic as dashboard.py get_sessions() (or have both read one shared source), and investigate what emptied sessions.json at 22:14.
Verifier: Reproduced end-to-end: (1) GET /api/sessions on the live server returned {"sessions": [], "counts": {active:0, idle:0, stale:0, closed:0}}; (2) ~/.claude/vault/sessions.json is 20 bytes / {"sessions": []}; (3) serve_dashboard.py:1887-1923 api_sessions reads ONLY the manual registry file, while dashboard.py:743-830 get_sessions (which bakes the HTML value) scans ~/.claude/projects/*/.jsonl mtimes
Second verifier: Independently re-ran the key probes (GET-only, read-only). (1) Live GET http://localhost:8787/api/sessions returned {"sessions": [], "counts": {"active": 0, "idle": 0, "stale": 0, "closed": 0}} and ~/.claude/vault/sessions.json is 20 bytes containing {"sessions": []} — matches both verifiers. (2) se
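
Sketch of a single counting routine that both the generator and /api/sessions could share. The thresholds are hypothetical (the review only mentions a 5-minute 'modified recently' probe and a 4-hour scan window), and the input shape (dicts with "id" and "age_min") is an assumption, not the actual registry format.

```python
from collections import Counter

# Hypothetical cut-offs; the generator's real thresholds are not stated here.
ACTIVE_MAX_MIN = 5
IDLE_MAX_MIN = 240


def classify(age_min: float) -> str:
    """Bucket a session by minutes since its last activity."""
    if age_min <= ACTIVE_MAX_MIN:
        return "active"
    if age_min <= IDLE_MAX_MIN:
        return "idle"
    return "stale"


def session_counts(registry: list, detected: list) -> dict:
    """Merge registry and auto-detected sessions, then count by status.

    Registry entries win when both sources carry the same id.
    """
    merged = {s["id"]: s for s in detected}
    merged.update({s["id"]: s for s in registry})
    counts = Counter(classify(s["age_min"]) for s in merged.values())
    return {status: counts.get(status, 0) for status in ("active", "idle", "stale")}
```

With an empty registry the result falls back to the detected sessions instead of reporting zero.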

4. Activity feed and 'Usage by Role' recents sort by id DESC, but activity_log ids are not chronological — today's events buried at position 1,787, 'recent' shows 31 March

Kind: broken · Effort: S · Dimension: data-accuracy
Where: /Users/elmar/PKA/dashboard.py (319-322 (get_activity_log), 607-611 (get_timesheet_data actor_recent))
Impact: The activity feed's visible top is yesterday-and-older entries in jumbled order; today's 63 events are effectively invisible, and per-role 'recent activity' claims 2.5-month-old items are the latest work. Misleads any 'what happened today' glance.
Evidence: vault.db (read-only): id range for date 2026-06-10 is 1448165-1449146 while date 2026-03-31 occupies HIGHER ids 1450164-1450179; max id 1450932 is a 2026-06-09 row. In the generated page, the first 2026-06-10 entry appears at position 1,787 of 3,393 evt-date entries; the Lucienne 'Usage by Role' card shows three 'ts-recent' lines all dated 2026-03-31 ('Task done: Push latest PKA changes…') while the true latest Lucienne row is 2026-06-10 ('findash single-writer plan v2…').
Suggested action: ORDER BY date DESC, id DESC (or created_at DESC) in get_activity_log and the actor_recent query, since the indexer rebuild does not preserve chronological ids.
Verifier: Independently reproduced all elements. (1) Read dashboard.py:319-322 and 607-611 — both get_activity_log and actor_recent use ORDER BY id DESC. (2) Read-only sqlite on vault.db: ids are not chronological — for actor Lucienne, ORDER BY id DESC returns 2026-03-31 rows (ids ~1672505-1672507, 'Task done: Push latest PKA changes to GitHub') while her true latest row is dated 2026-06-12; between two que
Second verifier: Second-pass verification with a severity/proportionality lens; all probes read-only GET/SELECT. (1) Code: dashboard.py:319-322 (get_activity_log) and 607-611 (actor_recent) both use ORDER BY id DESC with no date ordering — confirmed by direct read. Searched the whole file for any compensating sort:
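
The ordering problem can be reproduced in isolation. The sketch below uses an in-memory table with two ids taken from the evidence; the table layout (id, date, summary) is an assumption and the summary text is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_log (id INTEGER PRIMARY KEY, date TEXT, summary TEXT)"
)
conn.executemany(
    "INSERT INTO activity_log (id, date, summary) VALUES (?, ?, ?)",
    [
        (1448165, "2026-06-10", "newer event, lower id"),
        (1450164, "2026-03-31", "older event, higher id"),
    ],
)

# Current behaviour: highest id first, which puts March on top.
by_id = [row[0] for row in conn.execute(
    "SELECT date FROM activity_log ORDER BY id DESC")]

# Suggested fix: ISO dates sort chronologically as text; id breaks ties.
by_date = [row[0] for row in conn.execute(
    "SELECT date FROM activity_log ORDER BY date DESC, id DESC")]

print(by_id)    # ['2026-03-31', '2026-06-10']
print(by_date)  # ['2026-06-10', '2026-03-31']
```

This only works as shown if date is stored in ISO YYYY-MM-DD form, which the evidence rows suggest.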

5. Live WS refresh silently wipes in-progress user input every ~10-15s (filter text, checkbox selections)

Kind: broken · Effort: S · Dimension: ux-live
Where: /Users/elmar/PKA/dashboard.py (4401-4402, 4476-4520 (doRefresh))
Impact: Elmar cannot complete any multi-step interaction on the Reports tab — typed filter text and bulk-delete selections are destroyed mid-action every few seconds, making bulk operations effectively unusable and the page feel haunted.
Evidence: Set #reports-search to 'fuel' (list correctly filtered 492→33 shown) and tagged the input with dataset.marker='X'. Polling every 5s: at t=10s value='fuel'/marker='X'; at t=15s value=''/marker=null. The entire .shell innerHTML was replaced by doRefresh(), which fetches the full /dashboard.html?t= (2.5MB) on every 'sessions.updated' WS event and restores only active tab, sort mode and scrollY, not filter text, report checkbox selections, or open details elements. A separate 2-min test showed filter+checked checkbox both reset to empty/0 (screenshot .scratch/wf-review/ux/16-reports-after-refresh.png). Tested against the LIVE server at :8787.
Suggested action: In doRefresh(), preserve and restore #reports-search value (re-dispatch input event), checked .report-select-cb state and open details elements; better, debounce sessions.updated and patch only the session strip/counters instead of swapping the whole shell and re-downloading 2.5MB per event.
Verifier: Read dashboard.py:4400-4523: sessions.updated → doRefresh(); doRefresh() refetches full /dashboard.html?t= and replaces .shell innerHTML, restoring only active tab, sort mode, bulk-bar init and scrollY — never filter text, checkbox selections, or open details. Line 4523 adds setInterval(doRefresh, 30000), so the wipe fires at least every 30s even without WS events. Verified the LIVE server at :878
Second verifier: Independently re-verified both halves of the finding. (1) Source: /Users/elmar/PKA/dashboard.py:4400-4402 (sessions.updated → doRefresh), 4476-4519 (doRefresh refetches full /dashboard.html?t= and replaces .shell innerHTML, restoring only active tab, sort mode, bulk-bar recalc, memory count, scrollY

6. War-strip session ticker accumulates duplicate cards without bound (60 → 136 in ~30 min, 20,000px wide)

Kind: broken · Effort: S · Dimension: ux-live
Where: /Users/elmar/PKA/dashboard.py (769-818 (get_sessions transcript scan, dedupe key line 801), 2159 (card render))
Impact: The top-of-page 'what is running right now' strip is unreadable noise — dozens of duplicate cards for the same two activities, requiring ~15 screens of horizontal scrolling; genuine session info is buried.
Evidence: Observed .war-strip .war-card count grow 60 → 68 (65s later) → 73 → 124 → 134 → 136 during the session; scrollWidth 19,821px vs 1,356px visible (screenshot .scratch/wf-review/ux/07-war-strip.png). Cards are near-identical: 'Claude Mem Observer Sessions · -Users-elmar--claude-mem-observer-sessions' ×5+ and 'PKA · Lucienne' ×17 per age bucket (0m,1m,2m...). Served HTML itself contained 116 'war-card' occurrences (server-side, not a JS append bug). Root cause read in code: get_sessions() dedupes by (project, jsonl.name) — every transcript .jsonl modified in the last 4h becomes its own card, and the claude-mem observer creates new transcript files continuously. Labels also expose the raw project slug '-Users-elmar--claude-mem-observer-sessions'.
Suggested action: Dedupe sessions per (project) or per session UUID keeping only the most recent transcript, cap the strip at ~10 cards, exclude/aggregate the claude-mem observer project, and humanise the '-Users-elmar--*' slug.
Verifier: Independently reproduced. Code: dashboard.py:769-818 get_sessions() makes every *.jsonl transcript modified <4h its own session card (dedupe key (project, jsonl.name) at line 801); build_session_cards() (lines 2134-2172, card render 2159) emits one war-card per active/idle session, no cap/aggregation. Live: GET http://localhost:8787/dashboard.html at 13:42 contained 155 'war-card' occurrences incl
Second verifier: Independently re-verified two days after the first pass (different lens: severity/mitigation stress-test). Live GET of http://localhost:8787/dashboard.html shows 168 war-card occurrences, 79 identical 'Claude Mem Observer Sessions' cards (raw slug exposed) — persistent steady-state, matching first v
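
Sketch of the per-project dedupe, cap and slug clean-up from the suggested action. Function names and the session dict shape ("project", "age_min") are hypothetical, and the slug rule is a heuristic that assumes slugs start with '-Users-<name>-'.

```python
def humanise_slug(slug: str) -> str:
    """Turn a raw project slug into a short label (heuristic)."""
    parts = slug.replace("--", "-").strip("-").split("-")
    # Drop the leading 'Users', '<name>' pair when present.
    if len(parts) > 2 and parts[0] == "Users":
        parts = parts[2:]
    return " ".join(parts)


def dedupe_and_cap(sessions: list, cap: int = 10) -> list:
    """Keep the most recent transcript per project, newest first, max `cap`."""
    newest = {}
    for session in sessions:
        current = newest.get(session["project"])
        if current is None or session["age_min"] < current["age_min"]:
            newest[session["project"]] = session
    return sorted(newest.values(), key=lambda s: s["age_min"])[:cap]
```

Keying on the project alone would collapse genuinely parallel sessions in one project into a single card; keying on a session UUID, as the suggested action also mentions, avoids that if the transcripts expose one.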

7. Home claims '0 sessions / 0 active' while the war-strip directly above shows dozens of live session cards

Kind: broken · Effort: S · Dimension: ux-live
Where: /Users/elmar/PKA/serve_dashboard.py (1887 (/api/sessions) vs dashboard.py:743-818)
Impact: The two session indicators contradict each other on first read — Elmar cannot trust either; 'is anything running?' gets two opposite answers on one page.
Evidence: GET http://localhost:8787/api/sessions (live server) returned {"sessions": [], "counts": {"active": 0, "idle": 0, "stale": 0, "closed": 0}} while the war-strip rendered 73+ active cards including 'PKA · Lucienne | 0m' (this very session). Home tab '.home-session-bar' showed '0 sessions | 0 active | 0 idle | 0 stale' and hero card '0 OPEN SESSIONS' (screenshots 01-home-top.png, 08-home-mid.png). hydrateSessionBar() (dashboard.py:4230) reads /api/sessions which only reflects the manual registry (~/.claude/vault/sessions.json), whereas the war-strip is baked from transcript auto-detection — two divergent session sources on the same screen.
Suggested action: Make /api/sessions use the same transcript auto-detect + registry merge as the generator's get_sessions(), or have hydrateSessionBar count the baked war-strip cards as fallback when the registry is empty.
Verifier: Reproduced end-to-end against the live server. (1) GET http://localhost:8787/api/sessions returned {"sessions": [], "counts": {active:0, idle:0, stale:0, closed:0}} — because serve_dashboard.py:1887-1922 reads only env.sessions_file, and ~/.claude/vault/sessions.json is 20 bytes / 0 sessions. (2) The live-served /dashboard.html contains 156 war-card session cards in the war-strip (baked by dashboa
Second verifier: Independently re-confirmed against the LIVE server (2026-06-12): GET http://localhost:8787/api/sessions returned {"counts": {active:0, idle:0, stale:0, closed:0}, sessions: []} while the live-served /dashboard.html simultaneously contained 162 'war-card' occurrences and baked hero values home-open-s

Severity: MEDIUM

8. Three weekly wiki-compile targets (safair-holdings, safair-operations, safair-lease-finance) have never been produced — pages CLAUDE.md cites as readable don't exist

Kind: broken · Effort: M · Dimension: data-accuracy
Where: /Users/elmar/PKA/SecondBrain/wiki
Impact: The dashboard is truthful here, but the compile pipeline for three pages the retrieval guidance depends on has never delivered output; sessions following Rule 15 hit dead references.
Evidence: Home tab Wiki Compile card shows status 'missing / Compiled missing' for safair-holdings.md, safair-operations.md, safair-lease-finance.md (Mac LaunchAgent, Tue 12:40). Verified: find SecondBrain/wiki -iname 'safair-holdings*' etc. → no matches anywhere (entities/, synthesis/, projects/ checked). Yet CLAUDE.md Rule 15 step 2 lists all three as known auto-compiled pages to read first.
Suggested action: Check the Mac LaunchAgent for the 12:40 compile lane (logs/mac-tasks) and either fix the compiler or remove the three pages from CLAUDE.md Rule 15 and the manifest.
Verifier: Independently reproduced every element: (1) find over /Users/elmar/PKA/SecondBrain/wiki for safair-holdings/safair-operations/safair-lease* returns nothing; whole-repo find matches only unrelated email sources in data/sources/emails. (2) Vault/config/wiki-auto-compile.json active_entries lists all three pages enabled=true, runner com.elmar.entity-compile, Tuesday 12:40 SAST Mac LaunchAgent; comp

9. get_active_agents queries a table (agent_events) that does not exist in vault.db; the exception is swallowed, so org-chart live-agent status is permanently dead

Kind: broken · Effort: S · Dimension: gen-correctness
Where: /Users/elmar/PKA/dashboard.py (676-699 (get_active_agents), 3483-3494 (consumer))
Impact: The 'agent is Active via hook events' feature silently never fires — the org chart's status column conveys false information (everything looks idle), and the broken dependency is invisible because the error is swallowed.
Evidence: sqlite3 'file:/Users/elmar/PKA/Vault/vault.db?mode=ro' '.tables' lists: activity_log, edges, files, relation_types, search_fts, tag_assignments, tags, task_runs, v_, vitals_log — no agent_events. 'SELECT ts FROM agent_events LIMIT 3' → 'Error: no such table: agent_events'. get_active_agents wraps the query in try/except Exception: return [] (line 698-699), so active_agent_names at line 3484 is always empty and every agent in the System-tab org chart renders 'On Shelf' regardless of real activity.
Suggested action: Either create/restore the agent_events ingestion or delete get_active_agents and the active_agent_names override; at minimum log the sqlite error instead of returning [] silently.
Verifier: Independently reproduced. (1) Read dashboard.py:676-699 — get_active_agents queries agent_events with try/except Exception: return []; consumer at 3483-3493 builds active_agent_names for the org-chart 'Active' override. (2) Read-only sqlite on /Users/elmar/PKA/Vault/vault.db: 'SELECT COUNT(*) FROM agent_events' → 'no such table: agent_events'; full table list contains no agent table. (3) Executed
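
Sketch of the 'at minimum' option: log the sqlite error and return an explicit marker instead of a bare empty list. The column name agent and the returned dict shape are assumptions; the review only shows that agent_events is queried.

```python
import logging
import sqlite3

log = logging.getLogger("dashboard")


def get_active_agents(conn: sqlite3.Connection) -> dict:
    """Return agent names, or an error marker when the query fails."""
    try:
        # Hypothetical column name; the real query is not quoted in full.
        rows = conn.execute("SELECT DISTINCT agent FROM agent_events").fetchall()
    except sqlite3.Error as exc:
        log.warning("agent_events query failed: %s", exc)
        return {"ok": False, "error": str(exc), "agents": []}
    return {"ok": True, "error": None, "agents": [row[0] for row in rows]}
```

The org chart could then show 'status unavailable' when ok is False rather than rendering every agent as 'On Shelf'.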

10. Baked session count is inflated by claude-mem observer transcripts and swings wildly (34→126→148 in 35 min)

Kind: broken · Effort: S · Dimension: data-accuracy
Where: /Users/elmar/PKA/dashboard.py (743-830 (get_sessions), 1350-1367 (get_hero_metrics))
Impact: Even before the hydration bug zeroes it, the 'Open Sessions' metric and the header session cards double-count memory-pipeline artifacts as work sessions, so the number never reflects reality.
Evidence: get_sessions() counts every *.jsonl under ~/.claude/projects modified <4h as an open session. In the 22:58 bake, 67 of 124 header war-cards are 'Claude Mem Observer Sessions · -Users-elmar--claude-mem-observer-sessions' (grep -o 'class="war-project">...' dash3.html | sort | uniq -c). Baked Open Sessions value observed at 34 (22:23), 126 (22:50), 148 (22:58) — driven by background extractors touching transcripts, not by real session changes.
Suggested action: Exclude the claude-mem observer project dir (and other non-interactive transcript producers) from get_sessions() detection, or key sessions on the registry plus a tight transcript filter.
Verifier: Read dashboard.py:742-818 — get_sessions() counts every *.jsonl under ~/.claude/projects with mtime<240min as an open session, with no exclusion for the claude-mem observer project; get_hero_metrics() (1350-1367) sums them all into total_open. Reproduced live via GET http://localhost:8787/dashboard.html: baked hero shows Open Sessions=46, and 15 of the 46 war-cards are 'Claude Mem Observer Session

11. Dashboards listing and serving use two diverged path resolvers — newest Exco Dashboard (9 June) is served but invisible in the command centre

Kind: broken · Effort: S · Dimension: dead-dup
Where: /Users/elmar/PKA/dashboard.py (1561-1575; serve_dashboard.py:444-490)
Impact: The latest Exco dashboard has been missing from the command centre app list for 36+ hours; Elmar sees the stale 2 June version listed instead. Any output written only to the repo dir (not mirrored to cloud) silently disappears from the UI.
Evidence: dashboard.py get_dashboards() lists from ONE base via _primary_output_base('dashboards') (cloud GDrive PKA-Outputs once non-empty), while serve_dashboard.py _output_bases/_output_dir_for serves per-file cloud-first WITH repo fallback. 'Exco Dashboard - 9 June 2026.html' exists in repo dashboards/ (mtime 2026-06-09 10:01:35) but NOT in '/Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/dashboards' (verified directory listing diff). Generated dashboard.html (regenerated 2026-06-10 22:53) contains '2 June 2026' 3 times and '9 June 2026' 0 times. Live GET http://localhost:8787/dashboards/Exco%20Dashboard%20-%209%20June%202026.html returned 200 (72,948 bytes) — the file serves fine, it just never appears in the Apps listing.
Suggested action: Make get_dashboards() enumerate the union of cloud + repo bases (reuse the serve-side _output_bases search order, freshest wins per filename), or guarantee every dashboard export also lands in PKA-Outputs.
Verifier: Independently reproduced every claim. (1) Read dashboard.py:1561-1575: _primary_output_base() returns ONE base — cloud GDrive PKA-Outputs once non-empty, else repo — and get_dashboards() (line 1578-1684) globs *.html only from that single base. Read serve_dashboard.py:444-490: _output_bases/_output_dir_for serve per-file across cloud+repo (+OneDrive for findash), picking freshest mtime. The two re
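
Sketch of the 'union of bases, freshest wins per filename' option. The function name list_dashboards is hypothetical, and the list of bases is passed in rather than derived from _output_bases, whose exact signature is not shown in this review.

```python
from pathlib import Path


def list_dashboards(bases: list) -> dict:
    """Map each dashboard filename to its freshest copy across all bases."""
    freshest = {}
    for base in bases:
        if not base.is_dir():
            continue
        for html in base.glob("*.html"):
            best = freshest.get(html.name)
            if best is None or html.stat().st_mtime > best.stat().st_mtime:
                freshest[html.name] = html
    return freshest
```

A file that exists only in the repo dashboards/ directory, like the 9 June export, would then appear in the listing even though it was never mirrored to the cloud base.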

12. Open dashboard tab drives a full server regeneration pipeline every 30s, continuously

Kind: broken · Effort: S · Dimension: performance
Where: /Users/elmar/PKA/dashboard.html (43325 (generated; source dashboard.py:4523 per the verifier), serve_dashboard.py:409-413)
Impact: The Mac burns 5.6-8.7s of Python work every 30 seconds around the clock while a tab is open (battery/CPU), and the page re-downloads 2.5MB each cycle. The 'fallback' poll runs even when the WS live channel is healthy, making the WS design pointless.
Evidence: dashboard.html:43325 let fallbackTimer = setInterval(doRefresh, 30000); — never cleared even when the WebSocket is connected (ws.onopen at 43303 only updates the status dot). doRefresh() (43417) fetches '/dashboard.html?t='+Date.now(), and serve_dashboard.py:411 calls regenerate() on every such GET with only a 10s cooldown (REGEN_COOLDOWN=10, line 250) — 30s polls always miss the cooldown. Verified live with zero requests from me: dashboard.html mtime advanced 22:55:05 -> 22:55:42 -> 22:56:05 -> 22:56:35 (~every 30s). Each regen measured at 5.59s and 8.67s wall time. That is ~19-29% CPU duty cycle 24/7 plus a 2,493,769-byte uncompressed transfer per poll (~5 MB/min, ~7.2 GB/day per open tab).
Suggested action: Clear fallbackTimer when ws.onopen fires (re-arm on ws.onclose), and/or have doRefresh hit a lightweight delta endpoint instead of the full page (see /api/sessions finding).
Verifier: Independently reproduced every element. (1) Code: dashboard.py:4523 (source of generated dashboard.html) has let fallbackTimer = setInterval(doRefresh, 30000); and rg confirms no clearInterval anywhere in dashboard.py or dashboard.html; ws.onopen (dashboard.py:4371-4377) only recolors the status dot/label, so the 30s poll runs even with a healthy WebSocket. doRefresh (4476ff) fetches '/dashboard
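
The duty-cycle and transfer figures in the evidence follow directly from the measured values, assuming one poll every 30s per open tab and decimal gigabytes:

```python
POLL_INTERVAL_S = 30
REGEN_WALL_S = (5.59, 8.67)   # measured regen wall times, seconds
PAGE_BYTES = 2_493_769        # uncompressed transfer per poll

# Share of each 30s window spent regenerating, in percent.
duty_cycle = [round(100 * t / POLL_INTERVAL_S, 1) for t in REGEN_WALL_S]

polls_per_day = 24 * 60 * 60 // POLL_INTERVAL_S
gb_per_day = PAGE_BYTES * polls_per_day / 1e9

print(duty_cycle)            # [18.6, 28.9]
print(polls_per_day)         # 2880
print(round(gb_per_day, 1))  # 7.2
```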

13. GET /dashboard.html synchronously runs index.py + generate_context.py + full rebuild in the request path, with no staleness check

Kind: broken · Effort: S · Dimension: performance
Where: /Users/elmar/PKA/serve_dashboard.py (355-381, 409-413; dashboard.py:4790-4815)
Impact: First page load (or any load >10s after the last) stalls 5.6-8.7s before HTML arrives; the requester pays for a full vault re-index they didn't ask for. If index.py fails the whole page build fails (required=True).
Evidence: serve_dashboard.py regenerate() (355-381) subprocess-runs dashboard.py (timeout 30s) inside the request; dashboard.py main() (4810-4815) first runs index.py (required=True, a vault.db WRITER) and generate_context.py as subprocesses on every invocation. There is no change/staleness detection anywhere — only the 10s time cooldown — so it rebuilds even when nothing changed. Measured: triggering GET blocks 5.587s and 8.667s; profiled read-only data-gather + HTML build alone is 1.52s (scratch profile_dash.py), so ~4.1-7.1s of every regen is the index.py/generate_context.py subprocess chain.
Suggested action: Move regeneration to a background thread (serve current file immediately, regenerate behind it like _maybe_scan_token_dashboard already does), and skip regen when no source mtime/DB max(id) changed. Decouple index.py from page rendering (scheduled indexer).
Verifier: Read serve_dashboard.py:355-381 (regenerate() subprocess-runs dashboard.py synchronously, gated only by REGEN_COOLDOWN=10 at line 250, no staleness/mtime check) and :403-413 (both / and /dashboard.html call regenerate() in the request path). Read dashboard.py:4790-4816: main() unconditionally runs index.py (required=True, raises on failure) and generate_context.py before the build; index.py:432-51
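
Sketch of the staleness half of the suggested action: skip regeneration when no watched source is newer than the generated file. Both function names are hypothetical, and which files count as sources (vault.db, report directories, and so on) is left to the caller.

```python
from pathlib import Path


def newest_mtime(paths: list) -> float:
    """Latest modification time among existing paths, or 0.0 if none exist."""
    return max((p.stat().st_mtime for p in paths if p.exists()), default=0.0)


def needs_regen(output: Path, sources: list) -> bool:
    """True when the output is missing or older than any source."""
    if not output.exists():
        return True
    return newest_mtime(sources) > output.stat().st_mtime
```

regenerate() could consult a check like this after the cooldown test; it does not by itself move the rebuild off the request path, which still needs the background-thread change.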

14. Raw Node.js stack trace dumped into the Home tab QMD card (QMD integration broken)

Kind: broken · Effort: S · Dimension: ux-live
Where: /Users/elmar/PKA/dashboard.py (965-999 (qmd status capture))
Impact: A retrieval layer is silently broken (QMD evals score 0.000) and the dashboard surfaces it as an unreadable raw stack trace instead of a one-line status + fix hint — looks broken and tells the reader nothing actionable.
Evidence: Home tab QMD card renders verbatim: "QMD · unavailable — qmd --index pka status failed (1): node:internal/modules/cjs/loader:1939 ... Error: The module '/Users/elmar/.bun/install/global/node_modules/better-sqlite3/build/Release/better_sqlite3.node' was compiled against a different Node.js version using NODE_MODULE_VERSION 141. This version of Node.js requires NODE_MODULE_VERSION 127..." (extracted from live DOM; visible as the dense grey text wall in .scratch/wf-review/ux/08-home-mid.png). Consistent with the Evals tab showing QMD P@5 0.000 / MRR 0.000.
Suggested action: Rebuild better-sqlite3 for the current Node (or pin the node used by the qmd wrapper); in dashboard.py truncate the error to the first meaningful line and show a short 'QMD offline — node module version mismatch, run npm rebuild' message.
Verifier: Read dashboard.py:963-1002: get_qmd_health appends full untruncated stderr to the error string on non-zero exit. GET http://localhost:8787/dashboard.html (200, 2.55MB) — confirmed the QMD tile inside the Home panel (div id="p-home" at line 2461; tile at ~line 2645) renders 'QMD · unavailable' followed by the verbatim multi-line Node stack trace: node:internal/modules/cjs/loader:1939, better_sqlite
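
Sketch of the truncation step: reduce captured stderr to one readable line before it reaches the card. The function name summarise_error and the 160-character limit are hypothetical; the 'Error:' prefix rule is a heuristic that matches the Node trace quoted above, not a general parser.

```python
def summarise_error(stderr: str, max_len: int = 160) -> str:
    """Pick the first 'Error:' line, else the first non-empty line."""
    lines = [line.strip() for line in stderr.splitlines() if line.strip()]
    if not lines:
        return "unknown error"
    for line in lines:
        if line.startswith("Error:"):
            return line[:max_len]
    return lines[0][:max_len]
```

The full stderr can still be written to a log file so the detail is not lost.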

15. MEMORY.md overflow health check measures the wrong metric (200 lines) on the wrong file — the actually-overflowing auto-memory index (38.4KB > 24.4KB, truncated at load) is invisible and the card shows green

Kind: fragile · Effort: S · Dimension: gen-correctness
Where: /Users/elmar/PKA/dashboard.py (250-255, 2684-2696)
Impact: Memory entries silently drop out of every session's loaded index while the health dashboard says the memory system is healthy — exactly the class of silent data loss the card exists to catch.
Evidence: Lines 250-255 check only MEMORY_DIR/'MEMORY.md' (= Vault/memory/MEMORY.md) and flag overflow only when len(lines) > 200. Measured: Vault/memory/MEMORY.md = 115 lines / 14,605 bytes → card renders 'MEMORY.md within 200-line limit — OK'. Meanwhile the live session-loaded index Vault/memory/auto/lucienne/MEMORY.md is 38.4KB and Claude Code's own loader warns 'MEMORY.md is 38.4KB (limit: 24.4KB) — Only part of it was loaded' (observed in this session's context injection). The dashboard's Memory Health section reports green while the real boot-loaded index is being truncated, and Claude's limit is KB-based, not line-based.
Suggested action: Check byte size against ~24KB (not 200 lines) and include the per-agent indexes under Vault/memory/auto/ (e.g. Vault/memory/auto/lucienne/MEMORY.md) in the overflow check.
Verifier: Confirmed all claimed evidence. dashboard.py:38 sets MEMORY_DIR=Vault/memory; lines 250-255 check only Vault/memory/MEMORY.md with a lines>200 threshold; lines 2684-2696 render green 'MEMORY.md within 200-line limit' when not tripped. Measured: Vault/memory/MEMORY.md = 115 lines/14,605 bytes (passes); Vault/memory/auto/lucienne/MEMORY.md = 192 lines/41,784 bytes (~40.8KB, grown past the claimed 38
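
Sketch of a byte-based overflow check that also covers nested indexes. The limit is an assumption derived from the loader warning quoted in the evidence (24.4KB, read as 1024-byte kilobytes); the function name is hypothetical.

```python
from pathlib import Path

# Assumption: 24.4KB budget from the loader warning, with 1KB = 1024 bytes.
MEMORY_INDEX_LIMIT_BYTES = int(24.4 * 1024)


def oversized_memory_indexes(memory_dir: Path,
                             limit: int = MEMORY_INDEX_LIMIT_BYTES) -> list:
    """Return (path, size) for every MEMORY.md over the byte limit."""
    hits = []
    for index in sorted(memory_dir.rglob("MEMORY.md")):
        size = index.stat().st_size
        if size > limit:
            hits.append((index, size))
    return hits
```

With the sizes measured above, the top-level index (14,605 bytes) passes and the lucienne index (41,784 bytes) is flagged.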

16. Generator-emitted 30s polling loop forces a full dashboard.py regeneration (index.py reindex + qmd subprocess + transcript scan) for every open tab

Kind: fragile · Effort: S · Dimension: gen-correctness
Where: /Users/elmar/PKA/dashboard.py (4523 (setInterval(doRefresh,30000)), 4479 (fetch /dashboard.html?t=), 4810-4811 (main runs index.py))
Impact: One open browser tab causes a full DB reindex + subprocess fan-out roughly every 30s indefinitely (2.4MB transfer per poll); if regen ever exceeds the server's 30s timeout the page silently serves stale data while the machine keeps burning CPU. WS 'sessions.updated' events trigger the same full-document swap on top.
Evidence: Emitted JS: let fallbackTimer = setInterval(doRefresh, 30000); where doRefresh fetches '/dashboard.html?t='+Date.now(). serve_dashboard.py:355-372 regenerates by running python dashboard.py on request with REGEN_COOLDOWN = 10s (serve_dashboard.py:250) and subprocess timeout=30. dashboard.py main() (4810-4816) runs index.py (full vault reindex) plus generate_context.py, then get_qmd_health spawns the qmd CLI (982-988), get_reports re-reads every report file, and get_sessions rescans all ~/.claude/projects transcripts. Observed live: served timestamp advanced 22:43 → 22:44 between two requests one minute apart, confirming per-request regeneration. Each poll also pulls the full 2.48MB document (measured size 2,484,051 bytes) and replaces .shell innerHTML.
Suggested action: Drop or drastically lengthen the fallback poll (WS already exists), and/or have doRefresh hit a cheap freshness endpoint instead of the full document; decouple index.py from every regeneration.
Verifier: Independently confirmed every element. Code: dashboard.py:4523 emits setInterval(doRefresh,30000); doRefresh (4476-4519) fetches /dashboard.html?t=Date.now() and replaces .shell innerHTML with the full document. serve_dashboard.py /dashboard.html route (409-411) calls regenerate() (355-381), which runs python dashboard.py as a subprocess with REGEN_COOLDOWN=10 (line 250) and timeout=30; on failu

17. Memory tab '378 integrity issue(s)' banner: all 305 'index drift' rows are reports intentionally migrated to cloud PKA-Outputs — files exist and serve fine

Kind: fragile · Effort: S · Dimension: data-accuracy
Where: /Users/elmar/PKA/dashboard.py (n/a (integrity check section feeding 'Index drift (DB row but file missing on disk)'))
Impact: A red 378-issue alarm is 80% false positives caused by the planned outputs-to-cloud migration, drowning the ~73 genuinely broken wiki links and training Elmar to ignore the integrity banner.
Evidence: Dashboard shows 'Index drift (DB row but file missing on disk): 305' inside a red '⚠ 378 integrity issue(s)' banner. Verified via read-only query: exactly 305 files-table rows missing under /Users/elmar/PKA, ALL with path prefix reports/, and ALL 305 exist in '/Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/reports'. Sampled URL GET /reports/skill-audit-2026-06-08.html → 200 (listed as 'missing' in the drilldown).
Suggested action: Teach the integrity check (and index.py) to resolve reports/dashboards paths via pka_paths.outputs_dir() before declaring drift.
Verifier: Fully reproduced. dashboard.py:264-266 flags a files-table row as 'index drift' if the path doesn't exist locally, with no awareness of the planned outputs-to-cloud migration (docs/plans/2026-06-05-outputs-to-cloud-storage-split.md). Read-only query on Vault/vault.db: 308 of 1,932 pka/personal rows missing locally, 100% under reports/, and all 308/308 present in GoogleDrive PKA-Outputs/reports. Re
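
A minimal Python sketch of the suggested fix: only declare drift when a path is missing under every known base, not just the local one. The name find_drift and the list-of-roots signature are illustrative assumptions; the real check sits at dashboard.py:264-266, and the cloud base would presumably come from pka_paths.outputs_dir(), whose signature this review does not show.

```python
from pathlib import Path
from typing import Iterable, List


def find_drift(rel_paths: Iterable[str], roots: List[Path]) -> List[str]:
    """Return the relative paths that exist under none of the given roots.

    Hypothetical helper: `roots` would hold the local PKA root plus any
    output bases (for example the cloud PKA-Outputs directory).
    """
    missing = []
    for rel in rel_paths:
        # A row only counts as drift if no root contains the file.
        if not any((root / rel).exists() for root in roots):
            missing.append(rel)
    return missing
```

With the cloud reports base included in roots, the migrated reports/ rows would drop out of the drift list and only rows missing everywhere would remain.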

18. Spend tile shows '$35,687.48' lifetime estimated cost with no 'estimated' label or time period, and token total silently omits 647M cache-creation tokens

Kind: fragile · Effort: S · Dimension: data-accuracy
Where: /Users/elmar/PKA/dashboard.py (hydrateSpendTile JS in generated page; API serve_dashboard.py /api/token-dashboard/overview)
Impact: Reads as 'we spent $35.7k' when it is an all-time API-equivalent estimate for mostly-subscription usage; the token figure also under-reports by ~647M cache-write tokens.
Evidence: GET /api/token-dashboard/overview → {"cost_usd":35687.4756, "input_tokens":31141344, "output_tokens":59210240, "cache_read_tokens":16403776759, "cache_create_1h_tokens":527731281, "cache_create_5m_tokens":119778070, ...}. The tile JS renders '$35,687.48' + input+output+cache_read only (excludes both cache_create fields) under the heading 'Spend & Tokens' with no period or 'estimated' qualifier. Usage runs largely on Claude subscription (per house rules, cost-shaped API fields ≠ real charges).
Suggested action: Label the figure 'est. API-equivalent (all time)' and include cache-creation tokens (or show a 30-day window).
Verifier: Independently reproduced every element. (1) GET http://localhost:8787/api/token-dashboard/overview returned cost_usd=35989.8041, input_tokens=43647306, output_tokens=62199635, cache_read_tokens=16829394521, cache_create_1h_tokens=542186822, cache_create_5m_tokens=147073157 — slightly higher than the reviewer's figures because this is an accumulating all-time metric (the tile passes no since/until,
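
The size of the omission can be checked with a few lines of Python, using the reviewer's figures from the Evidence line above. The field names are those in the quoted API response; token_total and the two field tuples are illustrative names, not code from the dashboard.

```python
# Figures copied from the reviewer's GET /api/token-dashboard/overview response.
overview = {
    "input_tokens": 31141344,
    "output_tokens": 59210240,
    "cache_read_tokens": 16403776759,
    "cache_create_1h_tokens": 527731281,
    "cache_create_5m_tokens": 119778070,
}

# Fields the tile currently sums, and the two it leaves out.
TILE_FIELDS = ("input_tokens", "output_tokens", "cache_read_tokens")
CACHE_CREATE_FIELDS = ("cache_create_1h_tokens", "cache_create_5m_tokens")


def token_total(payload, fields):
    """Sum the named token fields, treating absent fields as zero."""
    return sum(int(payload.get(name, 0)) for name in fields)


shown = token_total(overview, TILE_FIELDS)            # 16,494,128,343
omitted = token_total(overview, CACHE_CREATE_FIELDS)  # 647,509,351
full = shown + omitted                                # 17,141,637,694
```

The omitted figure is the ~647M cache-creation tokens named in the finding title.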

19. Activity log rendered with no LIMIT: 3,392 rows = 1.54MB = 62% of the entire page

Kind: fragile · Effort: S · Dimension: performance
Where: /Users/elmar/PKA/dashboard.py (319-332 (get_activity_log), 3272-3300 (render loop))
Impact: The activity timeline accounts for 62% of every 2.5MB page build, transfer, parse and 30s innerHTML re-render, yet nobody scrolls 3,392 rows deep; page size and regen time grow without bound as activity accumulates.
Evidence: get_activity_log: "SELECT date, actor, category, summary, details FROM activity_log ORDER BY id DESC" — no LIMIT. Render loop emits an .evt div block per row. Byte-share scan (scratch panelshare.py/homeshare.py): id="home-session-breakdown" chunk = 1,538,220 B of the 2,487,184 B page (62.3%); 3,392 .evt rows. sqlite3 'file:Vault/vault.db?mode=ro' "SELECT COUNT(*) FROM activity_log" -> 3392, dates 2026-03-31..2026-06-10 — i.e. the page grows ~600KB+/month forever.
Suggested action: Add LIMIT 200 (or last-30-days) to get_activity_log and a 'view full log' link; instantly cuts the page from 2.49MB to ~1.0MB.
Verifier: Independently reproduced every element. dashboard.py:319-322 (read directly): SELECT from activity_log with no LIMIT; render loop 3272-3300 emits an .evt div per row; the only filters (lines 3133-3135 'Waiting for', in-loop index/system_maintenance skips) currently remove zero rows. Read-only sqlite query: activity_log now has 3,456 rows (2026-03-31..2026-06-12) — up from the claimed 3,392 on 2026
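
A hedged sketch of the bounded query, run against an in-memory SQLite table. The column list is taken from the quoted SELECT; the column types, the recent_activity name and the 500 demo rows are assumptions for illustration only.

```python
import sqlite3


def recent_activity(conn, limit=200):
    """Return only the newest `limit` activity rows, newest first."""
    return conn.execute(
        "SELECT date, actor, category, summary, details "
        "FROM activity_log ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()


# Demo data: an assumed schema with 500 synthetic rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_log (id INTEGER PRIMARY KEY, date TEXT, "
    "actor TEXT, category TEXT, summary TEXT, details TEXT)"
)
conn.executemany(
    "INSERT INTO activity_log (date, actor, category, summary, details) "
    "VALUES (?, ?, ?, ?, ?)",
    [("2026-06-10", "demo", "commit", f"row {i}", "") for i in range(500)],
)
rows = recent_activity(conn, limit=200)
```

Binding the limit as a parameter keeps the cap adjustable without editing the SQL string; a last-30-days variant would filter on date in a WHERE clause instead.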

20. Page views cause vault.db writes: index.py (a DB writer) runs every 30s as a side effect of GET

Kind: fragile · Effort: S · Dimension: performance
Where: /Users/elmar/PKA/dashboard.py (4810-4811 (run_step index.py, required=True); serve_dashboard.py:365-372)
Impact: Constant DB write churn and lock contention driven purely by someone looking at a dashboard; a hung/failed index.py (required=True) takes down page regeneration entirely.
Evidence: dashboard.py main() runs index.py as a required first step; index.py docstring: 'populates vault.db as a query index' with 21 INSERT/UPDATE/DELETE statements. Because regenerate() is triggered by GET /dashboard.html (serve_dashboard.py:411) and the open tab polls every 30s, vault.db gets a write transaction every ~30s, 24/7 (observed via dashboard.html mtime advancing every ~30s with no external requests — each of those is an index.py run). This GET-with-side-effects also competes with every other vault.db writer (memory extractors, graphify, session tracker) for the write lock.
Suggested action: Run index.py on its own schedule (launchd/scheduler, e.g. every 10 min) or on file-change events; dashboard.py should only read.
Verifier: Independently reproduced the full chain on the LIVE server (PID 12110, started 2026-06-11 18:13; serve_dashboard.py is now committed/clean at 296ae842 so file==live for this path, and I verified behaviour empirically anyway). (1) Code path confirmed: serve_dashboard.py:409-411 GET /dashboard.html calls regenerate(); regenerate() (lines 355-379, REGEN_COOLDOWN=10s at line 250) synchronously subproc

21. doRefresh refetches the whole 2.49MB page and innerHTML-swaps ~36,700 elements when a 1ms JSON endpoint already exists

Kind: simplify · Effort: M · Dimension: performance
Where: /Users/elmar/PKA/dashboard.html (43417-43461 (generated; source in dashboard.py generate_html JS block))
Impact: Every refresh rebuilds the entire DOM (including the 1.5MB activity timeline) to update a handful of session cards — wasted client CPU/memory every 30s, plus lost in-page state (event listeners, partial scroll) papered over by re-hydration code.
Evidence: doRefresh() fetches '/dashboard.html?t='+Date.now(), DOMParser-parses the full 2,485,879-byte document (43,752 lines, 36,714 open tags counted), replaces the entire .shell innerHTML, then re-binds handlers / re-hydrates tiles / re-sorts reports / restores scroll. It also fires on every WS 'sessions.updated' event (ws_events.py:100-106 broadcasts whenever sessions.json mtime changes). Meanwhile GET /api/sessions measured at 0.001-0.005s (3 runs: 4.9ms, 1.1ms, 1.2ms) returns the same session data as JSON.
Suggested action: Replace full-shell swap with a targeted update from /api/sessions (the war-card grid is the only thing that changes at 30s cadence); keep full refetch only for manual reload.
Verifier: Reproduced everything. Read doRefresh at dashboard.html:43977-44018 (line numbers shifted from claimed 43417 because the file was regenerated; content identical to claim): fetches '/dashboard.html?t='+Date.now(), DOMParser-parses the full document, swaps .shell innerHTML, then re-binds handlers, restores tab, re-hydrates session/spend tiles, re-sorts reports, restores scroll. Confirmed triggers: h

22. Entire activity history baked into the DOM: 3,456 timeline nodes / 321,485px of content inside a 520px scroll box

Kind: simplify · Effort: S · Dimension: ux-live
Where: /Users/elmar/PKA/dashboard.py (timeline render feeding .timeline (client insert at 4410-4434))
Impact: Page weight and refresh cost are dominated by history nobody scrolls 600 screens deep for; every live refresh re-ships and re-parses the whole archive, causing jank on each WS event.
Evidence: Live DOM measurement: the Home '.timeline' scroll container has clientHeight 520, scrollHeight 321,485 and 3,456 child elements (~1,400 activity entries). Combined with the war-strip duplication this drives the 2.5MB / ~45k-line page that doRefresh re-downloads and re-parses on every sessions.updated event (every ~15s observed).
Suggested action: Bake only the most recent ~50 activity entries with a 'view full log' link to a separate page/endpoint; this alone should cut the generated HTML by a large fraction.
Verifier: Independently reproduced every load-bearing number. (1) Live DOM via Playwright on http://localhost:8787/dashboard.html: .timeline has children=3460, scrollHeight=321873, clientHeight=520, document HTML=2,555,293 bytes — matches claimed ~3456 / 321,485 / 520 (delta = a few new activity events since the reviewer measured). (2) Root cause confirmed in code: dashboard.py:319-322 get_activity_log() ru

Severity: LOW

23. War-strip cards render the raw encoded project dirname as 'persona' (e.g. '-Users-elmar--claude-mem-observer-sessions')

Kind: broken · Effort: S · Dimension: gen-correctness
Where: /Users/elmar/PKA/dashboard.py (874-892 (_guess_persona), 809 (persona=_guess_persona(project)), 2165 (render))
Impact: Unreadable garbage labels on the most prominent strip of the Home view for any project without a hardcoded persona mapping.
Evidence: _guess_persona falls through to return project_name (line 892) with the raw Claude dir name, while the display name is humanised separately (line 807). Live-served dashboard.html war strip contains: 'Claude Mem Observer Sessions · -Users-elmar--claude-mem-observer-sessions' — the persona slot shows the ugly encoded path.
Suggested action: Fall back to humanise_session_name(project_name) (or empty string) instead of the raw dirname in _guess_persona.
Verifier: Independently reproduced. (1) Read /Users/elmar/PKA/dashboard.py:874-892 — _guess_persona() has hardcoded mappings for pka/cowork/legalmind/crypto/smartmoney and falls through to return project_name at line 892 with the raw, un-humanised Claude project dirname. (2) Line 809 passes the raw project (encoded dir name) to _guess_persona, while line 806 humanises the display name separately via hum

24. Agent-dispatch 'this week' window is 8 days inclusive, not 7

Kind: broken · Effort: S · Dimension: gen-correctness
Where: /Users/elmar/PKA/dashboard.py (2358-2377)
Impact: The promotion-ladder evidence number ('4+/week → evaluate making permanent') is inflated by up to one extra day of dispatches — borderline agents can cross the threshold incorrectly.
Evidence: week_ago = now_date - timedelta(days=7) and the filter is if entry_date >= week_ago (line 2376) — today plus the 7 previous days = 8 distinct dates counted as 'dispatches this week'. The card label (line 2437) says 'dispatches this week' and Team/roster.md's promotion rule is '4+ dispatches per week'.
Suggested action: Use entry_date > week_ago, or keep >= and switch to timedelta(days=6), for a true rolling 7-day window.
Verifier: Read /Users/elmar/PKA/dashboard.py:2356-2377 myself: week_ago = now_date - timedelta(days=7) (line 2359) and if entry_date >= week_ago (line 2376). Ran the arithmetic in Python: a dispatch dated exactly 7 days ago satisfies the filter, and the inclusive window spans exactly 8 distinct calendar dates — confirmed empirically (printed 8). Confirmed the code path is live: get_agent_dispatch_stats
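
The off-by-one can be reproduced in isolation. This is a small sketch rather than the dashboard's code: window_dates is a hypothetical helper and the fixed date is chosen only for the demonstration.

```python
from datetime import date, timedelta


def window_dates(today, cutoff, inclusive):
    """Return the calendar dates up to `today` that pass the cutoff filter."""
    candidates = [today - timedelta(days=n) for n in range(15)]
    if inclusive:
        return [d for d in candidates if d >= cutoff]  # current behaviour
    return [d for d in candidates if d > cutoff]       # strict comparison


today = date(2026, 6, 12)
week_ago = today - timedelta(days=7)

current = window_dates(today, week_ago, inclusive=True)
strict = window_dates(today, week_ago, inclusive=False)
six_day = window_dates(today, today - timedelta(days=6), inclusive=True)
```

Here current spans 8 distinct dates, while strict and six_day both span the same 7, so either form of the fix gives the same window.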

25. Non-integer query params crash to HTTP 500 instead of 400 (multiple endpoints)

Kind: broken · Effort: S · Dimension: server-correctness
Where: /Users/elmar/PKA/serve_dashboard.py (1633, 1685, 2447-2448)
Impact: A malformed or stale client query (e.g. a saved URL, a fuzzer, or a UI bug passing an empty/garbage value) returns a 500 server error instead of a clean 400. Masks real server faults in logs and gives the dashboard JS an opaque failure instead of a handled error.
Evidence: int(request.args.get(...)) is called outside any try/except. Live probes: GET /api/token-dashboard/prompts?limit=abc -> HTTP 500; GET /api/token-dashboard/sessions?limit=notanumber -> HTTP 500; GET /api/v1/brain/db?db=vault.db&table=edges&page=abc -> HTTP 500; ...&per_page=xyz -> HTTP 500. Body is the generic Werkzeug 500 page (no traceback leak; debug=False). Compare line 1633 limit = max(1, min(1000, int(request.args.get("limit", 50)))) and lines 2447-2448 page = max(1, int(request.args.get("page", 1))) which run before the try block at 2459.
Suggested action: Wrap the int() parses in try/except ValueError and return jsonify(error=...) with 400, or coerce with a safe default. Move the brain/db int parses inside the existing try block.
Verifier: Read serve_dashboard.py myself: line 1633 and 1685 call int(request.args.get("limit", ...)) with no try/except; lines 2447-2448 call int() on page/per_page before the try block at 2459. Reproduced live with GET-only curls: /api/token-dashboard/prompts?limit=abc -> 500, /api/token-dashboard/sessions?limit=notanumber -> 500, /api/v1/brain/db?db=vault.db&table=edges&page=abc -> 500, &per_page=xyz ->
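
One possible shape for the safe parse, written against a plain mapping so it runs without Flask. parse_int_param and its return convention are assumptions for illustration; in the real handlers the mapping would be request.args and the error branch would return a 400 response.

```python
from typing import Mapping, Optional, Tuple


def parse_int_param(
    args: Mapping[str, str], name: str, default: int, lo: int, hi: int
) -> Tuple[Optional[int], Optional[str]]:
    """Parse and clamp an integer query parameter.

    Returns (value, None) on success or (None, message) when the raw
    value is not an integer, so the caller can answer with HTTP 400.
    """
    raw = args.get(name)
    if raw is None:
        return default, None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None, f"query parameter {name!r} must be an integer"
    # Clamp to the allowed range, mirroring the existing max/min pattern.
    return max(lo, min(hi, value)), None
```

A handler could then check the error slot first and respond with something like jsonify(error=message) and status 400, keeping 500s reserved for real server faults.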

26. System tab 'Skill Audit dashboard' link points to /skill-audit which 404s — route does not exist

Kind: broken · Effort: S · Dimension: data-accuracy
Where: /Users/elmar/PKA/dashboard.py (~4120-4128 (Skills Health section link))
Impact: Clicking the only drill-down link in Skills Health dead-ends on a 404.
Evidence: Generated HTML: 'Full skill-audit details: Skill Audit dashboard ↗'. curl -s -o /dev/null -w '%{http_code}' http://localhost:8787/skill-audit → 404 (also seen in live server access log: 'GET /skill-audit HTTP/1.1" 404'). grep -n 'skill-audit' serve_dashboard.py → 0 matches (on disk too, not just the stale running process). The working target exists at /dashboards/skill-audit.html (200).
Suggested action: Change the href to /dashboards/skill-audit.html (or add the route).
Verifier: Independently reproduced every element of the claim. (1) /Users/elmar/PKA/dashboard.py:4123 emits 'Full skill-audit details: Skill Audit dashboard' exactly as claimed — and there is a SECOND instance at dashboard.py:2461 ('View full skill audit ↗', same /skill-audit href) the finding missed. (2) Generated /Users/elmar/PKA/dashboard.html contains 2 occurrences of href="/s

27. Reports tab lists 25 node_modules README/CHANGELOG/license files as reports (~5% of the 489 cards)

Kind: broken · Effort: S · Dimension: data-accuracy
Where: /Users/elmar/PKA/dashboard.py (1464-1509 (md rglob in get_reports — no node_modules exclusion))
Impact: Junk library docs pollute the Reports tab (and its 489 count), each with Share/PDF/Delete buttons, making the report library look untrustworthy.
Evidence: Generated page contains 25 distinct report cards with data-filename like 'narrowbody-market-2026-05-29/node_modules/util-deprecate/README.md', 'node_modules/pako/CHANGELOG.md', 'node_modules/process-nextick-args/license.md'. Matching 25 *.md files exist under PKA-Outputs/reports/narrowbody-market-2026-05-29/node_modules/. The md listing filters prompts/smoke/hermes-workers but not node_modules; one even appeared in the live access log being served via /md-view.
Suggested action: Skip any path containing node_modules (and similar vendor dirs) in get_reports rglob; consider deleting the vendored dir from the cloud reports folder.
Verifier: Reproduced all evidence: (1) Read dashboard.py lines 1464-1478 — md rglob in get_reports filters dotfiles/_deleted/hermes-workers/prompt-files/smoke- but has no node_modules exclusion; reports_dir resolves via _primary_output_base to GDrive PKA-Outputs/reports. (2) Ran the exact claimed repro against the live server: curl http://localhost:8787/dashboard.html | grep data-filename node_modules | sor
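
A sketch of the suggested exclusion, assuming the listing walks the reports directory with rglob as described. iter_report_markdown and EXCLUDED_DIRS are hypothetical names; the existing filters in get_reports (dotfiles, _deleted, hermes-workers and so on) are left out for brevity.

```python
from pathlib import Path
from typing import Iterator

# Directory names treated as vendored content (assumed set; extend as needed).
EXCLUDED_DIRS = {"node_modules"}


def iter_report_markdown(reports_dir: Path) -> Iterator[Path]:
    """Yield .md files under reports_dir, skipping vendored directories."""
    for path in sorted(reports_dir.rglob("*.md")):
        # Inspect only the directory components relative to the reports root.
        parent_parts = path.relative_to(reports_dir).parts[:-1]
        if any(part in EXCLUDED_DIRS for part in parent_parts):
            continue
        yield path
```

Matching on whole path components rather than a substring avoids accidentally excluding a report whose filename merely contains the text node_modules.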

28. Placeholder-dated stub reports ('…-2099-01-01') from scheduled tasks shown as fresh 2026-06-09 reports

Kind: broken · Effort: S · Dimension: data-accuracy
Where: /Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/reports
Impact: Some scheduled report generator is running with an unexpanded date placeholder, producing near-empty duplicate stubs that the dashboard presents as recent deliverables.
Evidence: Cloud reports dir contains skill-audit-2099-01-01.html (2.5K stub, Skill Audit — 2099-01-01), sender-intelligence-2099-01-01.html (1.8K), operator-tune-2099-01-01.md (91 bytes), and dir investment-weekly-2099-01-01/. The Reports tab renders them near the top with date 2026-06-09 and titles carrying the literal '2099-01-01' placeholder, alongside the real skill-audit-2026-06-08.html.</li> <li>Suggested action: Find the scheduler task templating '{date}'→'2099-01-01', fix it, and delete the four stub artifacts.</li> <li>Verifier: Reproduced both repro steps. (1) ls of the cloud reports dir confirmed skill-audit-2099-01-01.html (2.5K, title 'Skill Audit — 2099-01-01'), sender-intelligence-2099-01-01.html (1.8K), operator-tune-2099-01-01.md (91B), and investment-weekly-2099-01-01/ with 30-byte stub mp3/pdf/mp4 files; all mtime Jun 9 17:49:13 2026. (2) GET http://localhost:8787/dashboard.html (live server) returned 19 occurre</li> </ul> <h4>29. Evals tab reports 'GBrain: ok' for a system retired 2026-05-01 — label defaults to 'ok' when the store is absent from the eval run</h4> <ul> <li>Kind: broken · Effort: S · Dimension: data-accuracy</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (2897-2901)</li> <li>Impact: A green health claim is fabricated for a component that does not exist and was never tested — the default-to-ok pattern will mask real failures if a gbrain-like store ever returns.</li> <li>Evidence: Code: gb = retrieval_stores.get('gbrain') ... gb = gb or {}; gb_label = 'skipped' if gb.get('skipped') else ('error' if gb.get('error') else 'ok') — an empty dict yields 'ok'. Latest eval JSON (GET /eval-results/2026-06-10T103030Z.json) has stores: ['vault','qmd'] only — no gbrain key. Page renders 'GBrain ok' in the Retrieval Stack card. 
GBrain was retired 2026-05-01 per CLAUDE.md Rule 15.</li> <li>Suggested action: Default gb_label to '—'/'retired' when the store is missing; only show ok on a real positive result.</li> <li>Verifier: Reproduced end-to-end. (1) Read dashboard.py:2880-2901 and 3013-3022: on the has_retrieval branch, gb = retrieval_stores.get('gbrain') -> None -> gb or {} -> gb_label='ok', rendered as the GBrain row of the Retrieval Stack card (line 3019). (2) GET /eval-results/2026-06-10T103030Z.json: retrieval.stores = ['vault','qmd'], no gbrain store — but the JSON has a top-level gbrain {"skipped": true, "rea</li> </ul> <h4>30. QMD retrieval layer is down on this Mac (better-sqlite3 ABI mismatch) — dashboard tile and eval P@5 0.000 confirm it live</h4> <ul> <li>Kind: broken · Effort: S · Dimension: data-accuracy</li> <li>Where: <code>/Users/elmar/.bun/install/global/node_modules/better-sqlite3/build/Release/better_sqlite3.node</code></li> <li>Impact: Step 5 of the canonical retrieval stack (qmd search BM25) is completely non-functional, and has been failing every eval run; the dashboard reports it accurately but nobody has acted.</li> <li>Evidence: Home tab QMD tile (baked from a real <code>qmd --index pka status</code> run): 'Error: The module …better_sqlite3.node was compiled against a different Node.js version using NODE_MODULE_VERSION 141. This version of Node.js requires NODE_MODULE_VERSION 127' (Node v22.22.3). 
Evals tab shows QMD P@5 0.000 / MRR 0.000 for the 2026-06-10T103030Z run (vault store scores 0.667).</li> <li>Suggested action: npm rebuild better-sqlite3 under the Node version bun uses (or reinstall @tobilu/qmd).</li> <li>Verifier: Reproduced all three evidence legs myself: (1) ran /Users/elmar/.bun/bin/qmd --index pka status — got the exact ERR_DLOPEN_FAILED error for /Users/elmar/.bun/install/global/node_modules/better-sqlite3/build/Release/better_sqlite3.node (NODE_MODULE_VERSION 141 vs required 127); (2) grep on /Users/elmar/PKA/dashboard.html found the same baked error text (QMD tile); (3) read /Users/elmar/PKA/tests/ev</li> </ul> <h4>31. 'Open web terminal' button dead-ends (302 → localhost:7681 connection refused, ttyd not running)</h4> <ul> <li>Kind: broken · Effort: S · Dimension: ux-live</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (1990-1994 (TTYD_PORT default 7681))</li> <li>Impact: A primary Console-tab action opens a browser 'can't connect' error page; the terminal pane below it also can't open sessions while ttyd is down.</li> <li>Evidence: Console tab button 'Open web terminal' href=/console/ttyd; GET http://localhost:8787/console/ttyd returns 302 Location: http://localhost:7681, which refuses connection ([Errno 61] Connection refused — tested against the live server). The embedded console iframe itself displays a red 'ttyd offline' badge (screenshot .scratch/wf-review/ux/02-tab-console.png), so the system knows ttyd is down yet still offers the button.</li> <li>Suggested action: Either auto-start/supervise ttyd, or have /console/ttyd return a friendly 'terminal offline — start ttyd' page and grey out the button when the offline state (already detected) is true.</li> <li>Verifier: Reproduced fully against the live server: GET /console/ttyd returned 302 Location: http://localhost:7681, and a socket connect to 7681 failed with [Errno 61] Connection refused (ttyd installed at /opt/homebrew/bin/ttyd but not running). 
The redirect route is an unconditional redirect in shared_console/blueprint.py:603-609 (registered at serve_dashboard.py:2782); the cited serve_dashboard.py:1989-1</li> </ul> <h4>32. Every activity row shows raw debug metadata 'dispatcher=Lucienne session=unknown ts=...' (1,407 occurrences)</h4> <ul> <li>Kind: broken · Effort: S · Dimension: ux-live</li> <li>Where: <code>http://localhost:8787/dashboard.html (generated by /Users/elmar/PKA/dashboard.py activity render)</code> (n/a (1,407 hits in served HTML))</li> <li>Impact: The activity feed reads like debug logs; the session field carries zero information (always 'unknown') while adding visual noise to every single row.</li> <li>Evidence: Served dashboard.html contains 'session=unknown' 1,407 times; every visible ACTIVITY row reads e.g. 'Dispatched: Review graphify + workspace + wiki-system-update / dispatcher=Lucienne session=unknown ts=2026-06-12T11:22:27' (screenshot .scratch/wf-review/ux/09-home-bottom.png). The session linkage resolves to 'unknown' on 100% of rows sampled.</li> <li>Suggested action: Drop the session=unknown suffix when unresolved (or fix the dispatcher logging to record real session ids); render ts as a humanised time instead of raw ISO key=value.</li> <li>Verifier: Reproduced via GET http://localhost:8787/dashboard.html: 1,408 occurrences of 'session=unknown' (claim said 1,407; page regenerates so 1-count drift expected). All occurrences are inside <div class="evt-detail"> activity rows rendered by dashboard.py:3282, which dumps evt['details'] verbatim; CSS (.evt-detail{font-size:11px;color:var(--text2)}) makes them visible. The 15 most-recent rows at the to</li> </ul> <h4>33. 
Freshest-by-mtime selection for findash trusts cloud-sync mtimes (staged change)</h4> <ul> <li>Kind: fragile · Effort: M · Dimension: server-correctness</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (467-490)</li> <li>Impact: After this change deploys, the dashboard could pick a stale-but-recently-synced findash over the genuinely newest one, silently showing wrong financials.</li> <li>Evidence: Staged _output_dir_for() sorts candidates by fpath.stat().st_mtime descending across GoogleDrive PKA-Outputs, in-repo, and OneDrive-Safair, then returns the largest-mtime base. mtime on cloud-synced files reflects local sync time, not content recency: a file that syncs down later (or a partially-downloaded placeholder) gets a newer local mtime regardless of which actually holds the latest numbers. The OneDrive branch (478-485) also skips the relative_to() traversal re-check the loop applies (safe only because filename is hard-pinned to 'findash.html').</li> <li>Suggested action: Prefer an explicit recency signal embedded in the file (a generated_at timestamp/version) over filesystem mtime, or restrict to a single authoritative source.</li> <li>Verifier: Code facts confirmed by reading serve_dashboard.py:459-490: candidates sorted by st_mtime desc across GDrive PKA-Outputs/in-repo/OneDrive-Safair, largest-mtime base returned; OneDrive branch (478-485) indeed skips the relative_to() traversal check, safe only via the hard-pinned 'findash.html' filename compare. Two corrections to the finding: (1) it is NOT a staged change — git diff --cached is emp</li> </ul> <h4>34. 
Expired manual-registry session entries suppress live auto-detected sessions for the same project</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: gen-correctness</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (850-864)</li> <li>Impact: As soon as the manual registry is used again, a single stale registry entry hides all real activity for that project from the dashboard.</li> <li>Evidence: Phase 3 merge builds manual_projects from ALL manual entries (line 852: <code>manual_projects = {v["project"].lower() for v in manual.values()}</code>) but only adds manual entries with ago_mins <= 240 to the result (855-857). An auto-detected session whose project matches any manual entry — including one dead for days — is skipped (860-864). Currently inert only because /Users/elmar/.claude/vault/sessions.json has 0 entries (verified: <code>python3 -c "import json; print(len(json.load(open('/Users/elmar/.claude/vault/sessions.json'))['sessions']))"</code> → 0), so the bug re-arms the moment session_tracker.py registers a session again.</li> <li>Suggested action: Build manual_projects only from manual entries that made it into result (ago_mins <= 240).</li> <li>Verifier: Read dashboard.py:850-864 directly: line 852 builds manual_projects from ALL manual entries with no recency filter; lines 855-857 only add manual entries with ago_mins <= 240 to the result; lines 860-864 skip any auto-detected session whose normalized project name is in manual_projects. Therefore a manual entry older than 240 min is dropped from display but still suppresses live auto-detected sess</li> </ul> <h4>35. 
hydrateSessionBar mixes server counts and client-derived counts for the same row (idle ignored from API)</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: gen-correctness</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (4236-4243)</li> <li>Impact: The Home session bar can show arithmetic that doesn't add up the moment server counts and the returned list diverge; currently masked because everything is 0.</li> <li>Evidence: Live GET /api/sessions returns counts:{active,idle,stale,closed} (verified: {'active':0,'idle':0,'stale':0,'closed':0}). The emitted JS takes active and stale from counts when present (4238-4241) but ALWAYS computes idle from the sessions array (4242: <code>var idle = sessions.filter(...)</code>) and total from sessions.length (4243). If the API ever returns counts computed over a different set than the sessions list it returns (e.g. counts over all sessions, list truncated/filtered), the bar shows internally inconsistent numbers (active+idle+stale ≠ total).</li> <li>Suggested action: Use counts.idle and a counts-derived total when counts is present, falling back to list-derived values only as a unit.</li> <li>Verifier: Read dashboard.py:4231-4260 — confirmed active/stale come from counts.* when present but idle is always client-derived from the sessions array (4242) and total from sessions.length (4243); counts.idle never read. Live GET /api/sessions returned {"sessions":[],"counts":{"active":0,"idle":0,"stale":0,"closed":0}} exactly as claimed. Fetched the live-served /dashboard.html and the hydrateSessionBar J</li> </ul> <h4>36. 
get_memories parses 'frontmatter' keys from the entire file body — any body line starting with name:/type:/description: silently overrides the real frontmatter</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: gen-correctness</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (566-583)</li> <li>Impact: Memory entries on the Memory tab can show wrong name/type/description whenever a memory file quotes YAML in its body (common in these memory files, which document config snippets), and the type filter buttons then mis-bucket the entry.</li> <li>Evidence: The loop iterates content.splitlines() for the whole file with no '---' delimiter handling and no break after the frontmatter block: <code>for line in content.splitlines(): if line.startswith("type:")...</code> — the LAST matching line anywhere in the document wins (assignments overwrite). A proper frontmatter parser already exists in the same file (_parse_frontmatter, lines 1067-1082) and is not used here.</li> <li>Suggested action: Reuse _parse_frontmatter(f) instead of the ad-hoc whole-file scan.</li> <li>Verifier: Confirmed end-to-end. (1) Read dashboard.py:566-583: get_memories() iterates content.splitlines() over the WHOLE file with no '---' delimiter handling and no break — assignments overwrite, so the last matching 'name:'/'type:'/'description:' line anywhere in the body wins. A correct delimiter-aware parser (_parse_frontmatter) exists at lines 1067-1082 and is unused here. (2) Reproduced the impact w</li> </ul> <h4>37. 
Hero 'Activity Today' counts raw activity_log rows while the feed below filters categories — numbers can disagree (consistent today by luck)</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: gen-correctness</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (1358-1360 (hero count), 3133-3134 + 3274-3278 (feed filters))</li> <li>Impact: The headline KPI and the feed it summarises use different definitions; the count is also dominated by automated extraction noise, diluting its meaning.</li> <li>Evidence: get_hero_metrics counts every activity row with date==today (1360). The rendered feed drops rows whose summary starts 'Waiting for' (3134) or 'Incremental index' (3275) and categories 'index'/'system_maintenance' (3277). Verified read-only against vault.db: today 63 rows total and 63 would pass the feed filters (categories today: session_extracted 43, commit 18, skill_invoked 2) — equal today, but any 'index'/'system_maintenance' row reintroduces the mismatch the audit pattern is prone to. Note 43/63 of 'Activity Today' are session_extracted memory-pipeline rows, so the headline number mostly measures the extractor, not Elmar-visible work.</li> <li>Suggested action: Compute activity_today from the same filtered list used for the feed (filtered_activity), and consider excluding session_extracted from the headline.</li> <li>Verifier: Code confirmed by direct read: dashboard.py:1358-1360 hero counts all rows with date==today unfiltered; feed filters at 3133-3135 (summary startswith 'Waiting for') and 3274-3278 (summary startswith 'Incremental index', categories index/system_maintenance); get_activity_log (319-322) has no LIMIT so both see the same rows. Baked dashboard.html:1083 shows 10 for 'Activity Today' matching read-only </li> </ul> <h4>38. 
On-disk dashboard.html title metadata extraction reads only the first 2000 bytes of each report HTML — late <title>/<meta> silently fall back to filename</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: gen-correctness</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (1440-1451)</li> <li>Impact: Reports tab cards show raw filename stems and wrong 'research' type for any report with a heavier head, degrading sort-by-type and search.</li> <li>Evidence: <code>content = f.read_text(encoding="utf-8")[:2000]</code> then regex for <title> and report-* metas — any report whose <head> exceeds 2KB before the title (common when inline CSS/fonts precede it; this very dashboard puts ~500 lines of CSS before body) gets title=f.stem and type='research'. The except Exception: pass at 1450-1451 also hides read/encoding errors, leaving meta={} with no signal.</li> <li>Suggested action: Raise the sniff window (e.g. 16KB) or read until </head>; log instead of pass on decode errors.</li> <li>Verifier: Read dashboard.py:1431-1462 — code matches exactly: read_text()[:2000], regex for <title>/<meta>, fallback to f.stem and type='research', bare except at 1450-1451. Independently scanned the live reports base (GDrive PKA-Outputs/reports, resolved via _primary_output_base): 19 of 172 HTML reports have <title> at/after byte 2000 (cemair-terms.html at 2502, webwright-v4/saa/home.html at 40644). Confir</li> </ul> <h4>39. regenerate() allows a concurrent request to read dashboard.html mid-rewrite (truncated HTML)</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: server-correctness</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (355-381, 409-413)</li> <li>Impact: Under concurrent access right after the cooldown window, a user can be served a truncated/blank dashboard.html (broken page) until the next reload.</li> <li>Evidence: index() and dashboard() call regenerate() then immediately Path(DASHBOARD).read_text() (line 412). 
regenerate() only blocks (under regen_lock, subprocess.run of dashboard.py) for the request that wins the cooldown check; a second request arriving within REGEN_COOLDOWN (10s) hits <code>now - last_regen_attempt <= REGEN_COOLDOWN</code> at line 358 and returns immediately WITHOUT waiting for the in-flight subprocess, then reads dashboard.html at line 412 while dashboard.py is still writing it. dashboard.py:4859 uses <code>OUTPUT.write_text(html, ...)</code> which truncates the file to 0 then streams ~3.4MB, so a concurrent reader can observe a partially written / truncated file.</li> <li>Suggested action: Have dashboard.py write to a temp file and os.replace() atomically (atomic rename), and/or have dashboard()/index() serve the last-good HTML rather than reading a file that another request may be rewriting.</li> <li>Verifier: Confirmed both mechanisms live. Code: serve_dashboard.py:358 lock-free cooldown check returns immediately for any request arriving while a regen is in flight (last_regen_attempt set at line 363 BEFORE subprocess.run), then dashboard() reads the file at line 412; dashboard.py:4859 uses OUTPUT.write_text (truncate+write, non-atomic). Live repro on the running :8787 server: a trigger GET took 20.31s </li> </ul> <h4>40. /api/annotations POST is CSRF-exempt + CORS * and writes files unauthenticated on a 0.0.0.0 bind</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: server-correctness</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (177-180, 219-232, 523-602)</li> <li>Impact: Any website the operator visits, or any host that can reach port 8787, can write arbitrary annotation files and large screenshot blobs to Vault/annotations with no limit on number of batches — a disk-fill / spam DoS vector and an unauthenticated write surface.</li> <li>Evidence: _csrf_same_origin_guard() returns None for /api/annotations before any origin check (lines 177-180), and _annotations_cors() sets Access-Control-Allow-Origin: * for it (223-225). 
post_annotations() writes Vault/annotations/&lt;page&gt;-&lt;ts&gt;/batch.json plus base64-decoded PNG crops up to 3MB each / 25MB per batch (542-563). The server binds 0.0.0.0 by default (line 2855). There is no auth token and no per-client rate limit; batch dirs are unbounded (only items-per-batch is capped at 100).</li> <li>Suggested action: Restrict /api/annotations to loopback (like the cookie-ingest endpoint), or require the same X-API-Key control; cap total annotations dir size / add a batch rate limit; tighten CORS to the dashboard origin.</li> <li>Verifier: Reproduced all five evidence components. Read lines 161-216: lines 177-180 return None for /api/annotations on any non-OPTIONS method, bypassing the same-origin guard before any origin check. Lines 222-225 set Access-Control-Allow-Origin:* for the path. Confirmed LIVE via GET-only/OPTIONS: GET /api/annotations/inbox=HTTP 200; OPTIONS with Origin:https://evil.example.com returned 204 with Access-Co</li> </ul> <h4>41. md-view path containment uses startswith without a trailing separator (prefix confusion)</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: server-correctness</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (1810-1813)</li> <li>Impact: If a directory named with a 'docs'/'reports'/'Vault/memory' prefix is ever added, md-view could serve .md files outside the intended root. Latent path-containment weakness.</li> <li>Evidence: fpath = (base_dir / rel).resolve(); guard is <code>if not str(fpath).startswith(str(base_dir.resolve())) or fpath.suffix != '.md': abort(404)</code>. With base_dir = ROOT/'docs', the string '/Users/elmar/PKA/docs' is a prefix of a sibling like '/Users/elmar/PKA/docs2/secret.md', so a sibling directory whose name starts with 'docs' (or 'reports', or 'Vault/memory') would pass the check. The sibling API at /api/v1/brain/file (lines 2358-2363) correctly uses fpath.is_relative_to(r) to avoid exactly this. 
Currently not exploitable — no such sibling dir exists (only docs/, reports/, reports_utils.py) — and live probes (/md-view?file=../CLAUDE.md, ../serve_dashboard.py, reports/../../CLAUDE.md) all returned 404.</li> <li>Suggested action: Use fpath.is_relative_to(base_dir.resolve()) instead of str.startswith(), matching the brain_file_detail guard.</li> <li>Verifier: Re-read serve_dashboard.py 1790-1815: lines 1810-1813 match the finding verbatim — fpath=(base_dir/rel).resolve(); guard is <code>if not str(fpath).startswith(str(base_dir.resolve())) or fpath.suffix != '.md': abort(404)</code>. Re-read 2348-2363: the sibling /api/v1/brain/file endpoint uses <code>fpath.is_relative_to(r)</code> with an explicit code comment that startswith would allow prefix-confusion attacks (e.g. /Us</li> </ul> <h4>42. Three repo-only reports are invisible in the Reports tab because listing reads cloud dir only</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: data-accuracy</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (1561-1575 (_primary_output_base returns cloud once non-empty; no merge with repo fallback))</li> <li>Impact: A handful of recent local reports silently vanish from the library until someone migrates them to the cloud folder.</li> <li>Evidence: Replicating get_reports filters over both dirs: cloud listing 475 files, repo listing 173, repo-only = 3: reports/mc-4861_lucienne_local_model_benchmark.md, reports/pka-memory-reddit-post.md, reports/pka-memory-system-reddit.html. These never appear in the 489 cards (489 = cloud listing + docs/.md with Date:* markers). 
The serving routes are per-file cloud-first/repo-fallback, so the files would open if linked — they just aren't listed.</li> <li>Suggested action: Union repo + cloud files (dedupe by relative path, cloud wins) in get_reports, or migrate the 3 stragglers.</li> <li>Verifier: Read dashboard.py:1561-1575 and get_reports (1406-1507): listing uses a single base — cloud dir once non-empty, no merge with repo. Re-ran the set-diff with get_reports' exact filters: cloud 479, repo 175, repo-only = 4 (the 3 claimed files plus portfolio-dashboard-review-2026-06-10.html, 194KB, written AFTER migration — confirms the fragility is ongoing). Grep for all 4 filenames: 0 hits in gener</li> </ul> <h4>43. Reports listing reads full content of all 625 report files from Google Drive FUSE every regen, and renders all 489 cards (28% of page)</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: performance</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (1432-1505)</li> <li>Impact: Regen time becomes network-bound (28MB of Drive reads every 30s) whenever the Drive cache is cold or syncing; the page permanently carries every report ever written.</li> <li>Evidence: f.read_text(encoding='utf-8')[:2000] reads the ENTIRE file then slices — measured 171 HTML files = 23.9MB read per regen (largest single file 8.62MB mc-4223-ua-signoff-report.html) plus 454 .md files = 4.2MB, from /Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/reports (cloud-backed FUSE). Locally cached this is 0.06s; a bounded 4KB read is 0.02s, but on a cold/evicted Drive cache each full read becomes a network download (28MB per regen, every 30s). 
Output side: p-reports panel = 688,229 B (27.7% of page), 489 report-card divs, avg 1,399 B each (scratch panelshare.py).</li> <li>Suggested action: Read only the first 4KB (open(f,'rb').read(4096)) for metadata extraction, and cap the rendered list to the newest ~100 with the rest behind the existing filter/search.</li> <li>Verifier: Independently confirmed every element: (1) dashboard.py:1440 does f.read_text()[:2000] (full read then slice) for HTML and :1481 [:3000] for md; (2) _primary_output_base (dashboard.py:1561-1575) resolves via pka_paths.outputs_dir to the GDrive CloudStorage FUSE path /Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/reports; (3) ran readcost.py: 172 HTML = 24.0MB</li> </ul> <h4>44. Token-dashboard API GETs cost 0.24-1.22s against a 700.9MB DB, and 10 routes each kick a 2GB transcript rescan on a 30s cooldown</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: performance</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (318-352, 1609-1775)</li> <li>Impact: Token tab feels sluggish (~1s per card) and keeps a background scanner thread cycling over a 2GB tree every 30s while open; the 700MB DB grows unboundedly inside ~/.claude.</li> <li>Evidence: Measured (3 runs each): /api/token-dashboard/overview = 1.219s / 0.471s / 0.707s; /api/token-dashboard/sessions = 0.361s / 0.271s / 0.240s. ~/.claude/token-dashboard.db = 700.9MB (+4.6MB WAL). Ten routes (lines 1613-1757) each call _maybe_scan_token_dashboard(), which spawns a background scan_dir thread over ~/.claude/projects whenever >30s since the last scan — that tree is 4,940 .jsonl files / 2.03GB (measured via scratch walkcost.py; the stat-walk alone is 0.16s, the scan parses new content on top). 
The token tab polling these endpoints re-triggers a scan every 30s.</li> <li>Suggested action: Raise the scan cooldown to 5 min, add indexes/precomputed rollups for overview_totals, and add a retention/aggregation policy for token-dashboard.db.</li> <li>Verifier: Reproduced every measurable element: (1) timed 3 GETs each — overview 0.643/0.415/0.669s, sessions 0.307/0.237/0.245s on the live :8787 server; (2) ~/.claude/token-dashboard.db = 774MB + 6.2MB WAL (grown from the claimed 700.9MB, confirming unbounded growth); (3) read serve_dashboard.py:318-352 — _maybe_scan_token_dashboard spawns a daemon scan thread on a 30s cooldown — and confirmed all ten toke</li> </ul> <h4>45. Reports list polluted by placeholder '2099-01-01' artifacts and duplicate cards</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: ux-live</li> <li>Where: <code>http://localhost:8787/dashboard.html (Reports tab; source files in /Users/elmar/PKA/reports/)</code></li> <li>Impact: Obviously-bogus 2099 dates near the top of the list undermine trust in the whole Reports index, and duplicate cards make it unclear which Kapama report is current.</li> <li>Evidence: Reports tab (sorted Date ▼) positions 5-8 of 492: 'Skill Audit — 2099-01-01' (data-date 2026-06-09, links /reports/skill-audit-2099-01-01.html), 'Sender intelligence proposal' (/reports/sender-intelligence-2099-01-01.html), 'Operator Tune 2099 01 01' — generator placeholder dates leaked into filenames/titles. Also two near-identical cards 'Kapama Family Stay Options — Interactive Comparison' on 2026-06-10 (/reports/kapama-family-stay-options.html and /reports/Kapama-family-options-for-Nicolette.html). Screenshot .scratch/wf-review/ux/02-tab-reports.png.</li> <li>Suggested action: Rename/regenerate the three 2099-01-01 report files with real dates (and fix whatever skill writes the 2099 placeholder); delete or merge the duplicate Kapama report.</li> <li>Verifier: Reproduced against the LIVE server (GET-only). 
curl of http://localhost:8787/dashboard.html shows 19 '2099-01-01' occurrences. Parsed all report-card divs: exactly 492 cards as claimed. Date-sorted top rows contain 'Skill Audit — 2099-01-01' (data-date 2026-06-09, /reports/skill-audit-2099-01-01.html), 'sender-intelligence-2099-01-01.html', and 'operator-tune-2099-01-01.md' at positions 6/7/9 (rev</li> </ul> <h4>46. Stuck-looking 'Loading…' on System tab Spend & Tokens tile for ~8s with no skeleton/progress</h4> <ul> <li>Kind: fragile · Effort: S · Dimension: ux-live</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (4263 (hydrateSpendTile))</li> <li>Impact: Tile looks broken on slower hydration, and a context-free '$35,997' headline number invites misreading (looks like a current bill).</li> <li>Evidence: On opening the System tab, 'USAGE BY ROLE / Spend & Tokens: Loading…' was still showing after the tab rendered; ~8s later it resolved to '$35,997.49 / 16,953,384,742 tokens'. The figure appears with no period or scope label (lifetime? month?), and the role list beneath shows tasks from 2026-03-31.</li> <li>Suggested action: Add the period/scope to the label (e.g. 'lifetime est. across all CLIs') and a spinner/skeleton; cache the value server-side so it bakes with the page.</li> <li>Verifier: Confirmed, and the latency is worse than claimed. (1) Code: /Users/elmar/PKA/dashboard.py:4180-4183 bakes the tile with body 'Loading…'; hydrateSpendTile at lines 4264-4287 fetches /api/token-dashboard/overview with NO since/until params and renders only '$cost' + 'N tokens' — no period/scope label, no spinner, no timeout (catch just sets '— data unavailable —'). Served HTML at http://localhost:87</li> </ul> <h4>47. 
Every open tab refetches the full 2.4MB dashboard and triggers a dashboard.py regen subprocess every 30s; all 3,393 activity rows and 489 report cards are baked into one page</h4> <ul> <li>Kind: simplify · Effort: L · Dimension: data-accuracy</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (248-252 (REGEN_COOLDOWN=10), 409-411; generated JS doRefresh()/setInterval 30000)</li> <li>Impact: ~7GB/day transfer per idle open tab plus a continuous regen subprocess churning vault.db and ~/.claude/projects rglob — heavy machinery for a status page, and the main reason the page is 2.5MB.</li> <li>Evidence: doRefresh() fetches /dashboard.html?t=… every 30s and innerHTML-swaps the whole shell; each request calls regenerate() (full 4,864-line dashboard.py run, 10s cooldown). Page is 2,554,781 bytes with 3,393 activity entries and 489 report cards baked in; logs/dashboard.stderr.log is 59.7MB; lsof showed multiple persistent Chrome connections; dashboard.html mtime advances every ~30s while tabs are open (observed 22:49:42→22:50:41→22:58:16).</li> <li>Suggested action: Serve deltas via the existing WebSocket or small JSON APIs; lengthen the fallback poll; paginate activity feed (e.g. latest 200) and reports grid.</li> <li>Verifier: Fully reproduced every element of the finding against both the on-disk code and the LIVE server (behaviour identical on this path). (1) Code: /Users/elmar/PKA/serve_dashboard.py:250-251 sets REGEN_COOLDOWN=10; the /dashboard.html route (lines ~403-412) calls regenerate() on every GET; regenerate() (line ~355) runs <code>subprocess.run([sys.executable, dashboard.py])</code> synchronously with a 30s timeout, g</li> </ul> <h4>48. 
~300 lines of dead code from the 9→6 tab restructure still execute or sit unused every regeneration</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: gen-correctness</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (335-348, 447-472, 1370-1389, 1754-1827, 2207-2247, 2465-2512, 3417, 4826, 4832)</li> <li>Impact: Wasted work on every regeneration (which happens as often as every ~30s, see polling finding) and a 4,864-line file that's materially harder to review — the skills-panel builder in particular looks live but its output is discarded, inviting someone to 'fix' the wrong layer.</li> <li>Evidence: grep over the file shows zero callers for: get_graph_data (335), get_tasks (447), group_tasks_by_owner (1370), build_task_board (2207), build_gbrain_tile (1754; line 3261 hardcodes gbrain_tile_html="" instead). build_skills_panel IS executed every regen (line 3417 skills_panel_html=...) but the result is never inserted — generated dashboard.html contains neither 'filterSkills' nor 'skills-tbody' (verified by grep on the 2.4MB output). get_vault_structure() runs in main (4826, scans ~/.claude/vault) and is passed to generate_html (3114) which never reads it. get_gbrain_health() result is passed but only feeds the empty-string tile. build_evals_panel also has unused vars (last7_leakage/cutoff7, lines 2934-2935).</li> <li>Suggested action: Delete the uncalled builders, drop the build_skills_panel call + skills_panel_html, remove get_vault_structure/gbrain plumbing (keep get_gbrain_health only if something still reads the retirement marker).</li> <li>Verifier: Independently confirmed every claim. (1) grep over /Users/elmar/PKA/dashboard.py shows ONLY def-sites for get_graph_data (335), get_tasks (447), group_tasks_by_owner (1370), build_task_board (2207), build_gbrain_tile (1754) — no production callers; the sole external reference is tests/test_dashboard_health_cards.py:55 calling build_gbrain_tile, which keeps dead code test-green. 
(2) Line 3261 hardc</li> </ul> <h4>49. Dashboards launcher hides stale exports via a hardcoded filename blocklist that will silently rot</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: gen-correctness</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (1677-1688)</li> <li>Impact: Launcher clutter creeps back with every dated export; the blocklist gives false confidence that duplicates are handled.</li> <li>Evidence: hidden_dashboard_files = {"Dashboard.html", "Exco Dashboard - 12 May 2026.html", "Exco Dashboard - 26 May 2026.html"} plus a special-case <code>if f.name == "smart-money.html": continue</code>. Every newly exported dated duplicate (e.g. the next 'Exco Dashboard - 9 Jun 2026.html' dropped into the dashboards dir) reappears as an anonymous grey card until someone edits the generator, while the launcher's curated meta dict (1584-1676) already defines the canonical set.</li> <li>Suggested action: Invert the rule: render only files present in the meta dict (plus an explicit 'other files' count/link), or glob-exclude patterns like 'Exco Dashboard - *.html'.</li> <li>Verifier: Read /Users/elmar/PKA/dashboard.py:1677-1699 — the cited code exists exactly as claimed: hidden_dashboard_files = {"Dashboard.html", "Exco Dashboard - 12 May 2026.html", "Exco Dashboard - 26 May 2026.html"} (lines 1677-1683), a smart-money.html special case (1687-1688), and a grey fallback meta (color #64748b, group Tools, empty desc) for any filename not in the curated meta dict (1689-1698). Did </li> </ul> <h4>50. 
Dead local-task-board chain: get_tasks, group_tasks_by_owner, build_task_board and the mc_bridge import block (~120 lines)</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (447-472, 1370-1389, 2207-2247, 72-81)</li> <li>Impact: ~120 lines of dead code in the generator that future sessions read, maintain, and can mistakenly re-wire.</li> <li>Evidence: Repo-wide grep for get_tasks/group_tasks_by_owner/build_task_board (--include=*.py) returns ONLY their def lines in dashboard.py — zero call sites (the asana skill's get_tasks is an unrelated method). build_task_board's ticket_map consumer is the try/except import of mc_bridge get_task_ticket_map/get_mc_url at lines 72-81, which are also referenced nowhere else (uses only at the import/stub lines 74/77/80). main() builds the page without any of them.</li> <li>Suggested action: Delete get_tasks, group_tasks_by_owner, build_task_board, the mc_bridge import block, and the orphaned tb-/task- CSS rules (see CSS finding).</li> <li>Verifier: Independently reproduced. (1) Ran the repro grep myself (rg over *.py, asana excluded): get_tasks, group_tasks_by_owner, build_task_board occur ONLY at their def lines in dashboard.py (447, 1370, 2207); get_task_ticket_map/get_mc_url occur only at the dashboard.py import/stub lines 74/77/80 plus inside mc_bridge.py itself (its own definitions/docstring, not a consumer of dashboard's import). (2) R</li> </ul> <h4>51. 
GBrain (retired 2026-05-01) tile code still present: build_gbrain_tile never called in production, get_gbrain_health result discarded (~95 lines)</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (946-960, 1754-1826, 4832, 4851)</li> <li>Impact: Retired-system code (88 lines + param plumbing + a test pinning it) runs a health probe whose result is thrown away on every dashboard regeneration.</li> <li>Evidence: generate_html contains the explicit comment '# GBrain is retired, so do not surface a permanent unavailable tile' and hardcodes gbrain_tile_html = ""; the gbrain_data parameter has 0 uses in the function body (verified by counting identifier occurrences). build_gbrain_tile (1754-1826, 73 lines) has no production caller — its only call site is tests/test_dashboard_health_cards.py:55. Yet main() still calls get_gbrain_health() at 4832 and threads gbrain_data=gbrain_data into generate_html at 4851 on every regen. Note: the .gbrain-* CSS classes are NOT fully dead — build_qmd_tile reuses them (21 occurrences), so keep/rename the CSS.</li> <li>Suggested action: Delete build_gbrain_tile, get_gbrain_health, the gbrain_data parameter and main() call; update tests/test_dashboard_health_cards.py; optionally rename gbrain- CSS classes to qmd- since only the QMD tile uses them.</li> <li>Verifier: Independently reproduced every cited line: get_gbrain_health stub at dashboard.py:946-960 (returns static retirement dict — note: no actual probe runs, so the claimed 'health probe thrown away' impact is overstated; it's pure dead code, not runtime cost); build_gbrain_tile at 1754-1826 with sole caller in tests/test_dashboard_health_cards.py:55 (grep confirmed); generate_html param gbrain_data=Non</li> </ul> <h4>52. 
get_vault_structure() computed on every regen but its generate_html parameter is never used</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (475-488, 4826)</li> <li>Impact: Wasted vault filesystem walk on every dashboard regeneration (which the server triggers on page views) plus a misleading parameter implying the vault tree is rendered.</li> <li>Evidence: Identifier count of 'vault_structure' inside generate_html's body (def line to def main) is 1 — the signature only; 0 uses in the body. main() line 4826 still calls get_vault_structure() (walks the ~/.claude/vault directory tree) and passes the result positionally.</li> <li>Suggested action: Remove the vault_structure parameter, the main() call, and get_vault_structure() (lines 475-488).</li> <li>Verifier: Independently confirmed all claimed evidence: (1) read dashboard.py:475-486 — get_vault_structure() walks ~/.claude/vault; (2) read dashboard.py:4826/4845 — main() computes vault_structure and passes it positionally to generate_html; (3) read generate_html signature at dashboard.py:3108-3127 — vault_structure is param #6; (4) grep over the whole file shows exactly 4 occurrences (function def, sign</li> </ul> <h4>53. Dead closeSession() JS emitted into every generated page; its target POST /api/sessions/close has no remaining caller</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (4576-4582 (and serve_dashboard.py:1925-1957))</li> <li>Impact: Dead UI affordance shipped in every 2.5MB page render, and a mutating tmux-killing endpoint kept alive with no UI path to it — confusing for the next person hardening the server.</li> <li>Evidence: Scanning the generated dashboard.html: 26 JS functions defined, only 'closeSession' and 'cleanupSessions' have zero call sites in markup/JS. 
cleanupSessions IS conditionally wired (onclick emitted by build_session_cards line 2177 when stale sessions exist), but closeSession has no emitter anywhere in dashboard.py — grep finds only the function definition at 4576. Its fetch('/api/sessions/close') is therefore the only non-test reference to serve_dashboard.py's api_session_close (1925-1957); other repo hits are tests/test_csrf_guard.py and docs.</li> <li>Suggested action: Delete the closeSession JS from generate_html; either delete /api/sessions/close + its CSRF test or re-wire a close button on the session cards if the capability is still wanted.</li> <li>Verifier: Reproduced: dashboard.py has exactly one 'closeSession' hit (line 4576, the definition, read lines 4565-4589); generated dashboard.html contains exactly one occurrence (the definition) and zero call sites; cleanupSessions by contrast IS wired via onclick emitted at dashboard.py:2177. serve_dashboard.py:1925-1955 defines POST /api/sessions/close whose only non-doc reference is the dead JS fetch. Tw</li> </ul> <h4>54. dispatch_tracker import + stub is permanently on the fallback path — module doesn't exist and get_dispatch_stats is never called</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (47-59)</li> <li>Impact: 13 lines of dead import/stub suggesting a tracker integration that was removed.</li> <li>Evidence: os.path.exists('/Users/elmar/PKA/dispatch_tracker.py') is False and repo-wide grep for dispatch_tracker (--include=*.py) matches only dashboard.py. 
Occurrences of get_dispatch_stats in dashboard.py are lines 48 (import) and 51 (stub def) — nothing ever calls it; the live dispatch panel uses get_agent_dispatch_stats(activity) instead.</li> <li>Suggested action: Delete the try/except dispatch_tracker block (lines 47-59).</li> <li>Verifier: Read dashboard.py:47-58 directly: try-import of dispatch_tracker.get_dispatch_stats with an ImportError stub returning zeroed dict. Confirmed dispatch_tracker.py does not exist (ls: No such file). Repo-wide grep (*.py) for dispatch_tracker/get_dispatch_stats matches only dashboard.py lines 48 (import) and 51 (stub def) — no call sites in dashboard.py or serve_dashboard.py. The dispatch panel uses </li> </ul> <h4>55. get_graph_data() is dead (never called anywhere)</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (335-348)</li> <li>Impact: 14 dead lines querying vault.db graph tables for a panel that no longer exists.</li> <li>Evidence: Repo-wide grep for get_graph_data (--include=*.py) returns only the def line at dashboard.py:335. Nothing in main(), generate_html, or any other module references it.</li> <li>Suggested action: Delete get_graph_data (lines 335-348).</li> <li>Verifier: Independently confirmed. Read /Users/elmar/PKA/dashboard.py lines 335-348: def get_graph_data(conn) exists exactly as claimed, 14 lines querying vault.db <code>files</code> and <code>edges</code> tables and returning (nodes, edges) lists. Re-ran the repro: grep -rn 'get_graph_data' across the repo with --include='*.py' returns only dashboard.py:335 (the def line) plus unrelated same-named functions in .scratch/ copies </li> </ul> <h4>56. 
~74 CSS classes emitted into every generated dashboard.html have no matching markup or JS — leftovers from removed session cards, task board, scheduler card and the 9-to-6 tab restructure</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (~3560-3995 (CSS block inside generate_html))</li> <li>Impact: ~65 stylesheet lines of dead weight regenerated into every ~2.5MB dashboard.html, and a misleading map of UI components that no longer exist.</li> <li>Evidence: Automated scan of the generated /Users/elmar/PKA/dashboard.html: of 267 CSS class selectors in &lt;style&gt;, 74 appear in no class attribute and no JS string. Clusters: sess- x18 (old session-card design, replaced by war- cards — .sess-card/.sess-grid only exist as CSS at dashboard.py:3718-3719), task- x10 + tb- x8 (dead task board), sched- x7 (removed scheduler card, dashboard.py:3941-3942), hc-/health-card x7 (3950-3953), brief-/apps-command-brief x6 (3560-3566, 3788 — all CSS, no emitter), luci-meta x3 (3933, matching the 'Luci-scoped helpers removed' comment at line ~65), war-link/war-id/war-persona, nav-badge. Each was cross-checked against dashboard.py: only occurrences are inside generate_html's stylesheet (or inside the already-dead build_task_board). Conditional classes (kyc-, mac-task-empty/error, alert-hidden, sess-portrait/sess-initial) were excluded from this count where an emitter exists.</li> <li>Suggested action: Delete the sess-, task-, tb-, sched-, hc-/health-card, brief-/apps-command-brief, luci-meta, nav-badge and orphan war- rules from generate_html's CSS (verify each against a fresh regen first since some classes are conditionally emitted).</li> <li>Verifier: Independently reproduced. 
Ran the claimed repro on /Users/elmar/PKA/dashboard.html (2.4MB, 44,012 lines): 267 CSS class selectors, 76 raw unused — matches claim's ~74 once their stated exclusions (alert-, sess-initial, kyc-, mac-task-empty/error, which DO have live conditional emitters at dashboard.py:2153-2204, 2553-2574, 1993-2111) are applied. Verified cited source lines: brief-*/apps-command</li> </ul> <h4>57. token-dashboard web app exists in three identical copies (repo source, repo dashboards/ fallback, cloud PKA-Outputs) — ~1MB each, dominated by vendored echarts.min.js</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboards/token-dashboard/</code></li> <li>Impact: Three copies of the same web app with no declared sync mechanism — a fix to Projects/token-dashboard/web silently won't reach the served (cloud) copy.</li> <li>Evidence: diff -rq /Users/elmar/PKA/dashboards/token-dashboard /Users/elmar/PKA/Projects/token-dashboard/web reports zero differences (app.js 5.0K, charts.js 4.8K, echarts.min.js 1005.1K, index.html, style.css, routes/). A third copy exists at PKA-Outputs/dashboards/token-dashboard (verified file listing). Serving is cloud-first (serve_dashboard.py _output_bases), so the repo dashboards/ copy is normally shadowed; dashboards/ is git-untracked (git ls-files dashboards → 0 files).</li> <li>Suggested action: Serve the token-dashboard UI directly from Projects/token-dashboard/web via a dedicated route (single source), or document one canonical copy and delete the repo dashboards/ duplicate.</li> <li>Verifier: Independently reproduced every element of the claim. (1) <code>diff -rq /Users/elmar/PKA/dashboards/token-dashboard /Users/elmar/PKA/Projects/token-dashboard/web</code> → zero differences ("IDENTICAL"). 
(2) <code>diff -rq</code> of the GDrive copy ('/Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/dashboards/token-dashboard') against the repo dashboards/ copy → also zero differences</li> </ul> <h4>58. get_index_rebuild_health and get_kg_sync_health are near-identical 22-line twins</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (1124-1169)</li> <li>Impact: ~20 duplicated lines; a fix to one health reader (e.g. the timezone parse) can silently miss the other.</li> <li>Evidence: Duplicate-block hash scan flagged lines 1127-1136 == 1151-1160 verbatim (read both: identical _read_status_json/_status_tier/ts-parse/age_label logic; the only differences are the status-file name and one dict key files_count vs files_scanned).</li> <li>Suggested action: Collapse into one _status_health(path, count_key) helper called twice.</li> <li>Verifier: Read /Users/elmar/PKA/dashboard.py lines 1090-1170 directly. get_index_rebuild_health (1124-1145) and get_kg_sync_health (1148-1169) are confirmed near-identical: lines 1127-1136 and 1151-1160 are verbatim duplicates (same _read_status_json/_status_tier calls, same ts-parse try/except, same _age_label). Only differences are the status-file name (.vault-index-status.json vs .kg-sync-status.json) an</li> </ul> <h4>59. Small cross-file duplications: read-only sqlite opener and frontmatter parser implemented in both generator and server</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: dead-dup</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (259-261, 2600-2613 (vs dashboard.py:42-44, 1067-1082))</li> <li>Impact: Drift risk: two frontmatter dialects can disagree on the same files (e.g. 
multiline YAML values parse in the server but not the generator).</li> <li>Evidence: dashboard.py connect_vault_db() and serve_dashboard.py _connect_db_ro() are the same one-liner (sqlite3.connect(f"file:{path}?mode=ro", uri=True) with the same docstring rationale). dashboard.py _parse_frontmatter (manual line parser) and serve_dashboard.py _parse_sb_frontmatter (yaml.safe_load) both parse the same '---' frontmatter convention with different fidelity — keys parsed by one can fail in the other.</li> <li>Suggested action: Move both helpers into pka_paths or a small shared util module imported by generator and server.</li> <li>Verifier: Independently re-read all four cited locations. dashboard.py:42-44 connect_vault_db() and serve_dashboard.py:259-261 _connect_db_ro() are byte-equivalent one-liners (sqlite3.connect(f"file:{...}?mode=ro", uri=True)) with the same docstring rationale. dashboard.py:1067-1082 _parse_frontmatter (manual split-on-colon, string values only) and serve_dashboard.py:2600-2612 _parse_sb_frontmatter (yaml.sa</li> </ul> <h4>60. No HTTP compression: 2.49MB raw per response, 222KB gzipped (11.2x)</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: performance</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (409-413, 2869 (app.run))</li> <li>Impact: 11x avoidable transfer on every page load and every 30s poll; noticeable when viewing over Tailscale from another machine.</li> <li>Evidence: curl -s -D - -H 'Accept-Encoding: gzip, br' http://localhost:8787/dashboard.html -> Content-Length: 2493769, no Content-Encoding, Connection: close, Server: Werkzeug/3.1.8. gzip -c /Users/elmar/PKA/dashboard.html | wc -c -> 222,236 bytes (11.2x smaller). 
Combined with the 30s poll, this is ~5MB/min uncompressed per open tab (vs ~0.44MB/min gzipped), which matters over Tailscale (bind_host default 0.0.0.0, line 2855).</li> <li>Suggested action: Add flask-compress (2 lines) or pre-gzip the file at regen time and serve with Content-Encoding: gzip.</li> <li>Verifier: Reproduced live: GET /dashboard.html with Accept-Encoding gzip,br returned 200 with Content-Length 2,513,755 and NO Content-Encoding (Server: Werkzeug/3.1.8). gzip -c dashboard.html = 224,315 bytes (~11.2x smaller); minor byte drift vs claim is explained by regenerate() rewriting the file per request. Verified serve_dashboard.py:409-413 returns plain Response with no compression, bind_host default</li> </ul> <h4>61. get_vault_db_health is 43% of the in-process build time (0.66s of 1.52s)</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: performance</li> <li>Where: <code>/Users/elmar/PKA/dashboard.py</code> (main() step get_vault_db_health, called near line 4830)</li> <li>Impact: Almost half of every page build is spent recomputing a slow health card plus invoking a qmd CLI that currently always fails; this is pure overhead, incurred twice a minute.</li> <li>Evidence: Read-only profile of dashboard.py's generation steps (scratch profile_dash.py): TOTAL 1.52s, of which get_vault_db_health 0.66s, generate_html 0.31s, get_qmd_health 0.26s (shells out to ~/.bun/bin/qmd, which fails in 0.28s on every single regen with a Node error, measured), get_sessions 0.15s (rglob over 4,940 jsonl). The health card is recomputed from scratch every 30s regen cycle.</li> <li>Suggested action: Cache health-card results for 5-10 min (they change slowly) and short-circuit get_qmd_health after the first failure of the day; fix or remove the broken qmd invocation.</li> <li>Verifier: Independently reproduced. Read dashboard.py:126-165: get_vault_db_health runs PRAGMA quick_check (full scan) on Vault/vault.db, which I measured at 154 MB; main() calls it at line 4834 every regen.
Re-ran the read-only profile (.scratch/wf-review/profile_dash.py): TOTAL 2.53s with get_vault_db_health 1.18s = 47%, close to the claimed 43%; absolute numbers vary with machine load. Also con</li> </ul> <h4>62. Every /dashboard.html response re-reads the 2.4MB file and string-scans it for script injection</h4> <ul> <li>Kind: simplify · Effort: S · Dimension: performance</li> <li>Where: <code>/Users/elmar/PKA/serve_dashboard.py</code> (409-425)</li> <li>Impact: ~13-40ms and a 2.4MB allocation per request that could be near-zero with a cached/pre-injected file; minor alone, but multiplied by the 30s poll forever.</li> <li>Evidence: dashboard() does Path(DASHBOARD).read_text(...) then _inject_annotate(html): two full scans ('__denAnnotateLoaded' in html, '&lt;/body&gt;' in html) plus a replace over 2.49MB, on every request, served via Werkzeug/3.1.8 dev server with Connection: close (header observed). Cooldown-hot responses measured 0.013-0.039s, so it's tolerable today, but the work is identical every time since the annotate tag could be appended once at generation.</li> <li>Suggested action: Inject the annotate tag in dashboard.py at generation time and serve with send_file (enables Werkzeug conditional/etag responses), or cache the injected string keyed on file mtime.</li> <li>Verifier: Independently confirmed every element. (1) Code: read /Users/elmar/PKA/serve_dashboard.py:409-425: dashboard() calls regenerate(), then Path(DASHBOARD).read_text(...) on every request, then _inject_annotate(html), which does substring scans for '__denAnnotateLoaded' and the annotate tag, then a '&lt;/body&gt;' scan plus str.replace over the full ~2.4MB string. (2) Live behaviour matches on-disk code de</li> </ul> <hr /> <h2>Refuted findings (did not survive adversarial verification)</h2> <ul> <li>(server-correctness) Live server predates on-disk code; staged findash freshest-file fix is NOT running</li> <li>Refutation: Could not reproduce: the claimed condition no longer exists.
(1) The cited process is gone: <code>ps -p 78078</code> returns nothing; port 8787 is now served by PID 12110, started Thu 11 Jun 18:13:09 2026 (after the finding was written). (2) The "staged" fix i</li> <li>(server-correctness) Dead '..' string checks in /wiki and /sb-wiki (Werkzeug normalizes before handler)</li> <li>Refutation: I re-read serve_dashboard.py:2580-2581 (wiki) and 2676-2677 (sb-wiki); both do <code>if ".." in page: return "Invalid path", 400</code>. I reproduced every HTTP fact: GET /wiki/.. and /sb-wiki/.. -> 302 to /dashboard.html (Werkzeug collapses the segment to / b</li> <li>(data-accuracy) No generated-at timestamp anywhere on the page; regeneration failures are silent (observed a 27-minute stale window)</li> <li>Refutation: The headline claim, "no generated-at timestamp anywhere on the page", is factually wrong. I fetched http://localhost:8787/dashboard.html (HTTP 200, 2.49MB) and the page header contains a baked generation timestamp: '&lt;div class="timestamp"&gt;2026-06-1</li> <li>(dead-dup) Live server process (started Jun 9 15:06) predates today's serve_dashboard.py changes; disk has findash freshest-file/OneDrive logic the running server does not</li> <li>Refutation: The finding's specific evidence no longer reproduces, and its claimed impact is demonstrably false against the live server. (1) Process start: pgrep/ps shows exactly one serve_dashboard.py process, PID 12110, started 'Thu 11 Jun 18:13:09 2026', not </li> </ul> <hr /> <h2>Coverage notes (what was and wasn't checked)</h2> <h3>gen-correctness</h3> COVERED: Read /Users/elmar/PKA/dashboard.py essentially end-to-end (lines 1-3100 fully; 3100-3700 = generate_html data assembly + CSS skimmed; 3700-4000 mostly CSS skimmed; 4000-4864 fully, including all emitted JS and main()).
Cross-checked against docs/plans/2026-06-05-command-centre-6tab-restructure.md (6 tabs/panels present and default=Home ✓; sticky thead th present ✓; spend tile wired to /api/token-dashboard/overview and field names verified live (cost_usd=35663.06, input/output/cache_read <h3>server-correctness</h3> Read serve_dashboard.py in full (all 2873 lines) and live-probed the running server at http://localhost:8787 with GET-only requests (HARD RULES honored: no POST/PUT/DELETE, never touched /api/findash/refresh or /refresh-fuel, read-only sqlite not needed). Verified routes: /, /dashboard.html, /dashboards/&lt;f&gt;, /reports/&lt;f&gt;, /static/&lt;f&gt;, /eval-results/&lt;f&gt;, /api/annotations/inbox, /api/findash/refresh*/status, /api/findash/logs(+/&lt;name&gt;), /hermes-workers(+/&lt;id&gt;), /md-view, /api/v1/brain/file, /api/v <h3>data-accuracy</h3> Tested the LIVE server (PID 78078, started Tue 9 Jun 15:06; confirmed it serves the disk dashboard.html plus injected /static/annotate.js, and that regen-on-request works on the live process, so live behaviour matched the on-disk generator for everything tested except /skill-audit, which is missing from BOTH the running process and the modified on-disk serve_dashboard.py). All HTTP checks were GET-only; refresh endpoints untouched; sqlite opened mode=ro; scratch confined to .scratch/wf-review/. <h3>dead-dup</h3> Covered: full call-graph dead-code scan of all 60 top-level functions in /Users/elmar/PKA/dashboard.py and all non-route functions in /Users/elmar/PKA/serve_dashboard.py (every 'never called' claim verified by repo-wide grep, with decorator/thread-target/blueprint false positives individually cleared: _csrf_same_origin_guard, _annotations_cors, broadcast_event, _run_pipeline, _run_fuel_refresh, check_deps, _brain_allowed_roots are all alive); all Flask routes checked for missing target files/di <h3>performance</h3> Tested the LIVE server (PID 78078, started Tue 9 Jun 15:06, RSS ~95MB) with GET-only requests; never touched /api/findash/refresh*.
Note on disk-vs-live: serve_dashboard.py's uncommitted change is STAGED only (git diff empty; git diff --cached = +20/-2 confined to _output_dir_for, the findash.html freshest-file selection) — none of the code I cite (regenerate, dashboard route, token-dashboard, ws) is touched by that diff, so disk code and live behaviour agree for these findings; the one staged f <h3>ux-live</h3> Tested the LIVE server at http://localhost:8787 (running process started Tuesday) in real headless Chrome via browser-harness, viewport 1440x900 plus 1280/1200/1100/1000/950/900 widths. Covered: all 6 tabs clicked and screenshotted (.scratch/wf-review/ux/01-16*.png); all 22 Home links HTTP-verified (all 200/expected, no dead links); Reports filter (492→33 'fuel'), all 3 sorts, bulk-bar; Memory type filters (96→42); Evals drilldowns expanded + raw JSON link (200); System tab links (/skill-audit 2</div> </body> </html>