Scope: dashboard.py (generator, 4,864 lines), serve_dashboard.py (Flask server, 2,873 lines), the generated dashboard.html (~2.5MB), and the LIVE app at http://localhost:8787 — tested in a real browser (all 6 tabs, screenshots in .scratch/wf-review/ux/).
Method: ultracode multi-agent workflow (run wf_ab1d9721-e04, 79 agents): 6 dimension finders (generator correctness, server correctness, data accuracy, dead/duplicated code, performance, live-browser UX) → every finding independently reproduced by an adversarial verifier → second independent verifier for high-severity items. 62 findings confirmed, 4 refuted. Read-only review; nothing fixed.
Headline: the command centre has one architectural keystone problem (viewing the page triggers the full data pipeline, and the page re-downloads itself every 30s) and a cluster of silently-wrong numbers on the Home tab (session counts contradict themselves three ways; the activity feed shows March as 'recent'). Several health cards show green for things that are actually broken — the exact failure the dashboard exists to catch.
Findings (priority order)
Severity: HIGH
1. Session counting: page shows three contradictory numbers (baked 106/43-active war strip vs live API 0), driven by get_sessions counting every transcript .jsonl — claude-mem observer files flood it
- Kind: broken · Effort: M · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (742-871 (get_sessions), 2134-2184 (build_session_cards), 4046-4050 (baked hero), 4231-4261 (hydrateSessionBar))
- Impact: Home shows '0 sessions' seconds after load while the war strip directly above shows ~107 session cards; the 'active session' signal is unusable. Plan task 5 (docs/plans/2026-06-05-command-centre-6tab-restructure.md: 'replace baked counts with client-side fetch of /api/sessions') was only half-implemented — war strip + alerts are still baked from a different counting method than the API, and the audit's 'session counts frozen+self-inconsistent' finding persists.
- Evidence: Live-served dashboard.html (GET :8787/dashboard.html, regenerated 22:44): war strip = 107 baked cards, top cards labelled 'Claude Mem Observer Sessions · -Users-elmar--claude-mem-observer-sessions'; baked hero id=home-open-sessions = 106 ('43 active · 63 idle · 0 stale'). GET /api/sessions at the same moment → {counts:{active:0,idle:0,stale:0,closed:0}, sessions:[]} — hydrateSessionBar (line 4251-4254) then overwrites the hero to 0. Filesystem confirms 46 .jsonl transcripts modified <5min, mostly under ~/.claude/projects/-Users-elmar--claude-mem-observer-sessions/ — get_sessions (lines 769-818) counts each .jsonl file in every project dir as one session, with no exclusion of the claude-mem observer pseudo-project or subagent sidechains.
- Suggested action: Pick one source of truth: exclude the claude-mem-observer project dir (and ideally sidechain .jsonl) in get_sessions, render the war strip client-side from /api/sessions, or at minimum reconcile the generator's counting rules with serve_dashboard's /api/sessions rules.
- Verifier: Independently reproduced every element on the live server (2026-06-12 13:18). GET :8787/api/sessions returned {sessions:[], counts:{active:0,idle:0,stale:0,closed:0}} — handler serve_dashboard.py:1887-1922 reads only the manual sessions.json registry. Simultaneously, the live-served dashboard.html had baked hero id=home-open-sessions=19 and 20 war-cards, including 6 labelled 'Claude Mem Observer S
- Second verifier: Re-reproduced live (2026-06-12 13:32, GET-only): /api/sessions returned all-zero counts with empty sessions[], while the simultaneously served dashboard.html had baked hero home-open-sessions=119 ('57 active · 62 idle · 0 stale') and ~123 war-cards, 50 labelled 'Claude Mem Observer Sessions'. Read s
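Sketch for finding 1: one way the transcript scan could skip the claude-mem observer pseudo-project. This is a minimal illustration, not the code in get_sessions. The function name, the exclusion set and the 4h default are assumptions; only the directory name and the *.jsonl-per-project layout come from the evidence above. Sidechain filtering is left out because the report does not say how sidechain files are identified.

```python
import time
from pathlib import Path
from typing import Iterator, Optional, Tuple

# Assumption: directory name copied from the evidence; the real list may be longer.
EXCLUDED_PROJECT_DIRS = {"-Users-elmar--claude-mem-observer-sessions"}


def iter_session_transcripts(
    projects_root: Path,
    max_age_s: float = 4 * 3600,
    now: Optional[float] = None,
) -> Iterator[Tuple[str, Path]]:
    """Yield (project_dir_name, transcript_path) for recently modified
    transcripts, skipping excluded pseudo-projects."""
    now = time.time() if now is None else now
    if not projects_root.is_dir():
        return
    for project_dir in sorted(projects_root.iterdir()):
        if not project_dir.is_dir() or project_dir.name in EXCLUDED_PROJECT_DIRS:
            continue
        for transcript in sorted(project_dir.glob("*.jsonl")):
            # Keep only transcripts touched within the age window.
            if now - transcript.stat().st_mtime <= max_age_s:
                yield project_dir.name, transcript
```

If both the generator and /api/sessions called one shared helper like this, the baked and hydrated counts would at least start from the same file set.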
2. KYC unverified-fields card scans a non-existent path and always renders a green 'All KYC fields verified' all-clear
- Kind: broken · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (510-519)
- Impact: The Memory-tab KYC card permanently shows a green 0/'All KYC fields verified' while 4 fields are actually unverified — Elmar believes verification work is done when it isn't. Silent false positive since the scan directory vanished.
- Evidence: Line 510 sets sb_wiki = Path.home() / "CoWork" / "SecondBrain" / "wiki", but ls /Users/elmar/CoWork/SecondBrain/wiki → 'No such file or directory'. SecondBrain actually lives at ~/PKA/SecondBrain (per repo + pka_paths.secondbrain_wiki_root(), which this same file imports at line 62 and uses in get_auto_compile_health but NOT here). Running the identical scan logic against the correct root (/Users/elmar/PKA/SecondBrain/wiki) yields 4 unverified fields (yellownickel.md: 4). Generated and live-served dashboard.html both contain 'kyc-big-num zero">0<' and 'All KYC fields verified ✓' (verified via GET http://localhost:8787/dashboard.html).
- Suggested action: Replace the hardcoded ~/CoWork path with the already-imported secondbrain_wiki_root() (or ROOT/'SecondBrain'/'wiki'), and make get_kyc_unverified return an error marker when neither kyc/ nor entities/ exists instead of total=0.
- Verifier: Independently reproduced every claim. (1) Read dashboard.py:489-563 — get_kyc_unverified() hardcodes Path.home()/'CoWork'/'SecondBrain'/'wiki' at line 510; ls confirms /Users/elmar/CoWork/SecondBrain/wiki does not exist, so the .exists() guards at 514/518 silently skip both scan dirs and the function returns total=0 with no error. (2) Confirmed it is live code: called at dashboard.py:3423, rendere
- Second verifier: Second-pass verification with a severity-stress lens, all probes re-run independently. (1) ls confirms /Users/elmar/CoWork/SecondBrain/wiki does not exist while /Users/elmar/PKA/SecondBrain/wiki/{kyc,entities} are populated. (2) Re-ran the exact scan logic from dashboard.py:489-563 against the real
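Sketch for finding 2: the guard the suggested action describes, shown in isolation. The function name and the returned dict shape are hypothetical; the kyc/ and entities/ subdirectory names come from the suggested action. The actual field scan is not reproduced here because the report does not show it.

```python
from pathlib import Path
from typing import Sequence


def kyc_scan_dirs(wiki_root: Path, subdirs: Sequence[str] = ("kyc", "entities")) -> dict:
    """Return the scan directories that exist, or an explicit error marker
    when none do, so a missing root cannot render as a green zero."""
    present = [wiki_root / name for name in subdirs if (wiki_root / name).is_dir()]
    if not present:
        return {
            "status": "error",
            "total": None,  # deliberately not 0: nothing was scanned
            "error": f"no KYC scan directories under {wiki_root}",
        }
    return {"status": "ok", "scan_dirs": present}
```

The point of returning total=None rather than 0 is that the card renderer then has to treat "nothing scanned" differently from "scanned and clean".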
3. Home 'Open Sessions' is overwritten to 0 on page load — live API reads an empty registry while dozens of sessions are active
- Kind: broken · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/serve_dashboard.py (1887-1923 (api_sessions); dashboard.py:743-830 (get_sessions))
- Impact: The single most prominent number on the Home tab is wrong twice over: baked value uses one definition (transcript files), the hydrated value uses another (manual registry, currently empty). Elmar sees '0 sessions' while ~44 sessions are live, making the war-room view useless for spotting running/stuck work.
- Evidence: GET http://localhost:8787/api/sessions returned {"sessions": [], "counts": {"active": 0, "idle": 0, "stale": 0, "closed": 0}} (saved at .scratch/wf-review/sessions.json). ~/.claude/vault/sessions.json is 20 bytes (mtime Jun 10 22:14). The baked HTML simultaneously shows id="home-open-sessions">34 (22:23 bake), then 126 (22:50), then 148 '33 active · 90 idle · 25 stale' (22:58). hydrateSessionBar() in the generated page replaces the baked value with the API total on load and every 30s, so the user sees '0 sessions · 0 active'. Ground truth: 44 transcript .jsonl files under ~/.claude/projects modified within 5 min at 22:38 (this review session included).
- Suggested action: Make /api/sessions use the same transcript-autodetect logic as dashboard.py get_sessions() (or have both read one shared source), and investigate what emptied sessions.json at 22:14.
- Verifier: Reproduced end-to-end: (1) GET /api/sessions on the live server returned {"sessions": [], "counts": {active:0, idle:0, stale:0, closed:0}}; (2) ~/.claude/vault/sessions.json is 20 bytes / {"sessions": []}; (3) serve_dashboard.py:1887-1923 api_sessions reads ONLY the manual registry file, while dashboard.py:743-830 get_sessions (which bakes the HTML value) scans ~/.claude/projects/*/.jsonl mtimes
- Second verifier: Independently re-ran the key probes (GET-only, read-only). (1) Live GET http://localhost:8787/api/sessions returned {"sessions": [], "counts": {"active": 0, "idle": 0, "stale": 0, "closed": 0}} and ~/.claude/vault/sessions.json is 20 bytes containing {"sessions": []} — matches both verifiers. (2) se
4. Activity feed and 'Usage by Role' recents sort by id DESC, but activity_log ids are not chronological — today's events buried at position 1,787, 'recent' shows 31 March
- Kind: broken · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/dashboard.py (319-322 (get_activity_log), 607-611 (get_timesheet_data actor_recent))
- Impact: The activity feed's visible top is yesterday-and-older entries in jumbled order; today's 63 events are effectively invisible, and per-role 'recent activity' claims 2.5-month-old items are the latest work. Misleads any 'what happened today' glance.
- Evidence: vault.db (read-only): id range for date 2026-06-10 is 1448165-1449146 while date 2026-03-31 occupies HIGHER ids 1450164-1450179; max id 1450932 is a 2026-06-09 row. In the generated page, the first 2026-06-10 entry appears at position 1,787 of 3,393 evt-date entries; the Lucienne 'Usage by Role' card shows three 'ts-recent' lines all dated 2026-03-31 ('Task done: Push latest PKA changes…') while the true latest Lucienne row is 2026-06-10 ('findash single-writer plan v2…').
- Suggested action: ORDER BY date DESC, id DESC (or created_at DESC) in get_activity_log and the actor_recent query, since the indexer rebuild does not preserve chronological ids.
- Verifier: Independently reproduced all elements. (1) Read dashboard.py:319-322 and 607-611 — both get_activity_log and actor_recent use ORDER BY id DESC. (2) Read-only sqlite on vault.db: ids are not chronological — for actor Lucienne, ORDER BY id DESC returns 2026-03-31 rows (ids ~1672505-1672507, 'Task done: Push latest PKA changes to GitHub') while her true latest row is dated 2026-06-12; between two que
- Second verifier: Second-pass verification with a severity/proportionality lens; all probes read-only GET/SELECT. (1) Code: dashboard.py:319-322 (get_activity_log) and 607-611 (actor_recent) both use ORDER BY id DESC with no date ordering — confirmed by direct read. Searched the whole file for any compensating sort:
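Sketch for finding 4: a self-contained reproduction of the ordering problem using three ids and dates quoted in the evidence. The table here is a stand-in with only id and date from the report; the summary column and its text are illustrative, and the real activity_log schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity_log (id INTEGER PRIMARY KEY, date TEXT, summary TEXT)")
conn.executemany(
    "INSERT INTO activity_log VALUES (?, ?, ?)",
    [
        (1448165, "2026-06-10", "newest date, lowest id"),
        (1450164, "2026-03-31", "oldest date, higher id"),
        (1450932, "2026-06-09", "max id, not the newest date"),
    ],
)

# Current behaviour: ordering by id puts the March row above the 10 June row.
by_id = [row[0] for row in conn.execute(
    "SELECT date FROM activity_log ORDER BY id DESC")]

# Suggested ordering: date first, id only as a tie-breaker within a day.
by_date = [row[0] for row in conn.execute(
    "SELECT date FROM activity_log ORDER BY date DESC, id DESC")]

print(by_id)    # ['2026-06-09', '2026-03-31', '2026-06-10']
print(by_date)  # ['2026-06-10', '2026-06-09', '2026-03-31']
```

Sorting the date column as text is chronological only because the dates are ISO-8601 (YYYY-MM-DD) strings.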
5. Live WS refresh silently wipes in-progress user input every ~10-15s (filter text, checkbox selections)
- Kind: broken · Effort: S · Dimension: ux-live
- Where:
/Users/elmar/PKA/dashboard.py (4401-4402, 4476-4520 (doRefresh))
- Impact: Elmar cannot complete any multi-step interaction on the Reports tab — typed filter text and bulk-delete selections are destroyed mid-action every few seconds, making bulk operations effectively unusable and the page feel haunted.
- Evidence: Set #reports-search to 'fuel' (list correctly filtered 492→33 shown) and tagged the input with dataset.marker='X'. Polling every 5s: at t=10s value='fuel'/marker='X'; at t=15s value=''/marker=null, because the entire .shell innerHTML was replaced by doRefresh(), which fetches the full /dashboard.html?t= (2.5MB) on every 'sessions.updated' WS event and restores only active tab, sort mode and scrollY, but not filter text, report checkbox selections, or open details elements. A separate 2-min test showed filter+checked checkbox both reset to empty/0 (screenshot .scratch/wf-review/ux/16-reports-after-refresh.png). Tested against the LIVE server at :8787.
- Suggested action: In doRefresh(), preserve and restore #reports-search value (re-dispatch input event), checked .report-select-cb state and open details elements; better, debounce sessions.updated and patch only the session strip/counters instead of swapping the whole shell and re-downloading 2.5MB per event.
- Verifier: Read dashboard.py:4400-4523: sessions.updated → doRefresh(); doRefresh() refetches full /dashboard.html?t= and replaces .shell innerHTML, restoring only active tab, sort mode, bulk-bar init and scrollY — never filter text, checkbox selections, or open details. Line 4523 adds setInterval(doRefresh, 30000), so the wipe fires at least every 30s even without WS events. Verified the LIVE server at :878
- Second verifier: Independently re-verified both halves of the finding. (1) Source: /Users/elmar/PKA/dashboard.py:4400-4402 (sessions.updated → doRefresh), 4476-4519 (doRefresh refetches full /dashboard.html?t= and replaces .shell innerHTML, restoring only active tab, sort mode, bulk-bar recalc, memory count, scrollY
6. War-strip session ticker accumulates duplicate cards without bound (60 → 136 in ~30 min, 20,000px wide)
- Kind: broken · Effort: S · Dimension: ux-live
- Where:
/Users/elmar/PKA/dashboard.py (769-818 (get_sessions transcript scan, dedupe key line 801), 2159 (card render))
- Impact: The top-of-page 'what is running right now' strip is unreadable noise — dozens of duplicate cards for the same two activities, requiring ~15 screens of horizontal scrolling; genuine session info is buried.
- Evidence: Observed .war-strip .war-card count grow 60 → 68 (65s later) → 73 → 124 → 134 → 136 during the session; scrollWidth 19,821px vs 1,356px visible (screenshot .scratch/wf-review/ux/07-war-strip.png). Cards are near-identical: 'Claude Mem Observer Sessions · -Users-elmar--claude-mem-observer-sessions' ×5+ and 'PKA · Lucienne' ×17 per age bucket (0m,1m,2m...). Served HTML itself contained 116 'war-card' occurrences (server-side, not a JS append bug). Root cause read in code: get_sessions() dedupes by (project, jsonl.name) — every transcript .jsonl modified in the last 4h becomes its own card, and the claude-mem observer creates new transcript files continuously. Labels also expose the raw project slug '-Users-elmar--claude-mem-observer-sessions'.
- Suggested action: Dedupe sessions per (project) or per session UUID keeping only the most recent transcript, cap the strip at ~10 cards, exclude/aggregate the claude-mem observer project, and humanise the '-Users-elmar--*' slug.
- Verifier: Independently reproduced. Code: dashboard.py:769-818 get_sessions() makes every *.jsonl transcript modified <4h its own session card (dedupe key (project, jsonl.name) at line 801); build_session_cards() (lines 2134-2172, card render 2159) emits one war-card per active/idle session, no cap/aggregation. Live: GET http://localhost:8787/dashboard.html at 13:42 contained 155 'war-card' occurrences incl
- Second verifier: Independently re-verified two days after the first pass (different lens: severity/mitigation stress-test). Live GET of http://localhost:8787/dashboard.html shows 168 war-card occurrences, 79 identical 'Claude Mem Observer Sessions' cards (raw slug exposed) — persistent steady-state, matching first v
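Sketch for finding 6: collapsing per-transcript entries to one card per project and capping the strip. The dict keys ("project", "mtime") and the function name are assumptions for illustration; the real session records in get_sessions may use different fields, and keying on a session UUID instead of the project would follow the same pattern.

```python
from typing import Dict, Iterable, List


def collapse_sessions(sessions: Iterable[dict], max_cards: int = 10) -> List[dict]:
    """Keep only the most recently modified entry per project, newest first,
    and return at most max_cards entries."""
    latest: Dict[str, dict] = {}
    for session in sessions:
        current = latest.get(session["project"])
        if current is None or session["mtime"] > current["mtime"]:
            latest[session["project"]] = session
    ordered = sorted(latest.values(), key=lambda s: s["mtime"], reverse=True)
    return ordered[:max_cards]
```

For example, seventeen 'PKA · Lucienne' entries with different mtimes would collapse to the single newest one.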
7. Home claims '0 sessions / 0 active' while the war-strip directly above shows dozens of live session cards
- Kind: broken · Effort: S · Dimension: ux-live
- Where:
/Users/elmar/PKA/serve_dashboard.py (1887 (/api/sessions) vs dashboard.py:743-818)
- Impact: The two session indicators contradict each other on first read — Elmar cannot trust either; 'is anything running?' gets two opposite answers on one page.
- Evidence: GET http://localhost:8787/api/sessions (live server) returned {"sessions": [], "counts": {"active": 0, "idle": 0, "stale": 0, "closed": 0}} while the war-strip rendered 73+ active cards including 'PKA · Lucienne | 0m' (this very session). Home tab '.home-session-bar' showed '0 sessions | 0 active | 0 idle | 0 stale' and hero card '0 OPEN SESSIONS' (screenshots 01-home-top.png, 08-home-mid.png). hydrateSessionBar() (dashboard.py:4230) reads /api/sessions which only reflects the manual registry (~/.claude/vault/sessions.json), whereas the war-strip is baked from transcript auto-detection — two divergent session sources on the same screen.
- Suggested action: Make /api/sessions use the same transcript auto-detect + registry merge as the generator's get_sessions(), or have hydrateSessionBar count the baked war-strip cards as fallback when the registry is empty.
- Verifier: Reproduced end-to-end against the live server. (1) GET http://localhost:8787/api/sessions returned {"sessions": [], "counts": {active:0, idle:0, stale:0, closed:0}} — because serve_dashboard.py:1887-1922 reads only env.sessions_file, and ~/.claude/vault/sessions.json is 20 bytes / 0 sessions. (2) The live-served /dashboard.html contains 156 war-card session cards in the war-strip (baked by dashboa
- Second verifier: Independently re-confirmed against the LIVE server (2026-06-12): GET http://localhost:8787/api/sessions returned {"counts": {active:0, idle:0, stale:0, closed:0}, sessions: []} while the live-served /dashboard.html simultaneously contained 162 'war-card' occurrences and baked hero values home-open-s
Severity: MEDIUM
8. Three weekly wiki-compile targets (safair-holdings, safair-operations, safair-lease-finance) have never been produced — pages CLAUDE.md cites as readable don't exist
- Kind: broken · Effort: M · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/SecondBrain/wiki
- Impact: The dashboard is truthful here, but the compile pipeline for three pages the retrieval guidance depends on has never delivered output; sessions following Rule 15 hit dead references.
- Evidence: Home tab Wiki Compile card shows status 'missing / Compiled missing' for safair-holdings.md, safair-operations.md, safair-lease-finance.md (Mac LaunchAgent, Tue 12:40). Verified: find SecondBrain/wiki -iname 'safair-holdings*' etc. → no matches anywhere (entities/, synthesis/, projects/ checked). Yet CLAUDE.md Rule 15 step 2 lists all three as known auto-compiled pages to read first.
- Suggested action: Check the Mac LaunchAgent for the 12:40 compile lane (logs/mac-tasks) and either fix the compiler or remove the three pages from CLAUDE.md Rule 15 and the manifest.
- Verifier: Independently reproduced every element: (1) find over /Users/elmar/PKA/SecondBrain/wiki for safair-holdings/safair-operations/safair-lease* returns nothing; whole-repo find matches only unrelated email sources in data/sources/emails. (2) Vault/config/wiki-auto-compile.json active_entries lists all three pages enabled=true, runner com.elmar.entity-compile, Tuesday 12:40 SAST Mac LaunchAgent; comp
9. get_active_agents queries a table (agent_events) that does not exist in vault.db; the exception is swallowed, so org-chart live-agent status is permanently dead
- Kind: broken · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (676-699 (get_active_agents), 3483-3494 (consumer))
- Impact: The 'agent is Active via hook events' feature silently never fires — the org chart's status column conveys false information (everything looks idle), and the broken dependency is invisible because the error is swallowed.
- Evidence: sqlite3 'file:/Users/elmar/PKA/Vault/vault.db?mode=ro' '.tables' lists: activity_log, edges, files, relation_types, search_fts, tag_assignments, tags, task_runs, v_, vitals_log — no agent_events. 'SELECT ts FROM agent_events LIMIT 3' → 'Error: no such table: agent_events'. get_active_agents wraps the query in try/except Exception: return [] (line 698-699), so active_agent_names at line 3484 is always empty and every agent in the System-tab org chart renders 'On Shelf' regardless of real activity.
- Suggested action: Either create/restore the agent_events ingestion or delete get_active_agents and the active_agent_names override; at minimum log the sqlite error instead of returning [] silently.
- Verifier: Independently reproduced. (1) Read dashboard.py:676-699 — get_active_agents queries agent_events with try/except Exception: return []; consumer at 3483-3493 builds active_agent_names for the org-chart 'Active' override. (2) Read-only sqlite on /Users/elmar/PKA/Vault/vault.db: 'SELECT COUNT(*) FROM agent_events' → 'no such table: agent_events'; full table list contains no agent table. (3) Executed
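Sketch for finding 9: the "at minimum" option from the suggested action, where the sqlite error is logged and surfaced instead of being turned into an empty list. The function name, logger name and return shape are hypothetical; the table and column names (agent_events, ts) come from the evidence.

```python
import logging
import sqlite3

log = logging.getLogger("dashboard")  # assumption: logger name is illustrative


def query_agent_events(conn: sqlite3.Connection, limit: int = 50) -> dict:
    """Fetch recent agent event timestamps, reporting query failures
    instead of silently returning an empty result."""
    try:
        rows = conn.execute(
            "SELECT ts FROM agent_events ORDER BY ts DESC LIMIT ?", (limit,)
        ).fetchall()
    except sqlite3.OperationalError as exc:
        # A missing table raises OperationalError ("no such table: ...").
        log.warning("agent_events query failed: %s", exc)
        return {"ok": False, "error": str(exc), "rows": []}
    return {"ok": True, "error": None, "rows": rows}
```

With an explicit ok flag the org chart could render "status unavailable" rather than marking every agent 'On Shelf'.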
10. Baked session count is inflated by claude-mem observer transcripts and swings wildly (34→126→148 in 35 min)
- Kind: broken · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/dashboard.py (743-830 (get_sessions), 1350-1367 (get_hero_metrics))
- Impact: Even before the hydration bug zeroes it, the 'Open Sessions' metric and the header session cards double-count memory-pipeline artifacts as work sessions, so the number never reflects reality.
- Evidence: get_sessions() counts every *.jsonl under ~/.claude/projects modified <4h as an open session. In the 22:58 bake, 67 of 124 header war-cards are 'Claude Mem Observer Sessions · -Users-elmar--claude-mem-observer-sessions' (grep -o 'class="war-project">...' dash3.html | sort | uniq -c). Baked Open Sessions value observed at 34 (22:23), 126 (22:50), 148 (22:58) — driven by background extractors touching transcripts, not by real session changes.
- Suggested action: Exclude the claude-mem observer project dir (and other non-interactive transcript producers) from get_sessions() detection, or key sessions on the registry plus a tight transcript filter.
- Verifier: Read dashboard.py:742-818 — get_sessions() counts every *.jsonl under ~/.claude/projects with mtime<240min as an open session, with no exclusion for the claude-mem observer project; get_hero_metrics() (1350-1367) sums them all into total_open. Reproduced live via GET http://localhost:8787/dashboard.html: baked hero shows Open Sessions=46, and 15 of the 46 war-cards are 'Claude Mem Observer Session
11. Dashboards listing and serving use two diverged path resolvers — newest Exco Dashboard (9 June) is served but invisible in the command centre
- Kind: broken · Effort: S · Dimension: dead-dup
- Where:
/Users/elmar/PKA/dashboard.py (1561-1575 (and serve_dashboard.py:444-490))
- Impact: The latest Exco dashboard has been missing from the command centre app list for 36+ hours; Elmar sees the stale 2 June version listed instead. Any output written only to the repo dir (not mirrored to cloud) silently disappears from the UI.
- Evidence: dashboard.py get_dashboards() lists from ONE base via _primary_output_base('dashboards') (cloud GDrive PKA-Outputs once non-empty), while serve_dashboard.py _output_bases/_output_dir_for serves per-file cloud-first WITH repo fallback. 'Exco Dashboard - 9 June 2026.html' exists in repo dashboards/ (mtime 2026-06-09 10:01:35) but NOT in '/Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/dashboards' (verified directory listing diff). Generated dashboard.html (regenerated 2026-06-10 22:53) contains '2 June 2026' 3 times and '9 June 2026' 0 times. Live GET http://localhost:8787/dashboards/Exco%20Dashboard%20-%209%20June%202026.html returned 200 (72,948 bytes) — the file serves fine, it just never appears in the Apps listing.
- Suggested action: Make get_dashboards() enumerate the union of cloud + repo bases (reuse the serve-side _output_bases search order, freshest wins per filename), or guarantee every dashboard export also lands in PKA-Outputs.
- Verifier: Independently reproduced every claim. (1) Read dashboard.py:1561-1575: _primary_output_base() returns ONE base — cloud GDrive PKA-Outputs once non-empty, else repo — and get_dashboards() (line 1578-1684) globs *.html only from that single base. Read serve_dashboard.py:444-490: _output_bases/_output_dir_for serve per-file across cloud+repo (+OneDrive for findash), picking freshest mtime. The two re
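Sketch for finding 11: listing the union of several output bases with the freshest copy winning per filename. The function name is hypothetical and the caller is assumed to pass the same bases, in the same order, that the serving side searches; this does not reproduce _output_bases itself.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def list_dashboards(bases: Iterable[Path]) -> List[Path]:
    """Return one path per *.html filename across all bases, choosing the
    copy with the newest mtime, sorted newest first."""
    freshest: Dict[str, Path] = {}
    for base in bases:
        if not base.is_dir():
            continue  # a missing base (e.g. cloud drive offline) is skipped
        for candidate in base.glob("*.html"):
            current = freshest.get(candidate.name)
            if current is None or candidate.stat().st_mtime > current.stat().st_mtime:
                freshest[candidate.name] = candidate
    return sorted(freshest.values(), key=lambda p: p.stat().st_mtime, reverse=True)
```

Under this approach a file that exists only in the repo dashboards/ directory, like the 9 June Exco dashboard, would still be listed.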
12. Open dashboard tab drives a full server regeneration pipeline every 30s, continuously
- Kind: broken · Effort: S · Dimension: performance
- Where:
/Users/elmar/PKA/dashboard.html (43325 (generated; source dashboard.py ~4253 area), serve_dashboard.py:409-413)
- Impact: The Mac burns 5.6-8.7s of Python work every 30 seconds around the clock while a tab is open (battery/CPU), and the page re-downloads 2.5MB each cycle. The 'fallback' poll runs even when the WS live channel is healthy, making the WS design pointless.
- Evidence: dashboard.html:43325 has let fallbackTimer = setInterval(doRefresh, 30000); which is never cleared even when the WebSocket is connected (ws.onopen at 43303 only updates the status dot). doRefresh() (43417) fetches '/dashboard.html?t='+Date.now(), and serve_dashboard.py:411 calls regenerate() on every such GET with only a 10s cooldown (REGEN_COOLDOWN=10, line 250), so 30s polls always fall outside the cooldown and each one triggers a rebuild. Verified live with zero requests from me: dashboard.html mtime advanced 22:55:05 -> 22:55:42 -> 22:56:05 -> 22:56:35 (~every 30s). Each regen measured at 5.59s and 8.67s wall time. That is ~19-29% CPU duty cycle 24/7 plus a 2,493,769-byte uncompressed transfer per poll (~5 MB/min, ~7.2 GB/day per open tab).
- Suggested action: Clear fallbackTimer when ws.onopen fires (re-arm on ws.onclose), and/or have doRefresh hit a lightweight delta endpoint instead of the full page (see /api/sessions finding).
- Verifier: Independently reproduced every element. (1) Code: dashboard.py:4523 (source of generated dashboard.html) has let fallbackTimer = setInterval(doRefresh, 30000); and rg confirms no clearInterval anywhere in dashboard.py or dashboard.html; ws.onopen (dashboard.py:4371-4377) only recolors the status dot/label, so the 30s poll runs even with a healthy WebSocket. doRefresh (4476ff) fetches '/dashboard
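Worked arithmetic for finding 12: how the duty-cycle and transfer figures in the evidence follow from the measured values. All inputs are the numbers quoted above; decimal megabytes and gigabytes are assumed.

```python
POLL_INTERVAL_S = 30                 # fallback poll period
REGEN_WALL_S = (5.59, 8.67)          # measured regen wall times
PAGE_BYTES = 2_493_769               # uncompressed dashboard.html size

# Fraction of each 30s window spent regenerating.
duty_cycle = [t / POLL_INTERVAL_S for t in REGEN_WALL_S]

polls_per_min = 60 // POLL_INTERVAL_S
polls_per_day = 24 * 3600 // POLL_INTERVAL_S

bytes_per_min = PAGE_BYTES * polls_per_min
bytes_per_day = PAGE_BYTES * polls_per_day

print([f"{d:.1%}" for d in duty_cycle])        # ['18.6%', '28.9%']
print(f"{bytes_per_min / 1e6:.2f} MB/min")     # 4.99 MB/min
print(f"{bytes_per_day / 1e9:.2f} GB/day")     # 7.18 GB/day
```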
13. GET /dashboard.html synchronously runs index.py + generate_context.py + full rebuild in the request path, with no staleness check
- Kind: broken · Effort: S · Dimension: performance
- Where:
/Users/elmar/PKA/serve_dashboard.py (355-381, 409-413; dashboard.py:4790-4815)
- Impact: First page load (or any load >10s after the last) stalls 5.6-8.7s before HTML arrives; the requester pays for a full vault re-index they didn't ask for. If index.py fails the whole page build fails (required=True).
- Evidence: serve_dashboard.py regenerate() (355-381) subprocess-runs dashboard.py (timeout 30s) inside the request; dashboard.py main() (4810-4815) first runs index.py (required=True, a vault.db WRITER) and generate_context.py as subprocesses on every invocation. There is no change/staleness detection anywhere — only the 10s time cooldown — so it rebuilds even when nothing changed. Measured: triggering GET blocks 5.587s and 8.667s; profiled read-only data-gather + HTML build alone is 1.52s (scratch profile_dash.py), so ~4.1-7.1s of every regen is the index.py/generate_context.py subprocess chain.
- Suggested action: Move regeneration to a background thread (serve current file immediately, regenerate behind it like _maybe_scan_token_dashboard already does), and skip regen when no source mtime/DB max(id) changed. Decouple index.py from page rendering (scheduled indexer).
- Verifier: Read serve_dashboard.py:355-381 (regenerate() subprocess-runs dashboard.py synchronously, gated only by REGEN_COOLDOWN=10 at line 250, no staleness/mtime check) and :403-413 (both / and /dashboard.html call regenerate() in the request path). Read dashboard.py:4790-4816: main() unconditionally runs index.py (required=True, raises on failure) and generate_context.py before the build; index.py:432-51
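Sketch for finding 13: a staleness-gated background rebuild, so a request returns immediately and a rebuild starts only when an input changed. The class and its methods are hypothetical and are not the existing regenerate() or _maybe_scan_token_dashboard code; using source-file mtimes as the fingerprint is one option, and a DB max(id) would slot into the same place.

```python
import threading
from pathlib import Path
from typing import Callable, Iterable, Optional, Tuple


class BackgroundRegenerator:
    """Run build() in a background thread, but only when the source
    fingerprint changed since the last rebuild that was started."""

    def __init__(self, sources: Iterable[Path], build: Callable[[], None]) -> None:
        self._sources = list(sources)
        self._build = build
        self._last_fingerprint: Optional[Tuple[int, ...]] = None
        self._worker: Optional[threading.Thread] = None
        self._lock = threading.Lock()

    def _fingerprint(self) -> Tuple[int, ...]:
        # mtime in nanoseconds per source; 0 stands in for a missing file.
        return tuple(p.stat().st_mtime_ns if p.exists() else 0 for p in self._sources)

    def request(self) -> bool:
        """Called from the request path. Never blocks on the build.
        Returns True if a rebuild was started."""
        with self._lock:
            running = self._worker is not None and self._worker.is_alive()
            fingerprint = self._fingerprint()
            if running or fingerprint == self._last_fingerprint:
                return False
            self._last_fingerprint = fingerprint
            self._worker = threading.Thread(target=self._build, daemon=True)
            self._worker.start()
            return True

    def wait(self, timeout: Optional[float] = None) -> None:
        """Block until the current rebuild (if any) finishes; useful in tests."""
        worker = self._worker
        if worker is not None:
            worker.join(timeout)
```

The handler would call request() and then serve whatever dashboard.html is currently on disk; a change that lands while a build is running is picked up by the next request because the fingerprint is only recorded when a build starts.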
14. Raw Node.js stack trace dumped into the Home tab QMD card (QMD integration broken)
- Kind: broken · Effort: S · Dimension: ux-live
- Where:
/Users/elmar/PKA/dashboard.py (965-999 (qmd status capture))
- Impact: A retrieval layer is silently broken (QMD evals score 0.000) and the dashboard surfaces it as an unreadable raw stack trace instead of a one-line status + fix hint — looks broken and tells the reader nothing actionable.
- Evidence: Home tab QMD card renders verbatim: "QMD · unavailable — qmd --index pka status failed (1): node:internal/modules/cjs/loader:1939 ... Error: The module '/Users/elmar/.bun/install/global/node_modules/better-sqlite3/build/Release/better_sqlite3.node' was compiled against a different Node.js version using NODE_MODULE_VERSION 141. This version of Node.js requires NODE_MODULE_VERSION 127..." (extracted from live DOM; visible as the dense grey text wall in .scratch/wf-review/ux/08-home-mid.png). Consistent with the Evals tab showing QMD P@5 0.000 / MRR 0.000.
- Suggested action: Rebuild better-sqlite3 for the current Node (or pin the node used by the qmd wrapper); in dashboard.py truncate the error to the first meaningful line and show a short 'QMD offline — node module version mismatch, run npm rebuild' message.
- Verifier: Read dashboard.py:963-1002: get_qmd_health appends full untruncated stderr to the error string on non-zero exit. GET http://localhost:8787/dashboard.html (200, 2.55MB) — confirmed the QMD tile inside the Home panel (div id="p-home" at line 2461; tile at ~line 2645) renders 'QMD · unavailable' followed by the verbatim multi-line Node stack trace: node:internal/modules/cjs/loader:1939, better_sqlite
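Sketch for finding 14: reducing a multi-line CLI failure to one short line before it reaches the card. The function name and the "first line containing Error:" heuristic are assumptions; get_qmd_health may need a different rule for other failure modes.

```python
def summarize_cli_error(stderr: str, max_len: int = 160) -> str:
    """Return the first line that looks like the actual error message,
    falling back to the first non-empty line, truncated to max_len."""
    lines = [line.strip() for line in stderr.splitlines() if line.strip()]
    for line in lines:
        if "Error:" in line:
            return line[:max_len]
    return (lines[0] if lines else "unknown error")[:max_len]
```

Applied to the trace quoted above, this would keep the "Error: The module ... was compiled against a different Node.js version" line and drop the loader frames; the full stderr can still go to a log file.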
15. MEMORY.md overflow health check measures the wrong metric (200 lines) on the wrong file — the actually-overflowing auto-memory index (38.4KB > 24.4KB, truncated at load) is invisible and the card shows green
- Kind: fragile · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (250-255, 2684-2696)
- Impact: Memory entries silently drop out of every session's loaded index while the health dashboard says the memory system is healthy — exactly the class of silent data loss the card exists to catch.
- Evidence: Lines 250-255 check only MEMORY_DIR/'MEMORY.md' (= Vault/memory/MEMORY.md) and flag overflow only when len(lines) > 200. Measured: Vault/memory/MEMORY.md = 115 lines / 14,605 bytes → card renders 'MEMORY.md within 200-line limit — OK'. Meanwhile the live session-loaded index Vault/memory/auto/lucienne/MEMORY.md is 38.4KB and Claude Code's own loader warns 'MEMORY.md is 38.4KB (limit: 24.4KB) — Only part of it was loaded' (observed in this session's context injection). The dashboard's Memory Health section reports green while the real boot-loaded index is being truncated, and Claude's limit is KB-based, not line-based.
- Suggested action: Check byte size against ~24KB (not 200 lines) and include the auto-memory indexes under Vault/memory/auto/ (for example Vault/memory/auto/lucienne/MEMORY.md) in the overflow check.
- Verifier: Confirmed all claimed evidence. dashboard.py:38 sets MEMORY_DIR=Vault/memory; lines 250-255 check only Vault/memory/MEMORY.md with a lines>200 threshold; lines 2684-2696 render green 'MEMORY.md within 200-line limit' when not tripped. Measured: Vault/memory/MEMORY.md = 115 lines/14,605 bytes (passes); Vault/memory/auto/lucienne/MEMORY.md = 192 lines/41,784 bytes (~40.8KB, grown past the claimed 38
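Sketch for finding 15: a byte-based overflow check that also covers the auto-memory indexes. The function name is hypothetical, the one-directory-per-agent layout under auto/ is inferred from the single path in the evidence, and the 24,400-byte limit is an assumption derived from the loader's "24.4KB" warning; the exact threshold the loader uses is not confirmed here.

```python
from pathlib import Path
from typing import List, Tuple

# Assumption: 24.4KB read as 24,400 bytes; adjust if the loader uses KiB.
MEMORY_INDEX_LIMIT_BYTES = 24_400


def oversized_memory_indexes(
    memory_dir: Path, limit: int = MEMORY_INDEX_LIMIT_BYTES
) -> List[Tuple[Path, int]]:
    """Return (path, size_in_bytes) for every MEMORY.md index above limit,
    checking the top-level index and each auto/<name>/MEMORY.md."""
    candidates = [memory_dir / "MEMORY.md"]
    auto_dir = memory_dir / "auto"
    if auto_dir.is_dir():
        candidates.extend(sorted(auto_dir.glob("*/MEMORY.md")))
    return [
        (path, path.stat().st_size)
        for path in candidates
        if path.is_file() and path.stat().st_size > limit
    ]
```

An empty result would mean green; any entry would name the file and its size so the card can say which index is being truncated.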
16. Generator-emitted 30s polling loop forces a full dashboard.py regeneration (index.py reindex + qmd subprocess + transcript scan) for every open tab
- Kind: fragile · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (4523 (setInterval(doRefresh,30000)), 4479 (fetch /dashboard.html?t=), 4810-4811 (main runs index.py))
- Impact: One open browser tab causes a full DB reindex + subprocess fan-out roughly every 30s indefinitely (2.4MB transfer per poll); if regen ever exceeds the server's 30s timeout the page silently serves stale data while the machine keeps burning CPU. WS 'sessions.updated' events trigger the same full-document swap on top.
- Evidence: Emitted JS: let fallbackTimer = setInterval(doRefresh, 30000); where doRefresh fetches '/dashboard.html?t='+Date.now(). serve_dashboard.py:355-372 regenerates by running python dashboard.py on request with REGEN_COOLDOWN = 10s (serve_dashboard.py:250) and subprocess timeout=30. dashboard.py main() (4810-4816) runs index.py (full vault reindex) plus generate_context.py, then get_qmd_health spawns the qmd CLI (982-988), get_reports re-reads every report file, and get_sessions rescans all ~/.claude/projects transcripts. Observed live: served timestamp advanced 22:43 → 22:44 between two requests one minute apart, confirming per-request regeneration. Each poll also pulls the full 2.48MB document (measured size 2,484,051 bytes) and replaces .shell innerHTML.
- Suggested action: Drop or drastically lengthen the fallback poll (WS already exists), and/or have doRefresh hit a cheap freshness endpoint instead of the full document; decouple index.py from every regeneration.
- Verifier: Independently confirmed every element. Code: dashboard.py:4523 emits setInterval(doRefresh,30000); doRefresh (4476-4519) fetches /dashboard.html?t=Date.now() and replaces .shell innerHTML with the full document. serve_dashboard.py /dashboard.html route (409-411) calls regenerate() (355-381), which runs python dashboard.py as a subprocess with REGEN_COOLDOWN=10 (line 250) and timeout=30; on failu
17. Memory tab '378 integrity issue(s)' banner: all 305 'index drift' rows are reports intentionally migrated to cloud PKA-Outputs — files exist and serve fine
- Kind: fragile · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/dashboard.py (n/a (integrity check section feeding 'Index drift (DB row but file missing on disk)'))
- Impact: A red 378-issue alarm is 80% false positives caused by the planned outputs-to-cloud migration, drowning the ~73 genuinely broken wiki links and training Elmar to ignore the integrity banner.
- Evidence: Dashboard shows 'Index drift (DB row but file missing on disk): 305' inside a red '⚠ 378 integrity issue(s)' banner. Verified via read-only query: exactly 305 files-table rows missing under /Users/elmar/PKA, ALL with path prefix reports/, and ALL 305 exist in '/Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/reports'. Sampled URL GET /reports/skill-audit-2026-06-08.html → 200 (listed as 'missing' in the drilldown).
- Suggested action: Teach the integrity check (and index.py) to resolve reports/dashboards paths via pka_paths.outputs_dir() before declaring drift.
- Verifier: Fully reproduced. dashboard.py:264-266 flags a files-table row as 'index drift' if the path doesn't exist locally, with no awareness of the planned outputs-to-cloud migration (docs/plans/2026-06-05-outputs-to-cloud-storage-split.md). Read-only query on Vault/vault.db: 308 of 1,932 pka/personal rows missing locally, 100% under reports/, and all 308/308 present in GoogleDrive PKA-Outputs/reports. Re
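- Sketch: one way the drift check could resolve migrated paths before flagging them. This is a minimal illustration, not the project's code: resolve_indexed_path, is_index_drift and CLOUD_PREFIXES are hypothetical names, and outputs_root stands in for whatever pka_paths.outputs_dir() returns (its real signature was not inspected here).

```python
from pathlib import Path

# Assumption: these prefixes are the ones moved to the cloud outputs folder.
CLOUD_PREFIXES = ("reports/", "dashboards/")


def resolve_indexed_path(rel_path: str, repo_root: Path, outputs_root: Path) -> Path:
    """Map a files-table path to the location where the file should exist."""
    if rel_path.startswith(CLOUD_PREFIXES):
        return outputs_root / rel_path
    return repo_root / rel_path


def is_index_drift(rel_path: str, repo_root: Path, outputs_root: Path) -> bool:
    """True only when the file is missing from its resolved location."""
    return not resolve_indexed_path(rel_path, repo_root, outputs_root).exists()
```

With this shape, a row such as reports/skill-audit-2026-06-08.html would be checked against the cloud folder, so only rows missing from their expected home would count toward the banner.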
18. Spend tile shows '$35,687.48' lifetime estimated cost with no 'estimated' label or time period, and token total silently omits 647M cache-creation tokens
- Kind: fragile · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/dashboard.py (hydrateSpendTile JS in generated page; API serve_dashboard.py /api/token-dashboard/overview)
- Impact: Reads as 'we spent $35.7k' when it is an all-time API-equivalent estimate for mostly-subscription usage; the token figure also under-reports by ~647M cache-write tokens.
- Evidence: GET /api/token-dashboard/overview → {"cost_usd":35687.4756, "input_tokens":31141344, "output_tokens":59210240, "cache_read_tokens":16403776759, "cache_create_1h_tokens":527731281, "cache_create_5m_tokens":119778070, ...}. The tile JS renders '$35,687.48' + input+output+cache_read only (excludes both cache_create fields) under the heading 'Spend & Tokens' with no period or 'estimated' qualifier. Usage runs largely on Claude subscription (per house rules, cost-shaped API fields ≠ real charges).
- Suggested action: Label the figure 'est. API-equivalent (all time)' and include cache-creation tokens (or show a 30-day window).
- Verifier: Independently reproduced every element. (1) GET http://localhost:8787/api/token-dashboard/overview returned cost_usd=35989.8041, input_tokens=43647306, output_tokens=62199635, cache_read_tokens=16829394521, cache_create_1h_tokens=542186822, cache_create_5m_tokens=147073157 — slightly higher than the reviewer's figures because this is an accumulating all-time metric (the tile passes no since/until,
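- Sketch: the arithmetic behind the suggested fix, written in Python for clarity even though the tile itself is rendered by JS. The field names come from the /api/token-dashboard/overview payload quoted in the evidence; total_tokens and spend_label are hypothetical helper names.

```python
TOKEN_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_read_tokens",
    "cache_create_1h_tokens",  # omitted by the current tile
    "cache_create_5m_tokens",  # omitted by the current tile
)


def total_tokens(overview: dict) -> int:
    """Sum every token class, including cache-creation tokens."""
    return sum(int(overview.get(field, 0)) for field in TOKEN_FIELDS)


def spend_label(overview: dict) -> str:
    """Label the cost as an all-time estimate rather than a real charge."""
    return f"est. API-equivalent (all time): ${overview['cost_usd']:,.2f}"
```

Applied to the reviewer's payload, the two cache-creation fields add 647,509,351 tokens to the total.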
19. Activity log rendered with no LIMIT: 3,392 rows = 1.54MB = 62% of the entire page
- Kind: fragile · Effort: S · Dimension: performance
- Where:
/Users/elmar/PKA/dashboard.py (319-332 (get_activity_log), 3272-3300 (render loop))
- Impact: 62% of every 2.5MB page build, transfer, parse and 30s innerHTML re-render is a timeline nobody scrolls 3,392 rows deep; page size and regen time grow unboundedly with activity.
- Evidence: get_activity_log: "SELECT date, actor, category, summary, details FROM activity_log ORDER BY id DESC" has no LIMIT. Render loop emits an .evt div block per row. Byte-share scan (scratch panelshare.py/homeshare.py): id="home-session-breakdown" chunk = 1,538,220 B of the 2,487,184 B page (61.8%); 3,392 .evt rows. sqlite3 'file:Vault/vault.db?mode=ro' "SELECT COUNT(*) FROM activity_log" -> 3392, dates 2026-03-31..2026-06-10, i.e. the page grows ~600KB+/month forever.
- Suggested action: Add LIMIT 200 (or last-30-days) to get_activity_log and a 'view full log' link; instantly cuts the page from 2.49MB to ~1.0MB.
- Verifier: Independently reproduced every element. dashboard.py:319-322 (read directly): SELECT from activity_log with no LIMIT; render loop 3272-3300 emits an .evt div per row; the only filters (lines 3133-3135 'Waiting for', in-loop index/system_maintenance skips) currently remove zero rows. Read-only sqlite query: activity_log now has 3,456 rows (2026-03-31..2026-06-12) — up from the claimed 3,392 on 2026
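- Sketch: a bounded version of the query, assuming the function receives an open sqlite3 connection (the real get_activity_log signature may differ). The column list and ordering are taken from the quoted SQL; the default of 200 follows the suggested action.

```python
import sqlite3


def get_activity_log(conn: sqlite3.Connection, limit: int = 200) -> list:
    """Return only the newest `limit` activity rows, newest first."""
    cur = conn.execute(
        "SELECT date, actor, category, summary, details "
        "FROM activity_log ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()
```

Passing the limit as a bound parameter keeps the page size constant as the table grows; the full history can stay behind a separate 'view full log' endpoint.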
20. Page views cause vault.db writes: index.py (a DB writer) runs every 30s as a side effect of GET
- Kind: fragile · Effort: S · Dimension: performance
- Where:
/Users/elmar/PKA/dashboard.py (4810-4811 (run_step index.py, required=True); serve_dashboard.py:365-372)
- Impact: Constant DB write churn and lock contention driven purely by someone looking at a dashboard; a hung/failed index.py (required=True) takes down page regeneration entirely.
- Evidence: dashboard.py main() runs index.py as a required first step; index.py docstring: 'populates vault.db as a query index' with 21 INSERT/UPDATE/DELETE statements. Because regenerate() is triggered by GET /dashboard.html (serve_dashboard.py:411) and the open tab polls every 30s, vault.db gets a write transaction every ~30s, 24/7 (observed via dashboard.html mtime advancing every ~30s with no external requests — each of those is an index.py run). This GET-with-side-effects also competes with every other vault.db writer (memory extractors, graphify, session tracker) for the write lock.
- Suggested action: Run index.py on its own schedule (launchd/scheduler, e.g. every 10 min) or on file-change events; dashboard.py should only read.
- Verifier: Independently reproduced the full chain on the LIVE server (PID 12110, started 2026-06-11 18:13; serve_dashboard.py is now committed/clean at 296ae842 so file==live for this path, and I verified behaviour empirically anyway). (1) Code path confirmed: serve_dashboard.py:409-411 GET /dashboard.html calls regenerate(); regenerate() (lines 355-379, REGEN_COOLDOWN=10s at line 250) synchronously subproc
21. doRefresh refetches the whole 2.49MB page and innerHTML-swaps ~36,700 elements when a 1ms JSON endpoint already exists
- Kind: simplify · Effort: M · Dimension: performance
- Where:
/Users/elmar/PKA/dashboard.html (43417-43461 (generated; source in dashboard.py generate_html JS block))
- Impact: Every refresh rebuilds the entire DOM (including the 1.5MB activity timeline) to update a handful of session cards — wasted client CPU/memory every 30s, plus lost in-page state (event listeners, partial scroll) papered over by re-hydration code.
- Evidence: doRefresh() fetches '/dashboard.html?t='+Date.now(), DOMParser-parses the full 2,485,879-byte document (43,752 lines, 36,714 open tags counted), replaces the entire .shell innerHTML, then re-binds handlers / re-hydrates tiles / re-sorts reports / restores scroll. It also fires on every WS 'sessions.updated' event (ws_events.py:100-106 broadcasts whenever sessions.json mtime changes). Meanwhile GET /api/sessions measured at 0.001-0.005s (3 runs: 4.9ms, 1.1ms, 1.2ms) returns the same session data as JSON.
- Suggested action: Replace full-shell swap with a targeted update from /api/sessions (the war-card grid is the only thing that changes at 30s cadence); keep full refetch only for manual reload.
- Verifier: Reproduced everything. Read doRefresh at dashboard.html:43977-44018 (line numbers shifted from claimed 43417 because the file was regenerated; content identical to claim): fetches '/dashboard.html?t='+Date.now(), DOMParser-parses the full document, swaps .shell innerHTML, then re-binds handlers, restores tab, re-hydrates session/spend tiles, re-sorts reports, restores scroll. Confirmed triggers: h
22. Entire activity history baked into the DOM: 3,456 timeline nodes / 321,485px of content inside a 520px scroll box
- Kind: simplify · Effort: S · Dimension: ux-live
- Where:
/Users/elmar/PKA/dashboard.py (timeline render feeding .timeline (client insert at 4410-4434))
- Impact: Page weight and refresh cost are dominated by history nobody scrolls 600 screens deep for; every live refresh re-ships and re-parses the whole archive, causing jank on each WS event.
- Evidence: Live DOM measurement: the Home '.timeline' scroll container has clientHeight 520, scrollHeight 321,485 and 3,456 child elements (~1,400 activity entries). Combined with the war-strip duplication this drives the 2.5MB / ~45k-line page that doRefresh re-downloads and re-parses on every sessions.updated event (every ~15s observed).
- Suggested action: Bake only the most recent ~50 activity entries with a 'view full log' link to a separate page/endpoint; this alone should cut the generated HTML by a large fraction.
- Verifier: Independently reproduced every load-bearing number. (1) Live DOM via Playwright on http://localhost:8787/dashboard.html: .timeline has children=3460, scrollHeight=321873, clientHeight=520, document HTML=2,555,293 bytes — matches claimed ~3456 / 321,485 / 520 (delta = a few new activity events since the reviewer measured). (2) Root cause confirmed in code: dashboard.py:319-322 get_activity_log() ru
Severity: LOW
23. War-strip cards render the raw encoded project dirname as 'persona' (e.g. '-Users-elmar--claude-mem-observer-sessions')
- Kind: broken · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (874-892 (_guess_persona), 809 (persona=_guess_persona(project)), 2165 (render))
- Impact: Unreadable garbage labels on the most prominent strip of the Home view for any project without a hardcoded persona mapping.
- Evidence: _guess_persona falls through to return project_name (line 892) with the raw Claude dir name, while the display name is humanised separately (line 807). Live-served dashboard.html war strip contains: 'Claude Mem Observer Sessions · -Users-elmar--claude-mem-observer-sessions'; the persona slot shows the ugly encoded path.
- Suggested action: Fall back to humanise_session_name(project_name) (or empty string) instead of the raw dirname in _guess_persona.
- Verifier: Independently reproduced. (1) Read /Users/elmar/PKA/dashboard.py:874-892: _guess_persona() has hardcoded mappings for pka/cowork/legalmind/crypto/smartmoney and falls through to return project_name at line 892 with the raw, un-humanised Claude project dirname. (2) Line 809 passes the raw project (encoded dir name) to _guess_persona, while line 806 humanises the display name separately via hum
24. Agent-dispatch 'this week' window is 8 days inclusive, not 7
- Kind: broken · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (2358-2377)
- Impact: The promotion-ladder evidence number ('4+/week → evaluate making permanent') is inflated by up to one extra day of dispatches — borderline agents can cross the threshold incorrectly.
- Evidence: week_ago = now_date - timedelta(days=7) and the filter is if entry_date >= week_ago (line 2376), so today plus the 7 previous days = 8 distinct dates are counted as 'dispatches this week'. The card label (line 2437) says 'dispatches this week' and Team/roster.md's promotion rule is '4+ dispatches per week'.
- Suggested action: Use entry_date > week_ago or timedelta(days=6) for a true rolling 7-day window.
- Verifier: Read /Users/elmar/PKA/dashboard.py:2356-2377 myself: week_ago = now_date - timedelta(days=7) (line 2359) and if entry_date >= week_ago (line 2376). Ran the arithmetic in Python: a dispatch dated exactly 7 days ago satisfies the filter, and the inclusive window spans exactly 8 distinct calendar dates (confirmed empirically: printed 8). Confirmed the code path is live: get_agent_dispatch_stats
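- Sketch: the off-by-one made concrete. dispatches_this_week is a hypothetical helper, not the real get_agent_dispatch_stats; the upper bound on today is an added assumption (the original filter has none) so that future-dated entries cannot inflate the count.

```python
from datetime import date, timedelta


def dispatches_this_week(entry_dates, today: date) -> int:
    """Count entries in a true rolling 7-day window ending today."""
    week_ago = today - timedelta(days=7)
    # Strict '>' excludes the date exactly 7 days back, leaving 7 dates.
    return sum(1 for d in entry_dates if week_ago < d <= today)
```

With one dispatch on each of the last ten days, this counts 7, where the inclusive >= comparison would count 8.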
25. Non-integer query params crash to HTTP 500 instead of 400 (multiple endpoints)
- Kind: broken · Effort: S · Dimension: server-correctness
- Where:
/Users/elmar/PKA/serve_dashboard.py (1633, 1685, 2447-2448)
- Impact: A malformed or stale client query (e.g. a saved URL, a fuzzer, or a UI bug passing an empty/garbage value) returns a 500 server error instead of a clean 400. Masks real server faults in logs and gives the dashboard JS an opaque failure instead of a handled error.
- Evidence: int(request.args.get(...)) is called outside any try/except. Live probes: GET /api/token-dashboard/prompts?limit=abc -> HTTP 500; GET /api/token-dashboard/sessions?limit=notanumber -> HTTP 500; GET /api/v1/brain/db?db=vault.db&table=edges&page=abc -> HTTP 500; ...&per_page=xyz -> HTTP 500. Body is the generic Werkzeug 500 page (no traceback leak; debug=False). Compare line 1633 limit = max(1, min(1000, int(request.args.get("limit", 50)))) and lines 2447-2448 page = max(1, int(request.args.get("page", 1))) which run before the try block at 2459.
- Suggested action: Wrap the int() parses in try/except ValueError and return jsonify(error=...) with 400, or coerce with a safe default. Move the brain/db int parses inside the existing try block.
- Verifier: Read serve_dashboard.py myself: line 1633 and 1685 call int(request.args.get("limit", ...)) with no try/except; lines 2447-2448 call int() on page/per_page before the try block at 2459. Reproduced live with GET-only curls: /api/token-dashboard/prompts?limit=abc -> 500, /api/token-dashboard/sessions?limit=notanumber -> 500, /api/v1/brain/db?db=vault.db&table=edges&page=abc -> 500, &per_page=xyz ->
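- Sketch: a framework-free parser the affected routes could share. parse_int_arg is a hypothetical helper; it takes any mapping (request.args would fit) and returns either a clamped value or an error message, leaving the HTTP response to the caller.

```python
from typing import Mapping, Optional, Tuple


def parse_int_arg(
    args: Mapping[str, str], name: str, default: int, lo: int, hi: int
) -> Tuple[Optional[int], Optional[str]]:
    """Return (value, None) on success or (None, message) on bad input."""
    raw = args.get(name)
    if raw is None:
        return default, None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None, f"{name} must be an integer"
    # Clamp rather than reject out-of-range values, as line 1633 does today.
    return max(lo, min(hi, value)), None
```

In a Flask view the caller would check the message and return jsonify(error=message) with status 400 when it is set, so ?limit=abc no longer reaches the generic 500 handler.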
26. System tab 'Skill Audit dashboard' link points to /skill-audit which 404s — route does not exist
27. Reports tab lists 25 node_modules README/CHANGELOG/license files as reports (~5% of the 489 cards)
- Kind: broken · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/dashboard.py (1464-1509 (md rglob in get_reports — no node_modules exclusion))
- Impact: Junk library docs pollute the Reports tab (and its 489 count), each with Share/PDF/Delete buttons, making the report library look untrustworthy.
- Evidence: Generated page contains 25 distinct report cards with data-filename like 'narrowbody-market-2026-05-29/node_modules/util-deprecate/README.md', 'node_modules/pako/CHANGELOG.md', 'node_modules/process-nextick-args/license.md'. Matching 25 *.md files exist under PKA-Outputs/reports/narrowbody-market-2026-05-29/node_modules/. The md listing filters prompts/smoke/hermes-workers but not node_modules; one even appeared in the live access log being served via /md-view.
- Suggested action: Skip any path containing node_modules (and similar vendor dirs) in get_reports rglob; consider deleting the vendored dir from the cloud reports folder.
- Verifier: Reproduced all evidence: (1) Read dashboard.py lines 1464-1478 — md rglob in get_reports filters dotfiles/_deleted/hermes-workers/prompt-files/smoke- but has no node_modules exclusion; reports_dir resolves via _primary_output_base to GDrive PKA-Outputs/reports. (2) Ran the exact claimed repro against the live server: curl http://localhost:8787/dashboard.html | grep data-filename node_modules | sor
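- Sketch: a vendor-directory filter for the markdown listing. iter_markdown_reports and VENDOR_DIRS are hypothetical names; the existing dotfile/_deleted/hermes-workers filters in get_reports are not reproduced here.

```python
from pathlib import Path

# Assumption: node_modules is the only vendor dir seen so far; extend as needed.
VENDOR_DIRS = {"node_modules"}


def iter_markdown_reports(reports_dir: Path):
    """Yield *.md files, skipping anything nested under a vendor directory."""
    for f in sorted(reports_dir.rglob("*.md")):
        rel_parts = f.relative_to(reports_dir).parts
        if VENDOR_DIRS.intersection(rel_parts):
            continue
        yield f
```

Matching on path components rather than a substring avoids accidentally dropping a report whose filename merely contains the text node_modules.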
28. Placeholder-dated stub reports ('…-2099-01-01') from scheduled tasks shown as fresh 2026-06-09 reports
- Kind: broken · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/reports
- Impact: Some scheduled report generator is running with an unexpanded date placeholder, producing near-empty duplicate stubs that the dashboard presents as recent deliverables.
- Evidence: Cloud reports dir contains skill-audit-2099-01-01.html (2.5K stub, title 'Skill Audit — 2099-01-01'), sender-intelligence-2099-01-01.html (1.8K), operator-tune-2099-01-01.md (91 bytes), and dir investment-weekly-2099-01-01/. The Reports tab renders them near the top with date 2026-06-09 and titles carrying the literal '2099-01-01' placeholder, alongside the real skill-audit-2026-06-08.html.
- Suggested action: Find the scheduler task templating '{date}'→'2099-01-01', fix it, and delete the four stub artifacts.
- Verifier: Reproduced both repro steps. (1) ls of the cloud reports dir confirmed skill-audit-2099-01-01.html (2.5K, title 'Skill Audit — 2099-01-01'), sender-intelligence-2099-01-01.html (1.8K), operator-tune-2099-01-01.md (91B), and investment-weekly-2099-01-01/ with 30-byte stub mp3/pdf/mp4 files; all mtime Jun 9 17:49:13 2026. (2) GET http://localhost:8787/dashboard.html (live server) returned 19 occurre
29. Evals tab reports 'GBrain: ok' for a system retired 2026-05-01 — label defaults to 'ok' when the store is absent from the eval run
- Kind: broken · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/dashboard.py (2897-2901)
- Impact: A green health claim is fabricated for a component that does not exist and was never tested — the default-to-ok pattern will mask real failures if a gbrain-like store ever returns.
- Evidence: Code: gb = retrieval_stores.get('gbrain') ... gb = gb or {}; gb_label = 'skipped' if gb.get('skipped') else ('error' if gb.get('error') else 'ok') — an empty dict yields 'ok'. Latest eval JSON (GET /eval-results/2026-06-10T103030Z.json) has stores: ['vault','qmd'] only — no gbrain key. Page renders 'GBrain ok' in the Retrieval Stack card. GBrain was retired 2026-05-01 per CLAUDE.md Rule 15.
- Suggested action: Default gb_label to '—'/'retired' when the store is missing; only show ok on a real positive result.
- Verifier: Reproduced end-to-end. (1) Read dashboard.py:2880-2901 and 3013-3022: on the has_retrieval branch, gb = retrieval_stores.get('gbrain') -> None -> gb or {} -> gb_label='ok', rendered as the GBrain row of the Retrieval Stack card (line 3019). (2) GET /eval-results/2026-06-10T103030Z.json: retrieval.stores = ['vault','qmd'], no gbrain store — but the JSON has a top-level gbrain {"skipped": true, "rea
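- Sketch: a label function that distinguishes an absent store from a healthy one. store_label is a hypothetical helper and "not run" is an illustrative placeholder (the suggested action mentions a dash or 'retired'); the skipped/error keys are the ones read by the quoted code.

```python
def store_label(stores: dict, name: str) -> str:
    """Report 'ok' only for a store that actually appears in the eval run."""
    entry = stores.get(name)
    if entry is None:
        return "not run"  # absent from the run: make no health claim
    if entry.get("skipped"):
        return "skipped"
    if entry.get("error"):
        return "error"
    return "ok"
```

The key change is checking for None before falling back to an empty dict, so a missing store can no longer reach the 'ok' branch.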
30. QMD retrieval layer is down on this Mac (better-sqlite3 ABI mismatch) — dashboard tile and eval P@5 0.000 confirm it live
- Kind: broken · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/.bun/install/global/node_modules/better-sqlite3/build/Release/better_sqlite3.node
- Impact: Step 5 of the canonical retrieval stack (qmd search BM25) is completely non-functional, and has been failing every eval run; the dashboard reports it accurately but nobody has acted.
- Evidence: Home tab QMD tile (baked from a real qmd --index pka status run): 'Error: The module …better_sqlite3.node was compiled against a different Node.js version using NODE_MODULE_VERSION 141. This version of Node.js requires NODE_MODULE_VERSION 127' (Node v22.22.3). Evals tab shows QMD P@5 0.000 / MRR 0.000 for the 2026-06-10T103030Z run (vault store scores 0.667).
- Suggested action: npm rebuild better-sqlite3 under the Node version bun uses (or reinstall @tobilu/qmd).
- Verifier: Reproduced all three evidence legs myself: (1) ran /Users/elmar/.bun/bin/qmd --index pka status — got the exact ERR_DLOPEN_FAILED error for /Users/elmar/.bun/install/global/node_modules/better-sqlite3/build/Release/better_sqlite3.node (NODE_MODULE_VERSION 141 vs required 127); (2) grep on /Users/elmar/PKA/dashboard.html found the same baked error text (QMD tile); (3) read /Users/elmar/PKA/tests/ev
31. 'Open web terminal' button dead-ends (302 → localhost:7681 connection refused, ttyd not running)
- Kind: broken · Effort: S · Dimension: ux-live
- Where:
/Users/elmar/PKA/serve_dashboard.py (1990-1994 (TTYD_PORT default 7681))
- Impact: A primary Console-tab action opens a browser 'can't connect' error page; the terminal pane below it also can't open sessions while ttyd is down.
- Evidence: Console tab button 'Open web terminal' href=/console/ttyd; GET http://localhost:8787/console/ttyd returns 302 Location: http://localhost:7681, which refuses connection ([Errno 61] Connection refused — tested against the live server). The embedded console iframe itself displays a red 'ttyd offline' badge (screenshot .scratch/wf-review/ux/02-tab-console.png), so the system knows ttyd is down yet still offers the button.
- Suggested action: Either auto-start/supervise ttyd, or have /console/ttyd return a friendly 'terminal offline — start ttyd' page and grey out the button when the offline state (already detected) is true.
- Verifier: Reproduced fully against the live server: GET /console/ttyd returned 302 Location: http://localhost:7681, and a socket connect to 7681 failed with [Errno 61] Connection refused (ttyd installed at /opt/homebrew/bin/ttyd but not running). The redirect route is an unconditional redirect in shared_console/blueprint.py:603-609 (registered at serve_dashboard.py:2782); the cited serve_dashboard.py:1989-1
32. Every activity row shows raw debug metadata 'dispatcher=Lucienne session=unknown ts=...' (1,407 occurrences)
- Kind: broken · Effort: S · Dimension: ux-live
- Where:
http://localhost:8787/dashboard.html (generated by /Users/elmar/PKA/dashboard.py activity render) (n/a (1,407 hits in served HTML))
- Impact: The activity feed reads like debug logs; the session field carries zero information (always 'unknown') while adding visual noise to every single row.
- Evidence: Served dashboard.html contains 'session=unknown' 1,407 times; every visible ACTIVITY row reads e.g. 'Dispatched: Review graphify + workspace + wiki-system-update / dispatcher=Lucienne session=unknown ts=2026-06-12T11:22:27' (screenshot .scratch/wf-review/ux/09-home-bottom.png). The session linkage resolves to 'unknown' on 100% of rows sampled.
- Suggested action: Drop the session=unknown suffix when unresolved (or fix the dispatcher logging to record real session ids); render ts as a humanised time instead of raw ISO key=value.
- Verifier: Reproduced via GET http://localhost:8787/dashboard.html: 1,408 occurrences of 'session=unknown' (claim said 1,407; page regenerates so 1-count drift expected). All occurrences are inside activity rows rendered by dashboard.py:3282, which dumps evt['details'] verbatim; CSS (.evt-detail{font-size:11px;color:var(--text2)}) makes them visible. The 15 most-recent rows at the to
33. Freshest-by-mtime selection for findash trusts cloud-sync mtimes (staged change)
- Kind: fragile · Effort: M · Dimension: server-correctness
- Where:
/Users/elmar/PKA/serve_dashboard.py (467-490)
- Impact: After this change deploys, the dashboard could pick a stale-but-recently-synced findash over the genuinely newest one, silently showing wrong financials.
- Evidence: Staged _output_dir_for() sorts candidates by fpath.stat().st_mtime descending across GoogleDrive PKA-Outputs, in-repo, and OneDrive-Safair, then returns the largest-mtime base. mtime on cloud-synced files reflects local sync time, not content recency: a file that syncs down later (or a partially-downloaded placeholder) gets a newer local mtime regardless of which actually holds the latest numbers. The OneDrive branch (478-485) also skips the relative_to() traversal re-check the loop applies (safe only because filename is hard-pinned to 'findash.html').
- Suggested action: Prefer an explicit recency signal embedded in the file (a generated_at timestamp/version) over filesystem mtime, or restrict to a single authoritative source.
- Verifier: Code facts confirmed by reading serve_dashboard.py:459-490: candidates sorted by st_mtime desc across GDrive PKA-Outputs/in-repo/OneDrive-Safair, largest-mtime base returned; OneDrive branch (478-485) indeed skips the relative_to() traversal check, safe only via the hard-pinned 'findash.html' filename compare. Two corrections to the finding: (1) it is NOT a staged change — git diff --cached is emp
34. Expired manual-registry session entries suppress live auto-detected sessions for the same project
- Kind: fragile · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (850-864)
- Impact: As soon as the manual registry is used again, a single stale registry entry hides all real activity for that project from the dashboard.
- Evidence: Phase 3 merge builds manual_projects from ALL manual entries (line 852: manual_projects = {v["project"].lower() for v in manual.values()}) but only adds manual entries with ago_mins <= 240 to the result (855-857). An auto-detected session whose project matches any manual entry, including one dead for days, is skipped (860-864). Currently inert only because /Users/elmar/.claude/vault/sessions.json has 0 entries (verified: python3 -c "import json; print(len(json.load(open('/Users/elmar/.claude/vault/sessions.json'))['sessions']))" → 0), so the bug re-arms the moment session_tracker.py registers a session again.
- Suggested action: Build manual_projects only from manual entries that made it into result (ago_mins <= 240).
- Verifier: Read dashboard.py:850-864 directly: line 852 builds manual_projects from ALL manual entries with no recency filter; lines 855-857 only add manual entries with ago_mins <= 240 to the result; lines 860-864 skip any auto-detected session whose normalized project name is in manual_projects. Therefore a manual entry older than 240 min is dropped from display but still suppresses live auto-detected sess
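- Sketch: the merge with the suppression set built only from entries that survive the freshness filter. merge_sessions is a hypothetical stand-in for the Phase 3 block, and the record shape (dicts with project and ago_mins) is assumed from the quoted lines.

```python
FRESH_MINUTES = 240  # same threshold as the existing ago_mins check


def merge_sessions(manual: dict, auto: list) -> list:
    """Merge manual and auto-detected sessions; stale manual entries
    neither appear in the result nor hide live auto-detected ones."""
    result = [v for v in manual.values() if v["ago_mins"] <= FRESH_MINUTES]
    # Build the suppression set from displayed entries only.
    manual_projects = {v["project"].lower() for v in result}
    for session in auto:
        if session["project"].lower() not in manual_projects:
            result.append(session)
    return result
```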
35. hydrateSessionBar mixes server counts and client-derived counts for the same row (idle ignored from API)
- Kind: fragile · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (4236-4243)
- Impact: The Home session bar can show arithmetic that doesn't add up the moment server counts and the returned list diverge; currently masked because everything is 0.
- Evidence: Live GET /api/sessions returns counts:{active,idle,stale,closed} (verified: {'active':0,'idle':0,'stale':0,'closed':0}). The emitted JS takes active and stale from counts when present (4238-4241) but ALWAYS computes idle from the sessions array (4242: var idle = sessions.filter(...)) and total from sessions.length (4243). If the API ever returns counts computed over a different set than the sessions list it returns (e.g. counts over all sessions, list truncated/filtered), the bar shows internally inconsistent numbers (active+idle+stale ≠ total).
- Suggested action: Use counts.idle and a counts-derived total when counts is present, falling back to list-derived values only as a unit.
- Verifier: Read dashboard.py:4231-4260 — confirmed active/stale come from counts.* when present but idle is always client-derived from the sessions array (4242) and total from sessions.length (4243); counts.idle never read. Live GET /api/sessions returned {"sessions":[],"counts":{"active":0,"idle":0,"stale":0,"closed":0}} exactly as claimed. Fetched the live-served /dashboard.html and the hydrateSessionBar J
36. get_memories parses 'frontmatter' keys from the entire file body — any body line starting with name:/type:/description: silently overrides the real frontmatter
- Kind: fragile · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (566-583)
- Impact: Memory entries on the Memory tab can show wrong name/type/description whenever a memory file quotes YAML in its body (common in these memory files, which document config snippets), and the type filter buttons then mis-bucket the entry.
- Evidence: The loop iterates content.splitlines() for the whole file with no '---' delimiter handling and no break after the frontmatter block: for line in content.splitlines(): if line.startswith("type:")... so the LAST matching line anywhere in the document wins (assignments overwrite). A proper frontmatter parser already exists in the same file (_parse_frontmatter, lines 1067-1082) and is not used here.
- Suggested action: Reuse _parse_frontmatter(f) instead of the ad-hoc whole-file scan.
- Verifier: Confirmed end-to-end. (1) Read dashboard.py:566-583: get_memories() iterates content.splitlines() over the WHOLE file with no '---' delimiter handling and no break — assignments overwrite, so the last matching 'name:'/'type:'/'description:' line anywhere in the body wins. A correct delimiter-aware parser (_parse_frontmatter) exists at lines 1067-1082 and is unused here. (2) Reproduced the impact w
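- Sketch: what delimiter-aware parsing looks like, for comparison with the whole-file scan. This is a standalone illustration, not the body of the existing _parse_frontmatter (which was not reproduced in this review); it handles only flat key: value pairs and treats an unterminated block as no frontmatter, both of which are assumptions.

```python
def parse_frontmatter(text: str) -> dict:
    """Read key: value pairs only between the opening and closing '---'."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # stop at the closing delimiter; body is never scanned
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter found
```

A memory file whose body quotes a line such as type: project then keeps the type declared in its header.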
37. Hero 'Activity Today' counts raw activity_log rows while the feed below filters categories — numbers can disagree (consistent today by luck)
- Kind: fragile · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (1358-1360 (hero count), 3133-3134 + 3274-3278 (feed filters))
- Impact: The headline KPI and the feed it summarises use different definitions; the count is also dominated by automated extraction noise, diluting its meaning.
- Evidence: get_hero_metrics counts every activity row with date==today (1360). The rendered feed drops rows whose summary starts 'Waiting for' (3134) or 'Incremental index' (3275) and categories 'index'/'system_maintenance' (3277). Verified read-only against vault.db: today 63 rows total and 63 would pass the feed filters (categories today: session_extracted 43, commit 18, skill_invoked 2) — equal today, but any 'index'/'system_maintenance' row reintroduces the mismatch the audit pattern is prone to. Note 43/63 of 'Activity Today' are session_extracted memory-pipeline rows, so the headline number mostly measures the extractor, not Elmar-visible work.
- Suggested action: Compute activity_today from the same filtered list used for the feed (filtered_activity), and consider excluding session_extracted from the headline.
- Verifier: Code confirmed by direct read: dashboard.py:1358-1360 hero counts all rows with date==today unfiltered; feed filters at 3133-3135 (summary startswith 'Waiting for') and 3274-3278 (summary startswith 'Incremental index', categories index/system_maintenance); get_activity_log (319-322) has no LIMIT so both see the same rows. Baked dashboard.html:1083 shows 10 for 'Activity Today' matching read-only
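- Sketch: deriving the headline count from the same predicate the feed uses. The skip lists mirror the filters cited above (3133-3134, 3274-3278); is_feed_row and activity_today are hypothetical names and rows are assumed to be dicts with date, category and summary keys.

```python
SKIP_CATEGORIES = {"index", "system_maintenance"}
SKIP_SUMMARY_PREFIXES = ("Waiting for", "Incremental index")


def is_feed_row(row: dict) -> bool:
    """Single definition of 'visible activity', shared by feed and hero."""
    if row["category"] in SKIP_CATEGORIES:
        return False
    return not row["summary"].startswith(SKIP_SUMMARY_PREFIXES)


def activity_today(rows: list, today: str) -> int:
    """Count today's rows that would also appear in the rendered feed."""
    return sum(1 for r in rows if r["date"] == today and is_feed_row(r))
```

Excluding session_extracted from the headline, as the suggested action proposes, would then be a one-line addition to SKIP_CATEGORIES for the hero only.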
38. Report <title>/<meta> extraction reads only the first 2000 bytes of each report HTML; a late <title>/<meta> silently falls back to the filename
- Kind: fragile · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (1440-1451)
- Impact: Reports tab cards show raw filename stems and wrong 'research' type for any report with a heavier head, degrading sort-by-type and search.
- Evidence: content = f.read_text(encoding="utf-8")[:2000] then regex for <title> and report-* metas. Any report whose <head> exceeds 2KB before the title (common when inline CSS/fonts precede it; this very dashboard puts ~500 lines of CSS before body) gets title=f.stem and type='research'. The except Exception: pass at 1450-1451 also hides read/encoding errors, leaving meta={} with no signal.
- Suggested action: Raise the sniff window (e.g. 16KB) or read until the end of <head>; log instead of pass on decode errors.
- Verifier: Read dashboard.py:1431-1462; code matches exactly: read_text()[:2000], regex for <title>/<meta>, fallback to f.stem and type='research', bare except at 1450-1451. Independently scanned the live reports base (GDrive PKA-Outputs/reports, resolved via _primary_output_base): 19 of 172 HTML reports have <title> at/after byte 2000 (cemair-terms.html at 2502, webwright-v4/saa/home.html at 40644). Confir
39. regenerate() allows a concurrent request to read dashboard.html mid-rewrite (truncated HTML)
- Kind: fragile · Effort: S · Dimension: server-correctness
- Where:
/Users/elmar/PKA/serve_dashboard.py (355-381, 409-413)
- Impact: Under concurrent access right after the cooldown window, a user can be served a truncated/blank dashboard.html (broken page) until the next reload.
- Evidence: index() and dashboard() call regenerate() then immediately Path(DASHBOARD).read_text() (line 412). regenerate() only blocks (under regen_lock, subprocess.run of dashboard.py) for the request that wins the cooldown check; a second request arriving within REGEN_COOLDOWN (10s) hits now - last_regen_attempt <= REGEN_COOLDOWN at line 358 and returns immediately WITHOUT waiting for the in-flight subprocess, then reads dashboard.html at line 412 while dashboard.py is still writing it. dashboard.py:4859 uses OUTPUT.write_text(html, ...) which truncates the file to 0 then streams ~2.5MB, so a concurrent reader can observe a partially written / truncated file.
- Suggested action: Have dashboard.py write to a temp file and os.replace() atomically (atomic rename), and/or have dashboard()/index() serve the last-good HTML rather than reading a file that another request may be rewriting.
- Verifier: Confirmed both mechanisms live. Code: serve_dashboard.py:358 lock-free cooldown check returns immediately for any request arriving while a regen is in flight (last_regen_attempt set at line 363 BEFORE subprocess.run), then dashboard() reads the file at line 412; dashboard.py:4859 uses OUTPUT.write_text (truncate+write, non-atomic). Live repro on the running :8787 server: a trigger GET took 20.31s
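A minimal sketch of the temp-file-plus-rename approach from the suggested action. write_atomic is a hypothetical helper, not something that exists in dashboard.py; it assumes the temp file can live in the same directory as the target so that os.replace stays on one filesystem.

```python
import os
import tempfile
from pathlib import Path


def write_atomic(target: Path, text: str, encoding: str = "utf-8") -> None:
    """Write `text` so readers see the old file or the new one, never a partial one."""
    # Same directory as the target: os.replace is only atomic within a filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding=encoding) as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, target)  # atomic swap of the directory entry
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Under this sketch the generator would call write_atomic(OUTPUT, html) where it currently calls OUTPUT.write_text(html, ...). A request that skips the cooldown and reads dashboard.html during a regen would then get the previous complete page rather than a truncated one.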
40. /api/annotations POST is CSRF-exempt + CORS * and writes files unauthenticated on a 0.0.0.0 bind
- Kind: fragile · Effort: S · Dimension: server-correctness
- Where:
/Users/elmar/PKA/serve_dashboard.py (177-180, 219-232, 523-602)
- Impact: Any website the operator visits, or any host that can reach port 8787, can write arbitrary annotation files and large screenshot blobs to Vault/annotations with no limit on number of batches — a disk-fill / spam DoS vector and an unauthenticated write surface.
- Evidence: _csrf_same_origin_guard() returns None for /api/annotations before any origin check (lines 177-180), and _annotations_cors() sets Access-Control-Allow-Origin: * for it (223-225). post_annotations() writes Vault/annotations/-/batch.json plus base64-decoded PNG crops up to 3MB each / 25MB per batch (542-563). The server binds 0.0.0.0 by default (line 2855). There is no auth token and no per-client rate limit; batch dirs are unbounded (only items-per-batch is capped at 100).
- Suggested action: Restrict /api/annotations to loopback (like the cookie-ingest endpoint), or require the same X-API-Key control; cap total annotations dir size / add a batch rate limit; tighten CORS to the dashboard origin.
- Verifier: Reproduced all five evidence components. Read lines 161-216: line 177-180 returns None for /api/annotations on any non-OPTIONS method, bypassing the same-origin guard before any origin check. Lines 222-225 set Access-Control-Allow-Origin:* for the path. Confirmed LIVE via GET-only/OPTIONS: GET /api/annotations/inbox=HTTP 200; OPTIONS with Origin:https://evil.example.com returned 204 with Access-Co
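One way the loopback restriction in the suggested action might be expressed, as a sketch only. How the cookie-ingest endpoint actually enforces loopback is not shown in this review, so is_loopback is a hypothetical helper; in a Flask handler the argument would typically be request.remote_addr.

```python
import ipaddress


def is_loopback(remote_addr: str) -> bool:
    """True only for loopback peers (127.0.0.0/8, ::1, or IPv4-mapped forms)."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Missing or malformed address: treat as untrusted.
        return False
    # Dual-stack sockets can report IPv4 peers as ::ffff:a.b.c.d.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return addr.is_loopback
```

A peer-address check does not help if the server sits behind a reverse proxy, where the peer is always the proxy. It also does not stop a web page in the operator's own browser from posting to localhost, so the CSRF and CORS tightening in the suggested action is still needed alongside it.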
41. md-view path containment uses startswith without a trailing separator (prefix confusion)
- Kind: fragile · Effort: S · Dimension: server-correctness
- Where:
/Users/elmar/PKA/serve_dashboard.py (1810-1813)
- Impact: If a directory named with a 'docs'/'reports'/'Vault/memory' prefix is ever added, md-view could serve .md files outside the intended root. Latent path-containment weakness.
- Evidence: fpath = (base_dir / rel).resolve(); guard is if not str(fpath).startswith(str(base_dir.resolve())) or fpath.suffix != '.md': abort(404). With base_dir = ROOT/'docs', the string '/Users/elmar/PKA/docs' is a prefix of a sibling like '/Users/elmar/PKA/docs2/secret.md', so a sibling directory whose name starts with 'docs' (or 'reports', or 'Vault/memory') would pass the check. The sibling API at /api/v1/brain/file (lines 2358-2363) correctly uses fpath.is_relative_to(r) to avoid exactly this. Currently not exploitable: no such sibling dir exists (only docs/, reports/, reports_utils.py), and live probes (/md-view?file=../CLAUDE.md, ../serve_dashboard.py, reports/../../CLAUDE.md) all returned 404.
- Suggested action: Use fpath.is_relative_to(base_dir.resolve()) instead of str.startswith(), matching the brain_file_detail guard.
- Verifier: Re-read serve_dashboard.py 1790-1815: lines 1810-1813 match the finding verbatim: fpath=(base_dir/rel).resolve(); guard is if not str(fpath).startswith(str(base_dir.resolve())) or fpath.suffix != '.md': abort(404). Re-read 2348-2363: the sibling /api/v1/brain/file endpoint uses fpath.is_relative_to(r) with an explicit code comment that startswith would allow prefix-confusion attacks (e.g. /Us
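The difference between the two guards can be shown in isolation. This sketch uses PurePosixPath so it touches no real files; the docs2 and docs/plans paths are hypothetical examples, and the real code would still call resolve() before comparing.

```python
from pathlib import PurePosixPath


def contained_by_prefix(fpath: PurePosixPath, base: PurePosixPath) -> bool:
    """String-prefix guard as described in the finding (weak)."""
    return str(fpath).startswith(str(base))


def contained_by_path(fpath: PurePosixPath, base: PurePosixPath) -> bool:
    """Component-wise containment, as in the suggested action (Python 3.9+)."""
    return fpath.is_relative_to(base)


base = PurePosixPath("/Users/elmar/PKA/docs")
sibling = PurePosixPath("/Users/elmar/PKA/docs2/secret.md")    # hypothetical sibling dir
inside = PurePosixPath("/Users/elmar/PKA/docs/plans/note.md")  # hypothetical file under base
```

With these values, contained_by_prefix(sibling, base) is True while contained_by_path(sibling, base) is False; both accept inside. The string guard compares characters, the path guard compares whole components.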
42. Three repo-only reports are invisible in the Reports tab because listing reads cloud dir only
- Kind: fragile · Effort: S · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/dashboard.py (1561-1575 (_primary_output_base returns cloud once non-empty; no merge with repo fallback))
- Impact: A handful of recent local reports silently vanish from the library until someone migrates them to the cloud folder.
- Evidence: Replicating get_reports filters over both dirs: cloud listing 475 files, repo listing 173, repo-only = 3: reports/mc-4861_lucienne_local_model_benchmark.md, reports/pka-memory-reddit-post.md, reports/pka-memory-system-reddit.html. These never appear in the 489 cards (489 = cloud listing + docs/.md with Date:* markers). The serving routes are per-file cloud-first/repo-fallback, so the files would open if linked — they just aren't listed.
- Suggested action: Union repo + cloud files (dedupe by relative path, cloud wins) in get_reports, or migrate the 3 stragglers.
- Verifier: Read dashboard.py:1561-1575 and get_reports (1406-1507): listing uses a single base — cloud dir once non-empty, no merge with repo. Re-ran the set-diff with get_reports' exact filters: cloud 479, repo 175, repo-only = 4 (the 3 claimed files plus portfolio-dashboard-review-2026-06-10.html, 194KB, written AFTER migration — confirms the fragility is ongoing). Grep for all 4 filenames: 0 hits in gener
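A sketch of the union described in the suggested action, keyed by relative path with the cloud copy winning. merged_report_files is a hypothetical helper, and the suffix filter here is a stand-in: get_reports' exact filters are not reproduced in this review.

```python
from pathlib import Path

REPORT_SUFFIXES = {".html", ".md"}  # assumed filter, not get_reports' real one


def merged_report_files(cloud_dir: Path, repo_dir: Path) -> dict[str, Path]:
    """Map relative path -> file, taking the cloud copy when both dirs have it."""
    merged: dict[str, Path] = {}
    # Repo first, cloud second, so cloud entries overwrite repo duplicates.
    for base in (repo_dir, cloud_dir):
        if not base.is_dir():
            continue  # e.g. Drive not mounted: fall back to whatever exists
        for f in base.rglob("*"):
            if f.is_file() and f.suffix in REPORT_SUFFIXES:
                merged[f.relative_to(base).as_posix()] = f
    return merged
```

Listing from the merged map would make repo-only files such as the stragglers named above visible without waiting for a migration, at the cost of one extra directory walk per regen.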
43. Reports listing reads full content of all 625 report files from Google Drive FUSE every regen, and renders all 489 cards (28% of page)
- Kind: fragile · Effort: S · Dimension: performance
- Where:
/Users/elmar/PKA/dashboard.py (1432-1505)
- Impact: Regen time becomes network-bound (28MB of Drive reads every 30s) whenever the Drive cache is cold or syncing; the page permanently carries every report ever written.
- Evidence: f.read_text(encoding='utf-8')[:2000] reads the ENTIRE file then slices — measured 171 HTML files = 23.9MB read per regen (largest single file 8.62MB mc-4223-ua-signoff-report.html) plus 454 .md files = 4.2MB, from /Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/reports (cloud-backed FUSE). Locally cached this is 0.06s; a bounded 4KB read is 0.02s, but on a cold/evicted Drive cache each full read becomes a network download (28MB per regen, every 30s). Output side: p-reports panel = 688,229 B (27.7% of page), 489 report-card divs, avg 1,399 B each (scratch panelshare.py).
- Suggested action: Read only the first 4KB (open(f,'rb').read(4096)) for metadata extraction, and cap the rendered list to the newest ~100 with the rest behind the existing filter/search.
- Verifier: Independently confirmed every element: (1) dashboard.py:1440 does f.read_text()[:2000] (full read then slice) for HTML and :1481 [:3000] for md; (2) _primary_output_base (dashboard.py:1561-1575) resolves via pka_paths.outputs_dir to the GDrive CloudStorage FUSE path /Users/elmar/Library/CloudStorage/GoogleDrive-conrelma@gmail.com/My Drive/PKA-Outputs/reports; (3) ran readcost.py: 172 HTML = 24.0MB
44. Token-dashboard API GETs cost 0.24-1.22s against a 700.9MB DB, and 10 routes each kick a 2GB transcript rescan on a 30s cooldown
- Kind: fragile · Effort: S · Dimension: performance
- Where:
/Users/elmar/PKA/serve_dashboard.py (318-352, 1609-1775)
- Impact: Token tab feels sluggish (~1s per card) and keeps a background scanner thread cycling over a 2GB tree every 30s while open; the 700MB DB grows unboundedly inside ~/.claude.
- Evidence: Measured (3 runs each): /api/token-dashboard/overview = 1.219s / 0.471s / 0.707s; /api/token-dashboard/sessions = 0.361s / 0.271s / 0.240s. ~/.claude/token-dashboard.db = 700.9MB (+4.6MB WAL). Ten routes (lines 1613-1757) each call _maybe_scan_token_dashboard(), which spawns a background scan_dir thread over ~/.claude/projects whenever >30s since the last scan — that tree is 4,940 .jsonl files / 2.03GB (measured via scratch walkcost.py; the stat-walk alone is 0.16s, the scan parses new content on top). The token tab polling these endpoints re-triggers a scan every 30s.
- Suggested action: Raise the scan cooldown to 5 min, add indexes/precomputed rollups for overview_totals, and add a retention/aggregation policy for token-dashboard.db.
- Verifier: Reproduced every measurable element: (1) timed 3 GETs each — overview 0.643/0.415/0.669s, sessions 0.307/0.237/0.245s on the live :8787 server; (2) ~/.claude/token-dashboard.db = 774MB + 6.2MB WAL (grown from the claimed 700.9MB, confirming unbounded growth); (3) read serve_dashboard.py:318-352 — _maybe_scan_token_dashboard spawns a daemon scan thread on a 30s cooldown — and confirmed all ten toke
45. Reports list polluted by placeholder '2099-01-01' artifacts and duplicate cards
- Kind: fragile · Effort: S · Dimension: ux-live
- Where:
http://localhost:8787/dashboard.html (Reports tab; source files in /Users/elmar/PKA/reports/)
- Impact: Obviously-bogus 2099 dates near the top of the list undermine trust in the whole Reports index, and duplicate cards make it unclear which Kapama report is current.
- Evidence: Reports tab (sorted Date ▼) positions 5-8 of 492: 'Skill Audit — 2099-01-01' (data-date 2026-06-09, links /reports/skill-audit-2099-01-01.html), 'Sender intelligence proposal' (/reports/sender-intelligence-2099-01-01.html), 'Operator Tune 2099 01 01' — generator placeholder dates leaked into filenames/titles. Also two near-identical cards 'Kapama Family Stay Options — Interactive Comparison' on 2026-06-10 (/reports/kapama-family-stay-options.html and /reports/Kapama-family-options-for-Nicolette.html). Screenshot .scratch/wf-review/ux/02-tab-reports.png.
- Suggested action: Rename/regenerate the three 2099-01-01 report files with real dates (and fix whatever skill writes the 2099 placeholder); delete or merge the duplicate Kapama report.
- Verifier: Reproduced against the LIVE server (GET-only). curl of http://localhost:8787/dashboard.html shows 19 '2099-01-01' occurrences. Parsed all report-card divs: exactly 492 cards as claimed. Date-sorted top rows contain 'Skill Audit — 2099-01-01' (data-date 2026-06-09, /reports/skill-audit-2099-01-01.html), 'sender-intelligence-2099-01-01.html', and 'operator-tune-2099-01-01.md' at positions 6/7/9 (rev
46. Stuck-looking 'Loading…' on System tab Spend & Tokens tile for ~8s with no skeleton/progress
- Kind: fragile · Effort: S · Dimension: ux-live
- Where:
/Users/elmar/PKA/dashboard.py (4263 (hydrateSpendTile))
- Impact: Tile looks broken on slower hydration, and a context-free '$35,997' headline number invites misreading (looks like a current bill).
- Evidence: On opening the System tab, 'USAGE BY ROLE / Spend & Tokens: Loading…' was still showing after the tab rendered; ~8s later it resolved to '$35,997.49 / 16,953,384,742 tokens'. The figure appears with no period or scope label (lifetime? month?), and the role list beneath shows tasks from 2026-03-31.
- Suggested action: Add the period/scope to the label (e.g. 'lifetime est. across all CLIs') and a spinner/skeleton; cache the value server-side so it bakes with the page.
- Verifier: Confirmed, and the latency is worse than claimed. (1) Code: /Users/elmar/PKA/dashboard.py:4180-4183 bakes the tile with body 'Loading…'; hydrateSpendTile at lines 4264-4287 fetches /api/token-dashboard/overview with NO since/until params and renders only '$cost' + 'N tokens' — no period/scope label, no spinner, no timeout (catch just sets '— data unavailable —'). Served HTML at http://localhost:87
47. Every open tab refetches the full 2.4MB dashboard and triggers a dashboard.py regen subprocess every 30s; all 3,393 activity rows and 489 report cards are baked into one page
- Kind: simplify · Effort: L · Dimension: data-accuracy
- Where:
/Users/elmar/PKA/serve_dashboard.py (248-252 (REGEN_COOLDOWN=10), 409-411; generated JS doRefresh()/setInterval 30000)
- Impact: ~7GB/day transfer per idle open tab plus a continuous regen subprocess churning vault.db and ~/.claude/projects rglob — heavy machinery for a status page, and the main reason the page is 2.5MB.
- Evidence: doRefresh() fetches /dashboard.html?t=… every 30s and innerHTML-swaps the whole shell; each request calls regenerate() (full 4,864-line dashboard.py run, 10s cooldown). Page is 2,554,781 bytes with 3,393 activity entries and 489 report cards baked in; logs/dashboard.stderr.log is 59.7MB; lsof showed multiple persistent Chrome connections; dashboard.html mtime advances every ~30s while tabs are open (observed 22:49:42→22:50:41→22:58:16).
- Suggested action: Serve deltas via the existing WebSocket or small JSON APIs; lengthen the fallback poll; paginate activity feed (e.g. latest 200) and reports grid.
- Verifier: Fully reproduced every element of the finding against both the on-disk code and the LIVE server (behaviour identical on this path). (1) Code: /Users/elmar/PKA/serve_dashboard.py:250-251 sets REGEN_COOLDOWN=10; the /dashboard.html route (lines ~403-412) calls regenerate() on every GET; regenerate() (line ~355) runs subprocess.run([sys.executable, dashboard.py]) synchronously with a 30s timeout, g
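The ~7GB/day figure in the Impact line follows from the measured page size and the 30s poll. A quick check, assuming one idle tab that never misses a poll and ignoring HTTP overhead and any compression:

```python
PAGE_BYTES = 2_554_781           # measured dashboard.html size, from the evidence
POLL_SECONDS = 30                # doRefresh() interval
SECONDS_PER_DAY = 24 * 60 * 60

refreshes_per_day = SECONDS_PER_DAY // POLL_SECONDS
bytes_per_day = PAGE_BYTES * refreshes_per_day
gb_per_day = bytes_per_day / 1e9  # decimal gigabytes

print(f"{refreshes_per_day} refreshes/day, {bytes_per_day:,} bytes, {gb_per_day:.2f} GB/day")
```

This prints 2880 refreshes/day, 7,357,769,280 bytes, 7.36 GB/day, which matches the stated ~7GB per tab.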
48. ~300 lines of dead code from the 9→6 tab restructure still execute or sit unused every regeneration
- Kind: simplify · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (335-348, 447-472, 1370-1389, 1754-1827, 2207-2247, 2465-2512, 3417, 4826, 4832)
- Impact: Wasted work on every regeneration (which happens as often as every ~30s, see polling finding) and a 4,864-line file that's materially harder to review — the skills-panel builder in particular looks live but its output is discarded, inviting someone to 'fix' the wrong layer.
- Evidence: grep over the file shows zero callers for: get_graph_data (335), get_tasks (447), group_tasks_by_owner (1370), build_task_board (2207), build_gbrain_tile (1754; line 3261 hardcodes gbrain_tile_html="" instead). build_skills_panel IS executed every regen (line 3417 skills_panel_html=...) but the result is never inserted — generated dashboard.html contains neither 'filterSkills' nor 'skills-tbody' (verified by grep on the 2.4MB output). get_vault_structure() runs in main (4826, scans ~/.claude/vault) and is passed to generate_html (3114) which never reads it. get_gbrain_health() result is passed but only feeds the empty-string tile. build_evals_panel also has unused vars (last7_leakage/cutoff7, lines 2934-2935).
- Suggested action: Delete the uncalled builders, drop the build_skills_panel call + skills_panel_html, remove get_vault_structure/gbrain plumbing (keep get_gbrain_health only if something still reads the retirement marker).
- Verifier: Independently confirmed every claim. (1) grep over /Users/elmar/PKA/dashboard.py shows ONLY def-sites for get_graph_data (335), get_tasks (447), group_tasks_by_owner (1370), build_task_board (2207), build_gbrain_tile (1754) — no production callers; the sole external reference is tests/test_dashboard_health_cards.py:55 calling build_gbrain_tile, which keeps dead code test-green. (2) Line 3261 hardc
49. Dashboards launcher hides stale exports via a hardcoded filename blocklist that will silently rot
- Kind: simplify · Effort: S · Dimension: gen-correctness
- Where:
/Users/elmar/PKA/dashboard.py (1677-1688)
- Impact: Launcher clutter creeps back with every dated export; the blocklist gives false confidence that duplicates are handled.
- Evidence: hidden_dashboard_files = {"Dashboard.html", "Exco Dashboard - 12 May 2026.html", "Exco Dashboard - 26 May 2026.html"} plus a special-case if f.name == "smart-money.html": continue. Every newly exported dated duplicate (e.g. the next 'Exco Dashboard - 9 Jun 2026.html' dropped into the dashboards dir) reappears as an anonymous grey card until someone edits the generator, while the launcher's curated meta dict (1584-1676) already defines the canonical set.
- Suggested action: Invert the rule: render only files present in the meta dict (plus an explicit 'other files' count/link), or glob-exclude patterns like 'Exco Dashboard - *.html'.
- Verifier: Read /Users/elmar/PKA/dashboard.py:1677-1699 — the cited code exists exactly as claimed: hidden_dashboard_files = {"Dashboard.html", "Exco Dashboard - 12 May 2026.html", "Exco Dashboard - 26 May 2026.html"} (lines 1677-1683), a smart-money.html special case (1687-1688), and a grey fallback meta (color #64748b, group Tools, empty desc) for any filename not in the curated meta dict (1689-1698). Did
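The inverted rule from the suggested action amounts to an allowlist. A minimal sketch, assuming the curated meta dict is keyed by filename; split_launcher_files is a hypothetical helper name.

```python
def split_launcher_files(filenames, curated_meta):
    """Return (cards, other): curated files to render, and everything else."""
    cards, other = [], []
    for name in sorted(filenames):
        # Only files the curated dict knows about become launcher cards.
        (cards if name in curated_meta else other).append(name)
    return cards, other
```

The launcher would render cards as usual and show len(other) behind an 'other files' link, so a new dated export lands in the count instead of appearing as a grey card.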
50. Dead local-task-board chain: get_tasks, group_tasks_by_owner, build_task_board and the mc_bridge import block (~120 lines)
- Kind: simplify · Effort: S · Dimension: dead-dup
- Where:
/Users/elmar/PKA/dashboard.py (447-472, 1370-1389, 2207-2247, 72-81)
- Impact: ~120 lines of dead code in the generator that future sessions read, maintain, and can mistakenly re-wire.
- Evidence: Repo-wide grep for get_tasks/group_tasks_by_owner/build_task_board (--include=*.py) returns ONLY their def lines in dashboard.py — zero call sites (the asana skill's get_tasks is an unrelated method). build_task_board's ticket_map consumer is the try/except import of mc_bridge get_task_ticket_map/get_mc_url at lines 72-81, which are also referenced nowhere else (uses only at the import/stub lines 74/77/80). main() builds the page without any of them.
- Suggested action: Delete get_tasks, group_tasks_by_owner, build_task_board, the mc_bridge import block, and the orphaned tb-/task- CSS rules (see CSS finding).
- Verifier: Independently reproduced. (1) Ran the repro grep myself (rg over *.py, asana excluded): get_tasks, group_tasks_by_owner, build_task_board occur ONLY at their def lines in dashboard.py (447, 1370, 2207); get_task_ticket_map/get_mc_url occur only at the dashboard.py import/stub lines 74/77/80 plus inside mc_bridge.py itself (its own definitions/docstring, not a consumer of dashboard's import). (2) R
51. GBrain (retired 2026-05-01) tile code still present: build_gbrain_tile never called in production, get_gbrain_health result discarded (~95 lines)
- Kind: simplify · Effort: S · Dimension: dead-dup
- Where:
/Users/elmar/PKA/dashboard.py (946-960, 1754-1826, 4832, 4851)
- Impact: Retired-system code (88 lines + param plumbing + a test pinning it) runs a health probe whose result is thrown away on every dashboard regeneration.
- Evidence: generate_html contains the explicit comment '# GBrain is retired, so do not surface a permanent unavailable tile' and hardcodes gbrain_tile_html = ""; the gbrain_data parameter has 0 uses in the function body (verified by counting identifier occurrences). build_gbrain_tile (1754-1826, 73 lines) has no production caller — its only call site is tests/test_dashboard_health_cards.py:55. Yet main() still calls get_gbrain_health() at 4832 and threads gbrain_data=gbrain_data into generate_html at 4851 on every regen. Note: the .gbrain-* CSS classes are NOT fully dead — build_qmd_tile reuses them (21 occurrences), so keep/rename the CSS.
- Suggested action: Delete build_gbrain_tile, get_gbrain_health, the gbrain_data parameter and main() call; update tests/test_dashboard_health_cards.py; optionally rename gbrain- CSS classes to qmd- since only the QMD tile uses them.
- Verifier: Independently reproduced every cited line: get_gbrain_health stub at dashboard.py:946-960 (returns static retirement dict — note: no actual probe runs, so the claimed 'health probe thrown away' impact is overstated; it's pure dead code, not runtime cost); build_gbrain_tile at 1754-1826 with sole caller in tests/test_dashboard_health_cards.py:55 (grep confirmed); generate_html param gbrain_data=Non
52. get_vault_structure() computed on every regen but its generate_html parameter is never used
- Kind: simplify · Effort: S · Dimension: dead-dup
- Where:
/Users/elmar/PKA/dashboard.py (475-488, 4826)
- Impact: Wasted vault filesystem walk on every dashboard regeneration (which the server triggers on page views) plus a misleading parameter implying the vault tree is rendered.
- Evidence: Identifier count of 'vault_structure' inside generate_html's body (def line to def main) is 1 — the signature only; 0 uses in the body. main() line 4826 still calls get_vault_structure() (walks the ~/.claude/vault directory tree) and passes the result positionally.
- Suggested action: Remove the vault_structure parameter, the main() call, and get_vault_structure() (lines 475-488).
- Verifier: Independently confirmed all claimed evidence: (1) read dashboard.py:475-486 — get_vault_structure() walks ~/.claude/vault; (2) read dashboard.py:4826/4845 — main() computes vault_structure and passes it positionally to generate_html; (3) read generate_html signature at dashboard.py:3108-3127 — vault_structure is param #6; (4) grep over the whole file shows exactly 4 occurrences (function def, sign
53. Dead closeSession() JS emitted into every generated page; its target POST /api/sessions/close has no remaining caller
- Kind: simplify · Effort: S · Dimension: dead-dup
- Where:
/Users/elmar/PKA/dashboard.py (4576-4582 (and serve_dashboard.py:1925-1957))
- Impact: Dead UI affordance shipped in every 2.5MB page render, and a mutating tmux-killing endpoint kept alive with no UI path to it — confusing for the next person hardening the server.
- Evidence: Scanning the generated dashboard.html: 26 JS functions defined, only 'closeSession' and 'cleanupSessions' have zero call sites in markup/JS. cleanupSessions IS conditionally wired (onclick emitted by build_session_cards line 2177 when stale sessions exist), but closeSession has no emitter anywhere in dashboard.py — grep finds only the function definition at 4576. Its fetch('/api/sessions/close') is therefore the only non-test reference to serve_dashboard.py's api_session_close (1925-1957); other repo hits are tests/test_csrf_guard.py and docs.
- Suggested action: Delete the closeSession JS from generate_html; either delete /api/sessions/close + its CSRF test or re-wire a close button on the session cards if the capability is still wanted.
- Verifier: Reproduced: dashboard.py has exactly one 'closeSession' hit (line 4576, the definition, read lines 4565-4589); generated dashboard.html contains exactly one occurrence (the definition) and zero call sites; cleanupSessions by contrast IS wired via onclick emitted at dashboard.py:2177. serve_dashboard.py:1925-1955 defines POST /api/sessions/close whose only non-doc reference is the dead JS fetch. Tw
54. dispatch_tracker import + stub is permanently on the fallback path — module doesn't exist and get_dispatch_stats is never called
- Kind: simplify · Effort: S · Dimension: dead-dup
- Where:
/Users/elmar/PKA/dashboard.py (47-59)
- Impact: 13 lines of dead import/stub suggesting a tracker integration that was removed.
- Evidence: os.path.exists('/Users/elmar/PKA/dispatch_tracker.py') is False and repo-wide grep for dispatch_tracker (--include=*.py) matches only dashboard.py. Occurrences of get_dispatch_stats in dashboard.py are lines 48 (import) and 51 (stub def) — nothing ever calls it; the live dispatch panel uses get_agent_dispatch_stats(activity) instead.
- Suggested action: Delete the try/except dispatch_tracker block (lines 47-59).
- Verifier: Read dashboard.py:47-58 directly: try-import of dispatch_tracker.get_dispatch_stats with an ImportError stub returning zeroed dict. Confirmed dispatch_tracker.py does not exist (ls: No such file). Repo-wide grep (*.py) for dispatch_tracker/get_dispatch_stats matches only dashboard.py lines 48 (import) and 51 (stub def) — no call sites in dashboard.py or serve_dashboard.py. The dispatch panel uses
55. get_graph_data() is dead (never called anywhere)
- Kind: simplify · Effort: S · Dimension: dead-dup
- Where:
/Users/elmar/PKA/dashboard.py (335-348)
- Impact: 14 dead lines querying vault.db graph tables for a panel that no longer exists.
- Evidence: Repo-wide grep for get_graph_data (--include=*.py) returns only the def line at dashboard.py:335. Nothing in main(), generate_html, or any other module references it.
- Suggested action: Delete get_graph_data (lines 335-348).
- Verifier: Independently confirmed. Read /Users/elmar/PKA/dashboard.py lines 335-348: def get_graph_data(conn) exists exactly as claimed, 14 lines querying vault.db files and edges tables and returning (nodes, edges) lists. Re-ran the repro: grep -rn 'get_graph_data' across the repo with --include='*.py' returns only dashboard.py:335 (the def line) plus unrelated same-named functions in .scratch/ copies
56. ~74 CSS classes emitted into every generated dashboard.html have no matching markup or JS — leftovers from removed session cards, task board, scheduler card and the 9-to-6 tab restructure
- Kind: simplify · Effort: S · Dimension: dead-dup
- Where:
/Users/elmar/PKA/dashboard.py (~3560-3995 (CSS block inside generate_html))
- Impact: ~65 stylesheet lines of dead weight regenerated into every ~2.5MB dashboard.html, and a misleading map of UI components that no longer exist.
- Evidence: Automated scan of the generated /Users/elmar/PKA/dashboard.html: of 267 CSS class selectors in