
Mission Control — how it works, how it compares, what's missing

Live rescan of the running system on Luci (not the stale canonical PKA tree). Benchmarked against 7 public "agent OS / mission control" builds surveyed 2026-05-24.

Ticket runtime: interactive (subscription, not -p)
MAX_WORKERS hard cap: 2
Durable board + workflow state: Postgres
Auto-loop: off (dev_review_qa built, not auto-spawning)

The one-line verdict

MC already beats every surveyed build on the engine — durable Postgres board, interactive-tmux subscription runtimes, claim-after-spawn race guards + reaper, host-aware verification, a real harvest contract, a single-voice orchestrator, and quality gates (Tessa + council) that none of the others have. The remaining gaps are not capability — they are an unfinished auto-loop, a split console, and a small scheduled-job billing cleanup.

✅ What's solid

Orchestrator control-plane, interactive subscription runtimes, dispatch safety (race guard + reaper + locks), Postgres lifecycle, harvest contract, host-aware verify, Tessa + council gates, provider flexibility.

⚠️ Half-built

The dev_review_qa phase workflow is coded (phases + return_for_fixes requeue) but auto-spawn is disabled. No machine acceptance gate, no round cap. So the code→review→Tessa loop is manual and unreliable.

🔴 Open gaps

No single unified web console / live agent-health grid; ~9 scheduled claude -p jobs exposed to the 15-Jun API billing change; throughput capped at 2 workers.

How Mission Control works now

This is the live runtime loop on Luci.

01 Intent: Elmar → orchestrator
02 Decompose: proposal cards → tickets
03 Pickup: scheduler spawns worker
04 Work: interactive tmux session
05 Gate: review / QA / question
06 Operator: sweep + self-heal
07 Digest: inbox → orchestrator

Ticket lifecycle (state machine)

todo → pickup → in_progress → needs_input / in_review → done · cancelled

Worker emits line-start signals: QUESTION: → needs_input (runtime kept warm) · REVIEW: → in_review (warm) · DONE: → review-ready/close. Loop primitive: Return for fixes requeues implementation and makes the ticket pickupable again. The loop-back exists, but it is triggered manually, not automatically.
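The signal convention above can be sketched as a small parser. This is a minimal illustration, not the code in ticket_runtime.py: the function names, the regex, and the state name review_ready (standing in for "review-ready/close") are assumptions made for the example.

```python
import re
from typing import Optional

# Mapping taken from the report: signal keyword -> resulting ticket state.
# "review_ready" is a hypothetical name for the "review-ready/close" outcome.
SIGNAL_TO_STATE = {
    "QUESTION": "needs_input",
    "REVIEW": "in_review",
    "DONE": "review_ready",
}

# Signals only count at the very start of a line, so inline or indented
# mentions of the same words are ignored.
_SIGNAL_RE = re.compile(r"^(QUESTION|REVIEW|DONE):\s*(.*)$")


def parse_signal(line: str) -> Optional[tuple[str, str]]:
    """Return (next_state, payload) if the line starts with a signal, else None."""
    match = _SIGNAL_RE.match(line)
    if match is None:
        return None
    keyword, payload = match.groups()
    return SIGNAL_TO_STATE[keyword], payload.strip()


def last_signal(captured_output: str) -> Optional[tuple[str, str]]:
    """Scan captured runtime output and return the most recent signal, if any."""
    result = None
    for line in captured_output.splitlines():
        parsed = parse_signal(line)
        if parsed is not None:
            result = parsed
    return result
```

Anchoring the match at line start is the point of the convention: a worker that merely quotes "DONE:" mid-sentence does not move the ticket.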

Component map (live mission-control/)

Control plane

persistent_luci.py: orchestrator (long-lived tmux, single-voice)
semantic_router.py: intent routing
orchestrator_triage.py: auto-triage (Gemini Flash)

Dispatch & runtime

mc_pickup.py: dispatcher (claim-after-spawn, reaper, MAX_WORKERS=2)
ticket_runtime.py: interactive tmux claude
runtime_picker / runtime_pool: profile + warm pool
mc_tmux.py: send_input/capture

Gates & health

luci_operator.py: board sweep / self-heal
council_runner.py: second opinion
followup_ledger / handoff: continuity
app.py: server, Workbench, harvest, Telegram bridge

The roles (agents = roles, not processes)

Luci: Orchestrator + default operator. Owns continuity, MC/Luci code, all dispatch.
Larry: Implementation on external repos (LegalMind / SafairBru / Coolify host).
Tessa: QA / UX validation. Mandatory for UI & 3+ component changes.
🔍 Scott: Research / current-source scout.
🏛 Atlas: Architecture sign-off for PKA/MC system changes.
Council: Codex+Gemini+Opus+Kimi+GLM. Significant runtime/security/workflow changes.

How MC compares to the other models

7 public builds surveyed. The split is consistent: marketers build pretty consoles over thin engines; engineers build raw orchestration with no governance. MC is the inverse of the marketers and ahead of the engineers on safety.

Dimension | Mission Control | Hermes / Julian | Alex Finn | IndyDevDan | Auto Claude
Orchestration trigger | Scheduler push + operator sweep | Heartbeat persona poll | OpenClaw heartbeat | Agent spawns sub-agents | Desktop spawns instances
Per-task isolation | own tmux session + session_id | personas, 1 process | personas | sandbox + ctx window | worktrees
Ticket board | Postgres, lifecycle-gated | kanban (DB/md) | kanban | event stream | kanban + GH issues
Dispatch safety | race guard + reaper + locks
Quality gates | Tessa + council + Atlas | none | none | none | review agent
Memory | vault.db + SB + graph | Obsidian bolt-on | journal | per-ctx | Graphiti
Host-aware verify | resolve + verify on host
Auto code→review loop | built, auto-spawn OFF | SendMessage [column unknown]
Unified web console | Board+Workbench, split | polished | polished | observability | desktop app
Live agent-health grid | data, no at-a-glance grid | event pulse [column unknown]
Billing model | subscription (tmux) | API/credits | API/credits | API | API

Legend: strong · partial · absent

Engine maturity vs console polish

The whole market sits on one diagonal. MC is top-left — strongest engine, weakest console. The marketers are bottom-right. The opportunity is to move MC right without losing the engine.

[Chart: engine maturity (vertical axis) vs console polish (horizontal axis). Plotted: Mission Control, Alex Finn, Hermes / Julian, IndyDevDan, Auto Claude, CC Agent View. An arrow marks the target position.]

Gaps & what I propose

Ordered by leverage. Each gap is grounded in the live code.

1 · The code→review→Tessa loop doesn't auto-run (severity: high)

evidence: app.py:4473 dev_review_qa phases defined · app.py:6150 "auto-spawn currently disabled by should_auto_workflow" · return_for_fixes exists but manual

The phase workflow + the requeue primitive are already coded. What's missing is automatic progression and a stop-condition — so today you ask one agent to be the loop controller, which never holds.

Propose: Enable bounded dev_review_qa auto-spawn behind three guards: (a) a machine-checkable acceptance gate per phase (tests pass / build green / Tessa PASS — not "until Opus is happy"); (b) a max-round cap (e.g. 3) → on exceed, park needs_input + ping; (c) the existing circuit-breaker (≥2 same-root-cause failures → freeze + audit). Coder = codex profile, reviewer = Opus, QA = Tessa browser — each a bounded role runtime, continuity via branch/artifacts not chat.
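The three guards can be sketched as a single loop controller. This is an illustration under stated assumptions, not the dev_review_qa implementation: the names GateResult, LoopOutcome, and run_bounded_loop are hypothetical, the round cap uses the "e.g. 3" figure from the proposal, and the gate is a callback that a caller would back with tests, a build check, or a Tessa PASS.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_ROUNDS = 3         # the "e.g. 3" cap from the proposal
BREAKER_THRESHOLD = 2  # ">=2 same-root-cause failures" trips the breaker


@dataclass
class GateResult:
    passed: bool
    root_cause: Optional[str] = None  # short label for why the gate failed


@dataclass
class LoopOutcome:
    status: str  # "done", "needs_input", or "frozen"
    rounds: int
    failures: list = field(default_factory=list)


def run_bounded_loop(run_round: Callable[[int], GateResult]) -> LoopOutcome:
    """Run code -> review -> QA rounds until the gate passes or a guard fires."""
    failures = []
    for round_no in range(1, MAX_ROUNDS + 1):
        result = run_round(round_no)
        if result.passed:
            return LoopOutcome("done", round_no, failures)
        failures.append(result.root_cause)
        # Circuit breaker: the same root cause seen twice means another
        # round is unlikely to help, so freeze for audit.
        if (
            result.root_cause is not None
            and failures.count(result.root_cause) >= BREAKER_THRESHOLD
        ):
            return LoopOutcome("frozen", round_no, failures)
    # Round cap exceeded: park the ticket and ask a human.
    return LoopOutcome("needs_input", MAX_ROUNDS, failures)
```

Because the stop conditions live in the controller rather than in any agent's prompt, no single agent has to act as the loop controller.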

2 · No single unified console / live agent-health grid (severity: medium)

evidence: surfaces split across Board · Runtime Workbench · dashboard chat · Telegram · PKA :8787 (functional, not designed)

Every surveyed build's one real advantage is a dark 3-zone console (sidebar · main · live-activity rail) with an at-a-glance agent grid. MC has all the data, scattered.

Propose: A mission-control console view that consolidates everything onto one surface: left nav (Board / Workbench / Agents / Activity), centre = the work, right = a live activity rail, plus an Agents health grid (Luci / Larry / Tessa / Scott + provider status) showing a green/yellow/red semaphore, current task, and latency. Reuse this dark "mission control" visual language. Skip the pixel "Office": every reviewer flagged it as a gimmick. Tier-2, no new server.
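A rough sketch of how one row of the health grid could be derived. The report does not say what drives the semaphore, so everything here is an assumption: the heartbeat-age rule, both thresholds, and the AgentStatus fields are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; the report does not specify any.
WARN_AFTER_S = 60.0
STALE_AFTER_S = 300.0


@dataclass
class AgentStatus:
    name: str
    current_task: Optional[str]
    seconds_since_heartbeat: float
    latency_ms: float


def semaphore(status: AgentStatus) -> str:
    """Map heartbeat age to green / yellow / red."""
    if status.seconds_since_heartbeat >= STALE_AFTER_S:
        return "red"
    if status.seconds_since_heartbeat >= WARN_AFTER_S:
        return "yellow"
    return "green"


def render_grid(statuses: list) -> list:
    """Return one text row per agent: colour, name, task, latency."""
    rows = []
    for s in statuses:
        task = s.current_task or "idle"
        rows.append(f"{semaphore(s)} | {s.name} | {task} | {s.latency_ms:.0f} ms")
    return rows
```

The same function could feed a web view or a Telegram summary, which fits the "no new server" constraint.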

3 · Scheduled claude -p jobs exposed to 15-Jun API billing (severity: medium · time-boxed)

evidence: ~9 task defs shell claude -p — life-manager-digest/scan, morning-briefing, b4i-fuel-history, claude-mem-value-eval, agent-watch, legalmind-wa-watcher, provider-smoke, self-improve-luci-weekly. (Ticket work already subscription/interactive — not exposed.)

From 15 June, headless -p bills to the separate API account. Tickets are safe; these scheduled jobs are the only exposure.

Propose: Audit the 9 → quality-critical ones (briefings, self-improve) move to a subscription-interactive runner; the rest (scans, smoke, digests) drop to GLM / Gemini Flash via the existing provider-switch. Neutralises the change for cents. Do before 15 Jun.
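A first-pass triage of the job list might look like the sketch below. It is a starting point for the audit, not its result. Assumptions: "life-manager-digest/scan" is read as two jobs, the runner labels are hypothetical, and jobs are classified by name alone, so each entry still needs a human check.

```python
# Job names from the evidence line; digest/scan split into two is an assumption.
SCHEDULED_JOBS = [
    "life-manager-digest",
    "life-manager-scan",
    "morning-briefing",
    "b4i-fuel-history",
    "claude-mem-value-eval",
    "agent-watch",
    "legalmind-wa-watcher",
    "provider-smoke",
    "self-improve-luci-weekly",
]

# The proposal names briefings and self-improve as quality-critical.
QUALITY_CRITICAL_MARKERS = ("briefing", "self-improve")


def plan_runner(job_name: str) -> str:
    """Pick a target runner for one job (labels are hypothetical)."""
    if any(marker in job_name for marker in QUALITY_CRITICAL_MARKERS):
        return "subscription-interactive"
    return "cheap-provider"  # e.g. GLM / Gemini Flash via the provider switch


def plan_all(jobs: list) -> dict:
    return {job: plan_runner(job) for job in jobs}


if __name__ == "__main__":
    for job, runner in plan_all(SCHEDULED_JOBS).items():
        print(f"{job}: {runner}")
```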

4 · Throughput capped at 2 workers; single-lane stall risk (severity: low)

evidence: mc_pickup.py MAX_WORKERS = 2 (lowered 3→2) · runtime_pool.py exists

Fine for safety today, but a ceiling once the auto-loop runs (each loop holds a slot through multiple phases).

Propose: Tie worker count to the runtime model — a small warm pool of per-role slots (e.g. 2 coder + 1 review + 1 Tessa) rather than one global cap, with the operator sweep watching for a wedged slot. Revisit cap after the auto-loop lands.
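Per-role slots could be modelled with one bounded semaphore per role instead of a single global counter. This is a sketch only: RolePool and its methods are hypothetical names, and the slot counts are the "e.g." figures from the proposal, not a measured sizing.

```python
import threading
from contextlib import contextmanager

# Example sizing from the proposal: 2 coder + 1 review + 1 Tessa.
ROLE_SLOTS = {"coder": 2, "review": 1, "tessa": 1}


class RolePool:
    """One bounded semaphore per role, so roles cannot starve each other."""

    def __init__(self, slots: dict):
        self._sems = {
            role: threading.BoundedSemaphore(count) for role, count in slots.items()
        }

    def try_acquire(self, role: str) -> bool:
        # Non-blocking: the dispatcher should skip the ticket, not wait.
        return self._sems[role].acquire(blocking=False)

    def release(self, role: str) -> None:
        self._sems[role].release()

    @contextmanager
    def slot(self, role: str):
        if not self.try_acquire(role):
            raise RuntimeError(f"no free {role} slot")
        try:
            yield
        finally:
            self.release(role)
```

With this shape, a long review phase holds only the review slot, so coder tickets keep moving while the loop waits on a verdict.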

5 · Heavy review/QA runtimes reaped mid-loop (severity: low · partly fixed)

evidence: prior incidents (truncated subagent / heavy-runtime harvest) · harvest durability work (MC-3482, MC-3804 MCP-primary harvest) already in tree

Long Tessa/Preview or review runs can truncate before writing a verdict → loop looks stuck.

Propose: Keep the MCP-primary harvest + enforce incremental commit/push per phase so a reaped runtime is reconstructable from git + harvest, never the cut-off return. Already the direction — make it a hard phase rule.

Suggested sequence

now

Gap 3 (billing audit) — small, deadline-driven, before 15 Jun.

next

Gap 1 (auto-loop with acceptance gate + cap) — the real unlock. Atlas + council first.

then

Gap 2 (console + agent grid), then 4/5 tune once the loop runs.

Scanned live on Luci ~/workspace/mission-control/ 2026-05-24. Comparison from the 6-builder survey (Elmar Inbox/agent-os-survey.md) + Hermes teardown. Severity reflects leverage × effort, not difficulty alone.