Second-opinion review — Luci Control Room and Runtime Independence Plan

1. Executive verdict

The plan is sound and well scoped, and Phase 1 has already landed cleanly. The governance/navigation framing addresses the main risk: the control room becoming a second source of truth. The direction is correct: proceed with amendments. Most gaps concern what is not yet named (Claude Code itself as a runtime peer to Hermes, the rare-runtime-failure recovery loop, and the runtime_sessions ledger contract) rather than anything wrong in what is written.

2. Must-fix before Phase 1 sign-off

Phase 1 is marked done. Two items should land before declaring it shipped:

3. Should-fix soon

4. Optional / later improvements

5. Incorrect assumptions or stale architecture

6. Runtime-independence risks

7. Telegram / CCGram / routing risks

8. Storage-boundary risks

The four-layer boundary (reports/ vs mission-control/docs/ vs manifest/CAPABILITIES vs agent-control-room/) is well-drawn but has unclosed seams:

9. Missing workers / runtimes / tooling

10. Concrete recommended patches to the living plan

  1. Add a Phase 1.5: adapter contract. Before Phase 2/3 work begins, add a small subsection in docs/runtime-independence.md titled "Runtime adapter contract" enumerating: (a) runtime_sessions row shape, (b) terminal-state output shape, (c) harvest commit-before-DONE rule, (d) CCGram-sole-poller rule, (e) --settings settings-worker.json rule for any long-running claude invocation, (f) worktree-pool slot anchoring rule, (g) Telegram outbound = notify.py only.
  2. Rewrite the Phase 3 Gemini-CLI bullet to reflect its already-deprecating status (2026-06-18); obs 1721 has already retired it from persistent_luci. Move it from "future fallback plan" to "active migration."
  3. Promote runtime-profile lint from Phase 4 audit task to a continuous gate: add an entry to Phase 1 marking audit_task_runtime_profiles.py --lint as the smoke test the control room exposes.
  4. Add Claude Code as named runtime. In the Phase 1 inventory, separate "Claude Code (the CLI binary)" from "claude_anthropic (a profile of that binary)" — they are different abstractions.
  5. Add to Phase 1 the explicit Atlas-signoff ticket for the reverse link from runtime-architecture-refresh.md. Don't carry it as a deferred bullet — make it a tracked sub-task with the same MC-3898 parent.
  6. Add a CAPABILITIES.md entry for the control room itself so it's discoverable via the always-injected manifest. (Not just a link from the plan — an entry in the table.)
  7. Strengthen open-decision wording. "Should the control room live in ~/workspace/agent-control-room/?" — current state in ~/workspace/ is consistent with the rest of Luci's home; reword from open question to ratified default unless Elmar contradicts.
  8. Add a "validity-window" header to both governance docs: "Valid as of <date>; re-check against runtime-architecture-refresh.md and scheduler.py PROFILE_PROVIDER on or before <date>." This forces drift detection.
  9. Cross-link ~/.claude/rules/agent-recovery-and-loop-discipline.md from runtime-independence.md so terminal-state contract is reachable from the governance layer.
  10. Fix one wording risk in line 165: "no code edits for direct API profiles" — actually direct_anthropic_sdk could in principle edit files if the script does it. The real constraint is "direct API profiles MUST declare direct_* runtime_profile and MUST NOT be assigned tasks routed through the claude CLI dispatcher's tool/file-edit assumptions." Tighten or it misleads adapter authors.
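
The validity-window header from patch 8 could be checked mechanically. The following is a minimal sketch, not part of the plan: the function name validity_status, the ISO date format (YYYY-MM-DD), and the status strings are all assumptions made for illustration, since the plan fixes only the surrounding wording.

```python
import re
from datetime import date

# Assumed header shape: the wording from patch 8 with ISO dates filled in.
# The date syntax is an assumption; the plan does not specify one.
HEADER_RE = re.compile(
    r"Valid as of (\d{4}-\d{2}-\d{2}); re-check .*? "
    r"on or before (\d{4}-\d{2}-\d{2})\."
)


def validity_status(doc_text: str, today: date) -> str:
    """Classify a governance doc as 'missing', 'malformed', 'stale', or 'ok'."""
    match = HEADER_RE.search(doc_text)
    if match is None:
        return "missing"  # no validity-window header found
    try:
        valid_from = date.fromisoformat(match.group(1))
        recheck_by = date.fromisoformat(match.group(2))
    except ValueError:
        return "malformed"  # looks like a date but is not a real one
    if recheck_by < valid_from:
        return "malformed"  # re-check deadline precedes the start date
    return "stale" if today > recheck_by else "ok"
```

A scheduled job or the control room's smoke test could call validity_status on each governance doc and flag anything other than "ok", which is the drift signal the header is meant to provide.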

11. Final recommendation

Proceed with amendments.

The plan's direction is correct and Phase 1 work is real. The amendments are mostly about making implicit contracts explicit — adapter contract, Claude-Code-as-runtime naming, Gemini-CLI status update, CCGram invariant promotion, Atlas reverse-link as a tracked sub-task, and CAPABILITIES.md entry for the control room itself. None require redesign; all should land before Phase 2 begins so the inventory work doesn't bake in ambiguity. Do not start Phase 4 (model/cost routing) until Phase 3 inventory + adapter contract are explicit, otherwise the routing decisions land on a registry that doesn't fully define what it's routing.
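
As an illustration of what making the adapter contract explicit could look like, the sketch below encodes four of the rules listed in patch 1 as checks over a session record. It is a sketch under stated assumptions, not the actual runtime_sessions schema: every field name, the RuntimeSession class, and the "ccgram" runtime identifier are hypothetical. The worktree-pool slot anchoring rule is omitted because the review does not spell out what it requires.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RuntimeSession:
    """Hypothetical session record; all field names are illustrative only."""
    runtime: str                    # e.g. "claude_code" or "ccgram" (assumed ids)
    runtime_profile: str            # e.g. "claude_anthropic"
    state: str                      # e.g. "RUNNING" or "DONE"
    harvest_committed: bool         # has the harvest commit landed?
    uses_claude_cli: bool           # is this an invocation of the claude CLI?
    long_running: bool
    settings_file: Optional[str]    # value passed to --settings, if any
    polls_telegram: bool
    telegram_sender: Optional[str]  # script used for outbound Telegram, if any


def contract_violations(s: RuntimeSession) -> List[str]:
    """Return the adapter-contract rules (as read from patch 1) that s breaks."""
    violations = []
    # (c) harvest commit-before-DONE rule
    if s.state == "DONE" and not s.harvest_committed:
        violations.append("harvest must be committed before DONE")
    # (d) CCGram-sole-poller rule
    if s.polls_telegram and s.runtime != "ccgram":
        violations.append("only CCGram may poll Telegram")
    # (e) long-running claude invocations use settings-worker.json
    if s.uses_claude_cli and s.long_running and s.settings_file != "settings-worker.json":
        violations.append("long-running claude invocation needs settings-worker.json")
    # (g) Telegram outbound goes through notify.py only
    if s.telegram_sender not in (None, "notify.py"):
        violations.append("Telegram outbound must go through notify.py")
    return violations
```

An empty list from contract_violations would mean the session satisfies the four encoded rules. The point of the sketch is only that each rule becomes a named, testable condition rather than prose scattered across documents.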