Plan is sound, scoped well, and Phase 1 already landed cleanly. The governance/navigation framing fixes the main risk (control room becoming a second source of truth). Direction is correct: proceed with amendments. Most gaps are about what isn't yet named — Claude Code itself as a runtime peer to Hermes, the rare-runtime-failure recovery loop, and the runtime_sessions ledger contract — rather than what's wrong in what's written.
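One way to make the "Claude Code as a runtime peer" gap concrete is to model a profile as an explicit binding of runtime, provider, and model. The sketch below is illustrative only: the `Profile` class, the `profiles_by_runtime` helper, and the specific bindings shown for `claude_anthropic` and `claude_glm` are assumptions for the example, not values taken from the plan or the registry.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Profile:
    """Hypothetical profile record: a named binding of runtime, provider, model."""
    name: str
    runtime: str
    provider: str
    model: str


# Assumed bindings for illustration; the real registry values are not given here.
PROFILES = [
    Profile("claude_anthropic", "Claude Code", "Anthropic", "sonnet"),
    Profile("claude_glm", "Claude Code", "Z.AI", "glm-4.6"),
]


def profiles_by_runtime(profiles: List[Profile]) -> Dict[str, List[str]]:
    """Group profile names by the runtime that actually executes them."""
    grouped: Dict[str, List[str]] = {}
    for p in profiles:
        grouped.setdefault(p.runtime, []).append(p.name)
    return grouped


print(profiles_by_runtime(PROFILES))
```

Grouping by runtime rather than by provider makes it visible that both profiles share one runtime, which is the distinction the plan currently leaves unnamed.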
Phase 1 is marked done. Two items should land before declaring it shipped:
runtime-architecture-refresh.md does not point back to agent-control-room/docs/runtime-independence.md. Until Atlas signoff lands the reverse link, the canonical contract is unaware of the governance layer: anyone reading MC docs first will not discover the control room. Open the Atlas review ticket explicitly now so the Phase 1 box isn't quietly left half-checked.

runtime-independence.md. Phase 5A defers Telegram routing, but the invariant "CCGram is the only inbound poller; any other process invoking claude must use --settings ~/.claude/settings-worker.json" is a runtime-independence constraint now, not a Phase 5A decision. If a future runtime adapter forgets this, the 2026-04-16 409-Conflict outage repeats. Add it to the principle section as a non-negotiable adapter constraint.

runtime_sessions ledger is named as the invariant but not contracted. Plan says runtime adapters write to runtime_sessions, but nowhere defines the minimum row contract (session id, ticket id, profile, started_at, harvest path, terminal status). Without that, every new adapter re-invents the schema and runtime independence fails silently on read-back.

claude_anthropic, claude_glm, etc., all of which are the Claude CLI with different providers. The interactive Luci-persistent tmux session is also Claude Code, and the dispatcher spawns Claude Code subprocesses. The plan talks about Hermes vs Codex vs Gemini as adapters but never names Claude-Code-the-runtime distinctly from the CLI-that-routes-providers. This conflation is exactly the kind of Claude-specific assumption Phase 3 says to surface; name it explicitly.

~/workspace/.claude/worktrees/pool-{0,1,2}; persistent session never claims. A runtime adapter that doesn't honor "commit + push before DONE" destroys uncommitted work at next claim.
This belongs in the adapter contract.

audit_task_runtime_profiles.py --lint is referenced; surface it now as the runtime-honesty smoke test rather than waiting for Phase 3.

~/.claude/rules/agent-recovery-and-loop-discipline.md terminal-state shape (status / summary / next_actions / artifacts) is the de facto contract for harvested subagent output. Plan doesn't reference it. Without that link, a Codex or Gemini adapter can satisfy the plan's letter but break Luci's recovery loop.

docs/glossary.md distinguishing: runtime (Claude Code, Hermes CLI, Codex CLI, Gemini CLI), provider (Anthropic, xAI, Z.AI, Moonshot, MiniMax, Google), model (sonnet, gpt-5.5, grok-4.3, glm-4.6), profile (the named adapter binding the three). Plan uses these terms interchangeably in places.

hermes config show diff snapshot habit before/after, to make WebUI drift git-diffable.

skills.external_dirs" (line 121). Treated as a current strength. Verify before relying on it: the cross-host-skill-port skill exists precisely because cross-provider skill loading is unreliable in practice. If this isn't smoke-tested it is aspirational; the plan's own rule forbids aspirational claims.

persistent_luci. Update wording from "fallback plan for Gemini CLI retirement" to "Gemini CLI is already deprecating 2026-06-18; document the active migration path"; this is no longer hypothetical and Phase 3 should reflect that priority.

gpt-5.5 (line 347) and grok-4.3 (line 347): verify these model identifiers are current Hermes-resolvable names; if Hermes config drifts the inventory rots.

dispatch_policy.py::forbidden_runtime_profiles and the audit_task_runtime_profiles.py --lint rule already give meaningful portability enforcement. The weakness is documentation, not absence. Sharpen the wording so Phase 3 doesn't redesign what exists.

direct_gemini, direct_anthropic_sdk, direct_mixed bypass the CLI and therefore bypass any provider-routing env the scheduler injects.
The runtime-profile-honesty rule (CLAUDE.md key rule #8) is the only thing keeping these honest. The lint must run on every commit that touches ~/workspace/tasks/; plan should escalate that from a Phase 4 audit to a continuous gate.

/api/provider/larry). If MC is down or that endpoint hangs, pickup uses stale defaults. Plan should note that Larry adapter readiness depends on MC liveness, a circular dependency worth flagging.

~/workspace; pool workers anchor at pool-N/. A runtime adapter that doesn't distinguish these will either contaminate the persistent branch or fail to commit at all. Make this an adapter-contract item.

claude_glm as the CLI fallback but doesn't define what happens if Anthropic and Z.AI are both unreachable. The cost-band design needs a "local/fallback" tier (mentioned in Phase 4 line 287); make it concrete: what runs offline / on degraded providers / in 429 storms?

notify.py POST, or that adds any getUpdates polling, breaks CCGram immediately. This is a Day-0 invariant, not a Phase 5A deliverable.

mc_telegram_bridge.py::runtime_profiles() is listed under MC registry surfaces. The bridge service is stopped/disabled (per CLAUDE.md, since MC-2617 2026-04-29). If code remains but the service is dead, the inventory is misleading; confirm whether this is live, vestigial, or being re-purposed.

claude process must use --settings ~/.claude/settings-worker.json" rule lives in CLAUDE.md but not in runtime-independence.md. Phase 1 should link it.

The four-layer boundary (reports/ vs mission-control/docs/ vs manifest/CAPABILITIES vs agent-control-room/) is well-drawn but has unclosed seams:
reports/README.md is checked off but unverified in this review. The plan marks it [x]; quick spot-check it actually exists and is current, otherwise Phase 6 is a paper completion.

runtime-architecture-refresh.md is canonical; runtime-independence.md is governance. If MC architecture changes and governance doc isn't re-read, governance silently goes stale. Add a "validity window" or "last reviewed against runtime-architecture-refresh.md" line to governance docs.

luci-manifest.md ↔ CAPABILITIES.md split is ambiguous. Manifest = "deployed inventory"; CAPABILITIES = "deployed inventory + capabilities". The plan treats them as siblings but they overlap. Worth a one-line authority statement: which one wins when they disagree?

~/.hermes/config.yaml, the WebUI is effectively a write path into runtime config; declare it.

docs/runtime-independence.md titled "Runtime adapter contract" enumerating: (a) runtime_sessions row shape, (b) terminal-state output shape, (c) harvest commit-before-DONE rule, (d) CCGram-sole-poller rule, (e) --settings settings-worker.json rule for any long-running claude invocation, (f) worktree-pool slot anchoring rule, (g) Telegram outbound = notify.py only.

audit_task_runtime_profiles.py --lint as the smoke test the control room exposes.

runtime-architecture-refresh.md. Don't carry it as a deferred bullet; make it a tracked sub-task with the same MC-3898 parent.

~/workspace/agent-control-room/?": current state in ~/workspace/ is consistent with the rest of Luci's home; reword from open question to ratified default unless Elmar contradicts.

runtime-architecture-refresh.md and scheduler.py PROFILE_PROVIDER on or before

~/.claude/rules/agent-recovery-and-loop-discipline.md from runtime-independence.md so terminal-state contract is reachable from the governance layer.

direct_anthropic_sdk could in principle edit files if the script does it.
The real constraint is "direct API profiles MUST declare direct_* runtime_profile and MUST NOT be assigned tasks routed through the claude CLI dispatcher's tool/file-edit assumptions." Tighten or it misleads adapter authors.

Proceed with amendments.
The plan's direction is correct and Phase 1 work is real. The amendments are mostly about making implicit contracts explicit — adapter contract, Claude-Code-as-runtime naming, Gemini-CLI status update, CCGram invariant promotion, Atlas reverse-link as a tracked sub-task, and CAPABILITIES.md entry for the control room itself. None require redesign; all should land before Phase 2 begins so the inventory work doesn't bake in ambiguity. Do not start Phase 4 (model/cost routing) until Phase 3 inventory + adapter contract are explicit, otherwise the routing decisions land on a registry that doesn't fully define what it's routing.
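The first two adapter-contract items named in this review (the runtime_sessions minimum row and the terminal-state output shape) could be pinned down with something as small as the following. This is a minimal sketch under stated assumptions: the field names are derived from the review's wording (session id, ticket id, profile, started_at, harvest path, terminal status; status / summary / next_actions / artifacts), while the class name, helper names, and sample values are hypothetical and not taken from any existing schema.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class RuntimeSessionRow:
    """Hypothetical minimum row; the real runtime_sessions schema is not defined in the plan."""
    session_id: str
    ticket_id: str
    profile: str
    started_at: datetime
    harvest_path: str
    terminal_status: Optional[str] = None  # None while the session is still running


# Keys the review names as the de facto terminal-state shape.
TERMINAL_STATE_KEYS = ("status", "summary", "next_actions", "artifacts")


def missing_row_fields(row: dict) -> List[str]:
    """Return required row fields that are absent or empty."""
    required = [f.name for f in fields(RuntimeSessionRow) if f.name != "terminal_status"]
    return [name for name in required if not row.get(name)]


def missing_terminal_keys(output: dict) -> List[str]:
    """Return terminal-state keys an adapter's harvested output fails to provide."""
    return [key for key in TERMINAL_STATE_KEYS if key not in output]


# Sample values below are placeholders for illustration only.
row = {
    "session_id": "s-1",
    "ticket_id": "TICKET-1",
    "profile": "claude_glm",
    "started_at": datetime(2026, 1, 1, tzinfo=timezone.utc),
    "harvest_path": "/tmp/harvest/s-1",
}
print(missing_row_fields(row))
print(missing_terminal_keys({"status": "DONE", "summary": "ok"}))
```

A check of this kind can run on read-back as well as on write, which is where the review expects a schema mismatch to otherwise fail silently.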