
Luci Agent WAT Architecture Assessment

Date: 2026-05-13
Assessor: Lucienne (Chief of Staff) with specialist subagent analysis
Framework: WAT (Workflows, Agents, Tools)
Scope: Mission Control runtime architecture, Runtime Workbench, WAT orchestration
Reference Documents:
- /Users/elmar/Projects/Mission-Control/docs/runtime-architecture-refresh.md (canonical)
- /Users/elmar/Projects/Mission-Control/docs/runtime-workbench-prd.md (supporting)
- /Users/elmar/PKA/Team/roster.md (team definitions)
- /Users/elmar/PKA/Vault/memory/MEMORY.md (memory index)


Executive Summary

Mission Control's Luci runtime architecture is a mature, tmux-backed execution platform. It maps well to the WAT framework at the Agent and Tool layers, but it has significant gaps in the Workflow orchestration layer and in cross-cutting governance controls. The Workflow layer lacks runtime conditional branching, parallel execution, and an explicit state machine. The roadmap (10 items) correctly identifies the highest-priority gaps, but the four foundational items (1, 2, 5, and 10) have not been started.

Overall WAT Maturity Score

| Dimension | Score (1-5) | Status |
|---|---|---|
| Workflows | 2.5 | ⚠️ Sequential only; missing conditional, parallel, iterative |
| Agents | 3.5 | ✅ Strong roles and context loading; missing registry and council memory |
| Tools | 3.5 | ✅ Rich provider diversity; missing MCP schema and composition |
| Governance (cross-cutting) | 2.5 | ⚠️ HITL and observability present but coarse |
| Overall | 3.0 / 5.0 | Functional but not yet fully WAT-native |
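
The overall figure is consistent with an unweighted mean of the four dimension scores. The assessment does not state how the scores were aggregated, so equal weighting is an assumption here; the sketch only reproduces the arithmetic.

```python
from statistics import mean

# Dimension scores copied from the maturity table.
scores = {
    "Workflows": 2.5,
    "Agents": 3.5,
    "Tools": 3.5,
    "Governance": 2.5,
}

# Assumption: every dimension carries equal weight.
overall = mean(scores.values())
print(f"Overall: {overall:.1f} / 5.0")
```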

1. Workflow Dimension

1.1 Current State

The only defined workflow template is dev_review_qa with 6 strictly sequential phases:

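| Order | Phase | Role | Required | Default |
|---|---|---|---|---|
| 0 | research | scott | False | False |
| 1 | implement | larry | True | True |
| 2 | council_review | council | False | True |
| 3 | code_review | luci | False | True |
| 4 | validate | tessa | True | True |
| 5 | signoff | atlas | False | False |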
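
To make the template concrete, the phase table can be modelled as plain data. This is a minimal sketch, not Mission Control's actual representation: the `Phase` class and `planned_phases` helper are hypothetical, and the reading of the columns (Default phases run unless opted out, Required phases cannot be opted out) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    order: int
    name: str
    role: str
    required: bool
    default: bool


# Rows copied from the dev_review_qa phase table.
DEV_REVIEW_QA = [
    Phase(0, "research", "scott", False, False),
    Phase(1, "implement", "larry", True, True),
    Phase(2, "council_review", "council", False, True),
    Phase(3, "code_review", "luci", False, True),
    Phase(4, "validate", "tessa", True, True),
    Phase(5, "signoff", "atlas", False, False),
]


def planned_phases(template, opted_in=frozenset(), opted_out=frozenset()):
    """Return phase names in order: defaults plus opt-ins, minus opt-outs.

    Assumption: a required phase ignores an opt-out request.
    """
    selected = []
    for phase in sorted(template, key=lambda p: p.order):
        include = phase.default or phase.name in opted_in
        if phase.name in opted_out and not phase.required:
            include = False
        if include:
            selected.append(phase.name)
    return selected
```

Under these assumptions, `planned_phases(DEV_REVIEW_QA)` yields the four default phases, and opting out of `validate` has no effect because it is required.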

14 workflow actions exist (10 operator-initiated + 4 operational signals). State transitions are ticket-status-based, not phase-state-based. The workflow_events table provides nonce-based idempotent audit logging.
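
Nonce-based idempotency usually means that replaying the same action with the same nonce writes at most one audit row. The sketch below illustrates the idea with SQLite; the column names (`nonce`, `ticket_id`, `action`) and both helper functions are assumptions, since the real workflow_events schema is not shown in this assessment.

```python
import sqlite3


def open_log():
    """Create an in-memory stand-in for the workflow_events table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE workflow_events (
               nonce TEXT PRIMARY KEY,
               ticket_id TEXT NOT NULL,
               action TEXT NOT NULL
           )"""
    )
    return conn


def record_event(conn, nonce, ticket_id, action):
    """Insert once per nonce; return True only if a new row was written."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO workflow_events (nonce, ticket_id, action) "
        "VALUES (?, ?, ?)",
        (nonce, ticket_id, action),
    )
    conn.commit()
    # rowcount is 0 when the primary-key conflict caused the insert to be ignored.
    return cur.rowcount == 1
```

A retried request that reuses its nonce is then a no-op rather than a duplicate audit entry.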

1.2 WAT Compliance Assessment

| WAT Workflow Type | Status | Evidence |
|---|---|---|
| Sequential | ✅ Implemented | Strict order field; child ticket chaining |
| Parallel / Fan-out | ❌ Missing | No concurrent phase execution model |
| Conditional / Routing | ⚠️ Partial | required: False gates only; no runtime branching on results |
| Iterative / Loop | ⚠️ Implicit | "Return for fixes" is manual; no auto-retry counter or max-iteration guard |
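
For reference, the missing fan-out/fan-in pattern can be sketched in a few lines with the standard library. The phase functions here are placeholders that return strings; in a real system each would start a runtime. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_research(ticket):
    # Placeholder for a research phase.
    return f"research notes for {ticket}"


def run_scoping(ticket):
    # Placeholder for a scoping phase.
    return f"scope outline for {ticket}"


def fan_out_fan_in(ticket, phase_fns):
    """Run the given phases concurrently, then gather results by phase name."""
    with ThreadPoolExecutor(max_workers=len(phase_fns)) as pool:
        # Fan-out: submit every phase without waiting.
        futures = {name: pool.submit(fn, ticket) for name, fn in phase_fns.items()}
        # Fan-in: result() blocks until each phase finishes and re-raises its errors.
        return {name: fut.result() for name, fut in futures.items()}
```

The fan-in step is where a workflow engine would decide whether all branches must succeed before the next phase starts.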

1.3 Key Gaps

  1. No explicit workflow state machine — phases lack their own status (pending/running/completed/failed/skipped); state is implicit in ticket status
  2. No conditional branching — council finding CRITICAL does not auto-loop to implementation with retry tracking
  3. No parallel execution — research + scoping cannot run concurrently
  4. No phase-runtime profile binding — roadmap #2 pending; all phases use the ticket's initial runtime profile
  5. Council runs outside runtime_sessions — background bash jobs are not tracked in the runtime ledger
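
The first gap can be illustrated with a small phase-level state machine that uses the five statuses named above. The transition rules are an assumption made for this sketch (the assessment lists the statuses but not the allowed moves), and `PhaseState` is a hypothetical class.

```python
from enum import Enum


class PhaseStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"


# Hypothetical transition rules: a failed phase may be retried,
# completed and skipped phases are terminal.
ALLOWED = {
    PhaseStatus.PENDING: {PhaseStatus.RUNNING, PhaseStatus.SKIPPED},
    PhaseStatus.RUNNING: {PhaseStatus.COMPLETED, PhaseStatus.FAILED},
    PhaseStatus.FAILED: {PhaseStatus.RUNNING},
    PhaseStatus.COMPLETED: set(),
    PhaseStatus.SKIPPED: set(),
}


class PhaseState:
    def __init__(self, name):
        self.name = name
        self.status = PhaseStatus.PENDING
        self.history = [PhaseStatus.PENDING]

    def transition(self, new_status):
        """Move to new_status, rejecting any move not listed in ALLOWED."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"{self.name}: {self.status.value} -> {new_status.value} not allowed"
            )
        self.status = new_status
        self.history.append(new_status)
```

Keeping the history per phase also gives the per-phase audit trail that ticket-level status cannot provide.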

1.4 Recommendations

| Priority | Recommendation | Effort | Roadmap Link |
|---|---|---|---|
| High | Add explicit workflow state machine with phase-level statuses | Medium | New |
| High | Implement conditional branching: auto-loop on CRITICAL with retry counter (max 3) | Medium | Partially #1 |
| Medium | Add parallel phase support (fan-out/fan-in) | High | New |
| Medium | Phase-runtime profile binding | Medium | #2 |
| Medium | Per-phase observability (timestamps, token costs, retry counts) | Low | New |
| Low | Workflow pause/resume actions | Low | New |
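
The conditional-branching recommendation could look roughly like the routing function below. Only the CRITICAL severity and the limit of 3 come from this assessment; the other severity labels, the function name, and the choice to escalate to a `needs_input` state once retries are exhausted are assumptions.

```python
MAX_RETRIES = 3  # From the recommendation: retry counter (max 3).


def route_after_council(severity, retries_so_far):
    """Pick the phase that follows council_review.

    Returns (next_phase, updated_retry_count).
    """
    if severity != "CRITICAL":
        # Non-critical findings proceed along the normal sequence.
        return "code_review", retries_so_far
    if retries_so_far >= MAX_RETRIES:
        # Assumption: stop looping and hand the ticket to a human.
        return "needs_input", retries_so_far
    # Auto-loop back to implementation and count the attempt.
    return "implement", retries_so_far + 1
```

The max-iteration guard is what distinguishes this from the current manual "Return for fixes" path.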

2. Agent Dimension

2.1 Current State

Six roles are defined in WORKFLOW_ROLE_PROFILES, mapping directly to the PKA roster:

| Role | Jurisdiction | Context Loading | Runtime |
|---|---|---|---|
| Luci | Orchestration, continuity | Persistent runtime | CLI (tmux) |
| Larry | Code implementation | Per-ticket runtime | SSH + CLI |
| Tessa | UX validation | Test plans in tests/ | Browser tools |
| Scott | Research | Source hierarchy + skills | Web search + scrape |
| Atlas | Architecture sign-off | Wiki + Vault MCP + Graphify | Read + analysis |
| Council | Multi-AI review | Stateless per-run | Codex + Gemini CLI |

2.2 WAT Compliance Matrix

Role Reasoning Memory Tools HITL Observability
Luci
Larry
Tessa
Scott
Atlas ❌*
Council ⚠️

* Atlas is informative-only (no action approval needed for briefs)

2.3 Key Gaps

  1. Council has no persistent memory — each council run is stateless; no accumulation of "this pattern was flagged before"
  2. No agent registry / supervisor view — roadmap #5 pending; no visibility into which agents are active across the system
  3. No cross-agent context packs — runtime switching has "continuity pack" but this is runtime-level, not agent-knowledge-level
  4. Scott has no sanity gate — Deep Scout outputs flow directly to implementation without optional review
  5. Agent capability not formally declared — each agent's tool set is documented in agent files but not machine-readable
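
As an illustration of the first gap, a council memory can be as small as a counter of finding patterns that persists across runs. This is a sketch under stated assumptions: `CouncilMemory` is hypothetical, findings are matched by lowercased text only, and persistence to disk is left out.

```python
from collections import Counter


class CouncilMemory:
    """Counts how often each finding pattern has been flagged across runs."""

    def __init__(self):
        self._seen = Counter()

    def record_run(self, findings):
        """Record one council run; return patterns already flagged in an earlier run."""
        # Assumption: simple text normalization is enough to match repeats.
        patterns = {f.strip().lower() for f in findings}
        repeats = sorted(p for p in patterns if self._seen[p] > 0)
        self._seen.update(patterns)
        return repeats

    def count(self, pattern):
        return self._seen[pattern.strip().lower()]
```

A real accumulator would need a more robust notion of "same pattern" than string equality, but even this form lets a review say that an issue was flagged before.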

2.4 Recommendations

| Priority | Recommendation | Effort | Roadmap Link |
|---|---|---|---|
| High | Agent Registry / Supervisor View: runtime_sessions-backed active agent dashboard | Medium | #5 |
| High | Council Memory: lightweight findings accumulator across reviews | Low | New |
| Medium | Cross-agent Context Packs: formalize what travels on handoff | Medium | New |
| Medium | Agent Capability Declaration: machine-readable tool profiles per agent | Low | New |
| Medium | Bounded Supervisor Check-ins: progress summaries without action | Medium | #6 |
| Low | Scott Sanity Gate: optional Lucienne review for Deep Scout | Low | New |
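
Because the registry is meant to be backed by runtime_sessions rather than new infrastructure, a first version could be a single aggregate query. The schema below (`role`, `ticket_id`, `status` columns and a `'running'` status value) is invented for illustration; the real ledger's columns are not described in this assessment.

```python
import sqlite3


def make_ledger(rows):
    """Build an in-memory stand-in for runtime_sessions (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE runtime_sessions ("
        "id INTEGER PRIMARY KEY, role TEXT, ticket_id TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO runtime_sessions (role, ticket_id, status) VALUES (?, ?, ?)",
        rows,
    )
    return conn


def active_agents(conn):
    """Return {role: number of running sessions} for the supervisor view."""
    rows = conn.execute(
        "SELECT role, COUNT(*) FROM runtime_sessions "
        "WHERE status = 'running' GROUP BY role ORDER BY role"
    ).fetchall()
    return dict(rows)
```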

3. Tool Dimension

3.1 Current State

Runtime Profile Catalog: 9 profiles across 6 providers (Anthropic, OpenAI, Google, Kimi, GLM, MiniMax) with CLI and API backends. Each profile declares capabilities (skills, MCP, hooks, agents, file_tools, shell_tools, tmux_attach, direct_api).
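
The capability declaration lends itself to a flag set, which makes "find a profile that supports X and Y" a one-line check. The eight flag names come from the list above; the two profile names and their capability assignments are placeholders, not entries from the real catalog.

```python
from enum import Flag, auto


class Capability(Flag):
    SKILLS = auto()
    MCP = auto()
    HOOKS = auto()
    AGENTS = auto()
    FILE_TOOLS = auto()
    SHELL_TOOLS = auto()
    TMUX_ATTACH = auto()
    DIRECT_API = auto()


# Hypothetical profiles for illustration only.
PROFILES = {
    "example-cli": (
        Capability.SKILLS
        | Capability.MCP
        | Capability.FILE_TOOLS
        | Capability.SHELL_TOOLS
        | Capability.TMUX_ATTACH
    ),
    "example-api": Capability.DIRECT_API,
}


def profiles_supporting(required):
    """Return names of profiles whose capabilities include every required flag."""
    return sorted(
        name for name, caps in PROFILES.items() if (caps & required) == required
    )
```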

Runtime Types: 6 distinct runtime types (persistent, ticket, workflow role, scheduled/batch, Larry, operator console) with appropriate routing rules.

MCP Tools: Vault MCP (5 tools), Claude Preview MCP (9 browser automation tools). No MC-specific MCP tools yet.

Tmux as Container: Core architectural choice providing process isolation, persistence, and raw console access.

3.2 WAT Compliance Assessment

WAT Tool Requirement Status Evidence
Focused, deterministic tools Capability flags per profile
Tool descriptions ⚠️ Descriptions exist; no usage examples
Agent selects appropriate tool Runtime dropdown + Switch runtime
Tool result feeds back Harvest path to MC History
Tool composition / chaining Not modeled
Tool result caching Same work rerun per ticket
Tool sandboxing Larry has full SSH

3.3 Key Gaps

  1. No MCP tool result schema — returns markdown/text, not structured JSON for programmatic consumption
  2. No MC-specific MCP tools — agents cannot self-orchestrate ticket operations via MCP
  3. No tool composition — cannot declaratively chain tools (e.g., research → summarize → post)
  4. No per-tool cost tracking — token spend is per-session approximate, not per-invocation
  5. No runtime health probes — stuck runtimes detected by polling, not heartbeat
  6. No tool cache — expensive operations (web scrapes, API calls) are rerun
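
For the first gap, a structured result could be a small JSON envelope that callers validate before use. The field names (`tool`, `ok`, `data`, `error`) and the `ToolResult` type are assumptions for this sketch; no such schema exists in the assessed system yet.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResult:
    tool: str
    ok: bool
    data: dict
    error: Optional[str] = None


# Hypothetical required fields and their expected JSON types.
REQUIRED_FIELDS = {"tool": str, "ok": bool, "data": dict}


def parse_tool_result(raw):
    """Parse a JSON tool response and reject payloads that do not fit the envelope."""
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("tool result must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(payload.get(name), expected):
            raise ValueError(f"field {name!r} missing or not {expected.__name__}")
    return ToolResult(
        payload["tool"], payload["ok"], payload["data"], payload.get("error")
    )
```

With an envelope like this, a workflow phase can branch on `ok` instead of parsing markdown.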

3.4 Recommendations

| Priority | Recommendation | Effort | Roadmap Link |
|---|---|---|---|
| High | MC-Specific MCP Tools: ticket ops, runtime mgmt, workflow actions | Medium | New |
| High | Tool Result Schema: JSON schemas for MCP outputs | Low | New |
| Medium | Tool Composition / Chaining: declarative chains in workflow phases | High | New |
| Medium | Per-Tool Cost Tracking: token spend per MCP invocation | Medium | New |
| Medium | Runtime Health Probes: periodic ping; auto-restart stale/failed | Medium | New |
| Low | Tool Cache Layer: TTL-based cache for expensive ops | Medium | New |
| Low | Tool Sandbox for Larry: network restrictions on external host | High | New |
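
The TTL-based cache recommendation can be sketched as a dictionary of timestamped entries. `TTLCache` is a hypothetical class; the injectable `clock` parameter exists only so that expiry can be exercised without sleeping.

```python
import time


class TTLCache:
    """Caches computed values for ttl_seconds, then recomputes on next access."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # Fresh entry: skip the expensive call.
        value = compute()
        self._entries[key] = (now, value)
        return value
```

Wrapping a web scrape or API call in `get_or_compute` means a second ticket asking for the same resource within the TTL reuses the first result.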

4. Governance & Cross-Cutting

4.1 Human-in-the-Loop (HITL)

Current gates: Elmar sign-off, Tessa validation (mandatory before Elmar sees UI), Council CRITICAL blocking, Atlas sign-off for PKA changes, explicit runtime switch.

Gaps: No per-action approval within a running runtime; needs_input is raised per ticket rather than per action; no deployment approval gate for Larry.

4.2 Observability

Current: runtime_sessions ledger, workflow_events audit log, MC History, pane_mirror, clean/raw console dual views.

Gaps: No distributed tracing across handoffs; no per-step timing; no correlation ID across runtime switch; Council output is file-based not structured; token costs are approximate.
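
The correlation-ID gap comes down to minting one identifier per ticket run and copying it into every context and event that follows. The sketch below shows that propagation with plain dictionaries; the function names and the `trace_id` and `role` keys are assumptions.

```python
import uuid


def new_trace_id():
    """Mint an opaque identifier for one end-to-end ticket run."""
    return uuid.uuid4().hex


def make_event(trace_id, source, detail):
    # Every logged event carries the trace id of the run that produced it.
    return {"trace_id": trace_id, "source": source, "detail": detail}


def handoff(context, to_role):
    """Return the context for the next role, carrying the trace id forward unchanged."""
    return {**context, "role": to_role}


def events_for_trace(events, trace_id):
    """Collect all events for one run, regardless of which role emitted them."""
    return [e for e in events if e["trace_id"] == trace_id]
```

If the same identifier is written to runtime_sessions, workflow_events, and council output, a single lookup can reconstruct a run across runtime switches.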

4.3 Memory & Context

Current stack: Session context → MC History → runtime_sessions → vault.db → Wiki → Agent definitions.

Gaps: No "context diff" on handoff; no shared scratchpad for concurrent agents; memory is read-heavy, write-light; GBrain retirement removed semantic search (Vault MCP + wiki are reliable but not semantic).

4.4 Roadmap Gap Analysis

| # | Item | Status | WAT Layer | Critical? |
|---|---|---|---|---|
| 1 | Mandatory WAT review policy | ❌ Not started | Workflow | Yes |
| 2 | Role-specific runtime profile overrides | ❌ Not started | Agent+Tool | Yes |
| 3 | Cheaper API profiles for triage | ⚠️ Partial | Tool | No |
| 4 | Kimi CLI smoke checks | ⚠️ In progress | Tool | No |
| 5 | Agent Registry / supervisor view | ❌ Not started | Agent | Yes |
| 6 | Bounded supervisor check-ins | ❌ Not started | Agent | No |
| 7 | Runtime archive on close | ❌ Not started | Tool | No |
| 8 | Telegram per-target queueing | ⚠️ Partial | Workflow | No |
| 9 | Stable pre-deploy regression | ⚠️ Partial | Workflow | No |
| 10 | Workflow/council hooks | ❌ Not started | Workflow | Yes |

Critical Path: Items 1, 2, 5, and 10 are foundational to WAT compliance. The remaining 6 items are operational improvements that can follow.

4.5 Recommendations

| Priority | Recommendation | Effort | Roadmap Link |
|---|---|---|---|
| High | Implement roadmap #1: Mandatory WAT review policy | Medium | #1 |
| High | Structured council output: JSON schema in workflow_events | Low | #10 |
| High | Correlation IDs: trace ID surviving handoffs and reviews | Low | New |
| Medium | Per-action HITL: file edit approval within runtime | High | New |
| Medium | Agent scratchpad: shared mutable state for concurrent agents | Medium | New |
| Medium | Observability dashboard: real-time runtimes, token burn, phase progress | High | New |
| Low | Auto-compounding: every completed runtime logs learnings | Low | New |

5. Prioritized Implementation Roadmap

Phase 1: WAT Foundation (4 items — unblock everything else)

| Item | Layer | Effort | Deliverable |
|---|---|---|---|
| 1. Mandatory WAT review policy | Workflow | Medium | Document defining when council/Tessa/Atlas are mandatory vs optional |
| 2. Role-specific runtime profile overrides | Agent+Tool | Medium | Phase → profile mapping in dev_review_qa template |
| 5. Agent Registry / supervisor view | Agent | Medium | Workbench panel or dashboard showing active agents, phases, token burn |
| 10. Workflow/council hooks | Workflow | Low | Council output written to workflow_events as structured JSON |

Phase 2: Workflow Engine (2 items — move beyond sequential)

| Item | Layer | Effort | Deliverable |
|---|---|---|---|
| Explicit workflow state machine | Workflow | Medium | Phase statuses: pending/running/completed/failed/skipped |
| Conditional branching + auto-loop | Workflow | Medium | Council CRITICAL → auto-return to implement with retry counter |

Phase 3: Observability & Control (4 items)

| Item | Layer | Effort | Deliverable |
|---|---|---|---|
| Correlation IDs | Governance | Low | Trace ID in runtime_sessions, workflow_events, council output |
| Structured council output | Governance | Low | JSON schema for council findings stored in workflow_events |
| Runtime health probes | Tool | Medium | Periodic ping from MC to runtimes; auto-restart stale |
| Per-phase observability | Workflow | Low | Timestamps, token costs, retry counts visible in Workbench |
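
The health-probe item replaces polling with heartbeats: each runtime reports in, and anything silent for longer than a timeout is treated as stale. The monitor below is a hypothetical sketch of the bookkeeping only; how a runtime sends its heartbeat and how a stale one is restarted are left open.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per runtime and reports the ones that went quiet."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock  # Injectable so staleness can be checked without waiting.
        self._last_seen = {}

    def beat(self, runtime_id):
        """Record that runtime_id is alive right now."""
        self._last_seen[runtime_id] = self.clock()

    def stale(self):
        """Return ids whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(
            rid for rid, seen in self._last_seen.items() if now - seen > self.timeout
        )
```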

Phase 4: Advanced Features (4 items — nice to have)

| Item | Layer | Effort | Deliverable |
|---|---|---|---|
| Parallel phase support | Workflow | High | Fan-out/fan-in for concurrent research + scoping |
| Tool composition / chaining | Tool | High | Declarative tool chains in workflow phases |
| MC-specific MCP tools | Tool | Medium | Ticket ops, runtime mgmt as MCP tools |
| Observability dashboard | Governance | High | Real-time view of active runtimes, token burn, phase progress |

6. Risk Assessment

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Too much workflow automation hides what happened | Medium | High | Keep raw console; verbose workflow_events; never auto-close without explicit action |
| Too many child tickets fragment context | Low | Medium | Single-session rule already prevents this; maintain it |
| Council memory gap causes repeated findings | High | Medium | "Council Memory" recommendation in Section 2.4 (not yet scheduled in the Section 5 phases); lightweight accumulator |
| Agent registry adds operational overhead | Medium | Low | Back it by existing runtime_sessions; don't build new infra |
| Per-action HITL slows down simple tasks | Medium | Medium | Only for sensitive ops; Tier 1 tasks exempt |

7. Conclusion

Mission Control's Luci runtime is a well-architected execution platform with strong Agent and Tool layers. The WAT framework provides a useful lens that reveals the Workflow layer as the primary gap: the system is sequential-heavy, conditional-light, and parallel-absent. The existing roadmap correctly identifies the highest-priority items, but four foundational pieces (items 1, 2, 5, 10) remain unstarted and should be prioritized.

The single-session dev loop (2026-04-29) was a significant WAT-alignment improvement — it reduced coordination overhead while preserving role separation. The tmux-backed runtime is a durable architectural choice that should be retained and enhanced, not replaced.

Bottom line: Luci is a capable agent operator. To become a fully WAT-native orchestration platform, invest in the Workflow layer's state machine, conditional branching, and the four foundational roadmap items.


Assessment generated by Lucienne with parallel specialist subagent analysis across Workflow, Agent, Tool, and Governance dimensions.