Luci Agent WAT Architecture Assessment

Date: 2026-05-13
Assessor: Lucienne (Chief of Staff) with specialist subagent analysis
Framework: WAT (Workflows, Agents, Tools)
Scope: Mission Control runtime architecture, Runtime Workbench, WAT orchestration
Reference Documents:
- /Users/elmar/Projects/Mission-Control/docs/runtime-architecture-refresh.md (canonical)
- /Users/elmar/Projects/Mission-Control/docs/runtime-workbench-prd.md (supporting)
- /Users/elmar/PKA/Team/roster.md (team definitions)
- /Users/elmar/PKA/Vault/memory/MEMORY.md (memory index)

Executive Summary

Mission Control's Luci runtime architecture is a mature, tmux-backed execution platform that maps well to the WAT framework but has significant gaps in the Workflow orchestration layer and in cross-cutting governance controls. The Agent and Tool layers are well developed; the Workflow layer lacks conditional branching, parallel execution, and an explicit state machine. The 10-item roadmap correctly identifies the highest-priority gaps, but four foundational items (1, 2, 5, and 10) have not been started.

Overall WAT Maturity Score

Dimension	Score (1-5)	Status
Workflows	2.5	⚠️ Sequential only — missing conditional, parallel, iterative
Agents	3.5	✅ Strong roles and context loading; missing registry and council memory
Tools	3.5	✅ Rich provider diversity; missing MCP schema and composition
Governance (cross-cutting)	2.5	⚠️ HITL and observability present but coarse
Overall	3.0 / 5.0	Functional but not yet fully WAT-native

1. Workflow Dimension

1.1 Current State

The only defined workflow template is dev_review_qa with 6 strictly sequential phases:

Order	Phase	Role	Required	Default
0	research	scott	False	False
1	implement	larry	True	True
2	council_review	council	False	True
3	code_review	luci	False	True
4	validate	tessa	True	True
5	signoff	atlas	False	False
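
The template above can be held as plain data. The sketch below is illustrative only: `PhaseSpec`, `planned_phases`, `opted_in`, and `opted_out` are hypothetical names, and the reading of the columns (Required means the phase cannot be skipped, Default means it runs unless removed) is an assumption rather than something the reference documents confirm.

```python
from typing import NamedTuple


class PhaseSpec(NamedTuple):
    order: int
    phase: str
    role: str
    required: bool
    default: bool


# Rows copied from the dev_review_qa table above.
DEV_REVIEW_QA = [
    PhaseSpec(0, "research", "scott", False, False),
    PhaseSpec(1, "implement", "larry", True, True),
    PhaseSpec(2, "council_review", "council", False, True),
    PhaseSpec(3, "code_review", "luci", False, True),
    PhaseSpec(4, "validate", "tessa", True, True),
    PhaseSpec(5, "signoff", "atlas", False, False),
]


def planned_phases(template, opted_in=(), opted_out=()):
    """Return the phase names that would run for a ticket, in order.

    Assumed semantics: required phases always run, default phases run
    unless opted out, and other phases run only when opted in.
    """
    selected = [
        p for p in sorted(template, key=lambda p: p.order)
        if p.required
        or p.phase in opted_in
        or (p.default and p.phase not in opted_out)
    ]
    return [p.phase for p in selected]
```

Under these assumptions, a ticket with no overrides runs implement, council_review, code_review, and validate, and an attempt to opt out of validate has no effect.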

14 workflow actions exist (10 operator-initiated + 4 operational signals). State transitions are ticket-status-based, not phase-state-based. The workflow_events table provides nonce-based idempotent audit logging.
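
As a rough illustration of nonce-based idempotent logging, the sketch below uses SQLite with the nonce as primary key, so a replayed action cannot create a second row. Only the table name `workflow_events` comes from this assessment; the column names, the `open_event_log` and `record_event` helpers, and the `return_for_fixes` action string are assumptions, and the real schema may differ.

```python
import sqlite3


def open_event_log(path=":memory:"):
    """Create a hypothetical workflow_events table keyed by a unique nonce."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS workflow_events (
            nonce     TEXT PRIMARY KEY,  -- caller-supplied idempotency key
            ticket_id TEXT NOT NULL,
            action    TEXT NOT NULL
        )
        """
    )
    return conn


def record_event(conn, nonce, ticket_id, action):
    """Insert the event once; a replay with the same nonce is a no-op.

    Returns True if a new row was written, False if the nonce was seen before.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO workflow_events (nonce, ticket_id, action) "
        "VALUES (?, ?, ?)",
        (nonce, ticket_id, action),
    )
    conn.commit()
    # rowcount is 0 when the primary-key conflict caused the insert to be ignored.
    return cur.rowcount == 1


conn = open_event_log()
first = record_event(conn, "nonce-1", "T-1", "return_for_fixes")
replay = record_event(conn, "nonce-1", "T-1", "return_for_fixes")
```

Here `first` is True and `replay` is False, which is the property that makes a retried operator click or a duplicated signal safe to log.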

1.2 WAT Compliance Assessment

WAT Workflow Type	Status	Evidence
Sequential	✅ Implemented	Strict `order` field; child ticket chaining
Parallel / Fan-out	❌ Missing	No concurrent phase execution model
Conditional / Routing	⚠️ Partial	`required: False` gates only; no runtime branching on results
Iterative / Loop	⚠️ Implicit	"Return for fixes" is manual; no auto-retry counter or max-iteration guard

1.3 Key Gaps

No explicit workflow state machine — phases lack their own status (pending/running/completed/failed/skipped); state is implicit in ticket status
No conditional branching — council finding CRITICAL does not auto-loop to implementation with retry tracking
No parallel execution — research + scoping cannot run concurrently
No phase-runtime profile binding — roadmap #2 pending; all phases use the ticket's initial runtime profile
Council runs outside runtime_sessions — background bash jobs are not tracked in the runtime ledger

1.4 Recommendations

Priority	Recommendation	Effort	Roadmap Link
High	Add explicit workflow state machine with phase-level statuses	Medium	New
High	Implement conditional branching — auto-loop on CRITICAL with retry counter (max 3)	Medium	Partially #1
Medium	Add parallel phase support (fan-out/fan-in)	High	New
Medium	Phase-runtime profile binding	Medium	#2
Medium	Per-phase observability (timestamps, token costs, retry counts)	Low	New
Low	Workflow pause/resume actions	Low	New
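
The two High recommendations could fit together roughly as follows. This is a minimal sketch, not the Mission Control implementation: the five statuses and the retry cap of 3 come from this section, while the transition table, the `Phase` class, `handle_council_result`, and the choice to escalate via `needs_input` once retries are exhausted are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class PhaseStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"


# Allowed transitions (assumed; the assessment lists only the statuses).
ALLOWED = {
    PhaseStatus.PENDING: {PhaseStatus.RUNNING, PhaseStatus.SKIPPED},
    PhaseStatus.RUNNING: {PhaseStatus.COMPLETED, PhaseStatus.FAILED},
    PhaseStatus.FAILED: {PhaseStatus.PENDING},     # reset for a retry
    PhaseStatus.COMPLETED: {PhaseStatus.PENDING},  # reopened by a later phase
    PhaseStatus.SKIPPED: set(),
}

MAX_RETRIES = 3


@dataclass
class Phase:
    name: str
    status: PhaseStatus = PhaseStatus.PENDING
    retries: int = 0

    def transition(self, new_status):
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"{self.name}: {self.status.value} -> {new_status.value} not allowed"
            )
        self.status = new_status


def handle_council_result(implement, council, severity):
    """Route after council_review; loop back to implement on CRITICAL.

    Returns the next phase name, or "needs_input" once the retry budget is spent.
    """
    if severity != "CRITICAL":
        council.transition(PhaseStatus.COMPLETED)
        return "code_review"
    council.transition(PhaseStatus.FAILED)
    if implement.retries >= MAX_RETRIES:
        return "needs_input"  # stop looping and hand the decision to a human
    implement.retries += 1
    implement.transition(PhaseStatus.PENDING)
    council.transition(PhaseStatus.PENDING)
    return "implement"
```

Keeping the retry counter on the implement phase rather than on the ticket means the max-iteration guard survives ticket status changes, which is the gap noted in 1.2 for the iterative workflow type.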

2. Agent Dimension

2.1 Current State

Six roles are defined in WORKFLOW_ROLE_PROFILES, mapping directly to the PKA roster:

Role	Jurisdiction	Context Loading	Runtime
Luci	Orchestration, continuity	Persistent runtime	CLI (tmux)
Larry	Code implementation	Per-ticket runtime	SSH + CLI
Tessa	UX validation	Test plans in `tests/`	Browser tools
Scott	Research	Source hierarchy + skills	Web search + scrape
Atlas	Architecture sign-off	Wiki + Vault MCP + Graphify	Read + analysis
Council	Multi-AI review	Stateless per-run	Codex + Gemini CLI

2.2 WAT Compliance Matrix

Role	Reasoning	Memory	Tools	HITL	Observability
Luci	✅	✅	✅	✅	✅
Larry	✅	✅	✅	✅	✅
Tessa	✅	✅	✅	✅	✅
Scott	✅	✅	✅	❌	✅
Atlas	✅	✅	✅	❌*	✅
Council	✅	❌	✅	❌	⚠️

* Atlas is advisory only; its briefs require no action approval.

2.3 Key Gaps

Council has no persistent memory — each council run is stateless; no accumulation of "this pattern was flagged before"
No agent registry / supervisor view — roadmap #5 pending; no visibility into which agents are active across the system
No cross-agent context packs — runtime switching has "continuity pack" but this is runtime-level, not agent-knowledge-level
Scott has no sanity gate — Deep Scout outputs flow directly to implementation without optional review
Agent capability not formally declared — each agent's tool set is documented in agent files but not machine-readable

2.4 Recommendations

Priority	Recommendation	Effort	Roadmap Link
High	Agent Registry / Supervisor View — `runtime_sessions`-backed active agent dashboard	Medium	#5
High	Council Memory — lightweight findings accumulator across reviews	Low	New
Medium	Cross-agent Context Packs — formalize what travels on handoff	Medium	New
Medium	Agent Capability Declaration — machine-readable tool profiles per agent	Low	New
Medium	Bounded Supervisor Check-ins — progress summaries without action	Medium	#6
Low	Scott Sanity Gate — optional Lucienne review for Deep Scout	Low	New
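
The Council Memory recommendation could start as small as the sketch below, which remembers normalized finding text and reports which findings in a new review were already flagged in an earlier one. `CouncilMemory`, its methods, and the exact-match normalization are hypothetical; a real version would need persistence and a better notion of "same finding" than lowercased text.

```python
from collections import Counter


class CouncilMemory:
    """In-memory accumulator of council findings across reviews (illustrative)."""

    def __init__(self):
        self._seen = Counter()  # normalized finding -> number of reviews that raised it

    @staticmethod
    def _normalize(finding):
        return finding.strip().lower()

    def record(self, findings):
        """Record one review's findings; return those raised in an earlier review."""
        unique = {self._normalize(f) for f in findings}
        repeats = sorted(f for f in unique if self._seen[f] > 0)
        self._seen.update(unique)
        return repeats

    def count(self, finding):
        """Number of reviews in which this finding has appeared."""
        return self._seen[self._normalize(finding)]


memory = CouncilMemory()
first_review = memory.record(["Missing input validation", "No tests"])
second_review = memory.record(["missing input validation "])
```

After these two calls `first_review` is empty and `second_review` contains the repeated finding, which is the "this pattern was flagged before" signal the stateless council currently lacks.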

3. Tool Dimension

3.1 Current State

Runtime Profile Catalog: 9 profiles across 6 providers (Anthropic, OpenAI, Google, Kimi, GLM, MiniMax) with CLI and API backends. Each profile declares capabilities (skills, MCP, hooks, agents, file_tools, shell_tools, tmux_attach, direct_api).
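
A machine-readable form of these capability declarations might look like the sketch below. The eight flag names are taken from the paragraph above; the `RuntimeProfile` class, the `profiles_supporting` helper, the profile names, and the flags assigned to each example profile are invented for illustration and do not describe the actual catalog.

```python
from dataclasses import dataclass

# Flag names as listed in the Runtime Profile Catalog description.
CAPABILITY_FLAGS = frozenset({
    "skills", "mcp", "hooks", "agents",
    "file_tools", "shell_tools", "tmux_attach", "direct_api",
})


@dataclass(frozen=True)
class RuntimeProfile:
    name: str
    provider: str
    backend: str  # "cli" or "api"
    capabilities: frozenset

    def __post_init__(self):
        # Reject typos early so selection logic never sees an unknown flag.
        unknown = set(self.capabilities) - CAPABILITY_FLAGS
        if unknown:
            raise ValueError(f"unknown capability flags: {sorted(unknown)}")

    def supports(self, *required):
        return set(required) <= set(self.capabilities)


def profiles_supporting(profiles, *required):
    """Names of the profiles that declare every required capability."""
    return [p.name for p in profiles if p.supports(*required)]


# Hypothetical entries; real profile names and flags are not given in this assessment.
example_cli = RuntimeProfile(
    "example-cli", "Anthropic", "cli",
    frozenset({"skills", "mcp", "shell_tools", "tmux_attach"}),
)
example_api = RuntimeProfile("example-api", "OpenAI", "api", frozenset({"direct_api"}))
```

The same shape could be reused for the per-agent tool profiles proposed in 2.4, so that a workflow phase can check capabilities before a runtime is assigned.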

Runtime Types: 6 distinct runtime types (persistent, ticket, workflow role, scheduled/batch, Larry, operator console) with appropriate routing rules.

MCP Tools: Vault MCP (5 tools), Claude Preview MCP (9 browser automation tools). No MC-specific MCP tools yet.

Tmux as Container: Core architectural choice providing process isolation, persistence, and raw console access.

3.2 WAT Compliance Assessment

WAT Tool Requirement	Status	Evidence
Focused, deterministic tools	✅	Capability flags per profile
Tool descriptions	⚠️	Descriptions exist; no usage examples
Agent selects appropriate tool	✅	Runtime dropdown + `Switch runtime`
Tool result feeds back	✅	Harvest path to MC History
Tool composition / chaining	❌	Not modeled
Tool result caching	❌	Same work rerun per ticket
Tool sandboxing	❌	Larry has full SSH

3.3 Key Gaps

No MCP tool result schema — returns markdown/text, not structured JSON for programmatic consumption
No MC-specific MCP tools — agents cannot self-orchestrate ticket operations via MCP
No tool composition — cannot declaratively chain tools (e.g., research → summarize → post)
No per-tool cost tracking — token spend is per-session approximate, not per-invocation
No runtime health probes — stuck runtimes detected by polling, not heartbeat
No tool cache — expensive operations (web scrapes, API calls) are rerun

3.4 Recommendations

Priority	Recommendation	Effort	Roadmap Link
High	MC-Specific MCP Tools — ticket ops, runtime mgmt, workflow actions	Medium	New
High	Tool Result Schema — JSON schemas for MCP outputs	Low	New
Medium	Tool Composition / Chaining — declarative chains in workflow phases	High	New
Medium	Per-Tool Cost Tracking — token spend per MCP invocation	Medium	New
Medium	Runtime Health Probes — periodic ping; auto-restart stale/failed	Medium	New
Low	Tool Cache Layer — TTL-based cache for expensive ops	Medium	New
Low	Tool Sandbox for Larry — network restrictions on external host	High	New
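
For the Tool Cache Layer item, a TTL cache can be very small. The sketch below is one possible shape under stated assumptions: `TTLCache` and `get_or_compute` are hypothetical names, keys are assumed to be hashable (for example a URL), and the injectable `clock` exists only to make expiry testable.

```python
import time


class TTLCache:
    """Cache results of expensive tool calls for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() only on a miss or expiry."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


# Usage with a fake clock so expiry can be shown without sleeping.
fake_now = [0.0]
calls = []


def scrape():
    calls.append("scrape")
    return "page body"


cache = TTLCache(ttl_seconds=10, clock=lambda: fake_now[0])
cache.get_or_compute("https://example.com", scrape)
cache.get_or_compute("https://example.com", scrape)  # served from cache
fake_now[0] = 11.0
cache.get_or_compute("https://example.com", scrape)  # expired, recomputed
```

In this run `scrape` executes twice for three lookups. Whether cached results may be shared across tickets is a policy question this sketch does not address.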

4. Governance & Cross-Cutting

4.1 Human-in-the-Loop (HITL)

Current gates: Elmar sign-off, Tessa validation (mandatory before Elmar sees UI), Council CRITICAL blocking, Atlas sign-off for PKA changes, explicit runtime switch.

Gaps: No per-action approval within a running runtime; needs_input is raised at the ticket level rather than per action; no deployment approval gate for Larry.

4.2 Observability

Current: runtime_sessions ledger, workflow_events audit log, MC History, pane_mirror, clean/raw console dual views.

Gaps: No distributed tracing across handoffs; no per-step timing; no correlation ID across runtime switch; Council output is file-based not structured; token costs are approximate.

4.3 Memory & Context

Current stack: Session context → MC History → runtime_sessions → vault.db → Wiki → Agent definitions.

Gaps: No "context diff" on handoff; no shared scratchpad for concurrent agents; memory is read-heavy, write-light; GBrain retirement removed semantic search (Vault MCP + wiki are reliable but not semantic).

4.4 Roadmap Gap Analysis

#	Item	Status	WAT Layer	Critical?
1	Mandatory WAT review policy	❌ Not started	Workflow	Yes
2	Role-specific runtime profile overrides	❌ Not started	Agent+Tool	Yes
3	Cheaper API profiles for triage	⚠️ Partial	Tool	No
4	Kimi CLI smoke checks	⚠️ In progress	Tool	No
5	Agent Registry / supervisor view	❌ Not started	Agent	Yes
6	Bounded supervisor check-ins	❌ Not started	Agent	No
7	Runtime archive on close	❌ Not started	Tool	No
8	Telegram per-target queueing	⚠️ Partial	Workflow	No
9	Stable pre-deploy regression	⚠️ Partial	Workflow	No
10	Workflow/council hooks	❌ Not started	Workflow	Yes

Critical Path: Items 1, 2, 5, and 10 are foundational to WAT compliance. The remaining 6 items are operational improvements that can follow.

4.5 Recommendations

Priority	Recommendation	Effort	Roadmap Link
High	Implement roadmap #1 — Mandatory WAT review policy	Medium	#1
High	Structured council output — JSON schema in workflow_events	Low	#10
High	Correlation IDs — trace ID surviving handoffs and reviews	Low	New
Medium	Per-action HITL — file edit approval within runtime	High	New
Medium	Agent scratchpad — shared mutable state for concurrent agents	Medium	New
Medium	Observability dashboard — real-time runtimes, token burn, phase progress	High	New
Low	Auto-compounding — every completed runtime logs learnings	Low	New
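
The Correlation IDs recommendation amounts to minting one trace ID per ticket run and stamping it on every record written before and after a handoff. A minimal sketch, assuming records are plain dictionaries; `TraceContext`, `handoff`, and `tag` are hypothetical names, and the `table` and `role` keys are placeholders for the real row shapes.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TraceContext:
    """Carries one trace ID across runtime switches and role handoffs."""

    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    hops: list = field(default_factory=list)

    def handoff(self, from_role, to_role):
        """Note a handoff; the trace ID itself never changes."""
        self.hops.append((from_role, to_role))
        return self

    def tag(self, record):
        """Return a copy of the record stamped with the trace ID."""
        return {**record, "trace_id": self.trace_id}


ctx = TraceContext()
session_row = ctx.tag({"table": "runtime_sessions", "role": "larry"})
ctx.handoff("larry", "council")
event_row = ctx.tag({"table": "workflow_events", "role": "council"})
```

Both rows carry the same `trace_id`, so a query on that value can reassemble the path of one ticket through runtime_sessions, workflow_events, and council output.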

5. Prioritized Implementation Roadmap

Phase 1: WAT Foundation (4 items — unblock everything else)

Item	Layer	Effort	Deliverable
1. Mandatory WAT review policy	Workflow	Medium	Document defining when council/Tessa/Atlas are mandatory vs optional
2. Role-specific runtime profile overrides	Agent+Tool	Medium	Phase → profile mapping in `dev_review_qa` template
5. Agent Registry / supervisor view	Agent	Medium	Workbench panel or dashboard showing active agents, phases, token burn
10. Workflow/council hooks	Workflow	Low	Council output written to workflow_events as structured JSON

Phase 2: Workflow Engine (2 items — move beyond sequential)

Item	Layer	Effort	Deliverable
Explicit workflow state machine	Workflow	Medium	Phase statuses: pending/running/completed/failed/skipped
Conditional branching + auto-loop	Workflow	Medium	Council CRITICAL → auto-return to implement with retry counter

Phase 3: Observability & Control (4 items)

Item	Layer	Effort	Deliverable
Correlation IDs	Governance	Low	Trace ID in runtime_sessions, workflow_events, council output
Structured council output	Governance	Low	JSON schema for council findings stored in workflow_events
Runtime health probes	Tool	Medium	Periodic ping from MC to runtimes; auto-restart stale
Per-phase observability	Workflow	Low	Timestamps, token costs, retry counts visible in Workbench

Phase 4: Advanced Features (4 items — nice to have)

Item	Layer	Effort	Deliverable
Parallel phase support	Workflow	High	Fan-out/fan-in for concurrent research + scoping
Tool composition / chaining	Tool	High	Declarative tool chains in workflow phases
MC-specific MCP tools	Tool	Medium	Ticket ops, runtime mgmt as MCP tools
Observability dashboard	Governance	High	Real-time view of active runtimes, token burn, phase progress

6. Risk Assessment

Risk	Likelihood	Impact	Mitigation
Too much workflow automation hides what happened	Medium	High	Keep raw console; verbose workflow_events; never auto-close without explicit action
Too many child tickets fragment context	Low	Medium	Single-session rule already prevents this; maintain it
Council memory gap causes repeated findings	High	Medium	"Council Memory" recommendation (section 2.4); lightweight accumulator
Agent registry adds operational overhead	Medium	Low	Back it by existing runtime_sessions; don't build new infra
Per-action HITL slows down simple tasks	Medium	Medium	Only for sensitive ops; Tier 1 tasks exempt

7. Conclusion

Mission Control's Luci runtime is a well-architected execution platform with strong Agent and Tool layers. The WAT framework provides a useful lens that reveals the Workflow layer as the primary gap: the system is sequential-heavy, conditional-light, and parallel-absent. The existing roadmap correctly identifies the highest-priority items, but four foundational pieces (items 1, 2, 5, 10) remain unstarted and should be prioritized.

The single-session dev loop (2026-04-29) was a significant WAT-alignment improvement — it reduced coordination overhead while preserving role separation. The tmux-backed runtime is a durable architectural choice that should be retained and enhanced, not replaced.

Bottom line: Luci is a capable agent operator. To become a fully WAT-native orchestration platform, invest in the Workflow layer's state machine, conditional branching, and the four foundational roadmap items.

Assessment generated by Lucienne with parallel specialist subagent analysis across Workflow, Agent, Tool, and Governance dimensions.