type: report
tags: [assessment, luci, mission-control, wat, architecture]
created: 2026-05-13
status: active
Date: 2026-05-13
Scope: Luci's current agent setup reviewed against MC Runtime Architecture, Runtime Workbench PRD, and industry WAT framework (Workflows, Agents, Tools) best practices
Analysts: Architecture Analyst, Gap Analyst, Industry Benchmark Analyst (specialist subagents)
Synthesised by: Lucienne
Luci/Mission Control is a mature Level 3 (Workflow-driven) agentic system — well beyond one-shot prompting, with a complete ticket lifecycle, six defined agent roles, tmux-backed persistent runtimes, two-tier memory architecture, and mobile control surfaces. It has several areas where it leads the commercial framework landscape (memory architecture, human-in-the-loop, cost-aware profiles, mobile control, runtime persistence) and several where it trails (conditional workflow routing, structured output protocol, agent-to-agent handoff, observability).
Overall WAT compliance: 7.5/10. The architecture doc acceptance tests pass at 80% (8 of 10), with two partial passes and no hard failures. The biggest opportunities are conditional routing (LangGraph-inspired) and structured agent handoffs (CrewAI-inspired).
Key finding: The current architecture is sound and should NOT be rewritten into an external framework. The advantages of the home-grown approach (tmux runtime, real production data, cost awareness) outweigh the advantages of off-the-shelf frameworks. Targeted feature adoption is the right strategy.
| # | Test | Status |
|---|---|---|
| A1 | Full ticket lifecycle without leaving MC | ✅ PASS |
| A2 | Workbench full control surface | ✅ PASS |
| A3 | Durable send (History first, then tmux) | ✅ PASS |
| A4 | History scroll-preservation during polling | ⚠️ PARTIAL |
| A5 | Runtime feed follow-bottom with scroll-up override | ✅ PASS |
| A6 | QUESTION: warm runtime + needs_input | ✅ PASS |
| A7 | Root DONE: review-ready keeps runtime | ✅ PASS |
| A8 | Send to Larry: pickupable | ✅ PASS |
| A9 | Return for fixes: requeue implementation | ✅ PASS |
| A10 | Workflow history visible without child tickets | ⚠️ PARTIAL |
8/10 PASS, 2/10 PARTIAL, 0/10 FAIL. The two partials are scroll-preservation edge cases and child-ticket navigation depth; both are fixable without architecture changes.
| Layer | Score | Assessment |
|---|---|---|
| Workflows (W) | 7/10 | Strong sequential template; lacks conditional routing, parallel fan-out, iterative loops. Single template (dev_review_qa). |
| Agents (A) | 8/10 | Six well-defined roles with clear purposes and tool access control. Gaps: no agent-to-agent handoff protocol, no capability registry. |
| Tools (T) | 7/10 | Rich tool surface (tmux, CLI, API, browser, Telegram, scheduler). Gaps: no tool discovery, no result caching, limited composition. |
These are genuine competitive advantages. Don't lose them.
| Advantage | vs Industry | Evidence |
|---|---|---|
| Production-grade ticket lifecycle | +2 | 891 tickets, 18K messages, 39K task runs. Frameworks assume you build this. |
| Persistent, inspectable runtimes | +2 | tmux-backed. SSH in and see exactly what's happening. No framework offers this. |
| Two-tier memory architecture | +2 | Episodic (mc.db) + semantic (vault.db) with dream cycle promotion and salience re-ranking. |
| Mobile-first control surface | +1 | Telegram bridge + responsive Workbench. Frameworks are API-only. |
| Cost-aware runtime profiles | +1 | Explicit tool policy: cheap API for tool-less work, CLI for code. |
These are the gaps to close.
| Gap | vs Industry | Fix | Effort |
|---|---|---|---|
| No conditional workflow routing | -2 | Add phase conditions to workflow templates (if pass → advance, if fail → return_to_fixes) | Medium |
| No structured output protocol | -2 | Adopt JSON-mode output alongside sentinel-based harvest | High |
| No agent-to-agent handoff protocol | -2 | Extend continuity pack for role transitions (changed_files, test_results, findings) | Medium |
| No observability framework | -1 | Structured workflow events (JSONL per workflow step) | Low-Medium |
| No dynamic agent composition | -1 | Defer — explicitly against design principles ("Luci as default, explicit routing") | N/A |
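
To make the handoff gap concrete, a role-transition pack could carry the three fields named in the table (`changed_files`, `test_results`, `findings`). The following is a minimal sketch under stated assumptions: the `HandoffPack` class, its `from_role`/`to_role` fields, the JSON encoding, and the example path are all hypothetical and are not taken from the existing continuity pack code.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class HandoffPack:
    """Context passed from one role to the next (hypothetical shape)."""
    from_role: str
    to_role: str
    changed_files: List[str] = field(default_factory=list)
    test_results: Dict[str, str] = field(default_factory=dict)  # test name -> "pass" / "fail"
    findings: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise with sorted keys so two packs diff cleanly."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "HandoffPack":
        """Rebuild a pack from its JSON form."""
        return cls(**json.loads(raw))


# Illustrative values only; the path and test name are made up.
pack = HandoffPack(
    from_role="larry",
    to_role="council",
    changed_files=["app/routes.py"],
    test_results={"test_routes": "pass"},
    findings=["No regression test for the new endpoint"],
)
restored = HandoffPack.from_json(pack.to_json())
```

Keeping the pack as plain serialisable data means the receiving role can be given the same text a human would see in the ticket history.
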
| Journey | Status | Gap |
|---|---|---|
| 1. Plan with Luci | ✅ Complete | — |
| 2. Do it now | ✅ Complete | — |
| 3. Manage active ticket | ⚠️ Partial | No AI-driven "Next Action" strip; attachment flow limited |
| 4. Recover runtime | ⚠️ Partial | No one-click "Retry as Todo"; provider failure recovery not self-service |
| 5. WAT review gate | ⚠️ Partial | No mandatory review policy; findings don't auto-route to fixes |
| 6. Raw session escape hatch | ✅ Complete | — |
The architecture doc lists 10 roadmap items. Current status:
| # | Item | Status | Recommendation |
|---|---|---|---|
| 1 | Mandatory WAT review policy | ⬜ Not started | P0 — define when council/review is required |
| 2 | Role-specific runtime profile overrides | ⬜ Not started | P1 — per-phase profile selection |
| 3 | Cheaper API profiles for triage/summaries | ⬜ Not started | P1 — tool-less task routing |
| 4 | Kimi CLI smoke checks | ⬜ Not started | P2 — expand provider coverage |
| 5 | Agent Registry / supervisor view | ⬜ Not started | P2 — operational visibility |
| 6 | Bounded supervisor check-ins | ⬜ Not started | P2 — active pane summaries |
| 7 | Runtime archive on close | ⬜ Not started | P1 — audit trail completeness |
| 8 | Telegram per-target queueing | ⬜ Not started | P2 — rapid message handling |
| 9 | Pre-deploy regression command | ⬜ Not started | P1 — release safety |
| 10 | Workflow/council hooks | ⬜ Not started | P1 — second-opinion integration |
New items identified by this review:
| # | New Item | Recommendation |
|---|---|---|
| N1 | Conditional workflow routing | P0 — single biggest WAT maturity improvement |
| N2 | Structured agent handoff packs | P1 — reduce context loss between roles |
| N3 | Multiple workflow templates (simple_fix, research_only, security_audit) | P1 — right-size the workflow to the task |
| N4 | Structured workflow events (observability) | P2 — JSONL per step |
| N5 | Next Action intelligence in Workbench | P2 — PRD requirement, user experience |
| # | Action | Why | Effort |
|---|---|---|---|
| R1 | Conditional workflow routing | Close the biggest WAT maturity gap. Pass/fail branching on council review, Tessa validation, and Larry implementation. | Medium |
| R2 | Mandatory WAT review policy | Architecture doc Roadmap #1. Define when council/review is required vs optional. | Low |
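
The pass/fail branching described in R1 could be expressed as per-phase conditions on a workflow template. This is a minimal sketch, not the Mission Control implementation: the `Phase` dataclass, the `next_phase` helper, and the phase names are assumptions chosen for illustration, loosely modelled on a three-step implementation, review, and QA sequence.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Phase:
    """One step of a workflow template (hypothetical shape)."""
    name: str
    on_pass: Optional[str]  # phase entered when the step passes; None ends the workflow
    on_fail: Optional[str]  # phase entered when the step fails


# Hypothetical template: a failed review or QA step routes back to
# implementation instead of advancing.
TEMPLATE: Dict[str, Phase] = {
    "implementation": Phase("implementation", on_pass="review", on_fail="implementation"),
    "review": Phase("review", on_pass="qa", on_fail="implementation"),
    "qa": Phase("qa", on_pass=None, on_fail="implementation"),
}


def next_phase(current: str, passed: bool) -> Optional[str]:
    """Return the next phase name, or None when the workflow is complete."""
    phase = TEMPLATE[current]
    return phase.on_pass if passed else phase.on_fail


if __name__ == "__main__":
    print(next_phase("review", passed=True))
    print(next_phase("review", passed=False))
```

Because the routing lives in template data, adding further templates would not require new branching code.
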
| # | Action | Why | Effort |
|---|---|---|---|
| R3 | Structured agent handoff packs | Extend build_continuity_pack to role transitions. Larry → Council should carry changed_files + diff summary. | Medium |
| R4 | Multiple workflow templates | simple_fix (Larry → done), research_only (Scott → done), security_audit (security + reviewer → done). Right-size the pipeline. | Low |
| R5 | Formal workflow state machine | Allowed transitions prevent invalid states. Currently implicit in status + action history. | Medium |
| R6 | Runtime archive on close | Transcript excerpt, changed files, commands run, cost, close reason. Audit trail. | Low |
| R7 | Pre-deploy regression command | Combined runtime + workflow + Telegram + scheduler smoke test. Release safety. | Medium |
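
For R5, one simple form of a formal state machine is an explicit table of allowed transitions that every status change must pass through. The sketch below assumes a hypothetical set of status names (`todo`, `in_progress`, `needs_input`, `review_ready`, `done`); the real ticket statuses and their permitted moves may differ.

```python
from typing import Dict, Set

# Hypothetical status names and transitions, for illustration only.
ALLOWED: Dict[str, Set[str]] = {
    "todo": {"in_progress"},
    "in_progress": {"needs_input", "review_ready"},
    "needs_input": {"in_progress"},
    "review_ready": {"in_progress", "done"},  # return for fixes, or accept
    "done": set(),
}


class InvalidTransition(ValueError):
    """Raised when a status change is not in the allowed table."""


def transition(current: str, target: str) -> str:
    """Return the target status if the move is allowed, else raise."""
    if target not in ALLOWED.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target


if __name__ == "__main__":
    status = "todo"
    for step in ("in_progress", "review_ready", "done"):
        status = transition(status, step)
    print(status)
```

Rejecting an invalid move at the point of change makes the rules explicit, where today they are implicit in status plus action history.
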
| # | Action | Why | Effort |
|---|---|---|---|
| R8 | Agent Registry view | Queryable capabilities, status, tool access. Enables supervisor pattern. | Medium |
| R9 | Structured workflow events | JSONL per workflow step. Observability without heavy framework. | Low-Medium |
| R10 | Next Action intelligence | PRD requirement. Derive next step from workflow state + recent activity. | Medium |
| R11 | Telegram per-target queueing | Architecture doc Roadmap #8. | Medium |
| R12 | Role-specific runtime profiles | Per-phase profile overrides. Tessa gets browser-enabled profile, Council gets API profile. | Medium |
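
The JSONL-per-step idea in R9 needs little more than an append-only writer and a line-by-line reader. This is a minimal sketch: the event fields (`ts`, `ticket_id`, `step`, `status`) and the helper names are assumptions, and an in-memory buffer stands in for whatever log file the system would actually append to.

```python
import io
import json
import time
from typing import IO, Any, Dict, List


def write_event(stream: IO[str], ticket_id: int, step: str, status: str, **extra: Any) -> Dict[str, Any]:
    """Append one workflow event as a single JSON line and return it."""
    event = {"ts": time.time(), "ticket_id": ticket_id, "step": step, "status": status, **extra}
    stream.write(json.dumps(event, sort_keys=True) + "\n")
    return event


def read_events(stream: IO[str]) -> List[Dict[str, Any]]:
    """Parse every non-empty line of the stream as one event."""
    return [json.loads(line) for line in stream if line.strip()]


# In-memory stand-in for an append-mode log file.
buffer = io.StringIO()
write_event(buffer, 42, "review", "started")
write_event(buffer, 42, "review", "failed", findings=2)
buffer.seek(0)
events = read_events(buffer)
```

One event per line keeps the log greppable and tail-friendly, which fits the goal of observability without a heavy framework.
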
| # | Action | Why | Effort |
|---|---|---|---|
| R13 | Structured output protocol | JSON-mode output to complement sentinel-based harvest. Start with new agents. | High |
| R14 | Kimi CLI smoke checks | Expand provider coverage. | Medium |
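
R13 proposes JSON-mode output as a complement to sentinel-based harvest, which suggests a parser that accepts both during a transition period. The sketch below is an assumption-heavy illustration: it treats `QUESTION:` and `DONE:` line prefixes as the sentinels, and invents a JSON shape with `kind` and `body` keys. Neither format is confirmed as the actual harvest protocol.

```python
import json
from typing import Dict, Optional

# Assumed sentinel prefixes; the real harvest rules may differ.
SENTINELS = ("QUESTION:", "DONE:")


def harvest(output: str) -> Optional[Dict[str, str]]:
    """Scan from the end and return the first line matching either protocol."""
    for raw_line in reversed(output.strip().splitlines()):
        line = raw_line.strip()
        if not line:
            continue
        # Structured path: a JSON object with "kind" and "body" keys (hypothetical shape).
        if line.startswith("{"):
            try:
                data = json.loads(line)
            except json.JSONDecodeError:
                data = None
            if isinstance(data, dict) and "kind" in data and "body" in data:
                return {"kind": str(data["kind"]), "body": str(data["body"])}
        # Sentinel path: a line that begins with a known prefix.
        for sentinel in SENTINELS:
            if line.startswith(sentinel):
                return {"kind": sentinel.rstrip(":").lower(), "body": line[len(sentinel):].strip()}
    return None


if __name__ == "__main__":
    print(harvest("running tests...\nDONE: all tests pass"))
```

Accepting both forms would let new agents emit JSON while existing agents keep their sentinels, matching the "start with new agents" approach.
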
| # | Action | Why Defer |
|---|---|---|
| D1 | Dynamic agent composition | Architecture doc: "target is not 'many agents by default'" |
| D2 | In-process agent runtime | tmux-backed is a strength, not a weakness |
| D3 | Framework migration (LangGraph/CrewAI) | Would lose production advantages |
| Level | Description | Status |
|---|---|---|
| L1 | Single agent, one-shot tasks | ✅ Exceeded |
| L2 | Multi-agent, manual dispatch | ✅ Exceeded |
| L3 | Workflow-driven, phased, state tracking | ✅ Current |
| L4 | Conditional routing, branching on results | ⬜ Next target |
| L5 | Self-orchestrating, agents decide workflow | ⬜ Explicitly deferred |
The system should target Level 4. That means conditional routing (P0), multiple templates (P1), and structured handoffs (P1). Level 5 is the right thing to defer — the design philosophy of "Luci as default, explicit routing" is correct for a system where Elmar is the primary user and wants to stay in control.
Luci/Mission Control is a well-architected agentic system that's been refined through real production use (891 tickets, 18K messages, 39K task runs). It has genuine advantages over commercial frameworks in memory architecture, human-in-the-loop control, cost awareness, and runtime persistence.
The primary gap is workflow maturity — a single sequential template with manual dispatch is appropriate for Level 3 but insufficient for Level 4. Adding conditional routing, multiple templates, and structured agent handoffs would close this gap without requiring an architecture rewrite.
The 10-item roadmap from the architecture doc remains valid and should be executed in priority order, supplemented by the 5 new items identified here. No roadmap items should be dropped; several should be promoted in priority.
Report generated by Lucienne with specialist subagent analysis. Individual subagent reports available at /tmp/luci-arch-assessment.md, /tmp/luci-gap-assessment.md, /tmp/luci-benchmark-assessment.md.