
Luci Agent Assessment — WAT Architecture Review


type: report
tags: [assessment, luci, mission-control, wat, architecture]
created: 2026-05-13
status: active


Date: 2026-05-13
Scope: Luci's current agent setup reviewed against MC Runtime Architecture, Runtime Workbench PRD, and industry WAT framework (Workflows, Agents, Tools) best practices
Analysts: Architecture Analyst, Gap Analyst, Industry Benchmark Analyst (specialist subagents)
Synthesised by: Lucienne


Executive Summary

Luci/Mission Control is a mature Level 3 (Workflow-driven) agentic system — well beyond one-shot prompting, with a complete ticket lifecycle, six defined agent roles, tmux-backed persistent runtimes, two-tier memory architecture, and mobile control surfaces. It has several areas where it leads the commercial framework landscape (memory architecture, human-in-the-loop, cost-aware profiles, mobile control, runtime persistence) and several where it trails (conditional workflow routing, structured output protocol, agent-to-agent handoff, observability).

Overall WAT compliance: 7.5/10. Eight of the ten architecture doc acceptance tests pass outright (80%) and the remaining two pass partially, with no hard failures. The biggest opportunities are conditional routing (LangGraph-inspired) and structured agent handoffs (CrewAI-inspired).

Key finding: The current architecture is sound and should NOT be rewritten into an external framework. The advantages of the home-grown approach (tmux runtime, real production data, cost awareness) outweigh the advantages of off-the-shelf frameworks. Targeted feature adoption is the right strategy.


1. Architecture Doc Compliance: 8/10 PASS

Acceptance Test Results

| # | Test | Status |
| --- | --- | --- |
| A1 | Full ticket lifecycle without leaving MC | ✅ PASS |
| A2 | Workbench full control surface | ✅ PASS |
| A3 | Durable send (History first, then tmux) | ✅ PASS |
| A4 | History scroll-preservation during polling | ⚠️ PARTIAL |
| A5 | Runtime feed follow-bottom with scroll-up override | ✅ PASS |
| A6 | QUESTION: warm runtime + needs_input | ✅ PASS |
| A7 | Root DONE: review-ready keeps runtime | ✅ PASS |
| A8 | Send to Larry: pickupable | ✅ PASS |
| A9 | Return for fixes: requeue implementation | ✅ PASS |
| A10 | Workflow history visible without child tickets | ⚠️ PARTIAL |

8/10 PASS, 2/10 PARTIAL, 0/10 FAIL. The two partials are scroll-preservation edge cases (A4) and child-ticket navigation depth (A10); both are fixable without architecture changes.


2. WAT Framework Scores

| Layer | Score | Assessment |
| --- | --- | --- |
| Workflows (W) | 7/10 | Strong sequential template; lacks conditional routing, parallel fan-out, iterative loops. Single template (dev_review_qa). |
| Agents (A) | 8/10 | Six well-defined roles with clear purposes and tool access control. Gaps: no agent-to-agent handoff protocol, no capability registry. |
| Tools (T) | 7/10 | Rich tool surface (tmux, CLI, API, browser, Telegram, scheduler). Gaps: no tool discovery, no result caching, limited composition. |

3. What Luci Does Better Than Frameworks

These are genuine competitive advantages. Don't lose them.

| Advantage | vs Industry | Evidence |
| --- | --- | --- |
| Production-grade ticket lifecycle | +2 | 891 tickets, 18K messages, 39K task runs. Frameworks assume you build this. |
| Persistent, inspectable runtimes | +2 | tmux-backed. SSH in and see exactly what's happening. No framework offers this. |
| Two-tier memory architecture | +2 | Episodic (mc.db) + semantic (vault.db) with dream cycle promotion and salience re-ranking. |
| Mobile-first control surface | +1 | Telegram bridge + responsive Workbench. Frameworks are API-only. |
| Cost-aware runtime profiles | +1 | Explicit tool policy: cheap API for tool-less work, CLI for code. |

4. What Luci Does Worse Than Frameworks

These are the gaps to close.

| Gap | vs Industry | Fix | Effort |
| --- | --- | --- | --- |
| No conditional workflow routing | -2 | Add phase conditions to workflow templates (if pass → advance, if fail → return_to_fixes) | Medium |
| No structured output protocol | -2 | Adopt JSON-mode output alongside sentinel-based harvest | High |
| No agent-to-agent handoff protocol | -2 | Extend continuity pack for role transitions (changed_files, test_results, findings) | Medium |
| No observability framework | -1 | Structured workflow events (JSONL per workflow step) | Low-Medium |
| No dynamic agent composition | -1 | Defer: explicitly against design principles ("Luci as default, explicit routing") | N/A |
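The handoff fix in the table above names three fields (changed_files, test_results, findings). A minimal sketch of what such a pack could look like follows. The `HandoffPack` class, the `ticket_id`, `from_role`, and `to_role` fields, and the `missing_fields` helper are hypothetical names chosen for illustration; none of them is taken from the Mission Control codebase.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class HandoffPack:
    """Hypothetical container for context passed between roles."""
    ticket_id: str
    from_role: str
    to_role: str
    changed_files: list = field(default_factory=list)   # paths touched by the sender
    test_results: dict = field(default_factory=dict)    # test name -> "pass" / "fail"
    findings: list = field(default_factory=list)        # free-text notes for the receiver

    def to_json(self) -> str:
        # Stable key order keeps serialised packs easy to diff.
        return json.dumps(asdict(self), sort_keys=True)


def missing_fields(pack: HandoffPack) -> list:
    """Return the names of empty context fields so the receiver knows what is absent."""
    names = ("changed_files", "test_results", "findings")
    return [name for name in names if not getattr(pack, name)]


pack = HandoffPack("T-1", "larry", "council", changed_files=["app/routes.py"])
print(missing_fields(pack))
```

Flagging empty fields instead of rejecting the pack keeps the handoff non-blocking, which seems closer to the human-in-the-loop style described in this report; a stricter policy would be equally easy to express.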

5. PRD User Journey Gaps

| Journey | Status | Gap |
| --- | --- | --- |
| 1. Plan with Luci | ✅ Complete | |
| 2. Do it now | ✅ Complete | |
| 3. Manage active ticket | ⚠️ Partial | No AI-driven "Next Action" strip; attachment flow limited |
| 4. Recover runtime | ⚠️ Partial | No one-click "Retry as Todo"; provider failure recovery not self-service |
| 5. WAT review gate | ⚠️ Partial | No mandatory review policy; findings don't auto-route to fixes |
| 6. Raw session escape hatch | ✅ Complete | |

6. Roadmap Gap Analysis

The architecture doc lists 10 roadmap items. Current status:

| # | Item | Status | Recommendation |
| --- | --- | --- | --- |
| 1 | Mandatory WAT review policy | ⬜ Not started | P0: define when council/review is required |
| 2 | Role-specific runtime profile overrides | ⬜ Not started | P1: per-phase profile selection |
| 3 | Cheaper API profiles for triage/summaries | ⬜ Not started | P1: tool-less task routing |
| 4 | Kimi CLI smoke checks | ⬜ Not started | P2: expand provider coverage |
| 5 | Agent Registry / supervisor view | ⬜ Not started | P2: operational visibility |
| 6 | Bounded supervisor check-ins | ⬜ Not started | P2: active pane summaries |
| 7 | Runtime archive on close | ⬜ Not started | P1: audit trail completeness |
| 8 | Telegram per-target queueing | ⬜ Not started | P2: rapid message handling |
| 9 | Pre-deploy regression command | ⬜ Not started | P1: release safety |
| 10 | Workflow/council hooks | ⬜ Not started | P1: second-opinion integration |

New items identified by this review:

| # | New Item | Recommendation |
| --- | --- | --- |
| N1 | Conditional workflow routing | P0: single biggest WAT maturity improvement |
| N2 | Structured agent handoff packs | P1: reduce context loss between roles |
| N3 | Multiple workflow templates (simple_fix, research_only, security_audit) | P1: right-size the workflow to the task |
| N4 | Structured workflow events (observability) | P2: JSONL per step |
| N5 | Next Action intelligence in Workbench | P2: PRD requirement, user experience |
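As a rough illustration of N3, a template registry could map a template name to the role sequence it runs. The registry shape, the `pick_template` helper, and the role order inside `dev_review_qa` are assumptions made for this sketch; the role sequences for the three new templates follow the descriptions given under R4 in this report.

```python
# Hypothetical registry: template name -> ordered list of roles.
# The order inside dev_review_qa is an assumption, not a documented fact.
WORKFLOW_TEMPLATES = {
    "dev_review_qa": ["larry", "council", "tessa"],
    "simple_fix": ["larry"],
    "research_only": ["scott"],
    "security_audit": ["security", "reviewer"],
}

DEFAULT_TEMPLATE = "dev_review_qa"


def pick_template(requested=None):
    """Return (name, roles), falling back to the default for unknown or missing names."""
    name = requested if requested in WORKFLOW_TEMPLATES else DEFAULT_TEMPLATE
    # Copy the list so callers cannot mutate the registry by accident.
    return name, list(WORKFLOW_TEMPLATES[name])


print(pick_template("simple_fix"))
print(pick_template("unknown_kind"))
```

Falling back to the existing template for unknown names keeps current behaviour intact while new templates are introduced one at a time.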

7. Prioritised Recommendations

P0 — Do Next (High Impact, Aligns with Architecture)

| # | Action | Why | Effort |
| --- | --- | --- | --- |
| R1 | Conditional workflow routing | Close the biggest WAT maturity gap. Pass/fail branching on council review, Tessa validation, and Larry implementation. | Medium |
| R2 | Mandatory WAT review policy | Architecture doc Roadmap #1. Define when council/review is required vs optional. | Low |
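The pass/fail branching described for R1 can be sketched as a template in which every phase declares where to go on each outcome. The `Phase` dataclass, the phase names, and the `next_phase` function are hypothetical; only the template name dev_review_qa, the roles, and the "pass advances, fail returns to fixes" rule come from this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """Hypothetical phase definition with explicit pass/fail targets."""
    role: str
    on_pass: str   # next phase name, or "done"
    on_fail: str   # phase to return to when the result is a failure


# Assumed layout of the existing sequential template, with branches added.
DEV_REVIEW_QA = {
    "implementation": Phase("larry", on_pass="review", on_fail="implementation"),
    "review": Phase("council", on_pass="qa", on_fail="implementation"),
    "qa": Phase("tessa", on_pass="done", on_fail="implementation"),
}


def next_phase(template, current, passed):
    """Pick the next phase name from the outcome of the current one."""
    phase = template[current]
    return phase.on_pass if passed else phase.on_fail


print(next_phase(DEV_REVIEW_QA, "review", passed=True))
print(next_phase(DEV_REVIEW_QA, "qa", passed=False))
```

Keeping the branch targets in the template data, not in dispatcher code, means a new template only has to declare its own phases to get routing.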

P1 — Do Soon (High Value, Build on Foundation)

| # | Action | Why | Effort |
| --- | --- | --- | --- |
| R3 | Structured agent handoff packs | Extend build_continuity_pack to role transitions. Larry → Council should carry changed_files + diff summary. | Medium |
| R4 | Multiple workflow templates | simple_fix (Larry → done), research_only (Scott → done), security_audit (security + reviewer → done). Right-size the pipeline. | Low |
| R5 | Formal workflow state machine | Allowed transitions prevent invalid states. Currently implicit in status + action history. | Medium |
| R6 | Runtime archive on close | Transcript excerpt, changed files, commands run, cost, close reason. Audit trail. | Low |
| R7 | Pre-deploy regression command | Combined runtime + workflow + Telegram + scheduler smoke test. Release safety. | Medium |
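For R5, one way to make the allowed transitions explicit is a lookup table checked on every status change. The sketch below is illustrative only: `needs_input` and `review_ready` echo statuses mentioned in the acceptance tests, while the other state names, the `ALLOWED` table, and the `transition` function are assumptions.

```python
# Hypothetical transition table: current status -> statuses it may move to.
ALLOWED = {
    "todo": {"in_progress"},
    "in_progress": {"needs_input", "review_ready"},
    "needs_input": {"in_progress"},
    "review_ready": {"in_progress", "done"},  # back to in_progress models "return for fixes"
    "done": set(),
}


class InvalidTransition(ValueError):
    """Raised when a status change is not listed in ALLOWED."""


def transition(current, target):
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target


status = transition("in_progress", "review_ready")
print(status)
```

Rejecting unlisted moves at a single choke point replaces the implicit rules currently spread across status and action history with one table that can be reviewed and tested.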

P2 — Plan For (Good Value, Higher Effort)

| # | Action | Why | Effort |
| --- | --- | --- | --- |
| R8 | Agent Registry view | Queryable capabilities, status, tool access. Enables supervisor pattern. | Medium |
| R9 | Structured workflow events | JSONL per workflow step. Observability without heavy framework. | Low-Medium |
| R10 | Next Action intelligence | PRD requirement. Derive next step from workflow state + recent activity. | Medium |
| R11 | Telegram per-target queueing | Architecture doc Roadmap #8. | Medium |
| R12 | Role-specific runtime profiles | Per-phase profile overrides. Tessa gets browser-enabled profile, Council gets API profile. | Medium |
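R9 proposes one JSONL record per workflow step. A minimal sketch of an appender and reader is shown here, writing to any text stream so it can target a file or an in-memory buffer. The record fields (`ts`, `workflow_id`, `step`, `event`) and both function names are assumptions, not an existing schema.

```python
import io
import json
import time


def append_event(stream, workflow_id, step, event, **fields):
    """Write one JSON object per line; extra keyword fields are merged into the record."""
    record = {"ts": time.time(), "workflow_id": workflow_id,
              "step": step, "event": event, **fields}
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def read_events(stream):
    """Parse every non-empty line back into a dict."""
    return [json.loads(line) for line in stream if line.strip()]


# In-memory usage; a real log would open a file in append mode instead.
buffer = io.StringIO()
append_event(buffer, "wf-1", "review", "phase_started", role="council")
append_event(buffer, "wf-1", "review", "phase_finished", passed=True)
buffer.seek(0)
for entry in read_events(buffer):
    print(entry["step"], entry["event"])
```

Because each line is an independent JSON object, the log can be appended to without rewriting earlier entries and filtered with ordinary line-oriented tools.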

P3 — Explore (Strategic, Longer Horizon)

| # | Action | Why | Effort |
| --- | --- | --- | --- |
| R13 | Structured output protocol | JSON-mode output to complement sentinel-based harvest. Start with new agents. | High |
| R14 | Kimi CLI smoke checks | Expand provider coverage. | Medium |
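R13 describes JSON-mode output as a complement to sentinel-based harvest, not a replacement. One possible shape for that coexistence is a harvester that prefers a JSON object on the final line and otherwise falls back to scanning for sentinels. The `harvest` function, the JSON keys, and the returned dict layout are hypothetical; the `QUESTION:` and `DONE:` sentinels are the ones named in the acceptance tests.

```python
import json

SENTINELS = ("QUESTION:", "DONE:")


def harvest(output):
    """Prefer a JSON object with a "status" key on the last line, else scan for sentinels."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if lines:
        try:
            parsed = json.loads(lines[-1])
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict) and "status" in parsed:
            return {"source": "json", **parsed}
    # Fallback: the most recent line that starts with a known sentinel wins.
    for line in reversed(lines):
        for sentinel in SENTINELS:
            if line.startswith(sentinel):
                return {"source": "sentinel",
                        "status": sentinel.rstrip(":").lower(),
                        "message": line[len(sentinel):].strip()}
    return {"source": "none", "status": "unknown", "message": ""}


print(harvest("ran tests\nDONE: all green"))
print(harvest('ran tests\n{"status": "done", "message": "all green"}'))
```

With this ordering, existing agents that only emit sentinels keep working unchanged, while new agents can opt in to structured output, which matches the "start with new agents" note in the table.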

Defer (Against Design Principles)

| # | Action | Why Defer |
| --- | --- | --- |
| D1 | Dynamic agent composition | Architecture doc: "target is not 'many agents by default'" |
| D2 | In-process agent runtime | tmux-backed is a strength, not a weakness |
| D3 | Framework migration (LangGraph/CrewAI) | Would lose production advantages |

8. Maturity Model

| Level | Description | Status |
| --- | --- | --- |
| L1 | Single agent, one-shot tasks | ✅ Exceeded |
| L2 | Multi-agent, manual dispatch | ✅ Exceeded |
| L3 | Workflow-driven, phased, state tracking | Current |
| L4 | Conditional routing, branching on results | Next target |
| L5 | Self-orchestrating, agents decide workflow | ⬜ Explicitly deferred |

The system should target Level 4. That means conditional routing (P0), multiple templates (P1), and structured handoffs (P1). Level 5 is the right thing to defer — the design philosophy of "Luci as default, explicit routing" is correct for a system where Elmar is the primary user and wants to stay in control.


9. Conclusion

Luci/Mission Control is a well-architected agentic system that's been refined through real production use (891 tickets, 18K messages, 39K task runs). It has genuine advantages over commercial frameworks in memory architecture, human-in-the-loop control, cost awareness, and runtime persistence.

The primary gap is workflow maturity — a single sequential template with manual dispatch is appropriate for Level 3 but insufficient for Level 4. Adding conditional routing, multiple templates, and structured agent handoffs would close this gap without requiring an architecture rewrite.

The 10-item roadmap from the architecture doc remains valid and should be executed in priority order, supplemented by the 5 new items identified here. No roadmap items should be dropped; several should be promoted in priority.


Report generated by Lucienne with specialist subagent analysis. Individual subagent reports available at /tmp/luci-arch-assessment.md, /tmp/luci-gap-assessment.md, /tmp/luci-benchmark-assessment.md.