⌂ Home ☷ Board

Orchestration & Delegation Framework for Elmar

Date: 2026-05-31
Author: Lucienne (architectural review)
Status: Practical guide — how to run a 7-person virtual team with Hermes + Claude Code


1. The Simple Mental Model: "A Restaurant Kitchen"

Imagine your team as a restaurant kitchen:

Role Kitchen Analogy What They Actually Do
Planner Head chef / Expediter Reads the order, breaks it into steps, decides who cooks what
Designer Pastry chef / Plating lead Makes it look beautiful and feel right before it goes out
Builder/Coder Line cook Cooks the dish — writes the code, edits the files
Reviewer Sous chef tasting Tastes before it leaves — code review, logic check
QA Coder Food safety inspector Runs tests, checks for bugs, verifies it works
QA Design Visual inspector Checks the plate looks right on mobile and desktop
Browser Tester Mystery shopper Actually sits down and tries to eat the meal — real user flow

The key insight: Not every order needs all seven people. A cup of coffee (fix a typo) just needs the Builder. A tasting menu (new feature) needs the full kitchen.


2. How Work Flows: The "Ticket → Runtime → Evidence" Loop

Elmar has an idea
    ↓
Planner (Luci) creates a ticket in Mission Control
    ↓
Planner decides: "Is this simple or complex?"
    ↓
    ├─ Simple → Builder does it directly in one runtime session
    └─ Complex → Designer briefs → Builder codes → Reviewer checks → QA tests → Browser Tester validates
    ↓
Evidence (screenshots, tests, diffs) harvests back into the ticket
    ↓
Planner shows Elmar: "Done. Here's the proof."

The golden rule: Every piece of work becomes a ticket in Mission Control. Every ticket gets a runtime (a live working session). Every runtime produces evidence that gets recorded back into the ticket.


3. Mapping Your 7-Person Team to Actual Tools

The Team → Tool Mapping

Team Role Primary Tool When to Use Concrete Example
Planner Luci (Hermes/Claude Code in tmux) Every ticket starts here "Create ticket MC-4500: fix login button on mobile"
Designer Luci with design rubric + browser vision UI/UX tickets before coding "Design brief: 375px mobile, chat-first layout, sticky composer"
Builder/Coder Claude Code CLI (claude in tmux) All code work claude --mcp-config mc-coord-mcp-config.json in ticket runtime
Reviewer Claude Code or Codex CLI (codex) Code review phase Spawn Codex in tmux, feed it the diff + brief, ask for verdict
QA Coder Claude Code + pytest/regression scripts Test verification Run scripts/mc_regression.sh, attach output to ticket
QA Design Browser tool + vision model (Tessa) Visual/mobile validation Screenshot at 375px, vision model checks spacing, tap targets
Browser Tester Browser automation + Claude Code End-to-end user flows Navigate real routes, fill forms, verify console errors

Where the "People" Actually Live

Your 7-person team is not 7 separate computers. They are roles that run in runtime sessions:

One person (Claude Code) can wear multiple hats — but only one hat at a time per runtime. The Planner decides when to switch hats or spawn a new worker.


4. Decision Tree: Which Tool for Which Job?

START: You have work to do.
    │
    ▼
┌─────────────────────────────────────┐
│ Is this a recurring health check    │
│ or deterministic script?            │
└─────────────────────────────────────┘
    │
    ├─ YES → Use Hermes cronjob with `no_agent: true`
    │         (cheap, runs every 5-15 min, only calls LLM on events)
    │
    └─ NO → Continue
              │
              ▼
┌─────────────────────────────────────┐
│ Does this need file edits, shell,   │
│ or MCP tools?                       │
└─────────────────────────────────────┘
    │
    ├─ YES → Use a CLI runtime (Claude Code, Hermes, Codex)
    │         in a tmux session
    │
    └─ NO → Continue
              │
              ▼
┌─────────────────────────────────────┐
│ Is this a quick analysis, summary,  │
│ or triage with no tools needed?     │
└─────────────────────────────────────┘
    │
    ├─ YES → Use direct API call (cheap, fast, no tmux needed)
    │         Example: "Summarize these 10 tickets"
    │
    └─ NO → Continue
              │
              ▼
┌─────────────────────────────────────┐
│ Is this a bounded, focused task     │
│ that needs isolation from the main  │
│ runtime?                            │
└─────────────────────────────────────┘
    │
    ├─ YES → Use Hermes `delegate_task` (subagent)
    │         Spawns a fresh session, clean context, returns result
    │         Good for: research, council review, one-off analysis
    │
    └─ NO → Use the main ticket runtime (Claude Code in tmux)
              This is the default for most work.

Quick Reference Table

Situation Tool Why
Health check every 5 min Hermes cronjob no_agent Costs ~$0.01/day, only wakes LLM on events
Write code, edit files Claude Code CLI in tmux Full tool access, file edits, MCP, skills
Quick summary of tickets Direct API (Kimi/GLM) Cheap, fast, no setup
Code review / second opinion Codex CLI or subagent Fresh eyes, isolated context
Visual QA / mobile check Browser + vision model Real screenshots, real DOM
Research / scout work Subagent with web search Bounded, doesn't clutter main runtime
Long-running task ( > 30 min) terminal(background=true) Runs without blocking, you check later
Urgent fix, needs hands-on Claude Code directly in tmux Fastest path from brain to file

5. Maximizing Your Claude Code Subscription

Claude Code is your premium tool. Here's how to get the most from it:

What Claude Code Is Best At

What Claude Code Is NOT For

The 80/20 Rule for Claude Code

Use Claude Code For Don't Use Claude Code For
Writing and editing code Health checks and polling
Interactive debugging Simple text summaries
Complex architectural decisions Deterministic data transforms
Code review and QA Scheduled status reports
Browser-based user flow testing Bulk ticket triage

Concrete Pattern: "Claude Does the Hard Stuff, Others Do the Rest"

Ticket arrives
    ↓
Planner (Luci, cheap Hermes/Kimi) reads and routes it
    ↓
Builder (Claude Code) writes the code
    ↓
Reviewer (Codex or Claude Code second pass) checks the diff
    ↓
QA Coder (Claude Code + pytest) runs tests
    ↓
QA Design (browser + vision model) checks mobile screenshot
    ↓
Planner (Luci) synthesizes and reports to Elmar

Cost math: If Claude Code costs $0.50 per code task and Kimi costs $0.02 per routing task, you save 90% by keeping Claude for code and using cheaper models for routing.


6. Keeping Visual and Design Consistency

The Design Brief Pattern

Before any Builder starts coding, the Planner must attach a design brief to the ticket. This is a checklist:

DESIGN BRIEF for MC-4500
- Target: Mobile login screen (375px viewport)
- User flow: Tap "Login" → enter credentials → tap "Submit"
- Visual standard: Follow Material 3 bottom-sheet pattern
- Colors: Use existing CSS variables (--mc-primary, --mc-surface)
- Touch targets: Minimum 48px for all buttons
- Composer: Sticky bottom, never overlap content
- Evidence required: 375px screenshot + desktop screenshot + console log

The rule: No UI ticket moves to "In Progress" without a design brief. The brief is created by the Planner (with Designer input if needed) and enforced by the Reviewer.

The Visual QA Gate

For every UI ticket, the Browser Tester must provide:

  1. 375px mobile screenshot — actual browser render, not DOM theory
  2. Desktop screenshot — verify it still looks good
  3. Console log — no JavaScript errors
  4. Tap/scroll verification — actual interaction, not just visual

The gate: A ticket cannot move to "Done" without these four items. Mission Control enforces this with the mobile_review_required flag.

Consistency Through Reuse, Not Memory

Don't rely on the model to "remember" the design system. Instead:


7. The Control Room: How It All Comes Together

The Simplest Version

Elmar says: "Fix the mobile login button"
    ↓
Luci (Planner) creates ticket MC-4500
    ↓
Luci writes design brief (Designer hat)
    ↓
Luci spawns Claude Code runtime (Builder hat)
    ↓
Claude Code writes code, runs tests
    ↓
Claude Code signals DONE → ticket moves to Review
    ↓
Luci spawns Codex runtime (Reviewer hat) → approves
    ↓
Tessa (Browser + vision) checks 375px screenshot → approves
    ↓
Luci moves ticket to Done, reports to Elmar

What Actually Happens in the System

  1. Ticket created in Mission Control (SQLite/PostgreSQL)
  2. Runtime session created in runtime_sessions table
  3. tmux session spawned: mc-ticket-4500
  4. Claude Code launched inside tmux with MCP config
  5. Work happens — file edits, tests, browser checks
  6. Signal file written: state/mc-signals/MC-4500.json
  7. Harvest reads signal, updates ticket status, records history
  8. Runtime closed or kept warm for next phase

8. Summary: The "One-Pager" for Elmar

Your Virtual Team

Your Tools

Your Decision Tree (Memorize This)

  1. Recurring/polling? → Cronjob no_agent
  2. Need file edits or tools? → Claude Code in tmux
  3. Quick summary, no tools? → Direct API (Kimi/GLM)
  4. Isolated focused task? → Subagent (delegate_task)
  5. Everything else? → Main ticket runtime

Your Cost Control

Your Quality Control


9. Appendix: Concrete Commands

Spawn a Builder (Claude Code) for a ticket

# In Luci's orchestrator session
tmux new-session -d -s mc-ticket-4500 -c ~/workspace/worktrees/mc-4500
claude --mcp-config ~/workspace/mission-control/mc-coord-mcp-config.json

Spawn a Reviewer (Codex) for code review

tmux new-session -d -s mc-review-4500 -c ~/workspace/worktrees/mc-4500
codex --approval-mode full-auto
# Feed it: git diff + design brief + review checklist

Run a cheap summary (Kimi API)

# Direct API call, no tmux, no tools
import openai
client = openai.OpenAI(base_url="https://api.moonshot.cn/v1", api_key=...)
response = client.chat.completions.create(model="kimi-latest", messages=[...])

Delegate a research task (Hermes subagent)

# In Hermes, use delegate_task
# Spawns isolated session, runs task, returns result to parent

Health check cron (no_agent)

{
  "schedule": "*/5 * * * *",
  "no_agent": true,
  "command": "python3 ~/workspace/mission-control/scripts/health_check.py"
}

Bottom line for Elmar:

Mission Control is your kitchen whiteboard. Claude Code is your head chef. The other models are your prep cooks and line cooks. You (the Planner) decide what gets cooked and who cooks it. The system records everything so you never lose track. Keep Claude on the hard stuff, use cheap models for the easy stuff, and never skip the design brief or the screenshot gate.