State: Done · Next Action: Closed · Owner: Luci · Runtime: Closed · Age: 17d ago
Ticket is done; runtime is closed. · profile claude_opus_1m_medium · cwd /home/lucienne/workspace/mission-control · uptime 16d 18h · last activity 16d 15h ago
Description
MC-4302
# MC-4291: Wire the Worker → Reviewer → retry loop
**Priority:** high
**Assigned:** luci
**Depends on:** MC-4290 (shadow reviewer must be active first)
## What to do
When a Worker reports DONE and the QA reviewer says "not done" (verdict=fail), the Controller should send the Worker back to fix it — not ask Elmar. This creates the automatic retry loop.
## Steps
1. **Hook the harvest path.** Find where MC currently processes the `DONE:` sentinel from workers (likely in `mc_pickup.py` or `ticket_runtime.py`). After the DONE is detected, check `shadow_reviews` for a verdict on that ticket's `done_sha`.
2. **If verdict is `pass`** → proceed as today (mark done, close ticket or escalate to Elmar if needed).
3. **If verdict is `fail`** → do NOT mark done. Instead:
   - Add a comment to the ticket with the reviewer's `reasons` and `gaps`
   - Set ticket status to `todo` (or `in_progress` if the worker runtime is still alive)
   - Increment a `review_cycles` counter on the ticket (add the field to `models.py` if it doesn't exist; default 0, max 3)
   - If `review_cycles` >= 3: set status to `needs_input`, add comment "Max review cycles reached. Needs Elmar decision.", escalate
   - If `review_cycles` < 3: re-dispatch to the same worker with the reviewer's feedback appended to the prompt
4. **If verdict is `uncertain`** → escalate to Elmar with the reviewer's reasoning. Do not auto-retry.
5. **Add a `review_cycles` field.** In `models.py` or wherever tickets are defined, add `review_cycles INTEGER DEFAULT 0`. Add a migration if the DB already has the tickets table. Reset to 0 when a ticket moves from `todo` to `in_progress` for the first time.
6. **Commit and push.**
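The branching in steps 2 to 4 can be sketched as a small pure function. This is only an illustration under stated assumptions, not the shipped code: the `Ticket` dataclass, the `route_verdict` name, and the `"escalate"` action string are hypothetical, and the real ticket model and dispatcher calls in MC will differ. The statuses, the cap of 3, and the escalation comment come from the steps above.

```python
from dataclasses import dataclass, field
from typing import List, Sequence

MAX_REVIEW_CYCLES = 3  # cap from step 3


@dataclass
class Ticket:
    """Hypothetical in-memory stand-in for the real ticket model."""
    key: str
    status: str = "in_progress"
    review_cycles: int = 0
    comments: List[str] = field(default_factory=list)


def route_verdict(
    ticket: Ticket,
    verdict: str,
    reasons: Sequence[str] = (),
    gaps: Sequence[str] = (),
    runtime_alive: bool = False,
) -> str:
    """Apply a reviewer verdict to the ticket and return the next action.

    Returns "advance", "return_for_fixes", or "escalate" (assumed names).
    """
    if verdict == "pass":
        # Existing DONE path: nothing changes for passing reviews.
        ticket.status = "done"
        return "advance"

    if verdict == "fail":
        # Record the reviewer feedback before deciding what happens next.
        ticket.comments.append(
            "Reviewer reasons: " + "; ".join(reasons)
            + " | gaps: " + "; ".join(gaps)
        )
        ticket.review_cycles += 1
        if ticket.review_cycles >= MAX_REVIEW_CYCLES:
            ticket.status = "needs_input"
            ticket.comments.append(
                "Max review cycles reached. Needs Elmar decision."
            )
            return "escalate"
        # Below the cap: send the work back to the same worker.
        ticket.status = "in_progress" if runtime_alive else "todo"
        return "return_for_fixes"

    # "uncertain" (or any unknown verdict) is never retried automatically.
    ticket.status = "needs_input"
    ticket.comments.append(
        "Reviewer verdict '" + verdict + "': " + "; ".join(reasons)
    )
    return "escalate"
```

Keeping the decision separate from the side effects (commenting, re-dispatching) makes the three verdict paths easy to unit-test without a live runtime; the caller would map the returned action onto the real MC operations.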
## Acceptance criteria
- When a Worker reports DONE and the QA reviewer verdict is `fail`, the ticket goes back to the Worker with feedback — not to Elmar
- The `review_cycles` counter increments each retry
- At 3 cycles, the ticket escalates to `needs_input` with a clear message
- `uncertain` verdicts always escalate to Elmar
- The existing DONE path still works for `pass` verdicts
## If blocked
- If there's no clean hook point for the DONE sentinel processing, add one — but do not break the existing harvest
- If the worker runtime is dead when retry fires, create a new worker dispatch (mc_pickup handles this)
- Test with a fake ticket: set `MC_ORCH_SHADOW_REVIEW=1`, create a test ticket, force a `fail` verdict, verify the loop
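For the fake-ticket test above, a throwaway database is one way to force a `fail` verdict and watch the counter move. The sketch below assumes the ticket store is SQLite, which the ticket does not state, and uses a hypothetical minimal schema: the column names `ticket_key`, `done_sha`, and `verdict`, and the helpers `ensure_review_cycles`, `latest_verdict`, and `smoke_test` are all illustrative. It also shows one way to make the `review_cycles` migration safe to run more than once.

```python
import sqlite3


def ensure_review_cycles(conn: sqlite3.Connection) -> None:
    """Add tickets.review_cycles if it is missing; safe to call repeatedly."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(tickets)")]
    if "review_cycles" not in columns:
        conn.execute(
            "ALTER TABLE tickets ADD COLUMN review_cycles INTEGER DEFAULT 0"
        )
        conn.commit()


def latest_verdict(conn: sqlite3.Connection, ticket_key: str, done_sha: str):
    """Return the newest verdict for this ticket and commit, or None."""
    row = conn.execute(
        "SELECT verdict FROM shadow_reviews "
        "WHERE ticket_key = ? AND done_sha = ? ORDER BY id DESC LIMIT 1",
        (ticket_key, done_sha),
    ).fetchone()
    return row[0] if row else None


def smoke_test():
    """Force a fail verdict on a fake ticket and return (status, cycles)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical minimal schema, only for this test.
    conn.execute("CREATE TABLE tickets (ticket_key TEXT PRIMARY KEY, status TEXT)")
    conn.execute(
        "CREATE TABLE shadow_reviews ("
        "id INTEGER PRIMARY KEY, ticket_key TEXT, done_sha TEXT, verdict TEXT)"
    )
    ensure_review_cycles(conn)
    ensure_review_cycles(conn)  # second call is a no-op

    conn.execute(
        "INSERT INTO tickets (ticket_key, status) VALUES (?, ?)",
        ("MC-TEST", "in_progress"),
    )
    conn.execute(
        "INSERT INTO shadow_reviews (ticket_key, done_sha, verdict) "
        "VALUES (?, ?, ?)",
        ("MC-TEST", "abc123", "fail"),
    )

    if latest_verdict(conn, "MC-TEST", "abc123") == "fail":
        # Reopen the ticket and count the retry.
        conn.execute(
            "UPDATE tickets SET status = 'todo', "
            "review_cycles = review_cycles + 1 WHERE ticket_key = ?",
            ("MC-TEST",),
        )
    return conn.execute(
        "SELECT status, review_cycles FROM tickets WHERE ticket_key = ?",
        ("MC-TEST",),
    ).fetchone()
```

Checking `PRAGMA table_info` before `ALTER TABLE` avoids a duplicate-column error on databases that were already migrated.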
## What NOT to do
- Do not touch Tessa/browser testing (that's MC-4292)
- Do not change the operator (that's MC-4293)
- Do not change the inbox (that's MC-4294)
- Do not auto-fix the Worker's code — just send the feedback back and let the Worker fix it
Activity
Operator console
Ticket is done; runtime is closed. · ticket_marked_done · profile claude_opus_1m_medium · cwd /home/lucienne/workspace/mission-control · uptime 16d 18h · last activity 16d 15h ago
MC is visibility-only. Hermes Luci launches and gates work outside MC, then mirrors evidence/status here.
Raw console: luci · claude/anthropic/opus[1m]
BLOCKED until MC-4301 is done. Once the QA reviewer is active and producing verdicts, wire the fail→retry loop. Do not start until MC-4301 is confirmed working.
luci · 17d ago
Luci picking up this ticket (background worker).
luci · 17d ago
[failed_to_inject] runtime_busy: Ticket runtime is not ready for input (status=running). Wait for the current turn to finish, or explicitly interrupt/restart the runtime before sending more terminal input. This message
Ticket picked up by Luci via MC dispatcher.
MC-4302: Wire Worker to Reviewer to retry loop
Work this ticket in the live tmux runtime. Use DONE:, REVIEW:, or QUESTION: when you need MC to reflect the next state.
luci · 17d ago
[failed_to_inject] runtime_busy: Ticket runtime is not ready for input (status=running). Wait for the current turn to finish, or explicitly interrupt/restart the runtime before sending more terminal input. This message
Ticket picked up by Luci via MC dispatcher.
MC-4302: Wire Worker to Reviewer to retry loop
Work this ticket in the live tmux runtime. Use DONE:, REVIEW:, or QUESTION: when you need MC to reflect the next state.
luci · 17d ago
Ticket runtime send failed while dispatching.
root_cause: HTTP Error 409: CONFLICT
safe_retry: returned to `todo` for a fresh runtime retry.
human_input_required: no
system · 17d ago
RUNTIME TERMINAL STATE (MC-3482 contract)
status: warning
summary: Ticket runtime send failed; parked for automatic recovery.
root_cause: send failed while injecting into ticket runtime: Ticket runtime is not ready for input (status=running). Wait for the current turn to finish, or explicitly interrupt/restart the runtime before sending more terminal input. This message was not sent.
safe_retry: Queued message was returned to pending and the ticket was returned to todo for a fresh runtime retry; no human input is needed.
stop_condition: After the configured retry limit, leave the ticket blocked for operator inspection instead of looping.
human_input_required: no
next_actions:
- Pickup will retry the pending message in a fresh runtime.
- Inspect runtime/send only if the retry fails again.
artifacts:
- ticket:MC-4302
luci · 17d ago
Luci picking up this ticket (background worker).
luci · 17d ago
Ticket runtime send failed while dispatching.
root_cause: HTTP Error 409: CONFLICT
safe_retry: returned to `todo` for a fresh runtime retry.
human_input_required: no
system · 17d ago
RUNTIME TERMINAL STATE (MC-3482 contract)
status: warning
summary: Ticket runtime send failed; parked for automatic recovery.
root_cause: send failed while injecting into ticket runtime: Ticket runtime is not ready for input (status=running). Wait for the current turn to finish, or explicitly interrupt/restart the runtime before sending more terminal input. This message was not sent.
safe_retry: Queued message was returned to pending and the ticket was returned to todo for a fresh runtime retry; no human input is needed.
stop_condition: After the configured retry limit, leave the ticket blocked for operator inspection instead of looping.
human_input_required: no
next_actions:
- Pickup will retry the pending message in a fresh runtime.
- Inspect runtime/send only if the retry fails again.
artifacts:
- ticket:MC-4302
luci · 17d ago
Luci picking up this ticket (background worker).
luci · 17d ago
Ticket runtime send failed while dispatching.
root_cause: HTTP Error 409: CONFLICT
safe_retry: returned to `todo` for a fresh runtime retry.
human_input_required: no
system · 17d ago
RUNTIME TERMINAL STATE (MC-3482 contract)
status: error
summary: Ticket runtime send failed; parked for automatic recovery.
root_cause: send failed while injecting into ticket runtime: Ticket runtime is not ready for input (status=running). Wait for the current turn to finish, or explicitly interrupt/restart the runtime before sending more terminal input. This message was not sent.
safe_retry: Retry limit reached; operator/human inspection is required before another automatic runtime dispatch.
stop_condition: After the configured retry limit, leave the ticket blocked for operator inspection instead of looping.
human_input_required: yes
next_actions:
- Pickup will retry the pending message in a fresh runtime.
- Inspect runtime/send only if the retry fails again.
artifacts:
- ticket:MC-4302
luci · 17d ago
Luci picking up this ticket (background worker).
luci · 17d ago
[system]: Ticket moved to Todo: Wire Worker to Reviewer to retry loop
[system]: Ticket moved to Todo: Wire Worker to Reviewer to retry loop
[system]: Ticket moved to Todo: Wire Worker to Reviewer to retry loop
luci · 17d ago
Ticket runtime send failed while dispatching.
root_cause: HTTP Error 409: CONFLICT
safe_retry: returned to `todo` for a fresh runtime retry.
human_input_required: no
system · 17d ago
RUNTIME TERMINAL STATE (MC-3482 contract)
status: error
summary: Ticket runtime send failed; parked for automatic recovery.
root_cause: send failed while injecting into ticket runtime: Ticket runtime is not ready for input (status=running). Wait for the current turn to finish, or explicitly interrupt/restart the runtime before sending more terminal input. This message was not sent.
safe_retry: Retry limit reached; operator/human inspection is required before another automatic runtime dispatch.
stop_condition: After the configured retry limit, leave the ticket blocked for operator inspection instead of looping.
human_input_required: yes
next_actions:
- Pickup will retry the pending message in a fresh runtime.
- Inspect runtime/send only if the retry fails again.
artifacts:
- ticket:MC-4302
luci · 17d ago
Luci picking up this ticket (background worker).
luci · 17d ago
Ticket picked up by Luci via MC dispatcher.
MC-4302: Wire Worker to Reviewer to retry loop
Work this ticket in the live tmux runtime. Use DONE:, REVIEW:, or QUESTION: when you need MC to reflect the next state.
Luci · 17d ago
DONE: emission. Run end-to-end in\n"
4941: "also emit the matching DONE:/REVIEW:/QUESTION: line as fallback.\n\n"
5543: # this dispatch exists it is authoritative over the stdout DONE:/
5551: _out_pfx.startswith(p) for p in ("DONE:", "REVIEW:", "QUESTION:")
5582: if any(_pfx.startswith(p) for p in ("DONE:", "REVIEW:", "QUESTION:")):
5610: if _session_comments and output.lstrip().upper().startswith("DONE:"):
5614: output = "REVIEW:" + output[5:]  # Replace DONE: prefix with REVIEW:
5650: _output_upper.startswith(p) for p in ("REVIEW:", "QUESTION:", "DONE:")
5757: elif output_upper.startswith("DONE:"):
6391: # Also treat an explicit DONE:/REVIEW:/QUESTION: prefix
6397: .startswith(("DONE:", "REVIEW:", "QUESTION:"))
6434: text or "DONE: Codex reported task complete."
6964: elif output_upper.startswith("DONE:"):
6974: elif output_upper.startswith("DONE:"):
7118: "6. Emit `DONE: merged+closed` as your final line.\n\n"
7126: "Then emit DONE: returned to Larry as your final line."
● Read (/home/lucienne/workspace/.claude/worktrees/pool-2/mc_pickup.py · lines 5700-5899)
⎿ Read 200 lines
● Read (/home/lucienne/workspace/mission-control/mc_shadow_review.py)
⎿ Read 440 lines
luci · 17d ago
Workflow action `review_ready` applied. Work ready for review.
api · 17d ago
Orchestrator gate FAILURE — same pattern as MC-4286. Worker emitted REVIEW with garbled output (terminal animation chars only, no real content) and operator-bounce loop active (2+ cycles 16:30/16:45). No MC-4302 commit exists in any branch (verified git log --all --grep).
Loop circuit-breaker triggered. Reassigning lucienne controller — either re-scope (smaller slice), switch runtime profile (worker keeps wedging on meaty control-plane tickets via codex gpt-5.5 xhigh), or implement controller-side via persistent session.
api · 17d ago
Orchestrator gate FAILURE — same pattern as MC-4286. Worker emitted REVIEW with garbled output (terminal animation chars only). No MC-4302 commit in any branch. Loop circuit-breaker triggered. Reassigning lucienne controller — re-scope or switch runtime profile.
Luci · 17d ago
MC-4291: review retry loop wired — shadow verdict fail reopens ticket with feedback (cap 3), uncertain escalates to needs_input. Flag MC_ORCH_REVIEW_RETRY (default OFF). Branch feat/mc-4291-review-retry-loop in mission-control.
api · 17d ago
Closed (orchestrator). Worker recovered after lucienne reassign and shipped: 4317650 on mission-control master. mc_review_retry.act_on_verdict reads shadow_reviews verdict (pass/fail/uncertain), bumps review_cycles, caps at 3 cycles before needs_input+Elmar comment. Flag MC_ORCH_REVIEW_RETRY (default OFF). Companion to MC-4301 (verdict-in-digest) + MC-4304 (operator skip-reopen). Branch FF-pushed to master.
luci · 17d ago
POST-MERGE / SHADOW-REVIEW EVIDENCE (2026-05-27 19:45 SAST)
1) Branches/master:
- master now at mission-control 579a529 `fix: repair shadow reviewer runner`.
- Required feature commits are ancestors of master:
  - feat/mc-4291-review-retry-loop: 4317650
  - feat/mc-4303-tessa-gate: 0133d58
  - feat/mc-4304-operator-fix-deadworker-killswitch: 3f383e9 + d6a77a0
- `git status --short --branch`: ## master...origin/master
- MC service restarted and active: luci-dashboard.service active after restart.
2) Shadow reviewer root cause/fix:
- Runner was firing but then erroring every run: `AttributeError: module 'mc_shadow_review' has no attribute 'reconcile_human_decisions'`.
- Secondary issue: direct shell runner did not have ZAI/GLM key in process env; semantic_router now reads the key from ~/.hermes/.env fail-open.
- Fix committed/pushed: mission-control 579a529.
- Tests: 79 passed (`test_mc4207_shadow_review`, `test_semantic_router`, `test_mc4291`, `test_mc4303`, `test_mc4304`).
- Manual runner after fix: `reviewed=0 reconciled=0` with no AttributeError.
3) Forced + E2E proof:
- Forced real-ticket review on MC-4303 wrote shadow_reviews id=4: verdict=fail, would_action=return_for_fixes, reviewer_model=glm-4.7.
- E2E smoke ticket MC-4315: dummy worker commit e255dc06010372a01407e5032ddf829f5c01764c; shadow runner wrote shadow_reviews id=5: verdict=pass, would_action=advance, human_decision=accepted_done.
4) Flags enabled after E2E:
- Workspace scheduler commit da608526 enables:
  - ticket-pickup: MC_ORCH_SHADOW_REVIEW=1
  - shadow-review-runner: MC_ORCH_SHADOW_REVIEW=1 MC_ORCH_REVIEW_RETRY=1 MC_ORCH_TESSA_GATE=1
- Runtime flag check: shadow_review=True, review_retry=True, tessa_gate=True, killswitch=False.
luci · 17d ago
Cleanup follow-up: deleted checked-in tests/screenshots/ artifacts and pushed mission-control commit 032182e (`chore: remove checked-in browser screenshots`). .gitignore now blocks tests/screenshots/ and .scratchpad/ so browser/Tessa scratch outputs do not re-enter the repo.
luci-operator · 17d ago
Luci Operator: promoted this assigned `inbox` ticket to `todo` so pickup can run it.
luci-operator · 17d ago
Luci Operator corrected this ticket: `blocked` carried a runtime/worker failure with no unanswered worker `QUESTION:`, so it is back in `todo` for Luci/Larry/Tessa to handle.
luci-operator · 17d ago
Luci Operator corrected this ticket: `blocked` carried a runtime/worker failure with no unanswered worker `QUESTION:`, so it is back in `todo` for Luci/Larry/Tessa to handle.