
MC-4301

Activate QA reviewer (shadow reviewer to acting)
2026-06-13 08:50:47 SAST
State: Done · Next Action: Closed · Owner: Luci · Runtime: Closed · Age: 17d ago
Ticket is done; runtime is closed. · profile claude_opus_1m_medium · cwd /home/lucienne/workspace/mission-control · uptime 16d 18h · last activity 16d 16h ago

Description

# MC-4290: Activate the QA reviewer (shadow reviewer → acting)

**Priority:** high
**Assigned:** luci
**Depends on:** nothing (can start immediately)

## What to do

The shadow reviewer (`mc_shadow_review.py`) exists but is not switched on. It currently logs verdicts without acting. This ticket makes it the active QA reviewer in the MC workflow.

## Steps

1. **Enable the flag.** Set `MC_ORCH_SHADOW_REVIEW=1` in the scheduler task environment (the task that calls `mc_pickup.py` or the orchestrator drain). Verify by running `python3 -c "import mc_shadow_review; print(mc_shadow_review._flag_on())"`; it should print `True`.
2. **Wire the verdict into the digest.** In `mc_pickup.py`, find `drain_orchestrator_inbox()`. When it builds the digest for persistent-Luci, include the latest shadow reviewer verdict for each ticket. Format: `[REVIEW] verdict=<pass|fail|uncertain> confidence=<0-1> reasons=<text>`. If no review exists for a ticket, omit the line.
3. **Prove it catches real failures.** Let it run for 24 hours on real tickets. Check the `shadow_reviews` table for new rows. Verify:
   - `verdict` is `pass` or `fail` (not always `uncertain`)
   - `reasons` references the actual diff or evidence (not generic boilerplate)
   - `confidence` is a real number, not always 0.5

   If all reviews are `uncertain` or generic, the LLM prompt in `mc_shadow_review.py` needs tuning; log this as a blocker.
4. **Calibrate against human decisions.** The `human_decision` field should be filled after a ticket is resolved (done/cancelled/blocked). Add a scheduled task or operator step that backfills `human_decision` from the final ticket status for all reviewed tickets.
5. **Commit and push.** All changes in the mission-control repo, on master.
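A minimal sketch of steps 2 and 3, not the actual `mc_pickup.py` code. The helper names `format_review_line` and `looks_untuned` are hypothetical, and the review is assumed to arrive as a mapping with `verdict`, `confidence`, and `reasons` keys. Rendering the confidence with two decimals is also an assumption, since the ticket only specifies a value between 0 and 1.

```python
from typing import Any, Mapping, Optional, Sequence


def format_review_line(review: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Build the digest line for one ticket, or None when no review exists."""
    if review is None:
        return None  # step 2: omit the line when the ticket has no review
    confidence = float(review["confidence"])
    # Collapse whitespace so the verdict stays on a single digest line.
    reasons = " ".join(str(review["reasons"]).split())
    return (
        f"[REVIEW] verdict={review['verdict']} "
        f"confidence={confidence:.2f} reasons={reasons}"
    )


def looks_untuned(reviews: Sequence[Mapping[str, Any]]) -> bool:
    """Step 3 heuristic: every verdict is 'uncertain' or every confidence is 0.5.

    An empty sequence returns False; missing rows are a separate problem
    (for example, the flag not being enabled).
    """
    if not reviews:
        return False
    all_uncertain = all(r["verdict"] == "uncertain" for r in reviews)
    all_half = all(float(r["confidence"]) == 0.5 for r in reviews)
    return all_uncertain or all_half
```

The heuristic cannot tell whether `reasons` is generic boilerplate; that check still needs a human reading the rows.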
## Acceptance criteria

- `MC_ORCH_SHADOW_REVIEW` flag is enabled in the live scheduler task environment
- The digest sent to persistent-Luci includes `[REVIEW]` lines for tickets with shadow reviews
- At least 3 shadow reviews exist in the database with non-generic verdicts
- `human_decision` backfill is running

## If blocked

- If the flag module doesn't know about `shadow_review`, add it to the `FLAGS` dict in `mc_orchestrator_flags.py`
- If the digest function doesn't exist, create it: read `shadow_reviews` by `ticket_id` and take the latest row
- Do NOT change ticket status or auto-return-for-fixes; that's MC-4291, not this ticket

## What NOT to do

- Do not wire the review loop (that's MC-4291)
- Do not change the operator (that's MC-4293)
- Do not touch the inbox cleanup (that's MC-4294)
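The "latest row" lookup and the `human_decision` backfill could look roughly like the sketch below. It assumes a SQLite database and a `shadow_reviews` table with an auto-incrementing `id` plus `ticket_id`, `verdict`, `confidence`, `reasons`, and `human_decision` columns. None of that schema is confirmed by the ticket, and both function names are hypothetical. Storing the final status string itself in `human_decision` is also an assumption.

```python
import sqlite3
from typing import Mapping, Optional, Tuple

# Statuses the ticket treats as "resolved" for calibration purposes.
RESOLVED = ("done", "cancelled", "blocked")


def latest_review(
    conn: sqlite3.Connection, ticket_id: str
) -> Optional[Tuple[str, float, str]]:
    """Return (verdict, confidence, reasons) for the newest review, or None."""
    # Assumes a higher id means a newer row.
    return conn.execute(
        "SELECT verdict, confidence, reasons FROM shadow_reviews "
        "WHERE ticket_id = ? ORDER BY id DESC LIMIT 1",
        (ticket_id,),
    ).fetchone()


def backfill_human_decision(
    conn: sqlite3.Connection, final_status: Mapping[str, str]
) -> int:
    """Fill human_decision for resolved tickets; return the number of rows updated."""
    updated = 0
    for ticket_id, status in final_status.items():
        if status not in RESOLVED:
            continue  # ticket still open, nothing to calibrate against yet
        cur = conn.execute(
            "UPDATE shadow_reviews SET human_decision = ? "
            "WHERE ticket_id = ? AND human_decision IS NULL",
            (status, ticket_id),
        )
        updated += cur.rowcount
    conn.commit()
    return updated
```

Restricting the update to rows where `human_decision IS NULL` keeps the backfill idempotent, so a scheduled task can run it repeatedly without overwriting earlier values.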

Activity

done
No activity yet