RECURRING infra bug. When a pool slot is re-claimed for a new ticket, its git branch is NOT reset to slot{N}/mc-MC-<new-id> — it stays on the PREVIOUS ticket's branch. Worker th...
State: Done · Next Action: Closed · Owner: Luci · Runtime: Closed · Age: 14d ago
Ticket is done; runtime is closed. · cwd /home/lucienne/workspace/state/control-room-worktrees/mc-4464-pool-fix · uptime 13d 20h · last activity 13d 19h ago
Description
MC-4464
RECURRING infra bug. When a pool slot is re-claimed for a new ticket, its git branch is NOT reset to slot{N}/mc-MC-<new-id> — it stays on the PREVIOUS ticket's branch. Worker then sees wrong cwd/branch and either (a) freezes (correct, per circuit-breaker) or (b) risks committing onto another ticket's branch / clobbering its work.
Observed:
- MC-4457 dispatched to pool-2, but live 'git branch --show-current' = slot2/mc-MC-4455 (previous claimant). Worker froze, nothing committed. Correct stop, but wasted a dispatch.
- MC-4431 earlier: pool slots reset from STALE origin/main (Feb-13 77a2275) while live code is master — same family of mis-wiring (pool-claim/409 'unsafe runtime cwd' failures).
- mem obs 7192 documents the slot mis-registration.
Root cause candidates (confirm): worktree_pool.py claim path not running reset-on-claim, OR reset targeting wrong base ref (origin/main vs master), OR branch-rename step skipped when slot already has a namespaced branch from prior claim.
Action:
1. Audit worktree_pool.py claim() — ensure every claim does: fetch, hard-reset to origin/master (NOT stale origin/main), checkout/create slot{N}/mc-MC-<new-id>, verify branch == expected before returning slot.
2. Add a pre-dispatch assertion in mc_pickup.py: refuse to inject a worker if slot branch != slot{N}/mc-MC-<ticket-id>; re-provision instead.
3. Reconcile origin/main vs master base-ref confusion (mission-control deploy uses master).
Source: MC-4457 + MC-4431 parked-dispatch evidence. Build via dev-loop; council the claim() reset sequence.
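The claim sequence in Action item 1 could be sketched roughly as follows. This is a minimal illustration, not the code in worktree_pool.py: `claim_slot`, `expected_branch`, `run_git`, and the injectable `git` parameter are hypothetical names, and the sketch assumes the ticket id is passed in the form `MC-4457`. It switches to the new branch before the hard reset so that the reset cannot move the previous ticket's branch.

```python
import subprocess
from typing import Callable, List

# A git runner takes (args, cwd) and returns stdout. It is injectable so the
# sequence can be exercised without a real repository (hypothetical design).
GitRunner = Callable[[List[str], str], str]


def run_git(args: List[str], cwd: str) -> str:
    """Run `git <args>` in cwd and return stripped stdout; raises on failure."""
    done = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return done.stdout.strip()


def expected_branch(slot: int, ticket_id: str) -> str:
    """Branch name per the ticket's convention: slot{N}/mc-MC-<id>."""
    return f"slot{slot}/mc-{ticket_id}"


def claim_slot(
    slot_dir: str,
    slot: int,
    ticket_id: str,
    base_ref: str = "origin/master",  # assumption: master is the live base
    git: GitRunner = run_git,
) -> str:
    """Reset a pool slot for a new ticket and verify the branch before returning."""
    branch = expected_branch(slot, ticket_id)
    git(["fetch", "origin"], slot_dir)
    # -B creates the branch, or resets it if it already exists, at base_ref.
    # Doing this first means the hard reset below never touches the previous
    # claimant's branch.
    git(["checkout", "-B", branch, base_ref], slot_dir)
    git(["reset", "--hard", base_ref], slot_dir)
    current = git(["branch", "--show-current"], slot_dir)
    if current != branch:
        raise RuntimeError(
            f"slot {slot} is on {current!r}, expected {branch!r}; refusing to hand out"
        )
    return branch
```

Because every git call goes through `run_git` with `check=True`, any failing step (for example a checkout blocked by conflicting local changes) raises instead of returning a half-reset slot.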
Activity
Details: Done · Critical · Luci · api (human) · Infrastructure · 14d ago
Operator console: ticket_done. MC is visibility-only. Hermes Luci launches and gates work outside MC, then mirrors evidence/status here.
Raw console: luci · claude/opus[1m]
Scope add: diagnose the dual-dispatcher conflict FIRST.
Symptoms today: auto-pickup re-dispatched already-handled tickets (parent-tracker MC-4451, test MC-4463), 409 CONFLICT collisions on inject, garbled injection-echo QUESTIONs, and a pool slot on the wrong branch (MC-4457 → slot2/mc-MC-4455).
Prior MC-4431 notes said the legacy mc_pickup pool path was 'unsafe after the Control Room cutover' and 'no mc_pickup.py live', yet MC-4457 DID land in a pool slot. So BOTH the legacy pool dispatcher and the Control Room dispatcher may be partially active and colliding (two injectors → 409).
Builder:
1. Determine which dispatch path is authoritative NOW (pool vs control-room).
2. Disable/guard the other so only one injects.
3. THEN fix pool reset-on-claim branch mis-registration and add the pre-dispatch branch assertion.
This is live-MC-core (MC-3725 precedent): dev-loop + council mandatory, build in clean runtime, no blind edits. PRIORITY: this unblocks frozen MC-4457 and stops the thrash. Marking critical.
lucienne · 14d ago
Controller: worker completed. Commit fac7f18e merged to master. 8/8 regression tests pass. Worktree pool claim() now uses checkout -B (idempotent), detects live master vs stale main via newest-tip tiebreak, asserts HEAD matches slot branch before returning. Moving to in_review for QA.
luci · 14d ago
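For readers unfamiliar with the "newest-tip tiebreak" mentioned in the controller note above, one plausible reading is: among the candidate base refs, pick the one whose tip commit is most recent. The sketch below illustrates that idea only. `pick_base_ref` and `run_git` are hypothetical names, and the actual logic in commit fac7f18e may differ.

```python
import subprocess
from typing import Callable, Dict, List, Sequence

GitRunner = Callable[[List[str], str], str]


def run_git(args: List[str], cwd: str) -> str:
    """Run `git <args>` in cwd and return stripped stdout; raises on failure."""
    done = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return done.stdout.strip()


def pick_base_ref(
    repo_dir: str,
    candidates: Sequence[str] = ("origin/master", "origin/main"),
    git: GitRunner = run_git,
) -> str:
    """Return the candidate ref whose tip has the newest committer timestamp."""
    tips: Dict[str, int] = {}
    for ref in candidates:
        try:
            # %ct prints the committer date as a Unix timestamp.
            tips[ref] = int(git(["log", "-1", "--format=%ct", ref], repo_dir))
        except (subprocess.CalledProcessError, ValueError):
            continue  # ref does not exist in this repository; skip it
    if not tips:
        raise RuntimeError(f"none of {list(candidates)} exist in {repo_dir}")
    # On equal timestamps, max() keeps the earliest candidate in the list.
    return max(tips, key=lambda ref: tips[ref])
```

With a live `origin/master` and an `origin/main` last updated in February, this returns `origin/master`. A repository that only has `origin/main` still resolves, because missing refs are skipped rather than treated as errors.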
Control Room triage: promoting to todo. Legacy pickup confirmed disabled (both task files enabled:false, no live mc_pickup processes, no recent task_runs). The dual-dispatcher concern in the scope-add comment appears resolved — the collision was from the auto-pickup era. Remaining root cause: worktree_pool.py claim() not resetting branch on re-claim. This needs a dedicated worker with dev-loop.
Will dispatch via Control Room direct-worker path (not mc_pickup).
luci · 14d ago
Control Room plan: MC-4464 is ready for direct-worker dispatch.
Target file: ~/workspace/worktree_pool.py (1215 lines, workspace root repo).
Bridge file: ~/workspace/mission-control/runtime_pool.py (307 lines).
Fix scope:
1. Audit worktree_pool.py claim() — ensure every claim: fetch, hard-reset to origin/master (NOT origin/main), create/checkout slot{N}/mc-MC-<new-id>, verify branch == expected.
2. Add pre-dispatch assertion: refuse to inject if slot branch != expected.
3. Reconcile origin/main vs master base-ref confusion.
This is MC-core infra. Needs dev-loop + council. Will dispatch as direct Control Room worker.
luci · 14d ago
Expected check-in: worker should show first artifact (plan/diff) within 30 minutes of dispatch.
luci · 14d ago
Control Room dispatch: Codex CLI (o4-mini) launched in isolated worktree.
Worktree: /home/lucienne/workspace/state/cr-mc-4464 (branch cr/mc-4464-pool-fix)
Target: worktree_pool.py claim() fix - proper branch reset on re-claim
tmux: cr-MC-4464
Expected check-in: first artifact (diff/test) within 30 min
luci · 14d ago
Worker correction: Codex CLI hit usage limit; fell back to Claude Code CLI (opus[1m]). Worker is initializing in /home/lucienne/workspace/state/control-room-worktrees/mc-4464-pool-fix. tmux: cr-MC-4464, PID: 1650738.
luci · 14d ago
Control Room gate: MC-4464 verified and closed.
Commit fac7f18e landed on master and pushed to origin.
8/8 regression tests pass (extracted from commit, run independently).
Changes: _detect_default_base (live-master vs stale-main tiebreak), checkout -B for idempotent re-claim, pre-dispatch assertion with re-provision-once + abort+free.
No mission-control repo changes needed — fix is self-contained in worktree_pool.py.
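The "re-provision-once + abort+free" behaviour named in the gate note can be pictured as a small guard that runs before a worker is injected. The sketch below is an assumption about the control flow, not the merged code. `ensure_slot_ready` and its three callables are hypothetical stand-ins for whatever worktree_pool.py uses to read the live branch, re-run the claim reset, and release the slot.

```python
from typing import Callable


def ensure_slot_ready(
    slot: int,
    ticket_id: str,
    current_branch: Callable[[], str],  # e.g. wraps `git branch --show-current`
    reprovision: Callable[[], None],    # re-runs the claim reset for this slot
    free_slot: Callable[[], None],      # releases the slot back to the pool
) -> str:
    """Return the verified branch, or free the slot and raise."""
    expected = f"slot{slot}/mc-{ticket_id}"
    if current_branch() == expected:
        return expected
    # First mismatch: try exactly one re-provision, then check again.
    reprovision()
    if current_branch() == expected:
        return expected
    # Still wrong: abort the dispatch and free the slot so it is not left
    # claimed on another ticket's branch.
    free_slot()
    raise RuntimeError(
        f"slot {slot} not on {expected!r} after one re-provision; dispatch aborted"
    )
```

Limiting the retry to a single re-provision keeps a persistently broken slot from looping, while freeing it on failure avoids the frozen-dispatch situation described for MC-4457.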