runtime not set · chat:579:session · failed
Luci
Investigate pool-slot context leak. Start by examining the commit + the two suspect files.
Luci
Now examine the suspect files. Read worktree_pool.py reset + mc_pickup.py brief assembly.
Luci
Files are not in this repo dir. Check the repo layout: where do these files live?
Luci
Two repos in play. `mc_pickup.py`/`worktree_pool.py` live in `~/workspace`; this worktree is the mission-control repo. Read the real source files.
Luci
`~/workspace/` and `~/workspace/mission-control/` are separate repos. The slot branch `slot1/mc-MC-4548` uses the MC-3840 `slotN/mc-MC-<id>` scheme → managed by `worktree_pool.py`. Read the reset + claim + brief-assembly logic.
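
To make the `slotN/mc-MC-<id>` naming scheme concrete, here is a minimal sketch of parsing a slot branch and checking it against a ticket. The regex and both helper names are assumptions for illustration; the actual check in `worktree_pool.py` is not shown in this log and may differ.

```python
import re

# Assumed pattern for the `slotN/mc-MC-<id>` scheme (hypothetical; not copied
# from worktree_pool.py).
SLOT_BRANCH_RE = re.compile(r"^slot(?P<slot>\d+)/mc-MC-(?P<ticket>\d+)$")


def parse_slot_branch(branch: str):
    """Return (slot_number, ticket_number), or None if this is not a slot branch."""
    m = SLOT_BRANCH_RE.match(branch)
    if m is None:
        return None
    return int(m.group("slot")), int(m.group("ticket"))


def branch_matches_ticket(branch: str, slot: int, ticket_id: str) -> bool:
    """True only when the branch names exactly this slot and this ticket.

    `ticket_id` is assumed to look like "MC-4548".
    """
    parsed = parse_slot_branch(branch)
    if parsed is None:
        return False
    return parsed == (slot, int(ticket_id.removeprefix("MC-")))
```

A branch that belongs to another ticket, or that does not follow the scheme at all, fails the check, which is the property a poison-check on a reused slot would rely on.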
Luci
Read `_decide_base` — the resume/fresh decision is the leak suspect.
Luci
Reset path is keyed on `ticket_id` and does `reset --hard base_ref` + `clean -fdx` — thorough, no cross-ticket carryover there. The leak must be in dispatch/brief assembly or the break-glass operator path. Search mc_pickup.py + operator dispatch.
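
The reset sequence described above can be sketched as two git invocations against the slot's worktree. This is an illustration only: `reset_commands` and `run_reset` are hypothetical names, and the real reset code in `worktree_pool.py` is not reproduced here.

```python
import subprocess
from typing import List


def reset_commands(slot_path: str, base_ref: str) -> List[List[str]]:
    """Build the git commands that return a slot to a pristine state.

    `reset --hard` discards tracked changes and moves the branch to base_ref;
    `clean -fdx` removes untracked and ignored files and directories.
    """
    return [
        ["git", "-C", slot_path, "reset", "--hard", base_ref],
        ["git", "-C", slot_path, "clean", "-fdx"],
    ]


def run_reset(slot_path: str, base_ref: str) -> None:
    """Run the reset; any failing git command raises CalledProcessError."""
    for cmd in reset_commands(slot_path, base_ref):
        subprocess.run(cmd, check=True)
```

Because `-x` also removes ignored files, build outputs and caches left by a previous ticket do not survive the reset.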
Luci
Let me use the Grep tool properly on the workspace files.
Luci
Operator/break-glass dev-loops run with `cwd=~/workspace` (not a pool slot). Now check how mc_pickup picks the pool root + repo for an MC ticket, and the pre-claim validation.
Luci
Key finding at line 5530: pool repo is `~/workspace` (workspace repo). But I'm running in `~/workspace/mission-control/.claude/worktrees/pool-1`. Two pools exist. Let me get the git evidence on how 9e22475 originated relative to MC-4460, and check `WorktreePool.shared` repo resolution.
Luci
MC-4371 comment confirms TWO pools: `~/workspace` and `~/workspace/mission-control`. Singleton is keyed on repo. But line 5530 hardcodes `~/workspace`. Find where mission-control repo gets selected as pool root.
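
A singleton keyed on repo can be sketched as a per-path registry: each distinct repo root gets its own pool object, so two repos mean two independent pools. The `Pool` class below is a hypothetical stand-in, not the real `WorktreePool`.

```python
from pathlib import Path
from typing import Dict


class Pool:
    """Hypothetical stand-in for a worktree pool, one instance per repo root."""

    _instances: Dict[Path, "Pool"] = {}

    def __init__(self, repo: Path):
        self.repo = repo

    @classmethod
    def shared(cls, repo: str) -> "Pool":
        # Normalize so "~/x", "/home/u/x" and "/home/u/x/" share one key.
        key = Path(repo).expanduser().resolve()
        if key not in cls._instances:
            cls._instances[key] = cls(key)
        return cls._instances[key]
```

Under this scheme a nested repo such as `~/workspace/mission-control` resolves to a different key than `~/workspace`, so a caller that hardcodes one path never reaches the other pool.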
Luci
mc_pickup ALWAYS uses `repo=~/workspace` (line 5530) — no mission-control selection. But I'm in the mission-control pool → MC-4548 dispatched via the **tmux ticket_runtime**, a separate path. Pull the actual incident facts: MC-4460, MC-4534, and the operator dev-loop timing.
Luci
Ticket lookup format differs. Check schema + find MC-4460/MC-4534.
Luci
MC-4460 = ".gitignore audit + schema-version stamp" — totally unrelated to the disk fix. Trace where 9e22475 was first created (reflog) + the MC-4460 slot branch.
Luci
Reflog smoking gun: pool-2 HEAD was at `9e22475` then moved to origin/master at 09:09, while bouncing on `slot2/mc-MC-4460`. So the disk fix WAS authored inside the MC-4460 pool slot. MC-4460 ran via the tmux **ticket_runtime** (mission-control pool). Examine ticket_runtime brief assembly + pool repo selection — the "context brief" mechanism.
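
The reflog check above can be automated with a small parser. The sketch below assumes the output of `git reflog --date=iso`, where each line looks like `<sha> HEAD@{<timestamp>}: <action>` and entries are listed newest first. The function names are hypothetical.

```python
import re
from typing import List, Tuple

# One reflog line: "<sha> <ref>@{<selector>}: <action>"
REFLOG_LINE = re.compile(
    r"^(?P<sha>[0-9a-f]+) (?P<ref>[^@\s]+)@\{(?P<when>[^}]*)\}: (?P<action>.*)$"
)

Entry = Tuple[str, str, str]  # (sha, when, action)


def parse_reflog(text: str) -> List[Entry]:
    """Parse reflog output into (sha, when, action) tuples, newest first."""
    entries = []
    for line in text.splitlines():
        m = REFLOG_LINE.match(line.strip())
        if m:
            entries.append((m.group("sha"), m.group("when"), m.group("action")))
    return entries


def same_commit(a: str, b: str) -> bool:
    """Compare two possibly abbreviated SHAs by prefix."""
    return a.startswith(b) or b.startswith(a)


def departures(entries: List[Entry], sha: str) -> List[Tuple[str, str]]:
    """Return (when, action) for each move that took HEAD away from `sha`."""
    found = []
    # entries[i + 1] is the state just before entries[i] was recorded.
    for newer, older in zip(entries, entries[1:]):
        if same_commit(older[0], sha) and not same_commit(newer[0], sha):
            found.append((newer[1], newer[2]))
    return found
```

Feeding it the reflog of a slot worktree and the suspect SHA lists each moment HEAD left that commit, along with the action (for example a reset to `origin/master`) that moved it.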
Luci
ticket_runtime delegates to `runtime_pool` — claims a slot when intended cwd is the control-plane main checkout. The operator dev-loop (luci_operator.py) does NOT claim a slot — it runs `cwd=~/workspace` directly. Read `runtime_pool.claim_for_cwd` + `is_control_plane_main_checkout` to confirm collision mechanism.
Luci
Now check `_detect_default_base` — does a fresh slot branch from **local** master (inheriting unpushed commits) or `origin/master`?
Luci
`_detect_default_base` → `origin/master` (remote, MC-4464 hardened). So fresh slots branch from remote, not local — unpushed commits don't leak via base. Combined with `reset --hard` + `clean -fdx` + `slot_branch_matches` assertion, the **reset path is clean**. And both brief assemblers key strictly on `ticket["id"]` — **brief path is clean** too. The real writer is the operator. Read the operator dev-loop gating to confirm it bypasses pool isolation.
Luci
Confirmed: `_launch_operator_dev_loop` / `_launch_breakglass_dev_loop` spawn `claude -p --dangerously-skip-permissions` with `cwd=~/workspace` and **no `runtime_pool.claim_for_cwd`** — the only writer that bypasses pool isolation. And `_disk_pressure_critical` is itself a launch trigger (line 1445), so the false-critical kept launching the loop that authored 9e22475. Before concluding, verify hypothesis #2 against real data — check MC-4460's actual stored description/comments for any disk-pressure contamination.
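
For contrast, here is a minimal sketch of what a claim-before-launch guard could look like for a launcher that is meant to be isolated. Everything here is hypothetical: `resolve_launch_cwd` and its parameters are illustrative names, and the real signatures of `runtime_pool.claim_for_cwd` and `is_control_plane_main_checkout` are not shown in this log.

```python
from pathlib import Path
from typing import Callable, Optional


def resolve_launch_cwd(
    intended_cwd: Path,
    main_checkout: Path,
    claim_slot: Callable[[], Optional[Path]],
) -> Path:
    """Pick the directory a writer process should run in.

    If the intended cwd is the shared main checkout, swap it for a claimed
    pool slot so the writer cannot commit directly on the shared tree.
    Any other cwd is returned unchanged.
    """
    if intended_cwd.resolve() != main_checkout.resolve():
        return intended_cwd
    slot = claim_slot()  # hypothetical: returns a slot path, or None if full
    if slot is None:
        raise RuntimeError("no free pool slot; refusing to write in the main checkout")
    return slot
```

A launcher that passes `cwd=~/workspace` straight to the child process skips this step entirely, which matches the bypass confirmed above.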
Luci
Both named hypotheses **disproven against real data**: MC-4460's brief is purely .gitignore/schema-version (zero disk/operator content), and the reset path is hardened. Root cause lies in the operator dev-loop, not the pool. This is an infra/investigation deliverable — write a design doc + decision, commit, REVIEW. Invoking dev-loop per the hard rule before writing/committing.
Luci
Tier 1: investigation already complete; deliverable is the write-up. Writing the root-cause doc in the MC repo (this worktree), then commit + push the slot branch.
Luci
Committed `f72cd50`, pushed. Now create the recommended follow-up ticket so the operator-isolation fix isn't lost.
Luci
Follow-up filed as MC-4556. Investigation done, doc committed + pushed (`f72cd50`), no code-behavior change shipped (correctly gated). Reporting REVIEW via mc-coord.
Luci
Investigation complete. **Root cause: mis-attribution, not a leak.**

Both mechanisms the ticket named are clean:
- **Pool reset-on-claim:** `_reset_slot` runs `reset --hard origin/master` + `clean -fdx`, base keyed on `ticket_id`, `slot_branch_matches` poison-check. No carryover.
- **Brief assembly:** per-ticket only. Verified live: MC-4460's description is `.gitignore audit + schema-version stamp`, zero disk/operator tokens.

**Real cause:** `luci_operator.py` `_launch_operator_dev_loop` / `_launch_breakglass_dev_loop` spawn `claude -p` with `cwd=~/workspace` and **no `runtime_pool` claim**, making them the only writer bypassing pool isolation. `9e22475` was authored there, committed linear on master (`dc5ad66→9e22475→76fa7ba`), and only appeared under `slot2/mc-MC-4460` via `origin/master` base advancement during concurrent churn (reflog 09:09). `_disk_pressure_critical` was itself the launch trigger → false-critical kept launching the loop that fixed it.

**Shipped:** `docs/incident-2026-06-01-pool-slot-context-leak-MC-4548.md`, commit `f72cd50`, pushed `slot1/mc-MC-4548`. No code-behavior change (correctly gated).

**Follow-up:** MC-4556: pool-isolate the MC-reachable operator dev-loop; break-glass stays un-isolated by necessity. Control-plane recovery change → plan+council before shipping.

{"mc_signal":"REVIEW","ticket":"MC-4548","nonce":"MC_TICKET_TURN:4548:1780313597852:5d18d456","payload":"Both named mechanisms (pool reset, brief assembly) clean; 9e22475 mis-attributed — real cause is un-isolated operator dev-loop. Doc committed f72cd50; follow-up MC-4556 filed."}