Window: last 24h (extended 72h for recurrence). Runs: 47. Total actions: 457. Avg actions/run: 9.72.
47 run_start
47 disk_worktree_snapshot
47 memory_snapshot
47 done_audit_summary
47 run_complete
32 blocked_lane_classification_summary
31 recent_task_failure
30 repo_dirty_observed
26 operator_ticket_exists
15 operator_dev_loop_throttled
15 terminal_ticket_runtime_closed
12 promote_dead_zone_ticket
11 active_lane_snapshot
9 todo_backlog_observed
6 repo_dirty_all_known_generated
6 operator_dev_loop_finished
6 create_operator_ticket
5 attach_operator_context_to_closed_failure_ticket
3 reopen_weak_done_ticket
3 skipped_overlap
2 memory_pressure_observed
2 blocked_runtime_failure_reset
1 operator_ticket_resolved
1 stale_runtime_observed
1 stale_runtime_marked
1 stale_runtime_summary
1 breakglass_noop_healthy
1 pickup_direct_run
1 operator_dev_loop_failed
1 blocked_lane_completed_marked_done
Problem: recent_task_failure (31 events) is in trigger_actions, so every failed task run, including non-critical ones such as swing-trader scans, attempts to launch a dev-loop. Most attempts are throttled (15 of 21, about 71%, counting 15 throttled plus 6 finished), which produces throttle-log noise with no operational gain. Critical failures are already handled by _ensure_failure_ticket and the watchdog.
Location: maybe_launch_operator_dev_loop, trigger_actions set (~line defining the set).
Change: Remove "recent_task_failure" from trigger_actions, or add a post-filter so only critical-task failures trigger:
    # Keep every trigger action, but drop recent_task_failure for non-critical tasks.
    triggers = [a for a in self.actions
                if a.get("action") in trigger_actions
                and not (a.get("action") == "recent_task_failure"
                         and a.get("task_id") not in CRITICAL_TASK_IDS)]
Expected effect: Throttle events drop from ~15 to ~3-5 per 24h. Dev-loop still fires for critical-task failures. Average actions/run decreases slightly.
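The post-filter option can be sketched as a standalone function. This is a minimal illustration, not the operator's actual code: it assumes each action is a dict with an "action" key and an optional "task_id" key, and the contents of TRIGGER_ACTIONS and CRITICAL_TASK_IDS below are hypothetical placeholders for the real sets.

```python
# Hypothetical values for illustration only; the real sets live in the operator module.
CRITICAL_TASK_IDS = frozenset({"critical-task-a"})
TRIGGER_ACTIONS = frozenset({"recent_task_failure", "stale_runtime_marked"})


def select_triggers(actions, trigger_actions=TRIGGER_ACTIONS,
                    critical_task_ids=CRITICAL_TASK_IDS):
    """Return the actions that should launch a dev-loop.

    A recent_task_failure counts only when its task_id is critical;
    every other trigger action passes through unchanged.
    """
    selected = []
    for a in actions:
        name = a.get("action")
        if name not in trigger_actions:
            continue
        if name == "recent_task_failure" and a.get("task_id") not in critical_task_ids:
            continue  # non-critical failure: does not trigger a dev-loop
        selected.append(a)
    return selected


sample = [
    {"action": "recent_task_failure", "task_id": "swing-trader-scan"},
    {"action": "recent_task_failure", "task_id": "critical-task-a"},
    {"action": "stale_runtime_marked"},
    {"action": "memory_snapshot"},
]
selected = select_triggers(sample)
```

In this sample only the critical-task failure and the non-failure trigger survive; the swing-trader failure and the non-trigger action are dropped.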
ACTIVE_LANE_STALE_HOURS_TRIGGER of 2 h is too tight
Problem: [operator:active-lane-backlog] recurs 4× in 72h. The dev-loop timeout alone is 3600 s (1 h), and operator tickets regularly run 1-2 h. A 2-hour stale threshold flags healthy in-progress work as "backlog not draining," creating false escalations.
Location: Constant ACTIVE_LANE_STALE_HOURS_TRIGGER = 2.
Change: Raise to 3.
Expected effect: Active-lane-backlog recurrence drops from ~4 to ~1-2 per 72h. Tickets legitimately running 2-3 h are no longer escalated.
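To make the effect of the threshold concrete, here is a small sketch of a staleness check. The function name, the ">=" comparison, and the timestamps are assumptions for illustration; the operator's real check may differ in detail.

```python
from datetime import datetime, timedelta, timezone


def is_lane_stale(started_at, now, stale_hours):
    """Hypothetical check: a lane is stale once it has run for stale_hours or more."""
    return now - started_at >= timedelta(hours=stale_hours)


# Illustrative timestamps: a ticket that has been running for 2.5 hours.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
started = now - timedelta(hours=2, minutes=30)

stale_at_2h = is_lane_stale(started, now, 2)  # escalated under the current threshold
stale_at_3h = is_lane_stale(started, now, 3)  # left alone under the proposed threshold
```

Under these assumptions a ticket 2.5 h into its run is escalated with a threshold of 2 and is not escalated with a threshold of 3.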
Problem: _operator_dev_loop_throttled writes the 6-hour throttle timestamp before the subprocess runs. If subprocess.run returns a non-zero exit code, execution falls through to operator_dev_loop_finished as if the run had succeeded. Because no exception is raised, the error handler that expires the throttle never fires, and a failed dev-loop is then blocked for a full 6 hours.
Location: _launch_operator_dev_loop, immediately after the subprocess.run call and the operator_dev_loop_finished record.
Change: If result.returncode != 0, expire the throttle so the next run can retry:
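    # Needs json and datetime.timedelta in scope.
    if result.returncode != 0:
        # Backdate the timestamp by the full throttle window so the next run is eligible.
        expired = (self.options.now - timedelta(hours=6)).isoformat()
        state_path.write_text(
            json.dumps({"last_started_at": expired}), encoding="utf-8")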
Expected effect: The 1 failed dev-loop in this window would have been eligible for retry on the next operator tick (~30 min) instead of waiting 6 h. Successful runs keep the full 6-hour throttle.
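The throttle-expiry behaviour can be exercised in isolation with the sketch below. It is a simplified model under stated assumptions: the helper names (is_throttled, record_start, expire_throttle, run_dev_loop), the state file name, and the "elapsed < 6 h" comparison are hypothetical, and the subprocess is replaced by a return code passed in directly.

```python
import json
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

THROTTLE_HOURS = 6  # matches the 6-hour throttle described above


def is_throttled(state_path, now):
    """Assumed check: throttled while less than THROTTLE_HOURS have elapsed."""
    if not state_path.exists():
        return False
    state = json.loads(state_path.read_text(encoding="utf-8"))
    last = datetime.fromisoformat(state["last_started_at"])
    return now - last < timedelta(hours=THROTTLE_HOURS)


def record_start(state_path, now):
    """Write the throttle timestamp before the (simulated) subprocess runs."""
    state_path.write_text(
        json.dumps({"last_started_at": now.isoformat()}), encoding="utf-8")


def expire_throttle(state_path, now):
    """Backdate the timestamp by the full window so the next tick may retry."""
    expired = (now - timedelta(hours=THROTTLE_HOURS)).isoformat()
    state_path.write_text(
        json.dumps({"last_started_at": expired}), encoding="utf-8")


def run_dev_loop(state_path, now, returncode):
    """Simulate one launch; returncode stands in for subprocess.run's result."""
    record_start(state_path, now)
    if returncode != 0:
        expire_throttle(state_path, now)


with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "throttle.json"  # hypothetical file name
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # illustrative time
    next_tick = now + timedelta(minutes=30)

    run_dev_loop(state_path, now, returncode=1)
    throttled_after_failure = is_throttled(state_path, next_tick)

    run_dev_loop(state_path, now, returncode=0)
    throttled_after_success = is_throttled(state_path, next_tick)
```

In this model a failed launch is retryable at the next tick, while a successful launch stays throttled for the full window.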
The repo_dirty_all_known_generated resolutions (6 in this window) show the filter is working. Do not add entries to _GIT_KNOWN_GENERATED_PREFIXES without first seeing the actual porcelain lines.