The scheduler (~/workspace/scheduler.py) is Luci's task execution engine. It runs every minute via cron, evaluates which tasks are due based on cron expressions, and executes them with locking, retry, self-healing, and failure escalation.
- Cron runs `python3 scheduler.py tick` every minute. This is one of only two entries allowed in Luci's crontab (the other is the heartbeat).
- Tasks live in `~/workspace/tasks/` as markdown files with YAML frontmatter.
- Run history is stored in `mc.db` (`task_runs` table): `started_at`, `finished_at`, `status`, `output`, `duration`.
- Tasks are loaded from the `.md` files in `~/workspace/tasks/`.
- Each file is checked for `id` or `schedule`.

Every task file starts with a YAML frontmatter block:
---
id: example-task # Unique identifier (duplicate = hard crash)
title: Human-readable name
schedule: "0 6 * * 1-5" # Cron expression (evaluated in SAST)
timeout: 300 # Max seconds before kill (default: 600)
retry: true # Simple retry on first failure (default: false)
enabled: true # false = skip entirely (default: true)
disabled_reason: by_choice # why disabled: auto_suspended | retired | paused | by_choice (set when enabled: false)
self_heal: true # Allow Claude to diagnose and fix (default: true)
notify_on: failure # failure | success | always | never (default: failure)
notify_to: home # notify.py destination key: dm|home|work|mc|life-manager|general (optional; injected as LUCI_NOTIFY_DEST env)
run_as: shell # shell | claude | script
command: "python3 foo.py" # Shell command to execute
tags: [infra, backup] # Categorization tags
---
Markdown body with human-readable description of what the task does.
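To make the `schedule` field concrete, here is a minimal sketch of how a five-field cron expression could be matched against the current minute. The names `field_matches` and `is_due` are hypothetical and are not taken from scheduler.py. The sketch also simplifies standard cron: it requires both day-of-month and day-of-week to match, does not accept `7` for Sunday, and handles only `*`, plain numbers, ranges, lists, and steps. It assumes the datetime passed in is already expressed in SAST.

```python
from datetime import datetime


def field_matches(field: str, value: int, lo: int, hi: int) -> bool:
    """Return True if one cron field (e.g. "1-5", "*/15", "0,30") matches value.

    lo and hi are the field's full range, used to expand "*".
    """
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def is_due(expr: str, now: datetime) -> bool:
    """Check a five-field cron expression against one minute (assumed SAST)."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = now.isoweekday() % 7  # cron convention: 0 = Sunday, 6 = Saturday
    return (
        field_matches(minute, now.minute, 0, 59)
        and field_matches(hour, now.hour, 0, 23)
        and field_matches(dom, now.day, 1, 31)
        and field_matches(month, now.month, 1, 12)
        and field_matches(dow, cron_dow, 0, 6)
    )
```

With the example schedule `"0 6 * * 1-5"`, `is_due` is true at 06:00 on weekdays and false at any other minute or on weekends.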
- Commands run via `subprocess.Popen()` with `shell=True`, `/bin/bash`, and `start_new_session=True`. The child gets its own process group, so a timeout kills the whole tree (bash + claude + grandchildren) with `os.killpg`, not just the direct bash child.
- The environment is refreshed from `~/.claude/env/api_keys.env` and `~/.bashrc` on each run, plus `~/.npm-global/bin` on `PATH`. The refresh is non-interactive (`bash --noprofile --norc`) so interactive aliases such as the Telegram-enabled `claude` alias cannot leak into scheduler jobs.
- The working directory is `~/workspace`, unless a task sets an explicit `cwd` or `cwd_policy`.
- Lock files at `/tmp/luci-task-{id}.lock` contain PID and start time (JSON).
- Lock files are created with `O_CREAT | O_EXCL` to prevent races.

The persistent Luci/Telegram session is the only process allowed to use the Telegram-enabled Claude configuration. Scheduler-owned Claude calls are guarded automatically:

- `claude`, `${CLAUDE}`, `/usr/bin/env claude`, and the standard `~/.local/bin/claude` path are wrapped so they run with `--settings ~/.claude/settings-worker.json`.
- `TELEGRAM_BOT_TOKEN` is cleared for those Claude task commands so they cannot start a second Telegram poller.

If a task intentionally needs a different Claude configuration, make that explicit in the task definition and document why. Avoid `sudo claude`, remote `ssh ... claude`, or `sh -c 'claude ...'` in scheduler commands because those can bypass the bash function wrapper.
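The locking and process-group notes above can be sketched roughly as follows. This is an illustration for Unix systems under stated assumptions, not the code in scheduler.py: the function names, the JSON keys `pid` and `started_at`, and the returned status strings are all hypothetical.

```python
import json
import os
import signal
import subprocess
import time


def acquire_lock(path: str) -> bool:
    """Atomically create the lock file; return False if it already exists."""
    try:
        # O_CREAT | O_EXCL fails if the file exists, so two ticks cannot both win.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as fh:
        # Key names are assumptions; the source only says "PID and start time".
        json.dump({"pid": os.getpid(), "started_at": time.time()}, fh)
    return True


def run_with_timeout(command: str, timeout: float) -> str:
    """Run command under /bin/bash in its own session; kill the group on timeout."""
    proc = subprocess.Popen(
        command,
        shell=True,
        executable="/bin/bash",
        start_new_session=True,  # child becomes leader of a new process group
    )
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            # With start_new_session=True the child's PGID equals its PID,
            # so this signals bash and everything it spawned.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited between the timeout and the kill
        proc.wait()
        return "timeout"
    return "success" if proc.returncode == 0 else "failure"
```

Killing the process group rather than the single child matters because `shell=True` puts bash between the scheduler and the real workload; signalling only bash could leave grandchildren running.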
The scheduler is machine-level infrastructure, so it defaults task commands to
~/workspace. Task definitions may override this in two ways:
| Task setting | Result |
|---|---|
| `cwd: /some/path` | Run exactly from that path |
| `cwd_policy: pka` or `pka_repo` | Run from `~/workspace/PKA` |
| `cwd_policy: mission-control`, `mission_control`, or `mc` | Run from `~/workspace/mission-control` |
| No cwd setting | Run from ~/workspace |
Many legacy task commands still begin with an explicit cd ...; that remains
valid and should be treated as the command's own local override. Ticket workers
use a related project-based resolver documented in
02-mission-control/worker-system.
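The working-directory rules in the table above could be expressed as a small resolver. This is a sketch only: `resolve_cwd` and `CWD_POLICIES` are hypothetical names, and two behaviours are assumptions not stated in the source, namely that an explicit `cwd` wins over `cwd_policy` and that an unrecognised policy falls back to `~/workspace`.

```python
from pathlib import Path

WORKSPACE = Path.home() / "workspace"

# Policy names from the table, mapped to directories under ~/workspace.
CWD_POLICIES = {
    "pka": WORKSPACE / "PKA",
    "pka_repo": WORKSPACE / "PKA",
    "mission-control": WORKSPACE / "mission-control",
    "mission_control": WORKSPACE / "mission-control",
    "mc": WORKSPACE / "mission-control",
}


def resolve_cwd(task: dict) -> Path:
    """Pick the working directory for a task's frontmatter dict."""
    if task.get("cwd"):
        # Assumption: an explicit path takes precedence over any policy.
        return Path(task["cwd"]).expanduser()
    policy = task.get("cwd_policy")
    if policy in CWD_POLICIES:
        return CWD_POLICIES[policy]
    # No setting (or, by assumption, an unknown policy): the default.
    return WORKSPACE
```

For example, `resolve_cwd({"cwd_policy": "mc"})` returns the `mission-control` directory under the workspace, while `resolve_cwd({})` returns the workspace itself.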
Before running a task, the scheduler checks Mission Control (see 02-mission-control/overview) for unread human comments on the task. Claude interprets the comments and returns a decision on how to proceed.
This allows Elmar to pause or adjust tasks by commenting on them in the MC dashboard.
When a task fails, the scheduler follows an escalation ladder:
1. Simple retry (if `retry: true`)
2. Self-heal: Claude diagnoses the failure and attempts a fix, up to two attempts
3. Suspension: the task is disabled (`enabled: false`), Telegram alert sent, MC ticket created

- Set `self_heal: false` to disable healing entirely
- Heal attempts are logged to `~/workspace/logs/self-heal-audit.log`

For tasks with `self_heal: false`, the scheduler tracks consecutive failures:

- After `MAX_CONSECUTIVE_FAILURES` (3) consecutive failures, the task is suspended
- `suspend_task` stamps `disabled_reason: auto_suspended` + `disabled_at` in the task frontmatter, so the tasks page can tell a failure-suspended task from one disabled on purpose
- A task disabled on purpose gets `disabled_reason: by_choice`; re-enabling strips both keys

| Event | Action |
|---|---|
| Task fails once | Retry (if enabled) |
| Retry fails | Self-heal attempt 1 |
| Heal 1 fails | Self-heal attempt 2 |
| Heal 2 fails | Suspend task, Telegram alert (force, bypasses quiet hours), MC ticket |
| Task timeout | Log as timeout, Telegram alert, MC ticket |
| Scheduler crash | Telegram alert (force), MC ticket, "all tasks paused" warning |
| Duplicate task ID | Hard exit with Telegram alert |
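The ladder in the table above can be read as a small state machine. The sketch below is one way to model it, not the scheduler's actual control flow: `escalate`, `record_failure`, and the step and action names are hypothetical, and the `step_ok` callback stands in for actually re-running the task.

```python
from typing import Callable, List, Tuple

MAX_CONSECUTIVE_FAILURES = 3  # value stated in the text above


def escalate(retry: bool, self_heal: bool, step_ok: Callable[[str], bool]) -> List[str]:
    """Walk the ladder after a first failure; return the actions taken in order.

    step_ok(name) reports whether the named step ("retry", "heal-1", "heal-2")
    got the task passing again.
    """
    steps: List[str] = []
    if retry:
        steps.append("retry")
    if self_heal:
        steps += ["heal-1", "heal-2"]

    actions: List[str] = []
    for step in steps:
        actions.append(step)
        if step_ok(step):
            return actions  # recovered, stop climbing
    # Nothing worked: healed tasks are suspended, others only count the failure.
    actions.append("suspend" if self_heal else "count-failure")
    return actions


def record_failure(consecutive: int) -> Tuple[int, bool]:
    """For self_heal: false tasks: bump the counter, report whether to suspend."""
    consecutive += 1
    return consecutive, consecutive >= MAX_CONSECUTIVE_FAILURES
```

A task with `retry: true` and `self_heal: true` whose every step fails walks the full path: retry, two heal attempts, then suspension.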
python3 scheduler.py tick # Cron calls this every minute
python3 scheduler.py run <id> # Force-run a specific task (ignores schedule)
python3 scheduler.py list # Show all tasks with schedule, enabled, last/next run
python3 scheduler.py history # Show last 20 task runs from mc.db
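As an illustration of what the `history` command reads, the query below pulls the most recent rows from a `task_runs` table. The column list comes from the fields named earlier (`started_at`, `finished_at`, `status`, `output`, `duration`); the `task_id` column, the column types, and the function name `last_runs` are assumptions, and the demo uses an in-memory database rather than the real `mc.db`.

```python
import sqlite3


def last_runs(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Return the most recent task runs, newest first."""
    cur = conn.execute(
        "SELECT task_id, started_at, status, duration "
        "FROM task_runs ORDER BY started_at DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


# Demo against an in-memory database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_runs (task_id TEXT, started_at TEXT, finished_at TEXT, "
    "status TEXT, output TEXT, duration REAL)"
)
conn.executemany(
    "INSERT INTO task_runs VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("example-task", "2024-01-01T06:00:00", "2024-01-01T06:00:05", "success", "", 5.0),
        ("example-task", "2024-01-02T06:00:00", "2024-01-02T06:00:07", "failure", "boom", 7.0),
    ],
)
for row in last_runs(conn):
    print(row)
```

Sorting on `started_at` works here because the demo stores ISO 8601 timestamps, which order correctly as text; the real table may store times differently.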
| Path | Purpose |
|---|---|
| `~/workspace/scheduler.py` | Main scheduler code |
| `~/workspace/tasks/*.md` | Task definitions |
| `~/workspace/mission-control/mc.db` | Run history (`task_runs` table) |
| `/tmp/luci-task-{id}.lock` | Per-task lock files |
| `~/workspace/logs/self-heal-audit.log` | Heal attempt audit trail |
| `~/workspace/.heal-state.json` | Probation tracking state |
| `~/workspace/logs/fail-counts/{id}.count` | Consecutive failure counters |
| `~/workspace/prompts/self-heal.txt` | Prompt template for Claude self-heal |
| `~/workspace/prompts/check-comments.txt` | Prompt template for comment interpretation |