Luci Review — agent-watch borrowable ideas (Week 2026-W20)

Reviewed by: Luci, 2026-05-12
Upstream sources: openclaw v2026.5.9-beta.1 through v2026.5.12-beta.1, hermes-agent v2026.5.7 + desktop-pr20059-installers

Philosophy Filter

We only borrow ideas that:
1. Strengthen Mission Control as the single source of truth
2. Improve scheduler / mc_pickup reliability or observability
3. Harden security boundaries around vault.db, Telegram, and ticket workers
4. Reduce operational cost or token waste (memory, compute, API spend)
5. Fit our existing stack (Flask + SQLite, systemd, cron, Telegram bot) without new infrastructure

Ideas that require Discord/Slack/voice/mobile or add new runtimes are skipped — not because they're bad, but because they dilute focus.


Tier 1 — Implement This Quarter

1. Security hardening: redaction ON by default + TOCTOU fixes

Source: hermes-agent v2026.5.7 (P0 security wave)
Why: MC stores sensitive data (financial entities, POPIA passwords, Telegram chat IDs, SafairBru leaderboard data). Hermes closed 8 P0 security issues, including redaction-by-default, role-allowlist guild-scoping, and TOCTOU race windows.
Action: Audit MC ticket descriptions, comments, and logs for unredacted secrets. Add a redact_secrets() helper to notify.py and the MC API logging layer. Scope Telegram bot commands to known chat IDs (already partially done via CCGram, but needs tightening).
Effort: Small (1-2 files, no new deps).
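A minimal sketch of what the redact_secrets() helper could look like. The patterns are assumptions, not an audit of what MC actually stores: the Telegram token shape (numeric bot ID, colon, 30 or more URL-safe characters) and the list of secret key names are illustrative and would need to be extended after the audit.

```python
import re

# Hypothetical patterns; extend them to match the secret formats MC really holds.
_PATTERNS = [
    # Assumed Telegram bot token shape: <numeric bot id>:<long URL-safe secret>
    (re.compile(r"\b\d{6,12}:[A-Za-z0-9_-]{30,}"), "[REDACTED_TELEGRAM_TOKEN]"),
    # key=value or key: value pairs for common secret names
    (
        re.compile(r"(?i)\b(password|passwd|secret|token|api[_-]?key)(\s*[=:]\s*)(\S+)"),
        r"\1\2[REDACTED]",
    ),
]


def redact_secrets(text: str) -> str:
    """Return text with anything matching a known secret pattern masked."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Calling the helper at the point where text is written (log line, ticket comment, Telegram message) rather than where it is produced keeps redaction on by default for new call sites.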

2. Cron inspection API + watchdog mode

Source: openclaw v2026.5.12-beta.1 (cron.get) + hermes-agent v2026.5.7 (cron watchdog no_agent)
Why: Scheduler failures are our #1 operational pain (see MC tickets 3203, 3158, 3119, 3101, 3059, 3039, 3038). Our scheduler-watchdog tasks already create failure tickets, but we have no easy way to answer "which cron jobs are currently scheduled, and when did each last run?"
Action: Add a /api/v1/scheduler/inspect route to MC that dumps the crontab for the lucienne user, last run times from task_runs, and next scheduled runs. Pair it with a cron_watchdog.py script that runs every 5 min (via the scheduler, not system cron) and alerts if a task is overdue by more than 2× its expected interval.
Effort: Medium (new MC route + new scheduler task).
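The overdue rule for cron_watchdog.py can be sketched as a pure function, which keeps it testable without touching the database. The function name and the two input dictionaries are hypothetical; in practice the last-run times would come from task_runs and the intervals from the scheduler task definitions.

```python
from datetime import datetime, timedelta, timezone


def overdue_tasks(last_runs, intervals, now=None, factor=2.0):
    """Return sorted names of tasks whose last run is older than factor x interval.

    last_runs: {task_name: timezone-aware datetime of the last run}
    intervals: {task_name: timedelta expected between runs}
    A task with no recorded run is treated as overdue.
    """
    now = now or datetime.now(timezone.utc)
    overdue = []
    for name, interval in intervals.items():
        last = last_runs.get(name)
        if last is None or now - last > factor * interval:
            overdue.append(name)
    return sorted(overdue)
```

With a 5-minute task last seen 11 minutes ago and an hourly task last seen 90 minutes ago, only the 5-minute task is reported, since 90 minutes is still inside the 2× window for the hourly one.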

3. Targeted diagnostics with URL redaction

Source: openclaw v2026.5.12-beta.1 (model transport, payload, SSE, code-mode logging)
Why: When CCGram, mc_pickup, or a ticket worker fails, we often cannot tell whether the cause was an API timeout, a malformed payload, or an SSE stream drop. Current logging is ad-hoc print() statements.
Action: Add structured logging to mc_pickup.py (worker spawn, SSH dispatch, runtime profile injection) and ccgram (Telegram API calls, webhook processing). Redact API keys and URLs in logs. Use the Python logging module with a rotating file handler in ~/workspace/logs/.
Effort: Small-Medium (replace print() with logging in 2-3 core files).
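One way to combine the rotating file handler with redaction is a logging.Filter attached to the handler, so every record is scrubbed before it reaches disk. This is a sketch under assumptions: get_logger, RedactingFilter, the two regexes, and the size and backup limits are all illustrative choices, not existing MC code.

```python
import logging
import re
from logging.handlers import RotatingFileHandler
from pathlib import Path

_URL = re.compile(r"https?://\S+")
_KEY = re.compile(r"(?i)\b(api[_-]?key|token)=\S+")


class RedactingFilter(logging.Filter):
    """Mask URLs and key=value secrets in the fully formatted message."""

    def filter(self, record):
        msg = record.getMessage()  # applies %-style args first
        msg = _URL.sub("[URL]", msg)
        msg = _KEY.sub(r"\1=[REDACTED]", msg)
        record.msg, record.args = msg, None
        return True


def get_logger(name, log_dir="~/workspace/logs"):
    """Return a logger writing to <log_dir>/<name>.log with rotation."""
    directory = Path(log_dir).expanduser()
    directory.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(
            directory / f"{name}.log", maxBytes=1_000_000, backupCount=5
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        handler.addFilter(RedactingFilter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

Replacing a print() call then becomes `log = get_logger("mc_pickup")` followed by `log.info("spawned worker for ticket %s", ticket_id)`.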


Tier 2 — Implement This Half

4. Session resilience: auto-resume after restarts

Source: hermes-agent v2026.5.7 (gateway auto-resumes interrupted sessions)
Why: dev-loop and ticket workers can be interrupted by Hetzner maintenance, OOM kills, or runtime switches. We lose in-progress work and context, and session-memory-extractor depends on clean session boundaries.
Action: Add a session_checkpoint.py helper that writes a lightweight checkpoint (ticket ID, current step, partial diff context) to ~/workspace/checkpoints/ every 5 min during long-running tasks. On worker restart, check for stale checkpoints and offer resume or cleanup.
Effort: Medium (requires hooking into the worker lifecycle).
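A possible shape for session_checkpoint.py, assuming one JSON file per ticket. The function names, file naming scheme, and payload fields are hypothetical. The write goes through a temporary file and os.replace so a kill mid-write cannot leave a half-written checkpoint behind.

```python
import json
import os
import time
from pathlib import Path


def write_checkpoint(directory, ticket_id, step, context):
    """Atomically write the checkpoint for one ticket and return its path."""
    directory = Path(directory).expanduser()
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"ticket-{ticket_id}.json"
    tmp = directory / f"ticket-{ticket_id}.json.tmp"
    payload = {
        "ticket_id": ticket_id,
        "step": step,
        "context": context,
        "written_at": time.time(),
    }
    tmp.write_text(json.dumps(payload))
    os.replace(tmp, target)  # atomic rename on the same filesystem
    return target


def find_stale_checkpoints(directory, max_age_seconds, now=None):
    """Return checkpoint payloads not refreshed within max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for path in sorted(Path(directory).expanduser().glob("ticket-*.json")):
        data = json.loads(path.read_text())
        if now - data["written_at"] > max_age_seconds:
            stale.append(data)
    return stale
```

On restart, a worker could call find_stale_checkpoints with a threshold somewhat above the 5-minute write interval and decide per entry whether to resume or clean up.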

5. Memory optimization / transcript streaming

Source: openclaw v2026.5.10-beta.4 (252MB → 27MB for large sessions via transcript streaming)
Why: Luci processes long meeting transcripts, CEO briefings, and wiki compilations. We currently load full transcripts into memory before chunking. This costs tokens and can OOM on large files.
Action: Audit the meeting-notes skill and ceo_briefing_audio.py for full-in-memory loads. Implement streaming chunking for transcripts: process N lines at a time instead of reading the whole file.
Effort: Small (refactor existing scripts, no new deps).
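The "N lines at a time" approach can be sketched as a generator over the open file, so only one chunk is held in memory at once. The name iter_chunks and the default of 200 lines are assumptions for illustration.

```python
from itertools import islice


def iter_chunks(path, lines_per_chunk=200, encoding="utf-8"):
    """Yield the file as consecutive strings of at most lines_per_chunk lines."""
    with open(path, encoding=encoding) as handle:
        while True:
            # islice pulls lines lazily from the file iterator
            chunk = list(islice(handle, lines_per_chunk))
            if not chunk:
                return
            yield "".join(chunk)
```

A caller would loop with `for chunk in iter_chunks(transcript_path):` and hand each chunk to the existing processing step instead of calling read() on the whole file.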

6. Per-sender tool policies

Source: openclaw v2026.5.12-beta.1 (canonical channel-scoped sender keys)
Why: mc_pickup dispatches workers with broad tool access (Bash, Read, Write, Edit). A rogue or misconfigured worker could damage the system, and we have no granular access control.
Action: Add a tool_policy field to MC tickets (or worker metadata) that restricts available tools per ticket type. Example: data-sync tickets get Bash+Read only; dev-loop tickets get full access; wiki-ingest tickets get Read+Write but no Bash.
Effort: Medium (touches mc_pickup dispatch logic and the ticket schema).
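The example policy above could be expressed as a lookup table consulted at dispatch time. The names TOOL_POLICIES, allowed_tools, and check_tool are hypothetical, and the Read-only default for unknown ticket types is an assumption chosen to fail closed.

```python
ALL_TOOLS = frozenset({"Bash", "Read", "Write", "Edit"})

# Ticket type -> tools a worker for that ticket may use (from the example above).
TOOL_POLICIES = {
    "data-sync": frozenset({"Bash", "Read"}),
    "dev-loop": ALL_TOOLS,
    "wiki-ingest": frozenset({"Read", "Write"}),
}


def allowed_tools(ticket_type, default=frozenset({"Read"})):
    """Return the tool set for a ticket type; unknown types get the default."""
    return TOOL_POLICIES.get(ticket_type, default)


def check_tool(ticket_type, tool):
    """Raise PermissionError if the tool is not permitted for this ticket type."""
    if tool not in allowed_tools(ticket_type):
        raise PermissionError(f"{tool} is not allowed for {ticket_type} tickets")
```

Keeping the table in one place means mc_pickup only needs to pass allowed_tools(ticket_type) into the worker's launch configuration.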


Tier 3 — Nice to Have / Research

7. Config hot-reload

Source: openclaw v2026.5.10-beta.3 (gateway config rereads from disk after restart loops)
Why: Changing scheduler task configs or CCGram settings currently requires a systemd restart. Hot-reload would reduce downtime.
Action: Add a SIGHUP handler to ccgram and mc_pickup that re-reads configs. For scheduler tasks, the .md files are already re-read every tick, so this is partially solved.
Effort: Small (signal handling in Python).
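A sketch of the SIGHUP pattern, assuming a JSON config file (the real ccgram and mc_pickup config formats may differ). The handler only sets a flag; the actual re-read happens in the main loop, which avoids doing file I/O inside a signal handler. ReloadableConfig and install_sighup are hypothetical names.

```python
import json
import signal


class ReloadableConfig:
    """Config that is re-read from disk when a reload has been requested."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        self._reload_requested = False
        self.load()

    def load(self):
        with open(self.path) as handle:
            self.values = json.load(handle)
        self._reload_requested = False

    def request_reload(self, signum=None, frame=None):
        # Signature matches what signal.signal() passes to a handler.
        self._reload_requested = True

    def reload_if_requested(self):
        """Call once per main-loop iteration; returns True if a reload ran."""
        if self._reload_requested:
            self.load()
            return True
        return False


def install_sighup(config):
    """Register the reload request on SIGHUP (must run in the main thread)."""
    signal.signal(signal.SIGHUP, config.request_reload)
```

After install_sighup(config), running `kill -HUP <pid>` (or `systemctl reload` if the unit defines ExecReload accordingly) would trigger a re-read on the next loop iteration.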

8. Context mapping / treemap visualization

Source: openclaw v2026.5.10-beta.2 and beta.3 (/context map treemap)
Why: Could improve the PKA dashboard's session/memory visualization.
Action: Add a "Session Graph" view to the PKA dashboard showing parent-child relationships between ticket workers, subagents, and dev-loop sessions.
Effort: Large (new UI component, data model changes).

9. Extended agent communication turns

Source: openclaw v2026.5.12-beta.1 (maxPingPongTurns 5→20)
Why: Dev-loop council review can hit turn limits with multiple models.
Action: Only relevant if we actually hit the limit. Monitor first; increase only if needed.
Effort: Trivial (config change).


Skip List (Out of Scope)


Recommendation

Immediate (this week): Items 1 (redaction) and 3 (diagnostics logging). Both are small, high-leverage security/observability wins.

Next sprint: Item 2 (cron inspection + watchdog). This directly addresses our top failure mode.

Backlog: Items 4-6 as medium-term reliability improvements.