
agent-watch — week 2026-W23

Rolling weekly report. New releases append below as detected.

Week label: 2026-W23
First detection: 2026-06-01T05:00:44.771146+00:00


openclaw v2026.5.31-beta.4 — 2026-06-01T04:15:10Z

Repo: https://github.com/OpenClaw/openclaw
Release: https://github.com/openclaw/openclaw/releases/tag/v2026.5.31-beta.4
Detected: 2026-06-01T05:00:44.771146+00:00
Borrowable ideas: 2

OpenClaw v2026.5.31-beta.4 — Release Review for Luci

Executive Summary

Nothing here is mission-critical for the current Luci setup, but two incremental hardening items are worth backporting and a third merits a low-priority audit.

Features Worth Borrowing

| Feature | OpenClaw Purpose | Luci Relevance | Action |
| --- | --- | --- | --- |
| Tool call recovery (#88129, #88136, #88141, #88162, #88182) | Agents resume from interrupted tool calls, stale session bindings, compaction handoffs | mc_pickup workers spawn independent CLI sessions; interrupted tool calls (API timeout, network drop, daemon stale) currently have no recovery path. Worker can hang or emit partial state → zombie MC ticket. Recovery logic would reduce hung-worker incidents. | Yes. Backport session-recovery pattern into mc_pickup worker subprocess wrap (test via network interrupt scenario) |
| Request timeout bounding (provider, OAuth, media, service probes) | Cap lifetimes before they hang a run | Scheduler.py calls Anthropic SDK + provider-profile injection; no explicit timeout per request. Slow provider (GLM API timeout, Kimi latency spike) hangs task runner → blocks tick scheduler heartbeat. Better timeouts = predictable failure + retry. | Yes. Add per-provider request timeout config + fallback to main-loop tick timeout; prevents scheduler cascade. |
| Stale disabled-skill snapshot handling (#79072, #79173) | Plugin/skill loaders handle disabled snapshots, emit recovery guidance | Luci skill-evolver auto-creates/disables skills; stale disabled snapshots (disabled skill re-enabled mid-session, loader caches old disabled state) can silently drop skills. Clearer recovery = better visibility into skill state churn. | Maybe. Audit skill-evolver for snapshot stale risk; low priority (rare in practice). Skip unless skill-evolver audit surfaces a gap. |
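
The "per-provider request timeout config + fallback to main-loop tick timeout" action could look roughly like the sketch below. It is not scheduler.py's actual code: `TimeoutPolicy`, its field names, and every numeric value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TimeoutPolicy:
    """Per-provider request timeouts with a main-loop fallback (hypothetical)."""

    # Upper bound for any single request; stands in for the main-loop tick timeout.
    tick_timeout_s: float = 300.0
    # Optional per-provider caps, keyed by runtime_profile name.
    per_provider_s: dict = field(default_factory=dict)

    def for_provider(self, provider: str) -> float:
        # Fall back to the tick timeout when no provider-specific cap exists,
        # and never let a provider cap exceed the tick timeout.
        configured = self.per_provider_s.get(provider, self.tick_timeout_s)
        return min(configured, self.tick_timeout_s)


# Illustrative values only.
policy = TimeoutPolicy(tick_timeout_s=120.0, per_provider_s={"glm": 45.0, "kimi": 600.0})
glm_timeout = policy.for_provider("glm")              # provider-specific cap
kimi_timeout = policy.for_provider("kimi")            # clamped to the tick timeout
anthropic_timeout = policy.for_provider("anthropic")  # fallback
```

Clamping to the tick timeout means a misconfigured provider entry cannot outlive the heartbeat it is supposed to protect.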

Not Relevant

Questions for Elmar

  1. Tool recovery: Worth a backport now, or acceptable to wait for next major upgrade cycle?
  2. Request timeouts: Is scheduler.py experiencing timeout cascades, or is this preventive hardening?

openclaw v2026.6.2-alpha.1 — 2026-06-02T02:56:29Z

Repo: https://github.com/OpenClaw/openclaw
Release: https://github.com/openclaw/openclaw/releases/tag/v2026.6.2-alpha.1
Detected: 2026-06-02T05:00:53.630833+00:00
Borrowable ideas: 0

openclaw 2026.6.2-alpha.1 Review for Luci

Executive Summary

Features New to openclaw — Load-Bearing Only

| Feature | Relevance to Luci | Rationale |
| --- | --- | --- |
| Worker recovery from tool interrupts/stale bindings | Passive benefit | mc_pickup spawns background workers; cleaner recovery from stale sessions = fewer silent task failures. Automatic as base improves. |
| Provider request timeout bounds | Passive benefit | scheduler.py injects provider env per runtime_profile; unbounded OAuth/media/polling can stall tasks. Bounds reduce hanging tasks. |
| Skill/plugin loading error clarity | Passive benefit | ~/.claude/skills/ + auto-skill-evolver inherit clearer diagnostics on disabled snapshots, plugin load failures. Better visibility on skill-refresh errors. |
| SQLite-backed plugin install ledger | Passive benefit | Reduces filesystem scanning on restarts; may help skill-evolver detect stale plugin state faster. Secondary. |
| Workboard orchestration primitives | Watch, not adopt yet | New multi-agent coordination surface. If it's an alternative to MC tickets for agent routing, could replace/augment mc_pickup. Needs docs to assess. |
| Skill Workshop proposal review UI | Watch, not adopt yet | Human-driven skill review (propose → approve → ship). auto-skill-evolver already generates proposals; Skill Workshop could streamline the review gate. Not urgent; verify it doesn't duplicate auto-evolver. |

Skip

Verdict

No action required. All three load-bearing improvements are passive base-system gains. Luci's mc_pickup, scheduler, and skill-evolver inherit them automatically. Watch Workboard and Skill Workshop when docs land; they may become relevant for future orchestration redesigns, but not now.

openclaw v2026.6.3-alpha.1 — 2026-06-03T04:06:15Z

Repo: https://github.com/OpenClaw/openclaw
Release: https://github.com/openclaw/openclaw/releases/tag/v2026.6.3-alpha.1
Detected: 2026-06-03T05:00:46.155863+00:00
Borrowable ideas: 3

OpenClaw 2026.6.3-alpha.1 Review for Luci

Executive Summary

Features Worth Borrowing

| Feature | Relevance to MC/Luci | One-liner |
| --- | --- | --- |
| Agent tool-call recovery (#88129, #88136, #88141) | MC workers (mc_pickup) + scheduler task runtimes spawn agents; cleaner binding recovery = fewer orphan sessions | Adopt: add explicit session recovery on network dropout in worker bootstrap |
| Timer/retry bounds for OAuth, polling, media | Scheduler tasks & mc_pickup workers can hang waiting on Anthropic API, GCS media, or Google OAuth; explicit caps prevent stalls | Adopt: cap all external calls in scheduler.py + mc_pickup.py with timeouts; audit claude CLI invocation patterns |
| Provider model metadata caching (OpenRouter, Copilot) | Runtime profile routing (runtime_profile field) picks Anthropic/GLM/Kimi/MiniMax; model catalog + caching tightens cost tracking | Adopt: cache provider models in vault.db (schema new provider_models table), refresh on scheduler heartbeat, use in runtime-profile audit |
| Disabled-snapshot handling for skill loading | Auto-skill-evolver creates skills; stale disabled snapshots = unclear recovery | Skip: auto-skill-evolver is lightweight; not worth the governance overhead yet |
| Skill Workshop proposal flow | Guarded skill creation with review states, versioned proposals, rollback metadata | Skip: auto-skill-evolver auto-triggers on repeat patterns; proposal review is overkill for Luci's use case |
| SQLite state migration (iMessage ledgers, plugin installs) | Mirrors Luci's vault.db strategy; pattern is sound, already in use | Neutral: Luci already uses SQLite for activity_log + scheduled-task state |
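
A possible shape for the provider_models cache mentioned in the table, as a minimal sketch. Only the table name `provider_models` comes from the review; the column names, the helper functions, the staleness rule, and the model identifiers are assumptions, and an in-memory database stands in for vault.db.

```python
import sqlite3
import time

# Hypothetical schema: one row per (provider, model), stamped at refresh time.
SCHEMA = """
CREATE TABLE IF NOT EXISTS provider_models (
    provider     TEXT NOT NULL,
    model_id     TEXT NOT NULL,
    refreshed_at REAL NOT NULL,
    PRIMARY KEY (provider, model_id)
)
"""


def refresh_provider_models(conn, provider, model_ids, now=None):
    """Replace the cached model list for one provider in a single transaction."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        conn.execute(SCHEMA)
        conn.execute("DELETE FROM provider_models WHERE provider = ?", (provider,))
        conn.executemany(
            "INSERT INTO provider_models (provider, model_id, refreshed_at) VALUES (?, ?, ?)",
            [(provider, model_id, now) for model_id in model_ids],
        )


def cached_models(conn, provider, max_age_s, now=None):
    """Return cached model ids, or [] when the cache is missing or stale."""
    now = time.time() if now is None else now
    conn.execute(SCHEMA)
    rows = conn.execute(
        "SELECT model_id FROM provider_models"
        " WHERE provider = ? AND refreshed_at >= ? ORDER BY model_id",
        (provider, now - max_age_s),
    ).fetchall()
    return [row[0] for row in rows]


# In-memory stand-in for vault.db; model names are placeholders.
conn = sqlite3.connect(":memory:")
refresh_provider_models(conn, "glm", ["glm-b", "glm-a"], now=1000.0)
fresh = cached_models(conn, "glm", max_age_s=60, now=1030.0)
stale = cached_models(conn, "glm", max_age_s=60, now=2000.0)
```

An empty result on a stale cache lets the heartbeat decide whether to refresh, which keeps the read path free of network calls.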

Skip

Action Items

If the overhead is trivial:
- Add session-recovery guard to mc_pickup.py worker spawn (catch stale bindings on failed tool calls, re-attach)
- Add timeout caps to scheduler.py task execution (prevent hangs on Anthropic API, GCS, OAuth)

If worth the complexity:
- Extend vault.db schema with provider model metadata table; refresh on heartbeat

Otherwise: pattern catalogue only (reference for next reliability crisis).
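
The two low-overhead items above (a guard around worker spawn, plus a timeout cap) can be approximated with a bounded subprocess wrapper. This is a sketch under assumptions: `run_worker` is a hypothetical helper, not mc_pickup.py's code, and it substitutes a plain bounded retry for OpenClaw's re-attach to a stale session, which depends on CLI internals not described in this review.

```python
import subprocess
import sys


def run_worker(cmd, timeout_s, max_attempts=2):
    """Run a worker command; retry when it hangs or exits non-zero.

    Returns (ok, attempts_used). subprocess.run kills and reaps the child
    when the timeout expires, so a hung worker does not linger between
    attempts (grandchildren it spawned are not covered by this sketch).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            result = subprocess.run(
                cmd, timeout=timeout_s, capture_output=True, text=True
            )
        except subprocess.TimeoutExpired:
            continue  # child was killed; try again if attempts remain
        if result.returncode == 0:
            return True, attempt
    return False, max_attempts


# Stand-in for a real worker invocation.
ok, attempts = run_worker([sys.executable, "-c", "print('done')"], timeout_s=30.0)
```

A `False` result gives the dispatcher an explicit signal to mark the ticket failed instead of leaving it in a zombie state.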

openclaw v2026.6.4-alpha.1 — 2026-06-04T03:40:03Z

Repo: https://github.com/OpenClaw/openclaw
Release: https://github.com/openclaw/openclaw/releases/tag/v2026.6.4-alpha.1
Detected: 2026-06-04T05:00:40.431690+00:00
Borrowable ideas: 4

openclaw v2026.6.4-alpha.1 Review for Luci

Executive Summary

Features Worth Borrowing into MC/Luci

| Feature | Relevance | Rationale |
| --- | --- | --- |
| Skill install operator policy + doctor | ✅ Yes | auto-skill-evolver currently installs without gates. Operator policy + doctor checks reduce supply-chain risk (MC-4657 hardening precedent). |
| Telegram admin writeback + DM exec allowlists | ✅ Yes | ccgram owns polling; approval gates + admin checks harden Telegram control plane (Telegram lock critical per CLAUDE.md). |
| Provider runtime fanout | ✅ Yes | scheduler already routes via runtime_profile; bundled aliases + custom fanout reduce model-spawn overhead when dev-loop spawns council (Opus/Sonnet/Gemini/Kimi/GLM in parallel). |
| Session write-lock recovery | ✅ Yes | defensive. mc_pickup dispatcher spawns per-ticket workers; lock-release failures could block ticket dispatch. Low-cost hardening. |
| Streaming text visibility + ACK reconcile | ⚠️ Maybe | MC's SSE broadcaster already streams task output. ACK timing metadata is nice-to-have for debugging UI/state races, not blocking. |
| Prompt cache boundaries | ❓ Defer | Luci/workers don't use caching yet. Worth scanning if cost-optimization becomes priority. |

Skip

Action

  1. Install policy for skills — open ticket to backport operator gates into auto-skill-evolver (model: install-policy enum, doctor checks on skill source)
  2. Telegram approval gates — vet ccgram's DM allowlist pattern, surface to Elmar for config (who can exec Telegram commands?)
  3. Provider fanout — review scheduler's runtime_profile injection; add bundled aliases table if spawning 5+ model council votes gets common
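
Item 1 (install-policy enum plus doctor checks on skill source) might start from something like the sketch below. It assumes skill sources can be expressed as URLs and that an operator maintains a set of trusted hosts; `InstallPolicy`, `doctor_check`, and the three-way classification are hypothetical and do not reproduce OpenClaw's actual policy model.

```python
from enum import Enum
from urllib.parse import urlparse


class InstallPolicy(Enum):
    ALLOW = "allow"    # install without operator review
    REVIEW = "review"  # hold until an operator approves
    DENY = "deny"      # never install


def doctor_check(source_url, trusted_hosts):
    """Classify a skill source by scheme and host (hypothetical rules)."""
    parsed = urlparse(source_url)
    # Reject anything that is not a well-formed https URL.
    if parsed.scheme != "https" or not parsed.hostname:
        return InstallPolicy.DENY
    if parsed.hostname in trusted_hosts:
        return InstallPolicy.ALLOW
    # Unknown but well-formed sources go to the operator.
    return InstallPolicy.REVIEW


trusted = {"github.com"}
decision = doctor_check("https://example.org/some-skill", trusted)
```

Defaulting unknown sources to `REVIEW` keeps the gate useful without blocking every new source outright.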

goose v1.37.0 — 2026-06-03T19:46:33Z

Repo: https://github.com/block/goose
Release: https://github.com/aaif-goose/goose/releases/tag/v1.37.0
Detected: 2026-06-04T05:01:26.886354+00:00
Borrowable ideas: 4

Goose v1.37.0 — Luci Release Review

Executive Summary


Load-Bearing Features Worth Borrowing

| Feature | Luci System | Rationale | Effort |
| --- | --- | --- | --- |
| PreToolUse denial hooks (#9304) | dev-loop / scheduler | Luci already blocks Edit/Write/Bash before dev-loop via require-dev-loop.sh. Goose's hook pattern could replace or strengthen that gate. | Low: inspect goose implementation, backport pattern if cleaner |
| GOOSE_MAX_TOOL_RESPONSE_SIZE (#9256) | context-mode sandbox | Luci's ctx_execute already contains large output, but global limit prevents workers from accidentally flooding context. Forward-compatible with vault.db query results. | Low: add env var to scheduler + MC worker spawn |
| Provider model exposure + ACP system prompt setter (#9475, #9478) | scheduler runtime_profile | Luci dispatches multi-model council (Opus/Sonnet/Codex/Gemini/Kimi/GLM) hardcoded in tasks. Exposing raw models + per-session system-prompt could let scheduler dynamically route by model capability (e.g., "use Opus for council, Sonnet for Tier 1 fix"). | Medium: schema change to task runner, update mc_pickup provider injection |
| Declarative provider system (Perplexity, Alibaba, Databricks, etc.) (#9443, #9254, #9274) | scheduler task routing | Currently scheduler hardcodes provider→env mapping. Goose's declarative shape (name, endpoint, auth-field) could replace scheduler's _apply_provider_profile_env() with a config table, reducing code + enabling new providers without redeploy. | Medium-to-High: requires task/provider schema migration, test coverage |
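
The declarative shape in the last row (name, endpoint, auth-field) could replace a hardcoded provider→env mapping along these lines. This is a sketch, not the body of `_apply_provider_profile_env()`: `ProviderSpec`, `provider_env`, the `PROVIDER_*` variable names, and the example entry are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderSpec:
    name: str
    endpoint: str
    auth_env: str  # name of the env var that holds the API key


# Hypothetical table; adding a provider becomes a data change, not a code change.
PROVIDERS = {
    "example": ProviderSpec("example", "https://api.example.invalid/v1", "EXAMPLE_API_KEY"),
}


def provider_env(profile, base_env, providers=PROVIDERS):
    """Build a child-process environment for one runtime_profile."""
    spec = providers.get(profile)
    if spec is None:
        raise KeyError(f"unknown runtime_profile: {profile!r}")
    if spec.auth_env not in base_env:
        raise KeyError(f"missing credential env var: {spec.auth_env}")
    env = dict(base_env)  # copy so the caller's environment is untouched
    env["PROVIDER_NAME"] = spec.name
    env["PROVIDER_ENDPOINT"] = spec.endpoint
    env["PROVIDER_API_KEY"] = base_env[spec.auth_env]
    return env


env = provider_env("example", {"EXAMPLE_API_KEY": "secret", "PATH": "/usr/bin"})
```

Failing fast on an unknown profile or missing credential surfaces misconfiguration at dispatch time instead of inside the worker.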

Not Applicable (Skip)


Recommendation

Backport PreToolUse denial hooks + MAX_TOOL_RESPONSE_SIZE immediately (both low-risk, high-safety gain).

Defer provider model exposure + declarative system to a follow-up ticket: useful for council dispatch tuning + onboarding new models, but requires schema planning. Not blocking.

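Open a ticket for auto-skill-evolver to adopt the operator install-policy pattern (2026-06-02 context shows this was already identified as a backport candidate in a prior release review; in this report the pattern appears in the openclaw v2026.6.4-alpha.1 review, not among the Goose features above).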
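
The global response-size cap recommended above can be illustrated with a small truncation helper. How Goose applies GOOSE_MAX_TOOL_RESPONSE_SIZE internally is not covered by this review, so the sketch below is only one plausible behavior: `cap_tool_response` is a hypothetical function that cuts at a byte limit and appends a marker (the marker itself is not counted against the limit).

```python
def cap_tool_response(text: str, max_bytes: int) -> str:
    """Truncate text to at most max_bytes of UTF-8, then append a marker."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    # errors="ignore" drops a multi-byte character that was cut in half.
    kept = raw[:max_bytes].decode("utf-8", errors="ignore")
    dropped = len(raw) - len(kept.encode("utf-8"))
    return f"{kept}\n[truncated: {dropped} bytes omitted]"


short = cap_tool_response("hello", 10)
long_ = cap_tool_response("a" * 10, 4)
```

Counting bytes instead of characters keeps the limit meaningful for non-ASCII output such as vault.db query results.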

openclaw v2026.6.5-alpha.1 — 2026-06-05T04:03:44Z

Repo: https://github.com/OpenClaw/openclaw
Release: https://github.com/openclaw/openclaw/releases/tag/v2026.6.5-alpha.1
Detected: 2026-06-05T05:00:44.772406+00:00
Borrowable ideas: 4

openclaw v2026.6.5-alpha.1 Release Review

Executive Summary

Features Worth Borrowing

| Feature | Load-Bearing for Luci |
| --- | --- |
| Telegram admin writeback + DM approval allowlists | ccgram.service is sole Telegram poller; new role verification + exec approval gates tighten MC ticket approval routing. Check if ccgram enforces these. |
| Durable sends on transcript mirror failure | Currently ccgram fails the entire Telegram send if transcript mirroring (e.g., to audit log) fails. This decouples mirror reliability from send delivery. |
| Custom-provider runtime fanout | scheduler.py injects runtime_profile as env string (anthropic/glm/kimi/minimax); openclaw now routes context-aware. Could replace string-injection with structured dispatch for GLM/Kimi/MiniMax cost control. |
| Policy load-time rejection (corrupt shells, unsupported keys, unsafe exec) | If policy enforcement is tighter, MC ticket workers + skill-loader may hit new rejections. Verify no false positives on existing worker spawns. |

Skip

Recommendation

Medium-priority backport: Telegram safety gates + durable sends. Low priority: custom-provider fanout (forward-looking for cost scaling). Test against ccgram.service on the next scheduler run and report any side effects of admin-role enforcement.
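
The durable-send idea (delivery should not depend on the transcript mirror) reduces to a small ordering rule: send first, then mirror, and treat a mirror failure as non-fatal. The sketch below assumes nothing about ccgram's real interfaces; `send_with_mirror` and its two callables are hypothetical stand-ins.

```python
import logging

log = logging.getLogger("durable_send")


def send_with_mirror(send, mirror, message):
    """Deliver the message, then mirror it; a mirror failure is logged, not raised.

    Returns (delivered, mirrored). Errors raised by send itself still propagate.
    """
    send(message)
    try:
        mirror(message)
    except Exception:
        log.exception("transcript mirror failed; message was still delivered")
        return True, False
    return True, True


def failing_mirror(message):
    # Simulates an audit-log write that fails.
    raise OSError("audit log unavailable")


sent = []
outcome = send_with_mirror(sent.append, failing_mirror, "ticket approved")
```

Returning the mirror status separately lets the caller queue a later retry of the mirror without resending the message.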