AI Coding & Orchestration Agents 2026 — Seed Dossier

Audience: Elmar / PKA team — adoption decision for the existing Luci/Lucienne/Larry stack which already runs on an OpenClaw-style architecture (Mission Control, scheduler, skills, file-first memory). Question: stay on OpenClaw, migrate to Hermes, or stack other coding agents inside the existing workers?

Date: 2026-05-04 · Slug: agents-comparison-2026-05


Executive Summary


State of Play

Persistent-Memory Orchestration Layer

| Framework | Stars | License | Architecture | Memory Model | Differentiator |
| --- | --- | --- | --- | --- | --- |
| OpenClaw | ~345K | MIT | 4-layer (Gateway/Agent/Tools/Memory), local gateway + ReAct loop | File-first Markdown + SQLite cache + optional Mem0 hybrid | Reference architecture; Task Brain unified scheduler (Mar 2026); rich skill ecosystem |
| Hermes Agent | 95-110K (Apr 2026) | MIT | Terminal UI + gateway + 6-channel messaging | 3-layer: short-term context + persistent skill library + cross-session FTS5 search | Self-improving learning loop auto-generates skills from experience |
| NemoClaw | n/a (commercial) | Proprietary | OpenClaw architecture commercialised | Hybrid SQL+vector | "Enterprise GA of the deep-agent pattern" |
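To make the memory-model column concrete, here is a minimal Python sketch of a file-first memory with a full-text index layered on top: Markdown files stay the source of truth, and a disposable SQLite FTS5 table makes them searchable across sessions. This is not the OpenClaw or Hermes implementation. The function names (`build_index`, `search`), the `notes` table, and the sample notes are all hypothetical, and the sketch assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_index(memory_dir: Path) -> sqlite3.Connection:
    """Index every Markdown note under memory_dir into an in-memory FTS5 table.

    The files remain the source of truth; the index is a cache that can be
    rebuilt at any time. (Hypothetical layout, not a real framework API.)
    """
    conn = sqlite3.connect(":memory:")
    # path is stored but not tokenised, so only note bodies are searched.
    conn.execute("CREATE VIRTUAL TABLE notes USING fts5(path UNINDEXED, body)")
    for note in sorted(memory_dir.glob("*.md")):
        conn.execute(
            "INSERT INTO notes (path, body) VALUES (?, ?)",
            (note.name, note.read_text(encoding="utf-8")),
        )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return the file names of matching notes, best match first."""
    rows = conn.execute(
        "SELECT path FROM notes WHERE notes MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in rows]


def demo() -> list[str]:
    """Write two invented notes to a temp directory and search them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "2026-04-30.md").write_text(
            "Deployed scheduler fix for cron drift.", encoding="utf-8"
        )
        (root / "2026-05-01.md").write_text(
            "Telegram channel outage; restarted gateway.", encoding="utf-8"
        )
        conn = build_index(root)
        return search(conn, "gateway")


if __name__ == "__main__":
    print(demo())
```

Because the index is rebuilt from the files, deleting it loses nothing, which is the main appeal of the file-first approach. Raw user input would need escaping before being passed to `MATCH`, since FTS5 has its own query syntax.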

Both OpenClaw and Hermes are model-agnostic (Claude / GPT / Gemini / DeepSeek / local Ollama / OpenRouter / Nous Portal). Both reach Telegram, Discord, Slack, WhatsApp, Signal, Email. Differences:

Coding-Execution Agents

| Agent | Form Factor | Pricing | Stars | SWE-bench Verified | Notable |
| --- | --- | --- | --- | --- | --- |
| Claude Code (Anthropic) | Terminal CLI + IDE | $20–200/mo + API | n/a (closed) | 80.9% (Opus 4.5/4.6) | Agent Teams, MCP, computer-use Mar 2026, 4% of all public GitHub commits |
| Cursor 3 | AI-native IDE (VS Code fork) | $20+/mo, ARR $2B+ | n/a | n/a (model-dependent) | Cloud agents on isolated VMs, parallel Agent Tabs, /worktree, 30% of own PRs agent-made |
| Cline | VS Code extension | Free + BYOK | 30K+ | model-dependent | Plan/Code mode split, multi-model |
| Continue | VS Code/JetBrains plugin | Free + BYOK | 21K | n/a | Mature, plugin-style |
| Roo Code (formerly Roo Cline) | VS Code plugin | Free + BYOK | n/a | n/a | Custom personas, privacy-first |
| Aider | Terminal, git-native | Free + BYOK | 39K, 4.1M installs | n/a | 15B tokens/wk processed; every edit = git commit |
| OpenHands (formerly OpenDevin) | Cloud sandbox + CLI/SDK | OSS + cloud | 70K+ | 77% (Sonnet 4.5) | $18.8M Series A; V0→V1 split Nov 2025; OpenHands Index benchmark |
| Devin (Cognition) | Async remote agent | $20/mo + $2.25/ACU | n/a | 51.5% reported | Plan-first, Devin Review Jan 2026, 67% PR merge rate |
| Goose (Block→AAIF) | Desktop + CLI + API | Free OSS, Apache 2.0 | 27-29K | n/a | Recipes (YAML workflows), 3000+ MCP servers, 60% of Block's 12k staff use weekly |
| Codex CLI (OpenAI) | Terminal | OpenAI sub | n/a | 77.3% Terminal-Bench 2.0, 56.8% SWE-bench Pro | GPT-5.4-Codex leads SWE-bench Pro |
| Gemini CLI (Google) | Terminal | Free + BYOK | n/a | n/a | 2M-token context, sub-agent delegation |
| OpenCode | Terminal-native | Free + BYOK | 95K+ (Apr 2026) | n/a | 153K stars per one source, 75+ LLM providers, plan-first |
| Windsurf (formerly Codeium) | AI-native IDE | $15+/mo | n/a | n/a | Cascade agent, #1 LogRocket Feb 2026 |
| Jules (Google) | Proactive cloud agent | Google Cloud | n/a | n/a | Scans repos, auto-proposes work, 140K improvements |
| GitHub Copilot | IDE integrations | $10/mo Pro | 15M users | n/a | Universal default, opened to Claude+Codex Feb 2026 |

[unverified] OpenCode star count varies between 95K and 153K depending on source.

Benchmark Honest Caveats


Motivations / Why Each Tool Exists


Beneficiaries / Adoption Signals

| Group | What they pick |
| --- | --- |
| Enterprise dev teams | Claude Code + Cursor + Copilot stack; Goose for open-source teams |
| Solo OSS devs | OpenCode / Aider / Cline + BYOK provider |
| Agent platform builders | OpenClaw (incumbent) or Hermes (compounding learning) |
| Background-task / async workflow runners | OpenHands, Devin, Jules |
| Voice/multi-channel automation | Hermes Agent, OpenClaw |
| Privacy-first / local-only | Goose + Ollama, Aider + local model |
| Block (the company) | 60% of 12,000 employees use Goose weekly |
| Anthropic (the company) | 70-80% of technical employees use Claude Code daily |

Scenarios / Probabilities (12-month outlook)

  1. (60%) Two-layer stack stabilises. Persistent-memory orchestration (OpenClaw / Hermes / NemoClaw) below, coding-execution agents (Claude Code, Cursor, Goose, OpenHands) above. Most teams run 2–3 tools.
  2. (25%) Hermes' learning-loop pattern gets backported into OpenClaw. OpenClaw absorbs the Reflective Phase concept and retains the lead through its ecosystem advantage. Hermes stays niche but influential.
  3. (10%) Claude Code subsumes the orchestration layer. Anthropic ships native multi-channel + cross-session learning, eating the bottom layer. Less likely because Anthropic prefers the "narrow + deep" position.
  4. (5%) A dark-horse (Goose/AAIF, OpenHands V2, Cursor's cloud agents) becomes the convergence point. Possible if Linux Foundation governance becomes the default.
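As a quick arithmetic check on the outlook above, the following Python sketch encodes the four scenario weights, confirms they sum to 100%, and adds up the scenarios in which an OpenClaw-pattern orchestration layer remains in place. The grouping of scenarios 1 and 2 is one reading of the list, not a figure from the sources, and the dictionary keys are hypothetical labels.

```python
# Scenario weights in percent, copied from the 12-month outlook above.
# The keys are shorthand labels invented for this sketch.
SCENARIOS = {
    "two_layer_stack_stabilises": 60,
    "learning_loop_backported_into_openclaw": 25,
    "claude_code_subsumes_orchestration": 10,
    "dark_horse_convergence": 5,
}

# Assumption: scenarios 1 and 2 are the ones in which a separate
# OpenClaw-pattern orchestration layer survives.
ORCHESTRATION_LAYER_SURVIVES = (
    "two_layer_stack_stabilises",
    "learning_loop_backported_into_openclaw",
)


def total_percent(scenarios: dict) -> int:
    """Sum of all scenario weights; should be 100 for a complete outlook."""
    return sum(scenarios.values())


def combined_percent(scenarios: dict, names: tuple) -> int:
    """Combined weight of the named, mutually exclusive scenarios."""
    return sum(scenarios[name] for name in names)


if __name__ == "__main__":
    print(total_percent(SCENARIOS))
    print(combined_percent(SCENARIOS, ORCHESTRATION_LAYER_SURVIVES))
```

Under that reading, the weights put roughly 85% on outcomes where the existing architecture stays relevant, which is the number that matters most for a stay-or-migrate decision.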

Second-Order Effects


Contested / Unverified


Specific PKA Fit Assessment

Luci's stack already implements the OpenClaw pattern: Mission Control (gateway + ticket board), scheduler (cron), skills (~/.claude/skills/), file-first memory (vault.db, MEMORY.md), Telegram channel, multi-provider (Anthropic / GLM / Kimi / MiniMax). The work to migrate is small.

| Capability | Luci has it? | Hermes adds? | Other coding agent adds? |
| --- | --- | --- | --- |
| Persistent memory across sessions | ✅ MEMORY.md + vault.db | Marginal | n/a |
| Multi-channel (Telegram, Discord, …) | Partial (Telegram only) | ✅ +5 channels | n/a |
| File-first skills | ✅ 90+ skills | Equivalent | n/a |
| Self-improving learning loop | | ✅ Reflective Phase every 15 calls | n/a |
| Cron / scheduler | ✅ scheduler.py | Equivalent | n/a |
| Multi-provider model switching | ✅ provider-switch skill | Equivalent | n/a |
| Subagent dispatch | ✅ Agent tool, tessa, scott | Equivalent | n/a |
| Coding-depth on hard problems | ❌ depends on Anthropic | | ✅ Claude Code |
| Autonomous PR generation | Partial (Larry) | | ✅ OpenHands, Devin |
| Recipe-style portable workflows | Partial (skills) | | ✅ Goose Recipes |

Biggest concrete gap: the learning loop. Worth backporting Hermes' Reflective Phase (compound skill is partway there) into Luci's existing skills system.
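A backport of the learning loop could start very small. The Python sketch below shows one possible shape: count tool calls, and every 15 calls (the cadence quoted in the table) hand the recent window to a summariser that writes a skill draft to disk. Everything here is hypothetical: `ReflectiveLoop`, `record_call`, the `draft-NNN.md` naming, and the `summarise` callback, which in a real system would presumably be an LLM call. It is not Hermes' actual implementation.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Optional

REFLECT_EVERY = 15  # cadence taken from the fit table ("every 15 calls")


@dataclass
class ReflectiveLoop:
    """Hypothetical reflection trigger: one skill draft per window of calls."""

    skills_dir: Path
    summarise: Callable[[list[str]], str]  # stand-in for an LLM summariser
    every: int = REFLECT_EVERY
    _recent: list[str] = field(default_factory=list, init=False)
    _drafts: int = field(default=0, init=False)

    def record_call(self, description: str) -> Optional[Path]:
        """Log one tool call; write a skill draft when the window fills."""
        self._recent.append(description)
        if len(self._recent) < self.every:
            return None
        self._drafts += 1
        draft = self.skills_dir / f"draft-{self._drafts:03d}.md"
        draft.write_text(self.summarise(self._recent), encoding="utf-8")
        self._recent.clear()  # start a fresh window
        return draft


def demo() -> list[str]:
    """Simulate 30 tool calls and return the names of the drafts written."""
    with tempfile.TemporaryDirectory() as tmp:
        loop = ReflectiveLoop(
            skills_dir=Path(tmp),
            summarise=lambda calls: "# Draft skill\n"
            + "\n".join(f"- {c}" for c in calls),
        )
        written = [loop.record_call(f"tool call {i}") for i in range(1, 31)]
        return [p.name for p in written if p is not None]


if __name__ == "__main__":
    print(demo())
```

Writing drafts rather than live skills is a deliberate choice in this sketch: it keeps a human or a review skill between the reflection step and the skill library.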


YouTube + Social Source Set (curated)

To be added to the NotebookLM notebook in Phase 2:

YouTube (channel : URL pattern : focus)

  - Theo Browne (t3.gg) on Claude Code, Cursor, agent stack opinions
  - Fireship on agent tool roundups
  - Indy Dev Dan on Claude Code workflows
  - Matthew Berman on agent benchmarking
  - AI Jason on Cursor / Cline / OpenHands
  - Greg Isenberg on agent landscape commentary
  - Cole Medin on multi-agent setups
  - David (Hermes Agent walkthrough): https://www.youtube.com/watch?v=4Sln_6K2z8c
  - "Hermes agent just hit 57,000 GitHub stars" short: https://www.youtube.com/shorts/ns_X7wsm_HQ

Reddit threads / subs

  - r/ClaudeAI, r/ClaudeCode: Claude Code rankings (226 community mentions)
  - r/cursor: Cursor billing, parallel agents
  - r/LocalLLaMA: Hermes / OpenClaw / Goose threads
  - r/programming, r/ChatGPTCoding: broad rankings
  - r/hermesagent (~2.9K subs)

X / Twitter named devs

  - @swyx (latent.space): agent landscape commentary
  - @simonw: daily-driver hands-on
  - @leerob (Vercel): Cursor/Claude Code adoption
  - @anthropic / @AnthropicAI: Claude Code releases
  - @openai: Codex CLI
  - @teknium (Nous Research): Hermes founder

Tech blogs / publications

  - The New Stack: "OpenClaw vs Hermes Agent: race to build AI assistants that never forget"
  - every.to: agent commentary
  - latent.space: agent commentary
  - simonw blog: daily hands-on
  - Anthropic Eng Blog: Claude Code release notes


Full URL Source List

OpenClaw

Hermes Agent

Hermes vs OpenClaw vs Claude Code (head-to-head)

Claude Code / Coding-agent rankings

OpenHands

Devin (Cognition)

Goose (Block / AAIF)

Reddit / community pulse


Methodology Notes


Next Steps

  1. Create NotebookLM notebook with this seed + the curated URL set.
  2. Run NotebookLM Deep Research with a "deep web" query asking for:
     - Named-analyst commentary not yet in seed
     - Independent benchmark reproductions (especially Hermes 40% claim, Devin SWE-bench)
     - Migration case studies (OpenClaw → Hermes, Claude Code → Goose)
     - Security posture comparison (CVE histories)
     - Cost-modelling for a Luci-equivalent deployment on each
  3. Gap-analyse, iterate once if needed.
  4. Generate audio overview (richly framed for PKA decision), slide deck (visual matrix), and briefing report (CEO-style prescription).