Scout Report - Claude Code Skill and Subagent Dispatch Reliability

Generated 2026-04-18 08:05  ·  PKA / Lucienne

Scott Quick Scout   2026-04-17   Consumer: Lucienne direct

Question: What is the Claude Code community actually doing to improve the reliability of skill and subagent dispatch - so a main session reliably delegates to the right specialist instead of handling work itself?

Scope: Hook patterns (all event types), existing plugins, native auto-dispatch reliability, what failed, dedicated-context conventions. Not SA-specific. No substance recommendation.

Time-box: 30 minutes. Stop condition hit: 4 Tier-1, 6 Tier-2, 3 trending, 4 counterpoint positions found. Confidence overall: High on hooks mechanics (Anthropic primary docs); Medium on failure-mode data (limited indexed practitioner negative feedback).

TL;DR

Sources - Tier 1 (Primary / Authoritative)

  1. Claude Code Hooks Reference - Anthropic official docs
     Credibility: Primary Anthropic documentation. Verified 2026-04-17. 21 lifecycle events, 4 handler types, full JSON schema.
     Key extract: 21 hook events across session, per-turn, tool-call, subagent, worktree lifecycle. UserPromptSubmit and SessionStart inject via stdout/additionalContext (10KB cap). PreToolUse can block via permissionDecision:deny. Exit code 2 = blocking for most events but SILENTLY DROPPED in subagent tool calls. SubagentStart injects context into spawned agents.

  2. Create Custom Subagents - Anthropic official docs
     Credibility: Primary Anthropic documentation. Covers subagent isolation model, context windows, permission profiles.
     Key extract: Each subagent runs in its own context window with independent system prompt and tool access. SubagentStart hook can inject additionalContext into spawned agents. Context isolation is architectural, not hook-dependent.

  3. Hook Development Skill - anthropics/claude-code repo
     Credibility: Anthropic-maintained. Canonical hook-building patterns.
     Key extract: Covers handler types (command/prompt/agent), matcher syntax, deduplication rules, and the distinction between stdout injection vs. stderr error paths. Agent-type hook handlers can invoke full subagents for deep verification.

  4. Superpowers v5.0.7 - inspected directly at ~/.claude/plugins/cache/claude-plugins-official/superpowers/5.0.7/hooks/
     Credibility: Installed plugin source. claude-plugins-official. Verified 2026-04-17.
     Key extract: Uses exactly ONE hook: SessionStart. Injects the full using-superpowers SKILL.md wrapped in EXTREMELY_IMPORTANT tags via hookSpecificOutput.additionalContext. THE RULE injected: if there is even a 1% chance a skill applies, invoke it. No PreToolUse hooks. No dispatch enforcement beyond context injection. Platform-aware bash (Cursor / Claude Code / Copilot branching).
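
The blocking path noted under source 1 (PreToolUse with permissionDecision:deny) can be sketched as a small command hook. This is a minimal illustration, not the documented reference implementation: the `tool_name` input field and the `hookSpecificOutput` output shape reflect my reading of the hooks reference, while `DELEGATED_TOOLS` and the reason text are hypothetical. Per source 7, a deny decision may be ignored for MCP calls, so this should not be read as a guarantee.

```python
import json
from typing import Optional

# Hypothetical policy: tools the main session should not call directly.
DELEGATED_TOOLS = {"Write", "Edit"}


def decide(payload: dict) -> Optional[dict]:
    """Return a PreToolUse deny decision, or None to let the call proceed."""
    tool = payload.get("tool_name", "")
    if tool not in DELEGATED_TOOLS:
        return None
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": (
                f"{tool} should be delegated to a specialist subagent."
            ),
        }
    }


if __name__ == "__main__":
    # Demo with a synthetic payload; a real hook would use json.load(sys.stdin).
    sample = {"tool_name": "Edit", "tool_input": {"file_path": "app.py"}}
    decision = decide(sample)
    if decision is not None:
        print(json.dumps(decision))
```

Returning None (and printing nothing) leaves the tool call to the normal permission flow, which keeps the hook from interfering with tools it has no opinion about.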

Sources - Tier 2 (Reputable Secondary)

  5. claude-code-workflow-orchestration - GitHub
     Credibility: Public practitioner plugin. Directly relevant - designed for dispatch enforcement. 2026.
     Key extract: Graduated-reminder pattern: PreToolUse emits escalating stderr hints (silent -> hint -> warning -> strong) when main agent bypasses /workflow-orchestrator:delegate. Counter resets each user turn, zeros on delegation. 14 hooks across 6 lifecycle events. Note: dedicated delegation-orchestrator agent was DEPRECATED - replaced by native plan mode. Lesson: dedicated orchestrator agents add overhead vs. native plan mode.

  6. agent-dispatch skill - GitHub
     Credibility: Public practitioner skill. Platform-agnostic (Claude Code, Cursor, Codex). 2026.
     Key extract: TOML keyword index (2k tokens) maps task keywords to agent names and GitHub URLs. Main session carries only the index; full agent SKILL.md fetched and cached on first match. Limitation: each keyword maps to exactly one agent (TOML constraint). Requires internet for initial fetch. This is a routing pattern, not a hook-based enforcement pattern.

  7. 190 Things Claude Code Hooks Cannot Enforce - DEV Community
     Credibility: Practitioner post with systematic gap analysis. March 2026. [single-source on some specifics - treat carefully]
     Key extract: Six gap categories: (1) hooks don't fire in pipe/bare/cowork/VSCode-stop/certain-worktree modes; (2) hooks fire but are ignored - MCP calls ignore deny, subagent tool calls drop exit-2, updatedInput ignored for agent tools; (3) platform bugs (concurrent session corruption, exec permission stripping); (4) architectural gaps - hooks only at tool-call boundary; (5) model routes around blocked tools; (6) security bypasses (wildcard injection, bypassPermissions). Recommended alternative: OS-level controls.

  8. Claude Code in Production: What Actually Works - Herashchenko
     Credibility: Named practitioner, production deployment. April 2026. [single-source on cost figures]
     Key extract: (a) disable-model-invocation:true preserves human control over risky skills; (b) CLAUDE_CODE_SUBAGENT_MODEL=haiku = ~60% cost reduction on investigation tasks; (c) research subagents return SUMMARIES ONLY, never raw file contents - prevents context saturation; (d) removed broad auto-triggering, uses explicit invocation except high-frequency patterns. Removed advisory CLAUDE.md rules for critical workflows - insufficient alone.

  9. Context Optimization: 54% reduction - johnlindquist, GitHub Gist
     Credibility: Named practitioner with specific methodology. 2026. [single-source on specific numbers]
     Key extract: Replaced verbose 10KB upfront skill docs with 3KB trigger tables. Skill() lazy-loading: Claude reads descriptions at startup (~50 tokens), loads full SKILL.md only on invocation. Initial context 7,584 to 3,434 tokens. Key insight: Claude needs to know WHEN to invoke a skill; the protocol is deferred.

  10. Superpowers: the Claude Code plugin that enforces what you should do - ddewhurst.com
      Credibility: Named practitioner, extended plugin review. 2026. Corroborates own plugin source inspection.
      Key extract: Hard gates in v4.3+ block implementation code until design approved. Overhead: 10-20 min on small tasks. Token cost: five sub-tasks x 50k+ tokens from context duplication (corroborated by Willison). No controlled A/B data on whether hook enforcement outperforms a well-written instruction document.
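
As a quick sanity check on the figures quoted in sources 9 and 10, the arithmetic can be reproduced directly. The function names below are illustrative only; the inputs are the numbers reported above (7,584 to 3,434 tokens, and five sub-tasks at 50k+ tokens each, which gives a lower bound rather than an exact total).

```python
def percent_reduction(before: int, after: int) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100.0


def min_dispatch_tokens(sub_tasks: int, tokens_per_task: int) -> int:
    """Lower bound on total tokens when each sub-task uses at least `tokens_per_task`."""
    return sub_tasks * tokens_per_task


if __name__ == "__main__":
    # Source 9: initial context 7,584 -> 3,434 tokens.
    print(f"{percent_reduction(7584, 3434):.1f}% reduction")  # 54.7% reduction
    # Source 10: five sub-tasks x 50k+ tokens each.
    print(f"at least {min_dispatch_tokens(5, 50_000):,} tokens")  # at least 250,000 tokens
```

The computed 54.7% is consistent with the "54% reduction" in the gist title.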

Sources - Trending / Recent

What the Community Is Actually Doing

Pattern 1: SessionStart context injection (the superpowers pattern)

Dominant approach. Inject routing/dispatch rules into every session before the first turn. Superpowers wraps the skills manifest in EXTREMELY_IMPORTANT tags. The rules are injected up front rather than left in a CLAUDE.md whose contents may be compacted away. This is what Lucienne already has from superpowers - it covers the baseline.
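
A minimal sketch of this pattern as a command hook, assuming the `hookSpecificOutput.additionalContext` output shape and the 10KB cap listed under source 1. Superpowers itself uses bash and injects a full SKILL.md; the Python form, the `ROUTING_RULES` text, and the exact byte limit used here are illustrative assumptions.

```python
import json

# Assumption: the 10KB cap from the hooks reference, taken conservatively as 10,000 bytes.
MAX_CONTEXT_BYTES = 10_000

# Hypothetical routing rules; real text would come from a skills manifest.
ROUTING_RULES = """\
Dispatch rules:
- Architecture or system design questions: delegate to the Atlas subagent.
- If a skill might apply, invoke it before answering directly.
"""


def build_session_start_output(rules: str) -> dict:
    """Wrap the rules in emphasis tags and package them as SessionStart context."""
    wrapped = f"<EXTREMELY_IMPORTANT>\n{rules}</EXTREMELY_IMPORTANT>"
    if len(wrapped.encode("utf-8")) > MAX_CONTEXT_BYTES:
        raise ValueError("injected context exceeds the assumed 10KB cap")
    return {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": wrapped,
        }
    }


if __name__ == "__main__":
    # The hook's stdout is what gets injected into the session.
    print(json.dumps(build_session_start_output(ROUTING_RULES)))
```

Failing loudly on oversize input is a design choice for the sketch: a silently truncated rule set would be harder to notice than a hook that errors at session start.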

Pattern 2: UserPromptSubmit dynamic injection

Fires on every user turn. Can inject dynamic context (current task type, agent roster hints) via additionalContext. More expensive than SessionStart but survives compaction gaps because it re-fires each turn. Use case: detect keywords in the incoming prompt and inject targeted dispatch reminders (e.g., if prompt contains 'architecture' or 'system design', inject 'Atlas handles this - dispatch him').
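
The keyword-detection use case could look roughly like the following. It assumes the hook receives the user's text in a `prompt` field on stdin and returns context via `hookSpecificOutput.additionalContext`; the `REMINDERS` table is hypothetical and reuses the Atlas example from the paragraph above.

```python
import json
from typing import Optional

# Hypothetical keyword -> reminder table.
REMINDERS = {
    "architecture": "Atlas handles this - dispatch him.",
    "system design": "Atlas handles this - dispatch him.",
}


def build_reminder(payload: dict) -> Optional[dict]:
    """Return UserPromptSubmit context for matched keywords, or None if nothing matches."""
    prompt = payload.get("prompt", "").lower()
    # A set removes duplicate reminders when several keywords map to the same agent.
    hits = sorted({msg for keyword, msg in REMINDERS.items() if keyword in prompt})
    if not hits:
        return None
    return {
        "hookSpecificOutput": {
            "hookEventName": "UserPromptSubmit",
            "additionalContext": "\n".join(hits),
        }
    }


if __name__ == "__main__":
    # Demo with a synthetic payload; a real hook would use json.load(sys.stdin).
    result = build_reminder({"prompt": "Review the system design for the sync service"})
    if result is not None:
        print(json.dumps(result))
```

Emitting nothing on a miss keeps the per-turn cost close to zero for prompts that need no routing hint.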

Pattern 3: Graduated PreToolUse reminders (the barkain/workflow-orchestration pattern)

Instead of hard-blocking direct tool use, emit escalating stderr nudges when the main session does work that should be delegated. Counter-per-turn, resets on delegation. Allows emergency direct access without killing the session. The deprecated delegation-orchestrator agent teaches a lesson: dedicated orchestrator agents add overhead; native plan mode is cheaper for the orchestration half.
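
The escalation logic can be sketched independently of the plugin. This is not the barkain source: the thresholds, the `TurnState` class, and the assumption that delegation shows up as a `Task` tool call are all hypothetical, chosen only to show the silent -> hint -> warning -> strong ladder with its two reset points.

```python
from dataclasses import dataclass

LEVELS = ["silent", "hint", "warning", "strong"]
# Hypothetical thresholds: direct-call count at which each level begins.
THRESHOLDS = [0, 2, 4, 6]
# Assumption: delegation is visible to the hook as this tool name.
DELEGATION_TOOLS = {"Task"}


@dataclass
class TurnState:
    direct_calls: int = 0


def level_for(count: int) -> str:
    """Return the highest level whose threshold the count has reached."""
    level = LEVELS[0]
    for name, start in zip(LEVELS, THRESHOLDS):
        if count >= start:
            level = name
    return level


def on_pre_tool_use(state: TurnState, tool_name: str) -> str:
    """Update the counter and return the nudge text ('' means stay silent)."""
    if tool_name in DELEGATION_TOOLS:
        state.direct_calls = 0  # counter zeros on delegation
        return ""
    state.direct_calls += 1
    level = level_for(state.direct_calls)
    if level == "silent":
        return ""
    return f"[{level}] {state.direct_calls} direct tool calls this turn; consider delegating."


def on_user_prompt(state: TurnState) -> None:
    """Reset the counter at the start of each user turn."""
    state.direct_calls = 0


if __name__ == "__main__":
    state = TurnState()
    for tool in ["Read", "Edit", "Edit", "Edit", "Task", "Edit"]:
        print(repr(on_pre_tool_use(state, tool)))
```

In a real hook each invocation is a separate process, so the counter would need to be persisted (for example in a per-session file); that part, and how the nudge is surfaced to the model, is left out here.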

Pattern 4: Compact keyword index + lazy skill loading

Main context carries only a 2k-token trigger table (TOML or inline). Full SKILL.md loaded on-demand on keyword match. Architecturally different from hook enforcement - it is a context-efficiency pattern: the compact index keeps every skill's trigger in context, so fewer skills are missed simply because they were never in context to begin with.
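
A rough sketch of the index-plus-lazy-load idea, using a plain dict in place of the TOML file so the example stays dependency-free. The keywords, agent names other than Atlas, the `LazySkillLoader` class, and the injected `fetch` callable are all hypothetical; the real skill fetches SKILL.md files from GitHub URLs.

```python
from typing import Callable, Dict, Optional

# Hypothetical index; as in the TOML original, each keyword maps to exactly one agent.
KEYWORD_INDEX: Dict[str, str] = {
    "architecture": "atlas",
    "system design": "atlas",
    "citation": "librarian",
}


class LazySkillLoader:
    """Keep only the keyword index resident; fetch full skill text on first use."""

    def __init__(self, index: Dict[str, str], fetch: Callable[[str], str]) -> None:
        self._index = index
        self._fetch = fetch  # e.g. a function that downloads the agent's SKILL.md
        self._cache: Dict[str, str] = {}

    def match(self, prompt: str) -> Optional[str]:
        """Return the first agent whose keyword appears in the prompt, if any."""
        text = prompt.lower()
        for keyword, agent in self._index.items():
            if keyword in text:
                return agent
        return None

    def load(self, agent: str) -> str:
        """Fetch the agent's skill text once, then serve it from the cache."""
        if agent not in self._cache:
            self._cache[agent] = self._fetch(agent)
        return self._cache[agent]


if __name__ == "__main__":
    loader = LazySkillLoader(KEYWORD_INDEX, fetch=lambda name: f"# SKILL.md for {name}")
    agent = loader.match("Sketch the architecture for the ingest pipeline")
    if agent is not None:
        print(loader.load(agent))
```

Passing `fetch` in as a parameter keeps the network dependency (the "requires internet for initial fetch" limitation) out of the routing logic and easy to stub in tests.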

Pattern 5: Dedicated context per agent via SubagentStart hook

SubagentStart injects additionalContext into each spawned agent's context. Main session holds routing logic only; subagents get full domain context at spawn time. Practitioners return summaries only from research subagents, never raw file contents - prevents context saturation in the parent.
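
One way this could be wired up is sketched below. The `agent_type` input field is an assumption about the SubagentStart payload, and the `DOMAIN_CONTEXT` table and its wording are hypothetical; only the use of `additionalContext` and the summaries-only convention come from the sources above.

```python
import json
from typing import Optional

# Hypothetical per-agent domain context, including the summaries-only convention.
DOMAIN_CONTEXT = {
    "atlas": (
        "You are the architecture specialist. "
        "Return a summary of findings only, never raw file contents."
    ),
}


def build_subagent_context(payload: dict) -> Optional[dict]:
    """Return SubagentStart context for a known agent, or None for unknown agents."""
    agent = payload.get("agent_type", "")  # assumed field name
    context = DOMAIN_CONTEXT.get(agent)
    if context is None:
        return None
    return {
        "hookSpecificOutput": {
            "hookEventName": "SubagentStart",
            "additionalContext": context,
        }
    }


if __name__ == "__main__":
    # Demo with a synthetic payload; a real hook would use json.load(sys.stdin).
    result = build_subagent_context({"agent_type": "atlas"})
    if result is not None:
        print(json.dumps(result))
```

Keeping the domain text in the hook rather than the main session's context is what lets the parent hold routing logic only.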

What failed / was removed

Plugin Registry Status

Installed on PKA: superpowers@claude-plugins-official v5.0.7 - ONE hook (SessionStart), context injection only, no PreToolUse enforcement. Not installed: barkain/workflow-orchestration, userFRM/agent-dispatch. No overlap between superpowers and those two - they address different enforcement layers. Adding either would be additive, not duplicative.

Counterpoints / Contrary Views

View A: Hook enforcement is a leaky abstraction (strongest objection)

The systematic gap analysis (DEV Community, 2026) documents hooks silently failing across pipe mode, bare mode, subagent tool-call boundaries, MCP tool calls, and compaction. The recommended alternative for safety-critical subagent scenarios is OS-level controls, not hooks. Practitioners use hooks despite knowing this - they treat hooks as roughly 95% coverage, not 100%.

View B: No A/B evidence that enforcement beats good instructions

The ddewhurst/superpowers review notes that no controlled comparison of hook-enforced dispatch vs. a well-written instruction document exists. Superpowers adds 10-20 minutes of overhead on small tasks. The philosophical question - 'is this meaningfully different from a carefully worded CLAUDE.md?' - has not been empirically answered by the community.

View C: Token cost of subagent dispatch is a real deterrent

Simon Willison observed five sub-tasks each consuming 50,000+ tokens from duplicated context. Aggressive dispatch enforcement may solve drift at the cost of creating token bills that make the system economically impractical for routine work. The graduated-reminder pattern (nudge, don't force) exists partly as an economic compromise.

View D: Compaction defeats hook-injected context anyway

Context injected by SessionStart hooks is subject to compaction - the same problem hooks were supposed to solve relative to CLAUDE.md. The only reliable escape is UserPromptSubmit hooks that fire every turn, at the cost of per-turn overhead. This makes the SessionStart-only pattern less robust than it appears.

No credible 'hooks are entirely wrong' position found

The debate is 'hooks are incomplete, know the gaps' vs. 'hooks are enforcement, CLAUDE.md is guidance.' Not 'hooks vs. nothing.' The critics are detailed and technically specific, which makes the consensus feel genuine rather than captured.

Gaps / What I Could Not Find

Handoff Note

Destined for: Lucienne direct.

Suggested next steps (topic, not recommendation):

  1. Examine the barkain/workflow-orchestration source code - it is the closest existing open-source implementation to what PKA needs for Atlas dispatch enforcement, including the graduated-reminder pattern and the lesson about native plan mode replacing orchestrator agents.
  2. Evaluate whether a UserPromptSubmit hook (keyword detection -> inject Atlas reminder) is sufficient before building a full PreToolUse escalation system - it may be cheaper and good enough.
  3. Consider the agent-dispatch TOML routing skill as a zero-hook alternative - keyword-based routing at skill-invocation time rather than hook-enforcement time.
  4. If a custom hook is built: test specifically in subagent contexts and pipe mode, where the community has documented silent failures. These are the failure modes that will otherwise surprise PKA at 2am.