
Chapter 16: Sub-Agent and Multi-Agent Coordination

What You'll Learn

This chapter takes you deep into Claude Code's most powerful and architecturally sophisticated capability: multi-agent collaboration. When a task exceeds what a single agent can handle efficiently, Claude Code can dynamically spawn sub-agents, decompose the work, and run it in parallel. This is not simply "calling another AI" — it is a carefully engineered system for execution isolation, permission delegation, and result aggregation.

By the end of this chapter, you will understand:

  1. How AgentTool acts as the entry point for sub-agents, and the complete execution flow inside runAgent.ts
  2. The context forking mechanism: which state is cloned versus shared, and the engineering rationale behind each choice
  3. The fundamental difference between Coordinator mode and regular REPL mode, and how a coordinator manages a pool of workers
  4. The three swarm execution backends (tmux, iTerm2, in-process), their respective use cases, and the auto-detection logic
  5. The seven task types and their design boundaries — from local shell commands to remote agents
  6. How in-process workers escalate permission requests to their leader's terminal using a permission bridge protocol

16.1 From Single Agent to Multi-Agent: Why Collaboration Matters

Claude Code's core is the query() loop described in Chapter 5: the model emits tool calls, tools execute and return results, and the cycle repeats until the task is complete. This model works well for most cases, but it hits structural limits when facing:

Parallelism bottleneck: A single agent is strictly sequential. If a task's subtasks are independent of each other, forcing them to queue is wasteful.

Context window pressure: Investigating a large codebase may require reading dozens of files, accumulating tens of thousands of tokens. Packing every subtask into one conversation exhausts the context window sooner.

Permission scope reduction: Some subtasks (like read-only code investigation) do not need write access that the parent task might hold. Sub-agents provide a natural way to narrow the permission surface.

Work isolation: When parallel file modifications are needed, sub-agents can operate in separate git worktrees without interfering with each other.

Claude Code's solution is AgentTool, which lets the current agent (hereafter "parent agent" or "coordinator") spawn one or more sub-agents (hereafter "workers"). Each worker has its own conversation history and a configurable tool set and permission scope, but shares the underlying file system access with its parent.


16.2 AgentTool and runAgent: The Complete Sub-Agent Lifecycle

16.2.1 AgentTool's Role

tools/AgentTool/AgentTool.tsx is the tool entry point that the Claude model can invoke. When the model decides to spawn a sub-agent, it passes the task description, subagent_type (which determines which AgentDefinition to use), and an initial prompt to this tool.

AgentTool itself is a thin layer: it parses arguments, determines whether to execute synchronously (foreground) or asynchronously (background), looks up the corresponding AgentDefinition by subagent_type, and delegates execution to runAgent.ts.

16.2.2 The Core Flow of runAgent.ts

runAgent.ts is the heart of sub-agent execution — approximately 900 lines of code with a clear logical structure:

typescript
// src/tools/AgentTool/runAgent.ts (lines 248-329)
export async function* runAgent({
  agentDefinition,
  promptMessages,
  toolUseContext,
  canUseTool,
  isAsync,
  canShowPermissionPrompts,
  forkContextMessages,
  querySource,
  override,
  model,
  maxTurns,
  availableTools,
  allowedTools,
  onCacheSafeParams,
  contentReplacementState,
  useExactTools,
  worktreePath,
  description,
  transcriptSubdir,
  onQueryProgress,
}: { ... }): AsyncGenerator<Message, void>

This is an async generator function. It yields messages from the sub-agent one by one, allowing the caller (AgentTool) to process output as a stream.

Step 1: Initialize Agent Identity

typescript
// runAgent.ts (lines 347-348)
const agentId = override?.agentId ? override.agentId : createAgentId()

Every sub-agent gets a unique agentId (e.g., agent-a1b2c3d4). In resume scenarios, override.agentId carries the historical ID.

Step 2: Context Message Forking

typescript
// runAgent.ts (lines 370-378)
const contextMessages: Message[] = forkContextMessages
  ? filterIncompleteToolCalls(forkContextMessages)
  : []
const initialMessages: Message[] = [...contextMessages, ...promptMessages]

When forkContextMessages is present (the fork sub-agent scenario), the sub-agent inherits the parent's conversation history. filterIncompleteToolCalls removes any tool calls that lack corresponding tool_result entries, preventing API errors.
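The filtering step can be illustrated with a minimal sketch. The message shapes and the function name dropUnansweredToolCalls below are simplified assumptions made for this example; the real Message type and filterIncompleteToolCalls are richer and may differ in detail.

```typescript
// Hypothetical, simplified message shapes (not the real Message type).
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string }
  | { type: 'tool_result'; tool_use_id: string; content: string }

type Msg = { role: 'user' | 'assistant'; content: Block[] }

// Drop assistant messages that contain a tool_use with no matching tool_result.
function dropUnansweredToolCalls(messages: Msg[]): Msg[] {
  // First pass: collect the ids of every tool call that has a result.
  const answered = new Set<string>()
  for (const m of messages) {
    for (const b of m.content) {
      if (b.type === 'tool_result') answered.add(b.tool_use_id)
    }
  }
  // Second pass: keep a message unless it carries an unanswered tool_use.
  return messages.filter(
    m =>
      m.role !== 'assistant' ||
      m.content.every(b => b.type !== 'tool_use' || answered.has(b.id)),
  )
}

const history: Msg[] = [
  { role: 'assistant', content: [{ type: 'tool_use', id: 't1', name: 'Read' }] },
  { role: 'user', content: [{ type: 'tool_result', tool_use_id: 't1', content: 'ok' }] },
  // The parent was mid-call when it forked: t2 has no result yet.
  { role: 'assistant', content: [{ type: 'tool_use', id: 't2', name: 'Bash' }] },
]

const forkable = dropUnansweredToolCalls(history)
console.log(forkable.length) // 2
```

Without this step, the forked history would end in a tool call that was never answered, which is the kind of malformed request the text says the API rejects.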

Step 3: Resolving the Model and Permission Mode

typescript
// runAgent.ts (lines 340-345)
const resolvedAgentModel = getAgentModel(
  agentDefinition.model,
  toolUseContext.options.mainLoopModel,
  model,
  permissionMode,
)

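The model field in an agent definition can be "inherit" (use the parent agent's model) or a specific model alias. The permission mode is resolved next: an agent definition's own permission mode replaces the inherited one unless the parent is already in bypassPermissions or acceptEdits mode: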

typescript
// runAgent.ts (lines 415-430)
// If the parent is in bypassPermissions or acceptEdits mode, those
// are never overridden — parent authorization levels always take precedence.
if (
  agentPermissionMode &&
  state.toolPermissionContext.mode !== 'bypassPermissions' &&
  state.toolPermissionContext.mode !== 'acceptEdits'
) {
  toolPermissionContext = {
    ...toolPermissionContext,
    mode: agentPermissionMode,
  }
}
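
The precedence rule in this condition is easier to see as a pure function. This is a sketch only: effectiveMode is a hypothetical helper, and the PermissionMode union lists mode names for illustration without claiming to be complete.

```typescript
// Illustrative union; not claimed to match the real type exactly.
type PermissionMode =
  | 'default'
  | 'plan'
  | 'acceptEdits'
  | 'bypassPermissions'
  | 'bubble'

// The agent's own mode applies only when the parent is not already in a
// mode that must never be overridden.
function effectiveMode(
  parentMode: PermissionMode,
  agentMode?: PermissionMode,
): PermissionMode {
  const parentWins =
    parentMode === 'bypassPermissions' || parentMode === 'acceptEdits'
  return agentMode && !parentWins ? agentMode : parentMode
}

console.log(effectiveMode('default', 'plan'))     // plan
console.log(effectiveMode('acceptEdits', 'plan')) // acceptEdits
console.log(effectiveMode('default'))             // default
```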

Step 4: Building the agentGetAppState Closure

This is a key design pattern: instead of directly mutating AppState, runAgent creates a wrapped getAppState function. This wrapper is responsible for:

  • Overriding the permission mode on demand
  • Setting shouldAvoidPermissionPrompts: true for async agents (preventing unattended background agents from triggering permission dialogs)
  • Injecting sub-agent-specific allowedTools rules while preserving parent-level cliArg permissions
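
A minimal sketch of the wrapper pattern, assuming a heavily trimmed AppState shape. The names wrapGetAppState and childGetAppState are hypothetical, and the sketch replaces allowedTools wholesale instead of reproducing the preservation of cliArg permissions described above.

```typescript
// Hypothetical, trimmed-down state shapes for illustration only.
type ToolPermissionContext = {
  mode: string
  shouldAvoidPermissionPrompts: boolean
  allowedTools: string[]
}
type AppState = { toolPermissionContext: ToolPermissionContext }

// Returns a getter that layers sub-agent overrides over the live parent state
// without ever writing to that state.
function wrapGetAppState(
  getAppState: () => AppState,
  opts: { isAsync: boolean; allowedTools?: string[] },
): () => AppState {
  return () => {
    const state = getAppState() // read the current parent state on every call
    const ctx = state.toolPermissionContext
    return {
      ...state,
      toolPermissionContext: {
        ...ctx,
        shouldAvoidPermissionPrompts:
          opts.isAsync || ctx.shouldAvoidPermissionPrompts,
        allowedTools: opts.allowedTools ?? ctx.allowedTools,
      },
    }
  }
}

const parentState: AppState = {
  toolPermissionContext: {
    mode: 'default',
    shouldAvoidPermissionPrompts: false,
    allowedTools: ['Read', 'Edit'],
  },
}
const childGetAppState = wrapGetAppState(() => parentState, {
  isAsync: true,
  allowedTools: ['Read'],
})

console.log(childGetAppState().toolPermissionContext.allowedTools.join(',')) // Read
console.log(parentState.toolPermissionContext.allowedTools.join(','))        // Read,Edit
```

Because the overrides live in the closure, the parent's state object is untouched and the overrides disappear with the sub-agent.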

Step 5: Creating the Sub-Agent's ToolUseContext

typescript
// runAgent.ts (lines 700-714)
const agentToolUseContext = createSubagentContext(toolUseContext, {
  options: agentOptions,
  agentId,
  agentType: agentDefinition.agentType,
  messages: initialMessages,
  readFileState: agentReadFileState,
  abortController: agentAbortController,
  getAppState: agentGetAppState,
  // Sync agents share setAppState with the parent
  shareSetAppState: !isAsync,
  // Both sync and async contribute to response metrics
  shareSetResponseLength: true,
  contentReplacementState,
})

Step 6: Recursively Calling the query() Loop

typescript
// runAgent.ts (lines 748-806)
for await (const message of query({
  messages: initialMessages,
  systemPrompt: agentSystemPrompt,
  userContext: resolvedUserContext,
  systemContext: resolvedSystemContext,
  canUseTool,
  toolUseContext: agentToolUseContext,
  querySource,
  maxTurns: maxTurns ?? agentDefinition.maxTurns,
})) {
  // Forward stream events to parent's metrics display
  if (message.type === 'stream_event' && ...) {
    toolUseContext.pushApiMetricsEntry?.(message.ttftMs)
    continue
  }
  // Record conversation to sidechain transcript file
  if (isRecordableMessage(message)) {
    await recordSidechainTranscript([message], agentId, lastRecordedUuid)
    yield message
  }
}

This reveals the architectural elegance of Claude Code: sub-agents and parent agents use the exact same query() function (see Chapter 5). There is no "second-class citizen" code path for sub-agents. Both are equivalent at the API level.

Step 7: Cleanup in the finally Block

Regardless of whether a sub-agent completes normally, is aborted, or throws an exception, the finally block always runs:

  • Closes any MCP server connections registered by this sub-agent
  • Clears session hooks registered by this sub-agent
  • Releases the cloned file state cache
  • Removes the todos entry (preventing AppState memory leaks)
  • Terminates background bash tasks spawned by this sub-agent
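
Why a finally block inside a generator is a dependable place for this cleanup can be shown with a small sketch. It uses a synchronous generator for brevity (runAgent itself is an async generator), and runWithCleanup and the log entries are hypothetical stand-ins.

```typescript
const cleanupLog: string[] = []

// Hypothetical stand-in for a sub-agent run.
function* runWithCleanup(messages: string[]): Generator<string, void> {
  try {
    for (const m of messages) yield m
  } finally {
    // Runs on normal completion, on errors, and when the consumer stops early.
    cleanupLog.push('close MCP connections', 'clear hooks', 'kill background tasks')
  }
}

for (const m of runWithCleanup(['a', 'b', 'c'])) {
  if (m === 'b') break // consumer stops early, for example after an abort
}

console.log(cleanupLog.length) // 3
```

Breaking out of the loop calls the generator's return() method, which resumes it inside the try block and runs finally. The same language guarantee applies to for await loops over async generators.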

16.3 Context Forking: What Is Shared, What Is Isolated

createSubagentContext() (in utils/forkedAgent.ts) is the key to understanding agent isolation. It follows the principle of "isolated by default, explicitly shared when needed."

readFileState (file read cache): Cloned or created fresh, never shared. When forkContextMessages is present (the context inheritance scenario), the cache is cloned from the parent, so the sub-agent's file read history starts from the parent's current state and diverges from there. Otherwise an empty cache is created.

AbortController: A child controller is created. When the parent agent aborts, the sub-agent is also aborted (cascading abort). But when the sub-agent itself aborts, the parent is unaffected.
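
One-way cascading of this kind can be built on the standard AbortController API. The helper below is a sketch under that assumption; its name and exact behavior are illustrative, not taken from the source.

```typescript
// Create a controller that aborts when the parent aborts, but not vice versa.
function createChildAbortController(parent: AbortController): AbortController {
  const child = new AbortController()
  if (parent.signal.aborted) {
    child.abort() // parent already cancelled: start out aborted
  } else {
    parent.signal.addEventListener('abort', () => child.abort(), { once: true })
  }
  return child
}

const parentCtl = new AbortController()
const workerA = createChildAbortController(parentCtl)
const workerB = createChildAbortController(parentCtl)

workerA.abort()
const parentAfterChildAbort = parentCtl.signal.aborted
console.log(parentAfterChildAbort) // false

parentCtl.abort()
console.log(workerB.signal.aborted) // true
```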

getAppState: A wrapped version. By default, the AppState that a sub-agent reads will have shouldAvoidPermissionPrompts: true forced in. The agentGetAppState closure additionally injects permission mode overrides and tool allowlists at runtime.

setAppState: The design here is subtle. Synchronous sub-agents (isAsync=false) share setAppState with the parent, because they run inline in the parent's execution flow and their state changes should be immediately visible to the parent. For async sub-agents, setAppState is a no-op, with one important exception:

typescript
// forkedAgent.ts (lines 416-417)
setAppStateForTasks:
  parentContext.setAppStateForTasks ?? parentContext.setAppState,

Task registration and cleanup (setAppStateForTasks) always routes to the root-level AppState writer, even when the regular setAppState is a no-op. This ensures that background bash tasks launched by async sub-agents are properly registered and cleaned up, preventing zombie processes.
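
The effect of the ?? fallback shows up most clearly with nested async agents. The following sketch uses hypothetical, simplified types (State, Ctx, forkContext) to show that a grandchild's task writes still reach the root even though every regular setAppState along the way is a no-op.

```typescript
type State = { tasks: string[]; note: string }
type SetState = (update: (prev: State) => State) => void
type Ctx = { setAppState: SetState; setAppStateForTasks?: SetState }

let rootState: State = { tasks: [], note: '' }
const rootCtx: Ctx = {
  setAppState: update => {
    rootState = update(rootState)
  },
}

// Hypothetical, reduced version of the context fork.
function forkContext(parent: Ctx, shareSetAppState: boolean): Ctx {
  return {
    setAppState: shareSetAppState ? parent.setAppState : () => {},
    // Prefer the parent's task writer; fall back to its regular writer at the root.
    setAppStateForTasks: parent.setAppStateForTasks ?? parent.setAppState,
  }
}

const asyncChild = forkContext(rootCtx, false)
const asyncGrandchild = forkContext(asyncChild, false)

asyncGrandchild.setAppState(s => ({ ...s, note: 'dropped' }))
asyncGrandchild.setAppStateForTasks?.(s => ({ ...s, tasks: [...s.tasks, 'bash-1'] }))

console.log(rootState.note === '')     // true
console.log(rootState.tasks.join(',')) // bash-1
```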

messages (conversation history): The sub-agent has its own independent conversation history. This is the most important isolation dimension — a sub-agent does not "see" the parent's conversation with the user, only the task prompt sent to it (plus optional inherited context).


16.4 Coordinator Mode: Making Claude the Project Manager

16.4.1 What Is Coordinator Mode

Coordinator mode is activated via the environment variable CLAUDE_CODE_COORDINATOR_MODE=1, detected by coordinator/coordinatorMode.ts:

typescript
// coordinator/coordinatorMode.ts (lines 36-41)
export function isCoordinatorMode(): boolean {
  if (feature('COORDINATOR_MODE')) {
    return isEnvTruthy(process.env.CLAUDE_CODE_COORDINATOR_MODE)
  }
  return false
}

In coordinator mode, Claude receives a completely different system prompt (getCoordinatorSystemPrompt()), positioning it as a "software engineering task orchestrator" rather than a "direct executor." The system prompt explicitly defines the coordinator's role:

  • Spawn workers via AgentTool
  • Follow up with existing workers via SendMessageTool
  • Stop misdirected workers via TaskStopTool
  • Answer user questions directly (do not delegate everything)

16.4.2 Task Notifications: Workers Reporting to the Coordinator

When a worker completes, its result is injected into the coordinator's conversation history as a user-role message in XML format:

xml
<task-notification>
<task-id>agent-a1b2c3</task-id>
<status>completed</status>
<summary>Agent "Investigate auth Bug" completed</summary>
<result>Found null pointer at src/auth/validate.ts:42. Session.user is
undefined when sessions expire but the token remains cached...</result>
<usage>
  <total_tokens>24580</total_tokens>
  <tool_uses>18</tool_uses>
  <duration_ms>45230</duration_ms>
</usage>
</task-notification>

The coordinator's system prompt explicitly tells the model: these task-notification messages come from the user role but are not real user input — they are internal signals, and the coordinator must never "thank" or "acknowledge" them as if they were human messages.
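
For readers who want to inspect these notifications in logs or tests, a small parser is enough. The helper below is a hypothetical reader-side sketch, not part of Claude Code; it assumes the flat tag layout shown above and uses regular expressions instead of a full XML parser.

```typescript
type TaskNotification = { taskId: string; status: string; summary: string }

// Extract the text between <tag> and </tag>, if present.
function readTag(xml: string, tag: string): string | undefined {
  const match = xml.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`))
  return match?.[1]?.trim()
}

// Returns undefined for anything that is not a well-formed notification,
// which is how a caller could tell it apart from real user input.
function parseTaskNotification(xml: string): TaskNotification | undefined {
  if (!xml.includes('<task-notification>')) return undefined
  const taskId = readTag(xml, 'task-id')
  const status = readTag(xml, 'status')
  const summary = readTag(xml, 'summary')
  if (!taskId || !status || !summary) return undefined
  return { taskId, status, summary }
}

const sample = `<task-notification>
<task-id>agent-a1b2c3</task-id>
<status>completed</status>
<summary>Agent "Investigate auth Bug" completed</summary>
</task-notification>`

const parsed = parseTaskNotification(sample)
console.log(parsed?.taskId, parsed?.status) // agent-a1b2c3 completed
console.log(parseTaskNotification('please fix the login page')) // undefined
```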

16.4.3 The Power of Parallelism

One of the most important guidelines in coordinator mode is written into the system prompt itself:

typescript
// coordinator/coordinatorMode.ts (line 213)
// "PARALLELISM IS YOUR SUPERPOWER. Workers are async.
//  Launch independent workers concurrently whenever possible —
//  don't serialize work that can run simultaneously."

In a single API response, the coordinator can emit multiple AgentTool tool call blocks, which will be executed concurrently. A typical parallel research task looks like:

# A coordinator's single response (pseudocode)
AgentTool({ description: "Investigate auth module",
            prompt: "Find null pointer candidates in src/auth/..." })
AgentTool({ description: "Research test coverage",
            prompt: "Find all tests related to src/auth/..." })

"Investigating from two angles simultaneously — I'll report back with findings."

Worker concurrency rules (from the system prompt):

  • Read-only tasks (research): run in parallel freely
  • Write-heavy tasks (implementation): one worker per set of files at a time
  • Verification can sometimes run alongside implementation on different areas
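
These rules are prompt guidance followed by the model, not code. Stated precisely, they amount to a file-set overlap check, sketched here with a hypothetical WorkerTask shape and canRunConcurrently helper.

```typescript
// Hypothetical description of a worker task; writes = [] means read-only.
type WorkerTask = { description: string; writes: string[] }

// Two tasks may run together unless they would write to the same file.
function canRunConcurrently(a: WorkerTask, b: WorkerTask): boolean {
  const bWrites = new Set(b.writes)
  return !a.writes.some(path => bWrites.has(path))
}

const research: WorkerTask = { description: 'Investigate auth module', writes: [] }
const fixAuth: WorkerTask = { description: 'Fix null check', writes: ['src/auth/validate.ts'] }
const refactorAuth: WorkerTask = { description: 'Rename helper', writes: ['src/auth/validate.ts'] }
const fixDocs: WorkerTask = { description: 'Update README', writes: ['README.md'] }

console.log(canRunConcurrently(research, fixAuth))     // true
console.log(canRunConcurrently(fixAuth, refactorAuth)) // false
console.log(canRunConcurrently(fixAuth, fixDocs))      // true
```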

16.4.4 The Synthesize-Then-Delegate Principle

The coordinator system prompt repeatedly emphasizes this core working pattern: after workers return research findings, the coordinator must understand and synthesize those findings itself before writing the next execution prompt.

typescript
// coordinator/coordinatorMode.ts (lines 255-259)
// Anti-pattern — lazy delegation:
// AgentTool({ prompt: "Based on your findings, fix the auth bug" })
//
// Good — synthesized spec:
// AgentTool({ prompt: "Fix the null pointer in src/auth/validate.ts:42.
//   The user field on Session is undefined when Session.expired is true
//   but the token is still cached. Add a null check before user.id access
//   — if null, return 401 with 'Session expired'..." })

This is not just a style recommendation — it is a necessary condition for workers to complete tasks independently, because workers cannot see the coordinator's conversation history with the user.


16.5 Swarm Architecture: Unified Abstraction Over Multiple Backends

"Swarm" is Claude Code's internal name for the Teammate feature — multiple Claude instances working as team members, each visualized as a separate terminal pane.

16.5.1 Three Backends

The swarm architecture provides three backends, managed in utils/swarm/backends/:

typescript
// utils/swarm/backends/types.ts (lines 8-10)
export type BackendType = 'tmux' | 'iterm2' | 'in-process'

tmux backend: Uses tmux's split-pane capability to launch a separate claude process in each new pane. Suitable for use in tmux sessions or GUI-less terminal environments.

iTerm2 backend: Uses iTerm2's native Python API (via the it2 CLI tool) to create native split-pane layouts in macOS's iTerm2 terminal. Provides the best visual experience.

in-process backend: Runs multiple agents within the same Node.js process, using AsyncLocalStorage for context isolation. No subprocess overhead, but requires sharing resources within the main process.

16.5.2 Auto-Detection Logic

registry.ts implements a clearly prioritized detection flow:

typescript
// utils/swarm/backends/registry.ts (lines 136-253)
// Priority:
// 1. If inside tmux → use tmux (even if inside iTerm2, tmux takes priority)
// 2. If inside iTerm2 with it2 CLI available → use native iTerm2 backend
// 3. If inside iTerm2 but no it2, and tmux available → use tmux as fallback
// 4. If tmux available (external session mode) → use tmux
// 5. Otherwise: throw error with installation instructions

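The detection result is cached because the runtime environment does not change during the process lifetime. Whether teammates run in-process instead of in panes is decided separately by isInProcessEnabled():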

typescript
// registry.ts (lines 351-389)
export function isInProcessEnabled(): boolean {
  // Non-interactive sessions (-p mode) force in-process
  // since tmux-based teammates don't make sense without a terminal UI
  if (getIsNonInteractiveSession()) {
    return true
  }
  const mode = getTeammateMode()
  if (mode === 'in-process') return true
  if (mode === 'tmux') return false
  // 'auto' mode: not inside tmux and not inside iTerm2 → use in-process
  return !isInsideTmuxSync() && !isInITerm2()
}
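
The five-step priority list for pane backends can be condensed into a pure function. This is a sketch: the Env fields, the detectPaneBackend name, and the error text are assumptions for illustration, and the real registry performs asynchronous environment probes and caches the result.

```typescript
type PaneBackend = 'tmux' | 'iterm2'

// Hypothetical snapshot of the environment probes.
type Env = {
  insideTmux: boolean
  insideITerm2: boolean
  it2Available: boolean
  tmuxAvailable: boolean
}

function detectPaneBackend(env: Env): PaneBackend {
  if (env.insideTmux) return 'tmux'                         // rule 1
  if (env.insideITerm2 && env.it2Available) return 'iterm2' // rule 2
  if (env.tmuxAvailable) return 'tmux'                      // rules 3 and 4
  // rule 5: nothing usable
  throw new Error('No pane backend available: install tmux, or use iTerm2 with the it2 CLI')
}

const insideBoth: Env = { insideTmux: true, insideITerm2: true, it2Available: true, tmuxAvailable: true }
const itermOnly: Env = { insideTmux: false, insideITerm2: true, it2Available: true, tmuxAvailable: false }
const itermNoCli: Env = { insideTmux: false, insideITerm2: true, it2Available: false, tmuxAvailable: true }

console.log(detectPaneBackend(insideBoth)) // tmux
console.log(detectPaneBackend(itermOnly))  // iterm2
console.log(detectPaneBackend(itermNoCli)) // tmux
```

Rules 3 and 4 collapse into a single branch because both resolve to tmux whenever it is installed and rule 2 did not match.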

16.5.3 TeammateExecutor: The Unified Interface

Regardless of which backend is in use, calling code always operates through the TeammateExecutor interface:

typescript
// utils/swarm/backends/types.ts (lines 279-300)
export type TeammateExecutor = {
  readonly type: BackendType
  isAvailable(): Promise<boolean>
  spawn(config: TeammateSpawnConfig): Promise<TeammateSpawnResult>
  sendMessage(agentId: string, message: TeammateMessage): Promise<void>
  terminate(agentId: string, reason?: string): Promise<boolean>
  kill(agentId: string): Promise<boolean>
  isActive(agentId: string): Promise<boolean>
}

InProcessBackend (backends/InProcessBackend.ts) implements this interface. Its spawn() method:

  1. Calls spawnInProcessTeammate() to create a TeammateContext and register it in AppState
  2. Calls startInProcessTeammate() to start the agent execution loop in the background (fire-and-forget)
  3. Returns agentId, taskId, and abortController

16.5.4 In-Process vs. Subprocess: The Trade-off

The choice between in-process and subprocess execution represents a classic trade-off:

In-process teammates share the same Node.js heap, meaning they can share MCP connections without duplication, and can interact with the leader's React state directly through the permission bridge. The downside is that a runaway in-process agent could theoretically affect the main process — though the AbortController provides a clean termination mechanism.

Subprocess teammates (tmux/iTerm2) are fully isolated operating system processes. They communicate only through the file-based mailbox protocol. This is the safest isolation model, but carries subprocess startup overhead and requires separate API connections.

The in-process model becomes the default in non-interactive environments (-p mode, SDK usage) precisely because subprocess-based visual panes serve no purpose when there is no terminal UI to display them.


16.6 Task System: Unified Lifecycle Management

16.6.1 Overview of Task Types

Task.ts defines the base types for all tasks in the system:

typescript
// Task.ts (lines 6-14)
export type TaskType =
  | 'local_bash'           // Local shell commands (Bash tool)
  | 'local_agent'          // Background agents (AgentTool async mode)
  | 'remote_agent'         // Remote agents (via CCR protocol)
  | 'in_process_teammate'  // In-process teammates (Swarm feature)
  | 'local_workflow'       // Local workflow execution
  | 'monitor_mcp'          // MCP monitor tasks
  | 'dream'                // Dream tasks (experimental)

All tasks share the same state machine:

typescript
// Task.ts (lines 15-19)
export type TaskStatus =
  | 'pending'    // Not yet started
  | 'running'    // Currently executing
  | 'completed'  // Finished normally
  | 'failed'     // Execution failed
  | 'killed'     // Forcibly terminated

Task IDs carry type prefixes for easier debugging: b for bash tasks, a for local agents, r for remote agents, t for in-process teammates.
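
A short sketch of how the prefixes and the state machine can be used when reading logs. It assumes the prefix is the first character of the ID, which the text does not confirm, and it maps only the four prefixes named above instead of guessing the rest. The sample IDs are made up.

```typescript
type TaskType =
  | 'local_bash'
  | 'local_agent'
  | 'remote_agent'
  | 'in_process_teammate'
  | 'local_workflow'
  | 'monitor_mcp'
  | 'dream'

type TaskStatus = 'pending' | 'running' | 'completed' | 'failed' | 'killed'

// Only the prefixes named in the text; other task types are left unmapped.
const PREFIX_TO_TYPE: Record<string, TaskType> = {
  b: 'local_bash',
  a: 'local_agent',
  r: 'remote_agent',
  t: 'in_process_teammate',
}

// Assumption: the type prefix is the first character of the task ID.
function taskTypeFromId(taskId: string): TaskType | undefined {
  return PREFIX_TO_TYPE[taskId.charAt(0)]
}

// completed, failed, and killed are end states; nothing follows them.
function isTerminal(status: TaskStatus): boolean {
  return status === 'completed' || status === 'failed' || status === 'killed'
}

console.log(taskTypeFromId('b7f3k2')) // local_bash
console.log(taskTypeFromId('x00000')) // undefined
console.log(isTerminal('running'))    // false
console.log(isTerminal('killed'))     // true
```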

16.6.2 LocalAgentTask: State Management for Background Agents

tasks/LocalAgentTask/LocalAgentTask.tsx is the state management center for background agents (the async execution mode of AgentTool):

typescript
// LocalAgentTask.tsx (lines 116-148)
export type LocalAgentTaskState = TaskStateBase & {
  type: 'local_agent'
  agentId: string
  prompt: string
  agentType: string
  abortController?: AbortController
  error?: string
  result?: AgentToolResult
  progress?: AgentProgress    // Tool use count, token count, recent activities
  retrieved: boolean
  messages?: Message[]        // Optional conversation history (for transcript view)
  isBackgrounded: boolean     // false = running in foreground, true = backgrounded
  pendingMessages: string[]   // Messages queued by SendMessageTool
  retain: boolean             // Whether UI is currently displaying this task
  diskLoaded: boolean         // Whether the transcript JSONL has been loaded into memory
  evictAfter?: number         // Expiration timestamp (for GC)
}

The progress field contains an AgentProgress view aggregated from the ProgressTracker: tool use count, cumulative input/output tokens (tracked separately to avoid double-counting, since the Claude API's input_tokens is cumulative per turn), and the last 5 tool activity records.

When a task completes, enqueueAgentNotification() injects a structured <task-notification> XML message into the coordinator's conversation queue:

typescript
// LocalAgentTask.tsx (lines 252-262)
const message = `<${TASK_NOTIFICATION_TAG}>
<${TASK_ID_TAG}>${taskId}</${TASK_ID_TAG}>
<${OUTPUT_FILE_TAG}>${outputPath}</${OUTPUT_FILE_TAG}>
<${STATUS_TAG}>${status}</${STATUS_TAG}>
<${SUMMARY_TAG}>${summary}</${SUMMARY_TAG}>${resultSection}${usageSection}
</${TASK_NOTIFICATION_TAG}>`
enqueuePendingNotification({ value: message, mode: 'task-notification' })

16.6.3 InProcessTeammateTask: Unique State for In-Process Teammates

Unlike background agents, in-process teammates (InProcessTeammateTask) have distinctive state fields:

  • identity: Contains agentId (format: agentName@teamName), team name, display color
  • pendingUserMessages: Queue of user messages waiting to be injected (for sending messages from outside into a teammate)
  • shutdownRequested: Whether a graceful shutdown has been requested
  • messages: Conversation history with a cap (used for the "zoom view")
  • abortController: The teammate's independent abort controller

Communication between teammates uses the file-based mailbox (teammateMailbox.ts) — even for in-process teammates. This keeps the communication protocol identical regardless of the execution backend.


16.7 Permission Proxy: How Workers Escalate to the Leader

16.7.1 The Core Problem

Async background agents cannot display permission dialogs in the terminal — the user might be interacting with the main interface, and having a permission dialog suddenly pop up from a background agent would be disorienting. So by default, background agents have shouldAvoidPermissionPrompts: true, which auto-denies any permission-requiring operations.

But in-process teammates are an exception. Although they also run asynchronously, they are visible in the same terminal window and can reuse the main REPL's permission dialog mechanism.

16.7.2 leaderPermissionBridge.ts: The Cross-Component Permission Channel

utils/swarm/leaderPermissionBridge.ts is the core of this mechanism:

typescript
// leaderPermissionBridge.ts (lines 1-54)

// The Leader (main REPL) registers its permission queue setter at startup
let registeredSetter: SetToolUseConfirmQueueFn | null = null
let registeredPermissionContextSetter: SetToolPermissionContextFn | null = null

export function registerLeaderToolUseConfirmQueue(
  setter: SetToolUseConfirmQueueFn,
): void {
  registeredSetter = setter
}

export function getLeaderToolUseConfirmQueue(): SetToolUseConfirmQueueFn | null {
  return registeredSetter
}

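The workflow: the leader (main REPL) registers its permission queue setter at startup. When an in-process teammate needs approval, the runner retrieves that setter with getLeaderToolUseConfirmQueue() and uses it to push the request into the leader's permission dialog queue.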

This design uses a module-level singleton rather than passing state through React props or context. This is necessary: the in-process runner is non-React code, but it needs to trigger updates in React components (the permission dialog). A module-level reference is the most direct way to bridge these two worlds.
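
The bridge pattern can be reproduced in a few lines. In this sketch the request shape, the shortened function names, and the plain variable standing in for React state are all assumptions made for illustration.

```typescript
type ConfirmRequest = { agentId: string; tool: string }
type QueueSetter = (update: (prev: ConfirmRequest[]) => ConfirmRequest[]) => void

// Module-level singleton: set once by the UI, read by non-UI code.
let registeredSetter: QueueSetter | null = null

function registerLeaderQueue(setter: QueueSetter): void {
  registeredSetter = setter
}

function getLeaderQueue(): QueueSetter | null {
  return registeredSetter
}

// Non-UI side: an in-process teammate asks the leader for approval.
function requestPermissionFromLeader(req: ConfirmRequest): boolean {
  const setQueue = getLeaderQueue()
  if (!setQueue) return false // no leader UI registered; caller must fall back
  setQueue(prev => [...prev, req])
  return true
}

// UI side: a plain variable stands in for the React state behind the dialog.
let dialogQueue: ConfirmRequest[] = []

const beforeRegistration = requestPermissionFromLeader({ agentId: 'researcher@team', tool: 'Bash' })
registerLeaderQueue(update => {
  dialogQueue = update(dialogQueue)
})
const afterRegistration = requestPermissionFromLeader({ agentId: 'researcher@team', tool: 'Bash' })

console.log(beforeRegistration, afterRegistration, dialogQueue.length) // false true 1
```

The null check matters: a getter that can return null forces the caller to handle the case where no leader UI exists, such as a non-interactive session.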

16.7.3 The 'bubble' Permission Mode

Fork sub-agents (forkSubagent.ts) have their permission mode set to 'bubble':

typescript
// forkSubagent.ts (lines 60-71)
export const FORK_AGENT = {
  agentType: FORK_SUBAGENT_TYPE,
  tools: ['*'],
  maxTurns: 200,
  model: 'inherit',
  permissionMode: 'bubble',  // Permission requests bubble up to parent terminal
  source: 'built-in',
  getSystemPrompt: () => '',
}

bubble mode means permission requests "bubble up" to the parent terminal, where the parent's permission dialog handles them. This differs from the in-process teammate approach in that fork sub-agents share the parent's conversation context.

The check in runAgent.ts handles this correctly:

typescript
// runAgent.ts (lines 439-451)
// For agents in bubble mode, prompts should still be shown
// (they bubble to parent terminal, not auto-denied)
const shouldAvoidPrompts =
  canShowPermissionPrompts !== undefined
    ? !canShowPermissionPrompts
    : agentPermissionMode === 'bubble'
      ? false   // bubble mode: never avoid prompts
      : isAsync  // default: async agents avoid prompts
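
The nested ternary is compact but easy to misread. The same decision, rewritten as early returns in a hypothetical standalone function, makes the three cases explicit.

```typescript
// Same precedence as the nested ternary: explicit flag, then bubble mode, then isAsync.
function shouldAvoidPrompts(opts: {
  canShowPermissionPrompts?: boolean
  agentPermissionMode?: string
  isAsync: boolean
}): boolean {
  // 1. An explicit caller decision always wins.
  if (opts.canShowPermissionPrompts !== undefined) {
    return !opts.canShowPermissionPrompts
  }
  // 2. Bubble mode forwards prompts to the parent terminal, so never avoid them.
  if (opts.agentPermissionMode === 'bubble') return false
  // 3. Default: only async agents avoid prompts.
  return opts.isAsync
}

console.log(shouldAvoidPrompts({ isAsync: true }))                                  // true
console.log(shouldAvoidPrompts({ isAsync: true, agentPermissionMode: 'bubble' }))   // false
console.log(shouldAvoidPrompts({ isAsync: false, canShowPermissionPrompts: false })) // true
```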

16.8 Fork Sub-Agents: The Highest Level of Context Inheritance

forkSubagent.ts implements a special sub-agent spawning mode: Fork.

16.8.1 What Is a Fork

A regular AgentTool sub-agent starts from a blank slate (seeing only the prompt sent to it). A fork sub-agent inherits the parent's complete conversation history, then receives a "directive" to execute a specific task.

typescript
// forkSubagent.ts (lines 107-168)
export function buildForkedMessages(
  directive: string,
  assistantMessage: AssistantMessage,
): MessageType[] {
  // Building strategy:
  // [...parent history, assistant(all tool_uses), user(placeholder results..., directive)]
  // Only the final directive text block differs between fork children.
  // This maximizes prompt cache hit rates.
}

The key insight: all tool results use the same placeholder text ("Fork started — processing in background"), and only the final directive block is unique to each fork child. This ensures the API request prefix is byte-identical, fully leveraging Claude's prompt caching.
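
The cache-friendly layout can be demonstrated with a sketch of the final user message alone. The Block type and buildForkUserBlocks are simplified assumptions; the real buildForkedMessages also carries the parent history and the assistant message.

```typescript
type Block =
  | { type: 'tool_result'; tool_use_id: string; content: string }
  | { type: 'text'; text: string }

// \u2014 is the dash in the placeholder text quoted above.
const PLACEHOLDER = 'Fork started \u2014 processing in background'

// Identical placeholder results first, the child-specific directive last.
function buildForkUserBlocks(toolUseIds: string[], directive: string): Block[] {
  const results = toolUseIds.map(
    (id): Block => ({ type: 'tool_result', tool_use_id: id, content: PLACEHOLDER }),
  )
  return [...results, { type: 'text', text: directive }]
}

const forkA = buildForkUserBlocks(['t1', 't2'], 'Investigate the auth module')
const forkB = buildForkUserBlocks(['t1', 't2'], 'Audit test coverage')

// Everything except the last block serializes identically for both children.
const prefixA = JSON.stringify(forkA.slice(0, -1))
const prefixB = JSON.stringify(forkB.slice(0, -1))
console.log(prefixA === prefixB) // true
```

Had each child's tool results contained child-specific text, the two requests would diverge at the first result block and the shared prefix would end there.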

16.8.2 Preventing Recursive Forking

A fork sub-agent's system prompt explicitly says "do not spawn further sub-agents," but relying solely on prompt-level constraints is fragile. There is an explicit code-level guard:

typescript
// forkSubagent.ts (lines 78-89)
export function isInForkChild(messages: MessageType[]): boolean {
  return messages.some(m => {
    if (m.type !== 'user') return false
    const content = m.message.content
    if (!Array.isArray(content)) return false
    return content.some(
      block =>
        block.type === 'text' &&
        block.text.includes(`<${FORK_BOILERPLATE_TAG}>`),
    )
  })
}

By scanning the conversation history for the fork boilerplate tag, this guard reliably detects whether the current agent is in a fork child role — even after autocompact (context compression) has rewritten the message structure.


16.9 Agent-Level MCP Server Injection

Agent definitions can declare their own MCP servers in frontmatter, which are appended on top of the parent agent's MCP connections:

typescript
// runAgent.ts (lines 648-665)
const {
  clients: mergedMcpClients,
  tools: agentMcpTools,
  cleanup: mcpCleanup,
} = await initializeAgentMcpServers(
  agentDefinition,
  toolUseContext.options.mcpClients,
)

// Agent-specific MCP tools are merged with resolved tools (deduplicated by name)
const allTools = agentMcpTools.length > 0
  ? uniqBy([...resolvedTools, ...agentMcpTools], 'name')
  : resolvedTools

initializeAgentMcpServers() distinguishes between two ways of referencing MCP servers:

  1. By name (string): Reuses the parent agent's existing connection (memoized). No new connection is created, and it is not closed when the agent finishes.
  2. Inline definition (object): Creates a new connection, closed via cleanup() when the agent completes.

This embodies the "isolate on demand, share resources" design principle: if an MCP server is shared infrastructure, reuse it; if it is agent-specific, independently manage its lifecycle.
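
The ownership rule (only close what you opened) can be sketched as follows. The types, the synchronous connect callback, and the server names are hypothetical simplifications; the real initializeAgentMcpServers is asynchronous, and this sketch silently skips names the parent does not have.

```typescript
type InlineServer = { name: string; command: string }
type McpServerRef = string | InlineServer
type Client = { name: string; close: () => void }

function resolveAgentMcpServers(
  refs: McpServerRef[],
  parentClients: Map<string, Client>,
  connect: (def: InlineServer) => Client,
) {
  const clients: Client[] = []
  const owned: Client[] = [] // connections this agent created and must close
  for (const ref of refs) {
    if (typeof ref === 'string') {
      const shared = parentClients.get(ref) // by name: reuse, never close
      if (shared) clients.push(shared)
    } else {
      const created = connect(ref) // inline definition: new connection
      clients.push(created)
      owned.push(created)
    }
  }
  return { clients, cleanup: () => owned.forEach(c => c.close()) }
}

const closed: string[] = []
const makeClient = (name: string): Client => ({
  name,
  close: () => {
    closed.push(name)
  },
})

const parentClients = new Map<string, Client>([['github', makeClient('github')]])
const resolved = resolveAgentMcpServers(
  ['github', { name: 'scratch-db', command: 'hypothetical-mcp-server' }],
  parentClients,
  def => makeClient(def.name),
)

resolved.cleanup()
console.log(resolved.clients.length) // 2
console.log(closed.join(','))        // scratch-db
```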


16.10 Lean Sub-Agents: Reducing Unnecessary Token Consumption

At scale (millions of agent invocations), every token consumed by each agent matters. runAgent.ts contains several careful optimization strategies:

Slimming CLAUDE.md: Read-only exploration agents (Explore, Plan) do not need the commit guidelines, PR requirements, and other directives in CLAUDE.md, so they skip it:

typescript
// runAgent.ts (lines 388-396)
const shouldOmitClaudeMd =
  agentDefinition.omitClaudeMd &&
  !override?.userContext &&
  getFeatureValue_CACHED_MAY_BE_STALE('tengu_slim_subagent_claudemd', true)

Slimming git status: Explore and Plan agents do not need the git status snapshot from the start of the parent session (potentially 40KB), so it is actively filtered out:

typescript
// runAgent.ts (lines 404-410)
const resolvedSystemContext =
  agentDefinition.agentType === 'Explore' ||
  agentDefinition.agentType === 'Plan'
    ? systemContextNoGit
    : baseSystemContext

Disabling extended thinking: Sub-agents disable extended thinking by default (thinkingConfig: { type: 'disabled' }). Only fork sub-agents (useExactTools=true path) inherit the parent's thinking configuration, because prompt cache hits require byte-identical API request prefixes.


16.11 Design Philosophy: Why This Layering

Looking back at the entire multi-agent architecture, several design decisions run consistently throughout:

Unified execution path: Sub-agents and parent agents use the same query() function. There is no "second-class citizen" execution path built specifically for sub-agents. This reduces code complexity and ensures sub-agents benefit from every improvement made to the main loop.

Declarative isolation: createSubagentContext() explicitly declares which resources need to be shared via parameters (shareSetAppState, shareAbortController, etc.). This is clearer than implicit inheritance and makes it easier to reason about the side-effect scope of a sub-agent.

Selective leanness: Different types of sub-agents have different "weights." Read-only exploration agents omit CLAUDE.md and git status; thinking is disabled by default. This is not about simplifying code — it is about controlling costs at a scale of tens of millions of agent invocations.

Value of backend abstraction: The TeammateExecutor interface lets tmux, iTerm2, and in-process execution share the same calling code. This enables in-process mode to serve as a natural fallback for GUI-less environments or SDK call scenarios without modifying the higher-level code at all.


Key Takeaways

This chapter systematically disassembled every layer of Claude Code's multi-agent collaboration architecture.

From the AgentTool entry point to the runAgent.ts execution core, launching a sub-agent is a complete flow involving context forking, permission isolation, MCP server injection, and sidechain transcript recording. The central design principle is "isolated by default" — each sub-agent has its own conversation history, its own AbortController, and its own file cache, but precisely shares necessary resources through well-defined interfaces.

Coordinator mode (coordinatorMode.ts) transforms Claude from an executor into a dispatcher. It receives workers' task-notification reports, synthesizes findings, and produces precise directives for the next step. Parallelism is its core value — multiple independent workers can be launched concurrently in a single response.

The swarm architecture (utils/swarm/) unifies three execution backends through the TeammateExecutor interface: tmux's classic split-pane approach, iTerm2's native interface, and the GUI-free in-process mode. Backend detection is automatic, selecting the backend that best matches the current runtime environment.

The task system (tasks/) provides a unified lifecycle framework for all background operations: pending → running → completed/failed/killed. LocalAgentTask manages ordinary background agents; InProcessTeammateTask manages in-process teammates. Both use enqueueAgentNotification() to send structured completion notifications to the coordinator.

The permission bridge (leaderPermissionBridge.ts) addresses the problem that background agents cannot display permission dialogs. In-process teammates obtain the permission queue setter registered by the main REPL through a module-level singleton and route their permission requests to the leader's terminal interface.

Understanding this architecture gives you the foundational knowledge behind Claude Code's evolution from a single conversational assistant to a parallel engineering orchestration system.
