/t · guide · subagents

Claude Code recursion and context isolation: one level deep, no grandchildren.

Matthew Diakonov, Written with AI

Published May 21, 20266 min read

Every guide on Claude Code subagents starts with the same sentence: "each subagent gets a fresh isolated 200k context window." True, and lovely. What those guides almost never finish is the rest: that isolation is exactly one level deep. The thing the subagent itself runs in, it cannot give to a grandchild. If you showed up here because you want recursive context isolation (a parent that delegates to a child, that delegates to its own child, and so on), the harness will refuse you at depth two.

This page is the short version of why, what the rule actually is, where it is enforced, what changes for your CLAUDE.md cost, and the four workarounds people use. In that order. Skim it on your phone.

1. The rule, in two sentences from Anthropic

Two doc pages say the same thing in two voices. The SDK page is terse and operational. The built-in-subagent page frames it as architectural intent.

docs: subagents cannot spawn subagents

"This prevents infinite nesting" is the giveaway about why. A recursive agent that mis-decomposes a task can fan out into a tree of cold sessions, each one rebuilding CLAUDE.md, MCP servers, and any preloaded skills. Anthropic capped that shape at depth one and let users pay for fan-out at depth one only.

2. Where the rule lives: the missing tool

The enforcement is not a flag, not a setting, not a check inside the model. It is the absence of one tool. A subagent inherits everything the parent has except the one tool that would let it call another subagent. Here is the list a subagent actually sees when you ask it.

subagent: available tools (Agent is missing on purpose)

GitHub issue anthropics/claude-code#4182 (filed July 23, 2025, closed as duplicate) is the public paper trail. The user listed the same allowlist, asked for Agent to be exposed to children, and got the duplicate-closure with no plan to flip the switch.

Adding tools: [Agent] to a subagent frontmatter does not unlock recursion either. The harness gates the tool above the model: the model can ask for it, the harness will not honor it.

~/.claude/agents/example.md

3. Claude Code vs VS Code Copilot vs the headless workaround

The fastest way to internalize the Claude Code position is the comparison. VS Code Copilot ships the opposite default. The claude -p headless trick lives outside the harness rules entirely.

Feature	VS Code Copilot subagents	Claude Code (in-session subagents)
Does a subagent inherit a fresh context window?	Yes (VS Code Copilot subagents also get a fresh window at each level).	Yes. Each subagent (Task call) starts with a fresh, isolated window. Parent conversation history is not visible.
Can a subagent itself spawn a subagent?	Yes (VS Code Copilot, when enabled). Maximum nesting depth: 5.	No. Agent tool is not in the subagent's allowlist. Officially documented restriction.
Where is the restriction enforced?	At the harness, with an explicit depth counter (capped at 5).	At the harness, by omitting Agent from the inherited tool set. The subagent literally cannot call it.
What if I add `tools: [Agent]` to the subagent frontmatter?	Configurable: VS Code exposes `chat.agent.subAgentDepth` as a setting.	The Agent SDK docs say do not do this. It does not unlock nesting; the harness gates the tool above the model.
Official workaround for nested work	Just enable the setting and keep nesting up to depth 5.	Use Skills (no own context), chain subagents from the main conversation, or shell out to `claude -p` headless from a Bash call (each call is a fresh main session, full cold load).
What the recursion ceiling means for CLAUDE.md cost	Unbounded up to 5 levels: every level rehydrates configs, MCP, and skills from scratch unless the agent framework dedups.	Bounded at depth 1: every subagent loads CLAUDE.md fresh once. Predictable. The 'claude -p' workaround removes the bound and removes the predictability.

4. What the recursion ceiling does for your CLAUDE.md cost

Here is the only good news in this whole topic. Because Claude Code caps fan-out at depth one, the cost story is linear, not exponential. Every subagent loads CLAUDE.md fresh exactly once, full stop. That is the upper bound. Anthropic publishes agent teams at roughly seven times a standard session, and that is the same shape: linear in fan-out.

cost math: bounded fan-out vs imagined recursion vs the headless workaround

The bound disappears the moment you reach for the workaround. A subagent that shells out to claude -p inside a Bash call is firing up a brand new main session every time, with the full CLAUDE.md, the full MCP boot, the full skill scan, and none of the harness depth checks. That is the path that turns a worker fan-out into a five-dollar tab in under a minute.

5. The four workarounds, ranked by how much they hurt

People who want a tree of work in Claude Code reach for one of four patterns. In order from "fits the model" to "buyer beware".

Skills (no own context, no recursion needed)

A skill is content injected into one context window, not its own session. If the work fits inside the same window, a skill is the cheapest path: no rehydration, no new MCP boot, no extra cold load. Use a skill when the answer is 'do this specific thing' rather than 'go explore'.

Flat fan-out from the parent (the official model)

The parent calls Agent N times in one turn. Each child runs in parallel at depth one. Anthropic puts agent teams at ~7x a standard session for this shape. It is bounded, predictable, and the per-child CLAUDE.md cost is exactly your analyzer number. This is the path the docs want you on.

Agent teams (multi-session, the cross-session shape)

Separate sessions that communicate. Distinct from in-session subagents. The teammate's context is its own and you wire the messages between them yourself. Sane for long-running parallel work, expensive for short bursts because each teammate is a full session.

claude -p headless from inside a Bash call (the unofficial 'recursion')

A subagent shells out to `claude -p` via Bash. Each call is a fresh main session: full CLAUDE.md load, full MCP boot, full skill scan, and no depth check from the harness. This is what every 'recursive Claude Code agent' blog post is doing under the hood. It works. It also turns a single fan-out into a $5 to $11 cold-load loop. Use only when you have measured.

6. What to check on your own CLAUDE.md

The recursion ceiling is fixed. The thing you control is what loads at every subagent cold start. Three checks, in order.

Per-turn token count. Paste your CLAUDE.md into the analyzer at ccmd.dev. Whatever number it prints is what every subagent pays once. If you fan out three children per turn, your floor is the number x three.
Lines that fire every turn but should not. The analyzer flags rules that the model silently skips on long sessions (over 28 words, missing-why, duplicates). These pay their full cost at every cold load and contribute nothing.
Preloaded skills. Every name in a subagent's skills field is injected in full at startup. Three preloaded skills per child times three children per parent is nine cold loads of skill content per turn. The recursion is blocked but this multiplier is not.

Want a second pair of eyes on your fan-out shape?

Bring your CLAUDE.md, your subagent files, and the worst fan-out you've actually used. 15 minutes, free. We walk through the cold-load number, what to cut, and where Skills can do the work a hypothetical grandchild would have done.

Frequently asked questions

Can a Claude Code subagent spawn its own subagent?

No. The official Agent SDK documentation states: 'Subagents cannot spawn their own subagents. Don't include Agent in a subagent's tools array.' The Agent tool (the one Claude uses to delegate to a subagent) is not in the inherited tool set for a subagent, and the docs explicitly tell you not to add it. The built-in subagent docs phrase the same rule as 'this prevents infinite nesting (subagents cannot spawn other subagents).' Verified against https://code.claude.com/docs/en/agent-sdk/subagents and https://code.claude.com/docs/en/sub-agents on 2026-05-21.

What tools does a subagent actually inherit?

Bash, Glob, Grep, LS, Read, Edit, MultiEdit, Write, NotebookRead, NotebookEdit, WebFetch, WebSearch, TodoWrite, ExitPlanMode, plus any MCP tools the parent has, and any Skill tool the parent has. Not in the list: Agent (the spawner) and Task (its legacy name). GitHub issue anthropics/claude-code#4182, filed July 23, 2025 and closed as duplicate, is the public reference for the omission. The user there documented the exact list and the closure carried no plan to expose Agent to children.

Is the context window of a subagent isolated from the parent?

Yes. A subagent starts with a fresh, separate 200k context window. It does not see your parent conversation history, the files Claude already read in the parent, the skills the parent already invoked, or the parent's tool results. The only channel from parent to child is the prompt string the parent writes when it calls Agent. All work the child does (tool calls, reasoning, partial output) stays inside the child's window. The parent gets the child's final summary, typically a few hundred tokens. That isolation is the whole reason subagents exist; it is also the whole reason the cost story is what it is: every child pays to rebuild its own context from scratch.

If recursion is blocked, how do people get nested work done?

Four patterns, in roughly decreasing alignment with the official model. (1) Skills: a skill is content injected into one context, not its own context, so it composes inside a single agent rather than spawning a child. Use a skill when the work fits in the same window. (2) Flat fan-out from the parent: the parent calls Agent multiple times in one turn; each child runs in parallel but at depth 1. (3) Agent teams (multi-session): separate sessions that communicate, distinct from the in-session subagent model. (4) Shell out to claude -p headless from a Bash tool inside a subagent. Each call is a full new main session: full CLAUDE.md load, MCP boot, skill scan. This is the unofficial 'recursion' path and the one that turns a session into a $5 to $11 cold-load loop.

How is this different from VS Code Copilot's subagents?

VS Code Copilot, when the setting is enabled, allows subagents to spawn their own subagents up to a maximum depth of 5. The harness counts depth and refuses past it. Claude Code does not expose a depth knob: it caps at one. The architectural reason is the same in both products (each level pays a full cold load), the policy choice is different. If you read a guide that says 'Claude Code supports recursive subagents,' it is conflating Claude Code with VS Code Copilot or with a plugin that uses claude -p under the hood.

Why does Anthropic block recursion if the SDK could allow it?

Two reasons match the public docs and behavior. (1) Cost: each subagent rebuilds CLAUDE.md, MCP servers, and any preloaded skills at startup. Unbounded recursion is unbounded cold-load tax, and the runaway shape is easy to write. The Plan subagent doc explicitly calls this out as the reason ('this prevents infinite nesting'). (2) Failure mode: a recursive agent that mis-decomposes a task can fan out into hundreds of parallel children before a human notices, which is the worst possible interaction with the weekly rate limit. The flat model forces the parent to reason about the shape of the work before delegating, which is cheap and obvious next to a depth-limited tree.

Does the Plan subagent or the Explore subagent break the rule?

No. Both Plan and Explore are the only built-in subagents that skip CLAUDE.md, MCP servers, git status, and the memory hierarchy at startup. They still cannot themselves spawn subagents; what is different about them is the loading shape, not the recursion ceiling. The Plan subagent doc page is the explicit one: 'When you're in plan mode and Claude needs to understand your codebase, it delegates research to the Plan subagent. This prevents infinite nesting (subagents cannot spawn other subagents) while still gathering necessary context.' Plan calling another Plan is not a thing.

What does this mean for the cost of my CLAUDE.md?

Predictable. Every flat fan-out you can write multiplies CLAUDE.md cost by the number of subagents in the round, and that is the upper bound. Anthropic puts agent teams at roughly seven times a standard session and that is the same shape: linear in fan-out, never quadratic, because there is no recursion. The number that matters for you is the per-turn CLAUDE.md token count, and that is the one ccmd's analyzer prints. Paste your file at ccmd.dev to get it; multiply by the worst fan-out you actually use to get the worst-case input cost.

Where in CCMD's analyzer does the per-turn token math live?

Lines 263 to 269 of src/lib/analyzer.ts. The math is: assume CLAUDE.md fires every turn, 30 turns per long session, $15 per million input tokens (Opus 4.7), one cold load per subagent. The analyzer prints the single-agent number; you multiply by your fan-out to get the bound under the no-recursion rule. The repo is m13v/ccmd-website on GitHub.