/t · guide · subagents

Subagent CLAUDE.md inheritance cost: you inherit the bill, not the brain

Matthew Diakonov, Written with AI

Published May 20, 20267 min read

The word "inheritance" flatters this design. In an OOP sense inheritance lets the child reuse the parent's state. In Claude Code, a subagent inherits the parent's static configuration (CLAUDE.md hierarchy, tool schemas, MCP servers, model) and inherits noneof the parent's productive state (conversation history, files already read, skills already invoked, accumulated reasoning). You pay for the setup. You do not get the brain.

0tokens, echo subagent (inherited)

0tokens, echo subagent (no inherit)

0xratio, identical work

0frontmatter fields to tune it

What a subagent actually loads at startup

The canonical source is one section of the Claude Code sub-agents page. It is worth reading verbatim, because the cost story falls out of these six bullets cleanly:

code.claude.com/docs/en/sub-agents — What loads at startup

Two sentences carry most of the cost weight. The first: every level of the memory hierarchy the main conversation loads. That is not a summary, not a delta, not a cached handoff. It is a fresh, full read of every layer of your CLAUDE.md stack. The second: there is no frontmatter field or per-agent setting to change which agents skip them. The only escape hatches are the two named built-ins.

What inherits and what does not

A field-by-field breakdown of what a subagent gets from the parent on startup. The first five rows are the asymmetry that matters: your static config crosses over and your accumulated work does not.

Feature	Every other subagent	Explore and Plan
~/.claude/CLAUDE.md (user-level)	Loaded fresh, full file	Skipped (Explore, Plan only)
Project CLAUDE.md + nested CLAUDE.md	Loaded fresh, full hierarchy	Skipped (Explore, Plan only)
CLAUDE.local.md	Loaded fresh, full file	Skipped (Explore, Plan only)
Managed policy files	Loaded fresh	Skipped (Explore, Plan only)
Git status snapshot	Loaded fresh	Skipped (Explore, Plan only)
Tools (Read, Bash, MCP, etc.)	Inherited unless tools/disallowedTools restrict	Inherited unless tools/disallowedTools restrict
Model (Opus, Sonnet, Haiku)	Inherits from parent unless model field overrides	Inherits from parent unless model field overrides
Conversation history (parent's turns)	Not inherited (fork is the only exception)	Not inherited (fork is the only exception)
Files the parent has already read	Not inherited	Not inherited
Skills already invoked in parent	Not inherited (skills field can preload statically)	Not inherited (skills field can preload statically)
Auto-memory built during the parent session	Not inherited	Not inherited

Field-by-field, sourced from the official sub-agents docs (section: What loads at startup). The fork subagent is the one universal exception: it inherits the parent conversation in its entirety.

The bottom four rows are the cost asymmetry. The conversation history, the files the parent already read, the skills already invoked, the auto-memory the parent built up: none of it transfers. So the subagent re-loads a CLAUDE.md that tells it how to behave, then has to redo the work of figuring out what the codebase looks like, because the parent's exploration does not survive.

The 60k vs 3k benchmark

The clearest evidence in the wild comes from a GitHub issue on the Claude Code repo. The author built the smallest subagent they could imagine — a one-line worker whose entire job was to run echo success — and measured token usage in two environments. With normal parent context (a user CLAUDE.md and a project CLAUDE.md present), the subagent consumed about 60,000 input tokens. With no user or project memory at all, it consumed about 3,000.

echo success subagent, with vs without inheritance

Twenty times the input bill for identical functional output. The work the subagent did was 50 tokens of echo. The other ~57,000 tokens were the parent's static configuration ferried across the boundary. That ratio is the upper bound of what your CLAUDE.md hierarchy costs you per subagent invocation, and it scales linearly: cut the hierarchy in half and the inherited-payload half of the bill halves with it.

The unconfigurable bit

The supported frontmatter fields for a subagent are description, prompt, tools, disallowedTools, model, permissionMode, mcpServers, hooks, maxTurns, skills, initialPrompt, memory, effort, background, isolation, and color. None of them control whether the subagent reads your CLAUDE.md hierarchy.

The community has noticed. The most relevant open issue is anthropics/claude-code#6825, "Allow configurable inheritance of system prompt and memory for subagents." It proposes an includes field modeled after the existing tools field, with values like System, User, Project, and Subagent. As of this writing the issue is open, labeled enhancement, with no visible Anthropic response. Until something like it ships, the two-named-built-ins exception is the only knob.

What to actually do about it

Four levers, in order of how much they move the bill:

Subagent cost levers, in order

Use Agent({ subagent_type: "Explore" }) for read-only codebase search. CLAUDE.md is skipped by design.
Use Agent({ subagent_type: "Plan" }) for drafting an approach in plan mode. CLAUDE.md is also skipped.
Restrict tools or disallowedTools to slim the inherited tool schema (each MCP server description ships at startup).
Cut your CLAUDE.md hierarchy itself: every byte you remove is a byte every custom subagent stops paying for, on every invocation.
Run ccmd.dev on each layer in the hierarchy (~/.claude/CLAUDE.md, project, CLAUDE.local.md) and prune the lines that fire every turn but Claude has been ignoring.

The fourth lever is the only one that compounds. Cutting your CLAUDE.md hierarchy from 6,000 tokens to 1,500 saves 4,500 tokens on every single subagent invocation you make for the rest of the project's life, and the same 4,500 tokens on every turn of the parent session itself. The other three levers help on one call at a time. The structural one stacks.

The sibling pages on this site that bracket this topic: CLAUDE.md cost with parallel agents covers the multiplier when you fan out to teammates, and context cost scaling explains why the parent session's own bill grows quadratically before any subagent enters the picture.

Want us to look at your subagent bill?

Paste your CLAUDE.md hierarchy and a /usage screenshot. We'll show you which lines every subagent invocation is paying for and which ones you can cut without breaking parent behavior.

Frequently asked questions

Does a subagent inherit my CLAUDE.md cost?

Yes for every custom subagent, and yes for every built-in subagent except Explore and Plan. The official docs spell it out under "What loads at startup" on the sub-agents page: a non-fork subagent's initial context contains "CLAUDE.md and memory: every level of the memory hierarchy the main conversation loads, including ~/.claude/CLAUDE.md, project rules, CLAUDE.local.md, and managed policy files." The load is fresh; there is no cached cross-subagent reuse. The cost is the full token count of every level you have configured, paid once per subagent invocation.

What does NOT get inherited?

The productive state. Anthropic's own line is "Each subagent starts with a fresh, isolated context window. It does not see your conversation history, the skills you've already invoked, or the files Claude has already read." The parent's accumulated reasoning, the files it has cracked open, the tool results it has paged through, the auto-memory it has built up during the session — none of that crosses over. You pay for the static setup and lose the dynamic state. The one exception is fork, which inherits the parent conversation in its entirety; everything else starts blank.

How big is the inheritance tax in practice?

GitHub issue anthropics/claude-code#6825 has the most-cited benchmark. A tiny subagent whose only job was to run echo success consumed about 60,000 tokens because it inherited the full parent memory and context. The same agent, in a clean environment with no user or project memory, consumed about 3,000 tokens. Twenty times the input cost for identical work. The delta is the CLAUDE.md hierarchy plus the inherited tool schemas. The 3,000-token floor is roughly the system-prompt overhead of running a Claude instance at all.

Can I configure which layers a subagent inherits?

Not today. The sub-agents docs are explicit: "Explore and Plan are the only subagents that omit CLAUDE.md and git status. There is no frontmatter field or per-agent setting to change which agents skip them." Issue #6825, opened in August 2025, proposes a configurable includes field with values like System, User, Project, and Subagent, but as of this writing it is still labeled enhancement with no Anthropic response visible. The only knobs you have are tools and disallowedTools (slim the inherited tool list), model (route to Haiku to drop the per-token rate), and the structural one — make the CLAUDE.md hierarchy smaller so every subagent pays less.

Why does Explore exist if the inheritance is the problem?

Because Anthropic engineered the two cases that hurt most as named exceptions. Explore is read-only codebase search; Plan is approach-drafting in plan mode. Both are the kinds of calls a main session makes constantly during a feature build, and both are exactly the calls where your CLAUDE.md rules about implementation style add cost without adding value. Skipping CLAUDE.md on those two specifically waives the per-call tax in the highest-traffic path. The docs note Anthropic's stated reason verbatim: "to keep research fast and inexpensive." If a piece of work fits Explore or Plan, route it there and the inheritance bill goes to zero on the CLAUDE.md line.

Does the fan-out multiplier on parallel agents still apply on top of inheritance?

Yes. Inheritance is the per-call tax; the multiplier is what happens when you make N calls. A custom subagent pays for CLAUDE.md once per invocation. An agent team pays for it once per teammate, every turn each teammate runs — Anthropic's costs page puts agent teams at roughly seven times a standard session for that reason. Background agents and git worktrees are full sessions, so they each pay the full per-turn cost the parent pays. The cheapest pattern is the union of three habits: prefer Explore and Plan, keep your CLAUDE.md hierarchy small, and only fan out to teammates when you actually need parallel implementation work, not parallel research. See our sibling page on CLAUDE.md cost with parallel agents for the full multiplier table.

How do I audit what each subagent invocation is actually paying for?

Start from the top of the hierarchy and work down. List every CLAUDE.md file the parent session would load (~/.claude/CLAUDE.md, the project CLAUDE.md, every nested CLAUDE.md in any directory Claude has read, CLAUDE.local.md, managed policy). For each one, paste it into ccmd.dev to get a per-file token count and a rubric score for which lines actually earn their tokens. The sum of those token counts is the floor every custom subagent invocation pays before doing any work. Cut whatever fires every turn but Claude has been ignoring — those are the cheapest tokens to remove because no behavior changes when they go.