Layered CLAUDE.md token cost: the file you pasted is one layer of the bill.
You pasted your CLAUDE.md into an analyzer, saw a token number, and felt either fine or guilty. Either way the number was wrong, because it measured one layer of a stack. Claude Code does not load a CLAUDE.md. It walks up the directory tree and concatenates every CLAUDE.md it finds, expands every @import, and prepends your machine-wide user file, all before it reads a line of your code. This page is the arithmetic nobody else does: how to add up the layers and get the real per-turn cost.
1. The analyzer scores one file
Start with the tool, because the tool is honest about its scope. ccmd's analyzer, the one behind the textarea on ccmd.dev, takes a single string and returns a single score.
That is the right unit of analysis. A file is a file. But it means the token number you get back is the cost of that file, not the cost of opening Claude Code in that repo. Those are different numbers, and on most machines they differ by more than 10x. The analyzer is not hiding the rest of the stack from you. It simply never saw it, because you only pasted one layer.
2. ccmd's own repo: a 3-token paste, a ~4,980-token floor
The cleanest demonstration is the repository this site is built in. Its project CLAUDE.md is 11 bytes. The whole file:
@AGENTS.md
One line. That is the canonical pattern from Anthropic's memory docs: if a repo already has an AGENTS.md for other coding agents, you write a CLAUDE.md that imports it with @AGENTS.md so both tools read the same instructions. Paste that 11-byte file into the analyzer and it scores it at 3 tokens. The analyzer is correct. The file is 3 tokens. It is also nowhere near what Claude Code loads when you open this repo.
Pasted project file: 3 tokens. Per-turn floor in this repo: about 4,982 tokens. The analyzer, fed the project file alone, saw 0.06% of the input cost of every turn. Not because it is broken, but because it was handed one layer of a stack.
Where does ~4,982 come from? Three files, all loaded in full at launch, every one of them sitting in context on every turn:
- ./CLAUDE.md — 11 bytes, 3 tokens. The @AGENTS.md import line.
- ./AGENTS.md — 327 bytes, 82 tokens. Pulled into context by that import, expanded inline at launch.
- ~/.claude/CLAUDE.md — a real user file on a working machine measures 19,586 bytes, about 4,897 tokens by the analyzer's chars / 4 heuristic. It loads in every repo on that machine, and you almost never think about it.
The project file you would naturally paste is the smallest layer by three orders of magnitude. The layer that dominates the bill is the one you have not opened since you set up your machine.
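The sum is easy to check by hand. Here is a minimal Python sketch of it, under two assumptions: that the chars / 4 heuristic rounds up (not confirmed analyzer behavior, but it is the rounding that reproduces all three figures above), and that the files are plain ASCII, so bytes equal characters.

```python
import math


def estimate_tokens(num_chars: int) -> int:
    """chars / 4 heuristic. Rounding up is an assumption, not confirmed analyzer behavior."""
    return math.ceil(num_chars / 4)


# Byte sizes quoted in the list above (ASCII assumed, so bytes == chars).
layers = {
    "./CLAUDE.md": 11,
    "./AGENTS.md": 327,
    "~/.claude/CLAUDE.md": 19_586,
}

tokens = {path: estimate_tokens(size) for path, size in layers.items()}
floor = sum(tokens.values())
pasted_share = tokens["./CLAUDE.md"] / floor

for path, count in tokens.items():
    print(f"{path:<22} {count:>5} tokens")
print(f"per-turn floor: {floor} tokens")         # 4982
print(f"pasted file share: {pasted_share:.2%}")  # 0.06%
```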
3. Watch the layers add up
Same repo, played as a sequence. Each frame is one thing Claude Code does at launch, with the running token total.
[Animation: layered load, ccmd-website repo. It opens with "You paste one file" and adds each layer to a running token total.]
Prompt caching claws most of that back on turns where the prefix is byte-identical, and the three cost states guide does that math. But caching divides the layered total. It does not change which layers are in it. A cache hit on a stack you mismeasured still mismeasures the stack.
4. Which layer loads, and when
Not every layer behaves the same. Two load conditionally, the rest load at launch and never leave. The distinction is the whole game for cost, so here it is in one table.
| Layer | Loads | In your per-turn floor? |
|---|---|---|
| Managed policy CLAUDE.md | at launch, in full | yes |
| ~/.claude/CLAUDE.md (user) | at launch, in full | yes |
| ./CLAUDE.md (project) | at launch, in full | yes |
| @import targets (recursive, max 5 hops) | at launch, expanded inline | yes |
| .claude/rules/*.md without paths: | at launch | yes |
| ./CLAUDE.local.md | at launch, in full | yes |
| Subdirectory CLAUDE.md | on demand, when Claude reads files there | no, until triggered |
| .claude/rules/*.md with paths: | when a matching file is read | no, until triggered |
Read the first six rows again, and the @import row in particular. The most expensive misconception in this whole topic is that an @import reduces token cost. It does not. The most repeated piece of layering advice online is "split your CLAUDE.md into imports to keep it lean." Anthropic's memory docs say the opposite, in plain words: "Splitting into @path imports helps organization but does not reduce context, since imported files load at launch." The expanded content enters context exactly as if you pasted it inline. A three-line root file that imports a 4,000-token file costs about 4,000 tokens a turn, not three.
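To make the import arithmetic concrete, here is a simplified sketch of inline expansion over an in-memory set of files. It is not Claude Code's parser. It assumes an import is a line consisting only of @ followed by a path, it reads "five hops" as five levels of nesting below the root, and the file names and contents are hypothetical.

```python
import math

MAX_HOPS = 5  # depth limit quoted from the memory docs


def estimate_tokens(text: str) -> int:
    """chars / 4 heuristic, rounded up (an assumption)."""
    return math.ceil(len(text) / 4)


def expand(path: str, files: dict[str, str], hop: int = 0) -> str:
    """Replace each '@target' line with the target's expanded content.

    Simplified: only whole-line imports that name a known file are expanded,
    and anything deeper than MAX_HOPS is left as literal text.
    """
    lines = []
    for line in files[path].splitlines():
        stripped = line.strip()
        target = stripped[1:] if stripped.startswith("@") else None
        if target in files and hop < MAX_HOPS:
            lines.append(expand(target, files, hop + 1))
        else:
            lines.append(line)
    return "\n".join(lines)


# Hypothetical repo: a three-line root that imports one large rules file.
files = {
    "CLAUDE.md": "# Project\n@docs/rules.md\nUse pnpm.",
    "docs/rules.md": "x" * 16_000,  # stand-in for roughly 4,000 tokens of rules
}

root_tokens = estimate_tokens(files["CLAUDE.md"])
loaded_tokens = estimate_tokens(expand("CLAUDE.md", files))
print(f"root file alone: {root_tokens} tokens")
print(f"after expansion: {loaded_tokens} tokens")
```

The root file scores in the single digits on its own, while the expanded text is what sits in context on every turn.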
The one genuine layering win is the second-to-last row. Subdirectory CLAUDE.md files load on demand, only when Claude opens a file in that directory. In a monorepo, a per-package file defers its cost until you work in that package. That is the only nesting move that actually lowers your floor. If a guide tells you nesting saves tokens without naming the on-demand mechanism, it has conflated the two and you should not trust its arithmetic.
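The table boils down to one flag per layer: does it load at launch or on demand? A small sketch of that model follows. The first three token counts are the ccmd repo figures from earlier. The per-package file and its 600 tokens are hypothetical, added only to show the deferred cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    tokens: int
    at_launch: bool  # False = on demand (subdirectory file or paths:-scoped rule)


def per_turn_floor(layers: list[Layer]) -> int:
    """Tokens that sit in context on every turn from launch."""
    return sum(layer.tokens for layer in layers if layer.at_launch)


def after_trigger(layers: list[Layer], triggered: set[str]) -> int:
    """Floor plus any on-demand layers Claude has pulled in so far."""
    extra = sum(
        layer.tokens
        for layer in layers
        if not layer.at_launch and layer.name in triggered
    )
    return per_turn_floor(layers) + extra


stack = [
    Layer("~/.claude/CLAUDE.md", 4897, at_launch=True),
    Layer("./CLAUDE.md", 3, at_launch=True),
    Layer("./AGENTS.md (via @import)", 82, at_launch=True),
    # Hypothetical per-package file, not part of the ccmd repo.
    Layer("packages/api/CLAUDE.md", 600, at_launch=False),
]

print(per_turn_floor(stack))                            # 4982
print(after_trigger(stack, {"packages/api/CLAUDE.md"}))  # 5582
```

Moving content from an at-launch layer into an on-demand one is the only change in this model that lowers the first number.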
5. Measure your real layered total
The analyzer takes one file, so feed it the layers one at a time and sum. In a terminal, list the byte size of each layer first (wc -c on each file does it), then apply the analyzer's chars / 4 token math to each.
The procedure, four steps, no signup:
- Paste ~/.claude/CLAUDE.md into the analyzer on ccmd.dev. Write down the token count. This is the layer that repeats in every repo, so it is worth getting honest about first.
- Paste your project ./CLAUDE.md. If it contains @import lines, open each referenced file and paste those too. Follow the chain up to five hops deep.
- Paste any .claude/rules/*.md file that has no paths: frontmatter. Those load unconditionally too.
- Add the token counts. That sum is your per-turn floor. Multiply by turns per session, then by sessions per week, for the number you can put in front of a teammate.
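The same four steps can be roughed out locally before you paste anything. Treat this as a sketch and not the analyzer itself. It assumes the chars / 4 heuristic rounded up, it checks only the three fixed paths listed, and it leaves @import targets and unconditional rules files for you to add by hand. The 40 turns and 10 sessions are placeholder numbers.

```python
import math
from pathlib import Path


def estimate_tokens(text: str) -> int:
    """chars / 4 heuristic, rounded up (an assumption)."""
    return math.ceil(len(text) / 4)


def measure(paths: list[Path]) -> dict[str, int]:
    """Token estimate for each path that exists; missing layers are skipped."""
    return {
        str(p): estimate_tokens(p.read_text(encoding="utf-8"))
        for p in paths
        if p.is_file()
    }


def weekly_tokens(floor: int, turns_per_session: int, sessions_per_week: int) -> int:
    """Per-turn floor scaled up to a week, before any prompt caching."""
    return floor * turns_per_session * sessions_per_week


if __name__ == "__main__":
    candidates = [
        Path.home() / ".claude" / "CLAUDE.md",
        Path("CLAUDE.md"),
        Path("CLAUDE.local.md"),
        # Add each @import target and each .claude/rules/*.md file
        # without paths: frontmatter here by hand.
    ]
    per_layer = measure(candidates)
    floor = sum(per_layer.values())
    for path, count in per_layer.items():
        print(f"{count:>6}  {path}")
    print(f"{floor:>6}  per-turn floor")
    # Placeholder usage numbers: 40 turns per session, 10 sessions per week.
    print(f"{weekly_tokens(floor, 40, 10):>6}  tokens per week, uncached")
```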
One precision note. The analyzer counts raw characters, and Claude Code strips block-level HTML comments before injecting a file, so a file heavy with <!-- ... --> maintainer notes scores slightly high per layer. That is a rounding error. The layer you forgot to paste at all is the real mistake.
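If you want to see how large that rounding error is for a given file, compare the estimate with and without comments. This sketch is cruder than what the note describes: it strips every <!-- ... --> span, not only block-level ones, so treat the result as an upper bound. The sample text is made up.

```python
import math
import re

# Simplification: matches every HTML comment, not only block-level ones.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def estimate_tokens(text: str) -> int:
    """chars / 4 heuristic, rounded up (an assumption)."""
    return math.ceil(len(text) / 4)


def overcount(text: str) -> int:
    """Tokens the raw-character estimate attributes to HTML comments."""
    return estimate_tokens(text) - estimate_tokens(COMMENT.sub("", text))


# Hypothetical file content with one maintainer note.
sample = "# Rules\n<!-- maintainer note: revisit after the migration -->\nUse pnpm.\n"
print(f"raw estimate: {estimate_tokens(sample)} tokens")
print(f"attributed to comments: {overcount(sample)} tokens")
```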
6. The layer you forgot, and how to cut it
In practice the forgotten layer is one of two. The first is your own ~/.claude/CLAUDE.md. It grows the way a junk drawer grows: one preference at a time, never reviewed, and it taxes every project on the machine. If it has drifted past a couple thousand tokens, that is the highest-leverage file to audit, because the saving multiplies across every repo you touch.
The second is a layer you cannot edit: an ancestor CLAUDE.md from another team, picked up by the upward directory walk in a shared monorepo. For that, Anthropic's docs describe claudeMdExcludes, a setting that skips specific CLAUDE.md files by path or glob. Put it in .claude/settings.local.json so the exclusion stays on your machine and does not change anyone else's context. The one layer no setting can drop is a managed policy file: that one is deliberately not excludable.
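As an illustration of what that local settings entry might look like, the sketch below builds the JSON without writing it to disk. It assumes claudeMdExcludes is a top-level list of path or glob strings, which is worth verifying against the settings reference before you rely on it. The glob is a hypothetical path for another team's file.

```python
import json


def with_excludes(settings: dict, patterns: list[str]) -> dict:
    """Return a copy of `settings` with `patterns` added to claudeMdExcludes.

    Assumes the setting is a top-level list of path/glob strings.
    """
    merged = dict(settings)
    excludes = list(merged.get("claudeMdExcludes", []))
    for pattern in patterns:
        if pattern not in excludes:
            excludes.append(pattern)
    merged["claudeMdExcludes"] = excludes
    return merged


# Hypothetical glob for an ancestor file owned by another team.
local_settings = with_excludes({}, ["**/platform-team/CLAUDE.md"])

# Text you would place in .claude/settings.local.json.
print(json.dumps(local_settings, indent=2))
```

Keeping the helper side-effect free means you can inspect the output before deciding to save it.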
The takeaway is small and it is the whole page. A token number is only as good as the question it answered. "How big is this file" and "what does opening Claude Code in this repo cost" are different questions. The analyzer answers the first precisely. The second is a sum, and now you know every term in it.
Frequently asked questions
What counts as a layer in a layered CLAUDE.md?
Six things, per Anthropic's memory docs. Managed policy CLAUDE.md (an org-deployed file at a fixed system path), the user file at ~/.claude/CLAUDE.md, the project file at ./CLAUDE.md or ./.claude/CLAUDE.md, ./CLAUDE.local.md, every file pulled in by an @path import, and .claude/rules/ files with no paths: frontmatter. All six load in full at launch and sit in context on every turn. A seventh kind, subdirectory CLAUDE.md files, loads only when Claude reads a file in that subtree.
Does splitting my CLAUDE.md into @import files reduce token cost?
No. This is the most common piece of layering advice and it is wrong about cost. Anthropic's memory docs are explicit: 'Splitting into @path imports helps organization but does not reduce context, since imported files load at launch.' An @import is a readability tool, not a token tool. The expanded content enters the context window exactly as if you had pasted it inline. ccmd's own repo proves it: CLAUDE.md is the single line @AGENTS.md, and AGENTS.md still loads in full.
Is nesting CLAUDE.md ever a real token win?
Yes, in exactly one case. Subdirectory CLAUDE.md files are loaded on demand, only when Claude reads a file in that directory, not at launch. So in a monorepo, a per-package CLAUDE.md genuinely defers its cost until Claude works inside that package. That is the one layering technique that lowers your per-turn floor. Imports do not, and neither do managed, user, or project files. If a guide tells you nesting saves tokens without naming the on-demand distinction, it is conflating the two.
How do I measure my total layered cost with ccmd?
The analyzer takes one file. So feed it the layers one at a time and add the totals. Paste ~/.claude/CLAUDE.md, note the token count. Paste ./CLAUDE.md, note it. Paste each file referenced by an @import, note each. Paste any .claude/rules/*.md without a paths: field. The sum is your per-turn floor. Subdirectory files are extra, counted only once Claude has opened files in those directories during the session.
Why does the analyzer only score one file at a time?
By design. analyzeConfig(input: string) in src/lib/analyzer.ts takes a single string, and line 264 sets estimatedTokensFireEveryTurn = totalTokens, the token count of that one file. It grades a file, not a machine. It cannot see your ~/.claude/CLAUDE.md, your managed policy file, or your imports unless you paste them. That is not a gap to apologize for, it is the unit of analysis. The layered total is yours to sum, and this page is the procedure.
How deep can @import recursion go?
Five hops. From Anthropic's memory docs, verbatim: 'Imported files can recursively import other files, with a maximum depth of five hops.' Every file in that chain expands into context at launch. A tidy three-line root CLAUDE.md that imports a file that imports two more can still put several thousand tokens on every turn. The root file's token count tells you nothing about the layered total.
How do I cut a layer I cannot edit, like a parent CLAUDE.md in a monorepo?
Use claudeMdExcludes. Anthropic's docs describe a setting that skips specific CLAUDE.md files by path or glob, set in .claude/settings.local.json so the exclusion stays local to your machine. It is the right lever when an ancestor CLAUDE.md from another team gets picked up by the upward directory walk and adds tokens you never read. One exception: a managed policy CLAUDE.md cannot be excluded.
Related: the three cost states of a single file, the four surfaces a rule can fire on, or paste a layer on the analyzer.