CLAUDE.md token waste per turn: the seven categories of bytes you pay for and get nothing back from
Every byte of CLAUDE.md fires on every turn. That's not the interesting question. The interesting question is which of those bytes produce zero behavioral signal. Our analyzer enumerates exactly seven categories at src/lib/analyzer.ts:6. Six of them waste the bytes on a specific line. One of them, the worst one, wastes the entire file.
Every byte of CLAUDE.md is concatenated into the system prompt at session start and re-sent on every API call. Our analyzer flags seven categories of that traffic as waste: bloat (lines over 28 words), vague (a 14-term word list), aspirational (a 6-term word list without escape clauses), cache_bust (ISO date or "today" or "this session" in the first 20 lines), duplicate, missing_why (prohibitions with no reason in the next four lines), and conflict (contradicting absolutes). Ranked by dollar impact, cache_bust is the worst: a single volatile line in the first 20 lines voids the prefix cache, so every turn bills at roughly 10x the cached rate.
Source: src/lib/analyzer.ts line 6 (waste taxonomy), line 124 (vague list), line 130 (aspirational list), line 194 (cache_bust regex), line 267 (cost formula).
The taxonomy is one TypeScript union
You don't have to take our word for which categories the analyzer counts as waste. They are literally enumerated as the Finding["kind"] union. Each finding carries one of seven string literals as its kind, one severity, and an optional tokenSavings number, and the per-turn waste math reduces to summing that number.
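A minimal sketch of what that shape might look like. The seven kind literals and the field names come from this guide; the exact type of severity is not stated here, so it is left as a plain string, and wasteTokensPerTurn is a hypothetical helper, not a function from the analyzer.

```typescript
// The seven waste categories, as one string-literal union.
type FindingKind =
  | "bloat"
  | "vague"
  | "aspirational"
  | "cache_bust"
  | "duplicate"
  | "missing_why"
  | "conflict";

interface Finding {
  kind: FindingKind;
  severity: string; // allowed values are not listed in this guide
  message: string;
  suggestion: string;
  tokenSavings?: number; // recoverable tokens, when the check can estimate them
}

// Hypothetical helper: per-turn waste for one category is the sum of
// tokenSavings across findings of that kind (missing values count as 0).
function wasteTokensPerTurn(findings: Finding[], kind: FindingKind): number {
  return findings
    .filter((f) => f.kind === kind)
    .reduce((sum, f) => sum + (f.tokenSavings ?? 0), 0);
}
```

Keeping tokenSavings optional matters for categories like missing_why, where the waste is behavioral and there may be no token number to report.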
The per-turn dollar number, in source form
Before walking the seven categories, the formula. It is deliberately small: file tokens, times turns, times the Opus 4.7 input rate. The analyzer hardcodes 30 turns as the round number for a long Claude Code session and $15 / M as the published input rate. Your per-turn waste for a given category is the sum of tokenSavings across findings of that kind, multiplied by the rate (and by 30 turns for the per-session figure). The cache_bust case is the exception: when it fires, the waste is the entire file's tokens (not the line's), because the cache miss is global.
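The formula described above, written out as a sketch. TURNS, OPUS_IN_PER_M and estimatedCostPerLongRunSession are the names this guide uses; categoryCostPerSession is a hypothetical helper added for illustration.

```typescript
const TURNS = 30; // the analyzer's round number for a long session
const OPUS_IN_PER_M = 15; // dollars per million input tokens

// Input cost of re-sending `tokens` on every turn of one long session.
function estimatedCostPerLongRunSession(tokens: number): number {
  return (tokens * TURNS * OPUS_IN_PER_M) / 1_000_000;
}

// Hypothetical helper: session cost of one waste category, given the
// tokenSavings values of its findings.
function categoryCostPerSession(tokenSavings: number[]): number {
  const perTurn = tokenSavings.reduce((sum, t) => sum + t, 0);
  return estimatedCostPerLongRunSession(perTurn);
}

// The 6,042-token file: 6042 * 30 * 15 / 1_000_000 = 2.7189, about $2.72.
const fullFileCost = estimatedCostPerLongRunSession(6042);
```

For cache_bust you would pass the whole file's token count rather than the flagged line's, since the miss applies to the full prefix.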
A real-shaped CLAUDE.md, scored by waste category
Same 6,042-token file we use across the other ccmd guides. The per-category waste row tells you which fixes to take in which order. Cache_bust pays back the entire file's bill, every turn. Bloat is a long way behind. The bottom three are real but mostly cosmetic at the per-turn level.
1. cache_bust: the only category that wastes the whole file
The cache_bust check runs on lines 1 through 20 of the file. It fires on any ISO date (regex \b20[2-9]\d-\d{2}-\d{2}\b) or the strings today, this session, or right now. The reason this single check matters more than the other six combined is mechanical: Anthropic's prompt cache hits only when the prefix is byte-identical to a prior prefix. A volatile string in row 3 mutates the prefix every session. Every turn after the first then pays full cold-read cost instead of the cached rate.
The fix is one line: move the volatile string to the bottom of the file, or strip it entirely. The bytes don't change; the prefix stability does. Cached vs cache-busted on a 6,000-token file across 30 turns at the Opus 4.7 input rate is roughly $0.27 vs $2.72, an order of magnitude.
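A sketch of the check and of the cached vs cache-busted arithmetic. The ISO date pattern and the three phrases are the ones named above; case-insensitive phrase matching and the 0.1 cache-hit multiplier (standing in for "roughly 10x cheaper") are assumptions, and the cost function ignores the cold first turn for simplicity.

```typescript
const ISO_DATE = /\b20[2-9]\d-\d{2}-\d{2}\b/;
const VOLATILE_PHRASE = /\b(?:today|this session|right now)\b/i; // "i" flag is assumed
const CACHE_BUST_WINDOW = 20; // only the first 20 lines are scanned

// Returns the 1-based numbers of lines that would mutate the cached prefix.
function findCacheBustLines(fileText: string): number[] {
  const flagged: number[] = [];
  fileText
    .split("\n")
    .slice(0, CACHE_BUST_WINDOW)
    .forEach((line, i) => {
      if (ISO_DATE.test(line) || VOLATILE_PHRASE.test(line)) flagged.push(i + 1);
    });
  return flagged;
}

const TURNS = 30;
const OPUS_IN_PER_M = 15;
const CACHE_HIT_MULTIPLIER = 0.1; // assumption: cached reads cost ~1/10 of cold reads

// Simplified session cost: every turn is either fully cold or fully cached.
function sessionInputCost(fileTokens: number, cacheBusted: boolean): number {
  const cold = (fileTokens * TURNS * OPUS_IN_PER_M) / 1_000_000;
  return cacheBusted ? cold : cold * CACHE_HIT_MULTIPLIER;
}
```

With 6,042 tokens this gives about 2.72 when busted and about 0.27 when the prefix is stable, which is where the order-of-magnitude gap comes from.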
2. bloat: the line is too long for the model to keep as one signal
Bloat fires on word count, not byte count. The threshold is 28 words. The empirical observation is that rules past that consistently get treated as one signal and the back half gets silently dropped. The analyzer estimates 35% of the line's tokens as recoverable, on the assumption that splitting into 2-3 shorter sub-25-word directives keeps the meaning and cuts the redundancy.
Per-turn waste: about 35% of the line's token weight, billed every turn, every session, in perpetuity until you split the line. On a file with four bloat lines averaging 30 tokens, that's roughly 42 tokens of pure waste per turn, or about $0.019 per 30-turn session before any cache benefit.
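A sketch of the bloat rule under stated assumptions. The 28-word threshold and the 35% recoverable share come from the text above; estimateTokens is a hypothetical stand-in (about four characters per token), not the analyzer's tokenizer.

```typescript
const BLOAT_WORD_LIMIT = 28;
const BLOAT_RECOVERABLE_SHARE = 0.35;

// Hypothetical estimator: roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function wordCount(line: string): number {
  const trimmed = line.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Returns the estimated recoverable tokens for a bloated line,
// or undefined when the line is at or under the word limit.
function bloatSavings(line: string): number | undefined {
  if (wordCount(line) <= BLOAT_WORD_LIMIT) return undefined;
  return Math.round(estimateTokens(line) * BLOAT_RECOVERABLE_SHARE);
}
```

The worked example above follows the same arithmetic: 4 lines × 30 tokens × 0.35 is 42 tokens per turn.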
3. duplicate: the second copy is 100% waste
The check builds a Map of trimmed lowercased lines longer than 10 characters. Any exact second occurrence gets flagged with the line's full token cost as the recoverable savings. Unlike bloat and vague, this one is unambiguous: the bytes of the second copy are pure duplication. Delete them.
Per-turn waste: 100% of the duplicate line's tokens, every turn. Two duplicates of 45 tokens each on a 30-turn session is 2,700 tokens of billed-for-nothing traffic, or about $0.04. Small in isolation, large if your file has accumulated drift across teammates.
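The duplicate check as described could be sketched like this. The Map keyed on trimmed, lowercased lines and the 10-character floor come from the text; the DuplicateHit shape and estimateTokens are hypothetical.

```typescript
interface DuplicateHit {
  line: number; // 1-based line of the repeated copy
  firstSeen: number; // 1-based line of the original
  tokenSavings: number; // the copy's full token cost
}

// Hypothetical estimator: roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function findDuplicates(fileText: string): DuplicateHit[] {
  const seen = new Map<string, number>();
  const hits: DuplicateHit[] = [];
  fileText.split("\n").forEach((raw, i) => {
    const key = raw.trim().toLowerCase();
    if (key.length <= 10) return; // short lines are ignored
    const firstSeen = seen.get(key);
    if (firstSeen === undefined) {
      seen.set(key, i + 1);
      return;
    }
    hits.push({ line: i + 1, firstSeen, tokenSavings: estimateTokens(raw.trim()) });
  });
  return hits;
}
```

Because the key is an exact match after trimming and lowercasing, reworded near-duplicates would slip past a check like this one.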
4. missing_why: rule fires but stops being followed at edge cases
The check fires on any line starting with DO NOT, NEVER, or don't if the next four lines don't contain any of because, why, reason, past, got burned, incident, happened, or caused. The observation is that a prohibition without a reason gets followed by the model until it hits an edge case, at which point the model has no constraint to reason against and guesses.
The bytes fire every turn either way. The waste is not the byte count, it's the rule's effective half-life. A one-line "Why:" with the past incident or constraint turns the prohibition into something the model can extrapolate from. Same firing cost, real behavioral signal.
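A sketch of that lookahead. The prohibition prefixes and the eight reason markers are the ones listed above; case-insensitive matching and plain substring search are assumptions, and following the wording above only the four lines after the prohibition are searched.

```typescript
const PROHIBITION = /^\s*(?:do not|never|don't)\b/i; // "i" flag is assumed
const REASON_MARKERS = [
  "because", "why", "reason", "past",
  "got burned", "incident", "happened", "caused",
];
const LOOKAHEAD_LINES = 4;

// Returns 1-based numbers of prohibition lines with no reason nearby.
function findMissingWhy(fileText: string): number[] {
  const lines = fileText.split("\n");
  const flagged: number[] = [];
  lines.forEach((line, i) => {
    if (!PROHIBITION.test(line)) return;
    const following = lines
      .slice(i + 1, i + 1 + LOOKAHEAD_LINES)
      .join("\n")
      .toLowerCase();
    const hasReason = REASON_MARKERS.some((marker) => following.includes(marker));
    if (!hasReason) flagged.push(i + 1);
  });
  return flagged;
}
```

In a file where "NEVER force-push to main." is followed by "Why: a force-push erased a release tag.", the rule passes; a bare "DO NOT edit generated files." with no explanation in the next four lines is flagged.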
5. vague: 14 words that have no testable success condition
The VAGUE_TERMS constant is 14 entries. The check is a per-word word-boundary regex. When any matches, the line gets flagged as vague. The waste is that the rule fires (bytes are in the prompt) but the model has no test for when it succeeded, so it reads the rule and stops trying. "Write clean, well-structured code with good naming and handle edge cases appropriately" fires four matches off this list in one line.
Remediation: replace each vague term with a number, a named tool, or a forbidden pattern. "Lines under 80 characters" beats "clean code". "No any outside of unknown casts" beats "good TypeScript". The bytes cost the same; the second one actually steers.
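A sketch of the per-term word-boundary match. The term list is copied from the enumeration in the FAQ answer further down; vagueMatches is a hypothetical function name, and the real check may differ in details.

```typescript
// Terms as enumerated in the FAQ of this guide.
const VAGUE_TERMS = [
  "appropriate", "appropriately", "good", "best", "proper", "properly",
  "carefully", "thoughtfully", "well", "nicely", "cleanly",
  "as needed", "where applicable", "if relevant", "when possible",
];

// Returns every vague term that appears in the line as a whole word or phrase.
function vagueMatches(line: string): string[] {
  return VAGUE_TERMS.filter((term) => new RegExp(`\\b${term}\\b`, "i").test(line));
}

const vagueRule = vagueMatches("Handle errors properly and log as needed");
const testableRule = vagueMatches("Lines under 80 characters");
```

Here vagueRule contains "properly" and "as needed", while testableRule is empty. The word boundary is why "properly" does not also count as a match for "proper".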
6. aspirational: absolutes without escape clauses
The ASPIRATIONAL constant is 6 entries (always, never, must, should always, in all cases, every time). The check fires only when the line lacks an escape pattern (unless / except / but if / when X then) AND the line is under 25 words. The length gate is there because a long line with "always" might already include a scoped exception elsewhere.
The waste pattern is the same as missing_why: the bytes fire every turn, the model reads the absolute, then it pattern-matches that real codebases have exceptions and treats the rule as advisory. The fix is either an explicit "unless X" or removing the absolute.
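A sketch of the two gates together. The six terms, the escape words and the 25-word limit come from the text above; the exact escape regex (including how "when X then" is matched) is an assumption.

```typescript
const ASPIRATIONAL = [
  "always", "never", "must", "should always", "in all cases", "every time",
];
// Assumed encoding of: unless / except / but if / when X then
const ESCAPE = /\b(?:unless|except|but if)\b|\bwhen\b.*\bthen\b/i;
const ASPIRATIONAL_WORD_LIMIT = 25;

// True when the line states an absolute, offers no escape, and is short.
function isAspirational(line: string): boolean {
  const hasAbsolute = ASPIRATIONAL.some((term) =>
    new RegExp(`\\b${term}\\b`, "i").test(line),
  );
  const words = line.trim().split(/\s+/).filter(Boolean).length;
  return hasAbsolute && !ESCAPE.test(line) && words < ASPIRATIONAL_WORD_LIMIT;
}
```

"Always run the linter." is flagged; "Always run the linter unless the change is docs-only." is not, because the escape clause gives the model a stated boundary to reason against.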
7. conflict: two rules that cannot both win
The conflict check is the smallest one. It is a literal string match against a hardcoded pair: a file containing both "never use comments" and "add comments" gets flagged. The point isn't the specific pair, it's the category: contradicting absolutes have no resolution path, so the model picks one at random per session. Both rules waste bytes on every turn, and the file produces nondeterministic behavior. The remediation is to pick one rule or to scope each with a context ("for public APIs: add comments; elsewhere: skip them").
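The conflict check is small enough to sketch in a few lines. The pair of strings comes from the text above; lowercasing the file before matching is an assumption.

```typescript
// The single hardcoded pair described above.
const CONFLICT_PAIR: [string, string] = ["never use comments", "add comments"];

// True when the file contains both halves of the contradicting pair.
function hasCommentConflict(fileText: string): boolean {
  const text = fileText.toLowerCase(); // case-folding is assumed
  return CONFLICT_PAIR.every((phrase) => text.includes(phrase));
}

const conflicted = hasCommentConflict(
  "Never use comments.\nAdd comments to every exported function.",
);
const scoped = hasCommentConflict(
  "For public APIs: document every exported function.\nElsewhere: skip them.",
);
```

Here conflicted is true and scoped is false. A literal pair like this only catches one contradiction, so treat it as a marker for the category, not a general detector.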
Ranked by dollar impact, descending
Order matters because the cache_bust line is roughly an order of magnitude more expensive than every other category combined. Fix that first, then walk the rest in the order below.
| Category | Waste pattern (every turn) | Fix |
|---|---|---|
| cache_bust (ISO date / 'today' / 'this session' in first 20 lines) | 10x the entire file, every turn | delete the line. one edit, biggest dollar move. |
| bloat (line > 28 words) | ~35% of the line's bytes, every turn, every session | split into 2-3 sub-25-word directives. |
| duplicate (same trimmed line appears twice) | 100% of the second copy, every turn | delete the copy. |
| missing_why (DO NOT / NEVER without reason) | rule fires, then gets ignored at edge cases | add a one-line `Why:` with the past incident. |
| vague (14-term word list) | rule fires, model has no testable condition | replace with a number, a named tool, or a forbidden pattern. |
| aspirational (6-term word list, no escape) | rule fires, model treats it as advisory | add `unless X` or drop the absolute. |
| conflict (contradicting absolutes) | both rules fire, model picks one at random | pick one, or scope by context. |
Where the waste moves when you cut it
Deleting lines is not the only fix. Several of the seven categories are not actually "the rule is bad", they're "the rule belongs on a different firing surface". Claude Code reads four surfaces, and only one of them fires every turn:
- CLAUDE.md fires unconditionally on every turn for the life of the session.
- Skills (skills/<name>.md) only enter the context when the skill name or description matches the request.
- Hooks only fire on their configured events (PreToolUse, PostToolUse, UserPromptSubmit).
- MCP tool descriptions only enter the context when the model picks that tool.
A rule like "when about to write SQL, parameterize" is not a CLAUDE.md line, it's a PreToolUse hook on Bash and Edit. A rule like "when running the QA flow, do X" is not a CLAUDE.md line, it's a skill. Moving rules off the every-turn surface and onto the conditional surfaces is the structural fix for everything except cache_bust (which is purely about prefix stability).
Want us to walk your CLAUDE.md and rank its waste live?
Paste it on ccmd.dev for the free per-line scan, or book a 20-minute call and we'll do the full taxonomy reading on your file with you.
Frequently asked questions
Which bytes of my CLAUDE.md actually waste tokens every turn?
All of them fire, but most of them are doing work. The waste is the bytes that fire but produce zero behavioral signal. Our analyzer enumerates seven categories of waste as the Finding.kind union at src/lib/analyzer.ts line 6: bloat (lines over 28 words), vague (14-term word list), aspirational (6-term word list with no escape), cache_bust (ISO date or 'today' or 'this session' in the first 20 lines), duplicate (same trimmed line twice), missing_why (DO NOT or NEVER with no reason in the next 4 lines), and conflict (contradicting absolutes). Every category produces a tokenSavings number for its specific line, which is the per-turn waste for that line.
Why is cache_bust the biggest dollar waste category?
Because it does not waste the bytes on the flagged line. It wastes the entire file. Anthropic's prompt cache requires a byte-identical prefix for a cache hit. A single ISO date or 'today is' string in the first 20 lines mutates the prefix on every session, so every turn pays full input cost instead of the ~10x cheaper cache-hit rate. On a 6,000-token CLAUDE.md across a 30-turn session, that's the difference between roughly $2.72 and roughly $0.27 at the $15 / M Opus 4.7 input rate the analyzer hardcodes at line 267. One line, 10x bill, every session.
How is bloat different from just 'long lines'?
The bloat detector at src/lib/analyzer.ts line 148 fires on word count, not byte count, and the threshold is 28 words. Lines past that consistently get treated as one signal by the model and the second half is silently dropped. The analyzer estimates 35% of the line's tokens as recoverable savings on the assumption that splitting into 2-3 shorter actionable directives keeps the meaning and cuts the redundancy. The bytes fire every turn either way; the question is whether they're doing anything.
What's the actual cost-per-turn formula?
It's at src/lib/analyzer.ts line 263. estimatedTokensFireEveryTurn equals totalTokens (all bytes fire). estimatedCostPerLongRunSession equals tokens × TURNS × OPUS_IN_PER_M / 1_000_000, where TURNS is 30 and OPUS_IN_PER_M is 15. Plug your file's token count in: a 6,042-token CLAUDE.md across 30 turns at $15 / M is $2.72 of input cost before Claude reads a single byte of your repo. Cache-hit rates land closer to $0.27 if your prefix is stable.
What does the vague-term check actually catch?
It runs the regex /\b<term>\b/i for each of 14 terms from the VAGUE_TERMS constant at line 124 (appropriate, appropriately, good, best, proper, properly, carefully, thoughtfully, well, nicely, cleanly, as needed, where applicable, if relevant, when possible). When any matches, the line gets flagged as 'vague' with the message that the term has no testable success condition. The waste is not the bytes of the word itself; it's that the whole rule is unenforceable, so the model reads it, can't tell when it succeeded, and stops trying. The bytes fire on every turn for as long as the rule sits there.
Why does the aspirational check require no escape clause?
Because a naked 'always' or 'never' in any real codebase is a lie, and the model knows it. The check at line 177 looks for any of the 6 ASPIRATIONAL terms (always, never, must, should always, in all cases, every time) and also tests for an escape pattern (unless / except / but if / when X then). If the absolute has no escape AND the line is under 25 words, it fires. The remediation is either 'unless X' or removing the absolute. Bytes fire every turn until you do.
Are duplicate and conflict checking the same thing?
No. The duplicate check at line 207 builds a Map of trimmed lowercased lines longer than 10 characters and flags exact second-occurrences with the per-line token cost as the savings number. The conflict check at line 244 fires on a hardcoded pair (text containing both 'never use comments' and 'add comments') because contradicting absolutes have no resolution path for the model. Duplicate wastes the bytes of the copy; conflict wastes the bytes of both rules plus produces nondeterministic behavior.
Does this apply to AGENTS.md, .cursorrules, and .grokrules?
Yes. The analyzer detects type at line 41 (detectType) and runs the same seven checks against all four formats; detection is by content, not filename. AGENTS.md (Codex), .cursorrules (Cursor), and .grokrules (Grok Build) all get injected verbatim into the system prompt and fire every turn the same way CLAUDE.md does. The token-waste taxonomy is identical across the four. The conditional sibling surfaces (skills, hooks, MCP descriptions for Claude Code; their equivalents for the others) are where waste moves to when you cut it from the main file.
How do I see the per-category breakdown for my own file?
Paste your file into the textarea on ccmd.dev. The analyzer runs entirely in your browser (no upload, no signup) and returns an AnalysisResult with totalTokens, estimatedTokensFireEveryTurn, estimatedCostPerLongRunSession, an array of findings (each tagged with kind, severity, message, suggestion, and tokenSavings), and the Karpathy 12 rubric pass-count. The findings array is what you sort by tokenSavings to rank waste by dollar impact for your specific file.
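A sketch of that sort, assuming the findings array has the fields listed above. rankByTokenSavings and sessionDollars are hypothetical helpers; the 30 turns and $15 / M figures are the ones the analyzer hardcodes.

```typescript
interface Finding {
  kind: string;
  severity: string;
  message: string;
  suggestion: string;
  tokenSavings?: number;
}

// Largest recoverable token count first; findings without a number sort last.
function rankByTokenSavings(findings: Finding[]): Finding[] {
  return [...findings].sort(
    (a, b) => (b.tokenSavings ?? 0) - (a.tokenSavings ?? 0),
  );
}

// Dollars per 30-turn session at $15 per million input tokens.
function sessionDollars(finding: Finding): number {
  return ((finding.tokenSavings ?? 0) * 30 * 15) / 1_000_000;
}
```

Sorting a copy leaves the original findings order intact, which is useful if you also want to show findings in file order.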
Related guides
- Do CLAUDE.md rules fire on every turn? — the firing model, in code form.
- CLAUDE.md token cost audit: the three states your file is in — cold vs cached vs cache-busted, dollar math.
- Why Claude skips rules in your CLAUDE.md — firing isn't following.
- CLAUDE.md weekly quota burn — the seven-day math that hits you on Wednesday.