CLAUDE.md token bloat is one specific check, not a vibe.

Matthew Diakonov, Written with AI

Published May 21, 2026 · 9 min read

The phrase shows up in every Reddit thread on Claude Code costs and almost nobody pins it down. In ccmd's analyzer it has an exact definition: one rule line over 28 words. That is it. Everything else people call "bloat" (the dated header, the duplicated paragraph, the "always" with no escape clause) is a separate finding with separate math.

Direct answer · verified 2026-05-21

In ccmd, "bloat" is one of seven finding kinds. It triggers when wordCount > 28 on a single rule line. The recoverable savings credited is floor(estTokens(line) * 0.35), not the full line. Source: src/lib/analyzer.ts lines 148-159.

What the detector actually does

Twelve lines of TypeScript. No model call, no network round trip. The paste-and-score analyzer runs this in the browser and emits one finding per line that crosses the threshold.

analyzer.ts

The threshold is 28, not 25, even though the message string says "rule lines over 25 words consistently get ignored." The 25-word claim is the observation; the 28-word gate is the analyzer giving a 3-word buffer before it flags. A 28-word rule passes silently; a 29-word rule lights up.
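A minimal sketch of a check with this shape, assuming only what the text states: a `wordCount > 28` gate and a `floor(estTokens(line) * 0.35)` credit. The body of `estTokens` (roughly four characters per token), the `BloatFinding` shape, and the `findBloat` name are assumptions for illustration, not the contents of src/lib/analyzer.ts.

```typescript
// Hypothetical reconstruction. Only the 28-word gate, the 0.35 factor,
// and the names wordCount / estTokens / tokenSavings come from the article.
interface BloatFinding {
  kind: "bloat";
  line: number; // 1-based line number in the pasted file
  wordCount: number;
  tokenSavings: number;
}

const BLOAT_WORD_THRESHOLD = 28;
const BLOAT_RECOVERABLE_SHARE = 0.35;

// Assumption: ~4 characters per token. The real estimator may differ.
function estTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function wordCount(line: string): number {
  return line.trim().split(/\s+/).filter(Boolean).length;
}

// Emits one finding per line whose word count is strictly above the threshold.
function findBloat(source: string): BloatFinding[] {
  const findings: BloatFinding[] = [];
  source.split("\n").forEach((line, index) => {
    const words = wordCount(line);
    if (words > BLOAT_WORD_THRESHOLD) {
      findings.push({
        kind: "bloat",
        line: index + 1,
        wordCount: words,
        tokenSavings: Math.floor(estTokens(line) * BLOAT_RECOVERABLE_SHARE),
      });
    }
  });
  return findings;
}
```

The comparison is strict, which is what makes a 28-word line pass and a 29-word line flag.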

Why 35% and not 100%

A long rule line almost always has a real directive at the front and a qualifier tail at the back. The agent reads the front. The tail is hedging and war stories that survive revision because nobody wants to delete the sentence that mentions the incident.

ccmd assumes you keep the directive and cut the tail. 35% is a conservative floor on what a normal editorial pass can recover without rewriting. Cut more and you are rewriting, which the analyzer will not do for you. Cut less and you have not actually addressed the finding.

Worked example

ccmd.dev — paste a 41-word rule
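As a rough illustration of the arithmetic, the sketch below runs a hypothetical 41-word rule through the stated gate and credit. The rule text is invented for this example, and `estTokens` is assumed to be a four-characters-per-token heuristic, so the token figures it prints are illustrative rather than what ccmd.dev would report.

```typescript
// Assumption: ~4 characters per token. The real estTokens in ccmd may differ.
const estTokens = (text: string): number => Math.ceil(text.length / 4);
const wordCount = (line: string): number =>
  line.trim().split(/\s+/).filter(Boolean).length;

// Hypothetical 41-word rule: a short directive followed by a long war-story tail.
const rule =
  "Always make sure to handle edge cases appropriately and write comprehensive " +
  "tests for every function you touch because we have been burned before by a " +
  "production incident where a missing null check took down the billing service " +
  "on a Friday afternoon.";

const words = wordCount(rule);          // 41
const flagged = words > 28;             // true: strictly over the gate
const lineTokens = estTokens(rule);     // depends on the estimator assumption
const tokenSavings = flagged ? Math.floor(lineTokens * 0.35) : 0;
const tokensKept = lineTokens - tokenSavings;

console.log({ words, flagged, lineTokens, tokenSavings, tokensKept });
```

Whatever the estimator returns, the credit is only the 35% slice; the remaining tokens are treated as the directive you keep.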

The six things bloat is not

The Finding union type names seven kinds. Six of them get called "bloat" on Reddit and are not. Mixing them up makes the cuts worse: you delete a dated line and miss the cache discount, or you split a long rule when the actual waste is a 4-line duplicate.

analyzer.ts

Finding kind	What triggers it	Token credit and what it flags
bloat	any rule line over 28 words	tokenSavings = floor(line tokens * 0.35). The back half is filler. The front half stays.
duplicate	same trimmed/lowercased line appears twice	tokenSavings = full token cost of the second copy. Cheapest fix in the audit.
cache_bust	ISO date, 'today', 'this session' in first 20 lines	tokenSavings = full token cost of the line. Listed as bloat in Reddit threads; it is the most expensive surcharge in ccmd by a wide margin.
vague	"appropriate", "carefully", "well", "as needed"	No tokenSavings. Line ships, agent ignores. Filler with no recoverable count.
aspirational	"always", "never" with no escape clause	No tokenSavings. Absolutes the agent rounds off in real code.
missing_why	NEVER / DO NOT with no "because" within 4 lines	No tokenSavings. Prohibition the agent will guess around on edge cases.
conflict	two contradicting absolutes anywhere in the file	No tokenSavings. The whole-file rule pair pays double and produces noise.

Treat the seven as distinct categories with distinct fixes. Bloat splits. Duplicate deletes. Cache_bust moves. Vague rewrites. Aspirational gets an "unless". Missing_why gets a "Why:" line. Conflict picks one. Six of those seven interventions are not what people mean when they say "cut the bloat," and that is the bug.
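One way to keep the seven apart is to key the fix on the finding kind. The sketch below is a hypothetical shape: the kind names come from the table above, while the `Finding` fields other than `kind` and `tokenSavings`, the `FIX_FOR_KIND` map, and `fixFor` are assumptions made for illustration.

```typescript
// The seven kind names are from the article; everything else is illustrative.
type FindingKind =
  | "bloat"
  | "duplicate"
  | "cache_bust"
  | "vague"
  | "aspirational"
  | "missing_why"
  | "conflict";

interface Finding {
  kind: FindingKind;
  line: number;
  tokenSavings?: number; // only bloat, duplicate, and cache_bust carry a credit
}

// Record<FindingKind, ...> makes the compiler reject a map that misses a kind.
const FIX_FOR_KIND: Record<FindingKind, string> = {
  bloat: "split into 2-3 shorter directives",
  duplicate: "delete the second copy",
  cache_bust: "move the dated line to the bottom or remove the date",
  vague: "rewrite with a concrete condition",
  aspirational: 'add an "unless" escape clause',
  missing_why: 'add a "Why:" line',
  conflict: "pick one rule and scope the other",
};

function fixFor(finding: Finding): string {
  return FIX_FOR_KIND[finding.kind];
}

console.log(fixFor({ kind: "cache_bust", line: 3, tokenSavings: 7 }));
```

Typing the map as a `Record` over the union means adding an eighth kind without a fix becomes a compile error instead of a silent "cut the bloat" default.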

The savings number understates the waste

ccmd's headline number is potentialSavingsTokens. It is the sum of tokenSavings across all findings. Only three of the seven finding kinds populate that field.

analyzer.ts

The other four (vague, aspirational, missing_why, conflict) flag the line but credit no tokens. Not because the line is free. Because per-line dollar accounting on a vague term ("appropriate") is harder to defend than per-line accounting on 35% of a 41-word rule. We picked the conservative side. The real waste on your file is always at least as big as the number ccmd prints, often substantially bigger.
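A reducer with this behavior can be sketched in a few lines. This is an assumption about the shape, not the code at src/lib/analyzer.ts: the `Finding` interface is simplified, and the token figures in the sample are made up.

```typescript
type FindingKind =
  | "bloat" | "duplicate" | "cache_bust"
  | "vague" | "aspirational" | "missing_why" | "conflict";

interface Finding {
  kind: FindingKind;
  tokenSavings?: number; // absent on the four kinds that credit nothing
}

// Sum the credit across all findings; a missing tokenSavings counts as 0.
function potentialSavingsTokens(findings: Finding[]): number {
  return findings.reduce((sum, f) => sum + (f.tokenSavings ?? 0), 0);
}

// Hypothetical findings with invented token figures.
const findings: Finding[] = [
  { kind: "bloat", tokenSavings: 21 },
  { kind: "duplicate", tokenSavings: 9 },
  { kind: "cache_bust", tokenSavings: 7 },
  { kind: "vague" },
  { kind: "missing_why" },
];

console.log(potentialSavingsTokens(findings)); // 37
```

Five findings go in, but only three contribute to the total, which is why the headline number is a floor.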

When the savings counter reads 0 but the file is still wasted
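A small case makes this concrete: a file of short, vague rules produces findings on every line and still sums to zero. The sketch assumes simple word-boundary matching on the vague terms listed in the table (plus the adverb form "appropriately"); the real matching rules, the `findVague` name, and the sample rules are all hypothetical.

```typescript
// Assumption: case-insensitive word-boundary match on the listed terms.
const VAGUE_TERMS = /\b(appropriate(ly)?|carefully|well|as needed)\b/i;

interface VagueFinding {
  kind: "vague";
  line: number;
  tokenSavings?: number; // never set for vague findings
}

function findVague(source: string): VagueFinding[] {
  return source
    .split("\n")
    .map((text, i) => ({ text, line: i + 1 }))
    .filter(({ text }) => VAGUE_TERMS.test(text))
    .map(({ line }) => ({ kind: "vague" as const, line }));
}

// Three short rules: none is over 28 words, none is a duplicate, none is dated.
const sample = [
  "Handle errors appropriately.",
  "Refactor carefully.",
  "Add logging as needed.",
].join("\n");

const found = findVague(sample);
const savings = found.reduce((sum, f) => sum + (f.tokenSavings ?? 0), 0);

console.log(found.length, savings); // 3 0
```

Every line is flagged and the counter still reads 0, so the finding count is the number to watch on a file like this.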

The shape of the fix

A bloat finding has one suggestion: "split into 2-3 shorter directives, each one actionable." The verb matters. Not "delete." Not "trim." Split. The point is to turn one ignored long sentence into two read short ones.

Find the verb in the original line. That is the rule. Pull it out as its own sentence under 12 words.
Find every qualifier ("appropriately," "carefully," "where applicable") and either delete it or replace it with a concrete condition.
If there is a real reason for the rule (we got burned, audit requires it), put it on its own line as Why: .... That also satisfies the missing_why detector for the rule.
Re-paste into ccmd. The bloat finding on that line should be gone. If it still flags, you missed a qualifier.
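Before re-pasting, the steps above can be checked locally with a sketch like the following. The 12-word limit and the three qualifiers come from the steps; the `checkSplit` helper, its result shape, and the sample directives are assumptions for illustration.

```typescript
const wordCount = (line: string): number =>
  line.trim().split(/\s+/).filter(Boolean).length;

// Qualifiers named in the steps above; a real list would likely be longer.
const QUALIFIERS = /\b(appropriately|carefully|where applicable)\b/gi;

interface SplitCheck {
  line: string;
  words: number;
  underTwelve: boolean;  // step 1: each directive under 12 words
  qualifiers: string[];  // step 2: leftover qualifiers to delete or replace
}

function checkSplit(lines: string[]): SplitCheck[] {
  return lines.map((line) => {
    const words = wordCount(line);
    return {
      line,
      words,
      underTwelve: words < 12,
      qualifiers: line.match(QUALIFIERS) ?? [],
    };
  });
}

// Hypothetical result of splitting one long rule into two directives and a Why: line.
const after = [
  "Handle edge cases.",
  "Write tests on the path you touch.",
  "Why: a missing null check took down the billing service.",
];

for (const result of checkSplit(after)) {
  console.log(result.words, result.underTwelve, result.qualifiers);
}
```

A line that still lists a qualifier here is the likely reason a bloat finding survives the re-paste.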

The Reddit-thread misread

When a popular Claude Code thread says "I cut my CLAUDE.md by 80% and it got better," the 80% usually came from three things at once: removing a stale section (dead lines, no findings to flag because the line was deleted), removing a dated header (cache_bust, full-line savings, the big multiplier), and splitting a few long rules (bloat, 35% per line).

The reader walks away thinking "bloat = 80%". The analyzer disagrees with that equation. Bloat is the 35%-of-the-line slice. Dated lines and stale sections are different findings. Reproducing the 80% on your file means doing all three. Reading the cuts as one undifferentiated category is why the second person who tries it ends up cutting the wrong lines and seeing no improvement.

Want a second pair of eyes on your CLAUDE.md?

Paste the file at ccmd.dev for the 220 ms scan, or book 20 minutes and we will go line by line on yours.

Questions

Frequently asked questions

What is 'CLAUDE.md token bloat' in plain terms?

It is the slice of your CLAUDE.md that ships to the model on every turn and pays tokens without producing behavior. Reddit threads use the phrase loosely (every kind of waste). In ccmd's analyzer it is a specific named finding (kind: 'bloat') that triggers when a single rule line exceeds 28 words. The other kinds of waste (cache_bust, duplicate, vague, aspirational, missing_why, conflict) are flagged under different names with different math.

Why does ccmd credit only 35% of a bloated line as savings?

Because the front half of a long rule is usually the part the agent actually reads. A 40-word line like 'Always handle edge cases appropriately and write comprehensive tests... because we got burned' has a real directive in the first 8 words and a tail of qualifiers, hedges, and war stories that the model skims. ccmd assumes you keep the directive and cut the tail. 35% is a conservative floor; cutting the whole line is often safe, but the analyzer refuses to claim credit it cannot defend per-file.

What is the exact threshold for the bloat finding?

wordCount > 28, set at src/lib/analyzer.ts line 150. The message string says 'over 25 words consistently get ignored' because the 25-word claim comes from public observation of long-rule comprehension; the 28-word threshold gives a 3-word buffer before the analyzer flags. A 28-word line passes; a 29-word line flags.

Why does the savings number ignore four of the seven finding kinds?

Three of the four (vague, aspirational, missing_why) flag lines the agent treats as no-ops, but the analyzer cannot honestly assign them a recoverable token count. A 'never use any' line is short and aspirational; cutting it saves a handful of tokens, but the right fix is usually to rewrite, not delete. The fourth (conflict) is a whole-file pattern, not a line, so per-line accounting does not apply. The reducer at src/lib/analyzer.ts line 271 sums only the three kinds (bloat, duplicate, cache_bust) that have a defensible per-line number.

Does this mean my real bloat is bigger than ccmd's number?

Yes, almost always. The headline 'potentialSavingsTokens' is a floor. Add the rough token cost of every vague, aspirational, and missing_why line you would also cut on a polish pass and the real number lands higher. We keep it conservative so the cuts ccmd recommends are ones we can stand behind on the specific file you pasted, not ones that sound right in a blog post.

How is bloat different from cache_bust?

Cache_bust is one line near the top with an ISO date or 'today' that flips the prompt cache from a ~10x discount to a cold read. The line's token cost is small; the surcharge is multiplicative. Bloat is a rule line that is too long; the line's token cost is large but each session pays once at the cached rate. Reddit threads conflate the two because both look like 'a line wasting tokens'. ccmd separates them because the fix is different (move the dated line, or split the long rule) and the cost shape is different (multiplier vs flat).
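The trigger for cache_bust is positional as well as textual, which a short sketch makes clearer. It assumes the rule as described in the table (an ISO date, 'today', or 'this session' within the first 20 lines); the regexes and the `findCacheBust` name are hypothetical, not the analyzer's own.

```typescript
// Assumption: YYYY-MM-DD is the only date format checked.
const ISO_DATE = /\b\d{4}-\d{2}-\d{2}\b/;
const VOLATILE_PHRASE = /\b(today|this session)\b/i;
const HEAD_LINES = 20;

// Returns the 1-based line numbers in the first 20 lines that would bust the cache.
function findCacheBust(source: string): number[] {
  return source
    .split("\n")
    .slice(0, HEAD_LINES)
    .map((text, i) => ({ text, line: i + 1 }))
    .filter(({ text }) => ISO_DATE.test(text) || VOLATILE_PHRASE.test(text))
    .map(({ line }) => line);
}

const sample = [
  "# Project rules",
  "Last updated 2026-05-21",
  "Use pnpm, not npm.",
].join("\n");

console.log(findCacheBust(sample)); // line 2 is flagged
```

The same dated line placed after line 20 would not be returned, which is why the fix is a move rather than a split.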

If I have 200 short rule lines, is my file bloated?

Not by the bloat finding's definition. None of the 200 will flag if each is 28 words or fewer. The aggregate volume can still be a problem (each line still ships every turn, the file is still ~2000 tokens) but ccmd will tag duplicates and vague terms instead. The literal 'bloat' counter on a file of short lines often reads zero even when the file is too big. Watch totalTokens and rubric score, not just the bloat finding.
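A sketch of how a file of short lines can read zero on bloat and still carry a duplicate. It assumes the duplicate rule described in the table (compare lines after trimming and lowercasing); the `findDuplicates` name, the result shape, and the choice to skip blank lines are assumptions.

```typescript
const wordCount = (line: string): number =>
  line.trim().split(/\s+/).filter(Boolean).length;

interface DuplicateFinding {
  kind: "duplicate";
  line: number;      // 1-based line of the repeated copy
  firstSeen: number; // 1-based line of the original
}

function findDuplicates(source: string): DuplicateFinding[] {
  const seen: { [key: string]: number } = Object.create(null);
  const findings: DuplicateFinding[] = [];
  source.split("\n").forEach((raw, i) => {
    const key = raw.trim().toLowerCase();
    if (key === "") return; // assumption: blank lines never count as duplicates
    const first = seen[key];
    if (first !== undefined) {
      findings.push({ kind: "duplicate", line: i + 1, firstSeen: first });
    } else {
      seen[key] = i + 1;
    }
  });
  return findings;
}

const sample = [
  "Use pnpm, not npm.",
  "Run tests before committing.",
  "  use pnpm, not NPM.  ",
].join("\n");

const bloatCount = sample.split("\n").filter((l) => wordCount(l) > 28).length;
console.log(bloatCount, findDuplicates(sample));
```

Here `bloatCount` is 0 while line 3 is reported as a copy of line 1, so the file has recoverable tokens that the bloat finding alone would never show.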

Where does the 35% factor come from?

It is an opinionated default in the code, not a benchmark. We picked it because cutting more than ~35% of a long line is usually a rewrite (which the analyzer cannot do for you), and cutting less is the trivial editorial pass (kill the qualifier tail). If you tune src/lib/analyzer.ts you can change the constant; the public analyzer at ccmd.dev runs with 0.35 and we have not seen a real file where that value misled a reader.

Is the analyzer the same for CLAUDE.md, AGENTS.md, .cursorrules, and .grokrules?

Yes for the bloat finding. The per-line scan that emits 'bloat' does not branch on file type. detectType (src/lib/analyzer.ts line 41) tags the input for the result header, but the wordCount > 28 check fires identically on all four formats. A long rule in .cursorrules pays the same per-turn cost in Cursor as the same long rule in CLAUDE.md pays in Claude Code. We have a polyglot walkthrough at /t/agent-config-token-bloat-audit.

What is the cheapest fix for a bloat finding?

Delete the qualifier tail. Take 'Always make sure to handle edge cases appropriately and write comprehensive tests because we have been burned' and keep 'Handle edge cases. Write tests on the path you touch.' Two short sentences, both testable, ~60% fewer tokens, and the agent reads both. The bloat finding's suggestion field literally says 'split into 2-3 shorter directives, each one actionable.'

Should I delete every line ccmd flags?

No. Treat duplicate as a definite cut and bloat as a split that drops the qualifier tail; treat cache_bust as a move (relocate the dated line to the bottom or remove the date). Treat vague, aspirational, and missing_why as rewrites (replace the term, add an escape, add a 'why'). Conflict is a manual triage; pick one rule and scope the other. The analyzer prints all seven so you can decide; it does not file the cut for you.
