/t · guide · cost

A CLAUDE.md token-cost audit, in three states and one formula.

Matthew Diakonov, Written with AI

Published May 18, 20265 min read

"Audit your CLAUDE.md" is the most common piece of advice on this topic. Almost none of those posts say what an audit actually produces. The dollar version is small enough to fit on a napkin: three cost states, one formula, one line in the file that flips between two of the states. The whole audit takes 220 ms in your browser and produces a number you can put in front of your CTO.

1. The file

A trimmed sample for legibility. Pretend it's the first 200 tokens of a 6,000-token file (yours probably is, between stack notes, conventions, prior incidents, and skill catalog references).

CLAUDE.md

The only line that matters for the cost audit on this snippet is L3. The rest of the audit (vague absolutes on L7-9, missing-why on the prohibitions) shows up in the structural audit on the context-burn guide. Different scoreboard.

2. The formula

Lives in src/lib/analyzer.ts at line 268 of the analyzer that powers the homepage. Three constants, one multiplication, one division.

analyzer.ts

That number is the cold-read upper bound: every turn re-bills the full file at full input cost. The prompt cache, when it hits, divides that number by roughly 10. Anthropic's public guidance puts cache read at 0.1x the base input rate; in practice we see 8-12x in the wild depending on file size and session shape.

÷0chars per token

0turns / session

$0/MOpus 4.7 input

×0cache hit cost

3. The three states

One file, one formula, three answers. The file above on a realistic 6,000-token expansion, 30-turn session, Opus 4.7 input.

1
Cold
First turn of a session. No cache to hit. 6,042 × $5 / 1M = $0.0302 per turn. One turn only.
2
Cached
Turns 2 through N in a session where the prefix matches. ~$0.0030 per turn. Across 30 turns: $0.091. This is the happy path.
3
Cache-busted
Every turn pays full input cost because line 3 changed (a new date, a session-specific note). 30 × $0.0302 = $0.906. The bad path. The audit catches this.

Multiply each state by sessions per week. One long session a day, five days a week, fifty weeks a year:

Cached: $0.091 × 250 = $22.75 per year.
Cache-busted: $0.906 × 250 = $226.50 per year.

$200 a year per engineer that almost nobody notices, attributable almost entirely to one timestamp on line 3. Across a five-person team on a shared CLAUDE.md it is a $1,000 line.

4. The audit output

What the analyzer prints when you paste the file in. The UI on ccmd.dev renders the same data in a richer panel; the shape is the terminal equivalent.

ccmd · cost report

Two numbers on the same line tell the whole story. The current state is in red because a cache_bust finding fired. The cached state is the number you'd see if you cut one line. Delta 10x.

5. The one-line fix

Move the date out of the cached prefix. Either delete it (the agent gets the current date from its system context anyway) or push it to the bottom of the file behind a heading so the top stays stable.

L3 — what to do with the date

Today is 2026-05-18 and we are mid-sprint on the subscriptions rewrite. (line 3, voids the cache every session)

Mutates on every new session
Full input cost on all 30 turns
$0.906 per session, $226.50 per year

every turn

“every single API call to Claude sends the whole context, including prompts, meaning that all this extra text in CLAUDE.md is sent over and over”

caymanjim, Hacker News

6. Where the existing tooling stops

The two camps in this space and what they each leave on the floor.

Feature	ccusage / claude-meter	ccmd cost audit
Shows what you spent last week	yes	no (forward-looking)
Shows what each CLAUDE.md line costs	no	yes, per line
Catches cache-busting prefix mutations	not modeled	regex on first 20 lines
Three cost states (cold / cached / busted)	single retrospective number	explicit
Recommends a specific fix	no (read-only)	named line + token saving
Works on the file you already wrote	needs CLI install + a billing window	paste in browser, 220 ms

Use both. ccusage tells you the bill is too big. ccmd tells you which line did it.

7. What this audit does not cover (yet)

Three things the cost number above is honest about leaving out. Each one is its own line item on the real bill.

Skill listing tax. Every installed skill costs 75-150 tokens in the per-session skill list. 12 skills is ~1,500 tokens of frontmatter shipped on every turn, separate from CLAUDE.md. Covered in the skills bloat guide.
Tool-call tokens. Read, Bash, Grep, subagent calls each cost their own input and output tokens. A workflow rule in CLAUDE.md that triggers three subagents per task multiplies that cost. The Agent SDK billing change on June 15 2026 moves subagent calls to a separate credit pool; the cost-audit simulator for that change is in progress.
Output tokens. Opus 4.7 output is $25 per million, 5x the input rate. CLAUDE.md itself is 100% input, but rules like "explain your reasoning verbatim before each change" inflate output tokens. The structural audit on the context-burn page catches verbosity rules; it does not price them.

Want a second pair of eyes on your CLAUDE.md cost?

15 minutes, walk through the audit on your actual file, leave with a per-line dollar number you can take to your team. Free.

Frequently asked questions

What is a CLAUDE.md token cost audit?

It's the dollar version of a context audit. Instead of asking 'is my file good', you ask 'what does this file cost me per turn, per session, per month, in three states: cold, cached, cache-busted'. The math is small: tokens × turns × input rate. At Opus 4.7 ($5 per million input tokens as of 2026-05-18), a 6,000-token CLAUDE.md is $0.03 per cold turn, $0.003 per cached turn, and roughly $0.90 per long session if the cache never hits. Multiply by sessions per week to get your real bill.

Why three cost states?

Because the same file produces three very different invoices depending on what the prompt-cache prefix looks like. Cold: the first turn of a session, no cache to hit, full input cost. Cached: every subsequent turn in a session where the prefix is byte-identical, roughly a tenth of the cold cost. Cache-busted: any session where a volatile line (an ISO date, 'today is', 'this session', 'right now') near the top of the file mutates the prefix and forces full cost on every turn. The third state is the surprise on your bill.

How do I actually run the audit on my file?

Paste your CLAUDE.md into the textarea at the top of ccmd.dev. The analyzer counts tokens (chars / 4, same heuristic every CLI uses), runs seven checks against the content, and prints the three cost states. It runs entirely in your browser; nothing is uploaded. The cache-bust check is the load-bearing one for cost: it fires the regex \b20[2-9]\d-\d{2}-\d{2}|today|this session|right now\b on the first 20 lines and flags any line that would void the prompt cache.

Where does the 30-turn session number come from?

It's the working heuristic the analyzer uses for a long session, codified in analyzer.ts at line 266 as 'const TURNS = 30'. Short ad-hoc Claude Code sessions are 5-10 turns. The kind of session where token cost actually shows up on your weekly limit is the multi-hour refactor or feature build, which lands in the 25-40 turn range. 30 is the round number we picked. Your real number is in your usage page; plug it into the formula.

How do I get the token cost per session for my own file, not the 6,000-token sample?

Paste your actual CLAUDE.md into the textarea on ccmd.dev and read the token count it returns. Then run the same formula with your two real numbers: tokens × turns × $5/M ÷ 1M. The worked example on this page assumes 6,000 tokens and a 30-turn session; swap both for your figures (turn count is in the Claude Code usage view). If a volatile line near the top of your file trips the cache-bust finding, use the cold per-turn cost on every turn for the per-session total; if it doesn't, use the cached cost for turns 2 through N. That gives you the token cost per session for the file you actually wrote, not the sample.

Why does one ISO date matter so much?

Prompt caching only hits when the prefix is byte-for-byte identical to a previous request. Every new chat session writes a different date string at line 3, which means the cached prefix never matches, which means every turn in the new session pays full input cost on the entire file. On a 6,000-token CLAUDE.md, that one line is the difference between $0.09 and $0.90 per session. Cheaper to delete it than to optimize anything else in the file.

What input rate should I use for the math?

As of 2026-05-18 Opus 4.7 input is $5 per million tokens (down from $15 on Opus 4.1). Sonnet 4.6 input is $3 per million. Haiku 4.5 is $1. ccmd's analyzer is calibrated to Opus 4.7 because that's what most paid Claude Code sessions actually use. If you're on Sonnet by default, multiply our dollar numbers by 0.6.

Does the audit catch costs from Read, Bash, or subagent calls too?

Not yet. The audit on ccmd.dev today scores the CLAUDE.md file itself: token count, three cost states, line-by-line findings. A separate piece in progress is the Agent SDK credit-cost simulator that estimates the per-session cost contribution of Read, Bash, and subagent density given a CLAUDE.md's workflow rules. That ships ahead of the June 15 2026 Agent SDK billing change, where Claude Code subagent calls move to a separate credit pool.

Why is this different from ccusage or claude-meter?

ccusage and claude-meter are retrospective: they tell you what you spent last week and which sessions were expensive. They cannot tell you why. The CLAUDE.md cost audit is forward-looking and causal: it points at the specific line in the file that is producing the cost and gives the dollar delta of fixing it. The two tools complement each other; use ccusage to see the bill, use ccmd to see the cause.