Your CLAUDE.md token budget per turn is one ceiling, not an allowance.

A "per-turn budget" sounds like a spend you get back each turn. It is not. Your whole CLAUDE.md is re-billed in full on every turn, so the only budget that exists is a single fixed ceiling on the file's total token count. Set the number, then keep the file under it.

Matthew Diakonov, Written with AI

Published May 19, 2026 · 8 min

Direct answer (verified 2026-05-19)

There is no per-turn token allowance that resets. Every turn re-sends your entire CLAUDE.md, so a per-turn budget is one number: the maximum token count you allow the whole file to reach. Anthropic's Claude Code cost docs put the practical ceiling at "under 200 lines", which is roughly 2,000 to 2,500 tokens at the chars / 4 estimate. Set your budget there, then keep the file under it.

Ceiling from code.claude.com/docs/en/costs. Firing model from src/lib/analyzer.ts line 264.

The phrase "per turn" misleads you

A budget, in normal usage, is something you spend down and get back. A monthly budget refills monthly. A per-request quota refills per request. So "token budget per turn" reads like a discretionary pool: some tokens you choose to spend on CLAUDE.md this turn, some you save. That mental model is wrong, and it is the reason people keep tuning the wrong thing.

CLAUDE.md does not get a per-turn allowance. It is a standing reservation. The file's entire token count sits in the system prompt on every single turn, unchanged, whether the turn is a one-line question or a 40-tool refactor. You do not decide per turn how much of it to load. You decided that once, when you saved the file at whatever size it is.

Two ways to read "token budget per turn"

You picture a pool of tokens you allocate to CLAUDE.md each turn. Some turns you spend more, some less. Tuning means deciding, per turn, how much of the file to send. This model leads you to look for a setting that does not exist.

Implies a per-turn knob you can turn
Implies the file is partly loaded sometimes
Sends you hunting for a config flag

The proof: the analyzer never models a per-turn fraction

Our analyzer is a pure function with no network calls. When it computes the per-turn token figure, there is no fraction anywhere in the code. It estimates total tokens as characters over 4, then sets the per-turn number equal to that total. That is the whole calculation.

src/lib/analyzer.ts
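The analyzer source is not reproduced here, so the following is only a minimal sketch of the calculation as described, not the file's actual contents. The field names totalTokens and estimatedTokensFireEveryTurn and the Math.ceil(text.length / 4) estimate come from this article; the function and interface names are hypothetical.

```typescript
// Sketch only: function and interface names are illustrative,
// not copied from src/lib/analyzer.ts.

// chars / 4 heuristic, rounded up.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface TokenSummary {
  totalTokens: number;
  estimatedTokensFireEveryTurn: number;
}

function summarizeTokens(claudeMd: string): TokenSummary {
  const totalTokens = estimateTokens(claudeMd);
  // No fraction and no partial load: the per-turn figure is the total.
  return { totalTokens, estimatedTokensFireEveryTurn: totalTokens };
}

const summary = summarizeTokens("a".repeat(10));
console.log(summary.totalTokens === summary.estimatedTokensFireEveryTurn); // true
```

The point of the sketch is what is absent: no parameter for the turn, the task, or the number of tools used feeds into the per-turn figure.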

This matters because it tells you exactly which number to budget. There is no "portion that fires" to tune down. There is no partial load. The per-turn token cost of CLAUDE.md is a property of the file, full stop, so the budget is a property of the file too: one ceiling on totalTokens. Anthropic's own cost guidance says the same thing in plain English: your CLAUDE.md is "loaded into context at session start" and its tokens "are present even when you're doing unrelated work."

Setting the number

A budget you never picked is not a budget. So pick one. The cleanest anchor is Anthropic's own line in the Claude Code cost docs: "Aim to keep CLAUDE.md under 200 lines by including only essentials." That is a line ceiling. To turn it into a token budget, count characters and divide by 4, the same estimate the analyzer uses.

A 200-line file averaging 45 to 55 characters per line is about 9,000 to 11,000 characters, which is roughly 2,250 to 2,750 tokens. Sparse files with short directives land lower; dense prose lands higher. Here is the conversion as three working tiers. Pick the row that matches your file's job, then treat its token number as your ceiling.

File role	Line budget	Token budget per turn
Single-purpose repo, one language	~120 lines	~1,500 tokens
Standard app repo (Anthropic's default)	~200 lines	~2,500 tokens
Monorepo root with shared conventions	~280 lines	~3,500 tokens

Line numbers anchor to Anthropic's 200-line guidance; token numbers are the chars/4 conversion at typical CLAUDE.md line density. The monorepo row assumes the root file plus what nested files add; everything in the firing prefix counts against one combined ceiling.
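The line-to-token conversion above is plain arithmetic, and a small sketch makes it easy to rerun with your own line density. The function name is hypothetical, and the 50 characters per line used for the tiers is an assumption inferred from the table (each row works out to 12.5 tokens per line).

```typescript
// Sketch only: converts a line budget into a token budget
// using the chars / 4 estimate.
function tokensForLines(lines: number, avgCharsPerLine: number): number {
  return Math.ceil((lines * avgCharsPerLine) / 4);
}

// The 200-line range quoted above: 45 to 55 characters per line.
console.log(tokensForLines(200, 45)); // 2250
console.log(tokensForLines(200, 55)); // 2750

// The three tiers, assuming about 50 characters per line.
console.log(tokensForLines(120, 50)); // 1500
console.log(tokensForLines(200, 50)); // 2500
console.log(tokensForLines(280, 50)); // 3500
```

If your file runs denser or sparser than 50 characters per line, substitute your own average and use the result as the ceiling.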

A budget is necessary, not sufficient

Here is the part most cost guides skip. Hitting the budget does not make the file good. We ran our analyzer on the SAMPLE_CLAUDE_MD constant that ships in src/lib/analyzer.ts, a deliberately bad example file. It is 280 tokens across 35 lines: comfortably inside even the tightest budget row above. And it still fails on every quality axis.

Analyzer output on the 280-token sample

18 findings flagged, all quality problems, none about size.

2 / 12 Karpathy rubric rules passed. Ten missing.

0 recoverable tokens. You cannot trim your way out of this one.

The 18 findings break down as 10 aspirational absolutes with no escape clause, 5 vague terms with no testable meaning, and 3 prohibitions with no stated reason. None of them carry a tokenSavings number, which is why recoverable tokens is 0. The fix is not deletion, it is rewriting. A budget caps what the file costs you per turn. It does nothing about whether the file works. You need both: a size ceiling and a quality pass.
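To see why 18 findings can still add up to 0 recoverable tokens, here is a minimal sketch of the sum. The finding kinds and the tokenSavings field are named in this article; the interface shape, the helper names, and the mapping of "prohibitions with no stated reason" to the missing_why kind are assumptions for illustration.

```typescript
// Sketch only: the Finding shape is assumed, not copied from the analyzer.
type FindingKind =
  | "bloat" | "duplicate" | "vague" | "aspirational"
  | "missing_why" | "conflict" | "cache_bust";

interface Finding {
  kind: FindingKind;
  tokenSavings?: number; // absent when fixing the line frees no tokens
}

// Recoverable tokens: the sum of tokenSavings over all findings.
function recoverableTokens(findings: Finding[]): number {
  return findings.reduce((sum, f) => sum + (f.tokenSavings ?? 0), 0);
}

// Hypothetical helper to build N findings of one kind.
function repeatFinding(kind: FindingKind, count: number): Finding[] {
  return Array.from({ length: count }, () => ({ kind }));
}

// The sample's breakdown: 10 aspirational, 5 vague, 3 missing_why (assumed kind).
const sampleFindings: Finding[] = [
  ...repeatFinding("aspirational", 10),
  ...repeatFinding("vague", 5),
  ...repeatFinding("missing_why", 3),
];

console.log(sampleFindings.length);             // 18
console.log(recoverableTokens(sampleFindings)); // 0
```

Rewrite-type findings contribute nothing to the sum, so a file can be full of problems and still have nothing to trim.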

Enforcing the ceiling

Once you have a number, enforcement is a comparison: file token count versus ceiling. Paste your file into the analyzer on ccmd.dev and it reports totalTokens and estimatedTokensFireEveryTurn (always equal). It runs in your browser, no upload, no signup. Here is what a check looks like on a file that is over a 2,500-token budget:

ccmd budget check
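The analyzer's real output is not reproduced here. The comparison it describes can be sketched in a few lines, under the assumption that the check is just the chars / 4 estimate against your chosen ceiling. The function name, the result shape, and the 12,000-character test input are all hypothetical.

```typescript
// Sketch only: names and result shape are illustrative,
// not the actual ccmd.dev report format.
interface BudgetCheck {
  totalTokens: number;
  ceiling: number;
  overBy: number;        // 0 when the file is within budget
  withinBudget: boolean;
}

function checkBudget(fileText: string, ceiling: number): BudgetCheck {
  const totalTokens = Math.ceil(fileText.length / 4); // chars / 4 estimate
  const overBy = Math.max(0, totalTokens - ceiling);
  return { totalTokens, ceiling, overBy, withinBudget: overBy === 0 };
}

// A 12,000-character file checked against a 2,500-token ceiling.
const result = checkBudget("x".repeat(12000), 2500);
console.log(result.totalTokens);  // 3000
console.log(result.overBy);       // 500
console.log(result.withinBudget); // false
```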

When you are over, deleting content is not the first thing to reach for. Two cheaper moves come before it. First, move conditional rules off the every-turn surface: a rule that only matters during database migrations or PR reviews does not belong in a file that fires every turn. Anthropic's docs say it plainly, "move instructions from CLAUDE.md to skills", because skills load on demand and stop counting toward the per-turn budget entirely. Second, kill duplicates and split bloated lines: those are bytes inside your budget that produce nothing. Trimming actual content is the third lever, not the first.

A budget set once does not stay met. Every commit that touches CLAUDE.md is a chance to drift back over the ceiling, and the file tends to grow because adding a rule is easy and removing one feels risky. The free analyzer is a one-shot check. The paid tier ($9 to $19 per month solo, $49 for a team) watches the file over time, sends a weekly drift email, and posts a diff comment on any pull request that pushes the file back over your budget.

Want a per-turn budget set against your actual CLAUDE.md?

Paste your file on ccmd.dev for the free in-browser scan, or book 20 minutes and we will pick a ceiling with you and rank what to cut to hit it.

Frequently asked questions

What CLAUDE.md token budget should I set per turn?

Set one ceiling on the whole file, not a per-turn spend. There is no allowance that refills each turn: your entire CLAUDE.md is re-sent in the system prompt on every turn, so the only number you control is the file's total token count. Anthropic's Claude Code cost docs put the practical ceiling at 'under 200 lines'. At a typical line density that is roughly 2,000 to 2,500 tokens by the chars/4 estimate. Pick a number in that range, write it down, and treat any file above it as over budget.

Does my CLAUDE.md get a fresh token budget each turn?

No. Nothing resets. The Messages API is stateless, so the harness rebuilds the full context on every turn and your whole CLAUDE.md sits in the system-prompt prefix on turn 1 and on turn 100. Our analyzer encodes this directly at src/lib/analyzer.ts line 264: estimatedTokensFireEveryTurn = totalTokens, with no fraction. A 'per-turn budget' is not a recurring quota; it is a single standing ceiling on the file. Going over budget once means going over budget on every turn for the life of the session.

How do I convert Anthropic's '200 lines' into a token number?

Count characters, divide by 4. That is the heuristic the analyzer uses (Math.ceil(text.length / 4) at src/lib/analyzer.ts line 38). A 200-line CLAUDE.md averaging about 45 to 55 characters per line is roughly 9,000 to 11,000 characters, which is about 2,250 to 2,750 tokens. Sparse files with many short directives and blank lines land lower, near 1,600 tokens; dense prose files land higher, near 3,500. So 'under 200 lines' is a budget of roughly 2,000 to 2,500 tokens for a normal project file.

If my CLAUDE.md is small, is it automatically within budget?

It is within budget, but being under the size ceiling is necessary, not sufficient. We ran our analyzer on the deliberately bad SAMPLE_CLAUDE_MD that ships in src/lib/analyzer.ts: it is only 280 tokens across 35 lines, comfortably inside any budget, yet it returns 18 findings, passes only 2 of the 12 Karpathy rubric rules, and has 0 recoverable tokens. Every problem in it is a quality problem, not a size problem. A budget caps how much the file costs you per turn; it does not make the file any good.

What actually happens when CLAUDE.md goes over budget?

Two things. First, the overage is billed on every turn, so a file 2,000 tokens over budget pays that 2,000-token tax 30 times in a 30-turn session before Claude reads a line of your repo. Second, the file competes with your working context: the bytes that explain your project sit in the same window as the code Claude is editing, and a bloated config crowds the part that actually matters. Cutting the file is the only lever; the per-turn cost is a property of the file, not of the turn.
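The overage arithmetic in that answer can be sketched as a one-line multiplication. The function name and the 4,500-token example file are hypothetical, and the sketch counts tokens re-sent, not dollars.

```typescript
// Sketch only: tokens re-sent above the ceiling across a whole session.
function sessionOverageTokens(
  fileTokens: number,
  ceiling: number,
  turns: number,
): number {
  const overagePerTurn = Math.max(0, fileTokens - ceiling);
  // The overage repeats on every turn because the whole file is re-sent.
  return overagePerTurn * turns;
}

// A 4,500-token file against a 2,500-token ceiling is 2,000 over.
console.log(sessionOverageTokens(4500, 2500, 30)); // 60000
```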

Does the per-turn budget apply to AGENTS.md, .cursorrules, and .grokrules?

Yes, identically. The analyzer detects file type by content (detectType at src/lib/analyzer.ts line 41) and runs the same token math on all four formats. Codex (AGENTS.md), Cursor (.cursorrules), and Grok Build (.grokrules) each inject their rule file into the host model's system prompt and re-send it on every turn, exactly the way Claude Code injects CLAUDE.md. The budget is the same idea everywhere: one ceiling on the file, because the whole file fires every turn.

How do I check my own file against a budget?

Paste it into the textarea on ccmd.dev. The analyzer runs entirely in your browser, no upload and no signup, and reports totalTokens and estimatedTokensFireEveryTurn (always equal). Compare totalTokens to your chosen ceiling. It also returns a per-line findings array tagged by kind (bloat, duplicate, vague, aspirational, missing_why, conflict, cache_bust) so you can see which lines push you over and which conditional rules can move to a skill that only loads on demand.
