Do CLAUDE.md rules fire on every turn? Yes. The fix isn't cutting bytes.

Matthew Diakonov, Written with AI

Published May 18, 2026 · 5 min read

Most posts on this topic stop after the first sentence above: "Every line fires every turn, so trim your file." That is correct, but it is only half the picture. The other half is that you have three sibling surfaces where the same rule can live without firing every turn. Most CLAUDE.md files we score have at least a third of their bytes on the wrong surface.

This page covers: how we prove the "every turn" claim from the analyzer source, the four firing surfaces side by side, and which kinds of rules belong on which surface.

1. The claim, from the source

Our analyzer is pure client-side TypeScript that runs in your browser when you paste a file into ccmd.dev. Open DevTools, watch the network tab, paste a file: no POST. Two relevant slices from src/lib/analyzer.ts:

src/lib/analyzer.ts

The field is named estimatedTokensFireEveryTurn because that is the model. On a CLAUDE.md the assignment is unconditional. The same identifier ships across all four file types we detect; the assumption is the same for AGENTS.md, .cursorrules, and .grokrules.
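
As a rough illustration of that model (not the actual contents of src/lib/analyzer.ts), a minimal sketch might look like the following. Only the names estimatedTokensFireEveryTurn, totalTokens, and analyzeConfig come from this page; the AnalysisSketch shape and the four-characters-per-token estimate are assumptions made for the example.

```typescript
// Hypothetical result shape; only the two token field names come from the article.
interface AnalysisSketch {
  totalTokens: number;
  estimatedTokensFireEveryTurn: number;
}

// Assumption: a crude ~4 characters-per-token estimate stands in for a real tokenizer.
function estimateTokensSketch(text: string): number {
  return Math.ceil(text.length / 4);
}

// Sketch of the always-on model: the whole file counts on every turn.
function analyzeConfig(fileContents: string): AnalysisSketch {
  const totalTokens = estimateTokensSketch(fileContents);
  return {
    totalTokens,
    // Unconditional: no per-line gating, so the every-turn figure is the full file.
    estimatedTokensFireEveryTurn: totalTokens,
  };
}
```

The point of the sketch is the last assignment: there is no branch, no per-line filter, and no condition between the file total and the every-turn total.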

2. What that looks like on a real file

A 6,042-token CLAUDE.md from a Next.js payments repo, run through analyzeConfig(). The number under estimatedTokensFireEveryTurn is the full file, every turn, no gating.

ccmd · firing surfaces

The bottom of that output is the part most audit guides skip. CLAUDE.md is one row, not the only row. The other three rows are where "rules that fire when needed" actually live.

every turn

“every single API call to Claude sends the whole context, including prompts, meaning that all this extra text in CLAUDE.md is sent over and over”

caymanjim, Hacker News thread 47581701

3. The four firing surfaces

Claude Code reads from four places. Three of them are gated by something the model or the harness does at the relevant moment. Only one fires no matter what.

Feature	CLAUDE.md (always-on)	Conditional surfaces (skills, hooks, MCP descriptions)
When does it enter the context?	Session start, then every turn of the session	Only when the model invokes it / event fires / tool is selected
Token cost on turns that do not need the rule	Full file every turn	Zero
Good fit	Stack, broad style, identity, ground rules	Workflow-specific procedures, tool gates, format outputs
Bad fit	Long procedures, step-by-step recipes, niche tool rules	Identity, top-level project facts, hard prohibitions
How to add it	Append to the markdown file	Skill .md, settings.json hook, or MCP server description
How to remove it	Edit the file (touches cache prefix)	Disable the skill / hook / tool (cache prefix unchanged)

The naming convention in the analyzer maps to the firing semantics: we report estimatedTokensFireEveryTurn only for the unconditional file. Skills and hooks have their own analyzer surfaces and their own cost model.
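
The cost rows of the table can be read as a simple per-session model. The sketch below is an illustration only: the SurfaceKind and SurfaceRule types are hypothetical, and it assumes a conditional rule costs its tokens only on the turns where it is triggered, which glosses over the fact that hooks run outside the prompt altogether.

```typescript
// Hypothetical model of the table above; these are not Claude Code APIs.
type SurfaceKind = "always_on" | "conditional";

interface SurfaceRule {
  surface: SurfaceKind;
  tokens: number;
}

// Tokens a rule contributes over one session.
// Assumption: a conditional rule is only paid for on turns where it is triggered.
function sessionTokens(rule: SurfaceRule, totalTurns: number, triggeredTurns: number): number {
  const turns =
    rule.surface === "always_on" ? totalTurns : Math.min(triggeredTurns, totalTurns);
  return rule.tokens * turns;
}

// A 200-token rule over a 30-turn session that needs it once.
console.log(sessionTokens({ surface: "always_on", tokens: 200 }, 30, 1)); // 6000
console.log(sessionTokens({ surface: "conditional", tokens: 200 }, 30, 1)); // 200
```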

4. Triage: which lines stay, which lines move

Open your file and walk it top to bottom. For each block, ask one question: "does this need to be in front of the model on every turn of every session?" If no, it has a better home.

Stack and identity — keep on CLAUDE.md

Lines like 'Next.js 16, Postgres on Neon, Stripe, deploys to Vercel'. The model needs the stack to interpret any request. Same with the team or solo identity. These are 10-20 lines and they belong in the always-on surface. This is the only block where 'fires every turn' is the right behavior.

Workflow procedures — move to skills

Anything that reads like 'when the user asks to do X, follow these steps'. A skill .md with a name and description gets loaded only when the description matches the user's request. If you have a 200-token 'how to add a new API route' procedure in CLAUDE.md and you add one route per week, that procedure fires every turn of every session anyway. Move it to skills/add-api-route.md.

Tool gates and pre-checks — move to hooks

Rules like 'never run destructive SQL without dry-run first' belong as a PreToolUse hook on Bash and Edit, not as a CLAUDE.md sentence. The hook fires exactly when the model is about to use the tool; the sentence in CLAUDE.md fires every turn whether the model is writing SQL or formatting a paragraph. Hooks also bypass the 'model decided to ignore the rule' failure mode.
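
A sketch of the decision logic such a hook could run. It assumes the hook script receives a JSON payload carrying tool_name and tool_input fields and that the wrapper blocks the call by exiting with status 2 and a message on stderr; check both against the current Claude Code hooks documentation before relying on them. The SQL patterns and the --dry-run convention are hypothetical.

```typescript
// Assumed payload shape for a PreToolUse hook; verify against the hooks docs.
interface PreToolUsePayload {
  tool_name: string;
  tool_input: { command?: string; new_string?: string };
}

// Hypothetical patterns for what counts as destructive SQL.
const DESTRUCTIVE_SQL = /\b(drop\s+table|truncate\s+table|delete\s+from)\b/i;

// Returns a reason to block the tool call, or null to allow it.
function checkDestructiveSql(payload: PreToolUsePayload): string | null {
  // Only gate the two tools the rule is about.
  if (payload.tool_name !== "Bash" && payload.tool_name !== "Edit") return null;
  // Bash carries a command; Edit is assumed to carry the new text.
  const text = payload.tool_input.command ?? payload.tool_input.new_string ?? "";
  if (!DESTRUCTIVE_SQL.test(text)) return null;
  // Hypothetical convention: a --dry-run flag marks a safe rehearsal.
  if (text.includes("--dry-run")) return null;
  return "Destructive SQL detected: run with --dry-run first.";
}
```

A wrapper script would parse the payload from stdin, call checkDestructiveSql, and turn a non-null reason into a blocking exit. None of that text sits in the system prompt on turns where no tool is called.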

MCP-specific instructions — move to the MCP server description

Rules about how to call a specific MCP tool ('use this argument shape', 'pass timeout in milliseconds') belong in the tool's description, which only enters the context when the model picks that tool. CLAUDE.md sentences about an MCP tool fire every turn even on turns where the tool is irrelevant.
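
For illustration, here is what moving an argument rule into a tool definition could look like. The name, description, and inputSchema fields follow the general shape of an MCP tool definition; the fetch_report tool and its arguments are hypothetical.

```typescript
// Minimal shape of an MCP tool definition: name, description, JSON Schema input.
interface McpToolSketch {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

// Hypothetical tool: the argument rule lives here instead of in CLAUDE.md.
const fetchReportTool: McpToolSketch = {
  name: "fetch_report",
  description: "Fetch a report by id. Pass timeout in milliseconds, not seconds.",
  inputSchema: {
    type: "object",
    properties: {
      reportId: { type: "string", description: "Report identifier." },
      timeoutMs: { type: "number", description: "Timeout in milliseconds." },
    },
    required: ["reportId"],
  },
};
```

Putting the unit in both the description and the property name means the rule travels with the tool rather than with the project file.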

Hard prohibitions with a Why — keep on CLAUDE.md

Lines like 'never commit secrets. Why: leaked Stripe key in March 2026.' need to be in front of the model on every turn because the model has no way of knowing on which turn the rule will apply. Always-on is the right surface. Just keep them short, name the past incident, and prefer 'NEVER X. Why: Y' over decorated paragraphs.

Aspirational lines (always / never / must) without exceptions — delete

These do not belong on any surface. The rule fires, the model can't resolve it against a real edge case, and it gets silently ignored. The analyzer flags these as 'aspirational' findings. You have three options: add an escape clause and keep the line on CLAUDE.md, replace it with a concrete test and keep it on CLAUDE.md, or delete it.

Volatile lines (dates, 'today', 'this session') — delete or move to the bottom

Anything that changes between sessions belongs nowhere near the top of CLAUDE.md. It busts the prompt cache prefix and re-bills the full file at full input cost on every turn. The analyzer flags these as 'cache_bust' findings. If you must keep a date, put it at the bottom past the cache breakpoint or rotate it out of the file entirely.
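
A quick way to spot such lines before pasting a file anywhere is a small scan like the one below. It is a sketch under assumptions: the patterns are illustrative and are not the analyzer's actual cache_bust rule.

```typescript
// Hypothetical patterns for volatile content; not the analyzer's real rule set.
const VOLATILE_PATTERNS: RegExp[] = [
  /\b\d{4}-\d{2}-\d{2}\b/, // ISO date such as 2026-05-18
  /\btoday\b/i,
  /\bthis session\b/i,
];

interface VolatileFinding {
  line: number; // 1-based line number in the file
  text: string;
}

// Returns every line that matches at least one volatile pattern.
function findVolatileLines(fileContents: string): VolatileFinding[] {
  return fileContents
    .split("\n")
    .map((text, index) => ({ line: index + 1, text }))
    .filter(({ text }) => VOLATILE_PATTERNS.some((pattern) => pattern.test(text)));
}
```

A line number near the top of the result is the expensive case, because everything after the first changed byte falls outside the cached prefix.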

5. What the math looks like after triage

The 6,042-token CLAUDE.md from the sample above, after taking four groups of lines off the always-on surface:

Move 6 workflow procedures to skills/ (around 1,800 tokens). They were each firing every turn of every session, including turns where nothing in the procedure was relevant. After move: zero tokens on those turns.
Move 3 SQL / migration prohibitions to a PreToolUse hook on Bash and Edit (around 400 tokens). The hook is a settings.json entry, not a system-prompt line.
Move 2 MCP-tool argument rules to the MCP server description (around 250 tokens). They now load only when the model picks the tool.
Delete 1 ISO date line and 4 aspirational absolutes (around 200 tokens, plus the cache_bust recovery).

Remaining CLAUDE.md: roughly 3,400 tokens of stack, identity, and short prohibitions with named past incidents. At Opus 4.7 input rates and a 30-turn session, the cold-read input drops from $0.91 to $0.51 per session, and the cache-hit input drops from $0.091 to $0.051. Those numbers matter less than the second-order effect: the model is no longer fighting 6,000 tokens of context every turn to surface the 50 tokens that actually apply.
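
The dollar figures can be reproduced with a few lines of arithmetic. The rate of $5 per million input tokens and the 10% cache-hit multiplier below are assumptions inferred from the numbers in the paragraph, not quoted from a price list, and the cached case is simplified to treat every turn as a cache hit.

```typescript
// Assumed rates, inferred from the article's figures rather than a price list.
const INPUT_USD_PER_MILLION_TOKENS = 5;
const CACHE_HIT_MULTIPLIER = 0.1;

// Input cost of re-sending a file of `fileTokens` on each of `turns` turns.
// Simplification: when `cached` is true, every turn is billed as a cache hit.
function sessionInputCostUsd(fileTokens: number, turns: number, cached: boolean): number {
  const fullCost = (fileTokens * turns * INPUT_USD_PER_MILLION_TOKENS) / 1_000_000;
  return cached ? fullCost * CACHE_HIT_MULTIPLIER : fullCost;
}

const SESSION_TURNS = 30;
console.log(sessionInputCostUsd(6042, SESSION_TURNS, false).toFixed(2)); // "0.91"
console.log(sessionInputCostUsd(3400, SESSION_TURNS, false).toFixed(2)); // "0.51"
console.log(sessionInputCostUsd(6042, SESSION_TURNS, true).toFixed(3)); // "0.091"
console.log(sessionInputCostUsd(3400, SESSION_TURNS, true).toFixed(3)); // "0.051"
```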

Frequently asked questions

Do CLAUDE.md rules really fire on every turn?

Yes, every line. Claude Code concatenates the contents of CLAUDE.md into the system prompt at session start and re-sends that entire prompt on every API call for the session. Our analyzer codifies the assumption directly: src/lib/analyzer.ts line 264 sets estimatedTokensFireEveryTurn = totalTokens. There is no per-line gating on CLAUDE.md, nor on AGENTS.md, .cursorrules, or .grokrules. If a rule lives in any of those files, it fires unconditionally for the life of the session.

Then why do my CLAUDE.md rules sometimes get ignored?

Firing is not the same as following. Lines over 28 words consistently get treated as one signal by the model and the second half is ignored. Vague terms (appropriate, properly, carefully) have no testable success condition so the model cannot tell when it succeeded. Absolute words (always, never) without an escape clause get treated as aspirational. The rule fires, the model reads it, and then it skips the parts it cannot resolve. The bytes still get billed.
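
Those three failure modes are mechanical enough to check by hand or with a few lines of code. The sketch below is hypothetical and is not the analyzer's implementation: the 28-word threshold and the word lists come from the paragraph above, while treating "unless", "except", or a stated "Why:" as grounding for an absolute is an assumption.

```typescript
// Word lists and threshold from the paragraph above; the grounding heuristic is assumed.
const VAGUE_TERMS = ["appropriate", "properly", "carefully"];
const ABSOLUTE_WORDS = /\b(always|never)\b/i;
const GROUNDING_CLAUSE = /\b(unless|except|why:)/i;
const MAX_WORDS = 28;

type RuleFinding = "too_long" | "vague" | "aspirational";

// Returns the failure modes a single rule line is likely to hit.
function lintRule(line: string): RuleFinding[] {
  const findings: RuleFinding[] = [];
  const words = line.trim().split(/\s+/).filter(Boolean);
  if (words.length > MAX_WORDS) findings.push("too_long");
  const lower = line.toLowerCase();
  if (VAGUE_TERMS.some((term) => lower.includes(term))) findings.push("vague");
  // An absolute with no escape clause or stated reason reads as aspirational.
  if (ABSOLUTE_WORDS.test(line) && !GROUNDING_CLAUSE.test(line)) findings.push("aspirational");
  return findings;
}
```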

Is there a way to make a rule NOT fire every turn?

Yes, but not by editing CLAUDE.md. You move the rule to a different surface. Skill .md files only enter the context when the skill name or description matches the request. Hooks only fire on harness events (PreToolUse, PostToolUse, UserPromptSubmit). MCP tool descriptions only enter the context when the model picks that tool. If a rule's job is 'when the model is about to write SQL, do X', it should be a PreToolUse hook on Bash and Edit, not a CLAUDE.md line.

What is prompt caching and does it help?

It helps, when it hits. Anthropic caches the prefix of long system prompts and bills a cache hit at roughly a tenth of a fresh read. If your CLAUDE.md does not change between sessions, the second and later turns of any given session hit the cache. The killer is one ISO date or 'this session' string near the top of the file. The analyzer flags this as a cache_bust finding (high severity). On a 6,000-token CLAUDE.md the difference in input cost between cached and cache-busted across a 30-turn session is roughly 10x.

How do I see which lines are firing per turn?

Paste your file into the textarea on ccmd.dev. The analyzer reports totalTokens, estimatedTokensFireEveryTurn (always equal to totalTokens on the free tier), and a line-by-line list of findings with token savings estimates. It runs entirely in your browser, no upload. The paid tier hooks into your Claude Code session logs and reports which rules actually came up in the model's reasoning across the last 100 sessions; that tells you which of the lines are firing AND being followed vs. firing AND being ignored.

Does the same thing apply to AGENTS.md and .cursorrules?

The firing model is identical. AGENTS.md (Codex), .cursorrules (Cursor), and .grokrules (Grok Build) all get injected verbatim into the system prompt. The same analyzer runs the same seven checks against all four formats; detection is by content, not filename. The conditional sibling surfaces differ per host: Codex has its own skill format, Cursor has rule files plus .cursorignore, Grok Build has .grokrules plus tool gating. The principle is the same: rules that only apply sometimes belong on a sometimes-firing surface.

If I move 30 lines out of CLAUDE.md into a skill, what changes?

Cold-read input tokens drop by whatever those 30 lines weigh (typically 200-500 tokens). The skill .md is not loaded into the system prompt at session start; it only enters the context when the model decides to invoke the skill, which happens on a single turn, not all of them. On a 30-turn session where the skill is invoked once, that is a 30x reduction in firing for those rules. The cost recovery is real but secondary; the bigger win is that the rules stop competing with everything else in the system prompt for the model's attention.

Are nested CLAUDE.md files (per-directory) included every turn too?

Only the ones that are in the working tree of the current session. Claude Code walks up from the cwd and concatenates every CLAUDE.md it finds, plus the user's global ~/.claude/CLAUDE.md. All of those fire every turn for the session. Switching to a different subdirectory does not reload the file list mid-session, so a nested CLAUDE.md you forgot about can sit in the system prompt for hours.
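
To see which files are candidates for a given working directory, the upward walk can be sketched as a pure function over POSIX-style absolute paths. This is an illustration of the description above, not Claude Code's actual lookup code; it lists candidate paths only and does not check whether each file exists.

```typescript
// Sketch of the upward walk on POSIX-style absolute paths.
// Assumption: the real lookup order and stopping point may differ.
function candidateClaudeMdPaths(cwd: string, homeDir: string): string[] {
  const segments = cwd.split("/").filter(Boolean);
  const paths: string[] = [];
  // Walk from cwd up to the filesystem root, one directory at a time.
  for (let depth = segments.length; depth >= 0; depth--) {
    const dir = "/" + segments.slice(0, depth).join("/");
    paths.push(dir === "/" ? "/CLAUDE.md" : `${dir}/CLAUDE.md`);
  }
  // Plus the user's global file.
  paths.push(`${homeDir}/.claude/CLAUDE.md`);
  return paths;
}

console.log(candidateClaudeMdPaths("/repo/apps/web", "/home/me"));
```

Every path in that list that exists adds its full token count to the every-turn total, which is why a forgotten file two directories up is worth looking for.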