CLAUDE.md vs hooks vs skills: the token-cost asymmetry nobody graphs

Matthew Diakonov, Written with AI

Published May 18, 2026 · 6 min read

A 400-line hook script can cost zero context tokens. A one-line CLAUDE.md sentence pays its full context cost on every turn. That is the asymmetry. Once you see it, the question "where does this rule belong?" stops being aesthetic and becomes arithmetic.

This page covers: the three billing models side by side, the math on a real 30-turn session, the one piece of analyzer code that proves the CLAUDE.md claim, and a triage rule for which surface a given rule belongs on.

1. Three surfaces, three bills

Every guide I have read on this topic groups skills, hooks, and MCP tool descriptions together as "conditional" surfaces, in opposition to "always-on" CLAUDE.md. That framing flattens a real difference. Skills bill on two events. Hooks bill on a third. The cost curves are not the same shape.

Feature	CLAUDE.md and skills	Hooks (settings.json)
Billing event	CLAUDE.md: every API call. Skills: every API call for the listing; first matching turn onward for the body.	Each matching tool event (PreToolUse, PostToolUse, UserPromptSubmit, etc.).
Token cost to define	CLAUDE.md: the byte count of the file. Skills: 75–150 tokens per skill in the listing, plus body bytes when invoked.	Zero. The settings.json entry and the hook script never enter the model's context.
Token cost when the rule never applies	CLAUDE.md: full file every turn. Skills: listing every turn, body zero.	Zero, unless the hook is configured to emit unconditionally.
Persistence after first fire	CLAUDE.md: always present. Skill body: persists in context for the rest of the session.	Each fire is a discrete tool-result block. Subject to autocompact, not the prefix cache.
Prefix-cache eligible	Yes for CLAUDE.md and skill listing (if the prefix isn't busted). Skill body caches from invocation onward.	No (lives in mid-session tool-result content).
Best fit	Identity, stack, hard prohibitions (CLAUDE.md). Workflow recipes (skills).	Pre-flight gates on a specific tool: 'when about to do X, check Y'.
Worst fit	Long workflow procedures in CLAUDE.md; small inline reminders in skill bodies.	Stylistic guidance ('prefer X over Y') with no tool event to attach to.

The row that catches most people is row two: hooks cost zero tokens to define. A 400-line shell script that does sophisticated regex matching, git inspection, or a network call to a security service contributes nothing to the model's context until and unless it writes to stdout. A one-line CLAUDE.md sentence saying "check before running destructive SQL" bills its bytes on every turn whether the model is writing SQL or renaming a CSS variable.

2. The CLAUDE.md leg, from the analyzer source

Our analyzer is one client-side TypeScript file. It runs in your browser when you paste a file into ccmd.dev. Open DevTools, watch the network tab, paste a file: no POST. The line that defines the CLAUDE.md billing model is line 264 of src/lib/analyzer.ts, which sets estimatedTokensFireEveryTurn = totalTokens.

Unconditional. Every byte. Every turn. The same identifier is used for AGENTS.md, .cursorrules, and .grokrules, because all four files get concatenated verbatim into the system prompt at session start. That is the always-on surface. Now the question is: what does the same analyzer report for the other two surfaces?
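
To make the always-on model concrete, here is a minimal sketch of what that one assignment implies over a session. It is not the analyzer source: only the names totalTokens and estimatedTokensFireEveryTurn come from this page, while the AlwaysOnReport type and the two helper functions are hypothetical.

```typescript
// Hypothetical sketch of the always-on billing model; not the real analyzer code.
interface AlwaysOnReport {
  totalTokens: number;
  estimatedTokensFireEveryTurn: number;
}

// For CLAUDE.md and its siblings there is no gating:
// every token in the file fires on every turn.
function alwaysOnReport(totalTokens: number): AlwaysOnReport {
  return { totalTokens, estimatedTokensFireEveryTurn: totalTokens };
}

// Uncached input tokens the file contributes across a whole session.
function sessionInputTokens(report: AlwaysOnReport, turns: number): number {
  return report.estimatedTokensFireEveryTurn * turns;
}

const report = alwaysOnReport(6000);
const thirtyTurnBill = sessionInputTokens(report, 30); // 6,000 x 30 = 180,000
```

The point of the sketch is that nothing about the file's content appears in the function: the bill depends only on size and turn count.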

“every single API call to Claude sends the whole context, including prompts, meaning that all this extra text in CLAUDE.md is sent over and over”

caymanjim, Hacker News thread 47581701

3. The hook leg: zero by default

Hooks live in ~/.claude/settings.json (user-scoped) or .claude/settings.json in the repo. They register a shell command to fire on a specific tool event. The registration block is read by the Claude Code harness, not by the model. It never enters the API request.

The model 'sees' the hook only via its stdout, and only on the turn where the matching tool event fires. A hook that returns nothing on allow contributes zero input tokens. A hook that returns a 30-token JSON block when it blocks contributes 30 tokens, once, on the blocking turn. Compare that with the same rule expressed as a CLAUDE.md line ("never run destructive Bash without --dry-run", ~12 tokens with the surrounding context): 12 tokens × every turn of the session. Over a 30-turn session, the CLAUDE.md form costs 360 input tokens; the hook form costs 0 to 30 (nothing if it never blocks, 30 if it blocks once). The hook form also stops the call outright rather than relying on the model to read the rule and comply.
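
A sketch of a silent-on-allow, short-on-block hook, written in TypeScript to match the analyzer. Treat the details as assumptions: the registration shape and the decision/reason output fields reflect my reading of the Claude Code hooks documentation and should be checked against it, and the script path, the regex, and the function name are hypothetical.

```typescript
// Sketch only. Registration shape and output field names are assumptions
// to verify against the Claude Code hooks reference.

// What a settings.json entry might look like, written as a TS object.
// The harness reads this; the model never does.
const hookRegistration = {
  hooks: {
    PreToolUse: [
      {
        matcher: "Bash",
        hooks: [{ type: "command", command: "node ./hooks/sql-guard.js" }], // hypothetical path
      },
    ],
  },
};

// Hypothetical pattern for "destructive SQL"; tune it for your own stack.
const DESTRUCTIVE_SQL = /\b(drop\s+table|truncate\s+table|delete\s+from)\b/i;

// Returns the text the hook would write to stdout:
// an empty string on allow (zero tokens), a short JSON reason on block.
function hookStdout(bashCommand: string): string {
  const destructive = DESTRUCTIVE_SQL.test(bashCommand);
  const dryRun = bashCommand.includes("--dry-run");
  if (!destructive || dryRun) return "";
  return JSON.stringify({
    decision: "block", // assumed field names
    reason: "Destructive SQL without --dry-run. Re-run with --dry-run first.",
  });
}

const onAllow = hookStdout("npm test"); // empty string
const onBlock = hookStdout('psql -c "DROP TABLE users"'); // short JSON block
```

A real hook script would also read the tool event from its input and print the result of hookStdout; that wiring is left out here. However long the matching logic grows, only the returned string can ever reach the model.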

4. What the math looks like on a real session

A 30-turn session with a 6,042-token CLAUDE.md, 12 installed skills averaging about 143 tokens each in the listing (1,720 tokens total), and a single PreToolUse hook on Bash. Two scenarios: in A, the skills never fire and the hook stays silent; in B, one skill with a 2,400-token body fires on turn 8 and the hook blocks 14 calls with a short reason of roughly 30 tokens. Same files, different bills.

The asymmetry: in scenario A, the skill bodies and the hooks contribute zero. The whole 232,860-token bill is CLAUDE.md (181,260) and the skill listing (51,600). In scenario B, the same CLAUDE.md and the same listing produce the same 232,860, and the invoked skill body adds 55,200 on top while the hook adds a rounding-error 420. Decide where to put a rule based on the bill it ends up on, not on a generic "move it to a skill" reflex.
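
The arithmetic behind both scenarios can be reproduced with a short sketch. The inputs are the ones used on this page (30 turns, a 1,720-token listing, a 2,400-token skill body, about 30 tokens per hook block); the Session and Bill types and the bill function are hypothetical names, and hook output is counted once per fire, following this page's model.

```typescript
// Sketch of the three-surface bill; type and function names are hypothetical.
interface Session {
  turns: number;
  claudeMdTokens: number;          // billed every turn
  skillListingTokens: number;      // billed every turn
  skillBodyTokens: number;         // billed from the invocation turn onward
  skillFiresOnTurn: number | null; // 1-based turn; null if no skill fires
  hookBlocks: number;              // tool calls the hook blocks
  hookTokensPerBlock: number;      // stdout emitted per block
}

interface Bill {
  claudeMd: number;
  listing: number;
  skillBody: number;
  hook: number;
  total: number;
}

function bill(s: Session): Bill {
  const claudeMd = s.claudeMdTokens * s.turns;
  const listing = s.skillListingTokens * s.turns;
  // The body persists from the firing turn through the last turn, inclusive.
  const turnsWithBody = s.skillFiresOnTurn === null ? 0 : s.turns - s.skillFiresOnTurn + 1;
  const skillBody = s.skillBodyTokens * turnsWithBody;
  // Hook output is counted once per blocking fire.
  const hook = s.hookBlocks * s.hookTokensPerBlock;
  return { claudeMd, listing, skillBody, hook, total: claudeMd + listing + skillBody + hook };
}

const base: Session = {
  turns: 30,
  claudeMdTokens: 6042,
  skillListingTokens: 1720,
  skillBodyTokens: 2400,
  skillFiresOnTurn: null,
  hookBlocks: 0,
  hookTokensPerBlock: 30,
};

const scenarioA = bill(base);                                            // total 232,860
const scenarioB = bill({ ...base, skillFiresOnTurn: 8, hookBlocks: 14 }); // total 288,480
```

Changing skillFiresOnTurn shows how quickly one invoked body outgrows every hook fire in the session combined.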

5. Triage: pick the bill that matches the rule

Open your CLAUDE.md and walk it block by block. For each block, ask two questions: how often does this rule actually apply, and is there a specific tool event that gates it? Those two answers pick the surface.

Applies on every turn (identity, stack, hard prohibitions) → CLAUDE.md

10-20 lines of 'Next.js 16, Postgres on Neon, deploys to Vercel, never commit secrets'. The model needs them to interpret any request. Pay the always-on bill. This is the one block where CLAUDE.md is the cheapest answer.

Applies on a specific tool event (gate, check, format) → hook

Rules of the shape 'when the model is about to use Bash / Edit / Write, do X'. PreToolUse on Bash for SQL safety. PostToolUse on Edit to run a formatter. UserPromptSubmit to inject session context. The settings.json entry is zero tokens. The shell script is zero tokens. You pay only the stdout the hook actually emits.

Applies when a specific kind of work is happening (workflow recipe) → skill

A 200-token procedure for 'how we add a new API route' or 'how we file a bug'. As a skill, it costs ~143 tokens in the listing every turn, and the full procedure body only from the turn where the model first invokes it onward. As a CLAUDE.md block, it costs the full procedure on every turn. Skills win when the procedure is large and applies rarely.

Applies vaguely ('prefer X over Y', 'be concise') → CLAUDE.md, short, with a Why

These have no tool event to attach to, so hooks are out. They are usually too small and too cross-cutting to be a skill. Keep them on CLAUDE.md, under 28 words per line, with a 'Why: <past incident>' tail. The Why turns the line from aspirational into testable.

Applies never (dates, 'today', 'this session' strings) → delete

These bust the prefix cache and re-bill the entire CLAUDE.md and the skill listing at full input cost on every turn. The analyzer flags them as 'cache_bust' findings. If you must keep a date, put it past the cache breakpoint at the bottom of the file or rotate it out entirely.
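
The triage above can be written down as a small decision function. This is a sketch under assumptions: the RuleTraits fields and the order of the checks are my own encoding of the five cases, not something the analyzer exposes.

```typescript
// Hypothetical encoding of the triage rule; names and check order are assumptions.
type Surface = "CLAUDE.md" | "hook" | "skill" | "delete";

interface RuleTraits {
  volatile: boolean;         // contains dates, "today", "this session"
  gatedByToolEvent: boolean; // "when about to use Bash / Edit / Write, do X"
  workflowRecipe: boolean;   // multi-step procedure for one kind of work
}

function triage(rule: RuleTraits): Surface {
  if (rule.volatile) return "delete";      // busts the prefix cache
  if (rule.gatedByToolEvent) return "hook"; // pay only for emitted stdout
  if (rule.workflowRecipe) return "skill";  // listing every turn, body after invocation
  return "CLAUDE.md";                       // identity, stack, prohibitions, vague preferences
}

const sqlGate = triage({ volatile: false, gatedByToolEvent: true, workflowRecipe: false });
const newRoute = triage({ volatile: false, gatedByToolEvent: false, workflowRecipe: true });
const stackLine = triage({ volatile: false, gatedByToolEvent: false, workflowRecipe: false });
const dateLine = triage({ volatile: true, gatedByToolEvent: false, workflowRecipe: false });
```

Everything that falls through to CLAUDE.md still has to earn its place there: short, and with a Why.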

6. Second-order effect: attention budget, not just tokens

The dollar cost is the easy part to argue about. The harder part is that the model has finite attention across the system prompt. A 6,000-token CLAUDE.md plus a 2,000-token skill listing is 8,000 tokens of "consider this on every reasoning step" that the model has to weigh against the user's actual request. Hooks bypass this entirely: they enforce in the harness, not in the model. A PreToolUse hook that blocks a destructive Bash command does not need the model to remember the rule or comply with it; the rule never has to enter Claude's context in the first place.

That is why the asymmetry matters beyond accounting. Every rule moved from CLAUDE.md to a hook is one less rule the model is juggling. Every rule moved from CLAUDE.md to a skill is one less rule competing for attention on the turns where it does not apply. The token bill is the visible signal. The invisible win is the attention you free up on every other turn.

Frequently asked questions

Do hooks count toward the model's context window?

Not the hook definition. The settings.json block that registers a hook never enters the model's context. The hook script itself runs in your shell, not in Claude. What enters Claude's context is the JSON the hook writes to stdout when a matching tool event fires. A hook that emits 30 bytes on every Bash call adds 30 bytes × N Bash calls to that session's input tokens. A hook that emits nothing on allow and a one-line reason on block adds tokens only on the blocked calls. The 400-line script behind the hook is invisible to the model.

Do skills count on every turn even if they never fire?

Partly. Every installed skill puts its name and description into a per-session listing that ships in the system prompt on every turn. Anthropic's docs put that at roughly 75–150 tokens per skill with the XML wrapper, and the default skill-listing budget is skillListingBudgetFraction = 0.01, about 2,000 tokens on a 200K Sonnet or Opus context. The SKILL.md body, the actual rules, is zero tokens until the model invokes the skill. After invocation it persists in context for the rest of that session.
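
As a rough check on those numbers, the sketch below computes the listing budget and how many skills fit inside it at each end of the 75–150 token range. The function names are hypothetical, and what the harness does once the listing exceeds its budget is not covered here.

```typescript
// Sketch of the listing-budget arithmetic; helper names are hypothetical.
const SKILL_LISTING_BUDGET_FRACTION = 0.01; // default quoted above

function listingBudgetTokens(
  contextWindow: number,
  fraction: number = SKILL_LISTING_BUDGET_FRACTION,
): number {
  // Round to avoid floating-point noise from the multiplication.
  return Math.round(contextWindow * fraction);
}

function skillsThatFit(budgetTokens: number, tokensPerSkill: number): number {
  return Math.floor(budgetTokens / tokensPerSkill);
}

const budget = listingBudgetTokens(200000);    // 2,000 tokens
const fitAtHeavyEnd = skillsThatFit(budget, 150); // 13 skills
const fitAtLightEnd = skillsThatFit(budget, 75);  // 26 skills
```

So the default budget holds somewhere between 13 and 26 skills, and a full listing bills about 2,000 tokens on every turn before any skill body loads.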

Then what does CLAUDE.md cost per turn?

All of it, every turn, no gating. Our analyzer codifies it: src/lib/analyzer.ts line 264 sets estimatedTokensFireEveryTurn = totalTokens. A 6,000-token CLAUDE.md adds 6,000 input tokens to every single API call for the life of the session. A 30-turn session is 180,000 input tokens of CLAUDE.md alone before any code is read.

Which surface is cheapest for a rule that almost never applies?

Hooks, by a wide margin, for the right shape of rule. A rule like 'never run a destructive SQL command without --dry-run first' belongs on a PreToolUse hook against Bash and Edit. The settings.json entry is invisible to the model. The script does the regex match locally. On 'allow' it emits nothing. The model pays only when the hook actually blocks. Compare that to a CLAUDE.md sentence saying the same thing: it bills its full byte count on every turn, including the turns where the model is formatting a JSON response and not writing SQL at all.

Why does a skill's body persist in context after invocation?

Because the model reads it once during the invocation turn and then keeps it in the rolling context for the remainder of the session. If a skill body is 2,400 tokens and the skill fires on turn 8 of a 30-turn session, you pay 23 × 2,400 = 55,200 input tokens for that body across the remaining turns. That is the trap: a 'rarely fires' skill that fires once still bills its body 20+ times. The defense is to keep SKILL.md bodies short, the same way you would keep a CLAUDE.md short.
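
The persistence arithmetic also tells you when a skill is the wrong home for a rule. The sketch below compares keeping a procedure as a skill against keeping the same text in CLAUDE.md; the function names are hypothetical, and the 143-token listing entry is the per-skill figure used elsewhere on this page.

```typescript
// Sketch comparing the two homes for one procedure; names are hypothetical.

// As a skill: listing entry every turn, body from the firing turn to the end.
function skillCost(
  listingEntryTokens: number,
  bodyTokens: number,
  turns: number,
  firesOnTurn: number | null, // 1-based; null if it never fires
): number {
  const turnsWithBody = firesOnTurn === null ? 0 : turns - firesOnTurn + 1;
  return listingEntryTokens * turns + bodyTokens * turnsWithBody;
}

// As a CLAUDE.md block: the full text on every turn.
function claudeMdCost(bodyTokens: number, turns: number): number {
  return bodyTokens * turns;
}

const bigAsSkill = skillCost(143, 2400, 30, 8);  // 4,290 + 55,200 = 59,490
const bigInline = claudeMdCost(2400, 30);        // 72,000
const smallAsSkill = skillCost(143, 200, 30, 8); // 4,290 + 4,600 = 8,890
const smallInline = claudeMdCost(200, 30);       // 6,000
```

With these inputs the 2,400-token recipe is cheaper as a skill even after firing on turn 8, while the 200-token one is cheaper inline, because the listing entry plus 23 turns of body outweighs 30 turns of the short text.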

Are these three surfaces also cached differently?

CLAUDE.md sits in the system prompt prefix and is cache-eligible if you don't bust the prefix with a date or 'today' string. The skill listing sits with it and caches the same way. Skill bodies enter mid-session, so they cache from invocation onward but not before. Hook stdout enters as tool-result content and is not part of the long-lived prefix cache; it gets compressed by autocompact rules instead. On a 30-turn session with prefix cache hits, the CLAUDE.md and skill-listing input costs drop by roughly 10x. Hook stdout cost does not.
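
A sketch of what the cache does to the always-on bill, under a simplifying assumption: cached prefix tokens count at one tenth of normal input, matching the rough 10x figure above, and cache-write surcharges and cache expiry are ignored. The function name and the 0.1 default are mine, not the analyzer's.

```typescript
// Assumed model: every turn's prefix bills at cachedRate when the cache hits,
// and at full price when a volatile string busts the prefix.
function effectivePrefixTokens(
  prefixTokens: number,
  turns: number,
  cacheHit: boolean,
  cachedRate: number = 0.1, // assumption reflecting "roughly 10x"
): number {
  const full = prefixTokens * turns;
  return cacheHit ? Math.round(full * cachedRate) : full;
}

const prefix = 6042 + 1720; // CLAUDE.md plus skill listing from section 4

const withCache = effectivePrefixTokens(prefix, 30, true);  // 23,286 token-equivalents
const bustedCache = effectivePrefixTokens(prefix, 30, false); // 232,860
```

Hook stdout gets no such discount in this model, which is another reason to keep block reasons short.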

What is the right mental model for picking a surface?

Think of it as three different bills. CLAUDE.md bills bytes × turns. Skills bill listing-bytes × turns plus body-bytes × turns-after-invocation. Hooks bill emitted-stdout-bytes × matching-tool-events. Pick the bill that maps to how often the rule actually applies. Identity and stack apply on every turn, so CLAUDE.md is correct. Workflow recipes that apply when a specific kind of work is being done belong on skills. Pre-flight gates that apply when a specific tool is about to fire belong on hooks.

Can I move a CLAUDE.md rule directly into a hook?

Sometimes. If the rule has the shape 'when the model is about to do X, check Y, possibly stop', it maps cleanly to a PreToolUse hook. The check runs in your shell with full access to the file system, git, and any binary you have installed. Rules of the shape 'in general, prefer X over Y' do not map to hooks because there is no tool event to attach them to. Those belong on CLAUDE.md or a skill, depending on how often they apply.

Does the same asymmetry exist in Codex (AGENTS.md), Cursor, or Grok Build?

The CLAUDE.md side does: AGENTS.md, .cursorrules, and .grokrules all ship verbatim into the system prompt on every turn, the same way. The conditional sibling surfaces differ. Codex has its own skills concept but no hooks. Cursor has rule files and .cursorignore but no hook-equivalent shell trigger. Grok Build has .grokrules plus tool gating. The principle holds in all four hosts: rules that apply unconditionally belong on the always-on surface, rules that apply on a specific event belong on the surface that gates by that event. The hook surface is currently the deepest in Claude Code.

How does ccmd score these three surfaces?

Paste any of the four agent-config files (CLAUDE.md, AGENTS.md, .cursorrules, .grokrules) into the textarea on ccmd.dev. The analyzer is one TypeScript file: src/lib/analyzer.ts. It reports totalTokens, estimatedTokensFireEveryTurn (always equal to totalTokens for CLAUDE.md and siblings), and a line-by-line list of findings: bloat (over 28 words), vague, aspirational, missing_why, cache_bust, duplicate. The paid tier monitors skill-listing growth, hook-stdout volume, and CLAUDE.md drift over time so you can catch which surface is silently growing.
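
To give a feel for what line-level findings look like, here is a toy linter for three of the finding kinds. It is not the ccmd analyzer: the detection rules, regexes, and names are assumptions made for illustration, and only the finding labels and the 28-word threshold come from this page.

```typescript
// Toy linter; detection logic is assumed, not taken from the real analyzer.
type FindingKind = "bloat" | "cache_bust" | "missing_why";

interface Finding {
  line: number; // 1-based line number
  kind: FindingKind;
}

const MAX_WORDS = 28;
// Hypothetical volatility check: "today", "this session", or an ISO date.
const VOLATILE = /\b(today|this session)\b|\b\d{4}-\d{2}-\d{2}\b/i;

function lint(text: string): Finding[] {
  const findings: Finding[] = [];
  text.split("\n").forEach((raw, i) => {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) return; // skip blanks and headings
    const wordCount = line.split(/\s+/).length;
    if (wordCount > MAX_WORDS) findings.push({ line: i + 1, kind: "bloat" });
    if (VOLATILE.test(line)) findings.push({ line: i + 1, kind: "cache_bust" });
    if (!/\bwhy:/i.test(line)) findings.push({ line: i + 1, kind: "missing_why" });
  });
  return findings;
}

const sample = [
  "# Rules",
  "- Never commit secrets. Why: leaked key in a past incident.",
  "- Today is 2026-05-18, prefer concise answers.",
].join("\n");

const findings = lint(sample); // two findings, both on line 3
```

The second line passes because it is short, stable, and carries a Why; the third is flagged twice because it is volatile and unexplained.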