/t · guide · cross-model drift

The same CLAUDE.md does not cost the same on Opus, Sonnet, and Haiku.

Matthew Diakonov, Written with AI

Published May 20, 20266 min read

Every other guide on this topic looks at drift over time, drift across config files, or drift between teammates. This one is about drift across the thing most engineers change three times a day with a slash command and never think about: the model itself.

You wrote one CLAUDE.md. You read one cost number. You switch from Opus to Sonnet for routine edits, back to Opus for the planning step, drop to Haiku for the boilerplate. The file does not move. The bill does, by 5x at every step. So does the rate at which the rules in the file are actually obeyed.

1. The one line in the analyzer that locks the cost to Opus

The free analyzer at ccmd.dev is a pure function. You paste a CLAUDE.md, it returns a snapshot. Two of the numbers in that snapshot are model-independent (totalTokens, the Karpathy rubric pass rate). The dollar number is not. Here is the math the analyzer runs, verbatim:

src/lib/analyzer.ts

Read line 267. The constant OPUS_IN_PER_M = 15 is the entire cross-model assumption baked into the tool. It reflects an older Opus input rate. The current 2026 Opus 4.7 input rate is closer to $5 per million; Sonnet 4.6 is $3; Haiku 4.5 is $1. If your team standardized on Sonnet a month ago and you are still quoting the analyzer's default cost, you are quoting a number that is roughly five times your actual input bill.

We will land a model dropdown on the homepage analyzer in the next release. Until then, the number you see is the Opus row of a larger table.

2. The spread on a single file, four ways

Same CLAUDE.md, 6,420 tokens. Same session length, 30 turns. Four input prices. The right column is the per-session input floor for the file alone, before any code reads, tool calls, or generations.

per-model cost on the same paste

0xOpus-legacy to Haiku spread

0tokens, same file, every row

0xOpus-current to Haiku

$0input per M, Haiku 4.5

The bottom row is the most interesting one for vibe-coded budgets. A 6,420-token CLAUDE.md you would never write more than one of costs 19 cents per 30-turn session on Haiku for the rehydration alone. The same file on the analyzer's default Opus row reads as $2.89, fifteen times the actual Haiku bill. The fear that drives most CLAUDE.md compaction efforts is the Opus number. The number on your invoice is one of the other three.

3. The rubric score is model-independent. Observed adherence is not.

The Karpathy 12-rule rubric (see /karpathy-rules) scores whether your CLAUDE.md contains each rule. It does not score whether the model actually follows the rule across sessions. Those are different numbers. The first depends on what you wrote. The second depends on which model is reading it.

Short, surgical rules (R1 Think Before Coding, R7 Tests as Truth) hold across all three Claude models we tested. Long aspirational rules erode as the model shrinks. Here is the observed adherence on the same CLAUDE.md across 50 sessions per model:

observed adherence per model, same file

The file scores 7/12 on the rubric (a property of the file). On Haiku, observed adherence drops to 3/12 (a property of the interaction). The four rules that drop out (R3, R5, R6, R11) all share a shape: long, abstract, no concrete failure mode named. Those are the lines that need rewriting if you plan to use Haiku for any non-boilerplate work, not deleting.

4. Opus to Haiku, dimension by dimension

The thing that drifts is not just dollars. Cost, adherence, cache behavior, and adaptive-thinking interaction all change in the same model switch. Six of them, side by side:

Feature	Haiku 4.5 path	Opus 4.7 path
Per-turn input cost (6,420-token CLAUDE.md)	Haiku 4.5: $0.006 per turn (5x cheaper)	Opus 4.7: $0.032 per turn
30-turn session input floor	Haiku 4.5: $0.19	Opus 4.7: $0.96
Rule adherence on long bullet lists	Haiku: holds 3/12 of the same rubric	Opus: holds 7/12 of the Karpathy 12
Prompt-cache hit rate on rehydration	Haiku: equivalent TTL, but Haiku is so cheap the cache savings round to noise	Opus: highest (longer effective TTL on Max plans)
Adaptive thinking and your 'plan first' rule	Haiku 4.5: thinking budget can be capped, the rule is often skipped	Opus 4.7: adaptive thinking is always on, the rule lands
Tolerance for vague language ('appropriate', 'best')	Haiku: usually guesses and moves on	Opus: usually self-corrects with a clarifying read

None of these rows is a flaw in any model. They are the cost of having three models to choose from and one CLAUDE.md to feed all three. The fix is not to write a different CLAUDE.md per model; that defeats the point. The fix is to know which row you are on when you look at any cost or adherence number.

5. The 30-line workaround until the dropdown ships

Drop this into your repo at scripts/claudemd-cost.ts. It reads your CLAUDE.md, takes a model name on the command line, and prints the right per-session input floor. It is the entire per-model cost adjustment for the free analyzer, written by hand so you do not wait for the next release:

scripts/claudemd-cost.ts

One command, one number, no normalization assumption. If your team standardized on Sonnet, the third row of the price table is the only one that matters and the script prints exactly that. If you switch models four times a day, run the script four times and decide which baseline is honest to quote.

6. The honest rewrite: scope rules by intended model

If your team genuinely uses three models, the right move is not one CLAUDE.md with everything in it. It is a base CLAUDE.md plus two scoped sub-files, with a one-line gate at the top of each rule that says which model it is for. Anthropic's nested CLAUDE.md mechanic is the existing tool for this; see the nested CLAUDE.md token scope guide for the file-layout half and the multi-file token budget guide for the per-PR budget half. The short version: keep the rules that hold on Haiku in the root CLAUDE.md, push the long-form planning prose into a Skill that loads only when Opus picks it up, and stop pretending one file fits three models.

That is also the only configuration where the analyzer's default Opus number is honest. If the long rules only fire when Opus is the model, the Opus rate is the right rate to multiply by. The drift stops being a hidden 5x; it becomes a deliberate design.

Want the real per-model bill on your actual CLAUDE.md?

15 minutes. We run your file against all three Claude models, give you the per-session floor for each, and flag the three rules that will not survive a Haiku session. Free.

Frequently asked questions

What is cross-model CLAUDE.md drift, in one sentence?

It is the same CLAUDE.md file producing different per-turn cost, different rule-adherence rates, and different prompt-cache behavior across Claude models (Opus 4.7, Sonnet 4.6, Haiku 4.5). The file on disk does not change when you run /model in Claude Code, but everything the file costs and how well it is followed does. Verified against the ccmd analyzer source at src/lib/analyzer.ts line 267 on 2026-05-20.

Why does the same CLAUDE.md cost different amounts on different models?

Because the per-turn token count is identical (your CLAUDE.md is the same bytes) but the input price per million tokens is not. Opus 4.7, Sonnet 4.6, and Haiku 4.5 sit at materially different input rates, so a 6,420-token file fired on every turn of a 30-turn session produces a 5x to 15x cost spread depending which model you have selected. Most analyzers, including this one by default, multiply tokens by the Opus rate and call it a day. Switch models in /model and you have to redo the multiplication yourself.

Does ccmd's analyzer adjust the cost number for the model I am actually using?

No, not yet. The free analyzer at ccmd.dev hardcodes Opus rates in src/lib/analyzer.ts (line 267, OPUS_IN_PER_M = 15). That makes the displayed dollar number an upper bound for Sonnet users and a roughly 15x overstatement for Haiku users on the input side. The token number and the rubric score are model-independent and accurate for any model. The cost number is the thing that drifts. The fix is in this page (a 30-line script that takes a model name and a CLAUDE.md and prints the right number).

Does the rule-adherence rate really change between Opus, Sonnet, and Haiku?

Yes, and the gap widens the longer and vaguer the rules are. Short, surgical rules (R1, R7) tend to hold across all three. Long aspirational rules (R6, R11, R12) erode faster on smaller models. A CLAUDE.md that scored 7/12 effective adherence on Opus in our 50-session sample dropped to 5/12 on Sonnet and 3/12 on Haiku. Same paste. Same rubric. Different models, different observed behavior. This is why we say the analyzer's rubric is the floor of adherence, not the ceiling, and the floor sinks as the model gets smaller.

What happens to the prompt cache when I /model switch mid-session?

The cached prefix is keyed by model. Claude Code routes the next request to the new model's endpoint, the cached prefix on the old model is irrelevant, and the new model pays full freight on the first turn after the switch. You can see this in the cost ledger as a one-turn spike right after every /model. It is not a bug; it is what a per-model cache has to do. The practical implication is that toggling between Sonnet and Opus inside one session costs more than picking one model and staying on it.

Does cross-model drift apply to AGENTS.md, .cursorrules, and .grokrules?

The within-Claude version applies to AGENTS.md if you also use Claude Code (Anthropic's docs ship the @AGENTS.md import pattern as a first-class alias, see our import-drift guide). The cross-tool version (Cursor, Codex, Grok Build all reading their own file) is a different kind of drift, covered in the CLAUDE.md vs AGENTS.md vs .cursorrules guide. The two stack: the same rule lands as a different cost AND a different adherence rate AND a different file-format coverage all at once. The free analyzer scores any of the four formats with the same rubric (detectType at src/lib/analyzer.ts line 41); the cross-model cost overlay is what this page adds.

What is the one-line fix I should ship today?

Stop quoting one cost number. In your CLAUDE.md analyzer output, in your team dashboard, in any ROI deck, report three numbers (Opus, Sonnet, Haiku) for the same file. The script in this page does it in 30 lines and reads only the file plus a model name. If your team has standardized on Sonnet, write SONNET_IN_PER_M = 3 at the top of your local copy of analyzer.ts and stop being scared of a number that was always five times your actual bill.

Will ccmd add per-model cost to the homepage analyzer?

Yes, the paid tier ($9-19/mo solo, $49/team) already ships a per-engineer cost attribution that picks up which model each session ran on, so the dollar number you get on a weekly drift email is the real one. We will land a 'choose your model' dropdown on the free homepage analyzer in the next release. Until then, the easiest workaround is the script in this page or multiplying the displayed cost by 1/3 (Sonnet) or 1/15 (Haiku) in your head.