A CLAUDE.md template costs you every turn, every session, for as long as you keep it.

Starter templates from generators get quoted as one-time scaffolding. They are not. Every token in that template fires on 100% of turns, on every session, until you delete the line. That makes a 2,000-token starter a recurring annuity, not a setup cost.

Matthew Diakonov, Written with AI

Published May 20, 20265 min

Direct answer (verified 2026-05-20)

A template's per-turn token cost equals its full token count, because the whole file is re-sent in the system prompt on every API call. At Opus 4.7's $5 per million input rate, a 2,000-token template runs about $0.010 per turn uncached and $0.001 per turn on a prompt-cache hit. Over a 30-turn session that is $0.30, and over a year of five sessions a week, roughly $75.

Rates from platform.claude.com/docs/en/about-claude/pricing. Firing model from src/lib/analyzer.ts line 264.

The line that proves the cost is recurring

You can argue about whether a particular template line is useful. You cannot argue about whether it fires. Our analyzer treats per-turn cost as the whole file, because that is what Claude Code does. The number is not a fraction. It is not the headers only. It is every byte.

src/lib/analyzer.ts

The Messages API is stateless. Every turn, the harness rebuilds the whole context and prepends the system prompt, and your CLAUDE.md (project file, any nested files, the global ~/.claude/CLAUDE.md) sits inside that prefix. There is no mechanism that sends half of it. That is what makes the template cost recurring and predictable.

What a starter actually costs by year one

Tokens are the second column. The fourth column assumes five sessions a week of 30 turns each, 50 weeks of work, uncached input at Opus 4.7's $5 per million. Cached, divide by 10. These are typical sizes for templates from generator sites and common starter patterns we have measured.

Template shape	Tokens	$ / turn	$ / year (uncached)
Tiny .cursorrules (one paragraph)	50-100	$0.0003	$2
Anthropic-style minimal starter	200-400	$0.0015	$11
Generator output, single stack	800-1,500	$0.0058	$43
Generator output + Always/Never blocks	2,000-3,500	$0.0138	$103
Comprehensive starter (per-file rules)	4,000-6,000	$0.025	$188
Bloated (left untouched for months)	8,000-12,000	$0.050	$375

Per-turn figures use the middle of each token range. Per-year figures assume uncached input; if your template stays byte-identical session to session and you stay inside the cache window, the real number is roughly 1/10th of what is shown.

Measured: typical templates run through the analyzer

These are the bundled samples in scripts/score.ts, shaped like the kind of starter a generator emits or a solo engineer writes in one sitting. The token counts come from Math.ceil(chars / 4); the findings are real outputs from the analyzer.

bash

Token counts here are small because the samples are deliberately minimal. The point is the findings column: even a 56-token .cursorrules ships six flagged lines (three aspirational, two vague, one missing-why). Templates from form-based generators tend to amplify those same patterns, so the line count and the findings grow together.

The math, copy-pasteable

One formula, three time horizons. Drop in your template's token count and read off the rest.

template-cost.py

What a template loses when you cut the scaffolding

The left side is a representative starter, the kind a generator emits. The right side is what is left after deleting the role-prompting prelude, the aspirational "Always" block, and the marketing-paragraph stack description, then adding one line of real failure-mode context. Tokens drop from about 480 to about 120: a 4x reduction in per-turn cost.

Same project, two templates

# Project: payments-api You are an expert TypeScript engineer. Always write clean, well-structured code with good naming and handle edge cases appropriately. ## Always - Always write clean, well-structured code. - Always handle errors properly. - Always add comprehensive tests for everything. - Always think carefully before making changes. - Always use modern best practices. ## Never - Never use `any`. - Never commit secrets. ## Stack We use Next.js, TypeScript, Tailwind, Postgres on Neon, Stripe for payments, Resend for email, PostHog for analytics, deployed to Vercel from main with preview deploys on every PR. ## Style Follow our existing style. Look at neighbouring files. Be consistent.

Role prompt the model already knows
Five "Always" rules with no exceptions
Stack as a paragraph not a list
No "why" anywhere

When the template is worth its tokens, and when it isn't

The cost is real, but cost without context is not the same as waste. The test that holds up over time: if a section disappeared from your template tomorrow, would Claude's output get worse in a specific, identifiable way? If yes, it is worth its tokens. If you cannot point at the failure mode that section prevents, it is scaffolding and it is costing you.

A "we got burned by destructive migrations on Sept 12 2025" paragraph at 80 tokens per turn that prevents one outage is the best per-token spend you will ever make. A "You are an expert TypeScript engineer who values clean, maintainable code" line at 22 tokens per turn that changes nothing about the model's output is pure overhead, billed every turn, every session, until you remove it.

How to measure your own template in 30 seconds

Paste the file into the textarea on ccmd.dev. The analyzer runs in your browser with no upload. The output tells you the total token count, the per-turn cost, and which specific lines the rubric flagged as aspirational, vague, or missing-why. Each finding includes an estimated token-savings number, so you can sort the file by which deletes pay back the most. For the per-year number, multiply per-turn by 30 turns and again by your sessions-per-year.

The same analyzer scores AGENTS.md, .cursorrules, and .grokrules. All four formats fire the same way on their host: full file, every turn, every session.

Want a second pair of eyes on your template?

Bring your CLAUDE.md, AGENTS.md, or .cursorrules and we will walk through what to cut, what to keep, and what to move to a skill in 20 minutes.

Common questions about template token cost

How much does a CLAUDE.md template actually cost in tokens?

Per turn, it costs its full token count, because the entire template is re-sent in the system prompt on every API call. At Opus 4.7's $5 per million input rate, a 2,000-token template runs about $0.010 per turn uncached or $0.001 per turn on a prompt-cache hit. A 6,000-token template runs $0.030 per turn uncached and $0.003 cached. Over a 30-turn session those are $0.30 and $0.90 respectively. To estimate your template's tokens without a tool, divide its character count by 4. Then multiply by the input rate and divide by 1,000,000.

Is the template cost a one-time fee or recurring?

Recurring. The template lives in your CLAUDE.md until you delete the lines, and Claude Code rebuilds the full system prompt on every turn of every session. So a 2,000-token starter from a generator costs you $0.010 per turn for as long as those 2,000 tokens are in the file. Five sessions a week of 30 turns each, 50 weeks a year, comes to $75 a year for that template, before you add anything project-specific. The generator does not quote you this number because it does not know how long you will keep the scaffolding in.

Why does the whole template fire, not just the parts Claude needs?

Because the Messages API is stateless. There is no mechanism inside Claude Code that picks which sections of CLAUDE.md to send on a given turn. The harness reads the file, places it in the system prompt verbatim, and sends the whole thing. Our analyzer encodes exactly this at src/lib/analyzer.ts line 264 with estimatedTokensFireEveryTurn = totalTokens. It is not a fraction. It is not the headers only. It is every byte.

What's the difference between template cost and prompt-cache cost?

If a template stays byte-identical between requests within Anthropic's cache window (5 min by default, or 60 min on the long-cache tier), Claude bills cache reads at 10% of the base input rate. Opus 4.7's input drops from $5 per million to $0.50 per million for the cached portion. The catch: the first request still pays cache-write at 1.25x base rate ($6.25/M on Opus 4.7), and any byte change in the template (a date, a new line, a typo fix) invalidates the cache and forces the next turn to re-pay the full uncached rate. A template with a session-specific timestamp in the first 20 lines is effectively uncached forever.

Are bigger templates worth the tokens?

Sometimes. The cost is real, but so is the benefit when the template encodes real failure modes that would have made you re-explain the same thing on every session. A 'we got burned by X' paragraph that costs 100 tokens per turn but prevents one production incident every six months is cheap. A 'You are an expert TypeScript engineer who writes clean, well-structured code' paragraph that costs 80 tokens per turn and changes nothing about Claude's output is pure overhead. The test is: would you notice if this section disappeared? If no, delete it.

What template patterns cost the most tokens for the least gain?

Three categories the analyzer flags consistently in starter templates. (1) Role-prompting preludes like 'You are an expert <language> engineer' that the underlying model already knows. (2) Aspirational rules with the word 'always' or 'never' but no escape clause and no 'why' line, which Claude ignores at edge cases and the analyzer marks as aspirational and missing_why. (3) Stack descriptions that read like a paragraph of marketing copy when a one-line list of package names would convey the same information at 20% of the token count. These three patterns make up most of the bulk in templates generated by form-based starter sites.

How do I measure my own template's per-turn cost?

Paste it into the analyzer on ccmd.dev. The file runs entirely in your browser, no upload, and the result returns totalTokens, totalChars, and per-line findings with token-savings estimates. Multiply totalTokens by your model's input rate and divide by 1,000,000 for the per-turn dollar cost. Multiply by 30 for a long-running session, by 250 for a year of weekly use. The same analyzer scores AGENTS.md, .cursorrules, and .grokrules files the same way; all three formats inject into the host model's system prompt and re-send on every turn.

Does a polyglot template (CLAUDE.md + AGENTS.md + .cursorrules) cost more?

Yes, additively, because each agent host only reads the file format it cares about. Codex reads AGENTS.md, Cursor reads .cursorrules, Claude Code reads CLAUDE.md and a top-of-file import line in CLAUDE.md that points at AGENTS.md will pull AGENTS.md in too. If you maintain three near-identical templates, each one fires every turn in its own host, and the duplication compounds when one host imports the other. The analyzer's polyglot mode flags this duplication so you can split shared rules into one canonical file and import it.