AGENTS.md token audit cost: the audit is free, the calibration is the work.

Running the audit on your AGENTS.md takes about 250 milliseconds in a browser tab and costs nothing. The dollar number the audit prints is the easy part. The hard part, and the part most write-ups skip, is calibrating that number to the loader you actually use. The same file produces five different bills depending on whether it is read by Cursor, Claude Code, Claude Code with an @AGENTS.md import, Codex CLI cold, or Codex CLI on a cache hit.

Matthew Diakonov, Written with AI

Published May 20, 2026 · 6 min

Direct answer (verified 2026-05-20)

Paste your AGENTS.md into the textarea at ccmd.dev. The in-browser analyzer at src/lib/analyzer.ts reports three numbers in about 250 milliseconds: totalTokens (characters / 4, rounded up), estimatedTokensFireEveryTurn (equal to totalTokens), and estimatedCostPerLongRunSession (totalTokens times 30 turns times 15, divided by one million, which is a rate of $15 per million input tokens). The audit costs nothing: no signup, no upload, no rate limit. The dollar number it prints is the upper bound for one specific path (Codex CLI uncached, Opus 4.7 base input). Multiply by your loader's fire rate to get your actual bill: Cursor default 0x, Claude Code without an @-import 0x, Claude Code with @AGENTS.md approximately 0.33x (Opus is $5/M, not $15/M), Codex CLI uncached 1x, Codex CLI on a cache hit approximately 0.1x.

The four steps

The first three are mechanical: open the tool, paste, read three numbers. The fourth is the one that turns a generic audit into your actual bill.

Open ccmd.dev. The textarea on the homepage is the analyzer.

No login wall. No upload. No 'connect your GitHub'. The page is the tool. The analyzer is pure client-side TypeScript at src/lib/analyzer.ts, runs in your browser, and never sends your file anywhere. Open your network tab and watch the paste fire zero POST requests.

Paste your AGENTS.md verbatim. The whole file, comments and all.

The detector at src/lib/analyzer.ts line 41 reads the first 300 characters and routes by content, not by filename. It returns 'agents.md' if those 300 chars contain the literal string 'agents.md' or a top-level '# agents' heading. If neither is present, your AGENTS.md classifies as 'claude.md', which does not change the score; the rubric and per-line findings are identical across formats. The classification only matters for the label on the report.
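The routing rule described above can be sketched in a few lines. This is a minimal illustration and not the actual ccmd source: the function name detectFormat, the type name DetectedFormat, and the choice to match case-insensitively are all assumptions. Only the 300-character window and the two match conditions come from the description.

```typescript
// Hypothetical sketch of content-based format detection; not the ccmd source.
type DetectedFormat = "agents.md" | "claude.md";

function detectFormat(content: string): DetectedFormat {
  // Only the first 300 characters are inspected.
  // Assumption: matching is case-insensitive, so the window is lowercased.
  const head = content.slice(0, 300).toLowerCase();

  // Condition 1: the literal string 'agents.md' appears in the window.
  const mentionsAgentsMd = head.includes("agents.md");

  // Condition 2: a top-level heading, i.e. a line that starts with a
  // single '#' followed by spaces or tabs and the word "agents".
  const hasAgentsHeading = /^#[ \t]+agents\b/m.test(head);

  // Anything else falls through to 'claude.md'; only the report label changes.
  return mentionsAgentsMd || hasAgentsHeading ? "agents.md" : "claude.md";
}
```

Under these assumptions, a file whose only mention of AGENTS.md sits past character 300 is labeled 'claude.md', which matches the fallback behavior described above.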

Read three numbers: totalTokens, fires-every-turn tokens, session cost.

totalTokens is characters / 4, rounded up. fires-every-turn tokens equals totalTokens (the audit assumes the worst case: the whole file is re-sent on every turn). estimatedCostPerLongRunSession is totalTokens times 30 turns times 15, divided by one million, which is a rate of $15 per million input tokens. A 1,600-token AGENTS.md prints $0.72 per 30-turn session (1,600 × 30 × 15 / 1,000,000). A 6,400-token file prints $2.88. Both numbers are the upper bound for one specific path.

Calibrate the dollar number to your actual loader. This is the step everyone skips.

Multiply the reported session cost by your loader's fire rate. Cursor default: 0 (AGENTS.md is not in Cursor's loader; the dollar number drops to $0). Claude Code without an @AGENTS.md import in CLAUDE.md: 0 (Anthropic's memory loader reads CLAUDE.md, not AGENTS.md). Claude Code with @AGENTS.md inlined into CLAUDE.md: 1x but Opus 4.7 input is $5/M, not $15/M, so divide the printed dollar number by 3. Codex CLI uncached: 1x at the Codex base input rate. Codex CLI on a prompt-cache hit: 0.1x because cache reads bill at roughly 10% of base. Same audit output, five different bills.

What the audit actually prints

This is the report on a realistically shaped 1,612-token AGENTS.md. The shape of the output is fixed; only the numbers move. Everything below the three cost numbers is loader-independent: rubric pass count, per-line findings, suggested savings. The three cost numbers at the top come from the math in the next section.

ccmd audit output

The math the audit uses (and what it ignores)

The cost code is seven lines. It does not branch on the detected file type. It does not branch on the model. It does not branch on cache state. It does one calculation with hardcoded constants and prints the result. Knowing those constants is the difference between reading the audit and being misled by it.

src/lib/analyzer.ts

Three values to keep in your head when you read the dollar number on the report: TURNS = 30 (double for a long refactor session, halve for a quick fix), OPUS_IN_PER_M = 15 (Opus 4.7 base input; divide by 3 for the public per-call rate, by 5 for Sonnet, by 15 for Haiku, by 10 again for a cache hit), and estimatedTokensFireEveryTurn = totalTokens (the assumption that the whole file is re-sent on every turn; only true for Codex CLI and for Claude Code with an explicit @-import).
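The three values above combine into one calculation. What follows is a minimal sketch reconstructed from the prose, not the analyzer's actual listing: TURNS, OPUS_IN_PER_M, and the three output field names come from the article, while the function name estimateCost and the CostEstimate interface are assumptions made for illustration.

```typescript
// Sketch of the cost math as the article describes it; not the ccmd source.
const TURNS = 30;          // turns assumed per long-running session
const OPUS_IN_PER_M = 15;  // USD per million input tokens, hardcoded

// Hypothetical container for the three reported numbers.
interface CostEstimate {
  totalTokens: number;
  estimatedTokensFireEveryTurn: number;
  estimatedCostPerLongRunSession: number;
}

function estimateCost(content: string): CostEstimate {
  // Roughly four characters per token, rounded up.
  const totalTokens = Math.ceil(content.length / 4);

  // Worst case: the whole file is re-sent on every turn.
  const estimatedTokensFireEveryTurn = totalTokens;

  // No branching on file type, model, or cache state.
  const estimatedCostPerLongRunSession =
    (estimatedTokensFireEveryTurn * TURNS * OPUS_IN_PER_M) / 1_000_000;

  return {
    totalTokens,
    estimatedTokensFireEveryTurn,
    estimatedCostPerLongRunSession,
  };
}
```

Because the result is linear in every factor, doubling TURNS or halving the file changes the dollar number by exactly the same ratio, which is why the mental adjustments above are safe to apply after the fact.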

Step four: the calibration table

Take the dollar number the audit prints and multiply by the fire-rate column. That product is your actual per-session bill for AGENTS.md alone. Conversation history, tool definitions, and other files in the system prompt are billed separately and are not part of this number.

Loader	Why	Fire rate (multiplier)
Cursor (default config)	Loader reads .cursorrules and .cursor/rules/*.mdc, not AGENTS.md. The file is on disk but invisible to the model.	0
Claude Code (no @-import)	Per Anthropic memory docs, Claude Code reads CLAUDE.md. AGENTS.md is not auto-discovered.	0
Claude Code (CLAUDE.md contains @AGENTS.md)	File is inlined into the resolved memory blob. Re-sent every turn. Opus 4.7 input is $5/M, not the $15/M ccmd prints, so divide by 3.	approx 0.33
Codex CLI (uncached turns)	Loader injects AGENTS.md into the session system prompt. Re-sent on every uncached turn at full input rate.	1.00
Codex CLI (prompt-cache hit)	Cache reads bill at roughly 10% of base input. A 5-minute idle window flushes the cache on most providers.	0.10
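The table reduces to a lookup and one multiplication. The sketch below is illustrative only: the Loader keys, the FIRE_RATE map, and the calibrate function are hypothetical names, and the multipliers are copied from the table, with 1/3 standing in for "approx 0.33".

```typescript
// Hypothetical calibration helper; loader keys are made up for illustration.
type Loader =
  | "cursor-default"
  | "claude-code-no-import"
  | "claude-code-with-import"
  | "codex-uncached"
  | "codex-cache-hit";

// Fire-rate multipliers from the calibration table.
const FIRE_RATE: Record<Loader, number> = {
  "cursor-default": 0,              // AGENTS.md is not in the loader
  "claude-code-no-import": 0,       // only CLAUDE.md is read
  "claude-code-with-import": 1 / 3, // re-sent every turn, $5/M instead of $15/M
  "codex-uncached": 1,              // full input rate on every turn
  "codex-cache-hit": 0.1,           // cache reads at roughly 10% of base
};

// Scales the session cost printed by the audit to one loader.
function calibrate(printedSessionCost: number, loader: Loader): number {
  return printedSessionCost * FIRE_RATE[loader];
}
```

A mixed session (some turns cached, some not) would sit between the two Codex rows; this sketch does not model that and treats each row as all-or-nothing, as the table does.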

A worked example

Take the 1,612-token AGENTS.md from the report above. The audit prints $0.725 per session. If your team uses Cursor and has not wired AGENTS.md into a .cursorrules reference, your actual bill from that file is $0. If your team uses Claude Code and CLAUDE.md does not contain @AGENTS.md, your actual bill is also $0. If CLAUDE.md does contain that import, your actual bill is roughly $0.24 (0.725 divided by 3, because Opus 4.7 input is $5/M not the $15/M the audit assumes). If your team uses Codex CLI and pauses between turns long enough to flush the prompt cache repeatedly, your actual bill is the full $0.725. If you stay warm in the cache, it is roughly $0.073. Same file, same audit, five bills ranging from $0 to $0.73.

What to do with the per-line findings

The dollar number is one slice of the audit. The other slice is the findings array: one row per flagged line, each with a kind (bloat, vague, aspirational, conflict, duplicate, missing_why, cache_bust), a severity, a one-sentence message, and an estimated tokenSavings. Those are loader-independent. A line flagged as "vague" is just as vague in Cursor as it is in Codex; the difference is whether the model ever sees it.

Sort the findings by tokenSavings descending. Cut the top five. Re-run the audit. Most hand-written AGENTS.md files drop by 30 to 50% on this single pass, because the highest-savings rows are almost always the same shapes: a 200-300 token block of motherhood statements, one or two 80-character workflow paragraphs that should be three bullets, and a duplicate stack description that appears both in a Stack heading and inline in a workflow section.
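If you export the findings and want to do this pass in code, it is a sort and a slice. This is a sketch under assumptions: the Finding shape mirrors the fields named above (kind, severity, message, tokenSavings), the line field and the topSavings function are hypothetical, and severity is typed as a plain string because the article does not list its values.

```typescript
// The seven finding kinds named in the article.
type FindingKind =
  | "bloat"
  | "vague"
  | "aspirational"
  | "conflict"
  | "duplicate"
  | "missing_why"
  | "cache_bust";

// Assumed shape of one row in the findings array.
interface Finding {
  kind: FindingKind;
  severity: string;     // assumption: exact severity values are not documented here
  message: string;
  tokenSavings: number;
  line: number;         // hypothetical: which line of the file was flagged
}

// Returns the n findings with the largest estimated savings, largest first.
// The input array is copied so the caller's order is left untouched.
function topSavings(findings: Finding[], n: number = 5): Finding[] {
  return [...findings]
    .sort((a, b) => b.tokenSavings - a.tokenSavings)
    .slice(0, n);
}
```

Summing tokenSavings over the returned rows gives a rough preview of how far the next scan should drop, assuming the estimates do not overlap.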

FAQ

What does it cost to audit my AGENTS.md?

Nothing. The analyzer at ccmd.dev runs entirely in your browser, takes about 250 milliseconds on a normal-sized file, and never uploads the file anywhere. No signup, no email gate, no rate limit on the first scan. The paid tier ($9 to $19 per month solo, $49 per team) is for continuous monitoring (weekly drift email, PR diff comments, per-engineer cost attribution), not for the audit itself. The audit itself is and will stay free.

What does my AGENTS.md cost per session, according to the audit?

The audit prints estimatedCostPerLongRunSession, which is totalTokens times 30 turns times 15, divided by one million (a rate of $15 per million input tokens). A 1,600-token file prints $0.72. A 3,200-token file prints $1.44. A 6,400-token file prints $2.88. The math is at src/lib/analyzer.ts lines 263-269. That number is the upper bound for one specific path: Codex CLI on uncached turns, 30 turns long, Opus 4.7 at the base input rate. Your actual bill is that number multiplied by your loader's fire rate.

Why does the audit print one number when the real cost depends on the loader?

Because the cost path at src/lib/analyzer.ts line 264 sets estimatedTokensFireEveryTurn equal to totalTokens, unconditionally. The detector at line 41 returns 'agents.md' or 'claude.md' but the cost code never reads that value. The audit is doing the worst-case math: the whole file fires every turn at the highest input rate of the three Anthropic models. Calibrating that number to your loader is a manual step you do after reading the report. There is no toggle in the UI because Cursor versus Claude Code versus Codex versus 'CLAUDE.md with @AGENTS.md' is a four-way matrix and the audit prints one number.

If Cursor does not load AGENTS.md, why audit it for Cursor?

Because the rubric and per-line findings are loader-independent. Cursor will not bill you a single token for AGENTS.md if you have not wired it up, but the file is still part of the repo, still gets reviewed, still drifts, and still gets copied into a new project. Auditing it for clarity and for the Karpathy rubric is useful even when the per-session dollar number is zero. The dollar number is only one of the audit outputs; the other outputs (rubric pass count, vague-line flags, bloat warnings, missing failure-mode coverage) are what most teams actually act on.

What is the cheapest way to cut my AGENTS.md audit cost number in half?

Cut the file in half. The audit cost is linear in totalTokens. Run the audit, sort the findings by tokenSavings descending, and remove the top five lines. Most hand-written AGENTS.md files have a 200-300 token block of motherhood statements ('always write clean code', 'always think before changing') and one or two 80-character workflow paragraphs that should be three bullets. Removing those typically drops the file by 40-50%. The audit will reprint a smaller dollar number on the next scan.

Is the 30-turn assumption realistic for an AGENTS.md session?

It is a midpoint. A short bug-fix session is 5-10 turns; a long refactor session is 50-100. The audit uses 30 because it is roughly where the empirical median falls in the Claude Code billing data shared in public threads. For a long-running session, double the printed number. For a short one, halve it. The constant is exposed at src/lib/analyzer.ts line 266 (TURNS = 30) and is the easiest knob to tune mentally; the dollar number scales linearly.

Does the audit catch lines that Claude ignores in AGENTS.md?

Indirectly. The audit flags lines as 'vague' (R-class), 'aspirational', 'bloat', or 'missing why', and Claude has the same problem with those lines as a human reader does: they are too soft to act on. So a line like 'always write clean code' gets flagged as aspirational with a tokenSavings estimate and a suggestion. Whether Claude in fact ignored that line on a given turn is a different observation, made by reading transcripts; the audit cannot read transcripts. But the lines it flags are the lines most likely to be ignored.

Can I audit an AGENTS.md that imports other files with @-references?

Paste each file separately, or paste the resolved file. The browser-side analyzer does not chase @-imports because it cannot read your filesystem. If your AGENTS.md imports @docs/style.md and @docs/db.md, the audit counts those import bodies only if you paste them inline. Otherwise the printed number is the cost of the wrapper file alone, which underestimates the real per-turn cost. The fix is to paste the resolved file (run a quick script that inlines the imports) and audit that. For Claude Code specifically, Anthropic resolves @-imports up to a hop limit before injecting into the system prompt; you want to audit the resolved blob, not the wrapper.
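A quick inlining script can be as small as one recursive function. The sketch below is a hypothetical helper, not part of ccmd or Claude Code: it works on an in-memory map of path to file body so it stays self-contained, it assumes imports look like @relative/path.md at the start of the text or after whitespace, and MAX_HOPS = 5 is an assumed value for the hop limit mentioned above.

```typescript
// Hypothetical in-memory view of the repo: path -> file body.
type FileMap = Record<string, string>;

// Assumption: stop following imports after this many hops.
const MAX_HOPS = 5;

// Returns the body of `path` with every resolvable @-import replaced
// by the imported file's own (recursively resolved) body.
function inlineImports(
  path: string,
  files: FileMap,
  hopsLeft: number = MAX_HOPS
): string {
  // Unknown file: keep the reference as written so the gap stays visible.
  if (!Object.prototype.hasOwnProperty.call(files, path)) {
    return "@" + path;
  }
  const body = files[path];

  // Hop budget spent: return the body without resolving further imports.
  // This also bounds recursion when files import each other in a cycle.
  if (hopsLeft <= 0) {
    return body;
  }

  // Match "@some/path.md" at the start of the text or after whitespace.
  return body.replace(
    /(^|\s)@([\w./-]+\.md)\b/g,
    (_match: string, lead: string, target: string) =>
      lead + inlineImports(target, files, hopsLeft - 1)
  );
}
```

To use it against a real repo you would fill the map from disk first, then paste the returned string into the analyzer. Paths here are looked up exactly as written; resolving them relative to the importing file is left out of this sketch.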
