AGENTS.md token cost per turn is not one number. It is four.

People ask what their AGENTS.md costs per turn the same way they ask what a CLAUDE.md costs: as if the file has a price. It does not. The same 640 bytes on disk become 0 tokens in Cursor, 0 in Claude Code without an @AGENTS.md import, the full file in Claude Code with that import, the full file in Codex CLI on uncached turns, and ~10% of the file on cached Codex turns. The per-turn cost is a property of the host, not the file.

Matthew Diakonov
7 min
Direct answer (verified 2026-05-20)

Per-turn token cost of AGENTS.md depends on which host loads it. Cursor (default): 0 tokens. Claude Code without an @AGENTS.md import: 0 tokens. Claude Code with @AGENTS.md inside CLAUDE.md: the full file, re-sent in the system prompt on every API call. Codex CLI uncached: the full file. Codex CLI on a cache hit: the full file, billed at about 10% of the base input rate (cache reads cost roughly a tenth of base input). The file's token count is estimated as its character count divided by 4.

Loader behavior cross-checked against Anthropic's memory docs and the AGENTS.md spec. Analyzer firing-model from src/lib/analyzer.ts line 264.

The four-host matrix, on one screen

Pick the row that matches your setup. The token count is the file's share of the per-turn input, not the whole prompt. The dollar column assumes Opus 4.7 input at $5 per million tokens; for Sonnet 4.6 multiply by 0.6, for Haiku 4.5 multiply by 0.2.

| Host configuration | How the file gets loaded | Per-turn tokens | Per-turn cost |
|---|---|---|---|
| Cursor (default) | AGENTS.md not in the loader. Reads .cursorrules and .cursor/rules/*.mdc. | 0 | $0.0000 |
| Claude Code, no @-import | Memory loader reads CLAUDE.md only. AGENTS.md is not discovered. | 0 | $0.0000 |
| Claude Code + @AGENTS.md in CLAUDE.md | @-import in CLAUDE.md inlines AGENTS.md into the resolved memory blob. | full file (chars / 4) | tokens × $5 / 1M |
| Codex CLI, uncached turn | Codex loader injects AGENTS.md body into the session system prompt at start. | full file (chars / 4) | tokens × base rate / 1M |
| Codex CLI, prompt-cache hit | Same loader as above; the prefix is reused from a previous turn within the cache window. | full file (cached read) | tokens × 10% of base rate / 1M |

The "note" column would not fit in a table this wide; the per-host walkthrough below carries those notes. The two zero rows are not a bug; they are the answer. If your AGENTS.md lives in a repo you only ever drive from Cursor, its per-turn token cost is zero, regardless of how long the file is.

The same matrix, as a ccmd run

Pasting the file into the analyzer on ccmd.dev reports one number: fires every turn. That is the third-row case (Claude Code with an @-import). Mentally fan that number out across the other rows of the matrix above and you have the full picture.

ccmd score AGENTS.md, fanned across hosts

Why the analyzer prints one number, not four

The analyzer detects file type correctly. It sets inputType = "agents.md" when the first 300 characters contain "agents.md" or a top-level # Agents heading. But the cost path on line 264 sets estimatedTokensFireEveryTurn = totalTokens unconditionally. It does not branch on inputType. So the reported per-turn cost is always the Claude-Code-with-@import number, even for files that would cost zero on the host the developer is actually using.

src/lib/analyzer.ts

This is the gap to close. Knowing the file type is not enough; the honest per-turn cost requires knowing the host pairing. Until the analyzer asks "which host?", the safest interpretation of its output is "this is what AGENTS.md would cost if Claude Code were inlining it via @-import." If you are using Cursor or plain Claude Code, the real per-turn cost of AGENTS.md is zero. If you are using Codex CLI, the real cost depends on whether your prompt cache is warm.
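One way to picture closing that gap is a cost function that branches on the host as well as the file type. The sketch below is not the code in src/lib/analyzer.ts; `Host`, `HostContext`, `estimatedBilledTokensPerTurn`, and the 0.1 cache-read ratio are hypothetical names and values chosen to mirror the matrix above.

```typescript
// Hypothetical host-aware version of the per-turn estimate.
// None of these identifiers exist in ccmd; they only illustrate the branching.
type Host = "cursor" | "claude-code" | "codex-cli";

interface HostContext {
  host: Host;
  claudeMdImportsAgentsMd?: boolean; // only meaningful for "claude-code"
  cacheWarm?: boolean;               // only meaningful for "codex-cli"
}

// Assumption: cache reads bill at roughly 10% of the base input rate.
const CACHE_READ_RATIO = 0.1;

function estimatedBilledTokensPerTurn(
  inputType: string,
  totalTokens: number,
  ctx: HostContext,
): number {
  // Anything that is not an AGENTS.md keeps today's behaviour: full file, every turn.
  if (inputType !== "agents.md") return totalTokens;

  switch (ctx.host) {
    case "cursor":
      return 0; // not in Cursor's default loader
    case "claude-code":
      return ctx.claudeMdImportsAgentsMd ? totalTokens : 0;
    case "codex-cli":
      // The file is in the prompt either way; only the billing rate changes.
      return ctx.cacheWarm ? totalTokens * CACHE_READ_RATIO : totalTokens;
  }
}
```

Calling it once per host configuration with the same `totalTokens` reproduces the matrix rows for a given file. The return value is a billed-token equivalent, so a cached Codex turn comes back as a tenth of the file even though the whole file is still in the prompt.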

Per-host walkthrough

Cursor: 0 tokens

Cursor's native rule loader reads .cursorrules in the repo root and .cursor/rules/*.mdc. AGENTS.md is not a name the loader knows. So if your repo has one AGENTS.md and no .cursorrules, the file is on disk but never reaches the model. Per-turn token cost: 0. The honest number is 0 regardless of how long AGENTS.md is, until you wire it in explicitly (rule reference, manual paste, or a symlink from .cursorrules to AGENTS.md).

Claude Code without @-import: 0 tokens

Anthropic's memory docs are explicit: Claude Code reads CLAUDE.md, not AGENTS.md. So an AGENTS.md sitting next to a CLAUDE.md that never references it is invisible to the loader. Run /memory in Claude Code and confirm; the loaded list will show CLAUDE.md and any of its @-imports, not AGENTS.md.

Claude Code + @AGENTS.md in CLAUDE.md: full file

If CLAUDE.md contains a line like @AGENTS.md, the loader inlines the AGENTS.md body into the resolved memory blob, recursively up to a hop limit (block-level HTML comments get stripped during this step, which kills the BEGIN/END fence-pair overhead). At that point the per-turn cost is identical to a CLAUDE.md of the same character count: full file, every turn, billed at the model's input rate. This is the row the analyzer reports.
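The inlining step can be sketched as a small recursive resolver over an in-memory file map. This is an illustration of the behaviour described above, not Claude Code's implementation: `resolveMemory`, `FileMap`, and the default hop limit of 5 are assumptions, and the comment-stripping regex is a simplification that removes every HTML comment rather than only block-level ones.

```typescript
// Hypothetical sketch of @-import inlining with a hop limit.
// FileMap stands in for the file system: path -> file contents.
type FileMap = Record<string, string>;

const HTML_COMMENT = /<!--[\s\S]*?-->/g; // simplification: strips all comments
const IMPORT_LINE = /^@(\S+)$/;          // a line consisting only of "@path"

function resolveMemory(
  files: FileMap,
  entry: string,
  maxHops = 5, // assumption: the real limit may differ
  hop = 0,
): string {
  const body = files[entry];
  if (body === undefined) return ""; // missing import resolves to nothing

  return body
    .replace(HTML_COMMENT, "")
    .split("\n")
    .map((line) => {
      const target = IMPORT_LINE.exec(line.trim())?.[1];
      // Past the hop limit, leave the import line as literal text.
      if (target === undefined || hop >= maxHops) return line;
      return resolveMemory(files, target, maxHops, hop + 1);
    })
    .join("\n");
}
```

Resolving `CLAUDE.md` against a map where its only line is `@AGENTS.md` returns the AGENTS.md body with its comment fence removed, and that resolved text is what gets counted at chars / 4. The hop limit also bounds import cycles, so two files that import each other still terminate.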

Codex CLI: full file uncached, ~10% cached

The Codex loader injects AGENTS.md into the session system prompt at start (HTML comments preserved verbatim, fence pair costing about 24 tokens of overhead). Because the underlying API is stateless, the system prompt is included on every call, so the file is technically "sent every turn". The difference between cached and uncached turns is how that re-send is billed: a fresh turn pays the base input rate on every token of the prefix; a turn whose prefix matches a previous turn's within the cache window pays the cache-read rate, which is roughly 10% of base. So on Codex CLI, the marginal per-turn cost of AGENTS.md is closer to tokens × base / 10 than tokens × base for any turn that lands inside the cache window. Long pauses between turns flush the cache and reset the bill back to the full rate.
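The effect of pauses on the bill can be modelled with a toy cache simulator. The sketch below assumes a 300-second window (the figure the FAQ quotes for Anthropic's API; the window behind Codex CLI may differ), a 10% cache-read ratio, and that every turn refreshes the window. It ignores cache-write surcharges and minimum-prefix rules, and `billedTokensPerTurn` is a hypothetical name.

```typescript
// Hypothetical model: which turns pay the base rate on AGENTS.md and which
// pay the cache-read rate, given when each turn was sent.
function billedTokensPerTurn(
  fileTokens: number,
  turnTimesSec: number[],   // send time of each turn, ascending, in seconds
  cacheWindowSec = 300,     // assumption: 5-minute window
  cacheReadRatio = 0.1,     // assumption: cache reads cost ~10% of base
): number[] {
  return turnTimesSec.map((time, index) => {
    if (index === 0) return fileTokens; // first turn is always cold
    const gap = time - turnTimesSec[index - 1]!;
    // Inside the window the prefix is a cache read; outside it is re-billed in full.
    return gap <= cacheWindowSec ? fileTokens * cacheReadRatio : fileTokens;
  });
}

// Example: three quick turns, then a long pause before the fourth.
const exampleBilling = billedTokensPerTurn(1600, [0, 60, 120, 1000]);
```

In the example, turns two and three land inside the window and are billed as a tenth of the file; the fourth turn arrives after the window has lapsed and pays the full rate again.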

How to actually compute your number

1. Check the host. Token cost belongs to whichever loader pulls the file.

The same AGENTS.md does not have one cost. It has a different cost in Cursor (zero by default), Claude Code (zero unless you @-import it), and Codex CLI (full file, with caching as a discount).

2. Decide whether the file fires every turn or once per session.

Anthropic's Messages API and OpenAI's are both stateless. Every turn re-sends the system prompt prefix. The number that matters is how much of that prefix is cached. A warm prompt cache cuts the prefix portion to about 10% of base input rate.

3. Estimate the file in tokens: characters divided by 4.

The analyzer's heuristic at src/lib/analyzer.ts line 37 is Math.ceil(text.length / 4). A 640-character AGENTS.md is roughly 160 tokens. A 6,400-character one is roughly 1,600. Multiply by your rate to get dollars.

4. Plug into the right rate for your model and cache state.

Opus 4.7 input: $5 per million, $0.50 per million cached. Sonnet 4.6: $3 / $0.30. Haiku 4.5: $1 / $0.10. Output rates are 5x input. AGENTS.md is part of the input bill, not the output bill.
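Steps 3 and 4 reduce to a few lines of arithmetic. The sketch below reuses the Math.ceil(text.length / 4) heuristic quoted in step 3 and the rates listed in step 4; the model keys, `INPUT_RATES`, and `perTurnDollars` are hypothetical names, not part of ccmd.

```typescript
// Rates in USD per million input tokens, copied from step 4.
// The model keys are illustrative labels, not official API model IDs.
type Model = "opus-4.7" | "sonnet-4.6" | "haiku-4.5";

const INPUT_RATES: Record<Model, { base: number; cached: number }> = {
  "opus-4.7": { base: 5, cached: 0.5 },
  "sonnet-4.6": { base: 3, cached: 0.3 },
  "haiku-4.5": { base: 1, cached: 0.1 },
};

// Same heuristic the article attributes to the analyzer: characters / 4, rounded up.
// Note that text.length counts UTF-16 code units, which equals bytes only for ASCII.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Dollar contribution of the file to one turn, for a host that actually loads it.
function perTurnDollars(tokens: number, model: Model, cacheHit: boolean): number {
  const rate = cacheHit ? INPUT_RATES[model].cached : INPUT_RATES[model].base;
  return (tokens * rate) / 1_000_000;
}

// Example: a 6,400-character file on Opus 4.7, uncached versus cached.
const exampleTokens = estimateTokens("x".repeat(6400));
const exampleUncached = perTurnDollars(exampleTokens, "opus-4.7", false);
const exampleCached = perTurnDollars(exampleTokens, "opus-4.7", true);
```

For hosts that never load the file (Cursor by default, Claude Code without the import), skip the calculation entirely: the answer is 0 before any rate applies.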

Worked example: a 6,400-character AGENTS.md

~1,600 tokens (chars / 4). Per turn at Opus 4.7 input ($5 / M):

  • Cursor (default): 0 tokens, $0.0000 per turn.
  • Claude Code, no @-import: 0 tokens, $0.0000 per turn.
  • Claude Code + @AGENTS.md: 1,600 tokens, $0.008 per turn. Over a 30-turn session: $0.24.
  • Codex CLI, uncached turn: 1,600 tokens at base rate, ~$0.008 per turn.
  • Codex CLI, cache hit: 1,600 tokens at cache-read rate, ~$0.0008 per turn. Over 30 cached turns: $0.024.

The session-level gap between "Cursor with this file" and "Claude Code inlining this file via @-import" is a factor of infinity (zero to nonzero). The gap between cached and uncached Codex is a factor of ten. Reading your cost off the wrong row of the matrix produces a larger error than any amount of file tuning would correct.

Frequently asked questions

What does my AGENTS.md cost per turn?

It depends on the host that loads it. In Cursor, AGENTS.md is 0 tokens per turn by default because Cursor's loader looks at .cursorrules and .cursor/rules/*.mdc, not AGENTS.md. In Claude Code, it is also 0 unless you explicitly write @AGENTS.md inside CLAUDE.md; the memory loader does not pick it up otherwise. With that @-import in place, Claude Code inlines the file and re-sends 100% of the tokens in the system prompt on every turn. In Codex CLI, the file is injected at session start and re-sent on every turn, billed at the full input rate on uncached turns and about 10% of that on cached turns. Same 640 bytes on disk, four different per-turn bills.

Why does ccmd's analyzer report one per-turn number for AGENTS.md?

Because the cost path at src/lib/analyzer.ts line 264 sets estimatedTokensFireEveryTurn = totalTokens unconditionally. The file type is detected at line 41 (detectType returns 'agents.md' when the first 300 characters contain 'agents.md' or a top-level 'agents' header) but the downstream cost code never reads inputType. The number the analyzer prints is the Claude-Code-with-@import case: full file, every turn. For Cursor or Claude Code without an @-import, the entire reported figure is overstatement, because the real per-turn cost is 0; for a cached Codex turn it overstates by about 10x.

Does AGENTS.md fire every turn in Codex CLI?

Yes, in the sense that it is part of the system prompt prefix on every API call, because the underlying API is stateless. No, in the sense that on a cache hit the model provider only re-bills the cached prefix at the cache-read rate, which is roughly 10% of the base input rate. So the marginal per-turn cost of AGENTS.md in Codex CLI is the full token count multiplied by the base rate on uncached turns, and the full token count multiplied by the cache-read rate on cached turns. The cache window is short (5 minutes on Anthropic's API; OpenAI's caching has its own window) so a slow developer with long pauses between turns will see more uncached turns than a fast one.

Does AGENTS.md fire every turn in Claude Code?

Only if CLAUDE.md contains an @AGENTS.md import. Per Anthropic's memory docs at code.claude.com/docs/en/memory, Claude Code reads CLAUDE.md, not AGENTS.md. So a repo with both files on disk and no import line will load 0 tokens of AGENTS.md per turn. A repo with one CLAUDE.md whose only line is @AGENTS.md will load the full AGENTS.md every turn, after the loader strips block-level HTML comments and resolves any further @-imports up to a hop limit. The per-turn cost in that case is identical to a CLAUDE.md of the same character count.

Does Cursor load AGENTS.md at all?

Not by default. Cursor's native rule loader reads .cursorrules in the repo root and .cursor/rules/*.mdc. AGENTS.md is read only if you wire it up explicitly: a rule reference, a manual paste into a Cursor rule, or a symlink from .cursorrules to AGENTS.md. So the honest per-turn token cost of AGENTS.md in a Cursor session is 0 unless the developer has done one of those three things, in which case it becomes whatever the rule mechanism injects (typically full file).

If the model API is stateless, how can per-turn cost be 0 for some hosts?

Because 'per-turn cost of AGENTS.md' is the cost contribution of that specific file to one turn, not the cost of the whole turn. The system prompt always exists and is always billed. The question is whether AGENTS.md is a byte-range inside that system prompt. In Cursor without explicit wiring, it is not, so the file's marginal contribution to per-turn cost is 0 even though the rest of the system prompt (Cursor's own scaffolding, the developer's .cursorrules, the user message, tool definitions, conversation history) is still billed normally.

Does the per-turn token count for AGENTS.md change as the session grows?

No. The per-turn token count of the file itself is flat for the life of the session, equal to the file's character count divided by 4. What grows linearly is the conversation history that sits next to AGENTS.md in the context, not AGENTS.md itself. So an AGENTS.md of 1,600 tokens contributes 1,600 tokens to turn 1 and 1,600 tokens to turn 50. The dollar contribution per turn is also flat unless the cache state changes (cold vs. warm) or the model is switched mid-session.
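A toy breakdown makes the flat-versus-growing split concrete. The sketch assumes history grows by a fixed number of tokens per turn, which real sessions do not, and it ignores the rest of the system prompt; `breakdownAtTurn` and the 500-tokens-per-turn figure in the example are hypothetical.

```typescript
// Hypothetical model of one turn's input: a fixed-size file plus history
// that grows by a constant amount each turn.
interface TurnBreakdown {
  turn: number;
  fileTokens: number;    // AGENTS.md contribution, constant across turns
  historyTokens: number; // conversation history accumulated before this turn
  fileShare: number;     // fraction of (file + history) that is the file
}

function breakdownAtTurn(
  fileTokens: number,
  historyTokensPerTurn: number, // assumption: constant growth per turn
  turn: number,                 // 1-based turn index
): TurnBreakdown {
  const historyTokens = historyTokensPerTurn * (turn - 1);
  const total = fileTokens + historyTokens;
  return {
    turn,
    fileTokens,
    historyTokens,
    fileShare: total === 0 ? 0 : fileTokens / total,
  };
}

// Example: a 1,600-token file with an assumed 500 tokens of new history per turn.
const firstTurn = breakdownAtTurn(1600, 500, 1);
const fiftiethTurn = breakdownAtTurn(1600, 500, 50);
```

Both results carry the same `fileTokens`; only `historyTokens` and the file's share of the input change between turn 1 and turn 50.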

How do I see the per-turn token cost for my actual AGENTS.md?

Paste it into the textarea on ccmd.dev. The analyzer runs in your browser with no upload and returns totalTokens; multiply that by the input rate of the host's model and divide by 1,000,000 for a per-turn dollar figure in the Claude-Code-with-@import or Codex-uncached case. For Cursor without explicit wiring, the honest number is 0. For Codex with cache hits, divide the uncached number by about 10. The analyzer also returns a per-line findings array so you can see which specific lines are inflating the file before deciding whether it is worth the per-turn cost it adds.
