/t · guide · weekly cap

Your CLAUDE.md is on the weekly bill too. Most people only count one turn at a time.

Matthew Diakonov, Written with AI

Published May 18, 20266 min read

The standard advice on hitting the weekly cap is: switch to Sonnet, wait for the reset, upgrade to Max 20x. All three are real options. None of them ask why the cap arrived on Wednesday. The answer for most paid users is in the config file that ships unchanged on every turn. The per-turn cost is small. The per-week cost is not.

1. The weekly cap, in one paragraph

Pro, Max 5x ($100/mo), and Max 20x ($200/mo) each have two limits. The five-hour rolling window doubled on 2026-05-06. The seven-day weekly cap got a temporary 50% bump on 2026-05-13 through 2026-07-13. After that, nobody knows. Independent measurements put Max 5x at roughly 24 Opus hours per week and Max 20x at roughly 40 Opus hours per week at the time of writing, with Sonnet hours several multiples higher. The point of this article is not the number. The point is that anything that fires on every turn pays for itself in seven-day units, not per-turn units.

0tokens in a typical CLAUDE.md

0turns / long session

0sessions / work week

×0cache-bust cost multiplier

2. The file

Trimmed sample for legibility. Pretend it is the first 200 tokens of a 6,000-token file (yours probably is, between stack notes, prior incidents, conventions, and skill references). One line on this snippet drives the weekly bill.

CLAUDE.md

The cache-busting line is L3. The rest of the audit (vague absolutes on L9-10, missing-why on the prohibitions, the bloat on the Stack paragraph) shows up in the context-burn audit. Different scoreboard.

3. Per-turn math, then per-week math

Per-turn formula lives in src/lib/analyzer.ts (line 266 onwards):

analyzer.ts

For a 6,042-token file that is $0.906 per cache-busted session (cold input on all 30 turns) or $0.091 if the prompt cache holds. The weekly extension is one more multiplier:

weekly-extension.ts

4.5M input tokens does not sound large compared to the millions of tokens of code Claude actually reads. It is the floor. CLAUDE.md ships before anything else. Every Read, Bash, Grep, and subagent call piles on top of that floor. The weekly cap doesn't care which tokens are which.

4. The cache-bust amplifier

The analyzer ships one check that has more weekly impact than the other six combined. It is ten lines of TypeScript and one regex:

src/lib/analyzer.ts:194

Three things to notice. First, the window is idx < 20 (only the first 20 lines of the file). Cache prefixes are matched from the top down; only the top mutating matters. Second, the regex matches four classes of volatile text: an ISO date 2026-05-18, the literal word today, the phrase this session, and the phrase right now. Third, the severity is high for a reason: each match flips a 6,000-token file from costing 0.6% of a Max 5x Opus week to costing 6-8% of it.

L3 — one line, one week

Line 3 reads 'Today is 2026-05-18 and we are mid-sprint on the subscriptions rewrite.' Every new session rewrites the date. Cached prefix never matches.

25 fresh prefixes / week
4.5M weekly input tokens billed
$22.66 / week at Opus 4.7 rates
~6-8% of Max 5x weekly Opus budget

every turn × every session

“every single API call to Claude sends the whole context, including prompts, meaning that all this extra text in CLAUDE.md is sent over and over”

caymanjim, Hacker News

5. The weekly audit, in one terminal block

What the analyzer prints on this snippet when you tell it to use a weekly window. The UI on ccmd.dev shows the same numbers in a panel. The terminal layout is easier to screenshot.

ccmd · weekly cost report

Two numbers on the same line tell the story. The current state is in red because L3 fires the cache_bust finding. The cached state is the number you'd see one delete away. The delta is 10x. The bottom row reads in percent of your weekly Opus budget; it is the only number a CTO actually wants to see.

6. The subagent multiplier nobody counts

Every subagent invocation starts with the parent CLAUDE.md as part of its system prompt. A workflow rule in your file that says "for any non-trivial bug, spawn an Explore subagent first" means every non-trivial turn rehydrates CLAUDE.md at least twice: once for the parent, once for the subagent. Two-deep fan-outs (Explore then Plan then implement) triple it.

The weekly math, with one subagent per turn on half the turns:

Parent turns / week: 750 (unchanged)
Subagent turns / week: 375 (half of parent turns spawn one)
Effective CLAUDE.md rehydrations / week: 1,125
Cache-busted token cost: 6,042 × 1,125 = 6.8M tokens / week

On a Max 5x weekly budget that is the band where you start hitting throttles before the Friday wrap-up. On June 15, 2026, the Agent SDK credit-cost change splits subagent calls into a separate credit pool. The math gets clearer (you'll see two line items on the bill) but it does not get smaller. The same lines in CLAUDE.md keep firing.

7. Why the standard advice misses

The three pieces of advice that always come up when someone posts "hit the weekly cap on Wednesday", ranked by what they actually buy you on a typical CLAUDE.md.

Feature	Switch model / upgrade plan / wait	Cut the cache_bust line
Time to take effect	after weekly reset OR a billing change	next session
Effect on CLAUDE.md token bill	0x (file ships unchanged)	~10x cheaper
Effect on output quality	Sonnet ≈ 90% of Opus on most tasks; varies by task	none (line was volatile noise)
Recurring monthly cost	$100 -> $200/mo to go Max 5x -> 20x	$0
Solves the file-level cause	no (treats the symptom)	yes, on the specific line
Still useful if you also do it	yes (Sonnet routing for non-complex turns is real)	yes (compounds with everything else)

Both columns are real. The point is that the first column ships today, costs nothing, and almost nobody is doing it because the per-turn math hides what the per-week math shows.

8. What to do this afternoon

Open ccmd.dev. Paste your CLAUDE.md into the textarea. Read the first finding. If it is a cache_bust, move that line to the bottom of the file under a heading like ## Current sprint or delete it. Cache prefix recovers next session.
Multiply the analyzer's per-session cost by your real weekly session count from claude.ai/usage. That number is your weekly CLAUDE.md floor. Track it.
Read the rest of the structural audit on the context-burn page. The cache_bust finding is the biggest single lever; the bloat, duplicate, and missing-why findings compound across 750 weekly turns.
If you have a shared CLAUDE.md across a team, check the drift guide. A shared file gets multiple cache_bust lines added by well-intentioned teammates; each one resets a different engineer's cache.

Want a weekly-cost number on your actual CLAUDE.md?

15 minutes, walk through your file together, leave with a per-week dollar number and the one line that is producing it. Free.

Frequently asked questions

What does 'weekly quota burn' actually mean on Claude Code?

Anthropic introduced a weekly usage cap on every paid Claude Code plan in August 2025. It sits on top of the older five-hour rolling window. Pro, Max 5x, and Max 20x each get a fixed envelope of Sonnet hours and a much smaller envelope of Opus hours per seven-day period. Hit the cap and Claude Code throttles you, regardless of how much rolling-window budget you have left. The May 6, 2026 update doubled the five-hour window, and the May 13 update temporarily increased the weekly cap by 50% through July 13. The cap itself did not go away. Verified on support.claude.com 2026-05-18.

How does CLAUDE.md show up on the weekly bill?

Every Claude Code request ships the full CLAUDE.md as part of the system prompt. A 6,000-token file × 30 turns / session × 5 sessions / day × 5 days = 4.5 million input tokens consumed by CLAUDE.md alone in one work week, before Claude has read a single line of your actual code. If the prompt cache hits across turns, that drops by roughly 10x. If a single volatile line near the top of the file busts the cache every session, you pay the full 4.5M every week.

Why does an ISO date on line 3 cost so much over a week?

Prompt caching is byte-exact. The cached prefix has to match the new request byte-for-byte to count. The analyzer's regex at src/lib/analyzer.ts line 194 fires on \b20[2-9]\d-\d{2}-\d{2}|today|this session|right now\b in the first 20 lines. Each new chat session rewrites the date. Each new date means a brand-new cached prefix. Across a week, that's 25 fresh prefixes instead of 1, and every turn in every session pays full input cost. The line is roughly 10 tokens. It is responsible for an order of magnitude more weekly billing than the rest of the file combined.

I am on Max 20x with the temporary 50% boost. Does this still matter?

Yes, in two ways. First, the boost is temporary; Anthropic has not said whether the new ceiling becomes permanent or drops back at the July 13 reset. Second, if you run multiple Claude Code sessions concurrently or fan out to subagents, the weekly Opus budget moves faster than people expect. The math compounds across (sessions per day) × (turns per session) × (subagents per turn). A 6,000-token CLAUDE.md cache-busted across 35 sessions of 30 turns each with two subagents fanning out is over 12.7 million tokens of rehydration. That is real percentage of Max 20x's weekly Opus envelope.

How is this different from the per-session token cost audit?

The per-session audit answers 'what does this session cost' in three states (cold, cached, cache-busted). The weekly quota audit answers 'what share of my Max plan's seven-day cap does this file consume'. The session math is one multiplication. The weekly math multiplies session math by sessions per day, days per week, and subagents per turn. The reason this matters is that nobody hits the rolling-window cap on a Tuesday. They hit the weekly cap on a Wednesday afternoon after four normal-feeling days, and the bill that surprised them came from rehydrating CLAUDE.md, not from the code Claude wrote.

Does the ccmd analyzer compute the weekly share directly?

Today the free analyzer at ccmd.dev computes total tokens, the Karpathy 12-rule pass rate, line-by-line findings (cache_bust, bloat, vague, aspirational, conflict, duplicate, missing_why), and per-session cost. Weekly multipliers are a paste-and-tune number in the FAQ rather than a UI yet. The constants are public: src/lib/analyzer.ts holds chars / 4 for tokens (line 38), TURNS = 30 for a long session (line 266), and OPUS_IN_PER_M for the input rate. Multiply your sessions/week by your turns/session and the formula falls out.

What is the single line I should cut first to stop the weekly burn?

Any line in the first 20 of your CLAUDE.md that contains a date string, the word 'today', 'this session', or 'right now'. Those four patterns are the analyzer's cache_bust check and they have an outsized weekly impact relative to their token count. After that: any duplicate (the duplicate finding flags the second instance), then the bloat finding (any non-blank line over 28 words). The bloat findings buy you per-turn savings that compound over a week. The cache_bust finding buys you a 10x weekly discount on the file you already wrote.

Does this also apply to AGENTS.md, .cursorrules, and .grokrules?

Yes. The same analyzer runs on all four formats and the same prompt-cache mechanics apply on Codex (AGENTS.md), Cursor (.cursorrules), and Grok Build (.grokrules). The weekly cap shape is different on each platform (Cursor has request-based pricing rather than an hour-style weekly cap, Codex bills per token with a daily soft limit), but the cache-busting line near the top of any config file produces the same multi-x billing delta. Paste any of the four formats; the analyzer detects which one it is.