Goal
Cascading per-agent budgets for Claude Max that are: durable across daemon restarts, period-resetting, enforced inside the cognitive loop (including the bridge's native loop), expressed as percentages at the user/agent boundary, and accurate per model class.
Status (2026-04-29)
Foundation landed — persistence, period rollover, mandatory budgets, cascading clamp, inner-loop enforcement for non-bridge providers, and the get_my_budget_status tool. 38 tests green. But this is not yet shippable for Claude Max users:
- Budgets are token-denominated; user/agent should declare percentages.
- Estimator blends Haiku/Sonnet/Opus into one ratio — meaningless number.
- Bridge native loop enforces only post-orchestration; a runaway 50-turn execution can still blow the cap.
Remaining scope
Percent-based budget API at boundaries
- New schema:
{provider: {"pct": N, "of": "parent" | "subscription_5h", "period": "..."}}
- Internal storage stays in tokens (recorder, counters, cascade math unchanged)
- Conversion happens at the boundary:
pct of parent → tokens at registration (parent_token_cap * pct / 100)
pct of subscription_5h → tokens at lookup time, recomputed each refresh
- Backward compat: int form (
{provider: 1000}) and existing dict form ({"limit": N, ...}) still accepted.
Per-model-class estimator (haiku / sonnet / opus)
- Three EMA buckets in
BridgeProvider, one per class
- Snapshot tuple becomes
(tokens_delta_for_class, util_5h_delta, model_class)
- Tag each turn's tokens with the model that produced them (from
response.model)
- Hard-coded bootstrap multipliers used until 2 snapshots exist for a class: relative cost
haiku=1, sonnet=5, opus=25 normalized via Anthropic's headers once available
tokens_per_pct[class] is the conversion used when a budget says pct of subscription_5h
Bridge inner-loop enforcement
- TS bridge emits
usage event per assistant turn: {input, output, cache_read, cache_creation, model} — same channel as existing tool_use_start/result/compaction events
- Python
BridgeProvider._stream_events calls the same usage_recorder callback used by the base provider; on (False, blocker) calls self.interrupt() to stop the SDK
- Engine sees
stop_reason="budget_exceeded" (already wired)
Reconciliation pass
- After each bridge orchestration, refresh subscription headers
- Compare measured
util_5h_delta against sum of per-turn events
- Big discrepancy → recalibrate the estimator
- Small → trust it. Safety net for when the estimator drifts.
Acceptance
- A user/agent can declare
{"claude_max": {"pct": 30, "of": "parent"}} or {"pct": 80, "of": "subscription_5h", "period": "weekly"} and registration computes the equivalent token cap.
- The cascading clamp at registration uses percent terms naturally — child's
pct of parent ≤ 100 minus siblings' total.
- The bridge orchestration aborts mid-loop within ≤1 turn after crossing a cap, with
stop_reason="budget_exceeded".
- The estimator produces three independent
tokens_per_pct ratios for haiku/sonnet/opus.
- Reconciliation diff is logged per orchestration; estimator self-corrects after a configurable number of mismatched cycles.
- Tests in
tests/atn/ covering each.
Out of scope
Goal
Cascading per-agent budgets for Claude Max that are: durable across daemon restarts, period-resetting, enforced inside the cognitive loop (including the bridge's native loop), expressed as percentages at the user/agent boundary, and accurate per model class.
Status (2026-04-29)
Foundation landed — persistence, period rollover, mandatory budgets, cascading clamp, inner-loop enforcement for non-bridge providers, and the
get_my_budget_statustool. 38 tests green. But this is not yet shippable for Claude Max users:Remaining scope
Percent-based budget API at boundaries
{provider: {"pct": N, "of": "parent" | "subscription_5h", "period": "..."}}pct of parent→ tokens at registration (parent_token_cap * pct / 100)pct of subscription_5h→ tokens at lookup time, recomputed each refresh{provider: 1000}) and existing dict form ({"limit": N, ...}) still accepted.Per-model-class estimator (haiku / sonnet / opus)
BridgeProvider, one per class(tokens_delta_for_class, util_5h_delta, model_class)response.model)haiku=1, sonnet=5, opus=25normalized via Anthropic's headers once availabletokens_per_pct[class]is the conversion used when a budget sayspct of subscription_5hBridge inner-loop enforcement
usageevent per assistant turn:{input, output, cache_read, cache_creation, model}— same channel as existingtool_use_start/result/compactioneventsBridgeProvider._stream_eventscalls the sameusage_recordercallback used by the base provider; on(False, blocker)callsself.interrupt()to stop the SDKstop_reason="budget_exceeded"(already wired)Reconciliation pass
util_5h_deltaagainst sum of per-turn eventsAcceptance
{"claude_max": {"pct": 30, "of": "parent"}}or{"pct": 80, "of": "subscription_5h", "period": "weekly"}and registration computes the equivalent token cap.pct of parent≤ 100 minus siblings' total.stop_reason="budget_exceeded".tokens_per_pctratios for haiku/sonnet/opus.tests/atn/covering each.Out of scope