claude-subagent-budget
Pre-flight cost & quota estimator for Claude Code subagent spawns.
Before launching a multi-agent workflow (e.g. Writer → Codex fact-check → QA-Guard × N parallel articles), claude-subagent-budget estimates how much of your Claude Max plan / ChatGPT Plus quota the spawn will consume — plus duration and risk warnings — so you can avoid quota exhaustion or context-limit crashes mid-flight.
Why
- Claude Max ($20 / $100 / $200) and ChatGPT Plus are subscription-based. There is no built-in "how much of my 5h quota will this spawn use?" feedback.
- Multi-agent spawns can quietly explode the context window or burn through ChatGPT quota in minutes.
- This tool is dependency-free Node.js and prints a quota-percentage view that matches how subscription users actually think about cost.
Features
- Anthropic Max plan quota usage per model (Opus / Sonnet / Haiku) — primary display
- ChatGPT Plus quota usage for Codex CLI / GPT-5.x sessions — secondary display
- USD / JPY reference for API-billed mode (OSS users on pay-per-use)
- Wall-clock duration estimate using per-agent runtime and parallelism wave-splitting
- Risk evaluation (warn/block) at configurable thresholds
- Exit codes 0/1/2 for CI/script integration
- JSON output for tooling integration
- Cross-platform (Windows / macOS / Linux), Node.js >= 18, zero dependencies
Installation
```bash
# Run with npx (no install)
npx claude-subagent-budget < plan.json

# Or install globally
npm install -g claude-subagent-budget
claude-subagent-budget < plan.json
```
Usage
CLI
```bash
echo '{
  "plan": [
    {"agent": "writer", "task": "article generation", "model": "opus", "expected_chars": 5500},
    {"agent": "codex-rescue", "task": "FC --fresh", "model": "gpt-5.5", "expected_chars": 5500},
    {"agent": "qa-guard", "task": "QA review", "model": "sonnet", "expected_chars": 5500}
  ],
  "parallelism": 12
}' | claude-subagent-budget
```
Output (default: pretty)
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🪙 Subagent Budget Estimate
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
plan: 3 agents × 12 articles parallel

Tokens:
  in:    ~ 180,096 tokens
  out:   ~ 153,000 tokens
  total: ~ 333,096 tokens

📊 Claude Max ($100) 5h quota usage:
  opus   :  2.9% (   159,024 /  5,500,000 tokens) → 97.1% remaining
  sonnet :  0.3% (    78,036 / 27,500,000 tokens) → 99.7% remaining

📊 ChatGPT Plus ($20) 5h quota usage:
  codex  : 19.3% (    96,036 /    500,000 tokens) → 80.7% remaining
  (API-billed reference: $8.78 / ¥1,317 — no real charge for subscription users)

Duration:
  median: 84 min (parallelism=12, 3 wave(s))
  p90:    147 min

⚠️ Warnings:
  [BLOCK] - codex-rescue × 12 parallel (=12 sessions) exceeds ChatGPT Plus quota

❌ Block: true (reason: codex-parallel-hard-block)
```
JSON output
```bash
echo '<plan>' | claude-subagent-budget --json
```
```json
{
  "plan_summary": { "agents": 3, "parallelism": 12 },
  "tokens": { "input_tokens": 180096, "output_tokens": 153000, "total_tokens": 333096 },
  "anthropic_quota": {
    "plan": "max_100",
    "tier_name": "Claude Max ($100)",
    "by_model": {
      "opus": { "used": 159024, "quota": 5500000, "pct": 2.9 },
      "sonnet": { "used": 78036, "quota": 27500000, "pct": 0.3 },
      "haiku": { "used": 0, "quota": 55000000, "pct": 0 }
    }
  },
  "chatgpt_quota": {
    "plan": "plus",
    "tier_name": "ChatGPT Plus ($20)",
    "used": 96036, "quota": 500000, "pct": 19.3
  },
  "cost_reference": { "usd": 8.78, "jpy": 1317, "note": "..." },
  "duration": { "median_min": 84, "p90_min": 147, "per_article_min": 28, "waves": 3 },
  "warnings": ["..."],
  "block": true,
  "block_reason": "codex-parallel-hard-block",
  "exit_code": 2
}
```
Input format
| Field | Type | Description |
|---|---|---|
| `plan[].agent` | string | Agent identifier (e.g. `writer`, `qa-guard`, `codex-rescue`, `analyst-scout`) |
| `plan[].task` | string | Task description (length affects the input token estimate) |
| `plan[].model` | string | Model identifier (see Supported models) |
| `plan[].expected_chars` | number | Expected output length in characters (drives the output token estimate) |
| `parallelism` | number | Number of articles processed in parallel (default: 1) |
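When plans are generated by scripts, it helps to validate them against this format before piping them to the CLI. The helper below is hypothetical (`validatePlan` is not part of the package); it simply checks the fields from the table above and applies the documented `parallelism` default:

```javascript
// Hypothetical validator for the plan input format (not shipped with the package).
function validatePlan(doc) {
  if (!Array.isArray(doc.plan) || doc.plan.length === 0) {
    throw new Error('plan must be a non-empty array');
  }
  for (const entry of doc.plan) {
    for (const key of ['agent', 'task', 'model']) {
      if (typeof entry[key] !== 'string') throw new Error(`${key} must be a string`);
    }
    if (typeof entry.expected_chars !== 'number') {
      throw new Error('expected_chars must be a number');
    }
  }
  // parallelism defaults to 1, matching the CLI's documented default.
  return { parallelism: 1, ...doc };
}

const doc = validatePlan({
  plan: [{ agent: 'writer', task: 'article', model: 'opus', expected_chars: 5500 }],
});
console.log(doc.parallelism); // 1
```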
Flags
| Flag | Description |
|---|---|
| `--json` | Output JSON instead of pretty text |
| `--auto` | Same as `--json` (intended for tool integration) |
| `--block-on-quota` | Promote 80%+ quota warnings to block (exit 2) |
| `--block-on-context` | Promote 800K+ context warnings to block (exit 2) |
| `-h, --help` | Show help |
Exit codes
| Code | Meaning |
|---|---|
| `0` | OK — no warnings |
| `1` | WARN — warnings present, execution allowed |
| `2` | BLOCK — quota exhaustion / context overflow / hard parallelism cap reached |
Supported models
| Model | Provider | Billing |
|---|---|---|
| `opus`, `sonnet`, `haiku` | Anthropic Claude | Subscription quota (Max plan) or pay-per-use API |
| `gpt-5.5`, `gpt-5.4` | ChatGPT Plus (Codex CLI) | 5h subscription quota |
| `gpt-4o`, `gpt-4o-mini` | OpenAI API | Pay-per-use |
Unknown models fall back to Sonnet-equivalent quota tracking.
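Conceptually this fallback is just a lookup with a Sonnet default. The sketch below is illustrative only; the map and function names are assumptions, not the package's actual code:

```javascript
// Illustrative model → quota-tier resolution with a Sonnet fallback.
// The mapping mirrors the table above; all names here are assumptions.
const QUOTA_TIER = {
  opus: 'anthropic/opus',
  sonnet: 'anthropic/sonnet',
  haiku: 'anthropic/haiku',
  'gpt-5.5': 'chatgpt-plus/codex',
  'gpt-5.4': 'chatgpt-plus/codex',
  'gpt-4o': 'openai-api',
  'gpt-4o-mini': 'openai-api',
};

function quotaTierFor(model) {
  // Unknown models are tracked against the Sonnet quota.
  return QUOTA_TIER[model] ?? 'anthropic/sonnet';
}

console.log(quotaTierFor('my-custom-model')); // anthropic/sonnet
```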
Configuration
Switching plan tier
Edit `config/model-pricing.json`:
```jsonc
{
  "user_plan": "max_200", // "pro" | "max_100" | "max_200"
  ...
}
```
Customising quota figures
Anthropic does not publish exact numbers, so the bundled defaults are approximations. Tune them in `config/model-pricing.json`:
```json
"max_100": {
  "tier_name": "Claude Max ($100)",
  "5h_token_quota": {
    "opus": 5500000,
    "sonnet": 27500000,
    "haiku": 55000000
  }
}
```
Adding a custom model
```json
{
  "models": {
    "my-custom-model": {
      "input_per_1m": 5,
      "output_per_1m": 20,
      "currency": "USD"
    }
  }
}
```
Library usage
```javascript
const { estimatePlan } = require('claude-subagent-budget/lib/token-estimator');
const { calcPlanCost } = require('claude-subagent-budget/lib/cost-calculator');
const { predictDuration } = require('claude-subagent-budget/lib/duration-predictor');
const { evaluateRisk } = require('claude-subagent-budget/lib/risk-evaluator');

const plan = [
  { agent: 'writer', task: 'article', model: 'opus', expected_chars: 5500 },
  { agent: 'codex-rescue', task: 'fc', model: 'gpt-5.5', expected_chars: 5500 },
];
const parallelism = 6;

const tokens = estimatePlan(plan, parallelism);
const cost = calcPlanCost(plan, tokens.per_agent, parallelism);
const duration = predictDuration(plan, parallelism);
const risk = evaluateRisk({ tokens: tokens.totals, cost, parallelism, plan }, {
  blockOnQuota: true,
  blockOnContext: true,
});

if (risk.block) {
  console.error(`BLOCKED: ${risk.block_reason}`);
  process.exit(2);
}
```
How it works
```
Plan JSON
    │
    ▼
┌──────────────────────────────┐
│ token-estimator              │ Japanese chars × 1.5 + agent-specific output factors
├──────────────────────────────┤
│ cost-calculator              │ Anthropic quota / ChatGPT quota / USD reference
├──────────────────────────────┤
│ duration-predictor           │ per-agent runtime × ceil(parallelism / 5) waves
├──────────────────────────────┤
│ risk-evaluator               │ warn at 80% / block at 95% per quota
└──────────────┬───────────────┘
               ▼
     Pretty / JSON output
```
Limitations
- Token estimates are heuristic. They assume Japanese-text input; tune `CHARS_PER_TOKEN` in `lib/token-estimator.js` for English-heavy workloads.
- Anthropic Max plan quota figures are approximations; calibrate them from observed usage.
- Per-agent runtime values are based on observed Claude Code behavior with default plans/skills. Override them via the `agentRuntimes` argument to `predictDuration()`.
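The wave-splitting model from the diagram can be reproduced by hand: with the default wave size of 5, parallelism 12 yields ceil(12/5) = 3 waves, and 3 waves × 28 min per article gives the 84-minute median shown in the example output. As a sketch (the wave size of 5 comes from the diagram; the function names here are illustrative, not the package's API):

```javascript
// Illustrative wave math from the "How it works" diagram.
const WAVE_SIZE = 5; // articles processed per wave, per the diagram

function waves(parallelism) {
  return Math.ceil(parallelism / WAVE_SIZE);
}

function medianDurationMin(perArticleMin, parallelism) {
  return waves(parallelism) * perArticleMin;
}

console.log(waves(12));                 // 3
console.log(medianDurationMin(28, 12)); // 84
```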
Releasing
Publishing is fully automated via GitHub Actions on tag push.
One-time setup (maintainer)
- Create an npm Automation Token at https://www.npmjs.com/settings/&lt;your-username&gt;/tokens (scope: Automation).
- Add it to the repo as a secret named `NPM_TOKEN` (Settings → Secrets and variables → Actions → New repository secret).
Each release
```bash
# Bump version in package.json
npm version patch   # or minor / major

# Push the commit + the tag created by npm version
git push origin main
git push origin v0.1.1
```
The Publish to npm workflow runs automatically on `v*` tag push:
- Runs smoke tests (`test/run.js`)
- Verifies the tag matches the `package.json` version
- Publishes to npm with provenance (`--provenance`)
You can also dry-run the publish from the GitHub Actions UI (Run workflow → enable "dry_run").
CI
The Test workflow runs the smoke tests on every push to main and every PR, across Node 18/20/22 on Ubuntu/macOS/Windows.
License
MIT — see LICENSE.
Contributing
Pull requests welcome. The core surface is small (~1,000 lines, 5 files). Useful extensions:
- More model-pricing presets (e.g. provider-specific tiers)
- Real-time quota API integration (when Anthropic/OpenAI expose it)
- Historical run log → automatic recalibration of runtime/token defaults