JSPM

claude-subagent-budget

0.1.1
  • License MIT

Pre-flight cost & quota estimator for Claude Code subagent spawns. Estimate Anthropic Max plan / ChatGPT Plus quota consumption, duration, and risk before launching multi-agent workflows.

Package Exports

  • claude-subagent-budget
  • claude-subagent-budget/index.js

This package does not declare an "exports" field, so the exports above were detected and optimized automatically by JSPM. If a package subpath is missing, consider opening an issue against the original package (claude-subagent-budget) asking it to add an "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

claude-subagent-budget

Pre-flight cost & quota estimator for Claude Code subagent spawns.

Before launching a multi-agent workflow (e.g. Writer → Codex fact-check → QA-Guard × N parallel articles), claude-subagent-budget estimates how much of your Claude Max plan / ChatGPT Plus quota the spawn will consume — plus duration and risk warnings — so you can avoid quota exhaustion or context-limit crashes mid-flight.

Why

  • Claude Max ($20 / $100 / $200) and ChatGPT Plus are subscription-based. There is no built-in "how much of my 5h quota will this spawn use?" feedback.
  • Multi-agent spawns can quietly explode the context window or burn through ChatGPT quota in minutes.
  • This tool is dependency-free Node.js and prints a quota-percentage view that matches how subscription users actually think about cost.

Features

  • Anthropic Max plan quota usage per model (Opus / Sonnet / Haiku) — primary display
  • ChatGPT Plus quota usage for Codex CLI / GPT-5.x sessions — secondary display
  • USD / JPY reference for API-billed mode (OSS users on pay-per-use)
  • Wall-clock duration estimate using per-agent runtime and parallelism wave-splitting
  • Risk evaluation (warn / block) at configurable thresholds
  • Exit codes 0/1/2 for CI/script integration
  • JSON output for tooling integration
  • Cross-platform (Windows / macOS / Linux), Node.js >= 18, zero dependencies

Installation

# Run with npx (no install)
npx claude-subagent-budget < plan.json

# Or install globally
npm install -g claude-subagent-budget
claude-subagent-budget < plan.json

Usage

CLI

echo '{
  "plan": [
    {"agent": "writer",       "task": "article generation", "model": "opus",    "expected_chars": 5500},
    {"agent": "codex-rescue", "task": "FC --fresh",         "model": "gpt-5.5", "expected_chars": 5500},
    {"agent": "qa-guard",     "task": "QA review",          "model": "sonnet",  "expected_chars": 5500}
  ],
  "parallelism": 12
}' | claude-subagent-budget

Output (default: pretty)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🪙 Subagent Budget Estimate
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
plan: 3 agents × 12 articles parallel

Tokens:
  in:  ~ 180,096 tokens
  out: ~ 153,000 tokens
  total: ~ 333,096 tokens

📊 Claude Max ($100) 5h quota usage:
  opus   :   2.9%  (   159,024 /  5,500,000 tokens)  → 97.1% remaining
  sonnet :   0.3%  (    78,036 / 27,500,000 tokens)  → 99.7% remaining

📊 ChatGPT Plus ($20) 5h quota usage:
  codex  :  19.3%  (    96,036 /    500,000 tokens)  → 80.7% remaining

(API-billed reference: $8.78 / ¥1,317 — no real charge for subscription users)

Duration:
  median: 84 min  (parallelism=12, 3 wave(s))
  p90:    147 min

⚠️ Warnings:
  [BLOCK] - codex-rescue × 12 parallel (=12 sessions) exceeds ChatGPT Plus quota

❌ Block: true  (reason: codex-parallel-hard-block)

JSON output

echo '<plan>' | claude-subagent-budget --json
{
  "plan_summary": { "agents": 3, "parallelism": 12 },
  "tokens": { "input_tokens": 180096, "output_tokens": 153000, "total_tokens": 333096 },
  "anthropic_quota": {
    "plan": "max_100",
    "tier_name": "Claude Max ($100)",
    "by_model": {
      "opus":   { "used": 159024, "quota": 5500000,  "pct": 2.9 },
      "sonnet": { "used":  78036, "quota": 27500000, "pct": 0.3 },
      "haiku":  { "used":      0, "quota": 55000000, "pct": 0 }
    }
  },
  "chatgpt_quota": {
    "plan": "plus",
    "tier_name": "ChatGPT Plus ($20)",
    "used": 96036, "quota": 500000, "pct": 19.3
  },
  "cost_reference": { "usd": 8.78, "jpy": 1317, "note": "..." },
  "duration": { "median_min": 84, "p90_min": 147, "per_article_min": 28, "waves": 3 },
  "warnings": ["..."],
  "block": true,
  "block_reason": "codex-parallel-hard-block",
  "exit_code": 2
}

Input format

Field                  Type    Description
plan[].agent           string  Agent identifier (e.g. writer, qa-guard, codex-rescue, analyst-scout)
plan[].task            string  Task description (length affects input token estimate)
plan[].model           string  Model identifier (see Supported models)
plan[].expected_chars  number  Expected output length in characters (drives output token estimate)
parallelism            number  Number of articles processed in parallel (default: 1)

Flags

Flag                Description
--json              Output JSON instead of pretty text
--auto              Same as --json (intended for tool integration)
--block-on-quota    Promote 80%+ quota warnings to block (exit 2)
--block-on-context  Promote 800K+ token context warnings to block (exit 2)
-h, --help          Show help
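The promotion behaviour of --block-on-quota can be pictured as a simple threshold check. Here is a minimal sketch (the function name is hypothetical, not part of the package API; the 80%/95% cut-offs mirror the defaults described under "How it works"):

```javascript
// Illustrative sketch of quota-threshold classification.
// Defaults: warn at >= 80% usage, block at >= 95%.
// With --block-on-quota, the 80% warning is promoted straight to a block.
function classifyQuota(pct, { blockOnQuota = false } = {}) {
  if (pct >= 95) return 'block';
  if (pct >= 80) return blockOnQuota ? 'block' : 'warn';
  return 'ok';
}

console.log(classifyQuota(19.3));                        // 'ok'
console.log(classifyQuota(85));                          // 'warn'
console.log(classifyQuota(85, { blockOnQuota: true }));  // 'block'
```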

Exit codes

Code  Meaning
0     OK — no warnings
1     WARN — warnings present, execution allowed
2     BLOCK — quota exhaustion / context overflow / hard parallelism cap reached

Supported models

Model                Provider                  Billing
opus, sonnet, haiku  Anthropic Claude          Subscription quota (Max plan) or pay-per-use API
gpt-5.5, gpt-5.4     ChatGPT Plus (Codex CLI)  5h subscription quota
gpt-4o, gpt-4o-mini  OpenAI API                Pay-per-use

Unknown models fall back to Sonnet-equivalent quota tracking.
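That fallback amounts to a defaulted lookup. A sketch (the function name is hypothetical; the quota figures are the max_100 defaults from the Configuration section below):

```javascript
// Illustrative model-quota lookup with the documented fallback:
// unknown model names resolve to Sonnet-equivalent quota tracking.
const MODEL_QUOTAS = { opus: 5500000, sonnet: 27500000, haiku: 55000000 };

function quotaFor(model) {
  return MODEL_QUOTAS[model] ?? MODEL_QUOTAS.sonnet;
}

console.log(quotaFor('opus'));          // 5500000
console.log(quotaFor('my-new-model'));  // 27500000 (Sonnet fallback)
```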

Configuration

Switching plan tier

Edit config/model-pricing.json:

{
  "user_plan": "max_200",   // "pro" | "max_100" | "max_200"
  ...
}

Customising quota figures

Anthropic does not publish exact numbers, so the bundled defaults are approximations. Tune them in config/model-pricing.json:

"max_100": {
  "tier_name": "Claude Max ($100)",
  "5h_token_quota": {
    "opus": 5500000,
    "sonnet": 27500000,
    "haiku": 55000000
  }
}

Adding a custom model

{
  "models": {
    "my-custom-model": {
      "input_per_1m": 5,
      "output_per_1m": 20,
      "currency": "USD"
    }
  }
}

Library usage

const { estimatePlan } = require('claude-subagent-budget/lib/token-estimator');
const { calcPlanCost } = require('claude-subagent-budget/lib/cost-calculator');
const { predictDuration } = require('claude-subagent-budget/lib/duration-predictor');
const { evaluateRisk } = require('claude-subagent-budget/lib/risk-evaluator');

const plan = [
  { agent: 'writer',       task: 'article', model: 'opus',    expected_chars: 5500 },
  { agent: 'codex-rescue', task: 'fc',      model: 'gpt-5.5', expected_chars: 5500 },
];
const parallelism = 6;

const tokens = estimatePlan(plan, parallelism);
const cost = calcPlanCost(plan, tokens.per_agent, parallelism);
const duration = predictDuration(plan, parallelism);
const risk = evaluateRisk({ tokens: tokens.totals, cost, parallelism, plan }, {
  blockOnQuota: true,
  blockOnContext: true,
});

if (risk.block) {
  console.error(`BLOCKED: ${risk.block_reason}`);
  process.exit(2);
}

How it works

Plan JSON
   │
   ▼
┌──────────────────────────────┐
│  token-estimator             │  Japanese chars × 1.5 + agent-specific output factors
├──────────────────────────────┤
│  cost-calculator             │  Anthropic quota / ChatGPT quota / USD reference
├──────────────────────────────┤
│  duration-predictor          │  per-agent runtime × ceil(parallelism / 5) waves
├──────────────────────────────┤
│  risk-evaluator              │  warn at 80% / block at 95% per quota
└──────────────┬───────────────┘
               ▼
        Pretty / JSON output
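The wave arithmetic in the duration-predictor box can be reproduced by hand. A sketch (the function names are illustrative; the 28 min per-article figure is taken from the sample output above, and the wave size of 5 from the diagram):

```javascript
// Duration via wave-splitting: concurrency is capped per wave, so
// 12 parallel articles at a wave size of 5 run as ceil(12 / 5) = 3 waves.
function waves(parallelism, waveSize = 5) {
  return Math.ceil(parallelism / waveSize);
}

function medianDurationMin(parallelism, perArticleMin) {
  return waves(parallelism) * perArticleMin;
}

console.log(waves(12));                  // 3
console.log(medianDurationMin(12, 28));  // 84 (matches the sample output)
```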

Limitations

  • Token estimates are heuristic. They assume Japanese-text input; tune CHARS_PER_TOKEN in lib/token-estimator.js for English-heavy workloads.
  • Anthropic Max plan quota figures are approximations; calibrate from observed usage.
  • Per-agent runtime values are based on observed Claude Code behavior with default plans/skills. Override via agentRuntimes argument to predictDuration().
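The character-to-token heuristic mentioned above is essentially a one-line conversion. A sketch of the idea (the exact formula in lib/token-estimator.js may differ, and the package's own tunable is named CHARS_PER_TOKEN; the factors here are illustrative, with the ×1.5 Japanese rule taken from the How-it-works diagram and ~4 characters per token being a common rule of thumb for English):

```javascript
// Heuristic token estimate from character counts. Japanese text yields
// roughly 1.5 tokens per character; English averages ~4 characters per
// token, so English-heavy workloads need a much smaller factor.
const TOKENS_PER_CHAR_JA = 1.5;  // illustrative default for Japanese text
const TOKENS_PER_CHAR_EN = 0.25; // ~4 characters per token in English

function estimateTokens(chars, factor = TOKENS_PER_CHAR_JA) {
  return Math.round(chars * factor);
}

console.log(estimateTokens(5500));                      // 8250
console.log(estimateTokens(5500, TOKENS_PER_CHAR_EN));  // 1375
```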

Releasing

Publishing is fully automated via GitHub Actions on tag push.

One-time setup (maintainer)

  1. Create an npm Automation Token at https://www.npmjs.com/settings/<your-npm-username>/tokens (scope: Automation).
  2. Add it to the repo as a secret named NPM_TOKEN (Settings → Secrets and variables → Actions → New repository secret).

Each release

# Bump version in package.json
npm version patch     # or minor / major

# Push the commit + the tag created by npm version
git push origin main
git push origin v0.1.1

The Publish to npm workflow runs automatically on v* tag push:

  1. Runs smoke tests (test/run.js)
  2. Verifies tag matches package.json version
  3. Publishes to npm with provenance (--provenance)

You can also dry-run the publish from the GitHub Actions UI (Run workflow → enable "dry_run").

CI

The Test workflow runs the smoke tests on every push to main and every PR, across Node 18/20/22 on Ubuntu/macOS/Windows.

License

MIT — see LICENSE.

Contributing

Pull requests welcome. The core surface is small (~1,000 lines, 5 files). Useful extensions:

  • More model-pricing presets (e.g. provider-specific tiers)
  • Real-time quota API integration (when Anthropic/OpenAI expose it)
  • Historical run log → automatic recalibration of runtime/token defaults