JSPM

claude-subagent-budget

0.1.1
  • License MIT

Pre-flight cost & quota estimator for Claude Code subagent spawns. Estimate Anthropic Max plan / ChatGPT Plus quota consumption, duration, and risk before launching multi-agent workflows.

Package Exports

  • claude-subagent-budget
  • claude-subagent-budget/index.js

This package does not declare an "exports" field, so the exports above were detected and optimized automatically by JSPM. If a package subpath is missing, consider opening an issue against the original package (claude-subagent-budget) asking it to add an "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

claude-subagent-budget

Pre-flight cost & quota estimator for Claude Code subagent spawns.

Before launching a multi-agent workflow (e.g. Writer → Codex fact-check → QA-Guard × N parallel articles), claude-subagent-budget estimates how much of your Claude Max plan / ChatGPT Plus quota the spawn will consume — plus duration and risk warnings — so you can avoid quota exhaustion or context-limit crashes mid-flight.

Why

  • Claude Max ($20 / $100 / $200) and ChatGPT Plus are subscription-based. There is no built-in "how much of my 5h quota will this spawn use?" feedback.
  • Multi-agent spawns can quietly explode the context window or burn through ChatGPT quota in minutes.
  • This tool is dependency-free Node.js and prints a quota-percentage view that matches how subscription users actually think about cost.

Features

  • Anthropic Max plan quota usage per model (Opus / Sonnet / Haiku) — primary display
  • ChatGPT Plus quota usage for Codex CLI / GPT-5.x sessions — secondary display
  • USD / JPY reference for API-billed mode (OSS users on pay-per-use)
  • Wall-clock duration estimate using per-agent runtime and parallelism wave-splitting
  • Risk evaluation (warn / block) at configurable thresholds
  • Exit codes 0/1/2 for CI/script integration
  • JSON output for tooling integration
  • Cross-platform (Windows / macOS / Linux), Node.js >= 18, zero dependencies

Installation

# Run with npx (no install)
npx claude-subagent-budget < plan.json

# Or install globally
npm install -g claude-subagent-budget
claude-subagent-budget < plan.json

Usage

CLI

echo '{
  "plan": [
    {"agent": "writer",       "task": "article generation", "model": "opus",    "expected_chars": 5500},
    {"agent": "codex-rescue", "task": "FC --fresh",         "model": "gpt-5.5", "expected_chars": 5500},
    {"agent": "qa-guard",     "task": "QA review",          "model": "sonnet",  "expected_chars": 5500}
  ],
  "parallelism": 12
}' | claude-subagent-budget

Output (default: pretty)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🪙 Subagent Budget Estimate
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
plan: 3 agents × 12 articles parallel

Tokens:
  in:  ~ 180,096 tokens
  out: ~ 153,000 tokens
  total: ~ 333,096 tokens

📊 Claude Max ($100) 5h quota usage:
  opus   :   2.9%  (   159,024 /  5,500,000 tokens)  → 97.1% remaining
  sonnet :   0.3%  (    78,036 / 27,500,000 tokens)  → 99.7% remaining

📊 ChatGPT Plus ($20) 5h quota usage:
  codex  :  19.3%  (    96,036 /    500,000 tokens)  → 80.7% remaining

(API-billed reference: $8.78 / ¥1,317 — no real charge for subscription users)

Duration:
  median: 84 min  (parallelism=12, 3 wave(s))
  p90:    147 min

⚠️ Warnings:
  [BLOCK] - codex-rescue × 12 parallel (=12 sessions) exceeds ChatGPT Plus quota

❌ Block: true  (reason: codex-parallel-hard-block)

JSON output

echo '<plan>' | claude-subagent-budget --json
{
  "plan_summary": { "agents": 3, "parallelism": 12 },
  "tokens": { "input_tokens": 180096, "output_tokens": 153000, "total_tokens": 333096 },
  "anthropic_quota": {
    "plan": "max_100",
    "tier_name": "Claude Max ($100)",
    "by_model": {
      "opus":   { "used": 159024, "quota": 5500000,  "pct": 2.9 },
      "sonnet": { "used":  78036, "quota": 27500000, "pct": 0.3 },
      "haiku":  { "used":      0, "quota": 55000000, "pct": 0 }
    }
  },
  "chatgpt_quota": {
    "plan": "plus",
    "tier_name": "ChatGPT Plus ($20)",
    "used": 96036, "quota": 500000, "pct": 19.3
  },
  "cost_reference": { "usd": 8.78, "jpy": 1317, "note": "..." },
  "duration": { "median_min": 84, "p90_min": 147, "per_article_min": 28, "waves": 3 },
  "warnings": ["..."],
  "block": true,
  "block_reason": "codex-parallel-hard-block",
  "exit_code": 2
}

Input format

Field                  Type    Description
plan[].agent           string  Agent identifier (e.g. writer, qa-guard, codex-rescue, analyst-scout)
plan[].task            string  Task description (length affects input token estimate)
plan[].model           string  Model identifier (see Supported models)
plan[].expected_chars  number  Expected output length in characters (drives output token estimate)
parallelism            number  Number of articles processed in parallel (default: 1)

Flags

Flag                Description
--json              Output JSON instead of pretty text
--auto              Same as --json (intended for tool integration)
--block-on-quota    Promote 80%+ quota warnings to block (exit 2)
--block-on-context  Promote 800K+ token context warnings to block (exit 2)
-h, --help          Show help
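The promotion behaviour of --block-on-quota can be pictured as a simple threshold check. Here is a minimal sketch (the function name is hypothetical, not part of the package API; the 80%/95% cut-offs mirror the defaults described under "How it works"):

```javascript
// Illustrative sketch of quota-threshold classification.
// Defaults: warn at >= 80% usage, block at >= 95%.
// With --block-on-quota, the 80% warning is promoted straight to a block.
function classifyQuota(pct, { blockOnQuota = false } = {}) {
  if (pct >= 95) return 'block';
  if (pct >= 80) return blockOnQuota ? 'block' : 'warn';
  return 'ok';
}

console.log(classifyQuota(19.3));                        // 'ok'
console.log(classifyQuota(85));                          // 'warn'
console.log(classifyQuota(85, { blockOnQuota: true }));  // 'block'
```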

Exit codes

Code  Meaning
0     OK — no warnings
1     WARN — warnings present, execution allowed
2     BLOCK — quota exhaustion / context overflow / hard parallelism cap reached

Supported models

Model                Provider                  Billing
opus, sonnet, haiku  Anthropic Claude          Subscription quota (Max plan) or pay-per-use API
gpt-5.5, gpt-5.4     ChatGPT Plus (Codex CLI)  5h subscription quota
gpt-4o, gpt-4o-mini  OpenAI API                Pay-per-use

Unknown models fall back to Sonnet-equivalent quota tracking.
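That fallback amounts to a defaulted lookup. A sketch (the function name is hypothetical; the quota figures are the max_100 defaults from the Configuration section below):

```javascript
// Illustrative model-quota lookup with the documented fallback:
// unknown model names resolve to Sonnet-equivalent quota tracking.
const MODEL_QUOTAS = { opus: 5500000, sonnet: 27500000, haiku: 55000000 };

function quotaFor(model) {
  return MODEL_QUOTAS[model] ?? MODEL_QUOTAS.sonnet;
}

console.log(quotaFor('opus'));          // 5500000
console.log(quotaFor('my-new-model'));  // 27500000 (Sonnet fallback)
```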

Configuration

Switching plan tier

Edit config/model-pricing.json:

{
  "user_plan": "max_200",   // "pro" | "max_100" | "max_200"
  ...
}

Customising quota figures

Anthropic does not publish exact numbers, so the bundled defaults are approximations. Tune them in config/model-pricing.json:

"max_100": {
  "tier_name": "Claude Max ($100)",
  "5h_token_quota": {
    "opus": 5500000,
    "sonnet": 27500000,
    "haiku": 55000000
  }
}

Adding a custom model

{
  "models": {
    "my-custom-model": {
      "input_per_1m": 5,
      "output_per_1m": 20,
      "currency": "USD"
    }
  }
}

Library usage

const { estimatePlan } = require('claude-subagent-budget/lib/token-estimator');
const { calcPlanCost } = require('claude-subagent-budget/lib/cost-calculator');
const { predictDuration } = require('claude-subagent-budget/lib/duration-predictor');
const { evaluateRisk } = require('claude-subagent-budget/lib/risk-evaluator');

const plan = [
  { agent: 'writer',       task: 'article', model: 'opus',    expected_chars: 5500 },
  { agent: 'codex-rescue', task: 'fc',      model: 'gpt-5.5', expected_chars: 5500 },
];
const parallelism = 6;

const tokens = estimatePlan(plan, parallelism);
const cost = calcPlanCost(plan, tokens.per_agent, parallelism);
const duration = predictDuration(plan, parallelism);
const risk = evaluateRisk({ tokens: tokens.totals, cost, parallelism, plan }, {
  blockOnQuota: true,
  blockOnContext: true,
});

if (risk.block) {
  console.error(`BLOCKED: ${risk.block_reason}`);
  process.exit(2);
}

How it works

Plan JSON
   │
   ▼
┌──────────────────────────────┐
│  token-estimator             │  Japanese chars × 1.5 + agent-specific output factors
├──────────────────────────────┤
│  cost-calculator             │  Anthropic quota / ChatGPT quota / USD reference
├──────────────────────────────┤
│  duration-predictor          │  per-agent runtime × ceil(parallelism / 5) waves
├──────────────────────────────┤
│  risk-evaluator              │  warn at 80% / block at 95% per quota
└──────────────┬───────────────┘
               ▼
        Pretty / JSON output
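The wave arithmetic in the duration-predictor box can be reproduced by hand. A sketch (the function names are illustrative; the 28 min per-article figure is taken from the sample output above, and the wave size of 5 from the diagram):

```javascript
// Duration via wave-splitting: concurrency is capped per wave, so
// 12 parallel articles at a wave size of 5 run as ceil(12 / 5) = 3 waves.
function waves(parallelism, waveSize = 5) {
  return Math.ceil(parallelism / waveSize);
}

function medianDurationMin(parallelism, perArticleMin) {
  return waves(parallelism) * perArticleMin;
}

console.log(waves(12));                  // 3
console.log(medianDurationMin(12, 28));  // 84 (matches the sample output)
```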

Limitations

  • Token estimates are heuristic. They assume Japanese-text input; tune CHARS_PER_TOKEN in lib/token-estimator.js for English-heavy workloads.
  • Anthropic Max plan quota figures are approximations; calibrate from observed usage.
  • Per-agent runtime values are based on observed Claude Code behavior with default plans/skills. Override via agentRuntimes argument to predictDuration().
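The character-to-token heuristic mentioned above is essentially a one-line conversion. A sketch of the idea (the exact formula in lib/token-estimator.js may differ, and the package's own tunable is named CHARS_PER_TOKEN; the factors here are illustrative, with the ×1.5 Japanese rule taken from the How-it-works diagram and ~4 characters per token being a common rule of thumb for English):

```javascript
// Heuristic token estimate from character counts. Japanese text yields
// roughly 1.5 tokens per character; English averages ~4 characters per
// token, so English-heavy workloads need a much smaller factor.
const TOKENS_PER_CHAR_JA = 1.5;  // illustrative default for Japanese text
const TOKENS_PER_CHAR_EN = 0.25; // ~4 characters per token in English

function estimateTokens(chars, factor = TOKENS_PER_CHAR_JA) {
  return Math.round(chars * factor);
}

console.log(estimateTokens(5500));                      // 8250
console.log(estimateTokens(5500, TOKENS_PER_CHAR_EN));  // 1375
```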

Releasing

Publishing is fully automated via GitHub Actions on tag push.

One-time setup (maintainer)

  1. Create an npm Automation Token at https://www.npmjs.com/settings/<your-npm-username>/tokens (scope: Automation).
  2. Add it to the repo as a secret named NPM_TOKEN (Settings → Secrets and variables → Actions → New repository secret).

Each release

# Bump version in package.json
npm version patch     # or minor / major

# Push the commit + the tag created by npm version
git push origin main
git push origin v0.1.1

The Publish to npm workflow runs automatically on v* tag push:

  1. Runs smoke tests (test/run.js)
  2. Verifies tag matches package.json version
  3. Publishes to npm with provenance (--provenance)

You can also dry-run the publish from the GitHub Actions UI (Run workflow → enable "dry_run").

CI

The Test workflow runs the smoke tests on every push to main and every PR, across Node 18/20/22 on Ubuntu/macOS/Windows.

License

MIT — see LICENSE.

Contributing

Pull requests welcome. The core surface is small (~1,000 lines, 5 files). Useful extensions:

  • More model-pricing presets (e.g. provider-specific tiers)
  • Real-time quota API integration (when Anthropic/OpenAI expose it)
  • Historical run log → automatic recalibration of runtime/token defaults