TokenFence — Cost Circuit Breaker for AI Agents

Stop runaway AI agent costs with per-workflow budget caps, automatic model downgrade, and a hard kill switch.

Quick Start

npm install tokenfence

OpenAI

import { guard } from "tokenfence";
import OpenAI from "openai";

const client = guard(new OpenAI(), {
  budget: "$0.50",           // Max spend for this workflow
  fallback: "gpt-4o-mini",  // Auto-downgrade at 80% budget
  onLimit: "stop",           // Graceful stop at budget cap
});

// Use exactly like your normal OpenAI client
const res = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Analyze this data..." }],
});

// Check spend anytime
console.log(`Spent: $${client.tokenfence.spent.toFixed(4)}`);
console.log(`Remaining: $${client.tokenfence.remaining.toFixed(4)}`);

Anthropic

import { guard } from "tokenfence";
import Anthropic from "@anthropic-ai/sdk";

const client = guard(new Anthropic(), {
  budget: "$1.00",
  fallback: "claude-3-haiku-20240307",
  onLimit: "stop",
});

const res = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Summarize this document..." }],
});

How It Works

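Budget Tracking: Every API call is metered against the workflow budget using real model pricing.
Auto-Downgrade: At 80% of the budget (configurable via `threshold`), calls switch to your fallback model.
Kill Switch: At 100%, the `onLimit` behaviour applies. The default, "stop", blocks further calls and returns a synthetic response.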
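
The three steps above can be sketched in a few lines. This is an illustration of the idea only, not TokenFence's implementation: the `BudgetTracker` class, the `Decision` type, the model names, and the prices in `PRICES` are all hypothetical.

```typescript
// Hypothetical per-million-token prices in USD; illustrative values only.
type Price = { inputPerM: number; outputPerM: number };

const PRICES: Record<string, Price> = {
  "big-model": { inputPerM: 2.5, outputPerM: 10 },
  "small-model": { inputPerM: 0.15, outputPerM: 0.6 },
};

type Decision = { action: "call"; model: string } | { action: "block" };

// Hypothetical tracker; not part of the tokenfence package.
class BudgetTracker {
  spent = 0;
  readonly budget: number;
  readonly fallback?: string;
  readonly threshold: number;

  constructor(budget: number, fallback?: string, threshold = 0.8) {
    this.budget = budget;
    this.fallback = fallback;
    this.threshold = threshold;
  }

  get remaining(): number {
    return Math.max(0, this.budget - this.spent);
  }

  // Step 1: meter a finished call from its reported token usage.
  record(model: string, inputTokens: number, outputTokens: number): void {
    const price = PRICES[model];
    if (!price) throw new Error(`No price known for model: ${model}`);
    this.spent +=
      (inputTokens * price.inputPerM + outputTokens * price.outputPerM) / 1_000_000;
  }

  // Steps 2 and 3: decide what happens to the next call.
  next(requestedModel: string): Decision {
    if (this.spent >= this.budget) return { action: "block" };
    if (this.fallback && this.spent >= this.budget * this.threshold) {
      return { action: "call", model: this.fallback };
    }
    return { action: "call", model: requestedModel };
  }
}
```

In this sketch the downgrade and the block are both decided before a call is sent, so a single expensive call can still push spend past the cap. Whether TokenFence estimates cost ahead of the call is not stated here.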

Options

Option	Type	Default	Description
`budget`	`string | number`	required	Max spend (`"$0.50"` or `0.50`)
`fallback`	`string`	`undefined`	Model to downgrade to
`onLimit`	`"stop" | "warn" | "raise"`	`"stop"`	Behaviour at budget cap
`threshold`	`number`	`0.8`	Budget fraction to trigger downgrade
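
Since `budget` accepts either a dollar string or a number, both forms presumably normalise to the same value. A minimal sketch of that normalisation, assuming a leading `$` is the only decoration allowed and that non-positive budgets are rejected (`parseBudget` is a hypothetical helper, not a documented TokenFence export):

```typescript
// Hypothetical helper: normalise a budget option to a number of US dollars.
function parseBudget(budget: string | number): number {
  const value =
    typeof budget === "number"
      ? budget
      : Number(budget.trim().replace(/^\$/, "")); // strip one leading "$"
  // Assumption: zero, negative, and non-numeric budgets are treated as errors.
  if (!Number.isFinite(value) || value <= 0) {
    throw new RangeError(`Invalid budget: ${JSON.stringify(budget)}`);
  }
  return value;
}
```

Under these assumptions, `parseBudget("$0.50")` and `parseBudget(0.50)` both yield `0.5`.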

onLimit Modes

"stop" — Returns a synthetic response (no API call). Your code keeps running.
"warn" — Logs a warning, allows the call through anyway.
"raise" — Throws BudgetExceeded error.

Supported Models

OpenAI (GPT-4o, GPT-4o-mini, GPT-4, o1, o3-mini, GPT-5.4), Anthropic (Claude Opus 4, Sonnet 4, 3.7, 3.5, Haiku), Google Gemini (2.5, 2.0, 1.5), DeepSeek, and more.

Free Tier & Pricing

The free Hobby tier includes 50K tracked requests per month. Paid tiers raise that limit for production workloads:

Tier	Requests	Price
Hobby	50K/mo	Free
Pro	500K/mo	$49/mo
Team	2M/mo	$149/mo

→ Upgrade to Pro at tokenfence.dev — 7-day free trial, no credit card required to start.

License

MIT