
dario

A local LLM router. One endpoint on your machine, every provider behind it, your tools don't need to change.


Quick Start · Who it's for · Backends · Why switch · Trust · FAQ


What it is

Dario runs on your machine and gives every tool a single local URL that reaches every LLM you use. Point Cursor, Continue, Aider, LiteLLM, your own scripts — anything that speaks the Anthropic or OpenAI API — at http://localhost:3456, and dario routes each request to the right backend:

  • OpenAI — your API key, routed to api.openai.com straight through.
  • Any OpenAI-compat endpoint — OpenRouter, Groq, a local LiteLLM, Ollama's openai-compat mode, self-hosted vLLM. Set the backend's baseUrl once, done.
  • Claude Max / Pro subscriptions — OAuth-backed, billed against your plan instead of API pricing. Multi-account pooling if you have more than one.

Your tool sees one base URL. gpt-4o goes to OpenAI. llama-3-70b goes to Groq. claude-opus-4-6 goes to your Claude subscription. None of your tools have to know about any of it.

Backends are plugins, not the product. Dario's job is the one local endpoint your tools point at. Each backend is a swappable adapter behind it — when a provider ships, a backend entry lands, your tools don't change. That's the durable part.

No account with dario is required. Claude-only dario works with nothing but dario login; multi-backend dario works with nothing but local config files. Nothing phones home. Zero runtime dependencies. ~2,000 lines of TypeScript.


Who this is for

Best fit:

  • Developers using multiple LLMs across multiple tools who are tired of juggling base URLs, API keys, and per-tool provider configs.
  • Teams running local or hosted OpenAI-compat servers (LiteLLM, vLLM, Ollama, Groq, OpenRouter) who want one stable local endpoint in front of them that every tool can reuse.
  • Anyone who wants to switch providers without reconfiguring every tool — change the model name in your tool, dario picks a different backend, your tool keeps working.
  • Claude Max or Pro subscribers who want their subscription usable anywhere that speaks the Anthropic or OpenAI API — without paying API rates for every request.
  • Power users running multi-agent workloads on Claude subscriptions who want multi-account pooling with headroom-aware routing on their own machine, against their own subscriptions, without a hosted platform.

Not a fit:

  • You need vendor-managed production SLAs on every request. Use the provider APIs directly.
  • You need a hosted multi-tenant routing platform with a dashboard. Try askalf, a separate product in the same family — different problem, different tool.
  • You want a chat UI. Use claude.ai or chatgpt.com.

First use case

I install dario, point every tool I already use at http://localhost:3456, and every LLM I have access to works through that one URL.

Flow on a fresh machine:

# Install
npm install -g @askalf/dario

# Optional: log in to your Claude subscription (Max or Pro)
dario login

# Optional: add an OpenAI-compat backend
dario backend add openai --key=sk-proj-...

# Start the proxy
dario proxy

# Use it — set these once, every tool that honors them just works
export ANTHROPIC_BASE_URL=http://localhost:3456
export ANTHROPIC_API_KEY=dario
export OPENAI_BASE_URL=http://localhost:3456/v1
export OPENAI_API_KEY=dario

Now from the same Cursor/Continue/Aider instance:

  • gpt-4o → OpenAI, your key, straight through
  • claude-opus-4-6 → Claude subscription, billed against your Max plan
  • opus → shortcut, same as above
  • llama-3.1-70b on OpenRouter → configure dario backend add openrouter --key=sk-or-... --base-url=https://openrouter.ai/api/v1, done

One URL. Your tool doesn't know or care which provider is answering.


Why switch

Use dario if you use more than one LLM provider, or more than one tool, or both — and you're tired of configuring each tool with a different base URL and API key per provider.

Use dario if you want provider independence. Switching from GPT-4o to Claude to Llama is a model-name change in your tool, not a reconfigure of every SDK and base URL you've got.

Use dario if you pay for Claude Max or Pro and you want that subscription reachable from every tool on your machine, without paying API rates or opening a second billing surface.

Use dario pool mode if you're running multi-agent workloads on Claude subscriptions and hitting per-account rate limits. Add 2–N accounts with dario accounts add and dario routes across them by per-account headroom, all on your machine, against your own subscriptions. See Multi-Account Pool Mode.

Use a provider API directly if you need vendor-managed production SLAs or high-scale orchestration primitives the providers ship themselves. Dario isn't trying to replace their APIs — it's trying to put one local shim in front of all of them so your tools don't care which is which.

Don't use dario if you want a subprocess bridge that shells out to claude --print under the hood (openclaw-claude-bridge and similar). That's a valid answer for single-team single-machine workloads that can accept a one-subscription rate ceiling and a one-machine deployment — different tradeoffs, different tool.


Quick Start

# Install
npm install -g @askalf/dario

# Claude subscription path (detects Claude Code credentials if CC is installed,
# runs its own OAuth flow otherwise)
dario login

# OpenAI or any OpenAI-compat provider (optional, additive)
dario backend add openai --key=sk-proj-...

# Start the proxy
dario proxy

# Point anything that speaks the Anthropic or OpenAI API at localhost:3456
export ANTHROPIC_BASE_URL=http://localhost:3456
export ANTHROPIC_API_KEY=dario
export OPENAI_BASE_URL=http://localhost:3456/v1
export OPENAI_API_KEY=dario

Opus, Sonnet, Haiku, GPT-4o, o1, o3, o4, plus anything the configured OpenAI-compat backend serves. Streaming, tool use, prompt caching, and extended thinking are all supported. Zero runtime dependencies. When Bun is available, dario auto-relaunches under it so the Claude-path TLS fingerprint matches Claude Code's runtime.


Backends

Dario's routing is organized around backends, each with its own auth and its own target. Backends are swappable adapters — add one, your tools reach it at localhost:3456 with whatever API shape they already speak. v3.6.0 ships two backends, with more coming.

1. Claude subscription backend (built in)

OAuth-backed Claude Max / Pro, billed against your plan instead of the API. Activated by dario login.

What it does:

  • Every request is replaced with a Claude Code template before it goes upstream — 25 tool definitions, 25KB system prompt, exact CC field order, exact beta headers, exact metadata structure. Only the conversation content is preserved. Anthropic's classifier sees what looks like a Claude Code session because, from the wire up, it is one — and that's what keeps your usage on subscription billing instead of Extra Usage.
  • Billing tag reconstructed using CC's own algorithm: x-anthropic-billing-header: cc_version=<version>.<build_tag>; cc_entrypoint=cli; cch=<5-char-hex>; where build_tag = SHA-256(seed + chars[4,7,20] of user message + version).slice(0,3).
  • OAuth config auto-detected from the installed CC binary at startup. When Anthropic rotates client_id, authorize URL, or scopes, dario picks up the new values on the next run without needing a release.
  • Multi-account pool mode — see below. Automatic when 2+ accounts are configured.
  • Framework scrubbing — known fingerprint tokens (OpenClaw, sessions_* prefixes, orchestration tags) stripped from system prompt and message content before the request leaves your machine.
  • Bun auto-relaunch — when Bun is installed, dario relaunches under it so the TLS fingerprint matches CC's runtime. Without Bun, dario runs on Node.js.
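The billing-tag reconstruction described above can be sketched in Python. This is illustrative only: the seed value and short-message edge cases are dario internals not specified here, and the function names are hypothetical.

```python
import hashlib

def build_tag(seed: str, user_message: str, version: str) -> str:
    # build_tag = SHA-256(seed + chars[4,7,20] of user message + version).slice(0,3)
    # Guarding short messages is an assumption; the real handling isn't documented here.
    picked = "".join(user_message[i] for i in (4, 7, 20) if i < len(user_message))
    digest = hashlib.sha256((seed + picked + version).encode()).hexdigest()
    return digest[:3]

def billing_header(version: str, seed: str, user_message: str, cch: str) -> str:
    # cc_version=<version>.<build_tag>; cc_entrypoint=cli; cch=<5-char-hex>
    return (
        f"cc_version={version}.{build_tag(seed, user_message, version)}; "
        f"cc_entrypoint=cli; cch={cch}"
    )
```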

Passthrough mode (dario proxy --passthrough) does an OAuth swap and nothing else — no template, no identity, no scrubbing. Use it when the upstream tool already builds a Claude-Code-shaped request on its own and you just need the token auth.

Detection scope. The Claude backend is a per-request layer. Template replay and scrubbing are designed to be indistinguishable from Claude Code at the request level. What they cannot defend against is Anthropic's session-level behavioral classifier, which operates on cumulative per-OAuth aggregates (token throughput, conversation depth, streaming duration, inter-arrival timing). The practical answer to that is pool mode — distributing load across multiple subscriptions so no one account accumulates enough signal to trip anything. See the FAQ entry for the full mechanism.

2. OpenAI-compat backend (v3.6.0+)

Any provider that speaks the OpenAI Chat Completions API. Activated by:

# OpenAI itself (default base URL)
dario backend add openai --key=sk-proj-...

# Groq
dario backend add groq --key=gsk_... --base-url=https://api.groq.com/openai/v1

# OpenRouter
dario backend add openrouter --key=sk-or-... --base-url=https://openrouter.ai/api/v1

# Local LiteLLM / vLLM / Ollama openai-compat mode
dario backend add local --key=anything --base-url=http://127.0.0.1:4000/v1

Credentials live at ~/.dario/backends/<name>.json with mode 0600.
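Writing a credentials file with those permissions looks roughly like this (hypothetical helper; the JSON layout inside the file isn't specified here):

```python
import json
import os

def save_backend(name: str, config: dict, root: str = "~/.dario/backends") -> str:
    # Create the file with owner-only permissions (0600) from the start,
    # rather than chmod-ing after a world-readable write.
    path = os.path.join(os.path.expanduser(root), f"{name}.json")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(config, f)
    return path
```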

How it routes. When the OpenAI-compat backend is configured, each request at /v1/chat/completions is checked:

  • gpt-*, o1-*, o3-*, o4-*, chatgpt-*, text-davinci-*, text-embedding-* → OpenAI-compat backend
  • claude-* (or the opus / sonnet / haiku shortcuts) → Claude subscription backend
  • anything else → Claude backend with OpenAI-compat translation
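A minimal sketch of that routing check, assuming the patterns listed above (function and return names are illustrative, not dario's actual internals):

```python
import re

# Model-name patterns that select the OpenAI-compat backend.
OPENAI_MODEL = re.compile(r"^(gpt-|o1-|o3-|o4-|chatgpt-|text-davinci-|text-embedding-)")
CLAUDE_SHORTCUTS = {"opus", "sonnet", "haiku"}

def route(model: str) -> str:
    if OPENAI_MODEL.match(model):
        return "openai-compat"
    if model.startswith("claude-") or model in CLAUDE_SHORTCUTS:
        return "claude"
    # Everything else goes to the Claude backend with OpenAI-compat translation.
    return "claude+translation"
```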

Dario's passthrough for the OpenAI-compat backend is literal: client request body goes upstream as-is, only the Authorization header is swapped for the configured API key and the URL is pointed at baseUrl + /chat/completions. Response body streams back unchanged.

Coming in a follow-up

  • Anthropic → OpenAI request translation for /v1/messages requests with GPT-family model names (tool_use format, streaming delta conversion).
  • Multiple simultaneous openai-compat backends with per-model routing rules (gpt-* → OpenAI, llama-* → Groq, mixtral-* → OpenRouter).
  • Fallback rules. "If Claude 429s, use Gemini." v3.6.0 ships the routing plumbing; fallback logic layers on top.

Multi-Account Pool Mode

New in v3.5.0, for the Claude subscription backend. Dario can manage multiple Claude subscriptions and route each request to the account with the most headroom. Single-account Claude dario is unchanged — pool mode activates only when ~/.dario/accounts/ contains 2+ accounts.

dario accounts add work
dario accounts add personal
dario accounts add side-project
dario accounts list
dario proxy

Each request picks the account with the highest headroom:

headroom = 1 - max(util_5h, util_7d)

The response's anthropic-ratelimit-unified-* headers are parsed back into the pool so the next selection sees fresh utilization. An account that returns a 429 is marked rejected and routed around until its window resets. When every account is exhausted, requests queue for up to 60 seconds waiting for headroom to reappear.

Accounts can mix plans — Max and Pro accounts can sit in the same pool; dario doesn't care about tier, only headroom.
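The selection rule can be sketched as follows (field names and the dict shape are illustrative, not dario's actual internals):

```python
def headroom(util_5h: float, util_7d: float) -> float:
    # headroom = 1 - max(util_5h, util_7d), per the formula above
    return 1.0 - max(util_5h, util_7d)

def pick_account(accounts):
    # Skip accounts marked rejected (a prior 429) until their window resets.
    live = [a for a in accounts if not a.get("rejected")]
    if not live:
        return None  # all exhausted; the real proxy queues up to 60 seconds
    return max(live, key=lambda a: headroom(a["util_5h"], a["util_7d"]))
```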

Pool inspection endpoints:

curl http://localhost:3456/accounts     # per-account utilization, claim, status
curl http://localhost:3456/analytics    # per-account / per-model stats, burn rate, exhaustion predictions

Scope. v3.5.0 ships headroom-aware selection across requests — a 429 on one request marks the account rejected and the next request goes to a different one. Retrying a single in-flight request against a different account when that request 429s (inside-request failover) ships in v3.5.1 along with analytics recording wiring.


Commands

  • dario login: Log in to the Claude backend (detects CC credentials or runs its own OAuth flow)
  • dario proxy: Start the local API proxy on port 3456
  • dario status: Show Claude backend OAuth token health and expiry
  • dario refresh: Force an immediate Claude token refresh
  • dario logout: Delete stored Claude credentials
  • dario accounts list: List accounts in the multi-account pool
  • dario accounts add <alias>: Add a Claude account to the pool (runs OAuth flow)
  • dario accounts remove <alias>: Remove an account from the pool
  • dario backend list: List configured OpenAI-compat backends
  • dario backend add <name> --key=<key> [--base-url=<url>]: Add an OpenAI-compat backend
  • dario backend remove <name>: Remove an OpenAI-compat backend
  • dario shim -- <cmd> [args...]: Experimental (v3.12.0). Run a child process with an in-process fetch patch that rewrites its outbound Anthropic requests — no HTTP proxy involved. See Experimental: Shim mode.
  • dario help: Full command reference

Proxy options

  • --passthrough / --thin: Thin proxy for the Claude backend — OAuth swap only, no template injection. Default: off.
  • --preserve-tools / --keep-tools: Keep client tool schemas instead of remapping to CC's Bash/Read/Grep/Glob/WebSearch/WebFetch. Required for clients whose tools have fields CC doesn't (sessionId, custom ids, etc.) — see Custom tool schemas. Trade-off: drops the CC request fingerprint. Default: off.
  • --hybrid-tools / --context-inject: Remap to CC tools and inject request-context values (sessionId, requestId, channelId, userId, timestamp) into client-declared fields CC's schema doesn't carry. Preserves the CC fingerprint while keeping custom schemas functional — see Hybrid tool mode. Mutually exclusive with --preserve-tools. Default: off.
  • --model=<name>: Force a model. Shortcuts (opus, sonnet, haiku), full IDs (claude-opus-4-6), or a provider prefix (openai:gpt-4o, groq:llama-3.3-70b, claude:opus, local:qwen-coder) to force the backend server-wide. See Provider prefix. Default: passthrough.
  • --port=<n>: Port to listen on. Default: 3456.
  • --host=<addr> / DARIO_HOST: Bind address. Use 0.0.0.0 for LAN, or a specific IP (e.g. a Tailscale interface). When non-loopback, also set DARIO_API_KEY. Default: 127.0.0.1.
  • --verbose / -v: Log every request. Default: off.
  • DARIO_API_KEY: If set, all endpoints (except /health) require a matching x-api-key or Authorization: Bearer header. Required when --host binds non-loopback. Default: unset (open).
  • DARIO_CORS_ORIGIN: Override browser CORS origin. Default: http://localhost:${port}.
  • DARIO_NO_BUN: Disable automatic Bun relaunch. Default: unset.
  • DARIO_MIN_INTERVAL_MS: Minimum ms between Claude-backend requests (rate governor). Default: 500.
  • DARIO_CC_PATH: Override path to the Claude Code binary for OAuth detection. Default: auto-detect.
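The DARIO_API_KEY gate amounts to roughly the following check (a sketch under the header rules above, not dario's literal code):

```python
def authorized(headers: dict, api_key) -> bool:
    # When DARIO_API_KEY is unset, all endpoints are open.
    if api_key is None:
        return True
    # Accept either x-api-key or an Authorization: Bearer header.
    supplied = headers.get("x-api-key")
    if supplied is None:
        auth = headers.get("authorization", "")
        if auth.lower().startswith("bearer "):
            supplied = auth[7:].strip()
    return supplied == api_key
```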

Usage

Python (Anthropic SDK)

import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:3456",
    api_key="dario",
)

msg = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(msg.content[0].text)

Python (OpenAI SDK — same proxy, different provider)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3456/v1",
    api_key="dario",
)

# gpt-4o routes to the configured OpenAI backend
msg = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

# claude-opus-4-6 routes to the Claude subscription backend — same SDK, same URL
claude_msg = client.chat.completions.create(
    model="claude-opus-4-6",
    messages=[{"role": "user", "content": "Hello!"}],
)

TypeScript / Node.js

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "http://localhost:3456",
  apiKey: "dario",
});

const msg = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});

OpenAI-compatible tools (Cursor, Continue, Aider, LiteLLM, …)

export OPENAI_BASE_URL=http://localhost:3456/v1
export OPENAI_API_KEY=dario

Any tool that accepts an OpenAI base URL works. Use Claude model names (claude-opus-4-6, opus, sonnet, haiku) for the Claude backend, or GPT-family names for the configured OpenAI-compat backend.

curl

# Claude backend via Anthropic format
curl http://localhost:3456/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model":"claude-opus-4-6","max_tokens":1024,"messages":[{"role":"user","content":"Hello!"}]}'

# OpenAI backend via OpenAI format
curl http://localhost:3456/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dario" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello!"}]}'

Streaming, tool use, prompt caching, extended thinking

All supported. Claude backend: full Anthropic SSE format plus OpenAI-SSE translation for tool_use streaming. OpenAI-compat backend: streaming body forwarded byte-for-byte.

Provider prefix

Any request's model field can be written as <provider>:<name> to force which backend handles it, regardless of what the model name looks like. This is useful when regex-based routing (gpt-* → OpenAI, claude-* → Claude) doesn't match — for example when routing a llama-3.3-70b request through an OpenAI-compat backend, or when you want the same model name to go to different providers on different requests.

Recognized prefixes:

  • openai:, groq:, openrouter:, local:, compat: → the configured OpenAI-compat backend
  • claude:, anthropic: → Claude subscription backend

Examples:

# Force openai backend
curl http://localhost:3456/v1/chat/completions \
  -H "Authorization: Bearer dario" \
  -d '{"model":"openai:gpt-4o","messages":[{"role":"user","content":"hi"}]}'

# Force a non-gpt model through the openai-compat backend (e.g. OpenRouter)
curl http://localhost:3456/v1/chat/completions \
  -H "Authorization: Bearer dario" \
  -d '{"model":"openrouter:meta-llama/llama-3.1-70b-instruct","messages":[...]}'

# Force Claude subscription backend — same as `opus` shortcut but explicit
curl http://localhost:3456/v1/messages \
  -d '{"model":"claude:opus","max_tokens":1024,"messages":[...]}'

The prefix gets stripped before the request goes upstream — the backend only sees the bare model name. Unrecognized prefixes are ignored, so ollama-style llama3:8b passes through untouched.
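The prefix handling can be sketched as (illustrative names; the real matcher lives inside dario):

```python
KNOWN_PREFIXES = {"openai", "groq", "openrouter", "local", "compat", "claude", "anthropic"}

def split_model(model: str):
    # A recognized prefix forces a backend; only the bare name goes upstream.
    prefix, sep, rest = model.partition(":")
    if sep and prefix in KNOWN_PREFIXES:
        return prefix, rest
    # Unrecognized prefixes (e.g. ollama-style "llama3:8b") pass through untouched.
    return None, model
```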

Server-wide override. dario proxy --model=openai:gpt-4o applies the prefix to every request, regardless of what the client sends. Useful for "I want everything routed to this specific backend and model" without editing every tool's config.

Custom tool schemas

By default, on the Claude backend, dario replaces your client's tool definitions with the real Claude Code tools (Bash, Read, Grep, Glob, WebSearch, WebFetch) and translates parameters back and forth. That's how dario looks like CC on the wire, which is what lets your request bill against your Claude subscription instead of API pricing.

The trade-off: if your client's tools carry fields CC's schema doesn't have — a sessionId, a custom request id, a channel-bound context token, anything — those fields don't survive the round trip. The model only ever sees Bash({command}), responds with Bash({command}), and dario's reverse map rebuilds your tool call without the fields the model never saw. Your validator then rejects the call for a missing required field.

Symptom: your tool calls come back looking stripped-down, or your runtime complains about a required field being absent only when routed through dario's Claude backend, while the same tools work fine against a direct API key or the OpenAI-compat backend.

Fix: run dario with --preserve-tools (or --keep-tools). That skips the CC tool remap entirely, passes your client's tool definitions through to the model unchanged, and lets the model populate every field your schema expects.

dario proxy --preserve-tools

The cost: requests no longer look like CC on the wire, so the CC subscription fingerprint is gone. On a Max/Pro plan, that means the request may be counted against your API usage rather than your subscription quota. If you're on API-key billing already, --preserve-tools is free; if you're using dario specifically to route against a subscription, the hybrid tool mode below is the compromise that keeps both.

The openai-compat backend (OpenRouter, OpenAI, Groq, local LiteLLM, etc.) is unaffected — it forwards tool definitions byte-for-byte and doesn't need this flag.

Hybrid tool mode

For the very common case where the "missing" fields on your client's tool are request context (sessionId, requestId, channelId, userId, timestamp), dario can remap to CC tools and inject those values on the reverse path. The fingerprint stays intact, the model still sees only CC's tools (so subscription billing still routes), and your validator still sees the fields it requires because dario fills them from request headers on the way back.

dario proxy --hybrid-tools

How it works. On each request, dario builds a RequestContext from headers (x-session-id, x-request-id, x-channel-id, x-user-id) plus its own generated ids and the current timestamp. After translateBack produces the client-shaped tool call on the response path, any field declared on the client's tool schema whose name matches a known context field (sessionId/session_id, requestId/request_id, channelId/channel_id, userId/user_id, timestamp/created_at/createdAt) and isn't already populated gets filled from the context. Fields the model genuinely populated via translateBack are never overwritten.
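The reverse-path fill can be sketched as follows. The alias list mirrors the one above; the function names and dict shapes are illustrative, not dario's actual internals:

```python
# Field-name aliases mapped to their context key, per the list above.
CONTEXT_ALIASES = {
    "sessionId": "session_id", "session_id": "session_id",
    "requestId": "request_id", "request_id": "request_id",
    "channelId": "channel_id", "channel_id": "channel_id",
    "userId": "user_id", "user_id": "user_id",
    "timestamp": "timestamp", "created_at": "timestamp", "createdAt": "timestamp",
}

def fill_context(tool_args: dict, declared_fields: set, context: dict) -> dict:
    # Only declared-but-unpopulated fields get filled;
    # values the model genuinely populated are never overwritten.
    for field in declared_fields:
        key = CONTEXT_ALIASES.get(field)
        if key and field not in tool_args:
            tool_args[field] = context[key]
    return tool_args
```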

When to use which flag.

  • Your custom fields are request context (session/request/channel/user ids, timestamps): use --hybrid-tools. Keeps the CC fingerprint and your validator is satisfied.
  • Your custom fields need the model's reasoning (e.g. confidence, reasoning_trace, tool_selection_rationale): use --preserve-tools. The model has to see the real schema to populate these; accept the fingerprint loss.
  • Your client's tools are already a subset of CC's Bash/Read/Grep/Glob/WebSearch/WebFetch: use neither. Default mode works as-is.

Limitations of hybrid mode.

  • Top-level fields only. If your custom field is nested (e.g. meta: {sessionId: ...}), v1 doesn't reach into the nested object. Tracked in #33.
  • The field-to-context mapping is a fixed list. If you need arbitrary fields (e.g. an internal tenant_id) pulled from headers, file an issue and we'll extend the map.
  • No type coercion beyond string. If your schema requires a numeric sessionId, dario sends the string it got from headers — override at your client level or use --preserve-tools.

Hybrid mode was built to resolve #29 cleanly for OpenClaw-style agents whose process tool declares sessionId, after the full provider-comparison diagnostic from @boeingchoco made clear that the problem wasn't fixable in the translation layer alone.

Library mode

import { startProxy, getAccessToken, getStatus, listBackends } from "@askalf/dario";

await startProxy({ port: 3456, verbose: true });
const token = await getAccessToken();
const status = await getStatus();
const backends = await listBackends();

Health check

curl http://localhost:3456/health

Experimental: Shim mode

New in v3.12.0. Opt-in. The default path is still the HTTP proxy — shim mode is a second transport, not a replacement.

Shim mode runs a child process with an in-process globalThis.fetch patch that rewrites the child's outbound requests to api.anthropic.com/v1/messages exactly the way the proxy would, then sends them directly from the child to Anthropic. No localhost HTTP hop. No port to bind. No ANTHROPIC_BASE_URL to set.

dario shim -- claude --print "hello"
dario shim -v -- claude --print "hello"        # verbose

Under the hood: dario shim spawns the child with NODE_OPTIONS=--require <dario-runtime.cjs> and a unix socket / named pipe for telemetry. The runtime patches globalThis.fetch only for Anthropic messages requests, applies the same template replay the proxy does (system prompt, tools, user agent, beta flags), and relays per-request events back to the parent so analytics still work. Every other fetch call in the child is untouched, and on any internal error the patch fails safe and passes the request through unmodified.
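The spawn side can be sketched as a hypothetical helper showing only the NODE_OPTIONS preload, not the telemetry socket or the runtime itself:

```python
import os

def shim_env(runtime_path: str, base_env=None) -> dict:
    # Preload a CJS module into any Node child via NODE_OPTIONS=--require,
    # appending to any NODE_OPTIONS the caller already set.
    env = dict(base_env if base_env is not None else os.environ)
    existing = env.get("NODE_OPTIONS", "")
    env["NODE_OPTIONS"] = (existing + f" --require {runtime_path}").strip()
    return env
```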

When to use shim mode

  • Running a single CC instance on a locked-down machine where binding a local port is inconvenient or forbidden.
  • Wrapping one-off scripts (dario shim -- node my-agent.js) without setting up environment variables.
  • Debugging a specific child process in isolation — verbose logs are scoped to that process.

When to stay on the proxy (which is still the default)

  • Multi-client routing. The proxy serves every tool on the machine through one endpoint; the shim wraps one child at a time.
  • Multi-account pool mode. Pooling across subscriptions needs a shared OAuth pool the proxy owns — a shim patch inside one child can't see the pool state.
  • Anything that isn't a Node / Bun child. The shim relies on NODE_OPTIONS, so non-JS runtimes (Python SDK, a Go CLI) still need the proxy.

Limitations at v3.12.0:

  • Bun child detection is partial — known-good with claude --print on Node.
  • No --replace claude global wrapper yet; you call dario shim -- claude ... explicitly.
  • Per-request token cost recording in shim mode is still being wired into analytics.
  • Windows named-pipe CI coverage is incomplete.

The shim runtime lives at src/shim/runtime.cjs (hand-written CJS so --require can load it) and the host orchestrator at src/shim/host.ts. ~180 lines total. See the v3.12.0 release notes for the full design writeup.


Endpoints

  • POST /v1/messages: Anthropic Messages API (Claude backend)
  • POST /v1/chat/completions: OpenAI-compatible Chat API (routes by model name)
  • GET /v1/models: Model list (Claude models — OpenAI models come from the OpenAI backend directly)
  • GET /health: Proxy health + OAuth status + request count
  • GET /status: Detailed Claude OAuth token status
  • GET /accounts: Pool snapshot (pool mode only)
  • GET /analytics: Per-account / per-model stats, burn rate, exhaustion predictions (pool mode only). v3.11.1+: every request carries a billingBucket field (five_hour / seven_day / overage / unknown) so you can see, at a glance, which bucket each request billed against.

Trust & Transparency

Dario handles your OAuth tokens and API keys locally. Here's why you can trust it:

  • Source code: ~2,500 lines of TypeScript across 10 files — small enough to audit in one sitting.
  • Dependencies: 0 runtime dependencies. Verify with npm ls --production.
  • npm provenance: Every release is SLSA-attested via GitHub Actions.
  • Security scanning: CodeQL runs on every push and weekly.
  • Credential handling: Tokens and API keys are never logged, are redacted from errors, and are stored with 0600 permissions.
  • OAuth flow: PKCE (Proof Key for Code Exchange), no client secret.
  • Network scope: Binds to 127.0.0.1 by default. --host allows LAN/mesh with DARIO_API_KEY gating. Upstream traffic goes only to the configured backend target URLs over HTTPS.
  • SSRF protection: /v1/messages hits api.anthropic.com only; /v1/chat/completions hits the configured backend baseUrl only — hardcoded allowlist.
  • Telemetry: None. Zero analytics, tracking, or data collection.
  • Audit trail: CHANGELOG.md documents every release.

Verify the npm tarball matches this repo:

npm audit signatures
npm view @askalf/dario dist.integrity
cd $(npm root -g)/@askalf/dario && npm ls --production

FAQ

Does this violate Anthropic's terms of service? Dario's Claude backend uses your existing Claude Code credentials with the same OAuth tokens CC uses. It authenticates you as you, with your subscription, through Anthropic's official API endpoints.

What subscription plans work on the Claude backend? Claude Max and Claude Pro. Any plan that lets you use Claude Code.

Does it work with Team / Enterprise? Should work if your plan includes Claude Code access. Not widely tested yet — open an issue with results.

Do I need Claude Code installed? Recommended for the Claude backend, not strictly required. With CC installed, dario login picks up your credentials automatically. Without CC, dario runs its own OAuth flow against Anthropic's authorize endpoint.

Do I need Bun? Optional, recommended for Claude-backend requests. Dario auto-relaunches under Bun when available so the TLS fingerprint matches CC's runtime. Without Bun, dario runs on Node.js and works fine; the TLS fingerprint is the only difference.

First time setup on a fresh Claude account. If dario is the first thing you run against a brand-new Claude account, prime the account with a few real Claude Code commands first:

claude --print "hello"
claude --print "hello"

This establishes a session baseline. Without priming, brand-new accounts occasionally see billing classification issues on first use.

What happens when Anthropic rotates the OAuth config? Dario auto-detects OAuth config from the installed Claude Code binary. When CC ships a new version with rotated values, dario picks them up on the next run. Cache at ~/.dario/cc-oauth-cache-v3.json, keyed by the CC binary fingerprint. Falls back to hardcoded CC 2.1.104 prod values if CC isn't installed.

What happens when Anthropic changes the CC request template? New in v3.11.0. Dario extracts the live request template from your installed Claude Code binary on startup — the system prompt slices, tool schemas, user-agent, beta flags — and uses those to replay requests instead of a version pinned into dario itself. When CC ships a new version with a tweaked template, the next dario proxy run picks it up automatically. Fallback: the hand-curated src/cc-template-data.json bundled with the release, so dario still works even if the installed CC binary is a version the extractor doesn't know how to read. See src/live-fingerprint.ts.

I'm hitting rate limits on the Claude backend. What do I do? Claude subscriptions have rolling 5-hour and 7-day usage windows. Check utilization with Claude Code's /usage command or the statusline. For multi-agent workloads, add more accounts and let pool mode distribute the load: dario accounts add <alias>.

I'm seeing representative-claim: seven_day in my rate-limit headers instead of five_hour. Am I being downgraded to API billing?

No. You're still on subscription billing. Both five_hour and seven_day are the same subscription billing mode — they're just two different accounting buckets inside it.

Here's the full picture. Every Claude Max and Pro subscription has two rolling usage windows:

  • 5-hour window — your short-term usage bucket. Refreshes on a rolling 5-hour schedule. It's the one you'll see most of the time if you use Claude casually.
  • 7-day window — your longer-term usage bucket. Refreshes on a rolling 7-day schedule. It's intentionally larger than the 5-hour one so you can keep working past brief bursts of heavy usage.

When Anthropic bills a request, it decides which bucket to charge it against based on your current utilization. That decision comes back to you in the anthropic-ratelimit-unified-representative-claim response header:

  • five_hour: You're well inside your 5-hour window; billing against the short-term bucket.
  • seven_day: You've exhausted (or come close to exhausting) the 5-hour window for this rolling cycle, so Anthropic is now charging this request against the 7-day bucket. Still subscription billing. Still your plan. Not API pricing, not overage.
  • overage: Both subscription windows are effectively exhausted. This is where per-token Extra Usage charges kick in, if you've enabled Extra Usage on the account. If you haven't, you get 429'd instead.

Seeing seven_day is a healthy state. It means your Max/Pro plan is doing exactly what it's supposed to do: letting you keep working past short bursts of heavy use by absorbing them into the larger 7-day bucket. Your subscription is not being "downgraded." You're not being charged API rates. Nothing has reclassified you to a worse billing tier. When your 5-hour window rolls forward enough, the claim on new requests will go back to five_hour on its own.

What about overage? That's the state to watch. It means both windows are saturated and Anthropic is either billing you per-token under Extra Usage (if enabled) or refusing the request (if disabled). If you see this on a Claude Max account under normal use, it usually means (a) you're running a multi-agent workload that's genuinely outgrowing one subscription, or (b) Anthropic's session-level classifier has reclassified your long-running OAuth session as agentic load — see the next FAQ entry for the mechanism.
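The three claim states above reduce to a small decision table. A sketch in TypeScript: the header name is the real one documented above, while the helper functions and their wording are illustrative, not dario's code.

```typescript
// Illustrative classifier for the representative-claim header discussed above.
// The header name is real; these helpers are hypothetical, not dario's code.
type Claim = "five_hour" | "seven_day" | "overage";

function describeClaim(claim: Claim): string {
  switch (claim) {
    case "five_hour":
      return "short-term bucket, subscription billing";
    case "seven_day":
      return "long-term bucket, still subscription billing";
    case "overage":
      return "both windows saturated: Extra Usage per-token billing, or a 429 if disabled";
  }
}

// Reading the claim off a fetch Response:
function claimFrom(res: Response): Claim | null {
  const v = res.headers.get("anthropic-ratelimit-unified-representative-claim");
  return v === "five_hour" || v === "seven_day" || v === "overage" ? v : null;
}

console.log(describeClaim("seven_day")); // -> long-term bucket, still subscription billing
```

Only the overage branch is a state worth acting on; the other two are normal subscription operation.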

Checking where you stand. You can inspect your current utilization three ways:

  1. Claude Code's built-in command — run /usage inside a claude session. Shows both windows as percentages with reset times.
  2. The statusline — see Claude Code's statusline docs for a per-prompt readout.
  3. Dario's pool endpoint — curl http://localhost:3456/accounts when running pool mode. The returned snapshot includes util5h, util7d, and claim per account.
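For option 3, the JSON snapshot can be rendered as a quick terminal readout. A sketch assuming the field names mentioned above (util5h, util7d, claim); the exact response shape and the formatLine helper are assumptions, not dario's documented schema.

```typescript
// Assumed shape of one account entry in the /accounts pool snapshot.
// Field names follow the FAQ above; everything else is illustrative.
interface AccountSnapshot {
  alias: string;
  util5h: number; // 5-hour window utilization, 0..1
  util7d: number; // 7-day window utilization, 0..1
  claim: "five_hour" | "seven_day" | "overage";
}

function formatLine(a: AccountSnapshot): string {
  const pct = (x: number) => `${Math.round(x * 100)}%`;
  return `${a.alias}  5h=${pct(a.util5h)}  7d=${pct(a.util7d)}  claim=${a.claim}`;
}

// Print the whole pool (requires dario running in pool mode on the default port).
async function showPool(): Promise<void> {
  const res = await fetch("http://localhost:3456/accounts");
  const accounts = (await res.json()) as AccountSnapshot[];
  for (const a of accounts) console.log(formatLine(a));
}

console.log(formatLine({ alias: "work", util5h: 0.42, util7d: 0.07, claim: "five_hour" }));
// -> work  5h=42%  7d=7%  claim=five_hour
```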

Practical answer if seven_day is painful for your workload. Add more Claude subscriptions to the pool. Each account has its own independent 5-hour and 7-day windows, and dario pool mode will route each request to the account with the most headroom (1 - max(util5h, util7d)). With 2-3 accounts, you almost never see the seven_day bucket get touched because the router steers traffic to whichever account still has five_hour headroom. dario accounts add <alias>.
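That headroom rule fits in a few lines. A sketch: the formula is the one stated above, while the account shape and function names are illustrative, not dario's actual types.

```typescript
// Headroom-aware account selection, per the formula above:
// headroom = 1 - max(util5h, util7d); route to the account with the most of it.
// The PoolAccount shape is illustrative, not dario's actual type.
interface PoolAccount { alias: string; util5h: number; util7d: number; }

const headroom = (a: PoolAccount): number => 1 - Math.max(a.util5h, a.util7d);

function pickAccount(pool: PoolAccount[]): PoolAccount | null {
  let best: PoolAccount | null = null;
  for (const a of pool) {
    if (best === null || headroom(a) > headroom(best)) best = a;
  }
  return best;
}

const pool: PoolAccount[] = [
  { alias: "work", util5h: 0.92, util7d: 0.4 },     // headroom 0.08: 5h window nearly spent
  { alias: "personal", util5h: 0.15, util7d: 0.3 }, // headroom 0.70
];
console.log(pickAccount(pool)?.alias); // -> personal
```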

Dario's test suite asserts five_hour — what if I see failures saying got: seven_day? Some of dario's stealth-test assertions use representative-claim == "five_hour" as a shorthand for "is subscription billing classification working?" That assertion is correct for a fresh account but noisy for an account that's been developed against heavily — exactly the situation our own CI hits after an afternoon of test runs. If you're running the stealth suite against an account that's been busy recently and you see failures of the form Billing claim is five_hour / got: seven_day, that's a test infrastructure limitation, not a dario bug. The request was still billed against your subscription, which is what matters. These assertions will be tightened in a follow-up so they accept both buckets.
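A minimal sketch of that loosened check: pass on either subscription bucket, fail only on overage or a missing header. Names here are illustrative, not dario's actual test helpers.

```typescript
// Illustrative check: both subscription buckets count as "subscription billing
// classification is working"; only overage or a missing claim is a failure.
function assertSubscriptionBilling(claim: string | null): void {
  if (claim !== "five_hour" && claim !== "seven_day") {
    throw new Error(`expected a subscription billing claim, got: ${claim}`);
  }
}

assertSubscriptionBilling("five_hour"); // passes on a fresh account
assertSubscriptionBilling("seven_day"); // passes on a heavily-used account too
```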

Standalone writeup with more detail: Discussion #32 — why you see representative-claim: seven_day and why it's not a downgrade.

My multi-agent workload is getting reclassified to overage even though dario template-replays per request. Why? Reclassification at high agent volume is not a per-request problem. Anthropic's classifier operates on cumulative per-OAuth-session aggregates — token throughput, conversation depth, streaming duration, inter-arrival timing, thinking-block volume. Dario's Claude backend can make each individual request indistinguishable from Claude Code and still hit this wall on a long-running agent session, because the wall isn't at the request level. Thorough diagnostic work on this was contributed by @belangertrading in #23, including the v3.4.3/v3.4.5 hardening that landed as a result. The practical answer at the dario layer is pool mode — distribute load across multiple subscriptions so no single account accumulates enough signal to trip anything. See Multi-Account Pool Mode.

Can I route non-OpenAI providers through dario? Yes — anything that speaks the OpenAI Chat Completions API. dario backend add groq --key=... --base-url=https://api.groq.com/openai/v1, dario backend add openrouter --key=... --base-url=https://openrouter.ai/api/v1, or point at a local LiteLLM / vLLM / Ollama-openai-compat server with --base-url=http://localhost:4000/v1. v3.6.0 supports one active OpenAI-compat backend at a time; per-model routing to multiple OpenAI-compat backends ships in a follow-up.

Does dario work with only the OpenAI backend, no Claude subscription? Yes. Skip dario login; just run dario backend add openai --key=... and dario proxy. Claude-backend requests will return an authentication error; OpenAI-compat requests will work normally. Dario becomes a local OpenAI-compat shim with no Claude involvement.

Why "dario"? It's a name, not an acronym. Don't overthink it.


Technical Deep Dives

Longer-form writing on how dario works and why it works that way:


Contributing

PRs welcome. The codebase is small TypeScript — about 3,000 lines across roughly 14 files:

| File | Purpose |
| --- | --- |
| src/proxy.ts | HTTP proxy server, request handler, rate governor, Claude backend dispatch |
| src/cc-template.ts | CC request template engine, tool mapping, orchestration & framework scrubbing |
| src/cc-template-data.json | Bundled fallback CC request template (used when live-fingerprint extraction isn't possible) |
| src/cc-oauth-detect.ts | OAuth config auto-detection from the installed CC binary |
| src/live-fingerprint.ts | v3.11.0. Live extraction of the CC request template (system prompt, tools, user-agent, beta flags) from the installed Claude Code binary |
| src/oauth.ts | Single-account token storage, PKCE flow, auto-refresh |
| src/accounts.ts | Multi-account credential storage and independent OAuth lifecycle |
| src/pool.ts | Account pool, headroom-aware routing, failover target selection |
| src/analytics.ts | Rolling request history, per-account / per-model stats, burn-rate, billing bucket classification |
| src/openai-backend.ts | OpenAI-compat backend credential storage and request forwarder |
| src/shim/runtime.cjs | v3.12.0. Hand-written CJS payload loaded into child processes via NODE_OPTIONS=--require; patches globalThis.fetch for Anthropic messages requests only |
| src/shim/host.ts | v3.12.0. Parent-side orchestrator for dario shim — spawns the child, owns the telemetry socket / named pipe, feeds analytics |
| src/cli.ts | CLI entry point, command routing, Bun auto-relaunch |
| src/index.ts | Library exports |

```shell
git clone https://github.com/askalf/dario
cd dario
npm install
npm run dev   # runs with tsx, no build step
```

Contributors

| Who | Contributions |
| --- | --- |
| @GodsBoy | Proxy authentication, token redaction, error sanitization (#2) |
| @belangertrading | Billing classification investigation (#4), cache_control fingerprinting (#6), billing reclassification root cause (#7), OAuth client_id discovery (#12), multi-agent session-level billing analysis (#23) |
| @nathan-widjaja | README positioning rewrite structure (#21) |
| @iNicholasBE | macOS keychain credential detection (#30) |
| @boeingchoco | Reverse-direction tool parameter translation (#29), SSE event-group framing regression catch (v3.7.1), provider-comparison diagnostic that surfaced the --preserve-tools discoverability gap (v3.8.1), and the motivating case for hybrid tool mode (#33, v3.9.0) |

License

MIT