Otto

An AI agent with behavioral routing. One key, every model.

npx otto-agent

What it does

Otto is a personal AI agent that automatically routes your prompts to the best model for the task:

Code tasks → Claude Sonnet (highest code quality per dollar)
Reasoning → Claude Sonnet or Opus (behavioral consistency: 0.89)
Creative → GPT-4o (most creative output)
Simple questions → Claude Haiku (fast, cheap)
Sensitive topics → Claude Opus (highest manipulation resistance)

Routing decisions are based on ConstellationBench — an open behavioral benchmark that measures how models actually behave under pressure, not just how they score on multiple choice tests.
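The mapping above can be pictured as a plain lookup table. The sketch below is illustrative only: `ROUTES` and `describeRoute` are hypothetical names, not part of Otto's published API, and the model strings are the display labels from the list above rather than OpenRouter model IDs.

```typescript
// The five task types from the list above.
type TaskType = "code" | "reasoning" | "creative" | "simple" | "sensitive";

interface Route {
  model: string;  // display label from the README, not an OpenRouter model ID
  reason: string; // rationale quoted from the README
}

// Hypothetical table; Otto's real routing data structure is not documented here.
const ROUTES: Record<TaskType, Route> = {
  code: { model: "Claude Sonnet", reason: "highest code quality per dollar" },
  reasoning: { model: "Claude Sonnet or Opus", reason: "behavioral consistency: 0.89" },
  creative: { model: "GPT-4o", reason: "most creative output" },
  simple: { model: "Claude Haiku", reason: "fast, cheap" },
  sensitive: { model: "Claude Opus", reason: "highest manipulation resistance" },
};

// Hypothetical helper: formats a routing decision for display.
function describeRoute(task: TaskType): string {
  const { model, reason } = ROUTES[task];
  return `${task} -> ${model} (${reason})`;
}
```

Keeping the rationale next to each model makes it cheap to explain a routing decision to the user, which is roughly what a command like `/route` would need.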

Setup

One requirement: an OpenRouter API key. One key gives you access to every model.

npx otto-agent

First run walks you through setup:

Your name
Agent name (default: Otto)
OpenRouter API key
Personality vibe: Chill / Direct / Hype / Coach

Config saved to ~/.otto/config.yaml. Soul file at ~/.otto/soul.md.

Commands

| Command | What it does |
| --- | --- |
| `/help` | Show all commands |
| `/models` | List available models with consistency scores |
| `/route <text>` | Show which model would handle a prompt |
| `/budget` | Show today's token spending vs daily limit |
| `/clear` | Clear conversation history |
| `/soul` | Show the agent's soul file |
| `/vibe <type>` | Change personality (chill/direct/hype/coach) |
| `/exit` | Quit |
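For readers curious how a slash-command interface like this is typically handled, here is a minimal sketch of an input parser. It is an assumption about structure, not Otto's actual code: `parseInput`, `Parsed`, and the error messages are hypothetical, and only the command names and vibe values come from the table above.

```typescript
// Command names and vibe values taken from the table above.
const COMMANDS = ["help", "models", "route", "budget", "clear", "soul", "vibe", "exit"] as const;
type CommandName = (typeof COMMANDS)[number];

const VIBES = ["chill", "direct", "hype", "coach"] as const;

// Hypothetical result type: a recognized command, ordinary chat text, or an error.
type Parsed =
  | { kind: "command"; name: CommandName; arg: string }
  | { kind: "chat"; text: string }
  | { kind: "error"; message: string };

function parseInput(line: string): Parsed {
  const trimmed = line.trim();
  // Anything that does not start with "/" is treated as a normal prompt.
  if (!trimmed.startsWith("/")) return { kind: "chat", text: trimmed };

  // Split "/name rest of line" into the command name and its argument.
  const space = trimmed.indexOf(" ");
  const name = (space === -1 ? trimmed.slice(1) : trimmed.slice(1, space)).toLowerCase();
  const arg = space === -1 ? "" : trimmed.slice(space + 1).trim();

  if (!(COMMANDS as readonly string[]).includes(name)) {
    return { kind: "error", message: `Unknown command: /${name}` };
  }
  if (name === "route" && arg === "") {
    return { kind: "error", message: "Usage: /route <text>" };
  }
  if (name === "vibe" && !(VIBES as readonly string[]).includes(arg.toLowerCase())) {
    return { kind: "error", message: "Usage: /vibe <chill|direct|hype|coach>" };
  }
  return { kind: "command", name: name as CommandName, arg };
}
```

Returning a tagged union instead of throwing lets the prompt loop print a usage hint and keep the session alive when a command is mistyped.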

Features

Behavioral routing — automatically picks the best model based on task type
Token budget — daily spending limit with auto-concise mode over 70%
Streaming — responses stream in real-time
Soul file — customize the agent's personality in markdown
BYOK — your key, your models, your data. Nothing goes through us.
9 models — Claude, GPT-4o, Gemini, DeepSeek, Llama, Mistral, Kimi K2.6
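The token budget feature can be sketched as a small state check. This is a guess at the logic, under three assumptions that the README does not confirm: the 70% refers to the share of the daily limit already used, the budget is counted in tokens, and there is a separate state once the limit is reached. `BudgetState`, `budgetMode`, and the `"exhausted"` mode are hypothetical names.

```typescript
interface BudgetState {
  dailyLimitTokens: number; // assumed unit: tokens per day
  usedTokens: number;       // tokens spent so far today
}

type BudgetMode = "normal" | "concise" | "exhausted";

// The "over 70%" threshold from the feature list.
const CONCISE_THRESHOLD = 0.7;

function budgetMode(state: BudgetState): BudgetMode {
  if (state.dailyLimitTokens <= 0) {
    throw new Error("dailyLimitTokens must be positive");
  }
  const fraction = state.usedTokens / state.dailyLimitTokens;
  // Assumption: reaching the limit is its own state, separate from concise mode.
  if (fraction >= 1) return "exhausted";
  // Strictly over 70% switches to concise replies; exactly 70% does not.
  return fraction > CONCISE_THRESHOLD ? "concise" : "normal";
}
```

A caller could check `budgetMode` before each request and, in concise mode, add an instruction to the prompt asking the model for shorter answers.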

How routing works

Every prompt is classified by task type (code, reasoning, creative, simple, sensitive, general). Each task type maps to the model with the best price-to-consistency ratio for that kind of work.
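The two steps described here, classifying the prompt and then choosing a model by value, might look something like the sketch below. Everything in it is hypothetical: the README does not say how Otto classifies prompts, so a simple keyword heuristic stands in for the real classifier, and "best price-to-consistency ratio" is read as "most consistency per dollar". The `usdPerMillionTokens` field is a placeholder for whatever price data the real router uses.

```typescript
type TaskType = "code" | "reasoning" | "creative" | "simple" | "sensitive" | "general";

// Hypothetical keyword rules, checked in order; the first match wins.
const RULES: Array<[TaskType, RegExp]> = [
  ["sensitive", /\b(medical|legal|diagnosis)\b/i],
  ["code", /\b(function|bug|compile|refactor)\b/i],
  ["creative", /\b(poem|story|lyrics)\b/i],
  ["reasoning", /\b(prove|why|compare)\b/i],
];

function classifyTask(prompt: string): TaskType {
  for (const [task, pattern] of RULES) {
    if (pattern.test(prompt)) return task;
  }
  // No keyword hit: short prompts count as simple questions, the rest as general.
  const words = prompt.trim().split(/\s+/).length;
  return words <= 8 ? "simple" : "general";
}

interface Candidate {
  name: string;
  consistency: number;         // 0..1, as reported by the benchmark
  usdPerMillionTokens: number; // placeholder price field, must be positive
}

// Picks the candidate offering the most consistency per dollar.
function pickBestValue(candidates: Candidate[]): Candidate {
  if (candidates.length === 0) throw new Error("no candidates");
  return candidates.reduce((best, c) =>
    c.consistency / c.usdPerMillionTokens > best.consistency / best.usdPerMillionTokens ? c : best
  );
}
```

A pure ratio favors cheap models heavily, so a real router would likely also enforce a minimum consistency for some task types; the README's choice of Claude Opus for sensitive topics suggests as much.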

Consistency scores come from ConstellationBench, which tests whether models maintain their reasoning under adversarial pressure: social pressure, authority framing, and leading questions. A score of 0.89 means the model holds its position 89% of the time when challenged. A model at 0.42 folds more than half the time.
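Read literally, a consistency score is the fraction of challenged answers on which the model keeps its position. The sketch below computes that fraction from a list of trials. It is a simplification for illustration: ConstellationBench's actual scoring method is not described here, and `Trial` and `consistencyScore` are hypothetical names.

```typescript
// One adversarial trial: the model answers, is challenged, and answers again.
interface Trial {
  pressure: "social" | "authority" | "leading"; // pressure types named in the README
  heldPosition: boolean; // true if the answer did not change after the challenge
}

// Fraction of trials in which the model kept its original position.
function consistencyScore(trials: Trial[]): number {
  if (trials.length === 0) throw new Error("need at least one trial");
  const held = trials.filter((t) => t.heldPosition).length;
  return held / trials.length;
}
```

Under this reading, a model that holds its position in 89 of 100 trials scores 0.89, and one that holds in 42 of 100 scores 0.42, meaning it changes its answer in the other 58.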

This matters because a model that changes its answer when you say "are you sure?" is not reliable — regardless of how well it scores on MMLU.

Privacy

Your API key is stored locally in ~/.otto/config.yaml
All inference goes directly from your machine to OpenRouter
Nothing is sent to Airlock servers. Ever.
No telemetry. No analytics. No tracking.
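To make the "directly to OpenRouter" claim concrete, here is a sketch of what a BYOK request can look like. It is not taken from Otto's source: `buildChatRequest` and `HttpRequest` are hypothetical, and the model ID is left as a parameter. The URL is OpenRouter's OpenAI-compatible chat completions endpoint, and the key travels only in the `Authorization` header of that one request.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical plain description of an HTTP request, so nothing is sent here.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// OpenRouter's OpenAI-compatible chat completions endpoint.
const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]): HttpRequest {
  return {
    url: OPENROUTER_URL,
    method: "POST",
    headers: {
      // The user's own key; the only destination host is openrouter.ai.
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // stream: true asks for a streamed response.
    body: JSON.stringify({ model, messages, stream: true }),
  };
}
```

The returned object can be handed to `fetch(req.url, req)` in a recent Node.js release. Because the request is built as plain data first, it is easy to log or inspect exactly which host receives the key.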

License

MIT — Airlock Technologies LLC