Package Exports
This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@superbased/observer) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
Readme
@superbased/observer
Capture, normalize, compress, and analyze every AI coding tool call you run — across Claude Code, Codex, Cursor, Cline / Roo Code, GitHub Copilot, OpenCode, OpenClaw, and Pi — in one local single-binary tool. No telemetry, no cloud, no data leaves your machine.
┌──────────────┐ ┌──────────────────────────────────┐
│ AI clients │ ─▶ │ observer proxy :8820 (local) │ ─▶ Anthropic / OpenAI
│ (Claude │ │ • intercepts every API call │
│ Code, │ │ • compresses conversations │
│ Cursor, │ │ • records api_turns + tokens │
│ Codex …) │ └──────────────┬───────────────────┘
│ │ ▼
│ │ ◀── observer mcp (stdio JSON-RPC) ◀── 12 cross-client tools
└──────┬───────┘
│ JSONL session logs
▼
┌──────────────────────────────────────────────────────────────┐
│ observer watch SQLite store :8081 dashboard │
│ • normalizes tool calls • cost over time │
│ • freshness, redundancy detection • compression savings │
│ • patterns, suggestions • discovery / waste │
└──────────────────────────────────────────────────────────────┘Table of contents
- Install
- Five-minute quickstart
- Per-AI-client setup
- Architecture in detail
- Dashboard tour
- MCP tools reference
- Compression mechanisms
- Cost and token math
- Terminology and glossary
- CLI reference
- Configuration
- Troubleshooting
- Security and privacy
- Source, contributing, license
Install
npm install -g @superbased/observer
observer --versionPre-built binaries ship for:
| Platform | Architecture |
|---|---|
| Linux | x64, arm64 |
| macOS (Intel) | x64 |
| macOS (Apple Silicon) | arm64 |
| Windows | x64 |
The package uses the optionalDependencies-per-platform pattern (same
shape as esbuild / swc / @biomejs/biome) — only the binary
matching your machine downloads. No postinstall network calls, no
compile step.
If your platform isn't listed, build from source — instructions in the main repo.
Five-minute quickstart
# 1) Install + register hooks with whichever AI clients you use.
npm install -g @superbased/observer
observer init # interactive: pick clients
# 2) Start the long-running services (watcher + dashboard + proxy).
observer start & # all three in one process
# (or run them separately, see CLI ref)
# 3) Engage the proxy by pointing your AI client at the local URL.
# See "Per-AI-client setup" for the matching env var.
export ANTHROPIC_BASE_URL=http://127.0.0.1:8820 # Claude Code
export OPENAI_BASE_URL=http://127.0.0.1:8820/v1 # Codex / OpenAI
# 4) Open the dashboard.
open http://127.0.0.1:8081/ # macOS
xdg-open http://127.0.0.1:8081/ # Linux
start http://127.0.0.1:8081/ # WindowsAfter ten minutes of normal AI-coding usage, the dashboard will be populated with cost over time, per-tool activity, compression savings, and stale-reread waste signals.
Per-AI-client setup
Different clients send to different upstreams. The local proxy on
127.0.0.1:8820 handles all of them — routes by URL path. Set the
env var that matches your client(s); both can coexist on one machine.
| AI client | Env var | Capture mode |
|---|---|---|
| Claude Code | ANTHROPIC_BASE_URL=http://127.0.0.1:8820 |
proxy + JSONL |
| Cursor (Anthropic mode) | ANTHROPIC_BASE_URL=http://127.0.0.1:8820 |
proxy + JSONL |
| Codex | OPENAI_BASE_URL=http://127.0.0.1:8820/v1 (note /v1) |
proxy + JSONL |
| Cursor (OpenAI mode) | OPENAI_BASE_URL=http://127.0.0.1:8820/v1 |
proxy + JSONL |
| Cline / Roo Code | ANTHROPIC_BASE_URL=... or OPENAI_BASE_URL=... per provider |
proxy + JSONL |
| GitHub Copilot | (no proxy yet) | JSONL only |
| OpenCode | (no proxy yet) | JSONL only — ~/.local/share/ai.opencode.desktop/ |
| OpenClaw (openclaw.ai) | (no proxy yet) | JSONL + sqlite — ~/.openclaw/tasks/runs.sqlite + ~/.openclaw/agents/<agent>/sessions/sessions.json |
| Pi (pi.dev) | (no proxy yet) | JSONL only — ~/.pi/agent/sessions/*.jsonl |
JSONL-only clients are captured passively by the watcher (always-on,
no setup beyond observer init). You won't see real-time cost numbers
for them on the Compression tab (those need the proxy), but every tool
call shows up on Sessions / Actions / Discovery / Tools / Patterns and
the JSONL-derived token counts feed the Cost tab with Source=jsonl
Reliability=unreliabletags.
Persistent setups
Claude Code (~/.claude/settings.json):
{
"env": { "ANTHROPIC_BASE_URL": "http://127.0.0.1:8820" },
"hooks": { /* `observer init` writes these */ }
}Codex (~/.codex/config.toml):
[env]
OPENAI_BASE_URL = "http://127.0.0.1:8820/v1"Shell rc (~/.bashrc / ~/.zshrc) — affects every program:
export ANTHROPIC_BASE_URL=http://127.0.0.1:8820
export OPENAI_BASE_URL=http://127.0.0.1:8820/v1Architecture in detail
Five components running side by side:
1. JSONL adapters (passive ingest)
Watch ~/.claude/projects/, ~/.codex/sessions/, etc. for new
session log lines. Normalize per-client tool names to a shared
taxonomy (read_file, run_command, spawn_subagent, …) and write
them into the actions table. Always-on, no setup required —
observer init registers the hooks once.
What this gets you: every tool call you've ever run, queryable.
2. API reverse proxy (active capture)
A localhost HTTP server (127.0.0.1:8820) you point your AI client
at via ANTHROPIC_BASE_URL / OPENAI_BASE_URL. Intercepts every
request before it hits Anthropic / OpenAI and:
- Records exact token usage from the upstream
usageenvelope (the most accurate cost source — proxy beats JSONL parsing here). - Runs the conversation compression pipeline to trim large
tool_resultblocks and drop low-importance messages before forwarding upstream. - Captures the
cost_usdthe upstream reports (when present).
What this gets you: ground-truth cost numbers and conversation compression savings you can measure.
3. SQLite store
A single file at ~/.observer/observer.db. Tables include:
projects,sessions,actions— the taxonomyapi_turns— one row per proxy-intercepted upstream requesttoken_usage— JSONL-derived token-row events (deduped via spec §A1)file_state— content hashes for freshness classificationcompression_events— per-event compression detail (post-migration 010)project_patterns— derived patterns fromobserver patternsfailure_context,action_excerpts— diagnostic data
Pure-Go via modernc.org/sqlite, no CGO. WAL mode by default.
4. Local dashboard (:8081)
Eight tabs covering: Overview, Cost, Sessions, Actions, Tools, Compression, Discovery, Patterns. See Dashboard tour.
Static HTML + Chart.js. No analytics, no external requests.
5. MCP server (stdio)
12 read-only tools the AI client itself can call mid-conversation —
check_file_freshness, get_last_test_result, search_past_outputs,
etc. Powers cross-client tool sharing: if Claude Code ran go test,
Cursor's MCP query for the latest test result will return Claude
Code's run. See MCP tools reference.
Dashboard tour
Open http://127.0.0.1:8081/ after observer start. Eight tabs.
Overview tab
High-level snapshot of the selected window:
- KPI tiles: Sessions count, API turns (proxy-captured), Token rows (JSONL-recovered), Failures (24h)
- Cost over time chart — daily token volume, split into the four billable buckets (net input / cache read / cache write / output)
- Actions over time chart — total actions vs failures
- Top models (by tokens) chart — top-8 models stacked by net input / cache read / output
- Top tools (actions over time) — per-AI-client stacked-area showing when each client is active
Cost tab
Per-model breakdown over the selected window. Tokens split into the
four billing buckets, with computed dollar cost and a reliability
flag (high = upstream-reported, medium = computed, low =
estimated, unreliable = no pricing entry). Daily stacked-bar chart
of token volume per day.
Hover any column header for tooltip; click for the full definition in the help drawer.
Sessions tab
One row per AI-coding session. Each session has a stable ID, a tool
(claude-code / cursor / codex / cline / copilot), a working-directory
project, action count, sub-agent action count (when the session
spawned sub-agents via the Agent tool), and — if observer score
has run — quality / errors / redundancy ratios.
Click a row to open the session-detail modal: action breakdown, token buckets, cost, recent actions.
Actions tab
The flat firehose: every recorded tool call, normalized across
adapters. Filter by action type (read_file, write_file,
run_command, spawn_subagent, todo_update, mcp_call, …).
Pagination caps at 50 rows per page; total count is shown next to
the heading.
Tools tab
Per-AI-client (the client — claude-code / cursor / codex / etc., not the per-tool name) aggregates plus three views:
- KPI tiles: Total actions, Distinct tools, Overall success rate, Busiest tool
- Activity over time stacked-area showing per-tool action volume per day
- Action-type mix per tool horizontal stacked bar — what each tool actually does (read_file vs edit_file vs run_command vs search_text vs spawn_subagent)
- The full per-tool aggregate table with first/last seen
Compression tab
How many tokens and dollars the conversation-compression pipeline saved by trimming requests before forwarding upstream:
- KPI tiles: Tokens saved (est.), Dollars saved (est.), Bytes saved, Turns compressed
- Savings per day chart — daily tokens-saved (left axis) and bytes-saved (right axis)
- Savings by mechanism stacked bar — segments per mechanism (json / code / logs / text / diff / html / drop). Toggle the y-axis between tokens and bytes with the chart-header switch.
- Per-model breakdown table — tokens saved ~, $ saved ~, bytes saved, saved %, turns, tool-results compressed, dropped, markers
- Recent compression events — paginated per-event detail with mechanism, original / compressed / saved bytes, message slot, importance score (for drops), and a Source column showing whether the event came from a main-thread or sub-agent runtime call
Discovery tab
Wasted-effort signals:
- Stale rereads — files re-read after they changed inside the same session. KPI tiles show count, ~Tokens wasted, ~$ wasted (at your blended input rate), affected files. The CROSS-THREAD column flags re-reads that crossed the parent ↔ sub-agent boundary — these are the "pass content via Agent's prompt parameter" candidates.
- Repeated commands — commands run multiple times with no
relevant inputs changed in between (e.g. you ran
go testthree times without editing anything between runs). - Cross-tool overlap — files touched by ≥2 AI clients in the
window (e.g. claude-code AND cursor both edited
auth.ts). This is the visible side of cross-platform tool-call sharing via the MCP server.
Patterns tab
Repeatable behaviours the observer noticed across your sessions —
"after running go test, you almost always run go vet", "when
working on auth.ts, you also touch login.tsx", etc. Each pattern
has a confidence score (decay-weighted: more observations + recent
observations push it higher) and an observation count.
observer patterns derives them; observer suggest writes the
high-confidence ones into CLAUDE.md / AGENTS.md / .cursorrules
so new sessions inherit your habits.
Help drawer
Press ? anywhere on the dashboard or click the ? Help button in
the topbar. Every column header, KPI tile, chart label, and filter
control on every tab is annotated — hover any element to see a
one-liner tooltip; click to open the drawer at the matching glossary
entry.
The drawer has full descriptions, formulas, data sources, examples,
"why it matters", "what to do", and cross-links. Search at the top.
Deep-linkable via URL fragment — #help/metric.stale_count opens
the drawer at that entry.
Each compression mechanism (json / code / logs / text / diff / html / drop) has a "Full methodology · see more" expandable section explaining the actual algorithm.
MCP tools reference
Every AI client running through observer init gets these 12 tools
registered as an MCP server. They're read-only queries against
the unified database, so any agent can read any other agent's
recorded work — true cross-platform tool-call sharing.
| Tool | Purpose |
|---|---|
check_file_freshness |
Has this file been read in the current session? Has it changed since? |
get_file_history |
Full read/edit history of a file across all sessions and clients |
get_session_summary |
Roll-up stats for a session: action count, success rate, cost, token buckets |
search_past_outputs |
Full-text search across recorded tool outputs (FTS5-indexed) |
get_last_test_result |
Most recent go test / npm test / pytest etc. output |
get_failure_context |
Recent failures: which command, which file, which session |
get_action_details |
One specific action's full record (target, args, output excerpt) |
check_command_freshness |
Has this command been run in the current session? With what result? |
get_session_recovery_context |
Recent activity for resuming a paused session |
get_project_patterns |
High-confidence patterns derived from this project's history |
get_cost_summary |
Daily / per-model / per-session cost rollups |
get_redundancy_report |
Stale rereads, repeated commands, cross-tool overlap for the project |
Cross-tool sharing: when observer init registers the MCP server
with Claude Code AND Cursor (and Codex…), all of them call the same
12 tools against the same database. Cursor's get_last_test_result
returns Claude Code's last test run; Codex's check_file_freshness
reflects edits made by Cursor.
Compression mechanisms
The conversation-compression pipeline runs inside the proxy on every
upstream request. It tries to fit the request body within
target_ratio × original_bytes (default 0.85) without breaking the
conversation's referential integrity.
Two passes, in order:
Pass 1 — per-content-type compression
Each tool_result block is sniffed for content type, then routed to
a content-aware compressor. Six compressors:
| Mechanism | What it does | When it fires |
|---|---|---|
| json | Replaces every scalar value with a type sentinel ("<string>", "<number>", …) preserving structure (keys, arrays, nesting). Arrays of length > 1 collapse to one element with _len: N. |
API responses, structured logs, telemetry exports. JSON tool_results are usually the biggest savings target. |
| code | Heuristic skeleton: keeps top-of-file imports + signature lines (function / method / class / struct / interface / type), drops bodies. | Source code files. Opt-in only (excluded from default compress_types) because the model often needs to read full code. |
| logs | Two-pass: collapses adjacent identical lines to <line> [×N], then head+tail-truncates to 200 lines if still long. |
Log-shaped output — go test ./..., npm run build, polling/retry loops. Lossless on distinct lines; only the truncate pass is lossy. |
| text | Catch-all for content not classified as code/json/logs/diff/html. Head+tail truncation: keeps 40 + 40 lines on inputs over 80 lines. | Markdown bodies, README excerpts, narrative descriptions. |
| diff | Strips unified-diff context beyond ±1 line of each change. Keeps every header, every +/- line, drops the rest with elision markers. |
git diff, patch tool outputs. Lossless on changes; lossy only on the cheap-to-rebuild context. |
| html | Three regex passes: strips <script>, <style>, and HTML comments. Tag attributes + visible text + structural elements survive. |
web_fetch results pulling whole HTML pages — usually 80%+ scripts/styles. |
Pass 2 — drop with marker
If Pass 1's compressed body is still over budget, the budget enforcer ranks remaining messages by importance score (a deterministic weighted sum) and drops the lowest-scored non-preserved ones until the budget is met. Each dropped message is replaced by a single marker block (a placeholder text) so the conversation flow stays intact for the model.
Importance score = 0.4 × recency + 0.3 × reference + 0.15 × density + 0.15 × role
- Recency =
(i+1) / n— newest message scores 1.0, oldest scores 1/n - Reference =
1.0if any of the message'stool_use_idsis cited by a latertool_result, OR any of itsreferenced_idspoints to a live tool_use;0.0otherwise. Tool-pair-live messages always get full weight regardless of position. - Density = fraction of non-whitespace runes (whitespace-padded outputs get dropped first)
- Role =
system 1.0,user 0.9,assistant 0.7,tool 0.5(tool outputs are the most-compressible by policy)
Preserved messages (never droppable):
- The last
PreserveLastNmessages (default 4) - Any
systemrole message - Tool-pair-live messages: any message whose
tool_use_idis referenced by a later tool_result (parent side), AND anytool_resultmessage whosereferenced_idpoints to a live tool_use (consumer side)
Tool-pair preservation is symmetric — dropping either side leaves an orphan that Anthropic rejects with 400.
Per-event detail
Every drop and every per-type compression is recorded as a row in
the compression_events table (post migration 010). The Compression
tab's "Recent compression events" view surfaces these with
mechanism, original / compressed / saved bytes, message slot, and
importance score (for drops).
Tuning
observer config settings (in ~/.observer/config.toml):
[compression.conversation]
enabled = true
target_ratio = 0.85
mode = "token" # or "cache" — restricts drops to tail half
preserve_last_n = 4
prefix_bytes = 8192
compress_types = ["json", "logs", "text"] # add "code", "diff", "html" to opt in
min_bytes_to_compress = 1024
[compression.conversation.weights]
recency = 0.4
reference = 0.3
density = 0.15
role = 0.15High importance scores on dropped events (≥0.5) suggest the threshold
is too aggressive — raise target_ratio (e.g. 0.9 or 0.95).
Cost and token math
Anthropic's usage envelope reports four token buckets per request, each at a different rate:
| Bucket | What it is | Bills at |
|---|---|---|
net_input |
Fresh prompt tokens not served from cache | model's standard input rate |
cache_read |
Prompt tokens served from Anthropic's ephemeral cache | ~10% of input rate |
cache_creation |
Tokens written to ephemeral cache | 1.25× input (5m tier) or 2× input (1h tier) |
output |
What the model generated | typically 5× input rate |
prompt_context = net_input + cache_read + cache_creation
total_tokens = prompt_context + output
The cost engine (internal/intelligence/cost) computes USD via:
cost_usd = (net_input × p.input + cache_read × p.cache_read +
cache_creation_5m × p.cache_creation +
cache_creation_1h × p.cache_creation_1h +
output × p.output) ÷ 1,000,000When the upstream API returns cost_usd in the response envelope
(proxy-sourced rows), that value is preferred over the computed one —
ground truth, reliability=high.
Blended input rate
The Discovery tab's "~$ wasted" tile uses a blended input rate
computed from your last-30d api_turns mix: each model's input rate
weighted by the prompt-token volume it consumed. Example: if you
spent 70% of prompt tokens on opus-4-7 ($15/1M) and 30% on
haiku-4-5 ($1/1M), the blended rate is 0.7 × 15 + 0.3 × 1 = $10.80/1M.
Falls back to $3/1M (claude-sonnet-4 input) on fresh installs with no proxy data.
JSONL dedup
When the proxy isn't engaged, observer falls back to parsing the AI client's on-disk session log. Clients echo the same cumulative usage on every content block of a multi-block response, so naive parsing counts one API call 2-4×. Two layers of dedup catch this:
- Adapter-level: dedupes on Anthropic
message.idat write time - Cost-engine-level: dedupes on
(source_file, model, timestamp-bucketed-to-minute, tokens)at read time
Migration 007 ran a one-time pass collapsing pre-fix duplicates.
Terminology and glossary
Quick reference; the in-platform help drawer (press ? on the
dashboard) has the full versions with cross-links.
- Action — one normalized tool call recorded by an adapter. Action
types are taxonomic and cross-client:
read_file,write_file,edit_file,run_command,search_text,search_files,web_search,web_fetch,mcp_call,spawn_subagent,todo_update,ask_user,task_complete,user_prompt,unknown. - API turn — one HTTP request captured by the local proxy. Records
one row in
api_turnsper request, with the upstream usage envelope intact. - Cache 5m vs 1h tier — Anthropic's prompt cache has two TTLs.
Default is 5 minutes;
cache_control: {type: ephemeral, ttl: 3600}extends to 1 hour at 2× the write cost. Reads bill the same rate regardless of tier. - Compression event — one individual compression decision (one per-type compress, or one drop) recorded post migration 010.
- Conversation compression — pre-forward trimming of API request bodies. Pass 1 = per-content-type compression, Pass 2 = drop with marker. See Compression mechanisms.
- Cross-platform tool calling — every AI client connected via
observer initcan call the 12 MCP tools against the unified database. So Cursor'sget_last_test_resultcan return ago testClaude Code ran an hour earlier. - Cross-thread reread — the parent thread re-reads a file the
sub-agent already saw (or vice versa) within the same session. Fix:
pass content via the Agent tool's
promptparameter rather than letting the child re-read. - Freshness state — per-read tag from the freshness engine:
fresh(first read in this session, OR re-read with same content),stale(re-read after change in same session),missing(file no longer exists),modified-elsewhere(file changed by something other than an observable AI action). - Mechanism — one of
json,code,logs,text,diff,html(per-content-type compressor) ordrop(low-importance message replaced by a marker). - Pattern — a derived behaviour:
command_pair(X often followed by Y),cross_tool_file(file touched by multiple clients),knowledge_snippet(consistent topic-specific habit),failure_correlation(X often precedes a failure of Y),session_summary. Each has a decay-weighted confidence score 0-1. - Project — working-directory root that owns sessions and
actions. Derived from cwd at session start;
/.git/worktrees/...paths fold back to the working-tree root. - Proxy vs JSONL — proxy intercepts upstream HTTP calls (ground truth, reliability=high). JSONL parses the AI client's on-disk session log (works without configuring a base URL, but client echoes cumulative usage on every block, requiring dedup — reliability=unreliable for token counts on Claude Code).
- Reliability — cost-engine confidence:
high(upstream- reported),medium(computed from known pricing),low(some buckets estimated),unreliable(no pricing entry). - Session — one continuous AI-coding conversation in a single tool, scoped to one working directory. Has a stable ID (Claude Code's UUID, Codex's rollout ID, …).
- Sidechain — actions emitted inside a sub-agent runtime spawned
via the parent's
Agenttool. Sub-agents share the parent's session_id; theis_sidechaincolumn distinguishes them. The Discovery tab's CROSS-THREAD column counts stale rereads that crossed this boundary. - Stale reread — same-session re-read of a file whose content changed between reads. Cross-session reads are excluded (a fresh session has no memory of a prior session's read).
- Tool — in this dashboard, "tool" means the AI client
(claude-code, cursor, codex, cline, copilot), not the per-tool
name (
read_file,run_command). The latter is "Tool name" on the Actions tab. - Tool-pair integrity — Anthropic requires every
tool_resultblock to have a correspondingtool_useblock in a preceding message. The compression pipeline preserves both sides of every live pair to satisfy this constraint.
CLI reference
Every command supports --help for the full surface.
| Subcommand | Purpose |
|---|---|
observer init |
Register hooks + MCP server with installed AI clients |
observer uninstall |
Reverse observer init |
observer start |
Run watcher + dashboard + proxy in one process (recommended) |
observer watch |
Long-running JSONL watcher only |
observer dashboard --addr ADDR |
HTTP dashboard only |
observer proxy start |
Reverse proxy only |
observer scan |
One-shot ingest of existing JSONL files (catch-up after install) |
observer status |
DB stats + recent activity |
observer doctor |
Diagnostic — checks paths, schemas, hook registration |
observer tail |
Live tail of incoming events |
observer cost |
Per-model cost summary CLI |
observer score |
Compute quality_score / error_rate / redundancy_ratio for sessions |
observer discover |
Stale rereads + repeated commands report (CLI version of the Discovery tab) |
observer patterns |
Derive patterns from session history |
observer learn |
Adapter for ingesting external JSONL exports |
observer suggest |
Write high-confidence patterns into CLAUDE.md / AGENTS.md / .cursorrules |
observer summarize |
Roll-up summary across sessions |
observer export |
Export DB to xlsx / json |
observer prune |
Manual retention pass (delete old data) |
observer backfill --is-sidechain |
Re-walk JSONL to populate is_sidechain on pre-migration rows |
observer metrics |
Prometheus-format metrics endpoint |
observer serve |
MCP server (stdio JSON-RPC) — usually invoked by observer init registration |
observer tail |
Live event stream |
Configuration
~/.observer/config.toml — created with defaults on first run.
[paths]
db_path = "~/.observer/observer.db"
log_dir = "~/.observer/logs"
[proxy]
listen_addr = "127.0.0.1"
port = 8820
anthropic_upstream = "https://api.anthropic.com"
openai_upstream = "https://api.openai.com"
[dashboard]
listen_addr = "127.0.0.1"
port = 8081
[compression.conversation]
enabled = true
target_ratio = 0.85
mode = "token"
preserve_last_n = 4
prefix_bytes = 8192
compress_types = ["json", "logs", "text"]
min_bytes_to_compress = 1024
[compression.conversation.weights]
recency = 0.4
reference = 0.3
density = 0.15
role = 0.15
[compression.shell]
enabled = true
# per-command filters configured under [compression.shell.filters]
[retention]
prune_on_startup = true
max_actions = 5_000_000
max_age_days = 365
[pricing]
# Per-model overrides if the baked-in pricing is wrong for you.
# [pricing.models."claude-opus-4-7"]
# input = 15
# output = 75
# cache_read = 1.5
# cache_creation = 18.75Troubleshooting
npm install -g fails with EACCES: permission denied
Default npm puts globals under /usr/local/lib/node_modules which
Homebrew-managed Node owns as root on macOS. Three fixes; pick one:
# 1) RECOMMENDED — point npm at a user-writable prefix.
mkdir -p ~/.npm-global
npm config set prefix '~/.npm-global'
echo 'export PATH=~/.npm-global/bin:$PATH' >> ~/.zshrc
source ~/.zshrc
npm install -g @superbased/observer
# 2) Use a Node version manager — fnm / nvm install Node into your
# home directory and dodge the permission issue entirely.
brew install fnm
fnm install --lts
npm install -g @superbased/observer
# 3) sudo (works but you'll fight permissions on every update).
sudo npm install -g @superbased/observerobserver: command not found after install
The shim binary is at ~/.npm-global/bin/observer (or wherever your
npm prefix points). Make sure that directory is on $PATH:
echo $PATH | tr ':' '\n' | grep -E 'npm|node'
# add the prefix's bin/ to PATH if missing — see fix aboveIf you installed only a platform package (e.g. @superbased/observer-darwin-x64)
without the main @superbased/observer, the shim doesn't get created
— there's no bin field. Install the main package; npm picks up the
right platform binary automatically via optionalDependencies.
observer init says "no tools selected and none auto-detected"
Auto-detection looks for the AI clients' default session-log dirs
(~/.claude/projects/, ~/.codex/sessions/, ~/.cursor/, etc.).
On a fresh machine where no client has run yet, those dirs don't
exist. Pass the flag explicitly:
observer init --claude-code # or --codex / --cursor / --cline / --allThis registers hooks regardless — the next time the client runs, its dirs get created and the watcher picks them up.
Empty dashboard / "No proxy traffic"
The JSONL adapter populates passively after observer init, but
ground-truth cost / compression numbers require the proxy. Set
ANTHROPIC_BASE_URL=http://127.0.0.1:8820 (Claude Code) or
OPENAI_BASE_URL=http://127.0.0.1:8820/v1 (Codex) in the shell
that launches your AI client.
Verify with observer status | grep api_turns — count should
climb during AI-client activity.
observer --version says dev
You're on a non-released build. Reinstall with npm install -g @superbased/observer or rebuild with the workflow's -X main.version=$VERSION ldflag.
tool_result block must have a corresponding tool_use block
Anthropic 400. Means the conversation-compression pipeline dropped
a tool_use while keeping its matching tool_result. Versions
prior to 1.3.2 had this bug; upgrade. If you're on 1.3.2+ and still
see it, file an issue with the conversation prefix.
tool use concurrency issues
Anthropic 400 surfaced in Claude Code as this message. Means the
parallel-tool-use case (multiple tool_use blocks in one assistant
message) isn't paired correctly with the multi-block tool_result
that follows. Versions prior to 1.3.2 had this bug; upgrade.
Cross-thread numbers are 0
Pre-migration data was ingested without the is_sidechain flag.
Run observer backfill --is-sidechain once to re-walk JSONL and
populate the flag on existing rows.
Migration error: duplicate column name
Race condition between concurrent daemon startups, fixed in 1.4.1.
Upgrade. If you still see it, run daemons serially: observer watch, wait, then observer dashboard, then observer proxy start (or just use observer start which runs all three in one
process — proxy + watcher + dashboard).
observer start log says only proxy + observer — no :8081
You're on a pre-1.4.7 build. Earlier versions ran only proxy +
watcher under observer start; the dashboard had to be started
separately via observer dashboard --addr 127.0.0.1:8081. Upgrade
to 1.4.7+ — the dashboard goroutine is now part of observer start
and the log line confirms all three: proxy <addr> + watcher + dashboard http://127.0.0.1:8081. Pass --no-dashboard to opt out.
"address already in use" on port 8820
Another observer proxy start or observer start is still running.
Find it with pgrep -af 'observer (proxy|start)' and kill <pid>.
On macOS:
lsof -nP -iTCP:8820 -sTCP:LISTEN
kill <pid>Dashboard port already in use
observer dashboard --addr 127.0.0.1:8082 # pick a different port
# or
[dashboard]
port = 8082 # in config.tomlSecurity and privacy
- No telemetry. The observer never makes outbound network calls except: (a) the proxy forwarding to the upstream API on your behalf, (b) optional codegraph-MCP integration if you've configured it. Both are explicit.
- Local-only HTTP. The proxy and dashboard bind to
127.0.0.1by default. Don't bind to0.0.0.0unless you've thought about it — there's no auth. - Secrets scrubbing. Tool inputs and outputs are passed through
a configurable scrubber (
internal/scrub) that redacts API keys, bearer tokens, and other obvious secrets before persistence. Not a guarantee — review the rules in~/.observer/config.tomlfor your own secret formats. - Database.
~/.observer/observer.dbis a SQLite file. Owns the same security as your~/.claude/and~/.codex/session logs (which already contain the same content). Encrypt the disk if your threat model needs that. - Logs.
~/.observer/logs/carries operational logs (start / stop, errors, retention). Not request bodies — those are in the DB only.
Source, contributing, license
- Source: https://github.com/marmutapp/superbased-observer
- Specification:
superbased-final-spec-v2.mdin the repo - Issues: https://github.com/marmutapp/superbased-observer/issues
- License: Apache 2.0
- Author: Santosh Kathira contact@marmut.app
This npm package is a thin Node.js shim that resolves the right
pre-built binary at runtime and spawns it. Same shape as esbuild /
swc / @biomejs/biome. The Go source lives in the main repo;
binaries are cross-compiled per release tag via GitHub Actions and
published as @superbased/observer-<platform>-<arch> per-platform
packages.