
CTO — AI context selection done right


Pick the right files for any AI task. Secrets auto-redacted. Learns from your feedback.

# Select context, copy to clipboard
cto --context "fix the auth middleware" --stdout | pbcopy

# Generate a complete AI prompt
cto --context "fix the auth middleware" --prompt "Refactor this to use JWT"

# Was the AI output good? Tell CTO so it learns.
cto --accept

74KB package. Zero bloat.


What it does

When you ask an AI to help with code, it needs the right files as context. Send too few and the AI hallucinates. Send too many and you waste tokens. CTO picks the right ones:

  1. Matches your task — TF-IDF/BM25 semantic matching, not keyword guessing
  2. Ranks by composite score — risk × 0.4 + semantic × 0.4 + learner × 0.2 (sketched below)
  3. Sanitizes output — API keys, tokens, passwords auto-redacted before they reach any AI
  4. Learns from feedback — --accept / --reject teach it what you actually need

Different tasks → different files. "fix auth" and "add database tests" return completely different selections.
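
A minimal TypeScript sketch of that composite ranking (illustrative types and sample numbers, not the package's internals):

type Scored = { filePath: string; risk: number; semantic: number; learner: number };

const finalScore = (f: Scored) =>
  f.risk * 0.4 + f.semantic * 0.4 + f.learner * 0.2;

// Example: an auth-related task should surface the middleware first.
const candidates: Scored[] = [
  { filePath: 'src/auth/middleware.ts', risk: 0.8, semantic: 0.9, learner: 0.1 },
  { filePath: 'src/db/schema.ts', risk: 0.6, semantic: 0.2, learner: 0.0 },
];

const ranked = [...candidates].sort((a, b) => finalScore(b) - finalScore(a));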

Install

npm i -g cto-ai-cli    # global
npx cto-ai-cli         # or one-shot

Context Selection

# Human-readable summary
cto --context "refactor the auth middleware"

# Pipe to clipboard (macOS)
cto --context "fix login bug" --stdout | pbcopy

# Save to file (secrets auto-redacted)
cto --context "add tests" --output context.md

# Full AI prompt with instruction
cto --context "fix login" --prompt "Refactor to use async/await"

# JSON for tooling
cto --context "debug scoring" --json

# Custom token budget
cto --context "fix auth" --budget 30000

Output includes full file contents in markdown, ready to paste into Claude, ChatGPT, or any AI. Secrets are automatically redacted — API keys, tokens, passwords, PII are replaced with **** before output.
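
To make the redaction step concrete, here is a simplified sketch of pattern-based masking, using three well-known token formats (the real tool ships 45+ patterns plus entropy analysis, covered under Secret Audit):

const SECRET_PATTERNS: RegExp[] = [
  /AKIA[0-9A-Z]{16}/g,          // AWS access key ID
  /sk_live_[0-9a-zA-Z]{24,}/g,  // Stripe live secret key
  /ghp_[0-9a-zA-Z]{36}/g,       // GitHub personal access token
];

const redact = (text: string): string =>
  SECRET_PATTERNS.reduce((out, re) => out.replace(re, '****'), text);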

Feedback Loop

CTO learns from real feedback, not from itself:

# After using the context and it worked:
cto --accept

# If the AI needed files CTO didn't include:
cto --reject
cto --reject --missing src/types/auth.ts

# See what CTO has learned:
cto --stats

On --reject, CTO also detects files you edited after the selection that weren't in the context — those get automatically boosted for next time.
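
The exact detection mechanism isn't documented here; one plausible sketch, assuming it compares file modification times against a timestamp recorded when the selection ran:

import { statSync } from 'node:fs';

// Files changed after the selection ran but not included in it —
// candidates for an automatic boost on the next selection.
function editedAfterSelection(
  allFiles: string[],
  selected: Set<string>,
  selectedAtMs: number,
): string[] {
  return allFiles.filter(
    f => !selected.has(f) && statSync(f).mtimeMs > selectedAtMs,
  );
}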

Secret Audit

cto --audit                  # scan all files
cto --audit --init-hook      # install pre-commit hook
cto --audit --full-scan      # ignore cache, scan everything
cto --audit --json           # machine-readable output

45+ patterns (AWS, Stripe, GitHub, OpenAI, Slack, etc.) plus Shannon entropy analysis. But the real value is that audit protects context: every --stdout, --output, and --prompt command auto-sanitizes secrets before output.
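
Entropy analysis catches random-looking strings that no fixed pattern matches. The underlying measure, as a self-contained sketch:

// Shannon entropy in bits per character — high values suggest generated secrets.
function shannonEntropy(s: string): number {
  const counts = new Map<string, number>();
  for (const ch of s) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let h = 0;
  for (const n of counts.values()) {
    const p = n / s.length;
    h -= p * Math.log2(p);
  }
  return h;
}

shannonEntropy('hello_world');          // ≈ 2.8 — ordinary identifier
shannonEntropy('tGk9xQ2mVp8LzR4nWj7Y'); // ≈ 4.3 — likely a random token

Scanners of this kind typically flag strings above a threshold (around 4.0–4.5 bits/char for base64-like content); CTO's exact cutoff isn't stated here.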

MCP Server

Works as an MCP server for AI editors (Windsurf, Claude Desktop, Cursor).

3 tools: cto_select_context, cto_audit_secrets, cto_explain

// Windsurf: ~/.codeium/windsurf/mcp_config.json
{ "mcpServers": { "cto": { "command": "cto-mcp" } } }

// Claude Desktop
{ "mcpServers": { "cto": { "command": "npx", "args": ["-y", "cto-ai-cli"] } } }

MCP output is also auto-sanitized when includeContents: true.

How it works

  1. Dependency graph — parses imports, builds adjacency list, identifies hubs
  2. Risk scoring — complexity × centrality × recency (continuous, log-scaled)
  3. TF-IDF/BM25 semantic matching — task description scored against all file contents + path boosting
  4. Composite ranking — finalScore = risk × 0.4 + semantic × 0.4 + learner × 0.2
  5. Greedy allocation — fills token budget top-down, cascading prune levels (full → signatures → skeleton); see the sketch after this list
  6. Bayesian learning — exponential decay on priors, Wilson score confidence (sketched below), per-task-type patterns
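
A sketch of that greedy fill, assuming per-file token counts at each prune level (names and shapes here are illustrative, not the package API):

type Level = 'full' | 'signatures' | 'skeleton';
const LEVELS: Level[] = ['full', 'signatures', 'skeleton'];

type Candidate = { filePath: string; tokens: Record<Level, number> };

// Walk the ranked list, taking the richest representation that still fits.
function allocate(ranked: Candidate[], budget: number) {
  const picked: { filePath: string; level: Level }[] = [];
  let used = 0;
  for (const c of ranked) {
    const level = LEVELS.find(l => used + c.tokens[l] <= budget);
    if (!level) continue; // even the skeleton doesn't fit — skip this file
    picked.push({ filePath: c.filePath, level });
    used += c.tokens[level];
  }
  return picked;
}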

No AI is used for selection. Same input → same output. Deterministic.
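
The Wilson lower bound is what keeps early feedback from overreacting: a file with 4 accepts out of 5 gets a far more cautious score than one with 40 out of 50, even though the ratio is identical. A self-contained sketch (standard formula; the package's exact update rule may differ):

// Wilson score lower bound: conservative estimate of a file's "helpfulness"
// rate from a small number of accept/reject observations.
function wilsonLowerBound(accepts: number, total: number, z = 1.96): number {
  if (total === 0) return 0;
  const p = accepts / total;
  const z2 = z * z;
  return (
    (p + z2 / (2 * total) -
      z * Math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)) /
    (1 + z2 / total)
  );
}

wilsonLowerBound(4, 5);   // ≈ 0.38 — good ratio, but small n keeps it cautious
wilsonLowerBound(40, 50); // ≈ 0.67 — same ratio, more evidence, higher bound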

Programmatic API

import { analyzeProject, selectContext, buildIndex, query } from 'cto-ai-cli';

const analysis = await analyzeProject('./my-project');
const index = buildIndex(files);  // files: the project files to index
const semanticScores = query(index, 'fix auth', 50)
  .map(m => ({ filePath: m.filePath, score: m.score }));

const selection = await selectContext({
  task: 'fix auth',
  analysis,
  budget: 50_000,
  semanticScores,  // wired into ranking
});

Honest limitations

  • TypeScript/JavaScript gets deep analysis. Other languages get basic file + import analysis.
  • TF-IDF, not embeddings. Handles most tasks well but won't understand complex intent.
  • Learning needs ~5 feedback cycles to start influencing selection. First runs are pure graph + risk + semantic.
  • Not compared against Cursor/Copilot internal context. Our baselines are naive (alphabetical, random).

Contributing

git clone https://github.com/cto-ai/cto-ai-cli.git && cd cto-ai-cli
npm install && npm run build && npm test  # 597 tests

License

MIT