Package Exports

kosha-discovery
kosha-discovery/cli
kosha-discovery/server

Readme

kosha-discovery — कोश

AI Model & Provider Discovery Registry

Kosha (कोश — treasury/repository) automatically discovers AI models across providers, resolves credentials from CLI tools and environment variables, enriches models with pricing data, and exposes the catalog via library, CLI, and HTTP API.

Why

AI applications often hardcode model IDs, pricing, and provider configs, so they break or go stale whenever a provider adds models or changes pricing. Kosha addresses this with:

Dynamic discovery — fetches real model lists from provider APIs
Smart credentials — finds API keys from env vars, CLI tools (Claude, Copilot, Gemini CLI), and config files
Pricing enrichment — fills in costs and context windows from litellm's community-maintained dataset
Model aliases — sonnet → claude-sonnet-4-20250514, updated as models evolve
Role matrix — query provider -> model -> roles (chat, embedding, image_generation, etc.)
Capability discovery — explore all capabilities across the ecosystem, find models by capability
Multi-provider routing — see all provider routes for a model, with direct/preferred flags
Cheapest routing — rank cheapest eligible models for tasks like embeddings or image generation
Credential prompts — returns provider-specific API key hints when required credentials are missing
Local LLM scanning — detects Ollama models alongside cloud providers
Three access patterns — use as a library, CLI tool, or HTTP API

Install

npm install kosha-discovery
# or
pnpm add kosha-discovery

Getting Started — Provider Credentials

Kosha auto-discovers credentials from environment variables, CLI tool configs, and cloud auth files. Set up whichever providers you use:

Anthropic

# Option A: Environment variable
export ANTHROPIC_API_KEY=sk-ant-...

# Option B: Auto-detected from Claude CLI / Claude Code
# If you've run `claude` or `claude-code`, kosha reads the stored token from:
#   ~/.claude.json
#   ~/.config/claude/settings.json
#   ~/.claude/credentials.json

# Option C: Auto-detected from Codex CLI
#   ~/.codex/auth.json

OpenAI

# Option A: Environment variable
export OPENAI_API_KEY=sk-...

# Option B: Auto-detected from GitHub Copilot
# If you've authenticated with Copilot, kosha reads tokens from:
#   ~/.config/github-copilot/hosts.json (Linux/macOS)
#   %LOCALAPPDATA%/github-copilot/hosts.json (Windows)

Google (Gemini)

# Option A: Environment variable
export GOOGLE_API_KEY=AIza...
# or
export GEMINI_API_KEY=AIza...

# Option B: Auto-detected from Gemini CLI
#   ~/.gemini/oauth_creds.json

# Option C: gcloud Application Default Credentials
gcloud auth application-default login

AWS Bedrock

# Option A: Environment variables
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_DEFAULT_REGION=us-east-1   # optional, defaults to us-east-1

# Option B: AWS CLI configured profile
aws configure
# kosha reads ~/.aws/credentials [default] automatically

# Option C: Named profile
export AWS_PROFILE=my-profile

# Option D: SSO / IAM role
# kosha detects sso_start_url or role_arn in ~/.aws/config

# Optional: install the AWS SDK for live model listing (otherwise uses static fallback)
npm install @aws-sdk/client-bedrock

Google Vertex AI

# Option A: Service account JSON
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export GOOGLE_CLOUD_PROJECT=my-project

# Option B: gcloud Application Default Credentials
gcloud auth application-default login
# Project auto-detected from: GOOGLE_CLOUD_PROJECT, GCLOUD_PROJECT,
# or `gcloud config get-value project`

# Option C: gcloud access token (auto-detected via subprocess)
gcloud auth print-access-token

OpenRouter

# Optional — OpenRouter works without auth (rate-limited)
export OPENROUTER_API_KEY=sk-or-...

Ollama (Local)

# No credentials needed — auto-detected if running locally
# Default: http://localhost:11434
ollama serve
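
Across the providers above, the lookup can be pictured as "environment variables first, then known credential files". The sketch below is illustrative only: the `CredentialSource` type, the `resolveCredential` function, and the env-before-file ordering are assumptions, not kosha's actual resolver.

```typescript
// Hypothetical sketch: names and lookup order are assumptions, not kosha's API.
type CredentialSource =
  | { kind: "env"; name: string; value: string }
  | { kind: "file"; path: string }
  | { kind: "none" };

function resolveCredential(
  envVars: string[],
  files: string[],
  env: Record<string, string | undefined>,
  fileExists: (path: string) => boolean,
): CredentialSource {
  // 1. The first non-empty environment variable wins.
  for (const name of envVars) {
    const value = env[name];
    if (value) return { kind: "env", name, value };
  }
  // 2. Otherwise fall back to the first credential file that exists.
  for (const path of files) {
    if (fileExists(path)) return { kind: "file", path };
  }
  return { kind: "none" };
}

// Google accepts either variable; GEMINI_API_KEY is used when GOOGLE_API_KEY is unset.
const googleCredential = resolveCredential(
  ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
  ["~/.gemini/oauth_creds.json"],
  { GEMINI_API_KEY: "AIza-example" },
  () => false,
);
```

Passing the environment and the file check in as parameters keeps the function easy to test; a real caller would likely pass `process.env` and a filesystem check.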

Config file (optional)

Instead of env vars, you can create ~/.kosharc.json (global) or kosha.config.json (project-level):

{
  "providers": {
    "anthropic": { "apiKey": "sk-ant-..." },
    "openai": { "apiKey": "sk-..." },
    "bedrock": { "enabled": true },
    "vertex": { "enabled": true },
    "openrouter": { "enabled": false }
  },
  "aliases": {
    "fast": "claude-haiku-4-5-20251001"
  },
  "cacheTtlMs": 3600000
}

Config priority (lowest to highest): ~/.kosharc.json < kosha.config.json < programmatic config. Higher-priority sources override lower-priority ones.
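
The priority rule amounts to a layered merge. The following is a minimal sketch, assuming per-provider and per-alias merging with scalar values simply overwritten; the `KoshaConfig` shape mirrors the JSON example above, and `mergeConfigs` is a hypothetical helper rather than kosha's actual loader.

```typescript
// Hypothetical sketch of layered config merging; not kosha's actual loader.
interface ProviderConfig { enabled?: boolean; apiKey?: string; baseUrl?: string }
interface KoshaConfig {
  providers?: Record<string, ProviderConfig>;
  aliases?: Record<string, string>;
  cacheTtlMs?: number;
}

// Layers are passed lowest priority first; later layers override earlier ones.
function mergeConfigs(...layers: KoshaConfig[]): KoshaConfig {
  const providers: Record<string, ProviderConfig> = {};
  const aliases: Record<string, string> = {};
  let cacheTtlMs: number | undefined;
  for (const layer of layers) {
    for (const [id, cfg] of Object.entries(layer.providers ?? {})) {
      providers[id] = { ...providers[id], ...cfg };
    }
    Object.assign(aliases, layer.aliases ?? {});
    if (layer.cacheTtlMs !== undefined) cacheTtlMs = layer.cacheTtlMs;
  }
  return { providers, aliases, cacheTtlMs };
}

const globalConfig: KoshaConfig = { cacheTtlMs: 3600000, providers: { openrouter: { enabled: false } } };
const projectConfig: KoshaConfig = { providers: { openrouter: { enabled: true } } };
const programmaticConfig: KoshaConfig = { aliases: { fast: "claude-haiku-4-5-20251001" } };

// The project file re-enables openrouter; the global TTL survives because nothing overrides it.
const effectiveConfig = mergeConfigs(globalConfig, projectConfig, programmaticConfig);
```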

Quick Start

Library

import { createKosha } from "kosha-discovery";

const kosha = await createKosha();

// List all models
const models = kosha.models();

// Filter by provider
const anthropicModels = kosha.models({ provider: "anthropic" });

// Get embedding models
const embeddings = kosha.models({ mode: "embedding" });

// Resolve alias
const model = kosha.model("sonnet"); // → full ModelCard for claude-sonnet-4-20250514

// Get pricing
console.log(model.pricing); // { inputPerMillion: 3, outputPerMillion: 15, ... }

// Role matrix for assistants (provider -> models -> roles)
const roles = kosha.providerRoles({ role: "embeddings" });

// Cheapest model ranking for a task
const cheapest = kosha.cheapestModels({ role: "image", limit: 3 });
console.log(cheapest.matches[0]);

CLI

# Discover all providers
kosha discover

# List models
kosha list
kosha list --provider anthropic
kosha list --origin anthropic
kosha list --mode embedding

# Search
kosha search gemini
kosha search claude --origin anthropic

# Model details
kosha model sonnet

# Role matrix
kosha roles
kosha roles --role embeddings

# Capabilities ecosystem view
kosha capabilities
kosha capable vision
kosha capable embeddings --limit 5

# Cheapest routing candidates
kosha cheapest --role embeddings
kosha cheapest --role image --limit 3

# All provider routes for a model
kosha routes gpt-4o

# Providers status
kosha providers

# Resolve alias
kosha resolve haiku

# Start API server
kosha serve --port 3000

HTTP API

kosha serve --port 3000

GET /api/models                    — All models
GET /api/models?provider=anthropic — Filter by provider
GET /api/models?mode=embedding     — Filter by mode
GET /api/models/cheapest           — Cheapest ranked models for a role/capability
GET /api/models/:idOrAlias         — Single model
GET /api/models/:idOrAlias/routes  — All provider routes for one model
GET /api/roles                     — Provider → model → roles matrix
GET /api/providers                 — All providers
GET /api/providers/:id             — Single provider
POST /api/refresh                  — Re-discover
GET /api/resolve/:alias            — Resolve alias
GET /health                        — Health check
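
Query filters on `/api/models` can be combined. As a small illustration, the sketch below builds such a URL from a filter object; `modelsUrl` and `ModelFilters` are hypothetical names, and the base URL assumes the server started with `kosha serve --port 3000`.

```typescript
// Hypothetical helper: builds a /api/models URL from optional filters.
interface ModelFilters { provider?: string; mode?: string }

function modelsUrl(baseUrl: string, filters: ModelFilters = {}): string {
  const pairs: string[] = [];
  if (filters.provider) pairs.push(`provider=${encodeURIComponent(filters.provider)}`);
  if (filters.mode) pairs.push(`mode=${encodeURIComponent(filters.mode)}`);
  const query = pairs.length > 0 ? `?${pairs.join("&")}` : "";
  return `${baseUrl}/api/models${query}`;
}

const anthropicChatUrl = modelsUrl("http://localhost:3000", { provider: "anthropic", mode: "chat" });
// anthropicChatUrl is "http://localhost:3000/api/models?provider=anthropic&mode=chat"
```

The resulting string can be handed to `fetch` or any other HTTP client.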

Assistant Routing Flow

Kosha is designed to answer routing questions from assistants like Vaayu and Takumi:

1. Ask for capabilities: call GET /api/roles?role=embeddings.
2. Rank by cost: call GET /api/models/cheapest?role=embeddings.
3. If missingCredentials is non-empty, prompt the user for one of the listed env vars.
4. Route execution using the chosen provider/model pair.
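
The decision at the end of this flow can be sketched as a pure function over the `/api/models/cheapest` response. The types below are trimmed-down assumptions modelled on the response documented in the HTTP API Reference, and `decideRoute` is a hypothetical helper. It routes whenever a cheapest match exists and carries the credential prompts along, which is one possible reading of the flow.

```typescript
// Hypothetical types modelled on the documented /api/models/cheapest response.
interface MissingCredential { providerId: string; envVars: string[]; message: string }
interface CheapestResponse {
  cheapest?: { model: { id: string; provider: string }; score: number };
  missingCredentials: MissingCredential[];
}

type RoutingDecision =
  | { action: "route"; provider: string; model: string; prompts: string[] }
  | { action: "prompt"; prompts: string[] };

// Decide what an assistant should do with a cheapest-model response.
function decideRoute(response: CheapestResponse): RoutingDecision {
  // Credential hints are surfaced either way so the user can unlock more providers.
  const prompts = response.missingCredentials.map((m) => m.message);
  if (response.cheapest) {
    const { id, provider } = response.cheapest.model;
    return { action: "route", provider, model: id, prompts };
  }
  // Nothing usable yet: the only option is to ask for credentials.
  return { action: "prompt", prompts };
}
```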

Embeddings Quick Call

If your task is embeddings and you want the cheapest option:

kosha cheapest --role embeddings --price-metric input --limit 1

API equivalent:

curl "http://localhost:3000/api/models/cheapest?role=embeddings&priceMetric=input&limit=1"

Provider vs Origin

Kosha distinguishes:

provider: where you call the model (serving layer, e.g. openrouter)
originProvider: who built the model (e.g. openai)

Example:

provider: openrouter
id: openai/gpt-5.3-codex
originProvider: openai

If a direct OpenAI route exists, route metadata marks it as preferred so assistants can call openai directly instead of openrouter.
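
A caller consuming route metadata might select a route as follows. This is a sketch under assumptions: the `Route` shape is a subset of the `/api/models/:idOrAlias/routes` response, `pickRoute` is a hypothetical helper, and the fallback order (preferred, then direct, then first available) is a design choice made for illustration.

```typescript
// Hypothetical route shape, a subset of the documented routes response.
interface Route { provider: string; originProvider: string; isDirect: boolean; isPreferred: boolean }

// Pick the preferred route, then any direct route, then the first route available.
function pickRoute(routes: Route[]): Route | undefined {
  return routes.find((r) => r.isPreferred) ?? routes.find((r) => r.isDirect) ?? routes[0];
}

const exampleRoutes: Route[] = [
  { provider: "openrouter", originProvider: "openai", isDirect: false, isPreferred: false },
  { provider: "openai", originProvider: "openai", isDirect: true, isPreferred: true },
];

// The direct OpenAI route is chosen even though openrouter is listed first.
const chosenRoute = pickRoute(exampleRoutes);
```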

CLI Reference

USAGE
  kosha <command> [options]

COMMANDS
  discover                      Discover all providers and models
  list                          List all known models
    --provider <name>             Filter by serving-layer provider
    --origin <name>               Filter by origin/creator provider (e.g. anthropic)
    --mode <mode>                 Filter by mode (chat, embedding, image, audio)
    --capability <cap>            Filter by capability (vision, function_calling, etc.)
  search <query>                Search models by name/ID (fuzzy match)
    --origin <name>               Restrict search to a specific origin provider
  model <id|alias>              Show detailed info for one model
  roles                         Show provider -> model -> roles matrix
    --role <role>                 Filter by task role (e.g. embeddings, image, tool_use)
    --provider <name>             Filter by serving-layer provider
    --origin <name>               Filter by model creator provider
    --mode <mode>                 Filter by mode (chat, embedding, image, audio, moderation)
    --capability <cap>            Filter by capability tag
  capabilities (caps)           Show all capabilities across the ecosystem
    --provider <name>             Scope to one provider
  capable <capability>          List models with a given capability
    --provider <name>             Filter by serving-layer provider
    --origin <name>               Filter by origin/creator provider
    --mode <mode>                 Filter by mode (chat, embedding, image, audio)
    --limit <n>                   Maximum models to show
  cheapest                      Find cheapest eligible models
    --role <role>                 Task role, e.g. embeddings or image
    --capability <cap>            Capability filter (vision, embedding, function_calling)
    --mode <mode>                 Mode filter
    --limit <n>                   Maximum matches to return (default 5)
    --price-metric <metric>       input | output | blended
    --input-weight <n>            Weight for blended metric input price
    --output-weight <n>           Weight for blended metric output price
    --include-unpriced            Include unpriced models after ranked matches
  routes <id|alias>             Show all provider routes for a model
  providers                     List all providers and their status
  resolve <alias>               Resolve an alias to canonical model ID
  refresh                       Force re-discover all providers (bypass cache)
  serve [--port 3000]           Start HTTP API server

OPTIONS
  --json                          Output as JSON (works with any command)
  --help                          Show this help message
  --version                       Show version

Example: `kosha list`

Provider     Model                              Mode       Context    $/M in   $/M out
──────────── ────────────────────────────────── ────────── ────────── ──────── ────────
anthropic    claude-opus-4-20250918             chat       200K       $15.00   $75.00
anthropic    claude-sonnet-4-20250514           chat       200K       $3.00    $15.00
anthropic    claude-haiku-4-5-20251001          chat       200K       $0.80    $4.00
openai       gpt-4o                             chat       128K       $2.50    $10.00
openai       text-embedding-3-small             embedding  8K         $0.02    —
google       gemini-2.5-pro-preview-05-06       chat       1M         $1.25    $10.00
ollama       qwen3:8b                           chat       —          free     free
───────────────────────────────────────────────────────────────────────────────────────
42 models from 4 providers

Example: `kosha model sonnet`

Model: claude-sonnet-4-20250514
Provider: Anthropic
Mode: chat
Aliases: sonnet, sonnet-4
Context Window: 200,000 tokens
Max Output: 16,384 tokens
Capabilities: chat, vision, function_calling, code, nlu
Pricing: $3.00 / $15.00 per million tokens (in/out)
Source: api + litellm
Discovered: 2026-02-26T10:30:00Z

Example: `kosha providers`

Provider     Status          Models  Credential Source
──────────── ─────────────── ─────── ─────────────────
anthropic    ✓ authenticated     12  env (ANTHROPIC_API_KEY)
openai       ✓ authenticated      8  cli (~/.config/github-copilot)
google       ✓ authenticated     15  env (GOOGLE_API_KEY)
ollama       ✓ local              6  none (local)
openrouter   ✗ no credentials     0  —

Example: `kosha roles --role embeddings`

Provider     Model                                   Mode       Roles
──────────── ─────────────────────────────────────── ────────── ───────────────────────────────
openai       text-embedding-3-small                  embedding  embedding
google       text-embedding-004                      embedding  embedding

Example: `kosha cheapest --role image --limit 2`

Provider     Model                                   Mode       Metric      Score    $/M in  $/M out
──────────── ─────────────────────────────────────── ────────── ──────── ────────── ──────── ────────
openrouter   openai/dall-e-3                         image      blended     $8.00    $8.00    $0.00
openrouter   black-forest-labs/flux-1-schnell        image      blended    $10.00   $10.00    $0.00

Example: `kosha capabilities`

Capability           Models
──────────────────── ──────
chat                     38
vision                   12
function_calling         10
code                      8
embedding                 6
image_generation          4
audio                     2

Example: `kosha capable vision --limit 3`

Provider     Model                              Mode       Context    $/M in   $/M out
──────────── ────────────────────────────────── ────────── ────────── ──────── ────────
anthropic    claude-sonnet-4-20250514           chat       200K       $3.00    $15.00
openai       gpt-4o                             chat       128K       $2.50    $10.00
google       gemini-2.5-pro-preview-05-06       chat       1M         $1.25    $10.00

Example: `kosha routes gpt-4o`

Model: gpt-4o
Preferred provider: openai

Provider     Origin     Base URL                     Direct  Preferred
──────────── ────────── ──────────────────────────── ─────── ─────────
openai       openai     https://api.openai.com       ✓       ✓
openrouter   openai     https://openrouter.ai        —       —

HTTP API Reference

Start the server:

kosha serve --port 3000
# or
PORT=3000 node dist/server.js

Endpoints

`GET /api/models`

List all discovered models. Supports query parameters for filtering.

Parameter	Type	Description
`provider`	string	Filter by provider ID (e.g., `anthropic`)
`mode`	string	Filter by mode (`chat`, `embedding`, etc.)
`capability`	string	Filter by capability (`vision`, etc.)

curl "http://localhost:3000/api/models?provider=anthropic&mode=chat"

{
  "models": [ ... ],
  "count": 12
}

`GET /api/models/cheapest`

Rank the cheapest eligible models for a role/capability.
Useful for assistant routers asking questions like: "For embeddings, what is cheapest right now?"

Parameter	Type	Description
`role`	string	Flexible role alias (e.g. `embeddings`, `image`, `tool_use`)
`capability`	string	Explicit capability (e.g. `vision`, `embedding`)
`mode`	string	Restrict by mode (`chat`, `embedding`, `image`, `audio`, `moderation`)
`provider`	string	Restrict by serving provider
`originProvider`	string	Restrict by origin model provider
`limit`	number	Max ranked matches (default `5`)
`priceMetric`	string	`input`, `output`, or `blended`
`inputWeight`	number	Input weight for `blended` scoring
`outputWeight`	number	Output weight for `blended` scoring
`includeUnpriced`	bool	Include unpriced models after ranked matches

curl "http://localhost:3000/api/models/cheapest?role=embeddings&limit=3"

{
  "matches": [
    {
      "model": { "id": "text-embedding-3-small", "provider": "openai", "...": "..." },
      "score": 0.02,
      "priceMetric": "input"
    }
  ],
  "candidates": 6,
  "pricedCandidates": 4,
  "skippedNoPricing": 2,
  "priceMetric": "input",
  "missingCredentials": [
    {
      "providerId": "google",
      "providerName": "Google",
      "envVars": ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
      "message": "Set GOOGLE_API_KEY or GEMINI_API_KEY to enable Google model discovery."
    }
  ],
  "cheapest": {
    "model": { "id": "text-embedding-3-small", "provider": "openai", "...": "..." },
    "score": 0.02,
    "priceMetric": "input"
  }
}
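
To make the response fields concrete, here is a rough sketch of how such a ranking could be computed. It is not kosha's implementation: the `blended` formula (a weighted sum with both weights defaulting to 1) is an assumption, and `priceScore` and `rankCheapest` are hypothetical names. The `inputPerMillion` and `outputPerMillion` fields follow the pricing object shown in the Quick Start.

```typescript
// Hypothetical sketch of price ranking; the blended formula is an assumption.
type PriceMetric = "input" | "output" | "blended";
interface Pricing { inputPerMillion: number; outputPerMillion: number }
interface PricedModel { id: string; pricing?: Pricing }

function priceScore(p: Pricing, metric: PriceMetric, inputWeight = 1, outputWeight = 1): number {
  if (metric === "input") return p.inputPerMillion;
  if (metric === "output") return p.outputPerMillion;
  // Assumed blended metric: weighted sum of input and output prices.
  return p.inputPerMillion * inputWeight + p.outputPerMillion * outputWeight;
}

function rankCheapest(models: PricedModel[], metric: PriceMetric, limit = 5) {
  // Models without pricing cannot be ranked, so they are counted and skipped.
  const priced = models.filter(
    (m): m is PricedModel & { pricing: Pricing } => m.pricing !== undefined,
  );
  const matches = priced
    .map((m) => ({ model: m, score: priceScore(m.pricing, metric), priceMetric: metric }))
    .sort((a, b) => a.score - b.score)
    .slice(0, limit);
  return {
    matches,
    candidates: models.length,
    pricedCandidates: priced.length,
    skippedNoPricing: models.length - priced.length,
    priceMetric: metric,
    cheapest: matches[0],
  };
}
```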

`GET /api/roles`

Return a provider -> model -> roles matrix.

curl "http://localhost:3000/api/roles?role=image"

{
  "providers": [
    {
      "id": "openrouter",
      "name": "OpenRouter",
      "authenticated": false,
      "credentialSource": "none",
      "models": [
        {
          "id": "openai/dall-e-3",
          "mode": "image",
          "roles": ["image", "image_generation"]
        }
      ]
    }
  ],
  "count": 1,
  "modelCount": 12,
  "missingCredentials": []
}

`GET /api/models/:idOrAlias`

Get a single model by its full ID or alias, including resolved provider URL and version hint.

curl http://localhost:3000/api/models/sonnet

{
  "id": "claude-sonnet-4-20250514",
  "provider": "anthropic",
  "originProvider": "anthropic",
  "baseUrl": "https://api.anthropic.com",
  "version": "20250514",
  "resolvedOriginProvider": "anthropic",
  "isDirectProvider": true
}

`GET /api/models/:idOrAlias/routes`

Return all serving routes for one underlying model with direct/preferred flags.

curl http://localhost:3000/api/models/gpt-5.3-codex/routes

{
  "model": "gpt-5.3-codex",
  "preferredProvider": "openai",
  "routes": [
    {
      "provider": "openai",
      "originProvider": "openai",
      "baseUrl": "https://api.openai.com",
      "version": "5.3",
      "isDirect": true,
      "isPreferred": true,
      "model": { "...": "..." }
    },
    {
      "provider": "openrouter",
      "originProvider": "openai",
      "baseUrl": "https://openrouter.ai",
      "version": "5.3",
      "isDirect": false,
      "isPreferred": false,
      "model": { "...": "..." }
    }
  ]
}

`GET /api/providers`

List all providers with summary info.

curl http://localhost:3000/api/providers

{
  "providers": [
    {
      "id": "anthropic",
      "name": "Anthropic",
      "baseUrl": "https://api.anthropic.com",
      "authenticated": true,
      "credentialSource": "env",
      "modelCount": 12,
      "lastRefreshed": 1740000000000,
      "missingCredentialPrompt": null,
      "credentialEnvVars": []
    }
  ],
  "count": 4,
  "missingCredentials": []
}

`GET /api/providers/:id`

Get a single provider with all its models.

curl http://localhost:3000/api/providers/anthropic

`POST /api/refresh`

Trigger re-discovery of all providers, or a specific one.

# Refresh all
curl -X POST http://localhost:3000/api/refresh

# Refresh a specific provider
curl -X POST http://localhost:3000/api/refresh -H "Content-Type: application/json" -d '{"provider": "anthropic"}'

`GET /api/resolve/:alias`

Resolve a model alias to its canonical ID.

curl http://localhost:3000/api/resolve/sonnet

{
  "alias": "sonnet",
  "resolved": "claude-sonnet-4-20250514",
  "isAlias": true
}

`GET /health`

Health check endpoint.

curl http://localhost:3000/health

{
  "status": "ok",
  "models": 42,
  "providers": 4,
  "uptime": 123.45
}

Supported Providers

Provider	Discovery	Credential Sources
Anthropic	API (`/v1/models`)	`ANTHROPIC_API_KEY`, Claude CLI, Codex CLI
OpenAI	API (`/v1/models`)	`OPENAI_API_KEY`, GitHub Copilot tokens
Google	API (`/v1beta/models`)	`GOOGLE_API_KEY`, `GEMINI_API_KEY`, Gemini CLI, gcloud
AWS Bedrock	SDK → CLI → static fallback	`AWS_ACCESS_KEY_ID`+`AWS_SECRET_ACCESS_KEY`, `~/.aws/credentials`, SSO, IAM roles
Vertex AI	API + gcloud	`GOOGLE_APPLICATION_CREDENTIALS`, gcloud ADC, `gcloud auth print-access-token`
Ollama	Local API (`/api/tags`)	None needed (local)
OpenRouter	API (`/api/v1/models`)	`OPENROUTER_API_KEY` (optional)

Model Aliases

Built-in aliases for common models:

Alias	Resolves To
`sonnet`	`claude-sonnet-4-20250514`
`opus`	`claude-opus-4-20250918`
`haiku`	`claude-haiku-4-5-20251001`
`gpt4o`	`gpt-4o`
`gemini-pro`	`gemini-2.5-pro-preview-05-06`
`embed-small`	`text-embedding-3-small`
`nomic`	`nomic-embed-text`

Custom aliases:

import { ModelRegistry } from "kosha-discovery";
const registry = new ModelRegistry({ aliases: { "fast": "claude-haiku-4-5-20251001" } });
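
Conceptually, alias resolution is a table lookup that falls through to the original name. The sketch below illustrates the idea; it assumes that custom aliases override built-in ones and that unknown names pass through unchanged, and `resolveAlias` is a hypothetical function whose return shape mirrors the `/api/resolve/:alias` response.

```typescript
// Hypothetical sketch of alias resolution; not kosha's actual implementation.
const builtInAliases: Record<string, string> = {
  sonnet: "claude-sonnet-4-20250514",
  haiku: "claude-haiku-4-5-20251001",
  gpt4o: "gpt-4o",
};

interface Resolution { alias: string; resolved: string; isAlias: boolean }

// Custom aliases are assumed to take precedence over built-in ones.
function resolveAlias(name: string, custom: Record<string, string> = {}): Resolution {
  const table: Record<string, string> = { ...builtInAliases, ...custom };
  // hasOwnProperty avoids matching inherited keys such as "constructor".
  const isAlias = Object.prototype.hasOwnProperty.call(table, name);
  return { alias: name, resolved: isAlias ? table[name] : name, isAlias };
}

const sonnetResolution = resolveAlias("sonnet");
const fastResolution = resolveAlias("fast", { fast: "claude-haiku-4-5-20251001" });
const passthrough = resolveAlias("gpt-4o");
```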

Configuration

const registry = new ModelRegistry({
  cacheDir: "~/.kosha",           // Cache directory (default: ~/.kosha)
  cacheTtlMs: 86400000,           // Cache TTL: 24 hours (default)
  providers: {
    anthropic: { enabled: true, apiKey: "sk-..." },
    ollama: { enabled: true, baseUrl: "http://localhost:11434" },
    openrouter: { enabled: false },
  },
  aliases: {
    "my-model": "claude-sonnet-4-20250514",
  },
});
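
The `cacheTtlMs` setting implies a freshness check along these lines. This is a minimal sketch under assumptions: `isCacheFresh` is a hypothetical helper, and timestamps are taken to be epoch milliseconds, like the `lastRefreshed` field in the providers response.

```typescript
// Hypothetical freshness check; kosha's real cache logic may differ.
const DEFAULT_CACHE_TTL_MS = 86400000; // 24 hours, the documented default

function isCacheFresh(lastRefreshed: number, now: number, ttlMs: number = DEFAULT_CACHE_TTL_MS): boolean {
  // The cache is fresh while less than one TTL has elapsed since the last refresh.
  return now - lastRefreshed < ttlMs;
}

const refreshedAt = 1740000000000;
const oneHourLater = refreshedAt + 3600000;
const stillFresh = isCacheFresh(refreshedAt, oneHourLater); // true with the 24-hour default
const staleWithShortTtl = isCacheFresh(refreshedAt, oneHourLater, 3600000); // false: exactly one TTL has elapsed
```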

Pricing Enrichment

Model pricing is sourced from litellm's model pricing database -- a community-maintained dataset covering 300+ models. Kosha fetches this data and enriches discovered models with:

Input/output token pricing
Context window sizes
Cache read/write costs
Capability flags (vision, function calling, etc.)
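
As an illustration of the enrichment step, the sketch below maps one litellm entry onto per-million pricing. The `LitellmEntry` field names follow litellm's `model_prices_and_context_window.json`; the `Enrichment` shape and the `enrich` function are hypothetical and only approximate what kosha stores.

```typescript
// Hypothetical sketch: maps one litellm entry onto fields like those in this README.
interface LitellmEntry {
  input_cost_per_token?: number;
  output_cost_per_token?: number;
  max_input_tokens?: number;
  supports_vision?: boolean;
  supports_function_calling?: boolean;
}
interface Enrichment {
  pricing?: { inputPerMillion: number; outputPerMillion: number };
  contextWindow?: number;
  capabilities: string[];
}

// litellm stores per-token prices; convert to per-million and round off float noise.
const perMillion = (perToken: number): number => Number((perToken * 1000000).toFixed(6));

function enrich(entry: LitellmEntry): Enrichment {
  const capabilities: string[] = [];
  if (entry.supports_vision) capabilities.push("vision");
  if (entry.supports_function_calling) capabilities.push("function_calling");
  const result: Enrichment = { capabilities };
  if (entry.input_cost_per_token !== undefined) {
    result.pricing = {
      inputPerMillion: perMillion(entry.input_cost_per_token),
      // A missing output price is treated as 0 here, which is an assumption.
      outputPerMillion: perMillion(entry.output_cost_per_token ?? 0),
    };
  }
  if (entry.max_input_tokens !== undefined) result.contextWindow = entry.max_input_tokens;
  return result;
}

const sonnetEnrichment = enrich({
  input_cost_per_token: 0.000003,
  output_cost_per_token: 0.000015,
  max_input_tokens: 200000,
  supports_vision: true,
  supports_function_calling: true,
});
```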

Architecture

┌───────────────────────────────────────────────┐
│               Your Application                │
│ import { createKosha } from "kosha-discovery" │
└────────────────┬──────────────────────────────┘
                 │
┌────────────────▼──────────────────────────────┐
│                 ModelRegistry                 │
│ models() · providerRoles() · cheapestModels() │
└───┬────────────┬────────────────┬─────────────┘
    │            │                │
┌───▼──┐  ┌─────▼─────┐  ┌──────▼──────┐
│Alias │  │ Discovery  │  │ Enrichment  │
│System│  │ Layer      │  │ Layer       │
└──────┘  └─────┬──────┘  └──────┬──────┘
          ┌─────┼──────┐         │
          ▼     ▼      ▼         ▼
       Anthropic OpenAI Google  litellm
       Bedrock  Vertex  Ollama   JSON
       OpenRouter

Project Structure

src/
  types.ts              Type definitions (ModelCard, ProviderInfo, etc.)
  registry.ts           ModelRegistry class — core orchestrator
  cli.ts                CLI entry point (process.argv parser)
  server.ts             HTTP API server (Hono)
  discovery/
    base.ts             Abstract base discoverer (retry + exponential backoff)
    anthropic.ts        Anthropic API discoverer
    openai.ts           OpenAI API discoverer
    google.ts           Google Gemini API discoverer
    bedrock.ts          AWS Bedrock discoverer (SDK → CLI → static)
    vertex.ts           Vertex AI discoverer (API + gcloud)
    ollama.ts           Ollama local discoverer
    openrouter.ts       OpenRouter API discoverer
    index.ts            Discovery orchestrator
  credentials/
    resolver.ts         Credential resolver (env, CLI, config)
    index.ts            Credential resolver entry
  enrichment/
    litellm.ts          litellm pricing enrichment
    index.ts            Enrichment entry
bin/
  kosha.js              CLI bin entry point
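
The "retry + exponential backoff" behaviour noted for `discovery/base.ts` generally looks like the sketch below. The function names, retry count, base delay, and cap are all assumptions for illustration; the real base discoverer may use different values or add jitter.

```typescript
// Hypothetical sketch of retry with exponential backoff; all parameters are assumptions.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  // attempt 0 waits baseMs, attempt 1 waits 2 * baseMs, and so on, capped at capMs.
  return Math.min(capMs, baseMs * 2 ** attempt);
}

async function withRetry<T>(task: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      // Give up once the retry budget is spent and surface the last error.
      if (attempt >= maxRetries) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

A discoverer could wrap its model-list request in `withRetry` so that transient network failures do not abort discovery for that provider.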

Credits & Inspiration

litellm -- Community-maintained model pricing database. Kosha uses their model_prices_and_context_window.json for enrichment.
openrouter -- Model aggregation API providing rich model metadata.
ollama -- Local LLM runtime with model discovery API.
chitragupta -- Autonomous AI Agent Platform whose provider registry patterns inspired kosha's design.
takumi -- AI coding agent TUI whose model routing needs drove kosha's creation.

What "Kosha" Means

Kosha comes from Sanskrit and is commonly used to mean a container, treasury, or layered sheath of knowledge.

In this project, Kosha is a standalone model-discovery utility that can be used by any AI system or developer tooling stack (CLIs, agents, apps, or services), not only Kaala-brahma projects.

License

MIT
