kosha-discovery 0.3.1 (MIT)
AI Model & Provider Discovery Registry -- kosha (कोश) treasury of models

Package Exports

  • kosha-discovery
  • kosha-discovery/cli
  • kosha-discovery/server

Kosha — AI Model Discovery

kosha-discovery — कोश

AI Model & Provider Discovery Registry

Kosha (कोश — treasury/repository) automatically discovers AI models across providers, resolves credentials from CLI tools and environment variables, enriches models with pricing data, and exposes the catalog via library, CLI, and HTTP API.

Why

AI applications hardcode model IDs, pricing, and provider configs. When providers add models or change pricing, every app breaks. Kosha solves this:

  • Dynamic discovery — fetches real model lists from provider APIs
  • Smart credentials — finds API keys from env vars, CLI tools (Claude, Copilot, Gemini CLI), and config files
  • Pricing enrichment — fills in costs and context windows from litellm's community-maintained dataset
  • Model aliases — sonnet → claude-sonnet-4-20250514, updated as models evolve
  • Role matrix — query provider -> model -> roles (chat, embedding, image_generation, etc.)
  • Capability discovery — explore all capabilities across the ecosystem, find models by capability
  • Multi-provider routing — see all provider routes for a model, with direct/preferred flags
  • Cheapest routing — rank cheapest eligible models for tasks like embeddings or image generation
  • Credential prompts — returns provider-specific API key hints when required credentials are missing
  • Local LLM scanning — detects Ollama models alongside cloud providers
  • Three access patterns — use as a library, CLI tool, or HTTP API

Install

npm install kosha-discovery
# or
pnpm add kosha-discovery

Getting Started — Provider Credentials

Kosha auto-discovers credentials from environment variables, CLI tool configs, and cloud auth files. Set up whichever providers you use:

Anthropic

# Option A: Environment variable
export ANTHROPIC_API_KEY=sk-ant-...

# Option B: Auto-detected from Claude CLI / Claude Code
# If you've run `claude` or `claude-code`, kosha reads the stored token from:
#   ~/.claude.json
#   ~/.config/claude/settings.json
#   ~/.claude/credentials.json

# Option C: Auto-detected from Codex CLI
#   ~/.codex/auth.json

OpenAI

# Option A: Environment variable
export OPENAI_API_KEY=sk-...

# Option B: Auto-detected from GitHub Copilot
# If you've authenticated with Copilot, kosha reads tokens from:
#   ~/.config/github-copilot/hosts.json (Linux/macOS)
#   %LOCALAPPDATA%/github-copilot/hosts.json (Windows)

Google (Gemini)

# Option A: Environment variable
export GOOGLE_API_KEY=AIza...
# or
export GEMINI_API_KEY=AIza...

# Option B: Auto-detected from Gemini CLI
#   ~/.gemini/oauth_creds.json

# Option C: gcloud Application Default Credentials
gcloud auth application-default login

AWS Bedrock

# Option A: Environment variables
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_DEFAULT_REGION=us-east-1   # optional, defaults to us-east-1

# Option B: AWS CLI configured profile
aws configure
# kosha reads ~/.aws/credentials [default] automatically

# Option C: Named profile
export AWS_PROFILE=my-profile

# Option D: SSO / IAM role
# kosha detects sso_start_url or role_arn in ~/.aws/config

# Optional: install the AWS SDK for live model listing (otherwise uses static fallback)
npm install @aws-sdk/client-bedrock

Google Vertex AI

# Option A: Service account JSON
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export GOOGLE_CLOUD_PROJECT=my-project

# Option B: gcloud Application Default Credentials
gcloud auth application-default login
# Project auto-detected from: GOOGLE_CLOUD_PROJECT, GCLOUD_PROJECT,
# or `gcloud config get-value project`

# Option C: gcloud access token (auto-detected via subprocess)
gcloud auth print-access-token

OpenRouter

# Optional — OpenRouter works without auth (rate-limited)
export OPENROUTER_API_KEY=sk-or-...

Ollama (Local)

# No credentials needed — auto-detected if running locally
# Default: http://localhost:11434
ollama serve

Config file (optional)

Instead of env vars, you can create ~/.kosharc.json (global) or kosha.config.json (project-level):

{
  "providers": {
    "anthropic": { "apiKey": "sk-ant-..." },
    "openai": { "apiKey": "sk-..." },
    "bedrock": { "enabled": true },
    "vertex": { "enabled": true },
    "openrouter": { "enabled": false }
  },
  "aliases": {
    "fast": "claude-haiku-4-5-20251001"
  },
  "cacheTtlMs": 3600000
}

Config priority: ~/.kosharc.json < kosha.config.json < programmatic config.
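This precedence can be sketched as a shallow merge where later sources win (a hypothetical helper for illustration, not the actual kosha-discovery internals):

```javascript
// Illustrative sketch of the config precedence: global rc file, then
// project config, then programmatic config. Later sources override
// earlier ones; the providers map is merged one level deep.
function mergeConfigs(globalRc, projectConfig, programmatic) {
  return {
    ...globalRc,
    ...projectConfig,
    ...programmatic,
    providers: {
      ...(globalRc.providers ?? {}),
      ...(projectConfig.providers ?? {}),
      ...(programmatic.providers ?? {}),
    },
  };
}

const merged = mergeConfigs(
  { cacheTtlMs: 3600000, providers: { openai: { apiKey: "sk-global" } } }, // ~/.kosharc.json
  { providers: { openai: { apiKey: "sk-project" } } },                     // kosha.config.json
  { cacheTtlMs: 60000 }                                                    // programmatic
);
// merged.cacheTtlMs === 60000 (programmatic wins)
// merged.providers.openai.apiKey === "sk-project" (project file beats global)
```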


Quick Start

Library

import { createKosha } from "kosha-discovery";

const kosha = await createKosha();

// List all models
const models = kosha.models();

// Filter by provider
const anthropicModels = kosha.models({ provider: "anthropic" });

// Get embedding models
const embeddings = kosha.models({ mode: "embedding" });

// Resolve alias
const model = kosha.model("sonnet"); // → full ModelCard for claude-sonnet-4-20250514

// Get pricing
console.log(model.pricing); // { inputPerMillion: 3, outputPerMillion: 15, ... }

// Role matrix for assistants (provider -> models -> roles)
const roles = kosha.providerRoles({ role: "embeddings" });

// Cheapest model ranking for a task
const cheapest = kosha.cheapestModels({ role: "image", limit: 3 });
console.log(cheapest.matches[0]);

CLI

# Discover all providers
kosha discover

# List models
kosha list
kosha list --provider anthropic
kosha list --origin anthropic
kosha list --mode embedding

# Search
kosha search gemini
kosha search claude --origin anthropic

# Model details
kosha model sonnet

# Role matrix
kosha roles
kosha roles --role embeddings

# Capabilities ecosystem view
kosha capabilities
kosha capable vision
kosha capable embeddings --limit 5

# Cheapest routing candidates
kosha cheapest --role embeddings
kosha cheapest --role image --limit 3

# All provider routes for a model
kosha routes gpt-4o

# Providers status
kosha providers

# Resolve alias
kosha resolve haiku

# Start API server
kosha serve --port 3000

HTTP API

kosha serve --port 3000
GET /api/models                    — All models
GET /api/models?provider=anthropic — Filter by provider
GET /api/models?mode=embedding     — Filter by mode
GET /api/models/cheapest           — Cheapest ranked models for a role/capability
GET /api/models/:idOrAlias         — Single model
GET /api/models/:idOrAlias/routes  — All provider routes for one model
GET /api/roles                     — Provider → model → roles matrix
GET /api/providers                 — All providers
GET /api/providers/:id             — Single provider
POST /api/refresh                  — Re-discover
GET /api/resolve/:alias            — Resolve alias
GET /health                        — Health check

Assistant Routing Flow

Kosha is designed to answer routing questions from assistants like Vaayu and Takumi:

  1. Ask for capabilities: call GET /api/roles?role=embeddings.
  2. Rank by cost: call GET /api/models/cheapest?role=embeddings.
  3. If missingCredentials is non-empty, prompt the user for one of the listed env vars.
  4. Route execution using the chosen provider/model pair.
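Steps 2 through 4 can be sketched as a small decision function over a response shaped like the GET /api/models/cheapest payload (field names taken from the API reference below; the function itself is illustrative, not part of the library):

```javascript
// Given a /api/models/cheapest-style response, either pick the top-ranked
// route or surface a credential prompt for the user.
function chooseRoute(cheapestResponse) {
  const { matches, missingCredentials } = cheapestResponse;
  if (matches.length > 0) {
    const best = matches[0];
    return { action: "route", provider: best.model.provider, model: best.model.id };
  }
  if (missingCredentials.length > 0) {
    const prompt = missingCredentials[0];
    return { action: "prompt", envVars: prompt.envVars, message: prompt.message };
  }
  return { action: "none" };
}

const decision = chooseRoute({
  matches: [
    { model: { id: "text-embedding-3-small", provider: "openai" }, score: 0.02, priceMetric: "input" },
  ],
  missingCredentials: [],
});
// decision → { action: "route", provider: "openai", model: "text-embedding-3-small" }
```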

Embeddings Quick Call

If your task is embeddings and you want the cheapest option:

kosha cheapest --role embeddings --price-metric input --limit 1

API equivalent:

curl "http://localhost:3000/api/models/cheapest?role=embeddings&priceMetric=input&limit=1"

Provider vs Origin

Kosha distinguishes:

  • provider: where you call the model (serving layer, e.g. openrouter)
  • originProvider: who built the model (e.g. openai)

Example:

provider: openrouter
id: openai/gpt-5.3-codex
originProvider: openai

If a direct OpenAI route exists, route metadata marks it as preferred so assistants can call openai directly instead of openrouter.
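An assistant consuming the routes list (as returned by GET /api/models/:idOrAlias/routes) might apply that preference like this, preferring the flagged route and falling back to any direct route, then to whatever is available (a sketch, not library code):

```javascript
// Pick a serving route: preferred flag first, then any direct route,
// then the first route as a last resort.
function pickRoute(routes) {
  return (
    routes.find((r) => r.isPreferred) ??
    routes.find((r) => r.isDirect) ??
    routes[0]
  );
}

const picked = pickRoute([
  { provider: "openai", originProvider: "openai", isDirect: true, isPreferred: true },
  { provider: "openrouter", originProvider: "openai", isDirect: false, isPreferred: false },
]);
// picked.provider === "openai"
```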

CLI Reference

USAGE
  kosha <command> [options]

COMMANDS
  discover                      Discover all providers and models
  list                          List all known models
    --provider <name>             Filter by serving-layer provider
    --origin <name>               Filter by origin/creator provider (e.g. anthropic)
    --mode <mode>                 Filter by mode (chat, embedding, image, audio)
    --capability <cap>            Filter by capability (vision, function_calling, etc.)
  search <query>                Search models by name/ID (fuzzy match)
    --origin <name>               Restrict search to a specific origin provider
  model <id|alias>              Show detailed info for one model
  roles                         Show provider -> model -> roles matrix
    --role <role>                 Filter by task role (e.g. embeddings, image, tool_use)
    --provider <name>             Filter by serving-layer provider
    --origin <name>               Filter by model creator provider
    --mode <mode>                 Filter by mode (chat, embedding, image, audio, moderation)
    --capability <cap>            Filter by capability tag
  capabilities (caps)           Show all capabilities across the ecosystem
    --provider <name>             Scope to one provider
  capable <capability>          List models with a given capability
    --provider <name>             Filter by serving-layer provider
    --origin <name>               Filter by origin/creator provider
    --mode <mode>                 Filter by mode (chat, embedding, image, audio)
    --limit <n>                   Maximum models to show
  cheapest                      Find cheapest eligible models
    --role <role>                 Task role, e.g. embeddings or image
    --capability <cap>            Capability filter (vision, embedding, function_calling)
    --mode <mode>                 Mode filter
    --limit <n>                   Maximum matches to return (default 5)
    --price-metric <metric>       input | output | blended
    --input-weight <n>            Weight for blended metric input price
    --output-weight <n>           Weight for blended metric output price
    --include-unpriced            Include unpriced models after ranked matches
  routes <id|alias>             Show all provider routes for a model
  providers                     List all providers and their status
  resolve <alias>               Resolve an alias to canonical model ID
  refresh                       Force re-discover all providers (bypass cache)
  serve [--port 3000]           Start HTTP API server

OPTIONS
  --json                          Output as JSON (works with any command)
  --help                          Show this help message
  --version                       Show version

Example: kosha list

Provider     Model                              Mode       Context    $/M in   $/M out
──────────── ────────────────────────────────── ────────── ────────── ──────── ────────
anthropic    claude-opus-4-20250918             chat       200K       $15.00   $75.00
anthropic    claude-sonnet-4-20250514           chat       200K       $3.00    $15.00
anthropic    claude-haiku-4-5-20251001          chat       200K       $0.80    $4.00
openai       gpt-4o                             chat       128K       $2.50    $10.00
openai       text-embedding-3-small             embedding  8K         $0.02    —
google       gemini-2.5-pro-preview-05-06       chat       1M         $1.25    $10.00
ollama       qwen3:8b                           chat       —          free     free
───────────────────────────────────────────────────────────────────────────────────────
42 models from 4 providers

Example: kosha model sonnet

Model: claude-sonnet-4-20250514
Provider: Anthropic
Mode: chat
Aliases: sonnet, sonnet-4
Context Window: 200,000 tokens
Max Output: 16,384 tokens
Capabilities: chat, vision, function_calling, code, nlu
Pricing: $3.00 / $15.00 per million tokens (in/out)
Source: api + litellm
Discovered: 2026-02-26T10:30:00Z

Example: kosha providers

Provider     Status          Models  Credential Source
──────────── ─────────────── ─────── ─────────────────
anthropic    ✓ authenticated     12  env (ANTHROPIC_API_KEY)
openai       ✓ authenticated      8  cli (~/.config/github-copilot)
google       ✓ authenticated     15  env (GOOGLE_API_KEY)
ollama       ✓ local              6  none (local)
openrouter   ✗ no credentials     0  —

Example: kosha roles --role embeddings

Provider     Model                                   Mode       Roles
──────────── ─────────────────────────────────────── ────────── ───────────────────────────────
openai       text-embedding-3-small                  embedding  embedding
google       text-embedding-004                      embedding  embedding

Example: kosha cheapest --role image --limit 2

Provider     Model                                   Mode       Metric      Score    $/M in  $/M out
──────────── ─────────────────────────────────────── ────────── ──────── ────────── ──────── ────────
openrouter   openai/dall-e-3                         image      blended     $8.00    $8.00    $0.00
openrouter   black-forest-labs/flux-1-schnell        image      blended    $10.00   $10.00    $0.00

Example: kosha capabilities

Capability           Models
──────────────────── ──────
chat                     38
vision                   12
function_calling         10
code                      8
embedding                 6
image_generation          4
audio                     2

Example: kosha capable vision --limit 3

Provider     Model                              Mode       Context    $/M in   $/M out
──────────── ────────────────────────────────── ────────── ────────── ──────── ────────
anthropic    claude-sonnet-4-20250514           chat       200K       $3.00    $15.00
openai       gpt-4o                             chat       128K       $2.50    $10.00
google       gemini-2.5-pro-preview-05-06       chat       1M         $1.25    $10.00

Example: kosha routes gpt-4o

Model: gpt-4o
Preferred provider: openai

Provider     Origin     Base URL                     Direct  Preferred
──────────── ────────── ──────────────────────────── ─────── ─────────
openai       openai     https://api.openai.com       ✓       ✓
openrouter   openai     https://openrouter.ai        —       —

HTTP API Reference

Start the server:

kosha serve --port 3000
# or
PORT=3000 node dist/server.js

Endpoints

GET /api/models

List all discovered models. Supports query parameters for filtering.

Parameter    Type    Description
───────────  ──────  ───────────────────────────────────────
provider     string  Filter by provider ID (e.g., anthropic)
mode         string  Filter by mode (chat, embedding, etc.)
capability   string  Filter by capability (vision, etc.)

curl "http://localhost:3000/api/models?provider=anthropic&mode=chat"
{
  "models": [ ... ],
  "count": 12
}

GET /api/models/cheapest

Rank the cheapest eligible models for a role/capability.
Useful for assistant routers asking questions like: "For embeddings, what is cheapest right now?"

Parameter        Type    Description
───────────────  ──────  ──────────────────────────────────────────────────────
role             string  Flexible role alias (e.g. embeddings, image, tool_use)
capability       string  Explicit capability (e.g. vision, embedding)
mode             string  Restrict by mode (chat, embedding, image, audio, moderation)
provider         string  Restrict by serving provider
originProvider   string  Restrict by origin model provider
limit            number  Max ranked matches (default 5)
priceMetric      string  input, output, or blended
inputWeight      number  Input weight for blended scoring
outputWeight     number  Output weight for blended scoring
includeUnpriced  bool    Include unpriced models after ranked matches

curl "http://localhost:3000/api/models/cheapest?role=embeddings&limit=3"
{
  "matches": [
    {
      "model": { "id": "text-embedding-3-small", "provider": "openai", "...": "..." },
      "score": 0.02,
      "priceMetric": "input"
    }
  ],
  "candidates": 6,
  "pricedCandidates": 4,
  "skippedNoPricing": 2,
  "priceMetric": "input",
  "missingCredentials": [
    {
      "providerId": "google",
      "providerName": "Google",
      "envVars": ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
      "message": "Set GOOGLE_API_KEY or GEMINI_API_KEY to enable Google model discovery."
    }
  ],
  "cheapest": {
    "model": { "id": "text-embedding-3-small", "provider": "openai", "...": "..." },
    "score": 0.02,
    "priceMetric": "input"
  }
}
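One plausible reading of the blended metric is a weighted average of the input and output per-million prices; the exact formula is not specified here, so treat this as an assumption for illustration only:

```javascript
// Assumed blended score: weighted average of input/output prices per
// million tokens, using inputWeight/outputWeight as the weights.
function blendedScore(pricing, inputWeight = 1, outputWeight = 1) {
  const total = inputWeight + outputWeight;
  return (
    (inputWeight * pricing.inputPerMillion +
      outputWeight * pricing.outputPerMillion) /
    total
  );
}

// E.g. a model priced $3 in / $15 out, weighting input 3:1:
blendedScore({ inputPerMillion: 3, outputPerMillion: 15 }, 3, 1);
// → 6
```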

GET /api/roles

Return a provider -> model -> roles matrix.

curl "http://localhost:3000/api/roles?role=image"
{
  "providers": [
    {
      "id": "openrouter",
      "name": "OpenRouter",
      "authenticated": false,
      "credentialSource": "none",
      "models": [
        {
          "id": "openai/dall-e-3",
          "mode": "image",
          "roles": ["image", "image_generation"]
        }
      ]
    }
  ],
  "count": 1,
  "modelCount": 12,
  "missingCredentials": []
}

GET /api/models/:idOrAlias

Get a single model by its full ID or alias, including resolved provider URL and version hint.

curl http://localhost:3000/api/models/sonnet
{
  "id": "claude-sonnet-4-20250514",
  "provider": "anthropic",
  "originProvider": "anthropic",
  "baseUrl": "https://api.anthropic.com",
  "version": "20250514",
  "resolvedOriginProvider": "anthropic",
  "isDirectProvider": true
}

GET /api/models/:idOrAlias/routes

Return all serving routes for one underlying model with direct/preferred flags.

curl http://localhost:3000/api/models/gpt-5.3-codex/routes
{
  "model": "gpt-5.3-codex",
  "preferredProvider": "openai",
  "routes": [
    {
      "provider": "openai",
      "originProvider": "openai",
      "baseUrl": "https://api.openai.com",
      "version": "5.3",
      "isDirect": true,
      "isPreferred": true,
      "model": { "...": "..." }
    },
    {
      "provider": "openrouter",
      "originProvider": "openai",
      "baseUrl": "https://openrouter.ai",
      "version": "5.3",
      "isDirect": false,
      "isPreferred": false,
      "model": { "...": "..." }
    }
  ]
}

GET /api/providers

List all providers with summary info.

curl http://localhost:3000/api/providers
{
  "providers": [
    {
      "id": "anthropic",
      "name": "Anthropic",
      "baseUrl": "https://api.anthropic.com",
      "authenticated": true,
      "credentialSource": "env",
      "modelCount": 12,
      "lastRefreshed": 1740000000000,
      "missingCredentialPrompt": null,
      "credentialEnvVars": []
    }
  ],
  "count": 4,
  "missingCredentials": []
}

GET /api/providers/:id

Get a single provider with all its models.

curl http://localhost:3000/api/providers/anthropic

POST /api/refresh

Trigger re-discovery of all providers, or a specific one.

# Refresh all
curl -X POST http://localhost:3000/api/refresh

# Refresh a specific provider
curl -X POST http://localhost:3000/api/refresh -H "Content-Type: application/json" -d '{"provider": "anthropic"}'

GET /api/resolve/:alias

Resolve a model alias to its canonical ID.

curl http://localhost:3000/api/resolve/sonnet
{
  "alias": "sonnet",
  "resolved": "claude-sonnet-4-20250514",
  "isAlias": true
}

GET /health

Health check endpoint.

curl http://localhost:3000/health
{
  "status": "ok",
  "models": 42,
  "providers": 4,
  "uptime": 123.45
}

Supported Providers

Provider     Discovery                     Credential Sources
───────────  ────────────────────────────  ──────────────────────────────────────────────
Anthropic    API (/v1/models)              ANTHROPIC_API_KEY, Claude CLI, Codex CLI
OpenAI       API (/v1/models)              OPENAI_API_KEY, GitHub Copilot tokens
Google       API (/v1beta/models)          GOOGLE_API_KEY, GEMINI_API_KEY, Gemini CLI, gcloud
AWS Bedrock  SDK → CLI → static fallback   AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY, ~/.aws/credentials, SSO, IAM roles
Vertex AI    API + gcloud                  GOOGLE_APPLICATION_CREDENTIALS, gcloud ADC, gcloud auth print-access-token
Ollama       Local API (/api/tags)         None needed (local)
OpenRouter   API (/api/v1/models)          OPENROUTER_API_KEY (optional)

Model Aliases

Built-in aliases for common models:

Alias        Resolves To
───────────  ────────────────────────────
sonnet       claude-sonnet-4-20250514
opus         claude-opus-4-20250918
haiku        claude-haiku-4-5-20251001
gpt4o        gpt-4o
gemini-pro   gemini-2.5-pro-preview-05-06
embed-small  text-embedding-3-small
nomic        nomic-embed-text

Custom aliases:

import { ModelRegistry } from "kosha-discovery";
const registry = new ModelRegistry({ aliases: { "fast": "claude-haiku-4-5-20251001" } });

Configuration

const registry = new ModelRegistry({
  cacheDir: "~/.kosha",           // Cache directory (default: ~/.kosha)
  cacheTtlMs: 86400000,           // Cache TTL: 24 hours (default)
  providers: {
    anthropic: { enabled: true, apiKey: "sk-..." },
    ollama: { enabled: true, baseUrl: "http://localhost:11434" },
    openrouter: { enabled: false },
  },
  aliases: {
    "my-model": "claude-sonnet-4-20250514",
  },
});

Pricing Enrichment

Model pricing is sourced from litellm's model pricing database -- a community-maintained dataset covering 300+ models. Kosha fetches this data and enriches discovered models with:

  • Input/output token pricing
  • Context window sizes
  • Cache read/write costs
  • Capability flags (vision, function calling, etc.)
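The enrichment step amounts to filling gaps on a discovered model from the matching litellm entry. A sketch under assumed shapes (litellm's per-token cost fields are real, but the merge logic and ModelCard fields here are illustrative):

```javascript
// Fill missing pricing/context fields on a discovered model from a
// litellm-style entry. litellm prices are per token, so scale to
// per-million for the ModelCard-style pricing shown in this README.
function enrich(model, litellmEntry) {
  if (!litellmEntry) return model; // no match, leave the model untouched
  return {
    ...model,
    pricing: model.pricing ?? {
      inputPerMillion: litellmEntry.input_cost_per_token * 1e6,
      outputPerMillion: litellmEntry.output_cost_per_token * 1e6,
    },
    contextWindow: model.contextWindow ?? litellmEntry.max_input_tokens,
  };
}

const enriched = enrich(
  { id: "claude-sonnet-4-20250514", provider: "anthropic" },
  { input_cost_per_token: 3e-6, output_cost_per_token: 15e-6, max_input_tokens: 200000 }
);
```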

Architecture

┌─────────────────────────────────────────┐
│          Your Application               │
│  import { createKosha } from "kosha"    │
└────────────────┬────────────────────────┘
                 │
┌────────────────▼────────────────────────┐
│            ModelRegistry                │
│ models() · providerRoles() · cheapestModels() │
└───┬────────────┬────────────────┬───────┘
    │            │                │
┌───▼──┐  ┌─────▼─────┐  ┌──────▼──────┐
│Alias │  │ Discovery  │  │ Enrichment  │
│System│  │ Layer      │  │ Layer       │
└──────┘  └─────┬──────┘  └──────┬──────┘
          ┌─────┼──────┐         │
          ▼     ▼      ▼         ▼
       Anthropic OpenAI Google  litellm
       Bedrock  Vertex  Ollama   JSON
       OpenRouter

Project Structure

src/
  types.ts              Type definitions (ModelCard, ProviderInfo, etc.)
  registry.ts           ModelRegistry class — core orchestrator
  cli.ts                CLI entry point (process.argv parser)
  server.ts             HTTP API server (Hono)
  discovery/
    base.ts             Abstract base discoverer (retry + exponential backoff)
    anthropic.ts        Anthropic API discoverer
    openai.ts           OpenAI API discoverer
    google.ts           Google Gemini API discoverer
    bedrock.ts          AWS Bedrock discoverer (SDK → CLI → static)
    vertex.ts           Vertex AI discoverer (API + gcloud)
    ollama.ts           Ollama local discoverer
    openrouter.ts       OpenRouter API discoverer
    index.ts            Discovery orchestrator
  credentials/
    resolver.ts         Credential resolver (env, CLI, config)
    index.ts            Credential resolver entry
  enrichment/
    litellm.ts          litellm pricing enrichment
    index.ts            Enrichment entry
bin/
  kosha.js              CLI bin entry point

Credits & Inspiration

  • litellm -- Community-maintained model pricing database. Kosha uses their model_prices_and_context_window.json for enrichment.
  • openrouter -- Model aggregation API providing rich model metadata.
  • ollama -- Local LLM runtime with model discovery API.
  • chitragupta -- Autonomous AI Agent Platform whose provider registry patterns inspired kosha's design.
  • takumi -- AI coding agent TUI whose model routing needs drove kosha's creation.

What "Kosha" Means

Kosha comes from Sanskrit and is commonly used to mean a container, treasury, or layered sheath of knowledge.

In this project, Kosha is a standalone model-discovery utility that can be used by any AI system or developer tooling stack (CLIs, agents, apps, or services), not only Kaala-brahma projects.

License

MIT