JSPM

  • Downloads 20
  • License ISC

A unified interface and OpenAI-compatible server for multiple LLM providers with automatic fallback. It supports providers such as OpenRouter and Groq, so your AI applications keep working when a single provider fails.

Package Exports

  • unified-ai-router
  • unified-ai-router/main.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (unified-ai-router) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

🚀 Unified AI Router

The OpenAI-Compatible API Server & SDK for Reliable AI Applications

Production-ready Express server and Node.js library with multi-provider AI routing, automatic fallback, and circuit breakers


🎯 Why Unified AI Router?

Building reliable AI applications shouldn't require choosing between providers or managing complex fallback logic. Unified AI Router eliminates the complexity of multi-provider AI integration by providing:

  • 🔄 Automatic Failover: If one provider fails, seamlessly switches to the next
  • 🛡️ Circuit Breaker Protection: Prevents cascading failures across your infrastructure
  • ⚡ OpenAI Compatibility: Drop-in replacement for any OpenAI-compatible client
  • 🌐 Multi-Provider Support: Works with 10+ AI providers and any OpenAI-compatible server
  • 🚀 Production Server: Ready-to-deploy OpenAI-compatible API server with built-in reliability
  • 📚 Library Component: Core AIRouter library for direct integration in your applications

⚡ Quick Start

Get your first AI response in under 5 minutes:

📦 1. Installation

git clone https://github.com/mlibre/Unified-AI-Router.git
cd Unified-AI-Router
npm install

# Or install from npm (for SDK usage)
npm install unified-ai-router

⚙️ 2. Quick Configuration

# Copy environment template
cp .env.example .env

# Edit .env and add at least one API key:
# OPENROUTER_API_KEY=...

# Edit provider.js
# The server uses provider.js to decide which providers to try, and in what order

🚀 3. Start Using the Server

npm start

# Test that it works. The "model" value is only a placeholder:
# the model actually used is managed by provider.js.
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}],
    "model": "no_need"
  }'

📚 4. Library Usage

If you prefer using the library directly in your code:

const AIRouter = require("unified-ai-router");

const providers = [
  {
    name: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-4",
    apiUrl: "https://api.openai.com/v1"
  },
  {
    name: "openrouter", 
    apiKey: process.env.OPENROUTER_API_KEY,
    model: "xiaomi/mimo-v2-flash:free",
    apiUrl: "https://openrouter.ai/api/v1"
  }
];

const llm = new AIRouter(providers);

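// Your first AI request!
// Top-level await is not available in CommonJS, so wrap the call in an async function.
async function main() {
  const response = await llm.chatCompletion([
    { role: "user", content: "Hello! Say something helpful about AI." }
  ]);

  console.log(response.content);
}

main().catch(console.error);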

⚙️ Configuration

Before running the server, you must configure both your environment variables and provider settings.

🔧 Environment Configuration (.env)

Copy the environment template and add your API keys:

# Copy environment template
cp .env.example .env

# Edit .env and add your API keys:
# OPENAI_API_KEY=sk-your-openai-key-here
# OPENROUTER_API_KEY=your-openrouter-key-here
# GEMINI_API_KEY=your-gemini-key-here
# PORT=3000 # Optional: server port (default: 3000)
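
A small startup check can catch a missing key before the first request fails. The sketch below is not part of the package: the helper name findMissingKeys is hypothetical, and the variable names are simply the ones from the template above. Since only one key is required, a missing key matters only if a provider in provider.js refers to it.

```javascript
// Hypothetical helper (not part of unified-ai-router):
// report which expected environment variables are unset or blank.
function findMissingKeys(env, names) {
  return names.filter((name) => !env[name] || env[name].trim() === "");
}

// Example environment: only the OpenRouter key is filled in.
const sampleEnv = { OPENROUTER_API_KEY: "or-example", OPENAI_API_KEY: "" };

const missing = findMissingKeys(sampleEnv, [
  "OPENAI_API_KEY",
  "OPENROUTER_API_KEY",
  "GEMINI_API_KEY"
]);

console.log(missing); // [ 'OPENAI_API_KEY', 'GEMINI_API_KEY' ]
```

In a real project you would pass process.env instead of sampleEnv, after loading the .env file.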

πŸ—οΈ Provider Configuration (provider.js)

The provider.js file defines which AI providers to use and in what order. The server will try providers sequentially until one succeeds.

Basic provider configuration:

module.exports = [
  {
    name: "openrouter",
    apiKey: process.env.OPENROUTER_API_KEY,
    model: "xiaomi/mimo-v2-flash:free",
    apiUrl: "https://openrouter.ai/api/v1"
  },
  {
    name: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "model",
    apiUrl: "https://api.openai.com/v1",
    circuitOptions: {
      timeout: 30000,           // 30 second timeout
      errorThresholdPercentage: 50, // Open after 50% failures
      resetTimeout: 300000      // Try again after 5 minutes
    }
  },
  {
    name: "openai-compatible-server",
    apiKey: process.env.SERVER_API_KEY, // Optional: depends on the server
    model: "name",
    apiUrl: "http://localhost:4000/v1" 
  }
  // Add more providers...
];

Configuration options:

  • name: Provider identifier for logging and fallback
  • apiKey: API key from environment variables
  • model: Default model for this provider
  • apiUrl: Provider's API base URL
  • circuitOptions: Advanced reliability settings (optional)
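
To catch typos in provider.js early, the entries can be checked before they are handed to the router. This is a minimal sketch, assuming that name, model, and apiUrl are the fields every entry needs (apiKey is left out because it is optional for some servers). The function validateProviders is hypothetical and not part of the package.

```javascript
// Hypothetical validator for entries shaped like the provider.js examples.
// Assumption: name, model and apiUrl must be non-empty strings.
function validateProviders(providers) {
  const problems = [];
  providers.forEach((provider, index) => {
    for (const field of ["name", "model", "apiUrl"]) {
      if (typeof provider[field] !== "string" || provider[field].length === 0) {
        problems.push(`providers[${index}]: "${field}" must be a non-empty string`);
      }
    }
    // Only check the URL syntax when a value is present.
    if (typeof provider.apiUrl === "string" && provider.apiUrl.length > 0) {
      try {
        new URL(provider.apiUrl);
      } catch {
        problems.push(`providers[${index}]: "apiUrl" is not a valid URL`);
      }
    }
  });
  return problems;
}

const problems = validateProviders([
  { name: "openrouter", model: "xiaomi/mimo-v2-flash:free", apiUrl: "https://openrouter.ai/api/v1" },
  { name: "local", apiUrl: "not a url" } // missing model, malformed URL
]);

console.log(problems);
```

The first entry passes, and the second produces two messages: one for the missing model and one for the malformed URL.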

Provider priority: Providers are tried in the order listed. If the first one fails, the router automatically tries the next.
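
The ordering rule can be illustrated with a short loop. This is a sketch of the general idea, not the library's actual code: tryProvidersInOrder and the injected callProvider function are hypothetical names, and the transport is faked so the example runs offline.

```javascript
// Sketch of sequential fallback. `callProvider` is a hypothetical stand-in for
// whatever sends the real request; it is injected so the loop can run offline.
async function tryProvidersInOrder(providers, callProvider) {
  const errors = [];
  for (const provider of providers) {
    try {
      const result = await callProvider(provider);
      return { provider: provider.name, result };
    } catch (err) {
      // Remember why this provider failed, then move on to the next one.
      errors.push(`${provider.name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}

// Demo with a fake transport: the first provider fails, the second answers.
const demoProviders = [{ name: "openrouter" }, { name: "openai" }];
const fakeCall = async (provider) => {
  if (provider.name === "openrouter") throw new Error("503 Service Unavailable");
  return "Hello!";
};

const fallbackDemo = tryProvidersInOrder(demoProviders, fakeCall);
fallbackDemo.then((outcome) => console.log(outcome));
```

If every provider throws, the returned promise rejects with a single error that lists each failure, which makes the cause easier to find in logs.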


🚀 Running the Server

The server provides an OpenAI-compatible API with all the reliability features built in.

After configuring .env and provider.js (as explained in the Configuration section), start the server:

npm start

The server provides these endpoints at http://localhost:3000:

Endpoint                    Description
POST /v1/responses          Responses API (OpenAI-compatible)
POST /responses             Alternative responses API path
POST /v1/chat/completions   Chat completions (streaming & non-streaming)
POST /chat/completions      Alternative chat completions path
GET /v1/models              List available models
GET /health                 Health check endpoint
GET /v1/providers/status    Provider status and health
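
From Node.js, the chat completions endpoint can be called with the built-in fetch (Node.js 18+). The sketch below assumes the server runs on the default port. The helper names buildChatRequest and chat are hypothetical; only the request builder is executed here, so the example runs without a server.

```javascript
// Hypothetical helper that builds a request for the chat completions endpoint.
function buildChatRequest(baseUrl, messages, options = {}) {
  return {
    url: new URL("/v1/chat/completions", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // "model" is a placeholder; the server picks the model from provider.js.
      body: JSON.stringify({ model: "no_need", messages, ...options })
    }
  };
}

// Sends the request with the global fetch available in Node.js 18+.
async function chat(baseUrl, messages, options) {
  const { url, init } = buildChatRequest(baseUrl, messages, options);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}

const request = buildChatRequest("http://localhost:3000", [
  { role: "user", content: "Hello!" }
]);

console.log(request.url); // http://localhost:3000/v1/chat/completions
```

With the server running, `chat("http://localhost:3000", messages)` resolves to the parsed JSON response.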

🛠️ Tool Calling Example

The server supports function calling with streaming responses:

curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "no_need_to_mention",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "how is the weather in mashhad, tehran. use tools"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather forecast for a given city.",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {
                "type": "string",
                "description": "The name of the city (e.g., Tehran) to get the weather for."
              }
            },
            "required": ["city"],
            "additionalProperties": false
          },
          "strict": true
        }
      }
    ],
    "temperature": 0.7,
    "stream": true
  }'

Expected response (shown as one complete object; with "stream": true the same content arrives as incremental chunks):

{
  "id": "gen-1767373622-GrCl6IaMadukHESGLXrg",
  "provider": "Xiaomi",
  "model": "xiaomi/mimo-v2-flash:free",
  "object": "chat.completion",
  "created": 1767373622,
  "choices": [
    {
      "logprobs": null,
      "finish_reason": "tool_calls",
      "native_finish_reason": "tool_calls",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'll check the weather for both Mashhad and Tehran for you.",
        "refusal": null,
        "reasoning": null,
        "tool_calls": [
          {
            "type": "function",
            "index": 0,
            "id": "call_b7e5a323a134468c8b068401",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\": \"Mashhad\"}"
            }
          },
          {
            "type": "function",
            "index": 1,
            "id": "call_d26d59f9fdec4ef0b33cfc1e",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\": \"Tehran\"}"
            }
          }
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 410,
    "completion_tokens": 57,
    "total_tokens": 467,
    "cost": 0,
    "is_byok": false,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0,
      "video_tokens": 0
    },
    "cost_details": {
      "upstream_inference_cost": null,
      "upstream_inference_prompt_cost": 0,
      "upstream_inference_completions_cost": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "image_tokens": 0
    }
  }
}
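
Once the model returns tool_calls, the client is expected to run each tool and send the results back as "tool" role messages. The following is a minimal sketch of that step. The handler table and the runToolCalls helper are hypothetical, and the weather data is made up for illustration.

```javascript
// Hypothetical local implementation of the tool; a real one would query a weather service.
const toolHandlers = {
  get_weather: ({ city }) => ({ city, forecast: "sunny" })
};

// Turn each tool call in an assistant message into a "tool" role message.
function runToolCalls(message, handlers) {
  return (message.tool_calls || []).map((call) => {
    const handler = handlers[call.function.name];
    if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
    const args = JSON.parse(call.function.arguments); // arguments arrive as a JSON string
    return {
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(handler(args))
    };
  });
}

// Trimmed copy of the assistant message from the response above.
const assistantMessage = {
  role: "assistant",
  tool_calls: [
    {
      type: "function",
      id: "call_b7e5a323a134468c8b068401",
      function: { name: "get_weather", arguments: "{\"city\": \"Mashhad\"}" }
    },
    {
      type: "function",
      id: "call_d26d59f9fdec4ef0b33cfc1e",
      function: { name: "get_weather", arguments: "{\"city\": \"Tehran\"}" }
    }
  ]
};

const toolMessages = runToolCalls(assistantMessage, toolHandlers);
console.log(toolMessages);
```

To continue the conversation, append the assistant message and then the tool messages to the original messages array and send the request again.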

💬 Simple Chat Example

Request:

{
  "model": "any-model",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "hey"
    }
  ],
  "temperature": 0.7,
  "stream": false
}

Response:

{
  "id": "gen-1767375039-pUm7PBSoyXFJtS6AVAup",
  "provider": "Xiaomi",
  "model": "xiaomi/mimo-v2-flash:free",
  "object": "chat.completion",
  "created": 1767375039,
  "choices": [
    {
      "logprobs": null,
      "finish_reason": "stop",
      "native_finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?",
        "refusal": null,
        "reasoning": null
      }
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30,
    "cost": 0,
    "is_byok": false,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0,
      "video_tokens": 0
    },
    "cost_details": {
      "upstream_inference_cost": null,
      "upstream_inference_prompt_cost": 0,
      "upstream_inference_completions_cost": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "image_tokens": 0
    }
  }
}

🗣️ Responses API Example

The server also supports OpenAI's Responses API with the same reliability features:

curl -X POST http://localhost:3000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "no_need_to_mention",
    "input": "Tell me a short story about AI.",
    "temperature": 0.7,
    "stream": false
  }'

Expected Response:

{
  "object": "response",
  "id": "gen-1767387778-jshLoROQPnUYsIWuUEZ0",
  "created_at": 1767387778,
  "model": "xiaomi/mimo-v2-flash:free",
  "error": null,
  "output_text": "Once upon a time, there was an AI that learned to dream...",
  "output": [
    {
      "role": "assistant",
      "type": "message",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Once upon a time, there was an AI that learned to dream...",
          "annotations": []
        }
      ],
      "id": "msg_tmp_q5d6cj4d5nq"
    }
  ],
  "usage": {
    "input_tokens": 48,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 100,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 148,
    "cost": 0
  }
}
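
The text appears twice in this response: once in output_text and once inside the output array. A client that wants to be tolerant of either shape could use a helper like the one below. It is a sketch; extractOutputText is a hypothetical name, and the sample object is a trimmed version of the response above.

```javascript
// Collect the text of a Responses API result, preferring the `output_text`
// shortcut and falling back to walking the `output` array.
function extractOutputText(response) {
  if (typeof response.output_text === "string") return response.output_text;
  return (response.output || [])
    .filter((item) => item.type === "message")
    .flatMap((item) => item.content || [])
    .filter((part) => part.type === "output_text")
    .map((part) => part.text)
    .join("");
}

// Trimmed sample without the `output_text` shortcut.
const sampleResponse = {
  object: "response",
  output: [
    {
      role: "assistant",
      type: "message",
      content: [
        { type: "output_text", text: "Once upon a time, there was an AI that learned to dream...", annotations: [] }
      ]
    }
  ]
};

const storyText = extractOutputText(sampleResponse);
console.log(storyText);
```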

Streaming Responses API:

curl -X POST http://localhost:3000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "no_need_to_mention",
    "input": "Say hello in exactly 3 words.",
    "stream": true
  }' \
  --no-buffer

Expected Streaming Response:

data: {"type":"response.created","response":{...}}

data: {"type":"response.output_text.delta","delta":"Hi"}

data: {"type":"response.output_text.delta","delta":" there,"}

data: {"type":"response.output_text.delta","delta":" friend"}

data: {"type":"response.completed","response":{...}}

data: [DONE]
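
A client can rebuild the full text by concatenating the delta events. The sketch below parses a complete event stream held in a string; collectOutputText is a hypothetical helper, and the elided `{...}` payloads from the sample are replaced with empty objects so the JSON parses.

```javascript
// Accumulate the text deltas from a server-sent events body.
function collectOutputText(sseBody) {
  let text = "";
  for (const line of sseBody.split("\n")) {
    if (!line.startsWith("data:")) continue; // skip blank separator lines
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const event = JSON.parse(payload);
    if (event.type === "response.output_text.delta") text += event.delta;
  }
  return text;
}

const sampleStream = [
  'data: {"type":"response.created","response":{}}',
  "",
  'data: {"type":"response.output_text.delta","delta":"Hi"}',
  "",
  'data: {"type":"response.output_text.delta","delta":" there,"}',
  "",
  'data: {"type":"response.output_text.delta","delta":" friend"}',
  "",
  'data: {"type":"response.completed","response":{}}',
  "",
  "data: [DONE]"
].join("\n");

const streamedText = collectOutputText(sampleStream);
console.log(streamedText); // Hi there, friend
```

When reading from a live connection, network chunks can end in the middle of a line, so a real client should buffer incomplete lines before parsing them.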

📚 Library Usage

💬 Basic Chat Completion

const AIRouter = require("unified-ai-router");
require("dotenv").config();

const providers = [
  {
    name: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-4",
    apiUrl: "https://api.openai.com/v1"
  }
];

const llm = new AIRouter(providers);

const messages = [
  { role: "system", content: "You are a helpful coding assistant." },
  { role: "user", content: "Write a function to reverse a string in JavaScript." }
];

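// Top-level await is not available in CommonJS, so wrap the call in an async function.
async function main() {
  const response = await llm.chatCompletion(messages, {
    temperature: 0.7,
    max_tokens: 500
  });

  console.log(response.content);
}

main().catch(console.error);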

🌊 Chat Completion Streaming

const stream = await llm.chatCompletion(messages, {
  temperature: 0.7,
  stream: true  // Enable streaming
});

for await (const chunk of stream) {
  if (chunk.content) {
    process.stdout.write(chunk.content);
  }
}

🗣️ Responses API

// Basic Responses API usage
const response = await llm.responses(
  "Tell me about artificial intelligence.",
  {
    temperature: 0.7,
    max_tokens: 500
  }
);

console.log(response.output_text);

🌊 Responses API Streaming

const stream = await llm.responses(
  "Write a poem about coding.",
  {
    stream: true  // Enable streaming
  }
);

for await (const chunk of stream) {
  if (chunk.type === 'response.output_text.delta') {
    process.stdout.write(chunk.delta);
  }
}

🛠️ Tool Calling

const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current weather for a location",
      parameters: {
        type: "object",
        properties: {
          location: { type: "string", description: "City name" }
        }
      }
    }
  }
];

const response = await llm.chatCompletion(messages, {
  tools: tools,
  tool_choice: "auto"
});

console.log(response.tool_calls);

🔀 Multiple API Keys for Load Balancing

const providers = [
  {
    name: "openai",
    apiKey: [  // Array of API keys
      process.env.OPENAI_API_KEY_1,
      process.env.OPENAI_API_KEY_2,
      process.env.OPENAI_API_KEY_3
    ],
    model: "gpt-4",
    apiUrl: "https://api.openai.com/v1"
  }
];
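
How the router chooses among the keys is not described here. One common strategy is round-robin, sketched below purely as an illustration. createKeyPicker is a hypothetical helper and may not match what the library does internally.

```javascript
// Round-robin selection over one or more API keys.
// Assumption: this is only one possible strategy, shown for illustration.
function createKeyPicker(apiKey) {
  const keys = Array.isArray(apiKey) ? apiKey : [apiKey];
  let next = 0;
  return function pickKey() {
    const key = keys[next];
    next = (next + 1) % keys.length; // wrap around after the last key
    return key;
  };
}

const pickKey = createKeyPicker(["key-1", "key-2", "key-3"]);
const picked = [pickKey(), pickKey(), pickKey(), pickKey()];

console.log(picked); // [ 'key-1', 'key-2', 'key-3', 'key-1' ]
```

Accepting either a string or an array keeps single-key configurations working unchanged.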

📋 Supported Providers

Provider                      API Base URL                                              Model Examples
OpenAI                        https://api.openai.com/v1                                 gpt-4, gpt-3.5-turbo
OpenRouter                    https://openrouter.ai/api/v1                              xiaomi/mimo-v2-flash:free
Groq                          https://api.groq.com/openai/v1                            llama-3.1-70b-versatile
Google Gemini                 https://generativelanguage.googleapis.com/v1beta/openai/  gemini-2.5-pro
Cohere                        https://api.cohere.ai/v1                                  command-r-plus
Any OpenAI-Compatible Server  http://server-url/                                        Any model supported by your server
Cerebras                      https://api.cerebras.ai/v1                                llama3.1-70b



πŸ—οΈ Architecture Overview

Unified AI Router follows a fail-fast, quick-recovery architecture:

┌───────────────┐     ┌─────────────────┐     ┌───────────────┐
│   Your App    │────▶│      OpenAI     │────▶│    AIRouter   │
│  (Any Client) │     │      Server     │     │     (SDK)     │
└───────────────┘     └─────────────────┘     └───────────────┘
                                                      │
                                                      ▼
                                          ┌───────────────────────┐
                                          │     Provider Loop     │
                                          │  (Try each provider)  │
                                          └───────────────────────┘
                                                      │
                            ┌─────────────────────────┼─────────────────────────┐
                            │                         │                         │
                            ▼                         ▼                         ▼
                    ┌───────────────┐         ┌───────────────┐         ┌───────────────┐
                    │  Provider 1   │         │  Provider 2   │         │  Provider N   │
                    │ ┌───────────┐ │         │ ┌───────────┐ │         │ ┌───────────┐ │
                    │ │  Circuit  │ │         │ │  Circuit  │ │         │ │  Circuit  │ │
                    │ │  Breaker  │ │         │ │  Breaker  │ │         │ │  Breaker  │ │
                    │ └───────────┘ │         │ └───────────┘ │         │ └───────────┘ │
                    │       │       │         │       │       │         │       │       │
                    │       ▼       │         │       ▼       │         │       ▼       │
                    │   AI Model    │         │   AI Model    │         │   AI Model    │
                    │  (Try First)  │         │  (Fallback)   │         │ (Last Resort) │
                    └───────────────┘         └───────────────┘         └───────────────┘
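
The circuit breaker in front of each provider can be pictured as a small state machine. The sketch below is a simplified illustration, not the library's implementation. It reuses the option names errorThresholdPercentage and resetTimeout from provider.js, leaves out the timeout option, and takes an injectable clock (the hypothetical `now` parameter) so the demo runs instantly.

```javascript
// Minimal circuit breaker sketch with an injectable clock.
// Assumption: simplified behavior for illustration only.
class SimpleBreaker {
  constructor({ errorThresholdPercentage = 50, resetTimeout = 300000, now = Date.now } = {}) {
    this.threshold = errorThresholdPercentage;
    this.resetTimeout = resetTimeout;
    this.now = now;
    this.successes = 0;
    this.failures = 0;
    this.openedAt = null; // null means the circuit is closed
  }

  // True when a request may be sent to the provider.
  allowsRequest() {
    if (this.openedAt === null) return true;
    // After resetTimeout has elapsed, let one trial request through (half-open).
    return this.now() - this.openedAt >= this.resetTimeout;
  }

  // Record the outcome of a request.
  record(success) {
    if (this.openedAt !== null) {
      // Outcome of a half-open trial: close on success, re-open on failure.
      this.openedAt = success ? null : this.now();
      this.successes = 0;
      this.failures = 0;
      return;
    }
    if (success) this.successes += 1;
    else this.failures += 1;
    const total = this.successes + this.failures;
    if ((this.failures / total) * 100 >= this.threshold) {
      this.openedAt = this.now(); // too many failures: open the circuit
    }
  }
}

let fakeTime = 0;
const breaker = new SimpleBreaker({ resetTimeout: 1000, now: () => fakeTime });

breaker.record(true);  // 0% failures, stays closed
breaker.record(false); // 1 of 2 failed (50%), circuit opens
const allowedWhileOpen = breaker.allowsRequest();

fakeTime = 1000;       // resetTimeout has elapsed
const allowedAfterTimeout = breaker.allowsRequest();

breaker.record(true);  // the trial request succeeded, circuit closes
const closedAfterTrial = breaker.openedAt === null;

console.log({ allowedWhileOpen, allowedAfterTimeout, closedAfterTrial });
// { allowedWhileOpen: false, allowedAfterTimeout: true, closedAfterTrial: true }
```

While a breaker is open, the provider loop can skip that provider immediately instead of waiting for another failure, which is what keeps fallback fast.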

🚀 Deployment

🏗️ Render.com Deployment

  1. Dashboard Method:

    # Push to GitHub first
    git push origin main
    
    # Then on Render.com:
    # 1. Create Web Service
    # 2. Connect repository
    # 3. Set Build Command: npm install
    # 4. Set Start Command: npm start
    # 5. Add environment variables (API keys)
    # 6. Deploy
  2. Verify Deployment:

    curl https://your-app.onrender.com/health
    curl https://your-app.onrender.com/v1/models

📊 Comparison with Direct OpenAI API

🎯 Using Direct OpenAI API

const OpenAI = require("openai");
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

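// Top-level await is not available in CommonJS, so wrap the call in an async function.
async function main() {
  const response = await client.chat.completions.create({
    model: "gpt-4",
    messages: [{ role: "user", content: "Hello" }]
  });
}

main().catch(console.error);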

// ❌ No fallback - fails if OpenAI is down
// ❌ No circuit breaker - failures cascade
// ❌ No multi-provider support

🔗 Using Unified AI Router

const AIRouter = require("unified-ai-router");

const providers = [
  { name: "openai", apiKey: process.env.OPENAI_API_KEY, model: "gpt-4" },
  { name: "backup", apiKey: process.env.BACKUP_KEY, model: "claude-3" }
];

const llm = new AIRouter(providers);
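// Top-level await is not available in CommonJS, so wrap the call in an async function.
async function main() {
  const response = await llm.chatCompletion([{ role: "user", content: "Hello" }]);
}

main().catch(console.error);

// ✅ Automatic fallback if OpenAI fails
// ✅ Circuit breaker protection
// ✅ Multi-provider load balancing
// ✅ Same API interface as OpenAI
// ✅ Production-ready reliability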

πŸ—οΈ Project Structure

Unified-AI-Router/
├── openai-server.js     # OpenAI-compatible server
├── main.js              # Core AIRouter library
├── provider.js          # Provider configurations
├── package.json         # Dependencies and scripts
├── .env.example         # Environment template
├── tests/               # Test suite
│   ├── openai-server-stream.js     # Server streaming tests
│   ├── openai-server-non-stream.js # Server non-streaming tests
│   ├── chat.js          # Library tests
│   └── tools.js         # Tool calling tests
└── docs/                # VitePress documentation
    ├── index.md
    ├── quickstart.md
    └── configuration.md

🧪 Testing

The project includes comprehensive tests covering:

  • Library Functionality: Core AIRouter class testing
  • Server Endpoints: OpenAI-compatible API testing
  • Streaming Support: Real-time response handling
  • Tool Calling: Function calling capabilities
  • Error Handling: Failure scenarios and fallbacks

🧪 Running the Test Suite

# Install dependencies
npm install

# Run individual tests
node tests/chat.js                    # Basic chat functionality
node tests/openai-server-non-stream.js # Server non-streaming
node tests/openai-server-stream.js     # Server streaming
node tests/tools.js                    # Tool calling

# Expected output: AI responses and success logs

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.




Made with ❤️ by mlibre