Package Exports
- large-models-interface
- large-models-interface/src/index.js
This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (large-models-interface) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
Readme
large-models-interface
Maintained by chenxingqiang
Introduction
Large Models Interface is a comprehensive npm module designed to streamline interactions with various AI model providers in your Node.js applications. Our mission is to provide a unified interface for all types of large models, making it simple to switch between providers and leverage the best models for your specific needs.
🎯 Our Vision: Universal access to all kinds of large AI models through a single, consistent interface.
🇨🇳 Special Focus on Chinese AI Ecosystem: We prioritize comprehensive support for leading Chinese AI providers including Baidu, Alibaba, ByteDance, Tencent, iFLYTEK, and emerging players, making this the most China-friendly international AI interface.
🌟 Multi-Modal AI Support
We are building the most comprehensive interface for modern AI models:
- 🗣️ Natural Language Models - Chat completion, text generation, and language understanding
- 🖼️ Vision Models - Image analysis, generation, and vision-language tasks
- 🎵 Audio Models - Speech recognition, synthesis, and audio processing
- 🎬 Video Models - Video analysis, generation, and multimodal video understanding
- 🧠 Specialized Models - Code generation, embeddings, and domain-specific AI
The Large Models Interface package currently offers comprehensive support for 51 language model providers and hundreds of models, with active development to expand into all AI modalities. This extensive and growing coverage ensures maximum flexibility in choosing the best models for your applications.
📊 Current Support: 51 Providers & Hundreds of Models
🗣️ Natural Language Models (Current)
🌍 Global Leading Providers
International: OpenAI, Anthropic, Google Gemini, Mistral AI, Groq, DeepSeek, Hugging Face, NVIDIA AI, xAI, Coze, and 30+ more providers.
Supported Global Providers: AI21 Studio, AiLAYER, AIMLAPI, Anyscale, Anthropic, Cloudflare AI, Cohere, Corcel, Coze, DeepInfra, DeepSeek, Fireworks AI, Forefront AI, FriendliAI, Google Gemini, GooseAI, Groq, Hugging Face Inference, HyperBee AI, Lamini, LLaMA.CPP, Mistral AI, Monster API, Neets.ai, Novita AI, NVIDIA AI, OctoAI, Ollama, OpenAI, Perplexity AI, Reka AI, Replicate, Shuttle AI, SiliconFlow, TheB.ai, Together AI, Voyage AI, Watsonx AI, Writer, xAI, and Zhipu AI.
🇨🇳 Chinese AI Ecosystem
Leading Chinese Providers: 百度文心一言 (Baidu ERNIE), 阿里通义千问 (Alibaba Qwen), 字节跳动豆包 (ByteDance Doubao), 讯飞星火 (iFLYTEK Spark), 智谱 ChatGLM (Zhipu ChatGLM), 腾讯混元 (Tencent Hunyuan), and more.
Chinese Providers (已支持 / Currently Supported):
- 百度文心一言系列模型 - Baidu ERNIE Series ✅
- 阿里通义千问系列模型 - Alibaba Qwen Series ✅
- 字节跳动豆包大模型 - ByteDance Doubao (Volcano Engine) ✅
- 讯飞星火认知大模型 - iFLYTEK Spark Cognitive Model ✅
- 智谱 ChatGLM 系列模型 - Zhipu ChatGLM Series ✅
- 腾讯混元大模型 - Tencent Hunyuan ✅
- Moonshot AI - 月之暗面 ✅
- 百川大模型 - Baichuan AI ✅
- MINIMAX - MiniMax Models ✅
- 零一万物 - 01.AI (Yi Series) ✅
- 阶跃星辰 - StepFun ✅
- 硅基流动 SiliconCloud - SiliconFlow ✅
🚧 Coming Soon: Multi-Modal Expansion
- 🖼️ Vision Models - Image understanding, OCR, visual question answering
- 🎵 Audio Models - Speech-to-text, text-to-speech, audio generation
- 🎬 Video Models - Video analysis, captioning, generation
- 🧠 Specialized Models - Code completion, scientific computing, domain-specific AI
Our roadmap includes expanding across all AI modalities, with dynamic model discovery to automatically support the latest releases.
✨ Core Features
🎯 Universal AI Interface
- Unified API: LLMInterface.sendMessage provides a single, consistent interface to interact with 51 AI model providers
- Multi-Modal Ready: Designed to support text, vision, audio, and video models through the same interface
- Dynamic Model Discovery: Automatically detects and supports newly released models without code updates
- 🇨🇳 China-First Design: Comprehensive support for the Chinese AI ecosystem with native-language examples and documentation
🚀 Advanced Capabilities
- Chat Completion & Streaming: Full support for chat completion, streaming, and embeddings with intelligent failover
- Smart Model Selection: Automatically choose the best model based on task type and requirements
- Response Caching: Intelligent caching system to reduce costs and improve performance
- Graceful Error Handling: Robust retry mechanisms with exponential backoff (a hedged caching-and-retry sketch follows this list)
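To make the caching and retry story concrete, here is a minimal sketch. It assumes this package keeps the upstream llm-interface convention of a fourth interfaceOptions argument with cacheTimeoutSeconds, retryAttempts, and retryMultiplier keys; treat all three key names as assumptions and confirm them against the usage documentation.

// Minimal sketch of per-call caching and retry configuration.
// The fourth (interfaceOptions) argument and its key names are assumed
// from the upstream llm-interface project; verify before relying on them.
const { LLMInterface } = require('large-models-interface');

LLMInterface.setApiKey({ groq: process.env.GROQ_API_KEY });

const response = await LLMInterface.sendMessage(
  'groq',
  'Summarize exponential backoff in one sentence.',
  { max_tokens: 100 }, // provider options
  {
    cacheTimeoutSeconds: 86400, // cache the response for a day (assumed key)
    retryAttempts: 3,           // retry transient failures (assumed key)
    retryMultiplier: 0.3,       // exponential backoff multiplier (assumed key)
  },
);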
🔧 Developer Experience
- Dynamic Module Loading: Lazy loading of provider interfaces to minimize resource usage
- JSON Output & Repair: Native JSON output support with automatic repair for malformed responses (a hedged sketch follows this list)
- Extensible Architecture: Easy integration of new providers and model types
- Type Safety: Full TypeScript support for better development experience
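As an illustration of JSON output and repair, here is a hedged sketch. It assumes the attemptJsonRepair interfaceOptions key carries over from the upstream llm-interface project; verify the key name before use.

// Ask for JSON and let the jsonrepair dependency fix malformed output.
// The attemptJsonRepair key is assumed from upstream llm-interface.
const response = await LLMInterface.sendMessage(
  'openai',
  'List three uses of text embeddings as a JSON array of strings.',
  { max_tokens: 200 },
  { attemptJsonRepair: true },
);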
🌐 Future-Ready Architecture
- Modality Expansion: Built to seamlessly integrate vision, audio, and video models
- Provider Agnostic: Switch between providers without changing your application code
- Auto-Discovery: Continuously updated model registry for the latest AI capabilities
Dependencies
The project relies on several npm packages and APIs. Here are the primary dependencies:
- axios: For making HTTP requests (used for various HTTP AI APIs).
- @google/generative-ai: SDK for interacting with the Google Gemini API.
- dotenv: For managing environment variables. Used by test cases.
- jsonrepair: Used to repair invalid JSON responses.
- loglevel: A minimal, lightweight logging library with level-based logging and filtering.
The following optional packages can be added to extend LLMInterface's caching capabilities (a hedged configuration sketch follows this list):
- flat-cache: A simple JSON based cache.
- cache-manager: An extendible cache module that supports various backends including Redis, MongoDB, File System, Memcached, Sqlite, and more.
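A hedged configuration sketch, assuming this package keeps the upstream llm-interface LLMInterface.configureCache helper and its cache option (both the helper name and the accepted values are assumptions; verify before use):

// Select a cache backend; 'flat-cache' and 'cache-manager' correspond to the
// optional packages listed above (helper name and values assumed from upstream).
const { LLMInterface } = require('large-models-interface');

LLMInterface.configureCache({ cache: 'flat-cache' });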
Installation
To install the LLM Interface npm module, you can use npm:
npm install large-models-interface
Quick Start
- Looking for API Keys? This document provides helpful links.
- Detailed usage documentation is available here.
- Various examples are also available to help you get started.
- A breakdown of model aliases is available here.
- A breakdown of embeddings model aliases is available here (a hedged usage sketch follows this list).
- If you still want more examples, you may wish to review the test cases for further examples.
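To give a feel for aliases before diving into the docs, here is a hedged sketch. It assumes this package keeps the upstream llm-interface conventions: generic model aliases such as 'default', 'large', and 'small' that resolve to concrete provider models, and an LLMInterface.embeddings helper that mirrors sendMessage. Verify both against the alias documentation linked above.

// Hypothetical alias usage; the alias names and the embeddings helper are
// assumed from upstream llm-interface, not confirmed for this package.
const { LLMInterface } = require('large-models-interface');

LLMInterface.setApiKey({
  anthropic: process.env.ANTHROPIC_API_KEY,
  openai: process.env.OPENAI_API_KEY,
});

// 'small' is assumed to resolve to a fast, inexpensive model per provider.
const chat = await LLMInterface.sendMessage(
  'anthropic',
  'Explain retrieval-augmented generation in two sentences.',
  { model: 'small' },
);

// 'default' is assumed to resolve to the provider's default embeddings model.
const embedding = await LLMInterface.embeddings(
  'openai',
  'The quick brown fox',
  { model: 'default' },
);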
Usage
First import LLMInterface. You can do this using the CommonJS require syntax:
const { LLMInterface } = require('large-models-interface');
🌍 Global Providers Example
LLMInterface.setApiKey({ openai: process.env.OPENAI_API_KEY });
try {
const response = await LLMInterface.sendMessage(
'openai',
'Explain the importance of low latency LLMs.',
);
} catch (error) {
console.error(error);
}
🇨🇳 Chinese Providers Example
// 智谱 ChatGLM (Zhipu)
LLMInterface.setApiKey({ zhipuai: process.env.ZHIPUAI_API_KEY });
const zhipuResponse = await LLMInterface.sendMessage(
  'zhipuai',
  '请解释大语言模型在中文自然语言处理中的重要性', // "Explain the importance of large language models in Chinese NLP"
  { model: 'glm-4' }
);

// 百度文心一言 (Baidu ERNIE)
LLMInterface.setApiKey({ baidu: process.env.BAIDU_API_KEY });
const baiduResponse = await LLMInterface.sendMessage(
  'baidu',
  '请帮我写一段关于人工智能发展的文章', // "Write a short piece on the development of AI"
  { model: 'ernie-4.0-8k' }
);

// 阿里通义千问 (Alibaba Qwen)
LLMInterface.setApiKey({ alibaba: process.env.ALIBABA_API_KEY });
const alibabaResponse = await LLMInterface.sendMessage(
  'alibaba',
  '请介绍一下人工智能的发展历程', // "Introduce the history of AI development"
  { model: 'qwen-turbo' }
);
If you prefer, you can use a one-liner to pass the provider and API key, essentially skipping the LLMInterface.setApiKey() step.
const response = await LLMInterface.sendMessage(
['openai', process.env.OPENAI_API_KEY],
'Explain the importance of low latency LLMs.',
);
Passing a more complex message object is just as simple. The same rules apply:
const message = {
model: 'gpt-4o-mini',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'Explain the importance of low latency LLMs.' },
],
};
try {
const response = await LLMInterface.sendMessage('openai', message, {
max_tokens: 150,
});
} catch (error) {
console.error(error);
}
LLMInterfaceSendMessage and LLMInterfaceStreamMessage are still available and will remain supported until version 3.
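For streaming specifically, here is a hedged sketch. It assumes an LLMInterface.streamMessage method that mirrors the legacy LLMInterfaceStreamMessage helper named above and resolves to an Axios-style response whose data property is a readable stream; both the method name and the response shape are assumptions to verify against the usage docs.

// Hedged streaming sketch; method name and response shape assumed.
const stream = await LLMInterface.streamMessage(
  'openai',
  'Explain the importance of low latency LLMs.',
  { max_tokens: 100 },
);

// Consume raw server-sent-event chunks as they arrive (shape assumed).
for await (const chunk of stream.data) {
  process.stdout.write(chunk.toString());
}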
Running Tests
The project includes tests for each LLM handler. To run the tests, use the following command:
npm test
The comprehensive test suite covers all 51 providers with proper API key validation and graceful skipping when credentials are not available.
🗺️ Roadmap
✅ Phase 1: Enhanced Language Models (Completed)
- Dynamic Model Discovery - Auto-detect latest models from all providers
- Chinese AI Providers Integration:
- 百度文心一言 (Baidu ERNIE) - ERNIE-4.0, ERNIE-3.5 series
- 阿里通义千问 (Alibaba Qwen) - Qwen2.5, Qwen-Turbo, Qwen-Plus
- 字节跳动豆包 (ByteDance Doubao) - Doubao-pro, Doubao-lite series
- 讯飞星火 (iFLYTEK Spark) - Spark-4.0, Spark-3.5 models
- 腾讯混元 (Tencent Hunyuan) - Hunyuan-large, Hunyuan-pro
- 月之暗面 (Moonshot AI) - Moonshot-v1 series
- 百川大模型 (Baichuan AI) - Baichuan2 series
- 零一万物 (01.AI) - Yi-34B, Yi-6B series
- 阶跃星辰 (StepFun) - Step-1V, Step-2 models
- New Global Providers - xAI Grok, SiliconFlow, Coze
- Enhanced Embeddings - Voyage AI, improved embedding support
🖼️ Phase 2: Vision Models (Next)
- Image Understanding - GPT-4V, Claude Vision, Gemini Vision
- Image Generation - DALL-E, Midjourney, Stable Diffusion
- OCR & Document AI - Advanced document processing capabilities
- Visual Question Answering - Multi-modal reasoning
🎵 Phase 3: Audio Models (Future)
- Speech Recognition - Whisper, Azure Speech, Google Speech-to-Text
- Text-to-Speech - ElevenLabs, Azure TTS, OpenAI TTS
- Audio Generation - Music generation, sound effects
- Real-time Audio - Streaming audio processing
🎬 Phase 4: Video & Advanced AI (Future)
- Video Understanding - Video analysis, captioning, content moderation
- Video Generation - AI video creation and editing
- Multi-modal Reasoning - Cross-modal understanding and generation
- Specialized AI - Scientific computing, code generation, domain-specific models
📝 Submit your feature requests and suggestions!
Contribute
Contributions to this project are welcome. Please fork the repository and submit a pull request with your changes or improvements.
Acknowledgments
This project is based on and extends the excellent llm-interface project. We thank the original authors for their foundational work.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author: chenxingqiang
GitHub: chenxingqiang