NodeLLM
An opinionated architectural layer for integrating Large Language Models in Node.js.
Provider-agnostic by design.
Most LLM SDKs tightly couple your application to vendors, APIs, and churn. NodeLLM provides a unified, production-oriented API for interacting with 540+ models across multiple providers (OpenAI, Gemini, Anthropic, DeepSeek, OpenRouter, Ollama, etc.) without SDK fatigue.
🛑 What NodeLLM is NOT
NodeLLM represents a clear architectural boundary between your system and LLM vendors.
NodeLLM is NOT:
- A wrapper around a single provider SDK (like `openai` or `@google/generative-ai`)
- A prompt-engineering framework
- An agent playground or experimental toy
🏗️ Why NodeLLM?
Most AI integrations today are provider-specific, SDK-driven, and leaky at abstraction boundaries. This creates long-term architectural risk. LLMs should be treated as infrastructure, and NodeLLM exists to help you integrate them without vendor lock-in.
NodeLLM exists to solve architectural problems, not just provide API access. It is the core architectural layer for LLMs in the Node.js ecosystem.
Strategic Goals
- Provider Isolation: Decouple your services from vendor SDKs.
- Production-Ready: Native support for streaming, retries, and unified error handling.
- Predictable API: Consistent behavior for Tools, Vision, and Structured Outputs across all models.
⚡ The Architectural Path
```javascript
import { NodeLLM } from "@node-llm/core";

// 1. Configure once
NodeLLM.configure({ provider: "openai" });

// 2. Chat (high-level request/response)
const chat = NodeLLM.chat("gpt-4o");
const response = await chat.ask("Explain event-driven architecture");
console.log(response.content);

// 3. Streaming (standard AsyncIterator)
for await (const chunk of chat.stream("Explain event-driven architecture")) {
  process.stdout.write(chunk.content);
}
```

🔧 Strategic Configuration
NodeLLM provides a flexible configuration system designed for enterprise usage:
```javascript
// Recommended for multi-provider pipelines
NodeLLM.configure((config) => {
  config.openaiApiKey = process.env.OPENAI_API_KEY;
  config.anthropicApiKey = process.env.ANTHROPIC_API_KEY;
  config.ollamaApiBase = process.env.OLLAMA_API_BASE;
});

// Switch providers at the framework level
NodeLLM.configure({ provider: "anthropic" });

// Support for custom endpoints (e.g., Azure or LocalAI)
NodeLLM.configure({
  openaiApiKey: process.env.AZURE_KEY,
  openaiApiBase: "https://your-resource.openai.azure.com/openai/deployments/...",
});
```

🔮 Capabilities
💬 Unified Chat
Stop rewriting code for every provider. NodeLLM normalizes inputs and outputs into a single, predictable mental model.
```javascript
const chat = NodeLLM.chat(); // Defaults to GPT-4o
await chat.ask("Hello world");
```

👁️ Smart Vision & Files
Pass images, PDFs, or audio files directly. We handle the heavy lifting: fetching remote URLs, base64 encoding, and MIME type mapping.
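As an illustration of the MIME mapping involved (a hypothetical sketch, not NodeLLM's actual internals), resolving a file's content type from its extension might look like:

```javascript
// Illustrative only: a minimal extension-to-MIME lookup of the kind a
// file-handling layer performs before attaching content to a request.
const MIME_TYPES = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".pdf": "application/pdf",
  ".wav": "audio/wav",
};

function guessMimeType(filePath) {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1) return "application/octet-stream";
  const ext = filePath.slice(dot).toLowerCase();
  // Fall back to a generic binary type for unknown extensions.
  return MIME_TYPES[ext] ?? "application/octet-stream";
}
```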
```javascript
await chat.ask("Analyze this interface", {
  files: ["./screenshot.png", "https://example.com/spec.pdf"]
});
```

🛠️ Auto-Executing Tools
Define tools once; NodeLLM manages the recursive execution loop for you, keeping your controller logic clean. Works seamlessly with both regular chat and streaming!
```javascript
import { Tool, z } from "@node-llm/core";

// Class-based DSL
class WeatherTool extends Tool {
  name = "get_weather";
  description = "Get current weather";
  schema = z.object({ location: z.string() });

  async handler({ location }) {
    return `Sunny in ${location}`;
  }
}

// Now the model can use it automatically
await chat.withTool(WeatherTool).ask("What's the weather in Tokyo?");
```

🛡️ Loop Protection & Resource Limits
Prevent runaway costs, infinite loops, and hanging requests with comprehensive protection against resource exhaustion.
NodeLLM provides defense-in-depth security that you can configure globally or per-request:
```javascript
// 1. Global config
NodeLLM.configure({
  requestTimeout: 30000, // Time out requests after 30 seconds (default)
  maxToolCalls: 5,       // Stop after 5 sequential tool execution turns
  maxRetries: 2,         // Retry provider-level errors up to 2 times
  maxTokens: 4096        // Limit output to 4K tokens (default)
});

// 2. Per-request override
await chat.ask("Deep search task", {
  requestTimeout: 120000, // 2 minutes for this request
  maxToolCalls: 10,
  maxTokens: 8192         // 8K tokens for this request
});
```

Security Benefits:
- `requestTimeout`: Prevents DoS attacks and hanging requests
- `maxToolCalls`: Prevents infinite tool execution loops
- `maxRetries`: Prevents retry storms during outages
- `maxTokens`: Prevents excessive output and cost overruns
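The per-request overrides can be thought of as a shallow merge over the global defaults. A minimal sketch of that resolution, assuming the default values shown above (this is an illustration, not the library's actual internals):

```javascript
// Hypothetical illustration of global-vs-per-request limit resolution.
const GLOBAL_LIMITS = {
  requestTimeout: 30000,
  maxToolCalls: 5,
  maxRetries: 2,
  maxTokens: 4096,
};

function resolveLimits(perRequest = {}) {
  // Per-request values win; anything unspecified falls back to the defaults.
  return { ...GLOBAL_LIMITS, ...perRequest };
}
```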
🔍 Comprehensive Debug Logging
Enable detailed logging for all API requests and responses across every feature and provider:
```javascript
// Set environment variable
process.env.NODELLM_DEBUG = "true";

// Now see detailed logs for every API call:
// [NodeLLM] [OpenAI] Request: POST https://api.openai.com/v1/chat/completions
// { "model": "gpt-4o", "messages": [...] }
// [NodeLLM] [OpenAI] Response: 200 OK
// { "id": "chatcmpl-123", ... }
```

Covers: Chat, Streaming, Images, Embeddings, Transcription, Moderation - across all providers!
🛡️ Content Policy Hooks
NodeLLM provides pluggable hooks to implement custom security, compliance, and moderation logic. Instead of hard-coded rules, you can inject your own policies at the edge.
- `beforeRequest()`: Intercept and modify messages before they hit the LLM (e.g., PII detection/redaction).
- `afterResponse()`: Process the final response before it returns to your code (e.g., output masking or compliance checks).
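The `redactSSN` and `maskSensitiveData` helpers in the example below are your own policy code, not part of NodeLLM. A minimal sketch of such a redactor, assuming messages are `{ role, content }` objects:

```javascript
// Hypothetical PII hook helper: masks US SSN patterns in message content.
// Your own policy logic lives here; NodeLLM only provides the hook point.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/g;

function redactSSN(messages) {
  return messages.map((msg) => ({
    ...msg,
    content: msg.content.replace(SSN_PATTERN, "[REDACTED]"),
  }));
}
```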
```javascript
chat
  .beforeRequest(async (messages) => {
    // Detect PII and redact
    return redactSSN(messages);
  })
  .afterResponse(async (response) => {
    // Ensure output compliance
    return response.withContent(maskSensitiveData(response.content));
  });
```

🧱 Smart Context Isolation
Stop worrying about prompt injection and instruction drift. NodeLLM automatically separates system instructions from the conversation history, keeping them protected and strictly enforced.
- Zero-Config Security: Enabled by default for all chats. No special flags required.
- Smart Model Mapping: Automatically uses OpenAI's modern `developer` role for compatible models (GPT-4o, o1, o3) while safely falling back to the standard `system` role for older or local models (Ollama, DeepSeek, etc.).
- Universal Context: Instructions stay separated internally, ensuring they are always prioritized by the model and never accidentally overridden by user messages.
- Provider Agnostic: Write instructions once; NodeLLM handles the specific role requirements for every major provider (OpenAI, Anthropic, Gemini).
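To illustrate the kind of mapping described above, a simplified guess at the role selection (not NodeLLM's actual logic) might look like:

```javascript
// Simplified illustration of instruction-role mapping: newer OpenAI models
// get the "developer" role, everything else falls back to "system".
const DEVELOPER_ROLE_MODELS = ["gpt-4o", "o1", "o3"];

function instructionRole(model) {
  return DEVELOPER_ROLE_MODELS.some((prefix) => model.startsWith(prefix))
    ? "developer"
    : "system";
}
```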
🔍 Observability & Tool Auditing
For enterprise compliance, NodeLLM provides deep visibility into the tool execution lifecycle. You can monitor, log, and audit every step of a tool's execution.
```javascript
chat
  .onToolCallStart((call) => log(`Starting tool: ${call.function.name}`))
  .onToolCallEnd((call, res) => log(`Tool ${call.id} finished with: ${res}`))
  .onToolCallError((call, err) => alert(`Tool ${call.function.name} failed: ${err.message}`));
```

✨ Structured Output
Get type-safe, validated JSON back using Zod schemas.
```javascript
import { z } from "@node-llm/core";

const Product = z.object({ name: z.string(), price: z.number() });
const res = await chat.withSchema(Product).ask("Generate a gadget");
console.log(res.parsed.name); // Full type-safety
```

🎨 Image Generation

```javascript
await NodeLLM.paint("A cyberpunk city in rain");
```

🎤 Audio Transcription

```javascript
await NodeLLM.transcribe("meeting-recording.wav");
```

⚡ Scoped Parallelism
Run multiple providers in parallel safely without global configuration side effects using isolated contexts.
```javascript
const [gpt, claude] = await Promise.all([
  // Each call branches off into its own isolated context
  NodeLLM.withProvider("openai").chat("gpt-4o").ask(prompt),
  NodeLLM.withProvider("anthropic").chat("claude-3-5-sonnet").ask(prompt),
]);
```

🧠 Deep Reasoning
Direct access to the thought process of models like DeepSeek R1 or OpenAI o1/o3 via the `.reasoning` field.

```javascript
const res = await NodeLLM.chat("deepseek-reasoner").ask("Solve this logical puzzle");
console.log(res.reasoning); // Chain-of-thought
```

🚀 Why use this over official SDKs?
| Feature | NodeLLM | Official SDKs | Architectural Impact |
|---|---|---|---|
| Provider Logic | Transparently Handled | Exposed to your code | Low Coupling |
| Streaming | Standard AsyncIterator | Vendor-specific Events | Predictable Data Flow |
| Streaming + Tools | Automated Execution | Manual implementation | Seamless UX |
| Tool Loops | Automated Recursion | Manual implementation | Reduced Boilerplate |
| Files/Vision | Intelligent Path/URL handling | Base64/Buffer management | Cleaner Service Layer |
| Configuration | Centralized & Global | Per-instance initialization | Easier Lifecycle Mgmt |
📋 Supported Providers
| Provider | Supported Features |
|---|---|
| OpenAI | Chat, Streaming + Tools, Vision, Audio, Images, Transcription, Reasoning, Smart Developer Role |
| Gemini | Chat, Streaming + Tools, Vision, Audio, Video, Embeddings |
| Anthropic | Chat, Streaming + Tools, Vision, PDF, Structured Output |
| DeepSeek | Chat (V3), Reasoning (R1), Streaming + Tools |
| OpenRouter | Aggregator, Chat, Streaming, Tools, Vision, Embeddings, Reasoning |
| Ollama | Local Inference, Chat, Streaming, Tools, Vision, Embeddings |
📚 Documentation & Installation
```shell
npm install @node-llm/core
```

🤝 Contributing
We welcome contributions! Please see our Contributing Guide for more details on how to get started.
🫶 Credits
Heavily inspired by the elegant design of RubyLLM.
📄 License
MIT © NodeLLM contributors