Package Exports
- @lelemondev/sdk
- @lelemondev/sdk/express
- @lelemondev/sdk/hono
- @lelemondev/sdk/integrations
- @lelemondev/sdk/lambda
- @lelemondev/sdk/next
- @lelemondev/sdk/package.json
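The framework subpaths export the helpers used in the integration sections below; for example:
import { withObserve } from '@lelemondev/sdk/next';
import { createMiddleware } from '@lelemondev/sdk/express';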
@lelemondev/sdk
Automatic LLM observability for Node.js. Wrap your client, everything is traced.
LLM Agent Docs
https://lelemondev.github.io/lelemondev-sdk/llms.txt
https://lelemondev.github.io/lelemondev-sdk/llms-full.txt
Features
- Automatic Tracing - Wrap your client, all calls are traced
- Zero Config - Works out of the box
- Framework Integrations - Next.js, Express, Lambda, Hono
- Streaming Support - Full support for streaming responses
- Type-safe - Preserves your client's TypeScript types
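Type preservation in practice: observe() returns a value typed as the client it wraps, so existing code and editor completions keep working. A minimal sketch, assuming the signature implied by the examples below:
import { observe } from '@lelemondev/sdk';
import OpenAI from 'openai';
// The wrapped client still satisfies the OpenAI type
const openai: OpenAI = observe(new OpenAI());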
Installation
npm install @lelemondev/sdk
Quick Start
import { init, observe } from '@lelemondev/sdk';
import OpenAI from 'openai';
// 1. Initialize once
init({ apiKey: process.env.LELEMON_API_KEY });
// 2. Wrap your client
const openai = observe(new OpenAI());
// 3. Use normally - all calls traced automatically
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello!' }],
});
Supported Providers
| Provider | Status | Methods |
|---|---|---|
| OpenAI | ✅ | chat.completions.create(), responses.create(), completions.create(), embeddings.create() |
| OpenRouter | ✅ | chat.completions.create() (access to 400+ models) |
| Anthropic | ✅ | messages.create(), messages.stream() |
| AWS Bedrock | ✅ | ConverseCommand, ConverseStreamCommand, InvokeModelCommand |
| Google Gemini | ✅ | generateContent(), generateContentStream(), chat.sendMessage() |
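The wrapping pattern is identical for every provider in the table. A sketch for Anthropic, assuming the @anthropic-ai/sdk client (the model id here is illustrative):
import { init, observe } from '@lelemondev/sdk';
import Anthropic from '@anthropic-ai/sdk';
init({ apiKey: process.env.LELEMON_API_KEY });
// Same observe() call as with OpenAI; messages.create() is traced automatically
const anthropic = observe(new Anthropic());
const msg = await anthropic.messages.create({
model: 'claude-3-opus-20240229',
max_tokens: 1024, // required by the Anthropic API
messages: [{ role: 'user', content: 'Hello!' }],
});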
OpenRouter
OpenRouter provides unified access to 400+ models from OpenAI, Anthropic, Google, Meta, Mistral, and more through a single API.
import { init, observe } from '@lelemondev/sdk';
import OpenAI from 'openai';
init({ apiKey: process.env.LELEMON_API_KEY });
// Configure OpenAI SDK to use OpenRouter
const openrouter = observe(new OpenAI({
baseURL: 'https://openrouter.ai/api/v1',
apiKey: process.env.OPENROUTER_API_KEY,
defaultHeaders: {
'HTTP-Referer': 'https://your-app.com', // Optional: for OpenRouter rankings
'X-Title': 'Your App Name', // Optional: for OpenRouter rankings
},
}));
// Access any model through OpenRouter
const response = await openrouter.chat.completions.create({
model: 'anthropic/claude-3-opus', // or 'openai/gpt-4', 'google/gemini-pro', etc.
messages: [{ role: 'user', content: 'Hello!' }],
});
Models are specified in provider/model format (e.g., anthropic/claude-3-opus, openai/gpt-4, meta-llama/llama-3-70b). See OpenRouter Models for the full list.
Usage Without Framework Integrations
The SDK works with any Node.js application. Framework integrations are optional - they just automate the flush() call.
Long-running Processes (Servers, Workers)
For long-running processes, the SDK auto-flushes every second (configurable via flushIntervalMs):
import { init, observe } from '@lelemondev/sdk';
import OpenAI from 'openai';
init({ apiKey: process.env.LELEMON_API_KEY });
const openai = observe(new OpenAI());
// In your HTTP server, WebSocket handler, queue worker, etc.
async function handleRequest(userId: string, message: string) {
const client = observe(new OpenAI(), { userId });
const result = await client.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: message }],
});
return result.choices[0].message;
// Traces are auto-flushed in the background (every 1s by default)
}
Short-lived Processes (Scripts, Serverless, CLI)
For scripts or serverless functions, call flush() before the process exits:
import { init, observe, flush } from '@lelemondev/sdk';
import OpenAI from 'openai';
init({ apiKey: process.env.LELEMON_API_KEY });
const openai = observe(new OpenAI());
async function main() {
const result = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(result.choices[0].message.content);
// IMPORTANT: Flush before exit to ensure traces are sent
await flush();
}
main();
Custom HTTP Server (No Framework)
import http from 'http';
import { init, observe, flush } from '@lelemondev/sdk';
import OpenAI from 'openai';
init({ apiKey: process.env.LELEMON_API_KEY });
const server = http.createServer(async (req, res) => {
if (req.method === 'POST' && req.url === '/chat') {
const openai = observe(new OpenAI(), {
userId: req.headers['x-user-id'] as string,
});
const result = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello!' }],
});
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify(result.choices[0].message));
// Auto-flush handles this in the background
} else {
res.writeHead(404);
res.end();
}
});
// Graceful shutdown - flush remaining traces
process.on('SIGTERM', async () => {
await flush();
server.close();
});
server.listen(3000);
When to Use Framework Integrations vs Manual
| Scenario | Recommendation |
|---|---|
| Next.js, Express, Hono, Lambda | Use framework integration (auto-flush) |
| Custom HTTP server | Auto-flush works, add flush() on shutdown |
| CLI scripts | Always call flush() before exit |
| Background workers (Bull, BullMQ) | Auto-flush works, add flush() on shutdown |
| One-off scripts | Always call flush() before exit |
| Long-running daemons | Auto-flush works |
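For the worker rows above, a minimal shutdown sketch, assuming BullMQ (the queue name and connection details are illustrative):
import { Worker } from 'bullmq';
import { flush } from '@lelemondev/sdk';
const worker = new Worker('ai-tasks', async (job) => {
// run observed LLM calls here; auto-flush covers normal operation
}, { connection: { host: 'localhost', port: 6379 } });
process.on('SIGTERM', async () => {
await worker.close(); // stop accepting new jobs
await flush(); // send any traces still buffered
process.exit(0);
});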
Framework Integrations
Framework integrations automate the flush() call so you don't have to think about it.
Next.js App Router
// app/api/chat/route.ts
import { init, observe } from '@lelemondev/sdk';
import { withObserve } from '@lelemondev/sdk/next';
import { after } from 'next/server';
import OpenAI from 'openai';
init({ apiKey: process.env.LELEMON_API_KEY });
const openai = observe(new OpenAI());
export const POST = withObserve(
async (req) => {
const { message } = await req.json();
const result = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: message }],
});
return Response.json(result.choices[0].message);
},
{ after } // Non-blocking flush (Next.js 15+)
);
Express
import express from 'express';
import { init, observe } from '@lelemondev/sdk';
import { createMiddleware } from '@lelemondev/sdk/express';
import OpenAI from 'openai';
init({ apiKey: process.env.LELEMON_API_KEY });
const openai = observe(new OpenAI());
const app = express();
app.use(createMiddleware()); // Auto-flush on response finish
app.post('/chat', async (req, res) => {
const result = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: req.body.message }],
});
res.json(result.choices[0].message);
});
AWS Lambda
import { init, observe } from '@lelemondev/sdk';
import { withObserve } from '@lelemondev/sdk/lambda';
import OpenAI from 'openai';
init({ apiKey: process.env.LELEMON_API_KEY });
const openai = observe(new OpenAI());
export const handler = withObserve(async (event) => {
const body = JSON.parse(event.body);
const result = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: body.message }],
});
return {
statusCode: 200,
body: JSON.stringify(result.choices[0].message),
};
});
Hono (Cloudflare Workers, Deno, Bun)
import { Hono } from 'hono';
import { init, observe } from '@lelemondev/sdk';
import { createMiddleware } from '@lelemondev/sdk/hono';
import OpenAI from 'openai';
init({ apiKey: process.env.LELEMON_API_KEY });
const openai = observe(new OpenAI());
const app = new Hono();
app.use(createMiddleware()); // Uses waitUntil on Workers
app.post('/chat', async (c) => {
const { message } = await c.req.json();
const result = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: message }],
});
return c.json(result.choices[0].message);
});
export default app;
API Reference
init(config)
Initialize the SDK. Call once at app startup.
init({
apiKey: 'le_xxx', // Required (or LELEMON_API_KEY env var)
endpoint: 'https://...', // Optional, custom endpoint
debug: false, // Optional, enable debug logs
disabled: false, // Optional, disable tracing
batchSize: 10, // Optional, items per batch
flushIntervalMs: 1000, // Optional, auto-flush interval
});
observe(client, options?)
Wrap an LLM client with automatic tracing.
const openai = observe(new OpenAI(), {
sessionId: 'session-123',
userId: 'user-456',
metadata: { source: 'api' },
tags: ['production'],
});
flush()
Manually flush pending traces. Use it in serverless or short-lived processes that don't use a framework integration.
await flush();
User & Session Tracking
Track which user generated each trace and group related conversations.
Available Fields
| Field | Purpose | In Dashboard |
|---|---|---|
| userId | Identify the end user | Shown in "User" column, searchable |
| sessionId | Group related traces (e.g., a conversation) | Shown in "Session" column, searchable |
| metadata | Custom data attached to the trace | Visible in trace detail view |
| tags | Labels for categorization | Shown as badges, filterable dropdown |
Field Details
userId - Identifies who made the request
userId: req.user.id // From your auth system
userId: 'user-123' // Any string identifier
userId: req.user.email // Email works too
sessionId - Groups multiple calls into one conversation/session
sessionId: req.body.conversationId // Chat conversation ID
sessionId: req.headers['x-session-id'] // From client header
sessionId: crypto.randomUUID() // Generate per session
metadata - Any extra data you want to attach (stored as JSON)
metadata: {
plan: 'premium', // User's subscription plan
feature: 'chat', // Which feature triggered this
version: '2.1.0', // Your app version
environment: 'production', // Environment
ip: req.ip, // Client IP (for debugging)
customField: 'any value', // Anything you need
}
tags - Quick labels for filtering (array of strings)
tags: ['production'] // Environment
tags: ['chat', 'premium'] // Feature + plan
tags: ['api', 'v2'] // Service + version
tags: [tenantId, 'high-priority'] // Multi-tenant + priority
Basic Usage
const openai = observe(new OpenAI(), {
userId: 'user-123',
sessionId: 'conversation-abc',
});
In an API Endpoint
// Express
app.post('/chat', async (req, res) => {
// Create client with user context from the request
const openai = observe(new OpenAI(), {
userId: req.user.id,
sessionId: req.headers['x-session-id'],
metadata: {
plan: req.user.plan,
endpoint: '/chat'
},
});
const result = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: req.body.message }],
});
res.json(result.choices[0].message);
});
Reusable Context with createObserve()
When you have multiple LLM clients or make calls from different places, use createObserve() to avoid repeating context:
import { createObserve } from '@lelemondev/sdk';
import OpenAI from 'openai';
import { GoogleGenerativeAI } from '@google/generative-ai';
import { BedrockRuntimeClient } from '@aws-sdk/client-bedrock-runtime';
// Create a scoped observe function with user context
const observeForUser = createObserve({
userId: 'user-123',
sessionId: 'session-456',
tags: ['premium'],
});
// All clients inherit the same context
const openai = observeForUser(new OpenAI());
const gemini = observeForUser(new GoogleGenerativeAI(apiKey));
const bedrock = observeForUser(new BedrockRuntimeClient({}));
// All these calls will be associated with user-123
await openai.chat.completions.create({ ... });
await gemini.getGenerativeModel({ model: 'gemini-pro' }).generateContent('...');
In Middleware (Express)
Set up user context once, use it everywhere:
import { createObserve } from '@lelemondev/sdk';
// Middleware to attach observe function to request
app.use((req, res, next) => {
req.observe = createObserve({
userId: req.user?.id,
sessionId: req.headers['x-session-id'] || crypto.randomUUID(),
metadata: {
ip: req.ip,
userAgent: req.headers['user-agent'],
},
});
next();
});
// In any route handler
app.post('/chat', async (req, res) => {
const openai = req.observe(new OpenAI());
// Calls are automatically associated with the user
});
app.post('/summarize', async (req, res) => {
const gemini = req.observe(new GoogleGenerativeAI(apiKey));
// Same user context, different endpoint
});
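Note that req.observe is not part of Express's Request type. A minimal sketch of the usual fix via declaration merging, assuming @types/express is installed:
import { createObserve } from '@lelemondev/sdk';
// Augment Express's Request so req.observe type-checks in the handlers above
declare global {
namespace Express {
interface Request {
observe: ReturnType<typeof createObserve>;
}
}
}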
Multi-tenant Application
app.post('/api/:tenantId/chat', async (req, res) => {
const openai = observe(new OpenAI(), {
userId: req.user.id,
sessionId: req.body.conversationId,
metadata: {
tenantId: req.params.tenantId,
environment: process.env.NODE_ENV,
},
tags: [req.params.tenantId, req.user.plan],
});
// Traces can be filtered by tenant, user, or conversation
});
Distributed Systems (API + WebSocket)
When your application spans multiple services (REST API, WebSocket server, background workers), use consistent identifiers across all of them:
// Shared utility for creating observe context
// utils/observe-context.ts
import { createObserve } from '@lelemondev/sdk';
export function createUserObserve(userId: string, sessionId: string, service: string) {
return createObserve({
userId,
sessionId,
metadata: { service },
tags: [service],
});
}
REST API Server:
// api-server.ts
import { createUserObserve } from './utils/observe-context';
app.post('/chat/start', async (req, res) => {
const { userId, sessionId } = req.body;
const observe = createUserObserve(userId, sessionId, 'api');
const openai = observe(new OpenAI());
const result = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: req.body.message }],
});
res.json({
response: result.choices[0].message,
sessionId, // Return sessionId for client to use in WebSocket
});
});
WebSocket Server:
// ws-server.ts
import { createUserObserve } from './utils/observe-context';
wss.on('connection', (ws, req) => {
// Get userId and sessionId from connection (query params, auth token, etc.)
const userId = getUserFromToken(req);
const sessionId = new URL(req.url, 'http://localhost').searchParams.get('sessionId');
// Create observe function for this connection
const observe = createUserObserve(userId, sessionId, 'websocket');
const gemini = observe(new GoogleGenerativeAI(apiKey));
ws.on('message', async (data) => {
const { message } = JSON.parse(data);
// This trace will be linked to the same session as the API calls
const model = gemini.getGenerativeModel({ model: 'gemini-pro' });
const result = await model.generateContent(message);
ws.send(JSON.stringify({ response: result.response.text() }));
});
});
Background Worker / Queue Consumer:
// worker.ts
import { createUserObserve } from './utils/observe-context';
queue.process('ai-task', async (job) => {
const { userId, sessionId, prompt } = job.data;
const observe = createUserObserve(userId, sessionId, 'worker');
const bedrock = observe(new BedrockRuntimeClient({}));
const command = new ConverseCommand({
modelId: 'anthropic.claude-3-sonnet-20240229-v1:0',
messages: [{ role: 'user', content: [{ text: prompt }] }],
});
const result = await bedrock.send(command);
return result.output.message.content[0].text;
});
Client-side (passing sessionId between services):
// Frontend - maintains sessionId across API and WebSocket
const sessionId = crypto.randomUUID(); // Or from your session management
// REST API call
const response = await fetch('/chat/start', {
method: 'POST',
body: JSON.stringify({ userId, sessionId, message: 'Hello' }),
});
// WebSocket connection with same sessionId
const ws = new WebSocket(`wss://your-app.com/ws?sessionId=${sessionId}`);
// Both API and WebSocket traces will be grouped under the same session
Viewing in Dashboard:
With consistent userId and sessionId across services, you can:
- See a user's complete journey across API → WebSocket → Background jobs
- Filter by service (metadata.service) to isolate issues
- Track a conversation that spans multiple services
What This Enables
With proper user/session tracking you can:
- Filter traces by user - See all LLM calls from a specific user
- View full conversations - Group all calls in a session together
- Debug issues - Find exactly what happened for a specific user request
- Analyze usage patterns - Understand how different user segments use your AI features
- Cost attribution - Track token usage per user, tenant, or feature
Streaming
All providers support streaming:
const stream = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello!' }],
stream: true,
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
// Trace captured automatically when stream completes
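Streaming from other providers is wrapped the same way. A sketch for Anthropic, assuming @anthropic-ai/sdk's messages.stream() helper (listed in the providers table):
// `anthropic` is an observed client, as in the provider example above
const anthropicStream = anthropic.messages.stream({
model: 'claude-3-opus-20240229',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Hello!' }],
});
anthropicStream.on('text', (text) => process.stdout.write(text));
await anthropicStream.finalMessage(); // trace is recorded once the stream ends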
What Gets Traced
Each LLM call automatically captures:
- Provider - openai, anthropic, bedrock, gemini
- Model - gpt-4, claude-3-opus, gemini-pro, etc.
- Input - Messages/prompt (sanitized)
- Output - Response content
- Tokens - Input and output counts
- Duration - Request latency in ms
- Status - success or error
- Streaming - Whether streaming was used
Security
The SDK automatically sanitizes sensitive data:
- API keys and tokens are redacted
- Large payloads are truncated
- Errors are captured safely
Environment Variables
| Variable | Description |
|---|---|
| LELEMON_API_KEY | Your API key (starts with le_) |
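With the variable set, apiKey can be omitted from init() (see the init() reference above); a sketch:
// Assumes LELEMON_API_KEY is set in the environment;
// init() falls back to it when apiKey is not passed.
init({});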
License
MIT