asyncLLM
Fetch LLM responses as an async iterable.
Features
- 🚀 Lightweight (<2KB) and dependency-free
- 🔄 Works with multiple LLM providers (OpenAI, Anthropic, Gemini, and more)
- 🌐 Browser and Node.js compatible
- 📦 Easy to use with ES modules
Installation
npm install asyncllm
Usage
Call asyncLLM() just like you would call fetch(), against any LLM provider that returns streaming responses.
- OpenAI Streaming. Many providers like Azure, Groq, OpenRouter, etc. follow the OpenAI API.
- Anthropic Streaming
- Gemini Streaming
The result is an async generator that yields LLMEvent objects with content, tools, and message properties.
For example, to update the DOM with the LLM's response:
<!doctype html>
<html lang="en">
<body>
<div id="output"></div>
</body>
<script type="module">
import { asyncLLM } from "https://cdn.jsdelivr.net/npm/asyncllm@2";
const apiKey = "YOUR_API_KEY";
// Example usage with OpenAI
for await (const { content } of asyncLLM("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${apiKey}`,
},
body: JSON.stringify({
model: "gpt-4o-mini",
stream: true,
messages: [{ role: "user", content: "Hello, world!" }],
}),
})) {
// Update the output in real time.
document.getElementById("output").textContent = content;
}
</script>
</html>
Node.js or bundled projects
import { asyncLLM } from "asyncllm";
// Usage is the same as in the browser example
API
asyncLLM(request: string | Request, options?: RequestInit, config?: SSEConfig): AsyncGenerator<LLMEvent, void, unknown>
Fetches streaming responses from LLM providers and yields events.
- request: The URL or Request object for the LLM API endpoint
- options: Optional fetch options (RequestInit)
- config: Optional configuration object for SSE handling
  - onResponse: Async callback that receives the Response object before streaming begins. If the callback returns a promise, it is awaited before the stream continues.
Returns an async generator that yields LLMEvent objects.
LLMEvent
- content: The text content of the response
- tools: An array of tool calls (for function calling). Each entry has a name and an args property, where args is the JSON-encoded argument string, e.g. {"order_id":"123456"}
- message: The raw message object from the LLM provider
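A single yielded event looks roughly like this (an illustrative shape, not an exact provider payload; content is set for text responses, tools during function calling, and message mirrors whatever JSON chunk the provider streamed):

{
  content: "2 + 2 equals 4.",
  tools: [{ name: "get_weather", args: '{"location":"Paris"}' }],
  message: { /* raw provider chunk */ },
}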
Examples
OpenAI
import { asyncLLM } from "https://cdn.jsdelivr.net/npm/asyncllm@2";
const body = {
model: "gpt-4o-mini",
stream: true,
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "What is 2+2?" },
],
temperature: 0.7,
max_tokens: 10,
tools: [
{
type: "function",
function: {
name: "get_weather",
description: "Get the weather for a location",
parameters: {
type: "object",
properties: { location: { type: "string" } },
required: ["location"],
},
},
},
],
};
const config = {
onResponse: async (response) => {
console.log(response.status, response.headers);
},
};
for await (const { content } of asyncLLM(
"https://api.openai.com/v1/chat/completions",
{
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${apiKey}`,
},
body: JSON.stringify(body),
},
config,
)) {
console.log(content);
}
Anthropic
The package includes an Anthropic adapter that converts OpenAI-style requests to Anthropic's format, allowing you to use the same code structure across providers.
import { asyncLLM } from "https://cdn.jsdelivr.net/npm/asyncllm@2";
import { anthropic } from "https://cdn.jsdelivr.net/npm/asyncllm@2/dist/anthropic.js";
// You can use the anthropic() adapter to convert OpenAI-style requests to Anthropic's format:
// const body = anthropic({ ...same fields as the OpenAI example above });

// Or you can call asyncLLM() directly with a body in Anthropic's native format:
const body = {
model: "claude-3-haiku-20240307",
stream: true,
max_tokens: 10,
messages: [{ role: "user", content: "What is 2 + 2" }],
};
for await (const { content } of asyncLLM("https://api.anthropic.com/v1/messages", {
headers: { "Content-Type": "application/json", "x-api-key": apiKey },
body: JSON.stringify(body),
})) {
console.log(content);
}
The Anthropic adapter supports:
- System messages
- Multi-modal content (text and images only, no audio support)
- Model parameters (temperature, max_tokens, top_p, stop, metadata.user_id, but not n, presence_penalty, frequency_penalty, logprobs, top_logprobs)
- User metadata
- Function/tool calling with parallel execution control
- Stop sequences
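For instance, here is a sketch of the conversion for a request with a system message (illustrative input only; inspect the returned body for the exact mapping rather than trusting the comments below):

const body = anthropic({
  model: "claude-3-haiku-20240307",
  max_tokens: 10,
  temperature: 0.7,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is 2 + 2?" },
  ],
});
// Per the feature list above, the system message maps to Anthropic's top-level
// `system` field and the remaining messages stay in the `messages` array.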
Gemini
The package includes a Gemini adapter that converts OpenAI-style requests to Gemini's format, allowing you to use the same code structure across providers.
import { asyncLLM } from "https://cdn.jsdelivr.net/npm/asyncllm@2";
import { gemini } from "https://cdn.jsdelivr.net/npm/asyncllm@2/dist/gemini.js";
// You can use the gemini() adapter to convert OpenAI-style requests to Gemini's format:
// const body = gemini({ ...same fields as the OpenAI example above });

// Or you can call asyncLLM() directly with a body in Gemini's native format:
const body = {
contents: [{ role: "user", parts: [{ text: "What is 2+2?" }] }],
};
for await (const { content } of asyncLLM(
"https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-8b:streamGenerateContent?alt=sse",
{
method: "POST",
headers: {
"Content-Type": "application/json",
"x-goog-api-key": apiKey,
},
body: JSON.stringify(body),
},
)) {
console.log(content);
}
The Gemini adapter supports:
- System messages
- Multi-modal content (text, images, audio via URL or data URI)
- Model parameters (temperature, max_tokens, top_p, stop, n, presence_penalty, frequency_penalty, logprobs, top_logprobs, but not metadata)
- Function calling (no parallel execution support)
- JSON mode and schema validation
- Stop sequences
- Multiple candidates
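As a sketch, the same OpenAI-style request converts like this (illustrative input only; inspect the returned body for the exact mapping rather than trusting the comments below):

const body = gemini({
  model: "gemini-1.5-flash-8b",
  max_tokens: 10,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is 2+2?" },
  ],
});
// Per the feature list above, expect Gemini-native fields such as `contents`,
// with the system message carried as a system instruction.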
Function Calling
asyncLLM supports function calling (aka tools). Here's an example with OpenAI:
for await (const { tools } of asyncLLM("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${apiKey}`,
},
body: JSON.stringify({
model: "gpt-4o-mini",
stream: true,
messages: [
{ role: "system", content: "Get delivery date for order" },
{ role: "user", content: "Order ID: 123456" },
],
tool_choice: "required",
tools: [
{
type: "function",
function: {
name: "get_delivery_date",
parameters: { type: "object", properties: { order_id: { type: "string" } }, required: ["order_id"] },
},
},
],
}),
})) {
console.log(JSON.stringify(tools));
}
tools is an array of objects with name and args properties. It streams like this:
[{"name":"get_delivery_date","args":""}]
[{"name":"get_delivery_date","args":"{\""}]
[{"name":"get_delivery_date","args":"{\"order"}]
[{"name":"get_delivery_date","args":"{\"order_id"}]
[{"name":"get_delivery_date","args":"{\"order_id\":\""}]
[{"name":"get_delivery_date","args":"{\"order_id\":\"123"}]
[{"name":"get_delivery_date","args":"{\"order_id\":\"123456"}]
[{"name":"get_delivery_date","args":"{\"order_id\":\"123456\"}"}]Use a library like partial-json to parse the args incrementally.
Error handling
If an error occurs, it will be yielded in the error property. For example:
for await (const { content, error } of asyncLLM("https://api.openai.com/v1/chat/completions", {
method: "POST",
// ...
})) {
if (error) console.error(error);
else console.log(content);
}
Changelog
- 2.0.0: Multiple tools support. Breaking change: tool and args are no longer part of the response; instead it has tools, an array of { name, args }. Fixed the Gemini adapter to return toolConfig instead of toolsConfig.
- 1.2.0: Added config.onResponse(response), which receives the Response object before streaming begins.
- 1.1.3: Ensure max_tokens for Anthropic. Improve error handling.
- 1.1.1: Added Anthropic adapter.
- 1.1.0: Added Gemini adapter.
- 1.0.0: Initial release with asyncLLM and LLMEvent.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.