Fetch streaming LLM responses as an async iterable

Package Exports

  • asyncllm
  • asyncllm/gemini

asyncLLM

Fetch LLM responses as an async iterable.

Features

  • 🚀 Lightweight (<2KB) and dependency-free
  • 🔄 Works with multiple LLM providers (OpenAI, Anthropic, Gemini, and more)
  • 🌐 Browser and Node.js compatible
  • 📦 Easy to use with ES modules

Installation

npm install asyncllm

Usage

Call asyncLLM() just as you would call fetch(), pointing it at any LLM provider endpoint that supports streaming responses.

The result is an async generator that yields objects with content, tool, args, and message properties (see LLMEvent below).

For example, to update the DOM with the LLM's response:

<!DOCTYPE html>
<html lang="en">
  <body>
    <div id="output"></div>
  </body>

  <script type="module">
    import { asyncLLM } from "https://cdn.jsdelivr.net/npm/asyncllm@1";

    const apiKey = "YOUR_API_KEY";

    // Example usage with OpenAI
    for await (const { content } of asyncLLM("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: "gpt-4",
        stream: true,
        messages: [{ role: "user", content: "Hello, world!" }],
      }),
    })) {
      // Update the output in real time.
      document.getElementById("output").textContent = content;
    }
  </script>
</html>

Node.js or bundled projects

import { asyncLLM } from "asyncllm";

// Usage is the same as in the browser example
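
For example, a minimal Node.js sketch that prints the finished response once the stream ends. This assumes Node 18+ (for the built-in fetch) and an OPENAI_API_KEY environment variable; both are assumptions of this sketch, not requirements stated above:

import { asyncLLM } from "asyncllm";

const apiKey = process.env.OPENAI_API_KEY; // assumed environment variable

let text = "";
for await (const { content } of asyncLLM("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  },
  body: JSON.stringify({
    model: "gpt-4",
    stream: true,
    messages: [{ role: "user", content: "Hello, world!" }],
  }),
})) {
  // As in the browser example, `content` is the text so far, so the
  // latest value is the complete response once the stream ends.
  if (content) text = content;
}

console.log(text);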

API

asyncLLM(request: string | Request, options?: RequestInit): AsyncGenerator<LLMEvent, void, unknown>

Fetches streaming responses from LLM providers and yields events.

  • request: The URL or Request object for the LLM API endpoint
  • options: Optional fetch options

Returns an async generator that yields LLMEvent objects.

LLMEvent

  • content: The text content of the response
  • tool: The name of the tool being called (for function calling)
  • args: The arguments for the tool call (for function calling) as a JSON-encoded string, e.g. {"order_id":"123456"}
  • message: The raw message object from the LLM provider
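
A sketch of how these fields might be consumed together. The `request` and `options` parameters stand in for any of the provider calls in the examples below, and the defensive parse reflects an assumption that `args` can be an incomplete JSON string mid-stream:

import { asyncLLM } from "asyncllm";

// Sketch: consume each LLMEvent, dispatching on which fields are present.
async function handleStream(request, options) {
  for await (const { content, tool, args } of asyncLLM(request, options)) {
    if (tool) {
      // `args` is a JSON-encoded string that may still be incomplete
      // mid-stream, so parse defensively and keep reading until it is valid.
      try {
        console.log(`call ${tool} with`, JSON.parse(args));
      } catch {
        // partial JSON; more chunks are coming
      }
    } else if (content) {
      console.log(content);
    }
  }
}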

Examples

OpenAI

for await (const { content } of asyncLLM("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  },
  body: JSON.stringify({
    model: "gpt-4",
    stream: true,
    messages: [{ role: "user", content: "Hello world" }],
  }),
})) {
  console.log(content);
}

Anthropic

for await (const { content } of asyncLLM("https://api.anthropic.com/v1/messages", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": apiKey,
  },
  body: JSON.stringify({
    model: "claude-3-haiku-20240307",
    stream: true,
    max_tokens: 10,
    messages: [{ role: "user", content: "What is 2 + 2" }],
  }),
})) {
  console.log(content);
}

Gemini

The package includes a Gemini adapter that converts OpenAI-style requests to Gemini's format, allowing you to use the same code structure across providers.

import { asyncLLM } from "https://cdn.jsdelivr.net/npm/asyncllm@1";
import { gemini } from "https://cdn.jsdelivr.net/npm/asyncllm@1/dist/gemini.js";

for await (const { content } of asyncLLM(
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-8b:streamGenerateContent?alt=sse",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(
      gemini({
        // Use OpenAI-style parameters
        model: "gemini-1.5-flash-8b",
        messages: [
          { role: "system", content: "You are a helpful assistant." },
          { role: "user", content: "What is 2+2?" },
        ],
        temperature: 0.7,
        max_tokens: 100,
        tools: [
          {
            type: "function",
            function: {
              name: "get_weather",
              description: "Get the weather for a location",
              parameters: { type: "object", properties: { location: { type: "string" } }, required: ["location"] },
            },
          },
        ],
      })
    ),
  }
)) {
  console.log(content);
}

The Gemini adapter supports:

  • System messages
  • Multi-modal content (text, images, audio)
  • Generation parameters (temperature, max_tokens, etc.)
  • Function calling
  • JSON mode and schema validation
  • Stop sequences
  • Multiple candidates
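
As one illustration, JSON mode could be requested with the OpenAI-style response_format parameter. That response_format is the exact field the adapter maps is an assumption of this sketch, borrowed from OpenAI's parameter of the same name rather than documented above:

import { gemini } from "asyncllm/gemini";

// Sketch: an OpenAI-style body converted to Gemini's request format.
// Treating `response_format` as the JSON-mode trigger is an assumption.
const body = gemini({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "List three primary colors as JSON." }],
  response_format: { type: "json_object" },
});

console.log(JSON.stringify(body, null, 2)); // inspect the converted request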

Function Calling

asyncLLM supports function calling for compatible LLM providers. Here's an example with OpenAI:

for await (const { content, tool, args } of asyncLLM("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  },
  body: JSON.stringify({
    model: "gpt-4",
    stream: true,
    messages: [
      { role: "system", content: "Call get_delivery_date with the order ID." },
      { role: "user", content: "123456" },
    ],
    tools: [
      {
        type: "function",
        function: {
          name: "get_delivery_date",
          description: "Get the delivery date for a customer order.",
          parameters: {
            type: "object",
            properties: { order_id: { type: "string", description: "The customer order ID." } },
            required: ["order_id"],
          },
        },
      },
    ],
  }),
})) {
  console.log(content, tool, args);
}
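
Since args accumulates as chunks arrive, the value on the stream's final event is the complete JSON string. A follow-up sketch that executes the tool locally once the stream finishes; the get_delivery_date implementation here is hypothetical:

import { asyncLLM } from "asyncllm";

// Hypothetical local implementation of the tool declared above.
const tools = {
  get_delivery_date: ({ order_id }) => `Order ${order_id} arrives Friday.`,
};

// Drain the stream (the async iterable returned by the asyncLLM call above),
// then execute the completed tool call locally.
async function runToolCall(stream) {
  let tool, args;
  for await (const event of stream) {
    if (event.tool) ({ tool, args } = event); // keep the latest, fullest args
  }
  // Once the stream ends, `args` holds the complete JSON-encoded arguments.
  return tools[tool] ? tools[tool](JSON.parse(args)) : undefined;
}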

Changelog

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.