License: Apache-2.0

Realtime audio pipeline for Kuralle — multi-provider speech-to-speech and orchestration.

Package Exports

  • @kuralle-agents/realtime-audio

Readme

@kuralle-agents/realtime-audio

Provider-native realtime audio for Kuralle — speech-to-speech voice agents powered by Gemini Live, OpenAI Realtime, or xAI Grok Realtime, with Kuralle keeping tool, flow, and handoff authority.

Install

npm install @kuralle-agents/realtime-audio

Peer dependencies:

npm install @kuralle-agents/core ai zod

What it does

Unlike the cascaded path in @kuralle-agents/livekit-plugin (STT → LLM → TTS), the provider-native realtime path sends raw audio directly to the model and receives audio back over a single connection. This lowers latency and removes the transcript round-trip.

  • VoiceEngine: the call acceptor. It accepts incoming audio connections and creates per-call VoiceCallSession workers that bridge a transport to the chosen provider.
  • VoiceCallSession / RealtimeCallWorker: the per-call lifecycle. It connects to the provider, routes tool calls through the Kuralle runtime, and manages session state.
  • GeminiLiveSession: a thin wrapper around @google/genai live.connect(). It manages the WebSocket to Gemini, PCM audio encoding, tool dispatch, and session resumption.
  • OpenAIRealtimeClient: the OpenAI Realtime API client.
  • CloudflareRealtimeAdapter: connects any RealtimeAudioClient to the Kuralle runtime inside a Cloudflare Durable Object, so the runtime keeps tool, flow, and handoff authority.
  • CloudflareGeminiLiveClient, CloudflareOpenAIRealtimeClient, CloudflareXAIGrokRealtimeClient: Cloudflare Workers variants of the Gemini Live, OpenAI Realtime, and xAI Grok Realtime clients.
  • createGeminiClientFactory / createOpenAIClientFactory: provider client factories.
  • voiceAgentToRuntimeAgent: converts a VoiceAgentConfig to a standard Kuralle agent config.
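
The PCM audio encoding mentioned for GeminiLiveSession can be illustrated with a small, self-contained sketch. It assumes the provider expects 16-bit little-endian PCM sent as base64 text, which is the input format the Gemini Live API documents. The helper names floatToPcm16 and pcm16ToBase64 are hypothetical and are not exports of this package; the package's own encoder is not shown in this README and may differ.

```typescript
// Hypothetical helpers for illustration only; not part of @kuralle-agents/realtime-audio.

// Convert Float32 samples in [-1, 1] to 16-bit little-endian PCM bytes.
function floatToPcm16(samples: Float32Array): Uint8Array {
  const view = new DataView(new ArrayBuffer(samples.length * 2));
  for (let i = 0; i < samples.length; i++) {
    // Clamp so out-of-range samples do not wrap around when truncated to int16.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Scale asymmetrically: -1 maps to -32768 and +1 maps to 32767.
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(view.buffer);
}

// Base64-encode raw bytes so they can travel inside a JSON message.
function pcm16ToBase64(bytes: Uint8Array): string {
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}
```

A caller would typically run each captured audio frame through floatToPcm16 and then pcm16ToBase64 before handing it to the provider connection. Resampling to the provider's expected sample rate is a separate step that this sketch does not cover.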

Usage

import { VoiceEngine, createGeminiClientFactory } from '@kuralle-agents/realtime-audio';

const engine = new VoiceEngine({
  agents: [
    {
      id: 'support',
      name: 'Support Agent',
      instructions: 'You are a support agent.',
      voice: 'Charon',
      tools: { /* Kuralle tool definitions */ },
    },
  ],
  defaultAgentId: 'support',
  modelClientFactory: createGeminiClientFactory({
    apiKey: process.env.GOOGLE_API_KEY!,
    model: 'gemini-2.5-flash-preview-native-audio',
  }),
});

// Accept a call from any transport (WebSocket, LiveKit, etc.)
const session = await engine.acceptCall({
  callId: crypto.randomUUID(),
  transport: myTransportSession, // implements TransportSession
});

await session.start();
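
The usage example leaves tools as a placeholder, and the README says tool calls are routed through the Kuralle runtime rather than run by the provider. The sketch below illustrates that idea in isolation: the model only requests a tool by name, and local code decides whether and how to run it. Every name here (ToolCall, ToolResult, ToolHandler, ToolRouter) is hypothetical and chosen for illustration; none of it is the actual Kuralle tool or runtime API.

```typescript
// Hypothetical shapes for illustration only; not the package's real types.
type ToolCall = { id: string; name: string; args: unknown };
type ToolResult = { id: string; ok: boolean; output: unknown };
type ToolHandler = (args: unknown) => Promise<unknown> | unknown;

class ToolRouter {
  private readonly handlers = new Map<string, ToolHandler>();

  // Register a locally owned handler under the name the model will request.
  register(name: string, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  // Run a provider-issued tool call locally and build the reply to send back.
  async dispatch(call: ToolCall): Promise<ToolResult> {
    const handler = this.handlers.get(call.name);
    if (!handler) {
      // Unknown tools are rejected instead of being passed through.
      return { id: call.id, ok: false, output: `Unknown tool: ${call.name}` };
    }
    try {
      return { id: call.id, ok: true, output: await handler(call.args) };
    } catch (err) {
      // Report the failure to the model rather than crashing the call.
      const message = err instanceof Error ? err.message : String(err);
      return { id: call.id, ok: false, output: message };
    }
  }
}
```

In this sketch, a call worker would pass each tool request it receives from the provider to dispatch and send the returned ToolResult back over the same connection. Argument validation is left out for brevity; since zod is a peer dependency, schema validation is likely involved in the real package, but that is an assumption.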