@kuralle-agents/realtime-audio
Provider-native realtime audio for Kuralle — speech-to-speech voice agents powered by Gemini Live, OpenAI Realtime, or xAI Grok Realtime, with Kuralle keeping tool, flow, and handoff authority.
Install
npm install @kuralle-agents/realtime-audio
Peer dependencies:
npm install @kuralle-agents/core ai zod
What it does
Unlike the cascaded path in @kuralle-agents/livekit-plugin (STT → LLM → TTS), provider-native realtime sends raw audio directly to the model and receives audio back in a single connection — lower latency, no transcript round-trip.
- VoiceEngine: call acceptor. Accepts incoming audio connections and creates per-call VoiceCallSession workers that bridge a transport to the chosen provider.
- VoiceCallSession / RealtimeCallWorker: per-call lifecycle. Connects to the provider, routes tool calls through the Kuralle runtime, and manages session state.
- GeminiLiveSession: thin wrapper around @google/genai live.connect(). Manages the WebSocket to Gemini, PCM audio encoding, tool dispatch, and session resumption.
- OpenAIRealtimeClient: OpenAI Realtime API client.
- CloudflareRealtimeAdapter: plugs any RealtimeAudioClient into Kuralle runtime authority inside a Cloudflare Durable Object.
- CloudflareGeminiLiveClient, CloudflareOpenAIRealtimeClient, CloudflareXAIGrokRealtimeClient: Cloudflare Workers variants.
- createGeminiClientFactory / createOpenAIClientFactory: provider client factories.
- voiceAgentToRuntimeAgent: converts a VoiceAgentConfig to a standard Kuralle agent config.
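The "PCM audio encoding" step that GeminiLiveSession handles can be pictured with a small sketch. This is not the package's code: the helper names `floatTo16BitPcm` and `toBase64` are hypothetical and exist only for illustration. It assumes microphone samples arrive as floats in [-1, 1] and that the provider wants raw 16-bit little-endian PCM sent as base64 text, which is how the Gemini Live API documents its audio input.

```typescript
// Illustrative only: NOT taken from @kuralle-agents/realtime-audio.
// Both helper names below are hypothetical.

/** Convert float samples in [-1, 1] to signed 16-bit little-endian PCM bytes. */
function floatTo16BitPcm(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, samples[i] ?? 0));
    // The int16 range is asymmetric: -32768..32767.
    const v = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
    view.setInt16(i * 2, v, true); // true = little-endian
  }
  return bytes;
}

/** Base64-encode raw bytes (btoa is a global in browsers, Workers, and Node 16+). */
function toBase64(bytes: Uint8Array): string {
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i] ?? 0);
  }
  return btoa(binary);
}
```

A caller would typically run each captured audio frame through `floatTo16BitPcm` and then `toBase64` before placing it in a provider message. When using this package, that work is described as happening inside the session wrapper, so application code should not need it.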
Usage
import { VoiceEngine, createGeminiClientFactory } from '@kuralle-agents/realtime-audio';

const engine = new VoiceEngine({
  agents: [
    {
      id: 'support',
      name: 'Support Agent',
      instructions: 'You are a support agent.',
      voice: 'Charon',
      tools: { /* Kuralle tool definitions */ },
    },
  ],
  defaultAgentId: 'support',
  modelClientFactory: createGeminiClientFactory({
    apiKey: process.env.GOOGLE_API_KEY!,
    model: 'gemini-2.5-flash-preview-native-audio',
  }),
});

// Accept a call from any transport (WebSocket, LiveKit, etc.)
const session = await engine.acceptCall({
  callId: crypto.randomUUID(),
  transport: myTransportSession, // implements TransportSession
});
await session.start();

Related
- @kuralle-agents/livekit-plugin: cascaded STT → LLM → TTS voice path via LiveKit
- @kuralle-agents/livekit-plugin-transport-ws: WebSocket transport for audio connections
- @kuralle-agents/core: agents, flows, runtime