TTS WebSocket Client

A high-performance JavaScript/TypeScript package for real-time Text-to-Speech (TTS) over WebSocket connections with advanced audio streaming capabilities.

Features

🚀 Real-time TTS: WebSocket-based communication for low-latency text-to-speech
🎵 Advanced Audio Playback: AudioWorklet-based PCM audio streaming with beat control
🔄 Auto-reconnection: Robust WebSocket connection management with automatic reconnection
📱 Cross-platform: Works in all modern browsers with Web Audio API support
🎛️ Latency Control: Adaptive playback rate adjustment for optimal audio quality
🔒 JWT Authentication: Secure WebSocket connections with JWT token support
📊 Real-time Statistics: Audio buffer and connection monitoring
🎯 TypeScript Support: Full TypeScript definitions included

Installation

npm install qt-ai-gateway-npm-sdk

Quick Start

import TTSClient from 'qt-ai-gateway-npm-sdk';

// Initialize the client
const ttsClient = new TTSClient({
  websocket: {
    url: 'wss://your-tts-server.com/ws',
    jwtToken: 'your-jwt-token'
  },
  onTextMessage: (content) => {
    console.log('Text message:', content);
  },
  onBSMessage: (content) => {
    console.log('BS message:', content);
  },
  onError: (error) => {
    console.error('TTS Error:', error);
  }
});

// Enable audio and send a TTS request (must run inside a user gesture handler: click, touch, etc.)
document.getElementById('speakBtn').addEventListener('click', async () => {
  await ttsClient.enableAudio(); // Enable audio first
  await ttsClient.tts('Hello, this is a test message!', {role:'role', speed:1.0}); // Auto-initializes if needed
  console.log('TTS request sent!');
});

Configuration

TTSClientConfig

interface TTSClientConfig {
  websocket: WebSocketConfig;
  audio?: AudioConfig;
  onTextMessage?: TextCallback;
  onBSMessage?: BSCallback;
  onError?: (error: Error) => void;
  onConnect?: () => void;
  onDisconnect?: () => void;
}

WebSocketConfig

interface WebSocketConfig {
  url: string;                    // WebSocket server URL
  jwtToken: string;              // JWT authentication token
  reconnectAttempts?: number;    // Max reconnection attempts (default: 5)
  reconnectDelay?: number;       // Delay between reconnections in ms (default: 3000)
}
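
The two reconnection options bound how long the client keeps retrying. The sketch below is not the SDK's code: `nextReconnectDelay` and `maxReconnectWait` are hypothetical helpers, and a fixed delay between attempts is assumed from the option description above.

```typescript
interface ReconnectOptions {
  reconnectAttempts: number; // max reconnection attempts
  reconnectDelay: number;    // delay between attempts, in ms
}

// Documented defaults from WebSocketConfig.
const defaultReconnect: ReconnectOptions = { reconnectAttempts: 5, reconnectDelay: 3000 };

// Hypothetical helper: delay before the given 1-based reconnection attempt,
// or null once the attempt budget is used up (fixed delay assumed).
function nextReconnectDelay(
  attempt: number,
  opts: ReconnectOptions = defaultReconnect,
): number | null {
  if (attempt < 1 || attempt > opts.reconnectAttempts) return null;
  return opts.reconnectDelay;
}

// Hypothetical helper: total time spent waiting before the client gives up.
function maxReconnectWait(opts: ReconnectOptions = defaultReconnect): number {
  return opts.reconnectAttempts * opts.reconnectDelay;
}

console.log(maxReconnectWait()); // 15000 (5 attempts x 3000 ms)
```

Under these assumptions the defaults give up after about 15 seconds of waiting, which may help when choosing values for flaky networks.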

AudioConfig

interface AudioConfig {
  sampleRate?: number;    // Audio sample rate (default: 16000)
  channels?: number;      // Number of audio channels (default: 1)
  bufferSize?: number;    // Audio buffer size (default: 4096)
}
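
To get a feel for these values, it can help to convert them into time and bandwidth. The helpers below are illustrative only and assume that `bufferSize` counts sample frames and that samples are 16-bit, as described under Message Protocol; neither helper is part of the SDK.

```typescript
// Milliseconds of audio covered by `bufferSize` sample frames at `sampleRate` Hz.
// Assumes bufferSize is measured in sample frames (not bytes).
function bufferDurationMs(bufferSize: number, sampleRate: number): number {
  return (bufferSize * 1000) / sampleRate;
}

// Bytes per second of raw PCM for the given format (16-bit samples by default).
function pcmBytesPerSecond(sampleRate: number, channels: number, bytesPerSample = 2): number {
  return sampleRate * channels * bytesPerSample;
}

console.log(bufferDurationMs(4096, 16000)); // 256
console.log(pcmBytesPerSecond(16000, 1));   // 32000
```

With the documented defaults, one buffer would hold 256 ms of audio and the stream would carry about 32 KB of PCM per second.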

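Role List

172956 - 32 - Clever Elementary School Kid
172946 - 32 - Loli Girlfriend
095622 - 16 - Mischievous Girl
095706 - 8 - Cute Little Girl
095747 - 8 - Mature Woman
095852 - 8 - Middle-aged Auntie
100056 - 8 - Cute Female Elf
100157 - 16 - Cute Male Elf
100837 - 16 - Creepy Uncle
102130 - 16 - Little Fox Spirit 1
102147 - 32 - Alluring Woman
102210 - 32 - Little Green Tea
102555 - 8 - Fresh Handsome Guy
102640 - 8 - Magnetic-voiced Heartthrob
103059 - 16 - Cheeky Handsome Guy
103200 - 16 - Evil Villain
105217 - 32 - Sunny, Cheerful Boy
105233 - 16 - Slow-to-warm Gentle Guy
105323 - 16 - Kindly Old Grandpa
105350 - 32 - Old Eunuch
170710 - 16 - A Fei
171105 - 32 - Little Chef
171309 - 32 - Evil Sword Immortal
172011 - 16 - Taiwanese Tsundere Girl
172241 - 16 - Taiwanese Sweet Voice
172510 - 16 - Guangxi Cousin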

API Reference

TTSClient

Methods

`initialize(): Promise<void>`

Initializes the WebSocket connection and the audio system. The client calls this automatically when needed, so calling it manually is optional.

`tts(content: string, role: string): Promise<void>`

Sends a TTS request. Automatically interrupts any currently playing audio.

await ttsClient.tts('Your text to speak', 'user');

`enableAudio(): Promise<void>`

Enables audio playback. Must be called from a user gesture (click, touch, etc.).

button.addEventListener('click', async () => {
  await ttsClient.enableAudio();
});

`disconnect(): void`

Disconnects the WebSocket connection.

`dispose(): Promise<void>`

Cleans up all resources including WebSocket and audio context.

`setTextCallback(callback: TextCallback): void`

Sets the callback for text messages (type 1000).

`setBSCallback(callback: BSCallback): void`

Sets the callback for BS messages (type 1001).

Status Methods

`getConnectionState(): ConnectionState`

Returns the current WebSocket connection state.

`getAudioState(): AudioState`

Returns the current audio playback state.

`isConnected(): boolean`

Returns true if WebSocket is connected.

`isPlaying(): boolean`

Returns true if audio is currently playing.

`getAudioStats(): Promise<AudioStats>`

Returns detailed audio statistics.

`isAudioEnabled(): boolean`

Returns true if audio is enabled and ready for playback.

Message Protocol

Outgoing Messages (Client → Server)

TTS Request

{
  "type": "tts",
  "content": "Text to convert to speech"
}

Authentication

{
  "type": "auth",
  "token": "your-jwt-token"
}
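
For callers who want to reason about the wire format, the two outgoing messages can be modeled as a small union type. This is a sketch built only from the fields documented above; `OutgoingMessage` and `serializeMessage` are hypothetical names, and the SDK may send additional fields.

```typescript
// The two documented client-to-server message shapes.
type OutgoingMessage =
  | { type: "tts"; content: string }
  | { type: "auth"; token: string };

// Hypothetical helper: turns a message into the JSON text sent over the socket.
function serializeMessage(msg: OutgoingMessage): string {
  return JSON.stringify(msg);
}

console.log(serializeMessage({ type: "tts", content: "Hello" }));
// {"type":"tts","content":"Hello"}
```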

Incoming Messages (Server → Client)

Audio Data

Type: ArrayBuffer
Format: PCM, 1 channel, 16000 Hz, 16-bit signed integers
Usage: Automatically played through AudioWorklet
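
The Web Audio API works with floating-point samples, so 16-bit PCM has to be rescaled before playback. The client does this for you; the sketch below only illustrates the conversion. `pcm16ToFloat32` is a hypothetical helper, and little-endian byte order is an assumption, since the format description above does not state it.

```typescript
// Converts 16-bit signed PCM into Float32 samples in the range [-1, 1).
// Assumes little-endian byte order; a trailing odd byte is ignored.
function pcm16ToFloat32(buffer: ArrayBuffer): Float32Array {
  const view = new DataView(buffer);
  const sampleCount = Math.floor(buffer.byteLength / 2);
  const out = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    // 32768 is the magnitude of the most negative 16-bit value.
    out[i] = view.getInt16(i * 2, true) / 32768;
  }
  return out;
}

// Example: two samples, half scale positive and full scale negative.
const demo = new ArrayBuffer(4);
const demoView = new DataView(demo);
demoView.setInt16(0, 16384, true);
demoView.setInt16(2, -32768, true);
console.log(Array.from(pcm16ToFloat32(demo))); // [ 0.5, -1 ]
```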

Text Messages

Format: String starting with "1000"
Example: "1000Your text message here"
Callback: onTextMessage(content)

BS (Business Service) Messages

Format: String starting with "1001"
Example: "1001Your BS data here"
Callback: onBSMessage(content)
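
The string messages above are told apart by their four-character prefix. As a rough illustration of that dispatch (not the SDK's internal code), a hypothetical `parseServerMessage` helper could look like this:

```typescript
type ServerMessage =
  | { kind: "text"; content: string }  // prefix "1000"
  | { kind: "bs"; content: string }    // prefix "1001"
  | { kind: "unknown"; raw: string };  // anything else is passed through

// Hypothetical helper: splits the type prefix from the payload.
function parseServerMessage(data: string): ServerMessage {
  const prefix = data.slice(0, 4);
  const content = data.slice(4);
  switch (prefix) {
    case "1000":
      return { kind: "text", content };
    case "1001":
      return { kind: "bs", content };
    default:
      return { kind: "unknown", raw: data };
  }
}

console.log(parseServerMessage("1000Your text message here"));
// { kind: 'text', content: 'Your text message here' }
```

A caller would route `text` results to `onTextMessage` and `bs` results to `onBSMessage`; how the SDK treats unrecognized prefixes is not documented, so the `unknown` branch is an assumption.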

Audio Features

AudioWorklet Processing

Low Latency: Direct audio processing in a dedicated thread
Smooth Playback: Advanced buffering with underrun protection
Beat Control: Adaptive playback rate for latency management
Interruption Support: Seamless audio interruption for new TTS requests

Streaming Audio Support

Continuous Playback: Handles rapid audio stream chunks without interruption
Smart Buffering: Automatically appends new audio data to existing stream
Buffer Management: Intelligent cleanup of played audio data
Stream Detection: Distinguishes between new TTS requests and streaming chunks
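
The buffering behavior described above can be pictured as a queue of decoded chunks that the audio callback drains. The class below is a simplified sketch under that assumption; `PcmStreamBuffer` is a hypothetical name and the real AudioWorklet processor may be organized differently.

```typescript
class PcmStreamBuffer {
  private chunks: Float32Array[] = [];
  private offset = 0; // read position inside chunks[0]

  // Append a newly received chunk to the end of the stream.
  push(chunk: Float32Array): void {
    if (chunk.length > 0) this.chunks.push(chunk);
  }

  // Drop everything queued, e.g. when a new TTS request interrupts playback.
  clear(): void {
    this.chunks = [];
    this.offset = 0;
  }

  // Number of samples still waiting to be played.
  get buffered(): number {
    let total = 0;
    for (const chunk of this.chunks) total += chunk.length;
    return total - this.offset;
  }

  // Fill `out` with queued samples and return how many were real audio.
  // On underrun the remainder of `out` is filled with silence.
  read(out: Float32Array): number {
    let written = 0;
    while (written < out.length && this.chunks.length > 0) {
      const head = this.chunks[0];
      const n = Math.min(head.length - this.offset, out.length - written);
      out.set(head.subarray(this.offset, this.offset + n), written);
      written += n;
      this.offset += n;
      if (this.offset === head.length) {
        this.chunks.shift(); // fully played chunks are released
        this.offset = 0;
      }
    }
    out.fill(0, written);
    return written;
  }
}
```

Appending with `push` keeps playback continuous across chunk boundaries, while `clear` models the interruption case where a new request replaces the current stream.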

Latency Management

Target Latency: 100ms default target
Max Latency: 300ms before rate adjustment
Adaptive Rate: Automatic playback speed adjustment (0.9x - 1.1x)
Smoothing: Gradual rate changes to avoid audio artifacts
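
One plausible reading of these parameters is a simple threshold controller with smoothing. The sketch below is an assumption-laden illustration, not the SDK's algorithm: the function names, the choice to slow down below the target latency, and the `smoothing` value of 0.1 are all hypothetical.

```typescript
interface LatencyOptions {
  targetMs: number;  // latency the player tries to hold
  maxMs: number;     // latency above which playback speeds up
  minRate: number;   // slowest allowed playback rate
  maxRate: number;   // fastest allowed playback rate
  smoothing: number; // fraction of the gap closed per update (assumed value)
}

const defaultLatencyOptions: LatencyOptions = {
  targetMs: 100,
  maxMs: 300,
  minRate: 0.9,
  maxRate: 1.1,
  smoothing: 0.1,
};

// Pick the rate to aim for given the currently buffered latency.
function desiredRate(latencyMs: number, o: LatencyOptions = defaultLatencyOptions): number {
  if (latencyMs > o.maxMs) return o.maxRate;    // too much queued audio: catch up
  if (latencyMs < o.targetMs) return o.minRate; // buffer running low: slow down (assumption)
  return 1.0;                                   // inside the comfortable band
}

// Move the current rate a fraction of the way toward the desired rate,
// so the change is gradual and stays within the allowed range.
function smoothRate(
  current: number,
  desired: number,
  o: LatencyOptions = defaultLatencyOptions,
): number {
  const next = current + o.smoothing * (desired - current);
  return Math.min(o.maxRate, Math.max(o.minRate, next));
}
```

Calling `smoothRate(rate, desiredRate(latencyMs))` once per audio block would nudge the rate toward its goal in small steps rather than jumping, which is the kind of gradual change the Smoothing item describes.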

Error Handling

const ttsClient = new TTSClient({
  // ... config
  onError: (error) => {
    switch (error.message) {
      case 'WebSocket is not connected':
        // Handle connection issues
        break;
      case 'Failed to initialize audio':
        // Handle audio system issues
        break;
      default:
        console.error('TTS Error:', error);
    }
  }
});

Important: User Gesture Requirement

⚠️ Modern browsers require user interaction before audio can be played. You must call enableAudio() from a user gesture (click, touch, keypress) before using TTS functionality.

// ✅ Correct - called from user event
button.addEventListener('click', async () => {
  await ttsClient.enableAudio();
  await ttsClient.tts('Now I can speak!', 'assistant');
});

// ❌ Wrong - called without user gesture
await ttsClient.tts('This will fail!', 'user'); // AudioContext error

Browser Compatibility

Chrome: 66+ (AudioWorklet support)
Firefox: 76+ (AudioWorklet support)
Safari: 14.1+ (AudioWorklet support)
Edge: 79+ (AudioWorklet support)
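
Rather than parsing browser versions, an application can check for the required APIs directly. The helper below is a hypothetical sketch and not part of the SDK; it takes the global object as a parameter so it can also be exercised outside a browser.

```typescript
// Minimal shape of the global object that this check inspects.
interface AudioGlobals {
  AudioContext?: unknown;
  webkitAudioContext?: unknown; // older prefixed constructor in Safari
  AudioWorkletNode?: unknown;
  WebSocket?: unknown;
}

// Hypothetical helper: reports which required browser APIs are missing.
function checkSupport(g: AudioGlobals): { ok: boolean; missing: string[] } {
  const missing: string[] = [];
  if (!g.AudioContext && !g.webkitAudioContext) missing.push("AudioContext");
  if (!g.AudioWorkletNode) missing.push("AudioWorkletNode");
  if (!g.WebSocket) missing.push("WebSocket");
  return { ok: missing.length === 0, missing };
}
```

In a browser this would typically be called as `checkSupport(window)` before constructing the client, so that unsupported environments can show a fallback message instead of failing during audio setup.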

Examples

Basic Usage

import TTSClient from 'qt-ai-gateway-npm-sdk';

const client = new TTSClient({
  websocket: {
    url: 'wss://api.example.com/tts',
    jwtToken: 'eyJhbGciOiJIUzI1NiIs...'
  },
  onTextMessage: (content) => {
    console.log('Server message:', content);
  },
  onBSMessage: (content) => {
    console.log('Business service message:', content);
  }
});

// Direct usage: the client auto-initializes when needed
// Must be called from a user gesture for audio to work
button.addEventListener('click', async () => {
  await client.enableAudio(); // Enable audio first
  await client.tts('Hello World!', 'user'); // Auto-connects and initializes
});

Advanced Configuration

const client = new TTSClient({
  websocket: {
    url: 'wss://api.example.com/tts',
    jwtToken: 'your-token',
    reconnectAttempts: 10,
    reconnectDelay: 5000
  },
  audio: {
    sampleRate: 22050,
    channels: 1,
    bufferSize: 8192
  },
  onConnect: () => console.log('Connected!'),
  onDisconnect: () => console.log('Disconnected!'),
  onTextMessage: (msg) => console.log('Text:', msg),
  onBSMessage: (msg) => console.log('BS:', msg),
  onError: (err) => console.error('Error:', err)
});

Monitoring Audio Statistics

setInterval(async () => {
  const stats = await client.getAudioStats();
  console.log('Buffer size:', stats.bufferSize);
  console.log('Playback rate:', stats.playbackRate);
  console.log('Buffered duration:', stats.bufferedDuration);
}, 1000);

Development

Building

npm run build

Testing

npm test

Development Mode

npm run dev

License

MIT License - see LICENSE file for details.

Contributing

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Support

For issues and questions, please open an issue on GitHub or contact support at your-email@example.com.