TTS WebSocket Client
A high-performance JavaScript/TypeScript package for real-time Text-to-Speech (TTS) over WebSocket connections with advanced audio streaming capabilities.
Features
- 🚀 Real-time TTS: WebSocket-based communication for low-latency text-to-speech
- 🎵 Advanced Audio Playback: AudioWorklet-based PCM audio streaming with beat control
- 🔄 Auto-reconnection: Robust WebSocket connection management with automatic reconnection
- 📱 Cross-platform: Works in all modern browsers with Web Audio API support
- 🎛️ Latency Control: Adaptive playback rate adjustment for optimal audio quality
- 🔒 JWT Authentication: Secure WebSocket connections with JWT token support
- 📊 Real-time Statistics: Audio buffer and connection monitoring
- 🎯 TypeScript Support: Full TypeScript definitions included
Installation
npm install qt-ai-gateway-npm-sdk
Quick Start
import TTSClient from 'qt-ai-gateway-npm-sdk';
// Initialize the client
const ttsClient = new TTSClient({
  websocket: {
    url: 'wss://your-tts-server.com/ws',
    jwtToken: 'your-jwt-token'
  },
  onTextMessage: (content) => {
    console.log('Text message:', content);
  },
  onBSMessage: (content) => {
    console.log('BS message:', content);
  },
  onError: (error) => {
    console.error('TTS Error:', error);
  }
});
// Enable audio and send TTS (must be called from a user gesture: click, touch, etc.)
document.getElementById('speakBtn').addEventListener('click', async () => {
  await ttsClient.enableAudio(); // Enable audio first
  await ttsClient.tts('Hello, this is a test message!', { role: 'role', speed: 1.0 }); // Auto-initializes if needed
  console.log('TTS request sent!');
});
Configuration
TTSClientConfig
interface TTSClientConfig {
  websocket: WebSocketConfig;
  audio?: AudioConfig;
  onTextMessage?: TextCallback;
  onBSMessage?: BSCallback;
  onError?: (error: Error) => void;
  onConnect?: () => void;
  onDisconnect?: () => void;
}
WebSocketConfig
interface WebSocketConfig {
  url: string;                // WebSocket server URL
  jwtToken: string;           // JWT authentication token
  reconnectAttempts?: number; // Max reconnection attempts (default: 5)
  reconnectDelay?: number;    // Delay between reconnections in ms (default: 3000)
}
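As a rough illustration of how the optional fields and their documented defaults fit together, the sketch below fills in any missing values. `WebSocketConfigSketch` and `resolveWebSocketConfig` are hypothetical names used only for this example; they are not part of the SDK, which presumably applies its defaults internally.

```typescript
// Local copy of the documented shape; not imported from the SDK.
interface WebSocketConfigSketch {
  url: string;
  jwtToken: string;
  reconnectAttempts?: number;
  reconnectDelay?: number;
}

// Defaults as stated in the WebSocketConfig comments.
const DEFAULT_RECONNECT_ATTEMPTS = 5;
const DEFAULT_RECONNECT_DELAY_MS = 3000;

// Hypothetical helper: returns a config with every optional field filled in.
function resolveWebSocketConfig(
  config: WebSocketConfigSketch
): Required<WebSocketConfigSketch> {
  return {
    url: config.url,
    jwtToken: config.jwtToken,
    reconnectAttempts: config.reconnectAttempts ?? DEFAULT_RECONNECT_ATTEMPTS,
    reconnectDelay: config.reconnectDelay ?? DEFAULT_RECONNECT_DELAY_MS,
  };
}

const resolvedConfig = resolveWebSocketConfig({
  url: 'wss://your-tts-server.com/ws',
  jwtToken: 'your-jwt-token',
  reconnectDelay: 5000, // overrides the 3000 ms default
});
console.log(resolvedConfig.reconnectAttempts, resolvedConfig.reconnectDelay); // 5 5000
```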
AudioConfig
interface AudioConfig {
  sampleRate?: number; // Audio sample rate (default: 16000)
  channels?: number;   // Number of audio channels (default: 1)
  bufferSize?: number; // Audio buffer size (default: 4096)
}
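To get a feel for these defaults, the arithmetic below converts a buffer size into playback time. It assumes `bufferSize` is counted in sample frames, which the source does not state; `bufferDurationMs` is a hypothetical helper, not an SDK function.

```typescript
// Duration of one buffer in milliseconds, assuming bufferSize is in sample frames.
function bufferDurationMs(sampleRate: number, bufferSize: number): number {
  return (bufferSize * 1000) / sampleRate;
}

// Documented defaults: 4096 frames at 16000 Hz.
console.log(bufferDurationMs(16000, 4096)); // 256

// Values from the Advanced Configuration example: roughly 371.5 ms.
console.log(bufferDurationMs(22050, 8192).toFixed(1)); // 371.5
```

Under that assumption, a larger buffer buys more protection against underruns at the cost of added delay before the first sample is heard.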
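Role List
172956 - 32 - Clever elementary-school kid
172946 - 32 - Loli-voiced girlfriend
095622 - 16 - Mischievous girl
095706 - 8 - Cute little girl
095747 - 8 - Mature woman
095852 - 8 - Middle-aged auntie
100056 - 8 - Cute female elf
100157 - 16 - Cute male elf
100837 - 16 - Sleazy middle-aged man
102130 - 16 - Little fox spirit 1
102147 - 32 - Alluring woman
102210 - 32 - Faux-innocent girl ("green tea")
102555 - 8 - Fresh-faced handsome guy
102640 - 8 - Magnetic-voiced heartthrob
103059 - 16 - Cheeky handsome guy
103200 - 16 - Evil arch-villain
105217 - 32 - Sunny, cheerful young man
105233 - 16 - Slow-to-warm gentle guy
105323 - 16 - Kindly old grandpa
105350 - 32 - Old eunuch
170710 - 16 - A Fei (阿飞)
171105 - 32 - Little Chef (小当家)
171309 - 32 - Evil Sword Immortal (邪剑仙)
172011 - 16 - Taiwanese tsundere girl
172241 - 16 - Taiwanese sweet voice
172510 - 16 - Guangxi younger cousin (female)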
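Several role IDs begin with a zero (for example 095622), so they are safest kept as strings; converted to numbers, the leading zero is lost. The sketch below holds a few entries from the list in a typed lookup table, with names rendered in English. It assumes the first column is the identifier an application would use to select a voice, which the source does not spell out. `ROLE_NAMES` and `roleName` are hypothetical names, and the second column is left out because its meaning is not documented here.

```typescript
// A few entries from the role list, keyed by ID. IDs are strings so that
// leading zeros (e.g. "095622") survive.
const ROLE_NAMES = new Map<string, string>([
  ['095622', 'Mischievous girl'],
  ['095747', 'Mature woman'],
  ['102640', 'Magnetic-voiced heartthrob'],
  ['105323', 'Kindly old grandpa'],
]);

// Hypothetical helper: look up a display name, or undefined for unknown IDs.
function roleName(id: string): string | undefined {
  return ROLE_NAMES.get(id);
}

console.log(roleName('095622'));       // Mischievous girl
console.log(String(Number('095622'))); // 95622 (leading zero lost)
```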
API Reference
TTSClient
Methods
initialize(): Promise<void>
Initializes the WebSocket connection and audio system. Note: This is called automatically when needed, so manual calling is optional.
tts(content: string, role: string): Promise<void>
Sends a TTS request. Automatically interrupts any currently playing audio.
await ttsClient.tts('Your text to speak', 'user');
enableAudio(): Promise<void>
Enables audio playback. Must be called from a user gesture (click, touch, etc.).
button.addEventListener('click', async () => {
  await ttsClient.enableAudio();
});
disconnect(): void
Disconnects the WebSocket connection.
dispose(): Promise<void>
Cleans up all resources including WebSocket and audio context.
setTextCallback(callback: TextCallback): void
Sets the callback for text messages (type 1000).
setBSCallback(callback: BSCallback): void
Sets the callback for BS messages (type 1001).
Status Methods
getConnectionState(): ConnectionState
Returns the current WebSocket connection state.
getAudioState(): AudioState
Returns the current audio playback state.
isConnected(): boolean
Returns true if WebSocket is connected.
isPlaying(): boolean
Returns true if audio is currently playing.
getAudioStats(): Promise<AudioStats>
Returns detailed audio statistics.
isAudioEnabled(): boolean
Returns true if audio is enabled and ready for playback.
Message Protocol
Outgoing Messages (Client → Server)
TTS Request
{
  "type": "tts",
  "content": "Text to convert to speech"
}
Authentication
{
  "type": "auth",
  "token": "your-jwt-token"
}
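The two outgoing payloads can be modelled as a small tagged union and serialized with `JSON.stringify`. This is a sketch built only from the JSON shapes shown above; `OutgoingMessage` and `encodeOutgoing` are hypothetical names, and whether the SDK sends additional fields (such as a role) is not documented here.

```typescript
// The two documented client-to-server message shapes.
type OutgoingMessage =
  | { type: 'tts'; content: string }
  | { type: 'auth'; token: string };

// Serialize a message to the JSON text that would go over the socket.
function encodeOutgoing(message: OutgoingMessage): string {
  return JSON.stringify(message);
}

const ttsFrame = encodeOutgoing({ type: 'tts', content: 'Text to convert to speech' });
const authFrame = encodeOutgoing({ type: 'auth', token: 'your-jwt-token' });

console.log(ttsFrame);  // {"type":"tts","content":"Text to convert to speech"}
console.log(authFrame); // {"type":"auth","token":"your-jwt-token"}
```

The union lets the compiler reject a message that mixes fields, for example a `tts` message carrying a `token`.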
Incoming Messages (Server → Client)
Audio Data
- Type: ArrayBuffer
- Format: PCM, 1 channel, 16000 Hz, 16-bit signed integers
- Usage: Automatically played through AudioWorklet
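Web Audio works with 32-bit float samples in the range -1 to 1, so 16-bit PCM has to be rescaled before playback. The sketch below shows one common way to do that conversion. It assumes little-endian samples, which the source does not specify, and `pcm16ToFloat32` is a hypothetical helper rather than the SDK's internal code.

```typescript
// Convert 16-bit signed PCM (assumed little-endian) to Float32 samples.
function pcm16ToFloat32(buffer: ArrayBuffer): Float32Array {
  const view = new DataView(buffer);
  const sampleCount = Math.floor(buffer.byteLength / 2);
  const out = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    // Dividing by 32768 maps -32768..32767 onto -1..(just under 1).
    out[i] = view.getInt16(i * 2, true) / 32768;
  }
  return out;
}

// Build three little-endian samples: 0, 16384 (half scale), -32768 (negative full scale).
const pcmBytes = new ArrayBuffer(6);
const pcmWriter = new DataView(pcmBytes);
[0, 16384, -32768].forEach((sample, i) => pcmWriter.setInt16(i * 2, sample, true));

console.log(Array.from(pcm16ToFloat32(pcmBytes))); // [ 0, 0.5, -1 ]
```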
Text Messages
- Format: String starting with "1000"
- Example: "1000Your text message here"
- Callback: onTextMessage(content)
BS (Business Service) Messages
- Format: String starting with "1001"
- Example: "1001Your BS data here"
- Callback: onBSMessage(content)
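The prefix convention for string messages can be handled with a small parser. This sketch assumes the four-character type code is followed directly by the payload, as in the examples; `ParsedFrame` and `parseTextFrame` are hypothetical names and not part of the SDK's public API.

```typescript
// Result of classifying one incoming string frame.
type ParsedFrame =
  | { kind: 'text'; content: string }
  | { kind: 'bs'; content: string }
  | { kind: 'unknown'; raw: string };

// Split off the 4-character type code and route on it.
function parseTextFrame(raw: string): ParsedFrame {
  const code = raw.slice(0, 4);
  const content = raw.slice(4);
  if (code === '1000') return { kind: 'text', content };
  if (code === '1001') return { kind: 'bs', content };
  return { kind: 'unknown', raw };
}

const textFrame = parseTextFrame('1000Your text message here');
console.log(textFrame.kind); // text

const bsFrame = parseTextFrame('1001Your BS data here');
console.log(bsFrame.kind); // bs
```

Returning an explicit `unknown` variant, rather than throwing, keeps an unexpected server message from interrupting audio playback; that choice is a design assumption of this sketch.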
Audio Features
AudioWorklet Processing
- Low Latency: Direct audio processing in dedicated thread
- Smooth Playback: Advanced buffering with underrun protection
- Beat Control: Adaptive playback rate for latency management
- Interruption Support: Seamless audio interruption for new TTS requests
Streaming Audio Support
- Continuous Playback: Handles rapid audio stream chunks without interruption
- Smart Buffering: Automatically appends new audio data to existing stream
- Buffer Management: Intelligent cleanup of played audio data
- Stream Detection: Distinguishes between new TTS requests and streaming chunks
Latency Management
- Target Latency: 100ms default target
- Max Latency: 300ms before rate adjustment
- Adaptive Rate: Automatic playback speed adjustment (0.9x - 1.1x)
- Smoothing: Gradual rate changes to avoid audio artifacts
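The numbers above suggest a simple control loop, sketched here under stated assumptions rather than taken from the SDK's source. The 100 ms target, 300 ms ceiling, and 0.9x to 1.1x range come from the list; the low-water mark at half the target and the smoothing factor of 0.1 are illustrative guesses, and `desiredRate` and `smoothRate` are hypothetical names.

```typescript
const TARGET_LATENCY_MS = 100;
const MAX_LATENCY_MS = 300;
const MIN_RATE = 0.9;
const MAX_RATE = 1.1;

// Pick the rate we would like to reach for the current buffered latency.
function desiredRate(bufferedMs: number): number {
  if (bufferedMs > MAX_LATENCY_MS) return MAX_RATE; // too much queued: play faster
  if (bufferedMs < TARGET_LATENCY_MS / 2) return MIN_RATE; // assumed low-water mark: play slower
  return 1.0;
}

// Move a fraction (alpha) of the way toward the desired rate on each update,
// then clamp to the allowed range.
function smoothRate(current: number, desired: number, alpha = 0.1): number {
  const next = current + alpha * (desired - current);
  return Math.min(MAX_RATE, Math.max(MIN_RATE, next));
}

// Simulate four updates: one in the normal band, then three above the ceiling.
let playbackRate = 1.0;
for (const bufferedMs of [120, 350, 350, 350]) {
  playbackRate = smoothRate(playbackRate, desiredRate(bufferedMs));
}
console.log(playbackRate.toFixed(4)); // 1.0271
```

Because each update closes only a tenth of the remaining gap, the rate drifts toward 1.1x instead of jumping, which is the kind of gradual change the smoothing bullet describes.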
Error Handling
const ttsClient = new TTSClient({
  // ... config
  onError: (error) => {
    switch (error.message) {
      case 'WebSocket is not connected':
        // Handle connection issues
        break;
      case 'Failed to initialize audio':
        // Handle audio system issues
        break;
      default:
        console.error('TTS Error:', error);
    }
  }
});
Important: User Gesture Requirement
⚠️ Modern browsers require user interaction before audio can be played. You must call enableAudio() from a user gesture (click, touch, keypress) before using TTS functionality.
// ✅ Correct - called from user event
button.addEventListener('click', async () => {
  await ttsClient.enableAudio();
  await ttsClient.tts('Now I can speak!', 'assistant');
});
// ❌ Wrong - called without user gesture
await ttsClient.tts('This will fail!', 'user'); // AudioContext error
Browser Compatibility
- Chrome: 66+ (AudioWorklet support)
- Firefox: 76+ (AudioWorklet support)
- Safari: 14.1+ (AudioWorklet support)
- Edge: 79+ (AudioWorklet support)
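Rather than parsing browser version strings, an application can check for the relevant globals at runtime. This is a minimal sketch, assuming that the presence of `AudioContext` and `AudioWorkletNode` is a good enough proxy for the support listed above; `supportsAudioWorklet` is a hypothetical helper, not an SDK function.

```typescript
// Returns true when the given global scope exposes the APIs that
// AudioWorklet-based playback relies on.
function supportsAudioWorklet(scope: object = globalThis): boolean {
  return 'AudioContext' in scope && 'AudioWorkletNode' in scope;
}

// In a page, call it with no arguments to inspect the real global scope.
console.log(supportsAudioWorklet());

// Passing a stand-in object makes the check easy to exercise in tests.
console.log(supportsAudioWorklet({ AudioContext: {}, AudioWorkletNode: {} })); // true
console.log(supportsAudioWorklet({ AudioContext: {} }));                       // false
```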
Examples
Basic Usage
import TTSClient from 'qt-ai-gateway-npm-sdk';
const client = new TTSClient({
  websocket: {
    url: 'wss://api.example.com/tts',
    jwtToken: 'eyJhbGciOiJIUzI1NiIs...'
  },
  onTextMessage: (content) => {
    console.log('Server message:', content);
  },
  onBSMessage: (content) => {
    console.log('Business service message:', content);
  }
});
// Direct usage - auto-initializes when needed
// Must be called from a user gesture for audio to work
button.addEventListener('click', async () => {
  await client.enableAudio(); // Enable audio first
  await client.tts('Hello World!', 'user'); // Auto-connects and initializes
});
Advanced Configuration
const client = new TTSClient({
  websocket: {
    url: 'wss://api.example.com/tts',
    jwtToken: 'your-token',
    reconnectAttempts: 10,
    reconnectDelay: 5000
  },
  audio: {
    sampleRate: 22050,
    channels: 1,
    bufferSize: 8192
  },
  onConnect: () => console.log('Connected!'),
  onDisconnect: () => console.log('Disconnected!'),
  onTextMessage: (msg) => console.log('Text:', msg),
  onBSMessage: (msg) => console.log('BS:', msg),
  onError: (err) => console.error('Error:', err)
});
Monitoring Audio Statistics
setInterval(async () => {
  const stats = await client.getAudioStats();
  console.log('Buffer size:', stats.bufferSize);
  console.log('Playback rate:', stats.playbackRate);
  console.log('Buffered duration:', stats.bufferedDuration);
}, 1000);
Development
Building
npm run build
Testing
npm test
Development Mode
npm run dev
License
MIT License - see LICENSE file for details.
Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Support
For issues and questions, please open an issue on GitHub or contact support at your-email@example.com.