Package Exports
- @thinkata/voice-kit
- @thinkata/voice-kit/components/SimpleFormExample.vue
- @thinkata/voice-kit/components/VoiceInput.vue
Voice Kit
Complete voice-powered application toolkit - Speech-to-text, AI form filling, and serverless API handlers all in one package.
Version: 0.0.4
Overview
Voice Kit is an all-in-one solution for building voice-powered web applications. It combines:
- Core - Framework-agnostic speech-to-text primitives
- Forms - Vue 3 composables for voice-powered form filling
- Server - Nitro/h3 API handlers for serverless deployments
- Components - Ready-to-use Vue components for voice input
All in a single, easy-to-install package!
⚠️ Important Notice
This version of Voice Kit is NOT suitable for PII (Personally Identifiable Information) workflows. The example applications and default configurations are designed for non-sensitive use cases such as product feedback, surveys, and general data collection. For applications handling PII, additional security measures, data encryption, and compliance considerations are required.
Features
- 🎤 Multiple Speech Providers - ElevenLabs, OpenAI, Together AI
- 🤖 AI-Powered Form Filling - Intelligent form parsing with LLM integration
- ⚡ Serverless Ready - Works with Nuxt, Nitro, Cloudflare Workers, Vercel
- 🔒 Built-in Rate Limiting - Protect your API endpoints
- 📱 Vue 3 Composables - Reactive hooks for easy integration
- 🎨 Ready-to-Use Components - Beautiful voice input UI components
- 🎯 TypeScript First - Full type safety out of the box
- 🌐 Framework Agnostic Core - Use anywhere JavaScript runs
Installation
npm install @thinkata/voice-kit
Quick Start
Option 1: Use Pre-built Components (Easiest)
<script setup>
import VoiceInput from '@thinkata/voice-kit/components/VoiceInput.vue'
import { reactive } from 'vue'

const formData = reactive({
  productName: '',
  rating: '',
  feedback: ''
})

const handleTranscript = async (transcript) => {
  // Call your parse API
  const response = await fetch('/api/parse-speech', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      text: transcript,
      formStructure: { fields: ['productName', 'rating', 'feedback'] }
    })
  })
  const result = await response.json()
  Object.assign(formData, result.data)
}
</script>
<template>
  <div>
    <VoiceInput
      @transcript="handleTranscript"
      apiEndpoint="/api/speech-to-text"
    />
    <form>
      <input v-model="formData.productName" placeholder="Product Name" />
      <select v-model="formData.rating">
        <option value="">Select Rating</option>
        <option value="5">5 - Excellent</option>
        <option value="4">4 - Very Good</option>
        <option value="3">3 - Good</option>
        <option value="2">2 - Fair</option>
        <option value="1">1 - Poor</option>
      </select>
      <textarea v-model="formData.feedback" placeholder="Your feedback..." />
      <button>Submit</button>
    </form>
  </div>
</template>
Option 2: Use Composables (More Control)
<template>
  <div>
    <button @click="toggleRecording">
      {{ isRecording ? 'Stop' : 'Start' }} Recording
    </button>
    <p v-if="transcript">{{ transcript }}</p>
    <p v-if="error">{{ error }}</p>
  </div>
</template>
<script setup>
import { useSimpleVoiceKit } from '@thinkata/voice-kit'

const {
  isRecording,
  transcript,
  error,
  toggleRecording
} = useSimpleVoiceKit({
  apiEndpoint: '/api/speech-to-text'
})
</script>
Server-Side Setup (Required)
Create API endpoints in your Nuxt/Nitro app:
/server/api/speech-to-text.post.ts
import { createSpeechToTextHandler } from '@thinkata/voice-kit'

export default createSpeechToTextHandler({
  elevenLabsApiKey: process.env.ELEVENLABS_API_KEY,
  togetherApiKey: process.env.TOGETHER_API_KEY,
  openaiApiKey: process.env.OPENAI_API_KEY,
  speechProvider: (process.env.SPEECH_PROVIDER || 'elevenlabs') as any
})
/server/api/parse-speech.post.ts (Optional, for form filling)
import { createParseSpeechHandler } from '@thinkata/voice-kit'

export default createParseSpeechHandler({
  openaiApiKey: process.env.OPENAI_API_KEY,
  anthropicApiKey: process.env.ANTHROPIC_API_KEY,
  togetherApiKey: process.env.TOGETHER_API_KEY,
  defaultProvider: (process.env.LLM_PROVIDER || 'together') as any
})
Environment Variables
Create a .env file in your project root:
# Speech-to-Text Provider (choose one)
ELEVENLABS_API_KEY=your_elevenlabs_key
SPEECH_PROVIDER=elevenlabs
# LLM Provider for form parsing (optional)
TOGETHER_API_KEY=your_together_key
LLM_PROVIDER=together
API Reference
Vue Components
See COMPONENTS.md for detailed component documentation.
VoiceInput.vue
Ready-to-use voice input component with recording UI.
Import:
import VoiceInput from '@thinkata/voice-kit/components/VoiceInput.vue'
Props:
- apiEndpoint?: string - Speech-to-text API endpoint (default: '/api/speech-to-text')
Events:
- @transcript(text: string) - Emitted when speech is transcribed
- @error(error: string) - Emitted on error
Client-Side Composables
useSimpleVoiceKit(options)
Basic voice recording and transcription.
Options:
- apiEndpoint?: string - Server endpoint (default: '/api/speech-to-text')
- autoStart?: boolean - Auto-start recording (default: false)
Returns:
- isRecording: Ref<boolean> - Recording state
- isProcessing: Ref<boolean> - Processing state
- transcript: Ref<string> - Transcribed text
- error: Ref<string | null> - Error message
- startRecording() - Start recording
- stopRecording() - Stop and transcribe
- toggleRecording() - Toggle state
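The returned state can be collapsed into a single status line for the UI. The sketch below is only an illustration: `VoiceStateSnapshot` and `statusLabel` are hypothetical names that are not part of the package, and the snapshot is assumed to be built by reading `.value` from each documented ref.

```typescript
// Plain snapshot of the documented state. The composable itself returns Vue refs,
// so a caller would read `.value` from each ref to build this object (assumption).
interface VoiceStateSnapshot {
  isRecording: boolean
  isProcessing: boolean
  transcript: string
  error: string | null
}

// Hypothetical helper: derive one user-facing status line from the snapshot.
// Errors take priority, then recording, then processing.
function statusLabel(state: VoiceStateSnapshot): string {
  if (state.error) return `Error: ${state.error}`
  if (state.isRecording) return 'Recording...'
  if (state.isProcessing) return 'Transcribing...'
  return state.transcript ? 'Done' : 'Idle'
}
```

In a component this could back a computed property, so the template shows one message instead of several `v-if` branches.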
useVoiceKitWithForms(options)
Voice-powered form filling with AI parsing.
Options:
- formStructure: FormStructure - Form field definitions
- apiEndpoint?: string - Speech-to-text endpoint
- parseEndpoint?: string - Form parsing endpoint (default: '/api/parse-speech')
Returns: Same as useSimpleVoiceKit plus:
- formData: Ref<Record<string, any>> - Parsed form data
- fillForm() - Fill form with parsed data
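Because parsed form data comes from an LLM, it may be worth copying only the fields the form actually declares before merging it into application state. The following is a minimal sketch under that assumption; `pickKnownFields` is a hypothetical helper, not an export of the package, and the field names mirror the Quick Start example.

```typescript
// Hypothetical helper: keep only whitelisted, non-null fields from a parsed result.
function pickKnownFields(
  parsed: Record<string, unknown>,
  fields: readonly string[]
): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const field of fields) {
    // Own-property check so inherited keys are never copied
    const isOwn = Object.prototype.hasOwnProperty.call(parsed, field)
    if (isOwn && parsed[field] != null) {
      out[field] = parsed[field]
    }
  }
  return out
}

// Example with plain objects standing in for reactive form state
const formData = { productName: '', rating: '', feedback: '' }
const parsed = { productName: 'Widget', rating: '5', unexpected: 'ignored' }
Object.assign(formData, pickKnownFields(parsed, Object.keys(formData)))
```

Fields the parser did not return keep their current values, and keys outside the form are dropped rather than merged.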
Server-Side Handlers
createSpeechToTextHandler(config)
Creates h3 handler for speech-to-text.
Config:
- elevenLabsApiKey?: string - ElevenLabs API key
- togetherApiKey?: string - Together AI API key
- openaiApiKey?: string - OpenAI API key
- speechProvider: 'elevenlabs' | 'together' | 'openai' - Provider to use
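Since `speechProvider` accepts only three values, an environment variable can be checked before it reaches the handler instead of being cast with `as any`. The sketch below assumes nothing about the package beyond the union listed above; the `SpeechProvider` alias and `resolveSpeechProvider` are hypothetical local names.

```typescript
// Local alias for the documented union (the package may export its own type).
type SpeechProvider = 'elevenlabs' | 'together' | 'openai'

const SPEECH_PROVIDERS: readonly SpeechProvider[] = ['elevenlabs', 'together', 'openai']

// Hypothetical helper: turn a raw env value into a valid provider or fail loudly.
function resolveSpeechProvider(
  raw: string | undefined,
  fallback: SpeechProvider = 'elevenlabs'
): SpeechProvider {
  // Unset or empty variable: use the fallback
  if (raw === undefined || raw === '') return fallback
  const normalized = raw.trim().toLowerCase()
  const match = SPEECH_PROVIDERS.find((p) => p === normalized)
  if (!match) {
    throw new Error(
      `Unsupported SPEECH_PROVIDER "${raw}"; expected one of: ${SPEECH_PROVIDERS.join(', ')}`
    )
  }
  return match
}
```

In the route file this would be passed as `speechProvider: resolveSpeechProvider(process.env.SPEECH_PROVIDER)`, so a typo in `.env` surfaces at startup rather than on the first request.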
createParseSpeechHandler(config)
Creates h3 handler for parsing speech into form data.
Config:
- openaiApiKey?: string - OpenAI API key
- anthropicApiKey?: string - Anthropic API key
- togetherApiKey?: string - Together AI API key
- defaultProvider: 'openai' | 'anthropic' | 'together' - LLM provider
- maxRetries?: number - Max retry attempts (default: 3)
- timeout?: number - Request timeout in ms (default: 30000)
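If `maxRetries` and `timeout` are read from configuration or environment variables, it can help to apply the documented defaults and reject nonsensical values in one place. This is a rough sketch, not the package's own logic; `RetryOptions` and `withParseDefaults` are hypothetical names, and only the default values (3 and 30000) come from the list above.

```typescript
// The two optional numeric settings documented for createParseSpeechHandler.
interface RetryOptions {
  maxRetries?: number
  timeout?: number // milliseconds
}

// Hypothetical helper: fill in documented defaults and validate the result.
function withParseDefaults(options: RetryOptions = {}): Required<RetryOptions> {
  const maxRetries = options.maxRetries ?? 3
  const timeout = options.timeout ?? 30000
  if (!Number.isInteger(maxRetries) || maxRetries < 0) {
    throw new RangeError(`maxRetries must be a non-negative integer, got ${maxRetries}`)
  }
  if (!Number.isFinite(timeout) || timeout <= 0) {
    throw new RangeError(`timeout must be a positive number of ms, got ${timeout}`)
  }
  return { maxRetries, timeout }
}
```

The result can be spread into the handler config alongside the API keys, for example `{ ...withParseDefaults({ timeout: 10000 }), defaultProvider: 'together' }`.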
createLLMStatusHandler(config)
Creates h3 handler for checking LLM availability.
Config:
- openaiApiKey?: string
- anthropicApiKey?: string
- togetherApiKey?: string
- llmProvider: 'openai' | 'anthropic' | 'together' | 'auto'
- llmModel?: string - Model name (default: 'auto')
Supported Platforms
- ✅ Nuxt 3 - Full support with server API routes
- ✅ Nitro - Standalone server applications
- ✅ Cloudflare Workers - Serverless edge deployment
- ✅ Vercel - Serverless functions
- ✅ Node.js - Standard HTTP servers
- ✅ Vue 3 - Client-side composables
Speech Providers
ElevenLabs (Recommended)
- High accuracy
- Low latency
- Supports multiple languages
- Get API key: https://elevenlabs.io
OpenAI Whisper
- Excellent accuracy
- Supports 100+ languages
- Get API key: https://platform.openai.com
Together AI
- Cost-effective
- Good performance
- Get API key: https://together.ai
Security Best Practices
- Never expose API keys - Always use environment variables on server
- Use rate limiting - Protect against abuse (built-in)
- Validate inputs - Use built-in validation utilities
- HTTPS only - Never use HTTP in production
- CORS configuration - Restrict origins in production
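To make the rate-limiting advice concrete, here is a minimal fixed-window limiter keyed by client identifier. It illustrates the general technique only and is not the package's built-in limiter, whose configuration is not described here; `FixedWindowRateLimiter` and its parameters are hypothetical names.

```typescript
// Illustrative fixed-window limiter: at most `limit` calls per key per window.
class FixedWindowRateLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>()
  private readonly limit: number
  private readonly windowMs: number
  private readonly now: () => number

  // `now` is injectable so the limiter can be tested without real timers
  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    if (!Number.isInteger(limit) || limit < 1) throw new RangeError('limit must be >= 1')
    if (!(windowMs > 0)) throw new RangeError('windowMs must be > 0')
    this.limit = limit
    this.windowMs = windowMs
    this.now = now
  }

  // Returns true if the call is allowed, false if the key is over its limit.
  allow(key: string): boolean {
    const t = this.now()
    const entry = this.hits.get(key)
    // First call for this key, or the previous window has expired: start a new window
    if (!entry || t - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: t, count: 1 })
      return true
    }
    if (entry.count >= this.limit) return false
    entry.count += 1
    return true
  }
}
```

A handler would typically call `allow(clientIp)` and answer with HTTP 429 when it returns false. An in-memory map like this is per-process, so serverless or multi-instance deployments would need a shared store instead.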
Troubleshooting
"No audio recorded"
- Check microphone permissions in browser
- Verify browser supports MediaRecorder API
- Test on different browser (Chrome recommended)
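The browser-support check can be done in code before any recording UI is shown. This sketch tests for the standard `MediaRecorder` constructor and `navigator.mediaDevices.getUserMedia`; `BrowserLike` and `canRecordAudio` are hypothetical names, and the environment is passed in as a parameter so the function can also run outside a browser.

```typescript
// Minimal shape of the globals this check needs; all fields optional on purpose.
interface BrowserLike {
  MediaRecorder?: unknown
  navigator?: { mediaDevices?: { getUserMedia?: unknown } }
}

// Hypothetical helper: true only if both recording APIs are present.
function canRecordAudio(env: BrowserLike): boolean {
  const hasRecorder = typeof env.MediaRecorder === 'function'
  const hasGetUserMedia = typeof env.navigator?.mediaDevices?.getUserMedia === 'function'
  return hasRecorder && hasGetUserMedia
}
```

In the browser this would be called with the global object, for example `canRecordAudio(globalThis as unknown as BrowserLike)`. Note that `navigator.mediaDevices` is only exposed in secure contexts, so a page served over plain HTTP (other than localhost) will typically fail this check even in a supported browser.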
"API key invalid"
- Verify the .env file is loaded correctly
- Check API key format and validity
- Ensure keys are set on the server side only
"Package not found" errors
- Run npm install to ensure dependencies are installed
- Clear cache: npm cache clean --force
- Check package version: npm list @thinkata/voice-kit
Component import errors
- Ensure you're using the correct import path: @thinkata/voice-kit/components/VoiceInput.vue
- Check that Vue is installed: npm install vue@^3.0.0
Examples
See the examples/formfiller directory for a complete working example of a Product Feedback Form - a non-PII use case demonstrating voice-powered form filling for product reviews and feedback collection.
Contributing
Contributions are welcome! Please see our GitHub repository for guidelines.
License
MIT © Mark Williams