expo-whisper
High-performance speech-to-text transcription for React Native/Expo apps using OpenAI's Whisper model
Features
- Real-time transcription with streaming audio support
- Cross-platform - Works on iOS and Android, with Web (WASM) support planned
- High accuracy using OpenAI's Whisper models
- Native performance with C++ integration
- Easy integration with React hooks
- Multi-language support (100+ languages)
- Flexible configuration for different use cases
- Offline capable - No internet required
Getting Started
Prerequisites
Before you begin, make sure you have:
- Expo SDK 54+ or React Native 0.74+
- Development build environment set up (expo-whisper cannot run in Expo Go)
- Android Studio (for Android builds) or Xcode (for iOS builds)
Step 1: Installation
# Install the package
npm install expo-whisper
# Install required peer dependencies
npx expo install expo-build-properties
Step 2: Configure app.json
Add the expo-whisper plugin to your app.json configuration:
{
"expo": {
"name": "Your App",
"plugins": [
[
"expo-build-properties",
{
"android": {
"minSdkVersion": 21,
"compileSdkVersion": 34,
"targetSdkVersion": 34,
"buildToolsVersion": "34.0.0",
"enableProguardInReleaseBuilds": true,
"packagingOptions": {
"pickFirst": ["**/libc++_shared.so", "**/libjsc.so"]
}
},
"ios": {
"deploymentTarget": "11.0"
}
}
],
"expo-whisper"
],
"android": {
"permissions": [
"android.permission.RECORD_AUDIO",
"android.permission.READ_EXTERNAL_STORAGE",
"android.permission.WRITE_EXTERNAL_STORAGE"
]
},
"ios": {
"infoPlist": {
"NSMicrophoneUsageDescription": "This app needs access to microphone for speech recognition."
}
}
}
}
Step 3: Download a Whisper Model
Choose a model based on your performance and accuracy needs:
| Model | Size | Speed | Accuracy | Use Case |
|---|---|---|---|---|
| tiny.en | 39 MB | Fastest | Basic | Real-time, mobile |
| base.en | 74 MB | Fast | Good | Mobile apps |
| small.en | 244 MB | Medium | Better | General purpose |
| medium.en | 769 MB | Slow | High | High accuracy needed |
Download options:
# Option 1: Download to assets folder
mkdir -p assets/models
curl -L "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin" -o assets/models/ggml-base.en.bin
# Option 2: Use react-native-fs to download at runtime
npm install react-native-fs
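For Option 2, here is a minimal sketch of downloading the model at runtime with react-native-fs. The Hugging Face URL is the same one used above; the destination filename, caching check, and error handling are illustrative choices, not part of the expo-whisper API:
import RNFS from 'react-native-fs';

const MODEL_URL =
  'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin';
const MODEL_PATH = `${RNFS.DocumentDirectoryPath}/ggml-base.en.bin`;

// Download the model once and cache it in the app's document directory.
export async function ensureModelDownloaded(): Promise<string> {
  if (await RNFS.exists(MODEL_PATH)) {
    return MODEL_PATH; // Already cached from a previous launch
  }
  const { promise } = RNFS.downloadFile({ fromUrl: MODEL_URL, toFile: MODEL_PATH });
  const { statusCode } = await promise;
  if (statusCode !== 200) {
    throw new Error(`Model download failed with status ${statusCode}`);
  }
  return MODEL_PATH;
}
The resolved path can then be passed to loadModel in Step 5.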
Step 4: Build Your App
Important: expo-whisper requires a development build:
# For Android
npx expo run:android
# For iOS
npx expo run:ios
# Or use EAS Build
eas build --platform android --profile development
eas build --platform ios --profile development
Cannot use Expo Go - native libraries require a development build.
Step 5: Basic Implementation
import React, { useEffect, useState } from 'react';
import { View, Button, Text, Alert } from 'react-native';
import { useWhisper } from 'expo-whisper';
import { Audio } from 'expo-av';
export default function App() {
const {
isModelLoaded,
isTranscribing,
lastResult,
error,
loadModel,
transcribeFile,
clearError,
} = useWhisper();
const [recording, setRecording] = useState<Audio.Recording>();
useEffect(() => {
// Load model when app starts
const initializeWhisper = async () => {
try {
// Path to the model file on the device (see Step 3 for download options)
await loadModel('/path/to/assets/models/ggml-base.en.bin');
Alert.alert('Success', 'Whisper model loaded successfully!');
} catch (error) {
Alert.alert('Error', `Failed to load model: ${error.message}`);
}
};
initializeWhisper();
}, []);
const startRecording = async () => {
try {
const permission = await Audio.requestPermissionsAsync();
if (permission.status !== 'granted') {
Alert.alert('Permission required', 'Microphone access is needed');
return;
}
await Audio.setAudioModeAsync({
allowsRecordingIOS: true,
playsInSilentModeIOS: true,
});
const { recording } = await Audio.Recording.createAsync(
Audio.RecordingOptionsPresets.HIGH_QUALITY
);
setRecording(recording);
} catch (err) {
console.error('Failed to start recording', err);
}
};
const stopRecording = async () => {
if (!recording) return;
setRecording(undefined);
await recording.stopAndUnloadAsync();
const uri = recording.getURI();
if (uri) {
try {
const result = await transcribeFile(uri, {
language: 'en',
temperature: 0.0,
});
Alert.alert('Transcription', result.text);
} catch (error) {
Alert.alert('Error', `Transcription failed: ${error.message}`);
}
}
};
return (
<View style={{ flex: 1, padding: 20, justifyContent: 'center' }}>
<Text style={{ fontSize: 18, marginBottom: 20, textAlign: 'center' }}>
Whisper Speech-to-Text
</Text>
<Text style={{ marginBottom: 10 }}>
Model Status: {isModelLoaded ? 'Loaded' : 'Loading...'}
</Text>
<Button
title={recording ? 'Stop Recording' : 'Start Recording'}
onPress={recording ? stopRecording : startRecording}
disabled={!isModelLoaded || isTranscribing}
/>
{lastResult && (
<View style={{ marginTop: 20, padding: 10, backgroundColor: '#f0f0f0' }}>
<Text style={{ fontWeight: 'bold' }}>Last Transcription:</Text>
<Text>{lastResult.text}</Text>
</View>
)}
{error && (
<View style={{ marginTop: 20 }}>
<Text style={{ color: 'red' }}>Error: {error}</Text>
<Button title="Clear Error" onPress={clearError} />
</View>
)}
</View>
);
}
Step 6: Test Your Implementation
- Build and install your development build
- Grant microphone permissions when prompted
- Tap "Start Recording" and speak clearly
- Tap "Stop Recording" to see transcription results
Configuration Options
Advanced app.json Setup
For production apps, you may want additional configuration:
{
"expo": {
"plugins": [
[
"expo-build-properties",
{
"android": {
"minSdkVersion": 21,
"compileSdkVersion": 34,
"targetSdkVersion": 34,
"proguardMinifyEnabled": true,
"enableProguardInReleaseBuilds": true,
"packagingOptions": {
"pickFirst": [
"**/libc++_shared.so",
"**/libjsc.so",
"**/libfbjni.so"
]
}
},
"ios": {
"deploymentTarget": "11.0",
"bundler": "metro"
}
}
],
[
"expo-whisper",
{
"modelPath": "assets/models/ggml-base.en.bin",
"enableMicrophone": true,
"enableAudioSession": true
}
]
]
}
}
EAS Build Configuration
Create eas.json for cloud builds:
{
"cli": {
"version": ">= 5.9.0"
},
"build": {
"development": {
"developmentClient": true,
"distribution": "internal",
"android": {
"buildType": "developmentBuild"
},
"ios": {
"buildConfiguration": "Debug"
}
},
"preview": {
"distribution": "internal",
"android": {
"buildType": "apk"
}
},
"production": {
"android": {
"buildType": "app-bundle"
}
}
}
}
Development Build Required
expo-whisper uses native libraries and cannot run in Expo Go. You must use a development build:
Why Development Build is Required:
- Native Libraries: expo-whisper includes compiled C++ libraries (libwhisper.so)
- Custom Native Code: Direct integration with Whisper C++ implementation
- Platform-specific Optimizations: Hardware-accelerated audio processing
Setting Up Development Build:
# Install development build tools
npx expo install expo-dev-client
# Build for development
npx expo run:android # Local Android build
npx expo run:ios # Local iOS build
# Or use EAS Build (recommended)
eas build --platform android --profile development
eas build --platform ios --profile development
Next Steps
After completing the setup:
- Explore the API: Check out the API Reference section
- Real-time Streaming: Learn about streaming transcription
- Performance Optimization: Read our Audio Guide
- Real-world Examples: See REAL_WORLD_EXAMPLE.md
API Reference
Hooks
useWhisper()
The main hook for managing Whisper transcription state and actions.
Returns:
{
// State
isModelLoaded: boolean;
isLoading: boolean;
isTranscribing: boolean;
error: string | null;
lastResult: WhisperResult | null;
// Actions
loadModel: (modelPath: string) => Promise<void>;
transcribe: (audioPath: string, config?: WhisperConfig) => Promise<WhisperResult>;
transcribeFile: (audioPath: string, config?: WhisperConfig) => Promise<WhisperResult>;
transcribePCM: (pcmData: Uint8Array, config?: WhisperPCMConfig) => Promise<WhisperResult>;
releaseModel: () => Promise<void>;
getModelInfo: () => Promise<string>;
getSupportedFormats: () => Promise<string>;
clearError: () => void;
}
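Since the hook exposes releaseModel, you can free the loaded model's native memory when a screen unmounts. A minimal sketch; the component name and model path are placeholders:
import React, { useEffect } from 'react';
import { useWhisper } from 'expo-whisper';

function TranscriptionScreen() {
  const { loadModel, releaseModel } = useWhisper();

  useEffect(() => {
    // Load on mount; release the native model on unmount.
    loadModel('/path/to/ggml-base.en.bin').catch(console.error);
    return () => {
      releaseModel().catch(console.error);
    };
  }, []);

  return null; // Render your recording UI here
}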
Core API
ExpoWhisper.loadModel(modelPath: string)
Load a Whisper model from the given file path.
ExpoWhisper.transcribeFile(audioPath: string, config?: WhisperConfig)
Transcribe an audio file. Supports WAV, MP3, M4A formats.
ExpoWhisper.transcribePCM(pcmData: Uint8Array, config?: WhisperPCMConfig)
Transcribe raw PCM audio data for real-time streaming.
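The same operations can be used imperatively. A sketch, assuming the module-level API mirrors the hook's actions and is importable as a default export (check the package's index for the exact import form):
import ExpoWhisper from 'expo-whisper';

async function transcribeOnce(modelPath: string, audioPath: string): Promise<string> {
  await ExpoWhisper.loadModel(modelPath);
  try {
    const result = await ExpoWhisper.transcribeFile(audioPath, { language: 'en' });
    return result.text;
  } finally {
    // Free the native model even if transcription throws
    // (assumes releaseModel is also exposed on the module).
    await ExpoWhisper.releaseModel();
  }
}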
Configuration Options
WhisperConfig
interface WhisperConfig {
language?: string; // Language code (e.g., 'en', 'es', 'auto')
temperature?: number; // Sampling temperature (0.0 - 1.0)
maxTokens?: number; // Maximum tokens to generate
beamSize?: number; // Beam search size
bestOf?: number; // Number of candidates
patience?: number; // Beam search patience
lengthPenalty?: number; // Length penalty
suppressTokens?: number[]; // Tokens to suppress
initialPrompt?: string; // Initial prompt text
wordTimestamps?: boolean; // Include word-level timestamps
prependPunctuations?: string; // Punctuation attached to the start of words
appendPunctuations?: string; // Punctuation attached to the end of words
}
WhisperPCMConfig
interface WhisperPCMConfig {
sampleRate: number; // Audio sample rate (16000 recommended)
channels: number; // Number of channels (1 for mono)
realtime?: boolean; // Enable real-time processing
chunkSize?: number; // Processing chunk size
}
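transcribePCM takes raw bytes. Below is a sketch of converting Float32 samples, as many capture libraries produce, into 16-bit little-endian PCM matching the recommended settings; floatTo16BitPCM is a local helper, and the audio-capture source itself is outside this package:
// Convert normalized Float32 samples in [-1, 1] to 16-bit little-endian PCM bytes.
function floatTo16BitPCM(samples: Float32Array): Uint8Array {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range samples
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true); // true = little-endian
  }
  return new Uint8Array(buffer);
}

// Usage with the hook:
// const result = await transcribePCM(floatTo16BitPCM(chunk), { sampleRate: 16000, channels: 1 });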
Audio Format Support
Supported Formats:
- WAV (recommended): 16kHz, 16-bit PCM, mono/stereo
- MP3: Various bitrates and sample rates
- M4A: AAC encoded audio
- Raw PCM: For streaming applications
Optimal Settings for Best Performance (see the expo-av sketch after this list):
- Sample Rate: 16kHz
- Bit Depth: 16-bit
- Channels: Mono (1 channel)
- Format: WAV or raw PCM for streaming
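To approximate these settings with expo-av (used in Step 5), here is a sketch of custom recording options. whisperRecordingOptions is a hypothetical preset, not part of expo-whisper: iOS can record 16 kHz 16-bit linear PCM WAV, while Android's MediaRecorder cannot emit WAV, so the Android side records 16 kHz mono AAC (M4A), which transcribeFile also accepts:
import { Audio } from 'expo-av';

const whisperRecordingOptions: Audio.RecordingOptions = {
  isMeteringEnabled: false,
  android: {
    // MediaRecorder cannot write WAV; record 16 kHz mono AAC instead.
    extension: '.m4a',
    outputFormat: Audio.AndroidOutputFormat.MPEG_4,
    audioEncoder: Audio.AndroidAudioEncoder.AAC,
    sampleRate: 16000,
    numberOfChannels: 1,
    bitRate: 64000,
  },
  ios: {
    // 16 kHz, 16-bit, mono linear PCM in a WAV container.
    extension: '.wav',
    outputFormat: Audio.IOSOutputFormat.LINEARPCM,
    audioQuality: Audio.IOSAudioQuality.HIGH,
    sampleRate: 16000,
    numberOfChannels: 1,
    bitRate: 256000,
    linearPCMBitDepth: 16,
    linearPCMIsBigEndian: false,
    linearPCMIsFloat: false,
  },
  web: {
    mimeType: 'audio/wav',
    bitsPerSecond: 128000,
  },
};

// Usage: const { recording } = await Audio.Recording.createAsync(whisperRecordingOptions);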
Advanced Usage
Real-time Streaming
import { useWhisper } from 'expo-whisper';
function StreamingComponent() {
  const { transcribePCM } = useWhisper();

  // Call this with each PCM chunk from your audio-capture pipeline.
  const processAudioChunk = async (audioData: Uint8Array) => {
    const result = await transcribePCM(audioData, {
      sampleRate: 16000,
      channels: 1,
      realtime: true,
    });
    console.log('Live transcription:', result.text);
  };

  return null; // Wire processAudioChunk into a recorder and render your UI here
}
Batch Processing
// transcribeFile comes from useWhisper() in a surrounding component
const processMultipleFiles = async (filePaths: string[]) => {
const results = [];
for (const path of filePaths) {
try {
const result = await transcribeFile(path, {
language: 'auto',
temperature: 0.0,
});
results.push({ path, text: result.text });
} catch (error) {
results.push({ path, error: error.message });
}
}
return results;
};
Custom Configuration Presets
// High accuracy preset
const highAccuracyConfig = {
temperature: 0.0,
beamSize: 5,
bestOf: 5,
wordTimestamps: true,
};
// Fast processing preset
const fastConfig = {
temperature: 0.1,
beamSize: 1,
bestOf: 1,
maxTokens: 224,
};
// Real-time streaming preset
const streamingConfig = {
sampleRate: 16000,
channels: 1,
realtime: true,
chunkSize: 1024,
};
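Presets are plain objects, so they drop straight into the transcription calls shown earlier (uri here stands for an audio file path, as in Step 5):
// Use a preset as-is, or spread it and override individual fields:
const accurate = await transcribeFile(uri, highAccuracyConfig);
const quick = await transcribeFile(uri, { ...fastConfig, language: 'en' });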
Platform Support
| Platform | Status | Native Library |
|---|---|---|
| iOS | Supported | libwhisper.a |
| Android | Supported | libwhisper.so |
| Web | Planned | WASM |
Development
Building from Source
# Clone the repository
git clone https://github.com/poovarasan4046/expo-whisper.git
cd expo-whisper
# Install dependencies
npm install
# Build the package
npm run build
# Run tests
npm test
Native Dependencies
The package includes pre-compiled Whisper libraries:
- Android: libwhisper.so (ARM64, ARMv7)
- iOS: libwhisper.a (ARM64, x86_64)
Contributing
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- OpenAI Whisper - The amazing speech recognition model
- whisper.cpp - C++ implementation
- Expo - Development platform
Support
Made with ❤️ for the React Native community