JSPM


Expo plugin for OpenAI Whisper speech-to-text integration with React Native

Package Exports

  • expo-whisper
  • expo-whisper/build/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, consider filing an issue with the original package (expo-whisper) asking for "exports" field support. If that is not possible, create a JSPM override to customize the exports field for this package.
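
For reference, a declared exports field matching the detected subpaths might look like the following (a sketch, not the package's actual metadata):

{
  "name": "expo-whisper",
  "main": "build/index.js",
  "exports": {
    ".": "./build/index.js"
  }
}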

Readme

expo-whisper

High-performance speech-to-text transcription for React Native/Expo apps using OpenAI's Whisper model


Features

  • Real-time transcription with streaming audio support
  • Cross-platform - Works on iOS, Android, and Web
  • High accuracy using OpenAI's Whisper models
  • Native performance with C++ integration
  • Easy integration with React hooks
  • Multi-language support (100+ languages)
  • Flexible configuration for different use cases
  • Offline capable - No internet required

Getting Started

Prerequisites

Before you begin, make sure you have:

  • Expo SDK 54+ or React Native 0.74+
  • Development build environment set up (expo-whisper cannot run in Expo Go)
  • Android Studio (for Android builds) or Xcode (for iOS builds)

Step 1: Installation

# Install the package
npm install expo-whisper

# Install required peer dependencies
npx expo install expo-build-properties

Step 2: Configure app.json

Add the expo-whisper plugin to your app.json configuration:

{
  "expo": {
    "name": "Your App",
    "plugins": [
      [
        "expo-build-properties",
        {
          "android": {
            "minSdkVersion": 21,
            "compileSdkVersion": 34,
            "targetSdkVersion": 34,
            "buildToolsVersion": "34.0.0",
            "enableProguardInReleaseBuilds": true,
            "packagingOptions": {
              "pickFirst": ["**/libc++_shared.so", "**/libjsc.so"]
            }
          },
          "ios": {
            "deploymentTarget": "11.0"
          }
        }
      ],
      "expo-whisper"
    ],
    "android": {
      "permissions": [
        "android.permission.RECORD_AUDIO",
        "android.permission.READ_EXTERNAL_STORAGE",
        "android.permission.WRITE_EXTERNAL_STORAGE"
      ]
    },
    "ios": {
      "infoPlist": {
        "NSMicrophoneUsageDescription": "This app needs access to microphone for speech recognition."
      }
    }
  }
}

Step 3: Download a Whisper Model

Choose a model based on your performance and accuracy needs:

Model      Size    Speed    Accuracy  Use Case
tiny.en    39 MB   Fastest  Basic     Real-time, mobile
base.en    74 MB   Fast     Good      Mobile apps
small.en   244 MB  Medium   Better    General purpose
medium.en  769 MB  Slow     High      High accuracy needed

Download options:

# Option 1: Download to assets folder
mkdir -p assets/models
curl -L "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin" -o assets/models/ggml-base.en.bin

# Option 2: Use react-native-fs to download at runtime
npm install react-native-fs
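
For Option 2, a minimal runtime-download sketch using react-native-fs (the model URL matches Option 1; the destination path is an assumption):

import RNFS from 'react-native-fs';

const MODEL_URL =
  'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin';

// Download the model once and cache it in the app's document directory.
export async function ensureModel(): Promise<string> {
  const destination = `${RNFS.DocumentDirectoryPath}/ggml-base.en.bin`;
  if (!(await RNFS.exists(destination))) {
    await RNFS.downloadFile({ fromUrl: MODEL_URL, toFile: destination }).promise;
  }
  return destination; // pass this path to loadModel()
}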

Step 4: Build Your App

Important: expo-whisper requires a development build:

# For Android
npx expo run:android

# For iOS  
npx expo run:ios

# Or use EAS Build
eas build --platform android --profile development
eas build --platform ios --profile development

Cannot use Expo Go - Native libraries require a development build

Step 5: Basic Implementation

import React, { useEffect, useState } from 'react';
import { View, Button, Text, Alert } from 'react-native';
import { useWhisper } from 'expo-whisper';
import { Audio } from 'expo-av';

export default function App() {
  const {
    isModelLoaded,
    isTranscribing,
    lastResult,
    error,
    loadModel,
    transcribeFile,
    clearError,
  } = useWhisper();

  const [recording, setRecording] = useState<Audio.Recording>();

  useEffect(() => {
    // Load model when app starts
    const initializeWhisper = async () => {
      try {
        // Path to your downloaded model
        await loadModel('/path/to/assets/models/ggml-base.en.bin');
        Alert.alert('Success', 'Whisper model loaded successfully!');
      } catch (error) {
        Alert.alert('Error', `Failed to load model: ${error.message}`);
      }
    };

    initializeWhisper();
  }, []);

  const startRecording = async () => {
    try {
      const permission = await Audio.requestPermissionsAsync();
      if (permission.status !== 'granted') {
        Alert.alert('Permission required', 'Microphone access is needed');
        return;
      }

      await Audio.setAudioModeAsync({
        allowsRecordingIOS: true,
        playsInSilentModeIOS: true,
      });

      const { recording } = await Audio.Recording.createAsync(
        Audio.RecordingOptionsPresets.HIGH_QUALITY
      );
      setRecording(recording);
    } catch (err) {
      console.error('Failed to start recording', err);
    }
  };

  const stopRecording = async () => {
    if (!recording) return;

    setRecording(undefined);
    await recording.stopAndUnloadAsync();
    
    const uri = recording.getURI();
    if (uri) {
      try {
        const result = await transcribeFile(uri, {
          language: 'en',
          temperature: 0.0,
        });
        Alert.alert('Transcription', result.text);
      } catch (error) {
        Alert.alert('Error', `Transcription failed: ${error.message}`);
      }
    }
  };

  return (
    <View style={{ flex: 1, padding: 20, justifyContent: 'center' }}>
      <Text style={{ fontSize: 18, marginBottom: 20, textAlign: 'center' }}>
        Whisper Speech-to-Text
      </Text>

      <Text style={{ marginBottom: 10 }}>
        Model Status: {isModelLoaded ? 'Loaded' : 'Loading...'}
      </Text>

      <Button
        title={recording ? 'Stop Recording' : 'Start Recording'}
        onPress={recording ? stopRecording : startRecording}
        disabled={!isModelLoaded || isTranscribing}
      />

      {lastResult && (
        <View style={{ marginTop: 20, padding: 10, backgroundColor: '#f0f0f0' }}>
          <Text style={{ fontWeight: 'bold' }}>Last Transcription:</Text>
          <Text>{lastResult.text}</Text>
        </View>
      )}

      {error && (
        <View style={{ marginTop: 20 }}>
          <Text style={{ color: 'red' }}>Error: {error}</Text>
          <Button title="Clear Error" onPress={clearError} />
        </View>
      )}
    </View>
  );
}

Step 6: Test Your Implementation

  1. Build and install your development build
  2. Grant microphone permissions when prompted
  3. Tap "Start Recording" and speak clearly
  4. Tap "Stop Recording" to see transcription results

Configuration Options

Advanced app.json Setup

For production apps, you may want additional configuration:

{
  "expo": {
    "plugins": [
      [
        "expo-build-properties",
        {
          "android": {
            "minSdkVersion": 21,
            "compileSdkVersion": 34,
            "targetSdkVersion": 34,
            "proguardMinifyEnabled": true,
            "enableProguardInReleaseBuilds": true,
            "packagingOptions": {
              "pickFirst": [
                "**/libc++_shared.so",
                "**/libjsc.so",
                "**/libfbjni.so"
              ]
            }
          },
          "ios": {
            "deploymentTarget": "11.0",
            "bundler": "metro"
          }
        }
      ],
      [
        "expo-whisper",
        {
          "modelPath": "assets/models/ggml-base.en.bin",
          "enableMicrophone": true,
          "enableAudioSession": true
        }
      ]
    ]
  }
}

EAS Build Configuration

Create eas.json for cloud builds:

{
  "cli": {
    "version": ">= 5.9.0"
  },
  "build": {
    "development": {
      "developmentClient": true,
      "distribution": "internal",
      "android": {
        "buildType": "developmentBuild"
      },
      "ios": {
        "buildConfiguration": "Debug"
      }
    },
    "preview": {
      "distribution": "internal",
      "android": {
        "buildType": "apk"
      }
    },
    "production": {
      "android": {
        "buildType": "app-bundle"
      }
    }
  }
}

Development Build Required

expo-whisper uses native libraries and cannot run in Expo Go. You must use a development build:

Why Development Build is Required:

  • Native Libraries: expo-whisper includes compiled C++ libraries (libwhisper.so)
  • Custom Native Code: Direct integration with Whisper C++ implementation
  • Platform-specific Optimizations: Hardware-accelerated audio processing

Setting Up Development Build:

# Install development build tools
npx expo install expo-dev-client

# Build for development
npx expo run:android  # Local Android build
npx expo run:ios      # Local iOS build

# Or use EAS Build (recommended)
eas build --platform android --profile development
eas build --platform ios --profile development

Next Steps

After completing the setup:

  1. Explore the API: Check out the API Reference section
  2. Real-time Streaming: Learn about streaming transcription
  3. Performance Optimization: Read our Audio Guide
  4. Real-world Examples: See REAL_WORLD_EXAMPLE.md

API Reference

Hooks

useWhisper()

The main hook for managing Whisper transcription state and actions.

Returns:

{
  // State
  isModelLoaded: boolean;
  isLoading: boolean;
  isTranscribing: boolean;
  error: string | null;
  lastResult: WhisperResult | null;

  // Actions
  loadModel: (modelPath: string) => Promise<void>;
  transcribe: (audioPath: string, config?: WhisperConfig) => Promise<WhisperResult>;
  transcribeFile: (audioPath: string, config?: WhisperConfig) => Promise<WhisperResult>;
  transcribePCM: (pcmData: Uint8Array, config?: WhisperPCMConfig) => Promise<WhisperResult>;
  releaseModel: () => Promise<void>;
  getModelInfo: () => Promise<string>;
  getSupportedFormats: () => Promise<string>;
  clearError: () => void;
}
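
releaseModel is easy to forget; the sketch below (the useWhisperModel wrapper name is ours, not part of the package) loads a model on mount and frees the native resources on unmount:

import { useEffect } from 'react';
import { useWhisper } from 'expo-whisper';

// Hypothetical convenience wrapper around useWhisper().
export function useWhisperModel(modelPath: string) {
  const whisper = useWhisper();
  const { loadModel, releaseModel } = whisper;

  useEffect(() => {
    loadModel(modelPath).catch(console.error);
    return () => {
      // Free the native model when the consuming component unmounts.
      releaseModel().catch(console.error);
    };
  }, [modelPath]);

  return whisper;
}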

Core API

ExpoWhisper.loadModel(modelPath: string)

Load a Whisper model from the given file path.

ExpoWhisper.transcribeFile(audioPath: string, config?: WhisperConfig)

Transcribe an audio file. Supports WAV, MP3, M4A formats.

ExpoWhisper.transcribePCM(pcmData: Uint8Array, config?: WhisperPCMConfig)

Transcribe raw PCM audio data for real-time streaming.
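
A minimal end-to-end sketch of the core API. The import form is an assumption (the README references these names but not the module export); adjust it to the package's actual export:

import { ExpoWhisper } from 'expo-whisper'; // import form assumed

async function transcribeClip(modelPath: string, audioPath: string) {
  await ExpoWhisper.loadModel(modelPath);
  const result = await ExpoWhisper.transcribeFile(audioPath, { language: 'en' });
  console.log(result.text);
}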

Configuration Options

WhisperConfig

interface WhisperConfig {
  language?: string;           // Language code (e.g., 'en', 'es', 'auto')
  temperature?: number;        // Sampling temperature (0.0 - 1.0)
  maxTokens?: number;         // Maximum tokens to generate
  beamSize?: number;          // Beam search size
  bestOf?: number;            // Number of candidates
  patience?: number;          // Beam search patience
  lengthPenalty?: number;     // Length penalty
  suppressTokens?: number[];  // Tokens to suppress
  initialPrompt?: string;     // Initial prompt text
  wordTimestamps?: boolean;   // Include word-level timestamps
  prependPunctuations?: string; // Punctuation merged onto the following word
  appendPunctuations?: string;  // Punctuation merged onto the preceding word
}

WhisperPCMConfig

interface WhisperPCMConfig {
  sampleRate: number;         // Audio sample rate (16000 recommended)
  channels: number;           // Number of channels (1 for mono)
  realtime?: boolean;         // Enable real-time processing
  chunkSize?: number;         // Processing chunk size
}
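
transcribePCM takes raw bytes, so microphone samples usually need packing first. A hypothetical helper, assuming the module expects 16-bit little-endian mono PCM (that byte layout is our assumption, not documented here):

// Pack Float32 samples in [-1, 1] into 16-bit little-endian PCM bytes.
function floatTo16BitPCM(samples: Float32Array): Uint8Array {
  const view = new DataView(new ArrayBuffer(samples.length * 2));
  samples.forEach((s, i) => {
    const clamped = Math.max(-1, Math.min(1, s));
    view.setInt16(i * 2, clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff, true);
  });
  return new Uint8Array(view.buffer);
}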

Audio Format Support

Supported Formats:

  • WAV (recommended): 16kHz, 16-bit PCM, mono/stereo
  • MP3: Various bitrates and sample rates
  • M4A: AAC encoded audio
  • Raw PCM: For streaming applications

Optimal Settings for Best Performance:

  • Sample Rate: 16kHz
  • Bit Depth: 16-bit
  • Channels: Mono (1 channel)
  • Format: WAV or raw PCM for streaming
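
One way to request these settings when recording with expo-av (enum names are expo-av's; Android's MediaRecorder cannot write WAV, so this sketch falls back to 16 kHz mono M4A there):

import { Audio } from 'expo-av';

// Recording options tuned for Whisper: 16 kHz, mono, 16-bit where the platform allows.
const whisperRecordingOptions: Audio.RecordingOptions = {
  isMeteringEnabled: false,
  android: {
    extension: '.m4a',
    outputFormat: Audio.AndroidOutputFormat.MPEG_4,
    audioEncoder: Audio.AndroidAudioEncoder.AAC,
    sampleRate: 16000,
    numberOfChannels: 1,
    bitRate: 64000,
  },
  ios: {
    extension: '.wav',
    outputFormat: Audio.IOSOutputFormat.LINEARPCM,
    audioQuality: Audio.IOSAudioQuality.HIGH,
    sampleRate: 16000,
    numberOfChannels: 1,
    bitRate: 256000,
    linearPCMBitDepth: 16,
    linearPCMIsBigEndian: false,
    linearPCMIsFloat: false,
  },
  web: {
    mimeType: 'audio/webm',
    bitsPerSecond: 128000,
  },
};

// Usage: const { recording } = await Audio.Recording.createAsync(whisperRecordingOptions);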

Advanced Usage

Real-time Streaming

import { useWhisper } from 'expo-whisper';

function StreamingComponent() {
  const { transcribePCM } = useWhisper();

  const processAudioChunk = async (audioData: Uint8Array) => {
    const result = await transcribePCM(audioData, {
      sampleRate: 16000,
      channels: 1,
      realtime: true,
    });

    console.log('Live transcription:', result.text);
  };

  // Wire processAudioChunk to your audio source; UI omitted in this sketch.
  return null;
}

Batch Processing

// transcribeFile here comes from useWhisper(), as shown above
const processMultipleFiles = async (filePaths: string[]) => {
  const results = [];

  for (const path of filePaths) {
    try {
      const result = await transcribeFile(path, {
        language: 'auto',
        temperature: 0.0,
      });
      results.push({ path, text: result.text });
    } catch (error) {
      results.push({ path, error: error.message });
    }
  }

  return results;
};

Custom Configuration Presets

// High accuracy preset
const highAccuracyConfig = {
  temperature: 0.0,
  beamSize: 5,
  bestOf: 5,
  wordTimestamps: true,
};

// Fast processing preset
const fastConfig = {
  temperature: 0.1,
  beamSize: 1,
  bestOf: 1,
  maxTokens: 224,
};

// Real-time streaming preset
const streamingConfig = {
  sampleRate: 16000,
  channels: 1,
  realtime: true,
  chunkSize: 1024,
};

Platform Support

Platform  Status     Native Library
iOS       Supported  libwhisper.a
Android   Supported  libwhisper.so
Web       Planned    WASM

Development

Building from Source

# Clone the repository
git clone https://github.com/poovarasan4046/expo-whisper.git
cd expo-whisper

# Install dependencies
npm install

# Build the package
npm run build

# Run tests
npm test

Native Dependencies

The package includes pre-compiled Whisper libraries:

  • Android: libwhisper.so (ARM64, ARMv7)
  • iOS: libwhisper.a (ARM64, x86_64)

Contributing

We welcome contributions! Please see our Contributing Guide for details.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Support


Made with ❤️ for the React Native community