speak2text
Speech-to-text CLI tool. Drop audio or video files into input/, run s2t transcribe, get transcripts in output/. Powered by OpenAI Whisper.
How It Works
- Put any audio or video file in input/
- Run s2t transcribe
- Transcripts appear in output/
Supported natively (sent directly to OpenAI): mp3, mp4, m4a, wav, webm
Requires ffmpeg (converted to mp3 first): mkv, avi, mov, flac, ogg, and any other format
Video files are always stripped to audio only — no video is uploaded.
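The routing rule above can be sketched as a small extension check. This is only an illustration based on the two format lists in this section: the names NATIVE_EXTENSIONS, extensionOf, and needsConversion are hypothetical and are not taken from the speak2text source.

```typescript
// Formats this README lists as sent directly to OpenAI.
const NATIVE_EXTENSIONS = new Set(["mp3", "mp4", "m4a", "wav", "webm"]);

// Hypothetical helper: lowercase extension without the dot ("" if none).
function extensionOf(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
}

// Hypothetical helper: true when the file would need an ffmpeg pass
// to mp3 before upload (mkv, avi, mov, flac, ogg, or anything else).
function needsConversion(fileName: string): boolean {
  return !NATIVE_EXTENSIONS.has(extensionOf(fileName));
}
```

Under these assumptions, needsConversion("chile_promo.mp3") is false, while needsConversion("talk.mkv") is true and would require ffmpeg to be installed.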
Example
$ s2t transcribe
Transcribing 1 file(s)...
chile_promo.mp3 — transcribing...
✓ output/chile_promo.txt [19dee66d]
Done.
Output:
September 11, 1973, a military coup overthrows the government in Chile, ending the longest
democratic tradition in Latin America. It was a bloody, bloody coup. Chileans who lived
through the coup and years of repression reflect on its meaning for us today.
Tech Stack
| Tool | Purpose |
|---|---|
| TypeScript | Language |
| Commander | CLI framework |
| better-sqlite3 | Local transcript history |
| OpenAI Whisper API | Transcription |
| ffmpeg | Audio conversion (optional, only for unsupported formats) |
| Vitest | Testing |
| Biome | Lint & format |
| pnpm | Package manager |
Requirements
- Node.js 22+
- pnpm
- OpenAI API key (or a Grok or Gemini API key)
- ffmpeg: only needed for formats that are not supported natively (mkv, avi, mov, flac, ogg, ...)
Install ffmpeg (if needed)
macOS
brew install ffmpeg
Linux (Debian/Ubuntu)
sudo apt install ffmpeg
Windows
winget install ffmpeg
Installation
pnpm add -g speak2text
Configuration
Store your API key in ~/.speak2text/.env:
OPENAI_API_KEY=your-key-here
Or set the default provider:
s2t config set provider openai
Other providers:
GROK_API_KEY=your-key-here
GEMINI_API_KEY=your-key-here
Usage
Transcribe all files in input/
s2t transcribe
Transcribe a specific file
s2t transcribe path/to/audio.mp3
Options
s2t transcribe --format srt
s2t transcribe --format json
s2t transcribe --provider gemini
s2t transcribe --language fi
Manage transcripts
s2t list
s2t show <id>
s2t export <id> --format srt
s2t delete <id>
Output Formats
| Format | Description |
|---|---|
| txt | Plain text (default) |
| srt | SRT subtitles with timestamps |
| json | Full JSON with timestamps, confidence, and metadata |
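To make the srt row concrete, here is a minimal sketch of turning timed segments into SRT text. The Segment shape and the function names are assumptions made for illustration; the JSON that speak2text actually stores may be structured differently. Only the SRT layout itself (index, HH:MM:SS,mmm --> HH:MM:SS,mmm, text, blank line) is standard.

```typescript
// Hypothetical segment shape; the tool's real JSON schema may differ.
interface Segment {
  start: number; // seconds from the beginning of the audio
  end: number;   // seconds from the beginning of the audio
  text: string;
}

// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm.
function srtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Build SRT cues: 1-based index, time range, text, separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${srtTimestamp(seg.start)} --> ${srtTimestamp(seg.end)}\n${seg.text.trim()}\n`)
    .join("\n");
}
```

With this sketch, a segment running from 0 to 2.5 seconds is written with the time line 00:00:00,000 --> 00:00:02,500.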
Providers
| Provider | Flag | Notes |
|---|---|---|
| OpenAI Whisper | --provider openai | Default. $0.006/min |
| Grok | --provider grok | OpenAI-compatible API |
| Gemini | --provider gemini | OpenAI-compatible API |
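As a rough guide to the $0.006/min figure in the table, the cost of a file can be estimated from its duration. This sketch assumes simple pro-rata billing with no minimum charge or rounding by the provider, which may not match actual invoicing. The constant and function names are hypothetical, and the price is only the value quoted in the table above.

```typescript
// Price quoted in the Providers table; treat it as a snapshot that may change.
const OPENAI_USD_PER_MINUTE = 0.006;

// Estimate the cost in US dollars for an audio duration given in seconds.
// Assumption: linear pro-rating, no per-request minimum.
function estimateCostUsd(durationSeconds: number): number {
  if (durationSeconds < 0) {
    throw new RangeError("duration must be non-negative");
  }
  const minutes = durationSeconds / 60;
  // Round to 4 decimal places to hide floating-point noise.
  return Math.round(minutes * OPENAI_USD_PER_MINUTE * 10000) / 10000;
}
```

Under these assumptions a one-hour recording (3600 seconds) comes to about $0.36.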
Storage
Transcripts are stored locally in SQLite:
- macOS/Linux: ~/.speak2text/transcripts.db
Config and API keys:
- macOS/Linux: ~/.speak2text/.env
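The .env file holds plain KEY=value lines such as OPENAI_API_KEY=your-key-here. As an illustration of that format only, the sketch below parses such text into an object. It is a simplified, hypothetical parser (parseEnv is not a speak2text function); the tool itself may use a library such as dotenv, which handles more cases.

```typescript
// Parse KEY=value lines. Blank lines and lines starting with # are skipped.
// Simplified on purpose: no multi-line values, no variable expansion.
function parseEnv(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no '=' or empty key: ignore the line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    if (value.length >= 2 && (first === '"' || first === "'") && value.charAt(value.length - 1) === first) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}
```

Reading the file from ~/.speak2text/.env and passing its contents to parseEnv is left out here so that the example stays free of file-system access.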
Roadmap
- v0.2.0: --translate flag (transcribe and translate to English in one step)
License
See MIT