JSPM

media-transcriber

1.0.0

Batch transcribe audio/video files using pluggable AI backends (Whisper, OpenAI API, and more)

Package Exports

  • media-transcriber
  • media-transcriber/dist/index.js

This package does not declare an exports field, so JSPM detected and optimized the exports above automatically. If a package subpath is missing, open an issue with the original package (media-transcriber) asking it to add the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

media-transcriber

Batch transcribe audio/video files using pluggable AI backends (Whisper local, OpenAI API, and more).

Quick Start

# Run directly with npx (no install)
npx media-transcriber setup

# Or install globally
npm install -g media-transcriber

# Transcribe files (stateless)
media-transcriber transcribe ./data/input ./data/output

Requirements

  • Node.js >= 18
  • FFmpeg and ffprobe in PATH
    • Windows: winget install ffmpeg or scoop install ffmpeg
    • macOS: brew install ffmpeg
    • Linux (Debian/Ubuntu): sudo apt install ffmpeg

Backend requirements:

  • whisper-local: Python 3.10+ with openai-whisper installed
  • whisper-api: OpenAI API key (OPENAI_API_KEY env var or --openai-api-key)
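The whisper-api requirement above can be checked before any files are processed. The following is a minimal sketch, not part of media-transcriber's API: the helper name `resolveOpenAiKey` is hypothetical, and the rule that an explicit flag value wins over the environment variable is an assumption, since the README does not state the precedence.

```typescript
// Hypothetical helper; media-transcriber does not document an equivalent export.
type Env = Record<string, string | undefined>;

function resolveOpenAiKey(flagValue: string | undefined, env: Env): string {
  // Assumption: a value passed via --openai-api-key takes precedence
  // over the OPENAI_API_KEY environment variable.
  const key = flagValue?.trim() || env["OPENAI_API_KEY"]?.trim() || "";
  if (key === "") {
    throw new Error("whisper-api needs OPENAI_API_KEY or --openai-api-key");
  }
  return key;
}
```

In a Node.js script, `process.env` can be passed as the second argument.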

Usage

Transcribe

# Basic
media-transcriber transcribe ./recordings ./transcripts

# Model and device
media-transcriber transcribe ./data/input ./data/output -m medium -d cpu

# OpenAI API backend
media-transcriber transcribe ./data/input ./data/output -b whisper-api --openai-api-key <key>

# Keep temporary files in ./data/output/temp
media-transcriber transcribe ./data/input ./data/output --include-temp

# JSON output for agents
media-transcriber transcribe ./data/input ./data/output --json
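When the CLI is driven from a script rather than typed by hand, the argument list can be assembled programmatically. The sketch below uses only the flags shown in the commands above (-m, -d, -b, --openai-api-key, --include-temp, --json); the `TranscribeFlags` interface and `buildTranscribeArgs` function are hypothetical names, not exports of the package.

```typescript
// Hypothetical shape covering only the flags that appear in this README.
interface TranscribeFlags {
  model?: string;           // -m
  device?: "cuda" | "cpu";  // -d
  backend?: string;         // -b
  openaiApiKey?: string;    // --openai-api-key
  includeTemp?: boolean;    // --include-temp
  json?: boolean;           // --json
}

function buildTranscribeArgs(
  inputFolder: string,
  outputFolder: string,
  flags: TranscribeFlags = {},
): string[] {
  // Positional arguments come first, as in the examples above.
  const args = ["transcribe", inputFolder, outputFolder];
  if (flags.model) args.push("-m", flags.model);
  if (flags.device) args.push("-d", flags.device);
  if (flags.backend) args.push("-b", flags.backend);
  if (flags.openaiApiKey) args.push("--openai-api-key", flags.openaiApiKey);
  if (flags.includeTemp) args.push("--include-temp");
  if (flags.json) args.push("--json");
  return args;
}
```

The resulting array can be handed to a process launcher such as `spawn("media-transcriber", args)` from Node's `node:child_process` module, which avoids shell quoting problems with folder names that contain spaces.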

Setup

Run the dependency check wizard. It does not write a persistent config file:

media-transcriber setup

Execution Options

All parameters are passed at execution time (stateless CLI).

Field                   Type         Default              Description
inputFolder             string       required             Input folder argument
outputFolder            string       required             Output folder argument
backend                 string       whisper-local        Transcription backend
whisperModel            string       large-v2             Model name
device                  cuda or cpu  cuda                 Processing device
maxDurationSeconds      number       1200                 Split threshold
enableAudioEnhancement  boolean      false                Enable enhancement filters
keepIntermediateFiles   boolean      false                Keep temp files with --include-temp
tempFolder              string       <outputFolder>/temp  Temp working folder
pythonPath              string       unset                Python path for local backend
openaiApiKey            string       env/flag             API key for OpenAI backend
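The option table can be expressed as a type with its defaults applied in one place. This is a sketch under assumptions: the field names and default values come from the table, but the `TranscribeOptions` interface and `withDefaults` function are illustrative and may not match the types the package actually ships.

```typescript
// Illustrative type mirroring the table; the package's real types may differ.
interface TranscribeOptions {
  inputFolder: string;
  outputFolder: string;
  backend: string;
  whisperModel: string;
  device: "cuda" | "cpu";
  maxDurationSeconds: number;
  enableAudioEnhancement: boolean;
  keepIntermediateFiles: boolean;
  tempFolder: string;
  pythonPath?: string;
  openaiApiKey?: string;
}

type RequiredFolders = Pick<TranscribeOptions, "inputFolder" | "outputFolder">;

function withDefaults(
  opts: RequiredFolders & Partial<TranscribeOptions>,
): TranscribeOptions {
  return {
    inputFolder: opts.inputFolder,
    outputFolder: opts.outputFolder,
    backend: opts.backend ?? "whisper-local",
    whisperModel: opts.whisperModel ?? "large-v2",
    device: opts.device ?? "cuda",
    maxDurationSeconds: opts.maxDurationSeconds ?? 1200,
    enableAudioEnhancement: opts.enableAudioEnhancement ?? false,
    keepIntermediateFiles: opts.keepIntermediateFiles ?? false,
    // The temp folder default depends on the output folder.
    tempFolder: opts.tempFolder ?? `${opts.outputFolder}/temp`,
    pythonPath: opts.pythonPath,
    openaiApiKey: opts.openaiApiKey,
  };
}
```

Using `??` rather than object spread means an explicitly `undefined` field still falls back to its default.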

AI Agent Integration

Structured JSON

media-transcriber transcribe ./data/input ./data/output --json 2>/dev/null

Progress Events (stderr)

In --json mode, NDJSON progress events (one JSON object per line) are emitted to stderr.
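An agent reading these events has to cope with stderr arriving in arbitrary chunks, so a line can be split across two reads. The following is a minimal buffering sketch; the class name `NdjsonBuffer` is hypothetical, and the events are typed as `unknown` because the README does not document their fields.

```typescript
// Hypothetical helper for consuming NDJSON from a stream of text chunks.
class NdjsonBuffer {
  private pending = "";

  // Feed one raw chunk; returns the events from every line it completed.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an unfinished line (or "" if the chunk ended in \n).
    this.pending = lines.pop() ?? "";

    const events: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        events.push(JSON.parse(trimmed));
      } catch {
        // Skip stderr lines that are not JSON (for example, plain warnings).
      }
    }
    return events;
  }
}
```

Each chunk from the child process's stderr stream would be passed to `push`, and the returned events handled as they arrive.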

Exit Codes

Code  Meaning
0     Success (all files transcribed)
1     General error
2     Missing dependency
3     Configuration/argument error
4     No input files found
10    Partial success
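A calling program can map these codes to named outcomes so that partial success is not treated as a plain failure. The sketch below follows the table above; the `Outcome` labels and the `classifyExit` function are illustrative names, and treating a `null` code (process killed by a signal) as a general error is an assumption.

```typescript
type Outcome =
  | "success"
  | "partial"
  | "missing-dependency"
  | "bad-arguments"
  | "no-input"
  | "error";

function classifyExit(code: number | null): Outcome {
  switch (code) {
    case 0:
      return "success";
    case 2:
      return "missing-dependency";
    case 3:
      return "bad-arguments";
    case 4:
      return "no-input";
    case 10:
      return "partial";
    default:
      // Code 1, a null code, or any undocumented value.
      return "error";
  }
}
```

An agent might retry only on "partial", prompt for installation on "missing-dependency", and surface the other outcomes to the user unchanged.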

Supported Formats

Input: .m4a, .mp3, .mp4, .mkv, .wav, .flac, .ogg, .webm

Output: .txt, .srt
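To predict which files in a folder the tool would pick up, a caller can filter names against the input list above. This is a sketch only: `isSupportedInput` is a hypothetical helper, and case-insensitive matching is an assumption, since the README does not say how the CLI treats upper-case extensions.

```typescript
// Input extensions copied from the list above.
const INPUT_EXTENSIONS = new Set([
  ".m4a", ".mp3", ".mp4", ".mkv", ".wav", ".flac", ".ogg", ".webm",
]);

function isSupportedInput(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  // No extension at all, or a dotfile such as ".mp3".
  if (dot <= 0) return false;
  // Assumption: extensions are compared case-insensitively.
  return INPUT_EXTENSIONS.has(fileName.slice(dot).toLowerCase());
}
```

For example, `["talk.MP3", "notes.txt", "clip.webm"].filter(isSupportedInput)` keeps the first and last names.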

Development

npm install
npm run typecheck
npm test
npm run build

License

MIT