# Transcribe CLI
A TypeScript CLI tool for transcribing and summarizing MP4 recordings using local models.
## Prerequisites

- Node.js 18+
- FFmpeg installed (`brew install ffmpeg` on macOS)
- OpenAI Whisper (`pip install openai-whisper`) - runs locally for transcription
- Ollama - `ollama serve` (port 11434) for summarization
  - Install: `curl -fsSL https://ollama.com/install.sh | sh`
  - Pull model: `ollama pull gemma3`
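To verify these prerequisites are in place before a first run, a quick preflight script along these lines can help (a hypothetical helper, not part of this package; it assumes the binaries are on your PATH and Ollama is on its default port):

```typescript
import { execFileSync } from "node:child_process";

// Probe a CLI tool by running a harmless command and checking it exits cleanly.
function hasBinary(bin: string, args: string[]): boolean {
  try {
    execFileSync(bin, args, { stdio: "ignore" });
    return true;
  } catch {
    return false;
  }
}

console.log("ffmpeg:", hasBinary("ffmpeg", ["-version"]) ? "OK" : "missing");
console.log("whisper:", hasBinary("whisper", ["--help"]) ? "OK" : "missing");

// GET /api/tags lists the models Ollama has pulled; a response means the server is up.
try {
  const res = await fetch("http://localhost:11434/api/tags");
  const { models } = (await res.json()) as { models: { name: string }[] };
  console.log("ollama: OK, models:", models.map((m) => m.name).join(", "));
} catch {
  console.error("ollama: not reachable on port 11434 (run `ollama serve`)");
}
```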
## How it works

- **Audio Extraction**: Uses FFmpeg to extract audio from the MP4
- **Transcription**: Uses Whisper (runs locally on your machine) to convert speech to text
- **Frame Analysis**: Extracts video frames at regular intervals for visual context (enabled by default)
- **Summarization**: Uses Ollama to summarize the transcript and analyze frames
  - The summary is displayed in the console before being saved to file
  - Default model: `gemma3` (text-only)
  - For multimodal analysis, use models like `llava` or `bakllava`
  - The tool automatically falls back to text-only if the model doesn't support images
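Under the hood, this flow maps to a handful of external calls. Here is a minimal TypeScript sketch of the same pipeline (illustrative only, not the tool's actual source; it assumes `ffmpeg` and `whisper` are on your PATH and Ollama is listening on its default port):

```typescript
import { execFileSync } from "node:child_process";
import { readFileSync } from "node:fs";

// 1. Audio extraction: pull a 16 kHz mono WAV out of the MP4.
function extractAudio(video: string, wav: string): void {
  execFileSync("ffmpeg", ["-y", "-i", video, "-vn", "-ar", "16000", "-ac", "1", wav]);
}

// 2. Transcription: shell out to the local Whisper CLI, then read back the .txt it writes.
function transcribe(wav: string, model = "base"): string {
  execFileSync("whisper", [wav, "--model", model, "--output_format", "txt", "--output_dir", "."]);
  return readFileSync(wav.replace(/\.wav$/, ".txt"), "utf8");
}

// 3. Summarization: POST the transcript to Ollama's /api/generate endpoint.
// Multimodal models (llava, bakllava) also accept base64-encoded frames via `images`.
async function summarize(transcript: string, model = "gemma3", images?: string[]): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      prompt: `Summarize this recording transcript:\n\n${transcript}`,
      images,
      stream: false,
    }),
  });
  if (!res.ok) throw new Error(`Ollama request failed: ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response;
}

// Wire the steps together for one recording.
extractAudio("video.mp4", "video.wav");
console.log(await summarize(transcribe("video.wav")));
```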
## Installation

### From npm (recommended)

```bash
npm install -g @adamhancock/transcribe-cli
```

### From source

```bash
git clone https://github.com/adamhancock/transcribe-cli.git
cd transcribe-cli
npm install
npm run build
npm link # Makes 'transcribe' command available globally
```

## Usage
Basic usage:

```bash
transcribe video.mp4
```

Options:

- `-o, --output <path>` - Output file path (default: `input_transcript.txt`)
- `-a, --audio-only` - Only extract audio without transcription
- `-t, --transcribe-only` - Only transcribe without summarization
- `--host <host>` - Ollama API host (default: `http://localhost:11434`)
- `--model <model>` - Ollama model name for summarization (default: `gemma3`)
- `--whisper-model <model>` - Whisper model size (default: `base`)
- `--keep-audio` - Keep extracted audio file after processing
- `--no-analyze-frames` - Disable frame extraction and analysis (enabled by default)
- `--frame-interval <seconds>` - Seconds between frame extraction (default: 3)
- `--max-frames <count>` - Maximum number of frames to extract (default: 30)
- `--keep-frames` - Keep extracted frames after processing
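For a sense of how a flag set like this is typically wired up, here is a sketch using commander (an assumption - the actual CLI may use a different argument parser; the defaults mirror the list above):

```typescript
import { Command } from "commander";

const program = new Command("transcribe")
  .argument("<video>", "MP4 file to process")
  .option("-o, --output <path>", "output file path")
  .option("-a, --audio-only", "only extract audio")
  .option("-t, --transcribe-only", "skip summarization")
  .option("--host <host>", "Ollama API host", "http://localhost:11434")
  .option("--model <model>", "Ollama model for summarization", "gemma3")
  .option("--whisper-model <model>", "Whisper model size", "base")
  .option("--keep-audio", "keep extracted audio file")
  // `--no-` prefixed flags default to true in commander, so opts().analyzeFrames
  // stays true unless the user passes --no-analyze-frames.
  .option("--no-analyze-frames", "disable frame extraction and analysis")
  .option("--frame-interval <seconds>", "seconds between frame extraction", "3")
  .option("--max-frames <count>", "maximum number of frames to extract", "30")
  .option("--keep-frames", "keep extracted frames");

program.parse();
console.log(program.opts()); // e.g. { analyzeFrames: true, model: "gemma3", ... }
```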
## Examples

```bash
# Basic transcription and summary (uses Ollama with gemma3)
transcribe recording.mp4

# Only extract audio
transcribe recording.mp4 --audio-only

# Only transcribe (no summary)
transcribe recording.mp4 --transcribe-only

# Use a specific Whisper model
transcribe recording.mp4 --whisper-model medium

# Specify output file
transcribe recording.mp4 -o my-transcript.txt

# Use a different Ollama model
transcribe recording.mp4 --model llama3.2

# Use a multimodal model for visual analysis
transcribe recording.mp4 --model llava

# Use custom Ollama host
transcribe recording.mp4 --host http://remote-server:11434

# Disable frame analysis
transcribe recording.mp4 --no-analyze-frames

# Customize frame extraction
transcribe recording.mp4 --frame-interval 5 --max-frames 50

# Keep extracted frames for manual review
transcribe recording.mp4 --keep-frames
```

## Development
Run directly with tsx:

```bash
npm run dev -- video.mp4
```

Build:

```bash
npm run build
```

Publish to npm:

```bash
npm version patch # or minor/major
npm publish
```