Transcribe
AI transcription skill for Claude Code - Transform audio/video recordings into transcripts with speaker diarization, AI-powered summaries, and visual infographics.
This skill provides a complete pipeline for processing recordings:
- Transcription - Convert audio/video to VTT format with speaker identification (OpenAI Whisper)
- Summarization - Generate structured markdown summaries (OpenAI GPT-5.1)
- Infographics - Create visual summaries from text (Google Gemini)
- All-in-one - Process video → transcript → summary → infographic in one command
See skills/transcribe/SKILL.md for complete usage guide.
Use in Claude Code
This is a Claude Code skill. Install it from the marketplace:
/plugin marketplace add krasnoperov/claude-plugins
/plugin install transcribe@krasnoperov-plugins
Once installed, use the /transcribe skill in your conversations:
/transcribe transcribe meeting.mp4 to VTT with speaker diarization
/transcribe summarize this transcript into key points
/transcribe create an infographic from this summary
Command Line Usage
You can also use this package directly via npx:
export OPENAI_API_KEY="your-openai-key"
export GOOGLE_AI_STUDIO_KEY="your-google-key"
# Transcribe audio/video
npx -y @krasnoperov/transcribe@latest transcribe meeting.mp4 -o transcript.vtt
# Generate summary
npx -y @krasnoperov/transcribe@latest summarize transcript.vtt -o summary.md
# Create infographic
npx -y @krasnoperov/transcribe@latest infographic summary.md -o visual.png
# All-in-one pipeline
npx -y @krasnoperov/transcribe@latest process recording.mp4 --output-dir ./output
Get your API keys:
Core Operations
transcribe <input>    Audio/Video → VTT transcript with speakers
summarize <input>     Text/VTT → Markdown summary
infographic <input>   Text → Visual infographic image
process <input>       All-in-one: video → transcript → summary → infographic
These operations can be used individually or chained together.
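Chaining the individual operations by hand might look like the sketch below. It uses only the subcommands and the -o flag shown above. The run_pipeline function and the TRANSCRIBE_CMD variable are hypothetical names introduced here for illustration, and the output file naming is an assumption, not necessarily what the process command produces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: chains the three documented subcommands by hand.
# TRANSCRIBE_CMD is an assumption added for illustration so the CLI
# invocation can be swapped out (for example, with "echo" for a dry run).
TRANSCRIBE_CMD="${TRANSCRIBE_CMD:-npx -y @krasnoperov/transcribe@latest}"

run_pipeline() {
  local input="$1"
  local out_dir="${2:-./output}"
  local base

  # Derive "meeting" from "/path/to/meeting.mp4".
  base="$(basename "$input")"
  base="${base%.*}"

  mkdir -p "$out_dir" || return 1

  # Each step runs only if the previous one succeeded.
  # $TRANSCRIBE_CMD is left unquoted on purpose so it splits into words.
  $TRANSCRIBE_CMD transcribe "$input" -o "$out_dir/$base.vtt" &&
    $TRANSCRIBE_CMD summarize "$out_dir/$base.vtt" -o "$out_dir/$base.md" &&
    $TRANSCRIBE_CMD infographic "$out_dir/$base.md" -o "$out_dir/$base.png"
}
```

After sourcing the file, run_pipeline meeting.mp4 ./output would stop at the first failing step, which makes it easier to re-run only the stage that broke.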
Examples
See skills/transcribe/examples/ directory:
- 01-basic-workflow.sh - Step-by-step transcription pipeline
- 02-all-in-one.sh - Single command processing
Transcription with Speaker Diarization
npx -y @krasnoperov/transcribe@latest transcribe podcast.mp3 \
--language es \
--model gpt-4o-transcribe-diarize \
-o podcast.vtt
Transcription with Gemini 3
npx -y @krasnoperov/transcribe@latest transcribe meeting.mp4 \
--model gemini-3 \
-o meeting.vtt
Gemini 3 offers excellent transcription with built-in speaker diarization and can handle very long audio files (up to ~8 hours).
Output (VTT with speaker tags):
WEBVTT

00:00:00.000 --> 00:00:02.450
<v A>Welcome to the podcast...

00:00:02.850 --> 00:00:08.200
<v B>Thanks for having me...
Custom Summarization
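Before writing a custom prompt, it can help to see who speaks in a transcript. The <v ...> voice tags shown in the VTT output above are plain text, so cues per speaker can be tallied with standard tools. This is a minimal sketch, assuming every cue line starts with a voice tag as in the sample; count_speakers is a hypothetical helper name, not part of the package.

```shell
# Hypothetical helper: count cues per speaker label in a WebVTT file
# whose cue text starts with a <v NAME> voice tag.
count_speakers() {
  # 1. sed keeps only the speaker name from lines starting with "<v NAME>".
  # 2. sort + uniq -c count how often each name occurs.
  # 3. awk rewrites "   2 A" as "A: 2" (names may contain spaces).
  sed -n 's/^<v \([^>]*\)>.*/\1/p' "$1" |
    sort |
    uniq -c |
    awk '{ n = $1; $1 = ""; sub(/^ /, ""); print $0 ": " n }'
}
```

For the two-cue sample above, count_speakers podcast.vtt would list one cue each for A and B.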
npx -y @krasnoperov/transcribe@latest summarize transcript.vtt \
--prompt "Focus on action items and decisions" \
-o summary.md
Styled Infographic
npx -y @krasnoperov/transcribe@latest infographic summary.md \
--style "modern minimal corporate" \
-o infographic.png
Options
Transcribe
--model <model>      Transcription model:
                     OpenAI: gpt-4o-transcribe-diarize (default), gpt-4o-transcribe, whisper-1
                     Google: gemini-3
--language <lang>    Language code (en, es, ru, de, etc.)
-o, --output <file>  Output VTT file
Summarize
--prompt <text>      Custom summarization instructions
-o, --output <file>  Output markdown file
Infographic
--style <text>       Style instructions for visual
--reference <image>  Reference image for style
-o, --output <file>  Output image file
Process (All-in-one)
--output-dir <dir>   Output directory for all files
--language <lang>    Language for transcription
--model <model>      Transcription model
--style <text>       Style for infographic
Requirements
- Node.js >= 18.0.0
- ffmpeg (for audio extraction)
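Whether both requirements are met can be checked before the first run. The sketch below is one possible preflight check, not something the package ships; node_version_ok and check_requirements are hypothetical names, and only the major version is compared against 18.

```shell
# Hypothetical helper: succeeds if a Node.js version string such as
# "v18.17.0" has a major version of at least 18.
node_version_ok() {
  local major="${1#v}"    # strip the leading "v"
  major="${major%%.*}"    # keep the text before the first dot
  # A non-numeric value makes the test fail, which counts as "not ok".
  [ "$major" -ge 18 ] 2>/dev/null
}

# Hypothetical helper: reports missing or outdated tools and returns
# non-zero if anything is wrong.
check_requirements() {
  local status=0
  if ! command -v node >/dev/null 2>&1; then
    echo "node: not found (need >= 18.0.0)"
    status=1
  elif ! node_version_ok "$(node --version)"; then
    echo "node: $(node --version) is too old (need >= 18.0.0)"
    status=1
  fi
  if ! command -v ffmpeg >/dev/null 2>&1; then
    echo "ffmpeg: not found"
    status=1
  fi
  return "$status"
}
```

Calling check_requirements prints nothing when both tools are available. If ffmpeg is reported missing, it can be installed as follows.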
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
Development
npm run build      # Build TypeScript
npm run typecheck  # Type checking
npm run test       # Run tests
npm run dev        # Dev mode with type stripping
License
MIT License - Copyright (c) 2025 Aleksei Krasnoperov
See LICENSE file for details.