# video-clipper
A TypeScript CLI tool that takes a YouTube URL, analyzes the transcript with an LLM, and returns the most interesting moments as ranked timestamp ranges. Optionally downloads the video and cuts clips automatically.
## How it works

```
YouTube URL
   │
   ▼
Parse URL → fetch transcript → group into chunks
   │
   ▼
Parallel LLM analysis (Vercel AI SDK + gpt-4o)
   │
   ▼
Rank & deduplicate segments
   │
   ▼
Refine clip boundaries (second LLM pass)
   │
   ▼
(Optional) Download video + cut clips with ffmpeg
```

## Tech Stack
| Layer | Choice |
|---|---|
| Language | TypeScript (Node.js 18+) |
| Transcript | youtube-transcript |
| LLM | Vercel AI SDK (ai + @ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google, @ai-sdk/xai, @ai-sdk/mistral, @ai-sdk/groq, @ai-sdk/openrouter) |
| Structured output | generateObject + zod |
| Video download | yt-dlp via execa |
| Clip cutting | fluent-ffmpeg |
| Config validation | zod |
| Concurrency | p-limit |
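The "rank & deduplicate segments" stage of the pipeline can be sketched roughly as follows. The type and function names here are illustrative assumptions, not the package's actual internals: the idea is simply that when two candidate segments overlap, the higher-scored one wins, and ranks are assigned to the survivors.

```typescript
// Illustrative sketch of the "rank & deduplicate" stage: keep the
// higher-scored segment whenever two candidates overlap, then assign ranks.
// Type and function names are assumptions, not the package's internals.
interface Segment {
  start: number;  // seconds
  end: number;    // seconds
  score: number;  // 1-10, from the LLM
  reason: string;
}

function overlaps(a: Segment, b: Segment): boolean {
  return a.start < b.end && b.start < a.end;
}

function dedupeAndRank(segments: Segment[]): (Segment & { rank: number })[] {
  const byScore = [...segments].sort((a, b) => b.score - a.score);
  const kept: Segment[] = [];
  for (const candidate of byScore) {
    // A candidate survives only if it overlaps no higher-scored survivor
    if (!kept.some((k) => overlaps(k, candidate))) kept.push(candidate);
  }
  return kept.map((seg, i) => ({ ...seg, rank: i + 1 }));
}
```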
## Requirements

- Node.js 18+
- yt-dlp (for video download)
- ffmpeg (for clip cutting)

```bash
# macOS
brew install yt-dlp ffmpeg
```

## Audio/Video Sync
Clips are generated by re-encoding with libx264 (video) and aac (audio) to ensure perfect audio/video synchronization. This is slower than stream copy mode but prevents the common issue where video and audio become desynchronized in the output clips.
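As a rough illustration of what that re-encode looks like: the package drives ffmpeg through fluent-ffmpeg, so the exact flags it passes are an assumption here, but the equivalent raw ffmpeg invocation would be built along these lines.

```typescript
// Rough illustration of the re-encode described above, expressed as raw
// ffmpeg arguments. The package uses fluent-ffmpeg, so the exact flags it
// passes are an assumption.
function clipArgs(
  input: string,
  output: string,
  startSec: number,
  endSec: number,
  preset: string = process.env.FFMPEG_PRESET ?? "fast",
): string[] {
  return [
    "-ss", startSec.toFixed(3), // seek to the clip start
    "-to", endSec.toFixed(3),   // stop at the clip end
    "-i", input,
    "-c:v", "libx264",          // re-encode video: clean keyframe at the cut
    "-preset", preset,          // speed/quality trade-off (FFMPEG_PRESET)
    "-c:a", "aac",              // re-encode audio so A/V stays in sync
    output,
  ];
}
```

Stream copy (`-c copy`) would be faster, but cuts can only land on existing keyframes, which is exactly where the desync described above comes from.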
**Performance vs Quality Trade-off:**

Use the `FFMPEG_PRESET` environment variable to adjust encoding speed:

| Preset | Speed | Quality | Use Case |
|---|---|---|---|
| `ultrafast` | Very fast | Lowest | Quick testing |
| `fast` (default) | Fast | Good | Balanced performance |
| `medium` | Medium | Better | Higher quality clips |
| `slow` | Slow | High | Final production clips |
Example:

```bash
# Faster processing (lower quality)
FFMPEG_PRESET=ultrafast npm run start -- <url> --clip

# Higher quality (slower)
FFMPEG_PRESET=medium npm run start -- <url> --clip
```

## Setup
```bash
npm install
cp .env.example .env
```

Edit `.env` and configure your LLM provider:

```bash
# Choose your provider (openai, anthropic, google, xai, mistral, groq, zai, openrouter)
LLM_PROVIDER=openai
OPENAI_API_KEY=your_key_here

# Or use a free model via OpenRouter:
# LLM_PROVIDER=openrouter
# OPENROUTER_API_KEY=sk-or-...
# LLM_MODEL=meta-llama/llama-3.3-70b-instruct:free
```

## Advanced Examples
```bash
# Analyze only (no download)
npm run start -- <youtube-url>

# Analyze and download full video
npm run start -- <youtube-url> --download

# Analyze and cut clips
npm run start -- <youtube-url> --clip

# Limit number of clips to generate
npm run start -- <youtube-url> --clip --max-clips 3

# Use custom output directory
OUTPUT_DIR=my-clips npm run start -- <youtube-url> --clip
```

## Configuration
All parameters are set via .env:
| Variable | Default | Description |
|---|---|---|
| **Provider selection** | | |
| `LLM_PROVIDER` | `openai` | LLM provider (openai, anthropic, google, xai, mistral, groq, zai, openrouter) |
| `OPENAI_API_KEY` | — | Your OpenAI API key (required if LLM_PROVIDER=openai) |
| `ANTHROPIC_API_KEY` | — | Your Anthropic API key (required if LLM_PROVIDER=anthropic) |
| `GOOGLE_GENERATIVE_AI_API_KEY` | — | Your Google API key (required if LLM_PROVIDER=google) |
| `XAI_API_KEY` | — | Your xAI API key (required if LLM_PROVIDER=xai) |
| `MISTRAL_API_KEY` | — | Your Mistral API key (required if LLM_PROVIDER=mistral) |
| `GROQ_API_KEY` | — | Your Groq API key (required if LLM_PROVIDER=groq) |
| `ZAI_API_KEY` | — | Your Zai API key (required if LLM_PROVIDER=zai) |
| `OPENROUTER_API_KEY` | — | Your OpenRouter API key (required if LLM_PROVIDER=openrouter) |
| **Model & LLM** | | |
| `LLM_MODEL` | `gpt-4o` | Model ID (depends on provider) |
| `LLM_MAX_RETRIES` | `3` | Max retries on rate-limit errors |
| `LLM_CONCURRENCY` | `3` | Max parallel LLM calls |
| `LLM_SYSTEM_PROMPT` | (default prompt) | Custom system prompt for LLM analysis |
| **Analysis parameters** | | |
| `SCORE_THRESHOLD` | `7` | Minimum score (1–10) to keep a segment |
| `TOP_N_SEGMENTS` | `10` | Max number of segments to return |
| `CHUNK_LENGTH_SEC` | `120` | LLM analysis window size in seconds |
| `CHUNK_OVERLAP_SEC` | `20` | Overlap between consecutive chunks |
| `MICRO_BLOCK_SEC` | `15` | Transcript grouping window in seconds |
| `MAX_CHUNKS` | — | Limit number of chunks sent to LLM (optional) |
| **Video download** | | |
| `DOWNLOAD_SECTIONS_MODE` | `all` | yt-dlp mode: all (full video) or N (top N segments only, e.g. 1, 2, 3...) |
| `FFMPEG_PRESET` | `fast` | ffmpeg encoding preset: ultrafast, superfast, veryfast, fast (default), medium, slow, slower |
| `TIMESTAMP_OFFSET_SECONDS` | `0` | Adjust all clip timestamps (positive = later, negative = earlier) to fix transcript-video misalignment |
| **Paths** | | |
| `DOWNLOAD_DIR` | `downloads/` | Where to store downloaded videos |
| `OUTPUT_DIR` | `outputs/` | Where to store generated clips and dumps |
| `CACHE_DIR` | `outputs/cache` | Where to store transcript and LLM result cache |
| **Output options** | | |
| `DUMP_OUTPUTS` | `true` | Write transcript/analysis JSON dumps |
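To make the chunking parameters concrete, here is a sketch of how `CHUNK_LENGTH_SEC` and `CHUNK_OVERLAP_SEC` would interact: with the defaults, consecutive 120-second analysis windows each start 100 seconds after the previous one. This is illustrative, not the package's actual code.

```typescript
// Sketch of how CHUNK_LENGTH_SEC and CHUNK_OVERLAP_SEC define the LLM
// analysis windows. (Illustrative; not the package's actual code.)
function chunkWindows(
  durationSec: number,
  chunkLen = 120, // CHUNK_LENGTH_SEC
  overlap = 20,   // CHUNK_OVERLAP_SEC
): { start: number; end: number }[] {
  const step = chunkLen - overlap; // 100s between window starts by default
  const windows: { start: number; end: number }[] = [];
  for (let start = 0; start < durationSec; start += step) {
    windows.push({ start, end: Math.min(start + chunkLen, durationSec) });
    if (start + chunkLen >= durationSec) break; // last window reached the end
  }
  return windows;
}
```

Under this sketch a 30-minute video yields 18 windows; segments found twice in the 20-second overlap regions are collapsed by the dedup pass.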
## Output

```json
{
  "video_id": "abc123",
  "title": "Video Title",
  "duration": 1823,
  "segments": [
    {
      "rank": 1,
      "start": 120,
      "end": 150,
      "score": 9,
      "reason": "strong controversial opinion"
    },
    {
      "rank": 2,
      "start": 420,
      "end": 455,
      "score": 8,
      "reason": "funny storytelling moment"
    }
  ]
}
```

## Caching
The CLI caches both transcript fetches and LLM chunk results to speed up subsequent runs:

- Transcript cache: stored per video ID in `CACHE_DIR`
- LLM chunk cache: stores successful chunk analyses to avoid re-analyzing the same content

Cache is automatically used on re-runs. Use `--no-cache` to bypass.
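A per-video-ID cache like the one described might look like the following sketch. The file layout (`{videoId}.json` inside `CACHE_DIR`) is an assumption for illustration, not necessarily the package's on-disk format.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Illustrative per-video-ID JSON cache. The {videoId}.json layout is an
// assumption, not necessarily the package's actual on-disk format.
function cachedFetch<T>(cacheDir: string, videoId: string, compute: () => T): T {
  mkdirSync(cacheDir, { recursive: true });
  const file = join(cacheDir, `${videoId}.json`);
  if (existsSync(file)) {
    // Cache hit: reuse the stored result, no network or LLM call needed
    return JSON.parse(readFileSync(file, "utf8")) as T;
  }
  const result = compute(); // cache miss: do the real work once
  writeFileSync(file, JSON.stringify(result));
  return result;
}
```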
## Working with Pre-Downloaded Videos
If you already have a video downloaded (from yt-dlp, browser download, or other tool), you can skip the download step and work directly with that file.
Workflow:

```bash
# Step 1: Run analysis once to get segment timestamps
npm run start -- <url> --output-json analysis.json

# Step 2: (Optional) Edit timestamps in analysis.json if needed
# Edit the "start" and "end" values for each segment

# Step 3: Place your video in downloads/ directory
cp /path/to/your/video.mp4 downloads/<videoId>.mp4

# Step 4: Run again - will skip download and use your video
npm run start -- <url> --clip
```

Use cases:
- Testing different settings - Run different clip configurations without re-downloading
- Manual timestamp adjustment - Fine-tune segment boundaries based on visual inspection
- Alternative video sources - Work with videos downloaded from other tools or browsers
- Large video files - If you have a high-quality version, use that instead
Notes:

- The video file must be named exactly `{videoId}.mp4` in the `DOWNLOAD_DIR`
- You can apply `TIMESTAMP_OFFSET_SECONDS` globally instead of editing each timestamp
- Transcript cache is used, so re-running is fast (no API calls)
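The skip-download check described above amounts to something like this sketch (an illustrative helper, not the package's actual code): if `{videoId}.mp4` already exists in the download directory, reuse it instead of invoking yt-dlp.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Sketch of the skip-download check: if {videoId}.mp4 already exists in the
// download directory, reuse it instead of calling yt-dlp.
// (Illustrative helper, not the package's actual code.)
function findExistingVideo(downloadDir: string, videoId: string): string | null {
  const path = join(downloadDir, `${videoId}.mp4`);
  return existsSync(path) ? path : null; // null means: fall back to downloading
}
```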
### Combining with Timestamp Offset
For pre-downloaded videos with known sync issues:
```bash
# Skip download, apply 3-second offset to all clips
TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --clip
```

The CLI will find the existing video in downloads/, skip the download step, and apply the offset to all clip generation.
## Usage

### Basic analysis (no download)

```bash
npm run start -- https://youtube.com/watch?v=abc123
```

### Download full video and generate clips

```bash
npm run start -- https://youtube.com/watch?v=abc123 --clip
```

### Download top N segments only

```bash
# Download top 3 segments
npm run start -- https://youtube.com/watch?v=abc123 --download-sections 3

# Download top 5 segments
npm run start -- https://youtube.com/watch?v=abc123 --download-sections 5
```

### Custom output directory

```bash
# Store clips in custom directory
npm run start -- https://youtube.com/watch?v=abc123 --clip --video-path ./my-clips

# Download segments to custom path
npm run start -- https://youtube.com/watch?v=abc123 --download-sections 3 --video-path ./downloads
```

### Custom thresholds

```bash
npm run start -- https://youtube.com/watch?v=abc123 --threshold 8 --top-n 5
```

### Testing with limited chunks

```bash
npm run start -- https://youtube.com/watch?v=abc123 --max-chunks 3
```

### Custom thresholds with timestamp offset

```bash
# Fix 3-second audio delay (shift earlier)
TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --clip

# Fix 2-second early start (shift later)
TIMESTAMP_OFFSET_SECONDS=2 npm run start -- <url> --clip

# High quality, slower processing, with offset
FFMPEG_PRESET=slow TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --clip
```

## Troubleshooting Audio Sync Issues
Problem: Audio is delayed or starts early
Symptoms:
- Video starts at correct moment but audio plays 2-5 seconds later/earlier
- Lip movements don't match speech in the clip
- Content in clip doesn't match the transcript segment
Root Causes:

1. Transcript misalignment - transcript timestamps don't perfectly match the video
   - Auto-generated captions: often have 1-3 second delays
   - Manual captions: usually more accurate but can have timing issues
   - Multiple caption tracks: transcripts from different video versions
2. Millisecond precision loss - the old implementation lost decimal seconds
   - Now fixed: `--download-sections` uses HH:MM:SS.mmm format
3. Version differences - the transcript might be from a slightly different version of the video
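The HH:MM:SS.mmm formatting mentioned above can be sketched as follows (the helper name is illustrative; the package's internals may differ):

```typescript
// Sketch of HH:MM:SS.mmm formatting as used for --download-sections ranges.
// (Illustrative helper name; the package's internals may differ.)
function toTimestamp(seconds: number): string {
  const ms = Math.round(seconds * 1000); // keep millisecond precision
  const h = Math.floor(ms / 3_600_000);
  const m = Math.floor((ms % 3_600_000) / 60_000);
  const s = Math.floor((ms % 60_000) / 1_000);
  const frac = ms % 1_000;
  const pad = (n: number, width: number) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(frac, 3)}`;
}
```

For example, 120.5 seconds formats as `00:02:00.500`, matching the download-segment log line shown below.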
### Solution: Use TIMESTAMP_OFFSET_SECONDS

What it does: applies a global offset to all clip timestamps. Positive = shift later, negative = shift earlier.
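Applying the offset to one segment amounts to the following sketch. Clamping the start at zero is an assumption here (a negative offset near the start of a video should not produce a negative timestamp):

```typescript
// Sketch of applying TIMESTAMP_OFFSET_SECONDS to one segment. Clamping at
// zero is an assumption: a negative offset near the video start should not
// produce a negative timestamp.
function applyOffset(
  segment: { start: number; end: number },
  offsetSec: number,
): { start: number; end: number } {
  return {
    start: Math.max(0, segment.start + offsetSec),
    end: Math.max(0, segment.end + offsetSec),
  };
}
```

With an offset of -3, the segment 120.50s–150.00s becomes 117.50s–147.00s, which is exactly the "Requested"/"Adjusted" pair in the log lines further down.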
How to use:

```bash
# Add to .env
TIMESTAMP_OFFSET_SECONDS=-3

# Or inline
TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --clip
```

### Finding the Correct Offset
Step 1: Test with logging

Run a single segment and observe the logs:

```bash
TIMESTAMP_OFFSET_SECONDS=0 npm run start -- <url> --download-sections 1
```

Look for these log lines:

```
[info] Downloading segment 1: 00:02:00.500-00:02:30.000 (strong opinion...)
[info] Requested: 120.50s - 150.00s
[info] Adjusted: 117.50s - 147.00s (offset: -3s)
[info] Cutting clip: start=117.50s, end=147.00s, duration=29.50s
```

Step 2: Play and verify
- Open the generated clip
- Check if the moment matches the transcript description
- Note if it's too early or too late
Step 3: Adjust offset

If the clip starts 3 seconds late:

```bash
TIMESTAMP_OFFSET_SECONDS=-3  # Negative = shift earlier
```

If the clip starts 2 seconds early:

```bash
TIMESTAMP_OFFSET_SECONDS=2  # Positive = shift later
```

Step 4: Verify with multiple clips

```bash
TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --download-sections 3
```

Check if the offset works consistently across different segments.
### Binary Search for Optimal Offset

If you're unsure of the exact offset:

```bash
# Try 0, -3, -6, -9 to see which is closest
for offset in 0 -3 -6 -9; do
  TIMESTAMP_OFFSET_SECONDS=$offset npm run start -- <url> --download-sections 1
  echo "Tested offset: $offset"
  # Play and check accuracy
done
```

Then narrow down: if -3 seems closest, try -2 and -4, and so on.
### Common Scenarios

| Scenario | Likely Offset | Explanation |
|---|---|---|
| Auto-generated captions | -1 to -3 | ASR timing often lags behind actual speech |
| Manual captions | 0 to -1 | Usually more accurate, small sync issues |
| Multiple caption tracks | -2 to -5 | Different versions may have a systematic offset |
| Regional variations | Varies | Different regions may have different caption timing |
### Verifying the Fix

After applying `TIMESTAMP_OFFSET_SECONDS`, verify:
- Watch the clip: Audio and video should be synchronized
- Check multiple clips: Offset should work consistently
- Compare with original: Clip should match the described content
If offset varies between segments, the issue might be video-specific rather than a global transcript offset.
## CLI Flags

| Flag | Description |
|---|---|
| `--clip` | Download video and generate mp4 clips for each segment |
| `--download-sections <mode>` | yt-dlp mode: all (full video) or N (top N segments only, e.g. 1, 2, 3...) |
| `--video-path <path>` | Custom output directory for downloaded videos and clips |
| `--threshold <n>` | Minimum score (1–10) to keep a segment |
| `--top-n <n>` | Maximum number of segments to return |
| `--max-duration <s>` | Abort if video is longer than N seconds |
| `--max-chunks <n>` | Limit number of transcript chunks sent to LLM |
| `--max-parallel <n>` | Max number of LLM calls to run in parallel |
| `--output-json <path>` | Write output JSON to file instead of stdout |
| `--no-cache` | Bypass all caches and force a fresh run |
| `--help`, `-h` | Show help message |
## Docs

- Full architecture and build plan: docs/plan.md
- yt-dlp download modes: docs/yt-downloader.md