JSPM

CLI that analyzes YouTube transcripts with an LLM to find interesting moments and cut clips

Package Exports

  • @thunderkiller/video-clipper
  • @thunderkiller/video-clipper/dist/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@thunderkiller/video-clipper) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

video-clipper

A TypeScript CLI tool that takes a YouTube URL, analyzes the transcript with an LLM, and returns the most interesting moments as ranked timestamp ranges. Optionally downloads the video and cuts clips automatically.

How it works

YouTube URL
    │
    ▼
Parse URL → fetch transcript → group into chunks
    │
    ▼
Parallel LLM analysis (Vercel AI SDK + gpt-4o)
    │
    ▼
Rank & deduplicate segments
    │
    ▼
Refine clip boundaries (second LLM pass)
    │
    ▼
(Optional) Download video + cut clips with ffmpeg
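
The rank-and-deduplicate step above can be sketched as a pure function (a minimal sketch; the names and the overlap rule are illustrative, not the actual implementation):

```typescript
interface Segment {
  start: number; // seconds
  end: number;   // seconds
  score: number; // 1-10 LLM rating
}

// Keep the highest-scoring segment among any overlapping group,
// then return at most `topN` results, best first.
function rankAndDedupe(segments: Segment[], topN: number): Segment[] {
  const byScore = [...segments].sort((a, b) => b.score - a.score);
  const kept: Segment[] = [];
  for (const seg of byScore) {
    const overlaps = kept.some((k) => seg.start < k.end && k.start < seg.end);
    if (!overlaps) kept.push(seg);
  }
  return kept.slice(0, topN);
}
```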

Tech Stack

Layer             | Choice
------------------|-------------------------------------------
Language          | TypeScript (Node.js 18+)
Transcript        | youtube-transcript
LLM               | Vercel AI SDK (ai + @ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google, @ai-sdk/xai, @ai-sdk/mistral, @ai-sdk/groq, @ai-sdk/openrouter)
Structured output | generateObject + zod
Video download    | yt-dlp via execa
Clip cutting      | fluent-ffmpeg
Config validation | zod
Concurrency       | p-limit

Requirements

  • Node.js 18+
  • yt-dlp (for video download)
  • ffmpeg (for clip cutting)
# macOS
brew install yt-dlp ffmpeg

Audio/Video Sync

Clips are generated by re-encoding with libx264 (video) and aac (audio) to ensure perfect audio/video synchronization. This is slower than stream copy mode but prevents the common issue where video and audio become desynchronized in the output clips.
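
As a sketch of what this means at the ffmpeg level (the tool itself drives ffmpeg through fluent-ffmpeg; this hypothetical helper only builds the equivalent raw arguments):

```typescript
// Build ffmpeg arguments for a re-encoded (not stream-copied) clip.
// Re-encoding with libx264/aac keeps audio and video in sync at the
// cost of speed; the preset trades encoding speed for quality.
function buildClipArgs(
  input: string,
  output: string,
  startSec: number,
  endSec: number,
  preset = process.env.FFMPEG_PRESET ?? "fast",
): string[] {
  return [
    "-ss", startSec.toFixed(3), // seek before input: fast and frame-accurate when re-encoding
    "-i", input,
    "-t", (endSec - startSec).toFixed(3),
    "-c:v", "libx264", "-preset", preset,
    "-c:a", "aac",
    output,
  ];
}
```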

Performance vs Quality Trade-off:

Use the FFMPEG_PRESET environment variable to adjust encoding speed:

Preset         | Speed     | Quality | Use Case
ultrafast      | Very fast | Lowest  | Quick testing
fast (default) | Fast      | Good    | Balanced performance
medium         | Medium    | Better  | Higher quality clips
slow           | Slow      | High    | Final production clips

Example:

# Faster processing (lower quality)
FFMPEG_PRESET=ultrafast npm run start -- <url> --clip

# Higher quality (slower)
FFMPEG_PRESET=medium npm run start -- <url> --clip

Setup

npm install
cp .env.example .env

Edit .env and configure your LLM provider:

# Choose your provider (openai, anthropic, google, xai, mistral, groq, zai, openrouter)
LLM_PROVIDER=openai
OPENAI_API_KEY=your_key_here

# Or use a free model via OpenRouter:
# LLM_PROVIDER=openrouter
# OPENROUTER_API_KEY=sk-or-...
# LLM_MODEL=meta-llama/llama-3.3-70b-instruct:free

Advanced Examples

# Analyze only (no download)
npm run start -- <youtube-url>

# Analyze and download full video
npm run start -- <youtube-url> --download

# Analyze and cut clips
npm run start -- <youtube-url> --clip

# Limit number of clips to generate
npm run start -- <youtube-url> --clip --max-clips 3

# Use custom output directory
OUTPUT_DIR=my-clips npm run start -- <youtube-url> --clip

Configuration

All parameters are set via .env:

Variable                     | Default          | Description

Provider selection
LLM_PROVIDER                 | openai           | LLM provider (openai, anthropic, google, xai, mistral, groq, zai, openrouter)
OPENAI_API_KEY               |                  | Your OpenAI API key (required if LLM_PROVIDER=openai)
ANTHROPIC_API_KEY            |                  | Your Anthropic API key (required if LLM_PROVIDER=anthropic)
GOOGLE_GENERATIVE_AI_API_KEY |                  | Your Google API key (required if LLM_PROVIDER=google)
XAI_API_KEY                  |                  | Your XAI API key (required if LLM_PROVIDER=xai)
MISTRAL_API_KEY              |                  | Your Mistral API key (required if LLM_PROVIDER=mistral)
GROQ_API_KEY                 |                  | Your Groq API key (required if LLM_PROVIDER=groq)
ZAI_API_KEY                  |                  | Your Zai API key (required if LLM_PROVIDER=zai)
OPENROUTER_API_KEY           |                  | Your OpenRouter API key (required if LLM_PROVIDER=openrouter)

Model & LLM
LLM_MODEL                    | gpt-4o           | Model ID (depends on provider)
LLM_MAX_RETRIES              | 3                | Max retries on rate-limit errors
LLM_CONCURRENCY              | 3                | Max parallel LLM calls
LLM_SYSTEM_PROMPT            | (default prompt) | Custom system prompt for LLM analysis

Analysis parameters
SCORE_THRESHOLD              | 7                | Minimum score (1–10) to keep a segment
TOP_N_SEGMENTS               | 10               | Max number of segments to return
CHUNK_LENGTH_SEC             | 120              | LLM analysis window size in seconds
CHUNK_OVERLAP_SEC            | 20               | Overlap between consecutive chunks
MICRO_BLOCK_SEC              | 15               | Transcript grouping window in seconds
MAX_CHUNKS                   |                  | Limit number of chunks sent to LLM (optional)

Video download
DOWNLOAD_SECTIONS_MODE       | all              | yt-dlp mode: all (full video) or N (top N segments only, e.g. 1, 2, 3...)
FFMPEG_PRESET                | fast             | ffmpeg encoding preset: ultrafast, superfast, veryfast, fast, medium, slow, slower
TIMESTAMP_OFFSET_SECONDS     | 0                | Adjust all clip timestamps (positive = later, negative = earlier) to fix transcript-video misalignment

Paths
DOWNLOAD_DIR                 | downloads/       | Where to store downloaded videos
OUTPUT_DIR                   | outputs/         | Where to store generated clips and dumps
CACHE_DIR                    | outputs/cache    | Where to store transcript and LLM result cache

Output options
DUMP_OUTPUTS                 | true             | Write transcript/analysis JSON dumps
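
To illustrate how CHUNK_LENGTH_SEC and CHUNK_OVERLAP_SEC interact, a minimal sketch (illustrative names, not the actual code):

```typescript
// Split a transcript of `durationSec` into overlapping analysis windows.
// With the defaults (chunkLen=120, overlap=20) each window starts
// 100 seconds after the previous one, so adjacent windows share 20s.
function chunkWindows(
  durationSec: number,
  chunkLen = 120,
  overlap = 20,
): Array<[number, number]> {
  const step = chunkLen - overlap;
  const windows: Array<[number, number]> = [];
  for (let start = 0; start < durationSec; start += step) {
    windows.push([start, Math.min(start + chunkLen, durationSec)]);
    if (start + chunkLen >= durationSec) break; // last window reached the end
  }
  return windows;
}
```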

Output

{
  "video_id": "abc123",
  "title": "Video Title",
  "duration": 1823,
  "segments": [
    {
      "rank": 1,
      "start": 120,
      "end": 150,
      "score": 9,
      "reason": "strong controversial opinion"
    },
    {
      "rank": 2,
      "start": 420,
      "end": 455,
      "score": 8,
      "reason": "funny storytelling moment"
    }
  ]
}

Caching

The CLI caches both transcript fetches and LLM chunk results to speed up subsequent runs:

  • Transcript cache: Stored per video ID in CACHE_DIR
  • LLM chunk cache: Stores successful chunk analyses to avoid re-analyzing the same content

Cache is automatically used on re-runs. Use --no-cache to bypass.
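
A plausible shape for the chunk cache key, assuming it hashes the video ID, chunk boundaries, and model so that a change to any of them invalidates the entry (illustrative; the real derivation may differ):

```typescript
import { createHash } from "node:crypto";
import { join } from "node:path";

// Derive a stable cache file path for one analyzed chunk.
function chunkCachePath(
  cacheDir: string,
  videoId: string,
  chunkStart: number,
  chunkEnd: number,
  model: string,
): string {
  const key = createHash("sha256")
    .update(`${videoId}:${chunkStart}:${chunkEnd}:${model}`)
    .digest("hex")
    .slice(0, 16);
  return join(cacheDir, `${videoId}-${key}.json`);
}
```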

Working with Pre-Downloaded Videos

If you already have a video downloaded (from yt-dlp, browser download, or other tool), you can skip the download step and work directly with that file.

Workflow:

# Step 1: Run analysis once to get segment timestamps
npm run start -- <url> --output-json analysis.json

# Step 2: (Optional) Edit timestamps in analysis.json if needed
# Edit the "start" and "end" values for each segment

# Step 3: Place your video in downloads/ directory
cp /path/to/your/video.mp4 downloads/<videoId>.mp4

# Step 4: Run again - will skip download and use your video
npm run start -- <url> --clip

Use cases:

  • Testing different settings - Run different clip configurations without re-downloading
  • Manual timestamp adjustment - Fine-tune segment boundaries based on visual inspection
  • Alternative video sources - Work with videos downloaded from other tools or browsers
  • Large video files - If you have a high-quality version, use that instead

Notes:

  • The video file must be named exactly {videoId}.mp4 in the DOWNLOAD_DIR
  • You can apply TIMESTAMP_OFFSET_SECONDS globally instead of editing each timestamp
  • Transcript cache is used, so re-running is fast (no API calls)

Combining with Timestamp Offset

For pre-downloaded videos with known sync issues:

# Skip download, apply 3-second offset to all clips
TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --clip

The CLI will find the existing video in downloads/, skip the download step, and apply the offset to all clip generation.

Usage

Basic analysis (no download)

npm run start -- https://youtube.com/watch?v=abc123

Download full video and generate clips

npm run start -- https://youtube.com/watch?v=abc123 --clip

Download top N segments only

# Download top 3 segments
npm run start -- https://youtube.com/watch?v=abc123 --download-sections 3

# Download top 5 segments
npm run start -- https://youtube.com/watch?v=abc123 --download-sections 5

Custom output directory

# Store clips in custom directory
npm run start -- https://youtube.com/watch?v=abc123 --clip --video-path ./my-clips

# Download segments to custom path
npm run start -- https://youtube.com/watch?v=abc123 --download-sections 3 --video-path ./downloads

Custom thresholds

npm run start -- https://youtube.com/watch?v=abc123 --threshold 8 --top-n 5

Testing with limited chunks

npm run start -- https://youtube.com/watch?v=abc123 --max-chunks 3

Custom thresholds with timestamp offset

# Fix 3-second audio delay (shift earlier)
TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --clip

# Fix 2-second early start (shift later)
TIMESTAMP_OFFSET_SECONDS=2 npm run start -- <url> --clip

# High quality, slower processing, with offset
FFMPEG_PRESET=slow TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --clip

Troubleshooting Audio Sync Issues

Problem: Audio is delayed or starts early

Symptoms:

  • Video starts at the correct moment but the audio plays 2–5 seconds late or early
  • Lip movements don't match speech in the clip
  • Content in clip doesn't match the transcript segment

Root Causes:

  1. Transcript misalignment - Transcript timestamps don't perfectly match the video

    • Auto-generated captions: Often have 1-3 second delays
    • Manual captions: Usually more accurate but can have timing issues
    • Multiple caption tracks: Transcripts from different video versions
  2. Millisecond precision loss - Old implementation lost decimal seconds

    • Now fixed: --download-sections uses HH:MM:SS.mmm format
  3. Version differences - The transcript might be from a slightly different version of the video

Solution: Use TIMESTAMP_OFFSET_SECONDS

What it does: Applies a global offset to all clip timestamps. Positive = shift later, negative = shift earlier.
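
The adjustment itself can be sketched as follows (hypothetical helper; the start is clamped at 0 so a negative offset never produces an invalid range):

```typescript
// Shift a clip's boundaries by TIMESTAMP_OFFSET_SECONDS.
// Positive offset = shift later, negative = shift earlier.
function applyOffset(
  start: number,
  end: number,
  offsetSec: number,
): { start: number; end: number } {
  const shift = Math.max(offsetSec, -start); // never shift the start before 0
  return { start: start + shift, end: end + shift };
}
```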

How to use:

# Add to .env
TIMESTAMP_OFFSET_SECONDS=-3

# Or inline
TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --clip

Finding the Correct Offset

Step 1: Test with logging

Run a single segment and observe the logs:

TIMESTAMP_OFFSET_SECONDS=0 npm run start -- <url> --download-sections 1

Look for these log lines:

[info] Downloading segment 1: 00:02:00.500-00:02:30.000 (strong opinion...)
[info]   Requested: 120.50s - 150.00s
[info]   Adjusted: 117.50s - 147.00s (offset: -3s)
[info] Cutting clip: start=117.50s, end=147.00s, duration=29.50s

Step 2: Play and verify

  • Open the generated clip
  • Check if the moment matches the transcript description
  • Note if it's too early or too late

Step 3: Adjust offset

If clip starts 3 seconds late:

TIMESTAMP_OFFSET_SECONDS=-3  # Negative = shift earlier

If clip starts 2 seconds early:

TIMESTAMP_OFFSET_SECONDS=2   # Positive = shift later

Step 4: Verify with multiple clips

TIMESTAMP_OFFSET_SECONDS=-3 npm run start -- <url> --download-sections 3

Check if the offset works consistently across different segments.

Binary Search for Optimal Offset

If you're unsure of the exact offset:

# Try 0, -3, -6, -9 to see which is closest
for offset in 0 -3 -6 -9; do
  TIMESTAMP_OFFSET_SECONDS=$offset npm run start -- <url> --download-sections 1
  echo "Tested offset: $offset"
  # Play and check accuracy
done

Then narrow down: if -3 looks closest, try -2 and -4.

Common Scenarios

Scenario                | Likely Offset | Explanation
Auto-generated captions | -1 to -3      | ASR timing often lags behind actual speech
Manual captions         | 0 to -1       | Usually more accurate, small sync issues
Multiple caption tracks | -2 to -5      | Different versions may have a systematic offset
Regional variations     | Varies        | Different regions may have different caption timing

Verifying the Fix

After applying TIMESTAMP_OFFSET_SECONDS, verify:

  1. Watch the clip: Audio and video should be synchronized
  2. Check multiple clips: Offset should work consistently
  3. Compare with original: Clip should match the described content

If offset varies between segments, the issue might be video-specific rather than a global transcript offset.

CLI Flags

Flag                       | Description
--clip                     | Download video and generate mp4 clips for each segment
--download-sections <mode> | yt-dlp mode: all (full video) or N (top N segments only, e.g. 1, 2, 3...)
--video-path <path>        | Custom output directory for downloaded videos and clips
--threshold <n>            | Minimum score (1–10) to keep a segment
--top-n <n>                | Maximum number of segments to return
--max-duration <s>         | Abort if video is longer than N seconds
--max-chunks <n>           | Limit number of transcript chunks sent to LLM
--max-parallel <n>         | Max number of LLM calls to run in parallel
--output-json <path>       | Write output JSON to file instead of stdout
--no-cache                 | Bypass all caches and force a fresh run
--help, -h                 | Show help message

Docs

Full architecture and build plan: docs/plan.md

yt-dlp download modes: docs/yt-downloader.md