vidistill

Turn coding tutorials into source trees your AI editor can read.

vidistill watches a video — YouTube, local file, or any yt-dlp-supported URL — and distills it into structured markdown, reconstructed source files, transcripts, and speaker-attributed notes. Feed it a 40-minute React tutorial and get back the exact files the instructor typed, a timestamped transcript, and a navigable guide. Feed it a team meeting and get action items, speaker profiles, and chat messages.

MCP Quick-Start

Use vidistill as an MCP server so Claude Code (or any MCP-compatible tool) can analyze videos and query results directly.

# 1. Install
npm install -g vidistill

# 2. Register the MCP server
claude mcp add vidistill -- npx vidistill mcp

# 3. Ask Claude to analyze a video
#    "Analyze this tutorial and show me the code files"

Available tools:

| Tool | Description |
| --- | --- |
| `analyze_video` | Run the full pipeline on a URL or file |
| `get_transcript` | Read transcript with optional time range filter |
| `get_code` | Read reconstructed source files |
| `get_notes` | Overview, decisions, concepts, topics |
| `get_people` | Speaker/participant details |
| `get_action_items` | Tasks assigned during the video |
| `get_links` | All URLs mentioned |
| `get_chat` | Chat messages from streams/meetings |

Before / After

Input: a YouTube tutorial URL

Output:

vidistill-output/react-server-components/
├── guide.md              # overview and navigation
├── transcript.md          # full timestamped transcript
├── combined.md            # transcript + visual notes + screenshots
├── notes.md               # synthesized notes and themes
├── code/                  # reconstructed source files
│   ├── app.tsx
│   ├── server-component.tsx
│   └── code-timeline.md   # code evolution timeline
├── images/                # keyframe screenshots
│   └── frame-*.png
├── people.md              # speakers and participants
├── chat.md                # chat messages and links
├── action-items.md        # tasks and follow-ups
├── links.md               # all URLs mentioned
├── metadata.json          # processing metadata
└── raw/                   # raw pass outputs

Which files are generated depends on the content — coding videos get code/, meetings get people.md and action-items.md, etc.

Usage

vidistill [input] [options]

| Argument or flag | Description |
| --- | --- |
| `input` | YouTube URL, other video URL, or local video/audio path (prompted if omitted) |
| `-c, --context` | Context about the video (e.g. "CS lecture") |
| `-o, --output` | Output directory (default: `./vidistill-output/`) |
| `-l, --lang` | Output language (e.g. `zh`, `ja`, `es`) |
| `-b, --batch` | Path to a batch file for processing multiple videos |
| `-q, --quick` | Quick mode: skip consensus for faster results (~60% fewer API calls) |
| `-f, --format` | Output format: `standard` (default) or `obsidian` (YAML frontmatter + wikilinks) |

Examples:

# Interactive mode
vidistill

# YouTube video
vidistill "https://youtube.com/watch?v=dQw4w9WgXcQ"

# Local file with context
vidistill ./lecture.mp4 --context "distributed systems"

# Quick mode — faster, fewer API calls
vidistill ./demo.mp4 --quick

# Obsidian-friendly output
vidistill ./lecture.mp4 --format obsidian

# Non-YouTube URL (Bilibili, Vimeo, Twitter/X, etc.)
vidistill "https://vimeo.com/123456789"

# Batch processing
vidistill --batch videos.txt

# List previous outputs
vidistill list
vidistill list --dir ./custom-output/

Batch Files

One URL or file path per line. Lines starting with # are comments. Add context after a | separator:

# Lectures
https://youtube.com/watch?v=abc|distributed systems
https://vimeo.com/123456|networking basics

# Local files
./recording.mp4|team standup
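
The format rules above are simple enough to check before starting a long batch run. The Python sketch below parses batch-file text according to those rules. It is not vidistill's own parser: the function name `parse_batch` is hypothetical, and splitting on the first `|` and ignoring blank lines are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def parse_batch(text: str) -> List[Tuple[str, Optional[str]]]:
    """Parse batch-file text into (input, context) pairs.

    Illustrative only: mirrors the documented rules, not vidistill's code.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the first "|" separates the input from its context.
        source, _sep, context = line.partition("|")
        context = context.strip()
        entries.append((source.strip(), context if context else None))
    return entries


sample = """\
# Lectures
https://youtube.com/watch?v=abc|distributed systems
https://vimeo.com/123456|networking basics

# Local files
./recording.mp4|team standup
"""

for source, context in parse_batch(sample):
    print(source, "->", context)
```

An entry without a `|` separator comes back with a context of `None`, which matches the idea that context is optional.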

Listing Outputs

vidistill list

Scans ./vidistill-output/ (or --dir <path>) and displays a table of all processed videos with title, duration, type, date, and file count.
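
For scripting around the output directory, a similar scan can be approximated by hand. The Python sketch below assumes that every processed-video folder contains a metadata.json file, as in the output tree shown earlier, and reports only the folder name and file count. The function name `find_outputs` is hypothetical, and the fields inside metadata.json are deliberately not read because their names are not documented here.

```python
from pathlib import Path
from typing import List, Tuple


def find_outputs(root: Path) -> List[Tuple[str, int]]:
    """Return (folder name, file count) for each processed-video folder.

    Assumption: a folder is an output if it contains metadata.json.
    """
    results: List[Tuple[str, int]] = []
    if not root.is_dir():
        return results
    for child in sorted(root.iterdir()):
        if child.is_dir() and (child / "metadata.json").is_file():
            # Count every regular file in the folder, including subfolders.
            file_count = sum(1 for p in child.rglob("*") if p.is_file())
            results.append((child.name, file_count))
    return results


if __name__ == "__main__":
    for name, count in find_outputs(Path("./vidistill-output")):
        print(f"{name}\t{count} files")
```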

Speaker Naming

When multiple speakers are detected, use rename-speakers to assign real names. Names replace generic labels (SPEAKER_00, SPEAKER_01) across all output files.

# Interactive rename
vidistill rename-speakers ./vidistill-output/my-meeting/

# List current speaker state
vidistill rename-speakers ./vidistill-output/my-meeting/ --list

# Quick rename
vidistill rename-speakers ./vidistill-output/my-meeting/ --rename "Steven Kang" "Steven K."

# Merge duplicate speakers
vidistill rename-speakers ./vidistill-output/my-meeting/ --merge "K Iphone" "Kristian"

Install

npm install -g vidistill

Requires Node.js 22+ and ffmpeg. Non-YouTube URLs also require yt-dlp.

API Key

vidistill needs a Gemini API key. It checks these sources in order:

1. GEMINI_API_KEY environment variable
2. ~/.vidistill/config.json
3. Interactive prompt (with option to save)

Get a key at ai.google.dev.
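
The lookup order can be sketched in Python to make the precedence concrete. This is an illustration of the documented order, not vidistill's implementation. The function name `resolve_api_key` and the `apiKey` field are assumptions, since the layout of ~/.vidistill/config.json is not described here.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str], config_path: Path) -> Optional[str]:
    """Return the first key found, checking the environment, then the config file.

    Returns None when neither source has a key, which is the point
    where an interactive prompt would take over.
    """
    key = env.get("GEMINI_API_KEY")
    if key:
        return key
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed config file: fall through.
        return None
    # Assumption: the key is stored under "apiKey"; the real field name
    # is not documented in this README.
    value = config.get("apiKey") if isinstance(config, dict) else None
    return value if isinstance(value, str) and value else None


if __name__ == "__main__":
    path = Path.home() / ".vidistill" / "config.json"
    found = resolve_api_key(os.environ, path)
    print("key found" if found else "no key; vidistill would prompt")
```

Setting the environment variable is the simplest way to override a saved key, because it is checked first.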

How It Works

Supported formats: MP4, MOV, WebM, MKV, AVI, MPEG, FLV, WMV, 3GPP (video) and MP3, AAC, WAV, FLAC, OGG, M4A (audio).

Pass 0 — scene analysis classifies the video and determines processing strategy
Pass 1a/1b — transcription + speaker diarization, each running 3x with consensus alignment
Pass 2 — visual content extraction (code, slides, diagrams, screen states)
Pass 3 — specialist passes: chat/links (3c), implicit signals (3d), people (3b), code reconstruction (3a, 3x consensus + validation)
Synthesis — cross-references all passes into unified analysis
Output — structured markdown and source files

Long videos are segmented automatically. Failed passes are skipped gracefully. In interactive mode, a cost estimate is shown before processing and a quality summary (coverage, consensus rate, tokens) is displayed after.
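
The README does not say how the repeated runs in Pass 1a/1b and Pass 3a are reconciled. One common approach is a per-segment majority vote, sketched below in Python as a way to reason about what a "consensus rate" could mean. The function `majority_consensus`, the assumption that runs are already aligned position by position, and the sample strings are all hypothetical; vidistill's actual alignment may work differently.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def majority_consensus(runs: Sequence[Sequence[str]]) -> Tuple[List[str], float]:
    """Pick the most common value at each position across runs.

    Returns the merged sequence and the fraction of positions where a
    strict majority of runs agreed. Assumes the runs are pre-aligned.
    """
    if not runs:
        return [], 0.0
    length = min(len(run) for run in runs)
    merged: List[str] = []
    agreed = 0
    for i in range(length):
        votes = Counter(run[i] for run in runs)
        # On a full tie, most_common returns the first value encountered.
        value, count = votes.most_common(1)[0]
        merged.append(value)
        if count > len(runs) / 2:
            agreed += 1
    rate = agreed / length if length else 0.0
    return merged, rate


runs = [
    ["hello world", "use state here", "npm install"],
    ["hello world", "useState here", "npm instal"],
    ["hello word", "use-state here", "npm install"],
]
merged, rate = majority_consensus(runs)
print(merged, round(rate, 2))
```

In this sample the second segment has no majority, so the first run's value is kept and the position does not count toward the agreement rate.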

License

MIT