framecap

YouTube videos → structured markdown with visual frame captures.

Takes a YouTube URL and outputs a clean markdown document with a structured transcript (chapter headers, speaker labels, paragraph breaks) and frame captures at key moments — embedded as images in the markdown.

Why

YouTube videos contain valuable knowledge, but it's trapped in a format you can't search, reference, or link to. Transcripts alone miss the visual context. framecap gives you both: a readable document with visual bookmarks.

Install

# Prerequisites (shown here with Homebrew; use your platform's package manager otherwise)
brew install yt-dlp ffmpeg

# Install framecap
npm install -g framecap

Usage

# Single video
framecap https://youtube.com/watch?v=abc123

# Multiple videos
framecap https://youtube.com/watch?v=abc https://youtube.com/watch?v=def

# Full playlist
framecap https://youtube.com/playlist?list=PLxyz

# Custom output directory
framecap https://youtube.com/watch?v=abc -o ./notes/

# Hint speaker names for interviews
framecap https://youtube.com/watch?v=abc --speakers "Lex Fridman,Andrej Karpathy"

# Skip LLM structuring (free mode — raw transcript + frames only)
framecap https://youtube.com/watch?v=abc --no-structure

# Obsidian-compatible output (wikilink image syntax)
framecap https://youtube.com/watch?v=abc --format obsidian

Output

./how-karpathy-builds-software.md
./frames/how-karpathy-builds-software/
├── frame-0001-00m00s.jpg
├── frame-0002-01m45s.jpg
├── frame-0003-05m30s.jpg
└── ...
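
The frame filenames above appear to combine a four-digit sequence number with a minutes-and-seconds timestamp. A minimal Python sketch of that naming scheme, assuming zero-padded fields as in the listing (the helper name `frame_name` is hypothetical, and how framecap names frames past 99 minutes is not documented here):

```python
def frame_name(index: int, seconds: int) -> str:
    """Build a name like frame-0002-01m45s.jpg (hypothetical helper)."""
    minutes, secs = divmod(seconds, 60)
    # Four-digit sequence number, then MMmSSs, mirroring the listing above
    return f"frame-{index:04d}-{minutes:02d}m{secs:02d}s.jpg"


print(frame_name(2, 105))  # frame-0002-01m45s.jpg
```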

The markdown file includes:

YAML frontmatter — title, channel, URL, duration, upload date, auto-generated tags
Structured transcript — organized by chapters (from video description), with speaker labels and natural paragraph breaks
Embedded frames — images at chapter boundaries or fixed intervals, with timestamps and captions
Quotes section — notable quotes extracted during structuring
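
To illustrate the embedded frames, here is a rough sketch of how an image reference might be rendered for each `--format` value. Standard markdown image syntax and Obsidian's `![[...]]` embed syntax are real; the function name, the caption handling, and the example path are assumptions, not framecap's actual output.

```python
def embed_frame(path: str, caption: str, fmt: str = "markdown") -> str:
    """Return an image reference for one frame (illustrative only)."""
    if fmt == "obsidian":
        # Obsidian wikilink embed; the caption is dropped in this sketch
        return f"![[{path}]]"
    if fmt == "markdown":
        # Standard markdown image: ![alt text](path)
        return f"![{caption}]({path})"
    raise ValueError(f"unknown format: {fmt}")


print(embed_frame("frames/talk/frame-0002-01m45s.jpg", "01:45"))
print(embed_frame("frames/talk/frame-0002-01m45s.jpg", "01:45", "obsidian"))
```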

Options

Flag	Default	Description
`-o, --output`	`./`	Output directory
`--interval`	auto	Force fixed-interval frame capture (seconds)
`--max-frames`	`50`	Maximum frames to extract
`--dedup-threshold`	`0.85`	Similarity threshold for frame deduplication (0.0-1.0)
`--no-dedup`	off	Disable deduplication and keep all frames
`--format`	`markdown`	`markdown` or `obsidian` (wikilinks)
`--capture-at`	—	Capture at specific timestamps (e.g. `1:30,5:00`)
`--speakers`	auto	Comma-separated speaker names
`--no-structure`	off	Skip LLM pass (free mode)
`--no-frames`	off	Transcript only
`--language`	`en`	Transcript language
`--keep-video`	off	Retain downloaded video file
`--cookies-from-browser`	—	Use cookies from browser (chrome, firefox, edge)
`-v, --verbose`	off	Detailed logging
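
The `--capture-at` value is a comma-separated list of timestamps such as `1:30,5:00`. As a sketch of how such a value could be turned into seconds, assuming `M:SS` input as in the example (accepting `H:MM:SS` or bare seconds is an assumption of this sketch, and both function names are hypothetical):

```python
def parse_timestamp(text: str) -> int:
    """Convert 'SS', 'M:SS' or 'H:MM:SS' to a number of seconds."""
    seconds = 0
    for part in text.strip().split(":"):
        # Each colon shifts the running total up by a factor of 60
        seconds = seconds * 60 + int(part)
    return seconds


def parse_capture_at(value: str) -> list[int]:
    """Parse a comma-separated timestamp list into sorted, unique seconds."""
    return sorted({parse_timestamp(t) for t in value.split(",") if t.strip()})


print(parse_capture_at("1:30,5:00"))  # [90, 300]
```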

Configuration

Defaults can be set in ~/.framecap.yml:

interval: 15
max_frames: 50
dedup_threshold: 0.85
format: markdown
language: en
output: ~/Notes/Videos/

CLI flags override config file values.
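
The precedence described here (built-in defaults, then `~/.framecap.yml`, then CLI flags) can be sketched as a layered dictionary merge. This is an illustration only: the default values are taken from the Options table, the config is assumed to be already parsed into a dict, and `resolve_settings` is a hypothetical name rather than part of framecap.

```python
DEFAULTS = {
    "max_frames": 50,
    "dedup_threshold": 0.85,
    "format": "markdown",
    "language": "en",
    "output": "./",
}


def resolve_settings(config: dict, cli: dict) -> dict:
    """Merge defaults, config file values, and CLI flags, in that order."""
    settings = dict(DEFAULTS)
    settings.update(config)
    # Only flags the user actually passed (non-None) override the config file
    settings.update({k: v for k, v in cli.items() if v is not None})
    return settings


config = {"interval": 15, "output": "~/Notes/Videos/"}
cli = {"output": "./notes/", "format": None}
print(resolve_settings(config, cli)["output"])  # ./notes/
```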

Requirements

Node.js 18+
yt-dlp — video/transcript download
ffmpeg — frame extraction
Anthropic API key (optional, for transcript structuring — set ANTHROPIC_API_KEY)
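
A quick way to check these requirements before a run is to look for the two executables on `PATH` and for the environment variable. The sketch below uses only the Python standard library; the function names are hypothetical and this is not how framecap itself performs the check.

```python
import os
import shutil


def missing_tools(tools=("yt-dlp", "ffmpeg"), which=shutil.which):
    """Return the required executables that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def can_structure(env=os.environ) -> bool:
    """True if an Anthropic API key is set, so the optional LLM pass can run."""
    return bool(env.get("ANTHROPIC_API_KEY"))


print("missing tools:", missing_tools())
print("structuring available:", can_structure())
```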

How It Works

Fetch metadata — yt-dlp gets title, channel, duration, chapters, description
Extract transcript — yt-dlp pulls auto/manual captions, parses VTT
Capture frames — ffmpeg extracts frames at intervals or chapter boundaries
Deduplicate frames — removes visually similar frames (configurable threshold)
Structure transcript (optional) — LLM adds chapter headers, speaker labels, paragraph breaks. All words stay verbatim — only whitespace and labels are added.
Assemble markdown — combines metadata, structured transcript, and frame references into the output file

Cost

The LLM structuring pass is the only part that costs money (requires Anthropic API key):

Video Length	Approximate Cost
15 minutes	~$0.02
1 hour	~$0.10
2 hours	~$0.20

Use --no-structure for completely free operation (raw transcript + frames).
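
The figures in the table above are roughly linear in video length, at about $0.10 per hour. A back-of-the-envelope estimator, assuming that linear rate holds (actual cost depends on transcript length and API pricing, so treat the result as a rough guide only):

```python
RATE_PER_HOUR = 0.10  # USD, read off the 1-hour row of the table above


def estimate_cost(minutes: float) -> float:
    """Rough structuring cost in USD for a video of the given length."""
    return minutes / 60 * RATE_PER_HOUR


for minutes in (15, 60, 120):
    print(f"{minutes} min: ~${estimate_cost(minutes):.3f}")
```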

License

MIT