youtube2md
Convert any YouTube video into a structured Markdown summary — with chapter detection, clickable timestamp links, and key takeaways.
Install
npm install -g youtube2md
Or use without installing:
OPENAI_API_KEY=sk-... npx youtube2md --url https://youtu.be/VIDEO_ID
Usage
# Basic usage
youtube2md --url https://www.youtube.com/watch?v=VIDEO_ID
# With custom output path
youtube2md --url https://youtu.be/VIDEO_ID --out ./notes/video.md
# Set summary language
youtube2md --url https://youtu.be/VIDEO_ID --lang Korean
# Use a specific model
youtube2md --url https://youtu.be/VIDEO_ID --model gpt-4o-mini
Output is saved to ./summaries/<video_id>.md by default.
Options
| Option | Description |
|---|---|
| `--url <youtube_url>` | YouTube video URL (required) |
| `--model <model>` | OpenAI model to use (default: gpt-5-mini). Overrides OPENAI_MODEL env var. |
| `--lang <language>` | Summary output language (default: same as transcript language) |
| `--out <path>` | Output file path (default: `./summaries/<video_id>.md`). Use `--out ./<video_id>.md` to save in the current directory. |
| `--help` | Show help |
| `--version` | Show version |
Requirements
- Node.js 18+
- OpenAI API key with access to GPT-4/5 models
Environment variables
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | Required. Your OpenAI API key. |
| `OPENAI_MODEL` | Optional. Fallback model if --model is not passed (default: gpt-5-mini). |
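The precedence described in the two tables above (`--model` first, then `OPENAI_MODEL`, then the built-in default) can be sketched as follows. This is a minimal illustration, not the package's code: the helper name `resolveModel` is hypothetical, and treating an empty string as "not set" is an assumption.

```typescript
// Hypothetical helper; the name and signature are assumptions for illustration.
const DEFAULT_MODEL = "gpt-5-mini"; // default stated in the tables above

function resolveModel(
  cliModel: string | undefined,
  env: Record<string, string | undefined>,
): string {
  // 1. --model wins when passed.
  // 2. Otherwise fall back to OPENAI_MODEL.
  // 3. Otherwise use the built-in default.
  // Assumption: blank values are treated the same as missing values.
  return cliModel?.trim() || env.OPENAI_MODEL?.trim() || DEFAULT_MODEL;
}
```

For example, `resolveModel(undefined, { OPENAI_MODEL: "gpt-4o" })` would return `"gpt-4o"`, while passing `"gpt-4o-mini"` as the first argument would override it.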
Set your API key before running:
export OPENAI_API_KEY=sk-...
youtube2md --url https://youtu.be/VIDEO_ID
Or create a .env file in your working directory:
OPENAI_API_KEY=sk-...
Output format
# Video Title
> [Watch on YouTube](https://youtu.be/VIDEO_ID) | Duration: 12:34 | Published: 2024-01-01
## Summary
One paragraph overview of the video content.
## Chapters
### [0:00] Introduction
[▶ 0:00](https://youtu.be/VIDEO_ID?t=0)
- First key point from this section.
- Second key point from this section.
### [2:30] Main Topic
[▶ 2:30](https://youtu.be/VIDEO_ID?t=150)
- First key point from this section.
- Second key point from this section.
## Key Takeaways
- Key point 1
- Key point 2
Transcript strategy
The tool tries these methods in order:
- YouTube captions via Android Innertube: uses caption tracks from YouTube directly (supports `json3` and XML timedtext formats).
- `youtube-transcript` fallback: retries transcript extraction with an alternate parser path.
- OpenAI Whisper STT fallback: downloads audio and transcribes it when captions are unavailable (requires API quota; audio must be under 25 MB).
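The ordered fallback above could be structured roughly like this. It is a sketch under stated assumptions: the `Segment` and `TranscriptSource` types and the `fetchTranscript` function are hypothetical names, and the real logic in `src/youtube.ts` may differ. Each source is passed in as a plain async function, so no real YouTube or OpenAI call is shown.

```typescript
// Hypothetical types; not taken from the package's source.
interface Segment {
  startSeconds: number;
  text: string;
}

type TranscriptSource = {
  name: string;
  fetch: (videoId: string) => Promise<Segment[]>;
};

// Try each source in order and return the first non-empty transcript.
// A thrown error or an empty result both count as a failure for that source.
async function fetchTranscript(
  videoId: string,
  sources: TranscriptSource[],
): Promise<{ source: string; segments: Segment[] }> {
  const errors: string[] = [];
  for (const source of sources) {
    try {
      const segments = await source.fetch(videoId);
      if (segments.length > 0) return { source: source.name, segments };
      errors.push(`${source.name}: empty transcript`);
    } catch (err) {
      errors.push(`${source.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All transcript sources failed: ${errors.join("; ")}`);
}
```

Collecting the per-source error messages keeps the final failure explainable when all three methods are unavailable.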
Summary process
Summarization runs in two modes based on transcript token count (using tiktoken with model-aware encoding):
- Normalize transcript: convert each segment to `[MM:SS] spoken text` so timestamps stay tied to content.
- Count tokens: compute transcript size with `tiktoken` (model-aware encoding; fallback to `o200k_base`).
- Choose mode:
  - Single-pass when total tokens are `<= 5000`
  - Chunked when total tokens are `> 5000`
- Single-pass mode:
  - Send one GPT request with metadata, optional native YouTube chapters, and full transcript.
  - Expect strict JSON output: `summary`, `chapters`, `takeaways`.
- Chunked mode:
  - Split transcript into chunks targeting `5000` tokens.
  - If the last chunk is too small (`< 25%` of chunk limit), merge it into the previous chunk.
  - Summarize each chunk to the same JSON schema.
  - Combine chunk summaries + chapters locally (chronological sort + dedupe).
  - Run one final GPT request for full-video summary + takeaways.
- Validate + normalize:
  - Parse JSON and require non-empty `summary`, `chapters`, and `takeaways`.
  - Normalize timestamps/seconds, sort chapters chronologically, and deduplicate chapters/takeaways.
- Render Markdown:
  - Convert the normalized structured result into the final Markdown output.
This prevents long videos from being truncated and keeps output quality more proportional to transcript length.
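The transcript normalization and chapter cleanup steps above might look like the following. This is an illustrative sketch, not the code in `src/summarizer.ts`: the `Segment` and `Chapter` shapes and all function names are assumptions, as are two details the README does not specify (minutes keep counting past 59 rather than rolling into hours, and two chapters count as duplicates when they share a start second and a case-insensitive title).

```typescript
// Hypothetical shapes; field names are assumptions.
interface Segment {
  startSeconds: number;
  text: string;
}

interface Chapter {
  seconds: number;
  title: string;
}

// Format seconds as zero-padded MM:SS.
// Assumption: minutes keep counting past 59 (e.g. 75:05) instead of switching to H:MM:SS.
function formatTimestamp(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const mm = String(Math.floor(s / 60)).padStart(2, "0");
  const ss = String(s % 60).padStart(2, "0");
  return `${mm}:${ss}`;
}

// Turn each segment into a "[MM:SS] spoken text" line.
function normalizeTranscript(segments: Segment[]): string {
  return segments
    .map((seg) => `[${formatTimestamp(seg.startSeconds)}] ${seg.text.trim()}`)
    .join("\n");
}

// Sort chapters chronologically, then drop duplicates.
// Assumption: a duplicate has the same start second and the same title ignoring case.
function normalizeChapters(chapters: Chapter[]): Chapter[] {
  const seen = new Set<string>();
  return [...chapters]
    .sort((a, b) => a.seconds - b.seconds)
    .filter((chapter) => {
      const key = `${chapter.seconds}|${chapter.title.trim().toLowerCase()}`;
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    });
}
```

Copying the array before sorting leaves the caller's chapter list untouched, which matters when chunk results are combined and normalized more than once.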
Token thresholds
These constants are defined in src/summarizer.ts:
- `SINGLE_PASS_TOKEN_LIMIT = 5000`
  - If the full transcript is `<= 5000` tokens, the app uses single-pass summarization.
- `CHUNK_TOKEN_LIMIT = 5000`
  - In chunked mode, each chunk targets up to about `5000` tokens.
- `MIN_LAST_CHUNK_RATIO = 0.25`
  - If the final chunk is smaller than `25%` of `CHUNK_TOKEN_LIMIT`, it is merged into the previous chunk.
  - This avoids a tiny final chunk that usually lowers summary quality.
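To show how the three constants interact, here is a rough sketch of mode selection and chunk planning. Only the constant names and values come from this README; `chooseMode`, `chunkLines`, the greedy line-by-line packing, and the injected `countTokens` callback are assumptions made so the example runs without `tiktoken`. The actual splitting strategy may differ.

```typescript
const SINGLE_PASS_TOKEN_LIMIT = 5000;
const CHUNK_TOKEN_LIMIT = 5000;
const MIN_LAST_CHUNK_RATIO = 0.25;

type Mode = "single-pass" | "chunked";

function chooseMode(totalTokens: number): Mode {
  return totalTokens <= SINGLE_PASS_TOKEN_LIMIT ? "single-pass" : "chunked";
}

// Greedily pack transcript lines into chunks of at most `limit` tokens,
// then merge an undersized final chunk into the previous one.
// countTokens is injected (hypothetical) so the sketch has no tokenizer dependency.
function chunkLines(
  lines: string[],
  countTokens: (text: string) => number,
  limit: number = CHUNK_TOKEN_LIMIT,
): string[][] {
  const chunks: { lines: string[]; tokens: number }[] = [];
  let current: string[] = [];
  let currentTokens = 0;

  for (const line of lines) {
    const n = countTokens(line);
    // Close the current chunk when adding this line would exceed the limit.
    if (current.length > 0 && currentTokens + n > limit) {
      chunks.push({ lines: current, tokens: currentTokens });
      current = [];
      currentTokens = 0;
    }
    current.push(line);
    currentTokens += n;
  }
  if (current.length > 0) chunks.push({ lines: current, tokens: currentTokens });

  // Merge the last chunk into the previous one if it is below the ratio threshold.
  if (chunks.length >= 2) {
    const tail = chunks.pop()!;
    if (tail.tokens < limit * MIN_LAST_CHUNK_RATIO) {
      const prev = chunks.pop()!;
      chunks.push({ lines: [...prev.lines, ...tail.lines], tokens: prev.tokens + tail.tokens });
    } else {
      chunks.push(tail);
    }
  }
  return chunks.map((chunk) => chunk.lines);
}
```

Note that after a merge the last chunk can exceed the limit by up to 25%, which matches the "targets up to about 5000 tokens" wording. In this sketch a single line larger than the limit simply becomes its own oversized chunk.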
Development
# Clone and install
git clone https://github.com/sunghyo/youtube2md
cd youtube2md
npm install
# Set your API key
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
# Run without building
npx tsx src/index.ts --url https://www.youtube.com/watch?v=VIDEO_ID
Build
npm run build # Compile TypeScript to dist/
npm run dev # Run directly with tsx (no build needed)
npm run clean # Remove dist/
Project structure
src/
├── index.ts # Entry point — orchestrates all steps
├── cli.ts # CLI argument parsing (Commander)
├── youtube.ts # Metadata fetch + transcript fetch with fallback
├── summarizer.ts # OpenAI Responses API prompting + JSON parsing
├── markdown.ts # Markdown generation + file writing
└── types.ts # Shared TypeScript interfaces
summaries/ # Default output directory
Attribution
This project was built with AI assistance from Claude (Anthropic) and Codex (OpenAI).