@nadimtuhin/ytranscript

ytranscript

Fast YouTube transcript extraction with bulk processing, Google Takeout support, MCP server, and multiple output formats.

Built with Bun for maximum performance.

Features

- Direct YouTube API: No third-party services, uses YouTube's innertube API
- MCP Server: Use with Claude, Cursor, and other AI assistants via Model Context Protocol
- Bulk processing: Process thousands of videos with concurrency control
- Google Takeout support: Import from watch history JSON and watch-later CSV
- Resume-safe: Automatically skips already-processed videos
- Multiple output formats: JSON, JSONL, CSV, SRT, VTT, plain text
- Language selection: Choose preferred transcript languages
- Programmatic API: Use as a library in your TypeScript/JavaScript projects

Installation

# Install globally
bun install -g ytranscript

# Or use locally in a project
bun add ytranscript

CLI Usage

Fetch a single transcript

# Basic usage (outputs plain text)
ytranscript get dQw4w9WgXcQ

# From URL
ytranscript get "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# With specific language
ytranscript get dQw4w9WgXcQ --lang es

# Output as SRT subtitles
ytranscript get dQw4w9WgXcQ --format srt -o video.srt

# Output as JSON with timestamps
ytranscript get dQw4w9WgXcQ --format json

Check available languages

ytranscript info dQw4w9WgXcQ

Bulk processing

# From Google Takeout exports
ytranscript bulk \
  --history "Takeout/YouTube and YouTube Music/history/watch-history.json" \
  --watch-later "Takeout/YouTube and YouTube Music/playlists/Watch later-videos.csv" \
  --out-jsonl transcripts.jsonl \
  --out-csv transcripts.csv

# From a list of video IDs
ytranscript bulk --videos "dQw4w9WgXcQ,jNQXAC9IVRw,9bZkp7q19f0"

# From a file (one ID or URL per line)
ytranscript bulk --file videos.txt

# Resume a previous run
ytranscript bulk --history watch-history.json --resume

# Control concurrency and rate limiting
ytranscript bulk \
  --history watch-history.json \
  --concurrency 8 \
  --pause-after 20 \
  --pause-ms 3000

Programmatic API

Fetch a single transcript

import { fetchTranscript } from 'ytranscript';

const transcript = await fetchTranscript('dQw4w9WgXcQ', {
  languages: ['en', 'es'], // Preference order
  includeAutoGenerated: true,
});

console.log(transcript.text); // Full transcript text
console.log(transcript.segments); // Array of { text, start, duration }
console.log(transcript.language); // 'en'
console.log(transcript.isAutoGenerated); // true/false

Bulk processing

import {
  loadWatchHistory,
  loadWatchLater,
  mergeVideoSources,
  processVideos,
} from 'ytranscript';

// Load from Google Takeout
const history = await loadWatchHistory('./watch-history.json');
const watchLater = await loadWatchLater('./watch-later.csv');

// Merge and deduplicate
const videos = mergeVideoSources(history, watchLater);

// Process with progress callback
const results = await processVideos(videos, {
  concurrency: 4,
  pauseAfter: 10,
  pauseDuration: 5000,
  onProgress: (completed, total, result) => {
    const status = result.transcript ? 'OK' : 'FAIL';
    console.log(`[${completed}/${total}] ${result.meta.videoId}: ${status}`);
  },
});

// Filter successful results
const transcripts = results.filter((r) => r.transcript);
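
To make the `concurrency`, `pauseAfter`, and `pauseDuration` options more concrete, here is a minimal sketch of a worker pool with a periodic pause. It is not ytranscript's implementation: `runPool`, `PoolOptions`, and `sleep` are hypothetical names, and the meaning of `pauseAfter` (pause once every N claimed items) is an assumption.

```typescript
// Hypothetical sketch, not the library's code.
interface PoolOptions {
  concurrency: number; // maximum number of tasks in flight
  pauseAfter: number; // assumed meaning: pause once every N claimed items
  pauseDuration: number; // milliseconds
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runPool<T, R>(
  items: readonly T[],
  worker: (item: T) => Promise<R>,
  { concurrency, pauseAfter, pauseDuration }: PoolOptions,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  // Shared gate: lanes wait on it before claiming their next item.
  let gate: Promise<void> = Promise.resolve();

  async function lane(): Promise<void> {
    for (;;) {
      await gate;
      const index = next++;
      if (index >= items.length) return;
      if (pauseAfter > 0 && index > 0 && index % pauseAfter === 0) {
        // Replace the gate with a timer. In-flight tasks are not interrupted,
        // and a lane that already passed the old gate may start one more item.
        gate = sleep(pauseDuration);
        await gate;
      }
      // The worker is expected to catch its own errors; a rejection here
      // would reject the whole pool.
      results[index] = await worker(items[index] as T);
    }
  }

  const laneCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results; // same order as the input
}
```

Results are stored by input index, so the output order matches the input even though tasks finish out of order.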

Streaming for large datasets

import { streamVideos, appendJsonl } from 'ytranscript';

for await (const result of streamVideos(videos, { concurrency: 4 })) {
  // Write each result immediately (resume-safe)
  await appendJsonl(result, 'output.jsonl');
}
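
Because each result is appended as soon as it arrives, an interrupted run leaves a usable JSONL file behind. The sketch below shows one way to turn such a file into a set of already-processed IDs. `processedIdsFromJsonl` is a hypothetical helper, and it assumes each line is a JSON object with a `meta.videoId` string, as in the `TranscriptResult` type; the actual on-disk shape may differ.

```typescript
// Hypothetical helper: collect video IDs from existing JSONL output.
// Assumes each line looks like { "meta": { "videoId": "..." }, ... }.
function processedIdsFromJsonl(jsonl: string): Set<string> {
  const ids = new Set<string>();
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const record: unknown = JSON.parse(trimmed);
      const id = (record as { meta?: { videoId?: unknown } } | null)?.meta?.videoId;
      if (typeof id === "string") ids.add(id);
    } catch {
      // A partially written line (for example after a crash) is skipped.
    }
  }
  return ids;
}
```

The resulting set has the same type as the `skipIds` field of `BulkOptions`.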

Output formatting

import { fetchTranscript, formatSrt, formatVtt, formatText } from 'ytranscript';

const transcript = await fetchTranscript('dQw4w9WgXcQ');

// SRT subtitles
const srt = formatSrt(transcript);
await Bun.write('video.srt', srt);

// VTT subtitles
const vtt = formatVtt(transcript);
await Bun.write('video.vtt', vtt);

// Plain text with timestamps
const text = formatText(transcript, true);
// [0:00] First line of transcript
// [0:05] Second line...

Google Takeout

To export your YouTube data:

1. Go to Google Takeout
2. Deselect all, then select only "YouTube and YouTube Music"
3. Click "All YouTube data included" and select:
   - History → Watch history
   - Playlists (includes Watch Later)
4. Export and download
5. Extract the archive

The relevant files are:

Takeout/YouTube and YouTube Music/history/watch-history.json
Takeout/YouTube and YouTube Music/playlists/Watch later-videos.csv
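
As a rough illustration of what importing the watch history involves, the sketch below pulls unique video IDs out of the JSON text. It is not the code behind `loadWatchHistory`. `VideoRef` and `videoIdsFromHistory` are hypothetical names, and the entry fields `titleUrl`, `title`, and `time` are assumptions about the Takeout export format, which Google may change.

```typescript
// Hypothetical shape for illustration; the library's own metadata type may differ.
interface VideoRef {
  videoId: string;
  title?: string;
  watchedAt?: string;
}

// Assumes the file is a JSON array whose entries carry a `titleUrl`
// such as "https://www.youtube.com/watch?v=<id>".
function videoIdsFromHistory(json: string): VideoRef[] {
  const entries: unknown = JSON.parse(json);
  if (!Array.isArray(entries)) return [];

  const seen = new Set<string>();
  const videos: VideoRef[] = [];
  for (const entry of entries) {
    const { titleUrl, title, time } = (entry ?? {}) as Record<string, unknown>;
    if (typeof titleUrl !== "string") continue; // entries without a link are skipped

    let id: string | null;
    try {
      id = new URL(titleUrl).searchParams.get("v");
    } catch {
      continue; // not a parseable URL
    }
    if (!id || seen.has(id)) continue; // keep the first occurrence only

    seen.add(id);
    const video: VideoRef = { videoId: id };
    if (typeof title === "string") video.title = title;
    if (typeof time === "string") video.watchedAt = time;
    videos.push(video);
  }
  return videos;
}
```

Deduplicating at this stage keeps rewatched videos from being fetched more than once.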

API Reference

Types

interface Transcript {
  videoId: string;
  text: string;
  segments: TranscriptSegment[];
  language: string;
  isAutoGenerated: boolean;
}

interface TranscriptSegment {
  text: string;
  start: number;  // seconds
  duration: number;  // seconds
}

interface TranscriptResult {
  meta: WatchHistoryMeta;
  transcript: Transcript | null;
  error?: string;
}

interface FetchOptions {
  languages?: string[];
  timeout?: number;
  includeAutoGenerated?: boolean;
}

interface BulkOptions extends FetchOptions {
  concurrency?: number;
  pauseAfter?: number;
  pauseDuration?: number;
  skipIds?: Set<string>;
  onProgress?: (completed: number, total: number, result: TranscriptResult) => void;
}

License

MIT

MCP Server (Model Context Protocol)

ytranscript includes an MCP server that allows AI assistants like Claude to fetch YouTube transcripts directly.

Available Tools

| Tool | Description |
| --- | --- |
| `get_transcript` | Fetch transcript for a YouTube video with format options (text, segments, srt, vtt) |
| `get_transcript_languages` | List available caption languages for a video |
| `extract_video_id` | Extract video ID from various YouTube URL formats |
| `get_transcripts_bulk` | Fetch transcripts for multiple videos at once |
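
To give a sense of what "various YouTube URL formats" can mean, here is a minimal sketch of ID extraction. It is not the code behind `extract_video_id`: `extractVideoId` is a hypothetical function, the list of URL shapes is a guess at common cases, and the 11-character ID pattern is an assumption based on how YouTube IDs look today.

```typescript
// Assumption: video IDs are 11 characters from the URL-safe base64 alphabet.
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

// Hypothetical helper: returns the video ID, or null if none is found.
function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (VIDEO_ID.test(trimmed)) return trimmed; // already a bare ID

  let url: URL;
  try {
    url = new URL(trimmed);
  } catch {
    return null; // neither an ID nor a parseable URL
  }

  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let candidate: string | null = null;

  if (host === "youtu.be") {
    // https://youtu.be/<id>
    candidate = url.pathname.split("/")[1] ?? null;
  } else if (host === "youtube.com" || host === "music.youtube.com") {
    if (url.pathname === "/watch") {
      // https://www.youtube.com/watch?v=<id>
      candidate = url.searchParams.get("v");
    } else {
      // /embed/<id>, /shorts/<id>, /live/<id>
      const match = url.pathname.match(/^\/(embed|shorts|live)\/([^/?#]+)/);
      candidate = match?.[2] ?? null;
    }
  }

  return candidate !== null && VIDEO_ID.test(candidate) ? candidate : null;
}
```

Validating the candidate against the ID pattern at the end keeps malformed links from being passed on as IDs.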

Setup with Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "ytranscript": {
      "command": "npx",
      "args": ["-y", "ytranscript-mcp"]
    }
  }
}

Or if installed globally:

{
  "mcpServers": {
    "ytranscript": {
      "command": "ytranscript-mcp"
    }
  }
}

Setup with Cursor

Add to your Cursor MCP settings:

{
  "mcpServers": {
    "ytranscript": {
      "command": "npx",
      "args": ["-y", "ytranscript-mcp"]
    }
  }
}

Example Usage in Claude

Once configured, you can ask Claude:

"Get the transcript for this YouTube video: https://youtube.com/watch?v=dQw4w9WgXcQ"
"What languages are available for this video?"
"Summarize the transcript of this video"
"Get transcripts for these 5 videos and compare their content"

Running the MCP Server Manually

# Via npx
npx ytranscript-mcp

# Or if installed globally
ytranscript-mcp

# For development
bun run dev:mcp
