YAYTT - Yet Another Youtube Transcriptor

Features

Smart deduplication - Removes overlapping auto-generated caption segments
TypeScript support - Full type definitions included
Zero dependencies - Lightweight and self-contained

Installation

bun add yaytt

npm install yaytt

yarn add yaytt

pnpm add yaytt

Quick Start

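import { extractCaptions } from "yaytt";

// Extract captions by video ID, using the default language
const captions = await extractCaptions("WcBA3QEXJ2o");

// Request a specific language
const englishCaptions = await extractCaptions("WcBA3QEXJ2o", { lang: "en" });

// Pass a full YouTube URL instead of a video ID
const captionsFromUrl = await extractCaptions(
  "https://www.youtube.com/watch?v=WcBA3QEXJ2o",
);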

Advanced Usage

Ultra-aggressive deduplication for heavily overlapping captions

import { extractCaptions } from "yaytt";

const cleanCaptions = await extractCaptions("WcBA3QEXJ2o", {
  deduplicationOptions: {
    aggressiveMode: true, // Maximum deduplication
  },
});

Check available languages

import { getAvailableLanguages } from "yaytt";

const languages = await getAvailableLanguages("WcBA3QEXJ2o");
console.log(languages);
// [{ code: 'pt', name: 'Portuguese (auto-generated)', isAutomatic: true }]
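
Once the language list is available, a caller usually has to decide which track to request. The following is a minimal sketch of one way to do that, assuming the result shape shown above. `pickLanguage` and `LanguageInfo` are hypothetical names used for illustration and are not part of yaytt.

```typescript
// Shape of each entry returned by getAvailableLanguages, per the output above.
interface LanguageInfo {
  code: string;
  name: string;
  isAutomatic: boolean;
}

// Hypothetical helper (not part of yaytt): choose a track for a preferred
// language code, favoring manually written captions over automatic ones.
function pickLanguage(
  languages: LanguageInfo[],
  preferred: string,
): LanguageInfo | undefined {
  const matches = languages.filter((l) => l.code === preferred);
  const manual = matches.find((l) => !l.isAutomatic);
  // Fall back to an automatic track in the preferred language, then to
  // whichever track is listed first. Returns undefined for an empty list.
  return manual ?? matches[0] ?? languages[0];
}

const sampleLanguages: LanguageInfo[] = [
  { code: "pt", name: "Portuguese (auto-generated)", isAutomatic: true },
  { code: "en", name: "English", isAutomatic: false },
];
const chosenLanguage = pickLanguage(sampleLanguages, "en");
```

The `code` of the chosen entry can then be passed as the `lang` option to `extractCaptions`.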

Full configuration

import { YouTubeCaptionExtractor } from "yaytt";

const extractor = new YouTubeCaptionExtractor({
  userAgent: "MyApp/1.0",
  timeout: 15000,
  rateLimitDelay: 3000,
});

const captions = await extractor.extractCaptions("WcBA3QEXJ2o", {
  lang: "pt",
  retries: 3,
  deduplicate: true,
  deduplicationOptions: {
    timeThreshold: 3, // Seconds
    similarityThreshold: 0.8, // 80% similarity
    mergePartialMatches: true,
    aggressiveMode: false, // Set to true for maximum deduplication
  },
});

CLI

npx yaytt WcBA3QEXJ2o

npx yaytt WcBA3QEXJ2o --aggressive

npx yaytt "https://www.youtube.com/watch?v=WcBA3QEXJ2o"

API Reference

`extractCaptions(videoIdOrUrl, options?)`

Extract captions from a YouTube video.

Parameters:

videoIdOrUrl (string): YouTube video ID or full URL
options (object, optional):
- lang (string): Language code (default: 'pt' for Portuguese)
- deduplicate (boolean): Enable deduplication (default: true)
- deduplicationOptions (object): Deduplication settings

Returns: Promise<Caption[]>
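
Because `videoIdOrUrl` accepts either form, it can help to see how a URL might be reduced to a bare ID. This is only a sketch under stated assumptions: `toVideoId` and `parseUrl` are hypothetical helpers, and the rules yaytt actually applies internally are not documented here.

```typescript
// Hypothetical helper: returns a URL object, or undefined when the input
// is not an absolute URL (for example, a bare video ID).
function parseUrl(input: string): URL | undefined {
  try {
    return new URL(input);
  } catch {
    return undefined;
  }
}

// Hypothetical helper (not yaytt's implementation): normalize a video ID
// or a YouTube URL to a bare video ID.
function toVideoId(videoIdOrUrl: string): string {
  const url = parseUrl(videoIdOrUrl);
  if (url === undefined) {
    return videoIdOrUrl; // already a bare ID
  }
  // Short links carry the ID as the path: https://youtu.be/<id>
  if (url.hostname === "youtu.be") {
    return url.pathname.slice(1);
  }
  // Watch links carry the ID in the "v" query parameter.
  return url.searchParams.get("v") ?? videoIdOrUrl;
}
```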

`getAvailableLanguages(videoIdOrUrl)`

Get all available caption languages for a video.

Parameters:

videoIdOrUrl (string): YouTube video ID or full URL

Returns: Promise<{ code: string, name: string, isAutomatic: boolean }[]>

Types

interface Caption {
  start: number; // Start time in seconds
  dur: number; // Duration in seconds
  text: string; // Caption text
}

interface CaptionOptions {
  lang?: string;
  retries?: number;
  fallback?: boolean;
  deduplicate?: boolean;
  deduplicationOptions?: {
    timeThreshold?: number; // Default: 3 seconds
    similarityThreshold?: number; // Default: 0.8 (80% similarity)
    mergePartialMatches?: boolean; // Default: true
    aggressiveMode?: boolean; // Default: false
  };
}
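
As an illustration of working with the `Caption` type, the sketch below renders a caption list as timestamped transcript lines. `formatTimestamp` and `toTranscript` are hypothetical helpers, not exports of yaytt, and the `[m:ss]` layout is simply one possible choice.

```typescript
// Local copy of the Caption shape documented above.
interface Caption {
  start: number; // Start time in seconds
  dur: number; // Duration in seconds
  text: string; // Caption text
}

// Hypothetical helper: render a time in seconds as m:ss.
function formatTimestamp(seconds: number): string {
  const total = Math.floor(seconds);
  const minutes = Math.floor(total / 60);
  const secs = String(total % 60).padStart(2, "0");
  return `${minutes}:${secs}`;
}

// Hypothetical helper: one "[m:ss] text" line per caption.
function toTranscript(captions: Caption[]): string {
  return captions
    .map((c) => `[${formatTimestamp(c.start)}] ${c.text}`)
    .join("\n");
}
```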

Deduplication

YouTube's auto-generated captions often contain overlapping segments:

Before:
[0:00] [Música]
[0:00] [Música] O podcast que você ouve agora é uma
[0:02] O podcast que você ouve agora é uma
[0:02] O podcast que você ouve agora é uma produção da Central 3.

After:
[0:02] O podcast que você ouve agora é uma produção da Central 3.

Results:

Normal mode: ~50% reduction in caption count
Aggressive mode: ~70% reduction for heavily overlapping content
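
To make the idea concrete, here is a deliberately simplified sketch of overlap removal. It is not yaytt's algorithm: it only drops a caption when the very next caption, starting within `timeThreshold` seconds, contains its full text, and it ignores the similarity threshold and partial-match merging described in the options. `dropRolledUpCaptions` is a hypothetical name.

```typescript
// Local copy of the Caption shape from the Types section.
interface Caption {
  start: number;
  dur: number;
  text: string;
}

// Hypothetical, simplified deduplication (not the library's implementation):
// drop a caption when the next caption repeats its full text shortly after.
function dropRolledUpCaptions(
  captions: Caption[],
  timeThreshold = 3,
): Caption[] {
  return captions.filter((current, i) => {
    const next = captions[i + 1];
    if (next === undefined) return true; // the last caption is always kept
    const closeInTime = next.start - current.start <= timeThreshold;
    return !(closeInTime && next.text.includes(current.text));
  });
}

const rolledUp: Caption[] = [
  { start: 0, dur: 2, text: "[Música]" },
  { start: 0, dur: 2, text: "[Música] O podcast que você ouve agora é uma" },
  { start: 2, dur: 3, text: "O podcast que você ouve agora é uma" },
  { start: 2, dur: 3, text: "O podcast que você ouve agora é uma produção da Central 3." },
];
const reduced = dropRolledUpCaptions(rolledUp);
```

On this sample the sketch keeps the second and fourth captions. It does not collapse them into the single line shown in the After example, which presumably relies on the similarity-based matching that this sketch leaves out.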

How It Works

1. Extracts API keys from YouTube video pages
2. Calls YouTube's Innertube API directly (same API used by youtube.com)
3. Fetches caption track URLs from video metadata
4. Downloads VTT caption files directly from YouTube's servers
5. Parses timestamps and text into a clean format
6. Applies smart deduplication to remove overlapping segments
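
The timestamp-parsing step above can be sketched as follows. This assumes standard WebVTT cue timing lines such as `00:00:02.000 --> 00:00:04.500`. `parseVttTimestamp` and `parseCueTiming` are hypothetical names, and yaytt's own parser may differ.

```typescript
// Convert a WebVTT timestamp ("hh:mm:ss.ttt" or "mm:ss.ttt") to seconds.
function parseVttTimestamp(stamp: string): number {
  const parts = stamp.trim().split(":").map(Number);
  // Fold left to right so both the two-part and three-part forms work.
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Turn a cue timing line into the start/dur pair used by Caption.
// Anything after the end timestamp (cue settings) is ignored.
function parseCueTiming(
  line: string,
): { start: number; dur: number } | undefined {
  const match = line.match(/^(\S+)\s+-->\s+(\S+)/);
  if (!match) return undefined; // not a timing line
  const start = parseVttTimestamp(match[1]);
  const end = parseVttTimestamp(match[2]);
  return { start, dur: end - start };
}
```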

Requirements

Node.js 16+ or compatible runtime
Server-side only (not for browser use due to CORS)

Error Handling

import { extractCaptions, CaptionExtractionError } from "yaytt";

try {
  const captions = await extractCaptions("invalid-video-id");
} catch (error) {
  if (error instanceof CaptionExtractionError) {
    console.error(`Caption extraction failed: ${error.message}`);
    console.error(`Video ID: ${error.videoId}`);
  }
}

License

MIT