YAYTT - Yet Another Youtube Transcriptor
Features
- Smart deduplication - Removes overlapping auto-generated caption segments
- TypeScript support - Full type definitions included
- Zero dependencies - Lightweight and self-contained
Installation
bun add yaytt
npm install yaytt
yarn add yaytt
pnpm add yaytt
Quick Start
import { extractCaptions } from "yaytt";
// Extract captions by video ID (the default language is 'pt')
const captions = await extractCaptions("WcBA3QEXJ2o");
// Request a specific language
const englishCaptions = await extractCaptions("WcBA3QEXJ2o", { lang: "en" });
// A full YouTube URL works as well
const captionsFromUrl = await extractCaptions(
  "https://www.youtube.com/watch?v=WcBA3QEXJ2o",
);
Advanced Usage
Ultra-aggressive deduplication for heavily overlapping captions
import { extractCaptions } from "yaytt";
const cleanCaptions = await extractCaptions("WcBA3QEXJ2o", {
  deduplicationOptions: {
    aggressiveMode: true, // Maximum deduplication
  },
});
Check available languages
import { getAvailableLanguages } from "yaytt";
const languages = await getAvailableLanguages("WcBA3QEXJ2o");
console.log(languages);
// [{ code: 'pt', name: 'Portuguese (auto-generated)', isAutomatic: true }]
Full configuration
import { YouTubeCaptionExtractor } from "yaytt";
const extractor = new YouTubeCaptionExtractor({
  userAgent: "MyApp/1.0",
  timeout: 15000,
  rateLimitDelay: 3000,
});
const captions = await extractor.extractCaptions("WcBA3QEXJ2o", {
  lang: "pt",
  retries: 3,
  deduplicate: true,
  deduplicationOptions: {
    timeThreshold: 3, // Seconds
    similarityThreshold: 0.8, // 80% similarity
    mergePartialMatches: true,
    aggressiveMode: false, // Set to true for maximum deduplication
  },
});
CLI
npx yaytt WcBA3QEXJ2o
npx yaytt WcBA3QEXJ2o --aggressive
npx yaytt "https://www.youtube.com/watch?v=WcBA3QEXJ2o"
API Reference
extractCaptions(videoIdOrUrl, options?)
Extract captions from a YouTube video.
Parameters:
- videoIdOrUrl (string): YouTube video ID or full URL
- options (object, optional):
  - lang (string): Language code (default: 'pt' for Portuguese)
  - deduplicate (boolean): Enable deduplication (default: true)
  - deduplicationOptions (object): Deduplication settings
Returns: Promise<Caption[]>
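Since the result is a plain array of Caption objects, turning it into a readable transcript takes only a small amount of code. The sketch below is illustrative: formatTimestamp and toTranscript are hypothetical helpers, not part of yaytt, and the Caption interface is copied locally from the Types section so the snippet stands alone.

```typescript
// Local copy of the Caption shape documented under Types.
interface Caption {
  start: number; // start time in seconds
  dur: number; // duration in seconds
  text: string; // caption text
}

// Hypothetical helper (not part of yaytt): format seconds as m:ss.
function formatTimestamp(seconds: number): string {
  const total = Math.floor(seconds);
  const minutes = Math.floor(total / 60);
  const rest = total % 60;
  return `${minutes}:${rest < 10 ? "0" + rest : String(rest)}`;
}

// Hypothetical helper (not part of yaytt): one "[m:ss] text" line per caption.
function toTranscript(captions: Caption[]): string {
  return captions
    .map((caption) => `[${formatTimestamp(caption.start)}] ${caption.text}`)
    .join("\n");
}
```

The array returned by extractCaptions could be passed directly to toTranscript, assuming it matches the Caption shape shown under Types.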
getAvailableLanguages(videoIdOrUrl)
Get all available caption languages for a video.
Parameters:
- videoIdOrUrl (string): YouTube video ID or full URL
Returns: Promise<{ code: string, name: string, isAutomatic: boolean }[]>
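One possible use of this list is choosing which track to request before extracting. The following is a minimal sketch under assumptions: pickLanguage and LanguageInfo are hypothetical names, not part of yaytt, and the preference for manual tracks over auto-generated ones is a design choice made for the example.

```typescript
// Shape of each entry in the array returned by getAvailableLanguages.
interface LanguageInfo {
  code: string;
  name: string;
  isAutomatic: boolean;
}

// Hypothetical helper (not part of yaytt): walk the preferred codes in order,
// taking a manually authored track over an auto-generated one when both exist.
// Falls back to the first available track, or undefined if there are none.
function pickLanguage(
  languages: LanguageInfo[],
  preferred: string[],
): LanguageInfo | undefined {
  for (const code of preferred) {
    const matches = languages.filter((language) => language.code === code);
    const manual = matches.find((language) => !language.isAutomatic);
    if (manual) return manual;
    if (matches.length > 0) return matches[0];
  }
  return languages[0];
}
```

The code of the selected entry can then be supplied as the lang option of extractCaptions.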
Types
interface Caption {
  start: number; // Start time in seconds
  dur: number; // Duration in seconds
  text: string; // Caption text
}
interface CaptionOptions {
  lang?: string;
  retries?: number;
  fallback?: boolean;
  deduplicate?: boolean;
  deduplicationOptions?: {
    timeThreshold?: number; // Default: 3 seconds
    similarityThreshold?: number; // Default: 0.8 (80% similarity)
    mergePartialMatches?: boolean; // Default: true
    aggressiveMode?: boolean; // Default: false
  };
}
Deduplication
YouTube's auto-generated captions often contain overlapping segments:
Before:
[0:00] [Música]
[0:00] [Música] O podcast que você ouve agora é uma
[0:02] O podcast que você ouve agora é uma
[0:02] O podcast que você ouve agora é uma produção da Central 3.
After:
[0:02] O podcast que você ouve agora é uma produção da Central 3.
Results:
- Normal mode: ~50% reduction in caption count
- Aggressive mode: ~70% reduction for heavily overlapping content
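To make the idea concrete, here is a simplified sketch of overlap removal. It is not yaytt's actual algorithm, which is not described here beyond its options. The sketch assumes that bracketed annotations such as "[Música]" can be ignored and that a caption is redundant when its remaining text is contained in a nearby caption. The names normalize and dedupeByContainment are hypothetical, and the 3-second window only mirrors the documented timeThreshold default.

```typescript
// Local copy of the Caption shape documented under Types.
interface Caption {
  start: number;
  dur: number;
  text: string;
}

// Assumption: bracketed annotations like "[Música]" carry no transcript text.
function normalize(text: string): string {
  return text.replace(/\[[^\]]*\]/g, " ").replace(/\s+/g, " ").trim();
}

// Hypothetical sketch (not yaytt's implementation): drop a caption when its
// normalized text is empty, or when another caption starting within
// timeThreshold seconds contains that text. Exact duplicates keep the first copy.
function dedupeByContainment(captions: Caption[], timeThreshold = 3): Caption[] {
  const normalized = captions.map((caption) => normalize(caption.text));
  return captions.filter((caption, i) => {
    if (normalized[i] === "") return false;
    for (let j = 0; j < captions.length; j++) {
      if (j === i) continue;
      if (Math.abs(captions[j].start - caption.start) > timeThreshold) continue;
      const covers = normalized[j].includes(normalized[i]);
      const wins = normalized[j].length > normalized[i].length || j < i;
      if (covers && wins) return false;
    }
    return true;
  });
}
```

Applied to the four captions in the Before listing, this sketch keeps only the final, complete sentence. It compares every pair of captions, so it suits short inputs, and plain substring matching can discard a very short caption that merely happens to appear inside a neighbor. A similarity measure, which the similarityThreshold option suggests, would handle near matches more gracefully.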
How It Works
- Extracts API keys from YouTube video pages
- Calls YouTube's Innertube API directly (same API used by youtube.com)
- Fetches caption track URLs from video metadata
- Downloads VTT caption files directly from YouTube's servers
- Parses timestamps and text into a clean format
- Applies smart deduplication to remove overlapping segments
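The parsing step can be illustrated with a small WebVTT reader. This is a minimal sketch under stated assumptions, not yaytt's own parser: parseVttTimestamp and parseVtt are hypothetical names, cues are assumed to be separated by blank lines, and inline tags such as <c> are simply stripped.

```typescript
// Local copy of the Caption shape documented under Types.
interface Caption {
  start: number;
  dur: number;
  text: string;
}

// Convert "hh:mm:ss.mmm" or "mm:ss.mmm" into seconds.
function parseVttTimestamp(timestamp: string): number {
  return timestamp
    .trim()
    .split(":")
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

// Hypothetical sketch: turn WebVTT text into Caption objects.
function parseVtt(vtt: string): Caption[] {
  const captions: Caption[] = [];
  const blocks = vtt.replace(/\r\n/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // WEBVTT header, NOTE, or STYLE block
    const [startRaw, endRaw] = lines[timingIndex].split("-->");
    const start = parseVttTimestamp(startRaw);
    // Cue settings (for example "align:start") may follow the end timestamp.
    const end = parseVttTimestamp(endRaw.trim().split(/\s+/)[0]);
    const text = lines
      .slice(timingIndex + 1)
      .join(" ")
      .replace(/<[^>]+>/g, "") // strip inline tags
      .trim();
    if (text) captions.push({ start, dur: end - start, text });
  }
  return captions;
}
```

The output of a parser like this is what the deduplication step would then operate on.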
Requirements
- Node.js 16+ or compatible runtime
- Server-side only (not for browser use due to CORS)
Error Handling
import { extractCaptions, CaptionExtractionError } from "yaytt";
try {
  const captions = await extractCaptions("invalid-video-id");
} catch (error) {
  if (error instanceof CaptionExtractionError) {
    console.error(`Caption extraction failed: ${error.message}`);
    console.error(`Video ID: ${error.videoId}`);
  }
}
License
MIT