JSPM

@osiris-ai/youtube-captions-sdk

0.1.0

A JavaScript/TypeScript SDK for fetching YouTube video transcripts and captions without API keys or headless browsers

Package Exports

  • @osiris-ai/youtube-captions-sdk

Readme

✨ YouTube Captions SDK ✨

A modern JavaScript/TypeScript SDK for fetching YouTube video transcripts and captions. No API keys required, no headless browsers needed!

This SDK is inspired by the popular Python youtube-transcript-api and provides the same functionality for JavaScript/TypeScript applications.

🚀 Features

  • No API Keys Required - Works directly with YouTube's internal API
  • No Headless Browser - Pure HTTP requests for maximum performance
  • Auto-generated Transcripts - Support for both manual and auto-generated subtitles
  • Translation Support - Translate transcripts to different languages
  • TypeScript Support - Full TypeScript definitions included
  • Modern ESM - Built with modern JavaScript standards
  • Lightweight - Minimal dependencies for fast installation

📦 Installation

npm install @osiris-ai/youtube-captions-sdk
yarn add @osiris-ai/youtube-captions-sdk
pnpm add @osiris-ai/youtube-captions-sdk

🎯 Quick Start

import { TranscriptList } from '@osiris-ai/youtube-captions-sdk';

// Fetch transcript for a YouTube video
const videoId = 'dQw4w9WgXcQ'; // Rick Astley - Never Gonna Give You Up
const transcriptList = await TranscriptList.fetch(videoId);

// Find an English transcript ('en' first, then 'en-US')
const transcript = transcriptList.find(['en', 'en-US']);
const fetchedTranscript = await transcript.fetch();

// Access the transcript data
console.log(fetchedTranscript.snippets);
/*
[
  {
    text: "Never gonna give you up",
    start: 0.0,
    duration: 3.5
  },
  {
    text: "Never gonna let you down",
    start: 3.5,
    duration: 4.2
  }
  // ... more snippets
]
*/

📚 API Documentation

TranscriptList

The main entry point for fetching transcripts.

TranscriptList.fetch(videoId: string): Promise<TranscriptList>

Fetches all available transcripts for a YouTube video.

const transcriptList = await TranscriptList.fetch('dQw4w9WgXcQ');

Parameters:

  • videoId (string): YouTube video ID (not the full URL)

Returns: Promise resolving to a TranscriptList instance
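
Because `fetch` expects a bare video ID, code that starts from a pasted link has to extract the ID first. The helper below is a minimal sketch and is not part of the SDK: the name `extractVideoId`, the URL shapes it recognises (`/watch?v=`, `youtu.be/`, `/embed/`, `/shorts/`), and the 11-character ID pattern are all assumptions made for illustration.

```typescript
// Hypothetical helper, not an SDK export: pull a video ID out of common
// YouTube URL shapes, or accept a bare ID unchanged.
// Assumption: IDs are 11 characters from the URL-safe base64 alphabet.
const VIDEO_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (VIDEO_ID_PATTERN.test(trimmed)) return trimmed;

  let url: URL;
  try {
    url = new URL(trimmed);
  } catch {
    return null; // neither a bare ID nor a parseable URL
  }

  const host = url.hostname.replace(/^www\./, '');
  let candidate: string | null = null;

  if (host === 'youtu.be') {
    // Short links carry the ID as the first path segment
    candidate = url.pathname.split('/')[1] ?? null;
  } else if (host === 'youtube.com' || host === 'm.youtube.com') {
    if (url.pathname === '/watch') {
      candidate = url.searchParams.get('v');
    } else {
      const [, kind, id] = url.pathname.split('/');
      if (kind === 'embed' || kind === 'shorts') candidate = id ?? null;
    }
  }

  return candidate !== null && VIDEO_ID_PATTERN.test(candidate) ? candidate : null;
}
```

A caller would check for `null` before handing the result to `TranscriptList.fetch`.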

transcriptList.find(languageCodes: string[], preferGenerated?: boolean): Transcript

Finds a transcript in the specified languages.

// Find English transcript (manual preferred)
const manualFirst = transcriptList.find(['en', 'en-US']);

// Find transcript, preferring auto-generated
const generatedFirst = transcriptList.find(['en', 'en-US'], true);

Parameters:

  • languageCodes (string[]): Array of language codes to search for
  • preferGenerated (boolean, optional): Whether to prefer auto-generated transcripts

Returns: Transcript instance
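
The README does not spell out how `find` orders its candidates. The sketch below shows one plausible reading of the two parameters over a local `TranscriptInfo` type: languages are tried in the order given, and within a language the preferred kind wins. It is an illustration only, not the SDK's implementation; `pickTranscript` and `TranscriptInfo` are hypothetical names.

```typescript
// Illustrative stand-in for two of the fields documented on Transcript.
interface TranscriptInfo {
  languageCode: string;
  isGenerated: boolean;
}

// Assumed behaviour: walk languageCodes in order; within a language, try the
// preferred kind (manual by default) before falling back to the other kind.
function pickTranscript(
  available: TranscriptInfo[],
  languageCodes: string[],
  preferGenerated = false,
): TranscriptInfo {
  for (const code of languageCodes) {
    const matches = available.filter(t => t.languageCode === code);
    const preferred = matches.find(t => t.isGenerated === preferGenerated);
    const chosen = preferred ?? matches[0];
    if (chosen) return chosen;
  }
  throw new Error(`Transcript not found for: ${languageCodes.join(', ')}`);
}
```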

Transcript

Represents a single transcript in a specific language.

Properties

  • videoId (string): The YouTube video ID
  • language (string): Human-readable language name
  • languageCode (string): Language code (e.g., 'en', 'es')
  • isGenerated (boolean): Whether this is an auto-generated transcript
  • isTranslatable (boolean): Whether this transcript can be translated

transcript.fetch(preserve?: boolean): Promise<FetchedTranscript>

Fetches the actual transcript content.

const fetchedTranscript = await transcript.fetch();

// Preserve HTML formatting in transcript text
const formattedTranscript = await transcript.fetch(true);

Parameters:

  • preserve (boolean, optional): Whether to preserve HTML formatting

Returns: Promise resolving to a FetchedTranscript object

transcript.translate(languageCode: string): Transcript

Translates the transcript to another language.

const englishTranscript = transcriptList.find(['en']);
const spanishTranscript = englishTranscript.translate('es');
const spanishContent = await spanishTranscript.fetch();

Parameters:

  • languageCode (string): Target language code

Returns: New Transcript instance for the translated content

FetchedTranscript

The actual transcript content with timing information.

interface FetchedTranscript {
  snippets: FetchedTranscriptSnippet[];
  videoId: string;
  language: string;
  languageCode: string;
  isGenerated: boolean;
}

interface FetchedTranscriptSnippet {
  text: string;
  start: number;      // Start time in seconds
  duration: number;   // Duration in seconds
}
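
Since every snippet carries its own `start` and `duration`, timing-based post-processing needs nothing beyond these two interfaces. As a rough sketch, the code below groups snippets into fixed time windows and computes the covered duration. `groupByWindow`, `totalDuration`, and the local `Snippet` type (a copy of the documented snippet shape) are hypothetical names, not SDK exports.

```typescript
// Local copy of the documented snippet shape so the example is self-contained.
interface Snippet {
  text: string;
  start: number;    // seconds
  duration: number; // seconds
}

interface Chunk {
  start: number; // window start in seconds
  text: string;  // snippet texts in that window, joined by spaces
}

// Bucket each snippet by the window its start time falls into.
function groupByWindow(snippets: Snippet[], windowSeconds: number): Chunk[] {
  if (windowSeconds <= 0) throw new RangeError('windowSeconds must be positive');
  const buckets = new Map<number, string[]>();
  for (const s of snippets) {
    const windowStart = Math.floor(s.start / windowSeconds) * windowSeconds;
    const texts = buckets.get(windowStart) ?? [];
    texts.push(s.text);
    buckets.set(windowStart, texts);
  }
  return Array.from(buckets.entries())
    .sort(([a], [b]) => a - b)
    .map(([start, texts]) => ({ start, text: texts.join(' ') }));
}

// End time of the last snippet; assumes snippets are ordered by start time.
function totalDuration(snippets: Snippet[]): number {
  const last = snippets[snippets.length - 1];
  return last ? last.start + last.duration : 0;
}
```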

🔧 Advanced Usage

Working with Multiple Languages

const transcriptList = await TranscriptList.fetch('dQw4w9WgXcQ');

// Try multiple languages in order of preference
try {
  const transcript = transcriptList.find(['en', 'en-US', 'es', 'fr']);
  const content = await transcript.fetch();
  console.log(`Found transcript in: ${transcript.language}`);
} catch (error) {
  console.log('No transcript found in preferred languages');
}

Iterating Through All Available Transcripts

const transcriptList = await TranscriptList.fetch('dQw4w9WgXcQ');

for (const transcript of transcriptList) {
  console.log(`${transcript.language} (${transcript.languageCode})`);
  console.log(`Generated: ${transcript.isGenerated}`);
  console.log(`Translatable: ${transcript.isTranslatable}`);
}

Translation Example

const transcriptList = await TranscriptList.fetch('dQw4w9WgXcQ');
const englishTranscript = transcriptList.find(['en']);

if (englishTranscript.isTranslatable) {
  // Translate to Spanish
  const spanishTranscript = englishTranscript.translate('es');
  const spanishContent = await spanishTranscript.fetch();
  
  console.log('Spanish transcript:', spanishContent.snippets);
}

Processing Transcript Content

const transcriptList = await TranscriptList.fetch('dQw4w9WgXcQ');
const transcript = transcriptList.find(['en']);
const content = await transcript.fetch();

// Get full text
const fullText = content.snippets.map(s => s.text).join(' ');

// Get text for a specific time range (e.g., first 30 seconds)
const first30Seconds = content.snippets
  .filter(s => s.start < 30)
  .map(s => s.text)
  .join(' ');

// Format as SRT-style subtitles
const srtFormat = content.snippets
  .map((snippet, index) => {
    const start = formatTime(snippet.start);
    const end = formatTime(snippet.start + snippet.duration);
    return `${index + 1}\n${start} --> ${end}\n${snippet.text}\n`;
  })
  .join('\n');

function formatTime(seconds: number): string {
  // Work in whole milliseconds to avoid floating-point drift (e.g. 1.001 % 1)
  const totalMs = Math.round(seconds * 1000);
  const h = Math.floor(totalMs / 3_600_000);
  const m = Math.floor((totalMs % 3_600_000) / 60_000);
  const s = Math.floor((totalMs % 60_000) / 1000);
  const ms = totalMs % 1000;
  const pad = (n: number, width = 2) => n.toString().padStart(width, '0');
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}
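
Another common processing step is searching the snippets for a phrase and linking to the moment it is spoken. The following is a minimal sketch under stated assumptions: `searchTranscript` is a hypothetical helper, the local `Snippet` type copies the documented snippet shape, and matching is a plain case-insensitive substring test.

```typescript
// Local copy of the documented snippet shape so the example is self-contained.
interface Snippet {
  text: string;
  start: number;    // seconds
  duration: number; // seconds
}

interface Hit {
  start: number; // snippet start in seconds
  text: string;
  url: string;   // link that opens the video at the snippet's start
}

// Case-insensitive substring search over snippet text.
function searchTranscript(videoId: string, snippets: Snippet[], query: string): Hit[] {
  const needle = query.toLowerCase();
  return snippets
    .filter(s => s.text.toLowerCase().includes(needle))
    .map(s => ({
      start: s.start,
      text: s.text,
      // The `t` query parameter takes whole seconds
      url: `https://youtu.be/${videoId}?t=${Math.floor(s.start)}`,
    }));
}
```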

🛠️ Error Handling

The SDK throws descriptive errors for common issues:

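import { TranscriptList } from '@osiris-ai/youtube-captions-sdk';

// `error` is `unknown` in strict TypeScript, so narrow it before reading `.message`
const messageOf = (error: unknown): string =>
  error instanceof Error ? error.message : String(error);

try {
  const transcriptList = await TranscriptList.fetch('invalid_video_id');
} catch (error) {
  if (messageOf(error).includes('YouTube request failed')) {
    console.log('Video not found or private');
  }
}

// The remaining examples need a transcript list that loaded successfully
const transcriptList = await TranscriptList.fetch('dQw4w9WgXcQ');

try {
  const transcript = transcriptList.find(['nonexistent_language']);
} catch (error) {
  if (messageOf(error).includes('Transcript not found')) {
    console.log('No transcript available in requested language');
  }
}

const transcript = transcriptList.find(['en']);

try {
  const translated = transcript.translate('invalid_code');
} catch (error) {
  if (messageOf(error).includes('Translation unavailable')) {
    console.log('Translation to this language is not supported');
  }
}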
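
When the same checks recur across a codebase, the message matching can live in one place. The sketch below maps the three message fragments shown in this section onto a small union type. `classifyError` and `CaptionErrorKind` are hypothetical names; the SDK is not known to export error classes or codes, so matching on message text is an assumption carried over from the examples.

```typescript
type CaptionErrorKind =
  | 'request_failed'
  | 'transcript_not_found'
  | 'translation_unavailable'
  | 'unknown';

// Message fragments taken from this section's examples; the mapping is a
// convenience for illustration, not an SDK API.
const MESSAGE_KINDS: Array<[string, CaptionErrorKind]> = [
  ['YouTube request failed', 'request_failed'],
  ['Transcript not found', 'transcript_not_found'],
  ['Translation unavailable', 'translation_unavailable'],
];

function classifyError(error: unknown): CaptionErrorKind {
  // Anything that is not an Error instance carries no message to inspect
  if (!(error instanceof Error)) return 'unknown';
  for (const [fragment, kind] of MESSAGE_KINDS) {
    if (error.message.includes(fragment)) return kind;
  }
  return 'unknown';
}
```

A `switch` over the returned kind then replaces the repeated `includes` checks in each `catch` block.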

🎨 TypeScript Support

The SDK is built with TypeScript and provides full type definitions:

import { 
  TranscriptList, 
  Transcript, 
  FetchedTranscript, 
  FetchedTranscriptSnippet 
} from '@osiris-ai/youtube-captions-sdk';

// Annotations are spelled out for illustration; TypeScript infers the same types without them
const transcriptList: TranscriptList = await TranscriptList.fetch('dQw4w9WgXcQ');
const transcript: Transcript = transcriptList.find(['en']);
const content: FetchedTranscript = await transcript.fetch();
const snippets: FetchedTranscriptSnippet[] = content.snippets;

🌍 Supported Languages

The SDK can fetch transcripts in any language YouTube offers for a given video, across three kinds of transcript:

  • Manual transcripts: Created by video uploaders
  • Auto-generated transcripts: Created by YouTube's speech recognition
  • Translated transcripts: Available for many videos in multiple languages

Common language codes:

  • en - English
  • es - Spanish
  • fr - French
  • de - German
  • it - Italian
  • pt - Portuguese
  • ru - Russian
  • ja - Japanese
  • ko - Korean
  • zh - Chinese
  • And many more...
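
Region-qualified codes such as `en-US` appear alongside base codes such as `en` in the examples. Whether `find` treats the two as interchangeable is not documented here, so the sketch below assumes exact matching and builds a fallback list that tries each code and then its base language. `withBaseLanguages` is a hypothetical helper, not part of the SDK.

```typescript
// For each code, also try its base language ('en-US' -> 'en'),
// keeping the caller's order and dropping duplicates.
function withBaseLanguages(codes: string[]): string[] {
  const result: string[] = [];
  for (const code of codes) {
    const base = code.split('-')[0];
    for (const candidate of [code, base]) {
      if (candidate && !result.includes(candidate)) result.push(candidate);
    }
  }
  return result;
}
```

The returned array can be passed directly as the `languageCodes` argument of `find`.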

⚠️ Important Notes

  1. Video ID Format: Use the video ID (e.g., dQw4w9WgXcQ), not the full URL
  2. Rate Limiting: YouTube may rate limit requests. Consider implementing delays for bulk operations
  3. API Changes: This uses YouTube's internal API, which may change without notice
  4. Availability: Not all videos have transcripts available
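
For the rate-limiting note, one common approach is to retry failed requests with exponentially growing delays. The following is a generic sketch, not SDK functionality: `backoffDelays`, `sleep`, and `withRetry` are hypothetical helpers, and the default delay values are arbitrary placeholders rather than limits published by YouTube.

```typescript
// Exponential backoff schedule: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelays(retries: number, baseMs = 500, maxMs = 8000): number[] {
  return Array.from({ length: retries }, (_, i) => Math.min(baseMs * 2 ** i, maxMs));
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Run `task`; on failure wait for the next delay and try again.
// After the final attempt fails, rethrow the last error.
async function withRetry<T>(task: () => Promise<T>, delays: number[]): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < delays.length) await sleep(delays[attempt] ?? 0);
    }
  }
  throw lastError;
}
```

A bulk job could then wrap each call, for example `withRetry(() => TranscriptList.fetch(id), backoffDelays(3))`, and process video IDs one at a time rather than in parallel.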

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Inspired by the Python youtube-transcript-api.

Made with ❤️ by Osiris Labs