    vidclaude

    Multimodal video understanding for Claude Code. Extract frames, transcribe audio in 90+ languages, build temporal timelines — all from a single command. No API key needed.

    npm install -g vidclaude
    vidclaude video.mp4 --mode standard --verbose

    Prerequisites

    • Python 3.10+ (python.org)
    • ffmpeg — Windows: winget install ffmpeg / macOS: brew install ffmpeg / Linux: sudo apt install ffmpeg

    Python dependencies (Pillow, faster-whisper) are installed automatically during npm install.

    Install

    npm install -g vidclaude

    Or use without installing:

    npx vidclaude video.mp4 --mode standard --verbose

    Usage

    # Set up the skill in your project (one time)
    vidclaude --install-skill
    
    # Then in Claude Code, just say:
    # "analyze the video at path/to/video.mp4"
    # "what does the speaker say about the budget?"
    # "when does the chart appear on screen?"

    Analysis runs under your Claude Max/Pro plan, with no separate API key. Results are cached, so follow-up questions are answered instantly.

    Standalone CLI

    # Standard analysis
    vidclaude video.mp4 --mode standard --verbose
    
    # Quick (fewer frames, faster)
    vidclaude video.mp4 --mode quick
    
    # Deep (dense frames, full OCR)
    vidclaude video.mp4 --mode deep --verbose
    
    # Batch process a folder
    vidclaude ./videos/ --verbose
    
    # Skip audio / force fresh extraction
    vidclaude video.mp4 --no-audio --no-cache

    Modes

    Mode      Frames                 Whisper model  Best for
    quick     ~20                    base           Short clips, fast overview
    standard  ~60, shot-aware        large-v3       General use
    deep      ~150, burst sampling   large-v3       Long videos, detailed review
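
    The table above amounts to a small set of presets. A hedged sketch of how they might be represented (the field names here are illustrative, not vidclaude's internal API):

```python
# Illustrative mode presets derived from the Modes table.
# Values mirror the README; vidclaude's actual internals may differ.
MODE_PRESETS = {
    "quick":    {"target_frames": 20,  "whisper_model": "base",     "sampling": "uniform"},
    "standard": {"target_frames": 60,  "whisper_model": "large-v3", "sampling": "shot-aware"},
    "deep":     {"target_frames": 150, "whisper_model": "large-v3", "sampling": "burst"},
}

def preset_for(mode: str) -> dict:
    """Return the preset for a mode, defaulting to 'standard'."""
    return MODE_PRESETS.get(mode, MODE_PRESETS["standard"])
```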

    What it extracts

    Every run creates a .vidcache/ directory:

    .vidcache/<hash>/
      evidence.md        ← Report for Claude to read
      frames/            ← Extracted JPEG frames
      transcript.json    ← Timestamped speech (90+ languages)
      timeline.json      ← Unified event timeline
      meta.json          ← Video metadata
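
    To script against a finished run, you can check for the artifacts listed above. A minimal sketch (file names come from the layout above; this helper is not part of vidclaude itself):

```python
from pathlib import Path

# Artifacts the README says each run writes under .vidcache/<hash>/.
EXPECTED = ["evidence.md", "transcript.json", "timeline.json", "meta.json"]

def cache_artifacts(run_dir: str) -> dict:
    """Map each expected artifact name to whether it exists in run_dir."""
    root = Path(run_dir)
    return {name: (root / name).is_file() for name in EXPECTED}
```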

    How it works

    1. Frames — Adaptive sampling with shot boundary detection via ffmpeg
    2. Audio — faster-whisper large-v3 transcription with auto language detection
    3. OCR — On-screen text extraction via pytesseract (optional)
    4. Timeline — Merges all modalities into a time-sorted event list
    5. Evidence — Generates evidence.md that Claude reads and reasons over
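
    The merge in step 4 can be sketched generically: tag each modality's events with a source and sort everything by timestamp. This illustrates the idea only; the event shapes are assumptions, not vidclaude's actual schema:

```python
def merge_timeline(frames, transcript, ocr=()):
    """Merge per-modality events into one time-sorted list.

    Each input is an iterable of dicts with at least a 't' key
    (seconds from the start of the video).
    """
    events = []
    for src, items in (("frame", frames), ("speech", transcript), ("ocr", ocr)):
        for item in items:
            events.append({"source": src, **item})
    return sorted(events, key=lambda e: e["t"])

# Example: two frames and one speech segment interleave by time.
timeline = merge_timeline(
    frames=[{"t": 0.0, "path": "frames/0001.jpg"},
            {"t": 5.0, "path": "frames/0002.jpg"}],
    transcript=[{"t": 2.5, "text": "Welcome to the demo."}],
)
```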

    CLI Reference

    vidclaude [input] [options]
    
      --install-skill               Set up Claude Code skill
      --mode {quick,standard,deep}  Processing mode (default: standard)
      -f, --fps N                   Override frames per second
      -m, --max-frames N            Override max frame count
      --no-audio                    Skip transcription
      --no-ocr                      Skip OCR
      --no-cache                    Force re-extraction
      --verbose                     Show progress
      -o FILE                       Write output to file
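
    To drive vidclaude from a script, you can assemble an argv list from the reference above and hand it to subprocess. A minimal sketch that uses only the documented flags (assumes vidclaude is on PATH):

```python
def build_command(video, mode="standard", audio=True, ocr=True,
                  cache=True, verbose=False, output=None):
    """Assemble a vidclaude argv list from the documented flags."""
    cmd = ["vidclaude", video, "--mode", mode]
    if not audio:
        cmd.append("--no-audio")
    if not ocr:
        cmd.append("--no-ocr")
    if not cache:
        cmd.append("--no-cache")
    if verbose:
        cmd.append("--verbose")
    if output:
        cmd += ["-o", output]
    return cmd

# Execute with: subprocess.run(build_command("video.mp4", verbose=True))
```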

    Also available via pip

    pip install vidclaude

    License

    MIT