JSPM

Found 346 results for whisper

open-agents-ai

AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop

    • v0.187.552
    • 58.13
    • Published

    glin-profanity

    Glin-Profanity is a lightweight and efficient npm package designed to detect and filter profane language in text inputs across multiple languages. Whether you’re building a chat application, a comment section, or any platform where user-generated content

    • v3.3.0
    • 56.55
    • Published

    node-av

    FFmpeg bindings for Node.js

    • v5.2.3
    • 55.13
    • Published

    whisper.rn

    React Native binding of whisper.cpp

    • v0.5.5
    • 53.85
    • Published

    modelfusion

    The TypeScript library for building AI applications.

    • v0.137.0
    • 48.20
    • Published

    @fugood/whisper.node

    Another Node binding of whisper.cpp that keeps its API as close to whisper.rn's as possible.

    • v1.0.18
    • 46.59
    • Published

    smart-whisper

    Whisper.cpp Node.js binding with auto model offloading strategy.

    • v0.8.1
    • 46.38
    • Published

    @chengsokdara/use-whisper

    React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

    • v0.2.0
    • 44.26
    • Published

    create-byan-agent

    BYAN v2.8 - Intelligent AI agent creator with ELO trust system + scientific fact-check + Hermes universal dispatcher + native Claude Code integration (hooks, skills, MCP server). Multi-platform (Copilot CLI, Claude Code, Codex). Merise Agile + TDD + 64 Ma

    • v2.16.1
    • 43.78
    • Published

    minutes-sdk

    Conversation memory SDK — query meeting transcripts, decisions, and action items from any AI agent or application

    • v0.16.4
    • 42.92
    • Published

    @qvac/sdk

    QVAC SDK is the canonical entry point to develop AI applications with QVAC.

    • v0.10.2
    • 42.90
    • Published

    whisper

    A task-based automation app. Leiningen style.

    • v0.3.3
    • 42.04
    • Published

    minutes-mcp

    MCP server for minutes — conversation memory for AI assistants. Works with Claude Desktop, Mistral Vibe, Cursor, Windsurf, and any MCP client.

    • v0.16.4
    • 41.66
    • Published

    graphifyy

    AI coding assistant skill (Claude Code, Codex, Gemini CLI, GitHub Copilot CLI, Aider, OpenCode, OpenClaw) - turn any folder of code, docs, papers, images, or audio/video transcripts into a queryable knowledge graph

    • v0.7.5
    • 40.56
    • Published

    copilot-plus

    Voice + screenshots + model hotkeys + live agent monitor — drop-in wrapper for GitHub Copilot CLI

    • v1.0.28
    • 39.55
    • Published

    browser-whisper

    Browser-native audio transcription powered by WebGPU Whisper — zero server, fully local.

    • v1.0.1
    • 39.17
    • Published

    webtalk

    Buildless STT (Whisper WebGPU) + TTS (Pocket TTS ONNX) SDK

    • v1.0.46
    • 38.95
    • Published

    voice-router-dev

    Universal speech-to-text router for Gladia, AssemblyAI, Deepgram, Azure, OpenAI Whisper, Speechmatics, Soniox, and ElevenLabs

    • v0.9.4
    • 38.49
    • Published

    @wovin/tranz

    Audio transcription library with provider support and auto-splitting

    • v0.1.36
    • 38.28
    • Published

    aetherlight

    Voice-to-intelligence platform for developers. Voice capture, sprint planning with AI, bug/feature forms, pattern matching to prevent AI hallucinations.

    • v0.18.15
    • 38.09
    • Published

    claude-video-vision

    MCP server that gives Claude Code the ability to watch and understand videos — extracts frames via ffmpeg and processes audio via multiple backends

    • v1.2.1
    • 37.88
    • Published

    whisper-windows-mcp

    Windows-native MCP server for local audio transcription using whisper.cpp with Vulkan GPU acceleration

      • v2.2.2
      • 37.03
      • Published

      transcribe-cli

      Local audio/video transcription with speaker diarization and live audio support. No API keys. Powered by faster-whisper.

      • v2.0.2
      • 36.99
      • Published

      @mouadja02/murmur

      Voice-first prompt engineering for vibe coders. Floating overlay that turns your voice into clean, structured prompts using local Whisper + your own local LLM.

      • v0.4.0
      • 36.04
      • Published

      faster-whisper-ts

      TypeScript port of SYSTRAN/faster-whisper for Node.js, built on CTranslate2, Koffi, FFmpeg, and ONNX Runtime.

      • v1.2.1
      • 34.43
      • Published

      @zhin.js/plugin-voice

      Voice input/output plugin for Zhin.js — STT via Whisper + TTS via edge-tts

      • v0.0.13
      • 34.14
      • Published

      cat-crawl

      Multi-channel Obsidian clipping and video transcription CLI (WeChat/YouTube/Douyin).

      • v0.2.10
      • 33.87
      • Published

      @hzttt/multimodal-rag

      OpenClaw plugin for multimodal RAG - semantic indexing and time-aware search for images and audio using local AI models

      • v0.5.3
      • 33.52
      • Published

      @framers/agentos-ext-voice-synthesis

      Voice synthesis and transcription tools for AgentOS via OpenAI, ElevenLabs, Deepgram, and local Ollama/Whisper-compatible runtimes

      • v2.0.1
      • 33.51
      • Published

      create-plaud-pipeline

      Bootstrap a PLAUD NotePin auto-recording pipeline (Whisper + Obsidian, macOS)

      • v1.2.0
      • 33.44
      • Published

      @casatwy/deyo

      CLI for submitting Deyo transcription jobs from the terminal

      • v0.1.7
      • 33.13
      • Published

      @asmostans/daeva

      Daeva — local GPU pod orchestrator for AI workloads

      • v0.2.6
      • 33.07
      • Published

      @polargrid/polargrid-sdk

      JavaScript/TypeScript SDK for PolarGrid Edge AI Infrastructure with Full API Support

      • v0.6.1
      • 33.02
      • Published

      voice-stream

      A powerful React hook for real-time voice streaming, designed for AI-powered applications. Perfect for real-time transcription, voice assistants, and audio processing with features like silence detection and configurable audio processing.

      • v1.0.1
      • 32.76
      • Published

      @renjfk/opencode-voice

      Speech-to-text and text-to-speech for OpenCode. Record voice prompts with whisper-cpp, hear responses via Piper TTS, with LLM normalization through any OpenAI-compatible endpoint.

      • v0.1.4
      • 32.34
      • Published

      speak2text

      speak2text CLI tool. Transcribe and translate audio and video files using OpenAI Whisper.

      • v0.2.3
      • 31.87
      • Published

      @pr0gramm/fluester

      Node.js bindings for OpenAI's Whisper. Optimized for CPU.

      • v0.9.15
      • 31.87
      • Published

      @umituz/web-cloudflare

      Comprehensive Cloudflare Workers & Pages integration with config-based patterns, middleware, router, workflows, AI (with audio/music generation, TTS, ASR), React hooks, and multi-tenant support

      • v1.7.8
      • 31.82
      • Published

      opencode-voice

      Speech-to-text plugin for OpenCode — voice input with Deepgram, Groq, and OpenAI Whisper

        • v0.1.4
        • 31.49
        • Published

        vidclaude

        Multimodal video understanding for Claude Code — extract frames, transcribe audio, build timelines from any video

          • v0.2.4
          • 31.48
          • Published

          typelessform-widget

          Voice input widget for HTML forms. Users speak once — AI fills all fields at once. Drop-in for React, Vue, Angular, Next.js, WordPress. 25+ languages, 96% accuracy.

          • v1.0.7
          • 31.28
          • Published

          create-axiom-body

          Give any AI agent a physical body — eyes, ears, voice, face. Patent Pending. One command install.

          • v2.1.0
          • 31.28
          • Published

          whisper.cpp

          Whisper speech recognition

          • v1.0.3
          • 30.94
          • Published

          remotion-media-mcp

          MCP server for AI media generation in Remotion projects - images, videos, music, sound effects, speech, and subtitles

          • v1.2.2
          • 30.56
          • Published

          whispermix

          🎙️ WhisperMix is a versatile module for transcribing audio using OpenAI’s Whisper or Groq’s Whisper v3 model.

          • v1.4.8
          • 30.46
          • Published

          whisper-web-transcriber

          Real-time audio transcription in the browser using OpenAI's Whisper model via WebAssembly

          • v0.2.5
          • 30.42
          • Published

          create-heed

          Set up heed — local-first meeting transcription with real speaker diarization. One command, everything installs.

          • v0.1.3
          • 30.38
          • Published

          assistvideo

          CLI that downloads video/audio from a URL (YouTube today, more coming) and transcribes to markdown using local whisper.cpp. Drop the URL, get an MP3, MP4, or transcript — in your current folder.

            • v0.1.4
            • 30.31
            • Published

            @dymoo/media-understanding

            MCP server that converts audio/video/image to text + images for LLM consumption

            • v1.1.0
            • 30.25
            • Published

            @fontsource/whisper

            Self-host the Whisper font in a neatly bundled NPM package.

            • v5.2.8
            • 30.23
            • Published

            omnivad

            Cross-platform Voice Activity Detection and Audio Event Detection via WebAssembly. Runs in browsers, Web Workers, and Node.js. Built on FireRedVAD. Whisper-ready chunking included.

            • v0.2.12
            • 29.97
            • Published

            yt2blog

            Transform YouTube videos into polished blog posts using AI

            • v1.0.2
            • 29.53
            • Published

            whisper-cpp-node

            Node.js bindings for whisper.cpp - fast speech-to-text with GPU acceleration

            • v0.2.12
            • 29.52
            • Published

            react-native-smart-ai

            The easiest way to add AI-powered chat, speech recognition, text-to-speech, and image analysis to React Native apps.

            • v1.0.2
            • 29.04
            • Published

            tracegist-mcp-bridge

            Local-first MCP bridge for reading and transcribing TraceGist package zips.

              • v0.3.2
              • 28.98
              • Published

              @orka-js/realtime

              Voice agent for OrkaJS — STT → LLM → TTS pipeline with WebSocket support

                • v1.5.1
                • 28.59
                • Published

                n8n-nodes-deapi

                n8n community node for deAPI - AI image generation, video generation, transcription and prompt optimization

                • v0.4.0
                • 28.57
                • Published

                mcp-listen

                Give your AI agents the ability to listen. Microphone capture and speech-to-text tools for MCP-compatible agents.

                • v0.1.3
                • 28.53
                • Published

                aicw-video

                AICW Video is an open-source toolkit with a CLI, MCP server, and web hub for turning videos into short clips for TikTok, Instagram, YouTube Shorts and other short video platforms.

                • v1.0.3
                • 28.50
                • Published

                monsterapi

                monsterapi is a JavaScript client library for interacting with the Monster API. It provides an easy way to access the API's features and integrate them into your applications.

                • v0.0.5
                • 28.14
                • Published

                claude-video-install

                Installer for the claude-video Claude Code skill — teach Claude Code to watch videos.

                • v0.1.1
                • 28.00
                • Published

                @wiro-ai/n8n-nodes-wiroai

                n8n community node for Wiro AI — 290+ AI models: video, image, audio, LLM, 3D, and more.

                • v2.0.1
                • 27.81
                • Published

                @crafter/trx

                Agent-first CLI for audio/video transcription via Whisper

                • v0.4.0
                • 27.80
                • Published

                whspr

                CLI tool for audio transcription with Groq Whisper API

                • v1.2.0
                • 27.79
                • Published

                claude-video-analyzer

                MCP server that enables Claude Code to analyze video files (@video.mp4) by extracting frames and audio for vision and STT analysis.

                • v1.0.1
                • 27.49
                • Published

                noosphere

                Unified AI creation engine — text, image, video, audio across all providers

                • v0.9.3
                • 27.48
                • Published

                samuraizer

                Local-first CLI that turns meeting recordings into transcripts, summaries, action items, and decisions

                • v0.2.0
                • 27.46
                • Published

                pi-whisper-voice

                Minimal hold-SPACE voice input for Pi using an OpenAI-compatible Whisper/STT endpoint.

                • v0.2.0
                • 27.43
                • Published

                @sqliteai/sqlite-ai

                SQLite AI extension for Node.js - On-device inference, embedding generation, and model interaction directly into your database

                • v1.0.4
                • 27.24
                • Published

                ghostshell-ai

                AI-powered real-time interview copilot. Captures audio, transcribes with Whisper, generates answers with Gemma AI -- all in an invisible overlay.

                • v1.0.1
                • 27.09
                • Published

                twitchps

                Library to easily interact with Twitch PubSub System

                • v1.6.0
                • 26.95
                • Published

                whisper-mcp

                MCP server for local audio transcription using Whisper

                • v0.1.1
                • 26.68
                • Published

                nwhisper

                Native Node.js bindings for OpenAI's Whisper using whisper.cpp. High-performance local speech-to-text with custom model support.

                • v0.3.0
                • 26.43
                • Published

                openclaw-video-generator

                Automated video generation pipeline with OpenAI TTS, Whisper, and Remotion - from text script to professional short videos

                • v1.6.2
                • 26.30
                • Published

                transcribly

                CLI tool to transcribe YouTube videos and local audio/video files using OpenAI Whisper API

                  • v1.0.3
                  • 26.24
                  • Published

                  n8n-nodes-transcribe-audio

                  Perform speech-to-text on audio files within your n8n workflows. This node provides local audio transcription; no internet or third-party APIs are required for processing.

                  • v0.1.23
                  • 26.21
                  • Published

                  overture-mcp

                  MCP Server for Claude Code, Cursor, Cline, Copilot, Github Copilot, Windsurf - Visual AI Agent Plan Execution, Approval Workflow, Plan Visualization, Agent Orchestration. See what your AI is thinking before it writes code. Works with Claude, GPT, Gemini,

                  • v0.1.8
                  • 26.12
                  • Published

                  @omnizap/sdk

                  Platform-agnostic AI agent SDK for chat applications — WhatsApp, Telegram, Messenger, or any custom channel.

                  • v1.1.3
                  • 25.96
                  • Published

                  whisperai-sdk

                  TypeScript SDK for WhisperAI: methods and interfaces for interacting with the service without external runtime dependencies.

                  • v1.0.2
                  • 25.91
                  • Published

                  create-gicellbot

                  Scaffold a self-hosted AI voice bot for Free4Talk voice rooms

                  • v1.1.0
                  • 25.90
                  • Published

                  @orka-js/multimodal

                  Multimodal utilities and agents for OrkaJS - Vision, Audio, Cross-modal workflows

                    • v3.0.1
                    • 25.88
                    • Published

                    pi-dj

                    AI music suite for pi — YouTube, global radio (30k+ stations), Suno, Lyria AI, SoundCloud/Bandcamp, mix, trim, BPM. Windows + macOS + Linux + Termux.

                    • v3.2.4
                    • 25.79
                    • Published

                    @albertsyh/use-whisper

                    React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                    • v0.2.17
                    • 25.51
                    • Published

                    krio-stt

                    Speech-to-text for Krio and other African languages — powered by Whisper large-v3 via Hugging Face

                    • v1.0.0
                    • 25.40
                    • Published

                    whisper-tnode

                    Library for using whisper.cpp in Node.js or TypeScript projects

                    • v1.3.2
                    • 25.37
                    • Published

                    voicesmith-mcp

                    Local AI voice for coding assistants — TTS & STT via MCP. Kokoro ONNX + faster-whisper, fully offline.

                    • v1.0.19
                    • 25.20
                    • Published

                    voice2text

                    speech to text functionality with minimum configuration and maximum compatibility

                    • v0.5.6
                    • 25.02
                    • Published

                    koishi-plugin-whisper-asr

                    [openai whisper-asr](https://github.com/ahmetoner/whisper-asr-webservice) speech recognition service; supports over a hundred languages plus translation, and works with wechaty voice messages

                    • v1.0.4
                    • 24.53
                    • Published

                    tubewords

                    Extract YouTube video transcripts with AI summaries from the terminal

                    • v1.0.3
                    • 24.45
                    • Published

                    n8n-nodes-groq

                    N8N community node for Groq API - Speech-to-Text transcription using Whisper AI. Convert audio to text with high accuracy. Perfect for WhatsApp voice messages, audio files, and voice automation workflows.

                    • v0.2.0
                    • 24.35
                    • Published

                    ai-utils.js

                    Build AI applications, chatbots, and agents with JavaScript and TypeScript.

                    • v0.0.43
                    • 23.68
                    • Published

                    @lov3kaizen/agentsea-core

                    AgentSea - Unite and orchestrate AI agents. A production-ready ADK for building agentic AI applications with multi-provider support.

                    • v0.6.0
                    • 23.53
                    • Published

                    claude-transcribe

                    Claude Code plugin: transcribe video/audio from URLs (YouTube, X, TikTok, Vimeo, podcasts) or local media files using yt-dlp + OpenAI Whisper, fully on-device.

                    • v0.1.0
                    • 23.50
                    • Published

                    speekr

                    Practice speaking languages locally

                    • v0.0.1
                    • 23.43
                    • Published

                    tphim

                    TPHIM - Ultimate Video Pipeline: Download, Transcode HLS, AI Subtitles (with skip option), Resume Upload, and Cloud Upload.

                    • v2.5.1
                    • 23.17
                    • Published

                    redflower-ai

                    Official client for Redflower AI — open-source AI server with speech, vision, and language.

                    • v1.0.0
                    • 23.08
                    • Published

                    aac-voice-api

                    Games in which augmentative and alternative communication (AAC) devices are used as controllers are promising for increasing social inclusion of children who use these devices, such as minimally verbal autistic children. We want to build out an A

                    • v1.0.1
                    • 23.03
                    • Published

                    @ji8122s/use-whisper-test

                    React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby

                    • v0.0.62
                    • 23.02
                    • Published

                    clipping-cli

                    Public CLI for clipping long-form videos into short-form packages with ffmpeg, yt-dlp, local Whisper backends, and Cloudflare Workers AI planning.

                    • v0.2.0
                    • 22.96
                    • Published

                    @beckjiang/use-whisper

                    React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                    • v0.2.42
                    • 22.96
                    • Published

                    @zs-soft/ai-openai

                    AI OpenAI - OpenAI API implementation with Whisper STT support for AI engine abstraction

                    • v0.10.0
                    • 22.92
                    • Published

                    @derogab/stt-proxy

                    A simple and lightweight proxy for seamless integration with multiple STT (Speech-to-Text) providers including Whisper.cpp

                    • v0.3.1
                    • 22.87
                    • Published

                    agent-skill-ai-knowledge-base

                    An AI Knowledge Base skill for Gemini CLI and Claude Code with automated video transcription, presentation generation, and Obsidian vault integration.

                    • v2.6.0
                    • 22.67
                    • Published

                    @usewhisper/mcp-server

                    Model Context Protocol server for Whisper Context API - Connect Claude Desktop to your knowledge base

                    • v2.16.0
                    • 22.65
                    • Published

                    whisper-onnx-speech-to-text

                    Node.js plugin for speech recognition that works with OpenAI's Whisper models using ONNX.

                    • v1.0.1
                    • 22.59
                    • Published

                    @artale/pi-voice

                    Voice input for Pi. Multi-provider STT with Deepgram streaming, Groq Whisper, OpenAI Whisper. 56+ languages.

                    • v2.0.0
                    • 22.32
                    • Published

                    video-transcriber-mcp

                    MCP server for transcribing videos from 1000+ platforms (YouTube, Vimeo, TikTok, Twitter, etc.) or local video files using Whisper

                    • v1.1.1
                    • 22.25
                    • Published

                    morphclaude-senses

                    MCP server — ear, mouth, and eye for POLY HUD + ALSA

                    • v0.1.0
                    • 22.09
                    • Published

                    remotion-captioneer

                    Drop-in animated captions for Remotion. Audio to word-level synced subtitle components. Supports OpenAI, Groq, Deepgram, AssemblyAI.

                    • v0.9.0
                    • 22.06
                    • Published

                    escribano

                    AI-powered session intelligence tool — turn screen recordings into structured work summaries

                    • v0.5.0
                    • 21.88
                    • Published

                    react-native-whisper

                    React Native implementation of OpenAI's Whisper automatic speech recognition (ASR) model

                    • v0.0.1
                    • 21.83
                    • Published

                    reelsum

                    A powerful, minimalistic CLI tool to download, transcribe, and intelligently format speech from Instagram Reels using OpenAI Whisper and GPT.

                    • v1.0.8
                    • 21.82
                    • Published

                    @zssz-soft/ai-openai

                    AI OpenAI - OpenAI API implementation with Whisper STT support for AI engine abstraction

                    • v0.10.0
                    • 21.81
                    • Published

                    demo-shh

                    Distributed private messaging for a distributed country

                    • v0.5.3
                    • 21.28
                    • Published

                    @clawbow/mcp-whisper

                    MCP server for audio transcription with OpenAI API, whisper-cli, or whisper.cpp

                    • v1.1.4
                    • 20.95
                    • Published

                    audio-transcription-mcp

                    MCP server for real-time audio transcription using OpenAI Whisper

                    • v0.7.1
                    • 20.88
                    • Published

                    @usewhisper/sdk

                    TypeScript SDK for Whisper Context API - Add reliable context to your AI agents

                    • v3.11.0
                    • 20.57
                    • Published

                    meeting-notes-mcp

                    MCP Server for recording meetings and generating notes in Claude Code

                      • v0.4.1
                      • 20.50
                      • Published

                      @revizly/node-av

                      FFmpeg bindings for Node.js

                      • v5.2.3-revizly2
                      • 20.45
                      • Published

                      autonota

                      CLI to transcribe YouTube audio and summarize transcripts

                        • v0.2.2
                        • 20.25
                        • Published

                        lxrt

                        Local AI infrastructure for agent frameworks - Transformers.js wrapper with LLM, TTS, STT, Web Workers, React & Vue support

                        • v0.5.0
                        • 20.23
                        • Published

                        use-whisper

                        React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                        • v0.0.1
                        • 20.20
                        • Published

                        whisperjs

                        HTTP request functionality from within a Node.js application using preset Express routes and middleware

                        • v0.2.1
                        • 20.18
                        • Published

                        @samfp/pi-meeting-copilot

                        Live meeting transcription copilot for pi — captures audio via whisper-cpp on Mac, streams transcripts to your dev machine, and gives pi real-time meeting context.

                        • v0.1.0
                        • 20.12
                        • Published

                        zimujun

                        CLI for the coze-js-api whisper speech-to-text endpoint

                          • v0.1.1
                          • 20.02
                          • Published

                          @did-kr-cg/whisper-connect

                          Whisper Connect is a simple decentralized P2P connection solution (with no reliance on services such as Google Cloud). Log in to a desktop browser via the mobile app, or create transactions for smart contracts and send a signature and message via Whisper too. and it ca

                          • v0.1.5
                          • 20.01
                          • Published

                          whisperdb

                          Module to perform CRUD operations on Whisper, the RRD database.

                          • v0.1.3
                          • 19.87
                          • Published

                          @whitegodkingsley/arena-cli

                          AI-powered video clip generation tool for the terminal - Turn long-form content into viral clips

                          • v0.3.16
                          • 19.82
                          • Published

                          @cloudraker/use-whisper

                          React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                          • v0.3.0
                          • 19.78
                          • Published

                          @openpets/whisperkit

                          Transcribe and translate audio files using WhisperKit CLI. Default output is JSON with detailed segment data and timestamps. Includes tool to convert JSON to LLM-friendly markdown. Supports MP3, WAV, M4A, FLAC formats with multiple Whisper models. Runs co

                          • v1.0.0
                          • 19.66
                          • Published

                          n8n-nodes-puter-ai

                          Advanced n8n node for Puter.js AI with RAG agentic capabilities, document processing, audio transcription, Supabase integration, and cost-optimized model priorities

                          • v2.0.4
                          • 19.65
                          • Published

                          cuttledoc

                          Fast offline video-to-document transcription with AI enhancement

                            • v1.0.0
                            • 19.62
                            • Published

                            use-whisper-on-azure

                            React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                            • v0.2.9
                            • 19.50
                            • Published

                            smart-whisper-electron

                            Whisper.cpp Node.js binding with auto model offloading strategy.

                            • v0.8.2
                            • 19.39
                            • Published

                            @codearcade/subtitle-generator

                            Generate subtitles from audio files using OpenAI Whisper with support for SRT, VTT, and TXT formats. Automatically downloads required binaries and models, with cross-platform support and configurable performance options.

                            • v1.0.4
                            • 19.39
                            • Published

                            @billy1kaplan/use-whisper

                            React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                            • v0.3.10-bk
                            • 19.28
                            • Published

                            @illyism/transcribe

                            CLI tool to transcribe audio/video files to SRT format using OpenAI Whisper API

                            • v3.1.0
                            • 19.22
                            • Published

                            speech-opencode

                            Voice input plugin for OpenCode using OpenAI Whisper

                            • v1.2.0
                            • 19.15
                            • Published

                            @qubby/use-whisper-beta

                            React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby

                            • v0.0.27
                            • 18.93
                            • Published

                            audio-training

                            Audio parsing using Deepgram

                            • v1.5.0
                            • 18.91
                            • Published

                            @adamhancock/transcribe-cli

                            CLI tool for transcribing and summarizing MP4 recordings using Whisper and Ollama

                            • v1.0.4
                            • 18.79
                            • Published

                            whisper-coreml

                            OpenAI Whisper ASR for Node.js with CoreML/ANE acceleration on Apple Silicon

                            • v1.1.0
                            • 18.72
                            • Published

                            samarthya-bot

                            SamarthyaBot — Privacy-First Local Agentic AI Operating System. Self-hosted multi-agent RPA engine with Telegram, Discord, Web Dashboard, Puppeteer browser control, SSH deployment, encrypted memory, voice transcription, and Indian workflow automation (GST

                            • v2.2.1
                            • 18.62
                            • Published

                            @qubby/use-whisper

                            React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby

                            • v0.0.42
                            • 18.59
                            • Published

                            iflow-mcp-sixhq-overture

                            MCP Server for Claude Code, Cursor, Cline, Copilot, GitHub Copilot, Windsurf - Visual AI Agent Plan Execution, Approval Workflow, Plan Visualization, Agent Orchestration. See what your AI is thinking before it writes code. Works with Claude, GPT, Gemini,

                            • v0.1.8
                            • 18.52
                            • Published

                            n8n-nodes-zihin

                            n8n nodes for Zihin AI - Chat Model with Tool Calling, Image Analysis, Audio Transcription, Document Parsing

                            • v0.6.2
                            • 17.76
                            • Published

                            @mynamezxc/mow-speech-to-text

                            Advanced speech-to-text transcription tool using OpenAI Whisper with GPU acceleration support

                              • v1.2.1
                              • 17.64
                              • Published

                              @dhansoo/use-audio-stream

                              React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                              • v0.0.14
                              • 17.54
                              • Published

                              dikt

                              Voice dictation for the terminal.

                              • v1.4.1
                              • 17.43
                              • Published

                              web-asr-core

                              WebASR Core - Browser-based speech processing with VAD, WakeWord and Whisper - Unified all-in-one version

                              • v0.8.1
                              • 17.03
                              • Published

                              openclaw-drama-generator

                              Automated drama video generator - from script to multi-character drama videos with OpenAI TTS, Whisper, and Remotion

                              • v2.0.0
                              • 16.66
                              • Published

                              voxagent

                              Voice-powered terminal agent. Fully offline. Speak commands, get answers.

                              • v0.1.0
                              • 16.63
                              • Published

                              @whisper-security/whisper-api-sdk

                              Whisper Security Threat Intelligence API: The Whisper Security API provides comprehensive threat intelligence, geolocation data, and security operations capabilities for enterprise security teams and developers. Key Capabilities: Indicator Enric

                              • v0.2.3
                              • 16.60
                              • Published

                              openai-whisper-js

                              openai-whisper-js is a Node.js wrapper for the OpenAI Whisper library, enabling seamless audio transcription using Whisper models. This package simplifies the process of interacting with Whisper by providing a JavaScript interface to execute transcription

                              • v1.0.7
                              • 16.60
                              • Published

                              vrack-db

                              This is an in-memory database designed for storing time series (graphs).

                              • v3.0.2
                              • 16.55
                              • Published

                              doubleratchet

                              An Implementation of The Double Ratchet Algorithm designed by Open Whisper Systems

                              • v0.0.4
                              • 16.49
                              • Published

                              whisper-ws

                              Node SDK for whisper.ws

                              • v1.0.4
                              • 16.37
                              • Published

                              @codemasters/whisper

                              A lightweight, framework-independent notification library

                              • v1.5.6
                              • 16.30
                              • Published

                              markupr

                              Record your screen, narrate feedback, get structured Markdown with screenshots. Desktop app, CLI, and MCP server for AI coding agents like Claude Code, Cursor, and Windsurf.

                              • v2.6.8
                              • 16.25
                              • Published

                              @dvrosalesm/pi-notetaker

                              pi extension for meeting notes — record, transcribe locally with Whisper, and summarize with LLM

                              • v1.0.0
                              • 16.03
                              • Published

                              aivox

                              A lightweight CLI tool for translating voice to text using Whisper, seamlessly piping the transcribed text to any Unix-like command for versatile integration.

                              • v0.0.10
                              • 15.99
                              • Published

                              typeless-sdk

                              Node.js SDK: audio + custom vocabulary → polished text (STT + LLM)

                              • v0.2.2
                              • 15.96
                              • Published

                              expo-whisper

                              Expo plugin for OpenAI Whisper speech-to-text integration with React Native

                              • v1.0.10
                              • 15.96
                              • Published

                              statslog

                              High-performance statistics logger written in Node.js. Send a UDP packet to it and it records the datum.

                              • v1.0.4
                              • 15.80
                              • Published

                              airspeech

                              AI-powered speech-to-text and screen control

                              • v0.3.7
                              • 15.72
                              • Published

                              subtitles-generator

                              Generate subtitles easily with FFmpeg and Whisper

                              • v0.0.5
                              • 15.50
                              • Published

                              @krasnoperov/transcribe

                              CLI tool for audio/video transcription with speaker diarization, AI summarization, and infographic generation

                              • v1.1.0
                              • 15.41
                              • Published

                              n8n-nodes-nvidia-nim-whisper-v2

                              n8n community node for NVIDIA NIM Whisper Large V3 – speech recognition and translation via Riva gRPC API

                                • v0.1.3
                                • 15.20
                                • Published

                                scrub-cli

                                Turn videos into Agent Skills in seconds.

                                • v0.2.0
                                • 15.13
                                • Published

                                ai-input-react

                                React component for text/audio input with AI API integration. Framework-agnostic, works with Next.js, Vite, PHP, and any React setup.

                                • v1.0.0-beta.5
                                • 15.04
                                • Published

                                media-transcriber

                                Batch transcribe audio/video files using pluggable AI backends (Whisper, OpenAI API, and more)

                                • v1.0.2
                                • 15.03
                                • Published

                                tap2talk

                                Voice transcription at your fingertips - Instantly convert speech to text with a simple keyboard shortcut

                                • v5.1.7
                                • 14.95
                                • Published

                                youtube-scrap-mcp

                                MCP server for extracting YouTube video content with transcript processing.

                                • v0.1.1
                                • 14.73
                                • Published

                                @rightnow/sdk

                                TypeScript SDK for RightNow AI — Arabic-first AI inference API

                                • v0.2.3
                                • 14.65
                                • Published

                                aac-speech-recognition

                                Multi-API speech recognition library with confidence scoring for AAC applications

                                • v1.2.0
                                • 14.63
                                • Published

                                @burtonator/use-whisper

                                React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                                • v0.2.1
                                • 14.54
                                • Published

                                capsaicin

                                • v0.9.5
                                • 14.45
                                • Published

                                create-manor-minutes

                                Set up Minutes — AI meeting transcription in one command

                                • v1.0.0
                                • 14.45
                                • Published

                                cloud-asr-mcp

                                MCP server using audio multimodal models for transcription and output styling in one pass. Format-specific outputs (email, todo, blog, etc.) via OpenRouter, Voxtral, OpenAI, and Gemini.

                                • v0.4.0
                                • 14.37
                                • Published

                                @tuan_son.dinh/claude-voice

                                Voice MCP server for Claude Code — hands-free voice input/output via TTS + Whisper

                                • v0.1.1
                                • 14.24
                                • Published

                                @whisper-cpp-node/core

                                Node.js bindings for whisper.cpp - fast speech-to-text with GPU acceleration

                                • v0.2.0
                                • 14.20
                                • Published

                                @ainative/ai-kit-video

                                AI Kit - Video processing utilities including recording and transcription

                                • v0.1.1
                                • 14.16
                                • Published

                                @redlasha/talk-to

                                Korean voice MCP server for Claude Code - STT/TTS with local Whisper + Edge TTS

                                • v0.1.2
                                • 14.16
                                • Published

                                @misgara/ai-agent

                                Hexagonal library for AI agents

                                  • v1.2.0
                                  • 14.06
                                  • Published

                                  jasper-mini

                                  Local AI scene partner & life assistant — sovereign AI for creators. Runs on Mac Mini with voice I/O, scene management, and proactive scheduling. No cloud. No content policies. Your hardware, your model, your rules.

                                  • v0.1.3
                                  • 14.06
                                  • Published

                                  @chinchillaenterprises/mcp-recall

                                  Event-driven MCP server for Recall.ai meeting transcription with enhanced speaker identification and local storage

                                  • v1.1.0
                                  • 14.02
                                  • Published

                                  sttttsmodels

                                  STT (whisper-base), TTS (pocket-tts-onnx), and speaker embedding model files

                                    • v1.0.4
                                    • 13.79
                                    • Published

                                    @localmode/transformers

                                    Transformers.js provider for @localmode - implements all ML model interfaces

                                    • v2.0.0
                                    • 13.72
                                    • Published

                                    node-red-contrib-speech-to-text-ubos

                                    The speech-to-text API provides two endpoints, transcriptions and translations, based on our state-of-the-art open-source large-v2 Whisper model.

                                    • v1.0.2
                                    • 13.30
                                    • Published

                                    @robinpath/openai

                                    OpenAI integration — chat completions, embeddings, image generation, transcription, moderation. Uses the encrypted credential vault for API keys.

                                      • v0.3.0
                                      • 13.30
                                      • Published

                                      @hasna/transcriber

                                      Transcribe audio and video from files and URLs (YouTube, Vimeo, Wistia, etc.) using ElevenLabs, OpenAI Whisper, or Deepgram

                                      • v0.0.1
                                      • 13.05
                                      • Published

                                      using-whisper

                                      React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                                      • v0.1.8
                                      • 13.03
                                      • Published

                                      @synervoz/edgespeech

                                      React Native library for on-device voice processing with Switchboard SDK

                                      • v0.1.0
                                      • 13.00
                                      • Published

                                      node-ai-ragbot

                                      Node.js backend package for building AI chatbots and voicebots with Retrieval-Augmented Generation (RAG). It ingests website pages or local files (PDF, DOCX, TXT, MD), creates embeddings with LangChain + OpenAI, stores them in a fast in-memory vector data

                                      • v1.0.2
                                      • 12.98
                                      • Published

                                      dropin-feedback-widget

                                      Drop-in feedback widget for React — text + voice recording with Whisper transcription

                                      • v0.1.1
                                      • 12.98
                                      • Published

                                      @rkimball/use-whisper

                                      React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

                                      • v0.2.6
                                      • 12.71
                                      • Published

                                      whisper-nodejs-wrapper

                                      Node.js wrapper for OpenAI Whisper speech recognition with TypeScript support

                                      • v1.0.0
                                      • 12.71
                                      • Published

                                      autosub

                                      Automatically generate and overlay subtitles for any video.

                                      • v1.0.4
                                      • 12.58
                                      • Published

                                      soyle.rn

                                      React Native binding of whisper.cpp

                                      • v0.4.0-rc.8
                                      • 12.58
                                      • Published

                                      susurro-audio

                                      🎙️ Real-time conversational audio with AI transcription. Build ChatGPT-style voice interfaces in minutes with <300ms latency

                                      • v2.1.1
                                      • 12.43
                                      • Published

                                      whisper-clipboard-cli

                                      Own your transcription workflow. Press Cmd+Shift+X, speak, get text in clipboard instantly.

                                        • v1.1.1
                                        • 12.28
                                        • Published