Found 346 results for whisper

open-agents-ai

AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop

Glin-Profanity is a lightweight and efficient npm package designed to detect and filter profane language in text inputs across multiple languages. Whether you’re building a chat application, a comment section, or any platform where user-generated content

@seydx/node-av-linux-x64

node-av (linux-x64 binary)

node-av

FFmpeg bindings for Node.js

@seydx/node-av-darwin-arm64

node-av (darwin-arm64 binary)

@remotion/install-whisper-cpp

Helpers for installing and using Whisper.cpp

@seydx/node-av-darwin-x64

node-av (darwin-x64 binary)

whisper.rn

React Native binding of whisper.cpp

@seydx/node-av-linux-arm64

node-av (linux-arm64 binary)

@remotion/whisper-web

Helpers for using Whisper.cpp in browser using WASM

@seydx/node-av-win32-x64-msvc

node-av (win32-x64-msvc binary)

@qvac/transcription-whispercpp

transcription addon for qvac

modelfusion

The TypeScript library for building AI applications.

@fugood/whisper.node

Another Node binding of whisper.cpp that keeps the same API as whisper.rn as far as possible.

smart-whisper

Whisper.cpp Node.js binding with auto model offloading strategy.

@chengsokdara/use-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

create-byan-agent

BYAN v2.8 - Intelligent AI agent creator with ELO trust system + scientific fact-check + Hermes universal dispatcher + native Claude Code integration (hooks, skills, MCP server). Multi-platform (Copilot CLI, Claude Code, Codex). Merise Agile + TDD + 64 Ma

@fugood/node-whisper-linux-x64-cuda

Native module for another Node binding of whisper.cpp (linux-x64-cuda)

@fugood/node-whisper-linux-x64-vulkan

Native module for another Node binding of whisper.cpp (linux-x64-vulkan)

minutes-sdk

Conversation memory SDK — query meeting transcripts, decisions, and action items from any AI agent or application

@qvac/sdk

QVAC SDK is the canonical entry point to develop AI applications with QVAC.

@fugood/node-whisper-linux-x64

Native module for another Node binding of whisper.cpp (linux-x64)

whisper

A task-based automation app. Leiningen style.

silence-aware-recorder

Audio control with silence detection.

minutes-mcp

MCP server for minutes — conversation memory for AI assistants. Works with Claude Desktop, Mistral Vibe, Cursor, Windsurf, and any MCP client.

@fugood/node-whisper-darwin-arm64

Native module for another Node binding of whisper.cpp (darwin-arm64)

graphifyy

AI coding assistant skill (Claude Code, Codex, Gemini CLI, GitHub Copilot CLI, Aider, OpenCode, OpenClaw) - turn any folder of code, docs, papers, images, or audio/video transcripts into a queryable knowledge graph

@kutalia/whisper-node-addon

A GPU accelerated .node addon for whisper.cpp with prebuilt binaries

copilot-plus

Voice + screenshots + model hotkeys + live agent monitor — drop-in wrapper for GitHub Copilot CLI

browser-whisper

Browser-native audio transcription powered by WebGPU Whisper — zero server, fully local.

webtalk

Buildless STT (Whisper WebGPU) + TTS (Pocket TTS ONNX) SDK

voice-router-dev

Universal speech-to-text router for Gladia, AssemblyAI, Deepgram, Azure, OpenAI Whisper, Speechmatics, Soniox, and ElevenLabs

@wovin/tranz

Audio transcription library with provider support and auto-splitting

aetherlight

Voice-to-intelligence platform for developers. Voice capture, sprint planning with AI, bug/feature forms, pattern matching to prevent AI hallucinations.

claude-video-vision

MCP server that gives Claude Code the ability to watch and understand videos — extracts frames via ffmpeg and processes audio via multiple backends

@seydx/node-av-win32-x64-mingw

node-av (win32-x64-mingw binary)

whisper-windows-mcp

Windows-native MCP server for local audio transcription using whisper.cpp with Vulkan GPU acceleration

transcribe-cli

Local audio/video transcription with speaker diarization and live audio support. No API keys. Powered by faster-whisper.

@mouadja02/murmur

Voice-first prompt engineering for vibe coders. Floating overlay that turns your voice into clean, structured prompts using local Whisper + your own local LLM.

@seydx/node-av-win32-arm64-mingw

node-av (win32-arm64-mingw binary)

@seydx/node-av-win32-arm64-msvc

node-av (win32-arm64-msvc binary)

@choewy/whisper

Whisper Node.js Wrapper

faster-whisper-ts

TypeScript port of SYSTRAN/faster-whisper for Node.js, built on CTranslate2, Koffi, FFmpeg, and ONNX Runtime.

@zhin.js/plugin-voice

Voice input/output plugin for Zhin.js — STT via Whisper + TTS via edge-tts

cat-crawl

Multi-channel Obsidian clipping and video transcription CLI (WeChat/YouTube/Douyin).

@hzttt/multimodal-rag

OpenClaw plugin for multimodal RAG - semantic indexing and time-aware search for images and audio using local AI models

@framers/agentos-ext-voice-synthesis

Voice synthesis and transcription tools for AgentOS via OpenAI, ElevenLabs, Deepgram, and local Ollama/Whisper-compatible runtimes

create-plaud-pipeline

Bootstrap a PLAUD NotePin auto-recording pipeline (Whisper + Obsidian, macOS)

@fugood/node-whisper-win32-x64-vulkan

Native module for another Node binding of whisper.cpp (win32-x64-vulkan)

@casatwy/deyo

CLI for submitting Deyo transcription jobs from the terminal

@asmostans/daeva

Daeva — local GPU pod orchestrator for AI workloads

@polargrid/polargrid-sdk

JavaScript/TypeScript SDK for PolarGrid Edge AI Infrastructure with Full API Support

@fugood/node-whisper-win32-x64-cuda

Native module for another Node binding of whisper.cpp (win32-x64-cuda)

voice-stream

A powerful React hook for real-time voice streaming, designed for AI-powered applications. Perfect for real-time transcription, voice assistants, and audio processing with features like silence detection and configurable audio processing.

@fugood/node-whisper-win32-x64

Native module for another Node binding of whisper.cpp (win32-x64)

@renjfk/opencode-voice

Speech-to-text and text-to-speech for OpenCode. Record voice prompts with whisper-cpp, hear responses via Piper TTS, with LLM normalization through any OpenAI-compatible endpoint.

speak2text

speak2text CLI tool. Transcribe and translate audio and video files using OpenAI Whisper.

@pr0gramm/fluester

Node.js bindings for OpenAI's Whisper. Optimized for CPU.

@umituz/web-cloudflare

Comprehensive Cloudflare Workers & Pages integration with config-based patterns, middleware, router, workflows, AI (with audio/music generation, TTS, ASR), React hooks, and multi-tenant support

opencode-voice

Speech-to-text plugin for OpenCode — voice input with Deepgram, Groq, and OpenAI Whisper

vidclaude

Multimodal video understanding for Claude Code — extract frames, transcribe audio, build timelines from any video

typelessform-widget

Voice input widget for HTML forms. Users speak once — AI fills all fields at once. Drop-in for React, Vue, Angular, Next.js, WordPress. 25+ languages, 96% accuracy.

create-axiom-body

Give any AI agent a physical body — eyes, ears, voice, face. Patent Pending. One command install.

@nfinitmonkeys/cortex-sdk

Official TypeScript SDK for Cortex API — secure LLM gateway

whisper.cpp

Whisper speech recognition

@fugood/node-whisper-linux-arm64-cuda

Native module for another Node binding of whisper.cpp (linux-arm64-cuda)

whisper-node-addon

a .node addon for whisper

@fugood/node-whisper-linux-arm64-vulkan

Native module for another Node binding of whisper.cpp (linux-arm64-vulkan)

@ozymandros/electron-message-bridge-plugin-speech-whisper

Optional Whisper.cpp STT plugin for @ozymandros/electron-message-bridge: mic capture + IPC bridge (main/preload).

remotion-media-mcp

MCP server for AI media generation in Remotion projects - images, videos, music, sound effects, speech, and subtitles

whispermix

🎙️ WhisperMix is a versatile module for transcribing audio using OpenAI’s Whisper or Groq’s Whisper v3 model.

whisper-web-transcriber

Real-time audio transcription in the browser using OpenAI's Whisper model via WebAssembly

create-heed

Set up heed — local-first meeting transcription with real speaker diarization. One command, everything installs.

assistvideo

CLI that downloads video/audio from a URL (YouTube today, more coming) and transcribes to markdown using local whisper.cpp. Drop the URL, get an MP3, MP4, or transcript — in your current folder.

@dymoo/media-understanding

MCP server that converts audio/video/image to text + images for LLM consumption

@fontsource/whisper

Self-host the Whisper font in a neatly bundled NPM package.

omnivad

Cross-platform Voice Activity Detection and Audio Event Detection via WebAssembly. Runs in browsers, Web Workers, and Node.js. Built on FireRedVAD. Whisper-ready chunking included.

@fugood/node-whisper-linux-arm64

Native module for another Node binding of whisper.cpp (linux-arm64)

yt2blog

Transform YouTube videos into polished blog posts using AI

whisper-cpp-node

Node.js bindings for whisper.cpp - fast speech-to-text with GPU acceleration

@timur00kh/whisper.wasm

WebAssembly bindings for OpenAI Whisper speech recognition

react-native-smart-ai

The easiest way to add AI-powered chat, speech recognition, text-to-speech, and image analysis to React Native apps.

tracegist-mcp-bridge

Local-first MCP bridge for reading and transcribing TraceGist package zips.

@orka-js/realtime

Voice agent for OrkaJS — STT → LLM → TTS pipeline with WebSocket support

n8n-nodes-deapi

n8n community node for deAPI - AI image generation, video generation, transcription and prompt optimization

mcp-listen

Give your AI agents the ability to listen. Microphone capture and speech-to-text tools for MCP-compatible agents.

aicw-video

AICW Video is an open-source toolkit with a CLI, MCP server, and web hub for turning videos into short clips for TikTok, Instagram, YouTube Shorts and other short video platforms.

monsterapi

monsterapi is a JavaScript client library for interacting with the Monster API. It provides an easy way to access the API's features and integrate them into your applications.

claude-video-install

Installer for the claude-video Claude Code skill — teach Claude Code to watch videos.

@wiro-ai/n8n-nodes-wiroai

n8n community node for Wiro AI — 290+ AI models: video, image, audio, LLM, 3D, and more.

@crafter/trx

Agent-first CLI for audio/video transcription via Whisper

whspr

CLI tool for audio transcription with Groq Whisper API

claude-video-analyzer

MCP server that enables Claude Code to analyze video files (@video.mp4) by extracting frames and audio for vision and STT analysis.

noosphere

Unified AI creation engine — text, image, video, audio across all providers

samuraizer

Local-first CLI that turns meeting recordings into transcripts, summaries, action items, and decisions

pi-whisper-voice

Minimal hold-SPACE voice input for Pi using an OpenAI-compatible Whisper/STT endpoint.

@sqliteai/sqlite-ai

SQLite AI extension for Node.js - On-device inference, embedding generation, and model interaction directly into your database

ghostshell-ai

AI-powered real-time interview copilot. Captures audio, transcribes with Whisper, generates answers with Gemma AI -- all in an invisible overlay.

twitchps

Library to easily interact with Twitch PubSub System

whisper-mcp

MCP server for local audio transcription using Whisper

nwhisper

Native Node.js bindings for OpenAI's Whisper using whisper.cpp. High-performance local speech-to-text with custom model support.

openclaw-video-generator

Automated video generation pipeline with OpenAI TTS, Whisper, and Remotion - from text script to professional short videos

transcribly

CLI tool to transcribe YouTube videos and local audio/video files using OpenAI Whisper API

n8n-nodes-transcribe-audio

Perform speech-to-text on audio files within your n8n workflows. This node provides local audio transcription, no internet or third-party APIs required for processing.

overture-mcp

MCP Server for Claude Code, Cursor, Cline, Copilot, Github Copilot, Windsurf - Visual AI Agent Plan Execution, Approval Workflow, Plan Visualization, Agent Orchestration. See what your AI is thinking before it writes code. Works with Claude, GPT, Gemini,

@omnizap/sdk

Platform-agnostic AI agent SDK for chat applications — WhatsApp, Telegram, Messenger, or any custom channel.

whisperai-sdk

TypeScript SDK for WhisperAI: methods and interfaces for interacting with the service without external runtime dependencies.

create-gicellbot

Scaffold a self-hosted AI voice bot for Free4Talk voice rooms

@orka-js/multimodal

Multimodal utilities and agents for OrkaJS - Vision, Audio, Cross-modal workflows

pi-dj

AI music suite for pi — YouTube, global radio (30k+ stations), Suno, Lyria AI, SoundCloud/Bandcamp, mix, trim, BPM. Windows + macOS + Linux + Termux.

@albertsyh/use-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

krio-stt

Speech-to-text for Krio and other African languages — powered by Whisper large-v3 via Hugging Face

whisper-tnode

Library for use with whisper.cpp in Node.js or TypeScript projects

voicesmith-mcp

Local AI voice for coding assistants — TTS & STT via MCP. Kokoro ONNX + faster-whisper, fully offline.

@kkaczynski/use-whisper

React Hook for OpenAI Whisper API with speech recorder.

voice2text

speech to text functionality with minimum configuration and maximum compatibility

@fugood/node-whisper-win32-arm64-vulkan

Native module for another Node binding of whisper.cpp (win32-arm64-vulkan)

codex-video-short-maker-skill

A Codex skill for turning local videos into short clips with optional local captions.

koishi-plugin-whisper-asr

[openai whisper-asr](https://github.com/ahmetoner/whisper-asr-webservice) speech recognition service; supports over a hundred languages plus translation, and works with wechaty voice messages

tubewords

Extract YouTube video transcripts with AI summaries from the terminal

@fugood/node-whisper-darwin-x64

Native module for another Node binding of whisper.cpp (darwin-x64)

n8n-nodes-groq

N8N community node for Groq API - Speech-to-Text transcription using Whisper AI. Convert audio to text with high accuracy. Perfect for WhatsApp voice messages, audio files, and voice automation workflows.

@fugood/node-whisper-win32-arm64

Native module for another Node binding of whisper.cpp (win32-arm64)

ai-utils.js

Build AI applications, chatbots, and agents with JavaScript and TypeScript.

peertube-plugin-transposer-connector

Transposer connector is a PeerTube language tool plugin to transcribe and translate with Whisper

@lov3kaizen/agentsea-core

AgentSea - Unite and orchestrate AI agents. A production-ready ADK for building agentic AI applications with multi-provider support.

claude-transcribe

Claude Code plugin: transcribe video/audio from URLs (YouTube, X, TikTok, Vimeo, podcasts) or local media files using yt-dlp + OpenAI Whisper, fully on-device.

speekr

Practice speaking languages locally

tphim

TPHIM - Ultimate Video Pipeline: Download, Transcode HLS, AI Subtitles (with skip option), Resume Upload, and Cloud Upload.

redflower-ai

Official client for Redflower AI — open-source AI server with speech, vision, and language.

aac-voice-api

Games in which augmentative and alternative communication (AAC) devices are used as controllers are promising for increasing social inclusion of children who use these devices, such as minimally verbal autistic children. We want to build out an A

@ji8122s/use-whisper-test

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby

clipping-cli

Public CLI for clipping long-form videos into short-form packages with ffmpeg, yt-dlp, local Whisper backends, and Cloudflare Workers AI planning.

@beckjiang/use-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

@zs-soft/ai-openai

AI OpenAI - OpenAI API implementation with Whisper STT support for AI engine abstraction

@junwan666/remotion-install-whisper-cpp

Helpers for installing and using Whisper.cpp

@derogab/stt-proxy

A simple and lightweight proxy for seamless integration with multiple STT (Speech-to-Text) providers including Whisper.cpp

agent-skill-ai-knowledge-base

An AI Knowledge Base skill for Gemini CLI and Claude Code with automated video transcription, presentation generation, and Obsidian vault integration.

@usewhisper/mcp-server

Model Context Protocol server for Whisper Context API - Connect Claude Desktop to your knowledge base

whisper-onnx-speech-to-text

Node.js plugin for speech recognition that works with OpenAI's Whisper models using ONNX.

@robinson_ai_systems/openai-mcp

Comprehensive OpenAI MCP server with API and Agents SDK support

@artale/pi-voice

Voice input for Pi. Multi-provider STT with Deepgram streaming, Groq Whisper, OpenAI Whisper. 56+ languages.

video-transcriber-mcp

MCP server for transcribing videos from 1000+ platforms (YouTube, Vimeo, TikTok, Twitter, etc.) or local video files using Whisper

morphclaude-senses

MCP server — ear, mouth, and eye for POLY HUD + ALSA

remotion-captioneer

Drop-in animated captions for Remotion. Audio to word-level synced subtitle components. Supports OpenAI, Groq, Deepgram, AssemblyAI.

escribano

AI-powered session intelligence tool — turn screen recordings into structured work summaries

react-native-whisper

React Native implementation of OpenAI's Whisper automatic speech recognition (ASR) model

reelsum

A powerful, minimalistic CLI tool to download, transcribe, and intelligently format speech from Instagram Reels using OpenAI Whisper and GPT.

@zssz-soft/ai-openai

AI OpenAI - OpenAI API implementation with Whisper STT support for AI engine abstraction

demo-shh

Distributed private messaging for a distributed country

@clawbow/mcp-whisper

MCP server for audio transcription with OpenAI API, whisper-cli, or whisper.cpp

audio-transcription-mcp

MCP server for real-time audio transcription using OpenAI Whisper

@revizly/node-av-linux-x64

node-av (linux-x64 binary)

@usewhisper/sdk

TypeScript SDK for Whisper Context API - Add reliable context to your AI agents

meeting-notes-mcp

MCP Server for recording meetings and generating notes in Claude Code

@revizly/node-av

FFmpeg bindings for Node.js

autonota

CLI to transcribe YouTube audio and summarize transcripts

lxrt

Local AI infrastructure for agent frameworks - Transformers.js wrapper with LLM, TTS, STT, Web Workers, React & Vue support

use-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

whisperjs

HTTP Request functionality from within a node.js application using preset express routes and middleware

@samfp/pi-meeting-copilot

Live meeting transcription copilot for pi — captures audio via whisper-cpp on Mac, streams transcripts to your dev machine, and gives pi real-time meeting context.

zimujun

CLI for the coze-js-api whisper speech-to-text endpoint

@did-kr-cg/whisper-connect

Whisper Connect is a simple decentralized p2p connect solution (no Google cloud services involved). Log in from a desktop browser via the mobile app, or create transactions for smart contracts and send a signature and message via whisper too, and it ca

whisperdb

Module to perform CRUD operations on Whisper, the RRD database.

@whitegodkingsley/arena-cli

AI-powered video clip generation tool for the terminal - Turn long-form content into viral clips

@cloudraker/use-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

@openpets/whisperkit

Transcribe and translate audio files using WhisperKit CLI. Default output is JSON with detailed segment data and timestamps. Includes tool to convert JSON to LLM-friendly markdown. Supports MP3, WAV, M4A, FLAC formats with multiple Whisper models. Runs co

n8n-nodes-puter-ai

Advanced n8n node for Puter.js AI with RAG agentic capabilities, document processing, audio transcription, Supabase integration, and cost-optimized model priorities

cuttledoc

Fast offline video-to-document transcription with AI enhancement

use-whisper-on-azure

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

smart-whisper-electron

Whisper.cpp Node.js binding with auto model offloading strategy.

@codearcade/subtitle-generator

Generate subtitles from audio files using OpenAI Whisper with support for SRT, VTT, and TXT formats. Automatically downloads required binaries and models, with cross-platform support and configurable performance options.

@billy1kaplan/use-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

@illyism/transcribe

CLI tool to transcribe audio/video files to SRT format using OpenAI Whisper API

speech-opencode

Voice input plugin for OpenCode using OpenAI Whisper

@qubby/use-whisper-beta

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby

audio-training

Audio parsing using deepgram

@bergetai/n8n-nodes-berget-ai-speech

n8n node for Berget AI speech-to-text models

@adamhancock/transcribe-cli

CLI tool for transcribing and summarizing MP4 recordings using Whisper and Ollama

whisper-coreml

OpenAI Whisper ASR for Node.js with CoreML/ANE acceleration on Apple Silicon

samarthya-bot

SamarthyaBot — Privacy-First Local Agentic AI Operating System. Self-hosted multi-agent RPA engine with Telegram, Discord, Web Dashboard, Puppeteer browser control, SSH deployment, encrypted memory, voice transcription, and Indian workflow automation (GST

@qubby/use-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby

iflow-mcp-sixhq-overture

@whisper-protocol/wallet-derived-keys

Derive a stable X25519 encryption keypair from a Sui wallet's personal-message signature.

n8n-nodes-zihin

n8n nodes for Zihin AI - Chat Model with Tool Calling, Image Analysis, Audio Transcription, Document Parsing

@mynamezxc/mow-speech-to-text

Advanced speech-to-text transcription tool using OpenAI Whisper with GPU acceleration support

node-red-contrib-simple-whisper

Transcribe speech to text with Whisper in Node-RED.

@dhansoo/use-audio-stream

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

dikt

Voice dictation for the terminal.

web-asr-core

WebASR Core - Browser-based speech processing with VAD, WakeWord and Whisper - Unified all-in-one version

openclaw-drama-generator

Automated drama video generator - from script to multi-character drama videos with OpenAI TTS, Whisper, and Remotion

voxagent

Voice-powered terminal agent. Fully offline. Speak commands, get answers.

@whisper-security/whisper-api-sdk

Whisper Security Threat Intelligence API. The Whisper Security API provides comprehensive threat intelligence, geolocation data, and security operations capabilities for enterprise security teams and developers. Key Capabilities: Indicator Enric

openai-whisper-js

openai-whisper-js is a Node.js wrapper for the OpenAI Whisper library, enabling seamless audio transcription using Whisper models. This package simplifies the process of interacting with Whisper by providing a JavaScript interface to execute transcription

vrack-db

This is an In Memory database designed for storing time series (graphs).

doubleratchet

An Implementation of The Double Ratchet Algorithm designed by Open Whisper Systems

whisper-ws

Node SDK for whisper.ws

@codemasters/whisper

a lightweight, framework-independent notification library

markupr

Record your screen, narrate feedback, get structured Markdown with screenshots. Desktop app, CLI, and MCP server for AI coding agents like Claude Code, Cursor, and Windsurf.

openclaw-plugin-whisper-local-auto

Local Whisper-based auto voice transcription for OpenClaw - works across all channels

@dvrosalesm/pi-notetaker

pi extension for meeting notes — record, transcribe locally with Whisper, and summarize with LLM

aivox

A lightweight CLI tool for translating voice to text using Whisper, seamlessly piping the transcribed text to any Unix-like command for versatile integration.

typeless-sdk

Node.js SDK: audio + custom vocabulary → polished text (STT + LLM)

expo-whisper

Expo plugin for OpenAI Whisper speech-to-text integration with React Native

statslog

High performance statistics logger written in node.js. Send UDP packet to it and it records the datum.

airspeech

AI-powered speech-to-text and screen control

subtitles-generator

Generate subtitles easily with ffmpeg and whisper

@krasnoperov/transcribe

CLI tool for audio/video transcription with speaker diarization, AI summarization, and infographic generation

@iflow-mcp/shshalom-voicesmith-mcp

Local AI voice for coding assistants — TTS & STT via MCP. Kokoro ONNX + faster-whisper, fully offline.

n8n-nodes-nvidia-nim-whisper-v2

n8n community node for NVIDIA NIM Whisper Large V3 – speech recognition and translation via Riva gRPC API

scrub-cli

Turn videos into Agent Skills in seconds.

ai-input-react

React component for text/audio input with AI API integration. Framework-agnostic, works with Next.js, Vite, PHP, and any React setup.

media-transcriber

Batch transcribe audio/video files using pluggable AI backends (Whisper, OpenAI API, and more)

tap2talk

Voice transcription at your fingertips - Instantly convert speech to text with a simple keyboard shortcut

youtube-scrap-mcp

MCP server for extracting YouTube video content with transcript processing.

@rightnow/sdk

TypeScript SDK for RightNow AI — Arabic-first AI inference API

aac-speech-recognition

Multi-API speech recognition library with confidence scoring for AAC applications

@burtonator/use-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

@background404/node-red-contrib-whisper

Node for Whisper

capsaicin

create-manor-minutes

Set up Minutes — AI meeting transcription in one command

cloud-asr-mcp

MCP server using audio multimodal models for transcription and output styling in one pass. Format-specific outputs (email, todo, blog, etc.) via OpenRouter, Voxtral, OpenAI, and Gemini.

@lordofmax/subtitle-generator

Generates subtitles for a video file using OpenAI Whisper API.

@tuan_son.dinh/claude-voice

Voice MCP server for Claude Code — hands-free voice input/output via TTS + Whisper

@revizly/node-av-linux-arm64

node-av (linux-arm64 binary)

@whisper-cpp-node/core

Node.js bindings for whisper.cpp - fast speech-to-text with GPU acceleration

@ainative/ai-kit-video

AI Kit - Video processing utilities including recording and transcription

@redlasha/talk-to

Korean voice MCP server for Claude Code - STT/TTS with local Whisper + Edge TTS

@misgara/ai-agent

Hexagonal library for AI agents

jasper-mini

Local AI scene partner & life assistant — sovereign AI for creators. Runs on Mac Mini with voice I/O, scene management, and proactive scheduling. No cloud. No content policies. Your hardware, your model, your rules.

@chinchillaenterprises/mcp-recall

Event-driven MCP server for Recall.ai meeting transcription with enhanced speaker identification and local storage

sttttsmodels

STT (whisper-base), TTS (pocket-tts-onnx), and speaker embedding model files

@localmode/transformers

Transformers.js provider for @localmode - implements all ML model interfaces

homebridge-scentair

Homebridge plugin for ScentAir diffusers

@sridhar-mani/whisper-web-transcriber

Real-time audio transcription in the browser using OpenAI's Whisper model via WebAssembly

n8n-nodes-groq-speech-to-text

n8n node for Groq Speech-to-Text API - works with any audio provider

node-red-contrib-speech-to-text-ubos

The speech to text API provides two endpoints, transcriptions and translations, based on our state-of-the-art open source large-v2 Whisper model.

@robinpath/openai

OpenAI integration — chat completions, embeddings, image generation, transcription, moderation. Uses the encrypted credential vault for API keys.

@hasna/transcriber

Transcribe audio and video from files and URLs (YouTube, Vimeo, Wistia, etc.) using ElevenLabs, OpenAI Whisper, or DeepGram

using-whisper

React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.

@synervoz/edgespeech

React Native library for on-device voice processing with Switchboard SDK

node-ai-ragbot

Node.js backend package for building AI chatbots and voicebots with Retrieval-Augmented Generation (RAG). It ingests website pages or local files (PDF, DOCX, TXT, MD), creates embeddings with LangChain + OpenAI, stores them in a fast in-memory vector data