open-agents-ai
AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop
Found 346 results for whisper
Glin-Profanity is a lightweight and efficient npm package designed to detect and filter profane language in text inputs across multiple languages. Whether you’re building a chat application, a comment section, or any platform where user-generated content
node-av (linux-x64 binary)
FFmpeg bindings for Node.js
node-av (darwin-arm64 binary)
Helpers for installing and using Whisper.cpp
node-av (darwin-x64 binary)
React Native binding of whisper.cpp
node-av (linux-arm64 binary)
Helpers for using Whisper.cpp in browser using WASM
node-av (win32-x64-msvc binary)
transcription addon for qvac
The TypeScript library for building AI applications.
Another Node binding of whisper.cpp that keeps its API as close to whisper.rn as possible.
Whisper.cpp Node.js binding with auto model offloading strategy.
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
BYAN v2.8 - Intelligent AI agent creator with ELO trust system + scientific fact-check + Hermes universal dispatcher + native Claude Code integration (hooks, skills, MCP server). Multi-platform (Copilot CLI, Claude Code, Codex). Merise Agile + TDD + 64 Ma
Native module for another Node binding of whisper.cpp (linux-x64-cuda)
Native module for another Node binding of whisper.cpp (linux-x64-vulkan)
Conversation memory SDK — query meeting transcripts, decisions, and action items from any AI agent or application
**QVAC SDK** is the canonical entry point to develop AI applications with QVAC.
Native module for another Node binding of whisper.cpp (linux-x64)
A task-based automation app. Leiningen style.
Audio control with silence detection.
MCP server for minutes — conversation memory for AI assistants. Works with Claude Desktop, Mistral Vibe, Cursor, Windsurf, and any MCP client.
Native module for another Node binding of whisper.cpp (darwin-arm64)
AI coding assistant skill (Claude Code, Codex, Gemini CLI, GitHub Copilot CLI, Aider, OpenCode, OpenClaw) - turn any folder of code, docs, papers, images, or audio/video transcripts into a queryable knowledge graph
A GPU accelerated .node addon for whisper.cpp with prebuilt binaries
Voice + screenshots + model hotkeys + live agent monitor — drop-in wrapper for GitHub Copilot CLI
Browser-native audio transcription powered by WebGPU Whisper — zero server, fully local.
Buildless STT (Whisper WebGPU) + TTS (Pocket TTS ONNX) SDK
Universal speech-to-text router for Gladia, AssemblyAI, Deepgram, Azure, OpenAI Whisper, Speechmatics, Soniox, and ElevenLabs
Audio transcription library with provider support and auto-splitting
Voice-to-intelligence platform for developers. Voice capture, sprint planning with AI, bug/feature forms, pattern matching to prevent AI hallucinations.
MCP server that gives Claude Code the ability to watch and understand videos — extracts frames via ffmpeg and processes audio via multiple backends
node-av (win32-x64-mingw binary)
Windows-native MCP server for local audio transcription using whisper.cpp with Vulkan GPU acceleration
Local audio/video transcription with speaker diarization and live audio support. No API keys. Powered by faster-whisper.
Voice-first prompt engineering for vibe coders. Floating overlay that turns your voice into clean, structured prompts using local Whisper + your own local LLM.
node-av (win32-arm64-mingw binary)
node-av (win32-arm64-msvc binary)
Whisper Node.js Wrapper
TypeScript port of SYSTRAN/faster-whisper for Node.js, built on CTranslate2, Koffi, FFmpeg, and ONNX Runtime.
Voice input/output plugin for Zhin.js — STT via Whisper + TTS via edge-tts
Multi-channel Obsidian clipping and video transcription CLI (WeChat/YouTube/Douyin).
OpenClaw plugin for multimodal RAG - semantic indexing and time-aware search for images and audio using local AI models
Voice synthesis and transcription tools for AgentOS via OpenAI, ElevenLabs, Deepgram, and local Ollama/Whisper-compatible runtimes
Bootstrap a PLAUD NotePin auto-recording pipeline (Whisper + Obsidian, macOS)
Native module for another Node binding of whisper.cpp (win32-x64-vulkan)
CLI for submitting Deyo transcription jobs from the terminal
Daeva — local GPU pod orchestrator for AI workloads
JavaScript/TypeScript SDK for PolarGrid Edge AI Infrastructure with Full API Support
Native module for another Node binding of whisper.cpp (win32-x64-cuda)
A powerful React hook for real-time voice streaming, designed for AI-powered applications. Perfect for real-time transcription, voice assistants, and audio processing with features like silence detection and configurable audio processing.
Native module for another Node binding of whisper.cpp (win32-x64)
Speech-to-text and text-to-speech for OpenCode. Record voice prompts with whisper-cpp, hear responses via Piper TTS, with LLM normalization through any OpenAI-compatible endpoint.
speak2text CLI tool. Transcribe and translate audio and video files using OpenAI Whisper.
Node.js bindings for OpenAI's Whisper. Optimized for CPU.
Comprehensive Cloudflare Workers & Pages integration with config-based patterns, middleware, router, workflows, AI (with audio/music generation, TTS, ASR), React hooks, and multi-tenant support
Speech-to-text plugin for OpenCode — voice input with Deepgram, Groq, and OpenAI Whisper
Multimodal video understanding for Claude Code — extract frames, transcribe audio, build timelines from any video
Voice input widget for HTML forms. Users speak once — AI fills all fields at once. Drop-in for React, Vue, Angular, Next.js, WordPress. 25+ languages, 96% accuracy.
Give any AI agent a physical body — eyes, ears, voice, face. Patent Pending. One command install.
Official TypeScript SDK for Cortex API — secure LLM gateway
Whisper speech recognition
Native module for another Node binding of whisper.cpp (linux-arm64-cuda)
a .node addon for whisper
Native module for another Node binding of whisper.cpp (linux-arm64-vulkan)
Optional Whisper.cpp STT plugin for @ozymandros/electron-message-bridge: mic capture + IPC bridge (main/preload).
MCP server for AI media generation in Remotion projects - images, videos, music, sound effects, speech, and subtitles
🎙️ WhisperMix is a versatile module for transcribing audio using OpenAI’s Whisper or Groq’s Whisper v3 model.
Real-time audio transcription in the browser using OpenAI's Whisper model via WebAssembly
Set up heed — local-first meeting transcription with real speaker diarization. One command, everything installs.
CLI that downloads video/audio from a URL (YouTube today, more coming) and transcribes to markdown using local whisper.cpp. Drop the URL, get an MP3, MP4, or transcript — in your current folder.
MCP server that converts audio/video/image to text + images for LLM consumption
Self-host the Whisper font in a neatly bundled NPM package.
Cross-platform Voice Activity Detection and Audio Event Detection via WebAssembly. Runs in browsers, Web Workers, and Node.js. Built on FireRedVAD. Whisper-ready chunking included.
Native module for another Node binding of whisper.cpp (linux-arm64)
Transform YouTube videos into polished blog posts using AI
Node.js bindings for whisper.cpp - fast speech-to-text with GPU acceleration
WebAssembly bindings for OpenAI Whisper speech recognition
The easiest way to add AI-powered chat, speech recognition, text-to-speech, and image analysis to React Native apps.
Local-first MCP bridge for reading and transcribing TraceGist package zips.
Voice agent for OrkaJS — STT → LLM → TTS pipeline with WebSocket support
n8n community node for deAPI - AI image generation, video generation, transcription and prompt optimization
Give your AI agents the ability to listen. Microphone capture and speech-to-text tools for MCP-compatible agents.
AICW Video is an open-source toolkit with a CLI, MCP server, and web hub for turning videos into short clips for TikTok, Instagram, YouTube Shorts and other short video platforms.
monsterapi is a JavaScript client library for interacting with the Monster API. It provides an easy way to access the API's features and integrate them into your applications.
Installer for the claude-video Claude Code skill — teach Claude Code to watch videos.
n8n community node for Wiro AI — 290+ AI models: video, image, audio, LLM, 3D, and more.
Agent-first CLI for audio/video transcription via Whisper
CLI tool for audio transcription with Groq Whisper API
MCP server that enables Claude Code to analyze video files (@video.mp4) by extracting frames and audio for vision and STT analysis.
Unified AI creation engine — text, image, video, audio across all providers
Local-first CLI that turns meeting recordings into transcripts, summaries, action items, and decisions
Minimal hold-SPACE voice input for Pi using an OpenAI-compatible Whisper/STT endpoint.
SQLite AI extension for Node.js - On-device inference, embedding generation, and model interaction directly into your database
AI-powered real-time interview copilot. Captures audio, transcribes with Whisper, generates answers with Gemma AI -- all in an invisible overlay.
Library to easily interact with Twitch PubSub System
MCP server for local audio transcription using Whisper
Native Node.js bindings for OpenAI's Whisper using whisper.cpp. High-performance local speech-to-text with custom model support.
Automated video generation pipeline with OpenAI TTS, Whisper, and Remotion - from text script to professional short videos
CLI tool to transcribe YouTube videos and local audio/video files using OpenAI Whisper API
Perform speech-to-text on audio files within your n8n workflows. This node provides local audio transcription; no internet or third-party APIs are required for processing.
MCP Server for Claude Code, Cursor, Cline, Copilot, GitHub Copilot, Windsurf - Visual AI Agent Plan Execution, Approval Workflow, Plan Visualization, Agent Orchestration. See what your AI is thinking before it writes code. Works with Claude, GPT, Gemini,
Platform-agnostic AI agent SDK for chat applications — WhatsApp, Telegram, Messenger, or any custom channel.
TypeScript SDK for WhisperAI: methods and interfaces for interacting with the service without external runtime dependencies.
Scaffold a self-hosted AI voice bot for Free4Talk voice rooms
Multimodal utilities and agents for OrkaJS - Vision, Audio, Cross-modal workflows
AI music suite for pi — YouTube, global radio (30k+ stations), Suno, Lyria AI, SoundCloud/Bandcamp, mix, trim, BPM. Windows + macOS + Linux + Termux.
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
Speech-to-text for Krio and other African languages — powered by Whisper large-v3 via Hugging Face
Library for using whisper.cpp in Node.js or TypeScript projects
Local AI voice for coding assistants — TTS & STT via MCP. Kokoro ONNX + faster-whisper, fully offline.
React Hook for OpenAI Whisper API with speech recorder.
speech to text functionality with minimum configuration and maximum compatibility
Native module for another Node binding of whisper.cpp (win32-arm64-vulkan)
A Codex skill for turning local videos into short clips with optional local captions.
[openai whisper-asr](https://github.com/ahmetoner/whisper-asr-webservice) speech recognition service, supporting over a hundred languages plus translation, adapted for wechaty voice messages
Extract YouTube video transcripts with AI summaries from the terminal
Native module for another Node binding of whisper.cpp (darwin-x64)
N8N community node for Groq API - Speech-to-Text transcription using Whisper AI. Convert audio to text with high accuracy. Perfect for WhatsApp voice messages, audio files, and voice automation workflows.
Native module for another Node binding of whisper.cpp (win32-arm64)
Build AI applications, chatbots, and agents with JavaScript and TypeScript.
Transposer connector is a PeerTube language tool plugin to transcribe and translate with Whisper
AgentSea - Unite and orchestrate AI agents. A production-ready ADK for building agentic AI applications with multi-provider support.
Claude Code plugin: transcribe video/audio from URLs (YouTube, X, TikTok, Vimeo, podcasts) or local media files using yt-dlp + OpenAI Whisper, fully on-device.
Practice speaking languages locally
TPHIM - Ultimate Video Pipeline: Download, Transcode HLS, AI Subtitles (with skip option), Resume Upload, and Cloud Upload.
Official client for Redflower AI — open-source AI server with speech, vision, and language.
Games in which augmentative and alternative communication (AAC) devices serve as the controllers are promising for increasing the social inclusion of children who use these devices, such as minimally verbal autistic children. We want to build out an A
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby
Public CLI for clipping long-form videos into short-form packages with ffmpeg, yt-dlp, local Whisper backends, and Cloudflare Workers AI planning.
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
AI OpenAI - OpenAI API implementation with Whisper STT support for AI engine abstraction
Helpers for installing and using Whisper.cpp
A simple and lightweight proxy for seamless integration with multiple STT (Speech-to-Text) providers including Whisper.cpp
An AI Knowledge Base skill for Gemini CLI and Claude Code with automated video transcription, presentation generation, and Obsidian vault integration.
Model Context Protocol server for Whisper Context API - Connect Claude Desktop to your knowledge base
Node.js plugin for speech recognition that works with OpenAI's Whisper models using ONNX.
Comprehensive OpenAI MCP server with API and Agents SDK support
Voice input for Pi. Multi-provider STT with Deepgram streaming, Groq Whisper, OpenAI Whisper. 56+ languages.
MCP server for transcribing videos from 1000+ platforms (YouTube, Vimeo, TikTok, Twitter, etc.) or local video files using Whisper
MCP server — ear, mouth, and eye for POLY HUD + ALSA
Drop-in animated captions for Remotion. Audio to word-level synced subtitle components. Supports OpenAI, Groq, Deepgram, AssemblyAI.
AI-powered session intelligence tool — turn screen recordings into structured work summaries
React Native implementation of OpenAI's Whisper automatic speech recognition (ASR) model
A powerful, minimalistic CLI tool to download, transcribe, and intelligently format speech from Instagram Reels using OpenAI Whisper and GPT.
AI OpenAI - OpenAI API implementation with Whisper STT support for AI engine abstraction
Distributed private messaging for a distributed country
MCP server for audio transcription with OpenAI API, whisper-cli, or whisper.cpp
MCP server for real-time audio transcription using OpenAI Whisper
node-av (linux-x64 binary)
TypeScript SDK for Whisper Context API - Add reliable context to your AI agents
MCP Server for recording meetings and generating notes in Claude Code
FFmpeg bindings for Node.js
CLI to transcribe YouTube audio and summarize transcripts
Local AI infrastructure for agent frameworks - Transformers.js wrapper with LLM, TTS, STT, Web Workers, React & Vue support
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
HTTP request functionality from within a Node.js application using preset Express routes and middleware
Live meeting transcription copilot for pi — captures audio via whisper-cpp on Mac, streams transcripts to your dev machine, and gives pi real-time meeting context.
CLI for the coze-js-api whisper speech-to-text endpoint
Whisper Connect is a simple, decentralized P2P connection solution (no Google Cloud-style services required). Log in from a desktop browser via the mobile app, or create transactions for smart contracts and send a signature and message via Whisper too. And it ca
Module to perform CRUD operations on Whisper, the RRD database.
AI-powered video clip generation tool for the terminal - Turn long-form content into viral clips
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
Transcribe and translate audio files using WhisperKit CLI. Default output is JSON with detailed segment data and timestamps. Includes tool to convert JSON to LLM-friendly markdown. Supports MP3, WAV, M4A, FLAC formats with multiple Whisper models. Runs co
Advanced n8n node for Puter.js AI with RAG agentic capabilities, document processing, audio transcription, Supabase integration, and cost-optimized model priorities
Fast offline video-to-document transcription with AI enhancement
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
Whisper.cpp Node.js binding with auto model offloading strategy.
Generate subtitles from audio files using OpenAI Whisper with support for SRT, VTT, and TXT formats. Automatically downloads required binaries and models, with cross-platform support and configurable performance options.
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
CLI tool to transcribe audio/video files to SRT format using OpenAI Whisper API
Voice input plugin for OpenCode using OpenAI Whisper
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby
Audio parsing using deepgram
n8n node for Berget AI speech-to-text models
CLI tool for transcribing and summarizing MP4 recordings using Whisper and Ollama
OpenAI Whisper ASR for Node.js with CoreML/ANE acceleration on Apple Silicon
SamarthyaBot — Privacy-First Local Agentic AI Operating System. Self-hosted multi-agent RPA engine with Telegram, Discord, Web Dashboard, Puppeteer browser control, SSH deployment, encrypted memory, voice transcription, and Indian workflow automation (GST
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in in Qubby
MCP Server for Claude Code, Cursor, Cline, Copilot, GitHub Copilot, Windsurf - Visual AI Agent Plan Execution, Approval Workflow, Plan Visualization, Agent Orchestration. See what your AI is thinking before it writes code. Works with Claude, GPT, Gemini,
Derive a stable X25519 encryption keypair from a Sui wallet's personal-message signature.
n8n nodes for Zihin AI - Chat Model with Tool Calling, Image Analysis, Audio Transcription, Document Parsing
Advanced speech-to-text transcription tool using OpenAI Whisper with GPU acceleration support
Transcribe audio with Whisper in Node-RED.
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
Voice dictation for the terminal.
WebASR Core - Browser-based speech processing with VAD, WakeWord and Whisper - Unified all-in-one version
Automated drama video generator - from script to multi-character drama videos with OpenAI TTS, Whisper, and Remotion
Voice-powered terminal agent. Fully offline. Speak commands, get answers.
Whisper Security Threat Intelligence API: The Whisper Security API provides comprehensive threat intelligence, geolocation data, and security operations capabilities for enterprise security teams and developers. Key Capabilities: Indicator Enric
openai-whisper-js is a Node.js wrapper for the OpenAI Whisper library, enabling seamless audio transcription using Whisper models. This package simplifies the process of interacting with Whisper by providing a JavaScript interface to execute transcription
This is an In Memory database designed for storing time series (graphs).
An Implementation of The Double Ratchet Algorithm designed by Open Whisper Systems
Node SDK for whisper.ws
a lightweight, framework-independent notification library
Record your screen, narrate feedback, get structured Markdown with screenshots. Desktop app, CLI, and MCP server for AI coding agents like Claude Code, Cursor, and Windsurf.
Local Whisper-based auto voice transcription for OpenClaw - works across all channels
pi extension for meeting notes — record, transcribe locally with Whisper, and summarize with LLM
A lightweight CLI tool for translating voice to text using Whisper, seamlessly piping the transcribed text to any Unix-like command for versatile integration.
Node.js SDK: audio + custom vocabulary → polished text (STT + LLM)
Expo plugin for OpenAI Whisper speech-to-text integration with React Native
High-performance statistics logger written in Node.js. Send it a UDP packet and it records the datum.
AI-powered speech-to-text and screen control
Generate subtitles easily with ffmpeg and whisper
CLI tool for audio/video transcription with speaker diarization, AI summarization, and infographic generation
Local AI voice for coding assistants — TTS & STT via MCP. Kokoro ONNX + faster-whisper, fully offline.
n8n community node for NVIDIA NIM Whisper Large V3 – speech recognition and translation via Riva gRPC API
Turn videos into Agent Skills in seconds.
React component for text/audio input with AI API integration. Framework-agnostic, works with Next.js, Vite, PHP, and any React setup.
Batch transcribe audio/video files using pluggable AI backends (Whisper, OpenAI API, and more)
Voice transcription at your fingertips - Instantly convert speech to text with a simple keyboard shortcut
MCP server for extracting YouTube video content with transcript processing.
TypeScript SDK for RightNow AI — Arabic-first AI inference API
Multi-API speech recognition library with confidence scoring for AAC applications
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
Node for Whisper
Set up Minutes — AI meeting transcription in one command
MCP server using audio multimodal models for transcription and output styling in one pass. Format-specific outputs (email, todo, blog, etc.) via OpenRouter, Voxtral, OpenAI, and Gemini.
Generates subtitles for a video file using OpenAI Whisper API.
Voice MCP server for Claude Code — hands-free voice input/output via TTS + Whisper
node-av (linux-arm64 binary)
Node.js bindings for whisper.cpp - fast speech-to-text with GPU acceleration
AI Kit - Video processing utilities including recording and transcription
Korean voice MCP server for Claude Code - STT/TTS with local Whisper + Edge TTS
Hexagonal library for AI agents
Local AI scene partner & life assistant — sovereign AI for creators. Runs on Mac Mini with voice I/O, scene management, and proactive scheduling. No cloud. No content policies. Your hardware, your model, your rules.
Event-driven MCP server for Recall.ai meeting transcription with enhanced speaker identification and local storage
STT (whisper-base), TTS (pocket-tts-onnx), and speaker embedding model files
Transformers.js provider for @localmode - implements all ML model interfaces
Homebridge plugin for ScentAir diffusers
Real-time audio transcription in the browser using OpenAI's Whisper model via WebAssembly
n8n node for Groq Speech-to-Text API - works with any audio provider
The speech to text API provides two endpoints, transcriptions and translations, based on our state-of-the-art open source large-v2 Whisper model.
OpenAI integration — chat completions, embeddings, image generation, transcription, moderation. Uses the encrypted credential vault for API keys.
Transcribe audio and video from files and URLs (YouTube, Vimeo, Wistia, etc.) using ElevenLabs, OpenAI Whisper, or DeepGram
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
React Native library for on-device voice processing with Switchboard SDK
Node.js backend package for building AI chatbots and voicebots with Retrieval-Augmented Generation (RAG). It ingests website pages or local files (PDF, DOCX, TXT, MD), creates embeddings with LangChain + OpenAI, stores them in a fast in-memory vector data
Drop-in feedback widget for React — text + voice recording with Whisper transcription
n8n node for Flowbie video transcription API
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
Node.js wrapper for OpenAI Whisper speech recognition with TypeScript support
Automatically generate and overlay subtitles for any video.
React Native binding of whisper.cpp
🎙️ Real-time conversational audio with AI transcription. Build ChatGPT-style voice interfaces in minutes with <300ms latency
Own your transcription workflow. Press Cmd+Shift+X, speak, get text in clipboard instantly.