voice-activity-detection
Mic input activity detection
Found 1794 results for voice activity detection
Mic input activity detection
Mic input activity detection
Voice activity detection (VAD) AudioWorklet.
Silero VAD barge-in plugin for ArionTalk — AI-powered voice activity detection
Advanced audio sentence detection using signal processing and voice activity detection
Cobra VAD engine for web browsers (via WebAssembly)
Picovoice Cobra Node.js binding
Implementation of the Discord Voice API for Node.js
JavaScript client library for Soniox Speech-to-Text websocket API
A button to start dictation using Web Speech API, with an easy to understand event lifecycle.
Twilio's JavaScript Voice SDK
Universal Voice Activity Detection SDK for WebAssembly - supports multiple VAD engines with a unified API
Simple one-to-one WebRTC video/voice and data channels
Provides text-to-speech functionality.
CLI for AuralWise audio intelligence API - transcription, speaker diarization, audio event detection
The Voice API lets you create outbound calls, control in-progress calls and get information about historical calls.
Neura — CLI for installing and managing the Neura AI assistant core service. Includes text chat and voice listen clients.
React library for audio recording and visualization using Web Audio API
Official VoicePilot JavaScript SDK — TTS, STT, Agents, and real-time conversations.
React Native Text-To-Speech module for Android and iOS
An audio recording helper for React. Provides a component and a hook to help with audio recording.
`@inworld/runtime` is a Node.js SDK for building AI applications with LLM inference, graph orchestration, speech pipelines, retrieval, tool use, and telemetry.
AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop
Library for creating voice messages
An MCP server to allow LLMs to speak and listen via bidirectional voice loops
Various diagnostics functions to help analyze connections to Twilio
A node.js wrapper for the MessageBird REST API
A cross-platform SDK enabling developers to integrate real-time VoIP chat technology into their projects
Official AfricasTalking node.js API wrapper
Telephony plugin for Chatter — adds voice call and SMS support via Twilio
Capacitor plugin for voice recording
Simple one-to-one WebRTC video/voice and data channels
Telnyx React Native Voice SDK - A complete WebRTC voice calling solution
StellaLib — A powerful Lavalink v3+v4 client for TypeScript with auto version detection, session persistence, smart autoplay, and graceful shutdown
Ready-to-use Chat UI Components for React (JavaScript/Web)
Universal text-to-speech library using Microsoft Edge's online TTS service. Works in Node.js and browsers WITHOUT needing Microsoft Edge, Windows, or an API key
ElevenLabs React Native SDK for the Agents Platform
React hooks for wake word detection using Web Speech API
Windows-optimized smart voice, sound, and desktop notifications for Pi coding agent.
Offline, in-browser voice commands powered by EfficientWord-Net (ResNet-50 ArcFace).
Get live audio stream data for React Native
SignalWire Compatibility API
Simple, light-weight WebRTC video/voice and data channels
An easy-to-use React client for building generative AI applications on the Rapida platform.
Porcupine wake word engine for web browsers (via WebAssembly)
voipi (https://voipi.vercel.app/)
Voice-to-intelligence platform for developers. Voice capture, sprint planning with AI, bug/feature forms, pattern matching to prevent AI hallucinations.
This is an open source Eleven Labs NodeJS package for converting text to speech using the Eleven Labs API
A JavaScript library for adding voice commands to your site, using speech recognition
Capacitor plugin for comprehensive on-device speech recognition with live partial results.
Integrates the Twilio Voice SDK into Capacitor
SendBird Calls JavaScript SDK
Core library to check for valid SSML
Node.js Library for TNZ Group REST API
🖐️🎤 Micdrop: Real-Time Voice Conversations with AI
Local AI voice for coding assistants — TTS & STT via MCP. Kokoro ONNX + faster-whisper, fully offline.
Embeddable voice AI SDK for web pages — form filling, navigation, Q&A via speech recognition and server proxy
A powerful React hook for real-time voice streaming, designed for AI-powered applications. Perfect for real-time transcription, voice assistants, and audio processing with features like silence detection and configurable audio processing.
Picovoice Porcupine Node.js binding
A high-performance React Native library for text-to-speech on iOS and Android
Advanced React speech-to-text library with real-time audio analysis and comprehensive speech metrics
Voice synthesis library for AITuber OnAir
Official TeleSign SDK for Rest APIs including Messaging (SMS), Intelligence Cloud, PhoneID, Voice, and AppVerify
Holostaff AI avatar widget — embeddable voice assistant for any webpage
Voice-driven code explorer for your terminal
Check for valid SSML
Simple one-to-one WebRTC video/voice and data channels
Now your AI Agents can finally talk back! Professional TTS voice for Claude Code, Claude Desktop (via MCP), and Clawdbot with multi-provider support.
Vent CLI — CI/CD for voice AI agents
A collection of React components for building AI voice interfaces with real-time audio visualization
Package to make it very easy to send text messages with CM.com
Production-ready speech detection using Silero VAD ONNX model for web browsers
retext plugin to check for passive voice
Audio / Voice Recorder for React
React version of siriwave.js
A mobile-friendly audio player for React with a modern look and convenient usage.
React Hook for OpenAI Whisper API with speech recorder and silence removal built-in.
Voice pipeline for Cloudflare Agents — STT, TTS, VAD, streaming, and SFU utilities
Get audio PCM stream data for React Native
Voice-to-tool-call browser library: wake word detection, speech-to-text, LLM intent interpretation, tool execution, and text-to-speech
Voice capabilities: TTS, STT, and conversational AI
Comprehensive Cloudflare Workers & Pages integration with config-based patterns, middleware, router, workflows, AI (with audio/music generation, TTS, ASR), React hooks, and multi-tenant support
Node.js SDK wrapper for the Termii API, written with TypeScript support
JavaScript client for Speechly Streaming API
Node binding for continuous offline voice recognition with the Vosk library.
Client JS SDK for the cutting-edge Samvyo real-time voice/video cloud.
Voice calling plugin for OpenClaw — give your AI agent a phone number
Unofficial TypeScript SDK for the Gradium API
Capacitor plugin for voice recording
JavaScript SDK for Obi
React component for Porcupine Web SDK
Real-time voice AI for HazelJS - OpenAI Realtime API & Gemini Live integration for low-latency speech-to-speech
Official WAVE SDK for TypeScript and Node.js — 34 API modules for live video streaming, production, analytics, voice, captions, and more
React Native Voice library for iOS and Android - Fork with New Architecture, Bridgeless mode, and React Native 0.76+ support
Simple one-to-one WebRTC video/voice and data channels
The official Bland AI command-line interface
An iOS only on-device transcription library for React Native and Expo apps.
Rhino Speech-to-Intent engine for web browsers (via WebAssembly)
React Native library for on-device voice processing with Switchboard SDK
Official JavaScript SDK for the Unbound API - A comprehensive toolkit for integrating with Unbound's communication, AI, and data management services
A Stimulus controller that uses the Web Speech API to capture speech and fill an input or element.
Voice + screenshots + model hotkeys + live agent monitor — drop-in wrapper for GitHub Copilot CLI
Node.js package of ai-coustics SDK
Code from anywhere with your voice. Autonomous coding system controlled from Telegram.
A speech to text module.
Make your app understand language. Summarize conversations, categorize articles, and more.
CLI for managing OpenHome voice AI abilities
exk - Control Claude CLI with voice and programmable interfaces
Cross-platform 3D avatar component for React Native & web — lip-sync, gestures, accessories, and LLM integration. Powered by TalkingHead + Three.js.
A high-level, state-agnostic, drop-in module for the Telnyx React Native SDK that simplifies WebRTC voice calling integration
React library for audio recording and visualization using Web Audio API
React Native Native Voice library for iOS and Android
Artyom is a robust wrapper for the Google Chrome SpeechSynthesis and SpeechRecognition APIs that lets you create a virtual assistant
Ready-to-use Chat UI Components for Vue (JavaScript/Web)
Personal AI orchestrator that turns Discord into a persistent workspace
Headless voice AI engine with page understanding — services, types, and session logic
Polyfill for the Speech Recognition API using Speechly
Jitsi Meet SDK wrapper for React Native.
Voice wake plugin for page-agent with Vue2/Vue3 compatibility.
Onereach.ai Voice Steps
OpenClaw plugin — connect agents to MyDazy voice devices via MCP relay with TTS push
jambonz SDK for building voice applications — optimized for AI agents
Voice AI coding assistant - local agent that connects to Osborn frontend
The **Inworld AI Node.js SDK** enables developers to easily integrate AI characters into their Node.js environment.
React component and hook for audio recording in your React applications
Voice input widget for HTML forms. Users speak once — AI fills all fields at once. Drop-in for React, Vue, Angular, Next.js, WordPress. 25+ languages, 96% accuracy.
A React 19 compatible fork of react-voice-visualizer by Yurii Zarytskyi. It's a React library for audio recording and visualization using Web Audio API.
Streaming Volcengine speech-to-text plugin for the OpenCode TUI
One-command bootstrap for Cookiy local skills and MCP connections in your AI coding clients
Koala Noise Suppression engine for web browsers (via WebAssembly)
Unified messaging library for Email, SMS, and Voice with multiple provider support
Speak and it types. Hands-free voice transcription CLI for macOS.
React client for the PrimVoices Agents API
Programmatic AI media generation SDK for Zyka
React Native Native Voice library for iOS and Android
Voice calling SDK for MyChatBot Sales Platform agents
Connect your phone to Claude Code. Voice control Claude Code from anywhere.
Voice synthesis adapter for Lunar (ElevenLabs TTS)
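JavaScript client for the Speechmatics Flow API
超哥办公室 (Chao Ge Office): OpenClaw AI multi-department command center with a pixel-art office, meeting room (cross-department sequential discussion + negotiated voting + action items), trust scoring, sub-agents (sessions_spawn delegation), real-time streaming responses, scheduled tasks, workflows, bulletin board, memory system, dashboard, Gmail/Drive/Sheets integration, voice input, Webhook, PWA, mobile adaptation, command palette (Cmd+K), Chinese/English bilingual | Pixel-art virtual office with multi-agent chat, meeting room (sequential discuss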
Web Interface for reading and testing notifications during development
TOTP but you say it out loud. Derive time-rotating, human-speakable verification tokens from a shared secret.
Audio recording tools for voice-driven development workflows
Unified Voice SDK for TTS and ASR
OpenClaw plugin for Omi ambient transcript processing — always-on AI listening and command detection
MCP server for Chamade — voice gateway for AI agents. v3 is a thin stdio shim around the hosted HTTP MCP at mcp.chamade.io, so every MCP client (stdio and HTTP) talks to the same hosted surface. Supports Claude Code channel mode for push events.
Alan Web SDK: a lightweight JavaScript library for adding a voice experience to your website or web application
Development runtime for Sinch Functions - serverless voice applications
React client for the VoiceRun Agents API
React client for Speechly Streaming API
Core library for AITuber OnAir providing voice synthesis and chat processing
Official PIOPIY WebRTC SDK for high-quality voice communication and telephony integration in the browser.
Instant thought capture CLI with voice and on-demand AI analysis. Think now, organize later.
Audio transcription and voice commands for kodrdriv (transcribe, voice-note)
Experiment for making video streaming work for discord selfbots
An Open-source Voice & Video Calling UI Component Based on Tencent Cloud Service.
AI-powered chatbot widget for Next.js and React.js — answers site questions, web search fallback, appointment scheduling, navigation, voice support, and tutorials.
Voice synthesis and transcription tools for AgentOS via OpenAI, ElevenLabs, Deepgram, and local Ollama/Whisper-compatible runtimes
Voice calls, SMS, missions, and approvals via ClawTalk — OpenClaw plugin
Voice input for Pi. Multi-provider STT with Deepgram streaming, Groq Whisper, OpenAI Whisper. 56+ languages.
Check for valid SSML
Professional live speech transcription library for TypeScript/JavaScript with multi-provider support
React Native SDK for VocalLabs audio calls with direct WebSocket connection
Converts text to speech and plays it on a Sonos player, or generates an audio file for use with third-party nodes. Works with voices from Google (also without credentials), Google TTS, ElevenLabs.io TTS, Voice.ai TTS or your own voice. You can
A lightweight token generator for 4Players ODIN
🎙️ A lightweight React hook for audio recording using native Web APIs (MediaRecorder, getUserMedia). Start, stop, pause, resume audio recordings with customizable callbacks. Perfect for voice notes, interviews, podcasts, and real-time audio processing in
Native voice plugin for discordjs-nextgen (without @discordjs/voice)
TeliTask MCP server — manage contacts, tasks, and calls from AI assistants
Voice calling plugin for OpenClaw — give your AI agent a phone number
OpenClaw VoicyClaw channel plugin
MCP server for Voximplant — cloud telephony, call history, SMS (Russia)
A custom filter library for speech enhancement
npm wrapper for responsivevoice.js obtained from dataplusscience.com
Drop-in feedback widget for React — text + voice recording with Whisper transcription
The official Node.js library for the Typecast API. Text-to-Speech with AI voices. TypeScript support included.
iFLYTEK speech recognition SDK with real-time voice dictation in the browser
Animalese TTS is an Animal Crossing style Voice Synthesis (TTS) engine.
Web SDK for the Estuary real-time AI conversation platform
Hold-to-talk voice input for Pi CLI — cloud streaming via Deepgram or fully offline with 19 local models
Eagle Speaker Recognition engine for web browsers (via WebAssembly)
Managed Oomi chat, voice bridge, and XR-first persona scaffolding for OpenClaw
Dial your AI agent into every platform. One identity. Every channel.
NodeJS wrapper for Deepgram
AI-powered cursor companion for web apps
Core library to check for valid SSML
Real-time voice reporting for Claude Code
Opus 1.6 audio encoding for React Native and Expo with audio level metering and lifecycle events. Forked from Scdales/opuslib.
Cheetah Speech-to-Text engine for web browsers (via WebAssembly)
React component for Rhino Web SDK
React wrapper for VoxGlide — loads SDK at runtime from proxy server, zero bundled SDK code
React Native VOIP SDK bridge wrapper for Android and iOS
Open-source voice toolkit for Apple Silicon. Speech-to-text, language detection. 25 languages.
Picovoice Rhino Node.js binding
Vosk node API based on Koffi.
Web SDK for Voice.ai - Easy integration of voice agents into JavaScript applications
The WebSocket/WebRTC library by lirax.ua (PBX Cloud Platform)
Real-time audio noise reduction with advanced chunked processing for web applications
An Open-source Voice & Video Calling UI Component Based on Tencent Cloud Service.
Local AI voice for coding assistants — TTS & STT via MCP. Kokoro ONNX + faster-whisper, fully offline.
JavaScript Web API for Text-to-Speech and Speech-to-Text.
A simple audio recorder package for React applications using the JavaScript Web Audio API.
TypeScript SDK for Sarvam Conversational AI
Infobip Node Client
Wake-word inference for JavaScript and TypeScript. High-level API over @sonnetics/core.
Your terminal has feelings now
Framework-agnostic WebSocket library for real-time audio streaming with MessagePack and Opus encoding
AI audio content creation CLI — stories, podcasts, narration, dubbing, transcription, translation, and video translation with TTS
High-quality audio recording Capacitor plugin with native iOS & Android support. Features pause/resume, microphone management, real-time monitoring, audio trimming, and comprehensive mobile audio recording capabilities.
Lightweight WebRTC browser library that supports video, audio and data channels
Unofficial TypeScript version of the Africa's Talking SDK
$CLAWD — Solana x xAI agentic engine powered by Grok. Multi-agent research (16 agents), vision, image gen, voice, function calling, X search, and 31 MCP tools. CLAWD Cloud OS bootstrap for E2B/Docker/any terminal.
Universal speech-to-text router for Gladia, AssemblyAI, Deepgram, Azure, OpenAI Whisper, Speechmatics, Soniox, and ElevenLabs
n8n community node for Soniox speech-to-text transcription
Official Node.js/TypeScript SDK for the ClawOps Voice API
A command-line tool to test voice agents using Puppeteer
Autonomous Agent SDK for executing real-world commercial transactions with automatic x402 payments
Fork of capacitor-voice-recorder with custom modifications
React hook for Cheetah Web SDK
Synthesize speech from text with full control over language, voice, pitch, rate, and volume.
React library for audio recording and visualization using Web Audio API
Tiledesk VOICE Twilio connector
Telnyx voice call provider for AgentOS — outbound/inbound calls via Telnyx Call Control v2
A voice module for Stoat
Comprehensive Voice Agent SDK with Customizable Widget - Real-time audio, WebSocket communication, React components, and extensive customization options
Inworld TTS SDK – generate, stream, and voice management
Leopard Speech-to-Text engine for web browsers (via WebAssembly)
Smart voice notification plugin for OpenCode with multiple TTS engines (ElevenLabs, Edge TTS, Windows SAPI), AI-generated dynamic messages, and intelligent reminder system
React Native SDK for Alibaba Cloud real-time speech recognition
Security validation, logging, context monitoring, and Kokoro TTS voice notifications for OpenCode
Picovoice Eagle Node.js binding
ESP32 LAN voice coding bridge with inject, Codex, and Claude modes; inject is the recommended default.
For personal use: generates speech with the international version of MiniMax; does not register with chatluna, only intercepts chatluna's LLM output.
ElevenLabs adapter for TanStack AI realtime voice
Extensible composite AI voice-agent browser SDK
Orca Text-to-Speech engine for web browsers (via WebAssembly)
Real-time bidirectional audio streaming for Expo and React Native. Record microphone input and play audio chunks with low latency using native AVAudioEngine.
This package creates Speech Synthesis Markup Language (SSML) using the builder pattern.
MCP server for Mango Office — cloud PBX, call management (Russia)
Official AfricasTalking node.js API wrapper (with updated dependencies)
Voice generation widget with model/voice selection and audio preview for Directus
`npm i react-recorder-voice`
Next.js SDK and components for ServiceAgent chat, low-latency voice agents, dialer workflows, booking, webhooks, and server-side API access
MCP server for AgentPhone — give AI agents phone numbers, SMS, and voice calls
MCP server for MTS Exolve — SMS, calls, recordings, Viber messaging (Russia)
React hook for Leopard Web SDK
n8n community node for VocalLabs AI Voice API
An Open-source Voice & Video Calling UI Component Based on Tencent Cloud Service.
TypeScript and Node.js SDK for ServiceAgent APIs, AI agents, knowledge base search, CRM sync, analytics, billing, and workflow automation