OpenVoiceUI

The open-source voice AI that actually does work.

Talk to any LLM. Watch it build live web pages. Automate with 35+ skills.
Self-host with full privacy. MIT licensed, forever free.

Watch the demo -- see voice-to-canvas in action

Quickstart

npx openvoiceui setup
npx openvoiceui start

Open http://localhost:5001 in your browser, say "build me a dashboard", and watch it render live.

What is OpenVoiceUI?

Most voice AI platforms sell you a voice API. OpenVoiceUI gives you an entire AI workspace — voice, canvas, skills, agents, media generation — open source and self-hosted.

It's a voice-first AI assistant that doesn't just talk back — it builds live HTML pages mid-conversation, runs 35+ built-in skills, delegates work to parallel sub-agents, and remembers everything across sessions. Works with any LLM. Runs on your hardware. MIT licensed.

Core Features

Hands-Free Voice Control — Wake word, push-to-talk, or continuous mode. Works with any LLM provider.
Canvas UI — AI builds live HTML pages mid-conversation: dashboards, reports, galleries, tools. Real web apps, not text responses.
Skill System — 35+ built-in skills for social media, SEO, email, business briefings, marketing. Build your own without touching core code.
Sub-Agents — Parallel AI workers. Delegate multiple tasks simultaneously and get results back.
Memory System — Remembers across sessions. Gets smarter over time.
Self-Hosted — Your hardware, your data. Docker, npm, or one-click Pinokio. No vendor lock-in, no monthly fees.
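
To illustrate the sub-agent idea of delegating several tasks at once and collecting the results, here is a minimal sketch using Python's standard asyncio library. It is not OpenVoiceUI's actual implementation: the names run_sub_agent and delegate are hypothetical, and the worker body is a stand-in for real work such as an LLM call.

```python
import asyncio


async def run_sub_agent(task):
    # Hypothetical worker: the sleep stands in for real work
    # (an LLM call, a skill invocation, and so on).
    await asyncio.sleep(0.01)
    return f"done: {task}"


async def delegate(tasks):
    # gather() runs the coroutines concurrently and returns the
    # results in the same order as the input tasks.
    return await asyncio.gather(*(run_sub_agent(t) for t in tasks))


results = asyncio.run(delegate(["draft email", "summarize reviews"]))
print(results)
```

Because the workers run concurrently, the total wait is roughly that of the slowest task rather than the sum of all tasks.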

And More

Desktop OS interface with themes (Windows XP, macOS, Ubuntu, Win95, Win 3.1)
Image generation (FLUX.1, Stable Diffusion 3.5)
AI music generation & player (Suno)
Video creation (Remotion Studio)
Voice cloning (Qwen3-TTS via fal.ai)
Cron jobs for scheduled automation
File explorer with drag-and-drop
Animated face modes (eye-face avatar, halo smoke orb)
Agent profiles — switch personas via JSON
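
As a rough illustration of switching personas via JSON, the sketch below parses a profile into a small Python object. The field names (name, system_prompt, tts_provider) are assumptions made for this example, not the project's documented schema; check the project docs for the real format.

```python
import json
from dataclasses import dataclass


@dataclass
class AgentProfile:
    # All field names here are hypothetical, chosen for illustration.
    name: str
    system_prompt: str
    tts_provider: str = "supertonic"


def load_profile(text):
    """Parse a JSON string into an AgentProfile, applying defaults."""
    raw = json.loads(text)
    return AgentProfile(
        name=raw["name"],
        system_prompt=raw["system_prompt"],
        tts_provider=raw.get("tts_provider", "supertonic"),
    )


example = '{"name": "receptionist", "system_prompt": "Greet callers and book appointments."}'
profile = load_profile(example)
print(profile.name, profile.tts_provider)
```

Keeping a persona in a plain JSON file means it can be swapped without code changes, which is the design idea the feature list describes.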

Install Options

Method	Command / Steps	Best For
npm	`npx openvoiceui setup && npx openvoiceui start`	Quickest start
Pinokio	One-click install from Pinokio app store	Non-technical users
VPS	Run the setup script on any Ubuntu server	Production hosting
Docker	`docker compose up`	Containerized deployment
Dev Container	Open in VS Code Dev Container	Contributing / development

Works With Any Provider

LLM

Provider	Status
OpenClaw	Built-in — routes to OpenAI, Anthropic, Groq, and more
Z.AI (GLM)	Built-in
Ollama (local)	Via adapter
Any LLM	Drop-in gateway plugin

Text-to-Speech

Provider	Status
Supertonic (local)	Free, default
Groq Orpheus	Supported
Qwen3-TTS (fal.ai)	Supported
Hume EVI	Supported
ElevenLabs	Supported

Speech-to-Text

Provider	Status
Web Speech API	Free, default
Deepgram	Supported
Groq Whisper	Supported
Hume	Supported

Use Cases

Small Business — AI receptionist, appointment scheduler, report builder. Talk to your AI and get a live dashboard of today's leads, reviews, and tasks.

Digital Agencies — Deploy custom AI assistants per client. Multi-tenant, white-label ready. Each client gets their own voice-powered workspace.

Developers — Fork it, extend it, deploy it anywhere. MIT licensed. Build custom skills, gateway plugins, and adapters on top of a voice-first platform.

How It's Different

	OpenVoiceUI	Typical Voice AI
Source	Open source (MIT)	Closed source
Canvas UI	Live HTML rendering	Text/audio only
Skills	35+ built-in, extensible	API endpoints
Hosting	Self-hosted, your data	Vendor cloud only
Pricing	Free forever	Per-minute billing

Extend It

Build a skill — Add capabilities without touching core code. See docs/
Build a gateway plugin — Connect any LLM provider. See plugins/README.md
Build an adapter — Add new STT/TTS providers. See src/adapters/_template.js

Tech Stack

Layer	Technology
Backend	Python / Flask
Frontend	Vanilla JS (ES modules, no framework)
Canvas	Fullscreen iframe + SSE
STT	Web Speech API, Deepgram, Groq Whisper, Hume
TTS	Supertonic, Groq Orpheus, Qwen3-TTS, Hume EVI, ElevenLabs
LLM	Any provider via gateway adapter
Auth	Clerk (optional)
Deploy	npm, Docker, Pinokio, VPS
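
The Canvas row mentions SSE (Server-Sent Events), a standard text protocol in which a server streams frames made of "event:" and "data:" lines, each frame ending with a blank line. The sketch below shows how a Python backend might serialize such a frame. The helper names and the "canvas-update" event name are assumptions for illustration, not OpenVoiceUI's actual wire format.

```python
import json


def format_sse(data, event=None):
    """Serialize one Server-Sent Events frame as a string."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload needs its own "data:" field.
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def canvas_update(html):
    # "canvas-update" is a hypothetical event name; JSON-encoding the
    # HTML keeps the payload on a single data line.
    return format_sse(json.dumps({"html": html}), event="canvas-update")


print(canvas_update("<h1>Hi</h1>"), end="")
```

In a Flask app, frames like these would typically be yielded from a generator in a response with the text/event-stream MIME type, and a browser page would read them with the standard EventSource API.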

Documentation

Full Docs — architecture, provider guides, configuration
Architecture Overview
Website
Environment Variables

Contributing

We welcome contributions. See CONTRIBUTING.md for guidelines. This project is MIT licensed — fork it, build on it, make it yours.

License

MIT

Website · GitHub · npm · hello@openvoiceui.com