JSPM

  • ESM via JSPM
  • ES Module Entrypoint
  • Export Map
  • Keywords
  • License
  • Repository URL
  • TypeScript Types
  • README
  • Created
  • Published
  • Downloads 167
  • Score
    100M100P100Q84331F
  • License MIT

Voice-powered AI assistant platform — connect any LLM, any TTS, with a live web canvas, music generation, and agent orchestration

Package Exports

    This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (openvoiceui) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

    Readme

    OpenVoiceUI Banner

    OpenVoiceUI

    The open-source voice AI that actually does work.

    npm version MIT License GitHub Stars Website

    Install, open localhost:5001, say "build me a dashboard", and watch it render live.


    Watch the demo -- see voice-to-canvas in action


    Install

    Prerequisite: Docker must be installed and running for all install methods.

    Pinokio (one-click)

    Download Pinokio if you don't have it, then search "OpenVoiceUI" in the app store and click Install.

    npm

    npx openvoiceui setup     # interactive wizard — walks you through API keys + builds Docker images
    npx openvoiceui start     # starts everything

    Docker

    git clone https://github.com/MCERQUA/OpenVoiceUI.git
    cd OpenVoiceUI
    cp .env.example .env        # edit with your API keys
    docker compose up

    Open localhost:5001 and start talking.


    What is OpenVoiceUI?

    OpenVoiceUI is a hands-free, AI-controlled computer. You talk — it builds. Live web apps, dashboards, games, full websites — rendered in real time while you watch. No mouse, no keyboard, no typing prompts into a chat box.

    It runs on OpenClaw and works with any LLM. The AI agent can build and display apps mid-conversation, switch between projects with a voice command, generate music on the fly, delegate work to parallel sub-agents, and remember everything across sessions. It uses any Claude Code or OpenClaw skill — and the community can build and share more through the plugin system.

    Self-hosted. Your hardware, your data. MIT licensed, forever free.

    Core Features

    • Hands-Free AI Computer — Talk and watch it work. The AI builds apps, switches between projects, runs tasks, and displays results on a live visual canvas — all without touching a mouse or keyboard.
    • Live Canvas — AI renders real HTML pages mid-conversation: dashboards, tools, galleries, reports, full web apps. Not text responses — real interactive pages you can use.
    • AI Music Generation — Generate songs on the fly with your voice using Suno. Full music player with playlist management built in.
    • Custom Animated Interface — Choose from animated face modes (eye-face avatar, reactive halo-smoke orb) or install community-built faces through plugins. Build your own — the face system is fully extensible.
    • Sub-Agents — Delegate multiple tasks to parallel AI workers simultaneously and get results back.
    • Long-Term Memory — Optional context engine plugin curates knowledge every turn. Persists across sessions in human-readable markdown.
    • Desktop OS Interface — Themed desktop environment with window management (Windows XP, macOS, Ubuntu, Win95, Win 3.1).
    • Admin Dashboard — Mobile-responsive. Agent profiles, provider config, workspace file browser, plugin management, system health. Everything editable live.
    • Self-Hosted — Your hardware, your data. No vendor lock-in, no monthly fees.

    And More

    • Image generation (FLUX.1, Stable Diffusion 3.5)
    • Video creation (Remotion Studio)
    • Voice cloning (Qwen3-TTS via fal.ai)
    • Cron jobs for scheduled automation
    • File explorer with drag-and-drop
    • Agent profiles — switch personas, voices, and LLM providers from the admin panel

    Install Details

    Option 1: Pinokio (one-click)

    1. Install Pinokio if you don't have it
    2. Search "OpenVoiceUI" in the Pinokio app store
    3. Click Install, then Start

    Pinokio handles Docker, dependencies, and configuration automatically.

    Option 2: npm

    Requires Node.js 20+, Python 3.10+, and Docker.

    npx openvoiceui setup     # interactive wizard — configures LLM, TTS, API keys, builds Docker images
    npx openvoiceui start     # starts OpenClaw gateway + Supertonic TTS + voice UI

    The setup wizard walks you through choosing an LLM provider, TTS provider, and entering API keys. Configuration is saved to .env and openclaw-data/.

    npx openvoiceui stop      # stop all services
    npx openvoiceui status    # check what's running
    npx openvoiceui logs      # tail service logs

    Option 3: Docker

    Requires Docker and Docker Compose.

    git clone https://github.com/MCERQUA/OpenVoiceUI.git
    cd OpenVoiceUI
    cp .env.example .env

    Edit .env with your API keys (at minimum: an LLM provider key and optionally a TTS key). Then:

    docker compose up -d

    This starts three containers:

    Container Port Purpose
    openclaw 18791 LLM gateway — routes to your chosen LLM provider
    supertonic (internal) Free local TTS — no API key needed
    openvoiceui 5001 Voice UI + Canvas + Admin dashboard

    Open http://localhost:5001 to use the voice interface, or http://localhost:5001/admin for the admin dashboard.

    To stop: docker compose down

    Option 4: VPS / Production

    For running on an Ubuntu server with nginx and systemd:

    git clone https://github.com/MCERQUA/OpenVoiceUI.git
    cd OpenVoiceUI
    cp .env.example .env               # edit with your API keys
    sudo bash deploy/setup-sudo.sh     # creates dirs, installs systemd service
    bash deploy/setup-nginx.sh         # generates nginx config (edit domain)

    See deploy/ for the full production setup including SSL, nginx reverse proxy, and systemd service files.


    Configuration

    All configuration is in .env. Copy .env.example to .env and fill in your values.

    Required:

    • An LLM provider API key (OpenAI, Anthropic, Groq, Z.AI, or any OpenClaw-compatible provider)
    • CLAWDBOT_AUTH_TOKEN — set during npx openvoiceui setup or in OpenClaw's setup wizard

    Optional but recommended:

    • GROQ_API_KEY — enables Groq Orpheus TTS (fast, high quality, free tier)
    • SUNO_API_KEY — enables AI music generation
    • CLERK_PUBLISHABLE_KEY — enables login/auth (for multi-user or public deployments)

    See .env.example for all available options with descriptions.


    Works With Any Provider

    LLM

    Provider Status
    OpenClaw Gateway Built-in — routes to OpenAI, Anthropic, Groq, Z.AI, and more
    Z.AI (GLM-5-turbo) Built-in
    Groq (Llama, Qwen) Via OpenClaw
    Google Gemini Via OpenClaw
    MiniMax Via OpenClaw
    Ollama (local) Via adapter
    Any LLM Drop-in gateway plugin

    Text-to-Speech

    Provider Status
    Supertonic (local) Free, ships with Docker setup
    Groq Orpheus Fast cloud TTS, free tier
    Resemble AI Premium cloned voices
    Qwen3-TTS (fal.ai) Voice cloning
    Hume EVI Emotion-aware
    ElevenLabs High quality, many voices

    Speech-to-Text

    Provider Status
    Web Speech API Free, browser-native (default)
    Deepgram Streaming, accurate
    Groq Whisper Fast cloud transcription

    Admin Dashboard

    Access at localhost:5001/admin. Mobile-responsive.

    • Profiles — View and activate agent personas
    • Agent Editor — Edit name, voice, LLM provider, system prompt, features, and agent workspace files. 4 tabs: Profile, System Prompt, Features, Agent Files
    • Plugins — Install and manage face packs, gateways, and extensions
    • Canvas Pages — Toggle public/private, lock pages, delete with archive
    • Workspace Files — Browse and edit agent workspace. Audio playback, image preview built in.
    • Music (Suno) — View all generated songs, play inline, archive tracks
    • Provider Config — Select LLM, TTS, STT providers. Saves to active profile.
    • Health and Stats — CPU, RAM, disk, gateway status, session reset
    • Connector Tests — 12 automated endpoint diagnostics

    Use Cases

    Small Business — AI receptionist, appointment scheduler, report builder. Talk to your AI and get a live dashboard of today's leads, reviews, and tasks.

    Digital Agencies — Deploy custom AI assistants per client. Multi-tenant ready. Each client gets their own voice-powered workspace.

    Developers — Fork it, extend it, deploy it anywhere. MIT licensed. Build custom plugins, gateway adapters, and canvas pages on top of a voice-first platform.


    How It's Different

    OpenVoiceUI Typical Voice AI
    Source Open source (MIT) Closed source
    Canvas UI Live HTML rendering Text/audio only
    Skills Any Claude Code or OpenClaw skill API endpoints
    Music AI music generation (Suno) None
    Memory Plugin-based long-term context Session only
    Admin Full dashboard, mobile-ready Config files
    Plugins Community face packs, pages, workflows None
    Hosting Self-hosted, your data Vendor cloud only
    Pricing Free forever Per-minute billing

    Tech Stack

    Layer Technology
    Backend Python / Flask
    Frontend Vanilla JS (ES modules, no framework)
    Canvas Fullscreen iframe + SSE
    STT Web Speech API, Deepgram, Groq Whisper
    TTS Supertonic, Groq Orpheus, Resemble, Qwen3-TTS
    LLM Any provider via OpenClaw gateway
    Memory Context engine plugin (markdown knowledge base)
    Auth Clerk (optional)
    Deploy npm, Docker, Pinokio, VPS/systemd

    Plugins

    OpenVoiceUI has a plugin system for community-built extensions. Plugins can include animated face packs, canvas pages, workflow dashboards, gateway adapters, or any combination.

    Plugin Type Description
    BHB Animated Characters Face Pack Animated BigHead Billionaires character avatars with lip-sync, mood expressions, and show lore. By BHaleyart
    Hermes Agent Gateway Self-improving AI agent with auto-generated skills, deep memory search, and autonomous tasks. Adds OpenClaw+Hermes hybrid and Hermes-only modes
    SEO Platform Canvas Page Full SEO dashboard powered by DataForSEO — keyword research, rank tracking, backlink analysis, site audits, AI visibility, and local SEO
    Twenty CRM Canvas Page Connect to a Twenty CRM instance for contact, company, deal, and task management with embedded CRM view and setup wizard
    ByteRover Long-Term Memory Context Engine Persistent long-term memory that curates conversation knowledge every turn into a human-readable markdown knowledge base

    Build your own. Face packs, canvas pages, workflow dashboards, gateway adapters (template), or STT/TTS adapters (template). See the plugins repo for submission guidelines.


    Documentation

    Contributing

    We welcome contributions — especially plugins. Build a face pack, a canvas page, a workflow dashboard, or a full extension and submit it to the plugins repo. See CONTRIBUTING.md for code contribution guidelines and openvoiceui.com for full documentation.

    License

    MIT


    Website  ·  GitHub  ·  npm  ·  Plugins