JSPM

@elizaos/plugin-elevenlabs-root

2.0.0-alpha.1
  • ESM via JSPM
  • ES Module Entrypoint
  • Export Map
  • Keywords
  • License
  • Repository URL
  • TypeScript Types
  • README
  • Created
  • Published
  • Downloads 11
  • Score
    100M100P100Q25588F
  • License MIT

High-quality text-to-speech and speech-to-text plugin for ElizaOS using ElevenLabs API

Package Exports

  • @elizaos/plugin-elevenlabs-root
  • @elizaos/plugin-elevenlabs-root/package.json

Readme

ElevenLabs Plugin for ElizaOS

High-quality text-to-speech (TTS) and speech-to-text (STT) plugin for ElizaOS using the ElevenLabs API.

Features

  • Text-to-Speech (TTS): High-quality voice synthesis with multiple voice models
  • Speech-to-Text (STT): Accurate transcription with Scribe v1 model
  • Speaker Diarization: Identify up to 32 different speakers
  • Multi-language Support: 99 languages for STT
  • Audio Event Detection: Detect laughter, applause, and other audio events
  • Streaming Support: Efficient memory usage with streaming audio output
  • Multi-runtime: Available for TypeScript, Python, and Rust

Installation

TypeScript (npm)

npm install @elizaos/plugin-elevenlabs
# or
bun add @elizaos/plugin-elevenlabs

Python (PyPI)

pip install eliza-plugin-elevenlabs

Rust (crates.io)

Add to your Cargo.toml:

[dependencies]
eliza-plugin-elevenlabs = "0.1.0"

Quick Start

TypeScript

import { elevenLabsPlugin } from '@elizaos/plugin-elevenlabs';

// Add to your character configuration
const character = {
  plugins: ['@elizaos/plugin-elevenlabs'],
  settings: {
    ELEVENLABS_API_KEY: 'your-api-key',
  },
};

Python

from eliza_plugin_elevenlabs import ElevenLabsService

async with ElevenLabsService(api_key="your-api-key") as service:
    # Text-to-speech
    audio = await service.text_to_speech_bytes("Hello, world!")
    
    # Speech-to-text
    transcript = await service.speech_to_text(audio_bytes)

Rust

use eliza_plugin_elevenlabs::ElevenLabsService;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let service = ElevenLabsService::new("your-api-key");
    
    // Text-to-speech
    let audio = service.text_to_speech("Hello, world!").await?;
    
    // Speech-to-text
    let transcript = service.speech_to_text(&audio).await?;
    
    Ok(())
}

Configuration

Environment Variable Description Default
ELEVENLABS_API_KEY ElevenLabs API key Required
ELEVENLABS_VOICE_ID Voice ID for TTS EXAVITQu4vr4xnSDxMaL
ELEVENLABS_MODEL_ID TTS model ID eleven_monolingual_v1
ELEVENLABS_VOICE_STABILITY Voice stability (0-1) 0.5
ELEVENLABS_VOICE_SIMILARITY_BOOST Similarity boost (0-1) 0.75
ELEVENLABS_VOICE_STYLE Voice style intensity (0-1) 0
ELEVENLABS_VOICE_USE_SPEAKER_BOOST Enable speaker boost true
ELEVENLABS_OPTIMIZE_STREAMING_LATENCY Latency optimization (0-4) 0
ELEVENLABS_OUTPUT_FORMAT Audio output format mp3_44100_128
ELEVENLABS_BROWSER_URL Browser proxy URL -
ELEVENLABS_STT_MODEL_ID STT model ID scribe_v1
ELEVENLABS_STT_LANGUAGE_CODE Language code for STT auto-detect
ELEVENLABS_STT_TIMESTAMPS_GRANULARITY Timestamp detail level word
ELEVENLABS_STT_DIARIZE Enable speaker diarization false
ELEVENLABS_STT_NUM_SPEAKERS Expected number of speakers (1-32) -
ELEVENLABS_STT_TAG_AUDIO_EVENTS Tag audio events false

Project Structure

plugin-elevenlabs/
├── package.json          # Root package with multi-language scripts
├── README.md             # This file
├── .gitignore
├── .github/
│   └── workflows/
│       ├── ci.yml        # CI for all languages
│       ├── npm-deploy.yml    # npm publishing
│       ├── pypi-deploy.yml   # PyPI publishing
│       └── crates-deploy.yml # crates.io publishing
├── typescript/           # TypeScript implementation
│   ├── package.json
│   ├── src/
│   └── README.md
├── python/               # Python implementation
│   ├── pyproject.toml
│   ├── src/
│   └── README.md
└── rust/                 # Rust implementation
    ├── Cargo.toml
    ├── src/
    └── README.md

Development

Build All

bun run build

Build Individual Languages

bun run build:ts      # TypeScript
bun run build:python  # Python
bun run build:rust    # Rust

Test All

bun run test

Test Individual Languages

bun run test:ts      # TypeScript
bun run test:python  # Python
bun run test:rust    # Rust

Lint

bun run lint         # All languages
bun run lint:ts      # TypeScript
bun run lint:python  # Python
bun run lint:rust    # Rust

Model Types

TEXT_TO_SPEECH

Converts text into spoken audio. Supports:

  • Multiple voice models
  • Configurable voice parameters
  • Streaming output
  • Various audio formats (MP3, PCM, etc.)

TRANSCRIPTION

Converts audio/video into text transcripts. Supports:

  • 99 languages with auto-detection
  • Speaker diarization (up to 32 speakers)
  • Word/character-level timestamps
  • Audio event tagging

Supported Models

TTS Models

  • eleven_monolingual_v1
  • eleven_multilingual_v1
  • eleven_multilingual_v2
  • eleven_turbo_v2
  • eleven_turbo_v2_5

STT Models

  • scribe_v1

License

MIT