Installation
Authentication
Set your API key in the ELEVENLABS_API_KEY environment variable.
Components
TTS - Text-to-Speech
Convert text to natural-sounding speech. The configuration options are:
- ElevenLabs API key. Defaults to the ELEVENLABS_API_KEY environment variable.
- The voice ID to use for synthesis. Browse voices at the ElevenLabs Voice Library.
- The model ID for synthesis:
  - eleven_multilingual_v2 - Multilingual, high quality
  - eleven_turbo_v2 - Fastest, English optimized
  - eleven_monolingual_v1 - English only
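
The three options above can be bundled and checked before any synthesis request is made. The following is a minimal sketch, not the plugin's own interface: `TTSSettings`, `KNOWN_MODELS`, and the field names `api_key`, `voice_id`, and `model_id` are hypothetical names chosen for illustration, and treating the three listed model IDs as the complete allowed set is an assumption.

```python
import os
from dataclasses import dataclass, field

# Model IDs listed in this document; treating them as the full set is an assumption.
KNOWN_MODELS = frozenset(
    {"eleven_multilingual_v2", "eleven_turbo_v2", "eleven_monolingual_v1"}
)


def _key_from_env() -> str:
    """Read the key from ELEVENLABS_API_KEY, or return an empty string."""
    return os.environ.get("ELEVENLABS_API_KEY", "")


@dataclass(frozen=True)
class TTSSettings:
    """Hypothetical container for the three options described above."""

    voice_id: str
    model_id: str
    # Falls back to the environment variable when no key is passed explicitly.
    api_key: str = field(default_factory=_key_from_env, repr=False)

    def __post_init__(self) -> None:
        if not self.voice_id:
            raise ValueError("voice_id must not be empty")
        if self.model_id not in KNOWN_MODELS:
            raise ValueError(f"unknown model_id: {self.model_id!r}")
        if not self.api_key:
            raise ValueError("no API key given and ELEVENLABS_API_KEY is not set")
```

Validating up front means a mistyped model ID or a missing key surfaces as a clear error at construction time rather than as a failed request later.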
STT - Speech-to-Text
ElevenLabs also provides STT capabilities.
Usage Examples
Basic Voice Agent
With Custom Voice
With Turbo Model for Speed
Set Output Track Manually
Listen to Audio Events
Voice Selection
ElevenLabs offers a wide variety of voices. Find voices at:
- ElevenLabs Voice Library
- Voice Lab - Create custom voices
Example voice IDs:
- 21m00Tcm4TlvDq8ikWAM - Rachel (female, calm)
- VR6AewLTigWG4xSOukaG - Default (female)
- pNInz6obpgDQGcFmaJgB - Adam (male, deep)
- EXAVITQu4vr4xnSDxMaL - Bella (female, soft)
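
Because voice IDs are opaque strings, it can help to refer to them by name in application code. The sketch below maps the four voices listed above to their IDs; `VOICES` and `resolve_voice` are hypothetical helpers for illustration and are not part of the plugin.

```python
# Voice IDs copied from the list above, keyed by lowercase display name.
VOICES = {
    "rachel": "21m00Tcm4TlvDq8ikWAM",
    "default": "VR6AewLTigWG4xSOukaG",
    "adam": "pNInz6obpgDQGcFmaJgB",
    "bella": "EXAVITQu4vr4xnSDxMaL",
}


def resolve_voice(name_or_id: str) -> str:
    """Return the ID for a known voice name; pass any other value through as an ID."""
    cleaned = name_or_id.strip()
    return VOICES.get(cleaned.lower(), cleaned)
```

Passing unknown values through unchanged keeps the helper usable with custom voice IDs from the Voice Library or Voice Lab.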
Model Selection
| Model | Best For | Speed | Languages |
|---|---|---|---|
| eleven_turbo_v2 | Fast responses | Fastest | English |
| eleven_multilingual_v2 | Quality & languages | Medium | 29+ languages |
| eleven_monolingual_v1 | English quality | Medium | English only |
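
The table can be read as a simple decision rule. The sketch below encodes one possible reading of it; `pick_model` is a hypothetical helper, and routing every non-English language code to the multilingual model is an assumption drawn from the Languages column rather than documented plugin behavior.

```python
def pick_model(language: str = "en", prefer_speed: bool = False) -> str:
    """Pick a model ID from the table for a language code and a speed preference."""
    code = language.strip().lower()
    is_english = code == "en" or code.startswith("en-")
    if not is_english:
        # Only the multilingual model is listed with non-English support.
        return "eleven_multilingual_v2"
    # For English, trade quality against latency as the table suggests.
    return "eleven_turbo_v2" if prefer_speed else "eleven_monolingual_v1"
```

For example, `pick_model("de")` selects the multilingual model regardless of the speed preference, while `pick_model("en", prefer_speed=True)` selects the turbo model.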
Configuration
Environment Variables
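
The only variable named in this document is ELEVENLABS_API_KEY. A minimal sketch of checking it at startup, so that a missing key fails early with a readable message; `load_api_key` is a hypothetical helper and not part of the plugin.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "ELEVENLABS_API_KEY"


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, raising if it is missing or blank."""
    source = os.environ if env is None else env
    key = source.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} is not set; export it before starting the agent")
    return key
```

Accepting an optional mapping makes the check easy to exercise in tests without touching the real process environment.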
Quality vs Speed Trade-offs
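For Best Quality: eleven_multilingual_v2 is the model listed above for quality and language coverage.
For Best Speed: eleven_turbo_v2 is the model listed above as fastest.
Features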
- Natural, expressive voices
- Low-latency streaming
- Multilingual support (29+ languages)
- Custom voice cloning (on ElevenLabs platform)
- Voice design and fine-tuning
- Emotion and style control
API Details
The plugin uses ElevenLabs API v1:
- WebSocket streaming for low latency
- Automatic chunking for real-time playback
- Event-based audio delivery
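
To illustrate how chunking and event-based delivery fit together, here is a small in-memory sketch. It does not open a WebSocket and is not the plugin's implementation: `iter_chunks`, `AudioEvents`, and `deliver` are hypothetical names, and the 4096-byte default chunk size is an arbitrary assumption.

```python
from typing import Callable, Iterator, List


def iter_chunks(audio: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield consecutive slices of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


class AudioEvents:
    """Tiny hypothetical event hub: each handler is called once per audio chunk."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[bytes], None]] = []

    def subscribe(self, handler: Callable[[bytes], None]) -> Callable[[bytes], None]:
        self._handlers.append(handler)
        return handler  # returning the handler allows use as a decorator

    def emit(self, chunk: bytes) -> None:
        for handler in self._handlers:
            handler(chunk)


def deliver(audio: bytes, events: AudioEvents, chunk_size: int = 4096) -> int:
    """Emit the audio chunk by chunk and return the number of chunks sent."""
    count = 0
    for chunk in iter_chunks(audio, chunk_size):
        events.emit(chunk)
        count += 1
    return count
```

Delivering small chunks as events lets a consumer start playback as soon as the first chunk arrives instead of waiting for the full clip.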
References
- ElevenLabs API Documentation
- Voice Library
- Pricing
- Plugin Source: plugins/elevenlabs/vision_agents/plugins/elevenlabs/__init__.py