elevenlabs-voices
High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.
Install
mkdir -p .claude/skills/elevenlabs-voices && curl -L -o skill.zip "https://mcp.directory/api/skills/download/2437" && unzip -o skill.zip -d .claude/skills/elevenlabs-voices && rm skill.zip
Installs to .claude/skills/elevenlabs-voices
About this skill
ElevenLabs Voice Personas v2.1
Comprehensive voice synthesis toolkit using ElevenLabs API.
🚀 First Run - Setup Wizard
When you first use this skill (no config.json exists), run the interactive setup wizard:
python3 scripts/setup.py
The wizard will guide you through:
- API Key - Enter your ElevenLabs API key (required)
- Default Voice - Choose from popular voices (Rachel, Adam, Bella, etc.)
- Language - Set your preferred language (32 supported)
- Audio Quality - Standard or high quality output
- Cost Tracking - Enable usage and cost monitoring
- Budget Limit - Optional monthly spending cap
🔒 Privacy: Your API key is stored locally in config.json only. It never leaves your machine and is automatically excluded from git via .gitignore.
To reconfigure at any time, simply run the setup wizard again.
✨ Features
- 18 Voice Personas - Carefully curated voices for different use cases
- 32 Languages - Multi-language synthesis with the multilingual v2 model
- Streaming Mode - Real-time audio output as it generates
- Sound Effects (SFX) - AI-generated sound effects from text prompts
- Batch Processing - Process multiple texts in one go
- Cost Tracking - Monitor character usage and estimated costs
- Voice Design - Create custom voices from descriptions
- Pronunciation Dictionary - Custom word pronunciation rules
- OpenClaw Integration - Works with OpenClaw's built-in TTS
🎙 Available Voices
| Voice | Accent | Gender | Persona | Best For |
|---|---|---|---|---|
| rachel | 🇺🇸 US | female | warm | Conversations, tutorials |
| adam | 🇺🇸 US | male | narrator | Documentaries, audiobooks |
| bella | 🇺🇸 US | female | professional | Business, presentations |
| brian | 🇺🇸 US | male | comforting | Meditation, calm content |
| george | 🇬🇧 UK | male | storyteller | Audiobooks, storytelling |
| alice | 🇬🇧 UK | female | educator | Tutorials, explanations |
| callum | 🇺🇸 US | male | trickster | Playful, gaming |
| charlie | 🇦🇺 AU | male | energetic | Sports, motivation |
| jessica | 🇺🇸 US | female | playful | Social media, casual |
| lily | 🇬🇧 UK | female | actress | Drama, elegant content |
| matilda | 🇺🇸 US | female | professional | Corporate, news |
| river | 🇺🇸 US | neutral | neutral | Inclusive, informative |
| roger | 🇺🇸 US | male | casual | Podcasts, relaxed |
| daniel | 🇬🇧 UK | male | broadcaster | News, announcements |
| eric | 🇺🇸 US | male | trustworthy | Business, corporate |
| chris | 🇺🇸 US | male | friendly | Tutorials, approachable |
| will | 🇺🇸 US | male | optimist | Motivation, uplifting |
| liam | 🇺🇸 US | male | social | YouTube, social media |
🎯 Quick Presets
- default → rachel (warm, friendly)
- narrator → adam (documentaries)
- professional → matilda (corporate)
- storyteller → george (audiobooks)
- educator → alice (tutorials)
- calm → brian (meditation)
- energetic → liam (social media)
- trustworthy → eric (business)
- neutral → river (inclusive)
- british → george
- australian → charlie
- broadcaster → daniel (news)
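The presets listed here behave like aliases for voice names. A minimal sketch of how a `--voice` value might be resolved, assuming presets are plain aliases and that lookups are case-insensitive; `PRESETS` and `resolve_voice` are hypothetical names, not taken from the skill's scripts:

```python
# Hypothetical sketch: preset aliases mapped to the voice names from the list above.
PRESETS = {
    "default": "rachel",
    "narrator": "adam",
    "professional": "matilda",
    "storyteller": "george",
    "educator": "alice",
    "calm": "brian",
    "energetic": "liam",
    "trustworthy": "eric",
    "neutral": "river",
    "british": "george",
    "australian": "charlie",
    "broadcaster": "daniel",
}


def resolve_voice(name):
    """Return the voice a preset points to, or the name itself if it is not a preset."""
    key = name.strip().lower()  # assumption: names are matched case-insensitively
    return PRESETS.get(key, key)


if __name__ == "__main__":
    print(resolve_voice("broadcaster"))
    print(resolve_voice("rachel"))
```

A name that is not a preset passes through unchanged, so the same argument can carry either a preset or a voice name.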
🌍 Supported Languages (32)
The multilingual v2 model supports these languages:
| Code | Language | Code | Language |
|---|---|---|---|
| en | English | pl | Polish |
| de | German | nl | Dutch |
| es | Spanish | sv | Swedish |
| fr | French | da | Danish |
| it | Italian | fi | Finnish |
| pt | Portuguese | no | Norwegian |
| ru | Russian | tr | Turkish |
| uk | Ukrainian | cs | Czech |
| ja | Japanese | sk | Slovak |
| ko | Korean | hu | Hungarian |
| zh | Chinese | ro | Romanian |
| ar | Arabic | bg | Bulgarian |
| hi | Hindi | hr | Croatian |
| ta | Tamil | el | Greek |
| id | Indonesian | ms | Malay |
| vi | Vietnamese | th | Thai |
# Synthesize in German
python3 scripts/tts.py --text "Guten Tag!" --voice rachel --lang de
# Synthesize in French
python3 scripts/tts.py --text "Bonjour le monde!" --voice adam --lang fr
# List all languages
python3 scripts/tts.py --languages
💻 CLI Usage
Basic Text-to-Speech
# List all voices
python3 scripts/tts.py --list
# Generate speech
python3 scripts/tts.py --text "Hello world" --voice rachel --output hello.mp3
# Use a preset
python3 scripts/tts.py --text "Breaking news..." --voice broadcaster --output news.mp3
# Multi-language
python3 scripts/tts.py --text "Bonjour!" --voice rachel --lang fr --output french.mp3
Streaming Mode
Generate audio with real-time streaming (good for long texts):
# Stream audio as it generates
python3 scripts/tts.py --text "This is a long story..." --voice adam --stream
# Streaming with custom output
python3 scripts/tts.py --text "Chapter one..." --voice george --stream --output chapter1.mp3
Batch Processing
Process multiple texts from a file:
# From newline-separated text file
python3 scripts/tts.py --batch texts.txt --voice rachel --output-dir ./audio
# From JSON file
python3 scripts/tts.py --batch batch.json --output-dir ./output
JSON batch format:
[
{"text": "First line", "voice": "rachel", "output": "line1.mp3"},
{"text": "Second line", "voice": "adam", "output": "line2.mp3"},
{"text": "Third line"}
]
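A batch file in this format can also be generated from a script. The sketch below uses only Python's standard library; `make_entry` and `write_batch` are hypothetical helpers, and the idea that entries without `voice` or `output` fall back to defaults is inferred from the third entry above rather than confirmed.

```python
import json
from pathlib import Path


def make_entry(text, voice=None, output=None):
    """Build one batch entry; optional keys are left out when not given."""
    entry = {"text": text}
    if voice:
        entry["voice"] = voice
    if output:
        entry["output"] = output
    return entry


def write_batch(path, entries):
    """Write the entries as a JSON array, the layout shown above."""
    Path(path).write_text(
        json.dumps(entries, indent=2, ensure_ascii=False), encoding="utf-8"
    )


if __name__ == "__main__":
    entries = [
        make_entry("First line", "rachel", "line1.mp3"),
        make_entry("Second line", "adam", "line2.mp3"),
        make_entry("Third line"),
    ]
    write_batch("batch.json", entries)
```

The resulting batch.json can then be passed to `--batch` as in the command above.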
Simple text format (one per line):
Hello, this is the first sentence.
This is the second sentence.
And this is the third.
Usage Statistics
# Show usage stats and cost estimates
python3 scripts/tts.py --stats
# Reset statistics
python3 scripts/tts.py --reset-stats
🎵 Sound Effects (SFX)
Generate AI-powered sound effects from text descriptions:
# Generate a sound effect
python3 scripts/sfx.py --prompt "Thunder rumbling in the distance"
# With specific duration (0.5-22 seconds)
python3 scripts/sfx.py --prompt "Cat meowing" --duration 3 --output cat.mp3
# Adjust prompt influence (0.0-1.0)
python3 scripts/sfx.py --prompt "Footsteps on gravel" --influence 0.5
# Batch SFX generation
python3 scripts/sfx.py --batch sounds.json --output-dir ./sfx
# Show prompt examples
python3 scripts/sfx.py --examples
Example prompts:
- "Thunder rumbling in the distance"
- "Cat purring contentedly"
- "Typing on a mechanical keyboard"
- "Spaceship engine humming"
- "Coffee shop background chatter"
🎨 Voice Design
Create custom voices from text descriptions:
# Basic voice design
python3 scripts/voice-design.py --gender female --age middle_aged --accent american \
--description "A warm, motherly voice"
# With custom preview text
python3 scripts/voice-design.py --gender male --age young --accent british \
--text "Welcome to the adventure!" --output preview.mp3
# Save to your ElevenLabs library
python3 scripts/voice-design.py --gender female --age young --accent american \
--description "Energetic podcast host" --save "MyHost"
# List all design options
python3 scripts/voice-design.py --options
Voice Design Options:
| Option | Values |
|---|---|
| Gender | male, female, neutral |
| Age | young, middle_aged, old |
| Accent | american, british, african, australian, indian, latin, middle_eastern, scandinavian, eastern_european |
| Accent Strength | 0.3-2.0 (subtle to strong) |
📖 Pronunciation Dictionary
Customize how words are pronounced:
Edit pronunciations.json:
{
"rules": [
{
"word": "OpenClaw",
"replacement": "Open Claw",
"comment": "Pronounce as two words"
},
{
"word": "API",
"replacement": "A P I",
"comment": "Spell out acronym"
}
]
}
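To illustrate what rules like these do to the input text, here is a minimal sketch of a rule applier. It assumes case-sensitive, whole-word matching with rules applied in file order; the skill's actual matching behavior is not documented here, and `apply_pronunciations` is a hypothetical name.

```python
import re


def apply_pronunciations(text, rules):
    """Replace each rule's 'word' with its 'replacement' on whole-word matches."""
    for rule in rules:
        # \b keeps "API" from matching inside longer words such as "APIs".
        pattern = r"\b" + re.escape(rule["word"]) + r"\b"
        # A function replacement avoids backslash interpretation in the replacement text.
        text = re.sub(pattern, lambda _m, rep=rule["replacement"]: rep, text)
    return text


if __name__ == "__main__":
    rules = [
        {"word": "OpenClaw", "replacement": "Open Claw"},
        {"word": "API", "replacement": "A P I"},
    ]
    print(apply_pronunciations("The OpenClaw API is great", rules))
```

With the two rules above, the sample sentence becomes "The Open Claw A P I is great" before it is sent for synthesis.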
Usage:
# Pronunciations are applied automatically
python3 scripts/tts.py --text "The OpenClaw API is great" --voice rachel
# Disable pronunciations
python3 scripts/tts.py --text "The API is great" --voice rachel --no-pronunciations
💰 Cost Tracking
The skill tracks your character usage and estimates costs:
python3 scripts/tts.py --stats
Output:
📊 ElevenLabs Usage Statistics
Total Characters: 15,230
Total Requests: 42
Since: 2024-01-15
💰 Estimated Costs:
Starter $4.57 ($0.30/1k chars)
Creator $3.66 ($0.24/1k chars)
Pro $2.74 ($0.18/1k chars)
Scale $1.68 ($0.11/1k chars)
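The estimates follow from the character count multiplied by each plan's per-1,000-character rate. A short sketch of that arithmetic, using the rates printed in the sample output above (they may not match current ElevenLabs pricing); `RATES_PER_1K` and `estimate_costs` are hypothetical names:

```python
# Rates per 1,000 characters, copied from the sample output above (assumption:
# these are illustrative and may differ from current pricing).
RATES_PER_1K = {
    "Starter": 0.30,
    "Creator": 0.24,
    "Pro": 0.18,
    "Scale": 0.11,
}


def estimate_costs(total_chars):
    """Return the estimated cost per plan, rounded to cents."""
    return {
        plan: round(total_chars / 1000 * rate, 2)
        for plan, rate in RATES_PER_1K.items()
    }


if __name__ == "__main__":
    for plan, cost in estimate_costs(15230).items():
        print(f"{plan:<8} ${cost:.2f}")
```

For 15,230 characters this reproduces the figures in the sample output, for example 15.23 × $0.30 ≈ $4.57 on the Starter rate.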
🤖 OpenClaw TTS Integration
Using with OpenClaw's Built-in TTS
OpenClaw has built-in TTS support that can use ElevenLabs. Configure in ~/.openclaw/openclaw.json:
{
"tts": {
"enabled": true,
"provider": "elevenlabs",
"elevenlabs": {
"apiKey": "your-api-key-here",
"voice": "rachel",
"model": "eleven_multilingual_v2"
}
}
}
Triggering TTS in Chat
In OpenClaw conversations:
- Use /tts on to enable automatic TTS
- Use the tts tool directly for one-off speech
- Request "read this aloud" or "speak this"
Using Skill Scripts from OpenClaw
# OpenClaw can run these scripts directly
exec python3 /path/to/skills/elevenlabs-voices/scripts/tts.py --text "Hello" --voice rachel
⚙ Configuration
The scripts look for the API key in this order:
- ELEVEN_API_KEY or ELEVENLABS_API_KEY environment variable
- Skill-local .env file (in the skill directory)
Create .env file:
echo 'ELEVEN_API_KEY=your-key-here' > .env
Note: The skill no longer reads from ~/.openclaw/openclaw.json. Use environment variables or the skill-local .env file.
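The lookup order described in this section can be sketched as follows. This is an illustration, not the skill's code: `read_env_file` and `resolve_api_key` are hypothetical names, checking ELEVEN_API_KEY before ELEVENLABS_API_KEY is an assumption (the text only says "or"), and the .env parser handles only simple KEY=value lines.

```python
import os
from pathlib import Path

KEY_NAMES = ("ELEVEN_API_KEY", "ELEVENLABS_API_KEY")


def read_env_file(path):
    """Parse simple KEY=value lines; blank lines and '#' comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(environ=None, env_file=".env"):
    """Return the first key found: environment variables first, then the .env file."""
    environ = os.environ if environ is None else environ
    for name in KEY_NAMES:
        if environ.get(name):
            return environ[name]
    file_values = read_env_file(env_file)
    for name in KEY_NAMES:
        if file_values.get(name):
            return file_values[name]
    return None


if __name__ == "__main__":
    key = resolve_api_key()
    print("API key found" if key else "No API key configured")
```

Passing `environ` explicitly makes the precedence easy to test without touching the real environment.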
🎛 Voice Settings
Each voice has tuned settings for optimal output:
| Setting | Range | Description |
|---|---|---|
| stability | 0.0-1.0 | Higher = consistent, lower = expressive |
| similarity_boost | 0.0-1.0 | How closely to match original voice |
| style | 0.0-1.0 | Exaggeration of speaking style |
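As an illustration of the ranges in the table, the sketch below validates the three settings and places them in a request body. It assumes the settings travel as a `voice_settings` object alongside `text` and `model_id`, as in ElevenLabs text-to-speech requests; the helper names are hypothetical and the numbers in the example are illustrative, not the skill's tuned values.

```python
def build_voice_settings(stability, similarity_boost, style=0.0):
    """Return a settings dict after checking each value is within 0.0-1.0."""
    settings = {
        "stability": stability,
        "similarity_boost": similarity_boost,
        "style": style,
    }
    for name, value in settings.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return settings


def build_payload(text, settings, model_id="eleven_multilingual_v2"):
    """Assemble a request body; no network call is made in this sketch."""
    return {"text": text, "model_id": model_id, "voice_settings": settings}


if __name__ == "__main__":
    # Illustrative values only: lower stability for a more expressive read.
    payload = build_payload("Hello world", build_voice_settings(0.4, 0.75, 0.2))
    print(payload)
```

Rejecting out-of-range values locally surfaces a typo such as `stability=5` before any characters are billed.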
📝 Triggers
- "use {voice_name} voice"
- "speak as {persona}"
- "list voices"
- "voice settings"
- "generate sound effect"
- "design a voice"
📁 Files
elevenlabs-vo
---
*Content truncated.*