edge-tts
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Install
mkdir -p .claude/skills/edge-tts && curl -L -o skill.zip "https://mcp.directory/api/skills/download/4317" && unzip -o skill.zip -d .claude/skills/edge-tts && rm skill.zip
Installs to .claude/skills/edge-tts
About this skill
Edge-TTS Skill
Overview
Generate high-quality text-to-speech audio using Microsoft Edge's neural TTS service via the node-edge-tts npm package. Supports multiple languages, voices, adjustable speed/pitch, and subtitle generation.
Quick Start
When you detect TTS intent from triggers or user request:
- Call the tts tool (Clawdbot built-in) to convert text to speech
- The tool returns a MEDIA: path
- Clawdbot routes the audio to the current channel
// Example: Built-in tts tool usage
tts("Your text to convert to speech")
// Returns: MEDIA: /path/to/audio.mp3
Trigger Detection
Treat the "tts" keyword as a TTS request. The skill automatically strips TTS-related keywords from the text before conversion so that the trigger words themselves are not spoken.
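The filtering step can be pictured with a small sketch. This is not the skill's actual implementation: the function names stripTtsKeywords and hasTtsTrigger are hypothetical, and the whole-word, case-insensitive matching rule is an assumption. Only the keyword list (tts, TTS, text-to-speech) comes from this document.

```javascript
// Hypothetical helpers; the matching rules are assumptions, not the skill's code.
const TTS_KEYWORDS = ["text-to-speech", "tts"];

// True when the message contains "tts" as a standalone word (any case).
function hasTtsTrigger(text) {
  return /\btts\b/i.test(text);
}

// Remove trigger keywords so they are not read aloud.
function stripTtsKeywords(text) {
  const pattern = new RegExp(`\\b(?:${TTS_KEYWORDS.join("|")})\\b`, "gi");
  return text
    .replace(pattern, " ")     // drop each keyword
    .replace(/\s+/g, " ")      // collapse the gaps left behind
    .replace(/^[\s:,-]+/, "")  // drop leading punctuation such as "tts: "
    .trim();
}
```

With these rules, "tts: Read me the weather" becomes "Read me the weather", while a word such as "watts" is not treated as a trigger.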
Advanced Customization
Using the Node.js Scripts
For more control, use the bundled scripts directly:
TTS Converter
cd scripts
npm install
node tts-converter.js "Your text" --voice en-US-AriaNeural --rate +10% --output output.mp3
Options:
- --voice, -v: Voice name (default: en-US-AriaNeural)
- --lang, -l: Language code (e.g., en-US, es-ES)
- --format, -o: Output format (default: audio-24khz-48kbitrate-mono-mp3)
- --pitch: Pitch adjustment (e.g., +10%, -20%, default)
- --rate, -r: Rate adjustment (e.g., +10%, -20%, default)
- --volume: Volume adjustment (e.g., +0%, -10%, default)
- --save-subtitles, -s: Save subtitles as JSON file
- --output, -f: Output file path (default: tts_output.mp3)
- --proxy, -p: Proxy URL (e.g., http://localhost:7890)
- --timeout: Request timeout in milliseconds (default: 10000)
- --list-voices, -L: List available voices
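When the converter is launched from another Node.js process, the options above can be assembled into an argument array. The following is a minimal sketch: buildConverterArgs is a hypothetical helper that is not part of the skill, and it relies only on the long flag names listed above.

```javascript
// Hypothetical helper; only the flag names come from the options list.
function buildConverterArgs(text, opts = {}) {
  const args = ["tts-converter.js", text];

  // Options that take a value, in the order they will be emitted.
  const valueFlags = {
    voice: "--voice",
    lang: "--lang",
    format: "--format",
    pitch: "--pitch",
    rate: "--rate",
    volume: "--volume",
    output: "--output",
    proxy: "--proxy",
    timeout: "--timeout",
  };

  for (const [key, flag] of Object.entries(valueFlags)) {
    if (opts[key] !== undefined) args.push(flag, String(opts[key]));
  }

  // Boolean switch: present only when requested.
  if (opts.saveSubtitles) args.push("--save-subtitles");
  return args;
}
```

The resulting array can be passed to execFileSync("node", args, { cwd: "scripts" }) from node:child_process. Passing arguments as an array avoids shell quoting problems with values such as +10%.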
Configuration Manager
cd scripts
npm install
node config-manager.js --set-voice en-US-AriaNeural
node config-manager.js --set-rate +10%
node config-manager.js --get
node config-manager.js --reset
Voice Selection
Common voices (use --list-voices for full list):
English:
- en-US-MichelleNeural (female, natural, default)
- en-US-AriaNeural (female, natural)
- en-US-GuyNeural (male, natural)
- en-GB-SoniaNeural (female, British)
- en-GB-RyanNeural (male, British)
Other Languages:
- es-ES-ElviraNeural (Spanish, Spain)
- fr-FR-DeniseNeural (French)
- de-DE-KatjaNeural (German)
- ja-JP-NanamiNeural (Japanese)
- zh-CN-XiaoxiaoNeural (Chinese)
- ar-SA-ZariyahNeural (Arabic)
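The voice names listed above follow a language-region-Name pattern, so the language code for --lang can usually be derived from the voice itself. The sketch below assumes that pattern holds for the voice in question. parseVoiceName is a hypothetical helper, and voices with extra name segments may need different handling.

```javascript
// Hypothetical helper; the naming pattern is inferred from the voices listed above.
function parseVoiceName(voice) {
  const match = /^([a-z]{2,3})-([A-Z]{2})-(.+?)(Neural)?$/.exec(voice);
  if (!match) return null; // not in the expected language-REGION-Name form

  const [, language, region, name, neural] = match;
  return {
    lang: `${language}-${region}`, // e.g. "en-GB", usable as the --lang value
    name,                          // speaker name without the Neural suffix
    neural: Boolean(neural),       // true when the name ends in "Neural"
  };
}
```

For example, parseVoiceName("en-GB-SoniaNeural") yields lang "en-GB" and name "Sonia".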
Rate Guidelines
Rate values use percentage format:
- "default": Normal speed
- "-20%" to "-10%": Slow, clear (tutorials, stories, accessibility)
- "+10%" to "+20%": Slightly fast (summaries)
- "+30%" to "+50%": Fast (news, efficiency)
Output Formats
Choose audio quality based on use case:
- audio-24khz-48kbitrate-mono-mp3: Standard quality (voice notes, messages)
- audio-24khz-96kbitrate-mono-mp3: High quality (presentations, content)
- audio-48khz-96kbitrate-stereo-mp3: Highest quality (professional audio, music)
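Each format name encodes the sample rate, bitrate, channel layout, and codec, which makes it possible to compare formats or estimate file sizes before generating audio. The sketch below is illustrative only: parseOutputFormat and estimateAudioBytes are hypothetical helpers, and the size estimate assumes a constant bitrate with no container overhead.

```javascript
// Hypothetical helpers for reasoning about the format names listed above.
function parseOutputFormat(format) {
  const match = /^audio-(\d+)khz-(\d+)kbitrate-(mono|stereo)-(\w+)$/.exec(format);
  if (!match) return null;
  return {
    sampleRateKhz: Number(match[1]),
    bitrateKbps: Number(match[2]),
    channels: match[3] === "stereo" ? 2 : 1,
    codec: match[4],
  };
}

// Rough size estimate: bitrate (bits per second) / 8 * duration in seconds.
function estimateAudioBytes(format, seconds) {
  const parsed = parseOutputFormat(format);
  if (!parsed) return null;
  return Math.round((parsed.bitrateKbps * 1000 / 8) * seconds);
}
```

Under these assumptions, one minute of audio-24khz-48kbitrate-mono-mp3 comes to roughly 360000 bytes, and the 96 kbit/s formats to about twice that.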
Resources
scripts/tts-converter.js
Main TTS conversion script using node-edge-tts. Generates audio files with customizable voice, rate, volume, pitch, and format. Supports subtitle generation and voice listing.
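For orientation, here is a minimal sketch of how a script like this might call node-edge-tts as a module. It assumes the EdgeTTS class and its ttsPromise(text, filePath) method as described in the package README. The bundled tts-converter.js may be organized differently, and toEdgeTtsOptions and synthesize are hypothetical names.

```javascript
// Hypothetical mapping from CLI-style settings to node-edge-tts constructor options.
// Defaults mirror the CLI defaults given in the options list of this document.
function toEdgeTtsOptions(settings = {}) {
  const options = {
    voice: settings.voice ?? "en-US-AriaNeural",
    outputFormat: settings.format ?? "audio-24khz-48kbitrate-mono-mp3",
    timeout: settings.timeout ?? 10000,
  };
  // Optional settings are passed through only when provided.
  for (const key of ["lang", "pitch", "rate", "volume", "proxy"]) {
    if (settings[key] !== undefined) options[key] = settings[key];
  }
  if (settings.saveSubtitles) options.saveSubtitles = true;
  return options;
}

async function synthesize(text, outputPath, settings) {
  // Loaded lazily so the mapping above can be used without the package installed.
  const { EdgeTTS } = require("node-edge-tts");
  const tts = new EdgeTTS(toEdgeTtsOptions(settings));
  await tts.ttsPromise(text, outputPath); // needs network access to the Edge TTS service
  return outputPath;
}
```

A call such as synthesize("Hello", "output.mp3", { rate: "+10%" }) would then write the audio to output.mp3, provided the package is installed and the service is reachable.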
scripts/config-manager.js
Manages persistent user preferences for TTS settings (voice, language, format, pitch, rate, volume). Stores config in ~/.tts-config.json.
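Persisting preferences in a JSON file usually comes down to merging stored values over defaults. The sketch below illustrates that pattern only. loadConfig, saveConfig, and the DEFAULTS object are hypothetical, and the actual keys and defaults used by config-manager.js may differ.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Location taken from this document; the default values are assumptions.
const CONFIG_PATH = path.join(os.homedir(), ".tts-config.json");
const DEFAULTS = { voice: "en-US-AriaNeural", rate: "default", pitch: "default", volume: "default" };

// Read the stored config and overlay it on the defaults.
function loadConfig(file = CONFIG_PATH) {
  try {
    return { ...DEFAULTS, ...JSON.parse(fs.readFileSync(file, "utf8")) };
  } catch (err) {
    if (err.code === "ENOENT") return { ...DEFAULTS }; // no config saved yet
    throw err; // unreadable or malformed file: surface the problem
  }
}

// Merge a partial update into the stored config and write it back.
function saveConfig(patch, file = CONFIG_PATH) {
  const next = { ...loadConfig(file), ...patch };
  fs.writeFileSync(file, JSON.stringify(next, null, 2));
  return next;
}
```

In this sketch, a reset amounts to deleting the file, since a missing file falls back to the defaults.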
scripts/package.json
NPM package configuration with node-edge-tts dependency.
references/node_edge_tts_guide.md
Complete documentation for node-edge-tts npm package including:
- Full voice list by language
- Prosody options (rate, pitch, volume)
- Usage examples (CLI and Module)
- Subtitle generation
- Output formats
- Best practices and limitations
Refer to this guide when you need specific voice details or advanced features.
Voice Testing
Test different voices and preview audio quality at: https://tts.travisvn.com/
Installation
To use the bundled scripts:
cd /home/user/clawd/skills/public/edge-tts/scripts
npm install
This installs:
- node-edge-tts: TTS library
- commander: CLI argument parsing
Workflow
- Detect intent: Check for "tts" trigger or keyword in user message
- Choose method: Use the built-in tts tool for simple requests, or scripts/tts-converter.js for customization
- Generate audio: Convert the target text (message, search results, summary)
- Return to user: The tts tool returns a MEDIA: path; Clawdbot handles delivery
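The first two workflow steps can be sketched as a small decision function. This is an illustration under assumptions: planTtsRequest is a hypothetical name, and "customization" is taken to mean that the caller supplied at least one explicit option.

```javascript
// Hypothetical planner for the workflow above; not part of the skill itself.
function planTtsRequest(message, options = {}) {
  // Step 1: detect intent via the "tts" trigger word.
  if (!/\btts\b/i.test(message)) return { action: "none" };

  // Step 2: choose the method based on whether any customization was requested.
  const customized = Object.keys(options).length > 0;
  return {
    action: "speak",
    method: customized ? "scripts/tts-converter.js" : "built-in tts tool",
    options,
  };
}
```

A plain request such as "tts the summary" would go to the built-in tool, while the same request with a voice or rate override would go to the converter script.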
Testing
Basic Test
Run the test script to verify TTS functionality:
cd /home/user/clawd/skills/public/edge-tts/scripts
npm test
This generates a test audio file and verifies the TTS service is working.
Integration Test
Use the built-in tts tool for quick testing:
// Example: Test TTS with default settings
tts("This is a test of the TTS functionality.")
Configuration Test
Verify configuration persistence:
cd /home/user/clawd/skills/public/edge-tts/scripts
node config-manager.js --get
node config-manager.js --set-voice en-US-GuyNeural
node config-manager.js --get
Troubleshooting
- Test connectivity: Run npm test to check if the TTS service is accessible
- Check voice availability: Use node tts-converter.js --list-voices to see available voices
- Verify proxy settings: If using a proxy, test with node tts-converter.js "test" --proxy http://localhost:7890
- Check audio output: The test should generate test-output.mp3 in the scripts directory
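The last check can be automated with a few lines of Node.js. The sketch below only confirms that a file exists and is non-empty, and does not validate the MP3 data itself. checkAudioFile is a hypothetical helper.

```javascript
const fs = require("node:fs");

// Hypothetical helper: reports whether an audio file was written and has content.
function checkAudioFile(filePath) {
  let stats;
  try {
    stats = fs.statSync(filePath);
  } catch (err) {
    if (err.code === "ENOENT") return { ok: false, reason: "missing" };
    throw err; // permission errors and similar are surfaced as-is
  }
  if (stats.size === 0) return { ok: false, reason: "empty" };
  return { ok: true, bytes: stats.size };
}
```

Running checkAudioFile("test-output.mp3") from the scripts directory after npm test separates a missing file (the test did not write anything) from an empty one (the request likely failed partway).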
Notes
- node-edge-tts uses Microsoft Edge's online TTS service (updated, working authentication)
- No API key needed (free service)
- Output is MP3 format by default
- Requires internet connection
- Supports subtitle generation (JSON format with word-level timing)
- Temporary File Handling: By default, audio files are saved to the system's temporary directory (/tmp/edge-tts-temp/ on Unix, C:\Users\<user>\AppData\Local\Temp\edge-tts-temp\ on Windows) with unique filenames (e.g., tts_1234567890_abc123.mp3). Files are not automatically deleted; the calling application (Clawdbot) should handle cleanup after use. You can specify a custom output path with the --output option if permanent storage is needed.
- TTS keyword filtering: The skill automatically filters out TTS-related keywords (tts, TTS, text-to-speech) from text before conversion to avoid converting the trigger words themselves to audio
- For repeated preferences, use config-manager.js to set defaults
- Default voice: en-US-MichelleNeural (female, natural)
- Neural voices (ending in Neural) provide higher quality than Standard voices
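Because temporary files are not deleted automatically, a caller may want its own naming and cleanup routine. The following is a minimal sketch that assumes the tts_<timestamp>_<suffix>.mp3 naming shown in the notes above. makeTempAudioPath and removeStaleAudio are hypothetical helpers, and the random suffix scheme is a guess rather than the skill's actual one.

```javascript
const { randomBytes } = require("node:crypto");
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Directory name taken from the notes above.
const TEMP_DIR = path.join(os.tmpdir(), "edge-tts-temp");

// Build a unique path such as <dir>/tts_1234567890_a1b2c3.mp3.
function makeTempAudioPath(dir = TEMP_DIR, now = Date.now()) {
  fs.mkdirSync(dir, { recursive: true });
  const suffix = randomBytes(3).toString("hex"); // 6 hex characters
  return path.join(dir, `tts_${now}_${suffix}.mp3`);
}

// Delete generated files older than maxAgeMs; returns how many were removed.
function removeStaleAudio(maxAgeMs, dir = TEMP_DIR, now = Date.now()) {
  if (!fs.existsSync(dir)) return 0;
  let removed = 0;
  for (const name of fs.readdirSync(dir)) {
    if (!/^tts_\d+_[0-9a-z]+\.mp3$/.test(name)) continue; // leave unrelated files alone
    const file = path.join(dir, name);
    if (now - fs.statSync(file).mtimeMs > maxAgeMs) {
      fs.unlinkSync(file);
      removed += 1;
    }
  }
  return removed;
}
```

A caller could run removeStaleAudio(60 * 60 * 1000) periodically to clear files older than an hour, once the audio has been delivered.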