Kokoro Speech

hammeiam

Converts text to natural-sounding speech using the Kokoro TTS model with customizable voice selection and playback speed.

The server also provides error handling and temporary file management.

What it does

  • Convert text to speech with multiple voice options
  • Adjust speech speed from 0.5x to 2.0x
  • List all available TTS voices
  • Check TTS model initialization status

Best for

  • Adding voice output to applications
  • Creating audio content from text
  • Accessibility features for reading text aloud

  • No API key required
  • High-quality Kokoro TTS model
  • Multiple voice options

About Kokoro Speech

Kokoro Speech is a community-built MCP server published by hammeiam that provides AI assistants with text-to-speech tools via the Model Context Protocol. It offers natural-sounding Kokoro TTS with customizable voices and playback speed. It is categorized under AI/ML and developer tools.

How to install

You can install Kokoro Speech in any MCP-compatible client, such as Cursor, Claude Desktop, or VS Code. The server runs locally on your machine via the stdio transport.

License

Kokoro Speech is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

Speech MCP Server

A Model Context Protocol server that provides text-to-speech capabilities using the Kokoro TTS model.

Configuration

The server can be configured using the following environment variables:

Variable                   Description                                   Default    Valid Range
MCP_DEFAULT_SPEECH_SPEED   Default speed multiplier for text-to-speech   1.1        0.5 to 2.0
MCP_DEFAULT_VOICE          Default voice for text-to-speech              af_bella   Any valid voice ID
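
As a rough illustration of the defaults and valid range listed above, the following sketch resolves both variables from an environment-like object. The names resolveSpeechConfig and SpeechConfig are hypothetical and not part of speech-mcp-server. Falling back to the default on an invalid value is an assumption, since the source does not say how the server treats out-of-range input.

```typescript
// Sketch only: resolveSpeechConfig and SpeechConfig are hypothetical names,
// not part of speech-mcp-server.
interface SpeechConfig {
  speed: number;
  voice: string;
}

const DEFAULT_SPEED = 1.1; // documented default
const DEFAULT_VOICE = "af_bella"; // documented default
const MIN_SPEED = 0.5;
const MAX_SPEED = 2.0;

function resolveSpeechConfig(env: Record<string, string | undefined>): SpeechConfig {
  const rawSpeed = env["MCP_DEFAULT_SPEECH_SPEED"];
  const parsed = rawSpeed === undefined ? NaN : Number(rawSpeed);

  // Assumption: non-numeric or out-of-range values fall back to the default.
  const speed =
    isFinite(parsed) && parsed >= MIN_SPEED && parsed <= MAX_SPEED
      ? parsed
      : DEFAULT_SPEED;

  // Assumption: an empty or missing voice ID falls back to the default voice.
  const rawVoice = env["MCP_DEFAULT_VOICE"];
  const voice =
    rawVoice !== undefined && rawVoice.trim() !== "" ? rawVoice.trim() : DEFAULT_VOICE;

  return { speed, voice };
}
```

Taking a plain object rather than reading the process environment directly keeps the function easy to test.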

In Cursor:

{
  "mcpServers": {
    "speech": {
      "command": "npx",
      "args": [
        "-y",
        "speech-mcp-server"
      ],
      "env": {
        "MCP_DEFAULT_SPEECH_SPEED": "1.3",
        "MCP_DEFAULT_VOICE": "af_bella"
      }
    }
  }
}

Features

  • 🎯 High-quality text-to-speech using Kokoro TTS model
  • 🗣️ Multiple voice options available
  • 🎛️ Customizable speech parameters (voice, speed)
  • 🔌 MCP-compliant interface
  • 📦 Easy installation and setup
  • 🚀 No API key required

Installation

# Using npm
npm install speech-mcp-server

# Using pnpm (recommended)
pnpm add speech-mcp-server

# Using yarn
yarn add speech-mcp-server

Usage

Run the server:

# Using default configuration
npm start

# With custom configuration
MCP_DEFAULT_SPEECH_SPEED=1.5 MCP_DEFAULT_VOICE=af_bella npm start

The server provides the following MCP tools:

  • text_to_speech: Basic text-to-speech conversion
  • text_to_speech_with_options: Text-to-speech with customizable speed
  • list_voices: List all available voices
  • get_model_status: Check the initialization status of the TTS model

Development

# Clone the repository
git clone <your-repo-url>
cd speech-mcp-server

# Install dependencies
pnpm install

# Start development server with auto-reload
pnpm dev

# Build the project
pnpm build

# Run linting
pnpm lint

# Format code
pnpm format

# Test with MCP Inspector
pnpm inspector

Available Tools

1. text_to_speech

Converts text to speech using the default settings. The voice argument is optional.

{
  "type": "request",
  "id": "1",
  "method": "call_tool",
  "params": {
    "name": "text_to_speech",
    "arguments": {
      "text": "Hello world",
      "voice": "af_bella"
    }
  }
}

2. text_to_speech_with_options

Converts text to speech with customizable parameters. The voice and speed arguments are optional; speed must be between 0.5 and 2.0.

{
  "type": "request",
  "id": "1",
  "method": "call_tool",
  "params": {
    "name": "text_to_speech_with_options",
    "arguments": {
      "text": "Hello world",
      "voice": "af_bella",
      "speed": 1.0
    }
  }
}
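
A client could check the arguments before sending a request like the one above. The sketch below mirrors the request envelope shown in this document. buildSpeechRequest and its types are hypothetical helpers, and rejecting empty text is an assumption rather than documented server behavior.

```typescript
// Sketch only: buildSpeechRequest, SpeechArguments and ToolRequest are
// hypothetical names, not part of speech-mcp-server.
interface SpeechArguments {
  text: string;
  voice?: string;
  speed?: number;
}

interface ToolRequest {
  type: "request";
  id: string;
  method: "call_tool";
  params: { name: string; arguments: SpeechArguments };
}

function buildSpeechRequest(
  id: string,
  text: string,
  voice?: string,
  speed?: number
): ToolRequest {
  // Assumption: empty text is treated as a client-side error.
  if (text.trim() === "") {
    throw new Error("text must not be empty");
  }
  // The documented valid range for speed is 0.5 to 2.0.
  if (speed !== undefined && (!isFinite(speed) || speed < 0.5 || speed > 2.0)) {
    throw new RangeError("speed must be between 0.5 and 2.0");
  }

  // Only include optional arguments that were actually provided.
  const args: SpeechArguments = { text: text };
  if (voice !== undefined) args.voice = voice;
  if (speed !== undefined) args.speed = speed;

  return {
    type: "request",
    id: id,
    method: "call_tool",
    params: { name: "text_to_speech_with_options", arguments: args },
  };
}
```

Calling buildSpeechRequest("1", "Hello world", "af_bella", 1.0) and serializing the result with JSON.stringify yields a message with the same fields as the example request.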

3. list_voices

Lists all available voices for text-to-speech.

{
  "type": "request",
  "id": "1",
  "method": "call_tool",
  "params": {
    "name": "list_voices",
    "arguments": {}
  }
}

4. get_model_status

Check the current status of the TTS model initialization. This is particularly useful when first starting the server, as the model needs to be downloaded and initialized.

{
  "type": "request",
  "id": "1",
  "method": "call_tool",
  "params": {
    "name": "get_model_status",
    "arguments": {}
  }
}

Response example:

{
  "content": [{
    "type": "text",
    "text": "Model status: initializing (5s elapsed)"
  }]
}

Possible status values:

  • uninitialized: Model initialization hasn't started
  • initializing: Model is being downloaded and initialized
  • ready: Model is ready to use
  • error: An error occurred during initialization
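
Because the status is returned as free text, a client may want to turn it into a structured value. The sketch below assumes the text always follows the "Model status: <status> (<n>s elapsed)" shape of the response example, with the parenthesized part optional. The exact wording for the other states is not documented, and parseModelStatus is a hypothetical helper.

```typescript
// Sketch only: parseModelStatus and its types are hypothetical names,
// not part of speech-mcp-server.
type ModelStatus = "uninitialized" | "initializing" | "ready" | "error";

interface ParsedStatus {
  status: ModelStatus;
  elapsedSeconds: number | null;
}

const KNOWN_STATUSES: ModelStatus[] = ["uninitialized", "initializing", "ready", "error"];

function parseModelStatus(text: string): ParsedStatus | null {
  // Assumed format: "Model status: <status>" optionally followed by "(<n>s elapsed...)".
  const match = /^Model status:\s*(\w+)(?:\s*\((\d+)s elapsed[^)]*\))?/.exec(text);
  if (match === null) return null;

  // Accept only the four documented status values.
  const statusRaw = match[1];
  const found = KNOWN_STATUSES.filter((s) => s === statusRaw)[0];
  if (found === undefined) return null;

  const elapsedRaw = match[2];
  const elapsedSeconds = elapsedRaw !== undefined ? parseInt(elapsedRaw, 10) : null;
  return { status: found, elapsedSeconds: elapsedSeconds };
}
```

Returning null for unrecognized text lets the caller fall back to showing the raw message instead of guessing.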

Testing

You can test the server using the MCP Inspector or by sending raw JSON messages:

# List available tools
echo '{"type":"request","id":"1","method":"list_tools","params":{}}' | node dist/index.js

# List available voices
echo '{"type":"request","id":"2","method":"call_tool","params":{"name":"list_voices","arguments":{}}}' | node dist/index.js

# Convert text to speech
echo '{"type":"request","id":"3","method":"call_tool","params":{"name":"text_to_speech","arguments":{"text":"Hello world","voice":"af_bella"}}}' | node dist/index.js

Integration with Claude Desktop

To use this server with Claude Desktop, add the following to your Claude Desktop config file (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "speech": {
      "command": "npx",
      "args": ["-y", "speech-mcp-server"]
    }
  }
}

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see the LICENSE file for details.

Troubleshooting

Model Initialization Issues

The server automatically attempts to download and initialize the TTS model on startup. If you encounter initialization errors:

  1. The server will automatically retry up to 3 times with a cleanup between attempts
  2. Use the get_model_status tool to monitor initialization progress and any errors
  3. If initialization fails after all retries, try manually removing the model files:
# Remove model files (macOS/Linux)
rm -rf ~/.npm/_npx/**/node_modules/@huggingface/transformers/.cache/onnx-community/Kokoro-82M-v1.0-ONNX/onnx/model_quantized.onnx
rm -rf ~/.cache/huggingface/transformers/onnx-community/Kokoro-82M-v1.0-ONNX/onnx/model_quantized.onnx

# Then restart the server
npm start

While retries are in progress, the get_model_status response includes retry information:

{
  "content": [{
    "type": "text",
    "text": "Model status: initializing (5s elapsed, retry 1/3)"
  }]
}
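
A client that wants to know whether the server is on its last attempt could extract the retry counter from that text. This sketch assumes the "retry <attempt>/<max>" fragment appears exactly as in the example above. parseRetryInfo and isFinalRetry are hypothetical helpers.

```typescript
// Sketch only: parseRetryInfo, isFinalRetry and RetryInfo are hypothetical
// names, not part of speech-mcp-server.
interface RetryInfo {
  attempt: number;
  maxAttempts: number;
}

function parseRetryInfo(text: string): RetryInfo | null {
  // Assumed fragment: "retry 1/3" somewhere in the status text.
  const match = /retry (\d+)\/(\d+)/.exec(text);
  if (match === null) return null;

  const attemptRaw = match[1];
  const maxRaw = match[2];
  if (attemptRaw === undefined || maxRaw === undefined) return null;

  return { attempt: parseInt(attemptRaw, 10), maxAttempts: parseInt(maxRaw, 10) };
}

// True once the reported attempt has reached the maximum number of retries.
function isFinalRetry(info: RetryInfo): boolean {
  return info.attempt >= info.maxAttempts;
}
```

If isFinalRetry returns true and the status later becomes error, removing the model files manually as described above is the next step.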
