Rime Text-to-Speech

Name: Rime Text-to-Speech
Rating: 4.9 (32 reviews)
Author: matthewdailey

Converts text to speech using Rime's API and plays audio through your system's audio player with minimal latency.

Text-to-speech server that converts text into spoken audio through Rime's API, streaming with optimized buffering for minimal latency between generation and playback.

What it does

Convert text to speech using Rime API
Play generated audio through system audio
Customize voice selection from available options
Configure when and how speech is triggered

Best for

Coding agents providing audio announcements
Accessibility applications requiring speech output
Interactive assistants with voice responses

Optimized buffering for low latency
Cross-platform audio support
Configurable speech behavior

About Rime Text-to-Speech

Rime Text-to-Speech is a community-built MCP server published by matthewdailey that provides AI assistants with tools and capabilities via the Model Context Protocol. It converts text to speech using Rime's API, with fast, streaming AI voice generation and minimal latency. It is categorized under other.

How to install

You can install Rime Text-to-Speech in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

Rime Text-to-Speech is released under the Unlicense, a permissive open-source license, meaning you can freely use, modify, and distribute the software.

Rime MCP

A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Rime API. This server downloads audio and plays it using the system's native audio player.

Features

Exposes a speak tool that converts text to speech and plays it through system audio
Uses Rime's high-quality voice synthesis API

Requirements

Node.js 16.x or higher
A working audio output device
macOS: Uses afplay

The following platforms rely on sample code from Claude that has not been tested 🤙✨

Windows: Built-in Media.SoundPlayer (PowerShell)
Linux: mpg123, mplayer, aplay, or ffplay
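
The platform list above can be sketched as a small lookup. This is a minimal illustration, not the server's actual implementation: the function name playerCommandFor, the PlayerCommand type, the default Linux player, and the exact command-line flags are assumptions made for this example.

```typescript
// Hypothetical helper (not taken from the rime-mcp source): maps a Node.js
// platform string to a command that plays an audio file.
type PlayerCommand = { command: string; args: string[] };

function playerCommandFor(
  platform: string,
  filePath: string,
  linuxPlayer: string = "mpg123" // assumed default; any of the listed Linux players may be passed
): PlayerCommand {
  switch (platform) {
    case "darwin":
      // macOS ships with afplay.
      return { command: "afplay", args: [filePath] };
    case "win32": {
      // Escape single quotes for a PowerShell single-quoted string.
      const quoted = filePath.replace(/'/g, "''");
      return {
        command: "powershell",
        args: ["-NoProfile", "-Command", `(New-Object Media.SoundPlayer '${quoted}').PlaySync()`],
      };
    }
    case "linux":
      // ffplay needs flags to hide its window and exit when playback ends.
      return linuxPlayer === "ffplay"
        ? { command: "ffplay", args: ["-nodisp", "-autoexit", filePath] }
        : { command: linuxPlayer, args: [filePath] };
    default:
      throw new Error(`No known audio player for platform: ${platform}`);
  }
}
```

In a Node.js process, the first argument would typically be process.platform, and the returned command and args could be handed to child_process.spawn.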

MCP Configuration

"ref": {
  "command": "npx",
  "args": ["rime-mcp"],
  "env": {
    "RIME_API_KEY": "your_api_key_here",
    "RIME_GUIDANCE": "<guide how the agent speaks>",
    "RIME_WHO_TO_ADDRESS": "<your name>",
    "RIME_WHEN_TO_SPEAK": "<tell the agent when to speak>",
    "RIME_VOICE": "cove"
  }
}

Every env entry other than RIME_API_KEY is optional configuration.

All of the optional env vars are part of the tool definition and are prompts to

All voice options are listed here.

You can get your API key from the Rime Dashboard.

The following environment variables can be used to customize the behavior:

RIME_GUIDANCE: The main description of when and how to use the speak tool
RIME_WHO_TO_ADDRESS: Who the speech should address (default: "user")
RIME_WHEN_TO_SPEAK: When the tool should be used (default: "when asked to speak or when finishing a command")
RIME_VOICE: The default voice to use (default: "cove")
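
As a rough sketch of how these variables and their documented defaults might be resolved, the snippet below reads them from an environment-like object. The names RimeConfig and loadRimeConfig are hypothetical and not taken from the server's source, and treating a missing RIME_API_KEY as an error is an assumption. RIME_GUIDANCE has no documented default, so it is left undefined when unset.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface RimeConfig {
  apiKey: string;
  guidance?: string; // no default is documented for RIME_GUIDANCE
  whoToAddress: string;
  whenToSpeak: string;
  voice: string;
}

function loadRimeConfig(env: Record<string, string | undefined>): RimeConfig {
  const apiKey = env["RIME_API_KEY"];
  if (!apiKey) {
    // Assumption: the API key is the only required variable.
    throw new Error("RIME_API_KEY is required");
  }
  return {
    apiKey,
    guidance: env["RIME_GUIDANCE"],
    // Defaults below are the ones listed in the documentation above.
    whoToAddress: env["RIME_WHO_TO_ADDRESS"] ?? "user",
    whenToSpeak:
      env["RIME_WHEN_TO_SPEAK"] ?? "when asked to speak or when finishing a command",
    voice: env["RIME_VOICE"] ?? "cove",
  };
}
```

In a Node.js process this would be called as loadRimeConfig(process.env), so values set in the MCP configuration's env block override the defaults.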

Example use cases

Example 1: Coding agent announcements

"RIME_WHEN_TO_SPEAK": "Always conclude your answers by speaking.",
"RIME_GUIDANCE": "Give a brief overview of the answer. If any files were edited, list them."

Example 2: Learn how the kids talk these days

"RIME_GUIDANCE": "Use phrases and slang common among Gen Alpha.",
"RIME_WHO_TO_ADDRESS": "Matt",
"RIME_WHEN_TO_SPEAK": "when asked to speak"

Example 3: Different languages based on context

"RIME_VOICE": "use 'cove' when talking about TypeScript and 'antoine' when talking about Python"

Development

Install dependencies:

npm install

Build the server:

npm run build

Run in development mode with hot reload:

npm run dev

License

MIT

Installing via Smithery

To install Rime Text-to-Speech Server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @MatthewDailey/rime-mcp --client claude
