acestep
Use ACE-Step API to generate music, edit songs, and remix music. Supports text-to-music, lyrics generation, audio continuation, and audio repainting. Use this skill when users mention generating music, creating songs, music production, remix, or audio continuation.
Install
mkdir -p .claude/skills/acestep && curl -L -o skill.zip "https://mcp.directory/api/skills/download/4573" && unzip -o skill.zip -d .claude/skills/acestep && rm skill.zip
Installs to .claude/skills/acestep
About this skill
ACE-Step Music Generation Skill
Use the ACE-Step V1.5 API for music generation. Always go through the scripts/acestep.sh script; do NOT call API endpoints directly.
Quick Start
# 1. cd to this skill's directory
cd {project_root}/{.claude or .codex}/skills/acestep/
# 2. Check API service health
./scripts/acestep.sh health
# 3. Generate with lyrics (recommended)
./scripts/acestep.sh generate -c "pop, female vocal, piano" -l "[Verse] Your lyrics here..." --duration 120 --language zh
# 4. Output saved to: {project_root}/acestep_output/
Workflow
For user requests requiring vocals:
- Use the acestep-songwriting skill for lyrics writing, caption creation, duration/BPM/key selection
- Write complete, well-structured lyrics yourself based on the songwriting guide
- Generate using Caption mode with the -c and -l parameters
Only use Simple/Random mode (-d or random) for quick inspiration or instrumental exploration.
If the user needs a simple music video, use the acestep-simplemv skill to render one with waveform visualization and synced lyrics.
MV Production Requirements: Making a simple MV requires three additional skills to be installed:
- acestep-songwriting — for writing lyrics and planning song structure
- acestep-lyrics-transcription — for transcribing audio to timestamped lyrics (LRC)
- acestep-simplemv — for rendering the final music video
- acestep-thumbnail (optional) — for generating cover art / MV background images via Gemini API
MV Background Image: When the user requests MV production, ask whether they want a background image for the video:
- Generate via Gemini — use the acestep-thumbnail skill (requires Gemini API key configuration)
- Provide an existing image — user supplies a local image path
- Skip — use the default animated gradient background (no image needed)
Use AskUserQuestion to let the user choose before proceeding with MV rendering.
Parallel Processing: Lyrics transcription and thumbnail generation are independent tasks. When the user chooses to generate a background image, run acestep-lyrics-transcription and acestep-thumbnail in parallel (e.g. via two concurrent Agent calls) to save time, then use both outputs for the final MV render.
Script Commands
CRITICAL - Complete Lyrics Input: When providing lyrics via the -l parameter, you MUST pass ALL lyrics content WITHOUT any omission:
- If user provides lyrics, pass the ENTIRE text they give you
- If you generate lyrics yourself, pass the COMPLETE lyrics you created
- NEVER truncate, shorten, or pass only partial lyrics
- Missing lyrics will result in incomplete or incoherent songs
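One way to reduce the risk of accidental truncation is to keep the lyrics in a file and pass the whole file through a quoted variable. The following is a minimal sketch under that assumption; the temporary file and the sample lyrics are hypothetical, and the final generate call is shown only as a comment.

```shell
#!/usr/bin/env bash
# Sketch: keep the full lyrics in a file so nothing is lost when building the -l argument.
# The file and the sample lyrics are hypothetical.
lyrics_file="$(mktemp)"
cat > "$lyrics_file" <<'EOF'
[Verse]
First line of the verse
Second line of the verse

[Chorus]
Full chorus here
EOF

# Command substitution keeps internal newlines and only drops trailing ones.
lyrics="$(cat "$lyrics_file")"

# Sanity check before generating: the variable should have as many lines as the file.
file_lines="$(wc -l < "$lyrics_file" | tr -d ' ')"
var_lines="$(printf '%s\n' "$lyrics" | wc -l | tr -d ' ')"
echo "file lines: $file_lines, passed lines: $var_lines"

# The actual call (not executed here) quotes the variable so it stays a single argument:
# ./scripts/acestep.sh generate -c "pop, female vocal, piano" -l "$lyrics" --duration 120
rm -f "$lyrics_file"
```

Quoting "$lyrics" matters: without the quotes the shell would split the text into separate words and only the first one would reach -l.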
Music Parameters: Use the acestep-songwriting skill for guidance on duration, BPM, key scale, and time signature.
# need to cd to this skill's directory first
cd {project_root}/{.claude or .codex}/skills/acestep/
# Caption mode - RECOMMENDED: Write lyrics first, then generate
./scripts/acestep.sh generate -c "Electronic pop, energetic synths" -l "[Verse] Your complete lyrics
[Chorus] Full chorus here..." --duration 120 --bpm 128
# Instrumental only
./scripts/acestep.sh generate "Jazz with saxophone"
# Quick exploration (Simple/Random mode)
./scripts/acestep.sh generate -d "A cheerful song about spring"
./scripts/acestep.sh random
# Cover / Repainting from source audio
./scripts/acestep.sh cover song.mp3 -c "Rock cover style" -l "[Verse] Lyrics..." --duration 120 --bpm 128
./scripts/acestep.sh generate --src-audio song.mp3 --task-type repaint -c "Pop" --repaint-start 30 --repaint-end 60
# Music attribute options
./scripts/acestep.sh generate "Rock" --duration 60 --bpm 120 --key-scale "C major" --time-sig "4/4"
./scripts/acestep.sh generate "Rock" --duration 60 --batch 2
./scripts/acestep.sh generate "EDM" --no-thinking # Faster
# Other commands
./scripts/acestep.sh status <job_id>
./scripts/acestep.sh health
./scripts/acestep.sh models
Cover / Audio Repainting
The cover command generates music based on a source audio file. The audio is base64-encoded and sent to the API.
# Cover: regenerate with new style/lyrics, preserving melody structure
./scripts/acestep.sh cover input.mp3 -c "Jazz cover" -l "[Verse] New lyrics..." --duration 120
# Repainting: modify a specific region of the audio
./scripts/acestep.sh generate --src-audio input.mp3 --task-type repaint -c "Pop ballad" --repaint-start 30 --repaint-end 90
# Cover options
# --src-audio       Source audio file path
# --task-type       cover (default with --src-audio), repaint, text2music
# --cover-strength  0.0-1.0 (default: 1.0, higher = closer to source)
# --repaint-start   Repainting start position (seconds)
# --repaint-end     Repainting end position (seconds)
# --key-scale       Musical key (e.g. "E minor")
# --time-signature  Time signature (e.g. "4/4")
Note: For cloud API usage, large audio files may be rejected by Cloudflare. Compress audio before uploading if needed (e.g. using ffmpeg: ffmpeg -i input.mp3 -b:a 64k -ar 24000 -ac 1 compressed.mp3).
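The compression step can be made conditional on file size. The sketch below is only an illustration: MAX_BYTES is an assumed threshold (the source does not state the actual limit), and needs_compression and prepare_source are hypothetical helper names, not part of acestep.sh.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a source file should be compressed before upload.
# MAX_BYTES is an assumed threshold, not a documented limit of the cloud API.
MAX_BYTES=$((10 * 1024 * 1024))

needs_compression() {
  # $1 = file path, $2 = size limit in bytes
  local size
  size="$(wc -c < "$1" | tr -d ' ')"
  [ "$size" -gt "$2" ]
}

prepare_source() {
  # Prints the path that should be used as the source audio.
  local src="$1"
  local out="${src%.*}_compressed.mp3"
  if needs_compression "$src" "$MAX_BYTES"; then
    # Same ffmpeg settings as in the note above.
    ffmpeg -y -loglevel error -i "$src" -b:a 64k -ar 24000 -ac 1 "$out" && echo "$out"
  else
    echo "$src"
  fi
}
```

A call could then look like ./scripts/acestep.sh cover "$(prepare_source input.mp3)" -c "Jazz cover", so small files are sent unchanged and only large ones are re-encoded.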
Output Files
After generation, the script automatically saves results to the acestep_output folder in the project root (same level as .claude):
project_root/
├── .claude/
│   └── skills/acestep/...
├── acestep_output/            # Output directory
│   ├── <job_id>.json          # Complete task result (JSON)
│   ├── <job_id>_1.mp3         # First audio file
│   ├── <job_id>_2.mp3         # Second audio file (if batch_size > 1)
│   └── ...
└── ...
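Given the naming scheme in the tree above, the audio files for one job can be collected with a glob. This is a small sketch; list_job_audio is a hypothetical helper, and it assumes the numbered-suffix naming shown above holds for every audio format.

```shell
#!/usr/bin/env bash
# Sketch: list the audio files that belong to one job, skipping the <job_id>.json result.
list_job_audio() {
  # $1 = output directory, $2 = job id
  local dir="$1" job="$2" f
  for f in "$dir/$job"_[0-9]*; do
    # When nothing matches, the pattern stays literal, so check that the file exists.
    if [ -e "$f" ]; then
      echo "$f"
    fi
  done
}

# Example (job id is a placeholder):
# list_job_audio ./acestep_output "<job_id>"
```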
JSON Result Structure
Important: When LM enhancement is enabled (use_format=true), the final synthesized content may differ from your input. Check the JSON file for actual values:
| Field | Description |
|---|---|
| prompt | Actual caption used for synthesis (may be LM-enhanced) |
| lyrics | Actual lyrics used for synthesis (may be LM-enhanced) |
| metas.prompt | Original input caption |
| metas.lyrics | Original input lyrics |
| metas.bpm | BPM used |
| metas.keyscale | Key scale used |
| metas.duration | Duration in seconds |
| generation_info | Detailed timing and model info |
| seed_value | Seeds used (for reproducibility) |
| lm_model | LM model name |
| dit_model | DiT model name |
To get the actual synthesized lyrics, parse the JSON and read the top-level lyrics field, not metas.lyrics.
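Since jq is already a dependency of the script, the same tool can read these fields. The sketch below assumes the result file has the top-level lyrics and metas.lyrics fields listed in the table; used_lyrics and lyrics_changed are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: read the lyrics that were actually synthesized and detect LM changes.

used_lyrics() {
  # $1 = path to a result file such as acestep_output/<job_id>.json
  # -r prints the raw string instead of a JSON-quoted one.
  jq -r '.lyrics' "$1"
}

lyrics_changed() {
  # Succeeds (exit status 0) when the synthesized lyrics differ from the input lyrics.
  # jq -e derives the exit status from the last output value (true -> 0, false/null -> 1).
  jq -e '.lyrics != .metas.lyrics' "$1" > /dev/null
}

# Example (job id is a placeholder):
# if lyrics_changed "acestep_output/<job_id>.json"; then
#   used_lyrics "acestep_output/<job_id>.json"
# fi
```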
Configuration
Important: Configuration follows this priority (high to low):
- Command line arguments > config.json defaults
- User-specified parameters temporarily override defaults but do not modify config.json
- Only the config --set command permanently modifies config.json
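The precedence rule can be pictured as a simple fallback. The sketch below is an illustration only and does not reproduce how acestep.sh is implemented; resolve_option and the two variables are hypothetical stand-ins.

```shell
#!/usr/bin/env bash
# Sketch of the precedence rule: a command-line value wins, otherwise the config default applies.
resolve_option() {
  # $1 = value from the command line (may be empty), $2 = default from config.json
  if [ -n "$1" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}

config_default="mp3"   # stands in for a default stored in config.json
cli_value="wav"        # stands in for a value supplied on the command line

resolve_option "$cli_value" "$config_default"   # prints wav
resolve_option "" "$config_default"             # prints mp3
echo "$config_default"                          # prints mp3: the stored default is untouched
```

The last line reflects the second rule above: an override applies to a single invocation and leaves the stored default as it was.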
Default Config File (scripts/config.json)
{
  "api_url": "http://127.0.0.1:8001",
  "api_key": "",
  "api_mode": "completion",
  "generation": {
    "thinking": true,
    "use_format": false,
    "use_cot_caption": true,
    "use_cot_language": false,
    "batch_size": 1,
    "audio_format": "mp3",
    "vocal_language": "en"
  }
}
| Option | Default | Description |
|---|---|---|
| api_url | http://127.0.0.1:8001 | API server address |
| api_key | "" | API authentication key (optional) |
| api_mode | completion | API mode: completion (OpenRouter, default) or native (polling) |
| generation.thinking | true | Enable 5Hz LM (higher quality, slower) |
| generation.audio_format | mp3 | Output format (mp3/wav/flac) |
| generation.vocal_language | en | Vocal language |
Prerequisites - ACE-Step API Service
IMPORTANT: This skill requires the ACE-Step API server to be running.
Required Dependencies
The scripts/acestep.sh script requires: curl and jq.
# Check dependencies
curl --version
jq --version
If jq is not installed, the script will attempt to install it automatically. If automatic installation fails:
- Windows: choco install jq, or download from https://jqlang.github.io/jq/download/
- macOS: brew install jq
- Linux: sudo apt-get install jq (Debian/Ubuntu) or sudo dnf install jq (Fedora)
Before First Use
You MUST check the API key and URL status before proceeding. Run:
cd "{project_root}/{.claude or .codex}/skills/acestep/" && bash ./scripts/acestep.sh config --check-key
cd "{project_root}/{.claude or .codex}/skills/acestep/" && bash ./scripts/acestep.sh config --get api_url
Case 1: Using Official Cloud API (https://api.acemusic.ai) without API key
If api_url is https://api.acemusic.ai and api_key is empty, you MUST stop and guide the user to configure their key:
- Tell the user: "You're using the ACE-Step official cloud API, but no API key is configured. An API key is required to use this service."
- Explain how to get a key: API keys are currently available through acemusic.ai for free.
- Use AskUserQuestion to ask the user to provide their API key.
- Once provided, configure it:
  cd "{project_root}/{.claude or .codex}/skills/acestep/" && bash ./scripts/acestep.sh config --set api_key <KEY>
- Additionally, inform the user: "If you also want to render music videos (MV), it's recommended to configure a lyrics transcription API key as well (OpenAI Whisper or ElevenLabs Scribe), so that lyrics can be automatically transcribed with accurate timestamps. You can configure it later via the acestep-lyrics-transcription skill."
Case 2: API key is configured
Verify the API endpoint with ./scripts/acestep.sh health, then proceed with music generation.
Case 3: Using local/custom API without key
Local services (http://127.0.0.1:*) typically don't require a key. Verify with ./scripts/acestep.sh health and proceed.
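The three cases above can be summarized as a small decision function. This is a sketch, not part of the skill: classify_setup is a hypothetical helper that takes the URL and key as plain arguments, because the output format of config --check-key is not shown in the source.

```shell
#!/usr/bin/env bash
# Sketch: map the configured URL and key to one of the three cases described above.
classify_setup() {
  # $1 = api_url, $2 = api_key (may be empty)
  local url="$1" key="$2"
  if [ -n "$key" ]; then
    echo "case2: key configured, run health check"
  elif [ "$url" = "https://api.acemusic.ai" ]; then
    echo "case1: cloud API without key, stop and ask for a key"
  else
    echo "case3: local or custom API without key, run health check"
  fi
}

classify_setup "https://api.acemusic.ai" ""   # prints the case1 line
classify_setup "http://127.0.0.1:8001" ""     # prints the case3 line
```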
If health check fails:
- Ask: "Do you have ACE-Step installed?"
- If installed but not running: Use the acestep-docs skill to help