Magenta RealTime 2: Open Live Music Model Guide
Magenta RealTime 2 is Google Magenta's open-weights live music model for MacBook musicians, audio developers, and people building interactive AI instruments. This guide turns the launch posts, official MRT2 app page, technical blog, GitHub repo, Reddit notes, and early issues into a practical map.

One-sentence definition
Magenta RealTime 2 is an open-weights live music model that generates 48kHz stereo audio while responding to MIDI, text, and audio controls fast enough to be played like an instrument.
Answer first
Use MRT2 when you want an AI sound source inside a live music workflow: MIDI keyboard, DAW track, standalone jam app, creative controller, or custom C++/Python experiment. Keep conventional song generators in the stack when the task is a finished track from one prompt.
Why MRT2 matters
Most AI music demos still feel like batch jobs. You type a prompt, wait, and judge the rendered clip. MRT2 aims at a different interface: press keys, move a control, blend a prompt, and hear the model react while the session is still alive.
Google Magenta's technical post says the first Magenta RealTime model worked in chunks, which made control feel delayed. MRT2 moves to frame-level streaming, runs through an MLX-backed C++ engine on Apple Silicon, and ships with apps and Audio Unit plugins instead of only notebooks.
| Spec | Value | Note |
|---|---|---|
| Model | 230M / 2.4B | Small and base model sizes documented by Google and GitHub. |
| Frame size | 40ms | Google says MRT2 moved from 2-second chunks to frame-level operation. |
| Output | 48kHz | The apps require 48kHz stereo audio settings for playback. |
Launch posts
The user-facing message is consistent across Google Gemma and Google Magenta: MRT2 is not only an open audio model. It is a playable instrument surface for live control. The official Magenta thread is more detailed; the Gemma thread is the broader ecosystem announcement.
Official launch post
Google Magenta introduces MRT2 as a live music model with open weights, an open source inference engine, apps, plugins, MIDI control, and low-latency MacBook playback.
Google Magenta Project (@GoogleMagenta), June 4, 2026
Official launch post
Google Gemma frames MRT2 as an open model musicians can play as an instrument using MIDI, text, and audio on a MacBook.
Google Gemma (@googlegemma), June 4, 2026
Official launch post
Google Gemma points readers to the MRT2 app and plugin download page.
Google Gemma (@googlegemma), June 4, 2026
Official launch post
Google Gemma sends musicians and developers to the Google Magenta project for more experiments.
Google Gemma (@googlegemma), June 4, 2026
Mental model: five moving parts
MRT2 is easiest to understand as a stack. The public demos sit at the top, but the useful developer surface is the line between model, inference engine, and controls.
```
MIDI notes / text prompt / audio prompt
        |
        v
MusicCoCa style embedding + note/drum conditioning
        |
        v
Depthformer generates SpectroStream audio tokens
        |
        v
MLX-backed C++ streaming engine on Apple Silicon
        |
        v
Standalone app / AU plugin / custom instrument
```
MusicCoCa maps text and audio prompts into style embeddings. SpectroStream is the codec that turns audio into tokens and back. Depthformer is the transformer that generates the token stream. The C++ engine is what makes the real-time path practical on MacBook GPUs.
Smallest local run
For musicians, the shortest path is the Mac app bundle from the MRT2 apps page. For developers, the shortest reproducible path is the Python package and CLI. The docs use `uv`, Python 3.12, and the `magenta-rt` package.
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv --python 3.12
source .venv/bin/activate

# Apple Silicon live path
uv pip install "magenta-rt[mlx]"

# Shared resources, then one streaming model
mrt models init
mrt models download

# Generate a four-second check clip
mrt mlx generate --prompt "disco funk" --duration 4.0 --model=mrt2_small
```
The GitHub installation docs say MLX is the live Apple Silicon path. JAX is still available for offline, batch, or research work, including Linux setups with the appropriate JAX wheel.
Apps and plugins
The MRT2 download page matters because it shows Google is not shipping only a model checkpoint. The bundle includes standalone apps, Audio Unit plugin support, and examples that demonstrate different interfaces for the same model.
MIDI steering
Hold a note or chord and the model generates accompaniment that follows the harmony.
Text-to-synth
Describe a playable instrument, such as a string ensemble or disco funk patch.
Audio cloning
Drop in a short audio reference and turn the sound into a playable source.
Prompt mixing
Move between text and audio prompts to explore hybrid styles.
Sound design
Combine musical prompts with noisy textures and modulate chaos over time.
Gesture control
Use MIDI, LFOs, camera gestures, Max, PureData, or SuperCollider style interfaces.
The Audio Unit plugin angle is the most practical part for DAW users. Google's install page tells users to register the plugin, refresh AU plugins in the DAW, and drag the plugin onto a MIDI track. That means the first serious tests will happen in normal music sessions, not only in research demos.
The controls are the product
MRT2 uses multiple control signals. Text and audio prompts steer style. MIDI notes steer pitch and harmony. Drum control can be used to suppress drums when the model needs to sit beside other tracks.
The technical post says note control is trained from audio and MIDI pairs, with MIDI labels inferred by MT3. It also describes two note modes: one where the model chooses attacks from held notes, and one where the user supplies onset timing. That is the difference between a model that follows a chord and a model that follows performance timing.
A useful test prompt
Try three passes with the same MIDI chord progression: `disco funk`, `string ensemble`, and a short audio reference. Then listen for three things: whether the harmony follows the keys, whether the style changes without collapsing, and whether the attack timing feels playable.
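The text-prompt passes can also be scripted from the command line for a quick offline comparison. The sketch below only builds the command lines, using the `mrt mlx generate` flags shown in the install section; the helper name `build_generate_command` is hypothetical. This guide does not show CLI flags for MIDI input or audio references, so those passes stay in the app or plugin.

```python
import shlex


def build_generate_command(prompt: str,
                           duration_s: float = 4.0,
                           model: str = "mrt2_small") -> list[str]:
    """Build an `mrt mlx generate` command using only the flags that
    appear in this guide. Hypothetical helper, not part of magenta-rt."""
    return [
        "mrt", "mlx", "generate",
        "--prompt", prompt,
        "--duration", str(duration_s),
        f"--model={model}",
    ]


# Print one shell-safe command line per text prompt.
# Pass each list to subprocess.run(cmd, check=True) to actually run it.
for prompt in ["disco funk", "string ensemble"]:
    print(shlex.join(build_generate_command(prompt)))
```

Keeping the model and duration fixed across prompts makes the style difference the only variable you are listening for.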
Latency: the real product test
Google's own post says MRT2 moved from the 2-second chunks of the first version to 40ms frames, with average control latency under 200ms once buffers and system overhead are counted. That is the central claim.
The engineering reason is frame-level autoregression. Instead of waiting for a large chunk boundary before new control information matters, MRT2 injects conditioning at each generation step. It also uses decoder-only streaming with sliding window attention so the model can continue over time without unbounded memory.
| Generation unit | What changes | Why it matters |
|---|---|---|
| 2-second chunks | Controls wait for the next chunk. | Feels like a delayed effect. |
| 40ms frames | Controls can affect the next frame. | Starts to feel playable. |
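The gap in the table can be put into rough numbers. This is a minimal sketch under a simplifying assumption that is ours, not Google's: a control change arrives at a uniformly random moment and only takes effect at the next unit boundary. It ignores buffers and system overhead, which the under-200ms figure includes.

```python
def control_wait_ms(unit_ms: float) -> tuple[float, float]:
    """Return (average, worst-case) wait in milliseconds before a control
    change can influence generation, assuming it applies only at the next
    unit boundary and arrives at a uniformly random time within a unit."""
    return unit_ms / 2.0, unit_ms


for label, unit_ms in [("2-second chunks", 2000.0), ("40ms frames", 40.0)]:
    average, worst = control_wait_ms(unit_ms)
    print(f"{label}: average {average:.0f} ms, worst case {worst:.0f} ms")
```

Under that assumption the boundary wait alone drops from about a second on average to about 20ms, which is why the rest of the latency budget (buffers, system overhead) becomes the part worth measuring.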
Hardware and model choice
Start with the small model. The official docs say `mrt2_small` runs in real time on any Apple Silicon Mac, including Air models. The base model is larger and higher quality, but it leaves much less headroom inside the 40ms frame budget.
| Path | Size | Hardware | Use |
|---|---|---|---|
| mrt2_small | 230M | Any Apple Silicon Mac | Best first install |
| mrt2_base | 2.4B | Pro/Max class Apple Silicon | Higher quality, higher risk |
| Offline JAX | Either | Linux/NVIDIA or CPU research path | Not the live plugin path |
The app page also says first launch may download model weights: roughly 450MB for small and 2.5GB for base. Treat those numbers as install-size planning, not runtime memory guarantees.
Architecture notes for developers
MRT2 is a codec language model. A codec language model generates discrete audio tokens, then decodes those tokens back to audio. Google's technical appendix says SpectroStream compresses 48kHz stereo audio into token frames at 25Hz, with 12 tokens per frame.
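The figures quoted above are consistent with the 40ms frame size, as a quick arithmetic check shows. The constants come from this guide; the function name is only illustrative.

```python
SAMPLE_RATE_HZ = 48_000   # 48kHz output, per the guide
FRAME_RATE_HZ = 25        # SpectroStream token frames per second
TOKENS_PER_FRAME = 12     # tokens in each frame


def frame_arithmetic() -> dict:
    """Derive per-frame and per-second quantities from the quoted figures."""
    return {
        "frame_ms": 1000 / FRAME_RATE_HZ,
        "samples_per_frame_per_channel": SAMPLE_RATE_HZ // FRAME_RATE_HZ,
        "tokens_per_second": FRAME_RATE_HZ * TOKENS_PER_FRAME,
    }


for name, value in frame_arithmetic().items():
    print(f"{name}: {value}")
```

So a 25Hz frame rate is the same thing as a 40ms frame, each frame covers 1,920 samples per channel, and the model has to produce 300 tokens for every second of audio.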
The key design choice is not only tokenization. MRT2 uses a decoder-only architecture, local sliding window attention, and attention sink embeddings to keep continuous generation stable after old context is evicted. The team also dropped explicit positional embeddings for this path, relying on causal masking and sliding attention for length generalization.
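To make the bounded-memory idea concrete, here is a toy model of which entries a streaming decoder could attend to. It is a sketch of the general sliding-window-plus-sinks pattern, not the MRT2 implementation: the class name, the sink count, and the window size are all hypothetical, and real sinks are learned embeddings rather than labels.

```python
from collections import deque


class SlidingContext:
    """Toy attention context: a few permanent sink slots plus a fixed-size
    window of the most recent frames. Older frames are evicted."""

    def __init__(self, num_sinks: int, window: int) -> None:
        # Sink slots are never evicted (placeholder labels for illustration).
        self.sinks = [f"sink{i}" for i in range(num_sinks)]
        # deque(maxlen=...) drops the oldest frame automatically when full.
        self.recent = deque(maxlen=window)

    def push(self, frame_index: int) -> None:
        self.recent.append(frame_index)

    def visible(self) -> list:
        """Entries the next generation step may attend to."""
        return self.sinks + list(self.recent)


ctx = SlidingContext(num_sinks=2, window=4)
for frame in range(10):
    ctx.push(frame)
print(ctx.visible())  # ['sink0', 'sink1', 6, 7, 8, 9]
```

However long the session runs, the visible context never grows past `num_sinks + window` entries, which is the property that keeps continuous generation from needing unbounded memory.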
Where to inspect the system
- `magenta_rt/` is the Python inference library with JAX and MLX backends.
- `core/` is the C++ inference engine for streaming apps.
- `examples/mrt2/auv3` is the all-in-one Audio Unit plugin.
- `examples/mrt2/standalone` is the standalone macOS app.
Developer paths
There are three serious developer paths. Use the app bundle if you are evaluating musical feel. Use the Python package if you are testing prompts, tokens, and offline generation. Use the C++ engine if you are building an instrument or plugin where audio callback behavior matters.
```bash
# C++ app development path from the docs
uv pip install "cmake<3.28"
cmake . -B build
cmake --build build --target hello_mrt2 -j10
./build/examples/hello_mrt2/hello_mrt2 \
  ~/Documents/Magenta/magenta-rt-v2/models/mrt2_small/mrt2_small.mlxfn \
  ~/Documents/Magenta/magenta-rt-v2/resources \
  100 \
  --prompt "ambient pads with sub bass"
```
For MCP.Directory readers, the interesting next step is not an MCP server yet. It is a local agent workflow around the model: generate prompt sets, test presets, catalog useful controller mappings, and create repeatable DAW session templates.
What we got wrong at first
The first read makes MRT2 sound like another open model release. That misses the point. The model only becomes different when the control loop is short enough for a musician to react to it.
We also initially treated the MacBook requirement as a simple limitation. It is a limitation, but it is also an architecture choice. Google started with Apple Silicon because MLX gives the C++ engine a predictable local GPU target and many musicians already use MacBooks. That does not remove the need for Windows and Linux paths, but it explains the launch shape.
Caveats and early friction
The launch reaction is positive, but the caveats are concrete. X replies ask for Windows, Linux, API access, and broader DAW clarity. One reply to Google Magenta asked directly about Windows; the project account answered that a broader release would be useful, but they started with Apple Silicon because the live path needs a moderately powerful GPU and MacBooks are common among musicians.
Real-time claims need local benchmarking
GitHub issue #39 reports a case where `mrt2_base` on an M3 Max exceeded the 40ms frame budget in the official benchmark. Treat the hardware table as guidance, then measure your exact model, MLX version, DAW buffer, and background GPU load.
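If you want a rough local check alongside the official benchmark, a small timing harness is enough to see whether each generation step fits the 40ms budget. This is a generic sketch: `measure_frame_budget` is a hypothetical helper, and `step` is a stand-in for whatever call produces one frame in your setup, not a magenta-rt function.

```python
import time
from typing import Callable

FRAME_BUDGET_MS = 40.0  # one frame at 25Hz, per the guide


def measure_frame_budget(step: Callable[[], None],
                         frames: int = 100,
                         budget_ms: float = FRAME_BUDGET_MS) -> dict:
    """Time `frames` calls of `step` and report how many exceed the budget."""
    durations_ms = []
    for _ in range(frames):
        start = time.perf_counter()
        step()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    over = sum(1 for d in durations_ms if d > budget_ms)
    return {
        "mean_ms": sum(durations_ms) / frames,
        "max_ms": max(durations_ms),
        "frames_over_budget": over,
        "real_time": over == 0,
    }


def fake_step() -> None:
    """Placeholder workload; replace with your own one-frame generation call."""
    time.sleep(0.001)


print(measure_frame_budget(fake_step, frames=20))
```

Look at the maximum and the over-budget count rather than the mean: a single slow frame is what causes an audible dropout, even when the average is comfortably under budget.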
Do not assume it is a cloud API
The launch is about open weights, local apps, plugins, and code. If your product needs a hosted API, the official sources here do not present one for MRT2.
Start with musical latency, not demo novelty
A clip can sound impressive and still fail as an instrument. Test note-following, onset feel, drift, buffer underruns, and whether performers can predict how the model responds.
Three workflows to try
The best first workflows are small and measurable. Do not start with an album. Start with one repeatable controller setup.
Workflow 1
MIDI duet sketch
Put the AU plugin on a MIDI track, hold simple chord changes, and record how the ensemble follows. Use `mrt2_small` first, then compare base only if latency stays stable.
Workflow 2
Text-to-synth preset bank
Generate ten playable patches from short prompts. Score them on attack clarity, style match, noise, and whether they sit behind a vocal or lead instrument.
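A simple score sheet keeps the comparison honest across ten patches. The sketch below assumes a 1 to 5 scale per criterion, which is our choice rather than anything the MRT2 docs prescribe, and the scores shown are placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class PatchScore:
    """One listening-test row. Each criterion is scored 1 (poor) to 5 (good)."""
    prompt: str
    attack_clarity: int
    style_match: int
    cleanliness: int   # higher means less unwanted noise
    sits_in_mix: int   # how well it stays behind a vocal or lead

    def total(self) -> int:
        return (self.attack_clarity + self.style_match
                + self.cleanliness + self.sits_in_mix)


def rank(scores: list[PatchScore]) -> list[PatchScore]:
    """Return patches ordered from highest to lowest total score."""
    return sorted(scores, key=lambda s: s.total(), reverse=True)


# Placeholder scores for illustration only.
sheet = [
    PatchScore("string ensemble", 3, 4, 4, 5),
    PatchScore("disco funk", 4, 5, 3, 3),
]
for row in rank(sheet):
    print(f"{row.total():>2}  {row.prompt}")
```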
Workflow 3
Prompt-mixing controller
Map one knob or XY controller between two style prompts. Listen for useful transitions, not only endpoint quality. This is where MRT2 feels least like a prompt box.
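One way to think about the knob mapping is as a crossfade between two style vectors. The sketch below assumes simple linear interpolation and uses toy three-number vectors; this guide does not say how MRT2 combines prompts internally, so treat both function names and the mixing rule as illustrative assumptions.

```python
def knob_to_weight(cc_value: int) -> float:
    """Map a 7-bit MIDI CC value (0-127) to a mix weight in [0.0, 1.0]."""
    if not 0 <= cc_value <= 127:
        raise ValueError("MIDI CC values must be in the range 0-127")
    return cc_value / 127.0


def mix_styles(style_a: list[float], style_b: list[float],
               weight: float) -> list[float]:
    """Linear crossfade: weight 0.0 returns style_a, 1.0 returns style_b."""
    if len(style_a) != len(style_b):
        raise ValueError("style vectors must have the same length")
    return [(1.0 - weight) * a + weight * b for a, b in zip(style_a, style_b)]


# Toy vectors standing in for two prompt embeddings (assumption).
disco = [1.0, 0.0, 0.5]
strings = [0.0, 1.0, 0.5]
for cc in (0, 64, 127):
    print(cc, mix_styles(disco, strings, knob_to_weight(cc)))
```

Sweeping the knob slowly and listening at several intermediate positions is the point of the workflow: the endpoints are just the two original prompts.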
Verdict
MRT2 is most interesting when you judge it as a live instrument runtime. The open weights matter, but the bigger story is the control surface: MIDI, text, audio, MLX, C++, apps, and plugins all aimed at the same question: can a model be played?
The cautious answer is yes, for the right Mac and the right model size. The practical recommendation is simple: install the bundle, run the small model first, set audio to 48kHz, measure latency in your real DAW session, and only then decide whether the base model belongs in a live workflow.
FAQ
What is Magenta RealTime 2?
Magenta RealTime 2 is an open-weights live music model from Google Magenta. It generates 48kHz stereo audio in real time and can be controlled with MIDI, text prompts, and audio prompts.
Is Magenta RealTime 2 open source?
The magenta-realtime code repository is Apache 2.0, and Google describes MRT2 as an open-weights model. Check the model card and license terms before commercial redistribution or training workflows.
What hardware does MRT2 need?
The small 230M model is documented for real-time streaming on Apple Silicon Macs, including Air models. The base 2.4B model is higher quality and needs stronger Pro or Max class Apple Silicon for real-time streaming.
Does MRT2 work on Windows or Linux?
The live apps and plugins are Mac-first because the streaming engine uses MLX on Apple Silicon. The Python library also exposes JAX for offline and research generation on other hardware, but the launch materials do not present Windows live apps.
Can I use MRT2 inside a DAW?
Yes. The app bundle includes Audio Unit plugin support for DAWs, plus standalone apps and examples. The official install page says to register the plugin, refresh AU plugins in the DAW, and place it on a MIDI track.
What is the main early caveat?
Latency is the real product test. The official docs describe sub-200ms control latency, but an early GitHub issue reports a case where the base model missed the 40ms frame budget on an M3 Max benchmark setup.
Glossary
Codec language model
A model that predicts compressed audio tokens, then decodes them into sound.
MLX
Apple Silicon machine-learning runtime used by MRT2's streaming engine.
Audio Unit
Apple's plugin format for DAWs such as Logic and other macOS music tools.
Open weights
Public model parameters that developers can download and run under the model terms.
Sources
This post uses first-party launch material, official Google Magenta docs, the GitHub repository, GitHub issues and PRs, Reddit discussion, and X reactions. Claims about hardware, install flow, features, and architecture are tied to the sources below.
- Google Gemma MRT2 launch thread on X
- Google Magenta MRT2 launch thread on X
- Magenta RealTime 2 apps and plugins page
- Magenta RealTime 2 technical blog
- magenta/magenta-realtime GitHub repository
- Magenta RealTime installation docs
- Magenta RealTime models and hardware docs
- GitHub issue #39: M3 Max benchmark caveat
- GitHub PR #40: audio callback safety and benchmark context
- Reddit r/aicuriosity MRT2 discussion
- Live Music Models technical report