Gemini 2.5 Flash Image

Name: Gemini 2.5 Flash Image
Rating: 4.4 (21 reviews)
Author: nanameru

Connects to Google's Gemini 2.5 Flash Image model to generate, edit, and compose images and to transfer styles between them, using text prompts and existing images supplied as base64 data or file paths.

What it does

Generate images from text descriptions
Edit existing images with natural language instructions
Compose new images from multiple input images
Transfer artistic styles between images
Save generated images to files or return as base64

Best for

Content creators needing quick image generation
Developers building image-focused applications
Artists experimenting with AI-assisted image editing

Works with any MCP-enabled client
Supports both file paths and base64 inputs
Multi-image composition capabilities

About Gemini 2.5 Flash Image

Gemini 2.5 Flash Image is a community-built MCP server published by nanameru. It gives AI assistants access to text-to-image generation, image editing, multi-image composition, and style transfer through the Model Context Protocol. It is categorized under ai ml and exposes 4 tools that AI clients can invoke during conversations and coding sessions.

How to install

You can install Gemini 2.5 Flash Image in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

Gemini 2.5 Flash Image is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

Tools (4)

generate_image

Generate an image from a text prompt using Gemini 2.5 Flash Image

edit_image

Edit an image using a prompt. Provide one input image via base64 or file path.

compose_images

Compose a new image using multiple input images and a guiding prompt.

style_transfer

Transfer style from a style image to a base image, guided by an optional prompt.

Gemini 2.5 Flash Image MCP

A Model Context Protocol (MCP) server for conversational image generation and editing with Google's Gemini 2.5 Flash Image Preview. Designed to be easy to install and use from Claude Code and other MCP clients.

Key Features

Text-to-Image: Generate images from detailed prompts
Image Editing: Edit images with natural language instructions
Multi-Image Composition / Style Transfer: Combine images or transfer styles
File Save Option: Return base64 image and optionally save to file
Provider-Agnostic MCP: Works in any MCP-enabled client

Requirements

Node.js 18 or newer
An MCP client (Claude Code, Cursor, VS Code, Windsurf, etc.)
Google Gemini API Key: set GEMINI_API_KEY

Get a Gemini API key

Follow these steps to obtain an API key from Google AI Studio:

Open Google AI Studio and sign in: https://aistudio.google.com/apikey
Click “Create API key” (or “Manage keys” if you already have one)
Copy the generated key
Set it as an environment variable on your machine when running this server

Examples:

# macOS / Linux (bash/zsh)
export GEMINI_API_KEY="YOUR_API_KEY"

# Windows PowerShell
$env:GEMINI_API_KEY="YOUR_API_KEY"

Getting Started

First, install the MCP server with your client. The following examples center on Claude Code usage.

Standard config works in most tools:

{
  "mcpServers": {
    "gemini-2-5-flash-mcp": {
      "command": "npx",
      "args": ["@taiyokimura/gemini-2-5-flash-mcp@latest"]
    }
  }
}
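The standard config above does not carry the API key. As a minimal sketch, assuming your client accepts an `env` object on each `mcpServers` entry (check your client's documentation), the following Python helper merges the server entry into an existing config dictionary. The function names `build_server_entry` and `merge_server` are hypothetical and not part of this project.

```python
import json
from typing import Optional

SERVER_ID = "gemini-2-5-flash-mcp"
PACKAGE = "@taiyokimura/gemini-2-5-flash-mcp@latest"


def build_server_entry(api_key: Optional[str] = None) -> dict:
    """Return the standard config entry, optionally with an env block."""
    entry = {"command": "npx", "args": [PACKAGE]}
    if api_key:
        # Assumption: the client forwards "env" to the spawned process.
        entry["env"] = {"GEMINI_API_KEY": api_key}
    return entry


def merge_server(config: dict, entry: dict) -> dict:
    """Return a copy of config with the server registered under SERVER_ID."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[SERVER_ID] = entry  # other registered servers are left untouched
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(merge_server(existing, build_server_entry("YOUR_API_KEY")), indent=2))
```

Returning a copy rather than mutating the input makes it easy to diff the old and new config before writing anything to disk.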

Quick usage (Claude Code)

# npx (with the non-interactive -y flag), passing the API key through Claude's -e option
claude mcp add gemini-2-5-flash-mcp -s user -e GEMINI_API_KEY="YOUR_API_KEY" -- npx -y @taiyokimura/gemini-2-5-flash-mcp@latest

# Global install, passing the API key through Claude's -e option
npm i -g @taiyokimura/gemini-2-5-flash-mcp \
  && claude mcp add gemini-2-5-flash-mcp -s user -e GEMINI_API_KEY="YOUR_API_KEY" -- gemini-2-5-flash-mcp

# Register in HTTP mode (SSE by default); works only with clients that support it
# Note: in HTTP mode, this process itself stays running as an HTTP server
claude mcp add gemini-2-5-flash-mcp -s user \
  -e GEMINI_API_KEY="YOUR_API_KEY" \
  -e MCP_TRANSPORT="http" \
  -e MCP_HTTP_PORT="7801" \
  -e MCP_HTTP_PATH="/mcp" \
  -- npx -y @taiyokimura/gemini-2-5-flash-mcp@latest

Streamable HTTP mode (experimental)

You can use Streamable HTTP instead of STDIO. Use it only if your MCP client supports Streamable HTTP.

Start the server in HTTP mode

export MCP_TRANSPORT=http
export GEMINI_API_KEY=YOUR_API_KEY
# Optional (defaults: 7801, /mcp, SSE)
export MCP_HTTP_PORT=7801
export MCP_HTTP_PATH=/mcp
export MCP_HTTP_ENABLE_JSON=false

npm run build
node ./build/index.js
# => HTTP transport listening on http://localhost:7801/mcp

Client-side configuration (example: a client that supports Streamable HTTP)

Type: HTTP (Streamable)
URL: http://localhost:7801/mcp

Notes:

SSE streaming is the default. To receive JSON responses instead, set MCP_HTTP_ENABLE_JSON=true.
Sessions are generated on the server side (stateful). To make the server fully stateless, change the code to use sessionIdGenerator: undefined.
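To make the relationship between the HTTP-mode variables concrete, here is a minimal Python sketch of how they could resolve into the URL a client connects to. It only mirrors the documented defaults (7801, /mcp, SSE). The function name `resolve_http_settings` is hypothetical, and treating an unset MCP_TRANSPORT as "stdio" is an assumption, not something read from the server's code.

```python
import os
from typing import Mapping, Optional


def resolve_http_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the HTTP-mode variables to a URL and response mode."""
    env = os.environ if env is None else env
    port = int(env.get("MCP_HTTP_PORT", "7801"))
    path = env.get("MCP_HTTP_PATH", "/mcp")
    if not path.startswith("/"):
        path = "/" + path  # tolerate a path given without a leading slash
    enable_json = env.get("MCP_HTTP_ENABLE_JSON", "false").strip().lower() == "true"
    return {
        # Assumption: stdio is used when MCP_TRANSPORT is not set.
        "transport": env.get("MCP_TRANSPORT", "stdio"),
        "url": f"http://localhost:{port}{path}",
        "response_mode": "json" if enable_json else "sse",
    }


if __name__ == "__main__":
    print(resolve_http_settings({"MCP_TRANSPORT": "http"}))
```

With only MCP_TRANSPORT=http set, the resolved URL matches the `http://localhost:7801/mcp` address shown in the startup log above.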

Claude Code (Recommended)

Use the Claude Code CLI to add the MCP server:

claude mcp add gemini-2-5-flash-mcp -s user -- npx @taiyokimura/gemini-2-5-flash-mcp@latest

Remove if needed:

claude mcp remove gemini-2-5-flash-mcp

Claude Desktop

Follow the MCP install guide and use the standard config above.

Guide: https://modelcontextprotocol.io/quickstart/user

Cursor

Go to Cursor Settings → MCP → Add new MCP Server.

Use the following:

Name: gemini-2-5-flash-mcp
Type: command
Command: npx
Args: @taiyokimura/gemini-2-5-flash-mcp@latest
Auto start: on (optional)

VS Code

Add via CLI:

code --add-mcp '{"name":"gemini-2-5-flash-mcp","command":"npx","args":["@taiyokimura/gemini-2-5-flash-mcp@latest"]}'

Or use the standard config in settings.

LM Studio

Add MCP Server with:

Command: npx
Args: ["@taiyokimura/gemini-2-5-flash-mcp@latest"]

Goose

Advanced settings → Extensions → Add custom extension:

Type: STDIO
Command: npx
Args: @taiyokimura/gemini-2-5-flash-mcp@latest
Enabled: true

opencode

Example ~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "gemini-2-5-flash-mcp": {
      "type": "local",
      "command": [
        "npx",
        "@taiyokimura/gemini-2-5-flash-mcp@latest"
      ],
      "enabled": true
    }
  }
}

Qodo Gen

Open Qodo Gen → Connect more tools → + Add new MCP → Paste the standard config above → Save.

Windsurf

Follow Windsurf MCP documentation and use the standard config above.

Docs: https://docs.windsurf.com/windsurf/cascade/mcp

Environment Variables

GEMINI_API_KEY (required)
GEMINI_IMAGE_ENDPOINT (optional) default: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image-preview:generateContent
MCP_NAME (optional, default: gemini-2-5-flash-mcp)
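As a rough illustration of how these three variables fit together, the Python sketch below resolves them with the documented defaults and prepares request headers. The `x-goog-api-key` header is the one the Gemini REST API documents for API-key authentication; whether this server sends the key that way is an assumption. `load_settings` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

DEFAULT_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-2.5-flash-image-preview:generateContent"
)
DEFAULT_NAME = "gemini-2-5-flash-mcp"


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the environment variables, failing fast if the key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is required")
    return {
        "endpoint": env.get("GEMINI_IMAGE_ENDPOINT", DEFAULT_ENDPOINT),
        "name": env.get("MCP_NAME", DEFAULT_NAME),
        # Assumption: the key is sent as a header rather than a query parameter.
        "headers": {"x-goog-api-key": api_key, "Content-Type": "application/json"},
    }


if __name__ == "__main__":
    print(load_settings({"GEMINI_API_KEY": "YOUR_API_KEY"})["endpoint"])
```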

Available Tools

1. generate_image

Generate an image from a text prompt.

Parameters:

prompt (required): Detailed description to generate
saveToFilePath (optional): Path to save the image

Example input:

{
  "prompt": "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme",
  "saveToFilePath": "./gemini-native-image.png"
}
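For readers curious what a call like this might translate to, here is a hedged Python sketch of the request body and response handling, based on the publicly documented Gemini `generateContent` JSON shape (text parts in, `inlineData` parts with base64 data out). It is not taken from this server's source. The helper names are hypothetical, and the response in the usage section is fabricated purely to exercise the parser.

```python
import base64
from pathlib import Path
from typing import Optional


def build_generate_body(prompt: str) -> dict:
    """Build a generateContent request body holding a single text part."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def extract_image(response: dict) -> Optional[bytes]:
    """Return the first inline image in the response as raw bytes, if any."""
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            # Accept both spellings of the inline-data key.
            inline = part.get("inlineData") or part.get("inline_data")
            if inline and inline.get("data"):
                return base64.b64decode(inline["data"])
    return None


def save_image(image: bytes, save_to_file_path: str) -> Path:
    """Write image bytes to the path given as saveToFilePath."""
    path = Path(save_to_file_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(image)
    return path


if __name__ == "__main__":
    print(build_generate_body("A nano banana dish in a fancy restaurant"))
    # Fabricated response, used only to exercise extract_image.
    fake = {"candidates": [{"content": {"parts": [
        {"text": "Here is your image."},
        {"inlineData": {"mimeType": "image/png",
                        "data": base64.b64encode(b"not-a-real-png").decode("ascii")}},
    ]}}]}
    print(len(extract_image(fake) or b""), "bytes decoded")
```

Skipping text parts matters because the model can return commentary alongside the image in the same candidate.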

2. edit_image

Edit an image using a prompt.

Parameters:

prompt (required): Edit instruction
image (required): { dataBase64?: string, path?: string, mimeType?: string }
saveToFilePath (optional)

Example input:

{
  "prompt": "Add a small, knitted wizard hat to the cat",
  "image": { "path": "./cat.jpeg", "mimeType": "image/jpeg" },
  "saveToFilePath": "./gemini-edited-image.png"
}
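The `image` parameter accepts either base64 data or a file path. The Python sketch below shows one way such an input could be normalized into an inline-data part for the Gemini API. The precedence (dataBase64 first, then path), the extension-based MIME guess, and the `image/png` fallback are all assumptions for illustration. `to_inline_part` and `build_edit_body` are hypothetical names.

```python
import base64
import mimetypes
from pathlib import Path


def to_inline_part(image: dict) -> dict:
    """Convert {dataBase64?, path?, mimeType?} into an inline-data part."""
    data = image.get("dataBase64")
    path = image.get("path")
    if data is None and path is None:
        raise ValueError("image needs either dataBase64 or path")
    mime = image.get("mimeType")
    if data is None:
        # Read the file and encode it; guess the MIME type if none was given.
        data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
        mime = mime or mimetypes.guess_type(path)[0]
    # Assumption: fall back to image/png when the type cannot be determined.
    return {"inlineData": {"mimeType": mime or "image/png", "data": data}}


def build_edit_body(prompt: str, image: dict) -> dict:
    """Build a request body with the edit instruction followed by the image."""
    return {"contents": [{"parts": [{"text": prompt}, to_inline_part(image)]}]}


if __name__ == "__main__":
    body = build_edit_body(
        "Add a small, knitted wizard hat to the cat",
        {"dataBase64": base64.b64encode(b"raw image bytes").decode("ascii"),
         "mimeType": "image/jpeg"},
    )
    print(body["contents"][0]["parts"][1]["inlineData"]["mimeType"])
```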

3. compose_images

Combine elements from multiple images.

Parameters:

prompt (required)
images (required): Array of image inputs (2-3 recommended)
saveToFilePath (optional)
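No example input is given for this tool, so the following Python sketch only illustrates how several images and a guiding prompt might be packed into one request, and how the "2-3 recommended" guidance could be surfaced as a note rather than a hard limit. The part ordering (prompt first, then images) and the helper names are assumptions. Images are taken as already base64-encoded for brevity.

```python
from typing import List, Optional


def count_note(n: int) -> Optional[str]:
    """Return an advisory note when the image count is outside 2-3."""
    if 2 <= n <= 3:
        return None
    return f"{n} image(s) supplied; 2-3 are recommended"


def build_compose_body(prompt: str, images: List[dict]) -> dict:
    """Build a request body from a prompt and a list of image inputs."""
    if not images:
        raise ValueError("compose_images needs at least one image input")
    parts: List[dict] = [{"text": prompt}]
    for img in images:
        parts.append({
            "inlineData": {
                # Assumption: default to image/png when mimeType is omitted.
                "mimeType": img.get("mimeType", "image/png"),
                "data": img["dataBase64"],
            }
        })
    return {"contents": [{"parts": parts}]}


if __name__ == "__main__":
    images = [{"dataBase64": "aGk="}, {"dataBase64": "aGV5", "mimeType": "image/jpeg"}]
    body = build_compose_body("Place the product from image 1 into the scene from image 2", images)
    print(len(body["contents"][0]["parts"]), "parts;", count_note(len(images)) or "count ok")
```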

4. style_transfer

Transfer the style of one image to another.

Parameters:

prompt (optional)
baseImage (required)
styleImage (required)
saveToFilePath (optional)
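The parameter lists for the four tools can be summarized in a small lookup table. The Python sketch below checks a set of arguments against that table before a call is made. The required and optional names come from the lists above. The `check_arguments` helper and its messages are hypothetical, and the server's own validation may differ.

```python
from typing import Dict, List, Set

# Required and optional parameters, as listed in the tool descriptions above.
TOOL_PARAMS: Dict[str, Dict[str, Set[str]]] = {
    "generate_image": {"required": {"prompt"}, "optional": {"saveToFilePath"}},
    "edit_image": {"required": {"prompt", "image"}, "optional": {"saveToFilePath"}},
    "compose_images": {"required": {"prompt", "images"}, "optional": {"saveToFilePath"}},
    "style_transfer": {
        "required": {"baseImage", "styleImage"},
        "optional": {"prompt", "saveToFilePath"},
    },
}


def check_arguments(tool: str, args: dict) -> List[str]:
    """Return a list of problems; an empty list means the names look fine."""
    spec = TOOL_PARAMS.get(tool)
    if spec is None:
        return [f"unknown tool: {tool}"]
    given = set(args)
    problems = [f"missing required parameter: {name}"
                for name in sorted(spec["required"] - given)]
    problems += [f"unexpected parameter: {name}"
                 for name in sorted(given - spec["required"] - spec["optional"])]
    return problems


if __name__ == "__main__":
    print(check_arguments("style_transfer", {"baseImage": {"path": "./photo.png"}}))
```

Note that `style_transfer` is the only tool where `prompt` is optional, which is easy to overlook when switching between tools.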

Development

Run locally:

npm install
npm run build
npx .

Name Consistency & Troubleshooting

Always use the canonical ID (gemini-2-5-flash-mcp) for identifiers and keys.
Use the canonical display name (Gemini 2.5 Flash MCP) only for UI labels.
Do not mix different names across clients.

Consistency Matrix:

npm package name → @taiyokimura/gemini-2-5-flash-mcp (scoped; the unscoped part matches the canonical ID)
Binary name → gemini-2-5-flash-mcp
MCP server name (SDK metadata) → gemini-2-5-flash-mcp
Env default MCP_NAME → gemini-2-5-flash-mcp
Client registry key → gemini-2-5-flash-mcp
UI label → Gemini 2.5 Flash MCP

Conflict Cleanup:

Remove any old entries like "GeminiFlash" and re-add with gemini-2-5-flash-mcp.
Ensure global registries only use gemini-2-5-flash-mcp for keys.
Cursor: configure in the UI only. This project does not include .cursor/mcp.json.

References

MCP SDK: https://modelcontextprotocol.io/docs/sdks
Architecture: https://modelcontextprotocol.io/docs/learn/architecture
Server concepts: https://modelcontextprotocol.io/docs/learn/server-concepts
Server spec (2025-06-18): https://modelcontextprotocol.io/specification/2025-06-18/server/index
Gemini image generation: https://ai.google.dev/gemini-api/docs/image-generation
