generate-image


Generate or edit images using AI models (FLUX, Gemini). Use for scientific illustrations, diagrams, schematics, infographics, concept visualizations, and artistic images. Supports image editing to modify existing images (change colors, add/remove elements, style transfer). Useful for figures, posters, and visual explanations.

Install

mkdir -p .claude/skills/generate-image && curl -L -o skill.zip "https://mcp.directory/api/skills/download/168" && unzip -o skill.zip -d .claude/skills/generate-image && rm skill.zip

Installs to .claude/skills/generate-image

About this skill

Generate Image

Generate and edit high-quality images using OpenRouter's image generation models including FLUX.2 Pro and Gemini 3 Pro.

When to Use This Skill

Use generate-image for:

  • Photos and photorealistic images
  • Artistic illustrations and artwork
  • Concept art and visual concepts
  • Visual assets for presentations or documents
  • Image editing and modifications
  • Any general-purpose image generation needs

Use scientific-schematics instead for:

  • Flowcharts and process diagrams
  • Circuit diagrams and electrical schematics
  • Biological pathways and signaling cascades
  • System architecture diagrams
  • CONSORT diagrams and methodology flowcharts
  • Any technical/schematic diagrams

Quick Start

Use the scripts/generate_image.py script to generate or edit images:

# Generate a new image
python scripts/generate_image.py "A beautiful sunset over mountains"

# Edit an existing image
python scripts/generate_image.py "Make the sky purple" --input photo.jpg

By default, the script saves the generated or edited image as generated_image.png in the current directory; pass --output to choose a different path.

API Key Setup

CRITICAL: The script requires an OpenRouter API key. Before running it, check whether the user has configured one:

  1. Look for a .env file in the project directory or parent directories
  2. Check for OPENROUTER_API_KEY=<key> in the .env file
  3. If not found, inform the user they need to:
    • Create a .env file with OPENROUTER_API_KEY=your-api-key-here
    • Or set the environment variable: export OPENROUTER_API_KEY=your-api-key-here
    • Get an API key from: https://openrouter.ai/keys

The script will automatically detect the .env file and provide clear error messages if the API key is missing.
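
The lookup described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name find_api_key is hypothetical, and checking the exported environment variable before the .env file is an assumption about precedence.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_api_key(start_dir: str = ".",
                 env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return OPENROUTER_API_KEY from the environment or the nearest .env file.

    Hypothetical helper: the real script may resolve the key differently.
    """
    env = os.environ if env is None else env
    # Assumption: an exported environment variable wins over any .env file.
    if env.get("OPENROUTER_API_KEY"):
        return env["OPENROUTER_API_KEY"]

    # Walk from start_dir up through its parent directories.
    current = Path(start_dir).resolve()
    for directory in [current, *current.parents]:
        env_file = directory / ".env"
        if not env_file.is_file():
            continue
        for line in env_file.read_text().splitlines():
            line = line.strip()
            if line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "OPENROUTER_API_KEY":
                # Drop optional surrounding quotes; an empty value counts as missing.
                return value.strip().strip("'\"") or None
    return None
```

If the function returns None, point the user to https://openrouter.ai/keys and the two setup options listed above.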

Model Selection

Default model: google/gemini-3-pro-image-preview (high quality, recommended)

Available models for generation and editing:

  • google/gemini-3-pro-image-preview - High quality, supports generation + editing
  • black-forest-labs/flux.2-pro - Fast, high quality, supports generation + editing

Generation only:

  • black-forest-labs/flux.2-flex - Fast and low-cost, but lower quality than flux.2-pro

Select based on:

  • Quality: Use gemini-3-pro or flux.2-pro
  • Editing: Use gemini-3-pro or flux.2-pro (both support image editing)
  • Cost: Use flux.2-flex for generation only
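
The selection criteria above could be encoded as a small lookup. This is only a sketch: choose_model and the capability flags are hypothetical names, and the table reflects just the capabilities listed in this document.

```python
from typing import Dict

DEFAULT_MODEL = "google/gemini-3-pro-image-preview"

# Capabilities as described in this document (not queried from OpenRouter).
MODELS: Dict[str, Dict[str, bool]] = {
    "google/gemini-3-pro-image-preview": {"editing": True, "low_cost": False},
    "black-forest-labs/flux.2-pro": {"editing": True, "low_cost": False},
    "black-forest-labs/flux.2-flex": {"editing": False, "low_cost": True},
}


def choose_model(editing: bool = False, prefer_low_cost: bool = False) -> str:
    """Pick a model ID for the task (hypothetical helper)."""
    # Keep only models that can edit when editing is requested.
    candidates = [name for name, caps in MODELS.items()
                  if caps["editing"] or not editing]
    if prefer_low_cost:
        cheap = [name for name in candidates if MODELS[name]["low_cost"]]
        if cheap:
            return cheap[0]
    # Fall back to the documented default whenever it qualifies.
    return DEFAULT_MODEL if DEFAULT_MODEL in candidates else candidates[0]
```

The returned ID can be passed to the script with --model.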

Common Usage Patterns

Basic generation

python scripts/generate_image.py "Your prompt here"

Specify model

python scripts/generate_image.py "A cat in space" --model "black-forest-labs/flux.2-pro"

Custom output path

python scripts/generate_image.py "Abstract art" --output artwork.png

Edit an existing image

python scripts/generate_image.py "Make the background blue" --input photo.jpg

Edit with a specific model

python scripts/generate_image.py "Add sunglasses to the person" --input portrait.png --model "black-forest-labs/flux.2-pro"

Edit with custom output

python scripts/generate_image.py "Remove the text from the image" --input screenshot.png --output cleaned.png

Multiple images

Run the script multiple times with different prompts or output paths:

python scripts/generate_image.py "Image 1 description" --output image1.png
python scripts/generate_image.py "Image 2 description" --output image2.png
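
For larger batches, the same repeated invocation can be driven from Python. The sketch below assumes the script lives at scripts/generate_image.py relative to the working directory; build_command and run_batch are hypothetical helper names, and only the flags documented here are used.

```python
import subprocess
import sys
from typing import Callable, Iterable, List, Optional, Tuple

SCRIPT = "scripts/generate_image.py"  # path as used throughout this document


def build_command(prompt: str, output: str,
                  input_path: Optional[str] = None,
                  model: Optional[str] = None) -> List[str]:
    """Build the argument list for one invocation of the script."""
    cmd = [sys.executable, SCRIPT, prompt, "--output", output]
    if input_path:
        cmd += ["--input", input_path]  # enables edit mode
    if model:
        cmd += ["--model", model]
    return cmd


def run_batch(jobs: Iterable[Tuple[str, str]],
              runner: Callable[..., object] = subprocess.run) -> List[List[str]]:
    """Run the script once per (prompt, output) pair, stopping on the first failure."""
    commands = [build_command(prompt, output) for prompt, output in jobs]
    for cmd in commands:
        # check=True makes subprocess.run raise if the script exits non-zero.
        runner(cmd, check=True)
    return commands
```

Calling run_batch([("Image 1 description", "image1.png"), ("Image 2 description", "image2.png")]) is equivalent to the two commands above. Passing arguments as a list avoids shell quoting problems in prompts.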

Script Parameters

  • prompt (required): Text description of the image to generate, or editing instructions
  • --input or -i: Input image path for editing (enables edit mode)
  • --model or -m: OpenRouter model ID (default: google/gemini-3-pro-image-preview)
  • --output or -o: Output file path (default: generated_image.png)
  • --api-key: OpenRouter API key (overrides .env file)
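
As an illustration of how these parameters fit together, a command-line interface with the same flags and defaults could be declared with argparse as below. This is a sketch based only on the list above, not the script's actual source; build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a CLI mirroring the documented parameters (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Generate or edit an image through OpenRouter")
    parser.add_argument(
        "prompt",
        help="Text description of the image, or editing instructions")
    parser.add_argument(
        "--input", "-i", default=None,
        help="Input image path; supplying it enables edit mode")
    parser.add_argument(
        "--model", "-m", default="google/gemini-3-pro-image-preview",
        help="OpenRouter model ID")
    parser.add_argument(
        "--output", "-o", default="generated_image.png",
        help="Output file path")
    parser.add_argument(
        "--api-key", default=None,
        help="OpenRouter API key (overrides the .env file)")
    return parser
```

With this declaration, build_parser().parse_args(["A cat in space", "-m", "black-forest-labs/flux.2-pro"]) yields the given model and the default output path.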

Example Use Cases

For Scientific Documents

# Generate a conceptual illustration for a paper
python scripts/generate_image.py "Microscopic view of cancer cells being attacked by immunotherapy agents, scientific illustration style" --output figures/immunotherapy_concept.png

# Create a visual for a presentation
python scripts/generate_image.py "DNA double helix structure with highlighted mutation site, modern scientific visualization" --output slides/dna_mutation.png

For Presentations and Posters

# Title slide background
python scripts/generate_image.py "Abstract blue and white background with subtle molecular patterns, professional presentation style" --output slides/background.png

# Poster hero image
python scripts/generate_image.py "Laboratory setting with modern equipment, photorealistic, well-lit" --output poster/hero.png

For General Visual Content

# Website or documentation images
python scripts/generate_image.py "Professional team collaboration around a digital whiteboard, modern office" --output docs/team_collaboration.png

# Marketing materials
python scripts/generate_image.py "Futuristic AI brain concept with glowing neural networks" --output marketing/ai_concept.png

Error Handling

The script provides clear error messages for:

  • Missing API key (with setup instructions)
  • API errors (with status codes)
  • Unexpected response formats
  • Missing dependencies (requests library)

If the script fails, read the error message and address the issue before retrying.

Notes

  • Images are returned as base64-encoded data URLs and automatically saved as PNG files
  • The script handles both the images and content response formats that different OpenRouter models return
  • Generation time varies by model (typically 5-30 seconds)
  • For image editing, the input image is encoded as base64 and sent to the model
  • Supported input image formats: PNG, JPEG, GIF, WebP
  • Check OpenRouter pricing for cost information: https://openrouter.ai/models
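
The base64 handling mentioned in the notes above can be sketched with the standard library. The function names are hypothetical, and this version writes the decoded bytes unchanged rather than converting them to PNG, so it assumes the payload is already in the desired format.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode an image file as a base64 data URL, as done for edit inputs."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = "image/png"  # assumption: fall back to PNG for unknown extensions
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def save_data_url(data_url: str, output_path: str) -> int:
    """Decode a base64 data URL and write it to output_path; return bytes written."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    raw = base64.b64decode(payload)
    Path(output_path).write_bytes(raw)
    return len(raw)
```

Base64 adds roughly one third to the payload size, so large input images produce correspondingly larger requests.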

Image Editing Tips

  • Be specific about the changes you want (e.g., "change the sky to sunset colors" rather than "edit the sky")
  • Reference specific elements in the image when possible
  • For best results, use clear and detailed editing instructions
  • Both Gemini 3 Pro and FLUX.2 Pro support image editing through OpenRouter

Integration with Other Skills

  • scientific-schematics: Use for technical diagrams, flowcharts, circuits, pathways
  • generate-image: Use for photos, illustrations, artwork, visual concepts
  • scientific-slides: Combine with generate-image for visually rich presentations
  • latex-posters: Use generate-image for poster visuals and hero images
