
OpenRouter Image Analysis
Analyzes images using OpenRouter's vision models, accepting file paths, URLs, and base64 data as input. Includes specialized tools for general analysis, webpage screenshot evaluation, and mobile app design assessment against platform guidelines.
What it does
- Analyze images from files, URLs, or base64 data
- Evaluate webpage screenshots for design and UX
- Assess mobile app designs against platform guidelines
- Choose from multiple vision models (Claude, Gemini, GPT-4 Vision)
- Process photos, diagrams, and visual content
- Generate detailed image descriptions and insights
About OpenRouter Image Analysis
OpenRouter Image Analysis is a community-built MCP server published by jonathanjude that provides AI assistants with tools and capabilities via the Model Context Protocol. It offers image analysis using OpenRouter's vision models and is categorized under AI/ML.
How to install
You can install OpenRouter Image Analysis in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.
License
OpenRouter Image Analysis is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.
OpenRouter Image MCP Server
Supercharge your AI agents with powerful image analysis capabilities!
A blazing-fast MCP (Model Context Protocol) server that enables AI agents to see and understand images using OpenRouter's cutting-edge vision models. Perfect for screenshots, photos, diagrams, and any visual content!
What Makes This Special?
- Multi-Model Support: Choose from Claude, Gemini, GPT-4 Vision, and more!
- Lightning Fast: Built with TypeScript and optimized for performance
- Flexible Input: Support for file paths, URLs, and base64 data
- Cost-Effective: Smart model selection for the best price-to-quality ratio
- Production Ready: Robust error handling, retries, and comprehensive logging
- Easy Integration: Works seamlessly with Claude Code, Cline, Cursor, and more!
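To make the multi-model and flexible-input points concrete, here is a minimal sketch of the kind of request body a vision call to OpenRouter takes. It assumes OpenRouter's OpenAI-compatible chat completions format. `buildVisionRequest` and its types are hypothetical names for illustration, not this server's actual code, and the 4000 / 0.1 defaults are borrowed from the documented defaults of the analyze_image tool.

```typescript
// One part of a multimodal user message (OpenAI-compatible format).
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface VisionRequest {
  model: string;
  messages: { role: "user"; content: ContentPart[] }[];
  max_tokens: number;
  temperature: number;
}

// Hypothetical helper: builds the JSON body for one prompt plus one image.
// imageUrl may be an https URL or a data URL (data:image/png;base64,...).
function buildVisionRequest(
  model: string,
  prompt: string,
  imageUrl: string,
  maxTokens: number = 4000,
  temperature: number = 0.1
): VisionRequest {
  return {
    model: model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
    max_tokens: maxTokens,
    temperature: temperature,
  };
}

const visionBody = buildVisionRequest(
  "google/gemini-2.0-flash-exp:free",
  "Describe this image.",
  "https://example.com/screenshot.png"
);
console.log(JSON.stringify(visionBody, null, 2));
```

A body like this would be sent as a POST to https://openrouter.ai/api/v1/chat/completions with an `Authorization: Bearer <key>` header. Switching models only changes the `model` string, which is what makes multi-model support cheap to offer.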
Quick Start
Prerequisites
- Node.js 18+
- OpenRouter API Key (Get one at openrouter.ai)
- Your favorite MCP client (Claude Code, Cline, etc.)
Installation
# Option 1: Use immediately with npx (recommended)
npx openrouter-image-mcp
# Option 2: Install globally for frequent use
npm install -g openrouter-image-mcp
# Option 3: Clone and build locally
git clone https://github.com/JonathanJude/openrouter-image-mcp.git
cd openrouter-image-mcp
npm install
npm run build
npm install -g .
Why npx is recommended: No installation required, always gets the latest version, and works perfectly for MCP server usage!
Configuration
The MCP server requires an OpenRouter API key. You can configure it in several ways:
Method 1: Environment Variables (Recommended)
# Set your API key
export OPENROUTER_API_KEY=sk-or-v1-your-api-key-here
# Set model (uses free model by default)
export OPENROUTER_MODEL=google/gemini-2.0-flash-exp:free
Method 2: .env File
# Copy the environment template
cp .env.example .env
# Edit with your credentials
nano .env
Add your OpenRouter credentials to .env:
# Required
OPENROUTER_API_KEY=sk-or-v1-your-api-key-here
# Model (FREE by default - great for getting started!)
OPENROUTER_MODEL=google/gemini-2.0-flash-exp:free
# Optional settings
LOG_LEVEL=info
MAX_IMAGE_SIZE=10485760
RETRY_ATTEMPTS=3
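The settings above can be read into a typed configuration object. The sketch below is an assumption about how such parsing might look, not the server's actual implementation: `loadConfig`, `ServerConfig`, and the accepted set of log levels are hypothetical, and the fallback values simply mirror the example .env file.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error"; // assumed set of levels

interface ServerConfig {
  apiKey: string;
  model: string;
  logLevel: LogLevel;
  maxImageSize: number; // bytes
  retryAttempts: number;
}

const KNOWN_LEVELS: LogLevel[] = ["debug", "info", "warn", "error"];

// Parse a non-negative integer setting, falling back when unset or invalid.
function parseIntSetting(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  const isWholeNumber = isFinite(value) && Math.floor(value) === value;
  return isWholeNumber && value >= 0 ? value : fallback;
}

// Takes the environment as a plain object so it can be tested without Node.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = env["OPENROUTER_API_KEY"];
  if (!apiKey) throw new Error("OPENROUTER_API_KEY is required");

  const level = (env["LOG_LEVEL"] || "info").toLowerCase() as LogLevel;
  return {
    apiKey: apiKey,
    model: env["OPENROUTER_MODEL"] || "google/gemini-2.0-flash-exp:free",
    logLevel: KNOWN_LEVELS.indexOf(level) !== -1 ? level : "info",
    maxImageSize: parseIntSetting(env["MAX_IMAGE_SIZE"], 10485760), // 10 MiB
    retryAttempts: parseIntSetting(env["RETRY_ATTEMPTS"], 3),
  };
}

const exampleConfig = loadConfig({
  OPENROUTER_API_KEY: "sk-or-v1-your-api-key-here",
  RETRY_ATTEMPTS: "5",
});
console.log(exampleConfig);
```

In a Node.js process you would pass `process.env` as the argument. Invalid numeric values fall back to the defaults here rather than throwing, which is one possible design choice; a stricter server might reject them at startup.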
Method 3: Direct Configuration in MCP Client
Add the API key directly in your MCP client configuration (see examples below).
Works Locally - No Restarts Needed!
HUGE ADVANTAGE: This MCP server works perfectly locally with zero manual intervention once configured! No restarts, no manual server starts, no fiddling with settings. It just works!
How It Works Automatically
- Configure once → Set up your MCP client one time
- Auto-launches → Client starts the server automatically
- Connects → Validates API and loads models instantly
- Ready to use → All 3 tools available immediately
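Behind the auto-launch steps, an MCP client talks to the server over stdio using JSON-RPC 2.0, one JSON message per line. The sketch below shows the messages a client typically sends at startup. It follows the MCP specification's lifecycle as I understand it (the protocol version string is one published revision); the helper names and the client identity are hypothetical, and a real client waits for the `initialize` response before sending the rest.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
}

// The stdio transport delimits messages with newlines.
function frameMessage(message: JsonRpcRequest | JsonRpcNotification): string {
  return JSON.stringify(message) + "\n";
}

function initializeRequest(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "initialize",
    params: {
      protocolVersion: "2024-11-05",
      capabilities: {},
      // Hypothetical client identity, for illustration only.
      clientInfo: { name: "example-client", version: "0.0.1" },
    },
  };
}

function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id: id, method: "tools/list" };
}

const initializedNotice: JsonRpcNotification = {
  jsonrpc: "2.0",
  method: "notifications/initialized",
};

// Messages in the order a client would write them to the server's stdin.
const startupMessages = [
  frameMessage(initializeRequest(1)),
  frameMessage(initializedNotice),
  frameMessage(listToolsRequest(2)),
];
console.log(startupMessages.join(""));
```

The response to `tools/list` is where the client learns about the tools this server exposes, which is why they are available as soon as the handshake completes.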
Local Setup Benefits
- Fire-and-forget: Set up once, forget forever
- Lightning startup: ~5 seconds total ready time
- Persistent across restarts: Survives laptop shutdowns
- Cross-platform: Works on any OS with Node.js
- Zero maintenance: No babysitting required
MCP Configuration
Option 1: Using npx (Recommended - No Installation Required)
The easiest way to use this MCP server is with npx, which automatically downloads and runs the package without any installation:
For Claude Code
Add to ~/.claude.json:
{
  "mcp": {
    "servers": {
      "openrouter-image": {
        "command": "npx",
        "args": ["openrouter-image-mcp"],
        "env": {
          "OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
          "OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
        }
      }
    }
  }
}
For Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "openrouter-image": {
      "command": "npx",
      "args": ["openrouter-image-mcp"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
        "OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
      }
    }
  }
}
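If your client config already lists other servers, the new entry should be merged in rather than pasted over the file. The following is a small sketch of that merge for the `mcpServers` layout shown above. `addImageServer` and the interfaces are hypothetical helpers for illustration; reading and writing the actual config file is left out.

```typescript
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface ClientConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [otherSetting: string]: unknown;
}

// Returns a new config with the openrouter-image entry added.
// Existing servers and unrelated settings are preserved.
function addImageServer(config: ClientConfig, apiKey: string, model: string): ClientConfig {
  const existingServers = config.mcpServers || {};
  const imageServer: McpServerEntry = {
    command: "npx",
    args: ["openrouter-image-mcp"],
    env: { OPENROUTER_API_KEY: apiKey, OPENROUTER_MODEL: model },
  };
  return {
    ...config,
    mcpServers: { ...existingServers, "openrouter-image": imageServer },
  };
}

const mergedConfig = addImageServer(
  { mcpServers: { other: { command: "node" } }, theme: "dark" },
  "sk-or-v1-your-api-key-here",
  "google/gemini-2.0-flash-exp:free"
);
console.log(JSON.stringify(mergedConfig, null, 2));
```

The function does not mutate its input, so the original config can be kept as a backup before the merged result is written to disk.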
For Other MCP Clients
- Cursor: ~/.cursor/mcp.json
- Cline: ~/.cline/mcp.json
- Windsurf: MCP settings file
- Other agents: Check your agent's MCP documentation
Benefits of npx:
- No installation needed - works immediately
- Always latest version - automatically updates
- Cross-platform - works everywhere Node.js is installed
- Clean system - no global packages required
Option 2: Global Installation (For Frequent Users)
If you plan to use this MCP server frequently, install it globally:
npm install -g openrouter-image-mcp
Then use this configuration:
{
  "mcp": {
    "servers": {
      "openrouter-image": {
        "command": "openrouter-image-mcp",
        "env": {
          "OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
          "OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
        }
      }
    }
  }
}
Benefits of global installation:
- Faster startup - no download time
- Works offline - once installed
- Simpler command - shorter configuration
Option 3: Local Development
If you cloned the repo locally for development:
{
  "mcpServers": {
    "openrouter-image": {
      "command": "node",
      "args": ["/path/to/openrouter-image-mcp/dist/index.js"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
        "OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
      }
    }
  }
}
Pro Tip: Replace the API key with your actual OpenRouter key. The free model works great for most use cases!
Recommendation: Start with npx (Option 1) - it's the easiest and most reliable way to get started!
Pro Tips for Local Setup
Path Management
- Absolute paths work best: /path/to/openrouter-image-mcp/dist/index.js
- Avoid relative paths: May break when switching directories
- Use your actual path: Update the examples with your real project location
Environment Variables
- Set in .env file: Keep your API key secure
- OR set in system: export OPENROUTER_API_KEY=sk-or-v1-...
- Test quickly: Run OPENROUTER_API_KEY=... node dist/index.js
Quick Verification
# Test if server works
export OPENROUTER_API_KEY=sk-or-v1-your-key
export OPENROUTER_MODEL=google/gemini-2.5-flash-lite-preview-09-2025
node dist/index.js
# Should see logs: "Starting OpenRouter Image MCP Server"
Troubleshooting Local Issues
"Command not found"
# Use absolute path to node
"$(which node)" "/path/to/openrouter-image-mcp/dist/index.js"
"File not found"
# Verify the built file exists
ls -la /path/to/openrouter-image-mcp/dist/index.js
# Rebuild if missing
npm run build
"API key required"
# Check your environment variables
echo $OPENROUTER_API_KEY
# Or create .env file
echo "OPENROUTER_API_KEY=sk-or-v1-your-key" > .env
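The three checks above can be folded into one preflight routine. This is only a sketch: `diagnoseSetup` and `SetupState` are hypothetical names, the caller supplies the facts (so the function stays free of Node.js APIs), and the `sk-or-v1-` prefix check is an assumption based on the example keys in this README.

```typescript
interface SetupState {
  nodeOnPath: boolean;        // can the client find the node binary?
  entryPointExists: boolean;  // does dist/index.js exist?
  apiKey: string | undefined; // value of OPENROUTER_API_KEY, if set
}

// Returns one hint per detected problem; an empty list means the setup looks fine.
function diagnoseSetup(state: SetupState): string[] {
  const hints: string[] = [];
  if (!state.nodeOnPath) {
    hints.push("Command not found: use the absolute path to node");
  }
  if (!state.entryPointExists) {
    hints.push("File not found: run npm run build to create dist/index.js");
  }
  if (!state.apiKey) {
    hints.push("API key required: export OPENROUTER_API_KEY or add it to .env");
  } else if (state.apiKey.indexOf("sk-or-v1-") !== 0) {
    // Assumption: keys look like the sk-or-v1-... examples in this README.
    hints.push("API key has an unexpected prefix: check for a copy/paste error");
  }
  return hints;
}

const setupHints = diagnoseSetup({
  nodeOnPath: true,
  entryPointExists: false,
  apiKey: undefined,
});
setupHints.forEach(function (hint) {
  console.log(hint);
});
```

In a Node.js script, `apiKey` would come from `process.env.OPENROUTER_API_KEY` and `entryPointExists` from `fs.existsSync` on the built file's path.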
Local Development Workflow
- Build once: npm run build
- Configure once: Add MCP config to your AI agent
- Restart agent: Pick up the new configuration
- Use immediately: No manual server management needed!
Usage Examples
With Claude Code
Add this to your ~/.claude.json:
{
  "mcp": {
    "servers": {
      "openrouter-image": {
        "command": "npx",
        "args": ["openrouter-image-mcp"],
        "env": {
          "OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
          "OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
        }
      }
    }
  }
}
With Claude Desktop
Add this to your claude_desktop_config.json:
{
  "mcpServers": {
    "openrouter-image": {
      "command": "npx",
      "args": ["openrouter-image-mcp"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
        "OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
      }
    }
  }
}
Amazing Things You Can Do!
# Analyze any screenshot
"Analyze this screenshot: /path/to/screenshot.png"
# Extract text from images
"What text do you see in this document: /path/to/scan.jpg"
# Review UI designs
"Review this UI mockup for accessibility issues: /path/to/design.png"
# Debug mobile apps
"Analyze this mobile app screenshot for UX problems: /path/to/app.png"
# Analyze webpages
"What can you tell me about this webpage: https://example.com/screenshot.png"
Available Tools
analyze_image - General Image Analysis
Perfect for photos, diagrams, charts, and general visual content!
Parameters:
- type: Input type: file, url, or base64
- data: Image data (path, URL, or base64 string)
- prompt: Custom analysis prompt
- format: Output: text or json
- maxTokens: Maximum response tokens (default: 4000)
- temperature: Creativity 0-2 (default: 0.1)
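The parameter list above maps naturally onto a typed argument object. The sketch below types the arguments, checks them against the documented ranges, and wraps them in the JSON-RPC `tools/call` request an MCP client would send. The type and function names are hypothetical, which fields are optional is an assumption, and the URL scheme check is an extra assumption rather than documented server behavior.

```typescript
type InputType = "file" | "url" | "base64";
type OutputFormat = "text" | "json";

interface AnalyzeImageArgs {
  type: InputType;
  data: string;          // path, URL, or base64 string
  prompt?: string;
  format?: OutputFormat;
  maxTokens?: number;    // documented default: 4000
  temperature?: number;  // documented range 0-2, default 0.1
}

// Returns a list of problems; an empty list means the arguments look valid.
function validateAnalyzeImageArgs(args: AnalyzeImageArgs): string[] {
  const problems: string[] = [];
  if (["file", "url", "base64"].indexOf(args.type) === -1) {
    problems.push("type must be file, url, or base64");
  }
  if (args.data.length === 0) {
    problems.push("data must not be empty");
  }
  // Assumption: URL inputs use http or https.
  if (args.type === "url" && !/^https?:\/\//.test(args.data)) {
    problems.push("url input must start with http:// or https://");
  }
  if (args.format !== undefined && ["text", "json"].indexOf(args.format) === -1) {
    problems.push("format must be text or json");
  }
  if (args.maxTokens !== undefined &&
      (args.maxTokens <= 0 || Math.floor(args.maxTokens) !== args.maxTokens)) {
    problems.push("maxTokens must be a positive integer");
  }
  if (args.temperature !== undefined && (args.temperature < 0 || args.temperature > 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  return problems;
}

// JSON-RPC request an MCP client sends to invoke the tool.
function analyzeImageCall(id: number, args: AnalyzeImageArgs) {
  return {
    jsonrpc: "2.0" as const,
    id: id,
    method: "tools/call",
    params: { name: "analyze_image", arguments: args },
  };
}

const screenshotArgs: AnalyzeImageArgs = {
  type: "file",
  data: "/path/to/screenshot.png",
  prompt: "Analyze this screenshot",
  format: "text",
};
console.log(validateAnalyzeImageArgs(screenshotArgs));
console.log(JSON.stringify(analyzeImageCall(3, screenshotArgs)));
```

In normal use the AI agent builds this request for you from a natural-language prompt; the sketch is mainly useful when testing the server by hand or writing a custom client.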
analyze_webpage_screenshot - Webpage Specialist
Designed specifically for web page analysis and debugging!
Features:
- Layout analysis
- Content extraction
- Navi
README truncated. View full README on GitHub.