Webcam/Screenshot Capture

Name: Webcam/Screenshot Capture
Rating: 4.5 (90 reviews)
Author: evalstate

Provides AI assistants access to your webcam and screen captures for visual context about your environment and current activity.

Enables capturing and analyzing live webcam images and screenshots for real-time visual context in AI applications.

111414 views14Local (stdio)

developer tools

GitHub

What it does

Capture live webcam images
Take desktop screenshots
Stream images to multiple AI clients
Send sampling requests with visual data
Access current image as a resource

Best for

AI assistants analyzing your physical environmentVisual debugging and screen assistanceMulti-modal AI applications requiring real-time visual input

Web-based sampling interfaceMulti-user streaming modeRemote instance available

About Webcam/Screenshot Capture

Webcam/Screenshot Capture is a community-built MCP server published by evalstate that provides AI assistants with tools and capabilities via the Model Context Protocol. Capture live webcam images or screenshots easily. Supports screenshot on Mac, screen snip on Mac, and screenshot screen It is categorized under developer tools. This server exposes 2 tools that AI clients can invoke during conversations and coding sessions.

How to install

You can install Webcam/Screenshot Capture in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

Webcam/Screenshot Capture is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

Tools (2)

capture

Gets the latest picture from the webcam. You can use this if the human asks questions about their immediate environment, if you want to see the human or to examine an object they may be referring to or showing you.

screenshot

Gets a screenshot of the current screen or window

⭐⭐ mcp-webcam 0.2.0 - the 50 Star Update ⭐⭐

In celebration of getting 52 GitHub stars, mcp-webcam 0.2.0 is here! Now supports streamable-http!! No installation required! - try it now at https://webcam.fast-agent.ai/. You can specify your own UserID by adding ?user=<YOUR_USER_ID> after the URL. Note this shared instance is for fun, not security - see below for instructions how to run your own copy locally.

In streamable-http mode multiple clients can connect simultaneously, and you can choose which is used for Sampling.

mcp_webcam_020_thumb

If we get to 100 stars I'll add another feature 😊.

Multi-user Mode

When run in Streaming mode, if you set an MCP_HOST environment variable the host name is used as a prefix in URL construction, and 5 character UserIDs are automatically generated when the User lands on the webpage.

mcp-webcam

MCP Server that provides access to your WebCam. Provides capture and screenshot tools to take an image from the Webcam, or take a screenshot. The current image is also available as a Resource.

MCP Sampling

mcp-webcam supports "sampling"! Press the "Sample" button to send a sampling request to the Client along with your entered message.

[!TIP] Claude Desktop does not currently support Sampling. If you want a Client that can handle multi-modal sampling request, try https://github.com/evalstate/fast-agent/ or VSCode (more details below).

Installation and Running

NPX

Install a recent version of NodeJS for your platform. The NPM package is @llmindset/mcp-webcam.

To start in STDIO mode: npx @llmindset/mcp-webcam. This starts the mcp-webcam UI on port 3333. Point your browser at http://localhost:3333 to get started.

To change the port: npx @llmindset/mcp-webcam 9999. This starts mcp-webcam the UI on port 9999.

For Streaming HTTP mode: npx @llmindset/mcp-webcam --streaming. This will make the UI available at http://localhost:3333 and the MCP Server available at http://localhost:3333/mcp.

Docker

You can run mcp-webcam using Docker. By default, it starts in streaming mode:

docker run -p 3333:3333 ghcr.io/evalstate/mcp-webcam:latest

Environment Variables

MCP_TRANSPORT_MODE - Set to stdio for STDIO mode, defaults to streaming
PORT - The port to run on (default: 3333)
BIND_HOST - Network interface to bind the server to (default: localhost)
MCP_HOST - Public-facing URL for user instructions and MCP client connections (default: http://localhost:3333)

Examples

# STDIO mode
docker run -p 3333:3333 -e MCP_TRANSPORT_MODE=stdio ghcr.io/evalstate/mcp-webcam:latest

# Custom port
docker run -p 8080:8080 -e PORT=8080 ghcr.io/evalstate/mcp-webcam:latest

# For cloud deployments with custom domain (e.g., Hugging Face Spaces)
docker run -p 3333:3333 -e MCP_HOST=https://evalstate-mcp-webcam.hf.space ghcr.io/evalstate/mcp-webcam:latest

# Complete cloud deployment example
docker run -p 3333:3333 -e MCP_HOST=https://your-domain.com ghcr.io/evalstate/mcp-webcam:latest

Clients

If you want a Client that supports sampling try:

fast-agent

Start the mcp-webcam in streaming mode, install uv and connect with:

uvx fast-agent-mcp go --url http://localhost:3333/mcp

fast-agent currently uses Haiku as its default model, so set an ANTHROPIC_API_KEY. If you want to use a different model, you can add --model on the command line. More instructions for installation and configuration are available here: https://fast-agent.ai/models/.

To start the server in STDIO mode, add the following to your fastagent.config.yaml

webcam_local:
   command: "npx"
   args: ["@llmindset/mcp-webcam"]

VSCode

VSCode versions 1.101.0 and above support MCP Sampling. Simply start mcp-webcam in streaming mode, and add http://localhost:3333/mcp as an MCP Server to get started.

Claude Desktop

Claude Desktop does NOT support Sampling. To run mcp-webcam from Claude Desktop, add the following to the mcpServers section of your claude_desktop_config.json file:

    "webcam": {
      "command": "npx",
      "args": [
        "-y",
        "@llmindset/mcp-webcam"
      ]
    }

Start Claude Desktop, and connect to http://localhost:3333. You can then ask Claude to get the latest picture from my webcam, or Claude, take a look at what I'm holding or what colour top am i wearing?. You can "freeze" the current image and that will be returned to Claude rather than a live capture.

You can ask for Screenshots - navigate to the browser so that you can guide the capture area when the request comes in. Screenshots are automatically resized to be manageable for Claude (useful if you have a 4K Screen). The button is there to allow testing of your platform specific Screenshot UX - it doesn't do anything other than prepare you for a Claude intiated request. NB this does not not work on Safari as it requires human initiation.

Other notes

That's it really.

This MCP Server was built to demonstrate exposing a User Interface on an MCP Server, and serving live resources back to Claude Desktop.

This project might prove useful if you want to build a local, interactive MCP Server.

Thanks to https://github.com/tadasant for help with testing and setup.

Please read the article at https://llmindset.co.uk/posts/2025/01/resouce-handling-mcp for more details about handling files and resources in LLM / MCP Chat Applications, and why you might want to do this.

Alternatives

Chrome DevTools MCP

chromedevtools

28.1k

AI-driven control of live Chrome via Chrome DevTools: browser automation, debugging, performance analysis and network mo

OfficialPopular

50711

Chrome DevTools

chromedevtools

28.1k

Use Chrome DevTools for web site test speed, debugging, and performance analysis. The essential chrome developer tools f

OfficialPopular

3.9k172

GitHub

github

27.6k

Extend your developer tools with GitHub MCP Server for advanced automation, supporting GitHub Student and student packag

OfficialRemotePopular

4.5k232

Repomix

yamadashy

22.3k

Optimize your codebase for AI with Repomix—transform, compress, and secure repos for easier analysis with modern AI tool

OfficialPopular

1.0k7

Related Skills

Browse all skills

ui-design-system

UI design system toolkit for Senior UI Designer including design token generation, component documentation, responsive design calculations, and developer handoff tools. Use for creating design systems, maintaining visual consistency, and facilitating design-dev collaboration.

ai-sdk

Answer questions about the AI SDK and help build AI-powered features. Use when developers: (1) Ask about AI SDK functions like generateText, streamText, ToolLoopAgent, embed, or tools, (2) Want to build AI agents, chatbots, RAG systems, or text generation features, (3) Have questions about AI providers (OpenAI, Anthropic, Google, etc.), streaming, tool calling, structured output, or embeddings, (4) Use React hooks like useChat or useCompletion. Triggers on: "AI SDK", "Vercel AI SDK", "generateText", "streamText", "add AI to my app", "build an agent", "tool calling", "structured output", "useChat".

api-documenter

Master API documentation with OpenAPI 3.1, AI-powered tools, and modern developer experience practices. Create interactive docs, generate SDKs, and build comprehensive developer portals. Use PROACTIVELY for API documentation or developer portal creation.

memory-forensics

Master memory forensics techniques including memory acquisition, process analysis, and artifact extraction using Volatility and related tools. Use when analyzing memory dumps, investigating incidents, or performing malware analysis from RAM captures.

openai-knowledge

Use when working with the OpenAI API (Responses API) or OpenAI platform features (tools, streaming, Realtime API, auth, models, rate limits, MCP) and you need authoritative, up-to-date documentation (schemas, examples, limits, edge cases). Prefer the OpenAI Developer Documentation MCP server tools when available; otherwise guide the user to enable `openaiDeveloperDocs`.

cli-builder

Guide for building TypeScript CLIs with Bun. Use when creating command-line tools, adding subcommands to existing CLIs, or building developer tooling. Covers argument parsing, subcommand patterns, output formatting, and distribution.