WebDriverIO MCP Server

Official
webdriverio

Automates web browsers (Chrome, Firefox, Safari, Edge) and mobile apps (iOS/Android) through natural language commands using WebDriverIO.

Enables Claude Desktop to automate web browsers and mobile applications (iOS/Android) using WebDriverIO. Supports browser automation, mobile app testing, touch gestures, app lifecycle management, and hybrid app context switching through natural language.

What it does

  • Automate web browser interactions and clicks
  • Control mobile app testing on iOS and Android
  • Execute touch gestures and app lifecycle management
  • Switch between native and web contexts in hybrid apps
  • Take screenshots and capture automation results
  • Run cross-platform tests through unified interface

Best for

  • QA engineers automating test scenarios
  • Developers testing web and mobile applications
  • Teams doing cross-platform automation testing

  • Supports all major browsers and mobile platforms
  • Natural language automation commands
  • Unified interface for web and mobile testing

About WebDriverIO MCP Server

WebDriverIO MCP Server is an official MCP server published by webdriverio that provides AI assistants with tools and capabilities via the Model Context Protocol. It enables Claude Desktop to automate browsers and iOS/Android apps with WebDriverIO. It is categorized under browser automation and developer tools.

How to install

You can install WebDriverIO MCP Server in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

WebDriverIO MCP Server is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

WebDriverIO MCP Server

A Model Context Protocol (MCP) server that enables Claude Desktop to interact with web browsers and mobile applications using WebDriverIO. Automate Chrome, Firefox, Edge, and Safari browsers plus iOS and Android apps—all through a unified interface.

Installation

Setup

Option 1: Configure Claude Desktop or Claude Code (Recommended)

Add the following configuration to your Claude MCP settings:

{
  "mcpServers": {
    "wdio-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "@wdio/mcp"
      ]
    }
  }
}
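
If your Claude MCP settings already list other servers, the entry above has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, not part of the server itself: `withWdioMcp` is a hypothetical helper, and the `filesystem` entry with its `some-other-server` package is a made-up placeholder for whatever is already configured.

```typescript
// Shape of the "mcpServers" section shown above.
interface McpServerEntry {
  command: string;
  args?: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `config` with the wdio-mcp entry added,
// leaving any servers that are already configured untouched.
function withWdioMcp(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "wdio-mcp": { command: "npx", args: ["-y", "@wdio/mcp"] },
    },
  };
}

// Placeholder for an existing configuration (names are illustrative only).
const existing: McpConfig = {
  mcpServers: {
    filesystem: { command: "npx", args: ["-y", "some-other-server"] },
  },
};

const merged = withWdioMcp(existing);
// Text you would write back to the settings file.
const mergedJson = JSON.stringify(merged, null, 2);
```

The helper copies instead of mutating, so the original settings object can still be compared against the result before anything is written to disk.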

Option 2: Global Installation

npm i -g @wdio/mcp

Then configure MCP:

{
  "mcpServers": {
    "wdio-mcp": {
      "command": "wdio-mcp"
    }
  }
}

Note: The npm package is @wdio/mcp, but the executable binary is wdio-mcp.

Restart Claude Desktop

⚠️ You may need to fully restart Claude Desktop. On Windows, use Task Manager to ensure it's completely closed before restarting.

📖 Need help? Read the official MCP configuration guide

Prerequisites For Mobile App Automation

  • Appium Server: Install globally with npm install -g appium
  • Platform Drivers:
    • iOS: appium driver install xcuitest (requires Xcode on macOS)
    • Android: appium driver install uiautomator2 (requires Android Studio)
  • Devices/Emulators:
    • iOS Simulator (macOS) or physical device
    • Android Emulator or physical device
  • For iOS Real Devices: You'll need the device's UDID (Unique Device Identifier)
    • Find UDID on macOS: Connect device → Open Finder → Select device → Click device name/model to reveal UDID
    • Find UDID on Windows: Connect device → iTunes or Apple Devices app → Click device icon → Click "Serial Number" to reveal UDID
    • Xcode method: Window → Devices and Simulators → Select device → UDID shown as "Identifier"

Start the Appium server before using mobile features:

appium
# Server runs at http://127.0.0.1:4723 by default
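
Before starting a mobile session it can help to confirm that Appium is actually listening. The sketch below assumes Node.js 18 or later (for the global `fetch`), an Appium 2 server with the default base path `/`, and the standard WebDriver `/status` endpoint; `appiumStatusUrl` and `isAppiumReady` are illustrative names, not tools provided by this server.

```typescript
// Build the status URL for an Appium server. Defaults match the address above.
function appiumStatusUrl(host = "127.0.0.1", port = 4723, basePath = "/"): string {
  const path = basePath.charAt(basePath.length - 1) === "/" ? basePath : basePath + "/";
  return `http://${host}:${port}${path}status`;
}

// Resolves to true only if the server answers and reports itself as ready.
// Any network failure (for example, the server is not started) yields false.
async function isAppiumReady(url: string = appiumStatusUrl()): Promise<boolean> {
  try {
    const res = await fetch(url);
    if (!res.ok) return false;
    const body = (await res.json()) as { value?: { ready?: boolean } };
    return body.value?.ready === true;
  } catch {
    return false;
  }
}

const defaultStatusUrl = appiumStatusUrl();
```

Call `isAppiumReady()` from your own setup script and start `appium` first if it resolves to `false`.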

Features

Browser Automation

  • Session Management: Start and close browser sessions (Chrome, Firefox, Edge, Safari) with headless/headed modes
  • Navigation & Interaction: Navigate URLs, click elements, fill forms, and retrieve content
  • Page Analysis: Get visible elements, accessibility trees, take screenshots
  • Cookie Management: Get, set, and delete cookies
  • Scrolling: Smooth scrolling with configurable distances

Mobile App Automation (iOS/Android)

  • Native App Testing: Test iOS (.app/.ipa) and Android (.apk) apps via Appium
  • Touch Gestures: Tap, swipe, long-press, drag-and-drop
  • App Lifecycle: Launch, background, terminate, check app state
  • Context Switching: Seamlessly switch between native and webview contexts for hybrid apps
  • Device Control: Rotate, lock/unlock, geolocation, keyboard control, notifications
  • Cross-Platform Selectors: Accessibility IDs, XPath, UiAutomator (Android), Predicates (iOS)
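
The selector strategies in the last bullet correspond to WebdriverIO's mobile selector prefixes. The sketch below shows how each strategy is written as a WebdriverIO selector string; that the MCP tools accept these strings unchanged is an assumption, and `MobileSelector` and `toWdioSelector` are illustrative names.

```typescript
type MobileSelector =
  | { kind: "accessibilityId"; id: string }
  | { kind: "xpath"; expression: string }
  | { kind: "uiAutomator"; expression: string } // Android only
  | { kind: "iosPredicate"; predicate: string }; // iOS only

// Convert a strategy plus value into WebdriverIO selector syntax.
function toWdioSelector(sel: MobileSelector): string {
  switch (sel.kind) {
    case "accessibilityId":
      return `~${sel.id}`;
    case "xpath":
      return sel.expression;
    case "uiAutomator":
      return `android=${sel.expression}`;
    case "iosPredicate":
      return `-ios predicate string:${sel.predicate}`;
  }
}

// Example values are hypothetical; use the identifiers from your own app.
const byAccessibilityId = toWdioSelector({ kind: "accessibilityId", id: "Skip" });
const byUiAutomator = toWdioSelector({
  kind: "uiAutomator",
  expression: 'new UiSelector().text("Skip")',
});
const byPredicate = toWdioSelector({
  kind: "iosPredicate",
  predicate: 'label == "Login"',
});
```

Accessibility IDs are the only strategy here that works on both platforms without changes, which is why they are usually tried first.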

Available Tools

Session Management

| Tool | Description |
| --- | --- |
| start_browser | Start a browser session (Chrome, Firefox, Edge, Safari; headless/headed, custom dimensions) |
| start_app_session | Start an iOS or Android app session via Appium (supports state preservation via noReset) |
| close_session | Close or detach from the current browser or app session (supports detach mode) |

Navigation & Page Interaction (Web & Mobile)

| Tool | Description |
| --- | --- |
| navigate | Navigate to a URL |
| get_visible_elements | Get visible, interactable elements on the page. Supports inViewportOnly (default: true) to filter viewport elements, and includeContainers (default: false) to include layout containers on mobile |
| get_accessibility | Get accessibility tree with semantic element information |
| scroll | Scroll in a direction (up/down) by specified pixels |
| take_screenshot | Capture a screenshot |
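
An MCP client invokes these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for get_visible_elements with both documented options overridden. The envelope follows the Model Context Protocol; `toolCall` is an illustrative helper, and Claude normally constructs these messages for you.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Wrap a tool name and its arguments in a JSON-RPC request envelope.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Include off-screen elements and layout containers (both non-default).
const request = toolCall("get_visible_elements", {
  inViewportOnly: false,
  includeContainers: true,
});
const requestJson = JSON.stringify(request);
```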

Element Interaction (Web & Mobile)

| Tool | Description |
| --- | --- |
| click_element | Click an element |
| set_value | Type text into input fields |

Cookie Management (Web)

| Tool | Description |
| --- | --- |
| get_cookies | Get all cookies or a specific cookie by name |
| set_cookie | Set a cookie with name, value, and optional attributes |
| delete_cookies | Delete all cookies or a specific cookie |

Mobile Gestures (iOS/Android)

| Tool | Description |
| --- | --- |
| tap_element | Tap an element by selector or coordinates |
| swipe | Swipe in a direction (up/down/left/right) |
| drag_and_drop | Drag from one location to another |

App Lifecycle (iOS/Android)

| Tool | Description |
| --- | --- |
| get_app_state | Check app state (installed, running, background, foreground) |
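
Appium itself reports app state as a numeric code from 0 to 4. The sketch below maps those codes to readable labels, assuming get_app_state is backed by Appium's app-state query; the exact format the tool returns is not documented here, so treat `describeAppState` as an illustration only.

```typescript
// Numeric codes as defined by Appium's app-state query.
const APP_STATE_LABELS: Record<number, string> = {
  0: "not installed",
  1: "installed, not running",
  2: "running in background (suspended)",
  3: "running in background",
  4: "running in foreground",
};

// Translate a state code into a label, tolerating unexpected values.
function describeAppState(code: number): string {
  return APP_STATE_LABELS[code] ?? `unknown state (${code})`;
}

const foreground = describeAppState(4);
```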

Context Switching (Hybrid Apps)

| Tool | Description |
| --- | --- |
| get_contexts | List available contexts (NATIVE_APP, WEBVIEW_*) |
| get_current_context | Show the currently active context |
| switch_context | Switch between native and webview contexts |
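
A typical hybrid-app flow lists the contexts, picks a webview, and switches to it. The sketch below covers only the selection step, assuming get_contexts yields plain context names; `firstWebview` and the example context name are hypothetical.

```typescript
// Return the first WEBVIEW_* context, or undefined when the app exposes none.
function firstWebview(contexts: string[]): string | undefined {
  for (const name of contexts) {
    if (name.indexOf("WEBVIEW_") === 0) return name;
  }
  return undefined;
}

// Illustrative context list; real names depend on the app under test.
const contexts = ["NATIVE_APP", "WEBVIEW_com.example.app"];
const target = firstWebview(contexts);
```

When the result is `undefined`, the webview may not have loaded yet, so listing the contexts again after the web content appears is usually the next step.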

Device Control (iOS/Android)

| Tool | Description |
| --- | --- |
| rotate_device | Rotate to portrait or landscape |
| hide_keyboard | Hide on-screen keyboard |
| get_geolocation / set_geolocation | Get or set device GPS location |

Usage Examples

Real-World Test Cases

Example 1: Testing Demo Android App (Book Scanning)

Test the Demo Android app at C:\Users\demo-liveApiGbRegionNonMinifiedRelease-3018788.apk on emulator-5554:
1. Start the app with auto-grant permissions
2. Get visible elements on the onboarding screen
3. Tap "Skip" to bypass onboarding
4. Verify main screen loads
5. Take a screenshot
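
The steps above roughly translate into the following sequence of tool calls. Tool names come from the Available Tools tables. Every argument name (`platform`, `appPath`, `deviceName`, `autoGrantPermissions`, `selector`) and the `~Skip` accessibility ID are assumptions made for illustration, not the server's documented schema.

```typescript
interface PlannedCall {
  tool: string; // tool name from the Available Tools tables
  args: Record<string, unknown>; // argument names are illustrative guesses
}

const bookScanningPlan: PlannedCall[] = [
  {
    tool: "start_app_session",
    args: {
      platform: "Android",
      appPath: "C:\\Users\\demo-liveApiGbRegionNonMinifiedRelease-3018788.apk",
      deviceName: "emulator-5554",
      autoGrantPermissions: true,
    },
  },
  { tool: "get_visible_elements", args: {} }, // inspect the onboarding screen
  { tool: "tap_element", args: { selector: "~Skip" } }, // bypass onboarding
  { tool: "get_visible_elements", args: {} }, // verify the main screen loaded
  { tool: "take_screenshot", args: {} },
];

const toolOrder = bookScanningPlan.map((call) => call.tool);
```

In practice Claude chooses the calls and arguments itself from the prompt; a plan like this is mainly useful for reviewing what a run is expected to do.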

Example 2: Testing World of Books E-commerce Site

You are a Testing expert, and want to assess the basic workflows of worldofbooks.com:
- Open World of Books (accept all cookies)
- Get visible elements to see navigation structure
- Search for a fiction book
- Choose one and validate if there are NEW and used book options
- Report your findings at the end

Browser Automation

Basic web testing prompt:

You are a Testing expert, and want to assess the basic workflows of a web application:
- Open World of Books (accept all cookies)
- Search for a fiction book
- Choose one and validate if there are NEW and used book options
- Report your findings at the end

Browser configuration options:

// Default settings (headed mode, 1280x1080)
start_browser()

// Firefox
start_browser({browser: 'firefox'})

// Edge
start_browser({browser: 'edge'})

// Safari (headed only; requires macOS)
start_browser({browser: 'safari'})

// Headless mode
start_browser({headless: true})

// Custom dimensions
start_browser({windowWidth: 1920, windowHeight: 1080})

// Headless with custom dimensions
start_browser({headless: true, windowWidth: 1920, windowHeight: 1080})

// Pass custom capabilities (e.g. Chrome extensions, profile, prefs)
start_browser({
  headless: false,
  capabilities: {
    'goog:chromeOptions': {
      args: ['--user-data-dir=/tmp/wdio-mcp-profile', '--load-extension=/path/to/unpacked-extension']
    }
  }
})
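
The options used above can be summarized as a type with defaults. This is a sketch inferred from the examples, not the server's published schema: the default browser is assumed to be Chrome, and rejecting headless Safari is an assumption based on the "headed only" comment.

```typescript
type BrowserName = "chrome" | "firefox" | "edge" | "safari";

interface StartBrowserOptions {
  browser?: BrowserName;
  headless?: boolean;
  windowWidth?: number;
  windowHeight?: number;
  capabilities?: Record<string, unknown>;
}

interface ResolvedBrowserOptions {
  browser: BrowserName;
  headless: boolean;
  windowWidth: number;
  windowHeight: number;
  capabilities: Record<string, unknown>;
}

// Fill in the defaults from the first example (headed mode, 1280x1080).
function resolveBrowserOptions(opts: StartBrowserOptions = {}): ResolvedBrowserOptions {
  const resolved: ResolvedBrowserOptions = {
    browser: opts.browser ?? "chrome", // assumed default
    headless: opts.headless ?? false,
    windowWidth: opts.windowWidth ?? 1280,
    windowHeight: opts.windowHeight ?? 1080,
    capabilities: opts.capabilities ?? {},
  };
  if (resolved.browser === "safari" && resolved.headless) {
    throw new Error("Safari is listed as headed only");
  }
  return resolved;
}

const defaults = resolveBrowserOptions();
const headlessFirefox = resolveBrowserOptions({ browser: "firefox", headless: true });
```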

Mobile App Automation

Testing an iOS app on simulator:

Test my iOS app located at /path/to/MyApp.app on iPhone 15 Pro simulator:
1. Start the app session
2. Tap the login button
3. Enter "testuser" in the username field
4. Take a screenshot of the home screen
5. Close the session

Preserving app state between sessions:

Test my Android app without resetting data:
1. Start app session with noReset: true and fullReset: false
2. App launches with existing login state and user data preserved
3. Run test scenarios
4. Close session (app remains installed with data intact)
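
At the Appium level, state preservation comes down to two capabilities. The sketch below builds an Android capability set with `appium:noReset` and `appium:fullReset`. The capability names are standard Appium ones, while `androidCapabilities`, its input shape, and the APK path are hypothetical; how the MCP server maps its own parameters onto them is an assumption.

```typescript
interface AndroidSessionInput {
  appPath: string;
  deviceName: string;
  noReset?: boolean;
  fullReset?: boolean;
}

// Build W3C capabilities for a UiAutomator2 session.
function androidCapabilities(input: AndroidSessionInput): Record<string, unknown> {
  const noReset = input.noReset ?? false;
  const fullReset = input.fullReset ?? false;
  if (noReset && fullReset) {
    // Keeping data and wiping it are contradictory requests.
    throw new Error("noReset and fullReset should not both be true");
  }
  return {
    platformName: "Android",
    "appium:automationName": "UiAutomator2",
    "appium:deviceName": input.deviceName,
    "appium:app": input.appPath,
    "appium:noReset": noReset,
    "appium:fullReset": fullReset,
  };
}

// Keep login state and user data between sessions.
const preservedState = androidCapabilities({
  appPath: "/path/to/app.apk", // placeholder path
  deviceName: "emulator-5554",
  noReset: true,
  fullReset: false,
});
```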

Testing an iOS app on real device:

Test my iOS

---

*README truncated. [View full README on GitHub](https://github.com/webdriverio/mcp).*
