browserbase-browser-automation


Automate web browser interactions using the stagehand CLI for AI agents

Install

mkdir -p .claude/skills/browserbase-browser-automation && curl -L -o skill.zip "https://mcp.directory/api/skills/download/2427" && unzip -o skill.zip -d .claude/skills/browserbase-browser-automation && rm skill.zip

Installs to .claude/skills/browserbase-browser-automation

About this skill

Browser Automation Skill

Automate web browser interactions using the stagehand CLI for AI agents.

🚨 CRITICAL - READ THIS FIRST 🚨

A Browserbase session with stealth/proxy/captcha has been pre-created for you.

YOU MUST USE stagehand --ws $BROWSERBASE_CONNECT_URL FOR EVERY COMMAND.

DO NOT use stagehand open without --ws - it will launch a LOCAL browser!

When to Use

Use this skill when the user asks to:

  • Browse websites or navigate to URLs
  • Extract data from web pages
  • Fill forms or click buttons
  • Take screenshots of web pages
  • Interact with web applications
  • Automate multi-step web workflows

Core Concepts

The stagehand CLI provides:

  • Element references - snapshot labels each element with a ref such as [0-5]; pass it as @0-5 (the @ prefix is optional) to click or fill
  • Browserbase support - Connect to pre-created cloud browser sessions with --ws

Environment Selection

CRITICAL: A Browserbase session with stealth/proxy/captcha has been pre-created for you.

The session URL is in the BROWSERBASE_CONNECT_URL environment variable.

YOU MUST ALWAYS use stagehand --ws $BROWSERBASE_CONNECT_URL for EVERY command:

stagehand --ws $BROWSERBASE_CONNECT_URL open https://example.com

WHY:

  • ✅ Browser runs in Browserbase cloud (NOT locally)
  • ✅ Advanced stealth mode enabled (bypasses Cloudflare)
  • ✅ Residential proxies enabled
  • ✅ CAPTCHA solving enabled
  • ✅ Session recordings at: $BROWSERBASE_DEBUG_URL

IF YOU FORGET --ws $BROWSERBASE_CONNECT_URL:

  • ❌ Will launch LOCAL Chrome browser
  • ❌ Will NOT use stealth/proxy/captcha
  • ❌ Will fail the evaluation
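One way to make the --ws rule hard to forget is a small shell wrapper. The sketch below is an illustration, not part of the stagehand CLI: the function name bb is hypothetical, and it assumes only what this section states, namely that the session URL lives in BROWSERBASE_CONNECT_URL and that --ws precedes the subcommand.

```shell
# Hypothetical helper (not shipped with stagehand): run every command
# against the pre-created Browserbase session.
bb() {
  # Refuse to run when the session URL is missing, so a local browser is
  # never launched by accident.
  if [ -z "${BROWSERBASE_CONNECT_URL:-}" ]; then
    echo "BROWSERBASE_CONNECT_URL is not set; refusing to run stagehand" >&2
    return 1
  fi
  # Quote the URL so the shell does not split or glob-expand it.
  stagehand --ws "$BROWSERBASE_CONNECT_URL" "$@"
}
```

With this defined, bb open https://example.com stands in for the full command, and the call fails fast instead of falling back to a local browser when the variable is unset.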

Quick Start Workflow

# 1. Navigate to page (connects to pre-created Browserbase session)
stagehand --ws $BROWSERBASE_CONNECT_URL open https://example.com

# 2. Get page structure with element refs
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c

# Output includes refs like [0-5], [1-2]:
# RootWebArea "Example" url="https://example.com"
#   [0-0] link "Home"
#   [0-1] link "About"
#   [0-2] button "Sign In"

# 3. Interact using refs
stagehand --ws $BROWSERBASE_CONNECT_URL click @0-2
stagehand --ws $BROWSERBASE_CONNECT_URL fill @0-5 "search query"

# 4. Re-snapshot to verify changes
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c

# 5. Stop when done (optional, session persists)
stagehand --ws $BROWSERBASE_CONNECT_URL stop

Navigation Commands

REMEMBER: Use stagehand --ws $BROWSERBASE_CONNECT_URL for ALL commands below.

# Navigate to URL
stagehand --ws $BROWSERBASE_CONNECT_URL open <url>

# With custom timeout for slow pages
stagehand --ws $BROWSERBASE_CONNECT_URL open <url> --timeout 60000

# Page navigation
stagehand --ws $BROWSERBASE_CONNECT_URL reload
stagehand --ws $BROWSERBASE_CONNECT_URL back
stagehand --ws $BROWSERBASE_CONNECT_URL forward

Element Interaction

Get Page Structure

# Get accessibility tree with element refs
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c

# Get full snapshot with XPath/CSS mappings
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot --json

Click Elements

# Click by ref (from snapshot)
stagehand --ws $BROWSERBASE_CONNECT_URL click @0-5
stagehand --ws $BROWSERBASE_CONNECT_URL click 0-5       # @ prefix optional

# Click with options
stagehand --ws $BROWSERBASE_CONNECT_URL click @0-5 -b right -c 2  # Right-click twice

# Click at coordinates
stagehand --ws $BROWSERBASE_CONNECT_URL click_xy 100 200

Form Filling

# Fill input (auto-presses Enter by default)
stagehand --ws $BROWSERBASE_CONNECT_URL fill @0-5 "my value"

# Fill without pressing Enter
stagehand --ws $BROWSERBASE_CONNECT_URL fill @0-5 "my value" --no-press-enter

# Select dropdown options
stagehand --ws $BROWSERBASE_CONNECT_URL select @0-8 "Option 1" "Option 2"

Typing

# Type text naturally
stagehand --ws $BROWSERBASE_CONNECT_URL type "Hello, world!"

# Type with delay between characters
stagehand --ws $BROWSERBASE_CONNECT_URL type "slow typing" -d 100

# Press special keys
stagehand --ws $BROWSERBASE_CONNECT_URL press Enter
stagehand --ws $BROWSERBASE_CONNECT_URL press Tab
stagehand --ws $BROWSERBASE_CONNECT_URL press "Cmd+A"

Data Extraction

# Get page info
stagehand --ws $BROWSERBASE_CONNECT_URL get url
stagehand --ws $BROWSERBASE_CONNECT_URL get title
stagehand --ws $BROWSERBASE_CONNECT_URL get text body
stagehand --ws $BROWSERBASE_CONNECT_URL get html @0-5

# Take screenshot
stagehand --ws $BROWSERBASE_CONNECT_URL screenshot page.png
stagehand --ws $BROWSERBASE_CONNECT_URL screenshot -f        # Full page
stagehand --ws $BROWSERBASE_CONNECT_URL screenshot --type jpeg

# Get element coordinates
stagehand --ws $BROWSERBASE_CONNECT_URL get box @0-5  # Returns center x,y

Waiting

# Wait for page load
stagehand --ws $BROWSERBASE_CONNECT_URL wait load
stagehand --ws $BROWSERBASE_CONNECT_URL wait load networkidle

# Wait for element
stagehand --ws $BROWSERBASE_CONNECT_URL wait selector ".my-class"
stagehand --ws $BROWSERBASE_CONNECT_URL wait selector ".my-class" -t 10000 -s visible

# Wait for time
stagehand --ws $BROWSERBASE_CONNECT_URL wait timeout 2000

Multi-Tab Support

# List all tabs
stagehand --ws $BROWSERBASE_CONNECT_URL pages

# Open new tab
stagehand --ws $BROWSERBASE_CONNECT_URL newpage https://example.com

# Switch tabs
stagehand --ws $BROWSERBASE_CONNECT_URL tab_switch 1

# Close tab
stagehand --ws $BROWSERBASE_CONNECT_URL tab_close 2

Network Capture

Capture HTTP requests for inspection:

# Start capturing
stagehand --ws $BROWSERBASE_CONNECT_URL network on

# Get capture directory
stagehand --ws $BROWSERBASE_CONNECT_URL network path

# Stop capturing
stagehand --ws $BROWSERBASE_CONNECT_URL network off

# Clear captures
stagehand --ws $BROWSERBASE_CONNECT_URL network clear

Captured requests are saved as directories with request.json and response.json.
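To inspect what was captured, the directory layout described above can be walked with plain shell. This is a sketch under two assumptions: each capture is an immediate subdirectory of the capture directory, and network path prints that directory on standard output. The function name list_captures is hypothetical.

```shell
# Hypothetical helper: print each capture directory under "$1" that holds
# both request.json and response.json.
list_captures() {
  capture_dir=$1
  for entry in "$capture_dir"/*/; do
    # Skip the unexpanded glob when there are no subdirectories.
    [ -d "$entry" ] || continue
    if [ -f "${entry}request.json" ] && [ -f "${entry}response.json" ]; then
      # Strip the trailing slash left by the glob.
      printf '%s\n' "${entry%/}"
    fi
  done
}
```

If the stated assumption about network path holds, its output can be passed as the single argument to list_captures.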

Daemon Control

# Check status
stagehand --ws $BROWSERBASE_CONNECT_URL status

# Stop browser
stagehand --ws $BROWSERBASE_CONNECT_URL stop

# Force stop
stagehand --ws $BROWSERBASE_CONNECT_URL stop --force

Element References

After snapshot, elements have refs you can use:

RootWebArea "Login Page"
  [0-0] heading "Welcome"
  [0-1] textbox "Email" name="email"
  [0-2] textbox "Password" name="password"
  [0-3] button "Sign In"

Use these refs directly:

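# fill presses Enter by default; --no-press-enter avoids submitting the form early
stagehand --ws $BROWSERBASE_CONNECT_URL fill @0-1 "user@example.com" --no-press-enter
stagehand --ws $BROWSERBASE_CONNECT_URL fill @0-2 "mypassword" --no-press-enter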
stagehand --ws $BROWSERBASE_CONNECT_URL click @0-3
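Refs can also be looked up from saved snapshot text instead of being read off by eye. The helper below is a sketch, assuming the snapshot output follows the layout in the example above (a bracketed ref, then the role, then the quoted name); ref_for is a hypothetical name, not a stagehand command.

```shell
# Hypothetical helper: read snapshot text on stdin and print the ref of the
# first element with the given role and name, in @N-M form.
ref_for() {
  role=$1
  name=$2
  # Keep lines such as:   [0-3] button "Sign In"
  # then rewrite the leading [0-3] into @0-3 and stop at the first match.
  grep -F -- "] $role \"$name\"" |
    sed -n 's/^[^[]*\[\([0-9][0-9]*-[0-9][0-9]*\)].*/@\1/p' |
    head -n 1
}
```

Piping the Login Page snapshot above into ref_for button "Sign In" prints @0-3. The output is empty when nothing matches, which is a cue to take a fresh snapshot.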

Best Practices

1. Always snapshot after navigation

stagehand --ws $BROWSERBASE_CONNECT_URL open https://example.com
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c  # Get refs

2. Re-snapshot after actions that change the page

stagehand --ws $BROWSERBASE_CONNECT_URL click @0-5
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c  # Get new state

3. Use refs instead of selectors

# ✅ Good: Use refs from snapshot
stagehand --ws $BROWSERBASE_CONNECT_URL click @0-5

# ❌ Avoid: Manual selectors (refs are more reliable)
stagehand --ws $BROWSERBASE_CONNECT_URL click "#submit-button"

4. Wait for elements when needed

stagehand --ws $BROWSERBASE_CONNECT_URL open https://slow-site.com
stagehand --ws $BROWSERBASE_CONNECT_URL wait selector ".content" -s visible
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c

5. Always use --ws $BROWSERBASE_CONNECT_URL

# ✅ Correct: Remote browser (connects to pre-created Browserbase session)
stagehand --ws $BROWSERBASE_CONNECT_URL open https://example.com

# ❌ Wrong: Local browser (will fail in evals, launches Chrome locally)
stagehand open https://example.com
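A saved script can be checked for this mistake before it runs. The function below is a rough sketch with a hypothetical name, missing_ws; it is a line-based text check, so a command continued across lines with a backslash would need a closer look.

```shell
# Hypothetical check: print the numbered lines of script "$1" that mention
# stagehand without --ws.
missing_ws() {
  # grep -n prefixes each match with "LINE:"; drop comment lines, then
  # drop lines that already pass --ws.
  grep -n 'stagehand' "$1" |
    grep -v '^[0-9]*:[[:space:]]*#' |
    grep -v -e '--ws'
}
```

The function exits non-zero when it finds nothing to report, so an empty result means every stagehand line in the file includes --ws.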

Common Patterns

Login Flow

stagehand --ws $BROWSERBASE_CONNECT_URL open https://example.com/login
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c
# [0-5] textbox "Email"
# [0-6] textbox "Password"
# [0-7] button "Sign In"
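# fill presses Enter by default; --no-press-enter avoids submitting the form early
stagehand --ws $BROWSERBASE_CONNECT_URL fill @0-5 "user@example.com" --no-press-enter
stagehand --ws $BROWSERBASE_CONNECT_URL fill @0-6 "password123" --no-press-enter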
stagehand --ws $BROWSERBASE_CONNECT_URL click @0-7
stagehand --ws $BROWSERBASE_CONNECT_URL wait load
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c  # Verify logged in

Search and Extract

stagehand --ws $BROWSERBASE_CONNECT_URL open https://example.com
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c
# [0-3] textbox "Search"
stagehand --ws $BROWSERBASE_CONNECT_URL fill @0-3 "my query"  # fill presses Enter by default, submitting the query
stagehand --ws $BROWSERBASE_CONNECT_URL wait selector ".results"
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c
# [1-0] text "Result 1"
# [1-1] text "Result 2"
stagehand --ws $BROWSERBASE_CONNECT_URL get text @1-0
stagehand --ws $BROWSERBASE_CONNECT_URL get text @1-1

Multi-Page Navigation

stagehand --ws $BROWSERBASE_CONNECT_URL open https://example.com
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c
# [0-5] link "Next Page"
stagehand --ws $BROWSERBASE_CONNECT_URL click @0-5
stagehand --ws $BROWSERBASE_CONNECT_URL wait load
stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c  # Get new page structure

Troubleshooting

Browser won't start

  • Check that stagehand is installed: which stagehand
  • Check status: stagehand --ws $BROWSERBASE_CONNECT_URL status
  • Force stop and retry: stagehand --ws $BROWSERBASE_CONNECT_URL stop --force

Element not found

  • Take a snapshot to verify refs: stagehand --ws $BROWSERBASE_CONNECT_URL snapshot -c
  • Wait for element to appear: stagehand --ws $BROWSERBASE_CONNECT_URL wait selector ...
  • Check if ref changed after page update

Page not loading

  • Increase timeout: stagehand --ws $BROWSERBASE_CONNECT_URL open <url> --timeout 60000
  • Wait for load state: stagehand --ws $BROWSERBASE_CONNECT_URL wait load networkidle
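When a page loads intermittently, the commands above can be retried a fixed number of times. The helper below is a generic shell sketch, not a stagehand feature; the name retry and its argument order (attempts, delay in seconds, then the command) are assumptions made for illustration.

```shell
# Hypothetical helper: run a command up to N times, pausing between attempts.
# Usage: retry ATTEMPTS DELAY_SECONDS command [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  # Re-run the command until it succeeds or the attempt budget is used up.
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}
```

For example, retry 3 5 followed by the full open command with --timeout 60000 would try the navigation up to three times, five seconds apart, and return non-zero only if every attempt fails.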

Commands failing with "session not found"

  • The daemon auto-recovers from crashes
  • If issues p

Content truncated.
