# browser-test

Validate that a feature works by driving a real browser with Playwright MCP. No test files — just interactive verification.
## Install

```shell
mkdir -p .claude/skills/browser-test \
  && curl -L -o skill.zip "https://mcp.directory/api/skills/download/4181" \
  && unzip -o skill.zip -d .claude/skills/browser-test \
  && rm skill.zip
```

Installs to `.claude/skills/browser-test`.
## About this skill

### Browser Test — Interactive Feature Validation

You are the orchestrator. You do NOT drive the browser yourself. You spawn a focused sub-agent to do the browser work, monitor its progress, and collect results.
### Step 1: Prepare

Parse `$ARGUMENTS` for:

- Port (optional): a number (e.g. `5570`) or `:<port>` format
- Feature (optional): a description of what to verify, or a path to a `specs/*.feature` file
If a feature file path is given, read it now and extract the scenarios into a concrete checklist. If a plain description is given, use it directly. If neither is provided, use the default smoke test: app loads, sign in works, dashboard renders after auth.
#### Resolve the port

- Explicit port in `$ARGUMENTS` → use it
- `.dev-port` file in the repo root → source it for `APP_PORT`
- No port and no `.dev-port`? → run `scripts/dev-up.sh` and then read the `.dev-port` it creates
```shell
# .dev-port format (written by dev-up.sh):
APP_PORT=5560
BASE_URL=http://localhost:5560
COMPOSE_PROJECT_NAME=langwatch-abcd1234
```
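The resolution order above can be sketched as a small shell helper. This is a hypothetical sketch, not part of the skill itself; it assumes the `.dev-port` format shown and the `scripts/dev-up.sh` entry point:

```shell
# Hypothetical sketch of the port-resolution order described above.
resolve_port() {
  if [ -n "$1" ]; then
    echo "$1"                        # 1. explicit port from $ARGUMENTS
  elif [ -f .dev-port ]; then
    . ./.dev-port                    # 2. source .dev-port for APP_PORT
    echo "$APP_PORT"
  else
    scripts/dev-up.sh >/dev/null     # 3. start the stack, then read the file it writes
    . ./.dev-port
    echo "$APP_PORT"
  fi
}
```

Called as `resolve_port "$EXPLICIT_PORT"`, it falls through the three cases in order.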
#### Resolve the feature
If a feature file was given, read it and turn each scenario into a numbered verification step. Example:
```
Feature file: specs/features/beta-pill.feature
Scenarios:
1. Navigate to dashboard → verify purple "Beta" badge next to Suites in sidebar
2. Hover over badge → verify popover appears with beta disclaimer text
3. Press Tab to focus badge → verify same popover appears via keyboard
```
#### Create artifact directory

```
browser-tests/<feature-name>/<YYYY-MM-DD>/screenshots/
```

Derive `<feature-name>` from: feature filename (without extension) > slugified description > branch name suffix.
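That precedence might be implemented like this (a hypothetical helper; the function names and the exact slug rules are illustrative, not prescribed by the skill):

```shell
# Hypothetical sketch of the <feature-name> precedence:
# feature filename > slugified description > branch name suffix.
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}
feature_name() {
  feature_file="$1"; description="$2"
  if [ -n "$feature_file" ]; then
    basename "$feature_file" .feature            # e.g. beta-pill
  elif [ -n "$description" ]; then
    slugify "$description"
  else
    git rev-parse --abbrev-ref HEAD | sed 's|.*/||'  # branch name suffix
  fi
}
```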
### Step 2: Determine data seeding needs
Before verification, decide what data the feature under test requires. Many features need pre-existing data to be meaningful (e.g., a suites page needs at least one suite with runs, a trace viewer needs traces, an evaluations dashboard requires completed evaluations).
- Analyze the verification steps from Step 1. For each step, ask: "What data must already exist for this to be testable?"
- Build a seeding checklist — the minimal set of entities needed. Examples:
- Suites page → create one suite with a name and at least one scenario
- Trace viewer → send at least one trace via the SDK or API
- Evaluation results → trigger a batch run and wait for results
- Prefer seeding through the UI — navigate to create forms, fill them in, submit. This exercises the same path a user would and is the most reliable approach in dev mode.
- Fall back to API/SDK only for bulk data that would be impractical to create through the UI (e.g., 50 traces for a pagination test).
- Keep seeding MINIMAL — only create what is strictly needed to verify the feature. Do not populate the app with extra data "just in case."
Include the seeding instructions in the sub-agent prompt (Step 3) so the sub-agent creates the data before verifying.
### Step 3: Spawn the browser agent
Use the Agent tool to spawn a sub-agent. Give it everything it needs in the prompt — port, verification steps, credentials, artifact path. The sub-agent has access to Playwright MCP tools and Bash.
Critical: The sub-agent prompt must include ALL of the following. Do not assume it knows anything — it starts with zero context:
You are a browser test agent. Your ONLY job is to drive a browser and verify features.
## Your mission
<paste the numbered verification steps here>
## Data seeding
Before verifying, create the minimal data the feature needs. Follow the checklist below.
Prefer seeding through the UI; use API/SDK only when the checklist explicitly calls for it:
<paste the seeding checklist from Step 2 here — e.g.:>
- Navigate to Suites → click "Create Suite" → fill name "Test Suite" → save
- Open the suite → add a scenario → run it once
- Wait for the run to complete before proceeding to verification
Only create what is listed above. Do not add extra data beyond what is needed.
## Connection
- App URL: http://localhost:<port>
- Browser: Chromium (headless) — use Playwright MCP tools
- Save screenshots to: <absolute artifact path>/screenshots/
## Auth (NextAuth credentials form, NOT Auth0)
- Navigate to the app → redirects to /auth/signin (Email + Password form)
- Email: browser-test@langwatch.ai
- Password: BrowserTest123!
- If "Register new account" needed, register first with same credentials
- Org name if onboarding: Browser Test Org
- After auth: dashboard shows "Hello, Browser" + "Browser Test Org" header
## How to interact
- Use browser_snapshot (accessibility tree) for finding elements — it's faster than screenshots
- Use browser_take_screenshot to capture evidence at each key step
- Use browser_wait_for with generous timeouts (60-120s for first page loads, dev mode is slow)
- Number screenshots sequentially: 01-sign-in.png, 02-dashboard.png, etc.
## Guardrails — READ THESE
- You have a maximum of 40 tool calls (seeding + verification). If you haven't finished, report what you verified and what's left.
- Do NOT debug app issues. If something doesn't work, screenshot it, mark it FAIL, and move on.
- Do NOT modify any files, fix any code, or investigate root causes.
- Do NOT go off-script. Only verify the steps listed above.
- If a step fails, take a screenshot, record FAIL, and continue to the next step.
- When done, return a markdown summary table: | # | Step | Result | Screenshot |
### Step 4: Collect results

When the sub-agent returns:

- Parse its summary table
- Write the report to `browser-tests/<feature-name>/<YYYY-MM-DD>/report.md`:
```markdown
# Browser Test: <feature-name>

**Date:** YYYY-MM-DD
**App:** http://localhost:<port>
**Browser:** Chromium (headless)
**Branch:** <current branch>
**PR:** #<number> (if known)

## Results

| # | Scenario | Result | Screenshot |
|---|----------|--------|------------|
| 1 | <name> | PASS | screenshots/01-xxx.png |

## Failures (if any)
- **Scenario 2:** Expected X but saw Y.

## Notes
<any observations>
```
- If you started the app (no `.dev-port` existed before), tear it down: `scripts/dev-down.sh`
### Step 5: Upload screenshots and update the PR

Screenshots are uploaded to img402.dev (free, no auth) rather than committed to git. This avoids binary bloat in the repo.
- Upload each screenshot to img402.dev:

  ```shell
  curl -s -F "image=@browser-tests/<feature>/<date>/screenshots/01-xxx.jpeg" https://img402.dev/api/free
  # Returns: {"url":"https://i.img402.dev/abc123.jpg", ...}
  ```

  Collect the returned URLs for each screenshot.
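A minimal upload loop might look like the sketch below. The helper names are hypothetical, `jq` is assumed to be installed, and the response shape is the one documented above:

```shell
# Sketch: upload screenshots and build markdown image cells for the PR table.
upload_shot() {               # upload_shot <file> -> hosted URL (network call)
  curl -s -F "image=@$1" https://img402.dev/api/free | jq -r '.url'
}
img_cell() {                  # img_cell <label> <url> -> inline markdown image
  printf '![%s](%s)' "$1" "$2"
}
# Example usage:
# for f in browser-tests/<feature>/<date>/screenshots/*.png; do
#   img_cell "$(basename "$f")" "$(upload_shot "$f")"
# done
```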
- Update the PR description with the results table, using img402 URLs so images render inline. Read the current PR body first (`gh pr view --json body`), then append a new section:

  ```markdown
  ## Browser Test: <feature-name>

  | # | Scenario | Result | Screenshot |
  |---|----------|--------|------------|
  | 1 | <name> | PASS | ![<name>](<img402-url>) |
  ```

  Use `gh api repos/langwatch/langwatch/pulls/<number> -X PATCH -f body="..."` to update (not `gh pr edit`).
- Do NOT commit `browser-tests/` — it is gitignored. Screenshots are ephemeral local artifacts; the img402 URLs in the PR body are the permanent record.
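The append-don't-overwrite flow can be sketched as follows. This is a hypothetical sketch: `$pr`, `$FEATURE`, and `$TABLE` are placeholders for values resolved earlier, and only the `gh` commands named above are used:

```shell
# Sketch: append the browser-test section to the existing PR body.
append_section() {            # append_section <body> <section> -> new body
  printf '%s\n\n%s' "$1" "$2"
}
# body=$(gh pr view "$pr" --json body -q .body)
# section="## Browser Test: $FEATURE"  # followed by the results table ($TABLE)
# gh api "repos/langwatch/langwatch/pulls/$pr" -X PATCH \
#   -f body="$(append_section "$body" "$section")"
```

Reading the body first and patching the concatenation preserves whatever the PR description already contains.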
### Step 6: Report
Return the summary to the user/orchestrator. Include:
- The results table
- Link to the PR where screenshots are now visible
- Note: img402.dev's free tier has 7-day retention; after that the screenshots expire and the images in the PR body render as broken links
## Rules
- You are the orchestrator, not the browser driver. Spawn a sub-agent for all browser work.
- Never ask the user for anything. Ports, credentials, features, browser choice — all resolved automatically.
- Read `HOW_TO.md` in this skill directory before your first run — it has gotchas about Chakra UI, dev-mode slowness, and known issues. Include relevant warnings in the sub-agent prompt.
- One sub-agent per run. If it fails or times out, report the failure — don't retry.
- Don't create test files. This is interactive verification only.