Claude webapp-testing skill: 10 Playwright cookbook use cases

playwright-cli

Teaches Claude the `--grep`, `--project`, and `--shard` flags this smoke test relies on.

github

Wire the smoke run into a required check via the GitHub MCP server.

Auth flow: login form + session storage + logout

Verify the full sign-in loop: form validation, redirect on success, session cookie set, logout clears it.

For: Anyone with a credentialed app. Pairs naturally with `storageState` reuse for downstream tests.

The prompt

Write tests/auth/login.spec.ts. Fill the email and password inputs (use getByLabel), click 'Sign in', expect URL to become /dashboard, expect a cookie named 'session' to exist. Then click 'Log out' and expect the cookie to be gone. Save the authenticated state to playwright/.auth/user.json with page.context().storageState.

What tests/auth/login.spec.ts looks like

import { test, expect } from '@playwright/test';

test('user can sign in, persist session, and sign out', async ({ page, context }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill('user@example.com'); // placeholder test account
  await page.getByLabel('Password').fill(process.env.TEST_PASSWORD!);
  await page.getByRole('button', { name: 'Sign in' }).click();
  await expect(page).toHaveURL('/dashboard');
  expect((await context.cookies()).some(c => c.name === 'session')).toBe(true);
  await context.storageState({ path: 'playwright/.auth/user.json' });
  await page.getByRole('button', { name: 'Log out' }).click();
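  // The cookie is cleared asynchronously after the click, so poll instead of checking once.
  await expect.poll(async () => (await context.cookies()).some(c => c.name === 'session')).toBe(false);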
});

One-line tweak

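Reuse `storageState: 'playwright/.auth/user.json'` in the `use` block of `playwright.config.ts` so downstream specs skip the login UI entirely, which often speeds the suite up several-fold. If your server invalidates sessions on logout, save the state from a setup step that does not log out afterwards; otherwise the saved file may hold a dead session.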
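
The two cookie checks in the spec above could be factored into a small helper. This is a minimal sketch, not part of the skill: `hasLiveCookie` and `CookieLike` are hypothetical names, and the interface declares only the fields the helper reads from the entries returned by `context.cookies()`.

```typescript
// Minimal shape of a cookie entry; only the fields this helper reads.
interface CookieLike {
  name: string;
  value: string;
  expires: number; // Unix time in seconds; -1 marks a session cookie
}

// Hypothetical helper: true when a non-empty, unexpired cookie with the
// given name is present in the list.
function hasLiveCookie(
  cookies: CookieLike[],
  name: string,
  nowSeconds: number = Date.now() / 1000,
): boolean {
  return cookies.some(
    (c) =>
      c.name === name &&
      c.value !== '' &&
      (c.expires === -1 || c.expires > nowSeconds),
  );
}
```

In a spec this would read `expect(hasLiveCookie(await context.cookies(), 'session')).toBe(true)`. Checking expiry as well as presence guards against a logout that expires the cookie instead of deleting it.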

Pairs with

playwright-pro

Covers the `storageState` + auth-fixture pattern in depth.

writing-playwright-tests

Teaches the locator-priority rules (role and label locators before CSS selectors) this test depends on.

Visual regression with toHaveScreenshot()

Pixel-diff the rendered page against a committed baseline; fail when a CSS change quietly moves your hero by 4px.

For: Design-system teams, marketing pages, anything where unintended visual drift is a bug.

The prompt

Write tests/visual/pricing.spec.ts. Navigate to /pricing, wait for networkidle, then call expect(page).toHaveScreenshot('pricing.png'). Mask the dynamic '€/$' price block with a CSS selector. Set maxDiffPixels: 100.

What tests/visual/pricing.spec.ts looks like

import { test, expect } from '@playwright/test';

test('pricing page matches the committed baseline', async ({ page }) => {
  await page.goto('/pricing');
  await page.waitForLoadState('networkidle');
  await expect(page).toHaveScreenshot('pricing.png', {
    mask: [page.locator('[data-test="price-amount"]')],
    maxDiffPixels: 100,
    animations: 'disabled',
  });
});

One-line tweak

Generate baselines on the same OS that runs CI — never locally. `npx playwright test --update-snapshots --project=chromium` from a Linux runner avoids the macOS-vs-Ubuntu anti-aliasing trap.
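
To make `maxDiffPixels: 100` concrete, here is a deliberately simplified sketch of what "number of differing pixels" means. It is not Playwright's comparator, which is perceptual and also exposes a `threshold` option; `countDiffPixels` and `withinBudget` are hypothetical names, and the inputs are assumed to be raw RGBA buffers of equal size.

```typescript
// Counts pixels where any RGBA channel differs by more than `tolerance`.
// Simplified illustration only; the real comparison is perceptual.
function countDiffPixels(a: Uint8Array, b: Uint8Array, tolerance = 0): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error('buffers must be same-size RGBA data');
  }
  let diff = 0;
  for (let i = 0; i < a.length; i += 4) {
    for (let ch = 0; ch < 4; ch++) {
      if (Math.abs(a[i + ch] - b[i + ch]) > tolerance) {
        diff++; // count the pixel once, then move to the next pixel
        break;
      }
    }
  }
  return diff;
}

// True when the two images differ in at most `maxDiffPixels` pixels.
function withinBudget(a: Uint8Array, b: Uint8Array, maxDiffPixels: number): boolean {
  return countDiffPixels(a, b) <= maxDiffPixels;
}
```

The point of the budget is that anti-aliasing noise touches a handful of pixels, while a hero that moved by 4px touches thousands.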

Pairs with

screenshot-to-code

When the diff exposes drift, this skill turns the screenshot back into JSX/Tailwind for you.

browserbase

Runs the snapshot pass in a deterministic cloud browser, eliminating per-developer baseline drift.

Network mocks via page.route()

Stub flaky upstream APIs so the test verifies your UI, not someone else's uptime.

For: Any frontend that talks to a third party (Stripe, Algolia, GitHub API, internal microservices).

The prompt

Write tests/mocks/search.spec.ts. Intercept GET /api/search?q=* and respond with a fixture of three results. Type 'react' into the search input and assert the three result titles appear.

What tests/mocks/search.spec.ts looks like

import { test, expect } from '@playwright/test';

test('search renders results from a stubbed API', async ({ page }) => {
  await page.route('**/api/search?q=*', async route => {
    await route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ hits: [
        { id: '1', title: 'React docs' },
        { id: '2', title: 'React Router' },
        { id: '3', title: 'React Query' },
      ] }),
    });
  });
  await page.goto('/search');
  await page.getByPlaceholder('Search').fill('react');
  await expect(page.getByRole('listitem')).toHaveCount(3);
  await expect(page.getByText('React Router')).toBeVisible();
});

One-line tweak

Move the fixture to `tests/fixtures/search.json` and `import searchFixture from '../fixtures/search.json'` so the response stays diffable in PRs.
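
Once the fixture lives in its own file, the `route.fulfill` options can come from one small builder. A minimal sketch under assumptions: `jsonFulfill`, `SearchFixture`, and `FulfillOptions` are hypothetical names, and the fixture is inlined here (instead of imported from JSON) so the block stands alone.

```typescript
interface SearchHit { id: string; title: string }
interface SearchFixture { hits: SearchHit[] }

// The subset of route.fulfill options this sketch uses.
interface FulfillOptions { status: number; contentType: string; body: string }

// Hypothetical helper: wraps any JSON-serializable payload as a fulfill response.
function jsonFulfill(payload: unknown, status = 200): FulfillOptions {
  return { status, contentType: 'application/json', body: JSON.stringify(payload) };
}

// Inlined copy of the three-result fixture from the spec above.
const searchFixture: SearchFixture = {
  hits: [
    { id: '1', title: 'React docs' },
    { id: '2', title: 'React Router' },
    { id: '3', title: 'React Query' },
  ],
};

const searchResponse = jsonFulfill(searchFixture);
```

Inside the route handler this becomes `await route.fulfill(jsonFulfill(searchFixture))`, and an error case is one argument away: `jsonFulfill({ error: 'upstream down' }, 503)`.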

Pairs with

writing-playwright-tests

Covers the `route.fulfill` vs `route.continue` decision tree.

playwright

The Playwright MCP server exposes the same intercept primitives if you want them across many turns.

Mobile viewport with devices['iPhone 13']

Catch the regressions that only happen at 390×844 — overflow, touch targets too close, off-canvas nav broken.

For: Anyone whose mobile traffic drives more than 30% of conversions.

The prompt

Add a 'mobile' project to playwright.config.ts using devices['iPhone 13']. Write tests/mobile/nav.spec.ts that opens the hamburger, taps 'Pricing', and asserts the URL is /pricing.

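What playwright.config.ts and tests/mobile/nav.spec.ts look like

// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';
export default defineConfig({
  projects: [
    // tap() needs a touch-enabled context, so keep the mobile specs out of the desktop project.
    { name: 'chromium', testIgnore: '**/mobile/**', use: { ...devices['Desktop Chrome'] } },
    { name: 'mobile',   use: { ...devices['iPhone 13'] } },
  ],
});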

// tests/mobile/nav.spec.ts
import { test, expect } from '@playwright/test';
test('mobile nav opens and routes to /pricing', async ({ page }) => {
  await page.goto('/');
  await page.getByRole('button', { name: 'Open menu' }).tap();
  await page.getByRole('link', { name: 'Pricing' }).tap();
  await expect(page).toHaveURL('/pricing');
});

One-line tweak

Run mobile-only with `npx playwright test --project=mobile`. Add `devices['Pixel 7']` as a third project to catch Chrome-on-Android-specific issues.
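
The "touch targets too close" regression mentioned above can be checked numerically from bounding boxes. A minimal sketch, assuming boxes shaped like the result of `locator.boundingBox()`; the helper names are hypothetical, and the 44px default is an assumed minimum (it matches the WCAG 2.5.5 enhanced target size), so adjust it to your own design rules.

```typescript
// Same shape as the object returned by locator.boundingBox().
interface Box { x: number; y: number; width: number; height: number }

// Hypothetical helper: true when the target is at least minSize x minSize CSS px.
function isTouchTargetLargeEnough(box: Box | null, minSize = 44): boolean {
  return box !== null && box.width >= minSize && box.height >= minSize;
}

// Hypothetical helper: shortest distance between two boxes, 0 if they touch or overlap.
function gapBetween(a: Box, b: Box): number {
  const dx = Math.max(0, a.x - (b.x + b.width), b.x - (a.x + a.width));
  const dy = Math.max(0, a.y - (b.y + b.height), b.y - (a.y + a.height));
  return Math.hypot(dx, dy);
}
```

In a spec, `expect(isTouchTargetLargeEnough(await link.boundingBox())).toBe(true)` would fail when a nav link shrinks below the minimum at the mobile viewport.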

Pairs with

accessibility-auditor

Mobile is where touch-target and contrast violations bite hardest — bolt on an a11y pass.

playwright-cli

Covers the `--project` selector this entry depends on.

OAuth popup handling

Test the 'Sign in with Google/GitHub' flow without flaking — the popup must be awaited BEFORE the click that opens it.

For: Any product with social sign-in.

The prompt

Write tests/oauth/github.spec.ts. Click 'Continue with GitHub'. Use page.waitForEvent('popup') CREATED BEFORE the click. In the popup, fill the GitHub login fixture and click Authorize. Expect the original page to land at /dashboard.

What tests/oauth/github.spec.ts looks like

import { test, expect } from '@playwright/test';

test('github OAuth popup completes and returns to /dashboard', async ({ page }) => {
  await page.goto('/login');
  const popupPromise = page.waitForEvent('popup');           // create BEFORE click
  await page.getByRole('button', { name: 'Continue with GitHub' }).click();
  const popup = await popupPromise;
  await popup.waitForLoadState();
  await popup.getByLabel('Username').fill(process.env.GH_USER!);
  await popup.getByLabel('Password').fill(process.env.GH_PW!);
  await popup.getByRole('button', { name: 'Sign in' }).click();
  await popup.getByRole('button', { name: 'Authorize' }).click();
  await expect(page).toHaveURL('/dashboard');
});

One-line tweak

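For deterministic CI runs, stub the `/oauth/callback` endpoint and never hit the real GitHub IdP. If the callback loads inside the popup, register the stub with `context.route`, since `page.route` only covers the page it is registered on; see use case 4 for the basic pattern.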
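
Why the popup promise has to exist before the click: a one-shot event that fires with nobody listening is simply lost. The sketch below is a simplified synchronous model, not Playwright code; `TinyEmitter` and `capturePopup` are hypothetical, and the URL is only an illustrative payload.

```typescript
type Listener = (payload: string) => void;

// Minimal one-shot event emitter standing in for the page.
class TinyEmitter {
  private listeners = new Map<string, Listener[]>();

  once(event: string, listener: Listener): void {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
  }

  emit(event: string, payload: string): void {
    const list = this.listeners.get(event) ?? [];
    this.listeners.delete(event); // one-shot: clear listeners, then notify them
    for (const listener of list) listener(payload);
  }
}

// Returns the popup URL if the listener saw the event, otherwise null.
function capturePopup(listenBeforeClick: boolean): string | null {
  const page = new TinyEmitter();
  let captured: string | null = null;
  const listen = () => page.once('popup', (url) => { captured = url; });
  const click = () => page.emit('popup', 'https://github.com/login/oauth/authorize');
  if (listenBeforeClick) {
    listen();
    click();
  } else {
    click(); // the event fires with no listener attached and is dropped
    listen();
  }
  return captured;
}
```

`capturePopup(true)` sees the event and `capturePopup(false)` never does, which is the same failure mode as calling `page.waitForEvent('popup')` after an awaited click.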

Pairs with

playwright-pro

Has the canonical popup-promise pattern and a section on stubbing OAuth in CI.

github

When the test does need to read a real GitHub fixture user, the GitHub MCP server provisions it.

Performance assertion via the Paint Timing API

Fail the build when First Contentful Paint regresses past a budget — without bolting on a Lighthouse CI process.

For: Teams with a perf budget already (LCP < 2.5s, CLS < 0.1).

The prompt

Write tests/perf/landing.spec.ts. Navigate to /, wait until the page is settled, then read the browser-side `performance.getEntriesByType('paint')` API to assert First Contentful Paint < 1500 ms.

What tests/perf/landing.spec.ts looks like

import { test, expect } from '@playwright/test';

test('landing page FCP stays under 1.5s', async ({ page }) => {
  await page.goto('/', { waitUntil: 'networkidle' });
  const fcp = await page.evaluate(() => {
    const entry = performance
      .getEntriesByType('paint')
      .find((e) => e.name === 'first-contentful-paint');
    return entry ? entry.startTime : null;
  });
  console.log('FCP (ms):', fcp);
  expect(fcp).not.toBeNull();
  expect(fcp!).toBeLessThan(1500);
});

One-line tweak

Pair with the `perf-lighthouse` skill if you want full Core Web Vitals (LCP, INP, CLS) — the Paint Timing API is the cheap floor; Lighthouse is the audit.
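
If you later collect more than FCP, the budget check itself can stay a plain function. A minimal sketch: `budgetViolations` is a hypothetical helper, the FCP limit comes from the spec above, and the LCP and CLS limits are the example budgets quoted in this entry; how the extra metrics get measured is out of scope here.

```typescript
type MetricName = 'FCP' | 'LCP' | 'CLS';
type Metrics = Partial<Record<MetricName, number>>;

// FCP and LCP in milliseconds, CLS unitless.
const defaultBudgets: Record<MetricName, number> = { FCP: 1500, LCP: 2500, CLS: 0.1 };

// Hypothetical helper: one message per metric that is missing or not below its budget.
function budgetViolations(
  measured: Metrics,
  limits: Record<MetricName, number> = defaultBudgets,
): string[] {
  const failures: string[] = [];
  for (const name of Object.keys(limits) as MetricName[]) {
    const value = measured[name];
    if (value === undefined) {
      failures.push(`${name}: not measured`);
    } else if (value >= limits[name]) {
      failures.push(`${name}: ${value} is not below budget ${limits[name]}`);
    }
  }
  return failures;
}
```

A spec can then assert `expect(budgetViolations(metrics)).toEqual([])`, and the failure message names every metric that regressed instead of stopping at the first one.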

Pairs with

perf-lighthouse

Covers the full Lighthouse CI integration when Paint Timing metrics aren't enough.

chrome-devtools

The Chrome DevTools skill is the canonical reference for CDP domains beyond Performance.

Accessibility audit with @axe-core/playwright

Run axe against every key page; fail the build on serious or critical violations.

For: Anyone who needs WCAG 2.1 AA. The skill prompt also nudges Claude to fix obvious violations in the same PR.

The prompt

Install @axe-core/playwright. Write tests/a11y/dashboard.spec.ts that navigates to /dashboard, runs AxeBuilder({ page }).analyze(), and fails if any 'serious' or 'critical' violation is reported.

What tests/a11y/dashboard.spec.ts looks like

import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('dashboard has no serious or critical a11y violations', async ({ page }) => {
  await page.goto('/dashboard');
  const results = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa']) // 2.1 rules need their own tags
    .analyze();
  const blocking = results.violations.filter(
    v => v.impact === 'serious' || v.impact === 'critical'
  );
  expect.soft(blocking, JSON.stringify(blocking, null, 2)).toEqual([]);
});

One-line tweak

`expect.soft` marks the test as failed without stopping it, so any audits that follow in the same test (other pages, other UI states) still run and every violation lands in a single report.
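
Dumping the whole axe result with `JSON.stringify` makes for a noisy failure message. Here is a minimal sketch of a filter plus a one-line-per-rule summary; `ViolationLike` declares only the fields used here (an assumption about the parts of the axe result you care about), and both helper names are hypothetical.

```typescript
type Impact = 'minor' | 'moderate' | 'serious' | 'critical' | null;

// Only the fields of an axe violation that this sketch reads.
interface ViolationLike {
  id: string;
  impact?: Impact;
  help: string;
  nodes: { target: unknown[] }[];
}

// Keeps violations whose impact is in the blocking list.
function blockingViolations(
  violations: ViolationLike[],
  blocking: Impact[] = ['serious', 'critical'],
): ViolationLike[] {
  return violations.filter((v) => v.impact != null && blocking.includes(v.impact));
}

// One readable line per violation: impact, rule id, help text, affected node count.
function summarize(violations: ViolationLike[]): string {
  return violations
    .map((v) => `[${v.impact}] ${v.id}: ${v.help} (${v.nodes.length} node(s))`)
    .join('\n');
}
```

In the spec above, `summarize(blocking)` could replace the `JSON.stringify` call as the assertion message.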

Pairs with

accessibility-auditor

Reads the axe report and proposes the smallest diff that resolves each violation.

axe-accessibility

An MCP-side runner if you want axe results streamable across many tests/turns.

CI matrix: chromium / firefox / webkit on GitHub Actions

Run every spec across the three engines on every push, with traces uploaded on failure.

For: Any team that ships to users on Safari (i.e., everyone).

The prompt

Generate .github/workflows/playwright.yml. Matrix over the three projects (chromium, firefox, webkit). Cache Playwright browsers. Upload playwright-report/ as an artifact on failure.

What .github/workflows/playwright.yml looks like

# .github/workflows/playwright.yml
name: Playwright Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        project: [chromium, firefox, webkit]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: 'npm' }
      - run: npm ci
      - run: npx playwright install --with-deps ${{ matrix.project }}
      - run: npx playwright test --project=${{ matrix.project }}
      - if: ${{ failure() }}
        uses: actions/upload-artifact@v4
        # Block mapping: a ${{ }} expression cannot sit unquoted inside { ... } flow syntax.
        with:
          name: playwright-report-${{ matrix.project }}
          path: playwright-report/
          retention-days: 7

One-line tweak

Add `--shard=${{ matrix.shard }}/4` and a second matrix axis `shard: [1, 2, 3, 4]` once your suite passes ~5 minutes wall-clock.
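
To see what the two-axis matrix expands to, this sketch enumerates the command each job would run. `matrixCommands` is a hypothetical helper for illustration; GitHub Actions performs this expansion itself, and the flags are the `--project` and `--shard` flags already used in this entry.

```typescript
// Hypothetical helper: one CLI invocation per (project, shard) pair.
function matrixCommands(projects: string[], shardTotal: number): string[] {
  if (!Number.isInteger(shardTotal) || shardTotal < 1) {
    throw new Error('shardTotal must be a positive integer');
  }
  const commands: string[] = [];
  for (const project of projects) {
    // Shards are 1-indexed: --shard=1/4 through --shard=4/4.
    for (let shard = 1; shard <= shardTotal; shard++) {
      commands.push(`npx playwright test --project=${project} --shard=${shard}/${shardTotal}`);
    }
  }
  return commands;
}

const jobs = matrixCommands(['chromium', 'firefox', 'webkit'], 4);
```

Three projects times four shards is twelve jobs per push, which is worth weighing against your runner minutes before you turn it on.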

Pairs with

playwright-cli

The `--shard` and `--project` flags this workflow leans on.

github

Use the GitHub MCP to make Playwright a required status check programmatically.

Shared user fixture with test.extend

Stop re-logging-in at the start of every spec. Define one authenticated fixture; every downstream test inherits it.

For: Any suite over ~20 tests where login boilerplate is now the slowest part of CI.

The prompt

Refactor the suite. Create tests/fixtures/auth.ts that exports a custom test with an `authedPage` fixture. Use storageState from playwright/.auth/user.json (saved by use case 2). Then rewrite tests/dashboard.spec.ts to import from the fixture and skip the login UI entirely.

What tests/fixtures/auth.ts and tests/dashboard.spec.ts look like

// tests/fixtures/auth.ts
import { test as base, expect, type Page } from '@playwright/test';
type Fixtures = { authedPage: Page };
export const test = base.extend<Fixtures>({
  authedPage: async ({ browser }, use) => {
    const context = await browser.newContext({ storageState: 'playwright/.auth/user.json' });
    const page = await context.newPage();
    await use(page);
    await context.close();
  },
});
export { expect };

// tests/dashboard.spec.ts
import { test, expect } from './fixtures/auth';
test('authed user sees their org name', async ({ authedPage }) => {
  await authedPage.goto('/dashboard');
  await expect(authedPage.getByTestId('org-name')).toHaveText('Acme Inc.');
});

One-line tweak

Add a second fixture `adminPage` that loads `playwright/.auth/admin.json` for tests that need elevated permissions — the pattern composes.
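
With more than one role, keeping the state-file paths in a single typed map avoids typos in fixture definitions. A minimal sketch: `authStates` and `storageStateFor` are hypothetical names, and the two paths are the ones mentioned in this entry.

```typescript
// One entry per role; `as const` keeps the keys as a literal union.
const authStates = {
  user: 'playwright/.auth/user.json',
  admin: 'playwright/.auth/admin.json',
} as const;

type Role = keyof typeof authStates;

// Hypothetical helper: resolves a role to its saved storage-state file.
function storageStateFor(role: Role): string {
  return authStates[role];
}
```

Each fixture would then pass `storageState: storageStateFor('admin')` to `browser.newContext`, and a misspelled role fails at compile time.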

Pairs with

writing-playwright-tests

Has the canonical fixture-extension recipe with TypeScript generics.

playwright-pro

Covers worker-scoped fixtures and parallel-safe state for larger suites.

Community signal

Three voices from engineering blogs on moving from Cypress to Playwright. The first is the free-parallelism argument; the second is a productivity estimate from a developer who made the switch; the third is the flakiness story.

“Playwright runs tests in parallel by default for free, whereas Cypress performs parallelization only for different machines through a paid feature.”

BigBinary engineering team · Blog

Why-we-switched post-mortem; the single biggest reason teams move once Claude is authoring tests — parallel runs are free.

“I'm likely 50–100% more productive in Playwright than I was in Cypress.”

Michael Lynch · Blog

Honest cost/benefit on Cypress→Playwright migration; useful counterweight to the contrarian section below.

“Playwright exposed problems that Cypress's automatic retries and auto-waiting masked — if tests had race conditions, Playwright exposed them consistently.”

21RISK engineering · Blog

The flakiness story most teams discover the week after they migrate.

The contrarian take

Not everyone is sold on skills for Playwright. The most honest critique on the launch thread came from Michael Lynch (mtlynch.io):

“I have a personal appreciation for Cypress as an open-source company, and in particular, Gleb Bahmutov, their VP of Engineering.”

Michael Lynch (mtlynch.io) · Blog

From the Show HN thread on Playwright skills.