e2e-tests-studio
REQUIRED when modifying any file in packages/playground-ui or packages/playground. Triggers on: React component creation/modification/refactoring, UI changes, new playground features, bug fixes affecting studio UI. Generates Playwright E2E tests that validate PRODUCT BEHAVIOR, not just UI states.
Install
mkdir -p .claude/skills/e2e-tests-studio && curl -L -o skill.zip "https://mcp.directory/api/skills/download/4429" && unzip -o skill.zip -d .claude/skills/e2e-tests-studio && rm skill.zip
Installs to .claude/skills/e2e-tests-studio
About this skill
E2E Behavior Validation for Frontend Modifications
Core Principle: Test Product Behavior, Not UI States
CRITICAL: Tests must verify that product features WORK correctly, not just that UI elements render.
What NOT to test (UI States):
- ❌ "Dropdown opens when clicked"
- ❌ "Modal appears after button click"
- ❌ "Loading spinner shows during request"
- ❌ "Form fields are visible"
- ❌ "Sidebar collapses"
What TO test (Product Behavior):
- ✅ "Selecting an LLM provider configures the agent to use that provider"
- ✅ "Creating a new agent persists it and shows in the agents list"
- ✅ "Running a tool with parameters returns the expected output"
- ✅ "Chat messages stream correctly and maintain conversation context"
- ✅ "Workflow execution triggers tools in the correct order"
Prerequisites
Requires the Playwright MCP server. If the browser_navigate tool is unavailable, instruct the user to add the server:
claude mcp add playwright -- npx @playwright/mcp@latest
Step 1: Understand the Feature Intent
Before writing ANY test, answer these questions:
- What user problem does this feature solve?
- What is the expected outcome when the feature works correctly?
- What data flows through the system? (user input → API → state → UI)
- What should persist after page reload?
- What downstream effects should this action have?
Document these answers as comments in your test file.
Step 2: Build and Start
pnpm build:cli
cd packages/playground/e2e/kitchen-sink && pnpm dev
Verify server at http://localhost:4111
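The server check can also be scripted. The following is a minimal sketch, not part of the skill: `waitForServer`, `httpProbe`, and their options are hypothetical names, and it assumes a runtime with a global `fetch` (Node 18 or later). Any HTTP response is treated as "server is up".

```typescript
// Hypothetical helper (not part of the skill): polls until a probe succeeds.
type Probe = (url: string) => Promise<boolean>;

// Default probe: any HTTP response, even a 404, means something is listening.
const httpProbe: Probe = async (url) => {
  try {
    await fetch(url);
    return true;
  } catch {
    return false; // Connection refused or DNS failure: not up yet.
  }
};

async function waitForServer(
  url: string,
  options: { attempts?: number; delayMs?: number; probe?: Probe } = {},
): Promise<boolean> {
  const { attempts = 30, delayMs = 1000, probe = httpProbe } = options;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    if (await probe(url)) return true;
    // Sleep between attempts, but not after the final one.
    if (attempt < attempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

Calling `await waitForServer('http://localhost:4111')` before starting a test run would fail fast when the kitchen-sink app is not running.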
Step 3: Map Feature to Behavior Tests
Feature-to-Test Mapping Guide
| Feature Category | What to Test | Example Assertion |
|---|---|---|
| Agent Configuration | Config changes affect agent behavior | Send message → verify response uses selected model |
| LLM Provider Selection | Selected provider is used in requests | Intercept API call → verify provider in request payload |
| Tool Execution | Tool runs with correct params & returns result | Execute tool → verify output matches expected transformation |
| Workflow Execution | Steps execute in order, data flows between steps | Run workflow → verify each step's output feeds next step |
| Chat/Streaming | Messages persist, context maintained across turns | Multi-turn conversation → verify context awareness |
| MCP Server Tools | Server tools are callable and return data | Call MCP tool → verify response structure and content |
| Memory/Persistence | Data survives page reload | Create item → reload → verify item exists |
| Error Handling | Errors surface correctly to user | Trigger error condition → verify error message + recovery |
Step 4: Write Behavior-Focused Tests
Test Structure Template
import { test, expect, type Page } from '@playwright/test';
import { resetStorage } from '../__utils__/reset-storage';
import { selectFixture } from '../__utils__/select-fixture';
import { nanoid } from 'nanoid';
/**
* FEATURE: [Name of feature]
* USER STORY: As a user, I want to [action] so that [outcome]
* BEHAVIOR UNDER TEST: [Specific behavior being validated]
*/
test.describe('[Feature Name] - Behavior Tests', () => {
let page: Page;
test.beforeEach(async ({ browser }) => {
const context = await browser.newContext();
page = await context.newPage();
});
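test.afterEach(async () => {
await resetStorage(page);
// Close the manually created context so contexts do not accumulate across tests
await page.context().close();
});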
test('should [verb describing behavior] when [trigger condition]', async () => {
// ARRANGE: Set up preconditions
// - Navigate to the feature
// - Configure any required state
// ACT: Perform the user action that triggers the behavior
// ASSERT: Verify the OUTCOME, not the UI state
// - Check data persistence
// - Verify downstream effects
// - Confirm API calls made correctly
});
});
Behavior Test Patterns
Pattern 1: Configuration Affects Behavior
test('selecting LLM provider should use that provider for agent responses', async () => {
// ARRANGE
await page.goto('/agents/my-agent/chat');
// Intercept API to verify provider
let capturedProvider: string | null = null;
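await page.route('**/api/chat', async route => {
const body = JSON.parse(route.request().postData() || '{}');
capturedProvider = body.provider;
// Await the continuation so a failure surfaces in the test instead of as an unhandled rejection
await route.continue();
});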
// ACT: Select a different provider
await page.getByTestId('provider-selector').click();
await page.getByRole('option', { name: 'OpenAI' }).click();
// Send a message to trigger the agent
await page.getByTestId('chat-input').fill('Hello');
await page.getByTestId('send-button').click();
// ASSERT: Verify the selected provider was used
await expect.poll(() => capturedProvider).toBe('openai');
});
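The interception in Pattern 1 parses the request body inline, which throws inside the route handler if a matching request carries a non-JSON body. A small sketch of a more defensive parser follows. `extractProvider` is a hypothetical helper, and the `provider` field name is taken from the example above, not from a confirmed API contract.

```typescript
// Hypothetical helper: safely reads the `provider` field from a request body.
// Returns null for missing bodies, non-JSON bodies, and non-string providers.
function extractProvider(postData: string | null): string | null {
  if (!postData) return null;
  try {
    const body: unknown = JSON.parse(postData);
    if (typeof body === 'object' && body !== null && 'provider' in body) {
      const provider = (body as { provider: unknown }).provider;
      return typeof provider === 'string' ? provider : null;
    }
    return null;
  } catch {
    // Non-JSON payloads (for example form data) are ignored rather than thrown on.
    return null;
  }
}
```

Inside the route handler this would read `capturedProvider = extractProvider(route.request().postData());`.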
Pattern 2: Data Persistence
test('created agent should persist after page reload', async () => {
// ARRANGE
await page.goto('/agents');
const agentName = `Test Agent ${nanoid()}`;
// ACT: Create new agent
await page.getByTestId('create-agent-button').click();
await page.getByTestId('agent-name-input').fill(agentName);
await page.getByTestId('save-agent-button').click();
// Wait for creation to complete
await expect(page.getByText(agentName)).toBeVisible();
// ASSERT: Verify persistence
await page.reload();
await expect(page.getByText(agentName)).toBeVisible({ timeout: 10000 });
});
Pattern 3: Tool Execution Produces Correct Output
test('weather tool should return formatted weather data', async () => {
// ARRANGE
await selectFixture(page, 'weather-success');
await page.goto('/tools/weather-tool');
// ACT: Execute tool with parameters
await page.getByTestId('param-city').fill('San Francisco');
await page.getByTestId('execute-tool-button').click();
// ASSERT: Verify OUTPUT content, not just that output appears
const output = page.getByTestId('tool-output');
await expect(output).toContainText('temperature');
await expect(output).toContainText('San Francisco');
// Verify structured data if applicable
const outputText = await output.textContent();
const outputData = JSON.parse(outputText || '{}');
expect(outputData).toHaveProperty('temperature');
expect(outputData).toHaveProperty('conditions');
});
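When the tool output is not valid JSON, the `JSON.parse` call in Pattern 3 fails with a generic syntax error. As a sketch, a helper can report which expected fields are absent instead. `missingKeys` is a hypothetical name, and the field names passed to it are whatever the tool under test is expected to return.

```typescript
// Hypothetical helper: lists the required keys that are absent from a JSON output.
// Unparseable or non-object output counts as missing every key.
function missingKeys(outputText: string | null, requiredKeys: string[]): string[] {
  let data: unknown;
  try {
    data = JSON.parse(outputText ?? '');
  } catch {
    return [...requiredKeys];
  }
  if (typeof data !== 'object' || data === null) return [...requiredKeys];
  const record = data as Record<string, unknown>;
  return requiredKeys.filter((key) => !(key in record));
}
```

In a test, `expect(missingKeys(outputText, ['temperature', 'conditions'])).toEqual([])` then produces a failure message that names the missing fields.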
Pattern 4: Workflow Step Chaining
test('workflow should pass data between steps correctly', async () => {
// ARRANGE
await selectFixture(page, 'workflow-multi-step');
const sessionId = nanoid();
await page.goto(`/workflows/data-pipeline?session=${sessionId}`);
// ACT: Trigger workflow execution
await page.getByTestId('workflow-input').fill('test input data');
await page.getByTestId('run-workflow-button').click();
// ASSERT: Verify each step received correct input from previous step
// Wait for completion
await expect(page.getByTestId('workflow-status')).toHaveText('completed', { timeout: 30000 });
// Check step outputs show data transformation chain
const step1Output = await page.getByTestId('step-1-output').textContent();
const step2Output = await page.getByTestId('step-2-output').textContent();
// Guard against an empty step 1 output, which would make the containment check pass trivially
expect(step1Output).toBeTruthy();
// Verify step 2 received step 1's output as input
expect(step2Output).toContain(step1Output!);
});
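For workflows with more than two steps, the same containment check can be applied along the whole chain. This is a sketch under the assumption made in Pattern 4 that each step's rendered output includes the previous step's output verbatim. `findBrokenLink` is a hypothetical helper.

```typescript
// Hypothetical helper: returns the index of the first step whose output is empty
// or does not contain the previous step's output, or -1 if the chain is intact.
function findBrokenLink(stepOutputs: Array<string | null>): number {
  for (let i = 0; i < stepOutputs.length; i++) {
    const current = stepOutputs[i];
    if (!current) return i; // Missing or empty output breaks the chain.
    if (i > 0) {
      // The previous entry was already checked to be non-empty.
      const previous = stepOutputs[i - 1] as string;
      if (!current.includes(previous)) return i;
    }
  }
  return -1;
}
```

Collect the outputs with `textContent()` for each step locator, then assert `expect(findBrokenLink(outputs)).toBe(-1)`.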
Pattern 5: Streaming Chat with Context
test('chat should maintain conversation context across messages', async () => {
// ARRANGE
await selectFixture(page, 'contextual-chat');
const chatId = nanoid();
await page.goto(`/agents/assistant/chat/${chatId}`);
// ACT: Multi-turn conversation
await page.getByTestId('chat-input').fill('My name is Alice');
await page.getByTestId('send-button').click();
await expect(page.getByTestId('assistant-message').last()).toBeVisible({ timeout: 20000 });
await page.getByTestId('chat-input').fill('What is my name?');
await page.getByTestId('send-button').click();
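// ASSERT: Verify context was maintained
// Wait for the second reply first; otherwise .last() may still match the first reply,
// which can itself mention "Alice". Assumes one assistant-message element per reply.
const replies = page.getByTestId('assistant-message');
await expect(replies).toHaveCount(2, { timeout: 20000 });
await expect(replies.last()).toContainText('Alice', { timeout: 20000 });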
});
Pattern 6: Error Recovery
test('should show actionable error and allow retry when API fails', async () => {
// ARRANGE: Set up failure fixture
await selectFixture(page, 'api-failure');
await page.goto('/tools/flaky-tool');
// ACT: Trigger the error
await page.getByTestId('execute-tool-button').click();
// ASSERT: Error is shown with recovery option
await expect(page.getByTestId('error-message')).toContainText('failed');
await expect(page.getByTestId('retry-button')).toBeVisible();
// Switch to success fixture and retry
await selectFixture(page, 'api-success');
await page.getByTestId('retry-button').click();
// Verify recovery worked
await expect(page.getByTestId('tool-output')).toBeVisible({ timeout: 10000 });
await expect(page.getByTestId('error-message')).not.toBeVisible();
});
Step 5: Update Existing Tests
When a test file already exists:
- Read the existing tests to understand current coverage
- Identify if tests are UI-focused or behavior-focused
- Refactor UI-focused tests to verify behavior instead:
Refactoring Example
BEFORE (UI-focused):
test('dropdown opens when clicked', async () => {
await page.getByTestId('model-dropdown').click();
await expect(page.getByRole('listbox')).toBeVisible();
});
AFTER (Behavior-focused):
test('selecting model from dropdown updates agent configuration', async () => {
// Open dropdown and select model
await page.getByTestId('model-dropdown').click();
await page.getByRole('option', { name: 'GPT-4' }).click();
// Verify the selection persists and affects behavior
await page.reload();
await expect(page.getByTestId('model-dropdown')).toHaveText('GPT-4');
// Optionally: verify the model is used in actual requests
// (via request interception or checking response metadata)
});
Step 6: Kitchen-Sink Fixtures for Behavior Testing
Fixtures should represent realistic scenarios, not just mock data:
Fixture Naming Convention
<feature>-<scenario>.fixture.ts
Examples:
- agent-with-tools.fixture.ts
- chat-multi-turn-context.fixture.ts
- workflow-parallel-execution.fixture.ts
- tool-validation-error.fixture.ts
- mcp-server-timeout.fixture.ts
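The naming convention can be checked mechanically. The sketch below assumes lowercase, hyphen-separated names with at least a feature part and a scenario part, which matches the examples above but is stricter than the convention states. `isValidFixtureName` is a hypothetical helper.

```typescript
// Assumed pattern: lowercase alphanumeric segments joined by hyphens,
// at least two segments (<feature>-<scenario>), ending in .fixture.ts.
const FIXTURE_NAME = /^[a-z0-9]+(?:-[a-z0-9]+)+\.fixture\.ts$/;

// Hypothetical helper: true when a file name follows the convention above.
function isValidFixtureName(fileName: string): boolean {
  return FIXTURE_NAME.test(fileName);
}
```

For example, `agent-with-tools.fixture.ts` passes, while `agent.fixture.ts` (no scenario part) and `agent-with-tools.ts` (no `.fixture` suffix) do not.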
Fixture Content Requirements
Each fixture must define:
- Scenario description (what behavior it enables testing)
- Expected outcomes (what assertions should pass)
- Edge cases covered (error states, empty states, etc.)
// fixtures/agent-provider-switch.fixture.ts
export const agentProviderSwitch = {
name: 'agent-provider-switch',
description: 'Tests that switching LLM providers changes agent behavior',
// Mock responses for different providers
responses: {
openai: { content: 'Response from OpenAI', model: 'gpt-4' },
anthropic: { content: 'Response from Anthropic', model: 'claude-3' },
},
expectedBehavior: {
// When provider is switched, subsequent messages use new provider
providerSwitchAffectsNextMessage: true,
// Provider selection persists across page reload
providerPersistsOnReload: true,
},
};
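The three content requirements can be expressed as a type plus a small check. This is only a sketch: `BehaviorFixture`, `fixtureProblems`, and the `edgeCases` field are hypothetical, and the real fixtures in the kitchen-sink app may be shaped differently.

```typescript
// Hypothetical shape for a behavior fixture; `edgeCases` is an assumed field name.
interface BehaviorFixture {
  name: string;
  description: string; // Scenario description
  expectedBehavior: Record<string, boolean>; // Expected outcomes
  edgeCases: string[]; // Edge cases covered
}

// Returns one human-readable problem per missing or empty requirement.
function fixtureProblems(fixture: Partial<BehaviorFixture>): string[] {
  const problems: string[] = [];
  if (!fixture.description?.trim()) {
    problems.push('missing scenario description');
  }
  if (!fixture.expectedBehavior || Object.keys(fixture.expectedBehavior).length === 0) {
    problems.push('missing expected outcomes');
  }
  if (!fixture.edgeCases || fixture.edgeCases.length === 0) {
    problems.push('missing edge cases');
  }
  return problems;
}
```

Under this assumed shape, the `agentProviderSwitch` example above would be reported as missing edge cases, since it lists none.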
Step 7: Run and Validate
cd packages/playground && pnpm test:e2e
Test Quality Checklist
Before considering tests complete, verify:
- Each test has a clear user story comment
- Tests verify OUTCOMES, not intermediate UI states
- Tests would FAIL if the feature broke (not just if UI changed)
- Persistence is verified via page.reload() where applicable
- Error scenarios are covered
- Tests use appropriate timeouts for async operations
- Fixtures represent realistic usage scenarios
Quick Reference
| Step | Command/Action |
|---|---|
| Build | pnpm build:cli |
| Start | cd packages/playground/e2e/kitchen-sink && pnpm dev |
| App URL | http://localhost:4111 |
| Routes | @packages/playground/src/App.tsx |
| Run tests | cd packages/playground && pnpm test:e2e |
| Test dir | packages/playground/e2e/tests/ |
| Fixtures | packages/playground/e2e/kitchen-sink/fixtures/ |
Anti-Patterns to Avoid
| ❌ Don't | ✅ Do Instead |
|---|---|
| Test that modal opens | Test that modal action completes and persists |
| Test that button is clickable | Test that clicking button produces expected result |
| Test loading spinner appears | Test that loaded data is correct |
| Test form validation message shows | Test that invalid form cannot submit AND valid form succeeds |
| Test dropdown has options | Test that selecting option changes system behavior |
| Test sidebar navigation works | Test that navigated page has correct data/functionality |
| Assert element is visible | Assert element contains expected data/state |