test-and-break


Autonomous testing skill that opens a deployed app, goes through user flows, tries to break things, and writes detailed bug reports. Use after deploying to staging. Triggers on: test the app, find bugs, QA the deployment, break the app, test staging.

Install

mkdir -p .claude/skills/test-and-break && curl -L -o skill.zip "https://mcp.directory/api/skills/download/6212" && unzip -o skill.zip -d .claude/skills/test-and-break && rm skill.zip

Installs to .claude/skills/test-and-break

About this skill

Test and Break

Systematically test a deployed application by going through user flows, trying edge cases, and attempting to break things. Outputs structured bug reports that can be converted to user stories for autonomous fixing.


Prerequisites

  • agent-browser installed (npm install -g agent-browser && agent-browser install)
  • App deployed to a URL (staging/preview)
  • Basic understanding of what the app should do (read the PRD)

The Job

  1. Read the PRD to understand what the app should do
  2. Open the deployed app in agent-browser
  3. Go through each major user flow
  4. Try to break things at each step
  5. Document all bugs and issues found
  6. Output structured bug reports

Testing Process

Step 1: Understand the App

Read tasks/prd.md and tasks/architecture.md to understand:

  • What user flows exist
  • What the app should do
  • What the expected behavior is

Step 2: Open the App

agent-browser open [DEPLOYMENT_URL]
agent-browser snapshot -i

Step 3: Test Each User Flow

For each major feature/flow in the PRD:

A. Happy Path Testing

  1. Go through the flow as a normal user would
  2. Verify each step works as expected
  3. Check that success states appear correctly

B. Edge Case Testing

Try these at each input/interaction point:

Input Edge Cases:

  • Empty inputs (submit with nothing)
  • Very long text (500+ characters)
  • Special characters (<script>alert('xss')</script>, '; DROP TABLE users;--)
  • Unicode/emojis (🎉, 中文, العربية)
  • Negative numbers where positive expected
  • Zero where non-zero expected
  • Future dates, past dates, invalid dates
  • Invalid email formats
  • Spaces only
  • Leading/trailing whitespace
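
The input edge cases above can be kept as a reusable table of test values, so every form gets the same treatment. This is a minimal sketch; the name `EDGE_CASE_INPUTS`, its keys, and the concrete values for dates, emails, and numbers are illustrative assumptions, not part of the skill.

```python
# Hypothetical catalogue of edge-case inputs; names and sample values are
# illustrative assumptions, chosen to mirror the list above.
EDGE_CASE_INPUTS = {
    "empty": "",
    "very_long": "a" * 501,                      # 500+ characters
    "xss": "<script>alert('xss')</script>",
    "sql_injection": "'; DROP TABLE users;--",
    "unicode": "🎉 中文 العربية",
    "negative_number": "-1",
    "zero": "0",
    "invalid_date": "2023-02-30",                # February has no 30th
    "invalid_email": "not-an-email",
    "spaces_only": "   ",
    "padded": "  padded value  ",                # leading/trailing whitespace
}

# Print only names and lengths so the output stays ASCII-safe.
for name, value in EDGE_CASE_INPUTS.items():
    print(f"{name}: {len(value)} characters")
```

Each value can then be typed into a field with the `agent-browser fill` command shown in the example session, followed by a snapshot to check how the app responds.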

Interaction Edge Cases:

  • Double-click buttons rapidly
  • Click back button during operations
  • Refresh page mid-flow
  • Open same page in multiple tabs
  • Submit form twice quickly

State Edge Cases:

  • Log out mid-operation (if auth exists)
  • Let session expire
  • Navigate directly to URLs that require prior steps
  • Use browser back/forward buttons

Visual/UX Issues:

  • Check mobile responsiveness (resize browser)
  • Look for overlapping elements
  • Check loading states exist
  • Verify error messages are helpful
  • Look for console errors

Step 4: Document Each Bug

For each issue found, document:

## BUG-XXX: [Short descriptive title]

**Severity:** Critical | High | Medium | Low
**Type:** Functional | UI/UX | Security | Performance | Accessibility

**Steps to Reproduce:**
1. Go to [URL]
2. Do [action]
3. Enter [input]
4. Click [button]

**Expected Behavior:**
[What should happen]

**Actual Behavior:**
[What actually happens]

**Screenshot:** [if applicable]

**Console Errors:** [if any]

**Notes:** [any additional context]
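
One way to keep every entry consistent with the template above is to render it from a small record type. The sketch below is an assumption about how that could look; the `Bug` class, its field names, and the example bug are hypothetical and only mirror the template's headings.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    """Hypothetical record mirroring the bug template's fields."""
    number: int
    title: str
    severity: str   # Critical | High | Medium | Low
    bug_type: str   # Functional | UI/UX | Security | Performance | Accessibility
    steps: list[str]
    expected: str
    actual: str
    screenshot: str = "N/A"
    console_errors: str = "None"
    notes: str = ""

    def to_markdown(self) -> str:
        # Number the reproduction steps starting at 1, as in the template.
        steps = "\n".join(f"{i}. {s}" for i, s in enumerate(self.steps, start=1))
        return (
            f"## BUG-{self.number:03d}: {self.title}\n\n"
            f"**Severity:** {self.severity}\n"
            f"**Type:** {self.bug_type}\n\n"
            f"**Steps to Reproduce:**\n{steps}\n\n"
            f"**Expected Behavior:**\n{self.expected}\n\n"
            f"**Actual Behavior:**\n{self.actual}\n\n"
            f"**Screenshot:** {self.screenshot}\n\n"
            f"**Console Errors:** {self.console_errors}\n\n"
            f"**Notes:** {self.notes}\n"
        )


# Made-up example bug, for illustration only.
example = Bug(
    number=1,
    title="Login form accepts empty email",
    severity="High",
    bug_type="Functional",
    steps=["Go to /login", "Leave the email field empty", "Click Submit"],
    expected="A validation error is shown",
    actual="The form submits and the page reloads",
)
print(example.to_markdown())
```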

Severity Guidelines

| Severity | Definition | Examples |
|----------|------------|----------|
| Critical | App broken, data loss, security issue | Crash, XSS vulnerability, data not saving |
| High | Major feature broken, bad UX | Can't complete main flow, confusing errors |
| Medium | Feature works but has issues | Minor validation missing, UI glitches |
| Low | Polish/minor issues | Typos, slight misalignment, minor UX |

Output Format

Save bug report to tasks/bug-report-[date].md:

# Bug Report: [App Name]
**Tested:** [Date]
**URL:** [Deployment URL]
**Tester:** Claude (Automated)

## Summary
- Total bugs found: X
- Critical: X
- High: X
- Medium: X
- Low: X

## Critical Bugs
[List critical bugs first]

## High Priority Bugs
[List high bugs]

## Medium Priority Bugs
[List medium bugs]

## Low Priority Bugs
[List low bugs]

## Positive Findings
[List things that worked well - important for context]

## Recommendations
[Overall suggestions for improvement]
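
The Summary counts and the report filename can be derived rather than typed by hand. The following is a minimal sketch under two assumptions: the helper names are hypothetical, and `[date]` is rendered as an ISO date (YYYY-MM-DD), which the skill does not specify.

```python
from collections import Counter
from datetime import date

SEVERITIES = ["Critical", "High", "Medium", "Low"]


def summary_lines(severities):
    """Build the '## Summary' bullet lines from a list of severity labels."""
    counts = Counter(severities)
    lines = [f"- Total bugs found: {len(severities)}"]
    lines += [f"- {level}: {counts[level]}" for level in SEVERITIES]
    return lines


def report_path(day):
    """Return the report path; the ISO date format is an assumption."""
    return f"tasks/bug-report-{day.isoformat()}.md"


print("\n".join(summary_lines(["High", "Low", "High"])))
print(report_path(date.today()))
```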

Converting Bugs to User Stories

After generating the bug report, convert each bug to user story format:

{
  "id": "BUG-001",
  "title": "Fix: [Bug title]",
  "description": "As a user, I expect [expected behavior] but currently [actual behavior].",
  "acceptanceCriteria": [
    "Specific fix criterion 1",
    "Specific fix criterion 2",
    "Regression test: [original bug steps] no longer reproduces",
    "Typecheck passes"
  ],
  "priority": 1,
  "passes": false,
  "notes": "Original bug: [reference]"
}

Priority mapping:

  • Critical bugs → priority 1-2
  • High bugs → priority 3-5
  • Medium bugs → priority 6-10
  • Low bugs → priority 11+
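
The conversion and the priority mapping can be sketched together. Note the assumptions: the function name and parameters are hypothetical, and each severity is assigned the first number of its band (1, 3, 6, 11), since the mapping above gives ranges rather than a rule for picking a value within them.

```python
import json

# First priority in each band from the mapping above; picking the band's
# lowest number is an assumption, not something the skill prescribes.
PRIORITY_START = {"Critical": 1, "High": 3, "Medium": 6, "Low": 11}


def bug_to_story(bug_id, title, expected, actual, severity, fix_criteria, repro_summary):
    """Convert one bug into the user story structure shown above."""
    return {
        "id": bug_id,
        "title": f"Fix: {title}",
        "description": f"As a user, I expect {expected} but currently {actual}.",
        "acceptanceCriteria": [
            *fix_criteria,
            f"Regression test: {repro_summary} no longer reproduces",
            "Typecheck passes",
        ],
        "priority": PRIORITY_START[severity],
        "passes": False,
        "notes": f"Original bug: {bug_id}",
    }


# Made-up example bug, for illustration only.
story = bug_to_story(
    bug_id="BUG-001",
    title="Login form accepts empty email",
    expected="a validation error on empty email",
    actual="the form submits",
    severity="High",
    fix_criteria=["Empty email shows an inline validation error"],
    repro_summary="submitting the login form with an empty email",
)
print(json.dumps(story, indent=2))
```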

Integration with Ralph

After generating bug stories, you can either:

  1. Add them to the existing prd.json - append bug fixes to the current project
  2. Create a new prd.json - start a bug-fix-only Ralph run

To add to existing prd.json:

# Read current max priority
MAX_PRIORITY=$(jq '[.userStories[].priority] | max' prd.json)

# Add bug stories starting after max priority
# (Claude should do this programmatically)
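
The "programmatic" step could look like the sketch below. It assumes prd.json has a top-level `userStories` array whose entries carry a numeric `priority`, as the jq expression implies; the function name is hypothetical, and renumbering the bug stories sequentially after the current maximum is one interpretation of "starting after max priority".

```python
import json
import tempfile
from pathlib import Path


def append_bug_stories(prd_path, bug_stories):
    """Append bug stories to prd.json, numbering them after the current max priority."""
    path = Path(prd_path)
    prd = json.loads(path.read_text(encoding="utf-8"))
    stories = prd.setdefault("userStories", [])
    # default=0 covers a prd.json with no stories yet.
    max_priority = max((s["priority"] for s in stories), default=0)
    for offset, story in enumerate(bug_stories, start=1):
        # Copy the story so the caller's dict is not mutated.
        stories.append({**story, "priority": max_priority + offset})
    path.write_text(json.dumps(prd, indent=2) + "\n", encoding="utf-8")
    return prd


if __name__ == "__main__":
    # Demo against a throwaway file rather than a real prd.json.
    with tempfile.TemporaryDirectory() as tmp:
        prd_file = Path(tmp) / "prd.json"
        prd_file.write_text(json.dumps({"userStories": [{"id": "US-001", "priority": 2}]}))
        updated = append_bug_stories(prd_file, [{"id": "BUG-001", "priority": 1}])
        print([s["priority"] for s in updated["userStories"]])  # prints [2, 3]
```

Pass the bug stories in severity order (critical first) so the most serious fixes receive the lowest new priority numbers.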

Example Testing Session

# 1. Open the app
agent-browser open https://my-app-staging.vercel.app

# 2. Take initial snapshot
agent-browser snapshot -i

# 3. Test login flow
agent-browser fill @e1 "test@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i

# 4. Try to break it
agent-browser fill @e1 ""  # empty email
agent-browser click @e3    # submit anyway
agent-browser snapshot -i  # check error handling

# 5. Try XSS
agent-browser fill @e1 "<script>alert('xss')</script>"
agent-browser snapshot -i

# Continue testing other flows...

Checklist

Before finishing testing:

  • Tested all user flows from PRD
  • Tried empty inputs on all forms
  • Tried special characters/XSS on all inputs
  • Checked mobile responsiveness
  • Looked for console errors
  • Verified error messages are helpful
  • Documented all bugs with reproduction steps
  • Assigned severity to each bug
  • Saved bug report to tasks/bug-report-[date].md
