real-pytest-no-mocks-real-tests

Write pytest tests that exercise real public interfaces with actual components, using no mocking and precise assertions. Includes MIRA-specific patterns. Use when creating or reviewing tests.

Install

mkdir -p .claude/skills/real-pytest-no-mocks-real-tests && curl -L -o skill.zip "https://mcp.directory/api/skills/download/7546" && unzip -o skill.zip -d .claude/skills/real-pytest-no-mocks-real-tests && rm skill.zip

Installs to .claude/skills/real-pytest-no-mocks-real-tests

About this skill

Real Testing Philosophy

CRITICAL MINDSET SHIFT

Tests that verify implementation are worse than no tests - they provide false confidence while catching nothing.

Your job is not to confirm the code works. Your job is to:

  1. Think critically about the contract - what SHOULD this module do?
  2. Surface design problems - is this module papering over architectural failures?
  3. Write tests that enforce guarantees - not tests that mirror implementation
  4. Prove tests can fail - see them fail first, verify failure modes are correct

Tests that always pass are actively harmful. They waste time and provide false security.

🚨 NEVER SKIP TESTS

ABSOLUTE RULE: Do NOT use @pytest.mark.skip, @pytest.mark.skipif, or pytest.skip()

Tests either:

  • PASS - the code works correctly
  • FAIL - the code is broken and needs fixing

There is no third state. Skipped tests are:

  • Technical debt pretending to be documentation
  • Broken code that someone gave up on
  • False confidence in test coverage metrics

If a test can't run:

  • Fix the environment/dependencies so it can run
  • Fix the code so the test passes
  • Delete the test if it's testing something that doesn't exist

NEVER commit a skipped test. Either make it pass or delete it.
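If a missing dependency is the reason a test "can't run", make the suite fail loudly rather than skip. A minimal sketch of this idea (the `require` helper is an illustration, not part of the skill):

```python
import importlib.util

def require(module_name: str) -> None:
    """Fail loudly (never skip) when a required dependency is missing."""
    if importlib.util.find_spec(module_name) is None:
        raise AssertionError(
            f"Required dependency '{module_name}' is missing - "
            "fix the environment instead of skipping the test"
        )

# Call at import time in a test module: a broken environment now FAILS the
# suite instead of silently shrinking coverage.
require("sqlite3")
```

This keeps the two-state model intact: either the environment is fixed and the test runs, or the run is red until someone fixes it.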


PHASE 1: Contract-First Analysis (DO THIS FIRST)

NEVER write tests by reading implementation. That's how you write tests that mirror what code does instead of what it should do.

Protocol: Analyze Contract Without Reading Implementation

Step 1: Read ONLY the module's public interface

# Read THIS (public interface)
from typing import Any, Dict

class ReminderTool:
    def run(self, operation: str, **kwargs) -> Dict[str, Any]:
        """Execute reminder operations."""
        pass

# DO NOT read implementation details
# DO NOT look at internal methods
# DO NOT read how it's implemented

Step 2: Document the contract

Before writing any test, answer these questions in writing:

MODULE CONTRACT ANALYSIS
========================

1. What is this module's PURPOSE?
   - What problem does it solve?
   - Why does it exist?

2. What GUARANTEES does it provide?
   - What promises does the API make?
   - What invariants must hold?
   - What post-conditions are guaranteed?

3. What should SUCCEED?
   - Valid inputs
   - Happy path scenarios
   - Boundary cases that should work

4. What should FAIL?
   - Invalid inputs
   - Boundary conditions that should error
   - Security violations
   - Resource constraints

5. What are the DEPENDENCIES?
   - What does this module depend on?
   - Are there too many dependencies?
   - Could this be simpler?

6. ARCHITECTURAL CONCERNS:
   - Is this module doing too much?
   - Is it papering over design failures elsewhere?
   - Does the contract make sense or is it convoluted?
   - Should this module even exist?

Step 3: Design test cases from contract

Based on contract analysis (NOT implementation):

  • List positive test cases (what should work)
  • List negative test cases (what should fail)
  • List boundary conditions
  • List security concerns
  • List performance concerns

See "CANONICAL EXAMPLE" section below for complete contract analysis walkthrough.


PHASE 1.5: Contract Verification (VALIDATE YOUR ASSUMPTIONS)

CRITICAL: Do NOT read the implementation file yourself. Use the contract-extractor agent as an abstraction barrier.

Why This Phase Exists

You've formed expectations about the contract from the interface. Now verify those expectations against actual implementation WITHOUT seeing the implementation yourself. The agent reads the code and reports ONLY contract facts (not implementation details).

Protocol: Invoke Agent → Compare → Identify Gaps

Step 1: Invoke the contract-extractor agent

# Use Task tool to invoke the agent
Task(
    subagent_type="contract-extractor",
    description="Extract contract from module",
    prompt="""Extract the contract from: path/to/module.py

Return:
- Public interface (methods, signatures, types)
- Actual return structures (dict keys, types)
- Exception contracts (what raises what, when)
- Edge cases handled
- Dependencies and architectural concerns"""
)

Step 2: Compare your expectations against agent report

Create a comparison:

EXPECTATION vs REALITY
======================

Expected return structure:
{
    "status": str,
    "results": list
}

Actual return structure (from agent):
{
    "status": str,
    "confidence": float,  # I MISSED THIS
    "results": list,
    "result_count": int   # I MISSED THIS
}

Expected exceptions:
- ValueError for empty query

Actual exceptions (from agent):
- ValueError for empty query ✓
- ValueError for negative max_results  # I MISSED THIS

Expected edge cases:
- Empty results returns []

Actual edge cases (from agent):
- Empty results returns status="low_confidence", confidence=0.0, results=[]
  # More nuanced than I expected

Step 3: Identify discrepancies and their implications

For each discrepancy, ask:

  • Is the code wrong (doesn't match intended contract)?
  • Is the contract unclear (missing documentation)?
  • Did I misunderstand the requirements?
  • Is this an undocumented feature (needs test)?

Example Analysis:

DISCREPANCY: Agent reports confidence field in return, I didn't expect it
IMPLICATION: This is part of the contract - add test to verify confidence in [0.0, 1.0]

DISCREPANCY: Agent reports ValueError for negative max_results, I didn't expect it
IMPLICATION: Good edge case handling - add negative test

DISCREPANCY: Agent reports 8 dependencies, I expected 3-4
IMPLICATION: ARCHITECTURAL CONCERN - too many deps, report to human

Step 4: Update test plan based on verified contract

Now you know:

  • What the code actually returns (test these exact structures)
  • What exceptions are actually raised (test these exact cases)
  • What edge cases are actually handled (test these behaviors)
  • What architectural problems exist (report these to human)

Step 5: Design comprehensive test cases

# Based on VERIFIED contract (not assumptions):

# Positive tests
- test_search_returns_exact_structure  # Verify all keys agent reported
- test_search_confidence_in_valid_range  # Agent said 0.0-1.0
- test_search_respects_max_results  # Agent confirmed this guarantee

# Negative tests
- test_search_rejects_empty_query  # Agent confirmed ValueError
- test_search_rejects_negative_max_results  # Agent revealed this

# Edge cases
- test_search_empty_results_structure  # Agent showed exact structure
- test_search_with_no_user_data  # Based on RLS info from agent

# Architectural concerns
- Report to human: "Module has 8 dependencies - possible SRP violation"

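"Test these exact structures" is easiest to enforce with set equality over the returned keys, which fails on both missing and unexpected fields. A sketch in which an inlined dict stands in for the real `search_tool.run(...)` call; the keys mirror the verified contract discussed above:

```python
def test_search_returns_exact_structure():
    # Stand-in for: result = search_tool.run(operation="search", ...)
    result = {"status": "ok", "confidence": 0.9,
              "results": ["r1"], "result_count": 1}

    # Set equality catches BOTH missing and unexpected keys.
    assert set(result) == {"status", "confidence", "results", "result_count"}
    assert 0.0 <= result["confidence"] <= 1.0
    assert result["result_count"] == len(result["results"])
```

An `"in"` check alone would silently accept extra undocumented fields; set equality forces the contract to be updated when the structure changes.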
See "CANONICAL EXAMPLE" section below for complete agent invocation, comparison, and gap analysis walkthrough.

When to Read Implementation

Only AFTER writing tests based on verified contract. Then you can read implementation for context, debugging, or refactoring - but tests are already protecting the contract.


PHASE 2: Fail-First Verification (PROVE TESTS CAN FAIL)

A test that always passes proves nothing. You must see it fail.

Protocol: Write → Fail → Verify

Step 1: Write test based on contract expectations

Don't look at implementation. Write assertions based on what the contract says SHOULD happen.

def test_search_returns_confidence_score(search_tool, authenticated_user):
    """Contract: search must return confidence score between 0.0 and 1.0"""
    user_id = authenticated_user["user_id"]
    set_current_user_id(user_id)

    # Based on contract, not implementation
    result = search_tool.run(
        operation="search",
        query="Python async patterns",
        max_results=5
    )

    # Contract expectations
    assert "confidence" in result
    assert 0.0 <= result["confidence"] <= 1.0
    assert "results" in result
    assert len(result["results"]) <= 5

Step 2: Run the test - expect failure or question success

pytest tests/test_search_tool.py::test_search_returns_confidence_score -v

If test FAILS:

  • Is this the expected failure? (No data exists yet)
  • Is the failure message clear?
  • Is this exposing a bug in the code?
  • Is this exposing a problem with the contract?

If test PASSES immediately:

  • Is the code actually correct?
  • Are my assertions too weak?
  • Am I testing a trivial case?
  • Did I set up test data somewhere I forgot about?

Step 3: Verify the test can actually catch bugs

Temporarily break the code and verify the test fails:

# In the actual implementation, temporarily break it:
def run(self, operation, **kwargs):
    return {"confidence": 2.5}  # INTENTIONAL BUG: exceeds 1.0

Run test - it should fail. If it doesn't, your assertions are too weak.
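The difference is visible even outside pytest: run a weak check and a precise check against the intentionally broken return value (the dict below mirrors the intentional bug above):

```python
buggy_result = {"confidence": 2.5}  # the intentional bug: exceeds 1.0

# Weak assertion: still passes, so it would never catch this bug.
assert buggy_result is not None

# Precise assertion: the bound check correctly rejects the value.
in_range = 0.0 <= buggy_result["confidence"] <= 1.0
assert in_range is False  # the strong check detects the bug
```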

Step 4: Remove the intentional bug, test should pass

Now you have confidence the test actually works.


Common Testing Anti-Patterns

When writing tests, surface design problems - don't paper over them.

  • Mocking - Tests the mocks, not the code, and hides integration issues. Instead: use real services (sqlite_test_db, test_db); if the code is hard to test, fix the design.
  • Reading implementation first - Tests mirror HOW instead of WHAT, confirming current behavior without catching regressions. Instead: analyze the contract WITHOUT reading code; use the contract-extractor agent.
  • Tests that mirror implementation - Testing that a method calls BM25 then embeddings (HOW) instead of testing that it returns relevant results (WHAT). Instead: test observable contract behavior, not internal paths.
  • Weak assertions - assert result is not None says nothing. Instead: be precise, e.g. assert 0.0 <= result["confidence"] <= 1.0.
  • Only happy paths - Missing adversarial cases means bugs slip through. Instead: test failure cases: empty inputs, invalid values, boundary conditions.
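Replacing a mock with a real service can be as light as an in-memory SQLite database. A sketch, where the table schema and function names are illustrative rather than the skill's actual fixtures:

```python
import sqlite3

def make_test_db() -> sqlite3.Connection:
    """A real in-memory database: full SQL behavior, no mocks."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reminders (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
    )
    return conn

def test_reminder_round_trip():
    db = make_test_db()
    db.execute("INSERT INTO reminders (text) VALUES (?)", ("ship it",))
    rows = db.execute("SELECT text FROM reminders").fetchall()
    # Precise: exact content, not just 'not None'.
    assert rows == [("ship it",)]
```

Because the database is real, this test would catch SQL errors, constraint violations, and type coercion issues that a mocked connection would silently absorb.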

