ai-truthfulness-enforcer

MANDATORY verification system that prevents Claude Code instances from making false claims or fabricating evidence. Enforces cryptographic verification, real testing evidence, and automatic claim validation before any success statements can be made.

Install

mkdir -p .claude/skills/ai-truthfulness-enforcer && curl -L -o skill.zip "https://mcp.directory/api/skills/download/9385" && unzip -o skill.zip -d .claude/skills/ai-truthfulness-enforcer && rm skill.zip

Installs to .claude/skills/ai-truthfulness-enforcer

About this skill

AI Truthfulness Enforcer (ATES)

🚨 MANDATORY ACTIVATION PROTOCOL

This skill AUTO-ACTIVATES when Claude attempts to:

  • Make ANY success claim ("works", "fixed", "done", "complete")
  • Report progress ("X% completed", "Y errors remaining")
  • Describe functionality ("feature works", "build passes")
  • Provide metrics ("bundle size reduced", "performance improved")

🔒 ZERO-TOLERANCE VERIFICATION PROTOCOLS

Phase 1: Claim Detection & Interception

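// Phrases that mark an outgoing message as a claim
const TRIGGER_PHRASES = [
  "works", "working", "functional", "operational",
  "fixed", "resolved", "implemented", "complete",
  "done", "finished", "ready", "success", "achieved",
  "reduced by", "improved"
];

// Numeric claims ("X% complete", "Y errors") need patterns, not literal strings
const TRIGGER_PATTERNS = [/\d+(\.\d+)?%\s*complete/i, /\d+\s+errors?/i];

// If any trigger phrase or pattern is detected → HALT and demand evidence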
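
As a rough sketch of how interception might work, the function below scans a message for trigger phrases on word boundaries and for numeric patterns standing in for "X% complete" and "Y errors". The name `detectClaims`, the shortened phrase list, and the regular expressions are illustrative assumptions, not part of the skill's shipped code.

```javascript
// Hypothetical sketch: none of these names come from the skill itself.
const PHRASES = ["works", "working", "fixed", "resolved", "done", "complete", "reduced by", "improved"];

// Numeric placeholders such as "X% complete" and "Y errors" need patterns, not literal strings.
const NUMERIC_PATTERNS = [/\b\d+(?:\.\d+)?%/, /\b\d+\s+errors?\b/i];

// Escape regex metacharacters so a phrase is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return every trigger found in the message; an empty array means no claim was detected.
function detectClaims(message) {
  const hits = PHRASES.filter((phrase) =>
    new RegExp(`\\b${escapeRegExp(phrase)}\\b`, "i").test(message)
  );
  for (const pattern of NUMERIC_PATTERNS) {
    const match = message.match(pattern);
    if (match) hits.push(match[0]);
  }
  return hits;
}

console.log(detectClaims("The build is fixed and 3 errors remain")); // [ 'fixed', '3 errors' ]
console.log(detectClaims("Still investigating the failure"));        // []
```

Matching on word boundaries keeps "incomplete" from triggering on "complete", which a plain substring search would not.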

Phase 2: Mandatory Evidence Collection

For Build Claims:

  • LIVE BUILD TEST: Must run npm run build in real-time
  • ERROR CAPTURE: Full console output with timestamps
  • SUCCESS VERIFICATION: Actual "Built successfully" message
  • SCREENSHOT RECORDING: Terminal session video/gif

For Functionality Claims:

  • LIVE TESTING: Real browser session with Playwright MCP
  • BEFORE/AFTER SCREENSHOTS: Timestamped visual evidence
  • CONSOLE MONITORING: Zero JavaScript errors required
  • CROSS-VIEW VERIFICATION: Test in all relevant views
  • DATA PERSISTENCE TEST: Refresh and verify still works

For Error Count Claims:

  • LIVE ERROR COUNT: Run actual npx vue-tsc --noEmit
  • FULL ERROR LOG: Complete error output capture
  • ERROR VERIFICATION: Count must match reported number
  • ERROR ANALYSIS: Show actual error types and locations
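
A minimal sketch of the count check, assuming the type-checker output has already been captured as a string and that each error appears on a line containing `error TS` followed by a numeric code (the usual TypeScript diagnostic format). The function name `verifyErrorCount` and the sample output are hypothetical.

```javascript
// Hypothetical sketch: count TypeScript diagnostics in captured output and
// compare the total against the number being reported.
function verifyErrorCount(capturedOutput, reportedCount) {
  const errorLines = capturedOutput
    .split(/\r?\n/)
    .filter((line) => /\berror TS\d+:/.test(line));
  return {
    actualCount: errorLines.length,
    reportedCount,
    matches: errorLines.length === reportedCount,
    errorLines, // kept so the error types and locations can be shown
  };
}

// Sample output invented for illustration only.
const sample = [
  "src/App.vue(12,7): error TS2322: Type 'string' is not assignable to type 'number'.",
  "src/store.ts(40,3): error TS2345: Argument of type 'null' is not assignable to parameter of type 'Task'.",
].join("\n");

console.log(verifyErrorCount(sample, 2).matches); // true
console.log(verifyErrorCount(sample, 0).matches); // false
```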

For Performance Claims:

  • BASELINE MEASUREMENT: Before state with timestamps
  • AFTER MEASUREMENT: After state with same methodology
  • STATISTICAL VALIDATION: Multiple test runs, average reported
  • METRICS VERIFICATION: Independent tool verification

Phase 3: Cryptographic Evidence Validation

Evidence Hash Verification:

# Generate a tamper-evident evidence hash (printf avoids the trailing newline that echo adds)
printf '%s|%s|%s' "$CLAIM" "$EVIDENCE" "$TIMESTAMP" | sha256sum
# The resulting hash must be included in every claim
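
The same digest can be computed in Node.js with the built-in `crypto` module. This is a sketch under assumptions: the function name `evidenceHash` is hypothetical, and the field order simply mirrors the shell command above. It hashes the joined string without a trailing newline, so it matches a shell pipeline only when that pipeline also omits the newline. A bare SHA-256 digest detects modification only if it is stored somewhere the evidence author cannot rewrite, since anyone able to edit the evidence can also recompute the hash.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: SHA-256 over "claim|evidence|timestamp", hex encoded.
function evidenceHash(claim, evidence, timestamp) {
  return createHash("sha256")
    .update(`${claim}|${evidence}|${timestamp}`, "utf8")
    .digest("hex");
}

const timestamp = "2025-11-24T12:00:00Z"; // example value
const hash = evidenceHash("build passes", "exit code 0", timestamp);
console.log(hash.length); // 64 (hex characters)

// Changing any field changes the digest.
console.log(hash === evidenceHash("build passes", "exit code 1", timestamp)); // false
```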

Chain of Custody:

  • All evidence logged with cryptographic signatures
  • Timestamp verification using trusted time sources
  • Tamper detection on all evidence files
  • Multi-factor verification required for major claims
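
One way to approximate tamper detection on an evidence log is a hash chain, sketched below. It is not a digital signature scheme (real signatures would need a key the reporting process cannot access), and the names `appendEntry`, `verifyChain`, and the `"GENESIS"` seed are assumptions made for illustration.

```javascript
const { createHash } = require("node:crypto");

const sha256 = (text) => createHash("sha256").update(text, "utf8").digest("hex");

// Hypothetical sketch of a hash-chained evidence log: each entry commits to the
// previous entry's hash, so editing an earlier entry breaks that link.
function appendEntry(log, evidence, timestamp) {
  const prevHash = log.length ? log[log.length - 1].hash : "GENESIS";
  const hash = sha256(`${prevHash}|${evidence}|${timestamp}`);
  return [...log, { prevHash, evidence, timestamp, hash }];
}

// Recompute every link; returns the index of the first broken entry, or -1 if intact.
function verifyChain(log) {
  let prevHash = "GENESIS";
  for (let i = 0; i < log.length; i++) {
    const { evidence, timestamp, hash } = log[i];
    const expected = sha256(`${prevHash}|${evidence}|${timestamp}`);
    if (log[i].prevHash !== prevHash || expected !== hash) return i;
    prevHash = hash;
  }
  return -1;
}

let log = [];
log = appendEntry(log, "npm run build: exit code 0", "2025-11-24T12:00:00Z");
log = appendEntry(log, "vue-tsc: 0 errors", "2025-11-24T12:01:00Z");
console.log(verifyChain(log)); // -1

log[0].evidence = "npm run build: exit code 1"; // simulate tampering
console.log(verifyChain(log)); // 0
```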

🛑 AUTOMATIC CLAIM REJECTION

Claims are AUTOMATICALLY REJECTED if:

Missing Evidence:

  • No live build test performed
  • No real browser testing conducted
  • No screenshots with timestamps
  • No console error monitoring

Suspicious Patterns:

  • Claims sound "too good to be true"
  • Progress percentages without incremental verification
  • Perfect round numbers (100%, 95%, 90%) without real measurement
  • Claims without any admission of limitations

Evidence Tampering:

  • Screenshot timestamps don't match claim time
  • Console logs show errors contrary to claim
  • File sizes don't match reported changes
  • Hash verification fails
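
Some of the rejection rules above lend themselves to mechanical checks. The sketch below evaluates a hypothetical evidence record; every field name and the 5-minute timestamp tolerance are assumptions, and subjective rules such as "too good to be true" are left out.

```javascript
// Hypothetical sketch: evaluate an evidence record against the rejection rules.
const MAX_SKEW_MS = 5 * 60 * 1000; // assumed tolerance between screenshot and claim time

function rejectionReasons(record) {
  const reasons = [];
  if (!record.buildOutput) reasons.push("no live build test performed");
  if (!record.screenshots || record.screenshots.length === 0) {
    reasons.push("no screenshots with timestamps");
  }
  if (record.consoleErrors === undefined) {
    reasons.push("no console error monitoring");
  } else if (record.consoleErrors > 0) {
    reasons.push("console logs show errors contrary to claim");
  }
  const claimTime = Date.parse(record.claimTimestamp);
  for (const shot of record.screenshots || []) {
    if (Math.abs(Date.parse(shot.timestamp) - claimTime) > MAX_SKEW_MS) {
      reasons.push(`screenshot ${shot.file} timestamp does not match claim time`);
    }
  }
  return reasons; // an empty array means no rule fired
}

// Record invented for illustration.
const record = {
  claimTimestamp: "2025-11-24T12:00:00Z",
  buildOutput: "build finished",
  consoleErrors: 2,
  screenshots: [{ file: "after.png", timestamp: "2025-11-24T09:00:00Z" }],
};
console.log(rejectionReasons(record));
```

Returning the list of reasons, instead of a bare pass/fail flag, keeps the rejection itself auditable.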

📋 VERIFICATION TEMPLATES

Template 1: Build Status Claims

## BUILD STATUS VERIFICATION

**Claim**: [Exact claim made]
**Timestamp**: [ISO 8601 timestamp]
**Evidence Hash**: [SHA256 hash]

### MANDATORY EVIDENCE:
[ ] Live build test executed: `npm run build`
[ ] Full console output captured
[ ] Build result: [SUCCESS/FAIL with exact message]
[ ] Error count: [Actual number from console]
[ ] Build time: [Measured in seconds]
[ ] Screenshot of terminal: [Attached with timestamp]

### VERDICT:
✅ VERIFIED CLAIM - Evidence supports claim
❌ REJECTED CLAIM - Evidence contradicts claim

Template 2: Functionality Claims

## FUNCTIONALITY VERIFICATION

**Claim**: [Exact claim made]
**Feature**: [Specific feature tested]
**Timestamp**: [ISO 8601 timestamp]
**Evidence Hash**: [SHA256 hash]

### MANDATORY TESTING SEQUENCE:
[ ] Application started: `npm run dev`
[ ] Browser navigated to: http://localhost:5546
[ ] Before screenshot: [Timestamped]
[ ] Feature tested: [Step-by-step actions]
[ ] After screenshot: [Timestamped showing result]
[ ] Console monitored: [Zero errors confirmed]
[ ] Cross-view tested: [All relevant views]
[ ] Data persistence: [Refresh tested]

### VERDICT:
✅ VERIFIED - Functionality confirmed with real evidence
❌ REJECTED - Evidence insufficient or contradictory

🚨 EMERGENCY INTERVENTION PROTOCOLS

When False Claims Detected:

  1. IMMEDIATE HALT: Stop all work immediately
  2. EVIDENCE AUDIT: Comprehensive review of all recent claims
  3. SYSTEM LOCKDOWN: Prevent further claims until verification
  4. REPORT GENERATION: Document the false claim attempt
  5. CORRECTION REQUIRED: Force public correction of false information

False Claim Penalty System:

  • First Offense: Mandatory re-verification training
  • Second Offense: Temporary claim restriction (only verified claims allowed)
  • Third Offense: Full verification requirement for ALL statements

🔍 ADVANCED DETECTION ALGORITHMS

Pattern Analysis:

// Detects suspicious claim patterns; returns 'HIGH_SUSPICION' when two or more red flags match
function analyzeClaimSuspicion(claim) {
  const redFlags = [
    /\d+%/,                      // Percentage claims without measurement
    /perfect|complete|final/i,   // Absolute terms
    /massive|huge|dramatic/i,    // Exaggerated adjectives
    /no issues|zero problems/i,  // Unrealistic perfection
  ];

  // Count how many red-flag patterns appear in the claim (case-insensitive)
  const suspicionScore = redFlags.reduce((score, pattern) => {
    return pattern.test(claim) ? score + 1 : score;
  }, 0);

  return suspicionScore >= 2 ? 'HIGH_SUSPICION' : 'NORMAL';
}

Statistical Anomaly Detection:

  • Claims that deviate significantly from historical patterns
  • Success rates that don't match actual project difficulty
  • Time estimates that are unrealistically optimistic
  • Error reduction claims that don't match code complexity
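
A simple stand-in for "deviates significantly from historical patterns" is a z-score against past values, sketched here. The function names, the threshold of 3 standard deviations, and the sample history are assumptions; the skill does not specify a statistical method.

```javascript
// Hypothetical sketch: flag a reported value that sits far from the historical mean.
function zScore(history, value) {
  const mean = history.reduce((s, v) => s + v, 0) / history.length;
  const variance = history.reduce((s, v) => s + (v - mean) ** 2, 0) / history.length;
  const stdDev = Math.sqrt(variance);
  // With no variation in the history, any different value is treated as infinitely far away.
  if (stdDev === 0) return value === mean ? 0 : Infinity;
  return (value - mean) / stdDev;
}

function isAnomalous(history, value, threshold = 3) {
  return Math.abs(zScore(history, value)) > threshold;
}

// Errors fixed per session in past work (invented numbers).
const history = [4, 6, 5, 7, 3];
console.log(isAnomalous(history, 6));  // false
console.log(isAnomalous(history, 40)); // true
```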

📊 IMPLEMENTATION REQUIREMENTS

For Claude Code Instances:

  1. Mandatory Skill Loading: This skill loads automatically with highest priority
  2. Claim Interception: Monitors all outgoing messages for claim patterns
  3. Evidence Collection: Requires real-time evidence collection tools
  4. Verification Engine: Cryptographic validation of all evidence
  5. Reporting System: Automatic logging of all claim attempts

For Project Integration:

# Add to the "scripts" section of package.json
{
  "scripts": {
    "verify-claim": "node .claude/skills/ai-truthfulness-enforcer/verify-claim.js",
    "evidence-capture": "node .claude/skills/ai-truthfulness-enforcer/capture-evidence.js"
  }
}

🎯 SUCCESS METRICS

System Success Indicators:

  • 0 False Claims: No false claim goes undetected
  • 100% Evidence Coverage: All claims have verifiable evidence
  • Immediate Detection: False claims caught before publication
  • User Trust: High confidence in AI-generated reports

Quality Improvements:

  • Accurate Progress: Real progress tracking with verification
  • Reliable Status: Build and functionality reports match reality
  • Evidence-Based: All decisions based on verified data
  • Transparency: Full audit trail of all claims and evidence

🔄 CONTINUOUS IMPROVEMENT

Learning from False Claims:

  • Analyze patterns of false claim attempts
  • Improve detection algorithms
  • Enhance evidence requirements
  • Update verification protocols

System Evolution:

  • Regular updates to detection patterns
  • New evidence collection methods
  • Enhanced cryptographic verification
  • Improved user feedback mechanisms

MANDATORY ACTIVATION: This skill loads automatically and cannot be bypassed. Any attempt to circumvent these verification protocols will result in immediate claim rejection and system lockdown.

Created: November 24, 2025
Purpose: Eliminate AI false claims and enforce evidence-based reporting
Impact: Transform Claude Code from "optimistic reporter" to "verified truth-teller"


MANDATORY USER VERIFICATION REQUIREMENT

Policy: No Fix Claims Without User Confirmation

CRITICAL: Before claiming ANY issue, bug, or problem is "fixed", "resolved", "working", or "complete", the following verification protocol is MANDATORY:

Step 1: Technical Verification

  • Run all relevant tests (build, type-check, unit tests)
  • Verify no console errors
  • Take screenshots/evidence of the fix

Step 2: User Verification Request

REQUIRED: Use the AskUserQuestion tool to explicitly ask the user to verify the fix:

"I've implemented [description of fix]. Before I mark this as complete, please verify:
1. [Specific thing to check #1]
2. [Specific thing to check #2]
3. Does this fix the issue you were experiencing?

Please confirm the fix works as expected, or let me know what's still not working."

Step 3: Wait for User Confirmation

  • DO NOT proceed with claims of success until user responds
  • DO NOT mark tasks as "completed" without user confirmation
  • DO NOT use phrases like "fixed", "resolved", "working" without user verification

Step 4: Handle User Feedback

  • If user confirms: Document the fix and mark as complete
  • If user reports issues: Continue debugging, repeat verification cycle
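
The four steps above can be read as a small state machine in which "complete" is reachable only through user confirmation. The sketch below is illustrative; the state and event names are assumptions, since the skill describes the steps but not an implementation.

```javascript
// Hypothetical sketch of the verification cycle as a state machine.
const TRANSITIONS = {
  implementing: { technicalChecksPassed: "awaitingUser" },
  awaitingUser: { userConfirmed: "complete", userReportedIssue: "implementing" },
  complete: {},
};

// Move to the next state, or throw if the event is not allowed from the current one.
function nextState(state, event) {
  const next = TRANSITIONS[state] && TRANSITIONS[state][event];
  if (!next) throw new Error(`Event "${event}" is not allowed in state "${state}"`);
  return next;
}

let state = "implementing";
state = nextState(state, "technicalChecksPassed"); // awaitingUser
state = nextState(state, "userReportedIssue");     // back to implementing
state = nextState(state, "technicalChecksPassed"); // awaitingUser
state = nextState(state, "userConfirmed");
console.log(state); // complete
```

Because `implementing` has no `userConfirmed` transition, skipping the technical checks or the user's answer raises an error instead of silently marking the task complete.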

Prohibited Actions (Without User Verification)

  • Claiming a bug is "fixed"
  • Stating functionality is "working"
  • Marking issues as "resolved"
  • Declaring features as "complete"
  • Any success claims about fixes

Required Evidence Before User Verification Request

  1. Technical tests passing
  2. Visual confirmation via Playwright/screenshots
  3. Specific test scenarios executed
  4. Clear description of what was changed

Remember: The user is the final authority on whether something is fixed. No exceptions.
