model-debugging


Debug and diagnose model errors in Pollinations services. Analyze logs, find error patterns, identify affected users. For taking action on user tiers, see tier-management skill.

Install

mkdir -p .claude/skills/model-debugging && curl -L -o skill.zip "https://mcp.directory/api/skills/download/4062" && unzip -o skill.zip -d .claude/skills/model-debugging && rm skill.zip

Installs to .claude/skills/model-debugging

About this skill

Model Debugging Skill

Use this skill when:

  • Investigating model failures, high error rates, or service issues
  • Finding users affected by errors (402 billing, 403 permissions, 500 backend)
  • Analyzing Tinybird/Cloudflare logs for patterns
  • Diagnosing specific request failures

Related skill: Use tier-management to upgrade users or check balances after identifying issues here.


Understanding Model Monitor Error Rates

Why does the Model Monitor show high error rates when models work fine manually?

The Model Monitor at https://monitor.pollinations.ai shows all real-world traffic, including:

  • 401 errors: Anonymous users without API keys (most common)
  • 402 errors: Users with insufficient pollen balance or exhausted API key budget
  • 403 errors: Users denied access to specific models (API key restrictions)
  • 400 errors: Invalid request parameters (e.g., openai-audio without modalities param)
  • 429 errors: Rate-limited requests
  • 500/504 errors: Actual backend failures (investigate these)

When you test manually with a valid secret key (sk_), you bypass auth/quota issues, so models appear to work fine.

Key insight: High 401/402/403/400 rates are expected from real-world usage. Focus investigation on 500/504 errors.
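The triage rule above can be sketched as a small shell helper. This is only an illustration: the function name `classify_status` and the labels it prints are assumptions, not part of the Pollinations tooling. The status groupings come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: label an HTTP status according to the triage rule above.
classify_status() {
  case "$1" in
    400|401|402|403) echo "expected" ;;      # auth, quota, and parameter errors from real-world traffic
    429)             echo "rate-limited" ;;  # rate-limited requests
    500|504)         echo "investigate" ;;   # actual backend failures
    *)               echo "other" ;;         # anything the list above does not cover
  esac
}
```

For example, `classify_status 504` prints `investigate`, while `classify_status 401` prints `expected`.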


Data Flow Architecture

User Request → enter.pollinations.ai (Cloudflare Worker)
                    ↓
              Logs to Cloudflare Workers Observability
                    ↓
              Events stored in D1 database
                    ↓
              Batched to Tinybird (async, 100-500 events)
                    ↓
              Model Monitor queries Tinybird (model_health.pipe)

Structured Logging: enter.pollinations.ai uses LogTape with:

  • requestId: Unique per request (passed to downstream via x-request-id header)
  • status, body: Full error response from downstream services
  • Context: method, routePath, userAgent, ipAddress

Quick Diagnostics

1. Check Model Monitor

View current model health at: https://monitor.pollinations.ai

2. Query Recent Errors from D1 Database

# Via enter.pollinations.ai worker (requires wrangler)
cd enter.pollinations.ai
npx wrangler d1 execute pollinations-db --remote --command "SELECT model_requested, response_status, error_message, COUNT(*) as count FROM event WHERE response_status >= 400 AND created_at > datetime('now', '-1 hour') GROUP BY model_requested, response_status, error_message ORDER BY count DESC LIMIT 20"
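
Because 500/504 errors are the ones worth investigating, a narrower variant of the query above may be useful. The sketch below only builds the SQL string. The function name `build_5xx_query` is hypothetical, and the table and column names are copied from the query above rather than verified against the schema.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a SQL query for 5xx errors in the last N hours.
# Table and column names are taken from the D1 query shown above.
build_5xx_query() {
  local hours="${1:-1}"
  # Accept digits only, so the value is safe to splice into the SQL text.
  case "$hours" in
    ''|*[!0-9]*) echo "hours must be a whole number" >&2; return 1 ;;
  esac
  printf "SELECT model_requested, response_status, error_message, COUNT(*) AS count FROM event WHERE response_status >= 500 AND created_at > datetime('now', '-%s hours') GROUP BY model_requested, response_status, error_message ORDER BY count DESC LIMIT 20" "$hours"
}

# Usage (not run here; requires wrangler and access to the database):
#   npx wrangler d1 execute pollinations-db --remote --command "$(build_5xx_query 6)"
```

Rejecting non-numeric input keeps the helper from passing arbitrary text into the `--command` string.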

3. Capture Live Logs

enter.pollinations.ai (Cloudflare Worker)

cd enter.pollinations.ai
wrangler tail --format json | tee logs.jsonl
# Or with formatting:
wrangler tail --format json | npx tsx scripts/format-logs.ts
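
Since each request carries a unique requestId, one way to trace a single failure through a captured logs.jsonl is a fixed-string search for that ID. This is a minimal sketch: the function name `lines_for_request` is hypothetical, and it deliberately makes no assumption about which JSON field holds the ID in the tail output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every captured log line that mentions a request ID.
# A fixed-string match (-F) is used, so no JSON layout is assumed.
lines_for_request() {
  local request_id="$1" log_file="${2:-logs.jsonl}"
  grep -F -- "$request_id" "$log_file"
}

# Usage (assumes logs.jsonl was captured with the tee command above):
#   lines_for_request "<request-id>" logs.jsonl
```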

image.pollinations.ai (EC2 systemd)

# Real-time logs
ssh enter-services "sudo journalctl -u image-pollinations.service -f"

# Last 3 minutes
ssh enter-services "sudo journalctl -u image-pollinations.service --since '3 minutes ago' --no-pager" > image-service-logs.txt

# Recent errors only
ssh enter-services "sudo journalctl -u image-pollinations.service -p err -n 50"

text.pollinations.ai (EC2 systemd)

# Real-time logs
ssh enter-services "sudo journalctl -u text-pollinations.service -f"

# Last 3 minutes
ssh enter-services "sudo journalctl -u text-pollinations.service --since '3 minutes ago' --no-pager" > text-service-logs.txt

Common Error Patterns

Azure Content Safety DNS Failure

  • Error: getaddrinfo ENOTFOUND gptimagemain1-resource.cognitiveservices.azure.com
  • Cause: Azure Content Safety resource deleted or misconfigured
  • Impact: Fail-open (content proceeds without safety check)
  • Fix: Create new Azure Content Safety resource and update .env:

AZURE_CONTENT_SAFETY_ENDPOINT=https://<new-resource>.cognitiveservices.azure.com/
AZURE_CONTENT_SAFETY_API_KEY=<new-key>

Azure Kontext Content Filter

  • Error: Content rejected due to sexual/hate/violence content detection
  • Cause: Azure's content moderation blocking prompts/images
  • Impact: 400 error returned to user
  • Fix: User error - prompt violates content policy

Vertex AI Invalid Image

  • Error: Provided image is not valid
  • Cause: User passing unsupported image URL (e.g., Google Drive links)
  • Impact: 400 error returned to user
  • Fix: User error - need direct image URL

Translation Service Down

  • Error: No active translate servers available
  • Cause: Translation service unavailable
  • Impact: Prompts not translated (non-fatal)
  • Fix: Check translation service status

OpenAI Audio Invalid Voice

  • Error: Invalid value for audio.voice
  • Cause: User requesting unsupported voice name
  • Impact: 400 error returned to user
  • Fix: User error - use supported voices: alloy, echo, fable, onyx, nova, shimmer, coral, verse, ballad, ash, sage, etc.

Veo No Video Data

  • Error: No video data in response
  • Cause: Vertex AI returned empty video response
  • Impact: 500 error
  • Fix: Check Vertex AI quota/status, may be transient


Environment Variables to Check

image.pollinations.ai

ssh enter-services "cat /home/ubuntu/pollinations/image.pollinations.ai/.env | grep -E 'AZURE|GOOGLE|CLOUDFLARE'"

Key variables:

  • AZURE_CONTENT_SAFETY_ENDPOINT - Azure Content Safety API endpoint
  • AZURE_CONTENT_SAFETY_API_KEY - Azure Content Safety API key
  • GOOGLE_PROJECT_ID - Google Cloud project for Vertex AI
  • AZURE_MYCELI_FLUX_KONTEXT_ENDPOINT - Azure Kontext model endpoint

text.pollinations.ai

ssh enter-services "cat /home/ubuntu/pollinations/text.pollinations.ai/.env | grep -E 'AZURE|OPENAI|GOOGLE'"

Updating Secrets

Secrets are stored encrypted with SOPS:

  • image.pollinations.ai/secrets/env.json
  • text.pollinations.ai/secrets/env.json

To update:

# Decrypt, edit, re-encrypt
sops image.pollinations.ai/secrets/env.json

# Deploy to server
sops --output-type dotenv -d image.pollinations.ai/secrets/env.json > /tmp/image.env
scp /tmp/image.env enter-services:/home/ubuntu/pollinations/image.pollinations.ai/.env
rm /tmp/image.env

# Restart service
ssh enter-services "sudo systemctl restart image-pollinations.service"

Log Analysis Commands

# Count errors by type
grep -i "error" image-service-logs.txt | grep -oE "(Azure Flux Kontext|Vertex AI|No active translate|getaddrinfo ENOTFOUND)" | sort | uniq -c | sort -rn

# Find content filter rejections
grep -i "Content rejected" image-service-logs.txt | sort | uniq -c

# Check DNS resolution on server
ssh enter-services "nslookup gptimagemain1-resource.cognitiveservices.azure.com"
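
The commands above count matches one pattern at a time. As a rough sketch, the known patterns can also be checked in one pass, with the matching fix hint printed next to each count. The function name `summarize_known_errors` is hypothetical. The patterns and hints are copied from the Common Error Patterns section and may not cover every log format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count known error patterns in a log file and print
# "count<TAB>pattern<TAB>hint" for each pattern that occurs at least once.
summarize_known_errors() {
  local log_file="$1" pattern hint count
  [ -r "$log_file" ] || { echo "cannot read: $log_file" >&2; return 1; }
  while IFS='|' read -r pattern hint; do
    # grep -c exits non-zero when the count is 0, so tolerate that case.
    count=$(grep -cF -- "$pattern" "$log_file" || true)
    if [ "$count" -gt 0 ]; then
      printf '%s\t%s\t%s\n' "$count" "$pattern" "$hint"
    fi
  done <<'EOF'
getaddrinfo ENOTFOUND|Check the Azure Content Safety endpoint in .env
Content rejected|User error: prompt violates content policy
Provided image is not valid|User error: needs a direct image URL
No active translate servers available|Check translation service status
Invalid value for audio.voice|User error: use a supported voice
No video data in response|Check Vertex AI quota/status, may be transient
EOF
}

# Usage (assumes the file captured in "Capture Live Logs"):
#   summarize_known_errors image-service-logs.txt
```

Patterns with no matches are omitted, so an empty result means none of the known patterns occurred in that capture.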

Model-Specific Debugging

Model        | Backend            | Common Issues
flux         | Azure/Replicate    | Rate limits, content filter
kontext      | Azure Flux Kontext | Content filter (strict)
nanobanana   | Vertex AI Gemini   | Invalid image URLs, content filter
seedream-pro | ByteDance ARK      | NSFW filter, API key issues
veo          | Vertex AI          | Quota, empty responses
openai-audio | Azure OpenAI       | Invalid voice names
deepseek     | DeepSeek API       | Rate limits, API key

Cloudflare Workers Observability API

The enter.pollinations.ai worker has structured logging enabled. You can query logs programmatically via the Cloudflare Workers Observability API.

Prerequisites

1. Get Account ID

# From wrangler.toml
grep account_id enter.pollinations.ai/wrangler.toml

# Or from existing .env
grep CLOUDFLARE_ACCOUNT_ID image.pollinations.ai/.env

2. Create API Token with Workers Observability Permission

Via Cloudflare Dashboard:

  1. Go to https://dash.cloudflare.com/profile/api-tokens
  2. Click Create Token
  3. Click Create Custom Token
  4. Configure:
    • Token name: Workers Observability Read
    • Permissions:
      • Account → Workers Scripts → Read
      • Account → Workers Observability → Edit (required for query API)
    • Account Resources: Include → Your Account
  5. Click Continue to summary, then Create Token
  6. Copy the token immediately (shown only once)

3. Store Token Securely

The token is stored in SOPS-encrypted secrets:

  • Location: enter.pollinations.ai/secrets/env.json
  • Key: CLOUDFLARE_OBSERVABILITY_TOKEN

To add/update:

# Step 1: Decrypt to temp file
cd /path/to/pollinations
sops -d enter.pollinations.ai/secrets/env.json > /tmp/env.json

# Step 2: Add the token (use jq)
jq '. + {"CLOUDFLARE_OBSERVABILITY_TOKEN": "your_token"}' /tmp/env.json > /tmp/env_updated.json

# Step 3: Re-encrypt (must rename to match .sops.yaml pattern)
cp /tmp/env_updated.json /tmp/env.json
sops -e /tmp/env.json > enter.pollinations.ai/secrets/env.json

# Step 4: Cleanup
rm /tmp/env.json /tmp/env_updated.json

# Verify
sops -d enter.pollinations.ai/secrets/env.json | jq 'keys'

Note: The .sops.yaml config only matches filenames that fit the env.json$ pattern, which is why the temp file is renamed before re-encrypting.

API Endpoint

POST https://api.cloudflare.com/client/v4/accounts/{account_id}/workers/observability/telemetry/query

Query Examples

Setup: Get Credentials from SOPS

# Extract credentials from encrypted secrets
ACCOUNT_ID=$(sops -d enter.pollinations.ai/secrets/env.json | jq -r '.CLOUDFLARE_ACCOUNT_ID')
API_TOKEN=$(sops -d enter.pollinations.ai/secrets/env.json | jq -r '.CLOUDFLARE_OBSERVABILITY_TOKEN')

List Available Log Keys (Working)

This endpoint works and shows what fields are available:

curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/observability/telemetry/keys" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"timeframe": {"from": '$(( $(date +%s) - 86400 ))'000, "to": '$(date +%s)'000}, "datasets": ["workers"]}' | jq '.result[:10]'
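
The inline arithmetic in the request body above builds epoch-millisecond timestamps by appending 000 to epoch seconds. A small helper can make the window explicit. This is a sketch only: the function name `timeframe_json` and its optional second argument (a fixed "now" in epoch seconds, for reproducible output) are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "timeframe" JSON object for the last N seconds.
# Timestamps are epoch milliseconds, matching the request body above.
timeframe_json() {
  local window_s="${1:-86400}" now_s="${2:-$(date +%s)}"
  printf '{"from": %s, "to": %s}' "$(( (now_s - window_s) * 1000 ))" "$(( now_s * 1000 ))"
}

# Usage inside the curl call above, here for a 15-minute window:
#   -d "{\"timeframe\": $(timeframe_json 900), \"datasets\": [\"workers\"]}"
```

With a fixed clock, `timeframe_json 900 1700000000` prints `{"from": 1699999100000, "to": 1700000000000}`.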

Query Recent Errors (Last 15 Minutes)

Note: The /query endpoint requires a saved queryId. For ad-hoc queries, use the


Content truncated.
