reduce-unoptimized-query-oracle

2
0
Source

Reduce an unoptimized-query-oracle test failure log to the simplest possible reproduction case. Use when you have unoptimized-query-oracle*.log files from a failed roachtest and need to find the minimal SQL to reproduce the bug.

Install

mkdir -p .claude/skills/reduce-unoptimized-query-oracle && curl -L -o skill.zip "https://mcp.directory/api/skills/download/3423" && unzip -o skill.zip -d .claude/skills/reduce-unoptimized-query-oracle && rm skill.zip

Installs to .claude/skills/reduce-unoptimized-query-oracle

About this skill

Reduce Unoptimized Query Oracle Test Failure

Reduce an unoptimized-query-oracle test failure log to the simplest possible reproduction case.

The unoptimized-query-oracle roachtest runs a series of random SQL statements to create a random dataset, and then executes a random "Query of Interest" twice, with different optimization settings. If the two executions return different results, it indicates a bug in CockroachDB.

When to Use

Use this skill when:

  • You have a test failure from the unoptimized-query-oracle roachtest.
  • You need to find the minimal SQL to reproduce the test failure.

Step 1: Locate artifacts

Ask the user where the artifacts directory is.

Find the relevant files in the artifacts directory:

  • Test parameters: params.log (the parameters from the roachtest)
  • Test log: test.log (the log from the roachtest)
  • Failure log: failure*.log (the failure log from the roachtest)
  • Full SQL log: unoptimized-query-oracle*.log (the SQL statements that led to failure)
  • Query of interest log: unoptimized-query-oracle*.failure.log (containing the query of interest and possibly more information about the failure)
  • Cockroach log: logs/1.unredacted/cockroach.log or logs/unredacted/cockroach.log (contains the git commit)

Step 2: Determine test configuration

Determine the git commit from cockroach.log:

grep "binary: CockroachDB" cockroach.log

Look for the commit hash in the version string (e.g., cb94db961b8f55e3473f279d98ae90f0eeb0adcb).

Determine if runtime assertions are enabled by checking for:

  • "runtimeAssertionsBuild": "true" in params.log
  • or Runtime assertions enabled in test.log

Determine if metamorphic settings apply by looking for:

  • lines like these in params.log:
    "metamorphicBufferedSender": "true",
    "metamorphicWriteBuffering": "true",
    
  • or lines like these in test.log:
    metamorphically setting "kv.rangefeed.buffered_sender.enabled" to 'true'
    metamorphically setting "kv.transaction.write_buffering.enabled" to 'true'
    

Determine environment variables from the beginning of cockroach.log:

grep -A10 "using local environment variables:" cockroach.log

Important environment variables include:

  • COCKROACH_INTERNAL_CHECK_CONSISTENCY_FATAL
  • COCKROACH_INTERNAL_DISABLE_METAMORPHIC_TESTING
  • COCKROACH_RANDOM_SEED
  • COCKROACH_TESTING_FORCE_RELEASE_BRANCH But there might be more important environment variables, so best to get all of them.

Determine if this is a multi-region test or single-region test by checking:

  • the test name (e.g., seed-multi-region in test.log indicates multi-region)
  • or the presence of \connect lines in the full SQL log If both of these are missing, it's a single-region test.

Step 3: Check Out and Build

For a normal build use:

git checkout <commit-hash>
./dev build short

If runtime assertions were enabled, use a test build instead:

git checkout <commit-hash>
./dev build short -- --crdb_test

Note: Only build libgeos if the reproduction uses geospatial functions (BOX2D, geometry, geography, etc.):

./dev build libgeos

Step 4: Prepare the Full SQL Log File

First, check that the following statements are at the top of the full SQL log file. If they are not, add them:

SET statement_timeout='1m0s';
SET sql_safe_updates = false;

If metamorphic settings were used, also add them to the top of the full SQL log file:

SET CLUSTER SETTING kv.rangefeed.buffered_sender.enabled = true;
SET CLUSTER SETTING kv.transaction.write_buffering.enabled = true;

Create an appropriate directory either in the artifacts directory or in the repository root for holding temp files.

Step 5: Initial Reproduction

Determine the correct demo command based on test type:

  • Multi-region test: Use --nodes=9
  • Single-region test: Omit --nodes option

Use a command like this to try reproducing the test failure from the full SQL log file. This command could take up to 20 minutes to finish.

<env vars> ./cockroach demo --multitenant=false --nodes=9 --insecure --set=errexit=false --no-example-database --format=tsv -f <full-sql-log-file>

Check that the output reproduces the test failure described in the failure log. There are many possible failure modes. Look for one of the following, which should match the failure log:

  1. Different results between the two executions of the "Query of Interest" (which is the randomly generated SELECT statement repeated twice near the end of the log, wrapped in various SET and RESET staements). These different results could take the form of different result sets, or could also be an error in one case and no error in the other case. This is an "oracle" failure.
  2. Or, internal error or assertion failure. Note the error message for the reduce step.
  3. Or, a panic. Note the error message for the reduce step.
  4. Or, a timeout. Note the statement that timed out.

Troubleshooting

IMPORTANT: Many failures are nondeterministic, especially for multi-region tests. If no failure happens on the first run, try up to 10 times before concluding it doesn't reproduce.

It can be helpful at this point to compare the output with the failure*.log which should show the failure from the original test run.

If the initial run fails to reproduce after 10 times, pause here and report to the user that the failure cannot be reproduced, and show the command that was tried. The user might have additional instructions.

If it looks like it reproduces, it's time to move on to the next step.

Step 6: Use the Reduce Tool

Build the reduce tool:

./dev build reduce

Prepare the Full SQL Log File again

For multi-region tests, remove \connect lines (they cause syntax errors in the reduce tool):

grep -v '^\\connect' <full-sql-log-file> > <cleaned-log>

Run Reduce

IMPORTANT: The reduce tool must be run from the cockroach repository root directory, because it looks for ./cockroach in the current directory.

Use the -multi-region option for multi-region tests, or omit it for single-region tests.

For "oracle" failures (different results):

./bin/reduce -unoptimized-query-oracle -multi-region -chunk 25 -v -file <cleaned-log> 2>&1 | tee reduce-output.log

The -unoptimized-query-oracle option checks whether the two executions of the "Query of Interest" produce the same results.

For internal errors/assertion failures/panics:

./bin/reduce -contains "<error-regex>" -multi-region -chunk 25 -v -file <cleaned-log> 2>&1 | tee reduce-output.log

Use a distinctive part of the error message as the -contains regex (e.g., "nil LeafTxnInputState").

The reduce tool might take up to an hour to run.

Extract the Reduced SQL

The reduce tool outputs progress lines followed by the final SQL. Extract just the SQL:

grep -A1000 "^reduction: " reduce-output.log | tail -n +2 > reduced.sql

IMPORTANT: Immediately save a backup of the reduce output before manual simplification:

cp reduced.sql reduced_original.sql

This provides a recovery point if the working file gets corrupted during simplification.

If the reduce tool fails to reproduce, pause here and report this to the user. They might have additional instructions. Occasionally we have to modify the reduce tool itself, if the test failure is not reproducing.

Step 7: Create Test Script and Determine Reproduction Rate

IMPORTANT: Many bugs are nondeterministic. Before manual simplification, create a reusable test script and determine the reproduction rate.

Create a small test script (adjust as needed):

cat > test_repro.sh << 'EOF'
#!/bin/bash
# Test if reduced_v2.sql reproduces the error (exits on first success, up to 10 attempts)
for i in {1..10}; do
  if ./cockroach demo --multitenant=false --nodes=9 --insecure \
     --set=errexit=false --no-example-database --format=tsv \
     -f reduced_v2.sql 2>&1 | grep -q "<error-pattern>"; then
    echo "Run $i: REPRODUCED"
    exit 0
  else
    echo "Run $i: no error"
  fi
done
echo "FAILED"
EOF
chmod +x test_repro.sh

For "oracle" failures, instead of checking for an error pattern, the test script probably needs to isolate and diff the results of the two executions of the "Query of Interest".

Run the test script to determine the reproduction rate. It's not always 100%.

This rate determines how many attempts you need when testing simplifications:

  • 100% rate: Single attempt sufficient
  • 50% rate: 2-3 attempts usually sufficient
  • 10% rate: Need ~10 attempts to be confident
  • <5% rate: May need 20+ attempts

Note that in some cases, the following settings might need to be added back to the reduced file to get a repro:

SET statement_timeout='1m0s';
SET sql_safe_updates = false;

If the reduced SQL fails to reproduce after 10 attempts, pause here and report this to the user. They might have additional instructions.

Step 8: Manual Simplification

Now iteratively simplify the SQL while maintaining reproduction.

CRITICAL: For nondeterministic failures, you MUST test each simplification with enough attempts based on the repro rate. A single failed attempt does NOT mean the simplification broke the repro - it may just be nondeterminism.

Workflow for Each Simplification

  1. Copy reduced.sql to reduced_v2.sql
  2. Make ONE small change to reduced_v2.sql
  3. Run ./test_repro.sh (which tests reduced_v2.sql)
  4. If it reproduces: Copy reduced_v2.sql to reduced.sql, continue simplifying
  5. If it doesn't reproduce after enough attempts: Discard reduced_v2.sql, try a different change (i.e. backtrack).

This workflow avoids needing to restore files - you always keep the last working version in reduced.sql.

IMPORTANT: Run copy, edit, and test as separate bash commands (not chained with `&&


Content truncated.

You might also like

flutter-development

aj-geddes

Build beautiful cross-platform mobile apps with Flutter and Dart. Covers widgets, state management with Provider/BLoC, navigation, API integration, and material design.

643969

drawio-diagrams-enhanced

jgtolentino

Create professional draw.io (diagrams.net) diagrams in XML format (.drawio files) with integrated PMP/PMBOK methodologies, extensive visual asset libraries, and industry-standard professional templates. Use this skill when users ask to create flowcharts, swimlane diagrams, cross-functional flowcharts, org charts, network diagrams, UML diagrams, BPMN, project management diagrams (WBS, Gantt, PERT, RACI), risk matrices, stakeholder maps, or any other visual diagram in draw.io format. This skill includes access to custom shape libraries for icons, clipart, and professional symbols.

591705

ui-ux-pro-max

nextlevelbuilder

"UI/UX design intelligence. 50 styles, 21 palettes, 50 font pairings, 20 charts, 8 stacks (React, Next.js, Vue, Svelte, SwiftUI, React Native, Flutter, Tailwind). Actions: plan, build, create, design, implement, review, fix, improve, optimize, enhance, refactor, check UI/UX code. Projects: website, landing page, dashboard, admin panel, e-commerce, SaaS, portfolio, blog, mobile app, .html, .tsx, .vue, .svelte. Elements: button, modal, navbar, sidebar, card, table, form, chart. Styles: glassmorphism, claymorphism, minimalism, brutalism, neumorphism, bento grid, dark mode, responsive, skeuomorphism, flat design. Topics: color palette, accessibility, animation, layout, typography, font pairing, spacing, hover, shadow, gradient."

318398

godot

bfollington

This skill should be used when working on Godot Engine projects. It provides specialized knowledge of Godot's file formats (.gd, .tscn, .tres), architecture patterns (component-based, signal-driven, resource-based), common pitfalls, validation tools, code templates, and CLI workflows. The `godot` command is available for running the game, validating scripts, importing resources, and exporting builds. Use this skill for tasks involving Godot game development, debugging scene/resource files, implementing game systems, or creating new Godot components.

339397

nano-banana-pro

garg-aayush

Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K default, 2K, or 4K for high resolution). DO NOT read the image file first - use this skill directly with the --input-image parameter.

451339

fastapi-templates

wshobson

Create production-ready FastAPI projects with async patterns, dependency injection, and comprehensive error handling. Use when building new FastAPI applications or setting up backend API projects.

304231

Stay ahead of the MCP ecosystem

Get weekly updates on new skills and servers.