fuzzing-dictionary


Fuzzing dictionaries guide fuzzers with domain-specific tokens. Use when fuzzing parsers, protocols, or format-specific code.

Install

mkdir -p .claude/skills/fuzzing-dictionary && curl -L -o skill.zip "https://mcp.directory/api/skills/download/3782" && unzip -o skill.zip -d .claude/skills/fuzzing-dictionary && rm skill.zip

Installs to .claude/skills/fuzzing-dictionary

About this skill

Fuzzing Dictionary

A fuzzing dictionary provides domain-specific tokens to guide the fuzzer toward interesting inputs. Instead of purely random mutations, the fuzzer incorporates known keywords, magic numbers, protocol commands, and format-specific strings that are more likely to reach deeper code paths in parsers, protocol handlers, and file format processors.

Overview

Dictionaries are text files containing quoted strings that represent meaningful tokens for your target. They help fuzzers bypass early validation checks and explore code paths that would be difficult to reach through blind mutation alone.

Key Concepts

Concept               Description
Dictionary Entry      A quoted string (e.g., "keyword") or key-value pair (e.g., kw="value")
Hex Escapes           Byte sequences like "\xF7\xF8" for non-printable characters
Token Injection       The fuzzer inserts dictionary entries into generated inputs
Cross-Fuzzer Format   The same dictionary file works with libFuzzer, AFL++, and cargo-fuzz

When to Apply

Apply this technique when:

  • Fuzzing parsers (JSON, XML, config files)
  • Fuzzing protocol implementations (HTTP, DNS, custom protocols)
  • Fuzzing file format handlers (PNG, PDF, media codecs)
  • Coverage plateaus early without reaching deeper logic
  • Target code checks for specific keywords or magic values

Skip this technique when:

  • Fuzzing pure algorithms without format expectations
  • Target has no keyword-based parsing
  • Corpus already achieves high coverage

Quick Reference

Task                   Command/Pattern
Use with libFuzzer     ./fuzz -dict=./dictionary.dict ...
Use with AFL++         afl-fuzz -x ./dictionary.dict ...
Use with cargo-fuzz    cargo fuzz run fuzz_target -- -dict=./dictionary.dict
Extract from header    grep -o '"[^"]*"' header.h | sort -u > header.dict
Generate from binary   strings ./binary | sed 's/\\/\\\\/g; s/"/\\"/g; s/^/"/; s/$/"/' | sort -u > strings.dict

Step-by-Step

Step 1: Create Dictionary File

Create a text file with one quoted string per line. Use comment lines (starting with #) for documentation.

Example dictionary format:

# Lines starting with '#' and empty lines are ignored.

# Adds "blah" (w/o quotes) to the dictionary.
kw1="blah"
# Use \\ for backslash and \" for quotes.
kw2="\"ac\\dc\""
# Use \xAB for hex values
kw3="\xF7\xF8"
# the name of the keyword followed by '=' may be omitted:
"foo\x0Abar"

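The format above is small enough to check before a fuzzing run. The following Python sketch parses a dictionary and decodes each entry into the bytes a fuzzer would insert. It assumes that only the three escapes shown above (\\, \" and \xAB) are valid and that keyword names consist of word characters; `parse_dictionary` and `decode_value` are hypothetical helper names, not part of any fuzzer.

```python
import re

# Optional name followed by '=', then a double-quoted value.
# Inside the quotes only \\, \" and \xNN escapes are accepted (assumption
# based on the format description above).
_ENTRY = re.compile(
    r'^(?:(\w+)\s*=\s*)?"((?:[^"\\]|\\\\|\\"|\\x[0-9A-Fa-f]{2})*)"\s*$'
)
_ESCAPE = re.compile(r'\\(\\|"|x[0-9A-Fa-f]{2})')


def decode_value(raw: str) -> bytes:
    """Turn the text between the quotes into raw bytes."""
    out = bytearray()
    pos = 0
    for match in _ESCAPE.finditer(raw):
        out += raw[pos:match.start()].encode("utf-8")
        esc = match.group(1)
        if esc[0] == "x":
            out.append(int(esc[1:], 16))   # \xNN becomes one byte
        else:
            out += esc.encode("ascii")     # \\ or \" becomes the bare character
        pos = match.end()
    out += raw[pos:].encode("utf-8")
    return bytes(out)


def parse_dictionary(text: str) -> list[bytes]:
    """Return decoded tokens; raise ValueError naming the first malformed line."""
    tokens = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # comments and blank lines are ignored
        match = _ENTRY.match(stripped)
        if match is None:
            raise ValueError(f"line {number}: not a valid entry: {stripped!r}")
        tokens.append(decode_value(match.group(2)))
    return tokens
```

Calling `parse_dictionary(open("dictionary.dict").read())` on a file rejects entries that use escapes such as \r or \n, which are easy to write by habit and are not among the escapes described above.
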
Step 2: Generate Dictionary Content

Choose a generation method based on what's available:

From LLM: Prompt ChatGPT or Claude with:

A dictionary can be used to guide the fuzzer. Write me a dictionary file for fuzzing a <PNG parser>. Each line should be a quoted string or key-value pair like kw="value". Include magic bytes, chunk types, and common header values. Use hex escapes like "\xF7\xF8" for binary values.

From header files:

grep -o '"[^"]*"' header.h | sort -u > header.dict

From man pages (for CLI tools):

man curl | grep -oP '^\s*(--|-)\K\S+' | sed 's/[,.]$//' | sed 's/^/"&/; s/$/&"/' | sort -u > man.dict

From binary strings:

strings ./binary | sed 's/\\/\\\\/g; s/"/\\"/g; s/^/"/; s/$/"/' | sort -u > strings.dict
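Strings pulled from a binary or header can contain quotes, backslashes, or non-printable bytes, all of which need escaping before they are valid dictionary entries. The Python sketch below shows one way to do that escaping; `to_dict_entry` and `build_dictionary` are hypothetical helper names, and the choice to hex-escape everything outside printable ASCII is an assumption made for safety across fuzzers.

```python
def to_dict_entry(token: bytes, name: str = "") -> str:
    """Format raw bytes as one dictionary line, escaping what the format requires."""
    parts = []
    for byte in token:
        ch = chr(byte)
        if ch in ("\\", '"'):
            parts.append("\\" + ch)          # backslash and quote get a backslash prefix
        elif 0x20 <= byte <= 0x7E:
            parts.append(ch)                 # printable ASCII passes through unchanged
        else:
            parts.append(f"\\x{byte:02X}")   # everything else becomes \xNN
    quoted = '"' + "".join(parts) + '"'
    return f"{name}={quoted}" if name else quoted


def build_dictionary(tokens) -> str:
    """Deduplicate tokens (keeping first-seen order) and render a dictionary file."""
    unique = list(dict.fromkeys(tokens))
    return "\n".join(to_dict_entry(t) for t in unique) + "\n"
```

For example, `build_dictionary(line.encode() for line in strings_output.splitlines())` would turn captured `strings` output (held here in a hypothetical `strings_output` variable) into dictionary text.
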

Step 3: Pass Dictionary to Fuzzer

Use the appropriate flag for your fuzzer (see Quick Reference above).

Common Patterns

Pattern: Protocol Keywords

Use Case: Fuzzing HTTP or custom protocol handlers

Dictionary content:

# HTTP methods
"GET"
"POST"
"PUT"
"DELETE"
"HEAD"

# Headers
"Content-Type"
"Authorization"
"Host"

# Protocol markers
"HTTP/1.1"
"HTTP/2.0"

Pattern: Magic Bytes and File Format Headers

Use Case: Fuzzing image parsers, media decoders, archive handlers

Dictionary content:

# PNG magic bytes and chunks (CR and LF are written as \x0D and \x0A; \r and \n are not valid dictionary escapes)
png_magic="\x89PNG\x0D\x0A\x1A\x0A"
ihdr="IHDR"
plte="PLTE"
idat="IDAT"
iend="IEND"

# JPEG markers
jpeg_soi="\xFF\xD8"
jpeg_eoi="\xFF\xD9"

Pattern: Configuration File Keywords

Use Case: Fuzzing config file parsers (YAML, TOML, INI)

Dictionary content:

# Common config keywords
"true"
"false"
"null"
"version"
"enabled"
"disabled"

# Section headers
"[general]"
"[network]"
"[security]"

Advanced Usage

Tips and Tricks

Tip                                   Why It Helps
Combine multiple generation methods   LLM-generated keywords plus strings from the binary cover a broader surface
Include boundary values               "0", "-1", "2147483647" trigger edge cases
Add format delimiters                 :, =, {, } help the fuzzer construct valid structures
Keep dictionaries focused             50-200 relevant entries tend to perform better than thousands
Test dictionary effectiveness         Run with and without the dictionary and compare coverage
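To compare coverage between a run with a dictionary and a run without one, the final coverage figure can be read from each fuzzer log. The Python sketch below assumes libFuzzer-style status lines of the form `#<execs> <EVENT> cov: <N> ...`; the function names are hypothetical, and the regular expression may need adjusting if your fuzzer version prints a different layout.

```python
import re

# Matches status lines such as "#900  NEW    cov: 40 ft: 55 corp: ..."
# (assumed log layout; check it against your own fuzzer output).
_COV = re.compile(r"^#\d+\s+\w+\s+cov: (\d+)", re.MULTILINE)


def final_coverage(log_text: str) -> int:
    """Return the highest 'cov:' value reported in a fuzzer log."""
    values = [int(v) for v in _COV.findall(log_text)]
    if not values:
        raise ValueError("no coverage lines found in log")
    return max(values)


def coverage_gain(baseline_log: str, dict_log: str) -> int:
    """Coverage with the dictionary minus coverage without it."""
    return final_coverage(dict_log) - final_coverage(baseline_log)
```

Run both campaigns for the same duration and from the same seed corpus so that the difference reflects the dictionary and not the time budget.
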

Auto-Generated Dictionaries (AFL++)

When compiling with afl-clang-lto, AFL++ automatically extracts dictionary entries from string comparisons in the target. This happens at compile time via the AUTODICTIONARY feature.

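Write the auto-generated dictionary to a file:

# The AFL++ documentation specifies an absolute path for this variable
export AFL_LLVM_DICT2FILE="$PWD/auto.dict"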
afl-clang-lto++ target.cc -o target
# Dictionary saved to auto.dict
afl-fuzz -x auto.dict -i in -o out -- ./target

Combining Multiple Dictionaries

Some fuzzers support multiple dictionary files:

# AFL++ with multiple dictionaries
afl-fuzz -x keywords.dict -x formats.dict -i in -o out -- ./target
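For an invocation that takes a single dictionary file, several dictionaries can be merged into one beforehand. The Python sketch below concatenates dictionary texts, drops comments and blank lines, and removes entries whose quoted value has already been seen; `merge_dictionaries` is a hypothetical helper, and it compares entries textually, so "A" and "\x41" are treated as different.

```python
def merge_dictionaries(*texts: str) -> str:
    """Merge dictionary file contents, keeping the first occurrence of each value."""
    seen = set()
    merged = []
    for text in texts:
        for line in text.splitlines():
            entry = line.strip()
            if not entry or entry.startswith("#"):
                continue  # skip comments and blank lines
            # Compare on the quoted part only; an optional name= prefix
            # does not change what the fuzzer inserts.
            quote = entry.find('"')
            value = entry[quote:] if quote != -1 else entry
            if value not in seen:
                seen.add(value)
                merged.append(entry)
    return "\n".join(merged) + "\n"
```

Writing the result to one file, for example `merged.dict`, gives a dictionary that can be passed with a single -dict= or -x flag.
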

Anti-Patterns

Anti-Pattern               Problem                                 Correct Approach
Including full sentences   Fuzzer needs atomic tokens, not prose   Break into individual keywords
Duplicating entries        Wastes mutation budget                  Use sort -u to deduplicate
Over-sized dictionaries    Slows fuzzer, dilutes useful tokens     Keep focused: 50-200 most relevant entries
Missing hex escapes        Non-printable bytes become mangled      Use \xXX for binary values
No comments                Hard to maintain and audit              Document sections with # comments
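Several of these anti-patterns can be caught mechanically before a run. The Python sketch below flags duplicates, prose-like entries, raw non-printable characters, and over-sized dictionaries. `lint_dictionary` is a hypothetical helper; the three-word limit for "prose" and the default of 200 entries are assumptions taken loosely from the guidance above, not fuzzer requirements.

```python
def lint_dictionary(text: str, max_entries: int = 200) -> list[str]:
    """Return human-readable warnings for common dictionary anti-patterns."""
    warnings = []
    seen = {}   # quoted value -> line number of first occurrence
    count = 0
    for number, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        count += 1
        start, end = entry.find('"'), entry.rfind('"')
        if start == -1 or start == end:
            warnings.append(f"line {number}: missing quotes")
            continue
        value = entry[start + 1:end]
        if value in seen:
            warnings.append(f"line {number}: duplicate of line {seen[value]}")
        else:
            seen[value] = number
        if len(value.split()) > 3:  # assumed threshold for "full sentence"
            warnings.append(f"line {number}: looks like prose, split into tokens")
        if any(ord(c) < 0x20 or ord(c) > 0x7E for c in value):
            warnings.append(f"line {number}: raw non-printable byte, use \\xNN")
    if count > max_entries:
        warnings.append(f"{count} entries, consider pruning to {max_entries} or fewer")
    return warnings
```

An empty list means none of the checks fired; it does not guarantee that the fuzzer will accept the file.
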

Tool-Specific Guidance

libFuzzer

clang++ -fsanitize=fuzzer,address harness.cc -o fuzz
./fuzz -dict=./dictionary.dict corpus/

Integration tips:

  • Dictionary tokens are inserted/replaced during mutations
  • Combine with -max_len to control input size
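  • Use -print_final_stats=1 to print summary statistics at exit (executed units, new units added, peak RSS); compare them, together with coverage, across runs with and without the dictionary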
  • Dictionary entries longer than -max_len are ignored

AFL++

afl-fuzz -x ./dictionary.dict -i input/ -o output/ -- ./target @@

Integration tips:

  • AFL++ supports multiple -x flags for multiple dictionaries
  • Use AFL_LLVM_DICT2FILE with afl-clang-lto for auto-generated dictionaries
  • Dictionary effectiveness shown in fuzzer stats UI
  • Tokens are used during deterministic and havoc stages

cargo-fuzz (Rust)

cargo fuzz run fuzz_target -- -dict=./dictionary.dict

Integration tips:

  • cargo-fuzz uses libFuzzer backend, so all libFuzzer dict flags work
  • Place dictionary file in fuzz/ directory alongside harness
  • Reference from harness directory: cargo fuzz run target -- -dict=../dictionary.dict

go-fuzz (Go)

go-fuzz does not have built-in dictionary support, but you can manually seed the corpus with dictionary entries:

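# Convert dictionary entries to raw corpus files, one token per file
mkdir -p corpus
grep -oE '"([^"\\]|\\.)*"' dict.txt | sed 's/^"//; s/"$//' | while IFS= read -r token; do
    # printf '%b' decodes \xNN and \\ escapes (bash builtin); the file name is a hash of the entry
    printf '%b' "$token" > "corpus/$(printf '%s' "$token" | md5sum | cut -d' ' -f1)"
done

go-fuzz -bin=./target-fuzz.zip -workdir=.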

Troubleshooting

Issue                          Cause                                 Solution
Dictionary file not loaded     Wrong path or format error            Check fuzzer output for dictionary parsing errors; verify the file format
No coverage improvement        Dictionary tokens not relevant        Analyze target code for actual keywords; try a different generation method
Syntax errors in dict file     Unescaped quotes or invalid escapes   Use \\ for backslash, \" for quotes; validate with a test run
Fuzzer ignores long entries    Entries exceed -max_len               Keep entries under the maximum input length, or increase -max_len
Too many entries slow fuzzer   Dictionary too large                  Prune to the 50-200 most relevant entries

Related Skills

Tools That Use This Technique

Skill        How It Applies
libfuzzer    Native dictionary support via the -dict= flag
aflpp        Native dictionary support via the -x flag; auto-generation with AUTODICTIONARY
cargo-fuzz   Uses the libFuzzer backend and inherits -dict= support

Related Techniques

Skill               Relationship
fuzzing-corpus      Dictionaries complement the corpus: the corpus provides structure, the dictionary provides keywords
coverage-analysis   Use coverage data to validate dictionary effectiveness
harness-writing     Harness structure determines which dictionary tokens are useful

Resources

Key External Resources

AFL++ Dictionaries: Pre-built dictionaries for common formats (HTML, XML, JSON, SQL, etc.). A good starting point for format-specific fuzzing.

libFuzzer Dictionary Documentation: Official libFuzzer documentation on dictionary format and usage. Explains token insertion strategy and performance implications.

Additional Examples

OSS-Fuzz Dictionaries: Real-world dictionaries from Google's continuous fuzzing service. Search project directories for *.dict files to see production examples.
