yara-rule-authoring
Guides authoring of high-quality YARA-X detection rules for malware identification. Use when writing, reviewing, or optimizing YARA rules. Covers naming conventions, string selection, performance optimization, migration from legacy YARA, and false positive reduction. Triggers on: YARA, YARA-X, malware detection, threat hunting, IOC, signature, crx module, dex module.
Install
mkdir -p .claude/skills/yara-rule-authoring && curl -L -o skill.zip "https://mcp.directory/api/skills/download/4616" && unzip -o skill.zip -d .claude/skills/yara-rule-authoring && rm skill.zip
Installs to .claude/skills/yara-rule-authoring
About this skill
YARA-X Rule Authoring
Write detection rules that catch malware without drowning in false positives.
This skill targets YARA-X, the Rust-based successor to legacy YARA. YARA-X powers VirusTotal's production systems and is the recommended implementation. See Migrating from Legacy YARA if you have existing rules.
Core Principles
- Strings must generate good atoms: YARA extracts 4-byte subsequences for fast matching. Strings with repeated bytes, common sequences, or under 4 bytes force slow bytecode verification on too many files.
- Target specific families, not categories: "Detects ransomware" catches everything and nothing. "Detects LockBit 3.0 configuration extraction routine" catches what you want.
- Test against goodware before deployment: A rule that fires on Windows system files is useless. Validate against VirusTotal's goodware corpus or your own clean file set.
- Short-circuit with cheap checks first: Put filesize < 10MB and uint16(0) == 0x5A4D before expensive string searches or module calls.
- Metadata is documentation: Future you (and your team) need to know what this catches, why, and where the sample came from.
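The principles above can be sketched as a single rule skeleton. This is a minimal illustration, not a real detection: the family name, mutex, PDB path, author, and reference URL are all hypothetical placeholders.

```yara
rule MAL_Win_ExampleFamily_Loader
{
    meta:
        // Every value here is a placeholder; fill in real provenance for real rules.
        description = "Hypothetical example: ExampleFamily loader identified by its mutex and PDB path"
        author = "Your Name"
        reference = "https://example.com/analysis-report"
        date = "2025-01-15"

    strings:
        // Long, family-specific strings give the scanner good atoms.
        $mutex = "Global\\ExFam_7f3a9c_loader" ascii wide
        $pdb = "\\exfam\\loader\\Release\\loader.pdb" ascii

    condition:
        // Cheap checks first, so non-PE or oversized files are rejected early.
        filesize < 10MB and
        uint16(0) == 0x5A4D and
        all of them
}
```

Running yr check on the file before scanning should surface syntax problems early.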
When to Use
- Writing new YARA-X rules for malware detection
- Reviewing existing rules for quality or performance issues
- Optimizing slow-running rulesets
- Converting IOCs or threat intel into detection signatures
- Debugging false positive issues
- Preparing rules for production deployment
- Migrating legacy YARA rules to YARA-X
- Analyzing Chrome extensions (crx module)
- Analyzing Android apps (dex module)
When NOT to Use
- Static analysis requiring disassembly → use Ghidra/IDA skills
- Dynamic malware analysis → use sandbox analysis skills
- Network-based detection → use Suricata/Snort skills
- Memory forensics with Volatility → use memory forensics skills
- Simple hash-based detection → just use hash lists
YARA-X Overview
YARA-X is the Rust-based successor to legacy YARA: 5-10x faster regex, better errors, built-in formatter, stricter validation, new modules (crx, dex), 99% rule compatibility.
Install: brew install yara-x (macOS) or cargo install yara-x-cli (the yara-x crate is the library; the yr binary ships in yara-x-cli)
Essential commands: yr scan, yr check, yr fmt, yr dump
Platform Considerations
YARA works on any file type. Adapt patterns to your target:
| Platform | Magic Bytes | Bad Strings | Good Strings |
|---|---|---|---|
| Windows PE | uint16(0) == 0x5A4D | API names, Windows paths | Mutex names, PDB paths |
| macOS Mach-O | uint32(0) == 0xFEEDFACE (32-bit), 0xFEEDFACF (64-bit), 0xBEBAFECA (universal; bytes CA FE BA BE on disk, i.e. uint32be(0) == 0xCAFEBABE) | Common Obj-C methods | Keylogger strings, persistence paths |
| JavaScript/Node | (none needed) | require, fetch, axios | Obfuscator signatures, eval+decode chains |
| npm/pip packages | (none needed) | postinstall, dependencies | Suspicious package names, exfil URLs |
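| Office docs | uint32be(0) == 0x504B0304 (OOXML is a ZIP container; uint32 reads little-endian, so the big-endian variant is needed for "PK\x03\x04") | VBA keywords | Macro auto-exec, encoded payloads |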
| VS Code extensions | (none needed) | vscode.workspace | Uncommon activationEvents, hidden file access |
| Chrome extensions | Use crx module | Common Chrome APIs | Permission abuse, manifest anomalies |
| Android apps | Use dex module | Standard DEX structure | Obfuscated classes, suspicious permissions |
macOS Malware Detection
YARA-X does ship a macho module, but magic byte checks plus string patterns cover most cases and need no module:
Magic bytes:

    // Mach-O 32-bit
    uint32(0) == 0xFEEDFACE
    // Mach-O 64-bit
    uint32(0) == 0xFEEDFACF
    // Universal binary (fat binary)
    uint32(0) == 0xCAFEBABE or uint32(0) == 0xBEBAFECA

Good indicators for macOS malware:
- Keylogger artifacts: CGEventTapCreate, kCGEventKeyDown
- SSH tunnel strings: ssh -D, tunnel, socks
- Persistence paths: ~/Library/LaunchAgents, /Library/LaunchDaemons
- Credential theft: security find-generic-password, keychain
Example pattern from Airbnb BinaryAlert:
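
    rule SUSP_Mac_ProtonRAT
    {
        strings:
            // Library indicators
            $lib1 = "SRWebSocket" ascii
            $lib2 = "SocketRocket" ascii
            // Behavioral indicators
            $behav1 = "SSH tunnel not launched" ascii
            $behav2 = "Keylogger" ascii
        condition:
            // 64-bit Mach-O, or a universal binary in either byte order
            // (uint32 reads little-endian, so CA FE BA BE on disk is 0xBEBAFECA)
            (uint32(0) == 0xFEEDFACF or uint32(0) == 0xCAFEBABE or uint32(0) == 0xBEBAFECA) and
            any of ($lib*) and any of ($behav*)
    }
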
JavaScript Detection Decision Tree
Writing a JavaScript rule?
├─ npm package?
│ ├─ Check package.json patterns
│ ├─ Look for postinstall/preinstall hooks
│ └─ Target exfil patterns: fetch + env access + credential paths
├─ Browser extension?
│ ├─ Chrome: Use crx module
│ └─ Others: Target manifest patterns, background script behaviors
├─ Standalone JS file?
│ ├─ Look for obfuscation markers: eval+atob, fromCharCode chains
│ ├─ Target unique function/variable names (often survive minification)
│ └─ Check for packed/encoded payloads
└─ Minified/webpack bundle?
  ├─ Target unique strings that survive bundling (URLs, magic values)
  └─ Avoid function names (will be mangled)
JavaScript-specific good strings:
- Ethereum function selectors: { 70 a0 82 31 } (balanceOf(address); the transfer(address,uint256) selector is { a9 05 9c bb })
- Zero-width characters (steganography): { E2 80 8B E2 80 8C }
- Obfuscator signatures: _0x, var _0x
- Specific C2 patterns: domain names, webhook URLs
JavaScript-specific bad strings:
- require, fetch, axios: too common
- Buffer, crypto: legitimate uses everywhere
- process.env alone: need specific env var names
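As a rough sketch of how the good and bad string lists combine, the rule below requires an obfuscator signature, a specific environment variable name, and an exfil destination together. The rule name, the chosen environment variable, and the webhook prefix are illustrative assumptions, not indicators from a real sample.

```yara
rule SUSP_JS_Obfuscated_EnvExfil_Example
{
    meta:
        description = "Hypothetical example: obfuscated JavaScript that reads a named secret and references a webhook URL"

    strings:
        // Obfuscator signature with enough fixed bytes ("var _0x") to yield a usable atom.
        $obf = /var _0x[0-9a-f]{4,6}\s?=/ ascii
        // A specific env var name, not bare "process.env".
        $env = "process.env.NPM_TOKEN" ascii
        // Exfil destination; assumed here to be a chat webhook endpoint.
        $exfil = "https://discord.com/api/webhooks/" ascii

    condition:
        filesize < 5MB and
        $obf and $env and $exfil
}
```

None of the three strings is malicious alone; the rule only fires on the combination, which is the point of the lists above.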
Essential Toolkit
| Tool | Purpose |
|---|---|
| yarGen | Extract candidate strings: yarGen.py -m samples/ --excludegood → validate with yr check |
| FLOSS | Extract obfuscated/stack strings: floss sample.exe (when yarGen fails) |
| yr CLI | Validate: yr check, scan: yr scan -s, inspect: yr dump -m pe |
| signature-base | Study quality examples |
| YARA-CI | Goodware corpus testing before deployment |
Master these five. Don't get distracted by tool catalogs.
Rationalizations to Reject
When you catch yourself thinking these, stop and reconsider.
| Rationalization | Expert Response |
|---|---|
| "This generic string is unique enough" | Test against goodware first. Your intuition is wrong. |
| "yarGen gave me these strings" | yarGen suggests, you validate. Check each one manually. |
| "It works on my 10 samples" | 10 samples ≠ production. Use VirusTotal goodware corpus. |
| "One rule to catch all variants" | Causes FP floods. Target specific families. |
| "I'll make it more specific if we get FPs" | Write tight rules upfront. FPs burn trust. |
| "This hex pattern is unique" | Unique in one sample ≠ unique across malware ecosystem. |
| "Performance doesn't matter" | One slow rule slows entire ruleset. Optimize atoms. |
| "PEiD rules still work" | Obsolete. 32-bit packers aren't relevant. |
| "I'll add more conditions later" | Weak rules deployed = damage done. |
| "This is just for hunting" | Hunting rules become detection rules. Same quality bar. |
| "The API name makes it malicious" | Legitimate software uses same APIs. Need behavioral context. |
| "any of them is fine for these common strings" | Common strings + any = FP flood. Use any of only for individually unique strings. |
| "This regex is specific enough" | /fetch.*token/ matches all auth code. Add exfil destination requirement. |
| "The JavaScript looks clean" | Attackers poison legitimate code with injects. Check for eval+decode chains. |
| "I'll use .* for flexibility" | Unbounded regex = performance disaster + memory explosion. Use .{0,30}. |
| "I'll use --relaxed-re-syntax everywhere" | Masks real bugs. Fix the regex instead of hiding problems. |
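The two regex rows in the table can be illustrated with a small sketch. It bounds the gap between fetch and token and adds an exfil destination requirement; the domain is a hypothetical placeholder and the 30-byte bound is taken from the table, not tuned against real samples.

```yara
rule SUSP_JS_TokenFetch_Bounded_Example
{
    meta:
        description = "Hypothetical example: bounded regex plus a required exfil destination"

    strings:
        // Rejected form: /fetch.*token/ is unbounded and matches ordinary auth code.
        // Bounded form: at most 30 bytes between the call and the word "token".
        $fetch_token = /fetch\(.{0,30}token/ ascii
        // Placeholder destination; replace with an indicator from the actual sample.
        $dest = "hypothetical-exfil.example" ascii

    condition:
        filesize < 5MB and
        $fetch_token and $dest
}
```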
Decision Trees
Is This String Good Enough?
Is this string good enough?
├─ Less than 4 bytes?
│ └─ NO — find longer string
├─ Contains repeated bytes (0000, 9090)?
│ └─ NO — add surrounding context
├─ Is an API name (VirtualAlloc, CreateRemoteThread)?
│ └─ NO — use hex pattern of call site instead
├─ Appears in Windows system files?
│ └─ NO — too generic, find something unique
├─ Is it a common path (C:\Windows\, cmd.exe)?
│ └─ NO — find malware-specific paths
├─ Unique to this malware family?
│ └─ YES — use it
└─ Appears in other malware too?
└─ MAYBE — combine with family-specific marker
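A sketch of the tree's outcome for an injector-style sample follows. The hex pattern encodes a typical 32-bit x86 call site (push 0x40; push 0x3000; push imm32; push 0; call dword ptr [imm32]), which is what a VirtualAlloc call requesting executable memory often compiles to. The byte sequence is an assumption about compiler output, and the marker string is hypothetical.

```yara
rule MAL_Win_ExampleFamily_Injector_Example
{
    meta:
        description = "Hypothetical example: call-site hex pattern combined with a family-specific marker"

    strings:
        // Rejected by the tree: "VirtualAlloc" (API name) and { 90 90 90 90 } (repeated bytes).
        // Call site instead of the API name: fixed opcodes and constants, wildcards for size and address.
        $call_site = { 6A 40 68 00 30 00 00 68 ?? ?? ?? ?? 6A 00 FF 15 ?? ?? ?? ?? }
        // The call site alone also appears in other software, so pair it with a family marker.
        $marker = "ExFam_inject_v2" ascii wide

    condition:
        filesize < 10MB and
        uint16(0) == 0x5A4D and
        $call_site and $marker
}
```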
When to Use "all of" vs "any of"
Should I require all strings or allow any?
├─ Strings are individually unique to malware?
│ └─ any of them (each alone is suspicious)
├─ Strings are common but combination is suspicious?
│ └─ all of them (require the full pattern)
├─ Strings have different confidence levels?
│ └─ Group: all of ($core_*) and any of ($variant_*)
└─ Seeing many false positives?
  └─ Tighten: switch any → all, add more required strings
Lesson from production: Rules using any of ($network_*) where strings included "fetch", "axios", "http" matched virtually all web applications. Switching to require credential path AND network call AND exfil destination eliminated FPs.
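The grouping pattern from the tree, applied to the production lesson, might look like the sketch below: a credential path and a network call are both required, plus any one exfil destination. All string values and the rule name are hypothetical.

```yara
rule MAL_JS_CredStealer_Example
{
    meta:
        description = "Hypothetical example: credential path AND network call AND exfil destination"

    strings:
        // Core strings are common on their own, so all of them are required.
        $core_cred = ".aws/credentials" ascii
        $core_net = "axios.post(" ascii
        // Variant strings are individually distinctive, so any one suffices.
        $variant_c2_a = "hypothetical-c2-one.example" ascii
        $variant_c2_b = "hypothetical-c2-two.example" ascii

    condition:
        filesize < 2MB and
        all of ($core_*) and
        any of ($variant_*)
}
```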
When to Abandon a Rule Approach
Stop and pivot when:
- yarGen returns only API names and paths → See When Strings Fail, Pivot to Structure
- Can't find 3 unique strings → Probably packed. Target the unpacked version or detect the packer.
- Rule matches goodware files → Strings aren't unique enough. 1-2 matches = investigate and tighten; 3-5 matches = find different indicators; 6+ matches = start over.
- Performance is terrible even after optimization → Architecture problem. Split into multiple focused rules or add strict pre-filters.
- Description is hard to write → The rule is too vague. If you can't explain what it catches, it catches too much.
Debugging False Positives
FP Investigation Flow:
│
├─ 1. Which string matched?
│ Run: yr scan -s rule
---
*Content truncated.*