Published May 2026 · Deep dive · 22 min read

Karpathy's CLAUDE.md, Annotated (2026)

Sixty lines of markdown turned into the most-cited Claude Code template in the ecosystem. The repo passed 111K stars; the ideas seeded a dozen forks. This post walks the file line-by-line — verbatim quotes pulled from the source repo, what each rule actually accomplishes, when it would fail in a different codebase, and a 50-line starter you can copy.

Editorial illustration: a luminous teal markdown document glyph haloed by four orbiting principle rings — think, simplify, surgical, verify — connected by faint dotted guide lines on a midnight navy background.
On this page · 15 sections
  1. TL;DR
  2. How it went viral
  3. What CLAUDE.md actually is
  4. The file, line by line
  5. 1. Think Before Coding
  6. 2. Simplicity First
  7. 3. Surgical Changes
  8. 4. Goal-Driven Execution
  9. Patterns it popularised
  10. Derivatives and forks
  11. A 50-line starter you can copy
  12. Anti-patterns
  13. CLAUDE.md vs skills vs MCP
  14. FAQ
  15. Sources

TL;DR

  • Karpathy didn’t write the file. Andrej Karpathy posted observations about LLM coding pitfalls on X in late 2025. Jiayuan Chang turned those observations into a four-principle CLAUDE.md and published the forrestchang/andrej-karpathy-skills repo a few weeks later. The wording is Chang’s; the diagnosis is Karpathy’s.
  • It’s short on purpose. The whole file is roughly 60 lines, well under Anthropic’s recommended 200-line target. Every line earns its place.
  • Four principles, in order: Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution. The order matters — earlier principles gate later ones.
  • The verbatim source is below in every section, pulled from the repo’s main branch. Skip ahead to the 50-line starter if you want the template-and-go version.

How it went viral

Karpathy’s thread (X.com link, paywalled) summarised a few weeks of dogfooding Claude Code. Rather than quote behind the paywall, here is the diagnosis as Chang’s README captures it — the wording on which the four principles are built:

“The models make wrong assumptions on your behalf and just run along with them without checking. They don’t manage their confusion, don’t seek clarifications, don’t surface inconsistencies, don’t present tradeoffs, don’t push back when they should.”

“They really like to overcomplicate code and APIs, bloat abstractions, don’t clean up dead code... implement a bloated construction over 1000 lines when 100 would do.”

“They still sometimes change/remove comments and code they don’t sufficiently understand as side effects, even if orthogonal to the task.”

“LLMs are exceptionally good at looping until they meet specific goals... Don’t tell it what to do, give it success criteria and watch it go.”

— Karpathy’s diagnosis, part paraphrase and part direct quote, as captured in the andrej-karpathy-skills README

Chang’s repo dropped within two weeks of the thread. Akshay Pachaar amplified it on X. It hit Hacker News. Cursor users started copying the same rules into .cursor/rules/. By spring 2026 the repo had 111K stars and 11.1K forks. Crucially, the name — “andrej-karpathy-skills” — did most of the work. People searching for “Karpathy claude.md” or “Karpathy claude code” landed in one place.

The viral arc matters because it shaped a generation of CLAUDE.md files. Most of the templates you see on GitHub now share the same skeleton: four to six numbered principles, terse one-line rules under each, and a closing “these guidelines are working if…” section. That format is Chang’s; the diagnosis is Karpathy’s; the popularity is downstream of both.

What CLAUDE.md actually is (and where it lives)

Before annotating the file, a one-paragraph refresher on the Anthropic format. CLAUDE.md is a markdown file Claude Code reads at the start of every conversation. The official docs describe two complementary memory systems: CLAUDE.md is instructions you write; auto memory is notes Claude writes itself. Only the first concerns us here.

Files load in priority order from broadest to most specific:

  • Managed policy (/Library/Application Support/ClaudeCode/CLAUDE.md on macOS, /etc/claude-code/CLAUDE.md on Linux) — organisation-wide instructions IT pushes via MDM. Cannot be overridden by individual settings.
  • Project (./CLAUDE.md or ./.claude/CLAUDE.md) — checked into git, shared with the team. This is where most CLAUDE.md content belongs.
  • User (~/.claude/CLAUDE.md) — your personal preferences across all projects. Karpathy’s guidelines are a good fit here as well as in projects.
  • Local (./CLAUDE.local.md) — personal project-specific notes; gitignored so it stays off the team’s repo.

Files concatenate rather than override; content from the filesystem root is read first, and files closer to your working directory are read last. Anthropic’s explicit guidance on size is short and sharp:

“Target under 200 lines per CLAUDE.md file. Longer files consume more context and reduce adherence.” — Anthropic’s memory docs.

Their best-practices page is even blunter: “Bloated CLAUDE.md files cause Claude to ignore your actual instructions.” Karpathy’s file is around 60 lines of prose plus headings. That’s the right scale for a cross-project preamble. Project-specific facts (build commands, test runners, naming conventions) are added on top of Karpathy’s file, not instead of it. The whole stack ideally still fits under 200 lines.
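A quick way to keep yourself honest about the 200-line budget is a line count over the concatenated stack — a sketch, assuming the standard user- and project-level locations (adjust the second path if your project keeps its file at ./.claude/CLAUDE.md):

```shell
# Count the lines Claude will actually load at session start.
# Missing files are skipped silently; target: under 200 total.
cat ~/.claude/CLAUDE.md ./CLAUDE.md 2>/dev/null | wc -l
```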

If you’re new to the format, our Claude Code best practices primer covers the surrounding ecosystem (skills, hooks, subagents). The rest of this post assumes you know how CLAUDE.md loads and walks the file itself.

The file, line by line

What follows is the full, verbatim CLAUDE.md from forrestchang/andrej-karpathy-skills, broken into sections with annotation between each block. Code blocks are quoted exactly as they appear in the repo (commit on main, May 2026). The wording is deliberate; nothing is paraphrased inside the <pre> blocks.

The preamble

# CLAUDE.md

Behavioral guidelines to reduce common LLM coding mistakes. Merge with project-specific instructions as needed.

**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.

Three things to notice. First, the file declares itself. Naming it # CLAUDE.md at the top is harmless on its own but matters when the file gets concatenated with project-level CLAUDE.md content lower down — you want a heading break so Claude can tell them apart.

Second, “Merge with project-specific instructions as needed.” This is the polite way of saying: this is a preamble, not a complete file. The author expects you to paste it on top of build commands, test runners, and conventions that only apply to your project.

Third, the Tradeoff line. Most CLAUDE.md files don’t admit a tradeoff exists. This one does: cautious behaviour is slower than reckless behaviour, and that’s a feature, not a bug. The closing “for trivial tasks, use judgment” is the escape valve. Without it, you’d see Claude solemnly asking clarifying questions before fixing a typo.

Principle 1: Think Before Coding

## 1. Think Before Coding

**Don't assume. Don't hide confusion. Surface tradeoffs.**

Before implementing:
- State your assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them - don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.

Why this works. The four bullets target the first failure mode Karpathy named: silent assumptions. LLMs default to picking an interpretation and proceeding; here the file installs an interrupt. Three of the four bullets end with “ask” or “present them” — actions Claude would otherwise skip.

The bold tagline matters. “Don’t assume. Don’t hide confusion. Surface tradeoffs.” The bold lines under each numbered principle are Chang’s most-copied invention. They function as memorable handles: when Claude is mid-task and has to retrieve the relevant rule, the tagline is the cache key. The surrounding bullets are detail.

When this fails in a different codebase. If your codebase is large, well-conventioned, and your tasks are mostly small — say, a Rails monolith with 200 engineers where most tickets are “fix this typo, add this button” — the rule fires too often. Claude will pause to ask clarification on requests that genuinely have one interpretation. Soften it: “If multiple interpretations exist on changes that touch more than three files, present them.” Trade some safety for speed.

If your codebase has a strong “just ship it” culture, this principle will fight you. Some teams want Claude to default to action and apologise later. That’s fine, but you should know you’re trading away the Karpathy default explicitly.

Principle 2: Simplicity First

## 2. Simplicity First

**Minimum code that solves the problem. Nothing speculative.**

- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- If you write 200 lines and it could be 50, rewrite it.

Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.

Why this works. Four of the five bullets start with “No” — and the format itself is the lesson. Negative rules are easier to verify than positive ones. “Write simple code” is unfalsifiable; “no error handling for impossible scenarios” is verifiable in a diff review. The 200-to-50-lines rule is the only positive instruction in the section, and it’s a concrete heuristic.

The senior-engineer test at the bottom is a model-elicitation pattern. By asking Claude to imagine a reviewer, you trigger a different sub-routine than “is this correct?”. It’s the same trick that “think step by step” exploits: change the framing, get a different inference path.

When this fails. In safety-critical code — payments, auth, anything regulated — “no error handling for impossible scenarios” is dangerous, because what feels impossible during implementation often turns out to be a CVE waiting to happen. Drop or invert that bullet for those modules. A path-scoped rule (.claude/rules/security.md with paths: ["src/auth/**"]) is the cleanest way to do this.
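For concreteness, a path-scoped rule along those lines might look like the sketch below — the filename and `paths:` glob come from the text above, but the rule wording is illustrative, and the frontmatter key is worth verifying against your Claude Code version:

```markdown
---
paths: ["src/auth/**"]
---

# Security-critical modules

- Invert Simplicity First here: handle "impossible" inputs explicitly.
- Prefer defensive error handling and logging over minimal code.
- Never remove existing validation, even when it looks redundant.
```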

In speculative prototype work, the rule is almost too good: sometimes you want Claude to over-engineer because you’re exploring a design space. Soften the wording, or accept that this principle pushes you toward tight, throwaway code by default.

Principle 3: Surgical Changes

## 3. Surgical Changes

**Touch only what you must. Clean up only your own mess.**

When editing existing code:
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken.
- Match existing style, even if you'd do it differently.
- If you notice unrelated dead code, mention it - don't delete it.

When your changes create orphans:
- Remove imports/variables/functions that YOUR changes made unused.
- Don't remove pre-existing dead code unless asked.

The test: Every changed line should trace directly to the user's request.

Why this works. This is the longest section, because it’s where Claude misbehaves most. The “drive-by refactor” is the single most common complaint about LLM-generated PRs: you ask for a one-line fix and get a 200-line diff that also “modernises” three unrelated files. Each bullet here is calibrated to a specific Claude failure mode.

The two-clause structure (“when editing existing code” vs. “when your changes create orphans”) splits the harder ethical question. Claude should clean up its own mess; it shouldn’t clean up someone else’s. The bullet that nails this: “If you notice unrelated dead code, mention it - don’t delete it.” That single line saves a thousand bad PRs.

The test sentence at the bottom — “Every changed line should trace directly to the user’s request” — is a falsification rule. You can run a diff review and check it. It also gives Claude a self-check to run before declaring a task complete.

When this fails. If your codebase needs aggressive cleanup — say, you inherited a React 16 monolith and you actually want Claude to modernise during edits — this principle is the wrong default. Either drop it or rephrase: “Match existing style unless the file is on the migration list at docs/migrations.md. In those files, follow the new patterns.”

Match-existing-style also fights you in monorepos with three competing styles where “existing style” is ambiguous. The fix is project-specific: name the canonical style explicitly (“Match the style in src/api/v2/ — that’s the canonical version”) so Claude isn’t left guessing.

Principle 4: Goal-Driven Execution

## 4. Goal-Driven Execution

**Define success criteria. Loop until verified.**

Transform tasks into verifiable goals:
- "Add validation" → "Write tests for invalid inputs, then make them pass"
- "Fix the bug" → "Write a test that reproduces it, then make it pass"
- "Refactor X" → "Ensure tests pass before and after"

For multi-step tasks, state a brief plan:
```
1. [Step] → verify: [check]
2. [Step] → verify: [check]
3. [Step] → verify: [check]
```

Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.

Why this works. The three example transformations are the load-bearing piece. They show Claude how to rewrite a prompt into something it can verify. “Add validation” is imperative and vague; “Write tests for invalid inputs, then make them pass” is declarative with a built-in check. The arrow notation (→) is small but visually memorable.

The plan template in the fenced block is doing surprising work. By giving Claude a literal template for multi-step plans, the file removes one degree of creative freedom: when the model gets to step three of a refactor, it doesn’t have to invent a verification step, just fill in the blank. This same idea — structured planning before execution — is what Anthropic’s best-practices page calls “Explore first, then plan, then code.”

The closing line is the entire thesis: “Strong success criteria let you loop independently. Weak criteria (‘make it work’) require constant clarification.” This is the principle that is most directly downstream of Karpathy’s thread: he framed LLMs as exceptionally good at looping toward specific goals, and warned that vague goals burn the loop in useless cycles.

When this fails. Not every task fits the test-then-pass mould. Frontend pixel-tweaks, prose edits, copy-writing, design-token changes — these don’t have a unit-testable success criterion. For visual work, Anthropic’s recommended substitute is screenshots: “take a screenshot of the result and compare it to the original. List differences and fix them.” Add a bullet to your project CLAUDE.md that captures the non-test verification pattern (screenshot diff, lint pass, visual review checklist) so the principle still applies in those domains.

The closing line

---

**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.

Most CLAUDE.md files end with “Thanks!” or nothing. This one ends with falsifiable signals. If, after a week of using the file, your diffs are not smaller, your rewrites are not fewer, your clarifying questions are not arriving earlier, the file is not working — and you should tune it. That self-falsification clause is a model for every CLAUDE.md you write.

Patterns it popularised

Five conventions in the Karpathy CLAUDE.md have been copied so widely they now read as the de-facto format. Worth naming them so you can use them deliberately.

  1. Numbered principles, not paragraphs. Four ## N. Title sections are easier to retrieve than 800 words of prose. Anthropic’s docs agree: “use markdown headers and bullets to group related instructions.”
  2. Bold tagline per principle. “Don’t assume. Don’t hide confusion. Surface tradeoffs.” The tagline is the cache key. Sub-bullets are detail.
  3. “Don’t X” over “Prefer Y”. Negative rules are easier to verify than positive ones. The Karpathy file uses some flavour of “don’t”, “no”, or “avoid” on roughly half its rules.
  4. An explicit test per section. “Would a senior engineer say this is overcomplicated?” “Every changed line should trace directly to the user’s request.” These are self-check prompts Claude can run. They turn rules into procedures.
  5. A “working if” clause at the end. Falsifiable signals so you know when to tune the file.

If you’re writing your first CLAUDE.md, those five conventions are the safe defaults. Most homemade files that fail do so by skipping (3) — they pile on positive rules nobody can verify — or by skipping (5), at which point the file just grows.

Derivatives and forks

The Karpathy CLAUDE.md format spawned a small ecosystem. Three derivatives are worth knowing about; each one inherits the four-principle skeleton and adds a different layer.

addyosmani/agent-skills (~27.7K stars)

Addy Osmani’s repo extends the Karpathy skeleton from four principles into a full define-plan-build-verify-review-ship workflow, encoded as 20 skills plus a CLAUDE.md preamble. The repo’s own framing is “Process, not prose.” Each skill ships with step-by-step workflows, anti-rationalisation tables (documenting common shortcuts agents try to take), red flags, and verification requirements.

What it adds beyond Karpathy: phase gating. Define before Plan, Plan before Build, Build before Verify, Verify before Review, Review before Ship. The four-principle Karpathy core is folded in as the build-phase contract. It’s a heavier framework — the Anti-Rationalisation tables alone run hundreds of lines — but if you’re shipping production code with multiple Claude sessions fanning out, that structure earns its keep.

Cursor-rule mirrors

The same andrej-karpathy-skills repo ships a parallel .cursor/rules/karpathy-guidelines.mdc file with the same four principles wrapped in Cursor’s front-matter format. Cursor reads project rules from .cursor/rules/, so dropping the same guidelines into a Cursor codebase is a one-file copy. The lesson here generalises: if your team uses two or more agents, write the rules once and ship two front-matter wrappers.

Anthropic’s own skills repo (~128K stars)

The official anthropics/skills repo is the canonical SKILL.md reference but, perhaps surprisingly, has no CLAUDE.md. Anthropic ships category-organised skills (Creative & Design, Development & Technical, Document Skills) with YAML-frontmatter SKILL.md files. The implicit lesson: long-form behavioural guidelines belong in CLAUDE.md; repeatable workflows belong in skills. Karpathy’s file gets the first job right; Anthropic’s repo shows what the second job looks like.

“The ‘Goal-Driven Execution’ principle captures this: transform imperative instructions into declarative goals with verification loops.” — andrej-karpathy-skills README. The README’s framing of why the fourth principle matters, directly downstream of Karpathy’s thread on LLM looping behaviour.

“Bloated CLAUDE.md files cause Claude to ignore your actual instructions!” — Anthropic Engineering. Anthropic’s official best-practices page on writing CLAUDE.md; the exclamation point is in the source.

“Include tests, screenshots, or expected outputs so Claude can check itself. This is the single highest-leverage thing you can do.” — Anthropic Engineering. Anthropic’s tip on verification: the same idea as Karpathy’s fourth principle, framed as a top-priority recommendation.

A 50-line starter you can copy

If you want to start from a Karpathy-style preamble plus the project-specific facts Anthropic recommends, paste the following into ./CLAUDE.md, fill in the blanks, and prune ruthlessly. The total ships well under Anthropic’s 200-line ceiling.

# CLAUDE.md

Project conventions, build commands, and behavioural guidelines for
Claude Code working on this repo. Read top-to-bottom at session start.

## Build & test

- Install: `pnpm install`
- Dev server: `pnpm dev` (Next.js on http://localhost:3000)
- Tests: `pnpm test:unit` for one file, `pnpm test` for all
- Lint: `pnpm lint` (ESLint flat config)
- Typecheck: `pnpm typecheck` after every series of edits

## Architecture (one paragraph)

[2–3 sentences on the one architectural decision you wish you'd
known on day one. Examples: "Static-first; pages don't query the
DB at request time" or "Domain-driven, no shared models across
contexts".]

## Behavioural guidelines

### 1. Think Before Coding
**Don't assume. Don't hide confusion. Surface tradeoffs.**
- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them.
- If a simpler approach exists, say so. Push back when warranted.

### 2. Simplicity First
**Minimum code that solves the problem.**
- No features beyond what was asked.
- No abstractions for single-use code.
- No error handling for impossible scenarios.
- If 200 lines could be 50, rewrite it.

### 3. Surgical Changes
**Touch only what you must. Clean up only your own mess.**
- Don't refactor things that aren't broken.
- Match existing style.
- Mention unrelated dead code; don't delete it.
- Every changed line should trace to the user's request.

### 4. Goal-Driven Execution
**Define success criteria. Loop until verified.**
- "Fix the bug" → "Write a test that reproduces it, then make it pass."
- "Add validation" → "Write tests for invalid inputs, then make them pass."
- For visual changes, take a screenshot and diff against the target.

## Project gotchas

- [List 3–5 non-obvious facts that have bitten the team. Examples:
  required env vars, flaky test workarounds, why a specific function
  exists, what NOT to touch.]

## Working signals

These guidelines are working if: fewer unnecessary changes in diffs,
fewer rewrites due to overcomplication, and clarifying questions come
before implementation rather than after mistakes.

Two changes from the source Karpathy file are deliberate. First, the Build & test block at the top — Anthropic’s docs are emphatic that bash commands Claude can’t guess belong in CLAUDE.md, and a build/test cheat sheet is the single most-used reference in any session. Second, the “Project gotchas” section at the bottom — the place to record “a code review caught something Claude should have known about this codebase” lessons. Anthropic’s rule of thumb: if you typed the same correction last session, it belongs here.

Want to use this same file across multiple repos? Put the Behavioural guidelines section in ~/.claude/CLAUDE.md and the project-specific blocks in ./CLAUDE.md. They concatenate at session start. For a team-wide override that no individual user can drop, use the managed-policy location.
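A sketch of that split, assuming the standard locations (careful: this overwrites any existing user-level file):

```shell
# Put the behavioural preamble at user level so every repo inherits it.
# Caution: overwrites any existing ~/.claude/CLAUDE.md.
mkdir -p ~/.claude
cat > ~/.claude/CLAUDE.md <<'EOF'
# CLAUDE.md (user-level)

## Behavioural guidelines
### 1. Think Before Coding
**Don't assume. Don't hide confusion. Surface tradeoffs.**
EOF
# ./CLAUDE.md then keeps only Build & test, Architecture, and
# Project gotchas; the two files concatenate at session start.
```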

Anti-patterns: what not to do

Cargo-culting Karpathy’s rules into a Python web app

The four principles travel well, but the surrounding bullets were tuned for greenfield, single-developer work. “Don’t refactor things that aren’t broken” in a Django monolith with a documented modernisation plan is wrong — you actively want incremental refactors. Either soften the bullets or add an exception clause that names your migration list.

Stacking the file past 400 lines

Anthropic’s 200-line ceiling is the number to beat. Past it, Claude routinely ignores rules; the failure mode is silent. If your file is creeping toward 300 lines, the fix is path-scoped rules in .claude/rules/ with paths: frontmatter, so each rule loads only when relevant.

Vague rules with no verification step

“Write clean code,” “follow best practices,” “use idiomatic Python.” Karpathy’s file uses none of these. Every rule has either a concrete “don’t” or a verifiable test. If a line in your CLAUDE.md can’t be checked in a diff review, it belongs in a hook or a skill, not in CLAUDE.md.

Drive-by additions without pruning

Every time something goes wrong, a new line gets appended. Six months later, the file is 800 lines of contradictions. Anthropic’s exact line: “Treat CLAUDE.md like code: review it when things go wrong, prune it regularly.” If two rules contradict, Claude picks one arbitrarily — delete the loser.

Putting behavioural rules where workflows should go

A 30-line procedure for “how to migrate one file from class components to hooks” doesn’t belong in CLAUDE.md — it’s a skill. If the procedure only matters when Claude is doing that one task, move it to .claude/skills/ so it loads on demand. See our guide to Claude Code skills for the boundary.

CLAUDE.md vs Skills vs MCP

Karpathy’s file is the right tool only when the content needs to load every session. The two alternatives, briefly:

| Mechanism | Use when | Loads | Token cost |
| --- | --- | --- | --- |
| CLAUDE.md | Always-on rules and project facts | Every session | Paid every conversation |
| `.claude/skills/*` | Repeatable workflows triggered by intent | On demand when relevant | Paid only when invoked |
| `.claude/rules/*` (paths) | Rules scoped to specific files | When matching files are read | Paid only when files match |
| MCP server | Live data or code execution from outside Claude | When the model calls a tool | Tool-call cost only |

Karpathy’s four principles belong in CLAUDE.md because they apply on every task. A migrate-one-component skill belongs in .claude/skills/ because it only runs when you trigger it. Live database queries belong in an MCP server because they need fresh data. Mixing these up wastes tokens.

For the full taxonomy and trade-offs, see our Claude skills vs MCP vs subagents vs CLI decision matrix. If you’re still unsure why CLAUDE.md exists at all, what are Claude Code skills covers the boundary from the other side. And our context bloat guide is the deep version of the “don’t bloat CLAUDE.md” argument.

“Treat CLAUDE.md as the place you write down what you’d otherwise re-explain.” — Anthropic memory docs. Anthropic’s framing of when content earns a CLAUDE.md slot. The complement also holds: if you wouldn’t re-explain it every session, it doesn’t belong here.

Frequently asked questions

What is Karpathy's CLAUDE.md?

It's a 60-line behavioural guideline file written by Jiayuan Chang (forrestchang) and published in the andrej-karpathy-skills GitHub repo. The file distils Andrej Karpathy's late-2025 X thread on LLM coding pitfalls into four numbered principles — Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution — that Claude Code reads at the start of every session. The repo sits at roughly 111K stars as of May 2026, making it the most-cited CLAUDE.md template in the ecosystem.

Did Andrej Karpathy actually write this CLAUDE.md?

No. The file is a distillation of Karpathy's public observations on LLM coding behaviour, not text he authored. Karpathy posted his thread on X in late 2025; Jiayuan Chang turned the observations into the four-principle CLAUDE.md and shipped the repo a few weeks later. Karpathy's name appears in the title because the file encodes problems he diagnosed — the wording is Chang's.

Where do I put a CLAUDE.md file?

Anthropic's docs list four locations in priority order: managed-policy (organisation-wide, /Library/Application Support/ClaudeCode/CLAUDE.md on macOS), project (./CLAUDE.md or ./.claude/CLAUDE.md, checked into git), user (~/.claude/CLAUDE.md, your personal preferences across all projects), and local (./CLAUDE.local.md, gitignored). Files closer to the working directory are read last and win on conflicts.

How big should a CLAUDE.md be?

Anthropic's official guidance: target under 200 lines per file. The reasoning is that CLAUDE.md is loaded into context every session, so longer files consume more tokens and reduce adherence. Anthropic's best-practices page is blunt: 'Bloated CLAUDE.md files cause Claude to ignore your actual instructions.' Karpathy's CLAUDE.md is around 60 lines, which is roughly the right scale.

Should I just copy Karpathy's CLAUDE.md into every project?

No. The four principles are general-purpose and travel well, but the file's tone — 'Don't refactor things that aren't broken,' 'Don't remove pre-existing dead code unless asked' — was tuned for greenfield, single-developer Python and Rust work. In a multi-team Python web app or a Rails monolith, you also need project-specific facts (build commands, test runners, naming conventions, gotchas) that Karpathy's file deliberately omits. Use it as a top-of-file preamble, not the whole document.

What's the difference between CLAUDE.md, skills, and MCP servers?

CLAUDE.md is plain-text instructions loaded into context every session — good for always-on rules. Skills are markdown files in .claude/skills/ that load on demand when Claude detects relevance — good for repeatable workflows you don't want to pay for every conversation. MCP servers are external processes that expose tools, resources, and prompts via JSON-RPC — good when you need code execution or live data access. Our decision matrix walks through the trade-offs in detail.

What does the 'Goal-Driven Execution' principle actually mean?

It means rewriting imperative tasks ('add validation,' 'fix the bug,' 'refactor X') into verifiable goals with a check step ('write a test that reproduces it, then make it pass'). The Karpathy file frames it as: 'Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.' The pattern is the same one Anthropic's best-practices page calls out as 'the single highest-leverage thing you can do.'

What's wrong with most homemade CLAUDE.md files?

Three failure modes dominate. One: bloat — files past 200 lines that bury the important rules in noise. Two: vague rules — 'write clean code,' 'use best practices,' 'follow conventions' — none of which give Claude anything to verify. Three: drive-by additions, where every time something goes wrong, a new line gets appended without pruning the line that already addressed it. Treat CLAUDE.md like code: review it, prune it, and delete rules that Claude already follows without being told.

Can I use Karpathy's CLAUDE.md with Cursor or Codex?

Yes. The forrestchang/andrej-karpathy-skills repo ships a parallel .cursor/rules/karpathy-guidelines.mdc file with the same four principles. Cursor reads project rules from .cursor/rules/, so the same guidelines apply when you open the project in Cursor. For OpenAI Codex CLI, AGENTS.md is the equivalent file — Claude Code itself can import an AGENTS.md from a CLAUDE.md via the @ syntax, so one file can drive both agents.

How do I know my CLAUDE.md is working?

Karpathy's file lists four observable signals: fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, clarifying questions arrive before implementation rather than after mistakes, and clean minimal PRs without drive-by refactoring. If you're not seeing those changes, the file is either too long (rules are getting lost) or too vague (rules can't be verified). Anthropic's debugging tip: if Claude keeps doing something you don't want despite a rule against it, the file is probably too long.

Does CLAUDE.md survive /compact?

Project-root CLAUDE.md does — Anthropic's docs are explicit: 'Project-root CLAUDE.md survives compaction: after /compact, Claude re-reads it from disk and re-injects it into the session.' Nested CLAUDE.md files in subdirectories are not re-injected automatically; they reload only when Claude next reads a file in that subdirectory. Plan accordingly: put always-on rules in the root file, not in nested ones.

Where can I see other strong CLAUDE.md examples?

Beyond forrestchang/andrej-karpathy-skills, the most-cited example is addyosmani/agent-skills (~27.7K stars), which encodes a full define-plan-build-verify-review-ship workflow as 20 skills plus a CLAUDE.md preamble. Anthropic's own anthropics/skills repo (~128K stars) is the canonical SKILL.md reference but has no CLAUDE.md. Read three or four of these end-to-end before writing your own — the patterns generalise even when the specifics don't.

Sources

Primary repo + the file itself

Karpathy’s thread (paywalled, mirrored)

Anthropic docs

Derivative repos

Internal links
