Updated June 2026 · Comparison · 17 min read

Claude Code Memory MCP Servers (2026)

Four MCP servers, one job: stop your coding agent from re-explaining the codebase every session. Each gives Claude Code, Cursor, or Codex a store it can write to mid-task and read back on the next launch — project notes, decisions, task dependencies, and learned context that survive a restart. They solve the same pain in four different shapes, and the right one depends on what you want the agent to remember. Every fact below comes from the official repos.

On this page · 13 sections
  1. TL;DR + decision tree
  2. What memory MCP servers do
  3. Side-by-side matrix
  4. Basic Memory — install + recipe
  5. Beads — install + recipe
  6. RAG Memory — install + recipe
  7. Knowledge Graph Memory — install + recipe
  8. Memory MCP vs CLAUDE.md
  9. Benchmark them yourself
  10. Common pitfalls
  11. Community signal
  12. FAQ
  13. Sources

TL;DR + decision tree

  • If you want memory you can also read and edit yourself, pick Basic Memory. It writes plain markdown to a folder on your disk, so the agent’s notes are just files you can open, grep, and commit. The least magic; the most inspectable.
  • If your pain is the agent forgetting where it was in a long, multi-step task, pick Beads. It models work as issues with dependencies in a version-controlled graph, so the agent can pick up exactly where it left off instead of re-deriving the plan from scratch.
  • If you have a large note corpus and need retrieval-by-meaning, pick RAG Memory. It combines vector search with a knowledge graph, so the agent finds the relevant past context semantically rather than by exact keyword.
  • If you want the smallest, official reference implementation, pick Knowledge Graph Memory. It is the entity-and-relation memory server from the modelcontextprotocol repo — minimal surface, maintained alongside the protocol.

These are coding-agent project memory servers — they run locally and give Claude Code, Cursor, or Codex a durable store over MCP. That is a different problem from application or agent-framework memory (mem0, Letta, Zep, Cognee), which is about giving a product long-term user memory. If that is your question, read mem0 vs Letta vs Zep vs Cognee instead. This post is strictly about stopping your coding agent from re-learning the codebase every session.

What memory MCP servers actually do

A coding agent starts every session blank. Claude Code re-reads your files and your CLAUDE.md, but it does not recall yesterday’s conversation — the decisions you made, the dead ends you ruled out, the half-finished refactor. That context lived in the context window, and the context window is gone. A memory MCP server gives the agent a tool layer over a durable store: it can write a note when it learns something and read notes back on the next launch. The store sits on your disk, not in the conversation, so it survives a restart.

There are three shapes the store can take, and they are the main thing that separates these four servers:

  1. Document store (markdown). Notes are files. The agent appends to and reads markdown you can also open in your editor. Human-readable, diff-friendly, no opaque database. Basic Memory is this.
  2. Task / issue graph. Memory is structured as work items with dependencies, so “what was I doing and what blocks what” is a first-class query. The agent resumes a plan rather than reconstructing it. Beads is this.
  3. Semantic / graph store. Memory is indexed so the agent retrieves by meaning or by relationship across a large corpus, not by exact match. RAG Memory embeds notes as vectors and layers a knowledge graph on top; Knowledge Graph Memory uses an entity-relation graph instead.

The discipline that separates a memory server that helps from one that hurts is retrieval, not recall-everything. A store the agent searches for the few relevant notes is a win; a store the agent dumps wholesale into every prompt burns tokens and drags stale facts into fresh work. We come back to that in pitfalls. If you are new to the protocol underneath, the What is MCP primer covers the wire format these servers run on.
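To make the retrieval-versus-dump distinction concrete, here is a minimal sketch. It assumes notes are plain strings and uses word overlap as a crude stand-in for whatever search a real server provides; the function names and the sample notes are hypothetical, not taken from any of the four servers.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for real search."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(notes: list[str], query: str, k: int = 2) -> list[str]:
    """Return up to k notes that share the most words with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(n)), n) for n in notes]
    hits = [(s, n) for s, n in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [n for _, n in hits[:k]]

def rough_cost(notes: list[str]) -> int:
    """Word count as a rough proxy for token cost (an assumption)."""
    return sum(len(n.split()) for n in notes)

# Hypothetical notes, for illustration only.
NOTES = [
    "Auth: chose session cookies over JWT for the web app.",
    "Payments: legacy SDK migration blocked on schema changes.",
    "Billing: invoice generation must stay idempotent.",
    "CI: integration tests need the staging database.",
]

relevant = retrieve(NOTES, "why did we pick cookies for auth?")
print(relevant)
print(rough_cost(relevant), "words retrieved vs", rough_cost(NOTES), "for a full dump")
```

The gap between the two counts grows with the store, which is why search-then-load tends to matter more as notes accumulate.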

Side-by-side matrix

Every cell is sourced from the official repo. This table is the metadata reference — we cut these facts from the per-tool prose below so the sections stay decision-focused.

| Dimension | Basic Memory | Beads | RAG Memory | Knowledge Graph Memory |
|---|---|---|---|---|
| Memory model | Markdown documents + semantic graph | Issue / dependency graph | Vector search + knowledge graph | Entity-relation graph |
| Store on disk | Plain markdown files | Version-controlled DB in repo | Embedded vector + graph DB | Local JSON graph file |
| License | MIT | MIT | MIT | MIT |
| Transport | stdio | stdio | stdio | stdio |
| Install | uvx / uv tool install basic-memory | npx beads-mcp (bd CLI) | npx rag-memory-mcp | npx @modelcontextprotocol/server-memory |
| Best for | Notes you also read/edit by hand | Resuming long multi-step work | Retrieval by meaning over big corpus | Minimal official reference memory |
| Maintainer | Basic Machines | Steve Yegge | ttommyth (community) | Anthropic / MCP reference |
| MCP.Directory page | /servers/basic-memory | /servers/beads | /servers/rag-memory | /servers/knowledge-graph-memory |

Three takeaways. First, all four run locally and write to your disk — none of them ships your code to a hosted service, which is the whole reason a coding agent uses these instead of a cloud memory platform. Second, the memory model is the real choice: markdown for human-readable notes, an issue graph for resumable work, a vector store for semantic recall. Third, they are not mutually exclusive — Basic Memory for durable notes and Beads for task state coexist happily in the same client, because the install pattern is identical stdio MCP for all of them.
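As a rough illustration of that shared stdio pattern, the sketch below builds an mcpServers block for the reference memory server and prints it as JSON. The command and package name come from the matrix; the "memory" key is an arbitrary label chosen here, and where the JSON goes depends on your client.

```python
import json

# Launch command for the reference memory server, per the matrix above.
# The server key "memory" is an arbitrary label chosen for this sketch.
config = {
    "mcpServers": {
        "memory": {
            "command": "npx",
            # "-y" tells npx to skip its install confirmation prompt.
            "args": ["-y", "@modelcontextprotocol/server-memory"],
        }
    }
}

print(json.dumps(config, indent=2))
```

Adding a second server should just mean a second key under mcpServers with its own command and args; check each server's own docs for the exact values.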

Basic Memory — install + recipe

What it does best

Basic Memory is the only one in this group where the agent’s memory is just files you own. It writes structured markdown to a folder on your disk and builds a semantic graph across those notes, so the store is fully inspectable — you open it in your editor, grep it, diff it, and commit it alongside the code. The maintainers’ own framing is “plain text on your disk, forever,” and that is the appeal: nothing the agent remembers is locked in an opaque database you can’t read. When the agent records a decision, you can correct it by editing a file.

Pick this if you...

  • Want to read, edit, and version the agent’s memory by hand — not trust a black-box store
  • Already keep project notes in markdown and want the agent writing into the same shape
  • Care that nothing leaves your machine and the format stays legible years from now
  • Want memory you can review in a pull request like any other file change

Recipe: capture an architecture decision as a note

In Claude Code with Basic Memory installed, after you and the agent settle a design question, paste this:

Use Basic Memory. Write a note titled "Auth: why we chose
session cookies over JWT" capturing the decision we just made,
the two alternatives we rejected and why, and the files this
touches. Tag it #auth #architecture. Next session, when I ask
about auth, read this note back before proposing changes.

The agent creates a markdown file in your memory folder, links it into the semantic graph by tag, and on the next launch can retrieve it when auth comes up — so it stops re-litigating the JWT-vs-cookie question you already closed. You can open that same file in your editor and fix any detail the agent got wrong; the correction sticks because it is just text on disk.
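The "just text on disk" property can be sketched in a few lines. This is not Basic Memory's actual file layout: the Tags line, the helper names, and the note body are assumptions made up for illustration. It only shows why a hand edit sticks.

```python
from pathlib import Path
import tempfile

def write_note(folder: Path, slug: str, title: str, tags: list[str], body: str) -> Path:
    """Write one note as a markdown file with a simple tags line (hypothetical layout)."""
    path = folder / f"{slug}.md"
    tag_line = " ".join(f"#{t}" for t in tags)
    path.write_text(f"# {title}\n\nTags: {tag_line}\n\n{body}\n", encoding="utf-8")
    return path

def notes_with_tag(folder: Path, tag: str) -> list[Path]:
    """Find notes whose tags line contains the exact tag."""
    matches = []
    for path in sorted(folder.glob("*.md")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.startswith("Tags:") and f"#{tag}" in line.split():
                matches.append(path)
                break
    return matches

folder = Path(tempfile.mkdtemp())
note = write_note(
    folder,
    "auth-session-cookies",
    "Auth: why we chose session cookies over JWT",
    ["auth", "architecture"],
    "Decision: sesion cookies.",  # deliberate typo, fixed by hand below
)

# A human correction is an ordinary text edit on the same file.
fixed = note.read_text(encoding="utf-8").replace("sesion", "session")
note.write_text(fixed, encoding="utf-8")

print([p.name for p in notes_with_tag(folder, "auth")])
```

Because the store is a folder of files, the same correction could equally be made in an editor and reviewed in a pull request.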

Skip it if...

Your real need is resuming a long task with dependencies between steps — markdown notes don’t model “step B is blocked on step A.” Beads is built for that. And if your corpus grows into thousands of notes, plain-text retrieval gets coarse; a vector store like RAG Memory finds relevant context more precisely at that scale.

Beads — install + recipe

What it does best

Beads treats memory as a dependency-aware graph of issues rather than a pile of notes. Steve Yegge built it for the exact failure mode where a coding agent loses the thread of a long task — it stores work items, the relationships between them, and what blocks what, in a version-controlled store inside your repo. The agent records the plan as structured issues, then on the next session reads back “here is what’s open, here is what’s blocked, here is what I finished” and resumes, instead of re-deriving the whole plan from the code. The repo’s own pitch is a “drop-in memory upgrade for your coding agent.”

Pick this if you...

  • Run the agent on multi-day, multi-step work where it keeps forgetting which subtask is next
  • Want task state and dependencies tracked, not just free-form notes
  • Need memory that lives in the repo and survives merges without ID collisions
  • Are chasing the trending coding-agent-memory pattern and want the one HN is talking about right now

Recipe: hand off a refactor across sessions

In Claude Code with Beads installed (it exposes the bd CLI through MCP), kick off a long job like this:

Use Beads. Break the "migrate the payments module off the
legacy SDK" work into issues with dependencies: schema changes
block the service rewrite, which blocks the tests. Create them,
mark the first as in-progress, and record what you finish.
Tomorrow, before writing any code, read the open issues and
tell me the next unblocked one.

The agent creates the issue graph, tracks status as it works, and writes that state into the repo store. On the next launch it queries the graph for the next unblocked issue and continues — no “remind me what we were doing” round-trip. Because the store is version-controlled and uses collision-resistant IDs, two agents (or you and an agent) can touch it without trampling each other.
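The "next unblocked issue" query is easy to picture as a small graph walk. The sketch below is a hypothetical data model, not Beads' schema or CLI: the Issue class, the status strings, and the hash-based ID scheme are all assumptions chosen to mirror the recipe.

```python
from dataclasses import dataclass, field
from typing import Optional
import hashlib

@dataclass
class Issue:
    title: str
    status: str = "open"  # "open", "in_progress", or "done"
    blocked_by: list[str] = field(default_factory=list)

def issue_id(title: str) -> str:
    """Short content hash as an ID; a hypothetical scheme, not Beads' own."""
    return "iss-" + hashlib.sha256(title.encode("utf-8")).hexdigest()[:8]

def add_issue(graph: dict[str, Issue], title: str,
              blocked_by: Optional[list[str]] = None) -> str:
    """Insert an issue and return its ID."""
    iid = issue_id(title)
    graph[iid] = Issue(title, blocked_by=list(blocked_by or []))
    return iid

def next_unblocked(graph: dict[str, Issue]) -> list[str]:
    """Titles of unfinished issues whose blockers are all done."""
    ready = []
    for issue in graph.values():
        if issue.status == "done":
            continue
        if all(graph[b].status == "done" for b in issue.blocked_by):
            ready.append(issue.title)
    return ready

graph: dict[str, Issue] = {}
schema = add_issue(graph, "Schema changes")
service = add_issue(graph, "Service rewrite", blocked_by=[schema])
tests = add_issue(graph, "Tests", blocked_by=[service])

graph[schema].status = "in_progress"
print(next_unblocked(graph))  # only the schema work is actionable

graph[schema].status = "done"
print(next_unblocked(graph))  # the service rewrite is now unblocked
```

Deriving IDs from content rather than a counter is one way two writers can create issues independently without colliding on "issue 7"; how Beads actually generates its IDs is documented in its repo.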

Skip it if...

You just want the agent to remember facts and decisions, not manage a task graph. For free-form “remember that we decided X” memory, Basic Memory’s markdown notes are lighter and more readable than modeling everything as issues with dependencies.

RAG Memory — install + recipe

What it does best

RAG Memory is the retrieval-first option: it embeds the agent’s memory and combines vector search with a knowledge graph, so the agent finds relevant past context by meaning rather than exact keyword. That matters once the store grows — when you have hundreds of accumulated notes, asking “what do we know about rate limiting” should surface the three relevant entries even if none used those exact words. The graph layer adds relationships on top of the vectors, so the agent can follow links between related pieces of context instead of treating every note as an island.

Pick this if you...

  • Expect a large memory corpus where keyword search would miss the relevant note
  • Want semantic “find what’s related” retrieval, not just append-and-reread
  • Like the idea of relationships between memories, not a flat list of notes
  • Are comfortable with an embedded vector store rather than human-readable files

Recipe: recall every constraint touching a subsystem

In Cursor or Claude Code with RAG Memory installed, after weeks of accumulated notes, ask:

Use RAG Memory. Search everything we've stored for context
about the billing subsystem — constraints, gotchas, past bugs,
and decisions — even if the notes don't use the word "billing".
Return the top matches with how they relate to each other, then
summarize the constraints I need to respect before I touch
invoice generation.

The agent runs a vector query over the embedded store, pulls the semantically closest entries, walks the knowledge-graph edges to find connected notes, and returns a ranked, relationship-aware summary. This is the query shape markdown notes handle poorly: you don’t have to know the right keyword, because retrieval is by meaning.
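The two-stage shape described above (rank by vector similarity, then follow graph edges) can be sketched as follows. Everything here is a toy: the "embedding" is a plain word count, so unlike a learned embedding model it only matches shared words, and the notes and edges are hypothetical. It is not RAG Memory's implementation.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real setup would use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical notes and relationships, for illustration only.
NOTES = {
    "n1": "Invoice generation must be idempotent",
    "n2": "Retry invoice webhooks with backoff",
    "n3": "Staging database resets nightly",
}
EDGES = {"n1": ["n2"], "n2": ["n1"], "n3": []}

def search(query: str, k: int = 1) -> list[str]:
    """IDs of the k notes most similar to the query."""
    q = embed(query)
    ranked = sorted(NOTES, key=lambda nid: cosine(q, embed(NOTES[nid])), reverse=True)
    return ranked[:k]

def with_neighbors(ids: list[str]) -> list[str]:
    """Extend the hits with notes one graph edge away, keeping order."""
    seen = list(ids)
    for nid in ids:
        for other in EDGES[nid]:
            if other not in seen:
                seen.append(other)
    return seen

top = search("invoice generation constraints")
print(with_neighbors(top))
```

The graph step is what pulls in the webhook note even though it ranked lower on similarity alone.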

Skip it if...

You want to read and hand-edit the memory — an embedded vector store isn’t a folder of files you open in your editor. For inspectable, commit-it-to-git memory, Basic Memory is the better fit. And for a tiny project where the whole memory fits in a CLAUDE.md, a vector store is more machinery than the job needs.

Knowledge Graph Memory — install + recipe

What it does best

Knowledge Graph Memory is the official reference implementation from the modelcontextprotocol/servers repo, and its strength is being the minimal, canonical version of entity-and-relation memory. It stores facts as entities, relations, and observations in a local graph — “the user prefers X,” “service A depends on service B” — and lets the agent query that graph across sessions. Because it ships in the protocol’s own repository, it is the most conservative pick: small surface, maintained alongside MCP itself, nothing exotic to break.

Pick this if you...

  • Want the smallest, best-known memory server and value being on the official reference path
  • Think in entities and relations — “who depends on what” — more than documents or tasks
  • Want a starting point you can read end-to-end and fork if you need something custom
  • Prefer minimal moving parts over a richer feature set

Recipe: teach the agent your service topology

In Claude Code with the memory server installed, seed it once:

Use the memory server. Record these entities and relations:
"checkout-api" depends_on "payments-service" and "inventory-db";
"payments-service" owned_by team "Billing"; observation on
"checkout-api": "p99 is latency-sensitive, never block on
inventory writes". Next time I ask about checkout, read the
graph first so you respect these relationships.

The agent writes entities, relations, and observations into the local graph file. On later sessions it queries the graph before reasoning about checkout, so it already knows the dependency chain and the latency constraint without you re-explaining the topology. The store is a plain local file, so it is easy to inspect, though it is structured graph data rather than prose notes.
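For a sense of what entity-relation memory looks like as data, here is a sketch that encodes the recipe's topology, saves it to a JSON file, reloads it, and queries it. The dictionary layout, the file name, and the related helper are assumptions for illustration; the server's real on-disk format is defined in its repo.

```python
import json
import tempfile
from pathlib import Path

# Entities, relations, and observations from the recipe above,
# in a hypothetical layout chosen for this sketch.
graph = {
    "entities": {
        "checkout-api": {
            "observations": ["p99 is latency-sensitive, never block on inventory writes"]
        },
        "payments-service": {"observations": []},
        "inventory-db": {"observations": []},
        "Billing": {"observations": []},
    },
    "relations": [
        {"from": "checkout-api", "type": "depends_on", "to": "payments-service"},
        {"from": "checkout-api", "type": "depends_on", "to": "inventory-db"},
        {"from": "payments-service", "type": "owned_by", "to": "Billing"},
    ],
}

def related(g: dict, entity: str, relation_type: str) -> list[str]:
    """Targets of all relations of one type starting at an entity."""
    return [
        r["to"] for r in g["relations"]
        if r["from"] == entity and r["type"] == relation_type
    ]

# Round-trip through a file to mimic a later session reading the store.
path = Path(tempfile.mkdtemp()) / "memory.json"
path.write_text(json.dumps(graph, indent=2), encoding="utf-8")
loaded = json.loads(path.read_text(encoding="utf-8"))

print(related(loaded, "checkout-api", "depends_on"))
print(loaded["entities"]["checkout-api"]["observations"])
```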

Skip it if...

You want semantic search over a big note corpus (RAG Memory), resumable task tracking (Beads), or human-readable markdown you edit by hand (Basic Memory). Knowledge Graph Memory is deliberately minimal — that is the point, and also the limit.

Memory MCP vs CLAUDE.md — when to use which

The most common question on these launches is whether you even need a memory server when Claude Code already reads a CLAUDE.md file. They are not competitors — they are two layers, and the line between them is who owns the fact.

Use CLAUDE.md for stable, human-owned facts

Build commands, architecture, conventions, the rules you want enforced every session. You write and edit it; Claude Code loads it automatically at session start. It does not grow on its own, and that is a feature — it stays a tight, curated brief. See the annotated CLAUDE.md walkthrough for how to keep it lean.

Use a memory MCP for agent-owned, accumulating facts

Decisions the agent reaches mid-task, task state across sessions, learned gotchas, the “why we ruled that out” trail. The agent writes these as it works and reads them back later. Putting all of this into CLAUDE.md by hand doesn’t scale, and it bloats the context the agent loads every single turn.

The honest middle ground

For a small project, “six markdown files I maintain by hand” genuinely beats any server — a real r/ClaudeAI thread makes exactly that case, and the skeptics on the Recall HN launch agree. A memory MCP earns its keep once the volume of agent-accumulated context outgrows what you want to curate manually. Start with CLAUDE.md; add a server when the manual upkeep hurts.

Benchmark them yourself

There is no single “best” memory server because the right one depends on what you store and how you query it. Spend 30 minutes on the methodology below; it produces a verdict tailored to your project instead of a generic ranking.

# Pick 3 real "I keep re-explaining this" moments from your week.
SCENARIOS=(
  "I re-explained why we chose X over Y for the third time."
  "The agent restarted a refactor it had half-finished yesterday."
  "I asked 'what do we know about subsystem Z' and got nothing."
)

# For each server, capture:
#   1. Write friction — how clean is the prompt to store the fact?
#   2. Recall accuracy — next session, does it retrieve the RIGHT note?
#   3. Token cost — how much context does a recall pull into the prompt?
#   4. Inspectability — can YOU read/fix what it stored?
#   5. Staleness handling — when a fact changes, how hard to update?

# Compare across:
#   - Basic Memory (markdown notes)
#   - Beads (issue/dependency graph)
#   - RAG Memory (vector + graph)
#   - Knowledge Graph Memory (entity-relation)

Expect Basic Memory to win inspectability and write-friction, Beads to win resumable-task scenarios, RAG Memory to win recall accuracy on a large corpus, and Knowledge Graph Memory to win on minimal footprint. The deciding factor is almost always which of the three scenarios above hurts you most — run it against your own week, not a synthetic benchmark.
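If you want to tally the five criteria rather than eyeball them, a small score sheet is enough. The sketch below assumes a 1 to 5 score per criterion with equal weights, which is a choice made here and not part of the methodology above; the server names and numbers are placeholders, not measurements.

```python
CRITERIA = [
    "write_friction",
    "recall_accuracy",
    "token_cost",
    "inspectability",
    "staleness",
]

def total(scores: dict[str, int]) -> int:
    """Sum one server's scores, insisting every criterion is filled in."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    return sum(scores[c] for c in CRITERIA)

def rank(results: dict[str, dict[str, int]]) -> list[tuple[str, int]]:
    """Servers ordered from highest to lowest total."""
    totals = [(name, total(scores)) for name, scores in results.items()]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)

# Placeholder numbers only, to show the shape of the sheet.
# Replace them with what you observe in your own scenarios.
example = {
    "server_a": dict.fromkeys(CRITERIA, 3),
    "server_b": {**dict.fromkeys(CRITERIA, 3), "recall_accuracy": 5},
}
print(rank(example))
```

If one scenario hurts far more than the others, weighting its criterion more heavily may reflect your situation better than an equal-weight sum.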

Common pitfalls

Loading the whole store every turn

The recurring HN criticism of these launches: an agent that re-reads its entire memory file each turn spends more tokens than it saves. The win is retrieval — let the agent search for the few relevant notes and pull only those. If your setup dumps everything into context, you have rebuilt a slow CLAUDE.md, not a memory system.

Stale or poisoned memory

A note the agent wrote three weeks ago can be wrong now, and a wrong memory is worse than no memory because the agent trusts it. Prune the store the way you prune a CLAUDE.md. Basic Memory makes this easiest — the notes are files you can read and fix; opaque vector stores make it harder to spot the bad entry.
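For a file-based store, one simple pruning aid is to list notes that have not been touched in a while and review them by hand. The sketch assumes markdown notes in a single folder and uses file modification time as the staleness signal; the three-week threshold and helper name are assumptions, and age alone does not prove a note is wrong.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

def stale_notes(folder: Path, max_age_days: float,
                now: Optional[float] = None) -> list[Path]:
    """Notes whose last modification is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(p for p in folder.glob("*.md") if p.stat().st_mtime < cutoff)

# Build a throwaway folder with one fresh note and one backdated note.
folder = Path(tempfile.mkdtemp())
fresh = folder / "fresh.md"
old = folder / "old.md"
fresh.write_text("written today\n", encoding="utf-8")
old.write_text("written a month ago\n", encoding="utf-8")

month_ago = time.time() - 30 * 86400
os.utime(old, (month_ago, month_ago))  # backdate the file for the demo

print([p.name for p in stale_notes(folder, max_age_days=21)])
```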

Letting the agent decide what’s worth remembering

Without guidance, an agent either stores nothing or stores everything. Tell it explicitly what to capture — decisions, constraints, task state — and what to skip — transient debug chatter. The recipe prompts above are written that way on purpose: they name the fact to store, not “remember this conversation.”

Reaching for a server when CLAUDE.md would do

For a small or short-lived project, a handful of markdown files you maintain by hand beats any memory server — less setup, fully under your control, no retrieval to tune. Add a memory MCP when the volume of agent-accumulated context genuinely outgrows manual curation, not before.

Community signal

Memory for coding agents is the most crowded launch cluster on Hacker News right now. In June 2026 alone the front page carried “Show HN: Recall – Local project memory for Claude Code” (124 points), alongside earlier entries like PMB, Mimirs, and OpenLTM. Recall, Mimirs, and PMB aren’t in our catalog yet — we mention them so you have the full landscape, not just the four servers we cover. The signal from the thread is worth more than the launch count: the top comments are skeptical, and usefully so.

The recurring pushback on the Recall thread is that experienced Claude Code users often just point the agent at the right files and keep a tight CLAUDE.md, and that bolting on a memory layer can waste tokens or drag in stale context rather than help. That same instinct shows up across r/ClaudeAI, where one widely-read post described dropping elaborate tooling in favor of “six markdown files” to stop re-explaining a project to Claude every session. And Beads, the trending entrant in this exact space, frames itself as a “drop-in memory upgrade” precisely because it structures memory as a task graph rather than yet another note pile. The honest read of the community in mid-2026: memory servers are real and improving, but the bar to beat a well-kept CLAUDE.md is higher than the launch hype implies. Pick by the pain you actually have.

Frequently asked questions

How do I give Claude Code persistent memory across sessions?

Two layers. The built-in layer is the CLAUDE.md file Claude Code reads at the start of every session — put conventions, architecture notes, and commands there and they persist for free, no server needed. The second layer is a memory MCP server: you install one (Basic Memory, Beads, RAG Memory, or Knowledge Graph Memory), and the agent gains tools to write notes, decisions, and task state into a store that outlives the conversation, then read them back next time. CLAUDE.md is static context you maintain by hand; a memory MCP is a store the agent reads and writes on its own. Most teams use both: CLAUDE.md for the durable rules, a memory server for accumulating session-to-session knowledge.

What's the difference between a CLAUDE.md file and a memory MCP server?

CLAUDE.md is a single markdown file you write and edit by hand; Claude Code loads it into context automatically at session start. It is perfect for stable facts — build commands, architecture, conventions — but it does not grow on its own, and stuffing everything into it eventually bloats your context window. A memory MCP server is a tool the agent calls: it can write new notes when it learns something, search past notes when it needs them, and keep task state and dependencies between issues. The practical rule: if a human should own the fact and it rarely changes, it belongs in CLAUDE.md; if the agent should accumulate the fact across sessions, it belongs in a memory MCP.

Does Claude Code remember between sessions by default?

Only what you persist. A fresh session starts with no conversation history — Claude Code re-reads your project files and your CLAUDE.md, but it does not recall what you discussed yesterday unless that was written somewhere on disk. That is exactly the gap memory MCP servers fill: they give the agent a durable store it can write to mid-session and read from on the next launch, so decisions, task state, and learned context survive a restart instead of evaporating with the context window.

Which memory MCP server is best for a coding agent in 2026?

It depends on the shape of what you want to remember. For human-readable project notes you also want to read and edit yourself, Basic Memory writes plain markdown to your disk. For tracking long, multi-step work where tasks depend on other tasks — the agent picking up where it left off — Beads models issues and dependencies as a graph. For 'find the relevant past context by meaning' over a large note corpus, RAG Memory combines vector search with a knowledge graph. For the minimal, official reference implementation of an entity-and-relation memory, Knowledge Graph Memory from the modelcontextprotocol repo is the smallest footprint. Pick by question, not by popularity.

Can I use these memory servers with Cursor and Codex, not just Claude Code?

Yes. All four are standard stdio MCP servers, so any MCP-compatible client installs them through the same mcpServers JSON block: Cursor, VS Code, Claude Code, Claude Desktop, Windsurf, Codex CLI, and Gemini CLI included. There is nothing Claude-specific in the protocol; the agent gets the same read/write memory tools regardless of which client mounts the server. The only thing that changes per client is where you paste the config, which each server's MCP.Directory page (listed in the matrix) spells out.

Do memory MCP servers store data locally or in the cloud?

All four in this comparison are local-first. Basic Memory writes markdown files to a directory you choose. Beads keeps a version-controlled database in your repo. RAG Memory and Knowledge Graph Memory store to a local file or embedded database on disk. Nothing leaves your machine unless you wire it up to do so, which matters if your codebase is proprietary. This is the main distinction from hosted agent-memory platforms; for application or agent-framework memory in the cloud, see our mem0 vs Letta vs Zep vs Cognee comparison instead.

Will a memory MCP server slow down or bloat my context window?

It can, if you let the agent dump everything it remembers into every prompt. The win comes from retrieval, not from loading the whole store — a good memory server lets the agent search for the few relevant notes and pull only those into context. The failure mode HN keeps flagging on these launches is the opposite: an agent that re-reads its entire memory file each turn wastes more tokens than it saves, and stale or wrong notes actively mislead. Cap what gets loaded, prefer search over full-dump, and prune notes the way you would prune a CLAUDE.md.

Is there a free open-source memory server for Claude Code?

All four servers here are open-source and free to run — Basic Memory, Beads, RAG Memory, and Knowledge Graph Memory are each MIT-licensed and run locally with no paid SaaS attached. Knowledge Graph Memory is the official reference implementation from the modelcontextprotocol/servers repo, so it is the most conservative pick if you want something maintained alongside the protocol itself. None of them require an account, a subscription, or sending your code to a third party.

Sources

Basic Memory

Beads

RAG Memory

Knowledge Graph Memory

Community
