Skills vs MCP vs Subagents vs CLI (2026)
Four ways to extend a Claude-class agent in 2026 — Skills, MCP servers, subagents, and plain CLI tools — and four loud vendor camps insisting their pick is the future. The differences are real, but smaller than the marketing makes them sound. We pulled every claim in this post from a primary source: Anthropic’s engineering posts, Vercel’s eval write-up, Cloudflare’s Code Mode benchmark, Simon Willison’s analysis, the open spec at agentskills/agentskills, and the Hacker News threads where the practitioners argued it out. No vendor partisanship. If you read one piece on this topic in 2026, make it this one — and start with our What is MCP primer if you’re new to the protocol layer.

TL;DR — decide in 30 seconds
- Skill if the answer is “teach the agent a repeatable procedure”: brand-styled docs, repo-specific lint rules, a PDF-to-Markdown pipeline, a Godot scene-tree generator. The data lives on disk or in code; the value is the recipe.
- MCP server if the answer is “connect to a live external system”: GitHub, Linear, Stripe, Notion, Sentry, an internal Postgres, a paid docs index. The data is remote, authenticated, and changes minute-to-minute.
- Subagent if the answer is “run a side quest without polluting my main context”: log triage, codebase exploration, a multi-step research task whose intermediate output you don’t need.
- CLI / AGENTS.md if the answer is “the tool already exists, just let the agent run it”: `git`, `jq`, `kubectl`, your existing test runner. Drop the invocation patterns into `AGENTS.md` or `CLAUDE.md` and stop there.
Most production setups use two or three of these together. The mistake we keep seeing is teams picking one and forcing everything through it — that’s how you end up with a 47-tool MCP config eating 80,000 tokens before the user has typed a word, or a 200-file Skills folder where five lines in AGENTS.md would have done the job.
The four primitives, defined
Each primitive has a canonical, dated definition from the vendor that introduced it. We’ll anchor the rest of the post in those definitions; everything after that is comparison.
1. MCP servers (Anthropic, November 2024)
Per modelcontextprotocol.io, verbatim: “MCP (Model Context Protocol) is an open-source standard for connecting AI applications to external systems… Think of MCP like a USB-C port for AI applications.” An MCP server is a process — local stdio, remote SSE, or streamable HTTP — that exposes three primitives: tools (function calls the agent can invoke), resources (read-only documents the agent can fetch), and prompts (parametric prompt templates). Optional capabilities include sampling (server-initiated LLM calls), elicitation, and roots. The spec is governed by the MCP Steering Committee with broad client support — Claude, ChatGPT, Cursor, VS Code, Windsurf, Codex, Antigravity, and dozens more.
MCP’s superpower is connection. It does not care what the agent is or how it reasons; it cares that you’ve got a Stripe account and a model that can call tools. Trade-off: every connected server’s tool descriptions get loaded into the system prompt, which is where the token-cost debate of late 2025 came from.
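For concreteness, here is roughly what connecting a server looks like in Claude Code’s project-scoped `.mcp.json`. A minimal sketch; the package name and env var are illustrative, so check the server’s own README:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<token>" }
    }
  }
}
```

Every server in this file contributes its tool descriptions to the system prompt, which is exactly the cost this post keeps returning to.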
2. Subagents (Claude Code, mid-2025)
Subagents are spawned worker conversations defined as Markdown files under .claude/agents/. Per the Claude Code docs, verbatim: “Subagents are specialized AI assistants that handle specific types of tasks. Use one when a side task would flood your main conversation with search results, logs, or file contents you won’t reference again: the subagent does that work in its own context and returns only the summary.”
Each subagent file declares a name, description, optional model override, and a tool allowlist. When the orchestrating Claude detects a matching task, it delegates to the subagent, which runs in its own context window with the subagent’s system prompt and tool scope. The orchestrator only sees the summary the subagent returns. Subagents do not replace MCP or Skills — a subagent uses MCP tools and loads Skills. They’re an orchestration layer on top.
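A minimal sketch of a subagent file, using the fields described in the docs; the agent itself is a made-up example:

```markdown
---
name: log-triage
description: Reads noisy logs or grep output and reports only the lines
  that matter. Use for side quests whose raw output we will not need again.
tools: Read, Grep, Glob
model: haiku
---

You are a log-triage specialist. Read whatever you are pointed at,
find the handful of lines that explain the problem, and return only
those lines plus a one-paragraph summary. Never edit files.
```

Note the allowlist: no Bash, no Edit. The orchestrator keeps those for itself.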
3. CLI / AGENTS.md (multi-vendor convention, 2024 → present)
The simplest and most underrated of the four. Most coding agents — Claude Code, Cursor, OpenAI Codex CLI, Aider, Continue, Cline, Antigravity — read a Markdown file in the repo root on every turn. Anthropic calls it CLAUDE.md; the cross-vendor neutral name is AGENTS.md. Whatever you put in that file is pinned context: project-specific commands, code style, gotchas, the “don’t edit generated/” rules. Combined with shell tools the agent already has — git, npm, jq, rg, kubectl — this is the lowest-overhead extension surface in the entire ecosystem.
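A representative sketch (the contents are invented, but this is the shape that pays off):

```markdown
# AGENTS.md
- Build: `npm run build`. Test: `npm test`. Lint before every commit.
- Never edit anything under `generated/`; run `npm run refresh` instead.
- Page data lives in `data/*.json`; all pages are static-rendered.
- For PDF ingestion tasks, use the pdf-to-markdown skill.
```

Four lines, loaded every turn, zero discovery risk.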
Vercel argues this is more powerful than people give it credit for. Their bench: an 8KB documentation index in AGENTS.md outperformed Skills in the same eval. More on that below.
4. Skills (Anthropic, October 16 2025)
Per claude.com/blog/skills, verbatim: “Skills are folders that include instructions, scripts, and resources that Claude can load when needed.” The format is a SKILL.md with YAML frontmatter (just name and description required) plus an optional body of instructions and an optional bundle of helper files Claude can read or execute on demand.
The pattern is what Anthropic’s engineering team calls progressive disclosure: at startup, only the name and description of every available skill is in the system prompt; the full SKILL.md body loads on demand when Claude decides the skill applies; helper files load only as needed during execution. Verbatim from Anthropic: “Progressive disclosure is the core design principle that makes Agent Skills flexible and scalable.” The spec went from a Claude-only feature in October 2025 to a cross-vendor open format in early 2026, with agentskills/agentskills as the reference and vercel-labs/skills as the de facto installer.
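The whole format fits on one screen. A minimal hypothetical SKILL.md for the PDF pipeline this post keeps mentioning; the helper-script path is an invented example:

```markdown
---
name: pdf-to-markdown
description: Converts PDF files to clean Markdown. Use when the user asks
  to extract, convert, or reformat a PDF document.
---

# PDF to Markdown

1. Run `python scripts/extract.py <input.pdf>` to pull raw text and layout hints.
2. Rebuild headings from the layout hints; render tables as pipe tables.
3. Write the output next to the source as `<input>.md` and report the path.
```

At startup the agent sees only the frontmatter, roughly fifty tokens; the body and `scripts/extract.py` load only when a PDF task actually appears.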
If you want a deeper primer on what Skills are and how to ship one, our What are Claude Code Skills post walks through the lifecycle. The cookbook posts under /blog/claude-godot-skill-guide and /blog/claude-pdf-to-markdown-skill-guide show two opposite shapes — a domain-specific generator and a deterministic data pipeline.
The big debate (verbatim quotes)
This is the section vendor blogs tiptoe around. Four positions, four real sources, copied character-for-character from the URLs cited.
“I expect we'll see a Cambrian explosion in Skills which will make this year's MCP rush look pedestrian by comparison.”
Simon Willison · Blog
Headline post on Skills launch day, October 16 2025. The line that defined the early narrative.
“Tool calls are incredibly interesting and useful. MCP is just one means to that end, and not one of the better ones.”
tptacek · Hacker News
Top-rated HN comment on the Willison post. The most-cited security-aware-engineer take that MCP is overrated.
“Fundamentally you're getting hyped over a framework to append text to your prompt?”
rafaelmn · Hacker News
The skeptical counter on the Skills launch HN thread (816 points). Captures the 'it's just RAG with marketing' camp.
“A compressed 8KB docs index embedded directly in AGENTS.md achieved a 100% pass rate, while skills maxed out at 79% even with explicit instructions telling the agent to use them.”
Vercel Engineering · Blog
Vercel's eval claim: AGENTS.md, loaded every turn, beats Skills, loaded on demand. Their argument for keeping things simple.
Two more from Anthropic itself. From the engineering post on Agent Skills, verbatim: “We’ll also explore how Skills can complement Model Context Protocol (MCP) servers by teaching agents more complex workflows that involve external tools and software.” Notice the word — complement, not replace. And from Anthropic’s Code-Execution-with-MCP post, verbatim: “Tool descriptions occupy more context window space, increasing response time and costs. In cases where agents are connected to thousands of tools, they’ll need to process hundreds of thousands of tokens before reading a request.” Anthropic acknowledges the MCP token problem and proposes Code Execution as the fix — that move is what Cloudflare echoed with Code Mode a month later.
The decision matrix
Six rows, four columns. Cells are Win / Tie / Lose with a one-line why. Treat Win as “this is the right primitive,” Tie as “works but with caveats,” Lose as “wrong tool, find another.”
| Use case | Skill | MCP | Subagent | CLI / AGENTS.md |
|---|---|---|---|---|
| Read-only API call (GitHub, Stripe) | Lose — needs auth + live data | Win — exactly the use case | Tie — orchestrates MCP | Tie — works via curl, brittle |
| Multi-step procedural workflow (PDF→MD, brand docs) | Win — recipe + helper scripts | Lose — connection-shaped problem | Tie — useful as wrapper | Tie — good if procedure is short |
| IDE integration (in-editor docs) | Tie — works if loaded | Win — Cursor/VS Code first-class | Lose — wrong layer | Lose — out of scope |
| Cross-vendor portability | Win — Vercel/Mastra/Google ship skills | Win — open spec, broad clients | Lose — Claude-Code-specific | Win — every agent reads AGENTS.md |
| Token efficiency at scale | Win — progressive disclosure | Lose — without Code Mode/Tool Search | Win — isolates context | Tie — small file, but every turn |
| Security boundary / sandbox | Tie — runtime-dependent | Win — process isolation + OAuth | Win — restricted tool list | Lose — full shell access |
Read across, not down: there is no universal winner. Three takeaways:
- MCP wins read-only API access. That’s what it was built for, and the vendor-friendly OAuth-and-tools shape is genuinely better than CLI-with-API-keys.
- Skills win procedural workflows. The progressive-disclosure model means you can have 100 specialized procedures available without paying their full token cost.
- AGENTS.md wins portability at zero cost. Loaded every turn, no discovery overhead, every agent supports it.
The one case the matrix can’t settle — a multi-step procedural workflow that needs live API data — is exactly where you stack a Skill and an MCP server together.
When to pick a Skill
Pick a Skill when the procedure is repeatable, the instructions are stable, and the data is on disk or fetchable through code. Five concrete examples:
- Document generation with brand rules. Anthropic’s built-in `pptx` and `docx` Skills — “reading and generating Excel spreadsheets with formulas, creating PowerPoint presentations, generating Word documents” — are the canonical case. Our deep dive on the pdf-to-markdown Skill walks through the same shape for ingestion pipelines.
- Repo-specific scaffolding. Our Claude Godot Skill guide shows ten game-prototype recipes — every one a Skill, every one packing the genre’s scene-tree conventions and GDScript idioms into a single SKILL.md plus helper validators.
- UI generation with a design system. The UI/UX Pro Max Skill demonstrates encoding a shadcn + Tailwind opinion into a Skill — far more reusable than re-pasting a 200-line system prompt.
- Cross-tool bridges. codex-cli skill is the cleanest example: a Skill that teaches Claude how to shell out to OpenAI’s Codex CLI as a second-opinion reviewer.
- Skill-of-skills. Anthropic ships skill-creator, find-skills, and install-skills as meta-skills that bootstrap the whole flow.
One-line install for Anthropic’s skill-creator (installs to `.claude/skills/skill-creator`):

```sh
mkdir -p .claude/skills/skill-creator && curl -L -o skill.zip "https://mcp.directory/api/skills/download/2" && unzip -o skill.zip -d .claude/skills/skill-creator && rm skill.zip
```
The signal that a Skill is the right primitive: when you keep copy-pasting the same 10-paragraph system prompt at the start of new conversations, that prompt belongs in a SKILL.md.
When to pick an MCP server
Pick MCP when the data lives behind authentication or changes faster than you can write Markdown. Five examples:
- Hosted docs indices. Context7, sequential-thinking, and the cluster of docs-RAG servers we covered in our Context7 vs DeepWiki vs Ref Tools vs Docfork comparison. The index is too big and updates too often to ship as a Skill.
- Authenticated SaaS access. GitHub, Linear, Stripe, Notion, Sentry, GitLab — all have first-party or high-quality community MCP servers that handle the OAuth dance once and expose typed tool calls.
- Browser automation. Playwright and the Chrome DevTools MCP server expose a level of capability you cannot recreate with a SKILL.md plus shell scripts. The server keeps an open browser session across calls.
- Live database queries. Postgres, Supabase, ClickHouse MCP servers let the agent run typed SELECTs with schema introspection. Procedural-style Skills cannot approximate this safely — you want the server’s permission boundary.
- Cross-client distribution. If you’re building something that should work in Claude, Cursor, VS Code, and ChatGPT all at once — MCP is the lingua franca. Skills are getting there but the runtime semantics still vary.
The signal that MCP is the right primitive: the agent needs to know what is true right now in someone else’s system. That’s a connection problem, not a knowledge problem.
When to pick a subagent
Subagents are the least-understood primitive of the four — people often confuse them with MCP servers or with Skills. They are neither. They are orchestration: a way to spin up a worker conversation, restrict its tool access, and only see its summary back. Per the Claude Code docs they help you “preserve context by keeping exploration and implementation out of your main conversation.”
Pick a subagent when:
- A side task would flood the main context. Log triage that returns 200 KB of grep output — let a subagent read it and report the three lines that matter.
- You want a cheaper model for routine work. The docs note “control costs by routing tasks to faster, cheaper models like Haiku.” A subagent can be pinned to Haiku while the orchestrator stays on Sonnet or Opus.
- You want a narrower tool surface. The orchestrator has full repo access; the subagent has only `Read` and `Grep`. That’s a real safety boundary, especially for “analyze this but do not edit” flows.
- You’re building a parallel pipeline. Spawn N subagents to investigate N hypotheses simultaneously. The community-favorite Skill subagent-driven-development encodes this pattern.
Important: subagents are a Claude Code feature. Other agent runtimes have analogues — Cursor’s “agent tabs,” Antigravity’s parallel agents, Vercel AI’s spawn primitive — but the Markdown-file-with-frontmatter shape is Anthropic-specific. If cross-vendor portability is a requirement, a subagent is the wrong primitive; reach for an MCP server or a Skill.
When to pick a CLI tool / AGENTS.md
The most-overlooked primitive. If the tool already exists as a shell command, the cheapest extension is a sentence in AGENTS.md telling the agent to run it. Vercel made this argument loudly enough to get its own blog post, and the eval data, while limited, is real.
Pick CLI/AGENTS.md when:
- The tool exists. `git status`, `npm test`, `jq '.servers[].slug'`, `kubectl get pods` — wrapping any of these in a Skill or an MCP server is overkill.
- You need persistent, every-turn guidance. Project conventions, “do not edit `generated/`,” the canonical `refresh` command — these are `AGENTS.md` entries because the agent should see them on every turn, not have to discover them.
- Your repo is the system of record. The agent has the file tree; `cat README.md` and `find . -name "*.test.ts"` are often better than any RAG.
- You’re writing for cross-vendor agents. Per the Claude Code best-practices guide, `CLAUDE.md` is read by every Claude session; `AGENTS.md` is read by Cursor, Codex, OpenCode, Aider, Continue, Gemini CLI, and Antigravity — making it the only universal extension surface.
The signal that AGENTS.md is the right primitive: when you find yourself thinking “the agent should always know this.” That’s an AGENTS.md line.
Cross-agent portability — the reality
A common claim in early Skills coverage was that Anthropic Skills would “just work” in every agent. That claim has aged into a more nuanced truth.
The format is portable. A SKILL.md is Markdown plus YAML frontmatter — every modern agent runtime can parse it. The agentskills/agentskills spec is the Apache-2.0 reference, governed by Anthropic with community contributions. Vercel’s npx skills add CLI claims support for “OpenCode, Claude Code, Codex, Cursor, and 50 more” agents. google/skills ships first-party Skills under Apache 2.0 (Gemini API, AlloyDB, BigQuery, Cloud Run, the Well-Architected Framework series). mastra-ai/skills ships the Mastra framework Skill. The wire-level interoperability is genuinely there.
The runtime semantics are not. Three concrete points where portability breaks:
- Progressive disclosure. Anthropic’s agents load only the frontmatter at startup and the body on demand. Many third-party runtimes load the full `SKILL.md` body up front, defeating the token saving. Verify in your agent’s docs.
- Code execution. A Skill that ships a `scripts/validate.py` and tells Claude to run it works trivially in Claude Code (the Bash tool is built in) but requires explicit code-execution permission elsewhere. In claude.ai web, “The claude.ai environment has very limited network access,” per Simon Willison — “it can install packages from PyPI and NPM and clone repositories from github.com but it can’t access any other domain, including api.github.com.”
- Skill discovery. Vercel’s eval found that even Claude itself sometimes ignores Skills unless prompted explicitly. Other runtimes have less mature invocation logic. AGENTS.md, by contrast, is loaded every turn — no discovery question.
Bottom line: if you’re writing a Skill for one agent, ship it as a SKILL.md and call it done. If you’re writing for many agents, write the SKILL.md to be self-contained — assume the runtime might dump the whole body into the prompt — and pair it with an AGENTS.md line that says “use the X skill for Y task.” Belt and suspenders.
Token cost — what the percentage claims actually mean
Three big numbers travel around in this conversation. Let’s pin them down so you stop seeing them out of context.
- Cloudflare: 99.9% reduction. Per the Code Mode post, blog.cloudflare.com/code-mode-mcp, verbatim: “For a large API like the Cloudflare API, Code Mode reduces the number of input tokens used by 99.9%.” The baseline is “an equivalent MCP server without Code Mode” that “would consume 1.17 million tokens.” Code Mode collapses thousands of MCP tool descriptions into two tools — `search()` and `execute()` — and lets the model write TypeScript against typed APIs. The 99.9% is real but highly specific to Cloudflare-sized APIs.
- Anthropic: 98.7% reduction. Per Anthropic Engineering’s Code Execution with MCP post, the same idea, framed as a filesystem the agent navigates: 150,000 tokens → 2,000 tokens on a Google Drive to Salesforce example. Same 98%–99% range, same workload shape.
- Anthropic Tool Search Tool: not a single number. Anthropic shipped a Tool Search Tool feature that lets the model query for relevant tools at runtime instead of loading every tool definition into the system prompt. The saving varies with tool count; a specific “46.9%” figure circulates online, but we couldn’t find a canonical number in Anthropic’s docs. Treat any exact percentage as workload-dependent rather than a guarantee.
The general truth all three claims agree on: tool descriptions in the system prompt scale linearly with the number of connected MCP servers, and at 20+ servers you’re in real trouble. The fixes — Code Mode, Code Execution with MCP, the Tool Search Tool, or moving procedural surface area into Skills — all attack the same root cause. None of them eliminate MCP. They eliminate the upfront tax.
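To make the Code Mode shape concrete, here is a hedged sketch of the kind of TypeScript an agent writes under that model. The `search()`/`execute()` split comes from Cloudflare’s post; the binding names below are invented, not Cloudflare’s actual generated API:

```typescript
// Illustrative only: code an agent might hand to execute() after
// discovering typed bindings via search(). Binding names are invented.
import { api } from "./bindings"; // generated, typed wrapper over the MCP surface

export default async function run(): Promise<string> {
  // One block of code replaces dozens of per-step tool calls, and the
  // intermediate data never re-enters the model's context window.
  const zones = await api.zones.list();
  const inactive = zones.filter((z) => z.status !== "active");
  for (const zone of inactive) {
    await api.zones.purgeCache(zone.id);
  }
  return `purged cache for ${inactive.length} inactive zones`; // only this returns
}
```

The saving comes from two places: thousands of tool schemas never enter the prompt, and the loop’s intermediate results stay inside the sandbox.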
Migration paths
MCP server → Skill
You have an MCP server that wraps a deterministic, code-only process. Maybe it shells out to pdftotext, or runs a Pandoc pipeline, or formats a markdown table. The MCP wrapper is paying connection-and-protocol cost for something that could just be a recipe. Migrate the prompt description and the helper script into a Skill folder; keep the executable as a helper file under scripts/; uninstall the MCP server. The token profile drops dramatically because the tool descriptions stop being in the system prompt every turn.
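The before and after, sketched with illustrative names:

```
Before: pdf-tools MCP server (tool descriptions in the system prompt every turn)
After:  .claude/skills/pdf-to-markdown/
          SKILL.md            # frontmatter + the recipe
          scripts/convert.sh  # the same pdftotext / Pandoc pipeline
          reference.md        # details, loaded only if needed
```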
REST API → MCP server
You’ve been telling the agent “curl this endpoint with this auth header.” It works until it doesn’t — header gets stale, the API adds rate limits, the agent leaks the key into chat. Wrap it in an MCP server with proper OAuth and typed tool calls. The mcp-builder Skill walks Claude through this exact migration; our What is MCP primer covers the protocol shape.
Monolith conversation → subagent fan-out
You’re running long Claude Code sessions and watching the context fill with grep output and exploratory dumps. Identify the recurring side-quests — “find all the files that import X,” “summarize this 10-page PR,” “run the test suite and report failures.” Each becomes a subagent file under .claude/agents/ with a tight tool allowlist. The orchestrator stays focused; the subagents take the noise.
Re-pasting system prompts → AGENTS.md or Skill
The cheapest migration. If you keep starting new sessions with the same multi-paragraph “always do X, never do Y” prompt, that’s a Skill (if it’s conditional — “when working on PDFs, do X”) or an AGENTS.md entry (if it’s always-on — “in this repo, do X”). One is loaded on demand, the other is loaded every turn. Pick by frequency.
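Side by side, with invented contents; the same knowledge, homed by frequency:

```markdown
<!-- SKILL.md frontmatter: conditional, loaded on demand -->
description: Applies our PDF redaction checklist. Use when redacting PDFs.

<!-- AGENTS.md entry: always-on, loaded every turn -->
- In this repo, run `npm run lint` before declaring any task done.
```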
What we actually run on this site
To ground the matrix in something concrete: the MCP.Directory codebase runs Claude Code as the primary agent. Our CLAUDE.md is the load-bearing piece — it encodes the static-first architecture, the data refresh pipeline, the page-payload pattern. A reader of that file already knows 80% of what they need to make a change. We use one MCP server in dev (Playwright, for the E2E specs the build runs). We use Skills for the bits of procedural surface that recur across blog posts — the cookbook generator, the deep-dive scaffolder. Subagents handle the “explore the data layer” quests that used to flood the main conversation.
That’s four primitives across one team. Forcing everything through a single one would have meant either (a) an AGENTS.md-only stack that couldn’t hit a live API or (b) an MCP-everything stack where the system prompt was 14,000 tokens of tool descriptions before the user typed a word. The pluralism is the point.
Default starting point
AGENTS.md (or CLAUDE.md). Costs nothing, every agent reads it, fixes 80% of the “agent doesn’t know my repo” problem.
Add MCP when
The agent needs authenticated access to a live system or a hosted index it cannot fake from the filesystem.
Add a Skill when
You keep re-pasting the same conditional procedure. Encode it as Markdown plus optional helpers, ship it next to the code.
Add a subagent when
The main conversation is being polluted by exploratory work whose intermediate output you don’t need. Claude-Code-specific.
Anti-pattern
Stacking 20 MCP servers and wondering why the model is slow. That’s the upfront-token tax. Use Code Mode, move procedures to Skills, or trim to the three you actually use.
Frequently asked questions
What's the difference between Claude Skills, MCP servers, subagents, and CLI tools?
Skills are folders with a SKILL.md file that Claude loads on demand to learn a procedure. MCP servers are processes that expose tools, resources, and prompts over a wire protocol so any MCP-aware client can connect. Subagents are spawned worker conversations with their own context window, system prompt, and tool allowlist. CLI tools are existing shell commands the agent invokes via Bash. Skills carry instructions, MCP carries connections, subagents carry isolation, CLIs carry execution. Most production setups use two or three together.
Will Claude Skills replace MCP?
No, despite Simon Willison calling Skills "maybe a bigger deal than MCP." Skills package procedural knowledge as Markdown plus optional helper files. MCP exposes live data and tools over a JSON-RPC connection. A Skill cannot read your private Notion workspace; an MCP server cannot teach the model how to lay out a PowerPoint deck. Anthropic's own engineering post says they will "explore how Skills can complement Model Context Protocol (MCP) servers" — complement, not replace. The two solve different layers.
When should I pick a skill instead of an MCP server?
Pick a skill when the task is procedural and the data is already on disk or fetchable through code execution: PDF generation, brand-styled docs, repo-specific lint rules, internal coding conventions. Pick an MCP server when the task requires authenticated access to a live external system (GitHub, Stripe, Linear, Notion) or a hosted index (Context7, Sentry). If the answer is "both," install both. The patterns coexist cleanly.
What is the SKILL.md spec?
A SKILL.md file is Markdown with YAML frontmatter containing two required fields, name and description, plus an optional body of instructions. Per the anthropics/skills repo: "The frontmatter requires only two fields: name — a unique identifier for your skill (lowercase, hyphens for spaces); description — a complete description of what the skill does and when to use it." The format is intentionally minimal so any agent runtime can parse it. The agentskills/agentskills repo houses the open spec, governed by Anthropic with community contributions.
Do Claude Skills work in Cursor, Codex, Cline, Gemini, or other agents?
Skills are portable Markdown, so technically yes — but each runtime decides how (and whether) to load them. Vercel's open-source skills CLI (npx skills add) supports "OpenCode, Claude Code, Codex, Cursor, and 50 more" agents. Mastra ships a first-party skill via npx skills add mastra-ai/skills. Google publishes google/skills under Apache 2.0. The wire compatibility is good. The runtime semantics — progressive disclosure, sandboxed code execution, helper-file loading — vary. Treat "works" as "the Markdown loads" and verify the behavior per agent.
What are Claude Code subagents and how do they fit in?
A subagent is a specialized assistant defined as a Markdown file under .claude/agents/ with its own system prompt, tool allowlist, and (optionally) a different model. Per Claude's docs, you "use one when a side task would flood your main conversation with search results, logs, or file contents you won't reference again: the subagent does that work in its own context and returns only the summary." Subagents do not replace MCP or Skills; they are an orchestration primitive. A subagent can use MCP tools and load Skills.
What's Vercel's AGENTS.md vs Skills argument?
Vercel ran an internal eval and reported that an 8KB documentation index embedded directly in AGENTS.md hit a 100% pass rate, while Skills capped at 79% even with explicit "use the skill" instructions. Their argument: AGENTS.md is loaded every turn, so the model never has to decide whether to read it; Skills require a discovery step the model can skip. The counter-argument from Anthropic's progressive-disclosure thesis is that loading everything every turn does not scale to dozens of skills. Both are right at different scales.
How big are MCP token costs in practice?
Cloudflare's Code Mode post claims that for a large API like Cloudflare's, an equivalent MCP server without Code Mode would consume 1.17 million tokens of tool descriptions — exceeding most foundation models' context windows. Anthropic's own "Code Execution with MCP" post cites a 98.7% reduction (150,000 → 2,000 tokens) on a Google Drive to Salesforce example. The takeaway: stuffing 20 raw MCP servers into one config is a real performance problem, and the workaround is either Code Mode, the Tool Search Tool, or moving procedural logic into Skills.
What's the agentskills/agentskills GitHub repo?
It's the open spec for the Skills format. Per the README, "Agent Skills are a simple, open format for giving agents new capabilities and expertise" and "an open format maintained by Anthropic and open to contributions from the community." Apache 2.0 licensed for code, CC-BY 4.0 for docs. As of May 2026 the repo sits at roughly 17.8K stars. It is the cross-vendor reference; vercel-labs/skills, mastra-ai/skills, and google/skills all conform to its frontmatter schema.
Are skills more secure than MCP?
Different attack surface, neither is automatically safer. Skills are Markdown plus optional executables loaded by the agent — a malicious SKILL.md can include prompt-injection payloads or, when the runtime allows code execution, run arbitrary code with whatever filesystem access the runtime grants. MCP servers run as separate processes and are typically gated by OAuth or token scopes. Treat any third-party Skill or MCP server as untrusted code. Anthropic's response to the 2026 vulnerability disclosure was a hardened sandbox, not a claim that Skills are safe by default.
Should new projects start with MCP, Skills, or AGENTS.md?
Start with AGENTS.md (or CLAUDE.md). It costs nothing, every modern coding agent reads it, and it answers the question "what does this repo do." Add an MCP server when you need authenticated access to a live external system. Add a Skill when a procedure is repeating across sessions and you want it to live next to the code. Add a subagent when the main conversation is being polluted by exploratory work. In that order, almost always.
Does Anthropic's Tool Search Tool fix the MCP token problem?
Partially. The Tool Search Tool is Anthropic's beta feature that lets the model search for relevant tools at runtime instead of having every tool description loaded into the system prompt. Cloudflare's Code Mode and Anthropic's Code Execution with MCP post tackle the same problem from a different angle: turn MCP servers into a typed-API filesystem that the model navigates with code. Both approaches can deliver large token savings; both add a latency hop. They do not change what MCP is, only how its surface is presented.
Sources
Primary docs (verbatim quotes drawn from these)
- modelcontextprotocol.io — official MCP definition (“USB-C port for AI applications”)
- claude.com/blog/skills — Skills launch announcement, October 16 2025
- anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills — progressive disclosure quote, Skills/MCP complement language
- anthropic.com/engineering/code-execution-with-mcp — 98.7% token-reduction example, tool-description-cost quote
- blog.cloudflare.com/code-mode-mcp — 99.9% reduction claim, 1.17M-token Cloudflare API baseline
- vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals — AGENTS.md 100% vs Skills 79% eval
- simonwillison.net/2025/Oct/16/claude-skills — “maybe a bigger deal than MCP”, Cambrian explosion line
- code.claude.com/docs/en/sub-agents — Claude Code subagents docs (“preserve context… control costs”)
Repos and specs
- github.com/agentskills/agentskills — open spec, Apache 2.0 + CC-BY-4.0, ~17.8K stars
- github.com/anthropics/skills — Anthropic’s example skills (docx, pdf, pptx, xlsx)
- github.com/vercel-labs/skills — npx skills CLI (“OpenCode, Claude Code, Codex, Cursor, and 50 more”)
- github.com/google/skills — first-party Google Cloud Skills, Apache 2.0
- github.com/mastra-ai/skills — Mastra framework Skill, Apache 2.0
- github.com/obra/superpowers — opinionated Skills framework + methodology, MIT
HN discussion threads
- HN 45607117 — Skills launch thread, 816 points
- HN 45619537 — Willison “bigger than MCP” thread, 738 points
Internal links
- /blog/what-is-mcp — protocol primer
- /blog/what-are-claude-code-skills — Skills primer
- /blog/claude-code-best-practices
- /blog/context7-vs-deepwiki-vs-ref-vs-docfork-2026
- /blog/claude-godot-skill-guide
- /blog/claude-pdf-to-markdown-skill-guide
- /blog/claude-ui-ux-pro-max-skill-guide
- /blog/claude-codex-cli-skill-guide
- /blog/mcp-vs-a2a-understanding-ai-agent-protocols
- /blog/revolutionary-development-sub-agents-in-claude-code
- /blog/openai-launches-agent-skills-for-codex
- /blog/anthropic-launches-claude-agent-sdk
- /blog/claude-code-hooks
- /skills/skill-creator
- /skills/mcp-builder
- /skills/find-skills
- /skills/install-skills
- /skills/subagent-driven-development
- /servers/context7
- /servers/playwright
- /servers/sequential-thinking
- /best-mcp-servers
- /skills — browse all
- /servers — browse all