OAuth 2.1 for Remote MCP Servers — Streamable HTTP Explained (2026)
Remote MCP is the auth engineer’s problem now. The spec moved from stdio to Streamable HTTP, then glued OAuth 2.1, RFC 9728, RFC 8414, RFC 7591, and RFC 8707 into a single discovery-and-token flow that every browser-hosted agent has to follow. ScaleKit, Stytch, and Cloudflare each shipped helpful walkthroughs, but no neutral reference exists. This is that reference — sourced only from the spec itself, the cited RFCs, and verifiable vendor docs. We link to the security playbook for the threat model and to the decision matrix for when remote MCP is the right call at all.

On this page
- TL;DR for auth engineers
- What changed: stdio to Streamable HTTP
- Streamable HTTP, in detail
- OAuth 2.1 vs OAuth 2.0
- Discovery: RFC 9728 and RFC 8414
- Dynamic Client Registration (RFC 7591)
- End-to-end auth flow
- Scopes and consent
- Token storage and lifetimes
- Walkthrough: Linear
- Walkthrough: Sentry
- Gateway patterns
- Failure modes
- Server-author checklist
- Community signal
- FAQ
- Sources
TL;DR for auth engineers
- Transport: one HTTP endpoint. POST for client-to-server JSON-RPC; the server may respond with application/json or open an SSE stream (text/event-stream). GET opens an SSE channel for server-initiated messages. Sessions are tracked with the Mcp-Session-Id response/request header. Resume with Last-Event-ID.
- Auth basis: OAuth 2.1 IETF draft 13 plus RFC 9728 (Protected Resource Metadata), RFC 8414 (Authorization Server Metadata), RFC 7591 (Dynamic Client Registration), RFC 8707 (Resource Indicators).
- Required: PKCE for all clients (S256 if capable). Discovery via WWW-Authenticate → /.well-known/oauth-protected-resource → /.well-known/oauth-authorization-server. resource= parameter on every authorize and token request. Audience-bound tokens, validated locally.
- Forbidden: implicit flow, ROPC password grant, plain PKCE when S256 is possible, bearer tokens in URI query strings, token passthrough to upstream APIs.
- SHOULD: Dynamic Client Registration on both sides, refresh-token rotation for public clients, short-lived access tokens, redirect-URI exact-match.
- Practical: ship Streamable HTTP, delegate the OAuth surface to a library or gateway (workers-oauth-provider, ScaleKit, Stytch Connected Apps, Auth0, IBM mcp-context-forge, Composio Strata), and keep your business-logic Worker oblivious to the dance.
What changed: stdio to Streamable HTTP
The first wave of MCP servers ran over stdio — the agent spawned a subprocess, piped JSON-RPC over stdin and stdout, and trusted the local filesystem for credentials. That worked for desktop tools and IDE plugins. It did not work for SaaS: the server has to live somewhere Internet-reachable, multi-tenant, and authenticated. The Anthropic team’s answer landed in two stages.
2025-03-26 — Streamable HTTP. The spec replaced the original HTTP+SSE transport (which used two endpoints: one GET /sse for the downstream channel and one POST /sse/messages for the upstream) with a single MCP endpoint that handles both directions. Cloudflare’s launch post described the old shape as “like having a conversation with two phones, one for listening and one for speaking,” and the new shape as a single bidirectional pipe over HTTP.
2025-06-18 — OAuth 2.1 hardening. The authorization section was rewritten around OAuth 2.1 draft 13, with explicit references to RFC 9728 and RFC 8707. Where the earlier spec had been loose about audience binding and discovery, the new version is precise: clients MUST use the resource= parameter, servers MUST validate token audience, MCP servers MUST NOT pass through tokens to downstream APIs. The spec page itself opens with this framing: the authorization mechanism is “based on established specifications listed below, but implements a selected subset of their features to ensure security and interoperability while maintaining simplicity.”
If you skip our primer on what MCP is, the rest of this post still works — it’s aimed at engineers who already understand the JSON-RPC wire format and just need the auth picture.
Streamable HTTP, in detail
Streamable HTTP is HTTP. There is no separate framing layer. The MCP server exposes one URL — call it https://mcp.example.com/mcp — and that URL accepts POST and GET. The wire is JSON-RPC 2.0, UTF-8 encoded, optionally streamed as SSE.
POST. Every JSON-RPC message from the client is a new HTTP POST. The body is a single JSON-RPC request, notification, or response. The client MUST send Accept: application/json, text/event-stream. If the message is a notification or response, the server returns 202 with no body. If the message is a request, the server returns one of two shapes:
- Content-Type: application/json with a single JSON-RPC response object — the simple case, ideal for fast tool calls.
- Content-Type: text/event-stream with an SSE stream that may carry server-initiated requests and notifications before the eventual response. The stream SHOULD close after the response is sent.
GET. The client may issue a GET to the same MCP endpoint with Accept: text/event-stream to open a long-running SSE channel for server-initiated messages unrelated to any pending request. The server may return 405 Method Not Allowed if it does not offer such a channel.
Sessions. The server may assign a session identifier in the Mcp-Session-Id response header on the InitializeResult. The client MUST include that header on every subsequent request. The session ID has to be cryptographically random and visible-ASCII (0x21–0x7E). The server may terminate the session at any time, returning 404 to subsequent requests with that ID; the client’s response is to start a new session with a fresh InitializeRequest. Clients should send DELETE on session shutdown to free server resources, though the server may answer 405 if it doesn’t support explicit termination.
Resumption. Server SSE events may carry a globally unique id. After a network drop the client reconnects with Last-Event-ID: <last-id> and the server may replay messages it had queued for that stream. Replay is per-stream — the server MUST NOT replay messages from a different stream onto a resumed connection.
Protocol version. Once initialization negotiates a version (currently 2025-06-18), the client MUST send MCP-Protocol-Version: 2025-06-18 on every subsequent HTTP request. Servers without that header default to 2025-03-26 for backwards compatibility. An invalid version is a 400.
Security warnings the spec calls out. Servers must validate the Origin header (DNS-rebinding mitigation), bind to localhost rather than 0.0.0.0 when running locally, and authenticate every connection. The Origin check is the single most-missed line in the spec — every audit our team has run on a custom MCP server has found it absent.
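That Origin check is small enough to sketch. A minimal version for a locally-running server, assuming a Node-style handler; the allow-list values and function name are illustrative, not from the spec:

```typescript
// Allow-list of browser origins permitted to reach a locally-running
// MCP server. Illustrative values, not from the spec.
const ALLOWED_ORIGINS = new Set<string>([
  "http://localhost:3000",
  "http://127.0.0.1:3000",
]);

// DNS-rebinding mitigation: reject requests whose Origin header names
// an unexpected site. Requests without an Origin header (non-browser
// clients) pass through here; token auth still applies to them.
function isAllowedOrigin(origin: string | undefined): boolean {
  if (origin === undefined) return true;
  return ALLOWED_ORIGINS.has(origin);
}
```

Run the check before touching the JSON-RPC body, and return 403 on failure rather than processing the request.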
POST /mcp HTTP/1.1
Host: mcp.example.com
Authorization: Bearer eyJhbGciOiJSUzI1NiIs...
MCP-Protocol-Version: 2025-06-18
Mcp-Session-Id: 1868a90c-f9f3-4f6d-b1f0-bc34c2db1e1c
Accept: application/json, text/event-stream
Content-Type: application/json
{"jsonrpc":"2.0","id":42,"method":"tools/call",
"params":{"name":"create_issue","arguments":{...}}}
OAuth 2.1 vs OAuth 2.0
OAuth 2.1 is not a new protocol. It’s OAuth 2.0 with roughly a decade of security best-practice consolidations rolled into one document and a handful of legacy flows cut. The MCP spec normatively references draft-ietf-oauth-v2-1-13. If you’ve read the OAuth 2.0 RFC 6749, the diff is small and angry.
What 2.1 deprecates or removes:
- Implicit grant. Section 10.1 of the draft is titled “Removal of the OAuth 2.0 Implicit grant.” Gone. The historical reason for implicit — browser clients that couldn’t do a POST to the token endpoint cross-origin — disappeared with CORS.
- Resource Owner Password Credentials. Section 1.8 says “some features available in OAuth 2.0, such as the Implicit or Resource Owner Credentials grant types, are not specified in OAuth 2.1.” ROPC’s username-and-password-to-token shape never had a defensible threat model.
- Plain PKCE when S256 is feasible. Section 4.1.1 says “If the client is capable of using S256, it MUST use S256.” Plain stays only for embedded environments that physically cannot SHA-256.
- Bearer tokens in query strings. The MCP spec restates this directly: “Access tokens MUST NOT be included in the URI query string.” Use the Authorization header.
What 2.1 promotes from RECOMMENDED to REQUIRED:
- PKCE for every client. “Clients MUST use code_challenge and code_verifier and authorization servers MUST enforce their use,” per the OAuth 2.1 draft. Confidential clients used to be able to skip PKCE. Not anymore.
- Refresh token rotation for public clients. “Authorization servers MUST rotate refresh tokens as described in OAuth 2.1 Section 4.3.1,” per the MCP authorization page.
- Exact-match redirect URIs. No more prefix matching, no more wildcards. The authorization server MUST compare against the pre-registered value byte-for-byte.
If you implement those changes, you’ve covered 80% of OAuth 2.1. The other 20% lives in the security considerations: token theft mitigation, open-redirect hardening, and the confused-deputy problem the MCP spec quotes verbatim.
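The PKCE piece of that list is a few lines of code. A minimal sketch using Node’s crypto module (the function name is ours): a random 43-character code_verifier and its S256 challenge, code_challenge = base64url(SHA-256(verifier)), per RFC 7636.

```typescript
import { createHash, randomBytes } from "node:crypto";

// PKCE S256 as OAuth 2.1 requires: a random code_verifier (43-128
// chars) and its SHA-256 challenge, both base64url-encoded without
// padding. 32 random bytes encode to exactly 43 characters.
function generatePkcePair(): { verifier: string; challenge: string } {
  const verifier = randomBytes(32).toString("base64url");
  const challenge = createHash("sha256").update(verifier).digest("base64url");
  return { verifier, challenge };
}
```

The client sends the challenge on the authorize request and the verifier on the token request; the authorization server recomputes the hash and compares.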
Discovery: RFC 9728 and RFC 8414
A new MCP client connecting to a server it’s never seen needs to learn three things: (1) is auth required, (2) where do I authorize, (3) what do I call myself. The spec composes RFC 9728 and RFC 8414 to cover the first two.
Step 0 — unauthenticated probe. The client sends an MCP request without a token. The server, if it requires auth, returns 401 with a WWW-Authenticate header pointing at its Protected Resource Metadata document, per RFC 9728 §5.1.
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="mcp",
resource_metadata="https://mcp.example.com/.well-known/oauth-protected-resource"
Step 1 — fetch the Protected Resource Metadata. The client GETs that URL. The response is a JSON document with at least an authorization_servers array — RFC 9728 allows multiple, and the client picks one per §7.6. For most MCP servers there is exactly one.
GET /.well-known/oauth-protected-resource HTTP/1.1
Host: mcp.example.com
200 OK
Content-Type: application/json
{
"resource": "https://mcp.example.com",
"authorization_servers": ["https://auth.example.com"],
"bearer_methods_supported": ["header"],
"scopes_supported": ["mcp.read", "mcp.write"]
}
Step 2 — fetch the Authorization Server Metadata. Per RFC 8414, the client GETs /.well-known/oauth-authorization-server on the authorization-server origin. The response includes authorization_endpoint, token_endpoint, the registration endpoint (RFC 7591), supported PKCE methods, and supported scopes. The MCP spec says clients MUST follow this step.
GET /.well-known/oauth-authorization-server HTTP/1.1
Host: auth.example.com
200 OK
Content-Type: application/json
{
"issuer": "https://auth.example.com",
"authorization_endpoint": "https://auth.example.com/authorize",
"token_endpoint": "https://auth.example.com/token",
"registration_endpoint": "https://auth.example.com/register",
"code_challenge_methods_supported": ["S256"],
"grant_types_supported": ["authorization_code", "refresh_token"],
"response_types_supported": ["code"],
"scopes_supported": ["mcp.read", "mcp.write", "offline_access"]
}
The spec is strict: MCP clients MUST parse WWW-Authenticate on 401, and MUST use both Protected Resource Metadata and Authorization Server Metadata for discovery. Anything that hardcodes URLs is non-conformant. The friction this discovery sequence introduces is exactly what makes Dynamic Client Registration the next required step.
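The client’s first parsing job in this sequence is pulling resource_metadata out of the 401 challenge. A minimal sketch; a production client should use a full RFC 9110 auth-param parser rather than a regex:

```typescript
// Extract the resource_metadata URL from a WWW-Authenticate challenge,
// per RFC 9728 §5.1. Sketch only: this handles the common quoted-string
// form, not the full auth-param grammar of RFC 9110.
function resourceMetadataUrl(wwwAuthenticate: string): string | null {
  const match = wwwAuthenticate.match(/resource_metadata="([^"]*)"/);
  return match ? match[1] : null;
}
```

Given the example header above, this returns the /.well-known/oauth-protected-resource URL the client GETs in Step 1.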
Dynamic Client Registration (RFC 7591)
MCP clients ship to millions of users. Each user connects to a different set of MCP servers from a different set of authorization servers. The Slack MCP server’s auth server has never heard of Cursor; Cursor has never heard of the auth server. Manual OAuth-app provisioning would require the user to (a) understand OAuth, (b) find the IDP’s admin page, (c) paste a client_id and secret into Cursor’s settings. That’s the friction MCP exists to eliminate.
RFC 7591 defines a POST /register endpoint where the client self-asserts its metadata and receives a client_id in return. The MCP spec quotes the rationale almost word-for-word: clients may not know all possible MCP servers in advance, manual registration creates friction, DCR enables seamless connection.
POST /register HTTP/1.1
Host: auth.example.com
Content-Type: application/json
{
"client_name": "Cursor",
"client_uri": "https://cursor.com",
"redirect_uris": ["http://127.0.0.1:7777/callback"],
"grant_types": ["authorization_code", "refresh_token"],
"response_types": ["code"],
"token_endpoint_auth_method": "none",
"scope": "mcp.read mcp.write offline_access",
"software_id": "com.cursor.app",
"software_version": "0.45.2"
}
201 Created
Content-Type: application/json
{
"client_id": "c_8f4e2a91b7c34d5e",
"client_id_issued_at": 1735689600,
"redirect_uris": ["http://127.0.0.1:7777/callback"],
"grant_types": ["authorization_code", "refresh_token"],
"token_endpoint_auth_method": "none"
}
Public vs confidential. Most MCP clients are public — they ship on user laptops with no way to store a long-lived secret. They register with token_endpoint_auth_method: “none” and rely on PKCE plus exact-match redirect URIs for security. Server-to-server MCP clients running on managed infrastructure can register as confidential and use client_secret_basic.
Software statements. RFC 7591 also defines a software_statement field — a JWT signed by a trusted issuer that asserts the client’s metadata. ScaleKit’s Client ID Metadata Documents proposal extends this idea further by treating the client ID itself as a metadata URL, but most production MCP auth servers today accept the simpler unsigned shape and rely on the spec’s warning: “the authorization server MUST treat all client metadata as self-asserted.”
The confused-deputy hole. If your MCP server uses a static client ID against an upstream IDP (a common pattern when the MCP server itself proxies to Google Drive or GitHub) and accepts arbitrary clients via DCR, you’re vulnerable. The spec calls this out: “MCP proxy servers using static client IDs MUST obtain user consent for each dynamically registered client before forwarding to third-party authorization servers.” Skipping this consent step lets one DCR-registered MCP client trade an authorization code for a token issued under a different client’s identity.
End-to-end auth flow
Putting the pieces together, the actual sequence a fresh MCP client follows on first contact with a remote server looks like this. We’ve named the actors as the spec does: Client (the agent: Claude Desktop, Cursor, Cline), Resource Server (the MCP server itself), Authorization Server (the IDP — could be embedded in the MCP server or separate), Browser (the user-agent that handles the interactive consent step).
1. Probe. Client → Resource Server: POST initialize with no token. Resource Server → Client: 401 with WWW-Authenticate: Bearer resource_metadata=….
2. Resource discovery. Client → Resource Server: GET /.well-known/oauth-protected-resource. Returns authorization_servers, supported scopes.
3. AS discovery. Client → Authorization Server: GET /.well-known/oauth-authorization-server. Returns endpoints and capabilities.
4. DCR (if needed). Client → Authorization Server: POST /register with metadata. Returns client_id.
5. PKCE setup. Client generates a 43–128 character random code_verifier, computes code_challenge = base64url(SHA-256(verifier)).
6. Authorize. Client opens the user’s browser to /authorize?response_type=code&client_id=…&redirect_uri=…&code_challenge=…&code_challenge_method=S256&state=…&scope=mcp.read mcp.write offline_access&resource=https%3A%2F%2Fmcp.example.com. User logs in, consents.
7. Callback. Authorization Server → Browser: 302 to redirect_uri?code=…&state=…. Browser → Client: callback with the auth code (typically on a localhost loopback port for desktop clients).
8. Token exchange. Client → Authorization Server: POST /token with grant_type=authorization_code, code, code_verifier, redirect_uri, resource=https%3A%2F%2Fmcp.example.com. Returns access_token + refresh_token.
9. Tool call. Client → Resource Server: POST tools/call with Authorization: Bearer <access_token>. Resource Server validates audience locally (the token was issued for https://mcp.example.com), validates scope, executes the tool, returns the result.
10. Refresh. When the access token expires, Client → Authorization Server: POST /token with grant_type=refresh_token. Returns a fresh access token plus (for public clients) a rotated refresh token.
11. Revoke. User revokes the app from the IDP’s consent screen. Refresh token is invalidated; the next access-token expiry kills agent access.
The mermaid diagram on the spec page captures the same sequence. Note that step 4 (DCR) is conditional — the client first checks whether it already has a client_id for this authorization server, and only registers if not.
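The authorize step reduces to careful URL assembly. A sketch, assuming the endpoint and client_id were discovered earlier; the option names are ours, and the URL class percent-encodes the resource value automatically:

```typescript
// Build the /authorize URL for the PKCE + resource-indicator flow.
// All values come from earlier discovery/DCR steps; names are ours.
function buildAuthorizeUrl(opts: {
  authorizeEndpoint: string;
  clientId: string;
  redirectUri: string;
  codeChallenge: string;
  state: string;
  scope: string;
  resource: string; // canonical URI of the MCP server (RFC 8707)
}): string {
  const url = new URL(opts.authorizeEndpoint);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("client_id", opts.clientId);
  url.searchParams.set("redirect_uri", opts.redirectUri);
  url.searchParams.set("code_challenge", opts.codeChallenge);
  url.searchParams.set("code_challenge_method", "S256");
  url.searchParams.set("state", opts.state);
  url.searchParams.set("scope", opts.scope);
  // RFC 8707: bind the requested token to this specific MCP server.
  url.searchParams.set("resource", opts.resource);
  return url.toString();
}
```

The same resource value must be repeated on the token-exchange POST, or the issued token’s audience will not match what the resource server expects.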
Scopes and consent
MCP scopes are not standardized by the spec. The server publishes scopes_supported in its Protected Resource Metadata, the client requests scopes at authorize time, and the IDP shows the user a consent screen. What each scope means is up to the server author. Three patterns show up in practice.
Coarse-grained. One mcp.read and one mcp.write scope, mapped to read-only vs read-write tools. Easy for users to reason about. Bad blast radius if a tool is compromised.
Per-tool. One scope per exposed tool — create_issue, update_issue, list_projects. Atlassian’s and GitHub’s remote MCP servers approximate this. Strong least-privilege story but the consent screen is long.
Resource-action. Scopes shaped like todos.read and todos.write per resource type. Stytch’s walkthrough used this shape. Practical compromise — granular enough to enforce useful boundaries, short enough to read. The Stytch post sums it up: “Scopes let you limit access precisely (e.g., read-only vs. write). Much better than all-or-nothing API keys.”
Whatever you pick, the consent surface is the user’s only meaningful control. Vague scope names (“Access your account”) defeat the entire OAuth threat model. Concrete names (“Create issues in your Linear workspace”) let users make informed decisions. If you ship more than three scopes, group them on the consent screen.
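Server-side, whichever shape you pick, enforcement is a set-membership check against the token’s space-delimited scope string (RFC 6749 §3.3). A sketch with illustrative names:

```typescript
// Does the granted scope string (space-delimited, per RFC 6749 §3.3)
// cover every scope this tool call requires?
function hasScopes(granted: string, required: string[]): boolean {
  const grantedSet = new Set(granted.split(" ").filter(Boolean));
  return required.every((scope) => grantedSet.has(scope));
}
```

Return 403 (not 401) when this check fails: the token is valid, the permission is missing.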
Token storage and lifetimes
Three places store tokens, with three different threat models.
Client side. Claude Desktop, Cursor, Cline, ChatGPT desktop, and VS Code all keep tokens in the OS keychain when one is available — Keychain on macOS, Credential Manager on Windows, libsecret on Linux. Some clients fall back to a config file with restrictive permissions. The threat is local malware reading the keychain; mitigation is short-lived access tokens (minutes, not hours), refresh-token rotation so a stolen refresh token is invalid after first use, and IDP-side detection of refresh-token replay (which the OAuth 2.1 draft requires authorization servers to handle by revoking the entire token chain).
Server side. The MCP server holds an opaque or JWT access token long enough to validate it on each request. If you cache validation results, cache the JWT’s exp, the audience, and the scopes — never the raw token. Cloudflare’s workers-oauth-provider stores secrets only by hash, which is the right shape for this. The spec is explicit: “Clients and servers MUST implement secure token storage and follow OAuth best practices.”
IDP side. The authorization server keeps the long-tail state — issued tokens, granted consents, revocations. This is where introspection (RFC 7662) and revocation (RFC 7009) endpoints live, though the MCP spec does not require either. A production MCP deployment usually does want introspection so a revoke takes effect in seconds, not minutes; the trade-off is one extra network hop per tool call. Short token TTLs without introspection is the simpler design.
Empirically, what each agent does in 2026:
- Claude Desktop / Claude Code: per-server tokens in the OS keychain, refresh on expiry, browser prompt on first connect.
- Cursor: tokens in the user’s Cursor profile dir, encrypted at rest. DCR-based registration with localhost loopback redirect.
- Cline (open source): tokens in the VS Code extension secret storage API. Same OAuth flow.
- ChatGPT desktop: tokens managed server-side by OpenAI; the client sees a session reference, not the raw bearer.
Walkthrough: Linear
Linear’s remote MCP server lives at https://mcp.linear.app/mcp. Linear’s docs page states the auth model verbatim: the server “uses OAuth 2.1 with dynamic client registration for authentication”. First-time connections open the user’s browser to a Linear consent screen; subsequent calls use the cached access token. Linear also accepts a personal API key via Authorization: Bearer for service accounts, but the spec-conformant path is DCR.
What you actually see when adding it to Claude Code or Cursor: a short paste of the URL, a browser tab opens, you approve the workspace scope, and the tool list loads. No client_id to copy, no secret to paste. That’s the user-visible result of DCR.
Recipe — sprint planning agent. In Cursor with the Linear MCP installed and Anthropic Claude Sonnet 4.6 selected:
Pull the open issues in our "API Platform" project, group them
by priority, and summarize what's in flight vs blocked. Then
create a parent issue titled "Sprint 142 plan" with each priority
group as a sub-issue, linked to the contributing tickets.
The agent calls Linear’s tools through the MCP server. Each call carries the access token Linear issued for your user — the audience is bound, the workspace is bound, the scopes are bound. Revoke the Cursor app from your Linear account and the agent loses access on the next token expiry.
Walkthrough: Sentry
Sentry’s remote MCP server is open-source — the repo is at github.com/getsentry/sentry-mcp and the hosted endpoint is https://mcp.sentry.dev/mcp. The README describes the architecture in one line worth quoting: “This remote MCP server acts as middleware to the upstream Sentry API, optimized for coding assistants like Cursor, Claude Code.” The implementation is inspired by Cloudflare’s remote MCP examples and is a TypeScript Workers monorepo.
Sentry’s auth shape on the cloud endpoint is Streamable HTTP with OAuth — no API keys, no manual setup. The Sentry docs page calls it “OAuth-based device-code authentication” and notes “On first run it opens your browser to log in — no manual token creation needed.” Tokens cache to ~/.sentry/mcp.json for local stdio invocations.
For self-hosted Sentry, the docs list the required upstream scopes: org:read, project:read, project:write, team:read, team:write, event:write. That mapping is the canonical shape of a non-trivial MCP server: scopes on the MCP resource (which the user consents to once) translate to upstream scopes (which the MCP server holds via its own OAuth-app credentials, never the user’s token). That’s the no-passthrough rule made concrete.
Recipe — error triage in Claude Code.
Use the Sentry MCP. List unresolved issues from the last 24
hours in the "checkout" project, ranked by user count. For the
top three, fetch the most recent event and identify the line of
code that's throwing. Suggest a patch and link the Sentry issue
URL.
The agent walks through Sentry’s tools using the user’s OAuth token. Sentry’s server validates the token audience locally (the token was issued for mcp.sentry.dev), checks scopes, then makes its own server-to-server calls to the underlying Sentry API using its own credentials. The user’s MCP token never leaves the MCP server. This is the pattern every multi-tenant MCP-to-SaaS server should follow.
Gateway patterns
Almost every production MCP team ends up with one of three shapes for their auth surface.
1. Embedded. The MCP server hosts its own authorization server. Stytch’s blog calls this the embedded pattern: “MCP server includes its own built-in authorization server. It acts as both the Identity Provider and Relying Party.” Cloudflare’s workers-oauth-provider is the canonical implementation — you instantiate it with authorizeEndpoint, tokenEndpoint, and clientRegistrationEndpoint handlers, and it wraps your MCP fetch handler so your tools receive already-authenticated user props.
export default new OAuthProvider({
apiRoute: "/mcp",
apiHandler: MyMCPServer.serve("/mcp"),
defaultHandler: MyAuthHandler,
authorizeEndpoint: "/authorize",
tokenEndpoint: "/token",
clientRegistrationEndpoint: "/register",
}
2. Delegated. The MCP server delegates the entire OAuth flow to an external authorization server. Stytch describes this case as the MCP server “[delegating] the entire OAuth flow to an external authorization server or service.” ScaleKit’s four-step recipe — register the MCP server, expose Protected Resource Metadata, validate JWTs, verify scopes — fits this pattern. The MCP server holds no client state; the gateway does. Auth0, Stytch Connected Apps, ScaleKit, and Clerk all play this role.
3. API gateway in front. A traditional API gateway (Kong, Tyk, Cloudflare AI Gateway, IBM mcp-context-forge, Composio Strata) sits in front of the MCP server, terminates OAuth, attaches user claims to the request, and forwards a sanitized request to the MCP origin. Cloudflare AI Gateway adds rate limiting, request logging, and a single observability surface for multiple MCP servers.
Pick by team shape, not by elegance. If you have one MCP server and a small auth team, embedded with workers-oauth-provider is fastest. If you have ten MCP servers and need a single sign-on story across all of them, gateway. If you’re a SaaS that already runs an IDP, delegated to that IDP keeps your blast radius and your audit logs in one place. The context-bloat post argues a related point: gateway aggregation also helps manage tool-description token cost across many servers.
Failure modes
Token expiry mid-tool-call
The agent kicked off a long-running tool call; the access token expired before the SSE stream closed. The server returns 401 mid-response. The client must refresh and retry from the last idempotent boundary. Ergonomic mitigation: refresh proactively when exp is within 60 seconds; never let an in-flight request use a soon-to-expire token.
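The proactive-refresh rule is a one-liner worth getting right: exp is in epoch seconds (as in a JWT), the clock is in milliseconds, and the 60-second window is the skew suggested above.

```typescript
// Refresh before the token dies: true when exp (epoch seconds) falls
// within the 60-second window of the current time (milliseconds).
const REFRESH_SKEW_MS = 60_000;

function shouldRefresh(expEpochSeconds: number, nowMs: number = Date.now()): boolean {
  return expEpochSeconds * 1000 - nowMs < REFRESH_SKEW_MS;
}
```

Call it before dispatching any tool call, and especially before opening a long-lived SSE stream.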
Revocation propagation lag
User revokes from the IDP. Existing access tokens stay valid until expiry — “the agent quietly loses access once its current token expires,” per Stytch. If your tokens last 24 hours, revocation lag is up to 24 hours. Mitigation: short TTLs, or token introspection on every request.
Multi-account ambiguity
A user with two GitHub accounts (work and personal) connects the GitHub MCP twice. The MCP client must track separate (server URL, user ID) pairs and present a picker before tool calls. Several MCP clients in 2026 still ship this as a known rough edge.
IDP downtime
Authorization server is down; existing tokens still work but refresh fails. The agent appears intermittent. Mitigation: surface IDP status to the user, cache JWKS aggressively, fail loudly rather than retrying silently — the worst experience is a tool that half-works.
Origin-header DNS rebinding
Local MCP server binds to 0.0.0.0 with no Origin check. A malicious site rebinds DNS, hits the server from the user’s browser, and exfiltrates data. The spec demands Origin validation and localhost-only binds for local servers; both are regularly missing.
Audience-claim laxness
Your MCP server validates JWT signatures but not audience. An attacker presents a token issued for other.example.com; signature checks pass; you serve them. The OWASP MCP Top 10 calls this out and so does the spec — see our security playbook for the full threat-model breakdown.
Server-author checklist
If you’re shipping an OAuth-2.1 MCP server, this is the minimum bar to clear. Each item is non-optional per the spec.
Transport
Streamable HTTP at one MCP endpoint. POST + GET. SSE upgrade for streaming. Mcp-Session-Id on init. MCP-Protocol-Version negotiation. Origin header validation. Localhost bind for local deployments.
Discovery
401 with WWW-Authenticate for unauthenticated requests. /.well-known/oauth-protected-resource served per RFC 9728, listing at least one authorization_servers entry.
Authorization server
OAuth 2.1 endpoints with PKCE-S256 enforcement. /.well-known/oauth-authorization-server served per RFC 8414. Authorization-code grant. Refresh-token rotation for public clients. RFC 7591 DCR endpoint at /register.
Token validation
Verify signature, iss, exp, and aud on every request. aud must match this MCP server’s canonical URI. 401 on invalid/expired, 403 on insufficient scope.
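As a sketch of those claim checks, run after signature verification (which a JWT library such as jose would handle against the AS’s JWKS); the claim names follow RFC 7519, and the function shape is ours:

```typescript
// JWT claims relevant to the checklist above (RFC 7519 names).
interface Claims {
  iss?: string;
  aud?: string | string[];
  exp?: number;
}

// True only when issuer, audience, and expiry all pass. Audience must
// include this MCP server's canonical URI (the RFC 8707 binding);
// expiry is compared in epoch seconds. Signature verification is
// assumed to have happened already.
function checkClaims(
  claims: Claims,
  expectedIss: string,
  expectedAud: string,
  nowSec: number
): boolean {
  if (claims.iss !== expectedIss) return false;
  const aud = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!aud.includes(expectedAud)) return false;
  if (claims.exp === undefined || claims.exp <= nowSec) return false;
  return true;
}
```

A token that fails any check gets a 401; a token that passes but lacks the tool’s scope gets a 403.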
Downstream calls
Never pass through the user’s MCP token. If your server calls upstream APIs, mint or fetch a separate token using the MCP server’s own credentials and the user identity from the validated MCP token.
Logging and audit
Log (client_id, user_id, tool, scope, result) for every tool call. Never log the raw token. Surface a revocation API to your customers.
Community signal
Three voices that capture the consensus shape of remote MCP auth as of mid-2026.
“Remote MCP servers are Internet-accessible. People simply sign in and grant permissions to MCP clients using familiar authorization flows.”
Cloudflare Engineering · Blog
Cloudflare's remote-MCP launch post — the canonical statement of why OAuth, not API keys, is the path forward.
“Implementing OAuth for your MCP server transforms it from a prototype into a production-ready service.”
ScaleKit · Blog
ScaleKit's four-step OAuth-2.1 recipe — registration, RFC 9728 metadata, JWT validation, scope verification.
“Stytch marks its refresh token invalid, and because access tokens are short-lived, the agent quietly loses access once its current token expires — no extra work on the Worker side.”
Stytch · Blog
Stytch's June 2025 walkthrough of revocation propagation — the cleanest description of why short access-token TTLs matter.
Frequently asked questions
What changed when MCP moved to OAuth 2.1 and Streamable HTTP?
The 2025-03-26 spec replaced the older HTTP+SSE transport with Streamable HTTP — a single MCP endpoint that handles JSON-RPC over HTTP POST and optionally upgrades to Server-Sent Events for streaming. The 2025-06-18 revision then made OAuth 2.1 the basis of the auth section, mandating PKCE for all clients, RFC 9728 Protected Resource Metadata for discovery, RFC 8414 for authorization-server metadata, and RFC 8707 resource indicators so tokens are bound to a specific MCP server. Implicit grant and ROPC are gone. Bearer tokens in URI query strings are forbidden.
Is OAuth required for remote MCP servers, or is it optional?
The spec says authorization is OPTIONAL. Verbatim from modelcontextprotocol.io: 'Authorization is OPTIONAL for MCP implementations. When supported: Implementations using an HTTP-based transport SHOULD conform to this specification. Implementations using an STDIO transport SHOULD NOT follow this specification, and instead retrieve credentials from the environment.' In practice every public remote MCP server ships OAuth — Linear, Sentry, GitHub, Notion, Atlassian, Stripe — because anything else is unauditable and unrevocable.
Why does MCP need Dynamic Client Registration (RFC 7591)?
Because MCP clients (Claude Desktop, Claude Code, Cursor, ChatGPT, VS Code, Cline) cannot pre-register with every authorization server in the world. The spec calls this out directly: 'Clients may not know all possible MCP servers and their authorization servers in advance. Manual registration would create friction for users… It enables seamless connection to new MCP servers and their authorization servers.' DCR lets the client POST to /register and receive a client_id back — no human in the loop. Servers that refuse DCR force users to manually create OAuth apps, which is the friction MCP exists to remove.
Is PKCE actually mandatory in OAuth 2.1?
Yes for all clients, including confidential ones. The OAuth 2.1 draft says: 'Clients MUST use code_challenge and code_verifier and authorization servers MUST enforce their use.' If the client supports S256 it MUST use S256; the plain method is allowed only when S256 is technically impossible. The MCP spec restates this in its own Authorization Code Protection section: 'MCP clients MUST implement PKCE according to OAuth 2.1 Section 7.5.2.'
What is RFC 8707 and why does MCP require it?
RFC 8707 defines the resource parameter for OAuth 2.0. The MCP spec mandates that clients include resource= in both authorization and token requests, set to the canonical URI of the MCP server (https://mcp.example.com, which appears percent-encoded as https%3A%2F%2Fmcp.example.com in the query string). This binds the token to one specific resource server, so a token issued for mcp.example.com cannot be replayed against mcp.evilcorp.com. The spec also explicitly forbids token passthrough: an MCP server that accepts a token issued for a different audience is a confused-deputy hole, and the spec calls that out by name.
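Wiring the parameter into an authorization request is a one-liner once the query string is built properly. A sketch (parameter names are from OAuth 2.1 and RFC 8707; the endpoint, client_id, and redirect URI are hypothetical placeholders):

```python
from urllib.parse import urlencode

def build_authorize_url(auth_endpoint: str, client_id: str, redirect_uri: str,
                        code_challenge: str, mcp_server: str) -> str:
    """Sketch of an authorization request carrying the RFC 8707
    resource parameter alongside the mandatory PKCE challenge."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
        # canonical MCP server URI; audience-binds the issued token
        "resource": mcp_server,
    }
    return f"{auth_endpoint}?{urlencode(params)}"

url = build_authorize_url(
    "https://auth.example.com/authorize",
    "abc123",
    "http://127.0.0.1:33418/callback",
    "E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM",
    "https://mcp.example.com",
)
print(url)
```

The same resource= value must be repeated on the token request, so the authorization server can mint an access token whose audience is exactly that MCP server.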
How is Streamable HTTP different from the old HTTP+SSE transport?
The 2024-11-05 transport used two endpoints: an SSE endpoint (conventionally /sse) for the server-to-client stream, and a separate messages endpoint, advertised in the stream's opening endpoint event, for client-to-server POSTs. Streamable HTTP collapses that into one MCP endpoint that accepts POST and GET. POSTs carry JSON-RPC requests; the response is either a single JSON object (Content-Type: application/json) or an SSE stream the server may use to send notifications and the eventual response. GETs let the client open an SSE channel for server-initiated messages. Sessions are tracked via the Mcp-Session-Id header, and the client uses Last-Event-ID to resume after a network drop. Cloudflare's launch post described the change as eliminating 'two phones, one for listening and one for speaking.'
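The session lifecycle on that single endpoint can be sketched as a toy handler (an in-memory session set only; a real server also verifies Origin, auth, and the MCP-Protocol-Version header, and may answer with an SSE stream instead of JSON):

```python
import uuid

def handle_post(headers: dict, body: dict, sessions: set) -> tuple[int, dict]:
    """Toy sketch of a Streamable HTTP MCP endpoint handling a POST.

    An `initialize` request mints an Mcp-Session-Id; every later request
    must echo it back. Unknown or expired sessions get a 404, telling the
    client to re-initialize. Header and method names follow the MCP
    transport spec; the logic is illustrative.
    """
    if body.get("method") == "initialize":
        session_id = uuid.uuid4().hex
        sessions.add(session_id)
        return 200, {"Mcp-Session-Id": session_id}
    session_id = headers.get("Mcp-Session-Id")
    if session_id not in sessions:
        return 404, {}  # client should start a new session
    return 200, {"Mcp-Session-Id": session_id}

sessions: set = set()
status, resp = handle_post({}, {"method": "initialize"}, sessions)
print(status, resp)
```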
Where do refresh tokens fit, and how does revocation propagate?
Authorization servers SHOULD issue short-lived access tokens; the MCP spec also requires refresh-token rotation for public clients, mirroring OAuth 2.1 Section 4.3.1. When a user revokes an app from the IDP's consent screen, the refresh token is invalidated immediately, but the existing access token keeps working until it expires — typically within minutes if the IDP is sane. Stytch's example puts it cleanly: 'Stytch marks its refresh token invalid, and because access tokens are short-lived, the agent quietly loses access once its current token expires — no extra work on the Worker side.' If you need instant revocation, you need token introspection (RFC 7662) or short TTLs measured in seconds, not minutes.
Should I put a gateway in front of my MCP server?
If your team already runs API-gateway infrastructure and your MCP server has more than one tool calling more than one upstream, yes. Cloudflare's workers-oauth-provider, ScaleKit, Stytch Connected Apps, IBM mcp-context-forge, and Composio Strata all offload the OAuth 2.1 + DCR + RFC 9728 + RFC 8414 plumbing so you only ship business logic. The cost is one more network hop and one more vendor surface. For a single-tool dev-tools MCP server, the embedded model (auth server lives next to the resource server) is faster to ship; for SaaS multi-tenant MCP, gateway-delegated auth scales better.
Can I keep using the old SSE transport for backwards compatibility?
Yes, but only for transition. The spec includes a backwards-compatibility section: servers can host both /sse (old) and /mcp (new) and clients should attempt POST /mcp first, falling back to GET /sse on a 4xx. Cloudflare's McpAgent class auto-handles both. New deployments should ship Streamable HTTP only — every major client now negotiates MCP-Protocol-Version: 2025-06-18 by default, and SSE-only servers will drop off the long tail by end of 2026.
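The probe logic on the client side reduces to a small decision. A sketch (the status-code boundaries are a simplification of the spec's fallback guidance, and the endpoint paths are the conventional ones, not mandated):

```python
def choose_transport(post_mcp_status: int) -> str:
    """Pick a transport from the result of probing POST /mcp.

    Success means the server speaks Streamable HTTP; a 4xx means fall
    back to the deprecated HTTP+SSE transport via GET /sse. Server
    errors are surfaced rather than silently downgraded.
    """
    if 200 <= post_mcp_status < 300:
        return "streamable-http"
    if 400 <= post_mcp_status < 500:
        return "http+sse"  # retry against the legacy SSE endpoint
    raise ConnectionError(f"server error {post_mcp_status}")

print(choose_transport(200))
print(choose_transport(405))
```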
What's the single biggest implementation mistake?
Token passthrough. An MCP server that accepts a Bearer token from the client and forwards the same token to a downstream API (Slack, GitHub, Stripe) violates the spec, breaks audience binding, and creates a confused-deputy bug — the downstream API trusts the token because it's signed, not realizing it was minted for an entirely different resource. The MCP spec is explicit: 'MCP servers MUST NOT pass through the token it received from the MCP client.' Issue your own token at the MCP server, validate it locally, then exchange for a separate downstream token using your own credentials. This is exactly what Cloudflare's workers-oauth-provider does by design.
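The local check that blocks passthrough is audience validation. A sketch assuming JWT-style claims (real code must also verify the signature, issuer, and expiry before looking at the audience):

```python
def validate_audience(token_claims: dict, canonical_uri: str) -> bool:
    """Accept a token only if it was minted for this MCP server.

    The `aud` claim may be a string or a list per the JWT convention;
    anything not naming this server's canonical URI is rejected, which
    is exactly what stops a token minted for another API from being
    replayed here.
    """
    aud = token_claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return canonical_uri in audiences

ok = validate_audience({"aud": "https://mcp.example.com"}, "https://mcp.example.com")
bad = validate_audience({"aud": "https://api.github.com"}, "https://mcp.example.com")
print(ok, bad)
```

A token that fails this check must be rejected with a 401, never forwarded upstream; the server then mints or exchanges its own credential for any downstream call.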
Does Linear's official MCP server use OAuth 2.1 with DCR?
Yes. Linear's docs at linear.app/docs/mcp say verbatim that the server 'uses OAuth 2.1 with dynamic client registration for authentication.' The endpoint is https://mcp.linear.app/mcp. Authentication happens through the user's browser the first time a client connects; subsequent calls use the stored access token. Linear also supports passing OAuth tokens or API keys directly via Authorization: Bearer for server-to-server use, but the default agent flow is the spec-conformant DCR path.
Where can I see a full reference implementation for an OAuth-2.1 MCP server?
Two are widely cited. Cloudflare's workers-oauth-provider (github.com/cloudflare/workers-oauth-provider) is the OAuth-2.1 provider library used by Cloudflare's own remote MCP examples — it stores secrets only as hashes and wraps your fetch handler so your tool code receives an already-authenticated user. Sentry's open-source MCP server (github.com/getsentry/sentry-mcp) ships a TypeScript Streamable-HTTP server with OAuth, deployed to mcp.sentry.dev — the README describes it as 'remote MCP middleware to the upstream Sentry API.'
Sources
MCP specification
- modelcontextprotocol.io — Authorization (2025-06-18)
- modelcontextprotocol.io — Transports (Streamable HTTP)
- github.com/modelcontextprotocol/modelcontextprotocol
Underlying RFCs and drafts
- draft-ietf-oauth-v2-1-13 — OAuth 2.1
- RFC 7591 — OAuth 2.0 Dynamic Client Registration
- RFC 7636 — PKCE
- RFC 8414 — OAuth 2.0 Authorization Server Metadata
- RFC 8707 — Resource Indicators for OAuth 2.0
- RFC 9728 — OAuth 2.0 Protected Resource Metadata
Vendor docs and engineering posts
- Cloudflare — Remote MCP servers (March 2025)
- Cloudflare — Streamable HTTP MCP servers (April 2025)
- cloudflare/workers-oauth-provider
- Stytch — MCP Authentication and Authorization Servers
- Stytch — OAuth for MCP, real-world example (June 2025)
- ScaleKit — Implement OAuth for MCP servers
- Linear — MCP server docs
- Sentry — MCP server docs
- github.com/getsentry/sentry-mcp
Internal links
- /servers/linear
- /servers/sentry
- /servers/github — also OAuth 2.1
- /servers/notion — also OAuth 2.1
- /servers/atlassian-cloud — also OAuth 2.1
- /blog/what-is-mcp — protocol primer
- /blog/mcp-security-… — security playbook
- /blog/claude-skills-vs-mcp-… — decision matrix
- /blog/mcp-context-bloat-fix — gateway and tool-budget angle
- /best-mcp-servers — curated roundup
- /servers — browse all 3,000+