Connects to DeepSeek's language models for AI-powered chat, text completion, and code generation. Works as a remote service or local installation.

What it does

  • Generate chat completions using DeepSeek models
  • Create text and code with customizable parameters
  • Check account balance and usage
  • List available DeepSeek models
  • Process AI requests via remote endpoint
  • Run locally via Docker or npm package

Best for

  • Developers needing AI assistance for coding tasks
  • Writers using AI for content generation
  • Applications requiring DeepSeek model integration
  • Teams wanting self-hosted AI capabilities

  • Official DeepSeek integration
  • Remote endpoint available
  • Multiple deployment options

About DeepSeek

DeepSeek is a community-built MCP server published by dmontgomery40 that provides AI assistants with tools and capabilities via the Model Context Protocol. DeepSeek offers an AI-powered chatbot and writing assistant for chat completions, writing help, and code generation. It is categorized under AI/ML.

How to install

You can install DeepSeek in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server can run locally on your machine via the stdio transport, or you can use the hosted remote endpoint described below.

License

DeepSeek is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

DeepSeek MCP Server

As of February 24, 2026, this is the only DeepSeek MCP server repo linked in DeepSeek's official integration list and listed in the official MCP Registry.

Official DeepSeek MCP server for chat, completions, model listing, and balance checks. Why V4 is a big deal (plain-language explainer).

  • Hosted remote endpoint: https://deepseek-mcp.ragweld.com/mcp
  • Auth: Authorization: Bearer <token>
  • Local package and Docker are also supported.
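
As a quick sanity check of the hosted endpoint, you can talk JSON-RPC to it directly. The sketch below (Node 18+, which ships a global `fetch`) uses the URL and Authorization header from the bullets above; the body shape follows the MCP specification, where `tools/list` is the standard method for listing a server's tools. Note that a real client performs an `initialize` handshake first, so treat this as an illustration of the transport shape, not a full client.

```javascript
// Sketch: one raw JSON-RPC request to the hosted MCP endpoint.
// URL and header format come from the README; "tools/list" is the
// standard MCP tool-listing method. A real client initializes a
// session first.
const url = "https://deepseek-mcp.ragweld.com/mcp";
const body = JSON.stringify({ jsonrpc: "2.0", id: 1, method: "tools/list" });

async function listTools(token) {
  const res = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
      Authorization: `Bearer ${token}`,
    },
    body,
  });
  return res; // expect 200 with a valid token, 401 without
}

// listTools(process.env.DEEPSEEK_MCP_AUTH_TOKEN);
```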

Quick Install (Copy/Paste)

1) Set your hosted token once

export DEEPSEEK_MCP_AUTH_TOKEN="REPLACE_WITH_TOKEN"

2) Codex CLI (remote MCP)

codex mcp add deepseek --url https://deepseek-mcp.ragweld.com/mcp --bearer-token-env-var DEEPSEEK_MCP_AUTH_TOKEN

3) Claude Code (remote MCP)

claude mcp add --transport http deepseek https://deepseek-mcp.ragweld.com/mcp --header "Authorization: Bearer $DEEPSEEK_MCP_AUTH_TOKEN"

4) Cursor (remote MCP)

node -e 'const fs=require("fs"),p=process.env.HOME+"/.cursor/mcp.json";let j={mcpServers:{}};try{j=JSON.parse(fs.readFileSync(p,"utf8"))}catch{};j.mcpServers={...(j.mcpServers||{}),deepseek:{url:"https://deepseek-mcp.ragweld.com/mcp",headers:{Authorization:"Bearer ${env:DEEPSEEK_MCP_AUTH_TOKEN}"}}};fs.mkdirSync(process.env.HOME+"/.cursor",{recursive:true});fs.writeFileSync(p,JSON.stringify(j,null,2));'

5) Local install (stdio, if you prefer self-hosted)

DEEPSEEK_API_KEY="REPLACE_WITH_DEEPSEEK_KEY" npx -y deepseek-mcp-server

6) Local install with Docker (stdio, self-hosted)

docker pull docker.io/dmontgomery40/deepseek-mcp-server:0.4.0 && \
docker run --rm -i -e DEEPSEEK_API_KEY="REPLACE_WITH_DEEPSEEK_KEY" docker.io/dmontgomery40/deepseek-mcp-server:0.4.0

Non-Technical Users

If you mostly use chat apps and don’t want terminal setup:

  1. Use Cursor’s MCP settings UI and add:
    • URL: https://deepseek-mcp.ragweld.com/mcp
    • Header: Authorization: Bearer <token>
  2. If your app does not support custom remote MCP servers with bearer headers yet, use Codex/Claude Code/Cursor as your MCP-enabled client and keep your usual model provider.
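
For reference, the entry that ends up in `~/.cursor/mcp.json` looks roughly like the fragment below. Whether Cursor expands the `${env:...}` placeholder depends on your Cursor version, so treat that detail as an assumption; if it is not supported, paste the token value directly.

```json
{
  "mcpServers": {
    "deepseek": {
      "url": "https://deepseek-mcp.ragweld.com/mcp",
      "headers": {
        "Authorization": "Bearer ${env:DEEPSEEK_MCP_AUTH_TOKEN}"
      }
    }
  }
}
```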

OpenRouter users (API + chat UI)

OpenRouter now documents MCP usage, but its MCP flow is SDK/client-centric rather than "paste a URL into the chat and done" for most users. The easiest path is to keep OpenRouter for models and connect this MCP server through an MCP-capable client (Codex/Claude Code/Cursor).

Remote vs Local (Which Should I Use?)

Remote server

Use remote if you want the fastest setup and centralized updates.

  • Pros: no local server process, easy multi-device use, one shared endpoint.
  • Cons: depends on network + hosted token.

Local server

Use local if you want full runtime control.

  • Pros: fully self-managed, easy private-network workflows.
  • Cons: you manage updates/secrets/process lifecycle.

Code Execution with MCP (What This Actually Means)

In basic tool-calling mode, the model usually needs:

  • many tool definitions loaded into context before it starts;
  • one model round-trip per tool call;
  • intermediate results repeatedly fed back into context.

That works for small toolsets, but it scales poorly. You burn tokens on tool metadata, add latency from repeated inference hops, and raise failure risk when tools are similarly named or require multi-step orchestration.

Code execution changes the control flow. Instead of repeatedly asking the model to call one tool at a time, the model can write a small program that calls tools directly in an execution runtime. That runtime handles loops, branching, filtering, joins, retries, and result shaping. The model then gets a compact summary instead of every raw intermediate payload.
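
As a concrete sketch of that control flow, the model-written program might look like the following. The tool names (`list_models`, `check_balance`) and the stubbed runtime are hypothetical illustrations, not this server's actual tool schema: the point is that loops, filtering, and result shaping happen in code, and only the final summary returns to the model's context.

```javascript
// Hypothetical code-execution orchestration. In a real MCP client,
// callTool would invoke server tools; here it is stubbed so the
// control flow is visible.
async function callTool(name, args) {
  const fakeResults = {
    list_models: ["deepseek-chat", "deepseek-reasoner"],
    check_balance: { balance: 4.2, currency: "USD" },
  };
  return fakeResults[name];
}

async function main() {
  // Several tool calls, plus filtering, happen in one pass here,
  // instead of one model round-trip per call.
  const models = await callTool("list_models", {});
  const balance = await callTool("check_balance", {});
  const chatModels = models.filter((m) => m.includes("chat"));

  // Only this compact summary goes back into the model's context,
  // not every raw intermediate payload.
  return `models=${chatModels.join(",")} balance=${balance.balance}`;
}

main().then(console.log);
```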

Why this matters in practice:

  • lower context pressure: you avoid dumping full tool catalogs and every raw result into prompt history;
  • better orchestration: code handles deterministic logic that is awkward in pure natural-language loops;
  • lower latency at scale: fewer model turns for multi-step workflows;
  • usually better reliability: less chance of drifting tool choice across long chains.

Limits to keep in mind:

  • code execution does not remove the need for good tool schemas and permissions;
  • this is still an agent system, so guardrails/quotas/auditing matter;
  • for tiny single-tool tasks, plain tool calling can still be simpler.

For this DeepSeek MCP server, the practical takeaway is: keep tool interfaces explicit and stable, then let MCP clients choose direct tool-calling or code-execution orchestration based on workload size and complexity.

Learn More (Curated)

Registry Identity

  • MCP Registry name: io.github.DMontgomery40/deepseek

License

MIT

Related Skills

  • context-optimizer: Advanced context management with auto-compaction and dynamic context optimization for DeepSeek's 64k context window. Features intelligent compaction (merging, summarizing, extracting), query-aware relevance scoring, and hierarchical memory system with context archive. Logs optimization events to chat.
  • opencode-cli: This skill should be used when configuring or using the OpenCode CLI for headless LLM automation. Use when the user asks to "configure opencode", "use opencode cli", "set up opencode", "opencode run command", "opencode model selection", "opencode providers", "opencode vertex ai", "opencode mcp servers", "opencode ollama", "opencode local models", "opencode deepseek", "opencode kimi", "opencode mistral", "fallback cli tool", or "headless llm cli". Covers command syntax, provider configuration, Vertex AI setup, MCP servers, local models, cloud providers, and subprocess integration patterns.
  • ai-model-nodejs: Use this skill when developing Node.js backend services or CloudBase cloud functions (Express/Koa/NestJS, serverless, backend APIs) that need AI capabilities. Features text generation (generateText), streaming (streamText), AND image generation (generateImage) via @cloudbase/node-sdk ≥3.16.0. Built-in models include Hunyuan (hunyuan-2.0-instruct-20251111 recommended), DeepSeek (deepseek-v3.2 recommended), and hunyuan-image for images. This is the ONLY SDK that supports image generation. NOT for browser/Web apps (use ai-model-web) or WeChat Mini Program (use ai-model-wechat).
  • clawrouter: Smart LLM router that saves 78% on inference costs by routing every request to the cheapest capable model across 30+ models from OpenAI, Anthropic, Google, DeepSeek, and xAI.
  • moe-training: Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), implementing sparse architectures like Mixtral 8x7B or DeepSeek-V3, or scaling model capacity without proportional compute increase. Covers MoE architectures, routing mechanisms, load balancing, expert parallelism, and inference optimization.
  • training-llms-megatron: Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies. Use when training models >1B parameters, need maximum GPU efficiency (47% MFU on H100), or require tensor/pipeline/sequence/context/expert parallelism. Production-ready framework used for Nemotron, LLaMA, DeepSeek.