DeepSeek Thinker

by ruixingshi

Connects to DeepSeek's reasoning model to expose its step-by-step thinking process and chain-of-thought reasoning. Works with both the cloud DeepSeek API and local Ollama deployments, making it useful for applications that need transparent, explainable problem solving.

What it does

  • Access DeepSeek's internal reasoning process
  • Generate step-by-step problem solving explanations
  • Connect to DeepSeek via OpenAI-compatible API
  • Run DeepSeek models locally through Ollama
  • Capture and display chain-of-thought reasoning

Best for

  • Developers building reasoning-heavy applications
  • Researchers studying AI thinking processes
  • Users wanting transparent problem-solving steps
  • Teams needing explainable AI decision-making

Highlights: dual deployment modes (cloud/local); exposes the model's thinking process.

About DeepSeek Thinker

DeepSeek Thinker is a community-built MCP server published by ruixingshi that provides AI assistants with tools and capabilities via the Model Context Protocol. Integrate DeepSeek Thinker for AI problem solving and chain-of-thought reasoning in advanced artificial intelligence applications. It is categorized under AI/ML.

How to install

You can install DeepSeek Thinker in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

DeepSeek Thinker is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

Deepseek Thinker MCP Server

An MCP (Model Context Protocol) server that provides Deepseek reasoning content to MCP-enabled AI clients, like Claude Desktop. It supports access to Deepseek's thought processes from the Deepseek API service or from a local Ollama server.

Core Features

  • 🤖 Dual Mode Support

    • OpenAI API mode support
    • Ollama local mode support
  • 🎯 Focused Reasoning

    • Captures Deepseek's thinking process
    • Provides reasoning output

Available Tools

get-deepseek-thinker

  • Description: Perform reasoning using the Deepseek model
  • Input Parameters:
    • originPrompt (string): User's original prompt
  • Returns: Structured text response containing the reasoning process
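
For orientation, here is a minimal sketch of how a tool like this is typically registered with the TypeScript MCP SDK and Zod. This is an illustration under stated assumptions, not the server's actual source; callDeepseek is a hypothetical helper standing in for the real Deepseek call.

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical helper: the real server talks to the Deepseek API or Ollama here.
async function callDeepseek(prompt: string): Promise<string> {
  return `reasoning for: ${prompt}`;
}

const server = new McpServer({ name: "deepseek-thinker", version: "1.0.0" });

// Validate input with Zod, run the reasoning call, return a text response.
server.tool(
  "get-deepseek-thinker",
  { originPrompt: z.string().describe("User's original prompt") },
  async ({ originPrompt }) => {
    const reasoning = await callDeepseek(originPrompt);
    return { content: [{ type: "text", text: reasoning }] };
  }
);

// Serve over stdio, matching the local (stdio) transport described above.
await server.connect(new StdioServerTransport());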

Environment Configuration

OpenAI API Mode

Set the following environment variables:

API_KEY=<Your OpenAI API Key>
BASE_URL=<API Base URL>

Ollama Mode

Set the following environment variable:

USE_OLLAMA=true
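
As a sketch of how these variables might be consumed (assumed, not taken from the server's source): both modes can share the OpenAI SDK, because Ollama exposes an OpenAI-compatible endpoint at http://localhost:11434/v1.

import OpenAI from "openai";

// USE_OLLAMA selects the local endpoint; otherwise API_KEY/BASE_URL
// point the client at a hosted OpenAI-compatible Deepseek API.
const useOllama = process.env.USE_OLLAMA === "true";

const client = new OpenAI(
  useOllama
    ? { baseURL: "http://localhost:11434/v1", apiKey: "ollama" } // Ollama ignores the key
    : { baseURL: process.env.BASE_URL, apiKey: process.env.API_KEY }
);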

Usage

Integration with an AI client, like Claude Desktop

Add the following configuration to your claude_desktop_config.json:

{
  "mcpServers": {
    "deepseek-thinker": {
      "command": "npx",
      "args": [
        "-y",
        "deepseek-thinker-mcp"
      ],
      "env": {
        "API_KEY": "<Your API Key>",
        "BASE_URL": "<Your Base URL>"
      }
    }
  }
}
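
Claude Desktop reads this file from ~/Library/Application Support/Claude/claude_desktop_config.json on macOS and %APPDATA%\Claude\claude_desktop_config.json on Windows; restart the client after editing so the new server is picked up.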

Using Ollama Mode

{
  "mcpServers": {
    "deepseek-thinker": {
      "command": "npx",
      "args": [
        "-y",
        "deepseek-thinker-mcp"
      ],
      "env": {
        "USE_OLLAMA": "true"
      }
    }
  }
}
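
Ollama mode assumes a local Ollama instance is already running with a Deepseek reasoning model pulled, for example via ollama pull deepseek-r1 (the exact model this server expects may differ). No API key is needed in this mode.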

Local Server Configuration

{
  "mcpServers": {
    "deepseek-thinker": {
      "command": "node",
      "args": [
        "/your-path/deepseek-thinker-mcp/build/index.js"
      ],
      "env": {
        "API_KEY": "<Your API Key>",
        "BASE_URL": "<Your Base URL>"
      }
    }
  }
}

Development Setup

# Install dependencies
npm install

# Build project
npm run build

# Run service
node build/index.js

FAQ

Getting a response like "MCP error -32001: Request timed out"

This error occurs when the Deepseek API responds too slowly or the reasoning output is very long, causing the MCP server to time out.
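
One common mitigation (a general pattern, not a documented fix for this server) is to stream the response so tokens arrive incrementally instead of blocking on a single long request. Deepseek's reasoner models expose chain-of-thought tokens through a reasoning_content field on streamed deltas; reusing client and originPrompt from the sketches above:

// Sketch: stream the reasoning so a long chain of thought doesn't
// sit behind one long-running request.
const stream = await client.chat.completions.create({
  model: "deepseek-reasoner", // assumed model name
  messages: [{ role: "user", content: originPrompt }],
  stream: true,
});

let reasoning = "";
for await (const chunk of stream) {
  // reasoning_content is Deepseek's extension field, absent from OpenAI's types.
  const delta = chunk.choices[0]?.delta as { reasoning_content?: string };
  reasoning += delta?.reasoning_content ?? "";
}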

Tech Stack

  • TypeScript
  • @modelcontextprotocol/sdk
  • OpenAI API
  • Ollama
  • Zod (parameter validation)

License

This project is licensed under the MIT License. See the LICENSE file for details.

Related Skills

context-optimizer (29)

Advanced context management with auto-compaction and dynamic context optimization for DeepSeek's 64k context window. Features intelligent compaction (merging, summarizing, extracting), query-aware relevance scoring, and hierarchical memory system with context archive. Logs optimization events to chat.

opencode-cli (9)

This skill should be used when configuring or using the OpenCode CLI for headless LLM automation. Use when the user asks to "configure opencode", "use opencode cli", "set up opencode", "opencode run command", "opencode model selection", "opencode providers", "opencode vertex ai", "opencode mcp servers", "opencode ollama", "opencode local models", "opencode deepseek", "opencode kimi", "opencode mistral", "fallback cli tool", or "headless llm cli". Covers command syntax, provider configuration, Vertex AI setup, MCP servers, local models, cloud providers, and subprocess integration patterns.

ai-model-nodejs (3)

Use this skill when developing Node.js backend services or CloudBase cloud functions (Express/Koa/NestJS, serverless, backend APIs) that need AI capabilities. Features text generation (generateText), streaming (streamText), AND image generation (generateImage) via @cloudbase/node-sdk ≥3.16.0. Built-in models include Hunyuan (hunyuan-2.0-instruct-20251111 recommended), DeepSeek (deepseek-v3.2 recommended), and hunyuan-image for images. This is the ONLY SDK that supports image generation. NOT for browser/Web apps (use ai-model-web) or WeChat Mini Program (use ai-model-wechat).

clawrouter (3)

Smart LLM router that saves up to 78% on inference costs. Routes every request to the cheapest capable model across 30+ models from OpenAI, Anthropic, Google, DeepSeek, and xAI.

moe-training (2)

Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), implementing sparse architectures like Mixtral 8x7B or DeepSeek-V3, or scaling model capacity without proportional compute increase. Covers MoE architectures, routing mechanisms, load balancing, expert parallelism, and inference optimization.

training-llms-megatron (1)

Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies. Use when training models >1B parameters, need maximum GPU efficiency (47% MFU on H100), or require tensor/pipeline/sequence/context/expert parallelism. Production-ready framework used for Nemotron, LLaMA, DeepSeek.