Llama.cpp Bridge


by openconstruct

Connects Claude Desktop to your local llama.cpp models, letting you chat with local LLMs directly through Claude's interface.

Bridges local llama-server instances with MCP clients, providing a chat interface, health monitoring, and configurable generation parameters for integrating llama.cpp models with desktop applications.

82,386 views · Local (stdio)

What it does

  • Chat with local llama.cpp models through Claude Desktop
  • Control generation parameters like temperature and max_tokens
  • Monitor llama-server health and status
  • Track performance metrics and token usage
  • Test model capabilities with built-in tools

Best for

  • AI researchers running local models
  • Privacy-focused users avoiding cloud APIs
  • Developers integrating local LLMs with desktop workflows

  • No cloud API keys required
  • Full conversation support
  • Built-in testing tools

About Llama.cpp Bridge

Llama.cpp Bridge is a community-built MCP server published by openconstruct that provides AI assistants with tools and capabilities via the Model Context Protocol. Llama.cpp Bridge connects local llama-server instances to MCP clients, enabling chat, health checks, and configurable generation parameters. It is categorized under AI/ML and developer tools.

How to install

You can install Llama.cpp Bridge in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

Llama.cpp Bridge is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

LibreModel MCP Server 🤖

A Model Context Protocol (MCP) server that bridges Claude Desktop with your local LLM instance running via llama-server.

Features

  • 💬 Full conversation support with LibreModel through Claude Desktop
  • 🎛️ Complete parameter control (temperature, max_tokens, top_p, top_k)
  • ✅ Health monitoring and server status checks
  • 🧪 Built-in testing tools for different capabilities
  • 📊 Performance metrics and token usage tracking
  • 🔧 Easy configuration via environment variables

Quick Start

Install from npm:

npm install @openconstruct/llama-mcp-server

Or build from source:

1. Install Dependencies

cd llama-mcp
npm install

2. Build the Server

npm run build

3. Start Your LibreModel

Make sure llama-server is running with your model (-m sets the model file, -c the context size in tokens, --port the HTTP port):

./llama-server -m lm37.gguf -c 2048 --port 8080

4. Configure Claude Desktop

Add this to your Claude Desktop configuration (~/.config/claude/claude_desktop_config.json):

{
  "mcpServers": {
    "libremodel": {
      "command": "node",
      "args": ["/home/jerr/llama-mcp/dist/index.js"]
    }
  }
}

5. Restart Claude Desktop

Claude will now have access to LibreModel through MCP!

Usage

Once configured, you can use these tools in Claude Desktop:

💬 chat - Main conversation tool

Use the chat tool to ask LibreModel: "What is your name and what can you do?"

🧪 quick_test - Test LibreModel capabilities

Run a quick_test with type "creative" to see if LibreModel can write poetry.

🏥 health_check - Monitor server status

Use health_check to see if LibreModel is running properly.

Configuration

Set environment variables to customize behavior:

export LLAMA_SERVER_URL="http://localhost:8080"  # Default llama-server URL
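As a sketch of how such a variable is typically consumed on the server side (the helper name here is hypothetical, not part of the project's actual code):

```typescript
// Hypothetical helper: resolve the llama-server URL from the environment,
// falling back to the documented default when LLAMA_SERVER_URL is unset.
function resolveServerUrl(env: Record<string, string | undefined>): string {
  return env.LLAMA_SERVER_URL ?? "http://localhost:8080";
}

// e.g. const url = resolveServerUrl(process.env);
```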

Available Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| chat | Converse with LibreModel | message, temperature, max_tokens, top_p, top_k, system_prompt |
| quick_test | Run predefined capability tests | test_type (hello/math/creative/knowledge) |
| health_check | Check server health and status | None |
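Under the hood, an MCP client invokes these tools with a standard `tools/call` JSON-RPC request; an illustrative example for the chat tool (argument values are made up, the parameter names mirror the table above):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "chat",
    "arguments": {
      "message": "What is your name?",
      "temperature": 0.7,
      "max_tokens": 256
    }
  }
}
```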

Resources

  • Configuration: View current server settings
  • Instructions: Detailed usage guide and setup instructions

Development

# Install dependencies
npm install

# Development mode (auto-rebuild)
npm run dev

# Build for production
npm run build

# Start the server directly
npm start

Architecture

Claude Desktop ↔ Llama MCP Server ↔ llama-server API ↔ Local Model

The MCP server acts as a bridge, translating MCP protocol messages into llama-server API calls and formatting responses for Claude Desktop.
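A minimal sketch of that translation step, assuming llama-server's HTTP API (where the maximum-token count is named n_predict); the helper and default values here are illustrative, not the project's actual code:

```typescript
// Arguments of the MCP chat tool, as listed in this README.
interface ChatArgs {
  message: string;
  temperature?: number;
  max_tokens?: number;
  top_p?: number;
  top_k?: number;
  system_prompt?: string;
}

// Translate MCP chat-tool arguments into a llama-server /completion
// request body. Defaults are illustrative.
function toCompletionBody(args: ChatArgs) {
  const prompt = args.system_prompt
    ? `${args.system_prompt}\n\n${args.message}`
    : args.message;
  return {
    prompt,
    temperature: args.temperature ?? 0.7,
    n_predict: args.max_tokens ?? 512, // llama-server's name for max tokens
    top_p: args.top_p ?? 0.9,
    top_k: args.top_k ?? 40,
  };
}

// Usage sketch (assumes llama-server on localhost:8080):
// const res = await fetch("http://localhost:8080/completion", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(toCompletionBody({ message: "Hello" })),
// });
```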

Troubleshooting

"Cannot reach Llama server"

  • Ensure llama-server is running on the configured port
  • Check that the model is loaded and responding
  • Verify firewall/network settings

"Tool not found in Claude Desktop"

  • Restart Claude Desktop after configuration changes
  • Check that the path to index.js is correct and absolute
  • Verify the MCP server builds without errors

Poor response quality

  • Adjust temperature and sampling parameters
  • Try different system prompts

License

CC0-1.0 - Public Domain. Use freely!


Built with โค๏ธ for open-source AI and the LibreModel project. by Claude Sonnet4




Related Skills

  • ui-design-system: UI design system toolkit for Senior UI Designer including design token generation, component documentation, responsive design calculations, and developer handoff tools. Use for creating design systems, maintaining visual consistency, and facilitating design-dev collaboration.
  • codex-cli-bridge: Bridge between Claude Code and OpenAI Codex CLI - generates AGENTS.md from CLAUDE.md, provides Codex CLI execution helpers, and enables seamless interoperability between both tools.
  • ai-sdk: Answer questions about the AI SDK and help build AI-powered features. Use when developers ask about AI SDK functions like generateText, streamText, ToolLoopAgent, embed, or tools; want to build AI agents, chatbots, RAG systems, or text generation features; have questions about AI providers (OpenAI, Anthropic, Google, etc.), streaming, tool calling, structured output, or embeddings; or use React hooks like useChat or useCompletion.
  • api-documenter: Master API documentation with OpenAPI 3.1, AI-powered tools, and modern developer experience practices. Create interactive docs, generate SDKs, and build comprehensive developer portals. Use PROACTIVELY for API documentation or developer portal creation.
  • openai-knowledge: Use when working with the OpenAI API (Responses API) or OpenAI platform features (tools, streaming, Realtime API, auth, models, rate limits, MCP) and you need authoritative, up-to-date documentation (schemas, examples, limits, edge cases). Prefer the OpenAI Developer Documentation MCP server tools when available; otherwise guide the user to enable `openaiDeveloperDocs`.
  • cli-builder: Guide for building TypeScript CLIs with Bun. Use when creating command-line tools, adding subcommands to existing CLIs, or building developer tooling. Covers argument parsing, subcommand patterns, output formatting, and distribution.