
Memory Plus
A local RAG memory store that lets MCP agents save, search, and recall persistent memories (notes, context, ideas) across sessions. Includes visualization of memory relationships through interactive graph clusters.
Lightweight, local RAG memory store for MCP agents. Easily record, retrieve, update, delete, and visualize persistent "memories" across sessions.
What it does
- Record and store persistent memories across sessions
- Search memories by keywords or topics
- Update and modify existing memory entries
- Import documents directly into memory
- Visualize memory relationships with interactive graphs
- Track memory versions and history
About Memory Plus
Memory Plus is a community-built MCP server published by yuchen20 that provides AI assistants with tools and capabilities via the Model Context Protocol. It is a lightweight, local RAG memory store that lets MCP agents record, manage, and visualize persistent memories. It is categorized under AI/ML.
How to install
You can install Memory Plus in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.
License
Memory Plus is released under the Apache-2.0 license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.


Memory-Plus
A lightweight, local Retrieval-Augmented Generation (RAG) memory store for MCP agents. Memory-Plus lets your agent record, retrieve, update, and visualize persistent "memories"—notes, ideas, and session context—across runs.
🏆 First Place at the Infosys Cambridge AI Centre Hackathon!
Key Features
- Record Memories: Save user data, ideas, and important context.
- Retrieve Memories: Search by keywords or topics over past entries.
- Recent Memories: Fetch the last N items quickly.
- Update Memories: Append or modify existing entries seamlessly.
- Visualize Memories: Interactive graph clusters revealing relationships.
- File Import (since v0.1.2): Ingest documents directly into memory.
- Delete Memories (since v0.1.2): Remove unwanted entries.
- Memory for Memories (since v0.1.4): We now use resources to teach your AI exactly when (and when not) to recall past interactions.
- Memory Versioning (since v0.1.4): When memories are updated, we keep the old versions to provide a full history.
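To make the record, update, recent, and versioning features above concrete, here is a minimal in-memory sketch. It is an illustration only and not the Memory-Plus implementation: the class and method names (ToyMemoryStore, record, update, recent, search, history) are hypothetical, and a plain substring match stands in for the embedding-based retrieval a RAG store would use.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MemoryEntry:
    """Current text of a memory plus its earlier versions (hypothetical layout)."""
    text: str
    old_versions: List[str] = field(default_factory=list)


class ToyMemoryStore:
    """Illustrative store only; names and behavior are assumptions."""

    def __init__(self) -> None:
        self._entries: Dict[int, MemoryEntry] = {}
        self._next_id = 0

    def record(self, text: str) -> int:
        """Save a new memory and return its id."""
        memory_id = self._next_id
        self._entries[memory_id] = MemoryEntry(text)
        self._next_id += 1
        return memory_id

    def update(self, memory_id: int, new_text: str) -> None:
        """Replace a memory's text, keeping the previous text as history."""
        entry = self._entries[memory_id]
        entry.old_versions.append(entry.text)
        entry.text = new_text

    def delete(self, memory_id: int) -> None:
        """Remove a memory and its history."""
        del self._entries[memory_id]

    def recent(self, n: int) -> List[str]:
        """Return the last n recorded memories, oldest first."""
        if n <= 0:
            return []
        # dicts preserve insertion order, so the tail holds the newest entries
        return [entry.text for entry in list(self._entries.values())[-n:]]

    def search(self, keyword: str) -> List[str]:
        """Case-insensitive substring match (stand-in for embedding search)."""
        needle = keyword.lower()
        return [e.text for e in self._entries.values() if needle in e.text.lower()]

    def history(self, memory_id: int) -> List[str]:
        """Return earlier versions of a memory, oldest first."""
        return list(self._entries[memory_id].old_versions)
```

Calling update appends the previous text to the entry's history before overwriting it, which is one simple way to keep a full version trail.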

Installation
1. Prerequisites
Google API Key
Obtain from Google AI Studio and set as GOOGLE_API_KEY in your environment.
Note that this API key is used only for the Gemini Embedding API, so it is entirely free for you to use!
Setup Google API Key Example
# macOS/Linux
export GOOGLE_API_KEY="<YOUR_API_KEY>"
# Windows (PowerShell)
setx GOOGLE_API_KEY "<YOUR_API_KEY>"
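Before launching the server, it can help to confirm that the variable is actually visible to new processes. The helper below is a small sketch; find_api_key is a hypothetical name and is not part of Memory-Plus.

```python
import os
from typing import Mapping, Optional


def find_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the GOOGLE_API_KEY value, or None if it is missing or blank."""
    env = os.environ if env is None else env
    value = env.get("GOOGLE_API_KEY", "").strip()
    return value or None


if __name__ == "__main__":
    if find_api_key() is None:
        print("GOOGLE_API_KEY is not set; set it before starting the server.")
    else:
        print("GOOGLE_API_KEY is set.")
```

On Windows, setx only affects terminals opened after the command runs, so run this check in a freshly opened shell.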
UV Runtime
Required to serve the MCP plugin.
Install UV Runtime
pip install uv
Or install via shell scripts:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
VS Code One-Click Setup
Click the badge below to automatically install and configure Memory-Plus in VS Code:
This will add the following to your settings.json:
{
  "mcpServers": {
    // ..., your other MCP servers
    "memory-plus": {
      "command": "uvx",
      "args": [
        "-q",
        "memory-plus@latest"
      ]
    }
  }
}
For Cursor, go to File -> Preferences -> Cursor Settings -> MCP and add the above config.
If you didn't add the GOOGLE_API_KEY to your secrets / environment variables, you can add it with:
"env": {
"GOOGLE_API_KEY": "<YOUR_API_KEY>"
}
just after the args array, within the memory-plus dictionary (separated from the args array by a comma).
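If you would rather generate the entry than edit the file by hand, the sketch below builds the same memory-plus entry with the env block in place. The function name add_memory_plus is hypothetical, and the script only prints the result; it assumes you paste the output into your client's MCP config yourself.

```python
import json
from typing import Any, Dict


def add_memory_plus(config: Dict[str, Any], api_key: str) -> Dict[str, Any]:
    """Return a copy of `config` with a memory-plus entry that includes `env`."""
    # Copy the existing server map so the caller's dict is left untouched
    servers = dict(config.get("mcpServers", {}))
    servers["memory-plus"] = {
        "command": "uvx",
        "args": ["-q", "memory-plus@latest"],
        "env": {"GOOGLE_API_KEY": api_key},
    }
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    existing = {"mcpServers": {}}
    print(json.dumps(add_memory_plus(existing, "<YOUR_API_KEY>"), indent=2))
```

Because json.dumps emits strict JSON, the output has no comments or trailing commas, which avoids parse errors in clients that reject them.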
For Cline add the following to your cline_mcp_settings.json:
{
  "mcpServers": {
    // ..., your other MCP servers
    "memory-plus": {
      "disabled": false,
      "timeout": 300,
      "command": "uvx",
      "args": [
        "-q",
        "memory-plus@latest"
      ],
      "env": {
        "GOOGLE_API_KEY": "${{ secrets.GOOGLE_API_KEY }}"
      },
      "transportType": "stdio"
    }
  }
}
For other IDEs, the configuration should be largely similar to the above.
Local Testing and Development
Using MCP Inspector, you can test the memory-plus server locally.
git clone https://github.com/Yuchen20/Memory-Plus.git
cd Memory-Plus
npx @modelcontextprotocol/inspector fastmcp run ./memory_plus/mcp.py
Or, if you prefer to use this MCP in an actual chat session, there is a template chatbot in agent.py.
# Clone the repository
git clone https://github.com/Yuchen20/Memory-Plus.git
cd Memory-Plus
# Install dependencies
pip install uv
uv pip install fast-agent-mcp
uv run fast-agent setup
Set up fastagent.config.yaml and fastagent.secrets.yaml with your own API keys.
# Run the agent
uv run agent_memory.py
Roadmap
- Memory Update
- Improved prompt engineering for memory recording
- Better Visualization of Memory Graph
- File Import
- Remote backup!
- Web UI for Memory Management
If you have any feature requests, please feel free to open a new issue or add a new entry in the Feature Request.
License
This project is licensed under the Apache License 2.0. See LICENSE for details.
FAQ
1. Why is memory-plus not working?
- Memory-plus has a few dependencies that can be slow to download the first time. It typically takes around 1 minute to fetch everything needed.
- Once dependencies are installed, subsequent usage will be much faster.
- If you experience other issues, please feel free to open a new issue on the repository.
2. How do I use memory-plus in a real chat session?
- Simply add the MCP JSON file to your MCP setup.
- Once added, memory-plus will automatically activate when needed.