MCP Memory Service
An MCP server providing semantic memory and persistent storage capabilities for Claude Desktop using ChromaDB and sentence transformers. This service enables long-term memory storage with semantic search capabilities, making it ideal for maintaining context across conversations and instances.
Features
- Semantic search using sentence transformers
- Natural language time-based recall (e.g., “last week”, “yesterday morning”)
- Tag-based memory retrieval system
- Persistent storage using ChromaDB
- Automatic database backups
- Memory optimization tools
- Exact match retrieval
- Debug mode for similarity analysis
- Database health monitoring
- Duplicate detection and cleanup
- Customizable embedding model
- Cross-platform compatibility (Apple Silicon, Intel, Windows, Linux)
- Hardware-aware optimizations for different environments
- Graceful fallbacks for limited hardware resources
Quick Start
For the fastest way to get started:
# Install UV if not already installed
pip install uv
# Clone and install
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
uv pip install -e .
# Run the service
uv run memory
Docker and Smithery Integration
Docker Usage
The service can be run in a Docker container for better isolation and deployment:
# Build the Docker image
docker build -t mcp-memory-service .
# Run the container
# Note: On macOS, paths must be within Docker's allowed file sharing locations
# Default allowed locations include:
# - /Users
# - /Volumes
# - /private
# - /tmp
# - /var/folders
# Example with proper macOS paths:
docker run -it \
  -v $HOME/mcp-memory/chroma_db:/app/chroma_db \
  -v $HOME/mcp-memory/backups:/app/backups \
  mcp-memory-service
# For production use, you might want to run it in detached mode:
docker run -d \
  -v $HOME/mcp-memory/chroma_db:/app/chroma_db \
  -v $HOME/mcp-memory/backups:/app/backups \
  --name mcp-memory \
  mcp-memory-service
To configure Docker’s file sharing on macOS:
- Open Docker Desktop
- Go to Settings (Preferences)
- Navigate to Resources -> File Sharing
- Add any additional paths you need to share
- Click “Apply & Restart”
Smithery Integration
The service is configured for Smithery integration through smithery.yaml. This configuration enables stdio-based communication with MCP clients like Claude Desktop.
To use with Smithery:
- Ensure your claude_desktop_config.json points to the correct paths:
{
  "memory": {
    "command": "docker",
    "args": [
      "run",
      "-i",
      "--rm",
      "-v", "$HOME/mcp-memory/chroma_db:/app/chroma_db",
      "-v", "$HOME/mcp-memory/backups:/app/backups",
      "mcp-memory-service"
    ],
    "env": {
      "MCP_MEMORY_CHROMA_PATH": "/app/chroma_db",
      "MCP_MEMORY_BACKUPS_PATH": "/app/backups"
    }
  }
}
- The smithery.yaml configuration handles stdio communication and environment setup automatically.
Testing with Claude Desktop
To verify your Docker-based memory service is working correctly with Claude Desktop:
- Build the Docker image with docker build -t mcp-memory-service .
- Create the necessary directories for persistent storage:
mkdir -p $HOME/mcp-memory/chroma_db $HOME/mcp-memory/backups
- Update your Claude Desktop configuration file:
  - On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  - On Windows: %APPDATA%\Claude\claude_desktop_config.json
  - On Linux: ~/.config/Claude/claude_desktop_config.json
- Restart Claude Desktop
- When Claude starts up, you should see the memory service initialize with the message "MCP Memory Service initialization completed"
- Test the memory feature:
- Ask Claude to remember something: “Please remember that my favorite color is blue”
- Later in the conversation or in a new conversation, ask: “What is my favorite color?”
- Claude should retrieve the information from the memory service
If you experience any issues:
- Check the Claude Desktop console for error messages
- Verify Docker has the necessary permissions to access the mounted directories
- Ensure the Docker container is running with the correct parameters
- Try running the container manually to see any error output
For detailed installation instructions, platform-specific guides, and troubleshooting, see our documentation:
- Installation Guide - Comprehensive installation instructions for all platforms
- Troubleshooting Guide - Solutions for common issues
- Technical Documentation - Detailed technical procedures and specifications
- Scripts Documentation - Overview of available scripts and their usage
VPS Deployment Requirements
If you’re planning to deploy the Memory Service to a Virtual Private Server (VPS), here are the recommended specifications and configuration guidelines.
Minimum System Requirements
Resource | Minimum | Recommended | Notes |
---|---|---|---|
CPU | 2 vCPU cores | 4+ vCPU cores | Modern x86_64 or ARM64 compatible |
RAM | 4 GB | 8+ GB | Memory usage depends on embedding model size |
Storage | 10 GB | 20+ GB | SSD recommended for ChromaDB performance |
OS | Any Linux with Python 3.10+ | Ubuntu 22.04+ | macOS and Windows also supported |
Network | Basic connectivity | 100+ Mbps | Low bandwidth once initialized |
Key Considerations for VPS Deployment
Memory Usage Breakdown
The memory footprint consists of four main components:
- Sentence Transformer Model: ~400-600 MB (depends on model)
- ChromaDB: ~200-500 MB (depends on database size)
- Python Runtime + Dependencies: ~200-300 MB
- Working Memory for Embeddings: ~100-500 MB (depends on batch size)
For most deployments, 4 GB RAM is sufficient for basic operation, but 8 GB provides better performance for larger memory stores.
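The component ranges above can be totaled to sanity-check RAM sizing. A minimal sketch (the values come straight from the list above; anything else is illustrative):

```python
# Rough RAM sizing check for the memory service, using the component
# ranges listed above (all values in MB).
components = {
    "sentence_transformer_model": (400, 600),
    "chromadb": (200, 500),
    "python_runtime_and_deps": (200, 300),
    "embedding_working_memory": (100, 500),
}

low = sum(lo for lo, hi in components.values())
high = sum(hi for lo, hi in components.values())

print(f"Expected service footprint: {low}-{high} MB")
# Even at the high end (~1.9 GB), a 4 GB VPS leaves headroom for the OS;
# 8 GB is comfortable for larger memory stores.
```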
Storage Requirements
- ChromaDB: Starts at ~100 MB, grows ~1 MB per 100 memories
- Backups: Plan for 2-3x the DB size for rotation
- Code + Dependencies: ~300-500 MB
- OS + Other: Variable based on your setup
A 10 GB SSD is adequate for getting started, but plan for growth if you anticipate storing many thousands of memories.
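Using the growth rate above (~100 MB base plus ~1 MB per 100 memories), expected database and backup sizes can be estimated with a quick sketch (the helper function is hypothetical, not part of the service):

```python
def estimated_db_size_mb(n_memories: int, base_mb: int = 100) -> float:
    """Estimate ChromaDB size using the ~1 MB per 100 memories rule of thumb."""
    return base_mb + n_memories / 100

# Backups are planned at 2-3x the database size for rotation.
db_mb = estimated_db_size_mb(50_000)
backups_mb = 3 * db_mb
print(f"db ~{db_mb:.0f} MB, backups ~{backups_mb:.0f} MB")
```

Even at 50,000 memories, database plus backups stay under 3 GB, so a 10 GB SSD covers the OS, dependencies, and data with room to grow.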
Recommended Docker Deployment for VPS
For VPS deployment, Docker is strongly recommended to simplify installation and management:
# On your VPS
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
# Create directories for persistent storage
mkdir -p ./data/chroma_db ./data/backups
# Add resource constraints appropriate for your VPS
docker build -t mcp-memory-service .
docker run -d \
  --name mcp-memory \
  --restart unless-stopped \
  --memory=4g \
  --cpus=2 \
  -p 8000:8000 \
  -v $(pwd)/data/chroma_db:/app/chroma_db \
  -v $(pwd)/data/backups:/app/backups \
  -e MCP_MEMORY_CHROMA_PATH=/app/chroma_db \
  -e MCP_MEMORY_BACKUPS_PATH=/app/backups \
  -e MCP_MEMORY_MODEL_NAME=paraphrase-MiniLM-L6-v2 \
  -e MCP_MEMORY_BATCH_SIZE=4 \
  mcp-memory-service
This configuration:
- Limits the container to 4GB RAM and 2 CPUs
- Uses a balanced model for good performance/resource usage
- Mounts persistent volumes for data storage
- Automatically restarts the service if it crashes or on system reboot
Optimizations for Low-Resource VPS Environments
If running on a constrained VPS with limited resources:
Use a smaller embedding model:
export MCP_MEMORY_MODEL_NAME=paraphrase-MiniLM-L3-v2 # Much smaller model
Reduce batch size:
export MCP_MEMORY_BATCH_SIZE=4 # Default is 8
Enable CPU optimization:
export MCP_MEMORY_USE_ONNX=1 # Uses ONNX runtime which can be faster on CPU
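These overrides are plain environment variables. A hedged sketch of how a service might read them with the documented defaults (the variable names come from this README; the parsing logic is illustrative, not the service's actual code):

```python
import os

def read_tuning_config() -> dict:
    """Read the low-resource tuning knobs described above, falling back
    to the documented defaults when a variable is unset."""
    return {
        "model_name": os.environ.get("MCP_MEMORY_MODEL_NAME", "paraphrase-MiniLM-L6-v2"),
        "batch_size": int(os.environ.get("MCP_MEMORY_BATCH_SIZE", "8")),
        "use_onnx": os.environ.get("MCP_MEMORY_USE_ONNX", "0") == "1",
    }

# Simulate the exports shown above, then read them back.
os.environ["MCP_MEMORY_BATCH_SIZE"] = "4"
os.environ["MCP_MEMORY_USE_ONNX"] = "1"
cfg = read_tuning_config()
print(cfg)
```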
For more detailed VPS deployment information, see issue #22.
Configuration
Standard Configuration (Recommended)
Add the following to your claude_desktop_config.json file to use UV (recommended for best performance):
{
  "memory": {
    "command": "uv",
    "args": [
      "--directory",
      "your_mcp_memory_service_directory", // e.g., "C:\\REPOSITORIES\\mcp-memory-service"
      "run",
      "memory"
    ],
    "env": {
      "MCP_MEMORY_CHROMA_PATH": "your_chroma_db_path", // e.g., "C:\\Users\\John.Doe\\AppData\\Local\\mcp-memory\\chroma_db"
      "MCP_MEMORY_BACKUPS_PATH": "your_backups_path" // e.g., "C:\\Users\\John.Doe\\AppData\\Local\\mcp-memory\\backups"
    }
  }
}
Windows-Specific Configuration (Recommended)
For Windows users, we recommend using the wrapper script to ensure PyTorch is properly installed. See our Windows Setup Guide for detailed instructions.
{
  "memory": {
    "command": "python",
    "args": [
      "C:\\path\\to\\mcp-memory-service\\memory_wrapper.py"
    ],
    "env": {
      "MCP_MEMORY_CHROMA_PATH": "C:\\Users\\YourUsername\\AppData\\Local\\mcp-memory\\chroma_db",
      "MCP_MEMORY_BACKUPS_PATH": "C:\\Users\\YourUsername\\AppData\\Local\\mcp-memory\\backups"
    }
  }
}
The wrapper script will:
- Check if PyTorch is installed and properly configured
- Install PyTorch with the correct index URL if needed
- Run the memory server with the appropriate configuration
Hardware Compatibility
Platform | Architecture | Accelerator | Status |
---|---|---|---|
macOS | Apple Silicon (M1/M2/M3) | MPS | |
macOS | Apple Silicon under Rosetta 2 | CPU | |
macOS | Intel | CPU | |
Windows | x86_64 | CUDA | |
Windows | x86_64 | DirectML | |
Windows | x86_64 | CPU | |
Linux | x86_64 | CUDA | |
Linux | x86_64 | ROCm | |
Linux | x86_64 | CPU | |
Linux | ARM64 | CPU |
Note for macOS 15 (Sequoia) users: If you’re using macOS 15 with Apple Silicon and encounter PyTorch package installation errors when using UV, you might need to:
- Use pip instead of UV: pip install -e .
- Use conda to create a dedicated environment: conda create -n memory-service python=3.10 && conda activate memory-service && conda install pytorch -c pytorch
- Specify a compatible torch version: pip install torch==2.0.1 --find-links https://download.pytorch.org/whl/torch_stable.html
This issue happens because PyTorch wheels for macOS 15 on arm64 are still being updated. See issue #29 for more details.
Memory Operations
The memory service provides the following operations through the MCP server:
Core Memory Operations
- store_memory - Store new information with optional tags
- retrieve_memory - Perform semantic search for relevant memories
- recall_memory - Retrieve memories using natural language time expressions
- search_by_tag - Find memories using specific tags
- exact_match_retrieve - Find memories with exact content match
- debug_retrieve - Retrieve memories with similarity scores
For detailed information about tag storage and management, see our Tag Storage Documentation.
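To illustrate how recall_memory's natural-language time expressions might map onto a query window, here is a hedged sketch. The real parser handles far more phrases; this toy version covers just two, and the function name is hypothetical:

```python
from datetime import datetime, timedelta

def time_window(expression: str, now: datetime) -> tuple[datetime, datetime]:
    """Map a natural-language time expression to a (start, end) query window.
    Illustrative only -- the service supports many more expressions."""
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if expression == "yesterday":
        return today - timedelta(days=1), today
    if expression == "last week":
        # Monday-to-Monday window of the previous calendar week.
        start_of_this_week = today - timedelta(days=today.weekday())
        return start_of_this_week - timedelta(weeks=1), start_of_this_week
    raise ValueError(f"unsupported expression: {expression}")

now = datetime(2025, 4, 21, 15, 30)  # a Monday afternoon
start, end = time_window("yesterday", now)
print(start, end)  # midnight April 20 to midnight April 21
```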
Database Management
- create_backup - Create database backup
- get_stats - Get memory statistics
- optimize_db - Optimize database performance
- check_database_health - Get database health metrics
- check_embedding_model - Verify model status
Memory Management
- delete_memory - Delete specific memory by hash
- delete_by_tag - Delete all memories with specific tag
- cleanup_duplicates - Remove duplicate entries
Configuration Options
Configure through environment variables:
CHROMA_DB_PATH: Path to ChromaDB storage
BACKUP_PATH: Path for backups
AUTO_BACKUP_INTERVAL: Backup interval in hours (default: 24)
MAX_MEMORIES_BEFORE_OPTIMIZE: Threshold for auto-optimization (default: 10000)
SIMILARITY_THRESHOLD: Default similarity threshold (default: 0.7)
MAX_RESULTS_PER_QUERY: Maximum results per query (default: 10)
BACKUP_RETENTION_DAYS: Number of days to keep backups (default: 7)
LOG_LEVEL: Logging level (default: INFO)
# Hardware-specific environment variables
PYTORCH_ENABLE_MPS_FALLBACK: Enable MPS fallback for Apple Silicon (default: 1)
MCP_MEMORY_USE_ONNX: Use ONNX Runtime for CPU-only deployments (default: 0)
MCP_MEMORY_USE_DIRECTML: Use DirectML for Windows acceleration (default: 0)
MCP_MEMORY_MODEL_NAME: Override the default embedding model
MCP_MEMORY_BATCH_SIZE: Override the default batch size
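To show what SIMILARITY_THRESHOLD controls, here is a minimal cosine-similarity filter over toy 2-D vectors. The real service compares sentence-transformer embeddings; the vectors and names below are purely illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

SIMILARITY_THRESHOLD = 0.7  # the documented default

query = [1.0, 0.0]
candidates = {"close_match": [0.9, 0.1], "unrelated": [0.0, 1.0]}

# Keep only candidates at or above the threshold, as the service does
# when filtering semantic search results.
hits = {name: cosine(query, vec)
        for name, vec in candidates.items()
        if cosine(query, vec) >= SIMILARITY_THRESHOLD}
print(hits)
```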
Getting Help
If you encounter any issues:
- Check our Troubleshooting Guide
- Review the Installation Guide
- For Windows-specific issues, see our Windows Setup Guide
- Contact the developer via Telegram: t.me/doobeedoo
Project Structure
mcp-memory-service/
├── src/mcp_memory_service/ # Core package code
│ ├── __init__.py
│ ├── config.py # Configuration utilities
│ ├── models/ # Data models
│ ├── storage/ # Storage implementations
│ ├── utils/ # Utility functions
│ └── server.py # Main MCP server
├── scripts/ # Helper scripts
│ ├── convert_to_uv.py # Script to migrate to UV
│ └── install_uv.py # UV installation helper
├── .uv/ # UV configuration
├── memory_wrapper.py # Windows wrapper script
├── memory_wrapper_uv.py # UV-based wrapper script
├── uv_wrapper.py # UV wrapper script
├── install.py # Enhanced installation script
└── tests/ # Test suite
Development Guidelines
- Python 3.10+ with type hints
- Use dataclasses for models
- Triple-quoted docstrings for modules and functions
- Async/await pattern for all I/O operations
- Follow PEP 8 style guidelines
- Include tests for new features
License
MIT License - See LICENSE file for details
Acknowledgments
- ChromaDB team for the vector database
- Sentence Transformers project for embedding models
- MCP project for the protocol specification
Contact
t.me/doobidoo
Cloudflare Worker Implementation
A serverless implementation of the MCP Memory Service is now available using Cloudflare Workers. This implementation:
- Uses Cloudflare D1 for storage (serverless SQLite)
- Uses Workers AI for embeddings generation
- Communicates via Server-Sent Events (SSE) for MCP protocol
- Requires no local installation or dependencies
- Scales automatically with usage
Benefits of the Cloudflare Implementation
- Zero local installation: No Python, dependencies, or local storage needed
- Cross-platform compatibility: Works on any device that can connect to the internet
- Automatic scaling: Handles multiple users without configuration
- Global distribution: Low latency access from anywhere
- No maintenance: Updates and maintenance handled automatically
Available Tools in the Cloudflare Implementation
The Cloudflare Worker implementation supports all the same tools as the Python implementation:
Tool | Description |
---|---|
store_memory | Store new information with optional tags |
retrieve_memory | Find relevant memories based on query |
recall_memory | Retrieve memories using natural language time expressions |
search_by_tag | Search memories by tags |
delete_memory | Delete a specific memory by its hash |
delete_by_tag | Delete all memories with a specific tag |
cleanup_duplicates | Find and remove duplicate entries |
get_embedding | Get raw embedding vector for content |
check_embedding_model | Check if embedding model is loaded and working |
debug_retrieve | Retrieve memories with debug information |
exact_match_retrieve | Retrieve memories using exact content match |
check_database_health | Check database health and get statistics |
recall_by_timeframe | Retrieve memories within a specific timeframe |
delete_by_timeframe | Delete memories within a specific timeframe |
delete_before_date | Delete memories before a specific date |
Configuring Claude to Use the Cloudflare Memory Service
Add the following to your Claude configuration to use the Cloudflare-based memory service:
{
  "mcpServers": [
    {
      "name": "cloudflare-memory",
      "url": "https://your-worker-subdomain.workers.dev/mcp",
      "type": "sse"
    }
  ]
}
Replace your-worker-subdomain with your actual Cloudflare Worker subdomain.
Deploying Your Own Cloudflare Memory Service
Clone the repository and navigate to the Cloudflare Worker directory:
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service/cloudflare_worker
Install Wrangler (Cloudflare’s CLI tool):
npm install -g wrangler
Login to your Cloudflare account:
wrangler login
Create a D1 database:
wrangler d1 create mcp_memory_service
Update the wrangler.toml file with your database ID from the previous step.
Initialize the database schema:
wrangler d1 execute mcp_memory_service --local --file=./schema.sql
Where schema.sql contains:
CREATE TABLE IF NOT EXISTS memories (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    embedding TEXT NOT NULL,
    tags TEXT,
    memory_type TEXT,
    metadata TEXT,
    created_at INTEGER
);
CREATE INDEX IF NOT EXISTS idx_created_at ON memories(created_at);
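Since D1 is SQLite-compatible, the schema above can be exercised locally with Python's sqlite3 module as a sanity check before deploying. The row values below are illustrative:

```python
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
    id TEXT PRIMARY KEY, content TEXT NOT NULL, embedding TEXT NOT NULL,
    tags TEXT, memory_type TEXT, metadata TEXT, created_at INTEGER
);
CREATE INDEX IF NOT EXISTS idx_created_at ON memories(created_at);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

# Insert one memory; embedding and tags are stored as JSON text,
# matching the TEXT columns in the schema.
db.execute(
    "INSERT INTO memories VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("mem-1", "This is a test memory", json.dumps([0.1, 0.2]),
     json.dumps(["test"]), "note", "{}", int(time.time())),
)
row = db.execute(
    "SELECT content, tags FROM memories WHERE id = 'mem-1'"
).fetchone()
print(row)
```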
Deploy the worker:
wrangler deploy
Update your Claude configuration to use your new worker URL.
Testing Your Cloudflare Memory Service
After deployment, you can test your memory service using curl:
List available tools:
curl https://your-worker-subdomain.workers.dev/list_tools
Store a memory:
curl -X POST https://your-worker-subdomain.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -d '{"method":"store_memory","arguments":{"content":"This is a test memory","metadata":{"tags":["test"]}}}'
Retrieve memories:
curl -X POST https://your-worker-subdomain.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -d '{"method":"retrieve_memory","arguments":{"query":"test memory","n_results":5}}'
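The same request bodies can be built from Python. This sketch mirrors the payload shape shown in the curl examples above (the helper function is hypothetical; send the result with any HTTP client):

```python
import json

def mcp_payload(method: str, **arguments) -> str:
    """Serialize a tool call in the shape the worker's /mcp endpoint
    expects, per the curl examples above."""
    return json.dumps({"method": method, "arguments": arguments})

body = mcp_payload("retrieve_memory", query="test memory", n_results=5)
print(body)
```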
Limitations
- Free tier limits on Cloudflare Workers and D1 may apply
- Workers AI embedding models may differ slightly from the local sentence-transformers models
- No direct access to the underlying database for manual operations
- Cloudflare Workers have a maximum execution time of 30 seconds on free plans