Ollama MCP Server
An enhanced MCP (Model Context Protocol) server for interacting with the Ollama API, providing a robust bridge between Claude Desktop and locally-running LLMs via Ollama.
Features
- Complete Ollama API Coverage: All major Ollama endpoints implemented
- Connection Pooling: Efficient HTTP client with connection reuse
- Smart Retry Logic: Automatic retries with exponential backoff (see the sketch after this list)
- Comprehensive Logging: Configurable logging for debugging
- Response Caching: Intelligent caching for improved performance
- Error Handling: Graceful error handling with helpful messages
- Flexible Configuration: Environment variables and .env file support
- Smart Host Detection: Automatically detects localhost vs. external network access
- Type Safety: Full Pydantic validation for requests/responses
- Advanced Options: Support for temperature, top_p, seed, and more
- Streaming Support: Real-time token streaming for long responses
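As a rough illustration of the retry behavior (a minimal sketch assuming the `httpx` client, not the project's exact code), a retry loop with exponential backoff looks like this; the defaults mirror the `OLLAMA_MAX_RETRIES` and `OLLAMA_RETRY_DELAY` settings documented under Configuration below:

```python
import time
import httpx

def request_with_retries(url: str, max_retries: int = 3, retry_delay: float = 1.0) -> httpx.Response:
    """GET a URL, retrying transient failures with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            response = httpx.get(url, timeout=30.0)
            response.raise_for_status()
            return response
        except (httpx.TransportError, httpx.HTTPStatusError):
            if attempt == max_retries:
                raise  # out of retries, surface the error
            time.sleep(retry_delay * (2 ** attempt))  # waits 1s, 2s, 4s, ...

# Example: list local models
models = request_with_retries("http://localhost:11434/api/tags").json()
```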
Quick Start
macOS Instructions
Install Ollama and ensure it’s running:
```
# Download from https://ollama.com or use:
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```
Install the MCP server:
```
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Configure Claude Desktop by adding to your config file:
{ "mcpServers": { "ollama": { "command": "/Users/mac_orion/mcp_server/ollama_mcp_server/venv/bin/python", "args": ["-m", "ollama_mcp_server.main"], "cwd": "/Users/mac_orion/mcp_server/ollama_mcp_server" } } }
Restart Claude Desktop and start using Ollama models!
Smithery Integration
Using with Smithery
Smithery provides a convenient way to install and manage MCP servers. The Ollama MCP server includes automatic network detection to work seamlessly with external tools like Smithery.
Installation via Smithery
```
npx -y @smithery/cli@latest install @cuba6112/ollama-mcp --client windsurf --key YOUR_KEY
```
Network Configuration
The server automatically detects the appropriate Ollama host:
- Local Development: Uses `http://localhost:11434` when Ollama is accessible locally
- External Access: Automatically detects your local network IP (e.g., `http://YOUR_LOCAL_IP:11434`) when localhost is not accessible
- Manual Override: Set the `OLLAMA_HOST` environment variable for custom configurations
Ensuring Ollama External Access
For Smithery and other external tools to connect to your local Ollama instance:
Start Ollama bound to all interfaces. Note that `ollama serve` takes no `--host` flag; external binding is configured through the `OLLAMA_HOST` environment variable:

```
export OLLAMA_HOST=0.0.0.0
ollama serve
```
Verify connectivity:
```
# Test from another machine or tool
curl http://YOUR_LOCAL_IP:11434/api/tags
```
Troubleshooting Smithery Connection
If Smithery cannot connect to your Ollama instance:
- Check Ollama is accepting external connections: start it with `OLLAMA_HOST=0.0.0.0 ollama serve`
- Verify firewall settings: ensure port 11434 is not blocked
- Test network connectivity: try accessing `http://YOUR_LOCAL_IP:11434` from another device
- Check server logs: set `OLLAMA_LOG_LEVEL=DEBUG` for detailed connection information
Linux Instructions
Install Ollama and ensure it’s running:
```
# Download from https://ollama.com or use:
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```
Install the MCP server:
```
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Configure Claude Desktop by adding to your config file:
{ "mcpServers": { "ollama": { "command": "/path/to/your/venv/bin/python", "args": ["-m", "ollama_mcp_server.main"], "cwd": "/path/to/ollama-mcp" } } }
Restart Claude Desktop and start using Ollama models!
Windows Instructions
Install Ollama: Download and install from the official Ollama website.
Install the MCP server:
```
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```
Configure Claude Desktop by adding to your config file:
{ "mcpServers": { "ollama": { "command": "C:\path\to\your\venv\Scripts\python.exe", "args": ["-m", "ollama_mcp_server.main"], "cwd": "C:\path\to\ollama-mcp" } } }
Restart Claude Desktop and start using Ollama models!
Prerequisites
- Python 3.9+ installed
- Ollama installed and running on your local machine (download it from ollama.com)
- `uv` or `pip` installed for package management
Installation
Navigate to the project directory:
```
cd /Users/mac_orion/mcp_server/ollama_mcp_server
```
Create a virtual environment using `venv`:

```
python -m venv .venv
source .venv/bin/activate
```

Install dependencies using `pip`:

```
pip install -e .
```

Or using `uv`:

```
uv pip install -e .
```
Running the Server
Once the dependencies are installed, you can run the server directly:
```
python -m ollama_mcp_server.main
```
Or use the mcp dev tool for development:
```
mcp dev ollama_mcp_server/main.py
```
Claude Desktop Configuration
To use this server with the Claude Desktop app, you need to add it to your configuration file.
On macOS, edit `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "ollama": {
      "command": "/path/to/your/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/Users/mac_orion/mcp_server/ollama_mcp_server"
    }
  }
}
```
Replace `/path/to/your/venv/bin/python` with the actual path to your Python executable.
Available Tools
Core Tools
- `list_models`: List all available Ollama models with size and modification info
- `show_model`: Get detailed information about a specific model
- `check_model_exists`: Check if a model exists locally
Generation Tools
- `generate_completion`: Generate text completions with advanced options (see the request sketch after this list)
  - Supports temperature, top_p, top_k, seed, num_predict, and stop sequences
  - Streaming support for real-time responses
- `generate_chat_completion`: Generate chat responses with conversation history
  - Full message history support (system, user, assistant roles)
  - Same advanced options as completion
- `generate_embeddings`: Create embeddings for text (supports both single strings and lists)
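The advanced options above correspond to the request options of Ollama's `/api/generate` endpoint, so a sketch of the kind of request the completion tools ultimately issue looks like this (the model name is an example, and the tool's internals may differ):

```python
import httpx

payload = {
    "model": "llama3.2",           # any model you have pulled locally
    "prompt": "Why is the sky blue?",
    "stream": False,               # set True to receive tokens incrementally
    "options": {
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "seed": 42,
        "num_predict": 128,        # maximum number of tokens to generate
        "stop": ["\n\n"],          # stop sequences
    },
}
response = httpx.post("http://localhost:11434/api/generate", json=payload, timeout=60.0)
print(response.json()["response"])
```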
Model Management
- `pull_model`: Download models from the Ollama library (see the sketch after this list)
- `copy_model`: Duplicate a model with a new name
- `delete_model`: Remove models from local storage
- `list_running_models`: Show currently loaded models in memory
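For reference, `pull_model` maps onto Ollama's `/api/pull` endpoint, which streams one JSON status object per line while the download progresses. A rough sketch, assuming `httpx` and an illustrative model name:

```python
import json
import httpx

# /api/pull streams a JSON status object per line until the pull completes
with httpx.stream("POST", "http://localhost:11434/api/pull",
                  json={"model": "llama3.2"}, timeout=None) as response:
    for line in response.iter_lines():
        if line:
            print(json.loads(line).get("status"))
```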
Configuration
The server can be configured using environment variables or a `.env` file:
Connection Settings
```
# Ollama host - automatically detected by default
OLLAMA_HOST=http://localhost:11434   # Manual override for Ollama API URL
OLLAMA_REQUEST_TIMEOUT=30.0          # Request timeout in seconds
OLLAMA_CONNECTION_TIMEOUT=5.0        # Connection timeout in seconds
OLLAMA_MAX_RETRIES=3                 # Max retry attempts
OLLAMA_RETRY_DELAY=1.0               # Initial retry delay
```
Host Auto-Detection
The server automatically detects the appropriate Ollama host:
- Environment Variable: If `OLLAMA_HOST` is set, uses that value
- Localhost Test: Otherwise, tries to connect to `http://localhost:11434`
- Network Detection: If localhost fails, automatically detects the local network IP
- Fallback: Uses localhost as the final fallback
This ensures seamless operation in both local development and external access scenarios (like Smithery).
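A minimal sketch of that detection order (illustrative only, not the server's exact code):

```python
import os
import socket
import httpx

def detect_ollama_host() -> str:
    # 1. An explicit OLLAMA_HOST environment variable wins
    if os.environ.get("OLLAMA_HOST"):
        return os.environ["OLLAMA_HOST"]
    # 2. Try localhost
    localhost = "http://localhost:11434"
    try:
        httpx.get(f"{localhost}/api/tags", timeout=5.0)
        return localhost
    except httpx.TransportError:
        pass
    # 3. Fall back to the local network IP (the UDP connect sends no packets;
    #    it just asks the OS which interface would route outward)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.connect(("8.8.8.8", 80))
            return f"http://{s.getsockname()[0]}:11434"
    except OSError:
        return localhost  # 4. final fallback
```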
Other Settings
```
# Logging
OLLAMA_LOG_LEVEL=INFO        # Log level (DEBUG, INFO, WARNING, ERROR)
OLLAMA_LOG_REQUESTS=false    # Log all API requests/responses

# Performance
OLLAMA_ENABLE_CACHE=true     # Enable response caching
OLLAMA_CACHE_TTL=300         # Cache TTL in seconds
```
Copy `.env.example` to `.env` and customize as needed.
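For intuition, `OLLAMA_ENABLE_CACHE` and `OLLAMA_CACHE_TTL` govern a time-bounded response cache. The general pattern is a small TTL cache along these lines (a simplified sketch, not the server's actual implementation):

```python
import time

class TTLCache:
    """Tiny in-memory cache that expires entries after ttl seconds."""
    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # stale entry, drop it
            return None
        return value

    def set(self, key: str, value: object) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)

# Cache a model list for OLLAMA_CACHE_TTL seconds
cache = TTLCache(ttl=300.0)
cache.set("api/tags", {"models": []})
```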
Troubleshooting
MCP Server Not Connecting
If Claude Desktop shows connection errors:
- Restart Claude Desktop after making configuration changes
- Check Ollama is running: `ollama ps` should show running models
- Verify the Python path in your Claude Desktop config is correct
- Check logs by setting `OLLAMA_LOG_LEVEL=DEBUG` in your `.env` file
Schema Mismatch Errors
If you see parameter-related errors:
- Restart the MCP server completely
- Restart Claude Desktop
- Check that all dependencies are installed: `pip install -r requirements.txt`
Connection to Ollama Fails
If the server can’t connect to Ollama:
- Ensure Ollama is running: `ollama serve`
- Check the Ollama URL in your configuration
- Try accessing Ollama directly: `curl http://localhost:11434/api/tags`
Performance Issues
For better performance:
- Enable caching: `OLLAMA_ENABLE_CACHE=true`
- Adjust the cache TTL: `OLLAMA_CACHE_TTL=600`
- Increase the timeout for large models: `OLLAMA_REQUEST_TIMEOUT=60`
Testing
Test the server directly using the MCP dev tool:
```
mcp dev ollama_mcp_server/main.py
```
Or run the server and test individual tools:
```
# Start the server
python -m ollama_mcp_server.main

# In another terminal, test with Claude Desktop or other MCP clients.
# You can also check the example usage:
python examples/usage_example.py
```
This will verify that all core functions work correctly with your Ollama installation.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Anthropic for the Model Context Protocol specification
- Ollama for the excellent local LLM platform
- The MCP community for tools and documentation
Support
If you encounter any issues or have questions:
- Check the troubleshooting section above
- Look through existing GitHub issues
- Create a new issue with detailed information about your problem