Ollama MCP Server
An enhanced MCP (Model Context Protocol) server for interacting with the Ollama API, providing a robust bridge between Claude Desktop and locally-running LLMs via Ollama.
Features
- ✅ Complete Ollama API Coverage: All major Ollama endpoints implemented
- 🔄 Connection Pooling: Efficient HTTP client with connection reuse
- 🚦 Smart Retry Logic: Automatic retries with exponential backoff
- 📝 Comprehensive Logging: Configurable logging for debugging
- ⚡ Response Caching: Intelligent caching for improved performance
- 🛡️ Error Handling: Graceful error handling with helpful messages
- ⚙️ Flexible Configuration: Environment variables and .env file support
- 🌐 Smart Host Detection: Automatically detects localhost vs external network access
- 🔍 Type Safety: Full Pydantic validation for requests/responses
- 📊 Advanced Options: Support for temperature, top_p, seed, and more
- 🌊 Streaming Support: Real-time token streaming for long responses
Quick Start
macOS Instructions
Install Ollama and ensure it's running:

```bash
# Download from https://ollama.com or use:
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```

Install the MCP server:

```bash
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "/Users/mac_orion/mcp_server/ollama_mcp_server/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/Users/mac_orion/mcp_server/ollama_mcp_server"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Smithery Integration
Using with Smithery
Smithery provides a convenient way to install and manage MCP servers. The Ollama MCP server includes automatic network detection to work seamlessly with external tools like Smithery.
Installation via Smithery
```bash
npx -y @smithery/cli@latest install @cuba6112/ollama-mcp --client windsurf --key YOUR_KEY
```
Network Configuration
The server automatically detects the appropriate Ollama host:
- Local Development: Uses `http://localhost:11434` when Ollama is accessible locally
- External Access: Automatically detects your local network IP (e.g., `http://YOUR_LOCAL_IP:11434`) when localhost is not accessible
- Manual Override: Set the `OLLAMA_HOST` environment variable for custom configurations
Ensuring Ollama External Access
For Smithery and other external tools to connect to your local Ollama instance:
Start Ollama bound to all network interfaces by setting the `OLLAMA_HOST` environment variable:

```bash
export OLLAMA_HOST=0.0.0.0
ollama serve
```

Verify connectivity:

```bash
# Test from another machine or tool
curl http://YOUR_LOCAL_IP:11434/api/tags
```
Troubleshooting Smithery Connection
If Smithery cannot connect to your Ollama instance:
- Check that Ollama is accepting external connections: start it with `OLLAMA_HOST=0.0.0.0 ollama serve`
- Verify firewall settings: ensure port 11434 is not blocked
- Test network connectivity: try accessing `http://YOUR_LOCAL_IP:11434` from another device
- Check server logs: set `OLLAMA_LOG_LEVEL=DEBUG` for detailed connection information
Linux Instructions
Install Ollama and ensure it's running:

```bash
# Download from https://ollama.com or use:
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```

Install the MCP server:

```bash
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "/path/to/your/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/path/to/ollama-mcp"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Windows Instructions
Install Ollama: Download and install from the official Ollama website.
Install the MCP server:

```cmd
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file (note that backslashes must be doubled in JSON):

```json
{
  "mcpServers": {
    "ollama": {
      "command": "C:\\path\\to\\your\\venv\\Scripts\\python.exe",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "C:\\path\\to\\ollama-mcp"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Prerequisites
- Python 3.9+ installed.
- Ollama installed and running on your local machine. You can download it from ollama.com.
- `uv` or `pip` installed for package management.
Installation
Navigate to the project directory:

```bash
cd /Users/mac_orion/mcp_server/ollama_mcp_server
```

Create a virtual environment using `venv`:

```bash
python -m venv .venv
source .venv/bin/activate
```

Install dependencies using `pip`:

```bash
pip install -e .
```

Or using `uv`:

```bash
uv pip install -e .
```
Running the Server
Once the dependencies are installed, you can run the server directly:

```bash
python -m ollama_mcp_server.main
```

Or use the `mcp dev` tool for development:

```bash
mcp dev ollama_mcp_server/main.py
```
Claude Desktop Configuration
To use this server with the Claude Desktop app, you need to add it to your configuration file.
On macOS, edit `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "ollama": {
      "command": "/path/to/your/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/Users/mac_orion/mcp_server/ollama_mcp_server"
    }
  }
}
```
Replace `/path/to/your/venv/bin/python` with the actual path to your Python executable.
Available Tools
Core Tools
- `list_models`: List all available Ollama models with size and modification info
- `show_model`: Get detailed information about a specific model
- `check_model_exists`: Check if a model exists locally
Generation Tools
- `generate_completion`: Generate text completions with advanced options
  - Supports temperature, top_p, top_k, seed, num_predict, stop sequences
  - Streaming support for real-time responses
- `generate_chat_completion`: Generate chat responses with conversation history
  - Full message history support (system, user, assistant roles)
  - Same advanced options as completion
- `generate_embeddings`: Create embeddings for text (supports both single strings and lists)
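For orientation, Ollama's `/api/chat` endpoint (which the chat tool is built on) takes the model name, the message history, and any advanced options in a single JSON payload. A minimal sketch of assembling one — the model name `llama3.2` is just an example; use any model you have pulled locally:

```python
import json

def build_chat_payload(messages, model="llama3.2", stream=False, **options):
    """Assemble a JSON payload for Ollama's /api/chat endpoint.

    Advanced options (temperature, top_p, top_k, seed, ...) go under
    the "options" key rather than at the top level.
    """
    return {
        "model": model,
        "messages": messages,
        "stream": stream,
        "options": options,
    }

history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize what an MCP server does."},
]
payload = build_chat_payload(history, temperature=0.7, seed=42)
print(json.dumps(payload, indent=2))
```

POST the payload to `{host}/api/chat` (e.g., `http://localhost:11434/api/chat`) to get a response.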
Model Management
- `pull_model`: Download models from the Ollama library
- `copy_model`: Duplicate a model with a new name
- `delete_model`: Remove models from local storage
- `list_running_models`: Show currently loaded models in memory
Configuration
The server can be configured using environment variables or a .env file:
Connection Settings
```bash
# Ollama host - automatically detected by default
OLLAMA_HOST=http://localhost:11434   # Manual override for Ollama API URL
OLLAMA_REQUEST_TIMEOUT=30.0          # Request timeout in seconds
OLLAMA_CONNECTION_TIMEOUT=5.0        # Connection timeout in seconds
OLLAMA_MAX_RETRIES=3                 # Max retry attempts
OLLAMA_RETRY_DELAY=1.0               # Initial retry delay in seconds
```
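The retry settings pair an attempt limit with an initial delay that grows exponentially between attempts. A minimal sketch of that pattern — the 2x backoff factor is an assumption for illustration, not taken from the server's source:

```python
import time

def with_retries(fn, max_retries=3, initial_delay=1.0, backoff=2.0,
                 sleep=time.sleep):
    """Call fn(), retrying on ConnectionError with exponential backoff.

    Mirrors OLLAMA_MAX_RETRIES / OLLAMA_RETRY_DELAY: one initial call
    plus up to max_retries retries, doubling the wait each time.
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                raise          # out of retries; propagate the error
            sleep(delay)       # sleep is injectable for testing
            delay *= backoff
```

With the defaults above, a request that keeps failing waits 1.0s, 2.0s, then 4.0s before giving up.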
Host Auto-Detection
The server automatically detects the appropriate Ollama host:
- Environment Variable: If `OLLAMA_HOST` is set, uses that value
- Localhost Test: Tries to connect to `http://localhost:11434`
- Network Detection: If localhost fails, automatically detects the local network IP
- Fallback: Uses localhost as a final fallback
This ensures seamless operation in both local development and external access scenarios (like Smithery).
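The detection order above can be sketched with the standard library alone. This is an illustration of the algorithm as described, not the server's actual implementation; the UDP-socket trick for finding the local IP is a common idiom (no packets are actually sent):

```python
import os
import socket

def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def local_ip() -> str:
    """Best-effort local network IP via a connected UDP socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 80))   # no traffic sent; just picks a route
        return s.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        s.close()

def detect_ollama_host(port: int = 11434) -> str:
    # 1. An explicit OLLAMA_HOST environment variable always wins
    env = os.environ.get("OLLAMA_HOST")
    if env:
        return env
    # 2. Try localhost first
    if port_open("127.0.0.1", port):
        return f"http://localhost:{port}"
    # 3. Fall back to the local network IP, then localhost
    ip = local_ip()
    if port_open(ip, port):
        return f"http://{ip}:{port}"
    return f"http://localhost:{port}"
```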
Other Settings
```bash
# Logging
OLLAMA_LOG_LEVEL=INFO        # Log level (DEBUG, INFO, WARNING, ERROR)
OLLAMA_LOG_REQUESTS=false    # Log all API requests/responses

# Performance
OLLAMA_ENABLE_CACHE=true     # Enable response caching
OLLAMA_CACHE_TTL=300         # Cache TTL in seconds
```
Copy `.env.example` to `.env` and customize as needed.
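Conceptually, `OLLAMA_CACHE_TTL` means a cached response is reused until its age exceeds the TTL, after which it is evicted and refetched. A tiny sketch of that semantics (an illustration only, not the server's actual cache code; the injectable clock exists purely to make the behavior easy to demonstrate):

```python
import time

class TTLCache:
    """Minimal time-to-live cache illustrating OLLAMA_CACHE_TTL semantics."""

    def __init__(self, ttl: float = 300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock           # injectable for testing
        self._store = {}             # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:   # expired: evict and report a miss
            del self._store[key]
            return default
        return value
```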
Troubleshooting
MCP Server Not Connecting
If Claude Desktop shows connection errors:
- Restart Claude Desktop after making configuration changes
- Check that Ollama is running: `ollama ps` should show running models
- Verify that the Python path in your Claude Desktop config is correct
- Check logs by setting `OLLAMA_LOG_LEVEL=DEBUG` in your `.env` file
Schema Mismatch Errors
If you see parameter-related errors:
- Restart the MCP server completely
- Restart Claude Desktop
- Check that all dependencies are installed: `pip install -r requirements.txt`
Connection to Ollama Fails
If the server can't connect to Ollama:

- Ensure Ollama is running: `ollama serve`
- Check the Ollama URL in your configuration
- Try accessing Ollama directly: `curl http://localhost:11434/api/tags`
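The same check can be scripted with the standard library. `/api/tags` returns a JSON object with a `models` list; this probe returns the model names if Ollama answers, or `None` on any failure:

```python
import json
import urllib.error
import urllib.request

def ollama_models(base_url="http://localhost:11434", timeout=3.0):
    """Return model names from /api/tags, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags",
                                    timeout=timeout) as resp:
            data = json.load(resp)
        return [m.get("name") for m in data.get("models", [])]
    except (urllib.error.URLError, OSError, ValueError):
        return None   # connection refused, timeout, or malformed response
```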
Performance Issues
For better performance:
- Enable caching: `OLLAMA_ENABLE_CACHE=true`
- Adjust the cache TTL: `OLLAMA_CACHE_TTL=600`
- Increase the timeout for large models: `OLLAMA_REQUEST_TIMEOUT=60`
Testing
Test the server directly using the MCP dev tool:

```bash
mcp dev ollama_mcp_server/main.py
```

Or run the server and test individual tools:

```bash
# Start the server
python -m ollama_mcp_server.main

# In another terminal, test with Claude Desktop or other MCP clients
# You can also check the example usage:
python examples/usage_example.py
```
This will verify that all core functions work correctly with your Ollama installation.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Anthropic for the Model Context Protocol specification
- Ollama for the excellent local LLM platform
- The MCP community for tools and documentation
Support
If you encounter any issues or have questions:
- Check the troubleshooting section above
- Look through existing GitHub issues
- Create a new issue with detailed information about your problem
Project Details
- cuba6112/ollama-mcp
- MIT License
- Last Updated: 6/13/2025