Ollama MCP Server
An enhanced MCP (Model Context Protocol) server for interacting with the Ollama API, providing a robust bridge between Claude Desktop and locally-running LLMs via Ollama.
Features
- ✅ Complete Ollama API Coverage: All major Ollama endpoints implemented
- 🔄 Connection Pooling: Efficient HTTP client with connection reuse
- 🚦 Smart Retry Logic: Automatic retries with exponential backoff
- 📝 Comprehensive Logging: Configurable logging for debugging
- ⚡ Response Caching: Intelligent caching for improved performance
- 🛡️ Error Handling: Graceful error handling with helpful messages
- ⚙️ Flexible Configuration: Environment variables and .env file support
- 🌐 Smart Host Detection: Automatically detects localhost vs external network access
- 🔍 Type Safety: Full Pydantic validation for requests/responses
- 📊 Advanced Options: Support for temperature, top_p, seed, and more
- 🌊 Streaming Support: Real-time token streaming for long responses
Quick Start
macOS Instructions
Install Ollama and ensure it’s running:
```bash
# Download from https://ollama.com or use:
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```

Install the MCP server:

```bash
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "/Users/mac_orion/mcp_server/ollama_mcp_server/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/Users/mac_orion/mcp_server/ollama_mcp_server"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Smithery Integration
Using with Smithery
Smithery provides a convenient way to install and manage MCP servers. The Ollama MCP server includes automatic network detection to work seamlessly with external tools like Smithery.
Installation via Smithery
```bash
npx -y @smithery/cli@latest install @cuba6112/ollama-mcp --client windsurf --key YOUR_KEY
```
Network Configuration
The server automatically detects the appropriate Ollama host:
- Local Development: Uses `http://localhost:11434` when Ollama is accessible locally
- External Access: Automatically detects your local network IP (e.g., `http://YOUR_LOCAL_IP:11434`) when localhost is not accessible
- Manual Override: Set the `OLLAMA_HOST` environment variable for custom configurations
Ensuring Ollama External Access
For Smithery and other external tools to connect to your local Ollama instance:
Start Ollama with external binding by setting the `OLLAMA_HOST` environment variable (Ollama configures its bind address this way rather than via a command-line flag):

```bash
export OLLAMA_HOST=0.0.0.0
ollama serve
```

Verify connectivity:

```bash
# Test from another machine or tool
curl http://YOUR_LOCAL_IP:11434/api/tags
```
Troubleshooting Smithery Connection
If Smithery cannot connect to your Ollama instance:
- Check that Ollama is accepting external connections: start it with `OLLAMA_HOST=0.0.0.0 ollama serve`
- Verify firewall settings: ensure port 11434 is not blocked
- Test network connectivity: try accessing `http://YOUR_LOCAL_IP:11434` from another device
- Check server logs: set `OLLAMA_LOG_LEVEL=DEBUG` for detailed connection information
Linux Instructions
Install Ollama and ensure it’s running:
```bash
# Download from https://ollama.com or use:
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```

Install the MCP server:

```bash
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "/path/to/your/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/path/to/ollama-mcp"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Windows Instructions
Install Ollama: Download and install from the official Ollama website.
Install the MCP server:
```powershell
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file (note that backslashes must be doubled inside JSON strings):

```json
{
  "mcpServers": {
    "ollama": {
      "command": "C:\\path\\to\\your\\venv\\Scripts\\python.exe",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "C:\\path\\to\\ollama-mcp"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Prerequisites
- Python 3.9+ installed.
- Ollama installed and running on your local machine. You can download it from ollama.com.
- `uv` or `pip` installed for package management.
Installation
Navigate to the project directory:
```bash
cd /Users/mac_orion/mcp_server/ollama_mcp_server
```

Create a virtual environment using `venv`:

```bash
python -m venv .venv
source .venv/bin/activate
```

Install dependencies using `pip`:

```bash
pip install -e .
```

Or using `uv`:

```bash
uv pip install -e .
```
Running the Server
Once the dependencies are installed, you can run the server directly:
```bash
python -m ollama_mcp_server.main
```
Or use the mcp dev tool for development:
```bash
mcp dev ollama_mcp_server/main.py
```
Claude Desktop Configuration
To use this server with the Claude Desktop app, you need to add it to your configuration file.
On macOS, edit ~/Library/Application Support/Claude/claude_desktop_config.json:
```json
{
  "mcpServers": {
    "ollama": {
      "command": "/path/to/your/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/Users/mac_orion/mcp_server/ollama_mcp_server"
    }
  }
}
```
Replace /path/to/your/venv/bin/python with the actual path to your Python executable.
Available Tools
Core Tools
- `list_models`: List all available Ollama models with size and modification info
- `show_model`: Get detailed information about a specific model
- `check_model_exists`: Check if a model exists locally
Generation Tools
- `generate_completion`: Generate text completions with advanced options
  - Supports temperature, top_p, top_k, seed, num_predict, stop sequences
  - Streaming support for real-time responses
- `generate_chat_completion`: Generate chat responses with conversation history
  - Full message history support (system, user, assistant roles)
  - Same advanced options as completion
- `generate_embeddings`: Create embeddings for text (supports both single strings and lists)
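These generation tools sit on top of Ollama's HTTP API, where the advanced options travel in an `options` field of the request body. As a rough illustration (a hypothetical helper, not the server's internal code), a non-streaming chat request against a local Ollama instance looks like this:

```python
import json
import urllib.request

def build_chat_payload(model, messages, **options):
    """Build the JSON body for Ollama's /api/chat endpoint (non-streaming)."""
    return {
        "model": model,
        "messages": messages,   # list of {"role": ..., "content": ...} dicts
        "stream": False,
        "options": options,     # e.g. temperature, top_p, top_k, seed, num_predict
    }

def ollama_chat(model, messages, host="http://localhost:11434", **options):
    """POST a chat request to Ollama and return the assistant's reply text."""
    data = json.dumps(build_chat_payload(model, messages, **options)).encode()
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example (requires a running Ollama instance with the model pulled):
# ollama_chat("llama3", [{"role": "user", "content": "Hello"}], temperature=0.2)
```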
Model Management
- `pull_model`: Download models from the Ollama library
- `copy_model`: Duplicate a model with a new name
- `delete_model`: Remove models from local storage
- `list_running_models`: Show currently loaded models in memory
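Several of these tools correspond to simple Ollama endpoints; `list_models`, for instance, maps onto `GET /api/tags`. A minimal sketch of querying that endpoint directly (illustrative only, not the server's actual implementation):

```python
import json
import urllib.request

def parse_model_names(tags_response):
    """Extract model names from a decoded /api/tags response body."""
    return [m["name"] for m in tags_response.get("models", [])]

def list_local_models(host="http://localhost:11434"):
    """Fetch and return the names of locally installed models."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return parse_model_names(json.loads(resp.read()))
```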
Configuration
The server can be configured using environment variables or a .env file:
Connection Settings
```bash
# Ollama host - automatically detected by default
OLLAMA_HOST=http://localhost:11434   # Manual override for the Ollama API URL
OLLAMA_REQUEST_TIMEOUT=30.0          # Request timeout in seconds
OLLAMA_CONNECTION_TIMEOUT=5.0        # Connection timeout in seconds
OLLAMA_MAX_RETRIES=3                 # Max retry attempts
OLLAMA_RETRY_DELAY=1.0               # Initial retry delay in seconds
```
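`OLLAMA_MAX_RETRIES` and `OLLAMA_RETRY_DELAY` combine as classic exponential backoff: the wait between attempts starts at the initial delay and doubles after each failure. A minimal sketch of that behavior (an illustration, not the server's actual retry code):

```python
import time

def with_retries(fn, max_retries=3, initial_delay=1.0):
    """Call fn(), retrying on failure with exponentially growing delays."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise               # retries exhausted: surface the error
            time.sleep(delay)
            delay *= 2              # backoff: 1.0s, 2.0s, 4.0s, ...
```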
Host Auto-Detection
The server automatically detects the appropriate Ollama host:
- Environment Variable: If `OLLAMA_HOST` is set, uses that value
- Localhost Test: Tries to connect to `http://localhost:11434`
- Network Detection: If localhost fails, automatically detects the local network IP
- Fallback: Uses localhost as the final fallback
This ensures seamless operation in both local development and external access scenarios (like Smithery).
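The detection order described above can be sketched as follows (a hypothetical implementation for illustration; the server's actual logic may differ):

```python
import socket

def detect_ollama_host(env_value=None, port=11434, timeout=1.0):
    """Pick an Ollama base URL: env override, then localhost, then LAN IP."""
    if env_value:                       # 1. explicit OLLAMA_HOST wins
        return env_value
    try:                                # 2. is localhost reachable?
        with socket.create_connection(("localhost", port), timeout=timeout):
            return f"http://localhost:{port}"
    except OSError:
        pass
    try:                                # 3. discover this machine's LAN IP
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.connect(("8.8.8.8", 80))  # no packet sent; just selects a route
            return f"http://{s.getsockname()[0]}:{port}"
    except OSError:                     # 4. final fallback
        return f"http://localhost:{port}"
```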
Other Settings
```bash
# Logging
OLLAMA_LOG_LEVEL=INFO        # Log level (DEBUG, INFO, WARNING, ERROR)
OLLAMA_LOG_REQUESTS=false    # Log all API requests/responses

# Performance
OLLAMA_ENABLE_CACHE=true     # Enable response caching
OLLAMA_CACHE_TTL=300         # Cache TTL in seconds
```
Copy .env.example to .env and customize as needed.
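The caching settings describe a simple time-to-live (TTL) cache: a cached response is served until `OLLAMA_CACHE_TTL` seconds have elapsed, then evicted. A minimal sketch of that idea (illustrative only, not the server's actual cache):

```python
import time

class TTLCache:
    """Minimal time-based response cache, in the spirit of OLLAMA_CACHE_TTL."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() >= expiry:  # stale: evict and report a miss
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```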
Troubleshooting
MCP Server Not Connecting
If Claude Desktop shows connection errors:
- Restart Claude Desktop after making configuration changes
- Check that Ollama is running: `ollama ps` should show running models
- Verify that the Python path in your Claude Desktop config is correct
- Check logs by setting `OLLAMA_LOG_LEVEL=DEBUG` in your `.env` file
Schema Mismatch Errors
If you see parameter-related errors:
- Restart the MCP server completely
- Restart Claude Desktop
- Check that all dependencies are installed: `pip install -r requirements.txt`
Connection to Ollama Fails
If the server can’t connect to Ollama:
- Ensure Ollama is running: `ollama serve`
- Check the Ollama URL in your configuration
- Try accessing Ollama directly: `curl http://localhost:11434/api/tags`
Performance Issues
For better performance:
- Enable caching: `OLLAMA_ENABLE_CACHE=true`
- Adjust the cache TTL: `OLLAMA_CACHE_TTL=600`
- Increase the timeout for large models: `OLLAMA_REQUEST_TIMEOUT=60`
Testing
Test the server directly using the MCP dev tool:
```bash
mcp dev ollama_mcp_server/main.py
```
Or run the server and test individual tools:
```bash
# Start the server
python -m ollama_mcp_server.main

# In another terminal, test with Claude Desktop or other MCP clients
# You can also check the example usage:
python examples/usage_example.py
```
This will verify that all core functions work correctly with your Ollama installation.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Anthropic for the Model Context Protocol specification
- Ollama for the excellent local LLM platform
- The MCP community for tools and documentation
Support
If you encounter any issues or have questions:
- Check the troubleshooting section above
- Look through existing GitHub issues
- Create a new issue with detailed information about your problem
Project Details
- Repository: cuba6112/ollama-mcp
- License: MIT
- Last Updated: 6/13/2025