Ollama MCP Server
An enhanced MCP (Model Context Protocol) server for interacting with the Ollama API, providing a robust bridge between Claude Desktop and locally-running LLMs via Ollama.
Features
- ✅ Complete Ollama API Coverage: All major Ollama endpoints implemented
- 🔄 Connection Pooling: Efficient HTTP client with connection reuse
- 🚦 Smart Retry Logic: Automatic retries with exponential backoff
- 📝 Comprehensive Logging: Configurable logging for debugging
- ⚡ Response Caching: Intelligent caching for improved performance
- 🛡️ Error Handling: Graceful error handling with helpful messages
- ⚙️ Flexible Configuration: Environment variables and .env file support
- 🌐 Smart Host Detection: Automatically detects localhost vs external network access
- 🔍 Type Safety: Full Pydantic validation for requests/responses
- 📊 Advanced Options: Support for temperature, top_p, seed, and more
- 🌊 Streaming Support: Real-time token streaming for long responses
Quick Start
macOS Instructions
Install Ollama and ensure it's running:

```bash
# Download from https://ollama.com or use:
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```

Install the MCP server:

```bash
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "/Users/mac_orion/mcp_server/ollama_mcp_server/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/Users/mac_orion/mcp_server/ollama_mcp_server"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Smithery Integration
Using with Smithery
Smithery provides a convenient way to install and manage MCP servers. The Ollama MCP server includes automatic network detection to work seamlessly with external tools like Smithery.
Installation via Smithery
```bash
npx -y @smithery/cli@latest install @cuba6112/ollama-mcp --client windsurf --key YOUR_KEY
```
Network Configuration
The server automatically detects the appropriate Ollama host:
- Local Development: Uses `http://localhost:11434` when Ollama is accessible locally
- External Access: Automatically detects your local network IP (e.g., `http://YOUR_LOCAL_IP:11434`) when localhost is not accessible
- Manual Override: Set the `OLLAMA_HOST` environment variable for custom configurations
Ensuring Ollama External Access
For Smithery and other external tools to connect to your local Ollama instance:
Start Ollama bound to all network interfaces by setting the `OLLAMA_HOST` environment variable:

```bash
export OLLAMA_HOST=0.0.0.0
ollama serve
```

Verify connectivity:

```bash
# Test from another machine or tool
curl http://YOUR_LOCAL_IP:11434/api/tags
```
Troubleshooting Smithery Connection
If Smithery cannot connect to your Ollama instance:
- Check that Ollama is accepting external connections: start it with `OLLAMA_HOST=0.0.0.0 ollama serve`
- Verify firewall settings: ensure port 11434 is not blocked
- Test network connectivity: try accessing `http://YOUR_LOCAL_IP:11434` from another device
- Check server logs: set `OLLAMA_LOG_LEVEL=DEBUG` for detailed connection information
Linux Instructions
Install Ollama and ensure it's running:

```bash
# Download from https://ollama.com or use:
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```

Install the MCP server:

```bash
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "/path/to/your/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/path/to/ollama-mcp"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Windows Instructions
Install Ollama: Download and install from the official Ollama website.
Install the MCP server:

```cmd
git clone https://github.com/cuba6112/ollama-mcp.git
cd ollama-mcp
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```

Configure Claude Desktop by adding to your config file (note that backslashes must be doubled in JSON):

```json
{
  "mcpServers": {
    "ollama": {
      "command": "C:\\path\\to\\your\\venv\\Scripts\\python.exe",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "C:\\path\\to\\ollama-mcp"
    }
  }
}
```

Restart Claude Desktop and start using Ollama models!
Prerequisites
- Python 3.9+ installed.
- Ollama installed and running on your local machine. You can download it from ollama.com.
- `uv` or `pip` installed for package management.
Installation
Navigate to the project directory:

```bash
cd /Users/mac_orion/mcp_server/ollama_mcp_server
```

Create a virtual environment using `venv`:

```bash
python -m venv .venv
source .venv/bin/activate
```

Install dependencies using `pip`:

```bash
pip install -e .
```

Or using `uv`:

```bash
uv pip install -e .
```
Running the Server
Once the dependencies are installed, you can run the server directly:

```bash
python -m ollama_mcp_server.main
```

Or use the `mcp dev` tool for development:

```bash
mcp dev ollama_mcp_server/main.py
```
Claude Desktop Configuration
To use this server with the Claude Desktop app, you need to add it to your configuration file.
On macOS, edit `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "ollama": {
      "command": "/path/to/your/venv/bin/python",
      "args": ["-m", "ollama_mcp_server.main"],
      "cwd": "/Users/mac_orion/mcp_server/ollama_mcp_server"
    }
  }
}
```
Replace `/path/to/your/venv/bin/python` with the actual path to your Python executable.
Available Tools
Core Tools
- `list_models`: List all available Ollama models with size and modification info
- `show_model`: Get detailed information about a specific model
- `check_model_exists`: Check if a model exists locally
Generation Tools
- `generate_completion`: Generate text completions with advanced options
  - Supports temperature, top_p, top_k, seed, num_predict, stop sequences
  - Streaming support for real-time responses
- `generate_chat_completion`: Generate chat responses with conversation history
  - Full message history support (system, user, assistant roles)
  - Same advanced options as completion
- `generate_embeddings`: Create embeddings for text (supports both single strings and lists)
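For orientation, Ollama's `/api/chat` endpoint (which the chat tool is built on) takes the model name, the message history, and any advanced options in a single JSON payload. A minimal sketch of assembling one — the model name `llama3.2` is just an example; use any model you have pulled locally:

```python
import json

def build_chat_payload(messages, model="llama3.2", stream=False, **options):
    """Assemble a JSON payload for Ollama's /api/chat endpoint.

    Advanced options (temperature, top_p, top_k, seed, ...) go under
    the "options" key rather than at the top level.
    """
    return {
        "model": model,
        "messages": messages,
        "stream": stream,
        "options": options,
    }

history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize what an MCP server does."},
]
payload = build_chat_payload(history, temperature=0.7, seed=42)
print(json.dumps(payload, indent=2))
```

POST the payload to `{host}/api/chat` (e.g., `http://localhost:11434/api/chat`) to get a response.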
Model Management
- `pull_model`: Download models from the Ollama library
- `copy_model`: Duplicate a model with a new name
- `delete_model`: Remove models from local storage
- `list_running_models`: Show currently loaded models in memory
Configuration
The server can be configured using environment variables or a .env file:
Connection Settings
```bash
# Ollama host - automatically detected by default
OLLAMA_HOST=http://localhost:11434   # Manual override for Ollama API URL
OLLAMA_REQUEST_TIMEOUT=30.0          # Request timeout in seconds
OLLAMA_CONNECTION_TIMEOUT=5.0        # Connection timeout in seconds
OLLAMA_MAX_RETRIES=3                 # Max retry attempts
OLLAMA_RETRY_DELAY=1.0               # Initial retry delay in seconds
```
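The retry settings pair an attempt limit with an initial delay that grows exponentially between attempts. A minimal sketch of that pattern — the 2x backoff factor is an assumption for illustration, not taken from the server's source:

```python
import time

def with_retries(fn, max_retries=3, initial_delay=1.0, backoff=2.0,
                 sleep=time.sleep):
    """Call fn(), retrying on ConnectionError with exponential backoff.

    Mirrors OLLAMA_MAX_RETRIES / OLLAMA_RETRY_DELAY: one initial call
    plus up to max_retries retries, doubling the wait each time.
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                raise          # out of retries; propagate the error
            sleep(delay)       # sleep is injectable for testing
            delay *= backoff
```

With the defaults above, a request that keeps failing waits 1.0s, 2.0s, then 4.0s before giving up.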
Host Auto-Detection
The server automatically detects the appropriate Ollama host:
- Environment Variable: If `OLLAMA_HOST` is set, uses that value
- Localhost Test: Tries to connect to `http://localhost:11434`
- Network Detection: If localhost fails, automatically detects the local network IP
- Fallback: Uses localhost as a final fallback
This ensures seamless operation in both local development and external access scenarios (like Smithery).
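The detection order above can be sketched with the standard library alone. This is an illustration of the algorithm as described, not the server's actual implementation; the UDP-socket trick for finding the local IP is a common idiom (no packets are actually sent):

```python
import os
import socket

def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def local_ip() -> str:
    """Best-effort local network IP via a connected UDP socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 80))   # no traffic sent; just picks a route
        return s.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        s.close()

def detect_ollama_host(port: int = 11434) -> str:
    # 1. An explicit OLLAMA_HOST environment variable always wins
    env = os.environ.get("OLLAMA_HOST")
    if env:
        return env
    # 2. Try localhost first
    if port_open("127.0.0.1", port):
        return f"http://localhost:{port}"
    # 3. Fall back to the local network IP, then localhost
    ip = local_ip()
    if port_open(ip, port):
        return f"http://{ip}:{port}"
    return f"http://localhost:{port}"
```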
Other Settings
```bash
# Logging
OLLAMA_LOG_LEVEL=INFO        # Log level (DEBUG, INFO, WARNING, ERROR)
OLLAMA_LOG_REQUESTS=false    # Log all API requests/responses

# Performance
OLLAMA_ENABLE_CACHE=true     # Enable response caching
OLLAMA_CACHE_TTL=300         # Cache TTL in seconds
```
Copy `.env.example` to `.env` and customize as needed.
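Conceptually, `OLLAMA_CACHE_TTL` means a cached response is reused until its age exceeds the TTL, after which it is evicted and refetched. A tiny sketch of that semantics (an illustration only, not the server's actual cache code; the injectable clock exists purely to make the behavior easy to demonstrate):

```python
import time

class TTLCache:
    """Minimal time-to-live cache illustrating OLLAMA_CACHE_TTL semantics."""

    def __init__(self, ttl: float = 300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock           # injectable for testing
        self._store = {}             # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:   # expired: evict and report a miss
            del self._store[key]
            return default
        return value
```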
Troubleshooting
MCP Server Not Connecting
If Claude Desktop shows connection errors:
- Restart Claude Desktop after making configuration changes
- Check that Ollama is running: `ollama ps` should show running models
- Verify that the Python path in your Claude Desktop config is correct
- Check logs by setting `OLLAMA_LOG_LEVEL=DEBUG` in your `.env` file
Schema Mismatch Errors
If you see parameter-related errors:
- Restart the MCP server completely
- Restart Claude Desktop
- Check that all dependencies are installed: `pip install -r requirements.txt`
Connection to Ollama Fails
If the server can't connect to Ollama:

- Ensure Ollama is running: `ollama serve`
- Check the Ollama URL in your configuration
- Try accessing Ollama directly: `curl http://localhost:11434/api/tags`
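The same check can be scripted with the standard library. `/api/tags` returns a JSON object with a `models` list; this probe returns the model names if Ollama answers, or `None` on any failure:

```python
import json
import urllib.error
import urllib.request

def ollama_models(base_url="http://localhost:11434", timeout=3.0):
    """Return model names from /api/tags, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags",
                                    timeout=timeout) as resp:
            data = json.load(resp)
        return [m.get("name") for m in data.get("models", [])]
    except (urllib.error.URLError, OSError, ValueError):
        return None   # connection refused, timeout, or malformed response
```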
Performance Issues
For better performance:
- Enable caching: `OLLAMA_ENABLE_CACHE=true`
- Adjust the cache TTL: `OLLAMA_CACHE_TTL=600`
- Increase the timeout for large models: `OLLAMA_REQUEST_TIMEOUT=60`
Testing
Test the server directly using the MCP dev tool:

```bash
mcp dev ollama_mcp_server/main.py
```

Or run the server and test individual tools:

```bash
# Start the server
python -m ollama_mcp_server.main

# In another terminal, test with Claude Desktop or other MCP clients
# You can also check the example usage:
python examples/usage_example.py
```
This will verify that all core functions work correctly with your Ollama installation.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Anthropic for the Model Context Protocol specification
- Ollama for the excellent local LLM platform
- The MCP community for tools and documentation
Support
If you encounter any issues or have questions:
- Check the troubleshooting section above
- Look through existing GitHub issues
- Create a new issue with detailed information about your problem
Project Details
- cuba6112/ollama-mcp
- MIT License
- Last Updated: 6/13/2025