Doc Scraper MCP Server
A Model Context Protocol (MCP) server that provides documentation scraping functionality. This server converts web-based documentation into markdown format using jina.ai’s conversion service.
Features
- Scrapes documentation from any web URL
- Converts HTML documentation to markdown format
- Saves the converted documentation to a specified output path
- Integrates with the Model Context Protocol (MCP)
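The scrape-and-save flow described above can be sketched in a few lines. This is a minimal illustration, not the server's actual implementation: it assumes the conversion goes through the public Jina Reader endpoint (`https://r.jina.ai/` followed by the target URL), and it uses the standard library's `urllib` instead of `aiohttp` so that it has no dependencies. The helper names `reader_url`, `fetch_markdown`, and `save_markdown` are hypothetical.

```python
from pathlib import Path
from urllib.request import Request, urlopen

# Assumption: the public Jina Reader endpoint, which returns a markdown
# rendering of whatever URL is appended to it.
JINA_READER_PREFIX = "https://r.jina.ai/"


def reader_url(doc_url: str) -> str:
    """Build the reader URL for a documentation page (hypothetical helper)."""
    return JINA_READER_PREFIX + doc_url


def fetch_markdown(doc_url: str, timeout: float = 30.0) -> str:
    """Download the markdown rendering of doc_url (performs a network call)."""
    request = Request(reader_url(doc_url), headers={"User-Agent": "doc-scraper-sketch"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def save_markdown(markdown: str, output_path: str) -> Path:
    """Write markdown to output_path, creating parent directories if needed."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
    return path
```

Calling `save_markdown(fetch_markdown(url), output_path)` would then mirror the two inputs the tool accepts.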
Installation
Installing via Smithery
To install Doc Scraper for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @askjohngeorge/mcp-doc-scraper --client claude
Installing manually
- Clone the repository:
git clone https://github.com/askjohngeorge/mcp-doc-scraper.git
cd mcp-doc-scraper
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
- Install the dependencies:
pip install -e .
Usage
The server can be run using Python:
python -m mcp_doc_scraper
Tool Description
The server provides a single tool:
- Name: scrape_docs
- Description: Scrape documentation from a URL and save as markdown
- Input Parameters:
  - url: The URL of the documentation to scrape
  - output_path: The path where the markdown file should be saved
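To show how a client might invoke this tool, the sketch below builds the JSON-RPC `tools/call` request that MCP clients send to a server. The tool name and the two argument names come from the description above; the helper name `build_scrape_docs_call` and the example values are hypothetical, and in practice an MCP client library would construct and send this message for you.

```python
import json


def build_scrape_docs_call(url: str, output_path: str, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request for the scrape_docs tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "scrape_docs",
            # Argument names follow the tool's documented input parameters.
            "arguments": {"url": url, "output_path": output_path},
        },
    }
    return json.dumps(message)


# Example values are placeholders, not taken from the project.
example_request = build_scrape_docs_call(
    "https://example.com/docs", "docs/example.md"
)
```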
Project Structure
doc_scraper/
├── __init__.py
├── __main__.py
└── server.py
Dependencies
- aiohttp
- mcp
- pydantic
Development
To set up the development environment:
- Install development dependencies:
pip install -r requirements.txt
- The server uses the Model Context Protocol. Familiarize yourself with the MCP documentation.
License
MIT License
Project Details
- askjohngeorge/mcp-doc-scraper
- Last Updated: 4/18/2025