Dataset Viewer MCP Server
An MCP server for interacting with the Hugging Face Dataset Viewer API, providing capabilities to browse and analyze datasets hosted on the Hugging Face Hub.
Features
Resources
- Uses the dataset:// URI scheme for accessing Hugging Face datasets
- Supports dataset configurations and splits
- Provides paginated access to dataset contents
- Handles authentication for private datasets
- Supports searching and filtering dataset contents
- Provides dataset statistics and analysis
Tools
The server provides the following tools:
validate
- Check if a dataset exists and is accessible
- Parameters:
  - dataset: Dataset identifier (e.g. 'stanfordnlp/imdb')
  - auth_token (optional): For private datasets
get_info
- Get detailed information about a dataset
- Parameters:
  - dataset: Dataset identifier
  - auth_token (optional): For private datasets
get_rows
- Get paginated contents of a dataset
- Parameters:
  - dataset: Dataset identifier
  - config: Configuration name
  - split: Split name
  - page (optional): Page number (0-based)
  - auth_token (optional): For private datasets
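As an illustration of how a 0-based page number can map onto the underlying API, here is a minimal sketch that builds a request URL for the public Dataset Viewer API `/rows` endpoint, which takes `offset` and `length` rather than a page. The helper name `rows_url` and the page size of 100 (the endpoint's maximum `length`) are assumptions made for this example; the server's actual internals may differ.

```python
from urllib.parse import urlencode

# Base URL of the public Hugging Face Dataset Viewer API.
API_BASE = "https://datasets-server.huggingface.co"
# Assumed page size; /rows accepts at most 100 rows per request.
PAGE_SIZE = 100


def rows_url(dataset: str, config: str, split: str, page: int = 0) -> str:
    """Build a /rows URL for a 0-based page number (hypothetical helper)."""
    if page < 0:
        raise ValueError("page must be >= 0")
    params = {
        "dataset": dataset,
        "config": config,
        "split": split,
        "offset": page * PAGE_SIZE,  # index of the first row on this page
        "length": PAGE_SIZE,
    }
    return f"{API_BASE}/rows?{urlencode(params)}"


if __name__ == "__main__":
    print(rows_url("stanfordnlp/imdb", "plain_text", "train", page=2))
```

With this mapping, page 0 covers rows 0 to 99, page 1 covers rows 100 to 199, and so on.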
get_first_rows
- Get first rows from a dataset split
- Parameters:
  - dataset: Dataset identifier
  - config: Configuration name
  - split: Split name
  - auth_token (optional): For private datasets
get_statistics
- Get statistics about a dataset split
- Parameters:
  - dataset: Dataset identifier
  - config: Configuration name
  - split: Split name
  - auth_token (optional): For private datasets
search_dataset
- Search for text within a dataset
- Parameters:
  - dataset: Dataset identifier
  - config: Configuration name
  - split: Split name
  - query: Text to search for
  - auth_token (optional): For private datasets
filter
- Filter rows using SQL-like conditions
- Parameters:
  - dataset: Dataset identifier
  - config: Configuration name
  - split: Split name
  - where: SQL WHERE clause (e.g. "score > 0.5")
  - orderby (optional): SQL ORDER BY clause
  - page (optional): Page number (0-based)
  - auth_token (optional): For private datasets
get_parquet
- Download entire dataset in Parquet format
- Parameters:
  - dataset: Dataset identifier
  - auth_token (optional): For private datasets
Installation
Prerequisites
- Python 3.12 or higher
- uv - Fast Python package installer and resolver
Setup
- Clone the repository:
git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
- Create a virtual environment and install:
# Create virtual environment
uv venv
# Activate virtual environment
# On Unix:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install in development mode
uv add -e .
Configuration
Environment Variables
HUGGINGFACE_TOKEN: Your Hugging Face API token for accessing private datasets
Claude Desktop Integration
Add the following to your Claude Desktop config file:
On Windows: %APPDATA%\Claude\claude_desktop_config.json
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
{
"mcpServers": {
"dataset-viewer": {
"command": "uv",
"args": [
"run",
"dataset-viewer"
]
}
}
}
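If the config file already lists other MCP servers, the entry above should be merged into it rather than pasted over the file. Here is a minimal sketch of such a merge. The function name `add_server` is hypothetical, and the script simply writes the same `command` and `args` shown above, leaving other keys untouched.

```python
import json
from pathlib import Path


def add_server(config_path: Path, name: str = "dataset-viewer") -> dict:
    """Merge the dataset-viewer entry into a Claude Desktop config file (hypothetical helper)."""
    config: dict = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            # Keep whatever settings and servers are already configured.
            config = json.loads(text)
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": "uv", "args": ["run", "dataset-viewer"]}
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


if __name__ == "__main__":
    # Example path only; use the location for your platform listed above.
    demo = Path("claude_desktop_config.json")
    print(json.dumps(add_server(demo), indent=2))
```

Restart Claude Desktop after editing the file so the new server entry is picked up.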
Usage Examples
- Validate a dataset:
{
"dataset": "stanfordnlp/imdb"
}
- Get dataset information:
{
"dataset": "stanfordnlp/imdb"
}
- Search dataset contents:
{
"dataset": "stanfordnlp/imdb",
"config": "plain_text",
"split": "train",
"query": "great movie"
}
- Filter and sort rows:
{
"dataset": "stanfordnlp/imdb",
"config": "plain_text",
"split": "train",
"where": "label = 1",
"orderby": "text DESC",
"page": 0
}
- Get dataset statistics:
{
"dataset": "stanfordnlp/imdb",
"config": "plain_text",
"split": "train"
}
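Each JSON object above is the `arguments` value of an MCP `tools/call` request. For readers who want to see the full message, the sketch below builds the JSON-RPC 2.0 request a client would send for the `search_dataset` tool. The helper name `tool_call_request` is hypothetical; in practice an MCP client or SDK constructs this message for you.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as JSON-RPC 2.0 (hypothetical helper)."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message, indent=2)


if __name__ == "__main__":
    print(
        tool_call_request(
            1,
            "search_dataset",
            {
                "dataset": "stanfordnlp/imdb",
                "config": "plain_text",
                "split": "train",
                "query": "great movie",
            },
        )
    )
```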
License
MIT License - see LICENSE for details
Project Details
- privetin/dataset-viewer
- MIT License
- Last Updated: 4/16/2025





