# MLC Bakery
A Python-based service for managing ML model provenance and lineage, built with FastAPI and SQLAlchemy.
## Features

- Dataset management with collection support
- Entity tracking
- Activity logging
- Agent management
- Provenance relationship tracking
- RESTful API endpoints
## Development Setup

### Prerequisites

- Python 3.12+
- uv (Python package manager)
- PostgreSQL (running locally or via Docker)

### Local Development Setup
1. Clone the repository:

   ```bash
   git clone <your-repo-url> mlcbakery
   cd mlcbakery
   ```

2. Install dependencies: `uv` uses `pyproject.toml` to manage dependencies and will automatically create a virtual environment if one doesn't exist.

   ```bash
   # Install main, dev, and webclient dependencies in editable mode
   uv pip install -e .[dev,webclient]
   ```

3. Set up environment variables: create a `.env` file in the project root by copying the example:

   ```bash
   cp .env.example .env  # Ensure .env.example exists and is up-to-date
   ```

   Edit `.env` with your local PostgreSQL connection details. The key variable is `DATABASE_URL`. Example for a user `devuser` with password `devpass` connecting to database `mlcbakery_dev`:

   ```bash
   # .env
   DATABASE_URL=postgresql+asyncpg://devuser:devpass@localhost:5432/mlcbakery_dev
   ```

   (Ensure your PostgreSQL server is running, the specified database exists, and the user has permissions.)

4. Run database migrations: apply the latest database schema using Alembic. `uv run` executes commands within the project's managed environment.

   ```bash
   uv run alembic upgrade heads
   ```
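A malformed `DATABASE_URL` is a common cause of migration failures, so it can help to sanity-check the value before running Alembic. A minimal stdlib sketch, using the example placeholder credentials from above (not real secrets):

```python
from urllib.parse import urlsplit

# Example DATABASE_URL from the step above (placeholder credentials).
url = "postgresql+asyncpg://devuser:devpass@localhost:5432/mlcbakery_dev"

parts = urlsplit(url)
print(parts.scheme)            # postgresql+asyncpg (driver SQLAlchemy will load)
print(parts.hostname)          # localhost
print(parts.port)              # 5432
print(parts.path.lstrip("/"))  # mlcbakery_dev (database name)
```

If the printed pieces don't match your PostgreSQL setup, fix `.env` before retrying the migration.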
## Running the Server (Locally)

Start the FastAPI application using uvicorn:

```bash
# Make sure your .env file is present for the DATABASE_URL
uv run uvicorn mlcbakery.main:app --reload --host 0.0.0.0 --port 8000
```

The API will be available at http://localhost:8000 (or your machine's IP address):

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
## Running Tests

The tests run against the PostgreSQL database defined by the `DATABASE_URL` environment variable. You can use the same database as your development environment, or configure a separate test database in your `.env` file (adjust the connection string as needed).

```bash
# Ensure DATABASE_URL is set in your environment or .env file
uv run pytest
```

To run specific tests:

```bash
uv run pytest tests/test_activities.py -v
```
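Because the suite reads `DATABASE_URL` from the environment, one way to keep test data out of your development database is to launch pytest with an overridden URL. A hedged sketch; the `mlcbakery_test` database name is an assumption for illustration, and the database must already exist:

```python
import os
import subprocess

# Assumption: a dedicated test database named mlcbakery_test has been created.
env = dict(os.environ)
env["DATABASE_URL"] = (
    "postgresql+asyncpg://devuser:devpass@localhost:5432/mlcbakery_test"
)

# Launch pytest with the overridden environment (commented out so this
# snippet stays side-effect free; uncomment to actually run the suite).
# subprocess.run(["uv", "run", "pytest"], env=env, check=True)

print(env["DATABASE_URL"].rsplit("/", 1)[-1])  # mlcbakery_test
```

The same effect can be had in the shell with `DATABASE_URL=... uv run pytest`.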
## Project Structure

```
mlcbakery/
├── alembic/          # Database migrations (Alembic)
├── .github/          # GitHub Actions workflows
├── mlcbakery/        # Main application package
│   ├── models/       # SQLAlchemy models
│   ├── schemas/      # Pydantic schemas
│   ├── api/          # API routes (FastAPI)
│   └── main.py       # FastAPI application entrypoint
├── tests/            # Test suite (pytest)
├── .env.example      # Example environment variables
├── alembic.ini       # Alembic configuration
├── pyproject.toml    # Project metadata and dependencies (uv)
└── README.md         # This file
```
## Database Schema

The schema is managed by Alembic migrations in the `alembic/versions` directory. The main tables include:

- `collections`
- `entities` (polymorphic base for datasets, models, etc.)
- `datasets`
- `trained_models`
- `activities`
- `agents`
- `activity_relationships` (tracks provenance)
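To make the relationships concrete, here is a deliberately simplified version of these tables sketched with the stdlib `sqlite3` module. The column names are assumptions for illustration only; the authoritative schema is whatever the Alembic migrations in `alembic/versions` create:

```python
import sqlite3

# Illustrative sketch of the provenance tables (assumed, simplified columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE collections (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE entities (
    id INTEGER PRIMARY KEY,
    entity_type TEXT,  -- polymorphic discriminator: 'dataset', 'trained_model', ...
    collection_id INTEGER REFERENCES collections(id),
    name TEXT
);
CREATE TABLE activities (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE agents (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE activity_relationships (
    id INTEGER PRIMARY KEY,
    activity_id INTEGER REFERENCES activities(id),
    source_entity_id INTEGER REFERENCES entities(id),  -- e.g. input dataset
    target_entity_id INTEGER REFERENCES entities(id)   -- e.g. trained model
);
""")

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

The key idea this sketch captures: datasets and trained models share the polymorphic `entities` base, and `activity_relationships` links entities through activities to record lineage.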
### Resetting the Database (Local Development)

If you are using a local PostgreSQL instance, you can drop and recreate the database:

```bash
# Example commands using psql tooling
# Connect as a superuser or the database owner
dropdb mlcbakery_dev
createdb mlcbakery_dev

# Re-run migrations
uv run alembic upgrade heads
```

**Warning:** This deletes all data in the development database.
## Contributing

1. Create a new branch for your feature (`git checkout -b feature/my-new-feature`)
2. Make your changes
3. Run tests to ensure everything passes (`uv run pytest`)
4. Commit your changes (`git commit -am 'Add some feature'`)
5. Push to the branch (`git push origin feature/my-new-feature`)
6. Submit a pull request
## License

MIT
## Deployment (Docker Compose)

This project includes a `docker-compose.yml` file for easier deployment of the API, database, Streamlit viewer, and Caddy reverse proxy.

### Prerequisites

- Docker and Docker Compose installed.
- A Docker network named `caddy-network`:

  ```bash
  docker network create caddy-network
  ```
### Steps

1. Configure environment variables: `docker-compose.yml` sets a default `DATABASE_URL` pointing to the `db` service within the Docker network. However, you must configure the `ADMIN_AUTH_TOKEN` for the `api` service. You can do this by:

   - Creating a `.env` file: create a `.env` file in the project root (Docker Compose loads it automatically) and add the following line:

     ```bash
     ADMIN_AUTH_TOKEN=your_secure_admin_token_here
     ```

   - Modifying `docker-compose.yml`: directly add `ADMIN_AUTH_TOKEN` under the `environment` section of the `api` service (less secure for secrets).
   - Passing it at runtime: set the variable in the shell when starting the services, e.g., `ADMIN_AUTH_TOKEN=your_secure_admin_token_here docker-compose up -d`.

2. Build and run services: navigate to the project root directory and run:

   ```bash
   docker-compose up --build -d
   ```

   This builds the necessary images and starts all services (api, mcp_server, streamlit, db, caddy) in the background.

3. Database migrations: once the `db` and `api` containers are running, apply the database migrations by executing the `alembic` command inside the running `api` container:

   ```bash
   docker-compose exec api alembic upgrade head
   ```

   Note: you might need to wait a few seconds for the database service to fully initialize before running migrations.
4. Accessing services:

   - API: accessible via the Caddy reverse proxy, typically at `http://localhost` or `http://<your-domain>` if configured in the `Caddyfile`. Direct access (bypassing Caddy) is usually on port 8000 if mapped. Swagger UI: `http://localhost/docs` (or `/api/v1/docs`, depending on the Caddy setup).
   - Streamlit viewer: accessible via Caddy, e.g., `http://streamlit.localhost`.
   - MCP server: accessible via Caddy, e.g., `http://mcp.localhost`.
   - Caddy: handles reverse proxying based on the `Caddyfile`. Modify the `Caddyfile` and restart the `caddy` service (`docker-compose restart caddy`) to update domains or proxy configurations.
### Stopping Services

```bash
docker-compose down
```

To remove the volumes (including database data):

```bash
docker-compose down -v
```
## Important Notes

- `ADMIN_AUTH_TOKEN`: required for any mutating API operations (POST, PUT, PATCH, DELETE). Include it in requests as a Bearer token in the `Authorization` header (e.g., `Authorization: Bearer your_secure_admin_token_here`).
- `DATABASE_URL`: ensure the `api` and `streamlit` services can reach the database specified by `DATABASE_URL`. The default in `docker-compose.yml` assumes the `db` service within the same Docker network.
- `Caddyfile`: customize the `Caddyfile` for your specific domains and HTTPS setup. The provided file includes examples for local `.localhost` domains and placeholders like `bakery.jetty.io`. Remember to restart Caddy after changes.
- `caddy-network`: the services rely on the external Docker network `caddy-network` for inter-service communication and Caddy proxying. Ensure this network exists.
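As an illustration of the Bearer-token requirement, here is how a client might build an authenticated request using only the stdlib. The `/api/v1/datasets` path and JSON body are hypothetical placeholders, not documented routes; check the Swagger UI at `/docs` for the real endpoints:

```python
import json
import urllib.request

token = "your_secure_admin_token_here"  # placeholder from the notes above

# Hypothetical endpoint and payload, for illustration only.
req = urllib.request.Request(
    "http://localhost:8000/api/v1/datasets",
    data=json.dumps({"name": "example-dataset"}).encode(),
    method="POST",
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    },
)

print(req.get_header("Authorization"))  # Bearer your_secure_admin_token_here

# To actually send it (requires the API to be running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Requests without the `Authorization: Bearer ...` header will be rejected for mutating operations.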
## Some Useful Commands

Drop and recreate the database:

```bash
docker compose exec db psql -U postgres -c "DROP DATABASE mlcbakery;"
docker compose exec db psql -U postgres -c "CREATE DATABASE mlcbakery;"
```

Once the api server is running, migrate the schema:

```bash
docker compose exec api alembic -c alembic.ini upgrade heads
```
## MLC Bakery MCP Server

### Project Details

- jettyio/mlcbakery
- MIT License
- Last Updated: 5/14/2025