UBOS Asset Marketplace: MLC Bakery - Your Foundation for ML Model Governance
In the rapidly evolving landscape of Machine Learning Operations (MLOps), managing model provenance and lineage is no longer a ‘nice-to-have’ but a critical necessity. The UBOS Asset Marketplace introduces MLC Bakery, a robust, Python-based service meticulously designed to address this challenge head-on. Built upon the solid foundation of FastAPI and SQLAlchemy, MLC Bakery offers a comprehensive solution for tracking, managing, and governing your machine learning models throughout their entire lifecycle.
MLC Bakery isn’t just another tool; it’s a cornerstone for establishing trust, transparency, and reproducibility in your AI initiatives. It empowers data scientists, ML engineers, and compliance officers to understand the origins, dependencies, and evolution of their models, fostering a culture of accountability and continuous improvement.
Why is Model Provenance and Lineage Crucial?
- Reproducibility: Ensure that you can recreate models and their results, even months or years down the line. This is vital for debugging, auditing, and regulatory compliance.
- Accountability: Understand who created, modified, or deployed a model, and when. This is essential for identifying and addressing potential biases or errors.
- Compliance: Meet the increasingly stringent regulatory requirements for AI, which often mandate detailed documentation of model development and deployment processes.
- Collaboration: Facilitate seamless collaboration between team members by providing a shared understanding of model dependencies and relationships.
- Risk Management: Identify and mitigate potential risks associated with model drift, data poisoning, or adversarial attacks.
Key Features of MLC Bakery
MLC Bakery provides a rich set of features designed to simplify and streamline model provenance and lineage management:
- Dataset Management with Collection Support: Organize and track your datasets, including support for collections, allowing you to group related datasets together for easier management. This feature enables you to maintain a clear understanding of the data used to train and evaluate your models.
- Entity Tracking: Track any type of entity involved in your ML pipeline, from datasets and models to code repositories and environment configurations. This comprehensive tracking ensures that you have a complete picture of your ML ecosystem.
- Activity Logging: Automatically log all activities performed on your ML entities, such as data ingestion, model training, and deployment. This detailed audit trail provides valuable insights into the evolution of your models.
- Agent Management: Manage the agents (e.g., users, scripts, or automated processes) that interact with your ML system. This feature enables you to track who is responsible for each activity and to enforce access control policies.
- Provenance Relationships Tracking: Establish and track relationships between different ML entities, such as which dataset was used to train which model, or which code repository contains the code for a specific model. This feature provides a clear understanding of the dependencies between different components of your ML pipeline.
- RESTful API Endpoints: Interact with MLC Bakery programmatically through a comprehensive set of RESTful API endpoints. This allows you to integrate MLC Bakery seamlessly into your existing ML workflows and tools.
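To make the relationship between entities, activities, and agents concrete, here is a minimal Python sketch of a provenance graph. It is illustrative only: the class names, fields, and the lineage helper are assumptions made for this example and are not taken from the MLC Bakery models or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Entity:
    """Anything tracked in the pipeline (hypothetical shape)."""
    name: str
    kind: str  # illustrative labels such as "dataset" or "trained_model"


@dataclass(frozen=True)
class Agent:
    """A user, script, or automated process that performs activities."""
    name: str


@dataclass
class Activity:
    """One logged step that turns input entities into an output entity."""
    name: str
    agent: Agent
    inputs: list[Entity]
    output: Entity
    started_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def lineage(target: Entity, activities: list[Activity]) -> list[Entity]:
    """Return every upstream entity of `target`, nearest first."""
    upstream: list[Entity] = []
    frontier = [target]
    while frontier:
        current = frontier.pop(0)
        for activity in activities:
            if activity.output == current:
                for source in activity.inputs:
                    if source not in upstream:
                        upstream.append(source)
                        frontier.append(source)
    return upstream


raw = Entity("raw-reviews", "dataset")
clean = Entity("clean-reviews", "dataset")
model = Entity("sentiment-v1", "trained_model")
job = Agent("nightly-training-job")

log = [
    Activity("clean", job, [raw], clean),
    Activity("train", job, [clean], model),
]
print([entity.name for entity in lineage(model, log)])
```

Walking the activity log backwards from a model is the core idea behind provenance queries: each activity records who ran it, what it consumed, and what it produced.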
Use Cases for MLC Bakery
MLC Bakery can be used in a wide range of industries and applications, including:
- Financial Services: Track the provenance of models used for fraud detection, credit scoring, and algorithmic trading to ensure compliance with regulatory requirements.
- Healthcare: Manage the lineage of models used for medical diagnosis, drug discovery, and personalized medicine to ensure patient safety and data integrity.
- Manufacturing: Track the provenance of models used for quality control, predictive maintenance, and supply chain optimization to improve efficiency and reduce costs.
- Retail: Manage the lineage of models used for customer segmentation, recommendation engines, and targeted advertising to improve customer experience and increase sales.
- Research: Ensure the reproducibility of scientific experiments by tracking the provenance of models used for data analysis and simulation.
Getting Started with MLC Bakery
MLC Bakery is designed to be easy to install and use. The following steps provide a quick overview of how to get started:
- Installation: Clone the MLC Bakery repository from the UBOS Asset Marketplace and install the required dependencies using uv, a modern Python package manager.
- Configuration: Configure the database connection details in the .env file. MLC Bakery supports PostgreSQL as its backend database.
- Database Migrations: Apply the latest database schema using Alembic. This will create the necessary tables and relationships to store model provenance information.
- Running the Server: Start the FastAPI application using Uvicorn. The API will be available at http://localhost:8000 (or your machine’s IP address).
Deep Dive into the Development Setup
Setting up MLC Bakery for local development is straightforward, ensuring you can quickly contribute or customize the tool to fit your specific needs. Here’s a more detailed breakdown of the setup process:
- Prerequisites: Ensure you have Python 3.12+ installed, along with uv for package management and a running PostgreSQL instance (either locally or via Docker).
- Cloning the Repository: Begin by cloning the MLC Bakery repository from the UBOS Asset Marketplace. Navigate into the newly created directory.
- Installing Dependencies: Use uv to install the necessary dependencies. This command installs main dependencies, development tools, and web client dependencies in editable mode, allowing you to make changes to the code and see them reflected immediately.
- Setting up Environment Variables: Create a .env file by copying the example and modify it with your local PostgreSQL connection details. The DATABASE_URL variable is crucial for connecting to your database. Ensure your PostgreSQL server is running, the database exists, and the specified user has the necessary permissions.
- Running Database Migrations: Apply the latest database schema using Alembic. This command executes within the project’s managed environment, ensuring the correct versions of dependencies are used. This step sets up the necessary tables in your PostgreSQL database.
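Because a malformed DATABASE_URL is a common cause of failed migrations, it can help to sanity-check the value before running Alembic. The following is a minimal sketch using only the Python standard library. The function name, the example URL, and the default port fallback are assumptions for illustration, and the exact URL format the project expects (for example, which driver suffix) is not specified here.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict[str, object]:
    """Split a PostgreSQL connection URL into its parts, or raise ValueError."""
    parts = urlsplit(url)
    # SQLAlchemy-style URLs may carry a driver suffix, e.g. "postgresql+psycopg2".
    dialect = parts.scheme.split("+", 1)[0]
    if dialect != "postgresql":
        raise ValueError(f"expected a postgresql URL, got scheme {parts.scheme!r}")
    database = parts.path.lstrip("/")
    if not database:
        raise ValueError("DATABASE_URL does not name a database")
    return {
        "user": parts.username,
        "host": parts.hostname,
        # Fall back to the standard PostgreSQL port when none is given.
        "port": parts.port or 5432,
        "database": database,
    }


# Hypothetical credentials and database name, for illustration only.
print(describe_database_url("postgresql://bakery:secret@localhost:5432/mlcbakery"))
```

A check like this only validates the shape of the URL. It does not confirm that the server is running, that the database exists, or that the user has the necessary permissions.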
Running the Server and Accessing Documentation
Once the database is migrated, you can start the FastAPI application with Uvicorn. Make sure your .env file is present so the application can connect to the database. Passing the --reload flag makes the server reload automatically whenever you change the code, which is useful during development. The API will be available at http://localhost:8000, with the interactive Swagger UI at http://localhost:8000/docs and the ReDoc documentation at http://localhost:8000/redoc. Both interfaces list the available API endpoints and how to use them.
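Beyond the browser-based documentation, FastAPI applications publish a machine-readable OpenAPI schema, by default at /openapi.json. The sketch below, which assumes MLC Bakery keeps that default path, fetches the schema and lists the available operations. The helper names are hypothetical, and fetch_schema only works while the server is running.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the OpenAPI schema (assumes FastAPI's default path)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_operations(schema: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs sorted by path, then by method."""
    operations = []
    for path, item in sorted(schema.get("paths", {}).items()):
        for method in sorted(item):
            # Path items can also hold non-method keys such as "parameters".
            if method in HTTP_METHODS:
                operations.append((method.upper(), path))
    return operations
```

With the server running, `for method, path in list_operations(fetch_schema()): print(method, path)` prints one line per endpoint, which is handy for scripting against the API.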
Testing and Ensuring Quality
MLC Bakery comes with a suite of tests to ensure its reliability and correctness. The tests are configured to run against a PostgreSQL database defined by the DATABASE_URL environment variable. You can use the same database as your development environment or configure a separate test database. To run all tests, use the command: uv run pytest. To run specific tests (e.g., tests related to activities), use the command: uv run pytest tests/test_activities.py -v.
Project Structure Explained
Understanding the project structure is key to navigating and contributing to MLC Bakery:
- alembic/: Contains database migration scripts managed by Alembic.
- .github/: Holds GitHub Actions workflows for automated testing and deployment.
- mlcbakery/: The main application package, containing:
  - models/: SQLAlchemy models representing database tables.
  - schemas/: Pydantic schemas for data validation and serialization.
  - api/: FastAPI routes defining the API endpoints.
  - main.py: The FastAPI application entry point.
- tests/: Contains the test suite written using pytest.
- .env.example: An example environment variables file.
- alembic.ini: Alembic configuration file.
- pyproject.toml: Project metadata and dependencies managed by uv.
- README.md: The documentation you’re reading now.
Database Schema Details
The database schema is managed by Alembic migrations, located in the alembic/versions directory. Key tables include collections, entities (a polymorphic base for datasets, models, etc.), datasets, trained_models, activities, agents, and activity_relationships (which tracks provenance).
Resetting the Database for Local Development
During development, you might need to reset the database. Resetting deletes all data in the development database, so proceed with caution. First connect as a superuser or the database owner, then drop and recreate the database, and finally re-run the database migrations.
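The reset sequence can be scripted. The following is a hedged sketch, not the project’s own tooling: it assumes the standard PostgreSQL client utilities dropdb and createdb are on your PATH, that connection settings come from the usual PG* environment variables, and that migrations run through uv. The database and owner names are hypothetical.

```python
import shlex
import subprocess


def reset_commands(database: str, owner: str) -> list[list[str]]:
    """Build the drop, recreate, and migrate commands in order."""
    return [
        ["dropdb", "--if-exists", database],
        ["createdb", "--owner", owner, database],
        ["uv", "run", "alembic", "upgrade", "head"],
    ]


def reset_database(database: str, owner: str, dry_run: bool = True) -> None:
    """Print each command and run it only when dry_run is False."""
    for command in reset_commands(database, owner):
        print("$", shlex.join(command))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(command, check=True)


# Dry run by default: nothing is deleted until dry_run=False is passed.
reset_database("mlcbakery_dev", "bakery")
```

Defaulting to a dry run is a deliberate choice here, since the real run destroys all data in the named database.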
Contributing to MLC Bakery
We welcome contributions to MLC Bakery! To contribute:
- Create a new branch for your feature (git checkout -b feature/my-new-feature).
- Make your changes.
- Run tests to ensure everything passes (uv run pytest).
- Commit your changes (git commit -am 'Add some feature').
- Push to the branch (git push origin feature/my-new-feature).
- Submit a pull request.
License Information
MLC Bakery is licensed under the MIT License, making it free to use, modify, and distribute.
Deployment with Docker Compose
MLC Bakery includes a docker-compose.yml file for easier deployment of the API, database, Streamlit viewer, and Caddy reverse proxy. This simplifies the process of setting up a complete environment for running and interacting with MLC Bakery.
- Prerequisites: Ensure you have Docker and Docker Compose installed. You also need to create a Docker network named caddy-network.
- Configuration: Configure the ADMIN_AUTH_TOKEN environment variable for the api service. This token is required for any mutable API operations (POST, PUT, PATCH, DELETE). You can set this variable in a .env file, directly in the docker-compose.yml file, or at runtime.
- Building and Running Services: Navigate to the project root directory and run docker-compose up --build -d. This will build the necessary images and start all services in the background.
- Database Migrations: Once the db and api containers are running, apply the database migrations by executing docker-compose exec api alembic upgrade head inside the running api container.
- Accessing Services: The API will be accessible via the Caddy reverse proxy, typically at http://localhost. The Streamlit Viewer and MCP Server will also be accessible via Caddy. Customize the Caddyfile for your specific domains and HTTPS setup.
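Since mutable operations require ADMIN_AUTH_TOKEN, a client has to attach the token to every write request. The sketch below shows one plausible way to prepare such a request with the Python standard library. It rests on assumptions that should be checked against the API documentation: that the token is sent as a bearer token in the Authorization header, and that an endpoint like /api/example exists (this path and the payload are purely hypothetical).

```python
import json
import os
from urllib.request import Request


def build_write_request(base_url: str, path: str, payload: dict, token: str) -> Request:
    """Prepare (but do not send) an authenticated JSON POST request."""
    return Request(
        url=f"{base_url.rstrip('/')}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the API expects the admin token as a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Read the token from the environment; "change-me" is a placeholder fallback.
token = os.environ.get("ADMIN_AUTH_TOKEN", "change-me")
request = build_write_request("http://localhost", "/api/example", {"name": "demo"}, token)
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen sends it. Keeping the token in an environment variable rather than in source code mirrors how the api service itself is configured.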
MLC Bakery and the UBOS Platform
MLC Bakery integrates with the UBOS platform, a full-stack AI Agent Development Platform designed to bring AI Agents to every business department. UBOS helps you orchestrate AI Agents, connect them with your enterprise data, and build custom AI Agents and Multi-Agent Systems with your own LLM. By using MLC Bakery within the UBOS ecosystem, you can ensure that your AI Agents are built on a foundation of trust, transparency, and accountability.
In Conclusion
MLC Bakery is a powerful tool for managing ML model provenance and lineage. Its comprehensive feature set, ease of use, and seamless integration with the UBOS platform make it an ideal solution for organizations of all sizes. By adopting MLC Bakery, you can establish a robust framework for governing your AI initiatives and ensuring that your models are reliable, reproducible, and compliant with regulatory requirements.
Start using MLC Bakery today and unlock the full potential of your machine learning investments!
MLC Bakery MCP Server
Project Details
- jettyio/mlcbakery
- MIT License
- Last Updated: 5/14/2025