MLOps Model Stores: Definition, Functionality, Tools Review
What is an ML (machine learning) model store? Here’s a scenario you might find yourself in. You’ve gone through a rigorous development workflow, experimenting with and training various machine learning models with different results and performance scores. You decide the best way to collaborate is to share your models, stored in object storage like S3 or GCS, and to keep the records in a spreadsheet.
You go through the process above, so when everyone on the team meets, you can all discuss the results and probably go back to developing more experiments, or in some cases, argue about which of the models the team should deploy.
After several hours of arguing back and forth on what model(s) to deploy, the team finally agrees on one. Now comes another issue of actually packaging the model that will be deployed, staging it, and making it production-ready. Finally, you push that model to production with one of the new model deployment tools. Uff.
This manual, laborious, and repetitive process can be time-consuming and potentially even harmful for you and your team. There has to be a way to keep all the collaboration on developing and deploying machine learning models efficient and streamlined, right?
Model stores are a new thing that will help you with that. Let’s explore what model stores are, how they help, and how to pick the right model store for your project.
What’s a model store?
The model store is a central place where data scientists store and manage their models and experiments, including the model files, artifacts, and metadata.
With model stores, you control the complexity of managing multiple machine learning models. This structure also helps data scientists to:
- Compare multiple, newly trained model versions against existing deployed versions;
- Compare completely new models against versions of other models on labeled data;
- Track model performance over time;
- Track organization-wide experiments;
- Manage serving needs for organization-wide machine learning models.
Model stores, like feature stores, are asset management technologies for your machine learning projects. The “model store” terminology is part of the new wave of MLOps jargon. It may sound like a marketing gimmick, but the concept is crucial to how you develop and deploy your machine learning applications.
Beyond a searchable inventory of these models, you can also access the model artifacts, model configuration files, and the experiment metadata for the model in question.
Model stores serve as the staging environment for models you will serve to the production environment. They let you combine reproducibility with production readiness. Think of the model store as a wrapper around all your models after they’ve gone through the necessary development workflow. At this point, they’re ready to be deployed to the production environment or pulled into production by a prediction service.
You use model stores when you want your models to be reproducible and production-ready. The model store contains all the information and files needed to reproduce the results of a model. It includes the necessary process for the model to be packaged, tested, and ready to be deployed.
Arguably, there’s almost no difference between a model store and a model registry. Still, one significant point made by Eduardo Bonet in the MLOps Community Slack channel is that in a model registry, you store and fetch models (like a Docker registry).
In a model store, you have logging, discovery, examples, metadata, all you need, and the model store contains a model registry. You might still find that companies prefer using “model registry” as an encompassing term because there isn’t any particular standard in the naming convention now.
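To make the distinction concrete, here is a minimal in-memory sketch. The class and method names (`ModelRegistry`, `ModelStore`, `log_model`, `discover`) are made up for illustration and don't correspond to any particular tool: the registry only pushes and pulls versioned payloads, while the store wraps a registry and adds metadata and discovery on top.

```python
class ModelRegistry:
    """Registry: only stores and fetches versioned model payloads."""

    def __init__(self):
        self._models = {}  # (name, version) -> serialized model bytes

    def push(self, name: str, payload: bytes) -> int:
        # The next version is one more than the versions already pushed under this name.
        version = 1 + sum(1 for (existing, _) in self._models if existing == name)
        self._models[(name, version)] = payload
        return version

    def pull(self, name: str, version: int) -> bytes:
        return self._models[(name, version)]


class ModelStore:
    """Store: contains a registry and adds metadata and discovery around it."""

    def __init__(self):
        self.registry = ModelRegistry()
        self._metadata = {}  # (name, version) -> metadata dict

    def log_model(self, name: str, payload: bytes, metadata: dict) -> int:
        version = self.registry.push(name, payload)
        self._metadata[(name, version)] = dict(metadata)
        return version

    def describe(self, name: str, version: int) -> dict:
        return self._metadata[(name, version)]

    def discover(self, **filters) -> list:
        # Return every (name, version) whose metadata matches all filters exactly.
        return sorted(
            key
            for key, meta in self._metadata.items()
            if all(meta.get(field) == value for field, value in filters.items())
        )


store = ModelStore()
v1 = store.log_model("churn", b"weights-v1", {"framework": "sklearn", "owner": "ana"})
v2 = store.log_model("churn", b"weights-v2", {"framework": "xgboost", "owner": "ana"})
matches = store.discover(framework="xgboost")
print(matches)
```

A real model store would persist both the payloads and the metadata, but the shape is the same: the registry is one component inside the store.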
Why do you need a model store for your MLOps projects?
There are three key reasons you’ll need model stores for your machine learning operations (MLOps) project:
- Reproducibility of the model(s).
- Ensuring the model(s) is production-ready.
- Managing the model(s) effectively.
Let’s look closer at these critical reasons to understand why you’d need model stores for your project.
Reproducibility of the model(s)
One of the most challenging aspects of a machine learning project is reproducing results. While experimenting, your models are initialized with random values and then adjusted based on training data. This randomness makes reproducibility a complicated process to implement manually. Other artifacts need to be fixed to ensure reproducibility during training.
Model stores support reproducibility in the following ways:
- Tracking and collecting experiment- and ML pipeline-related metadata (experiment author/owner, description, etc.).
- Collecting dataset metadata, including version, location, and description of the dataset. Also, how a user chose the data or where the data links to in the feature store.
- Collecting model artifacts, metadata (packages, frameworks, language, environment files, git files, etc.), and configuration files.
- Collecting container artifacts.
- Project documentation, including demos and examples on how to run a model.
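As a rough illustration of why this metadata matters, the sketch below fixes the random seed, records it next to the parameters and the runtime details, and then replays the run from that record. The `train_toy_model` function is a stand-in invented for this example; it is not real training code, and the record fields are only an assumed subset of what a store would collect.

```python
import json
import platform
import random
import sys


def train_toy_model(seed: int, n_weights: int = 3) -> list:
    """Stand-in for training: the 'weights' depend only on the seed."""
    rng = random.Random(seed)
    return [round(rng.uniform(-1, 1), 6) for _ in range(n_weights)]


def reproducibility_record(seed: int, params: dict) -> dict:
    """Collect the minimum needed to replay this toy run later."""
    return {
        "seed": seed,
        "params": dict(params),
        "python_version": platform.python_version(),
        "platform": sys.platform,
    }


record = reproducibility_record(seed=42, params={"n_weights": 3})
first_run = train_toy_model(record["seed"], **record["params"])

# Later, possibly on another machine: replay the run from the stored record only.
rerun = train_toy_model(record["seed"], **record["params"])
print(json.dumps(record, indent=2))
print(first_run == rerun)
```

With a real framework you would also need to seed the framework's own random number generators and pin package versions, which is exactly the kind of information the list above asks the store to collect.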
The store is also a spot for teams to “shop” models that can be reusable, making for more efficient collaboration and accessible machine learning projects. You can retrain off-the-shelf models on datasets similar to what someone else initially trained the model on, ensuring teams can experiment and roll out models faster.
Teammates can search for relevant models through the catalog feature to reuse, improve, or manage these models. This feature, in some way, eliminates the problem of building in silos and eases collaboration across the entire organization. With the model store, project results are consistent regardless of when and where they run.
In summary, when it comes to reproducibility, model stores promote collaboration, accessibility, and more efficient machine learning project workflows for individuals and teams.
Ensuring the model(s) is production-ready
Model stores are integrated with production systems to provide resilient serving of the models. With model stores, you get a faster rollout of production models. One of the technical bottlenecks of deploying models into production is the hand-off between data scientists (who develop the models) and the operations team (who deploy them), and a shared model store shortens that hand-off.
Within the model store, models bound for production have their artifacts validated and are then compiled and integrated into the staging environment. This ensures that the models can be tested for compatibility with other applications and for predictive performance before you deploy them to a prediction service.
Model stores also contain the preprocessing description for the models. The data pipeline in production can use the same pre-processing the model underwent during experimentation to avoid training-serving skew. The model stores integrate with the production environment and serving infrastructure to facilitate model deployment, model retirement (or archival), and rollbacks.
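Here is a small sketch of that idea, assuming a simple standardization step and a JSON record stored next to the model. The record layout and the names `fit_scaler` and `apply_scaler` are assumptions made for this example. The point is that serving reuses the parameters computed at training time instead of recomputing them on production data.

```python
import json
from statistics import mean, pstdev


def fit_scaler(values):
    """Compute standardization parameters on the training data only."""
    return {"mean": mean(values), "std": pstdev(values) or 1.0}


def apply_scaler(values, params):
    """Apply previously fitted parameters; never refit at serving time."""
    return [(value - params["mean"]) / params["std"] for value in values]


# Training time: fit the preprocessing and store it with the model record.
train_ages = [20.0, 30.0, 40.0]
params = fit_scaler(train_ages)
stored = json.dumps({"model": "churn", "version": 3, "preprocessing": {"age": params}})

# Serving time: load the stored description and apply the same transformation.
loaded = json.loads(stored)["preprocessing"]["age"]
serving_input = apply_scaler([30.0], loaded)
print(serving_input)
```

If the serving pipeline computed its own mean and standard deviation from live traffic, the model would see inputs on a different scale than it was trained on, which is one common form of training-serving skew.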
Beyond staging, model stores can also support other strategies for validating and deploying models such as canary integration tests and deployment, shadow mode deployment, and A/B deployment. For automated MLOps pipelines, you can integrate the model store with continuous integration and delivery/deployment (CI/CD) and continuous training (CT) to eliminate manual operational practices.
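For intuition, a canary rollout can be sketched as a deterministic traffic split between the current production version and the candidate. This is only an illustration under assumed names (`pick_version`, the `churn:3` and `churn:4` identifiers); real serving infrastructure usually does this in the load balancer or the prediction service.

```python
import hashlib


def pick_version(request_id: str, stable: str, canary: str, canary_percent: int) -> str:
    """Send a fixed share of requests to the canary, deterministically per request id."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in the range 0..99
    return canary if bucket < canary_percent else stable


# Route 1,000 hypothetical requests with a 10% canary share.
routes = [
    pick_version(f"user-{i}", stable="churn:3", canary="churn:4", canary_percent=10)
    for i in range(1000)
]
canary_share = routes.count("churn:4") / len(routes)
print(f"share of traffic on the canary: {canary_share:.1%}")
```

Hashing the request (or user) id keeps each caller on the same version across requests, which makes the canary's results easier to compare against the stable version. Setting `canary_percent` to 0 or 100 turns the canary off or completes the rollout.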
To track down and reproduce errors with your application, you’ll need to know which model is running in production. To do so, you need to keep a record of trained models linked to the datasets (and features) the models were trained on—the model store does this for you.
Managing model(s) effectively
Managing models across projects and entire organizations can be very complicated. Organizations have gone from deploying a handful of models solving specific problems to deploying hundreds and even thousands of models to solve different production issues. Model stores help with the visibility, discoverability, and management of these models.
Improving model governance
One primary reason for managing models in production is to improve the governance and security of these models. Most industries are required to follow regulatory requirements in terms of how they deploy their product to customers. Model stores allow models to be reviewed and audited, so their results comply with the regulatory requirements.
Beyond regulatory requirements, it is essential to check for constraints on using licensed and open-source tools to develop and deploy the model. Overlooking such restrictions may lead to breaches of the license or usage agreements for these tools. In this scenario, reviews and audits become crucial aspects of model management.
Improving model security
Models and the underlying packages used to build them also need to be scanned for vulnerabilities, especially when many packages are involved in developing and deploying the models. This means managing the specific versions of those packages and removing any security vulnerabilities that may pose a threat to the system.
Models are also susceptible to adversarial attacks and therefore need to be managed and protected against such attacks. There are also situations where the security principle of least privilege access needs to be applied so that only authorized users can access specific model information and resources.
Where a model store fits in your MLOps platform
So far, we know that model stores improve reproducibility and resilient serving of models. Let’s look at where they fit in your MLOps platform.
The image below has been modified from Google Cloud’s “MLOps: Continuous delivery and automation pipelines in machine learning” article.
Model stores are coupled with your experiment management system and integrated with your production pipeline to govern the model deployment process: review, test, approve, release, and rollback. At a high level, a model store handles the artifacts and metadata that come from the experiment management system or from an automated production pipeline with continuous training enabled.
In terms of artifact management, model stores manage the model lifecycle, including packaging the model for staging or release. For metadata management, searching for model-related metadata, creating reports, and other features make it easy to discover models in the store.
What you can find in a model store
Let’s look at the journey of a model after you have trained and validated it:
- Artifacts and metadata of models that you have trained and validated arrive at the model store from the experiment management system.
- The artifacts are validated to ensure the trained model includes all necessary artifacts for serving and monitoring.
- The model artifacts and metadata are packaged into a self-contained and loadable package ready for serving in the integrated production environment.
- The packaged model is staged for deployment to validate that it is suitable for serving. The validation includes a quality assurance test on the model and the other applications it will integrate with in the production environment.
The primary reason for these steps is to ensure the stability of the model in production. When multiple models are loaded in the same container, a faulty model may cause prediction requests to fail and potentially interrupt the other models in that container. In other scenarios, a single faulty model may disrupt an entire application in terms of results and performance.
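The journey above can be sketched in a few functions. This is a simplified illustration, not how any specific store implements it: the required artifact names are assumptions, the package is an in-memory tar.gz archive, and the "staging" step is reduced to a smoke test that checks the package can be loaded again.

```python
import io
import json
import tarfile

# Assumed set of artifacts a model needs before it can be served.
REQUIRED_ARTIFACTS = {"model.bin", "metadata.json", "requirements.txt"}


def validate_artifacts(artifacts: dict) -> None:
    """Reject a model whose artifact set is incomplete."""
    missing = REQUIRED_ARTIFACTS - set(artifacts)
    if missing:
        raise ValueError(f"missing artifacts: {sorted(missing)}")


def package_model(artifacts: dict) -> bytes:
    """Bundle all artifacts into one self-contained, loadable archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in sorted(artifacts):
            info = tarfile.TarInfo(name=name)
            info.size = len(artifacts[name])
            archive.addfile(info, io.BytesIO(artifacts[name]))
    return buffer.getvalue()


def stage_model(package: bytes) -> list:
    """Smoke test standing in for staging checks: the package must load."""
    with tarfile.open(fileobj=io.BytesIO(package), mode="r:gz") as archive:
        return sorted(archive.getnames())


artifacts = {
    "model.bin": b"\x00\x01",
    "metadata.json": json.dumps({"name": "churn", "version": 1}).encode("utf-8"),
    "requirements.txt": b"scikit-learn\n",
}
validate_artifacts(artifacts)
package = package_model(artifacts)
staged_files = stage_model(package)
print(staged_files)
```

A production setup would replace the smoke test with integration and quality assurance tests, but failing early on an incomplete artifact set is what keeps a broken model from reaching a shared serving container.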
Let’s take a look at what you can find within a model store:
- Diverse metadata: From models, data, and experiments.
- Artifacts: Like the metadata, the store contains all artifacts relevant to how you develop, deploy, and manage models.
- Documentation and reporting tools: Documentation is crucial for reviews and reproducible projects. Model stores enable documentation relevant to how you develop, deploy, and manage your models.
- Catalog: The information in the model stores needs to be searchable, and the catalog enables this. Searching for models to use? How about related metadata? Searching for which models were trained on a particular dataset? The catalog makes the store searchable.
- Staging tools: Another feature of the model store is the staging integration tests it can carry out on models. You can find tools for staging models for testing within the model store.
- Automation tools: One of the goals of model stores is to automate some repetitive tasks after you have trained and validated a model to increase the productivity of teams deploying lots of models. Within the store you can find automation tools and workflows that enable this process.
Tracking and managing metadata is crucial for any machine learning workflow. This feature includes any type of metadata that is important to how you develop, deploy, and manage models. Model stores are integrated with experiment management systems to track experiments or pipeline-related metadata.
When you run experiments, the experiment management systems log the outputs of the experiments so they can be tracked and managed effectively. “Effectively” here means the metadata from experiment runs are intended to help you monitor your models during development, debug any error you might encounter, and even visualize the performance of the model in graphs and charts.
Some of the experiment metadata you can find within the model store include:
- Environment configurations: Encapsulates the complete environment, including all tools, dependencies, packages, Docker files, and other configuration files needed to build the model.
- Compute information: The type of hardware and accelerator (if any) used during that experiment run, and how much power the run consumed, which helps determine the project’s carbon footprint.
- Code version: Includes the git SHA of a commit or an actual snapshot of code used to build the model.
- Execution details: Unique identifier for that particular run, date the experiment was triggered (timestamp), how long it took to run, and the time it completed.
- Pipeline-related metadata: Such as the pipeline version, run number, and pipeline parameters.
- URI of the files for charts, graphs, and performance results of the experiment.
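As an example of how the execution details and pipeline-related metadata might be captured, here is a small sketch built on a context manager. The `tracked_run` helper and its field names are hypothetical; experiment management systems expose their own logging APIs for this.

```python
import time
import uuid
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def tracked_run(pipeline_version: str, params: dict):
    """Record the identifier, timestamps, and duration of one experiment run."""
    run = {
        "run_id": uuid.uuid4().hex,  # unique identifier for this run
        "pipeline_version": pipeline_version,
        "params": dict(params),
        "started_at": datetime.now(timezone.utc).isoformat(),
    }
    start = time.perf_counter()
    try:
        yield run
    finally:
        # Recorded even if the experiment raises, so failed runs are tracked too.
        run["duration_seconds"] = time.perf_counter() - start
        run["finished_at"] = datetime.now(timezone.utc).isoformat()


with tracked_run("v1.2.0", {"learning_rate": 0.01}) as run:
    # Training would happen here; the metric below is a placeholder, not a real result.
    run["metrics"] = {"accuracy": 0.9}

print(sorted(run))
```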
Let’s look at some other metadata categories that you can find within the model store for an experiment run.
The model metadata that you can find within the model store include:
- Model name set by the user.
- Model description developed by the user.
- The URI of the model (or where it will be stored).
- Model type: The algorithm you used for the model, for example, logistic regression or a convolutional neural network.
- Framework(s) and libraries used to train the model, including their version numbers.
- Hyperparameter configurations of the model for that experiment run.
- Model version, which is crucial for uniquely identifying the model in the model store.
- Model tags (development, retired, or production) or labels.
- Model code.
- Type of metric used for the model.
- Other necessary model configuration files.
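One way to picture this metadata is as a single record per model version. The sketch below uses a dataclass with a subset of the fields listed above; the field names, the example values, and the `s3://example-bucket/...` URI are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelRecord:
    name: str
    version: int
    uri: str
    model_type: str
    framework: str
    framework_version: str
    hyperparameters: dict = field(default_factory=dict)
    tags: list = field(default_factory=list)
    metric: str = ""
    description: str = ""

    @property
    def key(self) -> str:
        """Name plus version uniquely identifies a model in the store."""
        return f"{self.name}:{self.version}"


record = ModelRecord(
    name="churn",
    version=2,
    uri="s3://example-bucket/churn/2/model.onnx",  # hypothetical location
    model_type="logistic regression",
    framework="scikit-learn",
    framework_version="1.0.2",
    hyperparameters={"C": 1.0, "penalty": "l2"},
    tags=["development"],
    metric="roc_auc",
    description="Baseline churn classifier.",
)

# Serialize for storage, then restore to check that nothing is lost on the way.
serialized = json.dumps(asdict(record), sort_keys=True)
restored = ModelRecord(**json.loads(serialized))
print(restored.key)
```

Keeping the record JSON-serializable makes it easy to store it next to the model artifact and to index it for search.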
The dataset metadata you can log within the model store includes:
- The version of the dataset.
- URI or location (file path) of the dataset you trained the model on.
- Dataset owner.
- Dataset source.
All these allow you to track the ingested data in more detail. They will also enable you to verify that the dataset used during training is still the same dataset at a later point in time, which is critical for reproducibility.
The artifacts you can find in the model store include the model artifacts (saved models in various formats such as ONNX, pickle, protobuf, etc.) and container artifacts such as Docker images created during the packaging of the models for testing and deployment.
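A simple way to check that a dataset hasn't changed is to log a content hash next to its version and location and compare it later. The sketch below assumes a single-file dataset and a `sha256` field in the logged metadata; neither is prescribed by any particular store.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dataset_unchanged(path: Path, logged_sha256: str) -> bool:
    return file_sha256(path) == logged_sha256


with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "train.csv"
    data_path.write_bytes(b"age,churned\n34,0\n")

    # Metadata logged at training time.
    logged = {"version": "1", "uri": str(data_path), "sha256": file_sha256(data_path)}
    same_before = dataset_unchanged(data_path, logged["sha256"])

    # Someone edits the file in place without bumping the version.
    data_path.write_bytes(b"age,churned\n34,1\n")
    same_after = dataset_unchanged(data_path, logged["sha256"])

print(same_before, same_after)
```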
Model documentation is a crucial part of the reproducibility and auditing of models. Documenting and sharing context on how a model makes predictions can enable collaboration so teammates can reproduce the results based on the examples and aid transparent model reporting. The documentation will also contain details on the model’s intended use, the biases that the model may likely exhibit, and disclaimers about the types of inputs the model is expected to perform well on. Some stores are testing out model cards for model documentation and reporting.
Within a model store, you will find a searchable catalog of models so that assets can be found easily, especially when there are hundreds or thousands of models, whether or not they are running in production. In some model stores, this search is done through a graphical user interface (GUI) or an API so the feature can integrate with other systems.
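A catalog search API could look roughly like the sketch below, which supports free-text search and exact-match filters over a small in-memory catalog. The entries and the `search` signature are invented for illustration.

```python
CATALOG = [
    {"name": "churn", "version": 1, "stage": "retired",
     "dataset": "customers-2021", "description": "logistic regression baseline"},
    {"name": "churn", "version": 2, "stage": "production",
     "dataset": "customers-2022", "description": "gradient boosted trees"},
    {"name": "fraud", "version": 1, "stage": "development",
     "dataset": "customers-2022", "description": "convolutional network on transaction images"},
]


def search(catalog, text="", **filters):
    """Return (name, version) pairs matching free text and exact-match filters."""
    text = text.lower()
    hits = []
    for entry in catalog:
        matches_text = text in entry["name"].lower() or text in entry["description"].lower()
        matches_filters = all(entry.get(key) == value for key, value in filters.items())
        if matches_text and matches_filters:
            hits.append((entry["name"], entry["version"]))
    return hits


trained_on_2022 = search(CATALOG, dataset="customers-2022")  # which models used this dataset?
in_production = search(CATALOG, stage="production")          # what is live right now?
boosted = search(CATALOG, text="boosted")                    # free-text lookup
print(trained_on_2022, in_production, boosted)
```

The dataset filter is the query you would run when a data issue is discovered and you need to know which models were trained on the affected data.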
Functionalities of a model store
What are the key functionalities and must-haves of a model store?
Integration into an experiment management system
Tracking your experiments is key if you want to increase productivity and ensure the reproducibility of your model results so they’re consistent across runs. Model stores should be coupled with experiment management systems to track experiments, so metadata, artifacts, and other information are collected into the store. This integration also helps to facilitate the decision about releasing the model to production.
Within the model store, there should be a collection of infrastructure that’s needed to run the model integrated with other parts of the application or service. This functionality usually includes application servers, databases, caches, and other systems integrating with the prediction service (containing the model). The serialized models are compiled and deployed to the staging environment.
The staging environment is used for testing in an integrated environment before releasing the model to production, which is usually the environment the customers use and other services they interact with. From the staging environment, we can perform quality assurance tests, proper governance, and approval workflows.
Proper governance and approval workflows will also require sharing and collaboration in the model store. Teams and stakeholders can easily access the registered model and edit and update information on it. It also should include an access control feature where the owner of a registered model in the store can give full access, partial access, read or write access to a user. This functionality should also come with an activity log that tracks and records changes in the model store and the necessary information.
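To illustrate access control together with an activity log, here is a minimal sketch. The `GovernedModel` class, the permission names, and the users are hypothetical; a real store would delegate authentication to your identity provider and persist the log.

```python
from datetime import datetime, timezone


class AccessDenied(PermissionError):
    pass


class GovernedModel:
    """A registered model with per-user permissions and an activity log."""

    def __init__(self, name: str, owner: str):
        self.name = name
        self.description = ""
        self._grants = {owner: {"read", "write", "grant"}}
        self.activity_log = []

    def _check(self, user: str, permission: str) -> None:
        allowed = permission in self._grants.get(user, set())
        # Every attempt is logged, including the denied ones.
        self.activity_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": permission,
            "allowed": allowed,
        })
        if not allowed:
            raise AccessDenied(f"{user} lacks '{permission}' on {self.name}")

    def grant(self, granted_by: str, user: str, *permissions: str) -> None:
        self._check(granted_by, "grant")
        self._grants.setdefault(user, set()).update(permissions)

    def read(self, user: str) -> str:
        self._check(user, "read")
        return self.description

    def update_description(self, user: str, text: str) -> None:
        self._check(user, "write")
        self.description = text


model = GovernedModel("churn", owner="ana")
model.grant("ana", "ben", "read")  # partial access: read only
model.update_description("ana", "Predicts 30-day churn.")
ben_view = model.read("ben")

try:
    model.update_description("ben", "edited without permission")
    ben_write_allowed = True
except AccessDenied:
    ben_write_allowed = False

print(ben_view, ben_write_allowed, len(model.activity_log))
```

Granting only the permissions a user needs is the least-privilege principle applied to model information, and the log gives reviewers and auditors a trail of who did what.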
Automation is another functionality of the model store because, as we argued earlier, many project workflows are manual and repetitive. This functionality brings about the need for a tool that can automate the model lifecycle after you develop the model, so the review, approval, release, and roll-back steps in the lifecycle are seamless.
Model stores should use APIs and webhooks to trigger downstream actions when previous steps are completed. Earlier in this article, you saw the journey of a model after experimentation. Downstream activities that need to be triggered when the experiment management system sends the artifacts and metadata are crucial for the model store.
For example, when the experiment management system passes the model artifacts, they must be automatically registered. When a user moves a model to the staging environment, it is containerized so it can be tested locally before approval and release (deployment).
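The triggering mechanism can be sketched as a tiny in-process event bus. In a real setup the events would typically arrive as webhook HTTP calls, and the handlers would call your registry and container build tooling; the event names, payload fields, and image tag format here are assumptions for illustration.

```python
from collections import defaultdict


class EventBus:
    """Minimal publish/subscribe helper standing in for webhooks."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name):
        def decorator(handler):
            self._handlers[event_name].append(handler)
            return handler
        return decorator

    def emit(self, event_name, payload):
        # Call every handler subscribed to this event and collect the results.
        return [handler(payload) for handler in self._handlers[event_name]]


bus = EventBus()
registered = {}   # model name -> list of artifact URIs, one per version
containers = []   # image tags "built" for staging


@bus.on("artifacts_received")
def register_model(payload):
    versions = registered.setdefault(payload["name"], [])
    versions.append(payload["artifact_uri"])
    return len(versions)  # the version number assigned on registration


@bus.on("moved_to_staging")
def build_container(payload):
    image = f"{payload['name']}:{payload['version']}"  # hypothetical image tag
    containers.append(image)
    return image


version = bus.emit("artifacts_received", {"name": "churn", "artifact_uri": "runs/123/model.bin"})[0]
image = bus.emit("moved_to_staging", {"name": "churn", "version": version})[0]
print(version, image)
```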
You can configure how some model stores load models to the prediction service. For instance, as discussed earlier, you can load models to suitable clusters in the prediction service or use a particular deployment strategy (like canary deployment or A/B deployment). This configurability also matters when deploying large models that require specific accelerators to work effectively in the production environment.
Another core functionality of model stores is the integration with the production environment. As you saw earlier, if you use an automated pipeline that includes CI/CD workflows, delivering well-tested models is crucial for keeping the production systems stable.
Reviewing various model stores
In this section, we will review various model stores so you have some options to explore. It’s worth noting that some of these tools may also go by “model registry.” We covered what is arguably a slight difference between the model store and the model registry earlier in this article.
Some of these solutions are open-source; others involve a pricing plan. We will now briefly look at five model stores you can start exploring, and in the next section, we will round up the article with how to select an appropriate model store for your needs.
Modelstore (how original, Neal!) is an open-source Python library that allows you to version, export, and save/retrieve machine learning models to and from your filesystem or a cloud storage provider (AWS or GCP).
Apart from being open-source, modelstore has some of the following features:
- Automates the versioning of your models.
- Stores your model in a structured way for accessibility.
- Collects and handles metadata about the Python runtime, dependencies, and git status of the code used to train your models.
- Supports several popular open-source machine learning libraries (the project documentation lists them).
- It can use your local file store, Google Cloud Storage, or AWS S3 Buckets for your backend.
- It provides a model packaging feature as well as an artifact handling feature.
- Actively maintained with other support and model store functionalities being added.
As of the time of this writing, modelstore doesn’t support some core features of model stores and has some downsides that include:
- Features are currently accessed through APIs as there is no GUI access yet.
- Model documentation and reporting features are still a work-in-progress with this library.
It’s worth noting that one contributor currently manages modelstore, so you need to take that into account if you want to adopt this solution.
To get started with modelstore, check out the quickstart guide.
ClearML Model Stores
ClearML states on their website that it’s the only open-source tool to manage all your MLOps in a unified and robust platform providing collaborative experiment management, powerful orchestration, easy-to-build data stores, and one-click model deployment.
One exciting ClearML feature is the MLOps Open Architecture Stack, which comes with an open-sourced MLOps engine you can build custom applications on if the ClearML applications don’t solve your problem.
While a model store is not part of the ClearML application stack, you can build a custom model store on the open-source MLOps engine that pretty much provides the core functionalities of what a model store should provide. The only issues, as this video by Ariel Biller (Evangelist at ClearML) explains, are:
- You can’t create custom labels for your metadata.
- You can’t use custom serializations (such as ONNX) during model packaging. The engine uses the serialized file resulting from the framework you used during development, for example, protobuf or .h5 formats for TensorFlow and Keras models. This limitation may cause compatibility issues (such as with model interoperability) in production.
You can get started with model stores in ClearML by watching this video with a follow-along Google Colab demo.
MLflow Model Registry
MLflow is an open-source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. The MLflow Model Registry component is a centralized model store, set of APIs, and UI, to collaboratively manage the entire lifecycle of an MLflow Model across data teams.
Some of the features of the MLflow Model Registry include:
- Provides a central repository to store and manage uniquely named registered models for collaboration and visibility across data teams.
- Provides a UI and API for registry operations and a smooth workflow experience.
- Allows multiple versions of the model in different stages (such as staging and production).
- Allows transition and model promotion schemes across different environments and stages. Models can be moved from staging, loaded to the production environment, rolled back, and retired or archived.
- Integrated with CI/CD pipelines to quickly load a specific model version for testing, review, approval, release, and rollback.
- Model lineage tracking feature that provides model description, lineage, and activity.
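The promotion scheme described in this list can be pictured as a small state machine. The sketch below is a generic illustration and deliberately does not use MLflow's API; the `StagedModel` class and its rules (one production version at a time, rollback to the previously promoted version) are assumptions made for this example.

```python
class StagedModel:
    """Tracks which version of a model is in which stage."""

    def __init__(self, name: str):
        self.name = name
        self.stages = {}    # version -> "staging" | "production" | "archived"
        self.history = []   # versions in the order they were promoted

    def register(self, version: int) -> None:
        self.stages[version] = "staging"

    def production_version(self):
        for version, stage in self.stages.items():
            if stage == "production":
                return version
        return None

    def promote(self, version: int) -> None:
        if self.stages.get(version) != "staging":
            raise ValueError(f"version {version} is not in staging")
        current = self.production_version()
        if current is not None:
            self.stages[current] = "archived"  # only one production version at a time
        self.stages[version] = "production"
        self.history.append(version)

    def rollback(self) -> int:
        if len(self.history) < 2:
            raise ValueError("no earlier production version to roll back to")
        bad = self.history.pop()
        previous = self.history[-1]
        self.stages[bad] = "archived"
        self.stages[previous] = "production"
        return previous


model = StagedModel("churn")
model.register(1)
model.promote(1)
model.register(2)
model.promote(2)
after_promotion = dict(model.stages)

restored_version = model.rollback()
after_rollback = dict(model.stages)
print(after_promotion, restored_version, after_rollback)
```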
The MLflow registry also provides support for metadata stores and artifact stores. The metadata store can be used anywhere PostgreSQL, MySQL, or SQLite is available. The artifact store supports local filesystem backends and some hosted backends such as S3 storage, Azure Blob storage, Google Cloud storage, and DBFS artifact repository. There’s also a managed MLflow plan available that you may want to look into.
You can get started with MLflow Model Registry with this workshop and look at the documentation.
Free and paid solutions
Neptune is a metadata store for MLOps, built for research and production teams that run many experiments.
It gives you a central place to log, store, display, organize, compare, and query all metadata generated during the machine learning lifecycle.
Individuals and organizations use Neptune for experiment tracking and model registry to control their experimentation and model development.
Some of the core features of Neptune include:
- Logging all types of machine learning model-related metadata. You can find a complete list of the metadata types that you can log in the documentation.
- Working where you work because Neptune comes with 25+ integrations with Python libraries popular in machine learning, deep learning, and reinforcement learning.
Neptune is more of a metadata store than an actual artifact store, whereas a full model store manages and handles both. Neptune works for both your experimentation and production use cases.
Neptune is free for one user and paid for teams (a team trial is available). You can start using it for free in about 5 minutes.
Verta.ai uses a suite of tools to empower data science and machine learning teams to rapidly develop and deploy production-ready models, thereby enabling efficient integration of ML into various products. One of the tools in their platform is the Model Registry, a central repository to find, publish, collaborate on and use production-ready models.
Verta.ai integrates with your model governance workflows and deployment systems to guarantee a reliable and automated release process. To get started with Verta’s Model Registry tool, you can sign up for a free trial.
Some other options you might want to look into are:
- Kubeflow Machine Learning Metadata Service (with a MySQL database as the backend) and Artifact Store for pipeline run.
- Weights & Biases.
How to select an appropriate model store for your MLOps projects
There are a few considerations when you want to select a model store for your project. Let’s review them.
Size of the team and number of models being deployed
If you’re a team of one or a small team just starting out, a model store understandably isn’t a top priority for your platform stack. Typically, you wouldn’t need a model store if a small team deploys a handful of models. Still, as your team keeps growing and the number of models it develops and deploys keeps increasing, you should consider adopting a model store to ease reproducibility and serving.
You want to make sure you optimize your team’s productivity by reducing redundant processes. These processes include setting up a storage bucket, manually uploading artifacts, configuration files, and packaging models each time a team decides a model is ready for production.
If you’re a small team and want to get started with a model store, try a free option with the most basic functionalities, such as integration with the development environment, metadata and artifact handling, a catalog feature, and a packaging feature.
For larger teams deploying many models where collaboration and visibility are essential, consider selecting a model store that solves the pain-point around the faster rollout of your models, improved governance plus approval workflows, and organization-wide model visibility and discovery.
Another consideration is the type of models you deploy to production and whether they must be documented and audited, especially in heavily regulated industries.
Conclusion and resources
Throughout this guide, you have learned what model stores are and why you may need them. You learned that they are a model-first approach to managing and deploying machine learning models.
They help centralize organization-wide ML models, so they’re easier to reproduce and, in some cases, reuse. You also learned where model stores sit in the MLOps stack: they integrate with your experiment management system and support your model launch process.
Model stores tie reproducibility, governance, security, and resilient serving together into your project workflow.
Resources and references
- Simplifying MLOps with Model Registry – YouTube
- Asset Management for Machine Learning – Georg Hildebrand – YouTube
- Operator AI (substack.com)
- [MLOps] The Clear SHOW – S02E11 – DIY Strikes Back! Building the Model Store! – YouTube
- ML Metadata Store: What It Is, Why It Matters, and How to Implement It – neptune.ai
- Continuous Integration and Deployment for Machine Learning Online Serving and Models | Uber Engineering Blog
- Model artifacts: the war stories – by Neal Lathia – Operator AI (substack.com)
- operatorai/modelstore: modelstore is a Python library that allows you to version, export, and save a machine learning model to your filesystem or a cloud storage provider. (github.com)
- Google Cloud MLOps: Continuous delivery and automation pipelines in machine learning