Best Machine Learning Model Management Tools That You Need to Know

Developing your model is an essential part of working on ML projects. And it’s usually a tough challenge. 

Every data scientist has to face it, along with difficulties like losing track of experiments. These difficulties tend to be both annoying and hard to spot, which can leave you confused from time to time.

That’s why it’s good to streamline the process of managing your ML model, and luckily there are several tools for that. These tools can help with things like:

  • Experiment tracking
  • Model versioning
  • Measuring inference time
  • Team collaboration
  • Resource monitoring

So it’s common sense and good practice to find and use tools suitable for your projects.

In this article, we’ll explore the landscape of model management tools. I’ll try to show you the variety of tools and highlight what’s good about them.

We’ll cover:

  • Criteria for choosing a model management tool
  • Model management tools: Neptune, Amazon SageMaker, Azure Machine Learning, Domino Data Science Platform, Google Cloud AI Platform, Metaflow, MLflow


ML model management tools

There are plenty of ML tools for model management. To understand how they’re different, we need some criteria to evaluate each of them.

Criteria

We’ll analyze each tool in this article based on the following criteria:

  1. Data, notebook, model, code, and environment versioning and experiment tracking – the most important criterion in my opinion, as it covers the general concept of model management.
  2. Official documentation, availability of simple tutorials and examples – the crucial tutorials are the ones that show how to install, set up, and use the tool.
  3. Resource monitoring and logging.
  4. Friendly and customizable UI – a must-have feature that makes the tool convenient to navigate.
  5. Number of tool’s integrations – whether the tool is a standalone platform with different libraries and frameworks integrated into it, or a library itself that can be used with various frameworks, environments, and libraries, for example R, Google Colab, TensorBoard, scikit-learn, and Keras.
  6. Stable and active community and regular updates – model management is a crucial part of the project’s success. That’s why we want to use a well-known, tested, and up-to-date tool.
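
To make the first criterion more concrete, here is a minimal sketch of what an experiment tracker records for each run: parameters, metrics, content hashes of the files involved, and basic environment details. It is not the API of any tool in this article. The names `record_run` and `best_run`, the JSON layout, and the stand-in dataset file are all hypothetical, and the code uses only the Python standard library.

```python
import hashlib
import json
import platform
import sys
import tempfile
import uuid
from pathlib import Path


def file_sha256(path):
    """Content hash used as a cheap version identifier for code or data."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_run(log_dir, params, metrics, tracked_files):
    """Write one JSON record describing a single training run."""
    record = {
        "run_id": f"run-{uuid.uuid4().hex}",
        "params": params,
        "metrics": metrics,
        # Hashes show later whether code or data changed between runs.
        "file_versions": {str(p): file_sha256(p) for p in tracked_files},
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
    }
    out_path = Path(log_dir) / f"{record['run_id']}.json"
    out_path.write_text(json.dumps(record, indent=2))
    return record


def best_run(log_dir, metric):
    """Load every stored run and return the one with the highest metric."""
    runs = [json.loads(p.read_text()) for p in Path(log_dir).glob("run-*.json")]
    return max(runs, key=lambda r: r["metrics"][metric])


# Usage with a throwaway directory and a stand-in "dataset" file.
log_dir = Path(tempfile.mkdtemp())
data_file = log_dir / "train.csv"
data_file.write_text("x,y\n1,2\n")

record_run(log_dir, {"lr": 0.1}, {"accuracy": 0.88}, [data_file])
record_run(log_dir, {"lr": 0.01}, {"accuracy": 0.92}, [data_file])
winner = best_run(log_dir, "accuracy")
print(winner["params"], winner["metrics"])
```

The tools reviewed below do this kind of bookkeeping for you and add a UI, collaboration features, and storage on top, which is where the remaining criteria come in.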

We’ll explore the following model management tools:

  1. Neptune
  2. Amazon SageMaker
  3. Azure Machine Learning
  4. Domino Data Science Platform
  5. Google Cloud AI Platform
  6. Metaflow
  7. MLflow

Let’s jump in.

1. Neptune

Neptune is a lightweight tool that fits most workflows. Its primary focus is experiment management.

Neptune can be used to:

  • Track experiments – log and visualize your runs
  • Version models – log all the model training artifacts (model binary, parameters, training scripts, and more) for any training job you run
  • Record data exploration – working with different notebook checkpoints
  • Organize teamwork – manage your team and organize experiments
  • Compare and analyze experiments and model training runs with low effort

To install Neptune, simply use a pip command or refer to the installation guides:

pip install neptune-client

Let’s see how Neptune fits the criteria:

  1. Data, notebook, model, code, and environment versioning and experiment tracking:

Neptune is focused on experiment tracking. This makes it great for model management, as it supports versioning and tracks every aspect of an ML experiment.

  2. Official documentation, availability of simple tutorials and examples:

It’s super easy to start using Neptune. There’s a variety of simple examples and tutorials, along with video guides that seem to cover most aspects of the tool. The official documentation seems complete and easy to navigate.

  3. Resource monitoring and logging:

Neptune monitors resources and logs pretty much everything that you do. It’s unlikely that you will lose any valuable piece of information when you use Neptune. 

  4. Friendly and customizable UI:

Neptune has a friendly UI that looks nice and is very informative and convenient to use. The customizable dashboards are pretty neat.

  5. Number of tool’s integrations:

The number of supported tools and frameworks is quite big, so you shouldn’t have any issues with integrations. The full list is available in Neptune’s documentation.

  6. Stable and active community and regular updates:

Neptune has a stable and active community, along with an active blog and social network pages where you can find articles and news updates. Most blog contributors are part of Neptune’s community. You can also find a lot of independent articles featuring Neptune on sites like Medium or Kaggle.

As for updates, Neptune is actively maintained.

Overall, Neptune is a powerful model management tool. It has a great community and complete documentation that will help you start working. 



2. Amazon SageMaker

Amazon SageMaker is Amazon’s fully managed service for machine learning and deep learning; its SageMaker Studio component provides an integrated development environment (IDE). It covers the entire ML lifecycle.

That is why SageMaker can be used to:

  1. Prepare the data
  2. Build an algorithm
  3. Train & tune a model
  4. Deploy & manage the model

SageMaker has an enormous number of built-in tools that cover every single stage of the ML lifecycle. If you want to learn all about it, check the official website – there are honestly too many things in there to cover in this article.

This broad set of built-in tools makes SageMaker a bit difficult to get started with. You have to learn it before you can unlock its full potential.

To start working with Amazon SageMaker, refer to the official guide.

Let’s see how Amazon SageMaker fits the criteria:

  1. Data, notebook, model, code, and environment versioning and experiment tracking:

Amazon SageMaker has all you need to use it for ML model management. 

  2. Official documentation, availability of simple tutorials and examples:

From my point of view, the official documentation of Amazon SageMaker could be better. It has everything you need to start using SageMaker, but it’s not user-friendly and is hard to navigate, which left me a bit confused. Still, once you find what you’re looking for, it will be enough to answer your questions.

  3. Resource monitoring and logging:

Resource monitoring and logging are supported, and the developers keep improving these capabilities with frequent updates.

  4. Friendly and customizable UI:

The UI is friendly and can be easily customized. It looks really nice and provides a nice set of animations that make your workflow a bit brighter.

  5. Number of tool’s integrations:

Amazon SageMaker is a standalone tool that has other tools and frameworks integrated into it. As of today, it supports pretty much everything you need to work as a data scientist.

  6. Stable and active community and regular updates:

There is no doubt that SageMaker is regularly updated. You’re likely to find new features every month or even more often. I’m a bit concerned about SageMaker’s community – despite being a powerful ML lifecycle tool, SageMaker isn’t super popular yet.

Let’s make this clear. Amazon SageMaker is really powerful. 

The main issue with it may be the entry threshold, which is rather high. You need to learn the tool before you can use it effectively. If you’re strapped for time and only need some functionalities from SageMaker, it might be better to choose another tool.

3. Azure Machine Learning

Azure Machine Learning is similar to Amazon SageMaker. It’s a cloud-based environment that you can use to train, deploy, automate, manage, and track machine learning models. It aims to accelerate the end-to-end machine learning lifecycle.

Azure ML has tools for every stage of the machine learning lifecycle. 

To start working with Azure Machine Learning, please refer to its starting page and follow the instructions.

Let’s see how Azure Machine Learning fits the criteria:

  1. Data, notebook, model, code, and environment versioning and experiment tracking:

Azure Machine Learning has everything you might need from a good model management tool. 

  2. Official documentation, availability of simple tutorials and examples:

The official documentation seems complete and well structured. I was able to find everything I needed in the blink of an eye, and had no trouble setting up thanks to great starting tutorials and how-to guides.

  3. Resource monitoring and logging:

Resource monitoring and logging are supported, but you will not find anything outstanding here.

  4. Friendly and customizable UI:

To tell the truth, I don’t like Azure’s UI. It’s subjective, but I feel like the UI is overloaded with buttons and information, and there isn’t much you can customize.

  5. Number of tool’s integrations:

Azure Machine Learning has everything you need to start working. It’s a standalone platform that integrates various development tools (PyCharm, Jupyter, Visual Studio), languages (R, Python), and frameworks (TensorFlow, MXNet, PyTorch, Keras, and others).

  6. Stable and active community and regular updates:

Azure Machine Learning is a popular Microsoft tool used by many companies as the main working tool. It’s updated almost every month.

To sum up, Azure Machine Learning seems like a slightly more structured and popular version of Amazon SageMaker. They’re pretty similar, but I would prefer Azure simply because it’s much easier to start working with.

4. Domino Data Science Platform

Domino’s Data Science Platform philosophy is the automation of DevOps for data science. With this approach, you spend more time on research and test more ideas, faster.

Domino positions itself at the center of the ML ecosystem. It works with an expansive list of industry-leading tools and technologies and a wide range of data sources, languages, IDEs, tools, and libraries.

To start using Domino, refer to the sign-up page.

Let’s see how Domino Data Science Platform fits the criteria:

  1. Data, notebook, model, code, and environment versioning and experiment tracking:

You won’t face any problems here. Domino’s philosophy implies solid model management functionality, and Domino is really good at it. You can be sure you won’t lose track of your experiments and will be able to reproduce the best result.

  2. Official documentation, availability of simple tutorials and examples:

It’s relatively easy to start using Domino Data Science Platform thanks to various simple examples and tutorials. The official documentation seems complete and convenient to navigate through.

  3. Resource monitoring and logging:

Domino monitors the computer’s resource usage but has quite limited logging capabilities. You should definitely check the documentation first to find out if, for example, logging audio or video is supported.

  4. Friendly and customizable UI:

I really like Domino’s UI. It feels well-structured and has a great minimalistic design while being really informative. As for customization, I wasn’t impressed: there are tools with wider capabilities.

  5. Number of tool’s integrations:

As a standalone platform, Domino has everything you might think of. Check the full list if you’re interested in details.

  6. Stable and active community and regular updates:

Domino is regularly updated and has a great business blog. So, if you’re interested in the business, social, and technology side of the data science industry, you might want to check it out. Still, I feel like Domino is relatively unknown despite being a nice, community-friendly tool.

Overall, Domino Data Science Platform is a great tool. It doesn’t have any major disadvantages, except maybe limited logging. So, if logging isn’t crucial for you, you should consider using Domino Data Science Platform as a model management tool for your next ML project.

5. Google Cloud AI Platform

Google Cloud AI Platform is an end-to-end tool for the machine learning lifecycle. It includes a variety of functions that support each stage of the lifecycle.

Google Cloud AI Platform includes:

  • An overall dashboard
  • AI Hub
  • Data labeling
  • Notebooks, jobs, model management
  • Model deployment

To start using Google Cloud AI Platform, refer to the sign-up page.

Let’s see how Google Cloud AI Platform fits the criteria:

  1. Data, notebook, model, code, and environment versioning and experiment tracking:

Google Cloud AI Platform is a good model management tool. However, it focuses on the entire ML lifecycle, so check the documentation first to make sure that Google Cloud has the exact model management capabilities that you need.

  2. Official documentation, availability of simple tutorials and examples:

The documentation is not just complete. It seems almost perfect as it provides a comprehensive step-by-step guide containing multiple simple examples for each stage of the lifecycle. It will be easy for newcomers to start working with this tool.

  3. Resource monitoring and logging:

Resource monitoring is supported but logging is limited.

  4. Friendly and customizable UI:

The UI is customizable, but I don’t like it much. It seems overloaded and reminds me of the Azure Machine Learning UI.

  5. Number of tool’s integrations:

Google Cloud AI Platform has everything you need to feel comfortable when working on an ML project. Moreover, notebooks are integrated with Google Colab, where you can run them for free.

  6. Stable and active community and regular updates:

The tool is maintained and has a nice AI Hub. The Hub includes a number of public resources including Kubeflow pipelines, notebooks, services, TensorFlow modules, VM images, trained models, and technical guides. Public data resources are available for image, text, audio, video, and other types of data. All that makes Google Cloud AI Platform a well-known and respected tool.

To sum up, Google Cloud AI Platform is a nice end-to-end platform. When it comes to model management, it really does impress, despite having limited logging. If you want a tool for model management only, Google Cloud AI Platform might come with too much overhead.

6. Metaflow

Metaflow is a Python-friendly, code-based workflow system. Just like most tools in this article, it’s specialized for machine learning lifecycle management. 

Metaflow helps you design your workflow as a directed acyclic graph, run it at scale, and deploy it to production. It versions and tracks all your experiments and data automatically. 
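
The idea of a workflow as a directed acyclic graph can be sketched without Metaflow itself. The snippet below is a library-agnostic illustration built on the standard library’s `graphlib` module (Python 3.9+). It is not Metaflow’s API: the step functions, the `dependencies` mapping, and the `run_workflow` helper are assumptions made for this example. Each step runs only after the steps it depends on, and every step’s output is kept so it can be inspected later, which is roughly the behavior that automatic tracking gives you.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def load(artifacts):
    """Stand-in for a data loading step."""
    return [1.0, 2.0, 3.0, 4.0]


def train(artifacts):
    """Stand-in for training: the 'model' is just the mean of the data."""
    data = artifacts["load"]
    return {"mean": sum(data) / len(data)}


def evaluate(artifacts):
    """Largest absolute deviation of the data from the 'model'."""
    data, model = artifacts["load"], artifacts["train"]
    return max(abs(x - model["mean"]) for x in data)


# Each step maps to the set of steps it depends on.
dependencies = {"load": set(), "train": {"load"}, "evaluate": {"load", "train"}}
steps = {"load": load, "train": train, "evaluate": evaluate}


def run_workflow(steps, dependencies):
    """Run steps in dependency order, keeping every step's output."""
    artifacts = {}
    for name in TopologicalSorter(dependencies).static_order():
        artifacts[name] = steps[name](artifacts)
    return artifacts


artifacts = run_workflow(steps, dependencies)
print(artifacts["train"], artifacts["evaluate"])
```

If the dependencies contained a cycle, `TopologicalSorter` would raise a `CycleError`, which is why the graph has to be acyclic.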

To install Metaflow, simply use a pip command:

pip install metaflow

Let’s see how Metaflow fits the criteria:

  1. Data, notebook, model, code, and environment versioning and experiment tracking:

As mentioned above, Metaflow does versioning and tracking automatically.

  2. Official documentation, availability of simple tutorials and examples:

The official documentation seems well structured but incomplete. It has some simple examples and valuable tutorials, but doesn’t provide a full-picture overview.

  3. Resource monitoring and logging:

As far as I’m aware, it doesn’t have a resource monitoring feature, and logging is pretty limited.

  4. Friendly and customizable UI:

Unfortunately, it doesn’t have a graphical user interface like you see in most of the tools listed in this article. 

  5. Number of tool’s integrations:

Metaflow was open-sourced by Netflix and AWS in 2019. It can integrate with Amazon SageMaker, Python-based machine learning and deep learning libraries, and big data systems.

  6. Stable and active community and regular updates:

Metaflow was originally developed at Netflix to address the needs of its data scientists who work on demanding real-life data science projects. Since becoming open source, Metaflow has remained a niche tool that is not commonly used. However, it is maintained and updated. For example, Metaflow for R was released in August 2020.

Overall, Metaflow is a relatively unknown tool without a UI. For me, UI is an important part of a model management tool. You might not need it, but if you’re like me and want a UI, I don’t recommend using Metaflow.

7. MLflow

MLflow is an open-source machine learning lifecycle management platform.

MLflow helps you track an experiment, organize it, describe it for your teammates, and pack it into a machine learning model. This approach enables scalability from one person to a big team, or even an organization. It works even better for a single user. 

To install MLflow, simply use a pip command:

pip install mlflow

Let’s see how MLflow fits the criteria:

  1. Data, notebook, model, code, and environment versioning and experiment tracking:

MLflow is a good model management tool right now. It has nice experiment tracking and versioning implemented. Also, there is a model packaging functionality with the model registry.
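
To show what a model registry does conceptually, here is a small standard-library sketch. It is not MLflow’s API: the `ModelRegistry` and `ModelVersion` classes, the stage names, and the byte strings standing in for model binaries are all hypothetical. The sketch assigns an incrementing version number to every registered model, stores a content hash of the binary, and lets exactly one version hold a given stage at a time.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    version: int
    sha256: str
    metrics: dict
    stage: str = "none"  # hypothetical stages: none, staging, production


@dataclass
class ModelRegistry:
    models: dict = field(default_factory=dict)

    def register(self, name, model_bytes, metrics):
        """Store a new version; version numbers start at 1 per model name."""
        versions = self.models.setdefault(name, [])
        entry = ModelVersion(
            version=len(versions) + 1,
            sha256=hashlib.sha256(model_bytes).hexdigest(),
            metrics=metrics,
        )
        versions.append(entry)
        return entry

    def promote(self, name, version, stage):
        """Move a version to a stage, demoting the previous holder."""
        for entry in self.models[name]:
            if entry.stage == stage:
                entry.stage = "none"
        self.models[name][version - 1].stage = stage

    def get(self, name, stage):
        """Return the version currently assigned to the given stage."""
        return next(e for e in self.models[name] if e.stage == stage)


registry = ModelRegistry()
registry.register("churn", b"weights-v1", {"auc": 0.81})
registry.register("churn", b"weights-v2", {"auc": 0.86})
registry.promote("churn", 1, "production")
registry.promote("churn", 2, "production")

current = registry.get("churn", "production")
print(current.version, current.metrics)
```

A real registry also persists this information and links each version back to the experiment run that produced it.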

  2. Official documentation, availability of simple tutorials and examples:

The official documentation is regularly updated and covers both completed and experimental parts of the tool. It is structured yet a bit confusing, as the experimental part changes frequently. The documentation has simple examples but lacks comprehensive tutorials. However, because MLflow is open-source, there are many third-party guides, tutorials, and blog posts about it, which really helps to fill the gaps.

  3. Resource monitoring and logging:

As of today, resource monitoring is not supported and logging is limited. Still, there is experimental logging functionality that can be used quite effectively. The tool is regularly updated, so please check the logging part of the official documentation to see if something has changed.

  4. Friendly and customizable UI:

The MLflow web interface is quite nice but can’t be considered a final product. I think that’s why it’s a bit overloaded. Still, it’s great to have at least some sort of UI.

  5. Number of tool’s integrations:

Considering integrations, it feels like that is what the team is currently working on, as the number has grown lately. Still, it’s worth mentioning that some integrations are not there yet. You should probably double-check every integration you need just to be sure.

  6. Stable and active community and regular updates:

MLflow is an open-source project with a large user base, so it is likely to be maintained and updated for a long time. Moreover, it is widely considered one of the most commonly used experiment tracking and model registry tools.

To sum up, MLflow definitely deserves your attention. It does both experiment tracking and model registry and has a large community. Unfortunately, there are some limitations such as the UI, user management, and logging. Still, it is a great open-source option chosen by many data scientists.

Summary

Let’s summarize it all so you’re able to pick the right tool for your next project.

Model management is all about developing a good machine learning model. Every data scientist knows that it’s not straightforward, but rather an iterative process with multiple steps and a lot of creativity.

That is why it’s essential to have a tool that will help you with this challenge by being good at experiment tracking, versioning, logging, and visualization.

As you may have noticed, most of the tools mentioned above focus on the entire machine learning lifecycle. They are somewhat like project management tools that include model management as one of their features.

They can be a great fit for your machine learning project, unless you prefer to use your own tools of choice for different stages of the lifecycle.

Also, what about pricing? I think that standalone platforms are good for big teams and organizations, but they’re unnecessary and too expensive for personal use. It’s cheaper to use something that fits your exact demands, without a ton of features you won’t use.

Every tool in this article can be used as a model management tool right now and it will be good at it. It comes down to your personal preference. Each has strengths and weaknesses, and you should check the documentation to see if a tool meets your needs.

My personal choice is Neptune: it has everything I need from a model management tool and more, it’s easy to use, and it has a great UI.

Final thoughts

In this article, we figured out what to look for when choosing model management tools, and explored different model management tools.

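If you enjoyed this post, a great next step would be to start a project with any model management tool you like. Check out the tools covered above: Neptune, Amazon SageMaker, Azure Machine Learning, Domino Data Science Platform, Google Cloud AI Platform, Metaflow, and MLflow.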

Hopefully, with this information, you will have no problems choosing the model management tool for your next project. Thanks for reading, and happy training!

NEXT STEPS

Get started with Neptune in 5 minutes

If you are looking for an experiment tracking tool you may want to take a look at Neptune. 

It takes literally 5 minutes to set up and as one of our happy users said:

“Within the first few tens of runs, I realized how complete the tracking was – not just one or two numbers, but also the exact state of the code, the best-quality model snapshot stored to the cloud, the ability to quickly add notes on a particular experiment. My old methods were such a mess by comparison.” – Edward Dixon, Data Scientist @intel

To get started follow these 4 simple steps. 

Step 1

Install the client library.

pip install neptune-client

Step 2

Connect to the tool by adding a snippet to your training code. 

For example:

import neptune

neptune.init(...) # credentials
neptune.create_experiment() # start logger

Step 3

Specify what you want to log:

neptune.log_metric('accuracy', 0.92)

for prediction_image in worst_predictions:
    neptune.log_image('worst predictions', prediction_image)

Step 4

Run your experiment as you normally would:

python train.py

And that’s it!

Your experiment is logged to a central experiment database and displayed in the experiment dashboard, where you can search, compare, and drill down to whatever information you need.
