While working on a machine learning project, getting good results from a single model-training run is one thing, but keeping all of your machine learning experiments organized and having a process that lets you draw valid conclusions from them is quite another. That’s what machine learning experiment management helps with.
In this article, I will explain why you, as a data scientist or machine learning engineer, need a tool for tracking machine learning experiments, and which software is best suited to the job.
Tools for tracking machine learning experiments – who needs them and why?
- Data Scientists: In many organizations, machine learning engineers and data scientists tend to work alone. That makes some people think that keeping track of their experimentation process is not that important as long as they can deliver that one last model. This is true to an extent, but when you want to come back to an idea, re-run a model from a couple of months ago or simply compare and visualize the differences between runs, the need for a system or tool for tracking ML experiments becomes (painfully) apparent.
- Teams of Data Scientists: A specialized tool for tracking ML experiments is even more useful for a whole team of data scientists. It lets them see what others are doing, share ideas and insights, store experiment metadata, and retrieve and analyze it whenever they need to. It makes teamwork much more efficient, prevents situations where several people work on the same task, and makes onboarding new members much easier.
- Managers/Business people: Tracking software creates an opportunity to involve other team members, such as managers or business stakeholders, in your machine learning projects. Because the work can be visualized, commented on and shared, managers and co-workers can easily track progress and cooperate with the machine learning team.
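To make the idea concrete, here is a minimal sketch of what a tracking tool records for each run: an identifier, the parameters and the resulting metrics, so that runs can be retrieved and compared later. It uses only the Python standard library. The `RunLog` class, its methods and `diff_params` are hypothetical names chosen for illustration and are not the API of any tool covered in this article.

```python
import json
import tempfile
import uuid
from pathlib import Path


class RunLog:
    """Append-only log of experiment runs stored as JSON lines (hypothetical helper)."""

    def __init__(self, path):
        self.path = Path(path)

    def log_run(self, params, metrics):
        """Store one run's parameters and metrics and return its generated id."""
        run_id = uuid.uuid4().hex[:8]
        record = {"id": run_id, "params": params, "metrics": metrics}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        return run_id

    def runs(self):
        """Load every recorded run, oldest first."""
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def best(self, metric, higher_is_better=True):
        """Return the run with the best value of the given metric, or None."""
        candidates = [r for r in self.runs() if metric in r["metrics"]]
        if not candidates:
            return None
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r["metrics"][metric])


def diff_params(run_a, run_b):
    """Return {name: (value_a, value_b)} for parameters that differ between two runs."""
    keys = set(run_a["params"]) | set(run_b["params"])
    return {
        k: (run_a["params"].get(k), run_b["params"].get(k))
        for k in sorted(keys)
        if run_a["params"].get(k) != run_b["params"].get(k)
    }


if __name__ == "__main__":
    log = RunLog(Path(tempfile.mkdtemp()) / "runs.jsonl")
    log.log_run({"lr": 0.1, "depth": 3}, {"accuracy": 0.81})
    log.log_run({"lr": 0.01, "depth": 3}, {"accuracy": 0.86})
    first, second = log.runs()
    print(log.best("accuracy")["params"])
    print(diff_params(first, second))
```

Even a log this simple answers the two questions raised above: which run was best, and what changed between two runs. Dedicated tools add what a flat file cannot easily provide, such as a shared UI, visualizations and access for the whole team.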
Here is an in-depth article about experiment management for those of you who want to learn more.
The best tools for tracking machine learning experiments (also deep learning experiments)
Here’s a comparison of the features and integrations of the 15 best experiment management tools.
This comparison table was last updated on 29 April 2020. Some information may be outdated today.
| | Neptune | Weights & Biases | Comet | Sacred + Omniboard | MLflow | TensorBoard | Guild AI | Polyaxon | TRAINS | Valohai | Pachyderm | Kubeflow | Verta.ai | SageMaker Studio | DVC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Focus | Experiment Management | Experiment Management | Experiment Management | Experiment Management | Entire Lifecycle | Experiment Management | Experiment Management | Experiment Management | Experiment Management | Entire Lifecycle | Entire Lifecycle | Run orchestration | Entire Lifecycle | Entire Lifecycle | Data Versioning |
||Free||Enterprise: NA||You pay extra on top of compute||Free|
|Free plan limitation||
||only 3 days of data available|
|Open – source||limited||limited|
|Experiment Tracking Features|
|Logging Images and Charts||limited||limited||limited||limited||limited|
|Update Finished Experiment|
|Saving Experiment Views||limited||limited|
|Fetching experiments via API|
|Can be deployed on-premise|
|Hosted version available||
on top of Databricks platform
|Scales to millions of runs|
|Dedicated User Support|
1. Neptune
Neptune is the most lightweight experiment management tool available on the market and an excellent tracking platform for any data scientist. It integrates easily with your workflow and offers an extensive range of tracking features. You can use it to track, retrieve and analyze experiments, and also to share them with your team and managers. Neptune is also very flexible: it works with many other frameworks, and its stable user interface scales to millions of runs.
- Possibility to store, retrieve and analyze a large amount of data
- Tools for efficient team collaboration and project supervision
- Jupyter notebook tracking included
2. Weights & Biases
Weights & Biases targets the most advanced deep learning teams. It lets them record experiments and visualize every part of their research. Weights & Biases was created to facilitate collaboration between data scientists and offers many useful features for that purpose, all with a well-designed user experience.
- Created for deep learning experiment tracking
- Easy integration process
- Customizable visualization and reporting tools
See the comparison between Weights & Biases and Neptune.
3. Comet
Like the tools described above, Comet was built to enable tracking of machine learning projects. The team behind it has a mission to help data scientists better organize and manage their experiments. Comet makes it easy to compare experiments, keep a record of the collected data, and collaborate with other team members.
- Quick and easy to set up on any machine
- Works well with existing ML libraries
- Safeguard IP
4. Sacred + Omniboard
“Every experiment is sacred…” as they say in the Sacred tool description. Sacred is open-source software that allows machine learning engineers to configure, organize, log and reproduce experiments. Sacred doesn’t come with its own UI, but there are a few dashboarding tools that you can connect to it, such as Omniboard, Sacredboard or Neptune. It doesn’t have the scalability of the previous tools and has not been adapted to team collaboration. However, it has great potential when it comes to individual research.
- Open-source tool
- Extensive experiment parameters customization options
- Easy integration
See the comparison between Sacred + Omniboard and Neptune.
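The configure, log and reproduce cycle that Sacred is described as supporting can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not Sacred's API: `config_fingerprint` and `run_experiment` are hypothetical names, and the "training" step is a stand-in that draws seeded noise instead of fitting a model. The assumption it demonstrates is that a run becomes reproducible once every source of randomness is derived from a seed stored in the configuration.

```python
import hashlib
import json
import random


def config_fingerprint(config):
    """Stable short hash of a config dict, usable as a run identifier."""
    # Sorting keys makes the hash independent of dictionary ordering.
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def run_experiment(config):
    """Toy 'training run': every random draw comes from the seed in the config."""
    rng = random.Random(config["seed"])
    # Stand-in for a noisy validation score; a real run would train a model here.
    noise = rng.gauss(0.0, 0.01)
    score = 1.0 - config["learning_rate"] + noise
    return {
        "config": config,
        "fingerprint": config_fingerprint(config),
        "score": score,
    }


if __name__ == "__main__":
    config = {"seed": 42, "learning_rate": 0.1}
    first = run_experiment(config)
    rerun = run_experiment(config)
    # Same config and seed, so the re-run reproduces the original score exactly.
    print(first["fingerprint"], first["score"] == rerun["score"])
```

Storing the full configuration together with its fingerprint is what lets you come back months later and re-run exactly the experiment you meant.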
5. MLflow
MLflow is an open-source platform that helps manage the whole machine learning lifecycle. This includes experimentation, but also reproducibility and deployment. Each of these three elements is represented by one MLflow component: Tracking, Projects, and Models. That means a data scientist who works with MLflow is able to track an experiment, organize it, describe it for other ML engineers and package it into a machine learning model. It has been designed to scale from one person to a big organization. However, it works best for an individual user.
- Focus on the whole lifecycle of the machine learning process
- Compatible with many additional tools and platforms
- Open interface integrated with any ML library or language
See the comparison between MLflow and Neptune.
6. TensorBoard
TensorBoard is another experiment tracking tool. It’s open-source and offers a suite of tools for visualizing and debugging machine learning models. TensorBoard is the most popular solution on the market, and so it is widely integrated with many other tools and applications. What’s more, it has an extensive network of engineers who use the software and share their experience and ideas, which makes for a powerful community ready to solve any problem. The software itself, however, is best suited for an individual user.
- Large library of pre-built tracking tools
- Integration with many other tools and applications
- Well prepared problem-solving materials and community
See the comparison between TensorBoard and Neptune.
7. Guild AI
The team behind Guild AI states that “The faster and more effective you can apply experiments, the sooner you’ll complete your work.” To keep this process well organized, they created this open-source experiment tracking software, which is best suited for individual projects. It’s lightweight and equipped with many useful features that make it easier to run, analyze, optimize and recreate machine learning experiments. What’s more, Guild AI includes a variety of analytics tools that make comparing experiments much easier.
- Automated machine learning process
- Integrated with any language and library
- Remote training and backup possibility
See the comparison between Guild AI and Neptune.
8. Polyaxon
Polyaxon is a platform that focuses on both the whole lifecycle management of machine learning projects and the facilitation of ML team collaboration. It includes a wide range of features, from tracking and optimization of experiments to model management and regulatory compliance. The main goal of its developers is to maximize results and productivity while saving costs. It’s worth mentioning, however, that Polyaxon needs to be integrated into your infrastructure or cloud before it’s ready to use.
- Integrated with the most popular deep learning frameworks and ML libraries
- Designed to serve different user groups, including data scientists, team leads and architects
- Team collaboration possibilities
See the comparison between Polyaxon and Neptune.
9. Trains
Trains was built to track the “glorious but messy process of training production-grade deep learning models”, as stated by its creators. The main focus of the software is to help keep track of machine learning and deep learning experiments in an effortless yet effective way. Trains is an open-source platform that is still in the beta stage, but it is being constantly developed and upgraded.
- Quick and easy implementation process
- Possibility to boost team collaboration
- Useful features designed to track the experiment process and save data to one centralized server
10. Valohai
Valohai has been designed with data scientists in mind, and its main benefit is that it makes the model building process faster. It does so through large-scale automation, but it needs to be integrated with your infrastructure or private cloud first. Valohai is compatible with any language or framework, as well as many different tools and apps. The software is also teamwork-oriented and has many features that facilitate collaboration.
- Significant acceleration of the model building process
- Helpful customer service and monthly checkup
- Focused on the entire lifecycle of machine learning
11. Pachyderm
Pachyderm is a tool that lets its users control an end-to-end machine learning cycle. From data lineage, through building and tracking experiments, to scalability options, Pachyderm covers it all. The software is available in three versions: Community Edition (open-source and usable anywhere), Enterprise Edition (a complete version-controlled platform) and Hub Edition (still in beta, combining characteristics of the other two). It needs to be integrated with your infrastructure or private cloud, so it’s not as lightweight as some of the other tools mentioned before.
- Possibility to adapt the software version to your own needs
- End-to-end process support
- Established and backed by a strong community of experts
See the comparison between Pachyderm and Neptune.
12. Kubeflow
Kubeflow’s main goals are run orchestration and making deployments of machine learning workflows easier. It’s known as the machine learning toolkit for Kubernetes and aims to use the potential of Kubernetes to facilitate the scaling of machine learning models. The team behind Kubeflow is constantly developing its features and does its best to make data scientists’ lives easier. It has some tracking capabilities, but they are not the main focus of the project. It can easily be used alongside other tools on this list as a complementary tool.
- Multi-framework integration
- Perfect for Kubernetes users
- Open-source character
See the comparison between Kubeflow and Neptune.
13. Verta.ai
Verta’s main features can be summarized in four words: track, collaborate, deploy and monitor. As these suggest, the software has been created to facilitate the management of the entire machine learning lifecycle, and it’s equipped with the necessary tools to assist ML teams at every stage of the process. This variety of features, however, makes the platform more complex and therefore not as lightweight as the other options we mention.
- Compatibility with other ML frameworks
- Assistance in the end-to-end machine learning process
- User-friendly design
14. SageMaker Studio
SageMaker Studio is an Amazon tool that allows data scientists to manage the entire machine learning lifecycle, from building and training to deploying ML models. The idea behind this software is to make it easier and less time-consuming to develop high-quality experiments. It’s a web-based tool and comes with a whole toolset designed to help data scientists improve their performance.
- Possibility to track thousands of experiments
- Integration with a wide range of Amazon tools for ML related tasks
- Fully managed
See the comparison between SageMaker Studio and Neptune.
15. DVC
The last project, DVC, is an open-source version control system created specifically for machine learning projects. Its aim is to enable data scientists to share ML models and make them reproducible. DVC can handle versioning and organizing large amounts of data and stores them in a well-organized, accessible way. It focuses on data and pipeline versioning and management but has some (limited) experiment tracking functionality. It can easily be used alongside other tools on this list as a complementary tool.
- Adaptable to any language and framework
- Possibility to version large amounts of data
- Open-source character
See the comparison between DVC and Neptune.
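The general idea behind versioning data alongside code can be sketched with content hashes: each snapshot of a file is stored under a digest of its contents, and the short digest is what you record next to your code. The sketch below is an illustration of that technique under simplifying assumptions, not DVC's actual implementation. The function names (`file_digest`, `snapshot`, `restore`) and the flat cache layout are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(data_file, cache_dir):
    """Copy the file into the cache under its digest and return the digest.

    The digest is small enough to keep under version control with the code,
    while the data itself stays in the cache.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    digest = file_digest(data_file)
    target = cache_dir / digest
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(data_file, target)
    return digest


def restore(digest, cache_dir, destination):
    """Bring back the exact file contents recorded under the given digest."""
    shutil.copyfile(Path(cache_dir) / digest, destination)


if __name__ == "__main__":
    work = Path(tempfile.mkdtemp())
    data = work / "train.csv"
    data.write_text("x,y\n1,2\n")
    v1 = snapshot(data, work / "cache")
    data.write_text("x,y\n1,2\n3,4\n")
    v2 = snapshot(data, work / "cache")
    restore(v1, work / "cache", data)  # roll the dataset back to the first version
    print(v1 != v2, file_digest(data) == v1)
```

Because the digest depends only on the file contents, an experiment record that stores it pins down exactly which data a model was trained on.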
Tracking machine learning experiments has always been an important element of the ML development process. In the past, however, it required a lot of effort from data scientists: tracking tools were limited, so the process was manual and time-consuming.
For this reason, data scientists and engineers often neglected this part of the machine learning lifecycle or created home-grown solutions. That no longer needs to be the case.
Over the last few years, tools for tracking machine learning experiments have matured a lot and are now accessible and easy to use. The apps and platforms we listed today are the best examples. Hopefully, every data scientist will find software here that makes their life easier!