The phrase “Every model is wrong but some are useful” is especially true in machine learning. When developing machine learning models, you should always understand where they work as expected and where they fail miserably.
There are many methods that you can use to get that understanding:
- Look at evaluation metrics (and know how to choose an evaluation metric that fits your problem)
- Look at performance charts like ROC, Lift Curve, Confusion Matrix, and others
- Look at learning curves to estimate overfitting
- Look at model predictions on best/worst cases
- Look at how resource-intensive model training and inference are (resource usage translates into serious costs and will be crucial to the business side of things)
Once you have a decent understanding of one model, you are good, right? Wrong 🙂
Typically, you need to do some (or a lot of) experimenting with model-improvement ideas, and visualizing the differences between experiments becomes crucial.
You could build all (or most) of this yourself, but today there are ready-made tools for the job. If you’re looking for the best tools to help you gather, organize, and visualize experiment data, you’re in the right place.
Neptune is a metadata store for MLOps, built for research and production teams that run a lot of experiments. It offers an open-source library that lets users log metadata generated during the model development process, whether from scripts (Python, R, and others) or notebooks (local, Google Colab, AWS SageMaker).
Projects in Neptune can have multiple members with different roles (viewer, contributor, admin), so all machine learning experiments that land in Neptune can be viewed, shared, and discussed by every team member.
Neptune is meant to provide an easy way to store, organize, display, and compare all metadata generated during the model development process.
Neptune – summary:
- Log model predictions
- Log losses and metrics
- Log artifacts (data versions, model binaries)
- Log git information, code, or notebook checkpoints
- Log hardware utilization
- Log error analysis in notebooks after the training has been completed
- Log performance visualizations like the ROC curve or confusion matrix (during or after training), or any other chart, and it becomes interactive in the app
- Log interactive visualizations from Altair, Bokeh, Plotly, or other HTML objects
- Compare hyperparameters and metrics across many runs with an intelligent compare table that highlights what was different.
Weights & Biases, a.k.a. WandB, is focused on deep learning. Users can track experiments in the application with a Python library and, as a team, can see each other’s experiments.
WandB is a hosted service that lets you back up all experiments in a single place and work on a project with your team – work-sharing features are there to use.
In WandB, users can log and analyze multiple data types.
Weights & Biases – summary:
- Monitor training runs information like loss, accuracy (learning curves)
- View histograms of weights and biases (no pun intended), or gradients
- Log rich objects like images, video, audio, or interactive charts during training
- Use various comparison tools, like tables showing auto-diffs, parallel coordinates plots, and others
- Interactive prediction bounding box visualization for object detection models
- Interactive prediction masks visualization for semantic segmentation models
Comet is a meta machine learning platform for tracking, comparing, explaining, and optimizing experiments and models.
Just like many other tools, such as Neptune (neptune-client specifically) or WandB, Comet provides an open-source Python library that lets data scientists integrate their code with Comet and start tracking work in the application.
Since it’s offered both cloud-hosted and self-hosted, users can run team projects and save a backup of their experimentation history.
Comet is converging towards more automated approaches to ML, with predictive early stopping (not available in the free version of the software) and, in the future, neural architecture search.
Comet – summary:
- Visualize samples with dedicated modules for vision, audio, text and tabular data to detect overfitting and easily identify issues with your dataset
- You can customize and combine your visualizations
- You can monitor your learning curves
- Comet’s flexible experiments and visualization suite allow you to record, compare, and visualize many artifact types
TensorBoard is an open-source suite of tools for the visualization and debugging of machine learning models and experiments. It’s the most popular solution on the market, and thus it’s widely integrated with many other tools and applications.
What’s more, it has an extensive network of engineers using the software and sharing their experience and ideas, which makes for a powerful community ready to solve any problem. The software itself, however, is best suited to individual users.
TensorBoard – summary:
- Tracking and visualizing metrics such as loss and accuracy
- Visualizing the model graph (ops and layers)
- Viewing histograms of weights, biases, or other tensors as they change over time
- Projecting embeddings to a lower-dimensional space
- Displaying images, text, and audio data
- Profiling TensorFlow programs
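Because TensorBoard only reads event files from disk, any framework with a summary writer can feed it. Here is a minimal sketch using the PyTorch `SummaryWriter` (TensorFlow users would use `tf.summary` instead); the `log_dir` path is an arbitrary choice.

```python
# Minimal TensorBoard logging sketch via the PyTorch writer.
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/demo")

# Scalars logged per step become the loss/accuracy curves
# on TensorBoard's Scalars tab.
for step, loss in enumerate([0.9, 0.5, 0.3]):
    writer.add_scalar("train/loss", loss, step)

writer.close()
# Then view the run with:  tensorboard --logdir runs
```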
Visdom is a tool for flexibly creating, organizing, and sharing visualizations of live, rich data. It supports Torch and NumPy.
Visdom facilitates the visualization of remote data with an emphasis on supporting scientific experimentation, and has a simple set of features that can be composed for various use cases.
Visdom lets you present the results of statistical calculations and share them with other people, and makes it convenient to test, view, and experiment, since all results are presented in interactive form.
A slight disadvantage is that there is no easy way to access the underlying data or to compare consecutive runs.
Visdom – summary:
- It helps to interactively visualize any data (including remote machine model training)
- It contains a ton of visualization primitives. In the context of machine learning models, the most useful are line plots, histograms, scatter plots, images, matplotlib figures, audio, video, and HTML objects, but there are many more to choose from
- Various visualization elements can be combined into a dashboard of visualizations
- It can be easily shared with your team or collaborators
- Since you have full customizability, you can create your own favourite deep learning dashboard
HiPlot is a straightforward interactive visualization tool that helps AI researchers discover correlations and patterns in high-dimensional data. It uses parallel plots and other graphical means to represent information more clearly.
HiPlot can be run quickly from a Jupyter notebook with no setup required. The tool enables machine learning (ML) researchers to more easily evaluate the influence of their hyperparameters, such as learning rate, regularizations, and architecture. It can also be used by researchers in other fields, so they can observe and analyze correlations in data relevant to their work.
HiPlot – summary:
- Creates an interactive parallel plot visualization to easily explore various hyperparameter-metric interactions
- Based on selection on the parallel plot the experiment table is updated automatically
- It’s super lightweight and can be used inside notebooks or as a standalone webserver
Machine learning model visualization tools are important because a visual summary of your ML or deep learning models makes it easier to identify trends and patterns, understand connections, and interact with your data.
I hope you found what you were looking for and can now improve your experiments.
15 Best Tools for Tracking Machine Learning Experiments
7 mins read | Author Pawel Kijko | Updated July 14th, 2021
While working on a machine learning project, getting good results from a single model-training run is one thing, but keeping all of your machine learning experiments organized and having a process that lets you draw valid conclusions from them is quite another. That’s what machine learning experiment management helps with.
In this article, I will explain why you, as a data scientist or machine learning engineer, need a tool for tracking machine learning experiments, and which software is best for the job.
Tools for tracking machine learning experiments – who needs them and why?
- Data Scientists: Currently, many organizations either include ML in their products as added value or are AI-first companies. These organizations have to adhere to MLOps processes and tools, such as experiment tracking, to ensure better collaboration between individuals and teams, and the success of the ML project. When a data scientist or data science team wants to come back to an idea, re-run a model from a couple of months ago, or simply compare and visualize the differences between runs, the need for a system or tool for tracking ML experiments becomes (painfully) apparent.
- Machine Learning Engineers: Right after the data scientist or data science team finalizes model development and is ready to launch the model, the MLEs are the ones who take all the research/dev code and the model, turn it into a production-ready version, and deploy it. But in this handoff from data scientists to MLEs, more information about the model than just the weights is needed to ensure a successfully deployed solution (one that is debuggable, maintainable, reproducible, and comparable). That’s where experiment metadata is very important – from the dataset, code, and model version to the hyperparameters and other configurations used to train the model.
- Managers/Business people: Tracking software creates an opportunity to involve other team members, like managers or business stakeholders, in your machine learning projects. Thanks to the ability to prepare visualizations, add comments, and share work, managers and co-workers can easily track progress and cooperate with the machine learning team.
Here is an in-depth article about experiment tracking for those of you who want to learn more.