The phrase “All models are wrong, but some are useful” is especially true in machine learning. When developing machine learning models, you should always understand where they work as expected and where they fail miserably.
There are many methods that you can use to get that understanding:
- Look at evaluation metrics (also you should know how to choose an evaluation metric for your problem)
- Look at performance charts like ROC, Lift Curve, Confusion Matrix, and others
- Look at learning curves to estimate overfitting
- Look at model predictions on best/worst cases
- Look at how resource-intensive model training and inference are (they translate to serious costs and will be crucial to the business side of things)
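A few of the checks listed above can be sketched in plain Python. This is a minimal illustration on made-up labels and scores, not a replacement for a metrics library; the function names (`confusion_matrix`, `roc_auc`, `worst_cases`) and the toy data are assumptions introduced here.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels encoded as 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """ROC AUC computed as the probability that a random positive
    example gets a higher score than a random negative one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def worst_cases(y_true: List[int], scores: List[float], k: int = 3) -> List[int]:
    """Indices of the k examples whose score is furthest from the true label."""
    errors = [(abs(t - s), i) for i, (t, s) in enumerate(zip(y_true, scores))]
    return [i for _, i in sorted(errors, reverse=True)[:k]]


# Toy data (hypothetical): true labels and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
print(roc_auc(y_true, scores))           # 0.8125
print(worst_cases(y_true, scores, k=2))  # [7, 2]
```

The worst-case indices are the examples worth opening and inspecting by hand, which is exactly the kind of output the tools below let you log and browse.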
Once you get a decent understanding of one model you are good, right? Wrong 🙂
Typically, you need to experiment, sometimes a lot, with model improvement ideas, and visualizing the differences between experiments becomes crucial.
You can do all of those (or most of those) yourself but today there are tools that you can use. If you’re looking for the best tools that will help you visualize, organize, and gather data, you’re in the right place.
Neptune is an experiment management and collaboration tool.
Neptune offers an open-source Python library that lets users log any experiments whether those run in Python scripts, Jupyter Notebooks, Amazon SageMaker Notebooks, or Google Colab.
Projects in Neptune can have multiple members with different roles (viewer, contributor, admin), so all machine learning experiments that land in Neptune can be viewed, shared, and discussed by every team member. Neptune is meant to provide an easy-to-use and quick-to-learn way to keep track of your experiments.
Neptune – summary:
- Log model predictions
- Log losses and metrics
- Log model binaries and code
- Log hardware utilization
- Log error analysis in notebooks after the training has been completed
- Log performance visualizations like ROC curve or Confusion matrix (during or after training) or anything else, and they become interactive (‘metric_charts’ has nice example charts)
- Log interactive visualizations from Altair, Bokeh, Plotly or other HTML objects
- Compare hyperparameters and metrics across many runs with an intelligent compare table that highlights what was different.
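To make the idea of a compare table that highlights differences concrete, here is a small sketch in plain Python. It is not Neptune's implementation; the run names, parameters, and the helpers `differing_keys` and `print_diff_table` are hypothetical and only illustrate the principle of showing the columns that actually vary between runs.

```python
from typing import Any, Dict, List

_MISSING = object()  # sentinel for parameters a run did not log


def differing_keys(runs: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the sorted keys whose values are not identical across all runs."""
    all_keys = set()
    for values in runs.values():
        all_keys.update(values)
    diff = []
    for key in sorted(all_keys):
        column = [values.get(key, _MISSING) for values in runs.values()]
        if any(v != column[0] for v in column[1:]):
            diff.append(key)
    return diff


def print_diff_table(runs: Dict[str, Dict[str, Any]]) -> None:
    """Print a tab-separated table restricted to the columns that differ."""
    keys = differing_keys(runs)
    print("\t".join(["run"] + keys))
    for name, values in runs.items():
        print("\t".join([name] + [str(values.get(k, "-")) for k in keys]))


# Hypothetical experiment records: hyperparameters plus one metric.
runs = {
    "run-1": {"lr": 0.01, "batch_size": 32, "optimizer": "adam", "val_accuracy": 0.91},
    "run-2": {"lr": 0.001, "batch_size": 32, "optimizer": "adam", "val_accuracy": 0.93},
    "run-3": {"lr": 0.001, "batch_size": 32, "optimizer": "sgd", "val_accuracy": 0.89},
}

print(differing_keys(runs))  # ['lr', 'optimizer', 'val_accuracy']
print_diff_table(runs)
```

Here `batch_size` is dropped from the table because every run used the same value, which keeps attention on what changed.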
Weights & Biases, a.k.a. WandB, is focused on deep learning. Users can track experiments in the application with a Python library and, as a team, see each other’s experiments.
WandB is a hosted service that lets you back up all experiments in a single place and work on a project with your team using its work-sharing features.
In WandB, users can log and analyze multiple data types.
Weights & Biases – summary:
- Monitor training runs information like loss, accuracy (learning curves)
- View histograms of weights and biases (no pun intended), or gradients
- Log rich objects like charts, video, audio, or interactive charts during training
- Use various comparison tools like tables showing auto-diffs, parallel coordinates plot and others
- Interactive prediction bounding box visualization for object detection models
- Interactive prediction masks visualization for semantic segmentation models
Comet is a meta machine learning platform for tracking, comparing, explaining, and optimizing experiments and models.
Like many other tools, such as Neptune (neptune-client specifically) or WandB, Comet provides an open-source Python library that lets data scientists integrate their code with Comet and start tracking work in the application.
As it’s offered both cloud-hosted and self-hosted, users can have team projects and save a backup of their experimentation history.
Comet is moving towards more automated approaches to ML, with predictive early stopping (not available in the free version of the software) and, in the future, neural architecture search.
Comet.ml – summary:
- Visualize samples with dedicated modules for vision, audio, text and tabular data to detect overfitting and easily identify issues with your dataset
- You can customize and combine your visualizations
- You can monitor your learning curves
- Comet’s flexible experiments and visualization suite allow you to record, compare, and visualize many artifact types
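When you monitor learning curves, the pattern to look for is validation loss that stops improving while training loss keeps falling. The sketch below shows that check in plain Python on made-up loss values. It is not part of Comet or any other tool; the helper names and the `patience` parameter are assumptions chosen for illustration.

```python
from typing import List, Optional


def generalization_gap(train_loss: List[float], val_loss: List[float]) -> List[float]:
    """Per-epoch difference between validation and training loss."""
    return [v - t for t, v in zip(train_loss, val_loss)]


def overfitting_epoch(
    train_loss: List[float], val_loss: List[float], patience: int = 2
) -> Optional[int]:
    """Return the epoch with the best validation loss if validation loss
    failed to improve for `patience` epochs afterwards while training
    loss was still decreasing; return None if no such point is found."""
    best = 0
    for epoch in range(1, len(val_loss)):
        if val_loss[epoch] < val_loss[best]:
            best = epoch
        elif epoch - best >= patience and train_loss[epoch] < train_loss[best]:
            return best
    return None


# Hypothetical curves: training loss keeps falling, validation loss turns around.
train = [1.00, 0.70, 0.50, 0.38, 0.30, 0.24]
val = [1.05, 0.80, 0.66, 0.64, 0.69, 0.75]

print(overfitting_epoch(train, val))  # 3
print([round(g, 2) for g in generalization_gap(train, val)])
```

A widening gap after the returned epoch suggests the checkpoint from that epoch is the one worth keeping.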
TensorBoard provides the visualization and tooling needed for machine learning experimentation. It’s open-source and offers a suite of tools for visualization and debugging of machine learning models. TensorBoard is the most popular solution on the market and thus it’s widely integrated with many other tools and applications.
What’s more, it has an extensive network of engineers who use the software and share their experience and ideas, which makes for a strong community that can help with most problems. The software itself, however, is best suited for an individual user.
TensorBoard – summary:
- Tracking and visualizing metrics such as loss and accuracy
- Visualizing the model graph (ops and layers)
- Viewing histograms of weights, biases, or other tensors as they change over time
- Projecting embeddings to a lower-dimensional space
- Displaying images, text, and audio data
- Profiling TensorFlow programs
Visdom is a tool for flexibly creating, organizing, and sharing visualizations of live, rich data. It supports Torch and NumPy.
Visdom facilitates visualization of remote data with an emphasis on supporting scientific experimentation and has a simple set of features that can be composed for various use-cases.
Visdom lets you display the results of statistical calculations and share them with other people. Because all results are presented in interactive form, it is convenient to test, view, and experiment.
A slight disadvantage may be the fact that there is no easy way to access the data, and to compare consecutive runs.
Visdom – summary:
- It helps to interactively visualize any data (including model training on a remote machine)
- It contains a large set of basic visualization elements. In the context of machine learning models the most useful are line plots, histograms, scatter plots, images, matplotlib figures, audio, videos, and HTML objects, but there are many more to choose from
- Various visualization elements can be combined into a dashboard of visualizations
- It can be easily shared with your team or collaborators
- Since it is fully customizable, you can create your own favourite deep learning dashboard
HiPlot is a straightforward interactive visualization tool that helps AI researchers discover correlations and patterns in high-dimensional data. It uses parallel plots and other graphical representations to present information more clearly.
HiPlot can be run quickly from a Jupyter notebook with no setup required. The tool enables machine learning (ML) researchers to more easily evaluate the influence of their hyperparameters, such as learning rate, regularizations, and architecture. It can also be used by researchers in other fields, so they can observe and analyze correlations in data relevant to their work.
HiPlot – summary:
- Creates an interactive parallel plot visualization to easily explore various hyperparameter-metric interactions
- Based on selection on the parallel plot the experiment table is updated automatically
- It’s super lightweight and can be used inside notebooks or as a standalone webserver
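As a rough sketch of how a parallel plot of hyperparameters and metrics could be built, the snippet below feeds a list of per-run dictionaries to HiPlot. It assumes the `hiplot` package is installed and uses `hip.Experiment.from_iterable`; the run values and the `build_experiment` helper are made up for illustration, and the import is guarded so the snippet still runs without HiPlot.

```python
from typing import Any, Dict, List

try:
    import hiplot as hip  # assumption: installed via `pip install hiplot`
except ImportError:
    hip = None

# Hypothetical results: one dictionary per run, mixing hyperparameters and a metric.
runs: List[Dict[str, Any]] = [
    {"lr": 0.1, "dropout": 0.1, "optimizer": "sgd", "val_loss": 0.62},
    {"lr": 0.01, "dropout": 0.1, "optimizer": "adam", "val_loss": 0.41},
    {"lr": 0.01, "dropout": 0.3, "optimizer": "adam", "val_loss": 0.37},
    {"lr": 0.001, "dropout": 0.3, "optimizer": "sgd", "val_loss": 0.55},
]


def build_experiment(rows: List[Dict[str, Any]]):
    """Return a HiPlot experiment for the rows, or None if HiPlot is unavailable."""
    if hip is None:
        return None
    return hip.Experiment.from_iterable(rows)


experiment = build_experiment(runs)
# In a Jupyter notebook, calling experiment.display() renders the interactive plot.
```

Each dictionary key becomes one axis of the parallel plot, so keeping the same keys in every run gives the cleanest picture.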
Machine learning model visualization tools are so important because a visual summary of your ML or deep learning models makes it easier to identify trends and patterns, understand connections, and interact with your data.
I hope you found what you were looking for and can now improve your experiments.
Get started with Neptune in 5 minutes
If you are looking for an experiment tracking tool you may want to take a look at Neptune.
It takes literally 5 minutes to set up and as one of our happy users said:
“Within the first few tens of runs, I realized how complete the tracking was – not just one or two numbers, but also the exact state of the code, the best-quality model snapshot stored to the cloud, the ability to quickly add notes on a particular experiment. My old methods were such a mess by comparison.” – Edward Dixon, Data Scientist @intel
To get started follow these 4 simple steps.
Install the client library.
pip install neptune-client
Connect to the tool by adding a snippet to your training code.
import neptune

neptune.init(...)            # credentials
neptune.create_experiment()  # start logger
Specify what you want to log:
neptune.log_metric('accuracy', 0.92)
for prediction_image in worst_predictions:
    neptune.log_image('worst predictions', prediction_image)
Run your experiment as you normally would.
And that’s it!
Your experiment is logged to a central experiment database and displayed in the experiment dashboard, where you can search, compare, and drill down to whatever information you need.