Monitoring machine learning experiment runs is an important and healthy practice, but it can be a challenge. The main problems are:
- You cannot look at your console logs all the time,
- When you look at logs you don’t see the change over time immediately (think learning curve vs losses on epoch 10),
- Sometimes you can’t even access the model training environment.
And that’s where dedicated tools come in handy! You can use them to flexibly monitor your ML experiments and look at model training information whenever you need to. This is especially useful when you don’t have direct access to the machine (a computational cluster at a university, a machine behind a VPN at work, a cloud server you’re using somewhere, or when you’re on a bus :)).
Monitoring ML experiments with dedicated tools gives you the comfort of knowing what is going on with your training runs. That is especially true if you want to go beyond watching your learning curve and want to see additional information like performance charts, or prediction visualizations after every epoch.
Check out our list of the best tools that will make monitoring your machine learning experiment runs a breeze!
1. Neptune
Neptune is a lightweight experiment management and collaboration tool. It is very flexible, works with many other frameworks, and its stable user interface scales well (to millions of runs).
It’s robust software that can store, retrieve, and analyze large amounts of data. Neptune has all the tools for efficient monitoring of ML experiment runs. You can also integrate it with other tools for more flexibility.
Neptune – summary:
- Fast and beautiful UI with a lot of capabilities to organize runs in groups, save custom dashboard views and share them with the team
- Provides user and organization management with different organizations, projects, and user roles
- You can use a hosted app to avoid all the hassle with maintaining yet another tool (or have it deployed on your on-prem infrastructure)
- Your team can track experiments which are executed in scripts (Python, R, other), notebooks (local, Google Colab, AWS SageMaker) and do that on any infrastructure (cloud, laptop, cluster)
- Extensive experiment tracking and visualization capabilities (resource consumption, scrolling through lists of images)
2. TensorBoard
TensorBoard is a visualization toolkit for TensorFlow that lets you analyze model training runs. It’s open-source and offers functionality that helps across the entire machine learning workflow.
It also has a large community of engineers who use it and share their experience and ideas, so help is easy to find when you run into a problem. The software itself, however, is best suited for an individual user.
TensorBoard – summary:
- Track and visualize metrics such as loss and accuracy
- Compare learning curves of various runs
- Parallel coordinates plot to visualize parameter-metric interactions
- It has other visualization features that are not parameter-metric related
- Project embeddings to a lower dimensional space
Did you know that you can integrate TensorBoard with Neptune? It is also worth comparing TensorBoard and Neptune side by side before you choose.
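To make the metric tracking and run comparison described for TensorBoard more concrete, here is a minimal sketch that writes loss and accuracy for two runs using TensorFlow's `tf.summary` API. The helper names (`simulated_metrics`, `log_run`), the log directories, and the fake training curve are assumptions made for illustration; a real project would log values coming from its own training loop.

```python
import math


def simulated_metrics(epoch, learning_rate):
    """Hypothetical stand-in for one training epoch: returns (loss, accuracy)."""
    loss = math.exp(-learning_rate * epoch)
    accuracy = 1.0 - loss
    return loss, accuracy


def log_run(log_dir, learning_rate, epochs=20):
    """Write one run's learning curves as TensorBoard event files in log_dir."""
    # Imported here so the helper above can be reused without TensorFlow installed.
    import tensorflow as tf

    writer = tf.summary.create_file_writer(log_dir)
    with writer.as_default():
        for epoch in range(epochs):
            loss, accuracy = simulated_metrics(epoch, learning_rate)
            # Each scalar is stored with its step, so TensorBoard can draw the curve.
            tf.summary.scalar("loss", loss, step=epoch)
            tf.summary.scalar("accuracy", accuracy, step=epoch)
    writer.flush()
```

Calling `log_run("runs/lr_0.1", 0.1)` and `log_run("runs/lr_0.5", 0.5)` and then starting `tensorboard --logdir runs` should show both runs on the same charts, which is the simplest way to compare learning curves.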
3. Hyperdash
Hyperdash is another tool that helps with monitoring machine learning experiment runs. It’s a cloud-based solution for those who like flexibility and want to gain insight quickly. It’s a straightforward tool and can be used with scripts and Jupyter.
Interestingly, unlike most of the tools, Hyperdash is available on mobile devices (iOS, Android) so you can monitor your experiments no matter where you are.
- Track hyperparameters across different model experiments
- Graph performance metrics in real time
- Get notifications when a long-running experiment is finished
4. Guild AI
Guild AI is a tool for running, tracking, and comparing experiments. Guild AI is cross-platform and framework independent — you can train and capture experiments in any language using any library.
Guild AI runs your unmodified code so you get to use the libraries you want. The tool doesn’t require databases or other infrastructure to manage experiments — it’s simple and easy to use.
- Track experiments for any model training, in any programming language
- Supports automated machine learning processes
- Integrates with any language and library
- Supports remote training and backup
- Reproduce results or recreate experiments
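As an illustration of what "unmodified code" can look like, here is a sketch of a plain training script with no Guild-specific imports. It relies on two conventions that, to the best of our understanding, Guild AI applies by default: module-level variables are treated as flags, and printed lines of the form `key: value` are captured as metrics. The file name `train.py`, the function name, and the fake loss are assumptions made for this example.

```python
# train.py: a plain script, nothing in it refers to Guild AI.

# Module-level values like these are assumed to be picked up as run flags.
learning_rate = 0.1
epochs = 5


def train(learning_rate, epochs):
    """Hypothetical stand-in for a training loop; returns the final loss."""
    loss = 1.0
    for epoch in range(epochs):
        loss *= 1.0 - learning_rate  # placeholder update, not a real optimizer
        # "key: value" lines are assumed to be captured as scalars per step.
        print(f"step: {epoch}")
        print(f"loss: {loss:.4f}")
    return loss


final_loss = train(learning_rate, epochs)
```

With a script like this, a command such as `guild run train.py learning_rate=0.01` should start a tracked run with the overridden flag, and `guild compare` should list the runs next to each other.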
5. Weights & Biases
Weights & Biases, a.k.a. WandB, is focused on deep learning. Users can log experiments to the application with a Python library and, as a team, see each other’s experiments.
WandB is a hosted service that lets you back up all experiments in a single place and collaborate on a project with your team using its work-sharing features.
In WandB, users can log and analyze multiple data types.
Weights & Biases – summary:
- Monitor training runs information like loss, accuracy (learning curves)
- View histograms of weights and biases (no pun intended), or gradients
- Log rich objects like charts, video, audio, or interactive charts during training
- Use various comparison tools like tables showing auto-diffs, parallel coordinates plot and others
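The sketch below shows how the learning-curve and histogram logging from the summary might look with the `wandb` Python library. The project name, the helper names (`fake_layer_weights`, `log_run`), and the made-up loss and weights are assumptions for illustration only; `mode="offline"` is used so the example does not need an account.

```python
import random


def fake_layer_weights(epoch, size=100, seed=0):
    """Hypothetical stand-in for a layer's weights; the spread shrinks over epochs."""
    rng = random.Random(seed + epoch)
    scale = 1.0 / (1 + epoch)
    return [rng.gauss(0.0, scale) for _ in range(size)]


def log_run(project, learning_rate, epochs=5):
    """Log a fake learning curve and a weight histogram for every epoch."""
    # Imported here so the helper above can be reused without wandb installed.
    import wandb

    # mode="offline" keeps the run on disk; remove it to sync to the hosted service.
    run = wandb.init(
        project=project,
        config={"learning_rate": learning_rate, "epochs": epochs},
        mode="offline",
    )
    for epoch in range(epochs):
        loss = 1.0 / (1 + learning_rate * epoch)  # placeholder, not a real loss
        run.log(
            {
                "loss": loss,
                "weights": wandb.Histogram(fake_layer_weights(epoch)),
            },
            step=epoch,
        )
    run.finish()
```

A call such as `log_run("monitoring-demo", 0.1)` would create one run; repeating it with different learning rates gives you several runs to compare in the UI.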
Now that you have the right tools, you can freely monitor ML experiment runs from any place in the world. Use them to optimize your work, save time, and work more efficiently.
Enjoy monitoring your machine learning experiment runs!
Get started with Neptune in 5 minutes
If you are looking for an experiment tracking tool, you may want to take a look at Neptune.
It takes literally 5 minutes to set up and, as one of our happy users said:
“Within the first few tens of runs, I realized how complete the tracking was – not just one or two numbers, but also the exact state of the code, the best-quality model snapshot stored to the cloud, the ability to quickly add notes on a particular experiment. My old methods were such a mess by comparison.” – Edward Dixon, Data Scientist @intel
To get started, follow these 4 simple steps.
Install the client library.
pip install neptune-client
Connect to the tool by adding a snippet to your training code.
import neptune

neptune.init(...)  # credentials
neptune.create_experiment()  # start logger
Specify what you want to log:
neptune.log_metric('accuracy', 0.92)
for prediction_image in worst_predictions:
    neptune.log_image('worst predictions', prediction_image)
Run your experiment as you normally would:
And that’s it!
Your experiment is logged to a central experiment database and displayed in the experiment dashboard, where you can search, compare, and drill down to whatever information you need.