How to Version and Organize ML Experiments That You Run in Google Colab

Aayush Bajaj
28th November, 2022

Google Colab has made running ML experiments easier than ever. It offers a seamless way to run experiments on on-demand shared GPU instances. However, ML experiments need to be tracked and organized, and Colab doesn’t offer that natively. In this blog post, we’ll touch on:

  • Why is it important to version experiments in Google Colab?
  • Different ways to version experiments in Google Colab.
  • How can Neptune help track experiments in Google Colab?

Why should you version ML experiments that you run in Colab?

Building ML models is experimental in nature, and it’s common to run numerous experiments in search of the combination of algorithm, parameters, and data preprocessing steps that yields the best model for the task at hand. This calls for some form of organization as the complexity of the problem grows.

When you run experiments in Colab, you need versioning just as you would in any other environment. Here are the key reasons to set up some form of versioning for your ML experiments in Colab:

  1. Collaboration: Working in a team means making decisions together, which is cumbersome without centrally logged experiment details such as model metadata and metrics. This fits nicely with Google Colab’s sharing feature, which also lets you write code collaboratively.
  2. Reproducibility: Logging model configurations saves a lot of time when you retrain and test. With snapshots of the entire machine learning pipeline, you can reproduce the same output later.
  3. Dependency tracking: By using version control, you can track different versions of the datasets (training, validation, and test), test more than one model on different branches or repositories, tune the model parameters and hyperparameters, and monitor the accuracy of each change.
  4. Model updates: Model development isn’t done in one step; it works in cycles. With version control, you can control which version is released while development continues on the next release.

How to version ML experiments run in Google Colab?

There are many ways to version experiments in Colab ranging from simple log files to full-scale experiment tracking tools that offer a host of features. Let’s talk about some from each category and understand what would be the right choice for you.

1. Spreadsheets


Tracking ML experiments in Excel or Google Sheets is a fast but brute-force solution. Spreadsheets give you a comfortable, easy-to-use place to paste your metadata, and you can create multiple sheets for multiple runs. But this approach comes with plenty of caveats. Let’s see where it shines and where it doesn’t:


Pros

  1. Easy to use with a familiar interface.
  2. Reports for stakeholders can be created directly within the tool.
  3. Can be a boon for non-technical folks on the team who want to contribute.

Cons

  1. Tracking experiments in spreadsheets is tedious: you either copy and paste model metadata and metrics into the spreadsheet, or use a module like pandas to log the information and save it to a spreadsheet later.
  2. Once the number of experiments grows, logging each run in a separate sheet becomes unmanageable.
  3. A simple spreadsheet is a poor fit for tracking and managing countless variables and artifacts.
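
The pandas route mentioned above could look roughly like the sketch below. It appends one row per run to a CSV file, which Excel and Google Sheets can both open. The `log_run` helper, the file name, and the example parameters and metrics are all hypothetical, and the log is written to a temporary directory only to keep the example self-contained (in Colab you would more likely point it at a mounted Drive folder).

```python
import tempfile
from pathlib import Path

import pandas as pd


def log_run(log_path, run_id, params, metrics):
    """Append one run (parameters + metrics) as a row of a CSV log."""
    new_row = pd.DataFrame([{"run_id": run_id, **params, **metrics}])
    if log_path.exists():
        # Columns are unioned, so a run that lacks a parameter gets an empty cell.
        log = pd.concat([pd.read_csv(log_path), new_row], ignore_index=True)
    else:
        log = new_row
    log.to_csv(log_path, index=False)
    return log


# Hypothetical runs; real values would come from your training code.
log_path = Path(tempfile.mkdtemp()) / "experiments.csv"
log_run(log_path, "run-001", {"model": "logreg", "lr": 0.1}, {"accuracy": 0.87})
log = log_run(log_path, "run-002", {"model": "rf", "n_estimators": 200}, {"accuracy": 0.91})
print(log)
```

Even in this small form, the drawback listed above shows up: every new parameter adds a column, and nothing ties a row to the code or data that produced it.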

2. Git


Git comes pre-installed in a Colab session, so you can use it directly to clone a repository or commit to one. This lets you push model-related files, such as trained weights and evaluation reports like a confusion matrix, to a central repository that your data science team can use to make informed decisions. Let’s look at some pros and cons of using Git for experiment tracking:


Pros

  1. Native availability of Git on Colab means no extra dependency or installation.
  2. Popular and known tool among Data Scientists and ML practitioners.
  3. Gives access to millions of other repositories, which can be used as a starting point.

Cons

  1. Hard to onboard non-programmers and other stakeholders.
  2. An unintuitive interface that may create friction for collaborative work.
  3. Need technical expertise to execute and maintain experiment-related repositories.
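
As a rough illustration of Git-based tracking, the sketch below writes one run’s parameters and metrics to a JSON file and commits it from Python by shelling out to the `git` command. The `git` and `commit_run` helpers, the `runs/` folder layout, and the placeholder identity are assumptions made for this example, not a prescribed workflow. It expects `repo_dir` to already be a Git repository, for example one you cloned into the Colab session.

```python
import json
import subprocess
from pathlib import Path


def git(repo_dir, *args):
    """Run a git command inside repo_dir and return its stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo_dir, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def commit_run(repo_dir, run_name, params, metrics):
    """Write one run's metadata to runs/<run_name>.json and commit it."""
    repo_dir = Path(repo_dir)
    run_file = repo_dir / "runs" / f"{run_name}.json"
    run_file.parent.mkdir(parents=True, exist_ok=True)
    run_file.write_text(json.dumps({"params": params, "metrics": metrics}, indent=2))
    git(repo_dir, "add", str(run_file.relative_to(repo_dir)))
    # The identity is passed inline because a fresh Colab VM may have no
    # git user configured; these values are placeholders.
    git(
        repo_dir,
        "-c", "user.name=colab-user",
        "-c", "user.email=colab-user@example.com",
        "commit", "-m", f"Log experiment {run_name}",
    )
    return git(repo_dir, "rev-parse", "HEAD")  # hash of the new commit
```

In a Colab cell, you might call `commit_run(repo, "run-001", {"lr": 0.01}, {"accuracy": 0.9})` on a cloned repository and follow it with `git(repo, "push")` once credentials are set up. Large binary files such as trained weights are usually a poor fit for plain Git commits, which is one reason the cons above mention maintenance effort.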

3. ML experiment tracking tools


Experiment tracking tools are tailor-made for this use case. They cover almost everything you might need, from experiment tracking to a model registry. Many such tools have appeared in the last few years, with Neptune, Weights & Biases, and MLflow among the prominent ones. Let’s look at their advantages and disadvantages:


Pros

  1. They cover almost all of the functionality you need for versioning and organizing your ML runs.
  2. All of these tools come with a dedicated interactive UI that can be used for comparisons, debugging, or report generation.
  3. Each tool offers many features for team collaboration.

Cons

  1. Unlike Git or spreadsheets, experiment tracking tools usually come with a fee. Almost all of them have a free tier for a single user, but it comes with limitations. On the other hand, paying for the tool means you don’t have to worry about setup, maintenance, or developing features yourself.

    Let’s dive deeper into how versioning Colab notebooks works in such tools. We’ll focus on Neptune.

    Tracking Google Colab experiments with Neptune

    Example dashboard in Neptune with different metadata logged

    Neptune is an ML metadata store that was built for research and production teams that run many experiments. It has a flexible metadata structure that allows you to organize training and production metadata the way you want to.

    It gives you a central place to log, store, display, organize, compare, and query all metadata generated during the machine learning lifecycle. Individuals and organizations use Neptune for experiment tracking and model registry to have control over their experimentation and model development.

    The web app was built for managing ML model metadata and lets you:

    • filter experiments and models with an advanced query language.
    • customize which metadata you see with flexible table views and dashboards.
    • monitor, visualize, and compare experiments and models.

    Neptune supports multiple IDEs and notebooks, including Google Colab, so you can use experiment tracking directly without juggling several tools.

    Here’s what you can track with Neptune in Colab: 

    Model-building metadata

    1. Parameters and model configuration – as single values or as a dictionary (it’s good practice to keep hyperparameters in a dictionary or a YAML file).
    2. Metrics – such as accuracy, precision, and recall.
    3. Model checkpoints – Neptune can store checkpoint files with extensions such as .h5 and .ckpt.
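
A minimal sketch of logging these three kinds of metadata might look like the following. It assumes the `neptune` Python package in its 1.x form (`neptune.init_run`, `append`, `upload`); the field names such as `train/accuracy` and `model/checkpoint` and both helper functions are made up for illustration. The run is opened in `mode="debug"`, which does not send anything to Neptune, so you would replace that with your own project name and API token for real tracking.

```python
def log_training_run(run, params, history, checkpoint_path=None):
    """Log parameters, per-epoch metrics, and an optional checkpoint file.

    `run` is a Neptune run object; `history` is a list with one dict of
    metric values per epoch.
    """
    run["parameters"] = params  # a dict is stored as nested fields
    for epoch_metrics in history:
        for name, value in epoch_metrics.items():
            run[f"train/{name}"].append(value)  # one series per metric
    if checkpoint_path is not None:
        run["model/checkpoint"].upload(checkpoint_path)


def track_in_neptune(params, history, checkpoint_path=None):
    """Open a run, log everything, and close the run again."""
    import neptune  # in Colab, install first with: pip install neptune

    # Assumption: debug mode keeps the example offline. For real tracking,
    # pass project="workspace/project" and api_token="..." instead.
    run = neptune.init_run(mode="debug")
    try:
        log_training_run(run, params, history, checkpoint_path)
    finally:
        run.stop()
```

A call could then look like `track_in_neptune({"lr": 0.001, "batch_size": 32}, [{"loss": 0.9, "accuracy": 0.6}, {"loss": 0.5, "accuracy": 0.8}])`, with the history coming from your own training loop.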

    Artifacts and data versioning

    Neptune’s track_files() method can be used to log metadata about any file. You can use it to track and version artifacts, such as intermediate data samples and model files, that you store elsewhere.

    If you wish to upload all the files ending with a specific extension right from the start of the experiment, you can specify that while initiating the Neptune instance, and it will upload all of them automatically in the background.
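
The two ideas above could be sketched as follows. This assumes that the "specify while initiating" option refers to the `source_files` argument of `neptune.init_run`, which accepts glob patterns; that reading is mine rather than something the text states. The helper names, the `datasets/train` field, and the arguments are hypothetical.

```python
def source_patterns(extensions):
    """Turn extensions such as ".py" or "yaml" into glob patterns like "*.py"."""
    return [f"*.{ext.lstrip('.')}" for ext in extensions]


def start_versioned_run(project, api_token, extensions, dataset_path):
    """Start a run that uploads matching files and tracks a dataset."""
    import neptune  # in Colab, install first with: pip install neptune

    run = neptune.init_run(
        project=project,
        api_token=api_token,
        # Assumption: files matching these patterns are uploaded at start.
        source_files=source_patterns(extensions),
    )
    # Records metadata about the file (not the file itself), so the
    # dataset can stay wherever it is stored.
    run["datasets/train"].track_files(dataset_path)
    return run
```

Remember to call `run.stop()` on the returned object when the experiment finishes.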


    Neptune also lets you log intermediate experiment files such as images and audio. Here are some of the file formats it supported at the time of writing:

    • Images – in formats like png, jpg.
    • Interactive visualizations – such as Matplotlib figures.
    • HTML – log from an HTML string object, or upload a file directly.
    • Arrays and tensors – log and display as images.
    • Tabular data – log and preview CSV or pandas DataFrame.
    • Audio and video – log and watch or listen in Neptune.
    • Text – log text entries in various ways.

    Git information

    If you have Git initialized in your Google Colab session the way discussed in the previous section, Neptune extracts information from the .git directory and logs it under the source_code/git namespace.

    Learn more

    Check how you can version all these metadata types in Neptune. 

    The easiest way to get started is to head over to the Neptune & Google Colab docs.

    Why should you use Neptune with Google Colab?

    For many users, the aforementioned features make Neptune the default choice of experiment tracker in Google Colab. Here’s what makes it a top contender for the role, apart from the technical features we discussed in the last section:

    1. Seamless integration: With the Neptune Python module, you can connect your Colab session to the Neptune dashboard. This reduces friction compared to other methods.
    2. An abundance of features: The features Neptune offers give you the freedom to monitor, log, store, and compare whatever you need to make your experiment successful.
    3. Availability of free tier: A free tier is available for single users which offers important features at no cost. Check the available plans.
    4. Community support: With active community support for Neptune, you can get your problems fixed at a faster pace and keep your focus on building models. 

    Head over to this example project in Google Colab to see the Colab notebooks support in action. 

    You’ve reached the end!

    Congratulations! You are now equipped to work out which method of organizing your ML experiments suits you best. In this article, we explored straightforward ad-hoc methods like spreadsheets and Git, as well as more specialized approaches like experiment tracking tools. Here are two bonus tips to help you choose your next tool:

    1. Stick to what you need! It’s easy to get lost in the sea of tools and methods, but sticking to your requirements will help you make better decisions.
    2. I’d recommend using the “Try for Free” option in every tool before you lock in on a single solution.

      Thanks for reading! Stay tuned for more! Adios!