Case Study

At a certain stage of machine learning maturity the need for a tool like this one rises naturally. And then Neptune is a solid choice because of low entry threshold, many useful features, and good documentation and support.
Patryk Miziuła
Senior Data Scientist at
  • No experiment tracking solution in place
  • No way to track, visualize, and compare models
  • The team now has an out-of-the-box solution that can handle tracking and analyzing over 120k experiments

The company is an AI-focused software services provider, delivering ML-based end-to-end solutions to companies in retail, manufacturing, financial, and other sectors. They have years of experience supporting enterprises in building AI capabilities.

We spoke to Patryk Miziuła, who led an interesting project delivered for a leading Central and Eastern European (CEE) food company.

The task was to use ML to analyze the impact of promotional campaigns on sales increase.

Sounds fairly basic? Wait until you understand the problem statement in detail.

When it comes to Artificial Intelligence and Machine Learning, the team focuses on meta-learning research with standard ML frameworks such as TensorFlow and PyTorch.

What is the project about?

As Patryk walked us through the problem: in brief, the project was to analyze promotional campaigns run on food items like juices, jams, and pickles for their sales effectiveness.

To elaborate, the food company has the following supply chain structure:

Supply chain structure. Campaigns are run by the company – some of them are for multiple products, some of them are for a single product, but for all the contractors/clients, etc.

The food company runs campaigns of the sort "x% discount" for different products like jams and juices. Some of the campaigns are dedicated to the main contractors, while others target the contractors’ clients. There are also campaigns aimed directly at the consumer, e.g. “Buy 3, pay for 2”.

They wanted to create a model that predicts the number of sales per day for a given product under a promotional campaign, thus quantifying the impact of that campaign on the product’s sales.

So you have features related to:

  • 1 Client: size of the client in terms of store capacity, revenue, number of locations, number of contractors, etc.
  • 2 Contractor
  • 3 Product: price, type of product
  • 4 Promotion/campaign: TV ads vs. online ads vs. influencer marketing, the amount paid for that promotion, etc.

And you collate it all to create rows, one row per day, each including the sales number for that particular day.
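To make the row structure concrete, here is a minimal sketch of how such a per-day training row could be collated. The field names (client size, product price, promotion type, and so on) are illustrative assumptions, not the project’s actual schema:

```python
# Sketch: collate the four feature groups plus the day's sales into one flat row.
# All field names below are hypothetical examples.

def build_daily_row(date, client, contractor, product, promotion, sales):
    """Flatten feature groups into a single row for one day, prefixing keys by group."""
    row = {"date": date, "sales": sales}
    row.update({f"client_{k}": v for k, v in client.items()})
    row.update({f"contractor_{k}": v for k, v in contractor.items()})
    row.update({f"product_{k}": v for k, v in product.items()})
    row.update({f"promo_{k}": v for k, v in promotion.items()})
    return row

row = build_daily_row(
    "2021-03-01",
    client={"size": "large", "n_locations": 40},
    contractor={"region": "CEE"},
    product={"type": "juice", "price": 2.5},
    promotion={"type": "tv_ad", "discount_pct": 15},
    sales=1200,
)
```

One such row per product per day, stacked over the promotion period, forms the training table for a model.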

Patryk Miziuła was pretty confident that his ML team, with its expertise in handling such non-trivial cases, would be able to find a way to solve the problem.


This problem is particularly non-trivial due to:

  • the complexity of the data involving a large corpus of data sources
  • hundreds of different products
  • hundreds of contractors
  • thousands of contractors’ clients
  • different promotion types aimed at various stages of the product journey, from factory to household
  • different promotion parameters for each contractor or contractor’s client 
  • various promotion periods
  • overlapping promotions
  • competitors’ actions, etc.

Adding to the difficulties, it was also hard to decide whether a sales increase was caused by any one of the dozens of promotions applied, by the synergy between them, or whether it took place regardless of any campaign.

To solve such a complex problem, Patryk’s team had to iterate over the logic many times to structure the problem well. For example, promotions for different products and contractors were managed by different people and set manually, so there were no “general” promotion patterns that would hold for all contractors.

Therefore, the team decided to use a separate model for each product, contractor, and sometimes client type. This led them to more than 7000 separate cases to model.
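The splitting step can be pictured as grouping the daily rows by the (product, contractor, client type) combination, with each group becoming one modeling case. The keys and sample data below are illustrative, not the project’s real schema:

```python
# Sketch: partition the dataset into per-(product, contractor, client-type)
# subproblems; each group of rows gets its own model.

from collections import defaultdict

def split_into_subproblems(rows, keys=("product", "contractor", "client_type")):
    """Group rows by the combination of key columns; each group is one case."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)
    return groups

rows = [
    {"product": "juice", "contractor": "A", "client_type": "retail", "sales": 10},
    {"product": "juice", "contractor": "A", "client_type": "retail", "sales": 12},
    {"product": "jam", "contractor": "B", "client_type": "wholesale", "sales": 7},
]
cases = split_into_subproblems(rows)
# two distinct subproblems in this toy sample; in the real project there were 7000+
```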

Now, to model 7000 sub-problems, they had to train more than 120,000 models, which is a big problem in itself. You see, when a team works on an ML problem, there has to be an efficient solution in place for:

  • Tracking experiments: ~120k experiments need an efficient tracking system to execute the project within the stipulated time and with good results. To find the best set of configurations for every subproblem, with promotional, sales, and client data along with hyperparameters and model configs, many experiments have to be executed per subproblem.

    So, each experiment has to be tracked to find the best configuration to make informed decisions. With no experiment tracking in place, this would quickly turn into chaos, eventually resulting in missed deadlines and huge technical delays.
  • Visualizations and dashboarding: For this many different model training runs they needed:
    – A robust solution to compare runs and experiments
    – Since every ML project is executed by a team, the solution has to be collaborative

    Looking through metric plots one by one in a static environment is super tedious compared to a dynamic dashboard that compares multiple runs on a single plot. The latter saves time and is much more efficient and collaborative.
  • Saving metadata: A single ML experiment generates tons of metadata, including metrics (training/validation/testing) and results (graphs, charts, plots, and numeric data). Multiply that by thousands of runs and you have a real metadata management problem.

Patryk and his team quickly realized that these issues had to be solved first in order to move forward.

Clearly, handling the training of more than 7000 separate machine learning models without any specialized tool is practically impossible. We definitely needed a framework able to group and manage the experiments.
Patryk Miziuła Senior Data Scientist at


We needed a tool able to store and compare results of plenty of experiments, divided into subproblems. Also, simplicity of plugging the tool to the code was a criterion.
Patryk Miziuła Senior Data Scientist at

We agree with Patryk: using a dedicated tool is a wise choice, because you need to focus on the problem at hand. For Patryk, that problem was “analyzing the impact of promotional campaigns on sales increase”, not “how to manage 120k models efficiently”.

Fortunately, Patryk and his team members were already familiar with Neptune, so the decision was prompt. According to Patryk, the reasons for this choice were:

  • 1 Familiarity with Neptune
  • 2 The simplicity of using Neptune and the convenient API to download the runs table and interesting experiments
  • 3 Fast and accurate support

To put things in perspective, we asked Patryk what would have happened if his team had gone for the polar opposite solution: using directories and spreadsheets to store and track everything.

For each of the 200 product types, we created a separate filter tree, and the depth of the filter depended on the amount of data available at the current level.

That is a lot of models. If there were no experiment tracker on the market, I think we would have to try to store a huge amount of different models, and related metadata and results in different directories and excel sheets.

This would be tedious and time-consuming already. What would add to the misery, is that we needed to change the feature generating filter tree due to changing project requirements.

With a tool like Neptune, you can change things and it just works.
Patryk Miziuła Senior Data Scientist at

Getting started

Integrating Neptune into the project’s codebase went smoothly due to the familiarity of Patryk and his team with Neptune’s API. As he explains:

Adding Neptune to our code was a breeze. The only problem we experienced was that the number of experiments we created was so big that the standard API could not handle it. Creating batch versions of functions for downloading runs table and experiments was a solution.
Patryk Miziuła Senior Data Scientist at
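The batching workaround Patryk describes can be sketched as paging through the runs in fixed-size chunks instead of requesting all ~120k at once. `fetch_page` below is a hypothetical stand-in for the real API call, not Neptune’s actual client:

```python
# Sketch of a batch-download wrapper: page through runs in fixed-size chunks
# and concatenate the results. Function names here are illustrative.

def fetch_runs_batched(fetch_page, total, batch_size=5000):
    """Download runs in batches; `fetch_page(offset, limit)` returns a list of rows."""
    rows = []
    for offset in range(0, total, batch_size):
        rows.extend(fetch_page(offset, min(batch_size, total - offset)))
    return rows

# Example with an in-memory stand-in for the remote runs table:
fake_store = [{"run_id": i} for i in range(120_000)]

def fake_fetch_page(offset, limit):
    return fake_store[offset:offset + limit]

all_runs = fetch_runs_batched(fake_fetch_page, total=len(fake_store))
```

The same chunking idea applies whether the page size is bounded by the API, by memory, or by request timeouts.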

Let’s talk about what they used the platform for and how it helped them achieve their goals.

Logging and saving metadata

After the smooth integration of Neptune into the codebase, it was time to utilize its functionality. The team used Neptune for:

  • Since the number of experiments exceeded 100k, far too many to monitor and track manually, Neptune’s platform provided a way to organize and track all 120k experiments in the format of a leaderboard table. One workaround they had to apply was viewing experiments in batches, as the platform wasn’t able to list >100k experiments at once.

    Patryk’s team utilized Neptune+Optuna to optimize and monitor hyperparameters of those 120k experiments. They particularly liked the seamless integration of Optuna with Neptune.

    “Neptune turned out to be working well with Optuna. We were running 100 Optuna tries per model, the optimal hyperparameters found and the search history were stored in Neptune as easy-to-access interactive charts. In short: we liked it.” – Patryk Miziuła, Senior Data Scientist at

  • They logged pickled models from each experiment run directly to the experiment metadata, resulting in easy access. They also logged CSV files containing feature sets from Optuna.

  • A large part of their workflow involved running and comparing experiments with plots and graphs, which they chose to do on Neptune’s dashboard due to the plots’ modularity and interactivity.

    “Neptune is aesthetic. Therefore we could simply use the visualizations it was generating in our reports.

    “We trained more than 120 000 models in total, for more than 7000 subproblems identified by various combinations of features. Due to Neptune, we were able to filter experiments for given subproblems and compare them to find the best one. Also, we stored a lot of metadata, visualizations of hyperparameters’ tuning, predictions, pickled models, etc. In short, we were saving everything we needed in Neptune.” – Patryk Miziuła, Senior Data Scientist at
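To illustrate the shape of the per-model tuning loop (100 tries per model, with the search history stored in the tracker), here is a minimal stand-in in which a seeded random search plays Optuna’s role and a plain list plays Neptune’s. None of these names come from the real libraries:

```python
# Sketch of a tuning loop: run N trials per model, record each trial's
# parameters and score (as a tracker would), then pick the best.

import random

def tune(objective, n_trials=100, seed=0):
    """Random-search stand-in for an Optuna study; minimizes `objective`."""
    rng = random.Random(seed)
    history = []  # one entry per trial, what would be logged to the tracker
    for trial in range(n_trials):
        params = {"lr": rng.uniform(1e-4, 1e-1), "depth": rng.randint(2, 10)}
        history.append({"trial": trial, "params": params, "score": objective(params)})
    best = min(history, key=lambda h: h["score"])
    return best, history

# Toy objective standing in for validation error: prefers depth near 6 and a small lr.
best, history = tune(lambda p: abs(p["depth"] - 6) + p["lr"])
```

With a real tracker, each `history` entry would be a logged trial, and the search history would be browsable as interactive charts, as Patryk describes.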

When a machine learning project scales, it requires constant nurturing and monitoring, just like a human baby. To avoid descending into chaos at scale, you need a tool that can organize and track all of this for you.

At a certain stage of machine learning maturity the need for a tool like this one rises naturally. And then Neptune is a solid choice because of low entry threshold, many useful features, and good documentation and support.
Patryk Miziuła Senior Data Scientist at


Neptune was the primary choice for them owing to the team’s familiarity with it. Choosing Neptune as part of their MLOps stack let them:

  • Store model metadata without worrying about synchronization issues with the particular experiment.
  • Save weeks trying to do the same thing with directories and sheets.
  • Run 120k+ experiments without worrying about storage deficits and disk failures.
  • Compare multiple promotions’ results with different filters to get the best results.
Thanks to Neptune, we were able to run our scripts on 5 bare-metal machines simultaneously and store results without worrying about any potential synchronization problems. This let us work efficiently.
Patryk Miziuła Senior Data Scientist at

And for a team and project like this one, you simply have to have an experiment tracking tool. As Patryk explains:

If there were no experiment tracker then we would end up emulating some. So we would end up writing our own, poor version of a thing like that. It would probably take us a month or more. On the other hand, adding Neptune API to our workflow took us two days or something.
Patryk Miziuła Senior Data Scientist at

Neptune’s inclusion in the MLOps workflow proved productive for Patryk and his team for all those reasons. Opting for a tool like Neptune to do the heavy lifting while you focus on the problem at hand pays off not only in the quality of results but also in how quickly those results are achieved.

Thanks to Patryk Miziuła for his help in creating this case study!

Forget about manual experiment management as soon as possible. Switch to a specialized managing tool immediately. And definitely consider Neptune for it.
Patryk Miziuła Senior Data Scientist at

Running thousands of experiments? Tired of worrying about storage deficits and disk failures?