How deepsense.ai Tracked and Analyzed 120K+ Models Using Neptune
deepsense.ai is an AI-focused software services company that delivers ML-based end-to-end solutions to enterprises in retail, manufacturing, finance, and other sectors.
We spoke to Patryk Miziuła, who led a project that deepsense.ai delivered for a leading Central and Eastern European (CEE) food company. The task was to use machine learning to analyze the impact of promotional campaigns on sales.
The challenge
The project was fairly complex, as it included:
- Vast amounts of data from multiple sources.
- Various product types, contractors, and promotional strategies.
Due to the unique promotional dynamics for each product type and contractor, the team trained over 120,000 models to address more than 7,000 subproblems.
Without experiment tracking in place, this would have quickly turned into chaos, eventually resulting in missed deadlines and huge technical delays.
So, the team started to look for a tool that would allow them to:
- Efficiently track and manage a huge amount of metadata;
- Create robust visualizations and dashboards to compare 120k+ models;
- Collaborate within the team to make sure no work is duplicated and everybody works in sync.
The team had used Neptune on earlier projects and liked its convenient API, user-friendly UI, and good customer support, so they decided to use it here, too.
Efficient logging and saving of metadata
A single ML experiment generates a lot of metadata, including metrics (training/validation/testing), results (graphs/charts/plots, and numeric data), and model artifacts. If you multiply this by thousands of runs, you have a real metadata management problem.
Luckily, it wasn’t a problem for Neptune, which lets you efficiently track and manage a very large number of experiments and models. The one workaround the team needed was to view experiments in batches, because the UI couldn’t display more than 100k experiments at once.
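The idea of working through experiments in batches can be sketched in a few lines. This is a hypothetical, standard-library-only illustration: the run IDs, the `batched` helper, and the batch size of 10,000 are assumptions made for the example, not details from the project or part of Neptune's API.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(ids)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical run IDs standing in for the project's 120k+ experiments.
run_ids = [f"RUN-{i}" for i in range(1, 120_001)]

# Assumed batch size; each batch stays well under the 100k display limit.
batches = list(batched(run_ids, 10_000))
print(len(batches), len(batches[0]))
```

Each batch can then be inspected or compared on its own, which keeps any single view to a manageable size.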
In addition to experiment metadata, the team logged the pickled model from each run directly to that run’s metadata, which made the models easy to access later.
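As a rough illustration of keeping a pickled model next to a run's metadata, the sketch below bundles parameters, metrics, and the serialized model into one record. It uses only Python's standard library; the `build_run_record` and `load_model` helpers, the field names, and the toy "model" are hypothetical, and the actual upload to Neptune is not shown.

```python
import pickle
from typing import Any, Dict


def build_run_record(model: Any, params: Dict[str, Any],
                     metrics: Dict[str, float]) -> Dict[str, Any]:
    """Bundle parameters, metrics, and the pickled model into one record."""
    return {
        "params": dict(params),
        "metrics": dict(metrics),
        "model_pickle": pickle.dumps(model),  # bytes, ready to be stored
    }


def load_model(record: Dict[str, Any]) -> Any:
    """Restore the model object from a record built above."""
    return pickle.loads(record["model_pickle"])


# Toy stand-in for a trained model (hypothetical values).
model = {"intercept": 1.5, "slope": 0.25}
record = build_run_record(model, {"alpha": 0.1}, {"val_rmse": 3.2})

restored = load_model(record)
print(restored == model)
```

Storing the model with its run means that anyone reviewing the metrics can load the exact object that produced them. Note that pickled data should only be loaded from trusted sources.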
Integrating hyperparameter optimization with experiment tracking
Patryk’s team used Optuna to optimize and monitor the hyperparameters of those 120k experiments. They particularly liked the seamless integration of Optuna with Neptune. They also logged CSV files containing feature sets from Optuna to Neptune.
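The case study does not describe how the feature-set CSV files were structured, so the following is only a sketch of one possible layout. The column names, the `feature_sets_to_csv` helper, and the trial records are hypothetical, and neither Optuna nor Neptune is called here; the example only shows how per-trial feature sets could be serialized to CSV text with the standard library before being logged.

```python
import csv
import io
from typing import Any, Dict, List


def feature_sets_to_csv(trials: List[Dict[str, Any]]) -> str:
    """Serialize per-trial feature sets into CSV text, one row per trial."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["trial", "score", "features"])
    for trial in trials:
        # Join feature names with ';' so the set fits in a single CSV cell.
        writer.writerow([trial["number"], trial["score"],
                         ";".join(trial["features"])])
    return buffer.getvalue()


# Hypothetical trial results; a real study would supply these values.
trials = [
    {"number": 0, "score": 0.81, "features": ["price", "discount"]},
    {"number": 1, "score": 0.84, "features": ["price", "discount", "week"]},
]

csv_text = feature_sets_to_csv(trials)
print(csv_text)
```

A flat file like this is easy to attach to a run and to reopen later in a spreadsheet or data frame when comparing which feature sets performed best.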
Generating relevant visualizations
A large part of the team’s workflow involved running and comparing experiments with plots and graphs. However, it becomes tedious to look through metric plots one by one in a static environment.
Neptune addressed this with a flexible and customizable UI. The team created dynamic dashboards to compare multiple runs on a single plot, which saved time and made the comparison work more efficient and collaborative.
The results
- Tracked 120k+ experiments without worrying about storage deficits, disk failures, or synchronization of metadata to Neptune.
- Enhanced productivity and decision-making through advanced data visualization and experiment tracking.
- Saved weeks thanks to managing experiments with Neptune instead of manual methods.
- Successfully analyzed the impact of promotional campaigns on sales for hundreds of products.
Thanks to Patryk Miziuła for his help in creating this case study!