Continuum Industries
Continuum Industries is a company in the infrastructure industry that wants to automate and optimize the design of linear infrastructure assets like water pipelines, overhead transmission lines, subsea power lines, or telecommunication cables.
Its core product, Optioneer, lets customers input engineering design assumptions and geospatial data, and uses evolutionary optimization algorithms to find possible solutions for connecting point A to point B given the constraints.
As Chief Scientist Andreas Malekos, who works on Optioneer's AI-powered engine, explains, building and operating the engine is more challenging than it seems:
- The objective function does not represent reality
- There are a lot of assumptions that civil engineers don't know in advance
- Different customers feed it completely different problems, and the algorithm needs to be robust enough to handle those
Instead of trying to build the perfect solution, it's better to present engineers with a list of interesting design options so that they can make informed decisions.
The engine team leverages a diverse skill set spanning mechanical engineering, electrical engineering, computational physics, applied mathematics, and software engineering to pull this off.
Problem
A side effect of building a successful software product, whether it uses AI or not, is that people rely on it working. And when people rely on your optimization engine for million-dollar infrastructure design decisions, you need robust quality assurance (QA) in place.
As Andreas pointed out, they have to be able to say that the solutions they return to the users are:
- Good, meaning that it is a result that a civil engineer can look at and agree with
- Correct, meaning that all the different engineering quantities that are calculated and returned to the end-user are as accurate as possible
On top of that, the team is constantly working on improving the optimization engine. But to do that, you have to make sure that the changes:
- Don't break the algorithm in some way or another
- Actually improve the results, not just on one infrastructure problem but across the board
Basically, you need to set up proper validation and testing, but the nature of the problem the team is trying to solve presents additional challenges:
- You cannot automatically tell whether an algorithm output is correct or not. It is not like in ML where you have labeled data to compute accuracy or recall on your evaluation set.
- You need a set of example problems that is representative of the kind of problem that the algorithm will be asked to solve in production. Furthermore, these problems need to be versioned so that repeatability is as easily achievable as possible.

Initially, the team developed a relatively simple and completely custom solution to these problems:
- They implemented a database of "baseline problems"
- The algorithm would run on these problems, and quality metrics would be recorded and written to the database
- A developer could then make some changes to the algorithm, run the code against the "baseline problems", and compare the metrics generated with the database
- They created some visualization tools that worked by downloading all the metrics for a run
This proved to be an extremely clunky system for the following reasons:
- The metrics stored in the database would go out of date as soon as someone made a change to the algorithm, which meant they had to run an "update" job very often.
- This "update" job was not properly unit tested, so it often broke. This meant that every time a developer tried to update the baseline metrics, they would also have to fix the system itself, which turned into a tedious and painful process.
- The system was pretty complex, which turned it into a "product within a product" that they did not have time to maintain or fix when it broke.

According to Andreas, it took them a while to realize that even though they do not use ML in the product, they face many of the same challenges as ML in production. That's when they decided to properly investigate the MLOps solutions already out there and see which one could fit their use case best.
Solution
As Andreas explains, having tried to build a similar solution themselves, they knew they wanted a tool that:
- could easily track and visualize different types of data
- would let them track both local and cloud runs in the same way
- would not require them to self-host or maintain it

After reading many blogs comparing different experiment trackers and then spending most of their evaluation time going through the documentation of each of the tools, they decided to go with Neptune.
The Optioneer engine team chose Neptune because:
1. Getting started is really easy
2. Comparing, monitoring, and debugging works great
3. They have total flexibility in the metadata structure
4. They love the support
5. It is easy to access Neptune from anywhere, including CI/CD pipelines
From zero to Hello World was very quick; adding Neptune calls to the existing metrics-logging classes took a week or so, but that was primarily because of the complexities of the codebase. For example, they had trouble logging multidimensional NumPy arrays, which was eventually solved by uploading them as files.
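For illustration, this is roughly what that workaround could look like with the Neptune Python client. It is a minimal sketch, assuming the current client API; the project name, field paths, array shape, and values are hypothetical, not the team's actual code.

```python
# Minimal sketch: logging a multidimensional NumPy array to Neptune by saving
# it to disk and uploading the file. Project name and field paths are hypothetical.
import numpy as np
import neptune  # assumes the NEPTUNE_API_TOKEN environment variable is set

run = neptune.init_run(project="continuum/optioneer-qa")

design_variables = np.random.rand(4, 128, 3)  # stand-in for the engine's output

# Multidimensional arrays cannot be logged as plain scalar metrics,
# so persist them and upload the resulting file to a field on the run.
np.save("design_variables.npy", design_variables)
run["outputs/design_variables"].upload("design_variables.npy")

# Scalar quality metrics can be assigned directly.
run["metrics/objective_value"] = 1234.5
run["metrics/constraint_violation"] = 0.02

run.stop()
```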
As Andreas explains:
"We also liked the conceptual simplicity of Neptune: unlike some of its competitors, it's just a metadata store and doesn't try to solve a million different problems, so it was easy to add it to our existing code."
After integrating Neptune into their codebase, it became much easier to track experiment runs and compare plots of different metrics across runs, but also to monitor and debug production runs.
"The ability to compare runs on the same graph was the killer feature, and being able to monitor production runs was an unexpected win that has proved invaluable." – Andreas Malekos, Chief Scientist, Continuum Industries
As Andreas told us, they record quality metrics (objective value, constraint violation), the final value of the design variables, and all the input parameters for production runs.
Keeping track of all that helps them debug production failures easily:
- If the optimization crashes, recording the input parameters and the code version makes it fairly easy to replicate the error and find out why things crashed.
- If the result looks very bad, recording the objective and constraint violation and the final design variables allows the team to re-create the final result locally (see the sketch after this list). They can then inspect it and figure out why their algorithm thinks this is a good result and why it was preferred.
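As an illustration, here is a minimal sketch of what re-opening a past production run for local debugging could look like. The run ID, field paths, and file names are hypothetical, and it assumes the commit hash is logged by the team alongside the inputs.

```python
# Hypothetical sketch: pull a production run back locally to reproduce a failure.
import neptune

# Re-open the existing run in read-only mode so nothing gets overwritten.
run = neptune.init_run(with_id="OPT-1234", mode="read-only")

params = run["parameters"].fetch()         # all input parameters as a dict
commit = run["source/git_commit"].fetch()  # assumes the commit hash was logged as a field

# Download the final design variables that were uploaded as a file,
# so the "bad" result can be re-created and inspected locally.
run["outputs/design_variables"].download("design_variables.npy")

run.stop()
print(f"Re-run the engine at commit {commit} with params: {params}")
```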
"We are huge fans of the way data is structured in Neptune runs. The fact that we can basically design our own file structure effortlessly gives us enormous flexibility." – Andreas Malekos, Chief Scientist, Continuum Industries
This flexibility makes it easy to use Neptune for pretty much anything:
- When researching algorithm improvements, they can record results in their custom structure and compare them easily
- Monitoring and recording production runs in a way that is convenient for debugging
- Using Neptune as part of their CI pipeline by setting up many batch jobs, all writing metric data to the same Neptune run, and then comparing the results to the current version of master to make sure they've not broken anything
For example, in the case of the engine tests, where multiple jobs write to a single run, the structure would look like this (a short logging sketch follows below):
- metrics: top-level folder where all the metrics are stored
  - {TEST_CASE_NAME}
    - {INDEX}: each case is run multiple times with different seeds
      - {STAGE_NAME}: there are multiple stages during the optimization
        - metric0
        - metric1
        - metric2
      - {STAGE_NAME}
    - {INDEX}
  - {TEST_CASE_NAME}
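A single engine-test job could write its metrics to the shared run along these lines. This is a sketch only; the project name, test case, stage names, and metric values are made up for the example.

```python
# Sketch: one CI batch job logging its metrics under the structure described above.
import neptune

run = neptune.init_run(project="continuum/engine-tests")  # hypothetical project name

test_case = "water_pipeline_01"  # {TEST_CASE_NAME}
seed_index = 3                   # {INDEX}: one of several repetitions with different seeds

for stage in ["corridor_search", "route_refinement"]:  # {STAGE_NAME}s
    base = f"metrics/{test_case}/{seed_index}/{stage}"
    run[f"{base}/metric0"] = 0.91
    run[f"{base}/metric1"] = 0.07
    run[f"{base}/metric2"] = 12.5

run.stop()
```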
"We had some issues with using the new Neptune API at first, and we've received an incredible amount of support from the team since then. Talk to the Neptune team if you run into any issues as they are incredibly helpful." – Andreas Malekos, Chief Scientist, Continuum Industries
As Andreas shared with us, when you adopt Neptune (or any tool, for that matter), there may be some bumps along the way. What is unique about Neptune is that you can really count on the team to help you through them, share your feedback and improvement ideas, and see them implemented. You get to be a part of the journey.
It was important for the Optioneer engine team to be able to use Neptune both locally and in the cloud environment, to log metadata from offline jobs, and to set it up for debugging with the various connection modes.
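The Neptune client's connection modes cover those cases; roughly like this (the mode names come from the client, while the project name is hypothetical):

```python
# Sketch of the connection modes mentioned above.
import neptune

# Default asynchronous mode for regular local or cloud runs.
run = neptune.init_run(project="continuum/optioneer-qa", mode="async")

# Offline mode: metadata is written to local disk and can be synced later
# with the `neptune sync` CLI command, useful for jobs without connectivity.
offline_run = neptune.init_run(project="continuum/optioneer-qa", mode="offline")

# Debug mode: nothing is sent or stored, handy when testing the integration.
debug_run = neptune.init_run(mode="debug")

# ... log metadata, then call .stop() on each run when done.
```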
They needed Neptune to play nicely with their CI/CD pipelines too.
"When someone submits a Pull Request, it triggers a CI/CD pipeline via GitHub Actions. Each step evaluates the newest version of the algorithm on a baseline problem in a separate process.
We were afraid that organising all of those results for each CI/CD pipeline execution would be a nightmare, but thanks to Neptune's custom run ID functionality we can log all of the evaluations to the same run and keep it nice and clean." – Andreas Malekos, Chief Scientist, Continuum Industries
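A minimal sketch of that custom run ID mechanism is below; the environment variable name, project, and field path are illustrative (GitHub Actions exposes its own run identifier that can be passed through the workflow).

```python
# Sketch: several CI jobs log to the same Neptune run by sharing a custom run ID.
import os
import neptune

run = neptune.init_run(
    project="continuum/engine-tests",             # hypothetical project name
    custom_run_id=os.environ["PIPELINE_RUN_ID"],  # same value in every job of the pipeline
)
# (Alternatively, the NEPTUNE_CUSTOM_RUN_ID environment variable can be set.)

# Each job writes its own evaluation under its own namespace in the shared run.
run["metrics/water_pipeline_01/0/corridor_search/metric0"] = 0.89
run.stop()
```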
Results
As the team shared with us, Neptune improved their entire workflow.

Andreas explained that, when working on optimization engine improvements, they start with one of the test problems, run the modified version of the algorithm, and have all parameters and results tracked in Neptune. This lets the team quickly look back at what they have tried so far and plan the next steps relatively easily.

Neptune is not only used in the experimentation phase; it also sits at the core of their version of a production MLOps pipeline, executed through GitHub Actions. To assure model quality with proper CI/CD jobs, they:
- Deploy a bunch of cloud instances on AWS EC2
- In each instance, clone the repository and install the requirements
- Run one of many test problems
- For each running instance of a test problem, they collect metrics and write them all to the same Neptune run
- Calculate aggregate metrics across all tests
- Compare these aggregate metrics to a previous point in time and decide whether the quality of the algorithm has improved with statistical significance (see the sketch after this list)
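As an illustration of that last step, the comparison could look something like the sketch below. The Wilcoxon signed-rank test, run IDs, and field paths are assumptions made for the example, not necessarily the team's actual setup.

```python
# Illustrative sketch: compare aggregate metrics of a candidate pipeline run
# against a baseline run and test whether the difference is significant.
import neptune
from scipy.stats import wilcoxon

baseline = neptune.init_run(with_id="OPT-100", mode="read-only")
candidate = neptune.init_run(with_id="OPT-142", mode="read-only")

test_cases = [
    "water_pipeline_01", "transmission_line_02", "subsea_cable_03",
    "telecom_route_04", "water_pipeline_05", "transmission_line_06",
]
baseline_scores = [baseline[f"aggregates/{t}/objective_value"].fetch() for t in test_cases]
candidate_scores = [candidate[f"aggregates/{t}/objective_value"].fetch() for t in test_cases]

stat, p_value = wilcoxon(baseline_scores, candidate_scores)
print(f"p = {p_value:.3f}: {'significant' if p_value < 0.05 else 'not significant'} change")

baseline.stop()
candidate.stop()
```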
With Neptune, the Optioneer team can:
- Easily keep track of and share the results of their experiments
- Monitor production runs and, when something goes wrong, track down and reproduce errors much faster than before
- Have much more confidence in the results they generate and in how new versions of the Optioneer engine are built
- Understand the performance of their algorithm at any given time, with all the engine-related metadata recorded to Neptune through their weekly quality assurance CI/CD pipelines
Before Neptune, getting all that functionality required an order of magnitude more time.
Now, they have more trust in their algorithm and more time to work on the core features rather than tedious and manual updates.

Thanks to the whole team behind the Optioneer engine (Andreas Malekos, Miles Gould, Daniel Toth, Ivan Chan) for their help in creating this case study!