Case Study

InstaDeep

“I like that Neptune does not get in your way – it is not very intrusive. It also does very well with the comparison of runs, sharing, and working collaboratively.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep

InstaDeep is an EMEA leader in delivering decision-making AI products. Leveraging their extensive know-how in GPU-accelerated computing, deep learning, and reinforcement learning, they have built products, such as the novel DeepChain™ platform, to tackle the most complex challenges across a range of industries. 

InstaDeep | Source

InstaDeep has also developed collaborations with global leaders in the AI ecosystem, such as Google DeepMind, NVIDIA, and Intel. They are part of Intel’s AI Builders program and are one of only 2 NVIDIA Elite Service Delivery Partners across EMEA. The InstaDeep team is made up of approximately 155 people working across its network of offices in London, Paris, Tunis, Lagos, Dubai, and Cape Town, and is growing fast.

About the BioAI team

The BioAI team is the place at InstaDeep where biology meets artificial intelligence. At BioAI, they advance healthcare and push the boundaries of medical science through a combination of biology and machine learning expertise. They are currently building DeepChain™, their platform for protein design. They are also working with their customers in the bio sector to tackle the most challenging problems with the help of bioinformatics and machine learning.

DeepChain dashboard | Source

They apply the DeepChain™ protein design platform to engineer new sequences for protein targets using sophisticated optimization techniques such as reinforcement learning and evolutionary algorithms. They also leverage language models pre-trained on millions of protein sequences and train their own in-house protein language models. Finally, they use machine learning to predict protein structure from sequence.

Problem

Building complex software like DeepChain™, a platform for protein design, requires extensive research with many moving parts. Customers ask for different types of solutions, and each one calls for new experiments and research. With several experiments running for different customers at once, keeping track of them all while staying productive is daunting for a team of any size.

Faced with managing numerous experiments, Nicolas and the BioAI team encountered a series of challenges:

  • Experiment logs were all over the place
  • It was difficult to share experiment results
  • Machine learning researchers were dealing with infrastructure and operations

Experiment logs were all over the place

“Logs ended up all over the place as there was no centralized repository for the data.” – Nicolas Lopez Carranza, DeepChain and BioAI Lead at InstaDeep

Dealing with an enormous number of experiment logs is hard enough; it is harder still when they are not organized. Finding experiment results became a huge challenge for the team because there was hardly any visibility. With logs scattered across documents and files, experiments became difficult to manage, and the team spent more time figuring out where results were than doing the actual research. Engineers and researchers took a long time to compare the results of previous runs because they first had to find the log file where each earlier result had been recorded.

It was difficult to share experiment results

“When you use TensorBoard or equivalent tools you have to deal with the extra DevOps stuff of exposing your localhost somehow if you wish to share your results.” – Nicolas Lopez Carranza, DeepChain and BioAI Lead at InstaDeep

Working on many experiments requires researchers to collaborate and share results. That is hard when logs are not organized, and harder still when there is no easy way to share them. For the BioAI team, sharing results from TensorBoard was not possible without workarounds, and obstacles like this interrupted researchers’ flow. If a researcher has to expose a URL to their experiment results and make sure it is secure (especially for sensitive work) before sharing them with a colleague, the urge to collaborate inevitably drops.

Machine learning researchers were dealing with infrastructure and operations

“I’d say the advantage (of TensorBoard) is that it’s free and it works pretty well but anytime an engineer wanted to show the team some training curve, they’d need to start the VM (Virtual Machine) containing the logs, or make their localhost port available, expose it to the internet, it was not very secure… When you end up having to start a VM just to visualize some logs, you realize there should be a better tool.” – Nicolas Lopez Carranza, DeepChain and BioAI Lead at InstaDeep

Researchers are trained primarily to do research, so productivity suffers when they have to configure infrastructure just to do their job. This is one of the challenges the BioAI team encountered when they used TensorBoard to manage their experiments. They often had to tackle operational work, including spinning up and managing the infrastructure for TensorBoard, before they could visualize experiment results. Anyone who has had to configure infrastructure for a piece of software knows this is far from an easy task, even for an operations engineer.

Solution

Faced with these challenges, Nicolas decided to look for a solution that would solve the team’s experiment management problems and let them use their time and effort more efficiently.

“We needed a tool that could expose and share TensorBoard-like dashboards between the team, and store the logs of previous runs.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep

In searching for a solution for the team, Nicolas also needed a tool that met the following criteria:

  • Easy to use
  • Simple to connect to TensorFlow and PyTorch logs
  • Not too expensive

It turned out that some of the engineers on the BioAI team were already using a tool for their projects that fit the criteria Nicolas outlined.

“Over time we realised that our ML engineers were utilising Neptune even for small personal projects so we realised it was the right approach.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep

Neptune ended up being the tool the team adopted, not just because it was popular among the engineers on the team but also because it met the BioAI team’s requirements for an experiment management solution.

“Before Neptune, we did not have a similar platform. I know there are other tools on the market and some people on the team have used those. But most of our team members already had Neptune accounts and knew the tool. We had to standardize our tool stack and decided to go with Neptune.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep

Neptune proved to be the ideal solution for the team because:

  • It was accessible
  • It provided more visibility for experiment logs
  • It enabled straightforward collaboration between researchers and engineers on the team
  • It eliminated the need for operations and infrastructure configuration

Accessibility

Given the challenges they had faced with TensorBoard, the team needed to make sure Neptune was accessible and easy to use, and it turned out it was. As Nicolas explained of his first experience with Neptune:

“The documentation is quite detailed and offers code snippets to get started hence in less than 1 hour, we were logging our machine learning projects.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep

The team also found the Neptune SDK was complete and useful for their workflow.

“Well, I think the documentation is very good, so you can get started very easily. You will just create your token and the SDK is also quite complete. I think these are the main advantages—you can get started very quickly.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep


Here’s how you can set up Neptune in a few steps.

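1. Create a free account
2. Install Neptune client library
    pip install neptune-client
3. Add logging to your script
    # Legacy neptune-client (0.x) API; newer releases use neptune.init_run().
    import neptune.new as neptune

    # The API token is read from the NEPTUNE_API_TOKEN environment variable.
    run = neptune.init('Me/MyProject')           # 'workspace/project'
    run['params'] = {'lr': 0.1, 'dropout': 0.4}  # log hyperparameters
    run['test_accuracy'] = 0.84                  # log a single metric value
    run.stop()                                   # flush and close the run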

Visibility for experiment logs

One of the challenges the team faced before using Neptune was that logs were scattered and unorganized, which made experiment results hard to find. This was a crucial consideration for the BioAI team: they needed a solution that would manage all their experiments in a central repository so that logs could be easily tracked.

Neptune provided more visibility into experiment logs by centralizing, indexing, and organizing all experiment runs. This way, the team could view details of various experiment runs, search for specific runs through their metadata and tags, and see visualizations related to specific experiments.
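
To make the idea of searching runs by metadata and tags concrete, here is a minimal, self-contained sketch in plain Python. It is not Neptune's API or implementation: the `Run` and `RunRegistry` names, the fields, and the example values are all hypothetical and exist only to illustrate how a single central index lets you filter runs by tag and by parameter.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One experiment run (hypothetical structure, for illustration only)."""
    run_id: str
    params: dict
    metrics: dict
    tags: set = field(default_factory=set)


class RunRegistry:
    """A toy central store: every run lives in one place and can be filtered."""

    def __init__(self):
        self._runs = {}

    def add(self, run):
        self._runs[run.run_id] = run

    def search(self, tags=(), **params):
        """Return runs that carry all given tags and match all given params."""
        wanted = set(tags)
        return [
            run for run in self._runs.values()
            if wanted <= run.tags
            and all(run.params.get(k) == v for k, v in params.items())
        ]


# Made-up example data, not results from the case study.
registry = RunRegistry()
registry.add(Run("RUN-1", {"lr": 0.1, "dropout": 0.4}, {"test_accuracy": 0.84}, {"baseline"}))
registry.add(Run("RUN-2", {"lr": 0.01, "dropout": 0.4}, {"test_accuracy": 0.88}, {"baseline", "protein-lm"}))
registry.add(Run("RUN-3", {"lr": 0.01, "dropout": 0.2}, {"test_accuracy": 0.86}, {"protein-lm"}))

# Find every run tagged "protein-lm" that was trained with lr=0.01.
matches = registry.search(tags=["protein-lm"], lr=0.01)
print([run.run_id for run in matches])
```

Because every run sits in one registry, a lookup is a single filter over tags and parameters rather than a hunt through scattered log files, which is the contrast the team describes.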

Neptune dashboard | Source

“I think Neptune does one thing very well, which is get your logs and charts right where and when you need them… The search feature of looking for runs and using the tags for the runs is very good as well. The idea of tagging experiment runs is very useful.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep

Straightforward collaboration

Recall the problem the BioAI team had with sharing results, when they had to expose localhost in order to share them. Neptune solved that challenge by providing a password-protected, easily shareable link to experiment results without any extra configuration.

This made collaboration on experiments straightforward. The team could also collaborate on research through Neptune’s feature for comparing runs and experiment metrics produced by different researchers and engineers.
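
As a rough illustration of what comparing runs involves, the sketch below lines up the same metric from several runs and ranks them. It is plain Python rather than Neptune's comparison feature; the `compare_runs` helper, the run records, the owner names, and the numbers are all invented for the example.

```python
def compare_runs(runs, metric):
    """Rank runs by one metric (highest first) and report each gap to the best."""
    if not runs:
        return []
    ranked = sorted(runs, key=lambda r: r["metrics"][metric], reverse=True)
    best = ranked[0]["metrics"][metric]
    return [
        {
            "run_id": r["run_id"],
            "owner": r["owner"],
            metric: r["metrics"][metric],
            "gap_to_best": round(best - r["metrics"][metric], 4),
        }
        for r in ranked
    ]


# Hypothetical runs logged by different team members.
runs = [
    {"run_id": "RUN-1", "owner": "researcher_a", "metrics": {"test_accuracy": 0.84}},
    {"run_id": "RUN-2", "owner": "researcher_b", "metrics": {"test_accuracy": 0.88}},
    {"run_id": "RUN-3", "owner": "engineer_c", "metrics": {"test_accuracy": 0.86}},
]

comparison = compare_runs(runs, "test_accuracy")
for row in comparison:
    print(row)
```

The point of the sketch is that comparison only works when every run records the same metric under the same name in a shared place, which is what a central tracker provides.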

“I like that Neptune does not get in your way – it is not very intrusive. It also does very well with the comparison of runs, sharing, and working collaboratively.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep

Eliminated the need for operations and configuring infrastructure

As we learned earlier, one obstacle to the productivity of the researchers at BioAI was the time they spent configuring infrastructure and handling DevOps tasks before they could use a tool to manage their experiment logs. Neptune eliminated the need to configure infrastructure for logging experiment results by providing a fully managed solution for the team.

“No more DevOps needed for logging. No more starting VMs just to look at some old logs. No more moving data around to compare TensorBoards.”

Nicolas Lopez Carranza

DeepChain and BioAI Lead at InstaDeep

Results

Nicolas and the BioAI team have been using Neptune for about 2 years (as of the time this case study was published), and over that period the team has improved its workflow. Neptune proved to be a useful solution because:

It saved a lot of research time

“No time spent looking for the data. It’s always there, available, and displayed the way we want.” – Nicolas Lopez Carranza, DeepChain and BioAI Lead at InstaDeep

The visibility brought about by Neptune’s centralized and organized experiment management dashboard ensured the team spent little to no time searching for results and metadata on experiments.

It improved the team’s productivity

“We use it (Neptune) daily, as a big part of what we do is sharing results and discussing them. The [team’s] productivity increased for this reason.” – Nicolas Lopez Carranza, DeepChain and BioAI Lead at InstaDeep

Because Neptune eliminated most of the obstacles the team faced in managing a large number of experiments, researchers could focus on research. As a result, the team’s productivity improved, as did their ability to collaborate on experiments.

“I definitely recommend you give Neptune a try because it is a very useful tool and I think you will enjoy it.” – Nicolas Lopez Carranza, DeepChain and BioAI Lead at InstaDeep


Thanks to Nicolas Lopez Carranza for his help in creating this case study!

  • Industry: Bioinformatics and machine learning
  • Location: London, England
  • Team size: 30
  • Frameworks: Argo Workflows, Kubernetes, TensorFlow, and PyTorch
  • Neptune use cases: experiment tracking, experiment management