neptune.new

First of all – we’re sorry! We’ve been awfully quiet over the last few months when it comes to product updates. You’ve seen an update here and there in the web interface or a bug fix in the Python client library, but nothing bigger than that.

What was going on?

Over time, we got a tremendous amount of feedback from you. You asked how to use Neptune with spot instances and pipelines, how to organize things in a more hierarchical way, and much more – and we wanted to deliver all of it. But adding those improvements to the current product in small increments was getting harder (and slower). Don’t get me wrong, we love to iterate, but sometimes you need to take a deep, deep breath, take a step back, and rebuild the foundations.

So, in short, that’s what was going on. However, believe me – the wait was worth it 🙂

Today we are happy to announce that a brand new version of Neptune is ready for you with many new features and a revamped Python API!

What is so new about neptune.new?

Better organization, more flexibility, more customization

Typically, when training a simple model with just a few parameters, you can recite them all from memory. And they all fit on one screen.

However, the problem starts once you add that one parameter too many. And it becomes a real pain when the only way you can manage complex parameter configurations is to sort them alphabetically.

You all had different solutions for that:

  • explicitly logging just a few parameters and uploading the rest as a YAML config file,
  • clever use of prefixes, 
  • and many, many others. 

All of those were workarounds for the lack of organization and flexibility in Neptune itself. And that has changed!

With the new Neptune API, you can arrange all of your metadata hierarchically into neat namespaces (folders). And by all metadata, I mean ALL metadata: not only parameters but also logs, validation metrics, binary artifacts, etc.

[Image: neptune.new structure comparison]

How can you do this in code? The new Neptune API relies on a dictionary-like interface. Instead of passing parameters here, properties there, metrics this way, and artifacts that way, you track all metadata in one unified way.

from hashlib import md5

import neptune.new as neptune
from neptune.new.types import File

run = neptune.init()

# Track dataset version and upload sample data
run['data/version/train'] = md5(train_set.data).hexdigest()
run['data/version/test'] = md5(test_set.data).hexdigest()
run["data/sample"].upload_files("data/sample_images")

# Track hyperparameters of your run
PARAMS = {'lr': 0.005, 'momentum': 0.9}
run["model/params"] = PARAMS

# Upload charts as image files e.g. model visualization
run["model/visualization"].upload(File.as_image(model_vis))

# Track the training process metadata
for epoch in epochs:
    for batch in batches:
        [...]
        run["batch/accuracy"].log(accuracy)
        run["batch/loss"].log(accuracy)

# Log the final results
run["test/total_accuracy"] = calculate_test_accuracy()

Oh, and that also means that you no longer need to worry about setting the parameters upfront – you can update them when and where it is convenient for you!
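For example, nothing stops you from adding or overwriting a parameter later in the script. A minimal sketch (the field names below are just illustrative):

# Add a parameter you only learn about after setup...
run["model/params/weight_decay"] = 0.0001

# ...or overwrite one that changed along the way
run["model/params/lr"] = 0.001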

More workflows supported: offline mode, resuming runs, and ML pipelines  

Sometimes you run your script, you analyze the results, and you move on. More often than not, it’s not that simple:

  • Maybe you need to train for a few more epochs?
  • Maybe after the first analysis, you need to compute a few more validation metrics?
  • Maybe your spot instance died, and you need to resume your training?

Fear not. Neptune is now prepared for that. And more 🙂

With the new Python API, you can resume tracking to any existing run. You can fetch all logged metadata, update it, and log new data. What’s more, it’s now thread-safe, and you can connect outputs from different scripts into a single run. Thanks to this, you can use Neptune with spot instances, in parallel and distributed computing, and in multi-step ML pipelines.
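For example, reconnecting to an existing run is just a matter of passing its ID to neptune.init(). Here is a minimal sketch (the project name and run ID are placeholders – use the ones from your workspace):

import neptune.new as neptune

# Reconnect to an existing run by its ID (visible in the web interface)
run = neptune.init(project="your_workspace/your_project", run="RUN-123")

# Fetch previously logged metadata...
lr = run["model/params/lr"].fetch()

# ...and keep logging to the same run
run["train/accuracy"].log(0.91)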

Finally, if you train your models in places without permanent access to the internet, you will be happy to hear we added an offline mode. The tracked metadata is stored locally, and you can upload it in a batch at a convenient time. And if your internet connection is a bit shaky or an unexpected interruption happens, Neptune will now automatically switch to offline mode after a number of retries, so your data is always safe.
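Here is a minimal sketch of the offline workflow, assuming neptune-client 0.9+ (check the documentation for the exact options available in your version):

import neptune.new as neptune

# Nothing is sent to the server; all tracked metadata is stored on the local disk
run = neptune.init(mode="offline")
run["parameters"] = {"lr": 0.005, "momentum": 0.9}
run.stop()

Later, once you are back online, you can upload the locally stored data with the command-line tool:

neptune sync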

Better web interface: folder structure, better compare, and split view

The first thing you will notice is that the new UI takes advantage of the flexibility in how tracked metadata is organized. Previously, you had to jump between different sections to fully understand what input generated a model. Now you can arrange all metadata in whatever hierarchical structure is most convenient for you.

[Image: neptune.new metadata view]

The second thing you will notice is that the table where you see all your runs and their metadata has changed. Previously, the option to compare different runs was not only a bit hard to find, but making changes also meant going back and forth between views. On top of that, you could compare only 10 runs at a time. Now, you can:

  • compare way more runs (say, 100 or more),
  • add and remove what you want to display on the fly,
  • toggle between the table view and the comparison view, or keep both visible at the same time.

[Image: neptune.new run comparison]

Cool, right? Psst, there’s one more thing. When you browse a specific run’s details, you may notice the “Add a new dashboard” button. It’s still early days, and we’ll be adding more types of widgets, but you can already compose your own dashboard to visualize metadata in the way that suits you best. Please check it out and let us know what you think!

[Image: neptune.new custom dashboard]

Run, run, run (goodbye experiment)

From the start of this article, I’ve been using the word ‘run’ instead of ‘experiment’, and it’s not a coincidence. As our industry grows and more Neptune users care about operationalizing ML models (MLOps), we need to grow with it.

Of course, the experimentation phase is important (and close to our hearts), and experiment tracking will remain one of the primary use cases of Neptune, but more and more people use it for other things like:

  • a model registry,
  • monitoring model re-training pipelines,
  • monitoring models running in production.

We want to better serve the use cases that YOU have. We want to be more aligned with the naming that YOU use to describe your work.

Calling it ‘experiment’ just doesn’t make sense anymore, so we are changing it to ‘run’.  You will see the change from Experiment to Run both in the web interface and in the new Python API.

How do I get started with neptune.new?

Wait, so do I need to change my code?

In short – no. You don’t need to do anything now, and your runs will be tracked by Neptune without a problem. 

Under the hood, the new Python API and the revamped user interface require a changed data structure. Over the following weeks, we will be migrating existing projects to that new structure, but you can already try it out, as all new projects are created with it.

The current Python API will continue to work after the migration, so you don’t need to change a single line of code. In the background, we’ll quietly do our magic and make sure things keep working for you. However, the new Python API is only compatible with the new data structure, so it will only be available for your project once it’s migrated. Similarly, the improved web interface also requires the new data structure. You can already try it out with a new project, and it will be available for your existing projects once they are migrated.

At some point in the future, we plan to make the new Python API the default one with the release of v1.0 of the client library. However, we will keep supporting the current Python API for a long time, so you can make the switch at a convenient moment. It’s worth the switch, though – it’s quite awesome 🙂 We prepared a handy migration guide to help you with the process.
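To give you a rough idea of what that looks like, here is a simplified sketch of how a typical snippet in the current API maps onto the new one (the project name and values are placeholders; the migration guide covers many more cases):

# Current (legacy) Python API
import neptune

neptune.init(project_qualified_name="your_workspace/your_project")
neptune.create_experiment(params={"lr": 0.005})
neptune.log_metric("loss", 0.36)
neptune.stop()

# New Python API (as a separate script)
import neptune.new as neptune

run = neptune.init(project="your_workspace/your_project")
run["parameters"] = {"lr": 0.005}
run["train/loss"].log(0.36)  # note the namespace in the field name
run.stop()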

I want to use neptune.new, what do I do now?

It’s super simple:

Step 1:

Create a new project – you will notice it has a badge indicating it was created with the new structure.

Step 2:

Update the Neptune client library to at least version 0.9. Simply run in your environment:

pip install --upgrade neptune-client

Step 3:

Check out the new documentation. If you want to play around and start fresh, the quickstart section is your best friend. If you want to update your existing code, we prepared a migration guide to help you.

Step 4:

Enjoy the new way to track metadata!

We will be migrating existing projects to the new structure over the following weeks, and once your project is migrated, you will be able to use the new Python API with it as well.

New to Neptune?

First of all, hello, welcome, and thanks for reading this far!

If you want to see what all the fuss is about, you can:

…or you can just:

1. Create a free account

2. Install the Neptune client library

pip install neptune-client

3. Add logging to your script

import neptune.new as neptune

run = neptune.init(project="your_workspace/your_project")

# Track metadata and hyperparameters of your run
run["JIRA"] = "NPT-952"
run["algorithm"] = "ConvNet"

params = {
    "batch_size": 64,
    "dropout": 0.2,
    "learning_rate": 0.001,
    "optimizer": "Adam"
}
run["parameters"] = params

# Track the training process by logging your training metrics
for epoch in range(100):
    run["train/accuracy"].log(epoch * 0.6)
    run["train/loss"].log(epoch * 0.4)

# Log the final results
run["f1_score"] = 0.67

4. See it in Neptune

What’s next for Neptune?

So is that it?

Dict-like API, folder structure, offline mode, better compare, and the Neptune team is off to the beach to drink piña coladas (via Zoom, of course)? 

Nope, we have more stuff coming soon, way sooner this time. 

As I mentioned before, more and more Neptune users are pushing their models to production (congrats, people!). We want not only to support it but to actually make things easier, just as we did with experiment tracking. 

In the following months:

  • we’ll improve the experience of using Neptune as a model registry, with support for artifact versioning (data and models);
  • we’ll add support for more metadata types so that you can easily log, display, and compare them in Neptune;
  • we’ll add integrations with more libraries from the MLOps ecosystem.

But generally, big picture, we’ll be working hard to make storing, displaying, organizing, and querying metadata in MLOps workflows easier. We’ll continue building a metadata store for MLOps.


READ NEXT

Switching From Spreadsheets to Neptune.ai and How It Pushed My Model Building Process to the Next Level

6 mins read | Nikita Kozodoi | Posted April 30, 2021

Many ML projects, including Kaggle competitions, have a similar workflow. You start with a simple pipeline with a benchmark model. 

Next, you begin incorporating improvements: adding features, augmenting the data, tuning the model… On each iteration, you evaluate your solution and keep changes that improve the target metric.

[Figure: the iterative improvement process in ML projects. Green lines indicate an improvement, red lines a decrease in the score.]

This workflow involves running a lot of experiments. As time goes by, it becomes difficult to keep track of the progress and positive changes. 

Instead of working on new ideas, you spend time thinking:

  • “have I already tried this thing?”,
  • “what was that hyperparameter value that worked so well last week?” 

You end up running the same stuff multiple times. If you are not tracking your experiments yet, I highly recommend you start!

In my previous Kaggle projects, I used to rely on spreadsheets for tracking. It worked very well in the beginning, but soon I realized that setting up and managing spreadsheets with experiment metadata requires loads of additional work. I got tired of manually filling in model parameters and performance values after each experiment and really wanted to switch to an automated solution.

This is when I discovered Neptune.ai. This tool allowed me to save a lot of time and focus on modeling decisions, which helped me to earn three medals in Kaggle competitions.

In this post, I will share my story of switching from spreadsheets to Neptune for experiment tracking. I will describe a few disadvantages of spreadsheets, explain how Neptune helps to address them, and give a couple of tips on using Neptune for Kaggle.

Continue reading ->