
Neptune vs MLflow

Scalable MLflow alternative with a hosted UI built for teamwork


Looking for an experiment tracking and/or model registry solution? Not sure how Neptune.ai and MLflow actually differ? Or maybe you’re already using MLflow and want to check out some alternatives?

You’re in the right place. On this page, we break down:

  • the differences between MLflow and Neptune
  • why people switch from MLflow to Neptune
  • when you should go with Neptune (and when you shouldn’t)

When should you go with Neptune?

You don’t want to maintain infrastructure

It’s not easy to run experiments, manage models and maintain the infrastructure at the same time. SaaS (or even an on-premise solution with out-of-the-box autoscaling) takes this infra off your shoulders and lets you focus 100% on building and deploying models.

You keep scaling your projects

For a small number of experiments, MLflow works really well and you don’t see any issues. The real headache comes when you have 100, 10k, or 100k runs and you want to easily visualize, filter, and compare them. That’s when Neptune may be a better option.

You collaborate with a team

When you work alone, MLflow can be a good starting point for tracking your runs. But when the one-person team grows, you start to think about access management and want to share results with others. This is when a UI built for teamwork becomes more useful.

Why do others choose Neptune over MLflow?

MLflow is open-source and free to use. You can customize it if you want to, but you have to set it up and maintain the infrastructure.

Neptune.ai is a hosted SaaS solution (an on-prem option is also available). It’s easy to plug it into your workflow and it comes with user support.

So, in this respect, it’s the good old “buy vs. use open source” dilemma. Many ML practitioners we talked to share the general impression that open-source tools are good enough for very small teams or individual contributors who only need basic tracking. After some time, you simply reach the point where it’s not enough.

Now, when you look at which parts of the ML lifecycle both tools cover, the scope is quite similar.

MLflow does experiment tracking and model registry, but also comes with two additional components: Projects (packaging format) and Models (general format for sending models to deployment tools).
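
To make this concrete, here is a minimal, illustrative sketch of logging to MLflow’s tracking component in Python; the parameter names, the metric value, and the commented-out scikit-learn call are placeholders rather than a complete example:

import mlflow

with mlflow.start_run():
    # parameters and metrics go into MLflow's fixed key-value structure
    mlflow.log_param("lr", 0.1)
    mlflow.log_param("dropout", 0.4)
    mlflow.log_metric("test_accuracy", 0.84)
    # a trained model can be logged with a framework "flavor", for example:
    # mlflow.sklearn.log_model(model, "model")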

Neptune.ai focuses on experiment tracking and model registry, but it doesn’t limit its functionality to that. Within those features, it also covers dataset versioning, model versioning, and live monitoring of model training. On top of that, thanks to the flexible API, you can package models however you like and attach any metadata in any structure to a model version.
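
As a rough illustration of that flexibility, here is a minimal sketch using the neptune.new client shown in the snippet further down this page; the project name, file paths, and metric values are placeholders:

import neptune.new as neptune

run = neptune.init_run(project="my-workspace/my-project")  # placeholder project name

# metadata is organized in a freely nested, dictionary-like structure
run["data/train/version"].track_files("data/train.csv")   # dataset location and hash
run["model/params"] = {"lr": 0.1, "dropout": 0.4}

for epoch in range(10):
    run["train/loss"].log(1.0 / (epoch + 1))               # placeholder series, streamed live

run["model/weights"].upload("model.pt")                    # attach any file to the run
run.stop()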

This flexibility is often brought up by Neptune users as one of the first advantages they notice. The tool easily hooks into multiple frameworks, so no matter what solution you use for training, deploying, or monitoring models, Neptune will fit in.

There’s one more thing that will probably matter when comparing Neptune and MLflow, especially for teams.

Neptune comes with out-of-the-box collaboration features.

  • You can share persistent links to the UI with other team members, or with external people if needed.
  • And you can create multiple projects and manage who can access them.

There are teams that use open-source MLflow, but collaboration is much more challenging there. Team management features are available in Managed MLflow, a paid solution developed by Databricks, so this kind of functionality probably won’t be added to the free, open-source version.

MLflow is a solid solution, and it checks many boxes when a team needs simple experiment tracking and a model registry. But it doesn’t scale well. Many ML teams quickly reach its limits and look for something more advanced.

Here’s what Neptune users say

“For now, I’m not using MLflow anymore ever since I switched to Neptune because I feel like Neptune is a superset of what MLflow has to offer.” [Read full case study]

Kha Nguyen
Senior Data Scientist at Zoined

“Previously used tensorboard and azureml but Neptune is hugely better. In particular, getting started is really easy; documentation is excellent, and the layout of charts and parameters is much clearer.”

Simon Mackenzie
AI Engineer and Data Scientist

“(…) thanks for the great tool, has been really useful for keeping track of the experiments for my Master’s thesis. Way better than the other tools I’ve tried (comet / wandb).

I guess the main reason I prefer neptune is the interface, it is the cleanest and most intuitive in my opinion, the table in the center view just makes a great deal of sense. I like that it’s possible to set up and save the different view configurations as well. Also, the comparison is not as clunky as for instance with wandb. Another plus is the integration with ignite, as that’s what I’m using as the high-level framework for model training.”

Klaus-Michael Lux
Data Science and AI student, Kranenburg, Germany

“This thing is so much better than Tensorboard, love you guys for creating it!”

Dániel Lévai
Junior Researcher at Rényi Alfréd Institute of Mathematics in Budapest, Hungary

“While logging experiments is great, what sets Neptune apart for us at the lab is the ease of sharing those logs. The ability to just send a Neptune link in slack and letting my coworkers see the results for themselves is awesome. Previously, we used Tensorboard + locally saved CSVs and would have to send screenshots and CSV files back and forth which would easily get lost. So I’d say Neptune’s ability to facilitate collaboration is the biggest plus.”

Greg Rolwes
Computer Science Undergraduate at Saint Louis University

“Such a fast setup! Love it :)”

Kobi Felton
PhD student in Music Information Processing at Télécom Paris

“For me the most important thing about Neptune is its flexibility. Even if I’m training with Keras or Tensorflow on my local laptop, and my colleagues are using fast.ai on a virtual machine, we can share our results in a common environment.”

Víctor Peinado
Senior NLP/ML Engineer

Give Neptune a try

1. Create a free account

2. Install the Neptune client library

pip install neptune-client

3. Add logging to your script

import neptune.new as neptune

run = neptune.init_run(project="Me/MyProject")
run["parameters"] = {"lr": 0.1, "dropout": 0.4}
run["test_accuracy"] = 0.84

Dig deeper into the differences between Neptune and MLflow

Commercial Requirements
Standalone component or a part of a broader ML platform? | Neptune: Standalone component. ML metadata store that focuses on experiment tracking and model registry | MLflow: Open-source platform that offers four separate components for experiment tracking, code packaging, model deployment, and model registry

Is the product available on-premises and/or in your private/public cloud? | MLflow: Tracking is hosted on a local/remote server (on-prem or cloud). It is also available on a managed server as part of the Databricks platform

Is the product delivered as commercial software, open-source software, or a managed cloud service? | Neptune: Managed cloud service | MLflow: The standalone product is open-source, while the Databricks-managed version is commercial

SLOs / SLAs: Does the vendor provide guarantees around service levels? | MLflow: Community support for the open-source version, various support plans for the Databricks-managed version

Support: Does the vendor provide 24×7 support? | No

SSO, ACL: Does the vendor provide user access management?

General Capabilities
What are the infrastructure requirements? | Neptune: No special requirements other than having the neptune-client installed and internet access if using managed hosting (see the documentation for on-prem infrastructure requirements) | MLflow: No requirements other than having mlflow installed if using a local tracking server (see the documentation for remote tracking server requirements)

How much do you have to change in your training process? | Neptune: Minimal. Just a few lines of code needed for tracking | MLflow: Minimal. Just a few lines of code needed for tracking

Does it integrate with the training process via CLI/YAML/client library? | Neptune: Yes, through the neptune-client library

Does it come with a web UI or is it console-based? | Serverless UI | No | Yes

Customizable metadata structure | Neptune: Yes | MLflow: No

How can you access model metadata?
– gRPC API | Neptune: No | MLflow: No
– CLI / custom API | Neptune: Yes | MLflow: No
– REST API | Neptune: No | MLflow: Yes
– Python SDK | Neptune: Yes | MLflow: Yes
– R SDK | Neptune: Yes | MLflow: Yes
– Java SDK | Neptune: No | MLflow: Yes
– Julia SDK | Neptune: No | MLflow: No

Supported operations
– Search | Neptune: Yes | MLflow: Yes
– Update | Neptune: Yes | MLflow: No
– Delete | Neptune: Yes | MLflow: Yes
– Download | Neptune: Yes | MLflow: Yes

Distributed training support | Yes
Pipelining support | Neptune: Yes | MLflow: Yes

Logging modes
– Offline | Neptune: Yes | MLflow: Yes
– Debug | Neptune: Yes | MLflow: No
– Asynchronous | Neptune: Yes | MLflow: Yes
– Synchronous | Yes

Live monitoring | Neptune: Yes | MLflow: Yes
Mobile support | Neptune: No | MLflow: No

Experiment Tracking
Dataset
– Location (path/s3) | Neptune: Yes | MLflow: Yes
– Hash (md5) | Neptune: Yes | MLflow: Yes
– Preview table | Neptune: Yes | MLflow: No
– Preview image | Neptune: Yes | MLflow: Yes
– Preview text | Neptune: Yes | MLflow: Yes
– Preview rich media | No
– Multifile support | Neptune: Yes | MLflow: Yes
– Dataset slicing support | Neptune: No | MLflow: No

Code versions
– Git | Neptune: Yes | MLflow: Only the commit ID
– Source | Neptune: Yes | MLflow: Yes
– Notebooks | Yes

Parameters | Neptune: Yes | MLflow: Yes

Metrics and losses
– Single values | Neptune: Yes | MLflow: Yes
– Series values | Neptune: Yes | MLflow: Yes
– Series aggregates (min/max/avg/var/last) | Neptune: Yes | MLflow: Yes

Tags | Neptune: Yes | MLflow: Yes
Descriptions/comments | Neptune: Yes | MLflow: Yes

Rich format
– Images (support for labels and descriptions) | Neptune: Yes | MLflow: No
– Plots | Neptune: Yes | MLflow: Yes
– Interactive visualizations (widgets and plugins) | Neptune: Yes | MLflow: No
– Video | Neptune: Yes | MLflow: No
– Audio | Neptune: Yes | MLflow: No
– Neural network histograms | Neptune: No | MLflow: No
– Prediction visualization (tabular) | Neptune: No | MLflow: No
– Prediction visualization (image) | Neptune: No | MLflow: No

Hardware consumption
– CPU | Neptune: Yes | MLflow: No
– GPU | Neptune: Yes | MLflow: No
– TPU | Neptune: No | MLflow: No
– Memory | Neptune: Yes | MLflow: No

System information
– Console logs (stderr, stdout) | Neptune: Yes | MLflow: No
– Error stack trace | Neptune: Yes | MLflow: No
– Execution command | Neptune: No | MLflow: Yes
– System details (host, user, hardware specs) | Neptune: Yes | MLflow: No

Environment config
– pip requirements.txt | Neptune: Yes | MLflow: Yes
– conda env.yml | Neptune: Yes | MLflow: Yes
– Docker Dockerfile | Neptune: Yes | MLflow: Yes

Files
– Model binaries | Neptune: Yes | MLflow: Yes
– CSV | Neptune: Yes | MLflow: Yes
– External file reference (S3 buckets) | Neptune: Yes | MLflow: Yes

Table format diff | Neptune: Yes | MLflow: No
Overlaid learning curves | Neptune: Yes | MLflow: Yes

Parameters and metrics
– Group-by on experiment values (parameters) | Neptune: Yes | MLflow: No
– Parallel coordinates plots | Yes
– Parameter importance plot | Neptune: No | MLflow: No
– Slice plot | Neptune: No | MLflow: No
– EDF plot | Neptune: No | MLflow: No

Rich format (side by side)
– Image | Neptune: Yes | MLflow: No
– Video | Neptune: No | MLflow: No
– Audio | Neptune: No | MLflow: No
– Plots | Neptune: No | MLflow: No
– Interactive visualization (HTML) | Neptune: No | MLflow: No
– Text | Neptune: Yes | MLflow: No
– Neural network histograms | Neptune: No | MLflow: No
– Prediction visualization (tabular) | Neptune: Yes | MLflow: Yes
– Prediction visualization (image, video, audio) | Neptune: No | MLflow: No

Code
– Git | Neptune: No | MLflow: No
– Source files | Neptune: No | MLflow: No
– Notebooks | Neptune: Yes | MLflow: No

Environment
– pip requirements.txt | Neptune: No | MLflow: No
– conda env.yml | Neptune: No | MLflow: No
– Docker Dockerfile | Neptune: No | MLflow: No

Hardware
– CPU | Neptune: Yes | MLflow: No
– GPU | Neptune: Yes | MLflow: No
– Memory | Neptune: Yes | MLflow: No

System information
– Console logs (stderr, stdout) | Neptune: Yes | MLflow: No
– Error stack trace | Neptune: Yes | MLflow: No
– Execution command | Neptune: No | MLflow: No
– System details (host, owner) | Neptune: Yes | MLflow: Yes

Data versions
– Location | Neptune: Yes | MLflow: No
– Hash | Neptune: Yes | MLflow: No
– Dataset diff | Neptune: Yes | MLflow: No
– External reference version diff (S3) | Neptune: No | MLflow: No

Files
– Models | Neptune: No | MLflow: No
– CSV | Neptune: No | MLflow: No

Custom compare dashboards
– Combining multiple metadata types (image, learning curve, hardware) | Neptune: Yes | MLflow: No
– Logging custom comparisons from notebooks/code | Neptune: Yes | MLflow: No
– Compare/diff of multiple (3+) experiments/runs | Neptune: Yes | MLflow: Yes

Experiment table customization
– Adding/removing columns | Neptune: Yes | MLflow: Yes
– Renaming columns in the UI | Neptune: Yes | MLflow: No
– Adding colors to columns | Neptune: Yes | MLflow: No
– Displaying aggregates (min/max/avg/var/last) for series like training metrics in a table | Neptune: Yes | MLflow: No
– Automagical column suggestion | Neptune: Yes | MLflow: No

Experiment filtering and searching
– Searching on multiple criteria | Neptune: Yes | MLflow: Yes
– Query language vs fixed selectors | Neptune: Query language | MLflow: Query language
– Saving filters and search history | Neptune: Yes | MLflow: No

Custom dashboards for a single experiment
– Can combine different metadata types in one view | Neptune: Yes | MLflow: No
– Saving experiment table views | Neptune: Yes | MLflow: No
– Logging project-level metadata | Neptune: Yes | MLflow: No
– Custom widgets and plugins | Neptune: No | MLflow: No

Tagging and searching on tags | Neptune: Yes | MLflow: Yes
Nested metadata structure support in the UI | Neptune: Yes | MLflow: No
One-command experiment re-run | Neptune: No | MLflow: Yes

Experiment lineage
– List of datasets used downstream | Neptune: No | MLflow: No
– List of other artifacts (models) used downstream | Neptune: No | MLflow: No
– Downstream artifact dependency graph | Neptune: No | MLflow: No

Reproducibility protocol | Neptune: Limited | MLflow: Yes
Is the environment versioned and reproducible? | Neptune: Yes | MLflow: Yes
Saving/fetching/caching datasets for experiments | Neptune: No | MLflow: No
Sharing UI links with project members | Neptune: Yes | MLflow: No
Sharing UI links with external people | Neptune: Yes | MLflow: No
Commenting | Neptune: Yes | MLflow: Yes
Interactive project-level reports | Neptune: No | MLflow: No

Model Registry
Code versions (used for training) | Neptune: Yes | MLflow: No
Environment versions | Neptune: No | MLflow: Yes
Parameters | Neptune: Yes | MLflow: Yes
Dataset versions | Neptune: Yes | MLflow: No
Results (metrics, visualizations) | Neptune: Yes | MLflow: Yes
Explanations (SHAP, DALEX) | Yes
Model files (packaged models, model weights, pointers to artifact storage) | Neptune: Yes | MLflow: Yes
Models/experiments created downstream | Neptune: No | MLflow: No
History of evaluation/testing runs | Neptune: No | MLflow: No
Support for continuous testing | Neptune: No | MLflow: No
Users who created a model or downstream experiments | Neptune: No | MLflow: No
Main stage transition tags (develop, stage, production) | Neptune: Yes | MLflow: Yes
Custom stage tags | Neptune: No | MLflow: No
Locking model version and downstream runs, experiments, and artifacts | Neptune: No | MLflow: No
Model compare (current vs challenger, etc.) | No
Compatibility audit (input/output schema) | Neptune: No | MLflow: Yes
Compliance audit (datasets used, creation process approvals, results/explanations approvals) | Neptune: No | MLflow: No
Model accessibility | Neptune: No | MLflow: Yes
Support for continuous testing | Neptune: No | MLflow: No
Integrations with CI/CD tools | Neptune: No | MLflow: No
Registered models | Neptune: No | MLflow: Yes
Active models | Neptune: No | MLflow: No
By metadata/artifacts used to create it | Neptune: No | MLflow: No
By date | Neptune: No | MLflow: No
By user/owner | Neptune: No | MLflow: No
By production stage | Neptune: No | MLflow: No
Search query language | Neptune: No | MLflow: No
Native packaging system | Neptune: No | MLflow: Yes
Compatibility with packaging protocols (ONNX, etc.) | Neptune: No | MLflow: Yes
One model one file or flexible structure | No
Integrations with packaging frameworks | Neptune: No | MLflow: Yes

Integrations and Support
Java | Neptune: No | MLflow: Yes
Julia | Neptune: No | MLflow: No
Python | Neptune: Yes | MLflow: Yes
REST API | Neptune: No | MLflow: Yes
Catalyst | Neptune: Yes | MLflow: Yes
CatBoost | Neptune: No | MLflow: Yes
fastai | Neptune: Yes | MLflow: Yes
FBProphet | Neptune: Yes | MLflow: Yes
Gluon | Neptune: No | MLflow: Yes
HuggingFace | Neptune: Yes | MLflow: Yes
H2O | Neptune: No | MLflow: Yes
LightGBM | Neptune: Yes | MLflow: Yes
Paddle | Neptune: No | MLflow: Yes
PyTorch | Neptune: Yes | MLflow: Yes
PyTorch Ignite | Yes
PyTorch Lightning | Neptune: Yes | MLflow: Yes
Scikit-learn | Neptune: Yes | MLflow: Yes
Skorch | Yes
spaCy | Neptune: No | MLflow: Yes
Spark MLlib | Neptune: No | MLflow: Yes
Statsmodels | Neptune: No | MLflow: Yes
TensorFlow / Keras | Neptune: Yes | MLflow: Yes
XGBoost | Neptune: Yes | MLflow: Yes
Hyperopt | Neptune: No | MLflow: No
Keras Tuner | No
Optuna | Neptune: Yes | MLflow: Yes
Ray Tune | Neptune: No | MLflow: Yes
Scikit-Optimize | No
DALEX | No
Netron | Neptune: No | MLflow: No
SHAP | Neptune: No | MLflow: Yes
TensorBoard
JupyterLab and Jupyter Notebook | Neptune: Yes | MLflow: Limited
Google Colab | Neptune: Yes | MLflow: Limited
Deepnote | Neptune: Yes | MLflow: Limited
AWS SageMaker | Neptune: Yes | MLflow: Yes
Airflow | Neptune: No | MLflow: No
Argo | Neptune: No | MLflow: No
Kedro | Neptune: Yes | MLflow: Yes
Kubeflow | Neptune: No | MLflow: No
MLflow | NA
Sacred | Neptune: Yes | MLflow: No
TensorBoard | No
GitHub Actions | Yes
GitLab CI | Neptune: No | MLflow: No
CircleCI | Neptune: No | MLflow: Yes
Travis | Neptune: No | MLflow: Yes
Jenkins | Neptune: No | MLflow: No
Seldon | Neptune: No | MLflow: No
Cortex | Neptune: No | MLflow: No
Databricks | Neptune: No | MLflow: Yes
Seldon | Neptune: No | MLflow: No
Fiddler.ai | Neptune: No | MLflow: No
Arthur.ai | Neptune: No | MLflow: No

This table was updated on 22 July 2022. Some information may be outdated.
Report outdated information here.

It only takes 5 minutes to integrate Neptune with your code

Sign up now

Have more questions? Let’s talk

Chaz Demera

Account Executive

Schedule a call