Compare Weights & Biases vs MLflow vs Neptune

The only affordable tool for ML metadata management at scale

Weights & Biases
vs
MLflow
vs
Neptune

When the number of runs you log at once grows, MLflow can break down.
When the number of people on your team grows, WandB’s pricing can break your budget.

Avoid both with Neptune.

Feature-by-feature comparison

Take a deep dive into the differences between WandB, MLflow and Neptune

Commercial Requirements

Commercial requirements
Standalone component or a part of a broader ML platform?

Standalone component

Open-source platform which offers four separate components for experiment tracking, code packaging, model deployment, and model registry

Standalone component

Is the product available on-premises and / or in your private/public cloud?

Yes

Tracking is hosted on a local/remote server (on-prem or cloud). It is also available as a managed service as part of the Databricks platform

Is the product delivered as commercial software, open-source software, or a managed cloud service?

Managed cloud service

Open-source

Managed cloud service

Support: Does the vendor provide 24×7 support?

No

SSO, ACL: Does the vendor provide user access management?

No

Security policy and compliance

Yes

No

Yes

General Capabilities

Setup
What are the infrastructure requirements?

No special requirements other than having the wandb Python library installed and access to the internet if using managed hosting. Check here for the infrastructure requirements for on-prem deployment.

No requirements other than having mlflow installed if using a local tracking server. Check here for the infrastructure requirements for using a remote tracking server.

No special requirements other than having the neptune-client installed and access to the internet if using managed hosting. Check here for infrastructure requirements for on-prem deployment.

How much do you have to change in your training process?

Minimal. Just a few lines of code needed for tracking. Read more

Minimal. Just a few lines of code needed for tracking. Read more

Minimal. Just a few lines of code needed for tracking. Read more
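
For illustration, here is a rough sketch of what those few lines look like in each tool. Project names, parameters, and metric values below are placeholders, and the exact APIs can vary between client versions:

    # Weights & Biases
    import wandb
    run = wandb.init(project="my-project")        # start a tracked run
    wandb.log({"train/loss": 0.25})               # log a metric
    run.finish()

    # MLflow
    import mlflow
    with mlflow.start_run():                      # start a tracked run
        mlflow.log_param("lr", 0.01)              # log a parameter
        mlflow.log_metric("train_loss", 0.25)     # log a metric

    # Neptune
    import neptune
    run = neptune.init_run(project="my-workspace/my-project")  # start a tracked run
    run["parameters/lr"] = 0.01                   # log a parameter
    run["train/loss"].append(0.25)                # log a metric series value
    run.stop()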

Does it integrate with the training process via CLI/YAML/Client library?

Yes, through Python, Java, and CLI

Yes, through the neptune-client library

Does it come with a web UI or is it console-based?
– Serverless UI

No

Yes

No

Flexibility, speed, and accessibility
Customizable metadata structure

Yes

No

Yes

How can you access model metadata?
– gRPC API

No

No

No

– CLI / custom API

Yes

Yes

Yes

– REST API

No

Yes

No

– Python SDK

Yes

Yes

Yes

– R SDK

No

Yes

Yes

– Java SDK

Yes

Yes

No

– Julia SDK

No

Supported operations
– Search

Yes

Yes

Yes

– Update

Yes

Yes

– Delete

Yes

Yes

Yes

– Download

Yes

Yes

Yes
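
As a rough sketch of programmatic access, each client library lets you search for and fetch logged runs. Project and entity names below are placeholders, and argument names may differ between versions:

    import mlflow
    import wandb
    import neptune

    # MLflow: query runs into a pandas DataFrame
    runs_df = mlflow.search_runs(filter_string="metrics.train_loss < 0.3")

    # Weights & Biases: query finished runs of a project via the public API
    api = wandb.Api()
    runs = api.runs("my-entity/my-project", filters={"state": "finished"})

    # Neptune: fetch the runs table of a project into a pandas DataFrame
    project = neptune.init_project(project="my-workspace/my-project", mode="read-only")
    runs_table = project.fetch_runs_table().to_pandas()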

Distributed training support

Yes

Yes

Pipelining support

Yes

Yes

Yes

Logging modes
– Offline

Yes

Yes

Yes

– Debug

No

No

Yes

– Asynchronous

Yes

Yes

Yes

– Synchronous

No

Yes

Yes
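
A minimal sketch of how the logging mode is typically selected; the mode names below reflect the documented options at the time of writing and may change between client versions:

    import wandb
    import mlflow
    import neptune

    # Weights & Biases: log to disk only, sync later with `wandb sync`
    run = wandb.init(project="my-project", mode="offline")

    # MLflow: point the tracking client at a local file store instead of a remote server
    mlflow.set_tracking_uri("file:./mlruns")

    # Neptune: choose between "async" (default), "sync", "offline", and "debug"
    run = neptune.init_run(project="my-workspace/my-project", mode="offline")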

Live monitoring

Yes

Yes

Yes

Mobile support

No

No

No

Webhooks and notifications

Yes

No

No

Experiment Tracking

Log and display of metadata
Dataset
– Location (path/S3)

Yes

Yes

Yes

– Hash (MD5)

Yes

Yes

Yes

– Preview table

Yes

No

Yes

– Preview image

Yes

Yes

– Preview text

Yes

Yes

Yes

– Preview rich media

Yes

No

Yes

– Multifile support

Yes

Yes

Yes
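
For dataset location and hash, a simple approach that works with any of the three tools is to compute the hash yourself and log it as run metadata. A sketch with placeholder paths and project names:

    import hashlib
    import wandb
    import mlflow
    import neptune

    dataset_path = "s3://my-bucket/train.csv"   # placeholder reference logged as metadata
    with open("data/train.csv", "rb") as f:     # local copy used to compute the hash
        dataset_md5 = hashlib.md5(f.read()).hexdigest()

    # Weights & Biases: keep it in the run config
    wandb.init(project="my-project",
               config={"dataset_path": dataset_path, "dataset_md5": dataset_md5})

    # MLflow: log it as run parameters
    with mlflow.start_run():
        mlflow.log_param("dataset_path", dataset_path)
        mlflow.log_param("dataset_md5", dataset_md5)

    # Neptune: assign it to fields of the run's metadata structure
    run = neptune.init_run(project="my-workspace/my-project")
    run["dataset/path"] = dataset_path
    run["dataset/md5"] = dataset_md5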

Code versions
– Git

Yes

Yes

– Source

Yes

Yes

Yes

– Notebooks

Yes

Yes

Parameters

Yes

Yes

Yes

Metrics and losses
– Single values

Yes

Yes

Yes

– Series values

Yes

Yes

Yes

– Series aggregates (min/max/avg/var/last)

Yes

Yes

Yes

Tags

Yes

Yes

Yes

Descriptions/comments

Yes

Yes

Yes

Rich format
– Images (support for labels and descriptions)

No

Yes

– Plots

Yes

Yes

Yes

– Interactive visualizations (widgets and plugins)

Yes

No

Yes

– Video

Yes

No

Yes

– Audio

Yes

No

Yes

– Neural Network Histograms

Yes

No

No

– Prediction visualization (tabular)

Yes

No

No

– Prediction visualization (image)

No

No

– Prediction visualization (image – interactive confusion matrix for image classification)

Yes

No

No

– Prediction visualization (image – overlayed prediction masks for image segmentation)

NA

No

No

– Prediction visualization (image – overlayed prediction bounding boxes for object detection)

NA

No

No

Hardware consumption
– CPU

Yes

No

Yes

– GPU

Yes

No

Yes

– TPU

Yes

No

No

– Memory

Yes

No

Yes

System information
– Console logs (Stderr, Stdout)

Yes

No

Yes

– Error stack trace

No

No

Yes

– Execution command

Yes

Yes

No

– System details (host, user, hardware specs)

Yes

No

Yes

Environment config
– pip requirements.txt

Yes

Yes

Yes

– conda env.yml

No

Yes

Yes

– Docker Dockerfile

Yes

Yes

Yes

Files
– Model binaries

Yes

Yes

Yes

– CSV

Yes

Yes

Yes

– External file reference (s3 buckets)

Yes

Yes

Yes
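
A sketch of how file-type metadata (requirements files, model binaries, CSVs) is typically attached to a run. File names, paths, and project names below are placeholders:

    import wandb
    import mlflow
    import neptune

    # Weights & Biases: save a file with the current run
    wandb.init(project="my-project")
    wandb.save("requirements.txt")

    # MLflow: log files as run artifacts
    with mlflow.start_run():
        mlflow.log_artifact("requirements.txt")
        mlflow.log_artifact("model.pkl")

    # Neptune: upload files into the run's metadata structure
    run = neptune.init_run(project="my-workspace/my-project")
    run["environment/requirements"].upload("requirements.txt")
    run["model/binary"].upload("model.pkl")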

Comparing experiments
Table format diff

Yes

No

Yes

Overlayed learning curves

Yes

Yes

Yes

Parameters and metrics
– Groupby on experiment values (parameters)

Yes

No

Yes

– Parallel coordinates plots

Yes

Yes

Yes

– Parameter Importance plot

Yes

No

No

Rich format (side by side)
– Image

Yes

No

Yes

– Video

Yes

No

No

– Audio

Yes

No

No

– Plots

No

No

No

– Interactive visualization (HTML)

Yes

No

No

– Text

Yes

No

Yes

– Neural Network Histograms

No

No

No

– Prediction visualization (tabular)

Yes

Yes

Yes

– Prediction visualization (image, video, audio)

Yes

No

No

Code
– Git

No

No

No

– Source files

Yes

No

No

– Notebooks

Yes

No

Yes

Environment
– pip requirements.txt

No

No

No

– conda env.yml

No

No

No

– Docker Dockerfile

No

No

No

Hardware
– CPU

Yes

No

Yes

– GPU

Yes

No

Yes

– Memory

Yes

No

Yes

System information
– Console logs (Stderr, Stdout)

No

No

Yes

– Error stack trace

No

No

Yes

– Execution command

No

No

No

– System details (host, owner)

No

Yes

Yes

Data versions
– Location

No

No

Yes

– Hash

Yes

No

Yes

– Dataset diff

No

No

Yes

– External reference version diff (s3)

No

No

Yes

Files
– Models

No

No

No

– CSV

No

No

No

Custom compare dashboards
– Combining multiple metadata types (image, learning curve, hardware)

Yes

No

Yes

– Logging custom comparisons from notebooks/code

Yes

No

No

– Compare/diff of multiple (3+) experiments/runs

Yes

Yes

Yes

Organizing and searching experiments and metadata
Experiment table customization
– Adding/removing columns

Yes

Yes

Yes

– Renaming columns in the UI

No

No

Yes

– Adding colors to columns

No

No

Yes

– Displaying aggregates (min/max/avg/var/last) for series like training metrics in a table

Yes

No

Yes

– Automagical column suggestion

Yes

No

Yes

Experiment filtering and searching
– Searching on multiple criteria

No

Yes

Yes

– Query language vs fixed selectors

Regex on names at the project level, fixed selectors at the run level

Query language

– Saving filters and search history

No

No

Yes

Custom dashboards for a single experiment
– Can combine different metadata types in one view

Yes

No

Yes

– Saving experiment table views

Yes

No

Yes

– Logging project-level metadata

Yes

No

Yes

– Custom widgets and plugins

Yes

No

No

Tagging and searching on tags

Yes

Yes

Yes

Nested metadata structure support in the UI

Yes

No

Yes

Reproducibility and traceability
One-command experiment re-run

Yes

Yes

No

Experiment lineage
– List of datasets used downstream

Yes

No

– List of other artifacts (models) used downstream

Yes

No

– Downstream artifact dependency graph

Yes

Yes

No

Reproducibility protocol

Yes

Yes

Is the environment versioned and reproducible?

Yes

Yes

Yes

Saving/fetching/caching datasets for experiments

No

No

Collaboration and knowledge sharing
Sharing UI links with project members

Yes

Yes

Sharing UI links with external people

Yes

Yes

Commenting

Yes

Yes

Interactive project-level reports

Yes

No

No

Model Registry

Model versioning
Code versions (used for training)

Yes

Yes

Environment versions

No

Yes

Parameters

Yes

Yes

Yes

Dataset versions

Yes

No

Yes

Results (metrics, visualizations)

Yes

Yes

Yes

Explanations (SHAP, DALEX)

No

Yes

Model files (packaged models, model weights, pointers to artifact storage)

Yes

Yes

Yes

Model lineage and evaluation history
Models/experiments created downstream

Yes

No

History of evaluation/testing runs

No

No

No

Support for continuous testing

No

No

No

Users who created a model or downstream experiments

No

No

No

Access control, model review, and promoting models
Main stage transition tags (develop, stage, production)

Yes

Yes

Yes
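
As an illustration of stage transitions, MLflow exposes them through its registry client; the model name and version below are placeholders. Neptune and WandB handle stages through stage tags or aliases on model versions in their own clients and UI:

    from mlflow.tracking import MlflowClient

    client = MlflowClient()
    # Promote version 3 of a registered model from "Staging" to "Production"
    client.transition_model_version_stage(
        name="churn-model",
        version="3",
        stage="Production",
    )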

Custom stage tags

Yes

No

No

Locking model version and downstream runs, experiments, and artifacts

No

No

No

Adding annotations/comments and approvals from the UI

No

Model compare (current vs challenger etc)

No

Yes

Compatibility audit (input/output schema)

No

Yes

No

Compliance audit (datasets used, creation process approvals, results/explanations approvals)

No

No

CI/CD/CT compatibility
Webhooks

No

No

No

Model accessibility

No

Yes

No

Support for continuous testing

No

No

No

Integrations with CI/CD tools

No

No

Model searching
Registered models

Yes

Yes

Yes

Active models

No

Yes

Yes

By metadata/artifacts used to create it

No

No

Yes

By date

No

No

Yes

By user/owner

No

No

Yes

By production stage

No

No

Yes

Search query language

No

No

Yes

Model packaging
Native packaging system

No

Yes

No

Compatibility with packaging protocols (ONNX, etc)

No

Yes

No

One model one file or flexible structure

No

No

Integrations with packaging frameworks

No

Yes

No

Integrations and Support

Languages
Java

No

Yes

No

Julia

Yes

No

Python

Yes

Yes

Yes

R

No

Yes

REST API

No

Yes

No

Model training
Catalyst

Yes

Yes

Yes

CatBoost

No

Yes

No

fastai

Yes

Yes

Yes

FBProphet

No

Yes

Yes

Gluon

No

Yes

No

HuggingFace

Yes

Yes

Yes

H2O

No

Yes

No

LightGBM

Yes

Yes

Yes

Paddle

Yes

Yes

No

PyTorch

Yes

Yes

Yes

PyTorch Ignite

Yes

Yes

Yes

PyTorch Lightning

Yes

Yes

Yes

scikit-learn

Yes

Yes

Yes

Skorch

Yes

Yes

Yes

spaCy

Yes

Yes

No

Spark MLlib

No

Yes

No

statsmodels

No

Yes

No

TensorFlow / Keras

Yes

Yes

Yes
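
For TensorFlow / Keras, the integrations typically take the form of a callback or an autologging hook. A sketch with toy data, assuming the relevant integration packages are installed; import paths may differ between versions:

    import numpy as np
    from tensorflow import keras
    import wandb
    import mlflow
    import neptune

    # Weights & Biases: Keras callback from the wandb package
    from wandb.keras import WandbCallback
    wandb.init(project="my-project")
    wandb_cb = WandbCallback()

    # MLflow: enable autologging (covers TensorFlow/Keras among other frameworks)
    mlflow.autolog()

    # Neptune: Keras callback from the neptune-tensorflow-keras integration
    from neptune.integrations.tensorflow_keras import NeptuneCallback
    neptune_run = neptune.init_run(project="my-workspace/my-project")
    neptune_cb = NeptuneCallback(run=neptune_run)

    # Toy data and model, just to show where the callbacks plug in
    x_train = np.random.rand(32, 4).astype("float32")
    y_train = np.random.rand(32, 1).astype("float32")
    model = keras.Sequential([keras.layers.Dense(1)])
    model.compile(optimizer="adam", loss="mse")
    model.fit(x_train, y_train, epochs=1, callbacks=[wandb_cb, neptune_cb])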

XGBoost

Yes

Yes

Yes

Hyperparameter Optimization
Hyperopt

No

No

No

Keras Tuner

Yes

No

Optuna

Yes

Yes

Yes
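
For Optuna, the integrations are also callback-based. A sketch with a toy objective; the callback class names below reflect the integration packages at the time of writing, and depending on the Optuna version they may live in the separate optuna-integration package:

    import optuna
    import neptune
    import neptune.integrations.optuna as neptune_optuna

    def objective(trial):
        x = trial.suggest_float("x", -10, 10)   # toy search space
        return (x - 2) ** 2

    # Weights & Biases: callback shipped with Optuna
    from optuna.integration.wandb import WeightsAndBiasesCallback
    wandb_cb = WeightsAndBiasesCallback(wandb_kwargs={"project": "my-project"})

    # MLflow: callback shipped with Optuna
    from optuna.integration.mlflow import MLflowCallback
    mlflow_cb = MLflowCallback(metric_name="objective")

    # Neptune: callback from the neptune-optuna integration
    run = neptune.init_run(project="my-workspace/my-project")
    neptune_cb = neptune_optuna.NeptuneCallback(run)

    study = optuna.create_study()
    study.optimize(objective, n_trials=20, callbacks=[wandb_cb, mlflow_cb, neptune_cb])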

Ray Tune

Yes

Yes

No

Scikit-Optimize

No

No

Model visualization and debugging
DALEX

No

No

Yes

Netron

No

No

No

SHAP

No

Yes

No

TensorBoard

Yes

Yes

No

IDEs and Notebooks
JupyterLab and Jupyter Notebook

Yes

Yes

Google Colab

Yes

Yes

Deepnote

Yes

Yes

AWS SageMaker

Yes

No

Yes

Data versioning
DVC

No

No