Log, organize, compare, register, and share
all your ML model metadata in a single place
- Automate and standardize as your modeling team grows
- Collaborate on models and results with your team and across the org
- Use the hosted version, or deploy on-premises or in a private cloud. Integrate with any MLOps stack

import neptune

run = neptune.init_run()
run["parameters"] = {
    "batch_size": 64,
    "optimizer": {"type": "SGD", "learning_rate": 0.001},
}
run["dataset/train_version"].track_files("s3://train")

def any_module_function_or_hook(run):
    run["train/accuracy"].append(acc)
    run["valid/misclassified_images"].append(img)
    run["your/metadata/structure"].append(any_metadata)

model_version = neptune.init_model_version()
model_version["model/binary"].upload("model.pt")
model_version.change_stage("production")

# Access the model later
# model_version["model/binary"].download("models/")

Log model metadata from anywhere in your pipeline. See results in the web app. All in 5 minutes
Add a snippet to any step of your ML pipeline once. Decide what and how you want to log. Run a million times
- Any framework
- Any metadata type
- From anywhere in your ML pipeline
import neptune

# Connect to Neptune and create a run
run = neptune.init_run()

# Log hyperparameters
run["parameters"] = {
    "batch_size": 64,
    "dropout": 0.5,
    "optimizer": {"type": "SGD", "learning_rate": 0.001},
}

# Log dataset versions
run["data/train_version"].track_files("train/images")

# Log the training process
for epoch in range(100):
    run["train/accuracy"].append(accuracy)

# Log test metrics and charts
run["test/f1_score"] = test_score
run["test/confusion_matrix"].upload(fig)

# Log model weights and versions
run["model/weights"].upload("my_model.pkl")

# Stop logging to your run
run.stop()
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import NeptuneLogger

neptune_logger = NeptuneLogger()
trainer = Trainer(max_epochs=10, logger=neptune_logger)
trainer.fit(my_model, my_dataloader)
import neptune
from neptune.integrations.tensorflow_keras import NeptuneCallback

run = neptune.init_run()
neptune_cbk = NeptuneCallback(run=run)

# model, x_train, and y_train are defined elsewhere in the script
model.fit(
    x_train,
    y_train,
    epochs=5,
    batch_size=64,
    callbacks=[neptune_cbk],
)
import neptune
import torch

run = neptune.init_run()

data_dir = "data/CIFAR10"
params = {
    "lr": 1e-2,
    "bs": 128,
    "input_sz": 32 * 32 * 3,
    "n_classes": 10,
    "model_filename": "basemodel",
}
run["config/data_dir"] = data_dir
run["config/params"] = params

# model, criterion, optimizer, and trainloader are defined elsewhere in the script
for i, (x, y) in enumerate(trainloader, 0):
    optimizer.zero_grad()
    outputs = model.forward(x)
    _, preds = torch.max(outputs, 1)
    loss = criterion(outputs, y)
    acc = (torch.sum(preds == y.data)) / len(x)
    run["logs/training/batch/loss"].append(loss)
    run["logs/training/batch/acc"].append(acc)
    loss.backward()
    optimizer.step()
import neptune
import neptune.integrations.sklearn as npt_utils
from sklearn.ensemble import GradientBoostingClassifier

run = neptune.init_run()

parameters = {
    "n_estimators": 120,
    "learning_rate": 0.12,
    "min_samples_split": 3,
    "min_samples_leaf": 2,
}

# X_train, X_test, y_train, and y_test come from your data split
gbc = GradientBoostingClassifier(**parameters)
gbc.fit(X_train, y_train)

run["cls_summary"] = npt_utils.create_classifier_summary(
    gbc, X_train, X_test, y_train, y_test
)
import lightgbm as lgb
import neptune
from neptune.integrations.lightgbm import NeptuneCallback, create_booster_summary

run = neptune.init_run()
neptune_callback = NeptuneCallback(run=run)

params = {
    "boosting_type": "gbdt",
    "objective": "multiclass",
    "num_class": 10,
    "metric": ["multi_logloss", "multi_error"],
    "num_leaves": 21,
    "learning_rate": 0.05,
    "max_depth": 12,
}

# Train the model (lgb_train and lgb_eval are LightGBM datasets defined elsewhere)
gbm = lgb.train(
    params,
    lgb_train,
    num_boost_round=200,
    valid_sets=[lgb_train, lgb_eval],
    valid_names=["training", "validation"],
    callbacks=[neptune_callback],
)

run["lgbm_summary"] = create_booster_summary(
    booster=gbm,
    log_trees=True,
    list_trees=[0, 1, 2, 3, 4],
    log_confusion_matrix=True,
    y_pred=y_pred,
    y_true=y_test,
)
import neptune
import xgboost as xgb
from neptune.integrations.xgboost import NeptuneCallback

run = neptune.init_run()
neptune_callback = NeptuneCallback(run=run, log_tree=[0, 1, 2, 3])

params = {
    "eta": 0.7,
    "gamma": 0.001,
    "max_depth": 9,
    "objective": "reg:squarederror",
    "eval_metric": ["mae", "rmse"],
}

# dtrain, evals, and num_round are defined elsewhere in the script
xgb.train(
    params=params,
    dtrain=dtrain,
    num_boost_round=num_round,
    evals=evals,
    callbacks=[neptune_callback],
)
import neptune
import neptune.integrations.optuna as optuna_utils
import optuna

run = neptune.init_run()
neptune_callback = optuna_utils.NeptuneCallback(run)
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20, callbacks=[neptune_callback])
kedro neptune init
def report_accuracy(
    predictions: np.ndarray, test_y: pd.DataFrame, neptune_run: neptune.run.Handler
) -> None:
    # ...
    neptune_run["nodes/report/accuracy"] = accuracy
    fig, ax = plt.subplots()
    plot_confusion_matrix(target, predictions, ax=ax)
    neptune_run["nodes/report/confusion_matrix"].upload(fig)

def create_pipeline(**kwargs):
    return Pipeline(
        [
            # ...
            node(
                report_accuracy,
                ["example_predictions", "example_test_y", "neptune_run"],
                None,
                name="report",
            ),
        ]
    )
run["score"] = 0.97
for epoch in range(100):
run["train/accuracy"].append(acc)

run["model/parameters"] = {
"lr":0.2,
"optimizer": {"name": "Adam", "momentum": 0.9},
}

run["train/images"].track_files("./datasets/images")

run["matplotlib-fig"].upload(fig)
from neptune.types import File

for name in misclassified_images_names:
    run["misclassified_images"].append(File(name))

run["visuals/altair-fig"].upload(File.as_html(fig))

run["video"].upload("/path/to/video-file.mp4")

run = neptune.init_run(capture_hardware_metrics=True)

run = neptune.init_run(source_files=["**/*.py", "config.yaml"])

Log from many pipeline nodes to the same run
export NEPTUNE_CUSTOM_RUN_ID="SOME ID"
Log from multiple machines to the same run
export NEPTUNE_CUSTOM_RUN_ID="SOME ID"
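You can also pass the same ID directly in code. Below is a minimal sketch, assuming the custom_run_id argument of init_run; the field name is made up for illustration:
import neptune

# Every pipeline step (or machine) that passes the same custom_run_id
# logs its metadata to the same Neptune run.
run = neptune.init_run(custom_run_id="SOME ID")
run["preprocessing/rows"] = 10_000  # hypothetical field, for illustration only
run.stop()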
# Open finished run "SUN-123"
run = neptune.init_run(with_id="SUN-123")
# Download model
run["train/model_weights"].download()
# Continue logging
run["test/accuracy"].append(0.68)
Script
run = neptune.init_run(mode="offline")
Console
neptune sync
Organize and display experiment and model metadata however you want
Organize logs in a fully customizable nested structure. Display model metadata in user-defined dashboard templates
- Nested metadata structure
- Custom dashboards
- Table views
run["accuracy"] = 0.62
run["ROC_curve"].upload(fig)
run["model/parameters"] = {
"lr": 0.2,
"optimizer": {"name": "Adam", "momentum": 0.9},
}



Search, debug, and compare experiments, datasets, and models
Visualize training live in the Neptune web app. See how different parameters and configs affect the results. Optimize models quicker
- Compare
- Search, sort, and filter
- Visualize and display
- Monitor live
- Group by
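Beyond the web app, you can also pull the runs table into a DataFrame for programmatic comparison. A minimal sketch, assuming a placeholder project name and the train/accuracy field used in the snippets above:
import neptune

# Read-only connection to the project (replace the placeholder project name)
project = neptune.init_project(project="my-workspace/my-project", mode="read-only")

# Fetch selected columns of the runs table and sort by accuracy
runs_df = project.fetch_runs_table(columns=["sys/id", "train/accuracy"]).to_pandas()
print(runs_df.sort_values("train/accuracy", ascending=False).head())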






Supports rendering Altair, Plotly, Bokeh, video, audio, or any fully contained HTML.







Save your production-ready models to a centralized registry
Version, review, and access production-ready models and their associated metadata in a single place
- Version models
- Review and change stages
- Access and share models
Register a production-ready model.
You can attach any metadata or artifacts to it and organize them in any structure you want.
model = neptune.init_model(
    name="face_detection", key="DET",
)
model["validation/dataset"].track_files("s3://datasets/validation")

For any registered model, create as many model versions as you want.
Again, you can attach whatever metadata you want to it.
model_version = neptune.init_model_version(
    model="FACE-DET",
)
model_version["model/binary"].upload("model.pt")
model_version["validation/acc"] = 0.97

Save the hash, location, and other model artifact metadata. You don't have to upload the model to Neptune; just keep track of a reference to the model in local or S3-compatible storage.
model_version["model/binary"].track_files("model.pt")

model_version.change_stage("staging")

model_version = neptune.init_model_version(with_id="FACE-DET-42")
model_version["model/signature"].download()
Share and collaborate on experiment results and models across the org
Have a single place where your team can see the results and access all models and experiments
- Send a link
- Query API
- Manage users and projects
- Add your entire org

run = neptune.init_run(with_id="DET-135")
batch_size = run["parameters/batch_size"].fetch()
losses = run["train/loss"].fetch_values()
md5 = run["dataset/train"].fetch_hash()
run["trained_model"].download("models/")

Integrates with any MLOps tool stack



Code examples, videos, projects gallery, and other resources
Yes, you can deploy Neptune on-premises, and other answers
- Can I deploy Neptune on-premises or in a private cloud?
Read more about our deployment options here.
But in short, yes, you can deploy Neptune on your on-prem infrastructure or in your private cloud.
It is a set of microservices distributed as a Helm chart that you deploy on Kubernetes.
If you don’t have your own Kubernetes cluster deployed, our installer will set up a single-node cluster for you.
As for infrastructure requirements, you need a machine with at least 8 CPUs, 32 GB of RAM, and 1 TB of SSD storage.
Read the on-prem documentation if you’re interested, or talk to us (support@neptune.ai) if you have questions.
If you have any trouble, our deployment engineers will help you all the way.
- Can my datasets stay on my own infrastructure?
Yes, you can just reference datasets that sit on your infrastructure or in the cloud.
For example, you can have your datasets on S3 and just reference the bucket.
run["train_dataset"].track_files("s3://datasets/train.csv")
Neptune will save the following metadata about this dataset:
- version (hash),
- location (path),
- size,
- folder structure, and contents (files)
Neptune never uploads the dataset, just logs the metadata about it.
You can later compare datasets or group experiments by dataset version in the UI.
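For example, to check later which dataset version a given run used, you can read the hash back with the same fetch_hash() call shown earlier (a minimal sketch; the run ID is a placeholder):
import neptune

# Reopen a finished run in read-only mode (replace the placeholder ID)
run = neptune.init_run(with_id="PROJ-123", mode="read-only")
print(run["train_dataset"].fetch_hash())  # version (hash) of the tracked dataset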
- Why do teams choose Neptune over a self-hosted, open-source tracking tool?
Short version. People choose Neptune when:
- They don’t want to maintain infrastructure (including autoscaling, backups etc.),
- They keep scaling their projects (and get into thousands of runs),
- They collaborate with a team (and want user access, multi-tenant UI etc.).
For the long version, read this full feature-by-feature comparison.
- How is Neptune different from Weights & Biases (WandB)?
Short version. People choose Neptune when:
- They want reasonable pricing and the ability to invite unlimited users for free (at $150/month, you get unlimited team members and 1,500 logging hours added every month),
- They want a super flexible tool (customizable logging structure, dashboards, works great with time-series ML),
- They want a component for experiment tracking and model registry, NOT an end-to-end platform (WandB is adding HPO, orchestration, and model deployment; we integrate with best-in-class tools in the space).
For the long version, read this full feature-by-feature comparison.
- Does Neptune do model monitoring?
It depends on what “model monitoring” you mean.
As we talk to teams, it seems that “model monitoring” means six different things to three different people:
- (1) Monitor model performance in production: See if model performance decays over time and whether you should re-train the model
- (2) Monitor model input/output distributions: See how the distributions of input data, features, and predictions change over time
- (3) Monitor model training and re-training: See learning curves, trained model prediction distributions, or the confusion matrix during training and re-training
- (4) Monitor model evaluation and testing: Log metrics, charts, predictions, and other metadata for your automated evaluation or testing pipelines
- (5) Monitor hardware metrics: See how much CPU/GPU or memory your models use during training and inference
- (6) Monitor CI/CD pipelines for ML: See the evaluations from your CI/CD pipeline jobs and compare them visually
So, looking at the tooling landscape and where Neptune fits:
- Neptune does (3) and (4) really well, but we have also seen teams use it for (5) and (6)
- Prometheus + Grafana is really good at (5), but people use it for (1) and (2)
- WhyLabs or Arize are really good at (1) and (2)