If it’s not responsive,
it’s not working
Other experiment trackers can’t handle
the scale of your training:
To train at hyperscale without headwinds, you need Neptune
View and analyze thousands of metrics
in milliseconds
With Neptune’s web app, you can render huge runs tables (100,000+ runs). Or compare thousands of metrics on a single chart, without the screen freeze you get with other tools.
And, because we don’t downsample data, your visualizations are 100% accurate – down to a single metric spike.
Real-time experiment tracking with confidence is now a reality.
Track months-long model training
with more confidence
Forking new runs from any step of your experiment makes it possible to:
- Test multiple configs at the same time. Stop the runs that don’t improve accuracy. And continue from the step with the best accuracy. No more wasting millions on training experiments that won’t converge.
- Restart failed training sessions from any previous step. Your training history is inherited. And you can see your entire experiment on a single chart. No more wasting time on workarounds that give you inconsistent results.
Run forking is currently available only in Neptune Scale (see the sketch below). Request early access to this version.
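Here’s a minimal sketch of what forking could look like with the neptune_scale client. Since this is an early-access API, parameter names such as fork_run_id and fork_step may change; the run IDs, step numbers, and metric values below are placeholders.

from neptune_scale import Run

# Fork a new run from step 20000 of an existing run.
# "base-run" and "base-run-lr-3e4" are placeholder run IDs.
run = Run(
    run_id="base-run-lr-3e4",
    fork_run_id="base-run",
    fork_step=20_000,
)

# Log the config you changed for this branch of the experiment
run.log_configs({"optimizer/learning_rate": 3e-4})

# Continue logging from the forked step onward; metric history
# up to step 20000 is inherited from the parent run
for step in range(20_001, 20_101):
    loss = ...  # your training step goes here
    run.log_metrics(data={"train/loss": loss}, step=step)

run.close()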
Deploy on-prem or in your private cloud
— from day one
Unlike other tools, we built Neptune’s architecture, data model, and algorithms for maximum scalability.
For example, Neptune can ingest 100k data points per second — asynchronously (based on Kafka).
So you can track all the metrics, results, and metadata you generate — while keeping your data safe.
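On the client side, that asynchronous design means logging calls don’t block your training loop. Here’s a minimal sketch using the standard neptune client in its default asynchronous mode; the field names and values are placeholders for illustration.

import neptune

run = neptune.init_run()

# append() returns immediately: data is buffered locally and a background
# thread syncs it to the Neptune servers.
for step in range(10_000):
    for layer in range(100):
        run[f"gradients/layer_{layer}/norm"].append(0.01 * layer)  # placeholder values

run.stop()  # flushes whatever is still in the local buffer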
Speaks fluently with your stack
- Any code
- Training frameworks
- HPO frameworks
- Automation frameworks
import neptune

# Connect to Neptune and create a run
run = neptune.init_run()

# Log hyperparameters
run["parameters"] = {
    "batch_size": 64,
    "dropout": 0.5,
    "optimizer": {"type": "SGD", "learning_rate": 0.001},
}

# Log dataset versions
run["data/train_version"].track_files("train/images")

# Log the training process
for epoch in range(100):
    accuracy = ...
    run["train/accuracy"].append(accuracy)

# Log test metrics and charts
run["test/f1_score"] = test_score
run["test/confusion_matrix"].upload(fig)

# Log model weights and versions
run["model/weights"].upload("my_model.pkl")

# Stop logging to your run
run.stop()
import neptune
from neptune_pytorch import NeptuneLogger

run = neptune.init_run()

# Attach the Neptune logger to your PyTorch model
neptune_logger = NeptuneLogger(
    run=run,
    model=model,
)
from lightning.pytorch import Trainer
from lightning.pytorch.loggers import NeptuneLogger

# Pass the Neptune logger to the Lightning Trainer
neptune_logger = NeptuneLogger()
trainer = Trainer(
    ...,
    logger=neptune_logger,
)
trainer.fit(...)
from transformers import Trainer, TrainingArguments

# Report training metrics to Neptune from the Transformers Trainer
training_args = TrainingArguments(
    ...,
    report_to="neptune",
)
trainer = Trainer(
    model,
    training_args,
    ...,
)
import neptune

run = neptune.init_run()

# Track and version data files used for training
run["datasets/version"].track_files("s3://path/to/object")

# Log training parameters
params = {
    "num_epochs": 10,
    # ...
}
run["training/model/params"] = params

# Log metrics in the training loop
for epoch in range(params["num_epochs"]):
    ...
    # Log metrics for the epoch
    run["training/train/loss"].append(loss)
    run["training/train/accuracy"].append(accuracy)

# Upload trained model to Neptune
model.save("my_model.keras")
run["model"].upload("my_model.keras")
import neptune
from neptune_tensorflow_keras import NeptuneCallback

run = neptune.init_run()

# Log Keras training metrics with the Neptune callback
neptune_callback = NeptuneCallback(run=run)
model.fit(
    ...,
    callbacks=[neptune_callback],
)
from composer import Trainer
from composer.loggers import NeptuneLogger

# Pass the Neptune logger to the Composer Trainer
trainer = Trainer(
    ...,
    loggers=NeptuneLogger(),
)
import neptune
from neptune_sklearn import create_classifier_summary

run = neptune.init_run()

# Log a summary of the trained classifier to the run
run["cls_summary"] = create_classifier_summary(
    classifier,
    X_train,
    X_test,
    y_train,
    y_test,
)
import lightgbm as lgbm
import neptune
from neptune_lightgbm import NeptuneCallback, create_booster_summary

run = neptune.init_run()
neptune_callback = NeptuneCallback(run=run)

# Log training metrics live
gbm = lgbm.train(
    ...,
    callbacks=[neptune_callback],
)

# Log model summary after training
run["lgbm_summary"] = create_booster_summary(booster=gbm)
import neptune
import xgboost as xgb
from neptune_xgboost import NeptuneCallback

run = neptune.init_run()

# Log boosting metrics live during training
neptune_callback = NeptuneCallback(run=run)
xgb.train(
    ...,
    callbacks=[neptune_callback],
)
import neptune
import optuna
from neptune_optuna import NeptuneCallback

run = neptune.init_run()
neptune_callback = NeptuneCallback(run)
...

# Log every trial to Neptune as the study runs
study.optimize(
    ...,
    callbacks=[neptune_callback],
)
from airflow import DAG
from neptune_airflow import NeptuneLogger

with DAG(
    ...
) as dag:

    def your_task(**context):
        # Create a Neptune logger inside the task and hand it to your task logic
        logger = NeptuneLogger()
        return task_results(logger, **context)
def step_function(
    ...,
    neptune_run: neptune.handler.Handler,
):
    ...
    neptune_run["field"] = value
    ...
from zenml import step
from zenml.integrations.neptune.experiment_trackers.run_state import (
    get_neptune_run,
)

@step(experiment_tracker="neptune_tracker", ...)
def my_step():
    # Get the Neptune run created by the ZenML experiment tracker
    neptune_run = get_neptune_run()
    neptune_run["sys/name"] = "My custom run name"
    neptune_run["params/lr"] = params.lr
    ...
Loved by 60,000+ researchers. Trusted by enterprises.
The largest models require
the most scalable experiment tracker
Interested in how Neptune can help you with that?