How to Keep Track of PyTorch Lightning Experiments With Neptune

Working with PyTorch Lightning and wondering which logger you should choose to keep track of your experiments?

Want to find a good way to save hyperparameters, metrics, and other model-building metadata?

Thinking of using PyTorch Lightning to structure your Deep Learning code and wouldn’t mind learning about its logging functionality?

Didn’t know that Lightning has a pretty awesome Neptune integration?

This article is (very likely) for you.

Why PyTorch Lightning and Neptune?

If you have never heard of it, PyTorch Lightning is a very lightweight wrapper on top of PyTorch that is more of a coding standard than a framework. This format lets you get rid of a ton of boilerplate code while keeping your code easy to follow.

The result is a framework that gives researchers, students, and production teams the ultimate flexibility to try crazy ideas without having to learn yet another framework while automating away all the engineering details.

Some great features that you can get out-of-the-box are:

  • Train on CPU, GPU, or TPUs without changing your code,
  • Trivial multi-GPU and multi-node training,
  • Trivial 16-bit precision support,
  • Built-in performance profiler (Trainer(profiler="simple"))

and a ton of other great functionalities.
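Many of these are just Trainer arguments. Here is a minimal sketch (the flag names follow PyTorch Lightning 1.5; the values are only illustrative):

from pytorch_lightning import Trainer

trainer = Trainer(
    gpus=2,             # train on 2 GPUs (use tpu_cores=8 for TPUs)
    precision=16,       # 16-bit precision training
    profiler="simple",  # built-in performance profiler
)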

But with this power to run experiments easily and tweak anything you want comes a problem.

How do you keep track of changes like:

  • losses and metrics,
  • hyperparameters,
  • model binaries,
  • validation predictions,

and other things that will help you organize your experimentation process?

PyTorch Lightning loggers

Fortunately, PyTorch Lightning lets you easily connect loggers to the pl.Trainer. One of the supported loggers that can track all of the things mentioned above (and many others) is the NeptuneLogger, which saves your experiments in… you guessed it, Neptune.

Neptune not only tracks your experiment artifacts but also your metrics, hyperparameters, hardware consumption, and code.

The best part is that this integration really is trivial to use. 

Let me show you how it looks.

TIP

You can also check out this colab notebook and play with the examples discussed here yourself.

PyTorch Lightning logging: basic integration (save hyperparameters, metrics, and more)

In the simplest case, you just create the NeptuneLogger:

from pytorch_lightning.loggers import NeptuneLogger

neptune_logger = NeptuneLogger(
    api_key="ANONYMOUS",
    project="shared/pytorch-lightning-integration")

and pass it to the logger argument of Trainer and fit your model.

from pytorch_lightning import Trainer

trainer = Trainer(logger=neptune_logger)
trainer.fit(model)

By doing so you automatically:

  • Log metrics and losses, and get the charts created (see the sketch below),
  • Log and save hyperparameters (if defined via lightning hparams),
  • Log hardware utilization,
  • Log Git info and the execution script.

Check out this experiment.

You can monitor your experiments, compare them, and share them with others.

Not too bad for a 4-liner.
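Under the hood, metrics reach Neptune through the standard Lightning logging call: anything you log with self.log() in your LightningModule is picked up by the active logger. A minimal sketch (this is just the toy MLP used in the full script at the end of this post):

import torch
from torch.nn import functional as F
import pytorch_lightning as pl

class CoolSystem(pl.LightningModule):

    def __init__(self):
        super().__init__()
        # not the best model...
        self.l1 = torch.nn.Linear(28 * 28, 10)

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = torch.relu(self.l1(x.view(x.size(0), -1)))
        loss = F.cross_entropy(y_hat, y)
        # anything logged with self.log() shows up as a chart in Neptune
        self.log('train/loss', loss)
        return loss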

But with just a bit more effort you can get a lot more.

PyTorch Lightning logging: advanced options

Neptune gives you a lot of customization options and you can simply log more experiment-specific things, like image predictions, model weights, performance charts, and more.

All of that functionality is available for Lightning users and in the next sections, I will show you how to leverage Neptune to the fullest.

Logging extra information at NeptuneLogger creation

When you are creating the logger you can log additional useful information: 

  • code: snapshot scripts, Jupyter notebooks, config files, and more,
  • hyperparameters: log the learning rate, number of epochs, and other things (if you are using the Lightning hparams object, it will be logged automatically),
  • properties: log data locations, data versions, or other things,
  • tags: add tags like “resnet50” or “no-augmentation” to organize your runs.

Just pass this information to your logger:

neptune_logger = NeptuneLogger(
    api_key="ANONYMOUS",
    project="shared/pytorch-lightning-integration",
    tags=["pytorch-lightning", "mlp"],
)
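If you are not using the Lightning hparams object, you can also attach hyperparameters, properties, and extra tags to the run yourself through neptune_logger.experiment. A short sketch (the field names and values below are just examples):

PARAMS = {'lr': 0.02, 'batch_size': 32, 'max_epochs': 15}

neptune_logger.experiment['parameters'] = PARAMS                    # hyperparameters
neptune_logger.experiment['properties/data_version'] = 'mnist-v1'   # e.g. data versions
neptune_logger.experiment['sys/tags'].add(['no-augmentation'])      # extra tags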

Logging extra things during training with PyTorch Lightning

A lot of interesting information can be logged during training.  

You may be interested in monitoring things like:

  • model predictions after each epoch (think prediction masks or overlaid bounding boxes)
  • diagnostic charts like ROC AUC curve or Confusion Matrix
  • model checkpoints, or other objects

It is really simple. Just go to your LightningModule and call methods of the Neptune experiment available as self.logger.experiment.

For example, we can log histograms of losses after each epoch:

from neptune.new.types import File

class CoolSystem(pl.LightningModule):

    def validation_epoch_end(self, outputs):
        # OPTIONAL
        avg_loss = torch.stack([x['val_loss'] for x in outputs]).mean()

        # log debugging images, like a histogram of validation losses
        fig = plt.figure()
        losses = np.stack([x['val_loss'].numpy() for x in outputs])
        plt.hist(losses)
        self.logger.experiment['loss_histograms'].log(File.as_image(fig))
        plt.close(fig)

        return {'avg_val_loss': avg_loss}

Explore them yourself.

Other things you may want to log during training are:

  • neptune_logger.experiment["your/metadata/metric"].log(metric) # log custom metrics
  • neptune_logger.experiment["your/metadata/text"].log(text) # log text values
  • neptune_logger.experiment["your/metadata/file"].upload(artifact) # log files 
  • neptune_logger.experiment["your/metadata/figure"].upload(File.as_image(artifact)) # log images, charts
  • neptune_logger.experiment["properties/key"] = value # add key value pairs
  • neptune_logger.experiment["sys/tags"].add(['tag1', 'tag2']) # add tags for organization

Pretty cool, right?

But … that is not all you can do!

Logging things after PyTorch Lightning training has finished

Tracking your experiment doesn’t have to finish after your .fit loop ends.

You may want to track the metrics of the trainer.test(model) or calculate some additional validation metrics and log them.

To do that, you just create the NeptuneLogger as before (the underlying run stays open until you explicitly stop it):

neptune_logger = NeptuneLogger(
    api_key="ANONYMOUS",
    project_name="shared/pytorch-lightning-integration",
    ...
)

… and you can keep logging 🙂

Test metrics:

trainer.test(model)

Additional (external) metrics:

from sklearn.metrics import accuracy_score
...
accuracy = accuracy_score(y_true, y_pred)
neptune_logger.experiment['test/accuracy'].log(accuracy)

Performance charts on test set:

from scikitplot.metrics import plot_confusion_matrix
from neptune.new.types import File
import matplotlib.pyplot as plt
...
fig, ax = plt.subplots(figsize=(16, 12))
plot_confusion_matrix(y_true, y_pred, ax=ax)
neptune_logger.experiment['test/confusion_matrix'].upload(File.as_image(fig))

The whole model checkpoints directory:

neptune_logger.experiment['checkpoints'].upload_files('my/checkpoints')

Go to this experiment to see how those objects are logged.

But … there is even more! 

Neptune lets you fetch experiments after training. 

Let me show you how.

Fetching your PyTorch Lightning experiment information directly into your notebooks

You can fetch experiments after they have finished, analyze the results, and update metrics, artifacts, or other things if you want to. 

For example, let’s fetch the experiment dashboard into a pandas DataFrame:

import neptune.new as neptune

project = neptune.get_project('shared/pytorch-lightning-integration')
project.fetch_runs_table().to_pandas()
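The result is a regular DataFrame with one row per run and one column per logged field, so you can slice and sort it like any other table. A quick sketch (the 'test/accuracy' column only exists if you logged that metric):

run_df = project.fetch_runs_table().to_pandas()

# columns map to field paths, e.g. 'sys/id' or 'test/accuracy' (if you logged it)
best_runs = run_df.sort_values('test/accuracy', ascending=False).head()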

or fetch a single experiment and update it with some external metric calculated after training:

exp = neptune.init(project='shared/pytorch-lightning-integration', run='PYTOR-63')
exp['some_external_metric'].log(0.92)
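Since exp is a regular run handle, you can also read values back or download files you logged earlier. A small sketch (assuming those fields exist on the run):

# read the last value of a logged metric series and download a previously uploaded image
last_accuracy = exp['test/accuracy'].fetch_last()
exp['test/confusion_matrix'].download('confusion_matrix.png')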

As you can see, there are a lot of things you can log to Neptune from PyTorch Lightning.

If you want to go deeper into this, check out the colab notebook and the example experiments linked above.

Final thoughts

PyTorch Lightning is a great library that helps you with:

  • organizing your deep learning code to make it easily understandable to other people,
  • outsourcing development boilerplate to a team of seasoned engineers,
  • accessing a lot of state-of-the-art functionalities with almost no changes to your code 

With Neptune integration, you get some additional things for free:

  • you can monitor and keep track of your deep learning experiments
  • you can share your research with other people easily
  • you and your team can access experiment metadata and collaborate more efficiently.

Hopefully, with all that power you will know exactly what you (and other people) tried, and your deep learning research will move at lightning speed 🙂

Full PyTorch Lightning tracking script

pip install --upgrade torch "pytorch-lightning>=1.5.0" \
    neptune-client \
    matplotlib scikit-plot

import os

import numpy as np
import neptune.new as neptune
from neptune.new.types import File
import torch
from torch.nn import functional as F
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision import transforms
import matplotlib.pyplot as plt

import pytorch_lightning as pl

MAX_EPOCHS=15
LR=0.02
BATCHSIZE=32
CHECKPOINTS_DIR = 'my_models/checkpoints'

class CoolSystem(pl.LightningModule):

    def __init__(self):
        super(CoolSystem, self).__init__()
        # not the best model...
        self.l1 = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return torch.relu(self.l1(x.view(x.size(0), -1)))

    def training_step(self, batch, batch_idx):
        # REQUIRED
        x, y = batch
        y_hat = self.forward(x)
        loss = F.cross_entropy(y_hat, y)
        self.log('train/loss', loss)
        return {'loss': loss}

    def validation_step(self, batch, batch_idx):
        # OPTIONAL
        x, y = batch
        y_hat = self.forward(x)
        loss = F.cross_entropy(y_hat, y)
        self.log('val/loss', loss)
        return {'val_loss': loss}

    def validation_epoch_end(self, outputs):
        # OPTIONAL
        avg_loss = torch.stack([x['val_loss'] for x in outputs]).mean()

        fig = plt.figure()
        losses = np.stack([x['val_loss'].numpy() for x in outputs])
        plt.hist(losses)
        neptune_logger.experiment['imgs/loss_histograms'].upload(File.as_image(fig))

        return {'avg_val_loss': avg_loss}

    def test_step(self, batch, batch_idx):
        # OPTIONAL
        x, y = batch
        y_hat = self.forward(x)
        loss = F.cross_entropy(y_hat, y)
        self.log('test/loss', loss)
        return {'test_loss': loss}

    def test_epoch_end(self, outputs):
        # OPTIONAL
        avg_loss = torch.stack([x['test_loss'] for x in outputs]).mean()
        self.log('test/avg_loss', avg_loss)

    def configure_optimizers(self):
        # REQUIRED
        # can return multiple optimizers and learning_rate schedulers
        # (LBFGS it is automatically supported, no need for closure function)
        return torch.optim.Adam(self.parameters(), lr=LR)

    def train_dataloader(self):
        # REQUIRED
        return DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=transforms.ToTensor()), batch_size=BATCHSIZE)

    def val_dataloader(self):
        # OPTIONAL
        return DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=transforms.ToTensor()), batch_size=BATCHSIZE)

    def test_dataloader(self):
        # OPTIONAL
        return DataLoader(MNIST(os.getcwd(), train=False, download=True, transform=transforms.ToTensor()), batch_size=BATCHSIZE)


from pytorch_lightning.loggers.neptune import NeptuneLogger

neptune_logger = NeptuneLogger(
    api_key="ANONYMOUS",
    project_name="shared/pytorch-lightning-integration",
    tags=["pytorch-lightning", "mlp"],
)
model_checkpoint = pl.callbacks.ModelCheckpoint(dirpath=CHECKPOINTS_DIR)

from pytorch_lightning import Trainer

model = CoolSystem()
trainer = Trainer(max_epochs=MAX_EPOCHS,
                  logger=neptune_logger,
                  callbacks=[model_checkpoint],
                  )
trainer.fit(model)
trainer.test(model)

# Get predictions on the test set
model.freeze()
test_loader = DataLoader(MNIST(os.getcwd(), train=False, download=True, transform=transforms.ToTensor()), batch_size=256)

y_true, y_pred = [], []
for x, y in test_loader:
    y_hat = model(x).argmax(axis=1).cpu().detach().numpy()
    y = y.cpu().detach().numpy()

    y_true.append(y)
    y_pred.append(y_hat)

y_true = np.hstack(y_true)
y_pred = np.hstack(y_pred)

# Log additional metrics
from sklearn.metrics import accuracy_score

accuracy = accuracy_score(y_true, y_pred)
neptune_logger.experiment['test/accuracy'].log(accuracy)

# Log charts
from scikitplot.metrics import plot_confusion_matrix
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(16, 12))
plot_confusion_matrix(y_true, y_pred, ax=ax)
neptune_logger.experiment['confusion_matrix'].upload(File.as_image(fig))

# Save checkpoints folder
neptune_logger.experiment['checkpoints'].upload_files(CHECKPOINTS_DIR)


# You can stop the experiment
neptune_logger.experiment.stop()
