How to Set Up Continuous Integration for Machine Learning with GitHub Actions and Neptune: Step-by-Step Guide

Posted August 14, 2020

In software development, Continuous Integration (CI) is the practice of frequently merging code changes from the entire team into a shared codebase. Before any new code is merged, it is automatically tested and checked for quality.

CI keeps the codebase up-to-date, clean, and tested by design, and helps you find any problems with it quickly.

But what does Continuous Integration mean for machine learning?

The way I see it:

Continuous Integration in machine learning extends the concept to running model training or evaluation jobs for each trigger event (like a merge request or a commit).

This should be done in a way that is versioned and reproducible to ensure that when things are added to the shared codebase they are properly tested and available for audit when needed.  

Some examples of CI workflows in machine learning could be:

  • running and versioning the training and evaluation for every commit to the repository,
  • running and comparing experiment runs for each Pull Request to a certain branch,
  • creating model predictions on a test set and saving them somewhere on every PR to the feature branch,
  • about a million other model training and testing scenarios that could be automated.

The good news is that today there are tools for this, and in this article I will show you how to set up a Continuous Integration workflow with two of them:

  • GitHub Actions: lets you run CI workflows directly from GitHub
  • Neptune: makes experiment tracking and model versioning easy

You will learn

How to set up a CI pipeline that automates the following scenario.

On every Pull Request from branch develop to master:

  • Run model training and log all the experiment information to Neptune for both branches
  • Create a comment with a table showing diffs in parameters, properties, and metrics, plus links to the experiments and the experiment comparison in Neptune

See this Pull Request on GitHub

CI for machine learning: Step-by-step guide

Before you start

Make sure you meet the following prerequisites before starting the how-to steps:

  • Create a GitHub repository (learn how)
  • Create a Neptune project (learn how): this is optional. I will be using an open project and logging information as an anonymous user.

Note:

You can see this example project with the markdown table in the Pull Request on GitHub. The workflow config, environment file, and training script are all there.

Step 1: Add Neptune logging to your training scripts

In this example project, we will be training a LightGBM multiclass classification model.

Since we want to keep proper track of our models, we will also save the learning curves, evaluation metrics on the test set, and performance charts like the ROC curve.

1. Add Neptune tracking to your training script

Let me show you first and explain later.

import os

import lightgbm as lgb
import matplotlib.pyplot as plt
import neptune
from neptunecontrib.monitoring.lightgbm import neptune_monitor
from scikitplot.metrics import plot_roc, plot_confusion_matrix, plot_precision_recall
from sklearn.datasets import load_wine
from sklearn.metrics import f1_score, accuracy_score
from sklearn.model_selection import train_test_split

PARAMS = {'boosting_type': 'gbdt',
          'objective': 'multiclass',
          'num_class': 3,
          'num_leaves': 8,
          'learning_rate': 0.01,
          'feature_fraction': 0.9,
          'seed': 1234
          }
NUM_BOOSTING_ROUNDS = 10

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(data.data,
                                                    data.target,
                                                    test_size=0.25,
                                                    random_state=1234)
lgb_train = lgb.Dataset(X_train, y_train)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)

# Connect your script to Neptune
neptune.init(api_token=os.getenv('NEPTUNE_API_TOKEN'),
             project_qualified_name=os.getenv('NEPTUNE_PROJECT_NAME'))

# Create an experiment and log hyperparameters
neptune.create_experiment('lightGBM-on-wine',
                          params={**PARAMS,
                                  'num_boosting_round': NUM_BOOSTING_ROUNDS})

gbm = lgb.train(PARAMS,
                lgb_train,
                num_boost_round=NUM_BOOSTING_ROUNDS,
                valid_sets=[lgb_train, lgb_eval],
                valid_names=['train', 'valid'],
                callbacks=[neptune_monitor()],  # monitor learning curves
                )
y_test_pred = gbm.predict(X_test)

f1 = f1_score(y_test, y_test_pred.argmax(axis=1), average='macro')
accuracy = accuracy_score(y_test, y_test_pred.argmax(axis=1))

# Log metrics to Neptune
neptune.log_metric('accuracy', accuracy)
neptune.log_metric('f1_score', f1)

fig_roc, ax = plt.subplots(figsize=(12, 10))
plot_roc(y_test, y_test_pred, ax=ax)

fig_cm, ax = plt.subplots(figsize=(12, 10))
plot_confusion_matrix(y_test, y_test_pred.argmax(axis=1), ax=ax)

fig_pr, ax = plt.subplots(figsize=(12, 10))
plot_precision_recall(y_test, y_test_pred, ax=ax)

# Log performance charts to Neptune
neptune.log_image('performance charts', fig_roc)
neptune.log_image('performance charts', fig_cm)
neptune.log_image('performance charts', fig_pr)

It is a typical model training script with a few additions:

  • We connected Neptune to the script with neptune.init() and passed our API token and the project name
  • We created an experiment and logged hyperparameters with neptune.create_experiment()
  • We added a learning curves callback with callbacks=[neptune_monitor()]
  • We logged test evaluation metrics with neptune.log_metric()  
  • We logged performance charts with neptune.log_image()

Now when you run your script:

python train.py

You should get something like this:

See this experiment in Neptune

2. Add a snippet that is run only in the CI environment. 

Add the following snippet at the bottom of your training script.

if os.getenv('CI') == "true":
    neptune.append_tag('ci-pipeline', os.getenv('NEPTUNE_EXPERIMENT_TAG_ID'))

What this does is:

  • fetches the CI environment variable to see whether the code is running inside the GitHub Actions workflow,
  • adds a ci-pipeline tag to the experiment so that it is easier to filter experiments in the Neptune UI,
  • reads the NEPTUNE_EXPERIMENT_TAG_ID environment variable, which is used to identify the experiment in the CI workflow, and logs it to Neptune as a tag (this will become clear later).
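
To make this concrete, here is a rough sketch (not part of the project files) of how such a tag can later be used to look up the experiment with the old neptune-client API used in this post; treat the exact get_experiments() filter as an assumption on my part:

import os

import neptune

# Connect to the same project the training script logs to
project = neptune.init(api_token=os.getenv('NEPTUNE_API_TOKEN'),
                       project_qualified_name=os.getenv('NEPTUNE_PROJECT_NAME'))

# The uuidgen-generated tag identifies exactly one run started by the workflow,
# so filtering by it should return a single experiment
experiments = project.get_experiments(tag=os.getenv('NEPTUNE_EXPERIMENT_TAG_ID'))
print(experiments)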

Step 2: Create an environment file

Having an environment setup file that makes it easy to recreate your training or evaluation environment from scratch is generally good practice.

But when you are training models in a CI workflow (like GitHub Actions), it is a must. The environment where the workflow is executed (and models are trained) is created from scratch every time your workflow is triggered.

There are a few choices when it comes to environment setup files. You can use:

  • Pip and the requirements.txt,
  • Conda and the environment.yaml,
  • Docker and the Dockerfile (this is often the best option).

Let’s go with the simplest solution and create a requirements.txt file with all the packages we need:

requirements.txt

lightgbm==2.3.1
neptune-client==0.4.119
neptune-contrib==0.23.3
numpy==1.19.0
scikit-learn==0.23.1
scikit-plot==0.3.7

Now, whenever you need to run your training, install all the packages with:

pip install -r requirements.txt
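
If you prefer conda, a rough equivalent environment.yaml could look like the sketch below (my translation of the requirements.txt above; I'm assuming these package versions are available on your conda channels, with the Neptune packages installed through pip):

environment.yaml

name: ci-for-ml
channels:
  - conda-forge
dependencies:
  - python=3.7
  - lightgbm=2.3.1
  - numpy=1.19.0
  - scikit-learn=0.23.1
  - pip
  - pip:
      - neptune-client==0.4.119
      - neptune-contrib==0.23.3
      - scikit-plot==0.3.7

You would then create the environment with conda env create -f environment.yaml.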

Step 3: Set up Github Secrets

GitHub Secrets allow you to pass sensitive information like keys or passwords to the GitHub workflow runners so that your automated jobs can be executed.

In our case, two sensitive things are needed:

  • NEPTUNE_API_TOKEN: I’ll set it to the API token of the anonymous Neptune user, ANONYMOUS
  • NEPTUNE_PROJECT_NAME: I’ll set it to the open project shared/github-actions

Note:

You can set those to your API token and the Neptune project you created.

Without those, the workflow wouldn’t know where to send the experiments, and Neptune wouldn’t know who is sending them (and whether this should be allowed).

To set up GitHub Secrets:

  • Go to your Github project
  • Go to the Settings tab
  • Go to the Secrets section
  • Click on New secret 
  • Specify the name and value of the secret (similarly to environment variables)
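
Inside the workflow, these secrets are exposed to your scripts as environment variables (you will see ${{ secrets.NEPTUNE_API_TOKEN }} mapped to env in the config in Step 5). If you want the training script to fail fast when they are missing, a small check like this at the top of train.py would do it (my addition, not part of the example project):

import os

# Fail fast if the GitHub Secrets were not passed through as environment variables
for var in ('NEPTUNE_API_TOKEN', 'NEPTUNE_PROJECT_NAME'):
    if not os.getenv(var):
        raise RuntimeError(var + ' is not set - check your GitHub Secrets')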

Step 4: Create .github/workflows directory and a .yml action file

GitHub runs all the workflows that you define with .yml configuration files in the .github/workflows directory. This means you need to:

1. Create .github/workflows directory:

Go to your project repository and create both .github and .github/workflows directories.

mkdir -p .github/workflows

2. Create neptune_action.yml

Workflow configs that define actions are .yml files of a certain structure that you put in the .github/workflows directory. You can have multiple .yml files to fire multiple workflows. 

I will create just one: neptune_action.yml

touch .github/workflows/neptune_action.yml

As a result, you should see:

your-repository
├── .github
│   └── workflows
│       └── neptune_action.yml
├── .gitignore
├── README.md
├── requirements.txt
└── train.py

Step 5: Define your workflow .yml config

Workflow configs are .yml files where you specify what you want to happen and when. 

In a nutshell, you define:

  • on which GitHub event you would like to trigger the workflow. For example, on a commit to the master branch,
  • what jobs you would like to perform. This is mostly to organize the config,
  • where those jobs should be performed. For example, runs-on: ubuntu-latest will run your workflow on the latest version of Ubuntu,
  • what steps within each job you would like to run sequentially. For example, create an environment, run training, and run evaluation of the model.
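
Put together, a bare-bones config illustrating those four pieces could look like this (a minimal sketch, not the full file we will use):

name: example-workflow

on:
  push:
    branches: [master]     # trigger: every commit to master

jobs:
  my-job:
    runs-on: ubuntu-latest # where: the latest Ubuntu runner
    steps:                 # what: commands executed sequentially
      - uses: actions/checkout@v2
      - name: Say hello
        run: echo "hello"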

In our machine learning CI workflow we need to run the following sequence of steps:

  1. Check out the develop branch
  2. Set up the environment and run model training on the develop branch
  3. Check out the master branch
  4. Set up the environment and run model training on the master branch
  5. Fetch data from Neptune and create an experiment comparison markdown table 
  6. Comment on the PR with that markdown table

Here is the neptune_action.yml that does all that. I know it seems complex, but in reality it’s just a bunch of steps that run terminal commands, with some boilerplate around them.

neptune_action.yml

name: Neptune actions

on:
  pull_request:
    branches: [master]

jobs:
  compare-experiments:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.7]
    env:
      NEPTUNE_API_TOKEN: ${{ secrets.NEPTUNE_API_TOKEN }}
      NEPTUNE_PROJECT_NAME: ${{ secrets.NEPTUNE_PROJECT_NAME }}

    steps:
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Checkout pull request branch
        uses: actions/checkout@v2
        with:
          ref: develop

      - name: Setup pull request branch environment and run experiment
        id: experiment_pr
        run: |
          pip install -r requirements.txt
          export NEPTUNE_EXPERIMENT_TAG_ID=$(uuidgen)
          python train.py
          echo ::set-output name=experiment_tag_id::$NEPTUNE_EXPERIMENT_TAG_ID

      - name: Checkout main branch
        uses: actions/checkout@v2
        with:
          ref: master

      - name: Setup main branch environment and run experiment
        id: experiment_main
        run: |
          pip install -r requirements.txt
          export NEPTUNE_EXPERIMENT_TAG_ID=$(uuidgen)
          python train.py
          echo ::set-output name=experiment_tag_id::$NEPTUNE_EXPERIMENT_TAG_ID

      - name: Get Neptune experiments
        env:
          MAIN_BRANCH_EXPERIMENT_TAG_ID: ${{ steps.experiment_main.outputs.experiment_tag_id }}
          PR_BRANCH_EXPERIMENT_TAG_ID: ${{ steps.experiment_pr.outputs.experiment_tag_id }}
        id: compare
        run: |
          pip install -r requirements.txt
          python -m neptunecontrib.create_experiment_comparison_comment \
            --api_token $NEPTUNE_API_TOKEN \
            --project_name $NEPTUNE_PROJECT_NAME \
            --tag_names $MAIN_BRANCH_EXPERIMENT_TAG_ID $PR_BRANCH_EXPERIMENT_TAG_ID \
            --filepath comment_body.md
          result=$(cat comment_body.md)
          echo ::set-output name=result::$result

      - name: Create a comment
        uses: peter-evans/commit-comment@v1
        with:
          body: |
            ${{ steps.compare.outputs.result }}

You can just copy this file and paste it into your .github/workflows directory and it will work out of the box. 

That said, there are some things that you may need to adjust to your setup:

  • Branch names, if you want to trigger your workflow on a PR from a branch different than develop or to a branch different than master (see the sketch after this list).
  • Environment setup steps if you are using anything different than pip and requirements.txt.
  • The command that runs your training scripts.
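
For example, if your repository uses main instead of master as the base branch (an assumption about your setup, not something from the example project), you would only need to change the trigger section, plus the ref: values in the checkout steps:

on:
  pull_request:
    branches: [main]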

Note:

Explaining this config in detail would make this post really long so I decided not to :). If you’d like to understand everything about it check out Github Actions Documentation (which is great by the way).

Step 6: Push it to GitHub

Now you need to push this workflow to GitHub. 

git add .github/workflows train.py requirements.txt
git commit -m "added continuous integration"
git push origin master

Since our workflow will be triggered on every Pull Request to master, nothing will happen just yet. 

Step 7: Create a Pull Request 

Now everything is ready and you just need to create a PR from branch develop to master.

1. Check out a new branch develop
git checkout -b develop

2. Change some parameters in train.py

train.py

PARAMS = {'boosting_type': 'gbdt',
          'objective': 'multiclass',
          'num_class': 3,
          'num_leaves': 15, # previously 8
          'learning_rate': 0.01,
          'feature_fraction': 0.85, # previously 0.9
          'seed': 1234
          }

3. Add, commit, and push your changes to the previously created branch develop

git add train.py
git commit -m "tweaked parameters"
git push origin develop

4. Go to GitHub and create a Pull Request from branch develop to master.

The workflow is triggered and it goes through all the steps one by one. 

Explore the result

If everything worked correctly, you should see a Pull Request comment that shows:

  • Diffs in parameters, properties, and evaluation metrics.
  • Experiment IDs and links to both the main and PR branch runs in Neptune. You can go and see all the details of those experiments including learning curves and performance charts that were logged for those experiments.
  • A link to a full comparison between those runs in Neptune.

See this Pull Request on GitHub

Final thoughts

Ok, so in this how-to guide, you learned how to set up a Continuous Integration workflow that creates a comparison table for every Pull Request to master. 

Hopefully, with this information, you will be able to create the CI workflow that works for your machine learning project!

Jakub Czakon
Senior Data Scientist