
How to Organize Your LightGBM ML Model Development Process – Examples of Best Practices

LightGBM is a distributed and efficient gradient boosting framework that uses tree-based learning. It’s known for its fast training, accuracy, and efficient utilization of memory. It uses a leaf-wise tree growth algorithm that tends to converge faster compared to depth-wise growth algorithms. 

LightGBM is great, and building models with LightGBM is easy. But when you train many versions of the model with changing features and hyperparameter configurations, it is easy to get lost in all this metadata. 

Managing these configurations in Excel sheets or text files can quickly become a mess. Luckily, today there are many tools and libraries that can help you keep track of all this. 

This article will look at how we can use one of the most popular experiment and model management libraries, Neptune, to keep track of the different versions of your ML model. I will also show you how you can add experiment management to your current workflow in just a few steps.


CHECK ALSO
Neptune’s integration with LightGBM – docs


What ML model development with LightGBM looks like today

Obtaining the dataset

Any model development process will kick off with obtaining the dataset. Let us use Scikit-learn to generate a regression dataset. After that, we split it into a training and test set.

import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Generate a synthetic regression dataset
X, y = make_regression(n_samples=100000, n_features=10, n_informative=8, random_state=101)
X = pd.DataFrame(X, columns=["F1","F2","F3","F4","F5","F6","F7","F8","F9","F10"])
y = pd.DataFrame(y, columns=["Target"])

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=101)

Train the LightGBM model

At this point, we can start the process of training the LightGBM model. However, we need to get a couple of things out of the way:

  • defining the training and validation sets in the lgb.Dataset format required by the train method
  • defining the training parameters

import lightgbm as lgb

# Create LightGBM datasets for training and validation
lgb_train = lgb.Dataset(X_train, y_train)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)

# Training parameters
params = {'boosting_type': 'gbdt',
          'objective': 'regression',
          'num_leaves': 40,
          'learning_rate': 0.1,
          'feature_fraction': 0.9
          }

# Train the model with both training and validation sets
gbm = lgb.train(params,
                lgb_train,
                num_boost_round=200,
                valid_sets=[lgb_train, lgb_eval],
                valid_names=['train', 'valid'],
                )

After training, we should save the model to use it during the deployment process. 

gbm.save_model('model.pkl')

We can now run predictions and save them in a CSV file. 

# Generate predictions on the test set and save them to a CSV file
predictions = gbm.predict(X_test)
pd.DataFrame(predictions, columns=["Predictions"]).to_csv("light_predictions.csv")

When building machine learning models, you need to manage and version all the above items (code, parameters, data versions, metrics, and predictions). You can do this using Git, spreadsheets, configs, filesystems, etc. However, today I will show you how to get all that versioned using Neptune.ai.

Organizing ML development in Neptune

Installing packages and setting up Neptune 

First, we need to install the Neptune Client package. 

pip install neptune-client

Using Neptune Notebooks, we can save our Notebook checkpoints to Neptune. So let’s install that too:

pip install neptune-notebooks

For that integration to be complete, we need to enable this extension:

jupyter nbextension enable --py neptune-notebooks

Note: If you are not using Notebooks, you can skip this part.

While we are installing packages, let’s also get the Neptune Contrib package out of the way. This package will enable us to log our metrics to Neptune while training the LightGBM model.

pip install neptune-contrib[monitoring]

Connect your script to Neptune 

For the Neptune client to communicate with Neptune, we need to set up an account and obtain an API token. The token can be found by clicking your profile picture once you are logged in.

Neptune getting started

The first step is to create a project. While you are logged into your account, this can be done under the projects tab.

Neptune new project

After that, we need to initialize the communication between our code and Neptune.

The first step is to connect our Notebook to Neptune by clicking the Neptune logo.

Neptune connect

You will now be prompted to enter your API Token. Once the connection is successful, you can upload your Notebook to Neptune by clicking the upload button.

Neptune configure API

After that, we use neptune.init to initialize the communication between our code and the Neptune project.

import neptune
neptune.init(project_qualified_name='mwitiderrick/LightGBM', api_token='YOUR_API_KEY')
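
If you prefer not to hard-code the token in your script or Notebook, you can also store it in the NEPTUNE_API_TOKEN environment variable, which the Neptune client reads when no api_token is passed. A minimal sketch, assuming a Unix-like shell:

export NEPTUNE_API_TOKEN='YOUR_API_KEY'

With the variable set, the call becomes neptune.init(project_qualified_name='mwitiderrick/LightGBM').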

Create an experiment and save hyperparameters 

The first thing we need to do to start logging to Neptune is to create an experiment. It’s a namespace to which you can log metrics, predictions, visualizations, and anything else (full list here). 

Let’s create an experiment and log model hyperparameters. 

neptune.create_experiment('LightGBM',params=params)

Running neptune.create_experiment outputs a link to that experiment in Neptune. 

You can click on it to see the training process live. 

Right now, not much is logged, but we can see hyperparameters in the parameters section.

The parameters tab shows the parameters used to train the LightGBM model.  

Neptune lightgbm parameters

Create Neptune callback and pass it to `train` 

To log the training metrics to Neptune, we use an out-of-the-box callback from the neptune-contrib library. It’s pretty cool because it’s the only thing we need to add at the training stage. 

With that callback set up, Neptune takes care of the rest. 

import lightgbm as lgb
from neptunecontrib.monitoring.lightgbm import neptune_monitor

gbm = lgb.train(params,
                lgb_train,
                num_boost_round=200,
                valid_sets=[lgb_train, lgb_eval],
                valid_names=['train', 'valid'],
                callbacks=[neptune_monitor()],
                )

Note: When working in Notebooks, once you are done running the experiment, ensure that you run neptune.stop() to finish the current work (in scripts the experiment is stopped automatically).

Clicking on the project on Neptune will show all the experiments related to that specific project. 

Neptune lightgbm projects

Clicking on a single experiment will show the charts and logs for that specific experiment. 

Neptune lightgbm charts

The log section shows the training and validation metrics that were used to generate the charts above. 

Interestingly, we can monitor the RAM and CPU usage as our model is training. This information is found in the Monitoring section of the experiment.

Neptune lightgbm ram

As we look at the graphs, Neptune allows us to zoom in and out at various places. This is important for a more in-depth analysis of the training of the model. 

Neptune lightgbm training

Further, we can select several experiments and compare their performance. 

Neptune lightgbm compare

Version test metrics

Neptune also allows us to log our test metrics. This is done using the neptune.log_metric function.

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

neptune.log_metric('Root Mean Squared Error', np.sqrt(mean_squared_error(y_test, predictions)))
neptune.log_metric('Mean Squared Error', mean_squared_error(y_test, predictions))
neptune.log_metric('Mean Absolute Error', mean_absolute_error(y_test, predictions))

Neptune lightgbm metrics

Version dataset

Logging a hash of your dataset in Neptune can also be very useful. It enables you to track which version of the data each experiment used. This can be done with Python’s hashlib module and Neptune’s set_property function.

import hashlib
neptune.set_property('x_train_version', hashlib.md5(X_train.values).hexdigest())
neptune.set_property('y_train_version', hashlib.md5(y_train.values).hexdigest())
neptune.set_property('x_test_version', hashlib.md5(X_test.values).hexdigest())
neptune.set_property('y_test_version', hashlib.md5(y_test.values).hexdigest())

After that, you can see the versions under the details tab of your project. 

Neptune lightgbm version

You can also use a data versioning tool such as DVC to manage the versions of your dataset. Thereafter, you can log the .dvc file to Neptune.

In order to do that, you first have to add the file to DVC. This is done from the terminal, in your current working directory.

$ dvc add data.csv

This creates the .dvc file that you can log to Neptune. 

neptune.log_artifact('data.csv.dvc')

Version model binary

You can also save various versions of the model to Neptune by using neptune.log_artifact().
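
For instance, assuming the model file saved earlier (model.pkl), a minimal call would be:

neptune.log_artifact('model.pkl')

Each call uploads the file to the artifacts section of the current experiment, so you can log a new binary for every training run.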

Neptune lightgbm artifacts

Version whatever else you think you will need

Neptune also offers the ability to log other things such as model explainers and interactive charts using your favorite plotting library. 

Logging the explainer is done using the log_explainer function. 

from neptunecontrib.api import log_explainer, log_global_explanations
import dalex as dx

# 'model' is the trained LightGBM model (for example, the Scikit-learn wrapper shown in the next section)
expl = dx.Explainer(model, X, y, label="LightGBM")
log_global_explanations(expl, numerical_features=["F1","F2","F3","F4","F5","F6","F7","F8","F9","F10"])
log_explainer('explainer.pkl', expl)

After doing this, the pickled explainer and charts will be available in the artifacts section of the experiment.

Neptune lightgbm explainer
Neptune lightgbm artifacts

It is also important to note that logging works even if you are using the LightGBM Scikit-learn wrapper. The only thing you need to do is pass the Neptune callback at the fit stage of the model. Notice that you can pass the evaluation set as well as the evaluation metrics.

model.fit(X_train, y_train,
          eval_set=[(X_train, y_train), (X_test, y_test)],
          eval_metric=['mean_squared_error', 'root_mean_squared_error'],
          callbacks=[neptune_monitor()])
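
In the snippet above, model is an instance of the LightGBM Scikit-learn wrapper, which we have not created explicitly. A minimal, illustrative construction (the hyperparameter values here are assumptions that loosely mirror the params dictionary used earlier) could look like this:

from lightgbm import LGBMRegressor

# Illustrative settings; adjust to match your own configuration
model = LGBMRegressor(boosting_type='gbdt', num_leaves=40, learning_rate=0.1, n_estimators=200)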

Organize experiments in a dashboard

With Neptune, you have the flexibility to decide what you want to see on your dashboard. 

You can add or remove columns from your dashboard as you please. For example, you can add the “created in Notebook” column to get immediate access to the Notebook checkpoint.

Neptune lightgbm columns

You can also sort the dashboard by any of the columns, in either descending or ascending order. If you would like to remove a column, you just click on the x button.

Neptune lightgbm columns 1

Neptune also allows you to group your experiments into views and save them. The saved views can be shared or pinned on the dashboard.

Neptune lightgbm save

Collaborate on ML experiments with your team 

Neptune experiments can be shared by inviting your teammates to collaborate.

Neptune invite team

A project can be shared by first making it public. Once it’s public you can share it freely with anyone with the link. 

Neptune share project

Note:

When using the Team Plan, you can share your private projects with your teammates. The Team Plan is also free for research, non-profit organizations, and Kagglers.

You can share anything you do on Neptune. For example, I can share the comparison I did earlier by sending a link.

Neptune lightgbm share

Download model artifacts programmatically

Neptune also allows you to download files from any experiment programmatically, straight from your Python code. For example, to download the model we uploaded earlier, we fetch the experiment object and call its download_artifact method. The model is then stored in a model folder in our current working directory.

project = neptune.init('mwitiderrick/LightGBM', api_token='YOUR_TOKEN')
my_exp = project.get_experiments(id='LIG-8')[0]
my_exp.download_artifact("light.pkl", "model")

This comes in handy when you want to move your model to production. That is however a topic for another article. 

Conclusion

Hopefully, this has shown you how easy it is to add experiment tracking and model versioning to your LightGBM training scripts using Neptune. 

Specifically, we covered how to:

  • set up Neptune 
  • use Neptune Callbacks to log our LightGBM training session
  • analyze and compare experiments in Neptune
  • version various items on Neptune 
  • collaborate with team members 
  • download your artifacts from Neptune

Hopefully, with all this information, developing LightGBM models will now be cleaner and more manageable.

Thanks for reading!

