The Best Posts in 2020 – Neptune’s Blog Summary

Posted December 31, 2020

It’s the last day of the year, so we decided to go with the end-of-the-year flow and create our own blog summary! It was a very active time on Neptune’s blog – tons of articles written by amazing authors, thousands of blog visitors, and many important topics covered. 

We always say that visualization is the key to understanding data, so here are a few highlights of 2020. 😉

[Image: Neptune blog summary]

But that’s not all. I checked which articles were visited and read the most, and prepared a list of the top posts in various categories.

Experiment management

ML Experiment Tracking: What It Is, Why It Matters, and How to Implement It

This post was only published in November but has already gained a huge audience. I guess that’s because the topic is not covered enough in the industry literature. Experiment tracking is the part of MLOps that focuses on the iterative model development phase, when you try many things to get your model performance to the level you need. Here, Jakub Czakon explains its purpose, best practices, and implementation process.

How to Structure and Manage Natural Language Processing (NLP) Projects

ML projects are messy. You may start clean, but for some reason things get in the way. This is why, in this article, Dhruvil Karani shares key pointers, guidelines, tips, and tricks that can help you stay on top of things and keep your NLP projects (mostly) in check. And it looks like it was very much needed!

General knowledge

10 Real-Life Applications of Reinforcement Learning

If you’re curious about how Reinforcement Learning can be used in real life (and we noticed that a lot of you are), here’s an article for you. In this post, you’ll find 10 examples from areas like engineering, news recommendation, gaming, robotics, and more.

Random Forest Regression: When Does It Fail and Why?

Derrick Mwiti looks at a major problem with using random forest for regression: extrapolation. He compares random forest regression with linear regression, explains why a random forest cannot extrapolate beyond the range of its training data, and presents potential solutions.
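
To make the extrapolation problem concrete, here’s a minimal sketch in plain Python. It is not code from Derrick’s article: the tiny one-feature tree and forest below are a simplified stand-in for a real random forest implementation, the helper names (`fit_tree`, `fit_forest`, `fit_line`, and so on) are ours, and the toy data y = 2x is made up for illustration.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(points, min_leaf=2):
    """Fit a 1-D regression tree on (x, y) pairs; a leaf is the mean of its targets."""
    points = sorted(points)
    ys = [y for _, y in points]
    best = None  # (error, split index)
    for i in range(min_leaf, len(points) - min_leaf + 1):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical x values
        error = sse(ys[:i]) + sse(ys[i:])
        if best is None or error < best[0]:
            best = (error, i)
    if best is None:
        return mean(ys)  # leaf node
    i = best[1]
    threshold = (points[i - 1][0] + points[i][0]) / 2
    return (threshold, fit_tree(points[:i], min_leaf), fit_tree(points[i:], min_leaf))


def predict_tree(tree, x):
    """Walk down the tree until a leaf (a plain number) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def fit_forest(points, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_tree([rng.choice(points) for _ in points]) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return mean([predict_tree(tree, x) for tree in forest])


def fit_line(points):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    xs, ys = zip(*points)
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sum((x - x_bar) ** 2 for x in xs)
    return slope, y_bar - slope * x_bar


if __name__ == "__main__":
    train = [(x, 2.0 * x) for x in range(10)]  # toy data: x in 0..9, y = 2x
    forest = fit_forest(train)
    slope, intercept = fit_line(train)
    print("forest prediction at x=20:", predict_forest(forest, 20))
    print("linear prediction at x=20:", slope * 20 + intercept)
```

Because every leaf stores an average of training targets, the forest’s prediction at x = 20 can never exceed the largest target seen in training (18 here), while the fitted line keeps following the trend.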

AI Limits: Can Deep Learning Models Like BERT Ever Understand Language?

A few months ago, a natural language model (GPT-3) wrote an article for The Guardian. Understandably, this caused a flurry of apocalyptic, Terminator-esque social media buzz. In this article, Cathal Horan asks what the limitations of such models are.

Tips and Tricks from Kaggle Competitions

We have a few posts on the blog where we gathered tips & tricks from Kaggle competitions. There’s a huge audience visiting them every month and they seem to be very helpful for Kagglers. 

Tutorials

How to Deal with Files in Google Colab: Everything You Need to Know

Google Colab comes with (almost) all the setup you need to start coding, but what it doesn’t have out of the box is your datasets. In this post, Siddhant Sadangi explains how to load data to Colab from a multitude of data sources and how to write back to those data sources from within Colab. It seems that it was a useful tutorial for many of you.

Understanding LightGBM Parameters (and How to Tune Them)

In this article, Jakub Cieślik explores his go-to algorithm for most tabular data problems – LightGBM. After reading it, you’ll know which parameters are important in general, which regularization parameters need to be tuned, how to tune LightGBM parameters in Python, and more. 

How to Train Your Own Object Detector Using TensorFlow Object Detection API

In the past, creating a custom object detector looked like a time-consuming and challenging task. Now, with tools like TensorFlow Object Detection API, we can create reliable models quickly and with ease. TensorFlow Object Detection API got a lot of attention this year, and this tutorial was definitely one of the favorites.

Deep Dive into TensorBoard: Tutorial With Examples

This in-depth TensorBoard tutorial covers visualizing images in TensorBoard, visualizing the model’s architecture, sending custom diagnostic charts to TensorBoard, using it with Keras, PyTorch, and XGBoost, and more. We heard it’s one of the best TensorBoard tutorials online! 😉

How to Make Sense of the Reinforcement Learning Agents? What and Why I Log During Training and Debug

Whether you’re just starting out in Reinforcement Learning or you already have some experience under your belt, this article will help you decide what to keep track of to inspect and debug your agent’s learning trajectory.

Guides

Keras Loss Functions: Everything You Need To Know

This is a complete guide to Keras loss functions. You’ll get to know the available functions, how to use them, how to define your own custom loss function in Keras, how to avoid NaNs in the loss, how to monitor the loss function via plotting and callbacks, and more.

PyTorch Loss Functions: The Ultimate Guide

Looks like people are really interested in loss functions! In this article, Alfrick Opidi talks about popular loss functions in PyTorch, and about building custom loss functions. You’ll find out what loss functions are, how to add PyTorch loss functions, which loss functions are available in PyTorch, and how to create a custom loss function in PyTorch.

Hyperparameter Tuning in Python: a Complete Guide 2020

Choosing the right hyperparameters for Machine Learning or Deep Learning models is a common way to squeeze the last bit of performance out of them. Read about how to do it well in this guide prepared by Shahul Es.

Tools

15 Best Tools for Tracking Machine Learning Experiments

In this article, we explain why data scientists and machine learning engineers need a tool for tracking machine learning experiments and which software is best for the job. We received very positive feedback, especially about the comparison table in the article – it’s a breakdown of all the important features and integrations of 15 different experiment tracking tools.

The Best MLOps Tools You Need to Know as a Data Scientist

MLOps has been a hot topic this year, so it’s not a surprise that people are looking for the best tools for that. Here, we recommend the best MLOps tools divided into 6 categories: data and pipeline versioning, run orchestration, experiment tracking and organization, hyperparameter tuning, model serving, and production model monitoring.

8 Creators and Core Contributors Talk About Their Model Training Libraries From PyTorch Ecosystem

Jakub Czakon asked the authors of 6 high-level training APIs in the PyTorch Ecosystem to explain the differences between them. The result is lots of first-hand information about really great tools.

To wrap it up

It was a great year on our blog, and we’re very happy that so many people found something interesting here! Thanks for visiting our place on the Internet, and stay tuned for more in the New Year. We are not slowing down!

Marketing Assistant at Neptune.ai

READ NEXT

ML Experiment Tracking: What It Is, Why It Matters, and How to Implement It

Jakub Czakon | Posted November 26, 2020

Let me share a story that I’ve heard too many times.

“…We were developing an ML model with my team. We ran a lot of experiments and got promising results…

…unfortunately, we couldn’t tell exactly what performed best because we forgot to save some model parameters and dataset versions…

…after a few weeks, we weren’t even sure what we had actually tried, and we needed to re-run pretty much everything.”

– unfortunate ML researcher.

And the truth is, when you develop ML models you will run a lot of experiments.

Those experiments may:

  • use different models and model hyperparameters
  • use different training or evaluation data
  • run different code (including that small change you wanted to test quickly)
  • run the same code in a different environment (not knowing which PyTorch or TensorFlow version was installed)

And as a result, they can produce completely different evaluation metrics. 

Keeping track of all that information can very quickly become really hard, especially if you want to organize and compare those experiments and feel confident that you know which setup produced the best result.

This is where ML experiment tracking comes in. 
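
As a rough illustration of the bookkeeping described above, here’s a minimal sketch that uses only the Python standard library. It is not Neptune’s API or the approach from the full article: the function names (`log_experiment`, `best_experiment`), the JSON Lines file format, and the example values are all assumptions made for this sketch. Each run is stored together with its hyperparameters, data version, code version, environment, and metrics, so runs can be compared later.

```python
import json
import platform
import sys
import time
from pathlib import Path


def log_experiment(log_path, params, data_version, code_version, metrics):
    """Append one experiment record to a JSON Lines file and return the record."""
    record = {
        "timestamp": time.time(),
        "params": params,              # model and hyperparameters
        "data_version": data_version,  # which training/evaluation data was used
        "code_version": code_version,  # e.g. a commit hash you pass in yourself
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "metrics": metrics,            # evaluation results of this run
    }
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def best_experiment(log_path, metric):
    """Return the logged record with the highest value of the given metric."""
    with Path(log_path).open(encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    return max(records, key=lambda r: r["metrics"][metric])


if __name__ == "__main__":
    # Hypothetical runs; the values are placeholders, not real results.
    log_experiment("experiments.jsonl", {"lr": 0.1}, "data-v1", "abc123", {"accuracy": 0.81})
    log_experiment("experiments.jsonl", {"lr": 0.01}, "data-v1", "abc123", {"accuracy": 0.87})
    print(best_experiment("experiments.jsonl", "accuracy")["params"])
```

A flat file like this quickly gets hard to search and share across a team, which is the gap dedicated experiment tracking tools aim to fill.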

Continue reading ->