In football, it is always striking to watch a team beat a stronger opponent. Viewers often try to predict the score of a match by sizing up the players on each side (their capabilities and strengths). Wouldn't it be interesting to build an automated machine learning model that can track the players on the pitch, as a first step towards predicting a player's next move?
To explore this idea, let's dive into how computer vision techniques like object detection and tracking can be applied to monitor players right on the football pitch.
The notebook for this work can be found here. All of the code was written and run on Colab.
Here is an outline of what we will be looking into:
- Data sourcing
- LabelImg
- Data preparation (exporting as TensorFlow records)
- Model pipeline
- Modelling, training (and logging)
- Model evaluation
- Result and conclusion
Data sourcing
To get started with football analytics, we first need source data for the algorithm to learn from. In this project, the source data is taken from here. It is a highlights video of a match between Chelsea (2) and Man City (1), a game in which Chelsea managed to beat Man City with some unique tactics from the players.
From the video, 152 images were extracted and saved using the code below.
import cv2

# ChelseaManCity holds the path to the downloaded highlights video
vidcap = cv2.VideoCapture(ChelseaManCity)

def getFrame(sec):
    # Jump to the requested position (in milliseconds) and grab a frame
    vidcap.set(cv2.CAP_PROP_POS_MSEC, sec * 1000)
    hasFrames, image = vidcap.read()
    if hasFrames:
        cv2.imwrite("images/frame" + str(count) + ".jpg", image)  # save frame as a JPG file
    return hasFrames

sec = 2
frameRate = 2   # capture one frame every 2 seconds
count = 88      # starting index for the saved frame filenames
success = getFrame(sec)
while success:
    count = count + 1
    sec = sec + frameRate
    sec = round(sec, 2)
    success = getFrame(sec)
The resulting images were saved in a folder. Since the aim is to track the various players (objects) on the football pitch, it is important that the data is labelled so that the algorithm can map each image to the correct target.
Labelling is an important step in helping an algorithm capture the information present in the data. We want to present the problem as a supervised learning problem (the machine learns from input data and target labels), rather than an unsupervised one (the machine finds patterns in the input data without any target labels).
One advantage of this is that it reduces computation, speeds up learning, and makes evaluation much easier. Since we are dealing with image data (video frames), we need a good annotation toolbox to efficiently label the various objects present on the football pitch. In this project, LabelImg was used as the labelling toolkit.
LabelImg
LabelImg is an open-source graphical image annotation tool. The source code for running it locally can be found here. On a football pitch we have the following:
- Players from 2 opponent teams (in this case, Chelsea and Man City),
- The referees,
- The goalkeepers.
From the above, we have 4 different classes to label across the 152 images extracted from the video. We could train the algorithm to track and classify all 4 classes, but for the sake of simplicity I reduced them to 3:
- Chelsea — Class 0
- Man City — Class 1
- The referees and goalkeepers — Class 2
So I ended up with 3 classes. The reason for merging the last two is that classes 0 and 1 had far more labels than the referees and goalkeepers individually. Since both the referees and the goalkeepers are under-represented across the 152 frames, we group them into a single class called 'Unknown'. A sketch of the resulting label map is shown below.
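To make the class structure concrete, here is a minimal sketch of what a label map for these three classes could look like (the actual foot_label_map.pbtxt generated later by Roboflow may use different names, and note that TF Object Detection API label map ids start from 1, not 0):

# Hypothetical sketch of the label map for the three classes described above.
label_map_content = """
item {
  id: 1
  name: 'Chelsea'
}
item {
  id: 2
  name: 'Man_City'
}
item {
  id: 3
  name: 'Unknown'
}
"""

with open('foot_label_map.pbtxt', 'w') as f:
    f.write(label_map_content)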
Now that all classes across the frames have been correctly labelled, we can go ahead and model with an object detection architecture called EfficientDet. Note that before modelling, we need to convert the labelled data, together with the generated XML files (one per frame, containing the bounding box for each label in that image), into TensorFlow records.

Data preparation (exporting as TensorFlow records)
Now that we have labelled the data, we can go ahead and export it as TensorFlow records. For this purpose, we will use the roboflow.com platform. All you need to do is follow these steps:
- After creating an account with roboflow.com, go ahead and click on create a dataset.
- Fill in your dataset details and upload the data (images and .xml files). Make sure you select object detection as the type of computer vision task.
- You can add preprocessing and augmentation steps if needed.
- Finally, in the top right corner after the upload, click on generate TensorFlow records. Select 'generate as code'; this gives you a link for downloading the TensorFlow record data of your train and test sets.
Model pipeline
Now that we have our TensorFlow records exported for the train and test data, the next step is to model with EfficientDet. Before modelling, a few prerequisites need to be satisfied and a model pipeline has to be put in place. This includes the following:
- Installation of the TensorFlow Object Detection API.
- Setting up the object detection architecture.
- Setting up the configuration file.
Installation of TensorFlow Object Detection API
The following steps show how to install the TensorFlow 2 Object Detection API on Colab. First, clone the TensorFlow models repository from GitHub using the code below:
import os
import pathlib

# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
    while "models" in pathlib.Path.cwd().parts:
        os.chdir('..')
elif not pathlib.Path('models').exists():
    !git clone --depth 1 https://github.com/tensorflow/models
Next, install the TensorFlow Object Detection API using the following commands.
%%bash
# Install the Object Detection API
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .
Having installed the TensorFlow Object Detection API, run the imports below to confirm that the installation succeeded.
import matplotlib
import matplotlib.pyplot as plt
import os
import random
import io
import imageio
import glob
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import colab_utils
from object_detection.builders import model_builder
%matplotlib inline
Now that the API is correctly installed, we could go ahead and train, but before doing that let's run the model builder test. This test confirms that the installation and the library imports required for object detection modelling are in place. To do this, run the code below.
#run model builder test
!python /content/models/research/object_detection/builders/model_builder_tf2_test.py
Moving on with the modelling pipeline, before setting up the proposed architecture we download the exported data, in TFRecord format, from roboflow.com using the code below.
Note that after generating your data as TFRecords on the roboflow.com platform, you get an export link. Insert that link into the snippet below and run it to download the data into Colab.
#Downloading data from Roboflow
#UPDATE THIS LINK - get our data from Roboflow
%cd /content
!curl -L "[insert Link]" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
We have come a long way. Just one more thing before setting up the configuration file: we need to specify the paths to the train and test data (these will be needed when setting up the configuration file).
The train and test data just downloaded from roboflow.com sit in the current directory. Your paths should look like mine below:
train_record_fname = '/content/train/foot.tfrecord'
test_record_fname = '/content/test/foot.tfrecord'
label_map_pbtxt_fname = '/content/train/foot_label_map.pbtxt'
Congratulations on coming this far and putting everything in place for smooth training. Now let's set up the object detection architecture and the configuration file before training.
Setting up the object detection architecture
The object detection architecture chosen for this problem is EfficientDet. EfficientDet comes in several variants; here we consider four of them (D0, D1, D2, and D3). The code below shows the model config for D0 through D3, with each variant's model name and base_pipeline_file (configuration file).
##change chosen model to deploy different models available in the TF2 object detection zoo
MODELS_CONFIG = {
    'efficientdet-d0': {
        'model_name': 'efficientdet_d0_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',
        'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',
        'batch_size': 16
    },
    'efficientdet-d1': {
        'model_name': 'efficientdet_d1_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d1_640x640_coco17_tpu-8.config',
        'pretrained_checkpoint': 'efficientdet_d1_coco17_tpu-32.tar.gz',
        'batch_size': 16
    },
    'efficientdet-d2': {
        'model_name': 'efficientdet_d2_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d2_768x768_coco17_tpu-8.config',
        'pretrained_checkpoint': 'efficientdet_d2_coco17_tpu-32.tar.gz',
        'batch_size': 16
    },
    'efficientdet-d3': {
        'model_name': 'efficientdet_d3_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d3_896x896_coco17_tpu-32.config',
        'pretrained_checkpoint': 'efficientdet_d3_coco17_tpu-32.tar.gz',
        'batch_size': 16
    }
}
In this tutorial, we use the smallest and most lightweight EfficientDet model, D0. Scaling up to the larger EfficientDet variants requires more computing power. For training, you can start with 5,000 steps (you may want to increase this if the loss is still decreasing). The number of evaluation steps is set to 500.
The code below sets up this model configuration.
chosen_model = 'efficientdet-d0'
num_steps = 5000 #The more steps, the longer the training. Increase if your loss function is still decreasing and validation metrics are increasing.
num_eval_steps = 500 #Perform evaluation after so many steps
model_name = MODELS_CONFIG[chosen_model]['model_name']
pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']
base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']
batch_size = MODELS_CONFIG[chosen_model]['batch_size'] #if you can fit a large batch in memory, it may speed up your training
Having done this, let's move on to downloading the pre-trained weights for the architecture specified in the code above (one of D0, D1, D2, and D3). The code below does this for us:
#download pretrained weights
%mkdir /content/models/research/deploy/
%cd /content/models/research/deploy/
import tarfile
download_tar = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/' + pretrained_checkpoint
!wget {download_tar}
tar = tarfile.open(pretrained_checkpoint)
tar.extractall()
tar.close()
Let’s move on to writing our custom config file.
Setting up the configuration file
The configuration file is a file with the .config extension. It contains all the information required to successfully train the object detection model/architecture, including parameters like:
- Number of steps for training.
- Directory to the dataset for training and label_maps.
- Fine tune checkpoints.
- SSD model parameters such as the anchor generator, image resizer, box predictor, feature extractor and others.
By default, there is a base configuration file for every architecture. What needs to be updated in it are the paths to the fine-tune checkpoint, the label map, and the train and test data (in TFRecord format).
To achieve this, let's first download the base configuration file using the code below.
#download base training configuration file
%cd /content/models/research/deploy
download_config = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/' + base_pipeline_file
!wget {download_config}
Having done that, we set up the pipeline file name and the model checkpoint directory, as shown in the code below. You can also confirm the number of classes read from the label map (.pbtxt) file using the get_num_classes function shown below.
pipeline_fname = '/content/models/research/deploy/' + base_pipeline_file
fine_tune_checkpoint = '/content/models/research/deploy/' + model_name + '/checkpoint/ckpt-0'

def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())

num_classes = get_num_classes(label_map_pbtxt_fname)
For this problem, the number of classes is 3:
- Chelsea – Class 0
- Man-city – Class 1
- Unknown (referee, goalkeepers and others) – Class 2
Now, let's write the following information into the custom config file:
- Train data path,
- Test data path,
- Label map path,
- Checkpoint path.
The code below reads the base config file and writes these paths into it.
import re

%cd /content/models/research/deploy
print('writing custom configuration file')

with open(pipeline_fname) as f:
    s = f.read()
with open('pipeline_file.config', 'w') as f:
    # fine_tune_checkpoint
    s = re.sub('fine_tune_checkpoint: ".*?"',
               'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)
    # tfrecord files train and test.
    s = re.sub(
        '(input_path: ".*?)(PATH_TO_BE_CONFIGURED/train)(.*?")', 'input_path: "{}"'.format(train_record_fname), s)
    s = re.sub(
        '(input_path: ".*?)(PATH_TO_BE_CONFIGURED/val)(.*?")', 'input_path: "{}"'.format(test_record_fname), s)
    # label_map_path
    s = re.sub(
        'label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)
    # Set training batch_size.
    s = re.sub('batch_size: [0-9]+',
               'batch_size: {}'.format(batch_size), s)
    # Set training steps, num_steps
    s = re.sub('num_steps: [0-9]+',
               'num_steps: {}'.format(num_steps), s)
    # Set number of classes num_classes.
    s = re.sub('num_classes: [0-9]+',
               'num_classes: {}'.format(num_classes), s)
    # fine-tune checkpoint type
    s = re.sub(
        'fine_tune_checkpoint_type: "classification"', 'fine_tune_checkpoint_type: "{}"'.format('detection'), s)
    f.write(s)
You can confirm that the paths have been written to the file by running the code below:
%cat /content/models/research/deploy/pipeline_file.config
Now that we have a config file, let's go ahead and train. Before training, take note of the path to the config file and the directory where all training outputs will be saved. They should look like this:
pipeline_file = '/content/models/research/deploy/pipeline_file.config'
model_dir = '/content/training/'
Modelling and training
All things being equal, we can go ahead and train the model by running the model_main_tf2.py file. This script is used for training any object detection model with TensorFlow 2. For a successful run, all you need to specify is the following:
- Pipeline config path.
- Model directory.
- Number of training steps.
- Number of evaluation steps.
The code below helps us to do this.
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --sample_1_of_n_eval_examples=1 \
    --num_eval_steps={num_eval_steps}
From here, I kept rerunning training with a higher number of training steps until I had a satisfactory result. I finally stopped after 20,000 training steps, when I was certain the training loss was no longer decreasing. This took about 5-6 hours of training.
After those long hours of training, here is the TensorBoard plot illustrating the learning rate.
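If you want to watch these curves yourself while the job is running, you can launch TensorBoard directly in Colab and point it at the training directory (the TF2 Object Detection API writes its event files under the model directory; treat the exact path below as an assumption to adapt to your setup):

%load_ext tensorboard
%tensorboard --logdir /content/training/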

Next, let's export the model's trained inference graph. First, check where the model saved its checkpoints:
#see where our model saved weights
%ls '/content/training/'
Then run the conversion script, which exports the model parameters as an inference graph that can be reloaded whenever it is needed for real-time prediction.
#run conversion script
import re
import numpy as np

output_directory = '/content/fine_tuned_model'

#place the model weights you would like to export here
last_model_path = '/content/training/'
print(last_model_path)
!python /content/models/research/object_detection/exporter_main_v2.py \
    --trained_checkpoint_dir {last_model_path} \
    --output_directory {output_directory} \
    --pipeline_config_path {pipeline_file}
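Once exported, the resulting SavedModel can be reloaded for inference without rebuilding the training pipeline. A minimal sketch, assuming the exporter wrote a saved_model/ folder under the output directory (its default layout):

import numpy as np
import tensorflow as tf

# Load the exported detection model; the loaded object is directly callable
detect_fn_saved = tf.saved_model.load('/content/fine_tuned_model/saved_model')

# The model expects a batched uint8 image tensor of shape [1, H, W, 3]
dummy_image = np.zeros((1, 512, 512, 3), dtype=np.uint8)
detections = detect_fn_saved(tf.convert_to_tensor(dummy_image))

# Outputs include detection_boxes, detection_classes and detection_scores
print(detections['detection_scores'].shape)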
Model evaluation
To evaluate the model, here are the steps needed to be implemented:
- Load the last model checkpoint after training.
- Read in video frames.
- Identify objects in each frame (bounding boxes and confidence scores).
- Convert video frames to video.
To load the last model checkpoint after training, run the code below:
#recover our saved model
pipeline_config = pipeline_file
#generally you want to put the last ckpt from training in here
model_dir = '/content/training/ckpt-6'
configs = config_util.get_configs_from_pipeline_file(pipeline_config)
model_config = configs['model']
detection_model = model_builder.build(
    model_config=model_config, is_training=False)

# Restore checkpoint
ckpt = tf.compat.v2.train.Checkpoint(
    model=detection_model)
ckpt.restore(os.path.join('/content/training/ckpt-6'))

def get_model_detection_function(model):
    """Get a tf.function for detection."""

    @tf.function
    def detect_fn(image):
        """Detect objects in image."""
        image, shapes = model.preprocess(image)
        prediction_dict = model.predict(image, shapes)
        detections = model.postprocess(prediction_dict, shapes)
        return detections, prediction_dict, tf.reshape(shapes, [-1])

    return detect_fn

detect_fn = get_model_detection_function(detection_model)
Next, we read in video frames and pass them through the object detection model, which identifies bounding boxes and predicts the proper class for each one. The code below does this for a random test image; it relies on a small load_image_into_numpy_array helper, sketched first.
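The load_image_into_numpy_array helper is defined elsewhere in the full notebook; here is a minimal sketch of it, assuming it mirrors the standard TF Object Detection API tutorial version:

import numpy as np
import tensorflow as tf
from PIL import Image
from six import BytesIO

def load_image_into_numpy_array(path):
    """Load an image from file into a (height, width, 3) uint8 numpy array."""
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    (im_width, im_height) = image.size
    return np.array(image.convert('RGB')).reshape(
        (im_height, im_width, 3)).astype(np.uint8)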
#map labels for inference decoding
label_map_path = configs['eval_input_config'].label_map_path
label_map = label_map_util.load_labelmap(label_map_path)
categories = label_map_util.convert_label_map_to_categories(
    label_map,
    max_num_classes=label_map_util.get_max_label_map_index(label_map),
    use_display_name=True)
category_index = label_map_util.create_category_index(categories)
label_map_dict = label_map_util.get_label_map_dict(label_map, use_display_name=True)

import random

TEST_IMAGE_PATHS = glob.glob('/content/test/*.jpg')
image_path = random.choice(TEST_IMAGE_PATHS)
image_np = load_image_into_numpy_array(image_path)

input_tensor = tf.convert_to_tensor(
    np.expand_dims(image_np, 0), dtype=tf.float32)
detections, predictions_dict, shapes = detect_fn(input_tensor)

label_id_offset = 1
image_np_with_detections = image_np.copy()

viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np_with_detections,
    detections['detection_boxes'][0].numpy(),
    (detections['detection_classes'][0].numpy() + label_id_offset).astype(int),
    detections['detection_scores'][0].numpy(),
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=.5,
    agnostic_mode=False,
)
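The last evaluation step listed earlier, converting the annotated frames back into a video, is not shown in the excerpt above. Here is a minimal sketch of that step using OpenCV, assuming the extracted frames live in /content/images (the images/ folder from the extraction step) and reusing detect_fn and category_index from above:

import glob
import cv2
import numpy as np
import tensorflow as tf
from object_detection.utils import visualization_utils as viz_utils

frame_paths = sorted(glob.glob('/content/images/*.jpg'))  # assumed frames folder
writer = None

for path in frame_paths:
    frame_bgr = cv2.imread(path)
    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)

    # Run the detector on the current frame
    input_tensor = tf.convert_to_tensor(
        np.expand_dims(frame_rgb, 0), dtype=tf.float32)
    detections, _, _ = detect_fn(input_tensor)

    # Draw the predicted boxes on a copy of the frame
    annotated = frame_rgb.copy()
    viz_utils.visualize_boxes_and_labels_on_image_array(
        annotated,
        detections['detection_boxes'][0].numpy(),
        (detections['detection_classes'][0].numpy() + 1).astype(int),
        detections['detection_scores'][0].numpy(),
        category_index,
        use_normalized_coordinates=True,
        min_score_thresh=.5)

    # Lazily create the video writer from the first frame's size
    if writer is None:
        h, w, _ = annotated.shape
        writer = cv2.VideoWriter('detections.mp4',
                                 cv2.VideoWriter_fourcc(*'mp4v'), 2, (w, h))
    writer.write(cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR))

if writer is not None:
    writer.release()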
Result and conclusion
Having completed all of the above steps successfully, the model was able to identify players from the same team, and to place everyone else on the pitch into the 'Unknown' class, whether a referee or someone from a support team during the intermission. Below is a full video demonstration of the object detection architecture in action on video frames.
In conclusion, computer vision, as a deep learning field, has feasible applications in football analytics. Other things that could be done include:
- Ball tracking,
- Sentiment analysis of players on the field with computer vision,
- Reinforcement learning in predicting the next player to receive the ball, and many more.
If this article helped you understand computer vision in more detail, do share it with friends. Thanks for reading!
References
- roboflow.com
- heartbeat.fritz.ai/end-to-end-object-detection-using-efficientdet-on-raspberry-pi-3-part-2-bb5133646630
- github.com/tzutalin/labelImg
- medium.com/@iKhushPatel/convert-video-to-images-images-to-video-using-opencv-python-db27a128a481
- github.com/microsoft/VoTT
- cvat.org