What Image Processing Techniques Are Actually Used in the ML Industry?

29th August, 2023

Image processing can be used to improve the quality of an image or to help you extract useful information from it. It's useful in fields like medical imaging, and it can even be used to hide data inside an image.

In this article, I'll tell you how Image Processing can be applied in Machine Learning, and what techniques you can use. First, let's explore some real-world examples of Image Processing.

Image processing in the real world

Medical imaging

In medicine, scientists study the inner structures and tissues of organisms in order to help identify anomalies faster. Image processing used in medical imaging can help produce high-quality, clear images for scientific and medical research, ultimately helping doctors diagnose diseases.

Security

A car dealership or a shop may install security cameras to monitor the area and record thieves if they appear. But footage generated by security cameras often needs to be processed, for example by upscaling it or increasing its brightness and contrast, to make the important details clearly visible.

Military and defense

An interesting application of Image Processing in defense is steganography. Experts can hide a message or an image inside another image, and send the information back and forth without any third party detecting the message.

General image sharpening and restoration

This is probably the most widely used application of Image Processing: enhancing and manipulating images with tools like Photoshop, or using filters on Snapchat or Instagram to make our photos look cooler.

If you need to process a massive number of images, doing it manually would be painfully tedious. This is where machine learning algorithms can boost the speed of image processing without losing the end quality we need.

Image Processing techniques used in the ML industry

Before I move on, it's important to mention that Image Processing is different from Computer Vision, but people often get these two mixed up.

Image Processing is only one aspect of Computer Vision, and the two are not the same. Image Processing systems focus on transforming images from one form to another, while Computer Vision systems help the computer understand and extract meaning from an image.

Many Computer Vision systems employ Image Processing algorithms. For example, a face enhancement app may use computer vision algorithms to detect faces in a photo, and then apply Image Processing techniques like smoothing or grayscale filters to them.
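
To make the distinction concrete, here is a minimal, hypothetical sketch using OpenCV (not part of the tutorial setup below): a pretrained Haar cascade detects faces (the computer vision step), and a Gaussian blur then smooths each detected region (the image processing step). The file name people.jpg is a placeholder.

import cv2

# Load a photo (placeholder file name) and convert it to grayscale for detection
img = cv2.imread('people.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Computer vision step: detect faces with a pretrained Haar cascade shipped with OpenCV
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Image processing step: smooth only the detected face regions
for (x, y, w, h) in faces:
    img[y:y+h, x:x+w] = cv2.GaussianBlur(img[y:y+h, x:x+w], (25, 25), 0)

cv2.imwrite('people_smoothed.jpg', img)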

Many advanced Image Processing methods leverage Machine Learning Models like Deep Neural Networks to transform images on a variety of tasks, like applying artistic filters, tuning an image for optimal quality, or enhancing specific image details to maximize quality for computer vision tasks. 

Convolutional Neural Networks (CNNs) take in an input image and apply learned filters to it, which lets them perform tasks like object detection, image segmentation, and classification.
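
For context, here is a minimal, illustrative CNN classifier in Keras. The layer sizes, the input shape, and the 10-class output are arbitrary choices made for this example, not taken from any particular model.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)),  # learned convolutional filters
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax'),  # e.g. 10 object classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()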

Apart from doing image manipulation, recent machine learning techniques make it possible for engineers to augment image data. A machine learning model is only as good as the dataset it's trained on – but what do you do when you don't have enough training data?

Instead of trying to find and label more datasets, we can construct completely new ones from what we have. We do this either by applying simple image transformation techniques (horizontal flipping, color space augmentations, zooming, random cropping) or using deep learning algorithms like Feature Space Augmentation & Autoencoders, Generative Adversarial Networks (GANs), and Meta-Learning.

Example image processing tasks using Keras (with code sample)

Let's learn how to apply data augmentation to generate an image dataset. We'll take a single image of a dog, apply transformations on it like right shift, left shift, and zoom to create completely new versions of the image, which can later be used as the training dataset for a computer vision task like object detection or classification.

Initial setup

Throughout this tutorial, we will be relying heavily on four Python packages:

  1. Keras: Keras has an Image Data Preprocessing class that allows us to perform data augmentation seamlessly. 
  2. Matplotlib: One of the most popular data visualization libraries in Python. It allows us to create figures and plots and makes it very easy to produce static raster or vector files without the need for any GUIs.
  3. Numpy: A very useful library for performing mathematical and logical operations on arrays. We'll be using its expand_dims function to expand the shape of an array in this tutorial (see the short sketch after this list).
  4. Pillow: A Python imaging library that we will use for opening and manipulating our image file in this tutorial.
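
As a quick illustration of the Numpy point above, here is what expand_dims does: the Keras data generator expects a batch of images, so we add a leading batch dimension to a single image array (the shape below is just an example).

import numpy as np
from numpy import expand_dims

image_array = np.zeros((224, 224, 3))   # a single (height, width, channels) image
batch = expand_dims(image_array, 0)     # shape becomes (1, 224, 224, 3)
print(image_array.shape, '->', batch.shape)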

Let's go ahead and install these libraries.

In your terminal/command prompt, type: 

pip3 list

to see the Python packages you already have installed on your computer. Then install the missing packages:

pip3 install numpy matplotlib keras pillow

Now that we have installed the necessary packages, letā€™s move on to step 1.

Step 1

Create a folder called data-aug-sample. Inside it, create a Python file called sample.py, then download a sample dog photo from the internet and save it as dog.jpg in this folder. Then import the libraries like this:

import matplotlib.pyplot as plt  # For plotting our visualizations
from keras.preprocessing.image import ImageDataGenerator  # Keras data augmentation class
from keras.preprocessing.image import img_to_array  # Converts a PIL image to a numpy array
from numpy import expand_dims  # Adds a batch dimension to the array
from PIL import Image  # For opening the image file

# %matplotlib inline  # Uncomment this line if running this code in a Jupyter notebook

image = Image.open('dog.jpg')
plt.imshow(image)
plt.show()

Now, our folder structure should look like this:

image processing folder structure

Save this file and run it in your terminal like this: python3 sample.py

You should see something like this:

image processing dog

Note: If you're running this tutorial in a Jupyter notebook, you must add the line %matplotlib inline after your imports to help matplotlib display the plots. I prefer to run all my Machine Learning programs in Python files instead of Jupyter notebooks, because it gives me a better feel for how things would work in a real project.

Step 2

Now let's begin applying transformation operations to the image.

Rotation

A rotation transformation rotates the image by an angle between 1° and 359°. In the example below, each generated image is rotated by a random angle of up to 90°. The Keras ImageDataGenerator class allows us to pass a rotation_range parameter for this purpose:

import matplotlib.pyplot as plt  # For plotting
from keras.preprocessing.image import ImageDataGenerator  # Keras data augmentation class
from keras.preprocessing.image import img_to_array
from numpy import expand_dims
from PIL import Image

image = Image.open('dog.jpg')

# Rotation
data = img_to_array(image)          # Convert the PIL image to a numpy array
samples = expand_dims(data, 0)      # Add a batch dimension: (1, height, width, channels)
data_generator = ImageDataGenerator(rotation_range=90)  # Rotate by a random angle of up to 90°
it = data_generator.flow(samples, batch_size=1)

for i in range(9):
    plt.subplot(330 + 1 + i)        # 3x3 grid of subplots
    batch = next(it)                # Generate one augmented image
    result = batch[0].astype('uint8')
    plt.imshow(result)
plt.show()

Running the code above gives us a new dog dataset:

image processing dataset

Translation

We can apply shift transformations to images horizontally or vertically. This kind of transformation is very useful for avoiding positional bias in the data. For example, training a Face Recognition model on a dataset where all the faces are centered in the images would introduce positional bias, making the model perform very poorly on new faces that are positioned to the left or right. We'll be using the height_shift_range and width_shift_range parameters of the ImageDataGenerator class for this purpose.

Applying vertical shift transformation to our dog image:

import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array
from numpy import expand_dims
from PIL import Image

image = Image.open('dog.jpg')

data = img_to_array(image)
samples = expand_dims(data, 0)
data_generator = ImageDataGenerator(height_shift_range=0.5)  # Shift vertically by up to 50% of the image height
it = data_generator.flow(samples, batch_size=1)
for i in range(9):
    plt.subplot(330 + 1 + i)
    batch = next(it)
    result = batch[0].astype('uint8')
    plt.imshow(result)
plt.show()

Our result would look like this:

img processing transformation

Applying horizontal shift transformation to our dog image:

import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array
from numpy import expand_dims
from PIL import Image

image = Image.open('dog.jpg')

data = img_to_array(image)
samples = expand_dims(data, 0)
data_generator = ImageDataGenerator(width_shift_range=[-100, 100])  # Shift horizontally by up to 100 pixels
it = data_generator.flow(samples, batch_size=1)
for i in range(9):
    plt.subplot(330 + 1 + i)
    batch = next(it)
    result = batch[0].astype('uint8')
    plt.imshow(result)
plt.show()

Our result would look like this:

img processing horizontal shift

Color space

Here we apply a transformation to the color space of our dog image by shifting its brightness, so each generated image is either a brightened or darkened version of the original. By simply specifying a brightness_range value (a tuple or list of two floats) in the ImageDataGenerator class, we set the range from which the brightness shift is randomly picked.

import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array
from numpy import expand_dims
from PIL import Image

image = Image.open('dog.jpg')

data = img_to_array(image)
samples = expand_dims(data, 0)
datagen = ImageDataGenerator(brightness_range=[0.2, 1.0])  # Randomly darken the image (1.0 keeps the original brightness)
it = datagen.flow(samples, batch_size=1)
for i in range(9):
    plt.subplot(330 + 1 + i)
    batch = next(it)
    result = batch[0].astype('uint8')
    plt.imshow(result)
plt.show()

The resulting set is:

img processing color

Zooming

As the name implies, we can get zoomed-in or zoomed-out versions of our dog image by simply passing the zoom_range parameter to the ImageDataGenerator class.

import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array
from numpy import expand_dims
from PIL import Image

image = Image.open('dog.jpg')

data = img_to_array(image)
samples = expand_dims(data, 0)
datagen = ImageDataGenerator(zoom_range=[0.2, 1.0])  # Zoom factor randomly picked between 0.2 and 1.0
it = datagen.flow(samples, batch_size=1)
for i in range(9):
    plt.subplot(330 + 1 + i)
    batch = next(it)
    result = batch[0].astype('uint8')
    plt.imshow(result)
plt.show()

The resulting set is:

img processing zooming

Flipping

A flip transformation mirrors the image horizontally or vertically; we enable it by setting horizontal_flip=True or vertical_flip=True.

import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array
from numpy import expand_dims
from PIL import Image

image = Image.open('dog.jpg')

data = img_to_array(image)
samples = expand_dims(data, 0)
datagen = ImageDataGenerator(vertical_flip=True)  # Randomly flip the image upside down
it = datagen.flow(samples, batch_size=1)
for i in range(9):
    plt.subplot(330 + 1 + i)
    batch = next(it)
    result = batch[0].astype('uint8')
    plt.imshow(result)
plt.show()

Our resulting set is:

img processing flipping

With this new dataset we generated, we can clean it, eliminating images that are skewed or carry no meaningful information, and then use it to train an object detection model or a dog classifier.
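
To actually use the generated images as training data, we need to write them to disk. Here is a sketch of one way to do that with the same ImageDataGenerator, using its save_to_dir, save_prefix, and save_format arguments; the folder name 'augmented' and the particular transformations chosen here are arbitrary.

import os
from numpy import expand_dims
from keras.preprocessing.image import ImageDataGenerator, img_to_array
from PIL import Image

os.makedirs('augmented', exist_ok=True)   # folder to collect the generated images

image = Image.open('dog.jpg')
samples = expand_dims(img_to_array(image), 0)

datagen = ImageDataGenerator(rotation_range=90, horizontal_flip=True, zoom_range=[0.5, 1.0])
it = datagen.flow(samples, batch_size=1,
                  save_to_dir='augmented', save_prefix='dog', save_format='jpeg')

# Each call to next() writes one augmented image into the 'augmented' folder
for _ in range(9):
    next(it)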

Conclusion

Machine Learning algorithms allow you to do image processing at scale, and with great detail. I hope you learned a thing or two about how Image Processing can be used in Machine Learning, and don't forget that Image Processing is not the same as Computer Vision!
