MLOps Blog

Top 8 Image-Processing Python Libraries Used in Machine Learning

30th August, 2023

According to IDC, digital data will skyrocket to 175 zettabytes, and a large part of this data is images. Data scientists need to (pre)process these images before feeding them into any machine learning model. They have to do the important (and sometimes dirty) work before the fun part begins.

To process a large amount of data quickly and efficiently without compromising the results, data scientists need image processing tools for machine learning and deep learning tasks.

In this article, I am going to list the most useful image processing libraries in Python that are heavily used in machine learning tasks.

1. OpenCV


OpenCV is an open-source library that was developed by Intel in the year 2000. It is mostly used in computer vision tasks such as object detection, face detection, face recognition, and image segmentation, but it also contains a lot of useful functions that you may need in ML.


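Gray-scaling

import cv2 as cv
import matplotlib.pyplot as plt

# OpenCV loads images in BGR channel order
img = cv.imread('goku.jpeg')
gray_image = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

fig, ax = plt.subplots(1, 2, figsize=(16, 8))

# Convert BGR to RGB so that Matplotlib shows the true colors
ax[0].imshow(cv.cvtColor(img, cv.COLOR_BGR2RGB))

# The grayscale image has a single channel, so display it with a gray colormap
ax[1].imshow(gray_image, cmap='gray')
plt.show()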

A color image consists of 3 color channels, whereas a grayscale image consists of only 1 channel, which carries the intensity of each pixel, so the image is shown in shades of gray.
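To make the "3 channels to 1 channel" idea concrete, here is a minimal NumPy-only sketch of a weighted channel sum. The weights 0.299, 0.587 and 0.114 are the ones the OpenCV documentation gives for RGB-to-gray conversion; the function name `to_gray` and the tiny test image are hypothetical, and the rounding here may differ slightly from OpenCV's internal arithmetic.

```python
import numpy as np

def to_gray(rgb):
    """Collapse an (H, W, 3) RGB array into an (H, W) intensity array."""
    weights = np.array([0.299, 0.587, 0.114])  # R, G, B weights
    gray = rgb.astype(np.float64) @ weights
    return np.round(gray).astype(np.uint8)

# Hypothetical 1x3 image: one red, one green, and one white pixel
rgb = np.array([[[255, 0, 0], [0, 255, 0], [255, 255, 255]]], dtype=np.uint8)
gray = to_gray(rgb)
print(rgb.shape, gray.shape)
```

Green contributes the most to the result, so a pure green pixel ends up brighter than a pure red one.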

The following code separates each color channel:

import cv2 as cv
import matplotlib.pyplot as plt

img = cv.imread('goku.jpeg')
# cv.split returns the channels in BGR order
b, g, r = cv.split(img)

fig, ax = plt.subplots(1, 3, figsize=(16, 8))

# Each channel is a single-channel array, shown here as an intensity map
ax[0].imshow(r, cmap='gray')
ax[1].imshow(g, cmap='gray')
ax[2].imshow(b, cmap='gray')
plt.show()

Image translation

import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

image = cv.imread("pics/goku.jpeg")
h, w = image.shape[:2]

# Shift the image right by w/8 pixels and down by h/4 pixels
shift_y, shift_x = h // 4, w // 8
translation_matrix = np.float32([[1, 0, shift_x],
                                 [0, 1, shift_y]])

translated_image = cv.warpAffine(image, translation_matrix, (w, h))

plt.imshow(cv.cvtColor(translated_image, cv.COLOR_BGR2RGB))

The code above shifts every pixel by a fixed offset: w/8 pixels to the right and h/4 pixels down. The 2x3 matrix passed to cv.warpAffine encodes that offset in its last column.
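As a rough illustration of what the 2x3 matrix does, the sketch below applies it to a few corner points using only NumPy. The helper `translate_points` and the sample coordinates are made up for this example; cv.warpAffine does the equivalent mapping for every pixel.

```python
import numpy as np

def translate_points(points, tx, ty):
    """Apply a 2x3 translation matrix to an array of (x, y) points."""
    matrix = np.float32([[1, 0, tx],
                         [0, 1, ty]])
    # Append a 1 to every point so the last matrix column acts as an offset
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return homogeneous @ matrix.T

# Hypothetical corners of a 100x50 rectangle
corners = np.array([[0, 0], [100, 0], [100, 50], [0, 50]], dtype=float)
moved = translate_points(corners, tx=12, ty=25)
print(moved)
```

Every point moves by the same (tx, ty), which is why the shape of the object is preserved.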

Image rotation

import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

image = cv.imread("pics/goku.jpeg")

h, w = image.shape[:2]
rotation_matrix = cv.getRotationMatrix2D((w/2,h/2), -180, 0.5)

rotated_image = cv.warpAffine(image, rotation_matrix, (w, h))

plt.imshow(cv.cvtColor(rotated_image, cv.COLOR_BGR2RGB))

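The code above rotates the image by 180 degrees about its center and scales it to half its size. In cv.getRotationMatrix2D, positive angles rotate counter-clockwise, so -180 is a clockwise half-turn.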
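For readers curious about what cv.getRotationMatrix2D returns, here is a NumPy-only sketch that builds the same kind of 2x3 matrix from the formula in the OpenCV documentation. The function name `rotation_matrix_2d` and the 640x480 image size are assumptions made for illustration.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """Build a 2x3 rotation-plus-scale matrix about the given center."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    a = scale * np.cos(theta)
    b = scale * np.sin(theta)
    return np.array([[a, b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

w, h = 640, 480  # hypothetical image size
center = (w / 2, h / 2)
m = rotation_matrix_2d(center, -180, 0.5)

# The center of rotation is a fixed point of the transform
mapped_center = m @ np.array([center[0], center[1], 1.0])
# The top-left corner lands a quarter of the way in from the bottom-right corner
mapped_corner = m @ np.array([0.0, 0.0, 1.0])
print(mapped_center, mapped_corner)
```

The last column of the matrix is what keeps the chosen center in place; without it, the image would rotate about the top-left corner.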

Scaling and resizing

import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

image = cv.imread("pics/goku.jpeg")

fig, ax = plt.subplots(1, 3, figsize=(16, 8))

# scale the image to 0.15 times its original size (default: linear interpolation)
image_scaled = cv.resize(image, None, fx=0.15, fy=0.15)
ax[0].imshow(cv.cvtColor(image_scaled, cv.COLOR_BGR2RGB))
ax[0].set_title("Linear Interpolation Scale")

# scale the image to 2 times its original size
image_scaled_2 = cv.resize(image, None, fx=2, fy=2, interpolation=cv.INTER_CUBIC)
ax[1].imshow(cv.cvtColor(image_scaled_2, cv.COLOR_BGR2RGB))
ax[1].set_title("Cubic Interpolation Scale")

# resize to a fixed 200x400 (width x height), ignoring the aspect ratio
image_scaled_3 = cv.resize(image, (200, 400), interpolation=cv.INTER_AREA)
ax[2].imshow(cv.cvtColor(image_scaled_3, cv.COLOR_BGR2RGB))
ax[2].set_title("Skewed Interpolation Scale")

Scaling an image means resampling its pixel array to a smaller or larger size. The interpolation method determines how the new pixel values are computed.
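To give a feel for how shrinking can work, the sketch below downsizes a grayscale array by averaging non-overlapping blocks of pixels. This is similar in spirit to area-based interpolation, but it is a simplified illustration and not OpenCV's implementation; the function name `downscale_by_block_mean` is hypothetical.

```python
import numpy as np

def downscale_by_block_mean(image, factor):
    """Shrink a 2-D image by an integer factor, averaging each factor x factor block."""
    h, w = image.shape
    h_out, w_out = h // factor, w // factor
    # Drop any rows/columns that do not fill a complete block
    trimmed = image[:h_out * factor, :w_out * factor]
    blocks = trimmed.reshape(h_out, factor, w_out, factor)
    return blocks.mean(axis=(1, 3))

image = np.arange(16, dtype=float).reshape(4, 4)
small = downscale_by_block_mean(image, 2)
print(small)
```

Averaging whole blocks tends to avoid the jagged artifacts you get from simply dropping pixels, which is one reason area-style methods are a common choice for downscaling.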

These are some of the most basic operations that can be performed on an image with OpenCV. Apart from these, OpenCV also supports image segmentation, face detection, object detection, 3-D reconstruction, and feature extraction.

If you want to have a look at how these pictures were generated using OpenCV then you can check out this GitHub repository.

2. Scikit-Image

scikit-image is a Python-based image processing library with some parts written in Cython (a superset of the Python language designed to give C-like performance) to achieve good performance. It includes algorithms for:

  • Segmentation, 
  • Geometric transformations, 
  • Color space manipulation,
  • Analysis,
  • Filtering, 
  • Morphology,
  • Feature detection, and more 

You will find it useful for pretty much any computer vision task.

scikit-image uses NumPy arrays as image objects.

Operation using scikit-image

Active contour


In computer vision, contour models describe the boundaries of shapes in an image.

“Active contour models are defined for image segmentation based on the curve flow, curvature, and contour to obtain the exact target region or segment in the image.”

The following code fits an active contour, starting from a circle around the astronaut's face in the scikit-image sample image:

import numpy as np
import matplotlib.pyplot as plt
from skimage.color import rgb2gray
from skimage import data
from skimage.filters import gaussian
from skimage.segmentation import active_contour

img = rgb2gray(data.astronaut())

# Initial circular boundary as (row, column) coordinates,
# the order that recent scikit-image releases expect
s = np.linspace(0, 2*np.pi, 400)
r = 100 + 100*np.sin(s)
c = 220 + 100*np.cos(s)
init = np.array([r, c]).T

# formation of the active contour
cntr = active_contour(gaussian(img, 3), init, alpha=0.015, beta=10, gamma=0.001)

fig, ax = plt.subplots(1, 2, figsize=(7, 7))
ax[0].imshow(img, cmap='gray')
ax[0].set_title("Original Image")

# initial circle (dashed red) and fitted contour (blue)
ax[1].imshow(img, cmap='gray')
ax[1].plot(init[:, 1], init[:, 0], '--r', lw=3)
ax[1].plot(cntr[:, 1], cntr[:, 0], '-b', lw=3)
ax[1].set_title("Active Contour Image")
plt.show()

3. SciPy


SciPy is used for mathematical and scientific computations, but it can also perform multi-dimensional image processing using the submodule scipy.ndimage. It provides functions that operate on n-dimensional NumPy arrays, and at the end of the day images are just that.

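SciPy offers building blocks for commonly used image processing operations like:

  • Convolution and Filtering
  • Morphology
  • Interpolation and Geometric Transformations
  • Measurements on Labeled Regions (useful for Image Segmentation)
  • Feature Extraction (for example, edge detection) and so on.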
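Convolution, one of the operations listed above, can be sketched in a few lines with scipy.ndimage.convolve. The 5x5 input and the 3x3 box kernel below are made-up values chosen so that the result is easy to check by eye.

```python
import numpy as np
from scipy import ndimage

# Hypothetical 5x5 grayscale image with a single bright pixel in the middle
image = np.zeros((5, 5))
image[2, 2] = 9.0

# 3x3 box kernel: every output pixel is the mean of its neighborhood
kernel = np.ones((3, 3)) / 9.0

# mode='constant' pads the border with cval (zeros here)
smoothed = ndimage.convolve(image, kernel, mode='constant', cval=0.0)
print(smoothed)
```

The bright pixel is spread evenly over its 3x3 neighborhood, which is the basic mechanism behind blurring filters.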

Blurring an image with scipy

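from scipy import ndimage
from matplotlib import pyplot as plt

try:
    from scipy.datasets import face  # SciPy 1.10+ (needs the pooch package)
except ImportError:
    from scipy.misc import face      # older SciPy releases

img = face()
# Blur the two spatial axes only; sigma=0 leaves the color channels unmixed
blurred_face = ndimage.gaussian_filter(img, sigma=(3, 3, 0))

fig, ax = plt.subplots(1, 2, figsize=(16, 8))

ax[0].imshow(img)
ax[0].set_title("Original Image")
ax[1].imshow(blurred_face)
ax[1].set_title("Blurred Image")
plt.show()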



You can find all operations here.

4. Pillow/PIL


PIL (Python Imaging Library) is an open-source library for image processing in Python. The original PIL is no longer maintained; Pillow is its actively developed fork and keeps the same PIL import name. It can perform tasks on an image such as reading, rescaling, and saving in different image formats.

PIL can be used for image archives, image processing, and image display.
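As a small sketch of the reading, rescaling, and saving workflow described above, the example below creates an image in memory instead of loading one from disk, so it runs without any files. The image size, color, and buffer names are arbitrary choices for illustration.

```python
import io
from PIL import Image

# Hypothetical 200x100 solid-color image standing in for a file on disk
im = Image.new('RGB', (200, 100), color=(200, 30, 30))

# Rescale to half the size
small = im.resize((100, 50))

# Save in two formats to in-memory buffers, then read one back
png_buffer, jpeg_buffer = io.BytesIO(), io.BytesIO()
small.save(png_buffer, format='PNG')
small.save(jpeg_buffer, format='JPEG')

png_buffer.seek(0)
reloaded = Image.open(png_buffer)
print(reloaded.format, reloaded.size)
```

With real files you would pass a path to Image.open and to save instead of a buffer; the rest of the calls stay the same.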

Image enhancement with PIL

For example, let's increase the contrast of the following image by 30%.

from PIL import Image, ImageEnhance

# Read image
im = Image.open('cat_inpainted.png')
# Display image
im.show()

# A factor of 1.0 returns the original image; 1.3 adds 30% contrast
enh = ImageEnhance.Contrast(im)
enh.enhance(1.3).show("30% more contrast")



For more information go here

5. NumPy


An image is essentially an array of pixel values where each pixel is represented by 1 (greyscale) or 3 (RGB) values. Therefore, NumPy can easily perform tasks such as image cropping, masking, or manipulation of pixel values.
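Here is a minimal sketch of cropping, masking, and pixel manipulation on a plain NumPy array. The 4x6 image and the region coordinates are hypothetical; a real image loaded as an array would be handled the same way.

```python
import numpy as np

# Hypothetical 4x6 RGB image filled with a mid-gray value
image = np.full((4, 6, 3), 128, dtype=np.uint8)

# Cropping is plain slicing: rows 1-2, columns 2-4
crop = image[1:3, 2:5]

# Masking: black out everything outside the selected region
mask = np.zeros(image.shape[:2], dtype=bool)
mask[1:3, 2:5] = True
masked = image.copy()
masked[~mask] = 0

# Pixel manipulation: brighten the image, clipping to the valid uint8 range
brighter = np.clip(image.astype(np.int16) + 150, 0, 255).astype(np.uint8)
print(crop.shape, masked[0, 0], brighter.max())
```

The cast to a wider integer type before adding matters: adding directly to a uint8 array would wrap around instead of saturating at 255.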

For example, to extract the red, green, and blue channels from the following image:


We can use NumPy to isolate each channel in turn by setting the pixel values of the other two channels to zero.

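from PIL import Image
import numpy as np

# Load the image as an (H, W, 3) array; convert('RGB') drops any alpha channel
im = np.array(Image.open('goku.png').convert('RGB'))

# Keep one channel per copy and zero out the other two
im_R = im.copy()
im_R[:, :, (1, 2)] = 0
im_G = im.copy()
im_G[:, :, (0, 2)] = 0
im_B = im.copy()
im_B[:, :, (0, 1)] = 0

# Place the three single-channel images side by side
im_RGB = np.concatenate((im_R, im_G, im_B), axis=1)

pil_img = Image.fromarray(im_RGB)
pil_img.save('goku.jpg')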

6. Mahotas

Mahotas is another image processing and computer vision library that was designed for bioimage informatics. It reads and writes images as NumPy arrays, and is implemented in C++ with a smooth Python interface.

The most popular functions of Mahotas are

Let's see how template matching with Mahotas can be used to find Wally in a crowd. The following code snippet does this:

from pylab import imshow, show
import mahotas
import mahotas.demos
import numpy as np

wally = mahotas.demos.load('Wally')
wfloat = wally.astype(float)

r,g,b = wfloat.transpose((2,0,1))
w = wfloat.mean(2)

# Striped template: alternating bands, like Wally's shirt
pattern = np.ones((24,16), float)
for i in range(2):
    pattern[i::4] = -1

# Find the best match of the template in the "redness" of the image
v = mahotas.convolve(r-w, pattern)
mask = (v == v.max())
mask = mahotas.dilate(mask, np.ones((48,24)))

# Darken everything except the region around the match
np.subtract(wally, .8*wally * ~mask[:,:,None], out=wally, casting='unsafe')
imshow(wally)
show()

7. SimpleITK

ITK (Insight Segmentation and Registration Toolkit) is an open-source platform that is widely used for image segmentation and image registration (a process that aligns two or more images so that they can be overlaid). SimpleITK is a simplified interface to ITK.


ITK uses the CMake build environment, and the library is implemented in C++ and wrapped for Python.


You can check this Jupyter Notebook for learning and research purposes.

8. Pgmagick

Pgmagick is a GraphicsMagick binding for Python that provides utilities for image operations such as resizing, rotation, sharpening, gradient images, drawing text, etc.

Blurring an image

from pgmagick.api import Image

img = Image('leena.jpeg')

# blur image 
img.blur(10, 5)

Scaling of an image

from pgmagick.api import Image

img = Image('leena.png')

# scaling image 
img.scale((150, 100), 'leena_scaled')

For more info, you can check the curated list of Jupyter Notebooks here.

Final thoughts

We have covered the top 8 image processing libraries for machine learning. Hopefully, you now have an idea of which one of those will work best for your project. Best of luck. 🙂
