Segmenting and Colorizing Images in IOS App Using Deoldify and Django API

Image segmentation is a computer vision task closely related to object detection and recognition. By partitioning an image pixel by pixel into regions that each correspond to an object in the scene, we can train deep learning models for tasks that demand detailed image analysis and context interpretation. Models trained this way can determine the shapes of detected objects, predict the direction in which they are moving, and generate many other insights.

We’ll learn how image segmentation works with a hands-on approach: developing a backend API that handles model serving and a small iOS application that consumes its services.

The API will consist of multiple views that process the input images at different stages. The backend treats each view as a nested microservice responsible for a single feature: background customization, background grayscale, or colorizing old images.

Image segmentation can be done with various techniques, each with its advantages and disadvantages:

Segmentation Technique | Description | Advantages | Disadvantages
Thresholding Method | Based on image histogram peaks to find specific thresholds | One of the simplest methods. Doesn’t need previous information | Spatial details are not well-considered, and it largely depends on color variations and peaks
Edge-Based Method | Applying discontinuity detection techniques | Performs very well for images having good contrast between objects | Not suitable when there are too many edges in the image
Region-Based Method | Homogeneous image partitioning to find particular regions | Useful when similarity criteria can be easily defined | Quite expensive in terms of time and space complexity
ANN-Based Method | Based on the simulation of the learning process for decision making | Neural network architectures, no need to write complex programs | Demands a lot of training data
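
To make the thresholding row of the table concrete, here is a minimal sketch of histogram-based thresholding using Otsu’s method, one common way to pick the threshold. It is written in plain Python for readability; the function name and the toy pixel values are illustrative assumptions, not part of the project code.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that best splits pixels into (<= t) and (> t)."""
    # Build the intensity histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # summed intensity of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, nothing to compare
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor.
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy "image": four dark pixels followed by four bright ones (hypothetical values).
pixels = [10, 12, 11, 13, 200, 205, 198, 210]
threshold = otsu_threshold(pixels)
mask = [p > threshold for p in pixels]  # True marks the bright region
```

Note that the threshold is computed from the histogram alone, which is exactly why the table lists poor handling of spatial detail as the main drawback.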

There are three main types of image segmentation:

  • Semantic Segmentation: Identifies the trainable object classes and segregates accordingly.
  • Instance Segmentation: Detects the number of instances of each object class. Thus, it segregates the components more accurately and it helps decompose the whole image into multiple labeled regions that refer to the classes the model was trained on. 
  • Panoptic Segmentation: A unified version of both semantic and instance segmentation.
Semantic segmentation and image colorizer
U-Net semantic segmentation example on the right, Detectron model for instance segmentation on the left.
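
The difference between a semantic mask and an instance map can be illustrated on a tiny label grid. The sketch below splits one semantic class into instances with a plain connected-components pass. This is only an illustration under simplifying assumptions: real instance segmentation models predict instances directly, and connected components would merge objects that touch each other. The function name and the grid are hypothetical.

```python
from collections import deque


def label_instances(semantic_mask, target_class):
    """Give each 4-connected blob of target_class its own instance id (1, 2, ...)."""
    height, width = len(semantic_mask), len(semantic_mask[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for y in range(height):
        for x in range(width):
            if semantic_mask[y][x] != target_class or instances[y][x]:
                continue
            # Found an unvisited pixel of the class: flood-fill its blob.
            next_id += 1
            instances[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and semantic_mask[ny][nx] == target_class and not instances[ny][nx]:
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id


PERSON = 15  # class index used for "person" later in this article
semantic = [
    [15, 15, 0, 0, 0],
    [15, 15, 0, 15, 15],
    [0, 0, 0, 15, 15],
]
instances, count = label_instances(semantic, PERSON)
```

The semantic mask only says "these pixels are person"; the instance map additionally says which person each pixel belongs to.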

We’ll also be taking a look at recent image techniques that accurately colorize old black and white photographs. The algorithms used to perform such artistic tasks are a combination of specific Generative Adversarial Networks (GANs) that generate accurate color pigments that match the objects present in the image. The models segment the image and colorize each pixel according to the classes they were trained for. 

The library we’ll be using is DeOldify. It has a well-documented GitHub repo with lots of examples and tutorials to quickly get you started.

Deoldify example
Deoldify example from their github repo | Source

For this article, we’ll cover the following:

  • Cover some technical background on DeepLab V3+ for image segmentation,
  • Use the PyTorch implementation of DeepLab-ResNet101,
  • Test DeOldify to process black and white images and propose a color version,
  • Wrap all the models within an API to serve them,
  • Create a small iOS application to fetch the image results,
  • Conclude.

You can check the entire code for this project in my GitHub repo.

Technical background for DeepLab V3+

Atrous Spatial Pyramid Pooling Convolutions

Most segmentation models use a fully convolutional neural network (FCNN) as the first processing stage to correctly place the masks and boundaries needed before the object detection phases. DeepLab V3+ is the latest and most sophisticated iteration of Google’s DeepLab segmentation models.

One of the many motivations behind DeepLab is to enable applications such as the synthetic shallow depth-of-field effect used in the portrait mode of Pixel 2 smartphones.

The DeepLab V3+ release includes models built on top of CNN architecture backbones, but mostly the model relies on the newly introduced Atrous Spatial Pyramid Pooling convolutions (ASPP). The general structure presents the following phases:

  • Extract image features using the CNN backbone. In our case, the backbone is ResNet-101; it will identify and detect the mask feature maps that will be fed to further stages.
  • Control the size of the output so as not to lose too much semantic information.
  • During the last stages, ASPP classifies the different pixels of the output image and processes it through 1×1 convolutional layers to recover the proper original size.
Atrous convolution
Parallel modules with atrous convolution (ASPP) | Source: Rethinking Atrous Convolutions
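
To see what "atrous" (dilated) convolution means in practice, here is a minimal one-dimensional sketch in plain Python. It is not how DeepLab is implemented; it only shows that spacing the kernel taps `rate` samples apart widens the receptive field without adding weights, which is the idea ASPP applies with several rates in parallel. The function name and the toy signal are assumptions made for illustration.

```python
def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1-D convolution whose kernel taps are spaced `rate` samples apart."""
    # Receptive field covered by the dilated kernel.
    span = (len(kernel) - 1) * rate + 1
    return [
        sum(weight * signal[start + tap * rate] for tap, weight in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


signal = list(range(10))  # 0, 1, ..., 9
kernel = [1, 1, 1]        # same three weights for every rate

# ASPP-style: apply the same kernel size at several rates in parallel.
pyramid = {rate: atrous_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}
```

With rate 1 each output sees 3 neighboring samples, with rate 2 it spans 5 samples, and with rate 3 it spans 7, all with the same three weights.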

PyTorch implementation of DeepLab-ResNet101

To quickly get hands-on with DeepLab, we’ll use the PyTorch implementation of DeepLab V3 with a ResNet-101 backbone, pre-trained on the COCO dataset and easily loadable from the torchvision package.

I’ll try to detail as much as I can the different steps needed to code the Python module that calls the model. We’ll start from the ground up.

Enable your Python Virtual Environment  

Use virtualenv or anaconda to create a virtual environment for the project within which you’ll install all the required dependencies. Keep in mind that the pretrained versions we’ll be testing are GPU-based.  

1. Download and Install Anaconda: website 

2. Create your virtual environment:

conda create --name seg_env python=3.6 

3. Activate the virtual environment:

conda activate seg_env

4. Install the required libraries:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

pip install opencv-python 

pip install numpy 

pip install Pillow==2.2.1
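
Once the installs finish, a quick sanity check can confirm that the packages are importable from the active environment. The sketch below uses only the standard library; the mapping of module names to package names and the helper name are assumptions made for this example.

```python
import importlib.util

# Importable module name -> package name used at install time.
REQUIRED = {
    "torch": "pytorch",
    "torchvision": "torchvision",
    "cv2": "opencv-python",
    "numpy": "numpy",
    "PIL": "Pillow",
}


def missing_packages(required=REQUIRED):
    """Return the install names of the modules that cannot be found."""
    return [
        package
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All required packages are importable.")
```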

Install Deoldify requirements

  • Clone the Deoldify Github repo and install the requirements.txt

git clone

Implement the Data Loader class

Before starting to code the Python module to wrap the model behavior, we’ll need to code a data loader to take care of the input image files. The main purpose of the data loader is to preprocess all image input files, transforming them into high-level objects with specific attributes and properties that will help ease the work once we want to train or evaluate the model against a batch of raw inputs.

import os

import matplotlib.pyplot as plt
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class SegmentationSample(Dataset):

    def __init__(self, root_dir, image_file, device):
        # Process the image input file
        self.image_file = os.path.join(root_dir, image_file)
        # Assumed completion of the truncated line: open the file as an RGB Pillow image
        self.image = Image.open(self.image_file).convert('RGB')

        # Assess the internal device, falling back to cpu when cuda is unavailable:
        if device == 'cuda' and torch.cuda.is_available():
            self.device = 'cuda'
        else:
            self.device = 'cpu'

        # Define the Transforms applied to the input image:
        self.preprocessing = transforms.Compose([
            # Normalize expects a tensor, so convert the Pillow image first
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        ])
        self.unload_tensor = transforms.ToPILImage()

        # Output returned: processed tensor with an additional dim for the batch:
        self.processed_image = self.preprocessing(self.image)
        self.processed_image = self.processed_image.unsqueeze(0).to(self.device)

    def __getitem__(self, item):
        return self.processed_image

    def print_image(self, title=None):
        image = self.image
        if title is not None:
            plt.title(title)
        plt.imshow(image)
        plt.show()

    def print_processed(self, title='After processing'):
        image = self.processed_image.squeeze(0).detach().cpu()
        image = self.unload_tensor(image)
        plt.title(title)
        plt.imshow(image)
        plt.show()
  • The init method: takes the root_dir and image file, opens the file as a Pillow image object, and then converts it into a torch tensor. The pixel values of the raw input image are normalized according to specific mean and std values. Once all the transformations have been applied, a batch dimension is added so that the shape of the tensor matches the input the model expects.
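
The normalization arithmetic is easy to check by hand on a single pixel. The sketch below reproduces it in plain Python, assuming the pixel is first scaled from [0, 255] to [0, 1] (which is what torchvision’s ToTensor does) and then standardized per channel with the mean and std values from the data loader. The helper name is hypothetical.

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map one 8-bit RGB pixel to the per-channel values the network receives."""
    scaled = [channel / 255 for channel in rgb]  # [0, 255] -> [0.0, 1.0]
    return [(value - mean) / std for value, mean, std in zip(scaled, MEAN, STD)]


black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
```

A black pixel ends up slightly below -2 in every channel and a white pixel slightly above +2, so the inputs are roughly centered around zero.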

Create the Python wrapper to serve the DeepLab model

The module has to initialize the pre-trained weights of the deeplab-resnet101 model version. It also requires the user to specify whether to use CPU or GPU acceleration during inference time.

In addition, the module will implement methods to customize the background.

import torch
from torch import nn
from torchvision import models


class SemanticSeg(nn.Module):
    def __init__(self, pretrained: bool, device):
        super(SemanticSeg, self).__init__()
        # Fall back to the cpu when cuda is requested but unavailable
        if device == 'cuda' and torch.cuda.is_available():
            self.device = 'cuda'
        else:
            self.device = 'cpu'

        self.model = self.load_model(pretrained)

    def forward(self, input: SegmentationSample):
        # Run the model in the respective device:
        with torch.no_grad():
            output = self.model(input.processed_image)['out']

        # Keep the highest-scoring class for each pixel: [1, 21, H, W] -> [H, W]
        reshaped_output = torch.argmax(output.squeeze(), dim=0).detach().cpu()
        return reshaped_output

    # Add the Backbone option in the parameters
    def load_model(self, pretrained=False):
        if pretrained:
            model = models.deeplabv3_resnet101(pretrained=True)
        else:
            model = models.deeplabv3_resnet101()
        # Move the model to the target device and switch it to eval mode
        model.to(self.device)
        model.eval()
        return model
  • forward(self, input: SegmentationSample): Runs the inference on the sampled image input and returns the tensor of per-pixel class predictions.
  • load_model(self, pretrained=False): Loads the pretrained version of deeplabv3_resnet101 available in torchvision, moves the model to the respective device, and sets it to eval mode.
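
The argmax step in forward is what turns per-class score maps into a single label map. Here is a minimal plain-Python sketch of that step, using 3 classes and a 2×2 image instead of 21 classes and a full-size picture; the function name and the score values are made up for illustration.

```python
def decode_labels(scores):
    """scores[c][y][x] -> label_map[y][x], the index of the best-scoring class."""
    n_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(n_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


scores = [
    [[0.90, 0.10], [0.20, 0.30]],  # class 0
    [[0.05, 0.70], [0.10, 0.30]],  # class 1
    [[0.05, 0.20], [0.70, 0.40]],  # class 2
]
label_map = decode_labels(scores)
```

The result has one class index per pixel, which is the [H, W] shape that the post-processing methods work with.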

Afterward, we’ll add the post-process methods to help remap a customized background on top of the model predictions. Keep in mind that the output tensors have 21 channels matching the prediction results for each target class the model was trained on. Accordingly, we need to decode the tensor shape to output a proper image result.

def background_custom(self, input_image, source, background_source,number_channels=21):

        label_colors = np.array([(0, 0, 0),  # 0=background
                                 # 1=aeroplane, 2=bicycle, 3=bird, 4=boat, 5=bottle
                                 (128, 0, 0), (0, 128, 0), (128, 128, 0), (0, 0, 128), (128, 0, 128),
                                 # 6=bus, 7=car, 8=cat, 9=chair, 10=cow
                                 (0, 128, 128), (128, 128, 128), (64, 0, 0), (192, 0, 0), (64, 128, 0),
                                 # 11=dining table, 12=dog, 13=horse, 14=motorbike, 15=person
                                 (192, 128, 0), (64, 0, 128), (192, 0, 128), (64, 128, 128), (192, 128, 128),
                                 # 16=potted plant, 17=sheep, 18=sofa, 19=train, 20=tv/monitor
                                 (0, 64, 0), (128, 64, 0), (0, 192, 0), (128, 192, 0), (0, 64, 128)])

        # Defining empty matrices for rgb tensors:
        r = np.zeros_like(input_image).astype(np.uint8)
        g = np.zeros_like(input_image).astype(np.uint8)
        b = np.zeros_like(input_image).astype(np.uint8)

        for l in range(0, number_channels):
            if l == 15:
                idx = input_image == l
                r[idx] = label_colors[l, 0]
                g[idx] = label_colors[l, 1]
                b[idx] = label_colors[l, 2]

        rgb = np.stack([r, g, b], axis=2)

        # Load the source image and resize it to match the shape of the R band in the RGB output map
        foreground = cv2.imread(source)
        foreground = cv2.resize(foreground, (r.shape[1], r.shape[0]))
        foreground = cv2.cvtColor(foreground, cv2.COLOR_BGR2RGB)

        background = cv2.imread(background_source, cv2.IMREAD_COLOR)
        background = cv2.resize(background, (rgb.shape[1], rgb.shape[0]), interpolation=cv2.INTER_AREA)
        background = cv2.cvtColor(background, cv2.COLOR_BGR2RGB)

        # Create a binary mask using the threshold
        th, alpha = cv2.threshold(np.array(rgb), 0, 255, cv2.THRESH_BINARY)

        # Convert uint8 to float
        foreground = foreground.astype(float)
        background = background.astype(float)
        # Normalize the alpha mask to keep intensity between 0 and 1
        alpha = alpha.astype(float) / 255
        # Multiply the foreground with the alpha matte
        foreground = cv2.multiply(alpha, foreground)
        # Multiply the background with ( 1 - alpha )
        background = cv2.multiply(1.0 - alpha, background)
        # Add the masked foreground and background.
        outImage = cv2.add(foreground, background)

        return outImage / 255
  • background_custom(self, input_image, source, background_source, number_channels=21): The method takes the [H, W] map of predicted class indices returned by forward (the argmax over the 21 class channels of the [1, 21, H, W] model output), the path to the image source file, and the path to the background image file. The logic consists of keeping only the pixels labeled as person (class 15) and tagging all the rest as belonging to the background. Finally, it merges the person pixels from the source image with the new background image.
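
The blending at the end of the method follows the usual alpha compositing formula, out = alpha × foreground + (1 − alpha) × background, with alpha equal to 1 on person pixels and 0 elsewhere. A minimal plain-Python sketch on a two-pixel image, with hypothetical names and values:

```python
PERSON = 15  # index of the person class in the label list above


def composite(foreground, background, labels, keep_class=PERSON):
    """Keep foreground pixels where labels == keep_class, background elsewhere."""
    out = []
    for fg_row, bg_row, label_row in zip(foreground, background, labels):
        row = []
        for fg, bg, label in zip(fg_row, bg_row, label_row):
            alpha = 1.0 if label == keep_class else 0.0
            row.append(tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg)))
        out.append(row)
    return out


foreground = [[(200, 150, 100), (10, 20, 30)]]  # original photo
background = [[(0, 0, 255), (0, 0, 255)]]       # replacement background
labels = [[15, 0]]                              # person, then background
result = composite(foreground, background, labels)
```

The first pixel keeps the photo’s color because it is labeled as person; the second is replaced by the new background.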

Add Deoldify to the module

from deoldify import device
from deoldify.device_id import DeviceId
import torch
import fastai
from deoldify.visualize import *

def colorize_image(self, input_image, output_image, render_factor=35):
    torch.backends.cudnn.benchmark = True
    # Instantiate the main object to colorize the image
    colorizer = get_image_colorizer(artistic=False)
    colorized_image = colorizer.get_transformed_image(input_image, render_factor, watermarked=False)
    # Assumed completion: output_image is treated as the path where the result is saved
    colorized_image.save(output_image)
    return colorized_image
  • colorize_image(self, input_image, output_image): Takes the input image and calls the colorizer.get_transformed_image(input_image), which runs the inference and yields back the output colorized image.

Wrap the models in an API

As usual, we’ll use Django to create a small RESTful API that defines locally hosted endpoints to test our models through POST and GET calls.

Typically, an API is a window into a database. The API backend handles querying the database and formatting the response. What you receive is a static response, usually in JSON format, of whatever resource you requested.

Let’s set up the Django Part

Install Django and Django Rest Framework:

pip install django djangorestframework

Once the dependencies are correctly installed, head to the root folder and initialize the Django project (the name must be a valid Python identifier, so use an underscore rather than a hyphen):

django-admin startproject semantic_seg

Now your Django project is ready to go. The only thing left is to instantiate the Django rest framework and create a specific folder for it within the initial project folder. 

  • Start your api app: python manage.py startapp api
  • Add the path to your newly created api folder to the project’s general settings file:

The tree structure for the API folder should look like this:

Tree print of the project folder structure

Once all configurations are in place, we’ll proceed to code the models and serializers that will ultimately handle all transactional processes involving image data that will be requested back and forth. 

Since the API will take care of retrieving the resulting modified images, you could make use of Neptune’s image logging system to trace and log the different image versions yielded throughout the model iterations.

Each output image can be saved in your Neptune platform, where it gives an indication of the model’s performance and accuracy. As the results improve with each iteration, you can compare them all in a structured and well-organized manner.

For more info about how to log content in Neptune, whether it’s tables, plots, or images, I highly recommend taking a look at my previous articles on the matter.

Django ORM Modules

To keep things simple, as we’re building a simplistic version of an ML backend, we can rely on the ORM classes that Django offers right out of the box. Their importance lies in the fact that we need something to manage and store all the data generated from the calls to the API. For our particular case, we need to post images, apply the model inference to get the semantic filter, and then retrieve the results.

Therefore we need two main components:

  1. Model classes that represent the image objects interchanged,
  2. Input and Output Image serializers to help store the image in the database.

The Model Class

A Python class that inherits from django.db.models.Model class and defines a set of attributes and characteristics relevant to the image object.

  • models.FileField: Stores the path of the image file
  • models.UUIDField: Generates a specific id for each image instance
  • models.CharField: A way to name each object instance
  • models.DateTimeField: Save the exact time when they were stored or updated
import uuid

from django.db import models
from API.utils import get_input_image_path, get_output_image_path


class ImageSegmentation(models.Model):
    uuid = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=255, null=True, blank=True)
    input_image = models.FileField(upload_to=get_input_image_path, null=True, blank=True)
    output_image = models.FileField(upload_to=get_output_image_path, null=True, blank=True)
    verified = models.BooleanField(default=False)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    def __str__(self):
        # Assumed completion of the truncated line: display the object by its name
        return "%s" % self.name
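
The upload_to helpers imported from the utils module are not shown in this article. Django calls such a callable with the model instance and the original filename and expects a relative path in return, so a helper could look roughly like the sketch below. The folder name and the naming scheme are assumptions, not the project’s actual implementation.

```python
import posixpath
import uuid
from types import SimpleNamespace


def get_input_image_path(instance, filename):
    """Hypothetical upload_to callable: store the file under its object's uuid."""
    extension = posixpath.splitext(filename)[1].lower()
    # Django expects forward slashes here, whatever the operating system.
    return posixpath.join("input_images", "{}{}".format(instance.uuid, extension))


# Stand-in for a model instance, so the helper can be tried without a database.
fake_instance = SimpleNamespace(uuid=uuid.UUID("12345678-1234-5678-1234-567812345678"))
path = get_input_image_path(fake_instance, "Holiday Photo.JPG")
```

Naming files after the primary key avoids collisions when two users upload files with the same name.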

Once you have coded the class, migrate your changes to the SQL database:

python manage.py makemigrations
python manage.py migrate

Input and output image serializers

Define the Django serializers with the corresponding attributes of the image object. We’ll be making two serializers to handle incoming and outgoing image objects.

class InputImageSerializer(serializers.ModelSerializer):
    class Meta:
        model = ImageSegmentation
        fields = ('uuid', 'name', )

class OutputImageSerializer(serializers.ModelSerializer):
    class Meta:
        model = ImageSegmentation
        fields = ('uuid', 'name', 'input_image', 'output_image', 'created_at', 'updated_at')

Finally, after all these changes, you need to register your new model in the admin portal. You can do so in the app’s admin file by registering the ImageSegmentation model with admin.site.register.

Build the API endpoints

For the POST request, there will be two parts: one method handles the background customization and the other handles the colorizing.

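  • POST for Background Customization: It sends two image files, the original photo and the matching background. It processes them and saves them into their corresponding folders.
# Assumes the usual DRF imports (Response, status) and an @api_view-style wrapper around each view.
# ImageSerializer, modify_input_for_multiple_files and RunDeepLabInference are project helpers not shown here.
def run_inference(request):
    property_id = request.POST.get('property_id')

    # converts querydict to original dict
    # (this truncated expression and the call below are completed on a best-guess basis)
    images = dict((request.data).lists())['image']
    flag = 1
    arr = []
    for img_name in images:
        modified_data = modify_input_for_multiple_files(property_id, img_name)
        file_serializer = ImageSerializer(data=modified_data)
        if file_serializer.is_valid():
            file_serializer.save()
            arr.append(file_serializer.data)
        else:
            flag = 0

    if flag == 1:
        image_path = os.path.relpath(arr[0]['image'], '/')
        bg_path = os.path.relpath(arr[1]['image'], '/')
        input_image = ImageSegmentation.objects.create(input_image=image_path, name='image_%02d' % uuid.uuid1())
        bg_image = ImageSegmentation.objects.create(input_image=bg_path, name='image_%02d' % uuid.uuid1())
        RunDeepLabInference(input_image, bg_image).save_bg_custom_output()
        serializer = OutputImageSerializer(input_image)
        # Assumed completion of the truncated call
        return Response(serializer.data, status=status.HTTP_201_CREATED)
    return Response(arr, status=status.HTTP_400_BAD_REQUEST)


def run_grayscale_inference(request):
    file_ = request.FILES['image']
    image = ImageSegmentation.objects.create(input_image=file_, name='image_%02d' % uuid.uuid1())
    serializer = OutputImageSerializer(image)
    # Assumed completion of the truncated call
    return Response(serializer.data, status=status.HTTP_201_CREATED)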
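  • POST for the colorizing DeOldify model: parses the request and extracts the base64 image string, decodes it, and applies the colorize filter before saving the result to the output image folder.
def colorize_image(request):
    file_image = request.FILES['image']
    image = ImageSegmentation.objects.create(input_image=file_image, name='image_%02d' % uuid.uuid1())
    # Decode the base64 payload of the upload (the original passed the model instance here)
    file_image.seek(0)
    image_string = base64.b64decode(file_image.read())
    image_data = BytesIO(image_string)
    # Assumed completion of the truncated line: open the decoded bytes with Pillow
    img = Image.open(image_data)
    # colorizer is the DeOldify object returned by get_image_colorizer
    colorized_image = colorizer.get_transformed_image(file_image, render_factor=35, watermarked=False)
    serializer = OutputImageSerializer(image)
    # Assumed completion of the truncated call
    return Response(serializer.data, status=status.HTTP_201_CREATED)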
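
The base64 round trip that this endpoint relies on can be tried in isolation with the standard library. In the sketch below, the helper names and the stand-in bytes are assumptions; the line-wrapping step imitates a client that inserts line breaks into the encoded string, which Python’s decoder discards by default.

```python
import base64
from io import BytesIO


def encode_image_bytes(raw_bytes):
    """What the client does: raw image bytes -> base64 text."""
    return base64.b64encode(raw_bytes).decode("ascii")


def decode_image_string(image_string):
    """What the server does: base64 text -> file-like object holding the bytes."""
    return BytesIO(base64.b64decode(image_string))


raw = b"\xff\xd8\xff\xe0 stand-in for JPEG bytes"
encoded = encode_image_bytes(raw)

# Simulate a client that wraps the encoded text with CRLF line breaks.
wrapped = "\r\n".join(encoded[i:i + 16] for i in range(0, len(encoded), 16))
decoded = decode_image_string(wrapped).read()
```

The file-like object returned by the decoder is what an image library would then open on the server side.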

The GET method will simply retrieve the transformed images that we store in the database and serve them as static files.

def get_images(request):
    property_id = request.POST.get('property_id')

    # converts querydict to original dict
    # (this truncated expression and the call below are completed on a best-guess basis)
    images = dict((request.data).lists())['image']
    flag = 1
    arr = []
    for img_name in images:
        modified_data = modify_input_for_multiple_files(property_id, img_name)
        file_serializer = ImageSerializer(data=modified_data)
        if file_serializer.is_valid():
            file_serializer.save()
            arr.append(file_serializer.data)
        else:
            flag = 0

    if flag == 1:
        return Response(arr, status=status.HTTP_201_CREATED)
    else:
        return Response(arr, status=status.HTTP_400_BAD_REQUEST)

Configure the API Routing

1. Set up the path for the URL patterns in the app’s urls file:

from django.urls import path

from . import views

app_name = 'api'

urlpatterns = [
    path(r'test/', views.test_api, name='test_api_communication'),
    path(r'images/', views.get_images, name='get_images'),
    path(r'inference/', views.run_inference, name='run_inference_on_images'),
    path(r'grayscale/', views.run_grayscale_inference, name='run_grayscale_inference_on_images'),
    path(r'colorize/', views.colorize_image, name='run_deoldify_colorize_filter_on_images'),
    path(r'clean/', views.clean_folders, name='clean_output_folder'),
]

2. Define the addresses of your API endpoints in the project’s urls file:

urlpatterns = [
    path(r'test/', views.test_api, name='test_api_communication'),
    path(r'images/', views.get_images, name='get_images'),
    path(r'api/', views.run_inference, name='run_inference_on_images'),
]

Build the iOS application

To wrap up, now that our API is up and running, all we need to do is build a small iOS application with two view controllers to upload pictures and get back their beautifully transformed versions, with background customization and color filtering.

The results we’ll eventually obtain in our application interface are similar to these examples from the API:

Image colorizer example

I like to code in Swift entirely programmatically, and I admit that I have a kind of aversion to storyboards and any sort of XML-related UI development. So, let’s keep things simple and entertaining by removing the main storyboard and setting up the SceneDelegate.swift file.

  1. Delete the Storyboard Name entry in the Info.plist file
  2. Change the SceneDelegate file accordingly:
func scene(_ scene: UIScene, willConnectTo session: UISceneSession, options connectionOptions: UIScene.ConnectionOptions) {
    guard let windowScene = (scene as? UIWindowScene) else { return }
    window = UIWindow(frame: windowScene.coordinateSpace.bounds)
    window?.windowScene = windowScene
    window?.rootViewController = ViewController()
    // Without a storyboard, the window has to be shown explicitly
    window?.makeKeyAndVisible()
}

Create the entry point ViewController

The first ViewController will act as an entry point to our application. It will define the basic layout with two buttons that can either let the user take a picture or upload one from the library.

The layout is constrained manually in code rather than relying on a storyboard to position the elements.

The layout contains two buttons vertically aligned and a UIImageView logo at the top.

Logo image

  • Small UIImageView as a Logo for the application
let logo: UIImageView = {
    let image = UIImageView(image: #imageLiteral(resourceName: "default").resized(newSize: CGSize(width: screenWidth - 20, height: screenWidth - 20)))
    image.translatesAutoresizingMaskIntoConstraints = false
    return image
}()

  • Button to open the camera
lazy var openCameraBtn : CustomButton = {
    let btn = CustomButton()
    btn.translatesAutoresizingMaskIntoConstraints = false
    btn.setTitle("Camera", for: .normal)
    let icon = UIImage(named: "camera")?.resized(newSize: CGSize(width: 45, height: 45))
    let tintedImage = icon?.withRenderingMode(.alwaysTemplate)
    btn.setImage(tintedImage, for: .normal)
    btn.tintColor = #colorLiteral(red: 0.892498076, green: 0.5087850094, blue: 0.9061965346, alpha: 1)
    btn.addTarget(self, action: #selector(openCamera), for: .touchUpInside)
    return btn
}()
  • Button for picture upload:
lazy var openToUploadBtn : CustomButton = {
    let btn = CustomButton()
    btn.addTarget(self, action: #selector(uploadLibrary), for: .touchUpInside)
    btn.translatesAutoresizingMaskIntoConstraints = false
    return btn
}()

Set the general layout and the constraints for each UI element

fileprivate func addButtonsToSubview() {
    // Assumed completion: the body was lost, so add the three elements defined above
    view.addSubview(logo)
    view.addSubview(openCameraBtn)
    view.addSubview(openToUploadBtn)
}

fileprivate func setupView() {
    logo.centerXAnchor.constraint(equalTo: self.view.centerXAnchor).isActive = true
    logo.topAnchor.constraint(equalTo: self.view.safeAreaLayoutGuide.topAnchor, constant: 20).isActive = true
    openCameraBtn.centerXAnchor.constraint(equalTo: view.centerXAnchor).isActive = true
    openCameraBtn.widthAnchor.constraint(equalToConstant: view.frame.width - 40).isActive = true
    openCameraBtn.heightAnchor.constraint(equalToConstant: 60).isActive = true
    openCameraBtn.bottomAnchor.constraint(equalTo: openToUploadBtn.topAnchor, constant: -40).isActive = true
    openToUploadBtn.centerXAnchor.constraint(equalTo: view.centerXAnchor).isActive = true
    openToUploadBtn.widthAnchor.constraint(equalToConstant: view.frame.width - 40).isActive = true
    openToUploadBtn.heightAnchor.constraint(equalToConstant: 60).isActive = true
    openToUploadBtn.bottomAnchor.constraint(equalTo: view.bottomAnchor, constant: -120).isActive = true
}
  • Handle the Open Camera action:
@objc func openCamera() {
    if UIImagePickerController.isSourceTypeAvailable(.camera) {
        let imagePicker = UIImagePickerController()
        imagePicker.delegate = self
        imagePicker.sourceType = .camera
        imagePicker.allowsEditing = true
        self.present(imagePicker, animated: true, completion: nil)
    }
}
  • Handle the Upload from library action:
@objc func uploadLibrary() {
    if UIImagePickerController.isSourceTypeAvailable(.photoLibrary) {
        let imagePicker = UIImagePickerController()
        imagePicker.delegate = self
        imagePicker.sourceType = .photoLibrary
        imagePicker.allowsEditing = false
        self.present(imagePicker, animated: true, completion: nil)
    }
}
  • Implement the imagePickerController method from the UIImagePickerControllerDelegate:
func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey : Any]) {
    if let image = info[.originalImage] as? UIImage {
        let segmentationController = ImageSegmentationViewController()
        segmentationController.modalPresentationStyle = .fullScreen
        segmentationController.inputImage.image = image
        dismiss(animated: true, completion: nil)
        self.present(segmentationController, animated: true, completion: nil)
    }
}

Handle the API callbacks

To manage the HTTP API calls in the segmentation controller, we’ll use Alamofire, a widely used Swift package for elegant HTTP networking. Install the package using your preferred method; I used CocoaPods.

The POST method expects a dictionary of type [String: String], the key being "image" and the value being the base64 encoding of the original image.

The steps to implement the callback are as follows:

  1. Convert the UIImage to JPEG data at maximum quality and encode it as base64,
  2. Create the parameters dictionary that will be encoded and sent with the POST request,
  3. Perform the request with the Alamofire request method,
  4. Handle the API results,
  5. Update the UIImageView with the filtered image.
func colorizeImages() {
    let imageDataBase64 = inputImage.image!.jpegData(compressionQuality: 1)!.base64EncodedString(options: .lineLength64Characters)
    let parameters: Parameters = ["image": imageDataBase64]
    AF.request(URL.init(string: self.apiEntryPoint)!, method: .post, parameters: parameters, encoding: JSONEncoding.default, headers: .none).responseJSON { (response) in
        switch response.result {
        case .success(let value):
            if let JSON = value as? [String: Any] {
                let base64StringOutput = JSON["output_image"] as! String
                let newImageData = Data(base64Encoded: base64StringOutput)
                if let newImageData = newImageData {
                    let outputImage = UIImage(data: newImageData)
                    let finalOutputImage = outputImage
                    self.inputImage.image = finalOutputImage
                    self.colorizedImage = finalOutputImage
                }
            }
        case .failure(let error):
            // Assumed completion: the original handler body was lost
            print(error.localizedDescription)
        }
    }
}

Semantic segmentation results
Results obtained applying the background customization and gray filtering
Image colorizer results
Results obtained from colorize filters, transforming old black and white photographs into fully colorized ones.
Top left corner original photo | Source: Old pictures Casablanca, bottom right original photograph | Source: Souvenirs, Souvenirs


Conclusion

We’ve taken a tour through image segmentation, with some applications that turn out to be easy to implement and quite entertaining. I hope the small application I’ve proposed adds a bit of fuel to your creativity.

I encourage you to test other applications on the same foundation. Plenty of other cool features can be implemented with DeepLab V3.

To wrap up, here are some references I recommend checking:

