
Best MLOps Tools For Your Computer Vision Project Pipeline

Akruti Acharya

10 min

17th August, 2023

Computer Vision ML Tools

The lifecycle of an app or software system (also known as SDLC) has several main stages:

Planning,
Development,
Testing,
Deployment,

After deployment, the cycle starts again with new releases that bring features, updates, and/or fixes as needed.

To carry out these processes, software development relies on DevOps to streamline development while continuously delivering new releases and maintaining quality.

The workflow for computer vision models, or any machine learning models, also follows a similar pattern. Where it differs from traditional software development is in the environment where it operates.

For example, computer vision is prominently data-driven, and hence non-deterministic in behavior. Also, as recent global events have shown, our world is constantly changing, so CV practitioners must expect that the real-world data that powers their models will inevitably change too.

Hence, ML practitioners have adapted DevOps practices to ML models and come up with MLOps. In Wikipedia’s terms, MLOps is the process of taking experimental machine learning models into production. [source]

Gartner defines MLOps as a subset of ModelOps. MLOps is focused on the operationalization of ML models, while ModelOps covers the operationalization of all types of AI models. [source]


In this blog, our primary focus is MLOps for computer vision (if you’re more interested in the importance of ModelOps, you can check out this blog).

What are the key differences between MLOps and DevOps?

Properties	DevOps	MLOps
Code versioning
Compute environment
Continuous integration/delivery
Monitoring in production
Data provenance
Datasets
Models
Hyperparameters
Workflows

There’s a bit more to MLOps than to DevOps. Is all of that worth doing?

Why should you do MLOps, and not just focus on creating a better ML model?

You can implement and train a model with decent performance and accuracy without MLOps. However, the real challenge is building an integrated ML system to continuously operate it in production.

The following diagram shows that only a small fraction of a real-world ML system is composed of ML code. The required surrounding elements are vast and complex.

The above diagram shows the steps you would follow for your CV project. Now, the level of automation of these steps defines the maturity of the CV project.

What’s the maturity about? Let’s look at the three levels of MLOps briefly laid down by Google in “MLOps: Continuous delivery and automation pipelines in machine learning”:

MLOps level 0

Here, developers manually control all the steps, from collecting data to building the model and deploying it. The process is usually carried out with experimental code written and executed in notebooks until a workable model is produced. At this level, a trained model is deployed as a production service.

MLOps level 1

In level 1, the whole ML pipeline is automated, so models in production are retrained on new data without manual intervention. For this to work, the data validation and model validation steps, as well as metadata management, need to be automated too. At this level, you deploy a whole training pipeline that runs automatically and recurrently to serve the trained model as a prediction service.
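To make the level 1 idea more concrete, here is a minimal sketch of one automated retraining run: validate the new data, retrain, validate the candidate model, and promote it only if it is not worse than the current one. The function names, the record fields, and the toy "model" (a brightness threshold) are all hypothetical placeholders, not part of Google's guide or any particular tool.

```python
from statistics import mean

def validate_data(rows):
    """Data validation: every row needs a non-empty label and a pixel mean in [0, 255]."""
    return all(r["label"] and 0 <= r["pixel_mean"] <= 255 for r in rows)

def train(rows):
    """Stand-in for real training: the 'model' is just a brightness threshold."""
    return {"threshold": mean(r["pixel_mean"] for r in rows)}

def evaluate(model, rows):
    """Accuracy of predicting 'bright' when pixel_mean exceeds the threshold."""
    hits = sum(
        (r["pixel_mean"] > model["threshold"]) == (r["label"] == "bright")
        for r in rows
    )
    return hits / len(rows)

def retrain_if_better(new_rows, holdout, current_model):
    """One pipeline run: validate data, retrain, validate the model, then promote."""
    if not validate_data(new_rows):
        return current_model, "rejected: data validation failed"
    candidate = train(new_rows)
    if evaluate(candidate, holdout) < evaluate(current_model, holdout):
        return current_model, "rejected: candidate is worse"
    return candidate, "promoted"

# Hypothetical data: four labeled images summarized by their mean pixel value.
rows = [
    {"label": "dark", "pixel_mean": 40},
    {"label": "bright", "pixel_mean": 200},
    {"label": "dark", "pixel_mean": 60},
    {"label": "bright", "pixel_mean": 180},
]
model, status = retrain_if_better(rows, rows, {"threshold": 190})
```

In a real pipeline, each of these functions would be a separate pipeline step, and a scheduler or a data-arrival trigger would call the equivalent of retrain_if_better.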

MLOps level 2

In level 2, the CI/CD pipeline is automated along with the ML pipeline. The automated CI/CD lets you explore new ideas around feature engineering or model architecture, and implement them easily with automated pipeline building, testing, and deployment.

Now that we know the importance of MLOps, let’s look at each stage and component of your computer vision project in detail, along with all the associated MLOps tools.

Data and feature management

Data collection

Here’s a list of datasets used in common CV projects (apart from the very famous CIFAR-10, ImageNet, and MS COCO):

IMDB-Wiki Dataset

One of the largest open-source datasets of face images with gender and age labels. It consists of 523,051 face images.

Cityscapes Dataset

One of the largest datasets with a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with 5,000 high-quality annotated frames in addition to a larger set of 20,000 weakly annotated frames.

Fashion MNIST

It consists of 60,000 training and 10,000 test samples. Each sample is a 28×28 grayscale image associated with a label from one of 10 classes.

Kinetics | DeepMind

The dataset consists of 650,000 10-second video clips from YouTube, covering around 700 human action classes with at least 600 video clips in each class. The clips span a wide range of human-focused actions.

The 20BN-SOMETHING-SOMETHING-V2 Dataset | TwentyBN

This open-source dataset has densely labeled video clips of humans performing pre-defined basic actions with everyday objects. It was created for CV models to develop a good understanding of basic actions that occur in the physical world. The total number of videos included is 220,847, where 168,913 is the training set, 24,777 is the validation set, and 27,157 is the test set.

Data creation

If you want to build your own dataset, here are some of the tools commonly used in computer vision that can help with dataset creation:

LabelImg

It’s a graphical image annotation tool.
Written in Python, with Qt for the graphical interface.
Annotations saved in XML in PASCAL VOC format, the format used by ImageNet. Also supports YOLO and CreateML formats.
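Since annotation tools often export PASCAL VOC XML while many detectors expect YOLO-style text labels, a conversion step is common. The following is a minimal sketch using only the standard library. The sample XML and the class list are made up for illustration, and real exports contain more fields than shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down PASCAL VOC annotation for one image.
VOC_XML = """<annotation>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""

def voc_to_yolo(xml_text, class_names):
    """Convert VOC corner boxes (pixels) to YOLO lines: class x_center y_center w h (normalized)."""
    root = ET.fromstring(xml_text)
    width = float(root.findtext("size/width"))
    height = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        class_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        x_center = (xmin + xmax) / 2 / width
        y_center = (ymin + ymax) / 2 / height
        box_w = (xmax - xmin) / width
        box_h = (ymax - ymin) / height
        lines.append(f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}")
    return lines

yolo_lines = voc_to_yolo(VOC_XML, ["cat", "dog"])
# yolo_lines == ["1 0.312500 0.500000 0.312500 0.500000"]
```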

Computer Vision Annotation Tool

It’s a free, online, interactive video and image annotation tool.
Has deep learning serverless functions for automatic labeling

Label Studio

It’s an open-source data labeling tool for images, videos, audio, etc., with a simple UI.
Exports to various model formats
Apart from creating raw data, can be used for improving existing data.

Visual Object Tagging Tool

VoTT is a React + Redux Web application, written in TypeScript.
Extensible model for importing and exporting data.

Data management

neptune.ai

Besides handling data versioning well and offering a great UI, Neptune lets you log visual snapshots of your image directories, which can prove quite helpful for your computer vision project.

Data Version Control · DVC

Designed to handle large files and datasets along with machine learning models and metrics.
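The general idea behind versioning large files outside of Git can be sketched as follows: hash the data file, store only a small pointer file next to it, and compare hashes later to detect changes. This is a simplified illustration of content-based tracking, not DVC's actual file format or commands. The ".pointer.json" suffix and the function names are invented for this example.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large datasets never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def pointer_path_for(data_path):
    data_path = Path(data_path)
    return data_path.with_name(data_path.name + ".pointer.json")

def write_pointer(data_path):
    """Write a small pointer file; this is what would be committed instead of the data."""
    data_path = Path(data_path)
    pointer = {
        "path": data_path.name,
        "sha256": file_digest(data_path),
        "size": data_path.stat().st_size,
    }
    pointer_path_for(data_path).write_text(json.dumps(pointer, indent=2))
    return pointer

def is_unchanged(data_path):
    """Compare the current file hash with the hash recorded in the pointer file."""
    pointer = json.loads(pointer_path_for(data_path).read_text())
    return pointer["sha256"] == file_digest(data_path)

# Usage on a throwaway file standing in for a large image archive.
with tempfile.TemporaryDirectory() as workdir:
    archive = Path(workdir) / "images.bin"
    archive.write_bytes(b"\x00" * 1024)
    write_pointer(archive)
    unchanged_before = is_unchanged(archive)   # file matches its pointer
    archive.write_bytes(b"\x01" * 1024)
    unchanged_after = is_unchanged(archive)    # file was modified after tracking
```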

Roboflow

With features like unlimited exports, universal hosting, and labeling and annotation tools, Roboflow is great for your computer vision projects.

Dataiku

Dataiku also caters to people who prefer to work with code rather than visual tools to manipulate, transform, and model data.

Data verification

To assure the quality of your image or video data, here are some of the tools you can use:

Scale Nucleus

Scale accelerates the development of AI applications by helping computer vision teams generate high-quality ground truth data. Their advanced LiDAR, video, and image annotation APIs allow self-driving, drone, and robotics teams to focus on building differentiated models vs. labeling data. It offers paid services for high-quality training and validation data for all AI applications, both for on-demand and enterprise customers.

Great Expectations

Great Expectations is open-source and helps data teams eliminate pipeline debt, through data testing, documentation, and profiling.

Soda Data Observability

It provides paid services for end-to-end observability and control of all your data. Some of its features include monitoring, testing, and checking data fitness.
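Whichever tool you choose, the checks themselves are often simple rules applied to every record. As a rough sketch of what data testing means for image annotations, the snippet below validates image size, label vocabulary, and bounding-box bounds. The record layout and function names are assumptions made for this example and do not mirror the API of any tool listed above.

```python
def check_annotation(record, allowed_labels):
    """Return a list of human-readable problems found in one annotation record."""
    problems = []
    width, height = record["width"], record["height"]
    if width <= 0 or height <= 0:
        problems.append("image has non-positive size")
    if record["label"] not in allowed_labels:
        problems.append(f"unknown label: {record['label']}")
    xmin, ymin, xmax, ymax = record["bbox"]
    if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
        problems.append("bounding box is empty or outside the image")
    return problems

def validate_dataset(records, allowed_labels):
    """Map record index -> problems, keeping only the records that failed a check."""
    report = {}
    for index, record in enumerate(records):
        problems = check_annotation(record, allowed_labels)
        if problems:
            report[index] = problems
    return report

# Hypothetical annotations: one valid, one with a bad label, one with a bad box.
records = [
    {"width": 640, "height": 480, "label": "car", "bbox": (10, 20, 200, 220)},
    {"width": 640, "height": 480, "label": "boat", "bbox": (10, 20, 200, 220)},
    {"width": 640, "height": 480, "label": "car", "bbox": (600, 20, 700, 220)},
]
report = validate_dataset(records, {"car", "person"})
```

Running such checks before every training run is one way to keep bad labels from silently degrading a model.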

Data processing

Data processing and having a pipeline for data are important. There are three key elements:

a source,
processing step(s),
a destination, from which the data is fed to the model.
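These three elements can be sketched with plain Python generators, where each stage pulls records lazily from the previous one. The stage names, the fake pixel values, and the file names are placeholders chosen for this example.

```python
def source(paths):
    """Source: yield raw records. Real code would read image files from disk or a bucket."""
    for path in paths:
        yield {"path": path, "pixels": [0, 128, 255]}

def only_jpeg(records):
    """Processing step 1: drop files that are not JPEG images."""
    for record in records:
        if record["path"].lower().endswith((".jpg", ".jpeg")):
            yield record

def normalize(records):
    """Processing step 2: scale 8-bit pixel values into the range [0, 1]."""
    for record in records:
        yield {**record, "pixels": [value / 255 for value in record["pixels"]]}

def destination(records):
    """Destination: collect processed records into the batch handed to the model."""
    return list(records)

batch = destination(normalize(only_jpeg(source(["a.jpg", "b.png", "c.JPEG"]))))
```

Because every stage is a generator, nothing is loaded until the destination asks for it, which keeps memory use flat even for large image folders.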

Data pipeline architectures require many considerations. To know more about data pipelines and their architectures, read this article.

Here are some of the tools you should consider for data processing and data pipelines:

Apache Storm

Open-sourced by Twitter, this distributed computation system reliably processes unbounded streams of data in real time, doing for stream processing what Hadoop did for batch processing. It consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of computation.

Dagster

A data orchestrator for machine learning, analytics, and ETL (Extract, Transform, Load). It lets you define pipelines in terms of dataflows between logical components called solids. These pipelines are built locally and can be run anywhere.

Apache Spark

This open-source, flexible in-memory framework serves as an alternative to MapReduce for handling batch processing, real-time analytics, and other data processing workloads.

Feature store

Developing an ML pipeline is different from developing software, mainly from the data point of view. The quality of data or the features is as important as the quality of the code. Thanks to feature stores, we can reuse features instead of rebuilding them for different models. This automation is an important part of MLOps.
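The core idea of feature reuse can be illustrated with a toy in-memory store: a feature is defined once and then served identically to training and inference code. This is only a sketch of the concept. The class and method names are invented here and are not the API of butterfree, Feast, or any other library mentioned in this section.

```python
class FeatureStore:
    """Toy in-memory feature store: register a feature once, reuse it everywhere."""

    def __init__(self):
        self._definitions = {}  # feature name -> function computing it from a raw record
        self._cache = {}        # (entity id, feature name) -> computed value

    def register(self, name, compute):
        self._definitions[name] = compute

    def get_features(self, entity_id, record, names):
        """Return the requested features, computing each one at most once per entity."""
        row = {}
        for name in names:
            key = (entity_id, name)
            if key not in self._cache:
                self._cache[key] = self._definitions[name](record)
            row[name] = self._cache[key]
        return row

store = FeatureStore()
store.register("aspect_ratio", lambda r: r["width"] / r["height"])
store.register("megapixels", lambda r: r["width"] * r["height"] / 1e6)

image = {"width": 1920, "height": 1080}
# The training job and the serving code request features from the same definitions.
training_row = store.get_features("img_001", image, ["aspect_ratio", "megapixels"])
serving_row = store.get_features("img_001", image, ["aspect_ratio"])
```

Because both callers go through the same registered definition, the value seen at inference time cannot drift from the one used in training.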

butterfree

This feature store building library is centered around the following concepts:

ETL: a central framework to create data pipelines. Spark-based Extract, Transform and Load modules ready to use.
Declarative Feature Engineering: care about what you want to compute and not how to code it.
Feature Store Modeling: the library easily provides everything you need to process and load data to your feature store.

Feast

This open-source, easy-to-use feature store is the fastest path to operationalizing analytic data for model training. The key features are:

It calls in your offline data so that it’s available for real-time predictions, without custom pipelines.
It eliminates training-serving skew, by guaranteeing that the same data is fed during training and inference.
It reuses the existing infrastructure and spins up new resources when needed as it runs on top of cloud-managed services.

Computer vision tools - Feast — *Fig: Workflow with Feast [Source]*

SageMaker

The Amazon SageMaker Feature Store is a purpose-built repository where you can store and access features, so it’s much easier to name, organize, and reuse them across teams without the need to write additional code or create manual processes to keep features consistent.

Computer vision tools - SageMaker — *Fig: Other features of Amazon SageMaker [Source]*

Bytehub

Another notable feature store for time series data. This open-source, convenient, Python-based feature store provides:

Pandas-like interface for data and features,
A time series for each feature,
Database for metadata storage,
Data storage location for each namespace,
Compatibility with popular tools, like Jupyter Notebooks.

Model development

Model registry

The core idea of building an ML model is to keep improving the model, which is why MLOps adds another critical pillar of continuity called continuous training (in addition to continuous integration and continuous delivery). The model registry helps you achieve this task by maintaining model lineage, source code versioning, and annotations. Here are some of the tools you can use to store this information, along with their key features to help you choose:

MLflow

Share ML models with the team.
Work together from experimentation to online testing, production, and so on.
Integrate with approval and governance workflows.
Monitor ML deployments and their performance.

Computer vision tools - MLflow — *Fig: MLflow workflow. [Source]*

neptune.ai

Use software-as-a-service eliminating the need to deploy on your hardware.
Easy integration with all workflows.
Notebook versioning and diffing.

SageMaker model registry

Catalog models for production.
Associate metadata, such as training metrics, with a model.
Manage model versions and the approval status of a model.
Automate model deployment with CI/CD.
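To show what a registry keeps track of, here is a small sketch of versioned model entries with metrics, a source code reference, and an approval status. It is a conceptual illustration only. The class names, the example URIs, and the commit hashes are hypothetical and unrelated to the MLflow, Neptune, or SageMaker APIs.

```python
from dataclasses import dataclass, field

@dataclass
class ModelVersion:
    version: int
    artifact_uri: str
    metrics: dict
    git_commit: str
    status: str = "pending"  # pending -> approved

@dataclass
class ModelRegistry:
    versions: dict = field(default_factory=dict)  # model name -> list of ModelVersion

    def register(self, name, artifact_uri, metrics, git_commit):
        """Append a new version; version numbers start at 1 and only grow."""
        history = self.versions.setdefault(name, [])
        entry = ModelVersion(len(history) + 1, artifact_uri, metrics, git_commit)
        history.append(entry)
        return entry

    def approve(self, name, version):
        self.versions[name][version - 1].status = "approved"

    def latest_approved(self, name):
        """Return the newest approved version, or None if nothing is approved yet."""
        approved = [v for v in self.versions.get(name, []) if v.status == "approved"]
        return approved[-1] if approved else None

registry = ModelRegistry()
registry.register("detector", "s3://example-bucket/detector/1", {"mAP": 0.61}, "abc123")
registry.register("detector", "s3://example-bucket/detector/2", {"mAP": 0.64}, "def456")
registry.approve("detector", 1)
production_model = registry.latest_approved("detector")
```

Note that version 2 has the better metric but is not served until someone approves it, which is the kind of gate a CI/CD step would check before deployment.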

Model training

One doesn’t start a computer vision project without knowing about OpenCV and Scikit-learn. Some of the torch bearers in model-building are TensorFlow, PyTorch, and Keras. You can choose any of them depending on the project or simply your own preference. If you have trouble choosing the right framework for your computer vision project, please look into this blog.

If you aren’t interested in building your own model or fine-tuning popular models, here are some of the computer vision platforms with pre-trained models that you may find useful:

Vision AI

Google Cloud’s Vision API offers powerful pre-trained machine learning models through REST and RPC APIs.
Assign labels to images and quickly classify them into millions of predefined categories.
Detect objects and faces, read printed and handwritten text, and build valuable metadata into your image catalog.

DINO

PyTorch implementation and pretrained models of self-supervised vision transformers.
The model can discover and segment objects in an image or a video with absolutely no supervision and without being given a segmentation-targeted objective.
Features are easily interpretable, suggesting that this class of models is capable of a higher level of image understanding.

Amazon Rekognition

Identifies objects, people, text, scenes, and activities in images and videos, and detects inappropriate content.
Provides algorithms pretrained on data collected by Amazon
You can build algorithms that you can train on a custom dataset.

Metadata management

When each component of your ML pipeline makes a decision, it’s passed on to the next. Metadata stores record and retrieve these decisions, the hyperparameters used, and the data used at different steps. Here are some of the libraries you can use as metadata stores:

neptune.ai

Neptune supports logging and displaying many different types of model-building metadata, like metrics and losses, parameters and model weights, and much more. You can check out their live Notebook to learn more.

SiaSearch

SiaSearch is a platform for efficiently exploring vision data based on metadata. It streamlines data identification and searches for difficulties to help increase efficiency. Some of its features are:

Structures the data by automatically creating custom interval attributes.
Use custom attributes to query, find rare edge cases, and curate new training data.
Easily save, edit, version, comment, and share frame sequences with colleagues.
Visualize data and analyze model performance using custom attributes.

Computer vision tools - SiaSearch — *Fig: Features of SiaSearch. [Source]*

TensorFlow’s ML Metadata (MLMD)

MLMD registers the following types of metadata in its database:

Artifacts generated throughout the steps of your ML pipeline.
Metadata about the execution of these components.
Lineage information.
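The three kinds of metadata above (artifacts, executions, and lineage) can be modeled in a few lines. The sketch below is a conceptual illustration with invented class names and paths; it does not use or reproduce the MLMD API.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Artifact:
    name: str
    uri: str

@dataclass
class Execution:
    step: str
    params: dict
    inputs: list
    outputs: list

@dataclass
class MetadataStore:
    executions: list = field(default_factory=list)

    def record(self, step, params, inputs, outputs):
        """Log one pipeline step together with what it consumed and produced."""
        self.executions.append(Execution(step, params, inputs, outputs))

    def lineage(self, artifact):
        """Walk backwards from an artifact to every upstream artifact behind it."""
        upstream = []
        for execution in self.executions:
            if artifact in execution.outputs:
                for parent in execution.inputs:
                    upstream.append(parent)
                    upstream.extend(self.lineage(parent))
        return upstream

raw = Artifact("raw_images", "data/raw")
resized = Artifact("resized_images", "data/resized")
model = Artifact("model", "models/v1")

store = MetadataStore()
store.record("resize", {"size": 224}, inputs=[raw], outputs=[resized])
store.record("train", {"lr": 0.001}, inputs=[resized], outputs=[model])
model_lineage = [artifact.name for artifact in store.lineage(model)]
# model_lineage == ["resized_images", "raw_images"]
```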

Hyperparameter tuning

Hyperparameter tuning gives you the best version of your ML model. If you want to know all about the importance of hyperparameters with different ways to optimize them, have a look at this book by Tanay Agrawal. Here’s a brief list of tools available for hyperparameter tuning:

Hyperopt

Hyperopt lets users describe the search space. By providing more information about where your function is defined, and where you think the best values are, you allow algorithms in Hyperopt to search more efficiently.

Dask

Dask-ML classes provide a solution to two of the most common issues in hyperparameter optimization:

When the hyperparameter search is memory constrained,
Or when the search is compute constrained.

Optuna

This highly modular hyperparameter optimization framework lets users dynamically construct the search spaces. Some of the key features of Optuna are:

Lightweight architecture,
Pythonic search spaces,
Easy parallelization.
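To illustrate what describing a search space means, here is a baseline random search written with the standard library only. Note the swap: Hyperopt and Optuna offer smarter samplers, such as Tree-structured Parzen Estimators, whereas this sketch samples blindly. The objective function is a made-up stand-in for a real train-and-validate run.

```python
import random

# Each hyperparameter maps to a function that samples one value from its range.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, -1),  # log-uniform in [1e-4, 1e-1]
    "batch_size": lambda rng: rng.choice([16, 32, 64, 128]),
}

def validation_loss(params):
    """Hypothetical objective standing in for a real train-and-validate run."""
    return (params["learning_rate"] - 0.01) ** 2 + abs(params["batch_size"] - 64) / 1000

def random_search(objective, space, n_trials, seed=0):
    """Sample n_trials configurations and keep the one with the lowest loss."""
    rng = random.Random(seed)  # seeded so that the search is reproducible
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in space.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

best_params, best_loss = random_search(validation_loss, SEARCH_SPACE, n_trials=50)
```

Sampling the learning rate on a log scale is a common design choice, because useful values often span several orders of magnitude.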

Version control

Version control, also known as source control, tracks and manages changes to the source code. It enables teams to work on and contribute to the same code. At the moment, Git, together with hosting platforms such as GitHub and GitLab, is the de facto standard for version control. While working on your computer vision project with large data, you may face an issue with Git, as it was made for versioning application code, not for storing large amounts of data or supporting ML pipelines. DVC addresses this issue and also works for application code versioning. If you find DVC too rigid and want a better UI and more features, such as visibility into the team’s progress at any time, Neptune can prove to be a great advantage.

See how Neptune compares with Data Version Control

Operationalization

Model serving

Model serving is usually the last step of the ML model pipeline. After creating your CV model, you have to decide where to deploy your model, i.e. which platform will you use for serving your model? You can choose on-prem servers but it can get extremely costly and difficult to manage at scale. Here are some of the tools for model serving and deployment:

BentoML

A framework for serving, managing, and deploying machine learning models. Some of the key features of BentoML include:

Support for multiple ML frameworks like PyTorch, TensorFlow, and many more.
Containerized model server for production deployment with Docker, Kubernetes, OpenShift, etc.
Adaptive micro-batching for optimal online serving performance.
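Micro-batching means grouping several incoming requests into one model call, which usually makes better use of the hardware than predicting one input at a time. The sketch below shows only the grouping step with a fixed batch size. Adaptive implementations also adjust batch size and wait time at runtime. The function names and the fake model are assumptions for this example and are not BentoML code.

```python
def micro_batches(requests, max_batch_size):
    """Group pending requests into batches of at most max_batch_size, preserving order."""
    for start in range(0, len(requests), max_batch_size):
        yield requests[start:start + max_batch_size]

def batched_predict(batch):
    """Hypothetical model call: one invocation handles a whole batch of inputs."""
    return [len(image_bytes) % 2 for image_bytes in batch]

def serve(requests, max_batch_size=4):
    """Run all requests through the model and count how many model calls were needed."""
    predictions, model_calls = [], 0
    for batch in micro_batches(requests, max_batch_size):
        predictions.extend(batched_predict(batch))
        model_calls += 1
    return predictions, model_calls

# Ten fake "images" of 1 to 10 bytes each.
pending = [b"x" * size for size in range(1, 11)]
predictions, model_calls = serve(pending, max_batch_size=4)
# Ten requests are answered with three model calls instead of ten.
```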

Read the Quickstart Guide to learn more about the basic functionalities of BentoML. You can also try it out on Google Colab.

TensorFlow Serving

A flexible system for machine learning models, designed for production environments. Some of the key features are:

Allows deployment of new model versions without changing your code.
Supports many servables.
The size and granularity of a servable are flexible.

Computer vision tools - TF Serving — *Fig: Serve models in production with TensorFlow Serving [Source]*

Seldon

It focuses on solving the last step in any machine learning project to help companies put models into production, solve real-world problems and maximize the return on investment.

Serves models built on any open-source or commercial model building format.
It handles scaling to thousands of production machine learning models.
Provides an easy way to containerize ML models using pre-packaged inference servers, custom servers, or language wrappers.

CI/CD/CT

Unlike in a software application, training data in ML plays a crucial part. You can’t test model validity without training. This is why the usual CI/CD process is enhanced for ML development by adding continuous training to it. Here are some of the best libraries I could find for implementing CI/CD in your machine learning pipeline:

CML (Continuous Machine Learning)

CML can be used to automate parts of your development workflow, including model training and evaluation, comparing ML experiments across your project history, and monitoring datasets.

Computer vision tools - CML — *Fig: on every pull request, CML helps you automatically train and evaluate models, then generates a visual report with results and metrics, like the report above [Source]*

CircleCI

CircleCI can be configured to efficiently run complex pipelines with sophisticated caching, docker layer caching, resource classes for running on faster machines, and performance pricing.

Computer vision tools - CircleCI — *Fig: Overview of CircleCI [Source]*

Travis CI

Travis CI is a hosted continuous integration service used to build and test projects. It also offers custom deployments of a proprietary version on your own hardware.


Monitoring

Model monitoring lets you tweak and improve your model in production continuously. In a highly-developed MLOps workflow, this should be an active process. There are three aspects to monitoring:

Technical/system monitoring checks if the model infrastructure is served correctly or not.
Model monitoring continuously validates the predictions’ accuracy.
Business performance monitoring comes down to whether the model is helping the business or not. Monitoring the impact of changes in a release is necessary.
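The second aspect, model monitoring, can be sketched as a rolling accuracy check over the most recent predictions for which ground truth has arrived. The class name, the window size, and the threshold are illustrative assumptions, not taken from any of the tools below.

```python
from collections import deque

class AccuracyMonitor:
    """Track accuracy over the last window_size labeled predictions."""

    def __init__(self, window_size, alert_below):
        self.outcomes = deque(maxlen=window_size)  # old outcomes fall out automatically
        self.alert_below = alert_below

    def update(self, prediction, ground_truth):
        """Record one prediction once its true label is known; return current accuracy."""
        self.outcomes.append(prediction == ground_truth)
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        """Alert only when the window is full and accuracy dropped below the threshold."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return sum(self.outcomes) / len(self.outcomes) < self.alert_below

monitor = AccuracyMonitor(window_size=4, alert_below=0.75)
for predicted, actual in [("cat", "cat"), ("dog", "dog"), ("cat", "cat"), ("dog", "dog")]:
    monitor.update(predicted, actual)
alert_while_healthy = monitor.should_alert()

# The input distribution shifts and the model starts misclassifying dogs as cats.
for predicted, actual in [("cat", "dog"), ("cat", "dog")]:
    latest_accuracy = monitor.update(predicted, actual)
alert_after_shift = monitor.should_alert()
```

In production, ground-truth labels often arrive late or only for a sample of predictions, so a monitor like this is usually combined with checks on the input data itself.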

I’ll share four tools that you can use for production monitoring. If you’re looking for a more extensive list of ML monitoring tools, please go to this article.

neptune.ai

It’s one of the most lightweight tools for monitoring your model. It can version, store, organize, and query models, and model development metadata. Here are some of the features that make it unique:

It helps you organize your work better, by filtering, sorting and grouping model training runs in the dashboard.
It helps you compare metrics and parameters in a table that automatically finds what changed between runs and the anomalies.
The whole team can track experiments that are executed in scripts and do that on any infrastructure (cloud, laptop, cluster).

Fiddler

Fiddler helps you monitor model performance, explain and debug model predictions, analyze model behavior for entire data and slices, deploy machine learning models at scale, and manage your machine learning models and datasets.

Computer vision tools - Fiddler — *Fig: Features fiddler helps with monitoring [Source]*

Evidently

Evidently helps in analyzing machine learning models in production or for validation.

Computer vision tools - Evidently — *Fig: Example of an interactive report generated by evidently to help analyze, monitor, and debug [Source]*

Amazon SageMaker Model Monitor

One of the tools in SageMaker, it helps you maintain high-quality machine learning models by automatically detecting and alerting on inaccurate predictions from models deployed in production.

Automation

Until now, I’ve discussed libraries and tools you can use for a single component or a few components of your ML pipeline. But if you’re keen on automating the entire process right from collecting and managing data to deploying it and monitoring it in production, there are tools for that as well!

We will look into two types of automation and the tools associated with them:

AutoML

Various AutoML frameworks automate various steps of the machine learning lifecycle. The tools I curated below free the user from algorithm selection and hyperparameter tuning, and build a model ready for deployment.

Custom Vision

It’s an end-to-end platform for applying computer vision for a specific scenario.

Customize and embed state-of-the-art computer vision image analysis for specific domains.
Run Custom Vision in the cloud or on the edge in containers.
Rely on enterprise-grade security and privacy for your data and any trained models.

AutoML Vision

AutoML Vision enables you to train machine learning models to classify your images according to your own defined labels.

Train models from labeled images and evaluate their performance.
Leverage a human labeling service for datasets with unlabeled images.
Register trained models for serving through the AutoML API.

TPOT

TPOT is a tree-based pipeline optimization tool that uses genetic algorithms to optimize machine learning pipelines.

It is built on top of scikit-learn and uses its regressor and classifier methods.
It expects a clean dataset.
It does feature processing, model selection, and hyperparameter optimization to return the best-performing model.

Computer vision tools - TPOT — *Fig: an example machine learning pipeline devised by TPOT [Source]*

auto-sklearn

Auto-sklearn is an automated machine learning toolkit. It’s a drop-in replacement for a scikit-learn estimator. Some of the things to keep in mind are:

It’s simple to use, with minimal code. Example.
Auto-sklearn library works well with small and medium datasets, and not with large datasets.
It doesn’t train advanced deep learning models that might perform well with large datasets.

Computer vision tools - auto sklearn — *Fig: Auto-sklearn pipeline [Source]*


MLBox

MLBox is a powerful automated machine learning Python library. Some of its features include:

Fast reading and distributed data preprocessing.
Robust feature selection and leak detection.

Computer vision tools - MLbox — *Fig: representation of how MLBox works [Source]*

MLOps automation

Algorithmia

Algorithmia is an enterprise-based MLOps platform that accelerates your research and delivers models quickly, securely, and cost-effectively.

You can securely deploy, serve, manage and monitor all your machine learning workloads.
It uses an automated machine learning pipeline for version control, automation, logging, auditing, and containerization. You can easily access KPIs, performance metrics, and data for monitoring.
Its features will show you a clear picture of risk, compliance, cost, and performance.

Computer vision tools - Algorithmia — *Fig: How Algorithmia operates [Source]*

Kubeflow

Kubeflow is an open-source and free machine learning Kubernetes-native platform for developing, orchestrating, deploying, and running scalable and portable machine learning workloads.


Azure Machine Learning

Azure ML is a cloud-based platform that can be used to train, deploy, automate, manage, and monitor all your machine learning experiments. It has its own robust open-source MLOps platform which simplifies ML creation and deployment. To know more, visit Azure’s documentation.

Computer vision tools - Azure ML — *Fig: Features of Azure Machine Learning [Source]*

Gradient

Gradient provides lightweight software and infrastructure for model development, collaboration, and deployment. Some of the services provided by Gradient are:

Full reproducibility with automatic versioning, tagging, and lifecycle management.
Automated pipelines: Gradient acts as CI/CD for machine learning.
Instant scalability: train in parallel and easily scale deployed models.

Computer vision tools - Gradient — *Fig: Workflow of Gradient [Source]*

Conclusion

In conclusion, implementing your computer vision project in production doesn’t mean only deploying the model as an API for production.

Rather, it means deploying an ML pipeline that is automated at various stages and helps improve your models faster. Setting up a CI/CD system enables you to automatically test and deploy new pipeline implementations.

This system will help you cope with rapid changes in your data and business. It doesn’t mean you’ll move all your processes immediately from one level to another. Don’t rush it! You should gradually implement these practices to help improve the automation of your ML system development and production.