The lifecycle of an app or software system (also known as SDLC) has several main stages:
Then again, back to new releases with features, updates, and/or fixes as needed.
To carry out these processes, software development relies on DevOps to streamline development while continuously delivering new releases and maintaining quality.
The workflow for computer vision models, or any machine learning models, also follows a similar pattern. Where it differs from traditional software development is in the environment where it operates.
For example, computer vision is prominently data-driven, and hence non-deterministic in behavior. Also, as recent global events have shown, our world is constantly changing, so CV practitioners must expect that the real-world data that powers their models will inevitably change too.
Hence, ML practitioners have embraced DevOps practices for ML models and come up with MLOps. In Wikipedia’s terms, MLOps is the process of taking experimental machine learning models into production. [source]
In this blog, our primary focus is MLOps for computer vision (if you’re more interested in the importance of ModelOps, you can check out this blog).
What are the key differences between MLOps and DevOps?
There’s a bit more to MLOps than to DevOps. Is all of that worth doing?
Why should you do MLOps, and not just focus on creating a better ML model?
You can implement and train a model with decent performance and accuracy without MLOps. However, the real challenge is building an integrated ML system to continuously operate it in production.
The following diagram shows that only a small fraction of a real-world ML system is composed of ML code. The required surrounding elements are vast and complex.
The above diagram shows the steps you would follow for your CV project. Now, the level of automation of these steps defines the maturity of the CV project.
What’s the maturity about? Let’s look at the three levels of MLOps briefly laid down by Google in “MLOps: Continuous delivery and automation pipelines in machine learning”:
- MLOps level 0
Here, developers manually control all the steps, from collecting data to building and deploying the model. The process is usually carried out through experimental code written and executed in notebooks until a workable model is produced. At this level, a trained model is deployed as a production service.
- MLOps level 1
In level 1, the whole ML pipeline is automated, including the process of retraining models in production on new data. For this to work, the data validation, model validation, and metadata management steps also need to be automated. At this level, you deploy a whole training pipeline that runs automatically and recurrently to serve the trained model as a prediction service.
- MLOps level 2
In level 2, the CI/CD pipeline is automated along with the ML pipeline. The automated CI/CD lets you explore new ideas around feature engineering or model architecture, and implement them easily with automated pipeline building, testing, and deployment.
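To make the level 1 idea more concrete, here is a minimal sketch of an automated retraining step with a data validation gate and a model validation gate. The function names, the required sample fields, and the "promote only if better than production" rule are illustrative assumptions, not part of Google’s definition.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Candidate:
    model: object
    score: float  # validation metric, higher is better


def validate_data(batch: Sequence[dict], required_keys: Sequence[str]) -> bool:
    """Data validation gate: the batch is non-empty and every sample has all required fields."""
    return len(batch) > 0 and all(key in sample for sample in batch for key in required_keys)


def run_retraining(
    batch: Sequence[dict],
    train_fn: Callable[[Sequence[dict]], object],
    evaluate_fn: Callable[[object], float],
    production_score: float,
    required_keys: Sequence[str] = ("image", "label"),  # hypothetical schema
) -> Optional[Candidate]:
    """Return a candidate to promote, or None if any gate fails."""
    if not validate_data(batch, required_keys):
        return None  # bad data never reaches training
    model = train_fn(batch)
    score = evaluate_fn(model)
    if score <= production_score:
        return None  # model validation gate: only promote improvements
    return Candidate(model=model, score=score)


# Usage with stand-in training and evaluation functions (hypothetical).
batch = [{"image": "img_0.png", "label": "cat"}, {"image": "img_1.png", "label": "dog"}]
candidate = run_retraining(
    batch,
    train_fn=lambda samples: {"n_samples": len(samples)},
    evaluate_fn=lambda model: 0.91,
    production_score=0.88,
)
print(candidate)
```

In a real pipeline, a scheduler or a "new data arrived" event would call `run_retraining`, and the returned candidate would be handed to the deployment step.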
Now that we know the importance of MLOps, let’s look at each stage and component of your computer vision project in detail, along with all the associated MLOps tools.
Data and feature management
One of the largest open-source datasets of face images with gender and age labels. It consists of 523,051 face images.
One of the largest datasets with a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with 5,000 high-quality annotated frames in addition to a larger set of 20,000 weakly annotated frames.
It consists of 60,000 training and 10,000 test samples, each a 28 × 28 grayscale image associated with a label from one of 10 classes.
The dataset consists of 650,000 10-second video clips from YouTube, covering around 700 human action classes with at least 600 video clips in each class. The clips span a wide range of human-focused actions.
This open-source dataset has densely labeled video clips of humans performing pre-defined basic actions with everyday objects. It was created for CV models to develop a good understanding of basic actions that occur in the physical world. The total number of videos included is 220,847, where 168,913 is the training set, 24,777 is the validation set, and 27,157 is the test set.
If you want to build your own dataset, here are some of the tools commonly used in computer vision that can help with dataset creation:
- It’s a graphical image annotation tool.
- Written in Python, with Qt for the graphical interface.
- Annotations saved in XML in PASCAL VOC format, the format used by ImageNet. Also supports YOLO and CreateML formats.
- It’s a free, online, interactive video and image annotation tool.
- Has deep learning serverless functions for automatic labeling
- It’s an open-source data labeling tool for images, videos, audio, etc with a simple UI.
- Exports to various model formats
- Apart from creating raw data, can be used for improving existing data.
- VoTT is a React + Redux Web application, written in TypeScript.
- Extensible model for importing and exporting data.
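Since PASCAL VOC XML and YOLO come up above as annotation export formats, here is a small sketch of reading a VOC-style annotation with the Python standard library and converting each box to a YOLO-style line (class id, then normalized center x, center y, width, height). The sample XML, the file name in it, and the `class_ids` mapping are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

Box = Tuple[str, int, int, int, int]  # (label, xmin, ymin, xmax, ymax)

# A made-up annotation in PASCAL VOC layout.
VOC_XML = """<annotation>
  <filename>street_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>300</xmin><ymin>120</ymin><xmax>380</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text: str) -> Tuple[int, int, List[Box]]:
    """Return image width, image height, and the labeled boxes."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    boxes: List[Box] = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        coords = [int(obj.findtext(f"bndbox/{tag}")) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return width, height, boxes


def to_yolo(box: Box, width: int, height: int, class_ids: Dict[str, int]) -> str:
    """Convert a pixel-coordinate box to a normalized YOLO-style text line."""
    label, xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2 / width
    y_center = (ymin + ymax) / 2 / height
    box_w = (xmax - xmin) / width
    box_h = (ymax - ymin) / height
    return f"{class_ids[label]} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"


width, height, boxes = parse_voc(VOC_XML)
class_ids = {"car": 0, "person": 1}  # hypothetical class index
for box in boxes:
    print(to_yolo(box, width, height, class_ids))
```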
Along with doing the data versioning quite well with great UI, Neptune allows you to log visual snapshots of your image directories, which can prove quite helpful for your computer vision project.
Designed to handle large files and datasets along with machine learning models and metrics.
With features like unlimited exports, universal hosting, and labeling and annotation tools, Roboflow is great for your computer vision projects.
Dataiku is for people who prefer to work with code rather than visual tools to manipulate, transform and model data.
To check the quality assurance of your image or video data, here are some of the tools:
Scale accelerates the development of AI applications by helping computer vision teams generate high-quality ground truth data. Their advanced LiDAR, video, and image annotation APIs allow self-driving, drone, and robotics teams to focus on building differentiated models vs. labeling data. It offers paid services for high-quality training and validation data for all AI applications, both for on-demand and enterprise customers.
Great Expectations is open-source and helps data teams eliminate pipeline debt, through data testing, documentation, and profiling.
It provides paid services for end-to-end observability and control of all your data. Some of its features include monitoring, testing, and checking data fitness.
Data processing and having a pipeline for data are important. There are three key elements:
- a source,
- processing step(s),
- a destination, from which the data is fed to the model.
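As a rough illustration of these three elements, the sketch below chains a source, two processing steps, and a destination using plain Python generators. The tiny nested lists stand in for grayscale images, and all function names are hypothetical.

```python
from typing import Iterable, Iterator, List

Image = List[List[int]]  # grayscale image as rows of 0-255 pixel values


def read_source(images: Iterable[Image]) -> Iterator[Image]:
    """Source: yield raw images one at a time (here, from an in-memory list)."""
    for image in images:
        yield image


def drop_empty(images: Iterable[Image]) -> Iterator[Image]:
    """Processing step: skip images that have no pixels."""
    for image in images:
        if image and image[0]:
            yield image


def normalize(images: Iterable[Image]) -> Iterator[List[List[float]]]:
    """Processing step: scale pixel values to the range [0, 1]."""
    for image in images:
        yield [[pixel / 255.0 for pixel in row] for row in image]


def load_destination(images: Iterable[List[List[float]]]) -> List[List[List[float]]]:
    """Destination: collect processed images where the model can read them."""
    return list(images)


raw = [[[0, 255], [51, 102]], [], [[255, 255]]]
dataset = load_destination(normalize(drop_empty(read_source(raw))))
print(len(dataset))  # the empty image was dropped, so 2 remain
```

Because each stage is a generator, images flow through one at a time, which keeps memory use flat even when the source is large.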
Data pipeline architectures require many considerations. To know more about data pipelines and their architectures, read this article.
Here are some of the tools you should consider for data processing and data pipelines:
Built by Twitter, this open-source data orchestrator reliably processes unbounded streams of data in real time. It consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of computation.
A data orchestrator for machine learning, analytics, and ETL (Extract, Transform, Load). It lets you define pipelines in terms of dataflows between logical components called solids. These pipelines are built locally and can be run anywhere.
This open-source and flexible in-memory framework serves as an alternative to map-reduce for handling batch, real-time analytics, and data processing workloads.
Developing an ML pipeline is different from developing software, mainly from the data point of view. The quality of data or the features is as important as the quality of the code. Thanks to feature stores, we can reuse features instead of rebuilding them for different models. This automation is an important part of MLOps.
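The idea of reusing features can be sketched in a few lines: feature definitions are registered once, computed once per entity, and then read back identically for training and serving. This in-memory class is a toy illustration under those assumptions, not the API of any feature store product.

```python
from typing import Callable, Dict, List, Tuple


class FeatureStore:
    """Toy in-memory feature store: define once, compute once, read many times."""

    def __init__(self) -> None:
        self._definitions: Dict[str, Callable[[dict], float]] = {}
        self._values: Dict[Tuple[str, str], float] = {}

    def register(self, name: str, fn: Callable[[dict], float]) -> None:
        """Register a named feature computed from a raw record."""
        self._definitions[name] = fn

    def materialize(self, entity_id: str, raw: dict) -> None:
        """Compute and store every registered feature for one entity."""
        for name, fn in self._definitions.items():
            self._values[(entity_id, name)] = fn(raw)

    def get(self, entity_id: str, names: List[str]) -> List[float]:
        """Read stored feature values; training and serving share this path."""
        return [self._values[(entity_id, name)] for name in names]


store = FeatureStore()
store.register("aspect_ratio", lambda raw: raw["width"] / raw["height"])
store.register("megapixels", lambda raw: raw["width"] * raw["height"] / 1_000_000)
store.materialize("img_001", {"width": 1920, "height": 1080})

training_row = store.get("img_001", ["aspect_ratio", "megapixels"])
serving_row = store.get("img_001", ["aspect_ratio", "megapixels"])
print(training_row == serving_row)
```

Reading both rows through the same `get` call is what removes the chance of training and serving computing a feature in two slightly different ways.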
This feature store building library is centered around the following concepts:
- ETL: a central framework to create data pipelines. Spark-based Extract, Transform and Load modules ready to use.
- Declarative Feature Engineering: care about what you want to compute and not how to code it.
- Feature Store Modeling: the library easily provides everything you need to process and load data to your feature store.
This open-source, easy-to-use feature store is the fastest path to operationalizing analytic data for model training. The key features are:
- It calls in your offline data so that it’s available for real-time predictions, without custom pipelines.
- It eliminates training-serving skew, by guaranteeing that the same data is fed during training and inference.
- It reuses the existing infrastructure and spins up new resources when needed as it runs on top of cloud-managed services.
The Amazon SageMaker Feature Store is a purpose-built repository where you can store and access features, so it’s much easier to name, organize, and reuse them across teams without the need to write additional code or create manual processes to keep features consistent.
Another notable feature store for time series data. This open-source, convenient, Python-based feature store provides:
- Pandas-like interface for data and features,
- A time series for each feature,
- Database for metadata storage,
- Data storage location for each namespace,
- Compatibility with popular tools, like Jupyter Notebooks.
The core idea of building an ML model is to keep improving it, which is why MLOps adds another critical pillar of continuity called continuous training (in addition to continuous integration and continuous delivery). The model registry helps you achieve this by maintaining model lineage, source code versioning, and annotations. Here are some of the tools you can use to store this information, along with their key features to help you choose:
- Share ML models with the team.
- Work together from experimentation through online testing, production, and so on.
- Integrate with approval and governance workflows.
- Monitor ML deployments and their performance.
- Use software-as-a-service eliminating the need to deploy on your hardware.
- Easy integration with all workflows.
- Notebook versioning and diffing.
- Catalog models for production.
- Associate metadata, such as training metrics, with a model.
- Manage model versions and the approval status of a model.
- Automate model deployment with CI/CD.
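Several of the features listed above (model versions, associated metadata such as training metrics, and approval status) fit in a small data structure. The sketch below is a toy in-memory registry; the class names, status values, and artifact URIs are assumptions made for illustration and do not mirror any specific tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

ALLOWED_STATUSES = {"pending", "approved", "rejected"}


@dataclass
class ModelVersion:
    version: int
    artifact_uri: str
    metrics: Dict[str, float] = field(default_factory=dict)
    status: str = "pending"


class ModelRegistry:
    """Toy registry: versioned entries per model name, with approval status."""

    def __init__(self) -> None:
        self._models: Dict[str, List[ModelVersion]] = {}

    def register(self, name: str, artifact_uri: str, metrics: Dict[str, float]) -> ModelVersion:
        """Add a new version; version numbers start at 1 and increase by 1."""
        versions = self._models.setdefault(name, [])
        entry = ModelVersion(version=len(versions) + 1, artifact_uri=artifact_uri, metrics=dict(metrics))
        versions.append(entry)
        return entry

    def set_status(self, name: str, version: int, status: str) -> None:
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {status}")
        self._models[name][version - 1].status = status

    def latest_approved(self, name: str) -> Optional[ModelVersion]:
        """Return the newest approved version, or None if nothing is approved."""
        approved = [v for v in self._models.get(name, []) if v.status == "approved"]
        return approved[-1] if approved else None


registry = ModelRegistry()
registry.register("defect-detector", "models/defect-detector/1", {"val_accuracy": 0.91})
registry.register("defect-detector", "models/defect-detector/2", {"val_accuracy": 0.94})
registry.set_status("defect-detector", 1, "approved")
to_deploy = registry.latest_approved("defect-detector")
print(to_deploy.version, to_deploy.metrics)
```

A CI/CD job could call `latest_approved` to decide which artifact to deploy, so that version 2 only ships once someone (or an automated check) approves it.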
One doesn’t start a computer vision project without knowing about OpenCV and Scikit-learn. Some of the torchbearers in model building are TensorFlow, PyTorch, and Keras. You can choose any of them depending on the project, or simply on what you’re most comfortable with. If you have trouble choosing the right framework for your computer vision project, please look into this blog.
If you aren’t interested in building your own model or fine-tuning popular models, here are some computer vision platforms with pre-trained models that you may find useful:
- Google Cloud’s Vision API offers powerful pre-trained machine learning models through REST and RPC APIs.
- Assign labels to images and quickly classify them into millions of predefined categories.
- Detect objects and faces, read printed and handwritten text, and build valuable metadata into your image catalog.
- PyTorch implementation and pretrained models of self-supervised vision transformers.
- The model can discover and segment objects in an image or a video with absolutely no supervision and without being given a segmentation-targeted objective.
- Features are easily interpretable, suggesting that this class of models is capable of a higher level of image understanding.
- Identify objects, people, text, scenes, and activities in images and videos, and detect any inappropriate content.
- Provides algorithms pretrained on data collected by Amazon
- You can build algorithms that you can train on a custom dataset.
When each component of your ML pipeline makes a decision, that decision is passed on to the next component. Metadata stores save and retrieve these decisions, along with the hyperparameters and the data used at different steps. Here are some of the libraries you can use as metadata stores:
Neptune supports logs and displays for many different types of model building metadata, like metrics and losses, parameters and model weights, and much more. You can check out their live Notebook to learn more.
SiaSearch is a platform for efficiently exploring vision data based on metadata. It streamlines data identification and searches for difficulties to help increase efficiency. Some of its features are:
- Structures the data by automatically creating custom interval attributes.
- Use custom attributes to query, find rare edge cases, and curate new training data.
- Easily save, edit, version, comment, and share frame sequences with colleagues.
- Visualize data and analyze model performance using custom attributes.
MLMD registers the following types of metadata in its database:
- Artifacts generated throughout the steps of your ML pipeline.
- Metadata about the execution of these components.
- Lineage information.
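The three kinds of metadata above (artifacts, executions, and lineage) can be illustrated with a toy store that records which step consumed and produced which artifact, then walks those records backwards. This is a conceptual sketch with made-up identifiers, not the MLMD API.

```python
from typing import Dict, Iterable, List, Set


class MetadataStore:
    """Toy metadata store that tracks artifacts, executions, and lineage."""

    def __init__(self) -> None:
        self.artifacts: Dict[str, dict] = {}
        self.executions: List[dict] = []

    def add_artifact(self, artifact_id: str, kind: str, **properties: object) -> None:
        self.artifacts[artifact_id] = {"kind": kind, **properties}

    def add_execution(self, name: str, inputs: Iterable[str], outputs: Iterable[str]) -> None:
        """Record one pipeline step with the artifacts it read and wrote."""
        self.executions.append({"name": name, "inputs": list(inputs), "outputs": list(outputs)})

    def lineage(self, artifact_id: str) -> Set[str]:
        """Return every upstream artifact that contributed to artifact_id."""
        upstream: Set[str] = set()
        frontier = [artifact_id]
        while frontier:
            current = frontier.pop()
            for execution in self.executions:
                if current in execution["outputs"]:
                    for parent in execution["inputs"]:
                        if parent not in upstream:
                            upstream.add(parent)
                            frontier.append(parent)
        return upstream


store = MetadataStore()
store.add_artifact("dataset:raw", "dataset", n_images=10_000)
store.add_artifact("dataset:clean", "dataset", n_images=9_800)
store.add_artifact("model:v1", "model", learning_rate=0.001)
store.add_execution("clean_images", inputs=["dataset:raw"], outputs=["dataset:clean"])
store.add_execution("train", inputs=["dataset:clean"], outputs=["model:v1"])
print(sorted(store.lineage("model:v1")))
```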
Hyperparameter tuning gives you the best version of your ML model. If you want to know all about the importance of hyperparameters with different ways to optimize them, have a look at this book by Tanay Agrawal. Here’s a brief list of tools available for hyperparameter tuning:
Hyperopt lets users describe the search space. By providing more information about where your function is defined, and where you think the best values are, you allow algorithms in Hyperopt to search more efficiently.
Dask-ML classes provide a solution to two of the most common issues in hyperparameter optimization:
- When the hyperparameter search is memory constrained,
- Or when the search is compute constrained.
This highly modular hyperparameter optimization framework lets users dynamically construct the search spaces. Some of the key features of Optuna are:
- Lightweight architecture,
- Pythonic search spaces,
- Easy parallelization.
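Whatever tool you pick, the underlying loop is similar: describe a search space, sample candidate hyperparameters, score them, and keep the best. The sketch below shows that loop as plain random search with the standard library. It is not the API of Hyperopt, Dask-ML, or Optuna, and `toy_loss` is a stand-in for a real train-and-validate run.

```python
import random
from typing import Callable, Dict, List, Tuple

SearchSpace = Dict[str, Tuple[float, float]]  # parameter name -> (low, high)
Trial = Tuple[float, Dict[str, float]]        # (loss, sampled parameters)


def random_search(
    objective: Callable[[Dict[str, float]], float],
    space: SearchSpace,
    n_trials: int,
    seed: int = 0,
) -> Tuple[Dict[str, float], float, List[Trial]]:
    """Sample uniformly from the space and return the lowest-loss trial."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    trials: List[Trial] = []
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        trials.append((objective(params), params))
    best_loss, best_params = min(trials, key=lambda trial: trial[0])
    return best_params, best_loss, trials


def toy_loss(params: Dict[str, float]) -> float:
    """Stand-in for validation loss; smallest at learning_rate=0.01, dropout=0.3."""
    return (params["learning_rate"] - 0.01) ** 2 + (params["dropout"] - 0.3) ** 2


space = {"learning_rate": (0.0001, 0.1), "dropout": (0.0, 0.5)}
best_params, best_loss, trials = random_search(toy_loss, space, n_trials=50)
print(best_params, best_loss)
```

Dedicated tools improve on this loop with smarter sampling, early stopping of weak trials, and parallel execution, but the inputs (a search space and an objective) stay the same.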
Version control, also known as source control, tracks and manages changes to the source code. It enables teams to work on and contribute to the same code. At the moment, GitHub and GitLab are the de facto standards for version control. While working on a computer vision project with large data, you may run into issues with Git, as it was made for versioning application code, not for storing large amounts of data or supporting ML pipelines. DVC addresses this issue and also works for application code versioning. If you find DVC too rigid and want a better UI and more features, such as visibility into the team’s progress at any time, Neptune can prove to be a great advantage.
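One way to think about data versioning is that a dataset is identified by its content rather than by line-by-line diffs. The sketch below computes a single fingerprint for a directory of files, so that any change to any file yields a new version identifier. It is an illustration of the idea under that assumption, not how DVC or Neptune are implemented.

```python
import hashlib
import tempfile
from pathlib import Path


def dataset_fingerprint(root: str) -> str:
    """Hash relative paths and file contents into one hex digest for the dataset."""
    digest = hashlib.sha256()
    base = Path(root)
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        digest.update(path.relative_to(base).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator between the path and the content hash
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


# Demo on a temporary directory with a fake image file.
with tempfile.TemporaryDirectory() as tmp:
    image_path = Path(tmp) / "cat.png"
    image_path.write_bytes(b"fake image bytes")
    before = dataset_fingerprint(tmp)
    unchanged = dataset_fingerprint(tmp)
    image_path.write_bytes(b"edited image bytes")
    after = dataset_fingerprint(tmp)

print(before == unchanged, before == after)  # prints: True False
```

Storing only this short fingerprint in Git, while the images themselves live in external storage, is the general pattern that lets code and data versions stay linked.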
Model serving is usually the last step of the ML model pipeline. After creating your CV model, you have to decide where to deploy your model, i.e. which platform will you use for serving your model? You can choose on-prem servers but it can get extremely costly and difficult to manage at scale. Here are some of the tools for model serving and deployment:
A framework for serving, managing, and deploying machine learning models. Some of the key features of BentoML include:
- Support for multiple ML frameworks like PyTorch, TensorFlow, and many more.
- Containerized model server for production deployment with Docker, Kubernetes, OpenShift, etc.
- Adaptive micro-batching for optimal online serving performance.
A flexible system for machine learning models, designed for production environments. Some of the key features are:
- Allows deployment of new model versions without changing your code.
- Supports many servables.
- The size and granularity of a servable are flexible.
It focuses on solving the last step in any machine learning project to help companies put models into production, solve real-world problems and maximize the return on investment.
- Serves models built on any open-source or commercial model building format.
- It handles scaling to thousands of production machine learning models.
- Provides an easy way to containerize ML models using pre-packaged inference servers, custom servers, or language wrappers.
Unlike in a software application, training data in ML plays a crucial part. You can’t test model validity without training. This is why the usual CI/CD process is enhanced for ML development by adding continuous training to it. Here are some of the best libraries I could find for implementing CI/CD in your machine learning pipeline:
CML can be used to automate parts of your development workflow, including model training and evaluation, comparing ML experiments across your project history, and monitoring datasets.
CircleCI can be configured to efficiently run complex pipelines with sophisticated caching, docker layer caching, resource classes for running on faster machines, and performance pricing.
Travis CI is a hosted continuous integration service used to build and test projects. It provides custom deployments of proprietary versions on your hardware.
Model monitoring lets you tweak and improve your model in production continuously. In a highly-developed MLOps workflow, this should be an active process. There are three aspects to monitoring:
- Technical/system monitoring checks whether the model infrastructure is serving correctly.
- Model monitoring continuously validates the predictions’ accuracy.
- Business performance monitoring comes down to whether the model is helping the business or not. Monitoring the impact of changes in a release is necessary.
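For the model monitoring aspect, a common pattern is to track accuracy over a sliding window of recent labeled predictions and alert when it falls below a threshold. The sketch below shows that pattern; the window size, the threshold, and the assumption that ground-truth labels eventually arrive for production predictions are all illustrative choices.

```python
from collections import deque
from typing import Deque, Hashable


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag drops."""

    def __init__(self, window_size: int, min_accuracy: float) -> None:
        self._outcomes: Deque[bool] = deque(maxlen=window_size)
        self._min_accuracy = min_accuracy

    def accuracy(self) -> float:
        """Fraction of correct predictions in the current window."""
        return sum(self._outcomes) / len(self._outcomes)

    def record(self, prediction: Hashable, label: Hashable) -> bool:
        """Record one labeled prediction; return True if an alert should fire."""
        self._outcomes.append(prediction == label)
        window_full = len(self._outcomes) == self._outcomes.maxlen
        # Only alert on a full window, so a single early mistake is not an alarm.
        return window_full and self.accuracy() < self._min_accuracy


monitor = AccuracyMonitor(window_size=4, min_accuracy=0.75)
stream = [
    ("cat", "cat"), ("dog", "dog"), ("cat", "cat"),
    ("dog", "cat"), ("cat", "dog"), ("dog", "dog"),
]
alerts = [monitor.record(prediction, label) for prediction, label in stream]
print(alerts)
```

In this run the first alert fires at the fifth prediction, when only two of the last four predictions were correct.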
I’ll share four tools that you can use for production monitoring. If you’re looking for a more extensive list of ML monitoring tools, please go to this article.
It’s one of the most lightweight tools for monitoring your model. It can version, store, organize, and query models, and model development metadata. Here are some of the features that make it unique:
- It helps you organize your work better, by filtering, sorting and grouping model training runs in the dashboard.
- It helps you compare metrics and parameters in a table that automatically finds what changed between runs and the anomalies.
- The whole team can track experiments that are executed in scripts and do that on any infrastructure (cloud, laptop, cluster).
Fiddler helps you monitor model performance, explain and debug model predictions, analyze model behavior for entire data and slices, deploy machine learning models at scale, and manage your machine learning models and datasets.
Evidently helps in analyzing machine learning models in production or for validation.
This SageMaker tool helps you maintain high-quality machine learning models by automatically detecting and alerting on inaccurate predictions from models deployed in production.
Until now, I’ve discussed libraries and tools you can use for a single component or a few components of your ML pipeline. But if you’re keen on automating the entire process right from collecting and managing data to deploying it and monitoring it in production, there are tools for that as well!
We will look into two types of automation and tools associated with it:
Various AutoML frameworks automate various steps of the machine learning lifecycle. The tools I’ve curated here free the user from algorithm selection and hyperparameter tuning, and build a model that is ready for deployment.
It’s an end-to-end platform for applying computer vision for a specific scenario.
- Customize and embed state-of-the-art computer vision image analysis for specific domains.
- Run Custom Vision in the cloud or on the edge in containers.
- Rely on enterprise-grade security and privacy for your data and any trained models.
AutoML Vision enables you to train machine learning models to classify your images according to your own defined labels.
- Train models from labeled images and evaluate their performance.
- Leverage a human labeling service for datasets with unlabeled images.
- Register trained models for serving through the AutoML API.
TPOT is a tree-based pipeline optimization tool that uses genetic algorithms to optimize machine learning pipelines.
- It is built on top of scikit-learn and uses its regressor and classifier methods.
- It expects a clean dataset.
- It does feature processing, model selection, and hyperparameter optimization to return the best-performing model.
Auto-sklearn is an automated machine learning toolkit. It’s a drop-in replacement for the scikit-learn estimator. Some of the things to keep in mind are:
- It’s simple to use, with minimal code. Example.
- Auto-sklearn library works well with small and medium datasets, and not with large datasets.
- It doesn’t train advanced deep learning models that might perform well with large datasets.
MLBox is a powerful automated machine learning Python library. Some of its features include:
- Fast reading and distributed data preprocessing.
- Robust feature selection and leak detection.
Algorithmia is an enterprise-based MLOps platform that accelerates your research and delivers models quickly, securely, and cost-effectively.
- You can securely deploy, serve, manage and monitor all your machine learning workloads.
- It uses an automated machine learning pipeline for version control, automation, logging, auditing, and containerization. You can easily access KPIs, performance metrics, and data for monitoring.
- Its features will show you a clear picture of risk, compliance, cost, and performance.
Kubeflow is an open-source and free machine learning Kubernetes-native platform for developing, orchestrating, deploying, and running scalable and portable machine learning workloads.
Azure ML is a cloud-based platform that can be used to train, deploy, automate, manage, and monitor all your machine learning experiments. It has its own robust open-source MLOps platform which simplifies ML creation and deployment. To know more, visit Azure’s documentation.
Gradient provides lightweight software and infrastructure for model development, collaboration, and deployment. Some of the services provided by Gradient are:
- Full reproducibility with automatic versioning, tagging, and lifecycle management.
- Automated pipelines: Gradient acts as CI/CD for machine learning.
- Instant scalability: train in parallel and easily scale deployed models.
In conclusion, implementing your computer vision project in production doesn’t mean only deploying the model as an API for production.
Rather, it means deploying an ML pipeline that is automated at various stages and helps improve your models faster. Setting up a CI/CD system enables you to automatically test and deploy new pipeline implementations.
This system will help you cope with rapid changes in your data and business. It doesn’t mean you’ll move all your processes immediately from one level to another. Don’t rush it! You should gradually implement these practices to help improve the automation of your ML system development and production.
- Sorry, Batman is busy
- Delivering on the Vision of MLOps: A maturity-based approach
- Continuous Delivery for Machine Learning
- Rules of Machine Learning: Best practices for ML Engineering
- Building Machine Learning Pipelines
- Introducing MLOps
- Engineering MLOps
- Machine Learning Design Patterns
- MLOps playlist by DVC
- MLOps: from model-centric to data-centric AI
- MLOps vs DevOps vs AIOps
- Bridging AI’s Proof-of-Concept to Production Gap