IT departments adopting machine learning in the workplace have run into some difficulties. One is the need for a framework that lets models be scaled up and deployed safely at the same time, which requires development and operations teams to work together more closely.
A term for bringing these two worlds together already exists: DevOps. Despite its popularity, DevOps on its own needs specialized tooling to support machine learning.
Professionals have turned to MLOps (Machine Learning Operations) to cope with these demands and get better performance from intelligent systems.
What is MLOps (quick reminder)
MLOps is a set of best practices that enable better management of ML projects and smooth deployment of models to production. It is the way to go if you want to automate the design and maintenance of ML algorithms and manage their life cycle. Following MLOps best practices helps keep the quality of ML models consistently high.
Machine learning, Data engineering, and DevOps all come together in this domain. In other words, it ties machine learning into the task of designing, building, and maintaining systems.
If you’re here, you probably already know what MLOps is, but if you’d like to dig deeper, check out the article “MLOps: What It Is, Why It Matters, and How to Implement It”. You’ll learn more about MLOps and its potential applications in the machine learning industry.
MLOps Engineers and their role in the ML team
MLOps is the discipline of applying DevOps ideas to ML systems. It facilitates the creation and deployment of ML models in big data science initiatives. In most projects, the operational work dwarfs the model building itself, so it frequently takes more than just data scientists.
Large companies tend to divide functions and duties clearly. Depending on the company and the project, data science teams may cover one or more of the following responsibilities. In small and medium-sized teams, the roles within a Machine Learning team are sometimes ambiguous. Let’s see where an MLOps Engineer fits on this spectrum.
Data scientists find and apply the best machine learning model for the business challenge at hand. They experiment with different algorithms, fine-tune their hyperparameters, and then assess and validate their results against a range of criteria.
In smaller teams, data scientists take on most of the additional roles, which can leave staff overwhelmed. In smaller firms they are often the data architects and data engineers as well.
Out of the box, getting a prediction from a machine learning model usually means running a Jupyter notebook or Python scripts. Software engineers would use the model if it had a convenient interface. They also worry about things that data scientists typically don’t, such as access control, usage data gathering, cross-platform integration, and hosting.
Software engineers working on machine learning projects should be data literate. But not all software engineers are model builders.
Data engineers mainly create data pipelines. Data pipelines move data continuously from its sources, through transformation, and into storage. This process is called ETL (extract, transform, load). Data engineers build ETL pipelines using tools such as Airflow and Prefect, which help schedule and coordinate the many tasks involved.
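To make the extract, transform, load idea concrete, here is a minimal sketch in plain Python. It is not Airflow or Prefect code: the function names, the sample records, and the in-memory SQLite table are all assumptions made up for illustration. In a real setup, an orchestrator would typically schedule each step as its own task.

```python
import sqlite3

# Hypothetical raw records, standing in for rows pulled from a source system.
RAW_ROWS = [
    {"user_id": "1", "amount": "19.90"},
    {"user_id": "2", "amount": "n/a"},  # malformed amount, dropped in transform
    {"user_id": "3", "amount": "5.00"},
]


def extract():
    """Extract: return raw records from the (simulated) source."""
    return list(RAW_ROWS)


def transform(rows):
    """Transform: cast fields to proper types and skip rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            clean.append((int(row["user_id"]), float(row["amount"])))
        except ValueError:
            continue
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into a storage table and return the row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (user_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def run_pipeline(conn):
    """Run the three steps in order, the way an orchestrator would chain tasks."""
    return load(transform(extract()), conn)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(connection), "rows loaded")
```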
MLOps Engineers (or ML Engineers)
MLOps Engineers automate the deployment of models to production systems. The amount of automation varies with the organization. MLOps Engineers take a data scientist’s model and make it accessible to the software that uses it. Machine learning models are commonly built, tested, and validated in Jupyter notebooks or script files. However, software developers want machine learning models to be available through callable APIs such as REST.
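As a rough illustration of what “available through a callable API” can mean, the sketch below wraps a stand-in model in a small HTTP endpoint using only Python’s standard library. Everything here is an assumption for the sake of the example: the `/predict` path, the JSON shape, the fixed weights, and the helper names. A production service would normally use a proper web framework and a real trained model.

```python
import io
import json
from wsgiref.simple_server import make_server


def predict(features):
    """Stand-in for a trained model: a fixed linear score (hypothetical weights)."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, features))


def _respond(start_response, status, payload):
    """Send a JSON response with the given HTTP status line."""
    start_response(status, [("Content-Type", "application/json")])
    return [json.dumps(payload).encode("utf-8")]


def app(environ, start_response):
    """WSGI application exposing the model at POST /predict."""
    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/predict":
        return _respond(start_response, "404 Not Found", {"error": "not found"})
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length))
        features = [float(x) for x in payload["features"]]
    except (KeyError, TypeError, ValueError):
        return _respond(start_response, "400 Bad Request",
                        {"error": "expected a JSON body with a 'features' list"})
    return _respond(start_response, "200 OK", {"prediction": predict(features)})


def call_locally(method, path, body=b""):
    """Call the app in-process (no network), returning (status, parsed JSON)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    raw = b"".join(app(environ, start_response))
    return captured["status"], json.loads(raw)


def serve(host="127.0.0.1", port=8000):
    """Run a local development server; call this manually to try the endpoint."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `call_locally("POST", "/predict", b'{"features": [2, 4]}')` exercises the endpoint without opening a port, which is also a convenient way to test it in CI.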
To learn more about the role of MLOps Engineer, I reached out to a few of them and asked a couple of questions.
My guest professionals are:
- Caroline Zago, Brazil, currently working at XP Inc. as MLOps Engineer;
- Dmitry Goryunov, Germany, working as MLOps Engineer at deepset;
- Maisa Daoud, New Zealand, working as MLOps Engineer at Servian;
- Amy Bachir, United States, working as Senior MLOps Engineer at Interos Inc;
- Alexey Grigorev, originally born in Russia but currently living in Berlin, Principal Data Scientist at OLX Group, founder of DataTalksClub, author of the Machine Learning Bookcamp, also recently taught the free Machine Learning Bootcamp;
- and last but not least, Neal Lathia, UK, working as Associate Director of Machine Learning at Monzo Bank.
Each of these folks answered seven questions on the MLOps Engineer role that might help clear up some misconceptions.
How did you get started as an MLOps Engineer?
Dmitry Goryunov: “I had been in Software Engineering for more than ten years when I got interested in Machine Learning. There were a couple of courses on Coursera on the topic. They were not easy, but going through them was fun nevertheless. Later, I joined a Natural Language Processing team that had a few brilliant data scientists but no engineers.”
Amy Bachir: “I was a DevOps engineer for a few years before I got into MLOps, so I already had experience in CI/CD (Continuous Integration/Continuous Deployment), GitOps, deployments, monitoring, and automation. However, I was missing the pieces that are unique and specific to the machine learning application lifecycle, so I started looking at online courses to learn the basics. I found a very interesting and comprehensive nanodegree with Udacity for building machine learning models, so I took that nanodegree and graduated from it. After earning that nanodegree and having a solid experience in DevOps, it was easy to get my first role in MLOps.”
Caroline Zago: “I started as a Data Scientist Intern, but I liked DevOps and Software Engineering. Then, I found MLOps, which brings together the three areas that I like.”
Dmitry: “From my Software Engineering background, I was used to the software development life cycle. You know, planning, development, testing, deployment, monitoring, etc. At first, I was horrified at how different it was for ML. It seemed like a wild west, where a data scientist trains a model in a Jupyter Notebook, measures its performance, and hands the result, a binary file, to an engineer.
The engineer deploys the model in the cloud, makes it part of the business logic, and leaves the binary file in production for good: no proper monitoring, no traceability. I am not even mentioning things like continuous integration. The worst part was that everybody just went to the next project after the binary was deployed.
I was lucky to have very understanding data scientists in my first team; together, we came to see the MLOps role as ensuring that ML models benefit from the same best practices established in software development. Doing so helps ensure that the ML models in production have more or less the same performance as on the test dataset.”
As you’ve seen, the roads to becoming an MLOps Engineer vary a lot. There’s no one recipe. The most typical newcomers to the field are DevOps experts who are interested in machine learning, since they are already halfway to working with MLOps.
No matter how you slice it, becoming an MLOps Engineer takes hard work. One of the biggest challenges is that this is a new area, so in many situations learning material may be scarce. However, a number of resources are now available to help you become more specialized and professional. DeepLearning.AI’s Machine Learning Engineering for Production (MLOps) specialization on Coursera is one of the most well-known, as are a number of materials from O’Reilly.
Is there any difference between ML Engineers and MLOps Engineers?
I feel like whenever there’s a discussion about the role of an MLOps Engineer, this question always pops up. Is there any difference? There are not many discussions in the literature that address any difference between MLOps Engineers and ML Engineers. There is content that addresses each one separately. One thing to note is how the responsibilities in each area differ.
The article “What is Machine Learning Engineer: Responsibilities, Skills, and Value Brought” defines a Machine Learning Engineer as “someone who combines software engineering expertise and knowledge of machine learning. The focus here is on engineering, not on building ML algorithms. The primary goal of this specialist is to deploy ML models to production and automate the process of making sense of data — as far as it’s possible.”
Meanwhile, in the article “Data Scientist vs Machine Learning Ops Engineer. Here’s the Difference”, we read that “as an MLOps Engineer, you can expect to work with Data Scientists to connect the gap from testing to production within your company software with the practice of both Data Engineering and DevOps tools.”
It looks like the difference between those two roles is not well defined, and people often use the terms interchangeably. A lot certainly depends on the size and structure of the ML team. Moving from theory to practice, let’s go to the source and find out what it looks like.
Amy: “Yes! Absolutely! In my opinion, ML Engineers build and retrain machine learning models. MLOps Engineers enable the ML Engineers. MLOps Engineers build and maintain a platform to enable the development and deployment of machine learning models. They typically do that through standardization, automation, and monitoring. MLOps Engineers iterate on the platform and processes to make machine learning model development and deployment quicker, more reliable, reproducible, and efficient.”
Neal Lathia: “Inside of Monzo, we do not (yet) have anyone who is called an “MLOps Engineer.” Instead, we have:
- Machine Learning Scientists who design, train, and ship models;
- Several teams of Backend Engineers who own and manage Monzo’s infrastructure, and
- A Backend Engineer who is focusing on ML-specific systems, like our feature store.”
Alexey: “For me, it is the same as the difference between engineering and Ops teams. Engineers create software; Ops provide the infrastructure for running it and make sure the software runs reliably. However, the lines are blurry, and MLOps Engineers can (and often should) do things end-to-end.”
Even though MLOps has been discussed for some time, it is still evolving. As a data scientist or machine learning engineer, you may find yourself taking on some of the tasks of an MLOps Engineer. Companies with small projects or small teams, on the other hand, may not feel the need to assign specific duties to employees based on their job titles. As Alexey pointed out, the MLOps Engineer’s job can be confusing since it is still a relatively new area. In other cases, the work of an MLOps Engineer may go by a different name or be handled by an entire team.
What does an MLOps Engineer do? What does your daily routine at work look like?
Maisa Daoud: “We have a very early standup every morning, which is usually followed by a couple of separate meetings with colleagues to answer each other’s questions or set a plan to start working on a task. I work then till lunchtime, which I divide into 15 mins walk (treadmill) and 15 mins for a snack. After that, I go back to the desk and have some little chats with colleagues on how things go. If I finish a task at around 4 pm, then yah, I feel free to look at some personal development course/reading as Servian leaders encourage us to stay updated. I usually finish at 6 pm in this case.”
Dmitry: “Oh, it depends. When you work at a company like Zalando, your work makes data scientists productive. So you work closely with data scientists to understand what they are about to do and to define their requirements. Then you try to fit their requirements into the company’s infrastructure, build the data pipelines, deploy the servers to host the models, and organize monitoring and continuous integration. Now that I work for deepset, I try to generalize my previous experience. Now it is more about building a convenient toolbox so that the data scientists who work with semantic search and question answering systems can train, evaluate, and deploy those systems just by clicking a few buttons.”
Neal: “I start the day by making some coffee! I will then do things like joining planning and standup meetings, working with ML Scientists on specific work areas (by giving feedback on their designs and proposals), and working with other leaders across the company on areas of priority. If there are any problems with our systems, I will dive in to help resolve them. I am also spending a fair bit of time hiring and interviewing people right now.”
What is the relationship between the MLOps team and the data science team at your company?
Amy: “It is a very close relationship. We work together all the time. For the most part, everything we build is for them, so we consult with them before we design anything. We also run quarterly surveys to get their feedback on existing systems and what they think might be missing or is not working well for them. In general, I think it is very important to loop the end-users in everything we build because they are ultimately the consumers, and what we build should solve their problems and fit their needs.”
Alexey: “We do not have a separate data science team at OLX – we work in cross-functional packs where data scientists work together with other people on the same part of the product. However, we do have a central team that you may call MLOps. The role of the team is to help data scientists be more effective – especially in the teams that do not have a lot of engineering support. This team also standardizes some of the processes across all different packs.”
What do you think the future holds for this position?
Caroline: “I believe that this position will be increasingly in demand at companies with a data science team, because a model that simply runs, without the necessary structure for building, production, and monitoring, does not deliver the return companies expect.”
Dmitry: “We see many tools emerging these days that automate MLOps. If we continue in this direction, in a few years data science teams might not need the support of a dedicated MLOps Engineer. This is what you work on at Neptune, and this is what we do at deepset.
On the opposite side of this trend, we see many models available off the shelf and used by people who might not have a machine learning background. Products like Haystack, Hugging Face, and Neptune Model Registry democratize ML; they make it easier to make ML part of your application.
These two trends might lead to the future where one specialist takes care of the complete life cycle of a model.”
Neal: “As knowledge of machine learning becomes increasingly democratized, I believe that ML will increasingly intersect with Engineering and will not need to be its own separate thing. I expect that in the (near) future, it will become normal for Engineers to train and ship ML systems as part of their daily routine, without needing to go to specialized teams or via specialized tools. It will all become mainstream.”
It looks like maturing ML solutions are a prerequisite for the MLOps Engineer role to appear and grow within an ML team. Companies that were further along in their use of machine learning did more to push experts to specialize in particular phases of data projects. With that specialization came titles such as “Data Engineer” and “MLOps Engineer.”
Separation and professionalization became more necessary as each activity grew more difficult (e.g., with new methods of storing and modeling data). “MLOps Engineer” is not just a fancy moniker for something that was already being done effectively.
In what way do you think this new role’s existence has impacted the tech industry?
MLOps is a cloud-specific practice. Before I started my career in industry, I used to see ML Engineers doing their work on-premise and deploying their models to a cloud provider, although this has slowly shifted toward relying more and more on cloud services. MLOps takes ML practice in another direction: we use cloud services at every step, from notebooks to deployment. This is not mandatory, but it is a recommended best practice.
Neal: “Whenever a new job title appears in the market, it is a great opportunity to bring more people into the tech world. It is also a great opportunity to start building communities of people solving the same kind of problem. However, when a job title is new, it also means that most companies do not quite know what they want or expect from that role or how to fit it into their company best. This was the case for Data scientists several years ago. In practice, that meant that the experience of being a Data Scientist could be very different from one company to the next.”
And, finally: what piece of advice would you give to someone who is considering a career in MLOps Engineering?
The most impactful machine learning systems are the ones that can be safely and quickly launched and can impact your company’s product positively. So my advice is to always keep an eye on whether the tools you build enable that to happen.
Maisa: “I do enjoy working as an MLOps Engineer as much as I enjoyed working on ML research before. MLOps roles are teamwork, so you need to be ready for the collaboration idea, especially since there is no single right way to do the job. It is fun, believe me.
Cloud providers are growing very fast, especially in terms of developing ML SDKs (Software Development Kit). You will need a lot of patience and self-motivation to keep yourself on track.”
Amy: “This is a tricky one! MLOps is a very exciting role! There is much room for innovation! You will never get bored. At the same time, it is a very challenging role. The combination of skills required to succeed at this role is almost impossible to have in one person, so prepare yourself for much learning.”
I would say that it is extremely important to understand the problem-solving process through modeling, understand the steps to get to production, and study software engineering concepts focused on machine learning, such as optimization, testing, and monitoring. I also believe that DevOps concepts and, of course, data science should be part of the knowledge of an MLOps Engineer.
Alexey: “Do not listen to data scientists advising about MLOps.”
Dmitry: “It depends on the background the person has. If somebody comes from a software engineer job family, as I do, the advice would be to learn at least the basics of machine learning.
I am not speaking about getting a Ph.D. in the area, more like going through an online course or reading a couple of books. It is easy to find much material online these days. Andrew Ng’s courses are how I got started.
It will not be easy to wrap your head around ML concepts and get a basic understanding. At least, it was not easy for me. But, I assure you, it pays off. The basic understanding of ML lets you speak the same language as your data scientists, which is important for ML projects’ success.
While studying ML, you might find out that one of the topics is more interesting than the others. If this is the case, go and read a few papers or watch a few videos explaining them. Videos work even better sometimes. Go deeper into a specific topic. It is more fun to learn about ML that way.”
It’s easy to see where the guests’ views coincided and where they differed. The distinction between MLOps Engineers and ML Engineers, for example, is intriguing. Amy and Alexey raised some crucial points when comparing the two professions. According to both, the platform and infrastructure that MLOps professionals provide allow ML engineers to work quickly and effectively.
Asked about the future of this role, some guests believe that MLOps will become an independent career, while others think these practices will be absorbed into existing jobs as new technologies are developed to simplify the process.
That being said, the future is still bright. Whatever it holds for the MLOps Engineer, the best practices, new tools, and workflows are here to stay and will keep helping with the challenges that the world of data presents.
The Best MLOps Tools and How to Evaluate Them
12 mins read | Jakub Czakon | Updated August 25th, 2021
In one of our articles—The Best Tools, Libraries, Frameworks and Methodologies that Machine Learning Teams Actually Use – Things We Learned from 41 ML Startups—Jean-Christophe Petkovich, CTO at Acerta, explained how their ML team approaches MLOps.
According to him, there are several ingredients for a complete MLOps system:
- You need to be able to build model artifacts that contain all the information needed to preprocess your data and generate a result.
- Once you can build model artifacts, you have to be able to track the code that builds them, and the data they were trained and tested on.
- You need to keep track of how all three of these things, the models, their code, and their data, are related.
- Once you can track all these things, you can also mark them ready for staging, and production, and run them through a CI/CD process.
- Finally, to actually deploy them at the end of that process, you need some way to spin up a service based on that model artifact.
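To make these ingredients a little more tangible, here is a toy sketch in Python. It is only one possible reading of the list above, not how Acerta implements it: the `ModelArtifact` class, the hashing scheme, the stage names, and the one-coefficient “model” are all assumptions invented for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def fingerprint(data: bytes) -> str:
    """Short content hash used to identify a version of code or data."""
    return hashlib.sha256(data).hexdigest()[:12]


@dataclass
class ModelArtifact:
    """Bundles preprocessing and model parameters with their lineage."""
    mean: float         # preprocessing parameter learned from the training data
    weight: float       # the "model": a single coefficient
    code_hash: str      # which training code built this artifact
    data_hash: str      # which dataset it was trained on
    stage: str = "dev"  # dev -> staging -> production

    def predict(self, x: float) -> float:
        # Preprocess (center) and score in one call, so callers need nothing else.
        return self.weight * (x - self.mean)

    def promote(self, new_stage: str) -> None:
        """Move one step along dev -> staging -> production, e.g. from a CI/CD job."""
        allowed = {"dev": "staging", "staging": "production"}
        if allowed.get(self.stage) != new_stage:
            raise ValueError(f"cannot move from {self.stage} to {new_stage}")
        self.stage = new_stage

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text: str) -> "ModelArtifact":
        # A serving process can rebuild the artifact from its stored form.
        return cls(**json.loads(text))


def build_artifact(train_code: str, train_data: list) -> ModelArtifact:
    """Toy 'training' that records hashes of the code and data it used."""
    mean = sum(train_data) / len(train_data)
    return ModelArtifact(
        mean=mean,
        weight=2.0,  # placeholder for a fitted coefficient
        code_hash=fingerprint(train_code.encode("utf-8")),
        data_hash=fingerprint(json.dumps(train_data).encode("utf-8")),
    )


if __name__ == "__main__":
    artifact = build_artifact("def train(): ...", [1.0, 2.0, 3.0])
    artifact.promote("staging")
    served = ModelArtifact.from_json(artifact.to_json())
    print(served.stage, served.code_hash, served.data_hash)
```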
This list is a great high-level summary of how to successfully implement MLOps in a company. But understanding what is needed at a high level is just one part of the puzzle. The other is adopting or creating proper tooling that gets things done.
That’s why we’ve compiled a list of the best MLOps tools. We’ve divided them into six categories so you can choose the right tools for your team and for your business. Let’s dig in!