You’re the one for the job. You know it, I know it—but you still need to prove it to the job interviewer. You’ll be judged on your skills, knowledge, and character, and you need to show that you’re the right candidate. Oh, the stress!
We all know what it’s like. In a complex field (and machine learning projects definitely are complex), interviews are even harder than in less demanding jobs. They’re tough regardless of your seniority level: seniors and juniors fear them just the same.
To overcome the stress, you need to come prepared. If you want to increase your chances of being hired, you have to be ready to answer even the trickiest questions without losing composure.
In this article, I’ll help you shoo away job interview anxiety by showing you how interviewers like to trip up Machine Learning Engineer candidates, and how to be prepared for their hiring tactics.
What is the role of a Machine Learning Engineer?
Before we dive into the interview itself, let’s first make sure that you’re actually applying for a Machine Learning Engineer (MLE) job. What does an MLE do, exactly?
The ultimate goal of an MLE is to shape and build efficient self-learning AI applications. The main responsibilities are:
- Designing machine learning systems and self-running AI software.
- Transforming data science prototypes.
- Using data modeling and evaluation strategy to find patterns and predict unseen instances.
- Managing the infrastructure and data pipelines necessary for productionizing code.
- Finding available datasets online for training purposes.
- Optimizing existing ML libraries and frameworks.
- Running machine learning tests and interpreting the results.
- Implementing best practices to improve the existing machine learning infrastructure.
- Documenting machine learning processes.
The image below shows an example of an MLE job post from Indeed.
Demand for MLEs has recently outgrown the demand for data scientists (although a lot of people probably use these terms interchangeably). According to Indeed, MLE job openings grew 344% between 2015 and 2018 (Source: Best Jobs In The US).
Machine Learning Engineer vs Data Scientist
I mentioned that people use these terms interchangeably. It’s a mistake to do so, because there is a difference between the two roles. Data Scientists mainly work on building a good model, whereas Machine Learning Engineers tend to focus on deploying the model and shipping it to the production environment.
The table below outlines the skills required for both roles.
– Deep knowledge of maths, probability, statistics, and algorithms
– Ability to write robust code in Python, Java, and R
– Experience with data manipulation and processing using SQL or pandas
– Familiarity with machine learning frameworks (like Keras or PyTorch) and libraries (like Scikit-learn)
– Excellent communication skills with analytical and problem-solving skills
– Experience using data visualization tools
– Experience with data querying languages, and statistical or mathematical software
– Experience in data mining and in using business intelligence tools
– Excellent understanding of statistics, multivariable calculus, and linear algebra
– Understanding of data structures, data modeling, and software architecture
– Excellent time management and organizational abilities
– Understanding the system implications of algorithms in terms of performance and power
– Knowledge of Big Data frameworks like Hadoop, Spark, Pig, Hive, Flume, and similar tools
– Proficiency in Linux environments
– Experience in NLP and computer vision
– Familiarity with machine learning technologies like AWS SageMaker
You can compare the following two job descriptions, where I have highlighted some key points that differentiate the two job offers.
The interview process for top companies
Each big company has its own way of hiring. These companies have established specialized interview processes designed to pick out the finest machine learning engineers.
Google ML interview
The Machine Learning Engineer interview at Google looks for an understanding of data structures, algorithms, system design, and testing.
The interview process will be pretty broad. They will make sure that you’re a smart person and a good overall hire for the company. You should know what kind of questions to expect, so don’t be surprised when you’re asked to code a palindrome check or reverse a string.
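To make those two warm-up problems concrete, here is a minimal sketch in Python. The function names are my own, and the choice to ignore case and punctuation in the palindrome check is an assumption. An interviewer may define the problem differently, so ask before you start coding.

```python
def reverse_string(text: str) -> str:
    """Return the characters of `text` in reverse order."""
    return text[::-1]


def is_palindrome(text: str) -> bool:
    """Check whether `text` reads the same forwards and backwards.

    Assumption: case and non-alphanumeric characters are ignored.
    """
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    left, right = 0, len(cleaned) - 1
    # Two pointers move towards the middle, comparing as they go.
    while left < right:
        if cleaned[left] != cleaned[right]:
            return False
        left += 1
        right -= 1
    return True
```

The two-pointer loop uses constant extra space beyond the cleaned copy, which gives you something to say when the follow-up question is about time and space complexity.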
Apple ML Interview
Like most other companies, the Apple interview process for MLEs consists of a phone screen followed by on-site interviews (or Zoom). The interviewers will ask about your past projects, with a heavy focus on state-of-the-art Deep Learning. Then come some more general questions about coding skills, along with optimization and time and space complexity. You will also be asked how to implement ML concepts. These questions test your core ML knowledge and usually turn into a discussion with the interviewer.
Amazon ML Interview
The Amazon Machine Learning interview is composed of behavioral, software engineering, and machine learning questions. The interviewers may ask you about some basic ML concepts and your recent project, and they may ask you to describe how you would solve a given ML/DL problem.
They may also give you a coding question, like a recursion problem or coding a gradient descent algorithm. Expect coding questions, since a Machine Learning Engineer is more of a software engineer than a data scientist.
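If you’re asked to code gradient descent, something along these lines is usually enough. This is a minimal sketch, assuming a one-feature linear model y = w * x + b trained with mean squared error. The function name, learning rate, and toy data are my own choices for illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every sample with the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so w should approach 2 and b should approach 1.
w, b = gradient_descent([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```

Be ready to explain what happens when the learning rate is too large (the updates diverge) or too small (training takes many more epochs).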
Facebook ML Interview
The hiring process for MLEs at Facebook goes for holistic evaluation. It includes two coding interviews, a system design interview, a behavioral interview, and a machine learning interview.
The coding interviews are considered challenging, and in some cases you’ll be given a take-home project assignment. The nice thing about Facebook is that before the interview process, they give you helpful resources for preparation through their recruiting portal.
Twitter ML Interview
During the interview process at Twitter, be prepared to be tested deeply on both computer science and data science knowledge, with an emphasis on recognizing patterns and trends. The interview contains a technical coding interview where you will be asked to implement a program, like how to encode a tweet or how to go through a log of processes. The technical part will test your intuition for ML theory (basic concepts and algorithms). You’ll need to show your knowledge of statistics, experimental models, and system design.
Top Machine Learning Engineer interview questions
Now let’s dive into the top MLE interview questions that might surprise you. Prepare for them well, and you won’t get tripped up during interviews.
These questions are a mix of behavioral, technical, and system design questions. We will divide the interview questions based on this classification.
Behavioral questions
We will look at common behavioral questions you may encounter during an interview in any company, along with answers.
1) Tell me about the worst manager you ever had.
Depending on the managers you’ve had, some of these questions can be tricky to answer. Don’t give in to the temptation—even if your previous manager was horrible, don’t say so. This is not the right place to vent your frustrations. Focus on the positive sides, and show that you were able to work productively regardless of any management challenges.
2) Do you have any career-related regrets?
For this question, you need to speak about the good that came from the negative experience and clearly highlight the lesson you learned from it. Be careful to choose a real event. Don’t fake it, because the interviewers will notice.
3) Do you consider yourself to be lucky?
Luck is considered by some to be the ability to notice an opportunity. It’s the difference between those who find or create opportunities and those who wait for them to come. When you follow your passion, you find joy. So, express to your interviewer that you’re lucky to be working in a field that’s interesting for you, and has so many exciting opportunities for personal and career growth.
4) Give an example of a time when you faced an ethical dilemma.
For this question, you need to show your approach to analyzing and resolving problems with integrity. Don’t make up a story for the interviewer, because they’ll detect if it’s fake. You face an ethical dilemma when a situation goes against your personal ethics and values, and it may force you to choose between being honest and dishonest. An example could go like this: the project deadline came and you had nothing for your manager. Did you admit your fault, or did you place the blame on someone else? Think of stories like this before you go into the interview, and have one ready in case you’re asked this question.
Technical questions
1) How would you explain machine learning to a kid?
This question is to test if you can explain complex things simply and clearly for non-technical people. Prepare an explanation like this before the interview, with some examples within a context familiar to your interviewer.
2) What is the difference between a Type I and Type II error?
A Type I error is a false positive (an alert fires, but there’s no incident), and a Type II error is a false negative (no alert, but there was an incident).
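If the interviewer wants to see it in code, counting both error types takes only a few lines. This is a small sketch that reuses the alert and incident example. The function name and the sample data are made up for illustration.

```python
def count_errors(actual, predicted):
    """Return (type_1, type_2) error counts for boolean labels.

    Type I  = false positive: predicted True, actually False.
    Type II = false negative: predicted False, actually True.
    """
    type_1 = sum(1 for a, p in zip(actual, predicted) if p and not a)
    type_2 = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return type_1, type_2


# Hypothetical monitoring data: was there an incident, and did an alert fire?
incidents = [False, False, True, True, False]
alerts = [True, False, False, True, False]
type_1, type_2 = count_errors(incidents, alerts)
```

In this toy data, the first alert is a Type I error (alert without incident) and the third entry is a Type II error (incident without alert).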
3) What’s the difference between an array and a linked list?
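The crucial difference between an array and a linked list is how the elements are stored. An array is an ordered collection of objects kept in a contiguous block of memory, so any element can be accessed directly by its index. In many languages, the size of an array is specified at the time of declaration and can’t be changed afterward. A linked list is a series of objects (nodes) connected by pointers. New elements can be stored anywhere in memory, and each one is linked to its neighbor with a pointer. Inserting is cheap, but reaching the n-th element means walking the list from the start.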
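You may be asked to implement a linked list on the spot, so here is a minimal singly linked list sketch. The class and method names are my own. Note that Python’s built-in list is a dynamic array that can grow, so it only partly matches the fixed-size array described above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    value: Any
    next: Optional["Node"] = None  # pointer to the following node


class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        """Insert at the front in O(1): no other element has to move."""
        self.head = Node(value, self.head)

    def get(self, index):
        """Walk the pointers from the head, which takes O(n) time."""
        if index < 0:
            raise IndexError("index out of range")
        node = self.head
        for _ in range(index):
            if node is None:
                break
            node = node.next
        if node is None:
            raise IndexError("index out of range")
        return node.value

    def to_list(self):
        """Collect all values into a Python list, in order."""
        values, node = [], self.head
        while node is not None:
            values.append(node.value)
            node = node.next
        return values


linked = LinkedList()
for value in (3, 2, 1):
    linked.push_front(value)
```

Compare `linked.get(2)`, which follows two pointers, with indexing into an array, which jumps straight to the element.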
4) How do you prevent overfitting?
Detecting overfitting is useful, but the most important thing is to prevent the model from overfitting in the first place. Here are a few of the most popular solutions:
- Collect more data to train the model with more varied samples.
- Use cross-validation techniques
- Keep the model simple to reduce variance
- Use regularization techniques
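To show what regularization does, here is a small sketch using L2 (ridge) regularization on the simplest possible model, y = w * x with no intercept. The closed-form solution and the function name are my own choice of illustration, not the only way to regularize.

```python
def ridge_weight(xs, ys, l2_penalty):
    """Weight w minimizing sum((y - w * x)^2) + l2_penalty * w^2.

    Setting the derivative to zero gives w = sum(x * y) / (sum(x^2) + l2_penalty).
    """
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) + l2_penalty
    return numerator / denominator


xs, ys = [1, 2, 3], [2, 4, 6]
unregularized = ridge_weight(xs, ys, l2_penalty=0.0)
regularized = ridge_weight(xs, ys, l2_penalty=14.0)
```

A larger penalty shrinks the weight towards zero. That keeps the model simpler and lowers its variance, at the cost of some bias.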
5) What’s the difference between Entropy and Information Gain?
Entropy is the average rate at which information is produced by a stochastic source of data. In a decision tree, it’s an indicator of how impure (mixed) the class labels in a node are. It generally decreases as you get closer to the leaf nodes.
Information gain is the amount of information gained about a random variable or signal from observing another random variable. In a decision tree, it’s the decrease in entropy after a dataset is split on an attribute, and the tree prefers the split with the highest gain.
For a more detailed explanation, you could check this link.
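Both quantities are easy to compute by hand, which interviewers sometimes ask for. Here is a minimal sketch using the standard Shannon entropy in bits. The function names and the toy labels are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]
useless_split = [["yes", "no"], ["yes", "no"]]
```

A split that separates the classes perfectly recovers all of the parent’s entropy as gain, while a split that leaves each child as mixed as the parent gains nothing.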
6) What’s an imbalanced dataset? Can you list some ways to deal with it?
Any dataset with an unequal class distribution is technically imbalanced.
Here are some techniques to handle imbalanced data:
- Resample the training set: There are two approaches to making a balanced dataset out of an imbalanced one: under-sampling the majority class and over-sampling the minority class.
- Generate synthetic samples: Use SMOTE (Synthetic Minority Oversampling Technique) to generate new, synthetic minority-class data to train the model.
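The resampling idea can be sketched in a few lines. Note that this is plain random over-sampling (duplicating minority samples), not SMOTE, which creates new points by interpolating between neighbors. The function name and the toy data are assumptions.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


samples = list(range(10))
labels = [0] * 8 + [1] * 2  # 8 majority samples, 2 minority samples
new_samples, new_labels = random_oversample(samples, labels)
```

Only resample the training set. The test set should keep the original class distribution so that your evaluation stays realistic.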
7) Why does XGBoost perform better than SVM?
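XGBoost is an ensemble method that builds many decision trees sequentially, with each new tree trained to correct the errors of the previous ones.
SVM is a linear separator. When the data is not linearly separable, SVM needs a kernel to project the data into a higher-dimensional space, where a linear separation can be found for almost any data. The result then depends heavily on choosing a suitable kernel. It’s also worth telling the interviewer that neither method wins on every dataset.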
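The kernel idea is easier to explain with a tiny example. The XOR points below can’t be separated by a straight line in two dimensions, but they can once a third feature is added. This sketch uses an explicit feature map and hand-picked weights, which are my own assumptions. A real SVM would learn the separator and apply the kernel implicitly.

```python
def feature_map(x1, x2):
    """Map a 2-D point to 3-D by adding the product x1 * x2 as a feature."""
    return (x1, x2, x1 * x2)


def linear_classifier(features):
    """Linear decision rule in the 3-D space (weights chosen by hand, not learned)."""
    weights = (1.0, 1.0, -2.0)
    bias = -0.5
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else 0


# XOR: the label is 1 only when exactly one coordinate is 1.
XOR_POINTS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
predictions = [linear_classifier(feature_map(*point)) for point, _ in XOR_POINTS]
```

In the mapped space, a single plane classifies all four points correctly.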
8) What evaluation approaches would you use to gauge the effectiveness of an ML model?
- Split the dataset into training and test sets
- Use a cross-validation technique to segment the dataset
- Implement performance metrics like accuracy and the F1 score
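Here is a minimal sketch of these three ideas without any ML library. The function names are my own, and the k-fold split is the simplest contiguous variant, with no shuffling or stratification.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def f1_score(actual, predicted, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


actual = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
```

In practice you would train on each `train` fold, compute the metrics on the matching `test` fold, and average the results. On imbalanced data, the F1 score usually tells you more than accuracy.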
9) What are dropouts?
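Dropout is a straightforward technique for reducing neural network overfitting by randomly deactivating (dropping) some of the network’s units during training. Because a different random subset of units is dropped at every training step, each step effectively trains a different thinned network, which stops units from relying too much on each other. At inference time, all units are kept.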
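A sketch of dropout on a single layer’s activations might look like the code below. It uses the common "inverted dropout" variant, which rescales the surviving units during training. The function name and parameters are assumptions for illustration, not a specific framework’s API.

```python
import random


def dropout(activations, drop_rate, training=True, rng=None):
    """Apply inverted dropout to a list of activations.

    During training, each unit is zeroed with probability `drop_rate`, and
    the survivors are scaled by 1 / (1 - drop_rate) so the expected value
    stays the same. At inference time, the input is returned unchanged.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    if not training or drop_rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


train_out = dropout([1.0] * 1000, drop_rate=0.5, rng=random.Random(0))
eval_out = dropout([1.0, 2.0], drop_rate=0.5, training=False)
```

With a drop rate of 0.5, roughly half of the units come out as zero and the rest are doubled, while the inference call leaves the activations untouched.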
10) What is GPT-3 (or other bleeding-edge technology)? How do you think we can use it?
This question tests whether you’re following new technology trends and research. GPT-3, as you probably know, is the newest (at least at the time of writing this article) language generation model that can generate human-like text. There are many potential uses for GPT-3. It can improve chatbots, automate customer service, and boost search engines with NLP.
System design questions
A system design interview analyzes your process for solving problems and designing systems to help clients. It’s an opportunity to show the hiring manager that you’re a valuable team member, and to fully show your skills. Interviewers want to see how you think when you’re given ownership of an open-ended problem.
How to design a social network and message board service like Reddit, Quora, etc.?
This is an example of a common system design question. To answer it well, follow these steps:
- State the problem:
Design a forum where users can post questions. The questions will be available to everyone with a comments section where you can write tags.
- List the general problems:
How will the system define tags? How many posts from unfollowed tags are shown in the feed? How are posts distributed across a network of servers?
- Ask for more clarification:
Ask clarifying questions to show the interviewer your knowledge of system needs.
- Discuss the technologies you would use:
How will you use multithreading and a load balancer layer to help support higher traffic? How will you use ML and NLP to find correlations between tags?
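For the question about correlations between tags, you could start the discussion from a simple baseline before reaching for ML or NLP. The sketch below counts how often two tags appear on the same post. The function name and the sample posts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(posts):
    """Count how many posts each pair of tags appears on together."""
    counts = Counter()
    for tags in posts:
        # Sort and deduplicate so each pair has one canonical order.
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts


# Hypothetical posts, each represented by its list of tags.
posts = [
    ["python", "ml"],
    ["ml", "nlp"],
    ["python", "ml", "nlp"],
]
pair_counts = tag_cooccurrence(posts)
```

Pairs with high counts are candidates for "related tags". From there you can explain how you would scale the counting across servers or replace it with learned tag embeddings.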
You need to narrate any decisions you make, and concisely explain why you made them. The system design interview is really a great opportunity to show the interviewer how you think, not just the knowledge that you have.
You could check the link here for additional information about Machine Learning Systems Design.
Now that you have an overview of the MLE interview, from behavioral questions to system design, you just need to be confident and not let stress control you. Remember that ultimately, it’s not about the questions and answers as much as it’s about the overall impression you leave. So, listen attentively to the interviewer and try your best to make your answers sound like a natural part of a conversation.
Keep in mind that it’s just an interview. It’s not the end of the world if you don’t get the job! As Albert Einstein said, “in the middle of difficulty lies opportunity”. You’ll find the right opportunity for yourself soon enough…
…as long as you come prepared for job interviews. Good luck in your job search!