MLOps Blog

Data Science and Machine Learning in the E-Commerce Industry: Insider Talks About Tools, Use-Cases, Problems, and More

11 min
24th August, 2023

Machine Learning has engulfed our personal and private spaces without reprieve, extending to horizons limited only by our ability to comprehend it. What was once used to statistically approximate a mathematical pattern is now being used to understand and predict human behavior.

Advances in Neural and Linguistic Computation have opened avenues for sophisticated applications that enhance user experience and customer acquisition by solving Matrix Decomposition problems. The relationships between products and customers (product to product, customer to customer, and customer to product) can be mapped out in sparse matrices to delve into purchase patterns, and sequential pattern analysis can then be applied to derive a coherent understanding of the customer base. Though the fundamental idea of Machine Learning (reduction of errors) remains constant, the cost functions are drastically different.

What’s even more impressive is that the algorithms used to calculate these patterns are quite nascent and hence there is a lot of room to perfect these algorithms. The algorithms are computationally intensive since the majority of them deal with using Higher-Order Matrices. 

Currently, most E-commerce-specific applications revolve around Customer Segmentation Models, Matrix Factorization Models, Market Basket Analysis, etc., with occasional implementations of state-of-the-art techniques (such as Neural Matrix Factorization).

The Cloud Platforms have also taken notice of the opportunities available in the Retail and E-commerce industry and have come up with services/tools that readily integrate with web apps. This has led to the creation of multiple white papers intended to utilize these cloud services to create custom models for specific use cases. 

This article intends to give you a picture of the E-commerce industry from the eyes of a Data Scientist. Read on!

My typical day as a Data Scientist in e-commerce

A typical day in the life of a Data Scientist in the E-commerce space is unusually challenging. You can never predict human behavior but you are tasked to do it. Sure, you get to analyze and have random spurts of enlightenment about customer buying patterns and also witness how a marketing decision could overnight change your hypothesis. But nothing comes close to seeing your work save time and increase the engagement experience of a customer while fulfilling your clients’ vision. 

Take a hypothetical case of an SME that has been targeting its marketing campaign at the wrong segment of customers and getting poor results. Though the company is unaware of this, a quick data analysis would point it out and increase its chances of reaching the right customers. You get to solve problems like these on a day-to-day basis and it’s quite satisfying.

What are the Data Science and Machine Learning use cases in E-Commerce?

As a Data Scientist working in the retail space, all your primary goals revolve around “Customers”. Duh! You can either acquire customers or retain them. Ergo, the problem of selection and optimization. In most cases, these two problems have independent solutions, but in some rare scenarios you can build one solution to tackle them both. Though the solutions are broadly classified into two groups, the ways to accomplish them are manifold. I could go on about the use cases, but I shall limit this segment to 3 major ones.

1. Recommendation Engines 

Recommendation Engines are the epitome of Machine Learning use cases in the online retail space. Much like a salesperson who knows exactly what you want, a Recommendation Engine gets you! It understands you based on your past purchases and interactions and gives you personalized suggestions. The Machine Learning technique that the majority of Recommendation Engines use is the Matrix Factorization algorithm. Though the math behind it is a little complex (touched upon in the Matrix Factorization section of the next segment), what Recommendation Engines can achieve is hard to match.

Recommendation Engines can be used to customize the experience of your customers. You can quickly deploy one for any client since implementations are readily available in languages and toolkits such as R, Python, and WEKA. Companies like Netflix, YouTube, and Amazon rely on Recommendation Engines that integrate with their microservices. Recommendation Engines also allow you to carry over the experience of other customers and recommend products that people living in and around the customer are buying. The main caveat to Recommendation Engines is the highly researched ‘Cold Start’ problem: you need historic customer data to build a Recommendation Engine, and without it a sophisticated one cannot be built. There are a few workarounds and a lot of ongoing research to solve this problem.
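One of the simpler workarounds for the Cold Start problem is to fall back to best sellers whenever a user has no history. The snippet below is a minimal sketch of that idea, not a production design: the `purchases` dictionary, the user ids, and the co-purchase scoring are all assumptions made up for illustration.

```python
from collections import Counter


def recommend(user_id, purchases, k=2):
    """Return up to k product suggestions for user_id.

    `purchases` maps a user id to the list of products that user bought
    (a hypothetical data shape chosen for this sketch).
    """
    # Rank products by how many distinct users bought them (ties by name).
    popularity = Counter(p for items in purchases.values() for p in set(items))
    ranked = [p for p, _ in sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))]

    history = set(purchases.get(user_id, []))
    if not history:
        # Cold start: no history at all, so fall back to best sellers.
        return ranked[:k]

    # Known user: score unseen products by overlap with other users' baskets.
    scores = Counter()
    for other, items in purchases.items():
        if other == user_id:
            continue
        overlap = len(history & set(items))
        if overlap:
            for p in set(items) - history:
                scores[p] += overlap
    suggestions = [p for p, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

    # Pad with best sellers if the co-purchase signal runs out.
    for p in ranked:
        if len(suggestions) >= k:
            break
        if p not in history and p not in suggestions:
            suggestions.append(p)
    return suggestions[:k]


purchases = {
    "u1": ["face masks", "sanitizers"],
    "u2": ["oatmeal"],
    "u3": ["face masks", "oatmeal"],
}
new_user_items = recommend("new_user", purchases)  # no history: best sellers
known_user_items = recommend("u2", purchases)      # history: co-purchases first
print(new_user_items, known_user_items)
```

The fallback keeps the site useful for first-time visitors, while the personalized branch takes over as soon as a user has any purchase history.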

2. Customer Segmentation

Customer Segmentation has been used for a very long time to target marketing campaigns. Based on the similarities between customers (for instance, the geographic similarity, the purchasing power, etc), they can be clubbed together into groups such as Loyal Customers, High Spenders, etc. In a nutshell, Customer Segmentation segregates customers into smaller groups that are appropriately targeted. Segments can be identified logically or using a Machine learning algorithm. The latter is almost always more efficient since sometimes similarities between customers are not overtly visible. 

Customer Segmentation models are, simply put, created using Clustering. There is a wide range of techniques available, but the most common approach is K-means Clustering. Once the clusters are created, high-yielding clusters are identified and focused marketing campaigns are run against them. Identifying these clusters requires a strong understanding of the retail space for the product you are trying to sell. Depending on market dynamics and global economic flux, clusters might not always behave as predicted. Hence, when using Customer Segmentation you are expected to factor in risk.

3. Hidden Markov Models

Hidden Markov Models (HMMs) are a well-established technique that has more recently found its way into E-commerce Data Science use cases. They have a wide range of applications and can be quite fruitful in driving insights. HMMs can be used to detect Location-Based Purchases by leveraging rich location-based data to draw insights about customers. An HMM starts by learning the location sequence of a given user and the most likely sequences of the customer’s purchases. Probability scores are then assigned to new purchases, and these purchases are recommended to the customer based on their likelihood.

HMMs can also be used for Customer Segmentation, wherein they capture the dynamics of customer behavior through Markovian sequence analysis (by assigning a hidden state to a customer’s spending activity). HMMs can also be used for Churn Modeling and Delinquency Predictions. A lot can be done using HMMs since they help to validate the reputation of a customer and predict behavior based on dynamic data mining.
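To make the idea of assigning a hidden state to a customer’s activity concrete, here is a minimal Viterbi decoder in plain Python. It is only a sketch: the two hidden states, the three observable events, and every probability below are invented for illustration, whereas in practice they would be estimated from clickstream or purchase data.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor state that maximizes the path probability.
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)

    # Backtrack from the most likely final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Hypothetical model: all numbers are assumptions for this sketch.
states = ("browsing", "ready_to_buy")
start_p = {"browsing": 0.8, "ready_to_buy": 0.2}
trans_p = {
    "browsing": {"browsing": 0.7, "ready_to_buy": 0.3},
    "ready_to_buy": {"browsing": 0.4, "ready_to_buy": 0.6},
}
emit_p = {
    "browsing": {"view": 0.7, "cart": 0.25, "purchase": 0.05},
    "ready_to_buy": {"view": 0.2, "cart": 0.4, "purchase": 0.4},
}

path, prob = viterbi(["view", "cart", "purchase"], states, start_p, trans_p, emit_p)
print(path, prob)
```

The decoded path is the sequence of hidden states that best explains the observed events, which is the kind of signal a segmentation or churn model could then consume.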

Data Science tools, libraries, frameworks that are used in e-commerce

There are multiple Data Science tools and frameworks that are used for Retail clients. This section is divided into three subsections covering the various algorithms, low code tools, and code-intensive tools.

Machine Learning algorithms used in the e-commerce industry

1. Matrix Factorization

Also known as Matrix Decomposition, this algorithm breaks a large matrix (in this context, the sparse matrix of users and products) into a product of smaller matrices so that the underlying matrix operations can be solved efficiently. With the advent of Deep Learning techniques that use neural networks to optimize these matrix calculations, a simple Embedding model can do the job. An Embedding model can be user-based or product-based. A widely used neural-network implementation of Matrix Factorization is the Neural Collaborative Filtering (NCF) model, a deep learning-based architecture for recommendation engines. It combines the effectiveness of standard matrix factorization with the expressiveness of neural networks to produce a complete representation of users and items.

The Matrix Factorization algorithm is used to create sophisticated Recommendation Engines. Matrix Factorization models create a mapping of users and the products they have interacted with. Take the following table, for instance: User 1 buys Face Masks and Sanitizers while User 2 buys Oatmeal. A sparse matrix with these references is created, and a new user is recommended one of these products based on the similarities between the users.

| Users / Products | Face masks | Sanitizers | Oatmeal |
|---|---|---|---|
| User 1 | ✓ | ✓ | |
| User 2 | | | ✓ |
| User 3 (similar to 2) | Recommended | Recommended | Not recommended |

Though this is a very simple illustration of a Matrix Factorization Model, the actual algorithm is fundamentally no different. The similarity between users can be calculated using metrics such as the Customer Lifetime Value or interaction scores (which are generally provided as side-inputs to the model). These metrics can either be manually calculated or left to the Neural Collaborative Filtering Model through one of its dense layers. The NCF automatically identifies the User-Product pairs and suggests a likely product based on a generated probabilistic score. 
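As a rough illustration of the factorization step itself, the sketch below fits user and product factors with plain gradient descent in NumPy. It is a toy under stated assumptions, not the NCF model: the interaction matrix is made up (1 = bought, 0 = seen but not bought), the third row is a hypothetical new user whose sanitizer cell is unobserved, and the hyperparameters were picked by hand.

```python
import numpy as np


def factorize(R, observed, k=2, steps=3000, lr=0.05, reg=0.01, seed=0):
    """Approximate R with U @ V.T, using only cells where observed == 1."""
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(R.shape[0], k))  # user factors
    V = rng.normal(scale=0.1, size=(R.shape[1], k))  # product factors
    for _ in range(steps):
        err = observed * (R - U @ V.T)       # ignore unobserved cells
        U_step = lr * (err @ V - reg * U)    # gradient step with L2 penalty
        V_step = lr * (err.T @ U - reg * V)
        U += U_step
        V += V_step
    return U, V


# Columns: face masks, sanitizers, oatmeal (toy data, assumed for this sketch).
R = np.array([
    [1.0, 1.0, 0.0],   # bought face masks and sanitizers
    [0.0, 0.0, 1.0],   # bought oatmeal only
    [1.0, 0.0, 0.0],   # new user: bought face masks; sanitizers not yet seen
])
observed = np.array([
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],   # the sanitizer cell is the one we want to predict
])

U, V = factorize(R, observed)
scores = U @ V.T
print(np.round(scores, 2))
```

The score in the unobserved cell (`scores[2, 1]`) is the model’s estimate of how likely the new user is to want sanitizers, inferred from the user who bought both products.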

2. Clustering

Clustering is a very common technique used by Marketing teams to categorize customers into logical segments (Customer Segmentation). K-means Clustering is a widely used variant since it’s very effective at creating meaningful clusters. There are other techniques such as Density-Based Clustering, Grid-Based Clustering, etc., but they are rarely used to create models for E-commerce use cases. In theory, Density-Based Clustering should show the best results, but in my experience K-means Clustering usually outperforms it in practice.

The best way to visualize clusters is as bubbles.

[Figure: clusters visualized as bubbles]

Clustering is dependent on two important factors:

  1. The number of segments you wish to create (Number of Clusters)
  2. The similarity between different segments (Distance between Clusters)

While the number of clusters is something a Data Scientist can control, the distance between the clusters depends on the data they are dealing with. You can use measures such as the Silhouette Score or the Davies-Bouldin Index to identify the optimal number of clusters, but the approach you pick will always depend in part on the data you have. Clustering does a great job of helping you identify Customer Segments that you can use for targeted marketing.
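The search for a sensible number of clusters can be sketched with scikit-learn as follows. This is an illustrative setup rather than a recipe: the two customer features (orders per year and average basket value), the three synthetic groups, and the candidate range of k are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score

rng = np.random.default_rng(42)

# Synthetic customers: [orders per year, average basket value] (made-up groups).
centers = np.array([[3.0, 30.0], [20.0, 40.0], [8.0, 150.0]])
X = np.vstack([c + rng.normal(size=(50, 2)) * [1.0, 5.0] for c in centers])

# Put both features on the same scale so neither dominates the distance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

results = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    results[k] = (
        silhouette_score(X_scaled, labels),      # higher is better
        davies_bouldin_score(X_scaled, labels),  # lower is better
    )

best_by_silhouette = max(results, key=lambda k: results[k][0])
best_by_davies_bouldin = min(results, key=lambda k: results[k][1])
print(best_by_silhouette, best_by_davies_bouldin)
```

On real customer data the two measures will not always agree, which is where knowledge of the business has to break the tie.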

3. Classification

Classification is a common Machine Learning method with ample use cases. Classification, loosely defined, is the process of assigning data points to a category based on historical data. This can help us categorize new users into potential buyers, loyal customers, cart abandoners, etc. Classification is generally used in tandem with recommendation engines for E-commerce use cases since classification alone does not prove to be fruitful. Matrix Factorization creates User-Product pairs with a categorical variable (generally a column that flags the record as a sale or non-sale). This information is then used to train a classification model to flag a particular user-product pair as a sale or a non-sale. This can be used to effectively target users with only the set of products that have the highest probability of leading to a sale.
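A sale/non-sale classifier over user-product pairs might look like the sketch below. Everything about the data is an assumption for illustration: the two features (an interaction score, as a recommendation model might produce, and an in-cart flag) and the synthetic rule that generates the sale flag. Logistic regression is used only because it is a simple baseline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Hypothetical features for each user-product pair.
interaction_score = rng.uniform(0, 1, n)  # e.g. a score from a recommender
in_cart = rng.integers(0, 2, n)           # 1 if the product is in the cart

# Synthetic sale flag: a sale becomes more likely as both features rise.
logits = 6 * (interaction_score - 0.5) + 2 * in_cart - 1
sale = (rng.uniform(0, 1, n) < 1 / (1 + np.exp(-logits))).astype(int)

X = np.column_stack([interaction_score, in_cart])
X_train, X_test, y_train, y_test = train_test_split(
    X, sale, test_size=0.25, random_state=0
)

clf = LogisticRegression().fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Rank three candidate pairs by their predicted probability of a sale.
candidates = np.array([[0.9, 1], [0.5, 0], [0.1, 0]])
sale_prob = clf.predict_proba(candidates)[:, 1]
print(round(accuracy, 2), np.round(sale_prob, 2))
```

Sorting a user’s candidate products by `sale_prob` gives the shortlist of products worth promoting to that user.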

There are a handful of problem statements that can be solved directly using Classification. For instance, your client wants to block IPs that do not make any purchases on your website (referred to as “Lookers”). Lookers generally have no intention of making a purchase and are almost always detrimental to the online business. Converting a Looker to a buyer is an unnecessary task and has no productive long term gains. Hence, some clients with limited server hits may choose to block IPs. A preemptive categorization of these IPs can save the online business a lot of trouble. 

Now that you are aware of the different algorithms used in E-commerce use cases, let’s dive into the tools that can help you implement Machine Learning on your website. There are two main categories: the first does not require a dedicated in-house Data Scientist, while the second does.

Low code tools to implement Machine Learning in your e-commerce use cases (cost intensive)

1. GA360

Google Analytics 360 is one of the most sophisticated Retail accelerators available. It easily integrates with GCP services such as BigQuery and can be used to draw insights. GA360 also provides a heuristic-based Segmentation model. Though it’s not a Machine Learning implementation, combining GA360 with BigQuery can help you build a Recommendation Engine driven by user interaction. Since GA360 captures information such as the number of clicks, time spent on a page, and products clicked or bought, an efficient Recommendation Engine can be modeled from these data points. Of course, you will have to pay a hefty fee for both services, but the upside of this combination is that you can customize your user experience even without having historical user data. Hence, this is a good fit for any client looking to establish their online presence.

2. BigQueryML

BigQueryML is one of Google’s most sophisticated low-code approaches to building complex Machine Learning models. Though this approach requires you to have historic user data, you don’t need to worry about the complex implementation details required to build a good recommendation engine. With BigQueryML, all you need to do is specify the type of Matrix Factorization you wish to run and let BigQueryML manage the rest. Let’s say your team lacks an understanding of the plethora of Matrix Factorization variants, or you want to expedite the process of building a Recommendation Engine without spending time on how it works: enter the BigQueryML Recommendation Engine, a fully managed service that not only trains the appropriate model but also takes care of deploying it and providing you with the inferences. With great power comes great expense! Both GA360 and BigQueryML are expensive and require a considerable amount of time to start yielding returns.

3. DialogFlow 

A lot of E-commerce websites come with a handy Chatbot that assists customers with queries and recommends products based on the requirements they outline. Building a custom Chatbot is a challenging task but has loads of benefits. Dialogflow is a very convenient way of integrating conversational AI with a website, web app, mobile app, etc. You can use Dialogflow CX to build a customized chatbot that readily integrates with most retail use cases. This is one of the fastest ways to integrate an AI assistant that can not only personalize the experience for a customer but also help with E-commerce website issues such as cart failures, cart abandonment, and payment or transaction failures. Dialogflow also supports multiple languages and does not require a Data Scientist to configure it. This tool is part of the Google Cloud Platform suite and hence is not freely available.

4. Granify

Remember the classification section, where we spoke about how important it is for a business to categorize users as buyers and non-buyers? Granify does exactly that! It helps online retailers by not only identifying the customers who would not buy a product but also enticing them to buy by using Machine Learning. Granify uses a wide range of techniques, including Classification, Matrix Factorization, and Image Processing, to accurately map out a customer journey and identify the optimal point in the journey where the customer could become a buyer. They do have a very aggressive goal of attaining around a 4% revenue increment within 90 days of implementation, but it’s an innovative tool to use. Not only does Granify automate the implementation of Machine Learning on your retail website, it also provides nifty ways to analyze the traffic reaching your website, all without the need for a data science team. All you need is a set of folks who are passionate about their business and have a deep understanding of their customer base.

Low code approaches are a quick way of adding a handy machine learning solution to your online business. But this comes at the expense of exposing or sharing your customer information with a third party. Though most of these providers have policies of not using Personally Identifiable Information (PII), some clients go a step further by hiring in-house data scientists to create their own custom integrations. For such use cases, the following tools and technologies can come in handy when building a solution for your retail client.

Code intensive tools to implement Machine Learning in your e-commerce use cases (cost effective)

1. TensorFlow Garden NeuMF 

TensorFlow is an open-source Python library used to create Deep Learning models. A very specific part of its ecosystem is the implementation of Neural Matrix Factorization (NeuMF) in the TensorFlow Model Garden. It’s one of the fastest ways to implement a sophisticated Collaborative Filtering Recommendation Engine. Since it’s open-source, it’s free to implement and integrates quite readily with most infrastructures. But a strong Python developer and an ML Engineer are required to train and deploy the NeuMF model in a production setup.

NCF is a general framework for collaborative filtering of recommendations in which neural network architecture is used to model user-item interactions. Unlike traditional models, NCF does not resort to Matrix Factorization (MF) with an inner product on the latent features of users and items. It replaces the inner product with a multi-layer perceptron that can learn an arbitrary function from data.

Two instantiations of NCF are Generalized Matrix Factorization (GMF) and the Multi-Layer Perceptron (MLP). GMF applies a linear kernel to model the latent feature interactions, and MLP uses a nonlinear kernel to learn the interaction function from data. NeuMF is a fused model of GMF and MLP that better models complex user-item interactions, unifying the linearity of MF and the non-linearity of MLP for modeling the user-item latent structures. NeuMF allows GMF and MLP to learn separate embeddings and combines the two models by concatenating their last hidden layers. Though NeuMF doesn’t have a direct implementation in JavaScript, with the advent of TensorFlow.js you can convert a trained TensorFlow NeuMF model into a format that can be loaded from JavaScript and integrate it with your E-commerce website. But keep in mind that NCF models are generally large and will require dedicated storage on your server.
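The way the two branches are fused can be sketched as a single forward pass in NumPy. This is not the TensorFlow Model Garden implementation: the layer sizes are arbitrary and the weights are random and untrained, so the output only demonstrates the data flow, not a meaningful recommendation score.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, gmf_dim, mlp_dim = 5, 4, 8, 8  # sizes assumed for this sketch

# GMF and MLP learn separate embedding tables for users and items.
gmf_user = rng.normal(scale=0.1, size=(n_users, gmf_dim))
gmf_item = rng.normal(scale=0.1, size=(n_items, gmf_dim))
mlp_user = rng.normal(scale=0.1, size=(n_users, mlp_dim))
mlp_item = rng.normal(scale=0.1, size=(n_items, mlp_dim))

# MLP tower: two hidden layers (random, untrained weights).
W1, b1 = rng.normal(scale=0.1, size=(2 * mlp_dim, 16)), np.zeros(16)
W2, b2 = rng.normal(scale=0.1, size=(16, 8)), np.zeros(8)

# Output layer over the concatenated [GMF vector, last MLP hidden layer].
w_out, b_out = rng.normal(scale=0.1, size=gmf_dim + 8), 0.0


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def neumf_score(user, item):
    """Forward pass for one (user, item) pair; returns a value in (0, 1)."""
    gmf_vec = gmf_user[user] * gmf_item[item]             # GMF: element-wise product
    h = np.concatenate([mlp_user[user], mlp_item[item]])  # MLP: concatenated embeddings
    h = relu(h @ W1 + b1)
    h = relu(h @ W2 + b2)
    fused = np.concatenate([gmf_vec, h])                  # fuse the two branches
    return float(sigmoid(fused @ w_out + b_out))


score = neumf_score(user=0, item=2)
print(score)
```

Training would adjust the embeddings and weights against observed interactions; the fusion step itself stays exactly as shown.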

2. Python 

Python is one of the most commonly used programming languages for creating Machine Learning models and Artificial Intelligence applications. While Python itself has numerous frameworks to help build an E-commerce website (for instance, Flask), it is generally used to build, train, and deploy sophisticated models using libraries such as scikit-learn, TensorFlow, PyTorch, Theano, etc. Python also opens doors to building Natural Language Processing models or creating your own chatbot framework.

With a strong Data Science team, you can build custom Machine Learning models that best fit your client’s use case. Furthermore, some E-commerce websites are readily built using Python-based web frameworks. Such websites will easily integrate with your python-based machine learning models. You can also create wrappers around Node servers or use a dual Node-Python server to deploy your model. A dedicated Microservice or API can also be created that generates the inferences from your model. Hence, with Python the possibilities are endless.
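A dedicated inference microservice can be as small as the sketch below, which uses only the Python standard library. The endpoint path, the `user_id` query parameter, and the `PRECOMPUTED` lookup table (a stand-in for a real trained model) are all hypothetical; a production service would more likely use a web framework and add authentication, logging, and error handling.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Stand-in for a trained model: precomputed recommendations per user (made up).
PRECOMPUTED = {
    "u1": ["oatmeal"],
    "u2": ["face masks", "sanitizers"],
}


def recommend(user_id):
    """Return the recommendations for a user, or an empty list if unknown."""
    return PRECOMPUTED.get(user_id, [])


class RecommendationHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/recommendations":
            self.send_error(404)
            return
        # Read ?user_id=... from the query string (empty string if missing).
        user_id = parse_qs(url.query).get("user_id", [""])[0]
        body = json.dumps({"user_id": user_id, "items": recommend(user_id)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the service; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), RecommendationHandler).serve_forever()
```

Calling `serve()` starts the service locally, after which a Node server (or any other front end) can request `/recommendations?user_id=u2` and receive the model’s output as JSON.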

What is different from other industries?

As a Data Scientist working with Retail clients, it’s hard to ignore the unique facets of the Retail space. The E-commerce industry is heavily reliant on the interactions made by customers, and the customers are at the center of it all. One might say that you need an existing supply chain, an innovative product, a demand for that product, etc., but with or without these prerequisites, an E-commerce website needs customers visiting it and interacting with it. As long as you can drive traffic to your website, you have won half the battle. The following attributes set the E-commerce industry apart from other industries.

Internet

The Internet is a vast space and anyone/everyone is on it. Your client’s competitor might be on it, your potential buyer might be on it or a technocrat reading this article can be on it. There is a sense of duality to this: Your customer base is almost infinite but your buyers are minuscule in comparison. This is quite a conundrum and hence opens up avenues for target marketing, region limitations, etc. One rule of thumb for any start-up/SME (small to medium enterprise) is that they shouldn’t grow too quickly. By being on the internet, you don’t have control over the viewership. Hence, there is a high chance of your business booming with a sudden onset of customers. Without the right infrastructure, this could have detrimental effects. Another aspect of Internet-based services is that customer reviews spread like fire. Remember MySpace? Well, some of you must not even have heard of it but MySpace was what Facebook is today. You cannot see such an erratic customer base in any other industry! 

Personalization

The E-commerce industry is known for providing customized experiences. Making a customer feel special is a strong tactic and a very useful one at that. Not having it is a definite letdown and a probable cause of losing customers. And you don’t charge the customers for the experience! This is rarely seen in any other industry, where personalization is almost always an add-on or a ‘luxury’ product. But in the case of the E-commerce industry, it’s a must-have!

Customer reach

Predicting whether a customer can or will buy your product is far more feasible in online retail than in most other industries. If used properly, an online retail business can reach the right audience without setting foot outside. This is due to the large number of customers available online. Today, almost everyone has access to the internet in some shape or form, and the number of people on the internet is only going to rise. This creates a larger pool of people for an average online retailer to reach. Add social media into the mix and you have the right concoction to reach your audience without spending money on expensive physical billboards.

Analytics

Though this isn’t specific to E-commerce, it’s used a lot by online businesses. In fact, the rise of web analytics can be traced back to the dot-com bubble. Analytics as a field is pervasive in retail businesses, specifically online retail businesses. This is because online retailers not only have a larger reach but also have the medium to instantly convert a visitor into a buyer.

Search Engine ranking

Search Engine ranking is a controversial problem. It’s controversial because to rank higher, one has to play by the rules of a certain Marketing conglomerate. These rules are often verbose and not always consistent. Why is it important to rank high on Search Engines? The attention span of an average individual is going down drastically every year. This leads to a large set of people viewing the first 4-5 links presented to them. Hence, it’s quite a unique challenge to tackle since you are not just in contention with your competitors but with the entire internet so to speak. 

What surprised me the most when I started working in e-commerce?

Apart from the differentiators, E-commerce is one of the coolest Data Science industries to work in. It always amazes me how a simple website can be a culmination of several different Machine Learning microservices, all working together to provide a holistic experience. The applications are not limited to one field of study. You can use Computer Vision, Natural Language Processing, Deep Learning, Fuzzy Logic, and so much more.

Something that surprised me is how rapidly E-commerce operations have moved to the cloud. Large Retail clients with a massive presence in the E-commerce space either come up with their own cloud infrastructure (ahem! Amazon) or move their entire infrastructure to an up-and-coming Cloud service provider (Walmart moving to Google). E-commerce is no longer an industry driven by Web Developers. This came as a revelation through the different use cases I was able to work on.

The E-commerce industry has carved a niche for itself, creating a multitude of spaces for employment and innovation. While it’s a space for large-scale retailers to reach a larger customer base, it has also created opportunities for small-scale online businesses to promote their products and services. I always believed in the idea of Goliath crushing David, where a giant conglomerate would eventually crush a small business by mass-producing its products. But I have seen a lot of small businesses thrive due to their online presence. Some of them leverage basic data science techniques to reach their audience while others use 3rd party advertising tools to drive more traffic to their websites.

While a separate article can be written just answering this question, the retail space never ceases to surprise you. 

What are the biggest challenges as a Data Scientist in e-commerce?

One of the most challenging tasks as a Data Scientist for online businesses is to craft a Hypothesis that holds! Coming up with a Hypothesis is generally quite challenging in any industry, but given the dynamic and sheer unpredictable nature of the E-commerce industry, more often than not a strong, well-thought-out Hypothesis fails. It fails not because of an incorrect assumption or a failure to factor in all variables; it fails because of a radical shift in customer behavior. That’s probably why you need to rebuild and redeploy a Machine Learning model frequently.

The other challenge is that E-commerce clients have very different business requirements. This creates a steep learning curve with very little overlap. While the fundamental metrics and strategies such as Customer Lifetime Value, Market Basket Analysis, etc. remain the same, their implementation for a particular business depends on the goals and aspirations of that business. Hence, domain knowledge can only partly be reused.

Another challenge is the saturation of options for user personalization. While UI-based personalizations are plenty, a machine learning use case for enhancing customer experience is almost always a Recommendation Engine. You can never really go wrong with building a Recommendation Engine, but most online businesses hire Data Scientists to build some form of one. This might not seem like a challenge to some, but it is bound to reach a point of saturation.

Conclusion

I hope this article was able to give you a few insights into the E-commerce industry. There are a lot of Data Science use cases out there that you can start with and get your hands dirty. I have provided a few links below to enhance your understanding of some of the concepts I touched upon. What’s even more exciting is that the E-commerce industry is almost always hiring Data Scientists. Like something about the article or want me to elaborate on something? Please do provide your valuable feedback.

Resources:

Do read this and share it across your network.
