Churn & Score Analysis in Action: Building a Loyalty Measurement System

A while back we looked at churn analysis and its importance for a company. We also presented our approach to estimating a customer’s value using RFM analysis and scoring. In this post, I am going to present a real churn & score solution which we implemented for one of our clients. In building the solution, we sought to address three main challenges: how to identify customers who were about to stop using a car service station, how to evaluate them and how to understand the reasons which influenced their decision.

This post is the third in a series of articles on customer churn analysis. The first two can be found here:

 

Using the churn & score analytical framework in practice

We have already discussed the framework in detail, so I am going to use the graphics below as a reference and describe how we carried out each specific step. Bear in mind that this framework can be used in many machine learning problems, not only churn analysis. Our experience in developing models and implementing them in various contexts has translated into a product that is flexible enough to address multiple problems across a range of industries.

We first had to decide which kind of data sources to use. That decision would allow us to agree on the following:

  1. Creating the target variable: a step that is seldom straightforward. We needed to translate the situation we wanted to model (churn) into a binary variable.
  2. Specifying the data used for generating predictions: the characteristics of the customers, their cars and information about every point of contact with our client.

 

Specify the modeling problem

The first goal is crucial—only an accurate specification of the modeling problem allows you to achieve the results you want. Together with our client’s business subject matter experts, we established the following definition of “churn”: a churning customer is one who has not visited the car service in the 12 months following their last visit. We then proceeded to review possible data sources. Most of the data had already been transferred to a data lake, which simplified what would have been a far more complicated and laborious review process. After careful consideration, we identified several possible areas of interest:

  • Customer information: type, demographics, address…
  • Car information: technical specification, age, engine…
  • Sales and service invoices
  • Service order information
  • Additional dictionaries and minor static tables

Figure: Churn rate dashboard
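To make the churn definition above concrete, here is a minimal Python sketch of how such a binary target could be derived from visit dates. It is an illustration rather than the code used in the project: the function names, the customer IDs and the snapshot date are hypothetical, and I assume a customer counts as churned once a full 12 months have passed since their last recorded visit.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    The day is clamped to the end of the target month,
    so 29 Feb 2020 + 12 months gives 28 Feb 2021.
    """
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def label_churn(visits_by_customer, snapshot, window_months=12):
    """Map each customer to 1 (churned) or 0 (active) as of `snapshot`.

    Assumption: a customer is churned once the whole window has elapsed
    since their last visit without a new visit being recorded.
    """
    labels = {}
    for customer_id, visit_dates in visits_by_customer.items():
        last_visit = max(visit_dates)
        window_end = add_months(last_visit, window_months)
        labels[customer_id] = 1 if window_end <= snapshot else 0
    return labels


# Hypothetical visit history for two customers.
visits = {
    "C001": [date(2020, 1, 10), date(2020, 6, 1)],
    "C002": [date(2021, 3, 1)],
}
labels = label_churn(visits, snapshot=date(2021, 9, 1))
```

Here "C001" is labeled as churned because more than 12 months separate the last visit from the snapshot date, while "C002" is still inside the window and stays active.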

 

Exploratory data analysis

A vital step in every ML project is to perform exploratory data analysis. It allows you to investigate potential modeling features and specify what additional operations will be required. These may include: imputing empty records, limiting categories (if there are too many), removing outliers and correcting variable formats. The presentation of such an analysis need not be aesthetically appealing nor graphically polished. It is, after all, no more (nor less) than an efficient way for a data scientist to verify initial assumptions.
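The clean-up operations listed above can be sketched in a few lines of Python. This is only an illustration using the standard library; the helper names, the "Other" bucket and the 1.5 x IQR outlier rule are my assumptions, not a description of what was done in the project.

```python
from collections import Counter
from statistics import median, quantiles


def impute_median(values):
    """Replace missing entries (None) with the median of the known ones."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


def limit_categories(values, top_n, other="Other"):
    """Keep the `top_n` most frequent categories and merge the rest."""
    keep = {category for category, _ in Counter(values).most_common(top_n)}
    return [v if v in keep else other for v in values]


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits; values outside them are outlier candidates."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


# Hypothetical columns from a customer table.
car_age = impute_median([3, None, 7])
fuel_type = limit_categories(["petrol", "petrol", "diesel", "diesel", "lpg"], top_n=2)
low, high = iqr_bounds([10, 12, 11, 13, 12, 11, 500])
```

In this toy data the invoice amount of 500 falls well above the upper limit, which is the kind of record a data scientist would inspect before deciding whether to remove it.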

 

Feature engineering

Apart from information that is available straight away, we wanted to enrich our dataset with additional variables. That’s why we used several kinds of aggregation, interactions, indicator variables and text mining techniques. Only then could we be sure we were maximizing the benefits of the data sources available to us.  
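As a rough illustration of the techniques mentioned above, the following Python sketch derives aggregations, an indicator variable and an interaction from invoice rows. The field names (customer_id, amount, note) and the "complaint" keyword are hypothetical, and the keyword match is a deliberately crude stand-in for real text mining.

```python
from collections import defaultdict


def build_features(invoices):
    """Aggregate invoice rows into one feature dictionary per customer."""
    grouped = defaultdict(list)
    for row in invoices:
        grouped[row["customer_id"]].append(row)

    features = {}
    for customer_id, rows in grouped.items():
        visit_count = len(rows)
        total_spend = sum(r["amount"] for r in rows)
        # Indicator variable: 1 if any service note mentions a complaint.
        complained = int(any("complaint" in r["note"].lower() for r in rows))
        features[customer_id] = {
            "visit_count": visit_count,              # aggregation
            "total_spend": total_spend,              # aggregation
            "avg_spend": total_spend / visit_count,  # aggregation
            "complained": complained,                # indicator from text
            "complained_x_visits": complained * visit_count,  # interaction
        }
    return features


# Hypothetical invoice rows.
invoices = [
    {"customer_id": "C001", "amount": 120.0, "note": "Oil change"},
    {"customer_id": "C001", "amount": 80.0, "note": "Complaint about waiting time"},
    {"customer_id": "C002", "amount": 300.0, "note": "Brake replacement"},
]
features = build_features(invoices)
```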

 

Feature selection

Having chosen our optimal modeling dataset, we next needed to specify the features we would use in our models. There are several ways to go about this, depending on the type of model being considered. For instance, you can employ stepwise selection, LASSO and feature importance to shrink the final set of features. However, we could also have tackled the problem with a more traditional approach, such as correlation analysis, and specified a desired subset of variables beforehand.
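The correlation-based approach is the simplest of these to show in code. The sketch below keeps only the features whose Pearson correlation with the churn flag reaches a chosen threshold. The feature names, the data and the 0.3 threshold are assumptions for illustration; it is a plain correlation filter, not LASSO or stepwise selection.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def select_by_correlation(columns, target, threshold=0.3):
    """Return names of columns with |correlation to target| >= threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical candidate features for four customers.
columns = {
    "months_since_last_visit": [1, 2, 10, 12],
    "car_age_years": [5, 3, 5, 3],
    "constant_flag": [1, 1, 1, 1],
}
churned = [0, 0, 1, 1]
selected = select_by_correlation(columns, churned)
```

A filter like this looks at one feature at a time, so it can miss variables that only matter in combination. That is one reason model-based methods such as LASSO or feature importance are often preferred.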

 

The ultimate goal – machine learning modeling

They say that data scientists spend 80% of their workday preparing data, a claim that has proven true numerous times in our own projects. After preparing the data, we were ready to start modeling. For the present project, we tried out four different models to tackle our problem: logistic regression, gradient boosted decision trees, random forest and support vector machines. Each has different characteristics, mostly concerning computational complexity and interpretability.

We compared the models using several indicators, starting with model accuracy (the proportion of correct predictions among all predictions) but also taking into consideration sensitivity and precision. Ultimately, we were able to achieve around 85% accuracy. In other words, out of every 100 predictions (churn or no churn), around 85 proved correct.
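Because accuracy, precision and sensitivity are easy to mix up, here is a short Python sketch that computes all three from the same set of predictions. The labels and predictions are made-up numbers, not results from the project.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision and sensitivity for binary labels (1 = churn)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churn correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed churner
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly left alone
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / len(pairs),
        # Share of churn indications that were correct.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of actual churners that the model caught.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical outcome for ten customers.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy example the model is right 7 times out of 10 (accuracy 0.7), 3 of its 4 churn indications are correct (precision 0.75), and it catches 3 of the 5 real churners (sensitivity 0.6). The three numbers answer different questions, which is why we looked at all of them.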

 

Critical success factors

Before I proceed to describing the final parts, I’ll have a look at several steps in an ML project that are key to its success. As each problem will have unique requirements and circumstances, the list is far from comprehensive. However, each of these steps has featured prominently in every analytical endeavor we have encountered:

 

1. Collaborate closely with the business departments

Regardless of how experienced your data science team may be, never forget to regularly update your business stakeholders on your findings. Your conclusions may seem logical and spot-on to you, but you aren’t the final user of the solution, so you may need to shift your approach.

 

2. Healthchecks are crucial

Don’t overlook the crucial step of exploratory data analysis, which acts as a healthcheck of your data integration methods and assumptions. Only after confirming that your EDA results are correct can you safely assume that the dataset you’ve prepared is sound. You can then decide on the subsequent steps to fill out your pipeline.

 

3. Spend sufficient time on feature engineering

Kaggle Masters and other leading data scientists agree that preparing a broad set of well-thought-out features is key to success in machine learning. Only then are you able to benefit from the full potential of your data.

 

4. Start with easier methods

First consider using regression, a very popular “benchmark model” which can reveal a great deal about what kind of result to expect in your modeling problem.
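To show how little is needed for such a benchmark, here is a minimal logistic regression trained with plain gradient descent in Python. I am assuming that the regression in question is logistic regression, since churn is a binary target. The single feature (years since the last visit), the learning rate and the number of epochs are hypothetical choices, and in practice you would normally reach for an established library rather than hand-written code.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, weights, bias):
    """Estimated churn probability for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical training data: years since last visit -> churned (1) or not (0).
rows = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 1, 1]
weights, bias = fit_logistic(rows, labels)
```

If a more complex model cannot clearly beat a baseline like this one, that is usually a signal to revisit the features before adding complexity.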

 

5. Focus on selecting the optimal parameters

When used carelessly, even the most advanced ML methods can prove less accurate than a simple approach. For instance, gradient boosted decision trees will give you better results if their hyperparameters (such as the learning rate, tree depth and number of trees) are carefully tuned.

 

Making use of our findings

Ultimately, we estimated the probability of our client’s customers (and their cars) churning. So what now? There are usually two main use cases for our churn modeling results. The first is fairly straightforward: we generate a single list of customer identification numbers together with the likelihood of their churning. Our client loaded this list into its system in order to make immediate use of the modeling results and generate quick marketing campaigns.

The second use case goes back to the RFM analysis and scoring. We not only evaluate the chances of a customer leaving the client, but also try to capture the value the company would lose if that customer left. This enables the client to decide how much to spend in an effort to retain that customer.
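One simple way to combine the two outputs is to multiply the churn probability by the customer’s value and rank customers by the result. The Python sketch below illustrates the idea. The field names and figures are hypothetical, and treating the score as a monetary value is my assumption rather than a description of the client’s dashboard.

```python
def rank_by_expected_loss(customers):
    """Sort customers by churn probability times value, highest first."""
    ranked = []
    for customer in customers:
        expected_loss = customer["churn_probability"] * customer["value"]
        ranked.append({**customer, "expected_loss": expected_loss})
    return sorted(ranked, key=lambda c: c["expected_loss"], reverse=True)


# Hypothetical model output joined with a value estimate from scoring.
customers = [
    {"customer_id": "C001", "churn_probability": 0.9, "value": 200.0},
    {"customer_id": "C002", "churn_probability": 0.3, "value": 2000.0},
    {"customer_id": "C003", "churn_probability": 0.6, "value": 500.0},
]
ranked = rank_by_expected_loss(customers)
priority_ids = [c["customer_id"] for c in ranked]
```

Note that "C002" comes first even though it has the lowest churn probability, because losing that customer would cost the most. The expected loss also gives a natural upper bound on what it makes sense to spend on retaining each customer.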

The information we collected on both churn and score enabled us to prepare a comprehensive managerial dashboard allowing for the deep analysis of results. We provided information on key factors influencing our model decision and also the ability to benefit from multiple dimensions in order to tailor our analysis. Thanks to the scoring analysis, our client could focus on specific customer segments and move ahead with appropriate anti-churn measures. You can find a sample dashboard below.

Figure: Customer churn and scoring analysis: dashboard with the results

Did you like our approach to customer churn & scoring? Contact us to discuss it and other possible applications or follow us on social media to stay up to date!

Key takeaways
  1. Churn & score analysis has helped one of our clients from the automotive industry to assess its customer loyalty
  2. A replicable analytical framework was used
  3. Every advanced analytics project is associated with similar risk and success factors
  4. There are a few ways to benefit from machine learning modeling results
