Customer Churn

No business wants its valuable customers to churn. Retaining existing customers is often the most cost-effective way to bring in revenue, and many companies know that using AI and machine learning (ML) to help reduce churn is a great idea. However, they can't hire enough data scientists to do the work! Some businesses are lucky and have a data science team in place, working hard on propensity-to-churn models. Even so, those businesses often find that not all of their product lines get the attention they deserve due to resource constraints; there are only so many hours in a day. This lack of bandwidth can often lead to less-than-optimal customer retention outcomes.

(This post uses data obtained from this Kaggle competition: Predicting churn for Bank Customers)

The PI.EXCHANGE AI & Analytics (AI & A) engine can help. We predict that the AI & A engine can increase a data scientist's bandwidth by 200%. Let's do the maths: assume one data scientist can create a propensity-to-churn model in three months, so that data scientist can create four models in one year. Adding 200% bandwidth triples that output, so the same data scientist can now create and maintain twelve models in one year instead of four.

So how does PI.EXCHANGE’s AI & A engine achieve the 200% additional capacity? Firstly, the engine predicts, out of hundreds of available machine learning (ML) algorithms, which one is best suited to your data. It measures suitability in three ways: model accuracy, training time, and prediction time. The keyword here is "predict": we do not fit hundreds of ML models to the data and then report which works best. Instead, we predict which models will work best and make recommendations, including the time and cost of development and the cost of operation. This prediction feature is one of the many features that make PI.EXCHANGE's AI & A engine stand out. It also means that we do not waste time and resources developing and training ML models that won't suit your needs.


A visual walk-through of the AI & A platform

To begin using the platform, you first need to set up an Organisation; in this case, we will set up the Customer Analytics organisation. Next, we set up the "Propensity to Churn and Profitability" project.

Set up a project

Add Data to the platform

Once we have set up a Project to organise the work, we supply the platform with the data on which to build the machine learning models. In this case, we will upload a CSV file containing the data. The platform supports many other data ingestion methods, including traditional SQL databases like PostgreSQL and NoSQL databases like MongoDB.

Once the data is uploaded, every column is categorised as numeric, categorical, or other. The data is then visualised using the appropriate visualisation for each type.
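As an illustration of what such column categorisation might look like, here is a minimal sketch of a heuristic: values that all parse as numbers are treated as numeric, a small set of repeated strings as categorical, and everything else as other. This is a hypothetical heuristic for illustration, not the platform's actual logic.

```python
def categorise_column(values, max_categories=20):
    """Guess a column's type: 'numeric', 'categorical', or 'other'.

    A hypothetical heuristic for illustration only; the platform's
    real type-inference rules are not documented here.
    """
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "other"
    try:
        # If every value parses as a number, call the column numeric.
        [float(v) for v in non_empty]
        return "numeric"
    except ValueError:
        pass
    # A small set of distinct string values suggests a categorical column.
    if len(set(non_empty)) <= max_categories:
        return "categorical"
    return "other"
```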

Setting up Apps

The concept of an App is central to how the AI & A platform manages ML models. Each App represents the set of machine learning models built on a specific dataset, and one Project can contain multiple Apps. In this case study, we will create two Apps: one for predicting the propensity to churn and another to model the profitability of each customer.

Each App takes care of details like partitioning the data into development and test samples so that we can scientifically assess each model's performance.
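The idea behind that partitioning can be sketched in a few lines: shuffle the rows with a fixed seed and hold out a fraction for testing. This is a minimal sketch of the concept only; the platform's own split (ratios, stratification) is not specified here.

```python
import random

def partition(rows, test_fraction=0.2, seed=42):
    """Split rows into development and held-out test samples.

    A minimal illustration of train/test partitioning, not the
    platform's actual implementation.
    """
    rng = random.Random(seed)     # fixed seed makes the split reproducible
    shuffled = rows[:]            # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]
```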

Once the App has been set up, the platform analyses the dataset associated with the App and predicts the suitability of the various machine learning models to the data. This is one of the key pieces of IP behind PI.EXCHANGE's AI & A engine. The critical aspects of suitability assessed at this point are model accuracy, training time, and prediction latency. For this use-case, training time and latency aren't important considerations, so we will focus on getting the best accuracy.

Creating new models

We can now proceed to the Model Recommender. The Model Recommender will rank our expertly-crafted model templates by their Predictive Performance, Training Time, and Prediction Time metrics! For this use-case, we select the top two models, which are XGBoost and LightGBM.

We select our models based on model accuracy. Here are the models we have selected:

Rank  Model     AI & A Predicted Accuracy^ (min = 0, max = 1)
1     XGBoost   0.74
2     LightGBM  0.73

^For the data scientists, the accuracy measure is F1-macro
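For readers unfamiliar with F1-macro: it computes an F1 score per class and averages them with equal weight, so minority classes (like churners) count as much as the majority class. A pure-Python illustration of the metric:

```python
def f1_macro(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    # Unweighted mean across classes: every class counts equally.
    return sum(scores) / len(scores)
```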

We fit the models with a few clicks and get the following results:

Rank  Model     AI & A Predicted Accuracy^ (min = 0, max = 1)  Actual Accuracy  Actual ROC AUC  Actual Gini
1     XGBoost   0.74                                           0.75             0.856           0.712
2     LightGBM  0.73                                           0.74             0.857           0.714
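A quick sanity check on the table: the Gini coefficient reported here is the standard transformation of ROC AUC, Gini = 2 × AUC − 1, which reproduces both rows exactly.

```python
def gini_from_auc(auc):
    """Gini coefficient as commonly derived from ROC AUC."""
    return 2 * auc - 1

# Checking the table above:
# XGBoost:  2 * 0.856 - 1 = 0.712
# LightGBM: 2 * 0.857 - 1 = 0.714
```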


Feature Importance

For each of the models, we can take a look at the Feature Importance chart and see which features are the most useful for predicting churn. For the XGBoost model, we see that Age and NumOfProducts are the most important features.

Automated creation of key diagnostics

These statistics are automatically computed and presented to the data scientist in a table. This is invaluable for busy data scientists because it takes away one of the boring parts of data science: manually producing important but tedious-to-compute diagnostics and statistics.

Let’s compare with the model built by the top Kaggle kernel (credit:

As predicted by the AI & A engine, the XGBoost classifier is the best-performing model, with an AUC of 0.856, which compares very favourably to the top-voted Kaggle results above. The engine also reports many more statistics and measures for the data scientist, saving hours of work and significantly boosting productivity.

AI & A engine auto-generates a comprehensive list of model diagnostics

If one wishes to look at a model's performance in depth, she may go to the model page and get detailed metrics and diagnostics for each model. Below is a list of the metrics and visualisations we automatically generate for your data scientists.

A comprehensive table of model evaluation statistics:

Confusion matrix, a fundamental model performance visualisation for your data scientists.

The ROC curve

Precision or Recall? Both are important
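The last two items above fit together: the confusion matrix provides the counts from which precision (how many predicted churners really churned) and recall (how many actual churners were caught) are derived. A small pure-Python sketch for a binary churn label:

```python
def confusion_counts(y_true, y_pred):
    """Counts behind a binary confusion matrix (1 = churned)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def precision_recall(tp, fp, fn):
    """Precision: fraction of predicted churners who really churned.
    Recall: fraction of actual churners the model caught."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```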


Deploying the model

With a few clicks, the chosen model can be operationalised and deployed, ready to do the grunt work of prediction! Deployment can be performed on the cloud or to an on-premise server of your choosing.

Consuming the deployed models

Once a model is deployed, the corresponding API endpoint is immediately ready for use. The result is typically returned in JSON format. For example, we can call the endpoint by providing data in comma-separated format.
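A sketch of what such a call might look like from Python. The field names and endpoint URL below are hypothetical placeholders; the real ones come from your deployed model's page on the platform.

```python
import json

# Hypothetical field names for illustration; use the columns of your
# own dataset as shown on the deployed model's page.
FIELDS = ["CreditScore", "Age", "NumOfProducts"]

def row_to_payload(csv_row):
    """Turn one comma-separated record into the JSON body for the endpoint."""
    values = csv_row.split(",")
    return json.dumps(dict(zip(FIELDS, values)))

# Sending the request (not executed here) could look like:
#   import urllib.request
#   req = urllib.request.Request(
#       "https://<your-endpoint>/predict",            # placeholder URL
#       data=row_to_payload("619,42,1").encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   response = urllib.request.urlopen(req)  # JSON with the churn probability
```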

Additionally, we provide R, Python, and Excel packages to make it easy for analysts to consume the API. All of these packages can return the data in tabular format, instead of JSON, for ease of use.

Example: Using the R package pixapi to consume the API

Example: Using the Excel plugin to consume the API


With the AI & A platform, we can complete the development and deployment of the churn model in a matter of minutes, and the bank can start using the model right away! For example, an analyst can begin making calls to the API endpoints and get the predicted probability of churn for each customer. She can then identify high-value customers with a high propensity to churn, using those insights to design strategies that stop the outflow of valuable customers much faster than before.

The PI.EXCHANGE platform provides a convenient way for companies to prototype and productionise machine learning models using the recommender-powered AI & A engine. According to a study, most machine learning (ML) models never get operationalised, and part of the problem lies with the difficulties involved in deploying machine learning; the effort that went into developing those undeployed models is wasted. The AI & A engine makes it easy to create and deploy models, and this ease of use and development helps companies eliminate wasted effort. Additionally, the ease of building ML models promotes experimentation in a cost-effective and time-efficient manner.