This article explains how the Engine processes data for the subscription option in the Customer Churn Prediction template.
On the AI & Analytics Engine, the process of building models using templates begins with the input datasets provided by the user. In the case of the subscription option within the Customer Churn Prediction template, the input datasets are:
A dataset that includes the dates of subscription start and termination
At least one event log
An optional dataset with customer information
In the context of subscription-based businesses, the goal is to identify active customers who are likely to terminate their subscriptions in the near future. The model trained by the template assigns each customer a label indicating how urgently retention measures are needed, according to the business's own definition of urgency.
The 3 possible labels are:
“too late” → customers classified as “too late” are hard to retain because they are likely to leave so soon that interventions are unlikely to be effective
“about time” → customers classified as “about time” could be retained with timely interventions
“safe” → customers classified as “safe” do not require any immediate actions
In the template-building process, users can customize the periods corresponding to the three classes (called “churn periods”) as per their business requirements. For instance, users can set customers who are likely to churn within the next 7 days as “too late”, customers who are likely to churn within the 14-day period after the next 7 days as “about time”, and customers who are not likely to churn in the next 21 days as “safe”.
Example of labels that will be predicted by the model
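The mapping from churn periods to labels described above can be sketched as a small function. This is only an illustrative sketch, not the Engine's actual implementation: the function name `churn_label`, its parameters, and the choice to treat period boundaries as inclusive are all assumptions made for the example.

```python
def churn_label(days_until_end, too_late_days=7, about_time_days=14):
    """Map the days remaining until subscription end to a churn label.

    Hypothetical helper: names and inclusive boundaries are assumptions.
    `days_until_end` is None when no termination is known.
    """
    if days_until_end is None:
        # No known end date: the customer is not expected to churn.
        return "safe"
    if days_until_end < 0:
        raise ValueError("days_until_end must be non-negative")
    if days_until_end <= too_late_days:
        return "too late"
    if days_until_end <= too_late_days + about_time_days:
        return "about time"
    return "safe"
```

With the default periods of 7 and 14 days, a customer with 3 days left is labeled “too late”, one with 10 days left is labeled “about time”, and one with 30 days left is labeled “safe”.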
The Engine generates the aforementioned labels by following these steps:
It takes historical snapshots of each past active customer at regular intervals.
For each snapshot, it determines how many days were left until the customer’s subscription ended, that is, how long the customer stayed after the snapshot date. The subscription end date is obtained from the dataset of subscription start and end dates.
It assigns a label to each snapshot by comparing the number of days remaining with the specified churn periods.
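The snapshot steps above can be illustrated with a minimal sketch for a single subscription. The function name `snapshot_days_left`, the 7-day snapshot interval, and the rule that snapshots start on the subscription start date are assumptions for the example; the article does not state the interval the Engine uses.

```python
from datetime import date, timedelta

def snapshot_days_left(start, end, interval_days=7):
    """Return (snapshot_date, days_until_end) pairs for one subscription.

    Hypothetical helper: a snapshot is taken every `interval_days` from the
    start date for as long as the subscription is still active.
    """
    rows = []
    snapshot = start
    while snapshot < end:
        rows.append((snapshot, (end - snapshot).days))
        snapshot += timedelta(days=interval_days)
    return rows

# A subscription that ran from 1 January to 20 January 2023.
rows = snapshot_days_left(date(2023, 1, 1), date(2023, 1, 20))
for snapshot, days_left in rows:
    print(snapshot, days_left)
```

Each days-remaining value would then be compared with the configured churn periods to obtain the label for that snapshot.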
After generating the labels, the next step is to create the contributing factors that can potentially predict the label at a given snapshot. To achieve this, the Engine analyzes the customer’s historical activity from each of the event logs, and computes various statistics of the selected event attributes, over different time frames.
These attributes and time frames are defined by the user in the template creation process. For example, if the user selects “amount” as a numeric attribute and “last 30 days” as a time frame, the Engine will calculate statistics (such as sum, max, and min) of “amount” over the 30 days preceding each snapshot. If the user provides multiple event logs, the Engine processes each one independently in the same way.
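As a rough illustration of the “amount over the last 30 days” example, the sketch below aggregates one numeric attribute over a window ending at the snapshot date. The function name `window_stats`, the event record layout, and the exact set of statistics are assumptions; only events strictly before the snapshot are counted here, so that no information from after the snapshot leaks into the features.

```python
from datetime import date, timedelta

def window_stats(events, snapshot, window_days=30, attribute="amount"):
    """Aggregate one numeric event attribute over the window before a snapshot.

    Hypothetical helper: `events` is a list of dicts with a "date" key and
    the chosen numeric attribute.
    """
    window_start = snapshot - timedelta(days=window_days)
    values = [
        e[attribute]
        for e in events
        if window_start <= e["date"] < snapshot
    ]
    if not values:
        return {"count": 0, "sum": 0.0, "min": None, "max": None}
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
    }

events = [
    {"date": date(2023, 1, 5), "amount": 20.0},   # older than 30 days
    {"date": date(2023, 2, 10), "amount": 35.0},  # inside the window
    {"date": date(2023, 2, 25), "amount": 15.0},  # inside the window
    {"date": date(2023, 3, 2), "amount": 50.0},   # after the snapshot
]
stats = window_stats(events, snapshot=date(2023, 3, 1))
print(stats)
```

Running the same function once per event log, attribute, and time frame would give one group of feature columns for each combination.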
The Engine also produces features from the optional customer-information dataset, if the user has added it during the app-building process. These features include the demographic and other static (non-time-varying) attributes of the customer.
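Because static attributes do not change over time, they can simply be attached to every snapshot of the same customer. The sketch below shows one way this could look; the function name `add_static_features` and the column names `customer_id`, `country`, and `plan` are hypothetical and not taken from the Engine.

```python
def add_static_features(snapshot_rows, customer_info):
    """Attach static customer attributes to each snapshot row.

    Hypothetical helper: `customer_info` maps a customer id to a dict of
    static attributes; customers without an entry get no extra columns.
    """
    enriched = []
    for row in snapshot_rows:
        static = customer_info.get(row["customer_id"], {})
        enriched.append({**row, **static})
    return enriched

snapshot_rows = [
    {"customer_id": "c1", "snapshot": "2023-03-01", "amount_sum_30d": 50.0},
    {"customer_id": "c2", "snapshot": "2023-03-01", "amount_sum_30d": 0.0},
]
customer_info = {"c1": {"country": "AU", "plan": "premium"}}
enriched = add_static_features(snapshot_rows, customer_info)
```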
For more information on the features generated by the template, see: What features are generated in a subscription customer churn app.