What features are generated after the Engine processes the data in a transactional customer churn prediction app?

This article explains the features that are generated after the Engine processes the data in a transactional customer churn prediction app.

Features are known attributes used as input by machine learning models to predict the unknown target.

For the transactional option in the customer churn template app, the AI & Analytics Engine automatically generates a number of useful features from the customer transactional dataset and customer information dataset. These features represent various customer behavior statistics over different periods of time.

🎓 To learn more about the datasets needed for the customer churn template, read "What datasets are required to use the transactional option in the customer churn prediction template?"

Different types of aggregated features are generated over the selected time windows:

  • Amount-based features: minimum, maximum, standard deviation, and total amount spent and refunded.

  • Count-based features: number of transactions.

  • Time-interval-based features: minimum, maximum, standard deviation, and average number of days between transactions.

  • Recency features: days since last transaction.
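As a rough illustration of the aggregations listed above, the following sketch computes them for a single customer using only the Python standard library. It is not the Engine's implementation. The function names, the (date, amount) input shape, the half-open window boundary, and the use of the population standard deviation are all assumptions made for the example.

```python
from datetime import date, timedelta
from statistics import mean, pstdev


def _agg(fn, values):
    """Apply fn to values, or return None when there is nothing to aggregate."""
    return fn(values) if values else None


def window_features(transactions, snapshot, days):
    """Aggregate one customer's spending over the `days` days before `snapshot`.

    `transactions` is a list of (date, amount) pairs (hypothetical input shape).
    """
    start = snapshot - timedelta(days=days)
    # Assumption: the window is half-open, (start, snapshot].
    in_window = sorted((d, a) for d, a in transactions if start < d <= snapshot)
    dates = [d for d, _ in in_window]
    amounts = [a for _, a in in_window]
    # Days between consecutive transactions inside the window.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    suffix = f"_last_{days}d"
    return {
        f"min_amount_spent{suffix}": _agg(min, amounts),
        f"max_amount_spent{suffix}": _agg(max, amounts),
        f"total_amount_spent{suffix}": sum(amounts),
        # Assumption: population standard deviation; the Engine may differ.
        f"stddev_spent_amount{suffix}": _agg(pstdev, amounts),
        f"number_of_spending_transactions{suffix}": len(amounts),
        f"min_days_btw_transactions{suffix}": _agg(min, gaps),
        f"max_days_btw_transactions{suffix}": _agg(max, gaps),
        f"avg_days_btw_transactions{suffix}": _agg(mean, gaps),
        f"stddev_days_btw_transactions{suffix}": _agg(pstdev, gaps),
    }


def days_since_last_transaction(transactions, snapshot):
    """Recency: days between the latest transaction and the snapshot date."""
    past = [d for d, _ in transactions if d <= snapshot]
    return (snapshot - max(past)).days if past else None


# Made-up transactions for one customer.
transactions = [
    (date(2023, 12, 1), 99.0),
    (date(2024, 1, 5), 10.0),
    (date(2024, 1, 20), 30.0),
    (date(2024, 1, 28), 20.0),
]
snapshot = date(2024, 1, 31)
features_30d = window_features(transactions, snapshot, 30)
features_15d = window_features(transactions, snapshot, 15)
recency = days_since_last_transaction(transactions, snapshot)
```

Each time window produces its own set of columns, distinguished by the suffix, while the recency feature is computed once per customer.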

The user can select the time windows used to generate these aggregated features when they are defining contributing factors.

Apart from aggregated features, additional demographic features from the customer info data are also included, if available.

🎓 For more information about "contributing factors", read "What do 'contributing factors' mean in the customer churn prediction template?"

Specifying two time windows. One is the most recent 30 days. The other is a 30-day range, 30 days ago.
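To make the two windows concrete, here is a minimal sketch that turns a window length and an offset into a date range relative to a snapshot date. The function `window_bounds`, its parameters, and the example snapshot date are hypothetical. How the Engine treats the boundary days is not stated in this article, so the endpoints here are only one possible convention.

```python
from datetime import date, timedelta


def window_bounds(snapshot, length_days, offset_days=0):
    """Return (start, end) for a window that ends `offset_days` before `snapshot`."""
    end = snapshot - timedelta(days=offset_days)
    start = end - timedelta(days=length_days)
    return start, end


snapshot = date(2024, 3, 31)                 # assumed snapshot date
recent = window_bounds(snapshot, 30)         # the most recent 30 days
previous = window_bounds(snapshot, 30, 30)   # a 30-day range, 30 days ago
```

With this convention the two windows sit back to back: the earlier window ends where the most recent one begins.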


As an example, consider the following credit-card transactions dataset and customer information dataset as input to the template:

Transaction dataset

Customer information dataset

The Engine then generates the following features. For the features based on transaction activity, suffixes such as _last_30d and _last_15d correspond to the time windows chosen when selecting the contributing factors:

| Description | Feature name |
| --- | --- |
| Amount-based features | min_amount_spent_last_30d |
| | min_amount_spent_last_15d |
| | max_amount_spent_last_30d |
| | max_amount_spent_last_15d |
| | total_amount_spent_last_30d |
| | total_amount_spent_last_15d |
| | stddev_spent_amount_last_30d |
| | stddev_spent_amount_last_15d |
| Count-based features | number_of_spending_transactions_last_30d |
| | number_of_spending_transactions_last_15d |
| Time-interval-based features | min_days_btw_transactions_last_30d |
| | min_days_btw_transactions_last_15d |
| | max_days_btw_transactions_last_30d |
| | max_days_btw_transactions_last_15d |
| | avg_days_btw_transactions_last_30d |
| | avg_days_btw_transactions_last_15d |
| | stddev_days_btw_transactions_last_30d |
| | stddev_days_btw_transactions_last_15d |
| Recency features | days_since_last_transaction |
| Features from snapshot time | current_month |
| | current_week_number_in_year |
| Time-based features from customer info data | year_of_dob |
| | month_of_dob |
| | week_in_year_of_dob |
| | weekday_of_dob |
| | days_since_dob |
| Other features from customer info data | district_id |
| | sex |
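The snapshot-time and date-of-birth features listed above can be derived from two dates. The sketch below shows one plausible way to do so with the Python standard library. The function `calendar_features` is hypothetical. ISO week numbering and a Monday = 0 weekday encoding are assumptions, since this article does not specify which conventions the Engine uses.

```python
from datetime import date


def calendar_features(snapshot, dob):
    """Derive calendar features from a snapshot date and a date of birth."""
    return {
        "current_month": snapshot.month,
        # Assumption: ISO 8601 week numbers (1 to 53).
        "current_week_number_in_year": snapshot.isocalendar()[1],
        "year_of_dob": dob.year,
        "month_of_dob": dob.month,
        "week_in_year_of_dob": dob.isocalendar()[1],
        # Assumption: Monday = 0 ... Sunday = 6.
        "weekday_of_dob": dob.weekday(),
        "days_since_dob": (snapshot - dob).days,
    }


# Made-up example values.
features = calendar_features(snapshot=date(2024, 1, 31), dob=date(2000, 1, 1))
```

Under ISO numbering, a date in early January can belong to the last week of the previous year, as 1 January 2000 does here.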