What are feature importance values?

This article outlines the concept of feature importance.

Feature importance is a method to help us interpret a predictive model. The values indicate the relative contribution of the dataset features to the predictions generated by the model.

Each feature - a variable or column that is used as an input to make a prediction - is assigned a value, called the feature importance. It indicates how much the variable impacts the prediction. Ranking features from highest to lowest importance gives better insight into how a model is making its predictions.
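
As a minimal sketch of the ranking step, the snippet below sorts a set of importance values from highest to lowest. The feature names and numbers are hypothetical and chosen only for illustration; they do not come from any model in this article.

```python
# Hypothetical importance values; names and numbers are illustrative only.
feature_importance = {
    "feature_a": 0.12,
    "feature_b": 0.55,
    "feature_c": 0.08,
    "feature_d": 0.25,
}

# Sort (name, value) pairs by value, largest first.
ranked = sorted(feature_importance.items(), key=lambda item: item[1], reverse=True)

for rank, (name, value) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {value:.2f}")
```

The feature at the top of the printed list is the one the model's predictions depend on most.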

The model’s predictions are more sensitive to changes in features with higher importance. For example, in a streaming-video service, a model that predicts whether a customer is going to churn can be expected to depend heavily on the customer's history of hours watched during weekends.

On the AI & Analytics Engine, feature importance can be generated once a classification or regression model has been trained, and can be accessed from the model’s summary page, under Insights. Also, the AI & Analytics Engine can generate the feature importance values irrespective of the algorithm chosen for training.
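
This article does not state which technique the Engine uses to compute the values. One common way to obtain feature importance regardless of the training algorithm is permutation importance: shuffle one column at a time and measure how much the prediction error increases. The sketch below illustrates that idea only. The function names, the `toy_model` stand-in, and the synthetic data are all assumptions made for this example.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Average increase in error when a single column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this column's value replaced.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            error = mean_squared_error(targets, [predict(r) for r in permuted])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for a trained model: any callable row -> prediction works.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]  # the third column is ignored


# Synthetic data, generated only for this illustration.
data_rng = random.Random(42)
rows = [[data_rng.uniform(0, 10) for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) for r in rows]

scores = permutation_importance(toy_model, rows, targets)
for name, score in zip(["col_0", "col_1", "col_2"], scores):
    print(name, round(score, 3))
```

Because the procedure only needs a way to call the model for predictions, it works the same for any algorithm. In this toy setup the column the model ignores receives an importance of zero.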

Regression feature importance example:

Assume we have a dataset of house features and their historical sale prices (in dollars).

Preview of the house prices dataset. The columns are the house features. For example, "LotArea" is the land area of the property in square feet.

We create a model to predict the future sale price of a house given its features:

Trained regression model for predicting future house sale prices, based on 25 features

After the model is created, we can explore the most predictive features by observing the feature importance list.

Feature importance list. In this case, the "OverallQual" is the most predictive feature.

Classification feature importance example:

A different example is a dataset of metadata extracted from images of steel plates, used to classify defects into 6 classes.

Preview of the steel images dataset. The columns are image meta features.

Again, we create a model, this time to classify the defect type from the image metadata:

Trained classification model for classifying different types of defects in steel plates

When there are multiple target classes, each class has its own feature importance list. This is because "feature A" might be the most predictive for one defect class, while "feature B" is the most predictive for another.
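
To make the per-class idea concrete, here is a small sketch that scores each feature separately for each class. It treats every class as a one-vs-rest problem and measures how much the error for that class rises when a column is shuffled. This is one possible approach, not necessarily the Engine's. The class names "scratch" and "stain", the `toy_classifier` stand-in, and the synthetic data are hypothetical.

```python
import random


# Hypothetical stand-in for a trained defect classifier.
def toy_classifier(row):
    if row[0] > 7:
        return "scratch"
    if row[1] > 7:
        return "stain"
    return "other"


def per_class_importance(predict, rows, labels, class_name, seed=0):
    """One-vs-rest error increase per column when that column is shuffled."""
    rng = random.Random(seed)

    def one_vs_rest_error(data):
        # A row counts as wrong if "is it this class?" differs from the truth.
        wrong = sum((predict(r) == class_name) != (y == class_name)
                    for r, y in zip(data, labels))
        return wrong / len(data)

    baseline = one_vs_rest_error(rows)
    scores = []
    for col in range(len(rows[0])):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        scores.append(one_vs_rest_error(permuted) - baseline)
    return scores


# Synthetic data, generated only for this illustration.
data_rng = random.Random(1)
rows = [[data_rng.uniform(0, 10) for _ in range(2)] for _ in range(500)]
labels = [toy_classifier(r) for r in rows]

feature_names = ["feature A", "feature B"]
top_feature = {}
for class_name in ["scratch", "stain"]:
    scores = per_class_importance(toy_classifier, rows, labels, class_name)
    top_feature[class_name] = feature_names[scores.index(max(scores))]
    print(class_name, dict(zip(feature_names, scores)))
```

In this toy setup, "feature A" comes out on top for one class and "feature B" for the other, which is why a single combined list would hide useful detail.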

Feature importance list per defect type. We can toggle between the defect types to view the list for each one.