What is classification?

This article explains classification, both as a Machine Learning term and as a feature of the AI & Analytics Engine. Classification is the process of correctly identifying an item’s type, class, or category based on its observed characteristics.

The need for classification is everywhere

Some examples are:

  • Predicting what type of failure a machine will suffer over time given the history of its sensor readings,

  • Predicting whether a customer is going to churn in the next month given their recent account activity,

  • Predicting what kind of mobile phone a customer is going to purchase given their demographic characteristics, and

  • Predicting whether a person is going to have diabetes given their lifestyle.

Businesses often need classification to be automated, in real time and at large scale. For example, a bank may have several thousand customers using their credit cards every minute. Every transaction then needs to be classified as either normal or fraudulent, so that the system can take immediate preventive action such as asking the customer for password verification.

Machine Learning can automate these tasks

Machine Learning (ML) provides a way to automate such classification tasks. ML classification models are trained using specialized algorithms (or “model templates”) that learn from labelled data: existing observations for which the class (label) to be predicted is already known. The resulting trained model can then predict the unknown class or label for new data.
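To make the idea of learning from labelled data concrete, here is a minimal sketch in plain Python. It is not the Engine's own algorithm or any of its model templates; it uses a deliberately simple nearest-centroid rule, and the feature names (monthly logins, support tickets) and the data values are hypothetical, chosen only to echo the churn example above.

```python
from collections import defaultdict
from math import dist


def train_nearest_centroid(features, labels):
    """Learn one centroid (mean feature vector) per class from labelled data."""
    grouped = defaultdict(list)
    for row, label in zip(features, labels):
        grouped[label].append(row)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(model, row):
    """Return the class whose centroid is closest to the new observation."""
    return min(model, key=lambda label: dist(model[label], row))


# Hypothetical labelled data: [monthly_logins, support_tickets] with a known label.
features = [[30, 0], [25, 1], [28, 0], [2, 4], [1, 5], [3, 3]]
labels = ["stays", "stays", "stays", "churns", "churns", "churns"]

model = train_nearest_centroid(features, labels)

# A new customer whose label is unknown.
prediction = predict(model, [4, 4])
print(prediction)  # churns
```

The training step only ever sees rows whose label is known; the prediction step is then applied to a row whose label is not. Real model templates are far more sophisticated, but they follow the same two-step pattern.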

Classification within the AI & Analytics Engine

On the AI & Analytics Engine, users can build state-of-the-art ML classification models from tabular datasets using the Create App function, as pictured below. The user needs to upload their labelled data and then choose the column or variable they want to predict.

image-20220822-034238

One part of the dataset is used for training, and the remaining part (called the test portion) is used to evaluate the model.

The evaluation report shows how accurately the trained model can predict the desired outcomes in the test portion.
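The train-and-evaluate procedure described above can be sketched as follows. This is an illustration only, not how the Engine splits data or scores models: the 25% test fraction, the fixed random seed, the single-feature toy model, and the data values are all assumptions made for the example.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold back a fraction of them for evaluation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_threshold(train_rows):
    """Toy model: the midpoint between the mean feature value of each class."""
    means = {}
    for cls in (0, 1):
        values = [x for x, y in train_rows if y == cls]
        means[cls] = sum(values) / len(values)
    return (means[0] + means[1]) / 2


def accuracy(true_labels, predicted_labels):
    """Fraction of test rows whose predicted class matches the known class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labelled rows: (feature value, known class).
rows = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0),
        (7.0, 1), (7.5, 1), (8.0, 1), (8.5, 1)]

train_rows, test_rows = train_test_split(rows)
threshold = train_threshold(train_rows)  # the model never sees test_rows

predicted = [1 if x > threshold else 0 for x, _ in test_rows]
actual = [y for _, y in test_rows]
test_accuracy = accuracy(actual, predicted)
print(test_accuracy)  # 1.0 on this cleanly separated toy data
```

Because the test rows are kept out of training, the score reflects how the model behaves on data it has not seen, which is the point of holding a portion back.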

image-20220822-034518

The model leaderboard for evaluation of models

image-20220822-034756

Detailed model performance metrics 

The user can then make predictions on new data, either by deploying the model as a prediction API service or by using the batch prediction feature.

image-20220822-040515

Prediction API service  

image-20220822-040429

Batch prediction 

There are two types of classification apps available on the Engine:

  • Binary classification - where there are only two possible classes and one of them is of particular interest, and

  • Multi-class classification - where there are two or more possible classes and the prediction of every class is equally important.
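One way to see the practical difference between the two app types is in how predictions are commonly scored. The sketch below is an assumption-laden illustration, not a description of the metrics the Engine reports: for the binary case it computes precision and recall for the single class of interest, and for the multi-class case it averages recall over all classes so that each one counts equally. The labels and predictions are made up for the example.

```python
def precision_recall(true_labels, predicted, positive):
    """Precision and recall for one class treated as the class of interest."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def macro_recall(true_labels, predicted):
    """Average the per-class recall so that every class carries equal weight."""
    classes = sorted(set(true_labels))
    recalls = [precision_recall(true_labels, predicted, c)[1] for c in classes]
    return sum(recalls) / len(recalls)


# Binary: "fraud" is the class of particular interest.
true_binary = ["fraud", "normal", "normal", "fraud", "normal", "normal"]
pred_binary = ["fraud", "normal", "fraud", "normal", "normal", "normal"]
fraud_precision, fraud_recall = precision_recall(true_binary, pred_binary, "fraud")
print(fraud_precision, fraud_recall)  # 0.5 0.5

# Multi-class: every phone type matters equally.
true_multi = ["ios", "android", "other", "ios", "android", "other"]
pred_multi = ["ios", "android", "android", "ios", "ios", "other"]
phone_macro_recall = macro_recall(true_multi, pred_multi)
print(round(phone_macro_recall, 3))  # 0.667
```

In the binary example the scores focus entirely on how well fraud is caught, while in the multi-class example a weak result on any one phone type pulls the average down just as much as a weak result on another.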