This article explains classification, both as a Machine Learning term and as a concept within the Engine. Classification is the process of correctly identifying an item’s type, class, or category based on its observed characteristics.
The need for classification is everywhere
Some examples are:
- Predicting what type of failure a machine will suffer over time, given the history of its sensor readings,
- Predicting whether a customer is going to churn in the next month, given their recent account activity,
- Predicting what kind of mobile phone a customer is going to purchase, given their demographic characteristics, and
- Predicting whether a person is going to have diabetes, given their lifestyle.
Businesses often need classification to be automated, in real time and at large scale. For example, a bank may have several thousand customers using their credit cards every minute. In this case, every transaction needs to be classified as either normal or fraudulent, so that the system can take immediate preventive action, such as asking the customer for password verification.
Machine Learning can automate these tasks
Machine Learning (ML) provides a way to automate such classification tasks. ML classification models are trained using specialized algorithms (or “model templates”) that learn from labelled data: existing observations for which the class (label) to be predicted is already known. The resulting trained model can then predict the unknown class or label for new data.
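To make the idea of learning from labelled data concrete, here is a minimal sketch in plain Python. It uses a nearest-centroid classifier, chosen only because it is short; it is not the algorithm or model template used by the Engine. The feature names (logins last month, support tickets) and the data values are hypothetical.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, available from Python 3.8


def train_nearest_centroid(rows, labels):
    """Learn one centroid (mean feature vector) per class from labelled data."""
    sums = {}
    counts = defaultdict(int)
    for features, label in zip(rows, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(model, features):
    """Return the class whose centroid is closest to the new observation."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical labelled data: (logins last month, support tickets) with a known label
rows = [(20, 0), (18, 1), (25, 0), (2, 4), (1, 5), (3, 3)]
labels = ["stays", "stays", "stays", "churns", "churns", "churns"]

model = train_nearest_centroid(rows, labels)

# New, unlabelled observation: the model predicts its class
prediction = predict(model, (4, 4))
```

With this toy data, a customer with few logins and several support tickets sits closest to the "churns" centroid, so that is the predicted label.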
Classification within the AI & Analytics Engine
On the AI & Analytics Engine, users can build state-of-the-art ML classification models from tabular datasets using the Create App function, as pictured below. The user needs to upload their labelled data and then choose the column or variable they want to predict.
Part of the dataset is used for training, and the remaining portion (called the test portion) is used to evaluate the model.
The evaluation report shows how accurately the trained model can predict the desired outcomes in the test portion.
The model leaderboard for evaluation of models
Detailed model performance metrics
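The train/test evaluation described above can be sketched as follows. This is an illustration under stated assumptions, not the Engine's implementation: the 25% test fraction, the single numeric feature, and the toy threshold "model" are all hypothetical choices made to keep the example short.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle the labelled records and hold out a fraction as the test portion."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """Toy model: midpoint between the mean feature value of each class."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, test):
    """Fraction of test records whose predicted label matches the known label."""
    correct = sum((x > threshold) == bool(y) for x, y in test)
    return correct / len(test)


# Hypothetical labelled data: (feature value, label), label is 1 when value > 50
records = [(x, 1 if x > 50 else 0) for x in range(0, 100, 5)]

train, test = train_test_split(records)
threshold = fit_threshold(train)   # learned from the training portion only
score = accuracy(threshold, test)  # measured on records the model never saw
```

The key design point is that the test portion is never used for training, so the score estimates how the model behaves on data it has not seen.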
The user can then make predictions on new data files automatically, either by deploying the model as a prediction API service or by using the batch prediction feature.
Prediction API service
Batch prediction
There are two types of classification apps available on the Engine:
- Binary classification - where there are only two possible classes and one of them is of particular interest, and
- Multi-class classification - where there are two or more possible classes and the prediction of every class is equally important.
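The difference between the two types shows up in how predictions are typically scored. The sketch below assumes two common conventions, which are not confirmed to be the Engine's own metrics: precision and recall on the class of interest for the binary case, and recall averaged equally over all classes (macro recall) for the multi-class case. The class names and predictions are hypothetical.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, treated as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def macro_recall(y_true, y_pred):
    """Average per-class recall, giving every class equal weight."""
    classes = sorted(set(y_true))
    recalls = [precision_recall(y_true, y_pred, c)[1] for c in classes]
    return sum(recalls) / len(recalls)


# Binary case: "fraud" is the class of particular interest
binary_true = ["fraud", "normal", "normal", "fraud", "normal", "normal"]
binary_pred = ["fraud", "normal", "fraud", "normal", "normal", "normal"]
fraud_precision, fraud_recall = precision_recall(binary_true, binary_pred, "fraud")

# Multi-class case: every phone tier matters equally
multi_true = ["basic", "basic", "mid", "mid", "premium", "premium"]
multi_pred = ["basic", "mid", "mid", "mid", "premium", "basic"]
phone_macro_recall = macro_recall(multi_true, multi_pred)
```

In the binary example only the "fraud" class is scored, while in the multi-class example each phone tier contributes one third of the final score regardless of how many records it has.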