How to configure the train-test split of a classification or regression app?

This guide will show you how to configure the train-test split of classification and regression apps.

Note: For more information about the train-test split, see here.

Configuring the train/test split

    The train/test split can be configured for classification and regression apps as part of the app creation process.

    Choose your dataset, then the column that you want to predict, and then confirm the recommended prediction type from the three options shown:

    • Predict a numeric value (regression)

    • Predict a label from a list of labels (multi-class classification)

    • Answer a yes/no question (binary classification)

    This process is explained further in this article.

    Then in the final step, click on “Additional Configurations” to expand it. You will see a slider to choose the train/test split:

    [Image: train/test split slider]

      If you have experience configuring train/test splits, you can choose the desired split by dragging the slider.

      The slider shows the percentage of data to be used for the train portion. The rest of the data is used as the test portion for evaluation.
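
      To make the meaning of the train percentage concrete, here is a minimal Python sketch of such a split. It is an illustration only, not the app's actual implementation: the function name `split_rows`, the shuffling step, and the fixed seed are all assumptions made for the example.

```python
import random


def split_rows(rows, train_percent, seed=0):
    """Shuffle rows and split them into (train, test) by train_percent.

    Hypothetical helper for illustration; the app's own splitting
    logic is not documented here and may differ.
    """
    if not 1 <= train_percent <= 99:
        raise ValueError("train_percent must be between 1 and 99")
    shuffled = list(rows)
    # A seeded generator keeps the example reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_percent / 100)
    # The first n_train rows form the train portion, the rest the test portion.
    return shuffled[:n_train], shuffled[n_train:]


train, test = split_rows(range(1000), train_percent=80)
print(len(train), len(test))  # 800 200
```

      With the slider at 80, 800 of 1,000 rows would be used for training and the remaining 200 for evaluation.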

      For regression, i.e., “Predict a numeric value”, the allowed range for the train percentage is 1 to 99.

      For the other options, i.e., binary and multi-class classification, the allowed range for the train percentage depends on the number of instances of the least frequent class label. This guards against failures in the model-training process. If you choose a train percentage that is too high or too low, you will not be able to proceed. Instead, you will see an error message telling you the allowed range of train percentages for your dataset:

      [Image: error message showing the allowed train percentage range]
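
      The sketch below illustrates why the least frequent class label matters. It is not the rule the app applies, which is not described in this guide. It assumes a stratified split (the same train percentage applied to each class) and uses a hypothetical helper, `minority_counts`, to show how many instances of the rarest label would land in each portion.

```python
from collections import Counter


def minority_counts(labels, train_percent):
    """Return (label, n_train, n_test) for the least frequent class label.

    Assumes a stratified split, i.e. train_percent of each class goes
    to the train portion. This is an assumption for illustration only.
    """
    label, n = min(Counter(labels).items(), key=lambda kv: kv[1])
    n_train = round(n * train_percent / 100)
    return label, n_train, n - n_train


# Hypothetical dataset: 95 "no" rows and only 5 "yes" rows.
labels = ["no"] * 95 + ["yes"] * 5
for pct in (10, 80, 95):
    print(pct, minority_counts(labels, pct))
```

      In this example, a train percentage of 95 leaves no “yes” rows in the test portion, and a train percentage of 10 leaves none in the train portion. Restricting the allowed range helps avoid situations like these.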

      Caution: Once an app is created, users cannot change the train-test split ratio.