How to prepare your data with a data preparation recipe

Create a data-wrangling recipe to prepare your data as per your needs. When the recipe is complete, it will be applied to the input dataset to create a new transformed dataset.

Tip: If your dataset is already ready for machine learning or analysis and needs no further preparation, you can skip this step and create an app directly after importing your dataset.

This article continues from the previous one, Create a new dataset. The next, optional step is to prepare the data to suit your needs. This tutorial uses the same german_credit_score dataset.

1. Create a recipe

You can use “Process data” from “Quick Access” on the dataset listing page, or select “Create Recipe” on the dataset detail page. Then select "Create a new data wrangling recipe", name your recipe, and click "Create".

Note: For more information, see What is a recipe?

Create a recipe page

2. Start a recipe-building session

You will then enter the recipe-building session. The Engine needs 1-2 minutes to prepare the session before you can begin.

Recipe building session page

3. Add suggested actions

The Engine will automatically generate suggestions on what actions to add to the recipe. These suggestions are shown in the suggestions tab.

Suggestions tab page

Click the (+) button next to each of the two suggestions to queue them in the recipe:

  • Convert columns to numeric type

  • Drop columns

After you are done adding both suggestions, click "commit action" in the recipe panel.

Tip: Are you curious about why the Engine provided these suggestions? Click on "see analysis" in the suggestion box to find out.

Note: For more information, see What are suggestions?
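
To make the two suggested actions concrete, here is a rough pandas equivalent of what "Convert columns to numeric type" and "Drop columns" do to a table. This is only an illustrative sketch, not the Engine's implementation: the sample rows and the column names (`duration`, `credit_amount`, `constant_flag`) are hypothetical and are not taken from the real german_credit_score dataset.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "duration": ["6", "48", "12"],           # numbers stored as text
    "credit_amount": ["1169", "5951", "n/a"],
    "constant_flag": ["x", "x", "x"],        # same value in every row
})

# "Convert columns to numeric type": parse text into numbers.
# Values that cannot be parsed become NaN instead of raising an error.
for col in ["duration", "credit_amount"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# "Drop columns": remove a column that adds no information.
df = df.drop(columns=["constant_flag"])

print(df.dtypes)
```

In the Engine these steps are configured through the suggestions tab rather than written as code; the sketch only shows the kind of change each action makes.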

4. Add actions

Next, add a specific action to the recipe manually:

  • Drop columns: Column1 is just the row number, so it can be removed

Add Drop Columns action

  1. Click on the "Add Action" tab.

  2. In the search field, enter "Drop".

  3. Select the Drop action.

  4. Under Input Columns, select "Column1".

  5. Click “ADD” to add the action to the queue.

Add actions page

Note: To see a full list of actions supported in the Engine, see action catalogue.
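
For readers who think in code, the manual Drop action above corresponds roughly to the following pandas sketch. It assumes a small hypothetical table in which only the name `Column1` comes from this tutorial; the other columns and all values are made up, and the check that the column is a plain row counter is an extra step added here for illustration.

```python
import pandas as pd

# Hypothetical rows; only the name "Column1" comes from the tutorial.
df = pd.DataFrame({
    "Column1": [0, 1, 2],
    "duration": [6, 48, 12],
    "credit_amount": [1169, 5951, 2096],
})

# Confirm that Column1 simply counts rows (0, 1, 2, ...) before discarding it.
is_row_number = df["Column1"].tolist() == list(range(len(df)))

# drop() returns a new DataFrame; the original df is left unchanged,
# much like a recipe produces a new dataset from the input dataset.
prepared = df.drop(columns=["Column1"]) if is_row_number else df

print(prepared.columns.tolist())
```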

5. Commit actions

Select the "RECIPE" tab, and at the bottom of the tab, click on "Commit Actions".

Caution: Once actions are committed, they can no longer be edited.

Committing applies the queued actions to the entire dataset and generates a fresh set of suggestions based on the updated dataset. At this stage, you can repeat steps 3 and 4 to transform the dataset further.

For this tutorial, we are happy with the current state of the dataset. Proceed to the next step to finalize & end.

6. Finalize & end

Click on "Finalize & end" to finalize the recipe. This will generate a transformed dataset (german_credit_score - Processed) by applying the actions in your recipe to the selected input dataset.

When you finalize the recipe, any actions still in the queue are committed automatically.

Caution: Finalized recipes can no longer be edited. To make changes, you will need to create a new recipe.
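
The queue, commit, and finalize lifecycle described in this article can be pictured with a small model. The sketch below is purely illustrative: the `Recipe` class, its methods, and the `drop_column` helper are hypothetical names invented for this example and do not correspond to the Engine's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Action = Callable[[List[Row]], List[Row]]


@dataclass
class Recipe:
    """Hypothetical model of a recipe; not the Engine's actual API."""
    queued: List[Action] = field(default_factory=list)
    committed: List[Action] = field(default_factory=list)
    finalized: bool = False

    def add(self, action: Action) -> None:
        # New actions go into the queue first.
        if self.finalized:
            raise RuntimeError("finalized recipes are no longer editable")
        self.queued.append(action)

    def commit(self) -> None:
        # Committing moves every queued action into the locked part of the recipe.
        if self.finalized:
            raise RuntimeError("finalized recipes are no longer editable")
        self.committed.extend(self.queued)
        self.queued.clear()

    def finalize(self, rows: List[Row]) -> List[Row]:
        # Finalizing commits anything still queued, then applies all
        # committed actions in order to produce a new dataset.
        self.commit()
        self.finalized = True
        for action in self.committed:
            rows = action(rows)
        return rows


def drop_column(name: str) -> Action:
    # Build an action that returns new rows without the given column.
    return lambda rows: [{k: v for k, v in row.items() if k != name} for row in rows]


recipe = Recipe()
recipe.add(drop_column("Column1"))  # queued, not yet committed
source = [{"Column1": 0, "duration": 6}, {"Column1": 1, "duration": 48}]
processed = recipe.finalize(source)  # auto-commits, then transforms
print(processed)
```

The model mirrors two points from the text: the input rows are left untouched while a new, processed dataset is returned, and once `finalize` has run, further calls to `add` are rejected.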