Within the AI & Analytics Engine, a recipe is a sequence of data transformation actions.
The recipe feature of the AI & Analytics Engine prepares datasets into an analysis-ready or ML-ready form. A recipe is specified as a chain of data transformations that is applied to a dataset to prepare it. Each transformation in the chain is called an “action”.
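As a rough illustration of the "chain of actions" idea, the sketch below models a recipe as an ordered list of plain Python functions. It is not the Engine's API: `drop_missing`, `rename_column`, and `run_recipe` are hypothetical names invented for this example, and a dataset is assumed to be a simple list of dictionaries.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Dataset = List[Row]
Action = Callable[[Dataset], Dataset]  # hypothetical: an action maps a dataset to a dataset


def drop_missing(column: str) -> Action:
    """Build an action that removes rows where `column` is None or absent."""
    def action(rows: Dataset) -> Dataset:
        return [row for row in rows if row.get(column) is not None]
    return action


def rename_column(old: str, new: str) -> Action:
    """Build an action that renames column `old` to `new` in every row."""
    def action(rows: Dataset) -> Dataset:
        return [{(new if key == old else key): value for key, value in row.items()}
                for row in rows]
    return action


def run_recipe(recipe: List[Action], dataset: Dataset) -> Dataset:
    """Apply each action in order, feeding the output of one into the next."""
    for action in recipe:
        dataset = action(dataset)
    return dataset


# A two-action recipe applied to a tiny illustrative dataset.
recipe = [drop_missing("age"), rename_column("age", "age_years")]
raw = [{"id": 1, "age": 34}, {"id": 2, "age": None}]
prepared = run_recipe(recipe, raw)
```

Here `prepared` contains only the first row, with `age` renamed to `age_years`. The order of actions matters: the rename runs on the output of the filter, not on the raw data.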
🎓 For more information on actions, read What are actions?
You can create recipes either manually or automatically with the Engine’s template processor, depending on the type of app you are building:
When you create an app using the Build from scratch option, you can build custom recipes manually in the Recipe Editor and run them to prepare your datasets.
When you create an app using the Template option (such as customer-churn prediction), the Engine processes the business requirements you provide as template inputs, then automatically creates and runs the appropriate recipes.
In the Engine, recipes can be created to cover many aspects of data preparation (also known as data wrangling or data munging), such as:
Data cleaning and normalization
Joining multiple datasets
Aggregating data at different levels
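To make these three kinds of preparation concrete, here is a minimal sketch in plain Python. It does not use the Engine's actions; the `customers` and `orders` tables, their columns, and their values are made up purely for illustration.

```python
from collections import defaultdict

# Illustrative input tables (invented for this example).
customers = [
    {"customer_id": 1, "region": " north "},
    {"customer_id": 2, "region": "South"},
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 5.0},
    {"customer_id": 2, "amount": 12.5},
]

# 1. Cleaning and normalization: trim whitespace and lower-case the region labels.
cleaned = [{**c, "region": c["region"].strip().lower()} for c in customers]

# 2. Joining: attach each order's region by looking up its customer_id.
region_by_customer = {c["customer_id"]: c["region"] for c in cleaned}
joined = [{**o, "region": region_by_customer[o["customer_id"]]} for o in orders]

# 3. Aggregating: roll order amounts up from order level to region level.
totals = defaultdict(float)
for row in joined:
    totals[row["region"]] += row["amount"]
totals_by_region = dict(totals)
```

Each numbered step corresponds to what would be one or more actions in a recipe; in the Engine, the equivalent steps are configured through actions rather than written as code.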
💡 For a full list of actions currently supported by the Engine and how to use them, read the action catalog.
With recipes, you can prepare your dataset in a systematic and repeatable way. Each recipe is reusable, meaning it can be applied to other datasets that have a compatible schema.
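The text does not define "compatible schema" precisely. One plausible reading, assumed here only for illustration, is that every column the recipe reads must be present in the new dataset. The helper `missing_columns` below is a hypothetical name, not an Engine function.

```python
from typing import Dict, List, Set

Row = Dict[str, object]


def missing_columns(required: Set[str], dataset: List[Row]) -> Set[str]:
    """Return the required columns that at least one row lacks."""
    missing: Set[str] = set()
    for row in dataset:
        missing |= required.difference(row.keys())
    return missing


# Columns a hypothetical recipe reads.
required = {"customer_id", "amount"}

january = [{"customer_id": 1, "amount": 20.0, "channel": "web"}]
february = [{"customer_id": 2, "total": 12.5}]

jan_missing = missing_columns(required, january)    # extra columns are fine
feb_missing = missing_columns(required, february)   # "amount" is absent
```

Under this assumption, the same recipe could be reused on `january` but not on `february` until its `total` column is mapped to `amount`. A stricter check might also compare column types.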
When you add an action, it becomes part of the recipe. You can add, remove, modify, or reconfigure actions as long as the recipe is in the editable state. In the Recipe Editor, actions are applied to a sample of the dataset the recipe targets, giving you a preview of the prepared dataset.
Once you're satisfied with a recipe, you can apply it to the full dataset to generate the output dataset. After that, the recipe can no longer be modified. However, you can copy it into a new, editable recipe and apply that copy to the same or a different dataset.
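The lifecycle described above (edit, preview on a sample, run on the full dataset, then copy to edit again) can be sketched as a small class. This is a toy model under stated assumptions, not the Engine's implementation: the `Recipe` class, its methods, and the `sample_size` parameter are all hypothetical, and the sample is assumed to be simply the first few rows.

```python
class Recipe:
    """Toy model of a recipe that locks itself after a full run."""

    def __init__(self, actions=None):
        self.actions = list(actions or [])
        self.editable = True

    def add_action(self, action):
        # Edits are only allowed while the recipe is in the editable state.
        if not self.editable:
            raise RuntimeError("recipe is finalized; copy it to make changes")
        self.actions.append(action)

    def _apply(self, rows):
        for action in self.actions:
            rows = action(rows)
        return rows

    def preview(self, dataset, sample_size=2):
        # Assumption: the sample is just the first `sample_size` rows.
        return self._apply(dataset[:sample_size])

    def run(self, dataset):
        # Running on the full dataset produces the output and locks the recipe.
        output = self._apply(dataset)
        self.editable = False
        return output

    def copy(self):
        # A copy carries the same actions but starts out editable.
        return Recipe(self.actions)


def double_amount(rows):
    """Illustrative action: double the 'amount' column."""
    return [{**row, "amount": row["amount"] * 2} for row in rows]


data = [{"amount": 1}, {"amount": 2}, {"amount": 3}]

recipe = Recipe()
recipe.add_action(double_amount)
preview = recipe.preview(data)   # computed on the first two rows only
output = recipe.run(data)        # full dataset; recipe is now locked
draft = recipe.copy()            # editable copy with the same actions
draft.add_action(double_amount)  # allowed on the copy, not on the original
```

Calling `recipe.add_action(...)` after `run` raises an error in this sketch, mirroring the rule that a finalized recipe cannot be modified, while `draft` remains free to change.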