Data Preparation

Using the Formula Editor to Transform Datasets


Introducing the new user-friendly formula-based actions. Using these actions you can easily use custom formulas to enrich, transform, and filter any dataset.

Introduction

Have you ever wanted a user-friendly tool that enables you to use complex formulas to clean or transform your big datasets? Well, good news! As part of the data cleaning and data wrangling process, the AI & Analytics Engine’s formula editor gives you the power to easily use your custom-made formulas to create new columns, transform existing columns, or filter rows of data. These actions combine the AI & Analytics Engine’s computational power and a user-friendly formula editor.

Next, we will go over how you can utilize our new formula-based actions. This article is structured in three sections: demonstrating how to create columns, transform columns, and filter rows based on formulas.

For the purpose of this demonstration, we will be using a sample dataset that has previously been uploaded to the engine. Remember that the quality of data uploaded matters in retrieving an output of good quality from the machine learning process. Note that before using the actions, we need to cast the column to appropriate types. For details on how to get started with the Engine, you can visit the knowledge hub here

After some data preparation steps, our dataset looks like this:

Dataset preparation on the AI & Analytics Engine

Creating new columns

To create a new column based on a formula, we use the “ Create a new column with a formula” action following these steps:

  • First, we click on add action and select “Create a new column with a formula” action from the list.

Create new column with a formula action on the AI & Analytics Engine

  • Next, we specify a name for the new column.

Specifying a name for the column on the AI & Analytics Engine
  • Now, we enter the formula using existing column names as variables. The input column names need to be in backticks. Note that the backticks are inserted automatically if you select an input column name from the autocomplete dropdown.

Standard mathematical formula syntax is supported by the formula editor. The formula editor’s autocomplete suggestions makes the process simple and easy. The AI & Analytics Engine’s formula actions support a wide range of numeric, string, date-time and boolean functions.

  • Finally, we add the action to our recipe to see a preview of the result column.

Preview of the results on the AI & Analytics Engine

Here is a short clip of the whole process: 

 

 

Transforming columns

Let’s say we want to transform the values in one or multiple columns using a formula, the AI & Analytics Engine’s “Transform columns with a formula” action can help with that.

This action is pretty similar to the previous one. The only differences are:

  • Transform columns with a formula” overwrites the inputs with the formula result, therefore, we do not need to specify an output column name but we need to select input columns.

  • We can use a single formula to transform multiple columns. To do this, we use $col or $column as placeholders for column(s) to be transformed.

 

To review, here are the steps for using this action:

  • We start by selecting the “Transform columns with a formula” from the actions list.

Transform columns on the AI & Analytics Engine

  • Next, we specify the columns to be transformed, you can use $col or $column as placeholders:

Specify columns to transform on the AI & Analytics Engine
  • Finally, we enter the transformation formula and add the action to the action queue.

Adding the action of transforming column on the AI & Analytics Engine
Preview of results of column transformation on the AI & Analytics Engine

Here is another example of using formulas to transform columns of data:

Filtering rows

The AI & Analytics Engine’s “Filter rows” action can be used to filter rows of a dataset using a condition. This condition needs to be a formula with a boolean output.

To do this we simply select the Filter rows” action, enter the condition formula, add the action and the Engine takes care of the rest:

 

Wrap-up:

In this article, we introduced the AI & Analytics Engine’s three formula-based actions. Combining the custom formula editor and the computation power of the engine, these actions provide simple and user-friendly tools that can help us clean, transform and filter big datasets.

 

See how easy you can make your data preparation tasks with a 2-week free trial of the Engine! 

Free Trial

Similar posts