Selecting Multiple Columns in Style with Advanced Column Selectors


A large part of data wrangling is about manipulating the columns that make up a dataset. Now, there's an easier way to apply actions to multiple columns at once, with the Engine!

When you use our user interface to apply an action to a column, you normally select the column from a dropdown. This is straightforward because the dropdown is already populated with the columns in the dataset. But what if you want to apply the action to multiple columns at once? Selecting the columns one by one would be tiresome! Fortunately, the AI & Analytics Engine's advanced column selector lets you pick them all in one go.

Data

The data is a synthetic dataset. Some of its columns have names like drop_X; these are the columns we will drop. Others have names like X89.

Dataset view on the AI & Analytics Engine


Here’s an example of the data schema:

Data Schema example on the AI & Analytics Engine

Consider the scenario where you wish to drop all columns whose names start with “drop”, i.e. drop_1, drop_2, and so on. Naturally, you would choose the Drop action.

Drop action on the AI & Analytics Engine

As discussed, it’s not fun selecting all the columns one by one using the Basic column selector.

Basic column selector on the AI & Analytics Engine

We can actually drop all columns that start with “drop” using the Advanced column selector!

Adding drop action on the AI & Analytics Engine

Once you select the Advanced option, this modal box is shown.

Advanced column selector on the AI & Analytics Engine

To proceed, click Add Criterion, select By Pattern, and enter the pattern “drop” into the input box. The input is a regular expression (regex), so if you are a regex enthusiast you may prefer a more precise pattern such as “^drop”, which matches only column names that start with “drop”. Click DONE to proceed.

Selecting multiple columns by pattern on the AI & Analytics Engine

Now you should see the description “column matching the pattern ‘^drop’”.

Drop columns matching a certain pattern on the AI & Analytics Engine

Add the action, and you should see that all columns that start with “drop” are now dropped!
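
If you think in code, here is a rough sketch of what pattern-based selection amounts to. It uses Python's standard re module purely as an illustration; it is not the Engine's implementation. The helper select_by_pattern and the extra column name "backdrop" are made up for this example, and it is an assumption that an unanchored pattern is matched anywhere in the column name.

```python
import re

# Hypothetical column names modelled on the synthetic dataset described above.
# "backdrop" is invented here to show the difference the ^ anchor makes.
columns = ["drop_1", "drop_2", "X89", "X12", "backdrop"]


def select_by_pattern(names, pattern):
    """Return the names in which the regex pattern is found (search semantics)."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Unanchored: matches "drop" anywhere in the name.
contains_drop = select_by_pattern(columns, "drop")

# Anchored with ^: matches only names that start with "drop".
starts_with_drop = select_by_pattern(columns, "^drop")

# Dropping the selected columns keeps everything else.
remaining = [name for name in columns if name not in starts_with_drop]

print(contains_drop)
print(starts_with_drop)
print(remaining)
```

Under these assumptions, the anchored pattern leaves "backdrop" alone while the unanchored one would select it, which is why “^drop” is the safer choice when you only mean names that begin with “drop”.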

Now, what other ways are there to select columns? You can also select columns by their column types.

Selecting columns by column type on the AI & Analytics Engine

For example, you can select all Text columns like this:

Selecting all text columns on the AI & Analytics Engine

The Advanced option also allows negative selection. To select everything except what’s specified, simply change the INCLUDE dropdown to EXCLUDE.

Drop down menu of column selector option

For example, after changing the INCLUDE to EXCLUDE, the selector selects all but the Text columns.

Exclude columns option on the AI & Analytics Engine
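
Selecting by type, with the INCLUDE/EXCLUDE switch, can be pictured with a small sketch like the one below. The schema dictionary, the type labels, the column name "notes" and the function select_by_type are all hypothetical stand-ins chosen for illustration, not the Engine's API.

```python
# Hypothetical schema: column name -> column type (illustrative values only).
schema = {"drop_1": "Text", "X89": "Numeric", "X12": "Numeric", "notes": "Text"}


def select_by_type(schema, column_type, mode="INCLUDE"):
    """Select columns of the given type; EXCLUDE returns every other column."""
    if mode not in ("INCLUDE", "EXCLUDE"):
        raise ValueError("mode must be 'INCLUDE' or 'EXCLUDE'")
    matched = [name for name, ctype in schema.items() if ctype == column_type]
    if mode == "INCLUDE":
        return matched
    # EXCLUDE: the complement of the matched set, in schema order.
    return [name for name in schema if name not in matched]


text_columns = select_by_type(schema, "Text")
non_text_columns = select_by_type(schema, "Text", mode="EXCLUDE")

print(text_columns)
print(non_text_columns)
```

In this sketch, EXCLUDE is simply the complement of INCLUDE: every column ends up in exactly one of the two results.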

You can combine multiple criteria too!

Suppose that, for an action, you want to select all numeric columns whose names start with “X”, except those whose names end with “Y”. You can achieve this by clicking Add Criterion repeatedly to add multiple criteria, as in the example below:

Combining multiple column criteria on the AI & Analytics Engine
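
As a mental model, the combined selection above can be sketched as a list of criteria that a column must satisfy all at once. This is an assumption about how the criteria combine (a logical AND, with EXCLUDE criteria negated), inferred from the scenario described; the schema, the criteria list and the select function are hypothetical names for illustration only.

```python
import re

# Hypothetical schema: column name -> column type (illustrative values only).
schema = {
    "X1": "Numeric",
    "X2Y": "Numeric",
    "X3": "Text",
    "drop_1": "Numeric",
    "X89": "Numeric",
}

# Each criterion is (mode, predicate). The predicate receives name and type.
criteria = [
    ("INCLUDE", lambda name, ctype: ctype == "Numeric"),
    ("INCLUDE", lambda name, ctype: re.search(r"^X", name) is not None),
    ("EXCLUDE", lambda name, ctype: re.search(r"Y$", name) is not None),
]


def select(schema, criteria):
    """Keep a column only if it passes every INCLUDE and fails every EXCLUDE."""
    selected = []
    for name, ctype in schema.items():
        keep = all(
            pred(name, ctype) if mode == "INCLUDE" else not pred(name, ctype)
            for mode, pred in criteria
        )
        if keep:
            selected.append(name)
    return selected


selected = select(schema, criteria)
print(selected)
```

Here "X2Y" is rejected by the EXCLUDE criterion, "X3" because it is not numeric, and "drop_1" because it does not start with “X”.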

Wrap-Up

So that’s it. Using the Advanced column selector, you can select many columns at once by matching a pattern against the column names, or by selecting all columns of a certain type. Any selection criterion can be negated to achieve inverse selection as well. Most powerful of all, these criteria can be combined to perform fine-grained column selection!

Ready to get started with Machine Learning? Reach out to us, and we'll be happy to help you find out how the AI & Analytics Engine fits into your business.
