Dynamic Filtering

In essence, filtering involves applying one or more logical conditions to the columns of a data frame and, if there is more than one condition, combining the results from each condition via some form of boolean logic.
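As a point of reference before going dynamic, the static case can be sketched as follows. This is a minimal illustration assuming pandas; the data frame `df` and its columns `col_1` and `col_2` are made up for the example.

```python
import pandas as pd

# Hypothetical example data (not from the original text).
df = pd.DataFrame({"col_1": [-1, 2, 3, 0], "col_2": [10, 5, 20, 3]})

# Each logical condition yields one boolean value per row.
cond_1 = df["col_1"] > 0
cond_2 = df["col_2"] > 8

# Combine the per-condition results with boolean logic:
# & keeps rows where both hold, | keeps rows where at least one holds.
df_and = df[cond_1 & cond_2]
df_or = df[cond_1 | cond_2]
```

Here the columns and conditions are hard-coded; the rest of this section is about supplying them at run time instead.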

In some situations we may need to specify the columns or the logical conditions dynamically, i.e., through environment variables or function arguments. Two common scenarios are building a reusable data manipulation function and, when structuring a script, separating parameter specifications (e.g., the conditions to use for filtering) from the data manipulation logic.

This section is organized as follows:

  • Column Specification, where we cover how to dynamically specify the columns to which the filtering logic is applied.
  • Function Specification, where we cover how to dynamically specify the filtering predicate function(s) (functions that return True or False) that are applied to the specified columns. We look at how to dynamically specify a named function, a lambda function, and multiple functions.
  • Condition Specification, where we cover how to dynamically specify the entire logical condition used to filter the rows of a data frame, e.g., filtering the rows of a data frame df by the condition ‘col_1 > 0’ given as a string.
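To make the three cases concrete, here is a rough preview of what each might look like. It is a sketch assuming pandas, not the exact solutions from the sections listed above; the data frame and the names `col`, `is_positive`, `pred`, and `condition` are hypothetical.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"col_1": [-1, 2, 3, 0], "col_2": [10, 5, 20, 3]})

# Column specification: the column name is held in a variable.
col = "col_1"
by_column = df[df[col] > 0]

# Function specification: the predicate is held in a variable,
# either as a named function or as a lambda.
def is_positive(s):
    return s > 0

pred = is_positive
by_named = df[pred(df[col])]

pred = lambda s: s > 0
by_lambda = df[pred(df[col])]

# Condition specification: the whole condition is a string,
# evaluated here with DataFrame.query.
condition = "col_1 > 0"
by_condition = df.query(condition)
```

All four results select the same rows; what differs is which part of the filter is supplied as a value rather than written into the code.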

If both columns and functions need to be specified dynamically, which is often the case in practice, we need to combine the solutions from the sections above.
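One way such a combination might look is a small reusable function that takes both the columns and the predicate as arguments. This is a sketch under assumptions: `filter_rows` is a hypothetical helper, and requiring the predicate to hold for every listed column is just one possible way to combine the per-column results.

```python
import pandas as pd

def filter_rows(df, cols, pred):
    """Keep rows where pred holds for every column in cols (hypothetical helper)."""
    # Apply the predicate to each selected column, then require
    # that all of the resulting boolean columns are True in a row.
    mask = df[cols].apply(pred).all(axis=1)
    return df[mask]

# Hypothetical example data.
df = pd.DataFrame({"col_1": [-1, 2, 3, 0], "col_2": [10, -5, 20, 3]})

# Parameters kept separate from the data manipulation logic.
cols = ["col_1", "col_2"]
pred = lambda s: s > 0

result = filter_rows(df, cols, pred)
```

Replacing `.all(axis=1)` with `.any(axis=1)` would instead keep rows where the predicate holds for at least one of the columns.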

In addition to the above, the following sections complete the story for dynamic filtering:

  • Dynamic Transformation covers performing data manipulation operations dynamically in more depth. The scenarios covered there also apply to filtering.
  • Dynamic Grouped Transformation covers performing grouped operations dynamically. The scenarios covered there also apply to dynamic filtering in a grouped context.