In an implicit filtering scenario, we wish to specify whether to AND or OR the logical values resulting from applying one or more logical expression(s) to each of a set of columns.
This section is complemented by
True
or False
) to apply to the selected set of columnsWe wish to filter rows for which a logical expression is True
for all of a selected set of columns.
In this example, we wish to filter the rows of the data frame df
for which the value of every column whose name contains the string ‘cvr’
is less than 0.1
.
df_2 = df.loc[(df
.loc[:, lambda x: x.columns.str.contains('cvr_', regex=False)]
.apply(lambda x: x < 0.1)
.all(axis=1))]
Here is how this works:
all()
returns True
if all elements along the designated axis (rows or columns) of the data frame it's called on is True
.all()
will evaluate along by setting the argument axis
where axis=0
evaluates along columns and axis=1
evaluates along rows.apply()
to apply the row filtering logic to each of the selected columns. The result is a data frame of logical True
or False
values where a value is True
if the corresponding column meets the filtering condition (which here is being < 0.1
) for the corresponding row.all()
to combine the logical values for each row such that the output is True
if the filtering condition evaluated to True
for all the selected columns.all()
returns True
.We wish to filter rows for which a logical expression is True
for any of a selected set of columns.
In this example, we wish to filter the rows of the data frame df
for which the value of any column whose name contains the string ‘cvr’
is less than 0.1
.
df_2 = df.loc[(df
.loc[:, lambda x: x.columns.str.contains('cvr_', regex=False)]
.apply(lambda x: x < 0.1)
.any(axis=1))]
Here is how this works:
any()
returns True
if any elements along the designated axis (rows or columns) of the data frame it's called on is True
.any()
will evaluate along by setting the argument axis
where axis=0
evaluates along columns and axis=1
evaluates along rows.apply()
to apply the row filtering logic to each of the selected columns. The result is a data frame of logical True
or False
values where a value is True
if the corresponding column meets the filtering condition (which here is being < 0.1
) for the corresponding row.any()
to combine the logical values for each row such that the output is True
if the filtering condition evaluated to True
for any of the selected columns.