In an implicit filtering scenario, we wish to specify whether to AND or OR the logical values that result from applying one or more logical expressions to each of a set of columns.
This section is complemented by
TRUE or FALSE) to apply to the selected set of columns.

We wish to filter rows for which a logical expression is TRUE for all of a selected set of columns.
In this example, we wish to filter the rows of the data frame df for which the value of every column whose name contains the string ‘cvr’ is less than 0.1.
library(dplyr)
# Keep rows where every column whose name contains 'cvr' is below 0.1
df_2 = df %>%
  filter(if_all(contains('cvr'), ~ .x < 0.1))
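To make the filter above concrete, here is a minimal sketch on a small made-up data frame. The column names cvr_1 and cvr_2, the id column, and all values are assumptions chosen purely for illustration; they are not part of the original example.

```r
library(dplyr)

# Hypothetical data: an id column plus two columns whose names contain 'cvr'.
df = tibble(
  id    = 1:4,
  cvr_1 = c(0.05, 0.20, 0.08, 0.30),
  cvr_2 = c(0.02, 0.05, 0.15, 0.40)
)

# Keep rows where BOTH cvr_1 and cvr_2 are below 0.1.
df_2 = df %>%
  filter(if_all(contains('cvr'), ~ .x < 0.1))

print(df_2)
```

With this data, only the row with id 1 is retained, since it is the only row in which both cvr_1 and cvr_2 are below 0.1.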
Here is how this works:
- We use if_all() to specify that we wish to retain rows for which the given logical expression is TRUE for each of the selected columns.
- if_all() accepts two inputs as follows:
  - The first argument to if_all() is a selection of columns. In this example, we use contains('cvr') to select any column whose name contains the substring ‘cvr’.
  - The second argument to if_all() is the logical expression that we wish to apply to every column selected in the first argument. In this case, the logical expression we wish to apply is the anonymous function ~ .x < 0.1.
- ~ .x < 0.1 (the second argument to if_all()) is applied to each column selected via contains('cvr') (the first argument to if_all()), and the results are combined via an AND operation, i.e., a row is retained (included in the output) if its value is < 0.1 for all of the selected columns.

We wish to filter rows for which a logical expression is TRUE for any of a selected set of columns.
In this example, we wish to filter the rows of the data frame df for which the value of any column whose name contains the string ‘cvr’ is less than 0.1.
# Keep rows where at least one column whose name contains 'cvr' is below 0.1
df_2 = df %>%
  filter(if_any(contains('cvr'), ~ .x < 0.1))
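As a rough illustration of the filter above, the sketch below applies it to a small made-up data frame. The columns id, cvr_1, and cvr_2 and their values are hypothetical and serve only to show which rows survive.

```r
library(dplyr)

# Hypothetical data: an id column plus two columns whose names contain 'cvr'.
df = tibble(
  id    = 1:4,
  cvr_1 = c(0.05, 0.20, 0.08, 0.30),
  cvr_2 = c(0.02, 0.05, 0.15, 0.40)
)

# Keep rows where AT LEAST ONE of cvr_1, cvr_2 is below 0.1.
df_2 = df %>%
  filter(if_any(contains('cvr'), ~ .x < 0.1))

print(df_2)
```

With this data, the rows with id 1, 2, and 3 are retained. The row with id 4 is dropped because neither cvr_1 nor cvr_2 is below 0.1.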
Here is how this works:
- We use if_any() to specify that we wish to retain rows for which the given logical expression is TRUE for any of the selected columns.
- if_any() takes inputs in exactly the same way as if_all(). See above.
- ~ .x < 0.1 (the second argument to if_any()) is applied to each column selected via contains('cvr') (the first argument to if_any()), and the results are combined via an OR operation, i.e., a row is retained (included in the output) if its value is < 0.1 for any of the selected columns.
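One way to check the AND and OR behaviour described here is to write the combined condition out by hand and compare it with the helper. The sketch below does this on a hypothetical data frame with exactly two matching columns, cvr_1 and cvr_2; the data and all variable names are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical data: an id column plus two columns whose names contain 'cvr'.
df = tibble(
  id    = 1:4,
  cvr_1 = c(0.05, 0.20, 0.08, 0.30),
  cvr_2 = c(0.02, 0.05, 0.15, 0.40)
)

# if_all() versus the same condition combined by hand with & (AND).
all_helper  = df %>% filter(if_all(contains('cvr'), ~ .x < 0.1))
all_by_hand = df %>% filter(cvr_1 < 0.1 & cvr_2 < 0.1)

# if_any() versus the same condition combined by hand with | (OR).
any_helper  = df %>% filter(if_any(contains('cvr'), ~ .x < 0.1))
any_by_hand = df %>% filter(cvr_1 < 0.1 | cvr_2 < 0.1)

# Each pair should select the same rows.
print(identical(all_helper$id, all_by_hand$id))
print(identical(any_helper$id, any_by_hand$id))
```

The hand-written versions are manageable with two columns, but they must be edited whenever a matching column is added or removed, whereas the if_all() and if_any() versions pick up whatever contains('cvr') selects.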