Filtering by Multiple Conditions

We wish to filter rows that satisfy a logical combination of multiple conditions.

We will cover the two most common scenarios:

  • Taking the AND (conjunction) of two (or more) logical expressions.
  • Taking the OR (disjunction) of two (or more) logical expressions.

AND

We wish to filter rows that meet all of two (or more) conditions.

In this example, we wish to filter rows of the data frame df where the numeric column col_1 is greater than 5 and where the string column col_2 has a value that contains the substring ‘token’.

df_2 = df %>%
    filter(col_1 > 5, str_detect(col_2, 'token'))

Here is how this works:

  • filter() combines comma-separated conditions with AND, so we can pass the conditions as separate arguments instead of joining them with &. This keeps the logical expression easy to read.
  • filter() retains only the rows for which every condition evaluates to TRUE. A row for which any condition evaluates to FALSE or NA is dropped.
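The equivalence between comma-separated conditions and & can be checked on a small example. This is a minimal sketch, assuming the dplyr and stringr packages are installed; the toy data frame and the names df_comma and df_amp are invented for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data, invented for illustration only.
df = tibble(
    col_1 = c(3, 7, 9, 12),
    col_2 = c('token_a', 'token_b', 'other', NA)
)

# Comma-separated conditions are combined with AND.
df_comma = df %>%
    filter(col_1 > 5, str_detect(col_2, 'token'))

# The same filter written with an explicit & operator.
df_amp = df %>%
    filter(col_1 > 5 & str_detect(col_2, 'token'))

# Both calls keep only the row with col_1 = 7:
#   row 1 fails col_1 > 5, row 3 does not contain 'token',
#   and row 4 is dropped because str_detect() returns NA for a missing string.
print(df_comma)
print(df_amp)
```

The last row shows why missing values deserve attention: the numeric condition holds, but the NA from str_detect() makes the combined condition NA, so the row is excluded.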

OR

We wish to filter rows that meet at least one of two (or more) conditions.

In this example, we wish to filter rows of the data frame df where the numeric column col_1 is greater than 5 or where the string column col_2 has a value that contains the substring ‘token’.

df_2 = df %>%
    filter(col_1 > 5 | str_detect(col_2, 'token'))

Here is how this works:

  • To filter rows that satisfy at least one of two or more conditions, we combine the conditions with the OR operator |.
  • Any row satisfying at least one of the conditions is retained (included in the output).
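The behaviour of | can be traced row by row on a small example. This is a minimal sketch, assuming the dplyr and stringr packages are installed; the toy data frame is invented for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data, invented for illustration only.
df = tibble(
    col_1 = c(3, 7, 2, 12, 1),
    col_2 = c('token_a', 'other', 'other', NA, NA)
)

df_2 = df %>%
    filter(col_1 > 5 | str_detect(col_2, 'token'))

# Row by row:
#   3,  'token_a' -> FALSE | TRUE  = TRUE   (kept)
#   7,  'other'   -> TRUE  | FALSE = TRUE   (kept)
#   2,  'other'   -> FALSE | FALSE = FALSE  (dropped)
#   12, NA        -> TRUE  | NA    = TRUE   (kept)
#   1,  NA        -> FALSE | NA    = NA     (dropped)
print(df_2)
```

Note that a missing value in col_2 does not exclude a row when the other condition is TRUE, because TRUE | NA evaluates to TRUE in R.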