Filter Groups By Column Value

We have a grouped data frame, and we wish to apply the row filtering logic to each group separately.

In this example, we group the data frame df by the column col_1 and keep only the rows where the value of col_2 is greater than the mean of col_2 for that row's group.

df_2 = df %>%
    group_by(col_1) %>%
    filter(col_2 > mean(col_2))

Here is how this works:

  • When filter() is preceded by group_by(), the logical expression inside filter() is applied to each group individually.
  • In the code above, mean(col_2) is computed once per group of col_1, and each row's value of col_2 is compared against the mean of its own group rather than the mean of the whole data frame.
  • Rows for which the value of col_2 is greater than the mean value of col_2 for the group are retained (included in the output).
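
The behavior described above can be checked on a small example. The following is a minimal sketch, assuming dplyr is installed; the toy values in df are hypothetical and chosen only so that the group means are easy to verify by hand.

```r
library(dplyr)

# Hypothetical toy data: two groups in col_1 with very different means in col_2.
df = tibble(
  col_1 = c("a", "a", "a", "b", "b", "b"),
  col_2 = c(1, 2, 6, 10, 20, 60)
)

# Grouped filter: each row is compared with the mean of its own group.
# Group "a" has mean 3 and group "b" has mean 30.
df_2 = df %>%
    group_by(col_1) %>%
    filter(col_2 > mean(col_2))

# Ungrouped filter, for contrast: every row is compared with the
# overall mean of col_2, which is 16.5.
df_3 = df %>%
    filter(col_2 > mean(col_2))

print(df_2)  # keeps the rows with col_2 equal to 6 and 60
print(df_3)  # keeps the rows with col_2 equal to 20 and 60
```

Note that df_2 is still grouped by col_1 after filter(); if later steps should operate on the whole data frame, ungroup() can be appended to the pipeline.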