We have a grouped data frame, and we wish to apply the row filtering logic to each group separately.
In this example, we have a data frame df that is grouped by the column col_1, and we wish to keep only the rows where the value of the column col_2 is greater than the mean of col_2 for the group.
library(dplyr)

df_2 = df %>%
  group_by(col_1) %>%
  filter(col_2 > mean(col_2))
Here is how this works:
When filter() is preceded by group_by(), the logical expression inside filter() is applied to each group individually.
The expression col_2 > mean(col_2) therefore compares each value of col_2 to the mean value of col_2 for its own group, not to the mean of the whole data frame.
Rows where the value of col_2 is greater than the mean value of col_2 for the group are retained (included in the output).
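To make the per-group behavior concrete, here is a minimal sketch on a small made-up data frame. The values and the name df_3 are assumptions chosen for illustration only; the ungrouped filter is included to show how the result differs when the mean is taken over all rows.

```r
library(dplyr)

# Hypothetical toy data: two groups in col_1, numeric values in col_2.
df = data.frame(
  col_1 = c("a", "a", "a", "b", "b", "b"),
  col_2 = c(1, 2, 6, 10, 20, 60)
)

# Grouped filter: each value is compared to the mean of its own group.
# Group means are 3 for "a" and 30 for "b", so only 6 and 60 are kept.
df_2 = df %>%
  group_by(col_1) %>%
  filter(col_2 > mean(col_2))
print(df_2)

# Ungrouped filter for comparison: the mean is taken over all rows (16.5),
# so 20 and 60 are kept and every row of group "a" is dropped.
df_3 = df %>%
  filter(col_2 > mean(col_2))
print(df_3)
```

Note that df_2 is still grouped by col_1 after filter(); adding ungroup() at the end of the chain removes the grouping if later steps should operate on the whole data frame.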