In some situations, the filtering logic we wish to carry out cannot be expressed as a vectorized, column-wise operation; instead, it needs to be applied to each row individually in a non-vectorized manner.
In this example, we wish to filter rows where the mean of the values of the columns col_1 and col_2 is greater than 0.
df_2 = df %>%
  rowwise() %>%
  filter(mean(c(col_1, col_2)) > 0)
Here is how this works:
rowwise() switches the mode of execution of the operations that follow from column-wise to row-wise, which allows us to apply a non-vectorized function one row at a time.
After rowwise(), the expression inside filter() is applied one row at a time (instead of the usual execution on entire columns).
In mean(c(col_1, col_2)) > 0, the mean of the values of col_1 and col_2 for the current row is computed and then compared with 0. If the result is TRUE, the row is retained; otherwise it is not included in the output.
Any operations that follow filter() will also be carried out in a non-vectorized manner. To switch back to regular vectorized operation, add ungroup() to the chain.
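
To make the difference concrete, here is a minimal sketch that runs the same filter with and without rowwise(). The data frame and its values are hypothetical, made up only for illustration, and df_3 is a name introduced here for the comparison; it assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data; the values are chosen only for illustration.
df <- tibble(
  col_1 = c(1, -2, 3, -4),
  col_2 = c(2, 1, 5, -1)
)

# Row-wise filter: mean(c(col_1, col_2)) is evaluated once per row.
# Row means are 1.5, -0.5, 4 and -2.5, so rows 1 and 3 are kept.
df_2 <- df %>%
  rowwise() %>%
  filter(mean(c(col_1, col_2)) > 0) %>%
  ungroup()  # switch back to regular vectorized operation

# Without rowwise(), mean() receives both entire columns and returns a
# single number (0.625 here), so the condition is identical for every row
# and all four rows are kept.
df_3 <- df %>%
  filter(mean(c(col_1, col_2)) > 0)
```

Because of the trailing ungroup(), df_2 is an ordinary tibble again, so any verbs added after it operate on whole columns as usual.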