Implicit Aggregation

Sometimes the aggregation we wish to perform applies the same operation to multiple columns. Implicit aggregation is a data manipulation pattern that lets us apply one or more data aggregation expressions to a selected set of columns without spelling out each operation explicitly.

A typical implicit data aggregation expression looks like this:

df_2 = df %>%
    group_by(col_1) %>%
    summarise(across(c(col_2, col_3, col_4), sum))

where the construct summarise(across(cols, funs)) applies one or more data aggregation operations funs to each of the columns cols without repeating code.
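To make the pattern concrete, here is a minimal sketch run on a small made-up data frame. The data values are assumptions chosen for illustration; only the column names follow the expression above. It also builds the explicit equivalent, so the two results can be compared side by side.

```r
library(dplyr)

# Hypothetical toy data; values are invented for illustration only.
df <- tibble(
  col_1 = c("a", "a", "b"),
  col_2 = c(1, 2, 3),
  col_3 = c(10, 20, 30),
  col_4 = c(100, 200, 300)
)

# Implicit aggregation: one across() call covers all three columns.
df_2 <- df %>%
  group_by(col_1) %>%
  summarise(across(c(col_2, col_3, col_4), sum))

# Explicit equivalent: each aggregation is spelled out by hand.
df_explicit <- df %>%
  group_by(col_1) %>%
  summarise(
    col_2 = sum(col_2),
    col_3 = sum(col_3),
    col_4 = sum(col_4)
  )
```

Both df_2 and df_explicit hold one row per value of col_1, with each of the three columns summed within its group. The implicit form stays the same length no matter how many columns are selected.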

This section covers three aspects of implicit data aggregation:

  1. Column Selection: how to select the column(s) to which the aggregation operations are applied.
  2. Function Specification: how to specify the data aggregation expressions to apply to each of the selected columns.
  3. Output Naming: how to specify the name(s) of the output column(s) created by the implicit data aggregation operations.
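As a rough preview of how the three aspects fit together in a single call, the sketch below uses where(is.numeric) for column selection, a named list of functions for function specification, and the .names argument of across() for output naming. The data frame and the labels total and avg are assumptions made up for this illustration.

```r
library(dplyr)

# Hypothetical toy data; values are invented for illustration only.
df <- tibble(
  col_1 = c("a", "a", "b"),
  col_2 = c(1, 2, 3),
  col_3 = c(10, 20, 30)
)

df_3 <- df %>%
  group_by(col_1) %>%
  summarise(across(
    where(is.numeric),                # 1. column selection: all numeric columns
    list(total = sum, avg = mean),    # 2. function specification: a named list
    .names = "{.col}_{.fn}"           # 3. output naming: <column>_<function label>
  ))
```

With this naming template, each selected column yields one output column per function, for example col_2_total and col_2_avg.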