One of the most common data manipulation tasks is data aggregation where, in its simplest form, we
carry out operations that reduce multiple values of a column into a single value e.g. compute the
sum of values of a numeric column. Typically, we perform data aggregation on mutually exclusive
groups of rows then combine the results vertically into a new column or table.
In its simplest form, a typical data aggregation operation in SQL looks like so:
SELECT col_1, MIN(col_2) col_2_min, SUM(col_3) col_3_sum
FROM table_1
GROUP BY col_1;
where we use GROUP BY to create mutually exclusive groups of rows, and then we carry out data
aggregation operations on each group.
This section is organized as follows:
Basic Aggregation where we cover how to
perform basic data aggregation operations in two contexts: (1) Individual aggregation operations
typically as part of interactive data inspection and (2) Generating a Summary Table by carrying
out multiple data aggregation operations on a table.
Conditional Aggregation where we
cover how to carry out data aggregation operations that are not simply applied to all rows, rather
rows or groups are treated differently based on one or more conditions. For example, compute the
sum of values of a column where another column has a particular value.
Sequential Aggregation where we cover
how to aggregate data in a way that can only be produced by carrying out multiple aggregation
operations in sequence. For example, compute the monthly average of number of daily events.
Grouping where we cover aspects of
groupings and working with groups including grouping by a function of existing columns, order of
groups, and iterating over groups.
Multi-Value Aggregation where we
cover data aggregation operations that return multiple values (as opposed to the common scenario
of a data aggregation operation returning a single scalar value).
Pivot Aggregation where we cover
generating data summaries in the form of pivot tables. Oftentimes, a pivot table organization
makes it easier to understand and draw insights from a data summary.