Multiple Facets

As we have seen in this section so far, a typical cross table involves two categorical variables; one represented by the rows of the cross table and the other by the columns. Occasionally, we need to work with more than two categorical variables and wish to group by more than one categorical variable across the rows or the columns of the cross table or both . We will cover how to do that in this section.

Rows

We have three grouping columns. We wish to represent two over the rows and the third over the columns of the cross table.

df %>% 
 group_by(col_1, col_2, col_3) %>% 
 count() %>% 
 pivot_wider(names_from = col_3, values_from = n)

Here is how this works:

  • We use group_by() and count() to generate a summary table containing the number of rows in each group where the groups are defined by the values of col_1, col_2, and col_3.
  • We use pivot_wider() to pivot (move) the third column to the columns of the cross table. We pass the argument names_from = col_3 so that the names of the new columns would be the values of col_3.
  • We set the argument values_from = n so that the cells of the resulting cross table would have the values that were previously in the column n created by count().

Columns

We have three grouping columns. We wish to represent two over the columns and the third over the rows of the cross table.

df %>% 
 group_by(col_1, col_2, col_3) %>% 
 count() %>% 
 unite(key, col_2, col_3) %>%
 pivot_wider(names_from = key, values_from = n)

Here is how this works:

  • We use group_by() and count() as described above.
  • We use unite() to merge together col_2 and col_3 in a unique identifier under a new column called key. While we can have multiple grouping columns represented as rows, we can only have one represented as columns (R has no equivalent of the multi-index in Pandas).
  • We use pivot_wider() to pivot (move) the third column (the new compound column named key) to the columns of the cross table. We pass the argument names_from = col_3 so that the names of the new columns would be the values of col_3.
  • We set the argument values_from = n so that the cells of the resulting cross table would have the values that were previously in the column n created by count().

Rows and Columns

We have four grouping columns. We wish to represent two over the rows and the other two over the columns of the cross table.

df %>% 
 group_by(col_1, col_2, col_3, col_4) %>% 
 count() %>% 
 unite(key, col_3, col_4) %>%
 pivot_wider(names_from = key, values_from = n)

Here is how this works:

This works exactly as the “Columns” scenario described above except that when we pivot, we leave behind two columns col_1 and col_2 to be represented by the rows of the cross table.

R
I/O