Multiple Facets

As we have seen in this section so far, a typical cross table involves two categorical variables: one represented by the rows of the cross table and the other by the columns. Occasionally, we need to work with more than two categorical variables and wish to group by more than one of them across the rows, the columns, or both. We will cover how to do that in this section.

Rows

We have three grouping columns. We wish to represent two over the rows and the third over the columns of the cross table.

pd.crosstab([df['col_1'], df['col_2']], 
            df['col_3'])

Here is how this works:

  • If we pass a list of columns as the first argument to crosstab(), the rows of the resulting cross table are grouped by all of those columns, with one row index level per column in the list.
  • We can apply this behavior to any of the cross tabulation scenarios described in Multivariate Summary.
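
To make the row grouping concrete, here is a minimal sketch on a small made-up data frame. The values in df are hypothetical and chosen only for illustration; the column names reuse the placeholders from the snippet above.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    'col_1': ['a', 'a', 'b', 'b', 'a'],
    'col_2': ['x', 'y', 'x', 'x', 'x'],
    'col_3': ['p', 'q', 'p', 'q', 'p'],
})

# Two grouping columns over the rows, one over the columns.
table = pd.crosstab([df['col_1'], df['col_2']], df['col_3'])

# The rows carry a two-level index, one level per grouping column.
print(table.index.names)
print(table)

# A single cell is addressed with a tuple of row labels and a column label.
print(table.loc[('a', 'x'), 'p'])
```

Each row of the result is labeled by a pair of values from col_1 and col_2, so a cell is looked up with a tuple for the row and a single label for the column.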

Columns

We have three grouping columns. We wish to represent two over the columns and the third over the rows of the cross table.

pd.crosstab(df['col_1'],
            [df['col_2'], df['col_3']])

Here is how this works:

As in the “Rows” scenario above, we can facet by multiple variables along the columns of the cross table by passing a list of columns as the second argument of crosstab().
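
As a rough illustration of the column case, the sketch below uses a small made-up data frame (the values are hypothetical; only the column names come from the snippet above) and checks that the resulting columns have two levels.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    'col_1': ['a', 'a', 'b', 'b', 'a', 'b'],
    'col_2': ['x', 'y', 'x', 'y', 'x', 'y'],
    'col_3': ['p', 'q', 'q', 'p', 'p', 'q'],
})

# One grouping column over the rows, two over the columns.
table = pd.crosstab(df['col_1'], [df['col_2'], df['col_3']])

# The columns carry a two-level index, one level per grouping column.
print(table.columns.names)
print(table)

# A single cell is addressed with a row label and a tuple of column labels.
print(table.loc['a', ('x', 'p')])
```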

Rows and Columns

We have four grouping columns. We wish to represent two over the rows and the other two over the columns of the cross table.

pd.crosstab([df['col_1'], df['col_2']], 
            [df['col_3'], df['col_4']])

Here is how this works:

We can facet by multiple variables along both the rows and the columns of the cross table by passing two lists of columns as the first two arguments of crosstab(). The list passed as the first argument specifies the columns to use for the rows, and the list passed as the second argument specifies the columns to use for the columns of the cross table.
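
The following sketch shows both lists in use at once. The data frame is made up for illustration (its values are hypothetical); the column names match the placeholders in the snippet above.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    'col_1': ['a', 'a', 'b', 'b', 'a'],
    'col_2': ['x', 'y', 'x', 'y', 'x'],
    'col_3': ['p', 'p', 'q', 'q', 'p'],
    'col_4': ['m', 'n', 'm', 'n', 'm'],
})

# Two grouping columns over the rows and two over the columns.
table = pd.crosstab([df['col_1'], df['col_2']],
                    [df['col_3'], df['col_4']])

# Both axes carry a two-level index.
print(table.index.names)
print(table.columns.names)
print(table)

# A single cell is addressed with one tuple per axis.
print(table.loc[('a', 'x'), ('p', 'm')])
```

Since every cell is a count of rows in df, the cells of the table sum to the number of rows in the data frame.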
