Factor Sorting

We wish to sort an ordinal categorical variable in a domain accurate manner not in alphanumeric order. An ordinal categorical variable is one where there is a natural order e.g. responses in a survey. In SQL there is no built-in categorical type, but we can treat any column as categorical by manually defining the sort order.

In this example, we have a table table_1 that has a column col_1 which is categorical variable holding t-shirt sizes XS, S, M, L, XL. We wish to sort the rows of table_1 in the natural order of the t-shirt sizes i.e. XS < S < M < L < XL.

SELECT *
FROM table_1
ORDER BY (CASE
              WHEN col_1 = 'XS' THEN 0
              WHEN col_1 = 'S' THEN 1
              WHEN col_1 = 'M' THEN 2
              WHEN col_1 = 'L' THEN 3
              WHEN col_1 = 'XL' THEN 4
              ELSE 5 END
             );

Here is how this works:

  • Since there is no categorical type in SQL, we need to manually map the column to desired order.
  • We used CASE WHEN statement to map values of the column col_1 to an integer that defines their sorting order. See General Operations for a coverage of conditional statements in SQL.
SQL
I/O