We wish to sort the rows of a table by a transformation of the values of one or more columns instead of by the values themselves. A common example is we want to sort by the numeric part of a string column.

While we can create a new column by transforming existing columns, there are situations where we don’t wish to keep an intermediate value column that we only need for sorting.

We will cover the following custom sorting scenarios:

**Single Column**: We wish to sort the rows of a table by a transformation of one column. We will cover two common situations: (a) We wish to sort a string column in a case-insensitive manner and (b) We wish to sort by the numeric part of a string column.**Multiple Columns**: We wish to sort the rows of a table by a transformation of multiple columns. In particular, we wish to apply the same transformation to each of a set of columns individually and then sort by the transformed columns.**Multiple Transformations:**We wish to sort the rows of a table by transformations of multiple columns. In particular, we wish to apply a different transformation to each of a set of columns and then sort by the transformed columns.**Multivariate Transformation**: We wish to sort the rows of a table by a function of multiple columns. In particular, we wish to sort by one variable which is the output of a function of multiple columns. For example, we wish to sort by the ratio of two columns.**Factor Sorting**: A special case of custom sorting is the sorting of categorical variables e.g. t-shirt sizes S, M, L, XL.**Missing Values**: Rows where the sorting columns have missing values are sorted at the end by default. In some situations, we may want to sort them differently e.g. at the beginning.

We wish to sort the rows of a table by a transformation of one column. We will cover two common situations:

- Case Insensitive: We wish to sort a string column in a case-insensitive manner.
- Numeric String: We wish to sort by the numeric part of a string column.

**Case Insensitive**

We wish to sort by a string column in a **case insensitive** manner i.e. for the purpose of sorting, we wish to treat
upper and lower case letters as the same. With string columns, *sorting is case sensitive by default*, meaning upper
case text will appear first when sorting in ascending order and last when sorting in descending order.

In this example, we wish to sort the table `dummy_table`

in ascending order of a string column with the name `col_2`

in
a *case insensitive* manner.

```
SELECT *
FROM refcon.examples.dummy_table
ORDER BY LOWER(col_2);
```

Here is how this works:

- To sort by the transformed values of a column, we simply apply the transformation we want after
`ORDER BY`

. - We use
`LOWER()`

to convert all characters to lowercase. For detailed coverage of working with strings in SQL see String Operations. - An all lower case version of
`col_2`

is created, used for sorting, then discarded. The original values of`col_2`

stay as is.

**Numeric String**

We wish to sort by the numerical component of a string column. In addition, we wish to order the numeric characters in a
numeric order, not a character order. For example, the ascending order should be `‘9’`

then `‘10’`

and not `‘10’`

then `‘9’`

.

In this example, we wish to sort the table `dummy_table`

in ascending order of the numerical substring of a string
column with the name `col_9`

in numeric order (not character order). The values of `col_9`

are of the
format `order_<val>`

where `val`

can be any integer e.g. `‘order_25'`

.

```
SELECT *
FROM refcon.examples.dummy_table
ORDER BY SAFE_CAST(REPLACE(col_9, 'order_', '') AS INT64);
```

Here is how this works:

- To sort by the numeric part of a string, we extract the numeric part then convert its data type (type cast it) to an integer type then use that resulting numeric variable to sort the rows of the table.
- As noted above, column
`col_9`

is comprised of values of the format`order_<val>`

where`val`

can be any integer e.g.`‘order_25'`

. We use`REPLACE()`

to remove the character prefix`‘order_’`

from every value in the column`col_9`

. This leaves the numeric part e.g.`‘order_25’`

becomes`‘25’`

. For detailed coverage of working with strings in SQL see String Operations. - We then use
`SAFE_CAST()`

to convert the data type of the remaining numeric string to an integer data type. - The output of the above two operations is an integer variable that is then used by
`ORDER BY`

to sort the rows of the table. - A transformed version of
`col_9`

is created, used for sorting, then discarded. The original values of`col_9`

stay as is.

We wish to sort the rows of a table by a transformation of multiple columns. In particular, we wish to apply the same transformation to each of a set of columns individually then sort by the transformed columns.

In this example, we wish to sort the rows of a table by the lowercase transformation of columns `col_2`

and `col_9`

.

```
SELECT *
FROM refcon.examples.dummy_table
ORDER BY LOWER(col_2), LOWER(col_9);
```

Here is how this works:

- This works similarly to the
*Case Insensitive*sorting example described above. - Each of the columns
`col_2`

and`col_9`

is transformed individually and the table is arranged in ascending order of the lowercase transformation of`col_2`

then rows that have the same lowercase value for`col_9`

are sorted in ascending order of the lower case transformation of`col_9`

.

We wish to sort the rows of a table by transformations of multiple columns. In particular, we wish to apply a different transformation to each of a set of columns then sort by the transformed columns.

In this example, we wish to sort the rows of a table in ascending order of the lowercase transformation of the
column `col_2`

and the suffix after the underscore in `col_9`

.

```
SELECT *
FROM refcon.examples.dummy_table
ORDER BY LOWER(col_2), REGEXP_REPLACE(col_9, '.*_', '');
```

Here is how this works:

- This works similarly to the
*Case Insensitive*sorting example described above. - Each sorting variable after
`ORDER BY`

can be a distinct transformation of any one or more columns. - In this example, the first sorting variable is a lowercase transformation of
`col_2`

and the second sorting variable is a substring of`col_9`

(whatever is after the underscore`_`

).

We wish to sort the rows of a table by a function of multiple columns. In particular, we wish to sort by one variable which is the output of a function of multiple columns. For example, we wish to sort by the ratio of two columns.

In this example, we wish to sort the rows of a table by the ratio between two columns `col_6`

and `col_3`

.

```
SELECT *
FROM refcon.examples.dummy_table
ORDER BY col_6 * 1.0 / col_3;
```

Here is how this works:

- We can specify the transformation logic involving multiple columns after
`ORDER BY`

. - The values of
`col_6`

are divided by the corresponding values of`col_3`

. The resulting ratio is then used to sort the rows of the table in ascending order.

We wish to sort an ordinal categorical variable in a domain accurate manner, not in alphanumeric order. An ordinal categorical variable is one where there is a natural order e.g. responses in a survey. See Factor Operations for a coverage of working with factor data.

In this example, we have a table that has a column `col_10`

which is categorical variable holding t-shirt sizes XS, S,
M, L, XL. We wish to sort the rows of the table in the natural order of the t-shirt sizes i.e. XS < S < M < L < XL.

**Value Mapping**

```
SELECT *
FROM refcon.examples.dummy_table
ORDER BY (CASE
WHEN col_10 = 'XS' THEN 0
WHEN col_10 = 'S' THEN 1
WHEN col_10 = 'M' THEN 2
WHEN col_10 = 'L' THEN 3
WHEN col_10 = 'XL' THEN 4
WHEN col_10 = 'XXL' THEN 5
ELSE 6 END
);
```

Here is how this works:

- We used
`CASE WHEN`

to map values of the column`col_10`

to an integer that defines their sorting order. See General Operations for a coverage of conditional statements in SQL. `ORDER BY`

sorts the table based on the output of`CASE WHEN`

.

Missing values in SQL come at the top when sorting in `ASC`

order and at the bottom when sorting `DESC`

order. In this
section we cover how to change this behavior and show rows with missing values at the opposite side of the default
position.

**Ascending**

We want to show missing values at the bottom (instead of at the top) when sorting the table in `ASC`

based on `col_5`

.

```
SELECT *
FROM refcon.examples.dummy_table
ORDER BY col_5 IS NULL, col_5;
```

Here is how this works:

- We sort by
`col_5 IS NULL`

first. It evaluates to`TRUE`

when the value of`col_5`

is`NULL`

and`FALSE`

otherwise. Since`FALSE`

sorts higher than`TRUE`

, all`NULL`

values will be sorted at the end. - The second column to sort by is
`col_5`

. Therefore, rows where`col_5 IS NULL`

is`FALSE`

i.e. not`NULL`

will then be sorted by the ascending value of`col_5`

which is the desired outcome.

**Descending**

We want to show missing values at the top (instead of at the bottom) when sorting the table in `DESC`

based on `col_5`

.

```
SELECT *
FROM refcon.examples.dummy_table
ORDER BY col_5 IS NOT NULL, col_5 DESC;
```

Here is how this works:

- We sort by
`col_5 IS NOT NULL`

first. It evaluates to`TRUE`

when the value of`col_5`

is not`NULL`

and`FALSE`

otherwise. Since`FALSE`

sorts higher than`TRUE`

, all`NULL`

values will be sorted first. - The second column to sort by is
`col_5`

. Therefore, rows where`col_5 IS NOT NULL`

is`TRUE`

i.e. not`NULL`

will then be sorted by the descending value of`col_5`

which is the desired outcome.

SQL