Sorting by Multiple Columns

We wish to sort the rows of a data frame by the values of multiple columns.

When sorting by multiple columns, sorting is carried out in sequence where rows that have the same value for a former column are sorted by the value of a later column. For instance, say we are sorting by two columns col_1 and col_2, rows that have the same value for col_1 will be sorted by the value of col_2.

Ascending

We wish to sort the rows of a data frame in increasing order of the values of two or more columns.

In this example, we wish to sort the rows of a data frame df in ascending order of the values of a column col_1 and to sort rows that have the same value for col_1 in ascending order of the values of col_2.

df = df %>% arrange(col_1, col_2)

Here is how this works:

  • We pass the data frame df to the function arrange().
  • We pass to arrange() the names of the columns we wish to sort by (separated by commas) which here are col_1 and col_2.
  • arrange() sorts in ascending (increasing) order by default.
  • The order in which we specify the sort columns matters. In this example, the rows of the data frame are first sorted in ascending order of the values of col_1 and then rows that have the same value for col_1 are sorted in ascending order of the values of col_2.

Descending

We wish to sort the rows of a data frame in descending order of the values of two or more columns.

In this example, we wish to sort the rows of a data frame df in descending order of the values of a column col_1 and to sort rows that have the same value for col_1 in descending order of the values of col_2.

df = df %>% arrange(desc(col_1), desc(col_2))

Here is how this works:

  • We pass the data frame df to the function arrange().
  • We pass to arrange() the names of the columns we wish to sort by (separated by commas) which here are col_1 and col_2.
  • By default arrange() sorts in ascending order. To sort in descending order, we wrap the column name in the function desc(); here arrange(desc(col_1)).
  • The order in which we specify the sort columns matters. In this example, the rows of the data frame are first sorted in descending order of the values of col_1 and then rows that have the same value for col_1 are sorted in descending order of the values of col_2.

Multiple Directions

We wish to sort the rows of a data frame in ascending order of the value of one or more columns and in descending order of the values of one or more columns.

In this example, we wish to sort the rows of a data frame df in ascending order of the values of a column col_1 and to sort rows that have the same value for col_1 in descending order of the values of col_2.

df = df %>% arrange(col_1, desc(col_2))

Here is how this works:

This works as described in the Descending scenario above except that we wrap only col_2 in desc() because we wish to sort in ascending order of col_1 and in descending order of col_2.

R
I/O