We wish to sort the rows of a data frame by the values of multiple columns.
When sorting by multiple columns, sorting is carried out in sequence where rows that have the same value for a former column are sorted by the value of a later column. For instance, say we are sorting by two columns ‘col_1'
and ‘col_2'
, rows that have the same value for ‘col_1'
will be sorted by the value of ‘col_2'
.
We wish to sort the rows of a data frame in increasing order of the values of two or more columns.
In this example, we wish to sort the rows of a data frame df
in ascending order of the values of a column ‘col_1'
and to sort rows that have the same value for ‘col_1'
in ascending order of the values of ‘col_2'
.
df = df.sort_values(by=['col_1', 'col_2'])
Here is how this works:
sort_values()
to the data frame df
.sort_values()
the names of the columns we want to sort by which here are ['col_1', 'col_2']
.sort_values()
sorts in ascending order.‘col_1'
and then rows that have the same value for col_1
are sorted in ascending order of the values of ‘col_2'
.We wish to sort the rows of a data frame in descending order of the values of two or more columns.
In this example, we wish to sort the rows of a data frame df
in descending order of the values of a column ‘col_1'
and to sort rows that have the same value for ‘col_1'
in descending order of the values of ‘col_2'
.
df = df.sort_values(by=['col_1', 'col_2'], ascending=False)
Here is how this works:
sort_values()
to the data frame df
.sort_values()
the names of the columns we want to sort by which here are ['col_1', 'col_2']
.ascending=False
. The default, as we saw above, is to sort in ascending order; i.e.ascending=True
.We wish to sort the rows of a data frame in ascending order of the value of one or more columns and in descending order of the values of one or more columns.
In this example, we wish to sort the rows of a data frame df
in ascending order of the values of a column ‘col_1'
and to sort rows that have the same value for ‘col_1'
in descending order of the values of ‘col_2'
.
df = df.sort_values(by=['col_1', 'col_2'], ascending=(True, False))
Here is how this works:
sort_values()
to the data frame df
and pass the names of the columns we wish to sort by ['col_1', 'col_2']
like we did above.ascending
of the sort_values()
method takes a tuple of booleans corresponding to the column names defined in by
.ascending
to True
for the columns that we wish to sort in ascending order and to False
for the columns that we wish to sort in descending order. In this example, we are sorting in ascending order of ‘col_1'
and in descending order of ‘col_2'
.Upon sorting a data frame, the rows keep their original row index. In some situations, we want to reset the row index so the rows of the sorted data frame would have an ordered index starting at 0 with the first row and ending at n-1 for the last row (where n is the number of rows in the data frame).
In this example, we wish to sort a data frame df
in ascending order of the values of a column ‘col_1'
and to sort rows that have the same value for ‘col_1'
in ascending order of the values of ‘col_2'
. In addition, we wish to have the row index of the resulting sorted data frame be ordered.
df = df.sort_values(by=['col_1', 'col_2'], ignore_index=True)
Here is how this works:
sort_values()
as described above to sort the data frame df in ascending order of the values of the columns ['col_1', 'col_2']
.ignore_index
parameter to True
. The default is ignore_index=False
.