Dynamic Sorting

We wish to specify the sorting columns dynamically, i.e. through a variable or a function argument. In particular, we cover the following scenarios:

  • As Function Argument: We wish to pass the sorting columns to a function where the sorting will be carried out.
  • As String Variable: The names of the columns that we wish to sort by are available as string variables.

In this section we cover the specification of sorting column names dynamically. For more involved dynamic sorting scenarios:

  • If you wish to dynamically specify the function to use to select the sorting columns, see Dynamic Selection for how to specify an implicit column selection function dynamically.
  • If you wish to dynamically specify the function to apply to the sorting columns before sorting, see Dynamic Transformation for how to specify a transformation function dynamically.

As Function Argument

One Column

We wish to pass one sorting column to a function wherein sorting takes place.

def m_sort_values(p_df, p_by, p_ascending=True):
    p_df = (p_df.sort_values(p_by, ascending=p_ascending))
    return p_df

(df
 .pipe(m_sort_values, 'col_1', False))

Here is how this works:

  • We have a custom function m_sort_values() where sorting happens.
  • We use pipe() to pass to our custom function m_sort_values() the data frame df, the column to use for sorting 'col_1', and the sorting direction p_ascending=False (i.e. descending).
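
To make the steps above concrete, here is a minimal self-contained sketch. The sample data frame and its values are assumptions made up for illustration; only m_sort_values() and the column name 'col_1' come from the text.

```python
import pandas as pd

# Hypothetical sample data (not from the original text).
df = pd.DataFrame({'col_1': [2, 3, 1], 'col_2': ['b', 'a', 'c']})

def m_sort_values(p_df, p_by, p_ascending=True):
    # Sort by the column name passed in, in the requested direction.
    return p_df.sort_values(p_by, ascending=p_ascending)

# pipe() passes df as the first argument, followed by the extra arguments.
df_sorted = df.pipe(m_sort_values, 'col_1', False)
print(df_sorted)
```

Because p_ascending defaults to True, calling df.pipe(m_sort_values, 'col_1') would sort in ascending order instead.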

Multiple Columns

We wish to pass a fixed number of sorting columns to a function wherein sorting takes place.

def m_sort_values(_df, _by, _ascending):
    _df = _df.sort_values(by=_by, ascending=_ascending)
    return _df

(df
 .pipe(m_sort_values,
       ['col_1', 'col_2'],
       [True, False]))

Here is how this works:

  • We have a custom function m_sort_values() where sorting happens.
  • We use pipe to pass to our custom function m_sort_values() the data frame df, the columns to use for sorting ['col_1', 'col_2'] as a list and the sorting direction as a list of logical values ascending=[True, False].
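
As a rough illustration of the multiple-column case, the sketch below runs the same function on a small data frame. The data values are assumptions chosen so that ties in 'col_1' show the effect of the second sorting column.

```python
import pandas as pd

# Hypothetical sample data with ties in col_1 (not from the original text).
df = pd.DataFrame({'col_1': [1, 1, 2, 2], 'col_2': [10, 20, 30, 40]})

def m_sort_values(_df, _by, _ascending):
    # _by is a list of column names; _ascending is a matching list of booleans.
    return _df.sort_values(by=_by, ascending=_ascending)

# col_1 ascending, then col_2 descending within each col_1 value.
df_sorted = df.pipe(m_sort_values, ['col_1', 'col_2'], [True, False])
print(df_sorted)
```

The list of directions should have the same length as the list of columns, one direction per column.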

As String Variable

The names of the columns that we wish to sort by are specified as strings held in variables in the environment (or in function arguments).

One Column

col = 'col_1'

df.sort_values(col)

Here is how this works:

We pass the variable holding the column name (as a string) to the sort_values() function.

Multiple Columns

cols = ['col_1', 'col_2', 'col_3']

df.sort_values(cols)

Here is how this works:

We pass the variable holding the column names (as a list of strings) to the sort_values() function.
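
The sketch below puts both variants together on a small data frame. The sample data and the directions variable are assumptions added for illustration; the source only shows the column names being held in variables.

```python
import pandas as pd

# Hypothetical sample data (not from the original text).
df = pd.DataFrame({
    'col_1': [2, 1, 2, 1],
    'col_2': ['b', 'a', 'a', 'b'],
    'col_3': [0.5, 1.5, 2.5, 3.5],
})

col = 'col_1'                  # one column name as a string
cols = ['col_1', 'col_2']      # several column names as a list of strings
directions = [True, False]     # assumed: one sorting direction per column

by_one = df.sort_values(col)
by_many = df.sort_values(cols, ascending=directions)
print(by_many)
```

Since sort_values() accepts either a single string or a list of strings, the same call works whether the variable holds one column name or several.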
