Map Names

We wish to rename columns of a data frame by providing a mapping from the current column names to the desired names for the columns whose names we wish to change. This is the recommended approach to column renaming (when it is possible).

In this example, we wish to change the names of two columns from col_a and col_b to col_1 and col_2 respectively.

df_2 = df.rename(columns={'col_a': 'col_1', 'col_b': 'col_2'})

Here is how this works:

  • We use the data frame method rename() to change column names from their current values to the desired values.
  • To rename columns, rename() expects a Python dictionary mapping between the current column names (as dictionary keys) and desired column names (as dictionary values); i.e. {current_name : desired_name}. The name-mapping dictionary in this case is {'col_a': 'col_1', 'col_b': 'col_2'}.
  • The dictionary only needs to include the columns that we wish to rename; columns that are not in the dictionary keep their current names.
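
The points above can be checked with a small, self-contained sketch. The sample data frame here, including the extra column col_c, is an assumption made for illustration and is not part of the original example.

```python
import pandas as pd

# Hypothetical sample data; col_c is included only to show that
# columns missing from the mapping are left untouched.
df = pd.DataFrame({'col_a': [1, 2], 'col_b': [3, 4], 'col_c': [5, 6]})

# Map current names (keys) to desired names (values).
df_2 = df.rename(columns={'col_a': 'col_1', 'col_b': 'col_2'})

print(list(df_2.columns))  # ['col_1', 'col_2', 'col_c']
print(list(df.columns))    # ['col_a', 'col_b', 'col_c']
```

Note that rename() returns a new data frame by default, so df keeps its original column names.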

Extension: Level of Multi-Index

We wish to change the labels of a particular level of a MultiIndex.

df_2 = (df
        .groupby('col_1')
        .agg(['nunique', 'sum'])
        .rename(columns={'nunique': 'unique_count', 'sum': 'total'}, level=1))

Here is how this works:

  • In this example, we demonstrate renaming a level of a MultiIndex following a data aggregation operation via groupby() and agg(). See Aggregating.
  • To change the labels of a particular level of a MultiIndex:
    • We pass to rename() a dictionary mapping the current labels to the desired labels and
    • we pass to the level argument of rename() an integer value specifying the level we wish to modify, which in this example is level=1 (the second level).
  • The columns of the output data frame df_2 will form a MultiIndex with two levels. The first level holds the names of the non-grouping columns of the input data frame df, i.e. col_2 and col_3, and the second level holds the modified labels, i.e. 'unique_count' and 'total', for each column.
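
To see the resulting column structure, here is a minimal runnable sketch of the same operation. The contents of df are made up for illustration; only the column names col_1, col_2 and col_3 follow the description above.

```python
import pandas as pd

# Hypothetical input: col_1 is the grouping column,
# col_2 and col_3 are the columns being aggregated.
df = pd.DataFrame({
    'col_1': ['x', 'x', 'y'],
    'col_2': [1, 2, 2],
    'col_3': [10, 20, 30],
})

df_2 = (df
        .groupby('col_1')
        .agg(['nunique', 'sum'])
        # level=1 restricts the renaming to the second level of the column MultiIndex.
        .rename(columns={'nunique': 'unique_count', 'sum': 'total'}, level=1))

print(df_2.columns.tolist())
# [('col_2', 'unique_count'), ('col_2', 'total'),
#  ('col_3', 'unique_count'), ('col_3', 'total')]
```

A single aggregated column can then be selected with a tuple of labels, for example df_2[('col_2', 'total')].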