We do not explicitly specify the columns we wish to relocate by name or position; rather, we refer to them implicitly.
In this example, we wish to move the columns whose names end with the suffix ‘_id’ to the end of the data frame.
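# Columns whose names end with '_id'
s_cols = df.columns[df.columns.str.endswith('_id')]
# sort=False keeps the remaining columns in their original order
df_2 = df.loc[:, df.columns.difference(s_cols, sort=False).append(s_cols)]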
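To try this out, the snippet can be run on a small data frame. The sketch below is only an illustration: the data frame and its column names are made up, and it also contrasts the default behaviour of difference(), which sorts its result, with sort=False, which keeps the original column order.

```python
import pandas as pd

# Hypothetical data frame; the column names are made up for illustration.
df = pd.DataFrame({
    "order_id": [1, 2],
    "price": [9.5, 3.0],
    "customer_id": [10, 11],
    "city": ["Oslo", "Lima"],
})

# Columns whose names end with '_id'
s_cols = df.columns[df.columns.str.endswith("_id")]

# Default behaviour: difference() sorts the remaining column names
sorted_rest = df.columns.difference(s_cols)

# sort=False keeps the remaining columns in their original order
ordered_rest = df.columns.difference(s_cols, sort=False)

df_2 = df.loc[:, ordered_rest.append(s_cols)]

print(list(sorted_rest))   # ['city', 'price']
print(list(df_2.columns))  # ['price', 'city', 'order_id', 'customer_id']
```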
Here is how this works:
In df.columns[df.columns.str.endswith('_id')], we select all columns whose names end with ‘_id’. See Implicit Selection for coverage of common scenarios of implicit column selection, including by name pattern, data type, and criteria satisfied by the column’s data.
We obtain the remaining columns (those whose names do not end with ‘_id’) via df.columns.difference(). Note that difference() sorts its result by default; passing sort=False keeps the remaining columns in their original order.
We add the selected columns after the remaining columns via append().
We use .loc[] to extract the columns in the order specified by the modified column index. See Relative Relocating.
The output df_2 will be a copy of the input data frame df but with the columns ending with the suffix ‘_id’ moved to the end of the data frame.
Extension: Relative to Implicitly Selected Group
We wish to relocate a set of implicitly selected columns to be located relative to, i.e. before or after, another set of implicitly selected columns.
In this example, we wish to have character columns come before numeric columns (which is oftentimes a good practice when working with actual datasets).
s_cols_1 = df.select_dtypes('object').columns
s_cols_2 = df.select_dtypes('number').columns
s_cols = s_cols_1.append(s_cols_2)
# sort=False keeps the remaining columns in their original order
df_2 = df.loc[:, s_cols.append(df.columns.difference(s_cols, sort=False))]
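As a runnable illustration of the snippet above, the sketch below uses a made-up data frame (all names and values are hypothetical). It assumes the text columns are stored with the object dtype, so dtype="object" is set explicitly, and it includes a boolean column to show where columns that are neither character nor numeric end up.

```python
import pandas as pd

# Hypothetical data frame; names and values are made up for illustration.
# dtype="object" is set explicitly so the text columns are object-typed
# regardless of the default string dtype of the installed pandas version.
df = pd.DataFrame({
    "qty": [2, 5],
    "name": pd.Series(["ann", "bob"], dtype="object"),
    "active": [True, False],
    "price": [1.5, 2.0],
    "city": pd.Series(["Oslo", "Lima"], dtype="object"),
})

s_cols_1 = df.select_dtypes("object").columns  # character columns
s_cols_2 = df.select_dtypes("number").columns  # numeric columns
s_cols = s_cols_1.append(s_cols_2)

# Columns that are neither character nor numeric go last
rest = df.columns.difference(s_cols, sort=False)
df_2 = df.loc[:, s_cols.append(rest)]

print(list(df_2.columns))
# ['name', 'city', 'qty', 'price', 'active']
```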
Here is how this works:
In df.select_dtypes('object').columns, we select the columns whose data type is object (string) and then extract the names of those columns (as an index). We do the same for numeric columns. See Implicit Selection.
We combine s_cols_1 and s_cols_2, corresponding to string and numeric columns respectively, into s_cols via append().
We obtain the columns not in s_cols via df.columns.difference(s_cols) and append that to s_cols.
We use loc[] to extract the columns in the order specified by the modified column index. See Relative Relocating.
The output df_2 will be a copy of the input data frame df but with character columns appearing before numeric columns and any other columns appearing after.
Alternative: Append Columns
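# Character and numeric columns, each as a data frame
df_a = df.select_dtypes('object')
df_b = df.select_dtypes('number')
df_c = pd.concat([df_a, df_b], axis=1)
# Append the remaining columns after the selected ones
df_2 = pd.concat([df_c, df.drop(columns=df_c.columns)], axis=1)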
Here is how this works:
Since select_dtypes() returns a data frame, it may be more natural to work with data frames rather than column names. See Relative Relocating.
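The data frame based approach can be sketched end to end as follows. The data frame is made up for illustration, the text columns are assumed to be object-typed (set explicitly here), and select_dtypes("number") is used so that float columns are grouped with the other numeric columns.

```python
import pandas as pd

# Hypothetical data frame; names and values are made up for illustration.
df = pd.DataFrame({
    "qty": [2, 5],
    "name": pd.Series(["ann", "bob"], dtype="object"),
    "active": [True, False],
    "price": [1.5, 2.0],
    "city": pd.Series(["Oslo", "Lima"], dtype="object"),
})

df_a = df.select_dtypes("object")  # character columns, as a data frame
df_b = df.select_dtypes("number")  # numeric columns, as a data frame

# Place the two selected groups side by side, character columns first
df_c = pd.concat([df_a, df_b], axis=1)

# Append whatever was not selected (here, the boolean column)
df_2 = pd.concat([df_c, df.drop(columns=df_c.columns)], axis=1)

print(list(df_2.columns))
# ['name', 'city', 'qty', 'price', 'active']
```

Because drop() keeps the surviving columns in their original order, no sorting concern arises for the remaining columns in this variant.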