We wish to rename columns of a data frame by their position without providing the current name(s).
Where possible, we should rename columns by mapping the current name to the desired name. Sometimes, though, we need to identify the columns to rename by their position rather than their current name, e.g. when the current names are not valid or are too long.
Note that referring to columns by their position is fragile and should be handled with care: if we get the positions wrong, we will mislabel columns without any error being raised.
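To make the trade-off concrete, here is a minimal sketch that contrasts renaming by name with renaming by position. The data frame and its column names are hypothetical, and the stopifnot() guard is just one possible way to catch a wrong assumption about the column layout before renaming by position.

```r
library(dplyr)

# Hypothetical data frame with an awkward column name (assumed for illustration).
df <- data.frame(`First Column` = 1:3, x2 = c("a", "b", "c"), check.names = FALSE)

# Preferred where possible: map the current name to the desired name.
df_by_name <- df %>%
  rename(col_a = `First Column`)

# By position: first check that the layout is what we expect,
# since a wrong position would silently rename the wrong column.
stopifnot(ncol(df) == 2)
df_by_position <- df %>%
  rename_with(~ "col_b", 2)
```

The guard does not prove that position 2 holds the intended data, only that the data frame has the expected number of columns; checking a few values of the column may also be worthwhile.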
In this section, we cover three common scenarios of setting column names by position: renaming all columns, renaming one column, and renaming some columns.
We wish to rename all columns by providing a set of new column names in the right order.
In this example, we have a data frame of three columns and we wish to set their names to col_a, col_b, and col_c.
df_2 = df %>%
  set_names(c('col_a', 'col_b', 'col_c'))
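The snippet above assumes that df already exists. A self-contained sketch might look like the following, where the three-column df and its original names (V1, V2, V3) are assumptions made for illustration.

```r
library(dplyr)  # provides the %>% pipe
library(purrr)  # provides set_names()

# Hypothetical three-column data frame (assumed for illustration).
df <- data.frame(V1 = 1:2, V2 = c("x", "y"), V3 = c(TRUE, FALSE))

# Supply one new name per column, in column order.
df_2 <- df %>%
  set_names(c("col_a", "col_b", "col_c"))

names(df_2)
```

Only the names change; the column contents and their order are left as they were.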
Here is how this works:
We use set_names() from purrr to assign new names to the columns, given a vector with one name per column in the right order. The base R equivalent is setNames().
We wish to set the name for a specific column given its position.
In this example, we have a data frame of three columns, and we wish to change the name of the third column to col_c.
df_2 = df %>%
  rename_with(~'col_c', 3)
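As a runnable sketch of the snippet above, the following uses a hypothetical three-column df (the names V1, V2, V3 are assumptions). It also writes the formula out as an ordinary function to show that the two forms should produce the same result.

```r
library(dplyr)

# Hypothetical three-column data frame (assumed for illustration).
df <- data.frame(V1 = 1:2, V2 = c("x", "y"), V3 = c(TRUE, FALSE))

# Formula shorthand: ignore the current name and return 'col_c'.
df_2 <- df %>%
  rename_with(~ "col_c", 3)

# The same idea with an explicit function; current_name is a
# hypothetical argument name and is deliberately unused.
df_3 <- df %>%
  rename_with(function(current_name) "col_c", 3)

names(df_2)
```

The renaming function receives the current name of the selected column, but here it ignores that input and always returns the new name.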
Here is how this works:
We use rename_with() from dplyr to set the name of a particular column given its index. In this case, we set the name of the 3rd column to col_c. See Implicit Naming.
We pass two arguments to rename_with(): a function that returns the new name, written here as the one-sided formula ~'col_c', and the position of the column to rename, 3.
The output data frame df_2 will be the same as the input data frame df except that the third column will have its name changed to col_c.
We wish to set the name for some columns given their position.
In this example, we wish to change the name of the first and third columns to col_a and col_c respectively.
df_2 = df %>%
  rename_with(~c('col_a', 'col_c'), c(1, 3))
Here is how this works:
We use rename_with() from dplyr to set the names of particular columns given their index. In this case, we set the names of the 1st and 3rd columns to col_a and col_c. See Implicit Naming.
We pass two arguments to rename_with(): a function that returns the new names, written here as the one-sided formula ~c('col_a', 'col_c'), and the positions of the columns to rename, c(1, 3).
The output data frame df_2 will be the same as the input data frame df except that the first and third columns will have their names changed to col_a and col_c respectively.
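As a self-contained sketch of the multi-column case, the following assumes a hypothetical three-column df with original names V1, V2, and V3, and then checks that only the selected columns were renamed.

```r
library(dplyr)

# Hypothetical three-column data frame (assumed for illustration).
df <- data.frame(V1 = 1:2, V2 = c("x", "y"), V3 = c(TRUE, FALSE))

# Rename the 1st and 3rd columns; the new names are given in the
# same order as the positions.
df_2 <- df %>%
  rename_with(~ c("col_a", "col_c"), c(1, 3))

# The second column keeps its original name, and the data is unchanged.
names_ok <- identical(names(df_2), c("col_a", "V2", "col_c"))
data_ok <- identical(unname(as.list(df_2)), unname(as.list(df)))
```

Comparing the resulting names against the expected vector, as names_ok does here, is a cheap safeguard against the position mix-ups this approach is prone to.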