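We wish to identify the columns to which we will apply data transformation logic, one column at a time.
We will cover the following scenarios: all columns, explicitly selected columns, implicitly selected columns, and all columns except an excluded set.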
This section is complemented by
We wish to apply a data transformation operation to each column of a data frame.
In this example, we have a data frame df of numeric columns, and we wish to round all columns.
df_2 = df %>%
  mutate(across(everything(), round))
Here is how this works:
- We pass the data frame df to the function mutate().
- Inside mutate(), we use across() as follows:
  - The first argument to across() is a selection of columns. In this example, we use everything() to select all columns.
  - The second argument to across() is the data transformation that we wish to apply to each column selected in the first argument. In this case, it is the function round().
- As a result, round() is applied to each column of the data frame df, as selected via everything().
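To make the steps above concrete, here is a minimal, self-contained sketch. The data frame and its values are made up for illustration and are not part of the original example; the second call assumes R 4.1 or later for the `\(x)` lambda syntax.

```r
library(dplyr)

# Hypothetical data frame of numeric columns (illustrative values only).
df = tibble(
  col_1 = c(1.24, 2.61, 3.07),
  col_2 = c(10.49, 20.52, 30.91)
)

# First argument: which columns; second argument: what to apply to each.
df_2 = df %>%
  mutate(across(everything(), round))

# To pass extra arguments to the transformation, wrap it in an
# anonymous function. Here we round to one decimal place instead.
df_3 = df %>%
  mutate(across(everything(), \(x) round(x, digits = 1)))

print(df_2)
print(df_3)
```

The column names are unchanged in the output because across() overwrites each selected column in place unless told otherwise.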
We wish to apply a data transformation operation to each of a set of explicitly selected columns.
In this example, we wish to apply the round() function to columns col_1, col_2, and col_4 of a data frame df.
df_2 = df %>%
  mutate(across(c(col_1, col_2, col_4), round))
Here is how this works:
- We use across() inside mutate() to carry out implicit data transformation as described in the “All Columns” scenario above.
- In c(col_1, col_2, col_4), we identify the columns we wish to select by name.
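As a rough illustration of explicit selection, the sketch below builds a small made-up data frame (the values are assumptions, not from the original) and rounds only the named columns. It also shows one possible variant for when the column names are held in a character vector, using the tidyselect helper all_of().

```r
library(dplyr)

# Hypothetical data frame (illustrative values only).
df = tibble(
  col_1 = c(1.24, 2.61),
  col_2 = c(3.38, 4.72),
  col_3 = c(5.16, 6.83),
  col_4 = c(7.27, 8.94)
)

# Round only the columns spelled out by name; col_3 is left as is.
df_2 = df %>%
  mutate(across(c(col_1, col_2, col_4), round))

# Variant: the names live in a character vector, so we wrap it in all_of().
cols_to_round = c("col_1", "col_2", "col_4")
df_3 = df %>%
  mutate(across(all_of(cols_to_round), round))

print(df_2)
```

Both calls should produce the same result; the second form can be convenient when the list of columns is computed elsewhere in a script.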
See Basic Selection for coverage of explicit column selection scenarios, all of which can be used to select columns for implicit transformation.
We wish to apply a data transformation operation to each of a set of implicitly selected columns. Implicit column selection is when we do not spell out the column names or positions explicitly but rather identify the columns via a property of their names or their data.
In this example, we wish to apply the round() function to each column of the data frame df that has a double data type.
df_2 = df %>%
  mutate(across(where(is.double), round))
Here is how this works:
- We use across() inside mutate() to carry out implicit data transformation as described in the “All Columns” scenario above.
- We use where(is.double) to select all columns whose data type is double.
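The following sketch shows selection by data type on a small made-up data frame with mixed column types (the column names and values are assumptions for illustration). Only the double columns are rounded; the integer and character columns pass through untouched.

```r
library(dplyr)

# Hypothetical data frame with mixed column types (illustrative values only).
df = tibble(
  id     = c(1L, 2L, 3L),          # integer
  name   = c("a", "b", "c"),       # character
  price  = c(1.24, 2.61, 3.07),    # double
  weight = c(10.49, 20.52, 30.91)  # double
)

# where() takes a predicate function and keeps the columns for which
# the predicate returns TRUE; here that is price and weight.
df_2 = df %>%
  mutate(across(where(is.double), round))

print(df_2)
```

Note that is.double() is FALSE for integer columns, so id is not selected here; a predicate such as is.numeric() would select both integer and double columns.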
See Implicit Selection for coverage of the most common scenarios of implicit column selection, including by name pattern, data type, and criteria satisfied by the column’s data.
We wish to apply a data transformation operation to all but a set of columns.
In this example, we wish to apply the round() function to each column of the data frame df except the columns col_1 and col_2.
df_2 = df %>%
  mutate(across(!c(col_1, col_2), round))
Here is how this works:
- We use across() inside mutate() to carry out implicit transformation as described in the “All Columns” scenario above.
- In !c(col_1, col_2), we identify the columns we wish to exclude by name.
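A minimal sketch of exclusion on a made-up data frame (values are assumptions for illustration) might look like this. The negated selection keeps every column except the ones named, so only col_3 and col_4 are rounded.

```r
library(dplyr)

# Hypothetical data frame (illustrative values only).
df = tibble(
  col_1 = c(1.24, 2.61),
  col_2 = c(3.38, 4.72),
  col_3 = c(5.16, 6.83),
  col_4 = c(7.27, 8.94)
)

# Round every column except col_1 and col_2.
df_2 = df %>%
  mutate(across(!c(col_1, col_2), round))

# The minus sign can be used in place of ! for the same selection.
df_3 = df %>%
  mutate(across(-c(col_1, col_2), round))

print(df_2)
```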
See Exclude Columns for coverage of column exclusion scenarios, all of which can be used for implicit transformation.