Range

We wish to obtain the range (minimum and maximum values) of a column or a vector.

While the range is most commonly computed for numeric data, it is also defined for other data types, where the minimum and maximum mean the following:

  • For numeric: The smallest and largest numeric values.
  • For string: The alphabetically first and last string values (in the collation order of the current locale).
  • For date-time: The earliest and latest date-time values.
  • For ordinal (sorted categorical): The values with the smallest and largest order.
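As a minimal sketch of the four cases above, we can call range() from base R on one small vector of each type. The vectors and variable names here are made up for illustration.

```r
# Toy vectors; all names and values are illustrative only.
nums  <- c(4.2, -1, 10)
words <- c("banana", "apple", "cherry")
dates <- as.Date(c("2023-05-01", "2021-01-15", "2022-12-31"))
sizes <- factor(c("medium", "small", "large"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)

num_range  <- range(nums)   # smallest and largest number
word_range <- range(words)  # first and last string in collation order
date_range <- range(dates)  # earliest and latest date
size_range <- range(sizes)  # lowest and highest level of the ordered factor
```

Note that for a factor this only works when it is ordered; an unordered factor has no meaningful minimum or maximum.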

Min or Max

We wish to obtain the minimum (or maximum) value of a column of a data frame (or a vector).

In this example, we wish to obtain the minimum and maximum values of the column col_2 for each group where the groups are defined by the column col_1.

df_2 = df %>% 
  group_by(col_1) %>%
  summarize(
    col_2_min = min(col_2),
    col_2_max = max(col_2))

Here is how this works:

We use min() and max() from base R inside summarize() to compute the minimum and maximum values of col_2 for each group, which yields one row per value of col_1.

Extension: Ignore NA

df_2 = df %>% 
  group_by(col_1) %>%
  summarize(
    col_2_min = min(col_2, na.rm=TRUE),
    col_2_max = max(col_2, na.rm=TRUE))

Here is how this works:

  • By default, if the input vector contains any missing values (NA), both min() and max() return NA.
  • As is common in base R, we set na.rm=TRUE in both min() and max() to ignore any missing values.
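The effect of na.rm can be checked on a small vector. The sketch below uses made-up values and also shows one edge case worth knowing about: when every value in a vector (or group) is NA, nothing is left after removal.

```r
# Toy vector with one missing value; values are illustrative only.
x <- c(3, NA, 1)

min_default <- min(x)                # NA: the missing value propagates
min_no_na   <- min(x, na.rm = TRUE)  # 1
max_no_na   <- max(x, na.rm = TRUE)  # 3

# Caveat: if every value is NA, nothing remains after removal, so
# min() returns Inf and max() returns -Inf, each with a warning.
all_na <- c(NA_real_, NA_real_)
min_all_na <- suppressWarnings(min(all_na, na.rm = TRUE))  # Inf
max_all_na <- suppressWarnings(max(all_na, na.rm = TRUE))  # -Inf
```

In a grouped summary, a group whose values are all missing would therefore show Inf and -Inf rather than NA, so it may be worth checking for such groups separately.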

Range

We wish to obtain a vector of two values (min and max) denoting the range of a column of a data frame (or a vector).

df %>% pull(col_1) %>% range()

Here is how this works:

  • In pull(col_1), we extract the column col_1 as a vector. See Selecting Single Column.
  • We use range() to get the min and max of col_1.
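Since range() returns a vector of length two, the individual values can be picked out by position. The sketch below uses a toy data frame and, to stay free of package dependencies, extracts the column with base R's df[["col_1"]], which gives the same vector as pull(col_1).

```r
# Toy data frame; values are illustrative only.
df <- data.frame(col_1 = c(7, 2, 9, 4))

# df[["col_1"]] extracts the column as a plain vector, like pull(col_1).
col_1_range <- range(df[["col_1"]])

col_1_min  <- col_1_range[1]     # first element is the minimum
col_1_max  <- col_1_range[2]     # second element is the maximum
col_1_span <- diff(col_1_range)  # maximum minus minimum
```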

Extension: Ignore NA

df %>% pull(col_1) %>% range(na.rm=TRUE)

Here is how this works:

  • By default, if the input vector contains any missing values (NA), range() returns NA for both the minimum and the maximum.
  • We set na.rm=TRUE in range() to ignore any missing values.
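To see the difference on a small vector, here is a sketch with made-up values. It also shows the finite argument of range(), which is not used above: it drops infinite values in addition to missing ones.

```r
# Toy vector with a missing and an infinite value; values are illustrative only.
x <- c(3, NA, 1, Inf)

range_default <- range(x)                 # NA NA: the missing value propagates
range_no_na   <- range(x, na.rm = TRUE)   # 1 Inf: NA dropped, Inf kept
range_finite  <- range(x, finite = TRUE)  # 1 3: NA and Inf both dropped
```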