Length

We wish to obtain the length of a vector (number of elements) or a data frame (number of rows).

This section is organized as follows:

Element Count is concerned with one-dimensional structures such as a list or a data frame column. We cover the following scenarios:
- Vector: The number of elements in a data frame column, a vector, or a list.
- Grouped Column: The number of elements of a column in each group.

Row Count is concerned with data frames. We cover the following scenarios:
- Data Frame: The number of rows in a data frame.
- Grouped Data Frame: The number of rows in each group of a grouped data frame.

Element Count

Vector

We wish to obtain the number of elements in a vector or a list.

my_vec = c(1, 2, 3, 4, NA)
vec_length = length(my_vec)

Here is how this works:

To get the length of a vector or a list, we use the length() function from base R.
We can apply length() to a data frame column in the same way, e.g., length(df$col_1).
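
As a minimal sketch of the points above, the following applies length() to a vector, a list, and a data frame column. The list my_list and the data frame df are hypothetical, made up for illustration. Note that length() applied to the data frame itself returns the number of columns, not the number of rows.

```r
# Hypothetical data for illustration
my_vec = c(1, 2, 3, 4, NA)
my_list = list("a", 1:3, NULL)
df = data.frame(col_1 = c("x", "y", "z"), col_2 = c(10, 20, 30))

vec_length = length(my_vec)     # NA is counted as an element
list_length = length(my_list)   # top-level elements only, NULL included
col_length = length(df$col_1)   # number of values in the column
df_length = length(df)          # number of columns, not rows
```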

Extension: Ignore NA

We wish to obtain the number of non-NA elements in a vector or a list.

my_vec = c(1, 2, 3, 4, NA)
vec_length = sum(!is.na(my_vec))

Here is how this works:

We use the function is.na() from base R to identify which elements of a vector are missing. See Missing Values.
The output of is.na() is a logical vector of the same length as my_vec, where a value is TRUE if the corresponding element of my_vec is NA and FALSE otherwise.
Since we wish to count the number of non-NA values, we negate the output of is.na() with the logical negation operator !.
Finally, we sum the logical values to obtain the number of non-NA values via sum(). See Working with Logical Data.
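
The steps above can be broken out one at a time, as in the sketch below. The intermediate names is_missing and is_present are introduced here only for illustration.

```r
my_vec = c(1, 2, 3, 4, NA)

# Step 1: flag missing elements
is_missing = is.na(my_vec)

# Step 2: negate to flag the elements that are present
is_present = !is_missing

# Step 3: sum the logical vector; TRUE counts as 1 and FALSE as 0
vec_length = sum(is_present)
```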

Grouped Column

We wish to obtain the number of elements in a column for each group.

In this example, we wish to obtain the number of values of the column col_2 for each group, where the groups are defined by the values of the column col_1.

df_2 = df %>%
  group_by(col_1) %>% 
  summarize(count = length(col_2))

Here is how this works:

The data frame df is grouped by the column col_1. The operations carried out inside the subsequent call to summarize() are executed on each group. See Aggregating.
We use length() to obtain the number of values of the column col_2 for each group.

Extension: Ignore NA

We wish to obtain the number of non-NA elements in a column for each group.

df_2 = df %>%
  group_by(col_1) %>% 
  summarize(count = sum(!is.na(col_2)))

Here is how this works:

We use the expression sum(!is.na(col_2)) to compute the number of non-NA values of the column col_2 in each group. See “Extension: Ignore NA” under Vector above for a description.
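
To see how the two grouped counts differ, the sketch below computes both side by side on a small data frame. The data and the column name count_non_na are hypothetical, chosen only for illustration.

```r
library(dplyr)

# Hypothetical data: group "a" has one NA, group "b" has one NA
df = tibble(
  col_1 = c("a", "a", "b", "b", "b"),
  col_2 = c(1, NA, 3, 4, NA)
)

df_2 = df %>%
  group_by(col_1) %>%
  summarize(
    count = length(col_2),               # all values, NA included
    count_non_na = sum(!is.na(col_2))    # non-NA values only
  )
```

With this data, count should exceed count_non_na by one in each group, since each group contains exactly one NA.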

Row Count

Data Frame

We wish to obtain the number of rows of a data frame.

df %>% nrow()

Here is how this works:

We use the function nrow() from base R to obtain the number of rows in a data frame.

Extension: Add Row Count Column

We wish to add a column to a data frame that has a constant value equal to the number of rows in the data frame.

df_2 = df %>%
  mutate(count = n())

Here is how this works:

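We use n() from dplyr, which returns the number of rows in the current group. Because df is not grouped here, that is the number of rows in the whole data frame, and mutate() repeats the value in every row of the new column count.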
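
A minimal sketch on a hypothetical three-row data frame shows that every row of the count column holds the same value, and that the value matches what nrow() reports.

```r
library(dplyr)

# Hypothetical data for illustration
df = tibble(
  col_1 = c("a", "a", "b"),
  col_2 = c(1, 2, 3)
)

row_count = df %>% nrow()

# n() is evaluated once for the ungrouped data frame and recycled
df_2 = df %>%
  mutate(count = n())
```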

Grouped Data Frame

We wish to obtain the number of rows in each group of a grouped data frame.

In this example, we wish to obtain the number of rows in each group, where the groups are defined by the values of the column col_1.

df_2 = df %>%
  group_by(col_1) %>%
  summarize(count = n())

Here is how this works:

The data frame df is grouped by the column col_1. The operations carried out inside the subsequent call to summarize() are executed on each group. See Aggregating.
We use n() from dplyr to obtain the number of rows in the current group.
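
The sketch below runs the grouped count on hypothetical data. It also shows count() from dplyr, a shorthand that groups and counts in one call; it is included as an alternative, not as the approach this page prescribes.

```r
library(dplyr)

# Hypothetical data for illustration
df = tibble(
  col_1 = c("a", "a", "b", "b", "b"),
  col_2 = c(1, 2, 3, 4, 5)
)

# Group, then count the rows in each group
df_2 = df %>%
  group_by(col_1) %>%
  summarize(count = n())

# Alternative: count() groups by col_1 and counts rows in one step
df_3 = df %>%
  count(col_1, name = "count")
```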
