Range of Rows

We wish to obtain a range of rows of a data frame. This is commonly referred to as slicing.

Range

We wish to get a range of rows between a given start position and end position.

df %>% slice(1:10)

Here is how this works:

  • We pass the data frame df to the function slice().
  • slice() can take a range of row positions specified as start:end and returns the corresponding rows including the rows at both start and end.
  • start is the position of the first row we wish to obtain and end is the position of the last row. Row positions start at 1, which is the topmost row.
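To make the inclusive behavior concrete, here is a minimal sketch. The data frame and its id column are hypothetical, made up only for illustration, and dplyr is assumed to be installed.

```r
library(dplyr)

# Hypothetical 20-row data frame; id simply records the original row position
df <- data.frame(id = 1:20, value = (1:20) * 10)

# Rows at positions 3 through 6; both the start and end rows are returned
rows_3_to_6 <- df %>% slice(3:6)

rows_3_to_6$id
# 3 4 5 6
```

A range start:end therefore returns end - start + 1 rows, four in this case.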

From End

Get a range of rows (slice) relative to the bottom of the data frame.

df %>% slice((n()-6) : (n()-2))

Here is how this works:

  • We pass the data frame df to the function slice().
  • n() returns the number of rows in the data frame. Since row positions start at 1, n() is also the position of the last row.
  • We can specify row positions relative to the bottom of the data frame by subtracting from n(). In this example, n()-6 is the seventh row from the bottom and n()-2 is the third row from the bottom.
  • The rows corresponding to start and end are included.
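The arithmetic on n() is easier to follow with a known row count. The sketch below uses a hypothetical 20-row data frame (the id column is an assumption for illustration), so n() evaluates to 20.

```r
library(dplyr)

# Hypothetical 20-row data frame; id records the original row position
df <- data.frame(id = 1:20)

# With 20 rows, (n() - 6):(n() - 2) evaluates to 14:18
from_end <- df %>% slice((n() - 6):(n() - 2))

from_end$id
# 14 15 16 17 18
```

Row 14 is the seventh row from the bottom of a 20-row data frame and row 18 is the third, so five rows come back.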

Specific Sort

We often need the data frame to be sorted in a certain way before we take a slice. In other words, we wish to sort the data frame by a particular column (or set of columns) and then take a range of rows from the sorted result.

df %>% arrange(col_1) %>% slice(5:8)

Here is how this works:

  • The arrange() function sorts a data frame in ascending order of the column(s) passed to it. In this example, arrange sorts the data frame df in ascending order of the values of the column col_1. For more details see Sorting.
  • We pass the output of arrange(), which is a sorted data frame, to the function slice() to get a range of rows (as described above).
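The order of the two steps matters, which the following sketch tries to show. The values in col_1 and col_2 are hypothetical and chosen only so the effect is easy to see.

```r
library(dplyr)

# Hypothetical unsorted data frame
df <- data.frame(
  col_1 = c(50, 20, 90, 10, 70, 30, 80, 40, 60, 100),
  col_2 = letters[1:10]
)

# Sort first, then take positions 5 to 8 of the sorted data frame
sorted_slice <- df %>% arrange(col_1) %>% slice(5:8)
sorted_slice$col_1
# 50 60 70 80

# Slicing first picks positions 5 to 8 of the original order, then sorts those
sliced_first <- df %>% slice(5:8) %>% arrange(col_1)
sliced_first$col_1
# 30 40 70 80
```

Sorting before slicing returns the fifth through eighth smallest values of col_1, whereas slicing before sorting only reorders whichever rows happened to sit at positions 5 to 8.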

Selected Columns

We wish to get a range of rows from a data frame but return only a particular set of columns.

df %>% select(col_1, col_3) %>% slice(5:8)

Here is how this works:

  • We use select() to specify the names of the columns of the data frame df that we wish to include in the output. In this example, the columns are col_1 and col_3. For detailed coverage, see Selecting by Name.
  • We then pass the output of select() to slice(start:end) to get a range of rows.
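As a small illustration of combining the two steps, the sketch below builds a hypothetical three-column data frame (the column contents are made up) and keeps only col_1 and col_3 for rows 5 to 8.

```r
library(dplyr)

# Hypothetical data frame with three columns
df <- data.frame(
  col_1 = 1:10,
  col_2 = letters[1:10],
  col_3 = (1:10)^2
)

# Keep two columns, then take rows 5 through 8
subset_rows <- df %>% select(col_1, col_3) %>% slice(5:8)

names(subset_rows)
# "col_1" "col_3"
subset_rows$col_3
# 25 36 49 64
```

Because select() only drops columns and slice() only drops rows, running the two steps in the opposite order should give the same result here.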