We wish to obtain a range of rows of a data frame. This is commonly referred to as slicing.
We wish to get a range of rows between a given start position and end position.
df %>% slice(1:10)
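For a concrete picture of the call above, here is a minimal sketch on a made-up data frame. The 20-row `df` and its `id` and `value` columns are assumptions for illustration only; `id` simply records each row's original position so the result is easy to check.

```r
library(dplyr)

# Hypothetical 20-row data frame; `id` stores each row's original position.
df <- data.frame(id = 1:20, value = (1:20) * 10)

# Take rows 1 through 10; both end points are part of the result.
first_ten <- df %>% slice(1:10)

nrow(first_ten)   # number of rows returned
first_ten$id      # original positions of the returned rows
```

Because both ends of the range are included, `1:10` returns ten rows, not nine.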
Here is how this works:
We pass the data frame df to the function slice().
slice() can take a range of row positions specified as start:end and returns the corresponding rows, including the rows at both start and end.
start is the position of the first row we wish to obtain and end is the position of the last row. Row positions start at 1 for the topmost row.

We wish to get a range of rows (slice) relative to the bottom of the data frame.
df %>% slice((n()-6) : (n()-2))
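To make the bottom-relative arithmetic concrete, the sketch below runs the same expression on a made-up 20-row data frame. The data frame and its `id` column are assumptions for illustration; `id` records each row's original position.

```r
library(dplyr)

# Hypothetical 20-row data frame; `id` stores each row's original position.
df <- data.frame(id = 1:20)

# With 20 rows, n() is 20, so the range is (20 - 6):(20 - 2), i.e. 14:18.
bottom_slice <- df %>% slice((n() - 6):(n() - 2))

bottom_slice$id   # 14 15 16 17 18
```

Row 14 is the seventh row from the bottom of a 20-row data frame and row 18 is the third, so the slice holds five rows.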
Here is how this works:
We pass the data frame df to the function slice().
n() returns the number of rows in the data frame and, since data frame row numbers start at 1, n() is the position of the last row of the data frame.
We specify the start and end of the range relative to n(). In this example, n()-6 is the seventh row from the bottom and n()-2 is the third row from the bottom.
The rows at both start and end are included.

Often we are faced with scenarios where we need the data frame to be sorted in a certain way before we take a slice. In other words, we wish to sort the data frame by a particular column (or set of columns) and then take a slice.
df %>% arrange(col_1) %>% slice(5:8)
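As an illustration of sorting before slicing, the sketch below applies the same pipeline to a small made-up data frame. The values in `col_1` and the extra column `col_2` are assumptions chosen so the effect of the sort is visible.

```r
library(dplyr)

# Hypothetical unsorted data; `col_2` labels each row so we can track it.
df <- data.frame(
  col_1 = c(50, 20, 90, 10, 70, 30, 80, 40, 60, 100),
  col_2 = letters[1:10]
)

# Sort ascending by col_1, then take positions 5 through 8 of the sorted result.
sorted_slice <- df %>% arrange(col_1) %>% slice(5:8)

sorted_slice$col_1   # 50 60 70 80
sorted_slice$col_2   # "a" "i" "e" "g"
```

The positions passed to `slice()` refer to the sorted data frame, which is why the rows come back in a different order than they appear in `df`.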
Here is how this works:
The arrange() function sorts a data frame in ascending order of the column(s) passed to it. In this example, arrange() sorts the data frame df in ascending order of the values of the column col_1. For more details see Sorting.
We pass the output of arrange(), which is a sorted data frame, to the function slice() to get a range of rows (as described above).

We wish to get a range of rows from a data frame but return only a particular set of columns.
df %>% select(col_1, col_3) %>% slice(5:8)
Here is how this works:
We use select() to specify the names of the columns of the data frame df that we wish to include in the output. In this example, the column names are col_1 and col_3. For detailed coverage, see Selecting by Name.
We pass the output of select() to slice(start:end) to get a range of rows.
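The combination of column selection and row slicing can be sketched on a made-up data frame as follows. The three-column `df` and its values are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical data frame with three columns and ten rows.
df <- data.frame(
  col_1 = 1:10,
  col_2 = letters[1:10],
  col_3 = (1:10) * 1.5
)

# Keep only col_1 and col_3, then take rows 5 through 8.
result <- df %>% select(col_1, col_3) %>% slice(5:8)

names(result)   # "col_1" "col_3"
nrow(result)    # 4
```

Since `select()` does not add, drop, or reorder rows, calling `slice()` first and `select()` second would be expected to return the same rows and columns here.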