We wish to get rows of a data frame where a particular column takes its largest or smallest values.
Note that the n largest or smallest values do not necessarily correspond to n rows. If several rows share the same value, n values can correspond to more than n rows.
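To make the point about ties concrete, here is a minimal sketch using the Pandas nlargest() method covered below. The data frame and the column col_2 are made up for illustration. It suggests that, by default, Pandas returns exactly n rows, that keep='all' additionally returns rows tied with the last selected value, and that selecting every row holding one of the n largest distinct values needs a separate filter.

```python
import pandas as pd

# Hypothetical data: the value 9 appears twice and the value 7 three times.
df = pd.DataFrame({
    'col_1': [9, 7, 9, 7, 5, 7, 3],
    'col_2': list('abcdefg'),
})

# Default (keep='first'): exactly 3 rows; ties are broken by order of appearance.
top_first = df.nlargest(3, 'col_1')

# keep='all': rows tied with the last selected value are returned as well.
top_all = df.nlargest(3, 'col_1', keep='all')

# One possible way to get every row holding one of the 2 largest distinct values.
top_values = df['col_1'].drop_duplicates().nlargest(2)
rows_for_top_values = df[df['col_1'].isin(top_values)]

print(len(top_first), len(top_all), len(rows_for_top_values))
```

Which of the three behaviours is appropriate depends on whether we care about a fixed number of rows or about all rows that share the top values.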
We wish to get the rows with the largest values for a particular column. In this example we wish to get the rows where col_1 has its 5 highest values.
df.nlargest(5, 'col_1')
Here is how this works:
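We use the nlargest() method of Pandas data frames, which takes the number of rows to return as its first argument (here 5) and the numerical column whose largest values we are interested in as the second argument (here 'col_1').
By default (keep='first'), nlargest() returns exactly 5 rows even if some rows are tied; passing keep='all' also returns every row tied with the last value, which can yield more than 5 rows.
nlargest() in Pandas only works for numerical columns; for example, calling it on a column of strings raises a TypeError.
We wish to get the rows with the smallest values for a particular column. In this example we wish to get the rows where col_1 has its 5 smallest values.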
df.nsmallest(5, 'col_1')
Here is how this works:
This code works similarly to the code above except that we use nsmallest() instead of nlargest().
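As a rough end-to-end illustration, the sketch below runs both calls on a small made-up data frame. Only the column name col_1 comes from the text above; the values and the label column are assumptions. It also shows that, when there are no ties, sorting with sort_values() and taking the first rows selects the same rows.

```python
import pandas as pd

# Hypothetical data frame with no tied values in col_1.
df = pd.DataFrame({
    'col_1': [12, 3, 45, 7, 28, 19, 1, 33],
    'label': list('abcdefgh'),
})

# Rows with the 5 largest values, ordered from largest to smallest.
largest = df.nlargest(5, 'col_1')

# Rows with the 5 smallest values, ordered from smallest to largest.
smallest = df.nsmallest(5, 'col_1')

# Without ties, sorting and taking the head selects the same rows.
largest_sorted = df.sort_values('col_1', ascending=False).head(5)
smallest_sorted = df.sort_values('col_1').head(5)

print(largest)
print(smallest)
```

Both methods keep the original row index, so the selected rows can still be traced back to their positions in df.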