Extreme Values

We wish to get rows of a data frame where a particular column takes its largest or smallest values.

Note that the n largest or smallest values do not necessarily correspond to n rows. If several rows share the same value, the rows holding the n largest (or smallest) distinct values can number more than n.

Largest Values

We wish to get the rows with the largest values for a particular column. In this example we wish to get the rows where col_1 has its 5 highest values.

df.nlargest(5, 'col_1')

Here is how this works:

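  • We use the nlargest() method of Pandas data frames, which takes the number of rows to return as its first argument (here 5) and the column whose largest values we are interested in as the second argument (here 'col_1'). By default it returns exactly that many rows, breaking ties at the cut-off by order of appearance; passing keep='all' also keeps every row tied with the last value retained, so the result can then have more than 5 rows.
  • Note that nlargest() in Pandas works on numerical columns; calling it on a column of object dtype (such as strings) raises a TypeError.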
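To make the difference between rows and values concrete, here is a minimal sketch on a small, made-up data frame (the data and the variable names top_rows, top_with_ties and top_by_value are assumptions for illustration only). It compares the default behaviour, keep='all', and one possible way to select every row that holds one of the n largest distinct values.

```python
import pandas as pd

# Hypothetical sample data with repeated values in col_1.
df = pd.DataFrame({
    'col_0': list('abcdefg'),
    'col_1': [10, 10, 9, 9, 8, 8, 3],
})

# Default (keep='first'): exactly 3 rows are returned.
top_rows = df.nlargest(3, 'col_1')

# keep='all': rows tied with the last retained value (9) are kept too.
top_with_ties = df.nlargest(3, 'col_1', keep='all')

# One way to get all rows holding the 3 largest distinct values:
# find those values first, then filter the data frame on them.
top_values = df['col_1'].drop_duplicates().nlargest(3)
top_by_value = df[df['col_1'].isin(top_values)]

print(len(top_rows), len(top_with_ties), len(top_by_value))
```

The third approach is just one option; it trades the convenience of a single nlargest() call for a result defined in terms of values rather than rows.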

Smallest Values

We wish to get the rows with the smallest values for a particular column. In this example we wish to get the rows where col_1 has its 5 smallest values.

df.nsmallest(5, 'col_1')

Here is how this works:

This code works similarly to the code above except that we use nsmallest() instead of nlargest().
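As a quick check of the above, the following sketch runs nsmallest() on a small, made-up data frame (the data and variable names are assumptions for illustration) and compares the selected values with those obtained by sorting the column in ascending order and taking the first rows.

```python
import pandas as pd

# Hypothetical sample data; the value 2 appears twice.
df = pd.DataFrame({
    'col_0': list('abcdef'),
    'col_1': [7, 2, 5, 2, 9, 1],
})

# The 3 rows with the smallest values of col_1.
bottom_rows = df.nsmallest(3, 'col_1')

# Sorting and taking the head selects the same values
# (the order of tied rows may differ between the two approaches).
bottom_sorted = df.sort_values('col_1').head(3)

print(bottom_rows)
```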
