Uniqueness

This section covers two related but subtly different topics: Unique Values and Duplication in data.

In particular, this section is organized as follows:

  • Unique Values: We cover how to identify unique values, count them, and compute the frequency and proportion of occurrence of each one. We look at scenarios involving a single column (i.e., a vector or a list) and scenarios involving the unique combinations of values across a set of columns of a data frame.
  • Duplicates: We cover how to identify, count, return, and drop duplicate values. The focus is on duplication among rows in a data frame, where rows count as duplicates of one another when they have equal values in all columns or in a chosen subset of columns.
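
As a rough preview of the two topics, the sketch below assumes the pandas library and a small made-up data frame; the column names `city` and `year` and all values are hypothetical and chosen only for illustration. It is one possible way to express these operations, not necessarily the approach taken in the rest of this section.

```python
import pandas as pd

# Hypothetical toy data, for illustration only.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome", "Rome"],
    "year": [2020, 2020, 2020, 2021, 2021],
})

# --- Unique values in a single column ---
unique_cities = df["city"].unique().tolist()          # distinct values
n_unique_cities = df["city"].nunique()                # how many distinct values
city_counts = df["city"].value_counts()               # frequency of each value
city_props = df["city"].value_counts(normalize=True)  # proportion of each value

# --- Unique combinations of several columns ---
combo_counts = df.groupby(["city", "year"]).size()    # frequency of each combination

# --- Duplicate rows ---
is_dup = df.duplicated()                  # True for repeats of an earlier row
n_dups = int(is_dup.sum())                # number of repeated rows
dup_rows = df[df.duplicated(keep=False)]  # every row that has at least one twin
deduped = df.drop_duplicates()            # keep the first occurrence of each row
deduped_by_city = df.drop_duplicates(subset=["city"])  # compare on "city" only
```

Note the two readings of "duplicate" here: `duplicated()` with its default `keep="first"` flags only the repeats, while `keep=False` flags every member of a duplicated group.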