We wish to generate the common summary statistics for all columns in a data frame, such as quantiles for numeric columns and unique value counts for non-numeric columns. While we can compute each of those statistics for each column individually, it is more efficient during data inspection to use a function that, given a data frame, computes the common statistics appropriate for each column’s data type.
df.describe()
Here is how this works:
We use the data frame’s describe() method, which returns summary statistics for the data in the data frame: count, mean, standard deviation, minimum, quartiles, and maximum for numerical columns, and count, number of unique values, most frequent value, and its frequency for non-numerical columns.
By default, describe() restricts the summary to numerical columns or, if there are no numerical columns, to non-numerical columns.
To force describe() to return a summary of non-numerical columns when numerical columns exist, we can use df.describe(include=["object"]).
describe() doesn’t return stats on missing values. We cover that in Missing Values.
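The behavior described above can be tried on a small data frame. The following is a minimal sketch, not part of the original recipe: the data frame and its column names (height, age, city) are made up for illustration, and the last call uses include="all", an option of describe() that summarizes every column in one table.

```python
import pandas as pd

# Hypothetical example data: two numerical columns and one non-numerical column.
# dtype=object is set explicitly so that "city" is an object column
# regardless of the pandas version in use.
df = pd.DataFrame({
    "height": [1.62, 1.75, 1.80, 1.68],
    "age": [23, 35, 41, 29],
    "city": pd.Series(["Cairo", "Lima", "Cairo", "Oslo"], dtype=object),
})

# Default: only the numerical columns are summarized.
numeric_summary = df.describe()

# Force a summary of the non-numerical (object) columns.
object_summary = df.describe(include=["object"])

# Summarize every column at once; statistics that do not apply
# to a column's data type appear as NaN.
full_summary = df.describe(include="all")

print(numeric_summary)
print(object_summary)
print(full_summary)
```

In the combined table, rows such as unique, top, and freq are filled only for the non-numerical column, while mean, std, and the quartiles are filled only for the numerical columns.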