Introduction to Understanding Your Data

Understanding your data is like plotting key positions on a map for a journey. How do we do it? Through statistical quantities like mean, median, and mode. These tell us more about our data. Today, we will learn about these quantities and how to calculate them in pandas.

Basic Statistical Quantities

In data analysis, mean, median, and mode describe the central tendency of our data. The mean is the average, the median is the middle value when the data is sorted, and the mode is the most frequent value. Variance and standard deviation measure how far values spread around the mean: variance is the average squared deviation from the mean, and standard deviation is its square root. Additionally, we use minimum (min), maximum (max), and quantiles to understand the range and spread of the data.

As a reminder, quantiles are cut points that divide sorted data into groups holding equal shares of the values, which helps us understand its distribution. Quartiles are one example: they divide the data into 4 equally sized groups. For instance, the first quartile is the value below which about 25% of the data falls.
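To make quartiles concrete, here is a minimal sketch that computes them for a small Series. The values are made up for illustration, and the results rely on pandas' default linear interpolation between data points.

```python
import pandas as pd

# Hypothetical sample values, chosen only for illustration
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

q1 = values.quantile(0.25)  # first quartile: about 25% of values fall below it
q2 = values.quantile(0.50)  # second quartile, which is also the median
q3 = values.quantile(0.75)  # third quartile: about 75% of values fall below it

print(q1, q2, q3)  # 3.0 5.0 7.0
```

Note that the second quartile matches values.median(), since both describe the middle of the sorted data.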

Calculation of Statistical Quantities in Pandas

Knowing our destinations, let's see how to reach them using pandas! For a DataFrame or Series object named data, we compute these quantities using:

  • mean = data.mean(),
  • median = data.median(),
  • mode = data.mode(),
  • standard deviation = data.std(),
  • variance = data.var(),
  • min = data.min(),
  • max = data.max(),
  • quantile = data.quantile(q), where q is a fraction between 0 and 1, such as 0.25 or 0.5.

Let's calculate these for a sample DataFrame:
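The sketch below is one way to do that. The DataFrame, its column names (age, score), and its values are hypothetical and serve only as an illustration. Each method is applied column by column and returns one result per column.

```python
import pandas as pd

# Hypothetical data, made up for illustration
df = pd.DataFrame({
    "age": [23, 25, 31, 25, 40],
    "score": [88.0, 92.0, 79.0, 92.0, 65.0],
})

mean = df.mean()        # average of each column
median = df.median()    # middle value of each sorted column
mode = df.mode()        # most frequent value(s); a DataFrame, since ties are possible
std = df.std()          # sample standard deviation (ddof=1 by default)
var = df.var()          # sample variance (ddof=1 by default)
minimum = df.min()      # smallest value in each column
maximum = df.max()      # largest value in each column
q1 = df.quantile(0.25)  # first quartile of each column

print("Mean:\n", mean)
print("Median:\n", median)
print("Mode:\n", mode)
print("Standard deviation:\n", std)
print("Variance:\n", var)
print("Min:\n", minimum)
print("Max:\n", maximum)
print("First quartile:\n", q1)
```

Two details are worth keeping in mind. mode() returns a DataFrame rather than a single value per column, because a column can have several equally frequent values. Also, std() and var() divide by n - 1 by default, which gives the sample rather than the population estimate.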

DataFrame Describe Function

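pandas provides describe(), which summarizes each numeric DataFrame column in one call. By default it reports the count, mean, standard deviation, min, the 25%, 50%, and 75% quantiles, and max. It does not report the mode or the variance, so compute those separately when needed.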
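As a minimal sketch, here is describe() applied to a small hypothetical DataFrame. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data, made up for illustration
df = pd.DataFrame({
    "age": [23, 25, 31, 25, 40],
    "score": [88.0, 92.0, 79.0, 92.0, 65.0],
})

# One row per statistic, one column per numeric column of df
summary = df.describe()
print(summary)

# Individual statistics can be read from the result by label
median_score = summary.loc["50%", "score"]
print(median_score)  # 88.0
```

The result is itself a DataFrame, so any statistic can be picked out with .loc, as the last lines show.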

Lesson Summary and Practice

Today, you learned about mean, median, mode, standard deviation, variance, min, max, and quantiles, and how these are calculated using pandas. You saw describe() being used on real-world data.

However, learning doesn't stop at understanding but is solidified in practice. So, get set for some hands-on reinforcement through exercises. Happy exploring!
