Overview and Introduction

Welcome to a new lesson! Today, we'll learn about basic statistical operations using Python's NumPy library. These operations, including mean, median, mode, variance, and standard deviation, are vital tools for understanding and interpreting data. After learning each operation, we'll apply our understanding to a real-world dataset.

Mean, Median, Mode in NumPy

The mean, or average, is the sum of all values divided by the number of values. In NumPy, we use np.mean(array) to calculate the mean. The median is the middle number in a sorted list (or the average of the two middle numbers when the list has an even length), which can be calculated using np.median(array). The mode is the most frequent value in your data set. NumPy has no dedicated mode function, so we calculate it using the mode() function from SciPy's stats module.
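As a minimal sketch of the mean and median calls described above, the snippet below uses a small made-up array (the values and the variable names are assumptions chosen for illustration, not data from the lesson):

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
values = np.array([1, 2, 2, 3, 4, 6])

# Mean: sum of all values (18) divided by the number of values (6)
mean_value = np.mean(values)

# Median: the array has an even length, so NumPy averages
# the two middle values of the sorted data (2 and 3)
median_value = np.median(values)

print("Mean:", mean_value)      # 3.0
print("Median:", median_value)  # 2.5
```

Both functions accept plain Python lists as well as NumPy arrays.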

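Note that stats.mode returns a result object rather than a bare number. The object exposes the mode in its mode attribute and the number of times it occurs in its count attribute. In case of a tie, only one of the tied values is returned (in practice, the smallest). In older SciPy releases the mode attribute is a one-element array, so the actual mode value is obtained by selecting its first item.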
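One way to extract the mode value is sketched here. The data is hypothetical, and wrapping the result in np.atleast_1d is an assumption made for this example so that the same line works whether the installed SciPy version returns a scalar or a one-element array:

```python
import numpy as np
from scipy import stats

# Hypothetical sample data: 2 and 3 both appear twice (a tie)
values = np.array([1, 2, 2, 3, 3, 4])

# stats.mode returns a result object with `mode` and `count` fields
result = stats.mode(values)

# np.atleast_1d turns a scalar into a one-element array and leaves
# an existing array unchanged, so taking index 0 is safe either way
mode_value = np.atleast_1d(result.mode)[0]
mode_count = np.atleast_1d(result.count)[0]

print("Mode:", mode_value)   # 2 (the smaller of the tied values)
print("Count:", mode_count)  # 2
```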

Variance and Standard Deviation in NumPy

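Variance measures the spread of data as the average squared deviation from the mean, and the standard deviation is the square root of the variance. Use np.var(array) and np.std(array) to calculate them. By default, both compute the population version (dividing by n). Pass ddof=1 to get the sample version (dividing by n - 1).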
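A short sketch of both calls, using a made-up array whose mean is exactly 5 so the arithmetic is easy to follow by hand (the data and variable names are illustrative assumptions):

```python
import numpy as np

# Hypothetical sample data with mean 5
data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

# Population variance: squared deviations sum to 32, divided by n = 8
variance = np.var(data)

# Standard deviation: square root of the variance
std_dev = np.std(data)

# Sample variance: same sum of squares, divided by n - 1 = 7
sample_variance = np.var(data, ddof=1)

print("Variance:", variance)                # 4.0
print("Standard deviation:", std_dev)       # 2.0
print("Sample variance:", sample_variance)
```

The standard deviation is often easier to interpret than the variance because it is expressed in the same units as the data.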

Summary

Congrats! You've learned basic statistical operations using NumPy and applied them to a real-world dataset. In this lesson, we introduced the mean, median, mode, variance, and standard deviation and calculated them using NumPy functions, with SciPy's stats.mode covering the mode. Up next are some exercises to apply these techniques. Let's get practicing!
