Welcome to the next leg of our journey with the Titanic Survival Data - wielding the power of Descriptive Statistics with NumPy and Pandas! In this lesson, we will cover how to use both libraries to perform descriptive statistical analysis on our dataset. By the end of this lesson, you will be able to calculate measures of central tendency such as the mean, median, and mode, and to interpret measures of variability, quartiles, and percentiles.
Why should we care about learning descriptive statistics? Simply put, descriptive statistics provide compact, informative summaries of our data, allowing us to understand its nature and distribution before embarking on any form of machine learning or prediction. Armed with this understanding, we are better equipped to carry out accurate analyses and draw meaningful insights from our data. Ready to investigate the Titanic dataset more thoroughly? Then, let's dive in!
Descriptive statistics are appropriately named, as they provide insights into the main features of our data. Let's start with the Titanic dataset and calculate some basic statistics for the age of passengers: the mean, median, and mode.
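Here is a minimal sketch of what such a calculation might look like. The DataFrame name titanic_df and the handful of ages in it are assumptions made for illustration, not the real data; in practice the full Titanic dataset would be loaded first (for example with seaborn's load_dataset("titanic"), which needs network access).

```python
import numpy as np
import pandas as pd

# Stand-in data: a small, made-up sample of passenger ages (NOT the real dataset).
# In the lesson, titanic_df would hold the full Titanic data instead.
titanic_df = pd.DataFrame(
    {"age": [22.0, 38.0, 26.0, 35.0, 35.0, np.nan, 54.0, 2.0]}
)

# pandas skips missing values (NaN) by default in mean() and median()
mean_age = titanic_df["age"].mean()
median_age = titanic_df["age"].median()

# mode() returns a Series because several values can tie; take the first one
mode_age = titanic_df["age"].mode()[0]

# NumPy equivalents: the NaN-aware variants are needed to ignore missing ages
ages = titanic_df["age"].to_numpy()
np_mean_age = np.nanmean(ages)
np_median_age = np.nanmedian(ages)

print(f"Mean age: {mean_age:.2f}")
print(f"Median age: {median_age}")
print(f"Mode age: {mode_age}")
print(f"NumPy mean age: {np_mean_age:.2f}")
print(f"NumPy median age: {np_median_age}")
```

Note that plain np.mean and np.median would return NaN here because one age is missing, which is why the sketch uses np.nanmean and np.nanmedian.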
The code calculates and displays the mean (average), median (middle value), and mode (most frequently occurring value) of the age column. These are measures of central tendency: each one summarizes the typical age in a single number, giving us a first picture of the age distribution of passengers aboard the Titanic.
