In our last unit, we used .isnull() to create a map of True and False values. This showed us where data was missing. But for large datasets, a giant map isn't practical. We need a summary.
Engagement Message
Why is a quick count often more useful than a huge map of missing values?
To get a count, we simply need to add up all the True values in each column of our boolean mask. Remember, True marks a missing value, so counting the Trues gives us the total number of missing entries per column.
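To make the idea concrete, here is a minimal sketch of counting by hand. The DataFrame and its column names (name, age, city) are made up for illustration and are not from the lesson's dataset. It walks through each column of the mask and adds one for every True it finds.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; both None and np.nan count as missing.
df = pd.DataFrame({
    "name": ["Ana", "Ben", None, "Dee"],
    "age": [31, np.nan, np.nan, 45],
    "city": ["Lima", "Oslo", "Rome", "Kyiv"],
})

# The boolean mask from the last unit: True wherever a value is missing.
mask = df.isnull()

# Count the True values in each column the long way.
manual_counts = {}
for column in mask.columns:
    count = 0
    for is_missing in mask[column]:
        if is_missing:
            count += 1
    manual_counts[column] = count

print(manual_counts)  # {'name': 1, 'age': 2, 'city': 0}
```

This works, but the nested loop is wordy and slow on large datasets, which is why a shortcut is worth having.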
Engagement Message
What advantage does counting missing values give us over just seeing where they are?
Here's a fantastic trick in Pandas: when you use the .sum() method on boolean values, it treats True as 1 and False as 0. So, summing a column of Trues and Falses is a super-fast way to count the Trues.
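The trick can be sketched in a few lines. The sample DataFrame below is hypothetical (the column names name, age, and city are assumptions for illustration). The first part sums a small boolean Series, and the second chains .isnull() and .sum() to get one missing-value count per column.

```python
import numpy as np
import pandas as pd

# Summing booleans: each True adds 1, each False adds 0.
flags = pd.Series([True, False, True, True])
print(flags.sum())  # 3

# Hypothetical sample data; both None and np.nan count as missing.
df = pd.DataFrame({
    "name": ["Ana", "Ben", None, "Dee"],
    "age": [31, np.nan, np.nan, 45],
    "city": ["Lima", "Oslo", "Rome", "Kyiv"],
})

# On a DataFrame, .sum() adds up each column separately by default,
# so the result is a Series with one count per column.
missing_counts = df.isnull().sum()
print(missing_counts)
```

In this sample, the result reports 1 missing value for name, 2 for age, and 0 for city, no matter how many rows the DataFrame has.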
