Section 1 - Instruction

In the last unit, we learned to drop missing data. But what if a row with a missing value still contains lots of useful information? Deleting it might mean losing valuable insights. The alternative is to fill in the gaps instead.

Engagement Message

When might filling a blank be better than throwing away the whole row?

Section 2 - Instruction

Pandas provides the .fillna() method for this exact purpose. It works like a "find and replace" for missing data: it finds NaN values and replaces them with a value that you provide, so you can keep the row. By default it returns a new object with the gaps filled and leaves the original unchanged.
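As a minimal sketch of that behavior, the snippet below fills a single gap in a small Series. The values and the variable names prices and filled are made up for illustration and do not come from the lesson's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; np.nan marks the missing entry.
prices = pd.Series([1.5, np.nan, 3.0])

# fillna returns a new Series; `prices` itself still contains the NaN.
filled = prices.fillna(0.0)

print(prices.isna().sum())   # count of missing values before filling
print(filled.isna().sum())   # count of missing values after filling
```

Keeping the result in a separate variable makes it easy to compare the data before and after filling.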

Engagement Message

What would make a replacement value 'safe' versus potentially misleading?

Section 3 - Instruction

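A simple and common strategy is filling with a constant value. For a numerical column like items_sold, you could replace NaNs with 0. Because .fillna() returns a new Series rather than changing the original, assign the result back to the column: df['items_sold'] = df['items_sold'].fillna(0). This is a safe bet only when zero is a logical default, that is, when a missing value plausibly means "none".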
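Here is a small sketch of the constant-fill approach on an items_sold column. The DataFrame contents and the store column are hypothetical; only the items_sold column name comes from the text above.

```python
import numpy as np
import pandas as pd

# Hypothetical sales data with one missing count.
df = pd.DataFrame({
    "store": ["A", "B", "C"],
    "items_sold": [12, np.nan, 7],
})

# Assign the result back so the DataFrame keeps the filled column.
df["items_sold"] = df["items_sold"].fillna(0)

print(df)
```

All three rows survive, which is the point of filling instead of dropping. Note that the column stays a float column, because it held NaN when it was created.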

Engagement Message

For a 'discount_applied' column, why might filling with 0 be a good choice?

Section 4 - Instruction

This method works well for text columns, too. If a city column has missing values, you could fill them with a placeholder string like 'Unknown': df['city'] = df['city'].fillna('Unknown'). This keeps your data tidy and ensures every record has a value, while the placeholder still makes it clear which cities were never recorded.
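The same pattern can be sketched for a text column. The customer names and cities below are invented for illustration; only the city column name and the 'Unknown' placeholder come from the text above.

```python
import pandas as pd

# Hypothetical customer data; None marks the missing city.
df = pd.DataFrame({
    "customer": ["Ana", "Ben", "Cy"],
    "city": ["Lima", None, "Oslo"],
})

# Replace the missing city with a placeholder string and keep the row.
df["city"] = df["city"].fillna("Unknown")

print(df)
```

A placeholder such as 'Unknown' is easy to filter on later, for example with df[df["city"] == "Unknown"], if those rows need a second look.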
