Block 25: Handling Missing Data
Detect NaN values and choose between dropping and filling them.
Concepts
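- isna() (isnull() is an alias) and its inverse notna()
- dropna(): axis (0 drops rows, 1 drops columns) and thresh (minimum number of non-missing values needed to keep a row or column)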
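- fillna() with a value; ffill() and bfill() for forward and backward fill (fillna(method='ffill') and fillna(method='bfill') are deprecated in recent pandas versions)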
- Percentage of missing values per column, e.g. df.isna().mean() * 100
Code Examples
See exercise below.
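As a minimal sketch of the concepts listed above, the snippet below runs each call on a small DataFrame. The column names and values are hypothetical sample data, not part of the course material, and the snippet assumes a reasonably recent pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with deliberate gaps (names and values are illustrative).
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Rome", "Kyiv"],
    "temp": [21.0, np.nan, 23.5, np.nan],
    "humidity": [40.0, np.nan, np.nan, 45.0],
})

# Detection: isna() returns a boolean mask; notna() is its inverse.
missing_counts = df.isna().sum()

# Percentage missing per column: the mean of a boolean mask is the fraction of True.
missing_pct = df.isna().mean() * 100

# dropna(): axis=0 (default) drops rows, axis=1 drops columns.
rows_complete = df.dropna()          # keeps only rows with no missing values
cols_complete = df.dropna(axis=1)    # keeps only columns with no missing values
rows_thresh = df.dropna(thresh=2)    # keeps rows with at least 2 non-missing values

# Filling: work on the numeric columns only, since a mean is undefined for strings.
numeric = df[["temp", "humidity"]]
filled_zero = numeric.fillna(0)               # constant fill
filled_mean = numeric.fillna(numeric.mean())  # per-column mean fill

# Forward and backward fill propagate the nearest valid value along the rows.
forward = numeric.ffill()
backward = numeric.bfill()  # a trailing NaN stays NaN: nothing follows it

print(missing_pct)
```

Note that bfill() leaves the last temp value missing because there is no later value to copy from; ffill() has the same limitation for leading gaps.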
Exercise
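Create a DataFrame with deliberate NaNs. Compare the results of dropna(), fillna(0), and filling each numeric column with its own mean, e.g. fillna(df.mean(numeric_only=True)). Then write a function missing_report(df) that prints the percentage of missing values per column.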
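One possible shape for missing_report(df) is sketched below as a starting point. The output format and the choice to return the percentages as well as print them are assumptions; the exercise only asks for printing. The example DataFrame is hypothetical.

```python
import numpy as np
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.Series:
    """Print the percentage of missing values per column and return it.

    Returning the Series is an assumption added for convenience;
    the exercise itself only requires the printout.
    """
    # Mean of the boolean mask gives the fraction missing; scale to percent.
    pct = df.isna().mean() * 100
    for column, value in pct.items():
        print(f"{column}: {value:.1f}% missing")
    return pct


# Hypothetical example data.
example = pd.DataFrame({"a": [1.0, np.nan], "b": [1.0, 2.0]})
report = missing_report(example)
```

Try extending it, for instance by sorting columns from most to least missing or by also reporting the raw counts.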
Homework
When would you prefer filling with the column mean over dropna(), and when the reverse? Give a real scenario for each.