Block 50: Mini-Project: Data Cleaning Notebook
Build a professional data cleaning notebook that takes a dataset from its raw state to a clean, documented result.
Concepts
- End-to-end cleaning workflow
- Markdown documentation within the notebook
- Outputting a clean CSV and a cleaning report
Code Examples
See exercise below.
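As a starting point, here is a minimal sketch of the end-to-end workflow listed under Concepts. It assumes pandas, which this block does not name explicitly. The inlined sample data, the column names, the helper names `clean` and `save_outputs`, and the file name cleaning_log.md are all hypothetical, chosen only for illustration. Treat it as one possible shape for the notebook, not a reference solution.

```python
import io
from pathlib import Path

import pandas as pd

# Hypothetical messy data, inlined so the sketch runs without an external file.
RAW_CSV = """name,city,age,signup_date
Alice,new york,34,2023-01-15
bob,New York,,2023-02-20
ALICE,New York,34,2023-01-15
Carol,BOSTON,forty,2023-03-05
dave,boston,29,not a date
"""


def clean(raw: pd.DataFrame) -> tuple[pd.DataFrame, list[str]]:
    """Return a cleaned copy of `raw` plus one log entry per cleaning step."""
    df = raw.copy()
    log = []

    # Step 1: normalize casing first, so rows that differ only by case
    # become detectable as duplicates later.
    for col in ["name", "city"]:
        df[col] = df[col].str.strip().str.title()
    log.append("Normalized name and city to title case.")

    # Step 2: fix types. Values that cannot be parsed become NaN / NaT.
    df["age"] = pd.to_numeric(df["age"], errors="coerce")
    df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")
    log.append("Converted age to numeric and signup_date to datetime.")

    # Step 3: drop exact duplicate rows.
    rows_before = len(df)
    df = df.drop_duplicates().reset_index(drop=True)
    log.append(f"Removed {rows_before - len(df)} duplicate row(s).")

    # Step 4: handle missing values. Filling with the median is one choice
    # among several; the log should say which choice was made.
    n_missing = int(df["age"].isna().sum())
    df["age"] = df["age"].fillna(df["age"].median())
    log.append(f"Filled {n_missing} missing age value(s) with the median.")

    return df, log


def save_outputs(df: pd.DataFrame, log: list[str], out_dir) -> None:
    """Write cleaned_data.csv and a numbered markdown cleaning log."""
    out = Path(out_dir)
    df.to_csv(out / "cleaned_data.csv", index=False)
    lines = ["# Cleaning log", ""]
    lines += [f"{i}. {entry}" for i, entry in enumerate(log, start=1)]
    (out / "cleaning_log.md").write_text("\n".join(lines) + "\n", encoding="utf-8")


# Read every column as text so that type fixes are explicit cleaning steps.
raw = pd.read_csv(io.StringIO(RAW_CSV), dtype=str)
cleaned, log = clean(raw)
```

In the last cell of the notebook, calling `save_outputs(cleaned, log, ".")` would write both files to the working directory. The order of the steps matters: deduplicating before normalizing the casing would miss the two "Alice" rows, because they differ only in capitalization.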
Exercise
Take a messy dataset (missing values, wrong types, duplicate rows, inconsistent casing). Clean it step by step with explanations. Produce cleaned_data.csv and a markdown 'cleaning log' that describes each step. Bonus: visualize before/after statistics for 2 columns.
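For the bonus, one way to get before/after statistics is to compute the same small summary on the raw and the cleaned frame and put the results side by side. The sketch below assumes pandas. The helper name `before_after`, the toy data, and the choice of statistics (row count, missing count, unique count) are illustrative assumptions, not requirements of the exercise.

```python
import pandas as pd


def before_after(raw: pd.DataFrame, cleaned: pd.DataFrame, columns) -> pd.DataFrame:
    """Return one row per (column, stage) with simple quality statistics."""
    rows = []
    for col in columns:
        for stage, frame in (("before", raw), ("after", cleaned)):
            rows.append(
                {
                    "column": col,
                    "stage": stage,
                    "rows": len(frame),
                    "missing": int(frame[col].isna().sum()),
                    "unique": int(frame[col].nunique()),  # NaN is not counted
                }
            )
    return pd.DataFrame(rows)


# Hypothetical toy data standing in for your own raw dataset.
raw = pd.DataFrame(
    {
        "city": ["new york", "New York", "BOSTON", None],
        "age": [34, None, 41, 29],
    }
)

# A deliberately small cleaning pass, just to have something to compare.
cleaned = raw.dropna(subset=["city"]).copy()
cleaned["city"] = cleaned["city"].str.title()
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())

summary = before_after(raw, cleaned, ["city", "age"])
print(summary)
```

If matplotlib is installed, a bar chart of one statistic could be drawn with `summary.pivot(index="column", columns="stage", values="missing").plot(kind="bar")`. A histogram of a numeric column before and after cleaning is another reasonable choice.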
Homework
Reflection: Write your 'data cleaning checklist' — the 8 steps you'll follow every time.