Block 142: Capstone: Data Acquisition
Collect and store the raw data for your project.
Concepts
- Fetching data from an API or loading it from a file
- Saving raw data as a baseline (never overwrite raw)
- Validating data quality immediately after loading
- Documenting data source and access date
Code Examples
See exercise below.
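As a starting point, the "never overwrite raw" and "document source and access date" concepts can be sketched with the standard library alone. This is a minimal sketch, assuming the raw data is available as bytes; the function name `save_raw`, the `.meta.json` sidecar file, and its field names are illustrative choices, not something the course prescribes.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def save_raw(content: bytes, filename: str, source: str, raw_dir: str = "data/raw") -> Path:
    """Write raw bytes exactly once and record where and when they came from.

    All names here (save_raw, the .meta.json sidecar) are hypothetical.
    """
    target_dir = Path(raw_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename

    # Mode "xb" raises FileExistsError if the file already exists,
    # so an earlier raw snapshot cannot be overwritten by accident.
    with open(target, "xb") as fh:
        fh.write(content)

    # Sidecar metadata: source, retrieval time (UTC), size, and checksum.
    meta = {
        "source": source,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "bytes": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
    }
    meta_path = target.with_name(target.name + ".meta.json")
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return target
```

For a local file, `content` could come from `Path(some_path).read_bytes()`; for an API, from the body of the HTTP response. Including the retrieval date in `filename` is one way to keep several snapshots side by side.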
Exercise
Implement data acquisition for your project. Save raw data to 'data/raw/'. Print: shape, dtypes, head(3), missing value counts. Write a data_description.md noting source, date retrieved, rows/columns, known limitations.
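One possible shape for the reporting half of this exercise, assuming the raw data loads into a pandas DataFrame. The helper names `quality_report` and `write_description` and the layout of the Markdown file are hypothetical; adapt them to your project.

```python
from datetime import date
from pathlib import Path

import pandas as pd


def quality_report(df: pd.DataFrame) -> dict:
    """Print the four checks from the exercise and return the summary values."""
    report = {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
    }
    print("shape:", report["shape"])
    print("dtypes:\n", df.dtypes, sep="")
    print("head(3):\n", df.head(3), sep="")
    print("missing values per column:\n", df.isna().sum(), sep="")
    return report


def write_description(path, source: str, retrieved: date,
                      df: pd.DataFrame, limitations: list) -> None:
    """Write a short data_description.md with the fields the exercise lists."""
    rows, cols = df.shape
    lines = [
        "# Data description",
        "",
        f"- Source: {source}",
        f"- Date retrieved: {retrieved.isoformat()}",
        f"- Rows: {rows}",
        f"- Columns: {cols}",
        "- Known limitations:",
    ]
    # One indented bullet per limitation supplied by the caller.
    lines += [f"  - {item}" for item in limitations]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

A typical call sequence would load the saved file with `pd.read_csv` (or the reader that matches your format), pass the DataFrame to `quality_report`, and then call `write_description` with your actual source, `date.today()`, and the limitations you have identified.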
Homework
What assumptions are you making about the data? List 5 and note which ones you'll need to verify.
Tuesday