Block 58: Saving & Processing Scraped Data
Clean and store scraped data for downstream analysis.
Concepts
- Stripping whitespace and special characters from scraped text
- Handling missing cells in scraped tables
- Converting text numbers to numeric types
- Saving cleaned data to CSV and archiving the raw HTML with a timestamp
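The first three concepts can be sketched with the standard library alone. This is a minimal sketch, not a definitive cleaner: the helper names clean_text and to_number and the MISSING_MARKERS set are hypothetical, and the number format is assumed to be US-style ("1,234.5" with an optional "$" or "%").

```python
import re
import unicodedata

# Assumption: placeholders a site might use for "no value"; adjust per source.
MISSING_MARKERS = {"", "-", "--", "n/a", "na", "nan", "null"}


def clean_text(raw):
    """Normalize Unicode, then collapse every run of whitespace to one space."""
    # NFKC folds compatibility characters, e.g. a non-breaking space
    # (U+00A0) becomes a regular space.
    text = unicodedata.normalize("NFKC", raw)
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def to_number(raw):
    """Return a float, or None when the cell is missing or not numeric."""
    text = clean_text(raw)
    if text.lower() in MISSING_MARKERS:
        return None
    # Assumption: commas are thousands separators; "$" and "%" are decoration.
    stripped = re.sub(r"[,$%\s]", "", text)
    try:
        return float(stripped)
    except ValueError:
        return None


print(clean_text("  Real\u00a0Madrid \n"))  # Real Madrid
print(to_number(" 1,234.50\u00a0"))         # 1234.5
print(to_number("N/A"))                     # None
```

Returning None for missing cells keeps them distinguishable from a real zero, so a later average can skip them instead of being dragged down.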
Code Examples
See exercise below.
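As a starting point for the saving step, here is one possible sketch that writes the raw HTML and the cleaned rows under a shared UTC timestamp, so each CSV can be traced back to the page it came from. The function name save_snapshot, the "data" directory, and the file naming pattern are assumptions made for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def save_snapshot(raw_html, rows, header, out_dir="data", name="table"):
    """Write raw HTML and cleaned rows side by side, sharing one timestamp."""
    # One timestamp for both files, e.g. 20240131T142500Z.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    html_path = out / f"{name}_{stamp}.html"
    csv_path = out / f"{name}_{stamp}.csv"

    # Keep the page exactly as fetched so cleaning can be redone later.
    html_path.write_text(raw_html, encoding="utf-8")

    # newline="" lets the csv module control line endings itself.
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)  # None values are written as empty fields

    return html_path, csv_path
```

A call such as save_snapshot(html, [["Lions", 24.0]], ["team", "points"]) would return the two paths it wrote. Keeping the untouched HTML means a bug in the cleaning code can be fixed without scraping the site again.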
Exercise
Scrape a simple sports or financial table. Clean it and compute one meaningful stat (e.g., average). Save both the raw HTML and the cleaned CSV with timestamps.
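One way to approach the parsing and statistic steps is sketched below. It uses a small made-up table embedded as a string instead of a live site, and the standard library's html.parser instead of a third-party scraper; the TableParser class and the sample values are illustrative assumptions, not real data.

```python
from html.parser import HTMLParser
from statistics import mean


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample standing in for a scraped page.
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>Points</th></tr>
  <tr><td> Lions </td><td>24</td></tr>
  <tr><td>Tigers</td><td>N/A</td></tr>
  <tr><td>Bears</td><td>30</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
parser.close()

header, *body = parser.rows

points = []
for team, raw in body:
    try:
        points.append(float(raw.replace(",", "")))
    except ValueError:
        pass  # missing or non-numeric cell: leave it out of the average

average = mean(points)
print(header)   # ['Team', 'Points']
print(average)  # 27.0
```

Skipping the "N/A" row is a design choice: treating it as zero would pull the average down to 18, which misrepresents the data.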
Homework
What encoding issues can arise in scraped text? How does .encode('utf-8', errors='ignore') help?
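A few experiments may help when exploring these questions. The sketch below assumes Python 3 and uses a hand-made string, not scraped data. It reproduces a typical decoding mix-up and shows what errors='ignore' does in each direction; note that it drops problem characters silently instead of repairing them.

```python
text = "caf\u00e9"  # the word "café"

# Mojibake: UTF-8 bytes read back with the wrong codec.
utf8_bytes = text.encode("utf-8")        # b'caf\xc3\xa9'
mojibake = utf8_bytes.decode("latin-1")  # 'cafÃ©'

# The reverse mistake: Latin-1 bytes are not valid UTF-8.
latin1_bytes = text.encode("latin-1")    # b'caf\xe9'
dropped = latin1_bytes.decode("utf-8", errors="ignore")    # 'caf'
replaced = latin1_bytes.decode("utf-8", errors="replace")  # 'caf\ufffd'

# Encoding to a narrower codec with errors="ignore" loses characters.
ascii_only = text.encode("ascii", errors="ignore")  # b'caf'

# UTF-8 can encode every valid character, so errors="ignore" only matters
# for invalid ones, such as a lone surrogate left by a bad decode.
broken = "caf\udce9"
safe = broken.encode("utf-8", errors="ignore")  # b'caf'
unchanged = text.encode("utf-8", errors="ignore") == utf8_bytes  # True

print(mojibake, dropped, ascii_only, safe, unchanged)
```

In every case where errors="ignore" changes the result, the "é" is lost. That avoids a crash, but decoding with the page's actual encoding is usually the better fix.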