Block 45: Regex for Data Cleaning
Use regular expressions to find and transform patterns in text.
Concepts
- re.search(), re.findall(), re.sub()
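- Basic patterns: character classes \d (digit), \w (word character), \s (whitespace); the wildcard . (any character except newline); quantifiers * (zero or more), + (one or more), ? (zero or one)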
- Groups with parentheses: ()
- Compiled patterns: re.compile()
Code Examples
The exercise below applies these functions to email extraction and phone-number normalization.
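As a minimal sketch of the functions and patterns listed under Concepts, the snippet below runs each one against a single sample string. The sample text and variable names are made up for illustration and are not part of the course material.

```python
import re

# Hypothetical sample input, chosen only to exercise each function.
text = "Order 66 shipped on 2024-03-15 to  Bob   Smith"

# re.search() returns a match object for the first match, or None.
match = re.search(r"\d+", text)
first_number = match.group() if match else None

# re.findall() returns every non-overlapping match as a list of strings.
all_numbers = re.findall(r"\d+", text)

# re.sub() replaces each match; here runs of whitespace become one space.
collapsed = re.sub(r"\s+", " ", text)

# Parentheses create groups that can be read back individually.
date_match = re.search(r"(\d+)-(\d+)-(\d+)", text)
year, month, day = date_match.groups()

# re.compile() builds a reusable pattern object with the same methods.
word_pattern = re.compile(r"\w+")
words = word_pattern.findall(text)

print(first_number)  # 66
print(all_numbers)   # ['66', '2024', '03', '15']
print(collapsed)     # Order 66 shipped on 2024-03-15 to Bob Smith
print(year, month, day)  # 2024 03 15
```

Compiling a pattern is mostly a readability choice when the same pattern is reused in several places; the module-level functions accept the same pattern strings directly.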
Exercise
Extract all email addresses from a text string. Then normalize phone numbers from messy input (mixed separators, spaces, parentheses) to the (XXX) XXX-XXXX format.
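One possible starting point for the exercise is sketched below. It rests on several assumptions that the exercise does not state: the email pattern is deliberately simplified and will not cover every valid address, phone numbers are assumed to be 10-digit US numbers with an optional leading country code 1, and the names EMAIL_PATTERN, extract_emails, and normalize_phone are hypothetical.

```python
import re

# Simplified email pattern (an assumption, not a full address validator):
# local part, "@", then one or more dot-separated domain labels.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_emails(text):
    """Return all substrings of text that look like email addresses."""
    return EMAIL_PATTERN.findall(text)


def normalize_phone(raw):
    """Return raw as (XXX) XXX-XXXX, or None if it lacks 10 usable digits."""
    # Strip everything that is not a digit: spaces, dots, dashes, parentheses.
    digits = re.sub(r"\D", "", raw)
    # Assumption: an 11-digit number starting with 1 carries a US country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    # Groups capture area code, prefix, and line number for the replacement.
    return re.sub(r"(\d{3})(\d{3})(\d{4})", r"(\1) \2-\3", digits)


sample = "Contact alice@example.com or bob.smith@mail.example.org."
print(extract_emails(sample))
# ['alice@example.com', 'bob.smith@mail.example.org']

for messy in ["555.123.4567", "+1 (555) 123-4567", "555 123 4567"]:
    print(normalize_phone(messy))  # (555) 123-4567 each time
```

Removing non-digits first keeps the formatting pattern simple, since it only has to handle one clean 10-digit shape instead of every separator combination.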
Homework
Write a regex that extracts dates in MM/DD/YYYY format from a paragraph.