Week 11 • Tuesday

Block 104: Text Classification with scikit-learn

Build a supervised text classifier.

Concepts

Code Examples

See exercise below.
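
As a starting point for the exercise, here is a minimal sketch of one way to compare the two models. It is not a reference solution. The toy messages and labels are made up purely for illustration and should be replaced with the real labelled corpus. The bag-of-words features (CountVectorizer), the 75/25 split, and the variable names (texts, labels, models, scores) are likewise assumptions, not requirements of the exercise.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-in data, invented for illustration only.
# Replace with the messages and labels from the real corpus.
texts = [
    "win a free prize now",
    "free cash click this link",
    "urgent claim your reward today",
    "you have won a free holiday",
    "cheap loans apply now",
    "exclusive offer call now to win",
    "free entry in a weekly prize draw",
    "claim your free ringtone now",
    "are we still meeting for lunch",
    "see you at the library at five",
    "can you send me the lecture notes",
    "running late be there soon",
    "happy birthday have a great day",
    "did you finish the homework",
    "call me when you get home",
    "thanks for dinner last night",
]
labels = ["spam"] * 8 + ["ham"] * 8

# Hold out 25% of the messages for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# Each pipeline turns raw text into word counts, then fits a classifier.
models = {
    "Naive Bayes": make_pipeline(CountVectorizer(), MultinomialNB()),
    "Logistic Regression": make_pipeline(
        CountVectorizer(), LogisticRegression(max_iter=1000)
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {scores[name]:.2f}")
```

With only a handful of test messages the accuracy figures are very noisy, so treat the numbers from the toy data as a smoke test. A meaningful comparison needs the full dataset and, ideally, cross-validation.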

Exercise

Build a spam vs. ham classifier on a small labelled dataset (e.g., an SMS spam corpus). Compare the accuracy of Naive Bayes and Logistic Regression on a held-out test set.

Homework

Why is Naive Bayes surprisingly effective for text classification despite its naive assumption?