Block 75: Preprocessing: Scaling & Encoding
Prepare raw features for machine learning models.
Concepts
- StandardScaler and MinMaxScaler
- OneHotEncoder for categorical features
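- OrdinalEncoder for ordinal feature categories (LabelEncoder is meant for target labels, not input features)
- Why scaling matters for distance- and margin-based models (KNN, SVM) but not for tree-based models, which split on one feature at a time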
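To make the last point concrete, here is a minimal sketch of how an unscaled feature can dominate a Euclidean distance. The data, the feature names (age, income), and the per-feature standard deviations are hypothetical values chosen for illustration, not taken from any real dataset.

```python
import numpy as np

# Hypothetical toy data: column 0 is age in years, column 1 is income in dollars.
X = np.array([
    [25.0, 50_000.0],   # query point
    [60.0, 50_500.0],   # very different age, similar income
    [26.0, 52_000.0],   # similar age, somewhat different income
])

# Assumed per-feature standard deviations (illustrative, not estimated from X).
# Dividing by them is the part of standardization that affects distances;
# subtracting the mean shifts every point equally and leaves distances unchanged.
assumed_std = np.array([15.0, 20_000.0])

def nearest_to_first(data):
    """Return the row index (1 or 2) closest to row 0 in Euclidean distance."""
    dists = np.linalg.norm(data[1:] - data[0], axis=1)
    return int(np.argmin(dists)) + 1

raw_neighbor = nearest_to_first(X)                   # income dominates the distance
scaled_neighbor = nearest_to_first(X / assumed_std)  # both features contribute

print(raw_neighbor, scaled_neighbor)
```

On the raw data the 35-year age gap is swamped by a 500-dollar income gap, so the nearest neighbor changes once both features are put on comparable scales. A decision tree would be unaffected, because a threshold split on a single feature is invariant to rescaling that feature.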
Code Examples
See exercise below.
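As a starting point, the sketch below runs each transformer from the Concepts list on tiny made-up arrays. It assumes scikit-learn is installed; the column contents (numbers, colors, sizes) are hypothetical and exist only to show the shape of each output.

```python
import numpy as np
from sklearn.preprocessing import (
    MinMaxScaler,
    OneHotEncoder,
    OrdinalEncoder,
    StandardScaler,
)

# Hypothetical numeric feature (one column, four samples).
X_num = np.array([[1.0], [2.0], [3.0], [4.0]])

# StandardScaler: subtract the column mean, divide by the column std.
standardized = StandardScaler().fit_transform(X_num)

# MinMaxScaler: map the column linearly onto [0, 1].
minmaxed = MinMaxScaler().fit_transform(X_num)

# Hypothetical nominal feature: no natural order, so one column per category.
colors = np.array([["red"], ["green"], ["blue"], ["green"]])
onehot = OneHotEncoder(handle_unknown="ignore")
colors_encoded = onehot.fit_transform(colors).toarray()  # dense copy of the sparse result

# Hypothetical ordinal feature: the order is passed explicitly,
# otherwise categories would be sorted alphabetically.
sizes = np.array([["small"], ["large"], ["medium"], ["small"]])
ordinal = OrdinalEncoder(categories=[["small", "medium", "large"]])
sizes_encoded = ordinal.fit_transform(sizes)

print(standardized.ravel())
print(minmaxed.ravel())
print(onehot.categories_[0], colors_encoded.shape)
print(sizes_encoded.ravel())
```

Passing the category order to OrdinalEncoder is a deliberate choice here: the integer codes only carry ordinal meaning if small < medium < large is stated up front.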
Exercise
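Apply StandardScaler to the features and compare KNN accuracy before and after scaling. One-hot encode a categorical column and verify that no information leaks from the test set: fit the scaler and the encoder on the training split only, then use them to transform the test split.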
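One possible way to set up this exercise is sketched below. It is a rough outline under assumptions, not a reference solution: the wine dataset bundled with scikit-learn, the 70/30 split, `n_neighbors=5`, and the small color column are all illustrative choices that the exercise itself does not prescribe.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed dataset: features with very different numeric ranges.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Baseline: KNN on the raw, unscaled features.
raw_knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
raw_acc = raw_knn.score(X_test, y_test)

# Scaled: the pipeline fits the scaler on the training split only,
# then reuses those statistics when scoring the test split.
scaled_knn = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]).fit(X_train, y_train)
scaled_acc = scaled_knn.score(X_test, y_test)

print(f"KNN accuracy without scaling: {raw_acc:.3f}")
print(f"KNN accuracy with scaling:    {scaled_acc:.3f}")

# Leakage check: the scaler's means must match the training split exactly.
fitted_mean = scaled_knn.named_steps["scale"].mean_
no_scaler_leak = np.allclose(fitted_mean, X_train.mean(axis=0))
print("Scaler fitted on training rows only:", no_scaler_leak)

# Hypothetical categorical column: "blue" appears only in the test split.
train_cat = np.array([["red"], ["green"], ["red"]])
test_cat = np.array([["green"], ["blue"]])

# Fit on the training split only; unseen test categories encode as all zeros.
encoder = OneHotEncoder(handle_unknown="ignore").fit(train_cat)
test_encoded = encoder.transform(test_cat).toarray()
print(encoder.categories_[0])
print(test_encoded)
```

If the encoder had been fitted on the combined data, "blue" would get its own column, which is a sign that test-set information reached the preprocessing step. Wrapping the transformers in a Pipeline keeps the fit on the training split by construction, which also carries over to cross-validation.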
Homework
Name 3 algorithms that require feature scaling and 3 that do not. Explain why.