Welcome back! In this lesson, we’ll explore practical techniques for feature selection in R. Feature selection helps you focus on the most relevant variables, improving model performance, interpretability, and training efficiency. We’ll use a model-based approach with linear regression and rank features by the magnitude of their standardized coefficients.
To keep everything self-contained and portable, we’ll work with the built-in mtcars dataset and predict mpg (miles per gallon) from the remaining columns.
Running head(mtcars) shows the first six rows of the dataset. The columns play the following roles:
- Target: mpg
- Predictors: all other columns (cyl, disp, hp, wt, etc.)
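As a minimal sketch of this setup, the snippet below loads mtcars, prints its first rows, and splits the columns into a target and predictors. The variable names (target, predictors, X, y) are illustrative choices, not something the lesson prescribes.

```r
# mtcars ships with base R (datasets package), so nothing needs downloading
data(mtcars)

# Peek at the first six rows
print(head(mtcars))

# Separate the target column from the predictor columns
target <- "mpg"
predictors <- setdiff(names(mtcars), target)

y <- mtcars[[target]]
X <- mtcars[, predictors]

print(predictors)
cat("Rows:", nrow(X), " Predictors:", ncol(X), "\n")
```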
Coefficients depend on feature scales. To compare feature importance fairly, we’ll standardize predictors (mean 0, sd 1) before fitting the model.
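One way to standardize the predictors is base R's scale(), which centers each column and divides it by its standard deviation. The sketch below assumes all predictors are numeric (true for mtcars) and uses the illustrative name X_scaled.

```r
data(mtcars)
predictors <- setdiff(names(mtcars), "mpg")

# scale() centers each column to mean 0 and rescales it to sd 1;
# it returns a matrix, so convert back to a data frame
X_scaled <- as.data.frame(scale(mtcars[, predictors]))

# Sanity check: means should be ~0 and standard deviations ~1
print(round(colMeans(X_scaled), 10))
print(round(sapply(X_scaled, sd), 10))
```

Only the predictors are scaled here; the target stays in miles per gallon, so each coefficient can later be read as the change in mpg per one standard deviation of that predictor.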
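We’ll measure importance using the absolute value of each standardized coefficient. A larger absolute coefficient means that a one-standard-deviation change in that predictor is associated with a larger change in predicted mpg, holding the other predictors fixed.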
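The ranking described above can be sketched as follows: fit lm() on the standardized predictors, drop the intercept, and sort the remaining coefficients by absolute value. The top-k step at the end uses k = 3, an arbitrary value chosen only for illustration.

```r
data(mtcars)
predictors <- setdiff(names(mtcars), "mpg")

# Standardize predictors; leave the target on its original scale
scaled <- as.data.frame(scale(mtcars[, predictors]))
scaled$mpg <- mtcars$mpg

# Fit a linear model on all standardized predictors
fit <- lm(mpg ~ ., data = scaled)

# Drop the intercept, then rank by absolute coefficient size
coefs <- coef(fit)[-1]
importance <- sort(abs(coefs), decreasing = TRUE)
print(round(importance, 3))

# Top-k selection; k = 3 is an arbitrary choice for illustration
k <- 3
top_k <- names(head(importance, k))
print(top_k)
```

Because several mtcars predictors are strongly correlated with one another, individual coefficients can shift noticeably when a predictor is added or removed, so the ranking is best treated as a rough guide rather than a definitive ordering.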
Instead of choosing a fixed number of features, you can select everything above a threshold on |standardized coefficients|:
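A minimal sketch of threshold-based selection, assuming a cutoff of 1.0 (a hypothetical value; with an unscaled target it means "at least 1 mpg per standard deviation of the predictor"). The block repeats the standardize-and-fit steps so it runs on its own.

```r
data(mtcars)
predictors <- setdiff(names(mtcars), "mpg")

# Standardize predictors and fit the linear model
scaled <- as.data.frame(scale(mtcars[, predictors]))
scaled$mpg <- mtcars$mpg
fit <- lm(mpg ~ ., data = scaled)

# Absolute standardized coefficients, intercept excluded
importance <- sort(abs(coef(fit)[-1]), decreasing = TRUE)

# Keep every feature whose importance exceeds the cutoff
threshold <- 1.0  # hypothetical cutoff; tune it for your problem
selected <- names(importance)[importance > threshold]
print(selected)

# Refit using only the selected features
reduced_fit <- lm(reformulate(selected, response = "mpg"), data = scaled)
print(summary(reduced_fit)$adj.r.squared)
```

Unlike top-k, the number of selected features here depends on the data, so it is worth checking that the cutoff does not leave the selection empty before refitting.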
In this lesson, you:
- Loaded a built-in dataset (mtcars) and defined a target (mpg).
- Standardized predictors to make coefficients comparable.
- Fit a linear model and ranked features by |standardized coefficients|.
- Selected features via top-k and threshold approaches.
