Hello! Today, we're diving into Polynomial Regression, an advanced form of regression analysis for modeling complex relationships between variables. We'll learn how to use Python and Scikit-Learn to perform polynomial regression. By the end, you'll know how to create polynomial features, train a model, and make predictions.
Polynomial regression is useful for capturing non-linear relationships. For instance, predicting exam scores (the target) based on study hours (the feature) might not follow a simple linear pattern. Polynomial regression can help in such cases.
Why do we need polynomial features? To fit a curve instead of a straight line, we create new features that include polynomial terms (like x² and x³). This helps in modeling more complex relationships.
Scikit-Learn offers PolynomialFeatures to transform our input data. Here's how it works:
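A minimal sketch of the transform (the sample values here are just illustrative):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# A single feature with three sample values (illustrative data)
X = np.array([[2.0], [3.0], [4.0]])

# degree=2 adds a squared term; include_bias=True (the default)
# also prepends a constant column of ones for the intercept
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

print(X_poly)  # each row is [1, x, x**2]
```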
The new X_poly includes the original term, its square, and an intercept term (the first column).
We'll create data to work with. We'll generate random values between -1 and 1 as features, and our target variable will follow a quadratic equation, simulating realistic data with some noise.
Now, we have the data where our target variable has a non-linear relationship with the feature.
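One way to generate such data looks like this; the specific quadratic coefficients, noise scale, and seed below are illustrative assumptions, since the lesson only specifies "a quadratic equation with noise":

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed for reproducibility

# 100 feature values uniformly drawn between -1 and 1
X = rng.uniform(-1, 1, size=(100, 1))

# Quadratic target with Gaussian noise (coefficients are assumed for illustration)
y = 2 * X[:, 0] ** 2 + X[:, 0] + 1 + rng.normal(0, 0.1, size=100)
```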
As always, we'll split our data into training and test sets to train and evaluate our model. We will use X_train to train the model and X_test to evaluate its performance.
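The split itself is one line with Scikit-Learn; the test_size and random_state values here are typical choices, not values mandated by the lesson:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic quadratic data (coefficients are illustrative assumptions)
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X[:, 0] ** 2 + X[:, 0] + 1 + rng.normal(0, 0.1, size=100)

# Hold out 20% of the samples for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```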
First, we'll train a simple linear regression model without polynomial features, like we did in the first lesson.
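A sketch of the baseline model, fit directly on the raw feature (the synthetic data setup is an assumption carried over from the earlier steps):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic quadratic data (coefficients are illustrative assumptions)
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X[:, 0] ** 2 + X[:, 0] + 1 + rng.normal(0, 0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a plain linear model without any polynomial terms
lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)

# Baseline MSE on the held-out test set
mse_linear = mean_squared_error(y_test, lin_reg.predict(X_test))
print(mse_linear)
```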
Now, we have the MSE score for a regular linear regression model. On its own it tells us little, but it gives us a baseline to compare other models against. Let's train a smarter polynomial regression model and check if it works better.
Next, we'll transform the input data to include polynomial terms and train a polynomial regression model.
By applying PolynomialFeatures(degree=2).fit_transform() to our data (both X_train and X_test), we create new features that model a quadratic relationship.
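The transform-then-fit step can be sketched as follows (again, the synthetic data is an illustrative assumption); note that we fit the transformer on the training data and reuse the same fitted transformer on the test data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Synthetic quadratic data (coefficients are illustrative assumptions)
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X[:, 0] ** 2 + X[:, 0] + 1 + rng.normal(0, 0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the transformer on the training data, then apply the same
# transform to the test data
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)

# Train a linear model on the expanded features
poly_reg = LinearRegression()
poly_reg.fit(X_train_poly, y_train)

mse_poly = mean_squared_error(y_test, poly_reg.predict(X_test_poly))
print(mse_poly)
```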
Having trained both models, we can now compare their performance using the mean squared error (MSE).
The polynomial regression model has a much lower MSE, indicating it fits the data much better.
Great job! We covered polynomial regression, from creating polynomial features to training a model and making predictions. Here’s a quick recap:
- Polynomial Features: We used PolynomialFeatures to transform our features.
- Sample Data: We created a sample dataset using a quadratic formula with noise.
- Train/Test Split: We split the data into training and test sets.
- Model Training: We trained both a simple linear regression model and a polynomial regression model.
- Evaluation: We compared their performance using MSE.
Next, you'll move to practice, where you'll apply what you've learned. You'll generate your own polynomial features, train models, and make predictions.
Happy coding!
