Putting Engineered Features to the Test: Let's Try It!

So far in this course, we've taken a systematic approach to improving our predictive modeling by carefully engineering a wide variety of features. We began by analyzing our baseline models, which included both a simple linear regression and a basic LightGBM model trained on the original, unengineered features. These baseline models gave us reference points for model performance and helped us identify which features were weakly correlated with our target.

From there, we applied a series of foundational transformations—such as rounding, normalizing, and creating interaction terms—to help our models, both linear regression and LightGBM, better capture the underlying patterns in the data. Building on that foundation, we introduced more advanced techniques, including binary flags, ratio features, and custom binning, all designed to extract even more predictive power from our raw data.
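As a quick refresher, the transformations above can be sketched in a few lines of pandas. This is a minimal, self-contained illustration with made-up column names and values—the real course dataset and feature names may differ:

```python
import pandas as pd
import numpy as np

# Hypothetical podcast data; column names and values are illustrative only.
df = pd.DataFrame({
    "episode_length": [30.7, 62.3, 45.1, 15.0],   # minutes
    "host_popularity": [55.0, 80.0, 0.0, 70.0],
    "num_ads": [1, 3, 0, 2],
})

# Rounding: coarsen a noisy continuous feature.
df["length_rounded"] = df["episode_length"].round(0)

# Normalizing: rescale to zero mean, unit variance.
df["popularity_norm"] = (
    (df["host_popularity"] - df["host_popularity"].mean())
    / df["host_popularity"].std()
)

# Interaction term: product of two raw features.
df["length_x_ads"] = df["episode_length"] * df["num_ads"]

# Binary flag: does the episode carry any ads at all?
df["has_ads"] = (df["num_ads"] > 0).astype(int)

# Ratio feature: ads per minute (guarding against division by zero).
df["ads_per_minute"] = df["num_ads"] / df["episode_length"].replace(0, np.nan)

# Custom binning: bucket episode length into short/medium/long.
df["length_bin"] = pd.cut(
    df["episode_length"],
    bins=[0, 20, 45, np.inf],
    labels=["short", "medium", "long"],
)
```

Each new column is a candidate feature; whether it actually helps is exactly what we measure next.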

Now, the time has come to put all of these engineered features to the test. In this lesson, we’ll bring together everything we’ve built so far and use those features in our modeling pipeline. This is our opportunity to measure, in a concrete and careful way, whether our feature engineering improves prediction of podcast listening time, and whether different models benefit to different degrees. We’ll evaluate the impact of each feature and transformation, comparing results with the same RMSE metric on the same test set.
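To keep the comparison fair, every model is scored with the same metric on the same held-out data. Here is a minimal sketch of that evaluation; the listening times and predictions below are made-up placeholders, not results from our actual models:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: the single metric used to compare all models."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical held-out listening times (minutes) and predictions from a
# baseline model vs. a model trained on the engineered features.
y_test = [24.0, 51.0, 33.0, 12.0]
baseline_preds = [30.0, 45.0, 30.0, 20.0]
engineered_preds = [26.0, 49.0, 34.0, 13.0]

print("baseline RMSE:  ", rmse(y_test, baseline_preds))
print("engineered RMSE:", rmse(y_test, engineered_preds))
```

Because RMSE is in the same units as the target (minutes of listening time), a drop in RMSE translates directly into a more accurate prediction.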

Are you ready to see the results of your hard work? Let’s dive in and find out just how much our engineered features have boosted our model’s performance!
