Introduction to Data Splitting and Feature Scaling

Welcome to the next step in our journey with the mtcars dataset. In the previous lesson, you learned how to preprocess and explore the mtcars dataset, laying the groundwork for more complex analyses. Now, we'll progress to splitting the data into training and test sets and scaling our features. These steps are crucial in preparing your data for machine learning models.

Step 1: Loading the mtcars Dataset

First, let's start by loading the mtcars dataset. This dataset is included with R, so you don’t need to download anything extra.
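A minimal sketch of this step (the call to head is just one way to peek at the data):

```r
# Load the built-in mtcars dataset into the workspace
data(mtcars)

# Preview the first few rows
head(mtcars)
```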


Step 2: Setting a Seed for Reproducibility

Setting a seed ensures that your results can be reproduced by others. This matters whenever your workflow involves randomness, such as the train/test split we perform later in this lesson.
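A one-line sketch; the seed value 42 is arbitrary and purely illustrative, and any fixed integer works the same way:

```r
# Fix the random number generator state so the upcoming split is reproducible
set.seed(42)
```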

This code doesn’t produce visible output but is crucial for reproducibility.

Step 3: Converting Categorical Columns to Factors

In this step, we will convert categorical columns in the mtcars dataset to factors. This is important because factors are treated as categorical data in R, enabling more accurate analyses and model training. Specifically, we'll convert the columns am, cyl, vs, gear, and carb to factors.
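One way to convert all five columns at once:

```r
# Columns in mtcars that encode categories rather than continuous values
categoricalColumns <- c("am", "cyl", "vs", "gear", "carb")

# Convert each listed column to a factor
mtcars[categoricalColumns] <- lapply(mtcars[categoricalColumns], as.factor)

# Confirm the new column types
str(mtcars)
```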


Step 4: Splitting Data into Training and Testing Sets

Now we'll use the caret library to split the mtcars dataset into training and testing sets. The createDataPartition function from the caret library helps us achieve this. We’ll partition 70% of the data for training and the remaining 30% for testing.
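A sketch of the split; partitioning on mpg is an illustrative choice of stratification variable, and you would use whichever column your model predicts:

```r
library(caret)

# Select 70% of the row indices, stratified on the chosen outcome column
trainIndex <- createDataPartition(mtcars$mpg, p = 0.7, list = FALSE)

# Use the selected rows for training and the remainder for testing
trainData <- mtcars[trainIndex, ]
testData  <- mtcars[-trainIndex, ]
```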


Step 5: Feature Scaling

Feature scaling is an important step to ensure that all data points are on a similar scale. This is especially important for algorithms that use distance measurements (e.g., K-Nearest Neighbors) or gradient descent optimization.

We'll normalize (center and scale) the features using the preProcess function from the caret library.

  • sapply(trainData, is.numeric) identifies numeric columns in trainData.
  • preProcess(trainData[, numericColumns], method = c("center", "scale")) computes scaling parameters.
  • predict(preProcValues, trainData[, numericColumns]) applies scaling to the data.
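The steps above can be sketched as follows; note that the scaling parameters are computed from the training data only and then applied to both sets, which avoids leaking test-set information:

```r
library(caret)

# Identify numeric columns (factor columns are excluded from scaling)
numericColumns <- sapply(trainData, is.numeric)

# Compute centering and scaling parameters from the training data
preProcValues <- preProcess(trainData[, numericColumns],
                            method = c("center", "scale"))

# Apply the same transformation to the training and test sets
trainData[, numericColumns] <- predict(preProcValues, trainData[, numericColumns])
testData[, numericColumns]  <- predict(preProcValues, testData[, numericColumns])
```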


Why It Matters

Splitting your dataset and scaling features are crucial steps in building effective machine learning models. By splitting the data, you ensure that your model is trained and tested on different data, which helps in evaluating its real-world performance. Feature scaling brings all features to a similar scale, which is especially important for algorithms that rely on distances (like K-Nearest Neighbors) or gradients (like gradient descent).

Mastering these techniques will significantly improve the accuracy and reliability of your models. These steps may seem straightforward, but they form the backbone of any robust machine learning project.

Are you ready to take the next step? Let's get started with the practice section and put these concepts into action.
