Lesson Introduction

Machine learning! You’ve probably heard this term a lot. But what exactly is it? Think of it as teaching a computer to learn from data and make decisions or predictions based on that data. This is like teaching a child to recognize different objects by showing them examples.

In this lesson, our goal is to understand the basics of a machine learning project. We’ll generate data, visualize it, and understand the relationships within it.

Data Generation

Let’s start by generating some data. In real-life projects, the first step is to collect data, but we'll create synthetic (fake) data for our learning purposes using NumPy.

Why random data? It simulates different scenarios and creates a controlled environment for learning. Don't worry, by the end of this course we will work with real data as well.

We'll use NumPy to generate areas of houses (in square feet) and their prices:

Real-life example: Imagine you want to predict house prices in your neighborhood. The area of a house affects its price. We simulate this by creating a simple linear relationship and adding noise to make it realistic.
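Here is a minimal sketch of what such data generation might look like. The specific numbers (100 houses, areas between 500 and 3,500 square feet, a base price of 50,000, 150 per square foot, and a noise spread of 20,000) and the variable names are assumptions chosen for illustration, not values given in this lesson.

```python
import numpy as np

# Seeded random generator so the same data comes out on every run.
rng = np.random.default_rng(seed=42)

# Illustrative assumption: 100 houses.
num_houses = 100

# House areas in square feet, drawn uniformly from an assumed range.
area = rng.uniform(500, 3500, size=num_houses)

# Assumed linear relationship: a base price plus a price per square foot.
base_price = 50_000
price_per_sqft = 150

# Gaussian noise so that two houses with the same area can differ in price.
noise = rng.normal(loc=0, scale=20_000, size=num_houses)

# Price = linear function of area + noise.
price = base_price + price_per_sqft * area + noise

# Peek at the first few generated values.
print(area[:5])
print(price[:5])
```

Fixing the seed keeps the example reproducible. Increasing the noise scale would make the relationship between area and price harder to see, while setting it to zero would put every point exactly on a straight line.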

Let's break down the data generation:
