Welcome to k-Nearest Neighbors (k-NN)! This algorithm makes predictions by asking: "What do my closest neighbors look like?"
Imagine you're house hunting. You'd probably check nearby houses to estimate prices, right? k-NN works exactly the same way.
Engagement Message
How might you use nearby examples to make a decision in real life?
Here's how k-NN works: to classify a new data point, the algorithm finds the k closest points in the training data and takes a majority vote of their labels.
If k=3 and 2 neighbors are "spam" while 1 is "not spam," the prediction is "spam."
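Here's a minimal sketch of that voting step in Python (the neighbor labels are made up to match the spam example; in a real run they'd come from the k closest training points):

```python
from collections import Counter

# Hypothetical labels of the k = 3 nearest neighbors (for illustration only).
neighbor_labels = ["spam", "not spam", "spam"]

# Majority vote: the most common label among the neighbors wins.
prediction = Counter(neighbor_labels).most_common(1)[0][0]
print(prediction)  # spam
```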
Engagement Message
Why do you think we use multiple neighbors instead of just the closest one?
The key challenge is measuring "closeness" between data points. We often use Euclidean distance, the straight-line distance between two points.
It's like measuring the distance between two cities on a map using a ruler.
Engagement Message
What other situations require measuring straight-line distance?
The Euclidean distance formula for two points is:
distance = √((x₂ - x₁)² + (y₂ - y₁)²)
For points (1,2) and (4,6): distance = √((4-1)² + (6-2)²) = √(9 + 16) = √25 = 5
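Here's the same calculation as a small Python sketch (the function name euclidean_distance is just an illustrative choice):

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two 2D points given as (x, y) tuples."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

print(euclidean_distance((1, 2), (4, 6)))  # 5.0
```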
Engagement Message
How does this formula relate to the Pythagorean theorem?
Let's work through a complete example! We have 5 houses with known prices, each represented as a point on a 2D plane. Since price is a number rather than a category, k-NN averages the neighbors' values here instead of taking a majority vote:
