Introduction

Welcome! Today, we are peeling back the layers of classification metrics, notably the confusion matrix, precision, and recall. This lesson delves into their theory and provides a practical illustration in Python.

Theory of Confusion Matrix

The performance of a binary classifier is evaluated by comparing its predicted values with the actual values; this comparison is summarized in a confusion matrix, which records four possible outcomes:

  1. True Positive (TP): Correct positive prediction.
  2. True Negative (TN): Correct negative prediction.
  3. False Positive (FP): Incorrect positive prediction.
  4. False Negative (FN): Incorrect negative prediction.

Consider an email spam filter, classifying Spam (positive) and Not Spam (negative) as follows:

| Actual \ Predicted | Spam (Predicted) | Not Spam (Predicted) |
| --- | --- | --- |
| Spam (Actual) | True Positives (TP) | False Negatives (FN) |
| Not Spam (Actual) | False Positives (FP) | True Negatives (TN) |
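
To see these four outcomes in code, here is a minimal sketch using scikit-learn's `confusion_matrix`; the label lists below are made up purely for illustration, with 1 standing for Spam and 0 for Not Spam.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for the spam example: 1 = Spam, 0 = Not Spam
y_actual    = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

# Rows correspond to actual classes, columns to predicted classes.
# With labels=[1, 0] the layout matches the table above: [[TP, FN], [FP, TN]]
cm = confusion_matrix(y_actual, y_predicted, labels=[1, 0])
tp, fn = cm[0]
fp, tn = cm[1]
print(f"TP={tp}, FN={fn}, FP={fp}, TN={tn}")  # TP=4, FN=1, FP=1, TN=4
```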
Understanding Precision and Recall

The simplest way to measure the model's performance is to calculate its accuracy: the proportion of predictions that are correct, i.e., (TP + TN) / (TP + TN + FP + FN).
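
As a quick sketch, accuracy can be computed directly from the four confusion-matrix counts; the values below simply reuse the made-up numbers from the example above.

```python
# Accuracy from the four confusion-matrix counts (reused from the sketch above)
tp, fn, fp, tn = 4, 1, 1, 4

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"Accuracy: {accuracy:.2f}")  # 0.80 -- 8 correct out of 10 predictions
```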

Accuracy measures the overall share of correct predictions, but it cannot distinguish between different kinds of errors. If a model is frequently wrong in one specific way, accuracy will not reveal it, which is a problem for certain tasks. For example, in medical tests we want to minimize the number of incorrect negative predictions (false negatives, FN) so that a disease is not missed in its early stages.
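
To make this limitation concrete, here is a small illustrative sketch with two hypothetical classifiers whose counts are invented: both achieve the same accuracy, yet one misses far more positive cases (false negatives) than the other.

```python
# Two hypothetical medical-test classifiers evaluated on 100 patients.
# Both are correct 90 times, so their accuracy is identical.
model_a = {"tp": 45, "tn": 45, "fp": 8, "fn": 2}   # few missed diseases
model_b = {"tp": 39, "tn": 51, "fp": 2, "fn": 8}   # many missed diseases

for name, m in [("Model A", model_a), ("Model B", model_b)]:
    total = sum(m.values())
    accuracy = (m["tp"] + m["tn"]) / total
    print(f"{name}: accuracy={accuracy:.2f}, false negatives={m['fn']}")
# Both print accuracy=0.90, yet Model B misses four times as many positive cases.
```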
