Logistic regression is a classification algorithm that estimates the probability that an example belongs to a particular class. Unlike linear regression, which predicts a continuous number, logistic regression is used for discrete outcomes: yes or no, true or false, or in data terms, class 0 or class 1. The estimated probability becomes a class label by comparing it with a threshold, commonly 0.5.
For example, imagine you have a set of emails, each labeled as spam or not spam. A logistic regression model trained on that data can estimate how likely a new email is to be spam based on its content.
You'll use logistic regression when you need to classify data into categories. Real-life examples include:
- Predicting if a student will pass or fail
- Determining whether an email is spam
- Diagnosing whether a patient has a certain disease
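Logistic regression works by fitting a logistic function (also known as the sigmoid function) to the data: the model learns a weight for each input feature, and the sigmoid maps the resulting score to a value between 0 and 1 that can be read as a probability. The sigmoid function is defined as: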
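$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

Here $z$ (a symbol assumed for this illustration) is the model's raw score, a weighted sum of the input features plus a bias term.

As a rough illustration of how this formula produces a class label, the sketch below computes a score, applies the sigmoid, and compares the result with a cutoff. The function names, the two features, the weights, the bias, and the 0.5 threshold are all hypothetical; a real model would learn its weights from labeled data.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Estimated probability of class 1: sigmoid of the weighted sum plus bias."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_class(features, weights, bias, threshold=0.5):
    """Convert the probability to a 0/1 label using a cutoff (assumed 0.5 here)."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical spam model: made-up features and hand-picked weights.
weights = [1.5, -2.0]  # illustrative values, not learned from data
bias = -0.5
email = [2.0, 0.5]     # e.g. suspicious-word count, sender reputation score

probability = predict_proba(email, weights, bias)
label = predict_class(email, weights, bias)
print(f"P(spam) = {probability:.3f}, predicted class = {label}")
```

With these made-up numbers the score is z = 1.5, so the sigmoid gives a probability of about 0.818 and the email is assigned to class 1. Lowering or raising the threshold trades false positives against false negatives without changing the model itself.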
