
Introduction to Adjusted Predictions

Welcome to the final lesson of this course on recommendation systems, where we will explore adjusted weighted averages. Previously, we used raw ratings to predict user preferences. However, that approach can introduce bias because it doesn't account for individual users' rating tendencies. In this lesson, you'll learn how predicting from the difference between a rating and the user's average rating can improve accuracy by reducing this bias.

Recap of Previous Setup

Let's briefly revisit the code setup that we've built upon throughout this course. You should already be familiar with reading a user-item rating matrix from a text file and setting the stage for using this data in predictions. Here's a quick code reminder:
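A minimal sketch of such a reader, assuming a hypothetical file layout in which every line holds `user,item,rating`. The function name `read_users_ratings` and the nested-dictionary result are illustrative assumptions, not necessarily the exact setup used earlier in the course:

```python
from collections import defaultdict


def read_users_ratings(file_path):
    """Parse lines of the form 'user,item,rating' into {user: {item: rating}}.

    The comma-separated layout is an assumption made for this sketch.
    """
    users_ratings = defaultdict(dict)
    with open(file_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            user, item, rating = line.split(",")
            users_ratings[user.strip()][item.strip()] = float(rating)
    return dict(users_ratings)
```

A nested dictionary keeps each user's ratings together, which makes per-user averages cheap to compute later.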

This code reads the user-item matrix from a file, setting up our essential data structure for further manipulations. Remember, understanding this setup is crucial as we now proceed to modify our prediction approach.

Understanding the Switch in Attributes

When we use raw ratings in recommendation systems, we might introduce bias because different users have different rating tendencies. Here's what that means:

Consistently High Raters: Some users might generally give high ratings to most items, regardless of their true preference. For example, a user might rate most movies 4 or 5 stars.
Consistently Low Raters: Conversely, some users might rate items lower on average, even if they like them. They might give most movies 2 or 3 stars.

These tendencies can skew predictions because the system might interpret a high rating as a strong preference, even if it's just the user's habit. To reduce this bias and improve the accuracy of our recommendation system, we adjust the ratings by subtracting the average rating of each user.

By using rating differences rather than raw ratings, we can better identify genuine preferences:

This adjustment ensures that predictions are based more on relative preferences rather than absolute ratings.
It helps to normalize user ratings, making comparisons between users more equitable.
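To make the effect concrete, here is a small sketch with two hypothetical users, one generous and one strict, who rank the same three items in the same order. The helper `mean_center` and the sample ratings are illustrative assumptions:

```python
def mean_center(ratings):
    """Return each rating as a deviation from the user's own average."""
    avg = sum(ratings.values()) / len(ratings)
    return {item: rating - avg for item, rating in ratings.items()}


# Hypothetical users with the same taste but different rating habits.
generous_rater = {"Movie A": 5.0, "Movie B": 4.0, "Movie C": 3.0}
strict_rater = {"Movie A": 3.0, "Movie B": 2.0, "Movie C": 1.0}

generous_centered = mean_center(generous_rater)
strict_centered = mean_center(strict_rater)

# The raw ratings differ on every item, but the deviations agree.
print(generous_centered)
print(strict_centered)
```

After subtracting each user's average, both users show the same deviation for every item, so their shared preference is no longer hidden by their different rating scales.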

Formula

The formula for predicting a rating using adjusted weighted averages is:

$$\text{Predicted Rating} = \bar{r}_{u} + \frac{\sum_{v \in U} \text{sim}(u, v) \times (r_{v,i} - \bar{r}_{v})}{\sum_{v \in U} |\text{sim}(u, v)|}$$

Where:

$\bar{r}_{u}$ is the average rating of the target user $u$ .
$\text{sim}(u, v)$ is the similarity between the target user $u$ and another user $v$ .
$r_{v,i}$ is the rating given by user $v$ to item $i$ .
$\bar{r}_{v}$ is the average rating of user $v$ .
$U$ is the set of users, other than the target user $u$ , who have rated item $i$ .

This formula shows how the prediction accounts for individual rating tendencies and emphasizes relative preferences, which reduces bias. Note that the weighted average here is not a rating but a deviation from a user's average rating, so we add the target user's average, $\bar{r}_{u}$ , to turn that deviation back into a rating.
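As a worked illustration with hypothetical numbers, suppose the target user has $\bar{r}_{u} = 3.5$ and two other users rated item $i$: the first with $\text{sim} = 0.9$, $r_{v,i} = 5$, $\bar{r}_{v} = 4$, and the second with $\text{sim} = 0.5$, $r_{v,i} = 2$, $\bar{r}_{v} = 3$. Substituting into the formula gives:

$$\text{Predicted Rating} = 3.5 + \frac{0.9 \times (5 - 4) + 0.5 \times (2 - 3)}{|0.9| + |0.5|} = 3.5 + \frac{0.4}{1.4} \approx 3.79$$

The first user rated the item above their own average and is the more similar of the two, so the prediction lands slightly above the target user's average.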

Step-by-Step Code Modification: Step 1

Now, let's walk through the specific code modifications needed to implement these changes. The key is to adjust the computation to use the difference between a rating and the user's average rating in our weighted rating prediction function.

Modify the calculation of rating differences by subtracting each user's average rating, like so:
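A minimal sketch of this step for a single user, assuming that user's ratings are stored as an item-to-rating dictionary. The sample data and the name `target_item` are hypothetical:

```python
# Hypothetical ratings for one user: {item: rating}
ratings = {"Movie A": 5.0, "Movie B": 3.0, "Movie C": 4.0}
target_item = "Movie A"

# Mean of all ratings this user has given.
avg_user_rating = sum(ratings.values()) / len(ratings)

# How far the target item's rating sits above or below that mean.
rating_diff = ratings[target_item] - avg_user_rating
```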

Here, avg_user_rating is calculated as the mean of all ratings given by a user. The rating_diff is the difference between the item rating and this average.

Step-by-Step Code Modification: Step 2

Ensure our similarity calculations take these differences into account:
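One way to read this step, sketched under the assumption that similarities to the target user have already been computed: each similarity weights the rating difference instead of the raw rating. The tuple layout and the sample values below are hypothetical:

```python
# Hypothetical users who rated the target item, as tuples of
# (similarity to the target user, rating of the item, user's average rating).
neighbours = [(0.9, 5.0, 4.0), (0.5, 2.0, 3.0)]

weighted_diff_sum = 0.0
similarity_sum = 0.0
for similarity, rating, avg_user_rating in neighbours:
    rating_diff = rating - avg_user_rating
    weighted_diff_sum += similarity * rating_diff  # weight the difference, not the raw rating
    similarity_sum += abs(similarity)  # absolute value keeps the denominator positive

# Guard against division by zero when no usable users exist.
predicted_diff = weighted_diff_sum / similarity_sum if similarity_sum else 0.0
```

The result is a predicted deviation, not yet a rating; it still has to be added to the target user's own average.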

This modification makes sure the predictions leverage differences rather than raw ratings, aligning with the theoretical benefits discussed earlier.

Implementing Adjusted Ratings in Predictions

With these adjustments, the weighted rating prediction function is revised to incorporate adjusted ratings. Let's consider the entire prediction function:
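The sketch below puts the pieces together. It assumes ratings are stored as `{user: {item: rating}}` and uses cosine similarity over co-rated items as a stand-in for whichever similarity measure you used in earlier lessons. The function names are illustrative, and falling back to the target user's average when no other user qualifies is a design choice made for this sketch:

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity over the items both users rated (assumed measure)."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjusted_weighted_rating_prediction(target_user, target_item, users_ratings):
    """Predict a rating as the target user's average plus a weighted deviation."""
    target_ratings = users_ratings[target_user]
    avg_target_user_rating = sum(target_ratings.values()) / len(target_ratings)

    weighted_diff_sum = 0.0
    similarity_sum = 0.0
    for user, ratings in users_ratings.items():
        # Only other users who rated the target item contribute.
        if user == target_user or target_item not in ratings:
            continue
        similarity = cosine_similarity(target_ratings, ratings)
        avg_user_rating = sum(ratings.values()) / len(ratings)
        rating_diff = ratings[target_item] - avg_user_rating
        weighted_diff_sum += similarity * rating_diff
        similarity_sum += abs(similarity)

    if similarity_sum == 0:
        # No usable users: fall back to the target user's own average.
        return avg_target_user_rating
    return avg_target_user_rating + weighted_diff_sum / similarity_sum


# Hypothetical data: Bob agrees with Alice on shared items and
# rates "Movie C" two points above his own average.
users_ratings = {
    "Alice": {"Movie A": 4.0, "Movie B": 2.0},
    "Bob": {"Movie A": 4.0, "Movie B": 2.0, "Movie C": 6.0},
}
prediction = adjusted_weighted_rating_prediction("Alice", "Movie C", users_ratings)
print(round(prediction, 2))
```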

The predicted rating now compensates for individual rating habits, leading to recommendations that better reflect each user's true preferences. Note that the weighted sum predicts how far the target item's rating deviates from the target user's average rating, so we add avg_target_user_rating to that deviation to get the final rating.

Review, Summary, and Preparation for Practice

Congratulations on reaching the end of this course! Let's summarize what you have covered in this lesson:

You learned about using adjusted weighted averages to improve prediction accuracy by reducing the bias that comes from users' differing rating tendencies.
You explored specific code modifications that use rating differences rather than raw ratings, making comparisons between users more equitable in similarity-based recommendations.

In the practice exercises that follow, you'll have the chance to apply these concepts hands-on, solidifying your understanding. Thank you for your dedication and hard work throughout this journey. Your newfound expertise in recommendation systems positions you well for further exploration and application in real-world projects. Well done!
