Introduction to Adjusted Predictions

Welcome to the final lesson of this course on recommendation systems, where we will explore the concept of adjusted weighted averages. Previously, we used raw ratings to predict user preferences with weighted averages based on Pearson similarity. However, this approach can introduce bias, as it doesn't account for individual users' rating tendencies. In this lesson, you'll learn how using the difference between a rating and the user's average rating, rather than the raw rating itself, can improve prediction accuracy by minimizing these biases.

Recap of Previous Setup

Let's briefly revisit the code setup that we've built upon throughout this course. You should already be familiar with reading a user-item rating matrix from a text file and setting the stage for using this data in predictions. Here's a quick code reminder:

This code reads the user-item matrix from a file, setting up our essential data structure for further manipulations. Understanding this setup is crucial as we now proceed to modify our prediction approach.

Understanding the Switch in Attributes

When we use raw ratings in recommendation systems, we might introduce bias because different users have different rating tendencies. Here's what that means:

  • Consistently High Raters: Some users might generally give high ratings to most items, regardless of their true preferences. For example, a user might rate most movies 4 or 5 stars.
  • Consistently Low Raters: Conversely, some users might rate items lower on average, even if they like them. They might give most movies 2 or 3 stars.

These tendencies can skew predictions because the system might interpret a high rating as a strong preference, even if it's just the user's habit. To reduce this bias and improve the accuracy of our recommendation system, we adjust the ratings by subtracting the average rating of each user.

By using rating differences rather than raw ratings, we can better identify genuine preferences:

  • This adjustment ensures that predictions are based more on relative preferences rather than absolute ratings.
  • It helps to normalize user ratings, making comparisons between users more equitable.
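To see the effect, consider two users with identical relative preferences but different rating habits. A minimal sketch with made-up data (the `adjust` helper is illustrative, not part of the course code):

```go
package main

import "fmt"

// adjust subtracts a user's average rating from each of their ratings.
func adjust(ratings map[string]float64) map[string]float64 {
	var sum float64
	for _, r := range ratings {
		sum += r
	}
	avg := sum / float64(len(ratings))
	out := make(map[string]float64, len(ratings))
	for item, r := range ratings {
		out[item] = r - avg
	}
	return out
}

func main() {
	// A generous rater and a harsh rater with the same relative preferences.
	high := map[string]float64{"m1": 5, "m2": 4, "m3": 3}
	low := map[string]float64{"m1": 3, "m2": 2, "m3": 1}
	fmt.Println(adjust(high)) // map[m1:1 m2:0 m3:-1]
	fmt.Println(adjust(low))  // map[m1:1 m2:0 m3:-1]
}
```

After adjustment, both users look identical: the rating scale they happen to use no longer matters, only how each item compares to their own average.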

Formula

The formula for predicting a rating using adjusted weighted averages is:

\text{Predicted Rating} = \bar{r}_{u} + \frac{\sum_{v \in U} \text{sim}(u, v) \times (r_{v,i} - \bar{r}_{v})}{\sum_{v \in U} \text{sim}(u, v)}

Here, \bar{r}_{u} is the target user u's average rating, U is the set of users who have rated item i, \text{sim}(u, v) is the Pearson similarity between users u and v, and r_{v,i} - \bar{r}_{v} is user v's rating of item i relative to their own average.

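As a quick sanity check, the formula can be evaluated directly on made-up numbers (`predictAdjusted` and all values below are illustrative, not part of the course code):

```go
package main

import "fmt"

// predictAdjusted applies the adjusted weighted-average formula:
// r̄_u + Σ sim(u,v)·(r_{v,i} − r̄_v) / Σ sim(u,v).
func predictAdjusted(avgU float64, sims, diffs []float64) float64 {
	var num, den float64
	for i, s := range sims {
		num += s * diffs[i]
		den += s
	}
	if den == 0 {
		return avgU // no usable neighbors: fall back to the user's average
	}
	return avgU + num/den
}

func main() {
	// Target user's average is 3.5; two neighbors with similarities 0.8 and
	// 0.5 rated the item 1.0 above and 0.5 below their own averages.
	fmt.Printf("%.3f\n", predictAdjusted(3.5, []float64{0.8, 0.5}, []float64{1.0, -0.5}))
	// prints 3.923: the adjustment (0.8·1.0 + 0.5·(−0.5)) / 1.3 ≈ 0.423
}
```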
Step-by-Step Code Modification: Step 1

Now, let's walk through the specific code modifications needed to implement these changes. The key is to adjust the computation to use the difference between a rating and the user's average rating in our weighted rating prediction function.

Modify the calculation of rating differences by subtracting each user's average rating, and use the comma-ok idiom to check that the user has actually rated the target item before including them in the prediction:
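One way to sketch this step (identifiers like `avgUserRating`, `ratingDiff`, `ratings`, and `targetItem` follow the lesson's discussion; the surrounding function is abbreviated here):

```go
package main

import "fmt"

// avgRating returns the mean of one user's ratings.
func avgRating(ratings map[string]float64) float64 {
	var sum float64
	for _, r := range ratings {
		sum += r
	}
	return sum / float64(len(ratings))
}

// adjustedDiff returns the user's rating of targetItem minus their average
// rating, with ok=false if the user never rated targetItem.
func adjustedDiff(ratings map[string]float64, targetItem string) (float64, bool) {
	avgUserRating := avgRating(ratings)
	// Comma-ok idiom: only proceed if the user actually rated the item;
	// indexing a missing key would otherwise yield the zero value (0).
	rating, ok := ratings[targetItem]
	if !ok {
		return 0, false
	}
	ratingDiff := rating - avgUserRating
	return ratingDiff, true
}

func main() {
	ratings := map[string]float64{"m1": 5, "m2": 3, "m3": 4}
	if diff, ok := adjustedDiff(ratings, "m1"); ok {
		fmt.Printf("%.2f\n", diff) // 5 − 4 = 1.00
	}
}
```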

Here, avgUserRating is calculated as the mean of all ratings given by a user, and ratingDiff is the difference between the item rating and this average. The check using the comma-ok idiom (rating, ok := ratings[targetItem]) ensures that we only proceed if the user has actually rated the target item. Note that in Go, reading a missing map key does not panic; it silently returns the zero value, so without this check an unrated item would be counted as a rating of 0 and skew the prediction.

Step-by-Step Code Modification: Step 2

Ensure our similarity calculations take these differences into account:
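A sketch of the accumulation loop (identifiers such as `similarity` and `sumOfWeights` follow the lesson's discussion; here the per-neighbor similarities and rating deviations are passed in precomputed, rather than calculated inline):

```go
package main

import (
	"fmt"
	"math"
)

// predictFromNeighbors combines each neighbor's similarity with their rating
// deviation (r_{v,i} − r̄_v), using the signed similarity in both the
// numerator and the denominator.
func predictFromNeighbors(avgTarget float64, sims, diffs map[string]float64) float64 {
	var weightedSum, sumOfWeights float64
	for user, similarity := range sims {
		diff, ok := diffs[user] // neighbor must have rated the target item
		if !ok {
			continue
		}
		weightedSum += similarity * diff
		sumOfWeights += similarity // signed: negative similarities subtract
	}
	// If the similarities (nearly) cancel out, fall back to the average.
	if math.Abs(sumOfWeights) < 1e-9 {
		return avgTarget
	}
	return avgTarget + weightedSum/sumOfWeights
}

func main() {
	sims := map[string]float64{"bob": 0.9, "carol": -0.9}
	diffs := map[string]float64{"bob": 1.0, "carol": -1.0}
	// Similarities cancel (0.9 − 0.9 = 0), so we fall back to the average.
	fmt.Println(predictFromNeighbors(3.0, sims, diffs)) // prints 3
}
```

The near-zero check on `sumOfWeights` implements the fallback described above: when similarities cancel, the neighbors carry no usable signal, so the safest prediction is the target user's own average.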

The denominator uses the sum of the signed similarities (sumOfWeights += similarity). This is the standard approach in collaborative filtering, as it ensures that only users with positive similarity (i.e., similar tastes) contribute positively to the prediction, while users with negative similarity (opposite tastes) can reduce or even reverse the effect. If the sum of similarities is close to zero, the prediction will fall back to the target user's average rating.

If you want to further restrict the influence to only positively correlated users, you can add a check to skip users with non-positive similarity.

Implementing Adjusted Ratings in Predictions

With these adjustments, the weighted rating prediction function is revised to incorporate adjusted ratings and use the signed similarity in the denominator. Let's consider the entire prediction function:
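Putting the pieces together, one possible version of the revised function might look like this. The `pearson` helper is a standard Pearson correlation over co-rated items, standing in for the one built in earlier lessons; the data layout and names are assumptions:

```go
package main

import (
	"fmt"
	"math"
)

// avgRating returns the mean of one user's ratings.
func avgRating(ratings map[string]float64) float64 {
	var sum float64
	for _, r := range ratings {
		sum += r
	}
	return sum / float64(len(ratings))
}

// pearson computes the Pearson correlation between two users over their
// co-rated items, returning 0 when it is undefined.
func pearson(a, b map[string]float64) float64 {
	var n, sumA, sumB, sumA2, sumB2, sumAB float64
	for item, ra := range a {
		rb, ok := b[item]
		if !ok {
			continue
		}
		n++
		sumA += ra
		sumB += rb
		sumA2 += ra * ra
		sumB2 += rb * rb
		sumAB += ra * rb
	}
	if n == 0 {
		return 0
	}
	num := sumAB - sumA*sumB/n
	den := math.Sqrt((sumA2 - sumA*sumA/n) * (sumB2 - sumB*sumB/n))
	if den == 0 {
		return 0
	}
	return num / den
}

// predictRating estimates targetUser's rating of targetItem using the
// adjusted weighted-average formula.
func predictRating(data map[string]map[string]float64, targetUser, targetItem string) float64 {
	target := data[targetUser]
	avgTarget := avgRating(target)
	var weightedSum, sumOfWeights float64
	for user, ratings := range data {
		if user == targetUser {
			continue
		}
		// Comma-ok idiom: neighbor must have rated the target item.
		rating, ok := ratings[targetItem]
		if !ok {
			continue
		}
		similarity := pearson(target, ratings)
		ratingDiff := rating - avgRating(ratings)
		weightedSum += similarity * ratingDiff
		sumOfWeights += similarity // signed similarity in the denominator
	}
	if math.Abs(sumOfWeights) < 1e-9 {
		return avgTarget // no informative neighbors: fall back to the average
	}
	return avgTarget + weightedSum/sumOfWeights
}

func main() {
	data := map[string]map[string]float64{
		"alice": {"m1": 5, "m2": 3},
		"bob":   {"m1": 4, "m2": 2, "m3": 4},
	}
	// bob correlates perfectly with alice and rated m3 above his own average,
	// so the prediction lands above alice's average of 4.
	fmt.Printf("%.2f\n", predictRating(data, "alice", "m3")) // prints 4.67
}
```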

Review, Summary, and Preparation for Practice

  • You learned about using adjusted weighted averages to improve prediction accuracy by reducing bias in user-item matrices.
  • You explored specific code modifications that use rating differences rather than raw ratings, making similarity-based recommendations more equitable across users with different rating habits.
  • You saw how to use the signed similarity in the denominator, following standard collaborative filtering practice.

In the practice exercises that follow, you'll have the chance to apply these concepts hands-on, solidifying your understanding. Thank you for your dedication and hard work throughout this journey. Your newfound expertise in recommendation systems positions you well for further exploration and application in real-world projects. Well done!
