Introduction

Welcome back! So far you have covered baseline predictions and similarity measures such as Pearson correlation. Understanding user similarity is crucial in recommendation systems because it enables more accurate predictions of unknown ratings. In this lesson, we build on that knowledge and take a practical approach to predicting user ratings: a weighted average in which each similar user's rating is weighted by its Pearson similarity to the target user. By the end of the lesson, you'll be able to predict a user's rating for an item, a vital skill in building recommendation systems.

Recap: Using Pearson Similarity

Before diving into this lesson's main topic, let's quickly revisit the Pearson Correlation function we discussed in the previous lesson. This function is key to determining how similar two users are based on their rating patterns.

Here is how you can implement the Pearson Correlation calculation in Go, using slices and manual loops for all calculations:
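
A minimal sketch of such a function is shown below. The name `pearsonCorrelation`, the `[]float64` inputs, and the choice to return 0 for degenerate input (mismatched lengths, fewer than two ratings, or zero variance) are assumptions made for illustration, not necessarily what the previous lesson used.

```go
package main

import (
	"fmt"
	"math"
)

// pearsonCorrelation returns the Pearson correlation between two aligned
// rating slices. As an assumed convention, it returns 0 when the slices differ
// in length, hold fewer than two ratings, or when either has zero variance.
func pearsonCorrelation(x, y []float64) float64 {
	n := len(x)
	if n != len(y) || n < 2 {
		return 0
	}

	// Compute the mean of each slice.
	var sumX, sumY float64
	for i := 0; i < n; i++ {
		sumX += x[i]
		sumY += y[i]
	}
	meanX := sumX / float64(n)
	meanY := sumY / float64(n)

	// Accumulate the covariance numerator and both sums of squared deviations.
	var cov, varX, varY float64
	for i := 0; i < n; i++ {
		dx := x[i] - meanX
		dy := y[i] - meanY
		cov += dx * dy
		varX += dx * dx
		varY += dy * dy
	}

	denom := math.Sqrt(varX * varY)
	if denom == 0 {
		return 0 // at least one slice holds identical ratings throughout
	}
	return cov / denom
}

func main() {
	a := []float64{1, 2, 3, 4}
	b := []float64{2, 4, 6, 8}
	fmt.Printf("%.2f\n", pearsonCorrelation(a, b)) // perfectly linear, so 1.00
}
```

Returning 0 for degenerate input means such a user contributes nothing to a weighted average later on; other conventions, such as returning an error, are equally possible.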

This function calculates how closely two sets of user ratings align. Higher values indicate greater similarity, which will be important for today's task: predicting ratings based on these similarities.

Reading the User-Item Rating Matrix

To make predictions, we first need to read and interpret our user-item rating data. This data is stored in a file named user_items_matrix.txt. Let's explore how the file is structured and how to load this information.

The file is organized with each line representing a user's rating for a specific item. It has three comma-separated values: User, Item, and Rating. Here's an example:

We can use Go's map data structure to store this information and read the file using bufio.Scanner:
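
One way this loading step could look is sketched below. The type name `RatingMatrix`, the helper names, the user-then-item key order, and the assumption that the file has no header row are all illustrative choices rather than details confirmed by the lesson.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strconv"
	"strings"
)

// RatingMatrix maps user -> item -> rating (hypothetical type name).
type RatingMatrix map[string]map[string]float64

// parseRatings reads "User,Item,Rating" lines from r into a nested map.
// It assumes there is no header row and skips blank lines.
func parseRatings(r io.Reader) (RatingMatrix, error) {
	matrix := make(RatingMatrix)
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}
		parts := strings.Split(line, ",")
		if len(parts) != 3 {
			return nil, fmt.Errorf("malformed line: %q", line)
		}
		user := strings.TrimSpace(parts[0])
		item := strings.TrimSpace(parts[1])
		rating, err := strconv.ParseFloat(strings.TrimSpace(parts[2]), 64)
		if err != nil {
			return nil, fmt.Errorf("bad rating in line %q: %w", line, err)
		}
		if matrix[user] == nil {
			matrix[user] = make(map[string]float64)
		}
		matrix[user][item] = rating
	}
	return matrix, scanner.Err()
}

// loadRatings opens the file at path and parses its contents.
func loadRatings(path string) (RatingMatrix, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return parseRatings(f)
}

// parseRatingsString is a convenience wrapper for in-memory data.
func parseRatingsString(data string) (RatingMatrix, error) {
	return parseRatings(strings.NewReader(data))
}

func main() {
	matrix, err := loadRatings("user_items_matrix.txt")
	if err != nil {
		fmt.Println("error reading ratings:", err)
		return
	}
	fmt.Println("users loaded:", len(matrix))
}
```

Parsing from an `io.Reader` keeps the file handling separate from the line parsing, so the same logic can be exercised with a string in tests.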

The code reads the file line by line, splits each line into user, item, and rating, and then stores this data in a nested map. This structure allows for easy retrieval and manipulation of ratings, which simplifies the calculations that follow.

Calculating Non-weighted Average Rating

Before making predictions using weighted averages, it's beneficial to understand non-weighted averages, which are simpler aggregates of ratings for a specific item across all users.

Here's how you can compute this in Go:
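
The sketch below shows one possible shape for this calculation. The signature, in particular the extra boolean that reports whether anyone rated the item, and the sample ratings in `main` are assumptions for illustration only.

```go
package main

import "fmt"

// calculateNonWeightedAverage returns the plain mean of every rating given to
// item. The second result is false when no user has rated the item.
func calculateNonWeightedAverage(matrix map[string]map[string]float64, item string) (float64, bool) {
	var sum float64
	count := 0
	for _, ratings := range matrix {
		if r, ok := ratings[item]; ok {
			sum += r
			count++
		}
	}
	if count == 0 {
		return 0, false // avoid dividing by zero
	}
	return sum / float64(count), true
}

func main() {
	// Hypothetical ratings, for illustration only.
	matrix := map[string]map[string]float64{
		"User1": {"ItemA": 5, "ItemC": 4},
		"User2": {"ItemA": 3, "ItemC": 2},
		"User3": {"ItemA": 4},
	}
	if avg, ok := calculateNonWeightedAverage(matrix, "ItemC"); ok {
		fmt.Printf("average for ItemC: %.2f\n", avg) // (4 + 2) / 2 = 3.00
	}
}
```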

This function, calculateNonWeightedAverage, gathers all ratings for a specified item (e.g., "ItemC") from the user-item matrix and calculates their average. It's a straightforward method, but unlike the weighted prediction we'll explore next, it does not consider user similarity.

Formula for Weighted Rating Prediction

Let's consider the formula for predicting a rating using a weighted average based on Pearson similarity:

$$\text{Predicted Rating} = \frac{\sum_{v \in U} \text{sim}(u, v) \times r_{v,i}}{\sum_{v \in U} |\text{sim}(u, v)|}$$

Using Only Common Items for Pearson Correlation

A crucial detail for accurate similarity calculation is to only use ratings for items that both users have rated (excluding the target item being predicted). This ensures the Pearson correlation is meaningful and not distorted by missing data.

Here's how you can extract the common ratings between two users (excluding the target item):
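
A sketch of this extraction step follows. The function name `commonRatings`, its signature, and the sample ratings are assumptions. The shared items are sorted only to make the output order reproducible; Pearson correlation itself needs the two slices to be aligned, not sorted.

```go
package main

import (
	"fmt"
	"sort"
)

// commonRatings returns two aligned slices holding the ratings that users a
// and b gave to the items both have rated, skipping targetItem.
func commonRatings(a, b map[string]float64, targetItem string) ([]float64, []float64) {
	// Collect the shared items first.
	var items []string
	for item := range a {
		if item == targetItem {
			continue
		}
		if _, ok := b[item]; ok {
			items = append(items, item)
		}
	}
	sort.Strings(items) // Go map iteration order is not deterministic

	ra := make([]float64, 0, len(items))
	rb := make([]float64, 0, len(items))
	for _, item := range items {
		ra = append(ra, a[item])
		rb = append(rb, b[item])
	}
	return ra, rb
}

func main() {
	// Hypothetical ratings, for illustration only.
	user1 := map[string]float64{"ItemA": 5, "ItemB": 3, "ItemC": 4}
	user3 := map[string]float64{"ItemA": 4, "ItemB": 2, "ItemD": 1}

	r1, r3 := commonRatings(user1, user3, "ItemC")
	fmt.Println(r1, r3) // [5 3] [4 2]
}
```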

This function finds all items (except the target item) that both users have rated, and returns their ratings as aligned slices.

If your source ratings are stored as int, convert them to float64 before computing means, similarities, and predicted ratings. Raw file values may be whole numbers, but Pearson correlation and weighted predictions naturally use fractional results.
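
As a small illustration of why the conversion matters, the sketch below uses a hypothetical helper named `toFloat64s` and contrasts integer division with floating-point division on the same two ratings.

```go
package main

import "fmt"

// toFloat64s converts integer ratings to float64 so that means and
// correlations are computed with floating-point arithmetic.
func toFloat64s(ratings []int) []float64 {
	out := make([]float64, len(ratings))
	for i, r := range ratings {
		out[i] = float64(r)
	}
	return out
}

func main() {
	raw := []int{4, 5}
	fmt.Println((raw[0] + raw[1]) / 2) // integer division truncates: 4

	f := toFloat64s(raw)
	fmt.Println((f[0] + f[1]) / 2) // floating-point mean: 4.5
}
```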

Predicting Rating Using Weighted Average

Now, let's move to the core of this lesson: predicting ratings using a weighted average that's informed by Pearson similarity. This method considers the similarity between users when calculating the predicted rating. Here’s a detailed implementation of this approach, using only common items for Pearson correlation:
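
The sketch below is one way to put the pieces together. The helper names `pearson` and `commonRatings`, the signatures, the boolean result, and the sample ratings are assumptions for illustration. Only users who have rated the target item contribute, and the denominator uses absolute similarities, matching the formula above.

```go
package main

import (
	"fmt"
	"math"
)

// pearson returns the Pearson correlation of two aligned slices, or 0 for
// degenerate input (an assumed convention).
func pearson(x, y []float64) float64 {
	n := len(x)
	if n != len(y) || n < 2 {
		return 0
	}
	var sumX, sumY float64
	for i := range x {
		sumX += x[i]
		sumY += y[i]
	}
	meanX, meanY := sumX/float64(n), sumY/float64(n)
	var cov, varX, varY float64
	for i := range x {
		dx, dy := x[i]-meanX, y[i]-meanY
		cov += dx * dy
		varX += dx * dx
		varY += dy * dy
	}
	if denom := math.Sqrt(varX * varY); denom != 0 {
		return cov / denom
	}
	return 0
}

// commonRatings returns aligned ratings for items both users rated,
// excluding the target item.
func commonRatings(a, b map[string]float64, target string) (ra, rb []float64) {
	for item, va := range a {
		if item == target {
			continue
		}
		if vb, ok := b[item]; ok {
			ra = append(ra, va)
			rb = append(rb, vb)
		}
	}
	return ra, rb
}

// weightedRatingPrediction predicts user's rating for item as a
// similarity-weighted average of other users' ratings. It returns false when
// the user is unknown or no usable neighbor exists.
func weightedRatingPrediction(matrix map[string]map[string]float64, user, item string) (float64, bool) {
	target, ok := matrix[user]
	if !ok {
		return 0, false
	}
	var weightedSum, weightTotal float64
	for other, otherRatings := range matrix {
		if other == user {
			continue
		}
		rating, rated := otherRatings[item]
		if !rated {
			continue // this user cannot inform the prediction
		}
		a, b := commonRatings(target, otherRatings, item)
		sim := pearson(a, b)
		weightedSum += sim * rating
		weightTotal += math.Abs(sim)
	}
	if weightTotal == 0 {
		return 0, false
	}
	return weightedSum / weightTotal, true
}

func main() {
	// Hypothetical ratings: User1 correlates 1.0 with User3, User2 correlates 0.5.
	matrix := map[string]map[string]float64{
		"User1": {"ItemA": 2, "ItemB": 4, "ItemD": 6, "ItemC": 4},
		"User2": {"ItemA": 1, "ItemB": 3, "ItemD": 2, "ItemC": 1},
		"User3": {"ItemA": 1, "ItemB": 2, "ItemD": 3},
	}
	if p, ok := weightedRatingPrediction(matrix, "User3", "ItemC"); ok {
		fmt.Printf("%.2f\n", p) // (1.0*4 + 0.5*1) / (1.0 + 0.5) = 3.00
	}
}
```

Note that a negative similarity lowers the weighted sum while still increasing the denominator; some implementations instead skip neighbors with non-positive similarity, which is a design choice this sketch does not make.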

This function, weightedRatingPrediction, predicts the rating for a specified user (e.g., "User3") on a target item (e.g., "ItemC") by:

  • Gathering Similarity Scores: Compute the Pearson correlation between the target user and each other user who has rated the target item, using only the ratings for items both users have rated.
  • Calculating a Weighted Sum: Multiply each of those users' ratings for the target item by their similarity to the target user, and add up the products.
  • Normalizing by the Sum of Similarity Weights: Divide the weighted sum by the sum of the absolute values of the similarity scores to produce a personalized rating prediction.

This method is more nuanced as it adjusts ratings based on the closeness of users' preferences, thus providing more personalized recommendations.

Example Data and Interpreting Results

To better understand how the weighted rating prediction works, let's look at an example with three users (User1, User2, and User3) and their ratings for various items, including ItemC.

Example Data (as would appear in user_items_matrix.txt):

We want to predict User3's rating for ItemC. Here’s how this process unfolds:

  1. Calculate Similarities:

    • User1's past ratings (e.g., on ItemA and ItemB) are only moderately similar to User3's ratings, giving a Pearson correlation of about 0.5.
    • User2's ratings are more similar to User3's ratings, resulting in a higher correlation of about 0.86.
  2. Weighted Sum Calculation:

    • Since User1's similarity to User3 is lower, User1's rating of 4 for ItemC carries less weight: 0.5 × 4 = 2.0.

Summary and Preparation for Practice

You’ve now learned to predict user ratings using a weighted average approach informed by user similarity. This lesson has enhanced your understanding of collaborative filtering in recommendation systems, allowing you to make predictions that better reflect individual user preferences.

As you move into the practice exercises, take the opportunity to apply these techniques to different datasets and observe how the recommendations change with user similarity. You are building the foundational knowledge to create effective recommendation systems.
