Welcome back! As you continue your journey through the fascinating world of recommendation systems, it's important to understand not just explicit feedback — such as star ratings — but also implicit feedback. Implicit feedback is obtained from user behavior patterns, like watch times or click histories. While it's much easier to gather, it doesn't directly reveal user satisfaction as explicit feedback does.
In the previous unit, you used ALS with explicit ratings and tried to reconstruct missing user-item scores directly. The core matrix-factorization idea still matters here, but the input changes: instead of modeling stars or ratings, we model whether an interaction happened and how strongly we trust that signal. Most classical models use either implicit or explicit feedback, but not both, because integrating the two into a unified system is complex. In this course, we'll first build interaction and confidence matrices from implicit feedback on its own, and then use those matrices as the input to IALS.
Now, let's delve into the binary matrix of interactions. In the context of implicit feedback, this matrix is a simplified representation showing whether a user interacted with an item or not. Each entry in the matrix is a binary value:
- 1 indicates an interaction (e.g., a user watched an item),
- 0 implies no interaction.
For example, let's say User 1 interacted with Items 1, 2, and 4. The binary matrix would look like this:
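Assuming a catalog of four items (and adding a hypothetical second user for contrast), one possible layout is:

```
         Item 1  Item 2  Item 3  Item 4
User 1     1       1       0       1
User 2     0       1       0       0
```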
This matrix is crucial, as it helps algorithms understand which items have been interacted with, providing a baseline for recommending new items to users.
The confidence matrix goes beyond the binary matrix by incorporating the confidence we have in each interaction. This confidence is calculated based on user behaviors such as watchTime. Longer watch times suggest higher interest and, thus, greater confidence in the interaction.
The important distinction is:
- the interaction matrix answers: "did something happen?"
- the confidence matrix answers: "how strongly should the model trust that signal?"
This is why a user who barely sampled an item and a user who watched it for a long time can both have interaction value 1, while still receiving very different confidence values.
You can think of the formula as starting from a baseline and then scaling upward with evidence. The 1 is that baseline: once we have observed an interaction at all, we do not want its confidence to drop to zero. The alpha value controls how quickly confidence grows as watch_time increases. A larger alpha makes the model react more strongly to differences in engagement, while a smaller one keeps confidence values closer together. In this lesson, we use alpha = 40 as a simple teaching default that makes the effect visible in small examples. In practice, it is a tunable hyperparameter rather than a universal constant.
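To make the effect of alpha concrete, here is a small worked comparison, assuming the baseline-plus-scaling rule described above (confidence starts at 1 and grows by alpha times the watch time; the watch-time values here are illustrative):

```
alpha = 40:  watch_time 0.1 → 1 + 40 * 0.1 = 5     watch_time 1.0 → 1 + 40 * 1.0 = 41
alpha = 5:   watch_time 0.1 → 1 + 5 * 0.1  = 1.5   watch_time 1.0 → 1 + 5 * 1.0  = 6
```

With the larger alpha, the gap between a brief sample and a long watch widens from 4.5 to 36, which is exactly the "react more strongly" behavior described above.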
Let's walk through how you might compute a confidence matrix in Go, where watchTime plays a significant role.
The dataset is a JSON file where each entry contains entries for user, item, rating, and watch_time. Each record describes an interaction a user had with an item.
Here's an excerpt explaining how to read the data in Go:
In this block, we read the JSON file and calculate maxUser and maxItem to ascertain the dimensions of our matrices. The JSON still contains a rating field because the raw event log can store both explicit and implicit signals side by side. In this implicit-feedback unit, however, we ignore rating and build our matrices from watch_time.
Following the data read, we initialize the matrices and populate them with interactions and confidence values:
Here, interactionMatrix is filled with 1s indicating a user-item interaction, while confidenceMatrix is filled using the formula:
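Matching the baseline-plus-scaling description from earlier in this lesson, the formula is:

```
confidence = 1 + alpha * watchTime
```

with alpha = 40 in this lesson; entries with no observed interaction stay at the baseline of 1.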
This formula is taken from the article that we mentioned before. In practice, you can experiment and come up with different approaches to calculate the implicit feedback value. For example, in the same article, the authors offer an alternative formula for confidence that also worked well for them:
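That alternative typically takes a logarithmic form, where epsilon is another tunable constant (the exact symbol names here are illustrative):

```
confidence = 1 + alpha * log(1 + watchTime / epsilon)
```

The logarithm compresses very long watch times, so a handful of extreme sessions do not dominate the confidence values.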
Different units in this course will use this formula in slightly different forms, depending on the goal.
Here's a short example showing how the resulting matrices might look:
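For instance, with two users, four items, and alpha = 40, the printed matrices might look like this (the exact confidence values depend on the watch times in your data):

```
Interaction matrix:
[[1 1 0 1]
 [0 1 0 0]]

Confidence matrix:
[[41 21  1  5]
 [ 1 13  1  1]]
```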
This output reflects interactions and confidence levels across users and items.
You might wonder, why don't we use only the confidence matrix, as it contains all the information? The reason is that splitting user preferences (interactions) and our confidence in their preferences allows us to work with these values distinctly and construct a model that treats them separately. It generally improves the model's performance.
In the next lesson, we will train one example of such a model. But before that, let's wrap it up and have some practice!
In this lesson, you focused on understanding and creating interaction and confidence matrices based on implicit feedback like user watch times. You now have both the theoretical understanding and practical skills to process implicit feedback. This enables you to create a more nuanced and personalized recommendation system.
In the next session, you'll have the opportunity to explore practice exercises that reinforce today's lesson. These exercises will help solidify your understanding and make the transition to advanced models seamless. Keep up the great work as you advance towards mastering recommendation systems!
