In previous lessons, you learned about content-based recommendation systems and how they rely on user and item profiles. We covered how to extract content features such as likes, clicks, and genres, and how to compute similarities using straightforward methods like the dot product. This lesson builds on those foundations with a more complex, end-to-end example.
We'll simulate user preferences, calculate genre similarities, and score songs based on those similarities, offering you a glimpse into how these systems are applied in real-world scenarios such as music streaming services. We'll also introduce the basics of feature standardization and linear regression, two important concepts that will help you build more sophisticated recommendation models. Let's work through the example step by step.
As a reminder from our previous lessons, let's quickly revisit how to load and merge datasets. In JavaScript, we typically represent data as arrays of objects. Suppose we have two datasets: one for tracks and one for authors. We can merge these datasets by matching a common key, such as author_id.
Here's how you might do this in JavaScript:
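A minimal sketch follows; the sample records and field values are hypothetical, but the pattern (a map over tracks plus a find on author_id) is a common way to merge in plain JavaScript:

```javascript
// Hypothetical sample data; titles, names, and values are illustrative.
const tracks = [
  { track_id: 1, title: 'Track A', genre: 'rock', author_id: 101 },
  { track_id: 2, title: 'Track B', genre: 'pop', author_id: 102 },
  { track_id: 3, title: 'Track C', genre: 'jazz', author_id: 103 },
];

const authors = [
  { author_id: 101, name: 'Author X', author_listeners: 50000 },
  { author_id: 102, name: 'Author Y', author_listeners: 120000 },
  { author_id: 103, name: 'Author Z', author_listeners: 30000 },
];

// Merge each track with its author's details by matching author_id.
const mergedTracks = tracks.map((track) => {
  const author = authors.find((a) => a.author_id === track.author_id);
  return { ...track, ...author };
});

console.log(mergedTracks);
```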
By merging the two arrays, we create a unified view of our music tracks, integrating both track details and author information, which will serve as a foundation for our recommendation system.
To offer personalized recommendations, we need to simulate user preferences. In JavaScript, we can represent a user's profile as a plain object, quantifying their genre preferences and listening behavior.
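One simple way to sketch this is a plain object mapping genres to preference scores; the exact numbers below are assumptions, chosen to match the worked example later in this lesson:

```javascript
// Simulated user profile: higher numbers mean stronger preference.
const userProfile = {
  rock: 5, // strongest preference
  pop: 4,
  jazz: 2, // moderate affinity
};
```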
Here, we've created a simple userProfile indicating that our hypothetical user enjoys rock the most, followed by pop, and has a moderate affinity for jazz. This profile will be used to tailor recommendations to their tastes.
Next, let's map music genres into numerical vectors and compute genre similarities. In JavaScript, we can use arrays to represent these vectors. We'll use one-hot encoding, where each genre is represented by an array with a 1 in the position that corresponds to the specific genre and 0s elsewhere.
Here's how we can define this mapping:
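A minimal version, assuming a three-genre catalog in the fixed order [rock, pop, jazz]:

```javascript
// One-hot encoding: each position corresponds to [rock, pop, jazz].
const genreVectors = {
  rock: [1, 0, 0],
  pop:  [0, 1, 0],
  jazz: [0, 0, 1],
};
```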
Each genre has a distinct and orthogonal representation, which is useful for calculating similarities.
Now, let's build the code to calculate similarities between the user profile and songs. We'll use cosine similarity to measure how closely the user's genre preferences align with each track's genre.
First, let's define a function to calculate cosine similarity between two vectors. To ensure correctness and consistency with the practice section, we'll use a robust implementation:
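Here is one such implementation; as a common convention, it guards against zero-magnitude vectors by returning 0 instead of dividing by zero:

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(vecA, vecB) {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < vecA.length; i++) {
    dot += vecA[i] * vecB[i];
    magA += vecA[i] * vecA[i];
    magB += vecB[i] * vecB[i];
  }
  // A zero vector has no direction, so define its similarity as 0.
  if (magA === 0 || magB === 0) return 0;
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}
```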
Next, let's map each track's genre to its vector and create a user genre preference vector:
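This sketch builds on the userProfile, genreVectors, and mergedTracks defined in the earlier snippets, whose names and field layout are assumptions:

```javascript
// User genre preference vector in [rock, pop, jazz] order.
const userVector = [userProfile.rock, userProfile.pop, userProfile.jazz];

// Attach a similarity score to each merged track based on its genre.
const scoredTracks = mergedTracks.map((track) => ({
  ...track,
  similarity: cosineSimilarity(userVector, genreVectors[track.genre]),
}));

// Sort descending so the most relevant tracks come first.
scoredTracks.sort((a, b) => b.similarity - a.similarity);
console.log(scoredTracks);
```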
What cosineSimilarity actually does (worked example)
- Intuition: Think of each vector as an arrow from the origin. Cosine similarity is the cosine of the angle between these arrows. If they point in the same direction, the cosine is 1 (perfect match). If they are orthogonal (unrelated), the cosine is 0. With non-negative features like ours, values lie between 0 and 1.
- Step-by-step with our user and one-hot genres:
- User preferences vector: U = [5, 4, 2] (Rock, Pop, Jazz)
- Rock track vector: R = [1, 0, 0]
- Pop track vector: P = [0, 1, 0]
- Jazz track vector: J = [0, 0, 1]
Compute once:
- Dot products with user:
- U·R = 5*1 + 4*0 + 2*0 = 5
- U·P = 5*0 + 4*1 + 2*0 = 4
- U·J = 5*0 + 4*0 + 2*1 = 2
- Magnitudes:
- |U| = sqrt(5^2 + 4^2 + 2^2) = sqrt(45) ≈ 6.708
- |R| = |P| = |J| = 1
- Resulting cosine similarities:
- cos(U, R) = 5 / 6.708 ≈ 0.745
- cos(U, P) = 4 / 6.708 ≈ 0.596
- cos(U, J) = 2 / 6.708 ≈ 0.298
As expected, the rock track scores highest, matching the user's strongest preference.
- Scale invariance: The dot product grows with the magnitude of vectors. If a heavy user doubles all their preference numbers (e.g., from [5,4,2] to [10,8,4]), dot products double even though the relative tastes didn't change. Cosine similarity divides by the magnitudes, so it only captures direction (the proportions across genres), not how large the numbers are (see the quick check after this list).
- Fair comparison across users: Two users with the same relative preferences but different activity levels should receive the same similarity scores. Cosine achieves that; raw dot products do not.
- Works with mixed feature scales: In richer item vectors (beyond one-hot), some features can have larger ranges. Cosine dampens the dominance of large-scale features by normalizing magnitudes.
- In our one-hot setup: The dot product between a user vector and a track's one-hot genre equals the user's raw preference for that genre (e.g., Rock → 5). Cosine similarity divides that by the user vector's length, yielding a comparable score across users. This makes ranking robust when users have very different absolute counts or scales.
- Angle-based intuition: Cosine measures how aligned two profiles are. Dot product conflates alignment with length (activity or popularity), which can bias recommendations toward users or items with big numbers.
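To see scale invariance concretely, here is a quick check using the cosineSimilarity function defined above; the doubled vector stands in for a hypothetical heavier-usage version of the same user:

```javascript
const rockVector = [1, 0, 0];

// Doubling every preference leaves cosine similarity unchanged,
// because the dot product and the magnitude double together.
console.log(cosineSimilarity([5, 4, 2], rockVector));  // ≈ 0.745
console.log(cosineSimilarity([10, 8, 4], rockVector)); // ≈ 0.745

// The raw dot products, by contrast, differ: 5 vs. 10.
```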
When building more advanced recommendation models, especially those that use multiple numeric features (like likes, clicks, full_listens, author_listeners, and similarity), it's important to standardize these features. Standardization (also called z-score normalization) transforms each feature so that it has a mean of 0 and a standard deviation of 1.
This is important because features with larger numeric ranges can dominate the model, making it harder for the model to learn from features with smaller ranges. Standardization puts all features on the same scale.
The formula for standardization is:

z = (x - mean) / standardDeviation

where x is a raw value, mean is the feature's average, and standardDeviation measures how spread out the values are.
In JavaScript, you can compute the mean and standard deviation for each feature, then apply the formula to each value.
Example:
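A minimal sketch for a single feature (the likes values are hypothetical):

```javascript
// Standardize one numeric feature (here: likes) across all tracks.
const likes = [120, 340, 45];

const mean = likes.reduce((sum, v) => sum + v, 0) / likes.length;
const std = Math.sqrt(
  likes.reduce((sum, v) => sum + (v - mean) ** 2, 0) / likes.length
);

// z = (x - mean) / std gives the feature mean 0 and standard deviation 1.
const standardizedLikes = likes.map((v) => (v - mean) / std);
console.log(standardizedLikes);
```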
You would repeat this for each numeric feature before using them in a model.
To make more accurate recommendations, we can use linear regression. Linear regression is a simple machine learning technique that models the relationship between a set of input features (like likes, clicks, full_listens, author_listeners, and similarity) and a target value (such as a user's rating for a song).
The linear regression model predicts the target as a weighted sum of the input features:

predicted_rating = w0 + w1*x1 + w2*x2 + ... + wn*xn

Where:

- w0 is the intercept (bias term)
- w1, w2, ..., wn are the weights (coefficients) for each feature
- x1, x2, ..., xn are the (standardized) feature values
The goal is to find the weights that minimize the difference between the predicted and actual ratings.
In practice, we often solve for the weights using matrix operations. When the number of features is small and the data is not too large, we can use the pseudo-inverse of the feature matrix to find the best weights. This is a standard approach in linear algebra and is especially useful when the feature matrix is not square or is not invertible.
The formula is:

w = pinv(X) * y

Where:

- X is the matrix of standardized features (with a column of 1s for the intercept)
- y is the vector of target ratings
- pinv(X) is the pseudo-inverse of X
In JavaScript, using the ml-matrix library, you can compute the pseudo-inverse directly with the exported pseudoInverse() function. It uses SVD (Singular Value Decomposition) under the hood, so you don't need to call SVD directly.
Below is a minimal end-to-end example that:
- Builds a feature matrix from our tracks (including the cosine similarity we computed),
- Standardizes each feature column,
- Solves for weights using the pseudo-inverse,
- Uses the weights to make a prediction.
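Here is one way to put this together with ml-matrix; the raw feature values and ratings are hypothetical placeholders, and with so few rows the solution is purely illustrative of the mechanics:

```javascript
import { Matrix, pseudoInverse } from 'ml-matrix';

// Hypothetical raw feature rows, one per track:
// [likes, clicks, full_listens, author_listeners, similarity]
const rawFeatures = [
  [120, 300, 80, 50000, 0.745],
  [340, 900, 210, 120000, 0.596],
  [45, 150, 30, 30000, 0.298],
];

// Hypothetical target ratings, one per track.
const ratings = [4.5, 3.8, 2.1];

// Compute per-column means and standard deviations for standardization.
const numCols = rawFeatures[0].length;
const means = [];
const stds = [];
for (let j = 0; j < numCols; j++) {
  const column = rawFeatures.map((row) => row[j]);
  const mean = column.reduce((s, v) => s + v, 0) / column.length;
  const std = Math.sqrt(
    column.reduce((s, v) => s + (v - mean) ** 2, 0) / column.length
  );
  means.push(mean);
  stds.push(std);
}

// Standardize a row: z = (x - mean) / std (guarding against std = 0).
const standardize = (row) =>
  row.map((v, j) => (stds[j] === 0 ? 0 : (v - means[j]) / stds[j]));

// Design matrix X with a leading column of 1s for the intercept.
const X = new Matrix(rawFeatures.map((row) => [1, ...standardize(row)]));
const y = Matrix.columnVector(ratings);

// Solve w = pinv(X) * y.
const w = pseudoInverse(X).mmul(y);

// Predict a rating for a new track: reuse the SAME means and stds.
const newTrackRaw = [200, 500, 120, 80000, 0.6]; // hypothetical values
const newRow = [1, ...standardize(newTrackRaw)];
const predicted = newRow.reduce((sum, v, i) => sum + v * w.get(i, 0), 0);

console.log('weights:', w.to1DArray());
console.log('predicted rating:', predicted);
```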
- X: The design matrix (rows = tracks, columns = standardized features; the first column is all 1s for the intercept).
- y: The target ratings vector (one rating per track).
- w: The learned weights. w[0] is the intercept; the rest weight each standardized feature.
- Standardization: We compute mean and standard deviation on the training columns, and use them for both training rows and future predictions to keep scales consistent.
This approach allows you to combine multiple features (including similarity scores) into a single, data-driven recommendation score.
In this lesson, you've successfully integrated more advanced content-based recommendation concepts, from simulating user preferences to calculating track similarities using cosineSimilarity. You've also learned about feature standardization and the basics of linear regression using the pseudo-inverse, which is a powerful tool for building more accurate and flexible recommendation systems.
As you move on to practice exercises, you'll apply these techniques using real JavaScript libraries and data, building on the theoretical foundation established here. This practical experience will consolidate your understanding and proficiency, enabling you to build sophisticated content-based recommendation systems independently.
