Collaborative filtering

  • Collaborative (Social) filtering
  • Leverage ratings of a user u + other users in the system
  • Key Idea: if users $u$ and $v$ have rated items similarly in the past, and $v$ has rated an item that $u$ has not, then $u$'s rating for that item is likely to be close to $v$'s
  • Content of items no longer needed!
    • Content may be a bad indicator depending on the domain/circumstances

Methodology

  • Neighbourhood-based
    • Use stored ratings directly
    • Nearest neighbours
  • Model-based
    • Learn a predictive model of user-item interactions (e.g. latent factors)
    • Predict new/incomplete ratings using the trained model

What is being recommended

  • User based
    • Use other users to infer ratings of an item for a user
    • 'Neighbours' - users who have similar ratings
  • Item-based
    • Use ratings of similar items (that user has rated) to predict the rating for a given item

Types of ratings:

  • Explicit Ratings
    • Like/Dislike; IMDb ratings
    • Might not be available
  • Implicit
    • Time spent on web-page; Clicks
    • More available, but intention is unclear (suffers from biases and ambiguity)

Neighbourhood-based recommenders

Assume explicit ratings and a rating matrix that is not too sparse.

$$ \begin{array}{c||c|c|c|c|c} \hline & \begin{array}{c} \text { The } \\ \text { Matrix } \end{array} & \text { Titanic } & \begin{array}{c} \text { Die } \\ \text { Hard } \end{array} & \begin{array}{c} \text { Forrest } \\ \text { Gump } \end{array} & \text { Wall-E } \\ \hline \hline \text { John } & 5 & 1 & & 2 & 2 \\ \text { Lucy } & 1 & 5 & 2 & 5 & 5 \\ \text { Eric } & 2 & ? & 3 & 5 & 4 \\ \text { Diane } & 4 & 3 & 5 & 3 & \\ \hline \end{array} $$
  • Eric and Lucy have similar tastes
  • Eric and Diane have different tastes
  • What should the rating be for Eric for Titanic?
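To make "similar tastes" concrete, one common choice (an assumption here; the notes have not fixed a similarity measure at this point) is cosine similarity computed over the items two users have both rated:

```python
import numpy as np

# Rows of the rating table above; None marks a missing rating.
ratings = {
    "John":  [5, 1, None, 2, 2],
    "Lucy":  [1, 5, 2, 5, 5],
    "Eric":  [2, None, 3, 5, 4],
    "Diane": [4, 3, 5, 3, None],
}

def cosine_sim(u, v):
    """Cosine similarity over the items both users have rated."""
    common = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    a = np.array([c[0] for c in common], dtype=float)
    b = np.array([c[1] for c in common], dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(round(cosine_sim(ratings["Eric"], ratings["Lucy"]), 2))   # 0.97
print(round(cosine_sim(ratings["Eric"], ratings["Diane"]), 2))  # 0.87
```

Both values come out high because raw ratings are all positive, but Lucy is still closer to Eric than Diane is; the per-user normalisation discussed later spreads these similarities out.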

User representation: a rating vector (one dimension per item).
Consider the k-nearest neighbours of user $u$ who have rated item $i$: $\mathcal{N}_{i}(u)$

$$ r_{u i}=\frac{1}{\left|\mathcal{N}_{i}(u)\right|} \sum_{v \in \mathcal{N}_{i}(u)} r_{v i} $$

  • But Lucy is more similar to Eric than Diane is, yet the simple average weights them equally

Consider the similarity

$$ r_{u i}=\frac{\sum_{v \in \mathcal{N}_{i}(u)} w_{u v} r_{v i}}{\sum_{v \in \mathcal{N}_{i}(u)}\left|w_{u v}\right|} $$

Given similarities $\langle$ Eric, Lucy $\rangle = 0.75$ and $\langle$ Eric, Diane $\rangle = 0.15$:
$$ r=\frac{0.75 \times 5+0.15 \times 3}{0.75+0.15} \approx 4.67 $$
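As a sketch, the weighted prediction above can be written directly (neighbour ratings and similarities are the example values from the text):

```python
def predict(neighbour_ratings, similarities):
    """Similarity-weighted average of neighbours' ratings for one item."""
    numerator = sum(w * r for w, r in zip(similarities, neighbour_ratings))
    denominator = sum(abs(w) for w in similarities)
    return numerator / denominator

# Lucy (rating 5, similarity 0.75) and Diane (rating 3, similarity 0.15)
print(round(predict([5, 3], [0.75, 0.15]), 2))  # 4.67
```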

  • Users may use rating values differently: a '5' from John (who always rates $1/2/5$) might not mean the same as a '5' from Diane (who rates $3/4/5$)
  • Solution: normalise the ratings per user, e.g. mean centering or z-score normalization
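A minimal sketch of both normalisation schemes, using NumPy's nan-aware reductions to skip unrated items:

```python
import numpy as np

def mean_center(ratings):
    """Subtract the user's mean rating; NaN marks unrated items."""
    r = np.asarray(ratings, dtype=float)
    return r - np.nanmean(r)

def z_score(ratings):
    """Mean-center, then divide by the user's rating standard deviation."""
    r = np.asarray(ratings, dtype=float)
    return (r - np.nanmean(r)) / np.nanstd(r)

john = [5, 1, np.nan, 2, 2]   # John's row from the table (mean 2.5)
print(mean_center(john))      # [2.5, -1.5, nan, -0.5, -0.5]
```

After mean centering, John's '5' and Diane's '5' are compared as deviations from each user's own average rather than as raw values.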

Other problems:

  • Limited coverage - users can be neighbours only if they have rated items in common
  • Sparsity makes this worse as the number of items increases!

Matrix Factorization

  • Attempt to solve the sparsity/coverage problems by projecting user/item vectors into a dense latent space, either by
    • Decomposing the rating matrix
    • Decomposing the similarity matrix

Given $R$, a $|\mathscr{U}| \times |\mathcal{I}|$ rating matrix of rank $n$

  • Approximate it by a rank-$k$ ($k<n$) factorization $\hat{R}=P Q^{T}$, as in truncated Singular Value Decomposition (SVD)
  • $P$ is a $|\mathscr{U}| \times k$ matrix of user factors
  • $Q$ is a $|\mathcal{I}| \times k$ matrix of item factors
    $$ \operatorname{err}(P, Q)=\left\|R-P Q^{T}\right\|_{F}^{2}=\sum_{u, i}\left(r_{u i}-p_{u} q_{i}^{T}\right)^{2} $$
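Plain SVD is undefined when entries are missing, so in practice the error above is minimised over observed entries only, e.g. by stochastic gradient descent. A minimal sketch (hyperparameters `k`, `steps`, `lr` are illustrative choices, not from the notes):

```python
import numpy as np

def factorize(R, k=2, steps=2000, lr=0.01, seed=0):
    """Fit R ≈ P @ Q.T by gradient steps on observed (non-NaN) entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors
    observed = [(u, i) for u in range(n_users) for i in range(n_items)
                if not np.isnan(R[u, i])]
    for _ in range(steps):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]   # r_ui - p_u q_i^T
            p_u = P[u].copy()
            P[u] += lr * err * Q[i]       # gradient step on user factors
            Q[i] += lr * err * p_u        # gradient step on item factors
    return P, Q

# The rating matrix from the table above; NaN marks missing entries
R = np.array([[5, 1, np.nan, 2, 2],
              [1, 5, 2, 5, 5],
              [2, np.nan, 3, 5, 4],
              [4, 3, 5, 3, np.nan]])
P, Q = factorize(R)
print(P[2] @ Q[1])   # predicted rating for Eric on Titanic
```

The trained factors fill in the missing cells: every user-item pair gets a prediction $p_u q_i^T$, including pairs with no overlap of rated items, which is how factorization addresses the coverage problem above.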

Advantages and Disadvantages

Pros:

  • No feature engineering required—works directly with user-item interactions
  • Content-independent—doesn't need item descriptions or metadata
  • Discovers unexpected connections—can recommend items that seem unrelated but appeal to similar users
  • Addresses filter bubbles by introducing diversity beyond content similarity

Cons:

  • Sparsity problem—struggles when user-item interaction data is sparse
  • Cold start problem—cannot recommend to new users or new items without interaction history
  • Popularity bias—tends to favor popular items over niche ones
  • Limited help for users with unique preferences—performs poorly for outlier taste profiles