Information Retrieval

Collaborative filtering

Predict user preferences without content features. Leverage patterns in user-item interaction matrices to recommend items.
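
As a minimal sketch of the idea, the following scores unseen items by their co-occurrence with items the user already interacted with (item-based collaborative filtering on implicit feedback). The `recommend` function, the `interactions` layout, and the toy data are illustrative assumptions, not a reference implementation.

```python
import math
from collections import defaultdict

def recommend(interactions, user):
    """Rank unseen items by summed item-item cosine similarity (binary feedback)."""
    # Invert user -> items into item -> users.
    users_of = defaultdict(set)
    for u, items in interactions.items():
        for item in items:
            users_of[item].add(u)

    def sim(i, j):
        # Cosine similarity between two binary "who interacted" vectors.
        overlap = len(users_of[i] & users_of[j])
        return overlap / math.sqrt(len(users_of[i]) * len(users_of[j]))

    seen = interactions[user]
    scores = {}
    for candidate in list(users_of):
        if candidate in seen:
            continue
        score = sum(sim(candidate, item) for item in seen)
        if score > 0:
            scores[candidate] = score
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: no item content is used, only who touched what.
interactions = {
    "ann": {"a", "b"},
    "bob": {"a", "b", "c"},
    "cat": {"c", "d"},
    "dan": {"d"},
}
print(recommend(interactions, "ann"))
```

Item "c" is recommended to "ann" only because "bob", who shares her history, also interacted with it.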

Learning to Rank

Pointwise scoring doesn't optimize for ranking quality. Learn to directly optimize document ordering for search relevance.
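
To illustrate the mismatch, here is a small sketch comparing a pointwise loss (mean squared error) with a ranking metric (NDCG, using the common exponential gain and log2 discount) on made-up predictions. The labels and prediction values are assumptions chosen for illustration.

```python
import math

def mse(preds, labels):
    """Pointwise squared error between predicted and true relevance."""
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(labels)

def dcg(labels_in_rank_order):
    """Discounted cumulative gain with gain 2^rel - 1 and log2 discount."""
    return sum((2 ** y - 1) / math.log2(pos + 2)
               for pos, y in enumerate(labels_in_rank_order))

def ndcg(preds, labels):
    """NDCG of the ordering induced by sorting documents by predicted score."""
    order = sorted(range(len(preds)), key=lambda i: -preds[i])
    ideal = dcg(sorted(labels, reverse=True))
    return dcg([labels[i] for i in order]) / ideal if ideal > 0 else 0.0

labels = [3, 2, 0]
close_but_misordered = [2.4, 2.6, 0.0]   # small errors, top two swapped
far_but_well_ordered = [5.0, 3.0, 1.0]   # large errors, perfect order

for name, preds in [("close", close_but_misordered), ("far", far_but_well_ordered)]:
    print(name, round(mse(preds, labels), 3), round(ndcg(preds, labels), 3))
```

The first prediction set has the lower squared error but the worse ranking, which is the gap that learning to rank targets.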

BM25

Raw TF-IDF doesn't normalize for document length or saturate term frequency. Use probabilistic term weighting with a saturating term-frequency component and length normalization.
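
A minimal sketch of BM25 scoring over tokenized documents follows. The parameter defaults `k1=1.2` and `b=0.75` are common choices rather than values from this note, and the IDF uses the non-negative variant `log(1 + (N - n + 0.5) / (n + 0.5))`; other variants exist.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are penalized.
        length_norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating term frequency: bounded above by (k1 + 1).
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

docs = [
    ["cat", "sat"],
    ["cat", "sat", "on", "the", "mat", "today"],
    ["dog", "ran"],
]
print(bm25_scores(["cat"], docs))
```

With the same term frequency, the shorter document scores higher, and the document without the query term scores zero.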

Counterfactual Evaluation and LTR

Deploying a new ranking policy just to evaluate it is costly and risky. Use logged interaction data with importance weighting to estimate performance offline.
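
The importance-weighting idea can be sketched with an inverse propensity scoring (IPS) estimator. The log format `(context, action, reward, logged_prob)` and the `always_a` policy are hypothetical, chosen only to keep the example small.

```python
def ips_estimate(logs, new_policy_prob):
    """Estimate the new policy's expected reward from logged data via IPS.

    Each log entry is (context, action, reward, logged_prob), where
    logged_prob is the probability the logging policy gave to that action.
    """
    total = 0.0
    for context, action, reward, logged_prob in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action.
        weight = new_policy_prob(context, action) / logged_prob
        total += weight * reward
    return total / len(logs)

# Hypothetical logs from a uniform-random logging policy over actions A and B.
logs = [
    ("q", "A", 1.0, 0.5),
    ("q", "B", 0.0, 0.5),
    ("q", "A", 1.0, 0.5),
    ("q", "B", 0.0, 0.5),
]

def always_a(context, action):
    """Candidate policy that deterministically picks action A."""
    return 1.0 if action == "A" else 0.0

print(ips_estimate(logs, always_a))
```

The estimate is unbiased only if the logging policy gives nonzero probability to every action the new policy can take; in practice the weights are often clipped to control variance.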

Online Evaluation and LTR

Offline metrics may not reflect real user satisfaction. Use interleaving or A/B tests to evaluate ranking quality from live user behavior.
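
As one possible illustration of interleaving, here is a sketch of team-draft interleaving: two rankers take turns contributing their best unused document, and clicks are credited to the team that contributed the clicked document. Function names and the toy rankings are assumptions.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng):
    """Merge two rankings; return the merged list and each document's team."""
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    interleaved, team_of = [], {}
    total = len(set(ranking_a) | set(ranking_b))
    while len(interleaved) < total:
        # The team with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] != picks["B"]:
            team = "A" if picks["A"] < picks["B"] else "B"
        else:
            team = rng.choice(["A", "B"])
        remaining = [d for d in rankings[team] if d not in team_of]
        if not remaining:
            # This team has nothing new to offer; let the other team pick.
            team = "B" if team == "A" else "A"
            remaining = [d for d in rankings[team] if d not in team_of]
        doc = remaining[0]
        interleaved.append(doc)
        team_of[doc] = team
        picks[team] += 1
    return interleaved, team_of

def credit(team_of, clicked_docs):
    """Count clicks per team; the team with more clicks wins the impression."""
    wins = {"A": 0, "B": 0}
    for doc in clicked_docs:
        wins[team_of[doc]] += 1
    return wins

merged, team_of = team_draft_interleave(
    ["d1", "d2", "d3"], ["d3", "d1", "d4"], random.Random(0))
print(merged, credit(team_of, ["d3", "d4"]))
```

Aggregating these per-impression wins over many queries gives a preference between the two rankers from live behavior.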

LambdaRank

IR metrics like NDCG are non-differentiable with respect to model scores, so they can't be optimized directly by gradient descent. Define implicit gradients (lambdas) that approximate the desired metric-driven update.
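
A sketch of how the lambdas might be computed for one query: each pair with different relevance contributes a RankNet-style pairwise term scaled by the NDCG change from swapping the two documents. The sign convention here is an assumption (positive lambda means "raise this score"), as are the `sigma` default and the exponential gain.

```python
import math

def lambdarank_lambdas(scores, rels, sigma=1.0):
    """Per-document lambdas; positive values push a document up the ranking."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: -scores[i])
    rank = {doc: pos + 1 for pos, doc in enumerate(order)}  # 1-based ranks
    gain = [2 ** r - 1 for r in rels]

    def discount(r):
        return 1.0 / math.log2(r + 1)

    ideal_dcg = sum(g * discount(pos + 1)
                    for pos, g in enumerate(sorted(gain, reverse=True)))
    lambdas = [0.0] * n
    if ideal_dcg == 0:
        return lambdas
    for i in range(n):
        for j in range(n):
            if rels[i] <= rels[j]:
                continue  # only pairs where i should rank above j
            # |change in NDCG| if documents i and j swapped positions.
            delta = abs((gain[i] - gain[j])
                        * (discount(rank[i]) - discount(rank[j]))) / ideal_dcg
            # Pairwise logistic term: large when the pair is misordered.
            push = sigma / (1.0 + math.exp(sigma * (scores[i] - scores[j]))) * delta
            lambdas[i] += push
            lambdas[j] -= push
    return lambdas

print(lambdarank_lambdas(scores=[0.1, 0.9, 0.5], rels=[2, 0, 1]))
```

The most relevant document, currently ranked last, receives the largest upward push, and the lambdas sum to zero across the list.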

ListNet and ListMLE

Pairwise ranking losses don't consider the full list structure. Define a distribution over permutations and optimize list-level likelihood.
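
Both losses can be sketched in a few lines. The ListNet version below uses the top-one approximation (cross-entropy between softmax distributions over labels and scores), and the ListMLE version is the negative log-likelihood of the label-sorted permutation under a Plackett-Luce model. Function names and example values are illustrative.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def listnet_loss(scores, rels):
    """Cross-entropy between top-one distributions of labels and scores."""
    p_true = softmax(rels)
    p_pred = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))

def listmle_loss(scores, rels):
    """Negative Plackett-Luce log-likelihood of the label-sorted permutation."""
    order = sorted(range(len(scores)), key=lambda i: -rels[i])
    loss = 0.0
    for k in range(len(order)):
        # Probability of picking order[k] first among the remaining documents.
        tail = [scores[i] for i in order[k:]]
        m = max(tail)
        log_z = m + math.log(sum(math.exp(s - m) for s in tail))
        loss += log_z - scores[order[k]]
    return loss

rels = [2, 1, 0]
print(listnet_loss([3.0, 2.0, 1.0], rels), listnet_loss([1.0, 2.0, 3.0], rels))
print(listmle_loss([3.0, 2.0, 1.0], rels), listmle_loss([1.0, 2.0, 3.0], rels))
```

Scores that agree with the label ordering give a lower loss under both formulations.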

RankNet

Pointwise scoring ignores relative document ordering. Learn pairwise preferences using a cross-entropy loss over document pairs.
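
The pairwise loss can be sketched as follows, assuming the modeled preference probability is the logistic function of the score difference. `target` is 1.0 when document i should rank above document j; the function names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ranknet_pair_loss(s_i, s_j, target=1.0):
    """Cross-entropy between target preference and P(i ranks above j)."""
    p = sigmoid(s_i - s_j)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def ranknet_pair_grad(s_i, s_j, target=1.0):
    """Gradient of the pair loss w.r.t. s_i (the gradient w.r.t. s_j is its negation)."""
    return sigmoid(s_i - s_j) - target

# Correctly ordered pair vs. misordered pair, with i preferred over j.
print(ranknet_pair_loss(2.0, 0.0), ranknet_pair_loss(0.0, 2.0))
print(ranknet_pair_grad(0.0, 2.0))
```

The gradient is negative for a misordered pair, so a gradient-descent step raises `s_i` and lowers `s_j`.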

Expected Reciprocal Rank

Traditional position-based IR metrics such as DCG assume a document's usefulness is independent of the documents ranked above it. Model user browsing as a cascade where each document's value depends on those above it.
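
A minimal sketch of the metric, assuming graded relevance labels from 0 to `max_grade` and the usual mapping from grade to satisfaction probability, `(2^g - 1) / 2^max_grade`:

```python
def expected_reciprocal_rank(grades, max_grade=4):
    """ERR of a ranked list of relevance grades under the cascade model."""
    err = 0.0
    p_reach = 1.0  # probability the user reaches the current rank
    for rank, grade in enumerate(grades, start=1):
        p_satisfied = (2 ** grade - 1) / 2 ** max_grade
        # The user stops here with probability p_reach * p_satisfied,
        # earning reciprocal-rank utility 1 / rank.
        err += p_reach * p_satisfied / rank
        # Otherwise the user continues to the next document.
        p_reach *= 1.0 - p_satisfied
    return err

print(expected_reciprocal_rank([4, 0, 0]))
print(expected_reciprocal_rank([0, 4]))
print(expected_reciprocal_rank([4, 4]))
```

A second highly relevant document adds little once the first has probably satisfied the user, which is the dependence between positions that the cascade model captures.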