Information Retrieval
Predict user preferences without content features. Leverage patterns in user-item interaction matrices to recommend items.
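This note describes collaborative filtering. A minimal item-based sketch on a toy user-item rating matrix (all names and numbers here are illustrative): score an unrated item for a user as a similarity-weighted average of the user's other ratings, using cosine similarity between item columns.

```python
import math

# Toy user-item rating matrix: rows = users, columns = items, 0 = unrated.
ratings = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
]

def cosine(a, b):
    """Cosine similarity between two item columns (0 = unrated)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def predict(user, item):
    """Predict a rating as a similarity-weighted average of the user's
    ratings on other items (item-based collaborative filtering)."""
    col = [row[item] for row in ratings]
    num = den = 0.0
    for j, r in enumerate(ratings[user]):
        if j == item or r == 0:
            continue
        s = cosine(col, [row[j] for row in ratings])
        num += s * r
        den += abs(s)
    return num / den if den else 0.0

# Recommend user 0's highest-scoring unrated item.
unrated = [j for j, r in enumerate(ratings[0]) if r == 0]
best = max(unrated, key=lambda j: predict(0, j))
```

Note that no content features are used anywhere: similarity comes purely from co-rating patterns in the interaction matrix.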
Pointwise scoring doesn't optimize for ranking quality. Learn to directly optimize document ordering for search relevance.
TF-IDF doesn't account for document length or term frequency saturation. Use probabilistic term weighting with length normalization.
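This note describes BM25-style scoring. A self-contained sketch (toy tokenized corpus; `k1` and `b` take their conventional default values): `k1` caps the contribution of repeated terms, and `b` normalizes by document length relative to the corpus average.

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """BM25 score of one tokenized document for a query. k1 controls
    term-frequency saturation; b controls length normalization."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        tf = doc.count(term)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["the", "cat", "sat"],
          ["the", "dog", "ran", "far"],
          ["a", "cat", "and", "a", "cat"]]
```

Unlike raw TF-IDF, doubling a term's count here yields a diminishing score increase, and long documents are not rewarded merely for containing more tokens.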
Can't deploy a new ranking policy just to evaluate it. Use logged interaction data with importance weighting to estimate performance offline.
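This is off-policy (counterfactual) evaluation; the simplest importance-weighting estimator is inverse propensity scoring (IPS). A toy sketch, with made-up log entries and a hypothetical deterministic target policy:

```python
# Logged bandit-style data: (action shown, observed reward,
# probability the logging policy assigned to that action).
logs = [
    ("doc_a", 1.0, 0.5),
    ("doc_b", 0.0, 0.3),
    ("doc_a", 1.0, 0.5),
    ("doc_c", 0.0, 0.2),
]

def ips_estimate(target_prob, logs):
    """Estimate the target policy's expected reward from logged data by
    reweighting each reward with target_prob(action) / logged propensity."""
    return sum(r * target_prob(a) / p for a, r, p in logs) / len(logs)

# Hypothetical target policy that always shows doc_a.
always_a = lambda a: 1.0 if a == "doc_a" else 0.0
```

The estimator is unbiased when the logged propensities are correct and the target policy never takes an action the logging policy had zero probability of taking; in practice the weights are often clipped or self-normalized to control variance.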
Offline metrics may not reflect real user satisfaction. Use interleaving or A/B tests to evaluate ranking quality from live user behavior.
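One standard online comparison is team-draft interleaving: merge two rankers' result lists into a single list shown to real users, then credit each click to the ranker whose pick was clicked. A sketch (assumes both rankings cover the same document set; names are illustrative):

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng):
    """Team-draft interleaving: rankers alternately pick their highest
    not-yet-shown document; a coin flip decides who picks first each round."""
    team, shown = {}, []
    count_a = count_b = 0
    total = len(set(ranking_a) | set(ranking_b))
    while len(shown) < total:
        pick_a = count_a < count_b or (count_a == count_b and rng.random() < 0.5)
        source = ranking_a if pick_a else ranking_b
        doc = next(d for d in source if d not in team)
        team[doc] = "A" if pick_a else "B"
        shown.append(doc)
        if pick_a:
            count_a += 1
        else:
            count_b += 1
    return shown, team

def credit_clicks(team, clicked):
    """Count clicks credited to each ranker's team."""
    wins_a = sum(1 for d in clicked if team[d] == "A")
    wins_b = sum(1 for d in clicked if team[d] == "B")
    return wins_a, wins_b

shown, team = team_draft_interleave(["d1", "d2", "d3"],
                                    ["d3", "d1", "d2"],
                                    random.Random(0))
```

Over many impressions, the ranker whose team collects more clicks is preferred; interleaving is typically far more sample-efficient than splitting traffic in a plain A/B test.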
Directly optimizing IR metrics like NDCG is non-differentiable. Define implicit gradients (lambdas) that approximate the desired metric-driven update.
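This is the LambdaRank idea: keep the pairwise logistic gradient but scale each pair's contribution by the |ΔNDCG| the swap would cause, sidestepping NDCG's non-differentiability. A toy sketch of the lambda computation (sign convention: positive lambda pushes a document's score up):

```python
import math

def delta_ndcg(rels, i, j, ideal_dcg):
    """|Change in NDCG| if the documents at ranks i and j were swapped."""
    gain = lambda r: 2 ** r - 1
    disc = lambda pos: 1.0 / math.log2(pos + 2)
    return abs((gain(rels[i]) - gain(rels[j])) * (disc(i) - disc(j))) / ideal_dcg

def lambdas(scores, rels):
    """Per-document lambda gradients: pairwise logistic gradients, each
    weighted by the NDCG change the corresponding swap would cause."""
    ideal = sum((2 ** r - 1) / math.log2(p + 2)
                for p, r in enumerate(sorted(rels, reverse=True)))
    lam = [0.0] * len(scores)
    for i in range(len(scores)):
        for j in range(len(scores)):
            if rels[i] > rels[j]:  # i should be ranked above j
                rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
                step = rho * delta_ndcg(rels, i, j, ideal)
                lam[i] += step  # push the more relevant doc up
                lam[j] -= step  # push the less relevant doc down
    return lam
```

The lambdas are then used as gradients of the model scores in ordinary gradient-based training (as in LambdaMART), even though no explicit differentiable loss is ever written down.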
Pairwise ranking losses don't consider the full list structure. Define a distribution over permutations and optimize list-level likelihood.
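This describes listwise losses such as ListNet. The full distribution over permutations is intractable, so ListNet uses the top-one approximation: softmax both the true relevances and the model scores into "probability of being ranked first" distributions, then minimize their cross-entropy. A small sketch:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of reals."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def listnet_top1_loss(scores, relevances):
    """ListNet top-one loss: cross-entropy between the top-1 probability
    distributions induced by the true relevances and the model scores."""
    p_true = softmax(relevances)
    p_model = softmax(scores)
    return -sum(pt * math.log(pm) for pt, pm in zip(p_true, p_model))
```

Because the loss couples all documents in the list through the softmax normalizer, it reacts to the whole ordering at once rather than to isolated pairs.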
Pointwise scoring ignores relative document ordering. Learn pairwise preferences using a cross-entropy loss over document pairs.
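This is the RankNet formulation: model P(i beats j) as a logistic function of the score difference and minimize cross-entropy against the known preference. For a pair where i is the more relevant document, the loss reduces to:

```python
import math

def ranknet_pair_loss(s_i, s_j):
    """RankNet cross-entropy loss for a pair where document i is known to
    be more relevant than document j: -log sigmoid(s_i - s_j)."""
    return math.log(1.0 + math.exp(-(s_i - s_j)))
```

The loss is log(2) when the two scores tie and shrinks toward zero as the correct margin grows, so training pressure concentrates on misordered and near-tied pairs.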
Traditional IR metrics assume independent document relevance. Model user browsing as a cascade where each document's value depends on those above it.
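A standard cascade-based metric is Expected Reciprocal Rank (ERR): the user scans top-down, stops at each document with probability proportional to its graded relevance, and each position's value is discounted by the probability that nothing above it already satisfied the user. A sketch:

```python
def expected_reciprocal_rank(relevances, max_grade=4):
    """ERR under a cascade model: sum over ranks of
    P(user reaches rank) * P(stop at rank) * 1/rank."""
    p_continue = 1.0
    err = 0.0
    for rank, rel in enumerate(relevances, start=1):
        p_stop = (2 ** rel - 1) / (2 ** max_grade)  # graded stop probability
        err += p_continue * p_stop / rank
        p_continue *= 1 - p_stop
    return err
```

Unlike DCG-style metrics, a highly relevant document near the top sharply reduces the credit available to everything below it, capturing the diminishing value of redundant results.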