Distribution Shift
Distribution shift refers to any situation where the training data and test data come from different distributions. It is a fundamental challenge in machine learning because most models assume the data are i.i.d. (independent and identically distributed), i.e., that training and test examples are drawn from the same distribution. When that assumption breaks, model performance can degrade significantly, and different types of shift call for different adaptation strategies.
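The degradation is easy to demonstrate. The toy sketch below (all numbers and distributions are illustrative assumptions, not from the text) fits a simple 1-D threshold classifier, then evaluates it on an i.i.d. test set and on a test set whose class-conditional distributions have drifted:

```python
import random
import statistics

random.seed(1)

# Toy model: predict class 1 if x > threshold, where the threshold is
# the midpoint of the two learned class means.
train = [(random.gauss(0, 1), 0) for _ in range(500)] + \
        [(random.gauss(2, 1), 1) for _ in range(500)]
threshold = (statistics.mean(x for x, y in train if y == 0) +
             statistics.mean(x for x, y in train if y == 1)) / 2

def accuracy(data):
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)

# i.i.d. test set: drawn from the same distributions as training.
iid_test = [(random.gauss(0, 1), 0) for _ in range(500)] + \
           [(random.gauss(2, 1), 1) for _ in range(500)]

# Shifted test set: both class-conditional means have moved, so the
# i.i.d. assumption no longer holds and the learned threshold is stale.
shifted = [(random.gauss(1, 1), 0) for _ in range(500)] + \
          [(random.gauss(3, 1), 1) for _ in range(500)]

acc_iid, acc_shifted = accuracy(iid_test), accuracy(shifted)
```

On the i.i.d. set the classifier stays near its training accuracy; on the shifted set the fixed threshold misclassifies far more of the drifted class-0 examples.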
Types of Distribution Shift
Covariate Shift
• P(X) changes, P(Y|X) stays the same
• Input distribution shifts but conditional relationship is stable
• Example: Medical model trained on young patients, tested on elderly patients
Covariate shift is often considered the "easiest" type to handle because the core relationship P(Y|X) remains valid; we only need to account for the different input distribution, typically by importance weighting, i.e., reweighting training examples by p_test(x)/p_train(x).
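Importance weighting can be sketched in a few lines. The setup below is a hypothetical 1-D example where both input densities are known in closed form (in practice the density ratio must be estimated); the stable conditional is y = x²:

```python
import math
import random

random.seed(0)

# Assumed densities: training inputs ~ N(0, 1), test inputs ~ N(1, 1).
def p_train(x):
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

def p_test(x):
    return math.exp(-(x - 1)**2 / 2) / math.sqrt(2 * math.pi)

def label(x):
    return x * x  # stable conditional P(Y|X): y is a fixed function of x

xs = [random.gauss(0, 1) for _ in range(100_000)]  # training inputs
ys = [label(x) for x in xs]

# Importance weights w(x) = p_test(x) / p_train(x)
ws = [p_test(x) / p_train(x) for x in xs]

# Estimate E_test[y] using only training samples:
naive = sum(ys) / len(ys)                              # targets the wrong distribution
weighted = sum(w * y for w, y in zip(ws, ys)) / sum(ws)  # self-normalized reweighting
```

The naive average estimates E[x²] under N(0, 1), which is 1; the reweighted average recovers E[x²] under the test distribution N(1, 1), which is 2, using only training-time data.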
Label Shift (also called Prior Probability Shift)
• P(Y) changes, P(X|Y) stays the same
• The proportion of different classes changes, but what each class "looks like" stays the same
• Example: Fraud detection model trained when fraud rate was 1%, deployed when fraud rate becomes 5%
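Because P(X|Y) is stable, a trained classifier's posteriors can be corrected for the new priors with Bayes' rule alone, with no retraining: p_new(y|x) ∝ p_old(y|x) · p_new(y) / p_old(y). A minimal sketch, using the fraud rates from the example above (the 0.90/0.10 posterior is an illustrative assumption):

```python
def adjust_posterior(posterior, old_priors, new_priors):
    """Rescale class posteriors for a label shift: multiply each class
    probability by the ratio of new to old prior, then renormalize."""
    unnorm = [p * new / old
              for p, old, new in zip(posterior, old_priors, new_priors)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# [legit, fraud] posterior for one case; fraud prior 1% -> 5% at deployment.
posterior = [0.90, 0.10]
adjusted = adjust_posterior(posterior, [0.99, 0.01], [0.95, 0.05])
```

Here the model's 10% fraud score becomes roughly 37% after correction: the same evidence is much stronger when fraud is five times more common.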
Concept Drift
• P(Y|X) changes over time
• The fundamental relationship between inputs and outputs shifts
• Example: Stock prediction model where economic relationships change after a market crash
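Because the input–output relationship itself changes, concept drift is usually handled by monitoring live performance and retraining when it degrades. A toy sketch of a rolling-error monitor (class name, window size, and threshold are illustrative assumptions, not a standard API):

```python
from collections import deque

class DriftMonitor:
    """Flags possible concept drift when the rolling error rate over a
    sliding window exceeds a threshold. A toy sketch: real systems use
    statistical tests such as DDM or ADWIN."""

    def __init__(self, window=50, threshold=0.3):
        self.errors = deque(maxlen=window)
        self.threshold = threshold

    def update(self, prediction, actual):
        self.errors.append(prediction != actual)
        if len(self.errors) == self.errors.maxlen:  # wait for a full window
            return sum(self.errors) / len(self.errors) > self.threshold
        return False

monitor = DriftMonitor(window=50, threshold=0.3)

# Stable regime: the model is right 90% of the time -> no drift flagged.
flags = [monitor.update(i % 10 != 0, True) for i in range(100)]

# After the "crash", accuracy falls to 50% -> drift is flagged once the
# window fills with post-shift errors.
flags += [monitor.update(i % 2 == 0, True) for i in range(100)]
```

Once a flag fires, the usual response is to retrain or fine-tune on recent data, since pre-shift examples now encode the wrong P(Y|X).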
Domain Shift
• More general case where multiple aspects change between domains
• Could involve changes in P(X), P(Y), and P(Y|X)
• Example: Image classifier trained on real photos, tested on abstract paintings
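A practical first step for any of these shifts, and especially for broad domain shift, is simply detecting that deployment inputs no longer look like training inputs. One common check is a two-sample Kolmogorov–Smirnov statistic per feature; the sketch below implements it from scratch on illustrative uniform samples (feature names and ranges are assumptions):

```python
import bisect
import random

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of samples a and b, evaluated at every data point."""
    a, b = sorted(a), sorted(b)
    return max(abs(bisect.bisect_right(a, x) / len(a) -
                   bisect.bisect_right(b, x) / len(b))
               for x in a + b)

random.seed(0)
train_feature  = [random.uniform(0.0, 1.0) for _ in range(2000)]
same_dist      = [random.uniform(0.0, 1.0) for _ in range(2000)]
shifted_domain = [random.uniform(0.5, 1.5) for _ in range(2000)]

low = ks_statistic(train_feature, same_dist)        # small gap: same domain
high = ks_statistic(train_feature, shifted_domain)  # large gap: shifted domain
```

A large statistic (here close to the true gap of 0.5) signals that the input distribution has moved, warranting closer inspection or retraining; in production one would use a tested implementation such as `scipy.stats.ks_2samp`.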