Latent Variable Models
Latent variables are unobserved values that make it easier to model and understand the data. The main reasons we use latent variable models:
- Some values might be naturally unobserved, so latent variable models help us learn from missing/inaccessible data.
- We may want to model the inverse process and infer what these unobserved "latents" are.
- More importantly, they enable us to leverage our prior knowledge when defining a model.
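Concretely, a latent variable model defines the distribution over the observed data $x$ by marginalizing out the latents $z$ (a standard formulation; the notation here is mine, not the lecture's):

$$p(x) = \sum_z p(x \mid z)\, p(z)$$

For continuous latents the sum becomes an integral, $p(x) = \int p(x \mid z)\, p(z)\, dz$.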
Popular latent variable models are Gaussian Mixture Models and Variational Autoencoders.
They are usually learned with approximate algorithms such as Expectation-Maximization (EM).
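As a concrete illustration, here is a minimal from-scratch sketch of EM for a 1-D Gaussian Mixture Model (the toy data and variable names are my own, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: samples from two Gaussian clusters.
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 1.0, 700)])

K = 2
pi = np.full(K, 1.0 / K)              # mixing weights p(z=k)
mu = rng.choice(x, K, replace=False)  # component means, initialized from data
var = np.full(K, x.var())             # component variances

for _ in range(50):
    # E-step: responsibilities r[n, k] = p(z=k | x_n) under current parameters.
    log_p = (-0.5 * np.log(2 * np.pi * var)
             - 0.5 * (x[:, None] - mu) ** 2 / var
             + np.log(pi))
    log_p -= log_p.max(axis=1, keepdims=True)  # subtract row max for stability
    r = np.exp(log_p)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: re-estimate parameters from the soft assignments.
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

print("weights:", pi, "means:", mu, "vars:", var)
```

Each iteration of the loop provably does not decrease the data log-likelihood, which is why EM is the default fit for mixture models.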
Advantages
- By making convenient choices for the latents, we can model much more complex distributions over $x$.
- Without latents, the number of parameters can explode, e.g. in general Boltzmann Machines (as opposed to RBMs).
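A rough counting argument for the second point (a standard illustration, not from the lecture): a fully general distribution over $n$ binary variables needs $2^n - 1$ free parameters, whereas a mixture with $K$ latent components, each fully factorized over the $n$ variables, needs only

$$\underbrace{(K-1)}_{p(z)} + \underbrace{K \cdot n}_{p(x \mid z)} \text{ parameters.}$$

For $n = 100$ and $K = 10$, that is about $10^{30}$ versus roughly a thousand.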
Distributed representations
Latent variable models are closely related to the (hypothesized) notion of distributed representations:
- Distribute the 'representation' of our data over multiple neurons
- Each neuron models a distribution over concepts (colors, shapes, etc.)
- The latent layer learns to encode combinations of patterns for efficiency
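A toy illustration of this combinatorial efficiency (a hypothetical example, not from the lecture): three binary "neurons", one per attribute, cover $2^3 = 8$ concept combinations, where a localist (one-hot) code would need 8 dedicated units.

```python
import itertools

# Hypothetical toy example: 3 binary latent "neurons", one per attribute.
attributes = [("red", "blue"), ("square", "circle"), ("smooth", "rough")]

# A distributed code of n bits covers 2^n combinations; a one-hot code
# would need one dedicated unit per combination instead.
for bits in itertools.product([0, 1], repeat=len(attributes)):
    concept = [pair[b] for pair, b in zip(attributes, bits)]
    print(bits, "->", " ".join(concept))
```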
References
- Stanford CS228 notes on latent variable models: https://ermongroup.github.io/cs228-notes/learning/latent/
- UvA Deep Learning course 2020, Lecture 9.1