Latent Variable Models
Latent variables are unobserved values that make it easier to model and understand the data. The main reasons we use latent variable models:
- Some values might be naturally unobserved, so latent variable models help us learn from missing/inaccessible data.
- We may want to model the inverse process and infer what these unobserved "latents" are.
- More importantly, they enable us to leverage our prior knowledge when defining a model.
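Concretely, a latent variable model defines the distribution over the observed data $x$ by marginalizing out the latents $z$ (a standard formulation; the notation here is mine, not the lecture's):

$$p(x) = \sum_z p(x \mid z)\, p(z)$$

For continuous latents the sum becomes an integral, $p(x) = \int p(x \mid z)\, p(z)\, dz$.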
Popular latent variable models are Gaussian Mixture Models and Variational Autoencoders.
They are usually learned with approximate algorithms such as Expectation-Maximization (EM).
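As a concrete illustration, here is a minimal from-scratch sketch of EM for a 1-D Gaussian Mixture Model (the toy data and variable names are my own, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: samples from two Gaussian clusters.
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 1.0, 700)])

K = 2
pi = np.full(K, 1.0 / K)              # mixing weights p(z=k)
mu = rng.choice(x, K, replace=False)  # component means, initialized from data
var = np.full(K, x.var())             # component variances

for _ in range(50):
    # E-step: responsibilities r[n, k] = p(z=k | x_n) under current parameters.
    log_p = (-0.5 * np.log(2 * np.pi * var)
             - 0.5 * (x[:, None] - mu) ** 2 / var
             + np.log(pi))
    log_p -= log_p.max(axis=1, keepdims=True)  # subtract row max for stability
    r = np.exp(log_p)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: re-estimate parameters from the soft assignments.
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

print("weights:", pi, "means:", mu, "vars:", var)
```

Each iteration of the loop provably does not decrease the data log-likelihood, which is why EM is the default fit for mixture models.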
Advantages
- By making convenient choices for the latents, we can model much more complex distributions over $x$.
- Without latents, the number of parameters can explode, e.g. in general Boltzmann Machines (as opposed to RBMs).
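A rough counting argument for the second point (a standard illustration, not from the lecture): a fully general distribution over $n$ binary variables needs $2^n - 1$ free parameters, whereas a mixture with $K$ latent components, each fully factorized over the $n$ variables, needs only

$$\underbrace{(K-1)}_{p(z)} + \underbrace{K \cdot n}_{p(x \mid z)} \text{ parameters.}$$

For $n = 100$ and $K = 10$, that is about $10^{30}$ versus roughly a thousand.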
Distributed representations
Latent variable models are closely related to the (hypothesized) notion of distributed representations:
- Distribute the 'representation' of our data over multiple neurons
- Each neuron models a distribution over concepts (colors, shapes, etc.)
- The latent layer learns to encode combinations of patterns for efficiency
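A toy illustration of this combinatorial efficiency (a hypothetical example, not from the lecture): three binary "neurons", one per attribute, cover $2^3 = 8$ concept combinations, where a localist (one-hot) code would need 8 dedicated units.

```python
import itertools

# Hypothetical toy example: 3 binary latent "neurons", one per attribute.
attributes = [("red", "blue"), ("square", "circle"), ("smooth", "rough")]

# A distributed code of n bits covers 2^n combinations; a one-hot code
# would need one dedicated unit per combination instead.
for bits in itertools.product([0, 1], repeat=len(attributes)):
    concept = [pair[b] for pair, b in zip(attributes, bits)]
    print(bits, "->", " ".join(concept))
```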
References
- Stanford CS228 notes on latent variable models: https://ermongroup.github.io/cs228-notes/learning/latent/
- UvA Deep Learning course 2020, Lecture 9.1