Probability Theory
Topics
Notes
Linked
Need a computationally tractable distance measure between empirical distributions without requiring density estimation.
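One such sample-based distance is maximum mean discrepancy (MMD), given here as a minimal sketch: it compares kernel means of the two sample sets directly, with no density estimation. The RBF bandwidth `sigma=1.0` is an arbitrary choice for illustration.

```python
import numpy as np

def mmd_rbf(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between sample sets
    x and y using an RBF kernel; works directly on samples."""
    def k(a, b):
        # pairwise squared Euclidean distances between rows of a and b
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
same = mmd_rbf(rng.normal(size=(200, 1)), rng.normal(size=(200, 1)))
diff = mmd_rbf(rng.normal(size=(200, 1)), rng.normal(3.0, 1.0, size=(200, 1)))
# samples from the same distribution yield a much smaller MMD^2
print(same, diff)
```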
Many probability distributions are consistent with known constraints. Choose the one with maximum entropy as the least biased estimate.
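A tiny numerical illustration of the principle: two distributions over a six-sided die both satisfy the constraint E[X] = 3.5, but the uniform one has strictly higher entropy, so maximum entropy selects it as the least biased choice. The skewed distribution is an arbitrary example satisfying the same constraint.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # 0 * log 0 treated as 0
    return -(p * np.log(p)).sum()

faces = np.arange(1, 7)
uniform = np.full(6, 1 / 6)
skewed = np.array([0.3, 0.1, 0.1, 0.1, 0.1, 0.3])  # also has mean 3.5
assert np.isclose(uniform @ faces, 3.5) and np.isclose(skewed @ faces, 3.5)
print(entropy(uniform), entropy(skewed))  # uniform wins
```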
How do you estimate expectations under a target distribution when you only have samples from a different distribution?
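The standard answer is importance sampling: draw from the proposal, then reweight each sample by the density ratio p/q. A minimal sketch with Gaussians chosen for illustration (target N(0,1), proposal N(1, 1.5²)), estimating E_p[x²] = 1:

```python
import numpy as np

def logpdf(x, mu, sigma):
    # log density of N(mu, sigma^2)
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.5, size=200_000)              # samples from proposal q
w = np.exp(logpdf(x, 0.0, 1.0) - logpdf(x, 1.0, 1.5))  # weights p(x)/q(x)
estimate = np.mean(w * x**2)                        # estimate of E_p[x^2]
print(estimate)                                     # close to 1
```

Computing the ratio in log space avoids underflow when densities are small.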
When analytical computation of an expectation is intractable, approximate it by averaging random samples.
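A minimal Monte Carlo sketch: approximate E[cos(X)] for X ~ N(0,1) by a sample average. The exact value, exp(-1/2), is used here only to check the approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=100_000)
mc_estimate = np.cos(samples).mean()   # sample average approximates E[cos X]
print(mc_estimate, np.exp(-0.5))       # estimate vs. exact value
```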
Exact posterior inference is intractable for complex models. Approximate the posterior with a simpler distribution by minimizing KL divergence.
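A toy sketch of the idea, with everything chosen for illustration: the "posterior" p is N(2, 0.5²), the variational family is Gaussians N(mu, sigma²), and the Gaussian-to-Gaussian KL has a closed form, so a simple grid search over (mu, sigma) stands in for the usual gradient-based optimization. Since p is itself in the family, the minimizer recovers p.

```python
import numpy as np

def kl_gauss(mu_q, s_q, mu_p, s_p):
    # closed-form KL( N(mu_q, s_q^2) || N(mu_p, s_p^2) )
    return np.log(s_p / s_q) + (s_q**2 + (mu_q - mu_p) ** 2) / (2 * s_p**2) - 0.5

mus = np.linspace(0, 4, 81)
sigmas = np.linspace(0.1, 2, 77)
kl = kl_gauss(mus[:, None], sigmas[None, :], 2.0, 0.5)  # KL on the full grid
i, j = np.unravel_index(kl.argmin(), kl.shape)
print(mus[i], sigmas[j])   # best variational parameters, near (2.0, 0.5)
```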
KL divergence goes to infinity when the distributions have no support overlap, leaving no usable gradient for optimization.
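A quick discrete check of the failure mode: when q puts zero mass where p has mass, the log-ratio term diverges, so KL(p || q) is infinite, while an overlapping q gives a finite value.

```python
import numpy as np

def kl(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        # convention: terms with p = 0 contribute 0
        terms = np.where(p > 0, p * (np.log(p) - np.log(q)), 0.0)
    return terms.sum()

p = [0.5, 0.5, 0.0, 0.0]
disjoint = [0.0, 0.0, 0.5, 0.5]          # no support overlap with p
overlap = [0.25, 0.25, 0.25, 0.25]
print(kl(p, disjoint), kl(p, overlap))   # inf vs. finite
```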
Proving convexity/concavity properties and establishing bounds such as the non-negativity of KL divergence and the convexity of loss functions.
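As a numerical sanity check of one such bound: KL non-negativity follows from Jensen's inequality applied to the convex function -log, since KL(p || q) = E_p[-log(q/p)] >= -log E_p[q/p] = -log 1 = 0. The sketch below verifies the bound on random distribution pairs drawn from a Dirichlet.

```python
import numpy as np

rng = np.random.default_rng(0)

def kl(p, q):
    # both p and q are strictly positive here, so no zero handling needed
    return np.sum(p * np.log(p / q))

vals = [kl(rng.dirichlet(np.ones(5)), rng.dirichlet(np.ones(5)))
        for _ in range(1000)]
min_kl = min(vals)
print(min_kl)   # never drops below zero
```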