Lecture 12 - Parameter Learning of partially observed BNs

Introduction

Topics covered:

Partially-Observed Graphical Models

Partially-Observed Graphical Models

Mixture Models

Unobserved Variables

Challenges in Learning with Latent Variables

Strategy:

  1. Guess the value of $Z$.
  2. Apply Maximum Likelihood Estimation (MLE) to estimate the best model parameters given $Z$.
  3. Infer the most likely $Z$ from the MLE parameter estimates.
  4. Return to step 2; repeat until $Z$ stops changing.
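
As a concrete illustration of this loop, a minimal hard-assignment sketch (in the spirit of K-means) is given below: it alternates MLE of the cluster means given $Z$ with re-inference of $Z$ given those means, and stops when $Z$ no longer changes. The function name `hard_em`, the Euclidean-distance (spherical-cluster) assumption, and the random initialization are illustrative choices, not part of the lecture.

```python
import numpy as np

def hard_em(X, K, max_iters=100, seed=0):
    """Hard-assignment version of the strategy: guess Z, fit parameters
    by MLE given Z, re-infer Z, and repeat until Z stops changing."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    z = rng.integers(K, size=N)                      # step 1: guess Z
    for _ in range(max_iters):
        # Step 2: MLE of each cluster mean given the current Z
        # (an empty cluster is re-seeded from a random data point).
        mu = np.stack([X[z == k].mean(axis=0) if np.any(z == k)
                       else X[rng.integers(N)] for k in range(K)])
        # Step 3: most likely Z under the MLE parameters
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        z_new = dists.argmin(axis=1)
        # Step 4: stop once Z no longer changes
        if np.array_equal(z_new, z):
            break
        z = z_new
    return z, mu
```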

Expectation-Maximization (EM)

Gaussian Mixture Models (GMMs)

Gaussian Mixture Models

EM for GMMs

Algorithm:

  1. Start: Guess the values of the mixing weights $\pi_k$, centroids $\mu_k$, and covariances $\Sigma_k$ of each of the $K$ clusters, then loop over steps 2 and 3 until convergence:
  2. E-Step: Compute the expected values of the sufficient statistics of the hidden variables under the current parameter estimates:

    \[\tau_n^{k(t)} = p\left(z_n^k = 1 \mid x_n, \mu^{(t)}, \Sigma^{(t)}\right) = \frac{\pi_k^{(t)} N(x_n \mid \mu_k^{(t)}, \Sigma_k^{(t)})}{\sum_i \pi_i^{(t)} N(x_n \mid \mu_i^{(t)}, \Sigma_i^{(t)})}\]
  3. M-Step: Using the current expected values of the hidden variables, compute the parameters that maximize the expected complete-data log-likelihood:

    \[\pi_k^{(t+1)} = \frac{\sum_n \tau_n^{k(t)}}{N}\] \[\mu_k^{(t+1)} = \frac{\sum_n \tau_n^{k(t)} x_n}{\sum_n \tau_n^{k(t)}}\] \[\Sigma_k^{(t+1)} = \frac{\sum_n \tau_n^{k(t)} (x_n - \mu_k^{(t+1)})(x_n - \mu_k^{(t+1)})^T}{\sum_n \tau_n^{k(t)}}\]
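
A minimal NumPy/SciPy sketch of these two updates might look as follows; it also re-estimates the mixing weights $\pi_k$ in the M-Step. The function name `em_gmm`, the random initialization, the fixed iteration count, and the small diagonal ridge added to each covariance for numerical stability are illustrative assumptions, not part of the lecture.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iters=50, seed=0):
    """Sketch of EM for a GMM with full covariances, following the
    E-Step / M-Step updates above (responsibilities tau_{nk})."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(K, 1.0 / K)                      # initial mixing weights
    mu = X[rng.choice(N, K, replace=False)]       # initial centroids
    Sigma = np.stack([np.cov(X.T) + 1e-6 * np.eye(D) for _ in range(K)])
    for _ in range(n_iters):
        # E-Step: tau_{nk} proportional to pi_k * N(x_n | mu_k, Sigma_k)
        tau = np.stack([pi[k] * multivariate_normal.pdf(X, mu[k], Sigma[k])
                        for k in range(K)], axis=1)
        tau /= tau.sum(axis=1, keepdims=True)
        # M-Step: weighted MLE of pi_k, mu_k, Sigma_k
        Nk = tau.sum(axis=0)
        pi = Nk / N
        mu = (tau.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = ((tau[:, k, None] * diff).T @ diff) / Nk[k] \
                       + 1e-6 * np.eye(D)
    return pi, mu, Sigma, tau
```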

K-Means vs. EM

Why Does EM Work?

Foreshadowing Variational Inference

Conclusion

Questions?