Lecture 05 - Undirected GMs (MRFs)

Introduction to Undirected GMs

Logistics Review

No class on 2/11

HW2 deadline pushed to 2/11 11:59pm

Quiz in-class on 2/13

The quiz will cover 3 problems drawn from the homework and 2 new problems. All questions are multiple choice. Time: the whole lecture.

Undirected Graphical Models

Definition: UGMs provide a visual representation of the structure of a joint probability distribution, from which conditional independence relationships can be inferred.

Representing Undirected Graphical Models

Clique

Definition: A clique is a fully connected subset of nodes in an undirected graphical model. For $G = \{V, E\}$, a clique is a subgraph over a node subset $V' \subseteq V$ such that every pair of nodes in $V'$ is joined by an edge (Figure 3).
A maximal clique is a clique that cannot be extended by adding any further node without breaking the full-connectivity condition.
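These definitions can be made concrete with a small brute-force enumeration; this is only an illustrative sketch (the adjacency-dict representation and function names are our own, and the exhaustive search is only sensible for tiny graphs):

```python
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of the given nodes is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def maximal_cliques(adj):
    """Brute-force enumeration of maximal cliques (fine for tiny graphs)."""
    cliques = [set(c)
               for r in range(1, len(adj) + 1)
               for c in combinations(adj, r)
               if is_clique(adj, c)]
    # Maximal = not a strict subset of any other clique.
    return [c for c in cliques if not any(c < d for d in cliques)]

# Triangle A-B-C plus a pendant edge C-D.
adj = {"A": {"B", "C"}, "B": {"A", "C"},
       "C": {"A", "B", "D"}, "D": {"C"}}
print([sorted(c) for c in maximal_cliques(adj)])  # [['C', 'D'], ['A', 'B', 'C']]
```

Note that $\{A, B\}$ is a clique but not maximal, since it extends to $\{A, B, C\}$.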


Interpretation of Clique Potentials (Figure 4)

A clique potential $\psi_c(X_c)$ is a non-negative function over the variables in clique $c$. It scores the compatibility of joint assignments; unlike the factors of a directed model, it is not itself a marginal or conditional probability.

Figure 5
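The product-of-potentials view can be shown on a tiny example. Below is a minimal sketch, assuming a chain $A - B - C$ over binary variables; the potential tables and their numbers are made up for illustration:

```python
from itertools import product

# Chain A - B - C has maximal cliques {A, B} and {B, C}.
# Potential tables over binary variables; the numbers are illustrative
# "compatibility" scores, not probabilities.
psi_AB = {(0, 0): 30.0, (0, 1): 5.0, (1, 0): 1.0, (1, 1): 10.0}
psi_BC = {(0, 0): 100.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 100.0}

def unnormalized(a, b, c):
    """Product of clique potentials for one configuration."""
    return psi_AB[(a, b)] * psi_BC[(b, c)]

# Partition function: sum over all 2^3 configurations.
Z = sum(unnormalized(a, b, c) for a, b, c in product((0, 1), repeat=3))

def p(a, b, c):
    """P(a, b, c) = psi_AB(a, b) * psi_BC(b, c) / Z"""
    return unnormalized(a, b, c) / Z

# Probabilities sum to one; high-potential configurations get more mass.
print(p(0, 0, 0), p(1, 0, 1))
```

Only after dividing by $Z$ do the potential products become probabilities; the individual tables carry no probabilistic meaning on their own.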

Markov Independencies

Global Markov Independencies (Figure 6)

$X_A \perp X_B \mid X_C$ whenever $C$ separates $A$ from $B$ in the graph, i.e. every path from a node in $A$ to a node in $B$ passes through $C$.

Local Markov Independencies (Figure 7)

A node is conditionally independent of all other nodes given its immediate neighbors (its Markov blanket): $X_v \perp X_{V \setminus (\{v\} \cup N(v))} \mid X_{N(v)}$.

Pairwise Markov Independencies (Figure 8)

Any two non-adjacent nodes are conditionally independent given all remaining nodes: $X_u \perp X_v \mid X_{V \setminus \{u, v\}}$ whenever $(u, v) \notin E$.
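The separation test behind these properties is purely mechanical: delete the conditioning set from the graph and check reachability. A minimal sketch (representation and names are our own):

```python
from collections import deque

def separated(adj, xs, zs, ys):
    """sep_H(X; Z | Y): delete the observed nodes Y, then check that no
    path connects X to Z in what remains of the undirected graph."""
    blocked = set(ys)
    seen = set(xs) - blocked
    frontier = deque(seen)
    while frontier:                      # breadth-first search from X
        u = frontier.popleft()
        for v in adj[u]:
            if v not in blocked and v not in seen:
                seen.add(v)
                frontier.append(v)
    return not (seen & set(zs))          # separated iff Z was never reached

# Chain A - B - C: B separates A from C, so X_A is independent of X_C given X_B.
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
print(separated(adj, {"A"}, {"C"}, {"B"}))   # True
print(separated(adj, {"A"}, {"C"}, set()))   # False
```

With an empty conditioning set the path $A - B - C$ is open, so no independence is asserted.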

I-Maps of UG

Definition of I-Map

An undirected graph \( H \) is an I-Map (Independence Map) for a probability distribution \( P \) if:

\[I(H) \subseteq I(P)\]

where:

- \( I(H) \) is the set of independence relationships implied by the graph \( H \).
- \( I(P) \) is the set of independence relationships present in \( P \).

## Gibbs Distribution

A probability distribution \( P \) is a **Gibbs Distribution** on a graph \( H \) if:

$$
P(X) = \frac{1}{Z} \prod_{c \in C} \psi_c(X_c)
$$

where \( C \) is the set of cliques of \( H \) and \( Z \) is the partition function that normalizes the product.

- This means that a **Gibbs distribution** can be decomposed into a product of factors over the cliques of the graph.

### **Theorem:**

If \( P \) is a Gibbs distribution over the graph \( H \), then \( H \) is an **I-Map** of \( P \).

---

## Perfect Maps
Figure 1: Not all structures have a perfect map.
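The figure's claim can be checked numerically. With \( X, Z \) independent fair bits and \( Y = X \oplus Z \) (the classic v-structure \( X \to Y \leftarrow Z \)), \( X \) and \( Z \) are marginally independent but become perfectly dependent once \( Y \) is observed, a pattern no undirected graph over three nodes captures exactly. A small exact computation (the XOR construction is a standard instance; helper names are our own):

```python
from itertools import product

# X and Z are independent fair bits; Y = X XOR Z (v-structure X -> Y <- Z).
joint = {}
for x, z in product((0, 1), repeat=2):
    joint[(x, x ^ z, z)] = 0.25   # states ordered as (x, y, z)

def marg(p, keep):
    """Marginalize a dict-based distribution onto the index positions in keep."""
    out = {}
    for state, pr in p.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + pr
    return out

px, pz, pxz = marg(joint, [0]), marg(joint, [2]), marg(joint, [0, 2])
# Marginal independence: P(x, z) = P(x) P(z) for every (x, z).
print(all(abs(pxz[(x, z)] - px[(x,)] * pz[(z,)]) < 1e-12
          for x, z in product((0, 1), repeat=2)))  # True

# Condition on y = 0 (P(y = 0) = 0.5): only (0,0,0) and (1,0,1) survive,
# so observing y makes x and z perfectly dependent.
cond = {s: pr / 0.5 for s, pr in joint.items() if s[1] == 0}
print(sorted(cond))  # [(0, 0, 0), (1, 0, 1)]
```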
### Definition of Perfect Map

An undirected graph \( H \) is a **Perfect Map** for a probability distribution \( P \) if:

$$
\text{sep}_H(X; Z \mid Y) \iff X \perp Z \mid Y
$$

### **Key Point:**

- Not all distributions have a **perfect map** as a UG.
- **Example: V-Structure \( X \to Y \leftarrow Z \)**:
  - \( X \perp Z \)
  - But \( X \) and \( Z \) are **dependent given \( Y \)**:

$$
\neg (X \perp Z \mid Y)
$$

Since an **undirected graph cannot represent directionality**, it cannot capture this dependency.

---

## Graphical Models and Overlapping Distributions
Figure 2: Overlapping sets of distributions in Directed and Undirected Graphical Models.
- **\( P \)**: The largest set, containing all possible probability distributions.
- **\( D \)**: The set of probability distributions that can be represented by **Directed Graphical Models (DGMs)**, such as Bayesian Networks.
- **\( U \)**: The set of probability distributions that can be represented by **Undirected Graphical Models (UGMs)**, such as Markov Random Fields.
- The **overlap** represents distributions that can be represented by **both** DGMs and UGMs.

---

## Exponential Families in UGMs

### **Energy Representation**

Clique potentials \( \psi_c(X_c) \) can be rewritten using an **energy function**:

$$
\psi_c(X_c) = \exp(-\phi_c(X_c))
$$

This leads to the **Boltzmann Distribution**:

$$
P(X) = \frac{1}{Z} \exp\left(-\sum_{c \in C} \phi_c(X_c)\right)
$$

- In **physics**, this is called the **Boltzmann distribution**.
- In **statistics**, this is known as a **log-linear model**.

---

## MAP Inference as Free Energy Minimization

### **Boltzmann Distribution**

The **Boltzmann distribution** is the distribution that minimizes the free energy for a fixed energy function \( H \):

$$
P_{\text{Boltzmann}}(X) = \arg\min_{P} F(P(X; H))
$$

where:

- \( P(X; H) \) is a probability distribution over configurations \( X \).
- \( F(P) \) is the **free energy**.

### **Free Energy Decomposition**

$$
F(P) = E_P[H(X)] - T S(P)
$$

where:

- \( E_P[H(X)] \) is the **expected energy** under \( P \).
- \( T S(P) \) is the entropy term: temperature \( T \) times the entropy \( S(P) \).

### **MAP Inference and Free Energy**

In probabilistic inference, **MAP estimation** minimizes the free energy:

$$
Q^*(\theta) = \arg\min_{Q} \left[ E_Q(-\log P(X \mid \theta)) - T S(Q) \right]
$$

- The first term represents the expected **negative log-likelihood**.
- The second term **regularizes** the distribution.

---

## **Conclusion**

- **I-Maps** and **Gibbs distributions** define the relationship between UGMs and probability distributions.
- **Perfect Maps** do not always exist due to **directionality limitations** in UGMs.
- **Boltzmann distributions** provide an energy-based formulation of UGMs.
---

## **Next Steps**

- Further explore **Variational Inference**.
- Applications of UGMs in **machine learning and physics**.