Lecture 05 - Undirected GMs (MRFs)
Introduction to Undirected GMs
Logistics Review
No class on 2/11
HW2 deadline pushed to 2/11 11:59pm
Quiz in-class on 2/13
The quiz covers 3 HW problems and 2 new problems. All of the questions are multiple choice. Time: the whole lecture.
Undirected Graphical Models
Definition: UGMs provide a visual representation of the structure of a joint probability distribution from which conditional independence relationships can be inferred.
- Unlike a directed graphical model, UGMs do not imply a causal direction between the variables. UGMs capture correlations between variables. (Figure 1)
- Key features of UGMs:
- Pairwise relationships
- No explicit way to generate samples
- Contingency constraints on node configurations
Figure 1
- UGM Example: Lattice
- A lattice UGM used for spatial inference: each pixel in the image is represented as a node, and each node is connected to its neighboring pixels (Figure 2).
- To infer the yellow pixel, the model classifies it based on its surrounding pixels instead of its absolute location.
Since most of the adjacent pixels belong to “water”, the lattice UGM classifies it as water even though it lies near the top right of the image (a minimal sketch of this neighbor-based scoring follows below).
Figure 2
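To make the neighbor-based classification concrete, here is a minimal sketch (not from the lecture): it scores a pixel’s label by multiplying a hypothetical pairwise “agreement” potential with each neighbor. The labels and potential values are assumptions for illustration only.

```python
# Hypothetical pairwise potential: rewards neighboring pixels sharing a label.
def psi(a, b):
    return 2.0 if a == b else 1.0

def label_scores(neighbor_labels, labels=("water", "land")):
    # Conditioned on its neighbors, P(center = l | neighbors) is proportional
    # to the product of pairwise potentials between l and each neighbor label.
    scores = {l: 1.0 for l in labels}
    for l in labels:
        for n in neighbor_labels:
            scores[l] *= psi(l, n)
    z = sum(scores.values())
    return {l: s / z for l, s in scores.items()}

# Mostly-"water" neighbors make "water" the most probable label.
print(label_scores(["water", "water", "water", "land"]))
# -> {'water': 0.8, 'land': 0.2}
```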
Representing Undirected Graphical Models
- An undirected graphical model represents a distribution $P(X)$ defined by an undirected graph $H$ and a set of positive potential functions $\psi$ associated with the cliques of $H$, such that $P(X_1, \ldots, X_n) = \frac{1}{Z} \prod_c \psi_c (X_c)$
- $Z$ is the partition function (normalization): $Z = \sum_X \prod_c \psi_c (X_c) $
- The potential function $\psi_c(X_c)$ can be understood as a “score” of the joint configuration.
- Unlike probability densities, which must integrate (or sum) to 1, the potential function only provides a score of how favorable a particular configuration is compared to others.
- $P(X)$ is a proper probability distribution because all potential functions are positive and the partition function $Z$ is finite, so dividing by $Z$ normalizes the scores and the probabilities of all configurations sum to 1 (a brute-force sketch follows below).
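As a concrete illustration, here is a minimal sketch, assuming binary variables and a hypothetical three-node chain with one agreement potential per edge, that computes $Z$ by brute-force enumeration and checks that the resulting $P$ sums to 1.

```python
import itertools

# Hypothetical chain x1 - x2 - x3; one pairwise potential per edge (clique).
def psi(a, b):
    return 3.0 if a == b else 1.0  # favors agreeing neighbors

cliques = [(0, 1), (1, 2)]

def score(x):
    # Unnormalized score: the product of clique potentials.
    s = 1.0
    for i, j in cliques:
        s *= psi(x[i], x[j])
    return s

configs = list(itertools.product([0, 1], repeat=3))
Z = sum(score(x) for x in configs)          # partition function
P = {x: score(x) / Z for x in configs}      # normalized distribution

assert abs(sum(P.values()) - 1.0) < 1e-12   # P sums to 1 since Z is finite
print(Z, P[(0, 0, 0)])                      # -> 32.0 0.28125
```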
Clique
Definition: A clique is a fully connected subset of nodes in an undirected graph.
For $G = \{V, E\}$, a clique is a subgraph over a subset of nodes $V' \subseteq V$ such that the nodes in $V'$ are fully connected.
Figure 3
A maximal clique is a clique that cannot be extended by adding more nodes without breaking the full-connectivity condition.
- $\{x_1, x_4\}$ and $\{x_1, x_2, x_3\}$ are maximal cliques.
- The clique $\{x_1, x_2, x_3\}$ is fully connected since each node has edges to the other two. Adding $x_4$ would break the clique condition because $x_4$ is not connected to $x_2$ or $x_3$.
- $\{x_1, x_2\}$, $\{x_2, x_3\}$, $\{x_1, x_3\}$, $\{x_1\}$, $\{x_2\}$, $\{x_3\}$, $\{x_4\}$ are sub-cliques because their nodes are pairwise connected (a sketch that enumerates the maximal cliques follows below).
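As a quick check of the example above, here is a minimal sketch that builds the Figure 3 graph and enumerates its maximal cliques, assuming the networkx library is available.

```python
import networkx as nx

# The Figure 3 graph: a triangle x1-x2-x3 plus the edge x1-x4.
G = nx.Graph()
G.add_edges_from([("x1", "x2"), ("x2", "x3"), ("x1", "x3"), ("x1", "x4")])

# find_cliques yields exactly the maximal cliques: {x1, x2, x3} and {x1, x4}.
for clique in nx.find_cliques(G):
    print(sorted(clique))
```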
Interpretation of Clique Potentials
Figure 4
- The Figure 4 model implies $X_1 \perp X_3 \mid X_2$, so the joint must factorize accordingly as $P(X_1, X_2, X_3) = P(X_2)\, P(X_1 \mid X_2)\, P(X_3 \mid X_2)$.
- In UGMs, we do not use probabilities as the factors; we define the distribution in terms of clique potentials $\psi$, which measure **“compatibility”** rather than probability. We could therefore write the above distribution as $P(X_1, X_2, X_3) = P(X_1, X_2)\, P(X_3 \mid X_2)$ or $P(X_3, X_2)\, P(X_1 \mid X_2)$, but:
- We cannot have all potentials be marginals, e.g. $P(X_1, X_2)\, P(X_2, X_3)$ double-counts the marginal of $X_2$.
- We cannot have all potentials be conditionals, e.g. $P(X_1 \mid X_2)\, P(X_3 \mid X_2)$ leaves the marginal of $X_2$ unspecified.
Figure 5
- Maximal Cliques: Based on Figure 5, the probability distribution of the entire system is written as
$P(A, B, C, D) = \frac{1}{Z}\psi_{ABC}(A, B, C)\, \psi_{BCD}(B, C, D)$
- To make $P(A, B, C, D)$ a valid probability distribution, we normalize it using the partition function $Z = \sum_{A,B,C,D}\psi_{ABC}(A, B, C)\, \psi_{BCD}(B, C, D)$. This ensures the probabilities of all possible configurations sum to 1 (see the sketch below).
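Here is a minimal sketch of this two-clique factorization, assuming binary variables; the specific potential tables are made up for illustration (any positive functions work).

```python
import itertools

# Hypothetical positive potentials over the two maximal cliques of Figure 5.
def psi_ABC(a, b, c):
    return 1.0 + a + b + c

def psi_BCD(b, c, d):
    return 1.0 + 2 * b + c * d

states = list(itertools.product([0, 1], repeat=4))
Z = sum(psi_ABC(a, b, c) * psi_BCD(b, c, d) for a, b, c, d in states)

def P(a, b, c, d):
    # P(A,B,C,D) = (1/Z) * psi_ABC(A,B,C) * psi_BCD(B,C,D)
    return psi_ABC(a, b, c) * psi_BCD(b, c, d) / Z

assert abs(sum(P(*s) for s in states) - 1.0) < 1e-12  # properly normalized
```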
Markov Independencies
Global Markov Independencies
Figure 6
- The undirected graph $H$ in Figure 6 demonstrates that the nodes in $B$ separate $A$ and $C$, meaning that all paths from any node in $A$ to any node in $C$ must pass through $B$. This is written as $\mathrm{sep}_H(A; C \mid B)$.
- If you know the values of $B$, then $A$ provides no additional information about $C$ because all paths are blocked by $B$.
- Global Markov Property: A probability distribution satisfies it if for any disjoint sets $A, B, C$ such that $B$ separates $A$ and $C$, $A$ is independent of $C$ given $B$. This is written as $I(H) = \{A \perp C \mid B : \mathrm{sep}_H(A; C \mid B)\}$ (a separation-test sketch follows below).
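The separation test itself is just graph reachability: delete $B$ and check that no path remains from $A$ to $C$. A minimal sketch, assuming networkx and a made-up example graph:

```python
import networkx as nx

def separates(H, A, C, B):
    # sep_H(A; C | B): with B removed, no path may connect A to C.
    H2 = H.copy()
    H2.remove_nodes_from(B)
    return not any(nx.has_path(H2, a, c) for a in A for c in C)

# Hypothetical graph where B = {3, 4} sits on every path from A = {1, 2} to C = {5}.
H = nx.Graph([(1, 3), (2, 3), (2, 4), (3, 5), (4, 5)])
print(separates(H, A={1, 2}, C={5}, B={3, 4}))  # True, so A ⊥ C | B is in I(H)
```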
Local Markov Independencies
Figure 7
- In Figure 7, let the blue node be $X_i$. There is a unique Markov blanket of $X_i$, denoted $MB_{X_i}$, which is the set of neighbors of $X_i$ in the graph (the red nodes).
- The local Markov independencies in $H$ are: $I_l(H) = \{X_i \perp V \setminus \{X_i\} \setminus MB_{X_i} \mid MB_{X_i} : \forall i\}$
- In other words, $X_i$ is independent of the rest of the nodes in the graph given its immediate neighbors, and this holds for all nodes $i$ in the graph. Once you know the Markov blanket, you do not need any other nodes to determine $X_i$ (see the sketch below).
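Since the graph is undirected, computing a Markov blanket is trivial: it is exactly the neighbor set. A minimal sketch on a made-up graph, assuming networkx:

```python
import networkx as nx

H = nx.Graph([(1, 2), (2, 3), (2, 4), (4, 5)])

def markov_blanket(H, i):
    # In a UGM, MB(X_i) is simply the set of graph neighbors of X_i.
    return set(H.neighbors(i))

mb = markov_blanket(H, 2)        # {1, 3, 4}
rest = set(H.nodes) - {2} - mb   # {5}
print(mb, rest)                  # X_2 is independent of X_5 given {X_1, X_3, X_4}
```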
Pairwise Markov Independencies
Figure 8
- The pairwise Markov independencies associated with $H$ are: $I_p(H) = \{X \perp Y \mid V \setminus \{X, Y\} : \{X, Y\} \notin E\}$
- In Figure 8, we can see that nodes $X_1$ and $X_5$ are not directly connected. By the pairwise Markov independence rule, we can say: $X_1 \perp X_5 \mid \{X_2, X_3, X_4\}$
- This means that once we know $X_2$, $X_3$, and $X_4$, knowing $X_1$ tells us nothing more about $X_5$, because the intermediate nodes block all paths between $X_1$ and $X_5$, ensuring their conditional independence (a brute-force verification sketch follows below).
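To see this numerically, here is a minimal sketch that brute-forces a hypothetical pairwise MRF in which $X_1$ and $X_5$ share no edge (the graph and potential are assumptions, not the exact Figure 8 model) and verifies the conditional independence.

```python
import itertools

# X_1..X_5 as indices 0..4; note there is no (0, 4) edge.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]

def psi(a, b):
    return 2.0 if a == b else 1.0

def score(x):
    s = 1.0
    for i, j in edges:
        s *= psi(x[i], x[j])
    return s

# Condition on X_2, X_3, X_4 = (0, 0, 0) and look at the joint over (X_1, X_5).
rest = (0, 0, 0)
table = {(x1, x5): score((x1, *rest, x5)) for x1 in (0, 1) for x5 in (0, 1)}
norm = sum(table.values())
cond = {k: v / norm for k, v in table.items()}

# The conditional factorizes: P(x1, x5 | rest) = P(x1 | rest) * P(x5 | rest).
for x1 in (0, 1):
    for x5 in (0, 1):
        m1 = sum(cond[(x1, b)] for b in (0, 1))
        m5 = sum(cond[(a, x5)] for a in (0, 1))
        assert abs(cond[(x1, x5)] - m1 * m5) < 1e-12
print(cond)
```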
I-Maps of UG
Definition of I-Map
An undirected graph $H$ is an I-Map (Independence Map) for a probability distribution $P$ if:
\[I(H) \subseteq I(P)\]
where $I(H)$ is the set of conditional independencies implied by $H$ (via separation) and $I(P)$ is the set of conditional independencies that hold in $P$.

