# Mean Field Approximation

## Overview

In mean field theory, a complex probabilistic model is approximated by a set of simpler models, one defined on each vertex. The approximation ignores the interactions within cliques and replaces the effect of each vertex's neighbors with their average effect. Mathematically, it approximates the joint distribution $$p^*(\mathbf{z})$$ by a fully factorized distribution, $$p^*(\mathbf{z}) \approx q(\mathbf{z}) = \prod_{i=1}^d q_i(z_i)$$, where the factors $$q_i$$ are chosen to minimize the KL divergence $$KL(q \| p^*)$$. This note works through the mean field approximation using the Ising model as the running example.

## Formulation

The objective of mean field inference is to minimize the KL divergence over all factors:

$\min_{q_1, \dots, q_d} KL(q \| p^*), \qquad KL(q \| p^*) = KL\left(\prod_{i=1}^d q_i(z_i) \,\Big\|\, p^*\right).$

The minimization proceeds by coordinate updates: optimize one factor $$q_k$$ while holding all other factors fixed, and cycle through the factors until convergence, i.e.,

$\min_{q_k} KL(q \| p^*), \qquad q = \prod_{i=1}^d q_i(z_i).$

Expanding the KL divergence and separating the terms that depend on $$q_k$$ gives

$\begin{aligned} KL\left(\prod_{i=1}^d q_i(z_i) \,\Big\|\, p^*\right) &= \int \prod_{i=1}^d q_i \log \frac{\prod_{i=1}^d q_i}{p^*} \, d \mathbf{z} \\ &= \sum_{i=1}^d \int \log q_i \prod_{j=1}^d q_j \, d \mathbf{z} - \int \prod_{j=1}^d q_j \log p^* \, d \mathbf{z} \\ &= \int \log q_k \prod_{j=1}^d q_j \, d \mathbf{z} + \sum_{i \neq k} \int \log q_i \prod_{j=1}^d q_j \, d \mathbf{z} - \int \prod_{j=1}^d q_j \log p^* \, d \mathbf{z}. \end{aligned}$

The first term simplifies because every factor other than $$q_k$$ integrates to 1:

$\int \log q_k \prod_{j=1}^d q_j \, d \mathbf{z} = \int q_k \log q_k \left( \int \prod_{j \neq k} q_j \, d z_{\neq k} \right) d z_k = \int q_k \log q_k \, d z_k.$

The second term does not depend on $$q_k$$, since $$q_k$$ integrates out of each summand:

$\sum_{i \neq k} \int \log q_i \prod_{j=1}^d q_j \, d \mathbf{z} = const.$

Define

$h(z_k) = \int \prod_{j\neq k} q_j \log p^* \, d z_{\neq k} = E_{q_{-k}}(\log p^*), \qquad t(z_k) = e^{h(z_k)} = \exp\left( E_{q_{-k}}(\log p^*) \right).$

Then

$\begin{aligned} KL(q \| p^*) &= \int q_k \log q_k \, d z_k - \int q_k \int \prod_{j\neq k} q_j \log p^* \, d z_{\neq k} \, d z_k + const \\ &= \int q_k \left( \log q_k - h(z_k) \right) d z_k + const \\ &= \int q_k \log \frac{q_k}{t} \, d z_k + const. \end{aligned}$

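So each optimization step minimizes $$KL(q_k \| t)$$ over $$q_k$$, where $$t$$ is an unnormalized function of $$z_k$$. The minimum is attained when $$q_k \propto t$$, that is,

$\log q_k(z_k) = \log t(z_k) + const = h(z_k) + const = E_{q_{-k}}(\log p^*) + const,$

where the constant is chosen so that $$q_k$$ normalizes to 1.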
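The update rule can be tried on a toy problem. The sketch below assumes just two discrete variables and a strictly positive joint table `p[a][b]`; the function names (`normalize_log`, `mean_field_2d`, `kl_to_joint`) are made up for this illustration. It alternates the update $$\log q_k = E_{q_{-k}}(\log p^*) + const$$ between the two factors.

```python
import math

def normalize_log(log_w):
    """Turn unnormalized log-weights into a probability vector."""
    m = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(v - m) for v in log_w]
    s = sum(w)
    return [v / s for v in w]

def mean_field_2d(p, n_sweeps=50):
    """Coordinate updates for q(z1, z2) = q1(z1) q2(z2).

    p is a joint probability table with all entries > 0 (assumption),
    indexed as p[a][b].
    """
    n1, n2 = len(p), len(p[0])
    q1 = [1.0 / n1] * n1
    q2 = [1.0 / n2] * n2
    for _ in range(n_sweeps):
        # log q1(a) = E_{q2}[log p(a, z2)] + const
        q1 = normalize_log(
            [sum(q2[b] * math.log(p[a][b]) for b in range(n2)) for a in range(n1)]
        )
        # log q2(b) = E_{q1}[log p(z1, b)] + const
        q2 = normalize_log(
            [sum(q1[a] * math.log(p[a][b]) for a in range(n1)) for b in range(n2)]
        )
    return q1, q2

def kl_to_joint(q1, q2, p):
    """KL(q1 q2 || p) for the factorized approximation."""
    total = 0.0
    for a in range(len(q1)):
        for b in range(len(q2)):
            q = q1[a] * q2[b]
            if q > 0.0:
                total += q * math.log(q / p[a][b])
    return total

if __name__ == "__main__":
    p = [[0.30, 0.20], [0.10, 0.40]]  # hypothetical correlated joint
    q1, q2 = mean_field_2d(p)
    print(q1, q2, kl_to_joint(q1, q2, p))
```

When the joint already factorizes, the updates recover its marginals and the KL divergence drops to zero; for a correlated joint, the factorized fit cannot be exact and some divergence remains.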

## Mean Field Inference on Ising Model

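The Ising model is defined by

$p(y) \propto \exp\left(\frac{1}{2} J \sum_i \sum_{j \in Nbr(i)} y_i y_j + \sum_i b_i y_i \right),$

where $$y_i \in \{1, -1 \}$$, $$J$$ is the coupling strength, $$b_i$$ is the bias on vertex $$i$$, and $$Nbr(i)$$ is the set of neighbors of vertex $$i$$. The factor $$\frac{1}{2}$$ compensates for each edge being counted twice in the double sum.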
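For a handful of vertices, the model can be evaluated by brute force, which gives a reference to compare approximations against. The sketch below assumes the graph is given as a symmetric adjacency list `neighbors`; the function names are illustrative only, and the enumeration over all $$2^n$$ configurations is feasible only for tiny $$n$$.

```python
import itertools
import math

def ising_log_potential(y, neighbors, J, b):
    """Unnormalized log-probability:
    0.5 * J * sum_i sum_{j in Nbr(i)} y_i y_j + sum_i b_i y_i.
    """
    n = len(y)
    pair = sum(y[i] * y[j] for i in range(n) for j in neighbors[i])
    field = sum(b[i] * y[i] for i in range(n))
    return 0.5 * J * pair + field

def exact_means(neighbors, J, b):
    """Exact E[y_i] by summing over all 2^n configurations."""
    n = len(b)
    Z = 0.0                 # partition function
    means = [0.0] * n
    for y in itertools.product((1, -1), repeat=n):
        w = math.exp(ising_log_potential(y, neighbors, J, b))
        Z += w
        for i in range(n):
            means[i] += w * y[i]
    return [m / Z for m in means]

if __name__ == "__main__":
    # Hypothetical 3-vertex chain: 0 - 1 - 2
    neighbors = [[1], [0, 2], [1]]
    print(exact_means(neighbors, J=0.5, b=[0.5, -0.2, 0.1]))
```

With $$J = 0$$ the vertices decouple and the exact means reduce to $$\tanh(b_i)$$, which is a convenient sanity check.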

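We approximate this distribution with the mean field form $$p(y) \approx q(y) = \prod_i q_i(y_i)$$. Applying the update rule to the factor $$q_k$$ and absorbing every term that does not involve $$y_k$$ into the constant gives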

$\begin{aligned} \log q_k(y_k) &= E_{q_{-k}} \log p(y) + const \\ &= E_{q_{-k}} \left( J \sum_{j \in Nbr(k)} y_k y_j + b_k y_k \right) + const \\ &= J \sum_{j \in Nbr(k)} y_k E_{q_{-k}}(y_j) + b_k y_k + const \\ &= J y_k \sum_{j \in Nbr(k)} E(y_j) + b_k y_k + const \\ &= y_k \left( J \sum_{j \in Nbr(k)} E(y_j) + b_k \right) + const \\ &= y_k M + const, \end{aligned}$

where $$M = J \sum_{j \in Nbr(k)} E(y_j) + b_k$$. The factor $$\frac{1}{2}$$ disappears in the second line because each edge $$(k, j)$$ appears twice in the double sum. Writing $$q_k(y_k) = C \exp(y_k M)$$, the normalization constant follows from

$q_k(y_k=1) + q_k(y_k=-1) = C \exp(M) + C \exp(-M) = 1, \qquad C = \frac{1}{\exp(M) + \exp(-M)}.$

Therefore

$q_k(y_k=1) = \frac{\exp(M)}{\exp(M) + \exp(-M)} = \frac{1}{1+\exp(-2M)} = sigmoid(2M),$

$q_k(y_k=-1) = \frac{\exp(-M)}{\exp(M) + \exp(-M)} = \frac{1}{1+\exp(2M)} = sigmoid(-2M),$

$E(y_k) = q_k(y_k=1) - q_k(y_k=-1) = \frac{\exp(M) - \exp(-M)}{\exp(M) + \exp(-M)} = \tanh(M).$
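Since each factor is fully described by its mean $$E(y_k)$$, the updates reduce to iterating $$E(y_k) = \tanh(M)$$ over the vertices. The following is a minimal sketch of that iteration, assuming the graph is given as an adjacency list; the names `ising_mean_field` and `prob_plus`, the zero initialization, the sweep order, and the stopping tolerance are all choices made for this illustration.

```python
import math

def ising_mean_field(neighbors, J, b, n_sweeps=100, tol=1e-10):
    """Iterate mu[k] = tanh(J * sum_{j in Nbr(k)} mu[j] + b[k]).

    mu[k] stores the current mean E_q[y_k]. Vertices are updated in place,
    one at a time, so each update sees the latest neighbor means.
    """
    n = len(b)
    mu = [0.0] * n  # start from q_k uniform over {+1, -1}
    for _ in range(n_sweeps):
        max_change = 0.0
        for k in range(n):
            M = J * sum(mu[j] for j in neighbors[k]) + b[k]
            new = math.tanh(M)
            max_change = max(max_change, abs(new - mu[k]))
            mu[k] = new
        if max_change < tol:  # stop once a full sweep barely changes mu
            break
    return mu

def prob_plus(mu_k):
    """q_k(y_k = +1) = sigmoid(2M), written in terms of mu_k = tanh(M)."""
    return 0.5 * (1.0 + mu_k)

if __name__ == "__main__":
    # Hypothetical 3-vertex chain: 0 - 1 - 2
    neighbors = [[1], [0, 2], [1]]
    mu = ising_mean_field(neighbors, J=0.5, b=[0.5, -0.2, 0.1])
    print(mu, [prob_plus(m) for m in mu])
```

The identity $$sigmoid(2M) = \frac{1 + \tanh(M)}{2}$$ is what lets `prob_plus` recover $$q_k(y_k=1)$$ from the stored mean without keeping $$M$$ around.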