Parallel tempering
Monte Carlo algorithms are often used to explore probability distributions. In particular, the Metropolis-Hastings algorithm is one of the most widely used techniques for sampling from a distribution. In this story, there are a few frog characters to familiarize ourselves with.
First up, explorer frog:
- Doesn’t give a flip
- Red
Next, we have exploiter frog (also known as greedy frog):
- Wants flies, now
- Blue
Finally, our most complicated frog will be sampler frog:
- Wants flies, most of the time (!)
- Green
In the Metropolis-Hastings algorithm, in order to sample from a distribution proportional to \(P(x)\), a proposed move from a “state” \(x\) to some new state \(x'\) is “accepted” with probability \[P_{\text{accept}}(x \rightarrow x') = \min\left(1,\frac{P(x')}{P(x)}\right).\] (Really we should also be sure our proposals are symmetric, but we won’t worry about that in this demonstration.)
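To make the acceptance rule concrete, here is a minimal random-walk Metropolis sketch in Python; the Gaussian proposal, the step size, and the function names are illustrative choices rather than anything fixed by the algorithm above:

```python
import numpy as np

def metropolis(log_p, x0, n_steps, step_size=1.0, rng=None):
    """Random-walk Metropolis: sample from a density proportional to exp(log_p(x))."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0
    samples = []
    for _ in range(n_steps):
        x_new = x + step_size * rng.normal()          # symmetric Gaussian proposal
        # Accept with probability min(1, P(x')/P(x)), computed in log space.
        if np.log(rng.uniform()) < log_p(x_new) - log_p(x):
            x = x_new
        samples.append(x)
    return np.array(samples)

# Example: draw from a standard normal, where log P(x) = -x^2 / 2 up to a constant.
samples = metropolis(lambda x: -0.5 * x**2, x0=0.0, n_steps=5000)
```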
What is going on with our frogs is that there is a landscape of flies, with the density of flies at spot \(x\) proportional to \(P(x)\), and each frog hops around it as a Metropolis chain on a tempered target \(P(x)^\beta\) at its own inverse temperature \(\beta\): explorer frog at \(\beta = 0\) accepts every hop, exploiter frog at very large \(\beta\) essentially only hops toward more flies, and sampler frog at \(\beta = 1\) samples the distribution we actually care about. In parallel tempering, these chains run side by side and occasionally propose to swap positions, so the freely roaming explorer can ferry the sampler out of a local cluster of flies it would otherwise be stuck near.
So, by using the parallel tempering method we can quite generically improve the convergence of Monte Carlo methods, and therefore the efficiency of our posterior inference.
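Here is a minimal sketch of how this might look in code, assuming a one-dimensional state, a hand-picked ladder of \(\beta\) values sorted so that \(\beta = 1\) comes last, and one swap proposal per sweep; a real implementation would tune the temperature ladder, step sizes, and swap schedule:

```python
import numpy as np

def parallel_tempering(log_p, betas, x0, n_steps, step_size=1.0, rng=None):
    """Run one Metropolis chain per inverse temperature, each targeting P(x)^beta,
    and occasionally propose to swap states between adjacent temperatures."""
    rng = np.random.default_rng() if rng is None else rng
    xs = np.full(len(betas), float(x0))   # current state of each chain
    cold_samples = []                     # output of the beta = 1 ("sampler frog") chain
    for _ in range(n_steps):
        # Within-chain Metropolis update at each temperature.
        for i, beta in enumerate(betas):
            x_new = xs[i] + step_size * rng.normal()
            if np.log(rng.uniform()) < beta * (log_p(x_new) - log_p(xs[i])):
                xs[i] = x_new
        # Propose swapping the states of a random pair of neighbouring chains.
        i = rng.integers(len(betas) - 1)
        log_swap = (betas[i + 1] - betas[i]) * (log_p(xs[i]) - log_p(xs[i + 1]))
        if np.log(rng.uniform()) < log_swap:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
        cold_samples.append(xs[-1])
    return np.array(cold_samples)

# Example: a well-separated bimodal target that a single beta = 1 chain struggles with.
log_p = lambda x: np.logaddexp(-0.5 * (x - 5.0) ** 2, -0.5 * (x + 5.0) ** 2)
samples = parallel_tempering(log_p, betas=[0.1, 0.3, 1.0], x0=0.0, n_steps=20000)
```

In the frog picture, the last chain is sampler frog, the low-\(\beta\) chain plays explorer frog, and a swap is two frogs trading lily pads.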
A nice side-effect of this, however, is the ability to leverage a technique for computing the free energy of a system, known as “calorimetry” in the physics community and as “thermodynamic integration” in the statistics community.
Suppose that we have a general model where the data \(A\) are related to some parameters \(\vec{\theta}\) through a likelihood \(P(A|\vec{\theta})\), with a prior \(P(\vec{\theta})\) placed on the parameters.
The fundamental identity that will come in handy arises if we define the partition function at an arbitrary \(\beta\) as
\[ Z(\beta) = \int P(A|\vec{\theta})^\beta P(\vec{\theta})\, d\vec{\theta},\]
so that, notably, the Bayesian evidence is obtained at \(\beta = 1\), while \(\beta = 0\) just gives back the normalized prior:
\[ Z(1) = \int P(A | \vec{\theta})\, P(\vec{\theta})\, d\vec{\theta} = P(A), \qquad Z(0) = \int P(\vec{\theta})\, d\vec{\theta} = 1.\]
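The identity being alluded to (the standard thermodynamic-integration result, stated here for the definition of \(Z(\beta)\) above) follows from differentiating \(\log Z(\beta)\) with respect to \(\beta\):
\[ \frac{d \log Z(\beta)}{d\beta} = \frac{1}{Z(\beta)} \int \log P(A|\vec{\theta})\, P(A|\vec{\theta})^\beta P(\vec{\theta})\, d\vec{\theta} = \left\langle \log P(A|\vec{\theta}) \right\rangle_\beta, \]
where the average is taken over the tempered posterior at inverse temperature \(\beta\). Integrating from \(\beta = 0\) to \(\beta = 1\) then gives the log evidence:
\[ \log P(A) = \log Z(1) - \log Z(0) = \int_0^1 \left\langle \log P(A|\vec{\theta}) \right\rangle_\beta \, d\beta. \]
Conveniently, the parallel tempering chains already produce samples at each \(\beta\) on the ladder, so each average can be estimated as the mean log-likelihood within the corresponding chain and the \(\beta\) integral evaluated numerically (for example with the trapezoidal rule).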
Key terms/concepts:
- Simulated annealing is generally a useful way to improve convergence.