Lecture II
An Introduction to Markov Chain Monte Carlo
Abstract
Statistical Physics (part 2), the original Metropolis Algorithm,
Simulated Annealing.
Phase Space
It is convenient to visualize a mechanical system as a single point in the
6N-dimensional phase space (q,p) of all the positions and momenta
of the N particles.
(**add picture here**)
Due to the complexity of macroscopic systems (N ~ 10^{24}) it
was necessary to abandon determinism and use statistics to describe
the system. The predictions of statistical physics are expected to hold
only on average.
Instead of the precise initial conditions (which are unknown), statistical
physics describes the system by a probability distribution over phase space,
ρ(q,p) at t = 0. As will be seen later, Hamilton's equations imply
the conservation, for all time, of this initial distribution. This is the
famous Liouville theorem. The determination of ρ is then the first
step.
Maxwell-Boltzmann-Gibbs Distribution
Different forms for ρ are found to be needed depending
on the particular data available about the system. We will be
concerned only with the so-called canonical distribution.
We assume that the system is not isolated but in thermal equilibrium
with a heat bath at constant temperature T. Statistically, this
is equivalent to the assumption that the average energy of the molecules
is constant. The novel idea of Boltzmann was to discretize phase space
in order to find the most likely distribution ρ.
Each particle has a definite position and momentum. Subdivide the positions
and momenta of each particle into m (6-dimensional) cells of equal size.
Assume that these cells are small enough that the value of the energy
within each cell is approximately constant. Let E_{j} be the energy in the
j-th cell. Assume further that the cells, even though small, are still
big enough to accommodate many particles. These are reasonable
assumptions, justified by the smallness of atomic dimensions ( ~ 10^{-8}
cm), the size of typical N, and the smoothness of energy surfaces. This
discretization of the phase space of each molecule into m equal-size
cells induces a discretization of the phase space of the system into
m^{N} equal-size cells. With the help of this discretization, the
state s of the system is specified by,

s = (c_{1},c_{2},…,c_{N})

indicating the cell number c_{i} of each of the N particles.
If the particles are assumed to be identical and indistinguishable, then
permutations of the molecules with a given cell number have no
physical consequences. All that matters is how many molecules end
up in each of the cells, not which ones did. Thus, the actual
set of distinguishable physical states is much smaller than m^{N};
it is,

C(N+m-1, m-1) = (N+m-1)! / ( N! (m-1)! )

corresponding to the number of ways of splitting N particles among the m
cells. There are,
G = N! / ( n_{1}! n_{2}! … n_{m}! )
ways of throwing the N molecules into the m cells in such a
way that n_{1} of them are in the first cell, n_{2} in the
second cell, etc. If we assume a priori that the molecules have
an equal chance of ending up in any of the cells, then this number of
ways can be turned into a probability for the state s = (n_{1},…,n_{m}),
P = [ N! / ( n_{1}! n_{2}! … n_{m}! ) ] × constant
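As a sanity check, here is a minimal Python sketch, with a deliberately tiny and hypothetical choice of N and m, that enumerates all m^{N} assignments by brute force and confirms both that the multinomial weights G sum to m^{N} and that the number of distinct occupation vectors is C(N+m-1, m-1):

```python
from math import factorial, comb
from itertools import product

def multinomial(ns):
    """G = N! / (n_1! ... n_m!): the number of assignments of N
    distinguishable molecules that yield occupation numbers ns."""
    G = factorial(sum(ns))
    for n in ns:
        G //= factorial(n)
    return G

def occupations(state, m):
    """Occupation numbers (n_1,...,n_m) of a state s = (c_1,...,c_N)."""
    ns = [0] * m
    for c in state:
        ns[c] += 1
    return tuple(ns)

# Tiny hypothetical system: N = 4 molecules, m = 3 cells.
N, m = 4, 3
counts = {}
for s in product(range(m), repeat=N):
    counts[occupations(s, m)] = multinomial(occupations(s, m))

# Summing G over all occupation vectors recovers all m^N assignments.
assert sum(counts.values()) == m ** N
# The number of distinguishable states is C(N+m-1, m-1).
assert len(counts) == comb(N + m - 1, m - 1)
# Uniform a priori cell assignment turns G into a probability P.
P = {ns: G / m ** N for ns, G in counts.items()}
assert abs(sum(P.values()) - 1.0) < 1e-12
```

The brute-force enumeration is only feasible for toy sizes, of course; for the physical N ~ 10^{24} one works with the formula for G directly, as the text goes on to do.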
Hence, the most likely distribution of molecules among the cells is the
one that maximizes this probability subject to whatever is known about
the system. When the temperature is all we know, we maximize P subject
to the constraint that the average energy is fixed at kT, where
k is a phenomenological (not fundamental) constant needed to change the
units from ergs (units of energy) to degrees (the usual units for
temperature). It is known as the Boltzmann constant and is about,

k = 1.380 × 10^{-16} ergs per degree centigrade
Using the fact that N and the n_{j} are large, we can use
Stirling's approximation,

log n! ≈ n log n − n

to get,
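How good is this crude approximation? A short Python check, comparing against the exact log n! computed via the log-gamma function, shows the relative error shrinking as n grows, which is why the approximation is harmless for the enormous occupation numbers of statistical physics:

```python
from math import lgamma, log

def log_factorial(n):
    # Exact log n! via the log-gamma function: log n! = lgamma(n + 1).
    return lgamma(n + 1)

def stirling(n):
    # The crude Stirling approximation used in the text.
    return n * log(n) - n

# Relative error decreases monotonically over these sample sizes.
errors = [abs(stirling(n) - log_factorial(n)) / log_factorial(n)
          for n in (10, 100, 10_000)]
assert errors[0] > errors[1] > errors[2]
assert errors[2] < 1e-4
```

The neglected next term of Stirling's series, (1/2) log(2πn), grows only logarithmically and is swallowed by the constants in the derivation below.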
log P ≈ N log N − ∑_{j} ( n_{j} log n_{j} − n_{j} ) + constant

      = − N ∑_{j} p_{j} log p_{j} + constant
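The passage from the n_{j} form to the p_{j} form can be spelled out by substituting n_{j} = N p_{j}:

```latex
\sum_j n_j \log n_j
  = \sum_j N p_j \left( \log p_j + \log N \right)
  = N \log N + N \sum_j p_j \log p_j ,
\qquad
\sum_j n_j = N ,
```

so the N log N terms cancel and the leftover ∑_{j} n_{j} = N is absorbed into the constant, since ∑_{j} p_{j} = 1.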
where p_{j} = n_{j} / N.
Thus, P is the probability of observing the probability distribution
(p_{1},…,p_{m}). A probability of a probability... a prior! It is
known as an entropic prior, for the quantity in the exponent (sans N) is
the famous expression for the entropy of a probability distribution.
If we treat the p_{j} as if they were continuous variables, we can
obtain the most likely a priori distribution by solving,
max    − ∑_{j} p_{j} log p_{j}

s.t.     ∑_{j} p_{j} E_{j} = kT,    ∑_{j} p_{j} = 1
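Solving this constrained maximization with Lagrange multipliers yields weights of the exponential form p_{j} ∝ exp(−βE_{j}). The following Python sketch, using hypothetical cell energies E_{j} and a target average energy U standing in for kT, finds β by bisection (the average energy is monotonically decreasing in β) and then checks numerically that feasible perturbations of the resulting distribution only lower the entropy:

```python
from math import exp, log

def boltzmann(beta, E):
    # Exponential weights p_j = exp(-beta * E_j) / Z.
    w = [exp(-beta * e) for e in E]
    Z = sum(w)
    return [x / Z for x in w]

def mean_energy(p, E):
    return sum(pi * e for pi, e in zip(p, E))

def entropy(p):
    return -sum(pi * log(pi) for pi in p if pi > 0)

# Hypothetical three-cell example: energies E_j and target average U.
E = [1.0, 2.0, 3.0]
U = 1.7

# Bisect for the beta whose mean energy matches the constraint.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_energy(boltzmann(mid, E), E) > U:
        lo = mid          # mean energy too high: increase beta
    else:
        hi = mid
p = boltzmann((lo + hi) / 2, E)
assert abs(mean_energy(p, E) - U) < 1e-9

# Any perturbation that preserves both constraints lowers the entropy,
# confirming the exponential weights as the constrained maximizer.
d = [1.0, -2.0, 1.0]      # sum(d) = 0 and sum(d_j * E_j) = 0
for eps in (1e-3, -1e-3):
    q = [pi + eps * di for pi, di in zip(p, d)]
    assert entropy(q) < entropy(p)
```

The perturbation direction d lies in the null space of the two constraints, so p ± εd remains feasible; strict concavity of the entropy then makes the decrease a genuine certificate of the maximum.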