Ehrenfest Urn Problem with Applications
Daniel J. Castellano
December 18, 1997 
1 Introduction
The Ehrenfest urn problem was originally proposed as a model for
dissipation of heat, but has since come to be applied in a wide
variety of fields, thanks in part to generalizations and variations of
the problem, and also, no less importantly, to visualizing the exact
original problem in a different light. To illustrate, we will consider
specific examples and analyze them both algebraically and
geometrically, and also use physical analogies to demonstrate the
relevance of this model to the pure sciences.
First, we shall consider the original problem as formulated by the
Ehrenfests, and derive some of its basic results and
implications. We will then recast the problem as a random walk on a
hypercube, and consider it yet again as a Markov chain. Finally, we
will look at a couple of variations of the problem, checking that the
results of the Ehrenfest model are confirmed, and then expound further
physical implications of these modified urn problems.
2 The original Ehrenfest urn model
Consider two urns A and B. Urn A contains N marbles and Urn B contains
none. The marbles are labelled 1,2,...N. In each step of the
algorithm, a number between 1 and N is chosen randomly, with all
values having equal probability. The marble corresponding to that
value is moved to the opposite urn. Hence the first step of the
algorithm will always involve moving a marble from A to B.
What will the two urns look like after k steps? If k is
sufficiently large, we may expect the urns to have equal populations,
as the probabilities of drawing a marble from A or from B become
increasingly similar. After the first step, for example, the odds of
putting the marble in Urn B back into A is 1/N. Going a step further,
the probability of Urn B having three marbles after three steps is
N−1/N⋅ N−2/N. For N large, this is a very high
probability, but as we multiply repeatedly by N−k/N for
increasingly large k, the probability of moving marbles from A to B
diminishes over time, as we would expect intuitively.
States in which one urn has many more marbles than the other may be
said to be unstable, as there is an overwhelming tendency to move
marbles to the urn that contains fewer. As the populations equalize,
the transition probabilities from A to B and B to A approach each
other, creating a stable, or stationary, state of roughly equal populations
thereafter.
These qualitative results are in perfect accordance with familiar
concepts in thermodynamics. Two bodies of different temperatures
eventually approach the same intermediate temperature when in contact
with each other for an extended period of time. Since thermal motion
is random, we would expect an urn model which gives equal
probabilities to each unit of heat transfer to be the most
appropriate. To proceed to more quantitative results, it is helpful to
recast the problem in a more convenient form.
2.1 Random walk on a hypercube
Each of the N marbles may be in one of 2 states: it is in A or it is
in B. In each step of the algorithm we reverse the state of one of the
marbles, an action which is independent of the states of the other
marbles, hence the states are orthogonal. If we consider each marble
to correspond to a coordinate axis, and states A and B are treated as
values 0 and 1, then the state of the system at any point in time may
be expressed as the vertex of an Ndimensional unit
hypercube. Executing a step in the algorithm is analogous to moving
along an edge to an adjacent vertex. Our condition that all marbles
be chosen with equal probability means that each edge connected to the
vertex has equal chance of being traversed. What we have is a uniform
random walk on the hypercube.
The state of the system may be expressed as a binary Ntuple if we
align the edges of the hypercube with the standard basis in ℜ^{N}.
These states form a group under binary vector addition. We may define
a probability measure P on this space which describes the first
step of the algorithm. This would be P(x)=1/N if x=e_{i} where
e_{i} is the ith basis vector. However, to remove parity problems
arising from an odd number of steps, it is better to use
P(x)=1/N+1 if x=e_{i} or 0. (In both cases, P is
zerovalued elsewhere, of course.)
As in all random walk problems, further steps are expressed by
convoluting the probability function. The second step is
P^{∗ 2}(x)=ΣP(x−y)P(y)
(1)
where P(y) is the probability distribution of the first step. For
the kth step, we have:
P^{∗ k}=P∗ P^{∗ k−1}
(2)
As k approaches infinity, this expression converges to the uniform
distribution. That is, after a sufficiently long random walk, each
vertex on the hypercube has an equal probability of being the walker's
location. Thus the expectation value of the sum of the coordinates of
the corresponding Ntuple is N/2. This reaffirms our thermodynamic
model.
An interesting question is that of how big k must get in order to
have a uniform distribution. The transition from an unstable to a
stable configuration is actually quite dramatic. For k greater than
N/4log N, the distribution rapidly approaches uniformity. To
show that P^{∗ k} tends to uniformity for this value, a lengthy
proof involving Fourier transforms is required.^{1} (Taking the Fourier
transform of a convolution gives a normal product.)
2.1.1 Electric model
Random walks may also be considered treating each edge as a resistor
of unit resistance. A linear random walk, for instance, could never go
off toward infinity because the resistors add in series to give
infinite resistance (a random walk in a plane, however is a different
story, both electrically and mathematically).
Our hypercube may be thought of us as a highly symmetric parallel
circuit between the origin and the opposite vertex. Using this method gives
some interesting results. For example, the expectation value of the
number of steps it takes for an urn lacking one marble to get the last
one is 2^{N}−1!^{2}
2.2 The urn problem as a Markov chain
Discussion to this point has made the problem a bit more complicated
than it needs to be, as far as physical considerations are
concerned. For one thing, we have 2^{N} possible states, but many of
these are degenerate. If we are considering the diffusion of gases,
for example, with each molecule acting as a marble (so N is on the
order of Avogadro's number), we really don't care which molecule goes
where, but only the relative quantities in each region. So our
hypercube collapses considerably, with vertices that have the same
coordinate sum collapse into a single vertex. The number of edges
should remain the same, in order to preserve the correct relative
probabilities, and it is easy to see that geometrically this becomes
something of a mess, even though our new graph has many fewer vertices
than the hypercube.
It may be advisable to abandon our geometric model at this point, and
consider instead a further simplification. The random walk model
expresses each state as being determined by all previous states. This
is a consequence of labelling marbles only in the initial state. If
instead, we relabel the marbles after each step, flaunting our
disconcern for the identity of each marble, the state of the system
after k steps is only determined by the state after k−1
steps. Instead of 2^{N} states, we have only N+1 states
(0, 1,...N marbles in Urn A). This is a twostate Markov chain,
since each step can only increase or decrease the number of marbles in
A by 1. With each state being determined only by the previous state,
we need only concern ourselves with the probability of transition from
one state to the next. These are easily computed:
with zero probability for all other states.
As in most Markov chain problems, this has a variation cutoff; in
other words, a sharp transition to a stationary state. This cutoff is
at N/4 log N, as before. The cutoff phenomenon was made
mathematically precise by Diaconis and Aldous (1987), and led to the
result that a 52card deck needs only 7 shuffles to achieve uniform
mixing. Here we have twoway mixing; we exchange the order of cards in
a single switch rather than moving a card from one pile to another, as
in the Ehrenfest model. Thus the cutoff is only at N/2log N,
which for N=52 gives 102.7. At 13 switches per shuffle (26 pairs,
half of which are switched), we have a maximum of 7.9 shuffles (in
practice, there are slightly more switches per shuffle).
3 Variations of the Ehrenfest model
The treatment of urn problems as Markov chains lends itself readily to
generalizations and modifications of the model. We will consider a
couple of these below.
3.1 The KrafftSchaefer generalization^{4}
Fundamental to the Ehrenfest model is the condition that marbles be
chosen with equal probability. However, the properties of the urns (e.g.,
permeability) may be generalized as follows. If the marble chosen is
in Urn A, then it will be placed in B with probability s; otherwise,
it remains in A. Similarly, a marble in Urn B is moved to A with
conditional probability t. The original model may be retrieved by
simply setting s=t=1. The transition probabilities for the Markov
chain are (zero otherwise):
For the one parameter case s=t, we have:
P(i→ i)=1−s
(10)
In the two parameter case, the matrix of transition probabilities has
N+1 distinct eigenvalues λ_{j}=1−2j/N, where j=0,
1,…, N.
3.2 UppuluriWright variation^{6}
In another twist of the urn problem, we have a single urn with w_{0}
white balls and b_{0} black balls. Once again, balls are chosen at
random with equal probability. If the ball is white, it is replaced
with a black ball with probability α_{1}; otherwise, the system
is unchanged. Similarly, if the ball is black, it becomes white with
probability α_{2}. This model is probabilistically similar to
the KrafftSchaefer model, but it is presented in such a way that
lends itself to explicit analysis. The expectation values of the
number of black and white balls after k steps is
μ_{k}=[I+ 

A]^{k}μ_{0}
(11) 
where A=
and μ_{0}=(w_{0} b_{0})^{t}. The
matrix exponential can be evaluated using Blatz's (1968)
result:^{7}
M^{n}= 
λ_{1}^{n}−λ_{2}^{n} 

λ_{1}−λ_{2} 

M− 
λ_{1}^{n−1}−λ_{2}^{n−2} 

λ_{1}−λ_{2} 

λ_{1}λ_{2}I
(12) 
Thus we have an explicit computation of the probability of the final
state from an arbitrary initial state without anything more involved
than finding the eigenvalues of a two by two matrix and raising them
to the kth power. For the classical Ehrenfest model,
α_{1}=α_{2}=1, and I've computed the eigenvalues of
I+1/NA to be 1, (2/N)1.^{8}
4 Conclusion
It took some doing, but we finally arrived at a completely generalized
formula that solves the Ehrenfest urn problem explicitly. More
importantly, we've covered various geometric and physical
interpretations of the problem which have rendered certain aspects of
it more intelligible and at the same time illustrated the applications
of the model to other fields. This is but one urn model among many,
and one of the simplest ones at that. Nonetheless, it has continued to
provoke significant developments even in recent years, exposing some
highly fertile ground in combinatorics and other applied mathematics.
 1
 Such a proof
is given in Diaconis, P. (1991). Finite Fourier Methods: Access to
Tools, Proceedings of Symposia in Applied
Mathematics, 44, 174175.
 2
 This, and more general results, may be
found in Palacios, J. L. (1994). Another look at the Ehrenfest urn
via electric networks, Advances in Applied Probability,
26, 820824.
 3
 Krafft, O.
and Schaefer, M. (1993). Mean passage times for triangular transition
matrices and a two parameter Ehrenfest urn model, Journal of
Applied Probability, 30, 964970.
 4
 Krafft, O.
and Schaefer, M. (1993). Mean passage times for triangular transition
matrices and a two parameter Ehrenfest urn model, Journal of
Applied Probability, 30, 964970.
 5
 Uppuluri, V. R. R.
and Wright, T. (1981). A note on a further generalization of the
Ehrenfest urn model, Proceedings of the American Statistical
Association, Sampling Survey Section, 564569.
 6
 Uppuluri, V. R. R.
and Wright, T. (1981). A note on a further generalization of the
Ehrenfest urn model, Proceedings of the American Statistical
Association, Sampling Survey Section, 564569.
 7
 Blatz, P. J. (1968). On the arbitrary power of an
arbitrary 2 x 2 matrix, American Mathematical Monthly, 75,
5758.
 8
 Notes
attached.
This document was translated from L^{A}T_{E}X by
H^{E}V^{E}A.
© 1997, 2006 Daniel J. Castellano. All rights reserved. http://www.arcaneknowledge.org