Equilibrium Statistical Mechanics
By E. Atlee Jackson
Dover Publications, Inc.
Copyright © 1968 E. Atlee Jackson
All rights reserved.
1. FREQUENCY AND PROBABILITY
It is a common experience that the outcome may not always be the same when an experiment is performed a number of times, even though the conditions of the experiment are kept as similar as possible. The reason is that some of the factors that contribute to the outcome of the experiment are not (or cannot be) completely controlled. Simple examples are the "experiments" of rolling dice, drawing cards, tossing coins, or any of the so-called games of chance. Presumably other experiments are nearer to the hearts of physical scientists, but these examples will suffice for the present. In any case, the typical feature of all experiments is that at the end of the experiment one observes some result of interest. To be concise, we shall call those distinct (or mutually exclusive) results of an experiment that are of interest simple events. Therefore the result of each experiment is always one, and only one, simple event. For simplicity we may label these simple events (or simply "events") with some index i. Thus the two possible events when tossing a coin are heads or tails (i = h, t), whereas there are six possible events when a single die is rolled (i = 1, 2, 3, ..., 6), and so on.
Now if a particular experiment is performed a number of times, say N times, a particular event i may be found to occur ni times. This fact is of considerable interest, because if the experiment is repeated at a later time, we expect the event i to occur with roughly the same frequency. To investigate this idea we consider the ratio
Fi = ni/N (1)
This ratio is the fraction of the N experiments that resulted in the event i and is commonly called the frequency of the event i. Although it is useful to know the value of Fi found in some previous group of N experiments, it is important to realize that, if these N experiments are repeated, one cannot expect that the event i will occur the same number of times (ni). Instead it may occur mi times. This means that Fi will in general be different for different groups of experiments. Thus, for example, if a coin is tossed twenty times (N = 20), the event "heads" may occur eight times (nh = 8), so that Fh = 0.4 for that sequence of tosses. If we tossed the coin again twenty times, we would consider it unlikely that heads would turn up again eight times, so we would expect a different value for Fh. Moreover, if the coin were tossed 100 times, the coin might turn up heads 54 times, in which case Fh = 0.54 for that sequence of tosses. If N = 1,000, we might observe nh = 510, in which case Fh = 0.51. Clearly the frequency of an event depends on the group of experiments being considered.
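The dependence of the frequency on the particular group of experiments can be seen in a small simulation. The following is only a sketch: the function name `frequency_of_heads`, the seed, and the use of a pseudo-random generator to stand in for a fair coin are assumptions made for illustration, not part of the text.

```python
import random

def frequency_of_heads(n_tosses, rng):
    """Return F_h = n_h / N for one group of N simulated tosses of a fair coin."""
    n_heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return n_heads / n_tosses

rng = random.Random(1968)  # fixed seed (arbitrary) so that a run can be repeated

# Two separate groups of 20 tosses, then larger groups, as in the text.
for n in (20, 20, 100, 1000):
    print(f"N = {n:5d}   F_h = {frequency_of_heads(n, rng):.3f}")
```

Each call represents a new group of N experiments, so the printed values of F_h will in general differ from group to group, even for the same N.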
Since the frequency of an event varies from one group of experiments to another, it is desirable to obtain a quantity that does not depend on any particular group and that at the same time indicates the frequency we can expect in any particular group of experiments. To obtain such a quantity we could, at least in principle, examine the values of the frequency as N becomes extremely large. In the above examples we had
Fh (N = 20) = 0.4, Fh (N = 100) = 0.54, Fh (N = 1,000) = 0.51
As N becomes larger and larger we expect that, if the coin is evenly balanced, the frequency Fh will approach the value 0.50. However, regardless of what the limiting value of the frequency may be when N becomes extremely large, we call this limiting value the probability of a heads (for that coin). Thus, in principle, we have a method for obtaining a quantity Ph (the probability of a heads) that is related to the frequency we can expect to find in future experiments.
We now can give a formal definition for the probability Pi of an event i, namely,
$P_i = \lim_{N \to \infty} F_i = \lim_{N \to \infty} \frac{n_i(N)}{N}$ (2)
By this we mean that Pi equals the limiting value of Fi as N becomes arbitrarily large. This in turn equals the limiting value of ni/N, where ni also depends on the value of N.
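The role of ni as a function of N can be made concrete by following a single long sequence of tosses and recording ni(N)/N at a few values of N. This is a sketch under stated assumptions: `running_frequencies`, `p_heads`, and the checkpoint values are hypothetical names and choices, and a finite simulation can only suggest the limiting behavior, not establish it.

```python
import random

def running_frequencies(n_max, checkpoints, p_heads=0.5, seed=0):
    """Toss one simulated coin n_max times; return {N: n_h(N) / N} at each checkpoint."""
    rng = random.Random(seed)
    n_heads = 0
    result = {}
    for n in range(1, n_max + 1):
        if rng.random() < p_heads:
            n_heads += 1          # n_h depends on how many tosses have been made
        if n in checkpoints:
            result[n] = n_heads / n
    return result

for n, f in running_frequencies(100_000, {20, 100, 1_000, 100_000}).items():
    print(f"N = {n:7d}   n_h(N)/N = {f:.4f}")
```

The same sequence is used throughout, so the count of heads at N = 1,000 includes the heads already counted at N = 20 and N = 100.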
There are two ways of interpreting Equation (2), both of which are frequently used in statistical mechanics. We have said that N represents the number of experiments, and ni is the number of these experiments that result in the event i. Now this can be interpreted in two ways. First, we can picture one physical system on which we perform the same experiment over and over again (altogether, N times). The number ni is, in this case, the number of times the event i occurs in this sequence of experiments. From this point of view the experiments are carried out at different times, one after the other. In practice, this is usually what one does. However, there is another interpretation of Equation (2) into which time does not enter. In the second method we envisage N identical systems (for example, N identical coins, N decks of cards, or N bottles of a gas). The systems are identical in the sense that we cannot distinguish between them by any macroscopic method. Such a collection of identical systems is called an ensemble. Now we perform the same experiment on each of the N systems of this ensemble and take ni to be the number of these systems that yield the event i. For example, we might have an ensemble of 1,000 identical coins. All of these coins are now flipped (at the same time, if you like), and it is found that 510 turn up heads. Then nh = 510, so the frequency is Fh = 0.51. From this point of view the time when the experiments are performed is clearly unimportant in determining Fh, whereas in the first interpretation it is not so clear how time might affect the answer. We shall assume from now on that either method will yield the same result. (In statistical mechanics this assumption is known as the ergodic hypothesis.) Thus we can think of frequency (and probability) in terms of either a sequence of experiments on one system or one experiment on each member of an ensemble. Time, therefore, plays no role in determining the probability.
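The two interpretations can be placed side by side in a short sketch. Everything here is an assumption made for illustration: each `random.Random` object stands in for one physical coin, and the function names are hypothetical. For independent simulated tosses the agreement of the two frequencies is built in, so the sketch illustrates the two bookkeeping schemes rather than testing the ergodic hypothesis.

```python
import random

def time_sequence_frequency(n_experiments, seed=0):
    """One system: a single coin tossed n_experiments times, one toss after another."""
    coin = random.Random(seed)
    n_heads = sum(coin.random() < 0.5 for _ in range(n_experiments))
    return n_heads / n_experiments

def ensemble_frequency(n_systems, seed=0):
    """An ensemble: n_systems separate coins, each tossed exactly once."""
    seeder = random.Random(seed)
    coins = [random.Random(seeder.getrandbits(32)) for _ in range(n_systems)]
    n_heads = sum(coin.random() < 0.5 for coin in coins)
    return n_heads / n_systems

print("sequence on one coin:", time_sequence_frequency(1000))
print("ensemble of coins:   ", ensemble_frequency(1000))
```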
Although definition (2) is fine in principle, in practice one repeats an experiment only a finite number of times. Alternatively, one can only construct an ensemble containing a finite number of systems. Thus, in either case, the limit N → ∞ in Equation (2) cannot be realized in any physical situation. For this reason one can obtain only an approximate value for the probability of an event. Nonetheless, one is likely to say that the probability of tossing a heads is 1/2, or the probability of picking a particular card from a deck is 1/52. A statement of this sort is based on the assumption that the probabilities of certain events are equal. To see how such numbers (i.e., 1/2, 1/52) result from such assumptions, we must establish two properties of the probability (and the frequency). First, the frequency of any event is clearly a non-negative number (ni may be zero, but it cannot be negative), and consequently, because of definition (2), all Pi are non-negative numbers. Second, if a coin is tossed N times and the coin turns up heads nh times, it clearly must turn up tails nt = N - nh times. Put another way, we must have nh + nt = N, because in each experiment one of the events, heads or tails, must have occurred. For the same reason we must find that the sum of ni for all possible events i equals the total number of experiments N. This means that the sum of all Fi must equal unity (no matter what the value of N may be), and consequently the sum of all Pi must also equal unity. We therefore have the two simple but very important properties of the probability:
Pi ≥ 0 (for all simple events i)
$\sum_i P_i = 1$ (summed over all simple events) (3)
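That the frequencies already obey both properties for any finite N can be checked directly. The sketch below uses exact fractions so that the sum is exactly one; the helper name `frequencies` and the simulated die rolls are assumptions introduced for illustration.

```python
import random
from collections import Counter
from fractions import Fraction

def frequencies(outcomes):
    """Return {event i: F_i = n_i / N} for a list of observed simple events."""
    counts = Counter(outcomes)          # n_i for each event that occurred
    n_total = len(outcomes)             # N
    return {event: Fraction(n, n_total) for event, n in counts.items()}

rng = random.Random(3)
rolls = [rng.randint(1, 6) for _ in range(600)]   # 600 simulated rolls of one die
freqs = frequencies(rolls)

assert all(f >= 0 for f in freqs.values())   # every F_i is non-negative
assert sum(freqs.values()) == 1              # the F_i sum to unity for any N
print({face: str(f) for face, f in sorted(freqs.items())})
```

The two assertions hold whatever the outcomes are, because every roll is counted in exactly one ni.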
Now we shall see how these properties (actually only the second), together with certain assumptions, yield values for the probability of certain events. In the case of a coin we expect that the probability of a heads Ph and that of a tails Pt are equal. If we make this assumption, then we can set Ph and Pt equal to their common probability (call it P0), in which case
Ph + Pt = 2P0 = 1 or P0 = 1/2 = Ph = Pt
This shows how the value of 1/2, quoted above, is arrived at once one has made an assumption about the equality of the probabilities of certain events. It should be emphasized that if we have no information about the coin, the assumption that Ph equals Pt is the most reasonable assumption to make. Although the present example is nearly trivial, it contains all (or nearly all) of the essential features used in more complicated cases.
To illustrate these points with another, somewhat more complicated example, consider the experiment of drawing a card from a deck of cards. In this case there are 52 possible events, each with some probability Pi (i = 1, 2, 3, ..., 52). If we have no information suggesting that the cards are marked or that some legerdemain is being used, the most reasonable assumption to make is that the probability of drawing any particular card is equal to the probability of drawing any other card. Let us call this common probability P0 (again). Then, using (3), we have
$\sum_{i=1}^{52} P_i = 52 P_0 = 1$ or $P_0 = \frac{1}{52}$
in agreement with our statement (above) that the probability of drawing a particular card from a deck is usually assumed to be 1/52. What is important here is to realize what assumptions are being made and how these assumptions yield a value for the probability.
The procedure outlined above is frequently used. That is, to predict the probability of a certain event, one uses the general rule:
If there is no apparent reason for one event to occur more frequently than another, then their respective probabilities are assumed to be equal. (4)
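Rule (4), combined with the normalization property (3), fixes the common probability at one over the number of simple events. A minimal sketch follows, in which `equal_a_priori` is a hypothetical helper name and the event labels are arbitrary:

```python
from fractions import Fraction

def equal_a_priori(events):
    """Assign the same probability P0 to every simple event so that the sum is 1."""
    events = list(events)
    p0 = Fraction(1, len(events))       # n * P0 = 1  implies  P0 = 1/n
    return {event: p0 for event in events}

coin = equal_a_priori(["h", "t"])
deck = equal_a_priori(range(1, 53))     # cards labeled i = 1, 2, ..., 52

print(coin["h"], deck[1])               # 1/2 1/52
assert sum(coin.values()) == 1 and sum(deck.values()) == 1
```

The function encodes an assumption about the system, not a measurement; whether that assumption is justified is settled only by comparison with observed frequencies.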
Probabilities obtained from the reasoning in (4) should be more accurately termed "a priori probabilities" (before, and hence independent of, experience), whereas those obtained by $P_i = \lim_{N \to \infty} n_i(N)/N$ could be called "a posteriori probabilities" (or empirical probabilities); however, the terminology is cumbersome and will not be used. Notice, nonetheless, that a physical theory involving probability is usually based on "a priori probability." The final justification of the theory rests on the agreement between the predicted results of the theory and the observed experimental results.
In this section we have discussed how probability is usually related to the observed frequencies in physical experiments. Moreover, we have seen how general arguments [Equation (4)] are frequently used to predict the probability of various events. We shall now consider some of the more formal aspects of probability, which allow us to determine the probability of more complicated events.
2. PROBABILITY OF COMPOUND EVENTS: INDEPENDENT EVENTS
It will be recalled that the result of an experiment is always one, and only one, simple event. The probability of these events was denoted by Pi, and they have the properties
$P_i \ge 0, \qquad \sum_i P_i = 1$ (5)
The assignment of numerical values to the various Pi is, in practice, usually based on an argument such as (4). However, in principle, the values may be assigned for any reason, provided only that they satisfy (5).
Now it is very helpful to think of these events as points in a space, called the "sample space." Such a space is illustrated in Figure 1. It is simply a collection of points, each of which represents a possible result of an experiment. In this space we are not interested in distances, or the arrangement of the points, but only in the points themselves.
By a compound event we shall mean any specified collection of the points in the sample space. The points in the compound event are specified by some feature they have in common (as we shall illustrate shortly). In Figure 1 a compound event A has been indicated. It includes all the points enclosed by the boundary. The probability of this compound event P(A) is defined to be
$P(A) = \sum_{i \in A} P_i$ (6)
where the sum is over all the points in the event (indicated by the symbols "i ∈ A," which reads "all the i's contained in the event A").
To illustrate these points, consider the experiment of drawing a single card from a deck of cards. The possible simple events are fifty-two in number, and can be labeled i = 1, 2, ..., 52. Our sample space in this case consists of fifty-two points, each representing one of the cards that may be drawn. Now we define three compound events A, B, and C that we might be interested in. These events are specified by certain features that all the points have in common. Thus:
a. A: All points that represent hearts
b. B: All points that represent a number three card
c. C: All points that represent a one-eyed jack
On the basis of (4) we shall assign equal probability to each of the points in the sample space (i.e., to each simple event). Because of (5) we conclude, as we did in the last section, that the probability of each simple event equals 1/52. Now we determine the probability of each of the compound events above, using definition (6). We have
$P(A) = \sum_{i \in A} P_i = 13 \times \frac{1}{52} = \frac{1}{4}$
for there are 13 points representing a heart, each with probability 1/52. Similarly,
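$P(B) = 4 \times \frac{1}{52} = \frac{1}{13}, \qquad P(C) = 2 \times \frac{1}{52} = \frac{1}{26}$
for there are four number three cards and two one-eyed jacks (the jack of hearts and the jack of spades). That is, the probability of drawing a heart is 1/4, of drawing a number three card is 1/13, and of drawing a one-eyed jack is 1/26. This sample space and the compound events A, B, and C are illustrated in Figure 2.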
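The card example can be worked through mechanically by building the sample space and summing the probabilities of the points in each compound event, as in definition (6). This is a sketch: the tuple representation of a card and the names `DECK`, `P`, and `prob` are assumptions made here, and the one-eyed jacks are taken to be the jack of hearts and the jack of spades, as in the standard English-pattern deck, so that P(C) = 2/52 = 1/26.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

DECK = list(product(RANKS, SUITS))                    # the 52 points of the sample space
P = {card: Fraction(1, len(DECK)) for card in DECK}   # equal probabilities, rule (4)

def prob(event):
    """Definition (6): sum P_i over all points i contained in the event."""
    return sum(P[card] for card in event)

A = {card for card in DECK if card[1] == "hearts"}    # all hearts
B = {card for card in DECK if card[0] == "3"}         # all number three cards
C = {("J", "hearts"), ("J", "spades")}                # the one-eyed jacks (assumed)

print(prob(A), prob(B), prob(C))                      # 1/4 1/13 1/26
```

Because `prob` sums the individual Pi rather than counting points, it continues to work if the assignment in `P` is replaced by unequal probabilities that still satisfy (5).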
Two compound events may or may not have points in common. Thus, in Figure 2, events A and B have a single point in common (namely, the point corresponding to the three of hearts), whereas events B and C have no points in common (there is no number three card that is also a one-eyed jack). If two events do not have any points in common, we say that they are disjoint. Now consider the disjoint events B and C. We may be interested in knowing the probability that the outcome of the experiment will be either event B or event C (i.e., what is the probability that the card drawn is either a number three or a one-eyed jack?). This is, of course, just another compound event, which we can represent by B ∪ C (read: B union C). The union of B and C is simply the compound event consisting of all points that are either in B or in C (or in both B and C). The probability of this event P(B ∪ C) is, according to (6), just the sum of the probability of all points in that event. Since B and C have no points in common (disjoint), this equals the sum of P(B) and P(C), or
P(B ∪ C) = P(B) + P(C) (if B and C are disjoint) (7)
More generally, if A and B are two events that are not disjoint, then P(A ∪ B) is still equal to the sum of the probabilities of all points in A and B. However, in order to express P(A ∪ B) in terms of P(A) and P(B) we cannot use (7). The reason is that the sum of P(A) and P(B) adds the probabilities of the points that A and B have in common twice. To rectify this double counting, we clearly need to subtract something. What we need to subtract is just the probability of all the points that A and B have in common. We therefore define the intersect of A and B as that event which contains all points common to both A and B and write it as AB. The intersect and the union of two events A and B are illustrated in Figure 3.
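Both cases can be checked on the card example using set operations. The sketch below rests on several assumptions: the excerpt stops before writing out the corrected relation, so the form P(A ∪ B) = P(A) + P(B) - P(AB) used here is inferred from the double-counting argument above; the names `DECK` and `prob` are hypothetical; and `prob` simply counts points, which is valid only when every card has probability 1/52.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = set(product(RANKS, SUITS))

def prob(event):
    """P(event) when each of the 52 cards is assigned probability 1/52."""
    return Fraction(len(event), len(DECK))

A = {card for card in DECK if card[1] == "hearts"}
B = {card for card in DECK if card[0] == "3"}
C = {("J", "hearts"), ("J", "spades")}        # one-eyed jacks (assumed)

# Disjoint events: Equation (7) applies directly.
assert B & C == set()
assert prob(B | C) == prob(B) + prob(C)

# Events with a common point: P(A) + P(B) counts the three of hearts twice.
assert A & B == {("3", "hearts")}
assert prob(A) + prob(B) == prob(A | B) + prob(A & B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

print(prob(B | C), prob(A | B))               # 3/26 4/13
```

Here `|` and `&` are Python's set union and intersection, playing the roles of B ∪ C and the intersect AB in the text.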
Excerpted from Equilibrium Statistical Mechanics by E. Atlee Jackson. Copyright © 1968 E. Atlee Jackson. Excerpted by permission of Dover Publications, Inc.