By Gary Koop
John Wiley & Sons ISBN: 0-470-84567-8
Chapter One An Overview of Bayesian Econometrics
1.1 BAYESIAN THEORY
Bayesian econometrics is based on a few simple rules of probability. This is one of the chief advantages of the Bayesian approach. All of the things that an econometrician would wish to do, such as estimate the parameters of a model, compare different models or obtain predictions from a model, involve the same rules of probability. Bayesian methods are, thus, universal and can be used any time a researcher is interested in using data to learn about a phenomenon.
To motivate the simplicity of the Bayesian approach, let us consider two random variables, A and B. The rules of probability imply:
$$p(A, B) = p(A \mid B)\,p(B)$$
where $p(A, B)$ is the joint probability of A and B occurring, $p(A \mid B)$ is the probability of A occurring conditional on B having occurred (i.e. the conditional probability of A given B), and $p(B)$ is the marginal probability of B. Alternatively, we can reverse the roles of A and B and find an expression for the joint probability of A and B:
$$p(A, B) = p(B \mid A)\,p(A)$$
Equating these two expressions for $p(A, B)$ and rearranging provides us with Bayes' rule, which lies at the heart of Bayesian econometrics:
$$p(B \mid A) = \frac{p(A \mid B)\,p(B)}{p(A)} \tag{1.1}$$
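As a quick numerical check of (1.1), the sketch below plugs made-up probabilities into Bayes' rule. The numbers and the function name `bayes_rule` are illustrative assumptions, not taken from the text. The marginal p(A) is obtained from the law of total probability.

```python
def bayes_rule(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """Return p(B|A) = p(A|B) p(B) / p(A), as in (1.1)."""
    if p_a <= 0.0:
        raise ValueError("p(A) must be positive")
    return p_a_given_b * p_b / p_a


# Hypothetical inputs, chosen only for illustration.
p_b = 0.3              # marginal probability of B
p_a_given_b = 0.8      # p(A|B)
p_a_given_not_b = 0.1  # p(A|not B)

# Law of total probability: p(A) = p(A|B)p(B) + p(A|not B)p(not B).
p_a = p_a_given_b * p_b + p_a_given_not_b * (1.0 - p_b)

p_b_given_a = bayes_rule(p_a_given_b, p_b, p_a)
print(f"p(A) = {p_a:.4f}, p(B|A) = {p_b_given_a:.4f}")
```

Both factorizations of the joint probability agree here: p(A|B)p(B) and p(B|A)p(A) give the same number, which is the equality that was rearranged to obtain the rule.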
Econometrics is concerned with using data to learn about something the researcher is interested in. Just what the 'something' is depends upon the context. However, in economics we typically work with models which depend upon parameters. For the reader with some previous training in econometrics, it might be useful to have in mind the regression model. In this model interest often centers on the coefficients in the regression, and the researcher is interested in estimating these coefficients. In this case, the coefficients are the parameters under study. Let $y$ be a vector or matrix of data and $\theta$ be a vector or matrix which contains the parameters for a model which seeks to explain $y$. We are interested in learning about $\theta$ based on the data, $y$. Bayesian econometrics uses Bayes' rule to do so. In other words, the Bayesian would replace B by $\theta$ and A by $y$ in (1.1) to obtain:
$$p(\theta \mid y) = \frac{p(y \mid \theta)\,p(\theta)}{p(y)} \tag{1.2}$$
Bayesians treat $p(\theta \mid y)$ as being of fundamental interest. That is, it directly addresses the question "Given the data, what do we know about $\theta$?" The treatment of $\theta$ as a random variable is controversial among some econometricians. The chief competitor to Bayesian econometrics, often called frequentist econometrics, says that $\theta$ is not a random variable. However, Bayesian econometrics is based on a subjective view of probability, which argues that our uncertainty about anything unknown can be expressed using the rules of probability. In this book, we will not discuss such methodological issues (see Poirier (1995) for more detail). Rather, we will take it as given that econometrics involves learning about something unknown (e.g. coefficients in a regression) given something known (e.g. data), and that the conditional probability of the unknown given the known is the best way of summarizing what we have learned.
Having established that $p(\theta \mid y)$ is of fundamental interest for the econometrician interested in using data to learn about parameters in a model, let us now return to (1.2). Insofar as we are only interested in learning about $\theta$, we can ignore the term $p(y)$, since it does not involve $\theta$. We can then write:
$$p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta) \tag{1.3}$$
The term $p(\theta \mid y)$ is referred to as the posterior density; the p.d.f. for the data given the parameters of the model, $p(y \mid \theta)$, as the likelihood function; and $p(\theta)$ as the prior density. You often hear this relationship referred to as "posterior is proportional to likelihood times prior". At this stage, this may seem a little abstract, and the manner in which priors and likelihoods are developed to allow for the calculation of the posterior may be unclear. Things should become clearer to you in the following chapters, where we will develop likelihood functions and priors in specific contexts. Here we provide only a brief general discussion of what these are.
The prior, $p(\theta)$, does not depend upon the data. Accordingly, it contains any non-data information available about $\theta$. In other words, it summarizes what you know about $\theta$ prior to seeing the data. As an example, suppose $\theta$ is a parameter which reflects returns to scale in a production process. In many cases, it is reasonable to assume that returns to scale are roughly constant. Thus, before you look at the data, you have prior information about $\theta$, in that you would expect it to be approximately one. Prior information is a controversial aspect of Bayesian methods. In this book, we will discuss both informative and noninformative priors for various models. In addition, in later chapters, we will discuss empirical Bayes methods. These use data-based information to choose the prior and, hence, violate a basic premise of Bayesian methods. Nevertheless, empirical Bayes methods are becoming increasingly popular among researchers interested in objective tools that seem to work well in practice.
The likelihood function, $p(y \mid \theta)$, is the density of the data conditional on the parameters of the model. It is often referred to as the data generating process. For instance, in the linear regression model (which will be discussed in the next chapter), it is common to assume that the errors have a Normal distribution. This implies that $p(y \mid \theta)$ is a Normal density, which depends upon parameters (i.e. the regression coefficients and the error variance).
The posterior, $p(\theta \mid y)$, is the density which is of fundamental interest. It summarizes all we know about $\theta$ after (i.e. posterior to) seeing the data. Equation (1.3) can be thought of as an updating rule, where the data allows us to update our prior views about $\theta$. The result is the posterior which combines both data and non-data information.
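To make the updating rule in (1.3) concrete, the following sketch evaluates likelihood times prior on a grid of values for the returns-to-scale parameter and then normalizes. Everything specific here is an assumption made for illustration: the Normal prior centered at one, the Normal likelihood with known standard deviation, the five data points, and the grid itself. None of it comes from the text.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a Normal(mean, sd^2) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Grid of candidate values for theta, from 0.5 to 1.5.
step = 0.001
grid = [0.5 + step * k for k in range(1001)]

# Assumed prior: theta ~ Normal(1, 0.1^2), i.e. roughly constant returns.
prior = [normal_pdf(t, 1.0, 0.1) for t in grid]

# Hypothetical data, assumed to be Normal(theta, 0.2^2) draws.
y = [1.10, 1.05, 1.20, 0.95, 1.15]


def likelihood(theta: float) -> float:
    """p(y|theta): product of the Normal densities of the observations."""
    out = 1.0
    for obs in y:
        out *= normal_pdf(obs, theta, 0.2)
    return out


# Posterior is proportional to likelihood times prior, as in (1.3).
unnormalized = [likelihood(t) * p for t, p in zip(grid, prior)]

# The normalizing constant is a Riemann-sum approximation of p(y).
norm_const = sum(unnormalized) * step
posterior = [u / norm_const for u in unnormalized]

posterior_mean = sum(t * p for t, p in zip(grid, posterior)) * step
print(f"posterior mean of theta: {posterior_mean:.4f}")
```

For this particular Normal-Normal setup the posterior mean is also available in closed form as a precision-weighted average of the prior mean and the sample mean, (100 x 1 + 125 x 1.09) / 225 = 1.05, so the grid result can be checked against it. The posterior mean lies between the prior mean of 1 and the sample mean of 1.09, which is the sense in which the posterior combines data and non-data information.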
In addition to learning about parameters of a model, an econometrician might be interested in comparing different models. A model is formally defined by a likelihood function and a prior. Suppose we have $m$ different models, $M_i$ for $i = 1, \ldots, m$, which all seek to explain $y$. $M_i$ depends upon parameters $\theta^i$. In cases where many models are being entertained, it is important to be explicit about which model is under consideration. Hence, the posterior for the parameters calculated using $M_i$ is written as
$$p(\theta^i \mid y, M_i) = \frac{p(y \mid \theta^i, M_i)\,p(\theta^i \mid M_i)}{p(y \mid M_i)} \tag{1.4}$$
and the notation makes clear that we now have a posterior, likelihood, and prior for each model.
The logic of Bayesian econometrics suggests that we use Bayes' rule to derive a probability statement about what we do not know (i.e. whether a model is a correct one or not) conditional on what we do know (i.e. the data). This means the posterior model probability can be used to assess the degree of support for $M_i$. Using (1.1) with B = $M_i$ and A = $y$, we obtain
$$p(M_i \mid y) = \frac{p(y \mid M_i)\,p(M_i)}{p(y)} \tag{1.5}$$
Of the terms in (1.5), $p(M_i)$ is referred to as the prior model probability. Since it does not involve the data, it measures how likely we believe $M_i$ to be the correct one before seeing the data. The term $p(y \mid M_i)$ is called the marginal likelihood, and is calculated using (1.4) and a few simple manipulations. In particular, if we integrate both sides of (1.4) with respect to $\theta^i$, use the fact that $\int p(\theta^i \mid y, M_i)\,d\theta^i = 1$ (since probability density functions integrate to one), and rearrange, we obtain:
$$p(y \mid M_i) = \int p(y \mid \theta^i, M_i)\,p(\theta^i \mid M_i)\,d\theta^i \tag{1.6}$$
Note that the marginal likelihood depends only upon the prior and the likelihood. In subsequent chapters, we discuss how (1.6) can be calculated in practice.
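As a small illustration of (1.6), the sketch below approximates a marginal likelihood by numerically integrating likelihood times prior over the parameter. The model is an assumption chosen only because the answer is known in closed form: a particular sequence of 7 successes and 3 failures with a Bernoulli likelihood and a uniform prior on [0, 1], for which the integral equals 7! 3! / 11! = 1/1320. The midpoint rule and the function names are likewise illustrative choices, not the methods developed in later chapters.

```python
from typing import Callable


def marginal_likelihood(
    likelihood: Callable[[float], float],
    prior: Callable[[float], float],
    lower: float,
    upper: float,
    n_points: int = 10_000,
) -> float:
    """Midpoint-rule approximation of the integral in (1.6)."""
    width = (upper - lower) / n_points
    total = 0.0
    for k in range(n_points):
        theta = lower + (k + 0.5) * width  # midpoint of the k-th cell
        total += likelihood(theta) * prior(theta)
    return total * width


# Hypothetical data summary: 7 successes and 3 failures.
successes, failures = 7, 3


def bernoulli_likelihood(theta: float) -> float:
    """p(y|theta) for one particular sequence with the counts above."""
    return theta ** successes * (1.0 - theta) ** failures


def uniform_prior(theta: float) -> float:
    """Uniform prior density on [0, 1]."""
    return 1.0


ml = marginal_likelihood(bernoulli_likelihood, uniform_prior, 0.0, 1.0)
print(f"numerical: {ml:.8f}, closed form: {1 / 1320:.8f}")
```

Only the likelihood and the prior enter the calculation, which is the point made in the text: the marginal likelihood is the likelihood averaged over the prior.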
Since the denominator in (1.5) is often hard to calculate directly, it is common to compare two models, i and j, using the posterior odds ratio, which is simply the ratio of their posterior model probabilities:
$$PO_{ij} = \frac{p(M_i \mid y)}{p(M_j \mid y)} = \frac{p(y \mid M_i)\,p(M_i)}{p(y \mid M_j)\,p(M_j)} \tag{1.7}$$
Note that, since $p(y)$ is common to both models, it cancels out when we take the ratio. As we will discuss in subsequent chapters, there are special techniques in many cases for calculating the posterior odds ratio directly. If we calculate the posterior odds ratio comparing every pair of models, and we assume that our set of models is exhaustive (in that $p(M_1 \mid y) + p(M_2 \mid y) + \cdots + p(M_m \mid y) = 1$), then we can use posterior odds ratios to calculate the posterior model probabilities given in (1.5). For instance, if we have $m = 2$ models then we can use the two equations
$$p(M_1 \mid y) + p(M_2 \mid y) = 1$$
$$PO_{12} = \frac{p(M_1 \mid y)}{p(M_2 \mid y)}$$
to work out
$$p(M_1 \mid y) = \frac{PO_{12}}{1 + PO_{12}}$$
$$p(M_2 \mid y) = 1 - p(M_1 \mid y)$$
Thus, knowledge of the posterior odds ratio allows us to figure out the posterior model probabilities.
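The two-model calculation can be sketched in a few lines. The function below takes posterior odds of each model against a common reference model (whose own odds are one) and divides by their sum. For two models this reduces to the formula just given. The function name and the example odds of 3 are assumptions for illustration, and the extension beyond two models relies on the same exhaustiveness assumption stated above.

```python
from typing import List


def model_probabilities(odds_vs_reference: List[float]) -> List[float]:
    """Convert posterior odds against a reference model into probabilities.

    Each entry is PO_{i,ref} = p(M_i|y) / p(M_ref|y); the reference model
    itself enters with odds 1. Assumes the listed models are exhaustive.
    """
    if any(po < 0.0 for po in odds_vs_reference):
        raise ValueError("posterior odds must be non-negative")
    total = sum(odds_vs_reference)
    if total <= 0.0:
        raise ValueError("at least one posterior odds ratio must be positive")
    return [po / total for po in odds_vs_reference]


# Hypothetical posterior odds of model 1 against model 2.
po_12 = 3.0

# Model 2 is the reference, so its odds against itself are 1.
p_m1, p_m2 = model_probabilities([po_12, 1.0])
print(p_m1, p_m2)
```

With posterior odds of 3 in favor of the first model, the probabilities come out as PO/(1 + PO) = 0.75 and 0.25.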
To introduce some more jargon, econometricians may be interested in model comparison when equal prior weight is attached to each model. That is, $p(M_i) = p(M_j)$ or, equivalently, the prior odds ratio, which is $p(M_i) / p(M_j)$, is set to one. In this case, the posterior odds ratio becomes simply the ratio of marginal likelihoods, and is given a special name, the Bayes Factor, defined as:
$$BF_{ij} = \frac{p(y \mid M_i)}{p(y \mid M_j)} \tag{1.8}$$
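The relationship between (1.7) and (1.8) is easy to verify numerically: the posterior odds ratio is the Bayes Factor multiplied by the prior odds ratio, so the two coincide when the prior odds are one. The marginal likelihoods and prior probabilities below are hypothetical values picked for the example, and the function names are not from the text.

```python
def bayes_factor(ml_i: float, ml_j: float) -> float:
    """BF_ij = p(y|M_i) / p(y|M_j), as in (1.8)."""
    if ml_j <= 0.0:
        raise ValueError("p(y|M_j) must be positive")
    return ml_i / ml_j


def posterior_odds(ml_i: float, ml_j: float,
                   prior_i: float, prior_j: float) -> float:
    """PO_ij = [p(y|M_i) p(M_i)] / [p(y|M_j) p(M_j)], as in (1.7)."""
    if prior_j <= 0.0:
        raise ValueError("p(M_j) must be positive")
    return bayes_factor(ml_i, ml_j) * (prior_i / prior_j)


# Hypothetical marginal likelihoods for two models.
ml_1, ml_2 = 0.002, 0.0005

bf_12 = bayes_factor(ml_1, ml_2)

# Equal prior weight: the posterior odds equal the Bayes Factor.
po_equal = posterior_odds(ml_1, ml_2, 0.5, 0.5)

# Unequal prior weight: the prior odds of 0.2 / 0.8 scale the Bayes Factor.
po_unequal = posterior_odds(ml_1, ml_2, 0.2, 0.8)

print(bf_12, po_equal, po_unequal)
```

In this example the data favor the first model by a factor of about four, but a prior that favors the second model four to one offsets that evidence and leaves posterior odds of about one.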
Finally, econometricians are often interested in prediction. That is, given the observed data, $y$, the econometrician may be interested in predicting some future unobserved data $y^*$. Our Bayesian reasoning says that we should summarize our uncertainty about what we do not know (i.e. $y^*$) through a conditional probability statement. That is, prediction should be based on the predictive density $p(y^* \mid y)$ (or, if we have many models, we would want to make explicit the dependence of a prediction on a particular model, and write $p(y^* \mid y, M_i)$). Using a few simple rules of probability, we can write $p(y^* \mid y)$ in a convenient form. In particular, since a marginal density can be obtained from a joint density through integration (see Appendix B), we can write:
$$p(y^* \mid y) = \int p(y^*, \theta \mid y)\,d\theta$$
However, the term inside the integral can be rewritten using another simple rule of probability:
$$p(y^* \mid y) = \int p(y^* \mid y, \theta)\,p(\theta \mid y)\,d\theta \tag{1.9}$$
As we shall see in future chapters, the form for the predictive in (1.9) is quite convenient, since it involves the posterior.
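One way to see why (1.9) is convenient is that the integral is an average of the conditional density of the future data over the posterior. The sketch below approximates that average by drawing parameter values from an assumed posterior and averaging the conditional density at a chosen point. Every specific here is an assumption for illustration: a Normal posterior with mean 1.05 and standard deviation 0.1, future data that are Normal around the parameter with known standard deviation 0.2 and independent of the observed data given the parameter, and Monte Carlo averaging as the approximation method.

```python
import math
import random


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a Normal(mean, sd^2) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Assumed posterior for theta: Normal(1.05, 0.1^2).
post_mean, post_sd = 1.05, 0.10

# Assumed conditional density of y* given theta: Normal(theta, 0.2^2).
sigma = 0.20

# Point at which the predictive density is evaluated.
y_star = 1.20

# Draw theta from the posterior, then average p(y*|theta) over the draws.
rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [rng.gauss(post_mean, post_sd) for _ in range(20_000)]
mc_density = sum(normal_pdf(y_star, th, sigma) for th in draws) / len(draws)

# In this Normal-Normal case the predictive is Normal with the two
# variances added, which gives an exact value to compare against.
exact_density = normal_pdf(
    y_star, post_mean, math.sqrt(post_sd ** 2 + sigma ** 2)
)

print(f"Monte Carlo: {mc_density:.4f}, exact: {exact_density:.4f}")
```

The predictive variance exceeds the conditional variance because uncertainty about the parameter is carried through to the prediction rather than ignored.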
On one level, this book could end right here. These few pages have outlined all the basic theoretical concepts required for the Bayesian to learn about parameters, compare models and predict. We stress what an enormous advantage this is. Once you accept that unknown things (i.e. $\theta$, $M_i$ and $y^*$) are random variables, the rest of the Bayesian approach is non-controversial. It simply uses the rules of probability, which are mathematically true, to carry out statistical inference. A benefit of this is that, if you keep these simple rules in mind, it is hard to lose sight of the big picture. When facing a new model (or reading a new chapter in the book), just remember that Bayesian econometrics requires selection of a prior and a likelihood. These can then be used to form the posterior, (1.3), which forms the basis for all inference about unknown parameters in a model. If you have many models and are interested in comparing them, you can use posterior model probabilities (1.5), posterior odds ratios (1.7), or Bayes Factors (1.8). To obtain any of these, we usually have to calculate the marginal likelihood (1.6).
Excerpted from Bayesian Econometrics by Gary Koop. Excerpted by permission.