PREFACE TO THE SECOND EDITION: VOLUME I
In the twenty-three years that have passed since the first edition of our book appeared, statistics has changed enormously under the impact of several forces:
The underlying sources of these changes have been the exponential change in computing speed (Moore's "law") and the development of devices (computer controlled) using novel instruments and scientific techniques (e.g., NMR tomography, gene sequencing). These techniques often have a strong intrinsic computational component. Tomographic data are the result of mathematically based processing. Sequencing is done by applying computational algorithms to raw gel electrophoresis data.
As a consequence the emphasis of statistical theory has shifted away from the small sample optimality results that were a major theme of our book in a number of directions:
There have, of course, been other important consequences such as the extensive development of graphical and other exploratory methods for which theoretical development and connection with mathematics have been minimal. These will not be dealt with in our work.
As a consequence our second edition, reflecting what we now teach our graduate students, is much changed from the first. Our one long book has grown to two volumes, each to be only a little shorter than the first edition.
Volume I, which we present in 2000, covers material we now view as important for all beginning graduate students in statistics and science and engineering graduate students whose research will involve statistics intrinsically rather than as an aid in drawing conclusions.
In this edition we pursue our philosophy of describing the basic concepts of mathematical statistics relating theory to practice. However, our focus and order of presentation have changed.
Volume I covers the material of Chapters 1–6 and Chapter 10 of the first edition with pieces of Chapters 7–10 and includes Appendix A on basic probability theory. However, Chapter 1 now has become part of a larger Appendix B, which includes more advanced topics from probability theory such as the multivariate Gaussian distribution, weak convergence in Euclidean spaces, and probability inequalities as well as more advanced topics in matrix theory and analysis. The latter include the principal axis and spectral theorems for Euclidean space and the elementary theory of convex functions on R^{d} as well as an elementary introduction to Hilbert space theory. As in the first edition, we do not require measure theory but assume from the start that our models are what we call "regular." That is, we assume either a discrete probability whose support does not depend on the parameter set, or the absolutely continuous case with a density. Hilbert space theory is not needed, but for those who know this topic Appendix B points out interesting connections to prediction and linear regression analysis.
Appendix B is as self-contained as possible with proofs of most statements, problems, and references to the literature for proofs of the deepest results such as the spectral theorem. The reasons for these additions are the changes in subject matter necessitated by the current areas of importance in the field.
Specifically, instead of beginning with parametrized models we include from the start non- and semiparametric models, then go to parameters and parametric models stressing the role of identifiability. From the beginning we stress function-valued parameters, such as the density, and function-valued statistics, such as the empirical distribution function. We also, from the start, include examples that are important in applications, such as regression experiments. There is more material on Bayesian models and analysis. Save for these changes of emphasis the other major new elements of Chapter 1, which parallels Chapter 2 of the first edition, are an extended discussion of prediction and an expanded introduction to k-parameter exponential families. These objects, which are the building blocks of most modern models, require concepts involving moments of random vectors and convexity that are given in Appendix B.
Chapter 2 of this edition parallels Chapter 3 of the first and deals with estimation. Major differences here are a greatly expanded treatment of maximum likelihood estimates (MLEs), including a complete study of MLEs in canonical k-parameter exponential families. Other novel features of this chapter include a detailed analysis, including proofs of convergence, of a standard but slow algorithm for computing MLEs in multiparameter exponential families and an introduction to the EM algorithm, one of the main ingredients of most modern algorithms for inference. Chapters 3 and 4 parallel the treatment of Chapters 4 and 5 of the first edition on the theory of testing and confidence regions, including some optimality theory for estimation as well, and elementary robustness considerations. The main difference in our new treatment is the downplaying of unbiasedness both in estimation and testing and the presentation of the decision theory of Chapter 10 of the first edition at this stage.
Chapter 5 of the new edition is devoted to asymptotic approximations. It includes the initial theory presented in the first edition but goes much further with proofs of consistency and asymptotic normality and optimality of maximum likelihood procedures in inference. Also new is a section relating Bayesian and frequentist inference via the Bernstein–von Mises theorem.
Finally, Chapter 6 is devoted to inference in multivariate (multiparameter) models. Included are asymptotic normality of maximum likelihood estimates, inference in the general linear model, Wilks theorem on the asymptotic distribution of the likelihood ratio test, the Wald and Rao statistics and associated confidence regions, and some parallels to the optimality theory and comparisons of Bayes and frequentist procedures given in the univariate case in Chapter 5. Generalized linear models are introduced as examples. Robustness from an asymptotic theory point of view appears also. This chapter uses multivariate calculus in an intrinsic way and can be viewed as an essential prerequisite for the more advanced topics of Volume II.
As in the first edition, problems play a critical role by elucidating and often substantially expanding the text. Almost all the previous ones have been kept, with an approximately equal number of new ones added to correspond to our new topics and point of view. The conventions established on footnotes and notation in the first edition remain, if somewhat augmented.
Chapters 1–4 develop the basic principles and examples of statistics. Nevertheless, we star sections that could be omitted by instructors with a classical bent and others that could be omitted by instructors with more computational emphasis. Although we believe the material of Chapters 5 and 6 has now become fundamental, there is clearly much that could be omitted at a first reading that we also star. There are clear dependencies between starred sections that follow: 5.4.2, 5.4.3, 6.2, 6.3, 6.4, 6.5, 6.6.
Volume II is expected to be forthcoming in 2003. Topics to be covered include permutation and rank tests and their basis in completeness and equivariance. Examples of application such as the Cox model in survival analysis, other transformation models, and the classical nonparametric k-sample and independence problems will be included. Semiparametric estimation and testing will be considered more generally, greatly extending the material in Chapter 8 of the first edition. The topic presently in Chapter 8, density estimation, will be studied in the context of nonparametric function estimation. We also expect to discuss classification and model selection using the elementary theory of empirical processes. The basic asymptotic tools that will be developed or presented, in part in the text and in part in appendices, are weak convergence for random processes, elementary empirical process theory, and the functional delta method.
A final major topic in Volume II will be Monte Carlo methods such as the bootstrap and Markov Chain Monte Carlo.
With the tools and concepts developed in this second volume students will be ready for advanced research in modern statistics.
For the first volume of the second edition we would like to add thanks to new colleagues, particularly Jianqing Fan, Michael Jordan, Jianhua Huang, Ying Qing Chen, and Carl Spruill, and the many students who were guinea pigs in the basic theory course at Berkeley. We also thank Faye Yeager for typing, Michael Ostland and Simon Cawley for producing the graphs, Yoram Gat for proofreading that found not only typos but serious errors, and Prentice Hall for generous production support.
Last and most important we would like to thank our wives, Nancy Kramer Bickel and Joan H. Fujimura, and our families for support, encouragement, and active participation in an enterprise that at times seemed endless, appeared gratifyingly ended in 1976 but has, with the field, taken on a new life.
Peter J. Bickel
bickel@stat.berkeley.edu
Kjell Doksum
doksum@stat.berkeley.edu
PREFACE TO THE FIRST EDITION
This book presents our view of what an introduction to mathematical statistics for students with a good mathematics background should be. By a good mathematics background we mean linear algebra and matrix theory and advanced calculus (but no measure theory). Because the book is an introduction to statistics, we need probability theory and expect readers to have had a course at the level of, for instance, Hoel, Port, and Stone's Introduction to Probability Theory. Our appendix does give all the probability that is needed. However, the treatment is abridged with few proofs and no examples or problems.
We feel such an introduction should at least do the following:
Although there are several good books available for this purpose, we feel that none has quite the mix of coverage and depth desirable at this level. The work of Rao, Linear Statistical Inference and Its Applications, 2nd ed., covers most of the material we do and much more but at a more abstract level employing measure theory. At the other end of the scale of difficulty for books at this level is the work of Hogg and Craig, Introduction to Mathematical Statistics, 3rd ed. These authors also discuss most of the topics we deal with but in many instances do not include detailed discussion of topics we consider essential such as existence and computation of procedures and large sample behavior.
Our book contains more material than can be covered in two quarters. In the two-quarter courses for graduate students in mathematics, statistics, the physical sciences, and engineering that we have taught we cover the core Chapters 2 to 7, which go from modeling through estimation and testing to linear models. In addition we feel Chapter 10 on decision theory is essential and cover at least the first two sections. Finally, we select topics from Chapter 8 on discrete data and Chapter 9 on nonparametric models.
Chapter 1 covers probability theory rather than statistics. Much of this material unfortunately does not appear in basic probability texts but we need to draw on it for the rest of the book. It may be integrated with the material of Chapters 2–7 as the course proceeds rather than being given at the start; or it may be included at the end of an introductory probability course that precedes the statistics course.
A special feature of the book is its many problems. They range from trivial numerical exercises and elementary problems intended to familiarize the students with the concepts to material more difficult than that worked out in the text. They are included both as a check on the student's mastery of the material and as pointers to the wealth of ideas and results that for obvious reasons of space could not be put into the body of the text.
Conventions: (i) In order to minimize the number of footnotes we have added a section of comments at the end of each chapter preceding the problem section. These comments are ordered by the section to which they pertain. Within each section of the text the presence of comments at the end of the chapter is signaled by one or more numbers, 1 for the first, 2 for the second, and so on. The comments contain digressions, reservations, and additional references. They need to be read only as the reader's curiosity is piqued.
(ii) Various notational conventions and abbreviations are used in the text. A list of the most frequently occurring ones indicating where they are introduced is given at the end of the text.
(iii) Basic notation for probabilistic objects such as random variables and vectors, densities, distribution functions, and moments is established in the appendix.
We would like to acknowledge our indebtedness to colleagues, students, and friends who helped us during the various stages (notes, preliminary edition, final draft) through which this book passed. E. L. Lehmann's wise advice has played a decisive role at many points. R. Pyke's careful reading of a next-to-final version caught a number of infelicities of style and content. Many careless mistakes and typographical errors in an earlier version were caught by D. Minassian who sent us an exhaustive and helpful listing. W. Carmichael, in proofreading the final version, caught more mistakes than both authors together. A serious error in Problem 2.2.5 was discovered by F. Scholz. Among many others who helped in the same way we would like to mention C. Chen, S. J. Chow, G. Drew, C. Gray, U. Gupta, P. X. Quang, and A. Samulon. Without Winston Chow's lovely plots Section 9.6 would probably not have been written and without Julia Rubalcava's impeccable typing and tolerance this text would never have seen the light of day.
We would also like to thank the colleagues and friends who inspired and helped us to enter the field of statistics. The foundation of our statistical knowledge was obtained in the lucid, enthusiastic, and stimulating lectures of Joe Hodges and Chuck Bell, respectively. Later we were both very much influenced by Erich Lehmann whose ideas are strongly reflected in this book.
Peter J. Bickel
Kjell Doksum
Berkeley
1976
Table of Contents
(NOTE: Each chapter concludes with Problems and Complements, Notes, and References.)
1. Statistical Models, Goals, and Performance Criteria.
2. Methods of Estimation.
3. Measures of Performance.
4. Testing and Confidence Regions.
5. Asymptotic Approximations.
6. Inference in the Multiparameter Case.
Appendix A: A Review of Basic Probability Theory.
Appendix B: Additional Topics in Probability and Analysis.
Appendix C: Tables.
Index.