Smoothing and Regression: Approaches, Computation, and Application bridges the many gaps that exist among competing univariate and multivariate smoothing techniques. It introduces, describes, and in some cases compares a large number of the latest and most advanced techniques for regression modeling. Unlike many other volumes on this topic, which are highly technical and specialized, this book discusses all methods in light of both computational efficiency and their applicability for real data analysis.
Using examples of applications from the biosciences, environmental sciences, engineering, and economics, as well as medical research and marketing, this volume addresses the theory, computation, and application of each approach. A number of the techniques discussed, such as smoothing under shape restrictions or of dependent data, are presented for the first time in book form. Special features of this book include:
* Comprehensive coverage of smoothing and regression with software hints and applications from a wide variety of disciplines
* A unified, easy-to-follow format
* Contributions from more than 25 leading researchers from around the world
* More than 150 illustrations also covering new graphical techniques important for exploratory data analysis and visualization of high-dimensional problems
* Extensive end-of-chapter references
For professionals and aspiring professionals in statistics, applied mathematics, computer science, and econometrics, as well as for researchers in the applied and social sciences, Smoothing and Regression is a unique and important new resource destined to become one of the most frequently consulted references in the field.
Read an Excerpt
It is an honor to be asked to write a foreword for this volume, and it is my pleasure to congratulate the editor and the authors for a job well done.
This book is unique in that it brings together in one place a variety of points of view regarding nonparametric regression, smoothing and statistical modeling. These methods, as the editor notes, are becoming ubiquitous throughout the scientific, economic, environmental and social enterprises as ever increasing computational power and data collection ability become available.
The historical overviews here are important in tying together the different threads of the theory, which are widely scattered in the literature over time and in various journals. Many of the papers bring the reader up to date on the state of the art in the approaches discussed. Much useful computational and practical information, for example, discussion of useful software, is provided.
As I am sure the editor and authors will agree, this is a fun area to work in. There are many interesting and elegant theoretical questions, many interesting numerical and computational questions, and, finally, many interesting areas of application, where the statistician has the opportunity and satisfaction of working on "real world" scientific problems and contributing to the study of important scientific questions in a variety of areas.
The reader will probably agree with me that there is no one uniquely best method for all situations, but that substantive estimation and modeling problems may require the examination of several different approaches. Nevertheless there is abundant theory to give us guidance under various assumptions, although the last word has not been said on theoretical developments. This is particularly true in the multivariate case where data with very complex structure, as in environmental and demographic medical studies, is collected.
This book provides an important addition to the reading list of a modern nonparametric regression course. The extensive reference lists provide an excellent historical perspective, and a prime starting point for new researchers in the field. It will be a handy reference source for people wishing to obtain an overview of the various techniques, and pointers to practical issues in implementing the various methods. In addition, it is a fine summary of the state of the art as of the creation of the book.
Madison, December 1999
The idea for this volume can be traced back to the COMPSTAT Satellite Meeting on Smoothing, which took place in August 1994 in Semmering, Austria, and which I had the pleasure of organizing. During this workshop, it became clear that smoothing techniques are still a driving force in the development of nonparametric regression techniques. On the other hand, they have become so diversified that it is really hard to keep up with all of the developments in modern regression analysis. In going from univariate to multivariate problems, the difference between competing methods is even more pronounced. During the past few years, important new approaches have emerged, such as the marginal integration method for the estimation of additive models. Markov chain Monte Carlo techniques have taken up a leading role in the Bayesian modeling of complicated data structures (dynamic, temporal, or spatial). Wavelets have become an important issue. Semiparametric regression is attracting more and more interest, and sliced inverse regression is offering new perspectives. Furthermore, all of these fascinating developments have been going on during the planning and writing of this book. We incorporated as much information on them as possible into this volume. References were last updated in late 1999.
What was the motivation to start an enterprise like this? There is certainly no lack of monographs in the field of smoothing techniques and nonparametric regression. Most of them are highly technical, some of them emphasize computational aspects or software products, and others embark on specific applications, but what they all have in common is that they specialize in one approach, say spline smoothing or local polynomial fitting. This volume attempts to bridge the gaps between these contributions by describing and occasionally comparing competing univariate and multivariate smoothing techniques. In addition, some of the subjects discussed in this book have not yet been covered in textbooks or monographs, such as smoothing under shape restrictions or of dependent data, resampling methods for nonparametric regression, and vector generalized additive models.
What all of these topics have in common is that they are computationally demanding, some even requiring dynamic high-resolution graphics. Hence, in addition to defining and explaining them, it is only natural to discuss their numerical and software-related aspects. After all, without the possibility of an appropriate implementation, such theoretical developments appear to be of limited value. Thus, in addition to explanations of their formal statistical background, all methods are discussed in the light of both computational efficiency and applicability for the analysis of real data. Smoothing techniques in regression are increasingly applied in the biosciences, in the environmental sciences, and in the fields of engineering and economics. They are also about to become relevant for many other fields, such as medical research and marketing. Last but not least, great efforts have been made in presenting graphics, which are of the utmost importance for any exploratory data analysis and the visualization of high-dimensional problems. Hence, color plates were also produced for this volume.
As a consequence, each regression approach included in the book is discussed from the following points of view: background theory, problems of computation and implementation, and examples of practical application.
The level of presentation is approximately the same throughout the volume, although some methods differ from others in terms of necessary technicalities, which are kept to a minimum to the greatest possible extent. Hence, the book addresses a wide audience: graduate students of statistics, applied mathematics, engineering, computer science and econometrics; scholars in the aforementioned fields; and applied researchers in these and other areas, such as biosciences, environmental sciences, medicine, psychometrics, and marketing. The wide range of examples in the volume is intended to further diversify the scope of application of nonparametric and semiparametric regression models and related techniques.
The emphasis of the volume is on multivariate regression problems. However, there are also introductory chapters that guide the reader through the most important univariate approaches, especially those of smoothing splines and of kernels. Other aspects of great importance are smoothing parameter choice, bandwidth selection, variance estimation, smoothing under functional constraints, and smoothing of autocorrelated data. A further topic of current interest is wavelet methods for regression. Some of these topics are essential for the understanding of multivariate regression problems.
The motivation behind the chapters on multivariate regression problems is to provide a state-of-the-art introduction to the large number of different approaches that are already established or are promising for the future of practical data analysis. Among these approaches are smoothing methods for discrete data (e.g., from contingency tables), local polynomial fitting, additive and generalized additive models (including alternating conditional expectations and additivity and variance stabilization), multivariate spline regression (e.g., thin-plate splines and interaction splines), multivariate and semiparametric kernel regression, spatial-process estimates as smoothers, resampling methods for nonparametric regression, multidimensional smoothing and visualization, projection pursuit regression, sliced inverse regression, nonparametric Bayesian dynamic and semiparametric models, and, finally, nonparametric Bayesian surface estimation. Competing algorithms are discussed throughout the book and are followed by software hints and examples using simulated as well as real data.
Due to the vast number of different techniques to be covered, the preparation of this volume demanded that experts in these very areas work together. As the book's editor, I designed it that way from the very beginning. To achieve this ambitious goal, I brought together researchers from many different countries and schools of thought, most of whom would otherwise not be likely to write a book together. All agreed with me that this project would be feasible only with a maximum of exchange among them. So authors commented on each other's contributions, and the final product is a unique piece of work with the same notation throughout. A crucial requirement was also that all authors obey a certain style of presentation and restrict themselves to a prespecified level of mathematical argumentation, avoiding the popular theorem-proof structure of statistics monographs. Proofs are either cited or given in an intuitive manner. The extensive list of references following each chapter should allow readers who desire further details to track down all of these technicalities. The specific format and the comprehensiveness of this volume are exactly what makes it special in the field of smoothing and regression.
I am most grateful for the enthusiastic support of all of the people who have helped to shape the book as it now stands. In addition to the authors and Stephen H. Quigley from John Wiley & Sons, Inc., there are many others I have to thank for stimulating discussions and for their most valuable reviews of chapters. In particular, I would like to name Dennis R. Cook, Dennis D. Cox, Naihua Duan, Sylvia Frühwirth-Schnatter, Peter Hall, Jeffrey D. Hart, M. C. Jones, James S. Marron, Jean D. Opsomer, Jürgen Pilz, James O. Ramsay, Burghart Seifert, Neil Shepherd, Stefan Sperlich, Joan G. Staniswalis, Alexander Tsybakov, Matthew P. Wand, and Thomas W. Yee.
Further, I wish to mention two of my former academic teachers, Prof. Leopold Schmetterer at the University of Vienna, Austria, who first introduced me to nonparametric concepts in statistics, and Prof. Bernard W. Silverman at the University of Bath, UK, who confronted me with smoothing techniques and computational statistics. Since my encounters with these individuals, I have been engaged in related problems, with a special emphasis on computing and applications in the biosciences and medicine.
Last but not least, I want to express my sincerest thanks to my mother, Ingrid, and my father, the late Prof. Herbert Toni Schimek, for their continuing support during my academic education. My father, who was a well-known Austrian designer and artist, not only shaped my eye for graphical detail but also stimulated my interest in science and technology. Both aspects turned out to be relevant to my later academic interests, as partly revealed in this book. Another, more direct contribution of my father is the ex libris shown on page xvi, printed from a copperplate he engraved for me in 1979.
Michael G. Schimek
Graz, November 1999
Table of Contents
Spline Regression (R. Eubank).
Variance Estimation and Smoothing-Parameter Selection for Spline Regression (A. van der Linde).
Kernel Regression (P. Sarda & P. Vieu).
Variance Estimation and Bandwidth Selection for Kernel Regression (E. Herrmann).
Spline and Kernel Regression under Shape Restrictions (M. Delecroix & C. Thomas-Agnan).
Spline and Kernel Regression for Dependent Data (R. Kohn, et al.).
Wavelets for Regression and Other Statistical Problems (G. Nason & B. Silverman).
Smoothing Methods for Discrete Data (J. Simonoff & G. Tutz).
Local Polynomial Fitting (J. Fan & I. Gijbels).
Additive and Generalized Additive Models (M. Schimek & B. Turlach).
Multivariate Spline Regression (C. Gu).
Multivariate and Semiparametric Kernel Regression (W. Härdle & M. Müller).
Spatial-Process Estimates as Smoothers (D. Nychka).
Resampling Methods for Nonparametric Regression (E. Mammen).
Multidimensional Smoothing and Visualization (D. Scott).
Projection Pursuit Regression (S. Klinke & J. Grassmann).
Sliced Inverse Regression (T. Kötter).
Dynamic and Semiparametric Models (L. Fahrmeir & L. Knorr-Held).
Nonparametric Bayesian Bivariate Surface Estimation (M. Smith, et al.).
What People are Saying About This
From the publisher's description: "...a unique and important new resource destined to become one of the most frequently consulted references in the field." (Mathematical Reviews, 2001 f)
"...provides a comprehensive, concise coverage of statistics for engineers and scientists. I would recommend the use of this book for teaching statistics students..." (Journal of Quality Technology, Vol. 34, No. 1, January 2002)