# Introduction to Linear Regression Analysis / Edition 5

Available in Hardcover and NOOK Book.

## Overview

**Praise for the Fourth Edition**

"As with previous editions, the authors have produced a leading textbook on regression."

—*Journal of the American Statistical Association*

**A comprehensive and up-to-date introduction to the fundamentals of regression analysis**

*Introduction to Linear Regression Analysis, Fifth Edition* continues to present both the conventional and less common uses of linear regression in today’s cutting-edge scientific research. The authors blend both theory and application to equip readers with an understanding of the basic principles needed to apply regression model-building techniques in various fields of study, including engineering, management, and the health sciences.

Following a general introduction to regression modeling, including typical applications, the book outlines a host of technical tools, such as basic inference procedures, introductory aspects of model adequacy checking, and polynomial regression models and their variations. It then discusses how transformations and weighted least squares can be used to resolve problems of model inadequacy and how to deal with influential observations. The *Fifth Edition* features numerous newly added topics, including:

- A chapter on regression analysis of time series data that presents the Durbin-Watson test and other techniques for detecting autocorrelation as well as parameter estimation in time series regression models
- Regression models with random effects in addition to a discussion on subsampling and the importance of the mixed model
- Tests on individual regression coefficients and subsets of coefficients
- Examples of current uses of simple linear regression models and the use of multiple regression models for understanding patient satisfaction data

In addition to Minitab, SAS, and S-PLUS, the authors have incorporated JMP and the freely available R software to illustrate the discussed techniques and procedures in this new edition. Numerous exercises have been added throughout, allowing readers to test their understanding of the material.

*Introduction to Linear Regression Analysis, Fifth Edition* is an excellent book for statistics and engineering courses on regression at the upper-undergraduate and graduate levels. The book also serves as a valuable, robust resource for professionals in the fields of engineering, life and biological sciences, and the social sciences.

## Product Details

| ISBN-13: | 9780470542811 |
|---|---|
| Publisher: | Wiley |
| Publication date: | 05/01/2012 |
| Series: | Wiley Series in Probability and Statistics, #821 |
| Edition description: | New Edition |
| Pages: | 672 |
| Sales rank: | 613,357 |
| Product dimensions: | 6.90(w) x 10.10(h) x 1.50(d) |

## Table of Contents

PREFACE xiii

**1. INTRODUCTION 1**

1.1 Regression and Model Building 1

1.2 Data Collection 5

1.3 Uses of Regression 9

1.4 Role of the Computer 10

**2. SIMPLE LINEAR REGRESSION 12**

2.1 Simple Linear Regression Model 12

2.2 Least-Squares Estimation of the Parameters 13

2.3 Hypothesis Testing on the Slope and Intercept 22

2.4 Interval Estimation in Simple Linear Regression 29

2.5 Prediction of New Observations 33

2.6 Coefficient of Determination 35

2.7 A Service Industry Application of Regression 37

2.8 Using SAS and R for Simple Linear Regression 39

2.9 Some Considerations in the Use of Regression 42

2.10 Regression Through the Origin 45

2.11 Estimation by Maximum Likelihood 51

2.12 Case Where the Regressor x is Random 52

**3. MULTIPLE LINEAR REGRESSION 67**

3.1 Multiple Regression Models 67

3.2 Estimation of the Model Parameters 70

3.3 Hypothesis Testing in Multiple Linear Regression 84

3.4 Confidence Intervals in Multiple Regression 97

3.5 Prediction of New Observations 104

3.6 A Multiple Regression Model for the Patient Satisfaction Data 104

3.7 Using SAS and R for Basic Multiple Linear Regression 106

3.8 Hidden Extrapolation in Multiple Regression 107

3.9 Standardized Regression Coefficients 111

3.10 Multicollinearity 117

3.11 Why Do Regression Coefficients Have the Wrong Sign? 119

**4. MODEL ADEQUACY CHECKING 129**

4.1 Introduction 129

4.2 Residual Analysis 130

4.3 PRESS Statistic 151

4.4 Detection and Treatment of Outliers 152

4.5 Lack of Fit of the Regression Model 156

**5. TRANSFORMATIONS AND WEIGHTING TO CORRECT MODEL INADEQUACIES 171**

5.1 Introduction 171

5.2 Variance-Stabilizing Transformations 172

5.3 Transformations to Linearize the Model 176

5.4 Analytical Methods for Selecting a Transformation 182

5.5 Generalized and Weighted Least Squares 188

5.6 Regression Models with Random Effect 194

**6. DIAGNOSTICS FOR LEVERAGE AND INFLUENCE 211**

6.1 Importance of Detecting Influential Observations 211

6.2 Leverage 212

6.3 Measures of Influence: Cook’s D 215

6.4 Measures of Influence: DFFITS and DFBETAS 217

6.5 A Measure of Model Performance 219

6.6 Detecting Groups of Influential Observations 220

6.7 Treatment of Influential Observations 220

**7. POLYNOMIAL REGRESSION MODELS 223**

7.1 Introduction 223

7.2 Polynomial Models in One Variable 223

7.3 Nonparametric Regression 236

7.4 Polynomial Models in Two or More Variables 242

7.5 Orthogonal Polynomials 248

**8. INDICATOR VARIABLES 260**

8.1 General Concept of Indicator Variables 260

8.2 Comments on the Use of Indicator Variables 273

8.3 Regression Approach to Analysis of Variance 275

**9. MULTICOLLINEARITY 285**

9.1 Introduction 285

9.2 Sources of Multicollinearity 286

9.3 Effects of Multicollinearity 288

9.4 Multicollinearity Diagnostics 292

9.5 Methods for Dealing with Multicollinearity 303

9.6 Using SAS to Perform Ridge and Principal-Component Regression 321

**10. VARIABLE SELECTION AND MODEL BUILDING 327**

10.1 Introduction 327

10.2 Computational Techniques for Variable Selection 338

10.3 Strategy for Variable Selection and Model Building 351

10.4 Case Study: Gorman and Toman Asphalt Data Using SAS 354

**11. VALIDATION OF REGRESSION MODELS 372**

11.1 Introduction 372

11.2 Validation Techniques 373

11.3 Data from Planned Experiments 385

**12. INTRODUCTION TO NONLINEAR REGRESSION 389**

12.1 Linear and Nonlinear Regression Models 389

12.2 Origins of Nonlinear Models 391

12.3 Nonlinear Least Squares 395

12.4 Transformation to a Linear Model 397

12.5 Parameter Estimation in a Nonlinear System 400

12.6 Statistical Inference in Nonlinear Regression 409

12.7 Examples of Nonlinear Regression Models 411

12.8 Using SAS and R 412

**13. GENERALIZED LINEAR MODELS 421**

13.1 Introduction 421

13.2 Logistic Regression Models 422

13.3 Poisson Regression 444

13.4 The Generalized Linear Model 450

**14. REGRESSION ANALYSIS OF TIME SERIES DATA 474**

14.1 Introduction to Regression Models for Time Series Data 474

14.2 Detecting Autocorrelation: The Durbin-Watson Test 475

14.3 Estimating the Parameters in Time Series Regression Models 480

**15. OTHER TOPICS IN THE USE OF REGRESSION ANALYSIS 500**

15.1 Robust Regression 500

15.2 Effect of Measurement Errors in the Regressors 511

15.3 Inverse Estimation—The Calibration Problem 513

15.4 Bootstrapping in Regression 517

15.5 Classification and Regression Trees (CART) 524

15.6 Neural Networks 526

15.7 Designed Experiments for Regression 529

**APPENDIX A. STATISTICAL TABLES 541**

**APPENDIX B. DATA SETS FOR EXERCISES 553**

**APPENDIX C. SUPPLEMENTAL TECHNICAL MATERIAL 574**

C.1 Background on Basic Test Statistics 574

C.2 Background from the Theory of Linear Models 577

C.3 Important Results on SSR and SSRes 581

C.4 Gauss-Markov Theorem, Var(ε) = σ²I 587

C.5 Computational Aspects of Multiple Regression 589

C.6 Result on the Inverse of a Matrix 590

C.7 Development of the PRESS Statistic 591

C.8 Development of S²(i) 593

C.9 Outlier Test Based on R-Student 594

C.10 Independence of Residuals and Fitted Values 596

C.11 Gauss-Markov Theorem, Var(ε) = V 597

C.12 Bias in MSRes When the Model Is Underspecified 599

C.13 Computation of Influence Diagnostics 600

C.14 Generalized Linear Models 601

**APPENDIX D. INTRODUCTION TO SAS 613**

D.1 Basic Data Entry 614

D.2 Creating Permanent SAS Data Sets 618

D.3 Importing Data from an EXCEL File 619

D.4 Output Command 620

D.5 Log File 620

D.6 Adding Variables to an Existing SAS Data Set 622

**APPENDIX E. INTRODUCTION TO R TO PERFORM LINEAR REGRESSION ANALYSIS 623**

E.1 Basic Background on R 623

E.2 Basic Data Entry 624

E.3 Brief Comments on Other Functionality in R 626

E.4 R Commander 627

REFERENCES 628

INDEX 642
