Data Analysis Using Regression and Multilevel/Hierarchical Models

by Andrew Gelman, Jennifer Hill

Data Analysis Using Regression and Multilevel/Hierarchical Models is a comprehensive manual for the applied researcher who wants to perform data analysis using linear and nonlinear regression and multilevel models. The book introduces a wide variety of models while also showing the reader how to fit them using available software packages. It illustrates the concepts by working through scores of real data examples drawn from the authors' own applied research, with programming code provided for each one. Topics covered include causal inference (regression, poststratification, matching, regression discontinuity, and instrumental variables), multilevel logistic regression, and missing-data imputation. Practical tips on building, fitting, and understanding models are provided throughout. Author resource page:

Product Details

Publisher: Cambridge University Press
Series: Analytical Methods for Social Research Series
Edition description: First Edition
Product dimensions: 7.01(w) x 10.00(h) x 1.38(d)

Table of Contents

List of examples
Preface
Why?
What is multilevel regression modeling?
Some examples from our own research
Motivations for multilevel modeling
Distinctive features of this book
Computing
Concepts and methods from basic probability and statistics
Probability distributions
Statistical inference
Classical confidence intervals
Classical hypothesis testing
Problems with statistical significance
55,000 residents desperately need your help!
Bibliographic note
Exercises
Single-level regression
Linear regression: the basics
One predictor
Multiple predictors
Interactions
Statistical inference
Graphical displays of data and fitted model
Assumptions and diagnostics
Prediction and validation
Bibliographic note
Exercises
Linear regression: before and after fitting the model
Linear transformations
Centering and standardizing, especially for models with interactions
Correlation and "regression to the mean"
Logarithmic transformations
Other transformations
Building regression models for prediction
Fitting a series of regressions
Bibliographic note
Exercises
Logistic regression
Logistic regression with a single predictor
Interpreting the logistic regression coefficients
Latent-data formulation
Building a logistic regression model: wells in Bangladesh
Logistic regression with interactions
Evaluating, checking, and comparing fitted logistic regressions
Average predictive comparisons on the probability scale
Identifiability and separation
Bibliographic note
Exercises
Generalized linear models
Introduction
Poisson regression, exposure, and overdispersion
Logistic-binomial model
Probit regression: normally distributed latent data
Ordered and unordered categorical regression
Robust regression using the t model
Building more complex generalized linear models
Constructive choice models
Bibliographic note
Exercises
Working with regression inferences
Simulation of probability models and statistical inferences
Simulation of probability models
Summarizing linear regressions using simulation: an informal Bayesian approach
Simulation for nonlinear predictions: congressional elections
Predictive simulation for generalized linear models
Bibliographic note
Exercises
Simulation for checking statistical procedures and model fits
Fake-data simulation
Example: using fake-data simulation to understand residual plots
Simulating from the fitted model and comparing to actual data
Using predictive simulation to check the fit of a time-series model
Bibliographic note
Exercises
Causal inference using regression on the treatment variable
Causal inference and predictive comparisons
The fundamental problem of causal inference
Randomized experiments
Treatment interactions and poststratification
Observational studies
Understanding causal inference in observational studies
Do not control for post-treatment variables
Intermediate outcomes and causal paths
Bibliographic note
Exercises
Causal inference using more advanced models
Imbalance and lack of complete overlap
Subclassification: effects and estimates for different subpopulations
Matching: subsetting the data to get overlapping and balanced treatment and control groups
Lack of overlap when the assignment mechanism is known: regression discontinuity
Estimating causal effects indirectly using instrumental variables
Instrumental variables in a regression framework
Identification strategies that make use of variation within or between groups
Bibliographic note
Exercises
Multilevel regression
Multilevel structures
Varying-intercept and varying-slope models
Clustered data: child support enforcement in cities
Repeated measurements, time-series cross sections, and other non-nested structures
Indicator variables and fixed or random effects
Costs and benefits of multilevel modeling
Bibliographic note
Exercises
Multilevel linear models: the basics
Notation
Partial pooling with no predictors
Partial pooling with predictors
Quickly fitting multilevel models in R
Five ways to write the same model
Group-level predictors
Model building and statistical significance
Predictions for new observations and new groups
How many groups and how many observations per group are needed to fit a multilevel model?
Bibliographic note
Exercises
Multilevel linear models: varying slopes, non-nested models, and other complexities
Varying intercepts and slopes
Varying slopes without varying intercepts
Modeling multiple varying coefficients using the scaled inverse-Wishart distribution
Understanding correlations between group-level intercepts and slopes
Non-nested models
Selecting, transforming, and combining regression inputs
More complex multilevel models
Bibliographic note
Exercises
Multilevel logistic regression
State-level opinions from national polls
Red states and blue states: what's the matter with Connecticut?
Item-response and ideal-point models
Non-nested overdispersed model for death sentence reversals
Bibliographic note
Exercises
Multilevel generalized linear models
Overdispersed Poisson regression: police stops and ethnicity
Ordered categorical regression: storable votes
Non-nested negative-binomial model of structure in social networks
Bibliographic note
Exercises
Fitting multilevel models
Multilevel modeling in Bugs and R: the basics
Why you should learn Bugs
Bayesian inference and prior distributions
Fitting and understanding a varying-intercept multilevel model using R and Bugs
Step by step through a Bugs model, as called from R
Adding individual- and group-level predictors
Predictions for new observations and new groups
Fake-data simulation
The principles of modeling in Bugs
Practical issues of implementation
Open-ended modeling in Bugs
Bibliographic note
Exercises
Fitting multilevel linear and generalized linear models in Bugs and R
Varying-intercept, varying-slope models
Varying intercepts and slopes with group-level predictors
Non-nested models
Multilevel logistic regression
Multilevel Poisson regression
Multilevel ordered categorical regression
Latent-data parameterizations of generalized linear models
Bibliographic note
Exercises
Likelihood and Bayesian inference and computation
Least squares and maximum likelihood estimation
Uncertainty estimates using the likelihood surface
Bayesian inference for classical and multilevel regression
Gibbs sampler for multilevel linear models
Likelihood inference, Bayesian inference, and the Gibbs sampler: the case of censored data
Metropolis algorithm for more general Bayesian computation
Specifying a log posterior density, Gibbs sampler, and Metropolis algorithm in R
Bibliographic note
Exercises
Debugging and speeding convergence
Debugging and confidence building
General methods for reducing computational requirements
Simple linear transformations
Redundant parameters and intentionally nonidentifiable models
Parameter expansion: multiplicative redundant parameters
Using redundant parameters to create an informative prior distribution for multilevel variance parameters
Bibliographic note
Exercises
From data collection to model understanding to model checking
Sample size and power calculations
Choices in the design of data collection
Classical power calculations: general principles, as illustrated by estimates of proportions
Classical power calculations for continuous outcomes
Multilevel power calculation for cluster sampling
Multilevel power calculation using fake-data simulation
Bibliographic note
Exercises
Understanding and summarizing the fitted models
Uncertainty and variability
Superpopulation and finite-population variances
Contrasts and comparisons of multilevel coefficients
Average predictive comparisons
R² and explained variance
Summarizing the amount of partial pooling
Adding a predictor can increase the residual variance!
Multiple comparisons and statistical significance
Bibliographic note
Exercises
Analysis of variance
Classical analysis of variance
ANOVA and multilevel linear and generalized linear models
Summarizing multilevel models using ANOVA
Doing ANOVA using multilevel models
Adding predictors: analysis of covariance and contrast analysis
Modeling the variance parameters: a split-plot latin square
Bibliographic note
Exercises
Causal inference using multilevel models
Multilevel aspects of data collection
Estimating treatment effects in a multilevel observational study
Treatments applied at different levels
Instrumental variables and multilevel modeling
Bibliographic note
Exercises
Model checking and comparison
Principles of predictive checking
Example: a behavioral learning experiment
Model comparison and deviance
Bibliographic note
Exercises
Missing-data imputation
Missing-data mechanisms
Missing-data methods that discard data
Simple missing-data approaches that retain all the data
Random imputation of a single variable
Imputation of several missing variables
Model-based imputation
Combining inferences from multiple imputations
Bibliographic note
Exercises
Appendixes
Six quick tips to improve your regression modeling
Fit many models
Do a little work to make your computations faster and more reliable
Graphing the relevant and not the irrelevant
Transformations
Consider all coefficients as potentially varying
Estimate causal inferences in a targeted way, not as a byproduct of a large regression
Statistical graphics for research and presentation
Reformulating a graph by focusing on comparisons
Scatterplots
Miscellaneous tips
Bibliographic note
Exercises
Software
Getting started with R, Bugs, and a text editor
Fitting classical and multilevel regressions in R
Fitting models in Bugs and R
Fitting multilevel models using R, Stata, SAS, and other software
Bibliographic note
References
Author index
Subject index
