Complex Surveys: A Guide to Analysis Using R / Edition 1

Paperback (Print)

Overview

As survey analysis continues to serve as a core component of sociological research, researchers increasingly rely on data gathered from complex surveys to carry out traditional analyses. Complex Surveys is a practical guide to analyzing this kind of data using R, the freely available statistical programming language. As the creator of the survey package for R, the author shows how to use the software to analyze data from complex surveys, drawing on current data from health and social science studies to demonstrate the application of survey research methods in these fields.

The book begins with coverage of basic tools and topics within survey analysis such as simple and stratified sampling, cluster sampling, linear regression, and categorical data regression. Subsequent chapters delve into more technical aspects of complex survey analysis, including post-stratification, two-phase sampling, missing data, and causal inference. Throughout the book, an emphasis is placed on graphics, regression modeling, and two-phase designs. In addition, the author supplies a unique discussion of epidemiological two-phase designs as well as probability-weighting for causal inference. All of the book's examples and figures are generated using R, and a related Web site provides the R code that allows readers to reproduce the presented content. Each chapter concludes with exercises that vary in level of complexity, and detailed appendices outline additional mathematical and computational descriptions to assist readers with comparing results from various software systems.

Complex Surveys is an excellent book for courses on sampling and complex surveys at the upper-undergraduate level. It is also a practical reference guide for applied statisticians and practitioners in the social and health sciences who use statistics in their everyday work.


Product Details

  • ISBN-13: 9780470284308
  • Publisher: Wiley
  • Publication date: 3/1/2010
  • Series: Wiley Series in Survey Methodology, #565
  • Edition description: New Edition
  • Edition number: 1
  • Pages: 296
  • Sales rank: 348,481
  • Product dimensions: 6.00 (w) x 9.10 (h) x 0.80 (d)

Meet the Author

THOMAS LUMLEY, PHD, is Associate Professor of Biostatistics at the University of Washington. He has published numerous journal articles in his areas of research interest, which include regression modeling, clinical trials, statistical computing, and survey research. Dr. Lumley created the survey package that currently accompanies the R software package, and he is also coauthor of Biostatistics: A Methodology for the Health Sciences, Second Edition, published by Wiley.


Table of Contents

Acknowledgments xi

Preface xiii

Acronyms xv

1 Basic Tools 1

1.1 Goals of inference 1

1.1.1 Population or process? 1

1.1.2 Probability samples 2

1.1.3 Sampling weights 3

1.1.4 Design effects 6

1.2 An introduction to the data 6

1.2.1 Real surveys 7

1.2.2 Populations 8

1.3 Obtaining the software 9

1.3.1 Obtaining R 10

1.3.2 Obtaining the survey package 10

1.4 Using R 10

1.4.1 Reading plain text data 10

1.4.2 Reading data from other packages 12

1.4.3 Simple computations 13

Exercises 14

2 Simple and Stratified sampling 17

2.1 Analyzing simple random samples 17

2.1.1 Confidence intervals 19

2.1.2 Describing the sample to R 20

2.2 Stratified sampling 21

2.3 Replicate weights 23

2.3.1 Specifying replicate weights to R 25

2.3.2 Creating replicate weights in R 25

2.4 Other population summaries 28

2.4.1 Quantiles 28

2.4.2 Contingency tables 30

2.5 Estimates in subpopulations 32

2.6 Design of stratified samples 34

Exercises 36

3 Cluster sampling 39

3.1 Introduction 39

3.1.1 Why clusters: the NHANES II design 39

3.1.2 Single-stage and multistage designs 41

3.2 Describing multistage designs to R 42

3.2.1 Strata with only one PSU 43

3.2.2 How good is the single-stage approximation? 44

3.2.3 Replicate weights for multistage samples 46

3.3 Sampling by size 46

3.3.1 Loss of information from sampling clusters 50

3.4 Repeated measurements 51

Exercises 54

4 Graphics 57

4.1 Why is survey data different? 57

4.2 Plotting a table 58

4.3 One continuous variable 62

4.3.1 Graphs based on the distribution function 62

4.3.2 Graphs based on the density 65

4.4 Two continuous variables 67

4.4.1 Scatterplots 67

4.4.2 Aggregation and smoothing 70

4.4.3 Scatterplot smoothers 71

4.5 Conditioning plots 72

4.6 Maps 73

4.6.1 Design and estimation issues 73

4.6.2 Drawing maps in R 76

Exercises 80

5 Ratios and linear regression 83

5.1 Ratio estimation 84

5.1.1 Estimating ratios 84

5.1.2 Ratios for subpopulation estimates 85

5.1.3 Ratio estimators of totals 85

5.2 Linear regression 90

5.2.1 The least-squares slope as an estimated population summary 90

5.2.2 Regression estimation of population totals 92

5.2.3 Confounding and other criteria for model choice 97

5.2.4 Linear models in the survey package 98

5.3 Is weighting needed in regression models? 104

Exercises 105

6 Categorical data regression 109

6.1 Logistic regression 110

6.1.1 Relative risk regression 116

6.2 Ordinal regression 117

6.2.1 Other cumulative link models 122

6.3 Loglinear models 123

6.3.1 Choosing models 124

6.3.2 Linear-association models 129

Exercises 132

7 Post-stratification, raking and calibration 135

7.1 Introduction 135

7.2 Post-stratification 136

7.3 Raking 139

7.4 Generalized raking, GREG estimation, and calibration 141

7.4.1 Calibration in R 143

7.5 Basu's elephants 149

7.6 Selecting auxiliary variables for non-response 152

7.6.1 Direct standardization 154

7.6.2 Standard error estimation 154

Exercises 154

8 Two-phase sampling 157

8.1 Multistage and multiphase sampling 157

8.2 Sampling for stratification 158

8.3 The case-control design 159

8.3.1 *Simulations: efficiency of the design-based estimator 161

8.3.2 Frequency matching 164

8.4 Sampling from existing cohorts 164

8.4.1 Logistic regression 165

8.4.2 Two-phase case-control designs in R 167

8.4.3 Survival analysis 170

8.4.4 Case-cohort designs in R 171

8.5 Using auxiliary information from phase one 174

8.5.1 Population calibration for regression models 175

8.5.2 Two-phase designs 178

8.5.3 Some history of the two-phase calibration estimator 181

Exercises 182

9 Missing data 185

9.1 Item non-response 185

9.2 Two-phase estimation for missing data 186

9.2.1 Calibration for item non-response 186

9.2.2 Models for response probability 189

9.2.3 Effect on precision 190

9.2.4 *Doubly-robust estimators 192

9.3 Imputation of missing data 193

9.3.1 Describing multiple imputations to R 195

9.3.2 Example: NHANES III imputations 196

Exercises 200

10 *Causal inference 203

10.1 IPTW estimators 204

10.1.1 Randomized trials and calibration 204

10.1.2 Estimated weights for IPTW 207

10.1.3 Double robustness 211

10.2 Marginal Structural Models 211

Appendix A Analytic Details 217

A.1 Asymptotics 217

A.1.1 Embedding in an infinite sequence 217

A.1.2 Asymptotic unbiasedness 218

A.1.3 Asymptotic normality and consistency 220

A.2 Variances by linearization 221

A.2.1 Subpopulation inference 221

A.3 Tests in contingency tables 223

A.4 Multiple imputation 224

A.5 Calibration and influence functions 225

A.6 Calibration in randomized trials and ANCOVA 226

Appendix B Basic R 231

B.1 Reading data 231

B.1.1 Plain text data 231

B.2 Data manipulation 232

B.2.1 Merging 232

B.2.2 Factors 233

B.3 Randomness 233

B.4 Methods and objects 234

B.5 *Writing functions 235

B.5.1 Repetition 236

B.5.2 Strings 238

Appendix C Computational details 239

C.1 Linearization 239

C.1.1 Generalized linear models and expected information 240

C.2 Replicate weights 240

C.2.1 Choice of estimators 240

C.2.2 Hadamard matrices 241

C.3 Scatterplot smoothers 242

C.4 Quantiles 242

C.5 Bug reports and feature requests 244

Appendix D Database-backed design objects 245

D.1 Large data 245

D.2 Setting up database interfaces 247

D.2.1 ODBC 247

D.2.2 DBI 248

Appendix E Extending the package 249

E.1 A case study: negative binomial regression 249

E.2 Using a Poisson model 250

E.3 Replicate weights 251

E.4 Linearization 253

References 257

Author Index 269

Topic Index 271
