A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data


Overview

A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data by Gary King, Princeton University Press

"This is a significant contribution to political methodology, and to statistical methodology throughout the social sciences. As always with Gary King's work, it is written with great flair and sophistication. This book will generate a good deal of excitement at the methodological frontier, and will also have a bracing impact on substantive research in a variety of fields."—Larry M. Bartels, Princeton University

"In this work, Gary King presents a number of new and important contributions to the field of statistical theory, and the practice of estimating choice probabilities from data aggregated into groups. An impressive statistical contribution."—Melvin J. Hinich, University of Texas-Austin

Product Details

ISBN-13: 9780691012407
Publisher: Princeton University Press
Publication date: 03/17/1997
Pages: 346
Product dimensions: 7.75 in. (w) x 10.00 in. (h) x 0.92 in. (d)

About the Author

Gary King is Professor of Government at Harvard University. He has authored and coauthored numerous journal articles and books in the field of political methodology, including Designing Social Inquiry: Scientific Inference in Qualitative Research (Princeton).

Table of Contents

List of Figures

List of Tables

Preface

PART I: INTRODUCTION

1. Qualitative Overview

1.1 The Necessity of Ecological Inferences

1.2 The Problem

1.3 The Solution

1.4 The Evidence

1.5 The Method

2. Formal Statement of the Problem

PART II: CATALOG OF PROBLEMS TO FIX

3. Aggregation Problems

3.1 Goodman's Regression: A Definition

3.2 The Indeterminacy Problem

3.3 The Grouping Problem

3.4 Equivalence of the Grouping and Indeterminacy Problems

3.5 A Concluding Definition

4. Non-Aggregation Problems

4.1 Goodman Regression Model Problems

4.2 Applying Goodman's Regression in 2 x 3 Tables

4.3 Double Regression Problems

4.4 Concluding Remarks

PART III: THE PROPOSED SOLUTION

5. The Data: Generalizing the Method of Bounds

5.1 Homogeneous Precincts: No Uncertainty

5.2 Heterogeneous Precincts: Upper and Lower Bounds

5.2.1 Precinct-Level Quantities of Interest

5.2.2 District-Level Quantities of Interest

5.3 An Easy Visual Method for Computing Bounds

6. The Model

6.1 The Basic Model

6.2 Model Interpretation

6.2.1 Observable Implications of Model Parameters

6.2.2 Parameterizing the Truncated Bivariate Normal

6.2.3 Computing 2p Parameters from Only p Observations

6.2.4 Connections to the Statistics of Medical and Seismic Imaging

6.2.5 Would a Model of Individual-Level Choices Help?

7. Preliminary Estimation

7.1 A Visual Introduction

7.2 The Likelihood Function

7.3 Parameterizations

7.4 Optional Priors

7.5 Summarizing Information about Estimated Parameters

8. Calculating Quantities of Interest

8.1 Simulation Is Easier than Analytical Derivation

8.1.1 Definitions and Examples

8.1.2 Simulation for Ecological Inference

8.2 Precinct-Level Quantities

8.3 District-Level Quantities

8.4 Quantities of Interest from Larger Tables

8.4.1 A Multiple Imputation Approach

8.4.2 An Approach Related to Double Regression

8.5 Other Quantities of Interest

9. Model Extensions

9.1 What Can Go Wrong?

9.1.1 Aggregation Bias

9.1.2 Incorrect Distributional Assumptions

9.1.3 Spatial Dependence

9.2 Avoiding Aggregation Bias

9.2.1 Using External Information

9.2.2 Unconditional Estimation: Xi as a Covariate

9.2.3 Tradeoffs and Priors for the Extended Model

9.2.4 Ex Post Diagnostics

9.3 Avoiding Distributional Problems

9.3.1 Parametric Approaches

9.3.2 A Nonparametric Approach

PART IV: VERIFICATION

10. A Typical Application Described in Detail: Voter Registration by Race

10.1 The Data

10.2 Likelihood Estimation

10.3 Computing Quantities of Interest

10.3.1 Aggregate

10.3.2 County Level

10.3.3 Other Quantities of Interest

11. Robustness to Aggregation Bias: Poverty Status by Sex

11.1 Data and Notation

11.2 Verifying the Existence of Aggregation Bias

11.3 Fitting the Data

11.4 Empirical Results

12. Estimation without Information: Black Registration in Kentucky

12.1 The Data

12.2 Data Problems

12.3 Fitting the Data

12.4 Empirical Results

13. Classic Ecological Inferences

13.1 Voter Transitions

13.1.1 Data

13.1.2 Estimates

13.2 Black Literacy in 1910

PART V: GENERALIZATIONS AND CONCLUDING SUGGESTIONS

14. Non-Ecological Aggregation Problems

14.1 The Geographer's Modifiable Areal Unit Problem

14.1.1 The Problem with the Problem

14.1.2 Ecological Inference as a Solution to the Modifiable Areal Unit Problem

14.2 The Statistical Problem of Combining Survey and Aggregate Data

14.3 The Econometric Problem of Aggregating Continuous Variables

14.4 Concluding Remarks on Related Aggregation Research

15. Ecological Inference in Larger Tables

15.1 An Intuitive Approach

15.2 Notation for a General Approach

15.3 Generalized Bounds

15.4 The Statistical Model

15.5 Distributional Implications

15.6 Calculating the Quantities of Interest

15.7 Concluding Suggestions

16. A Concluding Checklist

PART VI: APPENDICES

A. Proof That All Discrepancies Are Equivalent

B Parameter Bounds

B.1 Homogeneous Precincts

B.2 Heterogeneous Precincts

B.3 Heterogeneous Precincts

C Conditional Posterior Distribution

C.1 Using Bayes Theorem

C.2 Using Properties of Normal Distributions

D The Likelihood Function

E The Details of Nonparametric Estimation

F Computational Issues

Glossary of Symbols

References

Index
