Ecological Models and Data in R

by Benjamin M. Bolker

Overview

Ecological Models and Data in R is the first truly practical introduction to modern statistical methods for ecology. In step-by-step detail, the book teaches ecology graduate students and researchers everything they need to know in order to use maximum likelihood, information-theoretic, and Bayesian techniques to analyze their own data using the programming language R. Drawing on extensive experience teaching these techniques to graduate students in ecology, Benjamin Bolker shows how to choose among and construct statistical models for data, estimate their parameters and confidence limits, and interpret the results. The book also covers statistical frameworks, the philosophy of statistical modeling, and critical mathematical functions and probability distributions. It requires no programming background--only basic calculus and statistics.


  • Practical, beginner-friendly introduction to modern statistical techniques for ecology using the programming language R

  • Step-by-step instructions for fitting models to messy, real-world data

  • Balanced view of different statistical approaches

  • Wide coverage of techniques--from simple (distribution fitting) to complex (state-space modeling)

  • Techniques for data manipulation and graphical display

  • Companion Web site with data and R code for all examples


Product Details

ISBN-13: 9781400840908
Publisher: Princeton University Press
Publication date: 07/01/2008
Sold by: Barnes & Noble
Format: eBook
Pages: 408
File size: 16 MB

About the Author

Benjamin M. Bolker is a theoretical ecologist in the departments of Mathematics & Statistics and Biology at McMaster University.

Read an Excerpt

Ecological Models and Data in R


By B.M. Bolker
Copyright © 2008 Princeton University Press
All rights reserved.

ISBN: 978-0-691-12522-0


Chapter One Introduction and Background

This chapter gives a broad overview of the philosophy and techniques of ecological modeling. A small data set on seed removal illustrates the three most common frameworks for statistical modeling in ecology: frequentist, likelihood-based, and Bayesian. The chapter also reviews what you should know to get the most out of the book, discusses the R language, and spells out a step-by-step process for building models of ecological systems.

If you're impatient with philosophical discussion, you can read Section 1.4 and the R supplement at the end of the chapter and move on to Chapter 2.

1.1 Introduction

This book is about combining models with data to answer ecological questions. Pursuing this worthwhile goal will lead to topics ranging from basic statistics, to the cutting edge of modern statistics, to the nuts and bolts of computer programming, to the philosophy of science. Remember as we go along not to miss the ecological forest for the statistical trees; all of these complexities are in the service of answering ecological questions, and the most important thing is to keep your common sense about you and your focus on the biological questions you set out to answer. "Does this make sense?" and "What does this answer really mean?" are the two questions you should ask constantly. If you cannot answer them, back up to the last point you understood.

If you want to combine models with data, you need to use statistical tools. Ecological statistics has gotten much more complicated in the last few decades. Research papers in ecology now routinely refer to likelihood, Markov chain Monte Carlo, and other arcana. This new complexity arises from the explosion of cheap computing power, which allows us to run complicated tests quickly and easily, or at least more easily than before. But there is still a lot to know about how these tests work, which is what this book is about. The good news is that we can now develop statistical methods that directly answer our ecological questions, adapting statistics to the data rather than vice versa. Instead of asking "What is the probability of observing at least this much variability among the arcsine-square-root-transformed counts of seeds in different treatments?" we can ask "Is the number of seeds removed consistent with standard foraging theory, and what are the attack rates and handling times of predators? Do the attack rates or handling times increase with mean seed size? With the time that the seeds have been available? Is there evidence for variability among seeds?" By customizing statistical tests we can squeeze more information, and more relevant information, from expensive data. Building your own statistical tests is not easy, but it is really no harder than using any of the other tools ecologists have picked up in their ongoing effort to extract meaning from the natural world (stable isotope techniques, radiotelemetry, microsatellite population genetics, geographic information systems, otolith analysis, flow cytometry, mist netting ... you can probably identify several more from your own field). Custom statistical techniques are just another set of tools in the modern ecologist's toolbox; the information this book presents should show you how to use them on your own data, to answer your own questions.

For example, Sandin and Pacala (2005b) combined population counts through time with remote underwater video monitoring to analyze how the density of reef fishes in the Caribbean affected their risk of predation. The classic approach to this problem would be to test for a significant correlation between density and mortality rate, or between density and predator activity. A positive correlation between prey population density and the number of observed predator visits or attacks would suggest that prey aggregations attract predators. If predator attacks on the prey population are proportional to population density, then the predation rate per prey individual will be independent of density; predator attacks would need to accelerate with increasing population density in order for predators to regulate the prey population. One could test for positive correlations between prey density and per capita mortality to see whether this is so.

However, correlation analysis assumes the data are bivariate normally distributed, while linear regression assumes a linear relationship between a predictor variable and a normally distributed response variable. Although one can sometimes transform data to satisfy these assumptions, or simply ignore minor violations, Sandin and Pacala took a more powerful approach: they built explicit models to describe how the absolute and per capita predator visits or mortality depended on prey population density. For example, the absolute mortality probability would be $r_0 + r_1 n$ and the per capita mortality probability would be $(r_0 + r_1 n)/n$ if predator visits are proportional to prey density. They also used realistic binomial and Poisson probability distributions to describe the variation in the data, rather than assuming normality (a particularly awkward assumption when there are lots of zeros in the data). By doing so, they were able to choose among a variety of possible models and conclude that predators induce inverse density dependence in this system (i.e., that smaller prey populations experience higher per capita mortality, because predators are present at relatively constant numbers independent of prey density). Because they fitted models rather than running classical statistical tests on transformed data, they were also able to estimate meaningful parameter values, such as the increase in predator visits per hour for every additional prey individual present. These values are more useful than p (significance) values, or than regression slopes from transformed data, because they express statistical information in ecological terms.
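To make the two mortality expressions concrete, the following R sketch evaluates them over a range of prey densities. It is not taken from Sandin and Pacala or from the book's companion code: the parameter values `r0` and `r1`, the densities in `n`, and the binomial draw at the end are all assumptions chosen only for illustration.

```r
# Hypothetical parameter values (assumptions, not estimates from any study)
r0 <- 0.2    # baseline mortality probability
r1 <- 0.01   # increase in mortality probability per additional prey individual
n  <- c(1, 5, 10, 20, 40)   # prey population densities to evaluate

# Absolute and per capita mortality probabilities as given in the text
absolute_mortality   <- r0 + r1 * n
per_capita_mortality <- (r0 + r1 * n) / n

# One possible way to add binomial variation: each of the n individuals
# dies independently with the per capita probability (an assumption)
set.seed(1)
deaths <- rbinom(length(n), size = n, prob = per_capita_mortality)

data.frame(n, absolute_mortality, per_capita_mortality, deaths)
```

With a positive baseline `r0`, the per capita probability falls as `n` grows even though the absolute probability rises, which is the pattern of inverse density dependence described above.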

1.2 What This Book Is Not About

1.2.1 What You Should Already Know

To get the most out of the material presented here you should already have a good grasp of basic statistics, be comfortable with computers (e.g., have used Microsoft Excel to deal with data), and have some rusty calculus. But attitude and aptitude are more important than previous classroom experience. Getting into this material requires some hard work at the outset, but it will become easier as you brush up on basic concepts.

STATISTICS

I assume that you've had the equivalent of a one-semester undergraduate statistics course. The phrases hypothesis test, analysis of variance, linear regression, normal distribution (maybe even Central Limit Theorem) should be familiar to you, even if you don't remember all of the details. The basics of experimental design, including the meaning of and need for randomization, control, independence, and replication in setting up experiments, the idea of statistical power, and the concept of pseudoreplication (Hurlbert, 1984; Hargrove and Pickering, 1992; Heffner et al., 1996; Oksanen, 2001), are essential tools for any working ecologist, but you can learn them from a good introductory statistics class or textbook such as Gotelli and Ellison (2004) or Quinn and Keough (2002).

Further reading: If you need to review statistics, try Crawley (2002), Dalgaard (2003), or Gotelli and Ellison (2004). Gonick and Smith's 1993 Cartoon Guide to Statistics gives a gentle introduction to some basic concepts, but you will need to go beyond what they cover. Sokal and Rohlf (1995), Zar (1999), and Crawley (2005, 2007) cover a broader range of classical statistics. For experimental design, try Underwood (1996), Scheiner and Gurevitch (2001), or Quinn and Keough (2002) (the latter two discuss statistical analysis as well).

COMPUTERS

This book will teach you how to use computers to understand data. You will be writing a few lines of R code at a time rather than full-blown computer programs, but you will have to go beyond pointing and clicking. You need to be comfortable with computers, and with using spreadsheets like Excel to manipulate data. Familiarity with a mainstream statistics package like SPSS or SAS will be useful, although you should definitely use R to work through this book instead of falling back on a familiar software package. (If you have used R already, you'll have a big head start.) You needn't have done any programming.

MATH

Having "rusty" calculus means knowing what a derivative and an integral are. While it would be handy to remember a few of the formulas for derivatives, a feeling for the meanings of logarithms, exponentials, derivatives, and integrals is more important than the formulas (you'll find the formulas in the appendix). In working through this book you will have to use algebra, as much as calculus, in a routine way to solve equations and answer questions. Most of the people who have taken my classes were very rusty when they started.

Further reading: Adler (2004) gives a very applied review of basic calculus, differential equations, and probability, while Neuhauser (2003) covers calculus in a more rigorous and traditional way, but still with a biological slant.

ECOLOGY

I have assumed you know some basic ecological concepts, since they are the foundation of ecological data analysis. You should be familiar, for example, with exponential and logistic growth from population ecology; functional responses from predator-prey ecology; and competitive exclusion from community ecology.

Further reading: For a short introduction to ecological theory, try Hastings (1997) or Vandermeer and Goldberg (2004) (the latter is more general). Gotelli (2001) is more detailed. Begon et al. (1996) gives an extremely thorough introduction to general ecology, including some basic ecological models. Case (1999) provides an illustrated treatment of theory, while Roughgarden (1997) integrates ecological theory with programming examples in MATLAB. Mangel (2006) and Otto and Day (2007), two new books, both give basic introductions to the "theoretical biologist's toolbox."

1.2.2 Other Kinds of Models

Ecologists sometimes want to "learn how to model" without knowing clearly what questions they hope the models will answer, and without knowing what kind of models might be useful. This is a bit like saying "I want to learn to do experiments" or "I want to learn molecular biology": Do you want to analyze microsatellites? Use RNA inactivation to knock out gene function? Sequence genomes? What people usually mean by "I want to learn how to model" is "I have heard that modeling is a powerful tool and I think it could tell me something about my system, but I'm not really sure what it can do."

Ecological modeling has many facets. This book covers only one: statistical modeling, with a bias toward mechanistic descriptions of ecological patterns. The next section briefly reviews a much broader range of modeling frameworks and gives some starting points in the modeling literature in case you want to learn more about other kinds of ecological models.

1.3 Frameworks for Modeling

This book is primarily about how to combine models with data and how to use them to discover the answers to theoretical or applied questions. To help fit statistical models into the larger picture, Table 1.1 presents a broad range of dichotomies that cover some of the kinds and uses of ecological models. The discussion of these dichotomies starts to draw in some of the statistical, mathematical, and ecological concepts I suggested you should know. However, if a few are unfamiliar, don't worry; the next few chapters will review the most important concepts. Part of the challenge of learning the material in this book is a chicken-and-egg problem: to know why certain technical details are important, you need to know the big picture, but the big picture itself involves knowing some of those technical details. Iterating, or cycling, is the best way to handle this problem. Most of the material introduced in this chapter will be covered in more detail in later chapters. If you don't completely get it this time around, hang on and see if it makes more sense the second time.

1.3.1 Scope and Approach

The first set of dichotomies in the table subdivides models into two categories, one (theoretical/strategic) that aims for general insight into the workings of ecological processes and one (applied/tactical) that aims to describe and predict how a particular system functions, often with the goal of forecasting or managing its behavior. Theoretical models are often mathematically difficult and ecologically oversimplified, which is the price of generality. Paradoxically, although theoretical models are defined in terms of precise numbers of individuals, because of their simplicity they are usually used only for qualitative predictions. Applied models are often mathematically simpler (although they can require complex computer code) but tend to capture more of the ecological complexity and quirkiness needed to make detailed predictions about a particular place and time. Because of this complexity their predictions are often less general.

The dichotomy of mathematical versus statistical modeling says more about the culture of modeling and how different disciplines go about thinking about models than about how we should actually model ecological systems. A mathematician is more likely to produce a deterministic, dynamic process model without thinking very much about noise and uncertainty (e.g., the ordinary differential equations that make up the Lotka-Volterra predator-prey model). A statistician, on the other hand, is more likely to produce a stochastic but static model that treats noise and uncertainty carefully but focuses more on static patterns than on the dynamic processes that produce them (e.g., linear regression).
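The contrast between the two cultures can be sketched in a few lines of base R. Nothing here comes from the book: the form of the Lotka-Volterra equations used (dN/dt = rN - aNP, dP/dt = eaNP - mP), the parameter names and values, the crude Euler integration, and the simulated regression data are all assumptions made for illustration.

```r
# Deterministic, dynamic: Lotka-Volterra predator-prey equations,
# integrated with simple Euler steps (an assumed, deliberately crude method)
lotka_volterra_euler <- function(N0, P0, r, a, e, m, dt, steps) {
  N <- numeric(steps + 1)
  P <- numeric(steps + 1)
  N[1] <- N0
  P[1] <- P0
  for (i in seq_len(steps)) {
    dN <- r * N[i] - a * N[i] * P[i]       # prey growth minus predation
    dP <- e * a * N[i] * P[i] - m * P[i]   # predator gain minus mortality
    N[i + 1] <- N[i] + dt * dN
    P[i + 1] <- P[i] + dt * dP
  }
  data.frame(time = (0:steps) * dt, prey = N, predator = P)
}

# Hypothetical parameter values; the same inputs always give the same output
traj <- lotka_volterra_euler(N0 = 10, P0 = 5, r = 1, a = 0.1,
                             e = 0.5, m = 0.5, dt = 0.01, steps = 2000)
head(traj)

# Stochastic, static: linear regression on simulated noisy data
set.seed(42)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(length(x), sd = 1)   # assumed true line plus noise
fit <- lm(y ~ x)
coef(fit)      # point estimates of intercept and slope
confint(fit)   # uncertainty around those estimates
```

The first model says nothing about noise but describes change through time; the second quantifies uncertainty carefully but describes only a fixed pattern between `x` and `y`.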

The important difference between phenomenological (pattern) and mechanistic (process) models will be with us throughout the book. Phenomenological models concentrate on observed patterns in the data, using functions and distributions that are the right shape and/or sufficiently flexible to match them; mechanistic models are more concerned with the underlying processes, using functions and distributions based on theoretical expectations. As usual, shades of gray abound; the same function could be classified as either phenomenological or mechanistic depending on why it was chosen. For example, you could use the function $f(x) = ax/(b + x)$ (a Holling type II functional response) as a mechanistic model in a predator-prey context because you expected predators to attack prey at a constant rate and be constrained by handling time, or as a phenomenological model of population growth simply because you wanted a function that started at zero, was initially linear, and leveled off as it approached an asymptote (see Chapter 3). All other things being equal, mechanistic models are more powerful since they tell you about the underlying processes driving patterns. They are more likely to work correctly when extrapolating beyond the observed conditions. Finally, by making more assumptions, they allow you to extract more information from your data, with the risk of making the wrong assumptions.
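The shape of this function is easy to check numerically. The R sketch below is an illustration only: the function name `holling2` and the values of `a` and `b` are assumptions, not code from the book.

```r
# Holling type II function f(x) = a * x / (b + x)
# a: asymptote; b: value of x at which f reaches half of a
holling2 <- function(x, a, b) {
  a * x / (b + x)
}

a <- 2   # hypothetical asymptote
b <- 5   # hypothetical half-saturation constant

x <- c(0, 1, 5, 50, 500)
rate <- holling2(x, a = a, b = b)
data.frame(x, rate)

# Properties mentioned in the text
holling2(0, a, b)   # starts at zero
a / b               # slope near x = 0, where the curve is roughly linear
holling2(b, a, b)   # equals a / 2 at x = b
```

Calling `curve(holling2(x, a = 2, b = 5), from = 0, to = 50)` would draw the same curve, which rises almost linearly at first and then levels off toward `a`.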

Examples of theoretical models include the Lotka-Volterra or Nicholson-Bailey predator-prey equations (Hastings, 1997); classical metapopulation models for single (Hanski, 1999) and multiple (Levins and Culver, 1971; Tilman, 1994) species; simple food web models (May, 1973; Cohen et al., 1990); and theoretical ecosystem models (Agren and Bosatta, 1996). Applied models include forestry and biogeochemical cycling models (Blanco et al., 2005), fisheries stock-recruitment models (Quinn and Deriso, 1999), and population viability analysis (Morris and Doak, 2002; Miller and Lacy, 2005).

Further reading: Books on ecological modeling overlap with those on ecological theory listed on p. 4. Other good sources include Nisbet and Gurney (1982; a well-written but challenging classic), Gurney and Nisbet (1998; a lighter version), Haefner (1996; broader, including physiological and ecosystem perspectives), Renshaw (1991; good coverage of stochastic models), Wilson (2000; simulation modeling in C), and Ellner and Guckenheimer (2006; dynamics of biological systems in general).

(Continues...)



Excerpted from Ecological Models and Data in R by B.M. Bolker
Copyright © 2008 by Princeton University Press. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

Acknowledgments
Chapter 1: Introduction and Background
1.1 Introduction
1.2 What This Book Is Not About
1.3 Frameworks for Modeling
1.4 Frameworks for Statistical Inference
1.5 Frameworks for Computing
1.6 Outline of the Modeling Process
1.7 R Supplement
Chapter 2: Exploratory Data Analysis and Graphics
2.1 Introduction
2.2 Getting Data into R
2.3 Data Types
2.4 Exploratory Data Analysis and Graphics
2.5 Conclusion
2.6 R Supplement
Chapter 3: Deterministic Functions for Ecological Modeling
3.1 Introduction
3.2 Finding Out about Functions Numerically
3.3 Finding Out about Functions Analytically
3.4 Bestiary of Functions
3.5 Conclusion
3.6 R Supplement
Chapter 4: Probability and Stochastic Distributions for Ecological Modeling
4.1 Introduction: Why Does Variability Matter?
4.2 Basic Probability Theory
4.3 Bayes’ Rule
4.4 Analyzing Probability Distributions
4.5 Bestiary of Distributions
4.6 Extending Simple Distributions: Compounding and Generalizing
4.7 R Supplement
Chapter 5: Stochastic Simulation and Power Analysis
5.1 Introduction
5.2 Stochastic Simulation
5.3 Power Analysis
Chapter 6: Likelihood and All That
6.1 Introduction
6.2 Parameter Estimation: Single Distributions
6.3 Estimation for More Complex Functions
6.4 Likelihood Surfaces, Profiles, and Confidence Intervals
6.5 Confidence Intervals for Complex Models: Quadratic Approximation
6.6 Comparing Models
6.7 Conclusion
Chapter 7: Optimization and All That
7.1 Introduction
7.2 Fitting Methods
7.3 Markov Chain Monte Carlo
7.4 Fitting Challenges
7.5 Estimating Confidence Limits of Functions of Parameters
7.6 R Supplement
Chapter 8: Likelihood Examples
8.1 Tadpole Predation
8.2 Goby Survival
8.3 Seed Removal
Chapter 9: Standard Statistics Revisited
9.1 Introduction
9.2 General Linear Models
9.3 Nonlinearity: Nonlinear Least Squares
9.4 Nonnormal Errors: Generalized Linear Models
9.5 R Supplement
Chapter 10: Modeling Variance
10.1 Introduction
10.2 Changing Variance within Blocks
10.3 Correlations: Time-Series and Spatial Data
10.4 Multilevel Models: Special Cases
10.5 General Multilevel Models
10.6 Challenges
10.7 Conclusion
10.8 R Supplement
Chapter 11: Dynamic Models
11.1 Introduction
11.2 Simulating Dynamic Models
11.3 Observation and Process Error
11.4 Process and Observation Error
11.5 SIMEX
11.6 State-Space Models
11.7 Conclusions
11.8 R Supplement
Chapter 12: Afterword
Appendix: Algebra and Calculus Basics
A.1 Exponentials and Logarithms
A.2 Differential Calculus
A.3 Partial Differentiation
A.4 Integral Calculus
A.5 Factorials and the Gamma Function
A.6 Probability
A.7 The Delta Method
A.8 Linear Algebra Basics
Bibliography
Index of R Arguments, Functions, and Packages
General Index

What People are Saying About This

Brian Inouye

Benjamin Bolker is a pioneer in helping ecology students make the leap from a casual understanding of modern statistical methods to a hands-on application of these tools to their own precious data sets. This book shows the lessons learned from teaching this material to several cohorts of graduate students. No other book I've read gives such a good feel for the compromises scientists have to make in searching for good statistical models.
Brian Inouye, Florida State University

Stephan B. Munch

This user-friendly introduction to likelihood and Bayesian statistical methods for ecology students is set apart by its emphasis on implementation in R. This alone will make it more useful than previous books. In contrast to other texts, Bolker's book explains how to fit models to data in enough detail that even students with little programming experience will be able to follow along. I expect this to become an exceedingly popular textbook.
Stephan B. Munch, Stony Brook University

Timothy Essington

I have no doubt that this book will become a fixture on many ecologists' bookshelves (it certainly will be on mine). With a presentation that is gentle and encouraging rather than jargon-filled and intimidating, it empowers ecologists to develop their own statistical procedures. I strongly recommend it.
Timothy Essington, University of Washington
