Table of Contents
About the Authors xv
Preface xix
Acknowledgements xxxi
1 How to Use This Book 1
Setting Up Your Computer 1
Running Code as You Go Along 1
Chapter Structure 2
2 Installing and Running R 3
Downloading and Installing R onto Your Computer 3
Installing Packages 7
3 Very Basic R Syntax 9
4 First Simple Programs and Graphics 13
Basic R Features 13
Commas, Brackets and Concatenation 14
The Colon Character 15
Raise to the Power of Symbol 15
Exiting from R 16
Help Pages 16
Beginning with Simple R Code to Get Used to the Command Line System 16
Playing with Graphics 19
Working with Character Variables 23
Built-in R Datasets 27
The table Function 27
Ragged Data 28
5 The Dataframe Concept 31
Combining Sets of Tables for Data Collected on Different Dates 34
Converting Factors in a Dataframe to Numeric or Character 34
6 Plotting Biological Data in Various Ways 37
Example 1 Bryophytes up a Mountain 37
Troubleshooting 1 41
Adding a Legend to a Plot 43
Troubleshooting 2 - Vector Lengths Differ 45
Troubleshooting 3 - Missing Data and NAs 46
Incorporating More Types of Data on the Same Graph 48
Example 2 Tropical Forests, Rural Population, Logarithmic Axes and Installing Packages 49
Example 3 Creating a Barplot: Bryophytes Side-by-side 52
Example 4 Stacked Bar Chart, with Different Colours, Fills and Legends 53
Example 5 Dietary Differences between Hornbill Species - Entering Data as a Table 57
Example 6 Horizontal Bar Plot of Camera Trap Data and More Troubleshooting 60
Example 7 Adding Error Bars to a Barplot or Plot: Fly Ommatidia 62
Example 8 Creating Pie Charts Using pie and circlize 64
Example 9 Fish Metacercarial Load and Box and Whisker Plots 69
Adding Notches to a Boxplot 73
Tukey's Honest Significant Difference Test 74
7 The Grammar of Graphics Family of Packages 79
8 Sets and Venn Diagrams 85
9 Statistics: Choosing the Right Test 95
Explanatory and Response Variables, Experiments and Surveys 97
Parametric versus Non-parametric Tests 98
Difference between Linear Models and Generalized Linear Models 98
Our Basic Aim Is to Achieve a Near-linear QQ Plot and Even Variance 102
10 Commonly Used Measures and Statistical Tests 103
Normality, Skew and Kurtosis 103
Testing Whether Proportions Agree with Null Expectations 104
The Special Case of Contingency Tables 106
Hardy-Weinberg Equilibrium 107
Alternatives to the Chi-squared Test under Some Circumstances 110
Testing Whether Two Means Are Significantly Different 111
Single-sample t-test 111
Two-sample t-test 112
Paired t-test 113
Testing Whether Three or More Means Differ from One Another 113
Comparing Two Variances 114
Non-normally Distributed Data with Small Sample Sizes - Mann-Whitney U Test 114
Non-parametric Two-sample Tests 116
Binomial Test 117
11 Regression and Correlation Analyses 119
Linear versus Non-linear Regression 120
Log-log Plot Example: Correlation of Numbers of Species with Area 121
Linearizing Data with No Known Underlying Model 123
Errant Points and Leverage 125
QQ Model Plot from the car Library 129
Comparing Regression Slopes and Intercepts Using t-test 130
Non-linear Regression 134
Multiple Regression 137
Pairwise Plots of Explanatory Variables to Visually Inspect Interactions 138
Polynomial Regression and Model Simplification 140
Model Simplification 143
12 Count Data as Response Variable 147
Example 1 Fledgling Numbers in Relation to Clutch Initiation Date 148
Example 2 Pollinator Flower Visits in Passiflora in Relation to Flower Size 151
13 Analysis of Variance (ANOVA) 155
Example 1 A One-way ANOVA, the InsectSprays Dataset 155
Example 2 ANOVA with Proportion Data as Response Variable Using Arcsine Transformation 157
Example 3 Analysis with Proportion Data as Response Variable Using Logit Transformation 163
14 Analysis of Covariance (ANCOVA) 166
Example 1 Growth of Tagged Gobies 166
Example 2 Fitting through the Origin and Count Data as Response Variable 168
15 More Generalized Linear Modelling 171
Model Inspection 171
Binary Response Variable with One Continuous Explanatory Variable 172
Example 1 Logistic Regression of Gall Former Predation 172
LD50s 176
Example 2 Pollinator Counts - Showing Importance of Deviance 177
Example 3 Proportion Data with N Known 182
16 Monte Carlo Tests and Randomization 187
Random Number Generator Code 187
Example 1 Flower Visits by Thai Honey Bee Species 188
Randomizing Cells in a Matrix 191
17 Principal Components Analysis 194
Example 1 Rock Oyster Allozymes 194
Example 2 The Iris Dataset 197
18 Species Abundance, Accumulation and Diversity Data 200
Species Accumulation Data 200
Species Accumulation Curves and Randomization 202
Species Richness Estimation 208
Species Diversity Indices 208
A Note to Be Cautious about Logarithms in Functions 210
Broken-stick Models 211
A Much Faster Approach Using Vectorization 214
19 Survivorship 218
Example 1 Survival of Killdeer Nests 218
20 Dates and Julian Dates 227
Problem with Two-digit Dates and POSIX: A Date of Burial Example 232
Phenology and the density Function 234
Extracting Day and Month from Julian Days 236
Seasonal Patterns and Other Smoothing Curves 238
21 Mapping and Parsing Text Input for Data 240
Creating Our Own Map from Digitized Coordinates 247
22 More on Manipulating Text 257
Example 1 Standardizing Names in a Phylogenetic Tree Description 257
Method 1 With Wildcards 259
Method 2 Based on Fixed Character String Length 262
Method 3 Using a Vector of Positions 262
Example 2 Substrings of Unknown Length 264
Trimming White Spaces and/or Tabs 268
Using Wildcards to Locate Internal Letter Strings 268
Finding Suffixes, Prefixes and Specifying Letters, Numbers and Punctuation 269
Manipulating Character Case 271
Ignoring Character Case 272
Specifying Particular and Modifiable Character Classes 273
23 Phylogenies and Trees 275
Branch Lengths 279
Random Trees 280
Different Types of Plots in ape 281
24 Working with DNA Sequences and Other Character Data 284
Sequential Runs of Base Types 288
Downloading DNA Sequences from GenBank 290
Translating DNA to Amino Acids 292
Prettifying a Table 293
Easy Ways to Extract Taxon Names from a Phylogenetic Matrix 295
Replacing Specified Ambiguity Codes with a Question Mark 296
25 Spacing in Two Dimensions 297
26 Population Modelling Including Spatially Explicit Models 303
Example 1 Ricker Population Growth Model, Plotting as You Go 303
Example 2 Host-Parasitoid Population Modelling - Discrete Time Version 306
Example 3 Spatial Host-Parasitoid Model 310
Example 4 Genetic Drift, a Program Aimed at Teaching Students about Evolution 318
27 More on apply Family of Functions - Avoid Loops to Get More Speed 322
Using apply 323
Using tapply to Calculate Values Based on Factors 324
28 Food Webs and Simple Graphics 326
A Parasitoid Foodweb Example 326
Foodweb and Community Packages 328
29 Adding Photographs 332
30 Standard Distributions in R 335
The Normal Distribution 335
Student's t Distribution 338
Lognormal Distribution 340
Logistic Distribution 341
Poisson Distribution 342
Gamma Distribution 343
The Chi-squared Distribution 344
31 Reading and Writing Data to and from Files 348
Appending Data to an Existing File 349
Using read.delim with Non-tab Separator 350
Choosing a File to Read Interactively 350
Using Excel for Data Entry 351
The readxl Function and Tibbles 352
Reading PDF Files for Data Mining 354
Writing Graphics Directly to Disc 354
Appendix 1 Summary of Graphical Parameters 357
Arguments Passed Directly to par Function 357
Arguments Applied Directly to the plot Function as well as in Some Others 357
Arguments for the lines Function 358
Having Multiple Graphics Windows Open at the Same Time 358
Macintosh-specific Graphics 359
Using the layout Function 359
Using the split.screen Function 359
Appendix 2 General Housekeeping R Functions and Others Not Covered in the Main Text 360
General Housekeeping Functions 360
Setting or Changing the Working Directory 360
Finding What Files Are in a Directory 361
Graphical Functions and Parameters 361
Interaction with User 361
Mathematical Functions 361
Writing Concatenated Data Straight to File (in the Working Directory) Using cat 362
Troubleshooting Package Installation 362
Appendix 3 Some Useful Statistical and Mathematical Equations 364
Logical Mathematical Operators 364
Descriptive Statistics 364
Distributions 365
Correlation Coefficients 365
Statistical Tests 365
Logarithms and Exponents 366
Logistic Functions 366
Weibull and Gompertz Equations 366
Trigonometric Functions 367
Convert Radians and Degrees Functions 367
Bibliography 369
Web Resources 375
Index 377
Online Supplementary Appendices
1 Online Resources: Data Files
2 Online Resources: Complete R Codes Used for Graphs, Analyses and Simulations
3 Online Resource: Suggested Answers to Exercises
These Online Resources can be found at: cabi.org/openresources/45349