Table of Contents
Preface xi
1 R: The language and the program 1
1.1 Aims of this chapter 1
1.2 R 1
1.2.1 What is R? 1
1.2.2 R as a language 3
1.2.3 R as a computer program 3
1.2.3.1 Using R interactively 4
1.2.3.2 Using R in a "batch job" 6
1.2.3.3 Editors and IDEs 7
1.3 Reproducible data analysis 9
1.4 Finding additional information 10
1.4.1 R's built-in help 11
1.4.2 Obtaining help from online forums 12
1.4.2.1 Netiquette 12
1.4.2.2 StackOverflow 13
1.4.2.3 Reporting bugs 13
1.5 What is needed to run the examples in this book? 14
1.6 Further reading 15
2 The R language: "Words" and "sentences" 17
2.1 Aims of this chapter 17
2.2 Natural and computer languages 18
2.3 Numeric values and arithmetic 18
2.4 Logical values and Boolean algebra 29
2.5 Comparison operators and operations 31
2.6 Sets and set operations 36
2.7 Character values 39
2.8 The 'mode' and 'class' of objects 41
2.9 'Type' conversions 42
2.10 Vector manipulation 45
2.11 Matrices and multidimensional arrays 51
2.12 Factors 56
2.13 Lists 62
2.13.1 Member extraction and subsetting 62
2.13.2 Adding and removing list members 63
2.13.3 Nested lists 64
2.14 Data frames 66
2.14.1 Operating within data frames 71
2.14.2 Re-arranging columns and rows 75
2.15 Attributes of R objects 77
2.16 Saving and loading data 78
2.16.1 Data sets in R and packages 78
2.16.2 .rda files 79
2.16.3 .rds files 80
2.17 Looking at data 81
2.18 Plotting 83
2.19 Further reading 86
3 The R language: "Paragraphs" and "essays" 87
3.1 Aims of this chapter 87
3.2 Writing scripts 87
3.2.1 What is a script? 88
3.2.2 How do we use a script? 88
3.2.3 How to write a script 89
3.2.4 The need to be understandable to people 90
3.2.5 Debugging scripts 91
3.3 Control of execution flow 94
3.3.1 Compound statements 94
3.3.2 Conditional execution 94
3.3.2.1 Non-vectorized if, else and switch 95
3.3.2.2 Vectorized ifelse() 98
3.3.3 Iteration 100
3.3.3.1 For loops 100
3.3.3.2 While loops 102
3.3.3.3 Repeat loops 103
3.3.4 Explicit loops can be slow in R 104
3.3.5 Nesting of loops 105
3.3.5.1 Clean-up 107
3.4 Apply functions 108
3.4.1 Applying functions to vectors and lists 108
3.4.2 Applying functions to matrices and arrays 111
3.5 Object names and character strings 113
3.6 The multiple faces of loops 115
3.6.1 Further reading 117
4 The R language: Statistics 119
4.1 Aims of this chapter 119
4.2 Statistical summaries 119
4.3 Distributions 120
4.3.1 Density from parameters 121
4.3.2 Probabilities from parameters and quantiles 122
4.3.3 Quantiles from parameters and probabilities 122
4.3.4 "Random" draws from a distribution 123
4.4 "Random" sampling 124
4.5 Correlation 125
4.5.1 Pearson's r 125
4.5.2 Kendall's τ and Spearman's ρ 127
4.6 Fitting linear models 127
4.6.1 Regression 128
4.6.2 Analysis of variance, ANOVA 135
4.6.3 Analysis of covariance, ANCOVA 138
4.7 Generalized linear models 138
4.8 Non-linear regression 140
4.9 Model formulas 143
4.10 Time series 151
4.11 Multivariate statistics 153
4.11.1 Multivariate analysis of variance 153
4.11.2 Principal components analysis 155
4.11.3 Multidimensional scaling 157
4.11.4 Cluster analysis 159
4.12 Further reading 161
5 The R language: Adding new "words" 163
5.1 Aims of this chapter 163
5.2 Packages 163
5.2.1 Sharing of R-language extensions 163
5.2.2 How packages work 164
5.2.3 Download, installation and use 165
5.2.4 Finding suitable packages 165
5.3 Defining functions and operators 166
5.3.1 Ordinary functions 168
5.3.2 Operators 171
5.4 Objects, classes, and methods 172
5.5 Scope of names 176
5.6 Further reading 177
6 New grammars of data 179
6.1 Aims of this chapter 179
6.2 Introduction 179
6.3 Packages used in this chapter 181
6.4 Replacements for data.frame 181
6.4.1 'data.table' 181
6.4.2 'tibble' 182
6.5 Data pipes 187
6.5.1 'magrittr' 188
6.5.2 'wrapr' 189
6.6 Reshaping with 'tidyr' 190
6.7 Data manipulation with 'dplyr' 192
6.7.1 Row-wise manipulations 193
6.7.2 Group-wise manipulations 195
6.7.3 Joins 198
6.8 Further reading 201
7 Grammar of graphics 203
7.1 Aims of this chapter 203
7.2 Packages used in this chapter 203
7.3 Introduction to the grammar of graphics 204
7.3.1 Data 205
7.3.2 Mapping 205
7.3.3 Geometries 205
7.3.4 Statistics 205
7.3.5 Scales 206
7.3.6 Coordinate systems 206
7.3.7 Themes 206
7.3.8 Plot construction 207
7.3.9 Plots as R objects 214
7.3.10 Data and mappings 215
7.4 Geometries 216
7.4.1 Point 217
7.4.2 Rug 221
7.4.3 Line and area 222
7.4.4 Column 225
7.4.5 Tiles 226
7.4.6 Simple features (sf) 228
7.4.7 Text 228
7.4.8 Plot insets 233
7.5 Statistics 238
7.5.1 Functions 238
7.5.2 Summaries 239
7.5.3 Smoothers and models 242
7.5.4 Frequencies and counts 245
7.5.5 Density functions 248
7.5.6 Box and whiskers plots 249
7.5.7 Violin plots 250
7.6 Facets 252
7.7 Scales 255
7.7.1 Axis and key labels 256
7.7.2 Continuous scales 258
7.7.2.1 Limits 258
7.7.2.2 Ticks and their labels 260
7.7.2.3 Transformed scales 261
7.7.2.4 Position of x and y axes 263
7.7.2.5 Secondary axes 263
7.7.3 Time and date scales for x and y 264
7.7.4 Discrete scales for x and y 265
7.7.5 Size 266
7.7.6 Color and fill 266
7.7.6.1 Color definitions in R 267
7.7.7 Continuous color-related scales 268
7.7.8 Discrete color-related scales 268
7.7.9 Identity scales 269
7.8 Adding annotations 269
7.9 Coordinates and circular plots 272
7.9.1 Wind-rose plots 272
7.9.2 Pie charts 274
7.10 Themes 275
7.10.1 Complete themes 275
7.10.2 Incomplete themes 277
7.10.3 Defining a new theme 279
7.11 Composing plots 281
7.12 Using plotmath expressions 282
7.13 Creating complex data displays 287
7.14 Creating sets of plots 288
7.14.1 Saving plot layers and scales in variables 288
7.14.2 Saving plot layers and scales in lists 289
7.14.3 Using functions as building blocks 289
7.15 Generating output files 290
7.16 Further reading 291
8 Data import and export 293
8.1 Aims of this chapter 293
8.2 Introduction 294
8.3 Packages used in this chapter 294
8.4 File names and operations 295
8.5 Opening and closing file connections 298
8.6 Plain-text files 299
8.6.1 Base R and 'utils' 301
8.6.2 'readr' 305
8.7 XML and HTML files 310
8.7.1 'xml2' 310
8.8 GPX files 311
8.9 Worksheets 312
8.9.1 CSV files as middlemen 312
8.9.2 'readxl' 312
8.9.3 'xlsx' 314
8.9.4 'readODS' 315
8.10 Statistical software 316
8.10.1 'foreign' 316
8.10.2 'haven' 317
8.11 NetCDF files 318
8.11.1 'ncdf4' 319
8.11.2 'tidync' 320
8.12 Remotely located data 322
8.13 Data acquisition from physical devices 324
8.13.1 'jsonlite' 324
8.14 Databases 325
8.15 Further reading 326
Bibliography 327
General index 331
Index of R names by category 339
Alphabetic index of R names 345