Beyond Spreadsheets with R: A beginner's guide to R and RStudio
Paperback (1st Edition)
Overview
Beyond Spreadsheets with R shows you how to take raw data and transform it for use in computations, tables, graphs, and more. You'll build on simple programming techniques like loops and conditionals to create your own custom functions. You'll come away with a toolkit of strategies for analyzing and visualizing data of all sorts using R and RStudio.
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the Technology
Spreadsheets are powerful tools for many tasks, but if you need to interpret, interrogate, and present data, they can feel like the wrong tools for the task. That's when R programming is the way to go. The R programming language provides a comfortable environment to properly handle all types of data. And within the open source RStudio development suite, you have at your fingertips easy-to-use ways to simplify complex manipulations and create reproducible processes for analysis and reporting.
About the Book
With Beyond Spreadsheets with R you'll learn how to go from raw data to meaningful insights using R and RStudio. Each carefully crafted chapter covers a unique way to wrangle data, from understanding individual values to interacting with complex collections of data, including data you scrape from the web. You'll build on simple programming techniques like loops and conditionals to create your own custom functions. You'll come away with a toolkit of strategies for analyzing and visualizing data of all sorts.
What's inside
- How to start programming with R and RStudio
- Understanding and implementing important R structures and operators
- Installing and working with R packages
- Tidying, refining, and plotting your data
About the Reader
If you're comfortable writing formulas in Excel, you're ready for this book.
About the Author
Dr Jonathan Carroll is a data science consultant providing R programming services. He holds a PhD in theoretical physics.
Table of Contents
- Introducing data and the R language
- Getting to know R data types
- Making new data values
- Understanding the tools you'll use: Functions
- Combining data values
- Selecting data values
- Doing things with lots of data
- Doing things conditionally: Control structures
- Visualizing data: Plotting
- Doing more with your data with extensions
Product Details
| Detail | Value |
|---|---|
| ISBN-13 | 9781617294594 |
| Publisher | Manning |
| Publication date | 12/17/2018 |
| Edition description | 1st Edition |
| Pages | 352 |
| Product dimensions | 7.30 (w) x 9.10 (h) x 0.80 (d) |
Table of Contents
Preface xiii
Acknowledgments xv
About this book xvii
About the authors xxv
About the cover illustration xxvi
1 Introducing data and the R language 1
1.1 Data: What, where, how? 2
What is data? 2
Seeing the world as data sources 2
Data munging 4
What you can do with well-handled data 4
Data as an asset 7
Reproducible research and version control 9
1.2 Introducing R 11
The origins of R 12
What R is and what it isn't 13
1.3 How R works 14
1.4 Introducing RStudio 17
Working with R within RStudio 17
Built-in packages (data and functions) 22
Built-in documentation 23
Vignettes 24
1.5 Try it yourself 24
2 Getting to know R data types 26
2.1 Types of data 27
Numbers 27
Text (strings) 31
Categories (factors) 32
Dates and times 35
Logicals 36
Missing values 37
2.2 Storing values (assigning) 38
Naming data (variables) 38
Unchanging data 43
The assignment operators (<- vs. =) 44
2.3 Specifying the data type 46
2.4 Telling R to ignore something 50
2.5 Try it yourself 51
3 Making new data values 53
3.1 Basic mathematics 53
3.2 Operator precedence 56
3.3 String concatenation (joining) 57
3.4 Comparisons 59
3.5 Automatic conversion (coercion) 63
3.6 Try it yourself 65
4 Understanding the tools you'll use: Functions 67
4.1 Functions 68
Under the hood 70
Function template 72
Arguments 75
Multiple arguments 78
Default arguments 80
Argument name matching 82
Partial matching 84
Scope 86
4.2 Packages 90
Installing packages 92
How does R (not) know about this function? 95
Namespaces 95
4.3 Messages, warnings, and errors, oh my! 97
Creating messages, warnings, and errors 98
Diagnosing messages, warnings, and errors 100
4.4 Testing 102
4.5 Project: Generalizing a function 103
4.6 Try it yourself 104
5 Combining data values 106
5.1 Simple collections 106
Coercion 108
Missing values 109
Attributes 109
Names 110
5.2 Sequences 112
Vector functions 116
Vector math operations 117
5.3 Matrices 119
Naming dimensions 121
5.4 Lists 122
5.5 data.frames 125
5.6 Classes 129
The tibble class 131
Structures as function arguments 135
5.7 Try it yourself 136
6 Selecting data values 139
6.1 Text processing 140
Text matching 140
Substrings 142
Text substitutions 142
Regular expressions 143
6.2 Selecting components from structures 146
Vectors 146
Lists 149
Matrices 153
6.3 Replacing values 155
6.4 data.frames and dplyr 159
dplyr verbs 160
Non-standard evaluation 162
Pipes 164
Subsetting data.frame the hard way 167
6.5 Replacing NA 170
6.6 Selecting conditionally 171
6.7 Summarizing values 174
6.8 A worked example: Excel vs. R 177
6.9 Try it yourself 178
Solutions: no peeking 179
7 Doing things with lots of data 182
7.1 Tidy data principles 182
The working directory 184
Stored data formats 186
Reading data into R 187
Scraping data 191
Inspecting data 195
Dealing with odd values in data (sentinel values) 196
Converting to tidy data 199
7.2 Merging data 202
7.3 Writing data from R 208
7.4 Try it yourself 211
8 Doing things conditionally: Control structures 213
8.1 Looping 213
Vectorization 214
Tidy repetition: Looping with purrr 215
For loops 220
8.2 Wider and narrower loop scope 222
While loops 224
8.3 Conditional evaluation 225
If conditions 225
Ifelse conditions 229
8.4 Try it yourself 233
9 Visualizing data: Plotting 235
9.1 Data preparation 235
Tidy data, revisited 236
Importance of data types 236
9.2 ggplot2 237
General construction 237
Adding points 241
Style aesthetics 243
Adding lines 247
Adding bars 251
Other types of plots 258
Scales 260
Facetting 268
Additional options 273
9.3 Plots as objects 276
9.4 Saving plots 278
9.5 Try it yourself 279
10 Doing more with your data with extensions 281
10.1 Writing your own packages 282
Creating a minimal package 282
Documentation 283
10.2 Analyzing your package 287
Unit testing 288
Profiling 290
10.3 What to do next? 291
Regression 291
Clustering 294
Working with maps 297
Interacting with APIs 300
Sharing your package 302
10.4 More resources 303
Appendix A Installing R 305
Appendix B Installing RStudio 307
Appendix C Graphics in base R 309
Index 317