Think Stats: Exploratory Data Analysis

If you know how to program, you have the skills to turn data into knowledge. This thoroughly revised edition presents statistical concepts computationally, rather than mathematically, using programs written in Python. Through practical examples and exercises based on real-world datasets, you'll learn the entire process of exploratory data analysis—from wrangling data and generating statistics to identifying patterns and testing hypotheses.

Whether you're a data scientist, software engineer, or data enthusiast, you'll get up to speed on commonly used tools including NumPy, SciPy, and Pandas. You'll explore distributions, relationships between variables, visualization, and many other concepts. And all chapters are available as Jupyter notebooks, so you can read the text, run the code, and work on exercises all in one place.

  • Analyze data distributions and visualize patterns using Python libraries
  • Improve predictions and insights with regression models
  • Dive into specialized topics like time series analysis and survival analysis
  • Integrate statistical techniques and tools for validation, inference, and more
  • Communicate findings with effective data visualization
  • Troubleshoot common data analysis challenges
  • Boost reproducibility and collaboration in data analysis projects with interactive notebooks

by Allen B. Downey

eBook

$67.99 

Available on compatible NOOK devices, the free NOOK App, and in My Digital Library.

Product Details

ISBN-13: 9781098190224
Publisher: O'Reilly Media, Incorporated
Publication date: 04/04/2025
Sold by: Barnes & Noble
Format: eBook
Pages: 324
File size: 9 MB

About the Author

Allen B. Downey is a Professor of Computer Science at Olin College of Engineering. He has taught computer science at Wellesley College, Colby College and U.C. Berkeley. He has a Ph.D. in Computer Science from U.C. Berkeley and Master’s and Bachelor’s degrees from MIT. He is the author of Think Python, Think Bayes, Think DSP, and a blog, Probably Overthinking It.

Table of Contents

Preface;
Why I Wrote This Book;
How I Wrote This Book;
Contributor List;
Conventions Used in This Book;
Using Code Examples;
Safari® Books Online;
How to Contact Us;
Chapter 1: Statistical Thinking for Programmers;
1.1 Do First Babies Arrive Late?;
1.2 A Statistical Approach;
1.3 The National Survey of Family Growth;
1.4 Tables and Records;
1.5 Significance;
1.6 Glossary;
Chapter 2: Descriptive Statistics;
2.1 Means and Averages;
2.2 Variance;
2.3 Distributions;
2.4 Representing Histograms;
2.5 Plotting Histograms;
2.6 Representing PMFs;
2.7 Plotting PMFs;
2.8 Outliers;
2.9 Other Visualizations;
2.10 Relative Risk;
2.11 Conditional Probability;
2.12 Reporting Results;
2.13 Glossary;
Chapter 3: Cumulative Distribution Functions;
3.1 The Class Size Paradox;
3.2 The Limits of PMFs;
3.3 Percentiles;
3.4 Cumulative Distribution Functions;
3.5 Representing CDFs;
3.6 Back to the Survey Data;
3.7 Conditional Distributions;
3.8 Random Numbers;
3.9 Summary Statistics Revisited;
3.10 Glossary;
Chapter 4: Continuous Distributions;
4.1 The Exponential Distribution;
4.2 The Pareto Distribution;
4.3 The Normal Distribution;
4.4 Normal Probability Plot;
4.5 The Lognormal Distribution;
4.6 Why Model?;
4.7 Generating Random Numbers;
4.8 Glossary;
Chapter 5: Probability;
5.1 Rules of Probability;
5.2 Monty Hall;
5.3 Poincaré;
5.4 Another Rule of Probability;
5.5 Binomial Distribution;
5.6 Streaks and Hot Spots;
5.7 Bayes’s Theorem;
5.8 Glossary;
Chapter 6: Operations on Distributions;
6.1 Skewness;
6.2 Random Variables;
6.3 PDFs;
6.4 Convolution;
6.5 Why Normal?;
6.6 Central Limit Theorem;
6.7 The Distribution Framework;
6.8 Glossary;
Chapter 7: Hypothesis Testing;
7.1 Testing a Difference in Means;
7.2 Choosing a Threshold;
7.3 Defining the Effect;
7.4 Interpreting the Result;
7.5 Cross-Validation;
7.6 Reporting Bayesian Probabilities;
7.7 Chi-Square Test;
7.8 Efficient Resampling;
7.9 Power;
7.10 Glossary;
Chapter 8: Estimation;
8.1 The Estimation Game;
8.2 Guess the Variance;
8.3 Understanding Errors;
8.4 Exponential Distributions;
8.5 Confidence Intervals;
8.6 Bayesian Estimation;
8.7 Implementing Bayesian Estimation;
8.8 Censored Data;
8.9 The Locomotive Problem;
8.10 Glossary;
Chapter 9: Correlation;
9.1 Standard Scores;
9.2 Covariance;
9.3 Correlation;
9.4 Making Scatterplots in Pyplot;
9.5 Spearman’s Rank Correlation;
9.6 Least Squares Fit;
9.7 Goodness of Fit;
9.8 Correlation and Causation;
9.9 Glossary;
Colophon;

