Foundations for Analytics with Python: From Non-Programmer to Hacker
If you’re like many of Excel’s 750 million users, you want to do more with your data—like repeating similar analyses over hundreds of files, or combining data in many files for analysis at one time. This practical guide shows ambitious non-programmers how to automate and scale the processing and analysis of data in different formats—by using Python.

After author Clinton Brownley takes you through Python basics, you’ll be able to write simple scripts for processing data in spreadsheets as well as databases. You’ll also learn how to use several Python modules for parsing files, grouping data, and producing statistics. No programming experience is necessary.

  • Create and run your own Python scripts by learning basic syntax
  • Use Python’s csv module to read and parse CSV files
  • Read multiple Excel worksheets and workbooks with the xlrd module
  • Perform database operations in SQLite with the sqlite3 module or in MySQL with the mysqlclient module
  • Create Python applications to find specific records, group data, and parse text files
  • Build statistical graphs and plots with matplotlib, pandas, ggplot, and seaborn
  • Produce summary statistics, and estimate regression and classification models
  • Schedule your scripts to run automatically in both Windows and Mac environments

by Clinton Brownley

Paperback

$44.99 

Product Details

ISBN-13: 9781491922538
Publisher: O'Reilly Media, Incorporated
Publication date: 09/04/2016
Pages: 349
Product dimensions: 7.00(w) x 9.30(h) x 0.80(d)

About the Author

Clinton Brownley, Ph.D., is a data scientist at Facebook, where he is responsible for a wide variety of data pipelining, statistical modeling, and data visualization projects that inform data-driven decisions about large-scale infrastructure. Clinton is a Past-President of the San Francisco Bay Area Chapter of the American Statistical Association and a Council member for the Section on Practice of the Institute for Operations Research and the Management Sciences. Clinton received degrees from Carnegie Mellon University and American University.

Table of Contents

Preface

1 Python Basics
  How to Create a Python Script
  How to Run a Python Script
  Useful Tips for Interacting with the Command Line
  Python's Basic Building Blocks
  Numbers
  Strings
  Regular Expressions and Pattern Matching
  Dates
  Lists
  Tuples
  Dictionaries
  Control Flow
  Reading a Text File
  Create a Text File
  Script and Input File in Same Location
  Modern File-Reading Syntax
  Reading Multiple Text Files with glob
  Create Another Text File
  Writing to a Text File
  Add Code to first_script.py
  Writing to a Comma-Separated Values (CSV) File
  Print Statements
  Chapter Exercises

2 Comma-Separated Values (CSV) Files
  Base Python Versus pandas
  Read and Write a CSV File (Part 1)
  How Basic String Parsing Can Fail
  Read and Write a CSV File (Part 2)
  Filter for Specific Rows
  Value in Row Meets a Condition
  Value in Row Is in a Set of Interest
  Value in Row Matches a Pattern/Regular Expression
  Select Specific Columns
  Column Index Values
  Column Headings
  Select Contiguous Rows
  Add a Header Row
  Reading Multiple CSV Files
  Count Number of Files and Number of Rows and Columns in Each File
  Concatenate Data from Multiple Files
  Sum and Average a Set of Values per File
  Chapter Exercises

3 Excel Files
  Introspecting an Excel Workbook
  Processing a Single Worksheet
  Read and Write an Excel File
  Filter for Specific Rows
  Select Specific Columns
  Reading All Worksheets in a Workbook
  Filter for Specific Rows Across All Worksheets
  Select Specific Columns Across All Worksheets
  Reading a Set of Worksheets in an Excel Workbook
  Filter for Specific Rows Across a Set of Worksheets
  Processing Multiple Workbooks
  Count Number of Workbooks and Rows and Columns in Each Workbook
  Concatenate Data from Multiple Workbooks
  Sum and Average Values per Workbook and Worksheet
  Chapter Exercises

4 Databases
  Python's Built-in sqlite3 Module
  Insert New Records into a Table
  Update Records in a Table
  MySQL Database
  Insert New Records into a Table
  Query a Table and Write Output to a CSV File
  Update Records in a Table
  Chapter Exercises

5 Applications
  Find a Set of Items in a Large Collection of Files
  Calculate a Statistic for Any Number of Categories from Data in a CSV File
  Calculate Statistics for Any Number of Categories from Data in a Text File
  Chapter Exercises

6 Figures and Plots
  Matplotlib
  Bar Plot
  Histogram
  Line Plot
  Scatter Plot
  Box Plot
  Pandas
  Ggplot
  Seaborn

7 Descriptive Statistics and Modeling
  Datasets
  Wine Quality
  Customer Churn
  Wine Quality
  Descriptive Statistics
  Grouping, Histograms, and t-tests
  Pairwise Relationships and Correlation
  Linear Regression with Least-Squares Estimation
  Interpreting Coefficients
  Standardizing Independent Variables
  Making Predictions
  Customer Churn
  Logistic Regression
  Interpreting Coefficients
  Making Predictions

8 Scheduling Scripts to Run Automatically
  Task Scheduler (Windows)
  The cron Utility (macOS and Unix)
  Crontab File: One-Time Set-up
  Adding Cron Jobs to the Crontab File

9 Where to Go from Here
  Additional Standard Library Modules and Built-in Functions
  Python Standard Library (PSL): A Few More Standard Modules
  Built-in Functions
  Python Package Index (PyPI): Additional Add-in Modules
  NumPy
  SciPy
  Scikit-Learn
  A Few Additional Add-in Packages
  Additional Data Structures
  Stacks
  Queues
  Graphs
  Trees
  Where to Go from Here

A Download Instructions

B Answers to Exercises

Bibliography

Index
