Murach's Python for Data Analysis

There is a newer edition of this book titled “Murach’s Python for Data Science (2nd Edition)”.

Data analysts are in demand everywhere today! And now, Murach’s Python for Data Analysis shows you how to do data analysis the way the pros do. You’ll master descriptive analysis, using Pandas to analyze the data and Seaborn to create the visualizations that let you present your findings effectively. You’ll get started with predictive analysis, using Scikit-learn with linear regression models. And you’ll be guided right from the start by 4 real-world case studies in political, environmental, social, and sports analytics…essential for learning and great perspective for applying your new skills in your own field. See for yourself how quickly and easily this book can turn you into the data analyst that employers are looking for.

by Scott McCoy

Paperback

$59.50 

Product Details

ISBN-13: 9781943872763
Publisher: Mike Murach and Associates, Inc.
Publication date: 08/30/2021
Pages: 600
Product dimensions: 8.00(w) x 9.90(h) x 1.40(d)

About the Author

Scott McCoy is a professional programmer who has worked in a variety of domains, including bioinformatics, IT automation, security, and data analytics. He holds a B.S. in computer science and a Microsoft certification in database technologies. In his free time, Scott enjoys reading, hiking, and spending time with his family.

Table of Contents

Section 1 Get off to a fast start

Chapter 1 Introduction to Python for data analysis

Introduction to data analysis 4

What data analysis is 4

The five phases of data analysis and visualization 6

The IDEs for Python data analysis 8

The Python skills that you need for data analysis 10

How to install and import the Python modules for data analysis 10

How to call and chain methods 12

The coding basics for Python data analysis 14

How to use JupyterLab as your IDE 16

How to start JupyterLab and work with a Notebook 16

How to edit and run the cells in a Notebook 18

How to use the Tab completion and tooltip features 20

How syntax and runtime errors work 22

How to use Markdown language 24

How to get reference information 26

Two more skills for working with JupyterLab 28

How to split the screen between two Notebooks 28

How to use Magic Commands 30

Introduction to the case studies 32

The Polling case study 32

The Forest Fires case study 34

The Social Survey case study 36

The Sports Analytics case study 38

Chapter 2 The Pandas essentials for data analysis

Introduction to the Pandas DataFrame 46

The DataFrame structure 46

Two ways to get data into a DataFrame 48

How to save and restore a DataFrame 50

How to examine the data 52

How to display the data in a DataFrame 52

How to use the attributes of a DataFrame 54

How to use the info(), nunique(), and describe() methods 56

How to access the columns and rows 58

How to access columns 58

How to access rows 60

How to access a subset of rows and columns 62

Another way to access a subset of rows and columns 64

How to work with the data 66

How to sort the data 66

How to use the statistical methods 68

How to use Python for column arithmetic 70

How to modify the string data in columns 72

How to shape the data 74

How to use indexes 74

How to pivot the data 76

How to melt the data 78

How to analyze the data 80

How to group the data 80

How to aggregate the data 82

How to plot the data 84

Chapter 3 The Pandas essentials for data visualization

Introduction to data visualization 92

The Python libraries for data visualization 92

Long vs. wide data for data visualization 94

How the Pandas plot() method works by default 96

The three basic parameters for the Pandas plot() method 98

How to create 8 types of plots 100

How to create a line plot or an area plot 100

How to create a scatter plot 102

How to create a bar plot 104

How to create a histogram or a density plot 106

How to create a box plot or a pie plot 108

How to enhance a plot 110

How to improve the appearance of a plot 110

How to work with subplots 112

How to use chaining to get the plots you want 114

Chapter 4 The Seaborn essentials for data visualization

Introduction to Seaborn 120

The Seaborn methods for plotting 120

The general methods vs. the specific methods 122

How to use the basic Seaborn parameters 124

How to use the Seaborn parameters for working with subplots 126

How to enhance and save plots 128

How to set the title, x label, and y label 128

How to set the ticks, x limits, and y limits 130

How to set the background style 132

How to work with subplots 134

How to save a plot 136

How to create relational plots 138

How to create a line plot 138

How to create a scatter plot 140

How to create categorical plots 142

How to create a bar plot 142

How to create a box plot 144

How to create distribution plots 146

How to create a histogram 146

How to create a KDE or ECDF plot 148

How to enhance a distribution plot 150

Other techniques for enhancing a plot 152

How to use other Axes methods to enhance a plot 152

How to annotate a plot 154

How to set the color palette 156

How to enhance a plot that has subplots 158

How to customize the titles for subplots 160

How to set the size of a specific plot 162

Section 2 The critical skills for success on the job

Chapter 5 How to get the data

How to find the data that you want to analyze 170

Common data sources 170

How to find and select the data that you want 170

How to import data into a DataFrame 172

How to import data directly into a DataFrame 172

How to download a file to disk before importing it 174

How to work with a zip file on disk 176

How to get database data into a DataFrame 178

How to run queries against a database 178

How to use a SQL query to import data into a DataFrame 180

How to work with a Stata file 182

How to get and explore the metadata of a Stata file 182

How to build DataFrames for the metadata and the data 184

How to work with a JSON file 186

How to download a JSON file to disk 186

How to open a JSON file in JupyterLab 186

How to drill down into the data 188

How to build a DataFrame for the data 190

Chapter 6 How to clean the data

Introduction to data cleaning 198

A general plan for cleaning the data 198

What the info() method can tell you 200

What the unique values can tell you 202

What the value counts can tell you 204

How to simplify the data 206

How to drop rows based on conditions 206

How to drop duplicate rows 206

How to drop columns 208

How to rename columns 210

How to find and fix missing values 212

How to find missing values 212

How to drop rows with missing values 214

How to fill missing values 216

How to fix data type problems 218

How to find dates and numbers that are imported as objects 218

How to convert date and time strings to the datetime data type 220

How to convert object columns to numeric data types 222

How to work with the category data type 224

How to replace invalid values and convert a column's data type 226

How to fix data problems when you import the data 228

How to find and fix outliers 230

How to find outliers 230

How to fix outliers 232

Chapter 7 How to prepare the data

How to add and modify columns 240

How to work with datetime columns 240

How to work with string columns 242

How to work with numeric columns 242

How to add a summary column to a DataFrame 244

How to apply functions and lambda expressions 246

How to apply functions to rows or columns 246

How to apply user-defined functions 248

How lambda expressions work with DataFrames 250

How to apply lambda expressions 252

How to work with indexes 254

How to set and remove an index 254

How to unstack indexed data 256

How to combine DataFrames 258

How to join DataFrames with an inner join 258

How to join DataFrames with a left or outer join 260

How to merge DataFrames 262

How to concatenate DataFrames 264

How to handle the SettingWithCopyWarning 266

What the warning is telling you 266

What to do when the warning is displayed 268

What to watch for when the warning isn't displayed 268

Chapter 8 How to analyze the data

How to create and plot long data 274

How to melt columns to create long data 274

How to plot melted columns 276

How to group and aggregate the data 278

How to group and apply a single aggregate method 278

How to work with a DataFrameGroupBy object 280

How to apply multiple aggregate methods 282

How to create and use pivot tables 284

How to use the pivot() method 284

How to use the pivot_table() method 286

How to work with bins 288

How to create bins of equal size 288

How to create bins with equal numbers of values 290

How to plot binned data 292

More skills for data analysis 294

How to select the rows with the largest values 294

How to calculate the percent change 296

How to rank rows 298

How to find other methods for analysis 300

Chapter 9 How to analyze time-series data

How to reindex time-series data 306

How to generate time periods 306

How to reindex with datetime indexes 308

How to reindex with a semi-month index 310

How a user-defined function can improve a datetime index 312

How reindexing with an improved index can improve plots 314

How to resample time-series data 316

How to use the resample() method 316

How to use the label and closed parameters when you downsample 318

How downsampling can improve plots 320

How to work with rolling windows 322

The concept of rolling windows 322

How to create rolling windows 324

How to plot rolling window data 326

How to work with running totals 328

How to create running totals 328

How to plot running totals 330

Section 3 An introduction to predictive analysis

Chapter 10 How to make predictions with a linear regression model

Introduction to predictive analysis 338

Types of predictive models 338

Introduction to regression analysis 338

How to find correlations between variables 340

The Housing dataset 340

How to identify correlations with a scatter plot 342

How to identify correlations with a grid of scatter plots 344

How to identify correlations with r-values 346

How to identify correlations with a heatmap 348

How to use Scikit-learn to work with a linear regression 350

A procedure for creating and using a regression model 350

The function and methods for linear regression models 352

How to create, validate, and use a linear regression model 354

How to plot the predicted data 356

How to plot the residuals 358

How to plot regression models with Seaborn 360

The lmplot() method and some of its parameters 360

How to plot a simple linear regression 362

How to plot a logistic regression 362

How to plot a polynomial regression 364

How to plot a lowess regression 364

How to use the residplot() method to plot the residuals 366

Chapter 11 How to make predictions with a multiple regression model

A simple regression model for a Cars dataset 372

The Cars dataset 372

How to create a simple regression model 374

How to plot the residuals of a simple regression 376

How to work with a multiple regression model 378

How to create a multiple regression model 378

How to plot the residuals of a multiple regression 380

How to work with categorical variables 382

How to identify categorical variables 382

How to review categorical variables 384

How to create dummy variables 386

How to restate the data and check the correlations 388

How to create a multiple regression that includes dummy variables 390

How to improve a multiple regression model 392

How to select the independent variables 392

How to test different combinations of variables 394

How to use Scikit-learn to select the variables 396

How to select the right number of variables 398

Section 4 The case studies

Chapter 12 The Polling case study

Get and display the data 406

Import the modules that you will need 406

Get the data 406

Display the data 406

Clean the data 408

Examine the data 408

Drop columns and rows 412

Rename columns 414

Fix object types 414

Fix data 414

Take an early plot with Pandas 414

Save the DataFrame 414

Prepare the data 416

Add columns for grouping and filtering 416

Create a new DataFrame in long form 418

Take an early plot of the long data with Seaborn 418

Add monthly bins to the DataFrame 420

Add an average percent column for each month 420

Save the wide and long DataFrames 420

Analyze the data 422

Plot the national and swing state polls 422

Plot the voter types 424

Plot the last two months of polling 426

Plot the gap changes in selected states 428

More preparation and analysis 430

Prepare the gap data for the last week of polling 430

Plot the gap data for the last week of polling 432

Prepare the weekly gap data for the swing states 434

Plot the weekly gap data for the swing states 436

Chapter 13 The Forest Fires case study

Get the data 442

Download and unzip the SQLite database 442

Connect and query the database 442

Import the data into a DataFrame 442

Clean the data 444

Examine the data 444

Improve the readability of the data 444

Drop unnecessary rows 446

Drop duplicate rows 446

Convert dates to datetime objects 446

Check for missing contain dates 448

Prepare the data 450

Add fire_month and days_burning columns 450

Examine the contain_date and days_burning columns 450

Analyze the data 452

Analyze the data for California 452

Two more plots for California fires 454

Rank the states by total acres burned 456

Prepare a DataFrame for total acres burned by year within state 458

Prepare a DataFrame for the top 4 states 458

Plot the acres burned total by year for the top 4 states 460

Review the 20 largest fires in California 462

Use GeoPandas to plot the fires on a map 464

Use GeoPandas to plot the California map 464

Use GeoPandas or Seaborn to plot the California fires on a map 466

Plot the fires in the continental United States 468

Chapter 14 The Social Survey case study

Introduction to the Social Survey 474

Download and unzip the zip file for the data 474

Build a DataFrame for the metadata 474

The employment data 476

Use the codebook and read the data that you want 476

Prepare the data 478

Plot the data and reduce the number of categories 480

Plot the total counts of the responses 482

Convert the counts to percents and plot them 484

The work-life balance data 486

Search the codebook for small question sets 486

Read and review the work-life data 488

Plot the responses for the first question 490

Plot the responses for the second and third questions 492

How to expand the scope of the analysis 494

Use the codebook to find related columns 494

Use the codebook to find follow-up questions 496

Select the columns for an expanded DataFrame 498

Bin the data for a column 500

How to use a hypothesis to guide your analysis 502

Develop and test a first hypothesis 502

Develop and test a second hypothesis 504

Develop and test a third hypothesis 506

Chapter 15 The Sports Analytics case study

Get the data and build the DataFrame 512

Get the data 512

Build the DataFrame 512

Clean the data 514

Locate and drop unneeded rows 514

Locate and drop unneeded columns 514

Convert the game_date column to datetime data 514

Prepare the data 516

Add a column for the season 516

Add a column for the shot result 516

Add a column for points made for each shot 518

Add three summary columns 518

Plot the summary data 520

Plot the points per game by season 520

Plot the averages of shots, shots made, and points per game by season 520

Plot the shot locations 522

Plot the shot locations for two games 522

Plot the shot locations for two seasons 524

Plot the shot density for one season 526

Plot the shot density for two seasons 528

Appendix A How to set up Windows for this book

How to install and use Anaconda 532

How to install Anaconda 532

How to use the Anaconda Prompt 534

How to use the Anaconda Navigator 534

How to install and use the files for this book 536

How to install the files for this book 536

How to make sure Anaconda is installed correctly 538

How to download the large data files for this book 538

Appendix B How to set up macOS for this book

How to install and use Anaconda 542

How to install Anaconda 542

How to run conda commands 544

How to use the Anaconda Navigator 544

How to install and use the files for this book 546

How to install the files for this book 546

How to make sure Anaconda is installed correctly 548

How to download the large data files for this book 548
