Data Visualization with Python and JavaScript: Scrape, Clean, Explore & Transform Your Data / Edition 1

by Kyran Dale
ISBN-10: 1491920513
ISBN-13: 9781491920510
Pub. Date: 07/28/2016
Publisher: O'Reilly Media, Incorporated

Paperback

Overview

Learn how to turn raw data into rich, interactive web visualizations with the powerful combination of Python and JavaScript. With this hands-on guide, author Kyran Dale teaches you how to build a basic dataviz toolchain with best-of-breed Python and JavaScript libraries (including Scrapy, Matplotlib, Pandas, Flask, and D3) for crafting engaging, browser-based visualizations.

As a working example throughout the book, Dale walks you through transforming Wikipedia’s table-based list of Nobel Prize winners into an interactive visualization. You’ll examine steps along the entire toolchain, from scraping, cleaning, exploring, and delivering data to building the visualization with JavaScript’s D3 library. If you’re ready to create your own web-based data visualizations, and you know either Python or JavaScript, this is the book for you.

  • Learn how to manipulate data with Python
  • Understand the commonalities between Python and JavaScript
  • Extract information from websites by using Python’s web-scraping tools, BeautifulSoup and Scrapy
  • Clean and explore data with Python’s Pandas, Matplotlib, and NumPy libraries
  • Serve data and create RESTful web APIs with Python’s Flask framework
  • Create engaging, interactive web visualizations with JavaScript’s D3 library

Product Details

ISBN-13: 9781491920510
Publisher: O'Reilly Media, Incorporated
Publication date: 07/28/2016
Pages: 592
Sales rank: 783,199
Product dimensions: 5.90(w) x 8.90(h) x 1.30(d)

About the Author

Kyran Dale is a jobbing programmer, ex-research scientist, recreational hacker, independent researcher, occasional entrepreneur, cross-country runner, and improving jazz pianist. During 15-odd years as a research scientist he hacked a lot of code, learned a lot of libraries, and settled on some favourite tools. These days he finds that Python, JavaScript, and a little C++ go a long way toward solving most problems out there. He specializes in fast prototyping and feasibility studies with an algorithmic bent, but is happy to just build cool things.

Table of Contents

Preface ix

Introduction xv

1 Development Setup 1

The Accompanying Code 1

Python 1

JavaScript 5

Databases 6

Integrated Development Environments 8

Summary 8

Part I Basic Toolkit

2 A Language-Learning Bridge Between Python and JavaScript 13

Similarities and Differences 14

Interacting with the Code 15

Basic Bridge Work 18

Differences in Practice 42

A Cheat Sheet 54

Summary 56

3 Reading and Writing Data with Python 59

Easy Does It 59

Passing Data Around 60

Working with System Files 61

CSV, TSV, and Row-Column Data Formats 62

JSON 65

SQL 69

MongoDB 79

Dealing with Dates, Times, and Complex Data 84

Summary 86

4 Webdev 101 87

The Big Picture 87

Single-Page Apps 88

Tooling Up 88

Building a Web Page 93

Chrome's Developer Tools 102

A Basic Page with Placeholders 105

Scalable Vector Graphics 109

Summary 125

Part II Getting Your Data

5 Getting Data off the Web with Python 129

Getting Web Data with the requests Library 129

Getting Data Files with requests 130

Using Python to Consume Data from a Web API 134

Using Libraries to Access Web APIs 140

Scraping Data 146

Getting the Soup 149

Selecting Tags 149

Summary 159

6 Heavyweight Scraping with Scrapy 161

Setting Up Scrapy 163

Establishing the Targets 164

Targeting HTML with Xpaths 165

A First Scrapy Spider 171

Scraping the Individual Biography Pages 177

Chaining Requests and Yielding Data 180

Scrapy Pipelines 185

Scraping Text and Images with a Pipeline 187

Summary 194

Part III Cleaning and Exploring Data with Pandas

7 Introduction to NumPy 197

The NumPy Array 198

Creating Array Functions 204

Summary 206

8 Introduction to Pandas 207

Why Pandas Is Tailor-Made for Dataviz 207

Why Pandas Was Developed 207

Heterogeneous Data and Categorizing Measurements 208

The DataFrame 210

Creating and Saving DataFrames 214

Series into DataFrames 223

Panels 225

Summary 226

9 Cleaning Data with Pandas 229

Coming Clean About Dirty Data 229

Inspecting the Data 231

Indices and Pandas Data Selection 235

Cleaning the Data 239

The Full clean_data Function 256

Saving the Cleaned Dataset 257

Summary 259

10 Visualizing Data with Matplotlib 261

Pyplot and Object-Oriented Matplotlib 261

Starting an Interactive Session 262

Interactive Plotting with Pyplot's Global State 264

Figures and Object-Oriented Matplotlib 269

Plot Types 274

Seaborn 282

Summary 291

11 Exploring Data with Pandas 293

Starting to Explore 294

Plotting with Pandas 296

Gender Disparities 297

National Trends 304

Age and Life Expectancy of Winners 316

The Nobel Diaspora 323

Summary 325

Part IV Delivering the Data

12 Delivering the Data 329

Serving the Data 330

Delivering Static Files 336

Dynamic Data with Flask 340

Using Static or Dynamic Delivery 344

Summary 344

13 RESTful Data with Flask 347

A RESTful, MongoDB API with Eve 348

Delivering Data to the Nobel Prize Visualization 356

RESTful SQL with Flask-Restless 361

Summary 365

Part V Visualizing Your Data with D3

14 Imagining a Nobel Visualization 369

Who Is It For? 369

Choosing Visual Elements 370

Menu Bar 371

Prizes by Year 372

A Map Showing Selected Nobel Countries 373

A Bar Chart Showing Number of Winners by Country 375

A List of the Selected Winners 375

The Complete Visualization 377

Summary 378

15 Building a Visualization 379

Preliminaries 380

The HTML Skeleton 382

CSS Styling 386

The JavaScript Engine 390

Running the Nobel Prize Visualization App 404

Summary 405

16 Introducing D3-The Story of a Bar Chart 407

Framing the Problem 408

Working with Selections 408

Adding DOM Elements 412

Leveraging D3 418

Measuring Up with D3's Scales 418

Unleashing the Power of D3 with Data Binding 423

The enter Method 425

Accessing the Bound Data 429

The Update Pattern 430

Axes and Labels 436

Transitions 442

Summary 447

17 Visualizing Individual Prizes 449

Building the Framework 449

Scales 450

Axes 451

Category Labels 452

Nesting the Data 454

Adding the Winners with a Nested Data-Join 456

A Little Transitional Sparkle 460

Summary 463

18 Mapping with D3 465

Available Maps 466

D3's Mapping Data Formats 467

D3 Geo, Projections, and Paths 471

Putting the Elements Together 477

Updating the Map 481

Adding Value Indicators 484

Our Completed Map 487

Building a Simple Tooltip 488

Summary 491

19 Visualizing Individual Winners 493

Building the List 494

Building the Bio-Box 497

Summary 500

20 The Menu Bar 503

Creating HTML Elements with D3 504

Building the Menu Bar 504

Summary 514

21 Conclusion 515

Recap 515

Future Progress 518

Final Thoughts 521

A Moving from Development to Production 523

Index 545
