Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More


Paperback (Second Edition)
$29.93 (list price $44.99; save 33%)



NOOK Book (eBook)
$19.99 (list price $35.99; save 44%)


How can you tap into the wealth of social web data to discover who’s making connections with whom, what they’re talking about, and where they’re located? With this expanded and thoroughly revised edition, you’ll learn how to acquire, analyze, and summarize data from all corners of the social web, including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

  • Employ the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social web sites
  • Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
  • Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
  • Build interactive visualizations with D3.js, an extraordinarily flexible HTML5 and JavaScript toolkit
  • Take advantage of more than two dozen Twitter recipes, presented in O’Reilly’s popular "problem/solution/discussion" cookbook format

The example code for this unique data science book is maintained in a public GitHub repository. It’s designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.
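To give a sense of the TF-IDF technique named in the list above, here is a minimal sketch in plain Python. It is not taken from the book or its repository; the function name `tf_idf`, the whitespace tokenizer, the unsmoothed natural-log weighting, and the three-line sample corpus are all assumptions chosen for illustration. Libraries such as the Natural Language Toolkit offer more complete implementations.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Score every term in every document of `corpus` (a list of strings).

    Hypothetical helper for illustration: returns one {term: score} dict
    per document, using tf = count / document length and
    idf = ln(number of documents / documents containing the term).
    """
    # Naive tokenization (an assumption): lowercase and split on whitespace.
    docs = [doc.lower().split() for doc in corpus]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for doc in docs for term in set(doc))

    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Made-up sample corpus, not data from the book.
corpus = [
    "mining the social web",
    "mining twitter data",
    "the social graph",
]
scores = tf_idf(corpus)

# The highest-scoring term in a document is the one most distinctive to it.
top_term = max(scores[0], key=scores[0].get)
print(top_term)
```

In this sample, "web" scores highest in the first document because it appears in no other document, while a term that appears in every document would score zero.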


Editorial Reviews

From the Publisher
Mining the social web, again

When we first published Mining the Social Web, I thought it was one of the most important books I worked on that year. Now that we’re publishing a second edition (which I didn’t work on), I find that I agree with myself. With this new edition, Mining the Social Web is more important than ever.
While we’re seeing more and more cynicism about the value of data, and particularly “big data,” that cynicism isn’t shared by most people who actually work with data. Data has undoubtedly been overhyped and oversold, but the best way to arm yourself against the hype machine is to start working with data yourself, to find out what you can and can’t learn. And there’s no shortage of data around. Everything we do leaves a cloud of data behind it: Twitter, Facebook, Google+ — to say nothing of the thousands of other social sites out there, such as Pinterest, Yelp, Foursquare, you name it. Google is doing a great job of mining your data for value. Why shouldn’t you?

There are few better ways to learn about mining social data than by starting with Twitter; Twitter is really a ready-made laboratory for the new data scientist. And this book is without a doubt the best and most thorough approach to mining Twitter data out there. But that’s only a starting point. We hear a lot in the press about sentiment analysis and mining unstructured text data; this book shows you how to do it. If you need to mine the data in web pages or email archives, this book shows you how. And if you want to understand how people collaborate on projects, Mining the Social Web is the only place I’ve seen that analyzes GitHub data.

All of the examples in the book are available on GitHub. In addition to the example code, which is bundled into IPython notebooks, Matthew has provided a VirtualBox VM that installs Python, all the libraries you need to run the examples, the examples themselves, and an IPython server. Checking out the examples is as simple as installing VirtualBox, installing Vagrant, cloning the 2nd edition’s GitHub archive, and typing “vagrant up.” You can execute the examples for yourself in the virtual machine; modify them; and use the virtual machine for your own projects, since it’s a fully functional Linux system with Python, Java, MongoDB, and other necessities pre-installed. You can view this as a book with accompanying examples in a particularly nice package, or you can view the book as “premium support” for an open source project that consists of the examples and the VM.
If you want to engage with the data that’s surrounding you, Mining the Social Web is the best place to start. Use it to learn, to experiment, and to build your own data projects.
— Mike Loukides
Vice President of Content Strategy for O'Reilly Media, Inc.


Product Details

  • ISBN-13: 9781449367619
  • Publisher: O'Reilly Media, Incorporated
  • Publication date: 10/17/2013
  • Edition description: Second Edition
  • Edition number: 2
  • Pages: 448
  • Sales rank: 195,347
  • Product dimensions: 7.00 (w) x 9.20 (h) x 1.20 (d)

Meet the Author

Matthew Russell is Chief Technology Officer at Digital Reasoning, Principal at Zaffra, and the author of several books on technology, including Mining the Social Web (O'Reilly, 2013), now in its second edition. He is passionate about open source software development, data mining, and creating technology to amplify human intelligence. Matthew studied computer science and jumped out of airplanes at the United States Air Force Academy. When not solving hard problems, he enjoys practicing Bikram Hot Yoga, CrossFitting, and participating in triathlons.


Table of Contents

Managing Your Expectations;
Python-Centric Technology;
Improvements Specific to the Second Edition;
Conventions Used in This Book;
Using Code Examples;
Safari® Books Online;
How to Contact Us;
Acknowledgments for the Second Edition;
Acknowledgments from the First Edition;
A Guided Tour of the Social Web;
Chapter 1: Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More;
1.1 Overview;
1.2 Why Is Twitter All the Rage?;
1.3 Exploring Twitter's API;
1.4 Analyzing the 140 Characters;
1.5 Closing Remarks;
1.6 Recommended Exercises;
1.7 Online Resources;
Chapter 2: Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More;
2.1 Overview;
2.2 Exploring Facebook's Social Graph API;
2.3 Analyzing Social Graph Connections;
2.4 Closing Remarks;
2.5 Recommended Exercises;
2.6 Online Resources;
Chapter 3: Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More;
3.1 Overview;
3.2 Exploring the LinkedIn API;
3.3 Crash Course on Clustering Data;
3.4 Closing Remarks;
3.5 Recommended Exercises;
3.6 Online Resources;
Chapter 4: Mining Google+: Computing Document Similarity, Extracting Collocations, and More;
4.1 Overview;
4.2 Exploring the Google+ API;
4.3 A Whiz-Bang Introduction to TF-IDF;
4.4 Querying Human Language Data with TF-IDF;
4.5 Closing Remarks;
4.6 Recommended Exercises;
4.7 Online Resources;
Chapter 5: Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts, and More;
5.1 Overview;
5.2 Scraping, Parsing, and Crawling the Web;
5.3 Discovering Semantics by Decoding Syntax;
5.4 Entity-Centric Analysis: A Paradigm Shift;
5.5 Quality of Analytics for Processing Human Language Data;
5.6 Closing Remarks;
5.7 Recommended Exercises;
5.8 Online Resources;
Chapter 6: Mining Mailboxes: Analyzing Who's Talking to Whom About What, How Often, and More;
6.1 Overview;
6.2 Obtaining and Processing a Mail Corpus;
6.3 Analyzing the Enron Corpus;
6.4 Discovering and Visualizing Time-Series Trends;
6.5 Analyzing Your Own Mail Data;
6.6 Closing Remarks;
6.7 Recommended Exercises;
6.8 Online Resources;
Chapter 7: Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More;
7.1 Overview;
7.2 Exploring GitHub's API;
7.3 Modeling Data with Property Graphs;
7.4 Analyzing GitHub Interest Graphs;
7.5 Closing Remarks;
7.6 Recommended Exercises;
7.7 Online Resources;
Chapter 8: Mining the Semantically Marked-Up Web: Extracting Microformats, Inferencing over RDF, and More;
8.1 Overview;
8.2 Microformats: Easy-to-Implement Metadata;
8.3 From Semantic Markup to Semantic Web: A Brief Interlude;
8.4 The Semantic Web: An Evolutionary Revolution;
8.5 Closing Remarks;
8.6 Recommended Exercises;
8.7 Online Resources;
Twitter Cookbook;
Chapter 9: Twitter Cookbook;
9.1 Accessing Twitter's API for Development Purposes;
9.2 Doing the OAuth Dance to Access Twitter’s API for Production Purposes;
9.3 Discovering the Trending Topics;
9.4 Searching for Tweets;
9.5 Constructing Convenient Function Calls;
9.6 Saving and Restoring JSON Data with Text Files;
9.7 Saving and Accessing JSON Data with MongoDB;
9.8 Sampling the Twitter Firehose with the Streaming API;
9.9 Collecting Time-Series Data;
9.10 Extracting Tweet Entities;
9.11 Finding the Most Popular Tweets in a Collection of Tweets;
9.12 Finding the Most Popular Tweet Entities in a Collection of Tweets;
9.13 Tabulating Frequency Analysis;
9.14 Finding Users Who Have Retweeted a Status;
9.15 Extracting a Retweet’s Attribution;
9.16 Making Robust Twitter Requests;
9.17 Resolving User Profile Information;
9.18 Extracting Tweet Entities from Arbitrary Text;
9.19 Getting All Friends or Followers for a User;
9.20 Analyzing a User’s Friends and Followers;
9.21 Harvesting a User’s Tweets;
9.22 Crawling a Friendship Graph;
9.23 Analyzing Tweet Content;
9.24 Summarizing Link Targets;
9.25 Analyzing a User’s Favorite Tweets;
9.26 Closing Remarks;
9.27 Recommended Exercises;
9.28 Online Resources;
Information About This Book's Virtual Machine Experience;
OAuth Primer;
Python and IPython Notebook Tips & Tricks;
