Spidering Hacks: 100 Industrial-Strength Tips & Tools

The Internet, with its profusion of information, has made us hungry for ever more, ever better data. Out of necessity, many of us have become pretty adept with search engine queries, but there are times when even the most powerful search engines aren't enough. If you've ever wanted your data in a different form than it's presented, or wanted to collect data from several sites and see it side by side without the constraints of a browser, then Spidering Hacks is for you.

Spidering Hacks takes you to the next level in Internet data retrieval, beyond search engines, by showing you how to create spiders and bots to retrieve information from your favorite sites and data sources. You'll no longer feel constrained by the way host sites think you want to see their data presented: you'll learn how to scrape and repurpose raw data so you can view it in a way that's meaningful to you.

Written for developers, researchers, technical assistants, librarians, and power users, Spidering Hacks provides expert tips on spidering and scraping methodologies. You'll begin with a crash course in spidering concepts, tools (Perl, LWP, out-of-the-box utilities), and ethics (how to know when you've gone too far: what's acceptable and unacceptable). Next, you'll collect media files and data from databases. Then you'll learn how to interpret and understand the data, repurpose it for use in other applications, and even build authorized interfaces to integrate the data into your own content. By the time you finish Spidering Hacks, you'll be able to:

Aggregate and associate data from disparate locations, then store and manipulate the data as you like
Gain a competitive edge in business by knowing when competitors' products are on sale, and comparing sales ranks and product placement on e-commerce sites
Integrate third-party data into your own applications or web sites
Make your own site easier to scrape and more usable to others
Keep up to date with your favorite comic strips, news stories, stock tips, and more without visiting the site every day

Like the other books in O'Reilly's popular Hacks series, Spidering Hacks brings you 100 industrial-strength tips and tools from the experts to help you master this technology. If you're interested in data retrieval of any type, this book provides a wealth of data for finding a wealth of data.

Paperback

$29.99

Product Details

ISBN-13:	9780596005771
Publisher:	O'Reilly Media, Incorporated
Publication date:	10/28/2003
Pages:	424
Product dimensions:	6.00(w) x 9.00(h) x 0.97(d)

About the Author

Kevin Hemenway, coauthor of Mac OS X Hacks, is better known as Morbus Iff, the creator of disobey.com, which bills itself as "content for the discontented." Publisher and developer of more home cooking than you could ever imagine, he'd love to give you a Fry Pan of Intellect upside the head. Politely, of course. And with love.

Tara Calishain is the creator of the site ResearchBuzz. She is an expert on Internet search engines and how they can be used effectively in business situations.

Table of Contents

Credits
Preface
Chapter 1: Walking Softly
Chapter 2: Assembling a Toolbox
Chapter 3: Collecting Media Files
Chapter 4: Gleaning Data from Databases
Chapter 5: Maintaining Your Collections
Chapter 6: Giving Back to the World
Colophon
