Dataproc Cookbook: Running Spark and Hadoop Workloads in Google Cloud

Want to build big data solutions in Google Cloud? Dataproc Cookbook is your hands-on guide to mastering Dataproc and the essential GCP fundamentals—like networking, security, monitoring, and cost optimization—that apply across Google Cloud services. Learn practical skills that not only fast-track your Dataproc expertise, but also help you succeed with a wide range of GCP technologies.

Written by data experts Narasimha Sadineni and Anu Venkataraman, this cookbook tackles real-world use cases like serverless Spark jobs, Kubernetes-native deployments, and cost-optimized data lake workflows. You'll learn how to create ephemeral and persistent Dataproc clusters, run secure data science workloads, implement monitoring solutions, and plan effective migration and optimization strategies.

Create Dataproc clusters on Compute Engine and Kubernetes Engine
Run data science workloads on Dataproc
Execute Spark jobs on Dataproc Serverless
Optimize Dataproc clusters to be cost effective and performant
Monitor Spark jobs in various ways
Orchestrate various workloads and activities
Use different methods for migrating data and workloads from existing Hadoop clusters to Dataproc

Paperback

$79.99

Product Details

ISBN-13:	9781098157708
Publisher:	O'Reilly Media, Incorporated
Publication date:	07/29/2025
Pages:	300
Product dimensions:	7.00(w) x 9.19(h) x 0.00(d)

About the Author

Narasimha Sadineni is a data engineer at Google with 12 years of experience in data and analytics. As a professional services team member at Google and Cloudera, he has helped more than 50 organizations solve big data problems using Hadoop and Google Cloud technologies. He also has several years of experience teaching Hadoop.

Anu Venkataraman is a Senior Program Manager. She previously served as a Data Lake Engineer at Google, where she gained extensive experience in data technologies. Anu helps customers migrate large-scale distributed systems to the cloud. She enjoys speaking at universities and contributing technical blogs and videos to the data community, with the aim of speeding up customers' journeys to the cloud. She was one of the leads for the Professional Services Tech Talk playlist on the Google Cloud Tech YouTube channel. She holds a Master's degree in Electrical and Computer Engineering from Ryerson University, specializing in Medical Image Processing and Machine Learning.
