Distributed Data Systems with Azure Databricks: Create, deploy, and manage enterprise data pipelines

Microsoft Azure Databricks helps you harness the power of distributed computing to create robust data pipelines and to train and deploy machine learning and deep learning models. Databricks' advanced features enable developers to process, transform, and explore data. Distributed Data Systems with Azure Databricks will help you put your knowledge of Databricks to work to create big data pipelines.
The book provides a hands-on approach to implementing Azure Databricks and its associated methodologies that will make you productive in no time. Complete with detailed explanations of essential concepts, practical examples, and self-assessment questions, it begins with a quick introduction to Databricks' core functionalities before moving on to distributed model training and inference using TensorFlow and Spark MLlib. As you advance, you’ll explore MLflow Model Serving on Azure Databricks and implement distributed training pipelines using HorovodRunner in Databricks.
Finally, you’ll discover how to transform, use, and obtain insights from massive amounts of data to train predictive models and create complete, fully working data pipelines. By the end of this Microsoft Azure book, you’ll have gained a solid understanding of how to work with Databricks to create and manage an entire big data pipeline.

by Alan Bernardo Palacio
eBook

$35.99 


Product Details

ISBN-13: 9781838642693
Publisher: Packt Publishing
Publication date: 05/25/2021
Sold by: Barnes & Noble
Format: eBook
Pages: 414
File size: 17 MB

About the Author

Alan Bernardo Palacio is a Data Scientist and Engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a Data Engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as a founder in startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

Table of Contents

  1. Introduction to Azure Databricks core concepts
  2. Creating an Azure Databricks workspace
  3. Creating an ETL with Databricks
  4. Delta Lake with Databricks
  5. Introducing Delta Engine
  6. Structured Streaming
  7. Azure Databricks integration with popular Python libraries
  8. Databricks Runtime for Machine Learning
  9. Databricks Runtime for Deep Learning
  10. Model tuning, deployment and control using Databricks AutoML
  11. MLflow on Azure Databricks
  12. Distributed Deep Learning with Horovod