Authors Shiyan Xu, Prashant Wason, Sudha Saktheeswaran, and Rebecca Bilbro provide practical examples and insights to help you unlock the full potential of data lakehouses for different levels of analytics, from batch to interactive to streaming. You'll also learn how to evaluate storage choices and leverage built-in automated table optimizations to build, maintain, and operate production data applications.
This book helps you:
- Understand the need for transactional data lakehouses and the challenges associated with building them
- Get up to speed with Apache Hudi and learn how it makes building data lakehouses easy
- Explore data ecosystem support provided by Apache Hudi for popular data sources and query engines
- Perform different write and read operations on Apache Hudi tables and effectively use them for various use cases, including batch and stream applications
- Implement data engineering techniques to operate and manage Apache Hudi tables
- Apply different storage techniques and considerations, such as indexing and clustering, to maximize your lakehouse performance
- Build end-to-end incremental data pipelines using Apache Hudi for faster ingestion and fresher analytics

Apache Hudi: The Definitive Guide: Building Robust, Open, and High-Performing Data Lakehouses
Paperback
Product Details

| Detail | Value |
|---|---|
| ISBN-13 | 9781098173838 |
| Publisher | O'Reilly Media, Incorporated |
| Publication date | 12/30/2025 |
| Pages | 350 |
| Product dimensions | 7.00(w) x 9.19(h) x 0.00(d) |