Architecting an Apache Iceberg Lakehouse: A scalable, open-source data platform
Free PDF and epub formats plus online reader with AI assistant.
The “lakehouse” data architecture is a powerful way to combine the flexibility of data lakes with the management features of data warehouses. The open source Apache Iceberg framework delivers the scalability, reliability, and performance you want from a lakehouse without the expense and vendor lock-in of platforms like Snowflake, BigQuery, and Redshift.
Apache Iceberg is an open source table format perfect for massive analytic datasets. Iceberg enables ACID transactions, schema evolution, and high-performance queries on data lakes using multiple compute engines like Spark, Trino, Flink, Presto, and Hive. An Iceberg data lakehouse enables fast, reliable analytics at scale while retaining the observability you need for compliance audits, governance, and provable data security.
In this book, data guru Alex Merced shows you:
• How to create a modular, scalable Iceberg lakehouse architecture
• Where Spark, Flink, Dremio, and Polaris fit into your design
• Reliable batch and streaming ingestion pipelines
• Strategies for governance, security, and performance at scale
About the book
Architecting an Apache Iceberg Lakehouse teaches you to design a complete data platform with Iceberg. The book guides you through the architecture of your platform, from storage to governance. Each layer is fully illustrated and includes hands-on examples that connect theory with practical implementation. You’ll ingest sales and marketing data from PostgreSQL into Iceberg tables using Apache Spark, build interactive dashboards in Apache Superset, design and compare ingestion pipelines, and much more. Author Alex Merced’s experienced guidance helps you understand the important tradeoff decisions you’ll need to make in real-world implementations. You’ll soon have a scalable and maintainable data platform that can handle petabytes of data!
About the reader
For data architects who are familiar with the basics of data lakehouses and are preparing for a migration.
About the author
Alex Merced is Head of Developer Relations at Dremio, where he helps developers navigate modern data architectures. He shares his expertise through videos, podcasts, and articles, and leads the DataLakehouseHub.com community. He is the co-author of Apache Iceberg: The Definitive Guide.
Paperback
$59.99
Pre Order
Product Details
| ISBN-13: | 9781633435100 |
|---|---|
| Publisher: | Manning |
| Publication date: | 04/28/2026 |
| Pages: | 250 |
| Product dimensions: | 7.38(w) x 9.25(h) x (d) |