Table of Contents
Foreword xv
Preface xvii
Prologue: Imagine Data Mesh xxv
Part I What Is Data Mesh?
1 Data Mesh in a Nutshell 3
The Outcomes 4
The Shifts 4
The Principles 6
Principle of Domain Ownership 6
Principle of Data as a Product 7
Principle of the Self-Serve Data Platform 8
Principle of Federated Computational Governance 8
Interplay of the Principles 9
Data Mesh Model at a Glance 10
The Data 11
Operational Data 11
Analytical Data 12
The Origin 13
2 Principle of Domain Ownership 15
A Brief Background on Domain-Driven Design 17
Applying DDD's Strategic Design to Data 18
Domain Data Archetypes 20
Source-Aligned Domain Data 21
Aggregate Domain Data 23
Consumer-Aligned Domain Data 24
Transition to Domain Ownership 24
Push Data Ownership Upstream 24
Define Multiple Connected Models 25
Embrace the Most Relevant Domain Data: Don't Expect a Single Source of Truth 26
Hide the Data Pipelines as Domains' Internal Implementation 26
Recap 27
3 Principle of Data as a Product 29
Applying Product Thinking to Data 31
Baseline Usability Attributes of a Data Product 33
Transition to Data as a Product 41
Include Data Product Ownership in Domains 42
Reframe the Nomenclature to Create Change 42
Think of Data as a Product, Not a Mere Asset 43
Establish a Trust-But-Verify Data Culture 43
Join Data and Compute as One Logical Unit 44
Recap 45
4 Principle of the Self-Serve Data Platform 47
Data Mesh Platform: Compare and Contrast 49
Serving Autonomous Domain-Oriented Teams 51
Managing Autonomous and Interoperable Data Products 51
A Continuous Platform of Operational and Analytical Capabilities 52
Designed for a Generalist Majority 52
Favoring Decentralized Technologies 53
Domain Agnostic 54
Data Mesh Platform Thinking 54
Enable Autonomous Teams to Get Value from Data 57
Exchange Value with Autonomous and Interoperable Data Products 58
Accelerate Exchange of Value by Lowering the Cognitive Load 59
Scale Out Data Sharing 60
Support a Culture of Embedded Innovation 62
Transition to a Self-Serve Data Mesh Platform 62
Design the APIs and Protocols First 62
Prepare for Generalist Adoption 63
Do an Inventory and Simplify 63
Create Higher-Level APIs to Manage Data Products 64
Build Experiences, Not Mechanisms 64
Begin with the Simplest Foundation, Then Harvest to Evolve 65
Recap 65
5 Principle of Federated Computational Governance 67
Apply Systems Thinking to Data Mesh Governance 69
Maintain Dynamic Equilibrium Between Domain Autonomy and Global Interoperability 71
Embrace Dynamic Topology as a Default State 74
Utilize Automation and the Distributed Architecture 75
Apply Federation to the Governance Model 75
Federated Team 77
Guiding Values 78
Policies 81
Incentives 82
Apply Computation to the Governance Model 83
Standards as Code 84
Policies as Code 85
Automated Tests 86
Automated Monitoring 86
Transition to Federated Computational Governance 86
Delegate Accountability to Domains 86
Embed Policy Execution in Each Data Product 87
Automate Enablement and Monitoring over Interventions 87
Model the Gaps 88
Measure the Network Effect 88
Embrace Change over Constancy 88
Recap 89
Part II Why Data Mesh?
6 The Inflection Point 95
Great Expectations of Data 96
The Great Divide of Data 98
Scale: Encounter of a New Kind 100
Beyond Order 101
Approaching the Plateau of Return 102
Recap 102
7 After the Inflection Point 105
Respond Gracefully to Change in a Complex Business 106
Align Business, Tech, and Now Analytical Data 107
Close the Gap Between Analytical and Operational Data 108
Localize Data Changes to Business Domains 110
Reduce Accidental Complexity of Pipelines and Copying Data 111
Sustain Agility in the Face of Growth 111
Remove Centralized and Monolithic Bottlenecks 112
Reduce Coordination of Data Pipelines 112
Reduce Coordination of Data Governance 113
Enable Autonomy 115
Increase the Ratio of Value from Data to Investment 115
Abstract Technical Complexity with a Data Platform 116
Embed Product Thinking Everywhere 116
Go Beyond the Boundaries 116
Recap 117
8 Before the Inflection Point 121
Evolution of Analytical Data Architectures 121
First Generation: Data Warehouse Architecture 122
Second Generation: Data Lake Architecture 123
Third Generation: Multimodal Cloud Architecture 126
Characteristics of Analytical Data Architecture 126
Monolithic 128
Centralized Data Ownership 132
Technology Oriented 133
Recap 137
Part III How to Design the Data Mesh Architecture
9 The Logical Architecture 143
Domain-Oriented Analytical Data Sharing Interfaces 147
Operational Interface Design 148
Analytical Data Interface Design 149
Interdomain Analytical Data Dependencies 149
Data Product as an Architecture Quantum 151
A Data Product's Structural Components 152
Data Product Data Sharing Interactions 158
Data Discovery and Observability APIs 159
The Multiplane Data Platform 160
A Platform Plane 161
Data Infrastructure (Utility) Plane 162
Data Product Experience Plane 162
Mesh Experience Plane 162
Example 163
Embedded Computational Policies 164
Data Product Sidecar 165
Data Product Computational Container 166
Control Port 167
Recap 168
10 The Multiplane Data Platform Architecture 171
Design a Platform Driven by User Journeys 174
Data Product Developer Journey 175
Incept, Explore, Bootstrap, and Source 177
Build, Test, Deploy, and Run 180
Maintain, Evolve, and Retire 183
Data Product Consumer Journey 185
Incept, Explore, Bootstrap, Source 188
Build, Test, Deploy, Run 188
Maintain, Evolve, and Retire 189
Recap 189
Part IV How to Design the Data Product Architecture
11 Design a Data Product by Affordances 193
Data Product Affordances 194
Data Product Architecture Characteristics 197
Design Influenced by the Simplicity of Complex Adaptive Systems 198
Emergent Behavior from Simple Local Rules 198
No Central Orchestrator 199
Recap 200
12 Design Consuming, Transforming, and Serving Data 201
Serve Data 201
The Needs of Data Users 201
Serve Data Design Properties 204
Serve Data Design 216
Consume Data 217
Archetypes of Data Sources 219
Locality of Data Consumption 223
Data Consumption Design 224
Transform Data 226
Programmatic Versus Nonprogrammatic Transformation 226
Dataflow-Based Transformation 228
ML as Transformation 229
Time-Variant Transformation 229
Transformation Design 230
Recap 231
13 Design Discovering, Understanding, and Composing Data 233
Discover, Understand, Trust, and Explore 233
Begin Discovery with Self-Registration 236
Discover the Global URI 236
Understand Semantic and Syntax Models 237
Establish Trust with Data Guarantees 238
Explore the Shape of Data 241
Learn with Documentation 242
Discover, Explore, and Understand Design 242
Compose Data 244
Consume Data Design Properties 245
Traditional Approaches to Data Composability 246
Compose Data Design 250
Recap 252
14 Design Managing, Governing, and Observing Data 255
Manage the Life Cycle 255
Manage Life-Cycle Design 256
Data Product Manifest Components 257
Govern Data 258
Govern Data Design 259
Standardize Policies 260
Data and Policy Integration 262
Linking Policies 262
Observe, Debug, and Audit 262
Observability Design 264
Recap 267
Part V How to Get Started
15 Strategy and Execution 271
Should You Adopt Data Mesh Today? 271
Data Mesh as an Element of Data Strategy 275
Data Mesh Execution Framework 279
Business-Driven Execution 280
End-to-End and Iterative Execution 285
Evolutionary Execution 286
Recap 302
16 Organization and Culture 303
Change 305
Culture 307
Values 308
Reward 310
Intrinsic Motivations 311
Extrinsic Motivations 311
Structure 312
Organization Structure Assumptions 313
Discover Data Product Boundaries 321
People 324
Roles 324
Skillset Development 327
Process 329
Key Process Changes 330
Recap 331
Index 333