Web Caching and Replication / Edition 1

by Michael Rabinovich, Oliver Spatscheck

ISBN-10: 0201615703

ISBN-13: 9780201615708

Pub. Date: 07/28/2002

Publisher: Pearson Education

"Rabinovich and Spatscheck report a wealth of detailed information about how to implement Web caching and replication mechanisms, but more importantly, they teach me how to think about the general problem of content distribution. I'm pleased that there is finally a comprehensive book on this important subject."
—Larry Peterson, Professor of Computer Science, Princeton University

"This book is a remarkable piece of work, well-organized and clearly articulated. The authors have masterfully presented advanced topics in Internet Web infrastructure and content delivery networks in a way that is suitable for both novices and experts."
—Steve McCanne, Chief Technology Officer, Inktomi

As the Internet grows, evolving from a research tool into a staple of daily life, it is essential that the Web's scalability and performance keep up with increased demand and expectations. Every day, more and more users turn to the Internet to use resource-hungry applications like video and audio on-demand and distributed games. At the same time, more and more computer applications are built to rely on the Web, but with much higher sensitivity to delays of even a few milliseconds. The key to satisfying these growing demands and expectations lies in the practices of caching and replication and in the increased scalability solutions they represent.

Web Caching and Replication provides essential material based on the extensive real-world experience of two experts from AT&T Labs. This comprehensive examination of caching, replication, and load-balancing practices for the Web brings together information from and for the commercial world, including real-life products; technical standards communities, such as IETF and W3C; and academic research.

By focusing on the underlying, fundamental ideas that are behind the varied technologies currently used in caching and replication, this book will remain a relevant, much-needed resource as the multi-billion dollar industries that rely on the Web continue to grow and evolve.

The book approaches its two central topics in two distinct parts. The part on caching includes coverage of:

  • Proxy caching, including latency reduction and TCP connection caching
  • Transparent and nontransparent proxy deployment
  • Cooperative caching
  • Cache consistency
  • Replacement policies
  • Prefetching
  • "Caching the uncacheable"

The part on replication includes coverage of:

  • Basic mechanisms for request distribution, including content-blind and content-aware request distribution
  • CDNs, including DNS request distribution, streaming content delivery, and secure content access
  • Server selection

Examples and illustrations are included throughout the book. Extensive cross-referencing also enables readers to identify the corresponding parts of each section. Web Caching and Replication concludes with a thorough look into the future. It not only considers how new services can be implemented on caching and replication platforms, but also outlines emerging technologies that will allow for cooperation between different caching and replication enterprises in order to improve the overall performance of the Web.

Product Details

Publisher: Pearson Education
Publication date: 07/28/2002
Product dimensions: 7.43(w) x 9.23(h) x 0.99(d)

Table of Contents

Intended Audience
Organization of the Book
I.1 The Basics of Web Caching
I.2 The Basics of Web Replication
I.3 Beyond Performance
Part I Background
1 Network Layers and Protocols
1.1 The ISO/OSI Reference Model
1.2 Network Components at Different Layers
1.3 Overview of Internet Protocols
2 The Internet Protocol and Routing
2.2 IP Datagram Header
2.3.1 Routing within ASs
2.3.2 Routing between ASs
3 Transmission Control Protocol
3.1 Segment Header
3.2 Opening a Connection
3.3 Closing a Connection
3.4 Flow Control
3.5 Congestion Control
4 Application Protocols for the Web
4.1 Uniform Resource Locators
4.2 The Domain Name System
4.2.1 Name Hierarchy
4.2.2 The DNS Protocol
4.3 The HyperText Transfer Protocol
4.3.1 The HTTP Request
4.3.2 The HTTP Response
4.4 The HTTP Message Exchange
4.5 Hyperlinks and Embedded Objects
5 HTTP Support for Caching and Replication
5.1 Conditional Requests
5.1.1 Conditional Headers Used for Caching
5.1.2 Conditional Headers Used for Replication
5.2 Age and Expiration of Cached Objects
5.3 Request Redirection
5.4 Range Requests
5.5 The cache-control Header
5.5.1 cache-control Header Directives in Requests
5.5.2 cache-control Header Directives in Responses
5.5.3 Example of the cache-control Header
5.6 Storing State for a Stateless Server: Cookies
5.7 Support for Server Sharing
5.8 Expanded Object Identifiers
5.9 Learning the Proxy Chain
5.10 Cacheability of Web Content
6 Web Behavior Rules of Thumb
6.1 Evaluation Methods
6.1.1 Live Measurements
6.1.2 Trace-Based Methods
6.2 Object Size
6.3 Object Types and Cacheability
6.4 Object Popularity
6.5 Locality of Reference
6.5.1 Temporal Locality
6.5.2 Spatial Locality
6.6 Rate of Object Modifications
6.7 Other Observations
6.8 Summary: Rules of Thumb for the Web
Part II Web Caching
7 Proxy Caching: Realistic Expectations
7.1 Do Proxy Caches Deserve a Hearing?
7.2 Latency Reduction
7.2.1 An Optimistic Bound on Latency Reduction
7.2.2 A Pessimistic View of Latency Reduction
7.2.3 TCP Connection Caching
7.2.4 Connection Caching versus Data Caching
7.2.5 TCP Connection Splitting
7.2.6 Environment-Specific TCP Optimizations
7.3 Bandwidth Savings
7.4 Proxies and Streaming Media
8 Proxy Deployment
8.1 Overview of Internet Connectivity Architectures
8.2 Nontransparent Proxy Deployment
8.2.1 Explicit Client Configuration
8.2.2 Browser Autoconfiguration
8.2.3 Proxy Auto-Discovery
8.3 Transparent Proxy Deployment
8.3.1 Multipath Problem
8.3.2 Interception Mechanisms
8.3.3 Layer 4 Switch as an Intercepter
8.3.4 Router as an Intercepter
8.3.5 Layer 7 Switch as an Intercepter
8.3.6 Intercepting Link
8.3.7 Performance Pitfalls
8.4 Security and Access Control Issues
8.4.1 Proxies and Web Server Access Control
8.4.2 Proxies and Security
9 Cooperative Proxy Caching
9.1 Shared Cache: How Big Is Big Enough?
9.2 Issues in Cooperative Proxy Caching
9.3 Location Management
9.3.1 Broadcast Queries
9.3.2 Hierarchical Caching
9.3.3 URL Hashing
9.3.4 Directory-Based Cooperation
9.3.5 Directory Structures
9.4 Caching on a Global Scale: Proxy Pruning
9.4.1 System Model
9.4.2 Cache Routing
9.4.3 Vicinity Caching
9.5 An Overview of Existing Platforms
9.5.1 Cache Hierarchies
9.5.2 Caching as a Service of a Network Access Point
9.5.3 Satellite Broadcast Cache Service
10 Cache Consistency
10.1 Cache Validation
10.1.1 The Basic Validation Scenario
10.1.2 Implicit Time to Live
10.1.3 Fine-Tuning Validation
10.1.4 Asynchronous and Piggyback Cache Validation
10.2 Cache Invalidation
10.2.3 Delayed versus Immediate Updates
10.2.5 Volume Lease Protocols
10.2.6 Piggyback and Delayed Invalidation
10.2.7 Invalidation in Cache Routing
10.3 Issues in Cooperative Cache Consistency
10.3.1 Validation with Cooperative Proxies
10.3.2 Non-Monotonic Delivery Problem
11 Replacement Policy
11.1 Replacement Policy Metrics
11.2 Replacement Policy Algorithms
11.3 The Value of Replacement Policy
12.1 Performance Metrics
12.2 Performance Bounds of Prefetching
12.4 Nondata Prefetching
12.5 Nontransparent Prefetching
12.5.1 User Nontransparency
12.5.2 Server Nontransparency
12.6 Server Push versus Client Pull
12.7 Information Used in Prefetching Algorithms
12.7.1 User-Specific Information
12.7.2 Group Information
12.7.3 Multiuser Information
12.8 Prediction Algorithms
12.8.1 Popularity-Based Predictions
12.8.2 Markov Modeling
12.8.3 Examples of Algorithms Using First-Order Markov Modeling
12.8.4 Exploiting Longer Request Sequences
12.8.5 Structure Algorithms
13 Caching the Uncacheable
13.1 A Note on Implementation
13.2 Modified Content and Stale Delivery Avoidance
13.2.1 Cache-Friendly Approaches to Stale Delivery Avoidance
13.2.2 Utilizing Cached Stale Content
13.3 Cookied Content
13.3.1 Cache-Friendly Usage of Cookies
13.3.2 Caching Cookied Content
13.3.3 The Semantic Transparency Issue
13.4 Expressly Uncacheable Content and Hit Metering
13.4.1 Cache-Friendly Approaches to Hit Metering
13.4.2 Caching Expressly Uncacheable Content
13.5 Dynamic Content
13.5.1 Cache-Friendly Design of Dynamic Content
13.5.2 Base-Instance Caching
13.5.3 Template Caching
13.5.4 Base-Instance Caching versus Template Caching
13.6 Active Proxies
Part III Web Replication
14 Basic Mechanisms for Request Distribution
14.1 Content-Blind Request Distribution with Full Replication
14.1.1 Client Redirection
14.1.2 Redirection by a Balancing Switch
14.1.3 Redirection by a Web Site's DNS
14.2 Content-Blind Request Distribution with Partial Replication
14.2.1 Using Surrogates as Server Replicas
14.2.2 Back-End Distributed File Systems
14.3 Content-Aware Request Distribution
14.3.1 Client Redirection by a Java Applet
14.3.2 HTTP Redirection
14.3.3 Redirection by an L7 Switch
14.3.4 Fine-Granularity Domain Names
15 Content Delivery Networks
15.1 Types of CDNs
15.2 Delivering Requests to a CDN
15.3 Finding Origin Servers
15.4 Request Distribution in CDNs
15.4.1 DNS/Balancing Switch Redirection
15.4.2 Two-Level DNS Redirection
15.4.3 Anycast/DNS Redirection
15.5 Pitfalls of DNS-Based Request Distribution
15.6 Fine-Tuning DNS Request Distribution
15.6.1 Post-DNS Request Distribution by Triangular Communication
15.6.2 Post-DNS Request Distribution with HTTP Redirection and URL Rewriting
15.7 Data Consistency in CDNs
15.8 Streaming Content Delivery
15.8.1 Using Multicast for Streaming Content Delivery
15.8.2 Using Application-Level Multicast for Streaming Content Delivery
15.8.3 Constructing a Distribution Tree
15.9 Supporting Secure Content Access
15.9.1 SSL Overview
15.9.2 Performance Impact of Supporting SSL in a CDN
15.9.3 Key Management
15.9.4 Content Retrieval from the Origin Server
16 Server Selection
16.1.1 Proximity Metrics
16.1.2 Server Load Metrics
16.1.3 Aggregate Metrics
16.1.4 Internet Mapping Services
16.1.5 Aging of Metrics
16.2.1 Obtaining Passive Measurements
16.2.2 Avoiding Oscillations
16.2.3 Supporting Client Stickiness
16.2.4 Respecting the Affinity of Server Caches
16.3 Server Selection with Multiple Metrics
16.4 DNS-Based Server Selection
16.4.1 A Typical DNS Server-Selection Scheme
16.4.2 Estimating Hidden Load Factors
16.5 Why Choose a Server When You Can Have Them All?
Part IV Further Directions
17 Adding Value at the Edge
17.1 Content Filtering
17.2 Content Transcoding
17.4 Custom Usage Reporting
17.5 Implementing New Services with an Edge Server API
17.6 The ICAP Protocol
17.7 Distributing Web Applications
17.7.1 How to Replicate Applications
17.7.2 Where to Replicate Applications
18 Content Distribution Internetworking
18.1 Pros and Cons of CDI
18.2 Request Distribution
18.3 Content Distribution
