Gift Guide

The Art of Multiprocessor Programming, Revised Reprint

Paperback (Print)
Rent from
(Save 76%)
Est. Return Date: 02/15/2015
Buy Used
Buy Used from
Used and New from Other Sellers
Used and New from Other Sellers
from $56.04
Usually ships in 1-2 business days
(Save 25%)
Other sellers (Paperback)
  • All (5) from $56.04   
  • New (2) from $56.04   
  • Used (3) from $60.54   


Revised and updated with improvements conceived in parallel programming courses, The Art of Multiprocessor Programming is an authoritative guide to multicore programming. It introduces a higher level set of software development skills than that needed for efficient single-core programming. This book provides comprehensive coverage of the new principles, algorithms, and tools necessary for effective multiprocessor programming. Students and professionals alike will benefit from thorough coverage of key multiprocessor programming issues.

  • This revised edition incorporates much-demanded updates throughout the book, based on feedback and corrections reported from classrooms since 2008
  • Learn the fundamentals of programming multiple threads accessing shared memory
  • Explore mainstream concurrent data structures and the key elements of their design, as well as synchronization techniques from simple locks to transactional memory systems
  • Visit the companion site and download source code, example Java programs, and materials to support and enhance the learning experience
Read More Show Less

Editorial Reviews

From the Publisher
"The book could be used for a short course for practitioners looking for solutions to particular problems, a medium course for non-computer science major who would use multiprocessor programming in their own field, or a semester-long course for computer science majors."—Reference and Research Book News, February 2013
Read More Show Less

Product Details

  • ISBN-13: 9780123973375
  • Publisher: Elsevier Science
  • Publication date: 6/5/2012
  • Edition description: Revised
  • Pages: 536
  • Sales rank: 674,917
  • Product dimensions: 7.50 (w) x 9.20 (h) x 1.40 (d)

Meet the Author

Maurice Herlihy received an A.B. in Mathematics from Harvard University, and a Ph.D. in Computer Science from M.I.T. He has served on the faculty of Carnegie Mellon University, on the staff of DEC Cambridge Research Lab, and is currently a Professor in the Computer Science Department at Brown University. Maurice Herlihy is an ACM Fellow, and is the recipient of the 2003 Dijkstra Prize in Distributed Computing. He shared the 2004 Gödel Prize with Nir Shavit, the highest award in theoretical computer science. In 2012 he shared the Edsger W. Dijkstra Prize In Distributed Computing with Nir Shavit.

Nir Shavit received a B.A. and M.Sc. from the Technion and a Ph.D. from the Hebrew University, all in Computer Science. From 1999 to 2011 he served as a member of technical staff at Sun Labs and Oracle Labs. He shared the 2004 Gödel Prize with Maurice Herlihy, the highest award in theoretical computer science. He is a Professor in the Electrical Engineering and Computer Science Department at M.I.T. and the Computer Science Department at Tel-Aviv University. In 2012 he shared the Edsger W. Dijkstra Prize In Distributed Computing with Maurice Herlihy.

Read More Show Less

Read an Excerpt

The Art of Multiprocessor Programming

By Maurice Herlihy Nir Shavit


Copyright © 2012 Elsevier, Inc.
All right reserved.

ISBN: 978-0-12-397795-3

Chapter One


The computer industry is undergoing, if not another revolution, certainly a vigorous shaking-up. The major chip manufacturers have, for the time being at least, given up trying to make processors run faster. Moore's Law has not been repealed: each year, more and more transistors fit into the same space, but their clock speed cannot be increased without overheating. Instead, manufacturers are turning to "multicore" architectures, in which multiple processors (cores) communicate directly through shared hardware caches. Multiprocessor chips make computing more effective by exploiting parallelism: harnessing multiple processors to work on a single task.

The spread of multiprocessor architectures will have a pervasive effect on how we develop software. Until recently, advances in technology meant advances in clock speed, so software would effectively "speed up" by itself over time. Now, however, this free ride is over. Advances in technology will mean increased parallelism and not increased clock speed, and exploiting such parallelism is one of the outstanding challenges of modern Computer Science.

This book focuses on how to program multiprocessors that communicate via a shared memory. Such systems are often called shared-memory multiprocessors or, more recently, multicores. Programming challenges arise at all scales of multiprocessor systems—at a very small scale, processors within a single chip need to coordinate access to a shared memory location, and on a large scale, processors in a supercomputer need to coordinate the routing of data. Multiprocessor programming is challenging because modern computer systems are inherently asynchronous: activities can be halted or delayed without warning by interrupts, preemption, cache misses, failures, and other events. These delays are inherently unpredictable, and can vary enormously in scale: a cache miss might delay a processor for fewer than ten instructions, a page fault for a few million instructions, and operating system preemption for hundreds of millions of instructions.

We approach multiprocessor programming from two complementary directions: principles and practice. In the principles part of this book, we focus on computability: figuring out what can be computed in an asynchronous concurrent environment. We use an idealized model of computation in which multiple concurrent threads manipulate a set of shared objects. The sequence of the thread operations on the objects is called the concurrent program or concurrent algorithm. This model is essentially the model presented by the Java™, C#, or C++ thread packages.

Surprisingly, there are easy-to-specify shared objects that cannot be implemented by any concurrent algorithm. It is therefore important to understand what not to try, before proceeding to write multiprocessor programs. Many of the issues that will land multiprocessor programmers in trouble are consequences of fundamental limitations of the computational model, so we view the acquisition of a basic understanding of concurrent shared-memory computability as a necessary step. The chapters dealing with principles take the reader through a quick tour of asynchronous computability, attempting to expose various computability issues, and how they are addressed through the use of hardware and software mechanisms.

An important step in the understanding of computability is the specification and verification of what a given program actually does. This is perhaps best described as program correctness. The correctness of multiprocessor programs, by their very nature, is more complex than that of their sequential counterparts, and requires a different set of tools, even for the purpose of "informal reasoning" (which, of course, is what most programmers actually do). Sequential correctness is mostly concerned with safety properties. A safety property states that some "bad thing" never happens. For example, a traffic light never displays green in all directions, even if the power fails. Naturally, concurrent correctness is also concerned with safety, but the problem is much, much harder, because safety must be ensured despite the vast number of ways that the steps of concurrent threads can be interleaved. Equally important, concurrent correctness encompasses a variety of liveness properties that have no counterparts in the sequential world. A liveness property states that a particular good thing will happen. For example, a red traffic light will eventually turn green. A final goal of the part of the book dealing with principles is to introduce a variety of metrologies and approaches for reasoning about concurrent programs, which will later serve us when discussing the correctness of real-world objects and programs.

The second part of the book deals with the practice of multiprocessor programming, and focuses on performance. Analyzing the performance of multiprocessor algorithms is also different in flavor from analyzing the performance of sequential programs. Sequential programming is based on a collection of well-established and well-understood abstractions. When we write a sequential program, we usually do not need to be aware that underneath it all, pages are being swapped from disk to memory, and smaller units of memory are being moved in and out of a hierarchy of processor caches. This complex memory hierarchy is essentially invisible, hiding behind a simple programming abstraction.

In the multiprocessor context, this abstraction breaks down, at least from a performance perspective. To achieve adequate performance, the programmer must sometimes "outwit" the underlying memory system, writing programs that would seem bizarre to someone unfamiliar with multiprocessor architectures. Someday perhaps, concurrent architectures will provide the same degree of efficient abstraction now provided by sequential architectures, but in the meantime, programmers should beware.

The principles part of the book presents a progressive collection of shared objects and programming tools. Every object and tool is interesting in its own right, and we use each one to expose the reader to higher-level issues: spin-locks illustrate contention, linked lists illustrate the role of locking in data structure design, and so on. Each of these issues has important consequences for program performance. The hope is that the reader will understand the issue in a way that will later allow him or her to apply the lessons learned to specific multiprocessor systems. We culminate with a discussion of state-of-the-art technologies such as transactional memory.

We would like to include a few words about style. The book uses the Java programming language. There are, of course, other suitable languages which readers would have found equally appealing. We have a long list of reasons for our specific choice, but perhaps it is more suitable to discuss them over a cup of coffee! In the appendix we explain how the concepts expressed in Java are expressed in other popular languages or libraries. We also provide a primer on multiprocessor hardware. Throughout the book, we avoid presenting specific performance numbers for programs and algorithms, and stick to general trends. There is a good reason for this: multiprocessors vary greatly, and unfortunate though it may be, at this point in time, what works well on one machine may be significantly less impressive on another. Sticking to general trends is our way of guaranteeing that the validity of our assertions will be sustained over time.

We provide references at the end of each chapter. The reader will find a bibliographical survey of the material covered, with suggestions for further reading. Each chapter also includes a collection of exercises which readers can use to gauge their comprehension or entertain themselves on Sunday mornings.

1.1 Shared Objects and Synchronization

On the first day of your new job, your boss asks you to find all primes between 1 and 1010 (never mind why), using a parallel machine that supports ten concurrent threads. This machine is rented by the minute, so the longer your program takes, the more it costs. You want to make a good impression. What do you do?

As a first attempt, you might consider giving each thread an equal share of the input domain. Each thread might check 109 numbers, as shown in Fig. 1.1. This approach fails, for an elementary, but important reason. Equal ranges of inputs do not necessarily produce equal amounts of work. Primes do not occur uniformly: there are more primes between 1 and 109 than between 9 · 109 and 1010. To make matters worse, the computation time per prime is not the same in all ranges: it usually takes longer to test whether a large number is prime than a small number. In short, there is no reason to believe that the work will be divided equally among the threads, and it is not clear even which threads will have the most work.

A more promising way to split the work among the threads is to assign each thread one integer at a time (Fig. 1.2). When a thread is finished with testing an integer, it asks for another. To this end, we introduce a shared counter, an object that encapsulates an integer value, and that provides a getAndIncrement() method that increments its value, and returns the counter's prior value to the caller.

Fig. 1.3 shows a naïve implementation of Counter in Java. This counter implementation works well when used by a single thread, but it fails when shared by multiple threads. The problem is that the expression

return value++;

is actually an abbreviation of the following, more complex code:

long temp = value; value = temp + 1; return temp;

In this code fragment, value is a field of the Counter object, and is shared among all the threads. Each thread, however, has its own local copy of temp, which is a local variable to each thread.

Now imagine that threads A and B both call the counter's getAndIncrement() method at about the same time. They might simultaneously read 1 from value, set their local temp variables to 1, value to 2, and both return 1. This behavior is not what we intended: concurrent calls to the counter's getAndIncrement() return the same value, but we expect them to return distinct values. In fact, it could get even worse. One thread might read 1 from value, but before it sets value to 2, another thread would go through the increment loop several times, reading 1 and setting to 2, reading 2 and setting to 3. When the first thread finally completes its operation and sets value to 2, it will actually be setting the counter back from 3 to 2.

The heart of the problem is that incrementing the counter's value requires two distinct operations on the shared variable: reading the value field into a temporary variable and writing it back to the Counter object.

Something similar happens when you try to pass someone approaching you head-on in a corridor. You may find yourself veering right, then left several times to avoid the other person doing exactly the same thing. Sometimes you manage to avoid bumping into them and sometimes you do not, and in fact, as we see in the later chapters, such collisions are provably unavoidable. On an intuitive level, what is going on is that each of you is performing two distinct steps: looking at ("reading") the other's current position, and moving ("writing") to one side or the other. The problem is, when you read the other's position, you have no way of knowing whether they have decided to stay or move. In the same way that you and the annoying stranger must decide who passes on the left and who on the right, threads accessing a shared Counter must decide who goes first and who goes second.

As we will see in Chapter 5, modern multiprocessor hardware provides special read-modify-write instructions that allow threads to read, modify, and write a value to memory in one atomic (i.e., indivisible) hardware step. For the Counter object, we can use such hardware to increment the counter atomically.


Excerpted from The Art of Multiprocessor Programming by Maurice Herlihy Nir Shavit Copyright © 2012 by Elsevier, Inc.. Excerpted by permission of MORGAN KAUFMANN. All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Read More Show Less

Table of Contents

1. Introduction

2. Mutual Exclusion

3. Concurrent Objects and Linearization

4. Foundations of Shared Memory

5. The Relative Power of Synchronization Methods

6. The Universality of Consensus

7. Spin Locks and Contention

8. Monitors and Blocking Synchronization

9. Linked Lists: the Role of Locking

10. Concurrent Queues and the ABA Problem

11. Concurrent Stacks and Elimination

12. Counting, Sorting and Distributed Coordination

13. Concurrent Hashing and Natural Parallelism

14. Skiplists and Balanced Search

15. Priority Queues

16. Futures, Scheduling and Work Distribution

17. Barriers

18. Transactional Memory


Read More Show Less

Customer Reviews

Average Rating 5
( 1 )
Rating Distribution

5 Star


4 Star


3 Star


2 Star


1 Star


Your Rating:

Your Name: Create a Pen Name or

Barnes & Review Rules

Our reader reviews allow you to share your comments on titles you liked, or didn't, with others. By submitting an online review, you are representing to Barnes & that all information contained in your review is original and accurate in all respects, and that the submission of such content by you and the posting of such content by Barnes & does not and will not violate the rights of any third party. Please follow the rules below to help ensure that your review can be posted.

Reviews by Our Customers Under the Age of 13

We highly value and respect everyone's opinion concerning the titles we offer. However, we cannot allow persons under the age of 13 to have accounts at or to post customer reviews. Please see our Terms of Use for more details.

What to exclude from your review:

Please do not write about reviews, commentary, or information posted on the product page. If you see any errors in the information on the product page, please send us an email.

Reviews should not contain any of the following:

  • - HTML tags, profanity, obscenities, vulgarities, or comments that defame anyone
  • - Time-sensitive information such as tour dates, signings, lectures, etc.
  • - Single-word reviews. Other people will read your review to discover why you liked or didn't like the title. Be descriptive.
  • - Comments focusing on the author or that may ruin the ending for others
  • - Phone numbers, addresses, URLs
  • - Pricing and availability information or alternative ordering information
  • - Advertisements or commercial solicitation


  • - By submitting a review, you grant to Barnes & and its sublicensees the royalty-free, perpetual, irrevocable right and license to use the review in accordance with the Barnes & Terms of Use.
  • - Barnes & reserves the right not to post any review -- particularly those that do not follow the terms and conditions of these Rules. Barnes & also reserves the right to remove any review at any time without notice.
  • - See Terms of Use for other conditions and disclaimers.
Search for Products You'd Like to Recommend

Recommend other products that relate to your review. Just search for them below and share!

Create a Pen Name

Your Pen Name is your unique identity on It will appear on the reviews you write and other website activities. Your Pen Name cannot be edited, changed or deleted once submitted.

Your Pen Name can be any combination of alphanumeric characters (plus - and _), and must be at least two characters long.

Continue Anonymously
Sort by: Showing 1 Customer Reviews
  • Posted October 21, 2012


    Are you a senior-level undergraduate and/or practitioner who programs multiprocessors? If you are, then this book is for you! Authors Maurice Herlihy and Nir Shavit, have done an outstanding job of writing a book that focuses on how to program multiprocessors that communicate via a shared memory. Herlihy and Shavit, begin by covering classical mutual exclusion algorithms that work by reading and writing shared memory. In addition, the authors examine various ways of specifying correctness and progress. They then describe the foundations of concurrent shared memory computing. Next, the authors show you a simple technique for proving statements of the form, where there is no wait-free implementation of X by Y. The authors continue by describing how to use consensus objects to build a universal construction that implements any concurrent object. In addition, they then try to make you understand how architecture affects performance, and how to exploit this knowledge to write efficient concurrent programs. The authors then show you how monitors are a structured way of combining synchronization and data. Next, the authors introduce several useful techniques that go beyond course-grained locking to allow multiple threads to access a single object at the same time. They continue by considering a kind of pool that provides first-in-first-out fairness. In addition, the authors show you how to implement concurrent stacks. They then show you how some important problems that seem inherently sequential can be made highly parallel by spreading out coordination tasks among multiple parties. Next, the authors look at concurrent hashing – a problem that seems to be naturally parallelizable or, using a more technical term: disjoint-access-parallel; meaning that concurrent method calls are likely to access disjoint locations, implying that there is little need for synchronization. They continue by looking at concurrent search structures with logarithmic depth. In addition, the authors describe how a priority queue typically provides an add() method to add an item to the set, and a removeMin() method to remove and return the item of minimal score. The authors then show you how to decompose certain kinds of problems into components that can be executed in parallel. Next, they discuss how a barrier is a way of forcing asynchronous threads to act almost as if they were synchronous. Finally, the authors review and analyze the strengths and weaknesses of the standard synchronization primitives, and describe some emerging alternatives that are likely to extend, and perhaps even to displace many of today’s standard primitives. This most excellent book focuses on computability: figuring out what can be computed in an asynchronous concurrent environment. Perhaps more importantly, this book deals with the practice of multiprocessor programming, and focuses on performance.

    Was this review helpful? Yes  No   Report this review
Sort by: Showing 1 Customer Reviews

If you find inappropriate content, please report it to Barnes & Noble
Why is this product inappropriate?
Comments (optional)