The graph abstraction is a powerful problem-solving tool used to describe relationships between discrete objects. Many practical problems can be modeled in their essential form by graphs. Such problems appear in many domains: Internet packet routing, telephone network design, software build systems, Web search engines, molecular biology, automated road-trip planning, scientific computing, and so on. The power of the graph abstraction arises from the fact that the solution to a graph-theoretic problem can be used to solve problems in a wide variety of domains. For example, the problem of solving a maze and the problem of finding groups of Web pages that are mutually reachable can both be solved using depth-first search, an important concept from graph theory. By concentrating on the essence of these problems—the graph model describing discrete objects and the relationships between them—graph theoreticians have created solutions to not just a handful of particular problems, but to entire families of problems.
Now a question arises. If graph theory is generally and broadly applicable to arbitrary problem domains, should not the software that implements graph algorithms be just as broadly applicable? Graph theory would seem to be an ideal area for software reuse. However, up until now the potential for reuse has been far from realized. Graph problems do not typically occur in a pure graph-theoretic form, but rather are embedded in larger domain-specific problems. As a result, the data to be modeled as a graph are often not explicitly represented as a graph but are instead encoded in some application-specific data structure. Even in the case where the application data are explicitly represented as a graph, the particular graph representation chosen by the programmer might not match the representation expected by a library that the programmer wants to use. Moreover, different applications may place different time and space requirements on the graph data structure.
This implies a serious problem for the graph library writer who wants to provide reusable software, for it is impossible to anticipate every possible data structure that might be needed and to write a different version of the graph algorithm specifically for each one. The current state of affairs is that graph algorithms are written in terms of whatever data structure is most convenient for the algorithm, and users must convert their data structures to that format in order to use the algorithm. This is an inefficient undertaking, consuming programmer time and computational resources. Often, the cost is perceived not to be worthwhile, and the programmer instead chooses to rewrite the algorithm in terms of his or her own data structure. This approach is also time consuming and error prone, and will tend to lead to sub-optimal solutions since the application programmer may not be a graph algorithms expert.
Generic Programming
The Standard Template Library (STL) was introduced in 1994 and was adopted shortly thereafter into the C++ Standard. The STL was a library of interchangeable components for solving many fundamental problems on sequences of elements. What set the STL apart from libraries that came before it was that each STL algorithm could work with a wide variety of sequential data structures: linked lists, arrays, sets, and so on. The iterator abstraction provided an interface between containers and algorithms, and the C++ template mechanism provided the needed flexibility to allow implementation without loss of efficiency. Each algorithm in the STL is a function template parameterized by the types of iterators upon which it operates. Any iterator that satisfies a minimal set of requirements can be used regardless of the data structure traversed by the iterator. The systematic approach used in the STL to construct abstractions and interchangeable components is called generic programming.
Generic programming lends itself well to solving the reusability problem for graph libraries. With generic programming, graph algorithms can be made much more flexible, allowing them to be easily used in a wide variety of applications. Each graph algorithm is written not in terms of a specific data structure, but instead to a graph abstraction that can be easily implemented by many different data structures. Writing generic graph algorithms has the additional advantage of being more natural; the abstraction inherent in the pseudo-code description of an algorithm is retained in the generic function. The Boost Graph Library (BGL) is the first C++ graph library to apply the notions of generic programming to the construction of graph algorithms.
Some BGL History
The Boost Graph Library began its life as the Generic Graph Component Library (GGCL), a software project at the Lab for Scientific Computing (LSC). The LSC, under the direction of Professor Andrew Lumsdaine, was an interdisciplinary laboratory dedicated to research in algorithms, software, tools, and run-time systems for high-performance computational science and engineering. Special emphasis was put on developing industrial-strength, high-performance software using modern programming languages and techniques, most notably generic programming.
Soon after the Standard Template Library was released, work began at the LSC to apply generic programming to scientific computing. The Matrix Template Library (MTL) was one of the first projects. Many of the lessons learned during construction of the MTL were applied to the design and implementation of the GGCL. The LSC has since evolved into the Open Systems Laboratory (OSL), http://www.osl.iu.edu. Although the name and location have changed, the research agenda remains the same.
An important class of linear algebra computations in scientific computing is that of sparse matrix computations, an area where graph algorithms play an important role. As the LSC was developing the sparse matrix capabilities of the MTL, the need for high-performance reusable (and generic) graph algorithms became apparent. However, none of the graph libraries available at the time (LEDA, GTL, Stanford GraphBase) were written using the generic programming style of the MTL and the STL, and hence did not fulfill the flexibility and high-performance requirements of the LSC. Other researchers were also expressing interest in a generic C++ graph library. During a meeting with Bjarne Stroustrup, we were introduced to several individuals at AT&T who needed such a library. Other early work in the area of generic graph algorithms included some codes written by Alexander Stepanov, as well as Dietmar Kühl’s master’s thesis.
With this in mind, and motivated by homework assignments in his algorithms class, Jeremy Siek began prototyping an interface and some graph classes in the spring of 1998. Lie-Quan Lee then developed the first version of the GGCL, which became his master’s thesis project. During the following year, the authors began collaborating with Alexander Stepanov and Matthew Austern. During this time, Stepanov’s disjoint-sets-based connected components implementation was added to the GGCL, and work began on providing concept documentation for the GGCL, similar to Austern’s STL documentation.
During this year the authors also became aware of Boost and were excited to find an organization interested in creating high-quality, open source C++ libraries. Boost included several people interested in generic graph algorithms, most notably Dietmar Kühl. Some discussions about generic interfaces for graph structures resulted in a revision of the GGCL that closely resembles the current Boost Graph Library interface. On September 4, 2000, the GGCL passed the Boost formal review (managed by David Abrahams) and became the Boost Graph Library. The first release of the BGL was September 27, 2000. The BGL is not a “frozen” library. It continues to grow as new algorithms are contributed, and it continues to evolve to meet users’ needs. We encourage readers to participate in the Boost group and help with extensions to the BGL.
What Is Boost?
Boost is an online community that encourages development and peer-review of free C++ libraries. The emphasis is on portable and high-quality libraries that work well with (and are in the same spirit as) the C++ Standard Library. Members of the community submit proposals (library designs and implementations) for review. The Boost community (led by a review manager) then reviews the library, provides feedback to the contributors, and finally renders a decision as to whether the library should be included in the Boost library collection. The libraries are available at the Boost Web site http://www.boost.org. In addition, the Boost mailing list provides an important forum for discussing library plans and for organizing collaboration.
Obtaining and Installing the BGL Software
The Boost Graph Library is available as part of the Boost library collection, which can be obtained in several different ways. The CD accompanying this book contains version 1.25.1 of the Boost library collection.
In addition, releases of the Boost library collection can be obtained with your Web browser at http://www.boost.org/boost_all.zip for the Windows zip archive of the latest release and http://www.boost.org/boost_all.tar.gz for the UNIX archive of the latest release. The Boost libraries can also be downloaded via FTP at ftp://boost.sourceforge.net/pub/boost/release/.
The zip archive of the Boost library collection can be unzipped by using WinZip or other similar tools. The UNIX “tar ball” can be expanded using the following command:
gunzip -cd boost_all.tar.gz | tar xvf -
Extracting the archive creates a directory whose name consists of the word boost and a version number. For example, extracting the Boost release 1.25.1 creates a directory boost_1_25_1. Under this top directory are two principal subdirectories: boost and libs. The subdirectory boost contains the header files for all the libraries in the collection. The subdirectory libs contains a separate subdirectory for each library in the collection. These subdirectories contain library-specific source and documentation files. You can point your Web browser to boost_1_25_1/index.htm and navigate the whole Boost library collection.
All of the BGL header files are in the directory boost/graph/. However, other Boost header files are needed since BGL uses other Boost components. The HTML documentation is in libs/graph/doc/ and the source code for the examples is in libs/graph/example/. Regression tests for BGL are in libs/graph/test/. The source files in libs/graph/src/ implement the Graphviz file parsers and printers.
Except as described next, there are no compilation and build steps necessary to use BGL. All that is required is that the Boost header file directory be added to your compiler’s include path. For example, using Windows 2000, if you have unzipped release 1.25.1 from boost_all.zip into the top level directory of your C drive, for Borland, GCC, and Metrowerks compilers add -Ic:/boost_1_25_1 to the compiler command line, and for the Microsoft Visual C++ compiler add /I "c:/boost_1_25_1". For IDEs, add c:/boost_1_25_1 (or whatever you have renamed it to) to the include search paths using the appropriate dialog. Before using the BGL interface to LEDA or Stanford GraphBase, LEDA or GraphBase must be installed according to their installation instructions. To use the read_graphviz() functions (for reading AT&T Graphviz files), you must build and link to an additional library under boost_1_25_1/libs/graph/src.
The Boost Graph Library is written in ISO/IEC Standard C++ and compiles with most C++ compilers. For an up-to-date summary of the compatibility with a particular compiler, see the “Compiler Status” page at the Boost Web site http://www.boost.org/status/compiler_status.html.
How to Use This Book
This book is both a user guide and reference manual for the BGL. It is intended to allow the reader to begin using the BGL for real-life graph problems. This book should also be interesting for programmers who wish to learn more about generic programming. Although there are many books about how to use generic libraries (which in almost all cases means how to use the STL or Standard Library), there is very little available about how to actually build generic software. Yet generic programming is a vitally important new paradigm for software development. We hope that, by way of example, this book will show the reader how to do (and not simply use) generic programming and to apply and extend the generic programming paradigm beyond the basic container types and algorithms of the STL.
The third partner to the user guide and reference manual is the BGL code itself. The BGL code is not simply academic and instructional. It is intended to be used.
For students learning about graph algorithms and data structures, BGL provides a comprehensive graph algorithm framework. The student can concentrate on learning the important theory behind graph algorithms without becoming bogged down and distracted by too many implementation details. For practicing programmers, BGL provides high-quality implementations of graph data structures and algorithms. Programmers will realize significant time savings from this reliability. Time that would have otherwise been spent developing (and debugging) complicated graph data structures and algorithms can now be spent in more productive pursuits. Moreover, the flexible interface to the BGL will allow programmers to apply graph algorithms in settings where a graph may only exist implicitly.
For the graph theoretician, this book makes a persuasive case for the use of generic programming for implementing graph-theoretic algorithms. Algorithms written using the BGL interface will have broad applicability and will be able to be reused in numerous settings.
We assume that the reader has a good grasp of C++. Since there are many sources where the reader can learn about C++, we do not try to teach it here (see the references at the end of the book; The C++ Programming Language, Special ed., by Stroustrup and C++ Primer, 3rd ed., by Josee Lajoie and Stanley B. Lippman are our recommendations). We also assume some familiarity with the STL (see STL Tutorial and Reference Guide by David R. Musser, Gillmer J. Derge, and Atul Saini and Generic Programming and the STL by Matthew Austern). We do, however, present some of the more advanced C++ features used to implement generic libraries in general and the BGL in particular. Some necessary graph theory concepts are introduced here, but not in great detail. For a detailed discussion of elementary graph theory see Introduction to Algorithms by T. H. Cormen, C. E. Leiserson, and R. L. Rivest.
The Electronic Reference
An electronic version of the book is included on the accompanying CD, in the file bgl-book.pdf. The electronic version is searchable and is fully hyperlinked, making it a useful companion for the printed version. The hyperlinks include all internal references such as the literate programming “part” references as well as links to external Web pages.
Acknowledgments
We owe many debts of thanks to a number of individuals who both inspired and encouraged us in developing the BGL and in writing this book. A most profound thanks goes to Alexander Stepanov and David Musser for their pioneering work in generic programming, for their continued encouragement of our work, and for contributions to the BGL. We especially thank David Musser for his careful proofreading of this book. Matthew Austern’s work on documenting the concepts of the STL provided a foundation for creating the concepts in the BGL. We thank Dietmar Kühl for his work on generic graph algorithms and design patterns, especially for the property map abstraction.
This work would not have been possible without the expressive power of Bjarne Stroustrup’s C++ language.
Dave Abrahams, Jens Maurer, Dietmar Kühl, Beman Dawes, Gary Powell, Greg Colvin and the rest of the group at Boost provided valuable input to the BGL interface, numerous suggestions for improvement, and proofreads of this book. We also thank the following BGL users whose questions helped to motivate and improve BGL (as well as this book): Gordon Woodhull, Dave Longhorn, Joel Phillips, Edward Luke, and Stephen North.
Thanks to a number of individuals who reviewed the book during its development: Jan Christiaan van Winkel, David Musser, Beman Dawes, and Jeffrey Squyres. A great thanks to our editor Deborah Lafferty; Kim Arney Mulcahy, Cheryl Ferguson, and Marcy Barnes, the production coordinator; and the rest of the team at Addison-Wesley. It was a pleasure to work with them.
Our original work on the BGL was supported in part by NSF grant ACI-9982205. Parts of the BGL were completed while the third author was on sabbatical at Lawrence Berkeley National Laboratory (where the first two authors were occasional guests).
All of the graph drawings in this book were produced using the dot program from the Graphviz package.
License
The BGL software is released under an open source “artistic” license. A copy of the BGL license is included with the source code in the LICENSE file.
The BGL may be used freely for both commercial and noncommercial use. The main restriction on BGL is that modified source code can only be redistributed if it is clearly marked as a nonstandard version of BGL. The preferred method for the distribution of BGL, and for submitting changes, is through the Boost Web site.