Data Structures and Analysis in C

Data Structures and Analysis in C

by Mark A. Weiss

Hardcover(Older Edition)

$66.75

Product Details

ISBN-13: 9780805354409
Publisher: Benjamin-Cummings Publishing Company
Publication date: 12/28/1992
Edition description: Older Edition
Pages: 600
Product dimensions: 7.68(w) x 9.58(h) x 0.92(d)

Read an Excerpt

PREFACE:

Purpose/Goals

This book describes data structures, methods of organizing large amounts of data, and algorithm analysis, the estimation of the running time of algorithms. As computers become faster and faster, the need for programs that can handle large amounts of input becomes more acute. Paradoxically, this requires more careful attention to efficiency, since inefficiencies in programs become most obvious when input sizes are large. By analyzing an algorithm before it is actually coded, students can decide if a particular solution will be feasible. For example, in this text students look at specific problems and see how careful implementations can reduce the time constraint for large amounts of data from 16 years to less than a second. Therefore, no algorithm or data structure is presented without an explanation of its running time. In some cases, minute details that affect the running time of the implementation are explored.

Once a solution method is determined, a program must still be written. As computers have become more powerful, the problems they must solve have become larger and more complex, requiring development of more intricate programs. The goal of this text is to teach students good programming and algorithm analysis skills simultaneously so that they can develop such programs with the maximum amount of efficiency.

This book is suitable for either an advanced data structures (CS7) course or a first-year graduate course in algorithm analysis. Students should have some knowledge of intermediate programming, including such topics as pointers and recursion, and some background in discrete math.

Approach

I believe it isimportant for students to learn how to program for themselves, not how to copy programs from a book. On the other hand, it is virtually impossible to discuss realistic programming issues without including sample code. For this reason, the book usually provides about one-half to three-quarters of an implementation, and the student is encouraged to supply the rest. Chapter 12, which is new to this edition, discusses additional data structures with an emphasis on implementation details.

The algorithms in this book are presented in ANSI C, which, despite some flaws, is arguably the most popular systems programming language. The use of C instead of Pascal allows the use of dynamically allocated arrays (see, for instance, rehashing in Chapter 5). It also produces simplified code in several places, usually because the and (&&) operations is short-circuited.

Most criticisms of C center on the fact that it is easy to write code that is barely readable. Some of the more standard tricks, such as the simultaneous assignment and testing against 0 via

if (x=y)

are generally not used in the text, since the loss of clarity is compensated by only a few keystrokes and no increased speed. I believe that this books demonstrates that unreadable code can be avoided by exercising reasonable care.

Overview

Chapter 1 contains review material on discrete math and recursion. I believe the only way to be comfortable with recursion is to see good uses over and over. Therefore, recursion is prevalent in this text, with examples in every chapter except Chapter 5.

Chapter 2 deals with algorithm analysis. This chapter explains asymptotic analysis and its major weaknesses. Many examples are provided, including an in-depth explanation of logarithms running time. Simple recursive programs are analyzed by intuitively converting them into iterative programs. More complicated divide-and-conquer programs are introduced, but some of the analysis (solving recurrence relations) is implicitly delayed until Chapter 7, where it is performed in detail.

Chapter 3 covers lists, stacks, and queues. The emphasis here is on coding these data structures using ADTs, fast implementation of these data structures, and an exposition of some of their uses. There are almost no programs (just routines), but the exercises contain plenty of ideas for programming assignments.

Chapter 4 covers trees, with an emphasis on search trees, including external search trees (B-trees). The UNIX file system and expression trees are used as examples. AVL trees and splay trees are introduced but not analyzed. Seventy-five percent of the code is written, leaving similar cases to be completed by the student. More careful treatment of search tree implementation details is found in Chapter 12. Additional coverage of trees, such as file compression and game trees, is deferred until Chapter 10. Data structures for an external medium are considered as the final topic in several chapters.

Chapter 5 is relatively short chapter concerning hash tables. Some analysis is performed, and extendible hashing is covered at the end of the chapter.

Chapter 6 is about priority queues. Binary heaps are covered, and there is additional material on some of the theoretically interesting implementations of priority queues. The Fibonacci heap is discussed in Chapter 11, and the pairing heap is discussed in Chapter 12.

Chapter 7 covers sorting. It is very specific with respect to coding details and analysis. All the important general-purpose sorting algorithms are covered and compared. Four algorithms are analyzed in detail: insertion sort, Shellsort, heapsort, and quicksort. The analysis of the average-case running time of heapsort is new to this edition. External sorting is covered at the end of the chapter.

Chapter 8 discusses the disjoint set algorithm with proof of the running time. This is a short and specific chapter that can be skipped if Kruskal's algorithm is not discussed.

Chapter 9 covers graph algorithms. Algorithms on graphs are interesting, not only because they frequently occur in practice but also because their running time is so heavily dependent on the proper use of data structures. Virtually all of the standard algorithms are presented along with appropriate data structures, pseudocode, and analysis of running time. To place these problems in a proper context, a short discussion on complexity theory (including NP-completeness and undecidability) is provided.

Chapter 10 covers algorithm design by examining common problem-solving techniques. This chapter is heavily fortified with examples. Pseudocode is used in these later chapters so that the student's appreciation of an example algorithm is not obscured by implementation details.

Chapter 11 deals with amortized analysis. Three data structures from Chapters 4 and 6 and the Fibonacci heap, introduced in this chapter, are analyzed.

Chapter 12 is new to this edition. It covers search tree algorithms, the k-d tree, and the pairing heap. This chapter departs from the rest of the text by providing complete and careful implementations for the search trees and pairing heap. The material is structured so that the instructor can integrate sections into discussions from other chapters. For example, the top-down red black tree in Chapter 12 can be discussed under AVL trees (in Chapter 4).

Chapters 1-9 provide enough material for most one-semester data structures courses. If time permits, then Chapter 10 can be covered. A graduate course on algorithm analysis could cover Chapters 7-11. The advanced data structures analyzed in Chapter 11 can easily be referred to in the earlier chapters. The discussion of NP-completeness in Chapter 9 is far too brief to be used in such a course. Garey and Johnson's book on NP-completeness can be used to augment this text.

Exercises

Exercises, provided at the end of each chapter, match the order in which material is presented. The last exercises may address the chapter as a whole rather than a specific section. Difficult exercises are marked with an asterisk, and more challenging exercises have two asterisks.

A solutions manual containing solutions to almost all the exercises is available to instructors from the Addison-Wesley Publishing Company.

References

References are placed at the end of each chapter. Generally the references either are historical, representing the original source of the material, or they represent extensions and improvements to the results given in the text. Some references represent solutions to exercises.

Code Availability

The example program code in this book is available via anonymous ftp at aw.com. It is also accessible through the World Wide Web; the URL is ...

Table of Contents

(All chapters, except Chapter 3, conclude with a Summary, Exercises and References.)
1. Introduction.

What's the Book About?
Mathematics Review.
Exponents.
Logarithms.
Series.
Modular Arithmetic.
The P word.
A Brief Introduction to Recursion.

2. Algorithm Analysis.
Mathematical Background.
Model.
What to Analyze.
Running Time Calculations.
A Simple Example.
General Rules.
Solutions for the Maximum Subsequence Sum Problem.
Logarithms in the Running Time.
Checking Your Analysis.
A Grain of Salt.

3. Lists, Stacks, and Queues.
Abstract Data Types (ADTs).
The List ADT.
Simple Array Implementation of Lists.
Linked Lists.
Programming Details.
Common Errors.
Doubly Linked Lists.
Circularly Linked Lists.
Examples.
Cursor Implementation of Linked Lists.
The Stack ADT.
Stack Model.
Implementation of Stacks.
Applications.
The Queue ADT.
Queue Model.
Array Implementation of Queues.
Applications of Queues.

4. Trees.
Preliminaries.
Implementation of Trees.
Tree Traversals with an Application.
Binary Trees.
Implementation.
Expression Trees.
The Search Tree ADT—Binary Search Trees.
MakeEmpty.
Find.
FindMin and FindMax.
Insert.
Delete.
Average-Case Analysis.
AVL Trees.
SingleRotation.
Double Rotation.
Splay Trees.
A Simple Idea (That Does Not Work).
Splaying.
Tree Traversals (Revisited).
B-Trees.

5. Hashing.
General Idea.
Hash Function.
Separate Chaining.
Open Addressing.
Linear Probing.
Quadratic Probing.
Double Hashing.
Rehashing.
Extendible Hashing.

6. Priority Queues (Heaps).
Model.
Simple Implementations.
Binary Heaps.
Structure Property.
Heap Order Property.
Basic Heap Operations.
Other Heap Operations.
Applications of Priority Queues.
The Selection Problem.
Event Simulation.
d-Heaps.
Leftist Heaps.
Leftist Heap Property.
Leftist Heap Operations.
Skew Heaps.
Binomial Queues.
Binomial Queue Structure.
Binomial Queue Operations.
Implementations of Binomial Queues.

7. Sorting.
Preliminaries.
Insertion Sort.
The Algorithm.
Analysis of Insertion Sort.
A Lower Bound for Simple Sorting Algorithms.
Shellsort.
Analysis of Insertion Sort.
Heapsort.
Analysis of Heapsort.
Mergesort.
Analysis of Mergesort.
Quicksort.
Picking the Pivot.
Partitioning Strategy.
Small Arrays.
Actual Quicksort Routines.
Analysis of Quicksort.
A Linear-Expected-Time Algorithm for Selection.
Sorting Large Structures.
A General Lower Bound for Sorting.
Decision Trees.
Bucket Sort.
External Sorting.
Why We Need New Algorithms.
Model for External Sorting.
The Simple Algorithm.
Multiway Merge.
Polyphase Merge.
Replacement Selection.

8. The Disjoint Set ADT.
Equivalence Relations.
The Dynamic Equivalence Problem.
Basic Data Structure.
Smart Union Algorithms.
Path Compression.
Worst Case for Union-by-Rank and Path Compression.
Analysis of the Union/Find Algorithm.
An Application.

9. Graph Algorithms.
Definitions.
Representation of Graphs.
Topological Sort.
Shortest-Path Algorithms.
Unweighted Shortest Paths.
Dijkstra's Algorithm.
Graphs with Negative Edge Costs.
Acyclic Graphs.
All-Pairs Shortest Path.
Network Flow Problems.
A Simple Maximum-Flow Algorithm.
Minimum Spanning Tree.
Prim's Algorithm.
Kruskal's Algorithm.
Applications of Depth-First Search.
Undirected Graphs.
Biconnectivity.
Euler Circuits.
Directed Graphs.
Finding Strong Components.
Introduction to the NP-Completeness.
Easy vs. Hard.
The Class NP.
NP-Complete Problems.

10. Algorithm Design Techniques.
Greedy Algorithms.
A Simple Scheduling Problem.
Huffman Codes.
Approximate Bin Packing.
Divide and Conquer.
Running Time of Divide and Conquer Algorithms.
Closest-Points Problem.
The Selection Problem.
Theoretical Improvements for Arithmetic Problems.
Dynamic Programming.
Using a Table Instead of Recursion.
Ordering Matrix Multiplications.
Optimal Binary Search Tree.
All-Pairs Shortest Path.
Randomized Algorithms.
Random Number Generators.
Skip Lists.
Primality Testing.
Backtracking Algorithms.
The Turnpike Reconstruction Problem.
Games.

11. Amortized Analysis.
An Unrelated Puzzle.
Binomial Queues.
Skew Heaps.
Fibonacci Heaps.
Cutting Nodes in Leftist Heaps.
Lazy Merging for Binomial Queues.
The Fibonacci Heap Operations.
Proof of the Time Bound.
Splay Trees.

12. Advanced Data Structures and Implementation.
Top-Down Splay Trees.
Red Black Trees.
Bottom-Up Insertion.
Top-Down Red Black Trees.
Top-Down Deletion.
Deterministic Skip Lists.
AA-Trees.
Treaps.
k-d Trees.
Pairing Heaps.

Index.

Preface

Purpose/Goals

This book describes data structures, methods of organizing large amounts of data, and algorithm analysis, the estimation of the running time of algorithms. As computers become faster and faster, the need for programs that can handle large amounts of input becomes more acute. Paradoxically, this requires more careful attention to efficiency, since inefficiencies in programs become most obvious when input sizes are large. By analyzing an algorithm before it is actually coded, students can decide if a particular solution will be feasible. For example, in this text students look at specific problems and see how careful implementations can reduce the time constraint for large amounts of data from 16 years to less than a second. Therefore, no algorithm or data structure is presented without an explanation of its running time. In some cases, minute details that affect the running time of the implementation are explored.

Once a solution method is determined, a program must still be written. As computers have become more powerful, the problems they must solve have become larger and more complex, requiring development of more intricate programs. The goal of this text is to teach students good programming and algorithm analysis skills simultaneously so that they can develop such programs with the maximum amount of efficiency.

This book is suitable for either an advanced data structures (CS7) course or a first-year graduate course in algorithm analysis. Students should have some knowledge of intermediate programming, including such topics as pointers and recursion, and some background in discrete math.

Approach

I believe it isimportant for students to learn how to program for themselves, not how to copy programs from a book. On the other hand, it is virtually impossible to discuss realistic programming issues without including sample code. For this reason, the book usually provides about one-half to three-quarters of an implementation, and the student is encouraged to supply the rest. Chapter 12, which is new to this edition, discusses additional data structures with an emphasis on implementation details.

The algorithms in this book are presented in ANSI C, which, despite some flaws, is arguably the most popular systems programming language. The use of C instead of Pascal allows the use of dynamically allocated arrays (see, for instance, rehashing in Chapter 5). It also produces simplified code in several places, usually because the and (&&) operations is short-circuited.

Most criticisms of C center on the fact that it is easy to write code that is barely readable. Some of the more standard tricks, such as the simultaneous assignment and testing against 0 via

if (x=y)

are generally not used in the text, since the loss of clarity is compensated by only a few keystrokes and no increased speed. I believe that this books demonstrates that unreadable code can be avoided by exercising reasonable care.

Overview

Chapter 1 contains review material on discrete math and recursion. I believe the only way to be comfortable with recursion is to see good uses over and over. Therefore, recursion is prevalent in this text, with examples in every chapter except Chapter 5.

Chapter 2 deals with algorithm analysis. This chapter explains asymptotic analysis and its major weaknesses. Many examples are provided, including an in-depth explanation of logarithms running time. Simple recursive programs are analyzed by intuitively converting them into iterative programs. More complicated divide-and-conquer programs are introduced, but some of the analysis (solving recurrence relations) is implicitly delayed until Chapter 7, where it is performed in detail.

Chapter 3 covers lists, stacks, and queues. The emphasis here is on coding these data structures using ADTs, fast implementation of these data structures, and an exposition of some of their uses. There are almost no programs (just routines), but the exercises contain plenty of ideas for programming assignments.

Chapter 4 covers trees, with an emphasis on search trees, including external search trees (B-trees). The UNIX file system and expression trees are used as examples. AVL trees and splay trees are introduced but not analyzed. Seventy-five percent of the code is written, leaving similar cases to be completed by the student. More careful treatment of search tree implementation details is found in Chapter 12. Additional coverage of trees, such as file compression and game trees, is deferred until Chapter 10. Data structures for an external medium are considered as the final topic in several chapters.

Chapter 5 is relatively short chapter concerning hash tables. Some analysis is performed, and extendible hashing is covered at the end of the chapter.

Chapter 6 is about priority queues. Binary heaps are covered, and there is additional material on some of the theoretically interesting implementations of priority queues. The Fibonacci heap is discussed in Chapter 11, and the pairing heap is discussed in Chapter 12.

Chapter 7 covers sorting. It is very specific with respect to coding details and analysis. All the important general-purpose sorting algorithms are covered and compared. Four algorithms are analyzed in detail: insertion sort, Shellsort, heapsort, and quicksort. The analysis of the average-case running time of heapsort is new to this edition. External sorting is covered at the end of the chapter.

Chapter 8 discusses the disjoint set algorithm with proof of the running time. This is a short and specific chapter that can be skipped if Kruskal's algorithm is not discussed.

Chapter 9 covers graph algorithms. Algorithms on graphs are interesting, not only because they frequently occur in practice but also because their running time is so heavily dependent on the proper use of data structures. Virtually all of the standard algorithms are presented along with appropriate data structures, pseudocode, and analysis of running time. To place these problems in a proper context, a short discussion on complexity theory (including NP-completeness and undecidability) is provided.

Chapter 10 covers algorithm design by examining common problem-solving techniques. This chapter is heavily fortified with examples. Pseudocode is used in these later chapters so that the student's appreciation of an example algorithm is not obscured by implementation details.

Chapter 11 deals with amortized analysis. Three data structures from Chapters 4 and 6 and the Fibonacci heap, introduced in this chapter, are analyzed.

Chapter 12 is new to this edition. It covers search tree algorithms, the k-d tree, and the pairing heap. This chapter departs from the rest of the text by providing complete and careful implementations for the search trees and pairing heap. The material is structured so that the instructor can integrate sections into discussions from other chapters. For example, the top-down red black tree in Chapter 12 can be discussed under AVL trees (in Chapter 4).

Chapters 1-9 provide enough material for most one-semester data structures courses. If time permits, then Chapter 10 can be covered. A graduate course on algorithm analysis could cover Chapters 7-11. The advanced data structures analyzed in Chapter 11 can easily be referred to in the earlier chapters. The discussion of NP-completeness in Chapter 9 is far too brief to be used in such a course. Garey and Johnson's book on NP-completeness can be used to augment this text.

Exercises

Exercises, provided at the end of each chapter, match the order in which material is presented. The last exercises may address the chapter as a whole rather than a specific section. Difficult exercises are marked with an asterisk, and more challenging exercises have two asterisks.

A solutions manual containing solutions to almost all the exercises is available to instructors from the Addison-Wesley Publishing Company.

References

References are placed at the end of each chapter. Generally the references either are historical, representing the original source of the material, or they represent extensions and improvements to the results given in the text. Some references represent solutions to exercises.

Code Availability

The example program code in this book is available via anonymous ftp at aw.com. It is also accessible through the World Wide Web; the URL is http://www.aw.com/cseng/authors/weiss/dsaac2/dsaac2e.sup.html (follow the links from there). The exact location of this material may change.

Acknowledgments

Many, many people have helped me in the preparation of books in this series. Some are listed in other versions of the book; thanks to all.

For this edition, I would like to thank my editors at Addison-Wesley, Carter Shanklin and Susan Hartman. Teri Hyde did another wonderful job with the production, and Matthew Harris and his staff at Publication Services did their usual fine work putting the final pieces together.

M.A.W.
Miami, Florida
July, 1996


Customer Reviews

Most Helpful Customer Reviews

See All Customer Reviews