How do you distinguish a cat from a dog by their DNA? Did Shakespeare really write all of his plays? Pattern matching techniques can offer answers to these questions and to many others, from molecular biology, to telecommunications, to classifying Twitter content. This book for researchers and graduate students demonstrates the probabilistic approach to pattern matching, which predicts the performance of pattern matching algorithms with very high precision using analytic combinatorics and analytic information theory. Part I compiles known results of pattern matching problems via analytic methods. Part II focuses on applications to various data structures on words, such as digital trees, suffix trees, string complexity and string-based data compression. The authors use results and techniques from Part I and also introduce new methodology such as the Mellin transform and analytic depoissonization. More than 100 end-of-chapter problems help the reader to make the link between theory and practice.
Publisher: Cambridge University Press
Product dimensions: 6.85(w) x 9.72(h) x 1.02(d)
About the Author
Wojciech Szpankowski is Saul Rosen Professor of Computer Science and (by courtesy) Electrical and Computer Engineering at Purdue University, where he teaches and conducts research in analysis of algorithms, information theory, bioinformatics, analytic combinatorics, random structures, and stability problems of distributed systems. In 2008 he launched the interdisciplinary Institute for Science of Information, and in 2010 he became the Director of the newly established NSF Science and Technology Center for Science of Information. Szpankowski is a Fellow of IEEE and an Erskine Fellow. He received the Humboldt Research Award in 2010.
Table of Contents
Preface; Acknowledgements; Part I. Analysis: 1. Probabilistic models; 2. Exact string matching; 3. Constrained exact string matching; 4. Generalized string matching; 5. Subsequence pattern matching; Part II. Applications: 6. Algorithms and data structures; 7. Digital trees; 8. Suffix trees and Lempel-Ziv'77; 9. Lempel-Ziv'78 compression algorithm; 10. String complexity; Bibliography; Index.