Christopher M. Frenz is a bioinformaticist at New York Medical College and is the author of Visual Basic and Visual Basic.NET for Scientists and Engineers. Frenz is an expert in Perl and scientific programming, in addition to the .NET platform.
Pro Perl Parsing by Christopher M. Frenz
Perl, one of the world’s most widely used programming languages, was born out of its creator's dissatisfaction with the standard data-parsing solutions of the time. Indeed, since the 1.0 release in 1987, Perl has been heralded for its powerful parsing capabilities, features that are further enhanced through the thousands of Perl extensions made available through CPAN (the Comprehensive Perl Archive Network).
Pro Perl Parsing begins with several chapters devoted to key parsing principles, covering regular expressions, parsing grammars, and parsing techniques. This material sets the stage for later chapters, which introduce numerous powerful CPAN parsing modules and provide an ample supply of example applications.
Table of Contents
- Parsing and Regular Expression Basics
- Parsing Basics
- Using Parse::Yapp
- Performing Recursive-Descent Parsing with Parse::RecDescent
- Accessing Web Data with HTML::TreeBuilder
- Parsing XML Documents with XML::LibXML and XML::SAX
- Introducing Miscellaneous Parsing Modules
- Finding Solutions to Miscellaneous Parsing Problems
- Performing Text and Data Mining
Meet the Author
Most Helpful Customer Reviews
Christopher M. Frenz has put together a real how-to manual for those who use Perl for parsing. Grabbing the data you want from a file can be tricky, but Frenz has taken parsing from the top shelf and placed it where any Perl programmer can use it. The opening chapter is great for anyone who has had trouble understanding how to use the regular expressions built into Perl. He explains pattern matching, quantifiers, and how not to be greedy with your pattern matching. However, the book goes far beyond the basics of regular expressions in Perl to various libraries that can be used for parsing HTML, XML, RSS, and any text-based file. Chapter 2 of the book seems very heady, as he discusses the use of generative grammars, which is foundational for anyone wanting to truly understand parsing. From Chomsky's grammar to Type 1, 2, and 3 grammars, he details these structures and how to use them. The Perl modules GraphViz::Regex, Regexp::Common, Parse::Yapp, Parse::RecDescent, HTML::TreeBuilder, XML::LibXML, XML::SAX, and XML::RSS are all discussed in this book, and clear examples are given of how you can use them to parse files to get the data you want. At the end of the book is a section on data mining, well worth the read, dealing with descriptive modeling and predictive modeling. For anyone doing data mining work on Web-based data or relational databases, this section can be very helpful.
Pro Perl Parsing is a major advance in the field of Perl programming in general, and medical-text processing in particular (among many others). Pro Perl Parsing presents a unique conceptual framework for the application of regexps, and goes beyond hinting at the importance of parsing for generating medical lexicons from massive sources of patient-specific encounter data. Pro Perl Parsing sets the stage (structure) for the application of programming intelligence in clinical medicine using Perl. It is an outstanding piece of scholarship! Oscar A. Linares, MD, PhD (Applied Mathematics)