The Barnes & Noble Review
Maybe you know your way around Perl, at least a little...but today you need a specific technique, or example, or solution. Where should you look? For years, Perl programmers have turned to O’Reilly’s Perl Cookbook. Now they have a better option: Perl Cookbook, Second Edition.
What’s new here? Loads. Just look at the spine: This edition’s about 300 pages fatter. Leading Perl experts Tom Christiansen and Nathan Torkington have created more than 80 new recipes. They’ve substantially updated another 100. They’ve added an entirely new chapter on mod_perl, Apache’s embedded Perl interpreter, covering everything from authentication and logging to advanced templating with Mason and the Template Toolkit. Their new chapter on XML covers everything from parsing and validation to transformation.
The whole book’s been updated for Perl 5.8.1, the robust, stable version that’s become the default standard while folks held their breath for Perl 6. And most of the code's been tested under BSD, Linux, and Solaris. (Except for the system programming examples, most of these recipes ought to work wherever Perl runs, including Windows and Mac OS X.)
While Perl Cookbook, Second Edition isn’t a Perl tutorial, it’s organized so you can gradually deepen and solidify the skills you already have, even if they’re rudimentary. For example, the authors start with “recipes” for using Perl’s simplest data types and operators -- basic stuff, but invaluable to relative novices. There’s a full chapter on basics such as accessing substrings, parsing comma-separated data, and using Unicode strings. There are examples of representing floating point data, generating pseudo-random numbers, converting between numeric and string date formats, manipulating lists and arrays, and more. There’s also a detailed, start-to-finish demonstration of working with associative arrays, arguably Perl’s most useful data type.
Perl has always been an extraordinarily powerful pattern-matching tool. The authors offer nearly two dozen pattern matching recipes: for matching letters, words, multiple lines, nested and recursive patterns, and strings. You’ll find solutions for manipulating files, followed by four chapters on enhancing program flexibility and power -- including coverage of creating your own user-defined types.
If you can do it with Perl, chances are this book can help you do it better. There’s a full chapter on manipulating DBM files and using SQL and the DBI module to query and update external databases. There’s extensive coverage of process management and communication, and a full chapter on Perl sockets programming. In addition to the aforementioned mod_perl coverage, there are more than 50 recipes for building Internet applications and services: DNS, FTP, mail, LDAP, CGI, automated forms, cookies, HTTP, robots, and more.
In each chapter, Christiansen and Torkington start simple and move toward more complex solutions. Often, they present several approaches to solving the same or similar problems, outlining the trade-offs. As Perlfolk say, “There’s more than one way to do it.” But it’s amazing how often the best way is in here.
Bill Camarda
Bill Camarda is a consultant, writer, and web/multimedia content developer. His 15 books include Special Edition Using Word 2000 and Upgrading & Fixing Networks for Dummies, Second Edition.
Read an Excerpt
From Chapter 12: Packages, Libraries, and Modules
Package is a compile-time declaration that sets the default package prefix for unqualified global identifiers, just as chdir sets the default directory prefix for relative pathnames. This effect lasts until the end of the current scope (a brace-enclosed block, file, or eval). The effect is also terminated by any subsequent package statement in the same scope. (See the following code.) All programs are in package main until they use a package statement to change this...
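As a minimal sketch of the scoping rule just described (the package name Alpha and its contents are hypothetical, chosen only for illustration), a package declaration inside a block stops applying when the block ends:

```perl
use strict;
use warnings;

print __PACKAGE__, "\n";               # main

{
    package Alpha;                     # default prefix is Alpha:: until the block ends
    our $name = 'alpha';               # this is $Alpha::name
    sub whoami { return __PACKAGE__ }  # compiled in package Alpha
}

# The block has closed, so unqualified names are back in package main.
print __PACKAGE__, "\n";               # main
print Alpha::whoami(), " $Alpha::name\n";
```

Run as a script, this should print "main" twice and then "Alpha alpha".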
Unlike user-defined identifiers, built-in variables with punctuation names (like $_ and $.) and the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC, and SIG are all forced to be in package main when unqualified. That way things like STDIN, @ARGV, %ENV, and $_ are always the same no matter what package you're in; for example, @ARGV always means @main::ARGV, even if you've used package to change the default package. A fully qualified @ElseWhere::ARGV would not (and carries no special built-in meaning). Make sure to localize $_ if you use it in your module.
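One way to check this behavior is to compare references from inside another package. The package name Elsewhere and the helper subs below are hypothetical; this is a sketch, not code from the book:

```perl
use strict;
use warnings;

package Elsewhere;

# Unqualified special names still resolve to package main in here.
sub argv_is_main { return \@ARGV == \@main::ARGV }
sub env_is_main  { return \%ENV  == \%main::ENV }

# A fully qualified name in another package is an ordinary, unrelated array.
@Elsewhere::ARGV = ('ordinary');
sub other_is_main { return \@Elsewhere::ARGV == \@main::ARGV }

package main;

print "unqualified \@ARGV is \@main::ARGV: ", (Elsewhere::argv_is_main()  ? "yes" : "no"), "\n";
print "unqualified \%ENV is \%main::ENV: ",   (Elsewhere::env_is_main()   ? "yes" : "no"), "\n";
print "\@Elsewhere::ARGV is \@main::ARGV: ",  (Elsewhere::other_is_main() ? "yes" : "no"), "\n";
```

The first two lines should report "yes" and the third "no".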
The unit of software reuse in Perl is the module, a file that has a collection of related functions designed to be used by other programs and library modules. Every module has a public interface, a set of variables and functions that outsiders are encouraged to use. From inside the module, the interface is defined by initializing certain package variables that the standard Exporter module looks at. From outside the module, the interface is accessed by importing symbols as a side effect of the use statement. The public interface of a Perl module is whatever is documented to be public. In the case of undocumented interfaces, it's whatever is vaguely intended to be public. When we talk about modules in this chapter, and traditional modules in general, we mean those that use the Exporter.
The require and use statements both pull a module into your program, although their semantics are slightly different. require loads modules at runtime, with a check to avoid the redundant loading of a given module. use is like require, with two added properties: compile-time loading and automatic importing.
Modules included with use are processed at compile time, but require processing happens at run time. This is important because if a module that a program needs is missing, the program won't even start because the use fails during compilation of your script. Another advantage of compile-time use over run-time require is that function prototypes in the module's subroutines become visible to the compiler. This matters because only the compiler cares about prototypes, not the interpreter. (Then again, we don't usually recommend prototypes except for replacing built-in commands, which do have them.)
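The timing difference can be observed directly. The sketch below uses two standard modules, List::Util and File::Basename, purely as stand-ins; the @order array is a hypothetical name used to record when each statement runs:

```perl
use strict;
use warnings;

our @order;                              # package variable shared by both phases
push @order, 'run time';                 # executes only after compilation finishes
BEGIN { push @order, 'compile time' }    # executes as soon as it is parsed

print join(', ', @order), "\n";          # compile time, run time

# "use List::Util 'sum'" behaves like this BEGIN block: load, then import.
BEGIN { require List::Util; List::Util->import('sum') }
print sum(1, 2, 3), "\n";                # 6

# A plain require loads at run time and imports nothing,
# so the function has to be called by its full name.
require File::Basename;
print File::Basename::basename('/tmp/report.txt'), "\n";   # report.txt
```

Even though the BEGIN block appears later in the file than the first push, its entry lands first in @order.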
Use is suitable for giving hints to the compiler because of its compile-time behavior. A pragma is a special module that acts as a directive to the compiler to alter how Perl compiles your code. A pragma's name is always all lowercase, so when writing a regular module instead of a program, choose a name that starts with a capital letter. Pragmas supported by Perl 5.004 include autouse, constant, diagnostics, integer, lib, locale, overload, sigtrap, strict, subs, and vars. Each has its own manpage.
The other difference between require and use is that use performs an implicit import on the included module's package. Importing a function or variable from one package to another is a form of aliasing; that is, it makes two different names for the same underlying thing. It's like linking in files from another directory to your current one by the command ln /somedir/somefile. Once it's linked in, you no longer have to use the full pathname to access the file. Likewise, an imported symbol no longer needs to be fully qualified by package name (or predeclared with use vars or use subs). You can use imported variables as though they were part of your package. If you imported $English::OUTPUT_AUTOFLUSH in the current package, you could refer to it as $OUTPUT_AUTOFLUSH.
The required file extension for a Perl module is ".pm". The module named FileHandle would be stored in the file FileHandle.pm. The full path to the file depends on your include path, which is stored in the global @INC variable. Recipe 12.7 shows how to manipulate this array to your own purposes.
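A short sketch of that aliasing, using the standard English module: the imported short name, the fully qualified name, and the built-in $| all refer to the same variable.

```perl
use strict;
use warnings;
use English;    # imports $OUTPUT_AUTOFLUSH and the other long names

$OUTPUT_AUTOFLUSH = 1;                    # short, imported name
print "autoflush flag: ", $|, "\n";       # 1

$English::OUTPUT_AUTOFLUSH = 0;           # fully qualified name, same variable
print "autoflush flag: ", $|, "\n";       # 0
```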
If the module name itself contains one or more double colons, these are translated into your system's directory separator. That means that the File::Find module resides in the file File/Find.pm under most filesystems. For example...
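The name-to-path translation can be sketched with a small helper. The function module_to_path is hypothetical, written here for illustration. Perl records loaded modules in %INC under keys that use "/" as the separator, so the helper's result can be looked up there:

```perl
use strict;
use warnings;
use File::Find ();    # load the module without importing anything

# Hypothetical helper: turn a module name into its relative file path.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;    # each double colon becomes a separator
    return "$path.pm";
}

my $key = module_to_path('File::Find');   # File/Find.pm
print "$key was loaded from $INC{$key}\n";
```

The directory printed before File/Find.pm is whichever @INC entry the module was found in, so it varies by installation.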
The following is a typical setup for a hypothetical module named Cards::Poker that demonstrates how to manage its exports. The code goes in the file named Poker.pm within the directory Cards; that is, Cards/Poker.pm. (See Recipe 12.7 for where the Cards directory should reside.) Here's that file, with line numbers included for reference...
Line 1 declares the package that the module will put its global variables and functions in. Typically, a module first switches to a particular package so that it has its own place for global variables and functions, one that won't conflict with that of another program. This must be written exactly as the corresponding use statement will be written when the module is loaded.
Don't say package Poker just because the basename of your file is Poker.pm. Rather, say package Cards::Poker because your users will say use Cards::Poker. This common problem is hard to debug. If you don't make the package and use statements exactly the same, you won't see a problem until you try to call imported functions or access imported variables, which will be mysteriously missing.
Line 2 loads in the Exporter module, which manages your module's public interface as described below. Line 3 initializes the special, per-package array @ISA to contain the word "Exporter". When a user says use Cards::Poker, Perl implicitly calls a special method, Cards::Poker->import(). You don't have an import method in your package, but that's OK, because the Exporter package does, and you're inheriting from it because of the assignment to @ISA (is a). Perl looks at the package's @ISA for resolution of undefined methods. Inheritance is a topic of Chapter 13, Classes, Objects, and Ties. You may ignore it for now, so long as you put code as shown in lines 2 and 3 into each module you write.
Line 4 assigns the list ('&shuffle', '@card_deck') to the special, per-package array @EXPORT. When someone imports this module, variables and functions listed in that array are aliased into the caller's own package. That way they don't have to call the function Cards::Poker::shuffle(23) after the import. They can just write shuffle(23) instead. This won't happen if they load Cards::Poker with require Cards::Poker; only a use imports.
Lines 5 and 6 set up the package global variables and functions to be exported. (We presume you'll actually flesh out their initializations and definitions more than in these examples.) You're free to add other variables and functions to your module as well, including ones you don't put in the public interface via @EXPORT. See Recipe 12.1 for more about using the Exporter.
Finally, line 7 is a simple 1, indicating the overall return value of the module. If the last evaluated expression in the module doesn't produce a true value, an exception will be raised. Trapping this is the topic of Recipe 12.2. Any old true value will do, like 6.02e23 or "Because tchrist and gnat told us to put this here"; however, 1 is the canonical true value used by almost every module.
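The listing itself is not reproduced in this excerpt. What follows is a sketch reconstructed from the walk-through above, not the authors' exact code. To make it runnable as a single script, the module is inlined in a BEGIN block and marked as loaded through %INC, which is an assumption of this sketch. The deck contents and the body of shuffle are placeholders, and treating the argument to shuffle as a count of cards to return is a guess.

```perl
use strict;
use warnings;
use List::Util ();

BEGIN {
    package Cards::Poker;                       # line 1: must match "use Cards::Poker"
    use Exporter;                               # line 2: load the Exporter
    our @ISA    = ('Exporter');                 # line 3: inherit Exporter's import()
    our @EXPORT = ('&shuffle', '@card_deck');   # line 4: symbols exported by default
    our @card_deck = (1 .. 52);                 # line 5: placeholder initialization
    sub shuffle {                               # line 6: placeholder definition
        my ($count) = @_;
        my @mixed = List::Util::shuffle(@card_deck);
        return @mixed[0 .. $count - 1];
    }
    # In a real Cards/Poker.pm the file would end with "1;" (line 7).
    # Here we mark the module as already loaded instead.
    $INC{'Cards/Poker.pm'} = __FILE__;
}

use Cards::Poker;    # finds the %INC entry, then calls Cards::Poker->import()

my @hand = shuffle(5);                          # no package prefix needed
print scalar(@hand), " cards dealt from a deck of ", scalar(@card_deck), "\n";
```

Replacing the use line with require Cards::Poker would leave shuffle and @card_deck unavailable under their short names, as the text notes.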
Packages group and organize global identifiers. They have nothing to do with privacy. Code compiled in package Church can freely examine and alter variables in package State. Package variables are always global and are used for sharing. But that's okay, because a module is more than just a package; it's also a file, and files count as their own scope. So if you want privacy, use lexical variables instead of globals. This is the topic of Recipe 12.4...
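A brief sketch of the difference, borrowing the Church and State names from the paragraph above. The variables and subs are hypothetical, and each block stands in for what would normally be a separate module file with its own scope:

```perl
use strict;
use warnings;

{   # stands in for State.pm
    package State;
    our $treasury = 100;           # package variable: reachable as $State::treasury
    my  $password = 'swordfish';   # lexical: never enters the symbol table
    sub check_password { return $_[0] eq $password }
}

{   # stands in for Church.pm
    package Church;
    sub raid { $State::treasury = 0 }   # nothing stops another package doing this
}

Church::raid();
print "treasury: $State::treasury\n";
print "password in symbol table? ", (exists $State::{password} ? "yes" : "no"), "\n";
print "check: ", (State::check_password('swordfish') ? "ok" : "denied"), "\n";
```

This should print a treasury of 0, "no" for the symbol-table check, and "ok" for the password check. The lexical is reachable only through the code that was compiled within its scope.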