Writing Apache Modules with Perl and C

Overview

Apache is the most popular web server on the Internet because it is free, reliable, and extensible. The availability of the source code and the modular design of Apache make it possible to extend web server functionality through the Apache API. For the most part, however, the Apache API has only been available to C programmers, and using it has required rebuilding the Apache server from source. mod_perl, the popular Apache module used primarily for enhanced CGI performance, changed all that by making the Apache API available to Perl programmers. With mod_perl, it becomes simple to develop Apache modules with Perl and install them without having to rebuild the web server.

Writing Apache Modules with Perl and C shows how to extend web server capabilities regardless of whether the programming language is Perl or C. The book explains the design of Apache, mod_perl, and the Apache API. It then demonstrates how to use them to perform tasks like the following:

  • Rewriting CGI scripts as Apache modules to vastly improve performance
  • Server-side filtering of HTML documents, to embed special markup or code much like SSI
  • Enhancing server log functionality
  • Converting file formats on the fly
  • Implementing dynamic navigation bars
  • Incorporating database access into CGI scripts
  • Customizing access control and authorization to block robots or to use an external database for passwords
The authors are Lincoln Stein and Doug MacEachern. Lincoln is the successful author of How to Set Up and Maintain a World Wide Web Site and the developer of the widely used Perl CGI.pm module. Doug is a consultant and the creator of the innovative mod_perl Apache module.


The Apache server is riding a popularity wave due in no small part to its rock-solid reliability and deep roots in the Open Source movement. Intended for intermediate to advanced Web application developers in NT and UNIX environments, this publication examines the Apache API and mod_perl, the fully functional Perl interpreter embedded in Apache. In short, the authors show you a number of ways to avoid the infamous performance bottleneck caused by the CGI protocol itself. You should be familiar with Web scripts, DHTML, CGI, and Perl scripts to derive the maximum benefit from this publication. Knowing the basics of Web server administration will help too, since this is essentially an advanced Perl publication. Don't worry too much about C: you only need it to wring maximum performance from a very few modules, and the C API is explained quite well.


Product Details

  • ISBN-13: 9781565925670
  • Publisher: O'Reilly Media, Incorporated
  • Publication date: 4/28/1999
  • Edition number: 1
  • Pages: 750
  • Sales rank: 1,198,413
  • Product dimensions: 7.00 (w) x 9.22 (h) x 1.22 (d)

Meet the Author

Doug MacEachern has been addicted to Perl and web servers since early 1994 when he was introduced to Plexus as a student employee at the University of Arizona. Soon after returning to his home town of Boston, Massachusetts, and entering the "real world," he discovered the Apache web server, and since early 1996, he has been gluing Perl into all its nooks and crannies. His day job has consisted of integrating various other technologies with the Web, including DCE, Kerberos, and GSSAPI, but Perl has been the only one he cannot let go of. Doug has continued as a developer disguised as a consultant since the start of 1998, spending most of his time between Auckland, New Zealand, and San Francisco, California, with time at home in Boston during the warmer months. Doug likes to spend his time away from software—far, far away, sailing on the ocean, diving below it, or simply looking at it from a warm, sandy beach where technology doesn't go much beyond thatched huts and blenders.

Lincoln Stein is an assistant investigator at Cold Spring Harbor Laboratory, where he develops databases and user interfaces for the Human Genome Project using the Apache server and its module API. He is the author of several books about programming for the Web, including The Official Guide to CGI.pm, How to Set Up and Maintain a Web Site, and Web Security: A Step-by-Step Reference Guide.


Read an Excerpt


Chapter 7: Other Request Phases

The previous chapters have taken you on a wide-ranging tour of the most popular and useful areas of the Apache API. But we're not done yet! The Apache API allows you to customize URI translation, logging, the handling of proxy transactions, and the manner in which HTTP headers are parsed. There's even a way to incorporate snippets of Perl code directly into HTML pages that use server-side includes.

We've already shown you how to customize the response, authentication, authorization, and access control phases of the Apache request cycle. Now we'll fill in the cracks. At the end of the chapter, we show you the Perl server-side include system, and demonstrate a technique for extending the Apache Perl API by subclassing the Apache request object itself.

The Child Initialization and Exit Phases

Apache provides hooks into child process initialization and exit handling. The child process initialization handler, installed with PerlChildInitHandler, is called just after the main server forks off a child but before the child has processed any incoming requests. The child exit handler, installed with PerlChildExitHandler, is called just before the child process is destroyed.

You might need to install handlers for these phases in order to perform some sort of module initialization that won't survive a fork. For example, the Apache::DBI module has a child init handler that initializes a cache of per-child database connections, and the Apache::Resource module steps in during this phase to set up resource limits on the child processes. The latter is configured in this way:

PerlChildInitHandler Apache::Resource

As with other handlers, you can install a child init handler programmatically using Apache::push_handlers(). However, because the child init phase comes so early, the only practical place to do this is from within the parent process, in a Perl startup file configured with a PerlModule or PerlRequire directive. For example, here's how to install an anonymous subroutine that will execute during child initialization to choose a truly random seed value for Perl's random number generator:

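use Math::TrulyRandom ();
Apache->push_handlers(PerlChildInitHandler => sub {
    # Reseed Perl's random number generator in each newly forked child
    srand Math::TrulyRandom::truly_random_value();
});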

Install this piece of code in the Perl startup file. By changing the value of the random number seed on a per-child basis, this code ensures that each child process produces a different sequence of random numbers when the built-in rand() function is called.
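To see why per-child reseeding matters, the following standalone sketch (plain Perl, no Apache required) forks two children from a parent that has already seeded the generator. The helper name child_rands() and the fixed seed 42 are illustrative assumptions, not part of mod_perl; the sketch only imitates what happens when Apache forks children from a parent that has loaded Perl code.

```perl
use strict;
use warnings;
use POSIX ();

# Fork two children and collect the first rand() value each one produces.
# If $reseed is true, each child calls srand() before using rand().
sub child_rands {
    my ($reseed) = @_;
    my @values;
    for (1 .. 2) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                     # child process
            close $reader;
            srand() if $reseed;              # pick a fresh seed in this child
            print {$writer} rand(), "\n";
            close $writer;                   # flushes the value to the parent
            POSIX::_exit(0);                 # leave without running END blocks
        }
        close $writer;                       # parent process
        chomp(my $value = <$reader>);
        close $reader;
        waitpid($pid, 0);
        push @values, $value;
    }
    return @values;
}

# The parent seeds the generator once, before any child is forked.
srand(42);
```

Calling child_rands(0) should return the same value twice, because each child inherits a copy of the parent's generator state. Calling child_rands(1) reseeds inside each child, so the two values will almost always differ.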

The child exit phase complements the child initialization phase. Child processes may exit for various reasons: the MaxRequestsPerChild limit may have been reached, the parent server may have been shut down, or a fatal error may have occurred. This phase gives modules a chance to tidy up after themselves before the process exits.

The most straightforward way to install a child exit handler is with the explicit PerlChildExitHandler directive, as in:

PerlChildExitHandler Apache::Guillotine

During the child exit phase, mod_perl invokes the Perl API function perl_destruct() to run the contents of END blocks and to invoke the DESTROY method for any global objects that have not already gone out of scope. Refer to the Chapter 9 section Special Global Variables, Subroutines, and Literals for details.

Note: neither the child initialization nor the child exit hook is available on Win32 platforms, because the Win32 port of Apache uses a single process.

The Post Read Request Phase

When a listening server receives an incoming request, it reads the HTTP request line and parses any HTTP headers sent along with it. Provided that what's been read is valid HTTP, Apache gives modules an early chance to step in during the post_read_request phase, known to the Perl API world as the PerlPostReadRequestHandler. This is the very first callback that Apache makes when serving an HTTP request, and it happens even before URI translation turns the requested URI into a physical pathname.

The post_read_request phase is a handy place to initialize per-request data that will be available to other handlers in the request chain. Because of its usefulness as an initialization routine, mod_perl provides the directive PerlInitHandler as a more readable alias for PerlPostReadRequestHandler.
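As a rough illustration of per-request initialization, the sketch below has an init handler record a start time in the request's notes table so that a later handler can read it back. To keep it runnable outside Apache, it uses a hypothetical MockRequest class and a locally defined OK constant; under mod_perl the handler would receive the real request object and take its constants from Apache::Constants. The handler names and the request_start key are assumptions made for the example.

```perl
use strict;
use warnings;

# Stand-in for Apache::Constants::OK so the sketch runs outside Apache.
use constant OK => 0;

# Hypothetical, simplified request object with a notes() accessor.
package MockRequest;

sub new { return bless { notes => {} }, shift }

sub notes {
    my ($self, $key, @value) = @_;
    $self->{notes}{$key} = $value[0] if @value;   # set when a value is given
    return $self->{notes}{$key};
}

package main;

# Early handler: stash per-request data for handlers that run later.
sub init_handler {
    my $r = shift;
    $r->notes(request_start => time);
    return OK;
}

# Later handler: read the value back, for example at logging time.
sub log_handler {
    my $r = shift;
    my $elapsed = time - $r->notes('request_start');
    return "request took ${elapsed}s";
}
```

With a request object created by MockRequest->new, calling init_handler() and then log_handler() on it returns a short message built from the value stored in the earlier phase.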

Since the post_read_request phase happens before URI translation, PerlPostReadRequestHandler cannot appear in <Location>, <Directory>, or <Files> sections. The PerlInitHandler directive, however, is a bit special. When it appears outside a directory section, it acts as an alias for PerlPostReadRequestHandler as just described. When it appears within a directory section, it acts as an alias for PerlHeaderParserHandler (discussed later in this chapter), allowing for per-directory initialization. In other words, wherever you put PerlInitHandler, it will act the way you expect.

Several optional Apache modules install handlers for the post_read_request phase. For example, the mod_unique_id module steps in here to create the UNIQUE_ID environment variable. When the module is activated, this variable is unique to each request over an extended period of time, and so is useful for logging and the generation of session IDs (see Chapter 5). Perl scripts can get at the value of this variable by reading $ENV{UNIQUE_ID}, or by calling $r->subprocess_env('UNIQUE_ID').

mod_setenvif also steps in during this phase to allow you to set environment variables based on the incoming client headers. For example, this directive will set the environment variable LOCAL_REFERRAL to true if the Referer header matches a certain regular expression:

SetEnvIf Referer \.acme\.com LOCAL_REFERRAL
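The same test can be approximated in a Perl handler for this phase. The sketch below is a minimal illustration, not mod_setenvif's implementation: it checks the Referer header against the same pattern and sets LOCAL_REFERRAL through subprocess_env(). So that it runs outside Apache, it defines its own DECLINED constant and a hypothetical MockRequest class whose header lookup is case-sensitive, unlike a real request object.

```perl
use strict;
use warnings;

# Stand-in for Apache::Constants::DECLINED so the sketch runs outside Apache.
use constant DECLINED => -1;

# Hypothetical, simplified request object for demonstration only.
package MockRequest;

sub new {
    my ($class, %headers) = @_;
    return bless { headers_in => {%headers}, env => {} }, $class;
}

sub header_in {
    my ($self, $name) = @_;
    return $self->{headers_in}{$name};
}

sub subprocess_env {
    my ($self, $key, @value) = @_;
    $self->{env}{$key} = $value[0] if @value;     # set when a value is given
    return $self->{env}{$key};
}

package main;

# Flag requests whose Referer header matches the pattern from the directive.
sub handler {
    my $r = shift;
    my $referer = $r->header_in('Referer');
    if (defined $referer && $referer =~ /\.acme\.com/) {
        $r->subprocess_env(LOCAL_REFERRAL => 1);
    }
    return DECLINED;    # let any other handlers for this phase run too
}
```

A request created with MockRequest->new(Referer => 'http://www.acme.com/') ends up with LOCAL_REFERRAL set, while a request from any other referrer leaves it undefined.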

mod_perl itself uses the post_read_request phase to process the PerlPassEnv and PerlSetEnv directives, allowing environment variables to be passed to modules that execute early in the request cycle. The built-in Apache equivalents, PassEnv and SetEnv, don't get processed until the fixup phase, which may be too late. The Apache::StatINC module, which watches .pm files for changes and reloads them if necessary, is also usually installed into this phase:

PerlPostReadRequestHandler Apache::StatINC
PerlInitHandler            Apache::StatINC  # same thing, but easier to type

The URI Translation Phase

One of the Web's virtues is its Uniform Resource Identifier (URI) and Uniform Resource Locator (URL) standards. End users never know for sure what is sitting behind a URI. It could be a static file, a dynamic script, a proxied request, or something even more esoteric. The file or program behind a URI may change over time, but this too is transparent to the end user.

Much of Apache's power and flexibility comes from its highly configurable URI translation phase, which comes relatively early in the request cycle, after the post_read_request phase and before the header_parser phase. During this phase, the URI requested by the remote browser is translated into a physical filename, which may in turn be returned directly to the browser as a static document, or passed on to a CGI script or Apache API module for processing. During URI translation, each module that has declared its interest in handling this phase is given a chance to modify the URI. The first module to handle the phase (i.e., to return something other than a status of DECLINED) terminates the phase. This prevents several URI translators from interfering with one another by trying to map the same URI onto several different file paths.

By default, two URI translation handlers are installed in stock Apache distributions. The mod_alias module looks for the existence of several directives that may apply to the current URI. These include Alias, ScriptAlias, Redirect, AliasMatch, and other directives. If it finds one, it uses the directive's value to map the URI to a file or directory somewhere on the server's physical file system. Otherwise, the request falls through to the http_core module (where the default response handler is also found). http_core simply appends the URI to the value of the DocumentRoot configuration directive, forming a file path relative to the document root.
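The fall-through behavior described above can be modeled in a few lines of plain Perl. This is a simplified simulation, not Apache's code: the document root, the single alias entry, the hash-based request, and the function names are all assumptions made for the example, and only a basic prefix-style Alias is imitated.

```perl
use strict;
use warnings;

# Stand-ins for the Apache status codes, so the sketch runs outside Apache.
use constant OK       => 0;
use constant DECLINED => -1;

# Assumed configuration values, for illustration only.
my $DOCUMENT_ROOT = '/usr/local/apache/htdocs';
my @ALIASES       = ( [ '/icons/' => '/usr/local/apache/icons/' ] );

# Imitates a prefix Alias: map the URI if a configured prefix matches.
sub alias_translator {
    my $req = shift;
    for my $alias (@ALIASES) {
        my ($prefix, $target) = @$alias;
        next unless index($req->{uri}, $prefix) == 0;
        $req->{filename} = $target . substr($req->{uri}, length $prefix);
        return OK;
    }
    return DECLINED;    # no alias applies; let the next translator try
}

# Imitates the default translator: append the URI to the document root.
sub core_translator {
    my $req = shift;
    $req->{filename} = $DOCUMENT_ROOT . $req->{uri};
    return OK;
}

# Run the translators in order; the first one that does not decline
# ends the phase.
sub translate {
    my $uri = shift;
    my $req = { uri => $uri, filename => undef };
    for my $handler (\&alias_translator, \&core_translator) {
        last if $handler->($req) != DECLINED;
    }
    return $req->{filename};
}
```

Here translate('/icons/back.gif') is resolved by the alias translator, while translate('/index.html') falls through to the document-root translator.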

The optional mod_rewrite module implements a much more comprehensive URI translator that allows you to slice and dice URIs in various interesting ways. It is extremely powerful, but uses a series of pattern matching conditions and substitution rules that can be difficult to get right.

Once a translation handler has done its work, Apache walks along the returned filename path in the manner described in Chapter 4, finding where the path part of the URI ends and the additional path information begins. This phase of processing is performed internally and cannot be modified by the module API.

In addition to their intended role in transforming URIs, translation handlers are sometimes used to associate certain types of URIs with specific upstream handlers. We'll see examples of this later in this chapter when we discuss creating custom proxy services.

A Very Simple Translation Handler

Let's look at an example. Many of the documents browsed on a web site are files that are located under the configured DocumentRoot. That is, the requested URI is a filename relative to a directory on the hard disk. Just so you can see how simple a translation handler's job can be, we present a Perl version of Apache's default translation handler found in the http_core module....


Table of Contents

Preface;
What You Need to Know to Get the Most out of This Book;
How This Book Is Organized;
Conventions;
The Companion Web Site to This Book;
Using FTP and CPAN;
Comments and Questions;
Acknowledgments;
Chapter 1: Server-Side Programming with Apache;
1.1 Web Programming Then and Now;
1.2 The Apache Project;
1.3 The Apache C and Perl APIs;
1.4 Ideas and Success Stories;
Chapter 2: A First Module;
2.1 Preliminaries;
2.2 Directory Layout Structure;
2.3 Installing mod_perl;
2.4 “Hello World” with the Perl API;
2.5 “Hello World” with the C API;
2.6 Instant Modules with Apache::Registry;
2.7 Troubleshooting Modules;
Chapter 3: The Apache Module Architecture and API;
3.1 How Apache Works;
3.2 The Apache Life Cycle;
3.3 The Handler API;
3.4 Perl API Classes and Data Structures;
Chapter 4: Content Handlers;
4.1 Content Handlers as File Processors;
4.2 Virtual Documents;
4.3 Redirection;
4.4 Processing Input;
4.5 Apache::Registry;
4.6 Handling Errors;
4.7 Chaining Content Handlers;
4.8 Method Handlers;
Chapter 5: Maintaining State;
5.1 Choosing the Right Technique;
5.2 Maintaining State in Hidden Fields;
5.3 Maintaining State with Cookies;
5.4 Protecting Client-Side Information;
5.5 Storing State at the Server Side;
5.6 Storing State Information in SQL Databases;
5.7 Other Server-Side Techniques;
Chapter 6: Authentication and Authorization;
6.1 Access Control, Authentication, and Authorization;
6.2 Access Control with mod_perl;
6.3 Authentication Handlers;
6.4 Authorization Handlers;
6.5 Cookie-Based Access Control;
6.6 Authentication with the Secure Sockets Layer;
Chapter 7: Other Request Phases;
7.1 The Child Initialization and Exit Phases;
7.2 The Post Read Request Phase;
7.3 The URI Translation Phase;
7.4 The Header Parser Phase;
7.5 Customizing the Type Checking Phase;
7.6 Customizing the Fixup Phase;
7.7 The Logging Phase;
7.8 Registered Cleanups;
7.9 Handling Proxy Requests;
7.10 Perl Server-Side Includes;
7.11 Subclassing the Apache Class;
Chapter 8: Customizing the Apache Configuration Process;
8.1 Simple Configuration with the PerlSetVar Directive;
8.2 The Apache Configuration Directive API;
8.3 Configuring Apache with Perl;
8.4 Documenting Configuration Files;
Chapter 9: Perl API Reference Guide;
9.1 The Apache Request Object;
9.2 Other Core Perl API Classes;
9.3 Configuration Classes;
9.4 The Apache::File Class;
9.5 Special Global Variables, Subroutines, and Literals;
Chapter 10: C API Reference Guide, Part I;
10.1 Which Header Files to Use?;
10.2 Major Data Structures;
10.3 Memory Management and Resource Pools;
10.4 The Array API;
10.5 The Table API;
10.6 Processing Requests;
10.7 Server Core Routines;
Chapter 11: C API Reference Guide, Part II;
11.1 Implementing Configuration Directives in C;
11.2 Customizing the Configuration Process;
11.3 String and URI Manipulation;
11.4 File and Directory Management;
11.5 Time and Date Functions;
11.6 Message Digest Algorithm Functions;
11.7 User and Group ID Information Routines;
11.8 Data Mutex Locking;
11.9 Launching Subprocesses;
Standard Noncore Modules;
The Apache::Registry Class;
The Apache::PerlRun Class;
The Apache::RegistryLoader Class;
The Apache::Resource Class;
The Apache::PerlSections Class;
The Apache::ReadConfig Class;
The Apache::StatINC Class;
The Apache::Include Class;
The Apache::Status Class;
Building and Installing mod_perl;
Standard Installation;
Other Configuration Methods;
Building Multifile C API Modules;
Statically Linked Modules That Need External Libraries;
Dynamically Linked Modules That Need External Libraries;
Building Modules from Several Source Files;
Apache:: Modules Available on CPAN;
Content Handling;
URI Translation;
Perl and HTML Mixing;
Authentication and Authorization;
Fixup;
Logging;
Profiling;
Persistent Database Connections;
Miscellaneous;
Third-Party C Modules;
Content Handling;
International Language;
Security;
Access Control;
Authentication and Authorization;
Logging;
Distributed Authoring;
Miscellaneous;
HTML::Embperl—Embedding Perl Code in HTML;
Dynamic Tables;
Handling Forms;
Storing Persistent Data;
Modularization of Embperl Pages;
Debugging;
Querying a Database;
Security;
An Extended Example;
Colophon;


Customer Reviews

  • Anonymous

    Posted July 21, 2000

    Excellent presentation

    Good book indeed for freshers...
