Excerpt from Chapter 1:
History and Background of Apache
Despite its dominance and importance today, the World Wide Web (WWW) is a relative newcomer to networked computing, having been developed only in the early 1990s. Even with its late start, the Web has become the service synonymous with "Internet" to millions of users worldwide. Whether you've been around the Internet since the early days (and remember Gopher and other pre-Web services) or you arrived on the scene after the Web had become the most popular service for Internet users (running neck and neck with electronic mail), you know people want fast and reliable access to the millions of Web pages out there.
While you can't guarantee reliable service on the user's end, you can make sure your own pages are served rapidly and your Web presence is stable, whether you're running a small Web server out of your dining room or you're part of an administrative team operating a server that offers thousands of pages for millions of daily hits. The secret to a stable Web presence is choosing the right Web server for your site: the Apache Server. Over 60 percent of sites on the Web use Apache or one of its derivatives to power their pages. In this chapter, you learn why Web administrators choose Apache, as well as what makes it so powerful and unique.
WHAT IS APACHE?
At its most basic, the Apache Server is a standards-compliant Web server. This means the Apache Server supports the requirements of the HTTP/1.1 standard, a document that defines the method by which files encoded in Hypertext Markup Language (HTML) are moved across computer networks.
TIP: HTTP stands for Hypertext Transfer Protocol.
The term server means Apache responds to requests from other programs, but doesn't provide documents of its own volition. That is, when you open a Web browser, such as Netscape, and type http://www.apache.org into the text box and then press ENTER, your browser contacts the server at apache.org and requests the default page for that site. The server responds to the request with the file you want to see, which the browser then formats and displays. Figure 2-1 shows the basic process.
NOTE: These standards are maintained by the World Wide Web Consortium (W3C), a nonprofit group that works to develop standards for both HTTP and HTML. In Chapter 13, "Serving Compliant HTML," you learn more about working with standards and why they're critical to administrators and their sites.
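The request-and-response exchange described earlier can be sketched in a few lines of Python. This is a minimal illustration, not Apache's implementation: the handler class, the DEFAULT_PAGE contents, and the fetch helper are hypothetical names invented for the example, and Python's standard http.server module stands in for a real Web server. The server only answers when asked, and the client plays the browser's role.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical default page; a real server would read this file from disk.
DEFAULT_PAGE = b"<html><body><h1>Default page</h1></body></html>"


class DefaultPageHandler(BaseHTTPRequestHandler):
    """Answers GET requests; it never sends a document unprompted."""

    def do_GET(self):
        if self.path == "/":
            # The client asked for the site's default page.
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(DEFAULT_PAGE)))
            self.end_headers()
            self.wfile.write(DEFAULT_PAGE)
        else:
            self.send_error(404, "Not Found")

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch(host, port, path):
    """Play the browser's role: send one GET request, return (status, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def demo():
    # Port 0 asks the operating system for any free port on this machine.
    server = HTTPServer(("127.0.0.1", 0), DefaultPageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return fetch(host, port, "/"), fetch(host, port, "/missing")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    (ok_status, ok_body), (missing_status, _) = demo()
    print(ok_status, len(ok_body), missing_status)
```

The browser's remaining job, formatting and displaying the returned HTML, is left out; the sketch stops once the file has crossed the network.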
Apache is more than a simple Web server, though. The true power behind the Apache Server lies in its modularity. The core of the server is actually quite small, serving as the central component of the program, but not providing a lot of extra functions. Those functions are added as modules, individual pieces of code that permit the server to handle a particular type of request or file in the appropriate way. Chapter 5, "Apache Modules," covers the range of available modules, while Chapter 8, "Dealing with Innovation: mod_perl, A Case Study," explains one popular module in great detail. If you plan to run Apache in any serious way, you'll find its modularity means you only need to install the functions you plan to use, without wasting machine cycles on functions you don't need.
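The idea of a small core extended by modules can be pictured with a short Python sketch. Everything here is hypothetical and chosen only for illustration: ServerCore, install, and the two sample modules are invented names and bear no relation to Apache's actual module interface. The point is simply that the core routes each request to whichever module was installed for that file type, and does nothing for types nobody installed.

```python
import os


class ServerCore:
    """A deliberately small core: it only routes requests to installed modules."""

    def __init__(self):
        self._modules = {}  # file extension -> handler function

    def install(self, extension, handler):
        """Add one module; only installed modules cost anything at run time."""
        self._modules[extension] = handler

    def handle(self, path):
        """Return (status, body) for a requested path."""
        extension = os.path.splitext(path)[1].lstrip(".")
        handler = self._modules.get(extension)
        if handler is None:
            # The core itself knows nothing about this kind of file.
            return 501, "no module installed for ." + extension
        return 200, handler(path)


def static_html_module(path):
    """Hypothetical module: pretend to return an HTML file unchanged."""
    return "contents of " + path


def uppercase_text_module(path):
    """Hypothetical module: pretend to transform a text file before serving."""
    return ("contents of " + path).upper()


def build_server():
    core = ServerCore()
    core.install("html", static_html_module)
    core.install("txt", uppercase_text_module)
    return core


if __name__ == "__main__":
    core = build_server()
    for requested in ("index.html", "notes.txt", "report.pl"):
        print(requested, core.handle(requested))
```

A request for report.pl fails in this sketch because no module handling that extension was installed, which mirrors the trade-off described above: you get only the functions you chose to add.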
DEVELOPMENT AND HISTORY OF THE APACHE PROJECT
The Apache Server is the creation of a large group of programmers and developers who work together to build and strengthen Apache and its modules, as well as to incorporate new technologies into the server. The Apache project started in 1995 as an attempt to upgrade the original HTTP daemon (httpd) developed at the National Center for Supercomputing Applications (NCSA) by Rob McCool. McCool had taken a new job in 1994 and nobody at NCSA had taken over the project, so httpd was languishing at a time when Web programming was starting to take off.
Web administrators were working on httpd on their own, and they began to share their patches and hacks with each other in an attempt to strengthen httpd without McCool's input. Soon, eight programmers announced the formation of the Apache Group, which would serve as a central node for httpd development. They took all the patches they could find and incorporated them into the httpd code, releasing the first Apache server distribution in April 1995 as version 0.6.2. Testing and writing new code occupied the Apache Group (including NCSA programmers) for the remainder of 1995, and after two more beta releases, Apache 1.0 was released in December 1995. Within a year, Apache was the most popular server being used on the Web. This popularity hasn't slowed, with Apache itself now serving 60 percent of Web sites and its derivatives adding another 3 or 4 percent to that total. Apache is currently in beta for version 2.0, with the most recent stable release being 1.3.
NOTE: This book is written using both Apache 2.0 and Apache 1.3. Since the 2.0 release is still under construction and is released only as beta software, those running Web sites that require reliability may need to stay with the current stable release (Apache 1.3) until the 2.0 version is released as stable. Significant differences from 1.3 are noted in this book, but some processes given here for 2.0 may not work on 1.3 installations.
At the end of 1999, the Apache developers took a somewhat unusual step. The server had become so popular that a more bureaucratic structure was needed to manage the project and its work. So, the Apache Software Foundation was established under United States law as a fully nonprofit organization. The foundation can receive donations, distribute funds to developers or other recipients, and manage the growth of Apache in an organized manner. Perhaps even more important, the foundation is considered a separate legal entity, apart from any people involved in the project. The foundation can enter into contracts, participate in legal action, and even sue or be sued, though one hopes that will never be necessary!
OPEN SOURCE SOFTWARE
Working with Apache without learning something about the Open Source or Free Software community is nearly impossible. Apache is often touted as one of the biggest successes to come out of this community, and the project has stayed faithful to its roots as the server has become more widely used. But what is Free Software, and why is it important?
At their most general, the terms Free Software and Open Source refer to software developed by volunteers and distributed with a license that's simultaneously restrictive and open. Free Software licenses usually require the user to contribute any changes made to the program back to the development community. They also require the full code base be distributed openly, holding nothing back as a "trade secret." Many programs released under such licenses, like Apache, are also distributed free of charge.
NOTE: Free Software doesn't always mean "no cost" software. The "free" refers to the way in which the code base, and improvements to the code base, must circulate among users and developers. People in the community use the phrase "free speech, not free beer" to indicate a difference exists between sharing without restraint and sharing without payment.
The Free Software movement is the brainchild of Richard Stallman, an MIT computer scientist who spent much of the 1970s decrying the rise of commercial software that hid its code from users and administrators. Without access to the code, Stallman knew administrators would have to rely on the software companies to fix bugs and produce upgrades. These upgrades would be generic and not always useful for a particular administrator's needs. So, Stallman began working on projects that would be released freely to the computing community and has continued to do so for the last quarter-century. He also created a foundation, called the Free Software Foundation, which helps people write Free Software and get it distributed.
Many of Stallman's programs are now considered integral parts of a Unix system, which is ironic because his project name, GNU, stands for GNU's Not Unix. Stallman wasn't the only person working on such programs, though. A robust international community of programmers, hackers, and students was building an amazing array of programs. The rise of the Internet and its growing availability to people outside the military and academic networks helped with this explosion of code. However, the catalyst for truly amazing growth came when a Finnish college student, Linus Torvalds, released the first version of a new operating system called Linux.
NOTE: You'll see Unix spelled both with the capital U and in all capital letters, as in UNIX. The latter is a registered trademark, while the former has become the general way to describe UNIX-based operating systems, which may or may not contain part of the code in the AT&T copyrighted UNIX. In this book, the Unix spelling is used.
Linux was a version of an older Unix-based operating system called Minix, but it was developed and released under a GNU-derived license. One major innovation was that Linux could run on a variety of hardware, a far cry from the days when individual computers arrived with their own unique operating systems. The wide distribution of Linux meant a large user base was available to work with new programs and to generate data that would work as independent of the hardware platform as possible. With a Free and flexible operating system now available, the community exploded . . . and business began to take note.
Unfortunately (or fortunately, depending on the side you take), Stallman's insistence on the term "Free Software" wasn't the best marketing tool. Businesses weren't comfortable with the concept of "free," thinking free code might be worth exactly what was paid for it. The programs were good and competitive, but the perception was a problem. Enter Eric Raymond, a programmer active in the Free Software community who identified this problem. In his landmark essay "The Cathedral and the Bazaar," Raymond suggested the term "open source" as a replacement. Open Source would carry the same connotations of open development and the distribution of source code, but would remove any financial or moral implications from the software's description. What term you use is up to you, but you should be aware of the shadings behind each description.
NOTE: If you're interested in learning more about this community, you can find out a lot by searching the Web and by reading the writings of both Stallman and Raymond. Raymond's book, The Cathedral and the Bazaar (O'Reilly & Associates, 2000), is a collection of his most important essays, which are also available on his Web site: http://www.tuxedo.org/~esr/writings/. You can learn more about Stallman's views by reading through the GNU site at http://www.gnu.org....