The Internet's "killer app" is not the World Wide Web or Push technologies: it is humble electronic mail. More people use email than any other Internet application. As the number of email users swells, and as email takes on an ever greater role in personal and business communication, Internet mail protocols have become not just an enabling technology for messaging, but a programming interface on top of which core applications are built. Programming Internet Email unmasks the Internet Mail System and shows how a loose federation of connected networks has combined to form the world's largest and most heavily trafficked message system. Programming Internet Email tames the Internet's most popular messaging service. For programmers building applications on top of email capabilities, and power users trying to get under the hood of their own email systems, Programming Internet Email stands out as an essential guide and reference book. In typical O'Reilly fashion, Programming Internet Email covers the topic with nineteen tightly written chapters and five useful appendixes.
Following a thorough introduction to the Internet Mail System, the book is divided into five parts:
- Part I covers email formats, from basic text messages to the guts of MIME. Secure email message formats (OpenPGP and S/MIME), mailbox formats and other commonly used formats are detailed in this reference section.
- Part II describes Internet email protocols: SMTP and ESMTP, POP3 and IMAP4. Each protocol is covered in detail to expose the Internet Mail System's inner workings.
- Part III provides a solid API reference for programmers working in Perl and Java. Class references are given for commonly used Perl modules that relate to email and the Java Mail API.
- Part IV provides clear and concise examples of how to incorporate email capabilities into your applications. Examples are given in both Perl and Java.
- Part V covers the future of email on the Internet. Means and methods for controlling spam email and newly proposed Internet mail protocols are discussed.
- Appendixes to Programming Internet Email provide a host of explanatory information and useful references for the programmer and avid user alike, including a comprehensive list of Internet RFCs relating to email, MIME types and a list of email related URLs.
Publisher: O'Reilly Media, Incorporated
Product dimensions: 7.00(w) x 9.19(h) x 0.90(d)
About the Author
David Wood architected the first large-scale RDF database, re-architected the Persistent URL service to support Linked Data, and co-founded the Callimachus Project. He is also the co-chair of the World Wide Web Consortium's RDF Working Group.
Read an Excerpt
Chapter 11: The Post Office Protocol (Version 3)
In this chapter
- Using POP
- POP Commands
- POP Sessions
The Post Office Protocol (POP) is the older and simpler of the Internet's two mechanisms for retrieving electronic mail from a remote server. The other, the Internet Message Access Protocol (IMAP), is discussed in Chapter 12, The Internet Message Access Protocol (Version 4 Revision 1). POP and IMAP are both client/server systems that implement the concept of Mail Retrieval Agents, as described in Chapter 2, Electronic Mail on the Internet.
POP is the more mature of these two mechanisms. The current version of POP is version 3 (POP3), which is an Internet standard and has been assigned a designation of STD 53. IMAP, by comparison, is still a proposed standard.
This chapter describes POP3 in detail and provides information necessary to implement either POP servers or clients. There have been several enhancements proposed to extend POP's security options and these are also discussed.
Using POP
POP3 provides a standard mechanism for retrieving electronic mail from a remote server. The messages stored on the server may be deleted (as is usually the case), or retained. This allows many millions of home Internet users to have electronic mail sent to a mail server at their Internet service provider, then retrieve it when they are online. Small offices can use POP in the same manner to pull mail for many users from a mail server at a home office or an Internet service provider. These remote mail servers are known as maildrops. In short, any user who wishes to receive e-mail on their own machine or LAN but does not have a permanent Internet connection may use POP.
A maildrop host must provide several services. It must include an SMTP MTA, providing mail services for all local users (if any) as well as for users for which it acts as a maildrop. The MTA must have access to at least one MDA for local message delivery. Chapter 2 described the relationship of MTAs and MDAs. In order for this machine to provide services as a POP maildrop, it must also include an MRA for POP services. This MRA would consist of a POP server capable of answering POP requests from authorized remote clients.
A POP server is a TCP application daemon that listens for POP requests on TCP port 110. This is called a 'well-known port'. Any POP client wishing to request service from a known host connects to this port. The server accepts the connection and typically hands it to a new POP server instance (a child process or thread), then returns to listening on the well-known port to handle other requests. The accepted connection still uses port 110 on the server side; TCP tells sessions apart by each client's address and port, so no second port is negotiated. The client begins the POP3 session communication over that connection, and the instance is destroyed when the session is complete. In this way, one server daemon can handle many nearly-simultaneous requests from multiple clients. The mundane task of creating the per-connection socket is handled automatically in most languages when a connection is accepted.
Network services under Unix can either be run standalone, or arbitrated by the inet daemon. If a program is run continuously and listens for any incoming connections itself, it is a standalone service. The act of listening for incoming connections may be left to the inet 'super-daemon', which will either start the necessary service or pass the request to a running service, thereby optimizing the use of server resources.
Like other network services, POP3 daemons may be run either standalone or via the inet service.
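As a rough illustration of a standalone daemon serving several clients, here is a minimal Python sketch. It is not taken from the book: the function names are hypothetical, only the greeting is implemented, and the demo binds to an OS-chosen port on localhost (an assumption made so it runs without privileges) rather than to port 110.

```python
import socket
import threading

GREETING = b"+OK POP server ready\r\n"


def handle_session(conn):
    """Serve one client. Only the greeting is implemented in this sketch."""
    with conn:
        conn.sendall(GREETING)


def serve(listener, sessions):
    """Accept `sessions` clients, giving each its own handler thread."""
    with listener:
        workers = []
        for _ in range(sessions):
            # accept() returns a new socket dedicated to this client;
            # the listening socket stays free for the next caller.
            conn, _peer = listener.accept()
            worker = threading.Thread(target=handle_session, args=(conn,))
            worker.start()
            workers.append(worker)
        for worker in workers:
            worker.join()


def start_server(sessions, host="127.0.0.1", port=0):
    """Start the daemon in the background; return the port it listens on.

    port=0 asks the OS for any free port (demo assumption); a real POP3
    daemon would listen on the well-known port 110.
    """
    listener = socket.create_server((host, port))
    bound_port = listener.getsockname()[1]
    threading.Thread(target=serve, args=(listener, sessions), daemon=True).start()
    return bound_port
```

Calling `start_server(sessions=2)` and opening two client connections to the returned port should yield the greeting on each connection. A production daemon would loop indefinitely and implement the POP3 commands inside `handle_session`.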
MUAs that provide support for the POP3 protocol implement a POP3 client. A POP3 client may retrieve mail by initializing a connection to a POP server on the maildrop host. Figure 11-1 shows this behavior.
Figure 11-1. POP Usage for a Dial-in Host
The situation for a LAN that is intermittently connected to the Internet is slightly more complex. Instead of servicing a single user and thereby a single MUA, the machine that retrieves remote mail must serve many users on many machines.
An internal mailhost can use the POP3 protocol to retrieve mail from a maildrop on the Internet, as shown in Figure 11-2. The Unix utility fetchmail is often used for this purpose, although there are other solutions. Fetchmail acts as a POP client, then remails each message to a local MTA, which delivers it to each user's mailbox. Of course, it is equally possible to write a utility that acts as a POP client and directly appends each message to each user's local mailbox.
Internal LAN users would then retrieve their mail in one of three ways: via an MUA on the local mailhost, via IMAP from an MUA on their own machine, or again via POP from an MUA on their own machine.
Figure 11-2. POP Usage for a Dial-in LAN
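A utility that acts as a POP client and appends messages straight to a local mailbox could look roughly like the following Python sketch. It uses the standard library's `poplib` and `mailbox` modules; the function names `drain_to_mbox` and `fetch` are hypothetical, and for simplicity every message goes to a single mbox file rather than being routed to per-user mailboxes.

```python
import mailbox
import poplib


def drain_to_mbox(pop, mbox_path, delete=True):
    """Copy every message from an authenticated POP session into one mbox file.

    `pop` is a poplib.POP3 object (or anything with stat/retr/dele).
    Returns the number of messages copied.
    """
    box = mailbox.mbox(mbox_path)  # creates the file if it does not exist
    box.lock()
    try:
        count, _total_size = pop.stat()
        for number in range(1, count + 1):
            _response, lines, _octets = pop.retr(number)
            # poplib returns the message as a list of lines without line endings.
            box.add(b"\n".join(lines) + b"\n")
            if delete:
                pop.dele(number)  # only marks the message; removal happens at QUIT
    finally:
        box.close()  # flushes pending writes and releases the lock
    return count


def fetch(host, user, password, mbox_path):
    """Log in to a maildrop, drain it into mbox_path, and end the session."""
    pop = poplib.POP3(host)
    try:
        pop.user(user)
        pop.pass_(password)
        return drain_to_mbox(pop, mbox_path)
    finally:
        pop.quit()
```

Splitting the work this way keeps the mailbox logic testable without a live server: `drain_to_mbox` only needs an object that behaves like an authenticated POP session.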
Comparing POP and IMAP
The major difference between IMAP and POP is where mail is ultimately stored (and hence, how it is accessed). With POP, mail is transferred from a maildrop server to a local machine. IMAP, on the other hand, stores all messages on the maildrop. An IMAP-capable MUA passes messages back and forth to an IMAP maildrop to manipulate and view messages, but they are stored on the maildrop.
POP is therefore much more efficient in terms of network bandwidth. Since messages are transferred from a maildrop, there will seldom be a requirement to duplicate the message transfer. If a user wishes to see a stored message again, they may simply retrieve it from local storage. An IMAP user would be forced to retrieve the message over the network, at least once per session. Also for this reason, POP MUAs can be substantially faster than their IMAP counterparts.
Security and privacy aspects are often important when dealing with e-mail messages. Many people are not comfortable with folders of private messages being stored on an Internet-connected server not under their direct control. After all, what you keep may reveal more about you than what you receive.
IMAP has its advantages, too. It was developed to support roaming users who connect from many different client machines. Storing mail on the server means that, once a user is properly authenticated, that user can get full access to all of their mail, both new and saved, regardless of the client that they happen to be using. It is a panacea for those that travel frequently and wish to maintain mail access without having to synchronize mailboxes later.
POP Commands
POP clients initiate every POP session. POP servers just wait for clients to connect. POP clients issue all of the POP commands and servers respond to those commands.
POP commands are case insensitive, although server responses are not.
When a client connects to a server, the server responds with a banner greeting. It looks like this:
Client: (Initiates socket connection)
Server: +OK POP server ready
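From the client side, the exchange above amounts to opening a TCP connection and reading one line. The following Python sketch shows one way to do that; `read_greeting` and `connect` are hypothetical names, and the 30-second timeout is an arbitrary assumption.

```python
import socket

POP3_PORT = 110  # the well-known POP3 port


def read_greeting(reader):
    """Read the banner line a POP server sends right after the connection opens.

    `reader` is a binary file object wrapped around the socket.
    """
    line = reader.readline(512)
    if not line.startswith(b"+OK"):
        raise ConnectionError("unexpected greeting: %r" % line)
    return line.rstrip(b"\r\n").decode("ascii", errors="replace")


def connect(host, port=POP3_PORT, timeout=30):
    """Open a session and return (socket, reader, greeting text)."""
    sock = socket.create_connection((host, port), timeout=timeout)
    reader = sock.makefile("rb")
    return sock, reader, read_greeting(reader)
```

The same `reader` object should be reused for all later responses in the session, since it may already have buffered bytes that follow the greeting.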
The first thing that a client must do upon connecting is to provide authentication for a particular user. That way, the server knows that it is safe to allow access to a given mailbox. This is known as the Authorization State. Commands that deal with user authentication are only valid in this state. The client can't do anything in this state except log in or quit.
User authentication commands and examples are shown in "The Authorization State."
Once a user is authenticated, the mailbox may be manipulated. This is known as the Transaction State. Commands that deal with a mailbox are only valid in this state, such as reading messages or moving them to other mailboxes. These commands and examples are shown in "The Transaction State."
Once the client has issued a QUIT command (which tells the server that the client is finished with the session), a server enters the very brief Update State if any messages have been marked for deletion. At this time, any messages that were marked for deletion in the Transaction State are actually deleted. The server does not enter this state if no messages were marked for deletion.
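The three states can be pictured as a small state machine. The Python sketch below is illustrative only: the command names are taken from the POP3 specification (RFC 1939) rather than from this excerpt, the helper names are hypothetical, and the rule for entering the Update State follows the description above.

```python
from enum import Enum, auto


class State(Enum):
    AUTHORIZATION = auto()
    TRANSACTION = auto()
    UPDATE = auto()
    CLOSED = auto()


# Command verbs valid in each state (names from RFC 1939; assumption here).
ALLOWED = {
    State.AUTHORIZATION: {"USER", "PASS", "APOP", "QUIT"},
    State.TRANSACTION: {"STAT", "LIST", "RETR", "DELE", "NOOP",
                        "RSET", "TOP", "UIDL", "QUIT"},
}


def verb_of(command_line):
    """Extract the command verb; POP commands are case insensitive."""
    return command_line.strip().split(" ", 1)[0].upper()


def is_allowed(state, command_line):
    """True if the command may be issued in the given state."""
    return verb_of(command_line) in ALLOWED.get(state, set())


def next_state(state, command_line, ok=True, marked_for_deletion=0):
    """Return the state that follows a command and its +OK/-ERR outcome."""
    verb = verb_of(command_line)
    if state is State.AUTHORIZATION:
        if verb == "QUIT":
            return State.CLOSED
        if verb in ("PASS", "APOP") and ok:
            return State.TRANSACTION  # login succeeded
        return state
    if state is State.TRANSACTION:
        if verb == "QUIT":
            # Per the text above: Update is entered only if deletions are pending.
            return State.UPDATE if marked_for_deletion else State.CLOSED
        return state
    return State.CLOSED  # Update finishes the session; Closed stays closed
```

A server could call `is_allowed` before dispatching each command and answer `-ERR` when it returns false, which enforces the rule that mailbox commands are unavailable until login succeeds.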
Every command issued by a client yields a server response. Responses may be single line or multiline, but only as appropriate for the command given. Single line responses always consist of a status indicator ('+OK' or '-ERR') denoting a positive or negative response, an optional comment and a CRLF sequence. The status indicators must be given in upper case and the total line must not exceed 512 characters including the terminating CRLF.
The comments in the single line response are completely optional! Each POP server implementation may put in its own comments, so they cannot be relied upon when parsing server responses. The literal space that separates the status indicator from the comments is part of the comments, so it too may not be present. Some POP client implementations have relied on the space or specific comments, which is not portable.
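A client that follows these rules should look only at the status indicator and treat everything after it as optional. One possible parser is sketched below in Python; `parse_status_line` is a hypothetical name, and rejecting a line such as `+OKAY` (indicator followed by something other than a space) is a design choice of this sketch.

```python
def parse_status_line(line):
    """Split a single-line POP response into (ok, comment).

    `line` is the raw bytes of one response, including the trailing CRLF.
    The comment may be empty; callers should not depend on its wording.
    """
    if len(line) > 512:
        raise ValueError("response longer than 512 characters")
    if not line.endswith(b"\r\n"):
        raise ValueError("response is not terminated by CRLF")
    body = line[:-2]
    # Status indicators are matched exactly, so lowercase "+ok" is rejected.
    for indicator, ok in ((b"+OK", True), (b"-ERR", False)):
        if body.startswith(indicator):
            rest = body[len(indicator):]
            if rest and not rest.startswith(b" "):
                raise ValueError("malformed status indicator")
            # Drop the separating space, if there is one.
            return ok, rest[1:].decode("ascii", errors="replace")
    raise ValueError("missing +OK or -ERR status indicator")
```

For example, `parse_status_line(b"+OK\r\n")` returns `(True, "")`, so a server that sends no comment at all is handled the same way as one that sends a friendly banner.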
Multiline responses are only appropriate if the command given demands a multiline response. Multiline responses consist of a single line response (status indicator, optional comment and CRLF), followed by the appropriate data in lines of no more than 512 total characters and ending with a '.' on a line by itself with a CRLF. That is, a sequence of CRLF.CRLF ends any multiline response. . . .
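Reading a multiline response therefore means reading the status line, then collecting lines until a lone '.' arrives. The Python sketch below shows the idea; `read_multiline` is a hypothetical name. It also strips one leading '.' from data lines, which reflects the byte-stuffing rule in the POP3 specification (RFC 1939) and is not described in the excerpt above.

```python
def read_multiline(reader):
    """Read one multiline POP response from a binary file object.

    Returns (status_line, data_lines), both without trailing CRLF.
    If the status is not +OK, no data lines follow and the list is empty.
    """
    status = reader.readline()
    if not status.startswith(b"+OK"):
        return status.rstrip(b"\r\n"), []
    lines = []
    while True:
        line = reader.readline()
        if not line:
            raise EOFError("connection closed before the terminating '.' line")
        if line == b".\r\n":
            break  # CRLF.CRLF reached: end of the response
        if line.startswith(b"."):
            line = line[1:]  # undo byte-stuffing (RFC 1939)
        lines.append(line.rstrip(b"\r\n"))
    return status.rstrip(b"\r\n"), lines
```

Because the function only needs `readline()`, it can be exercised with an in-memory `io.BytesIO` buffer as easily as with a socket's file object.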
Table of Contents
How This Book Is Organized;
Conventions Used in This Book;
We'd Like to Hear From You;
Chapter 1: Electronic Mail on the Internet;
1.1 Email Systems;
1.2 Internet Email Standards;
1.3 Tools of the Trade;
1.4 The Basic Internet Email System;
Chapter 2: Simple Text Messages;
2.1 Internet Text Messages;
2.2 Think Globally, Act Locally;
2.4 Mandatory Headers;
2.5 User-Defined Headers;
2.6 Address Formats;
Chapter 3: Multipurpose Internet Mail Extensions;
3.1 Mail with Attitude;
3.2 MIME Header Fields;
3.3 MIME Encoding;
3.4 MIME Boundaries;
3.5 MIME Summary;
Chapter 4: Creating MIME-Compliant Messages;
4.1 The Minimal MIME Message;
4.2 Multipart Messages;
4.3 Nested Body Parts;
4.4 A Few Interesting MIME Types;
4.5 MIME Message Creation Gotchas;
Chapter 5: OpenPGP and S/MIME;
5.1 An Extremely Brief Introduction to Security Concepts;
5.2 An Overview of OpenPGP and S/MIME;
5.3 Combining Security and MIME;
5.4 The OpenPGP Format;
5.5 The S/MIME Format;
Chapter 6: vCard;
6.1 Personal Data Interchange with vCard;
6.2 The vCard Version 3.0 Profile;
6.3 Version 3.0 Housekeeping Types;
6.4 Version 3.0 Identification Types;
6.5 The vCard Version 2.1 Profile;
6.6 Attaching vCards to Email Messages;
Chapter 7: Mailbox Formats;
7.2 Common mbox Variations;
7.3 Variation for IMAP Mailboxes;
Chapter 8: Mailcap Files;
8.1 Mailcap File Format;
8.2 Implementation Under Unix Operating Systems;
8.3 Implementation Under Other Operating Systems;
Chapter 9: The Extended Simple Mail Transfer Protocol;
9.1 Using ESMTP;
9.2 ESMTP Commands;
9.3 ESMTP Sessions;
Chapter 10: The Post Office Protocol;
10.1 Using POP;
10.2 POP Commands;
10.3 POP Sessions;
Chapter 11: The Internet Message Access Protocol;
11.1 Using IMAP;
11.2 IMAP Commands;
11.3 The Nonauthenticated State;
11.4 The Authenticated State;
11.5 The Selected State;
11.6 IMAP Sessions;
Chapter 12: The Application Configuration Access Protocol;
12.1 Using ACAP;
12.2 ACAP Datasets;
12.3 Access Control;
12.4 Example Dataset;
12.5 ACAP Commands;
12.6 The Nonauthenticated State;
12.7 The Authenticated State;
12.8 ACAP Sessions;
Chapter 13: Email-Related Perl Modules;
13.1 Finding and Installing Perl Modules;
13.2 Maturity of the Mail-Related Modules;
13.3 Email-Related Modules Quick Reference;
Chapter 14: The Java Mail API;
14.1 An Overview of the Java Mail API;
14.2 Java Mail API Reference;
14.3 The javax.mail.internet Package;
14.4 The javax.mail.search Package;
14.5 The javax.mail.event Package;
Chapter 15: Creating and Sending a Multipart Mail Message;
15.1 Designing a MIME-Capable Replacement for /bin/mail;
15.2 Creating mail.pl;
15.3 Extending and Enhancing mail.pl;
15.4 Sending MIME Email via Java;
Chapter 16: Archiving and Cleaning a Mailbox;
16.1 Scrubbing Unwanted MIME Attachments;
16.2 Creating mboxscrub.pl;
16.3 Extending and Enhancing mboxscrub.pl;
Chapter 17: Watching an IMAP Mailbox;
17.1 Designing JBiff;
17.2 Creating JBiff;
17.3 Extending JBiff;
Chapter 18: Anti-Spamming Techniques;
18.1 The UCE Problem;
18.2 Recipient Approaches;
18.3 Service Provider Approaches;
18.4 Legislative Approaches;
Chapter 19: The Future of Email;
19.1 Trends in MUAs;
19.2 Trends with Web-based Mail;
19.3 Trends Inside Firewalls;
Internet RFCs Relating to Email;
MIME Media Types;
This book is a desktop quick reference for programmers working with Internet electronic mail. It also serves as a tutorial for those interested in learning more about the internal workings of the Internet mail system. It was written primarily because I needed it. I could not find adequate information elsewhere on Internet e-mail in a reasonably compact form.
Until this publication, one had to read rather academic-sounding Internet Requests for Comments in order to learn about e-mail on the 'Net. Hopefully, it will allow people to learn about the Internet mail system without having to learn the Augmented Backus-Naur Form meta-language common in the standards. We've also included some pretty pictures ;^)
Within these pages are e-mail formats, protocols, APIs and examples. While not exactly ships and shoes and sealing wax, they should be enough to teach novices and provide a handy reference to daily coders. The formats show how e-mail messages and mailboxes are formatted in text. The protocols illustrate how e-mail services communicate to pass messages, and the APIs provide code libraries in Perl and Java so that you won't have to reinvent the wheel. The examples attempt to show how to work with e-mail to enhance your daily life or expand the capabilities of a program.