The Session Initiation Protocol (SIP) is a new signaling protocol developed to set up, modify, and tear down multimedia sessions over the Internet [1]. This chapter covers some background for the understanding of the protocol. SIP was developed by the Internet Engineering Task Force (IETF) as part of the Internet Multimedia Conferencing Architecture, and was designed to dovetail with other Internet protocols such as TCP, UDP, IP, DNS, and others. This organization and these related protocols will be briefly introduced. Related background topics such as Internet URLs, IP multicast routing, and ABNF representations of protocol messages will also be covered.
This book is about the Session Initiation Protocol, which is a signaling protocol. As the name implies, the protocol allows two end-points to establish media sessions with each other. The main functions of signaling protocols are as follows:
Tear-down of existing media sessions.
The treatment of SIP in this book will be from a telephony perspective. This is likely to be one of the first applications of SIP, but not the only one. SIP will likely be used to establish a whole set of session types that bear almost no resemblance to a telephone call. The basic protocol operation, however, will be the same. As a result, this book will use familiar telephone examples to illustrate concepts.
1.2 The Internet Engineering Task Force
SIP was developed by the Internet Engineering Task Force (IETF). To quote The Tao of the IETF [2]: "The Internet Engineering Task Force is a loosely self-organized group of people who make technical and other contributions to the engineering and evolution of the Internet and its technologies." There is no formal membership in the IETF; anyone can participate. The two document types used within the IETF are Internet-Drafts (I-Ds) and Requests for Comments (RFCs). I-Ds are the working documents of the group; anyone can author one on any topic and submit it to the IETF. Every I-D contains the following paragraph on the first page: "Internet-Drafts are documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as work in progress."
Internet standards are archived by the IETF as the Request for Comments, or RFC, series of numbered documents. A standard begins life as an I-D, then progresses to an RFC once there is consensus and there are working implementations of the protocol. As changes are made in a protocol, or new versions come out, a new RFC document with a new number is issued, which "obsoletes" the old RFC. Some I-Ds are cited in this book; I have tried, however, to restrict this to mature documents that are likely to become RFCs by the time this book is published. Anyone with Internet access can download any I-D or RFC at no charge using the World Wide Web, FTP, or e-mail. Information on how to do so is on the IETF web site: http://www.ietf.org.
The IETF is organized into working groups, each chartered to work in a particular area and to develop protocols that solve a problem in that area. Each working group has its own archive and mailing list, which is where most of the work gets done. The IETF also meets three times per year.
1.3 A Brief History of SIP
SIP was originally developed by the IETF Multi-Party Multimedia Session Control Working Group, known as MMUSIC. Version 1.0 was submitted as an Internet-Draft in 1997. Significant changes were made to the protocol and resulted in a second version, version 2.0, which was submitted as an Internet-Draft in 1998. The protocol achieved Proposed Standard status in March 1999 and was published as RFC 2543 [3] in April 1999. In September 1999, the SIP working group was established by the IETF to meet the growing interest in the protocol. An Internet-Draft containing bug fixes and clarifications to SIP, referred to as RFC 2543 "bis", was submitted in July 2000. This document will be published first as an Internet-Draft and then as an RFC with a new RFC number, which will obsolete RFC 2543. To advance from Proposed Standard to Draft Standard, a protocol must have multiple independent interworking implementations and limited operational experience. To this end, forums of interoperability tests, called "bakeoffs," have been organized by the SIP working group. Three interoperability bakeoffs took place for SIP in 1999, with more planned for 2000. The final level, Standard, is achieved after operational success has been demonstrated [4]. With the documented interoperability of the bakeoffs, SIP should move to Draft Standard status sometime in early 2001.
SIP incorporates elements of two widely used Internet protocols: HTTP (Hypertext Transfer Protocol), used for web browsing, and SMTP (Simple Mail Transfer Protocol), used for e-mail. From HTTP, SIP borrowed a client-server design and the use of uniform resource locators (URLs). From SMTP, SIP borrowed a text-encoding scheme and header style. For example, SIP reuses SMTP headers such as To, From, Date, and Subject. In keeping with its philosophy of "one problem, one protocol", the IETF designed SIP to be a pure signaling protocol. SIP uses other IETF protocols for transport, media transport, and media description. The interaction of SIP with other Internet protocols such as IP, TCP, UDP, and DNS will be described in the next section.
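The text-based header style that SIP borrows from SMTP can be illustrated with a short Python sketch. This is only an illustration of the "Name: value" convention, not a SIP or SMTP parser: it ignores folded (continued) header lines and everything else the real grammars define. The function name `parse_headers` and the names and addresses in the sample are hypothetical.

```python
def parse_headers(text):
    """Split 'Name: value' header lines into a list of (name, value) pairs.

    Simplified sketch: stops at the first blank line and does not
    handle header lines that continue onto a following line.
    """
    headers = []
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the header section
        # Split on the first colon only, so colons in the value survive.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return headers


# Hypothetical header block using the four header names mentioned in the text.
sample = (
    "To: Alice <alice@example.com>\r\n"
    "From: Bob <bob@example.org>\r\n"
    "Date: Thu, 21 Sep 2000 13:02:03 GMT\r\n"
    "Subject: Lunch\r\n"
    "\r\n"
)

for name, value in parse_headers(sample):
    print(f"{name} -> {value}")
```

Because each header is a line of readable text, a message in this style can be inspected and debugged without any special decoding tool.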
1.4 Internet Multimedia Protocol Stack
Figure 1.1 shows the four-layer Internet Multimedia Protocol stack. The layers shown and the protocols identified are discussed in the sections that follow.
1.4.1 Physical Layer
The lowest layer is the physical and link layer, which could be an Ethernet local area network (LAN), a telephone line (V.90 or 56k modem) running Point-to-Point Protocol (PPP), a digital subscriber line (DSL) running asynchronous transfer mode (ATM), or even a multi-protocol label switching (MPLS) network. This layer performs such functions as symbol exchange, frame synchronization, and physical interface specification.
1.4.2 Internet Layer
The next layer in Figure 1.1 is the Internet layer. Internet Protocol (IP) [5] is used at this layer to route a packet across the network using the destination IP address. IP is a connectionless, best-effort packet delivery protocol. IP packets can be lost, delayed, or received out of sequence. Each packet is routed on its own, using the IP header carried at the front of the packet. IP address examples in this book use the current version of IP, Version 4. IPv4 addresses are four octets long, usually written in so-called "dotted decimal" notation (for example, 207.134.3.5). Each of the four numbers separated by dots is a decimal value between 0 and 255. At the IP layer, packets are not acknowledged. A checksum is calculated to detect corruption in the IP header, which could cause a packet to become misrouted. Corruption or errors in the IP payload, however, are not detected; a higher layer must perform this function if necessary. IP uses a single-octet protocol number in the packet header to identify the transport layer protocol that should receive the packet. IP addresses used over the public Internet are assigned in blocks by the Internet Assigned Numbers Authority (IANA). As a result of this centralized assignment, IP addresses are globally unique. This enables a packet to be routed across the public Internet using only the destination IP address. Various protocols are used to route packets over an IP network, but they are outside of the scope of this book. Subnetting and other aspects of the structure of IP addresses are also not covered here. There are other excellent sources [6] that cover the entire suite of TCP/IP protocols in more detail...
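Two of the points above, dotted decimal notation and the header-only checksum, can be made concrete with a short Python sketch. It assumes the standard 20-octet IPv4 header without options and the usual Internet checksum (the ones' complement of the ones' complement sum of the header's 16-bit words). The function names, the destination address 192.0.2.10, and the field values chosen for the header are hypothetical; protocol number 17 identifies UDP.

```python
import struct


def dotted_to_octets(address):
    """Convert dotted decimal text such as '207.134.3.5' into four octets."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("an IPv4 address has exactly four numbers")
    octets = [int(part) for part in parts]
    if any(octet < 0 or octet > 255 for octet in octets):
        raise ValueError("each number must be between 0 and 255")
    return bytes(octets)


def header_checksum(header):
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(header) % 2:
        header += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:  # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


source = dotted_to_octets("207.134.3.5")   # example address from the text
destination = dotted_to_octets("192.0.2.10")  # hypothetical destination

# Minimal 20-octet header with the checksum field set to zero for now.
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45,  # version 4, header length of five 32-bit words
    0,     # type of service
    20,    # total length: header only, no payload in this sketch
    0,     # identification
    0,     # flags and fragment offset
    64,    # time to live
    17,    # protocol number (17 = UDP)
    0,     # checksum placeholder
    source,
    destination,
)

checksum = header_checksum(header)
filled = header[:10] + struct.pack("!H", checksum) + header[12:]

print(list(source))
print(hex(checksum))
print(header_checksum(filled) == 0)  # a receiver's check over an intact header
```

A receiver recomputes the sum over the header as received, including the checksum field; any result other than zero indicates a corrupted header. Nothing in this calculation covers the payload, which is why a higher layer must detect payload errors itself.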