The IETF Internet Telephony Architecture and Protocols

Henning Schulzrinne, Columbia University
Jonathan Rosenberg, Bell Laboratories-Lucent Technologies

Abstract

Internet telephony was first used as a simple way to provide point-to-point voice transport between two IP hosts. However, the growing interest in providing integrated voice, data, and video services has caused its scope to be expanded. Internet telephony now encompasses a range of services, including not only traditional conferencing, call control, multimedia, and mobility services, but also new ones that integrate Web, e-mail, presence, and instant messaging applications with telephony. Internet telephony and traditional circuit-switched telephony will coexist for quite some time, requiring interworking between the two. In this article we present a suite of protocols, developed in the IETF, which provide a partial solution to this complex problem.

Internet telephony was first used as a simple way to provide point-to-point voice transport between two IP hosts, primarily to replace expensive international phone calls. However, the growing interest in providing integrated voice, data, and video services has caused its scope to be expanded. Internet telephony now encompasses a range of services. These services include not only traditional conferencing, call control supplementary services, multimedia transport, and mobility, but also new services that integrate Web, e-mail, presence, and instant messaging applications with telephony. Furthermore, it is generally accepted that Internet telephony and traditional circuit-switched telephony will coexist for quite some time, requiring gateways between the two worlds.

Consider the following example of what an integrated IP telephony call might be like. John is sitting at his computer, and all of a sudden his machine sounds a “boing,” followed by speaking: “Audio and video call from Joe.” He accepts the call, and talks for a while.
He then decides that Alice needs to be in on the call, too. He says “Add Alice,” and the speech recognition software on his PC interprets the command. His client application consults a local Internet “white pages” directory and adds “[email protected]” to the call. The call setup request reaches Alice’s personal agent software. The agent has been instructed to “ring” her cell phone, home PC, and work PC, all in parallel. To complete the call to the cell phone, Alice has instructed her agent to use the cheapest gateways that support credit card billing. The agent finds an appropriate gateway, and rings Alice at her cell phone, home PC, and work PC, all at once. Alice picks up in her car, joining the call voice-only. During the conversation, Joe remembers that a video segment from a recent IEEE tutorial presentation would be helpful. He finds the media server with the content, and plays the media stream into the conversation. Later, John decides to leave the call. He transfers Joe and Alice to a Web page containing the additional information on the IEEE tutorial. Joe’s Web browser jumps to the page, and Alice’s phone displays a text-only version, which they continue to discuss. Joe then decides to add Bob to the call. Bob is not available, but his agent returns a Web page containing his appointments, along with a hyperlink to a voicemail service.

0890-8044/99/$10.00 © 1999 IEEE

The services contained in the call scenario above require many protocol components in order to work. In this article we examine the various protocols and discuss how they fit into the broader picture. First, a signaling protocol is needed to allow Joe to call John, and to establish a multimedia session so that they can exchange audio and video. We discuss signaling protocols in the next section. The actual audio and video are exchanged between session participants using a transport protocol called RTP, which we discuss after that.
Directory access protocols, used for example to access white pages services, are also important, but we do not discuss them further here, since they are the same as those used for e-mail service [1]. During the call, Joe brings in a media server and instructs it to play a video segment. This is accomplished using a streaming media control protocol, called the Real Time Streaming Protocol (RTSP), which we describe. The intelligent agent concept described above interacts with the signaling protocol to provide advanced services. To realize these agents, we briefly describe a call processing language. We then present the Gateway Location Protocol (GLP), which helps the agent in selecting a gateway for terminating the call from the Internet on the telephone network.

IEEE Network • May/June 1999

This tutorial does not cover the protocols that allow controllers and signaling gateways, that is, gateways connecting to the public switched telephone network (PSTN) at the signaling layer, to communicate. Proposals for such protocols are being discussed within the Internet Engineering Task Force (IETF) at this time, including Media Gateway Control Protocol (MGCP) [2] and Media Device Control Protocol (MDCP) [3]. Other protocols, such as those for billing and authentication, are also beyond the scope of this survey.

Signaling Protocols

Signaling protocols are at the heart of Internet telephony and distinguish it from other services. They play several roles [4], discussed below.

User Location — If user A wishes to communicate with user B, A first needs to find out where B is currently located on the network, so that the session establishment request (below) can reach him. This function is known as user location. Users can be in different places at different times, and even reachable by several means at the same time (by work PC or traditional phone). This function is particularly important for users whose PCs do not have a permanent IP address.
(Almost all modem connections, including asymmetric digital subscriber line, or ADSL, and cable modems, assign addresses to PCs dynamically using the Dynamic Host Configuration Protocol, DHCP [5].)

Session Establishment — The signaling protocol allows the called party to accept the call, reject it, or redirect it to another person, voicemail, or a Web page. (Generally, the terms call and session are used interchangeably in this article, although session has a somewhat wider meaning, including, for example, a group of hosts listening to an “Internet radio” multicast.)

Session Negotiation — The multimedia session being set up can comprise different media streams, including audio, video, and shared applications. Each of these media streams may use a variety of different speech and video compression algorithms, and take place on different multicast or unicast addresses and ports. The process of session negotiation allows the parties involved to settle on a set of session parameters. This process is also sometimes known as capabilities exchange.

Call Participant Management — New members can be added to a session, and existing members can leave a session.

Feature Invocation — Call features, such as hold, transfer, and mute, require communication between parties.

Several protocols exist to fill this need. One is International Telecommunication Union (ITU) Recommendation H.323 [6], which describes a set of protocols. The IETF has defined two protocols to perform many of the above tasks: the Session Initiation Protocol (SIP) [7] and Session Description Protocol (SDP) [8]. A detailed comparison of SIP/SDP and H.323 can be found in [9]. In this article we focus on the IETF protocols. Excellent articles on H.323 can be found in [10].

Session Initiation Protocol

As its name implies, SIP is used to initiate a session between users.
It provides for user location services (this is its greatest strength, in fact), call establishment, call participant management (using a SIP extension [11]), and limited feature invocation. Interestingly, SIP does not define the type of session that is established. SIP can just as easily establish an interactive gaming session as an audio/video conference. Each SIP request consists of a set of header fields that describe the call as a whole followed by a message body which describes the individual media sessions that make up the call. Currently, SDP (below) is used, but consenting parties may agree on another capability exchange protocol.

SIP is a client-server protocol, similar in both syntax and semantics to Hypertext Transfer Protocol (HTTP) [12]. Requests are generated by one entity (the client) and sent to a receiving entity (the server). The server processes the requests, and then sends a response to the client. A request and the responses which follow it are called a transaction.

The software on an end system that interacts with a human user is known as a user agent. A user agent contains two components, a user agent client (UAC) and a user agent server (UAS). The UAC is responsible for initiating calls (sending requests), and the UAS for answering calls (sending responses). A typical Internet telephony appliance or application contains both a UAS and a UAC. (Note that this differs from a Web browser, which acts only as a client.)

Within the network, there are three types of servers. A registration server receives updates on the current locations of users. A proxy server receives requests and forwards them to another server (called a next-hop server), which has more precise location information about the callee. The next-hop server might be another proxy server, a UAS, or a redirect server. A redirect server also receives requests, and determines a next-hop server.
However, instead of forwarding the request there, it returns the address of the next-hop server to the client. The primary function of proxy and redirect servers is call routing, that is, the determination of the set of servers to traverse in order to complete the call. A proxy or redirect server can use any means at its disposal to determine the next-hop server, including executing programs and consulting databases. A SIP proxy server can also fork a request, sending copies to multiple next-hop servers at once. This allows a call setup request to try many different locations at once. The first location to answer is connected with the calling party.

As in HTTP, the client requests invoke methods (commands) on the server. SIP defines several methods. INVITE invites a user to a call. BYE terminates a connection between two users in a call. OPTIONS solicits information about capabilities, but does not set up a call. ACK is used for reliable message exchanges for invitations. CANCEL terminates a search for a user. Finally, REGISTER conveys information about a user’s location to a SIP registration server.

A client sets up a call by issuing an INVITE request. This request contains header fields used to convey information about the call. The most important header fields are To and From, which contain the callee’s and caller’s address, respectively. The Subject header field identifies the subject of the call. The Call-ID header field contains a unique call identifier, and the CSeq header field contains a sequence number. The Contact header field lists addresses where a user can be contacted. It is placed in responses from a redirect server, for example. The Require header field is used for negotiation of protocol features, providing extensibility. The Content-Length and Content-Type header fields are used to convey information about the body of the message. The body contains a description of the session which is to be established. Extensions can be defined with new header fields.
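The request layout just described (an HTTP-like request line, header fields, a blank line, then a body) can be illustrated with a short sketch. This is only a rough illustration under stated assumptions: the function names `build_invite` and `parse_headers`, the example addresses, and the Call-ID value are hypothetical and not taken from the article, and a real SIP request carries further header fields (Via, for instance) that are left out here.

```python
CRLF = "\r\n"


def build_invite(caller, callee, call_id, cseq, subject, body,
                 content_type="application/sdp"):
    """Assemble a SIP INVITE request as text (illustrative sketch only)."""
    body_bytes = body.encode("utf-8")
    # Request line: method, request URI, protocol version (HTTP-like syntax).
    request_line = f"INVITE {callee} SIP/2.0"
    # Header fields named in the text; a real request carries more (e.g. Via).
    headers = [
        ("From", caller),
        ("To", callee),
        ("Call-ID", call_id),
        ("CSeq", f"{cseq} INVITE"),
        ("Subject", subject),
        ("Content-Type", content_type),
        ("Content-Length", str(len(body_bytes))),
    ]
    head = CRLF.join([request_line] + [f"{k}: {v}" for k, v in headers])
    # A blank line separates the header fields from the message body.
    return head + CRLF + CRLF + body


def parse_headers(message):
    """Split a message into (start line, header dict, body)."""
    head, _, body = message.partition(CRLF + CRLF)
    start_line, *header_lines = head.split(CRLF)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return start_line, headers, body


if __name__ == "__main__":
    # Hypothetical addresses and identifiers, chosen only for illustration.
    msg = build_invite("sip:caller@example.org", "sip:callee@example.org",
                       "1@host.example.org", 1, "Project call", "v=0\r\n")
    print(msg)
```

Keeping the header fields as an ordered list of pairs makes it easy to see how an extension could add a new field without touching the rest of the request.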
One SIP extension, used for call control [11], defines several new header fields for feature invocation (such as call transfer) and multiparty conferencing [13, 14].

■ Figure 1. SIP operation. (Diagram labels: SIP client, SIP proxy, SIP redirect server, location service, SIP client (UAS); requests and responses numbered 1 to 12.)

A typical SIP transaction is depicted in Fig. 1. A SIP user agent creates an INVITE request for sip:[email protected]. This request is forwarded to a local proxy (1). This proxy looks up company.com in the Domain Name Service (DNS), and obtains the IP address of a server handling SIP requests for this domain. It then proxies the request to this server (2). The server for company.com knows about user joe, but this user is currently logged in as [email protected]. Therefore, the server redirects the proxy (3) to try this address. The local proxy looks up university.edu in DNS, and obtains the IP address of its SIP server. The request is then proxied there (4). The university server consults a local database (5), which indicates (6) that [email protected] is known locally as [email protected]. Thus, the main university server proxies the request to the computer science server (7). This server knows the IP address where the user is currently logged in, so it proxies the request there (8). The user accepts the call, and the response is returned through the proxy chain (9), (10), (11), (12) to the caller.

Session Description Protocol

SDP is used to describe multimedia sessions, for both telephony and distribution applications like Internet radio. The protocol includes information about:

• Media streams — A multimedia session can contain many media streams; for example, two audio streams, a video stream, and a whiteboard session. SDP conveys the number and type of each media stream. It currently defines audio, video, data, control, and application as stream types, similar to MIME types used for Internet mail.
• Addresses — For each stream, the destination address (unicast or multicast) is indicated. Note that the addresses for different media streams may differ, so a user may, for example, receive audio on a low-delay Internet telephone appliance and video on a workstation.
• Ports — For each stream, the UDP port numbers for sending and/or receiving are indicated.
• Payload types — The media formats which can be used during the session are also conveyed. For unicast sessions (“traditional” IP telephony), this list is called a capability set.
• Start and stop times — For broadcast-style sessions like a television program, the start, stop, and repeat times of the session are conveyed. Thus, one can announce or invite others to a weekly TV show or a Tuesday/Thursday lecture.¹
• Originator — For broadcast-style sessions, the session description names the originator of the session and how that person can be contacted (e.g., in case of technical difficulties).

¹ SIP can be used not just to make a traditional phone call, but also to invite others to, say, a TV program, without the caller and callee talking to each other.

SDP conveys this information in a simple textual format. In fact, the acronym for SDP is a misnomer, since SDP is more of a description format. When a call is set up using SIP, the INVITE message contains an SDP body describing the session parameters acceptable to the caller. The response from the called party contains a modified version of this description, incorporating the capabilities of the called party.

Figure 2 shows an example. The v line is a version identifier for the session. The o line is a set of values which form a unique identifier for the session. The u and e lines give the URL and e-mail addresses for further information about the session. The c line indicates the address for the session, the b line indicates the bandwidth (64 kb/s in this case), and the t line the start and stop times (where 0 means that the session continues indefinitely). The k line conveys the encryption key for the session. There are three m lines, each of which identifies a media stream type (audio, video, and whiteboard application), the port number for that stream, the protocol, and a list of payload types. The a line specifies an attribute. For example, the line below the audio stream definition defines the codec parameters for RTP payload type 96.

Real Time Transport Protocol

The Real Time Transport Protocol (RTP) [15], as its name implies, supports the transport of real-time media (e.g., audio and video) over packet networks. (It is also used by H.323.) The process of transport involves taking the bitstream generated by the media encoder, breaking it into packets, sending the packets across the network, and then recovering the bitstream at the receiver. The process is complex because packets can be lost, delayed by variable amounts, and reordered in the network. The transport protocol must allow the receiver to detect these losses. It must also convey timing information so that the receiver can correctly compensate for jitter (variability in delay). To assist in this function, RTP defines the formatting of the packets sent across the network. The packets contain the media information (called the RTP payload), along with an RTP header. This header provides information to the receiver that allows it to reconstruct the media. RTP also specifies how the codec bitstreams are broken up into packets [16]. RTP was also “engineered” for multicast, which means that it works in conferencing applications and broadcast environments where multicast is used to distribute the media. It is important to note that RTP does not try to reserve resources in the network to avoid packet loss and jitter; rather, it allows the receiver to recover in the presence of loss and jitter.

RTP is a key component of any Internet telephony system. It is effectively at the heart of the application, moving the actual voice among participants. The relationship between the signaling protocol and RTP is that signaling protocols are used to establish the parameters for RTP transport.

RTP provides a number of specific functions:

• Sequencing — Each RTP packet contains a sequence number. This can be used for loss detection and compensation for reordering.
• Intramedia synchronization — Packets within the same stream may suffer different delays (jitter). Applications use playout buffers [17–19] to compensate for delay jitter. They need the timestamps provided by RTP to measure it.
• Payload identification — In the Internet, network conditions such as packet loss and delay vary, even over the course of a single call. Speech and video codecs differ in their ability to work properly under various loss conditions. Therefore, it is desirable to be able to change the encoding for the media (the “payload” of RTP) dynamically as network conditions vary. To support this, RTP contains a payload type identifier in each packet to describe the encoding of the media.
• Frame indication — Video and audio are sent in logical units called frames. It is necessary to indicate to a receiver where the beginning or end of a frame is, in order to aid in synchronized delivery to higher layers. A frame marker bit is provided for this purpose.
• Source identification — In a multicast session, many users are participating. There must be a way for a packet to contain an indicator of which participant sent it. An identifier called the synchronization source (SSRC) is provided for this purpose.

RTP also has a companion control protocol, called the RTP Control Protocol (RTCP). RTCP provides additional information to session participants.
In particular, it provides:

• QoS feedback — Receivers in a session use RTCP to report back the quality of their reception from each sender. This information includes the number of lost packets, jitter, and round-trip delays. This information can be used by senders for adaptive applications [20–22] which adjust encoding rates and other parameters based on feedback.
• Intermedia synchronization — For flexibility, audio and video are often carried in separate packet streams, but they need to be synchronized at the receiver to provide “lip sync.” The necessary information for the synchronization of sources, even if originating from different servers, is provided by RTCP.
• Identification — RTCP packets contain information such as the e-mail address, phone number, and full name of the participant. This allows session participants to learn the identities of the other participants in the session.
• Session Control — RTCP allows participants to indicate that they are leaving a session (through a BYE RTCP packet). Participants can also send small notes to each other, such as “stepping out of the office.”

RTCP mandates that all session participants (including those who send media and those who just listen) send a packet periodically which contains the information described above. These packets are sent to the same address (multicast or unicast) as the RTP media, but on a different port. The information is sent periodically since it contains time-sensitive information, such as reception quality, which becomes stale after some amount of time. However, a participant cannot just send an RTCP packet with a fixed period. Since RTP is used in multicast groups, there could be sessions (like a large lecture) with hundreds or thousands of participants. If each one were to send a packet with a fixed period, the network would become swamped with RTCP packets. To fix this, RTCP specifies an algorithm that allows the period to increase in larger groups [23].
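How the reporting period can grow with group size is sketched below. This is a simplified sketch, assuming the rule given in the RTP specification [15]: control traffic is held to a small fraction of the session bandwidth (5% is assumed here), a quarter of that budget goes to senders when they are a small minority, and the interval never falls below a minimum (5 s is assumed). The function name and parameters are illustrative, and the randomization of the interval and the timer reconsideration refinement of [23] are deliberately left out.

```python
def rtcp_interval(members, senders, session_bw_bps, avg_rtcp_size_bytes,
                  we_sent=False, min_interval_s=5.0):
    """Return a deterministic RTCP report interval in seconds (sketch)."""
    # Assumption: RTCP may use 5% of the session bandwidth (bytes per second).
    rtcp_bw = 0.05 * session_bw_bps / 8.0
    n = members
    # Assumption: when senders are a small minority, they share 25% of the
    # RTCP bandwidth and the receivers share the remaining 75%.
    if 0 < senders < 0.25 * members:
        if we_sent:
            rtcp_bw *= 0.25
            n = senders
        else:
            rtcp_bw *= 0.75
            n = members - senders
    # The interval grows linearly with the number of participants sharing
    # the budget, but never drops below the minimum.
    return max(min_interval_s, n * avg_rtcp_size_bytes / rtcp_bw)


if __name__ == "__main__":
    for group in (4, 100, 1000, 10000):
        t = rtcp_interval(group, senders=1, session_bw_bps=64000,
                          avg_rtcp_size_bytes=100)
        print(f"{group:6d} members -> report every {t:.1f} s")
```

Because the total RTCP budget is fixed, the aggregate control traffic stays roughly constant while each participant simply reports less often as the group grows.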
Real Time Streaming Protocol

The Real Time Streaming Protocol (RTSP) [24] is used to control a stored media server. A stored media server is a device capable of playing prerecorded media from disk to the network, and recording multimedia content to disk. RTSP offers controls similar to those in a VCR remote control. A client can instruct the server to play, record, fast forward, rewind, and pause. It can also configure the server with the IP addresses, UDP ports, and speech codecs to use to deliver (or record) the media. Typically, media is sent from the media server using RTP.

    v=0
    o=g.bell 87728 8772 IN IP4 132.151.1.19
    s=Come here, Watson!
    u=http://www.ietf.org
    e=[email protected]
    c=IN IP4 132.151.1.19
    b=CT:64
    t=3086272736 0
    k=clear:manhole cover
    m=audio 3456 RTP/AVP 96
    a=rtpmap:96 VDVI/8000/1
    m=video 3458 RTP/AVP 31
    m=application 32416 udp wb
    a=orient:portrait

■ Figure 2. Example of an SDP description.

Stored media has a number of applications in Internet telephony:

• A media server can record the content of a conference for future reference.
• Media servers can play media into an existing conference. For example, if participants in a multiparty conference are discussing a movie, it would be useful to be able to bring a media server into the conference, and have it play portions of the movie into the conference. (This is done in the introductory example, where Joe has the media server play the IEEE tutorial into the conference.)
• Media servers can record voicemail. RTSP clients can use the protocol to control playback of the message. This would allow users to listen to their voicemail, and rewind to a critical part (e.g., a phone number in the message). RTSP can also be used to record the incoming or outgoing voicemail message.

A client executes the following steps to cause a media server to play back content.

Obtain the Presentation Description — The first step is to obtain the presentation description.
This description enumerates the various components of the session. As an example, a classroom presentation might contain three components: a document camera, a video view of the professor, and the audio. For each component, the description defines the media parameters needed to decode the component, such as the codec type and frame rate. The presentation description can be obtained in several ways. The client can issue a DESCRIBE request to the server, which causes the server to return a description. It is also possible to obtain a description through other means, such as a Web page.

Set Up the Server — Once it has obtained the description, the client can issue a SETUP request to the server. This request establishes the destination to which the media should be delivered. The destination includes the IP address (unicast or multicast), port numbers, protocols (usually, but not necessarily, RTP over UDP), TTL, and number of multicast layers. In response to the SETUP message, the server returns a session id. This id is used in further requests.

Issue Media Requests — After the stream has been set up, the client can issue media requests to the server. These include operations such as PLAY, RECORD, and PAUSE. The PLAY method encompasses seek, fast forward, and reverse, in addition to straight play. This is accomplished by including a Scale header which indicates the time speedup (or slowdown) at which the media should be played. The Range header specifies from where in the stream playback should start.

Teardown — Once interaction with the server is complete, the client issues a TEARDOWN request. This request destroys any state associated with the session. Further requests for the given session id will no longer be valid.

Call Processing Language

The Call Processing Language (CPL) [25, 26] is a scripting language that allows end users to specify the behavior of call agents which execute on their behalf.
The call agents are invoked when a call arrives at a SIP server. These agents execute the instructions contained in the CPL. This allows an end user to specify their own call services. For example, a user can instruct the agent to connect a call by trying a cell phone, home PC, and work PC all at once.

Gateway Location Protocol

SIP allows a user on the Internet to call another user, also on the Internet. What if an Internet user wishes to call someone not on the Internet, but rather, on the telephone network? In this case, an Internet telephony gateway is needed. This device is capable of converting signaling and media between a packet network and the telephone network (PSTN). We anticipate that many gateways will be deployed in the future by Internet service providers (ISPs) and telephone companies.

To complete a call to a PSTN endpoint, an IP host must send a SIP invitation to the gateway. However, given a phone number to call, how does the caller find and select one of the many gateways to complete the call? In theory, each gateway could dial (almost) any phone number, but the caller may want to minimize the distance of the PSTN leg of the call, for example. This function is supported by the Gateway Location Protocol (GLP).

■ Figure 3. GLP operation. (Diagram labels: gateway (GW), location server (LS), user agent (UA), front-end, intradomain protocol, GLP.)

An overall architecture for GLP is shown in Fig. 3. In that architecture there are a number of Internet telephony domains in the Internet, each of which is under the control of a single authority. Each domain has some number of IP telephony gateways that provide connectivity between the Internet and the PSTN. Each domain also has some number of IP users and some number of location servers (LS). The LSs in a domain know about the gateways in their own domains, by means of an intradomain protocol, such as the Service Location Protocol (SLP) [27]. The intradomain protocol propagates information from the gateways to the location servers within a domain.
Unfortunately, it is unlikely that a single administrative domain will have access to enough gateways to complete calls to all possible telephone numbers. As a result, users in one administrative domain can make use of gateways in another administrative domain. This usually requires preestablished business relationships between domains. Once the agreements are in place, it is necessary for the LS in one domain to exchange information about its gateways with the LS in another domain. An LS can then take this information and exchange it with other LSs with which it has an established relationship. The protocol used for these exchanges is GLP [28].

GLP allows an LS to build up a database of gateways in other domains. The database contains entries with the IP address of the gateway, a range of numbers the gateway is willing to terminate, and attributes that describe the gateway. These attributes include signaling protocols, cost information, and provider identifiers, among others. An LS can use the attributes to decide which gateways to use to terminate a call to a particular number. The information can also be used to determine which gateways to further advertise to other LSs. Both of these decisions are embodied in the policy that directs the behavior of the LS.

When a client within the domain wishes to make a call to a number in the phone network, it can proceed in several ways:

• Lightweight Directory Access Protocol (LDAP) — The database of the LS is made available through LDAP [1]. The calling client queries this database with the destination phone number, and the LS returns the IP address of the gateway. The client can then send a SIP invitation to that address.
• SIP — Rather than sending a SIP invitation directly to the gateway, the caller sends it to the LS. The invitation contains the desired destination phone number. The LS consults its database, finds the right gateway, and proxies the call to it. In this case, the LS also acts as a SIP proxy.
By acting as a proxy, the LS hides the gateway selection process from the caller. The caller application does not need to know whether the address being called is a phone number or a SIP uniform resource locator (URL). In either case, the invitation is sent to the local proxy.
• Web pages — The LS can make its database available through Web pages. A user who wishes to make a call browses the Web page and finds a suitable gateway (perhaps through a Web form), and the LS returns the address of the gateway on the Web page. The user copies this address to their SIP software, and completes a call to the gateway.

GLP is just beginning the process of specification. Because it is similar to existing interdomain IP routing protocols, such as BGP4 [29], it is likely to borrow heavily from them [30, 31].

Conclusion

Providing an integrated Internet telephony service is no small task. It requires signaling protocols, transport protocols, directory protocols, service specification languages, gateway discovery protocols, and a host of other mechanisms. In this article, we have provided an overview of some of the protocols being developed to solve these problems, and have demonstrated how they work together as building blocks to provide the infrastructure for telephone services in the Internet.

It should be noted that almost all of these protocols have uses outside Internet telephony. For example, SDP and RTP are used for broadcasting streaming media, RTSP for controlling media-on-demand servers, and GLP may turn out to be useful as a wide-area service location protocol for locating media translators or Web server replicas.

Due to space constraints, we have not covered protocols and architectures that assure “telephone quality,” such as RSVP [32, 33] or differentiated services [34], even though Internet telephony may well be a primary “customer” of these resource reservation mechanisms.

References

[1] W. Yeong, T. Howes, and S. Kille, “Lightweight directory access protocol,” IETF RFC 1777, Mar. 1995.
[2] C. Huitema et al., “An architecture for internet telephony service for residential customers,” IEEE Network, this issue.
[3] P. Sijben et al., “Toward the PSTN/Internet inter-networking: media device control protocol,” Internet draft, IETF, Feb. 1999; work in progress.
[4] H. Schulzrinne and J. Rosenberg, “Internet telephony: Architecture and protocols — An IETF perspective,” Comp. Networks and ISDN Sys., vol. 31, Feb. 1999, pp. 237–55.
[5] R. Droms, “Dynamic host configuration protocol,” IETF RFC 1541, Oct. 1993.
[6] ITU-T Rec. H.323, “Visual telephone systems and equipment for local area networks which provide a non-guaranteed quality of service,” Geneva, Switzerland, May 1996.
[7] M. Handley et al., “SIP: session initiation protocol,” IETF RFC 2543, Mar. 1999.
[8] M. Handley and V. Jacobson, “SDP: session description protocol,” IETF RFC 2327, Apr. 1998.
[9] H. Schulzrinne and J. Rosenberg, “A comparison of SIP and H.323 for internet telephony,” Proc. NOSSDAV, Cambridge, U.K., July 1998.
[10] J. Toga and J. Ott, “ITU-T standardization activities for interactive multimedia communications on packet networks: H.323 and related recommendations,” Comp. Networks, vol. 31, no. 3, 1999, pp. 205–23.
[11] H. Schulzrinne and J. Rosenberg, “SIP call control services,” Internet draft, IETF, Feb. 1998; work in progress.
[12] R. Fielding et al., “Hypertext transfer protocol -- HTTP/1.1,” IETF RFC 2068, Jan. 1997.
[13] H. Schulzrinne and J. Rosenberg, “Signaling for internet telephony,” Tech. rep. CUCS-005-98, Columbia Univ., New York, NY, Feb. 1998.
[14] H. Schulzrinne and J. Rosenberg, “Signaling for internet telephony,” Int’l. Conf. Network Protocols, Austin, TX, Oct. 1998.
[15] H. Schulzrinne et al., “RTP: a transport protocol for real-time applications,” IETF RFC 1889, Jan. 1996.
[16] H. Schulzrinne, “RTP profile for audio and video conferences with minimal control,” IETF RFC 1890, Jan. 1996.
[17] R. Ramjee et al., “Adaptive playout mechanisms for packetized audio applications in wide-area networks,” Proc. IEEE INFOCOM, Toronto, Canada, Los Alamitos, CA: IEEE Computer Society Press, June 1994, pp. 680–88.
[18] W. A. Montgomery, “Techniques for packet voice synchronization,” IEEE JSAC, vol. SAC-1, Dec. 1983, pp. 1022–28.
[19] S. B. Moon, J. Kurose, and D. Towsley, “Packet audio playout delay adjustment: performance bounds and algorithms,” ACM/Springer Multimedia Sys., vol. 5, Jan. 1998, pp. 17–28.
[20] J.-C. Bolot and A. V. Garcia, “Control mechanisms for packet audio in the internet,” Proc. IEEE INFOCOM, San Francisco, CA, Mar. 1996.
[21] I. Busse, B. Deffner, and H. Schulzrinne, “Dynamic QoS control of multimedia applications based on RTP,” Comp. Commun., vol. 19, Jan. 1996, pp. 49–58.
[22] C. Perkins and O. Hodson, “Options for repair of streaming media,” IETF RFC 2354, June 1998.
[23] J. Rosenberg and H. Schulzrinne, “Timer reconsideration for enhanced RTP scalability,” Proc. IEEE INFOCOM, San Francisco, CA, Mar./Apr. 1998.
[24] H. Schulzrinne, A. Rao, and R. Lanphier, “Real time streaming protocol (RTSP),” IETF RFC 2326, Apr. 1998.
[25] J. Lennox and H. Schulzrinne, “CPL: a language for user control of internet telephony services,” Internet draft, IETF, Mar. 1999; work in progress.
[26] J. Rosenberg, J. Lennox, and H. Schulzrinne, “Programming Internet telephony services,” IEEE Network, this issue.
[27] J. Veizades et al., “Service location protocol,” IETF RFC 2165, June 1997.
[28] J. Rosenberg and H. Schulzrinne, “A framework for a gateway location protocol,” Internet draft, IETF, Feb. 1999; work in progress.
[29] Y. Rekhter and T. Li, “A border gateway protocol 4 (BGP-4),” IETF RFC 1771, Mar. 1995.
[30] M. Squire, “A gateway location protocol,” Internet draft, IETF, Feb. 1999; work in progress.
[31] D. Hampton et al., “The IP telephony border gateway protocol architecture,” Internet draft, IETF, Feb. 1999; work in progress.
[32] L. Zhang et al., “RSVP: a new resource reservation protocol,” IEEE Network, vol. 7, Sept. 1993, pp. 8–18.
[33] R. Braden et al., “Resource ReSerVation protocol (RSVP) — version 1 functional specification,” IETF RFC 2205, Sept. 1997.
[34] S. Blake et al., “An architecture for differentiated service,” IETF RFC 2475, Dec. 1998.

Biographies

HENNING SCHULZRINNE ([email protected]) received his undergraduate degree in economics and electrical engineering from the Technische Hochschule, Darmstadt, Germany, in 1984, his M.S.E.E. degree as a Fulbright scholar from the University of Cincinnati, Ohio, and his Ph.D. degree from the University of Massachusetts, Amherst, in 1987 and 1992, respectively. From 1992 to 1994 he was a member of technical staff at AT&T Bell Laboratories, Murray Hill, New Jersey. From 1994 to 1996, he was associate department head at GMD-Fokus (Berlin), before joining the Computer Science and Electrical Engineering departments at Columbia University, New York. His research interests encompass real-time multimedia network services in the Internet and modeling and performance evaluation. He is an editor of the Journal of Communications and Networks and IEEE Communications Society editor of IEEE Internet Computing. He co-chairs the IEEE Communications Society Internet Technical Committee and is vice chair of the IEEE Communications Society Technical Committee on Computer Communications. He has been vice general chair of IEEE INFOCOM and will be co-technical chair of that conference in 2000. Protocols codeveloped by him are now Internet standards, used by almost all Internet telephony and multimedia applications.
He is co-author of the Real-time Transport Protocol (RTP) for real-time Internet services, the signaling protocol for Internet multimedia conferences and telephony (SIP), and the stream control protocol for Internet media-on-demand (RTSP).

JONATHAN ROSENBERG ([email protected]) is currently a member of technical staff in the High Speed Networks Research Department, Bell Laboratories, Lucent Technologies, Holmdel, New Jersey. He conducts research on technologies related to multimedia communications on the Internet, including transport and error recovery, signaling, architectures, protocols, and service creation. He received B.S. and M.S. degrees in electrical engineering from the Massachusetts Institute of Technology in Cambridge, and is continuing his studies in the same field as a Ph.D. candidate at Columbia University in New York City. He is active in the Internet Engineering Task Force (IETF), where he chairs the IP Telephony working group.