The IETF Internet Telephony Architecture and Protocols

Henning Schulzrinne, Columbia University
Jonathan Rosenberg, Bell Laboratories-Lucent Technologies
Abstract
Internet telephony was first used as a simple way to provide point-to-point voice
transport between two IP hosts. However, the growing interest in providing integrated voice, data, and video services has caused its scope to be expanded. Internet
telephony now encompasses a range of services, including not only traditional conferencing, call control, multimedia, and mobility services, but also new ones that
integrate Web, e-mail, presence, and instant messaging applications with telephony. Internet telephony and traditional circuit-switched telephony will coexist for
quite some time, requiring interworking between the two. In this article we present
a suite of protocols, developed in the IETF, which provide a partial solution to this
complex problem.
Internet telephony was first used as a simple way to provide point-to-point voice transport between two IP hosts,
primarily to replace expensive international phone calls.
However, the growing interest in providing integrated
voice, data, and video services has caused its scope to be
expanded. Internet telephony now encompasses a range of
services. These services include not only traditional conferencing, call control supplementary services, multimedia
transport, and mobility, but also new services that integrate
Web, e-mail, presence, and instant messaging applications
with telephony. Furthermore, it is generally accepted that
Internet telephony and traditional circuit-switched telephony
will coexist for quite some time, requiring gateways between
the two worlds.
Consider the following example of what an integrated IP
telephony call might be like. John is sitting at his computer,
and all of a sudden his machine sounds a “boing,” followed
by speaking: “Audio and video call from Joe.” He accepts the
call, and talks for a while. He then decides that Alice needs
to be in on the call, too. He says “Add Alice,” and the speech
recognition software on his PC interprets the command. His
client application consults a local Internet “white pages”
directory and adds “[email protected]” to the call. The call
setup request reaches Alice’s personal agent software. The
agent has been instructed to “ring” her cell phone, home PC,
and work PC, all in parallel. To complete the call to the cell
phone, Alice has instructed her agent to use the cheapest
gateways that support credit card billing. The agent finds an
appropriate gateway, and rings Alice at her cell phone, home
PC, and work PC, all at once. Alice picks up in her car, joining the call voice-only. During the conversation, Joe remembers that a video segment from a recent IEEE tutorial
0890-8044/99/$10.00 © 1999 IEEE
presentation would be helpful. He finds the media server
with the content, and plays the media stream into the conversation. Later, John decides to leave the call. He transfers Joe
and Alice to a Web page containing the additional information on the IEEE tutorial. Joe’s Web browser jumps to the
page, and Alice’s phone displays a text-only version, which
they continue to discuss. Joe then decides to add Bob to the
call. Bob is not available, but his agent returns a Web page
containing his appointments, along with a hyperlink to a
voicemail service.
The services contained in the call scenario above require
many protocol components in order to work. In this article we
examine the various protocols and discuss how they fit into
the broader picture. First, a signaling protocol is needed to
allow Joe to call John, and to establish a multimedia session
so that they can exchange audio and video. We discuss signaling protocols in the next section. The actual audio and video
are exchanged between session participants using a transport
protocol called RTP, which we discuss after that. Directory
access protocols, used for example to access white pages services,
are also important, but we do not discuss them further here, since they are the same as those used for e-mail service
[1]. During the call, Joe brings in a media server and
instructs it to play a video segment. This is accomplished
using a streaming media control protocol, called the Real
Time Streaming Protocol (RTSP), which we describe. The
intelligent agent concept described above interacts with the
signaling protocol to provide advanced services. To realize
these agents, we briefly describe a call processing language.
We then present the Gateway Location Protocol (GLP),
which helps the agent in selecting a gateway for terminating
the call from the Internet on the telephone network.
IEEE Network • May/June 1999
This tutorial does not cover the protocols that allow controllers and signaling gateways — gateways connecting to
the public switched telephone network (PSTN) at the signaling layer — to communicate. Proposals for such protocols are being discussed within the Internet Engineering
Task Force (IETF) at this time, including Media Gateway
Control Protocol (MGCP) [2] and Media Device Control Protocol (MDCP) [3]. Other protocols, such as those
for billing and authentication, are also beyond the scope of
this survey.
Signaling Protocols
Signaling protocols are at the heart of Internet telephony and
distinguish it from other services. They play several roles [4],
discussed below.
User Location — If user A wishes to communicate with user B,
A first needs to find out where B is currently located on the
network, so that the session establishment request (below) can
reach him. This function is known as user location. Users can
be in different places at different times, and even reachable by
several means at the same time (by work PC or traditional
phone). This function is particularly important for users
whose PCs do not have a permanent IP address. (Almost all
modem connections, including asymmetric digital subscriber
line (ADSL) and cable modems, assign addresses to PCs
dynamically using the Dynamic Host Configuration Protocol,
DHCP [5].)
Session Establishment — The signaling protocol allows the
called party to accept the call, reject it, or redirect it to another person, voicemail, or a Web page. (Generally, the terms
call and session are used interchangeably in this article,
although session has a somewhat wider meaning, including,
for example, a group of hosts listening to an “Internet radio”
multicast.)
Session Negotiation — The multimedia session being set up
can comprise different media streams, including audio, video,
and shared applications. Each of these media streams may
use a variety of different speech and video compression algorithms, and take place on different multicast or unicast
addresses and ports. The process of session negotiation
allows the parties involved to settle on a set of session
parameters. This process is also sometimes known as capabilities exchange.
Call Participant Management — New members can be added
to a session, and existing members can leave a session.
Feature Invocation — Call features, such as hold, transfer, and
mute, require communication between parties.
Several protocols exist to fill this need. One is International
Telecommunication Union (ITU) Recommendation H.323
[6], which describes a set of protocols. The IETF has defined
two protocols to perform many of the above tasks: the Session
Initiation Protocol (SIP) [7] and Session Description Protocol
(SDP) [8]. A detailed comparison of SIP/SDP and H.323 can
be found in [9]. In this article we focus on the IETF protocols.
Excellent articles on H.323 can be found in [10].
Session Initiation Protocol
As its name implies, SIP is used to initiate a session between
users. It provides for user location services (this is its greatest
strength, in fact), call establishment, call participant management (using a SIP extension [11]), and limited feature invocation.
Interestingly, SIP does not define the type of session that
is established. SIP can just as easily establish an interactive
gaming session as an audio/video conference.
Each SIP request consists of a set of header fields that
describe the call as a whole followed by a message body which
describes the individual media sessions that make up the call.
Currently, SDP (below) is used, but consenting parties may
agree on another capability exchange protocol.
SIP is a client-server protocol, similar in both syntax and
semantics to Hypertext Transfer Protocol (HTTP) [12].
Requests are generated by one entity (the client) and sent to
a receiving entity (the server). The server processes the
requests, and then sends a response to the client. A request
and the responses which follow it are called a transaction. The
software on an end system that interacts with a human user is
known as a user agent. A user agent contains two components,
a user agent client (UAC) and a user agent server (UAS).
The UAC is responsible for initiating calls (sending requests),
and the UAS for answering calls (sending responses). A typical Internet telephony appliance or application contains both
a UAS and a UAC. (Note that this differs from a Web browser, which acts only as a client.)
Within the network, there are three types of servers. A
registration server receives updates on the current locations of
users. A proxy server receives requests and forwards them to
another server (called a next-hop server), which has more
precise location information about the callee. The next-hop
server might be another proxy server, a UAS, or a redirect
server. A redirect server also receives requests, and determines a next-hop server. However, instead of forwarding the
request there, it returns the address of the next-hop server
to the client. The primary function of proxy and redirect
servers is call routing — the determination of the set of
servers to traverse in order to complete the call. A proxy or
redirect server can use any means at its disposal to determine the next-hop server, including executing programs and
consulting databases. A SIP proxy server can also fork a
request, sending copies to multiple next-hop servers at once.
This allows a call setup request to try many different locations at once. The first location to answer is connected with
the calling party.
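The routing roles described above can be sketched in a few lines. This is an illustration only, assuming hypothetical in-memory tables instead of real servers; the class names, the `resolve` and `handle` functions, and all addresses are invented for the example, and forked branches are tried one after another rather than truly in parallel.

```python
from dataclasses import dataclass


@dataclass
class UserAgentServer:
    """End system: either answers the call or does not."""
    answers: bool


@dataclass
class Redirect:
    """Redirect server: returns the next-hop address to whoever asked."""
    next_hop: str


@dataclass
class Proxy:
    """Proxy server: forwards the request itself, forking if it has several next hops."""
    next_hops: list


def handle(servers, address, hops):
    """Process one request at `address`; returns ("answer" | "redirect" | "fail", value)."""
    server = servers.get(address)
    if server is None:
        return ("fail", None)
    if isinstance(server, UserAgentServer):
        return ("answer", address) if server.answers else ("fail", None)
    if isinstance(server, Redirect):
        return ("redirect", server.next_hop)
    # Proxy: try every next hop; the first location to answer wins.
    for hop in server.next_hops:
        answered = resolve(servers, hop, hops)
        if answered is not None:
            return ("answer", answered)
    return ("fail", None)


def resolve(servers, address, hops=8):
    """Send a request to `address`, following redirects, with a hop limit against loops."""
    while hops > 0:
        hops -= 1
        kind, value = handle(servers, address, hops)
        if kind != "redirect":
            return value  # answering address, or None on failure
        address = value
    return None


# Hypothetical topology: local proxy -> redirect server -> forking proxy -> three devices.
servers = {
    "proxy.local": Proxy(["redirect.example"]),
    "redirect.example": Redirect("proxy.home"),
    "proxy.home": Proxy(["cell", "home-pc", "work-pc"]),
    "cell": UserAgentServer(answers=False),
    "home-pc": UserAgentServer(answers=True),
    "work-pc": UserAgentServer(answers=True),
}

answered_by = resolve(servers, "proxy.local")
print(answered_by)
```

The difference between the two server types shows up in who sends the next request: after a redirect, the requester (here the local proxy) retries at the returned address, whereas a proxy forwards on the requester's behalf.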
As in HTTP, the client requests invoke methods (commands) on the server. SIP defines several methods. INVITE
invites a user to a call. BYE terminates a connection between
two users in a call. OPTIONS solicits information about capabilities, but does not set up a call. ACK is used for reliable
message exchanges for invitations. CANCEL terminates a
search for a user. Finally, REGISTER conveys information
about a user’s location to a SIP registration server.
A client sets up a call by issuing an INVITE request. This
request contains header fields used to convey information
about the call. The most important header fields are To and
From, which contain the callee’s and caller’s address, respectively. The Subject header field identifies the subject of the
call. The Call-ID header field contains a unique call identifier, and the CSeq header field contains a sequence number.
The Contact header field lists addresses where a user can be
contacted. It is placed in responses from a redirect server, for
example. The Require header field is used for negotiation of
protocol features, providing extensibility. The Content-Length and Content-Type header fields are used to convey
information about the body of the message. The body contains a description of the session which is to be established.
Extensions can be defined with new header fields. One
such extension, used for call control [11], defines several new
headers used for feature invocation (such as call transfer) and
multiparty conferencing [13, 14].
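To make the request layout concrete, the following sketch assembles and re-parses an INVITE using the header fields named above. It is a simplified illustration, not a conforming SIP implementation: the helper names `build_request` and `parse_request`, the addresses, and the Call-ID value are hypothetical, and a complete implementation needs further header fields and error handling.

```python
def build_request(method, uri, headers, body=""):
    """Serialize a request line, header fields, a blank line, and the body."""
    body_bytes = body.encode("utf-8")
    lines = [f"{method} {uri} SIP/2.0"]
    for name, value in headers.items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append("Content-Type: application/sdp")
    lines.append(f"Content-Length: {len(body_bytes)}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8") + body_bytes


def parse_request(message):
    """Split a serialized request back into method, URI, header dict, and body."""
    head, _, body = message.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("utf-8").split("\r\n")
    method, uri, _version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, uri, headers, body.decode("utf-8")


# Hypothetical call: John invites Alice; the body is a one-line SDP stub.
invite = build_request(
    "INVITE",
    "sip:alice@example.com",
    {
        "To": "sip:alice@example.com",
        "From": "sip:john@example.org",
        "Subject": "Add Alice",
        "Call-ID": "12345@host.example.org",
        "CSeq": "1 INVITE",
    },
    body="v=0\r\n",
)

method, uri, headers, body = parse_request(invite)
print(invite.decode("utf-8"))
```

As in HTTP, the header section is plain text terminated by an empty line, which is why the same parsing approach works for both protocols.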
■ Figure 1. SIP operation. (Diagram labels: SIP client, two SIP proxies, SIP redirect server, location service, and SIP client (UAS); request and response messages are numbered 1 through 12.)
A typical SIP transaction is depicted in Fig. 1. A SIP user
agent creates an INVITE request for sip:[email protected]
This request is forwarded to a local proxy (1). This proxy
looks up company.com in the Domain Name System (DNS),
and obtains the IP address of a server handling SIP requests
for this domain. It then proxies the request to this server (2).
The server for company.com knows about user joe, but this
user is currently logged in as [email protected]
Therefore, the server redirects the proxy (3) to try this address.
The local proxy looks up university.edu in DNS, and
obtains the IP address of its SIP server. The request is then
proxied there (4). The university server consults a local database
(5), which indicates (6) that [email protected] is
known locally as [email protected] Thus, the
main university server proxies the request to the computer science server (7). This server knows the IP address where the
user is currently logged in, so it proxies the request there (8).
The user accepts the call, and the response is returned
through the proxy chain (9), (10), (11), (12) to the caller.
Session Description Protocol
SDP is used to describe multimedia sessions, for both telephony
and distribution applications like Internet radio. The protocol
includes information about:
• Media streams — A multimedia session can contain many
media streams; for example, two audio streams, a video
stream, and a whiteboard session. SDP conveys the number
and type of each media stream. It currently defines audio,
video, data, control, and application as stream types, similar
to MIME types used for Internet mail.
• Addresses — For each stream, the destination address (unicast
or multicast) is indicated. Note that the addresses for
different media streams may differ, so a user may, for
example, receive audio on a low-delay Internet telephone
appliance and video on a workstation.
• Ports — For each stream, the UDP port numbers for sending
and/or receiving are indicated.
• Payload types — The media formats which can be used during
the session are also conveyed. For unicast sessions (“traditional”
IP telephony), this list is called a capability set.
• Start and stop times — For broadcast-style sessions like a
television program, the start, stop, and repeat times of the
session are conveyed. Thus, one can announce or invite others to a
weekly TV show or a Tuesday/Thursday lecture.¹
• Originator — For broadcast-style sessions, the
session description names the originator of the
session and how that person can be contacted
(e.g., in case of technical difficulties).
SDP conveys this information in a simple textual
format. In fact, the acronym for SDP is a
misnomer, since SDP is more of a description
format. When a call is set up using SIP, the
INVITE message contains an SDP body describing
the session parameters acceptable to the
caller. The response from the called party contains
a modified version of this description,
incorporating the capabilities of the called party.
Figure 2 shows an example.
The v line is a version identifier for the session. The o line is a set of values which form a
unique identifier for the session. The u and e lines give the
URL and e-mail addresses for further information about the
session. The c line indicates the address for the session, the b
line indicates the bandwidth (64 kb/s in this case), and the t
line the start and stop times (where 0 means that the session
continues indefinitely). The k line conveys the encryption key
for the session. There are three m lines, each of which identifies
a media stream type (audio, video, and whiteboard application),
the port number for that stream, the protocol, and a
list of payload types. The a line specifies an attribute. For
example, the line below the audio stream definition defines
the codec parameters for RTP payload type 96.
¹ SIP can be used not just to make a traditional phone call, but also to
invite others to, say, a TV program, without the caller and callee talking to
each other.
Real Time Transport Protocol
The Real Time Transport Protocol (RTP) [15], as its name
implies, supports the transport of real-time media (e.g., audio
and video) over packet networks. (It is also used by H.323.)
The process of transport involves taking the bitstream generated by the media encoder, breaking it into packets, sending
the packets across the network, and then recovering the bitstream at the receiver. The process is complex because packets
can be lost, delayed by variable amounts, and reordered in
the network. The transport protocol must allow the receiver
to detect these losses. It must also convey timing information
so that the receiver can correctly compensate for jitter (variability
in delay). To assist in this function, RTP defines the
formatting of the packets sent across the network. The packets
contain the media information (called the RTP payload),
along with an RTP header. This header provides information
to the receiver that allows it to reconstruct the media. RTP
also specifies how the codec bitstreams are broken up into
packets [16]. RTP was also “engineered” for multicast, which
means that it works in conferencing applications and broadcast
environments where multicast is used to distribute the
media. It is important to note that RTP does not try to
reserve resources in the network to avoid packet loss and jitter;
rather, it allows the receiver to recover in the presence of
loss and jitter.
RTP is a key component of any Internet telephony system.
It is effectively at the heart of the application, moving
the actual voice among participants. The relationship between
the signaling protocol and RTP is that signaling protocols are
used to establish the parameters for RTP transport.
RTP provides a number of specific functions:
• Sequencing — Each RTP packet contains a sequence number. This can be used for loss detection and compensation
for reordering.
• Intramedia synchronization — Packets within the same
stream may suffer different delays (jitter). Applications use
playout buffers [17–19] to compensate for delay jitter. They
need the timestamps provided by RTP to measure it.
• Payload identification — In the Internet, network conditions
such as packet loss and delay vary, even during the duration
of a single call. Speech and video codecs differ in their ability
to work properly under various loss conditions. Therefore, it
is desirable to be able to change the encoding for the media
(the “payload” of RTP) dynamically as network conditions
vary. To support this, RTP contains a payload type identifier
in each packet to describe the encoding of the media.
• Frame indication — Video and audio are sent in logical
units called frames. It is necessary to indicate to a receiver
where the beginning or end of a frame is, in order to aid in
synchronized delivery to higher layers. A frame marker bit
is provided for this purpose.
• Source identification — In a multicast session, many users are
participating. There must be a way for a packet to contain an
indicator of which participant sent it. An identifier called the
synchronization source (SSRC) is provided for this purpose.
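The functions listed above map onto fields of the 12-byte fixed RTP header defined in [15]. The sketch below packs and parses that header and uses the sequence numbers for a simple loss check. It is a minimal illustration: the function names and sample values are hypothetical, and padding, header extensions, contributing-source lists, and sequence number wraparound are ignored.

```python
import struct
from collections import namedtuple

RtpHeader = namedtuple(
    "RtpHeader", "version marker payload_type sequence timestamp ssrc"
)


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build the fixed header: version 2, no padding, no extension, no CSRCs."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII", byte0, byte1, sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF,
    )


def parse_rtp_header(packet):
    """Extract the fields a receiver needs from the first 12 bytes of a packet."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(byte0 >> 6, bool(byte1 >> 7), byte1 & 0x7F,
                     sequence, timestamp, ssrc)


def missing_sequence_numbers(received, first, last):
    """Loss detection: sequence numbers in [first, last] that never arrived."""
    seen = set(received)
    return [s for s in range(first, last + 1) if s not in seen]


# Hypothetical packet: dynamic payload type 96, marker bit set.
header = pack_rtp_header(payload_type=96, sequence=7, timestamp=160,
                         ssrc=0x1234, marker=True)
parsed = parse_rtp_header(header + b"media bytes")
lost = missing_sequence_numbers([1, 2, 4, 5, 7], first=1, last=7)
print(parsed, lost)
```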
RTP also has a companion control protocol, called the RTP
Control Protocol (RTCP). RTCP provides additional
information to session participants. In particular, it provides:
• QoS feedback — Receivers in a session use RTCP to report
back the quality of their reception from each sender. This
information includes the number of lost packets, jitter, and
round-trip delays. This information can be used by senders
for adaptive applications [20–22] which adjust encoding
rates and other parameters based on feedback.
• Intermedia synchronization — For flexibility, audio and video
are often carried in separate packet streams, but they need to
be synchronized at the receiver to provide “lip sync.” The necessary information for the synchronization of sources, even if
originating from different servers, is provided by RTCP.
• Identification — RTCP packets contain information such as
the e-mail address, phone number, and full name of the
participant. This allows session participants to learn the
identities of the other participants in the session.
• Session Control — RTCP allows participants to indicate that
they are leaving a session (through a BYE RTCP packet).
Participants can also send small notes to each other, such
as “stepping out of the office.”
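As an example of the QoS feedback above, the jitter value reported by receivers is defined in the RTP specification [15] as a running estimate: for consecutive packets, the change D in transit time (arrival time minus RTP timestamp) is computed, and the estimate is updated as J = J + (|D| - J)/16. The sketch below implements that update; the function name and sample values are illustrative, and both times are assumed to be expressed in the same timestamp units.

```python
def interarrival_jitter(packets):
    """Running jitter estimate over (rtp_timestamp, arrival_time) pairs."""
    jitter = 0.0
    previous_transit = None
    for sent, received in packets:
        transit = received - sent
        if previous_transit is not None:
            change = abs(transit - previous_transit)
            # Move the estimate 1/16 of the way toward the latest deviation.
            jitter += (change - jitter) / 16.0
        previous_transit = transit
    return jitter


# Constant network delay gives zero jitter; one late packet raises the estimate.
steady = interarrival_jitter([(0, 10), (160, 170), (320, 330)])
one_late = interarrival_jitter([(0, 10), (160, 186)])
print(steady, one_late)
```

The 1/16 gain smooths out individual outliers, so a single delayed packet changes the reported value only slightly.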
RTCP mandates that all session participants (including those
who send media and those who just listen) send a packet periodically which contains the information described above. These
packets are sent to the same address (multicast or unicast) as
the RTP media, but on a different port. The information is sent
periodically since it contains time-sensitive information, such as
reception quality, which becomes stale after some amount of
time. However, a participant cannot just send an RTCP packet with a fixed period. Since RTP is used in multicast groups,
there could be sessions (like a large lecture) with hundreds or
thousands of participants. If each one were to send a packet
with a fixed period, the network would become swamped with
RTCP packets. To fix this, RTCP specifies an algorithm that
allows the period to increase in larger groups [23].
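The scaling idea can be sketched as follows. This is a simplified version of the calculation in the RTP specification [15], which limits RTCP traffic to about 5 percent of the session bandwidth and uses a minimum interval of 5 s; the function name and the assumed average packet size are hypothetical, and the real algorithm also randomizes the interval and treats senders and receivers separately.

```python
def rtcp_interval(members, session_bandwidth_bps, avg_rtcp_packet_bytes=100,
                  rtcp_percent=5, minimum_interval_s=5.0):
    """Seconds between RTCP packets from one participant (simplified)."""
    rtcp_bandwidth_bps = session_bandwidth_bps * rtcp_percent / 100
    # Share the RTCP bandwidth equally among all members of the session.
    interval = members * avg_rtcp_packet_bytes * 8 / rtcp_bandwidth_bps
    return max(minimum_interval_s, interval)


small_group = rtcp_interval(members=2, session_bandwidth_bps=64000)
large_lecture = rtcp_interval(members=1000, session_bandwidth_bps=64000)
print(small_group, large_lecture)
```

With these assumed values, a two-party call reports at the minimum interval, while each participant in a 1000-member lecture reports only about every four minutes, so the aggregate RTCP rate stays at the configured share of the session bandwidth.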
Real Time Streaming Protocol
The Real Time Streaming Protocol (RTSP) [24] is used to
control a stored media server. A stored media server is a
device capable of playing prerecorded media from disk to the
network, and recording multimedia content to disk. RTSP
offers controls similar to those in a VCR remote control. A
client can instruct the server to play, record, fast forward,
rewind, and pause. It can also configure the server with the IP
addresses, UDP ports, and speech codecs to use to deliver (or
record) the media. Typically, media is sent from the media
server using RTP.
v=0
o=g.bell 87728 8772 IN IP4 132.151.1.19
s=Come here, Watson!
u=http://www.ietf.org
e=[email protected]
c=IN IP4 132.151.1.19
b=CT:64
t=3086272736 0
k=clear:manhole cover
m=audio 3456 RTP/AVP 96
a=rtpmap:96 VDVI/8000/1
m=video 3458 RTP/AVP 31
m=application 32416 udp wb
a=orient:portrait
■ Figure 2. Example of an SDP description.
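Because SDP is a line-oriented textual format, the description in Fig. 2 can be read with a very small parser. The following is a rough sketch rather than a complete SDP parser: `parse_sdp` and the dictionary layout are illustrative assumptions, and the u, e, and k lines of the figure are left out of the sample text for brevity.

```python
SDP_TEXT = """\
v=0
o=g.bell 87728 8772 IN IP4 132.151.1.19
s=Come here, Watson!
c=IN IP4 132.151.1.19
b=CT:64
t=3086272736 0
m=audio 3456 RTP/AVP 96
a=rtpmap:96 VDVI/8000/1
m=video 3458 RTP/AVP 31
m=application 32416 udp wb
a=orient:portrait
"""


def parse_sdp(text):
    """Split an SDP description into session-level fields and media streams."""
    session = {"media": []}
    current = session  # lines before the first m= line describe the whole session
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("=")
        if key == "m":
            kind, port, protocol, *formats = value.split()
            current = {"type": kind, "port": int(port), "protocol": protocol,
                       "formats": formats, "attributes": []}
            session["media"].append(current)
        elif key == "a":
            # Attributes attach to the most recent m= line, if there is one.
            current.setdefault("attributes", []).append(value)
        else:
            current[key] = value
    return session


session = parse_sdp(SDP_TEXT)
for stream in session["media"]:
    print(stream["type"], stream["port"], stream["formats"], stream["attributes"])
```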
Stored media has a number of applications in Internet telephony:
• A media server can record the content of a conference for
future reference.
• Media servers can play media into an existing conference.
For example, if participants in a multiparty conference are
discussing a movie, it would be useful to be able to bring a
media server into the conference, and have it play portions
of the movie into the conference. (This is done in the introductory example, where Joe has the media server play the
IEEE tutorial into the conference.)
• Media servers can record voicemail. RTSP clients can use the
protocol to control playback of the message. This would allow
users to listen to their voicemail, and rewind to a critical part
(e.g., a phone number in the message). RTSP can also be used
to record the incoming or outgoing voicemail message.
A client executes the following steps to cause a media server to play back content.
Obtain the Presentation Description — The first step is to
obtain the presentation description. This description enumerates the various components of the session. As an example, a
classroom presentation might contain three components: a
document camera, a video view of the professor, and the
audio. For each component, the description defines the media
parameters needed to decode the component, such as the
codec type and frame rate. The presentation description can
be obtained in several ways. The client can issue a DESCRIBE
request to the server, which causes the server to return a
description. It is also possible to obtain a description through
other means, such as a Web page.
Set Up the Server — Once it has obtained the description, the
client can issue a SETUP request to the server. This request
establishes the destination to which the media should be delivered. The destination includes the IP address (unicast or multicast), port numbers, protocols (usually, but not necessarily,
RTP over UDP), TTL, and number of multicast layers. In
response to the SETUP message, the server returns a session
id. This id is used in further requests.
Issue Media Requests — After the stream has been set up,
the client can issue media requests to the server. These
include operations such as PLAY, RECORD, and PAUSE. The
PLAY method encompasses seek, fast forward, and reverse,
in addition to straight play. This is accomplished by including
a Scale header which indicates the time speedup (or
slowdown) at which the media should be played. The
Range header specifies from where in the stream playback
should start.
Teardown — Once interaction with the server is complete, the
client issues a TEARDOWN request. This request destroys
any state associated with the session. Further requests for the
given session id will no longer be valid.
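The four steps above can be written out as the requests a client would send. RTSP uses the same textual request layout as HTTP; the sketch below only formats the messages and does not open a connection. The server URL, port numbers, and session id are hypothetical (in practice the session id comes from the server's SETUP response), and `rtsp_request` is an illustrative helper.

```python
def rtsp_request(method, url, cseq, headers=None):
    """Format one RTSP request: request line, CSeq, other headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


URL = "rtsp://media.example.com/tutorial"  # hypothetical presentation
SESSION_ID = "12345678"                    # placeholder for the id returned by SETUP

requests = [
    # 1. Obtain the presentation description.
    rtsp_request("DESCRIBE", URL, 1, {"Accept": "application/sdp"}),
    # 2. Tell the server where and how to deliver one stream.
    rtsp_request("SETUP", URL + "/video", 2,
                 {"Transport": "RTP/AVP;unicast;client_port=3458-3459"}),
    # 3. Play from 30 s into the stream at twice normal speed.
    rtsp_request("PLAY", URL, 3,
                 {"Session": SESSION_ID, "Range": "npt=30-", "Scale": "2"}),
    # 4. Release the server state for this session.
    rtsp_request("TEARDOWN", URL, 4, {"Session": SESSION_ID}),
]

for request in requests:
    print(request)
```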
Call Processing Language
The Call Processing Language (CPL) [25, 26] is a scripting
language that allows end users to specify the behavior of call
agents which execute on their behalf. The call agents are
invoked when a call arrives at a SIP server. These agents execute the instructions contained in the CPL. This allows an end
user to specify their own call services. For example, a user can
instruct the agent to connect a call by trying a cell phone,
home PC, and work PC all at once.
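The kind of decision such an agent makes can be illustrated with a small rule table. The sketch below is written in Python and is not CPL syntax; the rule format, the caller-based condition, the action names, and the addresses are all assumptions made for the example.

```python
# Each rule: which caller it applies to (None matches anyone), what the
# server should do, and which destinations are involved.
RULES = [
    {"if_caller": "sip:boss@example.com", "action": "proxy",
     "targets": ["cell", "home-pc", "work-pc"]},            # ring all at once
    {"if_caller": None, "action": "redirect",
     "targets": ["voicemail"]},                             # everyone else
]


def decide(rules, caller):
    """Return the action and targets of the first rule matching the caller."""
    for rule in rules:
        if rule["if_caller"] in (None, caller):
            return rule["action"], rule["targets"]
    return "reject", []


important_call = decide(RULES, "sip:boss@example.com")
other_call = decide(RULES, "sip:stranger@example.net")
print(important_call, other_call)
```

The result of such a decision is then carried out by the SIP server, for example by forking the request to all listed targets.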
Gateway Location Protocol
SIP allows a user on the Internet to call another user, also
on the Internet. What if an Internet user wishes to call
someone not on the Internet, but rather, on the telephone
network? In this case, an Internet telephony gateway is
needed. This device is capable of converting signaling and
media between a packet network and the telephone network
(PSTN). We anticipate that many gateways will be deployed
in the future by Internet service providers (ISPs) and telephone companies.
To complete a call to a PSTN endpoint, an IP host must
send a SIP invitation to the gateway. However, given a phone
number to call, how does the caller find and select one of the
many gateways to complete the call? In theory, each gateway
could dial (almost) any phone number, but the caller may
want to minimize the distance of the PSTN leg of the call, for
example. This function is supported by the Gateway Location
Protocol (GLP).
■ Figure 3. GLP operation. (Diagram labels: GW, LS, UA, GLP, intradomain protocol, front-end.)
An overall architecture for GLP is shown in Fig. 3. In that
architecture there are a number of Internet telephony domains
in the Internet, each of which is under the control of a single
authority. Each domain has some number of IP telephony
gateways that provide connectivity between the Internet and
the PSTN. Each domain also has some number of IP users
and some number of location servers (LS). The LSs in a
domain know about the gateways in their own domains, by
means of an intradomain protocol, such as the Service Location Protocol (SLP) [27]. The intradomain protocol propagates information from the gateways to the location servers
within a domain.
Unfortunately, it is unlikely that a single administrative
domain will have access to enough gateways to complete
calls to all possible telephone numbers. As a result, users in
one administrative domain can make use of gateways in
another administrative domain. This usually requires preestablished business relationships between domains. Once
the agreements are in place, it is necessary for the LS in
one domain to exchange information about its gateways
with the LS in another domain. An LS can then take this
information and exchange it with other LSs with which it
has an established relationship. The protocol used for these
exchanges is GLP [28]. GLP allows an LS to build up a
database of gateways in other domains. The database contains entries with the IP address of the gateway, a range of
numbers the gateway is willing to terminate, and attributes
that describe the gateway. These attributes include signaling protocols, cost information, and provider identifiers,
among others. An LS can use the attributes to decide which
gateways to use to terminate a call to a particular number.
The information can also be used to determine which gateways to further advertise to other LSs. Both of these decisions are embodied in the policy that directs the behavior
of the LS.
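A gateway database of this kind, together with one possible selection policy, might look as follows. This is a sketch under stated assumptions: the `Gateway` fields, the use of a dialing prefix to represent the range of numbers, the "cheapest matching gateway" policy, and all addresses and prices are invented for illustration, since GLP itself was not yet specified.

```python
from dataclasses import dataclass


@dataclass
class Gateway:
    address: str            # IP address of the gateway
    prefix: str             # numbers it terminates, modeled as a dialing prefix
    protocols: tuple        # signaling protocols it supports
    cost_per_minute: float  # cost attribute
    provider: str           # provider identifier


def select_gateway(gateways, number, protocol="SIP"):
    """Policy sketch: cheapest gateway that serves the number and speaks the protocol."""
    candidates = [g for g in gateways
                  if number.startswith(g.prefix) and protocol in g.protocols]
    if not candidates:
        return None
    return min(candidates, key=lambda g: g.cost_per_minute)


gateways = [
    Gateway("192.0.2.10", "+1212", ("SIP",), 0.05, "provider-a"),
    Gateway("192.0.2.20", "+1", ("SIP", "H.323"), 0.03, "provider-b"),
    Gateway("192.0.2.30", "+44", ("SIP",), 0.02, "provider-c"),
    Gateway("192.0.2.40", "+1212", ("H.323",), 0.01, "provider-d"),
]

chosen = select_gateway(gateways, "+12125551234")
print(chosen.address if chosen else "no gateway available")
```

A different policy, such as preferring the longest matching prefix to shorten the PSTN leg, would change only the key used in `select_gateway`.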
When a client within the domain wishes to make a call to a
number in the phone network, it can proceed in several ways:
• Lightweight Directory Access Protocol (LDAP) — The database
of the LS is made available through LDAP [1]. The calling
client queries this database with the destination phone number, and the LS returns the IP address of the gateway. The
client can then send a SIP invitation to that address.
• SIP — Rather than sending a SIP invitation directly to
the gateway, the caller sends it to the LS. The invitation
contains the desired destination phone number. The LS
consults its database, finds the right gateway, and proxies the call to it. In this case, the LS also acts as a SIP
proxy. By acting as a proxy, the LS hides the gateway
selection process from the caller. The caller application
does not need to know whether the address being called
is a phone number or a SIP uniform resource locator
(URL). In either case, the invitation is sent to the local
proxy.
• Web pages — The LS can make its database available
through Web pages. A user that wishes to make a call
browses the Web page and finds the gateway it likes (perhaps this can be done through a Web form), and the LS
returns the address of the gateway on the Web page. The
user copies this address to their SIP software, and completes a call to the gateway.
GLP is just beginning the process of specification. Because
it is similar to existing interdomain IP routing protocols, such
as BGP4 [29], it is likely to borrow heavily from them [30, 31].
Conclusion
Providing an integrated Internet telephony service is no small
task. It requires signaling protocols, transport protocols, directory protocols, service specification languages, gateway discovery protocols, and a host of other mechanisms. In this article,
we have provided an overview of some of the protocols being
developed to solve these problems, and have demonstrated how
they work together as building blocks to provide the infrastructure for telephone services in the Internet. It should be
noted that almost all of these protocols have uses outside
Internet telephony. For example, SDP and RTP are used for
broadcasting streaming media, RTSP for controlling media-on-demand servers, and GLP may turn out to be useful as a wide-area service location protocol for locating media translators or
web server replicas. Due to space constraints, we have not covered protocols and architectures that assure “telephone quality,” such as RSVP [32, 33] or differentiated services [34], even
though Internet telephony may well be a primary “customer”
of these resource reservation mechanisms.
References
[1] W. Yeong, T. Howes, and S. Kille, “Lightweight directory access protocol,”
IETF RFC 1777, Mar. 1995.
[2] C. Huitema et al., “An architecture for internet telephony service for residential customers,” IEEE Network, this issue.
[3] P. Sijben et al., “Toward the PSTN/Internet inter-networking MEDIA DEVICE
CONTROL PROTOCOL,” Internet draft, IETF, Feb. 1999; work in progress.
[4] H. Schulzrinne and J. Rosenberg, “Internet telephony: Architecture and protocols — An IETF perspective,” Comp. Networks and ISDN Sys., vol. 31, Feb.
1999, pp. 237–55.
[5] R. Droms, “Dynamic host configuration protocol,” IETF RFC 1541, Oct. 1993.
[6] ITU-T Rec. H.323, “Visual telephone systems and equipment for local area
networks which provide a non-guaranteed quality of service,” Geneva,
Switzerland, May 1996.
[7] M. Handley et al., “SIP: session initiation protocol,” IETF RFC 2543, Mar. 1999.
[8] M. Handley and V. Jacobson, “SDP: session description protocol,” IETF RFC
2327, Apr. 1998.
[9] H. Schulzrinne and J. Rosenberg, “A comparison of SIP and H.323 for internet telephony,” Proc. NOSSDAV, Cambridge, U.K., July 1998.
[10] J. Toga and J. Ott, “ITU-T standardization activities for interactive multimedia communications on packet networks: H.323 and related recommendations,” Comp. Networks, vol. 31, no. 3, 1999, pp. 205–23.
[11] H. Schulzrinne and J. Rosenberg, “SIP call control services,” Internet draft,
IETF, Feb. 1998; work in progress.
[12] R. Fielding et al., “Hypertext transfer protocol -- HTTP/1.1,” IETF RFC 2068,
Jan. 1997.
[13] H. Schulzrinne and J. Rosenberg, “Signaling for internet telephony,” Tech.
rep. CUCS-005-98, Columbia Univ., New York, NY, Feb. 1998.
[14] H. Schulzrinne and J. Rosenberg, “Signaling for internet telephony,” Int’l.
Conf. Network Protocols, Austin, TX, Oct. 1998.
[15] H. Schulzrinne et al., “RTP: a transport protocol for real-time applications,”
IETF RFC 1889, Jan. 1996.
[16] H. Schulzrinne, “RTP profile for audio and video conferences with minimal
control,” IETF RFC 1890, Jan. 1996.
[17] R. Ramjee et al., “Adaptive playout mechanisms for packetized audio applications in wide-area networks,” Proc. IEEE INFOCOM, Toronto, Canada, Los
Alamitos, CA: IEEE Computer Society Press, June 1994, pp. 680–88.
[18] W. A. Montgomery, “Techniques for packet voice synchronization,” IEEE
JSAC, vol. SAC-1, Dec. 1983, pp. 1022–28.
[19] S. B. Moon, J. Kurose, and D. Towsley, “Packet audio playout delay adjustment: performance bounds and algorithms,” ACM/Springer Multimedia Sys.,
vol. 5, Jan. 1998, pp. 17–28.
[20] J.-C. Bolot and A. V. Garcia, “Control mechanisms for packet audio in the
internet,” Proc. IEEE INFOCOM, San Francisco, CA, Mar. 1996.
[21] I. Busse, B. Deffner, and H. Schulzrinne, “Dynamic QoS control of multimedia applications based on RTP,” Comp. Commun., vol. 19, Jan. 1996, pp.
49–58.
[22] C. Perkins and O. Hodson, “Options for repair of streaming media,” IETF
RFC 2354, June 1998.
[23] J. Rosenberg and H. Schulzrinne, “Timer reconsideration for enhanced RTP
scalability,” Proc. IEEE INFOCOM, San Francisco, CA, Mar./Apr. 1998.
[24] H. Schulzrinne, A. Rao, and R. Lanphier, “Real time streaming protocol
(RTSP),” IETF RFC 2326, Apr. 1998.
[25] J. Lennox and H. Schulzrinne, “CPL: a language for user control of internet
telephony services,” Internet draft, IETF, Mar. 1999; work in progress.
[26] J. Rosenberg, J. Lennox, and H. Schulzrinne, “Programming internet telephony services,” IEEE Network, this issue.
[27] J. Veizades et al., “Service location protocol,” IETF RFC 2165, June 1997.
[28] J. Rosenberg and H. Schulzrinne, “A framework for a gateway location
protocol,” Internet draft, IETF, Feb. 1999; work in progress.
[29] Y. Rekhter and T. Li, “A border gateway protocol 4 (BGP-4),” IETF RFC
1771, Mar. 1995.
[30] M. Squire, “A gateway location protocol,” Internet draft, IETF, Feb. 1999;
work in progress.
[31] D. Hampton et al., “The IP telephony border gateway protocol architecture,”
Internet draft, IETF, Feb. 1999; work in progress.
[32] L. Zhang et al., “RSVP: a new resource reservation protocol,” IEEE Network, vol. 7, Sept. 1993, pp. 8–18.
[33] R. Braden et al., “Resource ReSerVation protocol (RSVP) — version 1 functional specification,” IETF RFC 2205, Sept. 1997.
[34] S. Blake et al., “An architecture for differentiated service,” IETF RFC 2475,
Dec. 1998.
Biographies
HENNING SCHULZRINNE ([email protected]) received his undergraduate degree
in economics and electrical engineering from the Technische Hochschule, Darmstadt, Germany, in 1984, his M.S.E.E. degree as a Fulbright scholar from the
University of Cincinnati, Ohio, and his Ph.D. degree from the University of Massachusetts, Amherst, in 1987 and 1992, respectively. From 1992 to 1994 he
was a member of technical staff at AT&T Bell Laboratories, Murray Hill, New Jersey. From 1994 to 1996, he was associate department head at GMD-Fokus
(Berlin), before joining the Computer Science and Electrical Engineering departments at Columbia University, New York. His research interests encompass real-time multimedia network services in the Internet and modeling and performance
evaluation. He is an editor of the Journal of Communications and Networks and
IEEE Communications Society editor of IEEE Internet Computing. He co-chairs the
IEEE Communications Society Internet Technical Committee and is vice chair of
the IEEE Communications Society Technical Committee on Computer Communications. He has been vice general chair of IEEE INFOCOM and will be co-technical
chair of that conference in 2000. Protocols codeveloped by him are now Internet
standards, used by almost all Internet telephony and multimedia applications. He
is co-author of the Real-Time Transport Protocol (RTP) for real-time Internet services, the signaling protocol for Internet multimedia conferences and telephony (SIP), and the
stream control protocol for Internet media-on-demand (RTSP).
JONATHAN ROSENBERG ([email protected]) is currently a member of technical
staff in the High Speed Networks Research Department, Bell Laboratories, Lucent
Technologies, Holmdel, New Jersey. He conducts research on technologies related to
multimedia communications on the Internet, including transport and error recovery,
signaling, architectures, protocols, and service creation. He received B.S. and M.S.
degrees in electrical engineering from the Massachusetts Institute of Technology in
Cambridge, and is continuing his studies in the same field as a Ph.D. candidate at
Columbia University in New York City. He is active in the Internet Engineering Task
Force (IETF), where he chairs the IP Telephony working group.