Data and Computer Communications (Eighth Edition)

Table 14.3 IS-95 Forward Link Channel Parameters
[Table listing, for Traffic Rate Sets 1 and 2, the data rate (bps), code repetition factor, modulation symbol rate (a constant 19,200 sps across all rates), PN chips per modulation symbol, and PN chips per bit.]
The most widely used second-generation CDMA scheme is IS-95, which is primarily
deployed in North America. The transmission structures on the forward and reverse
links differ and are described separately.
IS-95 Forward Link
Table 14.3 lists forward link channel parameters. The forward link consists of up
to 64 logical CDMA channels each occupying the same 1228-kHz bandwidth
(Figure 14.10a). The forward link supports four types of channels:
[Figure 14.10a shows the forward channels (the pilot channel, the synchronization channel, paging channels, and traffic channels), each spread by a distinct Walsh code within the same 1.228-MHz bandwidth. Figure 14.10b shows the reverse channels (access channels and traffic channels), each spread by a distinct user-specific long code within the same 1.228-MHz bandwidth.]
Figure 14.10 IS-95 Channel Structure
• Pilot (channel 0): A continuous signal on a single channel. This channel
allows the mobile unit to acquire timing information, provides phase reference for the demodulation process, and provides a means for signal strength
comparison for the purpose of handoff determination. The pilot channel
consists of all zeros.
• Synchronization (channel 32): A 1200-bps channel used by the mobile station
to obtain identification information about the cellular system (system time,
long code state, protocol revision, etc.).
• Paging (channels 1 to 7): Contain messages for one or more mobile stations.
• Traffic (channels 8 to 31 and 33 to 63): The forward channel supports
55 traffic channels. The original specification supported data rates of up
to 9600 bps. A subsequent revision added a second set of rates up to
14,400 bps.
Note that all of these channels use the same bandwidth. The chipping code is used to distinguish among the different channels. For the forward channel, the chipping codes are the 64 orthogonal 64-bit codes derived from a 64 × 64 matrix known as the Walsh matrix (discussed in [STAL05]).
Figure 14.11 shows the processing steps for transmission on a forward traffic
channel using rate set 1. For voice traffic, the speech is encoded at a data rate of 8550
bps. After additional bits are added for error detection, the rate is 9600 bps. The full
channel capacity is not used when the user is not speaking. During quiet periods the
data rate is lowered to as low as 1200 bps. The 2400-bps rate is used to transmit transients in the background noise, and the 4800-bps rate is used to mix digitized speech
and signaling data.
The data or digitized speech is transmitted in 20-ms blocks with forward error
correction provided by a convolutional encoder with rate 1/2, thus doubling the
effective data rate to a maximum of 19.2 kbps. For lower data rates, the encoder
output bits (called code symbols) are replicated to yield the 19.2-kbps rate. The
data are then interleaved in blocks to reduce the effects of errors by spreading
them out.
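The block interleaving step can be sketched as a classic write-by-rows, read-by-columns interleaver. The 4 × 6 dimensions below are illustrative only, not the actual IS-95 interleaver geometry:

```python
def interleave(symbols, rows, cols):
    """Block interleaver: write symbols row by row, read them column by column."""
    assert len(symbols) == rows * cols
    matrix = [symbols[r * cols:(r + 1) * cols] for r in range(rows)]
    return [matrix[r][c] for c in range(cols) for r in range(rows)]

def deinterleave(symbols, rows, cols):
    """Inverse operation: swap the roles of rows and columns."""
    return interleave(symbols, cols, rows)

# A burst of consecutive channel errors now hits symbols that were far
# apart in the original stream, so after deinterleaving the errors are
# scattered and easier for the convolutional decoder to correct.
data = list(range(24))                  # 24 code symbols
tx = interleave(data, rows=4, cols=6)   # tx begins [0, 6, 12, 18, ...]
assert deinterleave(tx, 4, 6) == data   # lossless round trip
```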
Following the interleaver, the data bits are scrambled. The purpose of this is to
serve as a privacy mask and to prevent the sending of repetitive patterns, which in
turn reduces the probability of users sending at peak power at the same time. The
scrambling is accomplished by means of a long code that is generated as a pseudorandom number from a 42-bit-long shift register. The shift register is initialized with
the user’s electronic serial number. The output of the long code generator is at a rate
of 1.2288 Mbps, which is 64 times the rate of 19.2 kbps, so only one bit in 64 is
selected (by the decimator function). The resulting stream is XORed with the output of the block interleaver.
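The scrambling chain (long code generator, decimator, XOR) can be sketched as follows. The LFSR tap positions and the seed are illustrative stand-ins, not the actual IS-95 long-code polynomial or an ESN-derived mask:

```python
def lfsr_stream(seed, taps, n, width=42):
    """Fibonacci-style LFSR over a `width`-bit register; `taps` are the bit
    positions XORed together to form the feedback bit. These taps are an
    illustrative choice, NOT the IS-95 long-code polynomial."""
    state = seed & ((1 << width) - 1)
    out = []
    for _ in range(n):
        out.append(state & 1)                  # emit the low-order bit
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1
        state = (state >> 1) | (fb << (width - 1))
    return out

def scramble(data_bits, seed):
    """Scramble a 19.2-kbps stream: the 1.2288-Mbps long code runs 64x
    faster, so the decimator keeps only one chip in every 64."""
    chips = lfsr_stream(seed, taps=(0, 1, 5, 41), n=64 * len(data_bits))
    decimated = chips[::64]                    # 1.2288 Mbps -> 19.2 kbps
    return [d ^ c for d, c in zip(data_bits, decimated)]

data = [1, 0, 1, 1, 0, 0, 1, 0]
tx = scramble(data, seed=0x2AAAAAAAAA)         # seed stands in for the mask
assert scramble(tx, seed=0x2AAAAAAAAA) == data  # XOR scrambling is self-inverse
```

The round trip works because XOR with the same decimated chip stream undoes itself, which is exactly why the receiver can descramble with its own copy of the long code generator.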
The next step in the processing inserts power control information in the traffic
channel, to control the power output of the antenna. The power control function of
the base station robs the traffic channel of bits at a rate of 800 bps. These are
inserted by stealing code bits. The 800-bps channel carries information directing the
mobile unit to increment, decrement, or keep stable its current output level. This
power control stream is multiplexed into the 19.2-kbps stream by replacing some of the code bits, with the long code generator determining which bits are replaced.
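The symbol-stealing step above can be sketched as follows. The stolen position within each group is fixed here for simplicity; in IS-95 it is pseudorandomized by the long code:

```python
def puncture_power_control(code_symbols, pc_bits):
    """Multiplex 800-bps power control bits into the 19.2-kbps symbol
    stream by symbol stealing: 19,200 / 800 = 24, so one code symbol in
    every group of 24 is overwritten by a power control bit."""
    out = list(code_symbols)
    for i, b in enumerate(pc_bits):
        out[i * 24] = b        # overwrite (not insert): rate stays 19.2 kbps
    return out

frame = [0] * 48               # 48 code symbols = 2 power-control groups
punctured = puncture_power_control(frame, [1, 1])
assert punctured[0] == 1 and punctured[24] == 1
assert len(punctured) == len(frame)   # the aggregate rate is unchanged
```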
[Figure 14.11 shows the forward link pipeline: forward traffic channel information bits (172, 80, 40, or 16 bits/frame, i.e., 8.6, 4.0, 2.0, or 0.8 kbps); frame quality indicators added for the 9600- and 4800-bps rates (9.2, 4.4, 2.0, or 0.8 kbps); an 8-bit encoder tail added (9.6, 4.8, 2.4, or 1.2 kbps); a convolutional encoder with (n, k, K) = (2, 1, 9); symbol repetition to 19.2 ksps; a block interleaver (19.2 kbps); scrambling with the decimated long code (mask for user m; long code at 1.2288 Mbps); multiplexing of 800-bps power control bits (800 Hz); spreading with Walsh code n; and a PN chip rate of 1.2288 Mbps.]
Figure 14.11 IS-95 Forward Link Transmission
The next step in the process is the DS-SS function, which spreads the 19.2 kbps to a rate of 1.2288 Mbps using one row of the 64 × 64 Walsh matrix.
One row of the matrix is assigned to a mobile station during call setup. If a 0 bit
is presented to the XOR function, then the 64 bits of the assigned row are sent.
If a 1 is presented, then the bitwise XOR of the row is sent. Thus, the final bit
rate is 1.2288 Mbps. This digital bit stream is then modulated onto the carrier
using a QPSK modulation scheme. Recall from Chapter 5 that QPSK involves
creating two bit streams that are separately modulated (see Figure 5.11).
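The spreading rule just described (a 0 bit sends the assigned Walsh row, a 1 bit sends its complement) can be sketched as follows, using the ±1 form of the Walsh matrix so that the complement is simply negation. The row index 23 is an arbitrary choice for illustration:

```python
def walsh(n):
    """Walsh matrix of order n (a power of 2) with +/-1 entries, built by
    the recursion W(2k) = [[W, W], [W, -W]]."""
    W = [[1]]
    while len(W) < n:
        W = [row + row for row in W] + [row + [-x for x in row] for row in W]
    return W

def spread(bits, code):
    """Data bit 0 -> the assigned code row; bit 1 -> its complement."""
    chips = []
    for b in bits:
        chips.extend(code if b == 0 else [-c for c in code])
    return chips

def despread(chips, code):
    """Correlate each 64-chip block against the code; the sign gives the bit."""
    n = len(code)
    bits = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], code))
        bits.append(0 if corr > 0 else 1)
    return bits

W = walsh(64)
row = W[23]                        # the row assigned to this mobile at call setup
bits = [0, 1, 1, 0, 1]
assert despread(spread(bits, row), row) == bits
# Orthogonality: a correlator tuned to a different row sees zero net energy,
# which is what keeps the 64 forward channels separable.
assert sum(a * b for a, b in zip(W[23], W[40])) == 0
```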
In the IS-95 scheme, the data are split into I and Q (in-phase and quadrature) channels and the data in each channel are XORed with a unique short code. The short codes are generated as pseudorandom numbers from a 15-bit-long shift register.
Table 14.4 IS-95 Reverse Link Channel Parameters
[Table listing, for Traffic Rate Sets 1 and 2, the data rate (bps), code rate, symbol rate before repetition (sps), symbol repetition factor, symbol rate after repetition (sps), transmit duty cycle, code symbols per modulation symbol, PN chips per modulation symbol, and PN chips per bit.]
IS-95 Reverse Link
Table 14.4 lists reverse link channel parameters. The reverse link consists of up to 94 logical CDMA channels, each occupying the same 1228-kHz bandwidth (Figure 14.10b). The reverse link supports up to 32 access channels and up to 62 traffic channels.
The traffic channels in the reverse link are mobile unique. Each station has a unique long code mask based on its electronic serial number. The long code mask is a 42-bit number, so there are 2^42 - 1 different masks. The access channel is used by a mobile to initiate a call, to respond to a paging channel message from the base station, and for a location update.
Figure 14.12 shows the processing steps for transmission on a reverse traffic channel using rate set 1. The first few steps are the same as for the forward channel. For the reverse channel, the convolutional encoder has a rate of 1/3, thus tripling the effective data rate to a maximum of 28.8 kbps. The data are then block interleaved.
The next step is a spreading of the data using the Walsh matrix. The way in
which the matrix is used, and its purpose, are different from that of the forward
channel. In the reverse channel, the data coming out of the block interleaver are
grouped in units of 6 bits. Each 6-bit unit serves as an index to select a row of the
64 × 64 Walsh matrix (2^6 = 64), and that row is substituted for the input. Thus the
data rate is expanded by a factor of 64/6 to 307.2 kbps. The purpose of this encoding
is to improve reception at the base station. Because the 64 possible codings are
orthogonal, the block coding enhances the decision-making algorithm at the
receiver and is also computationally efficient (see [PETE95] for details). We can
view this Walsh modulation as a form of block error-correcting code with (n, k) = (64, 6) and d_min = 32. In fact, all distances are 32.
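A sketch of this 64-ary Walsh modulation, which also verifies the block-code view: every pair of distinct rows of the binary Walsh matrix differs in exactly 32 positions, so the (64, 6) code has minimum distance 32:

```python
def walsh01(n):
    """Binary (0/1) Walsh matrix of order n: each recursion step doubles
    the matrix, complementing the lower-right quadrant."""
    W = [[0]]
    while len(W) < n:
        W = [r + r for r in W] + [r + [1 - x for x in r] for r in W]
    return W

W = walsh01(64)

def walsh_modulate(bits):
    """Group code symbols in 6s; each 6-bit group indexes one of 64 rows,
    expanding the rate by a factor of 64/6."""
    out = []
    for i in range(0, len(bits), 6):
        idx = int("".join(map(str, bits[i:i + 6])), 2)
        out.extend(W[idx])
    return out

# (64, 6) block code: every pair of distinct codewords differs in exactly
# 32 positions, so d_min = 32 and all distances are 32.
dists = {sum(a != b for a, b in zip(W[i], W[j]))
         for i in range(64) for j in range(i + 1, 64)}
assert dists == {32}
assert len(walsh_modulate([0, 0, 0, 1, 1, 1])) == 64   # 6 bits -> 64 chips
```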
[Figure 14.12 shows the reverse link pipeline: reverse traffic channel information bits (172, 80, 40, or 16 bits/frame, i.e., 8.6, 4.0, 2.0, or 0.8 kbps); frame quality indicators added for the 9600- and 4800-bps rates (9.2, 4.4, 2.0, or 0.8 kbps); an 8-bit encoder tail added (9.6, 4.8, 2.4, or 1.2 kbps); a convolutional encoder with (n, k, K) = (3, 1, 9) yielding 28.8, 14.4, 7.2, or 3.6 kbps; symbol repetition to 28.8 ksps; a block interleaver; 64-ary Walsh modulation at 4.8 ksps (307.2-kcps Walsh chip rate); the data burst randomizer; and spreading with the long code at a PN chip rate of 1.2288 Mbps.]
Figure 14.12 IS-95 Reverse Link Transmission
The data burst randomizer is implemented to help reduce interference from other mobile stations (see [BLAC99b] for a discussion). The operation involves using the long code mask to smooth the data out over each 20-ms frame.
The next step in the process is the DS-SS function. In the case of the reverse
channel, the long code unique to the mobile is XORed with the output of the randomizer to produce the 1.2288-Mbps final data stream. This digital bit stream is then
modulated onto the carrier using an offset QPSK modulation scheme. This differs
from the forward channel in the use of a delay element in the modulator (Figure
5.11) to produce orthogonality. The reason the modulators are different is that in the forward channel, the spreading codes are orthogonal, all coming from the Walsh matrix, whereas in the reverse channel, orthogonality of the spreading codes is not guaranteed.

Third-Generation Systems
The objective of the third generation (3G) of wireless communication is to provide
fairly high-speed wireless communications to support multimedia, data, and video in
addition to voice. The ITU’s International Mobile Telecommunications for the year
2000 (IMT-2000) initiative has defined the ITU’s view of third-generation capabilities as
• Voice quality comparable to the public switched telephone network
• 144-kbps data rate available to users in high-speed motor vehicles over large areas
• 384 kbps available to pedestrians standing or moving slowly over small areas
• Support (to be phased in) for 2.048 Mbps for office use
• Symmetrical and asymmetrical data transmission rates
• Support for both packet-switched and circuit-switched data services
• An adaptive interface to the Internet to reflect efficiently the common asymmetry between inbound and outbound traffic
• More efficient use of the available spectrum in general
• Support for a wide variety of mobile equipment
• Flexibility to allow the introduction of new services and technologies
More generally, one of the driving forces of modern communication technology is
the trend toward universal personal telecommunications and universal communications access. The first concept refers to the ability of a person to identify
himself or herself easily and use conveniently any communication system in an
entire country, over a continent, or even globally, in terms of a single account. The
second refers to the capability of using one’s terminal in a wide variety of environments to connect to information services (e.g., to have a portable terminal
that will work in the office, on the street, and on airplanes equally well). This
revolution in personal computing will obviously involve wireless communication
in a fundamental way.
Personal communications services (PCSs) and personal communication networks (PCNs) are names attached to these concepts of global wireless communications, and they also form objectives for third-generation wireless.
Generally, the technology planned is digital using time division multiple access or code division multiple access to provide efficient use of the spectrum and high capacity.
PCS handsets are designed to be low power and relatively small and light. Efforts are being made internationally to allow the same terminals to be used worldwide.
Alternative Interfaces
Figure 14.13 shows the alternative schemes that have been adopted as part of IMT-2000. The specification covers a set of radio interfaces for optimized performance in different radio environments. A major reason for the inclusion of five alternatives was to enable a smooth evolution from existing first- and second-generation systems.
[Figure 14.13 depicts the five terrestrial radio interfaces: IMT-DS (direct spread), IMT-MC (multicarrier), IMT-TC (time code), IMT-SC (single carrier), and IMT-FT (frequency time).]
Figure 14.13 IMT-2000 Terrestrial Radio Interfaces
The five alternatives reflect the evolution from the second generation. Two of
the specifications grow out of the work at the European Telecommunications Standards Institute (ETSI) to develop a UMTS (universal mobile telecommunications
system) as Europe’s 3G wireless standard. UMTS includes two standards. One of
these is known as wideband CDMA, or W-CDMA. This scheme fully exploits
CDMA technology to provide high data rates with efficient use of bandwidth.
Table 14.5 shows some of the key parameters of W-CDMA. The other European
effort under UMTS is known as IMT-TC, or TD-CDMA. This approach is a combination of W-CDMA and TDMA technology. IMT-TC is intended to provide an
upgrade path for the TDMA-based GSM systems.
Another CDMA-based system, known as cdma2000, has a North American
origin. This scheme is similar to, but incompatible with, W-CDMA, in part because
the standards use different chip rates. Also, cdma2000 uses a technique known as
multicarrier, not used with W-CDMA.
Two other interface specifications are shown in Figure 14.13. IMT-SC is primarily designed for TDMA-only networks. IMT-FT can be used by both TDMA and
FDMA carriers to provide some 3G services; it is an outgrowth of the Digital European Cordless Telecommunications (DECT) standard.
CDMA Design Considerations
The dominant technology for 3G systems is CDMA. Although three different
CDMA schemes have been adopted, they share some common design issues.
[OJAN98] lists the following:
Table 14.5 W-CDMA Parameters
Channel bandwidth: 5 MHz
Forward RF channel structure: Direct spread
Chip rate: 3.84 Mcps
Frame length: 10 ms
Number of slots/frame:
Spreading modulation: Balanced QPSK (forward); dual-channel QPSK (reverse); complex spreading circuit
Data modulation: QPSK (forward); BPSK (reverse)
Coherent detection: Pilot symbols
Reverse channel multiplexing: Control and pilot channel time multiplexed; I and Q multiplexing for data and control channels
Multirate: Various spreading and multicode
Spreading factors: 4 to 256
Power control: Open and fast closed loop (1.6 kHz)
Spreading (forward): Variable-length orthogonal sequences for channel separation; Gold sequences (2^18) for cell and user separation
Spreading (reverse): Same as forward; different time shifts in I and Q channels
• Bandwidth: An important design goal for all 3G systems is to limit channel
usage to 5 MHz. There are several reasons for this goal. On the one hand, a
bandwidth of 5 MHz or more improves the receiver’s ability to resolve
multipath when compared to narrower bandwidths. On the other hand,
available spectrum is limited by competing needs, and 5 MHz is a reasonable upper limit on what can be allocated for 3G. Finally, 5 MHz is
adequate for supporting data rates of 144 and 384 kbps, the main targets for 3G services.
• Chip rate: Given the bandwidth, the chip rate depends on desired data rate,
the need for error control, and bandwidth limitations. A chip rate of 3 Mcps or
more is reasonable given these design parameters.
• Multirate: The term multirate refers to the provision of multiple fixed-datarate logical channels to a given user, in which different data rates are provided on different logical channels. Further, the traffic on each logical
channel can be switched independently through the wireless and fixed networks to different destinations. The advantage of multirate is that the system can flexibly support multiple simultaneous applications from a given
user and can efficiently use available capacity by only providing the capacity required for each service. Multirate can be achieved with a TDMA
scheme within a single CDMA channel, in which a different number of slots
per frame are assigned to achieve different data rates. All the subchannels at a given data rate would be protected by error correction and interleaving techniques (Figure 14.14a). An alternative is to use multiple CDMA codes, with separate coding and interleaving, and map them to separate CDMA channels (Figure 14.14b).
Figure 14.14 Time and Code Multiplexing Principles [OJAN98]
Recommended Reading
[BERT94] and [ANDE95] are instructive surveys of cellular wireless propagation effects. [BLAC99b] is one of the best technical treatments of second-generation cellular systems. [TANT98] contains reprints of numerous important papers dealing with CDMA in cellular networks. [DINA98] provides an overview of both PN and orthogonal spreading codes for cellular CDMA networks.
[OJAN98] provides an overview of key technical design considerations for 3G systems. Another useful survey is [ZENG00]. [PRAS00] is a much more detailed treatment of 3G systems.
ANDE95 Anderson, J.; Rappaport, T.; and Yoshida, S. “Propagation Measurements and
Models for Wireless Communications Channels.” IEEE Communications Magazine, January 1995.
BERT94 Bertoni, H.; Honcharenko, W.; Maciel, L.; and Xia, H. “UHF Propagation Prediction for Wireless Personal Communications.” Proceedings of the IEEE, September 1994.
BLAC99b Black, U. Second-Generation Mobile and Wireless Networks. Upper Saddle
River, NJ: Prentice Hall, 1999.
DINA98 Dinan, E., and Jabbari, B. “Spreading Codes for Direct Sequence CDMA and
Wideband CDMA Cellular Networks.” IEEE Communications Magazine, September 1998.
OJAN98 Ojanpera, T., and Prasad, G. “An Overview of Air Interface Multiple Access for
IMT-2000/UMTS.” IEEE Communications Magazine, September 1998.
PRAS00 Prasad, R.; Mohr, W.; and Konhauser, W., eds. Third-Generation Mobile Communication Systems. Boston: Artech House, 2000.
TANT98 Tantaratana, S., and Ahmed, K., eds. Wireless Applications of Spread Spectrum Systems: Selected Readings. Piscataway, NJ: IEEE Press, 1998.
ZENG00 Zeng, M.; Annamalai, A.; and Bhargava, V. "Harmonization of Global Third-Generation Mobile Systems." IEEE Communications Magazine, December 2000.
Recommended Web sites:
• Cellular Telecommunications and Internet Association: An industry consortium
that provides information on successful applications of wireless technology.
• CDMA Development Group: Information and links for IS-95 and CDMA generally.
• 3G Americas: A trade group of Western Hemisphere companies supporting a variety
of second- and third-generation schemes. Includes industry news, white papers, and
other technical information.
Key Terms
adaptive equalization
Advanced Mobile Phone
Service (AMPS)
base station
cellular network
code division multiple access
fast fading
flat fading
first-generation (1G) network
forward channel
frequency diversity
frequency reuse
mobile radio
power control
reuse factor
reverse channel
second-generation (2G)
selective fading
slow fading
space diversity
third-generation (3G)
Review Questions
What geometric shape is used in cellular system design?
What is the principle of frequency reuse in the context of a cellular network?
List five ways of increasing the capacity of a cellular system.
Explain the paging function of a cellular system.
What is fading?
What is the difference between diffraction and scattering?
What is the difference between fast and slow fading?
What is the difference between flat and selective fading?
What are the key differences between first- and second-generation cellular systems?
What are the advantages of using CDMA for a cellular network?
What are the disadvantages of using CDMA for a cellular network?
What are some key characteristics that distinguish third-generation cellular systems
from second-generation cellular systems?
Problems
Consider four different cellular systems that share the following characteristics. The
frequency bands are 825 to 845 MHz for mobile unit transmission and 870 to 890
MHz for base station transmission. A duplex circuit consists of one 30-kHz channel in
each direction. The systems are distinguished by the reuse factor, which is 4, 7, 12, and
19, respectively.
a. Suppose that in each of the systems, the cluster of cells (4, 7, 12, 19) is duplicated
16 times. Find the number of simultaneous communications that can be supported
by each system.
b. Find the number of simultaneous communications that can be supported by a single cell in each system.
c. What is the area covered, in cells, by each system?
d. Suppose the cell size is the same in all four systems and a fixed area of 100 cells is
covered by each system. Find the number of simultaneous communications that
can be supported by each system.
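A sketch of the arithmetic for parts (a) through (d); the integer rounding conventions are an assumption, and the printed layout is only illustrative:

```python
# Each direction spans 20 MHz (845 - 825 = 890 - 870 = 20 MHz), and a
# duplex circuit takes one 30-kHz channel in each direction, so the total
# pool of duplex channels is 20 MHz / 30 kHz.
total_duplex = 20_000_000 // 30_000        # = 666 duplex channels
assert total_duplex == 666

for K in (4, 7, 12, 19):                   # the four reuse factors
    per_cell = total_duplex // K           # (b) channels in any one cell
    per_system = total_duplex * 16         # (a) 16 clusters, full reuse in each
    area_cells = K * 16                    # (c) area covered, in cells
    fixed_area = per_cell * 100            # (d) same cell size, 100-cell area
    print(K, per_cell, per_system, area_cells, fixed_area)
```

Note that in part (a) every cluster reuses the entire channel set, so the 16-cluster total is the same for all four systems; the reuse factor only changes how many channels any single cell gets.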
Describe a sequence of events similar to that of Figure 14.6 for
a. a call from a mobile unit to a fixed subscriber
b. a call from a fixed subscriber to a mobile unit
An analog cellular system has a total of 33 MHz of bandwidth and uses two 25-kHz
simplex (one-way) channels to provide full duplex voice and control channels.
a. What is the number of channels available per cell for a frequency reuse factor of
(1) 4 cells, (2) 7 cells, and (3) 12 cells?
b. Assume that 1 MHz is dedicated to control channels but that only one control
channel is needed per cell. Determine a reasonable distribution of control channels
and voice channels in each cell for the three frequency reuse factors of part (a).
A cellular system uses FDMA with a spectrum allocation of 12.5 MHz in each direction, a guard band at the edge of the allocated spectrum of 10 kHz, and a channel
bandwidth of 30 kHz. What is the number of available channels?
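The channel count follows directly; reading "a guard band at the edge" as one 10-kHz guard band at each edge of the allocation is an assumption about the problem's intent:

```python
allocation = 12_500_000      # 12.5 MHz in each direction
guard = 10_000               # 10-kHz guard band, assumed at both edges
channel_bw = 30_000          # 30-kHz channel bandwidth

# Usable spectrum is the allocation minus the two guard bands.
n_channels = (allocation - 2 * guard) // channel_bw
assert n_channels == 416
```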
For a cellular system, FDMA spectral efficiency is defined as ha = (Bc × NT)/Bw, where
Bc = channel bandwidth
Bw = total bandwidth in one direction
NT = total number of voice channels in the covered area
What is an upper bound on ha?
Walsh codes are the most common orthogonal codes used in CDMA applications. A
set of Walsh codes of length n consists of the n rows of an n * n Walsh matrix. That is,
there are n codes, each of length n. The matrix is defined recursively as follows:
W1 = (0)

W2n = ( Wn   Wn )
      ( Wn  ~Wn )

where n is the dimension of the matrix and the overscore (rendered here as ~) denotes the logical NOT of the bits in the matrix. The Walsh matrix has the property that every row is orthogonal to every other row and to the logical NOT of every other row. Show the Walsh matrices of dimensions 2, 4, and 8.
Demonstrate that the codes in an 8 * 8 Walsh matrix are orthogonal to each other by
showing that multiplying any code by any other code produces a result of zero.
Consider a CDMA system in which users A and B have the Walsh codes (-1, 1, -1, 1, -1, 1, -1, 1) and (-1, -1, 1, 1, -1, -1, 1, 1), respectively.
a. Show the output at the receiver if A transmits a data bit 1 and B does not transmit.
b. Show the output at the receiver if A transmits a data bit 0 and B does not transmit.
c. Show the output at the receiver if A transmits a data bit 1 and B transmits a data
bit 1. Assume the received power from both A and B is the same.
d. Show the output at the receiver if A transmits a data bit 0 and B transmits a data
bit 1. Assume the received power from both A and B is the same.
e. Show the output at the receiver if A transmits a data bit 1 and B transmits a data
bit 0. Assume the received power from both A and B is the same.
f. Show the output at the receiver if A transmits a data bit 0 and B transmits a data
bit 0. Assume the received power from both A and B is the same.
g. Show the output at the receiver if A transmits a data bit 1 and B transmits a data
bit 1. Assume the received power from B is twice the received power from A. This can be represented by showing the received signal component from A as consisting of elements of magnitude 1 (+1, -1) and the received signal component from B as consisting of elements of magnitude 2 (+2, -2).
h. Show the output at the receiver if A transmits a data bit 0 and B transmits a data
bit 1. Assume the received power from B is twice the received power from A.
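A correlation-receiver sketch for a few of the cases above; the bit-to-signal mapping (bit 1 sends +code, bit 0 sends -code) is an assumed convention:

```python
cA = [-1, 1, -1, 1, -1, 1, -1, 1]   # user A's Walsh code
cB = [-1, -1, 1, 1, -1, -1, 1, 1]   # user B's Walsh code

def tx(code, bit, power=1):
    """Bit 1 -> +code, bit 0 -> -code, scaled by the received amplitude."""
    sign = 1 if bit == 1 else -1
    return [sign * power * c for c in code]

def receiver_output(received, code):
    """Correlate the composite received chip sequence with one user's code."""
    return sum(r * c for r, c in zip(received, code))

# Case (a): A sends 1, B silent -> A's correlator reads +8.
r = tx(cA, 1)
assert receiver_output(r, cA) == 8
# Case (c): both send 1 -> B contributes nothing, since the codes are
# orthogonal (their inner product is 0).
r = [a + b for a, b in zip(tx(cA, 1), tx(cB, 1))]
assert receiver_output(r, cA) == 8
# Case (g): B received at twice A's power -> A's correlator still reads +8;
# orthogonality removes B's component regardless of its power.
r = [a + b for a, b in zip(tx(cA, 1), tx(cB, 1, power=2))]
assert receiver_output(r, cA) == 8
```

The remaining cases follow the same pattern: a transmitted 0 bit flips the correlator output to -8, and B's contribution always cancels.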
Local Area Networks
The trend in local area networks (LANs) involves the use of shared transmission media or shared switching capacity to achieve high data rates over relatively short distances. Several key issues present themselves.
One is the choice of transmission medium. Whereas coaxial cable was commonly used in traditional LANs, contemporary LAN installations emphasize
the use of twisted pair or optical fiber. In the case of twisted pair, efficient
encoding schemes are needed to enable high data rates over the medium. Wireless LANs have also assumed increased importance. Another design issue is
that of access control.
Chapter 15 Local Area Network Overview
The essential technology underlying all forms of LANs comprises
topology, transmission medium, and medium access control technique.
Chapter 15 examines the first two of these elements. Four topologies
are in common use: bus, tree, ring, and star. The most common transmission media for local networking are twisted pair (unshielded and
shielded), coaxial cable (baseband and broadband), optical fiber, and
wireless (microwave and infrared). These topologies and transmission
media are discussed, with the exception of wireless, which is covered in
Chapter 17.
The increasing deployment of LANs has led to an increased need
to interconnect LANs with each other and with WANs. Chapter 15 also
discusses a key device used in interconnecting LANs: the bridge.
Chapter 16 High-Speed LANs
Chapter 16 looks in detail at the topologies, transmission media, and MAC
protocols of the most important LAN systems in current use; all of these
have been defined in standards documents. The most important of these is
Ethernet, which has been deployed in versions at 10 Mbps, 100 Mbps,
1 Gbps, and 10 Gbps. Then the chapter looks at Fibre Channel.
Chapter 17 Wireless LANs
Wireless LANs use one of three transmission techniques: spread spectrum, narrowband microwave, and infrared. Chapter 17 provides an overview of wireless LAN technology and applications. The most significant set of standards defining wireless LANs is the one developed by the IEEE 802.11 committee. Chapter 17 examines this set of standards in depth.
15.1 Background
15.2 Topologies and Transmission Media
15.3 LAN Protocol Architecture
15.4 Bridges
15.5 Layer 2 and Layer 3 Switches
15.6 Recommended Reading and Web Site
15.7 Key Terms, Review Questions, and Problems
The whole of this operation is described in minute detail in the official British Naval
History, and should be studied with its excellent charts by those who are interested in
its technical aspect. So complicated is the full story that the lay reader cannot see the
wood for the trees. I have endeavored to render intelligible the broad effects.
—The World Crisis, Winston Churchill
A LAN consists of a shared transmission medium and a set of hardware and software for interfacing devices to the medium and regulating the orderly access to the medium.
The topologies that have been used for LANs are ring, bus, tree, and
star. A ring LAN consists of a closed loop of repeaters that allow data
to circulate around the ring. A repeater may also function as a device
attachment point. Transmission is generally in the form of frames. The
bus and tree topologies are passive sections of cable to which stations
are attached. A transmission of a frame by any one station can be
heard by any other station. A star LAN includes a central node to
which stations are attached.
A set of standards has been defined for LANs that specifies a range of data rates and encompasses a variety of topologies and transmission media.
In most cases, an organization will have multiple LANs that need to be
interconnected. The simplest approach to meeting this requirement is
the bridge.
Hubs and switches form the basic building blocks of most LANs.
We turn now to a discussion of local area networks (LANs). Whereas wide
area networks may be public or private, LANs usually are owned by the organization that is using the network to interconnect equipment. LANs have
much greater capacity than wide area networks, to carry what is generally a
greater internal communications load.
In this chapter we look at the underlying technology and protocol architecture of LANs. Chapters 16 and 17 are devoted to a discussion of specific
LAN systems.
The variety of applications for LANs is wide. To provide some insight into the types
of requirements that LANs are intended to meet, this section provides a brief discussion of some of the most important general application areas for these networks.
Personal Computer LANs
A common LAN configuration is one that supports personal computers. With the
relatively low cost of such systems, individual managers within organizations often
independently procure personal computers for departmental applications, such as
spreadsheet and project management tools, and Internet access.
But a collection of department-level processors will not meet all of an organization’s needs; central processing facilities are still required. Some programs,
such as econometric forecasting models, are too big to run on a small computer.
Corporate-wide data files, such as accounting and payroll, require a centralized
facility but should be accessible to a number of users. In addition, there are other
kinds of files that, although specialized, must be shared by a number of users. Further, there are sound reasons for connecting individual intelligent workstations
not only to a central facility but to each other as well. Members of a project or
organization team need to share work and information. By far the most efficient
way to do so is digitally.
Certain expensive resources, such as a disk or a laser printer, can be shared by
all users of the departmental LAN. In addition, the network can tie into larger corporate network facilities. For example, the corporation may have a building-wide
LAN and a wide area private network. A communications server can provide controlled access to these resources.
LANs for the support of personal computers and workstations have become
nearly universal in organizations of all sizes. Even those sites that still depend heavily on the mainframe have transferred much of the processing load to networks of
personal computers. Perhaps the prime example of the way in which personal computers are being used is to implement client/server applications.
For personal computer networks, a key requirement is low cost. In particular,
the cost of attachment to the network must be significantly less than the cost of the
attached device. Thus, for the ordinary personal computer, an attachment cost in the
hundreds of dollars is desirable. For more expensive, high-performance workstations, higher attachment costs can be tolerated.
Backend Networks and Storage Area Networks
Backend networks are used to interconnect large systems such as mainframes,
supercomputers, and mass storage devices. The key requirement here is for bulk
data transfer among a limited number of devices in a small area. High reliability is
generally also a requirement. Typical characteristics include the following:
• High data rate: To satisfy the high-volume demand, data rates of 100 Mbps or
more are required.
• High-speed interface: Data transfer operations between a large host system
and a mass storage device are typically performed through high-speed parallel
I/O interfaces, rather than slower communications interfaces. Thus, the physical link between station and network must be high speed.
• Distributed access: Some sort of distributed medium access control (MAC)
technique is needed to enable a number of devices to share the transmission
medium with efficient and reliable access.
• Limited distance: Typically, a backend network will be employed in a computer room or a small number of contiguous rooms.
• Limited number of devices: The number of expensive mainframes and mass storage devices found in the computer room generally numbers in the tens of devices.
Typically, backend networks are found at sites of large companies or research
installations with large data processing budgets. Because of the scale involved, a
small difference in productivity can translate into a sizable difference in cost.
Consider a site that uses a dedicated mainframe computer. This implies a fairly
large application or set of applications. As the load at the site grows, the existing
mainframe may be replaced by a more powerful one, perhaps a multiprocessor system. At some sites, a single-system replacement will not be able to keep up; equipment performance growth rates will be exceeded by demand growth rates. The
facility will eventually require multiple independent computers. Again, there are
compelling reasons for interconnecting these systems. The cost of system interrupt is
high, so it should be possible, easily and quickly, to shift applications to backup systems. It must be possible to test new procedures and applications without degrading
the production system. Large bulk storage files must be accessible from more than
one computer. Load leveling should be possible to maximize utilization and performance.
It can be seen that some key requirements for backend networks differ from
those for personal computer LANs. High data rates are required to keep up with the
work, which typically involves the transfer of large blocks of data. The equipment
for achieving high speeds is expensive. Fortunately, given the much higher cost of
the attached devices, such costs are reasonable.
A concept related to that of the backend network is the storage area network
(SAN). A SAN is a separate network to handle storage needs. The SAN detaches
storage tasks from specific servers and creates a shared storage facility across a
high-speed network. The collection of networked storage devices can include hard
disks, tape libraries, and CD arrays. Most SANs use Fibre Channel, which is
described in Chapter 16. Figure 15.1 contrasts the SAN with the traditional server-based means of supporting shared storage. In a typical large LAN installation, a
number of servers and perhaps mainframes each has its own dedicated storage
devices. If a client needs access to a particular storage device, it must go through
the server that controls that device. In a SAN, no server sits between the storage
devices and the network; instead, the storage devices and servers are linked
directly to the network. The SAN arrangement improves client-to-storage access
efficiency, as well as direct storage-to-storage communications for backup and
replication functions.
Figure 15.1 The Use of Storage Area Networks [HURW98]: (a) server-based storage; (b) storage area network
High-Speed Office Networks
Traditionally, the office environment has included a variety of devices with low- to
medium-speed data transfer requirements. However, applications in today’s office
environment would overwhelm the limited speeds (up to 10 Mbps) of traditional
LANs. Desktop image processors have increased network data flow by an unprecedented amount. Examples of these applications include fax machines, document
image processors, and graphics programs on personal computers and workstations.
Consider that a typical page with 200 picture elements, or pels¹ (black or white
points), per inch resolution (which is adequate but not high resolution) generates
3,740,000 bits (8.5 inches × 11 inches × 40,000 pels per square inch). Even with
compression techniques, this will generate a tremendous load. In addition, disk technology and price/performance have evolved so that desktop storage capacities of
multiple gigabytes are common. These new demands require LANs with high speed
that can support the larger numbers and greater geographic extent of office systems
as compared to backend systems.
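The page-size arithmetic above is easy to verify; the sketch below also computes the transfer time such a page would need on a traditional 10-Mbps LAN (the figure quoted earlier), ignoring protocol overhead.

```python
# Size of one scanned page at 200 pels/inch (200 x 200 = 40,000 pels
# per square inch) on an 8.5 x 11 inch page, one bit per pel.
PELS_PER_SQ_INCH = 200 * 200              # 40,000
bits_per_page = 8.5 * 11 * PELS_PER_SQ_INCH
print(f"Bits per uncompressed page: {bits_per_page:,.0f}")   # 3,740,000

# Time to move one such page over a traditional 10-Mbps shared LAN,
# ignoring protocol overhead.
lan_rate_bps = 10_000_000
print(f"Transfer time at 10 Mbps: {bits_per_page / lan_rate_bps:.3f} s")
```

Even one such page ties up a 10-Mbps medium for over a third of a second, which is why image traffic motivates higher-speed LANs.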
Backbone LANs
The increasing use of distributed processing applications and personal computers has
led to a need for a flexible strategy for local networking. Support of premises-wide
data communications requires a networking service that is capable of spanning the distances involved and that interconnects equipment in a single (perhaps large) building or a cluster of buildings.

¹A picture element, or pel, is the smallest discrete scanning-line sample of a facsimile system, which contains only black-white information (no gray scales). A pixel is a picture element that contains gray-scale information.

Although it is possible to develop a single LAN to interconnect all the data processing equipment of a premises, this is probably not a practical
alternative in most cases. There are several drawbacks to a single-LAN strategy:
• Reliability: With a single LAN, a service interruption, even of short duration,
could result in a major disruption for users.
• Capacity: A single LAN could be saturated as the number of devices attached
to the network grows over time.
• Cost: A single LAN technology is not optimized for the diverse requirements
of interconnection and communication. The presence of large numbers of low-cost microcomputers dictates that network support for these devices be provided at low cost. LANs that support very-low-cost attachment will not be
suitable for meeting the overall requirement.
A more attractive alternative is to employ lower-cost, lower-capacity LANs within
buildings or departments and to interconnect these networks with a higher-capacity
LAN. This latter network is referred to as a backbone LAN. If confined to a single building or cluster of buildings, a high-capacity LAN can perform the backbone function.
The key elements of a LAN are
• Transmission medium
• Wiring layout
• Medium access control
Together, these elements determine not only the cost and capacity of the LAN, but
also the type of data that may be transmitted, the speed and efficiency of communications, and even the kinds of applications that can be supported.
This section provides a survey of the major technologies in the first two of
these categories. It will be seen that there is an interdependence among the choices
in different categories. Accordingly, a discussion of pros and cons relative to specific
applications is best done by looking at preferred combinations. This, in turn, is best
done in the context of standards, which is a subject of a later section.
In the context of a communication network, the term topology refers to the way in
which the end points, or stations, attached to the network are interconnected. The
common topologies for LANs are bus, tree, ring, and star (Figure 15.2). The bus is a
special case of the tree, with only one trunk and no branches.
Bus and Tree Topologies Both bus and tree topologies are characterized by the
use of a multipoint medium. For the bus, all stations attach, through appropriate hardware interfacing known as a tap, directly to a linear transmission medium, or bus. Full-duplex operation between the station and the tap allows data to be transmitted onto
the bus and received from the bus. A transmission from any station propagates the
length of the medium in both directions and can be received by all other stations. At
each end of the bus is a terminator, which absorbs any signal, removing it from the bus.

Figure 15.2 LAN Topologies: (a) bus; (b) tree; (c) ring; (d) star
The tree topology is a generalization of the bus topology. The transmission
medium is a branching cable with no closed loops. The tree layout begins at a point
known as the headend. One or more cables start at the headend, and each of these
may have branches. The branches in turn may have additional branches to allow
quite complex layouts. Again, a transmission from any station propagates throughout the medium and can be received by all other stations.
Two problems present themselves in this arrangement. First, because a transmission from any one station can be received by all other stations, there needs to be
some way of indicating for whom the transmission is intended. Second, a mechanism
is needed to regulate transmission. To see the reason for this, consider that if two stations on the bus attempt to transmit at the same time, their signals will overlap and
become garbled. Or consider that one station decides to transmit continuously for a
long period of time.
To solve these problems, stations transmit data in small blocks, known as
frames. Each frame consists of a portion of the data that a station wishes to transmit,
plus a frame header that contains control information. Each station on the bus is
assigned a unique address, or identifier, and the destination address for a frame is
included in its header.
Figure 15.3 illustrates the scheme. In this example, station C wishes to transmit
a frame of data to A. The frame header includes A’s address. As the frame propagates along the bus, it passes B. B observes the address and ignores the frame. A, on
the other hand, sees that the frame is addressed to itself and therefore copies the
data from the frame as it goes by.
So the frame structure solves the first problem mentioned previously: It provides a mechanism for indicating the intended recipient of data. It also provides the
basic tool for solving the second problem, the regulation of access. In particular, the
stations take turns sending frames in some cooperative fashion. This involves
putting additional control information into the frame header, as discussed later.
With the bus or tree, no special action needs to be taken to remove frames
from the medium. When a signal reaches the end of the medium, it is absorbed by
the terminator.
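As a sketch of the addressing behavior just described, where every station hears every frame but only the addressed station copies it, the following toy model may help; the class and station names are illustrative, not from any standard.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    dest: str      # destination station address (carried in the header)
    src: str
    data: str

class Station:
    def __init__(self, address):
        self.address = address
        self.received = []

    def hear(self, frame):
        # Every station on the bus sees every frame; only the addressed
        # station copies the data, all others ignore it.
        if frame.dest == self.address:
            self.received.append(frame.data)

class Bus:
    def __init__(self):
        self.stations = []

    def attach(self, station):
        self.stations.append(station)

    def transmit(self, frame):
        # The signal propagates in both directions and reaches every
        # station; the terminators absorb it at the ends, so no removal
        # step is modeled.
        for s in self.stations:
            s.hear(frame)

a, b, c = Station("A"), Station("B"), Station("C")
bus = Bus()
for s in (a, b, c):
    bus.attach(s)
bus.transmit(Frame(dest="A", src="C", data="hello"))
print(a.received, b.received)   # ['hello'] []
```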
Ring Topology In the ring topology, the network consists of a set of repeaters
joined by point-to-point links in a closed loop. The repeater is a comparatively simple device, capable of receiving data on one link and transmitting them, bit by bit, on
the other link as fast as they are received. The links are unidirectional; that is, data
are transmitted in one direction only, so that data circulate around the ring in one
direction (clockwise or counterclockwise).
Each station attaches to the network at a repeater and can transmit data onto
the network through the repeater. As with the bus and tree, data are transmitted in
frames. As a frame circulates past all the other stations, the destination station recognizes its address and copies the frame into a local buffer as it goes by. The frame
continues to circulate until it returns to the source station, where it is removed
(Figure 15.4). Because multiple stations share the ring, medium access control is
needed to determine at what time each station may insert frames.
Figure 15.3 Frame Transmission on a Bus LAN: (a) C transmits frame addressed to A; (b) frame is not addressed to B, so B ignores it; (c) A copies frame as it goes by
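The ring operation described above, in which each repeater passes the frame along, the destination copies it, and the source removes it on return, can be sketched in the same toy style; names are illustrative.

```python
class RingStation:
    def __init__(self, address):
        self.address = address
        self.buffer = []

def ring_transmit(stations, src_index, dest_addr, data):
    """Circulate a frame around the unidirectional ring, one repeater
    at a time, until it returns to the source, which removes it."""
    n = len(stations)
    i = (src_index + 1) % n
    while i != src_index:
        # Each repeater regenerates the frame; the destination also
        # copies it into a local buffer as it goes by.
        if stations[i].address == dest_addr:
            stations[i].buffer.append(data)
        i = (i + 1) % n
    # Frame has returned to the source and is removed from the ring.

ring = [RingStation(x) for x in "ABCD"]
ring_transmit(ring, src_index=2, dest_addr="A", data="hello")  # C sends to A
print([s.buffer for s in ring])   # [['hello'], [], [], []]
```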
Star Topology In the star LAN topology, each station is directly connected to a
common central node. Typically, each station attaches to a central node via two
point-to-point links, one for transmission and one for reception.
In general, there are two alternatives for the operation of the central node.
One approach is for the central node to operate in a broadcast fashion. A transmission of a frame from one station to the node is retransmitted on all of the outgoing links. In this case, although the arrangement is physically a star, it is logically
a bus: A transmission from any station is received by all other stations, and only
one station at a time may successfully transmit. In this case, the central element is
referred to as a hub. Another approach is for the central node to act as a frame-switching device. An incoming frame is buffered in the node and then retransmitted on an outgoing link to the destination station. These approaches are explored
in Section 15.5.
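The two central-node behaviors can be contrasted in a few lines; the classes below are an illustrative sketch, not a model of any particular product. The hub repeats an incoming frame on every other link, while the frame switch forwards it only toward the destination.

```python
class Hub:
    """Physically a star, logically a bus: every incoming frame is
    repeated on all outgoing links except the one it arrived on."""
    def forward(self, frame, in_port, ports):
        return [p for p in ports if p != in_port]

class FrameSwitch:
    """Buffers an incoming frame and retransmits it only on the link
    leading to the destination station."""
    def __init__(self, port_of):          # address -> port mapping
        self.port_of = port_of
    def forward(self, frame, in_port, ports):
        return [self.port_of[frame["dest"]]]

ports = [1, 2, 3, 4]
frame = {"dest": "B", "src": "A"}
print(Hub().forward(frame, in_port=1, ports=ports))            # [2, 3, 4]
print(FrameSwitch({"A": 1, "B": 3}).forward(frame, 1, ports))  # [3]
```

With the switch, only one link carries the frame, so several stations can transmit successfully at the same time; with the hub, only one station at a time may do so.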
Figure 15.4 Frame Transmission on a Ring LAN: (a) C transmits frame addressed to A; (b) frame is not addressed to B, so B ignores it; (c) A copies frame as it goes by; (d) C absorbs returning frame
Choice of Topology The choice of topology depends on a variety of factors,
including reliability, expandability, and performance. This choice is part of the overall task of designing a LAN and thus cannot be made in isolation, independent of
the choice of transmission medium, wiring layout, and access control technique. A
few general remarks can be made at this point. There are four alternative media that
can be used for a bus LAN:
• Twisted pair: In the early days of LAN development, voice-grade twisted pair
was used to provide an inexpensive, easily installed bus LAN. A number of
systems operating at 1 Mbps were implemented. Scaling twisted pair up to
higher data rates in a shared-medium bus configuration is not practical, so this
approach was dropped long ago.
• Baseband coaxial cable: A baseband coaxial cable is one that makes use of digital signaling. The original Ethernet scheme makes use of baseband coaxial cable.
• Broadband coaxial cable: Broadband coaxial cable is the type of cable used in
cable television systems. Analog signaling is used at radio and television frequencies. This type of system is more expensive and more difficult to install
and maintain than baseband coaxial cable. This approach never achieved popularity and such LANs are no longer made.
• Optical fiber: There has been considerable research relating to this alternative
over the years, but the expense of the optical fiber taps and the availability of
better alternatives have resulted in the demise of this option as well.
Thus, for a bus topology, only baseband coaxial cable has achieved widespread
use, primarily for Ethernet systems. Compared to a star-topology twisted pair or
optical fiber installation, the bus topology using baseband coaxial cable is difficult to
work with. Even simple changes may require access to the coaxial cable, movement
of taps, and rerouting of cable segments. Accordingly, few if any new installations
are being attempted. Despite its limitations, there is a considerable installed base of
baseband coaxial cable bus LANs.
Very-high-speed links over considerable distances can be used for the ring
topology. Hence, the ring has the potential of providing the best throughput of any
topology. One disadvantage of the ring is that a single link or repeater failure could
disable the entire network.
The star topology takes advantage of the natural layout of wiring in a building.
It is generally best for short distances and can support a small number of devices at
high data rates.
Choice of Transmission Medium The choice of transmission medium is
determined by a number of factors. It is, we shall see, constrained by the topology of
the LAN. Other factors come into play, including
• Capacity: to support the expected network traffic
• Reliability: to meet requirements for availability
• Types of data supported: tailored to the application
• Environmental scope: to provide service over the range of environments
The choice is part of the overall task of designing a local network, which is
addressed in Chapter 16. Here we can make a few general observations.
Voice-grade unshielded twisted pair (UTP) is an inexpensive, well-understood
medium; this is the Category 3 UTP referred to in Chapter 4. Typically, office buildings are wired to meet the anticipated telephone system demand plus a healthy margin; thus, there are no cable installation costs in the use of Category 3 UTP.
However, the data rate that can be supported is generally quite limited, with the
exception of very small LANs. Category 3 UTP is likely to be the most cost-effective
for a single-building, low-traffic LAN installation.
Shielded twisted pair and baseband coaxial cable are more expensive than
Category 3 UTP but provide greater capacity. Broadband cable is even more expensive but provides even greater capacity. However, in recent years, the trend has been
toward the use of high-performance UTP, especially Category 5 UTP. Category 5
UTP supports high data rates for a small number of devices, but larger installations
can be supported by the use of the star topology and the interconnection of the
switching elements in multiple star-topology configurations. We discuss this point in
Chapter 16.
Optical fiber has a number of attractive features, such as electromagnetic isolation, high capacity, and small size, which have attracted a great deal of interest. As
yet the market penetration of fiber LANs is low; this is primarily due to the high
cost of fiber components and the lack of skilled personnel to install and maintain
fiber systems. This situation is beginning to change rapidly as more products using
fiber are introduced.
The architecture of a LAN is best described in terms of a layering of protocols that
organize the basic functions of a LAN. This section opens with a description of the
standardized protocol architecture for LANs, which encompasses physical, medium
access control (MAC), and logical link control (LLC) layers. The physical layer
encompasses topology and transmission medium, and is covered in Section 15.2.
This section provides an overview of the MAC and LLC layers.
IEEE 802 Reference Model
Protocols defined specifically for LAN and MAN transmission address issues relating to the transmission of blocks of data over the network. In OSI terms, higher
layer protocols (layer 3 or 4 and above) are independent of network architecture
and are applicable to LANs, MANs, and WANs. Thus, a discussion of LAN protocols
is concerned principally with lower layers of the OSI model.
Figure 15.5 relates the LAN protocols to the OSI architecture (Figure 2.11).
This architecture was developed by the IEEE 802 LAN standards committee² and
has been adopted by all organizations working on the specification of LAN standards. It is generally referred to as the IEEE 802 reference model.
Working from the bottom up, the lowest layer of the IEEE 802 reference model
corresponds to the physical layer of the OSI model and includes such functions as
• Encoding/decoding of signals
• Preamble generation/removal (for synchronization)
• Bit transmission/reception
In addition, the physical layer of the 802 model includes a specification of the transmission medium and the topology. Generally, this is considered “below” the lowest
layer of the OSI model. However, the choice of transmission medium and topology
is critical in LAN design, and so a specification of the medium is included.
²This committee has developed standards for a wide range of LANs. See Appendix D for details.

Figure 15.5 IEEE 802 Protocol Layers Compared to OSI Model

Above the physical layer are the functions associated with providing service to
LAN users. These include
• On transmission, assemble data into a frame with address and error-detection fields.
• On reception, disassemble frame, and perform address recognition and error detection.
• Govern access to the LAN transmission medium.
• Provide an interface to higher layers and perform flow and error control.
These are functions typically associated with OSI layer 2. The set of functions
in the last bullet item are grouped into a logical link control (LLC) layer. The functions in the first three bullet items are treated as a separate layer, called medium
access control (MAC). The separation is done for the following reasons:
• The logic required to manage access to a shared-access medium is not found in
traditional layer 2 data link control.
• For the same LLC, several MAC options may be provided.
Figure 15.6 illustrates the relationship between the levels of the architecture
(compare Figure 2.9).

Figure 15.6 LAN Protocols in Context

Higher-level data are passed down to LLC, which appends control information as a header, creating an LLC protocol data unit (PDU). This control information is used in the operation of the LLC protocol. The entire LLC PDU
is then passed down to the MAC layer, which appends control information at the
front and back of the packet, forming a MAC frame. Again, the control information
in the frame is needed for the operation of the MAC protocol. For context, the figure
also shows the use of TCP/IP and an application layer above the LAN protocols.
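The encapsulation sequence of Figure 15.6 can be sketched as successive header wrapping, with the MAC layer also adding a trailer; the header contents here are placeholder strings, not real header formats.

```python
def encapsulate(app_data: bytes) -> bytes:
    # Each layer prepends its own control information; the MAC layer
    # also appends a trailer (the frame check sequence).
    tcp_segment = b"TCPHDR|" + app_data
    ip_datagram = b"IPHDR|"  + tcp_segment
    llc_pdu     = b"LLCHDR|" + ip_datagram        # LLC header only
    mac_frame   = b"MACHDR|" + llc_pdu + b"|FCS"  # header and trailer
    return mac_frame

print(encapsulate(b"application data"))
# b'MACHDR|LLCHDR|IPHDR|TCPHDR|application data|FCS'
```

On reception the process runs in reverse: each layer strips its own control information before passing the remainder up.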
Logical Link Control
The LLC layer for LANs is similar in many respects to other link layers in common
use. Like all link layers, LLC is concerned with the transmission of a link-level PDU
between two stations, without the necessity of an intermediate switching node. LLC
has two characteristics not shared by most other link control protocols:
1. It must support the multiaccess, shared-medium nature of the link (this differs
from a multidrop line in that there is no primary node).
2. It is relieved of some details of link access by the MAC layer.
Addressing in LLC involves specifying the source and destination LLC users.
Typically, a user is a higher-layer protocol or a network management function in the
station. These LLC user addresses are referred to as service access points (SAPs), in
keeping with OSI terminology for the user of a protocol layer.
We look first at the services that LLC provides to a higher-level user, and then
at the LLC protocol.
LLC Services LLC specifies the mechanisms for addressing stations across the
medium and for controlling the exchange of data between two users. The operation
and format of this standard is based on HDLC. Three services are provided as alternatives for attached devices using LLC:
• Unacknowledged connectionless service: This service is a datagram-style service. It is a very simple service that does not involve any of the flow- and error-control mechanisms. Thus, the delivery of data is not guaranteed. However, in
most devices, there will be some higher layer of software that deals with reliability issues.
• Connection-mode service: This service is similar to that offered by HDLC. A
logical connection is set up between two users exchanging data, and flow control and error control are provided.
• Acknowledged connectionless service: This is a cross between the previous
two services. It provides that datagrams are to be acknowledged, but no prior
logical connection is set up.
Typically, a vendor will provide these services as options that the customer can
select when purchasing the equipment. Alternatively, the customer can purchase
equipment that provides two or all three services and select a specific service based
on application.
The unacknowledged connectionless service requires minimum logic and is
useful in two contexts. First, it will often be the case that higher layers of software
will provide the necessary reliability and flow-control mechanism, and it is efficient
to avoid duplicating them. For example, TCP could provide the mechanisms
needed to ensure that data is delivered reliably. Second, there are instances in
which the overhead of connection establishment and maintenance is unjustified or
even counterproductive (for example, data collection activities that involve the
periodic sampling of data sources, such as sensors and automatic self-test reports
from security equipment or network components). In a monitoring application, the
loss of an occasional data unit would not cause distress, as the next report should
arrive shortly. Thus, in most cases, the unacknowledged connectionless service is
the preferred option.
The connection-mode service could be used in very simple devices, such as terminal controllers, that have little software operating above this level. In these cases,
it would provide the flow control and reliability mechanisms normally implemented
at higher layers of the communications software.
The acknowledged connectionless service is useful in several contexts. With the
connection-mode service, the logical link control software must maintain some sort
of table for each active connection, to keep track of the status of that connection. If
the user needs guaranteed delivery but there are a large number of destinations for
data, then the connection-mode service may be impractical because of the large number of tables required. An example is a process control or automated factory environment where a central site may need to communicate with a large number of
processors and programmable controllers. Another use of this is the handling of
important and time-critical alarm or emergency control signals in a factory. Because
of their importance, an acknowledgment is needed so that the sender can be assured
that the signal got through. Because of the urgency of the signal, the user might not
want to take the time first to establish a logical connection and then send the data.
LLC Protocol The basic LLC protocol is modeled after HDLC and has similar
functions and formats. The differences between the two protocols can be summarized as follows:
• LLC makes use of the asynchronous balanced mode of operation of HDLC, to
support connection-mode LLC service; this is referred to as type 2 operation.
The other HDLC modes are not employed.
• LLC supports an unacknowledged connectionless service using the unnumbered information PDU; this is known as type 1 operation.
• LLC supports an acknowledged connectionless service by using two new
unnumbered PDUs; this is known as type 3 operation.
• LLC permits multiplexing by the use of LLC service access points (LSAPs).
All three LLC protocols employ the same PDU format (Figure 15.7), which
consists of four fields. The DSAP (Destination Service Access Point) and SSAP
(Source Service Access Point) fields each contain a 7-bit address, which specify
the destination and source users of LLC. One bit of the DSAP indicates whether the
DSAP is an individual or group address. One bit of the SSAP indicates whether the
PDU is a command or response PDU. The format of the LLC control field is identical to that of HDLC (Figure 7.7), using extended (7-bit) sequence numbers.
Figure 15.7 LLC PDU in a Generic MAC Frame Format (DSAP and SSAP address fields are 1 octet each, the LLC control field 1 or 2 octets; I/G = Individual/Group, C/R = Command/Response)

For type 1 operation, which supports the unacknowledged connectionless service, the unnumbered information (UI) PDU is used to transfer user data. There is
no acknowledgment, flow control, or error control. However, there is error detection and discard at the MAC level.
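The DSAP and SSAP octets described above each carry a 7-bit address plus one flag bit (I/G for the DSAP, C/R for the SSAP). A small sketch of packing and unpacking such an octet follows; placing the flag in the least significant bit is an assumption of this sketch, not a claim about the standard's bit ordering.

```python
def pack_sap(address: int, flag: int) -> int:
    """Pack a 7-bit SAP address and a 1-bit flag into one octet.
    For the DSAP the flag is I/G (individual/group); for the SSAP
    it is C/R (command/response). Flag position is assumed here."""
    assert 0 <= address < 128 and flag in (0, 1)
    return (address << 1) | flag

def unpack_sap(octet: int):
    """Recover the 7-bit address and the flag bit from one octet."""
    return octet >> 1, octet & 1

dsap = pack_sap(address=0x41, flag=0)   # individual (not group) address
addr, ig = unpack_sap(dsap)
print(hex(dsap), hex(addr), ig)
```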
Two other PDUs are used to support management functions associated with
all three types of operation. Both PDUs are used in the following fashion. An LLC
entity may issue a command (C/R bit = 0) XID or TEST. The receiving LLC entity
issues a corresponding XID or TEST in response. The XID PDU is used to
exchange two types of information: types of operation supported and window size.
The TEST PDU is used to conduct a loopback test of the transmission path between
two LLC entities. Upon receipt of a TEST command PDU, the addressed LLC
entity issues a TEST response PDU as soon as possible.
With type 2 operation, a data link connection is established between two LLC
SAPs prior to data exchange. Connection establishment is attempted by the type 2
protocol in response to a request from a user. The LLC entity issues a SABME
PDU³ to request a logical connection with the other LLC entity. If the connection
is accepted by the LLC user designated by the DSAP, then the destination LLC
entity returns an unnumbered acknowledgment (UA) PDU. The connection is
henceforth uniquely identified by the pair of user SAPs. If the destination LLC
user rejects the connection request, its LLC entity returns a disconnected mode (DM) PDU.
³This stands for Set Asynchronous Balanced Mode Extended. It is used in HDLC to choose ABM and to select extended sequence numbers of seven bits. Both ABM and 7-bit sequence numbers are mandatory in type 2 operation.

Once the connection is established, data are exchanged using information
PDUs, as in HDLC. The information PDUs include send and receive sequence numbers, for sequencing and flow control. The supervisory PDUs are used, as in HDLC,
for flow control and error control. Either LLC entity can terminate a logical LLC
connection by issuing a disconnect (DISC) PDU.
With type 3 operation, each transmitted PDU is acknowledged. A new (not
found in HDLC) unnumbered PDU, the Acknowledged Connectionless (AC)
Information PDU, is defined. User data are sent in AC command PDUs and must be
acknowledged using an AC response PDU. To guard against lost PDUs, a 1-bit
sequence number is used. The sender alternates the use of 0 and 1 in its AC command PDU, and the receiver responds with an AC PDU with the opposite number
of the corresponding command. Only one PDU in each direction may be outstanding at any time.
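The alternating 1-bit sequence number in type 3 operation can be sketched as follows; the class names are illustrative. Note how a duplicate command PDU (retransmitted after a lost acknowledgment) is acknowledged again but not delivered twice.

```python
class Type3Sender:
    """Sends AC command PDUs with an alternating 1-bit sequence number;
    only one PDU may be outstanding at a time."""
    def __init__(self):
        self.seq = 0

    def send(self, data):
        pdu = {"seq": self.seq, "data": data}
        self.seq ^= 1          # alternate 0 and 1 for the next PDU
        return pdu

class Type3Receiver:
    def __init__(self):
        self.delivered = []
        self.expected = 0

    def receive(self, pdu):
        if pdu["seq"] == self.expected:
            self.delivered.append(pdu["data"])
            self.expected ^= 1
        # In either case respond with the opposite sequence number, so
        # a duplicate (retransmitted) PDU is acked but not re-delivered.
        return {"ack_seq": pdu["seq"] ^ 1}

s, r = Type3Sender(), Type3Receiver()
p1 = s.send("alarm")
r.receive(p1)
r.receive(p1)            # duplicate after a lost acknowledgment
print(r.delivered)       # ['alarm']  -- delivered only once
```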
Medium Access Control
All LANs and MANs consist of collections of devices that must share the network’s
transmission capacity. Some means of controlling access to the transmission
medium is needed to provide for an orderly and efficient use of that capacity. This is
the function of a medium access control (MAC) protocol.
The key parameters in any medium access control technique are where and
how. Where refers to whether control is exercised in a centralized or distributed
fashion. In a centralized scheme, a controller is designated that has the authority to
grant access to the network. A station wishing to transmit must wait until it receives
permission from the controller. In a decentralized network, the stations collectively
perform a medium access control function to determine dynamically the order in
which stations transmit. A centralized scheme has certain advantages, including
• It may afford greater control over access for providing such things as priorities, overrides, and guaranteed capacity.
• It enables the use of relatively simple access logic at each station.
• It avoids problems of distributed coordination among peer entities.
The principal disadvantages of centralized schemes are
• It creates a single point of failure; that is, there is a point in the network that, if
it fails, causes the entire network to fail.
• It may act as a bottleneck, reducing performance.
The pros and cons of distributed schemes are mirror images of the points just made.
The second parameter, how, is constrained by the topology and is a tradeoff
among competing factors, including cost, performance, and complexity. In general,
we can categorize access control techniques as being either synchronous or asynchronous. With synchronous techniques, a specific capacity is dedicated to a connection. This is the same approach used in circuit switching, frequency division
multiplexing (FDM), and synchronous time division multiplexing (TDM). Such
techniques are generally not optimal in LANs and MANs because the needs of the
stations are unpredictable. It is preferable to be able to allocate capacity in an asynchronous (dynamic) fashion, more or less in response to immediate demand. The
asynchronous approach can be further subdivided into three categories: round
robin, reservation, and contention.
Round Robin With round robin, each station in turn is given the opportunity to
transmit. During that opportunity, the station may decline to transmit or may transmit
subject to a specified upper bound, usually expressed as a maximum amount of data
transmitted or time for this opportunity. In any case, the station, when it is finished,
relinquishes its turn, and the right to transmit passes to the next station in logical
sequence. Control of sequence may be centralized or distributed. Polling is an example of a centralized technique.
When many stations have data to transmit over an extended period of time,
round-robin techniques can be very efficient. If only a few stations have data to
transmit over an extended period of time, then there is a considerable overhead in
passing the turn from station to station, because most of the stations will not transmit but simply pass their turns. Under such circumstances other techniques may be
preferable, largely depending on whether the data traffic has a stream or bursty
characteristic. Stream traffic is characterized by lengthy and fairly continuous transmissions; examples are voice communication, telemetry, and bulk file transfer.
Bursty traffic is characterized by short, sporadic transmissions; interactive terminal-host traffic fits this description.
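A centralized round-robin scheme such as polling can be sketched in a few lines; the controller's polling order, the station names, and the per-turn frame limit are all illustrative.

```python
def poll_round(controller_order, has_data, max_frames=1):
    """One polling cycle: the controller polls each station in turn;
    a polled station transmits up to max_frames frames or declines."""
    transmissions = []
    for station in controller_order:
        pending = has_data.get(station, 0)
        sent = min(pending, max_frames)
        if sent:
            transmissions.append((station, sent))
            has_data[station] = pending - sent
        # Whether or not it transmitted, the turn passes on.
    return transmissions

queue = {"A": 2, "B": 0, "C": 1}
print(poll_round(["A", "B", "C"], queue))   # [('A', 1), ('C', 1)]
print(queue)                                # {'A': 1, 'B': 0, 'C': 0}
```

The overhead problem noted above is visible here: a cycle in which every `pending` count is zero still polls every station yet moves no data.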
Reservation For stream traffic, reservation techniques are well suited. In general,
for these techniques, time on the medium is divided into slots, much as with synchronous TDM. A station wishing to transmit reserves future slots for an extended
or even an indefinite period. Again, reservations may be made in a centralized or
distributed fashion.
Contention For bursty traffic, contention techniques are usually appropriate. With
these techniques, no control is exercised to determine whose turn it is; all stations
contend for time in a way that can be, as we shall see, rather rough and tumble. These
techniques are of necessity distributed in nature. Their principal advantage is that
they are simple to implement and, under light to moderate load, efficient. For some
of these techniques, however, performance tends to collapse under heavy load.
Although both centralized and distributed reservation techniques have been
implemented in some LAN products, round-robin and contention techniques are
the most common.
MAC Frame Format The MAC layer receives a block of data from the LLC
layer and is responsible for performing functions related to medium access and for
transmitting the data. As with other protocol layers, the MAC layer implements these functions by making use of a protocol data unit at its layer. In this case, the PDU is referred
to as a MAC frame.
The exact format of the MAC frame differs somewhat for the various MAC
protocols in use. In general, all of the MAC frames have a format similar to that of
Figure 15.7. The fields of this frame are
• MAC Control: This field contains any protocol control information needed for
the functioning of the MAC protocol. For example, a priority level could be
indicated here.
• Destination MAC Address: The destination physical attachment point on the
LAN for this frame.
• Source MAC Address: The source physical attachment point on the LAN for
this frame.
• LLC: The LLC data from the next higher layer.
• CRC: The Cyclic Redundancy Check field (also known as the frame check
sequence, FCS, field). This is an error-detecting code, as we have seen in
HDLC and other data link control protocols (Chapter 7).
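The generic frame layout above can be sketched as follows (a hypothetical Python sketch; the field widths and the use of `zlib.crc32` as the error-detecting code are illustrative assumptions, not the exact format of any particular IEEE 802 standard):

```python
import struct
import zlib
from dataclasses import dataclass

@dataclass
class MacFrame:
    mac_control: int   # protocol control information (e.g., a priority level)
    dest_addr: bytes   # destination physical attachment point (6 octets here)
    src_addr: bytes    # source physical attachment point (6 octets here)
    llc_pdu: bytes     # LLC data from the next higher layer

    def pack(self) -> bytes:
        body = struct.pack("!H6s6s", self.mac_control,
                           self.dest_addr, self.src_addr) + self.llc_pdu
        return body + struct.pack("!I", zlib.crc32(body))  # append CRC field

def crc_ok(frame: bytes) -> bool:
    """MAC-level check: recompute the CRC and compare with the trailer."""
    body, (received,) = frame[:-4], struct.unpack("!I", frame[-4:])
    return zlib.crc32(body) == received
```

In keeping with the division of labor described below, a MAC entity using such a check simply discards a frame for which `crc_ok` returns false; a single flipped bit anywhere in the body changes the recomputed CRC.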
In most data link control protocols, the data link protocol entity is responsible
not only for detecting errors using the CRC, but for recovering from those errors by
retransmitting damaged frames. In the LAN protocol architecture, these two functions are split between the MAC and LLC layers. The MAC layer is responsible for
detecting errors and discarding any frames that are in error. The LLC layer optionally keeps track of which frames have been successfully received and retransmits
unsuccessful frames.
15.4 BRIDGES

In virtually all cases, there is a need to expand beyond the confines of a single LAN,
to provide interconnection to other LANs and to wide area networks. Two general
approaches are used for this purpose: bridges and routers. The bridge is the simpler
of the two devices and provides a means of interconnecting similar LANs. The
router is a more general-purpose device, capable of interconnecting a variety of
LANs and WANs. We explore bridges in this section and look at routers in Part Five.
The bridge is designed for use between local area networks (LANs) that use
identical protocols for the physical and link layers (e.g., all conforming to IEEE
802.3). Because the devices all use the same protocols, the amount of processing
required at the bridge is minimal. More sophisticated bridges are capable of mapping from one MAC format to another (e.g., to interconnect an Ethernet and a
token ring LAN).
Because the bridge is used in a situation in which all the LANs have the same
characteristics, the reader may ask, why not simply have one large LAN? Depending on circumstance, there are several reasons for the use of multiple LANs connected by bridges:
• Reliability: The danger in connecting all data processing devices in an organization to one network is that a fault on the network may disable communication for all devices. By using bridges, the network can be partitioned into
self-contained units.
• Performance: In general, performance on a LAN declines with an increase in
the number of devices or the length of the wire. A number of smaller LANs
will often give improved performance if devices can be clustered so that
intranetwork traffic significantly exceeds internetwork traffic.
• Security: The establishment of multiple LANs may improve security of communications. It is desirable to keep different types of traffic (e.g., accounting,
personnel, strategic planning) that have different security needs on physically
separate media. At the same time, the different types of users with different
levels of security need to communicate through controlled and monitored mechanisms.
• Geography: Clearly, two separate LANs are needed to support devices clustered in two geographically distant locations. Even in the case of two buildings
separated by a highway, it may be far easier to use a microwave bridge link
than to attempt to string coaxial cable between the two buildings.
Functions of a Bridge
Figure 15.8 illustrates the action of a bridge connecting two LANs, A and B, using
the same MAC protocol. In this example, a single bridge attaches to both LANs; frequently, the bridge function is performed by two “half-bridges,” one on each LAN.
The functions of the bridge are few and simple:
• Read all frames transmitted on A and accept those addressed to any station on B.
• Using the medium access control protocol for B, retransmit each frame on B.
• Do the same for B-to-A traffic.
Several design aspects of a bridge are worth highlighting:
• The bridge makes no modification to the content or format of the frames it
receives, nor does it encapsulate them with an additional header. Each frame
to be transferred is simply copied from one LAN and repeated with exactly
the same bit pattern on the other LAN. Because the two LANs use the same
LAN protocols, it is permissible to do this.
• The bridge should contain enough buffer space to meet peak demands. Over a
short period of time, frames may arrive faster than they can be retransmitted.
• The bridge must contain addressing and routing intelligence. At a minimum,
the bridge must know which addresses are on each network to know which
frames to pass. Further, there may be more than two LANs interconnected by
a number of bridges. In that case, a frame may have to be routed through several bridges in its journey from source to destination.
• A bridge may connect more than two LANs.
In summary, the bridge provides an extension to the LAN that requires no
modification to the communications software in the stations attached to the LANs.
It appears to all stations on the two (or more) LANs that there is a single LAN on
which each station has a unique address. The station uses that unique address and
need not explicitly discriminate between stations on the same LAN and stations on
other LANs; the bridge takes care of that.
Bridge Protocol Architecture
The IEEE 802.1D specification defines the protocol architecture for MAC bridges.
Figure 15.8 Bridge Operation (stations 1 through 10 attach to LAN A and stations 11 through 20 to LAN B; the bridge accepts frames with addresses 11 through 20 and repeats them on LAN B, and accepts frames with addresses 1 through 10 and repeats them on LAN A)

Figure 15.9 Connection of Two LANs by a Bridge: (a) architecture; (b) operation

Within the 802 architecture, the endpoint or station address is designated at the MAC level. Thus, it is at the MAC level that a bridge can function. Figure 15.9 shows
the simplest case, which consists of two LANs connected by a single bridge. The
LANs employ the same MAC and LLC protocols. The bridge operates as previously
described. A MAC frame whose destination is not on the immediate LAN is captured by the bridge, buffered briefly, and then transmitted on the other LAN. As far
as the LLC layer is concerned, there is a dialogue between peer LLC entities in the
two endpoint stations. The bridge need not contain an LLC layer because it is
merely serving to relay the MAC frames.
Figure 15.9b indicates the way in which data are encapsulated using a bridge.
Data are provided by some user to LLC. The LLC entity appends a header and
passes the resulting data unit to the MAC entity, which appends a header and a
trailer to form a MAC frame. On the basis of the destination MAC address in the
frame, it is captured by the bridge. The bridge does not strip off the MAC fields; its
function is to relay the MAC frame intact to the destination LAN. Thus, the frame
is deposited on the destination LAN and captured by the destination station.
The concept of a MAC relay bridge is not limited to the use of a single bridge
to connect two nearby LANs. If the LANs are some distance apart, then they can be
connected by two bridges that are in turn connected by a communications facility.
The intervening communications facility can be a network, such as a wide area
packet-switching network, or a point-to-point link. In such cases, when a bridge captures a MAC frame, it must encapsulate the frame in the appropriate packaging and
transmit it over the communications facility to a target bridge. The target bridge
strips off these extra fields and transmits the original, unmodified MAC frame to the
destination station.
Fixed Routing
There is a trend within many organizations to an increasing number of LANs interconnected by bridges. As the number of LANs grows, it becomes important to
provide alternate paths between LANs via bridges for load balancing and reconfiguration in response to failure. Thus, many organizations will find that static, preconfigured routing tables are inadequate and that some sort of dynamic routing is needed.
Consider the configuration of Figure 15.10. Suppose that station 1 transmits a
frame on LAN A intended for station 6. The frame will be read by bridges 101, 102,
and 107. For each bridge, the addressed station is not on a LAN to which the bridge
is attached. Therefore, each bridge must make a decision whether or not to retransmit the frame on its other LAN, in order to move it closer to its intended destination. In this case, bridge 102 should repeat the frame on LAN C, whereas bridges
101 and 107 should refrain from retransmitting the frame. Once the frame has been
transmitted on LAN C, it will be picked up by both bridges 105 and 106. Again,
each must decide whether or not to forward the frame. In this case, bridge 105
should retransmit the frame on LAN F, where it will be received by the destination,
station 6.
Thus we see that, in the general case, the bridge must be equipped with a routing capability. When a bridge receives a frame, it must decide whether or not to forward it. If the bridge is attached to two or more networks, then it must decide whether or not to forward the frame and, if so, on which LAN the frame should be transmitted.

Figure 15.10 Configuration of Bridges and LANs, with Alternate Routes
The routing decision may not always be a simple one. Figure 15.10 also shows
that there are two routes between LAN A and LAN E. Such redundancy provides
for higher overall internet availability and creates the possibility for load balancing. In this case, if station 1 transmits a frame on LAN A intended for station 5 on
LAN E, then either bridge 101 or bridge 107 could forward the frame. It would
appear preferable for bridge 107 to forward the frame, since it will involve only one
hop, whereas if the frame travels through bridge 101, it must suffer two hops.
Another consideration is that there may be changes in the configuration. For example, bridge 107 may fail, in which case subsequent frames from station 1 to station 5
should go through bridge 101. So we can say that the routing capability must take
into account the topology of the internet configuration and may need to be dynamically altered.
A variety of routing strategies have been proposed and implemented in recent
years. The simplest and most common strategy is fixed routing. This strategy is suitable for small internets and for internets that are relatively stable. In addition, two
groups within the IEEE 802 committee have developed specifications for routing
strategies. The IEEE 802.1 group has issued a standard for routing based on the use
of a spanning tree algorithm. The token ring committee, IEEE 802.5, has issued its
own specification, referred to as source routing. In the remainder of this section, we
look at fixed routing and the spanning tree algorithm, which is the most commonly
used bridge routing algorithm.
For fixed routing, a route is selected for each source-destination pair of LANs
in the configuration. If alternate routes are available between two LANs, then typically the route with the least number of hops is selected. The routes are fixed, or at
least only change when there is a change in the topology of the internet.
The strategy for developing a fixed routing configuration for bridges is similar
to that employed in a packet-switching network (Figure 12.2). A central routing
matrix is created, to be stored perhaps at a network control center. The matrix
shows, for each source-destination pair of LANs, the identity of the first bridge on
the route. So, for example, the route from LAN E to LAN F begins by going through
bridge 107 to LAN A. Again consulting the matrix, the route from LAN A to LAN
F goes through bridge 102 to LAN C. Finally, the route from LAN C to LAN F is
directly through bridge 105. Thus the complete route from LAN E to LAN F is
bridge 107, LAN A, bridge 102, LAN C, bridge 105.
From this overall matrix, routing tables can be developed and stored at each
bridge. Each bridge needs one table for each LAN to which it attaches. The information for each table is derived from a single row of the matrix. For example, bridge
105 has two tables, one for frames arriving from LAN C and one for frames arriving
from LAN F. The table shows, for each possible destination MAC address, the identity of the LAN to which the bridge should forward the frame.
Once the directories have been established, routing is a simple matter. A
bridge copies each incoming frame on each of its LANs. If the destination MAC
address corresponds to an entry in its routing table, the frame is retransmitted on
the appropriate LAN.
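The table-driven lookup can be sketched as follows (illustrative Python; the dictionary holds only the matrix entries for routes toward LAN F that the text quotes for Figure 15.10, and the names are hypothetical):

```python
# First bridge and next LAN for each (source LAN, destination LAN) pair,
# limited to the entries toward LAN F quoted in the text for Figure 15.10.
FIRST_HOP = {
    ("E", "F"): ("bridge 107", "A"),
    ("A", "F"): ("bridge 102", "C"),
    ("C", "F"): ("bridge 105", "F"),
}

def fixed_route(src_lan, dst_lan):
    """Expand the central routing matrix into a complete route by
    repeatedly looking up the first hop from the current LAN."""
    route, lan = [src_lan], src_lan
    while lan != dst_lan:
        bridge, lan = FIRST_HOP[(lan, dst_lan)]
        route += [bridge, lan]
    return route
```

Here `fixed_route("E", "F")` reproduces the route derived above: LAN E, bridge 107, LAN A, bridge 102, LAN C, bridge 105, LAN F.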
The fixed routing strategy is widely used in commercially available products. It
requires that a network manager manually load the data into the routing tables. It
has the advantage of simplicity and minimal processing requirements. However, in a
complex internet, in which bridges may be dynamically added and in which failures
must be allowed for, this strategy is too limited.
The Spanning Tree Approach
The spanning tree approach is a mechanism in which bridges automatically develop
a routing table and update that table in response to changing topology. The algorithm consists of three mechanisms: frame forwarding, address learning, and loop resolution.
Frame Forwarding In this scheme, a bridge maintains a forwarding database for
each port attached to a LAN. The database indicates the station addresses for which
frames should be forwarded through that port. We can interpret this in the following
fashion. For each port, a list of stations is maintained. A station is on the list if it is on
the “same side” of the bridge as the port. For example, for bridge 102 of Figure
15.10, stations on LANs C, F, and G are on the same side of the bridge as the LAN
C port, and stations on LANs A, B, D, and E are on the same side of the bridge as
the LAN A port. When a frame is received on any port, the bridge must decide
whether that frame is to be forwarded through the bridge and out through one of
the bridge’s other ports. Suppose that a bridge receives a MAC frame on port x. The
following rules are applied:
1. Search the forwarding database to determine if the MAC address is listed for
any port except port x.
2. If the destination MAC address is not found, forward the frame out all ports except the one from which it was received. This is part of the learning process described subsequently.
3. If the destination address is in the forwarding database for some port y, then
determine whether port y is in a blocking or forwarding state. For reasons
explained later, a port may sometimes be blocked, which prevents it from receiving or transmitting frames.
4. If port y is not blocked, transmit the frame through port y onto the LAN to
which that port attaches.
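A minimal Python sketch of these four rules, combined with the source-address learning and entry aging described in the surrounding text (the class and field names are hypothetical; a real 802.1D bridge maintains this state per port, typically in hardware):

```python
import time

class TransparentBridge:
    """Sketch of the forwarding rules plus source-address learning.
    Database entries carry a timer so stale direction information ages out."""

    def __init__(self, ports, ageing_time=300.0):
        self.ports = list(ports)    # port identifiers
        self.blocked = set()        # ports currently in the blocking state
        self.db = {}                # MAC address -> (port, time last seen)
        self.ageing_time = ageing_time

    def receive(self, frame, in_port, now=None):
        """Return the list of ports on which the frame is forwarded."""
        now = time.time() if now is None else now
        # Address learning: the source lies in the direction of in_port.
        self.db[frame["src"]] = (in_port, now)
        entry = self.db.get(frame["dst"])
        if entry and now - entry[1] > self.ageing_time:
            del self.db[frame["dst"]]      # timer expired: forget the entry
            entry = None
        if entry is None:                  # rule 2: unknown destination, flood
            return [p for p in self.ports
                    if p != in_port and p not in self.blocked]
        port_y, _ = entry
        if port_y == in_port or port_y in self.blocked:
            return []                      # rules 1 and 3: filter or blocked
        return [port_y]                    # rule 4: forward through port y
```

After a bridge with ports 1, 2, and 3 sees a frame from station A on port 1, a later frame addressed to A arriving on port 2 is forwarded on port 1 only; frames to as-yet-unknown destinations are flooded on all other unblocked ports.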
Address Learning The preceding scheme assumes that the bridge is already
equipped with a forwarding database that indicates the direction, from the bridge, of
each destination station. This information can be preloaded into the bridge, as in
fixed routing. However, an effective automatic mechanism for learning the direction
of each station is desirable. A simple scheme for acquiring this information is based
on the use of the source address field in each MAC frame.
The strategy is this. When a frame arrives on a particular port, it clearly has
come from the direction of the incoming LAN. The source address field of the
frame indicates the source station. Thus, a bridge can update its forwarding database for that port on the basis of the source address field of each incoming frame.
To allow for changes in topology, each element in the database is equipped with a
timer. When a new element is added to the database, its timer is set. If the timer
expires, then the element is eliminated from the database, since the corresponding direction information may no longer be valid. Each time a frame is received,
its source address is checked against the database. If the element is already in the
database, the entry is updated (the direction may have changed) and the timer is
reset. If the element is not in the database, a new entry is created, with its own
Spanning Tree Algorithm The address learning mechanism described previously is effective if the topology of the internet is a tree; that is, if there are no alternate routes in the network. The existence of alternate routes means that there is a
closed loop. For example, in Figure 15.10, the following is a closed loop: LAN A,
bridge 101, LAN B, bridge 104, LAN E, bridge 107, LAN A.
To see the problem created by a closed loop, consider Figure 15.11. At time t0, station A transmits a frame addressed to station B. The frame is captured by both bridges. Each bridge updates its database to indicate that station A is in the direction of LAN X, and retransmits the frame on LAN Y. Say that bridge a retransmits at time t1 and bridge b a short time later, at t2. Thus B will receive two copies of the
frame. Furthermore, each bridge will receive the other’s transmission on LAN Y.
Note that each transmission is a frame with a source address of A and a destination
address of B. Thus each bridge will update its database to indicate that station A is in
the direction of LAN Y. Neither bridge is now capable of forwarding a frame addressed to station A.

Figure 15.11 Loop of Bridges
To overcome this problem, a simple result from graph theory is used: For any
connected graph, consisting of nodes and edges connecting pairs of nodes, there is a
spanning tree of edges that maintains the connectivity of the graph but contains no
closed loops. In terms of internets, each LAN corresponds to a graph node, and each
bridge corresponds to a graph edge. Thus, in Figure 15.10, the removal of one (and
only one) of bridges 107, 101, and 104, results in a spanning tree. What is desired is to
develop a simple algorithm by which the bridges of the internet can exchange sufficient information to automatically (without user intervention) derive a spanning
tree. The algorithm must be dynamic. That is, when a topology change occurs, the
bridges must be able to discover this fact and automatically derive a new spanning tree.
The spanning tree algorithm developed by IEEE 802.1, as the name suggests, is able to develop such a spanning tree. All that is required is that each
bridge be assigned a unique identifier and that costs be assigned to each bridge
port. In the absence of any special considerations, all costs could be set equal; this
produces a minimum-hop tree. The algorithm involves a brief exchange of messages among all of the bridges to discover the minimum-cost spanning tree.
Whenever there is a change in topology, the bridges automatically recalculate the
spanning tree.
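The graph-theoretic idea can be illustrated with a small Python sketch that derives a minimum-hop spanning tree by breadth-first search over the LAN graph. This shows only the result, not the method: the actual IEEE 802.1 algorithm elects a root bridge and exchanges configuration messages rather than running a centralized search, and the edge list below is a partial, assumed reconstruction of Figure 15.10 using the bridge and LAN labels from the text.

```python
from collections import deque

def spanning_tree(bridges, root_lan):
    """Breadth-first search over the LAN graph (LANs are nodes, bridges
    are edges); a bridge that would close a loop is left out, which
    corresponds to putting its ports into the blocking state."""
    kept, visited, frontier = set(), {root_lan}, deque([root_lan])
    while frontier:
        lan = frontier.popleft()
        for bridge_id, (x, y) in sorted(bridges.items()):
            if lan in (x, y):
                other = y if lan == x else x
                if other not in visited:
                    visited.add(other)
                    kept.add(bridge_id)
                    frontier.append(other)
    return kept

# Partial edge list loosely following Figure 15.10 (assumed reconstruction):
bridges = {101: ("A", "B"), 102: ("A", "C"), 104: ("B", "E"),
           105: ("C", "F"), 107: ("A", "E")}
```

With this edge list, `spanning_tree(bridges, "A")` keeps bridges 101, 102, 105, and 107 and leaves out bridge 104, eliminating the closed loop LAN A, bridge 101, LAN B, bridge 104, LAN E, bridge 107, LAN A.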
15.5 LAYER 2 AND LAYER 3 SWITCHES

In recent years, there has been a proliferation of types of devices for interconnecting
LANs that goes beyond the bridges discussed in Section 15.4 and the routers discussed in Part Five. These devices can conveniently be grouped into the categories
of layer 2 switches and layer 3 switches. We begin with a discussion of hubs and then
explore these two concepts.
Hubs

Earlier, we used the term hub in reference to a star-topology LAN. The hub is the
active central element of the star layout. Each station is connected to the hub by two
lines (transmit and receive). The hub acts as a repeater: When a single station transmits, the hub repeats the signal on the outgoing line to each station. Ordinarily, the
line consists of two unshielded twisted pairs. Because of the high data rate and the
poor transmission qualities of unshielded twisted pair, the length of a line is limited
to about 100 m. As an alternative, an optical fiber link may be used. In this case, the
maximum length is about 500 m.
Note that although this scheme is physically a star, it is logically a bus: A transmission from any one station is received by all other stations, and if two stations
transmit at the same time there will be a collision.
Multiple levels of hubs can be cascaded in a hierarchical configuration.
Figure 15.12 illustrates a two-level configuration. There is one header hub
(HHUB) and one or more intermediate hubs (IHUB). Each hub may have a
mixture of stations and other hubs attached to it from below.

Figure 15.12 Two-Level Star Topology (stations and hubs connect via two cables, twisted pair or optical fiber)

This layout fits well
with building wiring practices. Typically, there is a wiring closet on each floor of an
office building, and a hub can be placed in each one. Each hub could service the
stations on its floor.
Layer 2 Switches
In recent years, a new device, the layer 2 switch, has replaced the hub in popularity,
particularly for high-speed LANs. The layer 2 switch is also sometimes referred to as
a switching hub.
To clarify the distinction between hubs and switches, Figure 15.13a shows a
typical bus layout of a traditional 10-Mbps LAN. A bus is installed that is laid out so
that all the devices to be attached are in reasonable proximity to a point on the bus.
In the figure, station B is transmitting. This transmission goes from B, across the lead
from B to the bus, along the bus in both directions, and along the access lines of each
of the other attached stations. In this configuration, all the stations must share the
total capacity of the bus, which is 10 Mbps.
A hub, often in a building wiring closet, uses a star wiring arrangement to
attach stations to the hub. In this arrangement, a transmission from any one station
is received by the hub and retransmitted on all of the outgoing lines. Therefore, to
avoid collision, only one station can transmit at a time. Again, the total capacity of
the LAN is 10 Mbps. The hub has several advantages over the simple bus arrangement. It exploits standard building wiring practices in the layout of cable. In addition, the hub can be configured to recognize a malfunctioning station that is jamming the network and to cut that station out of the network.

Figure 15.13 LAN Hubs and Switches: (a) shared medium bus, total capacity 10 Mbps; (b) shared medium hub, total capacity up to 10 Mbps; (c) layer 2 switch, total capacity N × 10 Mbps

Figure 15.13b illustrates the operation of a hub. Here again, station B is transmitting. This transmission
goes from B, across the transmit line from B to the hub, and from the hub along the
receive lines of each of the other attached stations.
We can achieve greater performance with a layer 2 switch. In this case, the central hub acts as a switch, much as a packet switch or circuit switch. With a layer 2
switch, an incoming frame from a particular station is switched to the appropriate
output line to be delivered to the intended destination. At the same time, other
unused lines can be used for switching other traffic. Figure 15.13c shows an example
in which B is transmitting a frame to A and at the same time C is transmitting a
frame to D. So, in this example, the current throughput on the LAN is 20 Mbps,
although each individual device is limited to 10 Mbps. The layer 2 switch has several
attractive features:
1. No change is required to the software or hardware of the attached devices to
convert a bus LAN or a hub LAN to a switched LAN. In the case of an Ethernet LAN, each attached device continues to use the Ethernet medium access
control protocol to access the LAN. From the point of view of the attached
devices, nothing has changed in the access logic.
2. Each attached device has a dedicated capacity equal to that of the entire original
LAN, assuming that the layer 2 switch has sufficient capacity to keep up with all
attached devices. For example, in Figure 15.13c, if the layer 2 switch can sustain a
throughput of 20 Mbps, each attached device appears to have a dedicated capacity for either input or output of 10 Mbps.
3. The layer 2 switch scales easily. Additional devices can be attached to the layer
2 switch by increasing the capacity of the layer 2 switch correspondingly.
Two types of layer 2 switches are available as commercial products:
• Store-and-forward switch: The layer 2 switch accepts a frame on an input
line, buffers it briefly, and then routes it to the appropriate output line.
• Cut-through switch: The layer 2 switch takes advantage of the fact that the
destination address appears at the beginning of the MAC (medium access
control) frame. The layer 2 switch begins repeating the incoming frame onto
the appropriate output line as soon as the layer 2 switch recognizes the destination address.
The cut-through switch yields the highest possible throughput but at some risk
of propagating bad frames, because the switch is not able to check the CRC prior to
retransmission. The store-and-forward switch involves a delay between sender and
receiver but boosts the overall integrity of the network.
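The latency difference can be made concrete with a back-of-the-envelope calculation (illustrative Python; the 1500-byte frame, the 10-Mbps port rate, and the assumption that only the 6-byte destination address must arrive before cut-through forwarding begins are example numbers, and a real cut-through switch must also wait out any preamble):

```python
def serialization_delay(bits, rate_bps):
    """Time to clock the given number of bits onto or off a line."""
    return bits / rate_bps

RATE = 10_000_000                   # 10-Mbps port (illustrative)
FRAME_BITS = 1500 * 8               # one 1500-byte frame

# Store-and-forward: the whole frame is buffered before retransmission.
saf_delay = serialization_delay(FRAME_BITS, RATE)
# Cut-through: forwarding starts once the destination address (taken here
# as the first 6 bytes of the MAC frame) has been received.
ct_delay = serialization_delay(6 * 8, RATE)

print(f"store-and-forward: {saf_delay * 1e6:.0f} us per switch")
print(f"cut-through:       {ct_delay * 1e6:.1f} us per switch")
```

Under these assumptions, store-and-forward adds 1200 microseconds per switch for a full-size frame, while cut-through adds under 5 microseconds, which is why cut-through yields the higher throughput despite its inability to check the CRC before retransmission.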
A layer 2 switch can be viewed as a full-duplex version of the hub. It can also
incorporate logic that allows it to function as a multiport bridge. [BREY99] lists the
following differences between layer 2 switches and bridges:
• Bridge frame handling is done in software. A layer 2 switch performs the
address recognition and frame forwarding functions in hardware.
• A bridge can typically only analyze and forward one frame at a time, whereas
a layer 2 switch has multiple parallel data paths and can handle multiple
frames at a time.
• A bridge uses store-and-forward operation. With a layer 2 switch, it is possible
to have cut-through instead of store-and-forward operation.
Because a layer 2 switch has higher performance and can incorporate the
functions of a bridge, the bridge has suffered commercially. New installations typically include layer 2 switches with bridge functionality rather than bridges.
Layer 3 Switches
Layer 2 switches provide increased performance to meet the needs of high-volume
traffic generated by personal computers, workstations, and servers. However, as the
number of devices in a building or complex of buildings grows, layer 2 switches
reveal some inadequacies. Two problems in particular present themselves: broadcast
overload and the lack of multiple links.
A set of devices and LANs connected by layer 2 switches is considered to have
a flat address space. The term flat means that all users share a common MAC broadcast address. Thus, if any device issues a MAC frame with a broadcast address, that
frame is to be delivered to all devices attached to the overall network connected by
layer 2 switches and/or bridges. In a large network, frequent transmission of broadcast frames can create tremendous overhead. Worse, a malfunctioning device can
create a broadcast storm, in which numerous broadcast frames clog the network and
crowd out legitimate traffic.
A second performance-related problem with the use of bridges and/or layer
2 switches is that the current standards for bridge protocols dictate that there be no
closed loops in the network. That is, there can only be one path between any two
devices. Thus, it is impossible, in a standards-based implementation, to provide multiple paths through multiple switches between devices. This restriction limits both
performance and reliability.
To overcome these problems, it seems logical to break up a large local network
into a number of subnetworks connected by routers. A MAC broadcast frame is
then limited to only the devices and switches contained in a single subnetwork. Furthermore, IP-based routers employ sophisticated routing algorithms that allow the
use of multiple paths between subnetworks going through different routers.
However, the problem with using routers to overcome some of the inadequacies of bridges and layer 2 switches is that routers typically do all of the IP-level
processing involved in the forwarding of IP traffic in software. High-speed LANs
and high-performance layer 2 switches may pump millions of packets per second,
whereas a software-based router may only be able to handle well under a million
packets per second. To accommodate such a load, a number of vendors have developed layer 3 switches, which implement the packet-forwarding logic of the router
in hardware.
There are a number of different layer 3 schemes on the market, but fundamentally they fall into two categories: packet by packet and flow based. The packet-by-packet switch operates in the same fashion as a traditional router. Because
the forwarding logic is in hardware, the packet-by-packet switch can achieve an
order of magnitude increase in performance compared to the software-based
router. A flow-based switch tries to enhance performance by identifying flows of IP
packets that have the same source and destination. This can be done by observing
ongoing traffic or by using a special flow label in the packet header (allowed in IPv6
but not IPv4). Once a flow is identified, a predefined route can be established
through the network to speed up the forwarding process. Again, huge performance
increases over a pure software-based router are achieved.
Figure 15.14 is a typical example of the approach taken to local networking in
an organization with a large number of PCs and workstations (thousands to tens of
thousands). Desktop systems have links of 10 Mbps to 100 Mbps into a LAN controlled by a layer 2 switch. Wireless LAN connectivity is also likely to be available
for mobile users. Layer 3 switches are at the local network’s core, forming a local
backbone. Typically, these switches are interconnected at 1 Gbps and connect to
layer 2 switches at from 100 Mbps to 1 Gbps. Servers connect directly to layer 2 or
layer 3 switches at 1 Gbps or possibly 100 Mbps. A lower-cost software-based router provides WAN connection. The circles in the figure identify separate LAN subnetworks; a MAC broadcast frame is limited to its own subnetwork.

Figure 15.14 Typical Premises Network Configuration
The material in this chapter is covered in much more depth in [STAL00]. [REGA04] and [FORO02] also provide extensive coverage. [METZ99] is an excellent treatment of layer 2 and layer 3 switches, with a detailed discussion of products and case studies. Another comprehensive account is [SEIF00].
FORO02 Forouzan, B., and Chung, S. Local Area Networks. New York: McGraw-Hill, 2002.
METZ99 Metzler, J., and DeNoia, L. Layer 2 Switching. Upper Saddle River, NJ: Prentice Hall, 1999.
REGA04 Regan, P. Local Area Networks. Upper Saddle River, NJ: Prentice Hall, 2004.
SEIF00 Seifert, R. The Switch Book. New York: Wiley, 2000.
STAL00 Stallings, W. Local and Metropolitan Area Networks, Sixth Edition. Upper
Saddle River, NJ: Prentice Hall, 2000.
Recommended Web site:
• IEEE 802 LAN/MAN Standards Committee: Status and documents for all of the
working groups
Key Terms
local area network (LAN)
logical link control
medium access control (MAC)
ring topology
spanning tree
bus topology
layer 2 switch
layer 3 switch
star topology
tree topology
storage area networks (SAN)
Review Questions
How do the key requirements for computer room networks differ from those for personal computer local networks?
What are the differences among backend LANs, SANs, and backbone LANs?
What is network topology?
List four common LAN topologies and briefly describe their methods of operation.
What is the purpose of the IEEE 802 committee?
Why are there multiple LAN standards?
List and briefly define the services provided by LLC.
List and briefly define the types of operation provided by the LLC protocol.
List some basic functions performed at the MAC layer.
What functions are performed by a bridge?
What is a spanning tree?
What is the difference between a hub and a layer 2 switch?
What is the difference between a store-and-forward switch and a cut-through switch?
Instead of LLC, could HDLC be used as a data link control protocol for a LAN? If
not, what is lacking?
An asynchronous device, such as a teletype, transmits characters one at a time with
unpredictable delays between characters. What problems, if any, do you foresee if
such a device is connected to a LAN and allowed to transmit at will (subject to gaining access to the medium)? How might such problems be resolved?
Consider the transfer of a file containing one million 8-bit characters from one station
to another. What is the total elapsed time and effective throughput for the following cases?
a. A circuit-switched, star-topology local network. Call setup time is negligible and
the data rate on the medium is 64 kbps.
b. A bus topology local network with two stations a distance D apart, a data rate of
B bps, and a frame size of P with 80 bits of overhead per frame. Each frame is
acknowledged with an 88-bit frame before the next is sent. The propagation speed
on the bus is 200 m/µs. Solve for:
1. D = 1 km, B = 1 Mbps, P = 256 bits
2. D = 1 km, B = 10 Mbps, P = 256 bits
3. D = 10 km, B = 1 Mbps, P = 256 bits
4. D = 1 km, B = 50 Mbps, P = 10,000 bits
c. A ring topology local network with a total circular length of 2D, with the two stations a distance D apart. Acknowledgment is achieved by allowing a frame to circulate past the destination station, back to the source station, with an
acknowledgment bit set by the destination. There are N repeaters on the ring,
each of which introduces a delay of one bit time. Repeat the calculation for each
of b1 through b4 for N = 10, 100, and 1000.
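One way to organize the arithmetic for part (b): each frame cycle consists of frame transmission, propagation to the receiver, acknowledgment transmission, and propagation back. The sketch below (function name invented; not an official solution) evaluates case b1:

```python
# Timing arithmetic for the bus-topology file transfer of part (b).
# Each cycle: send P bits, propagate D meters, send the 88-bit ack,
# propagate back.  Propagation speed is 200 m/us = 2e8 m/s.

import math

def bus_transfer_time(D_m, B_bps, P_bits, total_bits=8_000_000,
                      overhead=80, ack_bits=88, v=2e8):
    prop = D_m / v                        # one-way propagation delay (s)
    data_per_frame = P_bits - overhead    # useful payload bits per frame
    n_frames = math.ceil(total_bits / data_per_frame)
    cycle = P_bits / B_bps + prop + ack_bits / B_bps + prop
    elapsed = n_frames * cycle
    throughput = total_bits / elapsed     # effective bps
    return elapsed, throughput

t, r = bus_transfer_time(1000, 1e6, 256)   # case b1
# roughly 16 s elapsed, about 0.5 Mbps effective throughput
```

The other cases follow by substituting the given D, B, and P values.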
Consider a baseband bus with a number of equally spaced stations with a data rate of
10 Mbps and a bus length of 1 km.
a. What is the mean time to send a frame of 1000 bits to another station, measured
from the beginning of transmission to the end of reception? Assume a propagation speed of 200 m/µs.
b. If two stations begin to transmit at exactly the same time, their packets will interfere with each other. If each transmitting station monitors the bus during transmission, how long before it notices an interference, in seconds? In bit times?
Repeat Problem 15.4 for a data rate of 100 Mbps.
At a propagation speed of 200 m/µs, what is the effective length added to a ring by a
bit delay at each repeater?
a. At 1 Mbps
b. At 40 Mbps
A tree topology is to be provided that spans two buildings. If permission can be
obtained to string cable between the two buildings, one continuous tree layout will be
used. Otherwise, each building will have an independent tree topology network and a
point-to-point link will connect a special communications station on one network
with a communications station on the other network. What functions must the communications stations perform? Repeat for ring and star.
System A consists of a single ring with 300 stations, one per repeater. System B consists of three 100-station rings linked by a bridge. If the probability of a link failure is
Pl, a repeater failure is Pr, and a bridge failure is Pb, derive an expression for parts (a)
through (d):
a. Probability of failure of system A
b. Probability of complete failure of system B
c. Probability that a particular station will find the network unavailable, for systems
A and B
d. Probability that any two stations, selected at random, will be unable to communicate, for systems A and B
e. Compute values for parts (a) through (d) for Pl = Pb = Pr = 10^-2.
Draw figures similar to Figure 15.9 for a configuration in which
a. Two LANs are connected via two bridges that are connected by a point-to-point link.
b. Two LANs are connected via two bridges that are connected by an X.25 packet-switching network.
For the configuration of Figure 15.10, show the central routing matrix and the routing
tables at each bridge.
16.1 The Emergence of High-Speed LANs
16.2 Ethernet
16.3 Fibre Channel
16.4 Recommended Reading and Web Sites
16.5 Key Terms, Review Questions, and Problems
Appendix 16A Digital Signal Encoding for LANs
Appendix 16B Performance Issues
Appendix 16C Scrambling
Congratulations. I knew the record would stand until it was broken.
Yogi Berra
The IEEE 802.3 standard, known as Ethernet, now encompasses
data rates of 10 Mbps, 100 Mbps, 1 Gbps, and 10 Gbps. For the lower
data rates, the CSMA/CD MAC protocol is used. For the 1-Gbps
and 10-Gbps options, a switched technique is used.
Fibre Channel is a switched network of nodes designed to provide
high-speed linkages for such applications as storage area networks.
A variety of signal encoding techniques are used in the various LAN
standards to achieve efficiency and to make the high data rates practical.
Recent years have seen rapid changes in the technology, design, and commercial
applications for local area networks (LANs). A major feature of this evolution is
the introduction of a variety of new schemes for high-speed local networking. To
keep pace with the changing local networking needs of business, a number of
approaches to high-speed LAN design have become commercial products. The
most important of these are
• Fast Ethernet and Gigabit Ethernet: The extension of 10-Mbps CSMA/CD
(carrier sense multiple access with collision detection) to higher speeds is a
logical strategy because it tends to preserve the investment in existing systems.
• Fibre Channel: This standard provides a low-cost, easily scalable approach to
achieving very high data rates in local areas.
• High-speed wireless LANs: Wireless LAN technology and standards have at
last come of age, and high-speed standards and products are being introduced.
Table 16.1 lists some of the characteristics of these approaches. The remainder of this chapter fills in some of the details on Ethernet and Fibre Channel.
Chapter 17 covers wireless LANs.
Personal computers and microcomputer workstations began to achieve widespread
acceptance in business computing in the early 1980s and have now achieved the status of the telephone: an essential tool for office workers. Until relatively recently,
office LANs provided basic connectivity services—connecting personal computers
Table 16.1 Characteristics of Some High-Speed LANs

                      Fast Ethernet    Gigabit Ethernet    Fibre Channel        Wireless LAN
Data Rate             100 Mbps         1 Gbps, 10 Gbps     100 Mbps–3.2 Gbps    1 Mbps–54 Mbps
Transmission Media    UTP, STP,        UTP, shielded       Optical fiber,       2.4-GHz, 5-GHz
                      optical fiber    cable, optical      coaxial cable,       microwave
                                       fiber               STP
Access Method         CSMA/CD          Switched            Switched             CSMA/Polling
Supporting Standard   IEEE 802.3       IEEE 802.3          Fibre Channel        IEEE 802.11
                                                           Association
and terminals to mainframes and midrange systems that ran corporate applications,
and providing workgroup connectivity at the departmental or divisional level. In
both cases, traffic patterns were relatively light, with an emphasis on file transfer
and electronic mail. The LANs that were available for this type of workload, primarily Ethernet and token ring, were well suited to this environment.
In recent years, two significant trends have altered the role of the personal
computer and therefore the requirements on the LAN:
• The speed and computing power of personal computers have continued to enjoy explosive growth. Today’s more powerful platforms support graphics-intensive applications and ever more elaborate graphical user interfaces to the
operating system.
• MIS organizations have recognized the LAN as a viable and indeed essential
computing platform, resulting in the focus on network computing. This trend
began with client/server computing, which has become a dominant architecture in the business environment, and continued with the more recent intranetwork trend.
Both of these approaches involve the frequent transfer of potentially large
volumes of data in a transaction-oriented environment.
The effect of these trends has been to increase the volume of data to be handled over LANs and, because applications are more interactive, to reduce the acceptable delay on data transfers. The earlier generation of 10-Mbps Ethernets and
16-Mbps token rings is simply not up to the job of supporting these requirements.
The following are examples of requirements that call for higher-speed LANs:
• Centralized server farms: In many applications, there is a need for user, or
client, systems to be able to draw huge amounts of data from multiple centralized servers, called server farms. An example is a color publishing operation, in
which servers typically contain hundreds of gigabytes of image data that must
be downloaded to imaging workstations. As the performance of the servers
themselves has increased, the bottleneck has shifted to the network.
• Power workgroups: These groups typically consist of a small number of cooperating users who need to draw massive data files across the network. Examples
are a software development group that runs tests on a new software version, or
a computer-aided design (CAD) company that regularly runs simulations of
new designs. In such cases, large amounts of data are distributed to several
workstations, processed, and updated at very high speed for multiple iterations.
• High-speed local backbone: As processing demand grows, LANs proliferate at
a site, and high-speed interconnection is necessary.
The most widely used high-speed LANs today are based on Ethernet and were
developed by the IEEE 802.3 standards committee. As with other LAN standards,
there is both a medium access control layer and a physical layer, which are considered in turn in what follows.
IEEE 802.3 Medium Access Control
It is easier to understand the operation of CSMA/CD if we look first at some earlier
schemes from which CSMA/CD evolved.
Precursors CSMA/CD and its precursors can be termed random access, or contention, techniques. They are random access in the sense that there is no predictable
or scheduled time for any station to transmit; station transmissions are ordered randomly. They exhibit contention in the sense that stations contend for time on the
shared medium.
The earliest of these techniques, known as ALOHA, was developed for
packet radio networks. However, it is applicable to any shared transmission
medium. ALOHA, or pure ALOHA as it is sometimes called, specifies that a station may transmit a frame at any time. The station then listens for an amount of
time equal to the maximum possible round-trip propagation delay on the network
(twice the time it takes to send a frame between the two most widely separated stations) plus a small fixed time increment. If the station hears an acknowledgment
during that time, fine; otherwise, it resends the frame. If the station fails to receive
an acknowledgment after repeated transmissions, it gives up. A receiving station
determines the correctness of an incoming frame by examining a frame check
sequence field, as in HDLC. If the frame is valid and if the destination address in
the frame header matches the receiver’s address, the station immediately sends an
acknowledgment. The frame may be invalid due to noise on the channel or because
another station transmitted a frame at about the same time. In the latter case, the
two frames may interfere with each other at the receiver so that neither gets
through; this is known as a collision. If a received frame is determined to be invalid,
the receiving station simply ignores the frame.
ALOHA is as simple as can be, and pays a penalty for it. Because the number
of collisions rises rapidly with increased load, the maximum utilization of the channel is only about 18%.
To improve efficiency, a modification of ALOHA, known as slotted ALOHA,
was developed. In this scheme, time on the channel is organized into uniform slots
whose size equals the frame transmission time. Some central clock or other technique is needed to synchronize all stations. Transmission is permitted to begin only
at a slot boundary. Thus, frames that do overlap will do so totally. This increases the
maximum utilization of the system to about 37%.
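The 18% and 37% figures come from the standard ALOHA throughput formulas: with offered load G, pure ALOHA achieves S = Ge^(-2G), maximized at G = 0.5, while slotted ALOHA achieves S = Ge^(-G), maximized at G = 1. A quick check:

```python
# Classic ALOHA throughput formulas, evaluated at their maxima.
import math

def aloha_throughput(G):
    # pure ALOHA: S = G * e^(-2G); vulnerable period is two frame times
    return G * math.exp(-2 * G)

def slotted_aloha_throughput(G):
    # slotted ALOHA: S = G * e^(-G); collisions only overlap totally
    return G * math.exp(-G)

peak_pure = aloha_throughput(0.5)             # 1/(2e), about 0.184
peak_slotted = slotted_aloha_throughput(1.0)  # 1/e, about 0.368
```

Restricting transmissions to slot boundaries halves the vulnerable period, which is why the maximum utilization exactly doubles.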
Both ALOHA and slotted ALOHA exhibit poor utilization. Both fail to take
advantage of one of the key properties of both packet radio networks and LANs,
which is that propagation delay between stations may be very small compared to
frame transmission time. Consider the following observations. If the station-to-station
propagation time is large compared to the frame transmission time, then, after a station launches a frame, it will be a long time before other stations know about it. During that time, one of the other stations may transmit a frame; the two frames may
interfere with each other and neither gets through. Indeed, if the distances are great
enough, many stations may begin transmitting, one after the other, and none of their
frames get through unscathed. Suppose, however, that the propagation time is small
compared to frame transmission time. In that case, when a station launches a frame, all
the other stations know it almost immediately. So, if they had any sense, they would
not try transmitting until the first station was done. Collisions would be rare because
they would occur only when two stations began to transmit almost simultaneously.
Another way to look at it is that a short propagation delay provides the stations with
better feedback about the state of the network; this information can be used to
improve efficiency.
The foregoing observations led to the development of carrier sense multiple
access (CSMA). With CSMA, a station wishing to transmit first listens to the
medium to determine if another transmission is in progress (carrier sense). If the
medium is in use, the station must wait. If the medium is idle, the station may transmit. It may happen that two or more stations attempt to transmit at about the same
time. If this happens, there will be a collision; the data from both transmissions will
be garbled and not received successfully. To account for this, a station waits a reasonable amount of time after transmitting for an acknowledgment, taking into
account the maximum round-trip propagation delay and the fact that the acknowledging station must also contend for the channel to respond. If there is no acknowledgment, the station assumes that a collision has occurred and retransmits.
One can see how this strategy would be effective for networks in which the
average frame transmission time is much longer than the propagation time. Collisions can occur only when more than one user begins transmitting within a short
time interval (the period of the propagation delay). If a station begins to transmit a
frame, and there are no collisions during the time it takes for the leading edge of the
packet to propagate to the farthest station, then there will be no collision for this
frame because all other stations are now aware of the transmission.
The maximum utilization achievable using CSMA can far exceed that of
ALOHA or slotted ALOHA. The maximum utilization depends on the length of
the frame and on the propagation time; the longer the frames or the shorter the
propagation time, the higher the utilization.
With CSMA, an algorithm is needed to specify what a station should do if the
medium is found busy. Three approaches are depicted in Figure 16.1. One algorithm
is nonpersistent CSMA. A station wishing to transmit listens to the medium and
obeys the following rules:
1. If the medium is idle, transmit; otherwise, go to step 2.
2. If the medium is busy, wait an amount of time drawn from a probability distribution (the retransmission delay) and repeat step 1.
Figure 16.1 CSMA Persistence and Backoff (nonpersistent: if the channel is busy, wait a random time and sense again; 1-persistent: transmit as soon as the channel goes idle; p-persistent: when the channel goes idle, transmit with probability p or else delay one time slot and repeat; in all cases, back off after a collision)
The use of random delays reduces the probability of collisions. To see this, consider that two stations become ready to transmit at about the same time while
another transmission is in progress; if both stations delay the same amount of time
before trying again, they will both attempt to transmit at about the same time. A
problem with nonpersistent CSMA is that capacity is wasted because the medium
will generally remain idle following the end of a transmission even if there are one
or more stations waiting to transmit.
To avoid idle channel time, the 1-persistent protocol can be used. A station
wishing to transmit listens to the medium and obeys the following rules:
1. If the medium is idle, transmit; otherwise, go to step 2.
2. If the medium is busy, continue to listen until the channel is sensed idle; then
transmit immediately.
Whereas nonpersistent stations are deferential, 1-persistent stations are selfish. If two or more stations are waiting to transmit, a collision is guaranteed. Things
get sorted out only after the collision.
A compromise that attempts to reduce collisions, like nonpersistent, and
reduce idle time, like 1-persistent, is p-persistent. The rules are as follows:
1. If the medium is idle, transmit with probability p, and delay one time unit with
probability (1 - p). The time unit is typically equal to the maximum propagation delay.
2. If the medium is busy, continue to listen until the channel is idle and repeat
step 1.
3. If transmission is delayed one time unit, repeat step 1.
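The three persistence rules translate directly into decision procedures. The sketch below (function names and the uniform delay distribution are illustrative) simply encodes the rules stated above:

```python
# Decision logic for the three CSMA persistence algorithms.
import random

def nonpersistent(idle, rng):
    # transmit if idle; if busy, wait a retransmission delay drawn from a
    # probability distribution, then sense the medium again
    if idle:
        return "transmit"
    return ("wait", rng.uniform(0.0, 1.0))

def one_persistent(idle):
    # keep listening while busy; transmit the moment the channel is idle
    return "transmit" if idle else "listen"

def p_persistent(idle, p, rng):
    # when idle, transmit with probability p, else delay one time slot
    if not idle:
        return "listen"
    return "transmit" if rng.random() < p else "delay-one-slot"
```

The functions only report the decision for one sensing instant; a full simulation would call them repeatedly as the channel state evolves.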
The question arises as to what is an effective value of p. The main problem
to avoid is one of instability under heavy load. Consider the case in which n stations have frames to send while a transmission is taking place. At the end of the
transmission, the expected number of stations that will attempt to transmit is
equal to the number of stations ready to transmit times the probability of transmitting, or np. If np is greater than 1, on average multiple stations will attempt to
transmit and there will be a collision. What is more, as soon as all these stations
realize that their transmission suffered a collision, they will be back again,
almost guaranteeing more collisions. Worse yet, these retries will compete with
new transmissions from other stations, further increasing the probability of collision. Eventually, all stations will be trying to send, causing continuous collisions, with throughput dropping to zero. To avoid this catastrophe, np must be
less than one for the expected peaks of n; therefore, if a heavy load is expected
to occur with some regularity, p must be small. However, as p is made smaller,
stations must wait longer to attempt transmission. At low loads, this can result in
very long delays. For example, if only a single station desires to transmit, the
expected number of iterations of step 1 is 1/p (see Problem 16.2). Thus, if
p = 0.1, at low load, a station will wait an average of 9 time units before transmitting on an idle line.
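The 1/p figure is the mean of a geometric distribution: each slot is an independent trial that succeeds with probability p. A short simulation (parameters illustrative) confirms the claim for p = 0.1:

```python
# Expected number of iterations of step 1 for a lone p-persistent station.
import random

def slots_until_transmit(p, rng):
    # count iterations of step 1 until the station finally transmits
    n = 1
    while rng.random() >= p:
        n += 1
    return n

rng = random.Random(12345)
p = 0.1
trials = [slots_until_transmit(p, rng) for _ in range(200_000)]
mean_iterations = sum(trials) / len(trials)
# expect about 1/p = 10 iterations, i.e. about 9 time units of waiting
# before the transmission itself
```

The same arithmetic underlies the instability condition: with n ready stations, the expected number of simultaneous transmitters is np, so np must stay below 1.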
Description of CSMA/CD CSMA, although more efficient than ALOHA or
slotted ALOHA, still has one glaring inefficiency. When two frames collide, the
medium remains unusable for the duration of transmission of both damaged frames.
For long frames, compared to propagation time, the amount of wasted capacity can
be considerable. This waste can be reduced if a station continues to listen to the
medium while transmitting. This leads to the following rules for CSMA/CD:
1. If the medium is idle, transmit; otherwise, go to step 2.
2. If the medium is busy, continue to listen until the channel is idle, then transmit immediately.
3. If a collision is detected during transmission, transmit a brief jamming signal to
assure that all stations know that there has been a collision and then cease transmission.
4. After transmitting the jamming signal, wait a random amount of time, referred
to as the backoff, then attempt to transmit again (repeat from step 1).
Figure 16.2 illustrates the technique for a baseband bus. The upper part of
the figure shows a bus LAN layout. At time t0 , station A begins transmitting
a packet addressed to D. At t1 , both B and C are ready to transmit. B senses a
transmission and so defers. C, however, is still unaware of A’s transmission
(because the leading edge of A’s transmission has not yet arrived at C)
and begins its own transmission. When A’s transmission reaches C, at t2 , C
detects the collision and ceases transmission. The effect of the collision propagates back to A, where it is detected some time later, t3, at which time A ceases transmission.
With CSMA/CD, the amount of wasted capacity is reduced to the time it takes
to detect a collision. Question: How long does that take? Let us consider the case of a
baseband bus and consider two stations as far apart as possible. For example, in Figure
16.2, suppose that station A begins a transmission and that just before that transmission reaches D, D is ready to transmit. Because D is not yet aware of A’s transmission,
Figure 16.2 CSMA/CD Operation
it begins to transmit. A collision occurs almost immediately and is recognized by D.
However, the collision must propagate all the way back to A before A is aware of the
collision. By this line of reasoning, we conclude that the amount of time that it takes
to detect a collision is no greater than twice the end-to-end propagation delay.
An important rule followed in most CSMA/CD systems, including the IEEE
standard, is that frames should be long enough to allow collision detection prior to the
end of transmission. If shorter frames are used, then collision detection does not occur,
and CSMA/CD exhibits the same performance as the less efficient CSMA protocol.
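This rule turns into simple arithmetic: a frame must still be in transmission when news of a collision returns, so its transmission time must be at least twice the end-to-end propagation delay. Using illustrative 10BASE5-style numbers (2.5-km maximum path, 2 × 10^8 m/s propagation; the function name is invented):

```python
# Minimum frame length needed for collision detection on a baseband bus.

def min_frame_bits(length_m, data_rate_bps, prop_speed=2e8):
    # the frame must occupy the channel for at least the round-trip delay
    round_trip = 2 * length_m / prop_speed   # seconds
    return round_trip * data_rate_bps        # bits

bits = min_frame_bits(2500, 10e6)   # 2.5 km at 10 Mbps -> 250 bits
```

The IEEE 802.3 minimum frame of 512 bits (64 octets) is comfortably above this figure, leaving margin for repeater delays and the jam signal.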
For a CSMA/CD LAN, the question arises as to which persistence algorithm to
use. You may be surprised to learn that the algorithm used in the IEEE 802.3 standard is 1-persistent. Recall that both nonpersistent and p-persistent have performance problems. In the nonpersistent case, capacity is wasted because the medium
will generally remain idle following the end of a transmission even if there are stations waiting to send. In the p-persistent case, p must be set low enough to avoid
instability, with the result of sometimes atrocious delays under light load. The 1-persistent algorithm, which means, after all, that p = 1, would seem to be even more
unstable than p-persistent due to the greed of the stations. What saves the day is that
the wasted time due to collisions is mercifully short (if the frames are long relative to
propagation delay), and with random backoff, the two stations involved in a collision
are unlikely to collide on their next tries. To ensure that backoff maintains stability,
IEEE 802.3 and Ethernet use a technique known as binary exponential backoff. A
station will attempt to transmit repeatedly in the face of repeated collisions. For the
first 10 retransmission attempts, the mean value of the random delay is doubled. This
mean value then remains the same for 6 additional attempts. After 16 unsuccessful
attempts, the station gives up and reports an error. Thus, as congestion increases, stations back off by larger and larger amounts to reduce the probability of collision.
The beauty of the 1-persistent algorithm with binary exponential backoff is
that it is efficient over a wide range of loads. At low loads, 1-persistence guarantees
that a station can seize the channel as soon as it goes idle, in contrast to the non- and
p-persistent schemes. At high loads, it is at least as stable as the other techniques.
However, one unfortunate effect of the backoff algorithm is that it has a last-in firstout effect; stations with no or few collisions will have a chance to transmit before
stations that have waited longer.
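The backoff rule described above can be sketched as follows, in slot-time units; the doubling for 10 attempts and the 16-attempt limit are the values stated in the text, while the function name is invented:

```python
# Binary exponential backoff as used by IEEE 802.3 / Ethernet.
import random

def backoff_slots(attempt, rng):
    # attempt = number of collisions suffered so far for this frame (1-based)
    if attempt > 16:
        raise RuntimeError("excessive collisions: give up and report an error")
    k = min(attempt, 10)                # mean delay doubles for the first 10
    return rng.randrange(0, 2 ** k)     # tries, then stays fixed for 6 more

rng = random.Random(7)
delays = [backoff_slots(a, rng) for a in range(1, 17)]
# the delay for attempt a is uniform over [0, 2**min(a,10) - 1] slots
```

Because the random ranges of two colliding stations grow after every collision, the chance that they pick the same delay again shrinks rapidly.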
For baseband bus, a collision should produce substantially higher voltage
swings than those produced by a single transmitter. Accordingly, the IEEE standard
dictates that the transmitter will detect a collision if the signal on the cable at the
transmitter tap point exceeds the maximum that could be produced by the transmitter alone. Because a transmitted signal attenuates as it propagates, there is a potential problem: If two stations far apart are transmitting, each station will receive a
greatly attenuated signal from the other. The signal strength could be so small that
when it is added to the transmitted signal at the transmitter tap point, the combined
signal does not exceed the CD threshold. For this reason, among others, the IEEE
standard restricts the maximum length of coaxial cable to 500 m for 10BASE5 and
185 m for 10BASE2.
A much simpler collision detection scheme is possible with the twisted-pair
star-topology approach (Figure 15.12). In this case, collision detection is based on
logic rather than sensing voltage magnitudes. For any hub, if there is activity (signal)
on more than one input, a collision is assumed. A special signal called the collision
presence signal is generated. This signal is generated and sent out as long as activity
is sensed on any of the input lines. This signal is interpreted by every node as an
occurrence of a collision.
MAC Frame Figure 16.3 depicts the frame format for the 802.3 protocol. It consists of the following fields:
• Preamble: A 7-octet pattern of alternating 0s and 1s used by the receiver to
establish bit synchronization.
• Start Frame Delimiter (SFD): The sequence 10101011, which indicates the
actual start of the frame and enables the receiver to locate the first bit of the
rest of the frame.
• Destination Address (DA): Specifies the station(s) for which the frame is
intended. It may be a unique physical address, a group address, or a global address.
Figure 16.3 IEEE 802.3 Frame Format (SFD = start of frame delimiter; DA = destination address; SA = source address; FCS = frame check sequence; the LLC data field is 46 to 1500 octets)
• Source Address (SA): Specifies the station that sent the frame.
• Length/Type: Length of LLC data field in octets, or Ethernet Type field,
depending on whether the frame conforms to the IEEE 802.3 standard or the
earlier Ethernet specification. In either case, the maximum frame size, excluding the Preamble and SFD, is 1518 octets.
• LLC Data: Data unit supplied by LLC.
• Pad: Octets added to ensure that the frame is long enough for proper CD operation.
• Frame Check Sequence (FCS): A 32-bit cyclic redundancy check, based on all
fields except preamble, SFD, and FCS.
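The field layout translates directly into code. This sketch builds a frame body (minus the preamble and SFD, which are physical-layer framing) with Python's CRC-32 as a stand-in for the FCS; the standard's exact bit ordering for the FCS is glossed over, and the function name is invented:

```python
# Assembling an IEEE 802.3 frame: DA (6) + SA (6) + Length (2) +
# LLC data padded to 46 octets + FCS (4).
import struct
import zlib

def build_frame(dest, src, payload):
    # dest/src: 6-octet MAC addresses; payload: LLC data (<= 1500 octets)
    if len(payload) > 1500:
        raise ValueError("LLC data exceeds 1500 octets")
    pad = b"\x00" * max(0, 46 - len(payload))   # pad to the 46-octet minimum
    body = dest + src + struct.pack("!H", len(payload)) + payload + pad
    fcs = struct.pack("!I", zlib.crc32(body))   # FCS over all fields after SFD
    return body + fcs

frame = build_frame(b"\xff" * 6, b"\x02\x00\x00\x00\x00\x01", b"hi")
# minimum frame: 6 + 6 + 2 + 46 + 4 = 64 octets (excluding preamble and SFD)
```

The 46-octet pad floor is what guarantees the 64-octet (512-bit) minimum frame needed for collision detection; the 1500-octet payload ceiling yields the 1518-octet maximum.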
IEEE 802.3 10-Mbps Specifications (Ethernet)
The IEEE 802.3 committee has defined a number of alternative physical configurations. This is both good and bad. On the good side, the standard has been responsive
to evolving technology. On the bad side, the customer, not to mention the potential
vendor, is faced with a bewildering array of options. However, the committee has
been at pains to ensure that the various options can be easily integrated into a configuration that satisfies a variety of needs. Thus, the user that has a complex set of
requirements may find the flexibility and variety of the 802.3 standard to be an asset.
To distinguish the various implementations that are available, the committee
has developed a concise notation:
<data rate in Mbps><signaling method><maximum segment length in hundreds of meters>
The defined alternatives for 10-Mbps are as follows:1
• 10BASE5: Specifies the use of 50-ohm coaxial cable and Manchester digital signaling.2 The maximum length of a cable segment is set at 500 meters. The length
of the network can be extended by the use of repeaters. A repeater is transparent to the MAC level; as it does no buffering, it does not isolate one segment
from another. So, for example, if two stations on different segments attempt to
transmit at the same time, their transmissions will collide. To avoid looping, only
one path of segments and repeaters is allowed between any two stations. The
standard allows a maximum of four repeaters in the path between any two
stations, extending the effective length of the medium to 2.5 kilometers.
• 10BASE2: Similar to 10BASE5 but uses a thinner cable, which supports fewer
taps over a shorter distance than the 10BASE5 cable. This is a lower-cost alternative to 10BASE5.
• 10BASE-T: Uses unshielded twisted pair in a star-shaped topology. Because of
the high data rate and the poor transmission qualities of unshielded twisted pair, the length of a link is limited to 100 meters. As an alternative, an optical
fiber link may be used. In this case, the maximum length is 500 m.
There is also a 10BROAD36 option, specifying a 10-Mbps broadband bus; this option is rarely used.
See Section 5.1.
Table 16.2 IEEE 802.3 10-Mbps Physical Layer Medium Alternatives

                             10BASE5          10BASE2          10BASE-T         10BASE-FP
Transmission medium          Coaxial cable    Coaxial cable    Unshielded       850-nm optical
                             (50 ohm)         (50 ohm)         twisted pair     fiber pair
Signaling technique          Baseband         Baseband         Baseband         Manchester/
                             (Manchester)     (Manchester)     (Manchester)     on-off
Maximum segment length (m)   500              185              100              500
Nodes per segment            100              30               —                33
Cable diameter (mm)          10               5                0.4 to 0.6       62.5/125 µm
• 10BASE-F: Contains three specifications: a passive-star topology for interconnecting stations and repeaters with up to 1 km per segment; a point-to-point
link that can be used to connect stations or repeaters at up to 2 km; a point-to-point link that can be used to connect repeaters at up to 2 km.
Note that 10BASE-T and 10BASE-F do not quite follow the notation: “T”
stands for twisted pair and “F” stands for optical fiber. Table 16.2 summarizes the
remaining options. All of the alternatives listed in the table specify a data rate of
10 Mbps.
IEEE 802.3 100-Mbps Specifications (Fast Ethernet)
Fast Ethernet refers to a set of specifications developed by the IEEE 802.3
committee to provide a low-cost, Ethernet-compatible LAN operating at
100 Mbps. The blanket designation for these standards is 100BASE-T. The committee defined a number of alternatives to be used with different transmission media.
Table 16.3 summarizes key characteristics of the 100BASE-T options. All of
the 100BASE-T options use the IEEE 802.3 MAC protocol and frame format.
100BASE-X refers to a set of options that use two physical links between nodes; one
for transmission and one for reception. 100BASE-TX makes use of shielded twisted
pair (STP) or high-quality (Category 5) unshielded twisted pair (UTP). 100BASE-FX uses optical fiber.
In many buildings, any of the 100BASE-X options requires the installation
of new cable. For such cases, 100BASE-T4 defines a lower-cost alternative that can
use Category 3, voice-grade UTP in addition to the higher-quality Category 5
UTP.3 To achieve the 100-Mbps data rate over lower-quality cable, 100BASE-T4 dictates the use of four twisted-pair lines between nodes, with the data transmission making use of three pairs in one direction at a time.
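The data-rate arithmetic behind this design is straightforward; the 8B6T ternary coding referred to here is described in Appendix 16A. A quick check of the per-pair figures:

```python
# 100BASE-T4 rate arithmetic: 100 Mbps split across three of the four
# pairs, with 8B6T coding carrying 8 bits in 6 ternary signal elements.

data_rate = 100e6                            # aggregate rate, bps
pairs_in_use = 3                             # pairs carrying data in one direction
per_pair_bps = data_rate / pairs_in_use      # about 33.33 Mbps per pair
symbol_rate = per_pair_bps * 6 / 8           # 25 Mbaud per pair with 8B6T
```

Keeping the per-pair signaling rate down to 25 Mbaud is what makes voice-grade Category 3 cable usable at an aggregate 100 Mbps.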
For all of the 100BASE-T options, the topology is similar to that of 10BASE-T,
namely a star-wire topology.
3 See Chapter 4 for a discussion of Category 3 and Category 5 cable.
Table 16.3 IEEE 802.3 100BASE-T Physical Layer Medium Alternatives

• 100BASE-TX — Transmission medium: 2 pair, STP or 2 pair, Category 5 UTP; signaling technique: MLT-3; data rate: 100 Mbps; maximum segment length: 100 m; network span: 200 m
• 100BASE-FX — Transmission medium: 2 optical fibers; signaling technique: 4B5B/NRZI; data rate: 100 Mbps; maximum segment length: 100 m; network span: 400 m
• 100BASE-T4 — Transmission medium: 4 pair, Category 3, 4, or 5 UTP; signaling technique: 8B6T/NRZ; data rate: 100 Mbps; maximum segment length: 100 m; network span: 200 m
100BASE-X For all of the transmission media specified under 100BASE-X, a unidirectional data rate of 100 Mbps is achieved transmitting over a single link (single
twisted pair, single optical fiber). For all of these media, an efficient and effective signal encoding scheme is required. The one chosen is referred to as 4B/5B-NRZI. This
scheme is further modified for each option. See Appendix 16A for a description.
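The gist of 4B/5B-NRZI — map each 4-bit data nibble onto a 5-bit code word chosen to contain enough ones, then transmit with NRZI so that each one produces a signal transition — can be sketched as follows. This is a simplified illustration: only two entries of the published 4B/5B code table are shown, and the helper names are invented for the example.

```python
# Partial 4B/5B table (two entries from the published code table; the
# remaining fourteen data code words are omitted for brevity).
FOUR_B_FIVE_B = {"0000": "11110", "0001": "01001"}

def nrzi(bits: str, level: int = 0) -> list[int]:
    """NRZI: a binary 1 inverts the signal level, a binary 0 leaves it.

    Every 1 in the code word therefore guarantees a transition the
    receiver can use for clock recovery.
    """
    out = []
    for b in bits:
        if b == "1":
            level ^= 1          # invert on a one
        out.append(level)
    return out

code = FOUR_B_FIVE_B["0000"]
print(code, nrzi(code))         # 11110 -> [1, 0, 1, 0, 0]
```

Because every 4B/5B code word contains at least two ones, the NRZI output never goes long without a transition, which is exactly the synchronization property plain NRZ lacks.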
The 100BASE-X designation includes two physical medium specifications,
one for twisted pair, known as 100BASE-TX, and one for optical fiber, known as 100BASE-FX.
100BASE-TX makes use of two pairs of twisted-pair cable, one pair used for
transmission and one for reception. Both STP and Category 5 UTP are allowed. The
MLT-3 signaling scheme is used (described in Appendix 16A).
100BASE-FX makes use of two optical fiber cables, one for transmission and
one for reception. With 100BASE-FX, a means is needed to convert the 4B/5B-NRZI
code group stream into optical signals. The technique used is known as intensity modulation. A binary 1 is represented by a burst or pulse of light; a binary 0 is represented
by either the absence of a light pulse or a light pulse at very low intensity.
100BASE-T4 100BASE-T4 is designed to produce a 100-Mbps data rate over
lower-quality Category 3 cable, thus taking advantage of the large installed base of
Category 3 cable in office buildings. The specification also indicates that the use of
Category 5 cable is optional. 100BASE-T4 does not transmit a continuous signal
between packets, which makes it useful in battery-powered applications.
For 100BASE-T4 using voice-grade Category 3 cable, it is not reasonable to
expect to achieve 100 Mbps on a single twisted pair. Instead, 100BASE-T4 specifies
that the data stream to be transmitted is split up into three separate data streams,
each with an effective data rate of 33⅓ Mbps. Four twisted pairs are used. Data are
transmitted using three pairs and received using three pairs. Thus, two of the pairs
must be configured for bidirectional transmission.
As with 100BASE-X, a simple NRZ encoding scheme is not used for
100BASE-T4. This would require a signaling rate of 33 Mbps on each twisted pair
and does not provide synchronization. Instead, a ternary signaling scheme known as
8B6T is used (described in Appendix 16A).
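The arithmetic behind these figures can be checked directly; a minimal sketch, using only the constants given in the text:

```python
# Back-of-the-envelope check of the 100BASE-T4 numbers discussed above.

TOTAL_RATE_MBPS = 100        # aggregate data rate
PAIRS_PER_DIRECTION = 3      # data travels over three pairs at a time

# Each of the three pairs carries one third of the stream.
per_pair_mbps = TOTAL_RATE_MBPS / PAIRS_PER_DIRECTION
print(f"per-pair data rate: {per_pair_mbps:.2f} Mbps")        # 33.33 Mbps

# 8B6T maps 8 bits onto 6 ternary symbols, so the symbol (baud)
# rate on each pair is 6/8 of the per-pair bit rate.
baud_per_pair = per_pair_mbps * 6 / 8
print(f"per-pair signaling rate: {baud_per_pair:.1f} Mbaud")  # 25.0 Mbaud
```

The 25-Mbaud result is the signaling rate that Category 3 cable can sustain, which is why 8B6T, rather than a simple binary scheme, makes 100 Mbps feasible over voice-grade wire.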
Full-Duplex Operation A traditional Ethernet is half duplex: a station can
either transmit or receive a frame, but it cannot do both simultaneously. With
full-duplex operation, a station can transmit and receive simultaneously. If a
100-Mbps Ethernet ran in full-duplex mode, the theoretical transfer rate
becomes 200 Mbps.
Several changes are needed to operate in full-duplex mode. The attached stations must have full-duplex rather than half-duplex adapter cards. The central point
in the star wire cannot be a simple multiport repeater but rather must be a switching
hub. In this case each station constitutes a separate collision domain. In fact, there
are no collisions and the CSMA/CD algorithm is no longer needed. However, the
same 802.3 MAC frame format is used and the attached stations can continue to
execute the CSMA/CD algorithm, even though no collisions can ever be detected.
Mixed Configuration One of the strengths of the Fast Ethernet approach is
that it readily supports a mixture of existing 10-Mbps LANs and newer 100-Mbps
LANs. For example, the 100-Mbps technology can be used as a backbone LAN to
support a number of 10-Mbps hubs. Many of the stations attach to 10-Mbps hubs
using the 10BASE-T standard. These hubs are in turn connected to switching
hubs that conform to 100BASE-T and that can support both 10-Mbps and 100-Mbps links. Additional high-capacity workstations and servers attach directly to
these 10/100 switches. These mixed-capacity switches are in turn connected to
100-Mbps hubs using 100-Mbps links. The 100-Mbps hubs provide a building
backbone and are also connected to a router that provides connection to an outside WAN.
Gigabit Ethernet
In late 1995, the IEEE 802.3 committee formed a High-Speed Study Group to
investigate means for conveying packets in Ethernet format at speeds in the gigabits per second range. The strategy for Gigabit Ethernet is the same as that for
Fast Ethernet. While defining a new medium and transmission specification, Gigabit Ethernet retains the CSMA/CD protocol and Ethernet format of its 10-Mbps
and 100-Mbps predecessors. It is compatible with 100BASE-T and 10BASE-T,
preserving a smooth migration path. As more organizations move to 100BASE-T,
putting huge traffic loads on backbone networks, demand for Gigabit Ethernet
has intensified.
Figure 16.4 shows a typical application of Gigabit Ethernet. A 1-Gbps switching
hub provides backbone connectivity for central servers and high-speed workgroup
hubs. Each workgroup LAN switch supports both 1-Gbps links, to connect to the backbone LAN switch and to support high-performance workgroup servers, and 100-Mbps
links, to support high-performance workstations, servers, and 100-Mbps LAN switches.
Media Access Layer The 1000-Mbps specification calls for the same CSMA/CD
frame format and MAC protocol as used in the 10-Mbps and 100-Mbps version of
IEEE 802.3. For shared-medium hub operation (Figure 15.13b), there are two
enhancements to the basic CSMA/CD scheme:
• Carrier extension: Carrier extension appends a set of special symbols to the
end of short MAC frames so that the resulting block is at least 4096 bit-times in
duration, up from the minimum 512 bit-times imposed at 10 and 100 Mbps. This
is so that the frame length of a transmission is longer than the propagation time
at 1 Gbps.
Figure 16.4 Example Gigabit Ethernet Configuration (a 1-Gbps backbone switching hub connected by 1-Gbps links to 100/1000-Mbps workgroup hubs, which fan out over 100-Mbps links)
• Frame bursting: This feature allows for multiple short frames to be transmitted consecutively, up to a limit, without relinquishing control for CSMA/CD
between frames. Frame bursting avoids the overhead of carrier extension
when a single station has a number of small frames ready to send.
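The carrier extension rule described above can be sketched as a simple padding calculation. The helper name is invented for illustration; the 512- and 4096-bit-time figures are the ones given in the text.

```python
# Sketch of carrier extension at 1 Gbps: the transmitted block of a
# short MAC frame is padded with special symbols out to 4096 bit-times,
# versus the 512 bit-time minimum frame at 10 and 100 Mbps.

MIN_BLOCK_BITS_1GBPS = 4096

def extended_length_bits(frame_bits: int) -> int:
    """Length of frame plus any extension symbols, in bit-times."""
    return max(frame_bits, MIN_BLOCK_BITS_1GBPS)

# A minimum-size 512-bit (64-byte) frame gains 3584 bit-times of extension:
print(extended_length_bits(512))    # 4096
# A 12000-bit (1500-byte) frame needs no extension:
print(extended_length_bits(12000))  # 12000
```

The first case shows why frame bursting matters: a station sending many minimum-size frames would otherwise pay 3584 bit-times of extension overhead per frame.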
With a switching hub (Figure 15.13c), which provides dedicated access to the
medium, the carrier extension and frame bursting features are not needed. This is
because data transmission and reception at a station can occur simultaneously without interference and with no contention for a shared medium.
Physical Layer The current 1-Gbps specification for IEEE 802.3 includes the following physical layer alternatives (Figure 16.5):
• 1000BASE-SX: This short-wavelength option supports duplex links of up to 275 m using 62.5-µm multimode fiber or up to 550 m using 50-µm multimode fiber. Wavelengths are in the range of 770 to 860 nm.
• 1000BASE-LX: This long-wavelength option supports duplex links of up to 550 m of 62.5-µm or 50-µm multimode fiber or 5 km of 10-µm single-mode fiber. Wavelengths are in the range of 1270 to 1355 nm.
Figure 16.5 Gigabit Ethernet Medium Options (log scale), showing maximum distance for 10-µm single-mode fiber, 50-µm and 62.5-µm multimode fiber, Category 5 UTP, and shielded cable
• 1000BASE-CX: This option supports 1-Gbps links among devices
located within a single room or equipment rack, using copper jumpers (specialized shielded twisted-pair cable that spans no more than 25 m). Each
link is composed of a separate shielded twisted pair running in each direction.
• 1000BASE-T: This option makes use of four pairs of Category 5 unshielded
twisted pair to support devices over a range of up to 100 m.
The signal encoding scheme used for the first three Gigabit Ethernet options
just listed is 8B/10B, which is described in Appendix 16A. The signal-encoding
scheme used for 1000BASE-T is 4D-PAM5, a complex scheme whose description is
beyond our scope.
10-Gbps Ethernet
With gigabit products still fairly new, attention has turned in the past several years
to a 10-Gbps Ethernet capability. The principal driving requirement for 10 Gigabit
Ethernet is the increase in Internet and intranet traffic. A number of factors contribute to the explosive growth in both Internet and intranet traffic:
• An increase in the number of network connections
• An increase in the connection speed of each end station (e.g., 10-Mbps users moving to 100 Mbps, analog 56-kbps users moving to DSL and cable modems)
• An increase in the deployment of bandwidth-intensive applications such as
high-quality video
• An increase in Web hosting and application hosting traffic
Initially network managers will use 10-Gbps Ethernet to provide high-speed,
local backbone interconnection between large-capacity switches. As the demand for
bandwidth increases, 10-Gbps Ethernet will be deployed throughout the entire network and will include server farm, backbone, and campuswide connectivity. This
technology enables Internet service providers (ISPs) and network service providers
(NSPs) to create very high-speed links at a low cost, between co-located, carrierclass switches and routers.
The technology also allows the construction of metropolitan area networks
(MANs) and WANs that connect geographically dispersed LANs between campuses or points of presence (PoPs). Thus, Ethernet begins to compete with ATM
and other wide area transmission and networking technologies. In most cases
where the customer requirement is data and TCP/IP transport, 10-Gbps Ethernet
provides substantial value over ATM transport for both network end users and
service providers:
• No expensive, bandwidth-consuming conversion between Ethernet packets
and ATM cells is required; the network is Ethernet, end to end.
• The combination of IP and Ethernet offers quality of service and traffic policing capabilities that approach those provided by ATM, so that advanced traffic engineering technologies are available to users and providers.
• A wide variety of standard optical interfaces (wavelengths and link distances)
have been specified for 10-Gbps Ethernet, optimizing its operation and cost
for LAN, MAN, or WAN applications.
Figure 16.6 illustrates potential uses of 10-Gbps Ethernet. Higher-capacity
backbone pipes will help relieve congestion for workgroup switches, where Gigabit
Ethernet uplinks can easily become overloaded, and for server farms, where 1-Gbps
network interface cards are already in widespread use.
The goals for maximum link distance cover a range of applications: from 300 m
to 40 km. The links operate in full-duplex mode only, using a variety of optical fiber
physical media.
Four physical layer options are defined for 10-Gbps Ethernet (Figure 16.7).
The first three of these have two suboptions: an “R” suboption and a “W” suboption. The R designation refers to a family of physical layer implementations that use
a signal encoding technique known as 64B/66B. The R implementations are
designed for use over dark fiber, meaning a fiber optic cable that is not in use and
that is not connected to any other equipment. The W designation refers to a family
of physical layer implementations that also use 64B/66B signaling but that are then
encapsulated to connect to SONET equipment.
The four physical layer options are
• 10GBASE-S (short): Designed for 850-nm transmission on multimode fiber.
This medium can achieve distances up to 300 m. There are 10GBASE-SR and
10GBASE-SW versions.
• 10GBASE-L (long): Designed for 1310-nm transmission on single-mode fiber.
This medium can achieve distances up to 10 km. There are 10GBASE-LR and
10GBASE-LW versions.
Figure 16.6 Example 10-Gbps Ethernet Configuration (10-Gbps backbone links interconnecting switches that serve a server farm and workgroups over 1-Gbps and 10/100-Mbps links)
Figure 16.7 10-Gbps Ethernet Distance Options (log scale), covering 850-nm multimode fiber (up to 300 m), 1310-nm multimode and single-mode fiber (up to 10 km), and 1550-nm single-mode fiber (up to 40 km)
• 10GBASE-E (extended): Designed for 1550-nm transmission on single-mode fiber. This medium can achieve distances up to 40 km. There are 10GBASE-ER and 10GBASE-EW versions.
• 10GBASE-LX4: Designed for 1310-nm transmission on single-mode or multimode fiber. This medium can achieve distances up to 10 km. This medium uses
wavelength-division multiplexing (WDM) to multiplex the bit stream across
four light waves.
The success of Fast Ethernet, Gigabit Ethernet, and 10-Gbps Ethernet highlights
the importance of network management concerns in choosing a network technology.
Both ATM and Fibre Channel, explored later, may be technically superior choices for a
high-speed backbone, because of their flexibility and scalability. However, the Ethernet
alternatives offer compatibility with existing installed LANs, network management
software, and applications. This compatibility has accounted for the survival of a nearly
30-year-old technology (CSMA/CD) in today’s fast-evolving network environment.
As the speed and memory capacity of personal computers, workstations, and servers
have grown, and as applications have become ever more complex with greater
reliance on graphics and video, the requirement for greater speed in delivering data
to the processor has grown. This requirement affects two methods of data communications with the processor: I/O channel and network communications.
An I/O channel is a direct point-to-point or multipoint communications link,
predominantly hardware based and designed for high speed over very short distances. The I/O channel transfers data between a buffer at the source device and a
buffer at the destination device, moving only the user contents from one device to
another, without regard to the format or meaning of the data. The logic associated
with the channel typically provides the minimum control necessary to manage the
transfer plus hardware error detection. I/O channels typically manage transfers
between processors and peripheral devices, such as disks, graphics equipment, CD-ROMs, and video I/O devices.
A network is a collection of interconnected access points with a software protocol structure that enables communication. The network typically allows many different types of data transfer, using software to implement the networking protocols
and to provide flow control, error detection, and error recovery. As we have discussed in this book, networks typically manage transfers between end systems over
local, metropolitan, or wide area distances.
Fibre Channel is designed to combine the best features of both technologies—
the simplicity and speed of channel communications with the flexibility and interconnectivity that characterize protocol-based network communications. This fusion
of approaches allows system designers to combine traditional peripheral connection, host-to-host internetworking, loosely coupled processor clustering, and multimedia applications in a single multiprotocol interface. The types of channel-oriented
facilities incorporated into the Fibre Channel protocol architecture include
• Data-type qualifiers for routing frame payload into particular interface buffers
• Link-level constructs associated with individual I/O operations
• Protocol interface specifications to allow support of existing I/O channel
architectures, such as the Small Computer System Interface (SCSI)
The types of network-oriented facilities incorporated into the Fibre Channel
protocol architecture include
• Full multiplexing of traffic between multiple destinations
• Peer-to-peer connectivity between any pair of ports on a Fibre Channel network
• Capabilities for internetworking to other connection technologies
Depending on the needs of the application, either channel or networking
approaches can be used for any data transfer. The Fibre Channel Industry Association, which is the industry consortium promoting Fibre Channel, lists the following
ambitious requirements that Fibre Channel is intended to satisfy [FCIA01]:
• Full-duplex links with two fibers per link
• Performance from 100 Mbps to 800 Mbps on a single line (full-duplex 200
Mbps to 1600 Mbps per link)
• Support for distances up to 10 km
• Small connectors
• High-capacity utilization with distance insensitivity
• Greater connectivity than existing multidrop channels
• Broad availability (i.e., standard components)
• Support for multiple cost/performance levels, from small systems to supercomputers
• Ability to carry multiple existing interface command sets for existing channel
and network protocols
The solution was to develop a simple generic transport mechanism based on
point-to-point links and a switching network. This underlying infrastructure supports a simple encoding and framing scheme that in turn supports a variety of channel and network protocols.
Fibre Channel Elements
The key elements of a Fibre Channel network are the end systems, called nodes, and
the network itself, which consists of one or more switching elements. The collection
of switching elements is referred to as a fabric. These elements are interconnected
by point-to-point links between ports on the individual nodes and switches. Communication consists of the transmission of frames across the point-to-point links.
Each node includes one or more ports, called N_ports, for interconnection.
Similarly, each fabric-switching element includes multiple ports, called F_ports.
Interconnection is by means of bidirectional links between ports. Any node can
communicate with any other node connected to the same fabric using the services of
the fabric. All routing of frames between N_ports is done by the fabric. Frames may
be buffered within the fabric, making it possible for different nodes to connect to
the fabric at different data rates.
A fabric can be implemented as a single fabric element with attached nodes (a
simple star arrangement) or as a more general network of fabric elements, as shown
in Figure 16.8. In either case, the fabric is responsible for buffering and for routing
frames between source and destination nodes.
The Fibre Channel network is quite different from the IEEE 802 LANs. Fibre
Channel is more like a traditional circuit-switching or packet-switching network, in
contrast to the typical shared-medium LAN. Thus, Fibre Channel need not be concerned with medium access control issues. Because it is based on a switching network,
the Fibre Channel scales easily in terms of N_ports, data rate, and distance covered.
Fibre Channel
switching fabric
Figure 16.8
Fibre Channel Network
This approach provides great flexibility. Fibre Channel can readily accommodate new
transmission media and data rates by adding new switches and F_ports to an existing
fabric. Thus, an existing investment is not lost with an upgrade to new technologies
and equipment. Further, the layered protocol architecture accommodates existing
I/O interface and networking protocols, preserving the preexisting investment.
Fibre Channel Protocol Architecture
The Fibre Channel standard is organized into five levels. Each level defines a function or set of related functions. The standard does not dictate a correspondence
between levels and actual implementations, with a specific interface between adjacent levels. Rather, the standard refers to the level as a “document artifice” used to
group related functions. The layers are as follows:
• FC-0 Physical Media: Includes optical fiber for long-distance applications,
coaxial cable for high speeds over short distances, and shielded twisted pair for
lower speeds over short distances
• FC-1 Transmission Protocol: Defines the signal encoding scheme
• FC-2 Framing Protocol: Deals with defining topologies, frame format, flow
and error control, and grouping of frames into logical entities called sequences
and exchanges
• FC-3 Common Services: Includes multicasting
• FC-4 Mapping: Defines the mapping of various channel and network protocols to Fibre Channel, including IEEE 802, ATM, IP, and the Small Computer
System Interface (SCSI)
Fibre Channel Physical Media and Topologies
One of the major strengths of the Fibre Channel standard is that it provides a range
of options for the physical medium, the data rate on that medium, and the topology
of the network (Table 16.4).
Transmission Media The transmission media options that are available under
Fibre Channel include shielded twisted pair, video coaxial cable, and optical fiber.
Standardized data rates range from 100 Mbps to 3.2 Gbps. Point-to-point link distances range from 33 m to 10 km.
Table 16.4 Maximum Distance for Fibre Channel Media Types

                            800 Mbps   400 Mbps   200 Mbps   100 Mbps
Single-mode fiber           10 km      10 km      10 km      —
50-µm multimode fiber       0.5 km     1 km       2 km       —
62.5-µm multimode fiber     175 m      1 km       1 km       —
Video coaxial cable         50 m       71 m       100 m      100 m
Miniature coaxial cable     14 m       19 m       28 m       42 m
Shielded twisted pair       28 m       46 m       57 m       80 m
Topologies The most general topology supported by Fibre Channel is referred to
as a fabric or switched topology. This is an arbitrary topology that includes at least
one switch to interconnect a number of end systems. The fabric topology may also
consist of a number of switches forming a switched network, with some or all of
these switches also supporting end nodes.
Routing in the fabric topology is transparent to the nodes. Each port in the configuration has a unique address. When data from a node are transmitted into the fabric, the edge switch to which the node is attached uses the destination port address in
the incoming data frame to determine the destination port location. The switch then
either delivers the frame to another node attached to the same switch or transfers the
frame to an adjacent switch to begin routing the frame to a remote destination.
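The forwarding decision just described — deliver locally if the destination port address is attached to this switch, otherwise hand the frame to an adjacent switch — can be sketched in a few lines. All class, port, and node names here are illustrative, not taken from the standard.

```python
# Minimal sketch of fabric routing: each switch keeps a table mapping
# destination port addresses either to a locally attached N_port or to
# an adjacent switch on the path toward the destination.

class FabricSwitch:
    def __init__(self, name: str):
        self.name = name
        self.local_ports = {}   # port address -> locally attached node
        self.next_hop = {}      # port address -> adjacent FabricSwitch

    def attach(self, addr: int, node: str) -> None:
        self.local_ports[addr] = node

    def route(self, frame: dict) -> str:
        dest = frame["dest"]
        if dest in self.local_ports:
            return f"{self.name}: deliver to node {self.local_ports[dest]}"
        # Transfer to an adjacent switch to continue routing.
        return self.next_hop[dest].route(frame)

edge = FabricSwitch("edge")
core = FabricSwitch("core")
edge.attach(0x01, "workstation")
core.attach(0x02, "disk-array")
edge.next_hop[0x02] = core

print(edge.route({"dest": 0x01, "payload": b"..."}))  # delivered locally
print(edge.route({"dest": 0x02, "payload": b"..."}))  # forwarded to core
```

Note how the attached node never sees any of this: it manages only its point-to-point link to the edge switch, which is the burden-minimizing property the text attributes to the fabric topology.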
The fabric topology provides scalability of capacity: As additional ports are
added, the aggregate capacity of the network increases, thus minimizing congestion
and contention and increasing throughput. The fabric is protocol independent and
largely distance insensitive. The technology of the switch itself and of the transmission links connecting the switch to nodes may be changed without affecting the
overall configuration. Another advantage of the fabric topology is that the burden
on nodes is minimized. An individual Fibre Channel node (end system) is only
responsible for managing a simple point-to-point connection between itself and the
fabric; the fabric is responsible for routing between ports and error detection.
In addition to the fabric topology, the Fibre Channel standard defines two
other topologies. With the point-to-point topology there are only two ports, and
these are directly connected, with no intervening fabric switches. In this case there is
no routing. The arbitrated loop topology is a simple, low-cost topology for connecting up to 126 nodes in a loop. The arbitrated loop operates in a manner roughly
equivalent to the token ring protocols that we have seen.
Topologies, transmission media, and data rates may be combined to provide an
optimized configuration for a given site. Figure 16.9 is an example that illustrates the
principal applications of Fibre Channel.
Prospects for Fibre Channel
Fibre Channel is backed by an industry interest group known as the Fibre Channel Association and a variety of interface cards for different applications are
available. Fibre Channel has been most widely accepted as an improved peripheral device interconnect, providing services that can eventually replace such
schemes as SCSI. It is a technically attractive solution to general high-speed LAN
requirements but must compete with Ethernet and ATM LANs. Cost and performance issues should dominate the manager’s consideration of these competing technologies.
[STAL00] covers in greater detail the LAN systems discussed in this chapter.
[SPUR00] provides a concise but thorough overview of all of the 10-Mbps through 1-Gbps
802.3 systems, including configuration guidelines for a single segment of each media type, as well
as guidelines for building multisegment Ethernets using a variety of media types. Two excellent treatments of both 100-Mbps and Gigabit Ethernet are [SEIF98] and [KADA98]. A good survey article on Gigabit Ethernet is [FRAZ99].

Figure 16.9 Five Applications of Fibre Channel: (1) linking high-performance workstation clusters; (2) connecting mainframes to each other; (3) giving server farms high-speed pipes; (4) clustering disk farms; and (5) linking LANs and WANs to the backbone, all through a Fibre Channel switching fabric
[SACH96] is a good survey of Fibre Channel. A short but worthwhile treatment is
FCIA01 Fibre Channel Industry Association. Fibre Channel Storage Area Networks. San
Francisco: Fibre Channel Industry Association, 2001.
FRAZ99 Frazier, H., and Johnson, H. “Gigabit Ethernet: From 100 to 1,000 Mbps.”
IEEE Internet Computing, January/February 1999.
KADA98 Kadambi, J.; Crayford, I.; and Kalkunte, M. Gigabit Ethernet. Upper Saddle
River, NJ: Prentice Hall, 1998.
SACH96 Sachs, M., and Varma, A. “Fibre Channel and Related Standards.” IEEE Communications Magazine, August 1996.
SEIF98 Seifert, R. Gigabit Ethernet. Reading, MA: Addison-Wesley, 1998.
SPUR00 Spurgeon, C. Ethernet: The Definitive Guide. Cambridge, MA: O’Reilly and
Associates, 2000.
STAL00 Stallings, W. Local and Metropolitan Area Networks, Sixth Edition. Upper Saddle River, NJ: Prentice Hall, 2000.
Recommended Web sites:
• Interoperability Lab: University of New Hampshire site for equipment testing for
high-speed LANs
• Charles Spurgeon’s Ethernet Web Site: Provides extensive information about Ethernet, including links and documents
• IEEE 802.3 10-Gbps Ethernet Task Force: Latest documents
• Fibre Channel Industry Association: Includes tutorials, white papers, links to vendors, and descriptions of Fibre Channel applications
• CERN Fibre Channel Site: Includes tutorials, white papers, links to vendors, and
descriptions of Fibre Channel applications
• Storage Network Industry Association: An industry forum of developers, integrators, and IT professionals who evolve and promote storage networking technology and solutions
Key Terms
1-persistent CSMA
binary exponential backoff
carrier sense multiple access
carrier sense multiple access with collision detection
Fibre Channel
full-duplex operation
nonpersistent CSMA
p-persistent CSMA
slotted ALOHA
Review Questions
What is a server farm?
Explain the three persistence protocols that can be used with CSMA.
What is CSMA/CD?
Explain binary exponential backoff.
What are the transmission medium options for Fast Ethernet?
How does Fast Ethernet differ from 10BASE-T, other than the data rate?
In the context of Ethernet, what is full-duplex operation?
List the levels of Fibre Channel and the functions of each level.
What are the topology options for Fibre Channel?
A disadvantage of the contention approach for LANs, such as CSMA/CD, is the
capacity wasted due to multiple stations attempting to access the channel at the
same time. Suppose that time is divided into discrete slots, with each of N stations
attempting to transmit with probability p during each slot. What fraction of slots are
wasted due to multiple simultaneous transmission attempts?
For p-persistent CSMA, consider the following situation. A station is ready to transmit and is listening to the current transmission. No other station is ready to transmit,
and there will be no other transmission for an indefinite period. If the time unit used
in the protocol is T, show that the average number of iterations of step 1 of the protocol is 1/p and that therefore the expected time that the station will have to wait after
the current transmission is T(1/p − 1). Hint: Use the equality ∑_{i=1}^∞ iX^(i−1) = 1/(1 − X)².
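The 1/p claim can be checked empirically. The following Monte Carlo sketch is illustrative only (it is not part of the problem set): step 1 of p-persistent CSMA transmits with probability p and otherwise defers one time unit, so the number of iterations is geometrically distributed with mean 1/p.

```python
import random

def iterations_until_transmit(p: float, rng: random.Random) -> int:
    """Count executions of step 1 until the station transmits."""
    n = 1
    while rng.random() >= p:   # with probability 1 - p, defer one unit
        n += 1
    return n

rng = random.Random(42)
p = 0.25
trials = 200_000
mean = sum(iterations_until_transmit(p, rng) for _ in range(trials)) / trials
print(round(mean, 2))          # should be close to 1/p = 4
```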
The binary exponential backoff algorithm is defined by IEEE 802 as follows:
The delay is an integral multiple of slot time. The number of slot times to delay
before the nth retransmission attempt is chosen as a uniformly distributed random integer r in the range 0 ≤ r < 2^K, where K = min(n, 10).
Slot time is, roughly, twice the round-trip propagation delay. Assume that two stations
always have a frame to send. After a collision, what is the mean number of retransmission attempts before one station successfully retransmits? What is the answer if
three stations always have frames to send?
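The backoff rule quoted in the problem can be rendered directly in code; this is an illustrative Python sketch of the random-delay draw, not an excerpt from the standard.

```python
import random

def backoff_slots(n: int, rng=random) -> int:
    """Slot times to delay before the nth retransmission attempt.

    r is uniform on 0 <= r < 2**K with K = min(n, 10), per IEEE 802.3.
    """
    k = min(n, 10)
    return rng.randrange(2 ** k)

# Expected delay before attempt n is (2**K - 1) / 2 slot times,
# i.e., it doubles with each attempt until capping at K = 10:
for n in (1, 2, 10, 16):
    k = min(n, 10)
    print(n, (2 ** k - 1) / 2)
```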
Describe the signal pattern produced on the medium by the Manchester-encoded
preamble of the IEEE 802.3 MAC frame.
Analyze the advantages of having the FCS field of IEEE 802.3 frames in the trailer of
the frame rather than in the header of the frame.
The most widely used MAC approach for a ring topology is token ring, defined in
IEEE 802.5. The token ring technique is based on the use of a small frame, called a
token, that circulates when all stations are idle. A station wishing to transmit must
wait until it detects a token passing by. It then seizes the token by changing one bit in
the token, which transforms it from a token to a start-of-frame sequence for a data
frame. The station then appends and transmits the remainder of the fields needed to
construct a data frame. When a station seizes a token and begins to transmit a data
frame, there is no token on the ring, so other stations wishing to transmit must wait.
The frame on the ring will make a round trip and be absorbed by the transmitting station. The transmitting station will insert a new token on the ring when both of the following conditions have been met: (1) The station has completed transmission of its
frame. (2) The leading edge of the transmitted frame has returned (after a complete
circulation of the ring) to the station.
a. An option in IEEE 802.5, known as early token release, eliminates the second
condition just listed. Under what conditions will early token release result in
improved utilization?
b. Are there any potential disadvantages to early token release? Explain.
For a token ring LAN, suppose that the destination station removes the data
frame and immediately sends a short acknowledgment frame to the sender rather
than letting the original frame return to sender. How will this affect performance?
Another medium access control technique for rings is the slotted ring. A number of
fixed-length slots circulate continuously on the ring. Each slot contains a leading bit
to designate the slot as empty or full. A station wishing to transmit waits until an
empty slot arrives, marks the slot full, and inserts a frame of data as the slot goes by.
The full slot makes a complete round trip, to be marked empty again by the station
that marked it full. In what sense are the slotted ring and token ring protocols the
complement (dual) of each other?
Consider a slotted ring of length 10 km with a data rate of 10 Mbps and 500 repeaters,
each of which introduces a 1-bit delay. Each slot contains room for one source address
byte, one destination address byte, two data bytes, and five control bits for a total
length of 37 bits. How many slots are on the ring?
With 8B6T coding, the effective data rate on a single channel is 33 Mbps with a signaling rate of 25 Mbaud. If a pure ternary scheme were used, what is the effective
data rate for a signaling rate of 25 Mbaud?
With 8B6T coding, the DC algorithm sometimes negates all of the ternary symbols in
a code group. How does the receiver recognize this condition? How does the receiver
discriminate between a negated code group and one that has not been negated? For
example, the code group for data byte 00 is + - 0 0 + - and the code group for data
byte 38 is the negation of that, namely, - + 0 0 - + .
Draw the MLT-3 decoder state diagram that corresponds to the encoder state diagram
of Figure 16.10.
For the bit stream 0101110, sketch the waveforms for NRZ-L, NRZI, Manchester, Differential Manchester, and MLT-3.
Consider a token ring system with N stations in which a station that has just transmitted a frame releases a new token only after the station has completed transmission of
its frame and the leading edge of the transmitted frame has returned (after a complete circulation of the ring) to the station.
a. Show that utilization can be approximated by 1/(1 + a/N) for a < 1 and by
1/(a + a/N) for a > 1.
b. What is the asymptotic value of utilization as N increases?
a. Verify that the division illustrated in Figure 16.18a corresponds to the implementation of Figure 16.17a by calculating the result step by step using Equation (16.7).
b. Verify that the multiplication illustrated in Figure 16.18b corresponds to the
implementation of Figure 16.17b by calculating the result step by step using
Equation (16.8).
Draw a figure similar to Figure 16.17 for the MLT-3 scrambler and descrambler.
In Chapter 5, we looked at some of the common techniques for encoding digital data for transmission, including Manchester and differential Manchester, which are used in some of the
LAN standards. In this appendix, we examine some additional encoding schemes referred to in
this chapter.
This scheme, which is actually a combination of two encoding algorithms, is used for
100BASE-X. To understand the significance of this choice, first consider the simple alternative of an NRZ (nonreturn to zero) coding scheme. With NRZ, one signal state represents
binary one and one signal state represents binary zero. The disadvantage of this approach is
its lack of synchronization. Because transitions on the medium are unpredictable, there is no
way for the receiver to synchronize its clock to the transmitter. A solution to this problem is
to encode the binary data to guarantee the presence of transitions. For example, the data
could first be encoded using Manchester encoding. The disadvantage of this approach is that
the efficiency is only 50%. That is, because there can be as many as two transitions per bit
time, a signaling rate of 200 million signal elements per second (200 Mbaud) is needed to
achieve a data rate of 100 Mbps. This represents an unnecessary cost and technical burden.
Greater efficiency can be achieved using the 4B/5B code. In this scheme, encoding is
done 4 bits at a time; each 4 bits of data are encoded into a symbol with five code bits, such
that each code bit contains a single signal element; the block of five code bits is called a code
group. In effect, each set of 4 bits is encoded as 5 bits. The efficiency is thus raised to 80%:
100 Mbps is achieved with 125 Mbaud.
To ensure synchronization, there is a second stage of encoding: Each code bit of the
4B/5B stream is treated as a binary value and encoded using nonreturn to zero inverted
(NRZI) (see Figure 5.2). In this code, a binary 1 is represented with a transition at the
beginning of the bit interval and a binary 0 is represented with no transition at the beginning
of the bit interval; there are no other transitions. The advantage of NRZI is that it employs differential encoding. Recall from Chapter 5 that in differential encoding, the signal is decoded
by comparing the polarity of adjacent signal elements rather than the absolute value of a signal element. A benefit of this scheme is that it is generally more reliable to detect a transition
in the presence of noise and distortion than to compare a value to a threshold.
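The two NRZI rules can be expressed in a few lines. The following is an illustrative sketch only (the function names are our own, not part of any standard), modeling the two signal states as 0 and 1:

```python
def nrzi_encode(bits, level=0):
    # A binary 1 produces a transition at the start of the bit interval;
    # a binary 0 produces no transition.
    out = []
    for b in bits:
        if b == "1":
            level ^= 1
        out.append(level)
    return out

def nrzi_decode(levels, initial=0):
    # Decode differentially: compare each signal element with the
    # preceding one rather than against an absolute threshold.
    bits, prev = [], initial
    for lv in levels:
        bits.append("1" if lv != prev else "0")
        prev = lv
    return "".join(bits)
```

For example, the bit string 10110 encodes to the level sequence [1, 1, 0, 1, 1], and decoding that sequence recovers 10110.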
Now we are in a position to describe the 4B/5B code and to understand the selections
that were made. Table 16.5 shows the symbol encoding. Each 5-bit code group pattern is shown,
together with its NRZI realization. Because we are encoding 4 bits with a 5-bit pattern, only
16 of the 32 possible patterns are needed for data encoding. The codes selected to represent
Table 16.5 4B/5B Code Groups (page 1 of 2)

Data Input (4 bits)                  Code Group (5 bits)
Data 0                               11110
Data 1                               01001
Data 2                               10100
Data 3                               10101
Data 4                               01010
Data 5                               01011
Data 6                               01110
Data 7                               01111
Data 8                               10010
Data 9                               10011
Data A                               10110
Data B                               10111
Data C                               11010
Data D                               11011
Data E                               11100
Data F                               11101
Start of stream delimiter, part 1    11000
Start of stream delimiter, part 2    10001
End of stream delimiter, part 1      01101
End of stream delimiter, part 2      00111
Transmit error                       00100
Invalid codes                        (all remaining patterns)
the 16 4-bit data blocks are such that a transition is present at least twice within each 5-bit
code group. No more than three zeros in a row are allowed across one or more code groups.
The encoding scheme can be summarized as follows:
1. A simple NRZ encoding is rejected because it does not provide synchronization; a
string of 1s or 0s will have no transitions.
2. The data to be transmitted must first be encoded to assure transitions. The 4B/5B code is
chosen over Manchester because it is more efficient.
3. The 4B/5B code is further encoded using NRZI so that the resulting differential signal will
improve reception reliability.
4. The specific 5-bit patterns for the encoding of the 16 4-bit data patterns are chosen to
guarantee no more than three zeros in a row to provide for adequate synchronization.
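The first encoding stage can be sketched as a simple table lookup. This sketch uses the standard 100BASE-X data code groups; the function name is our own:

```python
# Standard 4B/5B data code groups: each contains at least two transitions,
# and no more than three zeros in a row occur within or across groups.
CODE_4B5B = {
    0x0: "11110", 0x1: "01001", 0x2: "10100", 0x3: "10101",
    0x4: "01010", 0x5: "01011", 0x6: "01110", 0x7: "01111",
    0x8: "10010", 0x9: "10011", 0xA: "10110", 0xB: "10111",
    0xC: "11010", 0xD: "11011", 0xE: "11100", 0xF: "11101",
}

def encode_4b5b(data):
    # Encode 4 bits at a time, high nibble of each byte first;
    # 8 data bits become 10 code bits, for 80% efficiency.
    groups = []
    for byte in data:
        groups.append(CODE_4B5B[byte >> 4])
        groups.append(CODE_4B5B[byte & 0x0F])
    return "".join(groups)
```

In a real 100BASE-X transmitter the resulting code bits would then pass through the NRZI stage described earlier.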
Those code groups not used to represent data are either declared invalid or assigned
special meaning as control symbols. These assignments are listed in Table 16.5. The nondata
symbols fall into the following categories:
• Idle: The idle code group is transmitted between data transmission sequences. It
consists of a constant flow of binary ones, which in NRZI comes out as a continuous
alternation between the two signal levels. This continuous fill pattern establishes
and maintains synchronization and is used in the CSMA/CD protocol to indicate
that the shared medium is idle.
• Start of stream delimiter: Used to delineate the starting boundary of a data transmission sequence; consists of two different code groups.
• End of stream delimiter: Used to terminate normal data transmission sequences; consists of two different code groups.
• Transmit error: This code group is interpreted as a signaling error. The normal use of
this indicator is for repeaters to propagate received errors.
Although 4B/5B-NRZI is effective over optical fiber, it is not suitable as is for use over
twisted pair. The reason is that the signal energy is concentrated in such a way as to produce
undesirable radiated emissions from the wire. MLT-3, which is used on 100BASE-TX, is
designed to overcome this problem.
The following steps are involved:
1. NRZI to NRZ conversion. The 4B/5B NRZI signal of the basic 100BASE-X is converted back to NRZ.
2. Scrambling. The bit stream is scrambled to produce a more uniform spectrum distribution
for the next stage.
3. Encoder. The scrambled bit stream is encoded using a scheme known as MLT-3.
4. Driver. The resulting encoding is transmitted.
The effect of the MLT-3 scheme is to concentrate most of the energy in the transmitted
signal below 30 MHz, which reduces radiated emissions. This in turn reduces problems due to interference.
The MLT-3 encoding produces an output that has a transition for every binary one and
that uses three levels: a positive voltage (+V), a negative voltage (-V), and no voltage (0).
The encoding rules are best explained with reference to the encoder state diagram shown in
Figure 16.10:
Figure 16.10 MLT-3 Encoder State Diagram
1. If the next input bit is zero, then the next output value is the same as the preceding value.
2. If the next input bit is one, then the next output value involves a transition:
(a) If the preceding output value was either +V or -V, then the next output value is 0.
(b) If the preceding output value was 0, then the next output value is nonzero, and
that output is of the opposite sign to the last nonzero output.
Figure 16.11 provides an example. Every time there is an input of 1, there is a transition.
The occurrences of + V and - V alternate.
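These rules can be sketched directly from the state diagram. This is an illustrative sketch (the function name and state numbering are our own), representing the levels +V, 0, and -V as +1, 0, and -1:

```python
def mlt3_encode(bits):
    # The four encoder states visit the levels +V, 0, -V, 0 in turn.
    # A 0 input keeps the current level; a 1 input advances to the next
    # state, so every 1 causes a transition and +V and -V alternate.
    cycle = [+1, 0, -1, 0]
    state = 3                  # start at the 0 level, "from -V"
    out = []
    for b in bits:
        if b == "1":
            state = (state + 1) % 4
        out.append(cycle[state])
    return out
```

For example, the input 1111 produces the level sequence [+1, 0, -1, 0]: a transition on every 1, with +V and -V alternating.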
The 8B6T encoding algorithm uses ternary signaling. With ternary signaling, each signal element can take on one of three values (positive voltage, negative voltage, zero voltage). A pure
ternary code is one in which the full information-carrying capacity of the ternary signal is
exploited. However, pure ternary is not attractive for the same reasons that a pure binary
(NRZ) code is rejected: the lack of synchronization. However, there are schemes referred to
as block-coding methods that approach the efficiency of ternary and overcome this disadvantage. A new block-coding scheme known as 8B6T is used for 100BASE-T4.
With 8B6T the data to be transmitted are handled in 8-bit blocks. Each block of 8 bits
is mapped into a code group of 6 ternary symbols. The stream of code groups is then transmitted in round-robin fashion across the three output channels (Figure 16.12). Thus the
ternary transmission rate on each output channel is
6/8 × 33 1/3 Mbps = 25 Mbaud

Figure 16.11 Example of MLT-3 Encoding

Figure 16.12 8B6T Transmission Scheme (the 8B stream at 100 Mbps is split into three 6T streams at 25 Mbaud each)
Table 16.6 shows a portion of the 8B6T code table; the full table maps all possible 8-bit
patterns into a unique code group of 6 ternary symbols. The mapping was chosen with two
requirements in mind: synchronization and DC balance. For synchronization, the codes were
chosen to maximize the average number of transitions per code group. The second requirement is to maintain DC balance, so that the average voltage on the line is zero. For this purpose all of the selected code groups either have an equal number of positive and negative
symbols or an excess of one positive symbol. To maintain balance, a DC balancing algorithm
is used. In essence, this algorithm monitors the cumulative weight of all code groups
transmitted on a single pair. Each code group has a weight of 0 or 1. To maintain balance, the
Table 16.6 Portion of 8B6T Code Table
algorithm may negate a transmitted code group (change all + symbols to - symbols and all - symbols to + symbols), so that the cumulative weight at the conclusion of each code group is
always either 0 or 1.
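The balancing rule can be sketched as follows. This is an illustration of the principle only: the weight-+1 sample group used in the usage note is hypothetical, and a real transmitter applies the rule per wire pair using the full code table:

```python
def dc_balance(groups):
    # Track the cumulative weight of the 6T code groups sent on one pair.
    # Each group as listed in the code table weighs 0 or +1; if adding
    # the next group would push the running weight to 2, negate the
    # group (swap + and -) so the weight after every group stays 0 or 1.
    swap = str.maketrans("+-", "-+")
    running = 0
    sent = []
    for g in groups:
        weight = g.count("+") - g.count("-")
        if running + weight > 1:
            g = g.translate(swap)
            weight = -weight
        running += weight
        sent.append(g)
    return sent
```

For instance, sending two copies of a (hypothetical) weight-+1 group such as 00+0-+ transmits the first unchanged and negates the second to 00-0+-, returning the cumulative weight to 0.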
The encoding scheme used for Fibre Channel and Gigabit Ethernet is 8B/10B, in which each
8 bits of data is converted into 10 bits for transmission. This scheme has a similar philosophy to
the 4B/5B scheme discussed earlier. The 8B/10B scheme, developed and patented by IBM for
use in its 200-megabaud ESCON interconnect system [WIDM83], is more powerful than
4B/5B in terms of transmission characteristics and error detection capability.
The developers of this code list the following advantages:
• It can be implemented with relatively simple and reliable transceivers at low cost.
• It is well balanced, with minimal deviation from the occurrence of an equal number of
1 and 0 bits across any sequence.
• It provides good transition density for easier clock recovery.
• It provides useful error detection capability.
The 8B/10B code is an example of the more general mBnB code, in which m binary source
bits are mapped into n binary bits for transmission. Redundancy is built into the code to provide the desired transmission features by making n > m.
The 8B/10B code actually combines two other codes, a 5B/6B code and a 3B/4B code.
The use of these two codes is simply an artifact that simplifies the definition of the mapping
and the implementation; the mapping could have been defined directly as an 8B/10B code. In
any case, a mapping is defined that maps each of the possible 8-bit source blocks into a 10-bit
code block. There is also a function called disparity control. In essence, this function keeps
track of the excess of zeros over ones or ones over zeros. An excess in either direction is
referred to as a disparity. If there is a disparity, and if the current code block would add to that
disparity, then the disparity control block complements the 10-bit code block. This has the
effect of either eliminating the disparity or at least moving it in the opposite direction of the
current disparity.
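The principle of disparity control can be sketched as follows. This is only an illustration of the idea: actual 8B/10B tracks disparity per 6-bit and 4-bit sub-block and selects between precomputed alternate encodings rather than complementing an arbitrary block, and the function name is our own:

```python
def disparity_control(blocks):
    # Track the running disparity (excess of ones over zeros). If the
    # next 10-bit block would add to the disparity already present,
    # transmit its complement instead, moving the disparity back
    # toward zero (or eliminating it).
    flip = str.maketrans("01", "10")
    running = 0
    out = []
    for block in blocks:
        d = block.count("1") - block.count("0")
        if running * d > 0:     # same sign: block worsens the disparity
            block = block.translate(flip)
            d = -d
        running += d
        out.append(block)
    return out
```

For example, two consecutive copies of a ones-heavy block are sent once as-is and once complemented, keeping the running disparity bounded.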
The 8B/10B code results in an overhead of 25%. To achieve greater efficiency at a higher data
rate, the 64B/66B code maps a block of 64 bits into an output block of 66 bits, for an overhead
of just 3%. This code is used in 10-Gbps Ethernet. Figure 16.13 illustrates the process. The entire
Figure 16.13 Encoding Using 64B/66B: (a) data octets only (64-bit data field, scrambled); (b) mixed data/control block (8-bit type field plus combined 56-bit data/control field, scrambled)
Ethernet frame, including control fields, is considered “data” for this process. In addition, there
are nondata symbols, called “control,” which include those defined for the 4B/5B code discussed previously plus a few other symbols. For a 64-bit block consisting only of data octets, the
entire block is scrambled. Two synchronization bits, with values 01, are prepended to the scrambled block. For a block consisting of a mixture of control and data octets, a 56-bit block is used,
which is scrambled; a 66-bit block is formed by prepending two synchronization bits, with values
10, and an 8-bit control type field, which defines the control functions included with this block.
In both cases, scrambling is performed using the polynomial 1 + X^39 + X^58. See Appendix
16C for a discussion of scrambling. The two-bit synchronization field provides block alignment
and a means of synchronizing when long streams of bits are sent.
Note that in this case, no specific coding technique is used to achieve the desired synchronization and frequency of transitions. Rather the scrambling algorithm provides the
required characteristics.
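The two block formats can be sketched as follows. This is an illustrative sketch (the function name is ours, and the scramble argument stands in for the 1 + X^39 + X^58 self-synchronizing scrambler, defaulting to the identity for clarity); bits are modeled as '0'/'1' strings:

```python
def build_66b_block(field, control_type=None, scramble=lambda b: b):
    # Pure data block: sync bits 01 + scrambled 64-bit data field.
    # Mixed data/control block: sync bits 10 + 8-bit type field +
    # scrambled 56-bit data/control field.
    if control_type is None:            # data octets only
        assert len(field) == 64
        return "01" + scramble(field)
    assert len(field) == 56 and len(control_type) == 8
    return "10" + control_type + scramble(field)
```

Either way, the result is a 66-bit block whose first two bits (01 or 10) always contain a transition, which is what gives the receiver its block alignment.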
The choice of a LAN or MAN architecture is based on many factors, but one of the most
important is performance. Of particular concern is the behavior (throughput, response time)
of the network under heavy load. In this appendix, we provide an introduction to this topic. A
more detailed discussion can be found in [STAL00].
The Effect of Propagation Delay and Transmission Rate
In Chapter 7, we introduced the parameter a, defined as
a = Propagation time / Transmission time
In that context, we were concerned with a point-to-point link, with a given propagation time
between the two endpoints and a transmission time for either a fixed or average frame size. It
was shown that a could be expressed as
a = Length of data link in bits / Length of frame in bits
This parameter is also important in the context of LANs and MANs, and in fact determines an upper bound on utilization. Consider a perfectly efficient access mechanism that
allows only one transmission at a time. As soon as one transmission is over, another station
begins transmitting. Furthermore, the transmission is pure data; no overhead bits. What is the
maximum possible utilization of the network? It can be expressed as the ratio of total
throughput of the network to its data rate:
U = Throughput / Data rate        (16.1)
Now define, as in Chapter 7:
R = data rate of the channel
d = maximum distance between any two stations
V = velocity of signal propagation
L = average or fixed frame length
The throughput is just the number of bits transmitted per unit time. A frame contains L bits,
and the amount of time devoted to that frame is the actual transmission time (L/R) plus the
propagation delay (d/V). Thus
Throughput = L / (d/V + L/R)        (16.2)
But by our preceding definition of a,

a = (d/V) / (L/R) = Rd / (VL)        (16.3)
Substituting (16.2) and (16.3) into (16.1),
U = 1 / (1 + a)        (16.4)
Note that this differs from Equation (7.4) in Appendix 7A. This is because the latter assumed
a half-duplex protocol (no piggybacked acknowledgments).
So utilization varies with a. This can be grasped intuitively by studying Figure 16.14,
which shows a baseband bus with two stations as far apart as possible (worst case) that take
turns sending frames. If we normalize time such that frame transmission time = 1, then the
propagation time = a. For a < 1, the sequence of events is as follows:
1. A station begins transmitting at t0.
2. Reception begins at t0 + a.
3. Transmission is completed at t0 + 1.
4. Reception ends at t0 + 1 + a.
5. The other station begins transmitting.
Figure 16.14 The Effect of a on Utilization for Baseband Bus: (a) transmission time = 1, propagation time a < 1; (b) transmission time = 1, propagation time a > 1
Figure 16.15 The Effect of a on Utilization for Ring
For a > 1, events 2 and 3 are interchanged. In both cases, the total time for one “turn”
is 1 + a, but the transmission time is only 1, for a utilization of 1/(1 + a).
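As a numeric illustration of Equations (16.3) and (16.4), consider the following sketch (the parameter values are our own, chosen only for illustration):

```python
def max_utilization(R, d, V, L):
    # a = propagation time / transmission time = (d/V) / (L/R)
    a = (d / V) * (R / L)
    return 1.0 / (1.0 + a)

# A 10-Mbps bus 1 km long (propagation at 2e8 m/s) with 1000-bit
# frames gives a = 0.05, so utilization is bounded by about 95%.
u = max_utilization(R=10e6, d=1000, V=2e8, L=1000)
```

Raising the data rate to 100 Mbps with the same frame size gives a = 0.5 and an upper bound of only about 67%, which is the effect summarized in Table 16.7.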
The same effect can be seen to apply to a ring network in Figure 16.15. Here we assume
that one station transmits and then waits to receive its own transmission before any other station transmits. The identical sequence of events just outlined applies.
Typical values of a range from about 0.01 to 0.1 for LANs and 0.1 to well over 1.0 for
MANs. Table 16.7 gives some representative values for a bus topology. As can be seen, for
larger and/or higher-speed networks, utilization suffers. For this reason, the restriction of only
one frame at a time is lifted for high-speed LANs.
Table 16.7 Representative Values of a
Data Rate (Mbps)    Frame Size (bits)    Network Length (km)    1/(1 + a)
Finally, the preceding analysis assumes a “perfect” protocol, for which a new frame can
be transmitted as soon as an old frame is received. In practice, the MAC protocol adds overhead that reduces utilization. This is demonstrated in the next subsection.
Simple Performance Model of CSMA/CD
The purpose of this section is to give the reader some insight into the performance of
CSMA/CD by developing a simple performance model. It is hoped that this exercise will aid
in understanding the results of more rigorous analyses.
For these models we assume a local network with N active stations and a maximum
normalized propagation delay of a. To simplify the analysis, we assume that each station is
always prepared to transmit a frame. This allows us to develop an expression for maximum
achievable utilization (U). Although this should not be construed to be the sole figure of
merit for a local network, it is the single most analyzed figure of merit and does permit useful
performance comparisons.
Consider time on a bus medium to be organized into slots whose length is twice the
end-to-end propagation delay. This is a convenient way to view the activity on the medium;
the slot time is the maximum time, from the start of transmission, required to detect a collision. Assume that there are N active stations. Clearly, if each station always has a frame to
transmit and does so, there will be nothing but collisions on the line. So we assume that each
station restrains itself to transmitting during an available slot with probability P.
Time on the medium consists of two types of intervals. First is a transmission interval,
which lasts 1/(2a) slots. Second is a contention interval, which is a sequence of slots with
either a collision or no transmission in each slot. The throughput, normalized to system capacity, is the proportion of time spent in transmission intervals.
To determine the average length of a contention interval, we begin by computing A, the
probability that exactly one station attempts a transmission in a slot and therefore acquires
the medium. This is the binomial probability that any one station attempts to transmit and the
others do not:
A = (N choose 1) P(1 - P)^(N-1) = NP(1 - P)^(N-1)
This function takes on a maximum over P when P = 1/N:
A = (1 - 1/N)^(N-1)
We are interested in the maximum because we want to calculate the maximum throughput of
the medium. It should be clear that the maximum throughput will be achieved if we maximize
the probability of successful seizure of the medium. Therefore, the following rule should be
enforced: During periods of heavy usage, a station should restrain its offered load to 1/N.
(This assumes that each station knows the value of N. To derive an expression for maximum
possible throughput, we live with this assumption.) On the other hand, during periods of light
usage, maximum utilization cannot be achieved because the load is too low; this region is not
of interest here.
Now we can estimate the mean length of a contention interval, w, in slots:
E[w] = Σ(i=1 to ∞) i × Pr[i slots in a row with a collision or no transmission, followed by a slot with one transmission]
     = Σ(i=1 to ∞) i(1 - A)^i A

The summation converges to

E[w] = (1 - A)/A
We can now determine the maximum utilization, which is the length of a transmission interval as a proportion of a cycle consisting of a transmission and a contention interval:
U = (1/(2a)) / [1/(2a) + (1 - A)/A]
  = 1 / [1 + 2a(1 - A)/A]
Figure 16.16 shows normalized throughput as a function of a for two values of N.
Throughput declines as a increases. This is to be expected. Figure 16.16 also shows throughput
as a function of N. The performance of CSMA/CD decreases because of the increased likelihood of collision or no transmission.
It is interesting to note the asymptotic value of U as N increases. We need to know that
lim(N→∞) (1 - 1/N)^(N-1) = 1/e. Then we have

lim(N→∞) U = 1 / (1 + 2a(e - 1)) = 1 / (1 + 3.44a)
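The model is easy to evaluate numerically; the following sketch (function name is ours) implements the expressions just derived:

```python
def csma_cd_max_utilization(N, a):
    # A: probability that exactly one of N stations transmits in a
    # slot, with each station transmitting with probability P = 1/N.
    A = (1 - 1 / N) ** (N - 1)
    # One cycle = a transmission interval of 1/(2a) slots plus a
    # contention interval averaging (1 - A)/A slots.
    return 1 / (1 + 2 * a * (1 - A) / A)
```

As N grows, A approaches 1/e and the result approaches the asymptote 1/(1 + 3.44a); for example, with a = 0.1 the maximum utilization settles near 0.74.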
For some digital data encoding techniques, a long string of binary zeros or ones in a transmission can degrade system performance. Also, other transmission properties, such as spectral properties, are enhanced if the data are more nearly of a random nature rather than
constant or repetitive. A technique commonly used to improve signal quality is scrambling
and descrambling. The scrambling process tends to make the data appear more random.
Figure 16.16 CSMA/CD Throughput as a Function of a and N
The scrambling process consists of a feedback shift register, and the matching descrambler consists of a feedforward shift register. An example is shown in Figure 16.17. In this
example, the scrambled data sequence may be expressed as follows:
Bm = Am ⊕ Bm-3 ⊕ Bm-5

where ⊕ indicates the exclusive-or operation. The shift register is initialized to contain all
zeros. The descrambled sequence is
Figure 16.17 Scrambler and Descrambler: (a) scrambler; (b) descrambler

Figure 16.18 Example of Scrambling with P(X) = 1 + X^3 + X^5: (a) scrambling; (b) descrambling
Cm = Bm ⊕ Bm-3 ⊕ Bm-5
   = (Am ⊕ Bm-3 ⊕ Bm-5) ⊕ Bm-3 ⊕ Bm-5
   = Am ⊕ (Bm-3 ⊕ Bm-3) ⊕ (Bm-5 ⊕ Bm-5)
   = Am
As can be seen, the descrambled output is the original sequence.
We can represent this process with the use of polynomials. Thus, for this example, the
polynomial is P(X) = 1 + X^3 + X^5. The input is divided by this polynomial to produce the
scrambled sequence. At the receiver the received scrambled signal is multiplied by the same
polynomial to reproduce the original input. Figure 16.18 is an example using the polynomial
P(X) and an input of 101010100000111. The scrambled transmission, produced by dividing
by P(X) (100101), is 101110001101001. When this number is multiplied by P(X), we get the
[Footnote: We use the convention that the leftmost bit is the first bit presented to the scrambler; thus the bits can be labeled A0A1A2.... Similarly, the polynomial is converted to a bit string from left to right. The polynomial B0 + B1X + B2X^2 + ... is represented as B0B1B2....]
original input. Note that the input sequence contains the periodic sequence 10101010 as well
as a long string of zeros. The scrambler effectively removes both patterns.
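The five-stage scrambler and its matching descrambler can be sketched as shift-register loops. The function names are ours; the tap positions correspond to P(X) = 1 + X^3 + X^5:

```python
def scramble(a_bits, taps=(3, 5)):
    # Feedback register: Bm = Am xor Bm-3 xor Bm-5,
    # with the register initialized to all zeros.
    reg = [0] * max(taps)            # reg[k] holds B(m-1-k)
    out = []
    for ch in a_bits:
        b = int(ch)
        for t in taps:
            b ^= reg[t - 1]
        out.append(str(b))
        reg = [b] + reg[:-1]         # shift the new output in
    return "".join(out)

def descramble(b_bits, taps=(3, 5)):
    # Feedforward register: Cm = Bm xor Bm-3 xor Bm-5; the register
    # is fed by the received bits, not by the output.
    reg = [0] * max(taps)
    out = []
    for ch in b_bits:
        b = int(ch)
        c = b
        for t in taps:
            c ^= reg[t - 1]
        out.append(str(c))
        reg = [b] + reg[:-1]
    return "".join(out)
```

This reproduces the worked example in the text: the input 101010100000111 scrambles to 101110001101001, and descrambling that string recovers the original input.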
For the MLT-3 scheme, which is used for 100BASE-TX, the scrambling equation is:
Bm = Am ⊕ X9 ⊕ X11
In this case the shift register consists of eleven elements, used in the same manner as the
5-element register in Figure 16.17. However, in the case of MLT-3, the shift register is not fed
by the output Bm. Instead, after each bit transmission, the register is shifted one unit up, and
the result of the previous XOR is fed into the first unit. This can be expressed as:

Xi(t) = Xi-1(t - 1),    2 ≤ i ≤ 11
X1(t) = X9(t - 1) ⊕ X11(t - 1)
If the shift register contains all zeros, no scrambling occurs (we just have Bm = Am), and the
above equations produce no change in the shift register. Accordingly, the standard calls for
initializing the shift register with all ones and re-initializing the register to all ones when it
takes on a value of all zeros.
For the 4D-PAM5 scheme, two scrambling equations are used, one in each direction:
Bm = Am ⊕ Bm-13 ⊕ Bm-33
Bm = Am ⊕ Bm-20 ⊕ Bm-33
17.1 Overview
17.2 Wireless LAN Technology
17.3 IEEE 802.11 Architecture and Services
17.4 IEEE 802.11 Medium Access Control
17.5 IEEE 802.11 Physical Layer
17.6 IEEE 802.11 Security Considerations
17.7 Recommended Reading and Web Sites
17.8 Key Terms, Review Questions, and Problems
Investigators have published numerous reports of birds taking turns vocalizing; the
bird spoken to gave its full attention to the speaker and never vocalized at the same
time, as if the two were holding a conversation. Researchers and scholars who have
studied the data on avian communication carefully write that (a) the communication code of birds such as crows has not been broken by any means; (b) probably
all birds have wider vocabularies than anyone realizes; and (c) greater complexity
and depth are recognized in avian communication as research progresses.
—The Human Nature of Birds, Theodore Barber
The principal technologies used for wireless LANs are infrared,
spread spectrum, and narrowband microwave.
The IEEE 802.11 standard defines a set of services and physical
layer options for wireless LANs.
The IEEE 802.11 services include managing associations, delivering
data, and security.
The IEEE 802.11 physical layer includes infrared and spread
spectrum and covers a range of data rates.
In just the past few years, wireless LANs have come to occupy a significant niche
in the local area network market. Increasingly, organizations are finding that
wireless LANs are an indispensable adjunct to traditional wired LANs, to satisfy
requirements for mobility, relocation, ad hoc networking, and coverage of locations difficult to wire.
This chapter provides a survey of wireless LANs. We begin with an
overview that looks at the motivations for using wireless LANs and summarize
the various approaches in current use. The next section examines the three principal types of wireless LANs, classified according to transmission technology:
infrared, spread spectrum, and narrowband microwave.
The most prominent specification for wireless LANs was developed by
the IEEE 802.11 working group. The remainder of the chapter focuses on this
As the name suggests, a wireless LAN is one that makes use of a wireless transmission medium. Until relatively recently, wireless LANs were little used. The reasons
for this included high prices, low data rates, occupational safety concerns, and licensing requirements. As these problems have been addressed, the popularity of wireless LANs has grown rapidly.
In this section, we survey the key wireless LAN application areas and then
look at the requirements for and advantages of wireless LANs.
Wireless LAN Applications
[PAHL95] lists four application areas for wireless LANs: LAN extension, cross-building interconnect, nomadic access, and ad hoc networks. Let us consider each of
these in turn.
LAN Extension Early wireless LAN products, introduced in the late 1980s, were
marketed as substitutes for traditional wired LANs. A wireless LAN saves the cost
of the installation of LAN cabling and eases the task of relocation and other modifications to network structure. However, this motivation for wireless LANs was
overtaken by events. First, as awareness of the need for LANs became greater,
architects designed new buildings to include extensive prewiring for data applications. Second, with advances in data transmission technology, there is an increasing
reliance on twisted pair cabling for LANs and, in particular, Category 3 and Category 5 unshielded twisted pair. Most older buildings are already wired with an abundance of Category 3 cable, and many newer buildings are prewired with Category 5.
Thus, the use of a wireless LAN to replace wired LANs has not happened to any
great extent.
However, in a number of environments, there is a role for the wireless LAN as
an alternative to a wired LAN. Examples include buildings with large open areas,
such as manufacturing plants, stock exchange trading floors, and warehouses; historical buildings with insufficient twisted pair and where drilling holes for new wiring is
prohibited; and small offices where installation and maintenance of wired LANs is
not economical. In all of these cases, a wireless LAN provides an effective and more
attractive alternative. In most of these cases, an organization will also have a wired
LAN to support servers and some stationary workstations. For example, a manufacturing facility typically has an office area that is separate from the factory floor but
that must be linked to it for networking purposes. Therefore, typically, a wireless
LAN will be linked into a wired LAN on the same premises. Thus, this application
area is referred to as LAN extension.
Figure 17.1 indicates a simple wireless LAN configuration that is typical of
many environments. There is a backbone wired LAN, such as Ethernet, that supports servers, workstations, and one or more bridges or routers to link with other
networks. In addition, there is a control module (CM) that acts as an interface to a
wireless LAN. The control module includes either bridge or router functionality to
link the wireless LAN to the backbone. It includes some sort of access control logic,
such as a polling or token-passing scheme, to regulate the access from the end systems. Note that some of the end systems are standalone devices, such as a workstation or a server. Hubs or other user modules (UMs) that control a number of
stations off a wired LAN may also be part of the wireless LAN configuration.
The configuration of Figure 17.1 can be referred to as a single-cell wireless
LAN; all of the wireless end systems are within range of a single control module.
Ethernet switch
UM user module
CM control module
Ethernet switch
Bridge or router
Figure 17.1 Example Single-Cell Wireless LAN Configuration
Another common configuration, suggested by Figure 17.2, is a multiple-cell wireless
LAN. In this case, there are multiple control modules interconnected by a wired
LAN. Each control module supports a number of wireless end systems within its
transmission range. For example, with an infrared LAN, transmission is limited to a
single room; therefore, one cell is needed for each room in an office building that
requires wireless support.
Cross-Building Interconnect Another use of wireless LAN technology is to
connect LANs in nearby buildings, be they wired or wireless LANs. In this case, a
point-to-point wireless link is used between two buildings. The devices so connected
are typically bridges or routers. This single point-to-point link is not a LAN per se,
but it is usual to include this application under the heading of wireless LAN.
Nomadic Access Nomadic access provides a wireless link between a LAN hub
and a mobile data terminal equipped with an antenna, such as a laptop computer or
notepad computer. One example of the utility of such a connection is to enable an
employee returning from a trip to transfer data from a personal portable computer
to a server in the office. Nomadic access is also useful in an extended environment
such as a campus or a business operating out of a cluster of buildings. In both of
Figure 17.2 Example Multiple-Cell Wireless LAN Configuration
these cases, users may move around with their portable computers and may wish
to access the servers on a wired LAN from various locations.
Ad Hoc Networking An ad hoc network is a peer-to-peer network (no centralized server) set up temporarily to meet some immediate need. For example, a group
of employees, each with a laptop or palmtop computer, may convene in a conference room for a business or classroom meeting. The employees link their computers
in a temporary network just for the duration of the meeting.
Figure 17.3 suggests the differences between a wireless LAN that supports LAN
extension and nomadic access requirements and an ad hoc wireless LAN. In the former
case, the wireless LAN forms a stationary infrastructure consisting of one or more cells
with a control module for each cell. Within a cell, there may be a number of stationary
end systems. Nomadic stations can move from one cell to another. In contrast, there is
no infrastructure for an ad hoc network. Rather, a peer collection of stations within
range of each other may dynamically configure themselves into a temporary network.
Wireless LAN Requirements
A wireless LAN must meet the same sort of requirements typical of any LAN,
including high capacity, ability to cover short distances, full connectivity among
attached stations, and broadcast capability. In addition, there are a number of
Figure 17.3 Wireless LAN Configurations: (a) infrastructure wireless LAN; (b) ad hoc LAN
requirements specific to the wireless LAN environment. The following are among
the most important requirements for wireless LANs:
• Throughput: The medium access control protocol should make as efficient use
as possible of the wireless medium to maximize capacity.
• Number of nodes: Wireless LANs may need to support hundreds of nodes
across multiple cells.
• Connection to backbone LAN: In most cases, interconnection with stations on
a wired backbone LAN is required. For infrastructure wireless LANs, this is
easily accomplished through the use of control modules that connect to both
types of LANs. There may also need to be accommodation for mobile users
and ad hoc wireless networks.
• Service area: A typical coverage area for a wireless LAN has a diameter of
100 to 300 m.
• Battery power consumption: Mobile workers use battery-powered workstations
that need to have a long battery life when used with wireless adapters.
This suggests that a MAC protocol that requires mobile nodes to monitor
access points constantly or engage in frequent handshakes with a base station
is inappropriate. Typical wireless LAN implementations have features to
reduce power consumption while not using the network, such as a sleep mode.
• Transmission robustness and security: Unless properly designed, a wireless
LAN may be especially vulnerable to interference and eavesdropping. The
design of a wireless LAN must permit reliable transmission even in a noisy
environment and should provide some level of security from eavesdropping.
• Collocated network operation: As wireless LANs become more popular, it is
quite likely for two or more wireless LANs to operate in the same area or in
some area where interference between the LANs is possible. Such interference
may thwart the normal operation of a MAC algorithm and may allow
unauthorized access to a particular LAN.
• License-free operation: Users would prefer to buy and operate wireless LAN
products without having to secure a license for the frequency band used by the
LAN.
• Handoff/roaming: The MAC protocol used in the wireless LAN should enable
mobile stations to move from one cell to another.
• Dynamic configuration: The MAC addressing and network management
aspects of the LAN should permit dynamic and automated addition, deletion,
and relocation of end systems without disruption to other users.
Wireless LANs are generally categorized according to the transmission technique that
is used. All current wireless LAN products fall into one of the following categories:
• Infrared (IR) LANs: An individual cell of an IR LAN is limited to a single
room, because infrared light does not penetrate opaque walls.
• Spread spectrum LANs: This type of LAN makes use of spread spectrum transmission technology. In most cases, these LANs operate in the ISM (industrial,
scientific, and medical) microwave bands so that no Federal Communications
Commission (FCC) licensing is required for their use in the United States.
Infrared LANs
Optical wireless communication in the infrared portion of the spectrum is commonplace in most homes, where it is used for a variety of remote control devices.
More recently, attention has turned to the use of infrared technology to construct
wireless LANs. In this section, we begin with a comparison of the characteristics of
infrared LANs with those of radio LANs and then look at some of the details of
infrared LANs.
Strengths and Weaknesses Infrared offers a number of significant advantages
over microwave approaches. The spectrum for infrared is virtually unlimited, which
presents the possibility of achieving extremely high data rates. The infrared
spectrum is unregulated worldwide, which is not true of some portions of the
microwave spectrum.
In addition, infrared shares some properties of visible light that make it attractive
for certain types of LAN configurations. Infrared light is diffusely reflected by
light-colored objects; thus it is possible to use ceiling reflection to achieve coverage
of an entire room. Infrared light does not penetrate walls or other opaque objects.
This has two advantages: First, infrared communications can be more easily secured
against eavesdropping than microwave; and second, a separate infrared installation
can be operated in every room in a building without interference, enabling the construction of very large infrared LANs.
Another strength of infrared is that the equipment is relatively inexpensive
and simple. Infrared data transmission typically uses intensity modulation, so that
IR receivers need to detect only the amplitude of optical signals, whereas most
microwave receivers must detect frequency or phase.
The infrared medium also exhibits some drawbacks. Many indoor environments experience rather intense infrared background radiation, from sunlight and
indoor lighting. This ambient radiation appears as noise in an infrared receiver,
requiring the use of transmitters of higher power than would otherwise be required
and also limiting the range. However, increases in transmitter power are limited by
concerns of eye safety and excessive power consumption.
Transmission Techniques Three alternative transmission techniques are in
common use for IR data transmission: the transmitted signal can be focused and
aimed (as in a remote TV control); it can be radiated omnidirectionally; or it can be
reflected from a light-colored ceiling.
Directed-beam IR can be used to create point-to-point links. In this mode, the
range depends on the emitted power and on the degree of focusing. A focused IR data
link can have a range of kilometers. Such ranges are not needed for constructing indoor
wireless LANs. However, an IR link can be used for cross-building interconnect
between bridges or routers located in buildings within a line of sight of each other.
One indoor use of point-to-point IR links is to set up a ring LAN. A set of IR
transceivers can be positioned so that data circulate around them in a ring configuration. Each transceiver supports a workstation or a hub of stations, with the hub
providing a bridging function.
An omnidirectional configuration involves a single base station that is within
line of sight of all other stations on the LAN. Typically, this station is mounted on the
ceiling. The base station acts as a multiport repeater. The ceiling transmitter broadcasts an omnidirectional signal that can be received by all of the other IR transceivers in the area. These other transceivers transmit a directional beam aimed at
the ceiling base unit.
In a diffused configuration, all of the IR transmitters are focused and aimed at
a point on a diffusely reflecting ceiling. IR radiation striking the ceiling is reradiated
omnidirectionally and picked up by all of the receivers in the area.
Spread Spectrum LANs
Currently, the most popular type of wireless LAN uses spread spectrum techniques.
Configuration Except for quite small offices, a spread spectrum wireless LAN
makes use of a multiple-cell arrangement, as was illustrated in Figure 17.2. Adjacent
cells make use of different center frequencies within the same band to avoid interference.
Within a given cell, the topology can be either hub or peer to peer. The hub
topology is indicated in Figure 17.2. In a hub topology, the hub is typically mounted
on the ceiling and connected to a backbone wired LAN to provide connectivity to
stations attached to the wired LAN and to stations that are part of wireless LANs in
other cells. The hub may also control access, as in the IEEE 802.11 point coordination
function, described subsequently. The hub may also act as a multiport repeater,
with functionality similar to that of Ethernet multiport repeaters. In
this case, all stations in the cell transmit only to the hub and receive only from the
hub. Alternatively, and regardless of access control mechanism, each station may
broadcast using an omnidirectional antenna so that all other stations in the cell may
receive; this corresponds to a logical bus configuration.
One other potential function of a hub is automatic handoff of mobile stations.
At any time, a number of stations are dynamically assigned to a given hub based on
proximity. When the hub senses a weakening signal, it can automatically hand off to
the nearest adjacent hub.
A peer-to-peer topology is one in which there is no hub. A MAC algorithm such
as CSMA is used to control access. This topology is appropriate for ad hoc LANs.
Transmission Issues A desirable, though not necessary, characteristic of a wireless
LAN is that it be usable without having to go through a licensing procedure. The
licensing regulations differ from one country to another, which complicates this objective. Within the United States, the FCC has authorized two unlicensed applications
within the ISM band: spread spectrum systems, which can operate at up to 1 watt, and
very low power systems, which can operate at up to 0.5 watts. Since the FCC opened
up this band, its use for spread spectrum wireless LANs has become popular.
In the United States, three microwave bands have been set aside for unlicensed
spread spectrum use: 902–928 MHz (915-MHz band), 2.4–2.4835 GHz (2.4-GHz band),
and 5.725–5.825 GHz (5.8-GHz band). Of these, the 2.4-GHz band is also used in this manner
in Europe and Japan. The higher the frequency, the higher the potential bandwidth, so
the three bands are in increasing order of attractiveness from a capacity point of view.
In addition, the potential for interference must be considered. There are a number of
devices that operate at around 900 MHz, including cordless telephones, wireless microphones, and amateur radio. There are fewer devices operating at 2.4 GHz; one notable
example is the microwave oven, which tends to have greater leakage of radiation with
increasing age. At present there is little competition at the 5.8-GHz band; however, the
higher the frequency band, in general the more expensive the equipment.
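The relative capacities of the three bands can be seen by computing their widths from the band edges given above; a small sketch:

```python
# Unlicensed ISM bands set aside by the FCC for spread spectrum use
# (band edges in MHz, as given in the text).
ism_bands = {
    "915-MHz band": (902.0, 928.0),
    "2.4-GHz band": (2400.0, 2483.5),
    "5.8-GHz band": (5725.0, 5825.0),
}

for name, (low, high) in ism_bands.items():
    width = high - low  # available bandwidth in MHz
    print(f"{name}: {low}-{high} MHz, width = {width} MHz")
```

The widths come out to 26, 83.5, and 100 MHz respectively, which is why the bands are in increasing order of attractiveness from a capacity point of view.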
In 1990, the IEEE 802 Committee formed a new working group, IEEE 802.11, specifically devoted to wireless LANs, with a charter to develop a MAC protocol and physical
medium specification. The initial interest was in developing a wireless LAN operating
in the ISM (industrial, scientific, and medical) band. Since that time, the demand for
WLANs, at different frequencies and data rates, has exploded. Keeping pace with this
demand, the IEEE 802.11 working group has issued an ever-expanding list of standards
(Table 17.1). Table 17.2 briefly defines key terms used in the IEEE 802.11 standard.
Table 17.1 IEEE 802.11 Standards

IEEE 802.11
  Medium access control (MAC): One common MAC for WLAN applications
  Physical layer: Infrared at 1 and 2 Mbps
  Physical layer: 2.4-GHz FHSS at 1 and 2 Mbps
  Physical layer: 2.4-GHz DSSS at 1 and 2 Mbps
IEEE 802.11a
  Physical layer: 5-GHz OFDM at rates from 6 to 54 Mbps
IEEE 802.11b
  Physical layer: 2.4-GHz DSSS at 5.5 and 11 Mbps
IEEE 802.11c
  Bridge operation at 802.11 MAC layer
IEEE 802.11d
  Physical layer: Extend operation of 802.11 WLANs to new regulatory domains
IEEE 802.11e
  MAC: Enhance to improve quality of service and enhance security mechanisms
IEEE 802.11f
  Recommended practices for multivendor access point interoperability
IEEE 802.11g
  Physical layer: Extend 802.11b to data rates > 20 Mbps
IEEE 802.11h
  Physical/MAC: Enhance IEEE 802.11a to add indoor and outdoor channel selection and to improve spectrum and transmit power management
IEEE 802.11i
  MAC: Enhance security and authentication mechanisms
IEEE 802.11j
  Physical: Enhance IEEE 802.11a to conform to Japanese requirements
IEEE 802.11k
  Radio resource measurement enhancements to provide interface to higher layers for radio and network measurements
IEEE 802.11m
  Maintenance of IEEE 802.11-1999 standard with technical and editorial corrections
IEEE 802.11n
  Physical/MAC: Enhancements to enable higher throughput
IEEE 802.11p
  Physical/MAC: Wireless access in vehicular environments
IEEE 802.11r
  Physical/MAC: Fast roaming (fast BSS transition)
IEEE 802.11s
  Physical/MAC: ESS mesh networking
IEEE 802.11T
  Recommended practice for the evaluation of 802.11 wireless performance
IEEE 802.11u
  Physical/MAC: Interworking with external networks
Table 17.2 IEEE 802.11 Terminology

Access point (AP): Any entity that has station functionality and provides access to the distribution system via the wireless medium for associated stations
Basic service set (BSS): A set of stations controlled by a single coordination function
Coordination function: The logical function that determines when a station operating within a BSS is permitted to transmit and may be able to receive PDUs
Distribution system (DS): A system used to interconnect a set of BSSs and integrated LANs to create an extended service set (ESS)
Extended service set (ESS): A set of one or more interconnected BSSs and integrated LANs that appear as a single BSS to the LLC layer at any station associated with one of these BSSs
MAC protocol data unit (MPDU): The unit of data exchanged between two peer MAC entities using the services of the physical layer
MAC service data unit (MSDU): Information that is delivered as a unit between MAC users
Station: Any device that contains an IEEE 802.11 conformant MAC and physical layer
The Wi-Fi Alliance
The first 802.11 standard to gain broad industry acceptance was 802.11b. Although
802.11b products are all based on the same standard, there is always a concern
whether products from different vendors will successfully interoperate. To meet this
concern, the Wireless Ethernet Compatibility Alliance (WECA), an industry consortium, was formed in 1999. This organization, subsequently renamed the Wi-Fi
(Wireless Fidelity) Alliance, created a test suite to certify interoperability for
802.11b products. The term used for certified 802.11b products is Wi-Fi. Wi-Fi certification has been extended to 802.11g products. The Wi-Fi Alliance has also developed a certification process for 802.11a products, called Wi-Fi5. The Wi-Fi Alliance
is concerned with a range of market areas for WLANs, including enterprise, home,
and hot spots.
IEEE 802.11 Architecture
Figure 17.4 illustrates the model developed by the 802.11 working group. The
smallest building block of a wireless LAN is a basic service set (BSS), which consists of some number of stations executing the same MAC protocol and competing
for access to the same shared wireless medium. A BSS may be isolated or it may
connect to a backbone distribution system (DS) through an access point (AP).
The AP functions as a bridge and a relay point. In a BSS, client stations do not
communicate directly with one another. Rather, if one station in the BSS wants to
communicate with another station in the same BSS, the MAC frame is first sent
from the originating station to the AP, and then from the AP to the destination
station. Similarly, a MAC frame from a station in the BSS to a remote station is
sent from the local station to the AP and then relayed by the AP over the DS on
its way to the destination station. The BSS generally corresponds to what is
referred to as a cell in the literature. The DS can be a switch, a wired network, or a
wireless network.
Figure 17.4 IEEE 802.11 Architecture (basic service sets connected via a distribution system to an IEEE 802.x LAN; STA = station, AP = access point)
When all the stations in the BSS are mobile stations, with no connection to
other BSSs, the BSS is called an independent BSS (IBSS). An IBSS is typically an ad
hoc network. In an IBSS, the stations all communicate directly, and no AP is involved.
A simple configuration is shown in Figure 17.4, in which each station belongs
to a single BSS; that is, each station is within wireless range only of other stations
within the same BSS. It is also possible for two BSSs to overlap geographically, so
that a single station could participate in more than one BSS. Further, the association
between a station and a BSS is dynamic. Stations may turn off, come within range,
and go out of range.
An extended service set (ESS) consists of two or more basic service sets interconnected by a distribution system. Typically, the distribution system is a wired
backbone LAN but can be any communications network. The extended service set
appears as a single logical LAN to the logical link control (LLC) level.
Figure 17.4 indicates that an access point (AP) is implemented as part of a station; the AP is the logic within a station that provides access to the DS by providing
DS services in addition to acting as a station. To integrate the IEEE 802.11 architecture with a traditional wired LAN, a portal is used. The portal logic is implemented
in a device, such as a bridge or router, that is part of the wired LAN and that is
attached to the DS.
IEEE 802.11 Services
IEEE 802.11 defines nine services that need to be provided by the wireless LAN to
provide functionality equivalent to that which is inherent to wired LANs. Table 17.3
lists the services and indicates two ways of categorizing them.
Table 17.3 IEEE 802.11 Services

Service | Provider | Used to support
Association | Distribution system | MSDU delivery
Authentication | Station | LAN access and security
Deauthentication | Station | LAN access and security
Disassociation | Distribution system | MSDU delivery
Distribution | Distribution system | MSDU delivery
Integration | Distribution system | MSDU delivery
MSDU delivery | Station | MSDU delivery
Privacy | Station | LAN access and security
Reassociation | Distribution system | MSDU delivery
1. The service provider can be either the station or the DS. Station services are
implemented in every 802.11 station, including AP stations. Distribution services are provided between BSSs; these services may be implemented in an AP
or in another special-purpose device attached to the distribution system.
2. Three of the services are used to control IEEE 802.11 LAN access and confidentiality. Six of the services are used to support delivery of MAC service data
units (MSDUs) between stations. The MSDU is a block of data passed down
from the MAC user to the MAC layer; typically this is an LLC PDU. If the
MSDU is too large to be transmitted in a single MAC frame, it may be fragmented and transmitted in a series of MAC frames. Fragmentation is discussed
in Section 17.4.
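The fragmentation just described can be sketched as a simple splitting function. The 256-byte threshold below is purely illustrative; the actual fragmentation threshold is a configurable parameter, not a value from the standard.

```python
def fragment_msdu(msdu: bytes, max_fragment: int) -> list[bytes]:
    """Split an MSDU into MPDU payloads no larger than max_fragment bytes.

    Illustrative sketch: the real MAC also adds headers, sequence/fragment
    numbers, and an FCS to each fragment, which are omitted here.
    """
    return [msdu[i:i + max_fragment] for i in range(0, len(msdu), max_fragment)]

llc_pdu = bytes(1000)                      # an MSDU handed down by the LLC layer
fragments = fragment_msdu(llc_pdu, 256)    # 256-byte threshold (hypothetical)
```

A 1000-byte MSDU at this threshold yields four fragments, each carried in its own MAC frame.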
Following the IEEE 802.11 document, we next discuss the services in an order
designed to clarify the operation of an IEEE 802.11 ESS network. MSDU delivery,
which is the basic service, has already been mentioned. Services related to security
are discussed in Section 17.6.
Distribution of Messages within a DS The two services involved with the distribution of messages within a DS are distribution and integration. Distribution is the
primary service used by stations to exchange MAC frames when the frame must traverse the DS to get from a station in one BSS to a station in another BSS. For example,
suppose a frame is to be sent from station 2 (STA 2) to STA 7 in Figure 17.4. The frame
is sent from STA 2 to STA 1, which is the AP for this BSS. The AP gives the frame to
the DS, which has the job of directing the frame to the AP associated with STA 5 in the
target BSS. STA 5 receives the frame and forwards it to STA 7. How the message is
transported through the DS is beyond the scope of the IEEE 802.11 standard.
If the two stations that are communicating are within the same BSS, then the
distribution service logically goes through the single AP of that BSS.
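The example above can be sketched as a lookup of the destination's AP. The standard leaves the internal operation of the DS unspecified, so the dictionary-based routing below is purely illustrative; the station and AP names follow Figure 17.4.

```python
# Toy model of the distribution service: the DS tracks which AP each
# associated station is behind and relays frames between APs.
association = {          # station -> AP it is currently associated with
    "STA2": "STA1 (AP)",
    "STA3": "STA1 (AP)",
    "STA6": "STA5 (AP)",
    "STA7": "STA5 (AP)",
}

def distribute(src: str, dst: str) -> list[str]:
    """Return the hop-by-hop path a MAC frame takes via the DS."""
    src_ap, dst_ap = association[src], association[dst]
    if src_ap == dst_ap:
        # Same BSS: the frame still goes through the single AP of that BSS.
        return [src, src_ap, dst]
    return [src, src_ap, dst_ap, dst]

print(distribute("STA2", "STA7"))   # frame crosses the DS between two APs
print(distribute("STA2", "STA3"))   # same-BSS case: relayed by the one AP
```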
The integration service enables transfer of data between a station on an IEEE
802.11 LAN and a station on an integrated IEEE 802.x LAN. The term integrated
refers to a wired LAN that is physically connected to the DS and whose stations
may be logically connected to an IEEE 802.11 LAN via the integration service. The
integration service takes care of any address translation and media conversion logic
required for the exchange of data.
Association-Related Services The primary purpose of the MAC layer is to
transfer MSDUs between MAC entities; this purpose is fulfilled by the distribution
service. For that service to function, it requires information about stations within the
ESS that is provided by the association-related services. Before the distribution service can deliver data to or accept data from a station, that station must be associated.
Before looking at the concept of association, we need to describe the concept of
mobility. The standard defines three transition types, based on mobility:
• No transition: A station of this type is either stationary or moves only within
the direct communication range of the communicating stations of a single BSS.
• BSS transition: This is defined as a station movement from one BSS to another
BSS within the same ESS. In this case, delivery of data to the station requires
that the addressing capability be able to recognize the new location of the station.
• ESS transition: This is defined as a station movement from a BSS in one ESS
to a BSS within another ESS. This case is supported only in the sense that the
station can move. Maintenance of upper-layer connections supported by
802.11 cannot be guaranteed. In fact, disruption of service is likely to occur.
To deliver a message within a DS, the distribution service needs to know
where the destination station is located. Specifically, the DS needs to know the identity of the AP to which the message should be delivered in order for that message to
reach the destination station. To meet this requirement, a station must maintain an
association with the AP within its current BSS. Three services relate to this requirement:
• Association: Establishes an initial association between a station and an AP. Before
a station can transmit or receive frames on a wireless LAN, its identity and address
must be known. For this purpose, a station must establish an association with an
AP within a particular BSS. The AP can then communicate this information to
other APs within the ESS to facilitate routing and delivery of addressed frames.
• Reassociation: Enables an established association to be transferred from one
AP to another, allowing a mobile station to move from one BSS to another.
• Disassociation: A notification from either a station or an AP that an existing
association is terminated. A station should give this notification before leaving
an ESS or shutting down. However, the MAC management facility protects
itself against stations that disappear without notification.
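The three association-related services amount to updates of the station-to-AP mapping that the distribution service consults. The class and method names below are invented for illustration; the standard specifies the services, not a data structure.

```python
class ESS:
    """Minimal sketch of association state kept on behalf of the DS."""

    def __init__(self):
        self.assoc = {}                      # station -> current AP

    def associate(self, sta, ap):
        self.assoc[sta] = ap                 # initial association with an AP

    def reassociate(self, sta, new_ap):
        # BSS transition: transfer an established association to another AP
        assert sta in self.assoc, "station must already be associated"
        self.assoc[sta] = new_ap

    def disassociate(self, sta):
        # Notification that the association is terminated; the MAC also
        # tolerates stations that disappear without this notification.
        self.assoc.pop(sta, None)

ess = ESS()
ess.associate("STA4", "AP-1")
ess.reassociate("STA4", "AP-2")   # station moved to an adjacent BSS
```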
The IEEE 802.11 MAC layer covers three functional areas: reliable data delivery,
access control, and security. This section covers the first two topics.
Reliable Data Delivery
As with any wireless network, a wireless LAN using the IEEE 802.11 physical and
MAC layers is subject to considerable unreliability. Noise, interference, and other
propagation effects result in the loss of a significant number of frames. Even with
error correction codes, a number of MAC frames may not successfully be received.
This situation can be dealt with by reliability mechanisms at a higher layer, such as
TCP. However, timers used for retransmission at higher layers are typically on the
order of seconds. It is therefore more efficient to deal with errors at the MAC level.
For this purpose, IEEE 802.11 includes a frame exchange protocol. When a station
receives a data frame from another station, it returns an acknowledgment (ACK)
frame to the source station. This exchange is treated as an atomic unit, not to be
interrupted by a transmission from any other station. If the source does not receive
an ACK within a short period of time, either because its data frame was damaged or
because the returning ACK was damaged, the source retransmits the frame.
Thus, the basic data transfer mechanism in IEEE 802.11 involves an exchange
of two frames. To further enhance reliability, a four-frame exchange may be used. In
this scheme, a source first issues a Request to Send (RTS) frame to the destination.
The destination then responds with a Clear to Send (CTS). After receiving the CTS,
the source transmits the data frame, and the destination responds with an ACK. The
RTS alerts all stations that are within reception range of the source that an
exchange is under way; these stations refrain from transmission in order to avoid a
collision between two frames transmitted at the same time. Similarly, the CTS alerts
all stations that are within reception range of the destination that an exchange is
under way. The RTS/CTS portion of the exchange is a required function of the MAC
but may be disabled.
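The retransmission logic of the basic two-frame exchange can be sketched as a small simulation. The loss probabilities and the retry limit of 7 are invented for the sketch, not values taken from the standard; either a damaged data frame or a damaged ACK triggers a retransmission.

```python
import random

def send_with_ack(loss_prob: float, retry_limit: int = 7,
                  rng: random.Random = random.Random(1)) -> int:
    """Return the number of transmissions needed, or -1 if all attempts fail."""
    for attempt in range(1, retry_limit + 1):
        data_lost = rng.random() < loss_prob   # data frame damaged in transit?
        ack_lost = rng.random() < loss_prob    # returning ACK damaged?
        if not data_lost and not ack_lost:
            return attempt                     # source received the ACK
        # No ACK within the short timeout: the source retransmits the frame.
    return -1

print(send_with_ack(loss_prob=0.3))
```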
Medium Access Control
The 802.11 working group considered two types of proposals for a MAC algorithm:
distributed access protocols, which, like Ethernet, distribute the decision to transmit
over all the nodes using a carrier sense mechanism; and centralized access protocols,
which involve regulation of transmission by a centralized decision maker. A distributed access protocol makes sense for an ad hoc network of peer workstations (typically an IBSS) and may also be attractive in other wireless LAN configurations that
consist primarily of bursty traffic. A centralized access protocol is natural for configurations in which a number of wireless stations are interconnected with each other
and some sort of base station that attaches to a backbone wired LAN; it is especially
useful if some of the data is time sensitive or high priority.
The end result for 802.11 is a MAC algorithm called DFWMAC (distributed
foundation wireless MAC) that provides a distributed access control mechanism
with an optional centralized control built on top of that. Figure 17.5 illustrates the
architecture. The lower sublayer of the MAC layer is the distributed coordination
function (DCF). DCF uses a contention algorithm to provide access to all traffic.
Ordinary asynchronous traffic directly uses DCF. The point coordination function
(PCF) is a centralized MAC algorithm used to provide contention-free service. PCF
is built on top of DCF and exploits features of DCF to assure access for its users. Let
us consider these two sublayers in turn.
Figure 17.5 IEEE 802.11 Protocol Architecture (logical link control above the point coordination function (PCF), which sits on top of the distributed coordination function; physical layers: IEEE 802.11 at 1 and 2 Mbps; IEEE 802.11a at 6, 9, 12, 18, 24, 36, 48, 54 Mbps; IEEE 802.11b at 5.5 and 11 Mbps; IEEE 802.11g at 6, 9, 12, 18, 24, 36, 48, 54 Mbps)
Distributed Coordination Function The DCF sublayer makes use of a simple CSMA (carrier sense multiple access) algorithm. If a station has a MAC frame
to transmit, it listens to the medium. If the medium is idle, the station may transmit;
otherwise the station must wait until the current transmission is complete before
transmitting. The DCF does not include a collision detection function (i.e.,
CSMA/CD) because collision detection is not practical on a wireless network. The
dynamic range of the signals on the medium is very large, so that a transmitting station cannot effectively distinguish incoming weak signals from noise and the effects
of its own transmission.
To ensure the smooth and fair functioning of this algorithm, DCF includes a
set of delays that amounts to a priority scheme. Let us start by considering a single
delay known as an interframe space (IFS). In fact, there are three different IFS values, but the algorithm is best explained by initially ignoring this detail. Using an IFS,
the rules for CSMA access are as follows (Figure 17.6):
1. A station with a frame to transmit senses the medium. If the medium is idle, it
waits to see if the medium remains idle for a time equal to IFS. If so, the station
may transmit immediately.
Figure 17.6 IEEE 802.11 Medium Access Control Logic
2. If the medium is busy (either because the station initially finds the medium busy
or because the medium becomes busy during the IFS idle time), the station
defers transmission and continues to monitor the medium until the current transmission is over.
3. Once the current transmission is over, the station delays another IFS. If the
medium remains idle for this period, then the station backs off a random amount
of time and again senses the medium. If the medium is still idle, the station may
transmit. During the backoff time, if the medium becomes busy, the backoff timer
is halted and resumes when the medium becomes idle.
4. If the transmission is unsuccessful, which is determined by the absence of an
acknowledgement, then it is assumed that a collision has occurred.
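The four rules above can be sketched as a slot-by-slot simulation. Representing the medium as a list of busy/idle slots, and measuring the IFS and backoff in whole slots, are modeling assumptions made for the sketch.

```python
def dcf_transmit_slot(medium_busy: list[bool], ifs: int, backoff: int) -> int:
    """Return the slot index at which the station transmits."""
    idle_run = 0      # consecutive idle slots observed (the IFS wait)
    remaining = None  # backoff counter; None until an IFS wait completes
    for t, busy in enumerate(medium_busy):
        if busy:
            idle_run = 0           # rule 2: defer; any backoff timer is halted
            continue
        idle_run += 1
        if remaining is None:
            if idle_run == ifs:
                if t + 1 == ifs:   # rule 1: medium idle from the start
                    return t + 1   # transmit immediately after the IFS
                remaining = backoff  # rule 3: back off after a busy period
        elif idle_run > ifs:       # resume countdown after IFS re-elapses
            remaining -= 1
            if remaining == 0:
                return t + 1
    raise RuntimeError("medium never allowed transmission in this trace")

print(dcf_transmit_slot([True, True] + [False] * 10, ifs=3, backoff=2))
```

Note how a busy period during the backoff (rule 3) merely halts the counter; the countdown resumes once the medium has again been idle for an IFS.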
To ensure that backoff maintains stability, binary exponential backoff,
described in Chapter 16, is used. Binary exponential backoff provides a means of
handling a heavy load. Repeated failed attempts to transmit result in longer and
longer backoff times, which helps to smooth out the load. Without such a backoff,
the following situation could occur: Two or more stations attempt to transmit at the
same time, causing a collision. These stations then immediately attempt to retransmit, causing a new collision.
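The doubling of the backoff range can be expressed in a few lines. The CWmin = 31 and CWmax = 1023 slot values are those commonly associated with the 802.11 DSSS physical layer; treat them here as illustrative constants.

```python
import random

def contention_window(retries: int, cw_min: int = 31, cw_max: int = 1023) -> int:
    """Upper bound (in slots) of the backoff range after `retries` failures."""
    # Each failed attempt roughly doubles the window, capped at cw_max.
    return min((cw_min + 1) * (2 ** retries) - 1, cw_max)

def draw_backoff(retries: int, rng=random.Random(0)) -> int:
    """Random backoff (in slots) drawn uniformly from the current window."""
    return rng.randint(0, contention_window(retries))

print([contention_window(r) for r in range(6)])
```

Successive failures thus spread retransmissions over ever larger windows, which smooths out the load under heavy contention.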
The preceding scheme is refined for DCF to provide priority-based access by
the simple expedient of using three values for IFS:
• SIFS (short IFS): The shortest IFS, used for all immediate response actions, as
explained in the following discussion
• PIFS (point coordination function IFS): A midlength IFS, used by the centralized controller in the PCF scheme when issuing polls
• DIFS (distributed coordination function IFS): The longest IFS, used as a minimum delay for asynchronous frames contending for access
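The three IFS values are derived from two physical-layer constants. The numbers below are the 802.11b (DSSS) values in microseconds; other physical layers use different constants, but the ordering SIFS < PIFS < DIFS, which creates the priority scheme, always holds.

```python
SLOT_TIME = 20   # slot time in microseconds (802.11b value)
SIFS = 10        # short IFS in microseconds (802.11b value)

PIFS = SIFS + SLOT_TIME        # 30 us: PCF polls beat ordinary traffic
DIFS = SIFS + 2 * SLOT_TIME    # 50 us: minimum wait for asynchronous frames

# A shorter wait means the station seizes the medium first, so the
# ordering of the three values is what implements the priority scheme.
assert SIFS < PIFS < DIFS
```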
Figure 17.7a illustrates the use of these time values. Consider first the SIFS.
Any station using SIFS to determine transmission opportunity has, in effect, the
highest priority, because it will always gain access in preference to a station waiting
an amount of time equal to PIFS or DIFS. The SIFS is used in the following circumstances:
Figure 17.7 IEEE 802.11 MAC Timing: (a) basic access method; (b) PCF superframe construction
• Acknowledgment (ACK): When a station receives a frame addressed only to
itself (not multicast or broadcast), it responds with an ACK frame after waiting only for an SIFS gap. This has two desirable effects. First, because collision
detection is not used, the likelihood of collisions is greater than with
CSMA/CD, and the MAC-level ACK provides for efficient collision recovery.
Second, the SIFS can be used to provide efficient delivery of an LLC protocol
data unit (PDU) that requires multiple MAC frames. In this case, the following
scenario occurs. A station with a multiframe LLC PDU to transmit sends out
the MAC frames one at a time. Each frame is acknowledged by the recipient
after SIFS. When the source receives an ACK, it immediately (after SIFS)
sends the next frame in the sequence. The result is that once a station has contended for the channel, it will maintain control of the channel until it has sent
all of the fragments of an LLC PDU.
• Clear to Send (CTS): A station can ensure that its data frame will get through
by first issuing a small Request to Send (RTS) frame. The station to which this
frame is addressed should immediately respond with a CTS frame if it is ready
to receive. All other stations receive the RTS and defer using the medium.
• Poll response: This is explained in the following discussion of PCF.
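The multiframe ACK scenario described above can be sketched as a simple loop. The channel object and its send()/wait_ack()/wait() methods are hypothetical stand-ins for the MAC transmit machinery, not a real API.

```python
def send_fragments(channel, fragments, sifs_us=10):
    """Send all fragments of an LLC PDU back to back.

    After each ACK the sender waits only SIFS before the next fragment,
    so other stations (which must wait at least DIFS) never get a chance
    to seize the medium mid-burst. A missing ACK aborts the burst; the
    station must then recontend for access.
    """
    for frag in fragments:
        channel.send(frag)
        if not channel.wait_ack():
            return False       # collision or loss: give up the medium
        channel.wait(sifs_us)  # SIFS gap keeps control of the channel
    return True
```

This captures why a station that has won the channel keeps it until the whole PDU is delivered: no contender's timer can expire inside an SIFS gap.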
The next longest IFS interval is the PIFS. This is used by the centralized controller in issuing polls and takes precedence over normal contention traffic. However, those frames transmitted using SIFS have precedence over a PCF poll.
Finally, the DIFS interval is used for all ordinary asynchronous traffic.
Point Coordination Function PCF is an alternative access method implemented on top of the DCF. The operation consists of polling by the centralized polling
master (point coordinator). The point coordinator makes use of PIFS when issuing
polls. Because PIFS is smaller than DIFS, the point coordinator can seize the medium
and lock out all asynchronous traffic while it issues polls and receives responses.
As an extreme, consider the following possible scenario. A wireless network is
configured so that a number of stations with time-sensitive traffic are controlled by
the point coordinator while remaining traffic contends for access using CSMA. The
point coordinator could issue polls in a round-robin fashion to all stations configured
for polling. When a poll is issued, the polled station may respond using SIFS. If the
point coordinator receives a response, it issues another poll using PIFS. If no response
is received during the expected turnaround time, the coordinator issues a poll.
If the discipline of the preceding paragraph were implemented, the point coordinator would lock out all asynchronous traffic by repeatedly issuing polls. To prevent this, an interval known as the superframe is defined. During the first part of this
interval, the point coordinator issues polls in a round-robin fashion to all stations
configured for polling. The point coordinator then idles for the remainder of the
superframe, allowing a contention period for asynchronous access.
Figure 17.7b illustrates the use of the superframe. At the beginning of a superframe, the point coordinator may optionally seize control and issue polls for a given
period of time. This interval varies because of the variable frame size issued by
responding stations. The remainder of the superframe is available for contention-based access. At the end of the superframe interval, the point coordinator contends
[Figure 17.8 IEEE 802.11 MAC Frame Format. FC = frame control; D/I = duration/connection ID; SC = sequence control. The frame body is 0 to 2312 octets.]
for access to the medium using PIFS. If the medium is idle, the point coordinator
gains immediate access and a full superframe period follows. However, the medium
may be busy at the end of a superframe. In this case, the point coordinator must wait
until the medium is idle to gain access; this results in a foreshortened superframe
period for the next cycle.
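The superframe arithmetic can be made concrete with a rough sketch. The poll cost and period lengths below are invented parameters, but they show how a busy-medium carryover foreshortens the contention-free phase.

```python
def polls_in_cfp(cfp_us: int, poll_cost_us: int, n_stations: int,
                 carryover_us: int = 0) -> int:
    """Number of round-robin polls that fit in one superframe's
    contention-free period, after deducting the foreshortening caused
    by a medium that was still busy at the superframe boundary."""
    available = max(cfp_us - carryover_us, 0)
    return min(available // poll_cost_us, n_stations)
```

With a 60-ms contention-free period and 5 ms per poll-and-response, 12 of 20 configured stations get polled; a 30-ms busy carryover halves that to 6.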
MAC Frame
Figure 17.8 shows the 802.11 frame format. This general format is used for all data
and control frames, but not all fields are used in all contexts. The fields are as follows:
• Frame Control: Indicates the type of frame (control, management, or data)
and provides control information. Control information includes whether the
frame is to or from a DS, fragmentation information, and privacy information.
• Duration/Connection ID: If used as a duration field, indicates the time (in
microseconds) the channel will be allocated for successful transmission of a
MAC frame. In some control frames, this field contains an association, or connection, identifier.
• Addresses: The number and meaning of the 48-bit address fields depend on
context. The transmitter address and receiver address are the MAC addresses
of stations joined to the BSS that are transmitting and receiving frames over
the wireless LAN. The service set ID (SSID) identifies the wireless LAN over
which a frame is transmitted. For an IBSS, the SSID is a random number generated at the time the network is formed. For a wireless LAN that is part of a
larger configuration, the SSID identifies the BSS over which the frame is transmitted; specifically, the SSID is the MAC-level address of the AP for this BSS (Figure 17.4). Finally, the source address and destination address are the MAC addresses of stations, wireless or otherwise, that are the ultimate source and destination of this frame. The source address may be identical to the transmitter address, and the destination address may be identical to the receiver address.
• Sequence Control: Contains a 4-bit fragment number subfield, used for fragmentation and reassembly, and a 12-bit sequence number used to number
frames sent between a given transmitter and receiver.
• Frame Body: Contains an MSDU or a fragment of an MSDU. The MSDU is an
LLC protocol data unit or MAC control information.
• Frame Check Sequence: A 32-bit cyclic redundancy check.
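The fixed leading fields can be packed as a byte string. This sketch covers only the three-address form of the header, omitting the optional fourth address, and the little-endian packing of the 16-bit fields is an assumption of this illustration rather than a claim from the text.

```python
import struct

def seq_ctrl(seq_num: int, frag_num: int) -> int:
    """Sequence Control: 4-bit fragment number in the low bits,
    12-bit sequence number in the high bits."""
    return ((seq_num & 0xFFF) << 4) | (frag_num & 0xF)

def pack_header(fc: int, duration: int, a1: bytes, a2: bytes, a3: bytes,
                sc: int) -> bytes:
    """Pack Frame Control, Duration/ID, three 48-bit addresses, and
    Sequence Control into the fixed header that precedes the frame body."""
    assert len(a1) == len(a2) == len(a3) == 6
    return struct.pack("<HH6s6s6sH", fc, duration, a1, a2, a3, sc)
```

Two 16-bit fields, three 6-octet addresses, and the 16-bit sequence control give a 24-octet fixed header before the variable-length frame body.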
We now look at the three MAC frame types.
Control Frames Control frames assist in the reliable delivery of data frames.
There are six control frame subtypes:
• Power Save-Poll (PS-Poll): This frame is sent by any station to the station that
includes the AP (access point). Its purpose is to request that the AP transmit a
frame that has been buffered for this station while the station was in power-saving mode.
• Request to Send (RTS): This is the first frame in the four-way frame exchange
discussed under the subsection on reliable data delivery at the beginning of
Section 17.3. The station sending this message is alerting a potential destination, and all other stations within reception range, that it intends to send a data
frame to that destination.
• Clear to Send (CTS): This is the second frame in the four-way exchange. It is
sent by the destination station to the source station to grant permission to send
a data frame.
• Acknowledgment: Provides an acknowledgment from the destination to the
source that the immediately preceding data, management, or PS-Poll frame
was received correctly.
• Contention-Free (CF)-end: Announces the end of a contention-free period
that is part of the point coordination function.
• CF-End + CF-Ack: Acknowledges the CF-end. This frame ends the contention-free period and releases stations from the restrictions associated with
that period.
Data Frames There are eight data frame subtypes, organized into two groups.
The first four subtypes define frames that carry upper-level data from the source
station to the destination station. The four data-carrying frames are as follows:
• Data: This is the simplest data frame. It may be used in both a contention
period and a contention-free period.
• Data + CF-Ack: May only be sent during a contention-free period. In addition to carrying data, this frame acknowledges previously received data.
• Data + CF-Poll: Used by a point coordinator to deliver data to a mobile station and also to request that the mobile station send a data frame that it may
have buffered.
• Data + CF-Ack + CF-Poll: Combines the functions of the Data + CF-Ack
and Data + CF-Poll into a single frame.
The remaining four subtypes of data frames do not in fact carry any user
data. The Null Function data frame carries no data, polls, or acknowledgments. It
is used only to carry the power management bit in the frame control field to the
AP, to indicate that the station is changing to a low-power operating state. The
remaining three frames (CF-Ack, CF-Poll, CF-Ack + CF-Poll) have the same
functionality as the corresponding data frame subtypes in the preceding list
(Data + CF-Ack, Data + CF-Poll, Data + CF-Ack + CF-Poll) but without the data.
Management Frames Management frames are used to manage communications between stations and APs. Functions covered include the management of associations (request, response, reassociation, disassociation) and authentication.
The physical layer for IEEE 802.11 has been issued in four stages. The first part, simply called IEEE 802.11, includes the MAC layer and three physical layer specifications, two in the 2.4-GHz band (ISM) and one in the infrared, all operating at 1 and
2 Mbps. IEEE 802.11a operates in the 5-GHz band at data rates up to 54 Mbps.
IEEE 802.11b operates in the 2.4-GHz band at 5.5 and 11 Mbps. IEEE 802.11g also
operates in the 2.4-GHz band, at data rates up to 54 Mbps. Table 17.4 provides some
details. We look at each of these in turn.
Original IEEE 802.11 Physical Layer
Three physical media are defined in the original 802.11 standard:
• Direct sequence spread spectrum (DSSS) operating in the 2.4-GHz ISM band,
at data rates of 1 Mbps and 2 Mbps. In the United States, the FCC (Federal
Communications Commission) requires no licensing for the use of this band.
The number of channels available depends on the bandwidth allocated by the
various national regulatory agencies. This ranges from 13 in most European
countries to just one available channel in Japan.
• Frequency-hopping spread spectrum (FHSS) operating in the 2.4-GHz ISM
band, at data rates of 1 Mbps and 2 Mbps. The number of channels available
ranges from 23 in Japan to 70 in the United States.
Table 17.4 IEEE 802.11 Physical Layer Standards

                           802.11           802.11a              802.11b          802.11g
Available bandwidth        83.5 MHz         300 MHz              83.5 MHz         83.5 MHz
Unlicensed frequency       2.4–2.4835 GHz   5.15–5.35 GHz;       2.4–2.4835 GHz   2.4–2.4835 GHz
of operation                                5.725–5.825 GHz
Number of nonoverlapping   3 (indoor/       4 indoor; 4 (indoor/ 3 (indoor/       3 (indoor/
channels                   outdoor)         outdoor); 4 outdoor  outdoor)         outdoor)
Data rate per channel      1, 2 Mbps        6, 9, 12, 18, 24,    1, 2, 5.5,       1, 2, 5.5, 6, 9, 11,
                                            36, 48, 54 Mbps      11 Mbps          12, 18, 24, 36, 48,
                                                                                  54 Mbps

(802.11g is Wi-Fi compatible at 11 Mbps and below.)
• Infrared at 1 Mbps and 2 Mbps operating at a wavelength between 850 and
950 nm
Direct Sequence Spread Spectrum Up to three nonoverlapping channels,
each with a data rate of 1 Mbps or 2 Mbps, can be used in the DSSS scheme. Each
channel has a bandwidth of 5 MHz. The encoding scheme that is used is DBPSK
(differential binary phase shift keying) for the 1-Mbps rate and DQPSK for the
2-Mbps rate.
Recall from Chapter 9 that a DSSS system makes use of a chipping code, or
pseudonoise sequence, to spread the data rate and hence the bandwidth of the signal. For IEEE 802.11, a Barker sequence is used.
A Barker sequence is a binary {-1, +1} sequence {s(t)} of length n with the property that its autocorrelation values R(τ) satisfy |R(τ)| ≤ 1 for all |τ| ≤ (n - 1). Autocorrelation is defined by the following formula:

    R(τ) = (1/N) ∑ (k = 1 to N) B_k B_(k-τ)

where the B_i are the bits of the sequence.
Further, the Barker property is preserved under the following transformations:
    s(t) → (-1)^t s(t)
    s(t) → -s(t)
    s(t) → -s(n - 1 - t)
as well as under compositions of these transformations. Only the following Barker
sequences are known:
    n = 2     + +
    n = 3     + + -
    n = 4     + + - +
    n = 5     + + + - +
    n = 7     + + + - - + -
    n = 11    + - + + - + + + - - -
    n = 13    + + + + + - - + + - + - +
IEEE 802.11 DSSS uses the 11-chip Barker sequence. Each data binary 1 is mapped into the sequence {+ - + + - + + + - - -}, and each binary 0 is mapped into the sequence {- + - - + - - - + + +}.
Important characteristics of Barker sequences are their robustness against
interference and their insensitivity to multipath propagation.
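The Barker property is easy to check numerically. The sketch below uses the unnormalized autocorrelation sum, for which the off-peak sidelobes of a Barker sequence have magnitude at most 1.

```python
def sidelobe(seq, tau):
    """Aperiodic autocorrelation sum: sum over k of B_k * B_(k+tau)."""
    return sum(seq[k] * seq[k + tau] for k in range(len(seq) - tau))

# The 11-chip sequence used by IEEE 802.11 DSSS (mapping of binary 1).
barker11 = [+1, -1, +1, +1, -1, +1, +1, +1, -1, -1, -1]
```

At zero shift the sum peaks at 11 (the sequence length); at every other shift its magnitude is at most 1. That sharp, narrow correlation peak is what makes the code robust against interference and multipath.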
Frequency-Hopping Spread Spectrum Recall from Chapter 9 that a FHSS
system makes use of multiple channels, with the signal hopping from one channel
to another based on a pseudonoise sequence. In the case of the IEEE 802.11
scheme, 1-MHz channels are used.
The details of the hopping scheme are adjustable. For example, the minimum hop rate for the United States is 2.5 hops per second. The minimum hop distance in frequency is 6 MHz in North America and most of Europe and 5 MHz in Japan. (See Appendix J for a discussion of correlation and orthogonality.)
For modulation, the FHSS scheme uses two-level Gaussian FSK for the 1-Mbps
system. The bits zero and one are encoded as deviations from the current carrier
frequency. For 2 Mbps, a four-level GFSK scheme is used, in which four different
deviations from the center frequency define the four 2-bit combinations.
Infrared The IEEE 802.11 infrared scheme is omnidirectional rather than point to
point. A range of up to 20 m is possible. The modulation scheme for the 1-Mbps data
rate is known as 16-PPM (pulse position modulation). In pulse position modulation
(PPM), the input value determines the position of a narrow pulse relative to the
clocking time. The advantage of PPM is that it reduces the output power required of
the infrared source. For 16-PPM, each group of 4 data bits is mapped into one of the
16-PPM symbols; each symbol is a string of 16 pulse positions. Each 16-pulse string
consists of fifteen 0s and one binary 1. For the 2-Mbps data rate, each group of 2
data bits is mapped into one of four 4-pulse-position sequences. Each sequence consists of three 0s and one binary 1. The actual transmission uses an intensity modulation scheme, in which the presence of a signal corresponds to a binary 1 and the
absence of a signal corresponds to binary 0.
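A minimal sketch of the PPM mapping follows. The direct binary position index used here is an illustrative assumption; the standard fixes a specific mapping table.

```python
def ppm_symbol(value: int, positions: int = 16) -> list:
    """Map a data value (4 bits for 16-PPM, 2 bits for 4-PPM) to a symbol
    of `positions` pulse slots containing exactly one 1. The slot index
    carries the information, so with intensity modulation only one narrow
    pulse is emitted per symbol, keeping the required output power low."""
    assert 0 <= value < positions
    symbol = [0] * positions
    symbol[value] = 1
    return symbol
```

For 16-PPM, each 4-bit group yields a 16-slot symbol with fifteen 0s and a single 1, exactly as described above.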
IEEE 802.11a
Channel Structure IEEE 802.11a makes use of the frequency band called the
Unlicensed National Information Infrastructure (UNII), which is divided into
three parts. The UNII-1 band (5.15 to 5.25 GHz) is intended for indoor use; the
UNII-2 band (5.25 to 5.35 GHz) can be used either indoors or outdoors; and the
UNII-3 band (5.725 to 5.825 GHz) is for outdoor use.
IEEE 802.11a has several advantages over IEEE 802.11b/g:
• IEEE 802.11a utilizes more available bandwidth than 802.11b/g. Each UNII
band provides four nonoverlapping channels, for a total of 12 across the allocated spectrum.
• IEEE 802.11a provides much higher data rates than 802.11b and the same
maximum data rate as 802.11g.
• IEEE 802.11a uses a different, relatively uncluttered frequency spectrum
(5 GHz).
Coding and Modulation Unlike the 2.4-GHz specifications, IEEE 802.11a does
not use a spread spectrum scheme but rather uses orthogonal frequency division
multiplexing (OFDM). Recall from Section 11.2 that OFDM, also called multicarrier modulation, uses multiple carrier signals at different frequencies, sending some
of the bits on each channel. This is similar to FDM. However, in the case of OFDM,
all of the subchannels are dedicated to a single data source.
To complement OFDM, the specification supports the use of a variety of
modulation and coding alternatives. The system uses up to 48 subcarriers that are
modulated using BPSK, QPSK, 16-QAM, or 64-QAM. Subcarrier frequency
[Figure 17.9 IEEE 802.11 Physical-Level Protocol Data Units. (a) IEEE 802.11a physical PDU: a PLCP preamble of 12 OFDM symbols; a Signal field (one OFDM symbol at BPSK, r = 1/2) carrying Rate (4 bits), a reserved bit, Length (12 bits), a parity bit, and a 6-bit tail; then a variable number of OFDM symbols of data at the rate indicated in Signal. (b) IEEE 802.11b physical PDU: a 72-bit PLCP preamble (56-bit Sync plus 16-bit SFD) at 1 Mbps DBPSK; a 48-bit PLCP header (8-bit Signal, 8-bit Service, 16-bit Length, 16-bit CRC) at 2 Mbps DQPSK; then a variable number of MPDU bits at 2, 5.5, or 11 Mbps.]
spacing is 0.3125 MHz, and each subcarrier transmits at a rate of 250 kbaud. A
convolutional code at a rate of 1/2, 2/3, or 3/4 provides forward error correction.
The combination of modulation technique and coding rate determines the data rate.
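The data-rate arithmetic can be made concrete: 48 data subcarriers at 250 kbaud each, multiplied by the bits carried per subcarrier symbol and by the coding rate.

```python
SUBCARRIERS = 48          # data subcarriers per OFDM symbol
BAUD = 250_000            # symbol rate per subcarrier (250 kbaud)

def ofdm_rate(bits_per_symbol: int, code_rate: float) -> int:
    """802.11a data rate in bps for a modulation/coding combination."""
    return int(SUBCARRIERS * bits_per_symbol * code_rate * BAUD)
```

BPSK (1 bit/symbol) at rate 1/2 gives the 6-Mbps base rate, while 64-QAM (6 bits/symbol) at rate 3/4 yields the 54-Mbps maximum: 48 × 6 × 0.75 × 250,000 = 54,000,000 bps.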
Physical-Layer Frame Structure The primary purpose of the physical layer is
to transmit medium access control (MAC) protocol data units (MPDUs) as directed
by the 802.11 MAC layer. The PLCP sublayer provides the framing and signaling
bits needed for the OFDM transmission, and the PMD sublayer performs the actual
encoding and transmission operation.
Figure 17.9a illustrates the physical layer frame format. The PLCP Preamble
field enables the receiver to acquire an incoming OFDM signal and synchronize the
demodulator. Next is the Signal field, which consists of 24 bits encoded as a single
OFDM symbol. The Preamble and Signal fields are transmitted at 6 Mbps using
BPSK. The signal field consists of the following subfields:
• Rate: Specifies the data rate at which the data field portion of the frame is transmitted
• r: reserved for future use
• Length: Number of octets in the MAC PDU
• P: An even parity bit for the 17 bits in the Rate, r, and Length subfields
• Tail: Consists of 6 zero bits appended to the symbol to bring the convolutional
encoder to zero state
The Data field consists of a variable number of OFDM symbols transmitted at
the data rate specified in the Rate subfield. Prior to transmission, all of the bits of
the Data field are scrambled (see Appendix 16C for a discussion of scrambling). The
Data field consists of four subfields:
• Service: Consists of 16 bits, with the first 7 bits set to zeros to synchronize the
descrambler in the receiver, and the remaining 9 bits (all zeros) reserved for
future use.
• MAC PDU: Handed down from the MAC layer. The format is shown in Figure 17.8.
• Tail: Produced by replacing the six scrambled bits following the MPDU end
with 6 bits of all zeros; used to reinitialize the convolutional encoder.
• Pad: The number of bits required to make the Data field a multiple of the
number of bits in an OFDM symbol (48, 96, 192, or 288).
IEEE 802.11b
IEEE 802.11b is an extension of the IEEE 802.11 DSSS scheme, providing data
rates of 5.5 and 11 Mbps in the ISM band. The chipping rate is 11 MHz, which is
the same as the original DSSS scheme, thus providing the same occupied bandwidth. To achieve a higher data rate in the same bandwidth at the same chipping
rate, a modulation scheme known as complementary code keying (CCK) is used.
The CCK modulation scheme is quite complex and is not examined in
detail here. Figure 17.10 provides an overview of the scheme for the 11-Mbps
rate. Input data are treated in blocks of 8 bits at a rate of 1.375 MHz
(8 bits/symbol × 1.375 MHz = 11 Mbps). Six of these bits are mapped into one
of 64 code sequences derived from a 64 × 64 matrix known as the Walsh matrix
[Figure 17.10 11-Mbps CCK Modulation Scheme. Data arrive in 8-bit blocks at 1.375 MHz; six bits pick one of 64 complex code sequences, which are clocked out at 11 MHz as I and Q outputs.]
(discussed in [STAL05]). The output of the mapping, plus the two additional
bits, forms the input to a QPSK modulator.
An optional alternative to CCK is known as packet binary convolutional coding (PBCC). PBCC provides for potentially more efficient transmission at the cost
of increased computation at the receiver. PBCC was incorporated into 802.11b in
anticipation of its need for higher data rates for future enhancements to the standard.
Physical-Layer Frame Structure IEEE 802.11b defines two physical-layer
frame formats, which differ only in the length of the preamble. The long preamble of
144 bits is the same as used in the original 802.11 DSSS scheme and allows interoperability with other legacy systems. The short preamble of 72 bits provides improved
throughput efficiency. Figure 17.9b illustrates the physical layer frame format with
the short preamble. The PLCP Preamble field enables the receiver to acquire an
incoming signal and synchronize the demodulator. It consists of two subfields: a
56-bit Sync field for synchronization, and a 16-bit start-of-frame delimiter (SFD).
The preamble is transmitted at 1 Mbps using differential BPSK and Barker code spreading.
Following the preamble is the PLCP Header, which is transmitted at 2 Mbps
using DQPSK. It consists of the following subfields:
• Signal: Specifies the data rate at which the MPDU portion of the frame is transmitted.
• Service: Only 3 bits of this 8-bit field are used in 802.11b. One bit indicates
whether the transmit frequency and symbol clocks use the same local oscillator. Another bit indicates whether CCK or PBCC encoding is used. A third bit
acts as an extension to the Length subfield.
• Length: Indicates the length of the MPDU field by specifying the number of
microseconds necessary to transmit the MPDU. Given the data rate, the length
of the MPDU in octets can be calculated. For any data rate over 8 Mbps, the
length extension bit from the Service field is needed to resolve a rounding ambiguity.
• CRC: A 16-bit error detection code used to protect the Signal, Service, and
Length fields.
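The Length-field calculation can be sketched as follows. This is a simplified rendering of the 802.11b rule (Length in microseconds, rounded up at 11 Mbps, with the extension bit resolving the rounding), covering only the 2- and 11-Mbps cases; the 5.5-Mbps case is analogous.

```python
import math

def length_field(octets: int, rate_mbps: int):
    """Encode an MPDU length as (Length in microseconds, extension bit)."""
    if rate_mbps == 2:
        return octets * 4, 0               # 4 us per octet, exact
    us = math.ceil(octets * 8 / 11)        # 11 Mbps: round up to whole us
    ext = 1 if (us * 11) // 8 > octets else 0
    return us, ext

def mpdu_octets(us: int, rate_mbps: int, ext: int = 0) -> int:
    """Recover the MPDU length in octets from the received header fields."""
    if rate_mbps == 2:
        return us // 4
    return (us * 11) // 8 - ext            # extension bit fixes the rounding
```

For example, a 3-octet MPDU at 11 Mbps needs 2.18 µs, rounded up to 3 µs; without the extension bit the receiver would compute floor(3 × 11/8) = 4 octets, so the bit is set to recover the true length.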
The MPDU field consists of a variable number of bits transmitted at the
data rate specified in the Signal subfield. Prior to transmission, all of the bits of
the physical layer PDU are scrambled (see Appendix 16C for a discussion of scrambling).
IEEE 802.11g
IEEE 802.11g extends 802.11b to data rates above 20 Mbps, up to 54 Mbps. Like
802.11b, 802.11g operates in the 2.4-GHz range and thus the two are compatible.
The standard is designed so that 802.11b devices will work when connected to an
802.11g AP, and 802.11g devices will work when connected to an 802.11b AP, in both
cases using the lower 802.11b data rate.
[Table 17.5 Estimated Distance (m) Versus Data Rate]
IEEE 802.11g offers a wider array of data rate and modulation scheme
options. IEEE 802.11g provides compatibility with 802.11 and 802.11b by specifying
the same modulation and framing schemes as these standards for 1, 2, 5.5, and
11 Mbps. At data rates of 6, 9, 12, 18, 24, 36, 48, and 54 Mbps, 802.11g adopts the
802.11a OFDM scheme, adapted for the 2.4-GHz band; this is referred to as ERP-OFDM, with ERP standing for extended rate physical layer. In addition, an ERP-PBCC scheme is used to provide data rates of 22 and 33 Mbps.
The IEEE 802.11 standards do not include a specification of speed versus distance objectives. Different vendors will give different values, depending on the environment. Table 17.5, based on [LAYL04], gives estimated values for a typical office environment.
There are two characteristics of a wired LAN that are not inherent in a wireless LAN:
1. In order to transmit over a wired LAN, a station must be physically connected to the LAN. On the other hand, with a wireless LAN, any station
within radio range of the other devices on the LAN can transmit. In a sense,
there is a form of authentication with a wired LAN, in that it requires some
positive and presumably observable action to connect a station to a wired LAN.
2. Similarly, in order to receive a transmission from a station that is part of a
wired LAN, the receiving station must also be attached to the wired LAN. On
the other hand, with a wireless LAN, any station within radio range can
receive. Thus, a wired LAN provides a degree of privacy, limiting reception of
data to stations connected to the LAN.
Access and Privacy Services
IEEE 802.11 defines three services that provide a wireless LAN with these two features:
• Authentication: Used to establish the identity of stations to each other. In a
wired LAN, it is generally assumed that access to a physical connection conveys authority to connect to the LAN. This is not a valid assumption for a wireless LAN, in which connectivity is achieved simply by having an attached
antenna that is properly tuned. The authentication service is used by stations
to establish their identity with stations they wish to communicate with. IEEE
802.11 supports several authentication schemes and allows for expansion of
the functionality of these schemes. The standard does not mandate any particular authentication scheme, which could range from relatively unsecure handshaking to public-key encryption schemes. However, IEEE 802.11 requires
mutually acceptable, successful authentication before a station can establish
an association with an AP.
• Deauthentication: This service is invoked whenever an existing authentication
is to be terminated.
• Privacy: Used to prevent the contents of messages from being read by other
than the intended recipient. The standard provides for the optional use of
encryption to assure privacy.
Wireless LAN Security Standards
The original 802.11 specification included a set of security features for privacy and
authentication that, unfortunately, were quite weak. For privacy, 802.11 defined the
Wired Equivalent Privacy (WEP) algorithm. The privacy portion of the 802.11 standard contained major weaknesses. Subsequent to the development of WEP, the
802.11i task group has developed a set of capabilities to address the WLAN security
issues. In order to accelerate the introduction of strong security into WLANs, the
Wi-Fi Alliance promulgated Wi-Fi Protected Access (WPA) as a Wi-Fi standard.
WPA is a set of security mechanisms that eliminates most 802.11 security issues and
was based on the current state of the 802.11i standard. As 802.11i evolves, WPA will
evolve to maintain compatibility.
WPA is examined in Chapter 21.
[PAHL95] and [BANT94] are detailed survey articles on wireless LANs. [KAHN97] provides
good coverage of infrared LANs.
[ROSH04] provides a good up-to-date technical treatment of IEEE 802.11. Another
useful book is [BING02]. [OHAR99] is an excellent technical treatment of IEEE 802.11.
Another good treatment is [LARO02]. [CROW97] is a good survey article on the 802.11
standards but does not cover IEEE 802.11a and IEEE 802.11b. A brief but useful survey of
802.11 is [MCFA03]. [GEIE01] has a good discussion of IEEE 802.11a. [PETR00] summarizes IEEE 802.11b. [SHOE02] provides an overview of IEEE 802.11g. [XIAO04] discusses IEEE 802.11e.
BANT94 Bantz, D., and Bauchot, F. “Wireless LAN Design Alternatives.” IEEE Network,
March/April 1994.
BING02 Bing, B. Wireless Local Area Networks. New York: Wiley, 2002.
CROW97 Crow, B., et al. “IEEE 802.11 Wireless Local Area Networks.” IEEE
Communications Magazine, September 1997.
GEIE01 Geier, J. “Enabling Fast Wireless Networks with OFDM.” Communications
System Design, February 2001.
KAHN97 Kahn, J., and Barry, J. “Wireless Infrared Communications.” Proceedings of the
IEEE, February 1997.
LARO02 LaRocca, J., and LaRocca, R. 802.11 Demystified. New York: McGraw-Hill, 2002.
MCFA03 McFarland, B., and Wong, M. “The Family Dynamics of 802.11.” ACM Queue,
May 2003.
OHAR99 Ohara, B., and Petrick, A. IEEE 802.11 Handbook: A Designer’s Companion.
New York: IEEE Press, 1999.
PAHL95 Pahlavan, K.; Probert, T.; and Chase, M. “Trends in Local Wireless Networks.”
IEEE Communications Magazine, March 1995.
PETR00 Petrick, A. “IEEE 802.11b—Wireless Ethernet.” Communications System
Design, June 2000.
ROSH04 Roshan, P., and Leary, J. 802.11 Wireless LAN Fundamentals. Indianapolis:
Cisco Press, 2004.
SHOE02 Shoemake, M. “IEEE 802.11g Jells as Applications Mount.” Communications
System Design, April 2002.
XIAO04 Xiao, Y. “IEEE 802.11e: QoS Provisioning at the MAC Layer.” IEEE
Communications Magazine, June 2004.
Recommended Web sites:
• Wireless LAN Association: Gives an introduction to the technology, including a discussion of implementation considerations and case studies from users. Links to related sites.
• The IEEE 802.11 Wireless LAN Working Group: Contains working group documents plus discussion archives.
• Wi-Fi Alliance: An industry group promoting the interoperability of 802.11 products
with each other and with Ethernet.
Key Terms
access point (AP)
ad hoc networking
Barker sequence
basic service set (BSS)
complementary code keying
coordination function
distributed coordination
function (DCF)
distribution system (DS)
extended service set (ESS)
infrared LAN
LAN extension
narrowband microwave LAN
nomadic access
point coordination function
spread spectrum LAN
wireless LAN
Review Questions
List and briefly define four application areas for wireless LANs.
List and briefly define key requirements for wireless LANs.
What is the difference between a single-cell and a multiple-cell wireless LAN?
What are some key advantages of infrared LANs?
What are some key disadvantages of infrared LANs?
List and briefly define three transmission techniques for infrared LANs.
What is the difference between an access point and a portal?
Is a distribution system a wireless network?
List and briefly define IEEE 802.11 services.
How is the concept of an association related to that of mobility?
Problems
Consider the sequence of actions within a BSS depicted in Figure 17.11. Draw a timeline, beginning with a period during which the medium is busy and ending with a
[Figure 17.11 Configuration for Problem 17.1. An AP and subscribers A, B, and C; the depicted exchange includes 1. Beacon, 4. Data + CF-Ack + CF-Poll, 5. Data + CF-Ack, and 8. CF-End.]
period in which the CF-End is broadcast from the AP. Show the transmission periods
and the gaps.
Find the autocorrelation for the 11-bit Barker sequence as a function of τ.
a. For the 16-PPM scheme used for the 1-Mbps IEEE 802.11 infrared standard,
a1. What is the period of transmission (time between bits)?
For the corresponding infrared pulse transmission,
a2. What is the average time between pulses (1 values) and the corresponding
average rate of pulse transmission?
a3. What is the minimum time between adjacent pulses?
a4. What is the maximum time between pulses?
b. Repeat (a) for the 4-PPM scheme used for the 2-Mbps infrared standard.
For IEEE 802.11a, show how the modulation technique and coding rate determine
the data rate.
The 802.11a and 802.11b physical layers make use of data scrambling (see Appendix
16C). For 802.11, the scrambling equation is
P(X) = 1 + X^4 + X^7
In this case the shift register consists of seven elements, used in the same manner as
the five-element register in Figure 16.17. For the 802.11 scrambler and descrambler,
a. Show the expression with exclusive-or operators that corresponds to the polynomial definition.
b. Draw a figure similar to Figure 16.17.
Internet and Transport
We have dealt, so far, with the technologies and techniques used to
exchange data between two devices. Part Two dealt with the case in
which the two devices share a transmission link. Parts Three and Four
were concerned with the case in which a communication network provides a
shared transmission capacity for multiple attached end systems.
In a distributed data processing system, much more is needed. The data
processing systems (workstations, PCs, servers, mainframes) must implement
a set of functions that will allow them to perform some task cooperatively.
This set of functions is organized into a communications architecture and
involves a layered set of protocols, including internetwork, transport, and
application-layer protocols. In Part Five, we examine the internetwork and
transport protocols.
Before proceeding with Part Five, the reader is advised to revisit Chapter 2,
which introduces the concept of a protocol architecture and discusses the key
elements of a protocol.
Chapter 18 Internet Protocols
With the proliferation of networks, internetworking facilities have become
essential components of network design. Chapter 18 begins with an examination of the requirements for an internetworking facility and the various
design approaches that can be taken to satisfy those requirements. The
remainder of the chapter deals with the use of routers for internetworking.
The Internet Protocol (IP) and the new IPv6 are examined.
Chapter 19 Internetwork Operation
Chapter 19 begins with a discussion of multicasting across an internet.
Then issues of routing and quality of service are explored.
The traffic that the Internet and these private internetworks must
carry continues to grow and change. The demand generated by traditional
data-based applications, such as electronic mail, Usenet news, file transfer, and remote logon, is sufficient to challenge these systems. But the driving factors are the heavy use of the World Wide Web, which demands
real-time response, and the increasing use of voice, image, and even video
over internetwork architectures.
These internetwork schemes are essentially datagram packet-switching technology with routers functioning as the switches. This technology was not designed to handle voice and video and is straining to
meet the demands placed on it. While some foresee the replacement of
this conglomeration of Ethernet-based LANs, packet-based WANs, and
IP-datagram-based routers with a seamless ATM transport service from
desktop to backbone, that day is far off. Meanwhile, the internetworking
and routing functions of these networks must be engineered to meet the demand.
Chapter 19 looks at some of the tools and techniques designed to
meet the new demand, beginning with a discussion of routing schemes,
which can help smooth out load surges. The remainder of the chapter
looks at recent efforts to provide a given level of quality of service (QoS)
to various applications. The most important elements of this new
approach are integrated services and differentiated services.
Chapter 20 Transport Protocols
The transport protocol is the keystone of the whole concept of a computer communications architecture. It can also be one of the most complex of protocols. Chapter 20 examines in detail transport protocol
mechanisms and then discusses two important examples, TCP and UDP.
The bulk of the chapter is devoted to an analysis of the complex set of
TCP mechanisms and of TCP congestion control schemes.
18.1 Basic Protocol Functions
18.2 Principles of Internetworking
18.3 Internet Protocol Operation
18.4 Internet Protocol
18.5 IPv6
18.6 Virtual Private Networks and IP Security
18.7 Recommended Reading and Web Sites
18.8 Key Terms, Review Questions, and Problems
The map of the London Underground, which can be seen inside every train, has
been called a model of its kind, a work of art. It presents the underground network
as a geometric grid. The tube lines do not, of course, lie at right angles to one
another like the streets of Manhattan. Nor do they branch off at acute angles or
form perfect oblongs.
—King Solomon’s Carpet. Barbara Vine (Ruth Rendell)
Key functions typically performed by a protocol include encapsulation, fragmentation and reassembly, connection control, ordered
delivery, flow control, error control, addressing, and multiplexing.
An internet consists of multiple separate networks that are interconnected by routers. Data are transmitted in packets from a source system to a destination across a path involving multiple networks and
routers. Typically, a connectionless or datagram operation is used. A
router accepts datagrams and relays them on toward their destination
and is responsible for determining the route, much the same way as
packet-switching nodes operate.
The most widely used protocol for internetworking is the Internet
Protocol (IP). IP attaches a header to upper-layer (e.g., TCP) data to
form an IP datagram. The header includes source and destination
addresses, information used for fragmentation and reassembly, a time-to-live field, a type-of-service field, and a checksum.
A next-generation IP, known as IPv6, has been defined. IPv6 provides
longer address fields and more functionality than the current IP.
The purpose of this chapter is to examine the Internet Protocol, the
foundation on which all internet-based protocols and internetworking rest.
First, it will be useful to review the basic functions
of networking protocols. This review serves to summarize some of the
material introduced previously and to set the stage for the study of internet-based protocols in Parts Five and Six. We then move to a discussion of
internetworking. Next, the chapter focuses on the two standard internet
protocols: IPv4 and IPv6. Finally, the topic of IP security is introduced.
Refer to Figure 2.5 to see the position within the TCP/IP suite of the
protocols discussed in this chapter.
Before turning to a discussion of internet protocols, let us consider a rather small set
of functions that form the basis of all protocols. Not all protocols have all functions;
this would involve a significant duplication of effort. There are, nevertheless, many
instances of the same type of function being present in protocols at different levels.
We can group protocol functions into the following categories:
• Encapsulation
• Fragmentation and reassembly
• Connection control
• Ordered delivery
• Flow control
• Error control
• Addressing
• Multiplexing
• Transmission services
Encapsulation

For virtually all protocols, data are transferred in blocks, called protocol data units
(PDUs). Each PDU contains not only data but also control information. Indeed,
some PDUs consist solely of control information and no data. The control information falls into three general categories:
• Address: The address of the sender and/or receiver may be indicated.
• Error-detecting code: Some sort of frame check sequence is often included for
error detection.
• Protocol control: Additional information is included to implement the protocol functions listed in the remainder of this section.
The addition of control information to data is referred to as encapsulation.
Data are accepted or generated by an entity and encapsulated into a PDU containing that data plus control information. Typically, the control information is contained in a PDU header; some data link layer PDUs include a trailer as well.
Numerous examples of PDUs appear in the preceding chapters [e.g., TFTP (Figure
2.13), HDLC (Figure 7.7), frame relay (Figure 10.16), ATM (Figure 11.4), LLC (Figure 15.7), IEEE 802.3 (Figure 16.3), IEEE 802.11 (Figure 17.8)].
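The header-plus-data structure can be made concrete with a toy PDU format. The field sizes and layout below are invented for illustration and do not correspond to any real protocol; the three categories of control information (addresses, an error-detecting code, and a length used for protocol control) are each represented.

```python
import struct
import zlib

def encapsulate(src: int, dst: int, payload: bytes) -> bytes:
    # Control information: one-byte source and destination addresses,
    # a two-byte length field, and a four-byte CRC-32 as the
    # error-detecting code, all prepended to the data.
    header = struct.pack("!BBH", src, dst, len(payload))
    fcs = struct.pack("!I", zlib.crc32(payload))
    return header + fcs + payload

def decapsulate(pdu: bytes):
    # Strip the control information and verify the error-detecting code.
    src, dst, length = struct.unpack("!BBH", pdu[:4])
    (fcs,) = struct.unpack("!I", pdu[4:8])
    payload = pdu[8:8 + length]
    if zlib.crc32(payload) != fcs:
        raise ValueError("FCS mismatch: PDU discarded")
    return src, dst, payload
```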
Fragmentation and Reassembly
A protocol is concerned with exchanging data between two entities. Usually, the
transfer can be characterized as consisting of a sequence of PDUs of some bounded
size. Whether the application entity sends data in messages or in a continuous
stream, lower-level protocols typically organize the data into blocks. (The term
segmentation is used in OSI-related documents; in protocol specifications related
to the TCP/IP protocol suite, the term fragmentation is used. The meaning is the
same.) Further, a
protocol may need to divide a block received from a higher layer into multiple
blocks of some smaller bounded size. This process is called fragmentation.
There are a number of motivations for fragmentation, depending on the
context. Among the typical reasons for fragmentation are the following:
• The communications network may only accept blocks of data up to a certain
size. For example, an ATM network is limited to blocks of 53 octets; Ethernet
imposes a maximum size of 1526 octets.
• Error control may be more efficient with a smaller PDU size. With smaller
PDUs, fewer bits need to be retransmitted when a PDU suffers an error.
• More equitable access to shared transmission facilities, with shorter delay, can
be provided. For example, without a maximum block size, one station could
monopolize a multipoint medium.
• A smaller PDU size may mean that receiving entities can allocate smaller buffers.
• An entity may require that data transfer comes to some sort of “closure” from
time to time, for checkpoint and restart/recovery operations.
There are several disadvantages to fragmentation that argue for making PDUs
as large as possible:
• Because each PDU contains a certain amount of control information, smaller
blocks have a greater percentage of overhead.
• PDU arrival may generate an interrupt that must be serviced. Smaller blocks
result in more interrupts.
• More time is spent processing smaller, more numerous PDUs.
All of these factors must be taken into account by the protocol designer in
determining minimum and maximum PDU size.
The counterpart of fragmentation is reassembly. Eventually, the segmented
data must be reassembled into messages appropriate to the application level. If
PDUs arrive out of order, the task is complicated.
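The fragmentation and reassembly logic just described can be sketched as follows. This is a simplified illustration: real protocols such as IP carry the sequence information in header fields rather than as Python tuples.

```python
def fragment(data: bytes, max_payload: int):
    """Split a block into numbered fragments of bounded size."""
    return [(seq, data[off:off + max_payload])
            for seq, off in enumerate(range(0, len(data), max_payload))]

def reassemble(fragments):
    """Rebuild the original block; fragments may arrive out of order."""
    return b"".join(payload for _, payload in sorted(fragments))
```

Because each fragment carries a sequence number, the receiver can sort before joining, which is exactly what makes out-of-order arrival tolerable.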
Connection Control
An entity may transmit data to another entity in such a way that each PDU is
treated independently of all prior PDUs. This is known as connectionless data transfer; an example is the use of the datagram, described in Chapter 10. While this mode
is useful, an equally important technique is connection-oriented data transfer, of
which the virtual circuit, also described in Chapter 10, is an example.
Connection-oriented data transfer is preferred (even required) if stations
anticipate a lengthy exchange of data and/or certain details of their protocol must
be worked out dynamically. A logical association, or connection, is established
between the entities. Three phases occur (Figure 18.1):
• Connection establishment
• Data transfer
• Connection termination
Figure 18.1 The Phases of a Connection-Oriented Data Transfer [figure: a connection request and connection accept exchange, followed by data PDUs with acknowledgments, ending with connection termination]
With more sophisticated protocols, there may also be connection interrupt and
recovery phases to cope with errors and other sorts of interruptions.
During the connection establishment phase, two entities agree to exchange
data. Typically, one station will issue a connection request (in connectionless fashion) to the other. A central authority may or may not be involved. In simpler protocols, the receiving entity either accepts or rejects the request and, in the former case,
the connection is considered to be established. In more complex proposals, this
phase includes a negotiation concerning the syntax, semantics, and timing of the
protocol. Both entities must, of course, be using the same protocol. But the protocol
may allow certain optional features and these must be agreed upon by means of
negotiation. For example, the protocol may specify a PDU size of up to 8000 octets;
one station may wish to restrict this to 1000 octets.
Following connection establishment, the data transfer phase is entered. During this phase both data and control information (e.g., flow control, error control)
are exchanged. Figure 18.1 shows a situation in which all of the data flow in one
direction, with acknowledgments returned in the other direction. More typically,
data and acknowledgments flow in both directions. Finally, one side or the other
wishes to terminate the connection and does so by sending a termination request.
Alternatively, a central authority might forcibly terminate a connection.
A key characteristic of many connection-oriented data transfer protocols is that
sequencing is used (e.g., HDLC, IEEE 802.11). Each side sequentially numbers the
PDUs that it sends to the other side. Because each side remembers that it is engaged
in a logical connection, it can keep track of both outgoing numbers, which it generates,
and incoming numbers, which are generated by the other side. Indeed, one can essentially define a connection-oriented data transfer as a transfer in which both sides
number PDUs and keep track of both incoming and outgoing numbers. Sequencing
supports three main functions: ordered delivery, flow control, and error control.
Sequencing is not found in all connection-oriented protocols; frame relay
and ATM are examples of connection-oriented protocols that do not use it.
However, all connection-oriented protocols
include in the PDU format some way of identifying the connection, which may be a
unique connection identifier or a combination of source and destination addresses.
Ordered Delivery
If two communicating entities are in different hosts connected by a network, there
is a risk that PDUs will not arrive in the order in which they were sent, because
they may traverse different paths through the network. In connection-oriented
protocols, it is generally required that PDU order be maintained. For example, if a
file is transferred between two systems, we would like to be assured that the
records of the received file are in the same order as those of the transmitted file,
and not shuffled. If each PDU is given a unique number, and numbers are assigned
sequentially, then it is a logically simple task for the receiving entity to reorder
received PDUs on the basis of sequence number. A problem with this scheme is
that, with a finite sequence number field, sequence numbers repeat (modulo some
maximum number). Evidently, the maximum sequence number must be greater
than the maximum number of PDUs that could be outstanding at any time. In fact,
the maximum number may need to be twice the maximum number of PDUs that
could be outstanding (e.g., selective-repeat ARQ; see Chapter 7).
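The modulo wraparound can be made concrete with a small window-membership check. The modulus and window size here are illustrative, echoing a 3-bit sequence number field of the kind used in basic HDLC.

```python
SEQ_MOD = 8  # e.g., a 3-bit sequence number field

def in_window(seq: int, base: int, size: int) -> bool:
    """True if seq lies in [base, base + size) modulo SEQ_MOD."""
    return (seq - base) % SEQ_MOD < size
```

With selective-repeat ARQ the window size must not exceed SEQ_MOD // 2: otherwise a retransmitted old PDU and a new PDU could carry the same (wrapped) sequence number and be indistinguishable to the receiver.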
Flow Control
Flow control is a function performed by a receiving entity to limit the amount or
rate of data that is sent by a transmitting entity.
The simplest form of flow control is a stop-and-wait procedure, in which each
PDU must be acknowledged before the next can be sent. More efficient protocols
involve some form of credit provided to the transmitter, which is the amount of data
that can be sent without an acknowledgment. The HDLC sliding-window technique
is an example of this mechanism (Chapter 7).
Flow control is a good example of a function that must be implemented in several protocols. Consider Figure 18.2, which repeats Figure 2.1. The network will need
to exercise flow control over host A via the network access protocol, to enforce network traffic control. At the same time, B’s network access module has finite buffer
space and needs to exercise flow control over A’s transmission; it can do this via the
transport protocol. Finally, even though B’s network access module can control its
data flow, B’s application may be vulnerable to overflow. For example, the application could be hung up waiting for disk access. Thus, flow control is also needed over
the application-oriented protocol.
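A credit scheme of the kind just described might be sketched as follows. This is illustrative only; the class name and the byte-based accounting are assumptions, not the rules of any particular protocol.

```python
class CreditSender:
    """Tracks how much data may be sent before the next acknowledgment."""

    def __init__(self, initial_credit: int):
        self.credit = initial_credit

    def try_send(self, nbytes: int) -> bool:
        # Transmit only if the receiver has granted enough credit.
        if nbytes > self.credit:
            return False          # must wait for the receiver to grant more
        self.credit -= nbytes
        return True

    def on_credit(self, granted: int) -> None:
        # An acknowledgment from the receiver restores the allowance.
        self.credit += granted
```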
Figure 18.2 TCP/IP Concepts [figure, repeating Figure 2.1: hosts A and B each run applications X and Y attached to ports (service access points); a logical TCP connection links ports on the two hosts; network access protocols #1 and #2 attach the hosts to networks 1 and 2, joined across a global internet by router J; each network interface has a subnetwork attachment point address, and a network-level logical connection (e.g., a virtual circuit) runs across each network]

Error Control

Error control techniques are needed to guard against loss or damage of data and
control information. (The term host refers to any end system attached to a network,
such as a PC, workstation, or server.) Typically, error control is implemented as two separate
functions: error detection and retransmission. To achieve error detection, the sender
inserts an error-detecting code in the transmitted PDU, which is a function of the
other bits in the PDU. The receiver checks the value of the code on the incoming
PDU. If an error is detected, the receiver discards the PDU. Upon failing to receive
an acknowledgment to the PDU in a reasonable time, the sender retransmits the
PDU. Some protocols also employ an error correction code, which enables the
receiver not only to detect errors but, in some cases, to correct them.
As with flow control, error control is a function that must be performed at various layers of protocol. Consider again Figure 18.2. The network access protocol
should include error control to assure that data are successfully exchanged between
station and network. However, a packet of data may be lost inside the network, and
the transport protocol should be able to recover from this loss.
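The detect-and-retransmit pattern can be sketched as a single exchange. This is a toy model: the CRC-32 error-detecting code and the channel function are illustrative choices, not those of any specific protocol.

```python
import zlib

def make_pdu(payload: bytes) -> bytes:
    # Sender appends an error-detecting code computed over the data.
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def check_pdu(pdu: bytes):
    # Receiver returns the payload if the code verifies, else None (discard).
    payload, fcs = pdu[:-4], pdu[-4:]
    return payload if zlib.crc32(payload).to_bytes(4, "big") == fcs else None

def send_reliably(payload: bytes, channel, max_tries: int = 4):
    # Retransmit until the receiver's check succeeds (an ACK, in effect).
    for _ in range(max_tries):
        received = check_pdu(channel(make_pdu(payload)))
        if received is not None:
            return received
    raise TimeoutError("no acknowledgment after retransmissions")
```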
Addressing

The concept of addressing in a communications architecture is a complex one and
covers a number of issues, including
• Addressing level
• Addressing scope
• Connection identifiers
• Addressing mode
During this discussion, we illustrate the concepts using Figure 18.2, which
shows a configuration using the TCP/IP architecture. The concepts are essentially
the same for the OSI architecture or any other communications architecture.
Addressing level refers to the level in the communications architecture at
which an entity is named. Typically, a unique address is associated with each end
system (e.g., workstation or server) and each intermediate system (e.g., router) in
a configuration. Such an address is, in general, a network-level address. In the case
of the TCP/IP architecture, this is referred to as an IP address, or simply an internet address. In the case of the OSI architecture, this is referred to as a network service access point (NSAP). The network-level address is used to route a PDU
through a network or networks to a system indicated by a network-level address
in the PDU.
Once data arrive at a destination system, they must be routed to some process
or application in the system. Typically, a system will support multiple applications
and an application may support multiple users. Each application and, perhaps, each
concurrent user of an application is assigned a unique identifier, referred to as a port
in the TCP/IP architecture and as a service access point (SAP) in the OSI architecture. For example, a host system might support both an electronic mail application
and a file transfer application. At minimum each application would have a port
number or SAP that is unique within that system. Further, the file transfer application might support multiple simultaneous transfers, in which case, each transfer is
dynamically assigned a unique port number or SAP.
Figure 18.2 illustrates two levels of addressing within a system. This is typically
the case for the TCP/IP architecture. However, there can be addressing at each level
of an architecture. For example, a unique SAP can be assigned to each level of the
OSI architecture.
Another issue that relates to the address of an end system or intermediate
system is addressing scope. The internet address or NSAP address referred to previously is a global address. The key characteristics of a global address are as follows:
• Global nonambiguity: A global address identifies a unique system. Synonyms
are permitted. That is, a system may have more than one global address.
• Global applicability: It is possible at any global address to identify any other
global address, in any system, by means of the global address of the other system.
Because a global address is unique and globally applicable, it enables an internet to route data from any system attached to any network to any other system
attached to any other network.
Figure 18.2 illustrates that another level of addressing may be required. Each
network must maintain a unique address for each device interface on the network.
Examples are a MAC address on an IEEE 802 network and an ATM host address.
This address enables the network to route data units (e.g., MAC frames, ATM cells)
through the network and deliver them to the intended attached system. We can refer
to such an address as a network attachment point address.
The issue of addressing scope is generally only relevant for network-level
addresses. A port or SAP above the network level is unique within a given system
but need not be globally unique. For example, in Figure 18.2, there can be a port 1 in
system A and a port 1 in system B. The full designation of these two ports could be
expressed as A.1 and B.1, which are unique designations.
The concept of connection identifiers comes into play when we consider connection-oriented data transfer (e.g., virtual circuit) rather than connectionless data
transfer (e.g., datagram). For connectionless data transfer, a global identifier is used
with each data transmission. For connection-oriented transfer, it is sometimes desirable to use only a connection identifier during the data transfer phase. The scenario
is this: Entity 1 on system A requests a connection to entity 2 on system B, perhaps
using the global address B.2. When B.2 accepts the connection, a connection
identifier (usually a number) is provided and is used by both entities for future
transmissions. The use of a connection identifier has several advantages:
• Reduced overhead: Connection identifiers are generally shorter than global
identifiers. For example, in the frame relay protocol (discussed in Chapter 10),
connection request packets contain both source and destination address fields.
After a logical connection, called a data link connection, is established, data
frames contain a data link connection identifier (DLCI) of 10, 16, or 23 bits.
• Routing: In setting up a connection, a fixed route may be defined. The connection identifier serves to identify the route to intermediate systems, such as
packet-switching nodes, for handling future PDUs.
• Multiplexing: We address this function in more general terms later. Here we
note that an entity may wish to enjoy more than one connection simultaneously. Thus, incoming PDUs must be identified by connection identifier.
• Use of state information: Once a connection is established, the end systems
can maintain state information relating to the connection. This enables such
functions as flow control and error control using sequence numbers. We see
examples of this with HDLC (Chapter 7) and IEEE 802.11 (Chapter 17).
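The way a short connection identifier indexes per-connection state can be sketched as follows. The class name, fields, and address strings are invented for illustration.

```python
class ConnectionTable:
    """Maps short connection identifiers to per-connection state."""

    def __init__(self):
        self._next_id = 0
        self._conns = {}

    def open(self, src_addr: str, dst_addr: str) -> int:
        # Full global addresses are exchanged only at connection setup.
        cid = self._next_id
        self._next_id += 1
        self._conns[cid] = {"src": src_addr, "dst": dst_addr,
                            "next_seq": 0}  # state for flow/error control
        return cid

    def send(self, cid: int, data: bytes) -> tuple:
        # Data-phase PDUs carry only the short identifier, not the
        # global addresses, plus a sequence number from the stored state.
        state = self._conns[cid]
        pdu = (cid, state["next_seq"], data)
        state["next_seq"] += 1
        return pdu
```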
Figure 18.2 shows several examples of connections. The logical connection
between router J and host B is at the network level. For example, if network 2 is a
frame relay network, then this logical connection would be a data link connection.
At a higher level, many transport-level protocols, such as TCP, support logical connections between users of the transport service. Thus, TCP can maintain a connection between two ports on different systems.
Another addressing concept is that of addressing mode. Most commonly, an
address refers to a single system or port; in this case it is referred to as an individual or
unicast address. It is also possible for an address to refer to more than one entity or
port. Such an address identifies multiple simultaneous recipients for data. For example,
a user might wish to send a memo to a number of individuals. The network control center may wish to notify all users that the network is going down. An address for multiple
recipients may be broadcast, intended for all entities within a domain, or multicast,
intended for a specific subset of entities. Table 18.1 illustrates the possibilities.
Table 18.1 Addressing Modes (addressing modes versus Network Address, System Address, and Port/SAP Address)

Multiplexing

Related to the concept of addressing is that of multiplexing. One form of multiplexing is supported by means of multiple connections into a single system. For
example, with frame relay, there can be multiple data link connections terminating
in a single end system; we can say that these data link connections are multiplexed
over the single physical interface between the end system and the network. Multiplexing can also be accomplished via port names, which also permit multiple simultaneous connections. For example, there can be multiple TCP connections
terminating in a given system, each connection supporting a different pair of ports.
Multiplexing is used in another context as well, namely the mapping of connections from one level to another. Consider again Figure 18.2. Network 1 might provide
a connection-oriented service. For each process-to-process connection established at
the next higher level, a data link connection could be created at the network access
level. This is a one-to-one relationship, but it need not be so. Multiplexing can be used
in one of two directions. Upward multiplexing, or inward multiplexing, occurs when
multiple higher-level connections are multiplexed on, or share, a single lower-level
connection. This may be needed to make more efficient use of the lower-level service
or to provide several higher-level connections in an environment where only a single
lower-level connection exists. Downward multiplexing, or splitting, means that a single higher-level connection is built on top of multiple lower-level connections, and
the traffic on the higher connection is divided among the various lower connections.
This technique may be used to provide reliability, performance, or efficiency.
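Upward multiplexing, in which several higher-level connections share one lower-level connection, can be sketched as tagging each PDU with its upper connection's identifier. The identifiers below are invented for illustration.

```python
def mux(pdus_by_conn: dict) -> list:
    """Interleave PDUs from several upper connections onto one link."""
    link = []
    for conn_id, pdus in pdus_by_conn.items():
        for pdu in pdus:
            link.append((conn_id, pdu))  # tag names the upper connection
    return link

def demux(link: list) -> dict:
    """Split the shared stream back out by connection identifier."""
    out = {}
    for conn_id, pdu in link:
        out.setdefault(conn_id, []).append(pdu)
    return out
```

The tag is what lets the receiving entity deliver each incoming PDU to the correct upper connection, which is exactly the multiplexing role noted for connection identifiers earlier in this section.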
Transmission Services
A protocol may provide a variety of additional services to the entities that use it. We
mention here three common examples:
• Priority: Certain messages, such as control messages, may need to get through
to the destination entity with minimum delay. An example would be a terminate-connection request. Thus, priority could be assigned on a message basis.
Additionally, priority could be assigned on a connection basis.
• Quality of service: Certain classes of data may require a minimum throughput
or a maximum delay threshold.
• Security: Security mechanisms, restricting access, may be invoked.
All of these services depend on the underlying transmission system and any
intervening lower-level entities. If it is possible for these services to be provided
from below, the protocol can be used by the two entities to exercise those services.
Packet-switching and packet-broadcasting networks grew out of a need to allow the
computer user to have access to resources beyond that available in a single system.
In a similar fashion, the resources of a single network are often inadequate to meet
users’ needs. Because the networks that might be of interest exhibit so many differences, it is impractical to consider merging them into a single network. Rather, what
is needed is the ability to interconnect various networks so that any two stations on
any of the constituent networks can communicate.
Table 18.2 lists some commonly used terms relating to the interconnection of
networks, or internetworking. An interconnected set of networks, from a user’s
point of view, may appear simply as a larger network. However, if each of the constituent networks retains its identity and special mechanisms are needed for communicating across multiple networks, then the entire configuration is often referred
to as an internet.
Each constituent network in an internet supports communication among the
devices attached to that network; these devices are referred to as end systems (ESs).
In addition, networks are connected by devices referred to in the ISO documents as
intermediate systems (ISs). Intermediate systems provide a communications path
Table 18.2 Internetworking Terms
Communication Network
A facility that provides a data transfer service among devices attached to the network.
Internet
A collection of communication networks interconnected by bridges and/or routers.
Intranet
An internet used by a single organization that provides the key Internet applications, especially the World
Wide Web. An intranet operates within the organization for internal purposes and can exist as an isolated,
self-contained internet, or may have links to the Internet.
Subnetwork
Refers to a constituent network of an internet. This avoids ambiguity because the entire internet, from a user's
point of view, is a single network.
End System (ES)
A device attached to one of the networks of an internet that is used to support end-user applications or services.
Intermediate System (IS)
A device used to connect two networks and permit communication between end systems attached to different
networks.
Bridge
An IS used to connect two LANs that use similar LAN protocols. The bridge acts as an address filter, picking
up packets from one LAN that are intended for a destination on another LAN and passing those packets on.
The bridge does not modify the contents of the packets and does not add anything to the packet. The bridge
operates at layer 2 of the OSI model.
Router
An IS used to connect two networks that may or may not be similar. The router employs an internet
protocol present in each router and each end system of the network. The router operates at layer 3 of the
OSI model.
and perform the necessary relaying and routing functions so that data can be
exchanged between devices attached to different networks in the internet.
Two types of ISs of particular interest are bridges and routers. The differences between them have to do with the types of protocols used for the internetworking logic. In essence, a bridge operates at layer 2 of the open systems
interconnection (OSI) seven-layer architecture and acts as a relay of frames
between similar networks; bridges are discussed in Chapter 15. A router operates
at layer 3 of the OSI architecture and routes packets between potentially different
networks. Both the bridge and the router assume that the same upper-layer protocols are in use.
We begin our examination of internetworking with a discussion of the basic
principles of internetworking. We then examine the most important architectural
approach to internetworking: the connectionless router.
The overall requirements for an internetworking facility are as follows (we refer to
Figure 18.2 as an example throughout):
1. Provide a link between networks. At minimum, a physical and link control
connection is needed. (Router J has physical links to N1 and N2, and on each
link there is a data link protocol.)
2. Provide for the routing and delivery of data between processes on different networks. (Application X on host A exchanges data with application X on host B.)
3. Provide an accounting service that keeps track of the use of the various networks
and routers and maintains status information.
4. Provide the services just listed in such a way as not to require modifications to the
networking architecture of any of the constituent networks. This means that the
internetworking facility must accommodate a number of differences among networks. These include
• Different addressing schemes: The networks may use different endpoint
names and addresses and directory maintenance schemes. Some form of
global network addressing must be provided, as well as a directory service.
(Hosts A and B and router J have globally unique IP addresses.)
• Different maximum packet size: Packets from one network may have to be
broken up into smaller pieces for another. This process is referred to as
fragmentation. (N1 and N2 may set different upper limits on packet sizes.)
• Different network access mechanisms: The network access mechanism
between station and network may be different for stations on different networks. (For example, N1 may be a frame relay network and N2 an Ethernet LAN.)
• Different timeouts: Typically, a connection-oriented transport service will
await an acknowledgment until a timeout expires, at which time it will
retransmit its block of data. In general, longer times are required for successful delivery across multiple networks. Internetwork timing procedures
must allow successful transmission that avoids unnecessary retransmissions.
• Error recovery: Network procedures may provide anything from no error
recovery up to reliable end-to-end (within the network) service. The internetwork service should not depend on nor be interfered with by the nature
of the individual network’s error recovery capability.
• Status reporting: Different networks report status and performance differently. Yet it must be possible for the internetworking facility to provide such
information on internetworking activity to interested and authorized processes.
• Routing techniques: Intranetwork routing may depend on fault detection
and congestion control techniques peculiar to each network. The internetworking facility must be able to coordinate these to route data adaptively
between stations on different networks.
• User access control: Each network will have its own user access control
technique (authorization for use of the network). These must be invoked by
the internetwork facility as needed. Further, a separate internetwork access
control technique may be required.
• Connection, connectionless: Individual networks may provide connectionoriented (e.g., virtual circuit) or connectionless (datagram) service. It may
be desirable for the internetwork service not to depend on the nature of the
connection service of the individual networks.
The Internet Protocol (IP) meets some of these requirements. Others require
additional control and application software, as we shall see in this chapter and
the next.
Connectionless Operation
In virtually all implementations, internetworking involves connectionless operation
at the level of the Internet Protocol. Whereas connection-oriented operation corresponds to the virtual circuit mechanism of a packet-switching network (Figure
10.10), connectionless-mode operation corresponds to the datagram mechanism of
a packet-switching network (Figure 10.9). Each network protocol data unit is
treated independently and routed from source ES to destination ES through a series
of routers and networks. For each data unit transmitted by A, A makes a decision as
to which router should receive the data unit. The data unit hops across the internet
from one router to the next until it reaches the destination network. At each router,
a routing decision is made (independently for each data unit) concerning the next
hop. Thus, different data units may travel different routes between source and destination ES.
All ESs and all routers share a common network-layer protocol known generically as the Internet Protocol. The Internet Protocol (IP) was initially developed for the DARPA internet project, was published as RFC 791, and has become an Internet Standard. Below this Internet Protocol, a protocol is needed to access
a particular network. Thus, there are typically two protocols operating in each ES
and router at the network layer: an upper sublayer that provides the internetworking function, and a lower sublayer that provides network access. Figure 18.3
shows an example.
Figure 18.3 Example of Internet Protocol Operation
(The figure shows two end systems, each attached to a LAN, interconnected by a frame relay WAN through two routers. At times t1 through t16 it traces the addition and removal of the TCP header, IP header, LLC header, MAC header and trailer, and frame relay header and trailer as data pass from one end system to the other.)
In this section, we examine the essential functions of an internetwork protocol.
For convenience, we refer specifically to the Internet Standard IPv4, but the
narrative in this section applies to any connectionless Internet Protocol, such
as IPv6.
Operation of a Connectionless Internetworking Scheme
IP provides a connectionless, or datagram, service between end systems. There are a
number of advantages to this approach:
• A connectionless internet facility is flexible. It can deal with a variety of networks, some of which are themselves connectionless. In essence, IP requires
very little from the constituent networks.
• A connectionless internet service can be made highly robust. This is basically
the same argument made for a datagram network service versus a virtual circuit service. For a further discussion, see Section 10.5.
• A connectionless internet service is best for connectionless transport protocols, because it does not impose unnecessary overhead.
Figure 18.3 depicts a typical example using IP, in which two LANs are interconnected by a frame relay WAN. The figure depicts the operation of the Internet Protocol for data exchange between host A on one LAN (network 1) and host B on another LAN (network 2) through the WAN. The figure shows the protocol architecture and format of the data unit at each stage. The end systems and routers must all share a common Internet Protocol. In addition, the end systems must share the same protocols
above IP. The intermediate routers need only implement up through IP.
The IP at A receives blocks of data to be sent to B from higher layers of software in A (e.g., TCP or UDP). IP attaches a header (at time t1) specifying, among
other things, the global internet address of B. That address is logically in two parts:
network identifier and end system identifier. The combination of IP header and
upper-level data is called an Internet Protocol data unit (PDU), or simply a datagram. The datagram is then encapsulated with the LAN protocol (LLC header at t2;
MAC header and trailer at t3) and sent to the router, which strips off the LAN fields
to read the IP header (t6). The router then encapsulates the datagram with the
frame relay protocol fields (t8) and transmits it across the WAN to another router.
This router strips off the frame relay fields and recovers the datagram, which it then
wraps in LAN fields appropriate to LAN 2 and sends it to B.
Let us now look at this example in more detail. End system A has a datagram to transmit to end system B; the datagram includes the internet address of B. The IP module in A recognizes that the destination (B) is on another network. So the first step is to
send the data to a router, in this case router X. To do this, IP passes the datagram down
to the next lower layer (in this case LLC) with instructions to send it to router X. LLC
in turn passes this information down to the MAC layer, which inserts the MAC-level
address of router X into the MAC header. Thus, the block of data transmitted onto
LAN 1 includes data from a layer or layers above TCP, plus a TCP header, an IP
header, an LLC header, and a MAC header and trailer (time t3 in Figure 18.3).
Next, the packet travels through network 1 to router X. The router removes
MAC and LLC fields and analyzes the IP header to determine the ultimate destination of the data, in this case B. The router must now make a routing decision. There
are three possibilities:
1. The destination station B is connected directly to one of the networks to which the
router is attached. If so, the router sends the datagram directly to the destination.
2. To reach the destination, one or more additional routers must be traversed.
If so, a routing decision must be made: To which router should the datagram
be sent? In both cases 1 and 2, the IP module in the router sends the datagram down to the next lower layer with the destination network address. Please note that we are speaking here of a lower-layer address that refers to the next hop on the directly attached network.
3. The router does not know the destination address. In this case, the router
returns an error message to the source of the datagram.
In this example, the data must pass through router Y before reaching the destination. So router X constructs a new frame by appending a frame relay (LAPF)
header and trailer to the IP datagram. The frame relay header indicates a logical
connection to router Y. When this frame arrives at router Y, the frame header and
trailer are stripped off. The router determines that this IP data unit is destined for B,
which is connected directly to a network to which this router is attached. The router
therefore creates a frame with a layer-2 destination address of B and sends it out
onto LAN 2. The data finally arrive at B, where the LAN and IP headers can be
stripped off.
At each router, before the data can be forwarded, the router may need to fragment the datagram to accommodate a smaller maximum packet size limitation on
the outgoing network. If so, the data unit is split into two or more fragments, each of
which becomes an independent IP datagram. Each new data unit is wrapped in a
lower-layer packet and queued for transmission. The router may also limit the
length of its queue for each network to which it attaches so as to avoid having a slow
network penalize a faster one. Once the queue limit is reached, additional data units
are simply dropped.
The process just described continues through as many routers as it takes for
the data unit to reach its destination. As with a router, the destination end system
recovers the IP datagram from its network wrapping. If fragmentation has occurred,
the IP module in the destination end system buffers the incoming data until the
entire original data field can be reassembled. This block of data is then passed to a
higher layer in the end system.3
This service offered by IP is an unreliable one. That is, IP does not guarantee that all data will be delivered or that the data that are delivered will arrive in
the proper order. It is the responsibility of the next higher layer (e.g., TCP) to
recover from any errors that occur. This approach provides for a great deal of flexibility.
With the Internet Protocol approach, each unit of data is passed from router to
router in an attempt to get from source to destination. Because delivery is not
guaranteed, there is no particular reliability requirement on any of the networks.
Thus, the protocol will work with any combination of network types. Because the
sequence of delivery is not guaranteed, successive data units can follow different
paths through the internet. This allows the protocol to react to both congestion and
failure in the internet by changing routes.
3 Appendix L provides a more detailed example, showing the involvement of all protocol layers.
Design Issues
With that brief sketch of the operation of an IP-controlled internet, we now examine some design issues in greater detail:
• Routing
• Datagram lifetime
• Fragmentation and reassembly
• Error control
• Flow control
As we proceed with this discussion, note the many similarities with design
issues and techniques relevant to packet-switching networks. To see the reason for
this, consider Figure 18.4, which compares an internet architecture with a packet-switching network architecture. The routers (R1, R2, R3) in the internet correspond to the packet-switching nodes (P1, P2, P3) in the network, and the networks (N1, N2, N3) in the internet correspond to the transmission links (T1, T2, T3) in the networks.

Figure 18.4 The Internet as a Network (based on [HIND83]): (a) packet-switching network architecture; (b) internetwork architecture
The routers perform essentially the same functions as packet-switching nodes and
use the intervening networks in a manner analogous to transmission links.
Routing For the purpose of routing, each end system and router maintains a routing table that lists, for each possible destination network, the next router to which
the internet datagram should be sent.
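As a toy illustration of such a table, the mapping from destination network to next hop can be modeled as a dictionary lookup. The names below (network-1, router-Y, and so on) are invented for this sketch and are not part of any standard:

```python
# Hypothetical routing table for one router: destination network -> next hop.
# A "direct" entry means the network is directly attached; a missing entry
# means the destination is unknown to this router.
routing_table = {
    "network-1": "direct",     # directly attached LAN
    "network-2": "router-Y",   # reached across an intervening WAN
}

def next_hop(dest_network: str) -> str:
    """Look up the next router (or direct delivery) for a destination network."""
    return routing_table.get(dest_network, "unknown")

print(next_hop("network-2"))   # router-Y
```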
The routing table may be static or dynamic. A static table, however, could contain alternate routes if a particular router is unavailable. A dynamic table is more
flexible in responding to both error and congestion conditions. In the Internet, for
example, when a router goes down, all of its neighbors will send out a status report,
allowing other routers and stations to update their routing tables. A similar scheme
can be used to control congestion. Congestion control is particularly important
because of the mismatch in capacity between local and wide area networks. Chapter
19 discusses routing protocols.
Routing tables may also be used to support other internetworking services,
such as security and priority. For example, individual networks might be classified to
handle data up to a given security classification. The routing mechanism must assure
that data of a given security level are not allowed to pass through networks not
cleared to handle such data.
Another routing technique is source routing. The source station specifies the
route by including a sequential list of routers in the datagram. This, again, could be
useful for security or priority requirements.
Finally, we mention a service related to routing: route recording. To record a
route, each router appends its internet address to a list of addresses in the datagram.
This feature is useful for testing and debugging purposes.
Datagram Lifetime If dynamic or alternate routing is used, the potential exists
for a datagram to loop indefinitely through the internet. This is undesirable for two
reasons. First, an endlessly circulating datagram consumes resources. Second, we will
see in Chapter 20 that a transport protocol may depend on the existence of an upper
bound on datagram lifetime. To avoid these problems, each datagram can be marked
with a lifetime. Once the lifetime expires, the datagram is discarded.
A simple way to implement lifetime is to use a hop count. Each time that a
datagram passes through a router, the count is decremented. Alternatively, the lifetime could be a true measure of time. This requires that routers somehow know how long it has been since the datagram or fragment last crossed a router, so as to know by how much to decrement the lifetime field. This would seem to require
some global clocking mechanism. The advantage of using a true time measure is that
it can be used in the reassembly algorithm, described next.
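The hop-count interpretation can be sketched in a few lines. The field and function names here are illustrative, not taken from the IP header format:

```python
def process_at_router(datagram: dict):
    """Decrement the datagram's lifetime at each router; once it
    expires, the datagram is discarded (None is returned)."""
    datagram = dict(datagram)        # each router works on its own copy
    datagram["lifetime"] -= 1
    return datagram if datagram["lifetime"] > 0 else None

d = {"lifetime": 2, "payload": b"..."}
d = process_at_router(d)             # survives the first hop
d = process_at_router(d)             # lifetime expires: discarded
assert d is None
```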
Fragmentation and Reassembly Individual networks within an internet may
specify different maximum packet sizes. It would be inefficient and unwieldy to try
to dictate uniform packet size across networks. Thus, routers may need to fragment
incoming datagrams into smaller pieces, called segments or fragments, before transmitting on to the next network.
If datagrams can be fragmented (perhaps more than once) in the course of
their travels, the question arises as to where they should be reassembled. The easiest
solution is to have reassembly performed at the destination only. The principal disadvantage of this approach is that fragments can only get smaller as data move
through the internet. This may impair the efficiency of some networks. However, if
intermediate router reassembly is allowed, the following disadvantages result:
1. Large buffers are required at routers, and there is the risk that all of the buffer
space will be used up storing partial datagrams.
2. All fragments of a datagram must pass through the same router. This inhibits
the use of dynamic routing.
In IP, datagram fragments are reassembled at the destination end system. The
IP fragmentation technique uses the following information in the IP header:
• Data Unit Identifier (ID)
• Data Length4
• Offset
• More Flag
The ID is a means of uniquely identifying an end-system-originated datagram.
In IP, it consists of the source and destination addresses, a number that corresponds
to the protocol layer that generated the data (e.g., TCP), and an identification supplied by that protocol layer. The Data Length is the length of the user data field in
octets, and the Offset is the position of a fragment of user data in the data field of the
original datagram, in multiples of 64 bits.
The source end system creates a datagram with a Data Length equal to the entire
length of the data field, with Offset 0, and a More Flag set to 0 (false). To fragment a
long datagram into two pieces, an IP module in a router performs the following tasks:
1. Create two new datagrams and copy the header fields of the incoming datagram into both.
2. Divide the incoming user data field into two portions along a 64-bit boundary
(counting from the beginning), placing one portion in each new datagram. The
first portion must be a multiple of 64 bits (8 octets).
3. Set the Data Length of the first new datagram to the length of the inserted data,
and set More Flag to 1 (true). The Offset field is unchanged.
4. Set the Data Length of the second new datagram to the length of the inserted
data, and add the length of the first data portion divided by 8 to the Offset
field. The More Flag remains the same.
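The four steps can be sketched as follows. The dictionary fields stand in for the IP header fields discussed above; this is an illustrative model, not the wire format:

```python
def fragment(header: dict, payload: bytes, split_at: int):
    """Split one datagram into two at a 64-bit boundary, per steps 1-4."""
    assert split_at % 8 == 0, "first portion must be a multiple of 8 octets"
    # Step 1: copy the header fields of the incoming datagram into both.
    h1, h2 = dict(header), dict(header)
    # Step 2: divide the user data field along a 64-bit boundary.
    p1, p2 = payload[:split_at], payload[split_at:]
    # Step 3: first fragment -- new Data Length, More = 1, Offset unchanged.
    h1["data_length"], h1["more"] = len(p1), 1
    # Step 4: second fragment -- new Data Length, Offset grows by
    # len(p1)/8; the More Flag remains the same.
    h2["data_length"] = len(p2)
    h2["offset"] = header["offset"] + split_at // 8
    return (h1, p1), (h2, p2)

# The Figure 18.5 example: a 404-octet data field split at 208 octets.
original = {"data_length": 404, "offset": 0, "more": 0}
(f1, _), (f2, _) = fragment(original, bytes(404), 208)
assert (f1["data_length"], f1["offset"], f1["more"]) == (208, 0, 1)
assert (f2["data_length"], f2["offset"], f2["more"]) == (196, 26, 0)
```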
Figure 18.5 gives an example in which two fragments are created from an original IP datagram. The procedure is easily generalized to an n-way split. In this example, the payload of the original IP datagram is a TCP segment, consisting of a TCP header and application data. The IP header from the original datagram is used in both fragments, with the appropriate changes to the fragmentation-related fields. Note that the first fragment contains the TCP header; this header is not replicated in the second fragment, because all of the IP payload, including the TCP header, is transparent to IP. That is, IP is not concerned with the contents of the payload of the datagram.

Figure 18.5 Fragmentation Example
(The original IP datagram carries a 404-octet data field—Data Length 404 octets, Segment Offset 0, More = 0—consisting of a 20-octet TCP header and a 384-octet TCP payload. The first fragment carries the TCP header plus 188 octets of payload: Data Length 208 octets, Segment Offset 0, More = 1. The second fragment carries the remaining 196 octets of payload: Data Length 196 octets, Segment Offset 26 64-bit units (208 octets), More = 0.)

4 In the IPv6 header, there is a Payload Length field that corresponds to Data Length in this discussion. In the IPv4 header, there is a Total Length field whose value is the length of the header plus data; the data length must be calculated by subtracting the header length.
To reassemble a datagram, there must be sufficient buffer space at the
reassembly point. As fragments with the same ID arrive, their data fields are
inserted in the proper position in the buffer until the entire data field is reassembled, which is achieved when a contiguous set of data exists starting with an Offset of
zero and ending with data from a fragment with a false More Flag.
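The contiguity test just described can be sketched as follows. The fragment representation (offset, More Flag, data) is assumed for illustration, with offsets in 64-bit units as in the header:

```python
def try_reassemble(fragments):
    """Given (offset, more_flag, data) triples for one datagram ID, return
    the reassembled data field, or None if pieces are still missing."""
    buffer = {}                          # byte position -> fragment data
    total = None
    for offset, more, data in fragments:
        buffer[offset * 8] = data
        if not more:                     # the last fragment fixes the total length
            total = offset * 8 + len(data)
    if total is None:
        return None                      # no fragment with a false More Flag yet
    out, pos = b"", 0
    while pos < total:                   # walk a contiguous run from Offset 0
        piece = buffer.get(pos)
        if piece is None:
            return None                  # a hole: keep buffering
        out += piece
        pos += len(piece)
    return out

# Two fragments of a 404-octet data field, arriving out of order:
data = try_reassemble([(26, False, bytes(196)), (0, True, bytes(208))])
assert data is not None and len(data) == 404
```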
One eventuality that must be dealt with is that one or more of the fragments
may not get through: The IP service does not guarantee delivery. Some method is
needed to decide when to abandon a reassembly effort to free up buffer space. Two
approaches are commonly used. First, assign a reassembly lifetime to the first fragment to arrive. This is a local, real-time clock assigned by the reassembly function
and decremented while the fragments of the original datagram are being buffered.
If the time expires prior to complete reassembly, the received fragments are discarded. A second approach is to make use of the datagram lifetime, which is part of
the header of each incoming fragment. The lifetime field continues to be decremented by the reassembly function; as with the first approach, if the lifetime expires
prior to complete reassembly, the received fragments are discarded.
Error Control The internetwork facility does not guarantee successful delivery
of every datagram. When a datagram is discarded by a router, the router should
attempt to return some information to the source, if possible. The source Internet
Protocol entity may use this information to modify its transmission strategy and
may notify higher layers. To report that a specific datagram has been discarded,
some means of datagram identification is needed. Such identification is discussed in
the next section.
Datagrams may be discarded for a number of reasons, including lifetime expiration, congestion, and FCS error. In the latter case, notification is not possible
because the source address field may have been damaged.
Flow Control Internet flow control allows routers and/or receiving stations to
limit the rate at which they receive data. For the connectionless type of service we
are describing, flow control mechanisms are limited. The best approach would seem
to be to send flow control packets, requesting reduced data flow, to other routers
and source stations. We will see one example of this with Internet Control Message
Protocol (ICMP), discussed in the next section.
In this section, we look at version 4 of IP, officially defined in RFC 791. Although it
is intended that IPv4 will ultimately be replaced by IPv6, it is currently the standard
IP used in TCP/IP networks.
The Internet Protocol (IP) is part of the TCP/IP suite and is the most widely
used internetworking protocol. As with any protocol standard, IP is specified in two
parts (see Figure 2.9):
• The interface with a higher layer (e.g., TCP), specifying the services that IP provides
• The actual protocol format and mechanisms
In this section, we examine first IP services and then the protocol. This is followed by a discussion of IP address formats. Finally, the Internet Control Message
Protocol (ICMP), which is an integral part of IP, is described.
IP Services
The services to be provided across adjacent protocol layers (e.g., between IP and
TCP) are expressed in terms of primitives and parameters. A primitive specifies
the function to be performed, and the parameters are used to pass data and control information. The actual form of a primitive is implementation dependent. An
example is a procedure call.
IP provides two service primitives at the interface to the next higher layer. The
Send primitive is used to request transmission of a data unit. The Deliver primitive
is used by IP to notify a user of the arrival of a data unit. The parameters associated
with the two primitives are as follows:
• Source address: Internetwork address of sending IP entity.
• Destination address: Internetwork address of destination IP entity.
• Protocol: Recipient protocol entity (an IP user, such as TCP).
• Type-of-service indicators: Used to specify the treatment of the data unit in its transmission through component networks.
• Identification: Used in combination with the source and destination addresses and user protocol to identify the data unit uniquely. This parameter is needed for reassembly and error reporting.
• Don't fragment identifier: Indicates whether IP can fragment data to accomplish delivery.
• Time to live: Measured in seconds.
• Data length: Length of data being transmitted.
• Option data: Options requested by the IP user.
• Data: User data to be transmitted.
The identification, don't fragment identifier, and time to live parameters are present in the Send primitive but not in the Deliver primitive. These three parameters provide instructions to IP that are not of concern to the recipient IP user.
The options parameter allows for future extensibility and for inclusion of parameters that are usually not invoked. The currently defined options are as follows:
• Security: Allows a security label to be attached to a datagram.
• Source routing: A sequenced list of router addresses that specifies the route to
be followed. Routing may be strict (only identified routers may be visited) or
loose (other intermediate routers may be visited).
• Route recording: A field is allocated to record the sequence of routers visited
by the datagram.
• Stream identification: Names reserved resources used for stream service. This
service provides special handling for volatile periodic traffic (e.g., voice).
• Timestamping: The source IP entity and some or all intermediate routers add
a timestamp (precision to milliseconds) to the data unit as it goes by.
Internet Protocol
The protocol between IP entities is best described with reference to the IP datagram
format, shown in Figure 18.6. The fields are as follows:
• Version (4 bits): Indicates version number, to allow evolution of the protocol;
the value is 4.
Figure 18.6 IPv4 Header
(The fixed portion of the header is 20 octets, followed by any options and padding. Its fields are Version, Internet Header Length, DS/ECN, Total Length, Identification, Flags, Fragment Offset, Time to Live, Protocol, Header Checksum, Source Address, Destination Address, Options, and Padding.)
• Internet Header Length (IHL) (4 bits): Length of header in 32-bit words. The
minimum value is five, for a minimum header length of 20 octets.
• DS/ECN (8 bits): Prior to the introduction of differentiated services, this
field was referred to as the Type of Service field and specified reliability,
precedence, delay, and throughput parameters. This interpretation has now
been superseded. The first six bits of this field are now referred to as the DS
(Differentiated Services) field, discussed in Chapter 19. The remaining 2 bits
are reserved for an ECN (Explicit Congestion Notification) field, currently
in the process of standardization. The ECN field provides for explicit signaling of congestion in a manner similar to that discussed for frame relay
(Section 13.5).
• Total Length (16 bits): Total datagram length, including header plus data, in octets.
• Identification (16 bits): A sequence number that, together with the source
address, destination address, and user protocol, is intended to identify a datagram uniquely. Thus, this number should be unique for the datagram’s source
address, destination address, and user protocol for the time during which the
datagram will remain in the internet.
• Flags (3 bits): Only two of the bits are currently defined. The More bit is used
for fragmentation and reassembly, as previously explained. The Don’t Fragment bit prohibits fragmentation when set. This bit may be useful if it is known
that the destination does not have the capability to reassemble fragments.
However, if this bit is set, the datagram will be discarded if it exceeds the
maximum size of an en route network. Therefore, if the bit is set, it may be
advisable to use source routing to avoid networks with small maximum
packet size.
• Fragment Offset (13 bits): Indicates where in the original datagram this
fragment belongs, measured in 64-bit units. This implies that fragments other
than the last fragment must contain a data field that is a multiple of 64 bits
in length.
• Time to Live (8 bits): Specifies how long, in seconds, a datagram is allowed to remain in the internet. Every router that processes a datagram must decrease the TTL by at least one, so the TTL is similar to a hop count.
• Protocol (8 bits): Indicates the next higher level protocol that is to receive the data field at the destination; thus, this field identifies the type of the next header in the packet after the IP header. Example values are TCP = 6; UDP = 17; ICMP = 1. A complete list is maintained by the Internet Assigned Numbers Authority (IANA).
• Header Checksum (16 bits): An error-detecting code applied to the header only. Because some header fields may change during transit (e.g., Time to Live, fragmentation-related fields), this is reverified and recomputed at each router. The checksum is formed by taking the ones complement of the 16-bit ones complement addition of all 16-bit words in the header. For purposes of computation, the checksum field is itself initialized to a value of zero.5
• Source Address (32 bits): Coded to allow a variable allocation of bits to specify the network and the end system attached to the specified network, as discussed subsequently.
• Destination Address (32 bits): Same characteristics as source address.
• Options (variable): Encodes the options requested by the sending user.
• Padding (variable): Used to ensure that the datagram header is a multiple of 32 bits in length.
• Data (variable): The data field must be an integer multiple of 8 bits in length.
The maximum length of the datagram (data field plus header) is 65,535 octets.
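The checksum rule described above can be sketched directly. The sample header values below, including the addresses, are made-up examples for illustration:

```python
import struct

def header_checksum(header: bytes) -> int:
    """Ones complement of the ones-complement sum of the header's
    16-bit words, with the checksum field itself set to zero."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                       # fold carry bits back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# A minimal 20-octet header: Version 4, IHL 5, Total Length 20, TTL 64,
# Protocol 6 (TCP), checksum field zeroed, made-up source/destination.
hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                  bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
csum = header_checksum(hdr)
# With the checksum inserted, re-running the computation yields zero:
hdr = hdr[:10] + struct.pack("!H", csum) + hdr[12:]
assert header_checksum(hdr) == 0
```

This zero-on-reverification property is what lets each router cheaply confirm the header before recomputing the field after decrementing Time to Live.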
It should be clear how the IP services specified in the Send and Deliver primitives map into the fields of the IP datagram.
IP Addresses
The source and destination address fields in the IP header each contain a 32-bit global
internet address, generally consisting of a network identifier and a host identifier.
Network Classes The address is coded to allow a variable allocation of bits to
specify network and host, as depicted in Figure 18.7. This encoding provides
flexibility in assigning addresses to hosts and allows a mix of network sizes on an
internet. The three principal network classes are best suited to the following conditions:
• Class A: Few networks, each with many hosts
• Class B: Medium number of networks, each with a medium number of hosts
• Class C: Many networks, each with a few hosts
5 A discussion of this checksum is contained in Appendix K.
Figure 18.7 IPv4 Address Formats
(Class A addresses begin with a 0 bit, followed by a 7-bit network number and a 24-bit host number. Class B addresses begin with the bits 10, followed by a 14-bit network number and a 16-bit host number. Class C addresses begin with the bits 110, followed by a 21-bit network number and an 8-bit host number. Class D addresses begin with 1110, and Class E addresses begin with 11110 and are reserved for future use.)
In a particular environment, it may be best to use addresses all from one class. For example, a corporate internetwork that consists of a large number of departmental local area networks may need to use Class C addresses exclusively.
However, the format of the addresses is such that it is possible to mix all three
classes of addresses on the same internetwork; this is what is done in the case of
the Internet itself. A mixture of classes is appropriate for an internetwork consisting of a few large networks, many small networks, plus some medium-sized networks.
IP addresses are usually written in dotted decimal notation, with a decimal number representing each of the octets of the 32-bit address. For example, the IP address 11000000 11100100 00010001 00111001 is written as
Note that all Class A network addresses begin with a binary 0. Network
addresses with a first octet of 0 (binary 00000000) and 127 (binary 01111111) are
reserved, so there are 126 potential Class A network numbers, which have a first
dotted decimal number in the range 1 to 126. Class B network addresses begin with
a binary 10, so that the range of the first decimal number in a Class B address is 128 to 191 (binary 10000000 to 10111111). The second octet is also part of the Class B address, so that there are 2^14 = 16,384 Class B addresses. For Class C addresses, the first decimal number ranges from 192 to 223 (11000000 to 11011111). The total number of Class C addresses is 2^21 = 2,097,152.
Subnets and Subnet Masks The concept of subnet was introduced to address
the following requirement. Consider an internet that includes one or more WANs
and a number of sites, each of which has a number of LANs. We would like to allow
arbitrary complexity of interconnected LAN structures within an organization
while insulating the overall internet against explosive growth in network numbers
and routing complexity. One approach to this problem is to assign a single network
number to all of the LANs at a site. From the point of view of the rest of the internet, there is a single network at that site, which simplifies addressing and routing. To
allow the routers within the site to function properly, each LAN is assigned a subnet
number. The host portion of the internet address is partitioned into a subnet number
and a host number to accommodate this new level of addressing.
Within the subnetted network, the local routers must route on the basis of an
extended network number consisting of the network portion of the IP address and the
subnet number. The bit positions containing this extended network number are indicated by the address mask. The use of the address mask allows the host to determine
whether an outgoing datagram is destined for a host on the same LAN (send directly)
or another LAN (send datagram to router). It is assumed that some other means (e.g.,
manual configuration) are used to create address masks and make them known to the
local routers.
Table 18.3a shows the calculations involved in the use of a subnet mask. Note
that the effect of the subnet mask is to erase the portion of the host field that refers
to an actual host on a subnet. What remains is the network number and the subnet
number. Figure 18.8 shows an example of the use of subnetting. The figure shows a
local complex consisting of three LANs and two routers. To the rest of the internet,
this complex is a single network with a Class C address of the form 192.228.17.x,
where the leftmost three octets are the network number and the rightmost octet
contains a host number x. Both routers R1 and R2 are configured with a subnet
Table 18.3 IP Addresses and Subnet Masks [STEI95]

(a) Dotted decimal and binary representations of IP address and subnet masks
IP address
Subnet mask
Bitwise AND of address and mask (resultant network/subnet number)
Subnet number
Host number

(b) Default subnet masks
Class A default mask: (binary 11111111 00000000 00000000 00000000)
Example Class A mask:
Class B default mask: (binary 11111111 11111111 00000000 00000000)
Example Class B mask:
Class C default mask: (binary 11111111 11111111 11111111 00000000)
Example Class C mask: (binary 11111111 11111111 11111111 11111100)
Figure 18.8 Example of Subnetworking
(The figure shows a local complex of three LANs, assigned subnet numbers 1, 2, and 3, interconnected by two routers, R1 and R2. Each attached host's IP address combines the common Net ID/Subnet ID with a host number; hosts numbered 1 and 25 appear in the figure.)
mask with the value shown in Table 18.3a. For example, if a datagram with a destination address whose subnet portion corresponds to subnet 1 arrives at R1 from either the rest of the internet or from LAN Y, R1 applies the subnet mask to determine that this address refers to subnet 1, which is LAN X, and so forwards the datagram to LAN X. Similarly, if a datagram with that destination address arrives at R2 from LAN Z, R2
applies the mask and then determines from its forwarding database that datagrams
destined for subnet 1 should be forwarded to R1. Hosts must also employ a subnet
mask to make routing decisions.
The default subnet mask for a given class of addresses is a null mask (Table
18.3b), which yields the same network and host number as the non-subnetted address.
Internet Control Message Protocol (ICMP)
The IP standard specifies that a compliant implementation must also implement
ICMP (RFC 792). ICMP provides a means for transferring messages from
routers and other hosts to a host. In essence, ICMP provides feedback about
problems in the communication environment. Examples of its use are when a
datagram cannot reach its destination, when the router does not have the buffering capacity to forward a datagram, and when the router can direct the station to
send traffic on a shorter route. In most cases, an ICMP message is sent in
response to a datagram, either by a router along the datagram’s path or by the
intended destination host.
[Figure 18.9 ICMP Message Formats: (a) destination unreachable, time exceeded, and source quench messages carry the IP header plus the first 64 bits of the original datagram; (b) parameter problem adds a pointer into the original header; (c) redirect carries the gateway Internet address plus the IP header and first 64 bits of the original datagram; (d) echo and echo reply carry an identifier, sequence number, and optional data; (e) timestamp carries an identifier, sequence number, and originate timestamp; (f) timestamp reply adds receive and transmit timestamps; (g) address mask request and (h) address mask reply carry an identifier, sequence number, and address mask.]
Although ICMP is, in effect, at the same level as IP in the TCP/IP architecture,
it is a user of IP. An ICMP message is constructed and then passed down to IP, which
encapsulates the message with an IP header and then transmits the resulting datagram in the usual fashion. Because ICMP messages are transmitted in IP datagrams,
their delivery is not guaranteed and their use cannot be considered reliable.
Figure 18.9 shows the format of the various ICMP message types. An ICMP
message starts with a 64-bit header consisting of the following:
• Type (8 bits): Specifies the type of ICMP message.
• Code (8 bits): Used to specify parameters of the message that can be encoded
in one or a few bits.
• Checksum (16 bits): Checksum of the entire ICMP message. This is the same
checksum algorithm used for IP.
• Parameters (32 bits): Used to specify more lengthy parameters.
These fields are generally followed by additional information fields that further specify the content of the message.
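As a concrete illustration of this layout, the following sketch builds an echo (type 8) message and computes the 16-bit one's-complement checksum used by both IP and ICMP. The function names are our own, not from any standard API:

```python
import struct

def icmp_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, folded to 16 bits and inverted."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length data
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    # Type 8 = echo request, code 0; the Parameters word holds the
    # identifier and sequence number. The checksum covers the whole
    # message, computed with the checksum field initially zero.
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload
```

A convenient property of this checksum is that recomputing it over a valid message (checksum field included) yields zero, which is how a receiver verifies the message.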
In those cases in which the ICMP message refers to a prior datagram, the
information field includes the entire IP header plus the first 64 bits of the data field
of the original datagram. This enables the source host to match the incoming ICMP
message with the prior datagram. The reason for including the first 64 bits of the
data field is that this will enable the IP module in the host to determine which
upper-level protocol or protocols were involved. In particular, the first 64 bits would
include a portion of the TCP header or other transport-level header.
The destination unreachable message covers a number of contingencies. A
router may return this message if it does not know how to reach the destination network. In some networks, an attached router may be able to determine if a particular
host is unreachable and returns the message. The destination host itself may return
this message if the user protocol or some higher-level service access point is
unreachable. This could happen if the corresponding field in the IP header was set
incorrectly. If the datagram specifies a source route that is unusable, a message is
returned. Finally, if a router must fragment a datagram but the Don’t Fragment flag
is set, the datagram is discarded and a message is returned.
A router will return a time exceeded message if the lifetime of the datagram expires. A host will send this message if it cannot complete reassembly within a time limit.
A syntactic or semantic error in an IP header will cause a parameter problem
message to be returned by a router or host. For example, an incorrect argument may
be provided with an option. The Parameter field contains a pointer to the octet in
the original header where the error was detected.
The source quench message provides a rudimentary form of flow control.
Either a router or a destination host may send this message to a source host,
requesting that it reduce the rate at which it is sending traffic to the internet destination. On receipt of a source quench message, the source host should cut back
the rate at which it is sending traffic to the specified destination until it no longer
receives source quench messages. The source quench message can be used by a
router or host that must discard datagrams because of a full buffer. In that case,
the router or host will issue a source quench message for every datagram that it
discards. In addition, a system may anticipate congestion and issue source quench
messages when its buffers approach capacity. In that case, the datagram referred
to in the source quench message may well be delivered. Thus, receipt of a source
quench message does not imply delivery or nondelivery of the corresponding datagram.
A router sends a redirect message to a host on a directly connected network to
advise the host of a better route to a particular destination. The following is an
example, using Figure 18.8. Router R1 receives a datagram from host C on network
Y, to which R1 is attached. R1 checks its routing table and obtains the address for
the next router, R2, on the route to the datagram’s internet destination network, Z.
Because R2 and the host identified by the internet source address of the datagram
are on the same network, R1 sends a redirect message to C. The redirect message
advises the host to send its traffic for network Z directly to router R2, because this is
a shorter path to the destination. The router forwards the original datagram to its
internet destination (via R2). The address of R2 is contained in the parameter field
of the redirect message.
The echo and echo reply messages provide a mechanism for testing that
communication is possible between entities. The recipient of an echo message is
obligated to return the message in an echo reply message. An identifier and
sequence number are associated with the echo message to be matched in the
echo reply message. The identifier might be used like a service access point to
identify a particular session, and the sequence number might be incremented on
each echo request sent.
The timestamp and timestamp reply messages provide a mechanism for
sampling the delay characteristics of the internet. The sender of a timestamp message may include an identifier and sequence number in the parameters field and
include the time that the message is sent (originate timestamp). The receiver
records the time it received the message and the time that it transmits the reply
message in the timestamp reply message. If the timestamp message is sent using
strict source routing, then the delay characteristics of a particular route can be determined.
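Given the three timestamps, plus the local time at which the reply returns, the sender can estimate the round-trip time and, if the two clocks are synchronized, one-way delays as well. A sketch of the arithmetic, with variable names of our own choosing:

```python
def icmp_timestamp_delays(originate, receive, transmit, returned):
    """Estimate delays from an ICMP timestamp exchange.

    originate: when the request left the sender
    receive:   when the request reached the responder
    transmit:  when the reply left the responder
    returned:  when the reply reached the sender (measured locally)
    All times in milliseconds since midnight UT, as ICMP specifies.
    """
    # Round trip excludes the responder's turnaround time (transmit - receive)
    # and needs no clock synchronization.
    rtt = (returned - originate) - (transmit - receive)
    # One-way estimates are meaningful only if the two clocks agree.
    outbound = receive - originate
    inbound = returned - transmit
    return rtt, outbound, inbound
```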
The address mask request and address mask reply messages are useful in
an environment that includes subnets. The address mask request and reply
messages allow a host to learn the address mask for the LAN to which it connects. The host broadcasts an address mask request message on the LAN. The
router on the LAN responds with an address mask reply message that contains
the address mask.
Address Resolution Protocol (ARP)
Earlier in this chapter, we referred to the concepts of a global address (IP
address) and an address that conforms to the addressing scheme of the network
to which a host is attached (subnetwork address). For a local area network, the
latter address is a MAC address, which provides a physical address for a host port
attached to the LAN. Clearly, to deliver an IP datagram to a destination host, a
mapping must be made from the IP address to the subnetwork address for that
last hop. If a datagram traverses one or more routers between source and destination hosts, then the mapping must be done in the final router, which is attached
to the same subnetwork as the destination host. If a datagram is sent from one
host to another on the same subnetwork, then the source host must do the mapping. In the following discussion, we use the term system to refer to the entity that
does the mapping.
For mapping from an IP address to a subnetwork address, a number of
approaches are possible, including
• Each system can maintain a local table of IP addresses and matching subnetwork addresses for possible correspondents. This approach does not accommodate easy and automatic additions of new hosts to the subnetwork.
• The subnetwork address can be a subset of the network portion of the IP address. However, the entire internet address is only 32 bits long, and for most subnetwork types (e.g., Ethernet, with its 48-bit MAC addresses) the subnetwork address field is longer than 32 bits.
• A centralized directory can be maintained on each subnetwork that contains the
IP-subnet address mappings. This is a reasonable solution for many networks.
• An address resolution protocol can be used. This is a simpler approach than
the use of a centralized directory and is well suited to LANs.
RFC 826 defines an Address Resolution Protocol (ARP), which allows
dynamic distribution of the information needed to build tables to translate an IP
address A into a 48-bit Ethernet address; the protocol can be used for any broadcast
network. ARP exploits the broadcast property of a LAN; namely, that a transmission from any device on the network is received by all other devices on the network.
ARP works as follows:
1. Each system on the LAN maintains a table of known IP-subnetwork address mappings.
2. When a subnetwork address is needed for an IP address, and the mapping is not
found in the system’s table, the system uses ARP directly on top of the LAN protocol (e.g., IEEE 802) to broadcast a request. The broadcast message contains the
IP address for which a subnetwork address is needed.
3. Other hosts on the subnetwork listen for ARP messages and reply when a
match occurs. The reply includes both the IP and subnetwork addresses of the
replying host.
4. The original request includes the requesting host’s IP address and subnetwork
address. Any interested host can copy this information into its local table, avoiding the need for later ARP messages.
5. The ARP message can also be used simply to broadcast a host’s IP address and
subnetwork address, for the benefit of others on the subnetwork.
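The five steps above can be sketched as follows. The message dictionaries and the `lan.broadcast` hook are illustrative stand-ins, not the RFC 826 wire format:

```python
# Illustrative ARP-style resolution over a broadcast LAN.
arp_table = {}  # known IP address -> subnetwork (MAC) address mappings (step 1)

def resolve(target_ip, my_ip, my_mac, lan):
    if target_ip in arp_table:
        return arp_table[target_ip]
    # Step 2: broadcast a request naming the IP address to be resolved.
    # The request also carries our own mapping, which any listener
    # may copy into its local table (step 4).
    reply = lan.broadcast({"op": "request", "target_ip": target_ip,
                           "sender_ip": my_ip, "sender_mac": my_mac})
    if reply is not None:
        # Step 3: the matching host replied with both of its addresses.
        arp_table[reply["sender_ip"]] = reply["sender_mac"]
        return reply["sender_mac"]
    return None
```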
18.5 IPv6
The Internet Protocol (IP) has been the foundation of the Internet and virtually all
multivendor private internetworks. This protocol is reaching the end of its useful life
and a new protocol, known as IPv6 (IP version 6), has been defined to ultimately
replace IP.6
We first look at the motivation for developing a new version of IP and then
examine some of its details.
IP Next Generation
The driving motivation for the adoption of a new version of IP was the limitation
imposed by the 32-bit address field in IPv4. With a 32-bit address field, it is possible in principle to assign 2^32 different addresses, which is over 4 billion possible
addresses. One might think that this number of addresses was more than adequate to meet addressing needs on the Internet. However, in the late 1980s it was
perceived that there would be a problem, and this problem began to manifest
itself in the early 1990s. Reasons for the inadequacy of 32-bit addresses include
the following:
• The two-level structure of the IP address (network number, host number) is
convenient but wasteful of the address space. Once a network number is
assigned to a network, all of the host-number addresses for that network
number are assigned to that network. The address space for that network may be sparsely used, but as far as the effective IP address space is concerned, if a network number is used, then all addresses within the network are used.
[Footnote 6: The currently deployed version of IP is IP version 4; previous versions of IP (1 through 3) were successively defined and replaced to reach IPv4. Version 5 is the number assigned to the Stream Protocol, a connection-oriented internet-layer protocol; hence the use of the label version 6.]
• The IP addressing model generally requires that a unique network number be assigned to each IP network whether or not it is actually connected to the Internet.
• Networks are proliferating rapidly. Most organizations boast multiple LANs, not just a single LAN system. Wireless networks have rapidly assumed a major role. The Internet itself has grown explosively for years.
• Growth of TCP/IP usage into new areas will result in a rapid growth in the demand for unique IP addresses. Examples include using TCP/IP to interconnect electronic point-of-sale terminals and for cable television receivers.
• Typically, a single IP address is assigned to each host. A more flexible arrangement is to allow multiple IP addresses per host. This, of course, increases the demand for IP addresses.
So the need for an increased address space dictated that a new version of IP
was needed. In addition, IP is a very old protocol, and new requirements in the
areas of address configuration, routing flexibility, and traffic support had been defined.
In response to these needs, the Internet Engineering Task Force (IETF) issued
a call for proposals for a next generation IP (IPng) in July of 1992. A number of
proposals were received, and by 1994 the final design for IPng emerged. A major
milestone was reached with the publication of RFC 1752, “The Recommendation
for the IP Next Generation Protocol,” issued in January 1995. RFC 1752 outlines the
requirements for IPng, specifies the PDU formats, and highlights the IPng approach
in the areas of addressing, routing, and security. A number of other Internet documents defined details of the protocol, now officially called IPv6; these include an
overall specification of IPv6 (RFC 2460), an RFC dealing with addressing structure
of IPv6 (RFC 2373), and numerous others.
IPv6 includes the following enhancements over IPv4:
• Expanded address space: IPv6 uses 128-bit addresses instead of the 32-bit
addresses of IPv4. This is an increase of address space by a factor of 2^96. It has been pointed out [HIND95] that this allows on the order of 6 × 10^23
unique addresses per square meter of the surface of the earth. Even if
addresses are very inefficiently allocated, this address space seems inexhaustible.
• Improved option mechanism: IPv6 options are placed in separate optional
headers that are located between the IPv6 header and the transport-layer
header. Most of these optional headers are not examined or processed by any
router on the packet’s path. This simplifies and speeds up router processing of
IPv6 packets compared to IPv4 datagrams.7 It also makes it easier to add
additional options.
[Footnote 7: The protocol data unit for IPv6 is referred to as a packet rather than a datagram, which is the term used for IPv4 PDUs.]
• Address autoconfiguration: This capability provides for dynamic assignment
of IPv6 addresses.
• Increased addressing flexibility: IPv6 includes the concept of an anycast
address, for which a packet is delivered to just one of a set of nodes. The scalability of multicast routing is improved by adding a scope field to multicast addresses.
• Support for resource allocation: IPv6 enables the labeling of packets
belonging to a particular traffic flow for which the sender requests special
handling. This aids in the support of specialized traffic such as real-time video.
All of these features are explored in the remainder of this section.
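The address-space arithmetic cited in the first enhancement is easy to verify. The earth-surface area of roughly 5.1 × 10^14 square meters used below is our own assumption, not a figure from the text:

```python
# IPv6 expands the address field from 32 bits to 128 bits.
growth_factor = 2**128 // 2**32          # a factor of 2**96
total_addresses = 2**128                 # about 3.4e38 addresses

earth_surface_m2 = 5.1e14                # approximate surface area of the earth
per_square_meter = total_addresses / earth_surface_m2

print(f"{per_square_meter:.1e}")         # roughly 6.7e+23 per square meter
```

The result is consistent with the order-of-magnitude figure of 6 × 10^23 addresses per square meter attributed to [HIND95].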
IPv6 Structure
An IPv6 protocol data unit (known as a packet) has the following general form:
|<-- 40 octets -->|<---------------- 0 or more ---------------->|
| IPv6 header | Extension header | ... | Extension header | Transport-level PDU |
The only header that is required is referred to simply as the IPv6 header. This is
of fixed size with a length of 40 octets, compared to 20 octets for the mandatory portion of the IPv4 header (Figure 18.6). The following extension headers have been defined:
• Hop-by-Hop Options header: Defines special options that require hop-by-hop processing
• Routing header: Provides extended routing, similar to IPv4 source routing
• Fragment header: Contains fragmentation and reassembly information
• Authentication header: Provides packet integrity and authentication
• Encapsulating Security Payload header: Provides privacy
• Destination Options header: Contains optional information to be examined
by the destination node
The IPv6 standard recommends that, when multiple extension headers are
used, the IPv6 headers appear in the following order:
1. IPv6 header: Mandatory, must always appear first
2. Hop-by-Hop Options header
3. Destination Options header: For options to be processed by the first destination
that appears in the IPv6 Destination Address field plus subsequent destinations
listed in the Routing header
4. Routing header
5. Fragment header
6. Authentication header
7. Encapsulating Security Payload header
8. Destination Options header: For options to be processed only by the final destination of the packet
Figure 18.10 shows an example of an IPv6 packet that includes an instance
of each header, except those related to security. Note that the IPv6 header and
each extension header include a Next Header field. This field identifies the type
of the immediately following header. If the next header is an extension header,
then this field contains the type identifier of that header. Otherwise, this field
contains the protocol identifier of the upper-layer protocol using IPv6 (typically
a transport-level protocol), using the same values as the IPv4 Protocol field. In
Figure 18.10, the upper-layer protocol is TCP; thus, the upper-layer data carried
by the IPv6 packet consist of a TCP header followed by a block of application data.
We first look at the main IPv6 header and then examine each of the extensions
in turn.
[Figure 18.10 IPv6 Packet with Extension Headers (containing a TCP segment): an IPv6 header followed by a Hop-by-Hop Options header, Routing header, Fragment header, and Destination Options header (with an optional variable part), then the TCP header and application data; the Next Header field of each header identifies the header that follows it.]

[Figure 18.11 IPv6 Header: a 40-octet layout, 32 bits wide, containing the Version, DS/ECN, Flow Label, Payload Length, Next Header, Hop Limit, Source Address, and Destination Address fields.]
IPv6 Header
The IPv6 header has a fixed length of 40 octets, consisting of the following fields
(Figure 18.11):
• Version (4 bits): Internet protocol version number; the value is 6.
• DS/ECN (8 bits): Available for use by originating nodes and/or forwarding
routers for differentiated services and congestion functions, as described for
the IPv4 DS/ECN field.
• Flow Label (20 bits): May be used by a host to label those packets for which it is
requesting special handling by routers within a network; discussed subsequently.
• Payload Length (16 bits): Length of the remainder of the IPv6 packet following the header, in octets. In other words, this is the total length of all of the
extension headers plus the transport-level PDU.
• Next Header (8 bits): Identifies the type of header immediately following the
IPv6 header; this will either be an IPv6 extension header or a higher-layer
header, such as TCP or UDP.
• Hop Limit (8 bits): The remaining number of allowable hops for this packet. The
hop limit is set to some desired maximum value by the source and decremented
by 1 by each node that forwards the packet. The packet is discarded if Hop Limit
is decremented to zero. This is a simplification over the processing required for
the Time to Live field of IPv4. The consensus was that the extra effort in accounting for time intervals in IPv4 added no significant value to the protocol. In fact,
IPv4 routers, as a general rule, treat the Time to Live field as a hop limit field.
• Source Address (128 bits): The address of the originator of the packet.
• Destination Address (128 bits): The address of the intended recipient of the
packet. This may not in fact be the intended ultimate destination if a Routing
header is present, as explained subsequently.
Although the IPv6 header is longer than the mandatory portion of the IPv4
header (40 octets versus 20 octets), it contains fewer fields (8 versus 12). Thus,
routers have less processing to do per header, which should speed up routing.
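The fixed 40-octet header can be packed and unpacked with ordinary binary-struct handling. A minimal sketch, with field names following Figure 18.11 (the function itself is ours, not a standard API):

```python
import struct

def parse_ipv6_header(pkt: bytes) -> dict:
    """Parse the fixed 40-octet IPv6 header into its eight fields."""
    # First 32-bit word: Version (4) | DS/ECN (8) | Flow Label (20),
    # then Payload Length (16), Next Header (8), Hop Limit (8).
    first_word, payload_len, next_hdr, hop_limit = struct.unpack("!IHBB", pkt[:8])
    return {
        "version": first_word >> 28,
        "ds_ecn": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_len,   # extension headers + transport PDU
        "next_header": next_hdr,         # same values as the IPv4 Protocol field
        "hop_limit": hop_limit,
        "src": pkt[8:24],                # 128-bit source address
        "dst": pkt[24:40],               # 128-bit destination address
    }
```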
Flow Label RFC 3697 defines a flow as a sequence of packets sent from a particular source to a particular (unicast, anycast, or multicast) destination for which the
source desires special handling by the intervening routers. A flow is uniquely identified by the combination of a source address, destination address, and a nonzero
20-bit flow label. Thus, all packets that are to be part of the same flow are assigned
the same flow label by the source.
From the source’s point of view, a flow typically will be a sequence of packets
that are generated from a single application instance at the source and that have the
same transfer service requirements. A flow may comprise a single TCP connection
or even multiple TCP connections; an example of the latter is a file transfer application, which could have one control connection and multiple data connections. A single application may generate a single flow or multiple flows. An example of the
latter is multimedia conferencing, which might have one flow for audio and one for
graphic windows, each with different transfer requirements in terms of data rate,
delay, and delay variation.
From the router’s point of view, a flow is a sequence of packets that share
attributes that affect how these packets are handled by the router. These include
path, resource allocation, discard requirements, accounting, and security attributes.
The router may treat packets from different flows differently in a number of ways,
including allocating different buffer sizes, giving different precedence in terms of
forwarding, and requesting different quality of service from networks.
There is no special significance to any particular flow label. Instead the special
handling to be provided for a packet flow must be declared in some other way. For
example, a source might negotiate or request special handling ahead of time from
routers by means of a control protocol, or at transmission time by information in
one of the extension headers in the packet, such as the Hop-by-Hop Options
header. Examples of special handling that might be requested include some sort of
nondefault quality of service and some form of real-time service.
In principle, all of a user’s requirements for a particular flow could be defined
in an extension header and included with each packet. If we wish to leave the concept of flow open to include a wide variety of requirements, this design approach
could result in very large packet headers. The alternative, adopted for IPv6, is the
flow label, in which the flow requirements are defined prior to flow commencement
and a unique flow label is assigned to the flow. In this case, the router must save flow
requirement information about each flow.
The following rules apply to the flow label:
1. Hosts or routers that do not support the Flow Label field must set the field to
zero when originating a packet, pass the field unchanged when forwarding a
packet, and ignore the field when receiving a packet.
2. All packets originating from a given source with the same nonzero Flow Label
must have the same Destination Address, Source Address, Hop-by-Hop
Options header contents (if this header is present), and Routing header contents (if this header is present). The intent is that a router can decide how to
route and process the packet by simply looking up the flow label in a table and
without examining the rest of the header.
3. The source assigns a flow label to a flow. New flow labels must be chosen
(pseudo-) randomly and uniformly in the range 1 to 2^20 - 1, subject to the
restriction that a source must not reuse a flow label for a new flow within the
lifetime of the existing flow. The zero flow label is reserved to indicate that no
flow label is being used.
This last point requires some elaboration. The router must maintain information about the characteristics of each active flow that may pass through it, presumably in some sort of table. To forward packets efficiently and rapidly, table lookup
must be efficient. One alternative is to have a table with 2^20 (about 1 million) entries,
one for each possible flow label; this imposes an unnecessary memory burden on the
router. Another alternative is to have one entry in the table per active flow, include
the flow label with each entry, and require the router to search the entire table each
time a packet is encountered. This imposes an unnecessary processing burden on the
router. Instead, most router designs are likely to use some sort of hash table
approach. With this approach a moderate-sized table is used, and each flow entry is
mapped into the table using a hashing function on the flow label. The hashing function might simply be the low-order few bits (say 8 or 10) of the flow label or some
simple calculation on the 20 bits of the flow label. In any case, the efficiency of the
hash approach typically depends on the flow labels being uniformly distributed over
their possible range. Hence requirement number 3 in the preceding list.
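A hash-table organization of per-flow state along these lines might look like the following sketch; the bucket count and the use of the low-order bits of the flow label as the hash are illustrative choices:

```python
TABLE_BITS = 8                               # moderate-sized table: 256 buckets
table = [[] for _ in range(1 << TABLE_BITS)] # each bucket: list of (label, state)

def insert_flow(label: int, state) -> None:
    # Hash function: simply the low-order TABLE_BITS bits of the flow label.
    table[label & ((1 << TABLE_BITS) - 1)].append((label, state))

def lookup_flow(label: int):
    # Search only the one bucket the label hashes to, comparing full labels,
    # since distinct labels can share a bucket.
    for stored_label, state in table[label & ((1 << TABLE_BITS) - 1)]:
        if stored_label == label:
            return state
    return None
```

Because buckets stay short only if labels are spread evenly across their range, this sketch makes concrete why rule 3 requires sources to choose flow labels uniformly at random.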
IPv6 Addresses
IPv6 addresses are 128 bits in length. Addresses are assigned to individual interfaces
on nodes, not to the nodes themselves.8 A single interface may have multiple unique
unicast addresses. Any of the unicast addresses associated with a node’s interface
may be used to uniquely identify that node.
The combination of long addresses and multiple addresses per interface
enables improved routing efficiency over IPv4. In IPv4, addresses generally do not
have a structure that assists routing, and therefore a router may need to maintain
a huge table of routing paths. Longer internet addresses allow for aggregating
addresses by hierarchies of network, access provider, geography, corporation, and so
on. Such aggregation should make for smaller routing tables and faster table
lookups. The allowance for multiple addresses per interface would allow a subscriber that uses multiple access providers across the same interface to have separate addresses aggregated under each provider’s address space.
IPv6 allows three types of addresses:
• Unicast: An identifier for a single interface. A packet sent to a unicast address
is delivered to the interface identified by that address.
[Footnote 8: In IPv6, a node is any device that implements IPv6; this includes hosts and routers.]
• Anycast: An identifier for a set of interfaces (typically belonging to different
nodes). A packet sent to an anycast address is delivered to one of the interfaces identified by that address (the “nearest” one, according to the routing
protocols’ measure of distance).
• Multicast: An identifier for a set of interfaces (typically belonging to different
nodes). A packet sent to a multicast address is delivered to all interfaces identified by that address.
Hop-by-Hop Options Header
The Hop-by-Hop Options header carries optional information that, if present, must
be examined by every router along the path. This header consists of (Figure 18.12a):
• Next Header (8 bits): Identifies the type of header immediately following this header.
• Header Extension Length (8 bits): Length of this header in 64-bit units, not
including the first 64 bits.
• Options: A variable-length field consisting of one or more option definitions.
Each definition is in the form of three subfields: Option Type (8 bits), which
identifies the option; Length (8 bits), which specifies the length of the Option
Data field in octets; and Option Data, which is a variable-length specification
of the option.
[Figure 18.12 IPv6 Extension Headers: (a) the Hop-by-Hop Options and Destination Options headers (Next Header, Header Extension Length, one or more options); (b) the Fragment header (Next Header, Reserved, Fragment Offset, Res, M flag, Identification); (c) the generic Routing header (Next Header, Header Extension Length, Routing Type, Segments Left, followed by type-specific data); (d) the Type 0 Routing header (Segments Left, Reserved, and a list of addresses).]
It is actually the lowest-order five bits of the Option Type field that are used to
specify a particular option. The high-order two bits indicate the action to be taken
by a node that does not recognize this option type, as follows:
• 00—Skip over this option and continue processing the header.
• 01—Discard the packet.
• 10—Discard the packet and send an ICMP Parameter Problem message to the
packet’s Source Address, pointing to the unrecognized Option Type.
• 11—Discard the packet and, only if the packet’s Destination Address is not a
multicast address, send an ICMP Parameter Problem message to the packet’s
Source Address, pointing to the unrecognized Option Type.
The third highest-order bit specifies whether the Option Data field does not
change (0) or may change (1) en route from source to destination. Data that may
change must be excluded from authentication calculations, as discussed in Chapter 21.
These conventions for the Option Type field also apply to the Destination
Options header.
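The TLV option layout and the action bits can be sketched as follows; the function names are ours, and the Pad1 special case (option type 0, described below, which has no length field) is handled explicitly:

```python
def parse_options(data: bytes):
    """Walk the TLV-encoded Options area of a Hop-by-Hop or
    Destination Options header, returning (type, data) pairs."""
    opts, i = [], 0
    while i < len(data):
        opt_type = data[i]
        if opt_type == 0:                    # Pad1: one padding octet, no length
            i += 1
            continue
        length = data[i + 1]                 # length of the Option Data field
        opts.append((opt_type, data[i + 2:i + 2 + length]))
        i += 2 + length
    return opts

def unrecognized_action(opt_type: int) -> str:
    # The high-order two bits of the Option Type tell a node what to do
    # when it does not recognize the option.
    return ["skip", "discard", "discard+ICMP",
            "discard+ICMP unless multicast dest"][opt_type >> 6]
```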
Four hop-by-hop options have been specified so far:
• Pad1: Used to insert one byte of padding into the Options area of the header.
• PadN: Used to insert N bytes (N ≥ 2) of padding into the Options area of the
header. The two padding options ensure that the header is a multiple of 8 bytes
in length.
• Jumbo payload: Used to send IPv6 packets with payloads longer than 65,535
octets. The Option Data field of this option is 32 bits long and gives the length of
the packet in octets, excluding the IPv6 header. For such packets, the Payload
Length field in the IPv6 header must be set to zero, and there must be no
Fragment header. With this option, IPv6 supports packet sizes up to more than 4
billion octets. This facilitates the transmission of large video packets and enables
IPv6 to make the best use of available capacity over any transmission medium.
• Router alert: Informs the router that the contents of this packet are of interest to the router, so that it can handle any control data accordingly. The absence of this option in an IPv6 datagram informs the router that the packet does not contain information needed by the router and hence can be safely routed without further packet parsing. Hosts originating IPv6 packets are required to include this option in certain circumstances. The purpose of this option is to provide efficient
support for protocols such as RSVP (Chapter 19) that generate packets that
need to be examined by intermediate routers for purposes of traffic control.
Rather than requiring the intermediate routers to look in detail at the extension
headers of a packet, this option alerts the router when such attention is required.
Fragment Header
In IPv6, fragmentation may only be performed by source nodes, not by routers
along a packet’s delivery path. To take full advantage of the internetworking environment, a node must perform a path discovery algorithm that enables it to learn
the smallest maximum transmission unit (MTU) supported by any network on the
path. With this knowledge, the source node will fragment, as required, for each given
destination address. Otherwise the source must limit all packets to 1280 octets,
which is the minimum MTU that must be supported by each network.
The fragment header consists of the following (Figure 18.12b):
• Next Header (8 bits): Identifies the type of header immediately following this header.
• Reserved (8 bits): For future use.
• Fragment Offset (13 bits): Indicates where in the original packet the payload of
this fragment belongs, measured in 64-bit units. This implies that fragments (other
than the last fragment) must contain a data field that is a multiple of 64 bits long.
• Res (2 bits): Reserved for future use.
• M Flag (1 bit): 1 = more fragments; 0 = last fragment.
• Identification (32 bits): Intended to uniquely identify the original packet. The
identifier must be unique for the packet’s source address and destination
address for the time during which the packet will remain in the internet. All
fragments with the same identifier, source address, and destination address are
reassembled to form the original packet.
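Packing these fields is straightforward. The helper below is our own sketch, with the offset given in 8-octet units as the 13-bit Fragment Offset field requires:

```python
import struct

def build_fragment_header(next_header: int, offset_units: int,
                          more: bool, ident: int) -> bytes:
    """Pack an 8-octet IPv6 Fragment header.

    offset_units is the Fragment Offset in 64-bit (8-octet) units; it shares
    a 16-bit word with the 2 reserved (Res) bits and the M (more fragments)
    flag in the lowest bit.
    """
    offset_word = (offset_units << 3) | (1 if more else 0)
    return struct.pack("!BBHI", next_header, 0, offset_word, ident)
```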
The fragmentation algorithm is the same as that described in Section 18.3.
Routing Header
The Routing header contains a list of one or more intermediate nodes to be visited
on the way to a packet’s destination. All routing headers start with a 32-bit block
consisting of four 8-bit fields, followed by routing data specific to a given routing
type (Figure 18.12c). The four 8-bit fields are as follows:
• Next Header: Identifies the type of header immediately following this header.
• Header Extension Length: Length of this header in 64-bit units, not including
the first 64 bits.
• Routing Type: Identifies a particular Routing header variant. If a router does
not recognize the Routing Type value, it must discard the packet.
• Segments Left: Number of route segments remaining; that is, the number of explicitly listed intermediate nodes still to be visited before reaching the final destination.
The only specific routing header format defined in RFC 2460 is the Type 0 Routing header (Figure 18.12d). When using the Type 0 Routing header, the source node
does not place the ultimate destination address in the IPv6 header. Instead, that address
is the last address listed in the Routing header (Address[n] in Figure 18.12d), and the
IPv6 header contains the destination address of the first desired router on the path. The
Routing header will not be examined until the packet reaches the node identified in the
IPv6 header. At that point, the IPv6 and Routing header contents are updated and the
packet is forwarded. The update consists of placing the next address to be visited in the
IPv6 header and decrementing the Segments Left field in the Routing header.
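The swap-and-decrement update described above can be sketched as follows; this is a simplified model that treats addresses as opaque values (the real header is binary, and the names here are illustrative):

```python
def type0_routing_update(dest_addr, addresses, segments_left):
    """Sketch of the Type 0 Routing header update at an intermediate
    node: swap the IPv6 destination address with the next listed
    address and decrement Segments Left."""
    if segments_left == 0:
        return dest_addr, addresses, 0       # header fully processed
    i = len(addresses) - segments_left       # index of next address to visit
    next_dest = addresses[i]
    addresses = addresses[:i] + [dest_addr] + addresses[i + 1:]
    return next_dest, addresses, segments_left - 1
```

Applying this repeatedly walks the packet through each listed node until Segments Left reaches zero, at which point the IPv6 header holds the ultimate destination.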
Destination Options Header
The Destination Options header carries optional information that, if present, is
examined only by the packet’s destination node. The format of this header is the
same as that of the Hop-by-Hop Options header (Figure 18.12a).
In today’s distributed computing environment, the virtual private network (VPN)
offers an attractive solution to network managers. In essence, a VPN consists of a set
of computers that interconnect by means of a relatively unsecure network and that
make use of encryption and special protocols to provide security. At each corporate
site, workstations, servers, and databases are linked by one or more local area networks (LANs). The LANs are under the control of the network manager and can be
configured and tuned for cost-effective performance. The Internet or some other public network can be used to interconnect sites, providing a cost savings over the use of a
private network and offloading the wide area network management task to the public
network provider. That same public network provides an access path for telecommuters and other mobile employees to log on to corporate systems from remote sites.
But the manager faces a fundamental requirement: security. Use of a public
network exposes corporate traffic to eavesdropping and provides an entry point
for unauthorized users. To counter this problem, the manager may choose from a
variety of encryption and authentication packages and products. Proprietary solutions raise a number of problems. First, how secure is the solution? If proprietary
encryption or authentication schemes are used, there may be little reassurance in
the technical literature as to the level of security provided. Second is the question
of compatibility. No manager wants to be limited in the choice of workstations,
servers, routers, firewalls, and so on by a need for compatibility with the security
facility. This is the motivation for the IP Security (IPSec) set of Internet standards.
In 1994, the Internet Architecture Board (IAB) issued a report titled “Security in
the Internet Architecture” (RFC 1636). The report stated the general consensus that
the Internet needs more and better security and identified key areas for security
mechanisms. Among these were the need to secure the network infrastructure from
unauthorized monitoring and control of network traffic and the need to secure end-user-to-end-user traffic using authentication and encryption mechanisms.
To provide security, the IAB included authentication and encryption as necessary security features in the next-generation IP, which has been issued as IPv6. Fortunately, these security capabilities were designed to be usable both with the current
IPv4 and the future IPv6. This means that vendors can begin offering these features
now, and many vendors do now have some IPSec capability in their products. The
IPSec specification now exists as a set of Internet standards.
Applications of IPSec
IPSec provides the capability to secure communications across a LAN, across private
and public WANs, and across the Internet. Examples of its use include the following:
• Secure branch office connectivity over the Internet: A company can build a
secure virtual private network over the Internet or over a public WAN. This
enables a business to rely heavily on the Internet and reduce its need for
private networks, saving costs and network management overhead.
• Secure remote access over the Internet: An end user whose system is equipped
with IP security protocols can make a local call to an Internet service provider
(ISP) and gain secure access to a company network. This reduces the cost of
toll charges for traveling employees and telecommuters.
• Establishing extranet and intranet connectivity with partners: IPSec can be
used to secure communication with other organizations, ensuring authentication and confidentiality and providing a key exchange mechanism.
• Enhancing electronic commerce security: Even though some Web and electronic commerce applications have built-in security protocols, the use of IPSec
enhances that security. IPSec guarantees that all traffic designated by the network administrator is both encrypted and authenticated, adding an additional
layer of security to whatever is provided at the application layer.
The principal feature of IPSec that enables it to support these varied applications is that it can encrypt and/or authenticate all traffic at the IP level. Thus, all
distributed applications, including remote logon, client/server, e-mail, file transfer,
Web access, and so on, can be secured.
Figure 18.13 is a typical scenario of IPSec usage. An organization maintains
LANs at dispersed locations. Nonsecure IP traffic is conducted on each LAN. For traffic offsite, through some sort of private or public WAN, IPSec protocols are used. These
protocols operate in networking devices, such as a router or firewall, that connect each
LAN to the outside world. The IPSec networking device will typically encrypt and
compress all traffic going into the WAN, and decrypt and decompress traffic coming
from the WAN; these operations are transparent to workstations and servers on the
LAN. Secure transmission is also possible with individual users who dial into the WAN.
Such user workstations must implement the IPSec protocols to provide security.
Benefits of IPSec
Some of the benefits of IPSec are as follows:
• When IPSec is implemented in a firewall or router, it provides strong security
that can be applied to all traffic crossing the perimeter. Traffic within a company or workgroup does not incur the overhead of security-related processing.
• IPSec in a firewall is resistant to bypass if all traffic from the outside must use
IP and the firewall is the only means of entrance from the Internet into the organization.
• IPSec is below the transport layer (TCP, UDP) and so is transparent to applications. There is no need to change software on a user or server system when
IPSec is implemented in the firewall or router. Even if IPSec is implemented
in end systems, upper-layer software, including applications, is not affected.
• IPSec can be transparent to end users. There is no need to train users on security mechanisms, issue keying material on a per-user basis, or revoke keying
material when users leave the organization.
• IPSec can provide security for individual users if needed. This is useful for offsite workers and for setting up a secure virtual subnetwork within an organization for sensitive applications.
Figure 18.13 An IP Security Scenario (user systems and networking devices with IPSec exchange IP packets carrying a secure IP payload across a public Internet or private network)
IPSec Functions
IPSec provides three main facilities: an authentication-only function referred to as
Authentication Header (AH), a combined authentication/encryption function
called Encapsulating Security Payload (ESP), and a key exchange function. For
VPNs, both authentication and encryption are generally desired, because it is
important both to (1) assure that unauthorized users do not penetrate the virtual
private network and (2) assure that eavesdroppers on the Internet cannot read messages sent over the virtual private network. Because both features are generally
desirable, most implementations are likely to use ESP rather than AH. The key
exchange function allows for manual exchange of keys as well as an automated
scheme. IPSec is explored in Chapter 21.
[RODR02] provides clear coverage of all of the topics in this chapter. Good coverage of
internetworking and IPv4 can be found in [COME06] and [STEV94]. [SHAN02] and
[KENT87] provide useful discussions of fragmentation. [LEE05] is a thorough technical
description of IPv6. [KESH98] provides an instructive look at present and future router
functionality. [METZ02] and [DOI04] describe the IPv6 anycast feature. For the reader
interested in a more in-depth discussion of IP addressing, [SPOR03] offers a wealth of information.
COME06 Comer, D. Internetworking with TCP/IP, Volume I: Principles, Protocols, and
Architecture. Upper Saddle River, NJ: Prentice Hall, 2006.
DOI04 Doi, S., et al. “IPv6 Anycast for Simple and Effective Communications.” IEEE
Communications Magazine, May 2004.
HUIT98 Huitema, C. IPv6: The New Internet Protocol. Upper Saddle River, NJ: Prentice
Hall, 1998.
KENT87 Kent, C., and Mogul, J. “Fragmentation Considered Harmful.” ACM Computer
Communication Review, October 1987.
KESH98 Keshav, S., and Sharma, R. “Issues and Trends in Router Design.” IEEE Communications Magazine, May 1998.
LEE05 Lee, H. Understanding IPv6. New York: Springer-Verlag, 2005.
METZ02 Metz, C. “IP Anycast.” IEEE Internet Computing, March 2002.
RODR02 Rodriguez, A., et al. TCP/IP Tutorial and Technical Overview. Upper Saddle
River, NJ: Prentice Hall, 2002.
SHAN02 Shannon, C.; Moore, D.; and Claffy, K. “Beyond Folklore: Observations on
Fragmented Traffic.” IEEE/ACM Transactions on Networking, December 2002.
SPOR03 Sportack, M. IP Addressing Fundamentals. Indianapolis, IN: Cisco Press, 2003.
STEV94 Stevens, W. TCP/IP Illustrated, Volume 1: The Protocols. Reading, MA:
Addison-Wesley, 1994.
Recommended Web sites:
• IPv6: Information about IPv6 and related topics.
• IPv6 Working Group: Chartered by IETF to develop standards related to IPv6. The
Web site includes all relevant RFCs and Internet drafts.
• IPv6 Forum: An industry consortium that promotes IPv6-related products. Includes a
number of white papers and articles.
Key Terms
datagram lifetime
end system
intermediate system
Internet Control Message
Protocol (ICMP)
Internet Protocol (IP)
subnet mask
traffic class
Review Questions
Give some reasons for using fragmentation and reassembly.
List the requirements for an internetworking facility.
What are the pros and cons of limiting reassembly to the endpoint as compared to
allowing en route reassembly?
Explain the function of the three flags in the IPv4 header.
How is the IPv4 header checksum calculated?
What is the difference between the traffic class and flow label fields in the IPv6 header?
Briefly explain the three types of IPv6 addresses.
What is the purpose of each of the IPv6 header types?
Although not explicitly stated, the Internet Protocol (IP) specification, RFC 791,
defines the minimum packet size a network technology must support to allow IP to
run over it.
a. Read Section 3.2 of RFC 791 to find out that value. What is it?
b. Discuss the reasons for adopting that specific value.
In the discussion of IP, it was mentioned that the identifier, don’t fragment identifier,
and time-to-live parameters are present in the Send primitive but not in the Deliver
primitive because they are only of concern to IP. For each of these parameters, indicate whether it is of concern to the IP entity in the source, the IP entities in any intermediate routers, and the IP entity in the destination end systems. Justify your answer.
What is the header overhead in the IP protocol?
Describe some circumstances where it might be desirable to use source routing rather
than let the routers make the routing decision.
Because of fragmentation, an IP datagram can arrive in several pieces, not necessarily
in the correct order. The IP entity at the receiving end system must accumulate these
fragments until the original datagram is reconstituted.
a. Consider that the IP entity creates a buffer for assembling the data field in the original datagram. As assembly proceeds, the buffer will contain blocks of data and “holes”
between the data blocks. Describe an algorithm for reassembly based on this concept.
b. For the algorithm in part (a), it is necessary to keep track of the holes. Describe a
simple mechanism for doing this.
A 4480-octet datagram is to be transmitted and needs to be fragmented because it
will pass through an Ethernet with a maximum payload of 1500 octets. Show the
Total Length, More Flag, and Fragment Offset values in each of the resulting fragments.
Consider a header that consists of 10 octets, with the checksum in the last two octets
(this does not correspond to any actual header format) with the following content (in
hexadecimal): 01 00 F6 F7 F4 F5 F2 03 00 00
a. Calculate the checksum. Show your calculation.
b. Show the resulting packet.
c. Verify the checksum.
The IP checksum needs to be recalculated at routers because of changes to the IP
header, such as the lifetime field. It is possible to recalculate the checksum from
scratch. Suggest a procedure that involves less calculation. Hint: Suppose that the
value in octet k is changed by Z = new_value - old_value; consider the effect of this
change on the checksum.
An IP datagram is to be fragmented. Which options in the option field need to be
copied into the header of each fragment, and which need only be retained in the first
fragment? Justify the handling of each option.
A transport-layer message consisting of 1500 bits of data and 160 bits of header is sent
to an internet layer, which appends another 160 bits of header. This is then transmitted through two networks, each of which uses a 24-bit packet header. The destination
network has a maximum packet size of 800 bits. How many bits, including headers, are
delivered to the network-layer protocol at the destination?
The architecture suggested by Figure 18.2 is to be used. What functions could be
added to the routers to alleviate some of the problems caused by the mismatched
local and long-haul networks?
Should internetworking be concerned with a network’s internal routing? Why or why not?
Provide the following parameter values for each of the network classes A, B, and C.
Be sure to consider any special or reserved addresses in your calculations.
a. Number of bits in network portion of address
b. Number of bits in host portion of address
c. Number of distinct networks allowed
d. Number of distinct hosts per network allowed
e. Integer range of first octet
What percentage of the total IP address space does each of the network classes represent?
What is the difference between the subnet mask for a Class A address with 16 bits for
the subnet ID and a class B address with 8 bits for the subnet ID?
Is the subnet mask valid for a Class A address?
Given a network address of and a subnet mask of,
a. How many subnets are created?
b. How many hosts are there per subnet?
Given a company with six individual departments and each department having ten
computers or networked devices, what mask could be applied to the company network to provide the subnetting necessary to divide up the network equally?
In contemporary routing and addressing, the notation commonly used is called classless interdomain routing or CIDR. With CIDR, the number of bits in the mask is indicated in the following fashion: This corresponds to a mask of If this example would provide for 256 host addresses on the network,
how many addresses are provided with the following?
Find out about your network. Using the command “ipconfig”, “ifconfig”, or “winipcfg”,
we can learn not only our IP address but other network parameters as well. Can you
determine your mask, gateway, and the number of addresses available on your network?
Using your IP address and your mask, what is your network address? This is determined by converting the IP address and the mask to binary and then proceeding with
a bitwise logical AND operation. For example, given the address and the
mask, we would discover that the network address would be
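The worked address and mask values did not survive in this text, so the values below are hypothetical, but the bitwise AND described above can be sketched with Python's standard ipaddress module:

```python
import ipaddress

def network_address(ip, mask):
    """Bitwise AND of an IPv4 address and its subnet mask yields the
    network address. The example values used below are illustrative."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))
```

For example, an address of 192.168.5.130 with a mask of 255.255.255.0 yields the network address 192.168.5.0.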
Compare the individual fields of the IPv4 header with the IPv6 header. Account for
the functionality provided by each IPv4 field by showing how the same functionality
is provided in IPv6.
Justify the recommended order in which IPv6 extension headers appear (i.e., why is
the Hop-by-Hop Options header first, why is the Routing header before the Fragment header, and so on).
The IPv6 standard states that if a packet with a nonzero flow label arrives at a router
and the router has no information for that flow label, the router should ignore the
flow label and forward the packet.
a. What are the disadvantages of treating this event as an error, discarding the
packet, and sending an ICMP message?
b. Are there situations in which routing the packet as if its flow label were zero will
cause the wrong result? Explain.
The IPv6 flow mechanism assumes that the state associated with a given flow label is
stored in routers, so they know how to handle packets that carry that flow label. A
design requirement is to flush flow labels that are no longer being used (stale flow
label) from routers.
a. Assume that a source always sends a control message to all affected routers deleting a flow label when the source finishes with that flow. In that case, how could a
stale flow label persist?
b. Suggest router and source mechanisms to overcome the problem of stale flow labels.
The question arises as to which packets generated by a source should carry nonzero
IPv6 flow labels. For some applications, the answer is obvious. Small exchanges of
data should have a zero flow label because it is not worth creating a flow for a few
packets. Real-time flows should have a flow label; such flows are a primary reason
flow labels were created. A more difficult issue is what to do with peers sending large
amounts of best-effort traffic (e.g., TCP connections). Make a case for assigning a
unique flow label to each long-term TCP connection. Make a case for not doing this.
The original IPv6 specifications combined the Traffic Class and Flow Label fields into
a single 28-bit Flow Label field. This allowed flows to redefine the interpretation of
different values of priority. Suggest reasons why the final specification includes the
Priority field as a distinct field.
For Type 0 IPv6 routing, specify the algorithm for updating the IPv6 and Routing
headers by intermediate nodes.
19.1 Multicasting
19.2 Routing Protocols
19.3 Integrated Services Architecture
19.4 Differentiated Services
19.5 Service Level Agreements
19.6 IP Performance Metrics
19.7 Recommended Reading and Web Sites
19.8 Key Terms, Review Questions, and Problems
She occupied herself with studying a map on the opposite wall because she knew
she would have to change trains at some point. Tottenham Court Road must be that
point, an interchange from the black line to the red. This train would take her there,
was bearing her there rapidly now, and at the station she would follow the signs, for
signs there must be, to the Central Line going westward.
—King Solomon’s Carpet, Barbara Vine (Ruth Rendell)
The act of sending a packet from a source to multiple destinations is
referred to as multicasting. Multicasting raises design issues in the
areas of addressing and routing.
Routing protocols in an internet function in a similar fashion to those
used in packet-switching networks. An internet routing protocol is
used to exchange information about reachability and traffic delays,
allowing each router to construct a next-hop routing table for paths
through the internet. Typically, relatively simple routing protocols are
used between autonomous systems within a larger internet and more
complex routing protocols are used within each autonomous system.
The integrated services architecture is a response to the growing variety and volume of traffic experienced in the Internet and intranets. It
provides a framework for the development of protocols such as RSVP
to handle multimedia/multicast traffic and provides guidance to
router vendors on the development of efficient techniques for handling a varied load.
The differentiated services architecture is designed to provide a simple, easy-to-implement, low-overhead tool to support a range of network services that are differentiated on the basis of performance.
Differentiated services are provided on the basis of a 6-bit label in the
IP header, which classifies traffic in terms of the type of service to be
given by routers for that traffic.
As the Internet and private internets grow in scale, a host of new demands march
steadily into view. Low-volume TELNET conversations are leapfrogged by high-volume client/server applications. To this has been added more recently the tremendous volume of Web traffic, which is increasingly graphics intensive. Now real-time
voice and video applications add to the burden.
To cope with these demands, it is not enough to increase internet capacity. Sensible and effective methods for managing the traffic and controlling congestion are
needed. Historically, IP-based internets have been able to provide a simple best-effort
delivery service to all applications using an internet. But the needs of users have
changed. A company may have spent millions of dollars installing an IP-based internet
designed to transport data among LANs but now finds that new real-time, multimedia, and multicasting applications are not well supported by such a configuration. The
only networking scheme designed from day one to support both traditional TCP and
UDP traffic and real-time traffic is ATM. However, reliance on ATM means either
constructing a second networking infrastructure for real-time traffic or replacing the
existing IP-based configuration with ATM, both of which are costly alternatives.
Thus, there is a strong need to be able to support a variety of traffic with a variety of quality-of-service (QoS) requirements, within the TCP/IP architecture. This
chapter looks at the internetwork functions and services designed to meet this need.
We begin this chapter with a discussion of multicasting. Next we explore the
issue of internetwork routing algorithms. Next, we look at the Integrated Services
Architecture (ISA), which provides a framework for current and future internet services. Then we examine differentiated services. Finally, we introduce the topics of
service level agreements and IP performance metrics.
Refer to Figure 2.5 to see the position within the TCP/IP suite of the protocols
discussed in this chapter.
Typically, an IP address refers to an individual host on a particular network. IP also
accommodates addresses that refer to a group of hosts on one or more networks.
Such addresses are referred to as multicast addresses, and the act of sending a packet
from a source to the members of a multicast group is referred to as multicasting.
Multicasting has a number of practical applications. For example,
• Multimedia: A number of users “tune in” to a video or audio transmission
from a multimedia source station.
• Teleconferencing: A group of workstations form a multicast group such that a
transmission from any member is received by all other group members.
• Database: All copies of a replicated file or database are updated at the same time.
• Distributed computation: Intermediate results are sent to all participants.
• Real-time workgroup: Files, graphics, and messages are exchanged among
active group members in real time.
Multicasting done within the scope of a single LAN segment is straightforward. IEEE 802 and other LAN protocols include provision for MAC-level multicast addresses. A packet with a multicast address is transmitted on a LAN segment.
Those stations that are members of the corresponding multicast group recognize the
multicast address and accept the packet. In this case, only a single copy of the packet
is ever transmitted. This technique works because of the broadcast nature of a LAN:
A transmission from any one station is received by all other stations on the LAN.
In an internet environment, multicasting is a far more difficult undertaking. To
see this, consider the configuration of Figure 19.1; a number of LANs are interconnected by routers. Routers connect to each other either over high-speed links or across
a wide area network (network N4). A cost is associated with each link or network in
Figure 19.1 Example Configuration (a multicast server and group members on LANs interconnected by routers)
each direction, indicated by the value shown leaving the router for that link or network.
Suppose that the multicast server on network N1 is transmitting packets to a multicast
address that represents the workstations indicated on networks N3, N5, N6. Suppose
that the server does not know the location of the members of the multicast group. Then
one way to assure that the packet is received by all members of the group is to
broadcast a copy of each packet to each network in the configuration, over the least-cost route for each network. For example, one packet would be addressed to N3 and
would traverse N1, link L3, and N3. Router B is responsible for translating the IP-level
multicast address to a MAC-level multicast address before transmitting the MAC
frame onto N3. Table 19.1 summarizes the number of packets generated on the various
links and networks in order to transmit one packet to a multicast group by this method.
In this table, the source is the multicast server on network N1 in Figure 19.1; the multicast address includes the group members on N3, N5, and N6. Each column in the table
refers to the path taken from the source host to a destination router attached to a particular destination network. Each row of the table refers to a network or link in the
configuration of Figure 19.1. Each entry in the table gives the number of packets that
Table 19.1 Traffic Generated by Various Multicasting Strategies: (a) Broadcast; (b) Multiple Unicast; (c) Multicast. Each column gives the packets generated on each network or link for the path from source S to a destination network (S : N2, S : N3, S : N5, S : N6), with totals.
traverse a given network or link for a given path. A total of 13 copies of the packet are
required for the broadcast technique.
Now suppose the source system knows the location of each member of the
multicast group. That is, the source has a table that maps a multicast address into a
list of networks that contain members of that multicast group. In that case, the
source need only send packets to those networks that contain members of the
group. We could refer to this as the multiple unicast strategy. Table 19.1 shows that
in this case, 11 packets are required.
Both the broadcast and multiple unicast strategies are inefficient because they
generate unnecessary copies of the source packet. In a true multicast strategy, the
following method is used:
1. The least-cost path from the source to each network that includes members of
the multicast group is determined. This results in a spanning tree¹ of the configuration. Note that this is not a full spanning tree of the configuration. Rather, it
is a spanning tree that includes only those networks containing group members.
2. The source transmits a single packet along the spanning tree.
3. The packet is replicated by routers only at branch points of the spanning tree.
Figure 19.2a shows the spanning tree for transmissions from the source to the
multicast group, and Figure 19.2b shows this method in action. The source transmits a
single packet over N1 to router D. D makes two copies of the packet, to transmit over
¹The concept of spanning tree was introduced in our discussion of bridges in Chapter 15. A spanning tree
of a graph consists of all the nodes of the graph plus a subset of the links (edges) of the graph that provides connectivity (a path exists between any two nodes) with no closed loops (there is only one path
between any two nodes).
Figure 19.2 Multicast Transmission Example: (a) spanning tree from source to multicast group; (b) packets generated for multicast transmission
links L3 and L4. B receives the packet from L3 and transmits it on N3, where it is read
by members of the multicast group on the network. Meanwhile, C receives the packet
sent on L4. It must now deliver that packet to both E and F. If network N4 were a
broadcast network (e.g., an IEEE 802 LAN), then C would only need to transmit one
instance of the packet for both routers to read. If N4 is a packet-switching WAN, then
C must make two copies of the packet and address one to E and one to F. Each of
these routers, in turn, retransmits the received packet on N5 and N6, respectively. As
Table 19.1 shows, the multicast technique requires only eight copies of the packet.
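The copy count for the true multicast strategy can be verified with a small sketch. The tree below mirrors the configuration just described (N4 is the packet-switching WAN, so it carries one addressed copy per downstream router; every other network or link carries one copy); this is an illustrative model, not code from the text:

```python
def count_copies(tree, wan_nodes, root):
    """Count copies of a multicast packet, one per network/link of the
    spanning tree; a packet-switched WAN carries one copy per child."""
    total = 0
    stack = [root]
    while stack:
        n = stack.pop()
        children = tree.get(n, [])
        total += len(children) if n in wan_nodes else 1
        stack.extend(children)
    return total

# Simplified spanning tree from Figures 19.1/19.2: N1 feeds links L3 and
# L4; L3 reaches N3; L4 reaches WAN N4, which reaches N5 and N6.
tree = {"N1": ["L3", "L4"], "L3": ["N3"], "L4": ["N4"],
        "N4": ["N5", "N6"], "N3": [], "N5": [], "N6": []}
print(count_copies(tree, {"N4"}, "N1"))   # 8, matching Table 19.1(c)
```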
Requirements for Multicasting
In ordinary unicast transmission over an internet, in which each datagram has a unique
destination network, the task of each router is to forward the datagram along the shortest path from that router to the destination network. With multicast transmission, the
router may be required to forward two or more copies of an incoming datagram. In our
example, routers D and C both must forward two copies of a single incoming datagram.
Thus, we might expect that the overall functionality of multicast routing is
more complex than unicast routing. The following is a list of required functions:
1. A convention is needed for identifying a multicast address. In IPv4, Class D
addresses are reserved for this purpose. These are 32-bit addresses with 1110 as
their high-order 4 bits, followed by a 28-bit group identifier. In IPv6, a 128-bit
multicast address consists of an 8-bit prefix of all ones, a 4-bit flags field, a 4-bit
scope field, and a 112-bit group identifier. The flags field currently indicates only
whether this address is permanently assigned. The scope field indicates the
scope of applicability of the address, ranging from a single network to global.
2. Each node (router or source node participating in the routing algorithm) must
translate between an IP multicast address and a list of networks that contain
members of this group. This information allows the node to construct a shortest-path spanning tree to all of the networks containing group members.
3. A router must translate between an IP multicast address and a network multicast address in order to deliver a multicast IP datagram on the destination network. For example, in IEEE 802 networks, a MAC-level address is 48 bits long;
if the highest-order bit is 1, then it is a multicast address. Thus, for multicast
delivery, a router attached to an IEEE 802 network must translate a 32-bit
IPv4 or a 128-bit IPv6 multicast address into a 48-bit IEEE 802 MAC-level
multicast address.
4. Although some multicast addresses may be assigned permanently, the more
usual case is that multicast addresses are generated dynamically and that individual hosts may join and leave multicast groups dynamically. Thus, a mechanism is needed by which an individual host informs routers attached to the
same network as itself of its inclusion in and exclusion from a multicast group.
IGMP, described subsequently, provides this mechanism.
5. Routers must exchange two sorts of information. First, routers need to know
which networks include members of a given multicast group. Second, routers
need sufficient information to calculate the shortest path to each network containing group members. These requirements imply the need for a multicast
routing protocol. A discussion of such protocols is beyond the scope of this chapter.
6. A routing algorithm is needed to calculate shortest paths to all group members.
7. Each router must determine multicast routing paths on the basis of both
source and destination addresses.
The last point is a subtle consequence of the use of multicast addresses. To
illustrate the point, consider again Figure 19.1. If the multicast server transmits a
unicast packet addressed to a host on network N5, the packet is forwarded by router
D to C, which then forwards the packet to E. Similarly, a packet addressed to a host
on network N3 is forwarded by D to B. But now suppose that the server transmits a
packet with a multicast address that includes hosts on N3, N5, and N6. As we have
discussed, D makes two copies of the packet and sends one to B and one to C. What
will C do when it receives a packet with such a multicast address? C knows that this
packet is intended for networks N3, N5, and N6. A simple-minded approach would
be for C to calculate the shortest path to each of these three networks. This produces
the shortest-path spanning tree shown in Figure 19.3. As a result, C sends two copies
of the packet out over N4, one intended for N5 and one intended for N6. But it also
sends a copy of the packet to B for delivery on N3. Thus B will receive two copies of
the packet, one from D and one from C. This is clearly not what was intended by the
host on N1 when it launched the packet.
To avoid unnecessary duplication of packets, each router must route packets
on the basis of both source and multicast destination. When C receives a packet
intended for the multicast group from a source on N1, it must calculate the spanning
tree with N1 as the root (shown in Figure 19.2a) and route on the basis of that
spanning tree.
[Figure 19.3: Spanning Tree from Router C to Multicast Group]
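The (source, group)-based forwarding just described can be sketched as a table keyed on both values. The router, interface names, and group address below are hypothetical; a real router would build such entries from a multicast routing protocol rather than static configuration:

```python
# Hypothetical forwarding state at router D: packets from a source on N1
# addressed to this group are copied out toward routers B and C.
forwarding = {
    # (source network, group address): outgoing interfaces
    ("N1", "224.1.1.1"): ["to_B", "to_C"],
}

def forward(table, source_net, group, in_if):
    """Look up outgoing interfaces by BOTH source and group, and never
    send a copy back out the interface the packet arrived on."""
    out = table.get((source_net, group), [])
    return [i for i in out if i != in_if]

print(forward(forwarding, "N1", "224.1.1.1", "to_N1"))  # ['to_B', 'to_C']
```

Keying on the source as well as the group is exactly what prevents C, in the example above, from sending B a duplicate copy.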
Internet Group Management Protocol (IGMP)
IGMP, defined in RFC 3376, is used by hosts and routers to exchange multicast
group membership information over a LAN. IGMP takes advantage of the broadcast nature of a LAN to provide an efficient technique for the exchange of information among multiple hosts and routers. In general, IGMP supports two principal operations:
1. Hosts send messages to routers to subscribe to and unsubscribe from a multicast group defined by a given multicast address.
2. Routers periodically check which multicast groups are of interest to which hosts.
IGMP is currently at version 3. In IGMPv1, hosts could join a multicast group
and routers used a timer to unsubscribe group members. IGMPv2 enabled a host to
request to be unsubscribed from a group. The first two versions used essentially the
following operational model:
• Receivers have to subscribe to multicast groups.
• Sources do not have to subscribe to multicast groups.
• Any host can send traffic to any multicast group.
This paradigm is very general, but it also has some weaknesses:
1. Spamming of multicast groups is easy. Even if there are application-level filters
to drop unwanted packets, these packets still consume valuable resources in
the network and in the receiver that has to process them.
2. Establishment of the multicast distribution trees is problematic. This is mainly
because the location of sources is not known.
3. Finding globally unique multicast addresses is difficult. It is always possible
that another multicast group uses the same multicast address.
IGMPv3 addresses these weaknesses by
1. Allowing hosts to specify the list of hosts from which they want to receive traffic. Traffic from other hosts is blocked at routers.
2. Allowing hosts to block packets that come from sources that send unwanted traffic.
The remainder of this section discusses IGMPv3.
IGMP Message Format All IGMP messages are transmitted in IP datagrams.
The current version defines two message types: Membership Query and Membership Report.
A Membership Query message is sent by a multicast router. There are three
subtypes: a general query, used to learn which groups have members on an attached
network; a group-specific query, used to learn if a particular group has any members
on an attached network; and a group-and-source-specific query, used to learn if any
attached device desires reception of packets sent to a specified multicast address,
from any of a specified list of sources. Figure 19.4a shows the message format, which
consists of the following fields:
• Type: Defines this message type.
• Max Response Code: Indicates the maximum allowed time before sending a
responding report in units of 1/10 second.
• Checksum: An error-detecting code, calculated as the 16-bit ones complement
addition of all the 16-bit words in the message. For purposes of computation,
the Checksum field is itself initialized to a value of zero. This is the same
checksum algorithm used in IPv4.
• Group Address: Zero for a general query message; a valid IP multicast group
address when sending a group-specific query or group-and-source-specific query.
• S Flag: When set to one, indicates to any receiving multicast routers that they
are to suppress the normal timer updates they perform upon hearing a query.
• QRV (querier’s robustness variable): If nonzero, the QRV field contains the
RV value used by the querier (i.e., the sender of the query). Routers adopt the
RV value from the most recently received query as their own RV value, unless
that most recently received RV was zero, in which case the receivers use the
default value or a statically configured value. The RV dictates how many times
a host will retransmit a report to assure that it is not missed by any attached
multicast routers.
• QQIC (querier’s querier interval code): Specifies the QI value used by the
querier, which is a timer for sending multiple queries. Multicast routers that
are not the current querier adopt the QI value from the most recently received
query as their own QI value, unless that most recently received QI was zero, in
which case the receiving routers use the default QI value.
[Figure 19.4: IGMPv3 Message Formats. (a) Membership query message (type 17): Max Resp Code, Group Address (class D IPv4 address), Resv/S/QRV, Number of Sources (N), Source Address [1] through Source Address [N]. (b) Membership report message (type 34): Number of Group Records (M), Group Record [1] through Group Record [M]. (c) Group record: Record Type, Aux Data Len, Number of Sources (N), Multicast Address, Source Address [1] through Source Address [N], Auxiliary Data]
• Number of Sources: Specifies how many source addresses are present in this
query. This value is nonzero only for a group-and-source-specific query.
• Source Addresses: If the number of sources is N, then there are N 32-bit unicast addresses appended to the message.
A Membership Report message consists of the following fields:
• Type: Defines this message type.
• Checksum: An error-detecting code, calculated as the 16-bit ones complement
addition of all the 16-bit words in the message.
• Number of Group Records: Specifies how many group records are present in
this report.
• Group Records: If the number of group records is M, then there are M variable-length
group records appended to the message.
A group record includes the following fields:
• Record Type: Defines this record type, as described subsequently.
• Aux Data Length: Length of the auxiliary data field, in 32-bit words.
• Number of Sources: Specifies how many source addresses are present in this record.
• Multicast Address: The IP multicast address to which this record pertains.
• Source Addresses: If the number of sources is N, then there are N 32-bit unicast addresses appended to the message.
• Auxiliary Data: Additional information pertaining to this record. Currently,
no auxiliary data values are defined.
IGMP Operation The objective of each host in using IGMP is to make itself
known as a member of a group with a given multicast address to other hosts on the
LAN and to all routers on the LAN. IGMPv3 introduces the ability for hosts to signal group membership with filtering capabilities with respect to sources. A host can
either signal that it wants to receive traffic from all sources sending to a group
except for some specific sources (called EXCLUDE mode) or that it wants to
receive traffic only from some specific sources sending to the group (called
INCLUDE mode). To join a group, a host sends an IGMP membership report message, in which the group address field is the multicast address of the group. This message is sent in an IP datagram with the same multicast destination address. In other
words, the Group Address field of the IGMP message and the Destination Address
field of the encapsulating IP header are the same. All hosts that are currently members of this multicast group will receive the message and learn of the new group
member. Each router attached to the LAN must listen to all IP multicast addresses
in order to hear all reports.
To maintain a valid current list of active group addresses, a multicast router
periodically issues an IGMP general query message, sent in an IP datagram with an
all-hosts multicast address. Each host that still wishes to remain a member of one or
more multicast groups must read datagrams with the all-hosts address. When such a
host receives the query, it must respond with a report message for each group to
which it claims membership.
Note that the multicast router does not need to know the identity of every host in
a group. Rather, it needs to know that there is at least one group member still active.
Therefore, each host in a group that receives a query sets a timer with a random delay.
Any host that hears another host claim membership in the group will cancel its own
report. If no other report is heard and the timer expires, a host sends a report. With this
scheme, only one member of each group should provide a report to the multicast router.
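The delay-and-suppress behavior just described can be sketched as a small simulation. The host names are hypothetical, and this models only the suppression idea, not the full protocol state machine:

```python
import random

def simulate_query_round(members, max_resp_time=10.0):
    """Each group member picks a random delay in [0, max_resp_time).
    The member whose timer fires first sends the report; all others hear
    it on the LAN and cancel their own. Returns the single reporter."""
    delays = {h: random.uniform(0, max_resp_time) for h in members}
    reporter = min(delays, key=delays.get)   # first timer to expire
    return reporter

members = ["hostA", "hostB", "hostC"]
print(simulate_query_round(members))   # exactly one member reports
```

Whichever host is returned, the router learns what it needs: at least one member of the group is still active on the attached network.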
When a host leaves a group, it sends a leave group message to the all-routers
static multicast address. This is accomplished by sending a membership report message with the INCLUDE option and a null list of source addresses; that is, no
sources are to be included, effectively leaving the group. When a router receives
such a message for a group that has group members on the reception interface, it
needs to determine if there are any remaining group members. For this purpose, the
router uses the group-specific query message.
Group Membership with IPv6 IGMP was defined for operation with IPv4
and makes use of 32-bit addresses. IPv6 internets need this same functionality.
Rather than define a separate version of IGMP for IPv6, its functions have
been incorporated into the new version of the Internet Control Message Protocol
(ICMPv6). ICMPv6 includes all of the functionality of ICMPv4 and IGMP. For
multicast support, ICMPv6 includes both a group-membership query and a group-membership report message, which are used in the same fashion as in IGMP.
The routers in an internet are responsible for receiving and forwarding packets
through the interconnected set of networks. Each router makes routing decisions
based on knowledge of the topology and traffic/delay conditions of the internet. In a
simple internet, a fixed routing scheme is possible. In more complex internets, a
degree of dynamic cooperation is needed among the routers. In particular, the router
must avoid portions of the network that have failed and should avoid portions of the
network that are congested. To make such dynamic routing decisions, routers
exchange routing information using a special routing protocol for that purpose.
Information is needed about the status of the internet, in terms of which networks
can be reached by which routes, and the delay characteristics of various routes.
In considering the routing function, it is important to distinguish two concepts:
• Routing information: Information about the topology and delays of the internet
• Routing algorithm: The algorithm used to make a routing decision for a particular datagram, based on current routing information
Autonomous Systems
To proceed with our discussion of routing protocols, we need to introduce the concept of an autonomous system. An autonomous system (AS) exhibits the following characteristics:
1. An AS is a set of routers and networks managed by a single organization.
2. An AS consists of a group of routers exchanging information via a common routing protocol.
3. Except in times of failure, an AS is connected (in a graph-theoretic sense); that
is, there is a path between any pair of nodes.
A shared routing protocol, which we shall refer to as an interior router protocol (IRP), passes routing information between routers within an AS. The protocol used within the AS does not need to be implemented outside of the system.
This flexibility allows IRPs to be custom tailored to specific applications and requirements.
It may happen, however, that an internet will be constructed of more than one
AS. For example, all of the LANs at a site, such as an office complex or campus,
could be linked by routers to form an AS. This system might be linked through a
wide area network to other ASs. The situation is illustrated in Figure 19.5. In this
case, the routing algorithms and information in routing tables used by routers in different ASs may differ. Nevertheless, the routers in one AS need at least a minimal
level of information concerning networks outside the system that can be reached.
[Figure 19.5: Application of Exterior and Interior Routing Protocols, showing autonomous systems 1 and 2, each running an interior router protocol, linked by an exterior router protocol]
We refer to the protocol used to pass routing information between routers in different ASs as an exterior router protocol (ERP).2
We can expect that an ERP will need to pass less information than an IRP, for
the following reason. If a datagram is to be transferred from a host in one AS to a
host in another AS, a router in the first system need only determine the target AS
and devise a route to get into that target system. Once the datagram enters the target AS, the routers within that system can cooperate to deliver the datagram; the
ERP is not concerned with, and does not know about, the details of the route followed within the target AS.
In the remainder of this section, we look at what are perhaps the most important examples of these two types of routing protocols: BGP and OSPF. But first, it is
useful to look at a different way of characterizing routing protocols.
Approaches to Routing
Internet routing protocols employ one of three approaches to gathering and using
routing information: distance-vector routing, link-state routing, and path-vector routing.
Distance-vector routing requires that each node (router or host that implements the routing protocol) exchange information with its neighboring nodes. Two
nodes are said to be neighbors if they are both directly connected to the same network. This is the approach used in the first-generation routing algorithm for
ARPANET, as described in Section 12.2. For this purpose, each node maintains a
vector of link costs for each directly attached network and distance and next-hop
vectors for each destination. The relatively simple Routing Information Protocol
(RIP) uses this approach.
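The core update step of distance-vector routing, the Bellman-Ford relaxation underlying RIP, can be sketched as follows; the table layout is an illustrative simplification:

```python
def dv_update(my_table, neighbor, neighbor_vector, link_cost):
    """Merge a neighbor's advertised distance vector into our table.
    my_table maps destination network -> (cost, next_hop). A route is
    adopted when going via this neighbor is strictly cheaper."""
    changed = False
    for dest, cost in neighbor_vector.items():
        new_cost = link_cost + cost
        if dest not in my_table or new_cost < my_table[dest][0]:
            my_table[dest] = (new_cost, neighbor)
            changed = True
    return changed                 # True means we re-advertise to neighbors

# A router directly attached to N1 hears from neighbor B (link cost 1)
# that B reaches N2 at cost 1 and N3 at cost 2:
table = {"N1": (0, None)}
dv_update(table, "B", {"N2": 1, "N3": 2}, 1)
print(table)   # N2 at cost 2 via B, N3 at cost 3 via B
```

The return value captures why propagation can be slow: only a change triggers a new advertisement, so news of a failed link must ripple hop by hop through the internet.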
Distance-vector routing requires the transmission of a considerable amount of
information by each router. Each router must send a distance vector to all of its
neighbors, and that vector contains the estimated path cost to all networks in the
configuration. Furthermore, when there is a significant change in a link cost or when
a link is unavailable, it may take a considerable amount of time for this information
to propagate through the internet.
Link-state routing is designed to overcome the drawbacks of distance-vector
routing. When a router is initialized, it determines the link cost on each of its network interfaces. The router then advertises this set of link costs to all other routers
in the internet topology, not just neighboring routers. From then on, the router
monitors its link costs. Whenever there is a significant change (a link cost increases
or decreases substantially, a new link is created, an existing link becomes unavailable), the router again advertises its set of link costs to all other routers in the configuration.
Because each router receives the link costs of all routers in the configuration,
each router can construct the topology of the entire configuration and then calculate the shortest path to each destination network. Having done this, the router can
construct its routing table, listing the first hop to each destination. Because the
router has a representation of the entire network, it does not use a distributed version of a routing algorithm, as is done in distance-vector routing. Rather, the router
can use any routing algorithm to determine the shortest paths. In practice, Dijkstra's
algorithm is used. The Open Shortest Path First (OSPF) protocol is an example of a
routing protocol that uses link-state routing. The second-generation routing algorithm for ARPANET also uses this approach.
[Footnote 2: In the literature, the terms interior gateway protocol (IGP) and exterior gateway protocol (EGP) are often used for what are referred to here as IRP and ERP. However, because the terms IGP and EGP also refer to specific protocols, we avoid their use to define the general concepts.]
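A sketch of Dijkstra's algorithm run over a link-state database, producing both the path costs and the first hop needed for the routing table; the graph below is hypothetical:

```python
import heapq

def dijkstra(graph, source):
    """Shortest paths from source over a complete topology.
    graph: node -> {neighbor: link_cost}. Returns (cost table, first-hop
    table), i.e. exactly what a link-state router needs for its routing
    table."""
    dist = {source: 0}
    first_hop = {}
    pq = [(0, source, None)]                      # (cost, node, first hop)
    while pq:
        d, node, hop = heapq.heappop(pq)
        if d > dist.get(node, float("inf")):
            continue                              # stale queue entry
        if hop is not None:
            first_hop[node] = hop
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # neighbors of the source become their own first hop
                heapq.heappush(pq, (nd, nbr, hop if hop is not None else nbr))
    return dist, first_hop

graph = {"A": {"B": 1, "C": 4}, "B": {"C": 1}, "C": {}}
dist, first_hop = dijkstra(graph, "A")
print(dist)        # {'A': 0, 'B': 1, 'C': 2}: C is cheaper via B
print(first_hop)   # {'B': 'B', 'C': 'B'}
```

Only the first-hop table needs to be installed for forwarding; the rest of the computed tree is discarded.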
Both link-state and distance-vector approaches have been used for interior
router protocols. Neither approach is effective for an exterior router protocol.
In a distance-vector routing protocol, each router advertises to its neighbors a
vector listing each network it can reach, together with a distance metric associated
with the path to that network. Each router builds up a routing database on the basis
of these neighbor updates but does not know the identity of intermediate routers
and networks on any particular path. There are two problems with this approach for
an exterior router protocol:
1. This distance-vector protocol assumes that all routers share a common distance metric with which to judge router preferences. This may not be the case
among different ASs. If different routers attach different meanings to a given
metric, it may not be possible to create stable, loop-free routes.
2. A given AS may have different priorities from other ASs and may have
restrictions that prohibit the use of certain other ASs. A distance-vector algorithm gives no information about the ASs that will be visited along a route.
In a link-state routing protocol, each router advertises its link metrics to all
other routers. Each router builds up a picture of the complete topology of the configuration and then performs a routing calculation. This approach also has problems
if used in an exterior router protocol:
1. Different ASs may use different metrics and have different restrictions.
Although the link-state protocol does allow a router to build up a picture of
the entire topology, the metrics used may vary from one AS to another, making it impossible to perform a consistent routing algorithm.
2. The flooding of link state information to all routers implementing an exterior
router protocol across multiple ASs may be unmanageable.
An alternative, known as path-vector routing, is to dispense with routing metrics and simply provide information about which networks can be reached by a given
router and the ASs that must be crossed to get there. The approach differs from a
distance-vector algorithm in two respects: First, the path-vector approach does not
include a distance or cost estimate. Second, each block of routing information lists all
of the ASs visited in order to reach the destination network by this route.
Because a path vector lists the ASs that a datagram must traverse if it follows
this route, the path information enables a router to perform policy routing. That is, a
router may decide to avoid a particular path in order to avoid transiting a particular
AS. For example, information that is confidential may be limited to certain kinds of
ASs. Or a router may have information about the performance or quality of the portion of the internet that is included in an AS that leads the router to avoid that AS.
Examples of performance or quality metrics include link speed, capacity, tendency
to become congested, and overall quality of operation. Another criterion that could
be used is minimizing the number of transit ASs.

Table 19.2 BGP-4 Messages
Open: Used to open a neighbor relationship with another router.
Update: Used to (1) transmit information about a single route and/or (2) list multiple routes to be withdrawn.
Keepalive: Used to (1) acknowledge an Open message and (2) periodically confirm the neighbor relationship.
Notification: Sent when an error condition is detected.
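The policy routing that a path vector enables reduces to screening each advertised route against the list of ASs it traverses. The AS names and the avoid list below are hypothetical:

```python
def accept_route(my_as, as_path, avoid_ases=()):
    """Path-vector policy check on an advertised route.
    Rejects routes that would loop through our own AS, and routes that
    transit any AS on a local (hypothetical) avoid list."""
    if my_as in as_path:
        return False        # we already appear on the path: a loop
    if any(a in as_path for a in avoid_ases):
        return False        # local policy: refuse to transit these ASs
    return True

print(accept_route("AS3", ["AS2", "AS1"]))               # True
print(accept_route("AS2", ["AS2", "AS1"]))               # False (loop)
print(accept_route("AS3", ["AS2", "AS1"], ["AS1"]))      # False (policy)
```

Note that no cost metric appears anywhere: the decision rests entirely on which ASs the route visits, which is what distinguishes path-vector from distance-vector routing.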
Border Gateway Protocol
The Border Gateway Protocol (BGP) was developed for use in conjunction with
internets that employ the TCP/IP suite, although the concepts are applicable to any
internet. BGP has become the preferred exterior router protocol for the Internet.
Functions BGP was designed to allow routers, called gateways in the standard, in
different autonomous systems (ASs) to cooperate in the exchange of routing information. The protocol operates in terms of messages, which are sent over TCP connections. The repertoire of messages is summarized in Table 19.2. The current
version of BGP is known as BGP-4 (RFC 1771).
Three functional procedures are involved in BGP:
• Neighbor acquisition
• Neighbor reachability
• Network reachability
Two routers are considered to be neighbors if they are attached to the same
network. If the two routers are in different autonomous systems, they may wish to
exchange routing information. For this purpose, it is necessary first to perform
neighbor acquisition. In essence, neighbor acquisition occurs when two neighboring
routers in different autonomous systems agree to exchange routing information regularly. A formal acquisition procedure is needed because one of the routers may not
wish to participate. For example, the router may be overburdened and does not want
to be responsible for traffic coming in from outside the system. In the neighbor acquisition process, one router sends a request message to the other, which may either
accept or refuse the offer. The protocol does not address the issue of how one router
knows the address or even the existence of another router, nor how it decides that it
needs to exchange routing information with that particular router. These issues must
be dealt with at configuration time or by active intervention of a network manager.
To perform neighbor acquisition, two routers send Open messages to each
other after a TCP connection is established. If each router accepts the request, it
returns a Keepalive message in response.
Once a neighbor relationship is established, the neighbor reachability procedure is used to maintain the relationship. Each partner needs to be assured that the
other partner still exists and is still engaged in the neighbor relationship. For this
purpose, the two routers periodically issue Keepalive messages to each other.
The final procedure specified by BGP is network reachability. Each router
maintains a database of the networks that it can reach and the preferred route for
reaching each network. When a change is made to this database, the router issues an
Update message that is broadcast to all other routers implementing BGP. Because
the Update message is broadcast, all BGP routers can build up and maintain their
routing information.
BGP Messages Figure 19.6 illustrates the formats of all of the BGP messages.
Each message begins with a 19-octet header containing three fields, as indicated by
the shaded portion of each message in the figure:
[Figure 19.6: BGP Message Formats. (a) Open message: My Autonomous System, Hold Time, BGP Identifier, Opt Parameter Length, Optional Parameters. (b) Update message: Unfeasible Routes Length, Withdrawn Routes, Total Path Attributes Length, Path Attributes, Network Layer Reachability Information. (c) Keepalive message: header only. (d) Notification message: Error Code, Error Subcode]
• Marker: Reserved for authentication. The sender may insert a value in this
field that would be used as part of an authentication mechanism to enable the
recipient to verify the identity of the sender.
• Length: Length of message in octets.
• Type: Type of message: Open, Update, Notification, Keepalive.
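The 19-octet header just described can be sketched with the standard type codes (Open = 1, Update = 2, Notification = 3, Keepalive = 4, per the BGP specification); an all-ones marker corresponds to no authentication in use:

```python
import struct

def bgp_header(msg_type: int, body: bytes = b"") -> bytes:
    """Build a BGP message: 16-octet marker, 2-octet total length,
    1-octet type, then the type-specific body."""
    marker = b"\xff" * 16            # all ones: no authentication
    length = 19 + len(body)          # length covers the header itself
    return marker + struct.pack("!HB", length, msg_type) + body

# A Keepalive consists of the header alone:
keepalive = bgp_header(4)
print(len(keepalive))        # 19
print(keepalive[18])         # 4 (the type octet)
```

Packing the length in network byte order (`!H`) matters: both peers must parse the same two octets to delimit messages on the shared TCP connection.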
To acquire a neighbor, a router first opens a TCP connection to the neighbor
router of interest. It then sends an Open message. This message identifies the AS to
which the sender belongs and provides the IP address of the router. It also includes
a Hold Time parameter, which indicates the number of seconds that the sender proposes for the value of the Hold Timer. If the recipient is prepared to open a neighbor relationship, it calculates a value of Hold Timer that is the minimum of its Hold
Time and the Hold Time in the Open message. This calculated value is the maximum
number of seconds that may elapse between the receipt of successive Keepalive
and/or Update messages by the sender.
The Keepalive message consists simply of the header. Each router issues these
messages to each of its peers often enough to prevent the Hold Timer from expiring.
The Update message communicates two types of information:
• Information about a single route through the internet. This information is
available to be added to the database of any recipient router.
• A list of routes previously advertised by this router that are being withdrawn.
An Update message may contain one or both types of information. Information about a single route through the network involves three fields: the Network
Layer Reachability Information (NLRI) field, the Total Path Attributes Length
field, and the Path Attributes field. The NLRI field consists of a list of identifiers of
networks that can be reached by this route. Each network is identified by its IP
address, which is actually a portion of a full IP address. Recall that an IP address is a
32-bit quantity of the form {network, host}. The left-hand or prefix portion of this
quantity identifies a particular network.
The Path Attributes field contains a list of attributes that apply to this particular route. The following are the defined attributes:
• Origin: Indicates whether this information was generated by an interior router
protocol (e.g., OSPF) or an exterior router protocol (in particular, BGP).
• AS_Path: A list of the ASs that are traversed for this route.
• Next_Hop: The IP address of the border router that should be used as the next
hop to the destinations listed in the NLRI field.
• Multi_Exit_Disc: Used to communicate some information about routes internal to an AS. This is described later in this section.
• Local_Pref: Used by a router to inform other routers within the same AS of its
degree of preference for a particular route. It has no significance to routers in
other ASs.
• Atomic_Aggregate, Aggregator: These two fields implement the concept of
route aggregation. In essence, an internet and its corresponding address space
can be organized hierarchically (i.e., as a tree). In this case, network addresses
are structured in two or more parts. All of the networks of a given subtree
share a common partial internet address. Using this common partial address,
the amount of information that must be communicated in NLRI can be significantly reduced.
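The route aggregation that Atomic_Aggregate and Aggregator support can be illustrated with Python's standard ipaddress module; the prefixes below are hypothetical:

```python
import ipaddress

# Four /24 networks in one subtree of the address hierarchy...
nets = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(4)]

# ...share a common partial address and collapse to a single /22,
# shrinking the NLRI from four entries to one.
summary = list(ipaddress.collapse_addresses(nets))
print(summary)   # [IPv4Network('192.168.0.0/22')]
```

A border router advertising the aggregate speaks for the whole subtree, which is exactly the reduction in NLRI the text describes.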
The AS_Path attribute actually serves two purposes. Because it lists the ASs that
a datagram must traverse if it follows this route, the AS_Path information enables a
router to implement routing policies. That is, a router may decide to avoid a particular
path to avoid transiting a particular AS. For example, information that is confidential
may be limited to certain kinds of ASs. Or a router may have information about the
performance or quality of the portion of the internet that is included in an AS that
leads the router to avoid that AS. Examples of performance or quality metrics include
link speed, capacity, tendency to become congested, and overall quality of operation.
Another criterion that could be used is minimizing the number of transit ASs.
The reader may wonder about the purpose of the Next_Hop attribute. The
requesting router will necessarily want to know which networks are reachable via
the responding router, but why provide information about other routers? This is
best explained with reference to Figure 19.5. In this example, router R1 in
autonomous system 1 and router R5 in autonomous system 2 implement BGP and
acquire a neighbor relationship. R1 issues Update messages to R5, indicating which
networks it can reach and the distances (network hops) involved. R1 also provides
the same information on behalf of R2. That is, R1 tells R5 what networks are reachable via R2. In this example, R2 does not implement BGP. Typically, most of the
routers in an autonomous system will not implement BGP. Only a few routers will
be assigned responsibility for communicating with routers in other autonomous systems. A final point: R1 is in possession of the necessary information about R2,
because R1 and R2 share an interior router protocol (IRP).
The second type of update information is the withdrawal of one or more routes.
In this case, the route is identified by the IP address of the destination network.
Finally, the Notification Message is sent when an error condition is detected.
The following errors may be reported:
• Message header error: Includes authentication and syntax errors.
• Open message error: Includes syntax errors and options not recognized in an
Open message. This message can also be used to indicate that a proposed Hold
Time in an Open message is unacceptable.
• Update message error: Includes syntax and validity errors in an Update message.
• Hold timer expired: If the sending router has not received successive
Keepalive and/or Update and/or Notification messages within the Hold Time
period, then this error is communicated and the connection is closed.
• Finite state machine error: Includes any procedural error.
• Cease: Used by a router to close a connection with another router in the
absence of any other error.
BGP Routing Information Exchange The essence of BGP is the exchange
of routing information among participating routers in multiple ASs. This process can
be quite complex. In what follows, we provide a simplified overview.
Let us consider router R1 in autonomous system 1 (AS1), in Figure 19.5. To
begin, a router that implements BGP will also implement an internal routing protocol such as OSPF. Using OSPF, R1 can exchange routing information with other
routers within AS1 and build up a picture of the topology of the networks and
routers in AS1 and construct a routing table. Next, R1 can issue an Update message
to R5 in AS2. The Update message could include the following:
• AS_Path: The identity of AS1
• Next_Hop: The IP address of R1
• NLRI: A list of all of the networks in AS1
This message informs R5 that all of the networks listed in NLRI are reachable
via R1 and that the only autonomous system traversed is AS1.
Suppose now that R5 also has a neighbor relationship with another router
in another autonomous system, say R9 in AS3. R5 will forward the information
just received from R1 to R9 in a new Update message. This message includes the following:
• AS_Path: The list of identifiers {AS2, AS1}
• Next_Hop: The IP address of R5
• NLRI: A list of all of the networks in AS1
This message informs R9 that all of the networks listed in NLRI are reachable
via R5 and that the autonomous systems traversed are AS2 and AS1. R9 must now
decide if this is its preferred route to the networks listed. It may have knowledge of
an alternate route to some or all of these networks that it prefers for reasons of performance or some other policy metric. If R9 decides that the route provided in R5’s
update message is preferable, then R9 incorporates that routing information into its
routing database and forwards this new routing information to other neighbors. This
new message will include an AS_Path field of {AS3, AS2, AS1}.
In this fashion, routing update information is propagated through the larger
internet, consisting of a number of interconnected autonomous systems. The
AS_Path field is used to assure that such messages do not circulate indefinitely: If
an Update message is received by a router in an AS that is included in the
AS_Path field, that router will not forward the update information to other routers.
Routers within the same AS, called internal neighbors, may exchange BGP
information. In this case, the sending router does not add the identifier of the common AS to the AS_Path field. When a router has selected a preferred route to an
external destination, it transmits this route to all of its internal neighbors. Each of
these routers then decides if the new route is preferred, in which case the new route
is added to its database and a new Update message goes out.
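The loop-prevention and path-prepending behavior described above can be sketched in Python. The function name and message layout are illustrative, not from any BGP implementation; a real router also applies policy and attribute rules beyond this check.

```python
def process_update(my_as, as_path, nlri):
    """Sketch of BGP AS_Path handling: if our own AS already appears in the
    AS_Path of a received Update, the route is rejected rather than propagated;
    otherwise we prepend our AS and forward the reachability information."""
    if my_as in as_path:
        return None  # loop detected: do not use or forward this route
    return {"AS_Path": [my_as] + as_path, "NLRI": nlri}

# R5 in AS2 receives R1's update listing AS1's networks:
fwd = process_update("AS2", ["AS1"], ["net-1.1", "net-1.2"])
# R9 in AS3 forwards the information onward:
fwd2 = process_update("AS3", fwd["AS_Path"], fwd["NLRI"])
# A router back in AS1 sees itself in the path and rejects it, breaking the loop:
rejected = process_update("AS1", fwd2["AS_Path"], fwd2["NLRI"])
```

Note that the AS_Path grows by one identifier at each AS boundary, which is exactly what allows the receiving router to detect a cycle.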
When there are multiple entry points into an AS that are available to a border router in another AS, the Multi_Exit_Disc attribute may be used to choose
among them. This attribute contains a number that reflects some internal metric
for reaching destinations within an AS. For example, suppose in Figure 19.5 that
both R1 and R2 implement BGP and both have a neighbor relationship with R5.
Each provides an Update message to R5 for network 1.3 that includes a routing
metric used internal to AS1, such as a routing metric associated with the OSPF
internal router protocol. R5 could then use these two metrics as the basis for choosing between the two routes.
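The Multi_Exit_Disc comparison amounts to picking the advertised entry point with the lowest internal metric. A minimal sketch, with hypothetical router names and metric values:

```python
def choose_entry(updates):
    """Pick the neighboring border router advertising the lowest
    Multi_Exit_Disc value. 'updates' maps a peer name to the internal
    metric it advertised for the destination network."""
    return min(updates, key=updates.get)

# R5 compares the internal metrics advertised by R1 and R2 for network 1.3:
best = choose_entry({"R1": 20, "R2": 35})
# R1 offers the cheaper internal path, so R5 would route via R1.
```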
Open Shortest Path First (OSPF) Protocol
The OSPF protocol (RFC 2328) is now widely used as the interior router protocol in
TCP/IP networks. OSPF computes a route through the internet that incurs the least
cost based on a user-configurable metric of cost. The user can configure the cost to
express a function of delay, data rate, dollar cost, or other factors. OSPF is able to
equalize loads over multiple equal-cost paths.
Each router maintains a database that reflects the known topology of the
autonomous system of which it is a part. The topology is expressed as a directed
graph. The graph consists of the following:
• Vertices, or nodes, of two types:
1. router
2. network, which is in turn of two types
a. transit, if it can carry data that neither originate nor terminate on an end
system attached to this network
b. stub, if it is not a transit network
• Edges of two types:
1. graph edges that connect two router vertices when the corresponding
routers are connected to each other by a direct point-to-point link
2. graph edges that connect a router vertex to a network vertex when the
router is directly connected to the network
Figure 19.7, based on one in RFC 2328, shows an example of an autonomous
system, and Figure 19.8 is the resulting directed graph. The mapping is straightforward:
• Two routers joined by a point-to-point link are represented in the graph as being
directly connected by a pair of edges, one in each direction (e.g., routers 6 and 10).
• When multiple routers are attached to a network (such as a LAN or packet-switching network), the directed graph shows all routers bidirectionally connected to the network vertex (e.g., routers 1, 2, 3, and 4 all connect to
network 3).
• If a single router is attached to a network, the network will appear in the graph
as a stub connection (e.g., network 7).
• An end system, called a host, can be directly connected to a router, in which
case it is depicted in the corresponding graph (e.g., host 1).
• If a router is connected to other autonomous systems, then the path cost to
each network in the other system must be obtained by some exterior router
protocol (ERP). Each such network is represented on the graph by a stub
and an edge to the router with the known path cost (e.g., networks 12
through 15).
Figure 19.7 A Sample Autonomous System
A cost is associated with the output side of each router interface. This cost is
configurable by the system administrator. Arcs on the graph are labeled with the
cost of the corresponding router output interface. Arcs having no labeled cost have
a cost of 0. Note that arcs leading from networks to routers always have a cost of 0.
A database corresponding to the directed graph is maintained by each router.
It is pieced together from link state messages from other routers in the internet.
Using Dijkstra’s algorithm (see Section 12.3), a router calculates the least-cost path
to all destination networks. The result for router 6 of Figure 19.7 is shown as a tree in
Figure 19.9, with R6 as the root of the tree. The tree gives the entire route to any destination network or host. However, only the next hop to the destination is used in the
forwarding process. The resulting routing table for router 6 is shown in Table 19.3.
The table includes entries for routers advertising external routes (routers 5 and 7).
For external networks whose identity is known, entries are also provided.
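The least-cost calculation just described can be sketched with a standard Dijkstra implementation over a cost-labeled graph. The toy topology below is illustrative only (it is not the graph of Figure 19.7), but it preserves the OSPF convention that arcs from networks to routers have cost 0.

```python
import heapq

def dijkstra(graph, source):
    """Least-cost paths from 'source', as OSPF computes over its link state
    database. 'graph' maps a node to a dict of {neighbor: edge_cost}.
    Returns (distances, predecessors); the predecessor map is the SPF tree."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

graph = {
    "R6":  {"R10": 7, "R5": 6},
    "R10": {"R6": 7, "N6": 1},
    "N6":  {"R10": 0, "R7": 0},   # network-to-router arcs cost 0
    "R7":  {"N6": 1},
    "R5":  {"R6": 6},
}
dist, prev = dijkstra(graph, "R6")
```

Although `prev` encodes the entire route to every destination, only the first hop along each path is installed in the forwarding table, as noted above.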
Figure 19.8 Directed Graph of Autonomous System of Figure 19.7
To meet the requirement for QoS-based service, the IETF is developing a suite of
standards under the general umbrella of the Integrated Services Architecture
(ISA). ISA, intended to provide QoS transport over IP-based internets, is defined in
overall terms in RFC 1633, while a number of other documents are being developed
to fill in the details. Already, a number of vendors have implemented portions of the
ISA in routers and end-system software.
This section provides an overview of ISA.
Internet Traffic
Traffic on a network or internet can be divided into two broad categories: elastic
and inelastic. A consideration of their differing requirements clarifies the need for
an enhanced internet architecture.
Elastic Traffic Elastic traffic is that which can adjust, over wide ranges, to changes
in delay and throughput across an internet and still meet the needs of its applications. This is the traditional type of traffic supported on TCP/IP-based internets and
Figure 19.9 The SPF Tree for Router R6
is the type of traffic for which internets were designed. Applications that generate
such traffic typically use TCP or UDP as a transport protocol. In the case of UDP,
the application will use as much capacity as is available up to the rate that the application generates data. In the case of TCP, the application will use as much capacity
as is available up to the maximum rate that the end-to-end receiver can accept data.
Also with TCP, traffic on individual connections adjusts to congestion by reducing
the rate at which data are presented to the network; this is described in Chapter 20.
Applications that can be classified as elastic include the common applications
that operate over TCP or UDP, including file transfer (FTP), electronic mail
(SMTP), remote login (TELNET), network management (SNMP), and Web access
(HTTP). However, there are differences among the requirements of these applications. For example,
• E-mail is generally insensitive to changes in delay.
• When file transfer is done interactively, as it frequently is, the user expects the
delay to be proportional to the file size and so is sensitive to changes in throughput.
• With network management, delay is generally not a serious concern. However, if failures in an internet are the cause of congestion, then the need for SNMP messages to get through with minimum delay increases with increased congestion.
• Interactive applications, such as remote logon and Web access, are sensitive to delay.

Table 19.3 Routing Table for R6
It is important to realize that it is not per-packet delay that is the quantity of
interest. As noted in [CLAR95], observation of real delays across the Internet suggests that wide variations in delay do not occur. Because of the congestion control
mechanisms in TCP, when congestion develops, delays only increase modestly before
the arrival rate from the various TCP connections slows down. Instead, the QoS perceived by the user relates to the total elapsed time to transfer an element of the current application. For an interactive TELNET-based application, the element may be
a single keystroke or single line. For a Web access, the element is a Web page, which
could be as little as a few kilobytes or could be substantially larger for an image-rich
page. For a scientific application, the element could be many megabytes of data.
For very small elements, the total elapsed time is dominated by the delay time
across the internet. However, for larger elements, the total elapsed time is dictated by
the sliding-window performance of TCP and is therefore dominated by the throughput
achieved over the TCP connection. Thus, for large transfers, the transfer time is proportional to the size of the file and the degree to which the source slows due to congestion.
It should be clear that even if we confine our attention to elastic traffic, a QoS-based internet service could be of benefit. Without such a service, routers are dealing
evenhandedly with arriving IP packets, with no concern for the type of application
and whether a particular packet is part of a large transfer element or a small one.
Under such circumstances, and if congestion develops, it is unlikely that resources
will be allocated in such a way as to meet the needs of all applications fairly. When
inelastic traffic is added to the mix, the results are even more unsatisfactory.
Inelastic Traffic Inelastic traffic does not easily adapt, if at all, to changes in delay
and throughput across an internet. The prime example is real-time traffic. The
requirements for inelastic traffic may include the following:
• Throughput: A minimum throughput value may be required. Unlike most elastic traffic, which can continue to deliver data with perhaps degraded service,
many inelastic applications absolutely require a given minimum throughput.
• Delay: An example of a delay-sensitive application is stock trading; someone
who consistently receives later service will consistently act later, and with
greater disadvantage.
• Jitter: The magnitude of delay variation, called jitter, is a critical factor in realtime applications. Because of the variable delay imposed by the Internet, the
interarrival times between packets are not maintained at a fixed interval at the
destination. To compensate for this, the incoming packets are buffered, delayed
sufficiently to compensate for the jitter, and then released at a constant rate to
the software that is expecting a steady real-time stream. The larger the allowable
delay variation, the longer the real delay in delivering the data and the greater
the size of the delay buffer required at receivers. Real-time interactive applications, such as teleconferencing, may require a reasonable upper bound on jitter.
• Packet loss: Real-time applications vary in the amount of packet loss, if any,
that they can sustain.
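The jitter-compensation buffering described above can be sketched as a fixed playout delay at the receiver. The function and timing values are illustrative assumptions, not a real media-receiver API:

```python
def playout_times(arrivals, playout_delay):
    """Sketch of a receiver-side delay buffer: each packet is held until a
    fixed offset after its nominal send time, smoothing network jitter into
    a constant release rate. Packets arriving after their playout instant
    miss their slot and are treated as lost. Times are in milliseconds."""
    released = []
    for seq, (send_t, arrive_t) in enumerate(arrivals):
        play_t = send_t + playout_delay
        if arrive_t <= play_t:
            released.append((seq, play_t))  # buffered until its playout instant
        # else: too late for its slot; counted as a loss
    return released

# Packets sent every 20 ms, arriving with variable network delay:
pkts = [(0, 35), (20, 72), (40, 55), (60, 150)]
out = playout_times(pkts, playout_delay=60)
# The first three packets make their 60/80/100 ms slots; the last,
# delayed 90 ms in the network, misses its 120 ms slot.
```

This illustrates the tradeoff stated above: a larger `playout_delay` loses fewer late packets but increases both the real delivery delay and the required buffer size.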
These requirements are difficult to meet in an environment with variable queuing delays and congestion losses. Accordingly, inelastic traffic introduces two new
requirements into the internet architecture. First, some means is needed to give preferential treatment to applications with more demanding requirements. Applications
need to be able to state their requirements, either ahead of time in some sort of service
request function, or on the fly, by means of fields in the IP packet header. The former
approach provides more flexibility in stating requirements, and it enables the network
to anticipate demands and deny new requests if the required resources are unavailable. This approach implies the use of some sort of resource reservation protocol.
A second requirement in supporting inelastic traffic in an internet architecture
is that elastic traffic must still be supported. Inelastic applications typically do not
back off and reduce demand in the face of congestion, in contrast to TCP-based
applications. Therefore, in times of congestion, inelastic traffic will continue to supply a high load, and elastic traffic will be crowded off the internet. A reservation
protocol can help control this situation by denying service requests that would leave
too few resources available to handle current elastic traffic.
ISA Approach
The purpose of ISA is to enable the provision of QoS support over IP-based internets. The central design issue for ISA is how to share the available capacity in times
of congestion.
For an IP-based internet that provides only a best-effort service, the tools for
controlling congestion and providing service are limited. In essence, routers have
two mechanisms to work with:
• Routing algorithm: Most routing protocols in use in internets allow routes to
be selected to minimize delay. Routers exchange information to get a picture
of the delays throughout the internet. Minimum-delay routing helps to balance loads, thus decreasing local congestion, and helps to reduce delays seen
by individual TCP connections.
• Packet discard: When a router’s buffer overflows, it discards packets. Typically,
the most recent packet is discarded. The effect of lost packets on a TCP connection is that the sending TCP entity backs off and reduces its load, thus helping to alleviate internet congestion.
These tools have worked reasonably well. However, as the discussion in the
preceding subsection shows, such techniques are inadequate for the variety of traffic
now coming to internets.
ISA is an overall architecture within which a number of enhancements to the traditional best-effort mechanisms are being developed. In ISA, each IP packet can be
associated with a flow. RFC 1633 defines a flow as a distinguishable stream of related IP
packets that results from a single user activity and requires the same QoS. For example,
a flow might consist of one transport connection or one video stream distinguishable by
the ISA. A flow differs from a TCP connection in two respects: A flow is unidirectional,
and there can be more than one recipient of a flow (multicast). Typically, an IP packet is
identified as a member of a flow on the basis of source and destination IP addresses and
port numbers, and protocol type. The flow identifier in the IPv6 header is not necessarily equivalent to an ISA flow, but in future the IPv6 flow identifier could be used in ISA.
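Flow identification by this conventional 5-tuple can be sketched directly; the dictionary layout below is a stand-in for real IP/transport header parsing:

```python
def flow_id(packet):
    """Classify a packet into an ISA flow by source/destination address,
    source/destination port, and protocol type (the 5-tuple)."""
    return (packet["src"], packet["dst"],
            packet["sport"], packet["dport"], packet["proto"])

p1 = {"src": "10.0.0.1", "dst": "10.0.0.9",
      "sport": 5004, "dport": 6000, "proto": "UDP"}
p2 = dict(p1)              # same stream -> same flow
p3 = dict(p1, sport=5006)  # different source port -> different flow
```

A classifier keyed this way maps each packet to the per-flow state (reservation, queue) that the forwarding functions consult.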
ISA makes use of the following functions to manage congestion and provide
QoS transport:
• Admission control: For QoS transport (other than default best-effort transport),
ISA requires that a reservation be made for a new flow. If the routers collectively
determine that there are insufficient resources to guarantee the requested QoS,
then the flow is not admitted. The protocol RSVP is used to make reservations.
• Routing algorithm: The routing decision may be based on a variety of QoS
parameters, not just minimum delay. For example, the routing protocol OSPF,
discussed in Section 19.2, can select routes based on QoS.
• Queuing discipline: A vital element of the ISA is an effective queuing policy
that takes into account the differing requirements of different flows.
• Discard policy: A discard policy determines which packets to drop when a
buffer is full and new packets arrive. A discard policy can be an important element in managing congestion and meeting QoS guarantees.
ISA Components
Figure 19.10 is a general depiction of the implementation architecture for ISA
within a router. Below the thick horizontal line are the forwarding functions of the
router; these are executed for each packet and therefore must be highly optimized.
Figure 19.10 Integrated Services Architecture Implemented in Router
The remaining functions, above the line, are background functions that create data
structures used by the forwarding functions.
The principal background functions are as follows:
• Reservation protocol: This protocol is to reserve resources for a new flow at a
given level of QoS. It is used among routers and between routers and end systems. The reservation protocol is responsible for maintaining flow-specific
state information at the end systems and at the routers along the path of the
flow. RSVP is used for this purpose. The reservation protocol updates the traffic control database used by the packet scheduler to determine the service
provided for packets of each flow.
• Admission control: When a new flow is requested, the reservation protocol
invokes the admission control function. This function determines if sufficient
resources are available for this flow at the requested QoS. This determination
is based on the current level of commitment to other reservations and/or on
the current load on the network.
• Management agent: A network management agent is able to modify the traffic control database and to direct the admission control module in order to set
admission control policies.
• Routing protocol: The routing protocol is responsible for maintaining a routing database that gives the next hop to be taken for each destination address
and each flow.
These background functions support the main task of the router, which is the
forwarding of packets. The two principal functional areas that accomplish forwarding are the following:
• Classifier and route selection: For the purposes of forwarding and traffic control, incoming packets must be mapped into classes. A class may correspond to
a single flow or to a set of flows with the same QoS requirements. For example,
the packets of all video flows or the packets of all flows attributable to a particular organization may be treated identically for purposes of resource allocation and queuing discipline. The selection of class is based on fields in the IP
header. Based on the packet’s class and its destination IP address, this function
determines the next-hop address for this packet.
• Packet scheduler: This function manages one or more queues for each output
port. It determines the order in which queued packets are transmitted and the
selection of packets for discard, if necessary. Decisions are made based on a
packet’s class, the contents of the traffic control database, and current and past
activity on this outgoing port. Part of the packet scheduler’s task is that of
policing, which is the function of determining whether the packet traffic in a
given flow exceeds the requested capacity and, if so, deciding how to treat the
excess packets.
ISA Services
ISA service for a flow of packets is defined on two levels. First, a number of general
categories of service are provided, each of which provides a certain general type of
service guarantees. Second, within each category, the service for a particular flow is
specified by the values of certain parameters; together, these values are referred to
as a traffic specification (TSpec). Currently, three categories of service are defined:
• Guaranteed
• Controlled load
• Best effort
An application can request a reservation for a flow for a guaranteed or controlled load QoS, with a TSpec that defines the exact amount of service required. If
the reservation is accepted, then the TSpec is part of the contract between the data
flow and the service. The service agrees to provide the requested QoS as long as the
flow’s data traffic continues to be described accurately by the TSpec. Packets that
are not part of a reserved flow are by default given a best-effort delivery service.
Before looking at the ISA service categories, one general concept should be
defined: the token bucket traffic specification. This is a way of characterizing traffic
that has three advantages in the context of ISA:
1. Many traffic sources can be defined easily and accurately by a token bucket
2. The token bucket scheme provides a concise description of the load to be imposed
by a flow, enabling the service to determine easily the resource requirement.
3. The token bucket scheme provides the input parameters to a policing function.
A token bucket traffic specification consists of two parameters: a token
replenishment rate R and a bucket size B. The token rate R specifies the continually
sustainable data rate; that is, over a relatively long period of time, the average data
rate to be supported for this flow is R. The bucket size B specifies the amount by
which the data rate can exceed R for short periods of time. The exact condition is as
follows: During any time period T, the amount of data sent cannot exceed RT + B.
Figure 19.11 Token Bucket Scheme
Figure 19.11 illustrates this scheme and explains the use of the term bucket.
The bucket represents a counter that indicates the allowable number of octets of IP
data that can be sent at any time. The bucket fills with octet tokens at the rate of R
(i.e., the counter is incremented R times per second), up to the bucket capacity (up
to the maximum counter value). IP packets arrive and are queued for processing.
An IP packet may be processed if there are sufficient octet tokens to match the IP
data size. If so, the packet is processed and the bucket is drained of the corresponding number of tokens. If a packet arrives and there are insufficient tokens available,
then the packet exceeds the TSpec for this flow. The treatment for such packets is
not specified in the ISA documents; common actions are relegating the packet to
best-effort service, discarding the packet, or marking the packet in such a way that it
may be discarded in future.
Over the long run, the rate of IP data allowed by the token bucket is R. However, if there is an idle or relatively slow period, the bucket capacity builds up, so
that at most an additional B octets above the stated rate can be accepted. Thus, B is
a measure of the degree of burstiness of the data flow that is allowed.
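The token bucket behavior can be captured in a few lines. This is a minimal sketch of the counter described above, not an RSVP TSpec API; the rate and size values are illustrative.

```python
class TokenBucket:
    """Minimal token bucket: tokens accumulate at R octets/s up to capacity B,
    and a packet conforms only if enough tokens are available. Over any
    period T, at most R*T + B octets are accepted."""
    def __init__(self, rate, bucket_size):
        self.rate = rate              # R: token replenishment rate (octets/s)
        self.capacity = bucket_size   # B: maximum burst size (octets)
        self.tokens = bucket_size     # bucket starts full
        self.last = 0.0

    def conforms(self, now, size):
        # Replenish tokens for the elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True   # packet is within the TSpec
        return False      # excess: demote to best effort, buffer, or discard

tb = TokenBucket(rate=1000, bucket_size=500)  # 1000 octets/s, 500-octet burst
burst_ok = tb.conforms(0.0, 500)   # an initial burst of B octets fits
extra = tb.conforms(0.0, 1)        # bucket is now empty, so this is excess
later = tb.conforms(1.0, 500)      # after 1 s of idling, the bucket refills to B
```

The `False` branch corresponds to the unspecified treatment of excess packets noted above: the policing function only decides conformance, not the action taken.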
Guaranteed Service The key elements of the guaranteed service are as follows:
• The service provides assured capacity, or data rate.
• There is a specified upper bound on the queuing delay through the network.
This must be added to the propagation delay, or latency, to arrive at the bound
on total delay through the network.
• There are no queuing losses. That is, no packets are lost due to buffer overflow;
packets may be lost due to failures in the network or changes in routing paths.
With this service, an application provides a characterization of its expected
traffic profile, and the service determines the end-to-end delay that it can guarantee.
One category of applications for this service is those that need an upper
bound on delay so that a delay buffer can be used for real-time playback of
incoming data, and that do not tolerate packet losses because of the degradation
in the quality of the output. Another example is applications with hard real-time deadlines.
The guaranteed service is the most demanding service provided by ISA.
Because the delay bound is firm, the delay has to be set at a large value to cover rare
cases of long queuing delays.
Controlled Load The key elements of the controlled load service are as follows:
• The service tightly approximates the behavior visible to applications receiving
best-effort service under unloaded conditions.
• There is no specified upper bound on the queuing delay through the network.
However, the service ensures that a very high percentage of the packets do not
experience delays that greatly exceed the minimum transit delay (i.e., the
delay due to propagation time plus router processing time with no queuing delays).
• A very high percentage of transmitted packets will be successfully delivered
(i.e., almost no queuing loss).
As was mentioned, the risk in an internet that provides QoS for real-time
applications is that best-effort traffic is crowded out. This is because best-effort
types of applications employ TCP, which will back off in the face of congestion and
delays. The controlled load service guarantees that the network will set aside sufficient resources so that an application that receives this service will see a network
that responds as if these real-time applications were not present and competing for resources.
The controlled load service is useful for applications that have been referred to as
adaptive real-time applications [CLAR92]. Such applications do not require an a
priori upper bound on the delay through the network. Rather, the receiver measures
the jitter experienced by incoming packets and sets the playback point to the minimum delay that still produces a sufficiently low loss rate (e.g., video can be adaptive
by dropping a frame or delaying the output stream slightly; voice can be adaptive by
adjusting silent periods).
Queuing Discipline
An important component of an ISA implementation is the queuing discipline used
at the routers. Routers traditionally have used a first-in-first-out (FIFO) queuing
discipline at each output port. A single queue is maintained at each output port.
When a new packet arrives and is routed to an output port, it is placed at the end of
the queue. As long as the queue is not empty, the router transmits packets from the
queue, taking the oldest remaining packet next.
There are several drawbacks to the FIFO queuing discipline:
• No special treatment is given to packets from flows that are of higher priority
or are more delay sensitive. If a number of packets from different flows are
ready to be forwarded, they are handled strictly in FIFO order.
• If a number of smaller packets are queued behind a long packet, then FIFO
queuing results in a larger average delay per packet than if the shorter packets
were transmitted before the longer packet. In general, flows of larger packets
get better service.
• A greedy TCP connection can crowd out more altruistic connections. If congestion occurs and one TCP connection fails to back off, other connections
along the same path segment must back off more than they would otherwise
have to do.
To overcome the drawbacks of FIFO queuing, some sort of fair queuing scheme is
used, in which a router maintains multiple queues at each output port (Figure
19.12). With simple fair queuing, each incoming packet is placed in the queue for its
flow. The queues are serviced in round-robin fashion, taking one packet from each
nonempty queue in turn. Empty queues are skipped over. This scheme is fair in that
each busy flow gets to send exactly one packet per cycle. Further, this is a form of
load balancing among the various flows. There is no advantage in being greedy. A
greedy flow finds that its queues become long, increasing its delays, whereas other
flows are unaffected by this behavior.
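The round-robin service just described can be sketched as follows; the flow names and packet labels are illustrative.

```python
from collections import deque

def fair_queue_service(packets_by_flow, rounds):
    """Simple fair queuing sketch: one queue per flow, one packet taken
    from each nonempty queue per cycle, empty queues skipped.
    Returns the resulting transmission order."""
    queues = {flow: deque(pkts) for flow, pkts in packets_by_flow.items()}
    order = []
    for _ in range(rounds):
        for flow in queues:
            if queues[flow]:
                order.append(queues[flow].popleft())
    return order

# A greedy flow (A) queues many packets but still gets only one slot per
# cycle; flows B and C are unaffected by its behavior.
sent = fair_queue_service(
    {"A": ["a1", "a2", "a3", "a4"], "B": ["b1"], "C": ["c1", "c2"]},
    rounds=4)
```

Weighted fair queuing refines this loop by giving some queues more than one service opportunity per cycle, in proportion to their traffic or their requested service.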
A number of vendors have implemented a refinement of fair queuing known
as weighted fair queuing (WFQ). In essence, WFQ takes into account the amount of
traffic through each queue and gives busier queues more capacity without completely shutting out less busy queues. In addition, WFQ can take into account the
amount of service requested by each traffic flow and adjust the queuing discipline accordingly.
Figure 19.12 FIFO and Fair Queuing
Resource ReSerVation Protocol (RSVP)
RFC 2205 defines RSVP, which provides supporting functionality for ISA. This subsection provides an overview.
A key task, perhaps the key task, of an internetwork is to deliver data from a
source to one or more destinations with the desired quality of service (QoS), such as
throughput, delay, delay variance, and so on. This task becomes increasingly difficult
on any internetwork with increasing number of users, data rate of applications, and
use of multicasting. To meet these needs, it is not enough for an internet to react to
congestion. Instead a tool is needed to prevent congestion by allowing applications
to reserve network resources at a given QoS.
Preventive measures can be useful in both unicast and multicast transmission.
For unicast, two applications agree on a specific quality of service for a session and
expect the internetwork to support that quality of service. If the internetwork is
heavily loaded, it may not provide the desired QoS and instead deliver packets at a
reduced QoS. In that case, the applications may have preferred to wait before initiating the session or at least to have been alerted to the potential for reduced QoS. A
way of dealing with this situation is to have the unicast applications reserve resources
in order to meet a given quality of service. Routers along an intended path could then
preallocate resources (queue space, outgoing capacity) to assure the desired QoS. If a
router could not meet the resource reservation because of prior outstanding reservations, then the applications could be informed. The applications may then decide to
try again at a reduced QoS reservation or may decide to try later.
Multicast transmission presents a much more compelling case for implementing resource reservation. A multicast transmission can generate a tremendous
amount of internetwork traffic if either the application is high-volume (e.g., video)
or the group of multicast destinations is large and scattered, or both. What makes
the case for multicast resource reservation is that much of the potential load generated by a multicast source may easily be prevented. This is so for two reasons:
1. Some members of an existing multicast group may not require delivery from a
particular source over some given period of time. For example, there may be
two “channels” (two multicast sources) broadcasting to a particular multicast
group at the same time. A multicast destination may wish to “tune in” to only
one channel at a time.
2. Some members of a group may only be able to handle a portion of the source
transmission. For example, a video source may transmit a video stream that
consists of two components: a basic component that provides a reduced picture quality, and an enhanced component. Some receivers may not have the
processing power to handle the enhanced component or may be connected to
the internetwork through a subnetwork or link that does not have the capacity
for the full signal.
Thus, the use of resource reservation can enable routers to decide ahead of
time if they can meet the requirement to deliver a multicast transmission to all designated multicast receivers and to reserve the appropriate resources if possible.
Internet resource reservation differs from the type of resource reservation that
may be implemented in a connection-oriented network, such as ATM or frame relay.
An internet resource reservation scheme must interact with a dynamic routing strategy that allows the route followed by packets of a given transmission to change.
When the route changes, the resource reservations must be changed. To deal with this
dynamic situation, the concept of soft state is used. A soft state is simply a set of state
information at a router that expires unless regularly refreshed from the entity that
requested the state. If a route for a given transmission changes, then some soft states
will expire and new resource reservations will invoke the appropriate soft states on
the new routers along the route. Thus, the end systems requesting resources must
periodically renew their requests during the course of an application transmission.
Based on these considerations, the specification lists the following characteristics of RSVP:
• Unicast and multicast: RSVP makes reservations for both unicast and multicast transmissions, adapting dynamically to changing group membership as
well as to changing routes, and reserving resources based on the individual
requirements of multicast members.
• Simplex: RSVP makes reservations for unidirectional data flow. Data exchanges
between two end systems require separate reservations in the two directions.
• Receiver-initiated reservation: The receiver of a data flow initiates and maintains the resource reservation for that flow.
• Maintaining soft state in the internet: RSVP maintains a soft state at intermediate routers and leaves the responsibility for maintaining these reservation
states to end users.
• Providing different reservation styles: These allow RSVP users to specify how
reservations for the same multicast group should be aggregated at the intermediate switches. This feature enables a more efficient use of internet resources.
• Transparent operation through non-RSVP routers: Because reservations and
RSVP are independent of routing protocol, there is no fundamental conflict in
a mixed environment in which some routers do not employ RSVP. These
routers will simply use a best-effort delivery technique.
• Support for IPv4 and IPv6: RSVP can exploit the Type-of-Service field in the
IPv4 header and the Flow Label field in the IPv6 header.
The Integrated Services Architecture (ISA) and RSVP are intended to support QoS
capability in the Internet and in private internets. Although ISA in general and
RSVP in particular are useful tools in this regard, these features are relatively complex to deploy. Further, they may not scale well to handle large volumes of traffic
because of the amount of control signaling required to coordinate integrated QoS
offerings and because of the maintenance of state information required at routers.
As the burden on the Internet grows, and as the variety of applications grow,
there is an immediate need to provide differing levels of QoS to different traffic
flows. The differentiated services (DS) architecture (RFC 2475) is designed to
provide a simple, easy-to-implement, low-overhead tool to support a range of network services that are differentiated on the basis of performance.
Several key characteristics of DS contribute to its efficiency and ease of deployment:
• IP packets are labeled for differing QoS treatment using the existing IPv4
(Figure 18.6) or IPv6 (Figure 18.11) DS field. Thus, no change is required to IP.
• A service level agreement (SLA) is established between the service provider
(internet domain) and the customer prior to the use of DS. This avoids the
need to incorporate DS mechanisms in applications. Thus, existing applications need not be modified to use DS.
• DS provides a built-in aggregation mechanism. All traffic with the same DS
octet is treated the same by the network service. For example, multiple voice
connections are not handled individually but in the aggregate. This provides
for good scaling to larger networks and traffic loads.
• DS is implemented in individual routers by queuing and forwarding packets
based on the DS octet. Routers deal with each packet individually and do not
have to save state information on packet flows.
Today, DS is the most widely accepted QoS mechanism in enterprise networks.
Although DS is intended to provide a simple service based on relatively simple mechanisms, the set of RFCs related to DS is relatively complex. Table 19.4 summarizes some of the key terms from these specifications.
The DS type of service is provided within a DS domain, which is defined as a contiguous portion of the Internet over which a consistent set of DS policies are administered. Typically, a DS domain would be under the control of one administrative
entity. The services provided across a DS domain are defined in an SLA, which is a
service contract between a customer and the service provider that specifies the forwarding service that the customer should receive for various classes of packets. A
customer may be a user organization or another DS domain. Once the SLA is established, the customer submits packets with the DS octet marked to indicate the packet
class. The service provider must assure that the customer gets at least the agreed QoS
for each packet class. To provide that QoS, the service provider must configure the
appropriate forwarding policies at each router (based on DS octet value) and must
measure the performance being provided for each class on an ongoing basis.
If a customer submits packets intended for destinations within the DS domain,
then the DS domain is expected to provide the agreed service. If the destination is
beyond the customer’s DS domain, then the DS domain will attempt to forward the
packets through other domains, requesting the most appropriate service to match
the requested service.
A draft DS framework document lists the following detailed performance
parameters that might be included in an SLA:
• Detailed service performance parameters such as expected throughput, drop
probability, latency
Table 19.4 Terminology for Differentiated Services
Behavior Aggregate
A set of packets with the same DS codepoint crossing a link in a particular direction.

Classifier
Selects packets based on the DS field (BA classifier) or on multiple fields within the packet header (MF classifier).

DS Boundary Node
A DS node that connects one DS domain to a node in another domain.

DS Codepoint
A specified value of the 6-bit DSCP portion of the 8-bit DS field in the IP header.

DS Domain
A contiguous (connected) set of nodes, capable of implementing differentiated services, that operate with a common set of service provisioning policies and per-hop behavior definitions.

DS Interior Node
A DS node that is not a DS boundary node.

DS Node
A node that supports differentiated services. Typically, a DS node is a router. A host system that provides differentiated services for applications in the host is also a DS node.

Dropping
The process of discarding packets based on specified rules; also called policing.

Marking
The process of setting the DS codepoint in a packet. Packets may be marked on initiation and may be re-marked by an en route DS node.

Metering
The process of measuring the temporal properties (e.g., rate) of a packet stream selected by a classifier. The instantaneous state of that process may affect marking, shaping, and dropping functions.

Per-Hop Behavior
The externally observable forwarding behavior applied at a node to a behavior aggregate.

Service Level Agreement (SLA)
A service contract between a customer and a service provider that specifies the forwarding service a customer should receive.

Shaping
The process of delaying packets within a packet stream to cause it to conform to some defined traffic profile.

Traffic Conditioning
Control functions performed to enforce rules specified in a TCA, including metering, marking, shaping, and dropping.

Traffic Conditioning Agreement (TCA)
An agreement specifying classifying rules and traffic conditioning rules that are to apply to packets selected by the classifier.
• Constraints on the ingress and egress points at which the service is provided,
indicating the scope of the service
• Traffic profiles that must be adhered to for the requested service to be provided,
such as token bucket parameters
• Disposition of traffic submitted in excess of the specified profile
The framework document also gives some examples of services that might be provided:
1. Traffic offered at service level A will be delivered with low latency.
2. Traffic offered at service level B will be delivered with low loss.
3. Ninety percent of in-profile traffic delivered at service level C will experience no
more than 50 ms latency.
4. Ninety-five percent of in-profile traffic delivered at service level D will be delivered.
5. Traffic offered at service level E will be allotted twice the bandwidth of traffic
delivered at service level F.
Figure 19.13 DS Field: (a) the DS field, carrying the 6-bit differentiated services codepoint, including class selector and drop precedence bits; (b) codepoints for the assured forwarding PHB, organized by class selector (class 4, best service, through class 1) and drop precedence (low, most important, through high, least important); codepoint 101110 denotes expedited forwarding (EF) behavior
6. Traffic with drop precedence X has a higher probability of delivery than traffic
with drop precedence Y.
The first two examples are qualitative and are valid only in comparison to
other traffic, such as default traffic that gets a best-effort service. The next two
examples are quantitative and provide a specific guarantee that can be verified by
measurement on the actual service without comparison to any other services
offered at the same time. The final two examples are a mixture of quantitative and
DS Field
Packets are labeled for service handling by means of the 6-bit DS field in the IPv4
header or the IPv6 header. The value of the DS field, referred to as the DS codepoint, is the label used to classify packets for differentiated services. Figure 19.13a
shows the DS field.
With a 6-bit codepoint, there are in principle 64 different classes of traffic that
could be defined. These 64 codepoints are allocated across three pools of codepoints, as follows:
• Codepoints of the form xxxxx0, where x is either 0 or 1, are reserved for
assignment as standards.
• Codepoints of the form xxxx11 are reserved for experimental or local use.
• Codepoints of the form xxxx01 are also reserved for experimental or local use
but may be allocated for future standards action as needed.
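The three codepoint pools can be distinguished by looking only at the low-order bits, and the DS codepoint itself is the upper six bits of the old TOS octet. A minimal sketch (function names are illustrative):

```python
def dscp_pool(dscp):
    """Classify a 6-bit DS codepoint into one of the three RFC 2474 pools."""
    assert 0 <= dscp < 64
    if dscp & 0b1 == 0:        # xxxxx0: reserved for standards assignments
        return 1
    if dscp & 0b11 == 0b11:    # xxxx11: experimental or local use
        return 2
    return 3                   # xxxx01: exp/local, future standards if needed

def dscp_from_tos(tos_byte):
    """The DS codepoint occupies the upper six bits of the former TOS octet."""
    return (tos_byte >> 2) & 0x3F

print(dscp_pool(0b000000))    # default class -> pool 1
print(dscp_pool(0b101110))    # EF codepoint -> pool 1
print(dscp_pool(0b000011))    # -> pool 2
print(dscp_pool(0b000001))    # -> pool 3
```

Because pool membership depends only on the trailing bits, a router can sort any of the 64 codepoints with two mask operations.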
Within the first pool, several assignments are made in RFC 2474. The codepoint 000000 is the default packet class. The default class is the best-effort forwarding behavior in existing routers. Such packets are forwarded in the order that they
are received as soon as link capacity becomes available. If other higher-priority
packets in other DS classes are available for transmission, these are given preference over best-effort default packets.
Codepoints of the form xxx000 are reserved to provide backward compatibility with the IPv4 precedence service. To explain this requirement, we need to digress
to an explanation of the IPv4 precedence service. The IPv4 type of service (TOS)
field includes two subfields: a 3-bit precedence subfield and a 4-bit TOS subfield.
These subfields serve complementary functions. The TOS subfield provides guidance to the IP entity (in the source or router) on selecting the next hop for this datagram, and the precedence subfield provides guidance about the relative allocation
of router resources for this datagram.
The precedence field is set to indicate the degree of urgency or priority to be
associated with a datagram. If a router supports the precedence subfield, there are
three approaches to responding:
• Route selection: A particular route may be selected if the router has a smaller
queue for that route or if the next hop on that route supports network precedence or priority (e.g., a token ring network supports priority).
• Network service: If the network on the next hop supports precedence, then
that service is invoked.
• Queuing discipline: A router may use precedence to affect how queues are
handled. For example, a router may give preferential treatment in queues to
datagrams with higher precedence.
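The two subfields occupy fixed positions in the TOS octet (a 3-bit precedence field in the high-order bits, the 4-bit TOS field next). A minimal extraction sketch, with the bit layout taken from the original IPv4 TOS definition and an illustrative function name:

```python
def split_tos_octet(tos):
    """Split the original IPv4 TOS octet: 3-bit precedence, 4-bit TOS, 1 unused bit."""
    precedence = (tos >> 5) & 0b111   # guidance on relative router resource allocation
    tos_bits = (tos >> 1) & 0b1111    # guidance on next-hop selection (D, T, R, C flags)
    return precedence, tos_bits

# Precedence 5 with the "low delay" TOS bit set:
prec, tos_bits = split_tos_octet(0b1011_0000)
print(prec, bin(tos_bits))
```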
RFC 1812, Requirements for IP Version 4 Routers, provides recommendations for queuing discipline that fall into two categories:
• Queue service
(a) Routers SHOULD implement precedence-ordered queue service. Prece-
dence-ordered queue service means that when a packet is selected for output on a (logical) link, the packet of highest precedence that has been
queued for that link is sent.
(b) Any router MAY implement other policy-based throughput management
procedures that result in other than strict precedence ordering, but it
MUST be configurable to suppress them (i.e., use strict ordering).
• Congestion control. When a router receives a packet beyond its storage capacity, it must discard it or some other packet or packets.
(a) A router MAY discard the packet it has just received; this is the simplest
but not the best policy.
(b) Ideally, the router should select a packet from one of the sessions most
heavily abusing the link, given that the applicable QoS policy permits this.
A recommended policy in datagram environments using FIFO queues is
to discard a packet randomly selected from the queue. An equivalent algorithm in routers using fair queues is to discard from the longest queue. A
router MAY use these algorithms to determine which packet to discard.
(c) If precedence-ordered queue service is implemented and enabled, the
router MUST NOT discard a packet whose IP precedence is higher than
that of a packet that is not discarded.
(d) A router MAY protect packets whose IP headers request the maximize reli-
ability TOS, except where doing so would be in violation of the previous rule.
(e) A router MAY protect fragmented IP packets, on the theory that dropping
a fragment of a datagram may increase congestion by causing all fragments of the datagram to be retransmitted by the source.
(f) To help prevent routing perturbations or disruption of management functions, the router MAY protect packets used for routing control, link control, or network management from being discarded. Dedicated routers
(i.e., routers that are not also general purpose hosts, terminal servers, etc.)
can achieve an approximation of this rule by protecting packets whose
source or destination is the router itself.
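Recommendations (a) for queue service and (b)/(c) for congestion control can be sketched together: transmit the highest-precedence queued packet, and on overflow never discard a packet of higher precedence than one that is kept. This is an illustrative sketch of the RFC 1812 rules, not router code; the dictionary packet representation is an assumption:

```python
import random

def select_for_output(queue):
    """Precedence-ordered queue service: send the highest-precedence queued packet."""
    return max(queue, key=lambda pkt: pkt["precedence"])

def select_for_discard(queue):
    """On overflow, never discard a packet whose precedence is higher than one
    retained: choose randomly among packets at the lowest precedence present."""
    lowest = min(pkt["precedence"] for pkt in queue)
    candidates = [pkt for pkt in queue if pkt["precedence"] == lowest]
    return random.choice(candidates)

queue = [{"id": 1, "precedence": 0},
         {"id": 2, "precedence": 5},
         {"id": 3, "precedence": 0}]
print(select_for_output(queue)["id"])        # packet 2 (precedence 5) goes first
print(select_for_discard(queue)["precedence"])  # a precedence-0 packet is sacrificed
```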
The DS codepoints of the form xxx000 should provide a service that at minimum is equivalent to that of the IPv4 precedence functionality.
DS Configuration and Operation
Figure 19.14 illustrates the type of configuration envisioned in the DS documents. A
DS domain consists of a set of contiguous routers; that is, it is possible to get from
any router in the domain to any other router in the domain by a path that does not
include routers outside the domain. Within a domain, the interpretation of DS codepoints is uniform, so that a uniform, consistent service is provided.
Figure 19.14 DS Domains (each domain contains border components, with shaper/dropper and queue management, and interior components)

Routers in a DS domain are either boundary nodes or interior nodes. Typically, the interior nodes implement simple mechanisms for handling packets based on their DS codepoint values. This includes queuing discipline to give preferential
treatment depending on codepoint value, and packet-dropping rules to dictate
which packets should be dropped first in the event of buffer saturation. The DS
specifications refer to the forwarding treatment provided at a router as per-hop
behavior (PHB). This PHB must be available at all routers, and typically PHB is the
only part of DS implemented in interior routers.
The boundary nodes include PHB mechanisms but more sophisticated traffic
conditioning mechanisms are also required to provide the desired service. Thus,
interior routers have minimal functionality and minimal overhead in providing the
DS service, while most of the complexity is in the boundary nodes. The boundary
node function can also be provided by a host system attached to the domain, on
behalf of the applications at that host system.
The traffic conditioning function consists of five elements:
• Classifier: Separates submitted packets into different classes. This is the foundation of providing differentiated services. A classifier may separate traffic
only on the basis of the DS codepoint (behavior aggregate classifier) or based
on multiple fields within the packet header or even the packet payload (multifield classifier).
• Meter: Measures submitted traffic for conformance to a profile. The meter
determines whether a given packet stream class is within or exceeds the service level guaranteed for that class.
• Marker: Re-marks packets with a different codepoint as needed. This may be
done for packets that exceed the profile; for example, if a given throughput is
guaranteed for a particular service class, any packets in that class that exceed
the throughput in some defined time interval may be re-marked for best effort
handling. Also, re-marking may be required at the boundary between two DS
domains. For example, if a given traffic class is to receive the highest supported
priority, and this is a value of 3 in one domain and 7 in the next domain, then
packets with a priority 3 value traversing the first domain are remarked as priority 7 when entering the second domain.
• Shaper: Delays packets as necessary so that the packet stream in a given class
does not exceed the traffic rate specified in the profile for that class.
• Dropper: Drops packets when the rate of packets of a given class exceeds that
specified in the profile for that class.
Figure 19.15 illustrates the relationship between the elements of traffic conditioning. After a flow is classified, its resource consumption must be measured. The
metering function measures the volume of packets over a particular time interval to
determine a flow’s compliance with the traffic agreement. If the traffic is bursty, a simple data rate or packet rate may not be sufficient to capture the desired traffic characteristics. A token bucket scheme, such as that illustrated in Figure 19.11, is an example of a way to define a traffic profile to take into account both packet rate and burstiness.
If a traffic flow exceeds some profile, several approaches can be taken. Individual packets in excess of the profile may be re-marked for lower-quality handling
and allowed to pass into the DS domain. A traffic shaper may absorb a burst of packets in a buffer and pace the packets over a longer period of time. A dropper may drop packets if the buffer used for pacing becomes saturated.

Figure 19.15 DS Traffic Conditioner
Per-Hop Behavior
As part of the DS standardization effort, specific types of PHB need to be defined,
which can be associated with specific differentiated services. Currently, two standards-track PHBs have been issued: expedited forwarding PHB (RFCs 3246 and
3247) and assured forwarding PHB (RFC 2597).
Expedited Forwarding PHB RFC 3246 defines the expedited forwarding
(EF) PHB as a building block for low-loss, low-delay, and low-jitter end-to-end services through DS domains. In essence, such a service should appear to the endpoints
as providing close to the performance of a point-to-point connection or leased line.
In an internet or packet-switching network, a low-loss, low-delay, and low-jitter
service is difficult to achieve. By its nature, an internet involves queues at each node, or
router, where packets are buffered waiting to use a shared output link. It is the queuing
behavior at each node that results in loss, delays, and jitter. Thus, unless the internet is
grossly oversized to eliminate all queuing effects, care must be taken in handling traffic
for EF PHB to assure that queuing effects do not result in loss, delay, or jitter above a
given threshold. RFC 3246 declares that the intent of the EF PHB is to provide a PHB
in which suitably marked packets usually encounter short or empty queues. The relative absence of queues minimizes delay and jitter. Furthermore, if queues remain short
relative to the buffer space available, packet loss is also kept to a minimum.
The EF PHB is designed to configure nodes so that the traffic aggregate³ has
a well-defined minimum departure rate. (Well-defined means “independent of the
dynamic state of the node.” In particular, independent of the intensity of other traffic at the node.) The general concept outlined in RFC 3246 is this: The border nodes
control the traffic aggregate to limit its characteristics (rate, burstiness) to some predefined level. Interior nodes must treat the incoming traffic in such a way that queuing effects do not appear. In general terms, the requirement on interior nodes is that
the aggregate’s maximum arrival rate must be less than the aggregate’s minimum
departure rate.
³The term traffic aggregate refers to the flow of packets associated with a particular service for a particular user.
RFC 3246 does not mandate a specific queuing policy at the interior nodes to
achieve the EF PHB. The RFC notes that a simple priority scheme could achieve the
desired effect, with the EF traffic given absolute priority over other traffic. So long
as the EF traffic itself did not overwhelm an interior node, this scheme would result
in acceptable queuing delays for the EF PHB. However, the risk of a simple priority
scheme is that packet flows for other PHB traffic would be disrupted. Thus, some
more sophisticated queuing policy might be warranted.
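The simple priority scheme mentioned in the RFC can be sketched with two queues, EF and best-effort, where EF always wins. This is one possible realization, not mandated by RFC 3246; the class name and two-queue structure are assumptions for illustration:

```python
from collections import deque

EF_DSCP = 0b101110   # the EF codepoint

class StrictPriorityScheduler:
    """One way (not mandated by RFC 3246) to realize EF: give the EF
    queue absolute priority over a best-effort queue."""

    def __init__(self):
        self.ef = deque()
        self.best_effort = deque()

    def enqueue(self, pkt, dscp):
        (self.ef if dscp == EF_DSCP else self.best_effort).append(pkt)

    def dequeue(self):
        if self.ef:                    # EF packets usually meet empty queues,
            return self.ef.popleft()   # so their delay and jitter stay small
        if self.best_effort:
            return self.best_effort.popleft()
        return None

sched = StrictPriorityScheduler()
sched.enqueue("be-1", 0b000000)
sched.enqueue("ef-1", EF_DSCP)
print(sched.dequeue())   # "ef-1" leaves first despite arriving second
```

The risk noted in the text is visible here: if the EF aggregate itself is not rate-limited at the boundary, the best-effort queue can be starved indefinitely.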
Assured Forwarding PHB The assured forwarding (AF) PHB is designed to
provide a service superior to best-effort but one that does not require the reservation of resources within an internet and does not require the use of detailed discrimination among flows from different users. The concept behind the AF PHB was
first introduced in [CLAR98] and is referred to as explicit allocation. The AF PHB
is more complex than explicit allocation, but it is useful to first highlight the key elements of the explicit allocation scheme:
1. Users are offered the choice of a number of classes of service for their traffic.
Each class describes a different traffic profile in terms of an aggregate data
rate and burstiness.
2. Traffic from a user within a given class is monitored at a boundary node. Each
packet in a traffic flow is marked in or out based on whether it does or does not
exceed the traffic profile.
3. Inside the network, there is no separation of traffic from different users or even
traffic from different classes. Instead, all traffic is treated as a single pool of packets, with the only distinction being whether each packet has been marked in or out.
4. When congestion occurs, the interior nodes implement a dropping scheme in
which out packets are dropped before in packets.
5. Different users will see different levels of service because they will have different quantities of in packets in the service queues.
The advantage of this approach is its simplicity. Very little work is required by
the internal nodes. Marking of the traffic at the boundary nodes based on traffic
profiles provides different levels of service to different classes.
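The explicit allocation scheme's boundary marking (step 2) and congestion dropping (step 4) can be sketched as follows. The windowed byte budget and function names are illustrative assumptions, not part of [CLAR98]:

```python
def mark_in_out(packet_sizes, profile_rate, window=1.0):
    """Boundary-node marking: packets within the profile's byte budget for
    the window are marked 'in'; the excess is marked 'out'."""
    budget = profile_rate * window
    marked = []
    for size in packet_sizes:
        if size <= budget:
            budget -= size
            marked.append("in")
        else:
            marked.append("out")
    return marked

def drop_on_congestion(queue):
    """Interior-node rule: when congested, discard 'out' packets before 'in'."""
    if "out" in queue:
        queue.remove("out")
    elif queue:
        queue.pop()
    return queue

print(mark_in_out([400, 400, 400], profile_rate=1000))  # ['in', 'in', 'out']
print(drop_on_congestion(["in", "out", "in"]))          # ['in', 'in']
```

Note that the interior node never consults user identity or class; it sees only the in/out mark, which is what keeps the interior forwarding path so simple.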
The AF PHB defined in RFC 2597 expands on the preceding approach in the
following ways:
1. Four AF classes are defined, allowing the definition of four distinct traffic profiles. A user may select one or more of these classes to satisfy requirements.
2. Within each class, packets are marked by the customer or by the service
provider with one of three drop precedence values. In case of congestion, the
drop precedence of a packet determines the relative importance of the packet
within the AF class. A congested DS node tries to protect packets with a lower
drop precedence value from being lost by preferably discarding packets with a
higher drop precedence value.
This approach is still simpler to implement than any sort of resource reservation scheme but provides considerable flexibility. Within an interior DS node, traffic
from the four classes can be treated separately, with different amounts of resources
(buffer space, data rate) assigned to the four classes. Within each class, packets are
handled based on drop precedence. Thus, as RFC 2597 points out, the level of forwarding assurance of an IP packet depends on
• How much forwarding resources has been allocated to the AF class to which
the packet belongs
• The current load of the AF class, and, in case of congestion within the class
• The drop precedence of the packet
RFC 2597 does not mandate any mechanisms at the interior nodes to manage
the AF traffic. It does reference the RED algorithm as a possible way of managing congestion.
Figure 19.13b shows the recommended codepoints for AF PHB in the DS field.
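The recommended AF codepoints pack the class into the upper three bits and the drop precedence into the next two, so a codepoint such as AF11 can be computed directly. A minimal sketch of this encoding per RFC 2597 (the function name is illustrative):

```python
def af_codepoint(af_class, drop_precedence):
    """AF DSCP per RFC 2597: class (1-4) in the upper three bits,
    drop precedence (1 = lowest, 3 = highest) in the next two."""
    assert 1 <= af_class <= 4 and 1 <= drop_precedence <= 3
    return (af_class << 3) | (drop_precedence << 1)

print(format(af_codepoint(1, 1), "06b"))   # AF11 -> 001010
print(format(af_codepoint(4, 3), "06b"))   # AF43 -> 100110
```

An interior node can therefore recover both the class (for resource allocation) and the drop precedence (for discard decisions) with two shifts and masks.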
A service level agreement (SLA) is a contract between a network provider and a
customer that defines specific aspects of the service that is to be provided. The definition is formal and typically defines quantitative thresholds that must be met. An
SLA typically includes the following information:
• A description of the nature of service to be provided: A basic service would be
IP-based network connectivity of enterprise locations plus access to the Internet. The service may include additional functions such as Web hosting, maintenance of domain name servers, and operation and maintenance tasks.
• The expected performance level of the service: The SLA defines a number of
metrics, such as delay, reliability, and availability, with numerical thresholds.
• The process for monitoring and reporting the service level: This describes how
performance levels are measured and reported.
The types of service parameters included in an SLA for an IP network are similar to those provided for frame relay and ATM networks. A key difference is that,
because of the unreliable datagram nature of an IP network, it is more difficult to
realize tightly defined constraints on performance, compared to the connection-oriented frame relay and ATM networks.
Figure 19.16 shows a typical configuration that lends itself to an SLA. In this case,
a network service provider maintains an IP-based network. A customer has a number
of private networks (e.g., LANs) at various sites. Customer networks are connected to
the provider via access routers at the access points. The SLA dictates service and performance levels for traffic between access routers across the provider network. In
addition, the provider network links to the Internet and thus provides Internet access
for the enterprise. For example, for the Internet Dedicated Service provided by MCI
, the SLA includes the following items:
• Availability: 100% availability.
Figure 19.16 Typical Framework for Service Level Agreement (customer networks connected through access routers to the ISP network)

• Latency (delay): Average round-trip transmissions of ≤45 ms between access routers in the contiguous United States. Average round-trip transmissions of ≤90 ms between an access router in the New York metropolitan area and an access router in the London metropolitan area. Latency is calculated by averaging sample measurements taken during a calendar month
between routers.
• Network packet delivery (reliability): Successful packet delivery rate of ≥99.5%.
• Denial of service (DoS): Responds to DoS attacks reported by customer
within 15 minutes of customer opening a complete trouble ticket. MCI defines
a DoS attack as more than 95% bandwidth utilization.
• Network jitter: Jitter is defined as the variation or difference in the end-to-end
delay between received packets of an IP or packet stream. Jitter performance
will not exceed 1 ms between access routers.
An SLA can be defined for the overall network service. In addition, SLAs can
be defined for specific end-to-end services available across the carrier’s network,
such as a virtual private network, or differentiated services.
The IP Performance Metrics Working Group (IPPM) is chartered by IETF to
develop standard metrics that relate to the quality, performance, and reliability of
Internet data delivery. Two trends dictate the need for such a standardized measurement scheme:
1. The Internet has grown and continues to grow at a dramatic rate. Its topology
is increasingly complex. As its capacity has grown, the load on the Internet has
grown at an even faster rate. Similarly, private internets, such as corporate
intranets and extranets, have exhibited similar growth in complexity, capacity,
and load. The sheer scale of these networks makes it difficult to determine
quality, performance, and reliability characteristics.
2. The Internet serves a large and growing number of commercial and personal
users across an expanding spectrum of applications. Similarly, private networks are growing in terms of user base and range of applications. Some of
these applications are sensitive to particular QoS parameters, leading users to
require accurate and understandable performance metrics.
A standardized and effective set of metrics enables users and service providers
to have an accurate common understanding of the performance of the Internet and
private internets. Measurement data is useful for a variety of purposes, including
• Supporting capacity planning and troubleshooting of large complex internets
• Encouraging competition by providing uniform comparison metrics across
service providers
• Supporting Internet research in such areas as protocol design, congestion control, and quality of service
• Verification of service level agreements
Table 19.5 lists the metrics that have been defined in RFCs at the time of this
writing. Table 19.5a lists those metrics which result in a value estimated based on a
sampling technique. The metrics are defined in three stages:
• Singleton metric: The most elementary, or atomic, quantity that can be measured for a given performance metric. For example, for a delay metric, a singleton metric is the delay experienced by a single packet.
• Sample metric: A collection of singleton measurements taken during a given
time period. For example, for a delay metric, a sample metric is the set of delay
values for all of the measurements taken during a one-hour period.
• Statistical metric: A value derived from a given sample metric by computing
some statistic of the values defined by the singleton metric on the sample. For
example, the mean of all the one-way delay values on a sample might be
defined as a statistical metric.
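The three-stage definition can be made concrete with a small delay example; the sample values here are invented purely for illustration:

```python
import statistics

# Singleton metric: the delay experienced by a single packet (one value).
# Sample metric: the collection of singletons over a measurement period.
one_way_delays_ms = [12.0, 15.0, 11.0, 30.0, 14.0]

# Statistical metrics: values computed over the sample.
print(statistics.mean(one_way_delays_ms))     # mean one-way delay
print(statistics.median(one_way_delays_ms))   # median
print(min(one_way_delays_ms))                 # minimum
```

The same sample supports many statistical metrics, which is why the RFCs separate the sampling procedure from the statistics computed on it.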
The measurement technique can be either active or passive. Active techniques
require injecting packets into the network for the sole purpose of measurement.
There are several drawbacks to this approach. The load on the network is increased.
This, in turn, can affect the desired result. For example, on a heavily loaded network,
the injection of measurement packets can increase network delay, so that the measured delay is greater than it would be without the measurement traffic. In addition,
an active measurement policy can be abused for denial-of-service attacks disguised
as legitimate measurement activity. Passive techniques observe and extract metrics
from existing traffic. This approach can expose the contents of Internet traffic to
unintended recipients, creating security and privacy concerns. So far, the metrics
defined by the IPPM working group are all active.
Table 19.5 IP Performance Metrics
(a) Sampled metrics
Metric Name
Singleton Definition
Statistical Definitions
One-Way Delay
Delay = dT, where Src transmits first bit of
packet at T and Dst received last bit of
packet at T + dT
Percentile, median,
minimum, inverse percentile
Round-Trip Delay
Delay = dT, where Src transmits first bit of
packet at T and Src received last bit of
packet immediately returned by Dst at
T + dT
Percentile, median,
minimum, inverse percentile
One-Way Loss
Packet loss = 0 (signifying successful
transmission and reception of packet); = 1
(signifying packet loss)
One-Way Loss Pattern
Loss distance: Pattern showing the
distance between successive packet losses
in terms of the sequence of packets
Number or rate of loss distances below a defined
threshold, number of loss
periods, pattern of period
lengths, pattern of interloss
period lengths.
Loss period: Pattern showing the number
of bursty losses (losses involving
consecutive packets)
Packet Delay Variation
Packet delay variation (pdv) for a pair of packets within a stream of packets: the difference between the one-way delays of the selected packets
Percentile, inverse percentile, jitter, peak-to-peak pdv
Src = IP address of a host
Dst = IP address of a host
(b) Other metrics

Metric Name
General Definition

Connectivity
Ability to deliver a packet over a transport connection. Specific metrics: one-way instantaneous connectivity, two-way instantaneous connectivity, one-way interval connectivity, two-way interval connectivity, two-way temporal connectivity.

Bulk Transfer Capacity
Long-term average data rate (bps) over a single congestion-aware transport connection. BTC = (data sent)/(elapsed time)
For the sample metrics, the simplest technique is to take measurements at
fixed time intervals, known as periodic sampling. There are several problems with
this approach. First, if the traffic on the network exhibits periodic behavior, with a
period that is an integer multiple of the sampling period (or vice versa), correlation
effects may result in inaccurate values. Also, the act of measurement can perturb
what is being measured (for example, injecting measurement traffic into a network
alters the congestion level of the network), and repeated periodic perturbations can
drive a network into a state of synchronization (e.g., [FLOY94]), greatly magnifying
what might individually be minor effects.

Figure 19.17 Model for Defining Packet Delay Variation
  I1, I2 = times that mark the beginning and ending of the interval in which the packet stream from which the singleton measurement is taken occurs
  MP1, MP2 = source and destination measurement points
  P(i) = ith measured packet in a stream of packets
  dTi = one-way delay for P(i)

Accordingly, RFC 2330 (Framework for IP Performance Metrics) recommends Poisson
sampling. This method uses a Poisson distribution to generate random time
intervals with the desired mean value.
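The Poisson sampling just described can be sketched in a few lines of Python: gaps between successive measurements are drawn from an exponential distribution, which yields a Poisson process of measurement times. The function name and parameters here are illustrative, not part of RFC 2330.

```python
import random

def poisson_sample_times(mean_interval, duration, seed=None):
    """Generate measurement times whose gaps are exponentially
    distributed with the given mean, i.e. a Poisson sampling
    process as recommended by RFC 2330 (illustrative sketch)."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        # Exponential inter-arrival gaps produce a Poisson process.
        t += rng.expovariate(1.0 / mean_interval)
        if t > duration:
            break
        times.append(t)
    return times

# Example: on average one measurement every 10 s over a 5-minute window.
times = poisson_sample_times(mean_interval=10.0, duration=300.0, seed=1)
```

Because the gaps are random rather than fixed, the measurement stream cannot lock onto any periodic behavior in the network traffic.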
Most of the statistical metrics listed in Table 19.5a are self-explanatory. The
percentile metric is defined as follows: The xth percentile is the smallest value y
such that at least x% of the measurements are ≤ y. The inverse percentile of x for
a set of measurements is the percentage of all values ≤ x.
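These two empirical definitions, as given in RFC 2330 and RFC 2679, can be sketched directly (function names are illustrative):

```python
def percentile(measurements, y):
    """Smallest measured value x such that at least y% of all
    measurements are <= x (empirical percentile per RFC 2330)."""
    data = sorted(measurements)
    n = len(data)
    for x in data:
        # Fraction of measurements at or below x, as a percentage.
        if 100.0 * sum(1 for v in data if v <= x) / n >= y:
            return x
    return data[-1]

def inverse_percentile(measurements, x):
    """Percentage of all measurements that are <= x (RFC 2679)."""
    return 100.0 * sum(1 for v in measurements if v <= x) / len(measurements)

sample = [3, 1, 4, 1, 5]
p50 = percentile(sample, 50)            # 3
inv = inverse_percentile(sample, 3)     # 60.0
```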
Figure 19.17 illustrates the packet delay variation metric. This metric is used to
measure jitter, or variability, in the delay of packets traversing the network. The
singleton metric is defined by selecting two packet measurements and taking the
difference between the two one-way delays. The statistical measures make use of
the absolute values of these differences.
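A minimal sketch of the singleton and one common statistic follows; the pairing of successive packets is one possible selection function, and the names are illustrative:

```python
def pdv_singletons(delays):
    """Packet delay variation singletons: difference between the
    one-way delays of successive packets in the stream (one common
    selection function; illustrative sketch)."""
    return [b - a for a, b in zip(delays, delays[1:])]

def peak_to_peak_pdv(delays):
    """Peak-to-peak pdv: largest one-way delay minus smallest."""
    return max(delays) - min(delays)

# One-way delays (ms) for five consecutive packets.
d = [100, 110, 90, 120, 95]
variations = pdv_singletons(d)   # [10, -20, 30, -25]
```

Note that the singletons can be negative (a later packet arriving with less delay), which is why the statistical measures work with absolute values.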
Table 19.5b lists two metrics that are not defined statistically. Connectivity
deals with the issue of whether a transport-level connection is maintained by the
network. The current specification (RFC 2678) does not detail specific sample and
statistical metrics but provides a framework within which such metrics could be
defined. Connectivity is determined by the ability to deliver a packet across a connection within a specified time limit. The other metric, bulk transfer capacity, is similarly specified (RFC 3148) without sample and statistical metrics but begins to
address the issue of measuring the transfer capacity of a network service with the
implementation of various congestion control mechanisms.
A number of worthwhile books provide detailed coverage of various routing algorithms:
[HUIT00], [BLAC00], and [PERL00]. [MOY98] provides a thorough treatment of OSPF.
Perhaps the clearest and most comprehensive book-length treatment of Internet QoS is
[ARMI00]. [XIAO99] provides an overview and overall framework for Internet QoS as well
as integrated and differentiated services. [CLAR92] and [CLAR95] provide valuable surveys
of the issues involved in internet service allocation for real-time and elastic applications,
respectively. [SHEN95] is a masterful analysis of the rationale for a QoS-based internet architecture. [ZHAN95] is a broad survey of queuing disciplines that can be used in an ISA, including an analysis of FQ and WFQ.
[ZHAN93] is a good overview of the philosophy and functionality of RSVP, written by
its developers. [WHIT97] is a broad survey of both ISA and RSVP.
[CARP02] and [WEIS98] are instructive surveys of differentiated services, while
[KUMA98] looks at differentiated services and supporting router mechanisms that go
beyond the current RFCs. For a thorough treatment of DS, see [KILK99].
Two papers that compare IS and DS in terms of services and performance are
[BERN00] and [HARJ00].
[VERM04] is an excellent survey of service level agreements for IP networks.
[BOUI02] covers the more general case of data networks. [MART02] examines limitations of
IP network SLAs compared to data networks such as frame relay.
[CHEN02] is a useful survey of Internet performance measurement issues. [PAXS96]
provides an overview of the framework of the IPPM effort.
ARMI00 Armitage, G. Quality of Service in IP Networks. Indianapolis, IN: Macmillan
Technical Publishing, 2000.
BERN00 Bernet, Y. “The Complementary Roles of RSVP and Differentiated Services in
the Full-Service QoS Network.” IEEE Communications Magazine, February 2000.
BLAC00 Black, U. IP Routing Protocols: RIP, OSPF, BGP, PNNI & Cisco Routing
Protocols. Upper Saddle River, NJ: Prentice Hall, 2000.
BOUI02 Bouillet, E.; Mitra, D.; and Ramakrishnan, K. “The Structure and Management
of Service Level Agreements in Networks.” IEEE Journal on Selected Areas in
Communications, May 2002.
CARP02 Carpenter, B., and Nichols, K. “Differentiated Services in the Internet.”
Proceedings of the IEEE, September 2002.
CHEN02 Chen, T. “Internet Performance Monitoring.” Proceedings of the IEEE,
September 2002.
CLAR92 Clark, D.; Shenker, S.; and Zhang, L. “Supporting Real-Time Applications in
an Integrated Services Packet Network: Architecture and Mechanism.” Proceedings,
SIGCOMM ’92, August 1992.
CLAR95 Clark, D. Adding Service Discrimination to the Internet. MIT Laboratory for
Computer Science Technical Report, September 1995. Available at http://
HARJ00 Harju, J., and Kivimaki, P. “Cooperation and Comparison of DiffServ and
IntServ: Performance Measurements.” Proceedings, 23rd Annual IEEE Conference
on Local Computer Networks, November 2000.
HUIT00 Huitema, C. Routing in the Internet. Upper Saddle River, NJ: Prentice Hall, 2000.
KILK99 Kilkki, K. Differentiated Services for the Internet. Indianapolis, IN: Macmillan
Technical Publishing, 1999.
KUMA98 Kumar, V.; Lakshman, T.; and Stiliadis, D. “Beyond Best Effort: Router Architectures for the Differentiated Services of Tomorrow’s Internet.” IEEE Communications Magazine, May 1998.
MART02 Martin, J., and Nilsson, A. “On Service Level Agreements for IP Networks.”
Proceedings, IEEE INFOCOM ’02, 2002.
MOY98 Moy, J. OSPF: Anatomy of an Internet Routing Protocol. Reading, MA:
Addison-Wesley, 1998.
PAXS96 Paxson, V. “Toward a Framework for Defining Internet Performance Metrics.”
Proceedings, INET ’96, 1996.
PERL00 Perlman, R. Interconnections: Bridges, Routers, Switches, and Internetworking
Protocols. Reading, MA: Addison-Wesley, 2000.
SHEN95 Shenker, S. “Fundamental Design Issues for the Future Internet.” IEEE Journal on Selected Areas in Communications, September 1995.
VERM04 Verma, D. “Service Level Agreements on IP Networks.” Proceedings of the
IEEE, September 2004.
WEIS98 Weiss, W. “QoS with Differentiated Services.” Bell Labs Technical Journal,
October–December 1998.
WHIT97 White, P., and Crowcroft, J. “The Integrated Services in the Internet: State of
the Art.” Proceedings of the IEEE, December 1997.
XIAO99 Xiao, X., and Ni, L. “Internet QoS: A Big Picture.” IEEE Network, March/April 1999.
ZHAN93 Zhang, L.; Deering, S.; Estrin, D.; Shenker, S.; and Zappala, D. “RSVP: A New
Resource ReSerVation Protocol.” IEEE Network, September 1993.
ZHAN95 Zhang, H. “Service Disciplines for Guaranteed Performance Service in
Packet-Switching Networks.” Proceedings of the IEEE, October 1995.
Recommended Web sites:
• Inter-Domain Routing working group: Chartered by IETF to revise BGP and
related standards. The Web site includes all relevant RFCs and Internet drafts.
• OSPF working group: Chartered by IETF to develop OSPF and related standards.
The Web site includes all relevant RFCs and Internet drafts.
• RSVP Project: Home page for RSVP development.
• IP Performance Metrics working group: Chartered by IETF to develop a set of
standard metrics that can be applied to the quality, performance, and reliability of Internet data delivery services. The Web site includes all relevant RFCs and Internet drafts.
Key Terms
autonomous system (AS)
Border Gateway Protocol
broadcast address
Differentiated Services (DS)
distance-vector routing
elastic traffic
exterior router protocol
inelastic traffic
Integrated Services Architecture (ISA)
interior router protocol
Internet Group Management
link-state routing
multicast address
neighbor acquisition
neighbor reachability
network reachability
Open Shortest Path First
path-vector routing
per-hop behavior
quality of service (QoS)
queuing discipline
Resource ReSerVation Protocol (RSVP)
unicast address
Review Questions
List some practical applications of multicasting.
Summarize the differences among unicast, multicast, and broadcast addresses.
List and briefly explain the functions that are required for multicasting.
What operations are performed by IGMP?
What is an autonomous system?
What is the difference between an interior router protocol and an exterior router protocol?
Compare the three main approaches to routing.
List and briefly explain the three main functions of BGP.
What is the Integrated Services Architecture?
What is the difference between elastic and inelastic traffic?
What are the major functions that are part of an ISA?
List and briefly describe the three categories of service offered by ISA.
What is the difference between FIFO queuing and WFQ queuing?
What is the purpose of a DS codepoint?
List and briefly explain the five main functions of DS traffic conditioning.
What is meant by per-hop behavior?
Most operating systems include a tool named “traceroute” (or “tracert”) that can be
used to determine the path packets follow to reach a specified host from the system
the tool is being run on. A number of sites provide Web access to the “traceroute”
tool, for example,
Use the “traceroute” tool to determine the path packets follow to reach the host
A connected graph may have more than one spanning tree. Find all spanning trees of
this graph:
In the discussion of Figure 19.1, three alternatives for transmitting a packet to a multicast address were discussed: broadcast, multiple unicast, and true multicast. Yet
another alternative is flooding. The source transmits one packet to each neighboring
router. Each router, when it receives a packet, retransmits the packet on all outgoing
interfaces except the one on which the packet is received. Each packet is labeled with
a unique identifier so that a router does not flood the same packet more than once.
Fill out a matrix similar to those of Table 19.1 and comment on the results.
In a manner similar to Figure 19.3, show the spanning tree from router B to the multicast group.
IGMP specifies that query messages are sent in IP datagrams that have the Time to
Live field set to 1. Why?
In IGMPv1 and IGMPv2, a host will cancel sending a pending membership report if it
hears another host claiming membership in that group, in order to control the generation of IGMP traffic. However, IGMPv3 removes this suppression of host membership reports. Analyze the reasons behind this design decision.
IGMP Membership Queries include a “Max Resp Code” field that specifies the maximum time allowed before sending a responding report. The actual time allowed,
called the Max Resp Time, is represented in units of 1/10 second and is derived from
the Max Resp Code as follows:
If Max Resp Code < 128, Max Resp Time = Max Resp Code
If Max Resp Code ≥ 128, Max Resp Time is a floating-point value as follows:
Max Resp Time = (mant | 0x10) << (exp + 3) in C notation
that is, Max Resp Time = (mant + 16) × 2^(exp + 3)
Explain the motivation for the smaller values and the larger values.
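For reference, the conversion rule quoted in this problem can be sketched as a small Python function (an illustrative implementation of the stated rules, with the exponent in bits 1-3 and the mantissa in bits 4-7 of the octet):

```python
def max_resp_time(code):
    """Convert an IGMPv3 Max Resp Code (one octet, 0-255) to the
    Max Resp Time in units of 1/10 second, per the quoted rules."""
    if code < 128:
        return code
    exp = (code >> 4) & 0x7   # 3-bit exponent field
    mant = code & 0xF         # 4-bit mantissa field
    return (mant | 0x10) << (exp + 3)
```

Note that the encoding is continuous at the boundary: a code of 128 still maps to a Max Resp Time of 128, while the largest code, 255, maps to 31,744 tenths of a second.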
Multicast applications call an API function on their sockets in order to ask the IP
layer to enable or disable reception of packets sent from some specific IP address(es)
to a specific multicast address.
For each of these sockets, the system records the desired multicast reception state.
In addition to these per-socket multicast reception states, the system must maintain a
multicast reception state for each of its interfaces, which is derived from the persocket reception states.
Suppose four multicast applications run on the same host, and participate in the
same multicast group, M1. The first application uses an EXCLUDE{A1, A2, A3} filter. The second one uses an EXCLUDE{A1, A3, A4} filter. The third one uses an
INCLUDE{A3, A4} filter. And the fourth one uses an INCLUDE{A3} filter.
What’s the resulting multicast state (multicast address, filter mode, source list) for the
network interface?
Multicast applications commonly use UDP or RTP (Real-Time Transport Protocol;
discussed in Chapter 24) as their transport protocol. Multicast applications do not use
TCP as their transport protocol. What’s the problem with TCP?
With multicasting, packets are delivered to multiple destinations. Thus, in case of
errors (such as routing failures), one IP packet might trigger multiple ICMP error
packets, leading to a packet storm. How is this potential problem avoided? Hint: Consult RFC 1122.
BGP’s AS_PATH attribute identifies the autonomous systems through which routing
information has passed. How can the AS_PATH attribute be used to detect routing
information loops?
BGP provides a list of autonomous systems on the path to the destination. However,
this information cannot be considered a distance metric. Why?
RFC 2330 (Framework for IP Performance Metrics) defines percentile in the following way. Given a collection of measurements, define the function F(x), which for any x
gives the percentage of the total measurements that were ≤ x. If x is less than the minimum value observed, then F(x) = 0%. If it is greater than or equal to the maximum
value observed, then F(x) = 100%. The yth percentile refers to the smallest value of x
for which F(x) ≥ y. Suppose we have the following measurements: −2, 7, 7, 4, 18,
−5. Determine the following percentiles: 0, 25, 50, 100.
For the one-way and two-way delay metrics, if a packet fails to arrive within a reasonable period of time, the delay is taken to be undefined (informally, infinite). The
threshold of reasonable is a parameter of the methodology. Suppose we take a sample
of one-way delays and get the following results: 100 ms, 110 ms, undefined, 90 ms,
500 ms. What is the 50th percentile?
RFC 2330 defines the median of a set of measurements to be equal to the 50th percentile if the number of measurements is odd. For an even number of measurements,
sort the measurements in ascending order; the median is then the mean of the two
central values. What is the median value for the measurements in the preceding two
problems?
RFC 2679 defines the inverse percentile of x for a set of measurements to be the percentage of all values ≤ x. What is the inverse percentile of 103 ms for the measurements in Problem 19.14?
When multiple equal-cost routes to a destination exist, OSPF may distribute traffic
equally among the routes. This is called load balancing. What effect does such load
balancing have on a transport layer protocol, such as TCP?
It is clear that if a router gives preferential treatment to one flow or one class of flows,
then that flow or class of flows will receive improved service. It is not as clear that the
overall service provided by the internet is improved. This question is intended to illustrate an overall improvement. Consider a network with a single link modeled by an
exponential server of rate Ts = 1, and consider two classes of flows with Poisson
arrival rates of λ1 = λ2 = 0.25 and that have utility functions U1 = 4 − 2Tq1 and
U2 = 4 − Tq2, where Tqi represents the average queuing delay to class i. Thus, class 1
traffic is more sensitive to delay than class 2. Define the total utility of the network as
V = U1 + U2 .
a. Assume that the two classes are treated alike and that FIFO queuing is used.
What is V?
b. Now assume a strict priority service so that packets from class 1 are always transmitted before packets in class 2. What is V? Comment.
Provide three examples (each) of elastic and inelastic Internet traffic. Justify each
example’s inclusion in their respective category.
Why does a Differentiated Services (DS) domain consist of a set of contiguous
routers? How are the boundary node routers different from the interior node routers
in a DS domain?
The token bucket scheme places a limit on the length of time at which traffic can
depart at the maximum data rate. Let the token bucket be defined by a bucket size B
octets and a token arrival rate of R octets/second, and let the maximum output data
rate be M octets/s.
a. Derive a formula for S, which is the length of the maximum-rate burst. That is, for
how long can a flow transmit at the maximum output rate when governed by a
token bucket?
b. What is the value of S for B = 250 KB, R = 2 MB/s, and M = 25 MB/s?
Hint: The formula for S is not so simple as it might appear, because more tokens
arrive while the burst is being output.
In RSVP, because the UDP/TCP port numbers are used for packet classification, each
router must be able to examine these fields. This requirement raises problems in the
following areas:
a. IPv6 header processing
b. IP-level security
Indicate the nature of the problem in each area, and suggest a solution.
20.1 Connection-Oriented Transport Protocol Mechanisms
20.2 TCP
20.3 TCP Congestion Control
20.4 UDP
20.5 Recommended Reading and Web Sites
20.6 Key Terms, Review Questions, and Problems
The foregoing observations should make us reconsider the widely held view that birds
live only in the present. In fact, birds are aware of more than immediately present
stimuli; they remember the past and anticipate the future.
—The Minds of Birds, Alexander Skutch
The transport protocol provides an end-to-end data transfer service
that shields upper-layer protocols from the details of the intervening
network or networks. A transport protocol can be either connection
oriented, such as TCP, or connectionless, such as UDP.
If the underlying network or internetwork service is unreliable, such
as with the use of IP, then a reliable connection-oriented transport
protocol becomes quite complex. The basic cause of this complexity is
the need to deal with the relatively large and variable delays experienced between end systems. These large, variable delays complicate
the flow control and error control techniques.
TCP uses a credit-based flow control technique that is somewhat different from the sliding-window flow control found in X.25 and
HDLC. In essence, TCP separates acknowledgments from the management of the size of the sliding window.
Although the TCP credit-based mechanism was designed for end-toend flow control, it is also used to assist in internetwork congestion
control. When a TCP entity detects the presence of congestion in the
Internet, it reduces the flow of data onto the Internet until it detects
an easing in congestion.
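The decoupling of acknowledgment from window management mentioned above can be previewed with a toy model: the receiver returns an acknowledgment number AN and a window W, granting the sender credit to transmit bytes AN through AN + W − 1. All names and the fixed-buffer policy here are illustrative, not TCP's actual implementation.

```python
class CreditReceiver:
    """Toy sketch of TCP-style credit allocation. Acknowledgment
    (AN) and window (W) together authorize the sender to transmit
    bytes with sequence numbers AN through AN + W - 1."""
    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.next_expected = 0   # next in-order byte expected (AN)
        self.buffered = 0        # bytes received but not yet consumed

    def receive(self, nbytes):
        # Acknowledging data advances AN but, by itself, shrinks W.
        self.next_expected += nbytes
        self.buffered += nbytes

    def consume(self, nbytes):
        # The application draining its buffer restores credit
        # independently of any acknowledgment.
        self.buffered -= nbytes

    def credit(self):
        return self.next_expected, self.buffer_size - self.buffered

r = CreditReceiver(1000)
r.receive(200)     # 200 bytes arrive and are acknowledged
an, w = r.credit() # AN advances to 200; window shrinks to 800
```

The point of the sketch is that AN and W move for different reasons: AN tracks what has arrived, while W tracks free buffer space, so acknowledgment and window management are separate decisions.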
In a protocol architecture, the transport protocol sits above a network or
internetwork layer, which provides network-related services, and just below application and other upper-layer protocols. The transport protocol provides services to
transport service (TS) users, such as FTP, SMTP, and TELNET. The local transport
entity communicates with some remote transport entity, using the services of some
lower layer, such as the Internet Protocol. The general service provided by a transport protocol is the end-to-end transport of data in a way that shields the TS user
from the details of the underlying communications systems.
We begin this chapter by examining the protocol mechanisms required to provide these services. We find that most of the complexity relates to reliable connection-oriented services. As might be expected, the less the network service provides,
the more the transport protocol must do. The remainder of the chapter looks at two
widely used transport protocols: Transmission Control Protocol (TCP) and User
Datagram Protocol (UDP).
Refer to Figure 2.5 to see the position within the TCP/IP suite of the protocols
discussed in this chapter.