BitTorrent Traffic
Measurements and Models
David Erman
October 2005
Department of Telecommunication Systems,
School of Engineering,
Blekinge Institute of Technology
© October 2005, David Erman. All rights reserved.
Copyright Blekinge Institute of Technology
Licentiate Dissertation Series No. 2005:13
ISSN 1650-2140
ISBN 91-7295-071-4
Published 2005
Printed by Kaserntryckeriet AB
Karlskrona 2005
Sweden
This publication was typeset using LaTeX.
For my Family
past, present and future
Abstract
The Internet has experienced two major revolutions. The first was the emergence of the World Wide Web, which catapulted the Internet from being a
scientific and academic network to becoming part of the societal infrastructure.
The second revolution was the appearance of Peer-to-Peer (P2P) applications, spearheaded by Napster.
The popularity of P2P networking has led to a dramatic increase in the
volume and complexity of the traffic generated by P2P applications. P2P traffic
has recently been shown to amount to almost 80 % of the total traffic in a high
speed IP backbone link. One of the major contributors to this massive volume
of traffic is BitTorrent, a P2P replication system. Studies have shown that
BitTorrent traffic more than doubled during the first quarter of 2004, and still
amounts to 60 % of all P2P traffic in 2005.
This thesis reports on measurement, modelling and analysis of BitTorrent
traffic collected at Blekinge Institute of Technology (BIT) as well as at a local
ISP. An application layer measurement infrastructure for P2P measurements
developed at BIT is presented. Furthermore, a dedicated fitness assessment
method to avoid issues with large sample spaces is described. New results regarding BitTorrent session and message characteristics are reported and models
for several important characteristics are provided. Results show that several
BitTorrent metrics such as session durations and sizes exhibit heavy-tail behaviour. Additionally, previously reported results on peer reactivity to new
content are corroborated.
Acknowledgements
Several people have contributed in various ways, directly or indirectly, to the
work culminating in this thesis. I extend my gratitude to them all. However, I
would like to thank a few people in particular.
• My advisor, Docent Adrian Popescu, for his attention to detail and correctness, motivation and encouragement.
• My fellow graduate students at BIT, in particular Dragos Ilie and Doru
Constantinescu, for valuable criticism and encouragement.
• Dr. Markus Fiedler. His enthusiasm and tenacity are an inspiration to any
PhD student.
• Prof. Arne Nilsson for accepting me as a PhD student.
• My parents, for performing above and beyond the call of duty and teaching
me to question the unquestionable.
• My immediate family, Maria, for putting up with me during the writing
of this thesis.
David Erman
Karlskrona, October 2005
Contents
1 Introduction
    1.1 Motivation
    1.2 Related Work
    1.3 Main Contributions
    1.4 Thesis Outline
2 Peer-to-peer Protocols
    2.1 Evolution
    2.2 Definitions
    2.3 P2P and File Sharing
    2.4 Summary
3 The BitTorrent Protocol
    3.1 BitTorrent Encoding
    3.2 Resource Meta-data
    3.3 Network Entities and Protocols
    3.4 Peer States
    3.5 Sharing Fairness and Bootstrapping
    3.6 Data Transfer
    3.7 BitTorrent Performance Issues
    3.8 Summary
4 Traffic Measurements
    4.1 Measurement Approaches
    4.2 Application Level Traffic Analysis
    4.3 Measurement Infrastructure
    4.4 Measurement Software
    4.5 Summary
5 Traffic Modelling
    5.1 Heavy-tailed Traffic Models
    5.2 Hypothesising Distributions
    5.3 Mixture Distributions
    5.4 Parameter Estimation
    5.5 Summary
6 Fitness Assessment
    6.1 Graphical Methods
    6.2 Hypothesis Testing
    6.3 The Case of Large Sample Spaces
    6.4 Relative and Absolute Fitness
    6.5 Summary
7 Modelling Methodology
    7.1 Distribution Selection
    7.2 Parameter Estimation
    7.3 Fitness Assessment
    7.4 Summary
8 BitTorrent Measurements
    8.1 Traffic Metrics
    8.2 Traffic Measurements
    8.3 Summary Statistics
    8.4 Swarm Size
    8.5 Session Sizes
    8.6 Summary
9 BitTorrent Models
    9.1 Session Characteristics
    9.2 Message Characteristics
    9.3 Summary
10 Conclusions and Future Work
    10.1 Future Work
A BitTorrent Protocol Details
    A.1 Bencoding Types
    A.2 Peer Wire Protocol Messages
    A.3 Tracker Request Parameters
    A.4 Scrape Response Keys
B BitTorrent XML Log File
    B.1 BitTorrent Application Log DTD
Bibliography
List of Figures
3.1 BitTorrent handshake procedure
3.2 Example tracker announce GET request
3.3 Compact tracker response
3.4 Example tracker scrape GET request
3.5 BitTorrent protocol exchange
4.1 BIT measurement setup
4.2 Measurement procedures
4.3 Sample BitTorrent log file
5.1 Pareto, Weibull, Log-normal and Exponential Hill plots
5.2 Pareto, Weibull, Log-normal and Exponential α-estimator plots
5.3 Pareto, Weibull, Log-normal and Exponential CCDF
5.4 Skewness
5.5 A finite mixture distribution
6.1 AD weighting function for a uniform distribution
8.1 Temporal structure of measurements 1–12
8.2 Connected peers during seed phase for measurements 4 and 6
8.3 Connected peers during leech phase for measurements 4 and 6
8.4 Swarm reaction to new content
9.1 Fitness assessment plots
9.2 Session size-duration scatter plot
9.3 α-estimates and CCDF for measurement 3
9.4 Upstream request rate during leech phase
9.5 Modelling results for request rate during leech phase
9.6 Modelling results for request inter-departure times during leech phase
9.7 Modelling results for downstream piece rate during leech phase
9.8 Modelling results for downstream piece inter-arrival times during leech phase
9.9 Dual Weibull modelling results for downstream request rate during seed phase
9.10 Modelling results for request inter-arrival times during seed phase
9.11 Dual Weibull modelling results for upstream piece rates during seed phase
9.12 Modelling results for piece inter-departure times during seed phase
B.1 Extract from BitTorrent XML log file
List of Tables
2.1 P2P and CS content models
6.1 EDF statistic percentage points
7.1 Fitness quality boundaries
8.1 Measurement summary
8.2 Content summary
8.3 Download time and average download rate summary
8.4 Session and peer summary
8.5 Downstream protocol message summary
8.6 Upstream protocol message summary
8.7 Share ratio during leech phase
8.8 Correlation coefficients for session sizes
9.1 Fitted hyper-exponential parameters
9.2 Correlation coefficients for session duration and sizes
9.3 Percentages of session sizes exceeding 0 bytes and 1 piece size
9.4 Session α-estimates
9.5 Log-normal parameter estimates and errors for upstream session sizes during seed phase
9.6 Log-normal parameter estimates and errors for upstream session durations during seed phase
9.7 Gaussian parameter estimates and errors for upstream request rate during leech phase
9.8 Exponential parameter estimates and errors for request inter-departure times during leech phase
9.9 Exponential and Uniform parameter estimates and errors using alternative model for request inter-departure times during leech phase
9.10 Weibull parameter estimates and errors for downstream piece rate during leech phase
9.11 Exponential parameter estimates and errors for piece inter-arrival times during leech phase
9.12 Weibull parameter estimates and errors for downstream request rate during seed phase
9.13 Dual Weibull parameter estimates and errors for downstream request rate during seed phase
9.14 Exponential parameter estimates and errors for request inter-arrival times during seed phase
9.15 Weibull parameter estimates and errors for upstream piece rate during seed phase
9.16 Dual Weibull parameter estimates and errors for upstream piece rate during seed phase
9.17 Exponential parameter estimates and errors for piece inter-departure times during seed phase
Acronyms
AD     Anderson-Darling
BIT    Blekinge Institute of Technology
CCDF   Complementary Cumulative Distribution Function
CDN    Content Delivery Network
CS     Client-Server
CVM    Cramér-von Mises
DHT    Distributed Hash Table
DNS    Domain Name System
DTD    Document Type Definition
DOM    Document Object Model
EDF    Empirical Distribution Function
EPDF   Experimental Probability Density Function
IID    Independent and Identically Distributed
ISP    Internet Service Provider
KS     Kolmogorov-Smirnov
LRD    Long-Range Dependence
MLE    Maximum Likelihood Estimation
ML     Maximum-Likelihood
MPAA   Motion Picture Association of America
NAT    Network Address Translation
NFS    Network Filesystem
NNTP   Network News Transfer Protocol
P2P    Peer-to-Peer
PDF    Probability Density Function
PIT    Probability Integral Transform
PSTN   Public Switched Telephone Network
QQ     Quantile-Quantile
QoS    Quality of Service
RIAA   Recording Industry Association of America
RMON   Remote Monitoring
SAX    Simple API for XML
SHA-1  Secure Hash Algorithm One
SMTP   Simple Mail Transfer Protocol
SNMP   Simple Network Management Protocol
SRD    Short-Range Dependence
URI    Uniform Resource Identifier
UUCP   Unix to Unix Copy Protocol
VoIP   Voice over IP
Chapter 1
Introduction
The prisoner falls in love with his chains.
– Edsger W. Dijkstra
The global Internet has emerged to become an integral part of everyday life.
It is now as fundamental a part of the infrastructure as the telephone system
or the road network. The initial driving factor pushing the acceptance and
widespread usage of the Internet was the introduction of the World Wide Web
(WWW) by Tim Berners-Lee in 1989. The WWW provided a way of accessing
information in a novel and intuitive way, and quickly became the Internet “killer
application” [6].
In May 1999, ten years after the advent of the WWW, Shawn Fanning
introduced Napster, arguably the first modern Peer-to-Peer (P2P) application
[53]. The Napster application and protocols were the first to allow users to share
files among each other without the need of a central storage server. Very quickly,
Napster became immensely popular, and the P2P revolution had begun.
Since the advent of Napster, P2P systems have become widespread with
the emergence of file-sharing applications such as Gnutella [46], KaZaA [71] and
eDonkey [27]. These systems generated headlines across the globe when the
Recording Industry Association of America (RIAA) [56] and Motion Picture
Association of America (MPAA) [55] started filing lawsuits against file-sharing
users suspected of copyright infringement. The lawsuits are partly responsible
for the embrace of the term P2P as a euphemism for illegal file-sharing. Fortunately, the concept of P2P networking is broader than that and P2P systems
have many useful legitimate applications.
The P2P paradigm is the logical and functional antithesis of the Client-Server (CS) paradigm that has been the predominant paradigm for IP-based
networks since their inception. This is, however, only true to a certain degree,
as the idea of sharing among equals has been part of the Internet since the
early days of the network. Two examples are the e-mail system employed in the
Internet and the Domain Name System (DNS). Both are so tightly
connected to the inner workings of the Internet that it is impossible to imagine
the degree of usage that the Internet sees today without them. Once an e-mail
has left the user’s mail software, it is routed among mail transfer agents (MTAs),
all acting as equally valued message forwarders. The DNS is the first distributed
information database, and implements a hierarchical mapping scheme, which is
comparable to the multi-layered P2P systems.
The fundamental difference between “legacy” P2P systems such as DNS
and e-mail and the new P2P systems such as Gnutella, Napster
and eDonkey is that the older systems work as part of the network core, while
the new applications are typically application-layer protocols run by edge-node
applications. The shift of the edge nodes from acting purely as service users to
additionally taking on the role of service providers has significantly changed the
characteristics of the network traffic.
1.1 Motivation
Measurement studies and analyses of P2P traffic have been rather limited so far.
This is because of the complexity of the task, which involves answering hard
questions related to data retrieval and content location, storage, data analysis
and modelling of traffic and topological characteristics, as well as privacy and
copyright issues.
Two major points motivate the work performed for this thesis:
• BitTorrent has become extremely popular in recent years. According
to Cachelogic, the BitTorrent traffic volume increased from 26 % to
52 % of the total P2P traffic volume during the first half of 2004 [9].
The increase in the amount of BitTorrent traffic indicates that understanding the characteristics of BitTorrent would also help in understanding the
overall Internet behaviour.
• There are few measurement studies performed on BitTorrent [24, 38, 39].
This is because the protocol is quite new, only a few years old, but also
because of the general complexity of the task. In the few studies that do
exist, traffic has been collected from “trackers” as well as with the help
of modified clients. However, there have been no dedicated measurement
studies at the message level so far.
The main goals of this thesis are to understand the characteristics of the BitTorrent system and, based on that, to develop models suitable for a P2P simulation
environment. To that end, a dedicated measurement system for P2P system
traffic measurements [36] has been designed and implemented.
1.2 Related Work
In general, measurement studies of P2P systems are limited in number. Saroiu
et al. performed a measurement study of Napster and Gnutella in 2002 [69].
Active and passive measurements were performed on both systems. Their results
show the non-cooperativity of peers involved in the systems and several other
characteristics such as estimated peer bandwidths, number of shared files and
resilience.
The present work is one of a few works investigating the properties of the
BitTorrent system. For instance, in [38], the authors use tracker and client
logs to evaluate performance on both global and session scales. They note the
efficiency of the tit-for-tat policy employed in BitTorrent, and the flexibility and
scalability of the protocol.
Qiu and Srikant present a fluid flow model for BitTorrent-like file-sharing
P2P networks. They assume a Poisson peer arrival process and exponentially
distributed download times. Additionally, the authors assume that seeds remain
in the network for an exponentially distributed time and that all peers have
identical download rates. Their results state that the numbers of seeds and
leechers are Gaussian random variables [24] when in steady state.
Nicoll et al. have analysed tracker log files with regard to session sizes
(denoted by file sizes in the paper), peer bandwidth and share ratios. They
note that up to 20 % of peers do not download any data at all, and 20–25 % of
peers connect but do not upload any data. Additionally, the authors point out
that 80 % of peers have a share ratio less than 1, i.e., they download more than
they upload.
The measurement infrastructure developed at Blekinge Institute of Technology
(BIT) is capable of detecting and measuring application layer messages with link
layer accuracy. One drawback with the infrastructure is that it cannot yet do so
in real-time. In [44], Karagiannis et al. present a novel method for identifying
P2P traffic without resorting to application payload decoding. Their method is
based on observing connection patterns of source and destination IP addresses.
To verify the method, the authors also present a payload identification method
using protocol-specific bit strings.
1.3 Main Contributions
The main contributions of this thesis are related to providing accurate models
of several BitTorrent key characteristics. To the best of our knowledge, this is
the first study of this kind.
The reported models include session duration, size and inter-arrival times
as well as rates and inter-arrival times for the two most relevant BitTorrent
application messages.
From a traffic engineering and control viewpoint, the session models reported in Section 9.1 provide an incentive for controlling the number of concurrent
BitTorrent flows. The message characteristics reported in Section 9.2 indicate
that some form of per-message control may also be beneficial to decrease the
burstiness of the network traffic.
Other contributions include the development of a modular P2P measurement
infrastructure. This infrastructure is currently used to measure Gnutella and
BitTorrent traffic with high accuracy on the link layer.
Additionally, the method for assessing model fitness in the case of large sample spaces may prove useful for other modelling scenarios as well (Section 7.3).
It has performed well during the current work, as well as in other published
work [21].
Parts of the work presented in this thesis have previously been
published in [29–31, 36, 37].
1.4 Thesis Outline
This thesis contains ten chapters and two appendices. The current chapter has
presented the motivation for and main contributions of the thesis, along with a
brief presentation of the state of the art in P2P research.
Chapter two contains a short history and description of P2P systems, with
special focus on the most popular application, i.e., file sharing. This is followed
by a detailed description of the BitTorrent system and the associated protocols
in chapter three. Chapter four gives an introduction to traffic measurements and a
brief description of the measurement infrastructure used for this work. Chapter
five discusses traffic modelling in general, and heavy-tailed modelling in particular. Also, tools for describing empirical distributions are presented. Chapter
six summarises some of the most common methods of determining the fitness
of specific distributions. Chapter seven builds on the two preceding chapters to
present the modelling methodology used for the work performed for this thesis.
In Chapter eight, the actual measurements performed are presented, together
with some of the more salient results of these measurements. Chapter nine reports on the models for BitTorrent session and message characteristics. Chapter ten concludes the thesis, with conclusions and implications of the presented
work. Potential future work is also presented.
The appendices contain implementation details for the BitTorrent protocols,
and a description of the XML log format used for the application measurements
presented in Chapter eight.
Chapter 2
Peer-to-peer Protocols
Tvertimot!
– Henrik Ibsen
The concept of P2P protocols, systems and applications is quite broad. The
term P2P commonly refers to applications and systems that share resources
in a distributed and decentralised manner. Participants in these systems are
viewed as logical and functional equals. This is in contrast to pure Client-Server (CS) protocols, where participants either serve resources or are
served resources. A more formal definition of these is provided in Section 2.2.
2.1 Evolution
The earliest recorded use of the term “peer-to-peer” was in 1984. It was related
to the IBM Advanced Peer to Peer Networking (APPN) Architecture [28], which
was the result of multiple enhancements to the Systems Network Architecture
(SNA).
Although early networking protocols such as the Unix to Unix Copy Protocol
(UUCP) [35], Network News Transfer Protocol (NNTP) [43] and Simple Mail
Transfer Protocol (SMTP) [45] worked in a P2P fashion – indeed, the
original ARPANET was designed as a P2P system – the term P2P did not
become mainstream until the appearance of Napster in the fall of 1999.
Napster was the first popular file-sharing P2P service. The main goal of the
service was to provide users with an easy means of finding music files encoded in
the MP3 format. The architecture of Napster was built around a central server
that was used to index music files shared by client nodes. This approach is
called a centralised directory. The centralised directory allowed Napster to give
a very rapid reply to which hosts stored a particular file. The actual file transfer
occurred directly between the node looking for the file and the node storing the
file.
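As a rough illustration of the centralised directory approach, the following Python sketch models a central index that maps file names to the peers sharing them. The class name, method names and peer addresses are hypothetical and are not taken from the Napster protocol. Only the lookup is centralised; the file transfer itself would take place directly between the peers.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy central index: maps a file name to the peers that share it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer, filenames):
        """A peer announces the files it is willing to share."""
        for name in filenames:
            self._index[name].add(peer)

    def unregister(self, peer):
        """Remove a departing peer from every entry in the index."""
        for peers in self._index.values():
            peers.discard(peer)

    def lookup(self, filename):
        """Return the peers storing the file, in sorted order."""
        return sorted(self._index.get(filename, set()))


# Hypothetical peers and file names, for illustration only.
directory = CentralDirectory()
directory.register("peer-a:6699", ["song.mp3", "tune.mp3"])
directory.register("peer-b:6699", ["song.mp3"])
hosts = directory.lookup("song.mp3")
```

If the directory object becomes unavailable, no peer can locate files any longer, even though the files themselves are still stored on the peers.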
The success of Napster quickly became a source of serious concern for major record companies, which rapidly filed a lawsuit against Napster on grounds
of copyright infringement. The lawsuit made Napster immensely popular, attracting additional millions of users to the system. However, Napster could
not withstand the pressure of the lawsuit and in July 2001 it was forced to
shut down the central server. Without the central server, the client nodes could
no longer search for files. Thus, the fragility of a centralised directory system
became apparent.
Napster is one of the first-generation P2P applications as defined by [28]. Following the advent of Napster, several other P2P applications appeared. These
applications were similar in appearance, but altogether different beasts in detail. Gnutella [18], which was released by Justin Frankel of Winamp fame in
early 2000, opted to implement a fully distributed system with no central authority. The same year saw the emergence of the Freenet system, which was
the brainchild of Ian Clarke. Clarke wrote his Master’s thesis on a distributed,
anonymous and decentralised information storage and retrieval system. This
system later became Freenet [17, 65]. Freenet’s major difference from previous
P2P systems was the complete anonymity it offered to users.
The fully distributed architecture was resilient to node failures and was also
immune to service disruptions of the type experienced by Napster. However,
experience with Gnutella has shown that fully distributed P2P systems may
lead to scalability problems due to the massive amounts of signalling traffic
they generate [68].
By late 2000 and early 2001, the P2P boom had started and applications such
as KaZaA [71], DirectConnect [54], SoulSeek [73] and eDonkey [27] appeared.
These new systems usually provided some form of community-like features such
as chat rooms and forums, in addition to the file-sharing services provided by
previous systems.
KaZaA, which uses the FastTrack protocol, introduced the concept of supernodes in order to solve scalability problems similar to those experienced by
Gnutella. Each supernode manages a number of regular nodes and exchanges
information about them with other supernodes. Regular nodes upload file lists
and search requests to their supernode. The search requests are processed solely
among the supernodes. Regular peers establish direct HTTP connections to
download files.
Gnutella resolved the scalability problem in a similar way. In Gnutella,
supernodes are called ultrapeers.
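The two-tier structure described above can be sketched as follows. This is a minimal illustration and not the FastTrack or Gnutella protocol: the class and method names are hypothetical, and forwarding a query a single hop to neighbouring supernodes is an assumption made for brevity.

```python
class Supernode:
    """Toy supernode: indexes its regular nodes and queries its neighbours."""

    def __init__(self, name):
        self.name = name
        self.file_lists = {}   # regular node -> set of shared file names
        self.neighbours = []   # other supernodes (assumed known in advance)

    def upload_file_list(self, node, files):
        """A regular node uploads its file list to its supernode."""
        self.file_lists[node] = set(files)

    def local_hits(self, filename):
        """Regular nodes managed by this supernode that share the file."""
        return sorted(n for n, files in self.file_lists.items()
                      if filename in files)

    def search(self, filename):
        """Answer a search using only supernode-level information."""
        hits = self.local_hits(filename)
        for other in self.neighbours:
            hits.extend(other.local_hits(filename))
        return hits


# Hypothetical topology: two supernodes, one regular node each.
s1, s2 = Supernode("s1"), Supernode("s2")
s1.neighbours.append(s2)
s1.upload_file_list("node-a", ["x.iso"])
s2.upload_file_list("node-b", ["x.iso", "y.iso"])
result = s1.search("x.iso")
```

The regular nodes never see the query traffic; they only upload file lists and later transfer files directly, which is what limits the signalling load compared with flooding every peer.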
During the last few years, the old P2P systems have evolved to better utilise
network resources. New systems have emerged with the specific focus of efficient
bandwidth utilisation. The most significant example of this development is the
BitTorrent system. Furthermore, new systems tend to focus on using Distributed
Hash Tables (DHTs). DHTs force network topology and data storage to follow
specific mathematical structures in order to optimise various parameters (e.g.,
minimise delay or number of hops). They are considered a promising
alternative to the flooding algorithms required by routing in unstructured P2P
networks.
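To illustrate what following a specific mathematical structure can mean in practice, the sketch below places nodes and keys on a hash ring and assigns each key to the first node at or after its position, in the spirit of consistent hashing. It is a toy under stated assumptions: the 32-bit identifier space, the use of SHA-1 and all names are chosen for illustration and do not describe any particular DHT.

```python
import hashlib
from bisect import bisect_left


def ring_position(key, bits=32):
    """Map a string to a position in a 2**bits identifier space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class HashRing:
    """Toy hash ring: a key belongs to the first node at or after it."""

    def __init__(self, nodes):
        self._ring = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def responsible_node(self, key):
        idx = bisect_left(self._positions, ring_position(key))
        # Wrap around to the first node when the key lies past the last one.
        return self._ring[idx % len(self._ring)][1]


# Hypothetical node names and key, for illustration only.
nodes = ["node-1", "node-2", "node-3"]
ring = HashRing(nodes)
owner = ring.responsible_node("example.iso")
```

Because every participant can compute the same mapping locally, a key can be located without flooding the network with queries.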
2.2 Definitions
There is no clear consensus regarding an exact definition of a P2P system.
Schollmeier makes an attempt to define a P2P network in [70]. In general, the
notion of a P2P network seems to be leaning towards some form of utilisation
of edge node resources by other edge nodes. The resource in question
is commonly accepted to be files, and much research is focusing on the efficient
localisation and placement of files. There also seems to be some consensus
regarding the idea of pure and hybrid systems.
A P2P network is defined in [70] as a network in which the service provided by
the system is provided by the participating nodes. The participating nodes contribute
part of their local resource pool, such as disk space, files or CPU processing time,
to the common resource pool. A pure P2P network is one in which any given
participant may be removed without the system experiencing loss of service.
Examples of this type of network are Gnutella, FastTrack and Freenet. A hybrid
P2P network is one in which a central authority of some sort is necessary for the
system to function. Note that, in contrast to the CS model, the central authority
in a hybrid network rarely shares resources – it is still the participating peers
that share resources. The central authority is commonly an indexing server for
files or provides a peer localisation service. Examples of this type of network
are Napster, eDonkey and DirectConnect.
It is also possible to take a resource view of the two types of P2P networks
described above. Consider the three functions of content insertion, distribution
and control and how they are performed in P2P and CS networks (Table 2.1).
Insertion
Insertion is the function of adding content to the resource pool of a
network. Insertion is here referred to in the sense of providing the
content, so that in both pure and hybrid P2P systems content is inserted
by the participating peers. This is analogous to the peers sharing
content. In a CS system, however, content is always provided by the
server, and thus also “shared” by the server.

Distribution
Distribution is the function of retrieving content from a network
resource pool. Again, P2P systems lack central content localisation, thus
content is disseminated in a decentralised fashion. This does not
necessarily mean that parts of the same content are retrieved from
different sources, i.e., swarming, but rather that the parts (e.g., files)
of the total resource pool are retrieved from different sources.
Hybrid CS systems refer to redundant server systems and content delivery
networks (CDNs), such as Akamai [1]. Redundant server systems are systems
in which several servers provide the same content, but are accessed by the
requesting client from a single Uniform Resource Identifier (URI). This is
a common model for WWW servers in the Internet today.

Control
Control is the function of managing the resource pool of a network, such
as admission control and resource localisation. This is the function that
separates the two types of P2P networks. The peers participating in fully
decentralised networks are required to assist in the control mechanisms in
the network, while hybrid systems may rely on a central authority for
this. Of course, the clients in CS systems have no responsibility towards
the network control functionality.
Table 2.1: P2P and CS content models. C denotes centralised and D denotes decentralised.

              Pure P2P   Hybrid P2P   Hybrid CS   Pure CS
Insertion     D          D            C           C
Distribution  D          D            C/D         C
Control       D          C            C           C
In addition to the definitions provided above, P2P systems can also be classified according to their “generation” [28]. In this classification scheme, hybrid
systems such as Napster are considered to be first generation systems, while
fully decentralised systems such as FastTrack and Gnutella are second generation systems. A third generation is discussed as being an improvement upon the
first two with respect to features such as redundancy, reliability or anonymity.
2.3 P2P and File Sharing
File sharing is almost as old as operating systems themselves. Early methods for
sharing files include protocols such as the UNIX remote copy (rcp) command
and the File Transfer Protocol (FTP) [64]. They were quickly followed by full-fledged network file systems such as NFS [52,72] and CIFS [2]. A common trait
of these protocols (with the exception of rcp) is that they were designed around
the CS paradigm, with the servers being the entity storing and serving files. A
client that wants to share files must upload them to the server to make them
available to other clients.
Instant messaging systems such as ICQ [3], Yahoo! Messenger [7] and MSN
Messenger [4] attempted to provide a file sharing service by implementing a mechanism similar to rcp. Users could thus share files with each other without having
to store them on a central server. In fact, this was the first form of P2P file
sharing. Napster further extended this idea by implementing efficient file search
facilities.
In the public eye, P2P is synonymous with file sharing. However, other
applications that may be termed P2P have become fairly popular as well, such
as the SETI@home project [75], distributed.net [26] and ZetaGrid [84]. These
applications have been fairly successful in attracting a user-base, but none of
them come close to the number of users that the file sharing services have.
These services are examples of altruistic systems. The participating peers
provide CPU processing power and time to a common resource pool without
deriving personal benefit from this. The pooled CPU resources are then used to
perform various complex calculations such as calculating fast Fourier transforms
of galactic radio data, code-breaking or finding roots of the Riemann zeta function.
A possible reason for the difference in number of users could be that the
incentive to altruistically share resources without gaining anything other than
some virtual fame or feel-good points of having contributed to the greater good
of humanity seems to be low. Most file sharing P2P systems employ some form of
admission scheme in which peers are not allowed to join the system or download
from it unless they are sharing an adequate amount of files. This provides a
dual incentive: first, a peer wanting to join the network must (see footnote 1) provide a sort of
entry token in the form of shared files, and second, peers joining the system
know that there is a certain amount of content provided to them once they join.
The BitTorrent P2P system is one of the most prominent networks enforcing such
incentives.
As not all files are equally desirable in every system, files not belonging
to the general category of files handled in a specific P2P network should not
be allowed in. For instance, users of a network such as Napster, which only
manages digital music files, might not be interested in peers sharing text files.
For systems that require a large amount of file data to be shared as an admission
scheme, this becomes a problem. Peers may share “junk files” just to gain access
to the network. Junk files are files that are not really requested or desired in
the network. These practices are usually frowned upon, but are hard to get to
grips with. Some systems, such as eDonkey, have implemented a rating system,
in which peers are punished for sharing junk files.
Similar to junk files, there are also “fakes” or “decoys”. Fakes are files inserted in the network that masquerade under a filename that does not represent
the actual content, or files that contain modified versions of the same content.
By adding fakes into the network, the real content is made more difficult to
find. This problem is alleviated by using various hashing techniques for the files
instead of only relying on the filenames to identify the content. An example of
this is the insertion of a faked Madonna single, in which the artist had overlaid
the phrase “What the hell do you think you’re doing?” on top of her newly
released single. Often, fakes are not as immediately apparent as this, and some
form of user feedback is useful. For instance, the eDonkey system implements
a reputation system for files. Decoys are often automatically generated from
incoming queries to pollute the P2P networks. While decoys do not pollute the
actual resource pool of the network, they can have the effect of valid queries
being ignored or de-emphasised.
Footnote 1: Not in all systems, but in most hybrid systems.
While file sharing in and of itself is not an illegal technology and has several
non-copyright infringing uses, the ease with which peers may share copyrighted
material has drawn the attention of the MPAA and RIAA. These organisations
consider the sharing of material under the copyrights of their members as seriously harming their revenue streams, by decreasing sales. In 2004, the MPAA
and RIAA began suing individuals for sharing copyrighted material. However,
not all copyright holders and artists agree on this course of action, nor do they
agree on the detrimental effect file sharing has on sales or artistic expression.
Several smaller record labels have embraced the distribution of samples of their
artists’ music online, and artists have formed coalitions against what they feel
is the oppressive behaviour of the larger record labels.
More recently, P2P systems have been used by corporations to distribute
large files such as Linux distributions, game demos and patches. Many companies make use of the BitTorrent system for this, as it provides for substantial
savings in bandwidth costs.
2.4 Summary
This chapter has discussed the history and evolution of P2P systems. The first
P2P systems were classic Internet services such as the DNS or e-mail systems.
More modern systems include Gnutella and eDonkey. Currently, P2P systems
are usually categorised as either pure or hybrid systems.
Additionally, the most popular P2P service, file-sharing, has been discussed.
File-sharing is the major application for P2P protocols, and is used in both
commercial and personal applications.
Chapter 3
The BitTorrent Protocol
Anyone who considers protocol unimportant
has never dealt with a cat.
– Robert A. Heinlein
BitTorrent is a P2P protocol for content distribution and replication designed to
quickly, efficiently and fairly replicate data [15,19]. The BitTorrent system may
be viewed as being comprised of two protocols and a set of resource meta-data.
The two protocols are for communication among peers and for the communication with a central network entity called the tracker. The meta-data provides
all information needed for a peer to join a BitTorrent distribution swarm and
to verify correct reception of the resource.
The following terminology is used in this thesis: a BitTorrent swarm refers
to all network entities partaking in the distribution of a specific resource. The
peer–peer protocol is referred to as the BitTorrent protocol, or simply the
protocol, while the peer–tracker communication is explicitly referred to as the
tracker protocol. The collection of protocols (peer, tracker and meta-data) is
referred to as the BitTorrent protocol suite or protocol suite.
In contrast to many other P2P protocols such as eDonkey, DirectConnect and
KaZaA, the BitTorrent protocol suite provides neither resource query or lookup
functionality, nor chat, messaging or topology formation facilities. The protocols
rather focus on fair and effective distribution of data. The signalling is geared
towards an efficient dissemination of data only.
Fairness in the BitTorrent system is implemented by enforcing tit-for-tat
exchange of content between peers. Non-uploading peers are only allowed to
download very small amounts of data, making the download of a complete resource very time consuming if a peer does not share downloaded parts of the
resource.
With one exception (Section 3.3.2), the protocols operate over TCP and use
swarming, i.e., peers simultaneously downloading parts of the content, called
pieces, from several peers. The rationale for this is that it is more efficient in
terms of network load, as the load is shared across links between peers. This
results in a more evenly distributed network utilisation than in the case of
conventional CS distribution systems.
The size of the pieces is fixed on a per-resource basis and may not be changed
without generating a new meta-data file. The default piece size is 2^18 bytes, i.e.,
256 kB. The selection of an appropriate piece size is a fairly important issue. If
the piece size is small, re-downloading a failed piece is fast, while the amount
of extra data needed to describe all the data grows. On the other hand, larger
piece sizes mean less meta-data, but longer re-download times.
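The trade-off can be illustrated with some simple arithmetic. The following is a minimal sketch, assuming one 20-byte SHA-1 value per piece (see Section 3.2); the function name and the 700 MiB resource size are hypothetical and chosen for illustration only.

```python
SHA1_DIGEST_BYTES = 20  # assumption: one SHA-1 value is stored per piece


def piece_overhead(resource_bytes: int, piece_bytes: int) -> tuple[int, int]:
    """Return (number of pieces, bytes of piece hashes) for a resource."""
    # Integer ceiling division: the last piece may be shorter than the others.
    pieces = (resource_bytes + piece_bytes - 1) // piece_bytes
    return pieces, pieces * SHA1_DIGEST_BYTES


if __name__ == "__main__":
    resource = 700 * 2**20  # hypothetical 700 MiB resource
    for exponent in (16, 18, 22):  # 64 KiB, 256 KiB (the default) and 4 MiB pieces
        pieces, hash_bytes = piece_overhead(resource, 2**exponent)
        print(f"piece size 2^{exponent}: {pieces} pieces, {hash_bytes} bytes of hashes")
```

With the default piece size, the hypothetical resource is described by 2800 hash values, whereas 4 MiB pieces need only 175, at the cost of re-downloading 4 MiB whenever a piece fails verification.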
3.1 BitTorrent Encoding
BitTorrent uses a simple encoding scheme for most protocol messages and associated data. This encoding scheme is known as bencoding. The scheme allows
for data structuring and type definition, and currently supports four data types:
strings, integers, lists and dictionaries. These are detailed in Section A.1 in the
Appendix.
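As an illustration of the four data types, a bencoder may be sketched as below. The sketch assumes the commonly published encoding rules (strings as length:data, integers as i...e, lists as l...e and dictionaries as d...e with sorted keys); Section A.1 remains the authoritative description, and the function name bencode is merely illustrative.

```python
def bencode(value) -> bytes:
    """Encode a Python value using the four bencoding data types."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("booleans have no bencoded form")
    if isinstance(value, int):
        return b"i%de" % value                      # integer: i<digits>e
    if isinstance(value, str):
        value = value.encode("utf-8")               # assumption: text is UTF-8
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)       # string: <length>:<data>
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are strings and are emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda item: item[0],
        )
        body = b"".join(bencode(k) + bencode(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


if __name__ == "__main__":
    print(bencode({"length": 663459840, "name": "example.iso"}))
```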
3.2 Resource Meta-data
A peer interested in downloading some content by using BitTorrent must first
obtain a set of meta-data, the so-called torrent file, to be able to join a set of
peers engaging in the distribution of the specific content. The meta-data needed
to join a BitTorrent swarm consists of the network address information (in
BitTorrent terminology called the announce URL) of the tracker and resource
information such as file and piece size. The torrent file itself is a bencoded
version of the associated meta information.
An important part of the resource information is a set of Secure Hash Algorithm One (SHA-1) [8, 57] hash values (see footnote 1), each value corresponding to a specific
piece of the resource. The hash values are used to verify the correct reception of
a piece. When rejoining a swarm, the client must recalculate the hash for each
downloaded piece. This is a very intensive operation with regards to both CPU
usage and disk I/O, which has resulted in certain alternative BitTorrent clients
storing information regarding which pieces have been successfully downloaded
within a specific field in the torrent file.
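The verification step may be sketched as follows. The sketch assumes that the per-piece hash values are available as one concatenated byte string of 20-byte SHA-1 values, which is how they are commonly stored; the function names are hypothetical.

```python
import hashlib

DIGEST_BYTES = 20  # size of one SHA-1 value


def piece_is_valid(index: int, data: bytes, piece_hashes: bytes) -> bool:
    """Compare the SHA-1 of a received piece with the value in the meta-data."""
    expected = piece_hashes[index * DIGEST_BYTES:(index + 1) * DIGEST_BYTES]
    return len(expected) == DIGEST_BYTES and hashlib.sha1(data).digest() == expected


def pieces_to_resume(pieces: list[bytes], piece_hashes: bytes) -> list[int]:
    """Indices of locally stored pieces that pass verification.

    This is the costly recalculation performed when rejoining a swarm:
    every stored piece is read from disk and hashed again.
    """
    return [i for i, data in enumerate(pieces) if piece_is_valid(i, data, piece_hashes)]
```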
A separate SHA-1 hash value, the info field, is also included in the meta-data. This value is used as an identification of the current swarm, and the hash
value appears in both the tracker and peer protocols. The value is obtained
by hashing the entire meta-data (except the info-field itself). Of course, if
a third-party client has added extra fields to the torrent file that may change
intermittently (such as the resume data or cached peer addresses), these should
not be taken into account when calculating the info-field hash value.
The meta-data as defined by the original BitTorrent design does not contain
any information regarding the peers participating in a swarm, though this information is added by some alternative clients to lessen strain on trackers when
rejoining a swarm. This feature allows the peer to continue the download in
case of tracker failure.
Footnote 1: These are also known as message digests.
3.3 Network Entities and Protocols
A BitTorrent swarm is composed of peers and at least one tracker. The peers
are responsible for content distribution among each other. Peers locate other
peers by communicating with the tracker, which keeps peer lists for each swarm.
A swarm may continue to function even after the loss of the tracker, but no new
peers are able to join.
To be functional, the swarm initially needs at least one connected peer to
have the entire content. These peers are denominated as seeds, while peers
that do not have the entire content, i.e., downloading peers, are denominated
as leechers.
The BitTorrent protocols (except the meta-data distribution protocol) are
the tracker protocol and the peer protocol. The tracker protocol is either an
HTTP-based protocol or a UDP-based compact protocol, while the peer protocol is a BitTorrent-specific binary protocol. Peer-to-tracker communication
usually takes place using HTTP, with peers issuing HTTP GET requests and
the tracker returning the results of the query in the returning HTTP response.
The purpose of the peer request to the tracker is to locate other peers in the
distribution swarm and to allow the tracker to record simple statistics of the
swarm. The peer sends a request containing information about itself and some
basic statistics to the tracker, which responds with a randomly selected subset
of all peers engaged in the swarm.
3.3.1 The Peer Protocol
The peer protocol, also known as the peer wire protocol, operates over TCP,
and uses in-band signalling. Signalling and data transfer occur in the form of
a continuous bi-directional stream of length-prefixed protocol messages over a
common TCP byte stream.
A BitTorrent session is equivalent to a TCP session, and there are no
protocol entities for tearing down a BitTorrent session beyond the TCP tear-down
itself. Connections between peers are single TCP sessions, carrying both
data and signalling traffic.
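The length-prefixed framing can be illustrated with a short sketch. It assumes the commonly published wire format, in which each message starts with a four-byte big-endian length, followed by a one-byte message identifier and an optional payload, and in which a length of zero denotes a keep-alive. The function names are hypothetical; the message details of Section A.2 take precedence.

```python
import struct


def encode_message(message_id: int, payload: bytes = b"") -> bytes:
    """Frame one message: 4-byte length, 1-byte id, then the payload."""
    return struct.pack(">IB", 1 + len(payload), message_id) + payload


def decode_messages(buffer: bytes):
    """Split received bytes into complete messages.

    Returns (messages, remainder), where remainder holds the bytes of a
    message that has not yet been received completely from the TCP stream.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break  # wait for more data
        body = buffer[offset + 4:offset + 4 + length]
        offset += 4 + length
        if length == 0:
            messages.append((None, b""))          # keep-alive carries no id
        else:
            messages.append((body[0], body[1:]))  # (message id, payload)
    return messages, buffer[offset:]
```

Because signalling and data share the same byte stream, a receiver has to buffer partial messages, which is why the decoder returns the unconsumed remainder.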
Once a TCP connection between two peers is established, the initiating peer
(Peer A in Figure 3.1) sends a handshake message containing the peer id and
info field hash (Figure 3.1). If the receiving peer (Peer B) replies with the
corresponding information, the BitTorrent session is considered to be opened
and the peers start exchanging messages across the TCP streams. Otherwise,
the TCP connection is closed. Immediately following the handshake procedure,
each peer sends information about the pieces of the resource it possesses. This
is done only once, and only by using the first message after the handshake. The
information is sent in a bitfield message, consisting of a stream of bits, with
each bit index corresponding to a piece index.
Figure 3.1: BitTorrent handshake procedure. (Sequence diagram: Peer A and Peer B exchange the info hash and their peer ids, followed by the bitfield exchange and the general message exchange.)
The BitTorrent peer protocol messages are described in Section A.2 of the
Appendix.
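The handshake and the bitfield interpretation may be sketched as below. The byte layout is an assumption taken from the commonly published peer wire format (a length byte, the string "BitTorrent protocol", eight reserved bytes, the 20-byte info hash and the 20-byte peer id, with the most significant bit of the first bitfield byte denoting piece 0); the function names are hypothetical.

```python
PROTOCOL_NAME = b"BitTorrent protocol"


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Build the handshake sent directly after the TCP connection is set up."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info hash and peer id must both be 20 bytes")
    reserved = bytes(8)  # eight zero bytes
    return bytes([len(PROTOCOL_NAME)]) + PROTOCOL_NAME + reserved + info_hash + peer_id


def pieces_from_bitfield(bitfield: bytes, piece_count: int) -> list[int]:
    """Return the indices of the pieces a peer announces in its bitfield."""
    have = []
    for index in range(piece_count):
        byte = bitfield[index // 8]
        if byte & (0x80 >> (index % 8)):  # bit index corresponds to piece index
            have.append(index)
    return have
```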
3.3.2 The Tracker Protocol
The tracker is accessed by HTTP or HTTPS GET requests. The default listening port is 6969. The tracker address, port and top-level directory are specified
in the announce url field in the torrent file for a specific swarm.
Tracker Queries
Tracker queries are encoded as part of the GET URL, in which binary data such
as the info_hash and peer_id fields are escaped as described in RFC1738 [14].
The query is added to the base URL by appending a question-mark, ?, as
described in RFC2396 [13].
The query itself is a sequence of parameter=value pairs, separated by ampersands, &, and possibly escaped. An example of a tracker request is given in
Figure 3.2. The \-characters indicate that the line continues on the following
line.
GET /announce?info_hash=n%05hV%A9%BA%20%FC%29%12%1Ap%D4%12%5D%E6U%0A%85%E1&\
peer_id=M3-4-2--d0241ecc3a07&port=6881&key=0fcca260&uploaded=0&downloaded=0&\
left=663459840&compact=1&event=started HTTP/1.0
Figure 3.2: Example tracker announce GET request
A complete list of parameters is given in Section A.3 of the Appendix.
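A request such as the one in Figure 3.2 can be assembled with the Python standard library, as sketched below. The tracker host name is a placeholder and the function name is hypothetical; only the parameters visible in Figure 3.2 (except key) are included.

```python
from urllib.parse import quote, urlencode


def build_announce_url(base_url: str, info_hash: bytes, peer_id: bytes, port: int,
                       uploaded: int, downloaded: int, left: int,
                       event: str = "started") -> str:
    """Append an escaped parameter=value query to the announce URL."""
    params = {
        "info_hash": info_hash,   # binary data, percent-escaped below
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "compact": 1,
        "event": event,
    }
    # quote (rather than the default quote_plus) escapes a space as %20.
    query = urlencode(params, quote_via=quote, safe="")
    return f"{base_url}?{query}"


if __name__ == "__main__":
    info_hash = bytes.fromhex("6e056856a9ba20fc29121a70d4125de6550a85e1")
    print(build_announce_url("http://tracker.example.com:6969/announce",
                             info_hash, b"M3-4-2--d0241ecc3a07",
                             6881, 0, 0, 663459840))
```

The 20 bytes used in the usage example are the ones obtained by unescaping the info_hash value of Figure 3.2.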
Tracker Replies
The tracker HTTP response is, unless the compact parameter is 1, a bencoded
dictionary. The contents of the reply are listed in Section A.3.3 of the Appendix.
If the compact parameter is set to 1, then the reply is a binary list of peer
addresses and ports. This list is encoded as a six-byte datum for each peer,
in which the first four bytes are the IP address of the peer, and the last two
bytes are the peer’s listening port (Figure 3.3). This saves bandwidth, but is
only usable in an IPv4 environment. There is no equivalent compact format for
IPv6.
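Decoding the compact format amounts to slicing the reply into six-byte entries. The sketch below assumes that both the address and the port are in network (big-endian) byte order, which the text above does not state explicitly; the function name is hypothetical and the addresses in the usage example are documentation addresses.

```python
import ipaddress
import struct


def parse_compact_peers(data: bytes) -> list[tuple[str, int]]:
    """Decode a compact peer list into (IPv4 address, port) pairs."""
    if len(data) % 6 != 0:
        raise ValueError("compact peer list must be a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(data), 6):
        # Four address bytes followed by a two-byte unsigned port number.
        raw_ip, port = struct.unpack_from(">4sH", data, offset)
        peers.append((str(ipaddress.IPv4Address(raw_ip)), port))
    return peers


if __name__ == "__main__":
    reply = bytes([192, 0, 2, 1, 0x1A, 0xE1, 198, 51, 100, 7, 0x1B, 0x39])
    print(parse_compact_peers(reply))
```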
If the request fails for some reason, the dictionary contains only a single key:
failure reason, indicating the reason for the failed request.
Figure 3.3: Compact tracker response. (Each peer entry occupies 48 bits: bits 0-31 hold the IP address of the peer and bits 32-47 its port; the entries for Peer 1 to Peer n follow one another.)
Tracker UDP Protocol Extension
To lower the bandwidth usage for heavily loaded trackers, a UDP-based tracker
protocol has been proposed [77].
The UDP tracker protocol is not part of the official BitTorrent specification,
but has been implemented in some of the third-party clients and trackers.
Compared to the standard HTTP-based protocol, the UDP protocol uses
about 50 % less bandwidth. It also has the advantage of being stateless, as
opposed to the stateful TCP connections required by the HTTP scheme. This
means that a tracker is less likely to run out of resources due to, for instance,
half-open TCP connections.
The Scrape Convention
BitTorrent trackers commonly include simple HTTP servers to provide information on the swarms they track. Web scraping denotes the procedure of parsing
a Web page to extract information from it. It is a fall-back method of obtaining
information when other methods fail or are not available. The BitTorrent variant is a bit different, as it is a way for peers to gain information on a specific
swarm without actually joining the swarm.
Trackers can implement functionality to allow peers to request information
regarding a specific swarm without resorting to error-prone Web-scraping techniques.
If the last name in the announce URL, i.e., the name after the last /-character,
is announce, then the tracker supports scraping by using the announce URL
with the name announce replaced by scrape.
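The rule can be expressed in a few lines. The sketch follows the convention exactly as stated above, i.e., it requires the last name to be exactly announce; the function name and the tracker host are hypothetical.

```python
from typing import Optional


def scrape_url(announce_url: str) -> Optional[str]:
    """Derive the scrape URL, or return None if scraping is not supported."""
    head, slash, last_name = announce_url.rpartition("/")
    if not slash or last_name != "announce":
        return None  # the convention does not apply to this tracker
    return head + "/scrape"


if __name__ == "__main__":
    print(scrape_url("http://tracker.example.com:6969/announce"))
```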
The scrape request may contain an info_hash parameter, as shown in Figure 3.4, or be completely without parameters.
GET /scrape?info_hash=n%05hV%A9%BA%20%FC)%12%1Ap%D4%12%5D%E6U%0A%85%E1 HTTP/1.0
Figure 3.4: Example tracker scrape GET request
The tracker responds with a bencoded dictionary containing information
about all the swarms that the tracker is currently tracking. The dictionary has
a single key, named files. This key contains another dictionary whose keys
are the 20-byte binary info_hash values of the torrents on the specific tracker.
Each value of these keys contains another dictionary with information about
the specific swarm. The contents of this dictionary are given in Section A.4 of
the Appendix.
3.4 Peer States
A peer maintains two states for each peer relationship. These states are known
as the interested and choked states. The interested state is imposed by the
requesting peer on the serving peer, while for the case of the choked state the
opposite is true. If a peer is being choked, then it will not be sent any data by
the serving peer until unchoking occurs. Thus, unchoking is usually equivalent
to uploading.
The interested state indicates whether other peers have parts of the sought
content. Interest should be expressed explicitly, as should lack of interest. That
means that a peer wishing to download notifies the sending peer (where the
sought data is) by sending an interested message, and as soon as the peer no
longer needs any other data, a not interested message is issued. Similarly,
for a peer to be allowed to download, it must have received an unchoke message
from the sending peer. Once a peer receives a choke message, it will no longer
be allowed to download. This allows the sending peer to keep track of the
peers that are likely to immediately start downloading when unchoked. A new
connection starts out choked and not interested, and a peer with all data, i.e.,
a seed, is never interested.
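The bookkeeping for one peer relationship may be sketched as a small state holder. The sketch assumes, as is common client practice, that the two states are tracked separately for each direction of the relationship, giving four flags; the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerLink:
    """State kept by the local peer for one remote peer."""
    am_choking: bool = True        # local peer is choking the remote peer
    am_interested: bool = False    # local peer wants data from the remote peer
    peer_choking: bool = True      # remote peer is choking the local peer
    peer_interested: bool = False  # remote peer wants data from the local peer

    def on_message(self, name: str) -> None:
        """Update the state from a message received from the remote peer."""
        if name == "choke":
            self.peer_choking = True
        elif name == "unchoke":
            self.peer_choking = False
        elif name == "interested":
            self.peer_interested = True
        elif name == "not interested":
            self.peer_interested = False

    def may_download(self) -> bool:
        """Downloading requires expressed interest and an unchoke."""
        return self.am_interested and not self.peer_choking
```

The default values reflect that a new connection starts out choked and not interested.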
In addition to the two states described above, some clients add a third state
– the snubbed state. A peer relationship enters this state when a peer purports
that it is going to send a specific sub-piece, but fails to do so before a timeout
occurs (typically 60 seconds). The local peer then considers itself snubbed by
the non-cooperating peer, and will not consider sub-pieces requested from this
peer to be requested at all. The snubbed state is reconsidered from time to
time.
3.5 Sharing Fairness and Bootstrapping
The choke/unchoke and interested/not interested mechanism provides fairness
in the BitTorrent protocol. Since it is the transmitting peer that decides whether
to allow a download or not, peers not sharing content tend to be reciprocated
in the same manner.
To allow peers that have no content to join the swarm and start sharing,
a mechanism called optimistic unchoking is employed. Optimistic unchoking
means that from time to time, a peer with content will allow even a non-sharing
peer to download. This will allow the peer to share the small portions of data
received so far and thus enter into a data exchange with other peers.
This means that, while sharing resources is not strictly enforced, it is strongly
encouraged. It also means that peers that have not been able to configure
their firewalls and/or Network Address Translation (NAT) routers properly will
only be able to download the pieces altruistically shared by peers through the
optimistic unchoking scheme.
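A possible selection routine is sketched below. It is an illustration under stated assumptions rather than the algorithm of any particular client: peers are ranked by the rate at which they upload to the local peer, the number of regular unchoke slots (three) is hypothetical, and exactly one additional peer is unchoked optimistically at random.

```python
import random
from typing import Optional


def select_unchoked(upload_rates: dict[str, float], regular_slots: int = 3,
                    rng: Optional[random.Random] = None) -> set[str]:
    """Pick the peers to unchoke among the interested peers.

    upload_rates maps a peer name to the rate at which that peer has
    recently uploaded to the local peer (tit-for-tat reciprocation).
    """
    rng = rng or random.Random()
    ranked = sorted(upload_rates, key=upload_rates.get, reverse=True)
    unchoked = set(ranked[:regular_slots])  # reciprocate the best uploaders
    others = ranked[regular_slots:]
    if others:
        # Optimistic unchoke: give one other peer, possibly one without
        # any content yet, a chance to download and start sharing.
        unchoked.add(rng.choice(others))
    return unchoked
```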
3.6 Data Transfer
Data transfer is performed in parts of a piece (called subpiece, block or chunk)
at a time, by issuing a request message. Subpieces are typically
16384 or 32768 bytes. The subpiece size is not part of the protocol, and may be
chosen at the discretion of the requesting peer.
To allow TCP to increase throughput, several requests are usually sent back-to-back. Each request should result in the corresponding subpiece being transmitted. If the subpiece is not received within a certain time (typically one
minute), the non-transmitting peer is snubbed, i.e., is punished by not being
allowed to download, even if unchoked. Data transfer is performed by sending
a piece message, which contains the requested subpiece (Figure 3.5). Once the
entire piece, i.e., all subpieces, has been received, and the SHA-1 hash of the
piece has been verified against the corresponding hash value in the meta-data, a have
message is sent to all connected peers.
The have message allows other peers in the swarm to update their internal
information on which pieces are shared by specific peers in the swarm.
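How a piece is divided into requests can be shown with a short sketch. It assumes that each request identifies a subpiece by piece index, byte offset within the piece and length; the function name is hypothetical, and the 16384-byte default is one of the typical sizes mentioned above.

```python
def subpiece_requests(piece_index: int, piece_length: int,
                      subpiece_length: int = 16384) -> list[tuple[int, int, int]]:
    """Return (piece index, offset, length) triples covering one piece."""
    requests = []
    for begin in range(0, piece_length, subpiece_length):
        # The final subpiece is shorter when the piece length is not a multiple.
        length = min(subpiece_length, piece_length - begin)
        requests.append((piece_index, begin, length))
    return requests


if __name__ == "__main__":
    # A default-sized piece of 2^18 bytes is fetched as 16 subpieces.
    print(len(subpiece_requests(0, 2**18)))
```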
3.6.1 End-game Mode
When a peer is approaching completion of the download, it sends out requests
for the remaining data to all currently connected peers to quickly finish the
download. This is known as the end-game mode. Once a requested subpiece is
received, the peer sends out cancel-messages to all peers that have not yet sent
the requested data.
Without the end-game mode, there is a tendency for peers to download the
final pieces from the same peer, which may be on a slow link [20].
Figure 3.5: BitTorrent protocol exchange. (Sequence diagram between Peer A, Peer B and Peer C showing interested, unchoke, request(piece,subpiece), piece(subpiece) and have messages.)
3.7 BitTorrent Performance Issues
Even though BitTorrent has become very popular among home users, and widely
deployed in corporate environments, there are still some issues currently being
addressed for the next version of BitTorrent.
The most pressing issue regards the load on the central tracker authority.
There are two main problems related to the tracker: peak load and redundancy.
Many trackers also handle more than a single swarm. The most popular trackers
handle several hundred swarms simultaneously. It is not uncommon for popular
swarms to contain hundreds or even thousands of peers. Each of these peers
connects to the tracker every 30 minutes by default to request new peers and
provide transfer statistics. An initial peer request to the tracker results in
about 2-3 kilobytes of response data. If these requests are evenly spread out
temporally, the tracker can usually handle the load. However, if a particularly
desired resource is made available, this may severely strain the tracker, as it will
be subject to a mass accumulation of connections akin to a distributed denial
of service attack by requesting peers. This is also known as the flash-crowd
effect [39].
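A back-of-the-envelope estimate of the steady-state load follows from the figures above. The sketch assumes evenly spread requests, a hypothetical tracker handling 360 swarms of 1000 peers each, and 2500 bytes per response; the latter applies the initial-request figure to every request and is thus a rough upper bound.

```python
def tracker_load(swarms: int, peers_per_swarm: int,
                 announce_interval_s: int = 1800,
                 response_bytes: int = 2500) -> tuple[float, float]:
    """Return (requests per second, response bytes per second)."""
    peers = swarms * peers_per_swarm
    requests_per_s = peers / announce_interval_s  # one request per peer per interval
    return requests_per_s, requests_per_s * response_bytes


if __name__ == "__main__":
    rate, volume = tracker_load(swarms=360, peers_per_swarm=1000)
    print(f"{rate:.0f} requests/s, {volume / 1000:.0f} kB/s of responses")
```

Under these assumptions the tracker serves about 200 requests per second. In a flash crowd the same number of peers may arrive within minutes rather than being spread over the 30-minute interval, which multiplies the instantaneous request rate accordingly.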
It is imperative for a swarm to have a functioning tracker if the swarm
is to gain new peers as, without the tracker, new peers have no location to
receive new peer addresses. Tracker redundancy is currently being explored
and two alternatives are studied: backup trackers and distributing the tracking
functionality in the swarm itself. An extension exists to the current protocol
that adds a field, announce-list, to the meta-data, which contains URLs to
alternate trackers. No good way of distributing the tracking in the swarm
has yet been found, but a network of distributed trackers has been proposed.
Proposals of peers sending their currently connected peers to each other have
also cropped up, but again, no consensus has been reached. Additionally,
Distributed Hash Table (DHT) functionality has been implemented in third
party clients to address this problem [11]. A beta version of the reference client
also has support for DHT functionality.
Another important problem is the initial sharing delay problem. If a torrent
has large piece sizes, e.g., larger than 2 MB, the time before a peer has downloaded an entire piece and can start sharing the piece might be quite substantial.
It would be preferable to have the ability to have varying verification granularities for the data in the swarm, so that a downloading peer does not have to wait
for an entire piece to begin calculating the hashes of the data. One way to do
this would be to use a mechanism known as Merkle trees [16], which allow for
varying granularity. By using this mechanism, a peer may start sharing after
having downloaded only a small amount of the data (on about the same order
as the subpiece sizes).
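The principle of a hash tree can be sketched as follows. This is a generic binary Merkle tree over SHA-1 and not a description of any specific BitTorrent extension; the choice of duplicating the last node on levels with an odd number of nodes is an assumption made to keep the example short.

```python
import hashlib


def merkle_root(blocks: list[bytes]) -> bytes:
    """Compute the root hash of a binary hash tree over the given blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [hashlib.sha1(block).digest() for block in blocks]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: pad odd levels by duplication
        # Each parent is the hash of its two concatenated children.
        level = [hashlib.sha1(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

With such a tree, a single block can be verified against the root using only the sibling hashes along its path, so a peer does not need the entire piece before it can check, and share, a block.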
3.7.1 Super Seeding
When a swarm is fairly new, i.e., there are few seeds in the swarm and peers have
little of the shared resource, it makes sense to try to evenly distribute the pieces
of the content to the downloading peers. This will speed up the dissemination
of the entire content in the swarm. A normal seed would announce itself as
having all pieces during the initial handshaking procedure, thus leaving the
piece selection up to the downloading peer. Seeds have usually been in the
swarm longer. This means that they are likely to have a better view on which
pieces are the most rare in the swarm, and thus most suitable to be first inserted.
As soon as peers start receiving the rare pieces, other peers can download them
from other peers instead of seeds. This further balances the load in and increases
the performance of the swarm.
A seed that employs super seeding does not advertise having any pieces at
all during handshake. As peers connect to the in effect hidden seed, it instead
sends have-messages on a per-peer basis to entice specific peers to download a
particular piece.
This mechanism is most effective in new swarms, or when there is a high
peer-to-seed ratio and the peers have little data. It is not recommended for
everyday use.
As certain peers might have heuristics governing which swarms to be part
of, a swarm containing only super seeders might be discarded. This is because
peers cannot detect the super seeder as a seeder, thus assuming that the swarm
is unseeded. This decreases the overall performance of the swarm.
3.8 Summary
This chapter has discussed the BitTorrent system in detail. It is a swarming
content replication and distribution system. The BitTorrent system consists of
peers and trackers. Peers are either leechers, i.e., downloading peers, or seeds,
i.e., uploading peers. Data content is described by a torrent file, using a specific binary encoding known as bencoding. Data transfer occurs among peers
in a swarming fashion, i.e., peers share parts of the content among themselves.
Each content part has an associated SHA-1 hash to enable verification of downloaded data. Peers that do not share data are punished by not being allowed to
download.
Furthermore, protocol performance issues and current developments have
been discussed. Primarily, the load on the tracker has been identified as a
potential bottleneck.
Chapter 4
Traffic Measurements
Science is the observation of things possible, whether present or past; prescience is
the knowledge of things which may come to
pass, though but slowly.
– Leonardo da Vinci
Depending upon the domain, traffic measurements may serve different purposes.
For example, an Internet Service Provider (ISP) may benefit from measuring
the amount of outgoing traffic to estimate pricing and the services provided. A
company providing Voice over IP (VoIP) services may want to measure latencies
with high accuracy to ensure a certain degree of Quality of Service (QoS), while
a Web hosting provider may be more interested in metrics such as the number of
requests per time unit. On the other hand, manufacturers of network hardware
such as routers and switches use real-world measurements to test the behaviour
of the hardware under realistic conditions without deploying them.
There are four main reasons for the usefulness of network traffic measurements: network troubleshooting, protocol debugging, workload characterisation
and performance evaluation [82]. For the present work, only the last two are
considered, with a strong emphasis on workload characterisation.
4.1
Measurement Approaches
There are two main approaches to traffic measurements: active and passive measurements. Active measurements entail actively probing a network, either with
artificially generated traffic or by having a node join the network as an active
participant. Probing with artificial traffic is somewhat analogous to system
identification using impulses in for instance vibration experiments or acoustical environments. A passive measurement is one where the network is silently
monitored without any intrusion.
4.1.1
Passive Measurements
Passive measurements are commonly used when data on “real” networks is
desired, for instance for use in trace-driven simulations, model validation or
bottleneck identification. Essentially, this technique is used to observe a live
network without interacting with it.
Depending on the level of accuracy desired, different measurement options
are available. For coarse-grained measurements, on a time-scale on the order of seconds, there is the possibility of using Simple Network Management
Protocol (SNMP) and Remote Monitoring (RMON) to gather information from
networking hardware. This provides only very rough information granularity
and no packet inspection capabilities. It is usually used as part of normal
network operations, and is not very useful for protocol evaluations and per-flow
performance evaluation. Per-flow information is available in for instance Cisco’s
NetFlow, but still without packet inspection.
Application Logging
Application logging is commonly used in server software to enable traceability
of errors and client requests. In certain server applications, such as critical
business systems and other high-security systems, server logs are very important
for detecting intrusion attempts and for estimating severity of security breaches.
In other applications, logs are a useful tool for performance analysis.
However, client applications do not usually provide much in terms of logging. If logging is available, it usually provides rather coarse-grained information, such as application start and other very high-level application events.
It is unusual that an application provides the amount of log detail needed to
analyse the network performance of the application.
To provide adequate detail in application logs, it is necessary to modify the
application in such a way that the application both provides the detailed event
information needed and a way to store this information in a log file or database.
In applications that are based on an event-loop with a central managing
component, obtaining the relevant information is a fairly easy task, as the events
being handled contain all information relevant to the specific event. By adding
a timestamp, these may then be ejected to a log file or database. On the
other hand, in a threaded and less centralised application, this becomes a more
difficult task, as events may not be handled through a single component.
An additional issue with client-side logging is deployment of the modified
clients. It is important to have a large enough number of users to provide
representative data. Also, not all users may agree to running a modified client.
One of the most difficult problems relates to the non-availability of client source code. For example, most proprietary software does not provide the source
code for the application, making modification impossible without substantial
reverse engineering.
Log storage may become an issue if, for instance, the application is running
on an embedded system where there is no storage available except for internal
memory. Also, if measurements are performed over a long period of time and/or
there is a large number of events, the application logs may grow prohibitively
large.
Packet Capture
Application logs rarely provide network specific information, such as IP packet
arrival times. Dedicated packet capture or packet monitoring units have the
possibility of capturing every packet on a physical link. The packets may then
be associated with application level events and messages, resulting in higher
granularity of the captured messages.
Packet capture may be performed using only software or by employing specialised hardware. Software configurations provide measurement accuracies on
the order of tens of microseconds, while dedicated measurement hardware provides nanosecond accuracy.
Arguably the most commonly used passive software measurement tool for
capturing live network traffic is tcpdump [40]. tcpdump is based on the pcap library, which includes the Berkeley Packet Filter (BPF) [51]. This allows tools
using the library to set up complex filter rules on which packets to capture. This
library is the basis for many other measurement and general network tools.
Two important issues regarding passive measurements are related to storage
and computing power requirements [44]. In the case of complete flow reconstruction, the entire data payload portion of the captured packets must be retained.
For each packet captured, there is an additional capture header containing a
timestamp and other meta-data. To capture large flows such as P2P downloads, the storage requirements are correspondingly large. In the case of traffic
measurements on backbone networks, the amount of off-line storage needed is
prohibitively large. For instance, large BitTorrent trackers measure the amount
of downloaded data among their peers in petabytes and the number of messages
to a single peer often counts in the millions. The complexity of computing statistical measures with this amount of data makes it a challenging, if not daunting,
task. It is often necessary to create specialised software to calculate statistics
of interest.
Other important issues regarding capturing live traffic relate to privacy, deployment and cost. Since a capturing unit has the potential of capturing all
traffic, including user data, security and privacy implications must be considered. This may create problems when choosing suitable measurement locations,
as the network owner may not allow full packet captures to be collected. Even if
the network owner allows the recording of full packets, data that allows for user
identification, such as IP addresses, must be scrambled if the packet traces are
to be made public. Finding a suitable location to place the measurement unit
may also prove to be a challenging process. Ideally, the measurements should
be made at a location that provides data representative for the metrics to be
studied. For certain types of measurements, it is not possible to perform measurements using only software running on off-the-shelf hardware. For example,
high-accuracy measurements on high-speed links such as optical links require
both special hardware to split the physical line, known as wire taps, and special hardware to capture the actual packets. This hardware is expensive, and
large-scale deployment of such units may not be economically feasible.
4.1.2
Active Measurements
Active traffic measurements are used for assessing performance metrics and network parameters that are not readily available by using passive measurements,
such as topological information and end-to-end latency. This type of measurement is also useful in determining system response to varying workloads.
The workloads can be either synthetic, i.e., traffic or packets generated from
some analytic model, or trace-driven. A trace-driven workload uses previously
captured traffic to inject into the network. This has the benefit of using real,
non-synthetic traffic, thus avoiding any model discrepancies. It is however more
time-consuming to perform and less flexible. Certain protocols, such as TCP,
are more difficult to properly replay.
Synthetic traffic generators are a more flexible way of placing load on a
network than using trace-driven traffic insertion. For instance, if it is desired to
inject similar, but not identical, traffic from several hosts on the network under
study, a slight change in model parameters would suffice. A trace-driven load
would require both moving around large packet traces and changing these in
real-time on every traffic generating node. A synthetic generator also has the
advantage of being portable between different systems [41].
As actively probing the network entails injecting traffic or packets, there is
a risk of disturbing the network being measured. Thus, the amount of injected
traffic must be carefully chosen so as not to change the behaviour of the network
to such a degree that the measurements no longer are relevant for the network
under study.
The most commonly known active measurement tools are ping and traceroute. The ping-command measures the latency between two hosts by sending an ICMP echo request and measuring the time until the response arrives.
traceroute uses the TTL-field in the IP header to estimate the IP network path
to a given host. Novel techniques have been developed to circumvent problems
associated with ICMP probes in NAT environments [33].
Depending on the metrics needed, other programs may be used as well. For
example, simple Web clients may be used to measure server response times.
For P2P networks, programs known as crawlers are occasionally used to create
a topological snapshot of the P2P network. However, crawlers generate large
amounts of signalling traffic, and are usually viewed as disruptive to the network.
4.2
Application Level Traffic Analysis
Conventional measurement methodologies such as passive network monitors using either software or specialised hardware are adequate for diagnosing most
directly network-related problems. There are however a few issues that are
difficult, if not impossible, to solve without resorting to application layer information.
For instance, if the network traffic is observed at the network or link layer,
the information gained regarding the higher-layer protocols is filtered through
the mechanisms of the IP stack. Only second-hand information about the application protocols is obtained in this case, with the consequence of more difficulty
in debugging protocols.
Lower layer captures also lack the possibility of directly knowing about application state. The internal application states have to be inferred by the available
protocol messages sent. Implicit states such as the BitTorrent snubbed state have
to be heuristically inferred, as there is no explicit message denoting the state.
This adds additional complexity and processing time to the parsing software.
Additionally, application layer logging allows one to record only the specific messages and states of interest,
whereas packet capture necessitates full payload capture and decoding before discarding
messages is possible.
On the other hand, there are drawbacks with analysing traffic at the application layer as well. As mentioned above, the incoming protocol messages are
filtered through the IP stack before reaching the application. This means that
timestamps on the application level are affected by buffering and thus queueing
in the kernel. Timestamps are also affected by the current system load, primarily I/O load. Thus, inaccuracies may occur if the system is heavily loaded with
network traffic and logging to disk.
4.2.1
TCP/IP Stack Performance
By combining the methods of application layer logging and application stream
reassembly on the link and/or IP layers, there is the possibility of evaluating the
performance of the IP stack on which the logging is taking place¹. This opens
up the possibility of assessing the amount of buffering taking place in the stack as
well as performance issues associated with specific network loads.
For instance, P2P applications often use a large number of TCP connections,
which might not only compete for the available network bandwidth, but may
also result in contention in the host computer’s IP stack. Problems in the stack
might then be inappropriately diagnosed as network or protocol issues when the
fact of the matter may be altogether different.
¹ Given that both logging and tracing are performed on the same host.
4.3
Measurement Infrastructure
A dedicated P2P measurement infrastructure has been developed at BIT to perform measurements on P2P networks (Figure 4.1) [30]. It is composed of standard personal computers participating in various P2P networks (currently BitTorrent and Gnutella) running both instrumented and non-instrumented clients.
Traffic collection and decoding is based on tcpdump [40] for packet capture and
tcptrace [58] for protocol decoding.
The measurement nodes run the Gentoo Linux 1.4 operating system, with
kernel version 2.6.5. Each node is equipped with an Intel Celeron 2.4 GHz
processor, 1 GB RAM, 120 GB hard drive, and 10/100 FastEthernet network
interface. As shown in Figure 4.1, the network interface is connected to a
100 Mbit switch in the lab at the Telecommunications department, which is
further connected through a router to the GigaSUNET backbone.
[Diagram: the BitTorrent node and the Gnutella node are attached to a 10/100 Mbit switch, which connects through the BIT router to the Internet]
Figure 4.1: BIT measurement setup
4.4
Measurement Software
The software of the measurement infrastructure is comprised of two major
components [30]. The first component is a generic TCP reassembly and post-processing framework which is used to parse and analyse link layer traces captured by tcpdump. The second component is a common logging format and set
of analysis programs written in a variety of languages, such as perl, python,
awk, R and Octave. Additionally, application logging is performed, with logs adhering to the same common logging format as parsed packet traces. A high-level
abstraction of the measurement process is presented in Figure 4.2.
[Diagram blocks: Data collection with tcpdump; TCP Reassembly; Application msg flow reassembly; Log parsing; Log data reduction; Postprocessing and analysis]
Figure 4.2: Measurement procedures
4.4.1
Generic TCP Reassembly Framework
The TCP reassembly framework is essentially a three-stage capture and reassembly engine. The first stage is the packet capture stage, and the two following
stages are TCP and application flow reassembly. The last two stages are working
in parallel, but are logically separate.
Packet capture traces are collected using tcpdump, version 3.8.3. Tcpdump
is started before the P2P application is measured and filters are applied to the
capture process to avoid capturing packets belonging to applications listening
on well-known ports such as HTTP, FTP, SSH and SMTP.
During capture, the packet traces are saved to files of size 600 MB to facilitate backups to recordable CDs. This also avoids problems related to file
size limitations of the file systems on the measurement nodes, since a typical
Windows or Linux file system cannot handle files larger than 4 GB.
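The capture settings described above can be sketched as a small Python helper that only assembles a tcpdump command line. This is a minimal illustration, not the invocation actually used for the measurements: the function name, interface name and output file name are hypothetical, and the filter expression is one plausible way of excluding the listed well-known ports.

```python
# Hypothetical helper: it only builds the argument list and never runs tcpdump.
WELL_KNOWN_PORTS = {"http": 80, "ftp": 21, "ssh": 22, "smtp": 25}

def build_capture_command(interface, out_file, excluded_ports, file_size_mb=600):
    """Return a tcpdump argument list that skips traffic on the given ports."""
    # BPF expression rejecting packets to or from any of the excluded ports.
    port_terms = " or ".join(f"port {p}" for p in sorted(excluded_ports))
    bpf_filter = f"not ({port_terms})"
    return [
        "tcpdump",
        "-i", interface,          # capture interface (assumed name)
        "-s", "0",                # snap length 0: keep whole packets
        "-C", str(file_size_mb),  # rotate the output file after this many million bytes
        "-w", out_file,           # write raw packets to file
        bpf_filter,
    ]

cmd = build_capture_command("eth0", "bt-trace.pcap", WELL_KNOWN_PORTS.values())
```

Joining `cmd` with spaces gives a command line that could be inspected before it is handed to a process launcher.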
TCP Reassembly
The TCP reassembly module is based on the TCP engine from tcptrace, and
works in a fashion similar to the one used in the BSD TCP/IP stack as described
in [66].
The engine is highly modular and extensions are provided to facilitate reassembly of any type of data stream, not only TCP. Capabilities include detection and handling of out-of-order segments as well as forward and backward
segment overlapping.
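To illustrate what handling out-of-order and overlapping segments involves, the following is a simplified Python sketch. It is not the tcptrace engine: the class and method names are hypothetical, only one direction of a connection is tracked, and sequence number wrap-around is ignored.

```python
class StreamReassembler:
    """Reassemble one direction of a byte stream from (seq, payload) segments."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq  # sequence number of the next in-order byte
        self.pending = {}            # buffered segments keyed by sequence number

    def add_segment(self, seq, payload):
        """Insert a segment and return the bytes that became deliverable."""
        # Keep the longest payload seen for a given starting sequence number.
        if len(payload) > len(self.pending.get(seq, b"")):
            self.pending[seq] = payload
        delivered = bytearray()
        for s in sorted(self.pending):
            if s > self.next_seq:
                break                    # gap: wait for the missing segment
            data = self.pending.pop(s)
            skip = self.next_seq - s     # bytes already delivered (overlap)
            if skip < len(data):
                delivered += data[skip:]
                self.next_seq = s + len(data)
        return bytes(delivered)

r = StreamReassembler(initial_seq=100)
first = r.add_segment(105, b"FGHIJ")   # arrives early, is buffered
second = r.add_segment(100, b"ABCDE")  # fills the gap, releases both segments
third = r.add_segment(108, b"IJKL")    # overlaps two bytes already delivered
```

A segment that lies entirely within already delivered data is simply dropped, which corresponds to the handling of retransmissions.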
Application Data Flow Reassembly
As soon as a new TCP segment has been read and inserted into the segment list
by the TCP reassembly engine, an application-specific hook function is called.
This hook is used to notify an application data reassembly module. The application decoder is responsible for parsing and decoding the data stream provided
by the TCP reassembly engine. Once a message (which may span several TCP
segments) is fully decoded, a log entry is ejected to the log file.
Decoders have been implemented for Gnutella and BitTorrent.
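As an illustration of decoding messages that span several TCP segments, the sketch below buffers stream bytes and emits complete BitTorrent peer messages. It relies on the peer wire framing of a four-byte big-endian length prefix followed by a one-byte message type, and assumes the initial handshake has already been consumed. The class name and the returned tuple layout are hypothetical and do not reflect the decoder implemented in the framework.

```python
import struct

MESSAGE_NAMES = {0: "choke", 1: "unchoke", 2: "interested", 3: "not_interested",
                 4: "have", 5: "bitfield", 6: "request", 7: "piece", 8: "cancel"}

class MessageDecoder:
    """Incrementally split a reassembled byte stream into peer messages."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, data):
        """Append stream bytes; return the list of messages completed so far."""
        self.buffer += data
        messages = []
        while len(self.buffer) >= 4:
            (length,) = struct.unpack(">I", self.buffer[:4])
            if len(self.buffer) < 4 + length:
                break                      # message continues in a later segment
            body = bytes(self.buffer[4:4 + length])
            del self.buffer[:4 + length]
            if length == 0:
                messages.append(("keep_alive", b""))
            else:
                name = MESSAGE_NAMES.get(body[0], "unknown")
                messages.append((name, body[1:]))
        return messages

decoder = MessageDecoder()
have_msg = struct.pack(">IBI", 5, 4, 593)   # a "have" message for piece 593
part_one = decoder.feed(have_msg[:6])        # first segment: message incomplete
part_two = decoder.feed(have_msg[6:])        # second segment completes it
```

In the framework described above, the point at which a message is completed is where a log entry would be written.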
4.4.2
BitTorrent Application Logging
The reference BitTorrent client version² is written in the python programming
language [78]. Python is an interpreted and interactive language with object
oriented features that combines syntactical clarity with powerful components
and system-level functionality. This makes the process of extending software
written in the language less complicated than in a compiled and syntactically
more demanding language such as C or Java.
² Version 3.4.1, released on March 11, 2004.
Software Modifications
The client is written as an event-based program, reacting to incoming protocol
messages and internal timers. The internal timers activate the sending of messages such as tracker requests, unchoking peers and network timeouts. For the
purpose of the present work, the incoming network message handling routines
are the important part. These are mainly located in a single software component, which handles all incoming events. This component consists of a function
containing the main loop (that receives the network messages), and several message specific functions to handle the incoming messages that are invoked from
the main loop. While it is possible to intercept the messages in the main loop,
it is much easier to do so in the specific message handling routines. There are
two major reasons for this:
• The message type is already implicitly given by the call of the function.
• Message-specific information is provided automatically, without the need
to write extra parsing code. For instance, in the case of a piece request
message being intercepted in the main loop, it would have been necessary
to parse the incoming message to find information such as piece number
and subpiece index.
Before saving the ejected log messages to disk, they are compressed by the
zlib library [32]. This is beneficial both with regards to disk storage and with
regards to the amount of disk I/O performed. The CPU overhead of the compression is practically negligible on the measurement computers.
Finally, extra parameters have been added to the application to allow changing the filename of the log file, and code has been added to automatically generate a date- and
time-stamped filename if none is given.
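The modifications described in this section can be sketched as follows. This is an illustrative Python outline rather than the code added to the reference client: the class, the handler function and the default filename pattern are all hypothetical, and the entry layout (timestamp, message type, message-specific fields) is an assumption.

```python
import time
import zlib

class CompressedEventLog:
    """Write timestamped, zlib-compressed log entries to a file."""

    def __init__(self, filename=None):
        if filename is None:
            # Assumed pattern for an automatically generated file name.
            filename = time.strftime("btlog-%Y%m%d-%H%M%S.z")
        self.filename = filename
        self._compressor = zlib.compressobj()
        self._file = open(filename, "wb")

    def log(self, message_type, *fields):
        """Eject one entry: timestamp, message type, message-specific fields."""
        line = " ".join([f"{time.time():.6f}", message_type, *map(str, fields)])
        self._file.write(self._compressor.compress(line.encode("ascii") + b"\n"))

    def close(self):
        self._file.write(self._compressor.flush())
        self._file.close()

def handle_request(log, piece_index, offset):
    """Message-specific handler: the type is implicit, the fields already parsed."""
    log.log("request", piece_index, offset)
```

Placing the call in the message-specific handler, as in `handle_request`, mirrors the two reasons listed above: no extra parsing code is needed and the message type is known from the call itself.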
4.4.3
Log Formats
Selection of a log format that provides a suitable amount of information is a
tricky issue. It is important to capture enough information to make relevant
statistical analysis possible, while at the same time keeping the sizes of the log files
to a manageable level.
This problem is most noticeable when designing a log format for application
logs, as it is not possible to re-run a specific measurement a second time if one
has chosen too small a subset of metrics to log. Packet captures are less affected
by this, but are not impervious to similar effects in the case of, for instance,
too small capture size for the recorded packets, thus losing parts of the payload
data. In both cases, information is irretrievably lost.
Complete packet captures that contain all data transmitted on a link may
be used to re-generate log files as needed. This is however often a very time-consuming process, and it is preferable to avoid it whenever possible.
BitTorrent XML Log Format
The eXtensible Markup Language (XML) [80] has a number of attractive features that make it a good choice as a log format. XML is by concept and design
made to be easily parsable by a computer, while at the same time being at least
semi-readable by humans.
Some of the salient advantages of using XML as a log format are:
Parsability There are several XML parsing libraries available for a plethora of languages, including, but not limited to, perl, C, C++, python and Matlab. This makes the writing of log parsers much easier, since it is not necessary to write an application specific parser for the log format.
Extensibility It is easy to add new log fields, and new log fields do not necessitate changing the parser. This is very useful when deciding what information goes into the log, as fields may be added and removed easily.
Validation The number and types of fields are easily verifiable, and verification is usually performed as part of the XML validation process provided by the parsing library.
Two drawbacks with using XML as a log format are that the parsing is slightly
slower than with an application specific parser, and that memory requirements are
substantially higher with certain types of XML parsers. In particular, it is rarely possible to use the Document Object Model (DOM) parsers to parse the log files.
These parsers maintain a representation of the entire XML file in memory and,
with log files in the gigabyte range, the amount of memory required is substantial. Simpler parsers, such as Simple API for XML (SAX) parsers, are therefore
used. These parse the document on an element by element basis, removing
the need for keeping the entire document in memory. This solution unfortunately also means that the transformation capabilities provided by eXtensible
Stylesheet Language Transformations (XSLT) cannot be used, and specific software making use of provided SAX parsers must be created.
XML documents are text documents comprised of elements and attributes.
Attributes are contained within the elements, and usually carry element-specific
information and modifiers. A detailed description of the BitTorrent XML log
format is provided in Section B of the Appendix.
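The element-by-element style of a SAX parser can be illustrated with Python's standard `xml.sax` module. The element and attribute names in this sketch (`log`, `message`, `type`, `time`) are hypothetical placeholders and are not taken from the actual BitTorrent XML log format, which is defined in the Appendix.

```python
import xml.sax

class MessageCounter(xml.sax.ContentHandler):
    """Count log entries per message type without building a document tree."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag; only the current element is held in memory.
        if name == "message":
            msg_type = attrs.get("type", "unknown")
            self.counts[msg_type] = self.counts.get(msg_type, 0) + 1

# Hypothetical sample log, for illustration only.
SAMPLE = b"""<log>
  <message type="have" time="1088079499.269075"/>
  <message type="request" time="1088079499.267193"/>
  <message type="have" time="1088079499.282697"/>
</log>"""

handler = MessageCounter()
xml.sax.parseString(SAMPLE, handler)
```

For log files on disk, `xml.sax.parse` accepts a file name or file object in place of the in-memory string, so memory use stays independent of the file size.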
Common Log Formats
To facilitate the re-use of parsing software, it was decided, as part of the measurement infrastructure, that parsed traces should be written to log files that
adhere to a common format. This format is defined as follows:
• Fields are separated by spaces (ASCII code 0x20)
• Field definitions are:
1. The first field is the UNIX timestamp for the event.
2. The second field contains the message type, if any.
3. Any following field may contain arbitrary information.
This is a simple and flexible logging scheme that allows the use of standard
UNIX tools such as sed, awk and perl to parse the files without resorting to
writing specialised parsers for each log file. In fact, the parsing software only
assumes that the first field is a UNIX timestamp, and the rest of the fields are
arbitrary. It is however recommended that the second field is a message type.
Figure 4.3 shows an example of a log file generated from a BitTorrent application log.
1088079499.265901 piece
1088079499.267193 request 1311 196608
1088079499.269075 have 593
1088079499.282697 have 6690
Figure 4.3: Sample BitTorrent log file
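A parser for the common log format needs very little code, which is the point of the format. The Python sketch below is a hypothetical helper, not one of the analysis programs of the infrastructure. The sample lines are modelled on Figure 4.3, and the pairing of each timestamp with a message is an assumption based on the order of the entries.

```python
from collections import Counter

def parse_log_line(line):
    """Split one entry into (timestamp, message type, remaining fields)."""
    fields = line.split()          # fields are separated by spaces
    timestamp = float(fields[0])   # first field: UNIX timestamp
    msg_type = fields[1] if len(fields) > 1 else None
    return timestamp, msg_type, fields[2:]

SAMPLE_LOG = """\
1088079499.265901 piece
1088079499.267193 request 1311 196608
1088079499.269075 have 593
1088079499.282697 have 6690
"""

records = [parse_log_line(l) for l in SAMPLE_LOG.splitlines() if l.strip()]
type_counts = Counter(msg_type for _, msg_type, _ in records)
duration = records[-1][0] - records[0][0]   # time spanned by the sample, in seconds
```

Only the first field is interpreted by the parser. Everything after the message type is passed through unchanged, which keeps the helper usable for both BitTorrent and Gnutella logs.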
4.5
Summary
This chapter has discussed methods and reasons for traffic measurements. Traffic measurements can be either passive or active. Passive measurements imply
observing a network without interfering with the traffic, while active measurements involve actively probing the network.
Additionally, the P2P measurement infrastructure used for the measurements in this thesis has been presented. It is based on passive measurements
using application logs and packet capture. Specific parsing and analysis software for BitTorrent and Gnutella has been written as part of the infrastructure.
The measurements performed using this infrastructure have the advantage of
being able to capture application layer messages with link-layer accuracy.
Chapter 5
Traffic Modelling
So long, and thanks for all the fish.
– Douglas Adams
Traffic modelling is a form of workload characterisation, with the aim of providing tractable and parsimonious models for the traffic loads placed on a network
by various applications. Ideally, these models should be invariant, i.e., should
hold true irrespective of operating conditions.
Historically, teletraffic modelling concerned characterising the relationship
between traffic load and performance in the Public Switched Telephone Network
(PSTN). Researchers such as Conny Palm created models that worked very
well for modelling telephone call arrivals, number of blocked calls and busy
hour load [59]. The major tool for these models was the Poisson process. The
fundamental idea behind the Poisson process is that times between events (e.g.,
incoming telephone calls) are drawn from an exponential distribution, and that
these times are independent. Furthermore, as the number of sources increases,
the aggregate traffic becomes less bursty. This makes treatment of the models
analytically simple and attractive.
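The defining property of the Poisson process, independent and exponentially distributed times between events, translates directly into a simulation. The following Python sketch is illustrative only. The rate, time horizon and seed are arbitrary choices, and the function name is hypothetical.

```python
import random

def poisson_arrivals(rate, horizon, rng):
    """Generate arrival times on [0, horizon] with exponential inter-arrival times."""
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate)   # independent draw with mean 1/rate
        if t > horizon:
            return arrivals
        arrivals.append(t)

rng = random.Random(1)
arrivals = poisson_arrivals(rate=2.0, horizon=10000.0, rng=rng)
gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
mean_gap = sum(gaps) / len(gaps)     # should be close to 1/rate = 0.5
```

With a rate of 2 events per time unit, roughly 20000 arrivals are expected over the chosen horizon, and the sample mean of the gaps approaches the theoretical mean.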
For the remainder of the thesis, the following notation is used. A theoretical distribution function with the associated estimated parameters is denoted by
$\hat{F}$. $\hat{F}$ is occasionally referred to as the estimated distribution in the thesis. The
Empirical Distribution Function (EDF) of a given data set is denoted by $F_n$.
The associated theoretical Probability Density Function (PDF) and histogram
are denoted by $\hat{f}$ and $f_n$ respectively. Also, $x_1, x_2, \ldots, x_n$ denote observations
from the random variable $X$ and the total number of observations is denoted
by $n$.
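As a concrete reading of this notation, the EDF of n observations is the fraction of observations that do not exceed a given value. A minimal Python sketch, with a hypothetical function name and toy data, is:

```python
import bisect

def edf(observations):
    """Return the empirical distribution function F_n of the observations."""
    xs = sorted(observations)
    n = len(xs)

    def F_n(x):
        # Fraction of the n observations that are less than or equal to x.
        return bisect.bisect_right(xs, x) / n

    return F_n

F_n = edf([3, 1, 2, 2])
```

Here `F_n(2)` evaluates to 0.75, since three of the four observations are at most 2.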
5.1
Heavy-tailed Traffic Models
With the appearance of packet-switched networks in the late 60’s and early 70’s,
the assumption of exponentiality and independence remained, as much for the
analytic tractability as for the lack of any apparent reason to suspect otherwise. However, this
all changed when Willinger et al. showed that Ethernet traffic does not display
the expected decrease in burstiness. Rather, the results showed that Ethernet
traffic has self-similar qualities and traffic “spikes” ride on longer term “ripples”
that in turn ride on still longer term “swells” [49]. This seminal paper instigated
a flurry of work on self-similarity in network traffic, and several sources of self-similar and long-range dependent traffic have been reported. Examples of these
are Web and FTP file sizes [23,62]. The self-similar traffic characteristics appear
over several time scales and aggregation levels, making the self-similarity a traffic
invariant [60]. Additionally, the control and recovery mechanisms of TCP have
been shown both to be a possible source of Long-Range Dependence (LRD) and
to propagate LRD tendencies [34, 79].
A complete discussion regarding self-similarity and LRD is not within the
scope of this thesis. The current section will therefore give only a very concise
summary of the fundamental concepts.
5.1.1
Self-similarity and Long-range Dependence
Self-similarity refers to the insensitivity to scale. In the case of stochastic processes, this means that process statistics such as variance and mean do not
change with a change in time scale. That is, the process “looks” the same
whether it is observed on a scale of 1 s or a scale of 6 s. In particular, the
process exhibits the same amount of burstiness on all scales.
The term statistical self-similarity is usually expressed as [74]
A process $x(t)$ is statistically self-similar with parameter $H$ ($\frac{1}{2} \le H \le 1$) if, for any real $a > 0$, the process $a^{-H} x(at)$ has the same statistical properties as $x(t)$.
The parameter H is called the Hurst parameter, and denotes the amount of
LRD of the process. For $H = \frac{1}{2}$, the process has independent increments, and
thus has no dependence on previous values. However, for $H > \frac{1}{2}$, the process
exhibits persistence, i.e., previous trends imply that future trends will be the
same. The closer H is to 1, the higher the amount of LRD.
The Brownian motion (BM) process is an example of a stochastic process
that exhibits self-similarity, but not LRD, i.e., has a Hurst parameter $H = \frac{1}{2}$.
The BM process can be generalised to a fractional BM (FBM) process, which
may exhibit LRD.
LRD and Short-Range Dependence (SRD) are often defined in terms of the
process auto-covariance. An SRD process has a summable auto-covariance, i.e.,
the sum is non-divergent. For an LRD process, the sum of the auto-covariances diverges,
i.e., the auto-covariance is non-summable. More formally, the auto-covariance of a short-range
dependent process decays at least exponentially, while the long-range equivalent
decays hyperbolically.
It is of interest to be able to estimate the amount of self-similarity in a
particular data set. This may be done either by using graphical methods or
statistical methods. Two examples of graphical methods are discussed, the Hill
and α scaling estimator plots.
Hill Plots
The Hill plot uses the order statistics of X to form an estimate for the tail index
α (Section 5.1.2). It is defined as
$$H_{k,n} = \frac{1}{k} \sum_{i=0}^{k-1} \left( \log x_{n-i} - \log x_{n-k} \right) \tag{5.1}$$
where k is the number of order statistics used in estimating the α parameter.
Hk,n is plotted versus k and a horizontal line indicates the estimate of α. Figure 5.1 shows an example of a Hill plot for the Exponential, Pareto, Weibull
and Log-normal distributions. The parameters have been chosen so as to make
the distributions have quite long tails.
[Plot: Hill alpha-estimate versus number of order statistics (0 to 10000) for the Pareto, Log-normal, Exponential and Weibull distributions]
Figure 5.1: Pareto, Weibull, Log-normal and Exponential Hill plots
Often, empirical distributions show power-law behaviour only in the upper
tail, while the body of the distribution appears to be non-heavy-tailed. The
Hill estimator suffers from the problem of choosing the correct number of order
statistics to give an acceptable estimate for α, i.e., locating the proper cutoff
point where power-law behaviour begins.
Furthermore, the Hill estimator only performs well for distributions close to
a Pareto [67]. It is for instance difficult to assess the heavy-tail index for the
Log-normal and Weibull distributions presented in Figure 5.1.
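The Hill statistic of Eq. (5.1) can be computed directly from the sorted sample. The Python sketch below is a hedged illustration with a hypothetical function name, in which the subscripts of Eq. (5.1) are read as indices into the ascending order statistics. Note that in the conventional formulation this statistic estimates 1/α, so the sketch takes the reciprocal to obtain a tail index estimate. The sample size, the value of k and the seed are arbitrary.

```python
import math
import random

def hill_statistic(sample, k):
    """Hill statistic using the k largest order statistics of a positive sample."""
    xs = sorted(sample)                   # ascending order statistics
    n = len(xs)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    threshold = math.log(xs[n - k - 1])   # log of the (n-k)th order statistic
    return sum(math.log(xs[n - 1 - i]) - threshold for i in range(k)) / k

rng = random.Random(7)
sample = [rng.paretovariate(1.5) for _ in range(20000)]   # Pareto data, alpha = 1.5
alpha_hat = 1.0 / hill_statistic(sample, k=2000)          # reciprocal gives the tail index
```

Repeating the last line for a range of k values and plotting the result against k produces a Hill plot. For Pareto data the curve is roughly flat, which is the horizontal line referred to above.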
The α Scaling Estimator
The α or scaling estimator uses a property of random variables with infinite
variance that is analogous to the central limit theorem for finite variance random
variables [22]. This property is the scaling property of sums of infinite variance
random variables. The estimate is formed by calculating aggregate versions of
the original data at increasing levels of aggregation and comparing the log-log
CCDFs of these aggregates.
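The first step of this procedure, forming aggregated series and their empirical CCDFs, can be sketched in Python as follows. This covers only the data preparation and not the estimator itself. The function names are hypothetical, and aggregation is assumed here to mean summing non-overlapping blocks of m consecutive observations.

```python
def aggregate(data, m):
    """Sums of consecutive, non-overlapping blocks of m observations."""
    usable = len(data) - len(data) % m     # drop an incomplete final block
    return [sum(data[i:i + m]) for i in range(0, usable, m)]

def empirical_ccdf(data):
    """Return (x, P[X > x]) pairs for each distinct value in the data."""
    xs = sorted(data)
    n = len(xs)
    points = []
    for i, x in enumerate(xs):
        if i + 1 == n or xs[i + 1] != x:   # last occurrence of this value
            points.append((x, (n - i - 1) / n))
    return points

series = [1, 2, 3, 4, 5]
agg2 = aggregate(series, 2)                # 2-aggregated series
ccdf = empirical_ccdf([1, 1, 2, 4])
```

Plotting the logarithms of the CCDF points for aggregation levels such as 2, 4, 8 and so on gives the family of curves that the scaling estimator compares.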
[Plots: four panels, each showing Log10(P[X > x]) versus Log10(size) for the raw data and for the 2-, 4-, 8-, 16-, 32-, 64-, 128-, 256- and 512-aggregated series, with 1000000 points per data set. (a) Pareto, alpha estimate 0.986; (b) Weibull, alpha estimate 1.427; (c) Log-normal, alpha estimate 1.483; (d) Exponential, alpha estimate 2.264]
Figure 5.2: Pareto, Weibull, Log-normal and Exponential α-estimator plots
Figure 5.2 shows four examples of outputs from the scaling estimator. The
points in these plots are the points used for comparing the aggregated CCDFs,
i.e., the points for which the scaling property holds.
The most salient advantage of the scaling method is that it takes into account
both power-law shape (i.e., a straight line in the CCDF) and the regions for which
the distribution is scale invariant. As Figure 5.2 shows, the scaling estimator
performs well in detecting the scaling behaviour of the Log-normal and Weibull
distributions as well as for the Pareto.
5.1.2 Heavy-tailed Distributions
Formally, a heavy-tailed distribution is a distribution whose Complementary
Cumulative Distribution Function (CCDF) decays as a power law, i.e.,
$$ P[X \geq x] \sim c\,x^{-\alpha} \quad \text{as } x \to \infty,\; 0 < \alpha < 2 \tag{5.2} $$
where c is a positive constant, α is the tail index, and ∼ (see footnote 1) indicates that
$$ \lim_{x \to \infty} \frac{P[X \geq x]}{c\,x^{-\alpha}} = 1. $$
Heavy-tailed distributions have infinite variance, and for α ≤ 1 also exhibit
infinite means.
The α parameter is related to the Hurst parameter by H = (3 − α)/2. In network
modelling, the range 1 < α < 2, corresponding to 1/2 < H < 1, is of primary
interest [60], i.e., processes with infinite variance but bounded mean.
By plotting the CCDF on a log-log scale, a heavy-tailed distribution appears
as a straight line with slope −α. This is also known as exhibiting power-law
behaviour. The typical example of such a distribution is the Pareto distribution
(Eq. (5.3)). A Pareto distribution with α = 1, i.e., a CCDF slope of −1, is shown in Figure 5.3.
A larger class of distributions is the long-tailed distributions, also known
as sub-exponential distributions. A long-tailed distribution is one that decays
slower than an exponential distribution. Examples of this class are the Weibull
(Eq. (5.5)) and the Log-normal (Eq. (5.7)) distributions, as shown in Figure 5.3.
The parameters for the distributions are the same as in Figure 5.1. These
distributions do not exhibit infinite means. For simplicity, the rest of the thesis
denotes both long- and heavy-tailed distributions as heavy-tailed.
[Figure 5.3: Pareto, Weibull, Log-normal and Exponential CCDF. Axes: log P[X ≥ x] against log x.]
1 To be read as “is distributed as”.
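The Pareto Distribution
The density and distribution functions for the Pareto distribution are
$$ f(x) = \begin{cases} \dfrac{\alpha}{k}\left(\dfrac{k}{x}\right)^{\alpha+1} & x > k;\; k, \alpha > 0 \\ 0 & \text{otherwise} \end{cases} \tag{5.3} $$
and
$$ F(x) = \begin{cases} 1 - \left(\dfrac{k}{x}\right)^{\alpha} & x > k;\; k, \alpha > 0 \\ 0 & \text{otherwise} \end{cases} \tag{5.4} $$
where k is the smallest value the distribution may have.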
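The power-law behaviour of the Pareto CCDF can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values (k = 1, α = 1.5) that are not taken from the thesis; the function names are hypothetical. It evaluates P[X > x] = (k/x)^α and measures the slope between two points on log-log axes, which should be close to −α.

```python
import math

def pareto_ccdf(x, k, alpha):
    """P[X > x] for a Pareto distribution with minimum k and tail index alpha."""
    return 1.0 if x <= k else (k / x) ** alpha

def loglog_slope(x1, x2, k, alpha):
    """Slope of the CCDF between x1 and x2 on log-log axes."""
    dy = math.log10(pareto_ccdf(x2, k, alpha)) - math.log10(pareto_ccdf(x1, k, alpha))
    dx = math.log10(x2) - math.log10(x1)
    return dy / dx

# Illustrative parameters (assumed, not taken from the thesis).
k, alpha = 1.0, 1.5
print(loglog_slope(10.0, 1000.0, k, alpha))  # should be close to -alpha
```

Any two points above k give the same slope, which is what makes the straight line in the log-log CCDF plot a usable visual indicator.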
The Weibull Distribution
The density and distribution functions for the Weibull distribution are
$$ f(x) = \begin{cases} \alpha\beta^{-\alpha} x^{\alpha-1} e^{-(x/\beta)^{\alpha}} & x > 0;\; \alpha, \beta > 0 \\ 0 & \text{otherwise} \end{cases} \tag{5.5} $$
and
$$ F(x) = \begin{cases} 1 - e^{-(x/\beta)^{\alpha}} & x > 0;\; \alpha, \beta > 0 \\ 0 & \text{otherwise.} \end{cases} \tag{5.6} $$
For α = 1, the Weibull distribution is identical with the exponential distribution.
The Log-normal Distribution
The Log-normal distribution, like the Gaussian distribution, lacks a closed-form
expression for the distribution function. The density function is given by
$$ f(x) = \begin{cases} \dfrac{1}{x\sqrt{2\pi\sigma^{2}}}\, e^{-(\ln x - \mu)^{2}/2\sigma^{2}} & x > 0,\; \sigma > 0 \\ 0 & \text{otherwise.} \end{cases} \tag{5.7} $$
5.1.3 Implications of Heavy-tail Behaviour
An important implication of heavy-tail distributions is that the probability of
“very” large values is non-negligible. This affects many fundamental tools and
results in profound ways. For instance, heavy-tailed flow durations tend to
persist if they have already been active for a period of time [60]. That is, the
longer the flow has existed, the longer it is likely to persist.
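The persistence property can be illustrated with the conditional survival probability P[X > t + s | X > t]. The sketch below assumes a Pareto-distributed flow duration with illustrative parameters (k = 1, α = 1.5) and compares it with an exponential duration; the function names and values are hypothetical. For the Pareto case the probability of surviving another s time units increases with the age t, while for the exponential case it does not depend on t.

```python
import math

def pareto_residual(t, s, alpha=1.5):
    """P[X > t + s | X > t] for a Pareto duration, valid for t >= k."""
    return ((t + s) / t) ** (-alpha)

def exponential_residual(t, s, rate=1.0):
    """P[X > t + s | X > t] for an exponential duration; t cancels out."""
    return math.exp(-rate * (t + s)) / math.exp(-rate * t)

# Probability of lasting 5 more time units at different flow ages.
for t in (1.0, 10.0, 100.0):
    print(t, pareto_residual(t, 5.0), exponential_residual(t, 5.0))
```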
Another issue regarding the high variability of the heavy-tailed distributions
is that the sample mean converges slowly to the population mean. It can be
shown that the convergence error follows [60]
$$ |\bar{X}(n) - \mu| \sim n^{1/\alpha - 1}. \tag{5.8} $$
Clearly, the closer α is to 1, the slower the convergence becomes. The consequence of this is that the number of samples needed when generating
heavy-tailed random variates is extremely high. For example, $10^{12}$ samples are
needed to achieve a two-digit accuracy for α = 1.2 [60].
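The sample count in this example follows directly from Eq. (5.8). As a minimal sketch, assuming that "two-digit accuracy" means a relative error of 10^-2 and that the proportionality constant is one, solving n^(1/α − 1) = 10^(−digits) for n gives the required number of samples. The helper name is hypothetical.

```python
import math

def samples_needed(alpha, digits):
    """Solve n**(1/alpha - 1) = 10**(-digits) for n, following Eq. (5.8)."""
    exponent = 1.0 / alpha - 1.0          # negative for 1 < alpha < 2
    return 10.0 ** (-digits / exponent)

print(f"{samples_needed(1.2, 2):.3g}")    # alpha = 1.2, two digits
print(f"{samples_needed(1.8, 2):.3g}")    # lighter tail, far fewer samples
```

The exponent 1/α − 1 approaches zero as α approaches 1, which is why the required sample count grows so quickly for heavier tails.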
The implication of this is that classical traffic models based on assumptions
of exponentiality and independence underestimate performance measures and
requirements such as buffer occupancy and loss rates [62]. The high variability
of self-similar processes is poorly handled by these models, and the impact on
networks and systems is significant.
Furthermore, Web servers and FTP servers may face issues regarding depletion of resources such as open sockets and files for heavy-tailed flow durations.
The common way to handle these situations is to over-dimension networks and
server hardware.
5.2 Hypothesising Distributions
The initial process of distribution selection is usually a combination of using
visual inspection and summary statistics. Regardless of the method used, the
process is of an exploratory nature and both qualitative and quantitative measures are frequently used.
5.2.1 Summary Statistics
Summary statistics are primarily useful as indicators of the general
shape of the distribution of X. Besides the sample mean
$$ \bar{X}(n) = \frac{1}{n}\sum_{i=1}^{n} x_i $$
and variance
$$ S^{2}(n) = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{X}(n)\right)^{2} $$
a few other statistic estimates can prove useful (see footnote 2). The coefficient of variation,
$$ c_v = \frac{\sqrt{S^{2}}}{\bar{X}} $$
is especially useful for indicating exponential behaviour, since $c_v = 1$ for the
exponential distribution.
To measure symmetry, the skewness
$$ v = \frac{1}{n\,(S^{2})^{3/2}} \sum_{i=1}^{n}\left(x_i - \bar{X}\right)^{3} $$
can be used. v = 0 indicates a symmetric distribution, v < 0 indicates a distribution skewed to the left (i.e., has a heavier left tail), while v > 0 indicates
a distribution skewed to the right (i.e., has a heavier right tail). Examples of
distributions with different values for v are shown in Figure 5.4. The solid line
is the normal density with mean 2 and variance 0.52 (v = 0.0), the dashed
and dotted lines are Weibull densities with parameters (α = 1.5, β = 1) and
(α = 7, β = 3.5) respectively (v = 1.1 and − 0.5). The dash-dotted line is an
exponential density with rate 0.75 (v = 2). Typically, many observed distributions are skewed to the right [48].
[Figure 5.4: Skewness. Densities f(x) plotted against x, with curves labelled v = 0.0, v = 1.1, v = −0.5 and v = 2.0.]
2 For brevity, the variable n is omitted for the rest of the statistics in this section, so X̄(n) becomes X̄ etc.
An additional statistic, the kurtosis
$$ \kappa = \frac{1}{n\,(S^{2})^{4/2}} \sum_{i=1}^{n}\left(x_i - \bar{X}\right)^{4} $$
can be used as a measure of peakedness. A distribution with large kurtosis
tends to have a peak near the mean and exhibit heavy tail behaviour, while
distributions with low kurtosis have flat peaks (e.g., the uniform distribution)
with rapidly decreasing tails.
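The summary statistics of this section can be computed in a few lines. The sketch below is an illustration only: it assumes the usual n − 1 divisor for the sample variance, and it uses a synthetic exponential sample with rate 0.75 (the same rate as the exponential density in Figure 5.4) rather than measured data. For such a sample the coefficient of variation should be close to 1 and the skewness close to 2.

```python
import math
import random

def summary_statistics(xs):
    """Sample mean, variance, coefficient of variation, skewness and kurtosis."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)   # assumed n - 1 divisor
    cv = math.sqrt(var) / mean
    skew = sum((x - mean) ** 3 for x in xs) / (n * var ** 1.5)
    kurt = sum((x - mean) ** 4 for x in xs) / (n * var ** 2)
    return {"mean": mean, "variance": var, "cv": cv, "skewness": skew, "kurtosis": kurt}

random.seed(1)
sample = [random.expovariate(0.75) for _ in range(100_000)]  # synthetic data
stats = summary_statistics(sample)
print(stats)
```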
5.2.2 Graphical Methods
Visual inspection makes use of various plots such as the histogram (or Experimental
Probability Density Function (EPDF) if the $x_i$s are normalised to $\sum_i x_i$), EDF,
CCDF, Hill and α-estimation plots [22]. Experienced modellers are often able
to identify distributions or mixtures of distributions by inspection alone.
The lower quantiles of the data are useful to observe using the PDF. The
CCDF serves the same purpose for the upper tail. The CCDF is useful for
discerning potentially heavy tail behaviour in the distribution such as for file
sizes and session durations [47]. The histogram is more suitable for observing
metrics in situations where higher frequency behaviour is to be modelled, such
as for inter-arrival times. Hill plots give an indication of the amount of heavy
tail behaviour, and also potential cutoff points in the censored mixture model
case. The α-estimation provides indications of the degree of self-similarity in
the data.
The visual inspection helps in eliminating many candidate distributions,
and indicates whether a single distribution will suffice or if a mixture model is
required. In certain cases, one may also identify rough estimates of distribution
parameters (e.g., means and variances). For this thesis, single distributions
and mixtures of two distributions are primarily considered, as the number of
measurements makes the heuristics involved in calculating more cutoff points
prohibitively complex.
5.3 Mixture Distributions
A finite mixture distribution is a distribution composed of a weighted sum of
distributions [76, 83]. The PDF of such a distribution is given by
$$ p(x) = \sum_{i=1}^{n} \pi_i f_i(x) \tag{5.9} $$
where the $\pi_i$s are known as the mixing weights or mixing probabilities and the
$f_i$s as the component densities with their associated parameters. For p(x) to
form a proper PDF, all $f_i$s must be proper PDFs, and $\sum_{i=1}^{n} \pi_i = 1$. Figure 5.5
shows an example of a three-component mixture distribution in which all the
components are normal distributions. The dashed lines depict the component
densities, while the solid line is the resulting mixture distribution.
[Figure 5.5: The mixture distribution f(x) = 0.2φ(2.5, 0.5) + 0.4φ(6, 1) + 0.4φ(1, 2). Axes: f(x) against x.]
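The mixture in Figure 5.5 can be evaluated directly from Eq. (5.9). The following sketch takes the weights and component parameters from the figure caption; reading φ(m, s) as a normal density with mean m and standard deviation s is an assumption, since the caption does not say whether the second argument is a standard deviation or a variance. A trapezoidal integration is included only to check that the result is a proper PDF.

```python
from statistics import NormalDist

# Weights and components follow the caption of Figure 5.5; treating the
# second argument of phi as a standard deviation is an assumption.
WEIGHTS = [0.2, 0.4, 0.4]
COMPONENTS = [NormalDist(2.5, 0.5), NormalDist(6.0, 1.0), NormalDist(1.0, 2.0)]

def mixture_pdf(x, weights=WEIGHTS, components=COMPONENTS):
    """Evaluate p(x) = sum_i pi_i * f_i(x) as in Eq. (5.9)."""
    return sum(w * c.pdf(x) for w, c in zip(weights, components))

def integrate(f, lo, hi, steps=20_000):
    """Trapezoidal rule, used only to check that p(x) integrates to one."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, steps))
    return total * h

print(sum(WEIGHTS), integrate(mixture_pdf, -20.0, 30.0))
```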
A classic example of a mixture distribution is the hyper-exponential distribution
$$ H_r(x) = \sum_{i=1}^{r} \pi_i \lambda_i e^{-\lambda_i x} \tag{5.10} $$
which, for the case of r = 2, is an example of a binary mixture distribution
(Eq. (9.1)). This important special case is of the form
$$ p(x) = \pi f_1(x) + (1 - \pi) f_2(x). \tag{5.11} $$
For the purposes of this thesis, a binary mixture in which f1 (x) and f2 (x) are
of the same distributional family is denoted as a dual distribution, e.g., dual
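Gaussian for $g(x) = \pi\varphi_1(x) + (1 - \pi)\varphi_2(x)$, where $\varphi_1$ and $\varphi_2$ are Gaussian densities with their own parameters.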
Probabilistically, a mixture model can be interpreted as modelling an r-stage
parallel system in which a customer enters state i with probability πi . The
customer remains in this state for a time distributed according to distribution
fi .
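The parallel-system interpretation translates directly into a sampling procedure. The sketch below is a hedged illustration for the hyper-exponential case of Eq. (5.10): a stage i is selected with probability π_i, and the holding time is then drawn from an exponential distribution with rate λ_i. The weights and rates are made-up values, and the function name is hypothetical. The sample mean is compared with the theoretical mean, the sum of π_i/λ_i.

```python
import random

def sample_hyperexponential(weights, rates, rng):
    """Pick stage i with probability pi_i, then draw an exponential holding time."""
    stage = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return rng.expovariate(rates[stage])

rng = random.Random(7)
weights, rates = [0.3, 0.7], [0.1, 2.0]     # illustrative values only
draws = [sample_hyperexponential(weights, rates, rng) for _ in range(200_000)]
sample_mean = sum(draws) / len(draws)
expected_mean = sum(w / r for w, r in zip(weights, rates))
print(sample_mean, expected_mean)
```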
The theory for parameter estimation and fitness testing of mixture distributions is quite similar to that of the single distribution case, which makes mixture
distributions an attractive tool for model construction. For instance, in certain
cases, the body and tail of a given empirical distribution may not be adequately
matched in a single distribution. Using a mixture of distributions would provide for more flexibility in the distribution fitting process, such as adding a
high-variance Gaussian component to approximate heavy tail behaviour [76].
However, this flexibility may have as a consequence that the construction of
analytic estimators is difficult, and it is often necessary to resort to numerical
methods. If the number of component densities is large, it may be difficult to
properly estimate parameters even by using numerical methods. Additionally,
the data may have to be censored at specific cut-off points to remove biases in
the estimation procedure. Locating these cut-off points is often a trial-and-error
process and application-specific heuristics may have to be employed.
5.3.1 Censored Mixture Distributions
Though finite mixture models are a useful and intuitive tool for model construction, the method does not work well with all processes. For example, the
hypothesised distribution may lack analytic parameter estimators, and numeric
methods may fail or provide very poor estimates. In this case, a censored mixture
distribution may yield better results. By a censored mixture distribution we
refer to a set of distributions describing various ranges of the distribution of the
observations. Formally, there is a set of distributions $F = \{F_1, \ldots, F_k\}$ describing the ranges $R = \{r_1, \ldots, r_k\}$. These ranges are delimited by the cutoff points
$C = \{c_1, \ldots, c_{k+1}\}$ such that $R = \{r_1 = c_1 \leq X < c_2, \ldots, r_k = c_k \leq X < c_{k+1}\}$.
Usually, k = 2, $c_1 = \min X$ and $c_{k+1} = \max X + \epsilon$, where ε is a small value to
ensure that max X is included in the range $r_k$.
However, the added flexibility comes at the price of additional problems that
need to be addressed:
• The first problem involves the localisation of the cutoff points between
distributions. It is rarely apparent where the cutoff points are located,
and heuristic methods are often useful in locating them.
• Secondly, the parameter estimation methods described in Section 5.4 assume that the observations make up the complete set of observations from
the process. By splitting the data into ranges, this assumption no longer
holds, and the estimation methods need modification.
• A third problem regards the fitness assessment. The common fitness assessment methods described in Chapter 6 assume that the observation is
a complete sample from the process. These methods therefore need to be
modified to work as intended.
Furthermore, in practice, using more than one or two distributions is of little
use other than for purely descriptive purposes. For example, generating
random variates using this type of censored mixture distribution for use in, e.g.,
simulations is cumbersome. Fortunately, for many problems, it is sufficient to
locate a single cutoff point between the distribution body and tail.
Cutoff Localisation
The general idea for locating the cutoff points between distributions is that the
best set of cutoff points is the one which gives the smallest total estimation
error.
This may be achieved by either employing successive censoring as described
in [42], or by expressing the problem as a minimisation problem. An error
function that takes into account the cutoff points can be described as follows:
$$ \varepsilon(C, \Theta, x) = \sum_{i=1}^{k} \left[\delta(x - c_i) - \delta(x - c_{i+1})\right] \xi\!\left(\hat{F}_i(x), F_{n_i}(x)\right) \tag{5.12} $$
where δ(x) is the Heaviside step function and ξ(X, Y) is a vector-valued function
that calculates a distance between X and Y, for example ξ(X, Y) = |X − Y| or
ξ(X, Y) = (X − Y)². This function “picks off” each portion of the errors
between the estimated CDF and the EDF. Here, the variable x does not refer
to the observations but rather to the x-axis for the distribution combinations.
It then becomes a matter of minimising the sum of these errors, i.e., $\min_{\Theta, C} \varepsilon$.
However, if k is large, the optimisation can be difficult due to the number of
parameters involved. If the parameters are known, the problem becomes more
tractable, as the only variables left are the locations of the cutoff points.
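For the tractable case with known parameters, the search for a single cutoff point can be sketched as a scan over candidate cutoffs. The example below is a simplified illustration, not the procedure used in the thesis: it assumes two known CDFs (a uniform-like body and a Pareto-like tail, both made up), uses the squared difference as ξ, evaluates the error on a fixed grid of x values, and builds synthetic data whose true cutoff is at x = 3. All names are hypothetical.

```python
import bisect
import math

def edf(sorted_xs, x):
    """Empirical distribution function F_n(x) = (# observations <= x) / n."""
    return bisect.bisect_right(sorted_xs, x) / len(sorted_xs)

def total_error(cutoff, grid, sorted_xs, body_cdf, tail_cdf):
    """Sum of squared CDF errors, using body_cdf below the cutoff and tail_cdf above."""
    err = 0.0
    for x in grid:
        model = body_cdf(x) if x < cutoff else tail_cdf(x)
        err += (model - edf(sorted_xs, x)) ** 2
    return err

def locate_cutoff(candidates, grid, sorted_xs, body_cdf, tail_cdf):
    """Return the candidate cutoff with the smallest total error."""
    return min(candidates, key=lambda c: total_error(c, grid, sorted_xs, body_cdf, tail_cdf))

# Synthetic data: quantiles of a CDF that follows `body` below 3 and `tail` above.
n = 2000
us = [(i - 0.5) / n for i in range(1, n + 1)]
data = sorted(6 * u if u < 0.5 else 3 / math.sqrt(2 * (1 - u)) for u in us)
body = lambda x: min(max(x / 6, 0.0), 1.0)
tail = lambda x: 1 - 0.5 * (3 / x) ** 2
grid = [0.25 * j for j in range(1, 33)]
best = locate_cutoff([1.0, 2.0, 3.0, 4.0, 5.0], grid, data, body, tail)
print(best)
```

When the component parameters are also unknown, each candidate cutoff would additionally require re-estimating the parameters on the censored ranges, which is where the optimisation becomes expensive.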
Censoring
The “picking off” in the previous section is formally denoted as censoring. Censoring may be performed in the time domain, i.e., only observations up to a
specific time or number of observations are used. This is known as time censoring or type 1 censoring. Type 2 or failure censoring involves using only certain
quantiles of the data. In the case of type 2 censoring with a single cutoff point,
removing data starting from the lower quantiles (starting at 0) is denoted as
left censoring while removing data from the upper quantiles (starting at 1) is
denoted as right censoring.
Censoring removes part of the observations, and many fitness assessment
methods (e.g., the Anderson-Darling (AD) test, Section 6.2.2) assume that all
observations are available. This means that the censored observations need to
be modified. Fortunately, since the most powerful fitness assessment statistics
make use of the Probability Integral Transform (PIT) method, this reduces the
problem of adapting an arbitrary censored distribution to adapting a censored
uniform distribution according to the following transformation [25].
Assume that the values $U_s, \ldots, U_r$, s < r, are a set of ordered observations
from U(s, r). After the transformation
$$ V_i = \frac{U_{s+i} - U_s}{U_r - U_s} \tag{5.13} $$
the variable $V_i$ is an ordered sample from U(0, 1). $V_i$ can then be used to
compute various test statistics or be tested for uniformity.
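The rescaling in Eq. (5.13) is a one-line operation once the retained values are ordered. The sketch below uses made-up PIT values that are assumed to remain after censoring both ends of a sample; the function name is hypothetical. In this sketch the two boundary values map to exactly 0 and 1, and the interior values are spread over (0, 1).

```python
def rescale_censored(ordered_u):
    """Apply Eq. (5.13): map ordered values U_s..U_r onto [0, 1]."""
    u_s, u_r = ordered_u[0], ordered_u[-1]
    return [(u - u_s) / (u_r - u_s) for u in ordered_u]

# PIT values remaining after censoring both ends (made-up numbers).
kept = [0.22, 0.31, 0.40, 0.58, 0.77, 0.90]
v = rescale_censored(kept)
print(v)
```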
The Probability Integral Transformation (PIT) Theorem
In short, the PIT transformation theorem states [25]:
“If X is a real-valued random variable with CDF F (X), then U = F (X) is
a uniformly distributed random variable on the interval (0, 1).”
The PIT transform works similarly for both continuous and discrete random
variables and is commonly used for generating random variates from a U(0, 1)
random number [48]. In this case, the relation $Y = F^{-1}(U)$ is used. If U is a
U(0, 1) random variable, then Y is a random variable distributed according to
F.
By using this method, the testing for a specific distribution F is converted
to testing for uniformity over the range (0, 1).
Figure 9.1(c) shows an example of BitTorrent session inter-arrival times
transformed using the PIT.
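Both directions of the PIT can be demonstrated with the exponential distribution, whose CDF has a closed-form inverse. This is a minimal sketch with an illustrative rate of 0.5 and synthetic data: variates are generated with Y = F^{-1}(U), and then transformed back with U = F(Y). If the transformed values are uniform on (0, 1), their mean and variance should be close to 1/2 and 1/12.

```python
import math
import random

def exp_cdf(x, rate):
    """F(x) for the exponential distribution."""
    return 1.0 - math.exp(-rate * x)

def exp_inverse_cdf(u, rate):
    """Y = F^{-1}(U) for the exponential distribution."""
    return -math.log(1.0 - u) / rate

rng = random.Random(3)
rate = 0.5                                  # illustrative value
ys = [exp_inverse_cdf(rng.random(), rate) for _ in range(50_000)]
us = [exp_cdf(y, rate) for y in ys]         # PIT: should look U(0, 1)
mean_u = sum(us) / len(us)
var_u = sum((u - mean_u) ** 2 for u in us) / (len(us) - 1)
print(mean_u, var_u)                        # compare with 1/2 and 1/12
```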
5.4 Parameter Estimation
Parameter estimation, also known as point estimation, is in essence a minimisation problem. The principal difference among various estimation methods is
related to the function to minimise. The theory of point estimation is extensive and a cursory description of some of the most popular methods is provided
here. These are the method of moments, minimum χ², minimum distance and
Maximum-Likelihood (ML) estimation methods.
Though much of the discussion is about estimation of a single parameter θ,
the methodology and theory is very similar in the case of multiple parameters.
The parameter could thus very well be a set of parameters $\Theta = \{\theta_1, \ldots, \theta_n\}$. The
set of parameters is denoted with a capital Θ, and a single parameter with
lower-case θ.
An estimator, ε, of the parameter θ is a real-valued function of X, so that
$\theta_X = \epsilon(X)$. The estimation error is then $\theta_X - \theta$.
5.4.1 Method of Moments
The idea behind the method of moments is to construct estimators by equating
the sample moments with the distribution moments. This results in a system
of equations as follows:
$$ \begin{aligned} \mu_1 &= m_1 \\ \mu_2 &= m_2 \\ &\;\;\vdots \\ \mu_N &= m_N \end{aligned} \tag{5.14} $$
where $\mu_j = E[X^j]$ and $m_j = \frac{1}{n}\sum_{i=1}^{n} x_i^{j}$. The solution to this set of equations
results in estimators for the first N moments. Often, only the first two moments
are used, yielding the sample mean and sample variance. This method is rarely
used, other than as a starting point for other estimators.
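As an illustration of equating moments, the Log-normal distribution has E[X] = exp(µ + σ²/2) and E[X²] = exp(2µ + 2σ²), so the first two sample moments give σ² = ln(m₂/m₁²) and µ = ln m₁ − σ²/2. The sketch below applies this to synthetic data with assumed parameters µ = 1 and σ = 0.5; it is an example of the technique, not an estimator taken from the thesis.

```python
import math
import random

def lognormal_moment_estimates(xs):
    """Equate the first two sample moments with the Log-normal moments."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    sigma2 = math.log(m2 / (m1 * m1))
    mu = math.log(m1) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

rng = random.Random(11)
data = [rng.lognormvariate(1.0, 0.5) for _ in range(200_000)]  # synthetic data
mu_hat, sigma_hat = lognormal_moment_estimates(data)
print(mu_hat, sigma_hat)
```

Such estimates are typically handed to a numerical optimiser as starting values rather than used as final results.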
5.4.2 Minimum χ²
The minimum χ² estimation method involves aggregation of the input data into
discrete classes or bins. It utilises the χ² statistic (also known as the Pearson
statistic), which is a formal way of comparing a histogram of measured data
and the PDF of a hypothesised distribution.
If the range of the measured data is partitioned into k bins, the χ² statistic
is given by
$$ \chi^{2} = \sum_{i=1}^{k} \frac{\left[n_i - n p_i(\theta)\right]^{2}}{n p_i(\theta)} \tag{5.15} $$
where $n_i$ is the number of samples from the data that fall into bin i and n is
the total number of samples in the data. The term $n p_i(\theta)$ denotes the number
of samples that are expected to fall in bin i, if the samples were drawn from
distribution p with parameter θ. Minimising (5.15) over θ, i.e., solving $\frac{d}{d\theta}\chi^{2}(\theta) = 0$,
yields the minimum-χ² estimate.
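A minimum χ² estimate can be sketched by binning the data once and then evaluating Eq. (5.15) for a range of candidate parameter values. The example below is illustrative only: it uses synthetic exponential data with rate 2, nine bins chosen by hand (the last one open-ended), and a crude grid search standing in for solving the derivative equation analytically. All names are hypothetical.

```python
import math
import random

def bin_counts(data, edges):
    """Count the observations falling in each bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return counts

def chi_square(counts, edges, cdf):
    """Pearson statistic of Eq. (5.15) for a hypothesised CDF."""
    n = sum(counts)
    stat = 0.0
    for i, observed in enumerate(counts):
        expected = n * (cdf(edges[i + 1]) - cdf(edges[i]))
        stat += (observed - expected) ** 2 / expected
    return stat

rng = random.Random(5)
data = [rng.expovariate(2.0) for _ in range(5_000)]      # synthetic data
edges = [0.25 * i for i in range(9)] + [math.inf]        # last bin is open-ended
counts = bin_counts(data, edges)

def statistic_for(rate):
    return chi_square(counts, edges, lambda x: 1.0 - math.exp(-rate * x))

# Crude grid search standing in for solving d(chi^2)/d(theta) = 0.
rates = [1.0 + 0.01 * j for j in range(201)]
best_rate = min(rates, key=statistic_for)
print(best_rate, statistic_for(best_rate))
```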
Note that the minimum χ² method does not require equally sized, i.e.,
equiprobable, bins. However, for this thesis the bins have always been chosen
to be of the same size. This means that specifying the number of bins implicitly
specifies the width of each bin. A critical matter when using the χ² method is
choosing the proper number of bins. If the number of bins is too small, information is lost due to excessive aggregation. On the other hand, if the number
of bins is too large, the histogram becomes jagged and erratic.
There is no specific “correct” method for choosing the “proper” number of
bins, but rather a large number of rules of thumb. Common rules for choosing the number of bins are “enough bins so that each bin contains at least 5
samples” [48] or “make sure that each bin has at least one sample”. A more
formal rule is Sturges’ rule, which gives the number of bins, k, as $k = 1 + \log_2 n$.
To choose the width of the bins, Scott’s rule (5.16) or the Freedman-Diaconis
rule (5.17) is often used.
$$ w = \frac{3.5\,s}{\sqrt[3]{n}} \tag{5.16} $$
$$ w = \frac{2\,\mathrm{IQR}}{\sqrt[3]{n}} \tag{5.17} $$
In (5.16) and (5.17), s is the sample standard deviation, n is the number of
samples, and IQR is the sample interquartile range.
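The three rules can be compared on the same sample. The sketch below is a minimal illustration on synthetic standard normal data; the quartiles are computed with the default method of Python's `statistics.quantiles`, which is one of several common quartile definitions, so the interquartile range may differ slightly from other software.

```python
import math
import random
import statistics

def sturges_bins(n):
    """Number of bins k = 1 + log2(n)."""
    return 1 + math.log2(n)

def scott_width(xs):
    """Bin width from Eq. (5.16)."""
    return 3.5 * statistics.stdev(xs) / len(xs) ** (1 / 3)

def freedman_diaconis_width(xs):
    """Bin width from Eq. (5.17)."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return 2 * (q3 - q1) / len(xs) ** (1 / 3)

rng = random.Random(2)
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]   # synthetic data
print(sturges_bins(len(xs)), scott_width(xs), freedman_diaconis_width(xs))
```

For heavy-tailed data the standard deviation can be dominated by a few large observations, so the interquartile-based width is usually the more robust of the two width rules.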
In [81], the author presents a method based on Scott’s rule for calculating bin
widths that is asymptotically L2 -optimal. This method has proven to provide
good bin widths for most of the data in this thesis, and has been the method of
choice for most histograms presented herein.
A modification of the χ² statistic is the λ² statistic [61]. The λ² statistic takes into
account the discrepancy in the χ² estimate for each bin. This makes it possible
to compare the λ² statistic among estimates using varying numbers of bins, which
is not possible using the χ² statistic.
5.4.3 Minimum Distance
The minimum distance estimation method uses a measure of difference or distance between $F_n$ and $\hat{F}$ (see footnote 3). Compared to the χ² method, it has the advantage
of having no need for data aggregation beyond forming the EDF. The EDF is
defined as $F_n(x) = k/n$ with k as the number of observations less than or equal
to some value x, and n the total number of observations. The aggregation used
in this method is thus data-dependent, and not arbitrary as in the χ² case.
Common distance measures are the supremum distance
$$ D(\theta) = \sup_{x} \left| F_n(x) - \hat{F}(x) \right| $$
and the $l_2$-norm
$$ D(\theta) = \sqrt{\sum_{x} \left[ F_n(x) - \hat{F}(x) \right]^{2}}. $$
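These two distance measures, and the use of one of them as an objective function, can be sketched as follows. The example assumes synthetic exponential data with rate 1.5 and no tied observations, evaluates the l2-norm at the observations only, and replaces a proper optimiser with a grid search over candidate rates. Function names are hypothetical.

```python
import bisect
import math
import random

def edf(sorted_xs, x):
    """F_n(x) = k/n, with k the number of observations <= x."""
    return bisect.bisect_right(sorted_xs, x) / len(sorted_xs)

def sup_distance(sorted_xs, cdf):
    """Supremum distance, checked just before and at each jump of the EDF."""
    n = len(sorted_xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(sorted_xs)
    )

def l2_distance(sorted_xs, cdf):
    """l2-norm of the differences, evaluated at the observations."""
    return math.sqrt(sum((edf(sorted_xs, x) - cdf(x)) ** 2 for x in sorted_xs))

rng = random.Random(9)
xs = sorted(rng.expovariate(1.5) for _ in range(2_000))   # synthetic data
rates = [0.5 + 0.01 * j for j in range(251)]
best = min(rates, key=lambda r: sup_distance(xs, lambda x: 1 - math.exp(-r * x)))
print(best)
```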
The distance measure function is minimised to give the minimum-distance
estimate of the parameter θ, thus
$$ \hat{\Theta} = \arg\min_{\Theta} D(\Theta). \tag{5.18} $$
5.4.4 Maximum Likelihood
The central idea of Maximum Likelihood Estimation (MLE) is to answer the
question “For what set of parameters Θ of the distribution F is it most likely
that we would end up with the data $x_1, \ldots, x_n$?” The answer to this is obtained
by forming the likelihood function, L(Θ),
$$ L(\Theta) = \prod_{i=1}^{n} f(x_i; \Theta). \tag{5.19} $$
If the $x_i$s are independent, L(Θ) is the probability that the $x_i$s would be
obtained if Θ is the parameter set for the distribution f. To avoid terminological
confusion, this probability is called the likelihood.
The likelihood function is maximised to obtain the ML parameter estimate
$$ \hat{\Theta} = \arg\max_{\Theta} L(\Theta). \tag{5.20} $$
3 Recall that $F(x; \hat{\Theta})$ is denoted $\hat{F}(x)$.
Alternatively, the logarithm of the likelihood can be used. The value of
Θ that maximises l(Θ) = ln L(Θ) will also maximise L(Θ). l(Θ) is known
as the log-likelihood function. For example, the likelihood for the exponential
distribution is
$$ L(\mu) = \mu^{-n} e^{-\frac{1}{\mu}\sum_{i=1}^{n} x_i} \tag{5.21} $$
while the log-likelihood function is
$$ l(\mu) = -n \ln \mu - \frac{1}{\mu}\sum_{i=1}^{n} x_i \tag{5.22} $$
which is much easier to manage.
5.4.5 Notes on Parameter Estimation
The theory of parameter estimation is often based on creating analytic expressions for a particular estimator. In certain cases, such as for moment estimators,
which do not take the specific distribution into account, this is not a problem.
However, in the case of mixture distributions with several components, obtaining closed-form expressions for the estimators may prove to be very difficult, if
not impossible. It is often necessary to resort to numerical methods.
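The relation between a closed-form and a numerical ML estimate can be illustrated with the exponential log-likelihood of Eq. (5.22), for which the analytic maximiser is the sample mean. The sketch below is not the R-based procedure used for the results in this work; it uses synthetic data with mean 3 and a golden-section search, chosen here only because the log-likelihood has a single maximum in µ.

```python
import math
import random

def exp_log_likelihood(mu, xs):
    """Eq. (5.22): l(mu) = -n ln(mu) - (1/mu) * sum(x_i)."""
    return -len(xs) * math.log(mu) - sum(xs) / mu

def golden_section_max(f, lo, hi, tol=1e-6):
    """Maximise a unimodal function f on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - ratio * (b - a), a + ratio * (b - a)
        if f(c) > f(d):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
    return (a + b) / 2

rng = random.Random(4)
xs = [rng.expovariate(1 / 3.0) for _ in range(10_000)]   # synthetic, true mean 3
closed_form = sum(xs) / len(xs)                          # analytic ML estimate
numeric = golden_section_max(lambda m: exp_log_likelihood(m, xs), 0.1, 50.0)
print(closed_form, numeric)
```

For mixture distributions the likelihood surface generally has several parameters and possibly several local maxima, so a one-dimensional search like this would have to be replaced by a multivariate optimiser with carefully chosen starting values.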
The parameter estimates presented in this work have been obtained by optimisation methods available in the software package R [5]. For the single-distribution cases, the closed-form ML estimators have been used as initial
estimates where possible. The parameter estimates are further improved by minimising the error percentage as described in Section 7.3. For the case of mixture
distributions and single distributions without closed-form estimators, numeric
ML estimates have been used as starting points for further error percentage
minimisation.
5.5 Summary
This chapter has discussed two basic tools of traffic modelling: model selection
and parameter estimation.
Special focus has been placed on heavy-tailed models. These models challenge fundamental assumptions of network traffic, and accurate modelling of
traffic displaying heavy-tail behaviour is important. Several methods for selecting model distributions and parameter estimation are presented.
Chapter 6
Fitness Assessment
No matter how many instances of white
swans we may have observed, this does not
justify the conclusion that all swans are
white.
– Karl Popper
Performance modelling tends to be fairly subjective, at least in the sense that
the idea of whether a specific model is “good enough” or not, is application
specific. There are several ways of approaching a specific modelling problem as
well as there are several methods for determining whether the selected models
are fit or not for the specific modelling activity. In this chapter, some classic
methods for determining goodness-of-fit as well as an alternative method for
assessing model suitability are discussed.
The same notation for Fn , F̂ et al. as in the previous chapter is used.
6.1 Graphical Methods
For the purpose of performance modelling, graphical methods are intuitive and
appealing ways of assessing the general fitness of a parametrised distribution.
Visual procedures such as histogram and CCDF overplots, Quantile-Quantile
(QQ) plots and difference plots all provide fitness quality assessment information. For instance, overplots give insight into the fitness of the lower and upper
tails respectively of a single distribution. The QQ plot is a useful visual aid
to assess the representativeness of the chosen model to several measurements
simultaneously. Figure 9.1 shows examples of the graphical assessment tools.
A humorously worded but still useful augmentation of both the visual tools
and quantitative tools is the inter-ocular trauma test [10], which basically states
that if the data looks significant, the data is significant. The point is that
goodness-of-fit should not be completely dependent on statistical measures without carefully examining the data.
6.2 Hypothesis Testing
Classical statistical theory provides formal methods for testing the goodness
of fit [25]. These tests are known as hypothesis tests and are formal ways of
testing whether a given set of observations are Independent and Identically
Distributed (IID) samples from a distribution F̂ or not.
The tests are performed by forming a statement regarding the nature of the
observation and a related distribution F . This statement is known as the null
hypothesis and is often denoted by H0 , e.g.,
H0 : The observations x1 , x2 , . . . , xn are drawn from the distribution F
with parameters Θ.
The hypothesis testing procedure entails the calculation of a specific test
statistic and comparing this to tables of critical values. These tables contain
values for the test statistic at specific significance levels. If the test statistic
exceeds the value given at a certain level α, the null hypothesis is said to be
rejected at significance level α. In this thesis, the term pass at significance level
α will occasionally be used when referring to not rejecting the null hypothesis.
The most fundamental form of most test statistics assumes that the parameters of the distribution are not estimated from the data. This is known as
the all parameters known-case or Case 0. The hypothesis is then called a simple hypothesis. If the distribution is known, but the parameters unknown, it is
called a composite hypothesis. If the parameters are estimated from the data in
any way, both the test statistic and the critical values need modification for the
specific distribution (Section 6.3). However, if the parameter estimates Θ̂ are
“good enough”, the all parameters known case can be used to calculate the test
statistics [25].
There are two major types of hypothesis tests: the χ² and EDF tests. EDF
tests are more powerful than χ² tests [25].
6.2.1 The χ² Test
The χ² test is based on the fact that, in the case of the null hypothesis being
true, the distribution of the χ² statistic (Eq. (5.15)) can be shown to converge
to a χ² distribution of degree k − 1 as n → ∞.
Thus, the obtained value for χ² is compared to $\chi^{2}_{k-1,1-\alpha}$, and if $\chi^{2} > \chi^{2}_{k-1,1-\alpha}$
the null hypothesis is rejected at the approximate level α.
The χ² test suffers from the same problems as the χ² estimators, e.g., appropriate selection of bin sizes and the associated risk of misrepresentation. Despite
these problems, the χ² test is still used, since it is possible to test any distribution by using it. Other tests, such as the EDF tests, may require modification
depending on the specific distribution being tested [48].
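The decision rule of the χ² test can be sketched without statistical tables by approximating the critical value. The example below uses the Wilson-Hilferty approximation to the χ² quantile, which is an assumption made here to stay within the Python standard library; it is accurate to a few hundredths for moderate degrees of freedom, and tabulated values should be preferred where exactness matters. The statistic values passed in are made up.

```python
from statistics import NormalDist

def chi2_quantile(p, dof):
    """Wilson-Hilferty approximation to the chi-square quantile (an assumption
    made to stay within the standard library; tables give exact values)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3

def reject_null(chi2_stat, bins, alpha):
    """Reject H0 if the statistic exceeds the (1 - alpha) quantile with k - 1 dof."""
    return chi2_stat > chi2_quantile(1.0 - alpha, bins - 1)

print(chi2_quantile(0.95, 9))
print(reject_null(25.0, bins=10, alpha=0.05), reject_null(8.0, bins=10, alpha=0.05))
```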
6.2.2 EDF Tests
EDF tests are a class of tests based on the differences between the EDF Fn (x)
and the estimated distribution function F̂ (x).
There are two major types of EDF tests: the supremum tests and the quadrature tests.
The Supremum Tests
The most well-known supremum test is based on the Kolmogorov-Smirnov (KS)
statistic. The KS test statistic measures the largest vertical distance between
$F_n$ and $\hat{F}$ [25]. It is defined as
$$ D_n = \sup_{x} \left\{ \left| F_n(x) - \hat{F}(x) \right| \right\} \tag{6.1} $$
or, alternatively
$$ D_n = \max\left\{ D^{+}, D^{-} \right\} \tag{6.2} $$
where $D^{+}$ and $D^{-}$ are the largest positive and negative vertical differences
respectively, i.e., $D^{+} = \sup_x \{F_n(x) - \hat{F}(x)\}$ and $D^{-} = \sup_x \{\hat{F}(x) - F_n(x)\}$. A
related statistic is the Kuiper statistic, which is defined as $V = D^{+} + D^{-}$.
The Quadrature Tests
This class of tests, also known as the Cramér-von Mises (CVM) family of tests,
is based on the sum of squared differences between $F_n$ and $\hat{F}$ [25]. The general
form for the test statistics in this class is
$$ Q = n \int_{-\infty}^{\infty} \left\{ F_n(x) - F(x) \right\}^{2} \Psi(x)\, dF(x) \tag{6.3} $$
where Ψ(x) is an error weighting function and n is the number of samples. This
class of tests is more powerful, as it takes into account all discrepancies between
$F_n$ and $\hat{F}$, not only the largest.
The most common test statistics in the CVM family are the Cramér-von
Mises statistic (denoted by W²) and the Anderson-Darling (AD) statistic (denoted by A²). For W², the weighting function is Ψ(x) = 1, which basically
amounts to W² being the mean square error of the estimated distribution to
the actual data. The CVM statistic weights all errors equally, something that
is not always desirable as distributions often differ mainly in the tails [25].
The AD statistic is more powerful when detecting deviations in the tails of
a distribution. The associated weighting function in this case is
$$ \Psi(x) = \frac{1}{\hat{F}(x)\left(1 - \hat{F}(x)\right)} \tag{6.4} $$
and a visual example of Ψ(x) for a uniform distribution over (0,1) is depicted
in Figure 6.1.
[Figure 6.1: AD weighting function for a uniform distribution. Axes: Weights against x over (0, 1).]
Notes on EDF Tests
The EDF test statistics are usually calculated by using the PIT method. This
method transforms any distribution F into an approximately uniform(0,1) distribution (denoted by U (0, 1)). It can be shown that the vertical differences
between a true U (0, 1) and the estimate are the same as those between F and
F̂ [25].
The statistics are then calculated from the transformed values. For example,
the D and A² statistics are calculated with (6.5) and (6.6)
$$ D_n = \max\left\{ \max_{i}\left(\frac{i}{n} - Z_i\right),\; \max_{i}\left(Z_i - \frac{i-1}{n}\right) \right\} \tag{6.5} $$
$$ A^{2} = -n - \frac{1}{n}\sum_{i=1}^{n} (2i - 1)\left(\ln Z_i + \ln\left[1 - Z_{n-i+1}\right]\right) \tag{6.6} $$
where $Z_i$ are the order statistics of $\hat{F}(X; \Theta)$.
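Equations (6.5) and (6.6) map directly onto code once the PIT values are sorted. The following sketch assumes the simple (all parameters known) case with a synthetic exponential sample and its true rate, so the resulting values should be small; the function names are hypothetical.

```python
import math
import random

def ks_statistic(z):
    """Eq. (6.5), with z the sorted PIT values Z_1 <= ... <= Z_n."""
    n = len(z)
    d_plus = max((i + 1) / n - zi for i, zi in enumerate(z))
    d_minus = max(zi - i / n for i, zi in enumerate(z))
    return max(d_plus, d_minus)

def ad_statistic(z):
    """Eq. (6.6), with z the sorted PIT values."""
    n = len(z)
    total = sum(
        (2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n

rng = random.Random(8)
rate = 2.0                                              # assumed known parameter
xs = [rng.expovariate(rate) for _ in range(500)]        # synthetic data
z = sorted(1.0 - math.exp(-rate * x) for x in xs)       # PIT with the true CDF
print(ks_statistic(z), ad_statistic(z))
```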
Even if the all parameters known-case is assumed, the distribution of most
test statistics differs from the true distribution if parameters are estimated from
the data. With the exception of the AD statistic, all EDF test statistics previously discussed need modification to maintain significance [25]. For example, in
the case of the D statistic, the modification is
$$ D_{mod} = D_n\left(\sqrt{n} + 0.12 + \frac{0.11}{\sqrt{n}}\right). \tag{6.7} $$
In the case of the $D_{mod}$ statistic, the value of $D_{mod}$ clearly grows with the square
root of n. Thus, as n grows very large, so does $D_{mod}$. The modifications for
other supremum statistics are similarly dependent on $\sqrt{n}$.
For the quadrature statistics, the modifications are similar to Eq. (6.8), except
for the AD statistic, which needs no modification if n ≥ 5.
$$ W^{2}_{mod} = \left(W^{2} - 0.1n^{-1} + 0.6n^{-2}\right)\left(1 + n^{-1}\right). \tag{6.8} $$
The modified test statistics are then compared to the values in tables such
as shown in Table 6.1.
For the composite case, i.e., no parameters known, other modifiers and tables
must be employed.
Table 6.1: EDF statistic percentage points

                  Significance level α
Test statistic    0.15    0.10    0.05    0.025   0.01    0.005   0.001
D_mod             1.138   1.224   1.358   1.480   1.628   1.731   1.950
W²                0.284   0.347   0.461   0.581   0.743   0.869   1.167
A²                1.610   1.933   2.492   3.070   3.880   4.500   6.000
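Looking up a modified statistic in Table 6.1 can be sketched as below. The critical values are copied from the D_mod row of the table; reading the fourth significance level as 0.025 is an assumption based on the ordering of the columns. The helper names and the example values of D_n and n are hypothetical.

```python
import math

# Significance levels and D_mod critical values from Table 6.1
# (fourth level read as 0.025, an assumption).
ALPHAS = [0.15, 0.10, 0.05, 0.025, 0.01, 0.005, 0.001]
D_MOD_CRITICAL = [1.138, 1.224, 1.358, 1.480, 1.628, 1.731, 1.950]

def d_mod(d_n, n):
    """Eq. (6.7)."""
    root = math.sqrt(n)
    return d_n * (root + 0.12 + 0.11 / root)

def smallest_rejecting_alpha(d_n, n):
    """Smallest tabulated alpha at which H0 is rejected, or None if it passes at all."""
    value = d_mod(d_n, n)
    rejected = [a for a, crit in zip(ALPHAS, D_MOD_CRITICAL) if value > crit]
    return min(rejected) if rejected else None

print(d_mod(0.05, 1000), smallest_rejecting_alpha(0.05, 1000))
```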
6.3
The Case of Large Sample Spaces
The largest problem with hypothesis tests is that, for a very large number of
observations, they tend to reject the null hypothesis [12, 25, 48].
As noted in the previous section, the modified supremum statistics grow as √n. Unless Dn is very small, H0 will be rejected for even fairly small n. There is no clear connection with the quadrature modifiers, though it is possible to intuit the same effect as shown in Figure 6.1.
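The growth of the modified statistic with n can be made concrete with a small calculation. The Python sketch below evaluates (6.7) for a fixed Dn and searches for the smallest n at which Dmod exceeds the α = 0.05 percentage point of 1.358 from Table 6.1. The function names are illustrative assumptions and not part of the thesis software.

```python
import math

D_CRIT_05 = 1.358  # percentage point for Dmod at alpha = 0.05 (Table 6.1)


def d_mod(d_n, n):
    """Modified supremum statistic, Eq. (6.7)."""
    root = math.sqrt(n)
    return d_n * (root + 0.12 + 0.11 / root)


def smallest_rejecting_n(d_n, crit=D_CRIT_05):
    """Smallest sample size n at which d_mod(d_n, n) exceeds crit.

    d_mod is increasing in n for n >= 1, so doubling followed by
    bisection finds the threshold.
    """
    if d_n <= 0:
        raise ValueError("d_n must be positive")
    lo, hi = 1, 1
    while d_mod(d_n, hi) <= crit:
        lo, hi = hi, hi * 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if d_mod(d_n, mid) > crit:
            hi = mid
        else:
            lo = mid
    return hi


# A maximum discrepancy of one percentage point between model and data:
n_reject = smallest_rejecting_n(0.01)
```

Since the term √n dominates (6.7), the threshold is roughly (1.358 / Dn)² observations, which is quickly exceeded by measurement sets containing millions of records.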
A possible explanation for the rejection of the null hypothesis in the case of a large number of observations is that it is impossible to create a perfect model.
This means that for each additional observation, an additional error term will
be added to the integrated statistic. Even if each error term is very small, the
total error will increase with n.
However, Beran has shown that the rejection of the null hypothesis may also
be caused by LRD in the data [12]. The effect of LRD is especially pronounced
in the case of the simple hypothesis.
6.4
Relative and Absolute Fitness
While the test statistics discussed above may not work as expected for large
sample spaces, the statistics are still useful as goodness-of-fit measures. For
instance, assume that the hypothesised distributions F̂1 and F̂2 yield values for A2 of 23.2 and 43.5 respectively. In this case, the choice of F̂1 as the more representative model would be warranted, since the smaller value indicates the smaller discrepancy from the data. This type of relative measure can
be useful in comparing a large number of hypothesised distributions. However,
as different statistics emphasise different parts of the distribution, it would be
beneficial to compare using several statistics.
If a maximum value for the test statistic exists, it is possible to give an
absolute measure of fitness. For instance, the largest possible value of Dn is 1.
An error percentage for Dn could then be denoted as D% = 100Dn . Absolute
fitness measures have a more intuitive appeal than statistical measures. However, it is not always an easy task to determine a maximum error value. For
example, a maximum error for the AD statistic
\[ A^2 = n \int_{-\infty}^{\infty} \frac{\left[ F_n(x) - F(x) \right]^2}{F(x) \left[ 1 - F(x) \right]} \, dF(x) \tag{6.9} \]
is substantially more difficult to calculate. Results on evaluating the distribution of the AD statistic exist, which may provide some assistance [50]. This is
however beyond the scope of this thesis.
6.5
Summary
A number of fitness assessment methods have been presented. These include the χ2 and EDF hypothesis tests. Furthermore, the problems with large sample spaces have been highlighted: classical hypothesis tests tend to reject the null hypothesis for large sample spaces. Additionally, the benefits of using an absolute test statistic have been put forth.
Chapter 7
Modelling Methodology
I KEEP six honest servingmen;
(They taught me all I knew)
Their names are What and
Why and When
And How and Where and Who.
– Rudyard Kipling
Previous chapters have discussed several methods for model selection, parameter
estimation and fitness assessment. This chapter presents the general methodology used in obtaining the models.
7.1
Distribution Selection
For each parameter to be modelled, distribution selection is an exploratory process of observing EPDF and CCDF plots. For inter-arrival and inter-departure
times, the EPDF is primarily used, while for message rates the CCDF is the
preferred method. The CCDF is better suited for detecting potential long-range
dependent behaviour. This is important to detect in the case of rates, sizes and
durations, as underestimating these characteristics may have an adverse effect
on the network.
7.2
Parameter Estimation
Based on the candidate distributions selected for modelling, MLE is used to
obtain parameter estimates. With the number of observations available for the
measurements, the obtained parameter estimates are assumed to be accurate
enough to consider the associated distribution fully specified, given that the
confidence intervals for the estimated parameters are within acceptable boundaries.
In the case of single and mixture distributions, parameter estimation is a
straightforward procedure, and estimates are obtained from the complete set of
data. In the censored mixture model case, successive right censoring as employed
in [42] together with an error percentage assessment (described in the following
section) is used to find out the cutoff points for the mixture model. The censored
models are however deprecated in favour of more tractable mixture models, as
censored models are less convenient to use in a simulation environment.
Once ML estimates are available, the error percentage presented in the following section is further used to optimise the parameters. It has been found
that this often provides better results than accepting the ML estimates.
7.3
Fitness Assessment
To determine whether a distribution is representative of the observed data,
visual procedures, formal hypothesis tests to a certain extent, and an error
percentage assessment are used.
To assess the quality of the estimated distributions in a more quantitative
manner, a method similar to the EDF test that does not suffer as much with
increasing number of observations is employed.
The method is based on the EDF test for a fully specified distribution, as
described in [25]:
1. Obtain the order statistics X1 < X2 < · · · < Xn from the measured data.
2. Transform the original data by using the PIT method with the selected distribution and estimated parameters. If the samples X1 · · · Xn are IID samples from some distribution F, then Ûi = F(Xi; Θ̂), where i = 1, 2, . . . , n, are IID uniform on [0, 1].
3. Obtain the error percentage by using the following expression:
\[ E_\% = \frac{100}{n E_{max}} \sum_{i=1}^{n} \left| U_i - \hat{U}_i \right| \tag{7.1} \]
where Emax is defined as
\[ \int_0^1 \sup\{ U(x), 1 - U(x) \} \, dx = \int_0^{1/2} \left( 1 - U(x) \right) dx + \int_{1/2}^{1} U(x) \, dx = \frac{3}{4} \tag{7.2} \]
or, in plain terms, the maximum discrepancy from a true U[0, 1] distribution that may occur.
4. Accept or discard the estimated distribution as “good enough” according
to some predefined criteria. For the purposes of this thesis, E% ≈ 5 is
chosen as an upper limit for not discarding the estimated distribution. It
is important to mention that this is not a statistical significance level, but
rather an acceptable margin of error.
Additionally, fuzzy classification or rough set theory may be employed
in quantifying the goodness-of-fit in a more formal way. The informal
degrees of fitness quality presented in Table 7.1 are used. More formally defined measures, e.g., proper membership functions, are a subject of future research.
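The steps above can be sketched in Python as follows. Two details are assumptions of this sketch rather than statements of the thesis: the reference values Ui are taken to be the plotting positions (i − 0.5)/n, and E% is mapped to the informal degrees of Table 7.1 by rounding to the nearest boundary. The function names are hypothetical.

```python
E_MAX = 0.75  # Eq. (7.2)
DEGREES = ["excellent", "very good", "good", "fair", "poor"]  # Table 7.1


def error_percentage(u_hat):
    """Eq. (7.1) for PIT-transformed values u_hat in [0, 1]."""
    u_hat = sorted(u_hat)
    n = len(u_hat)
    # Assumption: reference uniform values U_i = (i - 0.5) / n.
    u_ref = [(i - 0.5) / n for i in range(1, n + 1)]
    total = sum(abs(u - uh) for u, uh in zip(u_ref, u_hat))
    return 100.0 * total / (n * E_MAX)


def fitness_degree(e_pct, limit=5.0):
    """Map E% to a degree from Table 7.1, or 'discard' above the limit."""
    if e_pct > limit:
        return "discard"
    # Assumption: round E% to the nearest boundary listed in Table 7.1.
    index = min(int(e_pct + 0.5), len(DEGREES) - 1)
    return DEGREES[index]


# PIT values that coincide with the reference positions give E% = 0.
perfect = [(i - 0.5) / 10 for i in range(1, 11)]
e_perfect = error_percentage(perfect)
```

Because E% is an average rather than a sum or a supremum scaled by √n, it does not grow merely because more observations are added, which is the property the method is designed to have.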
Table 7.1: Fitness quality boundaries

E% ≈     0           1           2      3      4
Degree   excellent   very good   good   fair   poor
7.3.1
Notes on the Error Percentage
The error percentage presented above suffers from the same problem as do
the CVM and KS statistics, i.e., it weights all errors equally. To address this
problem, a weighting similar to the AD weight function may be used. For
using the error percentage in optimising parameter estimates for heavy-tailed
distributions, a weighting function for the term |Ui − Ûi | as well as an adaptation
of Emax is necessary. Using a weight function of Ri = (1 + Ui)^k provides
increasing weight to the upper tail with increasing k. A general modification of
Emax for any strictly increasing weight function is
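\[
\begin{aligned}
E_{max}(k) &= \int_0^{R_x} \left( R_{max} - R(x) \right) dx + \int_{R_x}^{1} R(x) \, dx \\
&= R_x R_{max} - \tilde{R}(x) \Big|_0^{R_x} + \tilde{R}(x) \Big|_{R_x}^{1} \\
&= R_x R_{max} - \tilde{R}(R_x) + \tilde{R}(0) - \tilde{R}(R_x) + \tilde{R}(1) \\
&= R_x R_{max} - 2\tilde{R}(R_x) + \tilde{R}(0) + \tilde{R}(1)
\end{aligned}
\tag{7.3}
\]
where in the case for Ri = (1 + Ui)^k
\[
\begin{aligned}
R(x) &= (1 + x)^k \\
R_{max} &= R(1) = 2^k \\
\tilde{R}(x) &= \int R(x) \, dx = \frac{(1 + x)^{k+1}}{k + 1} \\
R^{-1}(x) &= \sqrt[k]{x} - 1 \\
R_x &= R^{-1}\left( \tfrac{1}{2} R_{max} \right) = \sqrt[k]{2^{k-1}} - 1 = 2^{\frac{k-1}{k}} - 1.
\end{aligned}
\tag{7.4}
\]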
A suitable value for k depends on the shape of the distribution, and a certain
amount of experimentation is needed for each fitting problem.
Adaptation of the error percentage to other weights such as those similar to
the AD weighting function may be investigated as part of future work.
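A short Python sketch of (7.3) and (7.4) is given below. It evaluates the closed form of Emax(k) and cross-checks it against a numerical integration of the first line of (7.3). The weighted error percentage shown here, in which each term |Ui − Ûi| is multiplied by Ri, is an assumed reading of the text, as are the reference values (i − 0.5)/n and all function names.

```python
def r_tilde(x, k):
    """Antiderivative of R(x) = (1 + x)^k, Eq. (7.4)."""
    return (1.0 + x) ** (k + 1) / (k + 1)


def e_max_weighted(k):
    """Closed form of Emax(k), last line of Eq. (7.3), for k > 0."""
    r_max = 2.0 ** k
    r_x = 2.0 ** ((k - 1.0) / k) - 1.0
    return (r_x * r_max - 2.0 * r_tilde(r_x, k)
            + r_tilde(0.0, k) + r_tilde(1.0, k))


def e_max_numeric(k, steps=100000):
    """Midpoint-rule evaluation of the first line of Eq. (7.3)."""
    r_max = 2.0 ** k
    r_x = 2.0 ** ((k - 1.0) / k) - 1.0
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        r = (1.0 + x) ** k
        total += (r_max - r if x < r_x else r) * h
    return total


def weighted_error_percentage(u_hat, k):
    """Assumed weighted form of Eq. (7.1) with weights R_i = (1 + U_i)^k."""
    u_hat = sorted(u_hat)
    n = len(u_hat)
    u_ref = [(i - 0.5) / n for i in range(1, n + 1)]  # assumption
    total = sum((1.0 + u) ** k * abs(u - uh) for u, uh in zip(u_ref, u_hat))
    return 100.0 * total / (n * e_max_weighted(k))
```

For k = 1 the closed form evaluates to 1.5, and larger k shifts progressively more weight to the upper tail.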
7.4
Summary
This chapter has presented the modelling methodology used in the thesis. Distribution selection, parameter estimation and fitness assessment have been discussed. Furthermore, a dedicated fitness assessment method has been presented, and a parametrisable weighting function for increased tail accuracy has been suggested.
Chapter 8
BitTorrent Measurements
I like to think that the moon is there even if
I am not looking at it.
– Albert Einstein
The measurements reported in this chapter were performed by having instances
of the BitTorrent client software join several distribution swarms. An instrumented version of the reference BitTorrent client has been used to avoid potentially injecting non-standard protocol messages in the swarm. The client was
instrumented to log all incoming and outgoing protocol messages together with
a UNIX timestamp. The BitTorrent client is implemented in Python, an interpreted programming language. The drawback of this is that the timestamps are less accurate than the actual arrival times of the carrying IP datagrams. By comparing the timestamps of back-to-back messages at the application level with those of the corresponding TCP segments, the accuracy is estimated to be approximately 10 ms.
Most of the traffic reported here has been collected over a three week time
period at two measurement points in Blekinge, Sweden. The first measurement
point was the networking lab at BIT, Karlskrona, which is connected to the
Internet through a 100 Mbps Ethernet network. The second measurement point
was placed at a local ISP with a 5 Mbps link. Both measurement points were running the Gentoo Linux operating system on standard PC hardware.
For the initial set of measurements, twelve measurements were performed, each with a duration of two to seven days (Table 8.1). This first set of measurements was performed purely with the instrumented client. An additional measurement, with both application logging and packet capturing running simultaneously, was also performed at BIT, for a total of thirteen measurements.
For the first measurement point, no significant amount of other software was
running simultaneously with the BitTorrent client. At the second measurement
point, the BitTorrent client was running as a normal application, together with
other software such as Web browsers and mail software. The first measurement point can be viewed as a dedicated BitTorrent client, while the second
corresponds to normal desktop PC usage patterns.
8.1
Traffic Metrics
The BitTorrent client application logs are in essence timestamped protocol
events. This means that metrics like inter-arrival and inter-departure times
are readily available by simple calculations. The possibility does exist to compute detailed statistics on several levels of aggregation as well. Most notably,
this offers the possibility to look into potential burstiness on timescales that are
decided by the timestamp accuracy.
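As a minimal sketch of the "simple calculations" mentioned above, the Python fragment below derives inter-arrival (or inter-departure) times and per-interval event counts from a list of event timestamps. The function names and the flat list-of-timestamps input are assumptions; the actual log format of the instrumented client is not shown in this section.

```python
from collections import Counter


def inter_event_times(timestamps):
    """Differences between consecutive event timestamps, in seconds."""
    ts = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ts, ts[1:])]


def counts_per_interval(timestamps, width):
    """Number of events in each interval of `width` seconds,
    counted from the first event; empty intervals are reported as 0."""
    if not timestamps:
        return []
    start = min(timestamps)
    counts = Counter(int((t - start) // width) for t in timestamps)
    return [counts.get(i, 0) for i in range(max(counts) + 1)]
```

Given the estimated timestamp accuracy of approximately 10 ms, interval widths much below that value would not be meaningful when examining burstiness.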
Specific software has been written to extract several important statistics
and metrics, to characterise the peer behaviour only, and not the entire swarm
[30]. To measure the true size of the swarm, active probing of the tracker is
necessary. This is, however, a subject for future work. The goal is to use accurate
characterisation and modelling of the behaviour of a peer in modelling entire
swarms.
A number of metrics have been used for the characterisation of the BitTorrent signalling traffic [30]. The most important ones are as follows:
Download time
This is the time it takes for the modified client to do a complete download.
This metric also shows when the peer changes from being both a downloading and uploading peer to being a seed, thus offering the possibility to collect statistics about the seed and leecher states.
Session duration and size
A BitTorrent session is equivalent to a TCP session, given that the BitTorrent handshake is completed. As BitTorrent protocol messages are fixed-length messages, there is a one-to-one mapping between the messages sent and received during a session and the session size. A BitTorrent session duration is the same as the TCP session duration, whereas the session size is the amount of data transmitted during the TCP session.
Number and type of messages
The number of messages of each type in both upstream and downstream
directions are counted. Together with the session duration and size, this
gives us valuable insights into the behaviour of a peer.
Host persistence
The number of unique host IP addresses and peer client IDs are also
counted. If a given host IP address has a one-to-one mapping to a peer ID
and a long session time, the peer is considered to be persistent. Persistent
peers indicate a healthy swarm in the sense that new peers are more likely
to find a larger number of seeds in a swarm with many persistent peers
than in one with less persistent peers.
Peer swarm size
The peer swarm size refers to the number of peers observed by the measuring client at any given time. This is not the size of the entire swarm, i.e.,
the total number of collaborating peers, but the number of peers to which
the measuring peer is connected. Information about the total swarm size
is only available at the tracker, and therefore it is not considered in the
reported measurements.
Piece response times
The piece response time is defined to be the time elapsed between the
moment of the initial request for any subpiece belonging to a given piece
to the moment of the transmission of the associated have message. This
parameter gives us the possibility to estimate the downstream bandwidth
usage.
Piece popularity
The popularity of a piece is given by the number of requests for any
subpiece of a given piece. This gives an indication of the effectiveness of
the piece selection algorithms of the requesting peers.
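Two of the metrics defined above, piece response time and piece popularity, can be sketched in Python as shown below. The event tuple layout (timestamp, direction, message type, piece index) is a hypothetical log format chosen for this illustration, and counting only incoming request messages towards popularity is an assumed interpretation of the definition.

```python
from collections import Counter


def piece_response_times(events):
    """Time from the first outgoing request for any subpiece of a piece
    to the outgoing have message for that piece.

    events: iterable of (timestamp, direction, message, piece_index),
    with direction "in" or "out" (hypothetical log format).
    """
    first_request = {}
    response = {}
    for ts, direction, message, piece in sorted(events, key=lambda e: e[0]):
        if direction != "out":
            continue
        if message == "request":
            first_request.setdefault(piece, ts)
        elif message == "have" and piece in first_request:
            response.setdefault(piece, ts - first_request[piece])
    return response


def piece_popularity(events):
    """Number of incoming request messages per piece index."""
    return Counter(piece for _, direction, message, piece in events
                   if direction == "in" and message == "request")


log = [
    (0.0, "out", "request", 7),
    (0.1, "out", "request", 7),
    (2.5, "out", "have", 7),
    (3.0, "in", "request", 7),
    (3.2, "in", "request", 7),
    (3.3, "in", "request", 2),
]
```

Dividing the piece size by the response time would give the per-piece estimate of downstream bandwidth usage mentioned in the definition.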
8.2
Traffic Measurements
Measurements 1 through 3 (Table 8.1) were performed with a single instance of
the instrumented BitTorrent client running. As TCP is known to be very aggressive in using the network, this was to minimise the effects of several clients
competing for the available bandwidth and to establish a point of reference for
the rest of client sessions. Measurements 4 through 8 were started simultaneously, as were measurements 11 and 12. The remaining measurements were
performed with some temporal overlap, as shown in Figure 8.1.
An important issue regarding traffic measurements in P2P networks is the
copyright issue. The most popular content in these networks is usually copyrighted material. To circumvent this problem, BitTorrent swarms distributing
several popular Linux operating system distributions were joined. Notably, both
the RedHat Fedora Core 2 (FC2) test and release version swarms were joined.
The FC2 ’Tettnang’ version was released on May 18th 2004, while the rest of the
content was available at the start of the measurements. This provided a unique
opportunity to study the dynamic nature of the FC2 swarms. The contents of
the measured swarms are reported in Table 8.2. Two of the swarms have been
measured from both measurement points to allow for comparisons, one with
temporal overlap, and another without overlap.
Table 8.1: Measurement summary

Number   Records    Start        Duration           Location
1        10770695   2004-05-03   2 days, 20 hours   BIT
2        10653466   2004-05-06   3 days, 19 hours   BIT
3        10990569   2004-05-12   4 days, 4 hours    BIT
4        12567283   2004-05-17   7 days             BIT
5        13691459   2004-05-17   7 days             BIT
6        11754838   2004-05-17   7 days             BIT
7        1943636    2004-05-17   7 days             BIT
8        7321166    2004-05-17   7 days             BIT
9 (a)    687046     2004-05-13   3 days, 7 hours    ISP
10       2881803    2004-05-18   5 days, 23 hours   ISP
11       9252170    2004-05-22   7 days             ISP
12       5599997    2004-05-22   7 days             ISP
13       14803678   2004-06-26   7 days             BIT

(a) Unfortunately, the original data for this measurement was lost due to hardware failure. Thus, most analysis is not performed on this data, and only summary statistics are provided.
Figure 8.1: Temporal structure of measurements 1–12 (timeline from May 1 to May 29)
Table 8.2: Content summary

Content                            Pieces   Size     Measurement
RedHat FC 2 test3 CD Images        8465     2.2 GB   1–3
RedHat FC 2 test3 DVD Image        16708    4.3 GB   6, 10
Slackware Linux Install Disk 1     2501     650 MB   4
Slackware Linux Install Disk 2     2627     670 MB   5
Dynebolic Linux 1.3                2522     650 MB   7, 9
Knoppix Linux 3.4                  2753     700 MB   8
RedHat FC 2 'Tettnang' CD Images   8719     2.2 GB   12, 13
RedHat FC 2 'Tettnang' DVD Image   16673    4.3 GB   11
8.3
Summary Statistics
In this section some of the more salient results obtained from the measurements are reported. Download times and rates are summarised in Table 8.3. It is observed that the time before the measurement peer enters seeding mode varies from roughly 20 minutes up to 6.5 hours. As the content sizes vary with each measurement, the average download rates for the entire content, i.e., the size/time ratio, are also provided. The download rates also show large disparity, ranging from just over 120 kB/s to over 1.3 MB/s, with the first three measurements clearly being the most demanding in terms of bandwidth.
A summary of session sizes and durations is reported in Table 8.4. Also
included are the number of sessions and unique peer IPs and peer client IDs.
Measurement 6 clearly stands out here, both with regard to mean session
size and session length. Also, the maximum session size for this measurement is
more than twice that of any of the other measurements. The mean session size
is also about twice that of the corresponding measurement of the same content
(measurement 10). As measurements 6 and 10 have the top two session sizes, it
is probable that the session size is related to the total content size (4.3 GB).
Table 8.3: Download time and average download rate summary

Measurement   Time (s)   Rate (bytes/s)
1             1930       1149520
2             1932       1147908
3             1681       1319445
4             2607       251424
5             3397       202644
6             23000      190416
7             1237       534282
8             6005       120153
9             2723       242776
10            23475      186570
11            19431      224927
12            9106       250989
13            2951       774420
The minimum session lengths are all set to 0, indicating that all of them
are shorter than the accuracy provided for by the application logs. These very
short sessions are also indicated in the minimum session sizes, and correspond
to a session containing only a handshake or an interrupted handshake.
Another pertinent feature is the ratio of the number of unique IPs to the number of unique peer IDs for measurement 8. The IP-to-ID ratio for this measurement is approximately 0.25, while none of the other measurements are below 0.5. This might indicate either users stopping and restarting their clients several times, or users sharing IPs, such as peers subject to NAT.
Table 8.5 summarises the number of messages received on a per-message
basis. In addition, column 5 shows the number of incoming connection requests
collected.
Table 8.4: Session and peer summary

                Session length (s)            Session size (MB)                     Peers (a)
#    Sessions   Mean   Max      Min   Std     Mean    Max       Min (b)   Std      ID     IP
1    29712      343    98991    0     2741    27.49   647.26    73        70.65    2024   1314
2    46022      233    117605   0     2316    27.15   646.03    73        64.05    1876   1394
3    28687      465    171074   0     3614    28.54   539.20    73        61.70    1913   1319
4    13493      750    143707   0     3942    49.88   671.99    73        100.65   1813   1143
5    12354      910    180298   0     4504    57.08   668.53    73        116.10   1747   962
6    10685      1207   223235   0     7016    74.25   3117.79   73        247.74   1033   619
7    4444       218    46478    0     1642    49.96   431.13    78        76.48    279    184
8    17287      231    87026    0     1972    33.11   695.94    73        109.31   1656   406
9    3043       294    29163    0     1719    21.62   408.05    78        42.27    193    166
10   9701       652    267497   0     5907    37.78   1499.85   73        109.08   444    305
11   43939      448    141509   0     3791    17.22   475.86    73        52.73    1841   1067
12   68288      197    292241   0     2580    8.31    987.89    73        30.63    2177   1152
13   52833      465    483996   0     4036    32.2    1652.83   73        99.4     3930   2440

(a) Unique peer client IDs and IP addresses.
(b) This column is measured in bytes.
The request and have messages clearly dominate in terms of number of
messages sent, while the interested and not interested messages are the
least common. This is valid for all measurements, except for measurement
2, which has almost 5 times more incoming interested messages than the
measurement with the second highest number of interested messages.
The high number of request and have messages found in the measurements
is expected, as the peer is acting as a seed for most of the time spent in the
swarm. When seeding, a peer never receives piece messages, and the downloading peer must request data with the request message, thus explaining their high
number. The have messages are accounted for by the fact that every completed
piece download results in such a message being transmitted.
The summary of the outgoing messages in Table 8.6 again shows the very
low number of interested and not interested messages. The major bulk of
the outgoing messages is however accounted for by the piece messages. This
is again an expected result, as request messages generate a piece message
in response. The absence of transmitted choke messages for measurement 7
indicates that there has been a continuous exchange of data between peers. As
for the request and have messages, these are tightly coupled to the number of
pieces present in the content. The higher number of request messages is due
to these messages corresponding to only a single subpiece.
Table 8.5: Downstream protocol message summary

#    request   not int.   piece    new conn.   bitfield   unchoke   have      int.    choke   cancel
1    3316470   504        135615   29746       28024      27120     3651835   2905    26314   6500
2    3044768   489        135797   46047       45054      19117     3984881   14602   18061   9059
3    3276644   493        135682   28714       27092      40705     3941658   2430    39955   7628
4    5596270   406        40167    13502       12935      29628     1206000   2041    28640   14643
5    6163605   401        42176    12364       11827      32325     1197813   2059    31452   11508
6    4501907   191        277261   10688       9659       24239     2090892   2147    23639   6244
7    810019    52         40371    4445        4370       290       198885    230     122     1255
8    3347256   766        44328    17292       16623      9270      404038    2012    8579    18999
9    217336    37         40426    3045        2996       1114      139472    259     956     3061
10   838379    79         268429   9703        9181       13015     570367    692     11936   9085
11   1835910   470        268575   43957       42848      54090     4713440   2573    52458   17313
12   1118110   348        139943   68297       67373      37925     2619333   3242    36872   25047
13   8113100   711        139702   52865       50304      60524     6293438   9477    58925   24632
Table 8.6: Upstream protocol message summary

#    request   piece     not int.   unchoke   bitfield   int.   have    choke   cancel
1    137007    3251948   63         11792     29714      68     8465    9553    970
2    137271    2964836   63         17471     46020      70     8465    13301   894
3    136738    3189175   62         16545     28682      64     8465    14085   1011
4    42709     5468908   76         25476     13489      86     2501    22740   855
5    44862     6032599   146        25759     12353      157    2627    23749   725
6    291200    4394389   91         23166     10661      197    16708   18943   555
7    40497     808844    18         4445      4444       18     2522    0       140
8    47413     3296616   100        19380     17281      136    2753    8672    423
9    40906     213693    16         3192      3042       19     2522    193     220
10   285650    753074    71         21304     9673       214    16708   15222   611
11   281921    1660868   67         35698     43927      157    16673   31279   812
12   145517    960802    76         49093     68271      125    8719    34570   701
13   141316    7940342   80         27332     52830      97     8719    23527   807
8.4
Swarm Size
The number of locally connected peers at any given time is an indicator of
the popularity of the data content of the swarm. The solid line in Figure 8.2
(Measurement 6) shows the typical evolution of the number of connected peers
in a popular swarm. The measurement peer rapidly connects to – and stays
connected to – the preconfigured maximum number of 55 peers, which is the
default. Measurements 1–3, 6, 11–13 show similar behaviour.
The dashed line (Measurement 4) shows a swarm that is less popular, at
least in the sense that it takes longer to find enough peers to download from.
The maximum number of peers is not reached until the end of the leeching
phase. The amount of data is substantially smaller and the average download rate higher (Tables 8.2 and 8.3), which means that the leech phase ends fairly quickly.
Figure 8.2: Connected peers during seed phase for measurements 4 and 6
This is further reinforced by the accompanying seed phase graphs. During
the seeding phase, none of the measurement peers have the maximum number
of peers continuously connected, though the number of connected peers is close
to the maximum.
Measurements 4, 5, 7 and 9 show much less activity during leech and seed
phases. This behaviour can be explained by the fact that the content of these swarms (the Slackware and Dynebolic Linux distributions) is less well-known and/or less used than the RedHat distribution.
In [39], the authors show the influx of new users in a BitTorrent swarm when
popular new content arrives. They denote this sudden increase in swarm size
as the flash-crowd effect.
Figure 8.3: Connected peers during leech phase for measurements 4 and 6 (number of connections, 10–60, versus time of day, 09:00–16:00)
With this in mind, it is interesting to compare the seed phases of measurements 6, 10 and 11. The data in the first two measurements was the test release
of the RedHat Fedora Core 2 Linux distribution, while the data in the last swarm
was the final release of the same version. The final version was released on May
18, 2004. This event can be clearly observed at around 12:00 in Figure 8.4, at
which time peers start disconnecting. It is likely that this is due to the release
of the new version of the distribution, and that BitTorrent users react quickly
to the availability of the new content.
Figure 8.4: Swarm reaction to new content
It is also interesting to note the similarity of measurements 6 and 10 between
May 19 and 24 as shown in Figure 8.4. The number of connected peers is quite
similar, except for a slight shift in time. Since the measurements were made at
different locations, this shift can be explained by a slight difference in system
clocks.
8.5
Session Sizes
One of the innovations with the BitTorrent protocol is the tit-for-tat notion of
enforced reciprocation. One way to assess how well this is enforced in practice is
to observe the amounts of data sent and received in each session. Of particular
interest are the session sizes during the leech phase.
The average share ratios for the measurement peer during the leech phase are presented in Table 8.7. The share ratio is the upstream session size divided by the downstream session size. Preferably, a peer should have a share ratio of at least 1, i.e., it should upload at least as much data as it downloads.
Table 8.7: Share ratio during leech phase

Measurement   1      2      3      4      5      6      7      8      10     11     12     13
Share ratio   0.01   0.19   0.08   3.45   3.63   0.53   0.60   5.55   0.73   0.05   0.09   0.07
The most apparent result is the very low share ratio for measurements 1–3
and 11–13. This is more likely due to the peer connecting to a high number
of seeds rather than acting unfairly to leech peers. If the latter were the case,
the download times would likely be substantially larger than the other measurements, since the peer would be punished for not reciprocating.
Table 8.8: Correlation coefficients for session sizes

Measurement   1       2       3       4      5      6      7       8      10     11      12     13
Leech phase   −0.10   −0.06   −0.09   0.14   0.11   0.14   −0.14   0.48   0.01   −0.08   0.90   −0.10
Seed phase    0.98    0.96    0.97    1      1      1      1       1      0.98   0.91    0.92   0.99
Table 8.8 shows the correlation coefficients for upstream and downstream session sizes during leech and seed phases. The high correlation of session upstream and downstream sizes during the seed phase is not surprising. It may be explained by the fact that during this phase, the measurement peer only responds to activity from other peers. The peer does not initiate any new connections and does not request any data. The slight deviations from a correlation of 1 are probably due to peers requesting data but disconnecting due to being snubbed or to network problems.
Further investigation of the share ratios and session size correlations is planned as future work.
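For reference, a correlation coefficient of the kind reported in Table 8.8 can be computed as in the Python sketch below. It is assumed here that the coefficients are ordinary Pearson product-moment correlations over per-session (upstream, downstream) size pairs; the text does not state the estimator, and the example values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical seed-phase sessions: bytes received (mostly request
# messages) and bytes sent (mostly piece messages) per session.
downstream = [170, 340, 1700, 3400]
upstream = [163840, 327680, 1638400, 3276800]
rho = pearson(upstream, downstream)
```

When every received request is answered with a piece message, the two sizes are proportional and the coefficient is 1, which matches the seed-phase behaviour described above.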
8.6
Summary
This chapter has given a detailed account of the measurements used for modelling in this thesis. Relevant traffic metrics have been discussed and presented.
Furthermore, summary statistics have been put forth, showing high variability
in bandwidth utilisation and number of active peers. Additionally, peers are
observed to react quickly to new content, which corroborates earlier results.
Chapter 9
BitTorrent Models
Μή μου τοὺς κύκλους τάραττε! (Do not disturb my circles!)
– Archimedes
Traffic modelling is an important activity in the context of predicting future
network behaviour. Accurate models for various networked applications are a
decisive step towards QoS in the Internet. With the rise in popularity of P2P
protocols, the importance of tractable and practically usable models is further
increased.
This chapter presents models for BitTorrent session and message characteristics. The primary goal of the models is to be useful in a simulation environment.
9.1
Session Characteristics
In this section the modelling results for the distributions of session inter-arrival
times, upstream session sizes and durations are reported.
9.1.1
Session Inter-arrival Times
The distributions reported in this section refer to inter-arrival times for remotely
initiated sessions during the seeding phase of the measurement peer. The leech
phase is not considered, partly because it is short compared to the seed phase
and the number of non-locally initiated sessions is fairly low, partly because
the peer is more active during this phase than during the seed phase. The
combination of active peer status and low number of samples (e.g., only 10–20
sessions) that is present during the leech phase makes the analysis more difficult.
Session inter-arrival times have been observed to be well modelled by using
a two-stage hyper-exponential distribution, denoted by H2 . The associated
probability density function is
\[ H_2(x) = p \lambda_1 e^{-\lambda_1 x} + (1 - p) \lambda_2 e^{-\lambda_2 x} \tag{9.1} \]
where λ1 and λ2 are the arrival rates for the two exponential terms, and p
is the mixing weight. Figure 9.1 shows examples of visual assessment tools.
Figures 9.1(a) and 9.1(b) show PDF and CCDF overlay plots for measurement
3. Both indicate a very good match for up to 99% probability mass, with
most of the errors in the tail of the distribution. Figure 9.1(c) shows a QQ
plot with all measurements. Parameter estimates for all measurements have
been obtained using MLE. Table 9.1 reports the parameter estimates and the
associated standard deviations obtained in the fitting procedure. Also presented
is the E% value and the resulting fitness decision and degree.
Summarising the results for session inter-arrival times during the seeding
phase it is observed that all measurements pass according to the selected error
criteria. Furthermore, it is observed that measurements 2 and 3 have low E%
values, and that they pass at significance levels of ≈ 0.005 when using the
Anderson-Darling test. This indicates that the selected H2 distribution is a
good candidate for the underlying true distributions.
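Since the models are intended for use in simulation, a short Python sketch of the H2 distribution in (9.1) may be helpful. It evaluates the density and the CCDF, and draws random inter-arrival times by first selecting one of the two exponential branches. The parameter values are taken from the row of Table 9.1 with E% = 0.49009; the function names and the choice of random seed are assumptions of this sketch.

```python
import math
import random

# Parameter estimates from the row of Table 9.1 with E% = 0.49009.
LAMBDA_1, LAMBDA_2, P = 0.0566, 0.3653, 0.6575


def h2_pdf(x, lam1, lam2, p):
    """Two-stage hyper-exponential density, Eq. (9.1)."""
    return p * lam1 * math.exp(-lam1 * x) + (1 - p) * lam2 * math.exp(-lam2 * x)


def h2_ccdf(x, lam1, lam2, p):
    """P[X > x] for the same distribution."""
    return p * math.exp(-lam1 * x) + (1 - p) * math.exp(-lam2 * x)


def h2_mean(lam1, lam2, p):
    """Expected value p / lam1 + (1 - p) / lam2."""
    return p / lam1 + (1 - p) / lam2


def h2_sample(n, lam1, lam2, p, rng):
    """Draw n values: choose a branch with probability p, then sample
    the corresponding exponential distribution."""
    return [rng.expovariate(lam1 if rng.random() < p else lam2)
            for _ in range(n)]


rng = random.Random(2005)
samples = h2_sample(20000, LAMBDA_1, LAMBDA_2, P, rng)
sample_mean = sum(samples) / len(samples)
```

Comparing sample_mean with h2_mean(LAMBDA_1, LAMBDA_2, P) gives a quick sanity check that a simulator reproduces the fitted model.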
(a) Empirical PDF for measurement 3 with estimate overlaid (density versus inter-arrival time)
(b) CCDF for measurement 3 with estimate overlaid (log10 P[X ≤ x] versus log10 x; the 50%, 80%, 90%, 95% and 99% probability mass levels are marked)
(c) QQ-plot of all measurements subject to H2(λ̂1, λ̂2, p̂)
Figure 9.1: Fitness assessment plots
Table 9.1: Fitted hyper-exponential parameters

Measurement
number        λ̂1 ± σ̂λ1          λ̂2 ± σ̂λ2          p̂ ± σ̂p            E%        Comment
1             0.0593 ± 0.0046   0.1696 ± 0.0085   0.2215 ± 0.0467   2.07367   Pass, fair
2             0.1158 ± 0.0009   0.7556 ± 0.0279   0.7936 ± 0.0066   0.41535   Pass, excellent
3             0.0566 ± 0.0006   0.3653 ± 0.0099   0.6575 ± 0.0077   0.49009   Pass, excellent
4             0.5372 ± 0.0178   0.0168 ± 0.0002   0.2533 ± 0.0052   2.79455   Pass, fair
5             0.5538 ± 0.0212   0.0162 ± 0.0002   0.2156 ± 0.0052   2.79722   Pass, fair
6             0.4798 ± 0.0174   0.0127 ± 0.0002   0.2879 ± 0.0060   3.93588   Pass, poor
7             0.4188 ± 0.0143   0.0052 ± 0.0001   0.3014 ± 0.0076   2.05430   Pass, good
8             0.5142 ± 0.0113   0.0168 ± 0.0002   0.4252 ± 0.0050   2.79291   Pass, fair
10            0.5581 ± 0.0205   0.0128 ± 0.0002   0.3276 ± 0.0064   3.76412   Pass, poor
11            0.0140 ± 0.0009   0.0802 ± 0.0005   0.0219 ± 0.0024   2.20763   Pass, good
12            0.0935 ± 0.0004   5.8224 ± 0.1380   0.8252 ± 0.0021   3.84606   Pass, poor
13            0.0563 ± 0.0004   0.4175 ± 0.0065   0.5897 ± 0.0048   1.87389   Pass, good
98
9.1. SESSION CHARACTERISTICS
9.1.2 Session Duration and Size
In this section the modelling results for the size and duration of remotely initiated peer sessions are reported. It is observed that they are related, with correlation coefficients between 0.25 and 0.67, as shown in Table 9.2. A positive correlation is an expected result, though the correlations were expected to be a bit higher. The
reported correlations indicate that there are either long sessions that request little or
no data, or short sessions that transmit large amounts of data (the
mice and elephants effect [63]). Table 9.3 indicates that it is the former reason
that primarily affects the size-duration correlations.
Table 9.2: Correlation coefficients for session duration and sizes

Measurement   1     2     3     4     5     6     7     8     10    11    12    13
ρxy           0.32  0.36  0.29  0.30  0.30  0.34  0.47  0.40  0.67  0.43  0.38  0.25
Figure 9.2 further illustrates the mice and elephants effect. The dotted lines
show the average session duration and size, respectively. The circles represent
the sessions initiated during the leech phase, and the dashed line marks the
length of the leech phase. It is interesting to note that there are two clearly
discernible clusters for the sessions during the seed phase. These clusters are
present for all measurements to some extent and represent the elephants (the
upper cluster) and the mice (the lower cluster plus the linear shape at the bottom). Note that
the axes are log-scaled.
For reasons similar to those for session inter-arrival times, the following are
considered for modelling:
• Measurements with more than 20000 sessions
• Sessions initiated after the start of the seeding phase
• Sessions that actually request and receive at least one piece.
The reason for this is threefold:
Figure 9.2: Session size-duration scatter plot. (a) Measurement 3; (b) Measurement 13. Both panels plot upstream session size against session duration on logarithmic axes.
1. As observed in Table 9.3, most sessions do not transfer any data after the
initial TCP handshake, with the consequence of a fairly low number of
samples (3–6 % of the total number of sessions) left for parameter estimation. By including the measurements with fewer sessions, the remaining
number of sessions would be inadequate for proper parameter estimations.
Table 9.3: Percentages of session sizes exceeding 0 bytes and 1 piece size

              > 0 bytes                   ≥ 1 pieces
Measurement   Sessions   % of sessions    Sessions   % of sessions
1             1558       5                1392       5
2             1619       4                1356       3
3             1795       6                1564       5
11            3092       7                1769       4
12            3793       6                2612       4
13            3438       7                3017       6
2. The α-estimates for the measurements (Table 9.4) indicate that there could
be some heavy-tail behaviour present in the distributions, as observed
in the CCDF plots. The shape in Figure 9.3(a) is representative of the
CCDFs of session duration for all measurements.
To consider modelling this behaviour, enough samples in the tail are
needed. Considering point 1 above together with the need for a sufficient number of samples in the tail (e.g., the upper 5 %-quantile), a large
number of observations in the original data is necessary.
Table 9.4: Session α-estimates

              duration           size
Measurement   α̂       σ̂²α       α̂       σ̂²α
1             1.335   0.149     1.176   0.353
2             1.264   0.163     1.147   0.339
3             1.523   0.116     1.233   0.320
11            1.379   0.134     0.961   0.222
12            1.272   0.060     0.902   0.147
13            1.435   0.176     1.289   0.207
3. Both session sizes and durations appear to be drawn from a single, similar
distribution when inspecting only sessions that have transmitted at least
one piece (Figure 9.3(b)). Having a single distribution makes the model
more tractable than using a mixture model. This is especially true in
the case of models that cannot be expressed as a mixture distribution
(Section 5.3), i.e., a linear combination of distributions. In this case, it
is necessary to locate cutoff points between distributions, and it is more
cumbersome to use the results in for instance a simulation environment.
The models for session sizes and durations are reported in Tables 9.5 and 9.6,
respectively. Only the sessions that actually receive data have been modelled.
Log-normal distributions with parameters µ and σ have been used for modelling.
The second and third columns show the estimated parameters, together with
the associated estimated standard deviations, for which the best value of E%
was obtained. The value of E% is given in column 6. The fourth column
indicates the tail probability mass for which the estimated distribution passed
the 5 % fitness limit of E% , while the fifth column shows the tail probability
mass for which the best value of E% was obtained. It should also be noted
that the single Log-normal distribution fitted to durations and sizes tends to
overestimate them.

Figure 9.3: α-estimates and CCDF for measurement 3. (a) Duration CCDF for all sessions; (b) Duration CCDF for sessions with ≥ 1 piece; (c) α-estimate plot for session duration (28687 points, α-estimate 1.523).
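
As an illustration of how Log-normal parameters and their standard deviations can be obtained, the following sketch applies the closed-form maximum-likelihood estimators to synthetic data. It is only a sketch under stated assumptions: the function name, the synthetic sample and the use of the asymptotic standard errors σ/√n and σ/√(2n) are choices made here and are not necessarily the procedure used to produce Tables 9.5 and 9.6.

```python
import math
import random

def fit_lognormal(samples):
    """Return (mu_hat, sigma_hat, se_mu, se_sigma) for strictly positive samples."""
    logs = [math.log(x) for x in samples]
    n = len(logs)
    mu = sum(logs) / n
    # The maximum-likelihood variance estimate divides by n, not n - 1.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    se_mu = sigma / math.sqrt(n)          # asymptotic standard error of mu_hat
    se_sigma = sigma / math.sqrt(2 * n)   # asymptotic standard error of sigma_hat
    return mu, sigma, se_mu, se_sigma

# Synthetic "durations" with hypothetical parameters, for illustration only.
rng = random.Random(1)
synthetic = [rng.lognormvariate(8.2, 1.4) for _ in range(5000)]
mu_hat, sigma_hat, se_mu, se_sigma = fit_lognormal(synthetic)
print(f"mu = {mu_hat:.2f} ± {se_mu:.2f}, sigma = {sigma_hat:.2f} ± {se_sigma:.2f}")
```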
Since the number of samples is substantially smaller than for the hyper-exponential models shown in Section 9.1.1, the AD statistic is calculated for the
estimated distribution. Column 7 shows the significance levels obtained in the
AD test, under the assumption that the parameter estimates are good enough to
assume a fully specified distribution. The last column shows the fitness decision,
together with the result of the AD test passing at the critical level.
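
For reference, the AD statistic for a fully specified distribution F can be computed from the ordered sample as A² = −n − (1/n) Σ (2i − 1)[ln F(x(i)) + ln(1 − F(x(n+1−i)))]. The sketch below implements this standard formula; the helper names and the deterministic test sample (built from Log-normal quantiles with hypothetical parameters) are assumptions for illustration, and no critical values are included.

```python
import math
from statistics import NormalDist

def anderson_darling(samples, cdf):
    """A^2 statistic of the samples against a fully specified continuous CDF."""
    xs = sorted(samples)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        upper = xs[n - i]  # x_(n+1-i) in 1-based notation
        total += (2 * i - 1) * (math.log(cdf(x)) + math.log(1.0 - cdf(upper)))
    return -n - total / n

def lognormal_cdf(mu, sigma):
    """CDF of a Log-normal distribution with log-scale parameters mu and sigma."""
    normal = NormalDist(mu, sigma)
    return lambda x: normal.cdf(math.log(x))

# Deterministic sample placed at the (i - 0.5)/n quantiles of a Log-normal.
MU, SIGMA, N = 8.2, 1.4, 200
std = NormalDist()
sample = [math.exp(MU + SIGMA * std.inv_cdf((i - 0.5) / N)) for i in range(1, N + 1)]

a2_match = anderson_darling(sample, lognormal_cdf(MU, SIGMA))        # matching model
a2_shift = anderson_darling(sample, lognormal_cdf(MU - 1.0, SIGMA))  # shifted model
```

A sample that follows the hypothesised distribution yields a small A², while the shifted model yields a much larger value.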
Table 9.5: Log-normal parameter estimates and errors for upstream session sizes during seed phase

Measurement                                 Tail
number        µ̂ ± σ̂         σ̂LN ± σ̂      Pass   mass   E%    AD sign.   Comment
1             18.7 ± 0.04   0.62 ± 0.02   0.45   0.21   2.1   > 0.25     Pass, good; AD: Pass
2             17.8 ± 0.04   0.99 ± 0.03   1      0.4    2.9   > 0.025    Pass, fair; AD: Fail
3             18.4 ± 0.04   0.60 ± 0.02   1      0.24   3.3   > 0.05     Pass, fair; AD: Pass
11            14.1 ± 0.06   2.44 ± 0.04   1      0.99   2.4   ≈ 0.001    Pass, good; AD: Fail
12            13.6 ± 0.05   2.36 ± 0.04   0.86   0.74   3.4   < 0.001    Pass, fair; AD: Fail
13            19.0 ± 0.03   0.69 ± 0.02   1      0.17   3.0   > 0.025    Pass, fair; AD: Fail
Table 9.6: Log-normal parameter estimates and errors for upstream session durations during seed phase

Measurement                                 Tail
number        µ̂ ± σ̂         σ̂LN ± σ̂      Pass   mass   E%    AD sign.   Comment
1             8.55 ± 0.03   1.08 ± 0.02   1      0.74   2.2   ≈ 0.01     Pass, good; AD: Fail
2             8.16 ± 0.04   1.33 ± 0.03   1      0.99   1.5   > 0.15     Pass, good; AD: Pass
3             8.17 ± 0.04   1.38 ± 0.02   1      0.98   1.6   > 0.05     Pass, good; AD: Pass
11            8.09 ± 0.04   1.56 ± 0.03   1      1      2.4   > 0.001    Pass, good; AD: Fail
12            7.2 ± 0.03    1.57 ± 0.02   1      1      3.9   ≪ 0.001    Pass, poor; AD: Fail
13            7.94 ± 0.03   1.52 ± 0.02   1      1      2.3   < 0.001    Pass, good; AD: Fail

Although it was expected that a true heavy-tail model such as the Pareto distribution
or a mixture of the Pareto and Log-normal distributions would provide a better
fitting model, it was found that this was not the case. This is most likely
due to the limitation in the amount of data available in a BitTorrent swarm,
which places an upper bound on the amount of data that a peer is interested in
downloading. There is no point in a peer downloading more data once the entire
content is obtained. It may be conjectured that if a swarming BitTorrent-like
model is applied to streaming data such as VoIP or video, the distribution would
tend more toward a Paretian tail.
9.2 Message Characteristics
While the models presented in the previous section refer to traffic collected from
application layer logs from several measurements, the following models refer to
traffic collected from the link layer trace of measurement 13. The link layer
packet captures were processed using the modified tcptrace described in [30].
The BitTorrent-specific parser module created the time-stamped message logs
used for the models.
The resolution used for modelling the message rates is one second. This
resolution has been chosen partly to reduce the amount of data in larger sample
sets and partly due to the difficulty of properly calculating instantaneous rates
on short timescales. Also, the number of back-to-back messages has not been
modelled, i.e., inter-arrival and inter-departure times of 0 s are excluded from
the models.
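
The two preprocessing steps described above (one-second rate resolution and exclusion of back-to-back messages) could look roughly as follows. This is a minimal sketch with hypothetical function names and made-up timestamps; whether empty one-second intervals are counted as zero-rate samples is an assumption made here, not something stated in the text.

```python
import math
from collections import Counter

def per_second_rates(timestamps):
    """Number of messages in each one-second interval, counted from the first message."""
    if not timestamps:
        return []
    start = math.floor(min(timestamps))
    counts = Counter(math.floor(t) - start for t in timestamps)
    # Assumption: intervals without messages are kept as zero-rate samples.
    return [counts.get(i, 0) for i in range(max(counts) + 1)]

def nonzero_gaps(timestamps):
    """Inter-arrival (or inter-departure) times, excluding gaps of exactly 0 s."""
    ts = sorted(timestamps)
    return [b - a for a, b in zip(ts, ts[1:]) if b > a]

# Made-up message timestamps in seconds; two messages share the stamp 10.25.
stamps = [10.0, 10.25, 10.25, 11.5, 13.75]
rates = per_second_rates(stamps)
gaps = nonzero_gaps(stamps)
```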
9.2.1 Upstream request-messages During Leech Phase
The request-messages and their responses are the major bandwidth contributors in a BitTorrent session. Modelling the request behaviour of the measurement peer provides valuable information to describe the overall behaviour of the
entire swarm.
The leech phase of a BitTorrent peer may be partitioned into three different sub-phases. During the first sub-phase, the peer is trying to connect to a
predefined maximum number of other peers. This means that the number of
connected peers will increase during this sub-phase, thus also increasing the number
of outgoing piece requests. On entering the second sub-phase, the peer has connected to enough peers. The number of connected peers and outgoing messages
fluctuates around some average value during this sub-phase. The final sub-phase is the
end-game mode (Section 3.6.1), during which the peer sends a large number of
requests.
Figure 9.4 shows the upstream request rate measured in requests per second
during the leech phase of the measurement peer. The sub-phases are delimited
by dashed lines. The behaviour during the sub-phases is clearly observable in
this figure. A single distributional model for the entire duration of the leech
phase would certainly not be able to capture this behaviour. Therefore, the
models in this section only relate to the longest of the sub-phases.

Figure 9.4: Upstream request rate during leech phase (number of requests per interval against interval number).
Models are provided for the instantaneous upstream request rate and upstream inter-departure times for the messages.
Request Rates
Upstream request rates have been observed to be accurately modelled by a
Gaussian distribution. The results of the modelling are given in Table 9.7 and
the associated EPDF and CCDF overlay plots in Figure 9.5.
It is conjectured that the appearance of a Gaussian distribution is due to the
fact that requests to each connected peer can be viewed as being drawn from
a separate distribution. The sum of these distributions would then approach a
Gaussian distribution according to the central limit theorem.
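
The conjecture can be illustrated with a small simulation in which each connected peer contributes requests drawn from its own distribution and the per-second totals are inspected. Everything in this sketch is hypothetical (the number of peers, the uniform per-peer distributions and the function names); it only demonstrates the central limit argument and is not a model of the measured swarm.

```python
import random

def simulate_total_rates(n_peers=39, n_seconds=5000, seed=3):
    """Per-second totals of per-peer rates, each peer having its own distribution."""
    rng = random.Random(seed)
    # Hypothetical per-peer distributions: uniform on [0, w] with w in {1, 2, 3}.
    widths = [1 + (k % 3) for k in range(n_peers)]
    return [sum(rng.uniform(0.0, w) for w in widths) for _ in range(n_seconds)]

def mean_sd_skew(values):
    """Sample mean, standard deviation and skewness."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return mean, var ** 0.5, third / var ** 1.5

totals = simulate_total_rates()
mean, sd, skew = mean_sd_skew(totals)
print(f"mean = {mean:.2f}, sd = {sd:.2f}, skewness = {skew:.3f}")
```

The totals are close to symmetric around their mean, as expected for a sum of many independent contributions.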
Table 9.7: Gaussian parameter estimates and errors for upstream request rate during leech phase

µ̂ ± σ̂         σ̂N ± σ̂       E%     AD sign.   Comment
39.4 ± 0.04   6.88 ± 0.04   0.52   ≈ 0.05     Pass, very good; AD: Pass

Figure 9.5: Modelling results for request rate during leech phase. (a) Request rate CCDF; (b) Request rate EPDF (density against messages/s).
Request Inter-departure Times
The exponential distribution has been used to model upstream request inter-departure times. The results of this modelling are given in Table 9.8 and the
associated EPDF and CCDF overlay plots in Figure 9.6.

Table 9.8: Exponential parameter estimates and errors for request inter-departure times during leech phase

λ̂ ± σ̂       E%     Comment
32.8 ± 0.8   1.98   Pass, good
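
For the exponential distribution the maximum-likelihood estimate of the rate has a closed form, λ̂ = n / Σ xᵢ, with asymptotic standard error λ̂/√n. The following sketch applies it to synthetic inter-departure times; the function name, the sample size and the seed are assumptions for illustration only.

```python
import math
import random

def fit_exponential(gaps):
    """Return (lambda_hat, standard_error) for positive inter-departure times."""
    n = len(gaps)
    lam = n / sum(gaps)               # reciprocal of the sample mean
    return lam, lam / math.sqrt(n)    # asymptotic standard error

# Synthetic gaps drawn with the rate reported in Table 9.8.
rng = random.Random(7)
synthetic_gaps = [rng.expovariate(32.8) for _ in range(2000)]
lam_hat, lam_se = fit_exponential(synthetic_gaps)
print(f"lambda = {lam_hat:.1f} ± {lam_se:.1f}")
```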
It is observed that most of the error stems from inter-departure times shorter
than 0.001 s. Fitting an exponential distribution to times longer than this results
in about 0.4 % error.
This warrants the proposition of an additional, alternative model in which
inter-departure times shorter than 0.001 s are modelled by a uniform distribution. As before, the longer times are modelled by an exponential distribution.
The censored mixture model gives a better result with regard to the error
percentage. The cutoff point between the uniform and exponential distributions
is approximately 0.142, which corresponds to 0.001 s.
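
One way to use such a two-part model in a simulation is sketched below. Note the assumptions: the value 0.142 is interpreted here as the probability mass below the 0.001 s cutoff, and the exponential part is taken to be an exponential distribution conditioned on exceeding the cutoff (which, by memorylessness, is a shifted exponential). Neither interpretation is confirmed by the text, and the names are hypothetical.

```python
import math
import random

CUTOFF = 0.001   # seconds; boundary between the uniform and exponential parts
P_SHORT = 0.142  # ASSUMPTION: share of inter-departure times below CUTOFF
LAM = 30.8       # exponential rate (1/s) as reported in Table 9.9

def mixture_cdf(x):
    """CDF of the piecewise uniform/exponential model under the stated assumptions."""
    if x <= 0.0:
        return 0.0
    if x <= CUTOFF:
        return P_SHORT * x / CUTOFF
    return P_SHORT + (1.0 - P_SHORT) * (1.0 - math.exp(-LAM * (x - CUTOFF)))

def sample_gap(rng):
    """Draw one inter-departure time from the piecewise model."""
    if rng.random() < P_SHORT:
        return rng.uniform(0.0, CUTOFF)
    # An exponential conditioned on X > CUTOFF is CUTOFF plus a fresh exponential.
    return CUTOFF + rng.expovariate(LAM)

rng = random.Random(11)
gaps = [sample_gap(rng) for _ in range(20000)]
short_share = sum(g <= CUTOFF for g in gaps) / len(gaps)
```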
Table 9.9: Exponential and Uniform parameter estimates and errors using alternative model for request inter-departure times during leech phase

Uni. range     Uni. E%   λ̂ ± σ̂       Exp. E%   Total E%   Comment
[0, ≈ 10^-3]   3.52      30.8 ± 0.4   0.39      0.55       Exp.: Pass, excellent; Unif.: Pass, fair; Total: Pass, very good

Figure 9.6: Modelling results for request inter-departure times during leech phase. (a) Request inter-departure time CCDF; (b) Request inter-departure time EPDF.
It is clear that the uniform part of the model does not provide as good a fit
as the exponential part. However, the improvement in the exponential part still
makes the model a viable alternative to the pure exponential model.
Also, while the linear portion in the upper 1 % quantile can be fitted to a
Pareto distribution, the exponential provides a good enough fit for the purpose
of simulation.
9.2.2 Downstream piece-messages During Leech Phase
By modelling the behaviour of downstream piece-messages, it is possible to
understand the response characteristics of uploading peers.
Piece Rates
Contrary to what might be expected, the downstream piece rates do not conform
to a Gaussian distribution, but rather to a Weibull distribution. The results
presented in Table 9.10 and Figure 9.7 clearly show the applicability of the
model. In particular, the AD statistic (A²) has been calculated to be 0.325,
approximately equivalent to α = 0.9 (Table 6.1). This gives a quantitative
confirmation of the qualitative fitness clearly visible in Figure 9.7.
Table 9.10: Weibull parameter estimates and errors for downstream piece rate during leech phase

α̂ ± σ̂        β̂ ± σ̂        E%     AD sign.   Comment
6.83 ± 0.06   53.6 ± 0.05   0.52   ≈ 0.9      Pass, very good; AD: Pass
It is interesting to note the skewness of the EPDF in Figure 9.7. The tendency is to a lighter upper tail and a heavier lower tail with a higher mean than
the corresponding request-messages. This is probably due to back-to-back requests being sent. Since a sub-piece is larger (16384 bytes) than an Ethernet
frame, it cannot fit in a single frame and would thus be taken into account in
the rate calculations.

Figure 9.7: Modelling results for downstream piece rate during leech phase. (a) Piece rate CCDF; (b) Piece rate EPDF (density against messages/s).
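
The skewness noted above can be checked against the fitted Weibull model itself. The sketch below computes the mean, standard deviation and skewness from the Gamma-function expressions for the Weibull moments, assuming that α is the shape parameter and β the scale parameter (an assumption about the parameterisation, which is consistent with the rate range shown in Figure 9.7 but is not stated in this section).

```python
import math

def weibull_moments(shape, scale):
    """Mean, standard deviation and skewness of a Weibull(shape, scale) variable."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    g3 = math.gamma(1.0 + 3.0 / shape)
    mean = scale * g1
    sd = scale * math.sqrt(g2 - g1 ** 2)
    # Third central moment divided by sd^3; the scale cancels out.
    skew = (g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3) / (g2 - g1 ** 2) ** 1.5
    return mean, sd, skew

# Point estimates from Table 9.10, read as shape = alpha and scale = beta.
mean, sd, skew = weibull_moments(6.83, 53.6)
print(f"mean = {mean:.1f} messages/s, sd = {sd:.1f}, skewness = {skew:.2f}")
```

Under this reading the skewness is negative, which agrees with the heavier lower tail and lighter upper tail described in the text.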
Piece Inter-arrival Times
The inter-arrival times show similar improvements in fitting quality as downstream rates. The same model as for upstream request-messages is used, and
the results are presented in Table 9.11.
Table 9.11: Exponential parameter estimates and errors for piece inter-arrival times during leech phase

λ̂ ± σ̂       E%     Comment
48.4 ± 0.5   0.53   Pass, very good
The absence of the lower-tail discrepancies that were evident for the
upstream request-messages leads us to suspect that there is some form of local
interference. This can be kernel or user-space queueing on the measurement
computer. The fact that the exponential distribution is still valid is however an
indicator of the validity of the first model, despite the discrepancies.

Figure 9.8: Modelling results for downstream piece inter-arrival times during leech phase. (a) Inter-arrival time CCDF; (b) Inter-arrival time EPDF.

Figure 9.8 shows the EPDF and CCDF for piece-message inter-arrival
times during the leech phase. The improvement in tail fitness compared to
the request inter-departure times is clearly observable in Figure 9.8(a).
9.2.3 Downstream request-messages During Seed Phase
In this section, models for the aggregate, i.e., the rate and inter-arrival times of
all downstream request-messages, are presented. While these models do not
provide any direct information on the behaviour of any particular peer, they do
provide information on the expected load placed on a participating peer. The
models also give an indication of the downstream network load placed by a
seeding peer.
Request Rates
As with the upstream request-message inter-departure times, two models are
presented for the downstream messages. However, as opposed to the upstream
request rates, the downstream equivalent is not Gaussian but displays a heavier
tail. The models used thus need at least a long-tailed distribution. A single
Weibull and a dual Weibull mixture have been fitted with good results. The
results of the single Weibull model are presented in Table 9.12, while Table 9.13
shows the results for the dual model.
Table 9.12: Weibull parameter estimates and errors for downstream request rate during seed phase

α̂ ± σ̂        β̂ ± σ̂        E%     Comment
1.99 ± 0.14   12.8 ± 0.29   0.99   Pass, very good
Though the single Weibull provides good results, adding a second Weibull
component reduces the error percentage by more than 75 % (E% drops from 0.99 to 0.21). The decrease occurs primarily
in the lower tail, but the upper tail also yields significant improvement. The
mixture model gives an excellent match up to at least the 99 % quantile, as
shown in Figure 9.9.
Table 9.13: Dual Weibull parameter estimates and errors for downstream request rate during seed phase

α̂1     β̂1     α̂2     β̂2     p̂      E%     Comment
2.24   10.8   1.86   15.8   0.52   0.21   Pass, excellent
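
A dual Weibull mixture of this kind can be evaluated as sketched below, using the point estimates of Table 9.13. Two assumptions are made here and are not confirmed by the text: α is read as the shape and β as the scale parameter, and the weight p̂ is applied to the first component. The function names are hypothetical.

```python
import math

def weibull_ccdf(x, shape, scale):
    """P[X > x] for a Weibull distribution."""
    return math.exp(-((x / scale) ** shape)) if x > 0.0 else 1.0

def dual_weibull_ccdf(x, a1, b1, a2, b2, p):
    """CCDF of a two-component Weibull mixture with weight p on component 1."""
    return p * weibull_ccdf(x, a1, b1) + (1.0 - p) * weibull_ccdf(x, a2, b2)

# Point estimates from Table 9.13 (assumed shape/scale reading, see above).
PARAMS = dict(a1=2.24, b1=10.8, a2=1.86, b2=15.8, p=0.52)

def quantile(q, lo=0.0, hi=1000.0):
    """Rate below which a fraction q of the probability mass lies (bisection)."""
    target = 1.0 - q
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dual_weibull_ccdf(mid, **PARAMS) > target:
            lo = mid
        else:
            hi = mid
    return hi

q99 = quantile(0.99)
print(f"99 % quantile of the mixture: {q99:.1f} messages/s")
```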
Again, as in the case of upstream request inter-departure times, a linearity
can be observed in the CCDF (Figure 9.9) for quantiles > 99 %, which can be
matched to a true heavy-tail distribution. The blue line shows a Pareto fit to the
upper 1 % quantile. While underestimation of a rate would also underestimate
the network load, the tail discrepancy for the dual Weibull model is considered
acceptable. This is partly because request-messages are small, and partly because the tail does not seem to propagate to the corresponding piece-messages
(see Figure 9.12).
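
A Pareto fit to the upper 1 % quantile can be obtained, for example, with a Hill-type maximum-likelihood estimator applied to the largest observations. The sketch below shows this technique on synthetic data; it is named here as an assumption, since the text does not state which estimator was used for the fit shown in the figure, and all names and parameters are hypothetical.

```python
import math
import random

def pareto_tail_fit(samples, tail_fraction=0.01):
    """Return (x_min, alpha_hat) for the upper tail_fraction of the samples."""
    xs = sorted(samples)
    k = max(2, int(len(xs) * tail_fraction))
    tail = xs[-k:]
    x_min = tail[0]  # smallest value in the tail acts as the threshold
    # Maximum-likelihood (Hill) estimate from the k - 1 values above the threshold.
    log_sum = sum(math.log(x / x_min) for x in tail[1:])
    return x_min, (k - 1) / log_sum

# Synthetic Pareto data with a hypothetical tail index of 1.5.
rng = random.Random(5)
synthetic = [rng.paretovariate(1.5) for _ in range(100_000)]
x_min, alpha_hat = pareto_tail_fit(synthetic)
print(f"threshold = {x_min:.1f}, alpha = {alpha_hat:.2f}")
```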
Figure 9.9: Dual Weibull modelling results for downstream request rate during seed phase. (a) Request rate CCDF; (b) Request rate EPDF (density against messages/s).

Since in Section 9.2.1 a Gaussian distribution was fitted to upstream request
rates, it might be expected that this distribution would also appear in the
downstream rates. There are a few reasons as to why this is not the case.
First, recall that the Gaussian was only used for modelling the part of the leech
phase that lies between the start-up and end-game mode sub-phases. No such restriction applies to the
downstream messages. In particular, the end-game mode adds a certain amount
to the tail of the distribution. This is not as pronounced as for the upstream
messages, since the measurement peer only receives one request per remaining
sub-piece from the requesting peer.
Second, heavy-tail behaviour may be induced by TCP under certain circumstances [34]. Also, the ON-OFF behaviour induced by losses can be further
increased by the way the tit-for-tat mechanism works. Assuming a per-flow
view, a requesting peer may be forced to wait for an unchoke-message before
being allowed to download. This state would then be equivalent to an OFF
state, with the peer entering the ON state after receiving the unchoke-message.
For frequently choked peers with high download rates, heavy-tail behaviour may
occur.
Third, TCP has a tendency to propagate LRD behaviour [79]. This means
that if a TCP connection is sharing a link with LRD traffic, the connection will
“catch” the LRD behaviour.
Considering all these factors, the appearance of Weibull distributions is not
surprising, but may rather be expected.
Request Inter-arrival Times
Compared to the upstream inter-departure times, the equivalent downstream
inter-arrival times are better behaved. An exponential distribution provides
a good model for the entire range of data. The results are presented in Table 9.14. It is interesting to note that the inter-arrival rate is about 1/3 of the
equivalent inter-departure rate. This is not unexpected behaviour, even if every
requesting peer has a total request rate of approximately 30 requests per second.
The requests are then spread out across the swarm, decreasing the total load
on each uploading peer.
Table 9.14: Exponential parameter estimates and errors for request inter-arrival times during seed phase

λ̂ ± σ̂        E%     Comment
9.03 ± 0.18   0.54   Pass, very good
The tail of the CCDF in Figure 9.10 shows the same tendency of heavy-tail
behaviour. This corroborates the work reported in [79] to a certain extent.
9.2.4 Upstream piece-messages During Seed Phase
While the previous section presented models for the incoming requests to a
seeding peer, this section provides models for the corresponding piece-messages.
Though these are expected to be highly related (see Table 9.2), it is still of
interest to observe how and to what degree the models differ.
The models used for the piece-messages are the same as for the corresponding request-messages, i.e., the single and dual Weibull distributions.

Figure 9.10: Modelling results for request inter-arrival times during seed phase. (a) Request inter-arrival time CCDF; (b) Request inter-arrival time EPDF.
Piece Rates
For the single Weibull model, the parameters for upstream piece-messages are
very similar to those in the request model (compare Tables 9.12 and 9.15). The α-parameter is slightly higher and the β-parameter slightly lower,
but for simulation purposes the models can be viewed as equivalent. It should
also be noted that the fitting is better in the case of upstream piece-rates.
Table 9.15: Weibull parameter estimates and errors for upstream piece rate during seed phase

α̂ ± σ̂       β̂ ± σ̂        E%     Comment
2.05 ± 0.1   12.5 ± 0.23   0.76   Pass, very good
While the single Weibull fitting of upstream piece-messages outperforms the
single Weibull request-message fitting, for the dual Weibull case, the opposite
is true. However, the results shown in Table 9.16 are still very good, and there
is no reason for exploring other models. Additionally, the tail matching appears
to be better in the dual piece-case than for the corresponding request-case,
as can be observed in the CCDF plot in Figure 9.11.
Table 9.16: Dual Weibull parameter estimates and errors for upstream piece rate during seed phase

α̂1 ± σ̂       β̂1 ± σ̂       α̂2 ± σ̂       β̂2 ± σ̂       p̂      E%     Comment
2.33 ± 0.24   11.2 ± 0.76   1.78 ± 0.38   15.1 ± 0.92   0.58   0.31   Pass, excellent

Figure 9.11: Dual Weibull modelling results for upstream piece rates during seed phase. (a) Request rate CCDF; (b) Request rate EPDF (density against messages/s).
Piece Inter-departure Times
The exponential distribution has been selected for modelling inter-departure
times. The results reported in Table 9.17 indicate the validity of the model.
The linear tail is still apparent in the CCDF in Figure 9.12. However, the
fit is not quite as good as the alternative model in Section 9.2.1. Since the
present model does not require special handling of short inter-departure times,
it is deemed to be accurate enough for simulation purposes.
Table 9.17: Exponential parameter estimates and errors for piece inter-departure times during seed phase

λ̂ ± σ̂        E%     Comment
11.7 ± 0.47   0.62   Pass, very good

Figure 9.12: Modelling results for piece inter-departure times during seed phase. (a) Piece inter-departure time CCDF; (b) Piece inter-departure time EPDF.
9.3 Summary
In this chapter, accurate models for BitTorrent session and message characteristics have been reported. Rates have been shown to have long tails, while inter-arrival and inter-departure times are exponentially distributed. Furthermore,
upstream and downstream rates and times are observed to be distributionally
similar.
Chapter 10
Conclusions and Future Work
It only takes 20 years for a liberal to become a conservative without changing a single idea.
– Robert Anton Wilson
The main goal of this thesis was to develop tractable models for key BitTorrent
characteristics that are suitable for simulation environments. The work leading
up to and culminating in this thesis started by designing and implementing a
dedicated P2P measurement infrastructure. This infrastructure provides the
possibility of measuring P2P application traffic with high accuracy. A large
number of measurements have been performed to provide the experimental data
used for modelling. The measurements have been reported, and salient characteristics regarding the BitTorrent system have been identified.
The reported models are divided into two specific categories: session models
and message models. Sessions have been shown to arrive with hyper-exponentially distributed intervals, and their duration and size exhibit long-tail behaviour
if the session involves data transfer. It is important to note that there is no
true heavy-tail behaviour present, which indicates that the BitTorrent system is
fairly well-behaved with respect to session sizes and durations. Also, the long-tail models presented tend to slightly overestimate the durations and sizes, thus
making the model suitable for prediction purposes.
Furthermore, accurate models have been obtained for the most bandwidth-consuming BitTorrent messages. The inter-arrival and inter-departure times
of these messages have been shown to be well modelled as exponentially distributed. Upstream data request rates have been shown to be Gaussian under
certain circumstances. The corresponding downstream rates are however long-tailed, due to the end-game mode of the BitTorrent protocol suite.
In addition to models for the BitTorrent protocol, a fitness assessment method has been presented. The method circumvents problems that classical fitness
tests exhibit with large numbers of observations.
10.1 Future Work
A slight drawback with the message models reported in Section 9.2 is that they
primarily model aggregate characteristics. A natural continuation of the work
is to extend the aggregate models to per-flow models. Extending the message
models to other BitTorrent messages is also an interesting prospect. This would
allow for very detailed simulation models to be built. It would also make it
possible to evaluate the behaviour of specific BitTorrent client types, to see
whether there are any inherent protocol invariants.
Tracker information is still missing from the measurements. Ideally, a tracker
and a collaborating peer should be used for measurements. This would make it
possible to assess, e.g., to what degree the flash-crowd effect influences a specific
peer (in both seed and leech phases).
The measurements presented herein were performed during mid-2004. The
Internet has not remained static since then, and more measurements to verify
the results presented would be interesting. For example, some of the latest
developments (summer of 2005) in the BitTorrent community are working towards
removing the dependency on the tracker. This would add signalling load on the
network, something that the BitTorrent networks have been more or less devoid
of previously.
The fitness assessment method reported in Section 7.3 suffers from insensitivity to tail discrepancies. Further work on suitable weighting and normalisation
for the fitness measure is needed to increase the applicability of the method.
It has been successfully used for parameter optimisation in other work, but
modification for use as a general error percentage is still needed.
Appendix A
BitTorrent Protocol Details
This chapter provides a fairly complete description of the BitTorrent Protocol
as defined in [20]. Where applicable, notes have been added to expound on the
specification.
A.1 Bencoding Types

strings
    Strings are encoded length-prefixed. The length is given in base ten, in ASCII character encoding. The length should be followed by a colon, immediately followed by the specified number of characters as string data.
    Note that the string encoding does not necessarily mean that the string data are humanly readable, i.e., in the printable ASCII range. Strings carry any valid 8-bit value, and are commonly used to carry binary data.
    Example: 3:BIT encodes the string “BIT”.

integers
    Integers are encoded by enclosing a base ten ASCII coded numerical string by i and e. Negative numbers are accepted, but not leading zeroes, except in the case for the value 0 itself.
    Example: i23e encodes the integer 23.

lists
    Lists are encoded by enclosing any valid bencoding type, including other lists, by l and e. More than one type is allowed.
    Example: l3:agei30ee encodes the string “age” and the integer 30.

dictionaries
    Dictionaries are encoded by enclosing (key, value) pairs by d and e. The keys must be bencoding strings and the values may be any valid bencoding type, including other dictionaries.
    Example:
    d3:agei30e4:name5:james5:likesl4:food5:drinkee
    encodes the structure:
    age: 30
    name: james
    likes: {food, drink}
A.2
Peer Wire Protocol Messages
piece
The only payload-related protocol message. The message contains one subpiece.
request
The request-message is the method a peer wishing to
download uses to notify the sending peer which subpiece
is desired.
cancel
If a peer has previously sent a request message, this
message may be used to withdraw the request before it
has been serviced. Mostly used during end-game mode
(Section 3.6.1).
124
A.3. TRACKER REQUEST PARAMETERS
interested
This message is sent by a peer to another peer to notify
it that the first peer intends to download some data. See
Section 3.4 for description of this and the following three
messages.
not interested This is the negation of the previous message. It is sent
when a peer no longer wants to download.
choke
This message is sent by a data transmitting peer to notify
the receiving peer that it will no longer be allowed to
download.
unchoke
The negation of the previous message. Sent by a transmitting peer to a peer that has previously sent it an
interested message.
have
After a completed download, the peer sends this message
to all its connected peers to notify them of which parts
of the data are available from the peer.
bitfield
Only sent during the initial BitTorrent handshake, and is
then exchanged between the connecting peers. Contains
a bitfield indicating which pieces the peer has.
keepalive
Empty message, to keep a connection alive.
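The messages above can be illustrated by their on-the-wire framing. This appendix does not list the numeric message identifiers or the framing; the sketch below assumes the values given in the public protocol specification [20], i.e., a four-byte big-endian length prefix followed by a one-byte message ID and the payload. The function names are hypothetical.

```python
import struct

# Message identifiers as assumed from the public protocol specification.
MESSAGE_IDS = {
    "choke": 0,
    "unchoke": 1,
    "interested": 2,
    "not interested": 3,
    "have": 4,
    "bitfield": 5,
    "request": 6,
    "piece": 7,
    "cancel": 8,
}
MESSAGE_NAMES = {number: name for name, number in MESSAGE_IDS.items()}


def encode_message(name, payload=b""):
    """Frame one peer wire message: length prefix, message ID, payload."""
    if name == "keepalive":
        # A keepalive is an empty message: length prefix zero, no ID.
        return struct.pack(">I", 0)
    body = bytes([MESSAGE_IDS[name]]) + payload
    return struct.pack(">I", len(body)) + body


def encode_request(piece, begin, length):
    """Request the subpiece of `length` bytes at offset `begin` in `piece`."""
    return encode_message("request", struct.pack(">III", piece, begin, length))


def decode_message(frame):
    """Split one complete frame into (message name, payload)."""
    (length,) = struct.unpack(">I", frame[:4])
    if length == 0:
        return "keepalive", b""
    return MESSAGE_NAMES[frame[4]], frame[5:4 + length]


if __name__ == "__main__":
    frame = encode_request(688, 16384, 16384)
    name, payload = decode_message(frame)
    print(name, struct.unpack(">III", payload))
```

A cancel message would carry the same three-integer payload as the request it withdraws, which is why the two are framed identically apart from the message ID.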
A.3 Tracker Request Parameters
A.3.1 Mandatory
Each announce request must include the following parameters:
info hash
The SHA1 hash of the value contained in the info field in
the torrent file.
peer id
A 20 byte string to uniquely identify the requesting peer.
There is no consensus regarding the generation of this value,
but several distinct types of ID-generation have appeared
that may be used to identify which client a peer is running.
There is some disagreement between the official protocol
description [20] and the Wiki [15]. The original specification states that this field most likely will have to be URL
escaped, while the Wiki claims that it must not be escaped.
port
The listening port of the client. The default port range for
the reference client is 6881–6889. Each active swarm needs
a separate port in the default client, but third party clients
have implemented single-port functionality.
uploaded
The total number of bytes uploaded to all peers in the
swarm, encoded in base ten ASCII. The specification does
not state whether this takes into account re-transmits or
not.
downloaded The total number of bytes downloaded from all peers in the
swarm, encoded in base ten ASCII. The specification does
not state whether this takes into account re-transmits or
not.
left
The total number of bytes left to download, also encoded in
base ten ASCII.
A.3.2 Optional Parameters
The following parameters may optionally be included:
compact
If set to 1, the tracker response will not be a proper bencoded
datum as described below, but rather a binary list of peer
addresses and ports.
numwant Specifies the number of peers that the requesting peer would like to receive from the tracker.
event
May be one of:
started
The first request to the tracker must include this parameter–value pair.
stopped
If shutting down, this should be specified to indicate
graceful shutdown.
completed
Included to notify the tracker once a download is complete, and should not be included when joining a swarm
with the full content.
key
Used as session identifier.
A.3.3 Tracker Replies
interval
Indicates the number of seconds between subsequent requests to the tracker.
complete
Number of seeds in the swarm.
incomplete Number of leechers in the swarm.
peers
Contains a list of dictionaries. Each dictionary in this list
has the following keys:
peer id The peer id parameter that the peer has
reported to the tracker.
ip
IP address or DNS name of the peer.
port
Listening port of the peer.
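The request parameters of this section can be assembled into an announce URL as sketched below. The tracker URL and the function names are hypothetical, and the info_hash is computed over the bencoded info value. The sketch URL-escapes the peer id, following the original specification rather than the Wiki. For compact replies it assumes the widely used layout of six bytes per peer (a four-byte IPv4 address followed by a two-byte big-endian port), which this appendix does not spell out.

```python
import hashlib
import socket
import struct
from urllib.parse import quote, urlencode


def build_announce_url(tracker_url, info_bencoded, peer_id, port,
                       uploaded, downloaded, left,
                       event=None, numwant=None, compact=None):
    """Return an announce URL carrying mandatory and optional parameters."""
    params = {
        # SHA1 over the bencoded value of the info field (20 raw bytes).
        "info_hash": hashlib.sha1(info_bencoded).digest(),
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
    }
    optional = {"event": event, "numwant": numwant, "compact": compact}
    params.update({k: v for k, v in optional.items() if v is not None})
    # quote (rather than the default quote_plus) percent-encodes raw bytes
    # and never turns a space into "+".
    return tracker_url + "?" + urlencode(params, quote_via=quote)


def parse_compact_peers(blob):
    """Decode a compact peer list, assuming 6 bytes per peer (IPv4 + port)."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length is not a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        (port,) = struct.unpack(">H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers


if __name__ == "__main__":
    print(build_announce_url(
        "http://tracker.example/announce",   # hypothetical tracker
        b"d4:name4:teste",                   # hypothetical bencoded info value
        b"-AZ2084-v8j2jYQi0GOq", 6881,
        uploaded=0, downloaded=0, left=661127168,
        event="started", numwant=50))
```

The event parameter is left out of periodic updates simply by keeping its default of None, which matches the rule that started is sent only with the first request.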
A.4 Scrape Response Keys
complete
Number of seeds for the specific swarm.
downloaded Number of registered complete-events for the specific
swarm.
incomplete Number of leechers for the specific swarm.
name
This optional field contains the name of the file as defined
in the name-field in the torrent file.
Appendix B
BitTorrent XML Log File
The XML document type used for the BitTorrent log files is comprised of only
two elements: EVENTLIST and EVENT. The EVENTLIST element carries information regarding the torrent-file used for the measurement and the settings that
were used for the BitTorrent client during the measurement session. Figure
B.1 shows two excerpts from such an XML document. Section B.1 gives the
Document Type Definition (DTD) for the BitTorrent XML log.
<EVENT type="announce" timestamp="1084428217.041880" uploaded="0" downloaded="0" left="661127168" last="None" trackerid="None" event="started" numwant="50"/>
<EVENT type="start_dl" timestamp="1084428218.717552" src="220.233.6.19" port="6881" nconns="1"/>
<EVENT type="unchoke" timestamp="1084428218.717747" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" nconns="1"/>
<EVENT type="connect" timestamp="1084428218.717867" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" nconns="1"/>
<EVENT type="bitfield" timestamp="1084428219.388488" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" nconns="1"/>
<EVENT type="interested" timestamp="1084428219.429775" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" nconns="1"/>
<EVENT type="start_dl" timestamp="1084428225.820689" src="217.226.127.188" port="6881" nconns="2"/>
<EVENT type="unchoke" timestamp="1084428225.820863" dst="217.226.127.188" dstid="-AZ2084-8wtClbR51oMR" nconns="2"/>
<EVENT type="connect" timestamp="1084428225.820989" src="217.226.127.188" srcid="-AZ2084-8wtClbR51oMR" nconns="2"/>
<EVENT type="unchoke" timestamp="1084428229.588582" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" nconns="2"/>
<EVENT type="request" timestamp="1084428229.588833" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="0" nconns="2"/>
<EVENT type="request" timestamp="1084428229.589048" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="16384" nconns="2"/>
<EVENT type="request" timestamp="1084428229.589190" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="32768" nconns="2"/>
<EVENT type="request" timestamp="1084428229.589342" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="49152" nconns="2"/>
<EVENT type="request" timestamp="1084428229.589477" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="65536" nconns="2"/>
<EVENT type="bitfield" timestamp="1084428230.003638" src="217.226.127.188" srcid="-AZ2084-8wtClbR51oMR" nconns="2"/>
<EVENT type="interested" timestamp="1084428230.052099" dst="217.226.127.188" dstid="-AZ2084-8wtClbR51oMR" nconns="2"/>
<EVENT type="start_dl" timestamp="1084428230.842135" src="67.70.42.140" port="6881" nconns="3"/>
<EVENT type="unchoke" timestamp="1084428230.842327" dst="67.70.42.140" dstid="-AZ2084-1lG41aBmxdcr" nconns="3"/>
<EVENT type="connect" timestamp="1084428230.842438" src="67.70.42.140" srcid="-AZ2084-1lG41aBmxdcr" nconns="3"/>
<EVENT type="piece" timestamp="1084428235.373197" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="0" length="16384"/>
<EVENT type="request" timestamp="1084428235.375262" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="81920" nconns="3"/>
<EVENT type="bitfield" timestamp="1084428238.242941" src="67.70.42.140" srcid="-AZ2084-1lG41aBmxdcr" nconns="3"/>
<EVENT type="interested" timestamp="1084428238.279325" dst="67.70.42.140" dstid="-AZ2084-1lG41aBmxdcr" nconns="3"/>
<EVENT type="piece" timestamp="1084428238.983001" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="16384" length="16384"/>
<EVENT type="request" timestamp="1084428238.983420" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="98304" nconns="3"/>
<EVENT type="piece" timestamp="1084428243.061652" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="32768" length="16384"/>
<EVENT type="request" timestamp="1084428243.062012" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="114688" nconns="3"/>
<EVENT type="have" timestamp="1084428243.807769" src="67.70.42.140" srcid="-AZ2084-1lG41aBmxdcr" piece="2325" nconns="3"/>
<EVENT type="piece" timestamp="1084428246.762416" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="49152" length="16384"/>
<EVENT type="request" timestamp="1084428246.762847" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="131072" nconns="3"/>
<EVENT type="have" timestamp="1084428250.036548" src="67.70.42.140" srcid="-AZ2084-1lG41aBmxdcr" piece="2376" nconns="3"/>
<EVENT type="have" timestamp="1084428250.036794" src="67.70.42.140" srcid="-AZ2084-1lG41aBmxdcr" piece="2181" nconns="3"/>
...
<EVENT type="cancel" timestamp="1084430940.120284" dst="200.185.78.6" dstid="-AZ2084-a3QFRyfkLC8Q" piece="1762" nconns="15"/>
<EVENT type="have" direction="out" timestamp="1084430940.120433" piece="1762" nconns="15" down="661110784"/>
<EVENT type="piece" timestamp="1084430940.129599" src="212.100.224.105" srcid="\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xda\xe3.\xdb\x8b\xcfb\xa8" piece="1176" begin="49152" length="16384"/>
<EVENT type="announce" timestamp="1084430940.133816" uploaded="22118400" downloaded="661127168" left="0" last="None" trackerid="None" event="completed" numwant="50"/>
<EVENT type="done" timestamp="1084430940.134094" src="212.100.224.105" srcid="\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xda\xe3.\xdb\x8b\xcfb\xa8" piece="1176" rxtime="0.645374" rxstart="1084430939.488720" nconns="15"/>
<EVENT type="cancel" timestamp="1084430940.134308" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="1176" nconns="15"/>
<EVENT type="not interested" timestamp="1084430940.134540" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" nconns="15"/>
<EVENT type="not interested" timestamp="1084430940.134640" dst="67.70.42.140" dstid="-AZ2084-1lG41aBmxdcr" nconns="15"/>
Figure B.1: Extract from BitTorrent XML log file
Every EVENT element contains the attributes type and timestamp. The
timestamp attribute signifies the time at which this event was ejected to the log
file, expressed as a UNIX timestamp, i.e., the number of seconds elapsed since
00:00:00 UTC, January 1, 1970. The type field denotes the event type. The
various values for the type-attribute are:
announce The only tracker-related event type available. It is ejected
into the log file when the peer communicates with the tracker
to request more peers. This element carries the following
attributes:
uploaded
Denotes the number of subpiece bytes
this peer has sent to other peers since it
was launched.
downloaded Denotes the number of subpiece bytes
this peer has received from other peers
since the client was launched.
left
Denotes the number of bytes of the resource that remains to download.
last
This parameter is undocumented in both
the official protocol specification and the
Wiki.
trackerid
Used by the tracker for maintaining
state.
event
Is one of started, None or completed.
The value started should be used when
sending the initial tracker announce message, and only then. The value None
is used when transmitting the periodic
updates to the tracker, while the value
completed is sent exactly once to the
tracker when the download is complete.
numwant
Denotes the number of new peer addresses the peer is requesting from the tracker.
start_dl
This element is ejected for every newly initiated TCP connection to a peer. Note that it does not necessarily imply that
the BitTorrent handshake will be completed.
connect
This element is ejected after every completed BitTorrent
handshake.
unchoke, choke, interested, not interested, request, piece, have, cancel
These element types are ejected for each corresponding BitTorrent protocol message that is sent or received.
send
The send-element is the equivalent of the piece-message, but
for the subpieces the local peer transmits.
done
This element is ejected once a download completes fully, and
should only appear once per log file.
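A log of this form can be processed with any XML parser. The following Python sketch uses the standard library to read a handful of EVENT elements copied from Figure B.1 and to count them by type. The bare EVENTLIST wrapper without attributes and the helper names are assumptions made for the example; a real log carries the client settings as EVENTLIST attributes. Timestamps are subtracted as Decimal values to avoid floating-point rounding at microsecond resolution.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import datetime, timezone
from decimal import Decimal

# Events copied from Figure B.1; the bare EVENTLIST wrapper is an assumption.
SAMPLE_LOG = r"""<EVENTLIST>
<EVENT type="announce" timestamp="1084428217.041880" uploaded="0" downloaded="0" left="661127168" last="None" trackerid="None" event="started" numwant="50"/>
<EVENT type="start_dl" timestamp="1084428218.717552" src="220.233.6.19" port="6881" nconns="1"/>
<EVENT type="connect" timestamp="1084428218.717867" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" nconns="1"/>
<EVENT type="request" timestamp="1084428229.588833" dst="220.233.6.19" dstid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="0" nconns="2"/>
<EVENT type="piece" timestamp="1084428235.373197" src="220.233.6.19" srcid="-AZ2084-v8j2jYQi0GOq" piece="688" begin="0" length="16384"/>
<EVENT type="done" timestamp="1084430940.134094" src="212.100.224.105" srcid="\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xda\xe3.\xdb\x8b\xcfb\xa8" piece="1176" rxtime="0.645374" rxstart="1084430939.488720" nconns="15"/>
</EVENTLIST>"""


def load_events(xml_text):
    """Return the attributes of every EVENT element as a list of dicts."""
    root = ET.fromstring(xml_text)
    return [dict(event.attrib) for event in root.iter("EVENT")]


def event_time(event):
    """Convert the UNIX timestamp attribute to a timezone-aware datetime."""
    return datetime.fromtimestamp(float(event["timestamp"]), tz=timezone.utc)


events = load_events(SAMPLE_LOG)
type_counts = Counter(event["type"] for event in events)

# Delay between the first request for piece 688 and its first subpiece.
request = next(e for e in events if e["type"] == "request")
piece = next(e for e in events if e["type"] == "piece")
request_delay = Decimal(piece["timestamp"]) - Decimal(request["timestamp"])

# For the done event, timestamp minus rxstart can be compared with rxtime.
done = next(e for e in events if e["type"] == "done")
elapsed = Decimal(done["timestamp"]) - Decimal(done["rxstart"])

if __name__ == "__main__":
    print(type_counts)
    print(event_time(events[0]).isoformat())
    print(request_delay, elapsed, done["rxtime"])
```

For the done event in this excerpt, the timestamp minus the rxstart attribute equals the logged rxtime. This suggests that rxstart records the time of the first request for the piece, but the attribute is only listed in the DTD and not described in the text, so this reading should be treated as an observation rather than a definition.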
The various peer-related event types carry event specific information in additional attributes. These additional attributes are:
src, dst
These attributes indicate the source and destination IP address of the sending or receiving peer respectively.
Valid for all event types.
srcid, dstid These attributes indicate the peer ID of the sending or receiving peer respectively.
The content of these attributes is encoded using the Python
functions repr and xml.sax.saxutils.escape. The function
repr returns a printable string representation of the input parameter, and the escape function returns an XML-escaped
version of its input. Recall that the peer ID is a binary
20-byte value. The peer ID is first processed by the repr
function to convert any non-printable character to its Python hexadecimal representation, i.e., the characters \x followed by the
hexadecimal value. This string is then made into a valid
XML attribute by the xml.sax.saxutils.escape function, i.e.,
converting XML special characters such as & to &amp;, with
the exception of the quotation character, ", which is encoded using the Python hexadecimal encoding (\x22). For
a complete list of XML entity encodings see [80].
Valid for all event types except start_dl.
piece
Denotes which piece a specific message refers to.
Valid for types piece, cancel, send, have and request.
begin
Starting byte of a subpiece reference. Used together with
the length parameter to denote a specific subpiece.
Valid for types piece, cancel, send and request.
length
Number of content data bytes received or sent in a single
piece message.
Valid for types piece, send and cancel.
down
Denotes the number of downloaded and SHA1-verified
bytes.
Only valid for type have.
nconns
This attribute denotes the number of currently connected
peers at the time of event ejection. This includes both locally and remotely initiated connections.
Valid for all event types.
port
Indicates the TCP port of the remote peer.
Valid for type start_dl only.
direction
Only valid for types have and bitfield. Used for differentiating between sent and received messages of these types.
If the attribute is present and contains the value out, the
message was sent by the measurement peer, otherwise it was
received.
txtime
The difference in time between the sending of the first subpiece of a piece and the reception of the last subpiece of the
piece.
rxtime
The difference in time between the first request of a piece
and the reception of the last subpiece of the piece.
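The peer ID encoding used for the srcid and dstid attributes can be approximated as follows. This is a sketch in present-day Python 3, where repr applied to a bytes object adds a b'...' wrapper, so the escaping rules described for the attributes are spelled out by hand instead. The function name encode_peer_id is hypothetical, and doubling the backslash is an assumption carried over from the behaviour of repr.

```python
from xml.sax.saxutils import escape


def encode_peer_id(peer_id: bytes) -> str:
    """Render a binary peer ID as text suitable for an XML attribute value."""
    parts = []
    for byte in peer_id:
        char = chr(byte)
        if char == "\\":
            # Assumption: backslashes are doubled, as repr would do.
            parts.append("\\\\")
        elif char == '"':
            # The quotation character uses the hexadecimal form, not &quot;.
            parts.append("\\x22")
        elif 0x20 <= byte < 0x7F:
            # Printable ASCII is kept as-is.
            parts.append(char)
        else:
            # Non-printable bytes become \x followed by two hex digits.
            parts.append("\\x%02x" % byte)
    # escape() converts the XML special characters &, < and >.
    return escape("".join(parts))


if __name__ == "__main__":
    binary_id = b"\x00" * 12 + b"\xda\xe3.\xdb\x8b\xcfb\xa8"
    print(encode_peer_id(binary_id))
    print(encode_peer_id(b"-AZ2084-v8j2jYQi0GOq"))
```

Applied to the 20-byte binary peer ID that appears in Figure B.1, the sketch reproduces the logged srcid string, while an all-printable peer ID such as the Azureus-style IDs passes through unchanged.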
B.1 BitTorrent Application Log DTD
<!ELEMENT EVENTLIST (#PCDATA | EVENT)*>
<!ATTLIST EVENTLIST start_timestamp CDATA #IMPLIED>
<!ATTLIST EVENTLIST peertype CDATA #IMPLIED>
<!ATTLIST EVENTLIST version CDATA #IMPLIED>
<!ATTLIST EVENTLIST bound_ip CDATA #IMPLIED>
<!ATTLIST EVENTLIST bound_port CDATA #IMPLIED>
<!ATTLIST EVENTLIST tracker_ip CDATA #IMPLIED>
<!ATTLIST EVENTLIST tracker_port CDATA #IMPLIED>
<!ATTLIST EVENTLIST peer_id CDATA #IMPLIED>
<!ATTLIST EVENTLIST pieces CDATA #IMPLIED>
<!ATTLIST EVENTLIST piecesize CDATA #IMPLIED>
<!ATTLIST EVENTLIST nfiles CDATA #IMPLIED>
<!ATTLIST EVENTLIST totlen CDATA #IMPLIED>
<!ATTLIST EVENTLIST max_slice_length CDATA #IMPLIED>
<!ATTLIST EVENTLIST rarest_first_cutoff CDATA #IMPLIED>
<!ATTLIST EVENTLIST ip CDATA #IMPLIED>
<!ATTLIST EVENTLIST download_slice_size CDATA #IMPLIED>
<!ATTLIST EVENTLIST snub_time CDATA #IMPLIED>
<!ATTLIST EVENTLIST rerequest_interval CDATA #IMPLIED>
<!ATTLIST EVENTLIST max_uploads CDATA #IMPLIED>
<!ATTLIST EVENTLIST saveas CDATA #IMPLIED>
<!ATTLIST EVENTLIST min_uploads CDATA #IMPLIED>
<!ATTLIST EVENTLIST spew CDATA #IMPLIED>
<!ATTLIST EVENTLIST max_upload_rate CDATA #IMPLIED>
<!ATTLIST EVENTLIST minport CDATA #IMPLIED>
<!ATTLIST EVENTLIST http_timeout CDATA #IMPLIED>
<!ATTLIST EVENTLIST timeout_check_interval CDATA #IMPLIED>
<!ATTLIST EVENTLIST display_interval CDATA #IMPLIED>
<!ATTLIST EVENTLIST max_initiate CDATA #IMPLIED>
<!ATTLIST EVENTLIST max_message_length CDATA #IMPLIED>
<!ATTLIST EVENTLIST upload_rate_fudge CDATA #IMPLIED>
<!ATTLIST EVENTLIST check_hashes CDATA #IMPLIED>
<!ATTLIST EVENTLIST min_peers CDATA #IMPLIED>
<!ATTLIST EVENTLIST keepalive_interval CDATA #IMPLIED>
<!ATTLIST EVENTLIST maxport CDATA #IMPLIED>
<!ATTLIST EVENTLIST request_backlog CDATA #IMPLIED>
<!ATTLIST EVENTLIST bind CDATA #IMPLIED>
<!ATTLIST EVENTLIST max_rate_period CDATA #IMPLIED>
<!ATTLIST EVENTLIST url CDATA #IMPLIED>
<!ATTLIST EVENTLIST statfile CDATA #IMPLIED>
<!ATTLIST EVENTLIST report_hash_failures CDATA #IMPLIED>
<!ATTLIST EVENTLIST timeout CDATA #IMPLIED>
<!ATTLIST EVENTLIST responsefile CDATA #IMPLIED>
<!ATTLIST EVENTLIST max_allow_in CDATA #IMPLIED>
<!ELEMENT EVENT (#PCDATA)>
<!ATTLIST EVENT uploaded CDATA #IMPLIED>
<!ATTLIST EVENT downloaded CDATA #IMPLIED>
<!ATTLIST EVENT left CDATA #IMPLIED>
<!ATTLIST EVENT last CDATA #IMPLIED>
<!ATTLIST EVENT trackerid CDATA #IMPLIED>
<!ATTLIST EVENT event CDATA #IMPLIED>
<!ATTLIST EVENT numwant CDATA #IMPLIED>
<!ATTLIST EVENT port CDATA #IMPLIED>
<!ATTLIST EVENT txtime CDATA #IMPLIED>
<!ATTLIST EVENT rxtime CDATA #IMPLIED>
<!ATTLIST EVENT rxstart CDATA #IMPLIED>
<!ATTLIST EVENT direction CDATA #IMPLIED>
<!ATTLIST EVENT down CDATA #IMPLIED>
<!ATTLIST EVENT dst CDATA #IMPLIED>
<!ATTLIST EVENT dstid CDATA #IMPLIED>
<!ATTLIST EVENT nconns CDATA #IMPLIED>
<!ATTLIST EVENT type CDATA #IMPLIED>
<!ATTLIST EVENT timestamp CDATA #IMPLIED>
<!ATTLIST EVENT src CDATA #IMPLIED>
<!ATTLIST EVENT srcid CDATA #IMPLIED>
<!ATTLIST EVENT piece CDATA #IMPLIED>
<!ATTLIST EVENT begin CDATA #IMPLIED>
<!ATTLIST EVENT length CDATA #IMPLIED>
Bibliography
[1] Akamai.
http://www.akamai.com, August 2005.
[2] CIFS: A common internet file system.
http://www.microsoft.com/mind/1196/cifs.asp, August 2005.
[3] ICQ.
http://www.icq.com, August 2005.
[4] Msn messenger.
http://messenger.msn.com, August 2005.
[5] The R project.
http://www.r-project.org, August 2005.
[6] The world wide web consortium.
http://www.w3.org/, September 2005.
[7] Yahoo! messenger.
http://messenger.yahoo.com, August 2005.
[8] D. Eastlake 3rd and P. Jones. US Secure Hash Algorithm 1 (SHA1),
September 2001. RFC 3174.
[9] Cachelogic A. Parker. The true picture of peer-to-peer file sharing.
http://www.cachelogic.com/research/slide9.php, May 2005.
[10] Charles Annis. InterOcular trauma test.
http://www.statisticalengineering.com/interocular.htm, August 2005.
[11] Azureus.
http://azureus.sourceforge.net/, August 2005.
[12] Jan Beran. Statistics for Long-Memory Processes. Chapman & Hall, 1994.
[13] T. Berners-Lee, R. Fielding, and L. Masinter. Uniform Resource Identifiers
(URI): Generic Syntax, August 1998. RFC 2396.
[14] T. Berners-Lee, L. Masinter, and M. McCahill. Uniform Resource Locators
(URL), December 1994. RFC 1738.
[15] BitTorrent specification.
http://wiki.theory.org/BitTorrentSpecification, February 2005.
[16] J. Chapweske. Tree hash exchange format.
http://open-content.net/specs/draft-jchapweske-thex-02.html, February 2005.
[17] Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore W. Hong.
Freenet: A distributed anonymous information storage and retrieval system. White paper.
[18] Clip2. The Annotated Gnutella Protocol Specification v0.4. The
Gnutella Developer Forum (GDF), 1.8th edition, July 2003.
http://groups.yahoo.com/group/the_gdf/files/Development/.
[19] Bram Cohen. Bittorrent.
http://bitconjurer.org/BitTorrent/, August 2005.
[20] Bram Cohen. BitTorrent protocol specification.
http://www.bitconjurer.org/BitTorrent/protocol.html, February 2005.
[21] Doru Constantinescu. Measurements and models of one-way transit time
in IP routers, 2005. Licentiate thesis, Blekinge Institute of Technology.
[22] Mark E. Crovella and Murad S. Taqqu. Estimating the heavy tail index
from scaling properties. Methodology and Computing in Applied Probability,
Vol 1(No. 1), 1999.
[23] Mark E. Crovella, Murad S. Taqqu, and Azer Bestavros. A Practical
Guide to Heavy Tails, chapter Heavy-Tailed Probability Distributions in
the World Wide Web, pages 3–27. Birkhäuser, 1998. ISBN 0-8176-3951-9.
[24] Qiu D. and Srikant R.J. Modeling and performance analysis of BitTorrent-like peer-to-peer networks. Technical report, University of Illinois at
Urbana-Champaign, USA, 2004.
[25] Ralph B. D’Agostino and Michael A. Stephens, editors. Goodness-of-fit
Techniques. Dekker, 1986.
[26] distributed.net. distributed.net.
http://distributed.net, February 2005.
[27] eDonkey.
http://www.edonkey.com, February 2005.
[28] Wikipedia Encyclopedia. Peer-to-peer.
http://en.wikipedia.org/wiki/P2p, August 2005.
[29] David Erman, Dragos Ilie, and Adrian Popescu. BitTorrent session characteristics and models. In Demetres Kouvatsos, editor, Technical Proceedings.
HET-NETs ’05 - 3rd International Working Conference on Performance
Modelling and Evaluation of Heterogeneous Networks, 2005.
[30] David Erman, Dragos Ilie, and Adrian Popescu. Peer-to-peer traffic measurements. Technical report, Blekinge Institute of Technology, Karlskrona,
Sweden, 2005.
[31] David Erman, Dragos Ilie, Adrian Popescu, and Arne A. Nilsson. Measurement and analysis of BitTorrent traffic. In NTS 17, August 2004.
[32] Jean-loup Gailly and Mark Adler. zlib.
http://www.gzip.org/zlib, August 2005.
[33] Krishna P. Gummadi, Stefan Saroiu, and Steven D. Gribble. King: estimating latency between arbitrary internet end hosts. In Proceedings of
the second ACM SIGCOMM Workshop on Internet measurement,
pages 5–18. ACM Press, 2002.
[34] Liang Guo, Mark Crovella, and Ibrahim Matta. TCP congestion control
and heavy tails. Technical Report 2000-017, 3 2000.
[35] M.R. Horton. UUCP mail interchange format standard, February 1986.
RFC 976.
[36] Dragos Ilie, David Erman, and Adrian Popescu. Traffic measurements
of P2P systems. Swedish National Computer Networking Workshop
(SNCNW04), November 2004.
[37] Dragos Ilie, David Erman, Adrian Popescu, and Arne A. Nilsson. Measurement and analysis of Gnutella signaling traffic. In IPSI 2004, September
2004.
[38] M. Izal, G. Urvoy-Keller, E.W. Biersack, P.A. Felber, A. Al Hamra, and
L. Garcés-Erice. Dissecting BitTorrent: Five months in a torrent’s lifetime.
In PAM2004, 2004.
[39] Pouwelse J.A., Garbacki P., Epema D.H.J., and Sips H.J. The BitTorrent
P2P file-sharing system: Measurements and analysis. 4th International
Workshop on Peer-to-Peer Systems (IPTPS’05), February 2005.
[40] Van Jacobson, C. Leres, and S. McCanne. Tcpdump.
http://www.tcpdump.org, August 2005.
[41] Raj Jain. The Art of Computer Systems Performance Analysis. John Wiley
& Sons, 1991. ISBN 0-471-50336-3.
[42] Ajit K. Jena, Adrian Popescu, and Arne A. Nilsson. Modeling and evaluation of internet applications. In International Teletraffic Conference,
Berlin, Germany, August 2003. ITC18.
[43] B. Kantor and P. Lapsley. Network News Transfer Protocol, February 1986.
RFC 977.
[44] Thomas Karagiannis, Andre Broido, Michalis Faloutsos, and kc claffy.
Transport layer identification of P2P traffic. IMC’04, 2004.
[45] J. Klensin, Ed. Simple Mail Transfer Protocol, April 2001. RFC 2821.
[46] Tor Klingberg and Raphael Manfredi. Gnutella 0.6. The
Gnutella Developer Forum (GDF), 200206-draft edition, June 2002.
http://groups.yahoo.com/group/the_gdf/files/Development/.
[47] Balachander Krishnamurthy and Jennifer Rexford. Web Protocols and Practice. Addison Wesley, 2001. ISBN 0-201-71088-9.
[48] Averill M. Law and W. David Kelton. Simulation Modeling and Analysis.
McGraw-Hill, 2000. ISBN 0-07-059292-6.
[49] Will E. Leland, Murad S. Taqqu, Walter Willinger, and Daniel V. Wilson.
On the self-similar nature of Ethernet traffic. In Deepinder P. Sidhu, editor,
ACM SIGCOMM, pages 183–193, San Francisco, California, 1993.
[50] George Marsaglia and John Marsaglia. Evaluating the Anderson-Darling
distribution. Journal of Statistical Software, 9(2):1–5, February 2004.
[51] Steven McCanne and Van Jacobson. The BSD packet filter: A new architecture for user-level packet capture. In USENIX Winter, pages 259–270,
1993.
[52] Sun Microsystems. NFS: Network File System Protocol specification, March
1989. RFC 1094.
[53] Napster. Napster.
http://www.napster.com, August 2005.
[54] NeoModus. DirectConnect.
http://www.neo-modus.com, February 2005.
[55] Motion Picture Association of America, August 2005.
http://www.mpaa.com.
[56] Recording Industry Association of America, August 2005.
http://www.riaa.com.
[57] National Institute of Standards and Technology. Specifications for secure
hash standard.
http://www.itl.nist.gov/fipspubs/fip180-1.htm, April 1995. FIPS PUB 180-1.
[58] Shawn Ostermann. Tcptrace.
http://www.tcptrace.org, August 2005.
[59] Conny Palm. Intensitätsschwankungen im Fernsprechverkehr. PhD thesis,
Royal Institute of Technology, 1943.
[60] Kihong Park and Walter Willinger, editors. Self-Similar Network Traffic
and Performance Evaluation. Wiley Interscience, 2000. ISBN 0-471-31974-0.
[61] Vern Paxson. Empirically derived analytic models of wide-area tcp connections. IEEE Transactions on Networking, 1994.
[62] Vern Paxson and Sally Floyd. Wide area traffic: the failure of Poisson
modeling. IEEE/ACM Transactions on Networking, 3(3):226–244, 1995.
[63] Vern Paxson and Sally Floyd. Why we don’t know how to simulate the
internet. In Winter Simulation Conference, pages 1037–1044, 1997.
[64] J. Postel and J.K. Reynolds. File Transfer Protocol, October 1985. RFC
959.
[65] The Free Network Project. The free network project.
http://freenet.sourceforge.net.
[66] Gary R. Wright and W. Richard Stevens. TCP/IP Illustrated: The Implementation, volume 2. Addison-Wesley, 1995. ISBN: 0-201-63354-X.
[67] S. Resnick. Heavy tail modeling and teletraffic data.
[68] Jordan Ritter. Why Gnutella can’t scale. No, really., February 2001.
http://www.darkridge.com/~jpr5/doc/gnutella.html.
[69] Stefan Saroiu, P. Krishna Gummadi, and Steven D. Gribble. A measurement study of peer-to-peer file sharing systems. In Proceedings of the Multimedia Computing and Networking (MMCN), January 2002.
[70] Rüdiger Schollmeier. A definition of peer-to-peer networking for the classification of peer-to-peer architectures and applications. In Proceedings of the
First International Conference on Peer-to-Peer Computing. IEEE, 2001.
[71] Sharman Networks. KaZaA.
http://www.kazaa.com, February 2005.
[72] S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M. Eisler,
and D. Noveck. Network File System (NFS) version 4 Protocol, April 2003.
RFC 3530.
[73] Soulseek.
http://www.slsknet.org/, August 2005.
[74] William Stallings. High-Speed Networks and Internets. Prentice-Hall, Inc.,
second edition, 2002. ISBN: 0-13-032221-0.
[75] The SETI@home Project. SETI@home – the search for extraterrestrial
intelligence.
http://setiathome.ssl.berkeley.edu/, February 2005.
[76] D.M. Titterington, A.F.M. Smith, and U.E. Makov. Statistical Analysis of
Finite Mixture Distributions. John Wiley & Sons, 1985. ISBN 0-471-90763-4.
[77] Olaf van der Spek. BitTorrent udp-tracker protocol extension.
http://libtorrent.sourceforge.net/udp_tracker_protocol.html, February 2005.
[78] Guido van Rossum et al. Python. Online at http://www.python.org, August 2005.
[79] A. Veres, Z. Kenesi, S. Molnar, and G. Vattay. On the propagation of
long-range dependence in the internet. In Proceedings of ACM SIGCOMM
2000, Stockholm, Sweden, Aug.-Sep. 2000.
[80] W3C. Extensible Markup Language (XML) 1.0, 2004.
[81] M.P. Wand. Data-based choice of histogram bin width.
[82] Carey Williamson. Internet traffic measurement. 2001.
[83] Stephen Yantis, David E. Meyer, and J.E. Keith Smith. Analyses of multinomial mixture distributions: New tests for stochastic models of cognition
and action. Psychological Bulletin, Volume 110(No. 2):350–374, 1991.
[84] ZetaGrid. ZetaGrid.
http://www.zetagrid.net/, February 2005.