A SIMULATION FRAMEWORK FOR EFFICIENT SEARCH IN P2P NETWORKS WITH 8-POINT HYPERCIRCLES

Christopher Henricsson
Syed Muhammad Abbas
MASTER THESIS 2008
COMPUTER ENGINEERING
Swedish title: Ett simulationsramverk för effektiv sökning i P2P-nätverk med 8-punkters hypercirklar
This thesis was carried out at the School of Engineering in Jönköping (Tekniska
Högskolan i Jönköping) in the subject area of computer engineering. The work is part
of the Master of Science programme with a specialization in information technology.
The authors themselves are responsible for the opinions, conclusions and results presented.
Supervisor: Feiyu Lin
Examiner: Vladimir Tarasov
Scope: 20 credits (D-level)
Date:
Archive number:
Postal address:
Box 1026
551 11 Jönköping
Visiting address:
Gjuterigatan 5
Telephone:
036-10 10 00 (switchboard)
Abstract
This report concerns the implementation of a simulation framework to evaluate an
emerging peer-to-peer network topology scheme using 8-point hypercircles,
entitled HyperCircle. This topology was proposed in order to alleviate some of the
drawbacks of current P2P systems evolving in an uncontrolled manner, such as
scalability issues, network overload and long search times. The framework is
intended to be used to evaluate the advantages of this new topology. The
framework has been built on top of an existing simulator software solution, the
selection of which was an important part of the development. After weighing
variables such as scalability and API usability, the choice fell on OverSim, an
open-source discrete-event simulator based on OMNeT++.
After formalizing the protocol for easier implementation, as well as extending it for
better performance, implementation followed using C++ with OverSim’s API and
simulation library. Implemented as a module (alongside other stock modules
providing their own protocols such as Chord and Kademlia), it can be used in
OverSim to simulate a user-defined network using one of the simulation routine
applications provided (or using a custom application written by the user). For the
purposes of this thesis, the standard application KBRTestApp was used; an
application sending test messages between randomly selected nodes, while adding
and removing nodes at specific time intervals. The adding and removing of nodes
can be configured with probability parameters.
Tentative testing shows that this implementation of the HyperCircle protocol has
a certain performance gain over the OverSim implementations of the Chord and
Kademlia protocols, measurable in the time it takes a message to get from sender
to recipient. Further testing is outside the scope of this thesis.
Summary (Sammanfattning)
This report describes the development of a simulation framework for evaluating
a new peer-to-peer network topology that uses 8-point hypercircles,
called HyperCircle. The topology was proposed as a solution to some of the
problems that today's P2P systems develop when they are allowed to grow in an
uncontrolled manner, such as scalability problems, overload and long search times.
The framework is intended to be built on top of an existing simulation solution, the
selection of which is an important part of the development. Based on variables such
as scalability and programming support, the choice fell on OverSim, an open-source
simulator based on OMNeT++.
After formalization of the protocol, together with certain extensions for better
performance, the implementation followed in C++ using OverSim's API and
simulation library. Implemented as a module (alongside bundled modules
providing protocols such as Chord and Kademlia), it can be used to
simulate a user-defined network with one of the simulation applications
that ship with OverSim (or with a custom application written by the
user). For the purposes of this report, the standard application KBRTestApp was used.
It sends test messages between randomly selected nodes while adding and
removing nodes at specific time intervals. The adding and removing
of nodes can be configured using probability parameters.
Preliminary tests show a certain performance gain compared with OverSim's
implementations of the Chord and Kademlia protocols. Further testing is
outside the scope of this report.
Acknowledgements
Thanks to Feiyu Lin for supervision and guidance.
Key words
Peer-to-Peer
Network topology
Network Overlay Protocol
Broadcast algorithm
Contents
1 Introduction
1.1 Background
1.2 Purpose/Objectives
1.3 Limitations
1.4 Thesis outline
2 Theoretical Background
2.1 Peer-to-Peer Networks
2.2 Structured and Unstructured P2P Networks
2.2.1 Unstructured P2P Networks
2.2.2 Centralized Server Networks
2.2.3 Super-Peer Networks
2.3 Structured P2P Networks (Deterministic Topologies)
3 The HyperCircle P2P Topology
3.1 Network Model, Aims and Requirements
3.2 Organizing peers into a HyperCircle graph
3.3 Search and Broadcast Algorithm
3.4 Constructing the HyperCircle Topology
3.5 Topology Maintenance Algorithm
3.5.1 Integration Dimension Selection
3.5.2 Integration Champion Node Appointment
3.5.3 Node Integration
3.5.4 Node Departure
3.5.5 Broadcast and Search in an Incomplete Hypercircle
4 Implementation
4.1 Selection of Simulation Software
4.1.1 P2PSim
4.1.2 Overlay Weaver
4.1.3 OverSim
4.2 The OverSim Simulator
4.2.1 Flexibility
4.2.2 Scalability
4.2.3 Interchangeable Underlying Network Models
4.2.4 Interactive GUI
4.2.5 Base Overlay Class
4.2.6 Reuse of Simulation Code
4.2.7 Statistics
4.3 Layer Structure of the HyperCircle Implementation
4.4 The HyperCircle Class Diagram
4.4.1 Extensions to the HyperCircle Algorithm
4.4.2 Code examples
4.5 Simulation
4.5.1 Simulation Parameters
4.5.2 Statistics Gathering
5 Results
5.1 Limitations of the Protocol Implementation
5.2 Simulation Run Setup
5.3 Simulation Results
6 Conclusion and Future Work
7 References
List of Figures
Figure 1: Peer-to-peer network
Figure 2: Centralized Network
Figure 3: Super-Peer Network
Figure 4: DHT-based P2P overlay system [7]
Figure 5: The HyperCircle Topology [1]
Figure 6: 8-point broadcast [1]
Figure 7: Topology Construction (2 nodes) [1]
Figure 8: Topology Construction (4 nodes, one virtual) [1]
Figure 9: Topology Construction (4 nodes) [1]
Figure 10: Topology Construction (6 nodes, one virtual) [1]
Figure 11: Topology Construction (6 nodes) [1]
Figure 12: Topology Construction (8 nodes, one virtual) [1]
Figure 13: Topology Construction (8 nodes) [1]
Figure 14: 64-point hypercircle [1]
Figure 15: Modular architecture of OverSim [9]
Figure 16: Layer structure of the HyperCircle implementation
Figure 17: The HyperCircle class diagram
Figure 18: HyperCircle delivery ratio (256 nodes)
Figure 19: Kademlia delivery ratio (256 nodes)
Figure 20: Chord delivery ratio (256 nodes)
Figure 21: HyperCircle hop count (256 nodes)
Figure 22: Kademlia hop count (256 nodes)
Figure 23: Chord hop count (256 nodes)
Figure 24: HyperCircle global delay time (256 nodes)
Figure 25: Kademlia global delay time (256 nodes)
Figure 26: Chord global delay time (256 nodes)
List of Abbreviations
API    Application Programming Interface
GUI    Graphical User Interface
KBR    Key-Based Routing
P2P    Peer-to-Peer
RPC    Remote Procedure Call
UDP    User Datagram Protocol
1 Introduction
Peer-to-Peer (P2P) networks have developed from a niche technology used
experimentally in research networks into an established paradigm for implementing
distributed applications on the Internet, moving far beyond their original
applications in file sharing and exchange. Pure P2P networks, which couple peers in a
random way on top of a transport network and in which there are no clients or
servers and no central router or switch, were found to have serious drawbacks in
efficiency for large numbers of nodes when searching for information by
broadcasting queries over the whole network.
The problem can be addressed by imposing a deterministic topology on P2P networks.
The Hypercircle topology is such a topology. Each node in a deterministic topology has
a limited view of the network, consisting of a set of neighbors, but at the same time
knows the overall topology. This can be used to reach locally optimized decisions when
broadcasting or routing messages and to route data to all the nodes in the network
with a minimum number of messages. An efficient topology construction and
maintenance algorithm, which is crucial to symmetric peer-to-peer networks, is
described; it requires neither a central server nor super-nodes in the network. Nodes
can join and leave the self-organizing network at any time, and the network is resilient
against failure.
1.1 Background
Many different styles of P2P networks have been introduced, including centralized
P2P networks (Napster [10]), decentralized P2P networks (Kazaa [11]), unstructured
P2P networks (Gnutella [12]) and hybrid P2P networks (JXTA [13]), but all of these
have difficulty scaling. Searches in unstructured networks in particular rely on a
flooding algorithm, which is an inefficient broadcasting mechanism. To address the
problem, a deterministic topology called Hypercircle has been proposed in [1]. The
Hypercircle topology broadcasts data in the network with a minimum number of
messages and does not require any central or super-peer nodes in the network. This
master thesis involves the construction of a simulation framework to evaluate the
Hypercircle topology.
1.2 Purpose/Objectives
The purpose of this master thesis is to construct a simulation framework to test the
efficiency of a network consisting of 8-point hypercircles. The basic idea of the
topology is to accommodate nodes in an n-dimensional space consisting of 8-point
circles where each point can in itself consist of an 8-point circle. The topology is
based on the following rules:
• Every circle has at most eight nodes.
• Every node has at most three relationships (denoted neighbor-0,
neighbor-1 and neighbor-2) with the other nodes in the same circle.
• Every node has a neighbor-0. The neighbor-0 is the 180-degree neighbor, i.e.
the node on the opposite side of the circle, connected to the node through the
circle point. This relationship will not change unless the neighbor-0 or the node
leaves the topology.
1.3 Limitations
The purpose of this thesis is to construct a simulation framework to aid further evaluation
of the HyperCircle protocol, along with some basic testing to demonstrate its
functionality. In-depth testing and evaluation of the protocol is outside the scope.
1.4 Thesis outline
In the introduction, an overview of the thesis work is described. The introduction
section also describes purpose/objectives and limitations. The theoretical background
explains the basic idea of P2P networks, P2P network types, structured and
unstructured, and gives an overview of a deterministic topology. The third chapter
explains the Hypercircle topology algorithm, how peers are organized into a
hypercircle graph, how search and broadcast is performed in the hypercircle graph,
the topology maintenance algorithm, and how nodes join and leave the topology.
Finally, the fourth chapter explains the application framework that was selected for
implementing and simulating the algorithm, and describes the development process in
detail.
2 Theoretical Background
This chapter will provide an overview of the theory behind our simulation
framework, as well as describe the HyperCircle topology.
2.1 Peer-to-Peer Networks
A Peer-to-Peer (P2P) network is a network in which peers have equal responsibility
and capability, unlike in a conventional centralized system or a client/server system
where a single server indexes data in a large-scale system. P2P is an equalizing and
decentralizing concept where all peers function as equals and there is no client and
server distinction. By recognizing computers as peers in the network, P2P enables direct
exchange of resources and services without any server, as content is dispersed
among various peers in the network. When a particular data item is searched for, no
single point in the network is asked. Instead the query is broadcast to all the peers in
the network, and peers capable of answering the query respond [2, 4].
The P2P approach has a number of advantages over centralized storage systems, some
of which are described below:
Diversity and Equality: Peers have equal access to the network and are able to share
any type of content in the network. Content in a P2P network is searched dynamically
by asking as many peers as possible in the network. Participation of peers in the
network is very dynamic, because the peers change status rapidly [5].
Dynamics: In P2P networks, information is searched and downloaded fresh from the
source where the information exists, as compared to a centralized system, which
requires updating when the cached information is no longer valid [5].
Redundancy: Data in P2P networks are often redundant. Content is spread over
different peers in the network, with popular content existing at several peers at once.
Peers automatically download and store contents of other peers, and there is no single
potential point of failure in the network. When a node fails, other peers take charge to
balance the load on the network. In contrast, if peers are organized in a centralized
manner, taking down the central server disables the entire network. In addition to this,
centralized systems are also hampered by drawbacks related to bandwidth bottlenecks
[5].
2.2 Structured and Unstructured P2P Networks
P2P networks consist of peers as network nodes. Links exist between nodes in the
network; if a participating peer knows the location of another peer, then there exists a
directed edge between the two peers that know each other. Based on how nodes in the
network are linked to each other, we classify the P2P network as structured or
unstructured [6].
2.2.1 Unstructured P2P Networks
In unstructured P2P networks (see Figure 1) the links between nodes are formed
arbitrarily. Peers in such networks can join at any time, may contact any node for
integration, and copy existing links of other nodes and then form their own over time.
If a peer wants to find some piece of data in an unstructured P2P network, it uses a
flooding mechanism in which the query message is broadcast through the
entire network to find as many peers as possible that share the data. Messages reach
individual peers several times since more than one path exists to each peer. The peers
in the network do not know where specific content might be located. Popular content
is available at several peers and any peer searching for it will get the same results.
Popular unstructured P2P networks include Gnutella [12] and FastTrack [11] [5].
Figure 1: Peer-to-peer network
Unstructured P2P networks do however suffer from some serious drawbacks,
described below:
Scalability: As there is no scheme imposed on the way peers join and leave the
network, any peer can join and leave the network at any time, with joining peers
connecting to any peer already in the network. This makes the network grow in a
non-optimal way, and searches cannot be performed efficiently. Information is searched
by broadcasting a query over the network. Broadcasting a query also produces overhead
traffic, since the query reaches the same peers many times and also reaches peers not
capable of providing an answer [1, 5].
Lack of Search Guarantees: Searches for data merely reach a number of random
peers, which does not guarantee an accurate result. This is especially true when peers
are searching for rare data stored by only a few peers. There is then a greater chance
that the search will be unsuccessful, since there is no guarantee that the peer with
desired data will be found. As flooding broadcasts queries to all the nodes in the
network, it causes a large amount of traffic in the network, which impairs search
efficiency [1, 5].
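The overhead traffic caused by flooding can be illustrated with a small sketch. The six-peer overlay below is hypothetical and chosen only for illustration; the function counts every query message sent, including the duplicates that arrive at peers which have already seen the query.

```python
from collections import deque

def flood(graph, source, ttl):
    """Flood a query from source; return (peers reached, messages sent)."""
    seen = {source}
    messages = 0
    queue = deque([(source, None, ttl)])
    while queue:
        node, came_from, hops_left = queue.popleft()
        if hops_left == 0:
            continue
        for neighbor in graph[node]:
            if neighbor == came_from:
                continue  # do not echo the query straight back to the sender
            messages += 1
            if neighbor not in seen:  # duplicates are counted but not re-forwarded
                seen.add(neighbor)
                queue.append((neighbor, node, hops_left - 1))
    return seen, messages

# Hypothetical overlay with arbitrarily formed links (not taken from the thesis)
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D", "E"],
    "D": ["B", "C", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}
reached, messages = flood(graph, "A", ttl=4)
duplicates = messages - (len(reached) - 1)
print(len(reached), messages, duplicates)
```

In this small overlay the query reaches all six peers, but 11 messages are sent where 5 would suffice, so more than half of the traffic is redundant.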
2.2.2 Centralized Server Networks
In a centralized server network (see Figure 2), a peer searching for information
contacts a centralized server, which provides links to peers providing the information.
Despite the improvements in search performance, this network style is hampered by
the vulnerability of a single point of failure, bandwidth bottlenecks and the overhead of
keeping the directory information up-to-date [1].
Figure 2: Centralized Network
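As a minimal sketch of the directory role played by the central server, the class below keeps an index from content items to the peers that provide them. The class and method names are hypothetical and serve illustration only; they do not describe any particular system.

```python
from collections import defaultdict

class CentralIndex:
    """Directory kept by the central server: item name -> providing peers."""

    def __init__(self):
        self._providers = defaultdict(set)

    def register(self, peer, items):
        # A peer announces the items it shares.
        for item in items:
            self._providers[item].add(peer)

    def unregister(self, peer):
        # The directory must be updated whenever a peer leaves.
        for peers in self._providers.values():
            peers.discard(peer)

    def lookup(self, item):
        # A single request to the server returns every known provider.
        return sorted(self._providers.get(item, ()))

index = CentralIndex()
index.register("peer-1", ["song.mp3", "paper.pdf"])
index.register("peer-2", ["paper.pdf"])
print(index.lookup("paper.pdf"))
```

Every lookup and every join or departure passes through this one object, which mirrors the single point of failure and the update overhead mentioned above.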
2.2.3 Super-Peer Networks
Super-peer networks offer a middle ground between unstructured P2P networks and
centralized server networks by introducing hierarchy into the network in the form of
super-peers (see Figure 3). Super-peers provide services to the leaf peers and
index the contents of the leaf peers assigned to them. Queries are broadcast to
super-peers, which forward them to the leaf peers if relevant. The search performance of
super-peer networks is significantly better than that of unstructured P2P networks, and
they also reduce the disadvantage of a single point of failure inherent in centralized
server networks. However, super-peer networks put additional workload on super-peers
and must be carefully constructed to work well. Peers in the network can become
super-peers and take on more responsibilities than others. Still, there are no guarantees
when it comes to the search process. The topology could also result in an inefficient
network due to uncontrolled evolution [1, 5].
Figure 3: Super-Peer Network
2.3 Structured P2P Networks (Deterministic Topologies)
Structured P2P networks give every node global knowledge of the network's structure,
so that any node can route a search to a peer that has the desired file, even if the file is
extremely rare. Nodes in a deterministic topology have a limited view of the network
consisting of a set of neighbors but at the same time know the overall topology. This
can be used to reach locally optimized decisions when broadcasting and routing the
query message. The most common type of structured P2P network is the distributed
hash table (DHT), in which consistent hashing is used to assign ownership of a file to
a particular peer (see Figure 4). Well-known DHTs include Chord [14], Pastry [15]
and CAN [16]. HyperCup [3] is also a structured P2P protocol [5, 7].
Figure 4: DHT-based P2P overlay system [7]
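The way consistent hashing assigns ownership of a file to a peer can be sketched as follows. This is a generic successor-based hash ring, not the scheme of any specific DHT named above; the 32-bit identifier space, the use of SHA-1 and the peer names are assumptions made for illustration.

```python
import bisect
import hashlib

def ring_position(key, bits=32):
    """Map a string to a position on a ring of size 2**bits (assumed size)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

class HashRing:
    def __init__(self, peers):
        # Peers are placed on the ring at the hash of their name.
        self._ring = sorted((ring_position(p), p) for p in peers)
        self._positions = [pos for pos, _ in self._ring]

    def owner(self, file_name):
        """Return the first peer at or after the file's position, wrapping around."""
        index = bisect.bisect_left(self._positions, ring_position(file_name))
        return self._ring[index % len(self._ring)][1]

ring = HashRing(["peer-a", "peer-b", "peer-c"])
print(ring.owner("rare-file.txt"))
```

Because a file's owner depends only on ring positions, adding a peer moves files only to the new peer, while all other assignments stay in place.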
3 The HyperCircle P2P Topology
In P2P networks, nodes are connected to each other in order to share information. In the
HyperCircle topology, the organization of such a network is defined deterministically [1].
3.1 Network Model, Aims and Requirements
The HyperCircle topology aims to be symmetric. Every node in the network should
have identical power and tasks. There is no central server, which precludes the
prominence of some nodes over others. Peers that can send messages directly to each
other are called neighbors. A minimum number of messages should be needed in
order to reach all the peers in the network. Every node in the network should be able
to be the root of the spanning tree. For load balancing, network traffic should be
distributed equally among the peers. The topology should be redundant, with node
failure not hampering the search and broadcast processes.
3.2 Organizing peers into a HyperCircle graph
Figure 5 depicts an 8-point HyperCircle graph. A complete HyperCircle graph
consists of $N = 8^k$ nodes, which means that each point can in itself consist of an
8-point hypercircle. The network diameter is $\Delta = 2 \log_8 8^k = 2k$, which gives the
shortest path length between the nodes furthest away from each other. As can be inferred
from this, the structure is symmetric with no nodes taking a more prominent position than
others. This is crucial for load balancing. Any node can be the source of a broadcast,
the root of a spanning tree, distributing the load equally.
Figure 5: The HyperCircle Topology [1]
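As a worked example of the size and diameter formulas for a complete HyperCircle graph, the short sketch below tabulates both for a few nesting levels k. The function names are illustrative only.

```python
def network_size(k):
    """Number of nodes N = 8**k in a complete HyperCircle graph of level k."""
    return 8 ** k

def network_diameter(k):
    """Diameter 2 * log_8(8**k), which simplifies to 2 * k."""
    return 2 * k

for k in (1, 2, 3):
    print(k, network_size(k), network_diameter(k))
```

A single 8-point circle (k = 1) therefore has diameter 2, and the 64-point hypercircle (k = 2) has diameter 4.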
Edges in the graph are labeled as follows: node X is neighbor-i of node Z, written
X = i(Z). Edges in the graph are undirected; in Figure 5, node 6 is the neighbor-0 of
node 7 and node 7 is the neighbor-0 of node 6. Nodes in the network also
have extended neighbors X = N(Z) = {z1, z2, z3, …}, where N is the neighbor link
set, which consists of a sequence of i-neighbors that X has to follow in order to reach
node Z (and vice versa). In Figure 5, the neighbor link set {0, 1, 2} leads from node 0
to node 3 and back from node 3 to node 0 using the same link set {0, 1, 2}. Edge labels
start at i = 0 and the maximum number of neighbors is 3. Every peer maintains a small
routing table, which consists of the neighboring peers' node IDs and IP addresses. A node
is recognized by its ID and is reached by its address [2].
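The neighbor labels and link sets can be made concrete with a small routing-table sketch. The table below is an assumption: it is reconstructed from the neighbor assignments described for the 8-node circle in Section 3.4, taking edge labels to be the same at both ends of an edge, and it may differ in detail from Figure 5. IP addresses are omitted.

```python
# node -> {edge label -> neighbor}, for one complete 8-point circle (assumed layout)
NEIGHBORS = {
    0: {0: 1, 1: 4, 2: 7},
    1: {0: 0, 1: 5, 2: 6},
    2: {0: 3, 1: 6, 2: 4},
    3: {0: 2, 1: 7, 2: 5},
    4: {0: 5, 1: 0, 2: 2},
    5: {0: 4, 1: 1, 2: 3},
    6: {0: 7, 1: 2, 2: 1},
    7: {0: 6, 1: 3, 2: 0},
}

def follow(start, labels):
    """Follow a sequence of edge labels from start and return the node reached."""
    node = start
    for label in labels:
        node = NEIGHBORS[node][label]
    return node

print(follow(0, [0, 1, 2]))  # path 0 -> 1 -> 5 -> 3
print(follow(3, [2, 1, 0]))  # the same links in reverse order lead back to 0
```

With this table the link set {0, 1, 2} leads from node 0 to node 3, and walking the same labels in reverse order returns from node 3 to node 0.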
3.3 Search and Broadcast Algorithm
The following broadcast algorithm is proposed in [1]:
The node invoking the broadcast sends a message to all its neighbors, marking it with
the edge label on which the message was sent. The receiving node will forward the
message to a) neighbors-(0, 1) if it receives the message from its neighbor-2 or b)
neighbors-(0, 2) if it receives it from its neighbor-1. Nodes receiving the message
from their neighbor-0 will not forward it. After this second forward, if the circle consists
of fewer than 5 nodes, forwarding will stop. If the circle contains 5 nodes or more,
forwarding will stop after the next step.
To further remove redundancy, an additional rule exists for when a circle contains 5
or 6 nodes. Forwarding will then only be done to the neighbor-0s in the second step.
As an example, in Figure 6 node 2 initiates a broadcast, sending to its neighboring
nodes 3, 4 and 6. Node 4 receives the message from its neighbor-2, forwarding to
nodes 5 and 0 (neighbors-(0, 1)). Node 6 receives the message from its neighbor-1, so
it forwards to nodes 7 and 1 (neighbors-(0, 2)). Node 3 will not forward since it
receives the message from its neighbor-0.
Figure 6: 8-point broadcast [1]
In the spanning tree in Figure 6, all nodes receive the message exactly once. N - 1
messages are needed to reach all the nodes, requiring $2 \log_8 8^k$ steps to spread the
message to every node [1, 2].
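The forwarding rules can be tried out on a single complete 8-point circle with the sketch below. Two things are assumptions rather than statements from [1]: the neighbor table, which is reconstructed from the assignments described in Section 3.4 with labels taken to be the same at both ends of an edge, and the cut-off after two steps, which follows from the step count given above for k = 1.

```python
# node -> {edge label -> neighbor}, for one complete 8-point circle (assumed layout)
NEIGHBORS = {
    0: {0: 1, 1: 4, 2: 7},
    1: {0: 0, 1: 5, 2: 6},
    2: {0: 3, 1: 6, 2: 4},
    3: {0: 2, 1: 7, 2: 5},
    4: {0: 5, 1: 0, 2: 2},
    5: {0: 4, 1: 1, 2: 3},
    6: {0: 7, 1: 2, 2: 1},
    7: {0: 6, 1: 3, 2: 0},
}

# Labels to forward on, keyed by the label the message arrived on.
FORWARD_RULES = {0: (), 1: (0, 2), 2: (0, 1)}

def broadcast(root, max_steps=2):
    """Return ({node: copies received}, [(sender, receiver, label), ...])."""
    received = {node: 0 for node in NEIGHBORS}
    messages = []
    frontier = []
    # Step 1: the initiating node sends on all of its edges.
    for label, neighbor in NEIGHBORS[root].items():
        messages.append((root, neighbor, label))
        received[neighbor] += 1
        frontier.append((neighbor, label))
    # Later steps: receivers forward according to the arrival label.
    for _ in range(max_steps - 1):
        next_frontier = []
        for node, arrived_on in frontier:
            for label in FORWARD_RULES[arrived_on]:
                neighbor = NEIGHBORS[node][label]
                messages.append((node, neighbor, label))
                received[neighbor] += 1
                next_frontier.append((neighbor, label))
        frontier = next_frontier
    return received, messages

received, messages = broadcast(2)
print(len(messages), received)
```

Starting from node 2, the sketch sends N - 1 = 7 messages and every other node receives exactly one copy; the same holds for any other starting node in this table.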
A search in the HyperCircle protocol is a broadcast with a time-to-live, i.e. a
broadcast with a limited scope.
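The effect of the time-to-live can be sketched on a broadcast spanning tree. The tree below follows the example around Figure 6, with node 4 forwarding to nodes 5 and 0; that reading, and the function name, are assumptions made for illustration.

```python
# Broadcast tree rooted at node 2: sender -> nodes it forwards to (assumed)
TREE = {2: [3, 4, 6], 4: [5, 0], 6: [7, 1]}

def search_scope(tree, root, ttl):
    """Return the nodes reached when the broadcast is cut off after ttl steps."""
    reached = []
    frontier = [root]
    while ttl > 0 and frontier:
        next_frontier = []
        for node in frontier:
            for child in tree.get(node, []):
                reached.append(child)
                next_frontier.append(child)
        frontier = next_frontier
        ttl -= 1
    return reached

print(sorted(search_scope(TREE, 2, ttl=1)))
print(sorted(search_scope(TREE, 2, ttl=2)))
```

With a time-to-live of 1 only the three direct neighbors are queried; a time-to-live of 2 covers the whole circle.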
3.4 Constructing the HyperCircle Topology
The main idea of the HyperCircle topology is to manage nodes in an n-dimensional space
consisting of 8-point circles where each point can in itself consist of an 8-point circle.
The topology is based on the following rules:
• Every circle has a maximum of eight nodes.
• Every node has a maximum of three relationships, described as neighbor-0,
neighbor-1 and neighbor-2, with the other nodes in the same circle.
• Every node has a neighbor-0. The neighbor-0 is the 180-degree neighbor,
connected to the opposite side of the circle through the circle point. This
relationship does not change unless the node or its neighbor-0 leaves the
topology.
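The three rules can be expressed as a simple consistency check on one circle. The data layout and names below are hypothetical; in particular, requiring the neighbor-0 relationship to be mutual is an assumption based on the edges being undirected.

```python
from dataclasses import dataclass, field

MAX_CIRCLE_SIZE = 8
NEIGHBOR_LABELS = (0, 1, 2)

@dataclass
class Node:
    node_id: int
    neighbors: dict = field(default_factory=dict)  # label -> neighbor node_id

def check_circle(nodes):
    """Return a list of rule violations for one circle (empty if none)."""
    problems = []
    if len(nodes) > MAX_CIRCLE_SIZE:
        problems.append("circle has more than eight nodes")
    by_id = {n.node_id: n for n in nodes}
    for n in nodes:
        # At most three relationships, labeled 0, 1 and 2.
        if not set(n.neighbors) <= set(NEIGHBOR_LABELS):
            problems.append(f"node {n.node_id} uses an unknown neighbor label")
        # Every node needs a neighbor-0, and the relationship must be mutual.
        if 0 not in n.neighbors:
            problems.append(f"node {n.node_id} has no neighbor-0")
        else:
            other = by_id.get(n.neighbors[0])
            if other is None or other.neighbors.get(0) != n.node_id:
                problems.append(f"node {n.node_id}: neighbor-0 is not mutual")
    return problems

pair = [Node(0, {0: 1}), Node(1, {0: 0})]
print(check_circle(pair))
```

A two-node circle whose nodes are each other's neighbor-0 passes the check, while a node without a neighbor-0 is reported.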
To achieve symmetry in the topology, any node in the topology can accept and
integrate new nodes. When a node leaves the topology, a simulative (virtual) node takes
over the position of the departed node, ready to hand the position to a real node
when new nodes join. The neighbor-0 of the departed node takes over the departed
node's network responsibilities until a new node takes its place [1].
The following steps are taken when a new circle is created:
Start: Peer 0 is alone in a newly opened circle.
Step a (Figure 7): Peer 1 wants to join the network, contacting peer 0. Peer 0
integrates the new peer as its neighbor-0, its first vacant position. The neighbor-0
vacancy is always filled first.
Figure 7: Topology Construction (2 nodes) [1]
Step b (Figure 8): Peer 2 wants to join. It can contact either of the two peers. If it
contacts peer 0, peer 0 will open up a new dimension for peer 2 on the hypercircle, as
depicted. As one more peer is needed to balance the circle, a virtual peer 2, called 2',
is created as the neighbor-0 of peer 2. Peer 1 becomes peer 2's neighbor-1 and peer
2''s neighbor-2. Peer 2 becomes peer 0's neighbor-1 and peer 1's neighbor-2. Peer 0
in this case is the integration control node and is responsible for integrating peer 2 into
the topology. Now every node has a neighbor set {0, 1, 2} with other peers. Every peer
is also aware that there is a vacant point in the circle.
If instead peer 1 is contacted, peer 1 becomes neighbor-1 of peer 2 and the neighbor-2
of peer 2’.
Figure 8: Topology Construction (4 nodes, one virtual) [1]
Step c (Figure 9): Peer 3 wants to join. It can contact any of the nodes, but the result
will be the same: since every node knows of the existence of a virtual node, the new
peer will be instructed to take the place of the virtual node, inheriting its neighbors in
the process.
Figure 9: Topology Construction (4 nodes) [1]
Step d (Figure 10): Peer 4 wants to join and contacts, for example, peer 0. Since peer
0's neighbor slots are already occupied, it will open a new dimension for the joining
peer and a simulative peer 4' is added as the neighbor-0 of peer 4. The balance of the
circle is destroyed. Peer 0 will rearrange its neighbor-1 to peer 4 and set peer 2 as the
neighbor-2 of peer 4. Peer 1 will also rearrange itself by setting its neighbor-2 to the
simulative peer 4', while peer 3 becomes the neighbor-1 of peer 4'. Every peer knows
that there is a vacant point in the circle and that peer 4 is responsible for the virtual peer
4'.
Figure 10: Topology Construction (6 nodes, one virtual) [1]
Step e (Figure 11): Peer 5 arrives and replaces peer 4’ as in step c.
Figure 11: Topology Construction (6 nodes) [1]
Step f (Figure 12): Peer 6 arrives and contacts, for example, peer 2. Peer 2 will open a
new dimension for peer 6. A virtual peer 6’ is added as the neighbor-0 of peer 6. The
circle becomes imbalanced again. Peer 2 rearranges its neighbors: it becomes the
neighbor-1 of peer 6. Peer 1 becomes the neighbor-2 of peer 6 and neighbor-1 of peer
5. Peer 5 gets peer 3 as its neighbor-2. Peer 0 is now the neighbor-2 of peer 6’ and
peer 3 is the neighbor-1 of peer 6’. All the nodes are notified about the vacant point in
the circle.
Figure 12: Topology Construction (8 nodes, one virtual) [1]
Step g (Figure 13): Peer 7 joins, replacing peer 6’. Every node is notified that the
circle is full.
Figure 13: Topology Construction (8 nodes) [1]
If more peers want to join, the following will happen:
Peer 8 contacts peer 7. Peer 7 knows that the circle is full. It will create a new circle,
ordered as circle 2, and mark its own circle as circle 1. All the nodes in the circle are
notified that a new circle 2 is created, being the neighbor-0 of circle 1, and with peer 8
representing it. If circle 2 becomes full, a new circle called circle 3 will be created. In
the end, a 64-point circle will have been constructed, as shown in Figure 14 [1, 2].
Figure 14: 64-point hypercircle [1]
If a peer leaves the network, its removal should be carried out in this way:
• If a virtual node does not exist in the circle, a new virtual node is created to
take the place of the leaving node.
• If a virtual node exists, the neighbor-0 of that virtual node will take the place
occupied by the leaving node. The virtual node will cease to exist, and the
neighbors will be reassigned accordingly.
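The join steps b to g and the two departure rules can be summarized by tracking only how many real peers a circle holds and whether it currently contains a virtual peer. The following is a minimal sketch of that bookkeeping, not code from our implementation: the name CircleOccupancy and its members are hypothetical, the circle is assumed to start in the state of step a (two real peers), and circles with fewer than two peers are not modeled since the construction above does not describe them.

```cpp
#include <cassert>

// Hypothetical occupancy model of one 8-point circle (illustration only).
struct CircleOccupancy {
    int realPeers = 2;        // step a: peers 0 and 1 are present
    bool hasVirtual = false;  // at most one virtual peer per circle

    // Number of occupied points, real or virtual.
    int points() const { return realPeers + (hasVirtual ? 1 : 0); }
    bool full() const { return realPeers == 8; }

    // Returns false when the circle is full and a new circle must be created.
    bool join() {
        if (hasVirtual) {      // steps c, e, g: the new peer replaces the virtual peer
            hasVirtual = false;
            ++realPeers;
            return true;
        }
        if (full()) return false;
        ++realPeers;           // steps b, d, f: a new dimension is opened ...
        hasVirtual = true;     // ... and a virtual peer balances the circle
        return true;
    }

    void leave() {
        assert(realPeers > 0);
        --realPeers;
        // Without a virtual peer, the vacated point becomes virtual.
        // With a virtual peer, its neighbor-0 takes over the vacated point
        // and the virtual peer ceases to exist.
        hasVirtual = !hasVirtual;
    }
};
```

Under these assumptions, a circle alternates between an even number of real peers without a virtual peer and an odd number of real peers with one, so the number of occupied points is always even.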
3.5 Topology Maintenance Algorithm
A major challenge in designing P2P networks is that the network should be
symmetric. The HyperCircle topology is based on the idea that emerging nodes take
over responsibility of more than one position in the topology, if needed. Upon arrival
of nodes, the HyperCircle topology unfolds with virtual nodes as place-fillers where
necessary. Upon removal of nodes from the topology, virtual nodes jump to cover the
position, prepared to yield it to arriving peers or peers rearranged following another
node leaving. Since the complete hypercircle topology is implicitly preserved, nodes
joining or leaving do not affect the search and broadcast algorithm. Nodes joining
the network are allowed to ask any node in the topology for integration. The following
steps are then carried out:
3.5.1 Integration Dimension Selection
The node that is integrating the new peer in the topology selects a dimension for the
joining peer. If there are empty points on the hypercircle, these empty points should
be filled. For example: a node arrives and contacts a peer for joining into the network.
The integration node searches for the empty points in its immediate neighborhood, i.e.
at a one-hop distance. If it has an empty point in its immediate neighborhood, it will
integrate the new node there, otherwise passing on the integration control to another
node [2, 5].
3.5.2 Integration Champion Node Appointment
If the node that is contacted for integration does not have an empty point in its
immediate neighborhood, it begins looking in its non-immediate neighborhood. If a
node there has an empty point, the integration control is passed on to that node to
carry out integration. In this case, the first node to forward the control to its
non-immediate neighborhood is called the integration champion node [5].
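The selection described in sections 3.5.1 and 3.5.2 can be sketched as a two-stage search. This is an illustration under stated assumptions rather than our implementation: Position, IntegrationChoice and selectIntegrationPoint are hypothetical names, and the non-immediate neighborhood is assumed to mean nodes at a two-hop distance.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a point on the hypercircle (illustration only).
struct Position {
    bool vacant = false;               // true if the point is empty
    std::vector<Position*> neighbors;  // neighbor-0, neighbor-1, neighbor-2
};

struct IntegrationChoice {
    Position* integrator;  // node that carries out the integration
    Position* vacancy;     // empty point that the joining peer will occupy
    bool forwarded;        // true if the contacted node passed on the control
};

IntegrationChoice selectIntegrationPoint(Position& contact) {
    // 3.5.1: look for an empty point at a one-hop distance.
    for (Position* n : contact.neighbors)
        if (n != nullptr && n->vacant) return {&contact, n, false};
    // 3.5.2: otherwise look one hop further (assumption) and hand the
    // integration control to the neighbor that has the empty point.
    for (Position* n : contact.neighbors) {
        if (n == nullptr) continue;
        for (Position* m : n->neighbors)
            if (m != nullptr && m->vacant) return {n, m, true};
    }
    return {nullptr, nullptr, false};  // no empty point within two hops
}
```

When forwarded is true, the contacted node plays the role of the integration champion node, and integrator is the node that finally integrates the joining peer.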
3.5.3 Node Integration
The node is integrated into the network. The node is assigned one or more positions
on the hypercircle (i.e. a primary position and, if need be, a virtual position) and
connected to the new neighbors [5].
3.5.4 Node Departure
When a node leaves the topology, it must follow a departure protocol to keep the
topology in balance. Node departure should not affect the search and broadcast algorithm.
As a basic rule, if a node leaves the network, it will be replaced by a virtual node,
which is administered by a proper node. In the HyperCircle topology, it is
administered by the neighbor-0 of the leaving node [1, 5].
3.5.5 Broadcast and Search in an Incomplete Hypercircle
The algorithm described in section 3.3 is used for broadcast and search in the
HyperCircle topology. Nodes in the network may cover two positions (i.e. proper
node and virtual node), and will carry out broadcast and search responsibilities for
both positions. If a node that covers more than one position receives a broadcast
message, it will forward the message on behalf of all of its positions, always applying
the basic idea of the broadcast algorithm. Since the broadcast message is received
exactly once by each peer, even if that peer covers more than one position, the source
of the broadcast is never hit again [5].
4 Implementation
The framework is to consist of a simulator able to simulate a network adhering to the
HyperCircle protocol (and, if possible, other P2P protocols as well). The framework is
intended to be used to evaluate the performance of the HyperCircle protocol, and
possibly aid its further development, including producing statistics comparable with
other protocols. To be useful, the simulation must have a degree of flexibility,
allowing the user to define certain parameters such as the size of the network to be
simulated. The simulator must also support nodes joining and leaving the topology
during the simulation, in order to test the maintenance algorithm fully.
To test not only the construction of the topology but also the broadcast algorithm, the
simulation needs to be message-based, i.e. simulation events being related to message
delivery between the nodes. The ability to simulate characteristic network events such
as connection timeouts and line failures would also be helpful in determining the
protocol’s resilience against such events.
With the actual simulation being a very complicated task, it is best left to an existing
software solution, on top of which (or into which) the HyperCircle protocol can be
implemented. If the software has existing protocols built-in, this will both ease the
implementation of the new protocol and give reference data that statistics from
the HyperCircle implementation could be measured against.
4.1 Selection of Simulation Software
Many features have to be considered when selecting simulation software. Some of the
features are as follows [8]:
a. Do not focus on a single issue, such as ease of use. Weigh the applicability of the
software to the needs of the project, its accuracy and its ease of learning.
b. Take execution speed into account, since it affects development time and not only
the duration of experimental runs.
c. Beware of advertising claims; advertisements highlight only the positive aspects of
the software and hide the negative ones.
d. Check the licensing terms: whether the package is open source, free for non-profit
use, or requires a runtime license. Runtime licenses vary both in price and features.
e. Check whether the simulation software can be linked to code or routines written in
external languages such as C, C++ or Java. Simulation software that comes with
existing external routines that are suitable for the project has a big advantage.
The following candidates were considered:
4.1.1 P2PSim
Written in C++, P2PSim [20] comes stock with seven overlay protocols
implemented, but its API is largely undocumented, making it difficult to extend
with new protocols [7]. This is the reason it was not chosen for this project.
4.1.2 Overlay Weaver
Overlay Weaver [21] is a tool intended to facilitate the construction of overlay
protocols. It has simulation capabilities, but these are a secondary function, which
does not extend to the underlay protocol. Simulations have to be run in real time,
without resulting in any statistical data. As such, it is unsuitable for a project
whose main aim is not protocol development but simulation [7].
4.1.3 OverSim
OverSim [22] is an open-source simulator for the Linux/UNIX environment. As
described in the section below, it satisfies all our requirements for this particular
project. Thus, this is the simulator we chose to work with.
4.2 The OverSim Simulator
OverSim [9] is a flexible overlay network simulation framework based on OMNet++,
an open-source simulation environment that is free for academic and non-profit use.
An OMNet++ model consists of a set of hierarchically nested modules, whose
structure is defined in the OMNet++ NED language. Modules are often referred to as
networks. There are two types of modules: compound modules and simple
modules. Modules containing other modules are called compound modules. Simple
modules are at the lowest level of the hierarchy and are implemented directly in C++
using the OMNet++ simulation library. Modules communicate by exchanging
messages via gates and connections. Messages represent packets or frames in the
network. Gates are the input and output interfaces of modules: messages are received
via input gates and sent through output gates.
OverSim is based on a discrete event simulation system for communication and
processing of messages. A discrete event simulation system is a system in which the
state of the system changes at discrete points in time.
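To make the notion of discrete-event simulation concrete, the following minimal sketch shows an event loop in which simulated time jumps from one scheduled event to the next. It is a generic illustration and does not use or reproduce the OMNet++ API; the names Event, Later and EventLoop are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// A scheduled state change: what to do and at which simulated time.
struct Event {
    double time;
    std::function<void()> action;
};

// Orders the priority queue so that the earliest event is on top.
struct Later {
    bool operator()(const Event& a, const Event& b) const { return a.time > b.time; }
};

class EventLoop {
public:
    double now() const { return now_; }

    // Schedule an action `delay` time units after the current simulated time.
    void schedule(double delay, std::function<void()> action) {
        assert(delay >= 0.0);
        queue_.push(Event{now_ + delay, std::move(action)});
    }

    // Process events in time order; the state only changes when an event fires.
    void run() {
        while (!queue_.empty()) {
            Event e = queue_.top();
            queue_.pop();
            now_ = e.time;  // simulated time jumps directly to the next event
            e.action();     // the action may schedule further events
        }
    }

private:
    double now_ = 0.0;
    std::priority_queue<Event, std::vector<Event>, Later> queue_;
};
```

In a message-based simulation, each action would typically correspond to the delivery of one message to a node, which in turn may schedule the delivery of further messages.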
The OverSim framework (as seen in Figure 15) includes implementations of many
structured and unstructured P2P overlay protocols, such as Chord [14], Kademlia
[17], Koorde [18] and Gia [19]. To facilitate the implementation of new protocols,
OverSim includes several functions that are common to many overlay protocol
implementations. These functions include:
• An overlay message handler using Remote Procedure Calls (RPC)
• Lookup functions
• Visualization support
• Bootstrapping support
The overlay message handler provides an RPC interface which deals with packet
retransmission and packet timeouts. The overlay message handler also collects
statistics related to the messages sent, received, forwarded and dropped.
The lookup mechanism provides both iterative and recursive lookup support. It handles
querying the routing table and returning the closest node during query broadcasting,
and includes support for malicious node behavior.
OverSim also provides visualization support by having a strong graphical user
interface. The topology structure and the passing of messages between the nodes can
be explicitly displayed in the GUI, provided that code for visualization has been
provided in the desired protocol implementation.
Bootstrapping support is in the form of a generic module called the Bootstrap Oracle.
The Bootstrap Oracle gives the addresses of random nodes already in the topology to
the nodes wanting to join.
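The idea behind this kind of bootstrapping can be sketched as a registry that hands out the address of a randomly chosen known node. The sketch below is only an illustration of the concept and is not the OverSim Bootstrap Oracle interface; the class BootstrapRegistry, its methods and the use of strings as addresses are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical registry of nodes already in the topology (illustration only).
class BootstrapRegistry {
public:
    explicit BootstrapRegistry(unsigned seed) : rng_(seed) {}

    void registerNode(const std::string& address) { nodes_.push_back(address); }

    void removeNode(const std::string& address) {
        for (std::size_t i = 0; i < nodes_.size(); ++i) {
            if (nodes_[i] == address) {
                nodes_.erase(nodes_.begin() + static_cast<std::ptrdiff_t>(i));
                return;
            }
        }
    }

    // Returns the address of a random known node, or an empty string if
    // no node has joined yet (the caller then starts a new topology).
    std::string randomNode() {
        if (nodes_.empty()) return std::string();
        std::uniform_int_distribution<std::size_t> pick(0, nodes_.size() - 1);
        return nodes_[pick(rng_)];
    }

private:
    std::mt19937 rng_;
    std::vector<std::string> nodes_;
};
```

A joining node would ask the registry for an address and then contact that node for integration, as described in section 3.5.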
OverSim includes several underlay network models which simulate complex underlay
networks as well as simplified networks for large-scale simulations. OverSim can
simulate networks of up to 100,000 nodes. A good introduction to OverSim can be
found in [9].
Figure 15: Modular Architecture of OverSim [9]
OverSim’s main features are described in the following sections:
4.2.1 Flexibility
A simulator should allow the simulation of both structured and unstructured overlay
networks. Because of the modular design and common API, OverSim can easily
facilitate the implementation of new features and protocols. The user can specify the
simulation parameters in a human-readable configuration file [9].
4.2.2 Scalability
OverSim is designed according to current network requirements, with performance
as a major requirement. OverSim can simulate networks with up to 100,000 nodes [9].
4.2.3 Interchangeable Underlying Network Models
OverSim provides different underlying network models. On the one hand, the framework
provides a fully configurable IPv4 network topology with realistic bandwidths, packet
delays and packet losses (INET), and on the other hand it provides a fast and simple
alternative model for high performance (Simple Underlay) [9].
4.2.4 Interactive GUI
OverSim provides GUI support for validating and debugging of new and existing
overlay protocols. OverSim can visualize network topology structure, node states and
message communication between the nodes [9].
4.2.5 Base Overlay Class
OverSim implements a base overlay class. The base overlay class facilitates the
implementation of structured P2P protocols by providing an RPC interface, a generic
lookup mechanism and a common API for key-based routing (KBR) [9].
4.2.6 Reuse of Simulation Code
OverSim provides implementations of different overlay protocols. These protocols are
reusable in real network applications. OverSim provides ways to compare simulation
results with real network test results, since OverSim is able to exchange messages
with other implementations of the same overlay protocol [9].
4.2.7 Statistics
OverSim collects data such as messages sent, received and forwarded, successful or
unsuccessful packet delivery and packet hop count. External programs can display
this output in an easily readable format [9].
4.3 Layer Structure of the HyperCircle Implementation
Figure 16 shows the layered structure of OverSim and our HyperCircle
implementation’s place in it. The OverSim Underlay implements an underlying
network model, several of which are available (Simple Network, INET, etc.). These
are all completely transparent to the overlay layers using a consistent UDP interface,
and can be exchanged freely.
The OverSim BaseOverlay layer provides basic functionality common for the overlay
protocols, such as bootstrapping support and message handling. In our
implementation we did have to override a few methods from this layer, mostly having
to do with the generic lookup function not being suitable for our broadcast algorithm.
Connected by the OverSim API is our HyperCircle overlay protocol, here divided into
topology and messaging implementations. The topology implementation handles all
arrivals and departures of nodes in a way that is consistent with the HyperCircle
structure.
Figure 16: Layer structure of the HyperCircle implementation
To accommodate the neighbors and other special properties of a
HyperCircle node, we specialized OverSim’s basic NodeHandle class into our own
HyperCircleNodeHandle class, as well as implemented classes to hold the logical
hypercircles, called HyperCircleNodeBucket (Node bucket is a term carried over from
OverSim’s Kademlia implementation), HyperCircleBaseDimension and
HyperCircleDimension. These container classes are designed to hold nodes, buckets
and dimensions, respectively. For the messaging implementation, we specialized the
BaseRouteMessage class into our own HyperCircleRouteMessage class, and we
overrode the BaseOverlay implementation of sendToKey() to be able to send
messages to more than one node at a time.
Using a key-based routing interface, applications written for OverSim can use our
overlay protocol. For testing purposes, we used the KBR TestApp that comes with
OverSim. Other applications can be written with further test purposes in mind.
4.4 The HyperCircle Class Diagram
[Class diagram showing the classes OverSim NodeHandle, HyperCircleNodeHandle, OverSim BaseRouteMessage, HyperCircleRouteMessage, HyperCircleNodeBucket, HyperCircleBaseDimension and HyperCircleDimension, with the associations neighbor-0, neighbor-1, neighbor-2, vacantPoint, vacantCircle, vacantDimension, circleVector, dimensionVector and upOneLevel]
Figure 17: The HyperCircle class diagram
The class diagram (Figure 17) shows our implementation of the HyperCircle protocol.
The HyperCircleNodeHandle class, specialized from the basic OverSim NodeHandle
class, stores all properties of the nodes, while the HyperCircleNodeBucket class stores
the properties of the first-level hypercircles. The HyperCircleBaseDimension class stores
the second-level circles, and the HyperCircleDimension class (superclass of
HyperCircleBaseDimension) stores circles of every higher level. The most important
properties of all these classes are the neighbors, which are illustrated here as recursive
relationships. All three neighbor relationships will almost always exist, the
exception being when there are fewer than three nodes (or circles) in the encompassing
circle.
The HyperCircleNodeBucket class is implemented as a vector storing up to 8 nodes,
and also having its own neighbor relationships. The buckets are always contained
within a dimension; in the case of a topology of no more than 64 nodes this would be
the universe dimension that is always at the top level. A further vector class called
circleVector is used to store the addresses of the buckets that are not full (instantiated
with the name nonFullCircle). In this way, if a joining peer contacts a peer in a full
circle, it can be forwarded to a non-full circle. The same is also true for the class
dimensionVector and its instance nonFullDimension, albeit dealing with dimensions.
The HyperCircleDimension class, along with its specialized class
HyperCircleBaseDimension, represents circles of level two and upwards (i.e. circles
containing circles, as opposed to circles containing nodes).
HyperCircleBaseDimension is a vector containing HyperCircleNodeBuckets, while
HyperCircleDimension is a vector containing HyperCircleBaseDimensions or
HyperCircleDimensions. As with the HyperCircleNodeBucket class, these classes
also have three neighbor relationships, and their own class for containing non-full
dimensions (dimensionVector).
This brings us to the subject of vacantPoint, vacantCircle and vacantDimension, three
vector instances which hold the addresses of virtual nodes, buckets and dimensions.
(There can be at most one virtual node in every bucket, one virtual bucket in every
BaseDimension and one virtual dimension (base or otherwise) in every dimension in
the network at any time). The virtual objects take priority over the non-full objects
and get filled when a new proper object appears (for example, a new peer, or a new
circle as a consequence of a new peer joining).
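The priority between virtual and non-full objects can be sketched as a simple decision function for a joining peer. This is a simplified illustration, not our actual code: circles are identified by plain integer indices instead of addresses, the function chooseJoinTarget and the type JoinDecision are hypothetical, and picking the first entry of each vector is an assumption.

```cpp
#include <cassert>
#include <vector>

enum class JoinAction { FillVacantPoint, JoinNonFullCircle, OpenNewCircle };

struct JoinDecision {
    JoinAction action;
    int circle;  // index of the target circle, or -1 when a new circle is opened
};

// vacantPointCircles: circles that currently contain a virtual node.
// nonFullCircles:     circles that have fewer than 8 points.
JoinDecision chooseJoinTarget(const std::vector<int>& vacantPointCircles,
                              const std::vector<int>& nonFullCircles) {
    // Virtual nodes take priority: filling one restores a real point
    // without rearranging any neighbors.
    if (!vacantPointCircles.empty())
        return {JoinAction::FillVacantPoint, vacantPointCircles.front()};
    // Next, direct the peer to a circle that still has room.
    if (!nonFullCircles.empty())
        return {JoinAction::JoinNonFullCircle, nonFullCircles.front()};
    // Only when every circle is full is a new circle opened.
    return {JoinAction::OpenNewCircle, -1};
}
```

The same decision could be applied one level up, with vacantCircle and nonFullDimension taking the place of the two vectors.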
The HyperCircleRouteMessage, specialized from OverSim’s BaseRouteMessage
class, holds the messages sent to and from the nodes. Depending on the
circumstances, a message can have a relationship with 2, 3 or 4 nodes. Every message
has a source node and a destination node. If it is on its way to its destination via other
nodes in the network, it also has a relationship to the last node it passed on its way, and
if it passes a node that is forwarding messages on behalf of a virtual node, it also has a
relationship to the virtual node. In the same way, it also has relationships with the last
bucket and the last dimension it passed, as well as virtual buckets and dimensions. In
order to terminate the message’s further travel when it has passed the appropriate
number of nodes, it has a counter for nodes traversed, and a separate counter for
dimensions traversed.
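The bookkeeping carried by a message can be sketched with a plain structure. This is a simplified stand-in and not the HyperCircleRouteMessage class itself: nodes and circles are represented by integer identifiers, the names RouteInfo, copyForNextHop and mayTravelFurther are hypothetical, and the two limits are parameters because the appropriate number of steps depends on the broadcast algorithm of section 3.3.

```cpp
#include <cassert>

// Hypothetical, simplified view of the routing fields of a message.
struct RouteInfo {
    int source = -1;
    int destination = -1;
    int lastNode = -1;           // last node the message passed
    int virtualNode = -1;        // virtual node the recipient acts for, -1 if none
    int lastCircle = -1;         // last bucket the message passed
    unsigned step = 0;           // nodes traversed so far
    unsigned dimensionStep = 0;  // dimensions traversed so far
};

// Produce the copy that the current node sends to one of its neighbors.
RouteInfo copyForNextHop(const RouteInfo& msg, int currentNode, int currentCircle,
                         int virtualNode = -1) {
    assert(currentNode >= 0);
    RouteInfo next = msg;
    next.step = msg.step + 1;
    next.lastNode = currentNode;
    next.lastCircle = currentCircle;
    next.virtualNode = virtualNode;
    return next;
}

// A message stops travelling once either counter has reached its limit.
bool mayTravelFurther(const RouteInfo& msg, unsigned maxSteps,
                      unsigned maxDimensionSteps) {
    return msg.step < maxSteps && msg.dimensionStep < maxDimensionSteps;
}
```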
4.4.1 Extensions to the HyperCircle Algorithm
To make the topological HyperCircle structure work in a satisfactory way, a few
extensions had to be made to the algorithm described in [1]. Most of them have to do
with conserving the balance of the network when nodes join and leave. For this
purpose, there can only ever be one virtual node in a circle at any time (that goes for
both buckets and dimensions). Therefore, all nodes need to be informed of the vacant
point. In real life, this would be implemented with broadcast messages to all nodes,
but in our simulation the vacant nodes, buckets and dimensions are stored in their own
global vectors (a map in the case of dimensions, to be able to map the vacant point to
a specific level). The same is true for our nonFullCircles and nonFullDimensions
vectors, which store all non-full circles and dimensions so that new nodes can be
directed to them instead of opening new circles and/or dimensions of their own, which
would seriously disturb the balance of the network.
4.4.2 Code examples
To illustrate the implementation, we here present two code examples representative of
the code as a whole:
Topology construction:
This code is run when a node is to be added to a level-1 circle, i.e. a bucket. Since the
algorithm is largely dependent on the number of nodes present in the circle at any
time, the code largely consists of if statements like this:
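// Link the joining node n to contactNode, remembering contactNode's previous neighbor-1
HyperCircleNodeHandle* oldneighbor = contactNode->neighbor1;
contactNode->neighbor1 = n;
n->neighbor1 = contactNode;
n->neighbor2 = oldneighbor;
// Create the virtual counterpart of n and register it as the vacant point of this bucket
HyperCircleNodeHandle* virt = new HyperCircleNodeHandle(true);
virt->circle = bucketno;
vacantPoint.push_back(virt);
n->neighbor0 = virt;
virt->neighbor0 = n;
// The bucket holds 2 nodes (step b)
if ((*BaseRoutingTable)[bucketno]->size() == 2)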
{
contactNode->neighbor0->neighbor1 = virt;
contactNode->neighbor0->neighbor2 = n;
n->neighbor2 = contactNode->neighbor0;
virt->neighbor1 = contactNode->neighbor0;
virt->neighbor2 = contactNode;
contactNode->neighbor2 = virt;
(*BaseRoutingTable)[bucketno]->push_back(virt);
}
// The bucket holds 4 nodes (step d)
else if ((*BaseRoutingTable)[bucketno]->size() == 4)
{
virt->neighbor1 = contactNode->neighbor2;
virt->neighbor2 = contactNode->neighbor0;
contactNode->neighbor2->neighbor1 = virt;
contactNode->neighbor2->neighbor0->neighbor2 = n;
contactNode->neighbor2->neighbor0->neighbor1 = contactNode->neighbor0;
virt->neighbor2->neighbor1 = n->neighbor2;
virt->neighbor2->neighbor2 = virt;
(*BaseRoutingTable)[bucketno]->push_back(virt);
}
// The bucket holds 6 nodes (step f)
else if ((*BaseRoutingTable)[bucketno]->size() == 6)
{
virt->neighbor1 = contactNode->neighbor0;
virt->neighbor2 = contactNode->neighbor2->neighbor1;
contactNode->neighbor2->neighbor1->neighbor2 = virt;
oldneighbor = contactNode->neighbor0->neighbor1;
contactNode->neighbor0->neighbor1 = virt;
contactNode->neighbor0->neighbor2 = oldneighbor;
oldneighbor = contactNode->neighbor0->neighbor2->neighbor2->neighbor2;
contactNode->neighbor0->neighbor2->neighbor2->neighbor1 = oldneighbor;
contactNode->neighbor0->neighbor2->neighbor2->neighbor2 = n;
oldneighbor = contactNode->neighbor0->neighbor2->neighbor1;
contactNode->neighbor0->neighbor2->neighbor1 = contactNode->neighbor0
->neighbor2->neighbor2;
contactNode->neighbor0->neighbor2->neighbor2 = oldneighbor;
(*BaseRoutingTable)[bucketno]->push_back(virt);
}
In this code, a node is added along with a virtual node, as in steps b, d and f in the
topology construction (see chapter 3.4). The neighbors are rearranged accordingly,
hence the many assignment operations.
Message handling:
This code, and variations of it, is used to send messages from the current (cur) node to
its neighbors (cur->neighbor0, etc.), while first checking whether the recipient node
is real or virtual. If it is virtual, the message will instead be sent to the neighbor-0 of
the recipient, first setting the parameter VirtualNode to let the recipient know that it
should act as a proxy to the virtual node. Parameters are also set describing the
message’s last node traversal (the current node) and iterating the step counter (the
number of nodes traversed as of this node).
// neighbor-0 is a real node: forward a copy of the message to it directly
if (cur->neighbor0 != NULL && !cur->neighbor0->isUnspecified() && !cur->neighbor0->virt)
{
HyperCircleRouteMessage* routeMsg0 = new HyperCircleRouteMessage(*routeMsg);
routeMsg0->setStep(routeMsg->getStep() + 1);
routeMsg0->setLastNode((unsigned int) cur);
routeMsg0->setLastCircle(cur->circle);
sendRouteMessage(*cur->neighbor0, routeMsg0, useNextHopRpc);
}
// neighbor-0 is virtual: forward the copy to its neighbor-0, which acts as proxy
else if (cur->neighbor0 != NULL && cur->neighbor0->isUnspecified() && cur->neighbor0->virt)
{
HyperCircleRouteMessage* routeMsg0 = new HyperCircleRouteMessage(*routeMsg);
routeMsg0->setVirtualNode((unsigned int) cur->neighbor0);
routeMsg0->setStep(routeMsg->getStep() + 1);
routeMsg0->setLastNode((unsigned int) cur);
routeMsg0->setLastCircle(cur->circle);
sendRouteMessage(*cur->neighbor0->neighbor0, routeMsg0, useNextHopRpc);
}
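Since the same check is repeated for every neighbor, the choice of recipient can be isolated in a small helper. The following self-contained sketch illustrates that idea only and is not part of our implementation: Node is a reduced stand-in for HyperCircleNodeHandle, and Hop and resolveHop are hypothetical names.

```cpp
#include <cassert>

// Reduced stand-in for HyperCircleNodeHandle (illustration only).
struct Node {
    bool virt = false;          // true for a virtual node
    Node* neighbor0 = nullptr;  // for a virtual node: the real node acting for it
};

struct Hop {
    Node* recipient;   // real node that will physically receive the message
    Node* onBehalfOf;  // virtual node being proxied, or nullptr
};

// Decide where a message addressed to `neighbor` must actually be sent.
Hop resolveHop(Node* neighbor) {
    if (neighbor == nullptr) return {nullptr, nullptr};  // nothing to send
    if (!neighbor->virt) return {neighbor, nullptr};     // real node: send directly
    // Virtual node: its neighbor-0 receives the message as a proxy.
    assert(neighbor->neighbor0 != nullptr && !neighbor->neighbor0->virt);
    return {neighbor->neighbor0, neighbor};
}
```

With such a helper, the sending code would set the VirtualNode parameter whenever onBehalfOf is not null and otherwise send the copy unchanged.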
4.5 Simulation
The simulation is implemented as several separate applications in OverSim; for our
purposes we used the KBR TestApp. It works as follows: during the simulation
run, nodes will join and leave the network randomly. The nodes follow our
HyperCircle algorithm to construct a HyperCircle topology as described. Test
messages are sent from random nodes to be received by the destination node. During
the simulation run, statistics are gathered, notably delivery ratio (the percentage of
sent messages that are received at their destination), hop count (the average number of
nodes the messages traverse until they are received, equal to the number of steps in
the specification) and time delay (time between message sent and message received).
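The three statistics can be computed from per-message records as in the following sketch. It only illustrates the definitions given above and is not how OverSim gathers its statistics; MessageRecord, Summary and summarize are hypothetical names, and a negative receive time is assumed to mark a message that was never delivered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One test message as seen by a hypothetical statistics collector.
struct MessageRecord {
    double sentAt;      // simulation time when the message was sent
    double receivedAt;  // simulation time of delivery, negative if never delivered
    int hops;           // nodes traversed until delivery
};

struct Summary {
    double deliveryRatio;  // delivered / sent, as a fraction between 0 and 1
    double meanHopCount;   // average over delivered messages
    double meanDelay;      // average over delivered messages
};

Summary summarize(const std::vector<MessageRecord>& records) {
    std::size_t delivered = 0;
    long hopSum = 0;
    double delaySum = 0.0;
    for (const MessageRecord& r : records) {
        if (r.receivedAt < 0.0) continue;  // lost message: counts only as sent
        ++delivered;
        hopSum += r.hops;
        delaySum += r.receivedAt - r.sentAt;
    }
    Summary s{0.0, 0.0, 0.0};
    if (!records.empty())
        s.deliveryRatio = static_cast<double>(delivered) / static_cast<double>(records.size());
    if (delivered > 0) {
        s.meanHopCount = static_cast<double>(hopSum) / static_cast<double>(delivered);
        s.meanDelay = delaySum / static_cast<double>(delivered);
    }
    return s;
}
```

Multiplying deliveryRatio by 100 gives the percentage used in the text.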
4.5.1 Simulation Parameters
In addition to the application chosen, the simulation is also controlled by a set of
configuration options, some application-dependent, others global. These are set in the
omnetpp.ini file, where a set of simulation runs are defined. For each simulation run,
the following parameters, among others, can be set:
• network, specifying which underlay network model to use, usually SimpleNetwork or
IPv4
• overlayType, specifying the overlay protocol to use, either our own HyperCircle
protocol, or one of the stock protocols like Chord [14], Kademlia [17], etc.
• tier1Type, specifying the application modules to use for simulation, usually
KBRTestModules for the standard KBRTestApp, or a custom module set
developed by the user
• targetOverlayTerminalNum, specifying the number of nodes that should be added to
the network at the start of the simulation
• creationProbability, specifying the probability that a new node should be created
(this is evaluated at regular intervals, a higher number resulting in a larger number
of nodes added to the network over time)
• removalProbability, specifying the probability that a node will be removed (again
evaluated at regular intervals, a higher number resulting in more nodes being
removed over time)
• gracefulLeaveProbability, the probability that a node will perform a graceful leave
on exit, as opposed to just timing out (our implementation does not distinguish
between these possibilities)
4.5.2 Statistics Gathering
The statistics gathered during simulation can be plotted to a graph using the Linux
tool plove. We used this to see if our results were credible and in accordance with the
theoretical results in [1]. We also used an exhaustive test method, drawing the whole
network topology and then tracing the messages passing through it by hand according
to the debug output from our implementation. We did this for a handful of messages
in various network sizes, and found (after a few code adjustments) that each message
we traced reached its destination, as well as every other node in the network, without
any node receiving it twice.
5 Results
5.1 Limitations of the Protocol Implementation
Because of a certain lack of time at the end stages of our project, we did not
implement a balancing function for circles of level 2 and above, i.e. circles containing
circles. Thus, in the rare case that a level-2 circle becomes empty, it will not be
removed, and the containing circle will not be balanced. Since this requires that all 64
nodes inside the circle decide to leave and no new nodes arrive during this time, this
will probably only happen at the end of simulation runs when all nodes are called
upon to leave the network. At this time it will have very little significance.
Another limitation is that of system resources. Because of the recursive nature of
some of our code, a lot of memory will be allocated for messages which will not
immediately be freed. Our test computer had relatively little memory and did not
respond well to simulations of 1500 nodes and above, but we imagine that even more
well-equipped computers will at some node count stop behaving in a satisfactory way.
5.2 Simulation Run Setup
The following simulation runs were set up in the omnetpp.ini file for our
simulation purposes (chapter 4.5.1 describes some of the parameters below, the
rest are unimportant for the purposes of this thesis):
[Run 34]
description = "HyperCircle"
network = SimpleNetwork
**.overlayType = "HyperCircleModules"
**.tier1Type = "KBRTestAppModules"
**.overlay.iterativeLookup=false
**.useCommonAPIforward = false
**.targetOverlayTerminalNum=256
**.creationProbability=0.5
**.migrationProbability=0.0
**.removalProbability=0.8
**.gracefulLeaveProbability=0.3
[Run 36]
description = "Kademlia"
network = SimpleNetwork
**.overlayType = "KademliaModules"
**.tier1Type = "KBRTestAppModules"
**.overlay.iterativeLookup=true
**.targetOverlayTerminalNum=256
**.creationProbability=0.5
**.migrationProbability=0.0
**.removalProbability=0.8
**.gracefulLeaveProbability=0.3
**.tier*.kbrTestApp.lookupNodeIds=true
**.overlay.lookupRedundantNodes = 8
**.overlay.lookupMerge = true
[Run 37]
description = "Chord"
network = SimpleNetwork
**.overlayType = "ChordModules"
**.tier1Type = "KBRTestAppModules"
**.targetOverlayTerminalNum=256
**.creationProbability=0.5
**.migrationProbability=0.0
**.removalProbability=0.8
**.gracefulLeaveProbability=0.3
Each simulation run has, as can be seen, the same probabilities for node arrival and
departure. We chose to simulate a network of 256 nodes, as this was something our test
computer could handle without freezing. We let this simulation run for 15 minutes
(simulation time, not actual time), as, again, we didn’t want our test computer to start
misbehaving.
5.3 Simulation Results
This section lists delivery ratio, hop count and delay time graphs for nodes that join
and leave the network using the hypercircle algorithm, compared with the same
statistics for the OverSim implementations of Chord [14] and Kademlia [17]. Since
the purpose of this report is to provide a framework for future study of the Hypercircle
algorithm, only one example simulation is presented here.
Although this Kademlia implementation does not use broadcasting in the exact sense,
since it tries to determine the best path to the receiving node, the Hypercircle
algorithm also results in a shortest path (along with several others). Since it is this
path that will determine the end-to-end delay time (the additional broadcasting will
only take up bandwidth), the results are somewhat comparable. The hop count results,
on the other hand, are not (more on this below).
The diagrams that follow were generated by plove from the vector file output by
OverSim. The horizontal axis denotes the simulation time from simulation start to
simulation finish. The vertical axis denotes either delivery ratio (the ratio of messages
successfully delivered to messages sent), hop count (the number of nodes traversed
between the sending and receiving nodes) or global time delay (the time between
message sent and message received at the end-point).
Figures 18, 19 and 20 show the delivery ratios of the three simulation runs. The
delivery ratio is defined as the percentage of messages correctly delivered to their
destination. If a protocol is well-designed and well-implemented, the delivery ratio
will only be dependent on simulation parameters relating to network instability, and in
certain cases on nodes that are leaving the network, therefore not being able to
forward the message.
The delivery ratios are roughly equal, with the HyperCircle implementation showing a
slight dip at times when a node is removed (our implementation does not resend
messages in these cases). The Chord implementation seems to be sensitive to this as well.
Figures 21, 22 and 23 show the hop count (the number of nodes that each message
passes on its way to its destination). These results are not comparable, since Kademlia
uses a lookup function optimized for finding the shortest path to a node, as opposed to
finding the optimal spanning tree for broadcasting. The results are nevertheless
interesting when seen together with the time delay results in Figures 24 to 26.
Figure 18: HyperCircle delivery ratio (256 nodes)
Figure 19: Kademlia delivery ratio (256 nodes)
Figure 20: Chord delivery ratio (256 nodes)
Figure 21: HyperCircle hop count (256 nodes)
Figure 22: Kademlia hop count (256 nodes)
Figure 23: Chord hop count (256 nodes)
Figure 24: HyperCircle global delay time (256 nodes)
Figure 25: Kademlia global delay time (256 nodes)
Figure 26: Chord global delay time (256 nodes)
Figures 24, 25 and 26 show the end-to-end delay time, i.e. the time it takes for a
message to travel from the sending node to the receiving node. Here we can see that
the HyperCircle implementation has a lower average delay time than Kademlia,
despite having a higher hop count as seen above. Chord has both a higher hop count
and a higher delay time.
Apart from these, statistics also exist for the delay time and hop count of each single
node in the network. Custom statistics can also be gathered by modifying the source
code of the test application and/or the protocol.
6 Conclusion and Future Work
The HyperCircle protocol was proposed to alleviate certain drawbacks in large-scale
P2P systems by enforcing a balanced network structure. To assist evaluation of the
merits of this protocol, we developed a simulation framework to test the HyperCircle
topology in a virtual network.
Based on OverSim, a discrete-event simulator specifically designed to simulate large
networks, we developed our own OverSim module implementing the HyperCircle
protocol alongside the stock implementations of Chord, Kademlia and others.
The resulting framework is flexible enough to allow extensive experimentation, both
with parameter values and actual source code. Since the simulation is controlled by
OverSim applications, new applications can be constructed to perform customized
simulation for whatever purpose the user intends.
Subsequent tests and simulation runs showed that our implementation of the
HyperCircle protocol functions correctly and that the results can serve as a
measurement of the protocol’s performance. Comparing the results with the OverSim
implementations of Chord and Kademlia showed that HyperCircle has a lower global
time delay than both, even though its messages traverse more nodes than those of
Kademlia. In these simulation runs, this implementation of HyperCircle therefore
delivers messages faster than both Chord and Kademlia.
There are certain limitations to the framework. With large networks, our test computer
showed signs of performance degradation, resulting in slow execution and
unresponsiveness. Besides being a hardware issue, this could also be related to the
recursive nature of our implementation, which allocates large chunks of memory
before releasing them.
Since our aim was only to construct a framework for future experiments, we have not
explored the simulation results in depth ourselves, but leave it to others to
investigate further. Suffice it to say, our tentative simulation runs show that our
implementation performs better than the stock protocols we tested it against.
Following our results, future work could include writing a specialized
OverSim test application for the HyperCircle protocol (introducing the ability to use
pure broadcasting), pitting HyperCircle against other protocols already implemented
for OverSim, or implementing further protocols for comparison against
HyperCircle.
7 References
[1]
Feiyu Lin, Kurt Sandkuhl; (2006) Towards efficient search in P2P networks
with 8-point hypercircles.
IADIS International Conference WWW/Internet 2006, Murcia, Spain.
[2]
Mario Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl; (2002) A
Scalable and Ontology-Based P2P Infrastructure for Semantic Web Services.
Stanford University
[3]
Boyko Syarov; (2007) HyperCup.
Institute of Computer Science, Albert Ludwig University of Freiburg, Germany
[4]
Michael Miller; (2001) Discovering P2P.
Sybex, ISBN 9780782140187
[5]
M. Schlosser; (2002) Semantic Web Services.
Diplomarbeit, Hannover University.
[6]
http://en.wikipedia.org/wiki/Peer-to-peer (Acc. 10/10/2008)
[7]
Eng Keong Lua, Jon Crowcroft, Marcelo Pias, Ravi Sharma, Steven Lim;
(2004) A Survey and Comparison of Peer-to-Peer Overlay Network Schemes.
IEEE Communications Surveys and Tutorials, March 2004
[8]
J. Banks, J.S. Carson, B.L. Nelson, D.M. Nicol; Discrete-Event System
Simulation, 4th edition.
Prentice Hall, ISBN 978-0131446793
[9]
I. Baumgart, B. Heep, S. Krause; (2007) OverSim: A flexible overlay network
simulation framework.
Institute of Telematics, Universität Karlsruhe (TH)
[10]
Napster website: http://www.napster.com
[11]
KaZaA website: http://www.kazaaa.com
[12]
Gnutella website: http://www.gnutella.com
[13]
JXTA website: http://www.jxta.org/
[14]
The Chord Project website: http://pdos.csail.mit.edu/chord/
[15]
Pastry website: http://freepastry.org/
[16]
Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker;
(2001) A Scalable Content-Addressable Network.
Proceedings of ACM SIGCOMM 2001
[17]
Petar Maymounkov, David Mazières; Kademlia: A peer-to-peer Information
System Based on the XOR Metric.
New York University
[18]
M. Frans Kaashoek, David R. Karger; Koorde: A simple degree-optimal
distributed hash table.
MIT Laboratory for Computer Science
[19]
Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, Scott
Shenker; (2003) GIA: Making Gnutella-like P2P Systems Scalable.
SIGCOMM 2003
[20]
P2PSim website: http://pdos.csail.mit.edu/p2psim/
[21]
Overlay Weaver website: http://overlayweaver.sourceforge.net/
[22]
OverSim website: http://www.oversim.org