Bachelor Informatica — Universiteit van Amsterdam

Distributed Load-Balancing of Network Flows using Multi-Path Routing

Kevin Ouwehand

September 20, 2015

Supervisor(s): Stavros Konstantaros, Benno Overeinder (NLnetLabs)
Abstract
With the growth of the internet, networks become bigger and data flows keep increasing. To keep
up with this increase, new technologies like SDN (explained in chapter 2) have been invented
to manage these networks and to load-balance them in order to maintain network performance. This
management and load-balancing has thus far been done with a central OpenFlow controller, which
presents a network bottleneck and a single point of failure. Therefore, we present a
unified approach that uses multiple controllers to combine the advantages of OpenFlow with
a distributed approach. Each controller makes routing decisions based on an intelligent
algorithm that uses the estimated bandwidth usage of flows and the estimated amount of free
bandwidth a path has. Various parameters for proper bandwidth estimation and their effect on
the performance of the system are tested, as well as the stability of the system with regard
to oscillation between multiple paths for a flow. We observe that, with good parameters, an
accurate estimation of the bandwidth usage of flows leads to good network performance and
optimal results. However, the detection of congestion at links carrying flows that use relatively
low bandwidth compared to the link capacity is not sufficient and will have to be improved. Finally,
the system is stable and shows no signs of oscillation between paths.
Contents

1 Introduction
  1.1 Research Questions
  1.2 Related Work
  1.3 Thesis Outline

2 Background
  2.1 Current load-balancing techniques
    2.1.1 TRILL
    2.1.2 SPB and ECMP
  2.2 Software Defined Networks (SDN)
    2.2.1 Flow-based forwarding using OpenFlow

3 Approach
  3.1 Strategy description
  3.2 Intelligent algorithm
  3.3 Retrieving information
    3.3.1 Network topology
    3.3.2 Bandwidth estimation for flows and paths
  3.4 Other controller functionality
    3.4.1 Packet-ins for new flows
    3.4.2 Algorithm and problematic flows
  3.5 Configurable parameters

4 Implementation
  4.1 Controller to controller communication
  4.2 Topology handling
  4.3 Implementation constraints
    4.3.1 Manageability
    4.3.2 Flow issues

5 Experimental results
  5.1 Test-bed configuration
    5.1.1 First scenario
    5.1.2 Second scenario
  5.2 Performance measurements
    5.2.1 Functionality Test
    5.2.2 Stability Test

6 Discussion

7 Conclusion

8 Future work

9 References
CHAPTER 1
Introduction
Since its creation, the internet has kept growing both in size and usage. Networks become
bigger, more datacenters are being built and expanding in physical size as well as capacity [1].
This growth is needed to keep up with more data flows that also become bigger and bigger.
According to a recent article published by Cisco [2], the annual global IP traffic is now five times
larger than in 2010. The amount of yearly IP traffic will reach the zettabyte barrier in 2016,
and will be 2 zettabytes by 2019. Also, by 2019, Content Delivery Networks (CDNs, which are
responsible for delivering content like video streams to end-users) will carry more than half of
the internet's traffic. With the increasing popularity of video-streaming services like Netflix,
this should be no surprise, especially with new technologies like 4K that offer video streams at a
higher resolution and thus require even more bandwidth.
All this traffic is delivered via high-speed networks to end-users. With this increase of traffic,
proper load-balancing of the networks becomes important to maintain high speeds. Using the
classical Ethernet switches to build these networks presents a few problems. Although they
are easy to set up and require little to no maintenance, it is not possible to do any form
of load-balancing by dynamically using multiple paths. In large networks, multiple paths are
often available, but they are not always used, which leads to network congestion and inefficient
network usage. There are other technologies, like TRILL and ECMP, which are discussed in the
next chapter. TRILL provides some form of load-balancing, but in this method a switch keeps
state about all other switches to make a routing decision [3]. ECMP is a method used to make
routing decisions, but it only considers paths of equal cost and does not load-balance along all
possible paths [4]. With networks ever-increasing in size, these methods could create serious
scalability problems.
Another technology that enables load-balancing is Software Defined Networking (SDN). A
classical switch handles the forwarding of packets (data plane) and the routing (control plane)
on its own. With SDN, these two planes are separated, and the switch is left only with the
data plane [5]. The complex control plane is moved to a different system, which is usually an
average commodity server. This system can be programmed to make the routing decisions for
the switch. A great example of SDN is OpenFlow, which will be explained in more detail in the
next chapter.
While this method allows for great manageability, and thus load-balancing of the network via
that central controller, it also presents a few problems. The most notable one is that, with
the aforementioned growth of networks and data flows, the central approach does not scale and
thus presents a bottleneck. Furthermore, it is a single point of failure. If this controller stops
functioning for any reason and the switches lose their connection with the controller, they will
delete all the installed rules [6]. This means that the network will be completely unusable until
the controller functions properly again. To mitigate this problem, OpenFlow allows network
engineers to configure a backup controller. Also, a controller can install emergency rules that
are activated when the switch loses its connection with the controller. However, these rules are
static and provide no proper (dynamic) load-balancing of the network.
1.1 Research Questions
Our approach combines SDN with a distributed approach, to get the best of both worlds. Multiple, independent controllers will be used to control the switches, and multiple paths will be
used to do dynamic load-balancing. Controllers can communicate with each other to exchange
information to enhance their routing decisions. In our research, we will look at the performance
of this approach. Also, when using multiple paths, oscillation of paths can take place, so the
stability of the solution is tested as well. Finally, the controllers communicate with each other
to exchange free bandwidth information of their own links, since there is no central controller
anymore. An interesting question is whether local information is enough to achieve high throughput,
or how the use of non-local information changes the results. Our research questions are thus:
• What is the performance of the system?
• How stable is the system?
• How does communication between controllers influence the network throughput?
1.2 Related Work
Various other studies also analyzed the effect on performance when load-balancing traffic flows
over various paths, but so far they all depended on a central entity to handle the load-splitting.
However, all studies showed an increase in response time and/or throughput, depending on their
optimization goals.
A thorough study by Konstantaras et al. [7] found that for larger file transfers, splitting
traffic flows (instead of using the same path) yielded up to a 45 percent increase in
throughput. Also, Sridharan et al. [8] presented and tested a near-optimal solution for traffic
engineering while preserving the existing infrastructure, by using traffic knowledge to reroute
traffic over multiple paths.
Finally, Jarschel et al. [9] researched several methods of load-balancing traffic. They
tested Round-Robin, Bandwidth-based, DPI (Deep Packet Inspection), and Application-Aware
path selection. DPI outperformed all path selection methods except the Application-Aware
one, but at the cost of a much less efficient use of network resources. The Application-Aware
method yielded results equal to DPI, with a use of network resources as efficient as that of
the non-DPI methods.
1.3 Thesis Outline
In the next chapter, the relevant background is explained, including the concepts of switching
in general, its limitations, and a few other techniques for load-balancing network loads such
as the aforementioned TRILL and ECMP. Also, the general concepts of SDN will be presented;
most notably, OpenFlow will be explained in more depth, as the solution presented in this thesis
uses OpenFlow. In chapter 3, the presented solution will be explained, covering the design and
the algorithm. In chapter 4, the implementation and the overall constraints of that design will
be explained. In chapter 5, the results of the experiments will be shown, covering the configuration
of the test-bed, the tested scenarios, and the results. In chapter 6, the results will be discussed,
and in chapter 7 a conclusion will be drawn. Finally, future work is discussed in chapter 8, after
which the list of references is presented.
CHAPTER 2
Background
There are various methods for forwarding packets through a network. The most widely used
forwarding switches are the classical Ethernet switches, which are not without limitations. For
instance, dynamic load-balancing of an Ethernet network is not possible. Using multiple paths
for packet flows is also not possible, and flooding can be problematic as well. The last issue can
be attributed to the fact that Ethernet does not have a Time-To-Live (TTL) value as IP does.
If a network contains loops (which is often the case for networks that have multiple paths
between destinations), flooded Ethernet packets will circulate forever. However, protocols like
STP (Spanning Tree Protocol) counter this by disabling flooding on certain ports. In IP, the
TTL value is decremented at each hop to prevent packets from circulating forever, as packets
with a TTL value of zero are discarded.
2.1 Current load-balancing techniques
Although some of these limitations have been overcome by adding to the Ethernet standard
(e.g., STP), not all issues are accounted for: dynamic load-balancing of the network is still not
possible. A few methods that do address this problem will be discussed.
2.1.1 TRILL
Transparent Interconnection of Lots of Links (TRILL) is an IETF standard that combines bridging and routing by using special TRILL switches (also called RBridges) within an existing Ethernet network [3]. These RBridges broadcast their connectivity to all other RBridges, so that each
one knows about all the others and their connectivity (a so-called link state protocol). A TTL
value in the form of a hop count is used to address the issue of flooding in networks with loops.
Using the connectivity information, each TRILL switch calculates all the shortest paths from
itself to every other TRILL switch. TRILL then selects, based on a certain selection algorithm
(e.g. layer 3 techniques), which path(s) are used for what frames. The best part of TRILL is that
it can co-exist within an existing Ethernet network and add layer 3 techniques to improve the
performance of the network. The downside is that every RBridge keeps connectivity information
and paths to all the other RBridges, which does not scale well.
2.1.2 SPB and ECMP
While using a spanning tree prevented the use of some links that could yield a loop of packets
when flooding, Shortest Path Bridging (SPB) enables all paths that have the same (least) cost.
This method is also known as Equal-Cost Multi-Path (ECMP) routing, and can be combined
with TRILL as the selection algorithm. ECMP is a per-hop strategy where a router can select
which next hop a packet takes, if multiple equal-cost paths are available. If multiple paths are
available, but there are no paths with the same cost as the least cost path, then only the least
cost path is used. The advantage that ECMP has is the fact that load-balancing can be done
for paths with equal costs, making sure that higher cost (and potentially much slower or worse)
paths are not used. The downside is the other side of that coin: paths with a slightly higher cost
are not used at all, which could lead to congestion on the selected least-cost path(s) while
leaving the slightly higher cost paths idle [10].
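As an illustration of the per-hop selection (a sketch of one common implementation choice, not taken from the ECMP or TRILL specifications), many routers hash the flow's 5-tuple and use the hash to pick one of the equal-cost next hops, so that all packets of a flow stay on the same path and are not reordered:

import hashlib

def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port, equal_cost_next_hops):
    # Hash the 5-tuple so that all packets of the same flow map to the same
    # next hop; different flows are spread over the equal-cost next hops.
    key = "%s|%s|%s|%s|%s" % (src_ip, dst_ip, proto, src_port, dst_port)
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return equal_cost_next_hops[digest % len(equal_cost_next_hops)]

# Example: two equal-cost next hops towards 10.0.0.2.
print(ecmp_next_hop("10.0.0.1", "10.0.0.2", 6, 1234, 5678, ["via s1", "via s2"]))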
2.2 Software Defined Networks (SDN)
In a normal router or switch, the data path (forwarding of packets) and the control path (routing)
are handled by the same device. The SDN architecture separates these two, leaving the simple
data path at the switch, and moving the complex control path to another system. This concept
aims to be simple, dynamic, and manageable by making the network control programmable [5].
By doing this, the network control can be programmed to be very flexible and do all sorts of
things, like dynamic load-balancing, using multiple paths, handling flooding, etc.
2.2.1 Flow-based forwarding using OpenFlow
The OpenFlow protocol is one way of implementing SDN, by providing communication between
the data plane and the control plane. With OpenFlow, the OpenFlow-enabled switches have a
flow table with rules on what packets to output to which ports. A central OpenFlow controller
manages all these rules for all the switches, which are communicated via the OpenFlow protocol
[6]. An overview of flow-based forwarding using the OpenFlow standard version 1.0 will be
given in this section, using information from the specification [6]. For an even more in-depth
explanation, the OpenFlow white paper and specification can be consulted.
Figure 2.1: Structure of a flow entry, showing the 3 parts that make up a flow entry.
At the heart of the OpenFlow approach are flow entries, which are used by OpenFlow switches
to handle the data plane. An OpenFlow switch has one or more flow tables, and each table has
zero or more flow entries. A flow entry consists of header fields, counters, and a list of actions (see
figure 2.1). The header fields are used for matching incoming packets to this flow entry by using,
for example, the link-layer source and destination address. Wild-carding can be used to match
all flows using a certain TCP source port, for instance. Also, if the switch supports it, subnet
masks can provide more fine-grained wild-carding. Table 2.1 shows an example of the header
fields part of a flow entry.
in port | eth src | eth dst           | ip src | ip dst   | ip proto | tp src | tp dst
1       | *       | aa:bb:cc:dd:ee:ff | *      | 10.0.0.2 | 6 (TCP)  | 1234   | 5678

Table 2.1: Example of a flow entry showing some of the more important header fields that are
used in this research. Wild-carded options are denoted with a *. It should be noted that in the
description below, Ethernet addresses are also known as MAC addresses.
Field     Description
in port   The port from which the packet entered the switch, also known as the ingress port.
eth src   The Ethernet source address of the packet.
eth dst   The Ethernet destination address of the packet.
ip src    The IP source address of the packet.
ip dst    The IP destination address of the packet.
ip proto  The protocol of the transport layer of the packet. A value of 6 indicates TCP, whereas
          17 indicates UDP. Other values are possible and are mentioned in the specification. In
          the OpenFlow 1.0 specification, this field is referred to as 'nw proto'.
tp src    The transport layer source port, e.g. TCP source port 1234.
tp dst    The transport layer destination port, e.g. TCP destination port 5678.
The counters of a flow entry are used for keeping track of per-flow statistics. These statistics
include, among others, the number of packets that have been matched with this flow entry, the
number of bytes that have been sent using this flow entry, and more. An example of a flow entry
showing some of the more important counters is in table 2.2.
duration | packet count | byte count | idle timeout | idle age | hard timeout
12.345s  | 42           | 65432      | 30s          | 2s       | 600s

Table 2.2: A flow entry showing some of the more important counters, of which a few were used
in this research. They are explained in detail in the table below.
Counter       Description
duration      The duration shows how long the flow entry has been in the flow tables of the switch.
packet count  This shows how many packets have matched this flow entry.
byte count    This tells how many bytes of packets have been forwarded using this entry.
idle timeout  The idle timeout value tells the switch how long to wait before removing the flow
              entry, if no packets matched that entry in the specified amount of time.
idle age      This shows how long no packets have matched this flow entry.
hard timeout  The hard timeout value tells the switch how long to wait before removing the flow
              entry, regardless of whether it is active or not. The switch will remove a flow entry
              if either the idle age reaches idle timeout seconds, or if the duration reaches hard
              timeout, whichever comes first.
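As a small illustration of how these two timeouts interact (a sketch, not code from an actual switch), the removal condition for a flow entry can be written as:

def should_remove(duration, idle_age, idle_timeout, hard_timeout):
    # A flow entry is removed when it has been idle for idle_timeout seconds,
    # or when it has existed for hard_timeout seconds, whichever comes first.
    return idle_age >= idle_timeout or duration >= hard_timeout

# With the values from table 2.2: not removed yet (2 < 30 and 12.345 < 600).
print(should_remove(duration=12.345, idle_age=2, idle_timeout=30, hard_timeout=600))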
Finally, the actions list is used to handle packets that matched this flow entry. Various (and
even multiple) actions can be set, e.g., send all matched packets to the controller, forward all
matched packets out of a certain port, etc. The next table (table 2.3) shows some of the possible
actions of a flow entry. There are more actions defined, but these are the most important ones,
as they are used in this research.
Action              Description
modify              This action allows the switch to modify various headers of the actual packets
                    that matched this flow entry. For example, the Ethernet destination address
                    can be modified, VLAN IDs can be changed, or the transport-layer destination
                    port can be changed.
drop                This action simply drops the packet.
forward             This action tells the switch to forward the packet over one or more of its
                    physical ports. Virtual ports can be used as well; examples of virtual ports
                    are the CONTROLLER, IN PORT and ALL ports. Forwarding can then be used to send
                    the packets to the controller (CONTROLLER), to flood the packets (ALL), to send
                    them back out of the port they arrived from (IN PORT), or to send them to their
                    destination using any of the normal, physical ports.
enqueue (optional)  This action enqueues a packet to a pre-defined queue attached to a port, and is
                    used to provide basic Quality-of-Service (QoS) as configured by the queue. For
                    example, a simple configuration can be used to rate-limit the bandwidth usage
                    of flows. This action is optional because the OpenFlow specification does not
                    require the switch to implement it. All the other actions and items mentioned
                    are required to be implemented by an OpenFlow switch.

Table 2.3: A table showing some of the possible actions that can be used for flow entries.
OpenFlow protocol
The OpenFlow protocol is an application-layer protocol that runs on top of TCP, and handles
communication between OpenFlow switches (data plane) and OpenFlow controllers (control
plane). It defines various communication messages that are used by the controller and switch to
manage the flow tables. These messages can be used to send and receive network packets from
the switch, install flow-table entries to forward flows over a certain path, query the switch for
statistics, and much more. The switch and controller can initiate and send messages to each
other asynchronously, resulting in communication in both directions (switch-to-controller and
controller-to-switch communication). A list of some of these messages and their descriptions is
provided in table 2.4.
Statistics
Various types of statistics can be gathered using OpenFlow. First, there are counters per table,
showing the number of active flow entries, the number of packets looked up in this table, and
the number of matched packets for this table. They also include how many entries can be stored
in that table. Second, there are statistics per individual flow entry, which correspond to the
counters field of a flow entry as explained in the previous section (table 2.2). And finally,
there are statistics per port, showing statistics comparable to those of flow entries. They
include, among others, the number of packets and bytes sent and received via this port. A short
list of these port statistics is presented below in table 2.5.
Message              Description
packet-in            The switch sends a packet-in message to the controller if it received a
                     packet that did not match any flow entry in its tables. This message is
                     also sent if the packet matches an existing entry with an action specifying
                     that it should be sent to the controller. With this message, the switch
                     sends the first (often 128) bytes of the packet as well as the ingress port
                     to the controller. This enables the controller to determine the proper
                     header fields for a potential new flow entry. The controller can then
                     create a new flow entry, or update an existing one, to accommodate this new
                     packet and communicate that to the switch via a flow-mod message.
send-packet          This message is sent by the controller to send a packet (included in the
                     message) out of a specified port on the switch.
flow-mod             The controller sends a flow-mod message if a new flow entry needs to be
                     added to the tables of the switch. This message is also used to update an
                     existing flow entry, which can be seen as removing and re-adding that entry
                     (effectively resetting any counters for that entry). This message can be
                     sent by the controller as a response to a packet-in event, or it can be a
                     standalone message. The controller includes in this message the header
                     fields and a list of actions.
setup/configuration  The setup/configuration messages are used to set up the initial OpenFlow
                     connection and to exchange information about configuration details. These
                     details include, among others, the features supported by the switch and the
                     version of OpenFlow that will be used by both the switch and the controller.
error                The error messages are used by the switch to indicate errors to the
                     controller, such as a failure to modify a flow entry.
read-state           This message is sent by the controller to request various statistics about
                     the flow tables, flow entries, and ports. The switch must respond with the
                     requested information.

Table 2.4: A table showing some of the OpenFlow messages that are defined by the OpenFlow
protocol.
Statistic         Description
rx packets        This shows how many incoming packets have been received on this port.
tx packets        This shows how many outgoing packets have been sent using this port.
rx bytes          This shows how many bytes have been received on this port.
tx bytes          This shows how many bytes have been sent using this port.
drops and errors  These show how many received and transmitted packets have been dropped, as
                  well as how many receive and transmit errors have occurred. If any packets
                  are dropped or errors occur, these statistics will show that. The switch can
                  also send an error message to the controller with more details.

Table 2.5: A table showing some of the OpenFlow statistics that are reported for OpenFlow
ports.
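To make the read-state message and these statistics concrete, the following sketch shows how a controller built on POX (the platform used in chapter 4) could request flow and port statistics; it is a minimal example using POX's OpenFlow 1.0 bindings, not the implementation presented in this thesis:

from pox.core import core
import pox.openflow.libopenflow_01 as of

def request_statistics(connection):
    # Ask the switch for per-flow statistics (byte/packet counts per flow entry)
    # and for per-port statistics (rx/tx bytes and packets per port).
    connection.send(of.ofp_stats_request(body=of.ofp_flow_stats_request()))
    connection.send(of.ofp_stats_request(body=of.ofp_port_stats_request()))

def _handle_flow_stats(event):
    for entry in event.stats:  # list of ofp_flow_stats objects
        print(entry.match, entry.byte_count, entry.duration_sec)

def _handle_port_stats(event):
    for port in event.stats:   # list of ofp_port_stats objects
        print(port.port_no, port.rx_bytes, port.tx_bytes)

core.openflow.addListenerByName("FlowStatsReceived", _handle_flow_stats)
core.openflow.addListenerByName("PortStatsReceived", _handle_port_stats)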
CHAPTER 3
Approach
There are currently various ways of load-balancing network flows using multiple paths, as explained in the previous chapter. The idea is to use multiple paths to reduce congestion, increase
the throughput of the network, and to optimally use the network topology. As shown in chapter
2, there are already a few methods to dynamically use multiple paths to route traffic over the
network. There are multiple ways to select a path out of the available paths: for instance, the shortest path, or the path with the least congestion. Since the network
traffic keeps increasing, requiring more and more bandwidth, our approach selects the path with
the most free bandwidth. This allows flows to use as much bandwidth as possible and thereby
increase the network throughput. However, the path with the most free bandwidth may
not always be the shortest path, so it is quite possible that the latency increases. On the other
hand, if the shortest path is congested, then the longer path could very well be faster.
3.1 Strategy description
Currently, load-balancing using OpenFlow is done with a centralized approach; here we present
our distributed approach. Our approach uses multiple OpenFlow controllers,
each controlling an OpenFlow switch. These controllers can communicate with each other to
exchange information about the network status. The approach is distributed in the sense that
each controller in the network makes a decision about which packet is forwarded over which
link (’next-hop forwarding’). A controller makes this decision based on local information (its
own statistics), and may use additional non-local information to improve its decision (statistics
communicated by other controllers). In order to do dynamic load-balancing, multiple paths will
be used for routing. The routing decisions that will be made by the controllers will be made
based on the bandwidth usage of network flows, as well as how much free bandwidth a path has.
This will be explained in paragraph 3.3.2.
When multiple independent controllers are used, no single point of failure is present, as each
controller manages its own switch (or a small group of switches), leading to independent devices
(or independent sub-networks respectively). If one or more controllers were to stop functioning,
their corresponding switches would stop working. However, this time, other controllers could
detect this and use a different path to work around the problem. The network will still be
usable, and will thus also be more stable than with the central approach. Finally, the biggest
advantage of this approach is scalability: with the growth of networks and more switches being
added, more controllers can be used to handle these new switches.
An overview of the entire architecture of the solution can be seen in figure 3.1. The POX
controller [11] provides the basis of our OpenFlow controller, handling the OpenFlow connection
with the OpenFlow switch. We extended the POX controller with various components. Most
notably, the intelligent algorithm assigns paths to flows, so that each controller makes its own
routing decisions. To make good routing decisions, some information is needed, which is retrieved
from the other components. They provide network topology information, as well as information
about the network status. The network status is regarded as the amount of bandwidth flows
use, and also how much free bandwidth a path has. As each controller controls only its own switch,
it can only gather statistics from that switch. Therefore, another component
was added to communicate with other controllers, so that local information can be shared among
controllers to improve their routing decisions. All these components will be explained in more
detail in the following sections.
Figure 3.1: Architecture of the presented solution, showing the various components of the system
and how they interact with each other. The POX Controller [11] and Open vSwitch [12] provided
the basis of our architecture, to which we added various components for our method to function.
3.2 Intelligent algorithm
At the heart of the system is the algorithm that makes the routing decisions, and thus decides which flows take which paths. This algorithm is run by every controller, using local information and optionally non-local information. As explained, since the optimization goal is
network throughput, decisions need to be made based on bandwidth usage. Algorithm 1 presents the proposed solution to the problem statement in pseudocode. A
flowchart of the algorithm is presented in figure 3.2.
First, for each flow in F, all the paths between the current switch and the destination of
that flow are determined. The paths are determined via the Topology manager in
figure 3.1. The current switch represents the switch for which the controller runs the algorithm.
Optionally, the paths can be sorted to, for instance, prefer shorter paths. All the paths that
cannot support the required bandwidth of this flow are filtered out. This way, it is ensured
that flows that use a lot of bandwidth do not switch to slower paths that cannot support them.
Also, paths that lead back to the source of this flow are removed, to prevent loops. Now that all
the paths for all the flows are known, the flows are grouped based on the number of paths they
have. Then, for each group in Fgrouped, all the flows are sorted by priority, so that flows with
a higher priority are processed first, making sure that they get the paths with more bandwidth
first.
After the flows have been grouped and sorted, the algorithm processes each group. Per flow
in this group, the list of paths is consulted. It selects a new path if the gain in bandwidth by
switching to that path is more than a given percentage of the current bandwidth usage of this
flow. This is to prevent constant switching of paths for flows. If no such path can be found (or
no paths can support this flow), this flow does not change paths. The algorithm processes the
groups in ascending order of the number of paths that their flows have. This way, flows with
only one path are handled first, then flows with two paths, etc. The idea is that flows with
multiple paths can adjust to flows with fewer paths.
Furthermore, by determining the amount of free bandwidth of a path, local information is
always used. Optionally, the amount of free bandwidth is requested from controllers along the
path, using at most n edges. If n is 0, then only local information is used. If n is 1 or more, then
other controllers are queried for their free bandwidth information along the path. The retrieval
of how much free bandwidth a path has is done via the Path bandwidth calculator component
in figure 3.1. Finally, after all the flows have been processed, new flow entries are installed for
the flows that changed paths.
Data: Graph G = (V, E), current node v, list of flows F, n (hops away)
Result: A list of flows with their assigned paths
Grouped flows Fgrouped = None;
for flow in F do
    flow.paths ← findAllShortestPaths(v, flow.destination);
    Remove paths that cannot support this flow's bandwidth requirement;
    Remove paths that lead back to the source of this flow (to prevent loops);
    Sort paths based on number of edges, so shorter paths are processed first;
    Add flow based on |flow.paths| to Fgrouped;
end
for group in Fgrouped do
    Sort flows in group based on priority;
    for flow in group do
        for path in flow.paths do
            Get the amount of free bandwidth b along at most n edges of this path;
            bandwidthGain ← b − flow.bandwidthRequirement;
            if bandwidthGain ≥ flow.bandwidthRequirement * pathGainThreshold then
                Assign this flow to path path and update the amount of free bandwidth
                along this path;
                break;
            end
        end
    end
end

Algorithm 1: Algorithm in pseudocode, using flows as objects for simplicity.
Optionally, if not all flows fit (there is a shortage of available bandwidth), then Weighted
Fair Queuing (WFQ) could be used to allocate bandwidth to flows based on priority.
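A Python-style sketch of Algorithm 1 is given below; names such as find_all_shortest_paths, free_bandwidth, and assign are placeholders for the components in figure 3.1, not the actual implementation:

def run_algorithm(graph, current_node, flows, n, path_gain_threshold):
    # Group flows by the number of usable paths they have.
    grouped = {}
    for flow in flows:
        paths = find_all_shortest_paths(graph, current_node, flow.destination)
        paths = [p for p in paths
                 if free_bandwidth(p, n) >= flow.bandwidth_requirement  # path can carry the flow
                 and flow.source not in p]                              # no path back to the source
        paths.sort(key=len)                                             # prefer shorter paths
        flow.paths = paths
        grouped.setdefault(len(paths), []).append(flow)

    # Flows with fewer alternatives are handled first; within a group,
    # higher-priority flows get the paths with more free bandwidth first.
    for num_paths in sorted(grouped):
        for flow in sorted(grouped[num_paths], key=lambda f: f.priority, reverse=True):
            for path in flow.paths:
                gain = free_bandwidth(path, n) - flow.bandwidth_requirement
                if gain >= flow.bandwidth_requirement * path_gain_threshold:
                    assign(flow, path)  # also updates the free bandwidth along 'path'
                    break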
3.3 Retrieving information
In order for our approach to function, various pieces of information are needed. The network
topology needs to be known, so that all paths are known and packets can be forwarded correctly. Furthermore,
the controllers can communicate with each other to exchange information. The information that
they exchange is a simple estimate of how much free bandwidth a certain link has (see paragraph
3.3.2). This non-local information is used in the decision-making process of the algorithm to
make better decisions with regards to the network status.
Figure 3.2: Algorithm Flowchart
3.3.1 Network topology
First, information about the network topology needs to be known to all controllers, so they
all know how to route packets over the network. Specifically, we need to know which host is
connected to which switch behind which port, as well as all the links (which switch is connected to
which other switch behind which port). Furthermore, we need to know the bandwidth capacities
of all the links, which is used to calculate how much free bandwidth a link has. Since our
approach uses multiple OpenFlow controllers, each controller must also know which controller
controls which switch. This information could be learned and changed dynamically, as is common
in the centralized approach by using link discovery methods. Since handling topology issues is
not part of our research, in our implementation, all of this information is simply read from a file,
and handled by the Topology manager (see figure 3.1).
3.3.2 Bandwidth estimation for flows and paths
The main piece of information used in the decision-making process is the amount of bandwidth
a flow uses. Each controller uses a statistics and bandwidth estimation handler, as seen in figure
3.1. The statistics handler queries the connected switch for the byte count values of all the flow
entries (flows), and the bandwidth estimation handler keeps track of these measurement values.
The statistics handler uses a certain query interval to measure the byte count values at regular
times and thereby give the bandwidth estimation handler a good idea of the bandwidth usage of a
flow. The controller also keeps track of the time when the switch responds with this information.
For each flow, all byte count measurements, and their respective measurement times, are used
to calculate the intermediate bandwidth usages between measurements. The final estimated
bandwidth is then simply the average bandwidth of these intermediate values. By using this
method, the controller can estimate how much bandwidth a flow uses, and use it to make a
decision.
Another critical part of the decision-making process is the amount of free bandwidth a path
(a list of links) has. This information can be communicated to other controllers, which can use
it in their decision-making process. The amount of free bandwidth that a link has is estimated
using almost the exact same process as the one for flows. However, instead of querying the switch
for the statistics of all the flow entries, the switch is queried for the statistics of all the ports
instead. The ’bandwidth usage of a port/link’ is then estimated by using the same method as
the one for flows. Since the capacity of the link is known (see paragraph 3.3.1), we now have an
estimate of how much free bandwidth a link has.
By increasing the value of the query interval, bandwidth estimates can be made over a longer
period of time, thereby making the estimations smoother. This also means that the implemented solution will be less sensitive to changes in flow bandwidth than if a lower value were
used. Of course, more complex bandwidth estimation methods can be used, but that is not the
goal of this research project.
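As a minimal sketch of this estimation (a hypothetical helper, not the actual handler), the stored (time, byte count) samples of a flow or port can be turned into a bandwidth estimate as follows:

def estimate_bandwidth(samples):
    # samples: list of (timestamp_in_seconds, byte_count) pairs, oldest first.
    # Returns the estimated bandwidth in Mb/s, averaged over the intermediate
    # bandwidths between consecutive measurements.
    if len(samples) < 2:
        return None  # not enough samples yet (at least 2 are required)
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        rates.append((b1 - b0) * 8.0 / (t1 - t0) / 1e6)
    return sum(rates) / len(rates)

# Example: roughly 80 Mb/s measured over two 2-second intervals.
print(estimate_bandwidth([(0.0, 0), (2.0, 20000000), (4.0, 40000000)]))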
3.4 Other controller functionality
Besides the retrieval of information necessary for the algorithm, other functionality is necessary
as well for the controller to function. First, communication between controllers is handled using
JSON messages, communicating the amount of free bandwidth of a link. This is done via a
simple request/response model using the controller-to-controller handler of each controller. A
controller requests from another controller how much free bandwidth it has at a certain port
(representing a link). The other controller then replies to the request with this information by
consulting its own statistics. Other functionality is the handling of packet-ins and problematic
flows, which are explained next.
3.4.1 Packet-ins for new flows
Incoming packets that do not match an existing flow entry yield a packet-in event, and are sent
to the controller of the switch that received the packet. The controller then determines all the
paths to the destination of this packet. It selects the path with the most free bandwidth, using
at most n hops-away information, by querying other controllers. After the path has been chosen,
a new flow entry is installed at the switch for this new flow. Now that the flow is known to the
controller, the algorithm is scheduled to run again after a configurable amount of seconds. It is
currently scheduled to run after three times the measurement/query interval. This is done to
allow the controller to measure the bandwidth of this new flow and get an accurate estimate, so
that it can make a good routing decision.
Finally, the idle timeout and hard timeout values, which are set to 30 and 600 seconds
respectively, control how long a flow entry should be kept in the flow tables. The flow entry gets
removed if no packets matched this flow entry for 30 seconds, or if 600 seconds have passed since
this flow entry got installed/modified. When updating flow entries (assigning different paths
to flows), the timers for the idle and hard timeout deadlines are reset. This is equivalent to
removing and re-adding the flow entry. Other values are of course possible, but this falls outside
the scope of the research project.
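As an illustration, a packet-in handler along these lines could be written on top of POX roughly as follows; choose_path, output_port_for, schedule_algorithm_run, and QUERY_INTERVAL are placeholders for the components and parameters described above, and the OpenFlow calls use POX's OpenFlow 1.0 bindings:

import pox.openflow.libopenflow_01 as of

IDLE_TIMEOUT = 30    # seconds without matching packets before the entry is removed
HARD_TIMEOUT = 600   # seconds after which the entry is removed unconditionally

def _handle_packet_in(event):
    packet = event.parsed
    path = choose_path(packet.dst)  # placeholder: path with the most free bandwidth

    msg = of.ofp_flow_mod()
    msg.match = of.ofp_match.from_packet(packet, event.port)
    msg.idle_timeout = IDLE_TIMEOUT
    msg.hard_timeout = HARD_TIMEOUT
    msg.actions.append(of.ofp_action_output(port=output_port_for(path)))
    msg.data = event.ofp  # also forward the packet that triggered the packet-in
    event.connection.send(msg)

    # Re-run the algorithm once a bandwidth estimate for the new flow exists.
    schedule_algorithm_run(delay=3 * QUERY_INTERVAL)  # placeholder, 3*q seconds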
3.4.2 Algorithm and problematic flows
The algorithm is also scheduled to run every 30 seconds (another configurable interval). Furthermore, it is run again if the controller detects a problematic flow, which is signaled by the
bandwidth estimation handler (see figure 3.1). A flow is deemed ’problematic’ if the new bandwidth estimation drops below a percentage of the previous bandwidth estimation. In order to
detect this, a certain number of byte count samples and intermediate bandwidths are stored.
The algorithm then runs for all flows that have a bandwidth estimation, and ignores those flows
that do not have an estimation yet. The flows without an estimation will thus not
change paths. Also, if desired, flows with an estimated bandwidth below a certain value can be
filtered out from being able to trigger the algorithm. This is done so that low-bandwidth flows
can be more or less ignored, as optimizing for those flows is not very interesting compared to
optimizing for big flows.
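The detection itself amounts to a simple relative-drop check; a sketch using the bandwidth sensitivity (s) and filter (f) parameters of section 3.5:

def is_problematic(previous_bw, new_bw, sensitivity, bw_filter):
    # Flows below the filter value are ignored entirely.
    if previous_bw < bw_filter and new_bw < bw_filter:
        return False
    # Problematic if the new estimate dropped by more than 'sensitivity'
    # (a fraction, e.g. 0.10 for 10%) relative to the previous estimate.
    return new_bw < previous_bw * (1.0 - sensitivity)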
3.5 Configurable parameters
Table 3.1 summarizes and describes all the parameters introduced in this chapter. Each parameter adjusts the workings of each component in figure 3.1 and as explained in this chapter.
Experiments will be done to test the performance of our method with regards to the components
and their parameters (see chapter 5).
Symbol  Name                  Description
q       query interval /      This parameter controls how often the controller queries the
        measurement interval  switch for statistics on flow entries and ports (see figure 3.1).
a       algorithm interval    This parameter can be adjusted to specify how often the controller
                              should run the intelligent algorithm (see figure 3.1).
t       new flow timeout      This value controls how long the controller should wait before
                              running the algorithm after installing a flow entry for a new flow.
n       look ahead            This parameter determines how much non-local information is used,
                              if any. Its value corresponds directly to how many hops away along
                              a path another controller is queried for statistics.
m       num samples           The number of byte count samples to store is specified via this
                              parameter. A minimum value of 2 is required.
g       bw gain               The threshold to switch to another path is indicated via this
                              parameter. It is the bandwidthGain variable in algorithm 1.
p       path gain threshold   The path gain threshold determines the needed gain in bandwidth
                              for a flow to switch to another path. Only if the bandwidth gain
                              of switching to that path exceeds this value (percentage-wise) is
                              that flow assigned to the new path.
f       filter                This value can be set to filter out flows with an estimated
                              bandwidth below this value from triggering the algorithm as a
                              problematic flow (see figure 3.1).
s       bw sensitivity        This parameter controls how sensitive the bandwidth estimation
                              handler is with regard to flows with a lower bandwidth estimation
                              than the previous estimation. If the drop in bandwidth usage is
                              more than this value, percentage-wise, then the algorithm can be
                              signaled to run again (see figure 3.1), provided both estimations
                              are not below the value of the filter parameter.
i       idle timeout          The idle timeout value determines how long an idle flow entry
                              should be kept in the flow tables of the switch (see table 2.2).
h       hard timeout          The hard timeout value determines how long a flow entry should be
                              kept in the flow tables of the switch (see table 2.2).

Table 3.1: This table shows all the parameters used in this research, along with a description of
each parameter.
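For reference, the following dictionary collects the values that are fixed in the experiments of chapter 5 or used as defaults in this chapter; the dictionary itself is only illustrative, not the configuration format of the implementation:

PARAMETERS = {
    "query_interval": 2.0,        # q: seconds between statistics queries (0.5/1.0/2.0 tested)
    "algorithm_interval": 30,     # a: seconds between scheduled algorithm runs
    "new_flow_timeout": 6.0,      # t: 3 * q seconds after installing a new flow entry
    "look_ahead": 0,              # n: hops of non-local information (0 = local only)
    "num_samples": 2,             # m: stored byte count samples (2/5/10 tested)
    "path_gain_threshold": 0.10,  # p: 10% gain needed to switch paths
    "filter": 10.0,               # f: Mb/s; smaller flows cannot trigger the algorithm
    "bw_sensitivity": 0.10,       # s: relative drop that marks a flow as problematic
    "idle_timeout": 30,           # i: seconds
    "hard_timeout": 600,          # h: seconds
}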
CHAPTER 4
Implementation
Now that the approach has been explained in the previous chapter, some of the implementation
details will be discussed in this chapter. As explained, each OpenFlow switch in the network is
controlled by its own OpenFlow controller. This controller is the open source POX controller
[11]. An extension module for POX was written in Python, implementing all the components
mentioned in the previous chapter. Although POX offers support to use PyPy (which offers
significant performance improvements over the standard Python interpreter [13]), this was not
used in our implementation, since it was not compatible with various Python modules that were
used. Since the implementations of the algorithm and the bandwidth estimation handlers are
fairly straightforward (they are discussed in detail in chapter 3), the more important components
are discussed here: the communication between controllers, and the handling of topology.
4.1 Controller to controller communication
As mentioned in the previous chapter, a controller can request statistics from another controller
to enhance its routing decision. More precisely, it requests how much free bandwidth there is
available at a certain port, representing a link. This communication between controllers is done
via simple JSON messages, using a request/response model. The controller-to-controller handler
in figure 3.1 handles this communication between controllers. An example and the format of
these communication messages is shown in listing 1.
{
    "type":   "request",
    "port":   1,
    "origin": "c0"
}

{
    "type":      "response",
    "bandwidth": 123,
    "origin":    "c1"
}

Listing 1: JSON example messages showing the request and response format of the communication
between two controllers c0 and c1. Here, c0 requests from c1 how much free bandwidth it has
available at port 1. Controller c1 then responds with the requested information, telling c0 that
it has 123 Mb/s available at that port.
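A minimal sketch of such an exchange is shown below; the socket handling is simplified and the transport details of the actual implementation are not specified here, only the JSON format of listing 1 is taken from it:

import json
import socket

def request_free_bandwidth(peer_ip, info_port, port_no, origin="c0"):
    # Ask the controller listening on (peer_ip, info_port) how much free
    # bandwidth it has at switch port 'port_no'; returns the value in Mb/s.
    request = {"type": "request", "port": port_no, "origin": origin}
    conn = socket.create_connection((peer_ip, info_port))
    try:
        conn.sendall(json.dumps(request).encode())
        response = json.loads(conn.recv(4096).decode())
    finally:
        conn.close()
    return response["bandwidth"]

# Example: c0 asks c1 (info_port 6636 in listing 2) about port 1.
# print(request_free_bandwidth("127.0.0.1", 6636, 1))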
4.2 Topology handling
In order to simulate the network, Mininet 2.2.1 [14] was used to create topologies with hosts,
switches, and controllers. Links were added between hosts and switches, with a certain latency
(currently 2 milliseconds) and various bandwidth capacities. Currently, no (random) loss takes
place, unless the buffers of a switch are full. For simulating the switches, Open vSwitch 2.0.2 [12]
was used, along with its kernel module to enhance performance. When creating these
topologies, a JSON file was written containing the details of the network topology required by
the controller (as discussed in section 3.3.1), which is handled by the topology manager (see figure
3.1). An example scenario (figure 4.1) and its translation to this JSON config-file are shown in
listings 2 and 3.
When the controllers are initializing, each one reads this config-file and uses it to create
a MultiDiGraph: a directed graph where nodes can have multiple edges between them. This
MultiDiGraph is consulted to find all the paths between two nodes, and to determine various
other details, like the bandwidth capacity between nodes. In our case, a node is either a switch
or a host. This graph functionality is provided by the NetworkX package.
Figure 4.1: Example of a scenario that can be created with Mininet. The translation to a JSON
topology file is shown in listings 2 and 3.
{
    "cmap": {
        "s0": "c0",
        "s1": "c1"
    },
    "controllers": {
        "c0": {
            "info_port": 6634,
            "ip": "127.0.0.1",
            "of_port": 6633
        },
        "c1": {
            "info_port": 6636,
            "ip": "127.0.0.1",
            "of_port": 6635
        }
    },
    "hosts": {
        "h0": {
            "ip": "10.0.0.1",
            "mac": "aa:bb:cc:dd:ee:ff"
        },
        "h1": {
            "ip": "10.0.0.2",
            "mac": "ff:ee:dd:cc:bb:aa"
        }
    },
    "switches": {
        "s0": {
            "mac": "00:00:00:00:00:11"
        },
        "s1": {
            "mac": "00:00:00:00:00:12"
        }
    },

Listing 2: JSON example file showing all the details of the network topology (part 1). The 'cmap'
key holds a mapping between switches and controllers, so each controller knows which controller
handles which switch. The 'controllers' key is a list of all controllers with their details. The
'info_port' is the port each controller uses to request statistics from other controllers, whereas
the 'of_port' is used for the OpenFlow connection to a switch. The 'hosts' and 'switches' keys
provide information about hosts and switches respectively, showing their MAC addresses (and IP
addresses for hosts).
4.3 Implementation constraints
Now that the approach and the important parts of the implementation have been explained,
it can be observed that there are various (implementation) constraints. Currently, there is
no support for flooding packets. Also, dynamic topologies, where hosts can freely move and
connect via different switches, are not handled. The detection of broken links between switches
is not implemented either, nor is the dynamic detection of hosts, switches, and controllers.
However, these issues can all be grouped under topology issues. It is possible to implement
them, but this was not done as it is not part of this research. There are, however, a few more
important constraints, explained in the next paragraphs.
4.3.1 Manageability
In our research, there is currently 1 controller for each switch. For very large networks, managing
this may not be very practical. It is quite possible to have 1 controller manage multiple switches,
"topology": {
"hosts": {
"h0": {
"bandwidth": 1000.0,
"switch": "s0",
},
"h1": {
"bandwidth": 1000.0,
"switch": "s1",
}
},
"switches": {
"s0": {
"switches":
[ "s1",
"s1"
]
"ports":
[ 2,
3
],
"bandwidths":
[ 900.0,
100.0
],
},
"s1": {
"switches":
[ "s0",
"s0"
]
"ports":
[ 4,
5
],
"bandwidths":
[ 900.0,
100.0
],
}
}
}
"switch_port": 1
"switch_port": 1
}
Listing 3: JSON example file showing all the details of the network topology (part 2). The
’topology’ key is used to describe the network topology, by using a list of hosts and switches.
Each host in that list (’hosts’ key) shows to which switch it is connected, what its (currently
symmetric) bandwidth capacity to that switch is, and behind which port on the switch it is
connected. For the ’switches’ key, a slightly different format is used. Each switch has a list of
switches it is connected to, as well as behind which port and at which capacity. The three lists
for each switch should be read column-wise. For instance, for switch s0, it is connected to switch
s1 via port 2, with a bandwidth capacity of 900 Mb/s.
without losing the entire distributed approach. One controller could control a group of switches,
and make decisions for each switch based on all the information that it knows. Another option
is to follow the distributed approach more closely, and to use, for each switch, only the
information that would be known if the controller controlled only that switch.
4.3.2 Flow issues
More important are the issues related to flows. With the centralized approach, if the central
controller receives a packet-in from a switch for a new flow, it chooses a new path for this
flow. It then installs flow entries for this new flow along the chosen path. However, using the
presented distributed approach, if a controller receives a packet-in from its switch, it queries
controllers along all the paths for their free bandwidth information. After the path with
the most free bandwidth has been chosen, the controller installs a flow entry for this new flow
on its switch. The new flow then gets forwarded from this switch to the next switch along the
chosen path. Now, this next switch sends a packet-in to its controller, and the process repeats
itself. For flows that need to be routed over a long path (because, for instance, no shorter path is
available), this introduces a lot of latency. This latency issue becomes bigger if every controller
queries a lot of other controllers along all the paths. However, by using only local information
when choosing a new path for a new flow, this issue can be reduced to the absolute minimum.
The tradeoff is then that the newly chosen path may not be a very good path.
Another thing that can be improved is the fact that, right now, each flow has its own flow
entry. Once again, for very large networks, this does not scale. These networks can (potentially)
have more network flows flowing through them than can be stored in the flow tables of their
switches. Therefore, flow entries need to be aggregated into entries that match a group of flows,
to reduce the number of entries. The level of aggregation then determines how fine-grained the
control of the network is. In the current implementation, these aggregate flow entries are not
used, but it is possible to implement this. Flow entries that match multiple flows would then be
treated as one flow, and follow the usual steps.
CHAPTER 5
Experimental results
The results will be presented in this chapter. First, the setup that was used for achieving these
results will be explained, followed by the actual results. The first test is a functionality test to
see if the solution is working correctly, and to test various parameter values to see the effects on
the bandwidth estimation accuracy and load-balancing. The second test is a stability test to see
if the solution is stable. Both tests are done for two scenarios.
5.1 Test-bed configuration
To measure the throughput between two hosts, Iperf 2.0.5 was used to generate and measure
both TCP and UDP traffic. One host runs an Iperf client, and the other
runs an Iperf server. The throughput measured by the Iperf server is also compared to the
throughputs estimated by the controllers. This way, the direct impact of the solution on the
throughput between the hosts can be measured.
Also, some parameters are set to a fixed value, and are not experimented with. The path gain
threshold is currently set to 10%. The tests that were done are not suited for experimenting
with this parameter, because the tests and scenarios are fairly simple in nature, and this is a
parameter suited for more complex scenarios and tests. Also, the filtering of flows is currently
set to flows that have an estimated throughput of 10 Mb/s or less. These flows are unable to
trigger the algorithm, unless they increase in throughput.
5.1.1 First scenario
The first scenario that was tested is a very simple scenario. It consists of two hosts h0 and
h1, two switches s0 and s1, and two controllers c0 and c1, as can be seen below. There are 2
paths between the hosts, a ’slow’ path, and a ’fast’ path (compared to each other). This simple
scenario will be used to see if the presented solution is working correctly, as well as to see the
effect of the parameter values on the performance of the system. See figure 5.1.
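For reference, a minimal Mininet sketch of this scenario is shown below; the values follow the example topology of listings 2 and 3 (900 and 100 Mb/s switch-to-switch links, 1000 Mb/s host links, 2 ms latency), and the script is illustrative rather than the actual test-bed code:

from mininet.net import Mininet
from mininet.node import RemoteController, OVSSwitch
from mininet.link import TCLink

net = Mininet(switch=OVSSwitch, link=TCLink, controller=None)
h0 = net.addHost('h0', ip='10.0.0.1')
h1 = net.addHost('h1', ip='10.0.0.2')
s0 = net.addSwitch('s0')
s1 = net.addSwitch('s1')
# One remote (POX) controller per switch, as in listing 2.
c0 = net.addController('c0', controller=RemoteController, ip='127.0.0.1', port=6633)
c1 = net.addController('c1', controller=RemoteController, ip='127.0.0.1', port=6635)

net.addLink(h0, s0, bw=1000, delay='2ms')
net.addLink(s0, s1, bw=900, delay='2ms')   # 'fast' path
net.addLink(s0, s1, bw=100, delay='2ms')   # 'slow' path
net.addLink(s1, h1, bw=1000, delay='2ms')

net.build()
s0.start([c0])   # attach each switch to its own controller
s1.start([c1])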
Figure 5.1: Scenario 1: A simple scenario connecting 2 hosts via 2 switches, in order to validate
the correct workings of the system and the influence of the various parameters.
5.1.2 Second scenario
The second scenario is a bit more advanced, with 3 switches (and 3 controllers) in a triangle
form, and 3 hosts, each connected to a different switch. Once again, there are two paths between
each pair of distinct hosts: one 'slow' path and one 'fast' path. This scenario will
be used for a slightly more complex test, namely to show how stable the solution is, and to see
if, and how much, the oscillation of paths takes place. See figure 5.2.
Figure 5.2: Scenario 2: A scenario connecting 3 hosts via 3 switches, in order to test the workings
and stability of the system, as well as the influence of the various parameters.
5.2 Performance measurements
Now that the test scenarios are known, performance measurements can be done. The following
tests are explained, after which the results for each scenario are shown. For each test, a number
of different parameter values were used to test the performance of the implemented solution
and the influence of these parameters. The bandwidth, as reported by Iperf, will be compared
to the estimated bandwidth by a controller. In our tests, UDP flows will be used as background
traffic (to simulate a network that is being used), and therefore only the results for the TCP flows
will be presented. The results for the UDP flows are also omitted because they rarely change
in bandwidth significantly. The biggest change in bandwidth for these UDP flows was found to
be at most about 1 Mb/s in either direction. Overall, these UDP flows are very stable, since
UDP, unlike TCP, does not adapt its transmission rate.
5.2.1 Functionality Test
The first test consists of installing static flow entries on each switch (so that packets can be
routed between host h0 and host h1 without the controller and algorithm doing it). These static
entries use the slow path of scenario 1 and 2, with a capacity of 100 Mb/s. After these entries
are installed, a new UDP flow is started at t = 0 from h0 to h1, with a fixed bit-rate of 80 Mb/s,
thereby almost saturating the slow link. Five seconds later (at t = 5), a TCP flow is started at
h0 directed to h1. The UDP flow runs for 90 seconds and then stops; the TCP flow runs for 120
seconds and then stops. In figure 5.3, a time-line of this test can be seen.
Figure 5.3: A time-line representing the functionality test with the start and end of the UDP
and TCP flow.
If the implemented solution is working correctly (and good parameters have been chosen),
we hope to observe that the controller detects the sub-optimal solution, runs the algorithm, and
installs new flow entries on the switches. Ideally, it installs new flow entries such that the optimal
solution would be reached. In this case, the optimal solution would be to route the TCP flow
over the fast path (900 Mb/s), and the UDP flow over the slow path (100 Mb/s). We will now
show the results of this test for both scenarios with various parameters. For this test, only local
information was used to make decisions, as neither scenario benefits from requesting free
bandwidth information from another controller in this test.
First scenario
The first parameter that is tested is the query interval parameter. As can be seen in figure 5.4,
the performance of the algorithm is quite dependent on a proper estimation of the bandwidth
usage of flows. A query interval of 0.5 seconds yields a very spiky estimation graph, and leads the
controller to believe that the flow has sufficient bandwidth. Using an interval of 1 second yields
much better results. An interval of 2 seconds yields an even smoother estimation graph, while
still showing comparable results. It was expected that the sub-optimal situation with regards
to the TCP flow would be detected. However, this does not seem to be the case. The situation
does get corrected after the algorithm is run again (algorithm interval), which is after about 30
seconds.
Figure 5.4: Results for a query interval of 0.5, 1.0, and 2.0 seconds respectively. For each result,
2 byte count samples were stored, and a bandwidth sensitivity of 10% was used.
A query interval of 2.0 seconds consistently leads to better results than an interval of 0.5 seconds, since the estimation is more accurate. The following results were obtained with 2, 5, and 10 measurement samples, using a query interval of 2.0 seconds and a bandwidth sensitivity of 10%.
Figure 5.5: Results for 2, 5, and 10 byte count samples used respectively. For each result, a
query interval of 2.0 seconds was used, as well as a bandwidth sensitivity of 10%.
From figure 5.5, it can be seen that both 2 and 5 samples lead to good results. While the results with 10 stored samples are very smooth, the estimated bandwidth lags too far behind the real bandwidth (it is too slow to update). Therefore, in the next test, 2 byte count samples are stored and the influence of the bandwidth sensitivity is examined, with values of 10%, 1%, and 0.1%.
Figure 5.6: Results of the bandwidth sensitivity parameter with values of 10%, 1%, and 0.1%
respectively. For each result, a query interval of 2.0 seconds was used, and 2 byte count samples
were stored.
As can be seen in figure 5.6, even a bandwidth sensitivity value of 0.1% is not sufficient to detect the sub-optimal solution within the first 30 seconds; some other form of detection is apparently needed to solve this issue. Overall, with good parameters, the optimal solution is reached after 40-60 seconds: the system of network flows converges to that optimum as soon as the algorithm is run, or shortly afterwards (within 10 seconds).
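To make the role of these three parameters concrete, the following is a minimal sketch of a byte-count based bandwidth estimator with a sensitivity check. It assumes flow statistics are polled every query interval; the class and variable names are illustrative and do not correspond to the actual controller code.

from collections import deque

# The caller is assumed to request flow statistics every QUERY_INTERVAL seconds.
QUERY_INTERVAL = 2.0   # q: seconds between flow-statistics requests
NUM_SAMPLES = 2        # m: byte-count samples stored per flow
SENSITIVITY = 0.10     # s: relative change that counts as significant

class FlowBandwidthEstimator:
    def __init__(self):
        self.samples = deque(maxlen=NUM_SAMPLES)  # recent (time, byte_count) pairs
        self.last_estimate = 0.0                  # bits per second

    def add_sample(self, timestamp, byte_count):
        """Store a byte-count sample and return (estimate, changed_significantly)."""
        self.samples.append((timestamp, byte_count))
        if len(self.samples) < 2:
            return self.last_estimate, False

        # Estimate the bandwidth over the window spanned by the stored samples.
        t_old, b_old = self.samples[0]
        t_new, b_new = self.samples[-1]
        estimate = (b_new - b_old) * 8.0 / (t_new - t_old)

        # Only report a significant change if it exceeds the sensitivity
        # threshold, so that small fluctuations do not trigger the algorithm.
        changed = abs(estimate - self.last_estimate) > SENSITIVITY * max(self.last_estimate, 1.0)
        self.last_estimate = estimate
        return estimate, changed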
Second scenario
The same test was repeated for the second scenario. As can be seen in figure 5.7, very similar results with regards to bandwidth estimation and TCP throughput are achieved with the same parameters. Once again, a query interval of 2.0 seconds gives the smoothest estimation while remaining comparable to an interval of 1.0 second, whereas an interval of 0.5 seconds is too spiky and therefore leads to very poor results.
Figure 5.7: Results for a query interval of 0.5, 1.0, and 2.0 seconds respectively. For each result,
2 byte count samples were stored, and a bandwidth sensitivity of 10% was used.
A query interval of 2.0 seconds again consistently leads to better results than an interval of 0.5 seconds. The following results were obtained with 2, 5, and 10 measurement samples, using a query interval of 2.0 seconds and a bandwidth sensitivity of 10%.
Figure 5.8: Results for 2, 5, and 10 byte count samples respectively. For each result, a query
interval of 2.0 seconds was used, as well as a bandwidth sensitivity of 10%.
As in the first scenario, figure 5.8 shows that both 2 and 5 samples lead to good results, while the estimation with 10 stored samples lags too far behind the real bandwidth. Therefore, in the next test, 2 byte count samples are stored and the bandwidth sensitivity is varied over 10%, 1%, and 0.1%.
Figure 5.9: Results of the bandwidth sensitivity parameter with values of 10%, 1%, and 0.1%
respectively. For each result, a query interval of 2.0 seconds was used, and 2 byte count samples
were stored.
As can be seen in figure 5.9, and as in the first scenario, even a bandwidth sensitivity value of 0.1% is not sufficient to detect the sub-optimal solution within the first 30 seconds; some other form of detection is needed. Overall, with good parameters, the optimal solution is reached after 40-60 seconds, with the system of network flows converging to that optimum as soon as the algorithm is run, or shortly afterwards (within 10 seconds).
5.2.2 Stability Test
The second test is a stability test, meant to check whether flows oscillate between paths. This time, no static entries are pre-installed, so new flows should be assigned to the path with the most free bandwidth. A TCP flow from h0 to h1 is started and runs for 10 minutes. After 5 seconds, a UDP flow with a fixed bit-rate of 80 Mb/s is started, which runs for 30 seconds and then stops. After another 30 seconds, this cycle of 30 seconds of UDP traffic followed by 30 seconds of silence is repeated for as long as the TCP flow runs. If the implemented solution works properly, the TCP flow should not be influenced (or only very little) by the UDP flow. A time-line of this test is shown in figure 5.10.
Figure 5.10: A time-line representing the stability test with the start and end of the UDP and
TCP flow.
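The cyclic traffic pattern of this test can again be driven with Iperf. The sketch below makes the same assumptions as the one for the functionality test (a running Mininet network net with hosts h0 and h1); names and ports are illustrative.

# Minimal sketch of the stability-test traffic: a 10-minute TCP flow, plus an
# 80 Mb/s UDP flow that is on for 30 seconds and off for 30 seconds.
import time

def run_stability_test(net, total_time=600):
    h0, h1 = net.get('h0'), net.get('h1')
    h1.cmd('iperf -s -p 5002 &')      # TCP server
    h1.cmd('iperf -s -u -p 5001 &')   # UDP server

    # Long-running TCP flow from h0 to h1, logged to a file.
    h0.cmd('iperf -c %s -t %d -i 1 -p 5002 > tcp_result.txt &' % (h1.IP(), total_time))

    time.sleep(5)
    elapsed = 5
    while elapsed + 60 <= total_time:
        # 30 seconds of 80 Mb/s UDP background traffic (blocks until done) ...
        h0.cmd('iperf -c %s -u -b 80M -t 30 -p 5001' % h1.IP())
        # ... followed by 30 seconds without UDP traffic.
        time.sleep(30)
        elapsed += 60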
Scenario 1
As can be seen in figure 5.11, in the first 5 seconds after the TCP flow is started, it quickly reaches the maximum throughput of 900 Mb/s, after which it has to share the bandwidth with the UDP flow (hence the roughly 800 Mb/s for about 30 seconds). This happens because, at the moment controller c0 receives the new UDP flow, the 900 Mb/s path is estimated to have more free bandwidth than the slower path, so the UDP flow is assigned to the fast link as well. After the UDP flow stops, the TCP flow returns to the maximum throughput and is no longer influenced by the other flow, which is exactly as desired (apart from a few peaks in the estimation).
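The path assignment described here boils down to picking the path with the largest estimated free bandwidth. A minimal sketch of that choice, with purely illustrative names and numbers, is:

def select_path(free_bandwidth):
    """Return the path with the highest estimated free bandwidth (in Mb/s)."""
    return max(free_bandwidth, key=free_bandwidth.get)

# Illustrative numbers: when the UDP flow arrives, the fast path is still
# estimated to have more free bandwidth than the slow path, so the UDP flow
# is assigned to the fast path as well.
print(select_path({'fast_path': 300, 'slow_path': 100}))   # -> 'fast_path'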
Figure 5.11: TCP throughput as reported by Iperf and as estimated by controller c0. Two
byte count samples, a measurement interval of 2.0 seconds, and a bandwidth sensitivity of 10%
were used.
Scenario 2
For the second scenario, almost the same result can be observed in figure 5.12. The TCP flow has to share the bandwidth with the UDP flow for a short while, but is not disturbed by it afterwards. The TCP flow is only briefly influenced, since the algorithm runs again and assigns the UDP flow to the slower path, thereby reaching the optimal solution very quickly.
Figure 5.12: TCP throughput as reported by Iperf and as estimated by controller c0. Two
byte count samples, a measurement interval of 2.0 seconds, and a bandwidth sensitivity of 10%
were used.
CHAPTER 6
Discussion
We started our research with the purpose of bridging OpenFlow with a distributed approach, to combine the best of both worlds. The notion of one central OpenFlow controller was changed to accommodate multiple, independent controllers, each controlling its own switch or a small group of switches. Communication between controllers can be used to exchange information about the network status and thereby improve routing decisions with regards to load-balancing the network. Making routing decisions based on a simple estimation of the bandwidth usage of flows and of the amount of free bandwidth on paths proves to be quite promising. The implemented solution performs well (provided good parameters are chosen), achieving high network throughput while remaining stable, as can be seen from the results of the stability test for both scenarios.
However, the method of detecting sub-optimal flows is not sufficient and needs to be addressed. This concerns the first 30 seconds of all the functionality test results, for both scenarios. Fortunately, once the network reaches its maximum throughput, the detection mechanisms are sufficient: they schedule the algorithm quite often, resulting in a so-called 'monitor mode'. Furthermore, the fairly simple bandwidth estimation method performs reasonably well when good parameters are chosen, apart from a few seemingly random peaks in the estimates. A set of good parameters is as follows:
• A query interval of 2.0 seconds.
• Two byte count samples to store and use for bandwidth estimations.
• A bandwidth sensitivity value of 10%.
Also, the system is stable, as can be seen from the stability test. Finally, the performance of the system when using local and/or non-local information remains untested. A special grid-shaped scenario was created specifically for this purpose and is discussed in more detail in chapter 8. It would also be used to test the more complex parameters, such as the path gain threshold, which is currently untested as well.
CHAPTER 7
Conclusion
In this thesis, a distributed and scalable SDN approach was presented and tested. Of the three research questions, the first two can be answered, and the third is left for future work. To summarize, the research questions are repeated below.
• What is the performance of the system?
• How stable is the system?
• How does communication between controllers influence the network throughput?
From our results, it can be observed that using 2 byte count samples and querying the switches every 2 seconds provides the most accurate and stable bandwidth estimation of flows, leading to the best performance. The tests show that the more accurate the bandwidth estimation, the better the performance of the system. This can be seen as a drawback of our approach, since good parameter values are required for good performance. It was hoped that the bandwidth sensitivity parameter would be sufficient to detect the sub-optimal situation; unfortunately, this is not the case, as can be seen in the first 30 seconds of all the functionality test results for both scenarios. For those situations, a different approach must be used. It should be noted that this situation was artificially created (by pre-installing flow entries) and is therefore less likely to occur in practice (though still possible), since new flows are always assigned to the path with the most free bandwidth.
Furthermore, the stability test demonstrated the stability of the algorithm. Once the UDP flow finishes its first run (or the algorithm runs again), it no longer influences the TCP flow at all, because each new UDP flow is assigned to the path with the most free bandwidth, which is then the slow path. This is the desired result, as the path gain threshold prevents the constant switching of flows between multiple paths. Finally, only local information was used in all tests, as the two tests and scenarios do not benefit from the added communication. The communication between controllers works correctly, and a quick check yields similar results for the second stability test. A more thorough evaluation of this communication with another scenario is reserved for future work.
CHAPTER 8
Future work
Our unified approach shows promising results, but testing with larger networks and data flows is needed to ensure that it performs properly. However, the detection of the sub-optimal situation in the first 30 seconds of the functionality test results does not work as expected, so a different approach should be used. It should be noted that as soon as the algorithm runs again (triggered by the periodic run every 30 seconds or otherwise), the problem is resolved; the algorithm itself works, only the detection mechanism needs to be addressed. One option would be a detection mechanism that runs the algorithm when a link approaches its maximum capacity, but only if there are flows on that link that could use other paths with enough available bandwidth. A sketch of such a check is given below. All this is left for future research to improve the presented solution.
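The following is a minimal sketch of this proposed check only; it is not part of the current implementation, and the threshold value and all names are illustrative.

CAPACITY_THRESHOLD = 0.95   # treat a link as congested at 95% utilisation (illustrative)

def should_run_algorithm(link_used, link_capacity, flows_on_link,
                         alternative_paths, free_bandwidth):
    """flows_on_link: {flow_id: estimated_bandwidth};
    alternative_paths: {flow_id: [path_id, ...]};
    free_bandwidth: {path_id: estimated_free_bandwidth}."""
    if link_used < CAPACITY_THRESHOLD * link_capacity:
        return False   # the link is not congested, nothing to do

    # Trigger only if at least one flow on this link could be moved to an
    # alternative path with enough free bandwidth for it.
    for flow_id, flow_bw in flows_on_link.items():
        for path in alternative_paths.get(flow_id, []):
            if free_bandwidth.get(path, 0) >= flow_bw:
                return True
    return False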
Another area of improvement is the communication between controllers. Currently, only a simple request/response model is used to exchange free-bandwidth information. If non-local information is used, it would be interesting to study how the number of hops from which information is gathered affects network performance. Even more interesting would be a setup in which only local information is used, but controllers along a path can signal back to a controller that it should take another path for a certain flow (for example, because the current path is congested). This would reduce the total amount of communication between controllers, since a controller currently queries a number of other controllers for all the paths of a flow upon receiving a packet-in, and even queries all paths of all flows when running the algorithm.
Also, the implementation is currently experimental. A number of features that other controllers provide are not implemented; they are listed below.
• The ability to flood packets over all or a set of ports
• The ability to dynamically detect the topology of the network
• The detection and handling of broken links (and also non-functioning controllers)
These features are currently not implemented as they fall outside the scope of this project.
However, a few implementation details can be improved. When controllers along a path are queried for their bandwidth information, the responses are not saved or cached. This means that a controller can be queried multiple times for the same information when multiple paths are considered (for all flows), leading to unnecessary communication. An interesting experiment would be to determine how long this information can be cached without degrading network performance; a minimal caching sketch is shown below. More generally, it would be interesting to explore how the approach presented in this thesis can be improved further.
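As an illustration only (this is not part of the implementation), a simple time-limited cache for free-bandwidth responses could look as follows, where the cache lifetime is the parameter to be determined experimentally:

import time

class BandwidthCache:
    def __init__(self, ttl=2.0):
        self.ttl = ttl      # how long a cached response stays valid, in seconds
        self.entries = {}   # controller_id -> (timestamp, free_bandwidth_info)

    def get(self, controller_id):
        entry = self.entries.get(controller_id)
        if entry is not None and time.time() - entry[0] < self.ttl:
            return entry[1]   # still fresh: no new request needed
        return None           # unknown or expired: query the controller again

    def put(self, controller_id, info):
        self.entries[controller_id] = (time.time(), info)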