Contributing to OpenFlow 1.1 Support in Open vSwitch

A report
submitted in partial fulfillment
of the requirements for the degree
of
Bachelor of Computing and Mathematical Sciences
at
The University of Waikato
by
Joe Stringer
© 2012 Joe Stringer
Abstract
Open vSwitch is a production-quality software switch. It provides high-performance
packet-forwarding in virtualised environments. One of the many protocols it
supports is OpenFlow, which allows dynamic configuration of network behaviour. Newer versions of OpenFlow have recently been released, adding
support for important protocols such as MPLS and IPv6. However, the support
for OpenFlow in Open vSwitch is not in sync with the current standards.
At the end of 2011, a call for assistance was posted on the Open vSwitch
mailing-list, requesting that the OpenFlow community contribute towards
support for newer versions of the protocol. This report describes the process
of responding to this call, contributing to an open source project, and working
towards providing network researchers with a more flexible platform for their
experiments.
Acknowledgements
I would like to take the time to thank the following people for their guidance
and support during this project: Richard Nelson, for his supervision of this
project; Josh Bailey and the folks at Google, for introducing me to software-defined networking; Ben Pfaff, for his continued patience with my enquiries and
patch submissions; Justin Pettit, Jesse Gross, Isaku Yamahata, and the rest of
the Open vSwitch community, for their help and assistance with reviewing and
discussing implementation details; Brendan Jones and Shane Alcock, for their
input into this report; and all of the great bunch at WAND, for providing
many memorable times.
Contents
1 Introduction
  1.1 Motivation
  1.2 Goals
  1.3 Structure of this Document
2 Background
  2.1 Software-Defined Networks
  2.2 OpenFlow
    2.2.1 Prior OpenFlow Implementations
  2.3 Open vSwitch
    2.3.1 Development
  2.4 OpenFlow Testing Framework
3 Development Process
  3.1 Learning
    3.1.1 Identifying work
    3.1.2 Learning the Architecture
  3.2 Communication
    3.2.1 Distributed development
    3.2.2 Ongoing Development
  3.3 Implementation
    3.3.1 Code Quality Tools
  3.4 Submission
    3.4.1 Preparing to submit
    3.4.2 Licensing
    3.4.3 Code Review
4 Existing Architecture
  4.1 OpenFlow Architecture
    4.1.1 Flow Modification
    4.1.2 Instructions
    4.1.3 Experimenter Extensions
    4.1.4 Extensible Matches
  4.2 Open vSwitch Architecture
    4.2.1 OpenFlow Provider
    4.2.2 Testing
  4.3 Summary
5 Implementation
  5.1 Arbitrary Ethernet Address Masking
  5.2 Metadata
    5.2.1 Matching on Metadata
    5.2.2 Writing Metadata
  5.3 SCTP Support
    5.3.1 End-to-End Testing
  5.4 Evaluation
    5.4.1 Code Review
    5.4.2 Testing features
6 Conclusion
  6.1 Impact
  6.2 Future Work
References
List of Abbreviations
API     Application Programming Interface
CPqD    Centro de Pesquisa e Desenvolvimento
FIB     Forwarding Information Base
HBO     Host Byte Order
IETF    Internet Engineering Task Force
IRC     Internet Relay Chat
KVM     Kernel-based Virtual Machine
lksctp  Linux Kernel SCTP
LSR     Label-Switched Router
LXC     Linux Containers
NBO     Network Byte Order
NXM     Nicira Extensible Match
OXM     OpenFlow Extensible Match
RFC     Request For Comments
SCTP    Stream Control Transmission Protocol
SDN     Software-Defined Networks
VM      Virtual Machine
List of Figures
2.1 OFTest: Black-box testing
4.1 OpenFlow 1.0 Pipeline
4.2 OpenFlow 1.1+ Pipeline
4.3 Open vSwitch Components
4.4 OpenFlow Provider
5.1 Ethernet address masking
5.2 SCTP test setup
List of Tables
5.1 Development breakdown
Chapter 1
Introduction
Computer networks have become critically important to our lives—the internet
pervades the lives of billions worldwide. For network researchers, this provides
a mixed blessing: On one hand, innovation and experimentation on networks
is more relevant than ever; on the other hand, the scale of installed network
hardware sets limitations on the impact that researchers can make in the field.
Traditionally, network hardware has been implemented as monolithic boxes
with limited configurability. The “Clean Slate” group (Stanford University,
2012) have spent the past five years working towards providing a more open,
research-friendly platform for managing network behaviour—OpenFlow (McKeown et al., 2008). This movement has attracted global interest, most notably
with Google announcing its use of OpenFlow to increase network utilisation
and reduce operating costs (Hölzle, 2012).
Open vSwitch (Pfaff et al., 2009) sits at the meeting point of this open networking movement and the recent trend towards increased computer virtualisation.
As a software switch that has been designed from the ground up for performance in data center environments, it too has garnered much interest; during
the course of this project, the company behind Open vSwitch, Nicira, was
acquired by VMware for US$1.26 billion (Herrod, 2012).
The availability of these technologies and the open background for them provides a compelling field for research and development. Applications which have
previously only been possible through the use of proprietary network hardware
are becoming increasingly accessible. One such application is that of a Label-Switched Router (LSR), a particular type of network switch that is used in
the core of the internet to simplify traffic management.
1.1 Motivation
The motivation for this project arose from the proof-of-concept OpenFlow-based LSR platform developed by Kempf et al. (2011). The use of custom
hardware and unstandardised extensions to OpenFlow in that work presents
a barrier for research. The custom extensions used by Kempf et al. have since
been included in the official OpenFlow 1.1 standard (OpenFlow.org, 2011),
and more recently, Google has presented ‘Project W’—a project to make the
open source LSR work more accessible.
One of the required components for Project W is Open vSwitch. Late in
2011, a call for assistance was posted on the openvswitch-dev mailing-list,
seeking support from the community to write code to support OpenFlow 1.1
and above (Pfaff, 2011). Contributing to OpenFlow 1.1 Support in Open
vSwitch is a response to this call for assistance, as a step towards providing a
more accessible open source LSR. The contributions of this project will also
act to benefit the networking research community by improving the foremost
implementation of OpenFlow.
1.2 Goals
Prior work in this area has required specific hardware and software configurations, which limits the accessibility of the work. To provide lasting value for
the research community, this project should focus on contributing work into
the upstream codebase rather than providing another proof of concept.
Three OpenFlow 1.1 features were selected for this project to focus on:
• Arbitrary ethernet address masking
• Attaching metadata to a flow
• Adding support for Stream Control Transmission Protocol (SCTP)
For each of these features, it should be verified that the implementation is
accurate to the OpenFlow specification. At the minimum, unit tests should be
developed to prove that the resulting switch behaviour is as expected. Any additional testing—for example, using the OpenFlow testing framework (Ericsson
Research, 2011a)—provides additional assurance of the protocol compliance.
1.3 Structure of this Document
• Chapter 2 introduces the concept of Software-Defined Networks (SDN),
and provides background on OpenFlow and Open vSwitch.
• Chapter 3 describes the development cycle of committing to an open
source project, with particular reference to Open vSwitch.
• Chapter 4 briefly explains aspects of OpenFlow which are relevant to
this project and describes the architecture of Open vSwitch.
• Chapter 5 discusses the implementation of the OpenFlow 1.1 features
specifically targeted in this project and evaluates the success of this
project.
• Chapter 6 discusses the contributions of this project and possible future
additions.
Chapter 2
Background
This chapter explores the work that this project is based upon: what SDN
is and how OpenFlow relates to it. Previous implementations of OpenFlow are
explored, which provides some context for the work on Open vSwitch. Finally,
this chapter discusses the plans for developing and testing the functionality
proposed in this report.
2.1 Software-Defined Networks
The core concept of SDN is quite simple: to be able to configure network
behaviour in software. As network hardware stands today, each switch or
router (hereafter forwarding element) holds its own configuration and view of
global network state. Distributed protocols exist to share this state between
forwarding elements, but these operate separately on each forwarding element.
It is common for forwarding elements to hold state which is inconsistent
with that of their neighbours. Under the SDN paradigm, this state can be
held separately from the device. A common case is for a single entity to
hold a consistent view of the network and distribute this state to individual
forwarding elements. SDN makes a clear abstraction between the layer that
makes decisions about how packets should be forwarded (the controller) and
the layer that forwards packets (the datapath).
The controller keeps track of connectivity information. It builds a map of the
network by gathering information about neighbouring devices. Based on this
map, it constructs a set of optimal routes for forwarding traffic. The controller
then assembles flow entries—a tuple of a packet classifier (match) and how
to forward the packet (action). These flow entries are then written to the
Forwarding Information Base (FIB)—a lookup table used by the datapath.
The datapath is the layer that deals with forwarding individual packets. It
holds a lookup table for the flow entries, known as the FIB. Each time a
packet enters the datapath, it classifies the packet and performs a lookup in
the FIB to determine how to forward the packet—for instance, send the packet
out a port. This logic is kept simple to minimise the processing time for each
packet.
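The relationship between flow entries and the FIB can be illustrated with a minimal Python sketch. It is not taken from Open vSwitch or from any OpenFlow implementation: the names FlowEntry and Datapath, the dictionary-based packet representation, the first-match-wins ordering and the "miss" result are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FlowEntry:
    """A (match, action) tuple, as assembled by the controller."""
    match: Dict[str, object]  # header field -> required value; absent fields are wildcards
    action: str               # illustrative action string, e.g. "output:2"


class Datapath:
    """Hypothetical datapath holding a FIB and forwarding packets with it."""

    def __init__(self) -> None:
        self.fib: List[FlowEntry] = []

    def install(self, entry: FlowEntry) -> None:
        # The controller writes flow entries into the FIB.
        self.fib.append(entry)

    def forward(self, packet: Dict[str, object]) -> str:
        # Classify the packet against each entry in installation order
        # (assumption: the first matching entry wins).
        for entry in self.fib:
            if all(packet.get(name) == value for name, value in entry.match.items()):
                return entry.action
        # No entry matched; what happens next is outside this sketch.
        return "miss"


dp = Datapath()
dp.install(FlowEntry({"eth_dst": "00:00:00:00:00:02"}, "output:2"))
print(dp.forward({"in_port": 1, "eth_dst": "00:00:00:00:00:02"}))
```

The per-packet work is limited to a lookup, while every decision about which entries exist is made elsewhere, which is the separation between controller and datapath described above.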
2.2 OpenFlow
OpenFlow is a protocol which follows the SDN paradigm. McKeown et al.
describe OpenFlow as an API for the datapath of network hardware, which
provides basic building blocks for implementing more complicated functionality
in the controller. The canonical use of OpenFlow involves using a simple,
commodity switch as the datapath and hosting the controller on a standard
PC.
Due to the abstraction between forwarding and control, the network performance of the forwarding element is not degraded by hosting the controller
remotely (Bianco et al., 2010). Software applications can be created to generate rules to determine network behaviour, while incurring little performance
overhead. This suggests that OpenFlow could viably be used in production
environments as a replacement for traditional network management.
2.2.1 Prior OpenFlow Implementations
There have been several OpenFlow software switches developed to date. The
initial version from Stanford University was developed to provide a reference
that other implementations could follow. This switch was updated for each
new version of the specification, to understand the new structures and features; however, it did not include implementations of all optional features in
later versions of the protocol. It was the reference for implementations up
to OpenFlow 1.0, but has not been kept in sync with later releases of the
specification.
This implementation has been used as a base for several projects: Indigo,
a hardware-specific implementation (Big Switch Networks, 2011); an OpenFlow 1.1-only switch (Ericsson Research, 2011b); and OpenFlow 1.2 and 1.3-only
versions (CPqD, 2012). Each of these targets only a single version of
the OpenFlow specification and, with the exception of Indigo, has no ability
to hook in with hardware. Indigo itself is only targeted at physical switch
platforms.
The call for assistance from the Open vSwitch developers outlined plans to
support multiple versions of OpenFlow at run-time, unlike previous implementations (Pfaff, 2011). Erlang Solutions have since released an implementation which also does this; however, that project focuses on flexibility over
performance. As such, the development of further OpenFlow support in Open
vSwitch will provide value to the research community beyond that of the existing implementations.
2.3 Open vSwitch
Open vSwitch is an open source software switch (datapath) with support for
several major protocols including OpenFlow. The codebase is currently over
100,000 lines of C code, contributed by 75 developers. Pfaff et al. (2009)
introduced Open vSwitch as an alternative to Linux bridge, for providing high-performance switching between Virtual Machines.
The architecture of Open vSwitch is designed such that hardware vendors can
write modules that interface with their own physical switches. This allows
Open vSwitch to hardware-accelerate switching functionality, and provide the
device with OpenFlow support without requiring the vendor to implement
OpenFlow. As of this publication, it is known that at least five commercially
available switches already support Open vSwitch in this manner.
Due to the large community backing of Open vSwitch, it can be reasonably
expected that the codebase will continue to be maintained well into the future.
The review process for submitting code is fairly rigorous, which makes it
more difficult to get code accepted upstream; however, this provides a useful
guarantee of code quality. As this project seeks to provide lasting benefit to
the research community, the assurance regarding code quality makes Open
vSwitch a good fit for implementing an open source LSR.
2.3.1 Development
Each version of OpenFlow is incompatible with the others; both the controller
and the datapath must communicate using the same version. Previous implementations of OpenFlow 1.1 in the datapath have not been written to speak
multiple versions of OpenFlow at run-time, which limits researchers to only
using controller software that matches the same version of OpenFlow. While
this reduces the development time for new protocol support, it also creates
division in the base of available OpenFlow software.
The approach outlined for Open vSwitch on the openvswitch-dev mailing-list was instead to support multiple versions of OpenFlow at run-time. The
plan for this is to create generic structures that abstract the protocol differences
from the core code, then implement the specific protocol handlers separately.
For each new version of OpenFlow, a new parser will need to be added and
taught how to convert to and from the internal format. The benefit from this
approach is that Open vSwitch can be used in a wider range of environments
than datapaths that only support a single version.
In addition to this, the plan specifies that newer features should be backported to the older protocol, by adding them as experimenter extensions.
This approach will allow Open vSwitch to interoperate with a wide range
of software and hardware configurations. Section 4.1.3 describes the usage of
this OpenFlow feature in this project.
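The idea of a generic internal structure with one parser per protocol version can be sketched in Python as follows. This is only an illustration under stated assumptions: the message layout is a toy (a version byte followed by two port numbers) and does not reproduce the real OpenFlow header or flow modification message, and the names InternalFlowMod, decode_v10, decode_v11 and DECODERS are hypothetical. The two details borrowed from the specifications are the wire version numbers (0x01 for OpenFlow 1.0, 0x02 for OpenFlow 1.1) and the widening of port numbers from 16 to 32 bits.

```python
import struct
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class InternalFlowMod:
    """Version-independent representation used by the core code (hypothetical)."""
    in_port: int
    out_port: int


def decode_v10(payload: bytes) -> InternalFlowMod:
    # Toy OpenFlow 1.0 layout: two 16-bit ports in network byte order.
    in_port, out_port = struct.unpack("!HH", payload)
    return InternalFlowMod(in_port, out_port)


def decode_v11(payload: bytes) -> InternalFlowMod:
    # Toy OpenFlow 1.1 layout: two 32-bit ports in network byte order.
    in_port, out_port = struct.unpack("!II", payload)
    return InternalFlowMod(in_port, out_port)


# One parser per supported wire version; a new version adds one entry here.
DECODERS: Dict[int, Callable[[bytes], InternalFlowMod]] = {
    0x01: decode_v10,
    0x02: decode_v11,
}


def decode(message: bytes) -> InternalFlowMod:
    """Pick the parser from the leading version byte and convert to the internal format."""
    version = message[0]
    if version not in DECODERS:
        raise ValueError(f"unsupported OpenFlow version {version:#04x}")
    return DECODERS[version](message[1:])


# Both messages describe the same flow and decode to equal internal values.
print(decode(b"\x01" + struct.pack("!HH", 1, 2)) == decode(b"\x02" + struct.pack("!II", 1, 2)))
```

Because the core code only ever sees InternalFlowMod, it does not need to change when another version-specific parser is added.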
2.4 OpenFlow Testing Framework
Just as multiple versions of the reference OpenFlow switch have been
developed, there have also been multiple versions of a matching test framework,
known as OFTest (Big Switch Networks, 2012). This framework is written in
Python, and conducts black-box testing of a switch: It surrounds a datapath,
controlling all of the inputs and outputs. Figure 2.1 shows this relationship
between OFTest and the datapath that it tests. OFTest acts as the controller
for the datapath by sending OpenFlow messages to it, and also connects to
the ports on that datapath to send packets through it. The behaviour of the
switch can then be evaluated against the OpenFlow standard.
[Figure 2.1: OFTest: Black-box testing. The OFTest server holds an OpenFlow connection to the datapath under test and exchanges test traffic with it.]
Each test is conducted independently, using the following steps:
• Reset the rules on the test switch
• Install new flow entries into the datapath
• Send packets through the datapath ports
• Observe the behaviour
This provides an external test for protocol compliance—a level of guarantee
about interoperability with other OpenFlow implementations. It is aimed
at providing OpenFlow developers with a simple ‘plug-and-play’ environment
that is easy to set up and use. Section 5.4.2 will discuss the use of OFTest to
evaluate this project, and the issues faced in this process.
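The four-step test cycle can be sketched in Python, the language OFTest is written in. The sketch does not use the OFTest API: FakeDatapath is a hypothetical in-memory stand-in for a switch, and its methods and the port numbers are assumptions chosen only to show the reset, install, send and observe steps in order.

```python
from typing import Dict, List, Tuple


class FakeDatapath:
    """Hypothetical in-memory stand-in for the datapath under test."""

    def __init__(self) -> None:
        self.flows: Dict[int, int] = {}  # in_port -> out_port

    def reset(self) -> None:
        self.flows.clear()

    def install(self, in_port: int, out_port: int) -> None:
        self.flows[in_port] = out_port

    def send(self, in_port: int, packet: bytes) -> List[Tuple[int, bytes]]:
        # Return every (port, packet) pair the datapath emits in response.
        out_port = self.flows.get(in_port)
        return [] if out_port is None else [(out_port, packet)]


def run_forwarding_test(datapath: FakeDatapath) -> bool:
    packet = b"test-packet"
    datapath.reset()                           # 1. reset the rules on the test switch
    datapath.install(in_port=1, out_port=2)    # 2. install new flow entries
    observed = datapath.send(1, packet)        # 3. send packets through the ports
    return observed == [(2, packet)]           # 4. observe the behaviour


print(run_forwarding_test(FakeDatapath()))
```

Because each test starts by resetting the switch, tests do not depend on the state left behind by earlier ones.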
Chapter 3
Development Process
Contributing code to an open-source codebase requires additional skills on
top of the basic skillset of a programmer. While the ability to reason about
programs, select appropriate data structures and write quality code are all
very useful skills, these need to be supplemented with a social understanding
of the project’s ecosystem.
This chapter describes the considerations that a potential contributor (hereafter the contributor) needs to make when undertaking work on open source
software. This will include general commentary on each particular aspect of
the development process, followed by examples of how the aspect is managed
in Open vSwitch. This will provide insight into the processes involved with
a large community consisting of dozens of contributors, and should be widely
applicable in any software development community.
The general development process can be broken down into a cycle involving
the following activities:
• Learning about how to contribute
• Communicating plans for contributing
• Implementing the functionality
• Submitting the work upstream
3.1 Learning
There are a few steps that the contributor must take to learn how to contribute
to an open source project. The first is to learn which modifications are being sought
and the desired approach for implementing them. Once it
has been established which contributions the contributor will make, the
architecture must be examined and learnt. This provides the basis for the
work to be carried out.
3.1.1 Identifying work
This step can be split into two parts. The first of these is negotiating with
other developers about what tasks to work on, preventing duplicate efforts;
the second is to understand what is expected of the final implementation
of the feature. In the case of implementing a standardised protocol, this
involves studying the specification to ensure that the work accurately reflects
the intended behaviour.
Deciding which feature to work on is the first step in making a contribution. It is recommended that new contributors select something reasonably
small at first, so that progress can be made without becoming overwhelmed
with the scale of the codebase. With Open vSwitch, the call for assistance
provides a useful basis for feature selection. As one becomes more familiar
with a codebase, it becomes easier to consider and understand the effects of
implementation decisions on other areas of the code.
In any project that has a large number of active developers, there are likely to
be several features being worked on at a given time. To avoid duplicate efforts,
it is important to make contact with the lead developers and determine whether
there is already a developer allocated to the feature. In the case of Open
vSwitch, the developer mailing list is the central point for such communication
(Open vSwitch, 2012b). The contributor expresses interest in developing a
particular feature, and other developers provide feedback about whether the
feature is already under development. This may also include some indication
as to how they would prefer the work to be carried out.
When a decision has been made with regard to the feature to focus on, there is
a process of learning the requisite implementation details of the feature. This
may involve searching for literature on the subject: reading papers, specifications and IETF Requests For Comments (RFCs). In the case of implementing
OpenFlow features, the primary source of information is the specification
distributed by the Open Networking Foundation. Additionally, many of the
features make reference to protocols which are defined elsewhere, so it is
important to have an understanding of the operation of these protocols.
3.1.2 Learning the Architecture
Having investigated what modifications should be made to the codebase, the
contributor must learn about how to make these changes. Several factors affect
the significance of this step. For some idea of the scope, one could consider
how many lines of code there are, or the number of files in the codebase. For
a more accurate gauge of the effort required for this task, it is worth looking
at the quality of documentation—to what degree it exists and how accurate it
is.
There is no single accepted format for the documentation of a project. Most
codebases have some form of web presence—particularly open-source ones—
but even so, the documentation may not be found on the website. In Open
vSwitch, the developers have opted to distribute documentation with the
codebase, and make these accessible from the main website (Open vSwitch,
2012a). The descriptions of the abstraction layers are found in the PORTING
file, which is targeted towards developers interested in porting the codebase to
additional platforms. This is augmented by the descriptions of design decisions
made when developing Open vSwitch, which are detailed in the DESIGN file.
Both of these files are found in the root directory of the codebase.
Even when there are design documents distributed with the project, they may
not provide the level of detail required to implement a feature. The PORTING
guide gives a useful overview of the Open vSwitch architecture, but it does not
mention the specifics of how the existing components parse OpenFlow messages
or how these messages are implemented. One example is the meta-flow
structure which assists with parsing OpenFlow matches at run-time. Such
structures can be learnt by searching for components which interact with
them and investigating the interactions, and through communication with the
upstream developers.
Lastly, the nature of working on a fairly new codebase is that the internal
data structures have not been fully stabilised yet. Prior to this project, much
of the internal representation of data was based directly on the OpenFlow 1.0
structures. Over the course of this project, the internal structures were adapted
to allow for differing representations and additional features. Most notably,
the internal representation of OpenFlow actions was introduced as a superset
of the OpenFlow 1.1 actions and instructions so that Open vSwitch could
dynamically support multiple versions of OpenFlow at runtime. Such evolution in the codebase will affect the implementation of both new and existing
features. As such, the contributor needs to follow developments throughout
the entire project codebase to determine the effect that they may have on the
proposed code modifications.
3.2 Communication
Learning to communicate appropriately is one of the most important aspects
of contributing to an open source codebase. Healthy communication can lead
to a streamlined development pipeline and less wasted effort. These provide
positive reinforcement and improve the pace of community contributions. This
is particularly important with a large project such as Open vSwitch, which
receives upwards of twenty posts on the developer mailing-list per day. Any
of these could affect the implementation of a feature that the contributor is
working on.
3.2.1 Distributed development
With any existing development community, new contributors need a support
structure to help them to develop effectively on the codebase. This structure
provides information about the architecture, insight into design decisions, and
commentary on developments. When working with a distributed development
team, the ability to use other contributors as a resource is hindered by the
geographic spread of the community.
Typically, open-source codebases with a significant number of contributors
have a large variation in development locations. Bird and Nagappan (2012) investigated the geographic distribution of two major open-source communities,
Firefox and Eclipse. They found that around a third of major contributions
were made by developers on different continents for Firefox versions 1.5 and
2.0. Eclipse was far less distributed, but still involved two continents for a
majority of the work.
In Open vSwitch, the majority of commits are contributed by Nicira in California. As the software continues to be developed, major contributions have begun
coming in from other locations. For the work described later in this document,
the majority was developed in NZST (UTC+12), while corresponding with
California in PDT (UTC-7) and Japan in JST (UTC+9).
These timezone differences have a considerable effect on how developers can
schedule their workflow. While the direct implementation work can be carried
out at any time, discussion with developers on the implementation can only
occur during set times of the day. The common case for this project involved
using the IRC channel for urgent communication in the morning NZST, or
using the mailing list for more detailed discussion on implementation or code
quality. To set reasonable expectations for what defines a timely response, it
is recommended to be aware of what times of the day are business hours for
other developers. Upstream developers may take a day or more to respond to
messages that are posted to the mailing-list. As such, it is also recommended
for the contributor to have additional tasks to work on while waiting for
responses.
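The effect of these offsets on scheduling can be worked through with a short Python calculation. The UTC offsets are the ones given above; the 09:00 to 17:00 working day is an assumption, and the sketch ignores weekends and daylight-saving changes.

```python
from typing import List, Set

# UTC offsets in hours, as stated in the text.
OFFSETS = {"NZST": 12, "PDT": -7, "JST": 9}

# Assumed business hours: 09:00 up to (but not including) 17:00 local time.
WORKDAY = range(9, 17)


def working_utc_hours(offset: int) -> Set[int]:
    """UTC hours that fall inside the local working day for a given offset."""
    return {hour for hour in range(24) if (hour + offset) % 24 in WORKDAY}


def shared_local_hours(zone_a: str, zone_b: str) -> List[int]:
    """Hours of zone_a's local day during which zone_b is also at work."""
    shared = working_utc_hours(OFFSETS[zone_a]) & working_utc_hours(OFFSETS[zone_b])
    return sorted((hour + OFFSETS[zone_a]) % 24 for hour in shared)


print(shared_local_hours("NZST", "PDT"))  # [9, 10, 11]
print(shared_local_hours("NZST", "JST"))  # [12, 13, 14, 15, 16]
```

Under these assumptions the only overlap with California is the three hours of the NZST morning, which is consistent with IRC being used for urgent communication at that time of day.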
3.2.2 Ongoing Development
Every development community will have a commonly agreed means for communication. Common forms are the mailing list, bug tracking systems and
IRC channels. For Open vSwitch, the primary communication platform is the
mailing lists. There is a list specifically for development discussion, which is
separate from the general user discussion. Bug reports are also posted on a
Debian Linux bug tracker, but there is no public bug tracker specifically set up
for Open vSwitch. To ensure that they reach the attention of the appropriate
developer, bug reports should be sent to either the developer list
or the Debian tracker.
It is important as a contributor to actively watch the development activity to
promote co-operation between developers on related features. Every feature
described in this report was affected by other activity on the list. Examples include new abstractions introduced in the core codebase, commentary
on related features, and general development guidelines. In particular, the
write-metadata feature—described in Section 5.2.2—was based upon another
patchset from the community.
Active participation on the mailing-list also allows for the contributor to provide assistance to other developers. Some of the work submitted from this
project influenced other development. For example, the initial write-metadata
patch provided a model for others to implement similar features (Pfaff, 2012a).
The discussions following the post provided additional insight into the preferred approach for implementation. This helps the contributor to
improve their understanding of the codebase.
3.3 Implementation
There are several ways to verify that the code the contributor has created is
of acceptable quality to be included upstream. Perhaps the most indicative
is to write a comprehensive set of tests to verify that the behaviour is as
expected. Often, software projects will have additional criteria to keep the code
maintainable and find unexpected bugs. The Open vSwitch codebase includes
a guide for submitting patches which clearly outlines the quality criteria for
code submissions. The minimum expectations are that the code:
• builds correctly,
• does not break any existing tests,
• changes a single feature,
• and updates the user documentation.
This section will cover the expectations for proving the functionality of the
implementation, which will be followed by a discussion of its presentation in
Section 3.4.
3.3.1 Code Quality Tools
For code to be accepted upstream, it must first meet particular quality criteria.
As code submissions can be sent by a programmer of any skill level, it is useful
to have tools to verify the quality of the code. This assists the contributor
in improving programming practice, and assists upstream developers by
providing independent verification of code correctness. Two tools are used by
Open vSwitch developers for this purpose: Sparse and GNU AutoTest.
Sparse is an open-source static code analyser for C. It is a frontend for popular compilers that provides additional information about the use of memory
address space and other resources. In Open vSwitch, this program is used to
detect some potential bugs such as mixed byte ordering. When dealing with
network packets, the endianness of data on the wire is often different from that
of the host—referred to as Network Byte Order (NBO) and Host Byte Order
(HBO). Open vSwitch differentiates these types by defining its own types for
NBO values (ovs_beXX), and using the standard fixed-width integer types for HBO (uintXX_t);
however, the base data type of each of these is the integer. Typical compilers
will not detect errors where a developer is directly copying values between
NBO and HBO variables without swapping the endianness. Sparse is built to
highlight these difficult-to-detect mistakes.
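The class of mistake that Sparse highlights can be illustrated outside of C. The following Python sketch is not Open vSwitch code; it only shows how the same two bytes yield different values depending on the byte order used to interpret them. The value 80 and the variable names are chosen purely for illustration.

```python
import struct

# A 16-bit value (for example, a TCP port) as it appears on the wire.
# Network Byte Order is big-endian: most significant byte first.
wire_bytes = struct.pack("!H", 80)

# Correct: interpret the bytes as big-endian (the ntohs() step in C).
port_swapped = struct.unpack("!H", wire_bytes)[0]

# Mistake: interpret the same bytes as little-endian, which is what
# effectively happens on a little-endian host when the swap is forgotten.
port_unswapped = struct.unpack("<H", wire_bytes)[0]

print(port_swapped)    # 80
print(port_unswapped)  # 20480
```

Both variables are plain integers, so nothing in the language distinguishes them. This is the gap that the separate ovs_beXX types and Sparse are intended to close.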
AutoTest is a framework for generating platform-specific test scripts. Developers create AutoTest scripts which define general information about a test,
how to run the test, and what the expected output is. The Open vSwitch
codebase uses AutoTest as the basis for most of its unit tests. When adding
functionality to the code, developers are expected to create additional tests to
prove the behaviour of the additions is correct. This also allows developers to
verify that there have been no regressions when further changes are made to
the codebase.
When a feature has been written and appropriate test cases have been created,
the contributor must then prepare the work for review. Firstly, it must
be checked that the patch compiles successfully against a fresh copy of the
codebase. This can reveal build errors where the contributor has neglected to
include all of the relevant modifications in the patch. Following this, the code
should be regression-tested. Running make distcheck in Open vSwitch will
perform these two tasks in a single step.
3.4 Submission
There are several considerations for the submission of a patch beyond the code
implementation. Broadly, this section will describe them in terms of modifications required by the contributor, licensing considerations and feedback from
upstream maintainers.
3.4.1 Preparing to submit
The Open vSwitch codebase includes a style guide which defines the format
which is expected for code before it is to be accepted upstream. In practice,
this is unlikely to be an issue if the developer follows the existing style from
code surrounding their modifications. Given a sensible patch, the upstream
developers are unlikely to decline to review it on the basis of style, but style
deviations are likely to draw attention away from the functionality.
When the patch modifies or introduces new user-facing features, it is also
expected that the user documentation is updated. For each of the userspace
utilities provided by Open vSwitch, there are Unix-style manual pages describing the features of the utility and the syntax for using it. A submission that
does not include or update the manual pages will not be accepted upstream.
It is important that the behaviour described by the manual is accurate to the
behaviour of the application.
Patches must also be signed off by the contributor, to indicate that the code
can be submitted to the project without breaching the licensing terms. The
sign-off is a public record of the contribution. The usual case is where the contributor has
written the code, although it may include cases where code from another source
has been included under a compatible license, as described in Section 3.4.2.
Lastly, it is expected that any new patches are based on the latest development
code branch. It is good practice to fetch the latest version of the development
codebase before applying the candidate patch and performing regression testing. This is because the latest code may modify the components that the
candidate patch depends on. This is particularly important for projects which
have a high volume of code contributions.
3.4.2 Licensing
Open-source applications are defined as such based on the license under which
they are distributed. On occasion, it may be sensible to include another
project’s code in a patch so as to reduce duplicate efforts. This allows the
contributor to take advantage of a stable existing codebase rather than writing a new, and potentially buggy, solution. In this case, however,
it is essential to be aware of the license restrictions on the codebases involved.
In this project, the primary example of needing to include external code is
the CRC32c checksum implementation for userspace support of SCTP. To reimplement a checksum algorithm is a risk-prone undertaking and would require
a large amount of effort to write correctly. As such, it is sensible to look for
existing implementations that could be included in the Open vSwitch source.
Several open-source kernels include copies of the CRC32c algorithm for use
by their own SCTP implementations. Each of these has its own license, some
of which are compatible with the Open vSwitch license and some which are
not. Ultimately, the FreeBSD implementation was chosen, due to its licensing
compatibility with the Apache version 2 license that most of Open vSwitch
uses.
3.4.3 Code Review
The review process is the last step in the development cycle.
For Open
vSwitch, this step involves the contributor posting the candidate patches to the
openvswitch-dev mailing-list, and receiving feedback from other developers.
This step provides an additional assurance of code quality for the codebase.
There are some aspects of code quality that cannot be measured using
static code analysis or written tests. One such aspect is the effect on
the architecture of the system. The core developers of a project are able to
provide insight into how a particular work interacts with the larger codebase.
As maintainers of the code, it is in their interests to ensure that code follows
the intended design of the system. Conversely, if they allow code that breaks
down the design or behaves unexpectedly, then this will increase the work
required in the future to keep the system bug-free.
For the contributor, the most beneficial aspect of this step is an opportunity
to discuss implementation with direct reference to code. Although questions
can be asked on IRC or on mailing-lists without a patch submission, the lack
of context can introduce ambiguities in communication. When the contributor
submits a patch, this is a chance for the upstream developers to educate the
contributor on how the work should be carried out.
The review process commonly repeats multiple times until the upstream developers are content with the proposed changes. If the contributor is persistent
and willing to respond to feedback, then each round of review will bring the
code closer to acceptance upstream.
Kernel code
Open vSwitch also includes a kernel module for Linux in its distribution. This
module has been included in the mainline Linux kernel since version 3.3 (Calleja, 2012).
The review process for this code is more rigorous: the patch must also satisfy
a further 24 patch submission criteria (Riel, 2006), and will be seen by multiple
other developers. As such, features that require work in this area will undergo
additional scrutiny from the Open vSwitch developers, to reduce the effort
required to later get the code included in Linux. The SCTP feature described
in Section 5.3 requires work in this area.
Chapter 4
Existing Architecture
Chapter 2 discussed the concepts underlying this project: SDN, OpenFlow,
and the approach to extending Open vSwitch. The overview of these concepts
is essential for understanding how this project fits into the larger work, but does
not provide enough detail to understand the proposed changes. This chapter
describes some of the implementation details that this project relies upon;
the inner workings of OpenFlow switches in general, and the way that Open
vSwitch reflects these. This provides a basis for the modifications detailed in
Chapter 5.
4.1 OpenFlow Architecture
Prior descriptions of forwarding elements in this document refer to the forwarding of individual packets. However, to place one flow entry into the FIB
for each packet that passes through the datapath would be inefficient. As such,
packets are usually classified into flows based on attributes such as the source,
destination and protocol being used. All packets with the same values in these
limited fields are handled by a single rule.
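As a rough illustration of this classification, the Python sketch below reduces each packet to a small key and counts how many distinct flows result. The Packet and FlowKey types, and the choice of exactly three fields, are assumptions made for the example rather than structures taken from Open vSwitch.

```python
from collections import Counter, namedtuple

# Hypothetical, simplified packet and flow-key types for illustration.
Packet = namedtuple("Packet", "src dst proto payload")
FlowKey = namedtuple("FlowKey", "src dst proto")


def flow_key(packet):
    """Reduce a packet to the fields used for classification."""
    return FlowKey(packet.src, packet.dst, packet.proto)


packets = [
    Packet("10.0.0.1", "10.0.0.2", "tcp", b"a"),
    Packet("10.0.0.1", "10.0.0.2", "tcp", b"b"),
    Packet("10.0.0.3", "10.0.0.2", "udp", b"c"),
]

# Three packets collapse into two flows, so two rules are sufficient.
flows = Counter(flow_key(p) for p in packets)
print(len(flows))
```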
4.1.1 Flow Modification
In the OpenFlow 1.0 specification, the datapath’s FIB is presented as a single
logical table. The table has a list of flow entries describing the flow to
match on, and a set of actions to perform on packets that match. At the
bare minimum, an OpenFlow switch must support actions to either forward
or drop the packet; there are also provisions for modifying the packet and
queueing. When a packet is received, the headers are parsed and used to find
a matching flow entry. The actions for that flow entry are then immediately
applied in order. The default action if there is no matching flow entry is to forward
the packet to the controller so that it can create a flow entry to match the
packet. Figure 4.1 contains a simplified representation of the OpenFlow 1.0
pipeline. The arrow represents a packet being received by the datapath. When
the packet is received, fields from the packet are looked up in the table to find
the associated action. The action is then taken, for instance forwarding out
a particular port.
Figure 4.1: OpenFlow 1.0 Pipeline (a packet enters the flow table, which matches it and yields actions)
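The single-table lookup can be sketched as follows. This is a simplified model, not the Open vSwitch classifier: the table is assumed to be a list searched in order (real flow tables use priorities), and the action strings such as "output:2" and "controller" are hypothetical labels.

```python
# Hypothetical model: a flow table is a list of (match, actions) pairs,
# where a match is a dict of header fields that must all be equal.
def lookup(table, headers):
    """Return the actions of the first entry whose match fields all agree."""
    for match, actions in table:
        if all(headers.get(field) == value for field, value in match.items()):
            return actions
    # Table miss: by default the packet is sent to the controller.
    return ["controller"]


table = [
    ({"eth_dst": "00:00:00:00:00:02"}, ["output:2"]),
    ({"ip_proto": 17}, ["drop"]),
]

print(lookup(table, {"eth_dst": "00:00:00:00:00:02", "ip_proto": 6}))
print(lookup(table, {"eth_dst": "00:00:00:00:00:09", "ip_proto": 1}))
```

The second lookup matches no entry, so the sketch falls through to the controller, mirroring the default behaviour described above.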
4.1.2 Instructions
In OpenFlow 1.1, the datapath is expanded to support multiple tables. The
pipeline processing begins at the first table, and may finish processing after
a single table (as with OpenFlow 1.0), or may be configured to continue
processing on another table.
Figure 4.2: OpenFlow 1.1+ Pipeline (Table 0 through Table N; each table matches the packet and yields instructions, with actions and metadata carried between tables)
Instructions in OpenFlow 1.1 allow configuration of this processing behaviour.
They take the place of actions in a flow entry, providing a superset of the
functionality. Instructions may have actions attached, in which case they
specify how those actions are to be performed; for instance, the Apply-Actions
instruction performs the attached actions immediately—in the same manner
as OpenFlow 1.0. Other instructions provide for attaching actions or data to a
packet, and moving the processing immediately to any later table. Figure 4.2
displays the pipeline for OpenFlow 1.1 and above. As with the OpenFlow 1.0
pipeline, the fields from a packet are looked up in a table, starting with the
first. The table will provide instructions for further processing. This may
forward to a later table, or it may immediately apply a set of actions. If
the packet is forwarded to another table, then actions and data can also be
attached to the packet as it goes through the pipeline.
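A minimal sketch of this multi-table processing is given below. It models only three instructions (Apply-Actions, Write-Metadata and Goto-Table), omits the action set and the metadata mask, and uses hypothetical dictionary keys; it is intended to show the control flow rather than the behaviour of any particular switch.

```python
def run_pipeline(tables, headers):
    """Walk the tables, returning (applied actions, final metadata)."""
    applied, metadata, table_id = [], 0, 0
    while table_id is not None:
        next_table = None
        # Metadata written by earlier tables is matchable like a header field.
        fields = dict(headers, metadata=metadata)
        for match, instructions in tables[table_id]:
            if all(fields.get(k) == v for k, v in match.items()):
                # Apply-Actions executes its actions immediately.
                applied += instructions.get("apply_actions", [])
                # Write-Metadata attaches data for later tables.
                metadata = instructions.get("write_metadata", metadata)
                # Goto-Table may only move forward in the pipeline.
                next_table = instructions.get("goto_table")
                if next_table is not None and next_table <= table_id:
                    raise ValueError("goto_table must refer to a later table")
                break
        table_id = next_table
    return applied, metadata


tables = [
    [({"ip_proto": 6}, {"write_metadata": 0x1, "goto_table": 1})],
    [({"metadata": 0x1}, {"apply_actions": ["output:2"]})],
]

print(run_pipeline(tables, {"ip_proto": 6}))
```

Here the first table only classifies the packet and records the result as metadata; the forwarding decision is made in the second table, which matches on that metadata.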
4.1.3 Experimenter Extensions
As a protocol developed in the research community, OpenFlow has always had
an inherent focus towards assisting experimentation. Early development of
the specification was done in the open. By version 0.89, there were provisions specifically for allowing vendor extensions, later known as Experimenter
Extensions.
The experimenter extensions support consists of standardised structures for
adding new matches, actions and instructions (OpenFlow.org, 2011). This allows researchers to develop functionality to be included in subsequent versions
of the specification. Vendors and researchers may use this feature to build upon
the existing feature-set from the standard protocol. Many of the features from
Nicira’s extensions to OpenFlow 1.0 have become standardised in later versions
of the specifications. One such example is the extensible matches, which are
used by all of the features described in Chapter 5.
4.1.4 Extensible Matches
Extensible matches provide researchers with additional flexibility for modifying
flow entries. In OpenFlow 1.0 and 1.1, flow modification messages contain
fields for every type of match that the protocol supports. When constructing
these messages, the match entries for all of these features must be sent, even
if the flow entry only matches on a single field. Extensible matches allow
these messages to only include relevant matches. This feature was introduced
in Open vSwitch as Nicira Extensible Match (NXM), and was standardised
in OpenFlow 1.2 as OpenFlow Extensible Match (OXM) (Open Networking
Foundation, 2012). To support new features on all versions of OpenFlow, these
structures will need to be updated.
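To make the contrast with fixed-size match structures concrete, the sketch below encodes a single extensible match entry as a type-length-value record. The header layout (16-bit class, 7-bit field, 1-bit mask flag, 8-bit payload length) and the constants are given as recalled from the OpenFlow 1.2 specification and should be checked against it; the function name is hypothetical.

```python
import struct

OFPXMC_OPENFLOW_BASIC = 0x8000  # class for standard match fields (assumed)
OXM_FIELD_ETH_DST = 3           # field number for ethernet destination (assumed)


def oxm_entry(oxm_class, field, value, mask=None):
    """Encode one match entry: a 4-byte header, the value, then any mask."""
    payload = value + (mask or b"")
    has_mask = 1 if mask else 0
    header = (oxm_class << 16) | (field << 9) | (has_mask << 8) | len(payload)
    return struct.pack("!I", header) + payload


eth_dst = bytes.fromhex("0123456789ab")

# An exact match carries only the value; a masked match appends the mask.
entry = oxm_entry(OFPXMC_OPENFLOW_BASIC, OXM_FIELD_ETH_DST, eth_dst)
masked = oxm_entry(OFPXMC_OPENFLOW_BASIC, OXM_FIELD_ETH_DST, eth_dst,
                   bytes.fromhex("ffffff000000"))

print(entry.hex(), len(masked))
```

A flow modification message then contains only the entries that the flow actually matches on, instead of one slot for every field that the protocol supports.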
4.2 Open vSwitch Architecture
The internal structure of Open vSwitch consists of the following components:
• ovs-vswitchd: the daemon process that stores state,
• netdev-provider: provides network device information,
• ofproto: brokers OpenFlow connections,
• ofproto-provider: implements OpenFlow functionality.
Figure 4.3 shows the relationship between these components. ovs-vswitchd is
the main userspace program, which stores and retrieves state from a database.
The netdev-provider handles interaction with network devices, which is
done by interfacing with the running kernel. ofproto communicates with the
OpenFlow controller, and passes messages down to the ofproto-provider.
Finally, ofproto-provider is the component which understands and implements OpenFlow—parsing the messages and installing rules to forward flows.
Figure 4.3: Open vSwitch Components (ovs-vswitchd, ofproto, netdev-provider, ofproto-provider, kernel)
4.2.1 OpenFlow Provider
This project focuses on implementing features in the OpenFlow provider.
Figure 4.4 shows the primary functions of this component. The parser takes
OpenFlow messages from the higher level, and converts them to the internal
flow entry format. This flow entry is then written to the datapath, which
performs the forwarding of packets.
The OpenFlow provider distributed with Open vSwitch contains two datapaths: one in userspace, and one in the Linux kernel. These datapaths make use
of a shared OpenFlow implementation for features which they do not directly
support. The default behaviour is that Open vSwitch will attempt to use the
kernel datapath for forwarding, as this is faster: packets do not need to cross
the userspace/kernelspace divide. In cases where the kernel implementation
does not provide particular OpenFlow functionality, it will revert to using the
userspace library.
4.2.2 Testing
Open vSwitch includes an extensive test-suite for over 1000 individual features.
The testing of OpenFlow features can be broken up into the two main functions
from the OpenFlow provider: the parser and the datapath. The parsing tests just
check that the OpenFlow messages can be understood and converted into the
internal format; the datapath tests ensure that the functionality of these
messages matches the specification. These tests are written with reference to
the internal workings of Open vSwitch, known as white-box testing.
Figure 4.4: OpenFlow Provider (OpenFlow 1.0, OpenFlow 1.1 and Nicira extension messages enter the parser, which produces match and action entries for the flow tables in the datapath)
Testing the Parser
Open vSwitch includes a commandline tool called ovs-ofctl that can parse
OpenFlow messages and print text reflecting the purpose of the message. The
contributor specifies the hex bytes of the message as input, then ovs-ofctl
parses the message as though it had been received from the controller.
When a test case is first written, its expected output is set to an error
message for the feature being tested. This shows that Open vSwitch will notify
the controller when a message is sent that uses an unsupported feature. When
support for the feature is written, the contributor adds the new feature to the
print component. Subsequently, the contributor updates the expected output
to reflect the new feature, showing that the parser now recognises it.
Testing Functionality
The functionality of a feature is tested in a different manner: by setting up a
copy of Open vSwitch and sending packets through it to determine whether the
behaviour is as expected. Unlike the OpenFlow Testing Framework described
in Section 2.4, these tests build upon knowledge of the internal structures to
test that the behaviour is correct. The test script starts Open vSwitch and
carries out the following:
1. Send a flow modification message including the new match and an action
2. Send a packet which satisfies the match criteria
3. Verify that the specified action is carried out
Between the parsing tests and the functionality tests, the Open vSwitch test
suite covers the interpretation of OpenFlow messages and implementation of
their behaviour. The use of this test suite to test the features implemented in
this project is discussed further in Chapter 5.
4.3 Summary
This chapter has described the internal structure of how OpenFlow switches,
and in particular Open vSwitch, provide packet-forwarding functionality.
The next chapter will build upon this knowledge to describe the work carried
out in contribution to OpenFlow 1.1 support in Open vSwitch.
Chapter 5
Implementation
This chapter explores the work carried out for this project: support for arbitrary ethernet address masking, attaching metadata to flows, and implementing support for the Stream Control Transmission Protocol. Each feature
is explained separately in the following manner: the feature is described in
context of network research—what the feature is and how it can be used. The
approach for implementation of the feature is then described with reference to
Open vSwitch architecture. Each section will conclude with a description of
how the feature was evaluated. Finally, this chapter will evaluate the success
of this work based on the original goals.
5.1 Arbitrary Ethernet Address Masking
In IP-based networks, it is common to use an address comparison method
known as address masking. This allows forwarding elements to use part of
an address to direct network traffic. The OpenFlow 1.1 specification adds
this feature for use with ethernet addresses (OpenFlow.org, 2011). When the
controller adds a flow entry to the Forwarding Information Base (FIB), it
specifies an address and a bitmask to match on. Then, when a packet enters
the pipeline, the forwarding element checks that the specified mask bits are
the same between the flow entry address and the packet’s address. Figure 5.1
shows an example of a positive ethernet address match. 0xFF specifies that
the corresponding bits should be identical in the match address and the packet
address. 0x00 specifies a wildcard; the corresponding bits need not match.
match entry address: 01:23:45:67:89:AB
match entry mask:    FF:FF:FF:00:00:00
packet address:      01:23:45:AB:89:67
Figure 5.1: Ethernet address masking
Traditionally, this functionality has not been provided by ethernet switches. In
the interests of providing additional flexibility to researchers, the OpenFlow 1.1
standard makes these addresses arbitrarily maskable. Particular sections of the
ethernet address have significance, such as identifying the sending device or
determining whether the address is globally unique (IEEE Computer Society,
2002). This allows researchers to investigate the behaviour of particular devices
in their networks.
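The comparison itself is a single bitwise operation. The sketch below uses the addresses from Figure 5.1; the helper names are hypothetical and the code is an illustration rather than the Open vSwitch implementation.

```python
def parse_mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into a 48-bit integer."""
    return int(text.replace(":", ""), 16)


def eth_masked_match(match_addr, mask, packet_addr):
    """True if the addresses agree on every bit that is set in the mask."""
    differing_bits = parse_mac(match_addr) ^ parse_mac(packet_addr)
    return (differing_bits & parse_mac(mask)) == 0


# The last three octets differ, but the mask wildcards them.
print(eth_masked_match("01:23:45:67:89:AB", "FF:FF:FF:00:00:00",
                       "01:23:45:AB:89:67"))
# With a full mask the same pair no longer matches.
print(eth_masked_match("01:23:45:67:89:AB", "FF:FF:FF:FF:FF:FF",
                       "01:23:45:AB:89:67"))
```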
Prior to this project, the Open vSwitch codebase provided limited functionality
for masking ethernet addresses. As an OpenFlow 1.0 compliant switch, it
allowed exact matching of ethernet source and destination addresses. Additionally,
Nicira had added extensions to match the ethernet multicast bit, a section
of the address that specifies whether the destination is a group of devices or
just a single device.
The existing support for masking was implemented with a set of statically
defined masks that indicate the following mask types:
• Mask all bits—Match a specific ethernet address
• Mask all bits except multicast—Match an ethernet address, but allow it
to be directed at one or more destinations
• Mask the multicast bit—Match all traffic directed at multiple destinations
• Mask no bits—Do not match the address
The NXM match message for ethernet source and destination addresses had a
field to specify which of these masks should be applied. When the message was
parsed, the enumerated mask value would be placed into the flow structure,
which would be used by the classifier to dynamically apply the appropriate
mask.
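Written as 48-bit values, the four pre-defined masks might look as follows. The names are hypothetical and do not correspond to identifiers in the Open vSwitch source; the only external fact relied upon is that the multicast (group) bit is the least significant bit of the first octet of an ethernet address.

```python
# The multicast (group) bit: least significant bit of the first octet.
ETH_MCAST_BIT = 0x010000000000
ETH_ALL_BITS = 0xFFFFFFFFFFFF

# Hypothetical names for the four statically defined masks.
LEGACY_MASKS = {
    "exact": ETH_ALL_BITS,
    "exact_except_mcast": ETH_ALL_BITS ^ ETH_MCAST_BIT,
    "mcast_only": ETH_MCAST_BIT,
    "wildcard": 0x000000000000,
}


def is_legacy_mask(mask):
    """True if the mask could be expressed with the pre-defined set."""
    return mask in LEGACY_MASKS.values()


# A mask over the first three octets only cannot be expressed this way.
print(is_legacy_mask(0xFFFFFF000000))
```

Any mask outside this set, such as one covering only the first three octets, is exactly what the enumerated representation could not express.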
Rather than applying pre-determined flow masks, this feature allows for specific masks to be attached to the flow entry. The work carried out in the
userspace datapath was primarily in the classifier and flow components. A
new field was added to the flow structure to store the ethernet mask and the
classifier was updated to use these new fields. In the OpenFlow parser, the
NXM message handler was modified to use the new mask field rather than the
previous masks.
Additionally, many of these functions implemented their own bitwise operations to modify and compare multi-byte ethernet values. To reduce duplicate
code and increase the readability of functions that use this code, it was deemed
worthwhile to refactor it into separate functions.
The testing for this feature followed the description in Section 4.2.2. The
ovs-ofctl tool was used to parse a flow modification message containing an
ethernet address and a previously unsupported mask. For it to print out the
address and mask correctly shows that the parser has correctly interpreted the
message and placed it into the internal structures. The classifier tests were
also updated to use the new mask structures, and confirm that rules which
match on arbitrary ethernet masks would take action when the mask matched
a packet being received.
The implementation of this feature was considered as an opportunity to explore
the codebase and become familiar with the processes as described in Chapter 3.
As the first feature written as part of this project, the implementation took
longer than originally scheduled. These delays can be categorised into two
areas: inexperience with the large codebase, and lack of familiarity with the
code submission process. The knowledge gained in these areas improved the
development speed for the subsequent features.
5.2 Metadata
The metadata feature in OpenFlow 1.1 allows additional information to be
attached to a packet as it passes through the forwarding pipeline. One table
could perform classification of a packet based on particular protocol fields and
pass this information to later tables. This data can then be used to perform
actions later in the pipeline. A similar feature was previously introduced to
BGP (Traina et al., 1996). Donnet and Bonaventure found that on average
this feature was being used by half of the routes held by the sampled set of
routers, and this number was increasing over the period they were monitored.
This gives some indication of the value that a similar feature is providing to
network administrators.
5.2.1 Matching on Metadata
The datapath implementation for matching on metadata is similar to that of
ethernet matching: add a field to the flow structure and update the functions
which pack and unpack this structure. After the lessons learnt from the
ethernet matching work, these areas were well understood. This increased
the speed of implementation initially; however, the work required in the parser
differed significantly. This was because the ethernet mask implementation had
some existing code to extend, whereas the metadata feature did not build upon
an existing feature.
The two standardised representations for metadata are the OpenFlow 1.1
match type and the OXM type used in 1.2 and above. The parser for each
of these representations was updated to handle the new field. In the case
of the extensible match parser, the existing implementation only handled
the NXM format and any structures in the OXM format that mirrored the
NXM counterpart. Prior to this project, the parser would only handle these
messages if there was a representation in both formats. Duplicating the same
structures for new features in both NXM and OXM provides no benefit, and
is only used for older features for backwards compatibility. As such, it was
deemed worthwhile to extend the extensible match support so that new OXM
structures could be used with the Nicira extension format.
To test the new field, some tests were written to send metadata match messages
to the datapath and check that the parser recognises the messages correctly.
This was done for each of the possible representations, including the use of the
new OXM metadata structure within a Nicira match message. Unfortunately,
while the current version of Open vSwitch implements multiple tables internally, it does not support the OpenFlow 1.1 instruction to direct the pipeline
to a later table. As such, the matching functionality of transferring metadata
to another table was not unit-tested.
5.2.2 Writing Metadata
The initial approach to writing metadata was performed using a Nicira action
extension, NXAST_REG_LOAD. This extension allows modification of any field
in the flow structure. Each field from the flow structure is represented by
a number, designated by the NXM and OXM implementation. This is used
to identify which flow field NXAST_REG_LOAD will change. This action
made it possible to independently test metadata matching before developing
the write-metadata instruction. Later, the behaviour of the write-metadata
instruction could be compared against this to determine whether the behaviour
was correct.
During the implementation of this feature, a new abstraction for actions was
introduced internally. The new format abstracts away the difference between
actions and instructions from the core code, leaving it up to the parser to
translate these structures into specific OpenFlow versions. This keeps the
datapath code simple, as it does not need to handle the specifics of which
OpenFlow version is being used. Instead, the conversion between the internal
format and particular OpenFlow versions is handled in the parser, as described
in Section 4.2.
This approach to implementation has curious effects when combined with the
outlined plan to support new features on OpenFlow 1.0 with Nicira extensions.
The OpenFlow 1.1 specification defines the order in which instructions should
be applied; however, actions do not have the same guarantee. This is further
complicated when one considers that the same Nicira extension message could
be attached to an OpenFlow 1.1 message in addition to the official instruction.
After a discussion with the Open vSwitch developers, it was determined that
the best approach is to enforce specific rules regarding the action’s position in
a message (Pfaff, 2012b). OpenFlow 1.1 instructions that are implemented
as Nicira extension actions must occur at the end of a flow modification
message, and must appear in the correct order—as specified for OpenFlow 1.1
instructions in the specification. Furthermore, an OpenFlow 1.1 or higher
message cannot contain the metadata action structure. Any flow modification
message that does not comply with these restrictions should be responded to
with an error message.
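The positional rules can be expressed as a small validation routine. The sketch below is an assumption-laden illustration: the action names are hypothetical labels, the set of instruction-like actions is chosen for the example, and the relative order is the instruction execution order as recalled from the OpenFlow 1.1 specification, which should be confirmed against the specification itself.

```python
# Relative order of instruction-like actions (assumed, see lead-in).
INSTRUCTION_RANK = {
    "clear_actions": 0,
    "write_actions": 1,
    "write_metadata": 2,
    "goto_table": 3,
}


def check_action_order(actions):
    """Return True if instruction-like actions form an ordered tail."""
    last_rank = -1
    seen_instruction = False
    for name in actions:
        rank = INSTRUCTION_RANK.get(name)
        if rank is None:
            if seen_instruction:
                return False  # ordinary action after an instruction
            continue
        if rank <= last_rank:
            return False      # out of order, or the same type repeated
        seen_instruction = True
        last_rank = rank
    return True


print(check_action_order(["output", "write_metadata", "goto_table"]))
print(check_action_order(["write_metadata", "output"]))
```

A message for which this check returns False corresponds to the case where the switch should reply with an error.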
Testing Metadata Writing
Testing the write-metadata instruction is performed using the OpenFlow
packet-in message. This message is usually used when a packet enters the
datapath and there is no flow entry that matches it. In this case, the OpenFlow
switch encapsulates the incoming packet and sends it to the controller with
any additional contextual information that it can gather. The controller may
then investigate the packet and create a rule to match it. In Nicira extensions
(and later, OpenFlow 1.2), this message also passes back any metadata that is
attached to the packet. A flow entry can also explicitly specify that matching
packets should be sent to the controller. This particular behaviour is used to
test the writing of metadata.
The test procedure is carried out as follows: Firstly, a flow entry is installed
on the datapath to match all traffic and apply the following actions:
• Attach some specific metadata value to the flow
• Send the packet back to the controller
Secondly, a packet that matches the flow entry is sent to the datapath. The
datapath matches the packet, and performs the actions specified—attaching
metadata and sending a packet-in message to the controller. The message can
then be examined to determine whether the metadata was written as expected.
One test case had a counter-intuitive expected result. There are no instructions in OpenFlow 1.0, so the write-metadata instruction is implemented as a vendor extension to the actions. When ovs-ofctl
is used to test the conversion of this message to an OpenFlow 1.0 action structure, the tool outputs a Nicira extension structure. However, when the same is
done for converting to an OpenFlow 1.1 structure, the metadata is not displayed.
This could be mistaken as an example of the OpenFlow 1.1 parser failing to
parse instruction messages correctly. The actual case is that in OpenFlow 1.1,
this message will only be represented as an instruction. This test case only
uses ovs-ofctl to convert into action structures, while the actual behaviour in
practice is that an OpenFlow 1.1 message will be parsed for both instructions
and actions. Hence, the expected output from the test is actually to drop the
action.
5.3 SCTP Support
SCTP is a transport protocol similar to TCP or UDP (Stewart, 2007). SCTP
is used by the telecommunications industry to interconnect various systems in
their networks, in particular newer billing and authorisation systems (Calhoun
et al., 2003). As a relatively recent protocol (first introduced in 2000), it is of
interest to researchers to experiment with its behaviour and compare it to
traditional transport protocols.
The implementation of this feature was split into four areas: introducing a
CRC32c checksum implementation for use in the userspace datapath, parsing
OpenFlow matches for SCTP, and support for matching SCTP in each of
the userspace and kernelspace datapaths. Section 3.4.2 described the considerations for including a CRC32c implementation, and the previous two
sections outlined the modifications required to support a new OpenFlow match
type. Therefore, this section will focus on the implementation of SCTP in the
datapaths.
The SCTP support in OpenFlow 1.1 refers to the ability to perform matching
and actions using SCTP source and destination port numbers. These port
numbers allow hosts to distinguish between multiple connections with the
same destination. This functionality mirrors that of several other protocols,
including TCP and UDP. Furthermore, the location within a packet for these
port fields is also identical. As such, some of the functionality for this feature
was implemented in the same format as the code for these other protocols.
This was the case for the matching support; the logic consists of looking at
a particular byte offset into a packet to find the port, and using the value to
compare with the relevant flow entry.
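Because the port fields occupy the first four bytes of the TCP, UDP and SCTP headers alike, a single extraction routine serves all three. The sketch below shows this for an SCTP common header; the function names and the port values are chosen for illustration.

```python
import struct


def transport_ports(l4_header):
    """Return (source port, destination port) from the start of the header."""
    return struct.unpack_from("!HH", l4_header, 0)


def matches_dst_port(l4_header, flow_dst_port):
    """Compare the packet's destination port with a flow entry's value."""
    return transport_ports(l4_header)[1] == flow_dst_port


# SCTP common header: ports, verification tag, checksum (12 bytes).
sctp_header = struct.pack("!HHII", 5000, 38412, 0x12345678, 0)

print(transport_ports(sctp_header))
print(matches_dst_port(sctp_header, 38412))
```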
The part of this feature that deviates the most from these other protocols is
support for the set-field action. This action allows particular fields of a
packet to be altered by the datapath as it passes through the pipeline. When
altering part of a packet in this manner, the checksum for the packet must also
be updated—hence the need for an implementation of CRC32c.
Another consideration was revealed during discussion of this feature with
upstream developers. In typical ethernet networks, when a switch receives
a packet that has an invalid checksum, the expected behaviour is to forward
the packet normally. The end host will recognise that the packet is corrupt
and deal with it accordingly. However, consider the case where a set-field
action is used on a corrupt packet. To behave in the expected manner, this
error in the checksum must be propagated across the field change.
This is performed in the following manner:
1. Take a copy of the initial checksum from the packet
2. Calculate the correct checksum for the original packet
3. Modify a field in the packet
4. Calculate the new checksum for the modified packet
5. If the checksums from steps 1 and 2 are not identical, apply the difference to the new checksum
6. Place the new checksum into the packet
In the case where the original checksum is correct, the resulting checksum will
be correct for the packet. Therefore, the destination host for the packet will
see a valid packet. In the case where the original checksum was incorrect,
the destination host will receive a packet with an invalid checksum. If the
checksum were instead simply recalculated after the modification, it would be
possible for an invalid packet to arrive at the destination host with a valid
checksum.
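The six steps above can be sketched as follows. This is an illustrative model rather than the code submitted upstream. It assumes that the "difference" between two checksums is expressed as a bitwise XOR, and that the checksum is stored in little-endian byte order; both are assumptions of this sketch, as are all of the function names.

```python
import struct


def crc32c(data):
    """Compact bitwise CRC32c (reflected polynomial 0x82F63B78)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def load_checksum(packet):
    """Read the checksum field (bytes 8-11; byte order is an assumption)."""
    return struct.unpack_from("<I", packet, 8)[0]


def store_checksum(packet, value):
    """Return a copy of the packet with the checksum field replaced."""
    return packet[:8] + struct.pack("<I", value) + packet[12:]


def set_dst_port(packet, new_dst):
    """Rewrite the destination port, carrying any checksum error across."""
    old_csum = load_checksum(packet)                       # step 1
    old_correct = crc32c(store_checksum(packet, 0))        # step 2
    modified = packet[:2] + struct.pack("!H", new_dst) + packet[4:]  # step 3
    new_correct = crc32c(store_checksum(modified, 0))      # step 4
    error = old_csum ^ old_correct                         # zero if valid
    return store_checksum(modified, new_correct ^ error)   # steps 5 and 6
```

With XOR as the difference, a valid input yields an error term of zero, so the output carries exactly the recalculated checksum; a corrupt input yields a non-zero error term, so the output checksum remains wrong by the same amount.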
5.3.1 End-to-End Testing
A virtualised test environment was constructed to test the implementation
of this feature. Figure 5.2 shows the test setup. The test PC is running
Debian GNU/Linux 6.0.5, with the modified version of Open vSwitch connected
to two Virtual Machines (VMs). Each of the VMs has Linux Kernel SCTP
(lksctp) (IBM Corporation, 2001) installed for testing. One VM acts as a
server, listening for SCTP traffic, while the other is instructed to send SCTP
packets to it.
[Figure: two Virtual Machines, each connected to Open vSwitch]
Figure 5.2: SCTP test setup
Initially, Open vSwitch acts as a standard ethernet switch, learning how to
forward packets between the hosts and allowing traffic to pass freely between
them. A rule is then installed into Open vSwitch to drop all SCTP traffic on
the port being used. As a result, the SCTP traffic cannot reach the other host.
Another application is used to test reachability between the two hosts, and
indicates that the hosts can still communicate normally. Finally, the flow
entry is deleted from Open vSwitch and the SCTP connection is observed to
re-establish.
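The sequence of observations in this test can be modelled with a toy flow table. The sketch below is not Open vSwitch and does not use its command-line tools; the class, the port number, and the use of TCP as the stand-in for the other reachability check are all assumptions made for illustration. Only the IP protocol numbers are real values.

```python
IPPROTO_TCP = 6
IPPROTO_SCTP = 132  # IANA-assigned IP protocol number for SCTP


class ToySwitch:
    """Forwards every packet unless a drop entry matches it."""

    def __init__(self):
        self.drop_entries = set()

    def add_drop(self, ip_proto, dst_port):
        self.drop_entries.add((ip_proto, dst_port))

    def delete_drop(self, ip_proto, dst_port):
        self.drop_entries.discard((ip_proto, dst_port))

    def forwards(self, ip_proto, dst_port):
        return (ip_proto, dst_port) not in self.drop_entries


def run_scenario(port=9899):
    """Replay the four observations made during the end-to-end test."""
    switch = ToySwitch()
    observed = [switch.forwards(IPPROTO_SCTP, port)]    # SCTP passes at first
    switch.add_drop(IPPROTO_SCTP, port)
    observed.append(switch.forwards(IPPROTO_SCTP, port))  # SCTP is blocked
    observed.append(switch.forwards(IPPROTO_TCP, port))   # other traffic passes
    switch.delete_drop(IPPROTO_SCTP, port)
    observed.append(switch.forwards(IPPROTO_SCTP, port))  # SCTP passes again
    return observed


print(run_scenario())
```

The third observation is the important control: it shows that the loss of SCTP traffic is caused by the flow entry and not by a general loss of connectivity.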
The first attempt at setting up this test environment was made using Linux
Containers (LXC). LXC is lightweight virtualisation software that provides
isolated Linux systems based on the host operating system (Lezcano, 2008).
This minimises the setup required to run the containers. Additionally, the
virtual environment will have access to the same software that is installed on
the host. LXC provides a kernel module that allows the containers to have
a separate process and network space. Unfortunately, despite the support for
SCTP in the rest of the Linux networking stack, this module does not support
SCTP. After a query regarding this behaviour on the lxc-users mailing-list,
the developers confirmed this lack of support (Lezcano, 2012).
For the second attempt, a more established virtualiser was chosen: QEMU
(Bellard, 2003). Libvirt (Red Hat, Inc, 2005) and Kernel-based Virtual Machine
(KVM) (Red Hat, Inc, 2007) were also used to speed up the installation and
management of the VMs. This process involved installing a copy of Linux into
each VM and installing lksctp. With these VMs set up, the test was carried
out for the userspace datapath and the kernelspace datapath. This testing
provided confirmation that the SCTP port matching was operating correctly.
Before spending more time on implementing and writing automated tests for
the remaining functionality in SCTP support, the patchset for this feature
was submitted to the openvswitch-dev mailing-list. The intention was to gain
assurance that the approach was correct, particularly with the implementation
of the kernel datapath code. This review process was delayed due to the
acquisition of Nicira (Herrod, 2012), and was not carried out in time for this
feature to be completed to the expected quality. The remaining time of this
project was spent working on the OpenFlow Testing Framework and writing
this report.
5.4 Evaluation
Each prior section in this chapter has described the uses, implementation
and testing of a feature individually. This section describes how well the
implementation lines up with the goals set at the outset of this project.
5.4.1 Code Review
The primary goal for this project was for all implementation to be accepted
upstream. As of publication, eleven patches have been accepted into the
Open vSwitch codebase. Table 5.1 shows a summary of the development
over the course of this project. The development time includes time spent
learning the tools and processes, but does not include any time spent after the
first submission of that patchset. The review time is counted from the first
submission of the patch until it is accepted into the upstream repository.
With the exception of the SCTP and miscellaneous patches, the implementation phase was
usually carried out after the review process of the previous feature. Time not
accounted for in the table includes time spent on testing using the OpenFlow
Testing Framework.
The notable delay in the review process for write-metadata is due to two
factors: the patch was based against another patchset from the community
which needed to go through the review process; and the acquisition of Nicira
further delayed this process. This delay also affected the work on SCTP
support. At the time of writing, the SCTP support is yet to be accepted
upstream. The patch has received one round of review to date. However, due
to time constraints, this round of feedback could not be integrated into the
implementation.
Feature             Patches  Reviews  Development  Review Time
Arbitrary ethernet  2        2        7 weeks      2 weeks
Metadata matching   1        3        3 weeks      1 week
Metadata writing    2        4        5 weeks      14 weeks
SCTP                5        1        9 weeks      -
Misc                5        1 each   -

Table 5.1: Development breakdown

The number of reviews conducted for the first three features appears to increase
each time. The first round for each usually solicited a discussion on the
implementation approach for the feature or an explanation of the architecture
of Open vSwitch. In the case of metadata matching, one of the reviews was
a simple request to remove an unused function which was unintentionally
included in the patch.
Metadata writing had the most reviews to date, in part due to the discussion
on how the feature should be represented, and also due to the decision made
to base the patch on an alternative branch of the codebase. In retrospect,
this decision slowed development and provided additional workload for the
developers involved. The recommendation from this experience is to use
alternative branches of the codebase only in cases where the branch is likely to be
integrated into the main development branch before completion of the feature.
5.4.2 Testing features
Each of the features that was accepted upstream included unit tests with the
code submission. These tests demonstrated compliance with the OpenFlow
protocol. This is reinforced by the reviews conducted by upstream developers,
most of whom have had some involvement in the creation of the OpenFlow
specifications.
An additional guarantee of code compliance was also investigated—the use of
the OpenFlow Testing Framework. However, several issues impeded the use of
this software. These consist of problems running the framework, and problems
with the representation of OpenFlow messages.
Despite the goal for the testing framework to be easy to set up and use, there
were particular hurdles faced in this area. Two versions of the framework were
used: one from Ericsson Research, and one from CPqD. The latter of these
required an additional library that is not developed primarily for Linux. There
was some work involved in patching this codebase to have it build correctly on
the test PC. Even when the framework was successfully installed, it exhibited
unusual behaviour; the failure of one test would prevent the next test from
running correctly. This behaviour was traced to the way in which it reserved
addresses on the local host for sending OpenFlow messages.
Furthermore, despite the support of various OpenFlow 1.1 features in Open
vSwitch, the current behaviour is to expose these only through Nicira extensions.
As of publication, the ofproto component that brokers connections with
OpenFlow controllers will not negotiate an OpenFlow 1.1 connection. A
patchset to add this feature is currently undergoing review on the
mailing-list. However, due to time constraints, this patchset could not be used
to test the functionality of this project.
Chapter 6
Conclusion
This report has documented the process of learning about a large open source
project and contributing features to the codebase. This process involved a cycle
of learning about the project and its architecture, discussing how modifications
should be made, implementing those modifications, and the steps involved in
ultimately getting the code into the upstream codebase.
An exploration of the computer networking concepts was presented, to provide
a basis for this project—Software-Defined Networks, OpenFlow, and how Open
vSwitch implements these. This was expanded upon with specific reference to
the OpenFlow specifications and Open vSwitch design documentation. The use
of OpenFlow features such as instructions, experimenter extensions and extensible matches was described and linked to the architecture of Open vSwitch.
These elements—the concepts, process, and existing architecture—provided a
basis for the implementation of arbitrary ethernet masking, metadata support,
and SCTP support. The details of each of these features were explained and
linked to how they were implemented in the Open vSwitch codebase. Finally,
the implementation of each feature was assessed programmatically through
software tests and independently by upstream developers. This resulted in the
acceptance of eleven patches into the mainline codebase.
6.1 Impact
This project resulted in the addition of new OpenFlow features and general
improvements to the Open vSwitch codebase. Once the review process for SCTP
support is complete, this will also involve the inclusion of code
into the Linux kernel. This work provides additional flexibility to researchers
seeking to use a production-quality software switch in their experiments.
In terms of its positioning as part of Project W, this project pushes one of
the LSR components closer to compatibility with OpenFlow 1.1. This version
of the protocol standardised extensions used by Kempf et al. for the original
OpenFlow-based LSR platform. The contributions of this project are a step
towards providing a more accessible open source LSR.
6.2 Future Work
Implementing features for OpenFlow 1.1 and above in Open vSwitch is an
active development area. While the majority of work towards OpenFlow 1.1
support in Open vSwitch has been contributed, there are some remaining
features. To provide an interoperable platform with support for a wide range
of features, support for the later 1.2 and 1.3 protocols will also need to be
developed. These developments would provide the research community with a
fully OpenFlow-compliant datapath. Further work will need to be carried out
in the broader community to provide support for these versions of OpenFlow
in controller software.
References
Bellard, F. (2003). QEMU. Retrieved 20 October, 2012, from
http://www.qemu.org/.
Bianco, A., Birke, R., Giraudo, L., Palacin, M. (2010). OpenFlow Switching:
Data Plane Performance. In ICC, pp. 1–5. IEEE.
Big Switch Networks (2011). Indigo - Open Source OpenFlow Switches. Retrieved
8 October, 2012, from http://www.openflowhub.org/display/Indigo.
Big Switch Networks (2012). OFTest - Validating OpenFlow Switches. Retrieved
8 October, 2012, from http://oftest.openflowhub.org/.
Bird, C., Nagappan, N. (2012). Who? Where? What? Examining distributed
development in two large open source projects. In Lanza, M., Penta, M. D.,
Xi, T. (Eds.), MSR, pp. 237–246. IEEE.
Calhoun, P. R., Loughney, J., Arkko, J., Guttman, E., Zorn, G. (2003).
Diameter Base Protocol. RFC 3588.
Calleja, D. (2012). Linux 3.3. Retrieved 22 October, 2012, from
http://kernelnewbies.org/Linux_3.3.
CPqD (2012). OpenFlow Software Switch. Retrieved 8 October, 2012, from
http://github.com/CPqD/.
Donnet, B., Bonaventure, O. (2008). On BGP communities. SIGCOMM
Comput. Commun. Rev., 38 (2), 55–59. doi:10.1145/1355734.1355743.
Ericsson Research (2011a). OFTest for OpenFlow 1.1. Retrieved 8 October,
2012, from http://github.com/TrafficLab/oftest11.
Ericsson Research (2011b). OpenFlow 1.1 Software Switch. Retrieved 8 October,
2012, from http://github.com/TrafficLab/of11softswitch.
Erlang Solutions (2012). FlowForwarding/LINC-Switch. Retrieved 8 October,
2012, from http://github.com/FlowForwarding/LINC-Switch.
Herrod, S. (2012). VMware and Nicira - Advancing the Software-Defined
Datacenter. Retrieved 23 September, 2012, from
http://blogs.vmware.com/console/2012/07/.
Hölzle, U. (2012). OpenFlow @ Google. Open Networking Summit.
IBM Corporation (2001). LKSCTP. Retrieved 20 October, 2012, from
http://lksctp.sourceforge.net/.
IEEE Computer Society (2002). IEEE Standard for Local and Metropolitan
Area Networks: Overview and Architecture. New York, NY: IEEE.
Kempf, J., Whyte, S., Ellithorpe, J., Kazemian, P., Haitjema, M., Beheshti,
N., Stuart, S., Green, H. (2011). OpenFlow MPLS and the open source
label switched router. In Proceedings of the 23rd International Teletraffic
Congress, ITC ’11, pp. 8–14. ITCP.
Lezcano, D. (2008). lxc Linux Containers. Retrieved 20 October, 2012, from
http://lxc.sourceforge.net/.
Lezcano, D. (2012). Re: [Lxc-users] Alternative network protocols. Retrieved
20 October, 2012, from
http://www.mail-archive.com/[email protected]/msg03826.html.
McKeown, N., Anderson, T., Balakrishnan, H., Parulkar, G., Peterson, L.,
Rexford, J., Shenker, S., Turner, J. (2008). OpenFlow: enabling innovation
in campus networks. SIGCOMM Comput. Commun. Rev., 38 (2), 69–74.
Open Networking Foundation (2012). The OpenFlow Switch Specification (1.2 and
above). Retrieved 22 March, 2012, from
http://www.opennetworking.org/about/onf-documents.
Open vSwitch (2012a). Open vSwitch Documentation. Retrieved 20 October,
2012, from http://openvswitch.org/support/.
Open vSwitch (2012b). Open vSwitch Mailing Lists. Retrieved 20 October,
2012, from http://openvswitch.org/mlists/.
OpenFlow.org (2011). The OpenFlow Switch Specification (1.0,1.1). Retrieved
22 October, 2012, from http://www.openflow.org/wp/documents/.
Pfaff, B. (2011). Call for assistance: OpenFlow 1.1 and 1.2 support in Open
vSwitch. Retrieved 21 March, 2012, from
http://www.mail-archive.com/[email protected]/msg06532.html.
Pfaff, B. (2012a). [PATCH v2 00/11] instruction apply-actions/goto-table
support. Retrieved 20 October, 2012, from
http://www.mail-archive.com/[email protected]/msg11500.html.
Pfaff, B. (2012b). [PATCH v2] ofp-actions: Implement writing to metadata
field. Retrieved 20 October, 2012, from
http://www.mail-archive.com/[email protected]/msg11290.html.
Pfaff, B., Pettit, J., Koponen, T., Amidon, K., Casado, M., Shenker, S. (2009).
Extending networking into the virtualization layer. In HotNets-VIII.
Red Hat, Inc (2005). Libvirt: The virtualization API. Retrieved 20 October,
2012, from http://libvirt.org/.
Red Hat, Inc (2007). Kernel Based Virtual Machine. Retrieved 20 October,
2012, from http://www.linux-kvm.org/.
Riel, R. (2006). UpstreamMerge/SubmitChecklist. Retrieved 22 October,
2012, from http://kernelnewbies.org/UpstreamMerge/SubmitChecklist.
Stanford University (2012). Clean Slate Design for the Internet. Retrieved 12
October, 2012, from http://cleanslate.stanford.edu/.
Stewart, R. R. (2007). Stream Control Transmission Protocol. RFC 4960.
Traina, P., Chandrasekeran, R., Li, T. (1996). BGP Communities Attribute.
RFC 1997.
Glossary
action: How the forwarding element should handle particular matches.
AutoTest: Allows developers to create platform-independent test cases. Part of
GNU Build tools.
BGP: Border Gateway Protocol. The de facto standard protocol for sharing
routes with other organisations.
black-box: Testing the behaviour of a component based on sending particular
inputs to the component, and monitoring the outputs.
controller: The part of a forwarding element which is responsible for route
aggregation and advertisement.
CRC32c: The checksum algorithm used in SCTP to detect transmission or
storage errors for the data contained in a packet.
datapath: The part of a forwarding element which is responsible for the
forwarding of packets. Typically used to refer to a software or hardware
switch.
flow: A class of packets, as determined through a set of common attributes
such as the same source or destination address.
flow entry: A combination of a match and action.
forwarding element: A router or switch.
Forwarding Information Base: A table which stores forwarding entries for fast
packet switching.
Host Byte Order: The endianness of bytes when the data is stored in memory.
Architecture dependent.
Label-Switched Router: A type of router that provides label-switching
functionality. This functionality is used to simplify traffic management in
high-traffic network environments.
match: The classification of a packet based on particular field values.
Network Byte Order: The endianness of bytes when the data is written to the
network interface. Typically standardised as big-endian (most significant
bytes first).
Sparse: The semantic parser, a compiler frontend and static code analyzer for
ANSI C programs.
white-box: Testing the behaviour of a component with reference to the
component's structure, often by using shortcut code that allows direct access
to internal functions.