Central Control Over Distributed Routing (Extended Version)

http://fibbing.net
Stefano Vissicchio∗, Olivier Tilmans∗, Laurent Vanbever†, Jennifer Rexford‡
∗ Université catholique de Louvain, † ETH Zurich, ‡ Princeton University
[email protected], †[email protected], ‡[email protected]
ABSTRACT
Centralizing routing decisions offers tremendous flexibility,
but sacrifices the robustness of distributed protocols. In this
paper, we present Fibbing, an architecture that achieves both
flexibility and robustness through central control over distributed routing. Fibbing introduces fake nodes and links
into an underlying link-state routing protocol, so that routers
compute their own forwarding tables based on the augmented
topology. Fibbing is expressive, and readily supports flexible load balancing, traffic engineering, and backup routes.
Based on high-level forwarding requirements, the Fibbing
controller computes a compact augmented topology and injects the fake components through standard routing-protocol
messages. Fibbing works with any unmodified commercial
routers speaking OSPF. Our experiments also show that it
can scale to large networks with many forwarding requirements, introduces minimal overhead, and quickly reacts to
network and controller failures.
1. INTRODUCTION

Consider a large IP network with hundreds of devices, including the components shown in Fig. 1a. A set of IP addresses (D1) sees a sudden surge of traffic, from multiple entry points (A, D, and E), that congests a part of the network. As a network operator, you suspect a denial-of-service (DoS) attack, but cannot know for sure without inspecting the traffic, as it could also be a flash crowd. Your goal is therefore to: (i) isolate the flows destined to these IP addresses, (ii) direct them to a scrubber connected between B and C, in order to “clean” them if needed, and (iii) reduce congestion by load-balancing the traffic on unused links, like (B, E).

[Figure 1: (a) Initial topology; (b) Augmented topology. Legend: router, fake node, source, destination, scrubber, flow, IGP weight.]
Figure 1: Fibbing can steer the initial forwarding paths (see (a)) for D1 through a scrubber by adding fake nodes and links (see (b)).

Performing this routine task is very difficult in traditional networks. First, since the middlebox and the destinations are not adjacent to each other, the configuration of multiple devices needs to change. Also, since intra-domain routing is typically based on shortest-path algorithms, modifying the routing configuration is likely to impact many other flows not involved in the attack. In Fig. 1a, any attempt to reroute flows to D1 would also reroute flows to D2 since they home to the same router. Advertising D1 from the middlebox would attract the right traffic, but would not necessarily alleviate the congestion, because all D1 traffic would traverse (and congest) path (A, D, E, B), leaving (A, B) unused. Well-known Traffic-Engineering (TE) protocols (e.g., MPLS RSVP-TE [1]) could help. Unfortunately, since D1 traffic enters the network from multiple points, many tunnels (three, on A, D, and E, in our tiny example) would need to be configured and signaled. This increases both control-plane and data-plane overhead.

Software Defined Networking (SDN) could easily solve the problem as it enables centralized and direct control of the forwarding behavior. However, moving away from distributed routing protocols comes at a cost. Indeed, IGPs like OSPF and IS-IS are scalable (support networks with hundreds of nodes), robust, and quickly react to failures. Building an SDN controller with comparable scalability and reliability is challenging. It must compute and install forwarding rules for all the switches, and respond quickly to topology changes. Even the simple task of updating the switch rule tables can then become a major bottleneck for a central controller managing hundreds of thousands of rules in hundreds of switches. In contrast, distributed routing protocols naturally parallelize this work. For reliability and scalability, an SDN controller should also be replicated and geographically distributed, leading to additional challenges in managing controller state. Finally, the deployment of SDN as a whole is a major hurdle as many networks have a huge installed base of devices, management tools, and human operators that are not familiar with the technology. As a result, existing SDN deployments are limited in scope, e.g., new deployments of private backbones [8, 9] and software deployments at the network edge [10].

Criterion | centralized/SDN (OpenFlow [2], PCE [3], SR [4]) | distributed/traditional (IGP [5, 6], RSVP-TE [1]) | hybrid (Fibbing)
forwarding paths: configuration | simple (declarative & global) | complex (indirect & per-device) | simple (declarative & global)
forwarding paths: manageability | high (direct control) | low [7, 8] (need for coordination) | high (direct control)
forwarding paths: path installation | slow (by controller, per-device) | fast (by device, distributed) | fast (by device, distributed)
robustness: network failures | slow (by controller) | fast (local) | fast (local)
robustness: controller failures | hard (ad-hoc synch) | native (distributed) | easy (synch via IGP)
robustness: partitions | hard (uncontrollable devices) | best (distributed) | best (fallback on distributed)
routing policies | highest (any path) | low for IGP (shortest paths); highest for RSVP (any path) | high (any non-loopy paths)
Table 1: Fibbing combines the advantages of existing control planes, avoiding the main drawbacks.

∗ S. Vissicchio is a postdoctoral researcher of F.R.S.-FNRS.
This paper introduces Fibbing, a technique that offers
direct control over the routers’ forwarding information
base (FIB) by manipulating the input of a distributed
routing protocol. Fibbing relies on traditional link-state
protocols such as OSPF [5] and IS-IS [6], where routers
compute shortest paths over a synchronized view of the
topology. Fibbing controls routers by carefully lying to
them, removing the need to configure them. It coaxes
the routers into computing the target forwarding entries by presenting them with a carefully constructed
augmented topology that includes fake nodes (providing fake announcements of destination address blocks)
and fake links (with fake weights). In essence, Fibbing
inverts the routing function: given the forwarding entries (i.e., the desired output) and the routing protocol
(i.e., the function), Fibbing automatically computes the
routing messages to send to the routers (i.e., the input).
Fibbing can solve the problem in Fig. 1a by adding two fake nodes (Fig. 1b), connected to A and E with the depicted weights. Both fake nodes advertise that they can reach D1 directly. Based on the augmented topology, D starts to use A to reach D1, as the new cost (3) is lower than the original one (6). A and E also select different paths. Since the fake nodes do not really exist, packets forwarded by A or E actually flow through B. Routers B and C do not change their forwarding decisions.
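The effect of a fake node on a router's shortest-path computation can be illustrated with a small sketch. The topology, the weights, and all names below (`shortest_cost`, routers `R1` to `R3`, the node `fake`) are hypothetical and do not reproduce Fig. 1. The code only shows that a fake node advertising the destination at a low cost changes the path computed by the router it is attached to, while another router keeps its original path.

```python
import heapq

def shortest_cost(graph, src, dst):
    """Plain Dijkstra over {node: {neighbor: weight}}; returns (cost, path)."""
    heap = [(0, src, [src])]
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(heap, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []

# Hypothetical initial topology: destination prefix "d" is attached to R3.
initial = {
    "R1": {"R2": 3, "R3": 10},
    "R2": {"R1": 3, "R3": 3},
    "R3": {"R1": 10, "R2": 3, "d": 1},
}

# Augmented topology: a fake node attached to R1 claims to reach "d" directly.
augmented = {node: dict(nbrs) for node, nbrs in initial.items()}
augmented["R1"]["fake"] = 1
augmented["fake"] = {"d": 1}

before = shortest_cost(initial, "R1", "d")      # R1 goes through R2
after = shortest_cost(augmented, "R1", "d")     # R1 now prefers the fake path
unchanged = shortest_cost(augmented, "R2", "d") # R2 keeps its original path
```

In a real deployment the fake edge would be mapped onto a physical next-hop (for instance the direct link from `R1` to `R3` in this toy topology); that mapping is not modeled here.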
Table 1 gives an overview of how Fibbing improves
flexibility and manageability by adopting an SDN-like
approach while keeping the advantages of distributed
protocols (e.g., robustness and fast FIB modifications).
Fibbing is expressive. Fibbing can steer flows along any set of per-destination loop-free paths. In other words, it can exert full control at a per-destination granularity. For this reason, Fibbing readily supports advanced forwarding applications such as: (a) traffic engineering, (b) load balancing, (c) fast failover, and (d) traffic steering through middleboxes. By relying on destination-based routing protocols, Fibbing does not support finer-grained routing and forwarding policies such as matching on port numbers. However, those policies can easily be supported via middleboxes.

Fibbing scales and is robust to failures. Lying to routers is powerful but challenging. Indeed, Fibbing must be fast in computing augmented topologies to avoid loops and blackholes upon network failures. At the same time, Fibbing must compute small augmented topologies since routers have limited resources. Finally, Fibbing must be reliable and gracefully handle controller failures. We address all three challenges.

Fibbing differs from previous approaches that rely on routing protocols to program routers. Prior approaches like the Routing Control Platform [11] rely on BGP as a “poor man’s” SDN protocol to install a forwarding rule for each destination prefix on each router. In contrast, Fibbing leverages the routing-protocol implementation on the routers. Doing so, Fibbing can adapt the forwarding behavior of many routers at once, while allowing them to compute forwarding-table entries and converge on their own. That is, while the controller computes the routing input centrally, the routing output is still computed in a distributed fashion.

Fibbing works on existing routers. We implemented a fully-functional Fibbing prototype and used it to program real routers (both Cisco and Juniper). Based on an augmented topology, these routers can install hundreds of thousands of forwarding entries with an average installation time of less than 1 ms per entry. This offers much greater scale and faster convergence than is possible with state-of-the-art SDN switches [12, 13], without requiring the deployment of new equipment and per-device actions from the controller. This also means that Fibbing can implement recent SDN proposals, like Google’s B4 [8] and Microsoft’s SWAN [9], on top of existing networks.

Our earlier work showed that Fibbing can enforce any set of forwarding DAGs [14]. This paper goes further by describing the complete design, implementation, and evaluation of a Fibbing controller managing intra-domain routing. Rather than focusing on specific use cases (like traffic engineering), we describe its support for different higher-level approaches (e.g., [8, 9]). We make the following contributions:
• Abstraction: We show how to express and realize high-level forwarding requirements by manipulating a distributed link-state routing protocol (§2).
• Algorithms: We propose new, efficient algorithms to compute compact augmented topologies (§3).
• Implementation: We describe a complete Fibbing implementation which works with unmodified Cisco and Juniper routers (§4).
• Evaluation: We show that our Fibbing controller quickly generates small augmented topologies, inducing minimal load on routers (§5).

2. FLEXIBLE FIBBING

The Fibbing workflow proceeds in four consecutive stages based on two inputs: the desired forwarding graphs (one Directed Acyclic Graph, or DAG, per destination) and the IGP topology (Fig. 2). The forwarding DAGs can either be provided directly or expressed indirectly, at a high level, using a simple path-based language. In the latter case, the Compilation (§2) stage starts by compiling the requirements into concrete forwarding DAGs. Then, the Topology Augmentation (§3) stage computes an augmented topology that achieves these forwarding DAGs. While computing an augmented topology is fast, the resulting topology can be large. As such, the job of the next Topology Optimization (§3) stage is to reduce the augmented topology while preserving the forwarding paths. Finally, the Injection & Monitoring (§4) stage turns fake information into actual “lies” that the controller injects into the network.

[Figure 2: stages Compilation (§2), Augmentation (§3), Optimization (§3), and Injection/Monitoring (§4). Inputs: path requirements and network topology. Intermediate outputs: per-destination forwarding DAGs, augmented topology, reduced topology. Target: running network.]
Figure 2: The four-staged Fibbing workflow.

In this section, we present the high-level language and compilation process (§2.1), and show that Fibbing can express a wide range of forwarding behaviors (§2.2).

2.1 Fibbing high-level language

The Fibbing language (Fig. 3) provides operators with a succinct and easy way to specify their forwarding requirements. A Fibbing policy is a collection of requirements, naturally expressed as regular expressions on paths. Each requirement is either a path requirement which specifies how traffic should flow to a given destination, or a backup requirement which specifies how traffic should flow if the IGP topology changes. Each path requirement is recursively defined as a composition of path requirements through logical AND and OR. Operators can express load-balancing requirements using a conjunction of n requirements. Similarly, they can ensure that traffic to a specific destination will take one of n paths (e.g., containing a firewall) using disjunction. Path requirements are composed of a sequence of node requirements. A node requirement is either a node (router) identifier or the wildcard *, representing any sequence of nodes. Like path requirements, node requirements can be combined using logical AND and OR. Whenever no path requirement is stated, the original IGP paths should be used. This way, operators do not have to express all the unmodified paths, only deviations from the IGP shortest paths.

pol ::= (s1 ; . . . ; sn)                       Fibbing Policy
s   ::= p | b                                   Requirement
r   ::= p1 and p2 | p1 or p2 | p                Path Req.
p   ::= Path(n+)                                Path Expr.
n   ::= id | ∗ | n1 and n2 | n1 or n2           Node Expr.
b   ::= r as backupof ((id1 , id2)+)            Backup Req.
Figure 3: Syntax of Fibbing high-level language.

The following example illustrates the main features of the language. It states that traffic between E and D1 should be load-balanced on two paths, traffic between A and D2 should cross B or C, and that traffic from F to D3 should be rerouted via G if the link (F, H) fails.

( ( E , C , D1 ) and ( E , G , D1 ) ;
( ( A , ∗ , B , ∗ , D2 ) or ( A , ∗ , C , ∗ , D2 ) ) ;
( F , G , ∗ , D3 ) as backupof ( ( F , H ) ) ) ; )

Fibbing policies are compiled into per-destination forwarding DAGs by finding convenient network paths (if any). Compilation works in two consecutive stages. First, the compiler expands any requirement with wildcards into paths. This step can be computationally expensive as, in general, a network can have a number of paths exponential in the number of nodes. While this is unlikely, especially for networks designed according to best practices, we bounded the number of paths that can be expanded out of a single requirement. We only expand again if no solution is found with the current set of paths. Once all requirements are expanded, the compiler groups them by destination and computes the Disjunctive Normal Form (DNF) of each requirement. To finally produce a forwarding DAG, the compiler iterates over the disjunction of path requirements and checks whether the resulting graph is loop-free.
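The two compilation stages of §2.1 (bounded wildcard expansion, then a loop-freeness check on the candidate forwarding graph) can be sketched as follows. This is a minimal illustration under stated assumptions, not the compiler described in the paper: the function names `expand` and `is_loop_free`, the `limit` parameter, and the four-node topology are all hypothetical, the DNF step is omitted, and a pattern is assumed to start with a concrete node identifier.

```python
def expand(pattern, graph, limit=16):
    """Expand a path pattern (node ids and "*" wildcards) into at most
    `limit` concrete simple paths of `graph` ({node: [neighbors]})."""
    results = []

    def walk(path, idx):
        if len(results) >= limit:
            return  # bound the number of expanded paths
        if idx == len(pattern):
            results.append(tuple(path))
            return
        token = pattern[idx]
        if token == "*":
            walk(path, idx + 1)  # the wildcard matches no node...
            for nxt in graph[path[-1]]:
                if nxt not in path:  # ...or one more node (simple paths only)
                    walk(path + [nxt], idx)
        elif token in graph[path[-1]] and token not in path:
            walk(path + [token], idx + 1)

    walk([pattern[0]], 1)
    return results

def is_loop_free(paths):
    """True if the union of the given paths forms a directed acyclic graph."""
    succ = {}
    for path in paths:
        for u, v in zip(path, path[1:]):
            succ.setdefault(u, set()).add(v)
            succ.setdefault(v, set())
    state = {}  # 1 = on the DFS stack, 2 = fully explored

    def visit(u):
        if state.get(u) == 1:
            return False  # back edge: forwarding loop
        if state.get(u) == 2:
            return True
        state[u] = 1
        ok = all(visit(v) for v in succ[u])
        state[u] = 2
        return ok

    return all(visit(u) for u in succ)

# Hypothetical topology and requirement ( A , * , B , * , D ).
topology = {"A": ["B", "C"], "B": ["A", "C", "D"],
            "C": ["A", "B", "D"], "D": ["B", "C"]}
candidates = expand(("A", "*", "B", "*", "D"), topology)
```

Combining two candidates that traverse the link between B and C in opposite directions yields a loop, which is the kind of combination the final check is meant to reject.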
2.2 Fibbing expressiveness
Beyond steering traffic along a given path (§1), we
now show that Fibbing can also: (i) balance load over
multiple paths; and (ii) provision backup paths.
Fibbing can forward any flow on any set of paths. Fibbing can load-balance traffic over multiple paths to maximize throughput, minimize response time, or increase reliability. For example, consider the network in Fig. 4a where three sources S1, S2, and S3 send traffic to three corresponding destinations. Demands and link capacities are such that link (F, G) is congested. One way to alleviate congestion is to split traffic destined to D2 over the top (via (B, C)) and bottom (via (F, G)) paths. Load-balancing traffic coming from E on multiple paths is possible under conventional link-state routing (e.g., by re-weighting link (F, G) to 15). However, this would force the traffic from S2 and S3 to spread over both paths, creating congestion. More generally, it is impossible to route the traffic destined to D2 and D3 on different links under conventional link-state routing.

This simple requirement can easily be expressed as:

( ( S2 , E , B , C , H , D2 ) and ( S2 , E , F , G , H , D2 ) )

Fig. 4b shows the augmented topology which realizes this requirement. A fake node announcing D2 (with a weight of 6) is inserted between E and B. After introducing this node, E has two shortest paths (of cost 7) to reach D2 and, hence, splits D2 traffic over B and F. In this example, Fibbing enables maximum network efficiency as each link is used to its full capacity.

[Figure 4: (a) Initial topology; (b) Augmented topology. Annotations: demand, link capacity.]
Figure 4: Fibbing supports multi-path forwarding. Here, it avoids the initial congestion (see (a)) by load-balancing traffic for D2 (see (b)).

Fibbing can provision backup paths for any flow. Fibbing can provision backup paths to prevent congestion or increased delays after link and node failures. As an illustration, consider the network in Figure 5a. The failure of link (E, F) leads to congestion since traffic flows for D1 and D2 are both rerouted to the same path via link (A, B). To prevent congestion, traffic destined to D1 and D2 should be split over the two remaining disjoint paths but only upon failure on the path (A, D, E, F). This is another example of a requirement that is impossible to achieve with link-state routing, and would require significant control-plane overhead in MPLS. In contrast, it is easily done with Fibbing. Backup paths can be specified in our language as:

( A , B , ∗ , D1 ) and ( A , G , ∗ , D2 )
as backupof ( ( A , D ) or ( D , E ) or ( E , F ) )

Fig. 5b shows the corresponding augmented topology, which has a single fake node advertising D2. The weights are set to prevent A from using the fake node to reach D2 unless a failure occurs along the path (A, D, E, F). While successful for this example, Fibbing cannot satisfy all possible requirements for backup paths (§3.3).

[Figure 5: (a) Initial topology; (b) Augmented topology.]
Figure 5: Fibbing can provision backup paths. Here, possible congestion upon a link failure (see (a)) is avoided by adding a fake node (see (b)).

3. AUGMENTING TOPOLOGY

In this section, we detail the augmentation problem (§3.1), and we show how the Fibbing controller quickly computes small augmented topologies from a set of forwarding DAGs. We rely on a divide-and-conquer approach based on three consecutive steps.

1) Topology initialization (§3.2): We modify the initial weights in the link-state protocol (if necessary), to guarantee that any set of forwarding DAGs can be enforced by Fibbing. If needed, this operation has to be done only once, when Fibbing is first deployed.

2) Per-destination augmentation (§3.3): Starting from an initialized topology, we compute a suitable augmentation, individually for every destination of an input forwarding DAG. We designed two algorithms for this step, achieving different trade-offs between computation time and augmentation size. The fastest one, Simple, can compute augmented topologies within milliseconds, and works by injecting a dedicated fake node for every router that changes its next-hop. The relatively slower one, Merger, reduces the augmentation size by re-using the same fake nodes to program multiple routers. Simple and Merger are suited for different goals. The speed of Simple is useful for quick failure reaction. In contrast, Merger can be run in background to progressively re-optimize the augmented topology. We evaluate the trade-offs achieved by each algorithm in §5.

3) Optimization across destinations (§3.4): We merge the augmentations obtained in the per-destination augmentation step to further reduce the number of fake nodes and edges. Namely, whenever safe, we replace multiple fake nodes announcing different destinations with a single fake node which either announces all the destinations or creates a new path (a shortcut) between routers in the augmented topology.
3.1 The Topology Augmentation Problem

We start the description of the topology augmentation algorithms by precisely defining the basic concepts on which they rely and the problem that they solve.

Fake nodes scoping. A Fibbing controller can generate both locally-scoped lies (targeted to a single router) and globally-scoped lies (targeted to all routers). Locally-scoped lies are useful as they enable local actions on one router without creating side effects on other routers. Globally-scoped lies affect the entire network. Hence, if carefully computed, they can reduce the size of the augmented topology. All of our previous examples used globally-scoped lies. We detail how to implement both kinds of lies in the current OSPF in §4.

Fake edges to forwarding next-hop mapping function. Fibbing can modify the routing path computed by any IGP router r. In particular, it can augment the IGP topology so that r’s shortest path is no longer the one in the original topology but includes some fake sub-paths. Throughout the paper, we assume that a fake edge in the shortest path from any router r to any destination d corresponds to the ability to force the next-hop of r for d to be any of its neighbors. In the example in Fig. 1, for instance, the fake edge between A and its adjacent fake node translates into A forwarding traffic to B. We discuss in §4 how to achieve this ability in the current OSPF protocol, as well as in future IGPs.

Topology augmentation problem. Since we assume an arbitrary mapping between fake edges and forwarding next-hops, the topology augmentation problem is defined as follows: Given an initial topology G and a set of forwarding DAGs, compute an augmented topology G′ ⊃ G such that for each path (u, v, . . . , d) in the forwarding DAG for d, the next-hop of u in one of its shortest paths for d in G′ is either v or a fake node.

3.2 Topology Initialization

In the topology initialization, we scale the link weights of the original IGP topology G to guarantee arbitrary per-destination control through Fibbing and help reduce the size of topology augmentations. In particular, we proportionally increase link weights (multiplying them by a constant factor) if they are too low in G. Moreover, we set a very high announcement cost for any destination, at least equal to the length of the longest path in G times the maximum link weight.

Topology initialization enables full Fibbing expressivity. Indeed, it makes the IGP topology Fibbing compliant, which provably avoids cases in which a forwarding DAG cannot be implemented by Fibbing (see Appendix A). We say that a topology is Fibbing compliant if for every destination d, the cost of the shortest path from every router (including the ones announcing d) to d exceeds 2. In Fibbing compliant topologies, for any router r and destination d the controller can always compute a fake path P such that (i) P is shorter than the original shortest path from r to d; and (ii) P is longer than the original shortest path from any other router v ≠ r to d. As proved by the following theorem, this implies the ability of Fibbing to forward flows for the same destination on any set of loop-free paths.

Theorem 1. Any set of per-destination forwarding DAGs can always be enforced by augmenting a Fibbing-compliant topology even only with globally-scoped lies.

Proof. We prove the statement by showing a simple topology augmentation procedure. Let G be the initial topology. For every forwarding DAG with destination d, we add for each node r in the network a fake node f_r announcing d. This generates a new fake path (r, f_r, d) in the augmented topology. We set the total cost of this newly added fake path to 2. Since G is Fibbing compliant, the cost of the shortest path from r to d in G is greater than 2. Hence, the shortest path of every node r in the augmented topology will be (r, f_r, d). The forwarding DAG is then implemented by mapping the fake link on the right physical link.

Note that Theorem 1 applies to destinations in the augmented topology. Those destinations do not need to match the destination prefixes announced in the original IGP. Hence, Fibbing allows controlling flows for de-aggregations of the original IGP destinations (up to IP address granularity) or even non-overlapping prefixes.

Topology initialization is non-intrusive. Since it is based on the adaptation of a few configurable parameters, our initialization procedure can be applied to any link-state routing configuration, and preserves the original forwarding paths. It can be carried out in a running network, using known lossless reconfiguration techniques [15]. Moreover, it is strictly needed no more than once in the network lifetime. Indeed, since Fibbing compliance does not depend on the routing requirements or the presence of specific links, any topology remains Fibbing compliant independently of new requirements or the failures of nodes or links. Finally, topologies growing in size can be easily kept Fibbing compliant by ensuring that the new destinations are announced with high costs and the new links have weights consistent with pre-existing ones.

3.3 Per-destination augmentation

We now describe Simple and Merger. We use Figure 6 to illustrate the difference between the two algorithms.
3.3.1 Simple

Simple relies solely on locally-scoped lies to avoid having to compute any fake path cost. For every destination d and corresponding forwarding DAG D, the algorithm adds fake nodes to each router whose next-hops in the original topology differ from those in D. Precisely, for every router r that changes its next-hop for d, Simple adds a fake node f_{r,d} and a fake link (r, f_{r,d}). Node f_{r,d} announces d to r with a locally-scoped lie. We set the total cost of path (r, f_{r,d}, d) to 2. Since the topology is Fibbing compliant, r is ensured to change its shortest path. Also, since the lie is locally-scoped, other routers are not affected by it.

[Figure 6: (a) Requirements; (b) Simple augmentation; (c) Merger augmentation. Legend: initial, required, locally-scoped node, globally-scoped node.]
Figure 6: Outcome of our per-destination augmentation algorithms. Simple produces larger topologies (with locally-scoped lies) in a very short time, while Merger reduces the size of the augmentation by relying on globally-scoped lies and a longer computation.

Figure 6b shows the output of Simple for the example of Figure 6a. Nodes A, B, C and X are required to change their respective next-hops. Moreover, A needs to load-balance on B and C. Thus, Simple creates five locally-scoped nodes (two connected to A), all providing fake paths to the destination with a cost of 2.
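A minimal sketch of Simple for one destination is given below, assuming that next-hops are available as sets per router and that one fake node is created per required next-hop of each router whose next-hop set changes (our reading of the two fake nodes attached to A). The `Lie` record, the fake-node naming scheme, and all input values are hypothetical; in particular, the original next-hops and distances are made up and are not read from Figure 6. The sketch also checks the Fibbing-compliance condition of §3.2 before adding any lie.

```python
from typing import Dict, List, NamedTuple, Set

class Lie(NamedTuple):
    router: str       # router the fake node attaches to (sole recipient of the lie)
    fake_node: str    # hypothetical name standing for f_{r,d}
    destination: str
    path_cost: int    # total cost of the fake path (r, f_{r,d}, d)
    maps_to: str      # physical next-hop the fake edge resolves to

FAKE_PATH_COST = 2

def is_fibbing_compliant(dist_to_d: Dict[str, int]) -> bool:
    """Compliance for one destination: every shortest-path cost exceeds 2."""
    return all(cost > FAKE_PATH_COST for cost in dist_to_d.values())

def simple(dest: str,
           original_nh: Dict[str, Set[str]],
           required_nh: Dict[str, Set[str]],
           dist_to_d: Dict[str, int]) -> List[Lie]:
    """Add locally-scoped lies for every router whose next-hops must change."""
    if not is_fibbing_compliant(dist_to_d):
        raise ValueError("topology is not Fibbing compliant; initialize it first")
    lies = []
    for router in sorted(required_nh):
        wanted = required_nh[router]
        if wanted == original_nh.get(router, set()):
            continue  # next-hops preserved: no lie needed
        for nh in sorted(wanted):
            name = f"f_{router}_{nh}_{dest}"
            lies.append(Lie(router, name, dest, FAKE_PATH_COST, nh))
    return lies

# Hypothetical inputs: A must load-balance on B and C; B, C and X change next-hop.
original = {"A": {"F"}, "B": {"F"}, "C": {"D"}, "X": {"C"}, "F": {"D"}}
required = {"A": {"B", "C"}, "B": {"X"}, "C": {"X"}, "X": {"D"}, "F": {"D"}}
costs = {"A": 135, "B": 130, "C": 125, "X": 120, "F": 110}
lies = simple("d", original, required, costs)
```

Because every lie has the same fixed cost and is seen by a single router, no shortest-path computation is needed, which is consistent with the speed the text attributes to Simple.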
3.3.2 Merger

To reduce the number of fake nodes, Merger relies on globally-scoped lies that can change the forwarding behavior of multiple routers at once. When applied to Figure 6a, Merger creates only two fake nodes (see Fig. 6c) to change the next-hops of A, B, C and X, instead of the five used by Simple. The added fake nodes create load-balancing on A (cost 136) via B and C, as required.

Merger performs the topology augmentation for any destination d in two phases. First, it adds an excessive number of fake nodes, and computes the lower and upper bounds for their respective cost. Second, it merges fake nodes whenever possible, based on the value of the computed bounds. We now provide an intuitive description of those two phases. Additional details about them and Merger correctness proofs are reported in Appendix B.

Step 1. Fake bounds computation. Merger starts by adding fake nodes to every router r that is required to change the next-hop (according to the input forwarding DAG) for d. In this case, one new fake node f_{r,d} is connected to r for every new next-hop of r in the input forwarding DAGs. However, to enable merging of globally-visible fake nodes, Merger calculates lower and upper bounds for every newly-added fake path (r, f_{r,d}, d). Since the initial positioning of fake nodes is as in Simple, we illustrate bound computations referring to Figure 6b.

The upper bound ub(f_{r,d}) represents the maximum value of the fake path cost that changes r’s shortest path to (r, f_{r,d}, d). It is easy to compute by statically considering the non-augmented topology G. Indeed, it is equal to dist(r, d, G) − 1, where dist(u, v, G) is the cost of the shortest path from u to v in G.

The lower bound lb(f_{r,d}) represents the minimal value of the fake path cost that does not change the shortest path of any real node different from r. To compute it, we divide nodes in two sets, depending on whether the input forwarding DAG prescribes to change their respective next-hops or not.

For the next-hop preserving nodes whose shortest path does not traverse r in the input topology, we impose that their original shortest path is not modified by f_{r,d}. For example, when computing lb(f0) in Figure 6b, we ensure that f0 does not change the shortest path of F, by constraining lb(f0) + 25 > dist(F, d, G) = 110, that is, lb(f0) = 86. More generally, for every fake node f_{r,d} and every next-hop preserving neighbor n of r, we impose that lb(f_{r,d}) > dist(n, d, G) − dist(n, r, G).

For next-hop changing nodes (connected to other fake nodes), the final value of their shortest paths is not known in advance, but is determined by the augmentation itself. That is, the lower bound of any fake node generally depends on the lower bound of other fake nodes. For example, in Figure 6b, f4 changes the shortest path of X only if lb(f4) < dist(X, C, G) + lb(f3). To avoid that real nodes pass through fake nodes not directly connected to them, Merger runs a lower bound propagation procedure. This procedure takes as input the lower bounds initialized with values from next-hop preserving nodes. It then fixes one lower bound at a time, following a specific order and adjusting the others to be consistent with the fixed one. This order guarantees that each lower bound must be considered only once. Sometimes, lower bounds cannot be made consistent. Indeed, Theorem 1 does not provide guarantees if fake nodes are connected only to next-hop changing nodes. We solve these cases by using locally-visible lies.
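The initial bounds of Step 1 can be sketched as follows, assuming integer costs. The function names, the default minimum cost of 1, and the three-node topology are hypothetical; the weights are chosen only so that the numbers agree with the worked example in the text (dist(F, d, G) = 110, a cost of 25 between F and A, and a shortest path of cost 135 from A to the destination). The sketch covers only the initialization from next-hop preserving nodes, not the lower bound propagation procedure.

```python
import heapq

def dijkstra(graph, src):
    """Shortest-path costs from src over {node: {neighbor: weight}}."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def upper_bound(graph, r, dest):
    """ub(f_{r,d}) = dist(r, d, G) - 1: largest fake cost that still attracts r."""
    return dijkstra(graph, r)[dest] - 1

def initial_lower_bound(graph, r, dest, preserving):
    """Smallest integer lb with lb > dist(n, d, G) - dist(n, r, G) for each
    next-hop preserving node n (before any propagation)."""
    bound = 1  # assumption: fake path costs are positive integers
    for n in preserving:
        dist_n = dijkstra(graph, n)
        bound = max(bound, dist_n[dest] - dist_n[r] + 1)
    return bound

# Hypothetical topology reproducing the costs quoted in the text.
G = {"A": {"F": 25}, "F": {"A": 25, "d": 110}, "d": {}}
ub = upper_bound(G, "A", "d")
lb = initial_lower_bound(G, "A", "d", preserving=["F"])
```

A fake node is usable only when its lower bound does not exceed its upper bound, which holds for the values computed here.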
Step 2. Fake nodes merging. In this step, Merger
tries to merge fake nodes together. More precisely, it
iterates over every simple path from a source to d in
the input forwarding DAG. For each of those paths, it
merges pairs of fake nodes whenever safe.
To assess when it is safe to merge a fake node f′
into f″, Merger sequentially performs three checks. We
illustrate these checks by considering the required path
(A, B, X, D) and the merge of f1 into f2 in Figure 6b.
First, Merger assesses whether the IGP shortest paths
are compliant with the considered source-sink path in
the DAG. In our example, it verifies that the shortest path
from A to B (i.e., (A, B)) is a sub-path of
the required (A, B, X, D). If A had predecessors in the
required path not connected to a fake node, this check
would have been repeated for those predecessors as well.
Second, Merger checks whether f″ can be used as
part of the shortest path of the real node connected
to f′ without changing the current next-hops of any
node. To this end, the algorithm assesses the existence of
feasible post-merging bounds for f″. More precisely, it
re-computes the modified lower bound of f″ as the minimum
value (if any) that forces the new shortest path
of the real node connected to f′ and its next-hop preserving
predecessors in the required path through f″,
without affecting nodes previously not crossing f″. In
our example, the lower bound of f2 is modified to exclude
the constraint of not changing A's next-hop, hence
it is decreased to lb(f2) = 81 (it was greater before, for
A's next-hop to be f1). The upper bound of f″ is also
modified to ensure that the real node connected to f′
and its next-hop preserving predecessors use the fake
path via f″. In our example, ub(f2) is modified to 129,
as the cost of the original shortest path from A to the
destination is 135 and that of the path between A and B is 5.
Third, Merger simulates the merge to assess whether
all bounds can be consistently adjusted network-wide,
given that merging f′ into f″ would change the
bounds of f″ and remove one fake node. To this end, we
re-run the lower bound propagation procedure, devoting
special attention to fake nodes used for load-balancing.
In our example, for instance, we constrain the lower
bound of f0 to be equal to the cost of path (A, B, f2, d),
meant to be used by A after the merging.
If all three checks pass, then Merger actually
performs the merge, by removing f′ and updating the
bounds of all other fake nodes (including f″) according
to the values computed during the last check.
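The upper-bound update in the second check can be retraced numerically. The sketch below is only an interpretation of the example: it assumes integer link weights, so that "strictly cheaper than the original shortest path" becomes "minus 1", and it assumes the bound is measured from the real node to which the fake node is attached. The function name is hypothetical.

```python
def merged_upper_bound(original_cost: int, prefix_cost: int) -> int:
    """Largest fake-path cost, measured from the attachment node, that keeps
    the path through the fake node strictly cheaper for a predecessor.

    Assumption: integer weights, so "strictly smaller" means "minus 1".
    """
    # The predecessor pays prefix_cost to reach the attachment node and then
    # the fake-path cost; the sum must stay strictly below original_cost.
    return original_cost - prefix_cost - 1


# Values quoted in the running example: A's original shortest path to the
# destination costs 135, and the path from A to B costs 5.
ub_f2 = merged_upper_bound(original_cost=135, prefix_cost=5)
print(ub_f2)
```

Under these assumptions the function reproduces the value ub(f2) = 129 given in the text. The lower bound lb(f2) = 81 depends on path costs that the example does not list, so it is not recomputed here.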
3.3.3 Dealing with backup requirements
Backup requirements can be specified by providing
additional sets of (tagged) forwarding DAGs to our algorithms.
Let G0 be an augmented topology computed
to accommodate primary requirements. To deal with
backup ones, slight variants of Simple and Merger can
be run after the computation of G0. The main modification
of Simple consists in setting the cost of the
fake paths added for backup requirements to 3 instead
of 2. In contrast, backup requirements are supported
in Merger by imposing that lower bounds are always
greater than the cost of the shortest path in G0.
Contrary to primary requirements (see Theorem 1),
Fibbing may not enforce backup requirements, even in a
Fibbing-compliant topology (see Appendix C). Indeed,
if the cost of the original shortest path of a node r is
equal to a fake one (used for a primary requirement) in
G0, then backup paths different from the original shortest
path cannot be implemented on r. In those cases,
we notify the operator of the impossibility of implementing
the given backup paths.
[Figure 7: Cross-destination optimizations. (a) Fake destination
merging. (b) Fake shortcuts creation; the panel labels a globally
scoped shortcut and an asymmetric weight.]
3.4 Cross-Destination Optimization
Fake nodes computed on a per-destination basis may
be redundant. We reduce such redundancy in two ways,
namely, by (i) merging fake nodes connected to the same
real node, and (ii) replacing fake destination announcements
with fake paths connecting real nodes.
Cross-destination merging. After per-destination
augmentations, two fake nodes f1 and f2 announcing
different destinations d1 and d2 can be connected to the
same node r and used to force traffic to the same real
link (r, n). Those fake nodes can always be merged. Indeed,
we can replace f1 and f2 with a new fake node f′
such that (i) cost(r, f′) = min{cost(r, f1), cost(r, f2)},
and (ii) f′ announces both d1 and d2, with cost(f′, di) =
cost(r, fi, di) − cost(r, f′) for i = 1, 2. For example, assume
that in the network in Figure 6 additional destinations
are attached to C and F as in Figure 7a. The
result of the cross-destination merging is shown in Figure 7a,
where both A and X have a single fake neighbor
announcing multiple destinations (rather than multiple
fake neighbors each announcing a single destination).
This reduces the number of fake nodes from 4 to 2.
Creating shortcuts. One of the most appealing features
(unmatched by competitor solutions) of Fibbing
is that a single lie can change the paths for multiple
destinations. To this end, however, we need to replace fake
destination announcements with fake paths connecting
real nodes together, i.e., fake shortcuts.
Currently, we use fake shortcuts only if a real link
is never traversed in different directions. Consider, for
example, the link between X and D in Figure 7a. It is
traversed from X to D for the destinations attached to
D and F , and never from D to X. In those cases, we
try to transform X’s fake neighbor into a fake shortcut,
as in Figure 7b. Let u and v be the two real nodes
at the endpoints of the shortcut. First, we check if a
shortcut cost c exists such that all the shortest paths (see footnote 1)
are kept the same with and without the shortcut. If this
condition is met, given a fake node f used to enforce
the subpath (u, . . . , v) in the input forwarding DAGs,
all fake destinations announced by f are replaced by a
fake shortcut (u, f, v) and the cost of (u, f, v) is set to c.
Figure 7b illustrates that we found such a value c = 20
for X in our example. Also, we use asymmetric weights
to prevent the fake shortcut from being traversed in the
opposite direction. To this end, the cost of path (v, f, u)
is set to a very high value, e.g., by setting a high weight
of the directed link (v, f ). In Figure 7b, we indeed set
the weight of the link from D to the fake node in the
shortcut to 500.
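The cross-destination merging rule given earlier in this section, cost(r, f′) = min{cost(r, f1), cost(r, f2)} with each announcement absorbing the difference, can be checked with a short sketch. The function name and the numeric costs are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def merge_fake_nodes(
    cost_r_f1: int, cost_f1_d1: int, cost_r_f2: int, cost_f2_d2: int
) -> Tuple[int, Dict[str, int]]:
    """Merge fake nodes f1 (announcing d1) and f2 (announcing d2), both
    attached to the same real node r, into a single fake node f'.

    Returns cost(r, f') and the costs at which f' announces d1 and d2.
    """
    # (i) the link to the merged node takes the cheaper of the two links
    cost_r_fm = min(cost_r_f1, cost_r_f2)
    # (ii) each announcement is set so that the end-to-end cost
    # cost(r, f', di) equals the previous cost(r, fi, di)
    announced = {
        "d1": (cost_r_f1 + cost_f1_d1) - cost_r_fm,
        "d2": (cost_r_f2 + cost_f2_d2) - cost_r_fm,
    }
    return cost_r_fm, announced


# Hypothetical costs: r reaches d1 via f1 at total cost 10 and d2 via f2 at
# total cost 7.
link_cost, announced = merge_fake_nodes(
    cost_r_f1=1, cost_f1_d1=9, cost_r_f2=3, cost_f2_d2=4
)
print(link_cost, announced)
```

Because the total cost seen by r towards each destination is unchanged, the merge does not alter any shortest-path computation that goes through r, which is consistent with the claim that such fake nodes can always be merged.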
Footnote 1: By definition of the shortest path, we only need to check
the paths from v and from u's neighbors to each destination.
4. IMPLEMENTATION
We built a complete prototype of Fibbing in Python
(algorithmic part) and C (interaction with OSPF) by
extending Quagga [16]. The Fibbing code base spans over
2300 (resp. 400) lines of Python (resp. C) code. It
is available at http://www.fibbing.net. In this section,
we present our prototype (§4.1), describe how Fibbing
works with current OSPF routers (§4.2), and propose
two small modifications to link-state protocols that
would make Fibbing even more efficient (§4.3). Finally,
we describe how to ensure controller reliability (§4.4).
4.1 Fibbing Controller
Our prototype consists of three main components:
Fake topology generator applies (i) the compilation
algorithms (§2) to turn forwarding requirements into
forwarding DAGs and (ii) the augmentation algorithms
(§3) to convert the forwarding DAGs into fake nodes
and links. The topology generator uses a JSON interface
to register for update events produced by the
event manager. Upon network updates, the generator
automatically recomputes the augmented topology either
using Simple to ensure fast convergence, or Merger
to reduce the size of the topology. To ensure fast convergence
and a small augmented topology, the generator
also pre-computes augmentations with Merger, e.g.,
those needed for any single link failure, and stores them
in a deltas database.
Link-state translator interacts with the routers by
establishing routing adjacencies to inject lies and track
topology changes. Thanks to the flooding mechanism
used by link-state protocols, a single adjacency is sufficient
to send and receive all routing messages to/from
all routers. Maintaining several adjacencies is nevertheless
useful for reliability. In this case, the translator simply
injects the lies via all adjacencies. While this slightly
increases the flooding load, doing so does not impact the
routers' memory, as each message has a unique identifier
and routers only maintain one copy per ID in memory.
Event manager maintains an up-to-date view of
the network topology by (i) parsing the routing messages
collected by the translator and (ii) constructing
a network graph for the topology generator. The event
manager checks whether each new event (e.g., a node
failure or addition) affects any of the forwarding requirements.
If so, it first checks the deltas database for a
pre-computed lie, and otherwise notifies the topology
generator to request a new augmented topology.
4.2 Fibbing with Unmodified OSPF
Our Fibbing prototype works with unmodified OSPF-speaking
routers (tested on Cisco and Juniper). To create
lies, our prototype leverages the Forwarding Address
(FA) [17] field of OSPF messages. Suppose the
controller wants routers to think that destination d is
directly attached to the router with IP address y. Then,
the controller injects the route for d with a forwarding
address of y and the desired cost for the fake edge from
y to d. Router y ignores the message, and all other
routers compute the cost of the route as the sum of
their cost to y plus the cost in the message.
Locally-scoped lies in OSPF. To support locally-scoped
lies, we reserve a set of IP addresses to be used as
FAs. All those addresses are propagated network-wide
in OSPF. However, every router is configured not to install
routes to any of those addresses. Consequently, only
the directly-connected router can reach any of those addresses,
and accept routes specifying that address as FA.
Both allocation and configuration of FA-associated IP
addresses can be done just once in the network lifetime.
Globally-scoped lies in OSPF, and limitations.
OSPF readily supports globally-scoped lies by simply
propagating OSPF messages with the FA set to an IP
address announced in OSPF. However, some subtle constraints
hold in an unmodified OSPF network due to
how FAs are resolved on the router. Prominently, (i)
OSPF fake nodes are actually fake routes to the specified
FA, hence both their positioning and the path to be
used for reaching them are constrained; and (ii) an OSPF
router discards any OSPF message whose FA is one of
its own IP addresses and computes its shortest path according
to the topology without the fake node. The
combination of these two constraints limits the power of
globally-scoped lies in OSPF, making them insufficient
to implement all possible forwarding DAGs. An example
of those cases is reported in Appendix D.
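The cost rule for FA-based lies stated in §4.2 (every router except the one owning the forwarding address adds its own shortest-path cost to the FA to the cost carried in the message) can be sketched as follows. This is a simplified model of that single rule, not of OSPF's LSA processing; the topology, router names, and function names are hypothetical.

```python
import heapq
from typing import Dict, Optional

# Hypothetical encoding: router -> {neighbor: link weight}.
Graph = Dict[str, Dict[str, int]]


def shortest_costs(graph: Graph, source: str) -> Dict[str, int]:
    """Plain Dijkstra over the real topology."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return dist


def lie_cost(graph: Graph, router: str, fa_router: str, advertised: int) -> Optional[int]:
    """Cost at which `router` sees a destination injected with a forwarding
    address owned by `fa_router` and an advertised cost `advertised`."""
    if router == fa_router:
        return None  # the router owning the FA ignores the message
    return shortest_costs(graph, router)[fa_router] + advertised


# Illustrative three-router topology (assumed connected).
topology = {
    "A": {"B": 1, "Y": 10},
    "B": {"A": 1, "Y": 2},
    "Y": {"A": 10, "B": 2},
}
for r in topology:
    print(r, lie_cost(topology, r, fa_router="Y", advertised=3))
```

A router then compares this value with the cost of its real route to the destination, which is how the controller steers it by choosing the advertised cost.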
Overcoming OSPF limitations. To support the full
expressiveness of Fibbing, our prototype controller uses
an OSPF-compliant implementation of Merger with a
combination of locally and globally-scoped lies. This
implementation uses globally-scoped lies whenever possible,
and falls back to locally-scoped lies for any requirements
that cannot be met that way.
4.3 Proposed Protocol Enhancements
Other link-state protocols, like IS-IS [6], do not support
forwarding addresses, and even the OSPF implementation
of lies has limitations. However, minor protocol
extensions can enable more flexible Fibbing in future
routers. Fully-fledged Fibbing requires the routing
protocol to support two functions: (i) the creation
of adjacencies (with fake nodes) on the basis of a received
message; and (ii) a third-party next-hop mechanism,
which allows a route to specify the forwarding
next-hop to be used if that route is selected. Support
for these functions can be added to protocol specifications
(without impacting current functionalities), and
can be deployed through router software updates.
Preliminary discussions with router vendors confirm
that these changes are reasonable and could be integrated
into current protocol implementations. Moreover,
backwards compatibility is easily achieved as legacy
routers would simply ignore any Fibbing-specific protocol
features. Our algorithms can be modified to account
for the fact that the shortest path of a legacy router is
never changed by any fake node.
4.4 Controller Replication
Like any network component, a Fibbing controller can
fail at any time. Reliability can be ensured by running
multiple copies of the Fibbing controller in parallel and
connecting them to different places in the network.
No state needs to be synchronized between the replicas
besides the input forwarding requirements (mostly
static anyway). Indeed, Fibbing algorithms are deterministic,
hence all replicas will always compute exactly
the same augmented topology. The only dynamic state
maintained by a Fibbing replica is the network graph.
This state, however, is implicitly synchronized through
the shared topology offered by the underlying IGP: the
link-state flooding mechanism keeps the network graph
up to date and eventually consistent across all replicas.
The determinism of our algorithms enables all replicas
to inject (the same) lies at the same time. However,
this would increase the amount of flooded information.
To limit control-plane overhead, our implementation
relies on a primary-backup architecture with a single
active replica and an inexpensive election process. A
pre-defined range of router IDs is reserved for controller
replicas. The replica with the lowest router ID across
all running ones is the active replica. It is the only
one injecting lies, while the others only compute the
topology augmentation. When the controller is booted
or when an active replica fails, running replicas receive
IGP messages on the current network topology. Based
on those messages, every replica independently infers
the possible presence of other replicas; it also checks
whether it is the new active replica by comparing its
router ID with those of the other running replicas.
5. EVALUATION
We now evaluate Fibbing along three axes. First, we
show that existing routers are perfectly capable of handling
the extra load induced by Fibbing (§5.1). We
then demonstrate the efficiency of Fibbing's augmentation
algorithms in terms of speed and size of the topology
(§5.2). Observe that Fibbing behaves as a plain
IGP at the network level. Hence, given its negligible
impact on single routers and the efficiency of our controller,
current ISP networks can be seen as the best
large-scale evaluation for Fibbing. We therefore complete
the evaluation by illustrating how Fibbing can be
used in a realistic case (§5.3).
5.1 Router measurements
By increasing the size of the link-state routing topology,
Fibbing could increase the CPU and memory overhead
on the routers, or slow down protocol convergence.
Our experiments demonstrate that the impact on load
and convergence time is negligible. All our measurements
were performed using OSPF on a recent Cisco
ASR9K running IOS XR v5.2.2 equipped with 12GB of
DRAM assigned to the routing engine, as well as on a
(7-year-old) Juniper M120 running JunOS v9.2, equipped
with 2GB of DRAM. Both routers are representative of
typical edge devices (i.e., aggregation routers) found in
commercial networks. We draw the same conclusions
on both router platforms, and focus on measurements
collected on the Cisco device in the following.
Fibbing induces very little CPU and memory
overhead on routers. We first measured the memory
increase caused by a growing number of fake nodes
(Table 2). Two processes are impacted by the presence
of fake nodes: (i) the RIB process, which maintains
information about all the routes known to each destination,
and (ii) the OSPF process, which maintains the
entire OSPF topology. Even with a huge number of
fake nodes (100,000), the total overhead on both processes
was only 154MB, a small fraction of the total
memory available. We collected the CPU utilization on
the router every minute immediately after we started
injecting fake nodes. The utilization was systematically
low, at most 4%. This is easily explained as Fibbing
relies on OSPF Type-5 LSAs, which do not cause the
routers to recompute their shortest paths to each other.
[Figure 8: Evaluation of our augmentation algorithms.
(a) Per-destination time: time (sec) vs. % of nodes changing next-hop,
for simple and merger (5-th, median, 95-th).
(b) Per-destination augmentation size: # of fake nodes (% of total
nodes) vs. % of nodes changing next-hop, for simple and merger
(5-th, median, 95-th).
(c) Cross-destination reduction: fraction of experiments vs. # of fake
nodes (% of total nodes), for simple, merger, and merger (cross opt).]
# fake nodes    RIB memory (MB)    OSPF memory (MB)
1,000           0.09               0.56
5,000           1.58               5.19
10,000          3.56               10.96
50,000          19.67              56.37
100,000         39.78              113.17
Table 2: Routers easily sustain Fibbing-induced
load, even with huge augmented topologies.
Fibbing can quickly program forwarding entries.
In a second experiment, we measured how long it took
for a router to install a growing number of forwarding
entries (Table 3). We injected a growing number of fake
nodes, one per destination, and measured the total installation
time, by tracking the time at which the router
updated the last entry in its FIB. The time to process
and install one entry was constant (around 900µs),
independent of the number of entries. This result is
several orders of magnitude better than any OpenFlow
switch currently on the market [12, 13]. Since installation
of forwarding entries is distributed, routers can
install their entries in parallel, meaning Fibbing can program
thousands of network-wide entries within 1 second.
# fake nodes    installation time (s)    avg time/entry (µs)
1,000           0.89                     886.00
5,000           4.46                     891.40
10,000          8.96                     894.50
50,000          44.74                    894.78
100,000         89.50                    894.98
Table 3: Programming a forwarding entry in a
router with Fibbing is fast, sub 1ms.
Fibbing does not have any impact on routing-protocol
convergence time. Finally, we compared
the total time for routers to converge with and without
fake nodes. We failed a link and measured the time for
the last FIB entry to be updated considering two cases:
(i) no lie was injected and (ii) one lie per destination
was injected. Similar to the previous experiments, we
repeated the measurements for a growing number of
destinations and lies, between 100 and 100,000. In all
our experiments, the presence of lies did not have any
visible impact. The total convergence times with or
without lies were systematically within 4ms, with the
router being even faster to converge in the presence of
lies in some cases.
5.2 Topology Augmentation Evaluation
We now evaluate Simple and Merger (§3) according
to: (i) the time they take to compute an augmented
topology for a given requirement, and (ii) the size of the
resulting augmented topology. Results are depicted in
Fig. 8. Our evaluation is based on simulations performed
on realistic ISP topologies [18], whose sizes range from
80 nodes to over 300. On these topologies, we generated
forwarding requirements by randomly changing
the next-hop of randomly selected nodes. Destinations
of requirement DAGs were also randomly generated.
Fibbing augments network topologies within ms.
Fig. 8a shows the time (on the y-axis) taken by Simple
and Merger for an increasing number of nodes that must
change their next-hop (on the x-axis). The plot refers to
simulations we ran on the biggest Rocketfuel topology
(AS1239). The time taken by Simple to compute the
per-destination augmentation is in the order of milliseconds,
ranging from 0.5 ms to 8 ms. While Merger
took more time (as expected), its running time is still
one order of magnitude below one second. For both
algorithms, the computation time does not vary much
with the number of nodes changing their next-hops.
Merger and cross-optimization effectively reduce
the size of the augmented topology. Fig. 8b plots
the fake topology size (on the y-axis) when the number
of nodes that have to change their next-hop increases
(on the x-axis) for a single destination on all topologies.
The plot shows that Merger reduces the number
of fake nodes by about 25% in the average case and almost
50% in the best case. Fake topology reduction is
further corroborated by our cross-destination optimization
procedures (see §3.4). Fig. 8c shows a cumulative
distribution function (CDF) of the topology augmentation
size computed by Simple, Merger, and Merger
with cross-destination optimization. The figure refers
to simulations with a number of destinations varying
between 1 and 100 with 26% of the nodes changing
their next-hop. In more than 90% of our simulations,
cross-destination optimization achieves a reduction of
the augmented topology. Depending on the experiment,
such a reduction is up to about 10% with respect
to Merger without cross-destination optimization, and
20% with respect to Simple.
5.3 Case Study
We now show the practicality of Fibbing by improving
the performance of a real network consisting of four
routers (Cisco 3700 running IOS v12.4(3)) connected in
a square with links of 1 Mbps capacity (see Fig. 9a). In
this network, we introduce two sources (bottom left)
that send traffic to two destinations (bottom right) using
iperf. The first source is introduced at time t = 0,
the second one at time t = 5. OSPF weights are configured
such that all traffic flows along link (C, X).
Such a network suffers from two inherent inefficiencies:
(i) the upper path is never used and (ii) the two
flows systematically traverse the same path, competing
for bandwidth, no matter what the link weights are.
Fig. 9b plots the throughput of each flow. Immediately
after the introduction of the second flow at t = 5s, the
two flows start competing for the available bandwidth.
To improve network efficiency, the Fibbing controller
injects a fake node f1 connected to C and announces
one destination at time t = 16. A few ms after the injection,
we see that the throughput of both flows doubles
as each of them now traverses a different path.
[Figure 9: Case study on how Fibbing can alleviate congestion.
At t = 16, a fake node is added to shift one flow to the upper
path in the network, increasing the total available bandwidth.
(a) Topology, shown without Fibbing and with Fibbing (fake node f1).
(b) Throughput evolution: throughput (Mbps) of flow 1 and flow 2
over time (s).]
6. REACTION TO FAILURES
We now analyze Fibbing's reaction to different kinds
of failures. We distinguish between network (affecting
real routers or router-to-router links) and controller
(shutting down replicas or replica-to-router links) failures.
Also, we separately deal with failures inducing
network partitions and non-partitioning ones.
Fibbing quickly reacts to non-partitioning failures.
Upon network failures, forwarded flows fall in
one of the following three cases. First, some flows are
not impacted by the failure as their pre-failure forwarding
path is not disrupted. Second, flows for which no
input requirements have been specified require only the
IGP to establish a new path, but no action from the Fibbing
controller. Reaction to failures is extremely fast in
this case (sub-second even in large networks), thanks to
fast convergence [19] and local fast re-route [20] features
commonly supported by current IGP implementations.
Third, the remaining flows are forwarded on paths modified
by the Fibbing controller. They require reaction
from the controller, both to remove possible blackholes
or loops due to previously injected lies [14], and to avoid
requirement violations due to new IGP paths. Theoretically,
the total failure reaction time is equal to the
sum of the notification time (for the controller to be
notified of the failure), the processing time (for the controller
to compute the new topology augmentation) and
the IGP convergence time (for all routers to install the
new lies). Our evaluation (§5) shows that the processing
time is negligible, especially for the Simple algorithm.
Moreover, the notification time is bounded by
the IGP convergence time, as flooding is faster than
re-convergence. Thus, in the worst case, the total reaction
time is twice the IGP convergence time, that is,
still below 2 seconds [19]. Also, in the average case,
the notification time is smaller than IGP convergence,
because the controller is notified about the failure before
all other routers complete convergence; hence, the
controller injects new lies during the IGP convergence,
and the total failure reaction time is slightly higher than
IGP convergence without Fibbing.
In addition, if one or more controller replicas fail but
others are running, there is no impact on forwarded
flows, unless the failed replica is the active one and some
of its injected lies expire before the new active replica is
elected. Even in the latter case, the new active replica is
quickly elected, in a time which is approximately equal
to that needed for the detection and flooding of the
failure event by the IGP. The short election time makes
it unlikely that lies expire before the new active replica
is elected, and limits the period with possible disruptions.
Fibbing can implement both fail-open and fail-close
semantics to deal with partitions. Even if
unlikely, catastrophic events like a simultaneous failure
of all the controller replicas or network partitions may
happen. As for any centralized solution, a major risk in
those cases is to leave the network uncontrolled. This
happens, for example, if some routers are not reachable
by a controller replica after a network partition.
With respect to pure SDN solutions, Fibbing has the
additional possibility to delegate control to the underlying
IGP. This way, Fibbing can implement both
fail-open and fail-close semantics, on a per-destination
basis. For non-critical (optimization) requirements like
traffic engineering ones, the corresponding destinations
can be injected in the IGP, so that connectivity can
be preserved as long as the partition leaves at least one
source-destination path. For stringent requirements like
security ones (e.g., firewall traversal), Fibbing can implement
fail-close semantics by not announcing the corresponding
destinations in the IGP. As such, the corresponding
flows stop being forwarded in the absence of
the controller. To quickly reach this configuration, we
can set a low validity time of the injected lies, making
them rapidly expire if not refreshed. This then comes
at the cost of additional control-plane overhead.
We confirmed Fibbing resilience. We consider again
the topology in Figure 9a, and we connect two controller
replicas respectively to routers A and B. The active
replica is initially the one connected to A. We assume a
strict policy on the red flow forcing it to cross the link
(C, X). We then configure fail-close semantics for it,
and fail-open semantics for the other flow. Starting from
a state in which both replicas and all links are up, we
successively fail (i) the active replica at time t = 5; (ii) link
(A, B) at t = 12; and (iii) link (B, X) at t = 20. Finally,
we re-establish both failed links, one at a time
(at t = 36 and t = 48).
The results of this experiment, collected via iperf,
are reported in Figure 10. Concretely, the failure of
the active replica has no impact on the forwarded flows.
Indeed, the initially passive replica (connected to B)
quickly detects the failure of the other replica, and starts
refreshing the lies injected by the failed controller. When
(A, B) fails, the active replica needs to remove the fake
node f1: since the physical path (C, A, B, X) is not
available anymore, this fake node is creating a loop between
C and A for the violet flow. Upon failure detection,
the controller then sends the LSA to remove
f1, re-establishing the connectivity for the disrupted
flow in approximately 1s. Note that this time can be
lowered by relying on fast failure detection mechanisms
(like BFD). When (B, X) also fails, we create a partition
that makes it impossible for the running replica to
interact with routers A, C, and X. After about 1s, the
injected lies disappear, because they are not refreshed
anymore by any controller. Consistent with the configured
failure semantics, the red flow is blackholed (to
avoid the IGP routing it over policy-violating paths)
while the violet flow keeps using the IGP shortest path.
Finally, re-adding the failed links allows the running
replica to re-take control of the network: it re-builds
a (safe) path for the red flow upon (B, X) restoration,
and re-optimizes the distribution of both flows over the
available paths when (A, B) is restored.
[Figure 10: Case study on how Fibbing reacts
to failures, and can successfully implement fail-close
(flow 1) and fail-open (flow 2) semantics. The plot shows the
throughput (Mbps) of flow 1 and flow 2 over time (s), annotated with
the events: replica fails, (A,B) fails, (B,X) fails, (B,X) up,
(A,B) up.]
7. FREQUENTLY ASKED QUESTIONS
We now provide answers to high-level concerns often
raised against Fibbing. Since empirical analyses are
hardly applicable to those concerns (e.g., debuggability),
we describe qualitative considerations.
Is Fibbing a long-term solution? Yes. We believe
Fibbing is here to stay. In the short run, Fibbing offers
programmability and is easy to deploy, at very little
cost. A network that ultimately needs even greater
flexibility could deploy finer-grained SDN functionality
at the edge, and solutions like Fibbing in the core,
as advocated by major industry [21] and academic actors
[10, 22]. By combining the best of centralized and
distributed routing, Fibbing fits the needs of the network
core (flexibility, robustness, low overhead) better
than current forwarding paradigms.
Does Fibbing make networks harder to debug?
No. Fibbing relies on “tried and true” protocols. This
has several implications. First, Fibbing routing matches
the current mental model of operators, a major advantage
with respect to other SDN proposals. Moreover,
Fibbing is compatible with any existing management,
monitoring, and debugging tools. Finally, the Fibbing
controller can expose a higher-level interface for debugging,
including a mapping between the injected lies and
their usage (matched requirements and how).
once, with little input (e.g., one fake node), and let them
compute their own forwarding entries.
Centralized control over the routing/forwarding
tables: In SDN, a central controller installs packetprocessing rules directly in the switches, possibly reacting to the reception of specific packets. While more
flexible (e.g., enabling stateful control logic) than Fibbing, SDN requires updating the switch-level rules oneby-one, and forgoes the scalability and reliability benefits of distributed routing. Recently, the IETF developed I2RS [37] which offers a new management interface
for centralized updates the routing information bases
(RIBs) in the routers. Still, I2RS must push RIB entries individually to each router.
The Fibbing language for expressing requirements is
similar in spirit to Merlin [38], but the mechanism for
satisfying the requirements (i.e., fake nodes/links) is entirely different. Our main contributions are the Fibbing
techniques and algorithms, not the language.
For the networks which require the extra flexibility
provided by OpenFlow, Fibbing helps during the transition by providing access to the FIBs of legacy routers
to any SDN controller [39]. This contrasts to techniques like Panopticon [40], where programmability is
only available in the SDN-enabled parts of the network.
Does Fibbing sum the complexities of centralized and distributed approaches? No. Fibbing
uses the underlying IGP in a very simple way. The
IGP output is easy to predict and provides the controller with a powerful API to program routers. As a
result, the design of the Fibbing controller is significantly simpler than for existing SDN controllers (e.g.,
[23, 24, 25, 26]) since heavy tasks such as path computation and topology maintenance are offloaded to the
routers. Even basic primitives for controller replication
and replica consistency are mainly delegated to current
distributed routing protocols (see §6).
Does Fibbing impact security? No. The lies introduced by the Fibbing controller can easily be authenticated, e.g., using MD5-based authentication [27, 28].
Since Fibbing can only program loop-free paths,
can it support middleboxes chaining? Partially.
Forwarding loops can be encountered when steering traffic through a chain of middleboxes (e.g., [29] and [30]).
These requirements can be satisfied in Fibbing with local support from routers to break the loops. For instance, a router could match on the input interface in
addition to the destination IP address using policybased routing, a feature widely available on existing
routers [31, 32] and provisioned centrally using BGP
flowspec [33, 34]. Alternatively, middlebox traffic steering could be implemented through SDN functionality at
the network edge, while still using Fibbing in the core.
8.
9.
CONCLUSIONS
The advent of SDN makes it clear that network operators want their networks to be more programmable and
easier to manage centrally. In this paper, we show how
Fibbing can achieve those objectives, by centrally and
automatically controlling forwarding without forgoing
the benefits of distributed routing protocols. Fibbing is
expressive, scalable, and works with existing routers. In
future work, we plan to look at extensions of IGP protocols (e.g., for source-destination routing [41] or network
service header awareness) to enable finer-grained control via Fibbing. Abstractly, Fibbing shows how centralized and distributed approaches can be profitably
combined. We believe that new research can further
explore this direction, for example, investigating an alternative division of tasks between centralized and distributed network components.
8. RELATED WORK
Fibbing contributes to the larger debate about centralized and distributed control over routing by identifying a new point in the design space.
Centralized configuration of distributed routing
protocols: A centralized management system can perform traffic engineering by optimizing the link weights
in link-state routing protocols [35, 36]. Fibbing is more
general, since it can implement any forwarding paths by
injecting fake nodes and links into the link-state routing
topology. The extra flexibility enables even better load
balancing, as well as a wider range of functionality.
Acknowledgements
We are grateful to SIGCOMM anonymous reviewers
and our shepherd, Teemu Koponen, for insightful comments. We thank Jo Segaert from BELNET and Dave
Ward, Clarence Filsfils and Kris Michielsen from Cisco
Systems for their support in testing Fibbing on real
routers. This work has been partially supported by the
EC Seventh Framework Programme (FP7/2007-2013)
grant no. 317647 (Leone) and by the ARC grant 13/18054 from Communauté française de Belgique.
Centralized control using existing routing protocols as a control channel: RCP [11] is a logically centralized platform that uses BGP to install forwarding entries into routers. RCP must install forwarding
entries one-by-one, on each device. In contrast, Fibbing
can adapt the forwarding behavior of many routers at
10. REFERENCES
[22] “Time for an SDN Sequel? Scott Shenker Preaches
SDN Version 2,” www.sdxcentral.com/articles/
news/scott-shenker-preaches-revised-sdn-sdnv2/
2014/10/.
[23] “ONOS: Open Network Operating System,”
http://onosproject.org/.
[24] T. Koponen et al., “Onix: A distributed control
platform for large-scale production networks,” in
OSDI, 2010.
[25] “Project Floodlight,”
http://www.projectfloodlight.org/floodlight/.
[26] N. Foster et al., “Languages for software-defined
networks,” IEEE Comm. Mag., 2013.
[27] “Cisco OSPF MD5 Authentication,”
http://www.cisco.com/c/en/us/support/docs/ip/
open-shortest-path-first-ospf/13697-25.html.
[28] “Juniper OSPF MD5 Authentication,” http://
www.juniper.net/documentation/en US/junos14.
2/topics/topic-map/ospf-authentication.html.
[29] Z. A. Qazi et al., “Simple-fying middlebox policy
enforcement using sdn,” in SIGCOMM, 2013.
[30] S. K. Fayazbakhsh et al., “Enforcing network-wide
policies in the presence of dynamic middlebox
actions using flowtags,” in NSDI, 2014.
[31] “Cisco. Configuring Policy-Based Routing,”
http://www.cisco.com/c/en/us/td/docs/ios/12
2/qos/configuration/guide/fqos c/qcfpbr.html.
[32] “Juniper. Configuring Filter-Based Forwarding to
a Specific Outgoing Interface or Destination IP
Address,” http://www.juniper.net/techpubs/en
US/junos12.2/topics/topic-map/
filter-based-forwarding-policy-based-routing.html.
[33] “Cisco. Implementing BGP Flowspec,”
http://www.cisco.com/c/en/us/td/docs/routers/
asr9000/software/asr9k r5-2/routing/
configuration/guide/b routing cg52xasr9k/b
routing cg52xasr9k chapter 011.html.
[34] “Juniper. Enabling BGP to Carry
Flow-Specification Routes,”
https://www.juniper.net/documentation/en US/
junos12.3/topics/example/
routing-bgp-flow-specification-routes.html.
[35] B. Fortz and M. Thorup, “Internet traffic
engineering by optimizing OSPF weights,” in
INFOCOM, 2000.
[36] B. Fortz, J. Rexford, and M. Thorup, “Traffic
engineering with traditional IP routing protocols,”
IEEE Comm. Mag., vol. 40, no. 10, pp. 118–124,
2002.
[37] A. Atlas, J. Halpern, S. Hares, and D. Ward, “An
Architecture for the Interface to the Routing
System,” Internet Draft, 2013.
[38] R. Soulé et al., “Merlin: A language for
provisioning network resources,” in CoNEXT,
2014.
[1] D. Awduche et al., “RSVP-TE: Extensions to
RSVP for LSP Tunnels,” RFC 3209, 2001.
[2] N. McKeown et al., “OpenFlow: enabling
innovation in campus networks,” ACM
SIGCOMM CCR, vol. 38, no. 2, pp. 69–74, 2008.
[3] A. Farrel, J.-P. Vasseur, and J. Ash, “A Path
Computation Element (PCE)-Based
Architecture,” RFC 4655, 2006.
[4] C. Filsfils et al., “Segment Routing Architecture,”
Internet Draft, 2014.
[5] B. Clouston and B. Moore, “Definitions of
Managed Objects for HPR using SMIv2,” RFC
2238, 1997.
[6] D. Oran, “OSI IS-IS Intra-domain Routing
Protocol,” RFC 1142, 1990.
[7] A. Pathak, M. Zhang, Y. C. Hu, R. Mahajan, and
D. A. Maltz, “Latency inflation with MPLS-based
traffic engineering,” in IMC, 2011.
[8] S. Jain et al., “B4: Experience with a
Globally-Deployed Software Defined WAN,” in
SIGCOMM, 2013.
[9] C.-Y. Hong et al., “Achieving High Utilization
with Software-Driven WAN,” in SIGCOMM, 2013.
[10] M. Casado et al., “Fabric: A retrospective on
evolving sdn,” in HotSDN, 2012.
[11] M. Caesar et al., “Design and implementation of a
routing control platform,” in NSDI, 2005.
[12] X. Jin et al., “Dynamic scheduling of network
updates,” in SIGCOMM, 2014.
[13] C. Rotsos, N. Sarrar, S. Uhlig, R. Sherwood, and
A. W. Moore, “OFLOPS: An Open Framework
for Openflow Switch Evaluation,” in PAM, 2012.
[14] S. Vissicchio, L. Vanbever, and J. Rexford, “Sweet
little lies: Fake topologies for flexible routing,” in
Hotnets, 2014.
[15] L. Vanbever, S. Vissicchio, C. Pelsser, P. Francois,
and O. Bonaventure, “Seamless Network-Wide
IGP Migrations,” in SIGCOMM, 2011.
[16] “Quagga routing suite,” www.nongnu.org/quagga.
[17] J. Moy, “OSPF Version 2,” RFC 2328, Apr. 1998.
[18] N. Spring, R. Mahajan, and D. Wetherall,
“Measuring ISP topologies with Rocketfuel,” in
SIGCOMM, 2002.
[19] P. Francois, C. Filsfils, J. Evans, and
O. Bonaventure, “Achieving Sub-second IGP
Convergence in Large IP Networks,” ACM
SIGCOMM CCR, vol. 35, no. 3, 2005.
[20] C. Filsfils, P. Francois, M. Shand, B. Decraene,
J. Uttaro, N. Leymann, and M. Horneffer,
“Loop-Free Alternate (LFA) Applicability in
Service Provider (SP) Networks,” RFC 6571, 2012.
[21] T. Koponen et al., “Network Virtualization in
Multi-tenant Datacenters,” in NSDI, 2014.
[39] S. Vissicchio, L. Vanbever, and O. Bonaventure,
“Opportunities and research challenges of hybrid
software defined networks,” ACM SIGCOMM
CCR, vol. 44, no. 2, pp. 70–75, 2014.
[40] D. Levin, M. Canini, S. Schmid, F. Schaffert, and
A. Feldmann, “Panopticon: Reaping the Benefits
of Incremental SDN Deployment in Enterprise
Networks,” in USENIX ATC, 2014.
[41] F. Baker, “IPv6 Source/Destination Routing using
OSPFv3,” Internet Draft, 2013.
provided in input to Merger as $G$, and augmentations of $G$ iteratively computed by Merger as $G'$, $G''$, etc. For any topology $T$, the shortest paths from $s$ to $d$ in $T$ and their cost are respectively denoted as $sp(s, d, T)$ and $dist(s, d, T)$. Similarly, we denote the cost of a path $P$ in $T$ as $cost(P, T)$.
In addition, we always assume that fake path costs are consistently assigned, i.e., there exists an integer $k \geq 0$ such that the cost of any fake path $(x, f_x, d)$ created by a globally-visible lie is equal to $lb(f_x) + k$. To prove Merger correctness, we also define guarantees in the augmented topology (e.g., about paths traversing or not traversing a specific fake node) as assertions valid for any value of $k$ used for consistently-assigned fake path costs.
APPENDIX
A. TOPOLOGY INITIALIZATION
Figure 11a shows an example of a non-initialized
topology where no feasible augmentation can implement
a specific forwarding DAG. In the example, it is impossible to add fake nodes or links that would change A’s
shortest path. Any such fake path would, at a minimum, include a fake link to a fake node announcing the
destination attached to B. Typically, routing protocols
require that a positive cost is set on any link or destination announcement. Hence, the minimum cost of any
fake path is 2, making it impossible to divert A away
from its current shortest path (also with a cost of 2).
B.2 Lower bound propagation procedure
The lower bound propagation procedure takes as input the non-augmented topology G, the target forwarding DAG for a destination d and initial fake node lower
bounds (i.e., initialized according to the next-hop preserving neighbors as in §3). From this input, it first formalizes the constraints between lower bounds of next-hop changing neighbors as inequalities to be respected.
Then, it fixes lower bounds satisfying the computed inequalities considering them one by one, in a precise order. In the following, we provide a more detailed description of those two phases.
(a) Non compliant
Formalization of the constraints between lower bounds. This is a static computation, only based on the link weights in $G$. Consider any fake node $f_{r,d}$ connected to a node $r$. Let $n \neq r$ be another node in $G$, to which Merger has already connected another fake node $f_{n,d}$. We then want to impose that the path $(n \ldots r\; f_{r,d}\; d)$ is longer than $(n\; f_{n,d}\; d)$. Under the assumption of consistently assigned fake path costs, this property holds if $lb(f_{n,d}) < dist(n, r, G) + lb(f_{r,d})$. We therefore use this inequality as the formalization of the dependency between lower bounds $lb(f_{r,d})$ and $lb(f_{n,d})$.
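The constraint formalization above can be illustrated with a minimal Python sketch. It assumes a topology given as a dictionary of weighted adjacency maps and a set of real nodes that already have a fake node attached; the names `dijkstra`, `lower_bound_constraints`, and `constraints_hold` are hypothetical and do not come from the Fibbing implementation.

```python
import heapq

def dijkstra(graph, src):
    """Shortest-path costs from src in a weighted graph {u: {v: weight}}."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def lower_bound_constraints(graph, fake_attached):
    """Build one triple (n, r, dist(n, r, G)) per ordered pair of nodes with
    fake nodes; each triple stands for lb(f_n) < dist(n, r, G) + lb(f_r)."""
    constraints = []
    for n in sorted(fake_attached):
        dist_n = dijkstra(graph, n)
        for r in sorted(fake_attached):
            if r != n and r in dist_n:
                constraints.append((n, r, dist_n[r]))
    return constraints

def constraints_hold(constraints, lb):
    """Check every inequality against candidate lower bounds lb[node]."""
    return all(lb[n] < d + lb[r] for n, r, d in constraints)
```

With a three-node line topology A-B-C (weights 2 and 3, chosen only for illustration) and fake nodes on A and C, the sketch yields the two inequalities $lb(f_A) < 5 + lb(f_C)$ and $lb(f_C) < 5 + lb(f_A)$.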
(b) Fibbing compliant
Figure 11: The left topology is not Fibbing compliant: A is at a distance of 1 from B preventing
traffic from being attracted. The right topology is Fibbing compliant; traffic can be attracted
from A by injecting a fake node with a cost less
than 5. It is always possible to go from a non-compliant to a compliant topology.
Computation of lower bound values. This phase is performed by iterating over the lower bounds. At each iteration, we sort all not yet fixed lower bounds and we fix the value of the first lower bound in the sorted set. Fixed values are not modified by successive iterations. We now detail a generic iteration $i$. We denote the set of lower bounds not yet fixed as $C_i$, and the lower bound fixed at iteration $i$ as $lb(f_{r,d})$.
Figure 11b shows a possible output of our initialization procedure for the topology in Figure 11a. Such
an initialization makes the topology Fibbing compliant,
which is a sufficient condition for implementing any per-destination forwarding DAG with Fibbing.
B. MERGER ALGORITHM
1. Sorting lower bounds. Lower bounds are sorted
according to the value of a specific function δ. The
value of this function corresponds to the value of
the considered lower bound minus the distance between the corresponding real node and its closest
node connected to any other fake node. That is,
the value of δ at iteration i for a given lower bound
In this section, we provide additional details on the
Merger algorithm, and we prove its correctness.
B.1 General Notation
Throughout this section, we use the following notation. We indicate a generic non-augmented topology
$lb(f_{x,d})$ is $\delta_i(f_{x,d}) = lb_i(f_{x,d}) - dist(x, y, G)$ where $lb_i(f_{x,d})$ is the value of $lb(f_{x,d})$ at $i$ and $y$ is the closest (from $x$) real node connected to a fake one. In the case of lower bounds with the same $\delta$ value, any deterministic tie-breaker can be used to have a total order among lower bounds in $C_i$.
be the sequence of fake nodes connected to nodes in $P$ and used to implement $P$ in the current augmented topology. Let $F$ be ordered according to the sequence of routers in $P$. More formally, for any $f_i$ and $f_j$ in $F$, $i > j$ if and only if the routers $r_i$ and $r_j$ to which $f_i$ and $f_j$ are respectively connected are such that $i > j$ (that is, $r_i$ is a successor of $r_j$ in $P$).
Merger tries to merge fake nodes which are consecutive in $F$. More precisely, it iteratively tries to merge every $f_i$ into $f_{i+1}$, with $i = 1, \ldots, k-1$. It starts from $f_1$ and tries to merge $f_1$ into $f_2$ according to the sub-procedure described in the next paragraph. Irrespective of the outcome of this procedure, $f_2$ will remain (possibly with modified lower bounds if $f_1$ has been merged into it). Hence, Merger applies a new merging attempt on the pair $(f_2, f_3)$. The algorithm iterates merging attempts until $f_k$ is reached.
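The pairwise iteration over consecutive fake nodes can be sketched as follows. This is only an illustration of the control flow: `try_merge` is a hypothetical callback standing in for the merging attempt (feasibility check plus merge), and fake nodes are represented by plain labels.

```python
def merge_consecutive(fakes, try_merge):
    """Attempt to merge each fake node into its successor along a path.

    fakes: fake nodes ordered as the routers they attach to appear in P.
    try_merge(fi, fj): hypothetical callback, returns True if fi was
        merged into fj (fi then disappears), False if the attempt aborted.
    Returns the fake nodes that survive, in path order.
    """
    survivors = []
    # Pairs (f1, f2), (f2, f3), ..., (f_{k-1}, f_k): the second element of
    # a pair always remains, so it becomes the first element of the next.
    for fi, fj in zip(fakes, fakes[1:]):
        if not try_merge(fi, fj):
            survivors.append(fi)
    if fakes:
        survivors.append(fakes[-1])  # f_k is never merged into anything
    return survivors
```

For instance, with four fake nodes and a callback that accepts only the merges of the first and third node, the second and fourth nodes remain.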
2. Decision of one lower bound. We fix $lb(f_{r,d})$ to its current value, and we remove it from $C_i$. In the following, we indicate this value as $\bar{l}$.
3. Update of non-fixed lower bounds. We ensure that path $(r\; f_{r,d}\; d)$ will be the shortest one from $r$ to $d$ in the augmented graph if $lb(f_{r,d}) = \bar{l}$. In particular, we impose that all constraint inequalities with $lb(f_{r,d})$ on the left side are satisfied. Consider any of those constraint inequalities $lb(f_{r,d}) < dist(r, j, G) + lb(f_{j,d})$, with $lb(f_{j,d})$ being any lower bound in $C_i$. This would be satisfied if $lb(f_{j,d})$ ends up having a value strictly greater than $\bar{l} - dist(r, j, G)$. We then update the value of $lb(f_{j,d})$ to the maximum between its current value $lb_i(f_{j,d})$ and $\bar{l} - dist(r, j, G) + 1$. Note that this operation may increase $lb(f_{j,d})$ to a value higher than the corresponding upper bound $ub(f_{j,d})$, as in the example in Figure 12. In such cases, we use a locally-visible lie to create the fake path $(j\; f_{j,d}\; d)$, assign the minimum possible cost (2) in an IGP topology to it, and remove $f_{j,d}$, as for B in Figure 12. Then, we restart the computation of lower bounds from scratch, without considering $j$ and $lb(f_{j,d})$ anymore.
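The three steps of the iteration (sorting, fixing, updating) can be put together in a short Python sketch. It makes several assumptions that are not stated in the text: each real node carries at most one fake node for the destination, `dist[x][y]` holds precomputed distances in $G$, the bound with the largest $\delta$ is the one fixed first (the sort direction is inferred from the case analysis in the proof of Lemma 2), and ties are broken alphabetically. The function name `propagate_lower_bounds` is hypothetical.

```python
def propagate_lower_bounds(dist, init_lb, ub):
    """Sketch of the lower bound propagation loop.

    dist:    dist[x][y] = shortest-path cost between real nodes x and y in G
    init_lb: initial lower bound of the fake node attached to each real node
    ub:      upper bound of the fake node attached to each real node
    Returns (lb, local): fixed lower bounds, and the set of real nodes that
    had to be switched to a locally-visible lie.
    """
    local = set()
    while True:  # each pass is one "from scratch" computation
        lb = {r: v for r, v in init_lb.items() if r not in local}
        unfixed = set(lb)
        overflow = None

        def delta(x):
            # lb(x) minus the distance to the closest other node with a fake node
            others = [dist[x][y] for y in lb if y != x]
            return lb[x] - (min(others) if others else 0)

        while unfixed and overflow is None:
            # Step 1: sort (assumed: largest delta first, alphabetical ties).
            r = max(sorted(unfixed), key=delta)
            # Step 2: fix lb(f_r) to its current value.
            unfixed.remove(r)
            l_bar = lb[r]
            # Step 3: enforce lb(f_r) < dist(r, j, G) + lb(f_j) for unfixed j.
            for j in sorted(unfixed):
                lb[j] = max(lb[j], l_bar - dist[r][j] + 1)
                if lb[j] > ub[j]:
                    overflow = j  # needs a locally-visible lie
                    break
        if overflow is None:
            return lb, local
        local.add(overflow)  # restart without the overflowing node
```

With two nodes A and B at distance 2, initial lower bounds 10 and 3, and generous upper bounds, fixing A first raises the bound for B to 10 - 2 + 1 = 9. If the upper bound for B were only 5, B would instead receive a locally-visible lie and the computation would restart with A alone.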
1. Shortest path compliance. Let $all\_sps(i, j, G)$ be the set of shortest paths from $i$ to $j$ in the input topology $G$. Merger checks whether every path in $all\_sps(i, j, G)$ is included in the input forwarding DAG $R$. A similar check is performed on all the paths from any node $p$ whose shortest path for $d$ includes $i$ before the merging. In the following, we refer to the shortest path from $i$ to $j$ in $G$ which is a sub-path of $P$ as $sp(i, j, G)$.
Merging attempt. The core of the merging attempt is the feasibility check, used by Merger to assess whether the pair of input candidates $(f_i, f_j)$ can be merged and with which lower bound adjustments. Assume that $f_i$ and $f_j$ are two fake nodes announcing destination $d$ to the connected real nodes $i$ and $j$ in a source-destination path $P$ in $R$. We now formally describe the checks performed by Merger in this phase.
(a) Initial and required flows.
2. Candidate compatibility. If shortest paths are compliant according to the previous check, the algorithm assesses the existence of a cost of $(j, f_j, d)$ such that (i) the shortest path of $i$ in the post-merging augmented topology $G'$ becomes the concatenation of $sp(i, j, G')$ and $(j, f_j, d)$; and (ii) the current next-hops of any node do not change. To this end, we temporarily modify lower and upper bounds of $f_j$. The modified lower bound $\widetilde{lb}(f_j)$ is the lower bound of $f_j$ recomputed with the constraints that (i) $dist(i, j, G) + \widetilde{lb}(f_j)$ must be strictly lower than $dist(i, d, G)$ (to let $i$ use $f_j$) and than $dist(n, d, G)$ for any pre-merging predecessor $n$ of $i$ or $j$ (to force pre-merging predecessors of $i$ and $j$ to use $f_j$); and (ii) $dist(i, j, G) + \widetilde{lb}(f_j)$ must be strictly greater than $dist(m, d, G)$ for any other real node $m$ (to prevent other nodes from changing their shortest paths). The modified upper bound $\widetilde{ub}(f_j)$ is the minimum between its unmodified upper bound $ub(f_j)$ and $ub(f_i) - dist(i, j, G)$. Merger checks whether
(b) Initial fake bounds.
Figure 12: A case in which Merger needs a locally-visible lie. Indeed, during the lower bound propagation procedure, $lb(f_A)$ is incremented to make A use the fake path $(A, f_A, d)$ instead of the fake path via B; this way, however, it becomes bigger than $ub(A)$.
B.3 Merging procedure
This procedure iterates over source-destination paths
in the input forwarding DAG R for destination d. For
each source-destination path, it repeats the following
operations.
Selection of candidates. Let $P = (r_1, r_2, \ldots, r_m)$ be a source-destination path in $R$, and let $F = (f_1, f_2, \ldots, f_k)$
$\widetilde{lb}(f_j)$ exists and $\widetilde{lb}(f_j) \leq \widetilde{ub}(f_j)$. If this is the case, it also stores $\widetilde{lb}(f_j)$ and $\widetilde{ub}(f_j)$, as they will be the new bounds of $f_j$ if the merging happens.
to cases of multiple shortest paths in the original IGP
topology and load-balancing requirements.
We start by proving the correctness of the non-merged augmented topology. We denote with $k$ an integer such that $k \geq 0$, with $G$ the original topology, and with $G'$ the topology as computed by Merger after the lower bound propagation procedure. Since we never change any link or link weight in $G$, for any pair of real nodes $x$ and $y$ in $G$, $dist(x, y, G') = dist(x, y, G)$.
3. Network-wide feasibility. Since the bounds of the merged node $f_j$ can change and one fake node disappears in the merge, other pre-merging lower bound values may become inconsistent. To avoid inconsistencies, we re-run the lower bound propagation procedure of Merger, ignoring the bounds of $f_i$ and using $\widetilde{lb}(f_j)$ and $\widetilde{ub}(f_j)$ as bounds for $f_j$. In this step, we also take extra care of real nodes initially connected to multiple fake nodes, i.e., subject to load-balancing requirements. Consider any of those real nodes, $l$, initially connected to fake nodes $f_{l1} \ldots f_{lk}$, with $k \geq 2$. During the lower bound propagation procedure, we impose additional constraints on $l$, such that all the current shortest paths from $l$ to $d$ (e.g., via fake nodes) have the same cost. For example, when Merger merges $f_1$ into $f_2$ in the topology in Figure 6b, it needs to maintain the load-balancing requirement on A. To this end, it forces $lb(f_0)$ to have the same value as $dist(A, B) + dist(B, d)$ (where $d$ is the destination), that is $5 + lb(f_2)$. Note that sometimes considering load-balancing requirements correctly prevents fake node merging. Considering again Figure 6b, Merger does not merge $f_3$ into $f_4$. Indeed, the lower bound of $f_3$ has been raised to enable the merging of $f_2$ into $f_4$, hence it is no longer compatible with $f_4$. This is correct, as using only $f_4$ would necessarily break the load-balancing requirement on A. If the lower bound propagation succeeds without adding locally-visible lies, we finally store all the adjusted lower bound values, including the relaxed lower and upper bound of $f_j$.
Lemma 1. If $lb(x)$ is fixed at an iteration $i$ of the lower bound propagation procedure and $\delta_i(x) \geq \delta_i(y)$ at that iteration, then $\delta_j(x) \geq \delta_j(y)$ at any iteration $j > i$.
Proof. Assume by contradiction that $\delta_i(x) \geq \delta_i(y)$ but $\delta_j(x) < \delta_j(y)$ for some iteration $j > i$ and some lower bound $lb(y) \neq lb(x)$. Let $m$ be the closest iteration to $i$ such that $\delta_m(x) < \delta_m(y)$. Since $lb(x)$ is fixed at $i$, it must be $\delta_l(x) = \delta_i(x)$ for any $l > i$, hence $\delta_i(x) = \delta_m(x)$. Moreover, let $u$ and $v$ be the real nodes respectively connected to $x$ and $y$.
One of the following cases must apply.
• $m = i + 1$. In this case, the delta function of $y$ must have been increased during the $i$-th iteration. By step 3 in the computation of lower bound values, $lb_m(y)$ must be equal to $\max\{lb_i(y),\; lb_i(x) - dist(u, v, G) + 1\}$. We have two subcases.
If $lb_m(y) = lb_i(y)$, then $\delta_m(y) = \delta_i(y)$. Since $\delta_i(x) \geq \delta_i(y)$ by hypothesis and $\delta_i(x) = \delta_m(x)$, we must have $\delta_m(x) \geq \delta_m(y)$.
Otherwise, we have $lb_m(y) = lb_i(x) - dist(u, v, G) + 1$, that is, $lb_m(y) - 1 = lb_m(x) - dist(u, v, G)$. The value of $\delta_m(y)$ is defined as $lb_m(y) - cost(P, G)$, where $P$ is a given path in $G$; hence, $\delta_m(y) \leq lb_m(y) - 1$, since the cost of any IGP path is strictly greater than zero. Moreover, $\delta_m(x) = lb_m(x) - dist(u, z, G)$ with $z$ being the closest node to $u$ also connected to a fake node; consequently, $\delta_m(x) \geq lb_m(x) - dist(u, v, G)$. Combining those inequalities, we have that $\delta_m(y) \leq lb_m(y) - 1 = lb_m(x) - dist(u, v, G) \leq \delta_m(x)$, that is, $\delta_m(x) \geq \delta_m(y)$.
In both subcases, we generate a contradiction, as $m$ is defined as an iteration in which $\delta_m(x) < \delta_m(y)$.
• $m > i + 1$. In this case, there must exist an iteration $n$, with $i < n \leq m$, such that $\delta_{n-1}(x) \geq \delta_{n-1}(y)$ and $\delta_n(x) < \delta_n(y)$. Since $\delta_{n-1}(x) = \delta_n(x) = \delta_i(x)$, $lb(y)$ cannot be fixed at $n-1$ and $\delta_n(y) > \delta_{n-1}(y)$. Consider the lower bound $lb(z) \neq lb(y)$ fixed at $n-1$. Fixing it at iteration $n-1$ must cause $\delta_n(y) > \delta_{n-1}(y)$. Since $lb(z)$ is fixed at $n-1$, we must have $\delta_{n-1}(z) \geq \delta_{n-1}(y)$. We have two subcases.
If $\delta_{n-1}(z) \geq \delta_{n-1}(y)$ and $\delta_n(z) < \delta_n(y)$, we can consider $z$ instead of $x$, and iterate our reasoning by contradiction on $z$ and $y$. The first case of our proof applies since $n = (n-1) + 1$, hence we directly generate a contradiction.
If any of those checks fails, Merger immediately aborts the merging attempt. Otherwise, it merges $f_i$ into $f_j$: it removes $f_i$ and sets the lower bounds of all other nodes to the ones computed during the third check.
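The final comparison of the candidate compatibility check can be sketched in a few lines of Python. The sketch covers only the modified upper bound and the test $\widetilde{lb}(f_j) \leq \widetilde{ub}(f_j)$; the recomputation of $\widetilde{lb}(f_j)$ from its constraints is taken as an input. The function names are hypothetical, and the remark on why the upper bound is tightened (a path from $i$ through $j$ pays $dist(i, j, G)$ before reaching $f_j$) is an interpretation rather than a statement from the text.

```python
def merged_upper_bound(ub_i, ub_j, dist_ij):
    """Modified upper bound of f_j: min(ub(f_j), ub(f_i) - dist(i, j, G))."""
    return min(ub_j, ub_i - dist_ij)

def candidates_compatible(lb_j_tilde, ub_i, ub_j, dist_ij):
    """Return True if a modified lower bound exists (is not None) and does
    not exceed the modified upper bound of f_j."""
    if lb_j_tilde is None:  # no value satisfies the lower bound constraints
        return False
    return lb_j_tilde <= merged_upper_bound(ub_i, ub_j, dist_ij)
```

For example, with $ub(f_i) = 10$, $ub(f_j) = 9$ and $dist(i, j, G) = 3$, the modified upper bound is 7, so a modified lower bound of 5 passes the check while 8 does not.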
B.4 Correctness Proofs
We prove the correctness of Merger in two steps. First,
we show that the algorithm correctly implements any
input forwarding DAG after the lower bound computation procedure (see Theorem 2). Second, we prove
that merges performed during the merging procedure
never change the forwarding paths implemented in the
pre-merging augmented graph (see Theorem 3). The
correctness of Merger then follows by noting that it sequentially runs the lower bound computation and merging procedures.
For the sake of simplicity, our proofs assume no equal-cost multipath in either the original topology or the
required paths. Nevertheless, they can be generalized
is, $\delta_f(y) = cost(F_y, G') - k - dist(r_y, z, G')$ by definition of consistently-assigned fake path costs. Since $dist(r_y, z, G') > 0$ and $k \geq 0$, we thus have $\delta_f(y) < cost(F_y, G')$.
Combining the previous inequalities, we then have $cost(F_y, G') > \delta_f(y) \geq \delta_f(x) \geq cost(F_x, G') - dist(r_x, r_y, G')$.
Otherwise, if $\delta_{n-1}(z) \geq \delta_{n-1}(y)$ and $\delta_n(z) \geq \delta_n(y)$, then $\delta_{n-1}(x) = \delta_n(x) < \delta_n(y) \leq \delta_n(z) = \delta_{n-1}(z)$. That is, $\delta_{n-1}(x) < \delta_{n-1}(z)$. However, since $lb(x)$ is fixed at $i$, we must also have $\delta_i(x) \geq \delta_i(z)$. Hence, we can iterate our reasoning by contradiction on $x$ and $z$. Note that $n - 1 < m$, which means that we cannot iterate indefinitely on this sub-case, but we eventually fall in another case and generate a contradiction.
In all the subcases, we eventually generate a contradiction, which proves the statement.
In both cases, we have $cost(F_y, G') > cost(F_x, G') - dist(r_x, r_y, G')$, that is, $cost(F_x, G') < cost(F_y, G') + dist(r_x, r_y, G')$. This means that path $F_x$ is shorter than reaching $r_y$ from $r_x$ and then using $F_y$. By applying the same argument to any lower bound $lb(y)$, we conclude that $F_x$ is the shortest path from $r_x$ to $d$ in $G'$. The statement then follows by applying the same argument to all nodes $r_x$ in $G'$.
Lemma 2. The shortest path from any real node $r$ connected to a fake node $f_{r,d}$ to $d$ in $G'$ is $(r\; f_{r,d}\; d)$.
Proof. The statement holds if the fake node announces $d$ using a locally-visible lie, because the cost (2) set for this fake path in the update of non-fixed lower bounds is guaranteed to be lower than the cost of any other path in $G'$. We then focus on fake nodes announcing globally-visible lies.
Consider any pair of lower bounds $lb(x)$ and $lb(y)$. Let $i_1$ and $i_2$ be the iterations at which $lb(x)$ and $lb(y)$ are respectively fixed during the lower bound propagation procedure. We denote the value of $\delta(x)$ and $\delta(y)$ after the last iteration of the procedure as $\delta_f(x)$ and $\delta_f(y)$. Moreover, for brevity, we write $F_x$ and $F_y$ instead of $(r_x\; x\; d)$ and $(r_y\; y\; d)$, with $r_x$ and $r_y$ being the real nodes connected to $x$ and $y$ respectively.
By definition of the procedure, we have two cases.
• $lb(x)$ is fixed before $lb(y)$, that is, $i_1 < i_2$. Since we assume that fake path costs are consistently assigned, we have that $cost(F_y, G') = lb_f(y) + k$ with $k \geq 0$, which implies $cost(F_y, G') \geq lb_f(y)$. Moreover, as a consequence of step 3 in the computation of lower bound values, $lb_f(y) > lb_f(x) - dist(r_x, r_y, G')$. Since fake path costs are consistently assigned, we can rewrite the right side of the previous inequality as $lb_f(x) - dist(r_x, r_y, G') = cost(F_x, G') - k - dist(r_x, r_y, G')$, hence $lb_f(x) - dist(r_x, r_y, G') \geq cost(F_x, G') - dist(r_x, r_y, G')$ by definition of $k$. Concatenating all previous inequalities, we conclude that $cost(F_y, G') > cost(F_x, G') - dist(r_x, r_y, G')$.
• $lb(x)$ is fixed after $lb(y)$, that is, $i_1 > i_2$. It must then be $\delta_{i_2}(x) \leq \delta_{i_2}(y)$. This implies $\delta_f(x) \leq \delta_f(y)$, by Lemma 1. We now express $\delta_f(x)$ and $\delta_f(y)$ with respect to the cost of paths in $G'$.
On one hand, by definition of $\delta$, $\delta_f(x) = lb_f(x) - dist(r_x, n, G)$ with $n$ being the closest node to $r_x$. Hence, $\delta_f(x) \geq lb_f(x) - dist(r_x, r_y, G')$, i.e., $\delta_f(x) \geq cost(F_x, G') - k - dist(r_x, r_y, G')$ by definition of consistently-assigned fake path costs. Since $k \geq 0$, it must be $\delta_f(x) \geq cost(F_x, G') - dist(r_x, r_y, G')$.
On the other hand, by definition of $\delta$, $\delta_f(y) = lb_f(y) - dist(r_y, z, G')$ for some real node $z$; that
Lemma 3. For any real node $n$ not directly connected to any fake node, if $sp(n, d, G)$ is guaranteed not to contain a fake node $f$ before the lower bound propagation procedure, then $sp(n, d, G')$ is guaranteed not to contain $f$ after the procedure too.
Proof. By definition of guarantee, $dist(n, r_f, G) + lb(f) > dist(n, d, G)$, where $r_f$ is the real node connected to $f$. Note that $dist(n, r_f, G') = dist(n, r_f, G)$ and $dist(n, d, G') = dist(n, d, G)$ since the original topology is never modified by Merger (only new fake nodes and links are added to it). The statement then follows by noting that $lb(f)$ is never decreased by the lower bound propagation procedure, by construction (the procedure can only increase the value of some lower bounds during the update of non-fixed lower bounds).
Theorem 2. If fake path costs are consistently assigned, Merger correctly implements the input forwarding DAG after the lower bound propagation procedure.
Proof. Consider any real node $r$. We have two cases.
If $r$ is not connected to any fake node, the assignment of lower bound values before the propagation procedure ensures that paths including any fake node connected to real ones not in $sp(r, d, G)$ have a cost strictly greater than $dist(r, d, G)$. Hence, by Lemma 3, $r$’s shortest path $sp(r, d, G')$ in the augmented topology is guaranteed not to traverse any fake node connected to a real one not in $sp(r, d, G)$. By the property of the shortest path, $sp(r, d, G')$ can actually be written as the concatenation of $P = sp(r, x, G')$ and $Q = (x, f_x, d)$, with either (i) $Q \neq \emptyset$ and $f_x$ connected to a real node $x \in sp(r, d, G)$; or (ii) $Q = \emptyset$ and $x = d$ if no node in $sp(r, d, G)$ is connected to a fake one. In any case, $r$’s next-hop is the same as in the original topology. This is consistent with the input forwarding DAG, given that Merger initially adds fake nodes to all next-hop changing real ones.
Otherwise, if $r$ is connected to a fake node, then Lemma 2 holds. Hence, the required next-hop (as in the forwarding DAG) can be imposed via the fake path, which yields the statement.
definition of the fake bound computation phase in Merger, we must have $dist(r, d, G') < dist(r, z, G') + lb(f_l)$ on the augmented topology $G'$ provided as input to the merging procedure. Lemma 4 then implies that $dist(r, d, G'') < dist(r, z, G'') + lb(f_l)$. By combining the previous two inequalities, we conclude that $sp(r, y, G'') + (y, f_2, d)$ is guaranteed to be the shortest path of $r$ after the merging. Finally, as a consequence of the shortest path compliance check, it must be $sp(r, y, G'') = sp(r, x, G'') + sp(x, y, G'')$, which implies that the next-hop of $r$ is not changed by the fake node merging.
We now prove that Merger does not trigger violations of previously-enforced forwarding DAGs. To this end, we first prove a property preserved by the merging procedure (invariant).
Lemma 4. For any pair of real nodes $(r, z)$ and any fake one $f_l$ connected to $z$ and announcing a given destination $d$, if $dist(r, d, G') < dist(r, z, G') + lb_1(f_l)$ in the topology $G'$ provided as input to the merging procedure, then $dist(r, d, G_k) < dist(r, z, G_k) + lb_k(f_l)$ at any iteration $k$ of the merging procedure.
• $r$’s shortest path includes neither $f_1$ nor $f_2$ before the merging. We have three sub-cases. If $r$ is next-hop preserving, it is not directly connected to any fake node and $dist(r, y, G'') + \widetilde{lb}(f_2) > dist(r, d, G'')$, by definition of $\widetilde{lb}(f_2)$ in the candidate compatibility step. Otherwise, if $r$ is next-hop changing and connected to a locally-visible fake node, then it uses the fake path via its connected fake node independently of the presence of any other fake node, because of the cost (2) set for this fake path in the update of non-fixed lower bounds. Finally, if $r$ is next-hop changing and connected to a globally-visible fake node, the lower bound propagation procedure run in the network-wide feasibility check ensures that $r$ keeps using in $G''$ the same shortest path as in $G'$ (see Lemma 3).
Proof. Consider any iteration $k$ of the merging procedure, such that $dist(r, d, G_{k-1}) < dist(r, z, G_{k-1}) + lb_{k-1}(f_l)$. During iteration $k$, Merger picks two fake nodes $f_1$ and $f_2$, and assesses the possibility of merging $f_1$ into $f_2$.
If the merging attempt fails, then all lower bounds and link weights remain as in the previous iteration. Hence, $dist(r, d, G_{k-1}) < dist(r, z, G_{k-1}) + lb_{k-1}(f_l)$ directly implies $dist(r, d, G_k) < dist(r, z, G_k) + lb_k(f_l)$.
Otherwise, $f_1$ is merged into $f_2$. By definition of the candidate compatibility step, the new lower bound of the merged node is initially set to a value $\widetilde{lb}(f_2)$ such that $dist(r, d, G_{k-1}) < dist(r, z, G_{k-1}) + \widetilde{lb}(f_2)$. Any other lower bound also satisfies the corresponding inequality by hypothesis. Moreover, by its definition, no lower bound is decreased during the lower bound propagation procedure run in the network-wide feasibility step. Finally, link weights in the original graph are never modified by the merging procedure, hence $dist(r, d, G_{k-1}) = dist(r, d, G_k)$ and $dist(r, z, G_{k-1}) = dist(r, z, G_k)$. This implies that for any fake node $f_l$, $dist(r, d, G_k) < dist(r, z, G_k) + lb_k(f_l)$ at the end of iteration $k$, which yields the statement.
In all the cases, $r$ has the same next-hop in $G'$ and $G''$, which proves the statement.
C. UNSUPPORTED BACKUP
We now show a case of backup requirement not supported by our current graph augmentation algorithms.
This case is represented in Figure 13.
Theorem 3. Successful merging attempts in Merger
do not affect the enforcement of the forwarding DAG
implemented before the merging.
Proof. Let $R$ be any forwarding DAG for a destination $d$, and let $G'$ and $G''$ be the augmented topologies before and after merging a fake node $f_1$ into another $f_2$. We denote the real nodes connected to $f_1$ and $f_2$ as $x$ and $y$ respectively.
For any real node $r$, one of the following cases holds.
(a) Primary requirement. (b) Backup requirement.
Figure 13: A backup requirement not supported
by our algorithms.
• $r$’s shortest path includes $f_1$ before the merge (possibly $r = x$). By definition of the post-merging upper bound of $f_2$ in the candidate compatibility check, $dist(r, y, G') + \widetilde{ub}(f_2) < dist(r, d, G')$, which guarantees that $sp(r, y, G') + \widetilde{ub}(f_2)$ is shorter than the original shortest path from $r$ to $d$. Moreover, by the candidate compatibility and network-wide feasibility steps, $dist(r, y, G'') + \widetilde{lb}(f_2) < dist(r, d, G'')$. In contrast, consider any fake node $f_l \neq f_1, f_2$. By
In this example, the primary requirement is already
fulfilled by the shortest path in the original topology
(see Figure 13a). By construction, our algorithms do
not add any fake node to the network in this case, in
order to minimize the control-plane overhead. However,
this implies that the backup requirement depicted in
Figure 13b cannot be enforced. Indeed, for path (A, H, F)
to be used instead of (A, C, F) in an augmented topology,
we need a fake node fA connected to A such that the path
P from A to the destination via fA is shorter than
(A, C, F). This means that the cost of P must be 10 or
less. However, such a cost is less than or equal to the
cost of the original shortest path, so A would use P
even in the absence of failures, disrupting the
primary requirement illustrated in Figure 13a.
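The argument above can be checked with a few lines of arithmetic. The sketch below assumes integer link costs, a primary shortest-path cost of 10, and a cost of 11 for (A, C, F); the latter value is inferred from the bound "10 or less" and is not stated explicitly in the text. The function name feasible_fake_costs is hypothetical.

```python
# Assumed values (not given explicitly in the text):
PRIMARY_COST = 10  # cost of the primary shortest path from A
RIVAL_COST = 11    # cost of (A, C, F), the path the backup must beat


def feasible_fake_costs(primary_cost, rival_cost):
    """Integer costs for the fake path P that satisfy both constraints.

    P must be strictly cheaper than the rival path (so the backup
    requirement holds) and strictly more expensive than the primary
    path (so P is not used, or tied, when there is no failure).
    """
    return [c for c in range(1, rival_cost) if primary_cost < c]


feasible = feasible_fake_costs(PRIMARY_COST, RIVAL_COST)
print(feasible)
```

Under these assumptions the list is empty: no integer cost fits strictly between the two paths, which is the situation the section describes. With a wider gap between the two costs, a suitable cost for P would exist.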
D. OSPF LIMITATIONS
As stated in §4, the current OSPF protocol does not
allow full Fibbing expressiveness with global lies (i.e., it
violates Theorem 1). The reason is that global lies are
implemented in OSPF by relying on the forwarding address
(FA) field. Namely, in an OSPF LSA, we can specify an
IP address x as FA. In this case, the router configured
with x on a network interface discards the LSA. All the
other routers compute the cost of the route included in
the injected LSA as the cost to reach x plus the cost
specified in the LSA. Moreover, if this route is selected
by a router u, u will forward the corresponding packets
to its next-hop on the OSPF shortest path to x.
This latter observation is at the core of the example
in Figure 14, where globally-visible lies cannot be used
to enforce a simple requirement. In this example, for
a given destination d, A is required to change its
next-hop from F to C. Hence, we need to move A’s
next-hop onto the link (A, C). We can then inject an OSPF
globally-visible lie as an LSA specifying a new route
with the closest interface of C as FA and cost m <
90. This would make the cost of this route, that is,
dist(A, C) + m < 20 + 90, strictly lower than the cost
of the original shortest path, i.e., 100. However, even in
this case, A will use the next-hop on its shortest path
to C as next-hop for the route announced in the lie.
This means that A will keep sending traffic to F, which
is on its shortest path (A, F, C) to C.
Figure 14: A requirement impossible to support
with global lies in the current OSPF implementation.
(a) Initial flows. (b) Required flows.
We rely on locally-visible lies to overcome the limitations
of globally-visible ones. For example, a single
locally-visible lie can be used to enforce the requirement
shown in Figure 14. Indeed,
an OSPF locally-visible lie uses an IP address which is
announced network-wide but not installed in the OSPF
routing table of any router (see §4). Hence, when
resolving the FA, A will not find any OSPF shortest path
to the FA. It will then fall back on the directly-connected
route to the FA, that is (A, C), to forward the packet.
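The forwarding-address resolution discussed in this appendix can be illustrated with a small sketch. It is not OSPF code: it only mimics the two rules stated in the text (route cost = distance to the FA owner plus the advertised cost; next-hop = first hop on the shortest path to the FA owner). The link weights (A-F = 10, F-C = 10, A-C = 50) are assumptions read from the discussion of Figure 14, and the names shortest_path and resolve_global_lie are hypothetical.

```python
import heapq


def shortest_path(graph, src, dst):
    """Return (cost, path) of a shortest path from src to dst (Dijkstra)."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []


# Assumed link weights for the routers discussed around Figure 14.
GRAPH = {
    "A": {"C": 50, "F": 10},
    "C": {"A": 50, "F": 10},
    "F": {"A": 10, "C": 10},
}


def resolve_global_lie(graph, router, fa_owner, lie_cost):
    """Route cost and next-hop that `router` derives from a global lie.

    The router owning the FA discards the LSA, modeled here as None.
    """
    if router == fa_owner:
        return None
    dist_to_fa, path = shortest_path(graph, router, fa_owner)
    return dist_to_fa + lie_cost, path[1]


route_cost, next_hop = resolve_global_lie(GRAPH, "A", "C", lie_cost=50)
print(route_cost, next_hop)
```

With these weights, A reaches C at cost 20 over (A, F, C), so whatever cost the lie advertises, the resolved next-hop is F rather than C. This is why a globally-visible lie cannot move A's traffic onto the link (A, C) in this example.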