Scaling Network Verification using Symmetry and Surgery.

Gordon D. Plotkin†
Nikolaj Bjørner‡
Nuno P. Lopes‡
Andrey Rybalchenko‡
George Varghese‡
LFCS, University of Edinburgh†
Microsoft Research‡
[email protected]
{nbjorner,nlopes,rybal,george}@microsoft.com
Abstract
On the surface, large data centers with ∼10⁵ stations and nearly
a million routing rules are complex and hard to verify. However,
these networks are highly regular by design; for example, they employ fat tree topologies with backup routers interconnected by redundant patterns. To exploit these regularities, we introduce network transformations: given a reachability formula ϕ and a network N, we transform N into a (simpler to verify) network N̄ and
a corresponding transformed formula ϕ̄ such that (for example) ϕ
is valid in N if and only if ϕ̄ is valid in N̄.
Our network transformations exploit network surgery (in which
irrelevant or redundant sets of nodes, headers, ports, or rules
are “sliced” away) and network symmetry (say between backup
routers). The validity of these transformations is established using
a formal theory of networks. In particular, using Van Benthem-Hennessy-Milner style bisimulation, we show that one can generally associate bisimulations to transformations connecting networks and formulas with their transforms. Our work is a development in an area of current wide interest: applying programming
language techniques (in our case bisimulation and modal logic) to
problems in switching networks.
We provide experimental evidence that our network transformations can speed up the task of verifying the communication between all pairs of Virtual Machines in a large datacenter network
with ∼100,000 VMs by 65×. An all-pair reachability calculation,
which formerly took 5.5 days, can be done in 2 hours, and can be
easily parallelized to complete in minutes.
Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification
General Terms Verification, Theory
Keywords Network Verification, Symmetries, Semantics
1. Introduction
Cloud services such as Dropbox, Google, iCloud, Amazon, and
Azure contain up to a million inexpensive servers connected by a
data center network. A single service request such as a Web search
is split among hundreds of servers that communicate to produce the
response. As our dependency on networks increases, the reliability
of data center networks becomes increasingly critical. However,
surveys show [19, 31] that network outages are quite common.
Thus verification technologies that proactively prevent potential
network disruptions are valuable.
A network, as shown in Figure 1, consists of boxes (routers,
switches, firewalls, henceforth referred to as routers) that forward
packets from input ports to output ports. We abstract the dataplane,
or forwarding component of a router, as a set of rules that map
predicates on packet headers (e.g., 32-bit IP addresses starting with
101) to output ports; some rules called Access Control Lists (ACLs)
use complex predicates (e.g., sources from outside Microsoft that
send SQL packets) on sets of fields to decide which packets must be
dropped. The rules may also prescribe changes in packet headers.
The output ports of each router are physically connected to the
input ports of other routers as specified by a network topology.
The network program, the composition of all the router programs
following the topology, then maps packets from entry points in the
network through the network to exit points.
This abstraction only enables qualitative reasoning, for example
about packet reachability or about looping. It abstracts away quantitative issues, such as performance metrics (e.g., delay) and ignores
the control plane that builds the forwarding rules. The correctness
requirements [17] are also simple: they specify the headers that can
communicate between hosts, together with predicates on the paths
the headers traverse. So wherein, therefore, lies the complexity that
justifies the emerging area of network verification [15–17, 22, 30]?
First, manual rules added by operators interact with the rules
computed by automatic routing protocols. Second, the forwarding
rules typically involve load balancing by which a packet can be sent
by many possible routes to its destination. Third, routers sometimes
rewrite packets. Fourth, while the program structure is simple,
the state space is large. Forwarding rules operate on at least the
IP and TCP headers and other fields such as MPLS [18]. Thus
conservatively, we have a space of 2⁸⁰ headers, millions of rules,
and a large number N of servers (as many as 1 million) that can
communicate. Checking reachability across N² pairs of stations for
2⁸⁰ headers to find a network policy violation is a scaling challenge.
Early systems [22] found a single violation using SAT solvers.
Later systems [16, 17] found all violations for medium sized networks using symbolic execution. NetPlumber [15] introduced efficient incremental analysis. Finally, Yang and Lam [30] used predicate abstraction to rewrite router rules in a form that is much more
efficient to verify. However, these improvements do not target the
scaling challenge. For example, it was reported in [10] that
verifying reachability for all pairs of stations took around 1 day,
even on a small university network. Verifying all pairs in a large
data center in a few minutes (the time scale at which a network can
be reconfigured) seemed out of reach for earlier techniques.
[Figure: schematic of a data center network with levels labeled Core, Aggregation, Edge, and Leaf.]
Figure 1. Data center network schematic
1.1 Scaling Verification using Network Transformations
Fortunately, most large networks are designed using design patterns
that enforce regularities that we can exploit. For example, data
center networks are often arranged in fat-trees (Figure 1), which
are leveled graphs where routers at a given level are symmetrically
connected to multiple routers at adjacent levels. This is done for
load balancing and resiliency reasons. For example, in Figure 2 R3
and R4 are symmetrically placed, for if they are interchanged and
the ports on R3 , R4 and the corresponding ports of their neighbors
R1 , R2 and R5 are correspondingly interchanged, then the global
topology stays the same.
Thus, by design, one may expect the rule sets of symmetrically
placed routers at a given level (e.g., R3 and R4 ) to be symmetrical.
This suggests that symmetrical routers such as R3 and R4 can be
replaced by a single equivalent (with respect to packet forwarding)
router as in Figure 2 without changing the qualitative properties of
the network. Doing so reduces the rules in the new network, and in
the limit can transform a “fat tree” into a “thin tree”.
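[Figure: on the left, a network containing Z, R1, R2, the backup routers R3 and R4, and R5 (with marked ports p1 and p2 and attached stations X and Y); on the right, the network it transforms to, in which a single router R3 stands in for R3 and R4.]
Figure 2. Suppose that interchanging R3 and R4 as well as corresponding marked ports on their neighbours, e.g., p1 and p2 on R5,
leaves reachability the same. Then R3 and R4 can be merged into
a single router in the transformed network on the right.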
Further, the hierarchical structure implies that communication
within subtrees should stay local. Thus two stations X and Y
attached to the same edge (or ToR, for Top of Rack switch) R5
(see Figure 2) should typically communicate only via R5 . Thus,
for verification of the communication between X and Y, the rest of
the network is irrelevant as long as one can prove that traffic from X
to Y does not “leak” out from the ToR R5 . This suggests a general
notion of slicing away irrelevant rules and portions of the network.
We refer to such slicing of ports and rules as network “surgery”.
One can also slice networks in terms of headers. For example, if
in reality the two backup routers R3 and R4 of Figure 2 are nearly
identical except for two local addresses h1 in R3 and h2 in R4 ,
we might hope to slice the network into two equivalent networks
operating on disjoint sets of headers, one of which has R3 and R4
perfectly symmetric, and one of which has the standard topology
but with a smaller set of rules (say dealing with only h1 and h2 ).
Both symmetry-induced and slicing-based transformations are
network transformations that transform the original network into a
sequence of simpler networks with equivalent forwarding, but with
smaller size in terms of boxes, rules, and links. If the verification
effort scales with size, and the transformations are efficient, the
overall verification time will be faster.
To proceed, we need a model of networks and a way to write
reachability specifications. To this end we use an operational semantics treating each router as a state machine. We then use a standard modal logic defined on the states to describe desired behaviors. The logic permits assertions that certain (anonymous) transitions
occur, that a packet header has reached a certain port, or that
a packet header has a certain property.
To connect network transformations with the logic we generally
show that they yield bisimulations (in the sense of Van Benthem
and Park and Milner [28]) between a network N and its transform
N̄ . We then employ the Van Benthem-Hennessy-Milner principle,
as used in modal logic and process calculus [24, 28], to show the
validity of a transformation using bisimulation. In process calculus
this principle states that if there is a bisimulation between two
processes P and P′, then P has a given property ϕ if, and only
if, P′ does. In our case, as, for example, N̄ may have different ports
from N, it may be necessary to modify the formula ϕ asserted of
N to another, ϕ̄, asserted of N̄ (the analogous situation in process
calculus would be when comparing processes with different action
alphabets, see, e.g., [4]).
We propose two different ways to employ the Van Benthem-Hennessy-Milner technique for network verification. One is to reduce the size of N, for example, by slicing away parts of the network irrelevant to the proposition, or else by exploiting symmetry
either to remove essentially duplicate rules, or to merge symmetrically placed routers with symmetrical semantics. The other is to
reduce the number of properties to be verified, by finding symmetries between propositions about symmetrically placed parts of the
network or by identifying equivalent packet headers. In doing so, it
may be useful to pass through a sequence of such transformations,
producing a sequence N1 ∼1 . . . ∼n−1 Nn of bisimilar networks.
Our contributions (with a paper outline) are:
1. A Theory of Network Dataplanes: To formalize network
transformation, we introduce a network model (Sections 2.1, 2.2),
a modal logic (Section 2.3), and a proof technique based on bisimulation to verify networks (Section 3). After describing surgeries,
we describe a compositional way of generating bisimulations (Section 5). A side effect of all this machinery is the ability to formalize
earlier concepts such as slicing [16], and (a generalized version of)
Yang-Lam equivalence [30].
2. A toolbox of network transformations: We provide a toolbox of network transformations (Examples 1 through 4) based
on surgery (Section 4) and header merging and symmetry (Section 6) that can be applied iteratively to simplify the verification
task. There are generally bisimulations corresponding to each. As
with earlier work (e.g., [7]), we exploit symmetry over state spaces.
However, to do this we exploit the structure of the network domain
where the state consists of both headers and locations, and the program is distributed over boxes and rules, enabling symmetries to be
constructed from both topological and semantic components.
3. Scaling comprehensive Network Verification: We show (Section 7) how one might scale comprehensive network evaluation for
large production data centers over all pairs of stations. In particular,
for a Microsoft data center in Singapore with ∼820K rules, we reduce the time to verify reachability between all pairs of VMs from
5.5 days to 2 hours on a single core using only simple transformations, a 65× reduction. Earlier work [15–17, 30] did not report the
running times for reachability between all N² pairs.
2. Networks and their logics
As a foundation for our theory, we provide a formal definition of
switching networks as graphs of interconnected boxes of various
kinds, such as routers, switches, bridges, or firewalls. This is network syntax. Having a syntax available, we then define network
semantics in terms of suitable kinds of transition relations. The semantics, in its turn, supports a suitable modal logic, which can be
used to specify network properties that we wish to verify.
2.1 Network syntax
Networks are constructed using a box signature that consists of a
set of boxes b ∈ Box, with each box having a given finite collection
K ⊆fin N of ports, written b : K; we write Box for the signature.
(We use a set, rather than a sequence, of ports to prepare the ground
for network surgery in which we slice away ports to transform a
network.) Given such a box signature, a network N consists of
• a finite set NodeN of nodes,
• a box assignment function βN : NodeN → BoxN,
• a symmetric, 1-1, irreflexive, port connection or link relation
γN on the set PortN of network ports where:
PortN =def {(n, i) | n ∈ NodeN, βN(n) : K, and i ∈ K}
The idea behind having both boxes and nodes is that the same
box (e.g., backup router) can be replicated at different points of the
network; one can think of the nodes as being box id’s attached by
βN to their boxes. We use p to range over ports, and write the pairs
(n, i) as n.i. The connection relation γN specifies how ports are
connected to each other and hence encodes the network topology.
It is symmetric as packets can flow in either direction; it is 1-1 to
model physical connection; it is irreflexive to avoid tight loops. Not
all ports need be connected to another port; such unconnected ports
are called external; the others are called internal.
Definitions in this style of various forms of networks made up
out of boxes with ports can be found in other contexts, for example
sharing graphs [12], bigraphs [25], or cyclic networks [13]; other
definitions can be found in the network literature, for example [32].
2.2 Network semantics
For the network semantics, we first assume given a set PacN of
packet headers ranged over by h (below we may just say “packet”);
we call PacN the header space (of N). Packet headers are typically
bit strings (hence the 2⁸⁰ possibilities alluded to in the introduction),
but one may use any convenient set. For each box b : K we then
have the set of box-located packets, where such a located packet
is a pair, written h @ i, of a packet h and a box location, that is, a
box port i (with i ∈ K). In other words, a box-located packet is a
specific packet located at a specific port in the network.
Next, for each box b ∈ Box, we assume given a transition
relation
h @ i −→Box, b h′ @ i′
between box-located packets. Note that packets may be changed
(rewritten) in a transition as well as their location. When the signature can be understood from the context, we will generally just
write h @ i −→b h′ @ i′.
These transitions are typically given by the data plane of a
router, via forwarding tables, ACLs (access control lists), and
packet rewriting. We do not specify here what these may be, or how
they induce transition relations (apart from Section 7, our
work is independent of such considerations). The box transitions
arise from packets moving from an input port of a router, through
its internal ports, and, possibly changed, on to an output port, following its set of rules.
In many switching networks only packet locations change, as
there are no packet rewriting rules. That is, packet headers are not
changed by any of the boxes in the network. Formally, for any box
b : K in the network we have:
h @ i −→Box, b h′ @ i′ =⇒ h′ = h
for all i, i′ ∈ K and h, h′. This holds, for example, for the Singapore network considered in Section 7.
Next, packet exit ports j may be independent of their entry ports
i. Formally, we additionally have for any box b : K in the network:
h @ i1 −→Box, b h′ @ i′ ⇐⇒ h @ i2 −→Box, b h′ @ i′
for all i1, i2, i′ ∈ K and h, h′. This happens when the forwarding
tables are “centrally located” and are used to route all incoming
messages to exit ports, independently of their entry port, and when
ACLs are given per exit port; the Singapore network again provides
an example. In this case the semantics can be equivalently
described in terms of “rules” of the form h ↦ I where I ⊆ K,
with one such rule for each h, namely the one where
I =def {j ∈ K | ∃i ∈ K. h @ i →Box, b h @ j}
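As a concrete reading of the definitions so far, the following minimal Python sketch represents a box signature, a box assignment β and a link relation γ, checks that γ is symmetric, 1-1 and irreflexive, and derives the rule form h ↦ I from a set of box transitions. It is an illustration only: the names `Network`, `check_links`, `external_ports`, `rule_for` and `link`, and the toy two-port box "tor", are assumptions made for the example and do not come from the formal development.

```python
from dataclasses import dataclass


@dataclass
class Network:
    boxes: dict        # box name -> frozenset of port numbers (b : K)
    beta: dict         # node -> box name (the box assignment)
    links: frozenset   # pairs ((node, port), (node, port)): the relation gamma

    def ports(self):
        """Port_N = {(n, i) | beta(n) : K and i in K}."""
        return {(n, i) for n, b in self.beta.items() for i in self.boxes[b]}

    def check_links(self):
        """Raise ValueError unless gamma is symmetric, 1-1 and irreflexive."""
        ports, partner = self.ports(), {}
        for p, q in self.links:
            if p not in ports or q not in ports:
                raise ValueError(f"{p} -- {q} mentions an unknown port")
            if p == q:
                raise ValueError(f"{p} is linked to itself")
            if (q, p) not in self.links:
                raise ValueError(f"{p} -- {q} has no reverse link")
            if partner.setdefault(p, q) != q:
                raise ValueError(f"{p} is linked to two ports")

    def external_ports(self):
        """Ports that are not connected to any other port."""
        return self.ports() - {p for p, _ in self.links}


def rule_for(transitions, h):
    """Rule h |-> I for a box that does not rewrite headers: I collects
    every exit port j such that h @ i -> h @ j for some entry port i."""
    return {j for (h1, _i), (h2, j) in transitions if h1 == h2 == h}


def link(p, q):
    """Both directions of one physical link."""
    return {(p, q), (q, p)}


# Two nodes r1 and r2 that share the same (hypothetical) two-port box "tor".
net = Network(
    boxes={"tor": frozenset({0, 1})},
    beta={"r1": "tor", "r2": "tor"},
    links=frozenset(link(("r1", 1), ("r2", 1))),
)
net.check_links()
tor_transitions = {(("h", 0), ("h", 1)), (("h", 1), ("h", 0))}
print(net.external_ports())
print(rule_for(tor_transitions, "h"))
```

Storing γ as a set of ordered pairs keeps the symmetry requirement explicit: a link recorded in one direction only is rejected by `check_links`.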
With the box transition semantics available, we define the network semantics in terms of two transition relations between states:
an internal one, moving to internal network nodes, and an external
one, moving to external network nodes.
The set, StatesN , of states of the network N consists of the
network-located packets; these are pairs, written h @ p, of a packet
h and a network location, that is, a network port p.
The internal network transition relation:
h @ n.i −→N h′ @ n′.i′
holds if, and only if, for some j, h @ i −→β(n) h′ @ j and
n.j γ n′.i′. In some sense we are doing two transitions in one step
in that a packet first follows a box transition from input port i to output
port j in the same box, and then moves to the input port i′ of the
router to which j is connected (as specified by the port connection relation γ). A similar “short-circuiting” is done in the Header
Space model [16].
The external network transition relation:
h @ n.i ⇝N h′ @ n.i′
holds iff h @ i −→β(n) h′ @ i′ where n.i′ is an external port. We
write ⇒N for the total network transition relation →N ∪ ⇝N.
Below, when they can be understood from the context, we generally omit network suffixes from NodeN , βN and so on. However,
network suffixes are essential when we describe transformations
between networks.
The double packet/location nature of states gives the subject a
special interest. Note that our states only keep track of the location
of one packet, not of several as would be natural if we were interested in concurrent aspects of network operation. Our purpose,
as elsewhere in the network verification literature, see, e.g., [33], is
rather to verify properties such as reachability, concerning only one
packet at a time.
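The two network transition relations can be read operationally. The sketch below is a minimal illustration under stated assumptions: `box_step` maps a box name to a function returning the (h′, j) pairs of its box transitions, `gamma` is a dictionary sending each connected port to its link partner (so a port is external exactly when it is not a key), and the two-port forwarding box is hypothetical.

```python
def internal_steps(state, beta, box_step, gamma):
    """All states reached by one internal step: a box transition to an
    output port j, followed by the link from n.j to its partner port."""
    h, (n, i) = state
    return {(h2, gamma[(n, j)])
            for h2, j in box_step[beta[n]](h, i) if (n, j) in gamma}


def external_steps(state, beta, box_step, gamma):
    """All states reached by a box transition to an external port."""
    h, (n, i) = state
    return {(h2, (n, j))
            for h2, j in box_step[beta[n]](h, i) if (n, j) not in gamma}


def reachable(state, beta, box_step, gamma):
    """States reachable from `state` in zero or more internal steps."""
    seen, todo = {state}, [state]
    while todo:
        for t in internal_steps(todo.pop(), beta, box_step, gamma):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def forward(h, i):
    """Hypothetical two-port box: send to the other port, no rewriting."""
    return [(h, 1 - i)]


beta = {"a": "fwd", "b": "fwd"}
box_step = {"fwd": forward}
gamma = {("a", 1): ("b", 0), ("b", 0): ("a", 1)}   # one link, both directions

start = ("h", ("a", 0))
print(reachable(start, beta, box_step, gamma))
print(external_steps(("h", ("b", 0)), beta, box_step, gamma))
```

A state is a pair of a header and a network port, matching the packet/location nature of states described above; only one packet is tracked at a time.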
A notion of network slice will prove important. In its most
general form, a slice is a subset S of States, the set of states. We
say that a slice is a network invariant if, in addition, we have:
h @ p ∈ S ∧ (h @ p ⇒N h′ @ p′) =⇒ h′ @ p′ ∈ S
A less general, but useful case, already considered in [16], is where
the slice is a product H × P of sets H and P of packets and ports.
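For a finite network the invariance condition can be checked directly. The following sketch assumes the total transition relation is given as a dictionary from states to successor sets; the function name `is_network_invariant` and the toy relation are illustrative assumptions.

```python
from itertools import product


def is_network_invariant(S, total_step):
    """S is invariant if every total step (internal or external) taken
    from a state in S lands in S again."""
    return all(t in S for s in S for t in total_step.get(s, ()))


# Hypothetical total transition relation: header "h" stays among ports
# p1 and p2, while header "g" can leave port p1 for port p3.
total_step = {
    ("h", "p1"): {("h", "p2")},
    ("h", "p2"): {("h", "p1")},
    ("g", "p1"): {("g", "p3")},
}

S_ok = set(product({"h"}, {"p1", "p2"}))         # a product slice H x P
S_bad = set(product({"h", "g"}, {"p1", "p2"}))   # "g" escapes to p3

print(is_network_invariant(S_ok, total_step))
print(is_network_invariant(S_bad, total_step))
```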
2.3 A Network logic
We next spell out a small natural modal logic for such networks
that supports the application of the van Benthem-Hennessy-Milner
principle while being enough to cover typical network requirements
such as reachability relations between hosts.
We first assume given a collection of packet formulas α to
specify properties of packets, together with a satisfaction relation
h |= α
for them, where h ∈ PacN . For example such packet formulas
may consist of wildcard expressions such as 101xxxxx for packets
with (let’s say) eight bit destination IP addresses that start with 101
as in Header Space Analysis [16].
One can then define a largely standard modal logic by the
following grammar of N -formulas:
ϕ ::= α | @ p | ⊥ | ¬ϕ | ϕ ∧ ϕ | ♦ϕ | Fϕ
The modal worlds are the network states. The formula α expresses
that α holds of the state packet; @ p expresses that the state location is p; ♦ϕ expresses that one can reach a state where ϕ holds
in zero or more internal steps from the current state; and Fϕ expresses that one can reach a state where ϕ holds by an external step
from the current state. Other connectives are definable: the other
propositional ones; the □ modality, expressing necessarily reaching, having taken zero or more internal steps; ♦T ϕ abbreviating
♦(ϕ ∨ Fϕ) and the corresponding □T ϕ; and @ p1, . . . , pn (for
n > 0) abbreviating @ p1 ∨ . . . ∨ @ pn . More expressive logics
can be obtained, for example, by adding fixed-points µX.ϕ[X], but
the simpler logic seems sufficiently expressive.
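Given a network N, and a semantics for it, we obtain a semantics for the logic, following the above ideas. As usual, we define a
satisfaction relation:
h @ p |=N ϕ
(read: in network N the packet h satisfies the formula ϕ when
located at port p) by structural induction on formulas. The clauses
for the boolean formulas are as usual; the others are:
\[
\begin{array}{lcl}
h \mathbin{@} p \models_N \alpha & \iff_{\mathrm{def}} & h \models \alpha \\
h \mathbin{@} p \models_N {@}\,p' & \iff_{\mathrm{def}} & p = p' \\
h \mathbin{@} p \models_N \Diamond\varphi & \iff_{\mathrm{def}} & \exists h', p'.\; h \mathbin{@} p \to_N^{*} h' \mathbin{@} p' \,\wedge\, h' \mathbin{@} p' \models_N \varphi \\
h \mathbin{@} p \models_N \mathrm{F}\varphi & \iff_{\mathrm{def}} & \exists h', p'.\; h \mathbin{@} p \rightsquigarrow_N h' \mathbin{@} p' \,\wedge\, h' \mathbin{@} p' \models_N \varphi
\end{array}
\]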
There are then two natural validity notions for a given network
N (reflecting the double nature of states): one where a header
validates a formula if the header satisfies the formula when located
at any port; and the other where all headers satisfy the formula
when located at any port:
h |=N ϕ ⇐⇒ ∀p. h @ p |=N ϕ
and
|=N ϕ ⇐⇒ ∀h. h |=N ϕ
(There is also a third validity p |=N ϕ, but we do not know of a use
for it.)
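The satisfaction clauses translate directly into a small evaluator over a finite network. The sketch below is illustrative only: formulas are encoded as nested tuples with the assumed tags "pkt", "at", "bot", "not", "and", "dia" (for ♦) and "ext" (for F), packet formulas are Python predicates on headers, and the internal and external relations are dictionaries from states to successor sets.

```python
from itertools import product


def reach(state, internal):
    """States reachable in zero or more internal steps."""
    seen, todo = {state}, [state]
    while todo:
        for t in internal.get(todo.pop(), ()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def sat(state, phi, internal, external):
    """Decide whether h @ p satisfies phi, by recursion on the formula."""
    h, p = state
    tag = phi[0]
    if tag == "pkt":                      # packet formula, as a predicate
        return phi[1](h)
    if tag == "at":                       # @ p1, ..., pn
        return p in phi[1:]
    if tag == "bot":
        return False
    if tag == "not":
        return not sat(state, phi[1], internal, external)
    if tag == "and":
        return all(sat(state, f, internal, external) for f in phi[1:])
    if tag == "dia":                      # zero or more internal steps
        return any(sat(t, phi[1], internal, external)
                   for t in reach(state, internal))
    if tag == "ext":                      # exactly one external step
        return any(sat(t, phi[1], internal, external)
                   for t in external.get(state, ()))
    raise ValueError(f"unknown connective {tag!r}")


def implies(a, b):
    """Implication, defined from negation and conjunction."""
    return ("not", ("and", a, ("not", b)))


def valid(phi, states, internal, external):
    """Validity in N: every header satisfies phi at every port."""
    return all(sat(s, phi, internal, external) for s in states)


headers, ports = ["101", "000"], ["p1", "p2", "p3"]
states = list(product(headers, ports))
internal = {(h, "p1"): {(h, "p2")} for h in headers}
external = {("101", "p2"): {("101", "p3")}}   # header "000" is dropped at p2

alpha = ("pkt", lambda h: h.startswith("101"))
goal = ("dia", ("ext", ("at", "p3")))
guarded = implies(("and", alpha, ("at", "p1")), goal)
unguarded = implies(("at", "p1"), goal)
print(valid(guarded, states, internal, external))
print(valid(unguarded, states, internal, external))
```

The guarded formula is valid because only headers starting with 101 are required to reach p3; the unguarded one fails at the state "000" @ p1, which has no external step available.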
Let us give a few examples to illustrate the expressiveness of the
logic. First,
|=N α ∧ @ p1 ⇒ ♦( @ p2, p3 ∧ ♦F(α′ ∧ @ p4 ))
holds if, whenever a network packet satisfying α is at port p1 then
it can reach an external port p4 via p2 or p3, being meanwhile
transformed into a packet satisfying α′.
In the same vein, suppose a network slice S is defined by a
boolean combination of atomic formulas, ϕ. Then the slice is a
network invariant if, and only if,
|=N ϕ ⇒ □T ϕ
holds. Suppose that S′ is another slice, defined by a boolean combination of atomic formulas, ϕ′. Then
|=N ϕ ⇒ ¬♦T ϕ′
holds if the second slice cannot be accessed from the first. This
models the standard notion of slicing (a form of network virtualization that requires isolation between slices) considered in [16, 17],
but our formalism can express alternative notions.
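[Figure: on the left, the original network, in which boxes R1 and R2 (with attached V1 and V2, and ports i, i′, j, j′, k, l′) communicate through a complex core with ports k′ and l; on the right, the network after surgery, in which the core is replaced by a hub H.]
Figure 3. Replacing the core by a hub.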
Continuing, suppose that p is an internal port. Then
|=N α ∧ @ p ⇒ □ @ p
holds if any packet with header satisfying α and reaching p is
dropped (or loops through p forever).
Next,
h |=N @ p ⇒ ♦(¬ @ p ∧ ♦ @ p)
holds if any packet with header h loops through p via some other
port, returning to p with its header possibly being transformed. This
is a generic loop, in the terminology of [16]. However, if one has
an invariant α available, one can detect some infinite loops:
|=N α ∧ @ p ⇒ ♦(¬ @ p ∧ ♦(α ∧ @ p))
holds if any packet with header satisfying α loops through p,
via some other port, with its header, however transformed, still
satisfying α.
2.4 A first surgery
Our first example of a surgery shows how to replace a core by a
hub, while preserving reachability formulas.
Example 1: Replacing a core by a hub
Suppose our network N consists of a “core” Ncore through which
ToRs (or other boxes) communicate via a single link. Suppose too
that packet headers are not changed by any of the boxes in the
network. Then if the core is obstruction-free, in a suitable sense,
reachability queries between network machines accessing the core
via those boxes hold if, and only if, they hold for a greatly reduced
network in which the core is replaced by a hub: see Figure 3.
As shown there, the core network Ncore has external ports k′
and l (at nodes a and b, say). Fix a “traffic set” T ⊆ PacN of
packet headers; typically this will be the packets addressed to the
VMs handled by R2. Then we assume the core is obstruction-free
for T between a.k′ and b.l, i.e., that, for all h ∈ T we have:
h @ a.k′ −→∗Ncore h @ b.l
The network N̄ is the same as N except that the core has been
replaced by a hub H with ports k′ and l. Regarding its semantics
we assume only that, for all h:
h @ k′ →H h @ l
In both cases we have boxes (ToRs, say) R1 and R2 (at nodes
r1 and r2, say) to which (for example, virtual machines) V1 and V2
are connected, with ports as shown. Regarding logic, both networks
have the same packet formulas, and the evident collections of port
formulas.
Then, for any h and p, we have that the reachability formula
ϕreach, namely:
αT ∧ @ r1.i′ ⇒ ♦ @ v2.j′
is satisfied by h @ p in N if, and only if, it is satisfied by h @ p in
N̄ (we are assuming that we have a packet formula αT defining T,
i.e., such that T = {h | h |= αT}). This is easily verified by direct
consideration of the semantics of the formula in each of the two
networks, coupled with the no-obstruction assumption. For either
N or N̄, h @ p |= ϕreach holds if, and only if, we have
h ∈ T ∧ p = r1.i′ =⇒ h @ p |= ♦ @ v2.j′
and we can then calculate, for h ∈ T:
\[
\begin{array}{cl}
 & h \mathbin{@} r_1.i' \models_N \Diamond\,{@}\,v_2.j' \\
\iff & h \mathbin{@} r_1.i' \to_N^{*} h \mathbin{@} v_2.j' \\
\iff & h \mathbin{@} i' \to_{R_1} h \mathbin{@} k \;\wedge\; h \mathbin{@} a.k' \to_{N_{core}}^{*} h \mathbin{@} b.l \;\wedge\; h \mathbin{@} r_2.l' \to_{R_2} h \mathbin{@} v_2.j' \\
\iff & h \mathbin{@} i' \to_{R_1} h \mathbin{@} k \;\wedge\; h \mathbin{@} r_2.l' \to_{R_2} h \mathbin{@} v_2.j' \\
\iff & h \mathbin{@} i' \to_{R_1} h \mathbin{@} k \;\wedge\; h \mathbin{@} k' \to_{H} h \mathbin{@} l \;\wedge\; h \mathbin{@} r_2.l' \to_{R_2} h \mathbin{@} v_2.j' \\
\iff & h \mathbin{@} r_1.i' \to_{\bar N}^{*} h \mathbin{@} v_2.j' \\
\iff & h \mathbin{@} r_1.i' \models_{\bar N} \Diamond\,{@}\,v_2.j'
\end{array}
\]
In the calculation we use the fact that the network does not alter
packet headers and, in the third equivalence, the fact that Ncore is
obstruction-free for T between a.k′ and b.l.
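The equivalence can be replayed on a toy instance. The sketch below is an illustration under stated assumptions, not the construction itself: each network is flattened to a per-header graph of internal steps between ports, the core is modelled as two hops through a hypothetical port `c.m`, and R2 applies a hypothetical ACL (`R2_ALLOWS`).

```python
def reaches(edges, src, dst):
    """True if dst is reachable from src in zero or more internal steps."""
    seen, todo = {src}, [src]
    while todo:
        for t in edges.get(todo.pop(), ()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return dst in seen


T = {"h1", "h2"}            # the traffic set
R2_ALLOWS = {"h1", "h3"}    # hypothetical ACL at R2


def tail(h):
    """Steps shared by both networks: R2 forwards only allowed headers."""
    return {"r2.l'": ["v2.j'"]} if h in R2_ALLOWS else {}


def with_core(h):
    """Internal steps of N for header h: the core is obstruction-free
    for T and (in this toy instance) drops every other header."""
    edges = {"r1.i'": ["a.k'"], **tail(h)}
    if h in T:
        edges.update({"a.k'": ["c.m"], "c.m": ["r2.l'"]})
    return edges


def with_hub(h):
    """Internal steps of the transformed network: the hub forwards all."""
    return {"r1.i'": ["hub.k'"], "hub.k'": ["r2.l'"], **tail(h)}


def phi_reach(h, p, edges):
    """(alpha_T and @ r1.i') implies diamond @ v2.j', evaluated at h @ p."""
    if h in T and p == "r1.i'":
        return reaches(edges, p, "v2.j'")
    return True


shared_ports = ["r1.i'", "r2.l'", "v2.j'"]
for h in ["h1", "h2", "h3"]:
    for p in shared_ports:
        assert phi_reach(h, p, with_core(h)) == phi_reach(h, p, with_hub(h))
```

The guard αT matters: for the header `h3`, which lies outside T, plain reachability from `r1.i'` to `v2.j'` differs between the two toy networks (the core drops it, the hub forwards it), yet the guarded formula agrees on both.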
3. Network bisimulations
To show the validity of network transformations, our proof technique is to define bisimulations ∼ between two networks N and N̄
built using the same box signature, but possibly with different box
semantics, for example between the networks on the left and right
in Figure 2.
We take these to be relations between network-located packets
which are both bisimulations between −→N∗ and −→N̄∗ and
bisimulations between ⇝N and ⇝N̄, each in the usual sense (note
they are then also bisimulations between ⇒N∗ and ⇒N̄∗). It is
natural to consider, in particular, one-step bisimulations between
N and N̄; these are bisimulations between both −→N and −→N̄
and ⇝N and ⇝N̄ (and so also between ⇒N and ⇒N̄).
One can form relations between states from relations ∼Pa and
∼Po between the respective sets of packets and ports by:
h @ p ∼ h̄ @ p̄ ⇐⇒def h ∼Pa h̄ ∧ p ∼Po p̄
Then, given a relation ∼Po between ports, there is a largest relation
∼Pa between packets which, so combined with ∼Po, forms a (one-step) network bisimulation between N and N̄. We remark that,
while there certainly are largest such (one-step) bisimulations, they
do not seem very useful. The bisimulations relative to a fixed port
relation are sensitive to ports visited and so do prove useful; they
are analogous to those considered in [4].
Given a bisimulation ∼ between N and N̄ we wish to transfer
properties between the first network and the second, as is usual
with bisimulations, so that proving a formula in the first network is
equivalent to proving a transformed formula in the second network.
As we have two logics here, speaking of different sets of packets
and ports, we need to correlate formulas. To that end we define a
correlation relation ∼For between N-formulas and N̄-formulas by
taking ϕ ∼For ϕ̄ to hold if, and only if, for all h, p, h̄, p̄ we have:
h @ p ∼ h̄ @ p̄ =⇒ (h @ p |= ϕ ⇐⇒ h̄ @ p̄ |= ϕ̄)
We then have the expected proposition expressing the Van Benthem-Hennessy-Milner principle for networks:
PROPOSITION 1. The formula correlation relation ∼For is closed
under the logical connectives, that is the following implications
hold:
\[
\frac{}{\bot \sim_{For} \bot}
\qquad
\frac{\varphi \sim_{For} \bar\varphi}{\neg\varphi \sim_{For} \neg\bar\varphi}
\qquad
\frac{\varphi \sim_{For} \bar\varphi \quad \psi \sim_{For} \bar\psi}{\varphi \wedge \psi \sim_{For} \bar\varphi \wedge \bar\psi}
\qquad
\frac{\varphi \sim_{For} \bar\varphi}{\Diamond\varphi \sim_{For} \Diamond\bar\varphi}
\qquad
\frac{\varphi \sim_{For} \bar\varphi}{\mathrm{F}\varphi \sim_{For} \mathrm{F}\bar\varphi}
\]
PROOF. This is a standard proof. There are five cases:
1. As ⊥ never holds, we have ⊥ ∼For ⊥.
2. Suppose that ϕ ∼For ϕ̄ and that h @ p ∼ h̄ @ p̄. Then
h @ p |=N ¬ϕ holds iff h @ p |=N ϕ does not hold iff
h̄ @ p̄ |=N̄ ϕ̄ does not hold (as we have ϕ ∼For ϕ̄) iff
h̄ @ p̄ |=N̄ ¬ϕ̄ holds.
3. Suppose that ϕ ∼For ϕ̄ and that ψ ∼For ψ̄ and h @ p ∼ h̄ @ p̄.
Then h @ p |=N ϕ ∧ ψ holds iff h @ p |=N ϕ and h @ p |=N ψ
both hold iff h̄ @ p̄ |=N̄ ϕ̄ and h̄ @ p̄ |=N̄ ψ̄ both hold (as we
have ϕ ∼For ϕ̄ and ψ ∼For ψ̄) iff h̄ @ p̄ |=N̄ ϕ̄ ∧ ψ̄ holds.
4. Suppose that ϕ ∼For ϕ̄ and that h @ p ∼ h̄ @ p̄. Assume that
h @ p |=N ♦ϕ holds, in order to show that h̄ @ p̄ |=N̄ ♦ϕ̄
does. Then there are h′, p′ such that h @ p →∗N h′ @ p′ and
h′ @ p′ |=N ϕ holds. As ∼ is a bisimulation between →∗N
and →∗N̄ and h @ p ∼ h̄ @ p̄, there are h̄′, p̄′ such that
h̄ @ p̄ →∗N̄ h̄′ @ p̄′ and h′ @ p′ ∼ h̄′ @ p̄′. As ϕ ∼For ϕ̄, we
then have h̄′ @ p̄′ |=N̄ ϕ̄, and so h̄ @ p̄ |=N̄ ♦ϕ̄ holds, as
required.
The proof that h @ p |=N ♦ϕ holds if h̄ @ p̄ |=N̄ ♦ϕ̄ does is
similar.
5. This is similar to the previous case.
In case ∼ is built from relations ∼Pa (on packets) and ∼Po
(on ports), as above, there are natural sufficient conditions for the
correlations to hold between atomic formulas. Recall that atomic
formulas are either of the form α (sets of packets) or @ p1, . . . , pn
(sets of ports). So, α ∼For ᾱ holds if
∀h, h̄. h ∼Pa h̄ =⇒ (h |= α ⇐⇒ h̄ |= ᾱ)
and @ p1, . . . , pm ∼For @ p̄1, . . . , p̄n holds if
∀p, p̄. p ∼Po p̄ =⇒ (p ∈ {p1, . . . , pm} ⇐⇒ p̄ ∈ {p̄1, . . . , p̄n})
These conditions are also necessary, except in evident trivial cases.
We remark that there is no commitment to the above logic
for purposes of network verification. Proposition 1 demonstrates
a wide range of properties that can be transferred between bisimilar networks. However one is free to express such properties using
whatever logical or automata-theoretic means one desires. For example one can translate our logic into Datalog, so one can use [20].
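For finite networks, whether a candidate relation is a one-step bisimulation can be checked mechanically. The sketch below is a minimal illustration: `is_bisimulation` tests a single pair of transition relations (it would be applied once to the internal relations and once to the external ones), `lift` builds a relation on states from a packet relation and a port relation, and the two toy networks, loosely modelled on the merge of R3 and R4 in Figure 2, are assumptions made for the example.

```python
from itertools import product


def is_bisimulation(R, step, step_bar):
    """Check that R is a one-step bisimulation between two finite
    transition relations, each a dict from a state to its successors."""
    for s, t in R:
        succ_s, succ_t = step.get(s, set()), step_bar.get(t, set())
        if not all(any((s2, t2) in R for t2 in succ_t) for s2 in succ_s):
            return False   # a move of the first network is unmatched
        if not all(any((s2, t2) in R for s2 in succ_s) for t2 in succ_t):
            return False   # a move of the second network is unmatched
    return True


def lift(rel_pa, rel_po):
    """Relate h @ p to hb @ pb iff h ~Pa hb and p ~Po pb."""
    return {((h, p), (hb, pb))
            for (h, hb), (p, pb) in product(rel_pa, rel_po)}


# Hypothetical networks: backup routers r3 and r4 are merged into r.
step = {("h", "x"): {("h", "r3"), ("h", "r4")},
        ("h", "r3"): {("h", "y")},
        ("h", "r4"): {("h", "y")}}
step_bar = {("h", "x"): {("h", "r")},
            ("h", "r"): {("h", "y")}}

rel_pa = {("h", "h")}
rel_po = {("x", "x"), ("r3", "r"), ("r4", "r"), ("y", "y")}
R = lift(rel_pa, rel_po)
print(is_bisimulation(R, step, step_bar))

# If r4 silently drops the packet, the symmetry is broken.
broken = {s: t for s, t in step.items() if s != ("h", "r4")}
print(is_bisimulation(R, broken, step_bar))
```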
4. Network surgery
We next give three examples of network transformations by various forms of surgery, whether at the levels of graphs, headers, or
rules. The networks are related to their transforms by bisimulations;
these, in their turn, allow application of appropriate versions of the
Van Benthem-Hennessy-Milner principle, justified by Proposition 1.
We also revisit Example 1, and show why the obstruction-freeness
assumption made there does not justify a helpful bisimulation.
Our first example of a network transformation may appear too
trivial to be useful (removing disconnected sets of nodes). However, as Figure 4 shows, it becomes effective after the following
example, a more complex surgery called slicing.
Example 2: Removing disconnected components
When a network N consists of two disconnected subnetworks we
can verify their properties separately. Suppose its nodes split into
two disjoint subsets Node1 and Node2 which are disconnected in
that for all ports n1.i1 and n2.i2:
n1 ∈ Node1 ∧ n2 ∈ Node2 =⇒ ¬ n1.i1 γ n2.i2
Then N naturally splits into two subnetworks N1 and N2 with
respective node sets Node1 and Node2 and with βNj and γNj
being the evident restrictions of βN and γN, for j = 1, 2. Note
that a port is internal (external) in an Nj iff it is in N.
We then have that N is bisimilar to each of the Nj. For j = 1, 2,
one can take as (one-step) bisimulation relation
h @ n.i ∼j h̄ @ n̄.ī ⇐⇒def h @ n.i = h̄ @ n̄.ī ∧ n ∈ Nodej
The logic for Nj has as formulas those of N that only mention
ports with nodes in Nodej. For such formulas, Proposition 1 tells
us that for all packets h and Nj-ports n.i we have:
h @ n.i |=N ϕ ⇐⇒ h @ n.i |=Nj ϕ
[Figure 4 panels: "Slice irrelevant portions"; "Transforms to, via slicing"; "Transforms to, via removal". Labels: R1, Nodes 1, Nodes 2, X, Y.]
Figure 4. When considering the reachability between stations in
the same subtree, the remainder of the network can be sliced away
(Example 3), leaving a disconnected set of Nodes that can be
removed by Example 2.
We now introduce the more powerful slicing transformation
illustrated in Figure 4. When verifying the reachability between
certain pairs of stations such as X and Y, under certain conditions
it is possible to slice away irrelevant portions of the network; we
formalize this as slicing.
Example 3: Slicing
Suppose we have a slice of the form S = H × P of a network
N, where H is a subset of the set of headers and P is a subset of
the set of ports. We wish to define a corresponding slice network
N|S obtained by cutting off all the ports not in P and restricting
the header space to H. We restrict ourselves to the case where P is
connection invariant, meaning that for all p, p′ ∈ PortN, we have:
p γN p′ =⇒ (p ∈ P ⇐⇒ p′ ∈ P)
Thus we are going to cut off the port at one end of a link if, and
only if, we cut off the port at its other end; in other words, we are
cutting off either external ports or whole links.
For slicing to work, we will need the slice to be a network
invariant, as defined above. In the case of slices of the form H × P
network invariance can be rephrased in terms of natural no-leakage
and header invariance conditions on network boxes. For every box
b : K occurring in the network (i.e., in the range of β) say that a box
port pair (i, j) ∈ K² is P-leaky if, and only if, for some p ∈ P
and q ∉ P we have:
∃n. β(n) = b ∧ p = n.i ∧ n.j γ q
and say that a box port pair (i, j) ∈ K² is P-connected if, and only
if, for some p, q ∈ P we have:
∃n. β(n) = b ∧ p = n.i ∧ (n.j γ q ∨ (q = n.j ∧ q is external))
Then the slice is a network invariant if, and only if, the following
two conditions hold:
• No leakage For all b : K, i ∈ K, and h ∈ H:
h @ i −→b h′ @ j =⇒ (i, j) is not P-leaky
• Header invariance For all b : K, h ∈ H, and P-connected pairs
(i, j):
h @ i −→b h′ @ j =⇒ h′ ∈ H
In the case where the slice is only on headers it evidently suffices to
check invariance of H for all pairs of ports of all boxes occurring
in the network.
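The no-leakage and header invariance conditions can be checked mechanically on a finite network. The following is a minimal sketch, not the authors' implementation: it assumes a hypothetical encoding in which `beta` maps nodes to box names, `peer` is a symmetric map from a port `(node, index)` to the port it is linked to (external ports are absent from it), and `trans` maps each box to a set of rules `(h, i, h2, j)` standing for h @ i −→b h2 @ j.

```python
def make_peer(links):
    """Turn a list of undirected links into a symmetric port -> port map (gamma)."""
    peer = {}
    for p, q in links:
        peer[p] = q
        peer[q] = p
    return peer


def is_connection_invariant(peer, P):
    """P is connection invariant if both ends of every link agree on membership in P."""
    return all((p in P) == (q in P) for p, q in peer.items())


def p_leaky(box, i, j, beta, peer, P):
    """(i, j) is P-leaky: some node n with box `box` has n.i in P and n.j linked outside P."""
    return any(
        beta[n] == box and (n, i) in P
        and (n, j) in peer and peer[(n, j)] not in P
        for n in beta
    )


def p_connected(box, i, j, beta, peer, P):
    """(i, j) is P-connected: n.i in P and n.j is linked into P, or is an external port in P."""
    for n in beta:
        if beta[n] != box or (n, i) not in P:
            continue
        q = peer.get((n, j))
        if q is not None and q in P:
            return True
        if q is None and (n, j) in P:  # n.j is external and belongs to P
            return True
    return False


def slice_is_invariant(H, P, beta, peer, trans):
    """Check the no-leakage and header invariance conditions for the slice H x P."""
    for box, rules in trans.items():
        for (h, i, h2, j) in rules:
            if h not in H:
                continue
            if p_leaky(box, i, j, beta, peer, P):
                return False  # no leakage violated
            if p_connected(box, i, j, beta, peer, P) and h2 not in H:
                return False  # header invariance violated
    return True
```

The encoding assumes that every port is linked to at most one other port; if that does not hold for the networks at hand, `peer` would have to map a port to a set of ports instead.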
To define the slice network N |S, we begin by defining the slice
signature Box|S. For every box b : K and every I ⊆ K, we assume
available a Box|S box b|I : I; intuitively, b|I is obtained from b by
slicing off all of b’s ports that are in K\I. We take the header space
to be H and its semantics is the restriction of that of b to H, that is,
for all h, h′ ∈ H and i, i′ ∈ I we have:
h @ i →Box|S,b|I h′ @ i′ ⇐⇒ h @ i →Box,b h′ @ i′
The slice network N|S is then obtained as follows: its node set
is that of N; its box assignment function is given by:
βN|S(n) = βN(n)|{i ∈ K | n.i ∈ P}
where β(n) : K; and its connection relation γN|S is the restriction
of γN to N|S’s ports, i.e., to P. We note that the set of ports of N|S
is P , as anticipated. (For n.i is such a port iff i ∈ K and n.i ∈ P ,
where β(n) : K. But if n.i ∈ P then n.i ∈ PortN and so i ∈ K;
so we see that n.i is indeed such a port iff it is in PortN |S .) Note
too that, as P is connection invariant, a port p ∈ P is an internal
(external) port of N |S if, and only if, it is of N .
In the particular case when we are only slicing on headers, that
is when P is the set of all ports, N |S can be taken to be N , although
one evidently still needs the restricted signature.
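The construction of N|S can be sketched directly from these definitions. The code below is an illustration under assumed data structures (they are not from the paper): `beta` maps nodes to box names, `box_ports` gives the port set K of each box, `peer` is a symmetric port-to-port map for γ, and `trans` maps a box to its rules `(h, i, h2, j)`. A sliced box b|I is represented by the pair `(b, I)`.

```python
def slice_network(H, P, beta, box_ports, peer, trans):
    """Build the slice network N|S for S = H x P.

    Returns (beta_s, peer_s, trans_s) where
      beta_s[n]        is the sliced box (b, I) with I = {i in K | n.i in P},
      peer_s           is gamma restricted to the ports in P,
      trans_s[(b, I)]  is the semantics of b restricted to headers in H and ports in I.
    """
    beta_s = {
        n: (b, frozenset(i for i in box_ports[b] if (n, i) in P))
        for n, b in beta.items()
    }
    peer_s = {p: q for p, q in peer.items() if p in P and q in P}
    trans_s = {
        (b, I): {
            (h, i, h2, j)
            for (h, i, h2, j) in trans.get(b, set())
            if h in H and h2 in H and i in I and j in I
        }
        for (b, I) in set(beta_s.values())
    }
    return beta_s, peer_s, trans_s
```

As in the text, the node set is unchanged; a node all of whose ports are cut off simply ends up with a box that has an empty port set.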
We now show that, restricted to S, the transition relations of N
and N |S are the same.
LEMMA 1.
1. The internal (external) transition relation of N|S
is contained in that of N.
2. The restriction of the internal (external) transition relation of
N to the slice S is contained in that of N|S.
PROOF.
1. Suppose, first, that h @ n.i −→N|S h′ @ n′.i′ in order to
show that h @ n.i −→N h′ @ n′.i′. Then for some j we
have h @ i −→Box|S,β(n) h′ @ j and also n.j γN|S n′.i′. We
then have h @ i −→Box,β(n) h′ @ j and n.j γN n′.i′ and so
h @ n.i −→N h′ @ n′.i′.
Next, suppose instead that there is an external transition of N|S
from h @ n.i to h′ @ n′.i′. Then we
have h @ i −→Box|S,β(n) h′ @ i′ and n.i′ is an external port of
N|S. We then have that h @ i −→Box,β(n) h′ @ i′ and, by the
above remark on external ports of N|S, that n.i′ is an external
port of N, and so there is an external transition of N from
h @ n.i to h′ @ n′.i′.
2. For the second part consider two states h @ n.i and h′ @ n′.i′
in S. Then suppose, first, that h @ n.i −→N h′ @ n′.i′ to show
that h @ n.i −→N|S h′ @ n′.i′. Then h @ i −→Box,βN(n)
h′ @ j and n.j γN n′.i′, for some j.
So, first, as n.i ∈ P (since h @ n.i is in S), i is in the set I =def
{i ∈ K | n.i ∈ P} where βN(n) : K. Then, as n′.i′ ∈ P
(since h′ @ n′.i′ ∈ S) and as P is connection invariant and
n.j γN n′.i′, we see that n.j ∈ P and so that j ∈ I also
holds. It follows that h @ i −→Box,βN(n)|I h′ @ j. Then, as
h, h′ ∈ H, we finally have h @ i −→Box|S,βN|S(n) h′ @ j.
Next as n.j γN n′.i′ and n.j, n′.i′ ∈ P we further have
that n.j γN|S n′.i′. Combining the last two facts, we have
h @ n.i −→N|S h′ @ n′.i′, as required.
Next, suppose instead that h @ i −→Box,βN(n) h′ @ i′ and
n.i′ is an external port of N. Here, as h @ n.i and h′ @ n′.i′
are in S, n.i and n′.i′ are in P; therefore, arguing as above, we
see that h @ i −→Box|S,βN|S(n) h′ @ i′ holds. But n.i′ is an
external port of N|S as it is an external port of N. So we find
that there is an external transition of N|S from h @ n.i to
h′ @ n′.i′, concluding the proof.
Turning to logic, for the logic of N|S, we take the N-formulas
that mention only ports of N|S. There is a natural potential bisimulation ∼S between N and N|S, defined by:
h @ p ∼S h̄ @ p̄ ⇐⇒def h @ p = h̄ @ p̄ ∈ S
THEOREM 1. The relation ∼S is a one-step bisimulation between
→N and →N|S if, and only if, S is a network invariant of N. In
that case, for any N|S formula ϕ and any state h @ p ∈ S we have:
h @ p |=N ϕ ⇐⇒ h @ p |=N|S ϕ
PROOF. Suppose that S is an invariant. To show that ∼S is a bisimulation between →N and →N|S, we suppose that h @ p ∼S
h̄ @ p̄ (i.e., that h @ p = h̄ @ p̄ ∈ S) and verify the usual two conditions.
First, if h @ p →N h′ @ p′ then, as S is an invariant, we
have h′ @ p′ ∈ S. So, applying Lemma 1, part 2, we have that
h @ p →N|S h′ @ p′. As h′ @ p′ ∼S h′ @ p′ (as h′ @ p′ ∈ S) we
have verified the first condition. Second, if h @ p →N|S h′ @ p′
then h′ @ p′ ∈ S and so h @ p →N h′ @ p′, by Lemma 1, part 1.
The second condition follows. One shows ∼S an external transition
bisimulation similarly.
Conversely, suppose that ∼S is a one-step bisimulation between
N and N|S. To show S invariant, suppose that h @ p ∈ S and
that N has a transition from h @ p to h′ @ p′. Then h @ p ∼S h @ p and so, as ∼S is a
bisimulation, N|S has a matching transition from h @ p to some h̄′ @ p̄′
with h′ @ p′ ∼S h̄′ @ p̄′, that is, with h′ @ p′ = h̄′ @ p̄′ ∈ S. Thus
S is indeed an invariant.
The final part follows from the assumption that ∼S is a bisimulation by applying Proposition 1 and the remarks made immediately
thereafter.
[Figure 5: boxes A, B, C, and D forwarding header h, before and after the transform "Remove h rule in B, redirect in A".]
Figure 5. When considering forwarding header h, redirecting traffic from A to only use C and dropping the relevant rule in B does
not impact the reachability of h.
Our next surgery (redirection), formalized in Example 4, keeps
the topology unchanged but drops rules. This, in turn, involves
redirecting the traffic covered by the dropped rules. First, notice
that A, in Figure 5, forwards packets addressed to h to
both B and C. This is done for good reasons in practice such as
load balancing.
However, given that certain conditions hold, to compute the
reachable set of packets addressed to h it suffices to redirect traffic
at A to only go to C, and drop the rules dealing with this traffic from
B. The advantage of this transformation is that it can be applied
repeatedly to gradually drain the rules out of backup routers, to
make a much simpler reduced network.
Example 4: Redirecting traffic
Suppose we have a network N in which traffic T ⊆ PacN flows
from a node a to a node b and then, without change, to a node d, but
there is an alternative flow, in which the same traffic moves from a
to yet another node c from which it also flows without change to d.
Then, as long as the box at d treats the traffic in the same way,
by whichever of the two routes it arrives, we can transform the
network by rerouting the traffic T that passes through b through c instead, and
removing the relevant rules from the box at b. This results in another
network N̄, with fewer rules.
While this transformation is clearly far from general, it seems
widely applicable to data center switching networks because of
the symmetry present there. In Section 6 below we point out how
one could pick out candidate nodes a, b, c, etc., using symmetry
considerations.
We next present the transformation and its background assumptions precisely. The nodes a, b, c, and d are supposed all distinct.
[Figure 6: nodes a, b, c, and d with boxes A, B, C, and D, linked through the ports a.i and b.i′, b.k and d.k′, a.j and c.j′, c.l and d.l′, before and after the transformation.]
Figure 6. The formal setup for Example 4
Let the boxes at these nodes be A : KA, B : KB, C : KC, and
D : KD, respectively. Suppose that a and b are connected by ports
a.i and b.i′; that b and d are connected by ports b.k and d.k′; that
a and c are connected by ports a.j and c.j′; and that c and d are
connected by ports c.l and d.l′. Note that we then also have that i
and j are distinct, as are i′ and k; j′ and l; and k′ and l′. This scheme
is depicted in Figure 6.
The background assumptions are that for any h ∈ T the following hold:
h @ i′ −→B h′ @ q ⇐⇒ h′ = h ∧ q = k
which says that T traffic entering at i′ is forwarded by B unchanged
at k,
h @ j′ −→C h′ @ q ⇐⇒ h′ = h ∧ q = l
which says that T traffic entering at j′ is forwarded by C unchanged
at l, and
h @ k′ −→D h′ @ q ⇐⇒ h @ l′ −→D h′ @ q
which says that D treats T traffic at k′ and l′ identically.
The network N̄ is obtained from N by replacing the boxes
A : KA and B : KB by boxes Ā : KA and B̄ : KB. The transition
rules of Ā are the same as those of A except for transitions to the
ports i or j, where we put instead:
h @ q →Ā h′ @ i ⇐⇒def h @ q →A h′ @ i ∧ h′ ∉ T
h @ q →Ā h′ @ j ⇐⇒def h @ q →A h′ @ j ∨
(h @ q →A h′ @ i ∧ h′ ∈ T)
The transition rules of B̄ are the same as those for B except for
transitions between ports i′ and k where we have:
h @ i′ →B̄ h′ @ k ⇐⇒def h @ i′ →B h′ @ k ∧ h ∉ T
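On finite rule sets the two definitions amount to a simple rewrite of the rules of A and B. The sketch below is illustrative only and assumes a hypothetical encoding of a box as a set of rules `(h, in_port, h2, out_port)`; the parameter names `i`, `j`, `i_in`, and `k` stand for the ports i, j, i′, and k of the text.

```python
def redirect(A, B, T, i, j, i_in, k):
    """Return the rule sets of the redirected boxes for traffic T.

    In the new A, every rule that outputs a T header on port i is moved to port j;
    all other rules are kept. In the new B, rules from port i_in to port k are
    dropped when the incoming header is in T.
    """
    A_bar = set()
    for (h, q, h2, out) in A:
        if out == i and h2 in T:
            A_bar.add((h, q, h2, j))   # redirected from port i to port j
        else:
            A_bar.add((h, q, h2, out))
    B_bar = {
        (h, p, h2, out)
        for (h, p, h2, out) in B
        if not (p == i_in and out == k and h in T)
    }
    return A_bar, B_bar
```

Because rule sets are sets, a redirected rule that already exists on port j is merged silently, which matches the disjunction in the definition of the transitions to j.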
Turning to the bisimulation relation ∼, we take h @ p ∼ h̄ @ p̄
to hold if, and only if, h = h̄ and one of the following two
conditions also holds:
- p = p̄ and, if h ∈ T, then p ≠ b.i′
- h ∈ T and one of the following two conditions holds:
p = b.i′ ∧ p̄ = c.j′
p = d.k′ ∧ p̄ = d.l′
Note that this bisimulation relation is not the composition of
separate relations between packets and ports.
THEOREM 2. The relation ∼ is a one-step bisimulation between
−→N and −→N̄.
PROOF. We first show ∼ a bisimulation of the −→ relation. We
consider all the cases when a relation h @ p ∼ h̄ @ p̄ holds, and
verify the bisimulation condition in each. We always have h = h̄.
Suppose first that p = p̄. Then either h ∉ T or else p ≠ b.i′.
There are two cases:
In the first case, we assume that p is not a port of the
node a. In this case, a →-successor state of h @ p of either
network cannot be of the form h′ @ b.i′ and so is ∼-related
to itself.
Further, both networks have the same →-successor states,
as the relevant transitions are the same in both networks.
This is evident in the case of the c and the d nodes where
the corresponding boxes are the same in both networks. In
the case of the node b, the boxes B and B̄ have the same
transitions, except for those from i′; however in this case we
then have h ∉ T and so B and B̄ have the same transitions
from h @ i′.
As the →-successor states are the same in both networks
and are ∼-related to themselves, the bisimilarity condition
holds in this case.
In the second case we assume that p has the form a.q,
and consider the possible forms h′ @ q′ of the A and Ā
successors of h @ q.
These are the same if h′ ∉ T or else q′ ≠ i and q′ ≠ j. The
corresponding →-successor states in the two networks are
then equal and in the ∼ relation to themselves.
Otherwise, h′ ∈ T and q′ = i or q′ = j, and we note:
− Every A successor h′ @ i can be paired off with the Ā
successor h′ @ j, and h′ @ b.i′ ∼ h′ @ c.j′.
− Every A successor h′ @ j can be paired off with the
Ā successor h′ @ j, and h′ @ c.j′ ∼ h′ @ c.j′; and this
accounts for all the remaining Ā successors of the form
h″ @ j with h″ ∈ T.
− There are no Ā successors of h @ q of the form h′ @ i.
This verifies the bisimilarity condition in this case.
Suppose next that p ≠ p̄. Then h ∈ T, and there are two cases:
In the first case p = b.i′ and p̄ = c.j′. So, employing
the first two background assumptions, the only possible →-successor state of N is h @ d.k′, and the only possible next
such state of N̄ is h @ d.l′. As h @ d.k′ ∼ h @ d.l′, the
bisimulation condition holds in this case.
In the second case, p = d.k′ and p̄ = d.l′. By the third
background assumption the h′ @ q reachable from h @ k′
and those reachable from h @ l′ by a D transition are the
same, and so too, therefore, are the states of the two networks
reachable from h @ p and h̄ @ p̄ by a →-transition.
Further, for such a q, d.q will either be an external port or
else be connected to a port other than b.i′ (as that port is
connected to another port). So if h′ @ p′ is reachable by
a → transition from h @ p (equivalently from h̄ @ p̄) then
we have h′ @ p′ ∼ h′ @ p′. The bisimulation condition
therefore holds in this case too.
We next show that ∼ is a bisimulation of the external transition relation. The
two networks have the same external transition relation. Also, h @ p ∼ h @ p
whenever there is an external transition from h @ p. The only case where
∼ relates two different states, one of which may make a transition, is h @ d.k′ ∼ h @ d.l′, with h ∈ T, and in that case one can
make a transition to a state h′ @ p′ iff the other can, as D treats
T traffic at k′ and l′ identically. Putting these together, we see that
∼ is a bisimulation of the external transition relation.
Turning to logic we take the logic of N̄ to be that of N,
and consider formula correlations. For packet header formulas we
evidently always have
α ∼For α
For port formulas we assume available a boolean combination ϕT
of packet header formulas defining T. We then have the following
correlations:
@ p ∼For @ p
if p is not b.i′, c.j′, d.k′, or d.l′,
¬ϕT ∧ @ p ∼For ¬ϕT ∧ @ p
if p is c.j′, d.k′, or d.l′,
ϕT ∧ ( @ b.i′ ∨ @ c.j′ ) ∼For ϕT ∧ @ c.j′
and
ϕT ∧ ( @ d.k′ ∨ @ d.l′ ) ∼For ϕT ∧ ( @ d.k′ ∨ @ d.l′ )
So, in particular, if we are “away from the surgery” the same
assertions hold. More precisely, if p is not b.i′, c.j′, d.k′, or d.l′,
and ϕ does not mention any of these ports, then, for any h, we have:
h @ p |=N ϕ ⇐⇒ h @ p |=N̄ ϕ
There are variations on this transformation. Regarding the definition of A, one can instead leave its transitions to i unchanged,
when the proof that ∼ is a one-step bisimulation relation goes
through unchanged. It may also happen that the transitions to j are
unchanged, as the ones being added are already there. This is common in the core where ports such as i and j are treated symmetrically for the usual redundancy and performance reasons.
Next, let us assume, as is common in switching networks, that
packets are forwarded without being transformed. We can then
weaken the condition that D treats T traffic at k′ and l′ identically.
Given a subset G of ports (to be “guarded”), we define a forwarding
relation ≡TG on ports. It is the least relation such that p ≡TG q if,
and only if, one of the following holds:
• p = q, or
• p, q ∉ G and both of the following hold
∀h ∈ T, p′. h @ p →N h @ p′ ⇒ ∃q′. h @ q →N h @ q′ ∧ p′ ≡TG q′
∀h ∈ T, q′. h @ q →N h @ q′ ⇒ ∃p′. h @ p →N h @ p′ ∧ p′ ≡TG q′
Then ≡TG is an equivalence relation and if p ∈ G and p ≡TG q
then p = q. The weaker condition on D is then that d.k′ ≡TG d.l′.
One can define a bisimulation exactly as before, except that one
replaces the second clause in the case where h ∈ T (i.e., the one
that p = d.k′ and p̄ = d.l′) by p ≡TG p̄. As before, the same
assertions hold if we are away from the surgery, i.e., the ports b.i′,
c.j′, or any ports in the ≡TG-relation to a different port, and so,
in particular, any port in G. As before, one may also leave the
transitions to i of A unchanged.
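For a finite network, the least relation ≡TG can be computed by iterating from the identity relation and adding pairs until nothing changes. The following sketch assumes a hypothetical function `succ(h, p)` returning the set of ports p′ with h @ p →N h @ p′; neither this function nor the encoding of ports comes from the paper.

```python
def forwarding_equivalence(ports, T, G, succ):
    """Compute the least relation R on `ports` such that p R q iff p == q, or
    p and q are both unguarded (not in G) and, for every header in T, each
    successor of p is matched by an R-related successor of q and vice versa."""
    R = {(p, p) for p in ports}
    changed = True
    while changed:
        changed = False
        for p in ports:
            for q in ports:
                if (p, q) in R or p in G or q in G:
                    continue
                forth = all(
                    any((p2, q2) in R for q2 in succ(h, q))
                    for h in T for p2 in succ(h, p)
                )
                back = all(
                    any((p2, q2) in R for p2 in succ(h, p))
                    for h in T for q2 in succ(h, q)
                )
                if forth and back:
                    R.add((p, q))
                    changed = True
    return R
```

Each pass only adds pairs justified by pairs already present, so the loop stops at the least fixed point; two unguarded ports are related exactly when their T traffic reaches the same guarded ports along matching paths.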
Returning to Example 1, let us see why there need not be
a relevant bisimulation between the two networks, by which we
mean a bisimulation allowing the two atomic formulas @ r1 .i0 and
@ v2 .j0 to be transferred. The problem is that although a packet h ∈
T can go from r1 .i0 to v2 .j0 via the core, it might also enter the core
and then go somewhere else and then be dropped. Now consider
the following reachability formula (generally stronger than ϕreach )
αT ∧ @ r1 .i0 ⇒ ♦ @ v2 .j0
This holds in N̄ , but may fail to hold in N . However, if that is the
case, then, by Proposition 1, there can be no bisimulation between
N and N̄ that allows the two atomic formulas to be transferred.
5. Composite bisimulations
We now see how composite bisimulations can be obtained. These
use “topological” relations between networks and their signatures
to compose together bisimulation relations between the transition
relations of boxes to form bisimulations between the networks.
This contrasts with the bisimulations in Section 4 which are rather
constructed in an “ad hoc” way, according to the need at hand.
Suppose we have two signatures $\mathrm{Box}$ and $\overline{\mathrm{Box}}$. Then a signature
relation between the two signatures consists of:
• a box relation $\sim_B$ between $\mathrm{Box}$ and $\overline{\mathrm{Box}}$, and
• for each so-related pair of boxes $b : K$, $\bar b : \bar K$, a box port relation
$\sim_{Po}^{b,\bar b} \subseteq K \times \bar K$ between their ports.
Suppose next that we have two networks $N$ and $\bar N$ built over
the two signatures. Then a topological relation between $N$ and
$\bar N$ consists of a signature relation between the two signatures, as
above, together with a node relation $\sim_N$ between their sets of nodes
such that:
• if $n \sim_N \bar n$ then $\beta_N(n) \sim_B \beta_{\bar N}(\bar n)$, and
• the network port relation $\sim_{Po}$ between $\mathrm{Port}_N$ and $\mathrm{Port}_{\bar N}$
defined by:
\[ n.i \sim_{Po} \bar n.\bar i \;\overset{\mathrm{def}}{\iff}\; n \sim_N \bar n \,\wedge\, i \sim_{Po}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar i \]
is a bisimulation of $\gamma$ by $\bar\gamma$. (Note that then if $n.i \sim_{Po} \bar n.\bar i$ then
$n.i$ is external iff $\bar n.\bar i$ is.)
Turning to the box transition relations, assume first that we have
two header spaces $\mathrm{Pac}_N$ and $\mathrm{Pac}_{\bar N}$. Then a packet header relation
is a relation $\sim_{Pa}$ between the two header spaces, $\mathrm{Pac}_N$ and $\mathrm{Pac}_{\bar N}$.
Assume next that we have a semantics for each of the signatures,
that is, we have families of transition relations $\to_{N,b}$ ($b \in \mathrm{Box}$)
and $\to_{\bar N,\bar b}$ ($\bar b \in \overline{\mathrm{Box}}$) for the two signatures, as considered above.
Then, given a signature relation between the two signatures as
above, a packet header relation $\sim_{Pa}$ is a signature bisimulation
between the two signatures if, and only if, the box-located packet
relation $\sim_{Blp}^{b,\bar b}$, defined by:
\[ h @ i \sim_{Blp}^{b,\bar b} \bar h @ \bar i \;\overset{\mathrm{def}}{\iff}\; h \sim_{Pa} \bar h \,\wedge\, i \sim_{Po}^{b,\bar b} \bar i \]
is a bisimulation between $\to_b$ and $\to_{\bar b}$, whenever $b \sim_B \bar b$.
Given a topological relation (comprising a signature relation
and a node relation) and a packet relation as above, we have a
relation $\sim_{Pa}$ between the two sets of packets and a relation $\sim_{Po}$
between the two sets of ports. So we can define a relation $\sim_{Net}$
between the corresponding located packet sets as remarked in Section 3, i.e., by putting:
\[ h @ p \sim_{Net} \bar h @ \bar p \;\overset{\mathrm{def}}{\iff}\; h \sim_{Pa} \bar h \,\wedge\, p \sim_{Po} \bar p \]
We then have:
PROPOSITION 2. If $\sim_{Pa}$ is a signature bisimulation, then $\sim_{Net}$ is
a one-step bisimulation relation between $N$ and $\bar N$.
PROOF. Assume $\sim_{Pa}$ is a signature bisimulation. We give the
first halves of the proofs that $\sim_{Net}$ is an internal transition relation
bisimulation and that it is an external one; the second halves are
similar.
Suppose $h @ n.i \sim_{Net} \bar h @ \bar n.\bar i$. Then $h \sim_{Pa} \bar h$, $n \sim_N \bar n$, and
$i \sim_{Po}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar i$, and so $h @ i \sim_{Blp}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar h @ \bar i$.
If, first, $h @ n.i \to_N h' @ n'.i'$, then $h @ i \to_{\beta_N(n)} h' @ j$
and $n.j \;\gamma\; n'.i'$, for some $j$. As $\sim_{Blp}$ is a bisimulation of $\to_{\beta_N(n)}$
by $\to_{\beta_{\bar N}(\bar n)}$, there are $\bar h'$, $\bar j$ such that $\bar h @ \bar i \to_{\beta_{\bar N}(\bar n)} \bar h' @ \bar j$
and $h' @ j \sim_{Blp}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar h' @ \bar j$ (and so $h' \sim_{Pa} \bar h'$ and
$j \sim_{Po}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar j$). Next, as $n \sim_N \bar n$ and $j \sim_{Po}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar j$,
$n.j \sim_{Po} \bar n.\bar j$. So as $n.j \;\gamma\; n'.i'$ and $\sim_{Po}$ is a bisimulation of $\gamma$ by
$\bar\gamma$, there are $\bar n'$ and $\bar i'$ such that $\bar n.\bar j \;\bar\gamma\; \bar n'.\bar i'$ and $n'.i' \sim_{Po} \bar n'.\bar i'$.
Then, as $\bar h @ \bar i \to_{\beta_{\bar N}(\bar n)} \bar h' @ \bar j$ and $\bar n.\bar j \;\bar\gamma\; \bar n'.\bar i'$, we have that
$\bar h @ \bar n.\bar i \to_{\bar N} \bar h' @ \bar n'.\bar i'$. Finally, as $h' \sim_{Pa} \bar h'$ and $n'.i' \sim_{Po} \bar n'.\bar i'$
we also have $h' @ n'.i' \sim_{Net} \bar h' @ \bar n'.\bar i'$, as required.
If, instead, $N$ has an external transition from $h @ n.i$ to $h' @ n.i'$, then $h @ i \to_{\beta_N(n)} h' @ i'$
and $n.i'$ is external. So, as $\sim_{Blp}$ is a bisimulation of $\to_{\beta_N(n)}$
by $\to_{\beta_{\bar N}(\bar n)}$, there are $\bar h'$, $\bar i'$ such that $\bar h @ \bar i \to_{\beta_{\bar N}(\bar n)} \bar h' @ \bar i'$
and $h' @ i' \sim_{Blp}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar h' @ \bar i'$ (and so $h' \sim_{Pa} \bar h'$ and
$i' \sim_{Po}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar i'$). Next, as $n \sim_N \bar n$ and $i' \sim_{Po}^{\beta_N(n),\beta_{\bar N}(\bar n)} \bar i'$,
$n.i' \sim_{Po} \bar n.\bar i'$. But then as $n.i'$ is external, we see that $\bar n.\bar i'$ is
external too.
So, as $\bar h @ \bar i \to_{\beta_{\bar N}(\bar n)} \bar h' @ \bar i'$, $\bar N$ has an external transition from $\bar h @ \bar n.\bar i$ to $\bar h' @ \bar n.\bar i'$.
Finally, as $h' \sim_{Pa} \bar h'$ and $n.i' \sim_{Po} \bar n.\bar i'$ we also have that
$h' @ n.i' \sim_{Net} \bar h' @ \bar n.\bar i'$, as required.
6. Network symmetries and merging headers
We use composite symmetries in two ways. The first is to define
various notions of symmetries, particularly the topological ones
that typically hold in data centers, and then show how local symmetries can be found and also how to divide out networks by symmetry groups. The second is to construct equivalences between
packet headers that allow us to replace assertions about one header
by an assertion about an equivalent one, or to go further and replace
headers by their equivalence classes.
Example 5: Network symmetry
At the most general level, a symmetry of a network N is a permutation πN of network states which preserves and reflects both →∗N
and the corresponding closure of the external transition relation. That is, for all network states h @ p and h′ @ p′, we have:
h @ p −→∗N h′ @ p′ ⇐⇒ πN(h @ p) −→∗N πN(h′ @ p′)
and similarly for external transitions. More strictly, a one step symmetry of N is
required to preserve and reflect →N and the external transition relation. One could also
require that the permutation is composed from separate permutations of the headers and the network ports (note that if the header
space is finite, then the reflection condition is redundant).
The network structure provides a good source of exploitable
symmetries, and so we focus on composite symmetries which are
simply those composite bisimulations between a network and itself
where all the relations are bijections. (We note in passing the evident fact that all these various classes of symmetries form groups.)
Let us spell this out in functional terms. Assume given a signature Box. Then a signature symmetry over Box consists of:
• a box permutation $\pi_{Bo} : \mathrm{Box} \cong \mathrm{Box}$, and
• for all boxes $b : K$, a box port bijection $\pi_{Po}^{b} : K \cong K$, where
$\pi_{Bo}(b) : K$.
Next, given a network N over Box, a topological symmetry of
N is a signature symmetry, together with a node permutation
$\pi_{No} : \mathrm{Node} \cong \mathrm{Node}$ such that:
• For all $n$, we have $\pi_{Bo}(\beta(n)) = \beta(\pi_{No}(n))$, and
• For all ports $n.i$ and $n^{+}.i^{+}$, we have:
\[ n.i \;\gamma\; n^{+}.i^{+} \iff \pi_{Po}(n.i) \;\gamma\; \pi_{Po}(n^{+}.i^{+}) \]
where the port bijection $\pi_{Po} : \mathrm{Port} \cong \mathrm{Port}$ is defined by:
\[ \pi_{Po}(n.i) = \pi_{No}(n).\pi_{Po}^{\beta(n)}(i) \qquad (n.i \in \mathrm{Port}) \]
We can associate a graph ΓN to every network N . It has nodes
those of the network and has relation RN where:
RN(n, n′) ⇐⇒def ∃n.i, n′.i′ ∈ Port. n.i γ n′.i′
Then every topological symmetry πN o is a symmetry of ΓN .
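The graph ΓN and the check that a node permutation is one of its symmetries are easy to state concretely. The sketch below assumes a hypothetical encoding in which `peer` is a symmetric map from ports `(node, index)` to the ports they are linked to; it only tests the graph-level condition, not the box and port conditions of a full topological symmetry.

```python
def network_graph(peer):
    """Edges of the graph of a network: pairs of nodes joined by at least one link."""
    return {(p[0], q[0]) for p, q in peer.items()}


def is_graph_symmetry(pi_node, nodes, edges):
    """True if `pi_node` is a permutation of `nodes` that preserves and reflects `edges`."""
    if set(pi_node) != set(nodes) or set(pi_node.values()) != set(nodes):
        return False
    return all(
        ((pi_node[m], pi_node[n]) in edges) == ((m, n) in edges)
        for m in nodes for n in nodes
    )
```

A permutation that fails this test cannot come from a topological symmetry, so the check can serve as a cheap filter before the port-level conditions are examined.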
Turning to the semantics, suppose we have a header space
PacN , and families of transition relations −→N ,b (b ∈ Box),
as usual. Then, given a signature symmetry as above, a packet
bijection
$\pi_{Pa} : \mathrm{Pac}_N \cong \mathrm{Pac}_N$
is a symmetry of the signature semantics, if, for all $h$, $h'$, $b : K$, and
all $i, i' \in K$, we have:
\[ h @ i \to_{b} h' @ i' \iff \pi_{Pa}(h) @ \pi_{Po}^{b}(i) \to_{\pi_{Bo}(b)} \pi_{Pa}(h') @ \pi_{Po}^{b}(i') \]
Figure 2 provides an example of a topological symmetry where
the boxes R3 and R4 , and their nodes would be permuted, but all
others would be fixed, and where the ports are permuted as shown.
If, further, the two boxes contained the same rules (modulo box
port bijections) then the identity packet bijection would provide a
signature semantics symmetry.
Given a topological symmetry πN o (comprising a signature
symmetry and a node permutation) and a packet bijection πPa as
above, we can define a bijection of the located packet set by:
πN et (h @ n.i) =def πPa (h) @ πPo (n.i)
and we have the following corollary of Proposition 2:
COROLLARY 1. If $\pi_{Pa}$ is a symmetry of the signature semantics,
then $\pi_{Net}$ is a one-step symmetry of N.
Turning to the logic, assume that for each packet formula $\alpha$
there is a formula $\alpha^{\pi}$ such that:
\[ h \models \alpha^{\pi} \iff \pi_{Pa}^{-1}(h) \models \alpha \]
We then obtain an evident homomorphic definition of $\varphi^{\pi}$ for N-formulas $\varphi$, where:
\[ (@\,p)^{\pi} = @\,\pi_{Po}(p) \qquad (\neg\varphi)^{\pi} = \neg\varphi^{\pi} \qquad (\varphi \wedge \psi)^{\pi} = \varphi^{\pi} \wedge \psi^{\pi} \]
\[ (\Diamond\varphi)^{\pi} = \Diamond\varphi^{\pi} \qquad (F\varphi)^{\pi} = F\varphi^{\pi} \]
Proposition 1 then tells us that, for any N-formula $\varphi$, we have:
\[ h @ p \models \varphi \iff \pi_{Pa}(h) @ \pi_{Po}(p) \models \varphi^{\pi} \]
We can use symmetry considerations to pick out candidates for
traffic-redirection surgery. We would like to find two symmetrically
placed nodes b and c which are candidates for redirecting traffic
(from b to c), so wish to find a box symmetry $\pi_{Bo}$ and an associated topological symmetry $\pi_{No}$ which switches b and c. Let us
say that the symmetry is local if $\pi_{No}$ leaves all the other nodes invariant. For such a local symmetry b and c will have the same $\Gamma_N$
neighbors. If there is only one link between any two nodes, as is
common, the signature symmetry is also then determined.
So the suggestion is to look for two nodes with the same neighbors, and use the node connections to determine the port correlations between them (two ports are correlated if they link to the same
neighbor). One can then choose pairs of such ports (other than any
ports connecting b and c) to search for candidates for i′ and j′, or
for k and l, for traffic redirection.
Figure 2 again provides an example. The two required nodes are
those corresponding to the two boxes R3 and R4.
We next sketch how to quotient a network N by a group of
topological symmetries. In this way we may be able to “slim” fat
trees, as discussed above. First, let us briefly recall the relevant
background material (and see, e.g. [3, §17]). An action of a group
G on a set X is a map · : G × X → X, such that the following two
equations hold:
\[ e \cdot x = x \qquad (g' \cdot g) \cdot x = g' \cdot (g \cdot x) \]
The orbit equivalence relation is then defined on X by:
\[ x \sim_G y \;\overset{\mathrm{def}}{\iff}\; \exists g \in G.\; g \cdot x = y \]
and we write X/G for the set of equivalence classes [x] of elements
x of X under this equivalence relation.
Under componentwise composition, the topological symmetries
$(\pi_{Bo}, \pi_{Po}, \pi_{No})$ form a group $\mathrm{Top}_N$, with three actions: on Box;
on BoxPort =def {b.i | b : K, i ∈ K}; and on Node, where:
\[ (\pi_{Bo}, \pi_{Po}, \pi_{No}) \cdot b \;=_{\mathrm{def}}\; \pi_{Bo}(b) \]
\[ (\pi_{Bo}, \pi_{Po}, \pi_{No}) \cdot (b.i) \;=_{\mathrm{def}}\; \pi_{Bo}(b).\pi_{Po}^{b}(i) \]
\[ (\pi_{Bo}, \pi_{Po}, \pi_{No}) \cdot n \;=_{\mathrm{def}}\; \pi_{No}(n) \]
Let G be a subgroup of $\mathrm{Top}_N$. Let Box/G, BoxPort/G, and
Node/G, be the collections of orbit equivalence classes for each
of its three actions, inherited from $\mathrm{Top}_N$. We define a signature on
Box/G by setting [b] : {[b.i] | i ∈ K}, for each b : K (ignoring that
the [b.i] are not natural numbers). Then we can define a quotient
network N/G over the signature. It has node set Node/G, box
assignment function $\beta_{N/G}$, where
\[ \beta_{N/G}([n]) \;=_{\mathrm{def}}\; [\beta_N(n)] \]
and connection relation $\gamma_{N/G}$ where
\[ m.q \;\gamma_{N/G}\; m'.q' \;\overset{\mathrm{def}}{\iff}\; \exists n, n', b, b', i, i'.\; n.i \;\gamma_N\; n'.i' \wedge \beta_N(n) = b \wedge \beta_N(n') = b' \wedge m = [n] \wedge m' = [n'] \wedge q = [b.i] \wedge q' = [b'.i'] \]
Assume next that the identity packet permutation is a symmetry
of the signature semantics, given any signature component of any
element of G. Then we can define a transition relation for Box/G,
keeping the packet set the same as that of N , by:
h @ q →c h′ @ q′ ⇐⇒def ∃b, i, i′. h @ i →b h′ @ i′ ∧ c = [b] ∧ q = [b.i] ∧ q′ = [b.i′]
So we have defined the syntax and semantics of the quotient
network N /G, as desired.
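The orbit classes used in the quotient can be computed from a set of generators of G alone, since on a finite set the orbit of an element is its closure under the generators. The sketch below works at the level of nodes only and is an illustration under assumptions: each generator is given as a dict encoding a permutation, and the topology in the usage note is hypothetical.

```python
def orbits(elements, generators):
    """Partition `elements` into orbits of the group generated by `generators`.

    Each generator is a dict mapping every element to its image under a permutation.
    Orbits are returned in order of first appearance in `elements`.
    """
    seen, classes = set(), []
    for x in elements:
        if x in seen:
            continue
        orbit, frontier = {x}, [x]
        while frontier:
            y = frontier.pop()
            for g in generators:
                z = g[y]
                if z not in orbit:
                    orbit.add(z)
                    frontier.append(z)
        seen |= orbit
        classes.append(frozenset(orbit))
    return classes


def quotient_edges(edges, classes):
    """Map each edge (m, n) to the edge ([m], [n]) between orbit classes."""
    cls = {x: c for c in classes for x in c}
    return {(cls[m], cls[n]) for m, n in edges}
```

For instance, with nodes R1 to R4 and a single generator swapping R3 and R4, the node orbits are {R1}, {R2}, and {R3, R4}, and the two parallel paths through R3 and R4 collapse into one path in the quotient.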
There is a one-step bisimulation between N and N /G, formed
by combining the identity relation on packets and the relation ∼Po
on ports, where:
n.i ∼Po [n].[n.i] ⇐⇒ def n ∈ [n] ∧ n.i ∈ [n.i]
Finally, assuming the evident logic for N /G, we have:
@ p1 , . . . , pm ∼For [p]
if [p] = {p1 , . . . , pm }.
Figure 2 again provides an example, assuming it depicts a composite symmetry as discussed above. The permutation group G is
that generated by the symmetry, when, for example, the box orbit
equivalence classes would be singletons except for {R3, R4}. The
right-hand-side of the figure depicts the net after division by the
symmetry group, with, for example, R3 depicting {R3, R4}. In
general, one might have a number of such pairs of symmetrically
placed boxes, when G would be the group generated by the various
pair symmetries.
Example 6: Merging headers
Suppose we have a signature Box with semantics given as above.
Then a binary (signature) invariant is a bisimulation ∼Pa between
the signature and itself. Spelling this out, what is required is that
∼Pa is a relation on the set of packet headers such that, for any h,
h̄, if h ∼Pa h̄ then, for all boxes b : K and i, i′ ∈ K:
(i) h @ i −→b h′ @ i′ =⇒ ∃h̄′. h′ ∼Pa h̄′ ∧ h̄ @ i −→b h̄′ @ i′
holds for all h′, as does
(ii) h̄ @ i −→b h̄′ @ i′ =⇒ ∃h′. h′ ∼Pa h̄′ ∧ h @ i −→b h′ @ i′
for all h̄′. Taking the other relations to be the relevant identity relations, Proposition 2 applies, and one obtains a one-step bisimulation of N by itself.
Proposition 1 then tells us that for any formula ϕ and headers
h ∼Pa h̄ we have:
h |=N ϕ ⇐⇒ h̄ |=N ϕ
provided ϕ is built, using the connectives, from formulas of the
form @ p or formulas that are ∼Pa-invariant, by which is meant
that
∀h, h̄. h ∼Pa h̄ =⇒ (h |=N ψ ⇐⇒ h̄ |=N ψ)
holds, and which are boolean combinations ψ of basic formulas.
We can go further, and identify equivalent packet headers. First,
we can assume that ∼Pa is a partial equivalence relation (i.e., that
it is transitive and symmetric, but not necessarily reflexive); for if
it is not, one need only consider its transitive symmetric closure.
The cases where reflexivity holds are given by the domain of ∼Pa ,
H =def {h | h ∼Pa h} (and H × Port is easily seen to be a
network invariant). Restricted to its domain, ∼Pa is an equivalence
relation and we can change the set of packet headers to its set of
equivalence classes. That is, we change the header space to:
PacN / ∼Pa =def {[h] | h ∈ H}
Having done so, we can define a new semantics for boxes, where:
[h] @ i −→b [h′] @ i′ ⇐⇒def h @ i −→b h′ @ i′
Then we can define the network N / ∼Pa ; this is the same as N ,
except for the above changes in packet headers and box semantics.
As may be expected, N and N /∼Pa are one-step bisimilar, taking
the packet header bisimulation ∈PacN to be membership:
h ∈PacN [h̄] ⇐⇒def h ∈ [h̄] (⇐⇒ h ∼Pa h̄)
and the other relations to be the relevant diagonals.
Turning to logic, we take the basic formulas of N / ∼Pa to be
the ∼Pa -invariant boolean combinations ψ of basic formulas and
set:
[h] |= ψ ⇐⇒ h |= ψ
Then, applying Proposition 1, we find that, for any N / ∼Pa formula ϕ, and any h ∈ H:
h |=N ϕ ⇐⇒ [h] |=N /∼Pa ϕ
The maximal signature bisimulation provides a natural choice
of binary invariant; it is automatically an equivalence relation. In
the case where the signature semantics does not change packet
headers, that is where packet headers are forwarded unchanged,
one can give a more explicit form of the maximal signature bisimulation, first (implicitly) considered by Yang and Lam [30] in the
particular case where the output ports of the signature transition
relations do not depend on the input ports.
To see this, first note that in the case where packet headers are
forwarded unchanged, ∼Pa is a signature invariant if, and only if,
whenever h ∼Pa h̄ then, for all boxes b : K and i, i′ ∈ K, we have:
h @ i →b h @ i′ ⇐⇒ h̄ @ i →b h̄ @ i′
Next, identifying predicates on headers with sets of headers, given
a set ℋ of predicates, one can define an equivalence relation on
headers by taking h ∼ℋ h̄ to hold if, and only if, for all H ∈ ℋ
we have:
h ∈ H ⇐⇒ h̄ ∈ H
Then Yang and Lam’s atomic predicates are the equivalence classes
of this relation. Taking ℋ to be the sets {h | h @ i →b h @ i′},
where b : K and i, i0 ∈ K, one obtains an equivalence relation ≡YL
slightly generalising that of Yang and Lam, which we therefore call
Yang-Lam equivalence.
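For a finite header set with unchanged headers, Yang-Lam equivalence can be computed by grouping headers according to the set of forwarding predicates they satisfy. The sketch below is an illustration over explicit header sets, under an assumed encoding in which `trans` maps each box to a set of triples `(h, i, j)` standing for h @ i →b h @ j; it is not the symbolic representation used in practice.

```python
from collections import defaultdict


def yang_lam_classes(headers, trans):
    """Partition `headers` into Yang-Lam equivalence classes.

    Two headers are equivalent iff they belong to exactly the same predicates
    {h | h @ i ->_b h @ j}, i.e. iff they enable the same (box, in, out) triples.
    """
    enabled = defaultdict(set)
    for box, rules in trans.items():
        for (h, i, j) in rules:
            enabled[h].add((box, i, j))
    classes = defaultdict(set)
    for h in headers:
        classes[frozenset(enabled[h])].add(h)
    return {frozenset(c) for c in classes.values()}
```

Headers that no box forwards all satisfy no predicate and therefore fall into a single class.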
Yang and Lam demonstrated that impressive efficiencies in verification could be achieved by replacing packet headers by (representations of) their equivalence classes (as discussed above) since
there can be many fewer equivalence classes than headers in the
networks they considered. As we shall see, this also holds for the
Singapore network.
Comparing the above reformulation of signature invariants with
Yang-Lam equivalence we see that ∼Pa is a signature invariant if,
and only if, ∼Pa ⊆ ≡YL . This proves the first part of the following
theorem:
THEOREM 3. Under the assumption that the signature semantics
does not change packet headers, the following are the same:
• the maximal signature bisimulation
• Yang-Lam equivalence
Further, in case all boxes occur in N (i.e., are in the range of βN ),
then they are also the same as:
• the maximal relation on headers which, when combined with
the identity relation on ports (as in Section 3), forms a bisimulation.
PROOF.
For the proof of the second part of the theorem, the condition
that a packet header relation ∼Pa forms a bisimulation when combined with the identity relation on ports is clearly equivalent to asking that, whenever h ∼Pa h′ then, for all nodes n, i ∈ K (where β(n) : K), and ports q, we
have:
h @ n.i →N h @ q ⇐⇒ h′ @ n.i →N h′ @ q (∗)
Fix a node n, and an i ∈ K (where b : K, setting b =def β(n)),
and consider the possible q for which (∗) holds. There are three
cases. In the first, q = n.j for some j and q is external. Here,
inspection of the definition of →N tells us that (∗) is equivalent
to
h @ i →b h @ j ⇐⇒ h′ @ i →b h′ @ j
In the second, q = n′.j′ and n.j γ n′.j′ for some port n′.j′ and
j ∈ K. Inspection of the definition of →N again tells us that (∗) is
equivalent to
h @ i →b h @ j ⇐⇒ h′ @ i →b h′ @ j
The third case is when q has neither of these forms, in which case
(∗) is trivially true (more precisely, both sides of the equivalence are
false).
As every port at any node is either internal or external, and as
every box is in the range of β, we see from the characterisation of
signature bisimulations in the case that headers are left unchanged
in signature transitions that ∼Pa forms a bisimulation when combined with the identity relation on ports iff it is a signature bisimulation.
7.
Experiments
We describe our benchmark network and then show the use of variants of the surgeries of Examples 1 and 4, together with a header
equivalence relation, to obtain large speedups compared to experiments described in earlier work [20]. The first is systematically
applicable to all data center networks with a distinguished core; the
others are applied automatically, in seconds, on large benchmarks;
we also briefly describe how they are implemented.
7.1
Setup
As an initial experimental test of some of the ideas considered in
this paper, we worked on a Microsoft production data center located
in Singapore. This is a fairly large switching network, with 52 core
routers, each with about 800 forwarding rules (but no ACLs), and
with 90 ToRs with about 800 rules and 100 ACLs each. In total,
this network has about 820K forwarding and ACL rules and is a
reasonable example of a complex data center.
We used parsers to automatically extract routing tables from
the Arista and Cisco devices in this network using a “show ip
route” command. This produced a set of routing tables including
the ECMP routing options. Each router is also annotated with a
name from which it is easy to syntactically determine its level in
the topology (e.g., router names starting with HL denote Host Leaf
or Edge routers). This is standard practice; we did not have to add
extra annotations manually. The rules in routers map headers to a
set of next hop IP addresses; using a simple call to the Domain
Name Service we map these next hops to router names to automatically extract the topology of the network.
We use the NoD tools [20] to encode the networks. All results
were obtained by running NoD [20] queries on both the original
network and various transformations of it.
7.2
Speeding up All-Pairs Reachability
Our first experiment computed all-pairs reachability between client
VMs on the Singapore data center; that is, for each pair of VMs we
compute (a representation of) the set of headers of those packets
that can travel from one VM to the other. This is ideally what needs
to be done periodically in operating data centers to help catch
configuration errors in routers.
Naively done, this requires a quadratic number of queries over
the number of client VMs (Virtual Machines), and these are usually of the order of 100,000s. For example, without exploiting the
surgeries in this paper, NoD takes 131 hours (∼5.5 days) to prove
that client VMs can reach each other using a single query encoding
all-pairs reachability at once.
We did the following experiment based on our knowledge of
how this network operates. We followed the idea of the simplest
surgery, described in Example 1, automatically rewriting the core
by a hub, much as shown in Figure 3. The hub was, however, more
complicated: it connected all pairs of ToRs, not just one, and it
took account of the fact that ToRs could be connected to the core
by more than one port. We then applied NoD to the transformed
network, running the same all-pairs reachability query. This now
took only 2 hours.
Of course, to be fair one has to count the time to prove that the
core network does indeed behave like the hub i.e., that it does not
filter out packets. The simplest approach is to use NoD itself to
prove (see Figure 3) that, for all pairs of edge routers R1 and R2 ,
all headers h (with source address in the prefix
corresponding to R1 and with destination address in
the prefix corresponding to R2 ), when sent from R1 , reach R2 .
This “proof” does not even require rewriting the network; it only
requires computing reachability between every pair of edge routers
in the original network.
This “brute-force” verification that the core can be replaced by
a hub still took only 1 hour and 40 minutes to complete. This is
faster than the original 131 hours because all virtual machines such
as V1 connected to R1 in Figure 3 are aggregated by a single prefix.
Thus the original N² scaling is reduced to (N/64)² as edge routers
typically have 64 ports. Below we will give a more sophisticated
verification, which is much faster, taking only seconds, that exploits
local rule surgery, much like Example 4.
However, even using brute-force verification that the core can
be replaced by a hub, the overall speedup is from 131 hours to 3 hours
and 40 minutes (about 3.67 hours), i.e., 36×.
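The arithmetic behind these figures can be checked with a few lines of Python. The VM count of 100,000 is the order of magnitude quoted above, used here only as an assumed round number; the variable names are illustrative.

```python
n_vms = 100_000        # assumed round number ("of the order of 100,000s")
ports_per_edge = 64    # typical edge-router port count quoted in the text

# Aggregating the VMs behind each edge router into a single prefix
# shrinks the number of pairs from N^2 to (N/64)^2.
pairs_vm = n_vms ** 2
pairs_prefix = (n_vms / ports_per_edge) ** 2
pair_reduction = pairs_vm / pairs_prefix   # equals 64 ** 2

original_hours = 131.0
hub_query_hours = 2.0
brute_force_check_hours = 1 + 40 / 60      # 1 hour and 40 minutes
total_hours = hub_query_hours + brute_force_check_hours
speedup = original_hours / total_hours

print(f"pair reduction: {pair_reduction:.0f}x")
print(f"total: {total_hours:.2f} h, speedup: {speedup:.1f}x")
```

The pair count falls by a factor of 64² = 4096 whatever the value of N, and the combined time of about 3.67 hours gives a speedup that rounds to the 36× quoted above.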
7.3
Accelerating other Queries
We next apply the local surgery ideas introduced in Example 4
to speed up four sets of earlier experiments described in [20].
These checked common “beliefs” in the case of the Singapore
data center. The first two beliefs checked were that neither Internet
addresses nor customer VMs can access protected fabric controllers
for security reasons. The third was that all “utility boxes” can
reach all “fabric controllers”, and the fourth was that all “service
boxes” can reach all “fabric controllers”. The definitions of fabric
controllers, utility boxes, and service boxes are unimportant; the
reader should think of these as classes of boxes with different
reachability privileges.
The four queries each took several minutes to complete in the
original network without the use of surgeries. We implemented a version of the Example 4 rule surgery that uses the fact
that the network forwards packets unchanged and without regard
to entry ports, in which case we can describe the network transition relation
using rules of the form h ↦ I, as described in Section 2.2. This
yielded the results given in Table 1. Note that, in contrast to the
times in minutes on the original network, the same queries on the
transformed network often ran in a couple of seconds after surgery,
owing to the reduction in the number of rules.
The essence of Example 4 is to remove rules with respect to
some set of headers T that do not impact reachability of headers
h in T with respect to some set of ports G. We take T to be a
singleton consisting of a single header h, and G to be the set of ToR
ports connecting to VMs, and write ≡h for ≡TG , the forwarding
equivalence relation defined in Example 4.
[Figure 7 diagram: boxes A, B, C, D with rules rA, rB, rC, rD on header h and ports k, l, i, j, e; the derivation e ≡h e ⇒ i ≡h j ⇒ k ≡h l; and the reduced network with rules r′A, rC, rD.]
Figure 7. We fix a header h and then inductively establish the
port forwarding equivalence relation ≡h . Some rules can then be
dropped after redirecting traffic, in accordance with the equivalence
relation.
For example, consider Figure 7, where, for illustration, we suppose that k, l, i, and j are not in G; the rule rA sends h out only to
k and l; the rule rB sends it out only to i; the rule rC sends it out
only to j; and the rule rD sends it out only to e. Then ports i and
j at D are forwarding equivalent with respect to header h because
they forward to the same external port e. This inductively implies
that ports k at B and l at C are equivalent because they respectively
forward to equivalent ports i and j.
We can then redirect the h-traffic through A to go only to C
via l, changing rA to the rule r′A . Note that the new network has
the same forwarding relation ≡h as before, so we can do similar
redirections for other rules. Now suppose that in Figure 7 the
only way that h-traffic could reach B is via k. Then, after the
redirections, the rule rB will be inaccessible and so can be pruned,
as can any other inaccessible rules.
We implemented this idea with two data structures. First, we
have to deal with the large number of potential headers used in
the forwarding plane. For IPv4, this is up to 2³² headers even if
considering only the destination IP for building equivalence classes
(as we do). We mapped the original set of potential headers into a
smaller set of equivalence classes where headers are in the same
equivalence class if there are no two rules that can distinguish
them. For this purpose we used a data structure called disjoint
decomposed normal form (ddNF), described in [5].
The associated equivalence relation is theoretically finer than
Yang-Lam equivalence, but we observe in [5] that the number of
extra partitions is insignificant.
Despite the fact that the number of rules in our data set was
close to a million, the number of header equivalence classes was
only around 4000, consistent with the results of [30]. It takes under
four seconds to compute these equivalence classes for our network.
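To make the idea of header equivalence classes concrete, here is a small Python sketch that partitions the IPv4 destination space by the set of rule prefixes each address matches. It is a simple interval sweep used as a stand-in, not the ddNF structure of [5]; the function name `header_classes` and the three example prefixes are assumptions for illustration.

```python
import ipaddress


def header_classes(prefixes):
    """Partition IPv4 destinations by the set of prefixes they match."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    # Each prefix is a contiguous address range; its endpoints are the only
    # places where the set of matching prefixes can change.
    cuts = {0, 2 ** 32}
    for net in nets:
        cuts.add(int(net.network_address))
        cuts.add(int(net.broadcast_address) + 1)
    cuts = sorted(cuts)
    classes = {}
    for lo, hi in zip(cuts, cuts[1:]):
        # All addresses in [lo, hi) match the same prefixes as lo does.
        matched = frozenset(
            str(net) for net in nets
            if int(net.network_address) <= lo <= int(net.broadcast_address)
        )
        classes.setdefault(matched, []).append((lo, hi - 1))
    return classes


classes = header_classes(["10.0.0.0/8", "10.1.0.0/16", "192.168.0.0/24"])
for matched, ranges in classes.items():
    print(sorted(matched), len(ranges), "interval(s)")
```

Each key is a set of prefixes and each value lists the address intervals whose headers no prefix can tell apart. With three prefixes the full 2³² address space collapses to four classes, which hints at why the number of classes can be far smaller than the number of rules.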
Next, we split rules so that each rule operates on a single header
equivalence class. Then, for each such class h we compute the
forwarding equivalence relation ≡h on ports, illustrated above. The
algorithm refines port equivalence relations p ∼h q, represented as
maps from header equivalence classes h to partitions on ports.
Initially, the map maps each such h to the discrete partition.
Then, in the style of congruence closure algorithms, we use union-find structures to maintain partitions [29]: until reaching the fixed point ≡h , for each class of headers h, we merge partitions containing p and q not in G if {p′ | h @ p −→N h @ p′ } and
{q′ | h @ q −→N h @ q′ } are element-wise ∼h -equivalent, as are
{p′ | h @ p N h @ p′ } and {q′ | h @ q N h @ q′ }.
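The fixed-point computation can be sketched in Python for a single header class, using the ports of Figure 7. This is a simplified reading, not the implementation used in the experiments: it tracks only one successor relation per port (the map `succ`), the entry port `a` at A is a hypothetical addition, and `protected` plays the role of G.

```python
class UnionFind:
    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        self.parent[max(rx, ry)] = min(rx, ry)  # smaller name becomes root
        return True


def port_equivalence(ports, succ, protected):
    """Merge ports outside `protected` whose successors are class-wise equal."""
    uf = UnionFind(ports)
    changed = True
    while changed:  # iterate until no further merge is possible
        changed = False
        by_image = {}
        for p in sorted(ports):
            if p in protected:
                continue
            # Canonical image: sorted roots of the ports that p forwards to.
            image = tuple(sorted({uf.find(s) for s in succ.get(p, ())}))
            q = by_image.setdefault(image, p)
            if q != p and uf.union(p, q):
                changed = True
    return uf


# Figure 7 for a fixed header h: a -> {k, l}, k -> {i}, l -> {j}, i, j -> {e}.
ports = ["a", "k", "l", "i", "j", "e"]
succ = {"a": {"k", "l"}, "k": {"i"}, "l": {"j"}, "i": {"e"}, "j": {"e"}}
uf = port_equivalence(ports, succ, protected={"e"})
```

On this input the first pass merges i with j (both forward to e) and then k with l (they forward to the merged class), mirroring the chain e ≡h e ⇒ i ≡h j ⇒ k ≡h l. Comparing sorted tuples of union-find roots is one way to realise the element-wise equivalence check on sorted lists.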
The element-equivalence check is fast because we can use the
union-find root to find canonical equivalence class representatives
and we can maintain the sets as sorted lists. Finally, we rewrite all
rules in all routers using the header and port equivalence relations:
each rule in every router is rewritten to redirect traffic to the canonical representative of each port's equivalence class. Then
rules that are no longer reachable, because all the rules that previously directed headers to them had their ports renamed, are garbage
collected.
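The rewriting and garbage-collection step can be sketched as follows for one header class, again on the Figure 7 example. The data layout (one output-port set per box, a `port_box` map from ports to receiving boxes, and an explicit `canon` map choosing l and j as representatives) is an assumption made for illustration and need not match the implementation used in the experiments.

```python
def rewrite_and_prune(rules, port_box, canon, entry_boxes):
    """Redirect rule outputs to canonical ports, then drop unreachable rules.

    rules:       box -> set of output ports for the fixed header class
    port_box:    port -> box that receives traffic arriving on that port
    canon:       port -> canonical representative of its equivalence class
    entry_boxes: boxes where traffic for this header class may enter
    """
    redirected = {box: {canon[p] for p in outs} for box, outs in rules.items()}
    reachable, stack = set(), list(entry_boxes)
    while stack:
        box = stack.pop()
        if box in reachable or box not in redirected:
            continue
        reachable.add(box)
        for port in redirected[box]:
            nxt = port_box.get(port)  # None for external ports such as e
            if nxt is not None:
                stack.append(nxt)
    # Garbage-collect rules at boxes that traffic can no longer reach.
    return {box: outs for box, outs in redirected.items() if box in reachable}


rules = {"A": {"k", "l"}, "B": {"i"}, "C": {"j"}, "D": {"e"}}
port_box = {"k": "B", "l": "C", "i": "D", "j": "D"}
canon = {"k": "l", "l": "l", "i": "j", "j": "j", "e": "e"}
result = rewrite_and_prune(rules, port_box, canon, entry_boxes=["A"])
```

After redirection the rule at A sends h only to l, so B is no longer reachable and its rule is dropped, leaving the three rules shown in the lower half of Figure 7.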
After this transformation (which can be thought of as a set of
transformations, one for each header equivalence class), we reran
the original four queries described in the NoD paper [20], obtaining
the results given in Table 1.
In this experiment, we transformed a network with nearly a
million rules to a new network with just over 10K rules. Not shown
is the time to parse text files containing the data center network
(i.e., translate from Cisco format to Datalog) which is about 6
seconds and the time to perform the surgeries which is 4 seconds.
The overall speedups obtained ranged from 15× to 360×.
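The per-query speedups can be recomputed from the four rows of Table 1; the short Python check below does so, with times converted to seconds. The dictionary layout is only an illustrative choice.

```python
# (pre-op seconds, post-surgery seconds) for each row of Table 1.
table1 = {
    "Internet Reaches Protected Fabric": (12 * 60, 2.0),
    "VM Reaches Protected Fabric": (12 * 60, 48.0),
    "Utility boxes can reach all fabric controllers": (4 * 60, 1.7),
    "Service boxes can reach all fabric controllers": (6 * 60, 1.6),
}

speedups = {name: pre / post for name, (pre, post) in table1.items()}
for name, s in speedups.items():
    print(f"{name}: {s:.0f}x")
```

The smallest ratio is 720 s / 48 s = 15× and the largest is 720 s / 2 s = 360×, which matches the stated range.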
We reiterate that both the identification and the rewriting involved are completely automatic and very fast. A major obstacle of the classical symmetry reduction program in model checking [6, 8, 14] is that it is often computationally hard even to identify the symmetries. We are much faster because we do not aim
to find all symmetries, and we work at the fine structure of rules.
Even naively, finding which pairs of rules are equivalent is only
quadratic in complexity (per header class). The congruence closure
algorithm described above is even more efficient. It is noteworthy
that this particular surgery does not even require identifying which
routers are backups for each other because we focus on rule and not
box equivalence.
Experiment                                        pre-op    post surgery
Internet Reaches Protected Fabric                 12 min    2 s
VM Reaches Protected Fabric                       12 min    48 s
Utility boxes can reach all fabric controllers    4 min     1.7 s
Service boxes can reach all fabric controllers    6 min     1.6 s
Table 1. Speedups for belief checking experiments in [20]
We complete the story by connecting the two experiments. Recall that Experiment 1 used a brute-force verification that the core
can be replaced by a hub that took 1 hour and 40 minutes. We can
use the rule surgeries just described to reduce even this
time to under a second.
Suppose the Edge Routers of a data-center are tor1 , . . . , torn
and we wish to check that, for example, tori is reachable on the address range that it owns from all the other ToRs on any of the ports
by which they connect to the core (we use the terms ToR and Edge
Router synonymously). Then the typical data-center configuration
ensures that all the ToRs, other than tori , are forwarding equivalent for any packet h in the address range (by which we mean that
the core ports they connect to are all ≡h -equivalent). We checked
that this was the case for the Singapore data-center for all its ToRs,
using the computed forwarding equivalence relations.
We observe that this cuts down a quadratic number of routes to
a linear number of representatives for the pairwise routes, as, for
example, to check reachability of tori from the others, one need
only check reachability from any one of the core ports one of the
others connects to. We checked these reachability queries on the
Singapore data-center using a simple depth-first search algorithm
on the transition graph of the heavily reduced network.
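A minimal Python sketch of this check is given below: an iterative depth-first search over a port-level transition graph, run from one representative per forwarding-equivalence class rather than from every port. The graph, the port names, and the helper `check_tor` are hypothetical and serve only to illustrate the idea.

```python
def reachable(graph, start, target):
    """Iterative depth-first search on a port-level transition graph."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def check_tor(graph, classes, target):
    """Check one representative per equivalence class instead of every port."""
    return all(reachable(graph, min(cls), target) for cls in classes)


# Hypothetical reduced network: two ToR uplinks already known to be
# forwarding equivalent for the address range owned by tor3.
graph = {"tor1.up": ["core"], "tor2.up": ["core"], "core": ["tor3"]}
classes = [{"tor1.up", "tor2.up"}]
ok = check_tor(graph, classes, "tor3")
```

Because all ports in a class are forwarding equivalent for the header in question, a single search per class and target suffices, which is what turns the quadratic number of pairwise checks into a linear one.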
The bottom line is that with more efficient verification, the
overall time for all-pairs verification dropped from 131 hours
to 2 hours, a 65× speedup.
8.
Related work
Symmetry reduction: Symmetry reduction has a long history in
the model checking literature where it was used for verifying concurrent systems [6, 8, 14]. Ip and Dill [14] trace the use of symmetry for automatic verification to the 80’s [1, 21]. Symmetry is
formally defined in terms of a certain permutation of participating
processes specified by a group G. After symmetries are identified,
model checking is done on a simpler structure quotiented by G.
The permutations are usually discovered on the state components, i.e., they deal with the data aspects of the program. By contrast, we target particular permutations that arise from the replication of routers for load balancing and redundancy reasons. Perhaps
it is fair to say that our symmetries are driven by the “control-flow
graph” of the network. Further, exploiting symmetries does not require modifying the network verification engine since it is done by
modifying the network itself (and adjusting the property).
Two practical difficulties with the classical symmetry reduction
program are firstly finding and verifying the group G, and secondly
dealing with the fact that real structures do not have perfect symmetries. We proceed differently. Rather than calculating the whole
group, our aim is to find particular symmetries and divide the network by them, rather than its transition relation, intending to verify
the simpler network in place of the original one. It is not hard to find
such symmetries, at least as regards the topology of the network,
especially for fat trees where we need only look at routers on the
same level. However it may be that the topological symmetry does
not preserve the transition relation, as there are a few differences
in the rules of the two routers. So, rather than actually construct
the quotiented network (which would anyway be expensive, even
with perfect symmetry) we use the topological symmetry (possibly
implicitly) to remove redundant rules, as illustrated in Section 4.
Software verification methods for multi-threaded programs usually explore symmetry in the local data maintained by (almost)
identical processes by applying the so-called thread modular proof
rule, where the induction principle is adjusted to reflect the symmetry [9]. Our approach performs a surgery before the verification
engine is run, as opposed to within the verification engine.
Bisimulation: Bisimulation is at the core of NetKAT’s [2] decision procedure for its equational theory [11]. However, its current
implementation does not exploit symmetries. Figure 3 in [11] suggests that on fat tree topologies the running time appears to grow
rapidly with the number of hosts.
Our approach, by contrast, relies on bisimulation in the preprocessing phase, but not the actual verification step. Our notions of
symmetry and surgery might be beneficial for (and easily integrated
into) the NetKAT decision procedure, perhaps as a preprocessing
step. A similar approach could be adopted for other network verification tools that check forwarding properties [17, 30].
Flowlog [27] employs partial evaluation and weakening to simplify the network verification problem. Partial evaluation is not currently considered in our system but could be a valuable addition.
Weakening is more challenging as our current setup is intimately
connected with the notion of bisimulation, but not simulation.
Kuai [23] relies on partial order reduction when checking properties of SDNs. Our traffic redirection can be seen as an instance of
partial order reduction that is applied statically.
Semantics: While NetKAT [2] and related work [26] provide
a semantically-oriented theory of networks, there is currently no
provision for exploiting symmetry and network transformation.
Note also that the research program in NetKAT is primarily top-down: defining a policy language and synthesizing networks that
meet these policies. By contrast, our agenda is bottom-up: we start
with existing networks and analyze their reachability policies.
9.
Conclusion
If network verification is to become an integral part of operational
procedure in large networks, then the time for comprehensive verification (all pairs of stations, all properties) should be less than the
average time of a network reconfiguration. Reconfiguration typically takes hours, but is set to decrease to seconds in the presence
of virtual machine migration [15] to optimize resource usage.
Surveying the initial work on network verification, Zhang, Malik, and McGeer [33] say
. . . initial results are based on modest sized systems.
However, overall, both FSM- and SAT-based approaches
will need to be tested for larger scale systems, e.g. entire data centers or large scale enterprise networks. This
will likely need development of new ideas in their solutions, or at the least adaptation of scaling techniques used in
other domains. For example, large data centers are likely to
have symmetry in their structure. This may enable the use
of parametric model checking techniques [39], or symmetry reduction in model-checking and SAT-based techniques.
Their application will open up new challenges.
Our paper addresses the challenge of verifying large scale networks
by developing a theory of network transformation. Our experimental results show that the apparent complexity of a well-structured
data center network (ostensibly 2³² headers, ∼820K rules) can reduce to ∼4,000 header equivalence classes and ∼10,000 rules after
suitable transformations. In some sense, we are extending the research program of [30] which reduces the number of header equivalence classes but not the number of rules (or ports or routers).
The final simpler underlying structure may not be as surprising as
it seems at first: if it were more complex, it would be beyond the
understanding of the humans who design and operate the network.
The initial experimental evidence in this paper shows that easy-to-code versions of simple transformations (Examples 1 and 4) can
provide large speedups, reducing the time for comprehensive verification from days to 2 hours. Since the all-pairs verification task
is easily parallelized, this suggests that comprehensive evaluation
can be done using a 32-core machine in under 4 minutes which
is practical for immediate deployment. Other transformations such
as slicing (see Examples 2 and 3) may well bring this time down
further. The aim would be to achieve verification in the order of
seconds, possibly also using incremental verification techniques as
in [15]. Note that earlier results in [15] and [30] were not for all
pairs, but only for single queries and for much smaller networks.
The specific transformations we found useful for data center
networks in our experiments may carry over to two other important classes of networks: enterprise networks and Internet Service
Provider Networks. While neither uses fat-tree topologies, there are
regularities in these designs that perhaps could be exploited using
the methods of this paper. For example, most enterprise networks
use a core network that interconnects a number of leaf networks,
and Points of Presence (POPs) in ISP networks often use complete mesh topologies. Even if new transformations are needed, the
bisimulation proof techniques may well still be applicable.
While we have only implicitly touched on modularity (when we
replaced the core by a wire in Section 7), we plan to extend the
theory in this paper to allow modular verification; this would correspond to the compositionality properties of bisimulation in the
process calculus, though, as there, composing properties of subsystems will no doubt present challenges. Other avenues for research
include scaling quantitative verification (e.g., bandwidth and delay
and not just reachability) and control plane verification [10] using
network transformations.
Finally, note that our semantics is relational, modeling nondeterminism. In particular partiality is used to model both dropped
packets and infinite loops. One could instead adopt other semantic frameworks to model other aspects of networks, for example to
distinguish packet dropping from infinite loops, or to model multicasting and/or probabilistic choice. We anticipate that one could
then still follow the program of this paper and connect network verification with the then relevant notions of bisimulation.
References
[1] S. Aggarwal, R. Kurshan, and K. Sabnani. A calculus for protocol
specification and validation. Protocol Specification, Testing, and Verification, 3(1), 1983.
[2] C. J. Anderson, N. Foster, A. Guha, J.-B. Jeannin, D. Kozen,
C. Schlesinger, and D. Walker. NetKAT: semantic foundations for
networks. In POPL, 2014.
[3] M. A. Armstrong. Groups and Symmetry. Springer, 1988.
[4] S. Arun-Kumar. On bisimilarities induced by relations on actions. In
SEFM, 2006.
[5] N. Bjørner, G. Juniwal, R. Mahajan, S. A. Seshia, and G. Varghese.
ddnf: An efficient data structure for header spaces. Technical report,
Microsoft Research, November 2015.
[6] E. M. Clarke, T. Filkorn, and S. Jha. Exploiting symmetry in temporal
logic model checking. In CAV, 1993.
[7] E. Emerson and A. Sistla. Symmetry and model checking. Formal
Methods in System Design, 9(1-2):105–131, 1996.
[8] E. A. Emerson and A. P. Sistla. Symmetry and model checking. In
CAV, 1993.
[9] C. Flanagan and S. Qadeer. Thread-modular model checking. In SPIN,
2003.
[10] A. Fogel, S. Fung, L. Pedrosa, M. Walraed-Sullivan, R. Govindan,
R. Mahajan, and T. Millstein. A general approach to network configuration analysis. In NSDI, 2015.
[11] N. Foster, D. Kozen, M. Milano, A. Silva, and L. Thompson. A
coalgebraic decision procedure for NetKAT. In POPL, 2015.
[12] M. Hasegawa. Models of Sharing Graphs: A Categorical Semantics
of let and letrec. PhD thesis, University of Edinburgh, 1997.
[13] M. Hasegawa, M. Hofmann, and G. Plotkin. Finite dimensional
vector spaces are complete for traced symmetric monoidal categories.
In Pillars of Computer Science: Essays Dedicated to Boris (Boaz)
Trakhtenbrot on the Occasion of His 85th Birthday, pages 367–385.
Springer Berlin Heidelberg, 2008.
[14] N. Ip and D. Dill. Better verification through symmetry. Formal
Methods in System Design, 9(1), 1996.
[15] P. Kazemian, M. Chang, H. Zeng, G. Varghese, N. McKeown, and
S. Whyte. Real time network policy checking using header space
analysis. In NSDI, 2013.
[16] P. Kazemian, G. Varghese, and N. McKeown. Header space analysis:
static checking for networks. In NSDI, 2012.
[17] A. Khurshid, X. Zou, W. Zhou, M. Caesar, and P. B. Godfrey. VeriFlow: verifying network-wide invariants in real time. In NSDI, 2013.
[18] J. F. Kurose and K. Ross. Computer Networking: A Top-Down Approach Featuring the Internet. Addison-Wesley Longman Publishing
Co., Inc., Boston, MA, USA, 2nd edition, 2002.
[19] Z. Li, M. Liang, L. O’Brien, and H. Zhang. The cloud’s cloudy
moment: A systematic survey of public cloud service outage.
International Journal of Cloud Computing and Services Science (IJCLOSER), 2(5):321–331, 2013.
[20] N. P. Lopes, N. Bjørner, P. Godefroid, K. Jayaraman, and G. Varghese.
Checking beliefs in dynamic networks. In NSDI, 2015.
[21] B. Lubachevsky. An approach to automating the verification
of compact parallel coordination programs. Acta Informatica,
21(2), 1984.
[22] H. Mai, A. Khurshid, R. Agarwal, M. Caesar, P. B. Godfrey, and S. T.
King. Debugging the data plane with Anteater. In SIGCOMM, 2011.
[23] R. Majumdar, S. D. Tetali, and Z. Wang. Kuai: A model checker for
software-defined networks. In FMCAD, 2014.
[24] R. Milner. Communication and Concurrency. Prentice-Hall, 1989.
[25] R. Milner. The Space and Motion of Communicating Agents. Cambridge University Press, 2009.
[26] C. Monsanto, N. Foster, R. Harrison, and D. Walker. A compiler and
run-time system for network programming languages. In POPL, 2012.
[27] T. Nelson, A. D. Ferguson, M. J. G. Scheer, and S. Krishnamurthi.
Tierless programming and reasoning for software-defined networks.
In NSDI, 2014.
[28] D. Sangiorgi. On the origins of bisimulation and coinduction. ACM
Trans. Program. Lang. Syst., 31(4):15:1–15:41, May 2009.
[29] R. E. Tarjan. Efficiency of a good but not linear set union algorithm.
J. ACM, 22(2):215–225, 1975.
[30] H. Yang and S. Lam. Real-time verification of network properties
using atomic predicates. In ICNP, 2013.
[31] H. Zeng, P. Kazemian, G. Varghese, and N. McKeown. Automatic test
packet generation. In CoNEXT, 2012.
[32] S. Zhang and S. Malik. SAT based verification of network data planes.
In ATVA, 2013.
[33] S. Zhang, S. Malik, and R. McGeer. Verification of computer switching networks: An overview. In ATVA, 2012.