Layer2 and Cumulus Linux Validated Design Guide

Data Center Layer 2 High Availability
Validated Design Guide
Deploying a Data Center to Support Layer 2 Services with Network Switches Running Cumulus Linux
DATA CENTER LAYER 2 HIGH AVAILABILITY: VALIDATED DESIGN GUIDE
Contents
Layer 2 Networks with Cumulus Linux
  Objective
  Enabling Choice of Hardware in the Data Center
  Driving Towards Operational Efficiencies
  Intended Audience for Network Design and Build
Understanding Layer 2 Architecture
  Network Architecture and Design Considerations
  Management and Out-of-Band Networking Considerations
  Scaling-out the Architecture
External Connectivity
  External Connectivity at Layer 2
  External Connectivity at Layer 3
Implementing a Layer 2 Data Center Network with Cumulus Linux
  Network Architecture for Hierarchical Layer 2 Leaf, Spine, and Core
  Build Steps
    1. Set Up Physical Network and Basic Configuration of All Switches
    2. Configure Leaf Switches
    3. Configure Spine Switches
    4. Set Up Spine/Leaf Network Fabric
    5. Configure VLANs
    6. Connect Hosts
    7. Connect Spine Switches to Core
Automation Considerations
  Automation and Converged Administration
  Automated Switch Provisioning with Zero Touch Provisioning
  Network Configuration Templates Using Mako
  Automated Network Configuration Using Ansible
  Automated Network Configuration Using Puppet
Operational and Management Considerations
  Authentication and Authorization
  Security Hardening
  Accounting and Monitoring
  Quality of Service (QoS) Considerations
  Link Pause
Conclusion
  Summary
  References
Appendix A: Example /etc/network/interfaces Configurations
  leaf01
  leaf02
  spine01
  spine02
  oob-mgmt
Appendix B: Network Design and Setup Checklist
Version 1.0.0
February 2, 2016
About Cumulus Networks
Unleash the power of Open Networking with Cumulus Networks. Founded by veteran networking engineers from Cisco and
VMware, Cumulus Networks makes the first Linux operating system for networking hardware and fills a critical gap in
realizing the true promise of the software-defined data center. Just as Linux completely transformed the economics and
innovation on the server side of the data center, Cumulus Linux is doing the same for the network. It is radically reducing
the costs and complexities of operating modern data center networks for service providers and businesses of all sizes.
Cumulus Networks has received venture funding from Andreessen Horowitz, Battery Ventures, Sequoia Capital, Peter
Wagner and four of the original VMware founders. For more information visit cumulusnetworks.com or @cumulusnetworks.
©2016 Cumulus Networks. CUMULUS, the Cumulus Logo, CUMULUS NETWORKS, and the Rocket Turtle Logo (the “Marks”) are trademarks and service marks
of Cumulus Networks, Inc. in the U.S. and other countries. You are not permitted to use the Marks without the prior written consent of Cumulus Networks. The
registered trademark Linux® is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide basis. All
other marks are used under fair use or license from their respective owners.
www.cumulusnetworks.com
Layer 2 Networks with Cumulus Linux
Objective
This Validated Design Guide presents a design and implementation approach for deploying layer 2 networks on switches
running Cumulus Linux.
Enabling Choice of Hardware in the Data Center
Server virtualization revolutionized the data center by giving IT the freedom and choice of industry-standard server
hardware, thereby lowering CapEx costs, as well as a rich ecosystem of automation tools and services to provision and
manage compute and storage infrastructure to achieve lower OpEx costs.
This same benefit of choice is now available for networking in the data center. With Cumulus Linux, network administrators
have a multi-platform, multi-vendor network OS that provides the freedom of choice with network switch hardware.
Because Cumulus Linux is Linux, data center administrators can tap the already established wealth of Linux knowledge and
vast application ecosystem for administering servers and extend them to network switches for converged administration.
Cumulus Linux can help you achieve the same CapEx and OpEx efficiencies for your networks by enabling an open market
approach for switching platforms, and by offering a radically simple lifecycle management framework built on the industry’s
best open source tools. By using bare metal servers and network switches, you can achieve cost savings that would have
been impossible just a few years ago.
Driving Towards Operational Efficiencies
With a disaggregated hardware and software model, Cumulus Networks has created a simple approach for managing
complex layer 2 networks.
Cumulus Linux simplifies network management through hardware agnosticism and the use of standard Linux automation and
orchestration tools. Cumulus Linux is a full-featured network operating system and presents a standard interface,
regardless of the underlying hardware or features enabled. Automation tools can process templates that are written once
and reused across the entire environment for provisioning and configuration management, substantially decreasing the
number of upper level management systems and operating expenses as well as increasing the speed at which new
networks are deployed. You can change key variables as needed in a template and power on an entirely new network in
minutes without interacting with a CLI. You can leverage open source protocols and tools such as DHCP, Open Network
Install Environment (ONIE), zero touch provisioning, and automation and orchestration tools of DevOps’ choice such as
Ansible, Chef, Puppet, Salt, and CFEngine. These same tools are already used by many organizations to simplify server
deployments, and modifying them to provision entire racks of network switches becomes a simple task of converged
administration.
Intended Audience for Network Design and Build
The intended audience for this guide is a data center cloud architect or administrator who is experienced with server
technologies and familiar with layer 2 networking, including interfaces, link aggregation (LAG) or bonds, and VLANs. The
network architecture and build steps provided in this document can be used as a reference for planning and implementing
layer 2 with Cumulus Linux in your environment.
A basic understanding of Linux commands is assumed, such as accessing a Linux installation, navigating the file system,
and editing files. If you need to learn more about Linux, we suggest you review our Introduction to Linux videos at
cumulusnetworks.com/technical-videos/ for a high level overview of Linux.
If you are using this guide to help you set up your Cumulus Linux environment, we assume you have Cumulus Linux
installed and licensed on switches from the Cumulus Linux Hardware Compatibility List (HCL) at cumulusnetworks.com/hcl.
Additional information on Cumulus Linux software, licensing, and supported hardware may be found on
cumulusnetworks.com or by contacting sales@cumulusnetworks.com.
Understanding Layer 2 Architecture
Network Architecture and Design Considerations
Many applications in the data center require layer 2 adjacency, where the application assumes other components or
services are on the same IP subnet. In this environment it is assumed that a gateway is only needed to route traffic
between domains with different IP subnets. An example would be a virtual environment in which you assume that all of the
VMs can talk to each other without needing a router.
Figure 2 shows the network design of a typical enterprise data center pod running a virtual environment or container
environment. The pod consists of a pair of aggregation/spine switches connected with one or more pairs of access/leaf
switches, which in turn provide dual-homed highly available connectivity to hosts and storage elements.
Figure 2. Traditional Layer 2 Hierarchical Enterprise Data Center Network Pod
This document describes the network architecture for a layer 2 data center that follows the traditional hierarchical
aggregation and access, also called spine and leaf, structure where layer 2 networking is used to connect all elements.
For optimal network performance, hosts are connected via dual 10G links to the access/leaf switch layer, which in turn is
connected via 40G links to the aggregation/spine layer. Representative spine and leaf switches running Cumulus Linux are
shown below in Figure 3, although the details of your specific models may vary. Spine switches typically have thirty-two 40G
switch ports, whereas leaf switches have forty-eight 10G access ports and up to six 40G uplinks. In Cumulus Linux, switch
ports are labeled swp1, swp2, swp3, and so forth.
Figure 3. Switch Port Numbering for Aggregation/Spine and Access/Leaf Switches
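The port counts above imply a leaf oversubscription ratio, which is worth working out before choosing how many uplinks to cable. The following is a minimal sketch, assuming the representative figures quoted in this section (forty-eight 10G access ports and up to six 40G uplinks per leaf); the function name is illustrative and not part of Cumulus Linux.

```python
# Representative figures from this section; adjust for your own switch models.
ACCESS_PORTS = 48   # 10G host-facing ports per leaf
ACCESS_GBPS = 10
UPLINK_GBPS = 40    # each leaf has up to six 40G uplinks


def oversubscription(access_ports, access_gbps, uplink_ports, uplink_gbps):
    """Return host-facing bandwidth divided by uplink bandwidth."""
    if uplink_ports <= 0:
        raise ValueError("at least one uplink is required")
    return (access_ports * access_gbps) / (uplink_ports * uplink_gbps)


if __name__ == "__main__":
    for uplinks in (2, 4, 6):
        ratio = oversubscription(ACCESS_PORTS, ACCESS_GBPS, uplinks, UPLINK_GBPS)
        print(f"{uplinks} x 40G uplinks -> {ratio:.1f}:1")
```

With all six uplinks in use the ratio is 2:1; with only two it rises to 6:1. This is a worst-case figure that assumes every access port is driven at line rate.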
The network design employs multi-chassis link aggregation (MLAG) for network path redundancy and link aggregation for
network traffic optimization. MLAG allows active/active ports across physical switches and avoids the port
underutilization that occurs when Spanning Tree Protocol (STP) deliberately places ports in a blocking state to prevent
loops. MLAG is deployed using the following steps:
• Select a pair of physical switches, either a pair of leaf switches or a pair of spine switches. There can only be two
  switches in a single MLAG “logical switch” configuration, although you can have multiple MLAG pairs in a topology.
• Establish a peer link between pair members. It is recommended to set up the peer link as an LACP bond for
  increased reliability and bandwidth.
• A clagd daemon runs on each switch in the MLAG pair and communicates over a subinterface of the peer link.
Figure 4. Switch-to-Switch MLAG
The peer link between the switches in an MLAG pair should be sized to ensure there is sufficient bandwidth to
handle additional traffic during uplink failure scenarios. Typical deployments use a bond for the peer link to satisfy both
bandwidth and redundancy requirements.
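One way to reason about peer link sizing is to compare the peer bond's capacity with the uplink capacity it may have to absorb. The sketch below is a rough worst-case estimate, not a Cumulus Linux tool: it assumes that a switch which loses all of its uplinks forwards its full uplink load across the peer link, and the helper names are hypothetical.

```python
def bond_capacity_gbps(member_count, member_gbps):
    """Aggregate capacity of a bond with identical member links."""
    return member_count * member_gbps


def peerlink_shortfall_gbps(peer_members, peer_gbps, uplink_members, uplink_gbps):
    """Gbps of uplink-bound traffic that would not fit on the peer link.

    Assumption: one switch in the pair has lost every uplink and was
    previously driving them at line rate. 0 means the peer link can
    absorb the full load.
    """
    peer = bond_capacity_gbps(peer_members, peer_gbps)
    uplink = bond_capacity_gbps(uplink_members, uplink_gbps)
    return max(0, uplink - peer)


if __name__ == "__main__":
    # Example: a two-member 10G peer bond next to a two-member 40G uplink bond.
    print(peerlink_shortfall_gbps(2, 10, 2, 40))
    # Same uplinks with two further 10G members added to the peer bond.
    print(peerlink_shortfall_gbps(4, 10, 2, 40))
```

Real uplink utilization is usually well below line rate, so treat the result as an upper bound when deciding how many ports to reserve for peer link expansion.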
Figure 5. Host Link Redundancy and Aggregation, LACP
Figure 6. Host Network Segmentation
Figure 6 shows the high availability (HA) component of the proposed layer 2 solution. As individual links are added to or
removed from the LACP uplink group, traffic from applications such as databases or Web servers continues to flow
uninterrupted to and from the host. The number of links present in the LACP uplink group directly affects the amount of
bandwidth available to the applications, but reduced total bandwidth is the only impact of a link failure on a host
with high availability.
Management and Out-of-Band Networking Considerations
An important supplement to the high capacity production data network is the management network used to administer
infrastructure elements, such as network switches, physical servers, and storage systems. The architecture of these
networks varies considerably based on their intended use, the elements themselves, and access isolation requirements.
This solution guide assumes that a single layer 2 domain is used to administer the network switches and storage elements.
These operations include imaging the elements, configuring them, and monitoring the running system. Some installations
will also use this network for IPMI access to the hosts, including vendor implementations such as DRAC or iLO. This network is expected to host both
DHCP and HTTP servers, such as isc-dhcp and apache2, as well as provide DNS reverse and forward resolution. In general,
these networks provide some means to connect to the corporate network, typically a connection through a router or jump
host.
Figure 7 below shows the logical and, where possible, the physical connections of each element as well as the services
required to realize this deployment.
Figure 7. Out-of-Band Management
Scaling-out the Architecture
Scaling out the architecture involves adding more hosts to the access switch pairs, and then adding more access switches
in pairs as needed, as shown in Figure 8.
Figure 8. Adding Additional Switches
As the spine switch pair approaches its capacity limit, an additional network pod of spine/leaf switches may be added, as
shown in Figure 9. The only constraint is that in a layer 2-only environment, additional spine switches should be added in
pairs.
Figure 9. Adding Network Pods/Server Clusters
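To estimate when a pod reaches its limit, you can count the spine ports left for leaf-facing bonds. The following sketch is illustrative only and rests on stated assumptions: a 32-port spine, two ports reserved for the peer link, two for core uplinks, two spine ports consumed per leaf pair (one per leaf), and 44 host-facing ports per leaf with every host dual-homed to a leaf pair. Substitute your own numbers.

```python
def max_leaf_pairs(spine_ports, peerlink_ports, core_ports, ports_per_leaf_pair):
    """Leaf pairs that fit on each spine after reserving peer link and core ports."""
    free = spine_ports - peerlink_ports - core_ports
    if free < 0:
        raise ValueError("reserved ports exceed the spine port count")
    return free // ports_per_leaf_pair


def max_dual_homed_hosts(leaf_pairs, access_ports_per_leaf):
    """Each dual-homed host uses one access port on both leaves of a pair."""
    return leaf_pairs * access_ports_per_leaf


if __name__ == "__main__":
    pairs = max_leaf_pairs(spine_ports=32, peerlink_ports=2,
                           core_ports=2, ports_per_leaf_pair=2)
    print(pairs, max_dual_homed_hosts(pairs, access_ports_per_leaf=44))
```

Under these assumptions a single pod tops out at 14 leaf pairs and 616 dual-homed hosts; beyond that, another pod with its own spine pair is needed.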
External Connectivity
The spine switches can connect to core switches in one of two ways, depending on where they sit relative to the layer 2 and
layer 3 boundary:
• External connectivity at layer 2, with routing and gateway services provided outside the cluster
• External connectivity at layer 3, with routing services provided by the spine switches
External Connectivity at Layer 2
In this scenario, the spine switches connect to core switches that handle gateway services as shown in Figure 10.
Figure 10. External Connectivity at Layer 2
When connecting to core switches at layer 2, the core switches need to support a vendor-specific form of multi-chassis link
aggregation (such as vPC, MC-LAG, or MLAG) with which the spine switches can pair.
If the core switches are not capable of MLAG, consider spanning tree interoperability. With the VLAN-aware bridge
introduced in Cumulus Linux 2.5, a single instance of spanning tree runs on each switch. By default, BPDUs are sent on the
native VLAN, VLAN 1, when spanning tree is enabled on the VLAN-aware bridge. The native VLAN should match on the core
switches to ensure spanning tree compatibility. To adjust the untagged or native VLAN for an interface, refer to the
Configure VLANs step in the Build Steps later in this document. In addition, all VLANs that are to be trunked to the core
should be allowed on all the uplink trunk interfaces to avoid potentially blocked VLANs.
External Connectivity at Layer 3
In this scenario, the spine switches connect at layer 3 as shown in Figure 11. Alternatively, the spine switches can be dual
connected to each core switch at layer 3 (not shown in Figure 11).
Figure 11. External Connectivity at Layer 3
In this design, the spine switches route traffic. To connect to the core switches, you will need to determine whether the
routing is static or dynamic and, if dynamic, which protocol to use (OSPF or BGP).
In addition, you will need to provide a gateway and routing between all layer 2 subnets on the spine switches. Additional
network design considerations for this scenario include setting up Virtual Router Redundancy (VRR) between the spine
switch pair to support active/active forwarding. For more information about VRR, read the Cumulus Linux documentation.
Implementing a Layer 2 Data Center Network with Cumulus Linux
Network Architecture for Hierarchical Layer 2 Leaf, Spine, and Core
The instructions that follow detail the steps needed for building a representative network for providing layer 2 connectivity
between servers using switches running Cumulus Linux. Actual configurations reference the following network topology.
Figure 12. Network Topology
The details for the switches, hosts, and logical interfaces are presented in the tables on the following pages.
The following steps are assembled sequentially such that each step builds on the previous step and only includes the
portions of the configuration that are relevant for the given step. If at any point it is unclear what configuration should be
present, Appendix A includes the complete configurations for all the leaves and spines; you can copy these configurations
directly to a test environment. The build steps demonstrate why each piece of configuration is necessary.
leaf01 connected to     Logical Interface  Description                                    Physical Interfaces
leaf02                  peerlink           Peer bond utilized for MLAG traffic            swp47, swp48 (swp45, swp46 reserved for expansion)
leaf02                  peerlink.4094      Subinterface used for clagd communication      N/A
spine01, spine02        uplink1            For MLAG between spine01 and spine02           swp49, swp50
future use              uplink2            Reserved for additional connections to spines  swp51, swp52
multiple hosts          access ports       Connect to hosts and storage                   swp1 through swp44
host-01                 host-01            Bond to host-01 for host-to-switch MLAG        swp1
out-of-band management  N/A                Out-of-band management interface               eth0

leaf02 connected to     Logical Interface  Description                                    Physical Interfaces
leaf01                  peerlink           Peer bond utilized for MLAG traffic            swp47, swp48 (swp45, swp46 reserved for expansion)
leaf01                  peerlink.4094      Subinterface used for clagd communication      N/A
spine01, spine02        uplink             For MLAG between spine01 and spine02           swp49, swp50
future use              uplink2            Reserved for additional connections to spines  swp51, swp52
multiple hosts          access ports       Connect to hosts and storage                   swp1 through swp44
host-01                 host-01            Bond to host-01 for host-to-switch MLAG        swp1
out-of-band management  N/A                Out-of-band management interface               eth0

leaf0N connected to     Logical Interface  Description                                    Physical Interfaces
Repeat above configurations for each additional pair of leafs.

spine01 connected to    Logical Interface  Description                                    Physical Interfaces
spine02                 peerlink           Peer bond utilized for MLAG traffic            swp25, swp26
spine02                 peerlink.4093      Subinterface used for clagd communication      N/A
leaf01, leaf02          leaf-pair-01       Bond to another peer link group                swp1, swp2
core1, core2            core-uplink        Bond to core switches                          swp31, swp32
out-of-band management  N/A                Out-of-band management interface               eth0

spine02 connected to    Logical Interface  Description                                    Physical Interfaces
spine01                 peerlink           Peer bond utilized for MLAG traffic            swp25, swp26
spine01                 peerlink.4093      Subinterface used for clagd communication      N/A
leaf01, leaf02          downlink1          Bond to another peer link group                swp1, swp2
core1, core2            core-uplink        Bond to core switches                          swp31, swp32
out-of-band management  N/A                Out-of-band management interface               eth0

Build Steps
The steps for building out a layer 2 network environment with switches running Cumulus Linux are as follows.
Step                                                  Tasks

Physical Network
1. Set up physical network and basic configuration    Rack and cable all network switches.
   of all switches.                                   Install Cumulus Linux.
                                                      Verify connectivity.
                                                      Configure out-of-band management.
                                                      Set hostname.
                                                      Configure DNS.
                                                      Configure NTP.

Network Topology
2. Configure leaf switches.                           Configure each switch in pair individually.
                                                      Create peer bond between pair.
                                                      Enable MLAG peering between leaf switches.
3. Configure spine switches.                          Configure each switch in pair individually.
                                                      Create peer bond between pair.
                                                      Enable MLAG peering between spine switches.
4. Set up spine/leaf network fabric.                  Create a switch-to-switch “uplink” bond on each leaf switch to the spine pair and verify MLAG.
                                                      Create a switch-to-switch bond on the spine pair to each leaf switch pair and verify MLAG.
5. Configure VLANs.                                   Set up VLANs for traffic.
6. Connect hosts.                                     Provision hosts OS (if not done already).
                                                      Connect hosts to leaf switches.
                                                      Configure hosts with LACP bond uplinks.
                                                      Provision layer 2 applications.
7. Connect spine switches to core.                    Configure depending on layer 2 or layer 3 connectivity:
                                                      For layer 2:
                                                        Create an MLAG port channel to the core for each spine switch.
                                                        On each core, create an MLAG or vendor-equivalent Multi-Chassis Link Aggregation
                                                        to the logical spine switch pair.
                                                      For layer 3:
                                                        Configure layer 3 switch virtual interface (SVI) gateways.
In a greenfield environment, the order of configuring spine or leaf switches does not matter, so steps 2 and 3 above may be
done in reverse order. In a brownfield environment, start with leaf switches as shown above for minimal network service
disruptions.
The build order is detailed in the following steps. Refer to the Network Design and Setup Checklist in Appendix B of
this guide when building out your network.
1. Set Up Physical Network and Basic Configuration of All Switches
• Rack and cable all network switches
• Install Cumulus Linux
• Configure out-of-band management
• Verify connectivity
• Set hostname
• Configure DNS
• Configure NTP
After racking and cabling all switches, install the Cumulus Linux OS and license on each switch. Refer to the Quick Start
Guide of the Cumulus Linux documentation for more information. Next, configure the out-of-band management as needed.
To verify all cables are connected and functional, check the link state. To check the link state of a switch port, run the ip
link show command, which displays the physical link and administrative state of the interface and basic layer 2
information. To show the layer 3 information as well, such as IPv4 or IPv6 address, use the ip addr show command.
Look for the UP states in the output. This command checks link state but does not detect traffic flow. Refer to the
Configuring and Managing Network Interfaces chapter of the Cumulus Linux documentation for more information.
cumulus@leaf01$ sudo ip link set up swp47
cumulus@leaf01$ ip link show swp47
49: swp47: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master peerlink state UP mode DEFAULT qlen 500
    link/ether 70:72:cf:9d:4e:64 brd ff:ff:ff:ff:ff:ff
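When many ports need checking, the same output can be inspected programmatically. The sketch below is one possible approach, not a Cumulus Linux utility: it parses text in the format shown above and reports, per interface, whether the UP flag (administratively enabled) and the LOWER_UP flag (physical carrier present) are set. The function name and the embedded sample are illustrative.

```python
import re

# Sample taken from the ip link show output above.
SAMPLE = (
    "49: swp47: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast "
    "master peerlink state UP mode DEFAULT qlen 500\n"
    "    link/ether 70:72:cf:9d:4e:64 brd ff:ff:ff:ff:ff:ff\n"
)

# Matches the first line of each interface entry, e.g. "49: swp47: <FLAGS> ...".
HEADER = re.compile(
    r"^\d+:\s+(?P<name>[^:@\s]+)(?:@\S+)?:\s+<(?P<flags>[^>]*)>(?P<rest>.*)$",
    re.MULTILINE,
)
STATE = re.compile(r"\bstate\s+(\S+)")


def parse_ip_link(text):
    """Map interface name -> admin/carrier/operational state from ip link output."""
    links = {}
    for match in HEADER.finditer(text):
        flags = {flag for flag in match.group("flags").split(",") if flag}
        state = STATE.search(match.group("rest"))
        links[match.group("name")] = {
            "admin_up": "UP" in flags,        # interface is administratively enabled
            "carrier": "LOWER_UP" in flags,   # physical link is detected
            "state": state.group(1) if state else None,
        }
    return links


if __name__ == "__main__":
    print(parse_ip_link(SAMPLE))
```

As with the command itself, this only reflects link state; it says nothing about whether traffic is actually flowing.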
To help verify that cables are properly connected according to your network topology diagram, check what neighbors are
observed from each switch port. For example, to see what port is connected to swp47 on leaf01, run the following
command and observe the LLDP neighbor’s output to verify that swp47 on leaf02 is connected.
cumulus@leaf01$ sudo lldpctl swp47
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface:    swp47, via: LLDP, RID: 7, Time: 14 days, 20:06:51
  Chassis:
    ChassisID:    mac c4:54:44:bc:ff:f0
    SysName:      leaf02
    SysDescr:     Cumulus Linux
    MgmtIP:       192.168.0.91
    Capability:   Bridge, on
    Capability:   Router, on
  Port:
    PortID:       ifname swp47
    PortDescr:    swp47
-------------------------------------------------------------------------------
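Checking every port by eye becomes tedious on a full rack, so it can help to compare the observed neighbors against the cabling plan in one pass. The following is a minimal sketch under stated assumptions: the neighbor data has already been collected into a dictionary (how you gather it is left open), only the swp47 pairing comes from the output above, and the swp48 entry and the function name are hypothetical.

```python
# Planned cabling: (local switch, local port) -> (remote switch, remote port).
# Only the swp47 pairing is confirmed by the lldpctl output above; the swp48
# entry is an assumed mirror of it, included for illustration.
EXPECTED = {
    ("leaf01", "swp47"): ("leaf02", "swp47"),
    ("leaf01", "swp48"): ("leaf02", "swp48"),
}


def cabling_errors(expected, observed):
    """Return one message per port whose LLDP neighbor differs from the plan."""
    errors = []
    for local, planned in sorted(expected.items()):
        switch, port = local
        seen = observed.get(local)
        if seen is None:
            errors.append(
                f"{switch} {port}: no LLDP neighbor (expected {planned[0]} {planned[1]})"
            )
        elif seen != planned:
            errors.append(
                f"{switch} {port}: saw {seen[0]} {seen[1]}, expected {planned[0]} {planned[1]}"
            )
    return errors


if __name__ == "__main__":
    # Neighbor data as it might be collected from leaf01; swp48 is missing here.
    observed = {("leaf01", "swp47"): ("leaf02", "swp47")}
    for line in cabling_errors(EXPECTED, observed):
        print(line)
```

An empty result means every planned link was seen with the expected neighbor and port.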
Note: To edit files in Cumulus Linux, use the text editor of your choice. A standard Cumulus Linux installation includes
nano, vi, and zile. You can also install other editors, such as vim, from the Cumulus Linux repository,
repo.cumulusnetworks.com.
The examples in this guide use vi as the text editor. If you are not familiar with using vi, substitute the command with
nano or zile, which may have more familiar user interfaces.
The default configuration for eth0, the management interface, is DHCP. To reconfigure eth0 to use a static IP address, edit
the /etc/network/interfaces file by adding an IP address/mask and an optional gateway. Refer to the Quick Start
Guide, Wired Ethernet Management section of the Cumulus Linux documentation for more information. For example, on
the leaf and spine switches, configure them as follows:
cumulus@leaf01$ sudo vi /etc/network/interfaces
auto eth0
iface eth0
    address 192.168.0.90/24
    gateway 192.168.0.254

cumulus@leaf02$ sudo vi /etc/network/interfaces
auto eth0
iface eth0
    address 192.168.0.91/24
    gateway 192.168.0.254

cumulus@spine01$ sudo vi /etc/network/interfaces
auto eth0
iface eth0
    address 192.168.0.94/24
    gateway 192.168.0.254

cumulus@spine02$ sudo vi /etc/network/interfaces
auto eth0
iface eth0
    address 192.168.0.95/24
    gateway 192.168.0.254
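Because the four stanzas differ only in the address, they can be produced from a single table of management addresses. The sketch below is illustrative rather than a Cumulus Linux feature: it uses the addresses from the examples above, checks that the gateway lies in the same subnet, and returns the stanza text. The function name is hypothetical, and writing the result into /etc/network/interfaces is left to you.

```python
import ipaddress

# Management addresses and gateway taken from the examples above.
MGMT_ADDRESSES = {
    "leaf01": "192.168.0.90/24",
    "leaf02": "192.168.0.91/24",
    "spine01": "192.168.0.94/24",
    "spine02": "192.168.0.95/24",
}
GATEWAY = "192.168.0.254"


def eth0_stanza(address, gateway):
    """Return a static eth0 stanza, rejecting a gateway outside the subnet."""
    iface = ipaddress.ip_interface(address)
    if ipaddress.ip_address(gateway) not in iface.network:
        raise ValueError(f"gateway {gateway} is not in {iface.network}")
    lines = [
        "auto eth0",
        "iface eth0",
        f"    address {iface.with_prefixlen}",
        f"    gateway {gateway}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for switch, address in MGMT_ADDRESSES.items():
        print(f"# {switch}")
        print(eth0_stanza(address, GATEWAY))
```

The subnet check is a small safeguard against the most common typo in static management configurations: a gateway the switch cannot reach.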
Setting Hostname
The default for the hostname of a switch running Cumulus Linux is cumulus. Change this to the appropriate name based on
your network architecture diagram, such as leaf01 or spine01, by modifying /etc/hostname and /etc/hosts. Changes
to the /etc/hostname file do not take effect until you reboot that switch. You can find additional information in the Quick
Start Guide, Setting Unique Host Names section of the Cumulus Linux documentation. For example:
cumulus@leaf01$ sudo vi /etc/hostname
leaf01
cumulus@leaf01$ sudo vi /etc/hosts
127.0.0.1 leaf01 localhost
Configuring DNS
Modify your DNS settings if needed. Add your domain, search domain, and nameserver entries to the /etc/resolv.conf
file. These changes take effect immediately.
cumulus@leaf01$ sudo vi /etc/resolv.conf
domain example.com
search example.com
nameserver x.x.x.x
Configuring NTP (Network Time Protocol)
By default, NTP is installed and enabled on Cumulus Linux. To override the default NTP servers hosted by Cumulus
Networks, edit the /etc/ntp.conf file. Find the four server lines that are configured by default and modify them to
reflect your desired NTP servers.
cumulus@leaf01$ sudo vi /etc/ntp.conf
# pool.ntp.org maps to about 1000 low-stratum NTP servers. Your server will
# pick a different set every time it starts up. Please consider joining the
# pool: <http://www.pool.ntp.org/join.html>
server 0.cumulusnetworks.pool.ntp.org iburst
server 1.cumulusnetworks.pool.ntp.org iburst
server 2.cumulusnetworks.pool.ntp.org iburst
server 3.cumulusnetworks.pool.ntp.org iburst
After modifying the servers for NTP to use, restart the NTP daemon to read in the new changes to the configuration file. This
can be performed with the sudo service ntp restart command.
cumulus@leaf01:~$ sudo service ntp restart
[ ok ] Stopping NTP server: ntpd.
[ ok ] Starting NTP server: ntpd.
cumulus@leaf01:~$
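If the NTP servers are changed on many switches, the edit to /etc/ntp.conf can be scripted. The sketch below assumes the file uses one "server" line per server, as in the excerpt above; the function name replace_ntp_servers and the example server names are hypothetical. It only transforms text and does not write the file or restart the daemon.

```python
def replace_ntp_servers(conf_text: str, servers: list) -> str:
    """Replace every active 'server' line with the given servers.

    Comments and all other directives are kept in place. The new server
    lines are written where the first old 'server' line was found, or
    appended at the end if the file had none.
    """
    new_lines = [f"server {name} iburst" for name in servers]
    out = []
    inserted = False
    for line in conf_text.splitlines():
        # An active server directive starts with the word "server";
        # commented-out lines ("#server ...") are left untouched.
        if line.split()[:1] == ["server"]:
            if not inserted:
                out.extend(new_lines)
                inserted = True
            continue
        out.append(line)
    if not inserted:
        out.extend(new_lines)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "# pool: <http://www.pool.ntp.org/join.html>\n"
        "server 0.cumulusnetworks.pool.ntp.org iburst\n"
        "server 1.cumulusnetworks.pool.ntp.org iburst\n"
    )
    # ntp1.example.com and ntp2.example.com are placeholder names.
    print(replace_ntp_servers(sample, ["ntp1.example.com", "ntp2.example.com"]))
```

After the file is rewritten, the NTP daemon still needs the restart shown above.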
2. Configure Leaf Switches
• Configure each switch in the MLAG pair individually
• Create peer link bond between pair
• Enable MLAG peering between leaf switches
Configure Each Switch
By default, a switch with Cumulus Linux freshly installed has no switch port interfaces defined. Define the basic
characteristics of swp1 through swpN by creating stanza entries for each switch port (swp) in the
/etc/network/interfaces file. Required statements include the following:
auto <switch port name>
iface <switch port name>
An MTU setting of 9216 is recommended to avoid packet fragmentation. For example:
cumulus@leaf01$ sudo vi /etc/network/interfaces
# physical interface configuration
auto swp1
iface swp1
mtu 9216
auto swp2
iface swp2
mtu 9216
auto swp3
iface swp3
mtu 9216
.
.
.
auto swp52
iface swp52
mtu 9216
You can set additional attributes, such as speed and duplex. Refer to the Configuring Switch Port Attributes chapter of the
Cumulus Linux documentation for more information.
Configure all leaf switches identically.
Instead of manually configuring each interface definition, you can programmatically define them by using shorthand syntax
that leverages Python Mako templates. Refer to the Automation Considerations chapter found later in this guide.
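As an illustration of generating the repetitive swp stanzas, the following plain-Python sketch produces the same text as the manual example. It is not the Mako shorthand that this guide refers to; render_swp_stanzas is a hypothetical helper name, and the port range and MTU are the values used above.

```python
def render_swp_stanzas(first: int, last: int, mtu: int = 9216) -> str:
    """Return 'auto/iface/mtu' stanzas for swp<first> through swp<last>."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    stanzas = []
    for number in range(first, last + 1):
        stanzas.append(
            f"auto swp{number}\n"
            f"iface swp{number}\n"
            f"    mtu {mtu}"
        )
    # One blank line between stanzas keeps the file readable.
    return "\n\n".join(stanzas)


if __name__ == "__main__":
    print("# physical interface configuration")
    print(render_swp_stanzas(1, 52))
```

For a spine switch with 32 ports, the same call with (1, 32) gives the corresponding stanzas.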
Once all configurations have been defined in the /etc/network/interfaces file, run the ifquery command to ensure
that all syntax is proper and the interfaces are created as expected:
cumulus@leaf01$ ifquery -a
auto lo
iface lo inet loopback
auto eth0
iface eth0
address 192.168.0.90/24
gateway 192.168.0.254
auto swp1
iface swp1
mtu 9216
.
.
.
Once all configurations have been defined in the /etc/network/interfaces file, apply the configurations to ensure they
are loaded into the kernel. There are several methods for applying configuration changes depending on when and what
changes you want to apply.
Command: sudo ifreload -a
Action: Parse interfaces labeled with auto that have been added to or modified in the configuration file, and apply changes accordingly.
Note: This command is disruptive to traffic only on interfaces that have been modified.

Command: sudo service networking restart
Action: Restart all interfaces labeled with auto as defined in the configuration file, regardless of what has or has not been recently modified.
Note: This command is disruptive to all traffic on the switch including the eth0 management network.

Command: sudo ifup <swpX>
Action: Parse an individual interface labeled with auto as defined in the configuration file and apply changes accordingly.
Note: This command is disruptive to traffic only on interface swpX.

For example, on leaf01:
cumulus@leaf01:~$ sudo ifreload -a
or individually:
cumulus@leaf01:~$ sudo ifup swp1
cumulus@leaf01:~$ sudo ifup swp2
.
.
.
cumulus@leaf01:~$ sudo ifup swp52
Create Peer Link Bond between Switches
Next, create a peer link bond on both switches by editing /etc/network/interfaces and placing the bond configuration
after the swpN interfaces. Configure the peer link bond identically on both switches in the MLAG pair.
For example, add the following peerlink stanza on leaf01 with bond members swp47 and swp48. The configuration will be
identical on leaf02. For more information on bond settings, refer to the MLAG chapter in the Cumulus Linux
documentation.
cumulus@leaf01$ sudo vi /etc/network/interfaces
# peerlink bond for clag
auto peerlink
iface peerlink
bond-slaves swp47 swp48
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
Enable MLAG Peering between Switches
An instance of the clagd daemon runs on each MLAG switch member to keep track of various networking information,
including MAC addresses, needed to maintain the peer relationship. clagd communicates with its peer on the other switch
across a layer 3 interface between the two switches. This layer 3 network should not be advertised by routing protocols, nor
should the VLAN be trunked anywhere else in the network. This interface serves as a keep-alive reachability test
and synchronizes the switch state across the directly attached peer bond.
Create the VLAN subinterface for clagd communication and assign an IP address for this subinterface. A unique .1q tag is
recommended to avoid mixing data traffic with the clagd control traffic.
To enable MLAG peering between switches, configure clagd on each switch by creating a peerlink subinterface in
/etc/network/interfaces with a unique .1q tag. Set values for the following parameters under the peerlink
subinterface:
• address. The local IP address/netmask of this peer switch.
  o Recommended: use a link-local address, for example 169.254.1.X/30.
• clagd-enable. Set to yes (default).
• clagd-peer-ip. Set to the IP address assigned to the peer interface on the peer switch.
• clagd-backup-ip. Set to an IP address on the peer switch reachable independently of the peer link.
  o For example, the management interface or a routed interface that does not traverse the peer link.
• clagd-sys-mac. Set to a unique MAC address you assign to both peer switches.
  o Recommended: use an address within the Cumulus Networks reserved range of 44:38:39:FF:00:00 through
    44:38:39:FF:FF:FF.
For example, configure leaf01 and leaf02 as follows:
cumulus@leaf01$ sudo vi /etc/network/interfaces
# VLAN for clagd communication
auto peerlink.4094
iface peerlink.4094
address 169.254.1.1/30
clagd-enable yes
clagd-peer-ip 169.254.1.2
clagd-backup-ip 192.168.0.91/24
clagd-sys-mac 44:38:39:ff:40:94
cumulus@leaf02$ sudo vi /etc/network/interfaces
# VLAN for clagd communication
auto peerlink.4094
iface peerlink.4094
address 169.254.1.2/30
clagd-enable yes
clagd-peer-ip 169.254.1.1
clagd-backup-ip 192.168.0.90/24
clagd-sys-mac 44:38:39:ff:40:94
Note: MLAG can use any valid IP address pair for communication; however, we suggest using values from the IPv4 link local
range defined by 169.254.0.0/16. These addresses are not exported by routing protocols and, since the peer
communication VLAN is local to this peer link, the same IP address pairs can be used on all MLAG switch pairs.
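The two peerlink subinterface stanzas mirror each other: each switch takes one usable address of the /30 and names the other as its peer. The sketch below renders both from one description and checks the clagd-sys-mac against the reserved range quoted above. The helper names are hypothetical, and the backup IP value is passed through exactly as written in this guide's example.

```python
import ipaddress


def in_reserved_range(mac: str) -> bool:
    """True if mac lies in 44:38:39:FF:00:00 through 44:38:39:FF:FF:FF."""
    octets = mac.split(":")
    if len(octets) != 6:
        return False
    try:
        values = [int(octet, 16) for octet in octets]
    except ValueError:
        return False
    if any(value < 0 or value > 0xFF for value in values):
        return False
    return values[:4] == [0x44, 0x38, 0x39, 0xFF]


def render_peerlink_subif(vlan, subnet, local_index, backup_ip, sys_mac):
    """Render the clagd stanza for one switch of an MLAG pair.

    local_index (0 or 1) selects which usable address of the /30 is local;
    the remaining address becomes clagd-peer-ip.
    """
    network = ipaddress.ip_network(subnet)
    hosts = list(network.hosts())
    if len(hosts) != 2 or local_index not in (0, 1):
        raise ValueError("expected a /30 subnet and a local_index of 0 or 1")
    local, peer = hosts[local_index], hosts[1 - local_index]
    return "\n".join([
        f"auto peerlink.{vlan}",
        f"iface peerlink.{vlan}",
        f"    address {local}/{network.prefixlen}",
        "    clagd-enable yes",
        f"    clagd-peer-ip {peer}",
        f"    clagd-backup-ip {backup_ip}",
        f"    clagd-sys-mac {sys_mac}",
    ])


if __name__ == "__main__":
    mac = "44:38:39:ff:40:94"
    assert in_reserved_range(mac)
    # Backup IP values are copied verbatim from the leaf01/leaf02 example.
    print(render_peerlink_subif(4094, "169.254.1.0/30", 0, "192.168.0.91/24", mac))
    print()
    print(render_peerlink_subif(4094, "169.254.1.0/30", 1, "192.168.0.90/24", mac))
```

Because the link-local subnet is local to each peer link, the same subnet argument can be reused for every MLAG pair; only the VLAN tag, backup IP, and system MAC change.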
Because MTU configurations on a subinterface are inherited from the parent interface and the parent interface (peerlink)
was previously defined with the MTU setting, there is no need to set MTU in the subinterface stanza.
Reload the network configuration for the MLAG changes to be applied and start clagd:
On leaf01:
cumulus@leaf01:~$ sudo ifreload -a
On leaf02:
cumulus@leaf02:~$ sudo ifreload -a
or individually restart just the peerlink and subinterface to apply the MLAG changes and start clagd:
On leaf01:
cumulus@leaf01:~$ sudo ifup peerlink
cumulus@leaf01:~$ sudo ifup peerlink.4094
On leaf02:
cumulus@leaf02:~$ sudo ifup peerlink
cumulus@leaf02:~$ sudo ifup peerlink.4094
Once clagd is configured under the peerlink subinterface, it will automatically start when the system is booted. Once the
interfaces have been started, verify the interfaces are up and have the proper IP addresses assigned.
cumulus@leaf01:~$ ip addr show peerlink.4094
115: peerlink.4094@peerlink: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue
state UP
link/ether c4:54:44:9f:49:1e brd ff:ff:ff:ff:ff:ff
inet 169.254.1.1/30 scope global peerlink.4094
inet6 fe80::7272:cfff:fe9d:4e64/64 scope link
valid_lft forever preferred_lft forever
cumulus@leaf02$ ip addr show peerlink.4094
105: peerlink.4094@peerlink: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue
state UP
link/ether c4:54:44:bd:00:20 brd ff:ff:ff:ff:ff:ff
inet 169.254.1.2/30 scope global peerlink.4094
inet6 fe80::c654:44ff:fe9f:491e/64 scope link
valid_lft forever preferred_lft forever
Next, verify connectivity between the switches by issuing a ping to the backup IP address and across the peer bond from
each switch to its peer switch. For example, from leaf01 you should be able to ping both the backup IP address of leaf02,
192.168.0.91, and the clagd subinterface IP address on leaf02, 169.254.1.2.
cumulus@leaf01$ ping 192.168.0.91
PING 192.168.0.91 (192.168.0.91) 56(84) bytes of data.
64 bytes from 192.168.0.91: icmp_req=1 ttl=64 time=0.258 ms
64 bytes from 192.168.0.91: icmp_req=2 ttl=64 time=0.210 ms
--- 192.168.0.91 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.210/0.234/0.258/0.024 ms
cumulus@leaf01:~$ ping 169.254.1.2
PING 169.254.1.2 (169.254.1.2) 56(84) bytes of data.
64 bytes from 169.254.1.2: icmp_req=1 ttl=64 time=0.798 ms
64 bytes from 169.254.1.2: icmp_req=2 ttl=64 time=0.554 ms
--- 169.254.1.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.554/0.676/0.798/0.122 ms
Likewise, from leaf02 you should be able to ping the backup IP address and the IP address of the clagd subinterface on
leaf01, 192.168.0.90 and 169.254.1.1.
Verify MLAG port channel operation and the peer roles have been established. In the example below, switch leaf01 is
operating as the primary peer switch while leaf02 is the secondary peer switch. By default, priority values for both switches
are equal and set to 32768. The switch with the lower MAC address is given the primary role in the event of a priority tie.
cumulus@leaf01$ clagctl
The peer is alive
Our Priority, ID, and Role: 32768 c4:54:44:9f:49:1e primary
Peer Priority, ID, and Role: 32768 c4:54:44:bd:00:20 secondary
Peer Interface and IP: peerlink.4094 169.254.1.2
Backup IP: 192.168.0.91 (active)
System MAC: 44:38:39:ff:40:94
When an MLAG-enabled switch is in the secondary role, it does not send BPDUs on dual-connected links; it only sends
BPDUs on single-connected links. Also, in case the peer switch is determined to be not alive, the switch in the secondary
role will roll back the link MAC address to be the bond interface MAC address instead of the clagd-sys-mac. By contrast,
the switch in the primary role always uses the clagd-sys-mac and sends BPDUs on all single- and dual-connected links.
The role of primary vs. secondary peer switch becomes important to consider when restarting switches. If a secondary peer
switch is restarted, the LACP system ID remains the same. However, if a primary peer switch is restarted, the LACP system
ID will change, which can be disruptive.
Changing the priority does not cause a traffic interruption but will take a few seconds to switch over while the switch waits
for the next peer update.
To change the priority for leaf02 so that it becomes the primary and leaf01 becomes secondary, use the clagctl
command with the priority parameter:
cumulus@leaf02$ sudo clagctl priority 4096
cumulus@leaf02$ clagctl
The peer is alive
Our Priority, ID, and Role: 4096 c4:54:44:bd:00:20 primary
Peer Priority, ID, and Role: 32768 c4:54:44:9f:49:1e secondary
Peer Interface and IP: peerlink.4094 169.254.1.1
Backup IP: 192.168.0.90 (active)
System MAC: 44:38:39:ff:40:94
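The role selection described above can be modeled in a few lines. This is a conceptual sketch only, not the clagd implementation: it assumes the lower priority value wins (as the priority 4096 example suggests) and that a tie is broken by the lower MAC address. The function names are hypothetical; the priorities and MAC addresses are taken from the clagctl output above.

```python
def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer for comparison."""
    return int(mac.replace(":", ""), 16)


def elect_primary(peers: dict) -> str:
    """Return the name of the switch that takes the primary role.

    peers maps a switch name to (priority, mac). The lowest priority value
    wins; on a priority tie, the lowest MAC address wins.
    """
    return min(peers, key=lambda name: (peers[name][0], mac_to_int(peers[name][1])))


if __name__ == "__main__":
    defaults = {
        "leaf01": (32768, "c4:54:44:9f:49:1e"),
        "leaf02": (32768, "c4:54:44:bd:00:20"),
    }
    print(elect_primary(defaults))

    # Same pair after running 'clagctl priority 4096' on leaf02.
    changed = dict(defaults, leaf02=(4096, "c4:54:44:bd:00:20"))
    print(elect_primary(changed))
```

With the default priorities the sketch selects leaf01, and with leaf02 set to 4096 it selects leaf02, which matches the roles shown in the two clagctl listings.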
3. Configure Spine Switches
• Configure each switch in pair individually
• Create peer link bond between pair
• Enable MLAG peering between spine switches
Configure Each Switch
As in the case of the leaf switches, define all the switch ports on your spine switches that will be in use. A typical spine
switch that is fully populated has swp1 through swp32 defined. Define a port by creating stanza entries for each switch port
(swp) in the /etc/network/interfaces file. For example:
cumulus@spine01$ sudo vi /etc/network/interfaces
# physical interface configuration
auto swp1
iface swp1
mtu 9216
auto swp2
iface swp2
mtu 9216
auto swp3
iface swp3
mtu 9216
.
.
.
auto swp32
iface swp32
mtu 9216
Configure both spine switches identically.
Instead of manually configuring each interface definition, you can programmatically define them by using shorthand syntax
that leverages Python Mako templates. Refer to the Automation Considerations chapter found later in this guide.
As you did previously with the leaf switches:
• Use sudo ifquery -a to verify all interfaces were properly defined in the configuration file.
• Bring up the interfaces using sudo ifreload -a or individually using sudo ifup swpN.
Create Peer Link Bond between Switches
Next, create a peer link bond on both spine switches in the same manner as previously done for the leaf switch pairs. Do
this by editing the /etc/network/interfaces file and placing the bond configuration after the swpN interfaces.
Configure the peer link bond identically on both switches in the MLAG pair. Add the following peerlink stanza on spine01,
with swp31 and swp32 for bond members. The configuration will be identical on spine02.
cumulus@spine01$ sudo vi /etc/network/interfaces
# peerlink bond for clag
auto peerlink
iface peerlink
bond-slaves swp31 swp32
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
Enable MLAG Peering between Switches
To enable MLAG peering between switches, configure the clagd daemon by adding the MLAG configuration
parameters under the peerlink subinterface, as you did for the leaf switches.
cumulus@spine01$ sudo vi /etc/network/interfaces
# VLAN for clagd communication
auto peerlink.4093
iface peerlink.4093
address 169.254.1.1/30
clagd-enable yes
clagd-peer-ip 169.254.1.2
clagd-backup-ip 192.168.0.95/24
clagd-sys-mac 44:38:39:ff:40:93
cumulus@spine02$ sudo vi /etc/network/interfaces
# VLAN for clagd communication
auto peerlink.4093
iface peerlink.4093
address 169.254.1.2/30
clagd-enable yes
clagd-peer-ip 169.254.1.1
clagd-backup-ip 192.168.0.94/24
clagd-sys-mac 44:38:39:ff:40:93
We are using .4093 for the peerlink communication between spine01 and spine02 in contrast to .4094 between leaf01
and leaf02. Using different VLAN IDs for different peerlink communication links avoids the potential for creating an
undesired loop.
Next, reload the network configuration for the MLAG changes to be applied and start clagd:
On spine01:
cumulus@spine01:~$ sudo ifreload -a
On spine02:
cumulus@spine02:~$ sudo ifreload -a
or individually restart just the peerlink and subinterface to apply the MLAG changes and start clagd:
On spine01:
cumulus@spine01:~$ sudo ifup peerlink
cumulus@spine01:~$ sudo ifup peerlink.4093
On spine02:
cumulus@spine02:~$ sudo ifup peerlink
cumulus@spine02:~$ sudo ifup peerlink.4093
Once the interfaces have been started, verify the interfaces are up and have the proper IP addresses assigned. For
example, on spine01:
cumulus@spine01$ ip addr show peerlink.4093
36: peerlink.4093@peerlink: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue
state UP
link/ether c4:54:44:72:cf:ad brd ff:ff:ff:ff:ff:ff
inet 169.254.1.1/30 scope global peerlink.4093
inet6 fe80::c654:44ff:fe72:cfad/64 scope link
valid_lft forever preferred_lft forever
Next, verify connectivity between the switches by issuing a ping to the backup IP address and across the peer bond from
each switch to its peer switch. For example, from spine01 you should be able to ping both the backup IP address of spine02,
192.168.0.95, and the clagd subinterface IP address on spine02, 169.254.1.2.
cumulus@spine01$ ping 192.168.0.95
PING 192.168.0.95 (192.168.0.95) 56(84) bytes of data.
64 bytes from 192.168.0.95: icmp_req=1 ttl=64 time=0.277 ms
64 bytes from 192.168.0.95: icmp_req=2 ttl=64 time=0.300 ms
--- 192.168.0.95 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.277/0.288/0.300/0.020 ms
cumulus@spine01:~$ ping 169.254.1.2
PING 169.254.1.2 (169.254.1.2) 56(84) bytes of data.
64 bytes from 169.254.1.2: icmp_req=1 ttl=64 time=0.725 ms
64 bytes from 169.254.1.2: icmp_req=2 ttl=64 time=0.916 ms
--- 169.254.1.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.725/0.820/0.916/0.099 ms
Likewise, from spine02 you should be able to ping the clagd subinterface’s backup IP address, 192.168.0.94, and the IP
address, 169.254.1.1, on spine01.
Verify MLAG port channel operation and the peer roles have been established. In the following example, spine01 is
operating as the primary peer switch and spine02 is the secondary peer switch:
cumulus@spine01$ clagctl
The peer is alive
Our Priority, ID, and Role: 32768 c4:54:44:72:cf:ad primary
Peer Priority, ID, and Role: 32768 c4:54:44:72:dd:c9 secondary
Peer Interface and IP: peerlink.4093 169.254.1.2
Backup IP: 192.168.0.95 (active)
System MAC: 44:38:39:ff:40:93
4. Set Up Spine/Leaf Network Fabric
• Create a switch-to-switch “uplink” bond on each leaf switch to the spine pair and verify MLAG
• Create a switch-to-switch bond on the spine pair to each leaf switch pair and verify MLAG
Creating Switch-to-Switch Spine Bond on Leaf Switches
Now that the peer relationship has been established on the leaf and spine switches, create the switch-to-switch bonds on
the leaf switches by editing the /etc/network/interfaces file and placing the bond configuration after the swpN
interfaces.
You must specify a unique clag-id for every dual-connected bond on each peer switch; the value must be between 1 and
65535 and must be the same on both peer switches in order for the bond to be considered dual-connected. For the
uplink1 bond the clag-id is set to 1000 and the host bond clag-ids start at 1.
cumulus@leaf01$ sudo vi /etc/network/interfaces
# uplink bond to spine
auto uplink1
iface uplink1
bond-slaves swp49 swp50
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1000
Configure leaf02 identically to leaf01.
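The clag-id rules stated above (unique per bond on each switch, between 1 and 65535, identical on both peers) lend themselves to a simple pre-deployment check. The sketch below is illustrative only; check_clag_ids is a hypothetical name, and matching bonds across the two switches by bond name is an assumption based on the peers being configured identically.

```python
def check_clag_ids(local: dict, peer: dict) -> list:
    """Return a list of problems found in two bond-name -> clag-id maps."""
    problems = []
    for switch, bonds in (("local", local), ("peer", peer)):
        seen = {}
        for bond, clag_id in bonds.items():
            if not 1 <= clag_id <= 65535:
                problems.append(f"{switch}: {bond} clag-id {clag_id} is outside 1-65535")
            if clag_id in seen:
                problems.append(f"{switch}: clag-id {clag_id} used by {seen[clag_id]} and {bond}")
            else:
                seen[clag_id] = bond
    # A bond is only treated as dual-connected if both peers use the same ID.
    for bond in sorted(set(local) & set(peer)):
        if local[bond] != peer[bond]:
            problems.append(f"{bond}: clag-id differs ({local[bond]} vs {peer[bond]})")
    return problems


if __name__ == "__main__":
    leaf01 = {"uplink1": 1000, "host-01": 1}
    leaf02 = {"uplink1": 1000, "host-01": 1}
    print(check_clag_ids(leaf01, leaf02) or "no problems found")
```

An empty result means the two maps satisfy the three rules; it does not confirm that the bonds themselves are up.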
Once these interfaces have been created, apply the configuration by using the ifreload command or individually bringing
up the new interfaces on both switches:
On leaf01:
cumulus@leaf01:~$ sudo ifreload -a
On leaf02:
cumulus@leaf02:~$ sudo ifreload -a
or individually:
On leaf01:
cumulus@leaf01:~$ sudo ifup uplink1
On leaf02:
cumulus@leaf02:~$ sudo ifup uplink1
Verify switch-to-switch MLAG port channel operation on both leaf switches using the clagctl command. On leaf01:
cumulus@leaf01$ clagctl
The peer is alive
Peer Priority, ID, and Role: 4096 c4:54:44:bd:00:20 primary
Our Priority, ID, and Role: 32768 c4:54:44:9f:49:1e secondary
Peer Interface and IP: peerlink.4094 169.254.1.2
Backup IP: 192.168.0.91 (active)
System MAC: 44:38:39:ff:40:94
Creating Switch-to-Switch MLAG Bond on Spine Pair to Each Leaf Switch Pair
Create the switch-to-switch MLAG bonds on the spine switches by editing the /etc/network/interfaces file and placing
the bond configuration after the swpN interfaces.
For example, on spine01, create the downlink1 bond to aggregate traffic to the leaf01 and leaf02 pair and the downlink2
bond to aggregate traffic to the leaf03 and leaf04 pair. The clag-id is set to 1 for the downlink1 bond and to 2 for the
downlink2 bond on both spine switches.
cumulus@spine01$ sudo vi /etc/network/interfaces
# leaf01-leaf02 downlink
auto downlink1
iface downlink1
bond-slaves swp1 swp2
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1
# leaf03-leaf04 downlink
auto downlink2
iface downlink2
bond-slaves swp3 swp4
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 2
Configure spine02 identically to spine01.
Once these interfaces have been created, apply the configuration by using the ifreload command or individually bringing
up the new interfaces on both switches:
On spine01:
cumulus@spine01:~$ sudo ifreload -a
On spine02:
cumulus@spine02:~$ sudo ifreload -a
or individually:
On spine01:
cumulus@spine01:~$ sudo ifup downlink1
On spine02:
cumulus@spine02:~$ sudo ifup downlink1
Verify switch-to-switch MLAG port channel operation on both spine switches using the clagctl command. On spine01:
cumulus@spine01$ clagctl
The peer is alive
Our Priority, ID, and Role: 32768 c4:54:44:72:cf:ad primary
Peer Priority, ID, and Role: 32768 c4:54:44:72:dd:c9 secondary
Peer Interface and IP: peerlink.4093 169.254.1.2
Backup IP: 192.168.0.95 (active)
System MAC: 44:38:39:ff:40:93
Dual Attached Ports
Our Interface    Peer Interface    MLAG Id
-------------    --------------    -------
downlink1        downlink1         1
Verify the MLAG connection by running clagctl on the leaf switches.
cumulus@leaf01$ clagctl
The peer is alive
Peer Priority, ID, and Role: 4096 c4:54:44:bd:00:20 primary
Our Priority, ID, and Role: 32768 c4:54:44:9f:49:1e secondary
Peer Interface and IP: peerlink.4094 169.254.1.2
Backup IP: 192.168.0.91 (active)
System MAC: 44:38:39:ff:40:94
Our Interface    Peer Interface    MLAG Id
-------------    --------------    -------
uplink1          uplink1           1000
5. Configure VLANs
Create the VLANs expected for your traffic. For example:
VLAN 10: in-band hypervisor communications
VLAN 15: in-band virtual SAN storage
VLAN 20-23: VM traffic for WWW services
VLAN 30: VM traffic for application 1 data
VLAN 40: VM traffic for application 2 data
To support VLANs in Cumulus Linux 2.5, a single bridge must be created in /etc/network/interfaces. Create the VLAN-aware
bridge on all spine and leaf switches. For example, on leaf01:
cumulus@leaf01$ sudo vi /etc/network/interfaces
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports peerlink uplink1
bridge-vids 10,15,20,21,22,23,30,40,1000-2000
bridge-pvid 1
bridge-stp on
This stanza defines a VLAN-aware bridge for high VLAN scale, assigns the infrastructure ports used for layer 2 links to
the bridge, and assigns VLANs to the network infrastructure. This list of VLAN IDs is inherited by all layer 2 interfaces in the
bridge, unless different values are specified under an interface. In this configuration, all VLAN IDs are trunked to all layer 2
interfaces and bonds. The untagged or native VLAN for the infrastructure ports is defined by bridge-pvid; if one is not
specified, the default value is VLAN ID 1. Setting the ID to 1 is a best practice for spanning tree switch interoperability.
Finally, the stanza enables spanning tree on the bridge. To verify spanning tree operation on the bridge, use the mstpctl
command. The following example shows the spanning tree port information for the uplink1 interface:
cumulus@leaf01$ mstpctl showportdetail bridge uplink1
bridge:uplink CIST info
  enabled            yes                     role                 Root
  port id            8.006                   state                forwarding
  external port cost 305                     admin external cost  0
  internal port cost 305                     admin internal cost  0
  designated root    1.000.44:38:39:FF:00:00 dsgn external cost   305
  dsgn regional root 1.000.44:38:39:FF:77:00 dsgn internal cost   0
  designated bridge  1.000.44:38:39:FF:77:00 designated port      8.001
  admin edge port    no                      auto edge port       yes
  oper edge port     no                      topology change ack  no
  point-to-point     yes                     admin point-to-point auto
  restricted role    no                      restricted TCN       no
  port hello time    2                       disputed             yes
  bpdu guard port    no                      bpdu guard error     no
  network port       no                      BA inconsistent      no
  Num TX BPDU        31237                   Num TX TCN           11
  Num RX BPDU        38123                   Num RX TCN           119
  Num Transition FWD 4                       Num Transition BLK   4
  bpdufilter port    no
  clag ISL           no                      clag ISL Oper UP     no
  clag role          primary                 clag dual conn mac   44:38:39:ff:77:0
  clag remote portID F.FFF                   clag system mac      44:38:39:ff:40:94
Finally, to verify the VLAN assignment, use the bridge vlan show command. For example, on spine01:
cumulus@spine01$ bridge vlan show
port         vlan ids
peerlink     1 PVID Egress Untagged
             10
             15
             20-23
             30
             40
             1000-2000

downlink1    1 PVID Egress Untagged
             10
             15
             20-23
             30
             40
             1000-2000

downlink2    1 PVID Egress Untagged
             10
             15
             20-23
             30
             40
             1000-2000
To verify the interfaces with which a specific VLAN is associated, for example VLAN 10, use the bridge vlan show vlan
command:
cumulus@spine01$ bridge vlan show vlan 10
VLAN 10:
peerlink downlink1 downlink2
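The bridge-vids list mixes single IDs and ranges, while bridge vlan show prints consecutive IDs as collapsed ranges such as 20-23. The following sketch models both notations so a configured list can be compared with what the switch reports. The function names are hypothetical, and the comma-separated input format follows the bridge-vids line shown in this guide.

```python
def expand_vids(spec: str) -> list:
    """Expand a list such as '10,15,20-23' into sorted individual VLAN IDs."""
    vids = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(value) for value in part.split("-", 1))
            if low > high:
                raise ValueError(f"invalid range: {part}")
            vids.update(range(low, high + 1))
        else:
            vids.add(int(part))
    return sorted(vids)


def collapse_vids(vids) -> list:
    """Collapse VLAN IDs into strings, joining consecutive IDs as 'low-high'."""
    ranges = []
    start = prev = None
    for vid in sorted(set(vids)):
        if start is None:
            start = prev = vid
        elif vid == prev + 1:
            prev = vid
        else:
            ranges.append((start, prev))
            start = prev = vid
    if start is not None:
        ranges.append((start, prev))
    return [str(low) if low == high else f"{low}-{high}" for low, high in ranges]


if __name__ == "__main__":
    configured = expand_vids("10,15,20,21,22,23,30,40,1000-2000")
    for entry in collapse_vids(configured):
        print(entry)
```

Applied to the bridge-vids value used on leaf01, the collapsed form lists 10, 15, 20-23, 30, 40, and 1000-2000, the same grouping that appears in the bridge vlan show output (the PVID, VLAN 1, is configured separately through bridge-pvid).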
6. Connect Hosts
• Provision the host OS
• Connect hosts to leaf switches using virtual switches
Provision the host operating systems, if you have not done so already. Since the data center networking fabric has already
been set up, it’s time to connect the hosts to the leaf switches. To improve network reliability and optimization, use host
link redundancy and aggregation.
Set up LACP bonds on your hosts. Then, on each leaf switch, create a host bond interface that contains the host-facing port.
Each host bond has a single member port per switch and is configured on both switches in the MLAG-enabled pair of leaf
switches.
For example, create the following bond on leaf01. Make the identical configuration on leaf02.
cumulus@leaf01$ sudo vi /etc/network/interfaces
auto host-01
iface host-01
bond-slaves swp1
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1
Once the host bond is created, add the bond to the VLAN-aware bridge to enable VLAN trunking to the host.
cumulus@leaf01$ sudo vi /etc/network/interfaces
auto bridge
iface bridge
bridge-ports peerlink uplink1 host-01
By default, the list of VLANs is inherited on the host-01 interface. To override this (to prune VLANs 1000-2000 for example),
configure the allowed VLANs on the host-01 interface without the 1000-2000 from the global configuration as follows:
cumulus@leaf01$ sudo vi /etc/network/interfaces
auto host-01
iface host-01
bridge-vids 10,15,20,21,22,23,30,40
# optional
bridge-pvid 1
mstpctl-portadminedge yes
mstpctl-bpduguard yes
Optional: Set the untagged or native VLAN for the host bond if changing from the default VLAN ID of 1 or changing from the
global value set under the VLAN-aware bridge.
Optional: Set the port to admin edge mode immediately instead of waiting for the automatic admin edge detection.
Optional: It can be a best practice to enable BPDU guard on all the host-facing ports and bonds. Doing so prevents the
accidental connection of a switch into the network on a host port; in the event a BPDU is unintentionally received, BPDU
guard disables that port. To verify that BPDU guard has been enabled on a port, use the mstpctl command and look for
bpdu information:
cumulus@leaf01$ mstpctl showportdetail bridge host-01 | grep bpdu
  bpdu guard port    yes                     bpdu guard error     no
  bpdufilter port    no
To verify the VLAN information is configured correctly on both MLAG peer switches, use the clagctl verifyvlans
command. This command checks that the VLANs are correctly configured on each dual-connected bond. To see the entire
listing of VLANs as well as validate the configuration, add the -v flag:
cumulus@leaf01$ clagctl -v verifyvlans
Our Bond Interface    VlanId    Peer Bond Interface
------------------    ------    -------------------
host-02               1         host-02
host-02               10        host-02
.
.
host-02               40        host-02
uplink01              1         uplink01
uplink01              10        uplink01
.
.
uplink01              2000      uplink01
host-01               1         host-01
host-01               10        host-01
.
.
host-01               40        host-01
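The kind of check that verifyvlans performs can be pictured as a per-bond comparison of the VLAN sets on the two peers. The sketch below is a conceptual model only and does not reflect how clagd implements the command; verify_vlans is a hypothetical name, and the VLAN sets are sample data based on this guide's host-01 and uplink1 examples.

```python
def verify_vlans(ours: dict, peers: dict) -> dict:
    """Compare bond -> VLAN collection maps from two MLAG peers.

    Returns {bond: (only_on_ours, only_on_peer)} for every bond whose VLAN
    sets differ. A bond missing on one side counts as an empty VLAN set.
    """
    mismatches = {}
    for bond in sorted(set(ours) | set(peers)):
        our_vlans = set(ours.get(bond, ()))
        peer_vlans = set(peers.get(bond, ()))
        if our_vlans != peer_vlans:
            mismatches[bond] = (
                sorted(our_vlans - peer_vlans),
                sorted(peer_vlans - our_vlans),
            )
    return mismatches


if __name__ == "__main__":
    host_vlans = [1, 10, 15, 20, 21, 22, 23, 30, 40]
    leaf01 = {"host-01": host_vlans, "uplink1": host_vlans + list(range(1000, 2001))}
    # Sample fault: VLAN 40 was pruned from host-01 on leaf02 only.
    leaf02 = {"host-01": host_vlans[:-1], "uplink1": leaf01["uplink1"]}
    print(verify_vlans(leaf01, leaf02))
```

An empty dictionary means both peers carry the same VLANs on every bond.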
Optional: By default, the list of VLANs is inherited on the host-01 interface. If the host only connects to a single VLAN, for
example VLAN 10, set the port to an access port instead of a trunk as follows:
cumulus@leaf01$ sudo vi /etc/network/interfaces
auto host-01
iface host-01
bridge-access 10
7. Connect Spine Switches to Core
Method 1. External Connectivity at Layer 2
The following steps assume the spine switches connect to core switches at layer 2.
• Create an MLAG port channel to the core for each spine switch
• On each core device, create a vendor-equivalent MLAG bond to the spine switch pair
• Configure the core
If your core switches do not support MLAG or an equivalent multi-chassis LACP solution, you may need to configure two
separate bonds.
If your core devices support MLAG or the equivalent, create a single switch-to-switch MLAG bond on the spine switches by
editing the /etc/network/interfaces file and placing the bond configuration after the swpN interfaces.
For example, on spine01, create the core-uplink interface to aggregate traffic between spine01 and spine02 to the core.
The clag-id for the core-uplink bond is set to 1000 on both spine switches.
cumulus@spine01$ sudo vi /etc/network/interfaces
# bond to core
auto core-uplink
iface core-uplink
bond-slaves swp31 swp32
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1000
Configure spine02 identically to spine01 to aggregate traffic between spine01 and spine02 into a single MLAG bond.
Once these interfaces have been created, apply the configuration by using the ifreload command or individually bringing
up the new interfaces on both switches:
On spine01:
cumulus@spine01:~$ sudo ifreload -a
On spine02:
cumulus@spine02:~$ sudo ifreload -a
or individually:
On spine01:
cumulus@spine01:~$ sudo ifup core-uplink
On spine02:
cumulus@spine02:~$ sudo ifup core-uplink
Verify switch-to-switch MLAG port channel operation on both spine switches using the clagctl command. On spine01:
cumulus@spine01$ clagctl
The peer is alive
Our Priority, ID, and Role: 32768 c4:54:44:72:cf:ad primary
Peer Priority, ID, and Role: 32768 c4:54:44:72:dd:c9 secondary
Peer Interface and IP: peerlink.4093 169.254.1.2
Backup IP: 192.168.0.95 (active)
System MAC: 44:38:39:ff:40:93
Dual Attached Ports
Our Interface    Peer Interface    MLAG Id
-------------    --------------    -------
downlink1        downlink1         1
downlink2        downlink2         2
core-uplink      core-uplink       1000
Method 2. External Connectivity at Layer 3
To provide a layer 3 gateway for a VLAN, use the first hop redundancy protocol, Virtual Router Redundancy (VRR), provided
in Cumulus Linux. VRR provides layer 3 redundancy by using the same virtual IP and MAC addresses on each switch,
allowing traffic across an MLAG to be forwarded, regardless of which switch the traffic arrived on. In this configuration the
switch pair work in an active/active capacity. VRR also works in a non-MLAG environment where a host is in an
active/active or active/standby role. For more information, refer to the Virtual Router Redundancy (VRR) chapter in the
Cumulus Linux documentation.
The following steps assume the spine switches connect to core switches at layer 3.
• Configure layer 3 switch virtual interface (SVI) gateways and connect the spine switches to core switches.
• Configure core-facing interfaces for IP transfer
• Configure dynamic routing protocol
To enable the gateway, first create the layer 3 virtual interface for that VLAN on the VLAN-aware bridge.
For example, to configure this on VLAN 10, add the following stanza to the network configuration:
cumulus@spine01$ sudo vi /etc/network/interfaces
auto bridge.10
iface bridge.10
address 10.1.10.2/24
address-virtual 00:00:5E:00:01:01 10.1.10.1/24
cumulus@spine02$ sudo vi /etc/network/interfaces
auto bridge.10
iface bridge.10
address 10.1.10.3/24
address-virtual 00:00:5E:00:01:01 10.1.10.1/24
This stanza defines a routable interface to VLAN 10 and assigns an IP address to the interface. If, for example, your
gateway is the first IP address in the subnet, such as 10.1.10.1, assign a different address, such as 10.1.10.2, as the
actual interface IP address. On spine02, make sure that the base IP address is unique, such as 10.1.10.3.
In this example configuration, address-virtual creates a virtual interface with the address of 10.1.10.1, assigned to the
bridge VLAN ID 10, with virtual MAC 00:00:5E:00:01:01. These virtual IP and MAC addresses will be used between the pair
of switches for load balancing and failover. The MAC address is in the VRRP MAC address range of 00:00:5E:00:01:XX and
does not overlap with other MAC addresses in the network.
For each desired gateway VLAN, replicate the above configuration, changing the IP addressing to match the subnet and the
bridge.N where N is the VLAN ID.
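Replicating the stanza by hand for many VLANs is error-prone, so it can help to generate it. The following is a minimal sketch, not a Cumulus Linux tool: the function name vrr_stanza, the VLANS plan, the 10.1.20.0/24 subnet, and the choice of a different virtual MAC suffix per VLAN are all assumptions made for illustration. It also assumes the gateway is the first address in each subnet, as in the VLAN 10 example.

```python
import ipaddress


def vrr_stanza(vlan_id, subnet, switch_offset, vmac_last_octet, bridge="bridge"):
    """Build one SVI stanza with a VRR virtual address for a VLAN.

    switch_offset selects this switch's own address within the subnet
    (2 on spine01 and 3 on spine02 in the example above); the gateway is
    assumed to be the first address in the subnet.
    """
    net = ipaddress.ip_network(subnet)
    gateway = net.network_address + 1
    own = net.network_address + switch_offset
    if switch_offset < 2 or own not in net or own == net.broadcast_address:
        raise ValueError("switch_offset must select a free host address")
    if not 0 <= vmac_last_octet <= 0xFF:
        raise ValueError("virtual MAC suffix must fit in one octet")
    # Stay inside the VRRP range 00:00:5E:00:01:XX described above.
    vmac = "00:00:5E:00:01:%02X" % vmac_last_octet
    return "\n".join([
        f"auto {bridge}.{vlan_id}",
        f"iface {bridge}.{vlan_id}",
        f"    address {own}/{net.prefixlen}",
        f"    address-virtual {vmac} {gateway}/{net.prefixlen}",
    ])


# Hypothetical VLAN plan: VLAN ID -> (subnet, last octet of the virtual MAC).
VLANS = {10: ("10.1.10.0/24", 0x01), 20: ("10.1.20.0/24", 0x02)}

for vlan_id, (subnet, suffix) in VLANS.items():
    # switch_offset=2 corresponds to spine01; use 3 for spine02.
    print(vrr_stanza(vlan_id, subnet, switch_offset=2, vmac_last_octet=suffix))
    print()
```

Running the same plan with switch_offset=3 produces the spine02 stanzas, so both switches share the virtual addresses while keeping unique base addresses.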
To verify the virtual address has been created, first check the bridge.10 interface:
cumulus@spine01$ ip addr show bridge.10
53: bridge.10@bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue state UP
link/ether c4:54:44:72:cf:35 brd ff:ff:ff:ff:ff:ff
inet 10.1.10.2/24 scope global bridge.10
inet6 fe80::c654:44ff:fe72:cf35/64 scope link
valid_lft forever preferred_lft forever
When the address-virtual keyword is placed under a layer 3 bridge interface, it automatically creates a virtual interface
named with the syntax (bridge name)-(VLAN ID)-v(virtual instance). In the above example, the bridge name is bridge, the
VLAN ID is 10, and the virtual instance is 0. Thus, the interface bridge-10-v0 has the virtual MAC address and virtual IP
address assigned under the bridge.10 interface. If additional virtual addresses are added to the interface, each will have
its own instance.
To see that the virtual interface is operational, use the ip addr show command.
cumulus@spine01$ ip addr show bridge-10-v0
77: bridge-10-v0@bridge.10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue state UNKNOWN
link/ether 00:00:5e:00:01:01 brd ff:ff:ff:ff:ff:ff
inet 10.1.10.1/32 scope global bridge-10-v0
inet6 fe80::200:5eff:fe00:101/64 scope link
valid_lft forever preferred_lft forever
Automation Considerations
Automation and Converged Administration
Because Cumulus Linux is Linux, you can draw from a rich ecosystem of solutions to automate the provisioning and
management of your switches. Automation tools can range from Linux-based scripts to applications that run natively on the
Linux OS.
In order to have distinct organizational roles and allow both server and network teams to focus on their areas of expertise
and responsibility, yet have a ubiquitous form of centralized management, you can leverage tools such as Puppet, Chef,
Ansible, and others that are widely used for compute management.
The following sections provide automation and template examples using zero touch provisioning, Mako, Ansible, and
Puppet. Additional examples are available in the form of demos in the Cumulus Networks Knowledge Base under Demos
and Training at support.cumulusnetworks.com/hc/en-us/sections/200398866; the source code for the demos can be
found in the Cumulus Networks GitHub repository at: github.com/CumulusNetworks/.
Automated Switch Provisioning with Zero Touch Provisioning
By default, a switch with Cumulus Linux freshly installed has the option to look for and invoke an automation script. This
process is called zero touch provisioning and is triggered by the following conditions:
• The management port (eth0) is configured for DHCP.
• The management port is restarted, or the switch is powered on, using one of the following commands:
  o service networking restart
  o reboot
  o ifdown and ifup the switch port
• The DHCP server is configured with option cumulus-provision-url, code 239.
• The DHCP server is configured with the URL of the automation script to execute.
Alternatively, zero touch provisioning can be run manually by running /usr/lib/cumulus/autoprovision. For example:
cumulus@switch$ sudo /usr/lib/cumulus/autoprovision -u http://10.99.0.1/script.sh
More details on the autoprovision command may be obtained by running the command with the -h option.
Zero touch provisioning runs the script at the URL provided by DHCP or specified manually. Supported languages
include Bash, Ruby, Python, and Perl. As a failsafe mechanism, zero touch provisioning will look for a CumulusLinuxAutoProvision flag in the HTTP header when retrieving the script prior to executing it.
The script can automate many provisioning functions, such as:
• Install the Cumulus Linux license
• Change the hostname
• Run apt-get update
• Install automation tools such as Puppet or Chef
• Create users or integrate with authentication
• Configure sudoers for administrative privileges of users
For example, a script that installs the Cumulus Linux license and SSH keys from a central server 10.99.0.1 is as follows:
#!/bin/bash
# CUMULUS-AUTOPROVISIONING
# Install the license from the web server
wget -q -O /root/license.txt http://10.99.0.1/license.txt
/usr/cumulus/bin/cl-license -i /root/license.txt
# Install SSH keys from the web server (create the directory first if it is missing)
mkdir -p /root/.ssh
/usr/bin/wget -q -O /root/.ssh/authorized_keys http://10.99.0.1/authorized_keys
exit 0
For more information, refer to the Zero Touch Provisioning chapter of the Cumulus Linux documentation.
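Before publishing a provisioning script on the web server, it can be useful to lint it. The following is a minimal sketch of such a check, assuming the marker to look for is the CUMULUS-AUTOPROVISIONING comment shown in the example script and that the interpreter line names one of the supported languages. The function name check_ztp_script and both sample scripts are hypothetical; this is not how the switch itself validates scripts.

```python
ZTP_MARKER = "CUMULUS-AUTOPROVISIONING"
SUPPORTED_INTERPRETERS = ("bash", "ruby", "python", "perl")


def check_ztp_script(body):
    """Return a list of problems found in a provisioning script body."""
    problems = []
    lines = body.splitlines()
    first = lines[0] if lines else ""
    if not first.startswith("#!"):
        problems.append("missing interpreter line")
    elif not any(name in first for name in SUPPORTED_INTERPRETERS):
        problems.append("interpreter is not Bash, Ruby, Python, or Perl")
    if ZTP_MARKER not in body:
        problems.append("missing " + ZTP_MARKER + " marker")
    return problems


GOOD_SCRIPT = """#!/bin/bash
# CUMULUS-AUTOPROVISIONING
wget -q -O /root/license.txt http://10.99.0.1/license.txt
exit 0
"""
BAD_SCRIPT = "#!/bin/bash\necho hello\n"

print(check_ztp_script(GOOD_SCRIPT))
print(check_ztp_script(BAD_SCRIPT))
```

An empty list means the script passed both checks; any other result lists what to fix before pointing DHCP option 239 at the script.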
Network Configuration Templates Using Mako
In the prior section Network Architecture Build Steps, we showed interface configurations that are manually entered in the
/etc/network/interfaces file. Instead of manually configuring each interface definition, you can programmatically
define them by using shorthand syntax that leverages Python Mako templates.
You can use the following Mako template to represent what would take much longer to manually configure. For example,
the following syntax can programmatically define the interface ports for the hosts. This Mako template:
cumulus@leaf01$ sudo vi /etc/network/interfaces
<%
Host_ports = range(1,45)
%>
% for i in Host_ports:
auto swp${i}
iface swp${i}
mtu 9216
% endfor
is equivalent to:
cumulus@leaf01$ sudo vi /etc/network/interfaces
auto swp1
iface swp1
mtu 9216
auto swp2
iface swp2
mtu 9216
auto swp3
iface swp3
mtu 9216
.
.
.
auto swp43
iface swp43
mtu 9216
auto swp44
iface swp44
mtu 9216
For more information and an example, see the knowledge base article, Configuring /etc/network/interfaces with Mako, at
support.cumulusnetworks.com/hc/en-us/articles/202868023.
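To see what the template loop expands to without installing anything, the same expansion can be written in plain Python. This is a sketch of the equivalent logic only, not the Mako engine itself; the function name render_host_ports and its parameters are hypothetical. Note that range(1, 45) in the template covers ports 1 through 44, which the sketch expresses as an inclusive last port.

```python
def render_host_ports(first=1, last=44, mtu=9216):
    """Return interface stanzas for swp<first> through swp<last>, inclusive."""
    stanzas = []
    for i in range(first, last + 1):
        stanzas.append(f"auto swp{i}\niface swp{i}\n    mtu {mtu}\n")
    return "\n".join(stanzas)


text = render_host_ports()
# Show only the first stanza; the full text contains one stanza per port.
print(text.split("\n\n")[0])
```

Comparing the generated text with the hand-written stanzas is a quick way to confirm that a loop boundary is correct before deploying a template.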
Automated Network Configuration Using Ansible
Ansible is an open source, lightweight configuration management tool that can be used to automate many configuration
tasks. Ansible does not require an agent be run on a switch; instead, Ansible manages nodes over SSH.
Using Ansible, you can run automation tasks across many end points, whereas you use Mako within the context of a single
switch.
A particular script that runs a variety of tasks is referred to as a playbook in Ansible. The following example changes the
MTU for a group of switch ports using the example previously shown with Mako.
On the controller, run the tree command to show where the playbook and related files reside:
root@ubuntu# tree
.
├── ansible.cfg
├── ansible.hosts
├── roles
│   └── leaf
│       ├── tasks
│       │   └── main.yml
│       ├── templates
│       │   └── interfaces.j2
│       └── vars
│           └── main.yml
└── leaf.yml
The following files show how the playbook is run:
root@ubuntu# cat ansible.cfg
[defaults]
host_key_checking=False
hostfile = ansible.hosts
The ansible.hosts file specifies switch1 and switch2 as the DNS names of two bare metal switches running Cumulus
Linux:
root@ubuntu# cat ansible.hosts
[switches]
switch1
switch2
The leaf.yml file is what is processed to run the playbook. It points to the roles that should be run.
root@ubuntu# cat leaf.yml
---
- hosts: switches
  user: root
  roles:
    - leaf
The tasks/main.yml includes all the tasks being run; for example, to overwrite the /etc/network/interfaces file with
the template file, roles/leaf/templates/interfaces.j2.
root@ubuntu# cat roles/leaf/tasks/main.yml
- name: configure interfaces
  template: src=interfaces.j2 dest=/etc/network/interfaces
root@ubuntu# cat roles/leaf/templates/interfaces.j2
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet dhcp
{% if switches[inventory_hostname].interfaces is defined -%}
{% for item in switches[inventory_hostname].interfaces -%}
auto {{ item }}
iface {{ item }}
mtu 9216
{% endfor -%}
{% endif -%}
{% if switches[inventory_hostname].start_port is defined -%}
{% for item in range(switches[inventory_hostname].start_port|int(), switches[inventory_hostname].stop_port|int() + 1) -%}
auto swp{{ item }}
iface swp{{ item }}
mtu 9216
{% endfor -%}
{% endif -%}
The vars/main.yml file holds the variables from which the template retrieves its values:
root@ubuntu# cat roles/leaf/vars/main.yml
switches:
  switch1:
    start_port: "1"
    stop_port: "44"
  switch2:
    interfaces: ["swp1", "swp2", "swp3", "swp4", "swp17", "swp18", "swp19", "swp20"]
For switch1, a range similar to the Mako example is used, where switch ports 1 through 44 are set.
switch2 has a different configuration where swp1 through swp4 are set, and then swp17 through swp20.
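The way the template chooses ports from these variables can be mirrored in a few lines of Python, which is handy for checking a variables file before running the playbook. This is a sketch of the selection logic only, not Ansible or Jinja2 code; the function name ports_for is hypothetical, and stop_port is treated as inclusive so that switch1 covers ports 1 through 44 as described.

```python
def ports_for(switch_vars):
    """Return the ports a switch's variables select, explicit list first."""
    ports = list(switch_vars.get("interfaces", []))
    if "start_port" in switch_vars:
        start = int(switch_vars["start_port"])
        stop = int(switch_vars["stop_port"])
        # Python's range() excludes its end value, so add 1 to include stop_port.
        ports.extend(f"swp{n}" for n in range(start, stop + 1))
    return ports


# Same data as roles/leaf/vars/main.yml above.
SWITCHES = {
    "switch1": {"start_port": "1", "stop_port": "44"},
    "switch2": {"interfaces": ["swp1", "swp2", "swp3", "swp4",
                               "swp17", "swp18", "swp19", "swp20"]},
}

for name, switch_vars in SWITCHES.items():
    ports = ports_for(switch_vars)
    print(name, len(ports), ports[0], ports[-1])
```

The inclusive upper bound matters because range functions in both Python and Jinja2 stop one short of their end argument.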
To run the playbook, use the ansible-playbook command. The -k flag prompts for the SSH password, so you can
authenticate with a password rather than SSH keys. For example:
root@ubuntu# ansible-playbook leaf.yml -k
SSH password:
PLAY [switches] ***************************************************************
GATHERING FACTS ***************************************************************
ok: [switch2]
ok: [switch1]
TASK: [leaf | configure interfaces] ****************************************
changed: [switch2]
changed: [switch1]
PLAY RECAP ********************************************************************
switch1                    : ok=2    changed=1    unreachable=0    failed=0
switch2                    : ok=2    changed=1    unreachable=0    failed=0
Automated Network Configuration Using Puppet
Puppet is an open source tool that can automate configuration management through the use of a controller that syncs with
agents installed on each end point. While similar in functionality to Ansible, Puppet relies on agents installed on each
switch being managed. By default, Puppet agents connect to the controller on TCP port 8140.
In Puppet, a particular script that runs a variety of tasks is referred to as a manifest, similar to the idea of an Ansible
playbook. The following example written in Puppet repeats the previous examples shown with Mako and Ansible.
On the controller, run the tree command to show where the manifest and related directories and files reside:
root@ubuntu# tree
.
├── auth.conf
├── autosign.conf
├── fileserver.conf
├── manifests
│   └── site.pp
├── modules
│   └── base
│       ├── manifests
│       │   ├── interfaces.pp
│       │   └── role
│       │       └── switch.pp
│       └── templates
│           └── interfaces.erb
├── puppet.conf
└── templates
The following files show how the manifest is run.
The site.pp file is the main manifest. It contains site-wide statements and node definitions. A node definition is a block
of Puppet code that is included only in the catalog of the matching node, so it holds information specific to that node.
For example, this is where you specify the interfaces whose MTU you want to change to 9216:
root@ubuntu# cat manifests/site.pp
node 'switch2' {
  $int_enabled = true
  $int_mtu = {
    swp1 => {},
    swp2 => {},
    swp3 => {},
    swp4 => {},
  }
  include base::role::switch
}
In this example, the module is simply called base and contains a single manifest, interfaces.pp, containing the class
base::interfaces. Classes generally configure all the packages, configuration files, and services needed to run an
application.
root@ubuntu# cat modules/base/manifests/interfaces.pp
class base::interfaces {
  if $int_enabled == undef {
    $int_enabled = false
  }
  if ($int_enabled == true) {
    file { '/etc/network/interfaces':
      owner   => root,
      group   => root,
      mode    => '0644',
      content => template('base/interfaces.erb')
    }
    service { 'networking':
      ensure     => running,
      subscribe  => File['/etc/network/interfaces'],
      hasrestart => true,
      restart    => '/sbin/ifreload -a',
      enable     => true,
      hasstatus  => false,
    }
  }
}
The interfaces.erb file is a template that fetches the variables from site.pp. The template keeps eth0 as DHCP,
checks to see if int_mtu is defined in site.pp, then loops through each interface provided and sets MTU to 9216.
root@ubuntu# cat modules/base/templates/interfaces.erb
auto eth0
iface eth0 inet dhcp
<% if @int_mtu %>
# interfaces
<% @int_mtu.each_pair do |key, value_hash| %>
auto <%= key %>
iface <%= key %>
mtu 9216
<% end %>
<% else %>
# no interfaces
<% end %>
The role is included in the main manifest. It ties all the manifests together that are utilized for this particular node.
root@ubuntu# cat modules/base/manifests/role/switch.pp
class base::role::switch {
include base::interfaces
}
The puppet.conf file is the main configuration file on both the Puppet master and the Puppet client. On the Puppet
master side, change the dns_alt_names value from the default, puppet:
root@ubuntu# cat puppet.conf
[main]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$vardir/lib/facter
#modulepath=/etc/puppet/modules
#templatedir=$confdir/templates
dns_alt_names = puppet,cumulus-vm
[master]
# These are needed when the puppetmaster is run by passenger
# and can safely be removed if webrick is used.
ssl_client_header = SSL_CLIENT_S_DN
ssl_client_verify_header = SSL_CLIENT_VERIFY
On the Puppet client side, change the server value from the default, puppet:
cumulus@switch2:~$ cat /etc/puppet/puppet.conf
[main]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$vardir/lib/facter
#templatedir=$confdir/templates
server=cumulus
[master]
# These are needed when the puppetmaster is run by passenger
# and can safely be removed if webrick is used.
ssl_client_header = SSL_CLIENT_S_DN
ssl_client_verify_header = SSL_CLIENT_VERIFY
Puppet agents run every 30 minutes by default. You can manually force a Puppet agent to run using the --test option. For
example:
cumulus@switch2:~# sudo puppet agent --test
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for switch2
Info: Applying configuration version '1415654245'
Notice: /Stage[main]/Base::Interfaces/File[/etc/network/interfaces]/content:
--- /etc/network/interfaces 2014-11-10 22:39:55.000000000 +0000
+++ /tmp/puppet-file20141110-3198-b8i1bh 2014-11-10 22:40:20.714158235 +0000
@@ -3,4 +3,27 @@
+# interfaces
+
+auto swp1
+iface swp1
+  mtu 9216
+
+auto swp2
+iface swp2
+  mtu 9216
+
+auto swp3
+iface swp3
+  mtu 9216
+
+auto swp4
+iface swp4
+  mtu 9216
+
+
Info: /Stage[main]/Base::Interfaces/File[/etc/network/interfaces]: Filebucketed
/etc/network/interfaces to puppet with sum 6f8a42d7ebd62f41c19324868384e095
Notice: /Stage[main]/Base::Interfaces/File[/etc/network/interfaces]/content: content
changed '{md5}6f8a42d7ebd62f41c19324868384e095' to
'{md5}ebf607a6ab09b595e81d1ff63e4b1196'
Info: /Stage[main]/Base::Interfaces/File[/etc/network/interfaces]: Scheduling refresh
of Service[networking]
Notice: /Stage[main]/Base::Interfaces/Service[networking]/ensure: ensure changed
'stopped' to 'running'
Info: /Stage[main]/Base::Interfaces/Service[networking]: Unscheduling refresh on
Service[networking]
Notice: Finished catalog run in 6.99 seconds
Operational and Management Considerations
Authentication and Authorization
Cumulus Linux switches can be configured to use OpenLDAP v2.4 or later.
Roles are used to segment privileges: No Access, Read Only, Administrator, and Custom.
In Cumulus Linux, users belonging to the sudo group are equivalent to Administrator. Users that are not part of the
sudo group can be used for read-only access.
Refer to the Authentication, Authorization, and Accounting chapter of the Cumulus Linux documentation for more
information.
Security Hardening
From a security hardening perspective, the following ports are used in this scenario by Cumulus Linux:
• TCP 22: SSH, needed for Cumulus Linux switch management
• TCP 5342: MLAG (default port), needed for MLAG communication between Cumulus Linux switches
Refer to the Netfilter - ACL chapter in the Cumulus Linux documentation for further information.
Accounting and Monitoring
Cumulus Linux log files are written to the /var/log directory. Key log files include:
Cumulus Linux Log File    Description
syslog                    Shows all issues from the kernel, the Cumulus Linux HAL process (switchd), and almost all other application processes, such as DHCP, smond (look for facility and INFO entries), and so forth.
daemon.log                Details when processes start and stop in the system.
quagga/zebra.log          Details issues from the Quagga zebra daemon.
quagga/{protocol}.log     Details issues from layer 3 routing protocols, like OSPF or BGP.
switchd.log               Logs activity on the switch monitored by the switchd process.
clagd.log                 Logs activity of the clagd daemon for MLAG.
Quality of Service (QoS) Considerations
The use of industry-standard switches lowers hardware costs substantially and makes it affordable to add bandwidth
and hot spares. Thus, QoS becomes less important, and you do not have to think about QoS in the traditional
sense. Instead, make sure to provision sufficient bandwidth to minimize oversubscription.
QoS at its core defines which traffic should be dropped and when, yet dropping traffic is almost never appropriate
behavior in the modern data center. When applications begin choking due to bandwidth restrictions, it is much more
productive to increase total bandwidth than to tweak QoS to starve certain traffic. Because Cumulus Linux provides
hardware choice, which has reduced the price of switches significantly, expensive bandwidth is no longer a
problem in today’s data center.
Cumulus Networks strongly recommends not changing the default buffer and queue management values, as they are
optimized for typical environments. However, Cumulus Networks realizes that many situations are unique and require
further tuning based on the requirements of a particular network.
For buffer and queue management, Cumulus Linux divides inbound packets into four priority groups.
By default, Cumulus Linux assigns traffic to priority groups based on Class of Service (CoS) values, rather than
Differentiated Services Code Point (DSCP) values.
Priority Group    Description                                   Default CoS Values
Control           Highest priority                              7
Service           Second highest priority traffic               2
Lossless          Traffic protected by priority flow control    none
Bulk              All remaining traffic                         0, 1, 3, 4, 5, 6
To change the default behavior, edit the /etc/cumulus/datapath/traffic.conf file:
cumulus@switch$ sudo vi /etc/cumulus/datapath/traffic.conf
To specify to which queue a CoS value maps, edit the priority_group.<name>.cos_list values. For example:
priority_group.control.cos_list = [1,3]
To change from CoS to DSCP, change the value of traffic.packet_priority_source, keeping in mind DSCP values
must be mapped to each of the priority groups.
traffic.packet_priority_source = dscp
Changes to traffic.conf require you to run service switchd restart before they take effect.
Note: This command is disruptive to all switch port interfaces.
cumulus@switch$ sudo service switchd restart
Refer to the Configuring Buffer and Queue Management chapter of the Cumulus Linux documentation for further
information.
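When you move CoS values between priority groups, it is easy to leave a value listed twice or not at all. The following is a minimal sketch of a pre-check, assuming that each CoS value from 0 to 7 should belong to exactly one group; that rule, the function names, and the dictionary layout are assumptions for illustration and are not taken from the Cumulus Linux documentation. The rendered lines follow the priority_group.<name>.cos_list format shown above.

```python
# Default mapping from the priority group table above.
DEFAULT_GROUPS = {
    "control": [7],
    "service": [2],
    "lossless": [],
    "bulk": [0, 1, 3, 4, 5, 6],
}


def unmapped_cos_values(groups):
    """Return CoS values (0-7) not assigned to any group; reject duplicates."""
    owner = {}
    for name, values in groups.items():
        for cos in values:
            if cos not in range(8):
                raise ValueError(f"CoS value {cos} is outside 0-7")
            if cos in owner:
                raise ValueError(f"CoS value {cos} is in both {owner[cos]} and {name}")
            owner[cos] = name
    return sorted(set(range(8)) - set(owner))


def render_cos_lists(groups):
    """Render one cos_list line per priority group."""
    return "\n".join(
        "priority_group.%s.cos_list = [%s]" % (name, ",".join(str(v) for v in values))
        for name, values in groups.items()
    )


print(unmapped_cos_values(DEFAULT_GROUPS))
print(render_cos_lists(DEFAULT_GROUPS))
```

For instance, setting the control list to [1,3] without removing 1 and 3 from the bulk list would be reported as a duplicate by this check.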
Link Pause
The pause frame is a flow control mechanism that halts transmission for a specified period of time. For example, a server
or other network node within the data center may be receiving traffic faster than it can handle and thus may benefit from
using the pause frame. In Cumulus Linux, individual ports can be configured to execute link pause by:
• Transmitting pause frames when its ingress buffers become congested (TX pause enable) and/or
• Responding to received pause frames (RX pause enable)
You can configure link pause by editing /etc/cumulus/datapath/traffic.conf.
For example, to enable both types of link pause for swp2 and swp3:
# To configure pause on a group of ports:
# uncomment the link pause port group list
# add or replace a port group name to the list
# populate the port set, e.g.
# swp1-swp4,swp8,swp50s0-swp50s3
# enable pause frame transmit and/or pause frame receive
# link pause
link_pause.port_group_list = [port_group_0]
link_pause.port_group_0.port_set = swp2-swp3
link_pause.port_group_0.rx_enable = true
link_pause.port_group_0.tx_enable = true
A port group refers to one or more sequences of contiguous ports. Multiple port groups can be defined by:
• Adding a comma-separated list of port group names to the port_group_list.
• Adding the port_set, rx_enable, and tx_enable configuration lines for each port group.
You can specify the set of ports in a port group in comma-separated sequences of contiguous ports. The syntax supports:
• A single switch port, like swp1s0 or swp5
• A range of regular switch ports, like swp2-swp5
• A sequence within a breakout switch port, like swp6s0-swp6s3
It does not accept a sequence that includes multiple split ports; for example, swp6s0-swp7s3 is not supported.
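The port set rules above can be checked before editing traffic.conf. The following is a minimal sketch of a parser that expands a port set into individual port names and rejects a range that spans multiple split ports. It reflects only the three forms listed above; the function names and error messages are hypothetical, and the parser used by switchd may behave differently.

```python
import re

PORT_RE = re.compile(r"^swp(\d+)(?:s(\d+))?$")


def _parse_port(name):
    """Split a port name into (port number, sub-port number or None)."""
    match = PORT_RE.match(name)
    if not match:
        raise ValueError(f"not a switch port name: {name!r}")
    sub = match.group(2)
    return int(match.group(1)), (int(sub) if sub is not None else None)


def expand_port_set(port_set):
    """Expand a comma-separated port set into a list of port names."""
    ports = []
    for item in port_set.split(","):
        first, dash, last = item.strip().partition("-")
        port_a, sub_a = _parse_port(first)
        if not dash:
            ports.append(first)  # single port, like swp5 or swp1s0
            continue
        port_b, sub_b = _parse_port(last)
        if sub_a is None and sub_b is None and port_a <= port_b:
            # range of regular ports, like swp2-swp5
            ports.extend(f"swp{n}" for n in range(port_a, port_b + 1))
        elif (sub_a is not None and sub_b is not None
              and port_a == port_b and sub_a <= sub_b):
            # sequence within one breakout port, like swp6s0-swp6s3
            ports.extend(f"swp{port_a}s{n}" for n in range(sub_a, sub_b + 1))
        else:
            raise ValueError(f"unsupported port range: {item.strip()!r}")
    return ports


print(expand_port_set("swp1-swp4,swp8,swp50s0-swp50s3"))
```

Passing swp6s0-swp7s3 raises a ValueError, which matches the restriction that a sequence cannot include multiple split ports.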
Restart switchd for your link pause configuration changes to take effect.
Note: this command is disruptive to all switch port interfaces.
cumulus@switch$ sudo service switchd restart
Conclusion
Summary
Abstracting hardware from software and giving customers a choice through a hardware-agnostic approach is core to
the philosophy of Cumulus Networks. This approach fits well within the software-defined data center as well
as within the confines of traditional layer 2 networks.
Choice and CapEx savings are only the beginning. Long term OpEx savings come from the agility gained through
automation. Cumulus Linux enables network and data center architects to leverage automated provisioning tools and
templates to define and provision traditional layer 2 networks as if the switches were Linux servers with 50+ NICs. Cumulus
Linux leverages the strength, familiarity, and maturity of Linux to propel networking into the 21st century.
References
Article/Document                                    URL
Cumulus Linux Documentation                         https://docs.cumulusnetworks.com
  Quick Start Guide
  Understanding Network Interfaces
  MLAG
  IGMP and MLD Snooping
  LACP Bypass
  Virtual Router Redundancy (VRR)
  Authentication, Authorization, and Accounting
  Configuring Buffer and Queue Management
Cumulus Linux Knowledge Base Articles
  Configuring /etc/network/interfaces with Mako     https://support.cumulusnetworks.com/hc/en-us/articles/202868023
  Configuring a Management Namespace                https://support.cumulusnetworks.com/hc/en-us/articles/202325278
  Demos and Training                                https://support.cumulusnetworks.com/hc/en-us/sections/200398866
Linux Training Videos from Cumulus                  https://cumulusnetworks.com/technical-videos/
Cumulus Linux Product Information
  Software Pricing                                  http://cumulusnetworks.com/product/pricing/
  Hardware Compatibility List (HCL)                 http://cumulusnetworks.com/hcl/
Cumulus Linux Downloads                             http://cumulusnetworks.com/downloads/
Cumulus Linux Repository                            http://repo.cumulusnetworks.com
Cumulus Networks GitHub Repository                  https://github.com/CumulusNetworks/
Appendix A: Example /etc/network/interfaces Configurations
leaf01
cumulus@leaf01$ cat /etc/network/interfaces
auto eth0
iface eth0
address 192.168.0.90/24
gateway 192.168.0.254
# physical interface configuration
auto swp1
iface swp1
mtu 9216
auto swp2
iface swp2
mtu 9216
auto swp3
iface swp3
mtu 9216
.
.
.
auto swp52
iface swp52
mtu 9216
# peerlink bond for clag
auto peerlink
iface peerlink
bond-slaves swp47 swp48
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
# VLAN for clagd communication
auto peerlink.4094
iface peerlink.4094
address 169.254.1.1/30
clagd-enable yes
clagd-peer-ip 169.254.1.2
clagd-backup-ip 192.168.0.91
clagd-sys-mac 44:38:39:ff:40:94
# uplink bond to spine
auto uplink1
iface uplink1
bond-slaves swp49 swp50
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-miimon 100
bond-use-carrier 1
bond-xmit-hash-policy layer3+4
clag-id 1000
auto host-01
iface host-01
bond-slaves swp1
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
bridge-vids 10,15,20,21,22,23,30,40
mstpctl-portadminedge yes
mstpctl-bpduguard yes
clag-id 1
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports peerlink uplink1 host-01
bridge-vids 10,15,20,21,22,23,30,40,1000-2000
bridge-pvid 1
bridge-stp on
leaf02
cumulus@leaf02$ sudo vi /etc/network/interfaces
auto eth0
iface eth0
address 192.168.0.91/24
gateway 192.168.0.254
# physical interface configuration
auto swp1
iface swp1
mtu 9216
auto swp2
iface swp2
mtu 9216
auto swp3
iface swp3
mtu 9216
.
.
.
auto swp52
iface swp52
mtu 9216
# peerlink bond for clag
auto peerlink
iface peerlink
bond-slaves swp47 swp48
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
# VLAN for clagd communication
auto peerlink.4094
iface peerlink.4094
address 169.254.1.2/30
clagd-enable yes
clagd-peer-ip 169.254.1.1
clagd-backup-ip 192.168.0.90
clagd-sys-mac 44:38:39:ff:40:94
# uplink bond to spine
auto uplink1
iface uplink1
bond-slaves swp49 swp50
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-miimon 100
bond-use-carrier 1
bond-xmit-hash-policy layer3+4
clag-id 1000
auto host-01
iface host-01
bond-slaves swp1
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
bridge-vids 10,15,20,21,22,23,30,40
mstpctl-portadminedge yes
mstpctl-bpduguard yes
clag-id 1
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports peerlink uplink1 host-01
bridge-vids 10,15,20,21,22,23,30,40,1000-2000
bridge-pvid 1
bridge-stp on
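The MLAG settings on leaf01 and leaf02 must mirror each other, and a mismatch is easy to miss when reading two files side by side. The following is a minimal sketch of a consistency check, using the values from the two configurations above. The dictionary layout and the function name check_mlag_pair are hypothetical; the checks encode only what the example configurations show (shared peer link subnet, mirrored peer IP addresses, backup IP pointing at the peer's eth0 address, and an identical system MAC).

```python
import ipaddress


def check_mlag_pair(a, b):
    """Return a list of inconsistencies between two MLAG peer configurations."""
    problems = []
    link_a = ipaddress.ip_interface(a["address"])
    link_b = ipaddress.ip_interface(b["address"])
    if link_a.network != link_b.network:
        problems.append("peer link addresses are in different subnets")
    for this, this_link, other, other_link in ((a, link_a, b, link_b),
                                               (b, link_b, a, link_a)):
        if this["clagd-peer-ip"] != str(other_link.ip):
            problems.append(f"{this['name']}: clagd-peer-ip is not the peer's address")
        other_mgmt = ipaddress.ip_interface(other["eth0"]).ip
        if this["clagd-backup-ip"] != str(other_mgmt):
            problems.append(f"{this['name']}: clagd-backup-ip is not the peer's eth0 address")
    if a["clagd-sys-mac"].lower() != b["clagd-sys-mac"].lower():
        problems.append("clagd-sys-mac differs between the peers")
    return problems


# Values copied from the leaf01 and leaf02 configurations above.
LEAF01 = {"name": "leaf01", "eth0": "192.168.0.90/24", "address": "169.254.1.1/30",
          "clagd-peer-ip": "169.254.1.2", "clagd-backup-ip": "192.168.0.91",
          "clagd-sys-mac": "44:38:39:ff:40:94"}
LEAF02 = {"name": "leaf02", "eth0": "192.168.0.91/24", "address": "169.254.1.2/30",
          "clagd-peer-ip": "169.254.1.1", "clagd-backup-ip": "192.168.0.90",
          "clagd-sys-mac": "44:38:39:ff:40:94"}

print(check_mlag_pair(LEAF01, LEAF02))
```

The same check can be applied to the spine pair by substituting the peerlink.4093 values from the spine configurations.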
spine01
cumulus@spine01$ sudo vi /etc/network/interfaces
auto eth0
iface eth0
address 192.168.0.94/24
gateway 192.168.0.254
# physical interface configuration
auto swp1
iface swp1
mtu 9216
auto swp2
iface swp2
mtu 9216
auto swp3
iface swp3
mtu 9216
.
.
.
auto swp32
iface swp32
mtu 9216
# peerlink bond for clag
auto peerlink
iface peerlink
bond-slaves swp31 swp32
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
# VLAN for clagd communication
auto peerlink.4093
iface peerlink.4093
address 169.254.1.1/30
clagd-enable yes
clagd-peer-ip 169.254.1.2
clagd-backup-ip 192.168.0.95
clagd-sys-mac 44:38:39:ff:40:93
# leaf01-leaf02 downlink
auto downlink1
iface downlink1
bond-slaves swp1 swp2
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1
auto downlink2
iface downlink2
bond-slaves swp3 swp4
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 2
# Need connection to core
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports peerlink downlink1 downlink2
bridge-vids 10,15,20,21,22,23,30,40,1000-2000
bridge-pvid 1
bridge-stp on
mstpctl-treeprio 4096
spine02
cumulus@spine02$ sudo vi /etc/network/interfaces
auto eth0
iface eth0
address 192.168.0.95/24
gateway 192.168.0.254
# physical interface configuration
auto swp1
iface swp1
mtu 9216
auto swp2
iface swp2
mtu 9216
auto swp3
iface swp3
mtu 9216
.
.
.
auto swp32
iface swp32
mtu 9216
# peerlink bond for clag
auto peerlink
iface peerlink
bond-slaves swp31 swp32
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
# VLAN for clagd communication
auto peerlink.4093
iface peerlink.4093
address 169.254.1.2/30
clagd-enable yes
clagd-peer-ip 169.254.1.1
clagd-backup-ip 192.168.0.94
clagd-sys-mac 44:38:39:ff:40:93
# Need connection to core
# leaf01-leaf02 downlink
auto downlink1
iface downlink1
bond-slaves swp1 swp2
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1
auto downlink2
iface downlink2
bond-slaves swp3 swp4
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 2
# Need connection to core
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports peerlink downlink1 downlink2
bridge-vids 10,15,20,21,22,23,30,40,1000-2000
bridge-pvid 1
bridge-stp on
mstpctl-treeprio 4096
oob-mgmt
When using a management network switch running Cumulus Linux, you can configure the switch by editing the
/etc/network/interfaces file with the following configuration:
cumulus@oob-mgmt$ sudo vi /etc/network/interfaces
auto br0
iface br0
bridge-ageing 300
bridge-ports regex (swp[0-9]*[s]*[0-9])
bridge-stp on
and reloading all networking:
cumulus@oob-mgmt$ sudo ifreload -a
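To get a feel for which interface names the bridge-ports regular expression selects, you can try the pattern in Python. This is a sketch under one assumption: that the whole interface name must match the pattern, which may not be exactly how the pattern is applied on the switch. The helper name is_bridge_port and the sample names are illustrative.

```python
import re

# Pattern copied from the bridge-ports line above.
PORT_PATTERN = re.compile(r"(swp[0-9]*[s]*[0-9])")


def is_bridge_port(name):
    """Return True if the whole interface name matches the pattern."""
    return PORT_PATTERN.fullmatch(name) is not None


for name in ("swp1", "swp48", "swp50s3", "eth0", "br0"):
    print(name, is_bridge_port(name))
```

Under this assumption, regular ports such as swp48 and breakout ports such as swp50s3 are selected, while eth0 and the bridge itself are not.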
Appendix B: Network Design and Setup Checklist
Tasks
Considerations
1. Set up physical network and basic configuration of all switches.
Select network switches
Refer to the HCL and hardware guides at http://cumulusnetworks.com/support/hcl.
Out-of-band management:
Assume minimal traffic requirements, used for initial image loading and then
management and monitoring, with no day-to-day data traffic.
48 port 1G switch is sufficient.
Leaf switches:
Choose between at least a 48 port 10G switch or 32 port 40G switch with
breakout cables.
Consider price and future proofing.
Breakout cables provide more 10G ports on a 40G switch than a single 10G
switch.
Spine switches:
Choose at least a 10G switch or a 40G switch; 40G for more traffic aggregation.
Consider price and future proofing.
Use identical switches in pairs to facilitate easier management; more for hot spares.
Plan cabling
Refer to knowledge base article, Suggested Transceivers and Cables:
https://support.cumulusnetworks.com/hc/en-us/articles/202983783.
Generally, higher number ports on a switch are reserved for uplink ports, so:
Assign downlinks or host ports to the lower end, like swp1, swp2
Reserve higher number ports for network
Reserve highest ports for MLAG peer links
Connect all console ports. See the Quick Start Guide in the Cumulus Linux documentation.
Install Cumulus Linux
Obtain the latest version of Cumulus Linux.
Obtain license key, which is separate from Cumulus Linux OS distribution.
To minimize variables and aid in troubleshooting, use identical versions across switches —
same version X.Y.Z, packages, and patch levels.
At a minimum, ensure switches in MLAG pairs have identical versions.
See the Quick Start Guide in the Cumulus Linux documentation.
Reserve management space
Reserve a pool of IP addresses.
Define hostnames and DNS.
Use RFC 1918 private addresses where possible.
Determine IP addressing
Use DHCP to avoid manually configuring the following on each switch: gateway, IP address, DNS
information, hostname information, zero touch provisioning URL, installation URL for ONIE.
Or use static IP addresses for explicit control, and to avoid managing a table of MAC address
to IP address mappings.
Edit configuration files
Apply standards and conventions to promote similar configurations. For example, place
stanzas in the same order in configuration files across switches and specify the child
interfaces before the parent interfaces (so a bond member appears earlier in the file than
the bond itself, for example). This allows for standardization and easier maintenance and
troubleshooting, and ease of automation and the use of templates.
Consider naming conventions for consistency, readability, and manageability. Doing so
helps facilitate automation. For example, call your leaf switches leaf01 and leaf02 rather
than leaf1 and leaf2.
Use all lowercase for names.
Avoid characters that are not DNS-compatible.
Define child interfaces before using them in parent interfaces. For example, create the
member interfaces of a bond before defining the bond interface itself. Refer to the
Configuring and Managing Network Interfaces chapter of the Cumulus Linux documentation
for more information.
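The ordering convention above, with child interfaces defined before their parent, can be sketched as the following /etc/network/interfaces fragment. The port numbers, the bond name uplink, and the use of DHCP on eth0 are illustrative assumptions, not values taken from this design.

```
# Management port addressed by DHCP (assumed here; replace with a
# static address stanza if explicit control is preferred)
auto eth0
iface eth0 inet dhcp

# Child interfaces first: the bond members
auto swp51
iface swp51

auto swp52
iface swp52

# Parent interface second: the bond that uses them
auto uplink
iface uplink
    bond-slaves swp51 swp52
    bond-mode 802.3ad
```

Keeping the stanzas in this order on every switch makes the files easier to compare and to generate from a template.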
2. Configure leaf switches.
Define switch ports (swp) in /etc/network/interfaces on a switch
Instantiate the swp interfaces so that they can be managed with the ifup and ifdown commands.
Set MTU
By default, MTU is set to 1500. Set to a high value, like 9216, to avoid packet
fragmentation.
Set speed and duplex
These settings are dependent on your network.
Create peer link bond between pair of switches
Assign an IP address for the clagd peer link. Consider using a link-local address (RFC 3927,
169.254.0.0/16) to avoid advertising it, or an RFC 1918 private address.
Use a very high-numbered VLAN if possible to separate the peer communication traffic from
the VLANs that carry data traffic. The highest valid VLAN ID is 4094.
Enable MLAG
Set up MLAG in switch pairs. There’s no particular order necessary for connecting pairs.
Assign clagd-sys-mac
Assign a unique clagd-sys-mac value per pair. This value is used for spanning tree
calculation, so assigning unique values will prevent overlapping MAC addresses.
Use the range reserved for Cumulus Networks: 44:38:39:FF:00:00 through
44:38:39:FF:FF:FF.
Assign priority
Define primary and secondary switches in an MLAG configuration, if desired. Otherwise, by
default the switches will elect a primary switch on their own. Set priority if you want to
explicitly control which switches are designated primary switches.
Automate setup
Use automation for setting up many switches.
Set up a pair of switches manually and verify all connectivity first before attempting to
programmatically recreate.
Try to configure switches as similarly as possible. The main differences between two
switches would be the loopback address, management IP address, MLAG Peer, and VRR.
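As a minimal sketch of the leaf switch items above (MTU, peer link bond, clagd addressing, clagd-sys-mac, and priority), the fragment below shows one switch of an MLAG pair. The port numbers, the VLAN 4094 subinterface, the 169.254.255.0/30 addresses, the specific MAC address, and the priority value are all assumptions chosen for illustration.

```
# Peer link members (highest-numbered ports, per the cabling plan)
auto swp49
iface swp49

auto swp50
iface swp50

# Peer link bond between the two switches of the pair
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-miimon 100
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4
    mtu 9216

# High-numbered VLAN subinterface that carries clagd peer traffic
auto peerlink.4094
iface peerlink.4094
    address 169.254.255.1/30
    clagd-enable yes
    clagd-peer-ip 169.254.255.2
    # Same value on both switches of this pair, unique across pairs,
    # taken from the range reserved for Cumulus Networks
    clagd-sys-mac 44:38:39:ff:00:01
    # Optional: the lower value is preferred for the primary role
    clagd-priority 4096
```

On the peer switch, the address and clagd-peer-ip values would be swapped while clagd-sys-mac stays the same. Omitting clagd-priority lets the switches elect a primary on their own.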
3. Configure spine switches.
Repeat steps for configuring leaf switches
The steps for configuring spine switches are similar to those for configuring leaf switches.
Consider using a different VLAN number for spine peer bonds than for leaf peer bonds, to
keep them distinct and avoid accidentally trunking the same VLAN across both.
4. Set up spine/leaf network fabric.
Create a switch-to-switch “uplink” bond on each leaf switch to the spine pair and verify MLAG.
Use different clag-ids for uplinks versus host bonds.
Create a switch-to-switch bond on the spine pair to each leaf switch pair and verify MLAG.
5. Configure VLANs.
Define VLANs
Determine the list of VLANs needed.
Determine which VLANs are needed on which interfaces.
Prune VLANs where possible.
Set native VLAN for trunk ports.
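Pruning and the native VLAN can be expressed per port. The sketch below assumes the VLAN-aware bridge from Appendix A, with the port also listed under bridge-ports; the port swp5 and the VLAN numbers are hypothetical.

```
# Trunk port that carries only the VLANs it needs
auto swp5
iface swp5
    # Prune: allow only VLANs 10 and 20 instead of the bridge-wide list
    bridge-vids 10,20
    # Native (untagged) VLAN for this trunk port
    bridge-pvid 10
```

A port without these per-port settings inherits the bridge-vids and bridge-pvid values defined on the bridge itself.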
6. Connect and configure hosts.
Set up high availability on hosts
Set LACP to fast mode (default).
Enable BPDU guard on host-facing ports to prevent a switch from being connected to a host port.
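A host-facing MLAG bond with these settings might look like the following sketch. The bond name host1, the member port swp1, and the clag-id value are assumptions; the clag-id must match on both switches of the pair and differ from the IDs used for uplink bonds.

```
auto swp1
iface swp1

# Dual-connected host bond
auto host1
iface host1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-miimon 100
    # LACP fast mode
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4
    clag-id 1
    # Treat the port as an edge port and disable it if a BPDU arrives,
    # which guards against a switch being attached to the host port
    mstpctl-portadminedge yes
    mstpctl-bpduguard yes
```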
7. Connect spine switches to core.
Connect to core switch at layer 2, if applicable
Check MTU setting for the connection to the core. This depends on what the core needs.
Check MLAG-type capability.
Determine what VLANs need to be trunked to the core instead of pruned.
Ensure the native VLAN matches the core native VLAN.
Connect to core switch at layer 3, if applicable
Check MTU setting for the connection to the core. This depends on what the core needs.
Determine how to handle the default route: originate or learn?
Specify the IP address subnet information for layer 3 VLANs.
Decide what IP address to use for the gateway. Typically this is either .1 or .254.
Assign IP addresses for VRR. Typically, they are adjacent to the gateway, so .2 or .253.
Assign virtual MAC addresses to use for VRR. For better manageability, use the same MAC
address on both peer switches, from the reserved range for VRRP: 00:00:5E:00:01:XX.
Determine if routing will use static or dynamic routing. If dynamic:
• Specify router-id and advertised networks.
• Consider IPv4 and/or IPv6.
• Determine protocol, OSPF or BGP.
OSPF:
• Define area.
• Verify MTU setting. OSPF in particular can run into problems with improper MTU
settings.
• Define reference bandwidth.
• Set timers.
• Define network type, such as point-to-point.
• Choose between OSPF numbered or OSPF unnumbered interfaces.
BGP:
• Define autonomous system number (ASN).
• Set timers.
• Choose between iBGP or eBGP.
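The gateway and VRR addressing described for the layer 3 connection can be sketched as a VLAN interface on one spine switch. The VLAN number, the 10.0.10.0/24 subnet, and the last octet of the virtual MAC address are assumptions for illustration; the fragment also assumes a VLAN-aware bridge named bridge, as in Appendix A.

```
# Layer 3 interface for VLAN 10 on the first switch of the pair
auto bridge.10
iface bridge.10
    # Real address of this switch, adjacent to the gateway address
    address 10.0.10.2/24
    # Shared virtual MAC (VRRP reserved range) and gateway IP address;
    # this line is identical on both peer switches
    address-virtual 00:00:5e:00:01:0a 10.0.10.1/24
```

The peer switch would use a different real address, for example 10.0.10.3/24, with the same address-virtual line, so hosts keep a single gateway of 10.0.10.1.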