Cumulus Linux 2.5.2 User Guide

Table of Contents
Welcome to Cumulus Networks
Quick Start Guide
    Contents
    What's New in Cumulus Linux 2.5.2
    Open Source Contributions
    Prerequisites
    Hardware Compatibility List
    Installing Cumulus Linux
    Upgrading Cumulus Linux
    Configuring Cumulus Linux
    Configuring 4x10G Port Configuration (Splitter Cables)
    Testing Cable Connectivity
    Configuring Switch Ports
    Configuring a Loopback Interface
System Management and Diagnostics
    Authentication, Authorization, and Accounting
    Configuring switchd
    Monitoring and Troubleshooting
    Installation, Upgrading and Package Management
    Netfilter - ACLs
Configuring and Managing Network Interfaces
    Contents
    Commands
    Man Pages
    Configuration Files
    Basic Commands
    ifupdown2 Built-in Interfaces
    ifupdown2 Interface Dependencies
    Bringing All auto Interfaces Up or Down
    Configuring IP Addresses
    Specifying User Commands
    Sourcing Interface File Snippets
    Using Globs for Port Lists
    Using Templates
    Adding Descriptions to Interfaces
    Caveats and Errata
    Useful Links
Layer 2 Features
    Link Layer Discovery Protocol
    Prescriptive Topology Manager - PTM
    Understanding Network Interfaces
    Bonding - Link Aggregation
    Ethernet Bridging - VLANs
    Network Virtualization
    Multi-Chassis Link Aggregation - CLAG - MLAG
    LACP Bypass
    Spanning Tree and Rapid Spanning Tree
    Configuring Switch Port Attributes
    Configuring Buffer and Queue Management
    Virtual Router Redundancy - VRR
    IGMP and MLD Snooping
Layer 3 Features
    Routing
    Introduction to Routing Protocols
    Network Topology
    Quagga Overview
    Configuring Quagga
    Open Shortest Path First - OSPF - Protocol
    Open Shortest Path First v3 - OSPFv3 - Protocol
    Configuring Border Gateway Protocol - BGP
    Hardware ECMP Hashing
Index
Welcome to Cumulus Networks
We are transforming networking with Cumulus Linux, the industry's first, full-featured Linux operating
system for networking hardware. Cumulus Linux is a complete network operating system, based on
Debian wheezy. Unlike traditional embedded platforms, Cumulus Linux provides a complete
environment pre-installed with scripting languages, server utilities, and monitoring tools. Management
tasks are accomplished via SSH using standard Linux commands or over a serial console connection.
This documentation is current as of June 3, 2015 for version 2.5.2. Please visit the Cumulus Networks
Web site (cumulusnetworks.com) for the most up-to-date documentation.
Read the release notes for new features and known issues in this release.
Release Notes for Cumulus Linux 2.5.2 (see page 4)
Quick Start Guide (see page 4)
System Management and Diagnostics (see page 15)
Network Troubleshooting (see page 82)
Layer 2 Features (see page 137)
Layer 3 Features (see page 261)
Quick Start Guide
This chapter helps you get up and running with Cumulus Linux quickly and easily.
Contents
Contents (see page 5)
What's New in Cumulus Linux 2.5.2 (see page 5)
Open Source Contributions (see page 5)
Prerequisites (see page 6)
Hardware Compatibility List (see page 6)
Installing Cumulus Linux (see page 6)
Upgrading Cumulus Linux (see page 7)
Configuring Cumulus Linux (see page 7)
Login Credentials (see page 7)
Serial Console Management (see page 8)
Wired Ethernet Management (see page 8)
Configuring the Hostname and Time Zone (see page 8)
Installing the License (see page 9)
Configuring 4x10G Port Configuration (Splitter Cables) (see page 11)
Testing Cable Connectivity (see page 11)
Configuring Switch Ports (see page 12)
Layer 2 Port Configuration (see page 12)
Layer 3 Port Configuration (see page 13)
Configuring a Loopback Interface (see page 14)
What's New in Cumulus Linux 2.5.2
Cumulus Linux 2.5.2 supports new hardware platforms. The release notes contain information about
the new features and known issues in this release.
Open Source Contributions
Cumulus Networks has forked various software projects, like CFEngine, Netdev and some Puppet Labs
packages in order to implement various Cumulus Linux features. The forked code resides in the
Cumulus Networks GitHub repository.
Cumulus Networks developed and released as open source some new applications as well.
The list of open source projects is on the open source software page.
Prerequisites
Prior intermediate Linux knowledge is assumed for this guide. You should be familiar with basic text
editing, Unix file permissions, and process monitoring. A variety of text editors are pre-installed,
including vi and nano.
You must have access to a Linux or UNIX shell. If you are running Windows, you should use a Linux
environment like Cygwin as your command line tool for interacting with Cumulus Linux.
If you're a networking engineer but are unfamiliar with Linux concepts, use this reference
guide to see examples of the Cumulus Linux CLI and configuration options, and their
equivalent Cisco Nexus 3000 NX-OS commands and settings for comparison.
Hardware Compatibility List
You can find the most up to date hardware compatibility list (HCL) here. Use the HCL to confirm that
your switch model is supported by Cumulus Networks. The HCL is updated regularly, listing products by
port configuration, manufacturer, and SKU part number.
Installing Cumulus Linux
This quick start guide walks you through the steps necessary for getting Cumulus Linux up and running
on your switch, which includes:
1. Powering on the switch and entering ONIE, the Open Network Install Environment.
2. Installing Cumulus Linux on the switch via ONIE.
3. Booting into Cumulus Linux and installing the license.
4. Rebooting the switch to activate the switch ports.
5. Configuring switch ports and a loopback interface.
To install Cumulus Linux, you use ONIE (Open Network Install Environment), an extension to the
traditional U-Boot software that allows for automatic discovery of a network installer image. This
facilitates the ecosystem model of procuring switches, with a user's own choice of operating system
loaded, such as Cumulus Linux.
If Cumulus Linux is already installed on your switch, and you need to upgrade the software
only, you can skip to Upgrading Cumulus Linux (see page 7) below.
The easiest way to install Cumulus Linux with ONIE is via local HTTP discovery:
1. If your host (like a laptop or server) is IPv6-enabled, make sure it is running a Web server.
If the host is IPv4-enabled, make sure it is running DHCP as well as a Web server.
2. Download the Cumulus Linux installation file to the root directory of the Web server. Rename
this file onie-installer.
3. Connect your host via Ethernet cable to the management Ethernet port of the switch.
4. Power on the switch. The switch downloads the ONIE image installer and boots it. You can watch
the progress of the install in your terminal. After the installation finishes, the Cumulus Linux
login prompt appears in the terminal window.
These steps describe a flexible unattended installation method. You should not need a
console cable. A fresh install via ONIE using a local Web server should generally complete in
less than 10 minutes.
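Step 2 above (placing the installer in the Web server's root directory under the name onie-installer) can be scripted. The following is a minimal sketch, not an official tool: stage_onie_installer is a hypothetical helper name, and the installer file name and Web root shown in the usage comment are placeholders you would replace with your own.

```shell
# Hypothetical helper: copy an installer image into a Web root as "onie-installer".
# Usage: stage_onie_installer <installer-file> <web-root>
stage_onie_installer() {
    src=$1
    webroot=$2
    [ -f "$src" ] || { echo "installer not found: $src" >&2; return 1; }
    [ -d "$webroot" ] || { echo "web root not found: $webroot" >&2; return 1; }
    # The guide asks for the file to be named onie-installer in the server's root.
    cp "$src" "$webroot/onie-installer"
}

# Example (placeholder paths, adjust to your host):
#   stage_onie_installer ./cumulus-linux-installer.bin /var/www
```

The helper only stages the file; the Web server (and, for IPv4, the DHCP server) still has to be running on the host as described in step 1.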
If you experience issues with ONIE, read this knowledge base article for more ways to install
Cumulus Linux using ONIE.
ONIE supports many other discovery mechanisms using USB (copy the installer to the root of the drive),
DHCPv6 and DHCPv4, and image copy methods including HTTP, FTP, and TFTP. For more information
on these discovery methods, refer to the ONIE documentation.
After installing Cumulus Linux, you are ready to:
Log in to Cumulus Linux on the switch.
Install the Cumulus Linux license.
Configure Cumulus Linux. This quick start guide provides instructions on configuring switch
ports and a loopback interface.
Upgrading Cumulus Linux
If you already have Cumulus Linux installed on your switch and are upgrading to a maintenance release
(X.Y.Z, like 2.5.1) from an earlier release in the same major and minor release family only (like 2.2.1 to
2.2.2, or 2.5.0 to 2.5.1), you can use apt-get to upgrade to the new version instead. See Upgrading
Cumulus Linux to a Maintenance (X.Y.Z) Release (see page 95) for details.
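The release family rule above can be expressed as a small shell check. This is only a sketch: same_release_family is a hypothetical helper, not a Cumulus Linux command, and it assumes both version strings have the X.Y.Z form used in this guide.

```shell
# Hypothetical helper: succeed when two X.Y.Z versions share the same X.Y family.
same_release_family() {
    # ${1%.*} strips the last dot-separated field, so 2.5.1 becomes 2.5.
    [ "${1%.*}" = "${2%.*}" ]
}

# Usage: check whether the apt-get maintenance upgrade path applies.
if same_release_family 2.5.0 2.5.1; then
    echo "same family: apt-get maintenance upgrade applies"
else
    echo "different family: apt-get maintenance upgrade does not apply"
fi
```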
Configuring Cumulus Linux
When bringing up Cumulus Linux for the first time, the management port makes a DHCPv4 request. To
determine the IP address of the switch, you can cross reference the MAC address of the switch with
your DHCP server. The MAC address should be located on the side of the switch or on the box in which
the unit was shipped.
Login Credentials
The default installation includes one system account, root, with full system privileges, and one user
account, cumulus, with sudo privileges. The root account password is set to null by default (which
prohibits login), while the cumulus account is configured with this default password:
CumulusLinux!
In this quick start guide, you will use the cumulus account to configure Cumulus Linux.
For best security, you should change the default password (using the passwd command)
before you configure Cumulus Linux on the switch.
All accounts except root are permitted remote SSH login; sudo may be used to grant a non-root
account root-level access. Commands which change the system configuration require this elevated
level of access.
For more information about sudo, read Using sudo to Delegate Privileges (see page 18).
Serial Console Management
Users are encouraged to perform management and configuration over the network, either in band or
out of band. Use of the serial console is fully supported; however, many customers prefer the
convenience of network-based management.
Typically, switches will ship from the manufacturer with a mating DB9 serial cable. Switches with ONIE
are always set to a 115200 baud rate.
Wired Ethernet Management
Switches supported in Cumulus Linux always contain at least one dedicated Ethernet management
port, which is named eth0. This interface is geared specifically for out-of-band management use. The
management interface uses DHCPv4 for addressing by default. You can set a static IP address in the /etc
/network/interfaces file:
auto eth0
iface eth0
address 192.0.2.42/24
gateway 192.0.2.1
Configuring the Hostname and Time Zone
To change the hostname, modify the /etc/hostname and /etc/hosts files with the desired
hostname and reboot the switch. First, edit /etc/hostname:
cumulus@switch:~$ sudo vi /etc/hostname
Then, in /etc/hosts, replace the hostname on the 127.0.1.1 line with the new hostname:
cumulus@switch:~$ sudo vi /etc/hosts
Reboot the switch:
cumulus@switch:~$ sudo reboot
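For scripted setups, the hostname change in /etc/hostname and /etc/hosts could be made non-interactively instead of with an editor. The sketch below is one way to do it, assuming /etc/hosts contains a line that starts with 127.0.1.1 (the usual Debian layout); set_hostname_files is a hypothetical helper name, not part of Cumulus Linux.

```shell
# Hypothetical helper: write a new hostname into a hostname file and a hosts file.
# Usage: set_hostname_files <new-hostname> <hostname-file> <hosts-file>
set_hostname_files() {
    new=$1
    hostname_file=$2
    hosts_file=$3
    # The hostname file holds only the hostname.
    printf '%s\n' "$new" > "$hostname_file" || return 1
    # Rewrite only the 127.0.1.1 line; all other lines are left untouched.
    sed "s/^127\.0\.1\.1[[:space:]].*/127.0.1.1 $new/" "$hosts_file" > "$hosts_file.tmp" || return 1
    # Write back through the original file so its ownership and mode are kept.
    cat "$hosts_file.tmp" > "$hosts_file" && rm -f "$hosts_file.tmp"
}
```

On a switch this would be run from a root shell, for example as set_hostname_files leaf01 /etc/hostname /etc/hosts (leaf01 is just an example name), followed by a reboot.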
To update the time zone, update the /etc/timezone file with the correct time zone, run dpkg-reconfigure --frontend noninteractive tzdata, then reboot the switch:
cumulus@switch:~$ sudo vi /etc/timezone
cumulus@switch:~$ sudo dpkg-reconfigure --frontend noninteractive tzdata
cumulus@switch:~$ sudo reboot
It is possible to change the hostname without a reboot via a script available on the Cumulus
Networks GitHub site.
Installing the License
Cumulus Linux is licensed on a per-instance basis. Each network system is fully operational, enabling
any capability to be utilized on the switch with the exception of forwarding on switch panel ports. Only
eth0 and console ports are activated on an unlicensed instance of Cumulus Linux. Enabling front panel
ports requires a license.
You should have received a license key from Cumulus Networks or an authorized reseller. Here is a
sample license file:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

mail=user@company.com
group=
expires=1388908800
model=*
serial=*
enforcement=warn
NFR=1
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQEcBXEBAgAGBQJRfvkYAAoJEPxJF1FuRcN0cDwIAL6SGMJLNtJIAWCizks/f1OJ
osV0U6JQEZIy+La1Nt/jcus52VKieybfMP0wPe4XGGzfHQX9pHLe/8JBR7wqRSQY
wwAQd/XXXXXFob+z4iNMwMt4o8obnEhAQ1MlNS+idQaYXgXjgAVZp2fciZ++fo4z
Iwz6GeKSXdm3fG64v6gD/yyyyyyyyfzEiWUg/IieopTGqeFOUWiPGnv57yzBvJWC
mzkDYBine1xUcKzuhc4LWsOjsyYjFmHWw0qLRxkgTBW2Ggm3a8Pa4WPWNhOrkzfZ
p/PDiuJ8d+p9wT9t9sMwpXSh68FljCZbiK+0QVDDgybo/eXFTgJuW72aN31yzGg=
=qZuI
-----END PGP SIGNATURE-----
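The expires value in the sample license looks like a Unix epoch timestamp (seconds since 1970-01-01 UTC). That interpretation is an assumption, not something this guide states. Under that assumption, the sketch below prints the expiry as a readable date; license_expiry is a hypothetical helper and relies on GNU date (the -d @SECONDS form).

```shell
# Hypothetical helper: print a license file's expires= value as a UTC date.
# Assumes the value is a Unix epoch timestamp and that GNU date is available.
license_expiry() {
    # Pick the first line of the form expires=<digits>.
    epoch=$(sed -n 's/^expires=\([0-9][0-9]*\)$/\1/p' "$1" | head -n 1)
    [ -n "$epoch" ] || { echo "no expires= field found in $1" >&2; return 1; }
    date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S UTC'
}

# Example:
#   license_expiry my_license_file.txt
```

For the sample value 1388908800, this prints 2014-01-05 08:00:00 UTC. The helper only reads the license file; it does not validate the signature or install anything.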
There are three ways to install this file onto the system:
Copy and paste the license key into the cl-license command:
cumulus@switch:~$ sudo cl-license -i
<paste file>
^+d
Copy it from a local server. Create a text file with the license and copy it to a server accessible
from the switch. On the switch, use the following command to transfer the file directly on the
switch, then install the license file:
cumulus@switch:~$ scp user@my_server:/home/user/my_license_file.txt .
cumulus@switch:~$ sudo cl-license -i my_license_file.txt
Copy the file to an HTTP server (not HTTPS), then reference the URL when you run cl-license:
cumulus@switch:~$ sudo cl-license -i <URL>
Once the license is installed successfully, reboot the system:
cumulus@switch:~$ sudo reboot
Once rebooted, all front panel ports will be active. The front panel ports are identified as switch ports,
and show up as swp1, swp2, and so forth.
Configuring 4x10G Port Configuration (Splitter Cables)
If you are using 4x10G DAC or AOC cables, edit the /etc/cumulus/ports.conf file to enable support for
these cables, then restart the switchd daemon using the sudo service switchd restart
command. For more details, see Configuring Switch Port Attributes (see page 240).
Testing Cable Connectivity
By default, all data plane ports (every Ethernet port except the management interface, eth0) are
disabled.
To test cable connectivity, administratively enable a port using ip link set <interface> up:
cumulus@switch:~$ sudo ip link set swp1 up
Run the following bash script, as root, to administratively enable all physical ports:
cumulus@switch:~$ sudo su -
root@switch:~# for i in /sys/class/net/*; do iface=`basename $i`; if [[ $iface == swp* ]]; then ip link set $iface up; fi done
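The same idea, selecting every interface whose name starts with swp, can be split into a reusable function. This is a sketch under the assumption that switch ports appear as swp* entries under /sys/class/net, as the loop above implies; list_swp_ports is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of switch ports (swp*) found under a
# sysfs-style directory, one per line. Defaults to /sys/class/net.
list_swp_ports() {
    netdir=${1:-/sys/class/net}
    for path in "$netdir"/swp*; do
        # Skip the literal pattern when no swp* entry exists.
        [ -e "$path" ] || continue
        basename "$path"
    done
}

# Example (run as root on the switch):
#   for port in $(list_swp_ports); do ip link set "$port" up; done
```

Keeping the listing separate from the ip link set call makes it easy to preview which ports would be affected before enabling them.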
To view link status, use ip link show. The following examples show the output of a port in "admin
down", "down" and "up" mode, respectively:
# Administratively Down
swp1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode
DEFAULT qlen 1000
# Administratively Up but Layer 2 protocol is Down
swp1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state
DOWN mode DEFAULT qlen 500
# Administratively Up, Layer 2 protocol is Up
swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
mode DEFAULT qlen 500
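The three cases above differ only in the flags between the angle brackets: UP means the port is administratively enabled, and LOWER_UP means the link itself is up. As an illustration, the sketch below classifies a single line of ip link show output by those flags; link_state is a hypothetical helper, and it assumes the flag list is the first <...> group on the line.

```shell
# Hypothetical helper: classify one "ip link show" line as
# "admin down", "down", or "up" based on its <FLAGS> list.
link_state() {
    flags=${1#*"<"}            # drop everything up to and including '<'
    flags=",${flags%%">"*},"   # keep the text before '>' and wrap it in commas
    case $flags in
        *,LOWER_UP,*) echo "up" ;;          # carrier present
        *,UP,*)       echo "down" ;;        # enabled, but no carrier
        *)            echo "admin down" ;;  # not administratively enabled
    esac
}

# Example:
#   link_state "$(ip link show swp1 | head -n 1)"
```

Wrapping the flag list in commas lets the UP pattern match only the standalone UP flag, so LOWER_UP is not mistaken for it.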
Configuring Switch Ports
Layer 2 Port Configuration
To configure a front panel port or create a bridge, edit the /etc/network/interfaces file. After
saving the file, to activate the change, use the ifup command.
Examples
In the following configuration example, the front panel port swp1 is placed into a bridge called br0:
auto br0
iface br0
bridge-ports swp1
bridge-stp on
To put a range of ports into a bridge, use the glob keyword. For example, add swp1 through swp10,
swp12, and swp14 through swp20 to br0:
auto br0
iface br0
bridge-ports glob swp1-10 swp12 glob swp14-20
bridge-stp on
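To preview which ports a range such as swp1-10 names, the range can be expanded by hand. The sketch below does this in plain shell, assuming ranges are inclusive as the description above says (swp1 through swp10); expand_port_range is a hypothetical helper and is not how ifupdown2 itself implements the glob keyword.

```shell
# Hypothetical helper: expand "swp1-10" into swp1 ... swp10, one name per line.
# A plain name such as "swp12" is printed unchanged.
expand_port_range() {
    spec=$1
    prefix=${spec%%[0-9]*}     # text before the first digit, e.g. "swp"
    range=${spec#"$prefix"}    # the numeric part, e.g. "1-10" or "12"
    first=${range%-*}
    last=${range#*-}
    i=$first
    while [ "$i" -le "$last" ]; do
        printf '%s%s\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

# Example: list the members named by the bridge-ports line above.
#   for spec in swp1-10 swp12 swp14-20; do expand_port_range "$spec"; done
```

Under the inclusive-range assumption, the example lists 18 ports: swp1 through swp10, swp12, and swp14 through swp20.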
To activate or apply the configuration to the kernel:
# First, check for typos:
cumulus@switch:~$ sudo ifquery -a
# Then activate the change if no errors are found:
cumulus@switch:~$ sudo ifup -a
To view the changes in the kernel, use the brctl command:
cumulus@switch:~$ brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.089e01cedcc2       yes             swp1
A script is available to generate a configuration that places all physical ports in a single bridge.
Layer 3 Port Configuration
To configure a front panel port or bridge interface as a Layer 3 port, edit the /etc/network
/interfaces file.
In the following configuration example, the front panel port swp1 is configured as a Layer 3 access port:
auto swp1
iface swp1
address 10.1.1.1/30
To add an IP address to a bridge interface, include the address under the iface configuration in /etc
/network/interfaces:
auto br0
iface br0
address 10.2.2.1/24
bridge-ports glob swp1-10 swp12 glob swp14-20
bridge-stp on
To activate or apply the configuration to the kernel:
# First check for typos:
cumulus@switch:~$ sudo ifquery -a
# Then activate the change if no errors are found:
cumulus@switch:~$ sudo ifup -a
To view the changes in the kernel use the ip addr show command:
br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 00:02:00:00:00:28 brd ff:ff:ff:ff:ff:ff
inet 10.2.2.1/24 scope global br0
swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 44:38:39:00:6e:fe brd ff:ff:ff:ff:ff:ff
inet 10.1.1.1/30 scope global swp1
Configuring a Loopback Interface
Cumulus Linux has a loopback preconfigured in /etc/network/interfaces. When the switch boots
up, it has a loopback interface, called lo, which is up and assigned an IP address of 127.0.0.1.
To see the status of the loopback interface (lo), use the ip addr show lo command:
cumulus@switch:~$ ip addr show lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
Note that the loopback is up and is assigned an IP address of 127.0.0.1.
To add an IP address to a loopback interface, add it directly under the iface lo inet loopback
definition in /etc/network/interfaces:
auto lo
iface lo inet loopback
address 10.1.1.1
If an IP address is configured without a mask, as shown above, the IP address becomes a /32.
So, in the above case, 10.1.1.1 is actually 10.1.1.1/32.
Multiple loopback addresses can be configured by adding additional address lines:
auto lo
iface lo inet loopback
address 10.1.1.1
address 172.16.2.1/24
System Management and Diagnostics
Authentication, Authorization, and Accounting (see page 16)
SSH for Remote Access (see page 17)
User Accounts (see page 18)
Using sudo to Delegate Privileges (see page 18)
LDAP Authentication and Authorization (see page 25)
Configuring switchd (see page 28)
Monitoring and Troubleshooting (see page 31)
Setting Date and Time (see page 35)
Single User Mode - Boot Recovery (see page 37)
Monitoring Interfaces and Transceivers Using ethtool (see page 39)
Resource Diagnostics Using cl-resource-query (see page 42)
Monitoring System Hardware (see page 44)
Monitoring System Statistics and Network Traffic with sFlow (see page 50)
Monitoring Virtual Device Counters (see page 53)
Understanding and Decoding the cl-support Output File (see page 57)
Troubleshooting Log Files (see page 60)
Troubleshooting the etc Directory (see page 62)
Troubleshooting the support Directory (see page 73)
Managing Application Daemons (see page 74)
Troubleshooting ifupdown2 (see page 77)
Network Troubleshooting (see page 82)
Installation, Upgrading and Package Management (see page 89)
Managing Cumulus Linux Disk Images (see page 89)
Adding and Updating Packages (see page 104)
Zero Touch Provisioning (see page 110)
Netfilter (ACLs) (see page 114)
Authentication, Authorization, and Accounting
SSH for Remote Access (see page 17)
User Accounts (see page 18)
Using sudo to Delegate Privileges (see page 18)
PAM and NSS (see page 25)
SSH for Remote Access
You use SSH to securely access a Cumulus Linux switch remotely.
Contents
Contents (see page 17)
Access Using Passkey (Basic Setup) (see page 17)
Completely Passwordless System (see page 18)
Useful Links (see page 18)
Access Using Passkey (Basic Setup)
Cumulus Linux uses the OpenSSH package to provide SSH functionality. The standard mechanisms for
setting up passwordless access apply. The example below has the cumulus user on a machine
called management-station connecting to a switch called cumulus-switch1.
First, on management-station, generate the SSH keys:
cumulus@management-station:~$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/cumulus/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/cumulus/.ssh/id_rsa.
Your public key has been saved in /home/cumulus/.ssh/id_rsa.pub.
The key fingerprint is:
8c:47:6e:00:fb:13:b5:07:b4:1e:9d:f4:49:0a:77:a9 cumulus@management-station
The key's randomart image is:
+--[ RSA 2048]----+
(randomart image omitted)
+-----------------+
Next, copy the public key in ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys in the target
user’s home directory on the switch. Note that scp replaces any existing authorized_keys file rather than appending to it:
cumulus@management-station:~$ scp .ssh/id_rsa.pub cumulus@cumulus-switch1:.ssh/authorized_keys
Enter passphrase for key '/home/cumulus/.ssh/id_rsa':
id_rsa.pub
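If the target account already has an authorized_keys file, copying over it with scp discards the keys it held. A gentler approach is to append the new key only when it is not already present. The sketch below shows one way, assuming the public key file has first been copied to the switch under some other name; append_public_key is a hypothetical helper, not part of OpenSSH or Cumulus Linux.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# unless an identical line is already there.
# Usage: append_public_key <public-key-file> <authorized-keys-file>
append_public_key() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")" || return 1
    touch "$auth_file" || return 1
    # -F: fixed strings, -x: whole-line match, -q: quiet, -f: patterns from file.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file" || return 1
    fi
    # sshd expects the file not to be writable by other users.
    chmod 600 "$auth_file"
}

# Example (on the switch, after copying id_rsa.pub to the home directory):
#   append_public_key ~/id_rsa.pub ~/.ssh/authorized_keys
```

Where it is installed on the management station, the ssh-copy-id utility that ships with OpenSSH performs a similar append over SSH in one step.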
Remember, you cannot use the root account to SSH to a switch in Cumulus Linux.
Completely Passwordless System
When generating the keys, as in the first step above, leave the passphrase empty. Then follow
all the other instructions.
Useful Links
http://www.debian-administration.org/articles/152
User Accounts
By default, Cumulus Linux has two user accounts: root and cumulus.
The root account has the standard Linux root user access to everything on the switch. The root account
password is set to null by default, which prohibits login to the switch by SSH, telnet, FTP, and so forth.
The cumulus account is a user account with sudo privileges. The cumulus user can log in to the system
via all the usual channels like console and SSH (see page 17). The default password is CumulusLinux!.
The cumulus user is in the group sudo.
You can add more user accounts as needed. Like the cumulus account, these accounts must use sudo
to execute privileged commands (see page 18), so be sure to include them in the sudo group.
A user with sudo privileges can assign a valid password for the root account, and can install an SSH key
for root if needed. In these cases, the root account behaves as expected under Debian.
Accessing the switch without any password requires booting into single user mode. Here are the
instructions (see page 37) on how to do this using PowerPC and x86 switches.
Using sudo to Delegate Privileges
By default, Cumulus Linux has two user accounts: root and cumulus. The cumulus account is a normal
user and is in the group sudo.
You can add more user accounts as needed. Like the cumulus account, these accounts must use sudo
to execute privileged commands.
Contents
Contents (see page 18)
Commands (see page 19)
Using sudo (see page 19)
sudoers Examples (see page 20)
Configuration Files (see page 25)
Useful Links (see page 25)
Commands
sudo
visudo
Using sudo
sudo allows you to execute a command as superuser or another user as specified by the security
policy. See man sudo(8) for details.
The default security policy is sudoers, which is configured using /etc/sudoers. Use /etc/sudoers.d/
to add to the default sudoers policy. See man sudoers(5) for details.
Use only visudo to edit the sudoers file; do not use another editor like vi or emacs. See man
visudo(8) for details.
Errors in the sudoers file can result in losing the ability to elevate privileges to root. You can
fix this issue only by power cycling the switch and booting into single user mode. Before
modifying sudoers, enable the root user by setting a password for the root user.
By default, users in the sudo group can use sudo to execute privileged commands. To add users to the
sudo group, use the useradd(8) or usermod(8) command. To see which users belong to the sudo
group, see /etc/group (man group(5)).
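As a minimal sketch of the check described above, the helper below prints the members of a group from a file in group(5) format. The function name list_group_members and the user name alice are illustrative assumptions, not part of Cumulus Linux. Users whose primary group is the one queried do not appear in this field.

```shell
#!/bin/sh
# list_group_members GROUP [FILE]
# Print the comma-separated member list (4th field) of GROUP from a
# group(5)-format file; FILE defaults to /etc/group.
list_group_members() {
    awk -F: -v g="$1" '$1 == g { print $4 }' "${2:-/etc/group}"
}

# Typical use on a switch (hypothetical user "alice"; requires privileges):
#   sudo usermod -a -G sudo alice
#   list_group_members sudo
```

Run list_group_members sudo after adding a user to confirm that the new account is listed before relying on it for privileged access.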
Any command can be run with sudo, including su. A password is required.
The example below shows how to use sudo as a non-privileged user cumulus to bring up an interface:
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master br0 state DOWN mode DEFAULT qlen 500
    link/ether 44:38:39:00:27:9f brd ff:ff:ff:ff:ff:ff
cumulus@switch:~$ ip link set dev swp1 up
RTNETLINK answers: Operation not permitted
cumulus@switch:~$ sudo ip link set dev swp1 up
Password:
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP mode DEFAULT qlen 500
    link/ether 44:38:39:00:27:9f brd ff:ff:ff:ff:ff:ff
sudoers Examples
The following examples show how to grant a user or group of users only the privileges they need
to perform a required task. Each example uses the system group noc; in sudoers, group names
are prefixed with %.
When executed by an unprivileged user, the example commands below must be prefixed with sudo.
Category | Privilege | Example Command | sudoers Entry
Monitoring | Switch port info | ethtool -m swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/ethtool
Monitoring | System diagnostics | cl-support | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-support
Monitoring | Routing diagnostics | cl-resource-query | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-resource-query
Image management | Install images | cl-img-install http://lab/install.bin | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-img-install
Image management | Swapping slots | cl-img-select 1 | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-img-select
Image management | Clearing an overlay | cl-img-clear-overlay 1 | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-img-clear-overlay
Package management | Any apt-get command | apt-get update or apt-get install | %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get
Package management | Just apt-get update | apt-get update | %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get update
Package management | Install packages | apt-get install mtr-tiny | %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get install *
Package management | Upgrading | apt-get upgrade | %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get upgrade
Netfilter | Install ACL policies | cl-acltool -i | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-acltool
Netfilter | List iptables rules | iptables -L | %noc ALL=(ALL) NOPASSWD:/sbin/iptables
L1 + 2 features | Any LLDP command | lldpcli show neighbors / configure | %noc ALL=(ALL) NOPASSWD:/usr/sbin/lldpcli
L1 + 2 features | Just show neighbors | lldpcli show neighbors | %noc ALL=(ALL) NOPASSWD:/usr/sbin/lldpcli show neighbours*
Interfaces | Modify any interface | ip link set dev swp1 {up|down} | %noc ALL=(ALL) NOPASSWD:/sbin/ip link set *
Interfaces | Up any interface | ifup swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/ifup
Interfaces | Down any interface | ifdown swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/ifdown
Interfaces | Up/down only swp2 | ifup swp2 / ifdown swp2 | %noc ALL=(ALL) NOPASSWD:/sbin/ifup swp2,/sbin/ifdown swp2
Interfaces | Any IP address chg | ip addr {add|del} 192.0.2.1/30 dev swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/ip addr *
Interfaces | Only set IP address | ip addr add 192.0.2.1/30 dev swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/ip addr add *
Ethernet bridging | Any bridge command | brctl addbr br0 / brctl delif br0 swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/brctl
Ethernet bridging | Add bridges and ints | brctl addbr br0 / brctl addif br0 swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/brctl addbr *,/sbin/brctl addif *
Spanning tree | Set STP properties | mstpctl setmaxage br2 20 | %noc ALL=(ALL) NOPASSWD:/sbin/mstpctl
Troubleshooting | Restart switchd | service switchd restart | %noc ALL=(ALL) NOPASSWD:/usr/sbin/service switchd *
Troubleshooting | Restart any service | service switchd cron | %noc ALL=(ALL) NOPASSWD:/usr/sbin/service
Troubleshooting | Packet capture | tcpdump | %noc ALL=(ALL) NOPASSWD:/usr/sbin/tcpdump
L3 | Add static routes | ip route add 10.2.0.0/16 via 10.0.0.1 | %noc ALL=(ALL) NOPASSWD:/bin/ip route add *
L3 | Delete static routes | ip route del 10.2.0.0/16 via 10.0.0.1 | %noc ALL=(ALL) NOPASSWD:/bin/ip route del *
L3 | Any static route chg | ip route * | %noc ALL=(ALL) NOPASSWD:/bin/ip route *
L3 | Any iproute command | ip * | %noc ALL=(ALL) NOPASSWD:/bin/ip
L3 | Non-modal OSPF | cl-ospf area 0.0.0.1 range 10.0.0.0/24 | %noc ALL=(ALL) NOPASSWD:/usr/bin/cl-ospf
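One possible way to install an entry like these is as a drop-in file under /etc/sudoers.d/, checked with visudo before it goes live. The sketch below is illustrative only: the function name write_noc_sudoers and the file name noc are assumptions, and the entry is the "Up/down only swp2" example listed above.

```shell
#!/bin/sh
# write_noc_sudoers FILE
# Write a sudoers fragment that lets members of the noc group bring
# swp2 up or down without a password, then restrict its permissions.
write_noc_sudoers() {
    cat > "$1" <<'EOF'
%noc ALL=(ALL) NOPASSWD:/sbin/ifup swp2,/sbin/ifdown swp2
EOF
    chmod 0440 "$1"
}

# Suggested workflow (paths are assumptions):
#   write_noc_sudoers /tmp/noc
#   sudo visudo -c -f /tmp/noc     # syntax check before installing
#   sudo install -o root -g root -m 0440 /tmp/noc /etc/sudoers.d/noc
```

Checking the fragment with visudo -c before copying it into place reduces the risk of a syntax error locking out privilege elevation.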
Configuration Files
/etc/sudoers - default security policy
/etc/sudoers.d/ - additions to the default security policy
Useful Links
sudo
Adding Yourself to sudoers
LDAP Authentication and Authorization
Cumulus Linux uses Pluggable Authentication Modules (PAM) and Name Service Switch (NSS) for user
authentication.
NSS provides the lookup and mapping of users, while PAM provides login handling, authentication and
session setup.
PAM can be used with protocols like LDAP to provide user authentication for numerous services on a
network.
Contents
Contents (see page 25)
Configuring LDAP (see page 26)
Installing libnss-ldapd (see page 26)
Configuring nslcd.conf (see page 26)
Troubleshooting LDAP Authentication (see page 26)
Common Problems (see page 26)
Configuring LDAP Authorization (see page 27)
A Longer Example (see page 27)
References (see page 27)
Configuring LDAP
There are 3 common ways of configuring LDAP authentication on Linux:
libnss-ldap
libnss-ldapd
libnss-sss
This chapter covers using libnss-ldapd only. From internal testing, this library worked best with
Cumulus Linux and was the easiest to configure, automate and troubleshoot.
Installing libnss-ldapd
To install libnss-ldapd, run:
cumulus@switch:~$ sudo apt-get install libnss-ldapd ldap-utils
This brings up an interactive prompt asking questions about the LDAP URI, base domain name and so
on. To pre-fill these details, run apt-get install debconf-utils and populate debconf-set-selections with the appropriate answers. Run debconf-show <pkg> to check the settings.
Here is an example of how to prefill questions using debconf-set-selections.
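The following is a hedged sketch of what such pre-filling might look like, not the guide's original example. The function name ldap_preseed is hypothetical, and the question names (nslcd/ldap-uris, nslcd/ldap-base, libnss-ldapd/nsswitch) are assumed to match the Debian package templates; confirm them with debconf-show nslcd on your system.

```shell
#!/bin/sh
# ldap_preseed URI BASE
# Print debconf selections for the nslcd and libnss-ldapd packages.
# Question names are assumptions; verify with "debconf-show nslcd".
ldap_preseed() {
    cat <<EOF
nslcd nslcd/ldap-uris string $1
nslcd nslcd/ldap-base string $2
libnss-ldapd libnss-ldapd/nsswitch multiselect passwd, group
EOF
}

# Hypothetical usage (server name and base DN are placeholders):
#   ldap_preseed ldaps://myadserver.rtp.example.test \
#       ou=support,dc=rtp,dc=example,dc=test | sudo debconf-set-selections
```

Piping the generated lines into debconf-set-selections before running apt-get install should suppress the corresponding interactive questions.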
For nested group support, libnss-ldapd must be version 0.9 or higher. For Cumulus Linux 2.x,
you can get this from the wheezy-backports repo.
Configuring nslcd.conf
/etc/nslcd.conf is the main configuration file that needs to be changed after the package is
installed. The nslcd.conf man page details all the available configuration options.
Here is an example configuration using Cumulus Linux.
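As a rough illustration only, the sketch below prints a minimal nslcd.conf for LDAP over SSL. It is not the guide's original example: the function name nslcd_conf is hypothetical and every value passed to it is a placeholder. The option names (uid, gid, uri, base, ssl, tls_reqcert, tls_cacertfile) are documented in the nslcd.conf man page.

```shell
#!/bin/sh
# nslcd_conf URI BASE CACERT
# Print a minimal nslcd.conf for LDAP over SSL. All values are placeholders.
nslcd_conf() {
    cat <<EOF
# User and group the nslcd daemon runs as
uid nslcd
gid nslcd
# LDAP server and search base
uri $1
base $2
# Require a valid server certificate
ssl on
tls_reqcert demand
tls_cacertfile $3
EOF
}

# Hypothetical usage; review the output before installing it:
#   nslcd_conf ldaps://myadserver.rtp.example.test \
#       ou=support,dc=rtp,dc=example,dc=test \
#       /etc/ssl/certs/rtp-example-ca.crt
```

A real deployment usually also needs bind credentials and search filters, which are site-specific and omitted here.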
Troubleshooting LDAP Authentication
By default, password and group information is cached by the nscd daemon. When setting up LDAP
authentication for the first time, turn off this service by running sudo service nscd stop.
Stop the nslcd service and run it in debug mode. Debug mode works whether you are using LDAP over
SSL (port 636) or an unencrypted LDAP connection (port 389).
cumulus@switch:~$ sudo service nslcd stop
cumulus@switch:~$ sudo nslcd -d
Common Problems
nslcd cannot read the SSL certificate. nslcd reports a “Permission denied” error in the
debug output during server connection negotiation. A sniffer trace shows only a TCP
handshake followed by a TCP FIN from the switch. Check the permissions on each directory in the
path to the root SSL certificate, and ensure that the certificate is readable by the nslcd user.
The FQDN in the LDAP URI does not match the FQDN in the SSL certificate exactly.
The search filter returns the wrong results. Check for typos in the search filter, and use
ldapsearch to test your filter. For example:
In $HOME/.ldaprc configure basic ldapsearch parameters
--------------
URI ldaps://myadserver.rtp.example.test
BASE ou=support,dc=rtp,dc=example,dc=test
TLS_CACERT /etc/ssl/certs/rtp-example-ca.crt
--------------
# ldapsearch \
-D 'CN=cumulus admin,CN=Users,DC=rtp,DC=example,DC=test' \
-w '1Q2w3e4r!' \
"(&(ObjectClass=user)(memberOf=cn=cumuluslnxadm,ou=groups,ou=support,dc=rtp,dc=example,dc=test))"
Configuring LDAP Authorization
In the /etc/nslcd.conf file, the "filter" keyword defines an LDAP search filter. Use this search filter to
show only the users or groups you want. In the example below, only users in the
cumuluslnxadm group are shown in the passwd database:
# This filter says to get all users who are part of the cumuluslnxadm group.
filter passwd (&(Objectclass=user)(!(objectClass=computer))(memberOf=cn=cumuluslnxadm,ou=groups,ou=support,dc=rtp,dc=example,dc=test))
A Longer Example
A longer, more complete example for configuring LDAP is available on our knowledge base.
References
https://wiki.debian.org/LDAP/PAM
https://raw.githubusercontent.com/arthurdejong/nss-pam-ldapd/master/nslcd.conf
http://backports.debian.org/Instructions/
Configuring switchd
switchd is the daemon at the heart of Cumulus Linux. It communicates between the switch and
Cumulus Linux, and all the applications running on Cumulus Linux.
The switchd configuration is stored in /etc/cumulus/switchd.conf.
Versions of Cumulus Linux prior to 2.1 stored the switchd configuration at /etc/default/switchd.
Contents
Contents (see page 28)
The switchd File System (see page 28)
Configuring switchd Parameters (see page 30)
Restarting switchd (see page 30)
Commands (see page 31)
Configuration Files (see page 31)
The switchd File System
switchd also exports a file system, mounted on /cumulus/switchd, that presents all the switchd
configuration options as a series of files arranged in a tree structure. You can see the contents by
parsing the switchd tree; run tree /cumulus/switchd. The output below is for a switch with one
switch port configured:
cumulus@cumulus:~$ sudo tree /cumulus/switchd/
/cumulus/switchd/
|-- config
|   |-- acl
|   |   |-- non_atomic_update_mode
|   |   `-- optimize_hw
|   |-- arp
|   |   `-- next_hops
|   |-- buf_util
|   |   |-- measure_interval
|   |   `-- poll_interval
|   |-- coalesce
|   |   |-- reducer
|   |   `-- timeout
|   |-- disable_internal_restart
|   |-- ignore_non_swps
|   |-- interface
|   |   |-- swp1
|   |   |   `-- storm_control
|   |   |       |-- broadcast
|   |   |       |-- multicast
|   |   |       `-- unknown_unicast
|   |-- logging
|   |-- route
|   |   |-- host_max_percent
|   |   |-- max_routes
|   |   `-- table
|   `-- stats
|       `-- poll_interval
|-- ctrl
|   |-- acl
|   |   `-- resync
|   |-- hal
|   |   `-- resync
|   |-- logger
|   |-- netlink
|   |   `-- resync
|   |-- resync
|   `-- sample
|       `-- ulog_channel
|-- run
|   `-- route_info
|       |-- ecmp_nh
|       |   |-- count
|       |   |-- max
|       |   `-- max_per_route
|       |-- host
|       |   |-- count
|       |   |-- count_v4
|       |   |-- count_v6
|       |   `-- max
|       |-- mac
|       |   |-- count
|       |   `-- max
|       `-- route
|           |-- count_0
|           |-- count_1
|           |-- count_total
|           |-- count_v4
|           |-- count_v6
|           |-- mask_limit
|           |-- max_0
|           |-- max_1
|           `-- max_total
`-- version
Configuring switchd Parameters
You can use cl-cfg to configure many switchd parameters at runtime (like ACLs, interfaces, and
route table utilization), which minimizes disruption to your running switch. However, some options are
read only and cannot be configured at runtime.
For example, to see data related to routes, run:
cumulus@cumulus:~$ sudo cl-cfg -a switchd | grep route
route.table = 254
route.max_routes = 32768
route.host_max_percent = 50
cumulus@cumulus:~$
To modify the configuration, run cl-cfg -w. For example, to set the buffer utilization measurement
interval to 1 minute, run:
cumulus@cumulus:~$ sudo cl-cfg -w switchd buf_util.measure_interval=1
To verify that the value changed, use grep:
cumulus@cumulus:~$ sudo cl-cfg -a switchd | grep buf
buf_util.poll_interval = 0
buf_util.measure_interval = 1
You can get some of this information by running cl-resource-query, though you cannot
update the switchd configuration with it.
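When scripting against cl-cfg, it can help to extract a single value rather than grep for a substring. The helper below is a small sketch that assumes the "key = value" layout shown in the cl-cfg -a switchd output above; the function name cfg_value is an illustrative assumption.

```shell
#!/bin/sh
# cfg_value KEY
# Read "key = value" lines on stdin (the layout printed by
# "cl-cfg -a switchd" in the examples above) and print the value for KEY.
cfg_value() {
    awk -F' = ' -v k="$1" '$1 == k { print $2 }'
}

# On a switch:
#   sudo cl-cfg -a switchd | cfg_value route.max_routes
```

Matching the key exactly avoids false hits that a plain grep could return for keys sharing a prefix.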
Restarting switchd
Whenever you modify your network configuration (typically changing any *.conf file, like
/etc/cumulus/datapath/traffic.conf), you must restart switchd for the changes to take effect:
cumulus@switch:~$ sudo service switchd restart
Commands
cl-cfg
Configuration Files
/etc/cumulus/switchd.conf
Monitoring and Troubleshooting
This chapter introduces monitoring and troubleshooting Cumulus Linux.
Contents
Contents (see page 31)
Commands (see page 31)
Using the Serial Console (see page 31)
Configuring the Serial Console on PowerPC Switches (see page 31)
Configuring the Serial Console on x86 Switches (see page 32)
Diagnostics Using cl-support (see page 33)
Configuration Files (see page 34)
Next Steps (see page 34)
Commands
cl-support
fw_setenv
Using the Serial Console
The serial console can be a useful tool for debugging issues, especially when you find yourself
rebooting the switch often or if you don’t have a reliable network connection.
The default serial console baud rate is 115200, which is the baud rate ONIE uses.
Configuring the Serial Console on PowerPC Switches
On PowerPC switches, the U-Boot environment variable baudrate identifies the baud rate of the serial
console. To change the baudrate variable, use the fw_setenv command:
cumulus@switch:~$ sudo fw_setenv baudrate 9600
Updating environment variable: `baudrate'
Proceed with update [N/y]? y
You must reboot the switch for the baudrate change to take effect.
The valid values for baudrate are:
300
600
1200
2400
4800
9600
19200
38400
115200
Configuring the Serial Console on x86 Switches
On x86 switches, you configure serial console baud rate by editing grub. The valid values for the baud
rate are:
300
600
1200
2400
4800
9600
19200
38400
115200
To change the serial console baud rate:
1. Edit /etc/default/grub. The two relevant lines in /etc/default/grub are as follows;
replace the 115200 value with a valid value specified above in the --speed variable in the first
line and in the console variable in the second line:
GRUB_SERIAL_COMMAND="serial --port=0x2f8 --speed=115200 --word=8 --parity=no --stop=1"
GRUB_CMDLINE_LINUX="console=ttyS1,115200n8 cl_platform=accton_as5712_54x"
2. After you save your changes to the grub configuration, type the following at the command
prompt:
cumulus@switch:~$ sudo update-grub
3. If you plan on accessing your switch's BIOS over the serial console, you need to update the baud
rate in the switch BIOS. For more information, see this knowledge base article.
4. Reboot the switch.
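The edit in step 1 can be sketched as a small script. This is an illustration under stated assumptions, not a supported tool: the function name set_grub_baud is hypothetical, and it expects the file to contain --speed= and console=ttyS<N>,<rate> settings like those shown above. Try it on a copy of /etc/default/grub first.

```shell
#!/bin/sh
# set_grub_baud RATE FILE
# Rewrite the --speed= value and the console=ttyS<N>,<rate> value in FILE.
# Rejects rates that are not in the list of valid values above.
set_grub_baud() {
    case "$1" in
        300|600|1200|2400|4800|9600|19200|38400|115200) ;;
        *) echo "invalid baud rate: $1" >&2; return 1 ;;
    esac
    sed -e "s/--speed=[0-9][0-9]*/--speed=$1/" \
        -e "s/\(console=ttyS[0-9][0-9]*,\)[0-9][0-9]*/\1$1/" \
        "$2" > "$2.new" && mv "$2.new" "$2"
}

# Hypothetical usage:
#   cp /etc/default/grub /tmp/grub.test
#   set_grub_baud 9600 /tmp/grub.test
```

After applying the same change to the real file, the remaining steps (update-grub, BIOS setting, reboot) are unchanged.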
Diagnostics Using cl-support
You can use cl-support to generate a single export file that contains various details and the
configuration from a switch. This is useful for remote debugging and troubleshooting.
You should run cl-support before you submit a support request to Cumulus Networks as this file
helps in the investigation of issues:
cumulus@switch:~$ sudo cl-support -h
Usage: cl-support [-h] [reason]...
Args:
[reason]: Optional reason to give for invoking cl-support.
Saved into tarball's reason.txt file.
Options:
-h: Print this usage statement
Example output:
cumulus@switch:~$ ls /var/support
cl_support_20130806_032720.tar.xz
The directory structure is compressed using LZMA2 compression and can be extracted using the unxz
command:
cumulus@switch:~$ cd /var/support
cumulus@switch:~$ sudo unxz cl_support_20130729_140040.tar.xz
cumulus@switch:~$ sudo tar xf cl_support_20130729_140040.tar
cumulus@switch:~$ ls -l cl_support_20130729_140040/
-rwxr-xr-x  1 root root 7724 Jul 29 14:00 cl-support
-rw-r--r--  1 root root   52 Jul 29 14:00 cmdline.args
drwxr-xr-x  2 root root 4096 Jul 29 14:00 core
drwxr-xr-x 64 root root 4096 Jul 29 13:51 etc
drwxr-xr-x  4 root root 4096 Jul 29 14:00 proc
drwxr-xr-x  2 root root 4096 Jul 29 14:01 support
drwxr-xr-x  3 root root 4096 Jul 29 14:00 sys
drwxr-xr-x  3 root root 4096 Aug  8 15:22 var
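When several archives accumulate, a helper like the one below can pick the most recent one. It is a sketch that assumes the cl_support_<date>_<time>.tar.xz naming shown above; the function name latest_support_archive is illustrative.

```shell
#!/bin/sh
# latest_support_archive [DIR]
# Print the newest cl_support_*.tar.xz in DIR (default /var/support).
# The timestamp in the file name sorts chronologically, so a plain
# lexical sort is sufficient.
latest_support_archive() {
    ls "${1:-/var/support}"/cl_support_*.tar.xz 2>/dev/null | sort | tail -n 1
}

# On a switch, decompress and unpack in one step (tar -J handles xz):
#   sudo tar -xJf "$(latest_support_archive)" -C /var/support
```

Using tar -xJf is equivalent to running unxz followed by tar xf, but leaves the compressed archive in place.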
The directory contains the following elements:
Directory | Description
core | Contains the core files generated from the Cumulus Linux HAL process, switchd.
etc | Is a replica of the switch’s /etc directory. /etc contains all the general Linux configuration files, as well as configurations for the system’s network interfaces, quagga, jdoo, and other packages.
log | Is a replica of the switch’s /var/log directory. Most Cumulus Linux log files are located in this directory. Notable log files include switchd.log, daemon.log, quagga log files, and syslog. For more information, read this knowledge base article.
proc | Is a replica of the switch’s /proc directory. In Linux, /proc contains runtime system information (like system memory, devices mounted, and hardware configuration). These files are not actual files but the current state of the system.
support | Is a set of files containing further system information, which is obtained by cl-support running commands such as ps -aux, netstat -i, and so forth, including the routing tables.
The extracted cl-support archive contains a reason.txt file, which records the reason cl-support was invoked.
When contacting Cumulus Networks technical support, please attach the cl-support file if possible.
For more information about cl-support, please read Understanding and Decoding the cl-support
Output File (see page 57).
Configuration Files
/etc/cumulus/switchd.conf
Next Steps
The following links discuss more specific monitoring topics:
Setting Date and Time (see page 35)
Single User Mode - Boot Recovery (see page 37)
Monitoring Interfaces and Transceivers Using ethtool (see page 39)
Resource Diagnostics Using cl-resource-query (see page 42)
Monitoring System Hardware (see page 44)
Monitoring System Statistics and Network Traffic with sFlow (see page 50)
Monitoring Virtual Device Counters (see page 53)
Setting Date and Time
Setting the time zone, date and time requires root privileges; use sudo.
Contents
Contents (see page 35)
Commands (see page 35)
Setting the Time Zone (see page 35)
Setting the Date and Time (see page 36)
Setting Time Using NTP (see page 36)
Configuration Files (see page 37)
Useful Links (see page 37)
Commands
date
dpkg-reconfigure tzdata
hwclock
ntpd (daemon)
ntpq
Setting the Time Zone
To see the current time zone, list the contents of /etc/timezone:
cumulus@switch:~$ cat /etc/timezone
US/Eastern
To set the time zone, run dpkg-reconfigure tzdata as root:
cumulus@switch:~$ sudo dpkg-reconfigure tzdata
Then navigate the menus to enable the time zone you want. The following example selects the
US/Pacific time zone:
cumulus@switch:~$ sudo dpkg-reconfigure tzdata
Configuring tzdata
------------------
Please select the geographic area in which you live. Subsequent configuration
questions will narrow this down by presenting a list of cities, representing
the time zones in which they are located.
  1. Africa      4. Australia  7. Atlantic  10. Pacific  13. Etc
  2. America     5. Arctic     8. Europe    11. SystemV
  3. Antarctica  6. Asia       9. Indian    12. US
Geographic area: 12
Please select the city or region corresponding to your time zone.
  1. Alaska      4. Central    7. Indiana-Starke  10. Pacific
  2. Aleutian    5. Eastern    8. Michigan        11. Pacific-New
  3. Arizona     6. Hawaii     9. Mountain        12. Samoa
Time zone: 10
Current default time zone: 'US/Pacific'
Local time is now:      Mon Jun 17 09:27:45 PDT 2013.
Universal Time is now:  Mon Jun 17 16:27:45 UTC 2013.
For more info see the Debian System Administrator’s Manual – Time.
Setting the Date and Time
The switch contains a battery backed hardware clock that maintains the time while the switch is
powered off and in between reboots. When the switch is running, the Cumulus Linux operating system
maintains its own software clock.
During boot up, the time from the hardware clock is copied into the operating system’s software clock.
The software clock is then used for all timekeeping responsibilities. During system shutdown the
software clock is copied back to the battery backed hardware clock.
You can set the date and time on the software clock using the date command. See man date(1) for
details.
You can set the date and time on the hardware clock using the hwclock command. See man hwclock
(8) for details.
A good overview of the software and hardware clocks can be found in the Debian System
Administrator’s Manual – Time, specifically the section Setting and showing hardware clock.
Setting Time Using NTP
The ntpd daemon running on the switch implements the NTP protocol. It synchronizes the system time
with time servers listed in /etc/ntp.conf. It is started at boot by default. See man ntpd(8) for ntpd
details.
By default, /etc/ntp.conf contains some default time servers. Edit /etc/ntp.conf to add or update
time server information. See man ntp.conf(5) for details on configuring ntpd using ntp.conf.
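To review which time servers are currently configured, a short helper such as the following can list them. It is a sketch: the function name ntp_servers is an assumption, and it only looks at lines whose first word is server.

```shell
#!/bin/sh
# ntp_servers [FILE]
# Print the host of every "server" line in an ntp.conf-format file,
# skipping commented-out lines. FILE defaults to /etc/ntp.conf.
ntp_servers() {
    awk '$1 == "server" { print $2 }' "${1:-/etc/ntp.conf}"
}

# On a switch:
#   ntp_servers
```

After editing /etc/ntp.conf, rerun the helper to confirm the change, then restart ntpd so that it picks up the new list.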
To set the initial date and time via NTP before starting the ntpd daemon, use ntpd -q. This is the
same as ntpdate, which is being retired and is not available.
ntpd -q can hang if the time servers are not reachable.
To verify that ntpd is running on the system:
cumulus@switch:~$ ps -ef | grep ntp
ntp       4074     1  0 Jun20 ?        00:00:33 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 101:102
Configuration Files
/etc/default/ntp — ntpd init.d configuration variables
/etc/ntp.conf — default NTP configuration file
/etc/init.d/ntp — ntpd init script
Useful Links
Debian System Administrator’s Manual – Time
http://www.ntp.org
http://en.wikipedia.org/wiki/Network_Time_Protocol
http://wiki.debian.org/NTP
Single User Mode - Boot Recovery
Use single user mode to assist in troubleshooting system boot issues or for password recovery.
Entering single user mode is platform-specific, so follow the appropriate steps for your x86 or PowerPC
switch.
Contents
Contents (see page 37)
Entering Single User Mode on a PowerPC Switch (see page 38)
Entering Single User Mode on an x86 Switch (see page 38)
Entering Single User Mode on a PowerPC Switch
1. From the console, boot the switch, interrupting the U-Boot countdown to enter the U-Boot
prompt. Enter the following:
=> setenv lbootargs init=/bin/sh
=> boot
2. After the system boots, the shell command prompt appears. In this mode, you can change the
root password or test a boot service that is hanging the boot process.
3. Reboot the system.
cumulus@switch:~$ sudo reboot -f
Restarting the system.
Entering Single User Mode on an x86 Switch
From the console, boot the switch. At the GRUB menu, select the image slot you wish to boot into with
a password:
GNU GRUB  version 1.99-27+deb7u2
+-------------------------------------------------------------------------+
|Cumulus Linux 2.5.0-be24dc3-201412021541-build - slot 1                  |
|Cumulus Linux 2.5.0-be24dc3-201412021541-build - slot 1 (recovery mode)  |
|Cumulus Linux 2.5.0-b1bb3b7-201412090640-build - slot 2                  |
|Cumulus Linux 2.5.0-b1bb3b7-201412090640-build - slot 2 (recovery mode)  |
|ONIE                                                                     |
+-------------------------------------------------------------------------+
In this example, you are selecting the slot 2 image. Press e to edit the entry, then add
init=/bin/bash to the end of the line that begins with linux:
GNU GRUB  version 1.99-27+deb7u2
+-------------------------------------------------------------------------+
| insmod part_gpt                                                         |^
| insmod ext2                                                             |
| set root='(hd0,gpt3)'                                                   |
| search --no-floppy --fs-uuid --set=root c42be287-5321-4e77-975f-54e237a\|
| d72b0                                                                   |
| echo 'Loading Linux ...'                                                |
| linux /cl-vmlinuz-3.2.60-1+deb7u1+cl2.5-slot-2 root=UUID=f01a2d40-d2fe-\|
| 435b-b3d1-7edc1eb0c42f console=ttyS0,115200n8 cl_platform=dell_s6000_s1\|
| 220 quiet active=2 init=/bin/bash                                       |
| echo 'Loading initial ramdisk ...'                                      |
| initrd /cl-initrd.img-3.2.60-1+deb7u1+cl2.5-slot-2                      |
|                                                                         |
+-------------------------------------------------------------------------+
Type Ctrl+x or F10 to boot with this change.
When you are done making changes as a single user, run reboot -f to boot the switch back to a
normal state:
Begin: Running /scripts/init-bottom ... done.
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
cumulus@switch:/# sudo reboot -f
Monitoring Interfaces and Transceivers Using ethtool
The ethtool command enables you to query or control the network driver and hardware settings. It
takes the device name (like swp1) as an argument. When the device name is the only argument to
ethtool, it prints the current settings of the network device. See man ethtool(8) for details. Not all
options are currently supported on switch port interfaces.
Contents
Contents (see page 39)
Commands (see page 39)
Monitoring Interfaces Using ethtool (see page 39)
Viewing and Clearing Interface Counters (see page 41)
Monitoring Switch Port SFP/QSFP Using ethtool (see page 42)
Commands
cl-netstat
ethtool
Monitoring Interfaces Using ethtool
To check the status of an interface using ethtool:
cumulus@switch:~$ ethtool swp1
Settings for swp1:
Supported ports: [ FIBRE ]
Supported link modes:
1000baseT/Full
10000baseT/Full
Supported pause frame use: No
Supports auto-negotiation: No
Advertised link modes:
1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 0
Transceiver: external
Auto-negotiation: off
Current message level: 0x00000000 (0)
Link detected: yes
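For scripted health checks, a single field can be pulled out of this output. The helper below is a sketch that assumes the "Name: value" layout shown above; the function name ethtool_field is illustrative, and it is meant for single-line fields such as Speed, Duplex and Link detected.

```shell
#!/bin/sh
# ethtool_field NAME
# Read ethtool output on stdin and print the value of the field NAME,
# for example "Link detected" or "Speed".
ethtool_field() {
    awk -F': ' -v f="$1" '
        { sub(/^[ \t]+/, "", $1) }   # drop leading indentation from the name
        $1 == f { print $2 }'
}

# On a switch:
#   ethtool swp1 | ethtool_field "Link detected"
```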
To query interface statistics:
cumulus@switch:~$ sudo ethtool -S swp1
NIC statistics:
HwIfInOctets: 1435339
HwIfInUcastPkts: 11795
HwIfInBcastPkts: 3
HwIfInMcastPkts: 4578
HwIfOutOctets: 14866246
HwIfOutUcastPkts: 11791
HwIfOutMcastPkts: 136493
HwIfOutBcastPkts: 0
HwIfInDiscards: 0
HwIfInL3Drops: 0
HwIfInBufferDrops: 0
HwIfInAclDrops: 28
HwIfInDot3LengthErrors: 0
HwIfInErrors: 0
SoftInErrors: 0
SoftInDrops: 0
SoftInFrameErrors: 0
HwIfOutDiscards: 0
HwIfOutErrors: 0
HwIfOutQDrops: 0
HwIfOutNonQDrops: 0
SoftOutErrors: 0
SoftOutDrops: 0
SoftOutTxFifoFull: 0
HwIfOutQLen: 0
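Most of these counters are normally zero, so it is often enough to look only at the drop and error counters that are not. The filter below is a sketch based on the counter names in the output above; the function name nonzero_errors is an assumption.

```shell
#!/bin/sh
# nonzero_errors
# Read "ethtool -S" output on stdin and print every counter whose name
# ends in Drops, Errors or Discards and whose value is greater than zero.
nonzero_errors() {
    awk '$1 ~ /(Drops|Errors|Discards):$/ && $2 + 0 > 0 { print $1, $2 }'
}

# On a switch:
#   sudo ethtool -S swp1 | nonzero_errors
```

With the sample statistics above, the only counter reported would be HwIfInAclDrops.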
Viewing and Clearing Interface Counters
Interface counters contain information about an interface. You can view this information when you run
cl-netstat, ifconfig, or cat /proc/net/dev. You can also use cl-netstat to save or clear this
information:
cumulus@switch:~$ sudo cl-netstat
Kernel Interface table
Iface    MTU  Met  RX_OK  RX_ERR  RX_DRP  RX_OVR  TX_OK  TX_ERR  TX_DRP  TX_OVR  Flg
--------------------------------------------------------------------------------------
eth0    1500    0    611       0       0       0    487       0       0       0  BMRU
lo     16436    0      0       0       0       0      0       0       0       0  LRU
swp1    1500    0      0       0       0       0      0       0       0       0  BMU
cumulus@switch:~$ sudo cl-netstat -c
Cleared counters
Option | Description
-c | Copies and clears statistics. It does not clear counters in the kernel or hardware.
-d | Deletes saved statistics, either the uid or the specified tag.
-D | Deletes all saved statistics.
-l | Lists saved tags.
-r | Displays raw statistics (unmodified output of cl-netstat).
-t <tag name> | Saves statistics with <tag name>.
-v | Prints cl-netstat version and exits.
Monitoring Switch Port SFP/QSFP Using ethtool
The ethtool -m command provides switch port SFP information. It shows connector information,
vendor data, and more:
cumulus@switch:~$ sudo ethtool -m swp1
swp1: SFP detected
Connector : CopperPigtail
EncodingCodes : Unspecified
ExtIdentOfTypeOfTransceiver : GBIC/SFP defined by twowire interface ID
LengthCable(UnitsOfm) : 1
NominalSignallingRate(UnitsOf100Mbd) : 103
RateIdentifier : Unspecified
ReceivedPowerMeasurementType : OMA
TransceiverCodes :
SFP+CableTechnology : Passive Cable
TypeOfTransceiver : SFP or SFP Plus
VendorDataCode(yymmdd) : 110830
VendorName : Amphenol
VendorOUI : Amp
VendorPN : 571540001
VendorRev : M
VendorSN : APF11350017C4V
Resource Diagnostics Using cl-resource-query
You can use cl-resource-query to retrieve information about host entries, MAC entries, L2 and L3
routes, and ECMPs (equal-cost multi-path routes, see Load Balancing (see page 271)) that are in use.
This is especially useful because Cumulus Linux syncs routes between the kernel and the switching
silicon. If the required resource pools in hardware fill up, new kernel routes can cause existing routes
to move from being fully allocated to being partially allocated.
In order to avoid this, routes in the hardware should be monitored and kept below the ASIC limits. For
example, on systems with a Trident II chipset, the limits are as follows:
routes: 8092 <<<< if all routes are IPv6, or 16384 if all routes are IPv4
long mask routes 2048 <<<< these are routes with a mask longer than the route mask limit
route mask limit 64
host_routes: 8192
ecmp_nhs: 16346
ecmp_nhs_per_route: 52
This translates to about 314 routes with ECMP next hops, if every route has the maximum ECMP NHs.
For systems with a Trident+ chipset, the limits are as follows:
routes: 16384 <<<< if all routes are IPv4
long mask routes 256 <<<< these are routes with a mask longer than the
route mask limit
route mask limit 64
host_routes: 8192
ecmp_nhs: 4044
ecmp_nhs_per_route: 52
This translates to about 77 routes with ECMP next hops, if every route has the maximum ECMP NHs.
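The two "about N routes" figures follow from dividing the ECMP next hop limit by the per-route maximum and rounding down. The short sketch below reproduces that arithmetic; the function name max_ecmp_routes is illustrative.

```shell
#!/bin/sh
# max_ecmp_routes ECMP_NHS NHS_PER_ROUTE
# Integer division: how many routes fit if every route uses the
# maximum number of ECMP next hops.
max_ecmp_routes() {
    echo $(( $1 / $2 ))
}

max_ecmp_routes 16346 52   # Trident II limits quoted above; prints 314
max_ecmp_routes 4044 52    # Trident+ limits quoted above; prints 77
```

Routes that use fewer next hops than the maximum consume less of the pool, so these figures are a worst case.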
You can monitor this in Cumulus Linux with the cl-resource-query command. Results vary between
switches running on Trident+ and Trident II chipsets.
cl-resource-query results for a Trident II switch:
cumulus@switch:~$ sudo cl-resource-query
Host entries:       1,   0% of maximum value   8192 <<<< this is the default software-imposed limit, 50% of the hardware limit
IPv4 neighbors:     1 <<<< these are counts of the number of valid entries in the table
IPv6 neighbors:     0
IPv4 entries:      13,   0% of maximum value  32668
IPv6 entries:      18,   0% of maximum value  16384
IPv4 Routes:       13
IPv6 Routes:       18
Total Routes:      31,   0% of maximum value  32768
ECMP nexthops:      0,   0% of maximum value  16346
MAC entries:       12,   0% of maximum value  32768
cl-resource-query results for a Trident+ switch:
cumulus@switch:~$ sudo cl-resource-query
Host entries:         6,   0% of maximum value   4096 <<< same as above
IPv4 neighbors:       6
IPv6 neighbors:       0
IPv4/IPv6 entries:   33,   0% of maximum value  16284
Long IPv6 entries:    0,   0% of maximum value    256
IPv4 Routes:         29
IPv6 Routes:          2
Total Routes:        31,   0% of maximum value  32768
ECMP nexthops:        0,   0% of maximum value   4041
MAC entries:          0,   0% of maximum value 131072
Monitoring System Hardware
You can monitor system hardware using any of the following:
decode-syseeprom
sensors
smond
Net-SNMP
Contents
Contents (see page 44)
Commands (see page 44)
Monitoring Hardware Using decode-syseeprom (see page 45)
Command Options (see page 45)
Related Commands (see page 46)
Monitoring Hardware Using sensors (see page 46)
Command Options (see page 47)
Monitoring Switch Hardware Using SNMP (see page 47)
Public Community Disabled (see page 49)
Monitoring System Units Using smond (see page 49)
Command Options (see page 50)
Configuration Files (see page 50)
Useful Links (see page 50)
Commands
decode-syseeprom
dmidecode
lshw
sensors
smond
Monitoring Hardware Using decode-syseeprom
The decode-syseeprom command enables you to retrieve information about the switch's EEPROM. If
the EEPROM is writable, you can set values on the EEPROM.
For example:
cumulus@switch:~# decode-syseeprom
TlvInfo Header:
   Id String:    TlvInfo
   Version:      1
   Total Length: 114
TLV Name             Code Len Value
-------------------- ---- --- -----
Product Name         0x21   4 4804
Part Number          0x22  14 R0596-F0009-00
Device Version       0x26   1 2
Serial Number        0x23  19 D1012023918PE000012
Manufacture Date     0x25  19 10/09/2013 20:39:02
Base MAC Address     0x24   6 00:E0:EC:25:7B:D0
MAC Addresses        0x2A   2 53
Vendor Name          0x2D  17 Penguin Computing
Label Revision       0x27   4 4804
Manufacture Country  0x2C   2 CN
CRC-32               0xFE   4 0x96543BC5
(checksum valid)
Command Options
Usage: /usr/cumulus/bin/decode-syseeprom [-a][-r][-s [args]][-t]
Option         Description
-h, --help     Displays the help message and exits.
-a             Prints the base MAC address for switch interfaces.
-r             Prints the number of MACs allocated for switch interfaces.
-s             Sets the EEPROM content if the EEPROM is writable. args can be supplied on the
               command line as a comma-separated list of the form '<field>=<value>, ...'.
               ',' and '=' are illegal characters in field names and values. Fields that are
               not specified default to their current values. If args are supplied on the
               command line, they are written without confirmation. If args is empty, you
               are prompted for the values interactively.
-t TARGET      Selects the target EEPROM (board, psu2, psu1) for the read or write operation;
               default is board.
-e, --serial   Prints the device serial number.
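Since values passed to -s on the command line are written without confirmation, it can help to validate them first. The sketch below only builds the '<field>=<value>,...' string and rejects the two illegal characters named above. The function name and the field names in the example are placeholders, not names the tool is known to accept, and joining entries with a bare comma (no space) is an assumption.

```python
def build_eeprom_args(fields):
    """Join {field: value} pairs into the '<field>=<value>,...' form used by -s."""
    parts = []
    for name, value in fields.items():
        for text in (name, str(value)):
            # ',' and '=' are illegal in both field names and values.
            if "," in text or "=" in text:
                raise ValueError(f"',' and '=' are not allowed: {text!r}")
        parts.append(f"{name}={value}")
    return ",".join(parts)


if __name__ == "__main__":
    # "field1" and "field2" are hypothetical field names.
    print(build_eeprom_args({"field1": "abc", "field2": 2}))
```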
Related Commands
You can also use the dmidecode command to retrieve hardware configuration information that’s been
populated in the BIOS.
You can use apt-get to install the lshw program on the switch, which also retrieves hardware
configuration information.
Monitoring Hardware Using sensors
The sensors command provides a method for monitoring the health of your switch hardware, such as
power, temperature and fan speeds. This command executes lm-sensors.
For example:
cumulus@switch:~$ sensors
tmp75-i2c-6-48
Adapter: i2c-1-mux (chan_id 0)
temp1:        +39.0 C  (high = +75.0 C, hyst = +25.0 C)
tmp75-i2c-6-49
Adapter: i2c-1-mux (chan_id 0)
temp1:        +35.5 C  (high = +75.0 C, hyst = +25.0 C)
ltc4215-i2c-7-40
Adapter: i2c-1-mux (chan_id 1)
in1:         +11.87 V
in2:         +11.98 V
power1:       12.98 W
curr1:        +1.09 A
max6651-i2c-8-48
Adapter: i2c-1-mux (chan_id 2)
fan1:        13320 RPM
fan2:        13560 RPM  (div = 1)
Output from the sensors command varies depending upon the switch hardware you use, as
each platform ships with a different type and number of sensors.
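As an illustration of how this output could be checked automatically, here is a minimal Python sketch that lists temperature readings close to their "high" threshold. It assumes the one-line "temp1: +39.0 C (high = +75.0 C, ...)" layout that sensors normally prints; the function name, the 10-degree margin and the sample text are assumptions made for this example.

```python
import re

# Matches "temp1:  +39.0 C  (high = +75.0 C, hyst = +25.0 C)", with or
# without a degree sign before the C.
TEMP_RE = re.compile(
    r"^(?P<label>\w+):\s+\+?(?P<value>-?\d+(?:\.\d+)?)\s*°?C"
    r"\s+\(high = \+?(?P<high>-?\d+(?:\.\d+)?)\s*°?C"
)


def hot_sensors(text, margin=10.0):
    """Return (chip, label, value, high) for temps within `margin` of high."""
    chip, found = None, []
    for line in text.splitlines():
        line = line.strip()
        match = TEMP_RE.match(line)
        if match:
            value, high = float(match["value"]), float(match["high"])
            if value >= high - margin:
                found.append((chip, match["label"], value, high))
        elif line and ":" not in line:
            chip = line  # chip header line such as "tmp75-i2c-6-48"
    return found


# Hypothetical sample, copied from the first chip in the output above.
SAMPLE = """\
tmp75-i2c-6-48
Adapter: i2c-1-mux (chan_id 0)
temp1:        +39.0 C  (high = +75.0 C, hyst = +25.0 C)
"""

if __name__ == "__main__":
    print(hot_sensors(SAMPLE))
```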
Command Options
Usage: sensors [OPTION]... [CHIP]...
Option               Description
-c, --config-file    Specify a config file; use - after -c to read the config file from
                     stdin; by default, sensors references the configuration file in
                     /etc/sensors.d/.
-s, --set            Executes set statements in the config file (root only); sensors -s is
                     run once at boot time and applies all the settings to the boot drivers.
-f, --fahrenheit     Show temperatures in degrees Fahrenheit.
-A, --no-adapter     Do not show the adapter for each chip.
--bus-list           Generate bus statements for sensors.conf.
If [CHIP] is not specified in the command, all chip info will be printed. Example chip names include:
lm78-i2c-0-2d *-i2c-0-2d
lm78-i2c-0-* *-i2c-0-*
lm78-i2c-*-2d *-i2c-*-2d
lm78-i2c-*-* *-i2c-*-*
lm78-isa-0290 *-isa-0290
lm78-isa-* *-isa-*
lm78-*
Monitoring Switch Hardware Using SNMP
Cumulus Linux ships with Net-SNMP v5.4.3. However, it is disabled by default in Cumulus Linux 2.0.x
and later. To enable Net-SNMP, use jdoo, which is a fork of monit version 5.2.5.
jdoo and monit are mutually exclusive, so the monit package is not installed on Cumulus
Linux 2.5.2 and later. If you prefer to use monit, installing it uninstalls jdoo from Cumulus
Linux. However, Cumulus Networks does not provide support for issues with monit.
1. Edit /etc/default/snmpd and verify that SNMPDRUN=yes.
2. To have jdoo monitor snmpd, add a configuration like the following to your
/etc/jdoo/jdoorc file:
check process snmpd with pidfile /var/run/snmpd.pid
every 6 cycles
group networking
start program = "/etc/init.d/snmpd start"
stop program = "/etc/init.d/snmpd stop"
3. Then reload jdoo:
cumulus@switch:~$ sudo jdoo reload
4. Start snmpd:
cumulus@switch:~$ sudo jdoo start snmpd
5. Optionally, if you don't want jdoo to monitor snmpd, you can start it natively:
cumulus@switch:~$ sudo service snmpd start
Once enabled, you can use SNMP to manage various components on the switch. The supported MIBs
include many publicly used MIBs as well as some MIBs developed by Cumulus Networks for Cumulus
Linux:
SNMP-FRAMEWORK
SNMP-MPD
SNMP-USER-BASED-SM
SNMP-VIEW-BASED-ACM
SNMPv2
IP (includes ICMP)
TCP
UDP
UCD-SNMP (For information on exposing CPU and memory information via SNMP, see this
knowledge base article.)
IF-MIB
LLDP
LM-SENSORS MIB
NET-SNMP-EXTEND-MIB (See also this knowledge base article on extending NET-SNMP in
Cumulus Linux to include data from power supplies, fans and temperature sensors.)
Resource utilization: Cumulus Linux includes its own resource utilization MIB, which is similar to
using cl-resource-query. It monitors L3 entries by host, route, nexthops, ECMP groups and
L2 MAC/BPDU entries. The MIB is defined in /usr/share/snmp/Cumulus-Resource-Query-MIB.txt.
Discard counters: Cumulus Linux also includes its own counters MIB, defined in
/usr/share/snmp/Cumulus-Counters-MIB.txt.
The overall Cumulus Linux MIB is defined in /usr/share/snmp/Cumulus-Snmp-MIB.txt.
The Quagga and Zebra routes MIB is disabled in Cumulus Linux.
Public Community Disabled
Public community is disabled by default in Cumulus Linux. While it is disabled, the public
community entry in /etc/snmp/snmpd.conf is commented out, like this:
#rocommunity public default -V systemonly
To let an SNMP client query the switch with the public community, remove the comment character:
rocommunity public default -V systemonly
After you make any change to snmpd.conf, you must restart snmpd using service snmpd restart
for the new configuration to take effect.
To define the desired community configuration, use:
rocommunity <any community> default -V systemonly
Monitoring System Units Using smond
The smond daemon monitors system units like power supplies and fans, updates their corresponding
LEDs, and logs any change in their state. Changes in system unit state are detected via the CPLD
registers. smond reads from these registers all the sources that affect the health of a system unit,
determines the unit's health, and updates the system LEDs.
Use smonctl to display sensor information for the various system units:
cumulus@switch:~$ smonctl
Board                                      :  OK
Fan                                        :  OK
PSU1                                       :  OK
PSU2                                       :  BAD
Temp1   (Networking ASIC Die Temp Sensor ):  OK
Temp10  (Right side of the board         ):  OK
Temp2   (Near the CPU (Right)            ):  OK
Temp3   (Top right corner                ):  OK
Temp4   (Right side of Networking ASIC   ):  OK
Temp5   (Middle of the board             ):  OK
Temp6   (P2020 CPU die sensor            ):  OK
Temp7   (Left side of the board          ):  OK
Temp8   (Left side of the board          ):  OK
Temp9   (Right side of the board         ):  OK
Command Options
Usage: smonctl [OPTION]... [CHIP]...
Option                        Description
-j, --json                    Generates JSON output.
-s SENSOR, --sensor SENSOR    Displays data for the specified sensor.
-v, --verbose                 Displays detailed hardware sensors data.
For more information, read man smond and man smonctl.
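For scripted health checks, the plain-text smonctl output can be filtered for units that are not OK. The sketch below assumes one "name (description): STATUS" entry per line, as smonctl prints it, with an all-uppercase status word. It parses the text form because the structure of the --json output is not described in this guide. The function name and sample are illustrative.

```python
def failed_units(text):
    """Return (unit, status) pairs for every unit whose status is not OK."""
    bad = []
    for line in text.splitlines():
        name, sep, status = line.rpartition(":")
        status = status.strip()
        if not sep or not name.strip():
            continue
        # Accept only a single uppercase status word such as OK or BAD;
        # this skips shell prompts and other unrelated lines.
        if not (status.isalpha() and status.isupper()):
            continue
        if status != "OK":
            bad.append((name.split()[0], status))
    return bad


# Hypothetical sample, taken from the output above.
SAMPLE = """\
Board                                      :  OK
PSU2                                       :  BAD
Temp1   (Networking ASIC Die Temp Sensor ):  OK
"""

if __name__ == "__main__":
    print(failed_units(SAMPLE))
```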
Configuration Files
/etc/cumulus/switchd.conf
/etc/cumulus/sysledcontrol.conf
/etc/sensors.d/<switch>.conf - sensor configuration file (do not edit it!)
Useful Links
http://packages.debian.org/search?keywords=lshw
http://lm-sensors.org
Net-SNMP tutorials
Monitoring System Statistics and Network Traffic with sFlow
sFlow is a monitoring protocol that samples network packets, application operations, and system
counters. sFlow enables you to monitor your network traffic as well as your switch state and
performance metrics. An outside server, known as an sFlow collector, is required to collect and analyze
this data.
hsflowd is the daemon that samples and sends sFlow data to configured collectors. hsflowd is not
included in the base Cumulus Linux installation. After installation, hsflowd will automatically start
when the switch boots up.
Contents
Contents (see page 51)
Installing hsflowd (see page 51)
Configuring sFlow (see page 51)
Configuring sFlow via DNS-SD (see page 51)
Manually Configuring /etc/hsflowd.conf (see page 52)
Configuring sFlow Visualization Tools (see page 53)
Configuration Files (see page 53)
Useful Links (see page 53)
Installing hsflowd
To download and install the hsflowd package, use apt-get:
cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get install -y hsflowd
Configuring sFlow
You can configure hsflowd to send data to the designated collectors using one of two methods:
DNS service discovery (DNS-SD)
Manually configuring /etc/hsflowd.conf
Configuring sFlow via DNS-SD
With this method, you need to configure your DNS zone to advertise the collectors and polling
information to all interested clients. Add the following content to the zone file on your DNS server:
_sflow._udp SRV 0 0 6343 collector1
_sflow._udp SRV 0 0 6344 collector2
_sflow._udp TXT (
"txtvers=1"
"sampling.1G=2048"
"sampling.10G=4096"
"sampling.40G=8192"
"polling=20"
)
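The above snippet instructs hsflowd to send sFlow data to collector1 on port 6343 and to collector2
on port 6344. hsflowd will poll counters every 20 seconds and sample 1 out of every 2048 packets on
1G interfaces, 1 out of every 4096 packets on 10G interfaces and 1 out of every 8192 packets on 40G
interfaces.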
After the initial configuration is ready, bring up the sFlow daemon by running:
cumulus@switch:~$ sudo service hsflowd start
No additional configuration is required in /etc/hsflowd.conf.
Manually Configuring /etc/hsflowd.conf
With this method you will set up the collectors and variables on each switch.
Edit /etc/hsflowd.conf and change DNSSD = on to DNSSD = off:
DNSSD = off
Then set up your collectors and sampling rates in /etc/hsflowd.conf:
# Manual Configuration (requires DNSSD=off above)
#################################################
# Typical configuration is to send every 30 seconds
polling = 20
sampling.1G=2048
sampling.10G=4096
sampling.40G=8192
collector {
ip = 192.0.2.100
udpport = 6343
}
collector {
ip = 192.0.2.200
udpport = 6344
}
This configuration polls the counters every 20 seconds, samples 1 of every 2048 packets on 1G
interfaces (1 of every 4096 on 10G and 1 of every 8192 on 40G) and sends this information to a
collector at 192.0.2.100 on port 6343 and to another collector at 192.0.2.200 on port 6344.
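When many switches need the same manual settings, the fragment above can be generated rather than typed. The following Python sketch renders only the settings shown in this section (DNSSD, polling, sampling and collector blocks); it is not a complete hsflowd.conf, and the function name and defaults are assumptions for illustration.

```python
def render_hsflowd_fragment(collectors, polling=20, sampling=None):
    """Render the manual-configuration settings for a list of (ip, port) collectors."""
    if sampling is None:
        # Sampling rates per link speed, as used in the example above.
        sampling = {"1G": 2048, "10G": 4096, "40G": 8192}
    lines = ["DNSSD = off", f"polling = {polling}"]
    lines += [f"sampling.{speed}={rate}" for speed, rate in sampling.items()]
    for ip, port in collectors:
        lines += ["collector {", f"  ip = {ip}", f"  udpport = {port}", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_hsflowd_fragment([("192.0.2.100", 6343), ("192.0.2.200", 6344)]))
```

The generated text would still need to be merged into /etc/hsflowd.conf by hand or by your configuration management tool.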
Some collectors require each source to transmit on a different port; others may listen on only
one port. Please refer to the documentation for your collector for more information.
Configuring sFlow Visualization Tools
For information on configuring various sFlow visualization tools, read this Help Center article.
Configuration Files
/etc/hsflowd.conf
Useful Links
sFlow Collectors
sFlow Wikipedia page
Monitoring Virtual Device Counters
Cumulus Linux gathers statistics for VXLANs and VLANs using virtual device counters. These counters
are supported on Trident II-based platforms only; see the Cumulus Networks HCL for a list of
supported Trident II platforms.
You can retrieve the data from these counters using tools like ip -s link show, ifconfig,
/proc/net/dev, or netstat -i.
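If you want to collect these counters from a script, reading /proc/net/dev is usually the simplest of the options above. The sketch below parses the standard Linux /proc/net/dev layout (two header lines, then one line per interface with 8 receive and 8 transmit fields). The function name and the sample values are illustrative; the sample reuses the br-vxln16757104 counters shown later in this chapter.

```python
def read_netdev_counters(text):
    """Return {interface: {rx_bytes, rx_packets, tx_bytes, tx_packets}}."""
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, sep, data = line.partition(":")
        fields = data.split()
        if not sep or len(fields) < 16:
            continue
        values = [int(field) for field in fields]
        counters[name.strip()] = {
            "rx_bytes": values[0],
            "rx_packets": values[1],
            "tx_bytes": values[8],    # transmit fields start at index 8
            "tx_packets": values[9],
        }
    return counters


# Hypothetical sample in /proc/net/dev format.
SAMPLE = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast"
    "|bytes    packets errs drop fifo colls carrier compressed\n"
    "br-vxln16757104: 10848 158 0 0 0 0 0 0 27816 541 0 0 0 0 0 0\n"
)

if __name__ == "__main__":
    print(read_netdev_counters(SAMPLE))
```

On a switch, pass the function the contents of /proc/net/dev instead of the sample string.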
Contents
Contents (see page 53)
Sample VXLAN Statistics (see page 53)
Sample VLAN Statistics (see page 55)
For VLANs Using the non-VLAN-aware Bridge Driver (see page 55)
For VLANs Using the VLAN-aware Bridge Driver (see page 55)
Configuring the Counters in switchd (see page 56)
Configuring the Poll Interval (see page 56)
Configuring Internal VLAN Statistics (see page 56)
Clearing Statistics (see page 57)
Caveats and Errata (see page 57)
Sample VXLAN Statistics
VXLAN statistics are available as follows:
Aggregate statistics are available per VNI; this includes access and network statistics.
Network statistics are available for each VNI and displayed against the VXLAN device. This is
independent of the VTEP used, so this is a summary of the VNI statistics across all tunnels.
Access statistics are available per VLAN subinterface.
First, get interface information regarding the VXLAN bridge:
root@switch:~# brctl show br-vxln16757104
bridge name         bridge id           STP enabled   interfaces
br-vxln16757104     8000.443839006988   no            swp2s0.6
                                                      swp2s1.6
                                                      swp2s2.6
                                                      swp2s3.6
                                                      vxln16757104
To get VNI statistics, run:
root@switch:~# ip -s link show br-vxln16757104
62: br-vxln16757104: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 44:38:39:00:69:88 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    10848      158      0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    27816      541      0       0       0       0
To get access statistics, run:
root@switch:~# ip -s link show swp2s0.6
63: swp2s0.6@swp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-vxln16757104 state UP mode DEFAULT
    link/ether 44:38:39:00:69:88 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    2680       39       0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    7558       140      0       0       0       0
To get network statistics, run:
root@switch:~# ip -s link show vxln16757104
61: vxln16757104: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-vxln16757104 state UNKNOWN mode DEFAULT
    link/ether e2:37:47:db:f1:94 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    0          0        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        0       9       0       0
Sample VLAN Statistics
For VLANs Using the non-VLAN-aware Bridge Driver
In this case, each bridge is a single L2 broadcast domain and is associated with an internal VLAN. This
internal VLAN's counters are displayed as bridge netdev stats.
root@switch:~# brctl show br0
bridge name     bridge id           STP enabled   interfaces
br0             8000.443839006989   yes           bond0.100
                                                  swp2s2.100
root@switch:~# ip -s link show br0
42: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 44:38:39:00:69:89 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    23201498   227514   0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    18198262   178443   0       0       0       0
For VLANs Using the VLAN-aware Bridge Driver
For a bridge using the VLAN-aware driver (see page 184), the bridge is just a container and each VLAN
(VID/PVID) in the bridge is an independent L2 broadcast domain. As there is no netdev available to
display these VLAN statistics, the switchd nodes are used instead:
root@switch:~# ifquery bridge
auto bridge
iface bridge inet static
    bridge-vlan-aware yes
    bridge-ports swp2s0 swp2s1
    bridge-stp on
    bridge-vids 2000-2002 4094
root@switch:~# ls /cumulus/switchd/run/stats/vlan/
2  2000  2001  2002  all
root@switch:~# cat /cumulus/switchd/run/stats/vlan/2000/aggregate
Vlan id                 : 2000
L3 Routed In Octets     : -
L3 Routed In Packets    : -
L3 Routed Out Octets    : -
L3 Routed Out Packets   : -
Total In Octets         : 375
Total In Packets        : 3
Total Out Octets        : 387
Total Out Packets       : 3
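The aggregate file is a simple "name : value" listing, so it is easy to read from a script. This minimal Python sketch assumes the layout shown above, where a "-" marks a counter that is not available; the function name and sample are illustrative.

```python
def parse_vlan_aggregate(text):
    """Return {counter name: int, or None when the value is '-'}."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        stats[key.strip()] = int(value) if value.isdigit() else None
    return stats


# Hypothetical sample, taken from the VLAN 2000 output above.
SAMPLE = """\
Vlan id                 : 2000
L3 Routed In Octets     : -
Total In Octets         : 375
Total In Packets        : 3
"""

if __name__ == "__main__":
    print(parse_vlan_aggregate(SAMPLE))
```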
Configuring the Counters in switchd
These counters are enabled by default. To configure them, use cl-cfg and configure them as you
would any other switchd parameter (see page 28). The switchd parameters are as follows:
stats.vlan.aggregate, which controls the statistics available for each VLAN. Its value
defaults to BRIEF.
stats.vxlan.aggregate, which controls the statistics available for each VNI (access and
network). Its value defaults to DETAIL.
stats.vxlan.member, which controls the statistics available for each local/access port in a
VXLAN bridge. Its value defaults to BRIEF.
The values for each parameter can be one of the following:
NONE: This disables the counter.
BRIEF: This provides tx/rx packet/byte counters for the associated parameter.
DETAIL: This provides additional feature-specific counters. In the case of
stats.vxlan.aggregate, DETAIL provides access vs. network statistics. For the other types,
DETAIL has the same effect as BRIEF.
If you change one of these settings on the fly, the new configuration applies only to those
VNIs or VLANs set up after the configuration changed; previously allocated counters remain
as is.
Configuring the Poll Interval
The virtual device counters are polled periodically. This can be CPU intensive, so the interval is
configurable in switchd, with a default of 2 seconds.
# Virtual devices hw-stat poll interval (in seconds)
#stats.vdev_hw_poll_interval = 2
Configuring Internal VLAN Statistics
For debugging purposes, you may need to access packet statistics associated with internal VLAN IDs.
These statistics are hidden by default, but can be configured in switchd:
#stats.vlan.show_internal_vlans = FALSE
Clearing Statistics
Since ethtool is not supported for virtual devices, you cannot clear the statistics cache maintained by
the kernel. You can clear the hardware statistics via switchd:
root@switch:~# echo 1 > /cumulus/switchd/clear/stats/vlan
root@switch:~# echo 1 > /cumulus/switchd/clear/stats/vxlan
root@switch:~#
Caveats and Errata
Currently the CPU port is internally added as a member of all VLANs. Because of this, packets
sent to the CPU are counted against the corresponding VLAN's tx packets/bytes. There is no
workaround.
When checking the virtual counters for the bridge, the TX count is the number of packets
destined to the CPU before any hardware policers take effect. For example, if 500 broadcast
packets are sent into the bridge, the CPU is also sent 500 packets. These 500 packets are policed
by the default ACLs in Cumulus Linux, so the CPU might receive fewer than the 500 packets if
the incoming packet rate is too high. The TX counter for the bridge should be equal to
500 * (number of ports in the bridge - incoming port + CPU port), or just
500 * number of ports in the bridge.
You cannot use ethtool -S for virtual devices. This is because the counters available via
netdev are sufficient to display the vlan/vxlan counters currently supported in the hardware
(only rx/tx packets/bytes are supported currently).
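The TX counter arithmetic in the second caveat can be written out as a short calculation. In this sketch, the function name and the 4-port bridge are hypothetical; the formula itself is the one given above.

```python
def expected_bridge_tx(packets_in, ports_in_bridge):
    """Expected bridge TX count for flooded (for example, broadcast) packets."""
    # Each packet is copied to every port except the incoming one (-1),
    # plus one copy to the CPU port (+1), so the two terms cancel out.
    egress_copies = ports_in_bridge - 1 + 1
    return packets_in * egress_copies


if __name__ == "__main__":
    # 500 broadcast packets sent into a hypothetical 4-port bridge.
    print(expected_bridge_tx(500, 4))
```

For 500 broadcast packets entering a 4-port bridge, this gives 500 * 4 = 2000, regardless of how many of the CPU-bound copies are later dropped by the policers.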
Understanding and Decoding the cl-support Output File
The cl-support command generates a tar archive of useful information for
troubleshooting that can be auto-generated or manually created. To manually
create it, run the cl-support command. The cl-support file is automatically
generated when:
There is a core file dump of any application (not specific to Cumulus Linux, but something all
Linux distributions support)
Memory usage surpasses 90% of the total system memory (memory usage > 90% for 1 cycle)
The 15-minute load average is greater than 2 (loadavg (15min) > 2)
All of these conditions are checked by monit, which is configured in /etc/monit/monitrc.
The Cumulus Networks support team may request you submit the output from cl-support to help
with the investigation of issues you might experience with Cumulus Linux.
cumulus@switch:~$ sudo cl-support -h
Usage: cl-support [-h] [reason]...
Args:
[reason]: Optional reason to give for invoking cl-support.
Saved into tarball's reason.txt file.
Options:
-h: Print this usage statement
Example output:
cumulus@switch:~$ ls /var/support
cl_support__switch_20141204_203833
Contents
Understanding the File Naming Scheme (see page 58)
Decoding the Output (see page 58)
Understanding the File Naming Scheme
The cl-support command generates a file under /var/support with the following naming scheme.
The following example describes the file called cl_support__switch_20141204_203833.tar.xz.
cl_support   This is always prepended to the tar.xz output.
switch       This is the hostname of the switch where cl-support was executed.
20141204     The date in year, month, day; so 20141204 is December 4, 2014.
203833       The time in hours, minutes, seconds; so 203833 is 20:38:33, the equivalent of
             8:38:33 PM.
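When you are sorting through several archives, the naming scheme can be decoded programmatically. The sketch below assumes the cl_support__<hostname>_<date>_<time> pattern described above, with an optional .tar.xz suffix; the function name is illustrative.

```python
import re
from datetime import datetime

# cl_support__<hostname>_<YYYYMMDD>_<HHMMSS>, optionally followed by .tar.xz
NAME_RE = re.compile(
    r"^cl_support__(?P<host>.+)_(?P<date>\d{8})_(?P<time>\d{6})(?:\.tar\.xz)?$"
)


def parse_cl_support_name(filename):
    """Return (hostname, datetime) decoded from a cl_support file name."""
    match = NAME_RE.match(filename)
    if not match:
        raise ValueError(f"not a cl_support file name: {filename!r}")
    stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return match["host"], stamp


if __name__ == "__main__":
    print(parse_cl_support_name("cl_support__switch_20141204_203833.tar.xz"))
```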
Decoding the Output
Decoding a cl_support file is a simple process performed using the tar command. The following
example illustrates extracting the cl_support file:
tar -xf cl_support__switch_20141204_203834.tar.xz
The -xf options are defined here:
Option    Description
-x        Extracts to disk from the archive.
-f        Reads the archive from the specified file.
cumulus@switch:~$ ls -l cl_support__switch_20141204_203834/
-rwxr-xr-x  1 root root 7724 Jul 29 14:00 cl-support
-rw-r--r--  1 root root   52 Jul 29 14:00 cmdline.args
drwxr-xr-x  2 root root 4096 Jul 29 14:00 core
drwxr-xr-x 64 root root 4096 Jul 29 13:51 etc
drwxr-xr-x  4 root root 4096 Jul 29 14:00 proc
drwxr-xr-x  2 root root 4096 Jul 29 14:01 support
drwxr-xr-x  3 root root 4096 Jul 29 14:00 sys
drwxr-xr-x  3 root root 4096 Aug  8 15:22 var
The cl_support file, when untarred, contains a reason.txt file. This file indicates what
triggered the event. When contacting Cumulus Networks technical support, please attach the
cl-support file if possible.
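To check why an archive was generated without unpacking all of it, you can read reason.txt directly from the tarball. This is a minimal sketch using Python's standard tarfile module; the function name is illustrative, and because the exact location of reason.txt inside the archive is an assumption, the code searches for it by file name.

```python
import tarfile


def read_reason(archive_path):
    """Return the text of reason.txt from a cl-support archive, or None."""
    # "r:*" lets tarfile detect the compression (xz, gz, ...) automatically.
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.split("/")[-1] == "reason.txt":
                data = archive.extractfile(member).read()
                return data.decode("utf-8", "replace")
    return None
```

For example, read_reason("cl_support__switch_20141204_203833.tar.xz") returns the reason text if the file is present in the archive and None otherwise.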
The directory contains the following elements:
Directory
Description
cl-support
This is a copy of the cl-support script that generated the cl_support file. It is copied
so Cumulus Networks knows exactly which files were included and which weren't. This
helps with future cl-support requests.
core
Contains the core files generated from the Cumulus Linux HAL (hardware abstraction
layer) process, switchd.
etc
etc is the core system configuration directory. cl-support replicates the switch’s /etc
directory. /etc contains all the general Linux configuration files, as well as
configurations for the system’s network interfaces, quagga, monit, and other packages.
var/log
/var is the "variable" subdirectory, where programs record runtime information. System
logging, user tracking, caches and other files that system programs create and monitor
go into /var. cl-support includes only the log subdirectory of the var system-level
directory and replicates the switch’s /var/log directory. Most Cumulus Linux log files
are located in this directory. Notable log files include switchd.log, daemon.log,
quagga log files, and syslog. For more information, read this knowledge base article.
proc
proc (short for processes) provides system statistics through a directory-and-file
interface. In Linux, /proc contains runtime system information (like system memory,
devices mounted, and hardware configuration). cl-support simply replicates the switch’s
/proc directory to determine the current state of the system.
support
support is not a replica of the Linux file system like the other folders listed above.
Instead, it is a set of files containing the output of commands from the command line.
Examples include the output of ps -aux , netstat -i , and so forth — even the routing
tables are included.
Here is more information on the file structure:
Troubleshooting the etc Directory (see page 62) — In terms of sheer numbers of files, /etc
contains the largest number of files to send to Cumulus Networks by far. However, log files
could be significantly larger in file size.
Troubleshooting Log Files (see page 60) — This guide highlights the most important log files to
look at. Keep in mind, cl-support includes all of the log files.
Troubleshooting the support Directory (see page 73) — This is an explanation of the support
directory included in the cl-support output.
Troubleshooting Log Files
The only log file unique to Cumulus Linux, compared to other Linux distributions, is
switchd.log, which logs the HAL (hardware abstraction layer) interactions with hardware like the
Broadcom ASIC. This guide on NixCraft is a helpful reference for understanding how /var/log works.
The green highlighted rows below are the most important logs and are usually looked at first when
debugging.
Log
Description
Why is this important?
/var/log/alternatives.log
Information from update-alternatives is logged into this file.
/var/log/apt
The apt utility can send logs here; for example, from apt-get install and apt-get remove.
/var/log/audit/
Contains log information stored by the Linux audit daemon, auditd.
/var/log/auth.log
Authentication logs.
/var/log/boot.log
Contains information that is logged when the system boots.
/var/log/btmp
This file contains information about failed login attempts. Use the last command to view the
btmp file. For example:
last -f /var/log/btmp | more
/var/log/daemon.log
Contains information logged by the various background daemons that run on the system.
/var/log/dmesg
Contains kernel ring buffer information. When the system boots up, it prints a number of messages
on the screen that display information about the hardware devices that the kernel detects during
the boot process. These messages are available in the kernel ring buffer and whenever a new message
arrives, the old message gets overwritten. You can also view the content of this file using the
dmesg command.
Why is this important? dmesg is one of the few places to determine hardware errors.
/var/log/dpkg.log
Contains information that is logged when a package is installed or removed using the dpkg command.
/var/log/faillog
Contains failed user login attempts. Use the faillog command to display the contents of this file.
/var/log/fsck/*
The fsck utility is used to check and optionally repair one or more Linux filesystems.
/var/log/mail.log
Mail server logs.
/var/log/messages
General messages and system related information.
/var/log/monit.log
monit is a utility for managing and monitoring processes, files, directories and filesystems on a
Unix system.
/var/log/news/*
The news command keeps you informed of news concerning the system.
/var/log/ntpstats
Logs for the Network Time Protocol (NTP).
/var/log/quagga/*
Where Quagga logs to once enabled.
Why is this important? This is how Cumulus Networks troubleshoots routing; for example, an MD5 or
MTU mismatch with OSPF.
/var/log/switchd.log
The HAL log for Cumulus Linux.
Why is this important? This is specific to Cumulus Linux. Any switchd crashes are logged here.
/var/log/syslog
The main system log, which logs everything except auth-related messages.
Why is this important? The primary log; it's easiest to grep this file to see what occurred during
a problem.
/var/log/wtmp
Login records file.
/var/log/yum.log
apt command log file.
/var/log/kern.log
Kernel logs.
Troubleshooting the etc Directory
The cl-support (see page 57) script replicates the /etc directory.
Files that cl-support deliberately excludes are:
File
Description
/etc/nologin
nologin prevents unprivileged users from logging into the system.
/etc/alternatives
update-alternatives creates, removes, maintains and displays information about
the symbolic links comprising the Debian alternatives system.
This is an alphabetical listing of the output from running ls -l on the /etc directory structure
created by cl-support. The green highlighted rows are the ones Cumulus Networks finds most
important when troubleshooting problems.
File
Description
adduser.conf
The file /etc/adduser.conf contains defaults for
the programs adduser, addgroup, deluser, and
delgroup.
adjtime
Corrects the time to synchronize the system clock.
apt
apt (Advanced Package Tool) is the command-line
tool for handling packages. This folder contains all
the configurations.
Why is this important? apt interactions or unsupported apps can affect machine performance.
audisp
The directory that contains audisp-remote.conf,
which is the file that controls the configuration of
the audit remote logging subsystem.
audit
The directory that contains /etc/audit/auditd.conf,
which contains configuration information specific to
the audit daemon.
bash.bashrc
Bash is an sh-compatible command language
interpreter that executes commands read from
standard input or from a file.
bash_completion
This points to /usr/share/bash-completion/bash_completion.
bash_completion.d
This folder contains app-specific code for Bash
completion on Cumulus Linux, such as mstpctl.
bcm.d
Broadcom-specific ASIC file structure (hardware
interaction). If there are questions contact the
Cumulus Networks Support team. This is unique to
Cumulus Linux.
bindresvport.blacklist
This file contains a list of port numbers between
600 and 1024, which should not be used by
bindresvport.
ca-certificates
The folder for ca-certificates. It is empty by
default on Cumulus Linux; see below for more
information.
ca-certificates.conf
Each line lists the pathname of an activated CA
certificate under /usr/share/ca-certificates.
calendar
The system-wide default calendar file.
chef
This is an example of something that is not
included by default. In this instance, cl-support
included the chef folder for some reason.
Why is this important? This is not installed by default, but this tool could have been installed
or configured incorrectly, which is why it's included in the cl-support output.
cron.d
cron is a daemon that executes scheduled
commands.
cron.daily
See above.
cron.hourly
See above.
cron.monthly
See above.
cron.weekly
See above.
crontab
See above.
cumulus
This directory contains the following:
ACL information, stored in the acl directory.
switchd configuration file, switchd.conf.
qos, which is under the datapath directory.
The routing protocol process priority, nice.conf.
The breakout cable configuration, under ports.conf.
Why is this important? This folder is specific to Cumulus Linux and does not exist on other Linux
platforms. For example, while you can configure iptables, to hardware accelerate rules you need to
use cl-acltool and have the rules under /etc/cumulus/acl/policy.d/<filename>.rules.
debconf.conf
Debconf is a configuration system for Debian
packages.
debian_version
The complete Debian version string.
debsums-ignore
debsums verifies installed package files against
their MD5 checksums. This file identifies the
packages to ignore.
default
This folder contains files with configurable flags for
many different applications (most installed by
default or added manually). For example,
/etc/default/networking has a flag for
EXCLUDE_INTERFACES=, which is set to nothing by
default, but a user could change it to something
like swp3.
deluser.conf
The file /etc/deluser.conf contains defaults for
the programs deluser and delgroup.
dhcp
This directory contains DHCP-specific information.
dpkg
The package manager for Debian.
e2fsck.conf
The configuration file for e2fsck. It controls the
default behavior of e2fsck while it checks ext2,
ext3 or ext4 filesystems.
environment
Utilized by pam_env for setting and unsetting
environment variables.
ethertypes
This file can be used to show readable characters
instead of hexadecimal numbers for the protocols.
For example, 0x0800 will be represented by IPv4.
fstab
Static information about the filesystems.
fstab.d
The directory that can contain additional fstab
information; it is empty by default.
fw_env.config
Configuration file utilized by U-Boot.
gai.conf
Configuration file for sorting the return information
from getaddrinfo.
groff
The directory containing information for groffer,
an application used for displaying Unix man pages.
group
The /etc/group file is a text file that defines the
groups on the system.
group-
Backup for the /etc/group file.
gshadow
/etc/gshadow contains the shadowed information
for group accounts.
gshadow-
Backup for the /etc/gshadow file.
host.conf
Resolver configuration file, which contains options
like multi that determines whether /etc/hosts
will respond with multiple entries for DNS names.
hostname
The system host name, such as leaf1, spine1, sw1.
hosts
The static table lookup for hostnames.
hosts.allow
Part of the hosts_access facility, which implements
a simple access control language. With hosts.allow,
access is granted when a daemon/client
pair matches an entry.
hosts.deny
See hosts.allow above, except that access is denied
when a daemon/client pair matches an entry.
init
Default location of the system job configuration files.
init.d
To start a service when the switch boots, add the
necessary script to this directory. The differences
between init and init.d are explained well here.
inittab
The format of the inittab file used by the sysv-compatible init process.
inputrc
The initialization file utilized by readline.
insserv
This application enables installed system init scripts;
this directory is empty by default.
insserv.conf
Configuration file for insserv.
insserv.conf.d
Additional directory for insserv configurations.
iproute2
Directory containing values for the Linux command
line tool ip.
issue
/etc/issue is a text file that contains a message
or system identification to be printed before the
login prompt.
issue.net
Identification file for telnet sessions.
ld.so.cache
Contains a compiled list of candidate libraries
previously found in the augmented library path.
ld.so.conf
Used by the ldconfig tool, which configures
dynamic linker run-time bindings.
ld.so.conf.d
The directory that contains additional ld.so.conf
configuration (see above).
ldap
The directory containing the ldap.conf
configuration file used to set the system-wide
default to be applied when running LDAP clients.
libaudit.conf
Configuration file utilized by get_auditfail_action.
libnl-3
Directory for the configuration relating to the libnl
library, which is the core library for implementing
the fundamentals required to use the netlink
protocol such as socket handling, message
construction and parsing, and sending and
receiving of data.
lldpd.d
Directory containing configuration files whose
commands are executed by lldpcli at startup.
localtime
Copy of the original data file for /etc/timezone.
logcheck
Directory containing logcheck.conf and logfiles
utilized by the log check program, which scans
system logs for interesting lines.
login.defs
Shadow password suite configuration.
logrotate.conf
Configuration file for logrotate, which rotates, compresses and mails system logs.
logrotate.d
Directory containing additional log rotate
configurations.
lsb-release
Shows the current version of Linux on the system.
Run cat /etc/lsb-release for output.
magic
Used by the file command to determine file type.
magic tests check for files with data in particular
fixed formats.
magic.mime
The magic MIME type causes the file command
to output MIME type strings rather than the more
traditional human readable ones.
mailcap
The mailcap file is read by the metamail program
to determine how to display non-text at the local
site.
mailcap.order
The order of entries in the /etc/mailcap file can
be altered by editing the /etc/mailcap.order
file.
manpath.config
The manpath configuration file is used by the
manual page utilities to assess users’ manpaths at
run time, to indicate which manual page
hierarchies (manpaths) are to be treated as system
hierarchies and to assign them directories to be
used for storing cat files.
mime.types
MIME type description file for cups.
mke2fs.conf
Configuration file for mke2fs, which is a program
that creates an ext2, ext3 or ext4 filesystem.
modprobe.d
Configuration directory for modprobe, which is a
utility that can add and remove modules from the
Linux kernel.
modules
The kernel modules to load at boot time.
monit
monit is a utility for monitoring services on a Unix
system; this directory has configuration files
beneath it.
Why is this important? (lsb-release) This shows you the version of the
operating system you are running; also compare this to the output of
cl-img-select.
motd
The contents of /etc/motd ("message of the day")
are displayed by pam_motd after a successful login
but just before it executes the login shell.
mtab
The programs mount and umount maintain a list of
currently mounted filesystems in the /etc/mtab
file. If no arguments are given to mount, this list is
printed.
nanorc
The GNU nano rcfile.
network
Contains the network interface configuration for
ifup and ifdown.
networks
Network name information.
nsswitch.conf
System databases and name service switch
configuration file.
ntp.conf
NTP (network time protocol) server configuration
file.
openvswitch
The directory containing the conf.db file, which is
used by ovsdb-server.
openvswitchvtep
Configuration files used for the VTEP daemon and
ovsdb-server.
opt
Host-specific configuration files for add-on
applications installed in /opt.
os-release
Operating system identification.
Why is this important? (network) The main configuration file is
/etc/network/interfaces. This is where you configure L2 and L3 information
for all of your front panel ports (swp interfaces). Settings like MTU, link
speed, IP address information and VLANs are all configured here.
pam.conf
The PAM (pluggable authentication module)
configuration file. When a PAM-aware privilege
granting application is started, it activates its
attachment to the PAM-API. This activation
performs a number of tasks, the most important
being the reading of the configuration file(s).
pam.d
Alternate directory to configure PAM (see above).
passwd
User account information.
passwd-
Backup file for /etc/passwd.
perl
Perl is an available scripting language. /etc/perl
contains configuration files specific to Perl.
profile
/etc/profile is utilized by sysprofile, a
modular centralized shell configuration.
profile.d
The directory version of the above, which contains
configuration files.
protocols
The protocols definition file, a plain ASCII file that
describes the various DARPAnet protocols that are
available from the TCP/IP subsystem.
ptm.d
The directory containing scripts that are run if PTM
(see page 143) passes or fails.
python
python is an available scripting language.
python2.6
The 2.6 version of python.
python2.7
The 2.7 version of python.
quagga
Contains the configuration files for the Quagga
routing suite (see page 273), the preferred Cumulus
Linux routing engine.
rc.local
The /etc/rc.local script runs after all the normal
system services have started, at the end of the
process of switching to a multiuser runlevel. You
can use it to start a custom service, for example, a
server that's installed in /usr/local. Most
installations don't need /etc/rc.local; it's
provided for the minority of cases where it's needed.
Why is this important? (ptm.d) Cumulus Linux-specific folder for PTM
(prescriptive topology manager).
rc0.d
Like rc.local, the scripts in these folders run by
default, but the number in the folder name
represents the Linux runlevel. The rc0.d folder
represents runlevel 0 (halt the system).
rc1.d
This is run level 1, which is single-user/minimal
mode.
rc2.d
Runlevels 2 through 5 are multiuser modes. Debian
systems (such as Cumulus Linux) come with id=2,
which indicates that the default runlevel will be 2
when the multi-user state is entered, and the
scripts in /etc/rc2.d/ will be run.
rc3.d
See above.
rc4.d
See above.
rc5.d
See above.
rc6.d
Runlevel 6 reboots the system.
rcS.d
S stands for single and is equivalent to rc1.
resolv.conf
Resolver configuration file, which is where DNS is
set (domain, nameserver and search).
rmt
This is not a mistake. The shell script /etc/rmt is
provided for compatibility with other Unix-like
systems, some of which have utilities that expect to
find (and execute) rmt in the /etc directory on
remote systems.
rpc
The rpc file contains human-readable names that
can be used in place of RPC program numbers.
rsyslog.conf
The rsyslog.conf file is the main configuration
file for rsyslogd, which logs system messages on
*nix systems.
rsyslog.d
The directory containing additional configuration
for rsyslog.conf (see above).
Why is this important? (resolv.conf) You need DNS to reach the Cumulus Linux
repository.
securetty
This file lists terminals into which the root user can
log in.
security
The /etc/security directory contains security-related configuration files. Whereas PAM concerns
itself with the methods used to authenticate any
given user, the files under /etc/security are
concerned with just what a user can or cannot do.
For example, the /etc/security/access.conf
file contains a list of which users are allowed to log
in and from what host (for example, using telnet).
The /etc/security/limits.conf file contains
various system limits, such as maximum number of
processes.
selinux
NSA Security-Enhanced Linux.
sensors.d
The directory from which the sensors program
loads its configuration; this is unique for each
hardware platform. See also Monitoring System
Hardware (see page 44).
sensors3.conf
The sensors3.conf file describes how libsensors,
and thus all programs using it, should translate
the raw readings from the kernel modules to real-world values.
services
services is a plain ASCII file providing a mapping
between human-readable textual names for
internet services and their underlying assigned port
numbers and protocol types.
shadow
shadow is a file that contains the password
information for the system's accounts and optional
aging information.
shadow-
The backup for the /etc/shadow file.
shells
The pathnames of valid login shells.
skel
The skeleton directory (usually /etc/skel) holds the
default files that are copied into a newly created
home directory, for example by pam_mkhomedir.
snmp
Interface functions to the SNMP (simple network
management protocol) toolkit.
ssh
The ssh configuration.
ssl
The OpenSSL ssl library implements the Secure
Sockets Layer (SSL v2/v3) and Transport Layer
Security (TLS v1) protocols. This directory holds
certificates and configuration.
staff-group-for-usr-local
Use cat or more on this file to learn more;
see also http://bugs.debian.org/299007.
sudoers
The sudoers policy plugin determines a user's
sudo privileges.
sudoers.d
The directory containing additional sudoers
configuration (see above).
sysctl.conf
Configures kernel parameters at boot.
sysctl.d
The directory containing additional sysctl
configuration (see above).
systemd
systemd system and service manager.
terminfo
Terminal capability database.
timezone
If this file exists, it is read and its contents are used
as the time zone name.
ucf.conf
The update configuration file preserves user
changes in configuration files.
udev
Dynamic device management.
ufw
Provides both a command line interface and a
framework for managing a netfilter firewall.
vim
Configuration file for command line tool vim.
wgetrc
Configuration file for command line tool wget.
Troubleshooting the support Directory
The support directory is unique in that it is not a copy of the switch's filesystem; instead, it
contains the output from various commands. For example:
File: support/ip.addr
Equivalent Command: ip addr show
Description: This shows you all the interfaces (including swp front panel ports), IP
address information, admin state and physical state.
Managing Application Daemons
You manage application daemons in Cumulus Linux in the following ways:
Identifying active listener ports
Identifying daemons currently active or stopped
Identifying boot time state of a specific daemon
Disabling or enabling a specific daemon
Contents
Identifying Active Listener Ports for IPv4 and IPv6 (see page 74)
Identifying Daemons Currently Active or Stopped (see page 75)
Identifying Boot Time State of a Specific Daemon (see page 75)
Disabling or Enabling a Specific Daemon (see page 76)
Identifying Active Listener Ports for IPv4 and IPv6
You can identify the active listener ports under both IPv4 and IPv6 using the lsof command:
cumulus@switch:~$ sudo lsof -Pnl +M -i4
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ntpd 1882 104 16u IPv4 3954 0t0 UDP *:123
ntpd 1882 104 18u IPv4 3963 0t0 UDP 127.0.0.1:123
ntpd 1882 104 19u IPv4 3964 0t0 UDP 192.168.8.37:123
snmpd 1987 105 8u IPv4 5423 0t0 UDP *:161
zebra 1993 103 10u IPv4 5151 0t0 TCP 127.0.0.1:2601 (LISTEN)
sshd 2496 0 3u IPv4 5809 0t0 TCP *:22 (LISTEN)
jdoo 2622 0 6u IPv4 6132 0t0 TCP 127.0.0.1:2812 (LISTEN)
sshd 31700 0 3r IPv4 187630 0t0 TCP 192.168.8.37:22->192.168.8.3:50386 (ESTABLISHED)
cumulus@switch:~$ sudo lsof -Pnl +M -i6
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ntpd 1882 104 17u IPv6 3955 0t0 UDP *:123
ntpd 1882 104 20u IPv6 3965 0t0 UDP [::1]:123
ntpd 1882 104 21u IPv6 3966 0t0 UDP [fe80::7272:cfff:fe96:6639]:123
sshd 2496 0 4u IPv6 5811 0t0 TCP *:22 (LISTEN)
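If you capture this output for later analysis, a short script can reduce it to the sockets that are actually accepting traffic. The following is a minimal sketch in Python, assuming the column layout shown above; the function name `listening_sockets` and the sample text are illustrative and are not part of Cumulus Linux.

```python
def listening_sockets(lsof_output):
    """Return (command, protocol, address) for TCP listeners and UDP sockets.

    Assumes the `lsof -Pnl -i` column layout shown above; illustrative only.
    """
    results = []
    for line in lsof_output.splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a socket line.
        if len(fields) < 9 or fields[0] == "COMMAND":
            continue
        command, protocol, address = fields[0], fields[7], fields[8]
        is_tcp_listener = protocol == "TCP" and fields[-1] == "(LISTEN)"
        is_udp = protocol == "UDP"
        if is_tcp_listener or is_udp:
            results.append((command, protocol, address))
    return results


sample = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ntpd 1882 104 16u IPv4 3954 0t0 UDP *:123
sshd 2496 0 3u IPv4 5809 0t0 TCP *:22 (LISTEN)
sshd 31700 0 3r IPv4 187630 0t0 TCP 192.168.8.37:22->192.168.8.3:50386 (ESTABLISHED)
"""
for entry in listening_sockets(sample):
    print(entry)
```

Established sessions, such as the SSH connection in the sample, are filtered out because they are not listeners.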
Identifying Daemons Currently Active or Stopped
To determine which daemons are currently active or stopped, use the service --status-all
command, then pipe the results to grep, using the - or + operators:
cumulus@switch:~$ sudo service --status-all | grep +
[ ? ] aclinit
[ + ] arp_refresh
[ + ] auditd
...
cumulus@switch:~$ sudo service --status-all | grep -
[ - ] isc-dhcp-server
[ - ] openvswitch-vtep
[ - ] ptmd
...
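In this output, [ + ] marks a running service, [ - ] a stopped one, and [ ? ] a service whose state could not be determined. As an alternative to running grep twice, the sketch below groups saved output by marker in one pass. It is written in Python and the name `group_by_status` is hypothetical.

```python
import re

# Matches lines such as " [ + ]  auditd" and captures the marker and the name.
STATUS_LINE = re.compile(r"^\s*\[\s*([+\-?])\s*\]\s+(\S+)")


def group_by_status(status_output):
    """Group service names by their status marker (+, - or ?)."""
    groups = {"+": [], "-": [], "?": []}
    for line in status_output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            groups[match.group(1)].append(match.group(2))
    return groups


sample = """\
 [ ? ]  aclinit
 [ + ]  arp_refresh
 [ + ]  auditd
 [ - ]  isc-dhcp-server
 [ - ]  ptmd
"""
groups = group_by_status(sample)
print("running:", groups["+"])
print("stopped:", groups["-"])
```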
Identifying Boot Time State of a Specific Daemon
The ls command can provide the boot time state of a daemon. A file link with a name starting with S
identifies a boot-time-enabled daemon. A file link with a name starting with K identifies a disabled
daemon.
cumulus@switch:~/etc$ sudo ls -l rc*.d | grep <daemon name>
For example:
cumulus@switch:~/etc$ sudo ls -l rc*.d | grep snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 S01snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 S01snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 S01snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 S01snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
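The S and K naming rule can also be applied programmatically. The sketch below, in Python, sorts a list of link names into boot-time-enabled (S) and disabled (K) entries for one daemon. The function name `classify_links` is hypothetical, and the input is assumed to be link names you have already collected, for example from the listing above.

```python
import re

# Link names look like S01snmpd or K02snmpd: S or K, a two-digit order, the script name.
LINK_NAME = re.compile(r"^([SK])(\d\d)(.+)$")


def classify_links(link_names, daemon):
    """Split rc*.d link names for one daemon into (enabled, disabled) lists."""
    enabled, disabled = [], []
    for name in link_names:
        match = LINK_NAME.match(name)
        if not match or match.group(3) != daemon:
            continue
        if match.group(1) == "S":
            enabled.append(name)
        else:
            disabled.append(name)
    return enabled, disabled


names = ["K02snmpd", "K02snmpd", "S01snmpd", "S01snmpd",
         "S01snmpd", "S01snmpd", "K02snmpd", "S02ntp"]
enabled, disabled = classify_links(names, "snmpd")
print(len(enabled), "start links,", len(disabled), "kill links")
```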
Disabling or Enabling a Specific Daemon
To enable or disable a specific daemon, run:
cumulus@switch:~$ sudo update-rc.d <daemon> disable|enable
For example:
cumulus@switch:~/etc$ sudo update-rc.d snmpd disable
update-rc.d: using dependency based boot sequencing
insserv: warning: current start runlevel(s) (empty) of script `snmpd'
overrides LSB defaults (2 3 4 5).
insserv: warning: current stop runlevel(s) (0 1 2 3 4 5 6) of script
`snmpd' overrides LSB defaults (0 1 6).
insserv: warning: current start runlevel(s) (empty) of script `snmpd'
overrides LSB defaults (2 3 4 5).
insserv: warning: current stop runlevel(s) (0 1 2 3 4 5 6) of script
`snmpd' overrides LSB defaults (0 1 6).
cumulus@switch:~/etc$ sudo ls -l rc*.d | grep snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Feb 13 17:35 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Feb 13 17:35 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Feb 13 17:35 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Feb 13 17:35 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
cumulus@switch:~/etc$ sudo update-rc.d snmpd enable
update-rc.d: using dependency based boot sequencing
cumulus@switch:~/etc$ sudo ls -l rc*.d | grep snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Feb 13 17:35 S01snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Feb 13 17:35 S01snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Feb 13 17:35 S01snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Feb 13 17:35 S01snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 15 Apr 4 2014 K02snmpd -> ../init.d/snmpd
Troubleshooting ifupdown2
The following sections describe various ways you can troubleshoot ifupdown2.
Contents
Enabling Logging for Networking (see page 77)
Using ifquery to Validate and Debug Interface Configurations (see page 78)
Debugging Mako Template Errors (see page 80)
ifdown Cannot Find an Interface that Exists (see page 81)
MTU Set on a Logical Interface Fails with Error: "Numerical result out of range" (see page 81)
Interpreting iproute2 batch Command Failures (see page 81)
Understanding the "RTNETLINK answers: Invalid argument" Error when Adding a Port to a Bridge
(see page 82)
Enabling Logging for Networking
The /etc/default/networking file contains two settings for logging:
To get ifupdown2 logs when the switch boots (stored in syslog)
To enable logging when you run service networking [start|stop|reload]
This file also contains an option for excluding interfaces when you boot the switch or run service
networking start|stop|reload. You can exclude any interface specified in /etc/network
/interfaces. These interfaces do not come up when you boot the switch or start/stop/reload the
networking service.
cumulus@switch:~$ cat /etc/default/networking
#
#
# Parameters for the /etc/init.d/networking script
#
#
# Change the below to yes if you want verbose logging to be enabled
VERBOSE="no"
# Change the below to yes if you want debug logging to be enabled
DEBUG="no"
# Change the below to yes if you want logging to go to syslog
SYSLOG="no"
# Exclude interfaces
EXCLUDE_INTERFACES=
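To check these settings from a script, you can read the file as simple KEY=value pairs. The following Python sketch assumes the file contains only comments and shell-style assignments like those shown above; `parse_defaults` is a hypothetical helper, not a Cumulus Linux tool.

```python
import shlex


def parse_defaults(text):
    """Parse shell-style KEY=value lines, ignoring comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex strips the surrounding quotes; an empty value yields an empty list.
        values = shlex.split(raw)
        settings[key.strip()] = values[0] if values else ""
    return settings


sample = '''\
# Change the below to yes if you want verbose logging to be enabled
VERBOSE="no"
DEBUG="no"
SYSLOG="no"
# Exclude interfaces
EXCLUDE_INTERFACES=
'''
settings = parse_defaults(sample)
print(settings)
```

An empty EXCLUDE_INTERFACES value, as in the default file, is returned as an empty string.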
Using ifquery to Validate and Debug Interface Configurations
You use ifquery to print parsed interfaces file entries.
To use ifquery to pretty print iface entries from the interfaces file, run:
cumulus@switch:~$ sudo ifquery bond0
auto bond0
iface bond0
address 14.0.0.9/30
address 2001:ded:beef:2::1/64
bond-slaves swp25 swp26
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
Use ifquery --check to check the current running state of an interface against its configuration in the
interfaces file. It returns exit code 0 if the configuration matches and 1 if it does not:
cumulus@switch:~$ sudo ifquery --check bond0
iface bond0
bond-mode 802.3ad ()
bond-miimon 100 ()
bond-use-carrier 1 ()
bond-lacp-rate 1 ()
bond-min-links 1 ()
bond-xmit-hash-policy layer3+4 ()
bond-slaves swp25 swp26 ()
address 14.0.0.9/30 ()
address 2001:ded:beef:2::1/64 ()
ifquery --check is an experimental feature.
Use ifquery --running to print the running state of interfaces in the interfaces file format:
cumulus@switch:~$ sudo ifquery --running bond0
auto bond0
iface bond0
bond-xmit-hash-policy layer3+4
bond-miimon 100
bond-lacp-rate 1
bond-min-links 1
bond-slaves swp25 swp26
bond-mode 802.3ad
address 14.0.0.9/30
address 2001:ded:beef:2::1/64
ifquery --syntax-help provides help on all possible attributes supported in the interfaces file.
For complete syntax on the interfaces file, see man interfaces and man ifupdown-addons-interfaces.
ifquery can dump information in JSON format:
cumulus@switch:~$ sudo ifquery --format=json bond0
{
"auto": true,
"config": {
"bond-use-carrier": "1",
"bond-xmit-hash-policy": "layer3+4",
"bond-miimon": "100",
"bond-lacp-rate": "1",
"bond-min-links": "1",
"bond-slaves": "swp25 swp26",
"bond-mode": "802.3ad",
"address": [
"14.0.0.9/30",
"2001:ded:beef:2::1/64"
]
},
"addr_method": null,
"name": "bond0",
"addr_family": null
}
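The JSON form is convenient when another program needs the configuration. As a minimal sketch, assuming the output has the shape shown above, the Python function below pulls out the bond members and addresses. The name `bond_summary` is hypothetical, and the handling of a list wrapper or a single string address is a defensive assumption rather than documented behavior.

```python
import json


def bond_summary(ifquery_json):
    """Extract the name, bond members and addresses from ifquery JSON output."""
    data = json.loads(ifquery_json)
    # Assumption: tolerate a list of interface objects by taking the first one.
    if isinstance(data, list):
        data = data[0]
    config = data.get("config", {})
    addresses = config.get("address", [])
    # Assumption: a single address may appear as a plain string.
    if isinstance(addresses, str):
        addresses = [addresses]
    return {
        "name": data.get("name"),
        "slaves": config.get("bond-slaves", "").split(),
        "addresses": addresses,
    }


sample = """
{
  "auto": true,
  "config": {
    "bond-slaves": "swp25 swp26",
    "bond-mode": "802.3ad",
    "address": ["14.0.0.9/30", "2001:ded:beef:2::1/64"]
  },
  "addr_method": null,
  "name": "bond0",
  "addr_family": null
}
"""
summary = bond_summary(sample)
print(summary)
```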
You can use ifquery --print-savedstate to check the ifupdown2 state database. ifdown works
only on interfaces present in this state database.
cumulus@leaf1$ sudo ifquery --print-savedstate eth0
auto eth0
iface eth0 inet dhcp
Debugging Mako Template Errors
An easy way to debug and get details about template errors is to use the mako-render command on
your interfaces template file or on /etc/network/interfaces itself.
cumulus@switch:~$ sudo mako-render /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet dhcp
#auto eth1
#iface eth1 inet dhcp
# Include any platform-specific interface configuration
source /etc/network/interfaces.d/*.if
# ssim2 added
auto swp45
iface swp45
auto swp46
iface swp46
cumulus@switch:~$ sudo mako-render /etc/network/interfaces.d/<interfaces_stub_file>
ifdown Cannot Find an Interface that Exists
If you are trying to bring down an interface that you know exists, use ifdown with the
--use-current-config option to force ifdown to check the current /etc/network/interfaces file to find
the interface. This can solve issues where the ifup command issued for that interface was interrupted
before it updated the state database. For example:
cumulus@switch:~$ sudo ifdown br0
error: cannot find interfaces: br0 (interface was probably never up ?)
cumulus@switch:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.44383900279f       yes             downlink
                                                        peerlink
cumulus@switch:~$ sudo ifdown br0 --use-current-config
MTU Set on a Logical Interface Fails with Error: "Numerical result out of range"
This error occurs when the MTU you are trying to set on an interface is higher than the MTU of the
lower interface or dependent interface. Linux expects the upper interface to have an MTU less than or
equal to the MTU on the lower interface.
In the example below, the swp1.100 VLAN interface is an upper interface to physical interface swp1. If
you want to change the MTU to 9000 on the VLAN interface, you must include the new MTU on the
lower interface swp1 as well.
auto swp1.100
iface swp1.100
mtu 9000
auto swp1
iface swp1
mtu 9000
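Before applying a configuration, you can check the MTU rule described above offline. The sketch below is a Python illustration that takes the MTU you intend for each interface and a map from each upper interface to its lower interface, then reports any pair that would trigger the error. The names `mtu_violations`, `mtus` and `lower_of` are hypothetical.

```python
def mtu_violations(mtus, lower_of):
    """Return (upper, lower) pairs where the upper MTU exceeds the lower MTU.

    mtus:     {interface name: MTU}
    lower_of: {upper interface: lower interface}
    """
    violations = []
    for upper, lower in sorted(lower_of.items()):
        if mtus[upper] > mtus[lower]:
            violations.append((upper, lower))
    return violations


lower_of = {"swp1.100": "swp1"}

# Raising only the VLAN interface breaks the rule.
print(mtu_violations({"swp1": 1500, "swp1.100": 9000}, lower_of))

# Raising both interfaces, as in the example above, satisfies it.
print(mtu_violations({"swp1": 9000, "swp1.100": 9000}, lower_of))
```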
Interpreting iproute2 batch Command Failures
ifupdown2 batches iproute2 commands for performance reasons. A batch command contains ip -force -batch - in the error message. The command number that failed is at the end of this line:
Command failed -:1.
Below is a sample error for the command 1: link set dev host2 master bridge. There was an
error adding the bond host2 to the bridge named bridge because host2 did not have a valid address.
error: failed to execute cmd 'ip -force -batch - [link set dev host2 master bridge
addr flush dev host2
link set dev host1 master bridge
addr flush dev host1
]'(RTNETLINK answers: Invalid argument
Command failed -:1)
warning: bridge configuration failed (missing ports)
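To find the failing command quickly in a long batch, you can match the number after "Command failed -:" against the list of commands inside the brackets. The Python sketch below assumes one command per line inside the brackets and 1-based numbering, as in the sample error above; `failed_batch_command` is a hypothetical helper.

```python
import re


def failed_batch_command(error_text):
    """Return (number, command) for the failed batch entry, or None if not found."""
    batch = re.search(r"-batch - \[(.*?)\]", error_text, re.DOTALL)
    failed = re.search(r"Command failed -:(\d+)", error_text)
    if not batch or not failed:
        return None
    commands = [c.strip() for c in batch.group(1).splitlines() if c.strip()]
    number = int(failed.group(1))
    # Assumption: batch commands are numbered from 1.
    if not 1 <= number <= len(commands):
        return None
    return number, commands[number - 1]


sample = """\
error: failed to execute cmd 'ip -force -batch - [link set dev host2 master bridge
addr flush dev host2
link set dev host1 master bridge
addr flush dev host1
]'(RTNETLINK answers: Invalid argument
Command failed -:1)
"""
print(failed_batch_command(sample))
```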
Understanding the "RTNETLINK answers: Invalid argument" Error when Adding
a Port to a Bridge
This error can occur when the bridge port does not have a valid hardware address.
This can typically occur when the interface being added to the bridge is an incomplete bond; a bond
without slaves is incomplete and does not have a valid hardware address.
Network Troubleshooting
Cumulus Linux contains a number of command line and analytical tools to help you troubleshoot
issues with your network.
Contents
Commands (see page 82)
Checking Reachability Using ping (see page 83)
Printing Route Trace Using traceroute (see page 83)
Manipulating the System ARP Cache (see page 84)
Traffic Generation Using mz (see page 84)
Counter ACL (see page 85)
SPAN and ERSPAN (see page 86)
Configuration Files (see page 89)
Useful Links (see page 89)
Caveats and Errata (see page 89)
Commands
arp
cl-acltool
ip
mz
ping
traceroute
Checking Reachability Using ping
ping is used to check reachability of a host. ping also calculates the time it takes for packets to travel
the round trip. See man ping for details.
To test the connection to an IPv4 host:
cumulus@switch:~$ ping 206.190.36.45
PING 206.190.36.45 (206.190.36.45) 56(84) bytes of data.
64 bytes from 206.190.36.45: icmp_req=1 ttl=53 time=40.4 ms
64 bytes from 206.190.36.45: icmp_req=2 ttl=53 time=39.6 ms
...
To test the connection to an IPv6 host:
cumulus@switch:~$ ping6 -I swp1 fe80::202:ff:fe00:2
PING fe80::202:ff:fe00:2(fe80::202:ff:fe00:2) from fe80::202:ff:fe00:1
swp1: 56 data bytes
64 bytes from fe80::202:ff:fe00:2: icmp_seq=1 ttl=64 time=1.43 ms
64 bytes from fe80::202:ff:fe00:2: icmp_seq=2 ttl=64 time=0.927 ms
Printing Route Trace Using traceroute
traceroute tracks the route that packets take across an IP network on their way to a given host. See
man traceroute for details.
To track the route to an IPv4 host:
cumulus@switch:~$ traceroute www.google.com
traceroute to www.google.com (74.125.239.49), 30 hops max, 60 byte packets
 1  fw.cumulusnetworks.com (192.168.1.1)  0.614 ms  0.863 ms  0.932 ms
 2  router.hackerdojo.com (157.22.42.1)  15.459 ms  16.447 ms  16.818 ms
 3  gw-cpe-hackerdojo.via.net (157.22.10.97)  18.470 ms  18.473 ms  18.897 ms
 4  ge-1-5-v223.core1.uspao.via.net (157.22.10.81)  20.419 ms  20.422 ms  21.026 ms
 5  core2-1-1-0.pao.net.google.com (198.32.176.31)  22.347 ms  22.584 ms  24.328 ms
 6  216.239.49.250 (216.239.49.250)  24.371 ms  25.757 ms  25.987 ms
 7  72.14.232.35 (72.14.232.35)  27.505 ms  22.925 ms  22.323 ms
 8  nuq04s19-in-f17.1e100.net (74.125.239.49)  23.544 ms  21.851 ms  22.604 ms
Manipulating the System ARP Cache
arp manipulates or displays the kernel's IPv4 network neighbor cache. See man arp for details.
To display the ARP cache:
cumulus@switch:~$ arp -a
? (11.0.2.2) at 00:02:00:00:00:10 [ether] on swp3
? (11.0.3.2) at 00:02:00:00:00:01 [ether] on swp4
? (11.0.0.2) at 44:38:39:00:01:c1 [ether] on swp1
To delete an ARP cache entry:
cumulus@switch:~$ arp -d 11.0.2.2
cumulus@switch:~$ arp -a
? (11.0.2.2) at <incomplete> on swp3
? (11.0.3.2) at 00:02:00:00:00:01 [ether] on swp4
? (11.0.0.2) at 44:38:39:00:01:c1 [ether] on swp1
To add a static ARP cache entry:
cumulus@switch:~$ arp -s 11.0.2.2 00:02:00:00:00:10
cumulus@switch:~$ arp -a
? (11.0.2.2) at 00:02:00:00:00:10 [ether] PERM on swp3
? (11.0.3.2) at 00:02:00:00:00:01 [ether] on swp4
? (11.0.0.2) at 44:38:39:00:01:c1 [ether] on swp1
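If you need to process the cache in a script, for example to spot incomplete or static entries, the arp -a output can be parsed line by line. The Python sketch below assumes the line format shown in the examples above; `parse_arp` is a hypothetical helper.

```python
import re

# Matches lines such as "? (11.0.2.2) at 00:02:00:00:00:10 [ether] PERM on swp3".
ARP_LINE = re.compile(
    r"^\S+ \((?P<ip>[\d.]+)\) at (?P<mac>\S+)"
    r"(?: \[(?P<hwtype>\w+)\])?(?P<perm> PERM)? on (?P<iface>\S+)$"
)


def parse_arp(arp_output):
    """Parse `arp -a` lines into dictionaries; unmatched lines are skipped."""
    entries = []
    for line in arp_output.splitlines():
        match = ARP_LINE.match(line.strip())
        if not match:
            continue
        mac = match.group("mac")
        entries.append({
            "ip": match.group("ip"),
            "mac": None if mac == "<incomplete>" else mac,
            "static": match.group("perm") is not None,
            "iface": match.group("iface"),
        })
    return entries


sample = """\
? (11.0.2.2) at 00:02:00:00:00:10 [ether] PERM on swp3
? (11.0.3.2) at <incomplete> on swp4
? (11.0.0.2) at 44:38:39:00:01:c1 [ether] on swp1
"""
entries = parse_arp(sample)
for entry in entries:
    print(entry)
```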
Traffic Generation Using mz
mz is a fast traffic generator. It can generate a large variety of packet types at high speed. See man mz
for details.
For example, to send two sets of packets to TCP port 23 and 24, with source IP 11.0.0.1 and destination
11.0.0.2, do the following:
cumulus@switch:~$ sudo mz swp1 -A 11.0.0.1 -B 11.0.0.2 -c 2 -v -t tcp "dp=23-24"
Mausezahn 0.40 - (C) 2007-2010 by Herbert Haas - http://www.perihel.at/sec/mz/
Use at your own risk and responsibility!
-- Verbose mode --
This system supports a high resolution clock.
The clock resolution is 4000250 nanoseconds.
Mausezahn will send 4 frames...
IP: ver=4, len=40, tos=0, id=0, frag=0, ttl=255, proto=6, sum=0, SA=11.0.0.1, DA=11.0.0.2,
payload=[see next layer]
TCP: sp=0, dp=23, S=42, A=42, flags=0, win=10000, len=20, sum=0,
payload=
IP: ver=4, len=40, tos=0, id=0, frag=0, ttl=255, proto=6, sum=0, SA=11.0.0.1, DA=11.0.0.2,
payload=[see next layer]
TCP: sp=0, dp=24, S=42, A=42, flags=0, win=10000, len=20, sum=0,
payload=
IP: ver=4, len=40, tos=0, id=0, frag=0, ttl=255, proto=6, sum=0, SA=11.0.0.1, DA=11.0.0.2,
payload=[see next layer]
TCP: sp=0, dp=23, S=42, A=42, flags=0, win=10000, len=20, sum=0,
payload=
IP: ver=4, len=40, tos=0, id=0, frag=0, ttl=255, proto=6, sum=0, SA=11.0.0.1, DA=11.0.0.2,
payload=[see next layer]
TCP: sp=0, dp=24, S=42, A=42, flags=0, win=10000, len=20, sum=0,
payload=
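The "Mausezahn will send 4 frames" line in this output is consistent with the count (-c 2) multiplied by the number of destination ports in the range 23-24. The small Python sketch below reproduces that arithmetic so you can estimate the frame count before generating traffic. It is inferred from this one example only, and `expected_frames` is a hypothetical helper, not part of mz.

```python
def expected_frames(count, dport_range):
    """Estimate frames sent: count multiplied by the number of ports in the range.

    dport_range is a string such as "23-24" or "80". Inferred from the sample
    output above, not from the mz documentation.
    """
    first, _, last = dport_range.partition("-")
    ports = int(last or first) - int(first) + 1
    return count * ports


print(expected_frames(2, "23-24"))
```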
Counter ACL
In Linux, all ACL rules are always counted. To create an ACL rule for counting purposes only, set the
rule action to ACCEPT. See the Netfilter (see page 114) chapter for details on how to use cl-acltool
to set up iptables-/ip6tables-/ebtables-based ACLs.
Always place your rules files under /etc/cumulus/acl/policy.d/.
To count all packets going to a Web server:
cumulus@switch$ cat sample_count.rules
[iptables]
-A FORWARD -p tcp --dport 80 -j ACCEPT
cumulus@switch:$ sudo cl-acltool -i -p sample_count.rules
Using user provided rule file sample_count.rules
Reading rule file sample_count.rules ...
Processing rules in file sample_count.rules ...
Installing acl policy... done.
cumulus@switch$ sudo iptables -L -v
Chain INPUT (policy ACCEPT 16 packets, 2224 bytes)
 pkts bytes target     prot opt in     out     source       destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source       destination
    2   156 ACCEPT     tcp  --  any    any     anywhere     anywhere     tcp dpt:http
Chain OUTPUT (policy ACCEPT 44 packets, 8624 bytes)
 pkts bytes target     prot opt in     out     source       destination
SPAN and ERSPAN
SPAN (Switched Port Analyzer) provides for the mirroring of all packets coming in from or going out of
an interface to a local port for monitoring. This port is referred to as a mirror-to-port (MTP). The
original packet is still switched, while a mirrored copy of the packet is sent to the MTP port.
ERSPAN (Encapsulated Remote SPAN) enables the mirrored packets to be sent to a monitoring node
located anywhere across the routed network. The switch finds the outgoing port of the mirrored
packets by doing a lookup of the destination IP address in its routing table. The original L2 packet is
encapsulated with GRE for IP delivery. The encapsulated packets have the following format:
----------------------------------------------------------
| MAC_HEADER | IP_HEADER | GRE_HEADER | L2_Mirrored_Packet |
----------------------------------------------------------
SPAN and ERSPAN are configured via cl-acltool, the same utility for security ACL configuration (see
page 114). The match criteria for SPAN and ERSPAN can only be an interface; more granular match
terms are not supported. The interface can be a port, a subinterface or a bond interface. Both ingress
and egress interfaces can be matched.
Cumulus Linux supports a maximum of 2 SPAN destinations. Multiple rules can point to the same SPAN
destination. The MTP interface can be a physical port, a subinterface, or a bond interface. The SPAN
/ERSPAN action is independent of security ACL actions. If packets match both a security ACL rule and a
SPAN rule, both actions will be carried out.
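Because only 2 SPAN destinations are supported, it can help to check a rules file before installing it. The Python sketch below collects the distinct --dport targets of SPAN rules in a rules file and compares the count with the limit. It uses the rule syntax from the examples in this section, looks at SPAN rules only (ERSPAN rules are ignored), and `span_destinations` is a hypothetical helper.

```python
import re

MAX_SPAN_DESTINATIONS = 2  # limit stated in this section

# Matches "-j SPAN --dport <interface>"; "-j ERSPAN" does not match.
SPAN_RULE = re.compile(r"-j\s+SPAN\s+--dport\s+(\S+)")


def span_destinations(rules_text):
    """Return the set of distinct SPAN destination interfaces in a rules file."""
    destinations = set()
    for line in rules_text.splitlines():
        line = line.split("#", 1)[0]  # ignore trailing comments
        match = SPAN_RULE.search(line)
        if match:
            destinations.add(match.group(1))
    return destinations


sample = """\
[iptables]
-A FORWARD --in-interface swp1 -j SPAN --dport swp3
-A FORWARD --in-interface swp2 -j SPAN --dport swp3
-A FORWARD --out-interface bond0 -j SPAN --dport bond1
-A FORWARD --in-interface swp4 -j ERSPAN --src-ip 12.0.0.1 --dst-ip 12.0.0.2 --ttl 64
"""
destinations = span_destinations(sample)
print(sorted(destinations), len(destinations) <= MAX_SPAN_DESTINATIONS)
```

Two rules pointing at swp3 count as one destination, which matches the statement that multiple rules can share a SPAN destination.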
Always place your rules files under /etc/cumulus/acl/policy.d/.
To configure SPAN for all packets coming in from swp1 locally to swp3:
cumulus@switch$ cat span.rules
[iptables]
-A FORWARD --in-interface swp1 -j SPAN --dport swp3
cumulus@switch$ sudo cl-acltool -i -p span.rules
Using user provided rule file span.rules
Reading rule file span.rules ...
Processing rules in file span.rules ...
Installing acl policy... done.
cumulus@switch$ sudo iptables -L -v
Chain INPUT (policy ACCEPT 18 packets, 3034 bytes)
 pkts bytes target     prot opt in     out     source       destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source       destination
   28  3014 SPAN       all  --  swp1   any     anywhere     anywhere     dport:swp3
Chain OUTPUT (policy ACCEPT 56 packets, 12320 bytes)
 pkts bytes target     prot opt in     out     source       destination
To configure SPAN for all packets going out of bond0 locally to bond1:
cumulus@switch$ cat span.rules
[iptables]
-A FORWARD --out-interface bond0 -j SPAN --dport bond1
cumulus@switch$ cl-acltool -i -p span.rules
Using user provided rule file span.rules
Reading rule file span.rules ...
Processing rules in file span.rules ...
Installing acl policy... done.
cumulus@switch$ sudo iptables -L -v
Chain INPUT (policy ACCEPT 57 packets, 10000 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
   19  1938 SPAN       all  --  any    bond0   anywhere             anywhere             dport:bond1

Chain OUTPUT (policy ACCEPT 686 packets, 119K bytes)
 pkts bytes target     prot opt in     out     source               destination
To configure ERSPAN for all packets coming in from swp1 to 12.0.0.2:
cumulus@switch$ cat erspan.rules
[iptables]
-A FORWARD --in-interface swp1 -j ERSPAN --src-ip 12.0.0.1 --dst-ip 12.0.0.2 --ttl 64
cumulus@switch$ sudo cl-acltool -i -p erspan.rules
Using user provided rule file erspan.rules
Reading rule file erspan.rules ...
Processing rules in file erspan.rules ...
Installing acl policy... done.
cumulus@switch$ sudo iptables -L -v
Chain INPUT (policy ACCEPT 27 packets, 5526 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
   69  6804 ERSPAN     all  --  swp1   any     anywhere             anywhere             ERSPAN src-ip:12.0.0.1 dst-ip:12.0.0.2

Chain OUTPUT (policy ACCEPT 822 packets, 163K bytes)
 pkts bytes target     prot opt in     out     source               destination
The src-ip option can be any IP address, whether it exists in the routing table or not. The dst-ip
option must be an IP address reachable via the routing table. The destination IP address must be
reachable from a front-panel port, and not the management port. Use ping or ip route get <ip>
to verify that the destination IP address is reachable. Setting the --ttl option is recommended.
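The reachability check described above could be scripted roughly as follows. This is a minimal sketch, not part of Cumulus Linux: the function names route_dev and erspan_dst_ok are hypothetical, and it assumes the management port is named eth0 and that ip route get prints the outgoing interface after the keyword dev.

```shell
# Print the interface that follows "dev" in `ip route get` output read from stdin.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Succeed only if the route to the given address leaves through a port other
# than the management port. Assumption: the management port is eth0.
erspan_dst_ok() {
    dev=$(ip route get "$1" 2>/dev/null | route_dev)
    [ -n "$dev" ] && [ "$dev" != "eth0" ]
}

# Example: erspan_dst_ok 12.0.0.2 && echo "12.0.0.2 is routed via a front-panel port"
```

A successful route lookup only shows that the routing table has an entry for the destination; a ping is still needed to confirm that the collector actually answers.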
When using Wireshark to review the ERSPAN output, Wireshark may report the message
"Unknown version, please report or test to use fake ERSPAN preference", and the trace is
unreadable. To resolve this, go into the General preferences for Wireshark, then go to
Protocols > ERSPAN and check the Force to decode fake ERSPAN frame option.
Configuration Files
/etc/cumulus/acl/policy.conf
Useful Links
http://en.wikipedia.org/wiki/Ping
https://en.wikipedia.org/wiki/Traceroute
http://www.perihel.at/sec/mz/mzguide.html
Caveats and Errata
SPAN rules cannot match outgoing subinterfaces.
ERSPAN rules must include ttl for versions 1.5.1 and earlier.
Installation, Upgrading and Package Management
A Cumulus Linux switch can have up to two images of the operating system installed. This section
discusses installing new and updating existing Cumulus Linux disk images, and configuring those
images with additional applications (via packages) if desired.
Zero touch provisioning is a way to quickly deploy and configure new switches in a large-scale
environment.
Managing Cumulus Linux Disk Images (see page 89)
Adding and Updating Packages (see page 104)
Zero Touch Provisioning (see page 110)
Managing Cumulus Linux Disk Images
The Cumulus Linux operating system resides on a switch as a disk image. Switches running Cumulus
Linux can be configured with multiple disk images. This section discusses how to manage them.
There are some differences between PowerPC and x86 platforms, which are described below.
Contents
Commands (see page 90)
Understanding Image Slots (see page 90)
PowerPC Image Slots (see page 91)
x86 Image Slots (see page 91)
Installing a New Cumulus Linux Image (see page 93)
Installing the New Image (see page 93)
Upgrading Cumulus Linux (see page 95)
Upgrading Using cl-img-install -u (see page 95)
Upgrading to a Maintenance (X.Y.Z) Release (see page 96)
Accessing the Alternate Image Slot (see page 96)
Selecting the Alternate Image Slot as the New Primary Slot (see page 97)
Making Configurations Persist across Upgrades (see page 97)
Recommended Files to Make Persistent (see page 98)
Switching between Installed Images (see page 98)
Reverting an Image to its Original Configuration (see page 99)
Reprovisioning the System (Restart Installer) (see page 100)
Uninstalling All Images and Removing the Configuration (see page 100)
Booting into Rescue Mode (see page 101)
Inspecting Image File Contents (see page 102)
PowerPC Image Slot Overlay Detailed Information (see page 103)
Useful Links (see page 104)
Commands
cl-img-install
cl-img-select
cl-img-clear-overlay
cl-img-pkg
Understanding Image Slots
Cumulus Linux uses the concept of image slots to manage two separate Cumulus Linux images. The
characteristics of the image slots vary, based on whether your switch is on a PowerPC or x86 platform.
However, some terminology is common to both platforms:
Active image slot: The currently running image slot.
Primary image slot: The image slot that is selected for the next boot. Often this is the same as
the active image slot.
Alternate image slot: The inactive image slot, not selected for the next boot.
You can easily determine whether the switch is on the PowerPC or x86 platform by using the
uname -m command.
For example, on a PowerPC platform, uname -m outputs ppc:
cumulus@PPCswitch$ uname -m
ppc
While on an x86 platform, uname -m outputs x86_64:
cumulus@leaf1$ uname -m
x86_64
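A script that has to behave differently on the two platforms could branch on the same value. The following is a minimal sketch; the function name platform_family is hypothetical, and only the two machine strings shown above are recognized.

```shell
# Map a machine string (as printed by `uname -m`) to the platform family.
platform_family() {
    case "$1" in
        ppc)    echo "PowerPC" ;;
        x86_64) echo "x86" ;;
        *)      echo "unknown" ;;
    esac
}

# Example: platform_family "$(uname -m)"
```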
PowerPC Image Slots
On the PowerPC platform, each image slot consists of a read-only Cumulus Linux base image overlaid
with a read-write user area, as shown in the following diagram:
Files you edit and create reside in the read-write user overlay. This also includes any additional
software you install on top of Cumulus Linux. After an install, the user overlay is empty.
x86 Image Slots
Unlike PowerPC-based switches, there is no overlay for an x86-based switch; instead each slot is a
logical volume in the physical partition, which you can manage with LVM.
When you install Cumulus Linux on an x86 switch, the following entities are created on the disk:
A disk partition using an ext4 file system that contains three logical volumes: two logical
volumes named sysroot1 and sysroot2, and the /mnt/persist logical volume. The logical
volumes represent the Cumulus Linux image slots, so sysroot1 is slot 1 and sysroot2 is slot 2.
/mnt/persist is where you store your persistent configuration (see page 97).
A boot partition, shared by the logical volumes. Each volume mounts this partition as /boot.
Managing Slot Sizes
As space in a slot is used, you may need to increase the size of the root filesystem by increasing the size
of the corresponding logical volume. This section shows you how to check current utilization and
expand the filesystem as needed.
1. Check utilization on the root filesystem with the df command. In the following example,
filesystem utilization is 16%:
cumulus@switch$ df -h /
Filesystem                                              Size  Used Avail Use% Mounted on
/dev/disk/by-uuid/64650289-cebf-4849-91ae-a34693fce2f1  4.0G  579M  3.2G  16% /
2. To increase available space in the root filesystem, first use the vgs command to check the
available space in the volume group. In this example, there are 6.34 gigabytes of free space
available in the volume group CUMULUS:
cumulus@switch$ sudo vgs
  VG      #PV #LV #SN Attr   VSize  VFree
  CUMULUS   1   3   0 wz--n- 14.36g 6.34g
3. Once you confirm the available space, determine the number of the currently active slot using
cl-img-select.
cumulus@switch$ sudo cl-img-select | grep active
active => slot 1 (primary): 2.5.0-199c587-201501081931-build
cl-img-select indicates slot number 1 is active.
4. Resize the slot with the lvresize command. The following example increases slot size by 20
percent of total available space. Replace the "#" character in the example with the active slot
number from the last step.
cumulus@switch$ sudo lvresize -l +20%FREE CUMULUS/SYSROOT#
Extending logical volume SYSROOT# to 5.27 GiB
Logical volume SYSROOT# successfully resized
The use of + is very important with the lvresize command. Issuing lvresize without
the + results in the logical volume size being set directly to the specified size, rather
than extended.
5. Once the slot has been extended, use the resize2fs command to expand the filesystem to fit
the new space in the slot. Again, replace the "#" character in the example with the active slot
number.
cumulus@switch$ sudo resize2fs /dev/CUMULUS/SYSROOT#
resize2fs 1.42.5 (29-Jul-2012)
Filesystem at /dev/CUMULUS/SYSROOT# is mounted on /; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
Performing an on-line resize of /dev/CUMULUS/SYSROOT# to 1381376 (4k) blocks.
The filesystem on /dev/CUMULUS/SYSROOT# is now 1381376 blocks long.
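Steps 3 through 5 can be combined into a small helper. This is a sketch only, for x86 switches: the function names active_slot and grow_active_slot are hypothetical, and it assumes cl-img-select prints the active slot in the "active => slot N" form shown in step 3 and that the volume group is named CUMULUS.

```shell
# Extract the active slot number from cl-img-select output read on stdin.
active_slot() {
    sed -n 's/^active => slot \([0-9][0-9]*\) .*/\1/p'
}

# Grow the active slot by 20% of the free space in the volume group, then
# grow the filesystem to match. Run as root.
grow_active_slot() {
    slot=$(cl-img-select | active_slot)
    [ -n "$slot" ] || { echo "could not determine active slot" >&2; return 1; }
    # The leading + extends the volume instead of setting an absolute size.
    lvresize -l +20%FREE "CUMULUS/SYSROOT$slot" &&
        resize2fs "/dev/CUMULUS/SYSROOT$slot"
}
```

Stopping when the slot number cannot be parsed keeps lvresize from running against a malformed volume name.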
Installing a New Cumulus Linux Image
You install a new Cumulus Linux image when:
You first acquire a Cumulus Linux license, unless it came pre-installed on your hardware.
You upgrade to an X.0 or X.Y release; for example, when upgrading from 1.5.3 to 2.5 or from
2.1.1 to 2.2.0, and you don't want to use cl-img-install -u.
Installing a new image is a three-step process:
1. Installing the new image into the alternate image slot.
2. Selecting the alternate image slot as the new primary slot.
3. Rebooting the switch.
Installing a new image overwrites all files — including configuration files — on the target slot.
Cumulus Networks strongly recommends you create a persistent configuration (see page 97)
to back up your important files like your configurations.
Installing the New Image
Use the cl-img-install command to install a new image into the alternate image slot.
You can only install into the alternate slot, as it is not possible to install into the actively
running slot.
This example assumes the new image is located on an HTTP server with the following URL:
http://10.0.1.246/cumulus-install-amd64.bin:
cumulus@switch:~$ sudo cl-img-install http://10.0.1.246/cumulus-install-amd64.bin
Defaulting to image slot 1 for install.
Success: download complete.
Dumping image info from cumulus-install-amd64.bin ...
Verifying image checksum ... OK.
Preparing image archive ... OK.
Control File Contents
=====================
Description: Cumulus Linux
OS-Release: 2.1.0-0556262-MODIFIED-201406101128-user
Architecture: amd64
Date: Tue, 10 Jun 2014 11:44:28 -0700
Installer-Version: 1.2
Platforms: im_n29xx_t40n mlx_sx1400_i73612 dell_s6000_s1220
Homepage: http://www.cumulusnetworks.com/
Data Archive Contents
=====================
-rw-r--r-- user/Development       128 2014-06-10 18:44:26 file.list
-rw-r--r-- user/Development        44 2014-06-10 18:44:27 file.list.sha1
-rw-r--r-- user/Development 104276331 2014-06-10 18:44:27 sysroot-internal.tar.gz
-rw-r--r-- user/Development        44 2014-06-10 18:44:27 sysroot-internal.tar.gz.sha1
-rw-r--r-- user/Development   5391348 2014-06-10 18:44:26 vmlinuz-initrd.tar.xz
-rw-r--r-- user/Development        44 2014-06-10 18:44:27 vmlinuz-initrd.tar.xz.sha1
Current image slot setup:
          slot 1 (alt    ): 2.1.x-f6fe821-MODIFIED-201406072142-user
active => slot 2 (primary): 2.1.x-7802645-201406080637-build
About to update image slot 1 using:
http://10.0.1.246/cumulus-install-amd64.bin
Are you sure (y/N)?
Verifying image checksum ... OK.
Preparing image archive ... OK.
Validating sha1 for uImage-powerpc.itb... done.
Validating sha1 for sysroot.squash.xz... done.
Installing OS-Release 2.1.x-bae0260-20130528-NB into image slot 2 ...
Copying sysroot into /dev/sda8... done.
Verifying sysroot copy... OK.
Copying kernel uImage into /dev/sda7... done.
Success: http://10.0.1.246/incoming/cumulus-install-amd64.bin loaded into image slot 2.
The system automatically determines which slot is the alternate slot (slot 2 in this case).
Upgrading Cumulus Linux
There are two ways you can upgrade Cumulus Linux:
You can upgrade to any new version of Cumulus Linux using cl-img-install -u, which
upgrades the OS, putting it into the alternate image slot. This saves you from having to reinstall
the OS completely and preserves the Cumulus Linux-specific customizations you made to the
existing version.
If you are upgrading to a maintenance release (X.Y.Z, like 2.5.1) from an earlier release in the
same major and minor release family only (like 2.2.1 to 2.2.2, or 2.5.0 to 2.5.1), you can choose
to only upgrade the operating system, using apt-get, instead of performing a full install of the
OS. This upgrades the installation in the active image slot.
Upgrading Using cl-img-install -u
This is a new feature and thus only works when upgrading from Cumulus Linux 2.5.1 or later
to a newer release. You can only use cl-img-install -u if you installed the 2.5.1 binary
image using cl-img-install. See above for doing a full image install (see page 93).
If you need to upgrade to a newer version of Cumulus Linux and don't want to do a complete install
(see page 93) as described above, you can use cl-img-install -u to upgrade to the newest
version. Upgrading Cumulus Linux in this manner installs the updated version of the OS in the alternate
image slot, based on the current configuration in the active slot. So in the likely circumstance that you
modified your configuration in version 2.5.1, for example, the upgrade process uses the customized
configuration as the baseline for the new version.
If you installed any packages from the Cumulus Linux repo into the existing version, the upgrade
installer prompts you to confirm whether you want to automatically upgrade those packages in the
new slot. This helps prevent a misconfigured state of the new system, so Cumulus Networks
recommends you choose to automatically upgrade those packages.
To upgrade Cumulus Linux:
1. Run cl-img-install -u https://path/to/image/CumulusLinux-X.Y.Z.bin.
2. Reboot the switch. Cumulus Linux automatically boots into the alternate image slot with the
updated OS.
Third party packages that are not in the Cumulus Linux repository do not get updated by this
process. This includes packages installed using apt-get (to install from another repository)
and dpkg.
Also, the contents of /etc/fstab are not moved over to the new slot during the upgrade;
you must move this content yourself — see below (see page 96).
Upgrading to a Maintenance (X.Y.Z) Release
If you already have Cumulus Linux installed on your switch and you are upgrading to a maintenance
release (X.Y.Z, like 2.5.1) from an earlier release in the same major and minor release family only (like
2.2.1 to 2.2.2, or 2.5.0 to 2.5.1), you can use apt-get to upgrade to the new version. (If you are
upgrading to a major (X.0) or minor (X.Y) release, you must either use cl-img-install -u, or else do a
full image install, as described in Installing a New Cumulus Linux Image (see page 93) above.)
To upgrade to a maintenance (X.Y.Z) release using apt-get:
1. Run apt-get update.
2. Run apt-get upgrade.
3. Reboot the switch.
While this method doesn't overwrite the target image slot, the disk image does occupy a lot of
disk space used by both Cumulus Linux image slots.
Accessing the Alternate Image Slot
It may be useful to access the content of the alternate slot to retrieve configuration or logs.
cl-img-install fails while the alternate slot is mounted. It is important to unmount the
alternate slot as shown in step 4 below when done.
1. Determine which slot is the alternate with cl-img-select.
cumulus@switch$ sudo cl-img-select |grep alt
slot 2 (alt    ): 2.5.0-199c587-201501081931-build
This output indicates slot 2 is the alternate slot.
2. Create a mount point for the alternate slot:
cumulus@switch$ sudo mkdir /mnt/alt
3. Mount the alternate slot to the mount point:
cumulus@switch$ sudo mount /dev/mapper/CUMULUS-SYSROOT# /mnt/alt
Where # is the number of the alternate slot.
The alternate slot is now accessible under /mnt/alt.
4. Unmount the mount point /mnt/alt when done.
cumulus@switch$ cd /
cumulus@switch$ sudo umount /mnt/alt/
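Because a mounted alternate slot makes cl-img-install fail, it can help to wrap the four steps so that the unmount always happens. The following is a minimal sketch for x86 switches; the function names alt_slot and fetch_from_alt are hypothetical, and the device path follows the /dev/mapper/CUMULUS-SYSROOT# pattern used in step 3.

```shell
# Extract the alternate slot number from cl-img-select output read on stdin.
alt_slot() {
    sed -n 's/^.*slot \([0-9][0-9]*\) (alt.*/\1/p'
}

# Copy one file out of the alternate slot, unmounting the slot afterwards
# whether or not the copy succeeded. Run as root.
# Usage: fetch_from_alt /etc/hostname /tmp/hostname.alt
fetch_from_alt() {
    slot=$(cl-img-select | alt_slot)
    [ -n "$slot" ] || return 1
    mkdir -p /mnt/alt &&
        mount "/dev/mapper/CUMULUS-SYSROOT$slot" /mnt/alt || return 1
    cp "/mnt/alt$1" "$2"
    status=$?
    umount /mnt/alt
    return $status
}
```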
Selecting the Alternate Image Slot as the New Primary Slot
Use cl-img-select -s to swap the primary and alternate slots:
cumulus@switch:~$ sudo cl-img-select -s
Success: Primary image slot set to 2.
active => slot 1 (alt    ): 2.1.x-e71fa52-20130527-NB
          slot 2 (primary): 2.1.x-bae0260-20130528-NB
Reboot required to take effect.
Now reboot the system using the reboot command:
cumulus@switch:~$ sudo reboot
Broadcast message from cumulus@switch (ttyS0):
The system is going down for reboot NOW!
...
Making Configurations Persist across Upgrades
Often you have application-specific configuration files you want to exist for every image slot,
even across image upgrades. The /mnt/persist mechanism provides this functionality.
At boot time, the system copies all files and directories residing under /mnt/persist to the root of the
active image slot, overwriting any existing files there.
For example, let's say you want the following two configuration files applied to every image slot:
/etc/hosts
/etc/cron.hourly/backup
To do this, first create the needed directories under /mnt/persist:
cumulus@switch:~$ sudo mkdir -p /mnt/persist/etc
cumulus@switch:~$ sudo mkdir -p /mnt/persist/etc/cron.hourly
Next copy the files into place under /mnt/persist using scp:
cumulus@switch:~$ sudo scp user@my-server:hosts /mnt/persist/etc
cumulus@switch:~$ sudo scp user@my-server:backup /mnt/persist/etc/cron.hourly
That's it. Now when you reboot into either image slot, the system populates /etc/hosts and
/etc/cron.hourly/backup with the files from /mnt/persist.
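When the files already exist on the switch rather than on a remote server, the directory creation and copy can be folded into one helper. This is a sketch under that assumption; the function name persist_file and the PERSIST_ROOT variable are hypothetical and not part of Cumulus Linux.

```shell
# Copy a file into the persistent area, preserving its absolute path.
# PERSIST_ROOT defaults to /mnt/persist; override it only for testing.
persist_file() {
    root=${PERSIST_ROOT:-/mnt/persist}
    mkdir -p "$root$(dirname "$1")" && cp -p "$1" "$root$1"
}

# Example (as root): persist_file /etc/hosts
```

Mirroring the full path under /mnt/persist matters because the boot-time copy places each file at the same location relative to the root of the active slot.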
Recommended Files to Make Persistent
Cumulus Networks recommends you consider making the following files and directories part of a
persistent configuration:
/etc/hostname
/etc/network/interfaces
/etc/resolv.conf
/etc/quagga/
/etc/ssh/
If you have a root user, consider including /root.
If you are using VXLANs without a controller (see page 209), see this list of files (see page 214) to
include in a persistent configuration.
If you are using LLDP (see page 138), consider including:
/etc/lldpd.conf
/etc/lldpd.d/
Because of their size, you may consider copying these directories to a remote server:
/etc
/var/support
Switching between Installed Images
You can select which image slot is active by using the cl-img-select command. First, to see the
currently installed images, use cl-img-select with no options:
cumulus@switch:~$ cl-img-select
          slot 1 (alt    ): 2.1.x-e71fa52-20130527-NB
active => slot 2 (primary): 2.1.x-bae0260-20130528-NB
Notice that slot 2 is the currently running image (it’s the primary image in the active slot). To select slot
1 as the primary image, use cl-img-select, passing 1 as the only argument:
cumulus@switch:~$ sudo cl-img-select 1
Success: Primary image slot set to 1.
          slot 1 (primary): 2.1.x-e71fa52-20130527-NB
active => slot 2 (alt    ): 2.1.x-bae0260-20130528-NB
Reboot required to take effect.
You must reboot the switch for the new primary image to take effect as the active image.
After the reboot, running cl-img-select gives you:
cumulus@switch:~$ cl-img-select
active => slot 1 (primary): 2.1.x-e71fa52-20130527-NB
          slot 2 (alt    ): 2.1.x-bae0260-20130528-NB
To swap from one slot to the other, you can use the -s option. This lets you make the alternate
slot the primary slot without knowing what slot is currently the primary, active one. You still
need to reboot after swapping slots.
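A script can tell whether a slot selection is still waiting for a reboot by checking whether the active slot is currently labeled as the alternate. The following is a minimal sketch; the function name reboot_pending is hypothetical, and it relies on the "active => slot N (alt" form shown in the cl-img-select output above.

```shell
# Read cl-img-select output on stdin; succeed if the active slot is labeled
# "alt", which means the primary slot only becomes active after a reboot.
reboot_pending() {
    grep -q '^active => slot [0-9][0-9]* (alt'
}

# Example: cl-img-select | reboot_pending && echo "reboot required"
```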
Reverting an Image to its Original Configuration
On PowerPC-based systems, you may want to clear out the read-write user overlay area. Perhaps
something was misconfigured, or was deleted by mistake, or some unneeded software was installed.
You can purge the read-write overlay using the cl-img-clear-overlay command, passing the slot
number as an argument. For example, to purge the read-write overlay for image slot 2 do this:
cumulus@switch:~$ sudo cl-img-clear-overlay 2
Success: Overlay configuration 2 will be re-initialized during the next
reboot.
You must reboot the switch to complete the purge.
Reprovisioning the System (Restart Installer)
You can reprovision the system, wiping out the contents of both image slots and /mnt/persist.
To initiate the provisioning and installation process, use cl-img-select -i:
cumulus@switch:~$ sudo cl-img-select -i
WARNING:
WARNING: Operating System install requested.
WARNING: This will wipe out all system data.
WARNING:
Are you sure (y/N)? y
Enabling install at next reboot...done.
Reboot required to take effect.
A reboot is required for the reinstall to begin.
If you change your mind, you can cancel a pending reinstall operation by using cl-img-select -c:
cumulus@switch:~$ sudo cl-img-select -c
Cancelling pending install at next reboot...done.
Uninstalling All Images and Removing the Configuration
To remove all installed images and configurations, returning the switch to its factory defaults, use cl-img-select -u:
cumulus@switch:~$ sudo cl-img-select -u
WARNING:
WARNING: Operating System uninstall requested.
WARNING: This will wipe out all system data.
WARNING:
Are you sure (y/N)? y
Enabling uninstall at next reboot...done.
Reboot required to take effect.
A reboot is required for the uninstall to begin.
If you change your mind, you can cancel a pending uninstall operation by using cl-img-select -c:
cumulus@switch:~$ sudo cl-img-select -c
Cancelling pending uninstall at next reboot...done.
Booting into Rescue Mode
If your system becomes broken in some way, you may be able to correct things by booting into ONIE
rescue mode. In rescue mode, the file systems are unmounted and you can use various Cumulus Linux
utilities to try and fix the problem.
To reboot the system into the ONIE rescue mode, use cl-img-select -r:
cumulus@switch:~$ sudo cl-img-select -r
WARNING:
WARNING: Rescue boot requested.
WARNING:
Are you sure (y/N)? y
Enabling rescue at next reboot...done.
Reboot required to take effect.
A reboot is required to boot into rescue mode.
If you change your mind, you can cancel a pending rescue boot operation by using cl-img-select -c:
cumulus@switch:~$ sudo cl-img-select -c
Cancelling pending rescue at next reboot...done.
Inspecting Image File Contents
From a running system you can display the contents of a Cumulus Linux image file using cl-img-pkg -d:
cumulus@switch:~$ sudo cl-img-pkg -d /var/lib/cumulus/installer/onie-installer
Verifying image checksum ... OK.
Preparing image archive ... OK.
Control File Contents
=====================
Description: Cumulus Linux
OS-Release: 2.1.0-0556262-201406101128-NB
Architecture: amd64
Date: Tue, 10 Jun 2014 11:44:28 -0700
Installer-Version: 1.2
Platforms: im_n29xx_t40n mlx_sx1400_i73612 dell_s6000_s1220
Homepage: http://www.cumulusnetworks.com/
Data Archive Contents
=====================
128 2014-06-10 18:44:26 file.list
44 2014-06-10 18:44:27 file.list.sha1
104276331 2014-06-10 18:44:27 sysroot-internal.tar.gz
44 2014-06-10 18:44:27 sysroot-internal.tar.gz.sha1
5391348 2014-06-10 18:44:26 vmlinuz-initrd.tar.xz
44 2014-06-10 18:44:27 vmlinuz-initrd.tar.xz.sha1
cumulus@switch:~$
You can also extract the image files to the current directory with the -e option:
cumulus@switch:~$ sudo cl-img-pkg -e /var/lib/cumulus/installer/onie-installer
Verifying image checksum ... OK.
Preparing image archive ... OK.
file.list
file.list.sha1
sysroot-internal.tar.gz
sysroot-internal.tar.gz.sha1
vmlinuz-initrd.tar.xz
vmlinuz-initrd.tar.xz.sha1
Success: Image files extracted OK.
cumulus@switch:~$ sudo ls -l
total 107120
-rw-r--r-- 1 1063 3000       128 Jun 10 18:44 file.list
-rw-r--r-- 1 1063 3000        44 Jun 10 18:44 file.list.sha1
-rw-r--r-- 1 1063 3000 104276331 Jun 10 18:44 sysroot-internal.tar.gz
-rw-r--r-- 1 1063 3000        44 Jun 10 18:44 sysroot-internal.tar.gz.sha1
-rw-r--r-- 1 1063 3000   5391348 Jun 10 18:44 vmlinuz-initrd.tar.xz
-rw-r--r-- 1 1063 3000        44 Jun 10 18:44 vmlinuz-initrd.tar.xz.sha1
PowerPC Image Slot Overlay Detailed Information
The root directory of an image slot on a PowerPC system is created using an overlayfs file system. The
lower part of the overlay is a read-only squashfs file system containing the base Cumulus Linux image.
The upper part of the overlay is a read-write directory containing all the user modifications.
The following table describes the mount points and directories used to create the overlay for image
slots 1 and 2.
Slot Number  R/O squashfs device  R/O mount point  R/W block device  R/W directory
1            /dev/sysroot1        /mnt/root-ro     /dev/overlay_rw   /mnt/root-rw/config1
2            /dev/sysroot2        /mnt/root-ro     /dev/overlay_rw   /mnt/root-rw/config2
A single read-write partition provides separate read-write directories for the upper part of the
overlay. The lower part of the overlay is a partition, while the upper part is a directory.
The following table describes all the interesting mount points.
Mount Point     File System  Purpose
/mnt/root-ro    squashfs     Contains the read-only base Cumulus Linux image.
/mnt/root-rw    ext2         Contains the read-write user directories for the overlay.
/               overlayfs    The union of /mnt/root-ro and /mnt/root-rw/config1 (or config2).
/mnt/persist    ext2         Contains the persistent user configuration applied to each image slot.
/mnt/initramfs  tmpfs        Contains the initramfs used at boot. Needed during shutdown.
Useful Links
Open Network Install Environment (ONIE) Home Page
Adding and Updating Packages
You use the Advanced Packaging Tool (APT) to manage additional applications (in the form of packages)
and to install the latest updates.
Contents
Commands (see page 104)
Updating the Package Cache (see page 104)
Listing Available Packages (see page 105)
Adding a Package (see page 106)
Listing Installed Packages (see page 107)
Upgrading to Newer Versions of Installed Packages (see page 108)
Upgrading a Single Package (see page 108)
Upgrading All Packages (see page 108)
Adding Packages from Another Repository (see page 108)
Configuration Files (see page 109)
Useful Links (see page 110)
Commands
apt-get
apt-cache
dpkg
Updating the Package Cache
To work properly, APT relies on a local cache of the available packages. You must populate the cache
initially, and then periodically update it with apt-get update:
cumulus@switch:~$ sudo apt-get update
Ign https://repo.cumulusnetworks.com CumulusLinux-1.5 Release.gpg
Get:1 https://repo.cumulusnetworks.com CumulusLinux-1.5 Release [9027 B]
Get:2 https://repo.cumulusnetworks.com CumulusLinux-1.5/main powerpc Packages [105 kB]
Get:3 https://repo.cumulusnetworks.com CumulusLinux-1.5/extras powerpc Packages [20 B]
Get:4 https://repo.cumulusnetworks.com CumulusLinux-1.5/updates powerpc Packages [20 B]
Get:5 https://repo.cumulusnetworks.com CumulusLinux-1.5/security-updates powerpc Packages [20 B]
Ign https://repo.cumulusnetworks.com CumulusLinux-1.5/extras Translation-en
Ign https://repo.cumulusnetworks.com CumulusLinux-1.5/main Translation-en
Ign https://repo.cumulusnetworks.com CumulusLinux-1.5/security-updates Translation-en
Ign https://repo.cumulusnetworks.com CumulusLinux-1.5/updates Translation-en
Fetched 115 kB in 2s (56.3 kB/s)
Reading package lists... Done
Listing Available Packages
Once the cache is populated, use apt-cache to search the cache to find the packages you are
interested in or to get information about an available package. Here are examples of the search and
show sub-commands:
cumulus@switch:~$ apt-cache search tcp
fakeroot - tool for simulating superuser privileges
libwrap0 - Wietse Venema's TCP wrappers library
libwrap0-dev - Wietse Venema's TCP wrappers library, development files
netbase - Basic TCP/IP networking system
nmap - The Network Mapper
openbsd-inetd - OpenBSD Internet Superserver
openssh-client - secure shell (SSH) client, for secure access to remote machines
openssh-server - secure shell (SSH) server, for secure access from remote machines
rsyslog - reliable system and kernel logging daemon
socat - multipurpose relay for bidirectional data transfer
tcpd - Wietse Venema's TCP wrapper utilities
tcpdump - command-line network traffic analyzer
tcpreplay - Tool to replay saved tcpdump files at arbitrary speeds
tcpstat - network interface statistics reporting tool
tcptrace - Tool for analyzing tcpdump output
tcpxtract - extracts files from network traffic based on file signatures
quagga - BGP/OSPF/RIP routing daemon
jdoo - utility for monitoring and managing daemons or similar programs
cumulus@switch:~$ apt-cache show tcpreplay
Package: tcpreplay
Version: 3.4.3-2+wheezy1
Architecture: powerpc
Maintainer: Noël Köthe <noel@debian.org>
Installed-Size: 984
Depends: libc6 (>= 2.7), libpcap0.8 (>= 0.9.8)
Homepage: http://tcpreplay.synfin.net/
Priority: optional
Section: net
Filename: pool/main/t/tcpreplay/tcpreplay_3.4.3-2+wheezy1_powerpc.deb
Size: 435904
SHA256: 03dc29057cb608d2ddf08207aedf18d47988ed6c23db0af69d30746768a639ae
SHA1: 8ee1b9b02dacd0c48a474844f4466eb54c7e1568
MD5sum: cf20bec7282ef77a091e79372a29fe1e
Description: Tool to replay saved tcpdump files at arbitrary speeds
Tcpreplay is aimed at testing the performance of a NIDS by
replaying real background network traffic in which to hide
attacks. Tcpreplay allows you to control the speed at which the
traffic is replayed, and can replay arbitrary tcpdump traces. Unlike
programmatically-generated artificial traffic which doesn't
exercise the application/protocol inspection that a NIDS performs,
and doesn't reproduce the real-world anomalies that appear on
production networks (asymmetric routes, traffic bursts/lulls,
fragmentation, retransmissions, etc.), tcpreplay allows for exact
replication of real traffic seen on real networks.
cumulus@switch:~$
The search command looks for the search terms not only in the package name but also in other
parts of the package information. Consequently, it matches more packages than you might
expect.
Adding a Package
In order to add a new package, first ensure the package is not already installed in the system:
cumulus@switch:~$ dpkg -l | grep {name of package}
If the package is already installed, ensure it’s the version you need. If it’s an older version, upgrade it as
described in Upgrading a Single Package, starting by refreshing the package cache from the Cumulus Linux repository:
cumulus@switch:~$ sudo apt-get update
If the package is not already on the system, add it by running apt-get install. This retrieves the
package from the Cumulus Linux repository and installs it on your system together with any other
packages that this package might depend on.
For example, the following adds the package tcpreplay to the system:
cumulus@switch:~$ sudo apt-get install tcpreplay
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
tcpreplay
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
Need to get 436 kB of archives.
After this operation, 1008 kB of additional disk space will be used.
Get:1 https://repo.cumulusnetworks.com/ CumulusLinux-1.5/main tcpreplay powerpc 3.4.3-2+wheezy1 [436 kB]
Fetched 436 kB in 0s (1501 kB/s)
Selecting previously unselected package tcpreplay.
(Reading database ... 15930 files and directories currently installed.)
Unpacking tcpreplay (from .../tcpreplay_3.4.3-2+wheezy1_powerpc.deb) ...
Processing triggers for man-db ...
Setting up tcpreplay (3.4.3-2+wheezy1) ...
cumulus@switch:~$
Listing Installed Packages
The APT cache contains information about all the packages available on the repository. To see which
packages are actually installed on your system, use dpkg. The following example lists all the packages
on the system that have "tcp" in their package names:
cumulus@switch:~$ dpkg -l \*tcp\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-=============================================
ii  tcpd           7.6.q-24     powerpc      Wietse Venema's TCP wrapper utili
ii  tcpdump        4.3.0-1      powerpc      command-line network traffic anal
ii  tcpreplay      3.4.3-2+whee powerpc      Tool to replay saved tcpdump file
cumulus@switch:~$
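When the installed-package check needs to run from a script, the dpkg -l listing can be parsed directly. The following is a minimal Python sketch, assuming the column layout shown above; the helper name installed_packages is illustrative and not part of Cumulus Linux.

```python
def installed_packages(dpkg_output):
    """Map package name -> version for rows whose status column is 'ii'."""
    packages = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Data rows look like: ii <name> <version> <arch> <description...>
        if len(fields) >= 3 and fields[0] == "ii":
            packages[fields[1]] = fields[2]
    return packages


# Sample text copied from the listing above.
sample = """\
Desired=Unknown/Install/Remove/Purge/Hold
||/ Name           Version      Architecture Description
ii  tcpd           7.6.q-24     powerpc      Wietse Venema's TCP wrapper utili
ii  tcpdump        4.3.0-1      powerpc      command-line network traffic anal
ii  tcpreplay      3.4.3-2+whee powerpc      Tool to replay saved tcpdump file
"""

pkgs = installed_packages(sample)
print("tcpreplay" in pkgs)
```

Note that dpkg -l truncates long fields to fit the column width (the tcpreplay version above is cut short), so this approach is suitable for presence checks rather than exact version comparisons.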
Upgrading to Newer Versions of Installed Packages
Upgrading a Single Package
A single package can be upgraded by simply installing that package again with apt-get install. You
should perform an update first so that the APT cache is populated with the latest information about the
packages.
To see if a package needs to be upgraded, use apt-cache show <pkgname> to show the latest
version number of the package. Use dpkg -l <pkgname> to show the version number of the installed
package.
Upgrading All Packages
You can upgrade all packages on the system by running apt-get update followed by apt-get upgrade.
This replaces all installed packages with their latest versions but does not install any new packages.
Adding Packages from Another Repository
As shipped, Cumulus Linux searches the Cumulus Linux repository for available packages. You can add
additional repositories to search by adding them to the list of sources that apt-get consults. See man
sources.list for more information.
For several packages, Cumulus Networks has added features or made bug fixes and these
packages must not be replaced with versions from other repositories. Cumulus Linux has
been configured to ensure that the packages from the Cumulus Linux repository are always
preferred over packages from other repositories.
If you want to install packages that are not in the Cumulus Linux repository, the procedure is the same
as above with one additional step.
Packages not part of the Cumulus Linux Repository have generally not been tested, and may
not be supported by Cumulus Linux support.
Installing packages outside of the Cumulus Linux repository requires the use of apt-get, but,
depending on the package, easy-install and other commands can also be used.
To install a new package, please complete the following steps:
1. First, ensure the package is not already installed on the system. Use the dpkg command:
cumulus@switch:~$ dpkg -l | grep {name of package}
2. If the package is installed already, ensure it's the version you need. If it's an older version, then
update the package from the Cumulus Linux repository:
cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get install {name of package}
3. If the package is not on the system, then most likely the package source location is also not in
the /etc/apt/sources.list file. If the source for the new package is not in sources.list,
please edit and add the appropriate source to the file. For example, add the following if you
wanted a package from the Debian repository that is not in the Cumulus Linux repository:
deb http://http.us.debian.org/debian wheezy main
deb http://security.debian.org/ wheezy/updates main
Otherwise, the repository may be listed in /etc/apt/sources.list but is commented out, as can be
the case with the testing repository:
#deb http://repo.cumulusnetworks.com CumulusLinux-VERSION testing
To uncomment the repository, remove the # at the start of the line, then save the file:
deb http://repo.cumulusnetworks.com CumulusLinux-VERSION testing
4. Run apt-get update then install the package:
cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get install {name of package}
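If you automate step 3, uncommenting a repository line can be done programmatically instead of in an editor. The following Python sketch shows one way to do it, assuming the one-line deb entry format shown above; the function name enable_repository is hypothetical, and the sketch works on text in memory rather than editing /etc/apt/sources.list in place.

```python
def enable_repository(sources_text, component):
    """Uncomment 'deb' lines whose last field equals the given component."""
    result = []
    for line in sources_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("#"):
            candidate = stripped.lstrip("#").strip()
            fields = candidate.split()
            # Only touch commented-out repository entries, not ordinary comments.
            if fields[:1] == ["deb"] and fields[-1] == component:
                line = candidate
        result.append(line)
    return "\n".join(result) + "\n"


before = (
    "# testing repository\n"
    "#deb http://repo.cumulusnetworks.com CumulusLinux-VERSION testing\n"
)
after = enable_repository(before, "testing")
print(after)
```

After writing the result back to the file, run apt-get update as in step 4 so the APT cache picks up the newly enabled repository.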
Configuration Files
/etc/apt/apt.conf
/etc/apt/preferences
/etc/apt/sources.list
Useful Links
Debian GNU/Linux FAQ, Ch 8 Package management tools
man pages for apt-get, dpkg, sources.list, apt_preferences
Zero Touch Provisioning
Zero touch provisioning allows devices to be quickly deployed in large-scale environments. Data center
engineers only need to rack and stack the switch, then connect it to the management network. From
here the provisioning process can start automatically and deploy a configuration.
The provisioning framework allows for a one-time, user-provided script to be executed. This script can
be used to add the switch to a configuration management (CM) platform such as Puppet, Chef,
CFEngine, or even a custom, home-grown tool.
In addition, you can use the autoprovision command in Cumulus Linux to invoke your provisioning
script.
Provisioning initially takes place over the management network and is initiated via a DHCP hook. A
DHCP option is used to specify a configuration script. This script is then requested from the Web server
and executed locally on the switch.
The standard Cumulus Linux license requires you to page through the license file before
accepting the terms, which can hinder an unattended installation like zero touch provisioning.
To request a license without the EULA, email licensing@cumulusnetworks.com.
Contents
Contents (see page 110)
Commands (see page 111)
Zero Touch Provisioning Process (see page 111)
Specifying DHCP Option 239 (see page 111)
HTTP Headers (see page 112)
Script Requirements (see page 112)
Example Scripts (see page 112)
Using the autoprovision Command (see page 113)
Notes (see page 114)
Configuration Files (see page 114)
Commands
autoprovision
Zero Touch Provisioning Process
The zero touch provisioning process involves these steps:
1. The first time you boot Cumulus Linux, eth0 is configured for DHCP and makes a DHCP request.
2. The DHCP server offers a lease to the switch.
3. If option 239 is present in the response, the zero touch provisioning process itself will start.
4. The zero touch provisioning process requests the contents of the script from the URL, sending
additional HTTP headers (see page 112) containing details about the switch.
5. The script’s contents are parsed to ensure it contains the CUMULUS-AUTOPROVISIONING flag.
6. If the CUMULUS-AUTOPROVISIONING flag is present, then the script executes locally on the
switch.
7. The return code of the script gets examined. If it is 0, then the provisioning state is marked as
complete.
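Steps 5 through 7 amount to a small decision procedure. The Python sketch below models that logic only; it is not the autoprovision implementation. The run parameter and the returned state names are assumptions made for illustration.

```python
FLAG = "CUMULUS-AUTOPROVISIONING"


def provision(script_text, run):
    """Model steps 5-7: check the flag, run the script, inspect the exit code.

    `run` is a hypothetical callable that executes the script and returns
    its exit code; it stands in for local execution on the switch.
    """
    if FLAG not in script_text:
        return "skipped"          # step 5: flag missing, script is not run
    exit_code = run(script_text)  # step 6: execute locally
    return "complete" if exit_code == 0 else "incomplete"  # step 7


script = "#!/bin/bash\n# CUMULUS-AUTOPROVISIONING\nexit 0\n"
print(provision(script, lambda text: 0))
```

The practical consequence is that a script which omits the flag never runs, and a script that exits with a non-zero code leaves provisioning unmarked.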
Specifying DHCP Option 239
During the DHCP process over eth0, Cumulus Linux will request DHCP option 239. This option is used
to specify the custom provisioning script.
For example, the dhcpd.conf file for an ISC DHCP server could look like:
option cumulus-provision-url code 239 = text;
subnet 192.168.0.0 netmask 255.255.255.0 {
  range 192.168.0.100 192.168.0.200;
  option cumulus-provision-url "http://192.168.0.2/demo.sh";
}
Additionally, the hostname of the switch can be specified via the host-name option:
subnet 192.168.0.0 netmask 255.255.255.0 {
  range 192.168.0.100 192.168.0.200;
  option cumulus-provision-url "http://192.168.0.2/demo.sh";
  host dc1-tor-sw1 {
    hardware ethernet 44:38:39:00:1a:6b;
    fixed-address 192.168.0.101;
    option host-name "dc1-tor-sw1";
  }
}
HTTP Headers
The following HTTP headers are sent in the request to the Web server to retrieve the provisioning
script:
Header                     Value             Example
------                     -----             -------
User-Agent                                   CumulusLinux-AutoProvision/0.4
CUMULUS-ARCH               CPU architecture  powerpc
CUMULUS-BUILD                                1.5.1-5c6829a-201309251712-final
CUMULUS-LICENSE-INSTALLED  Either 0 or 1     1
CUMULUS-MANUFACTURER                         dni
CUMULUS-PRODUCTNAME                          et-7448bf
CUMULUS-SERIAL                               XYZ123004
CUMULUS-VERSION                              1.5.1
CUMULUS-PROV-COUNT                           0
CUMULUS-PROV-MAX                             32
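Because these headers describe the requesting switch, a Web server could use them to hand out different scripts to different switches. The Python sketch below illustrates the idea under that assumption; the function name pick_script and the script paths are hypothetical, and the guide itself does not prescribe any server-side logic.

```python
def pick_script(headers):
    """Choose a provisioning script path from the request headers.

    The paths returned here are placeholders for whatever layout the
    Web server actually uses.
    """
    if headers.get("CUMULUS-LICENSE-INSTALLED") != "1":
        return "/ztp/install-license.sh"
    arch = headers.get("CUMULUS-ARCH", "unknown")
    return "/ztp/{0}/provision.sh".format(arch)


request_headers = {
    "CUMULUS-ARCH": "powerpc",
    "CUMULUS-LICENSE-INSTALLED": "1",
    "CUMULUS-SERIAL": "XYZ123004",
}
print(pick_script(request_headers))
```

HTTP header names are case-insensitive, so a real server framework may present them in a different case than the table shows.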
Script Requirements
The script contents must contain the CUMULUS-AUTOPROVISIONING flag. This can be in a comment or
remark and does not need to be echoed or written to stdout.
The script can be written in any language currently supported by Cumulus Linux, such as:
Perl
Python
Ruby
Shell
The script must return an exit code of 0 upon success, as this triggers the provisioning process to be
marked as complete.
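As an illustration of these requirements in a language other than shell, here is a minimal Python skeleton. It is a sketch only: the STEPS list is a placeholder, and run_steps is a helper defined here, not something Cumulus Linux provides. The flag appears in a comment, and the helper returns 0 only when every command succeeds.

```python
#!/usr/bin/env python
# CUMULUS-AUTOPROVISIONING
import subprocess
import sys

# Placeholder commands; replace with the steps your site needs.
STEPS = [
    ["apt-get", "update", "-y"],
    ["apt-get", "install", "puppet", "-y"],
]


def run_steps(steps):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for argv in steps:
        code = subprocess.call(argv)
        if code != 0:
            sys.stderr.write("ERROR: command failed: %s\n" % " ".join(argv))
            return code
    return 0

# On the switch, finish the script with:
#     sys.exit(run_steps(STEPS))
```

Stopping at the first failure and passing its exit code through means provisioning is not marked complete after a partial run.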
Example Scripts
Here is a simple script to install puppet:
#!/bin/bash
function error() {
  echo -e "\e[0;33mERROR: The Zero Touch Provisioning script failed while running the command $BASH_COMMAND at line $BASH_LINENO.\e[0m" >&2
  exit 1
}
trap error ERR
apt-get update -y
apt-get upgrade -y
apt-get install puppet -y
sed -i /etc/default/puppet -e 's/START=no/START=yes/'
sed -i /etc/puppet/puppet.conf -e 's/\[main\]/\[main\]\npluginsync=true/'
service puppet restart
# CUMULUS-AUTOPROVISIONING
exit 0
This script illustrates how to specify an internal APT mirror and puppet master:
#!/bin/bash
function error() {
  echo -e "\e[0;33mERROR: The Zero Touch Provisioning script failed while running the command $BASH_COMMAND at line $BASH_LINENO.\e[0m" >&2
  exit 1
}
trap error ERR
sed -i /etc/apt/sources.list -e 's/repo.cumulusnetworks.com/labrepo.mycompany.com/'
apt-get update -y
apt-get upgrade -y
apt-get install puppet -y
sed -i /etc/default/puppet -e 's/START=no/START=yes/'
sed -i /etc/puppet/puppet.conf -e 's/\[main\]/\[main\]\npluginsync=true/'
sed -i /etc/puppet/puppet.conf -e 's/\[main\]/\[main\]\nserver=labpuppet.mycompany.com/'
service puppet restart
# CUMULUS-AUTOPROVISIONING
exit 0
Now puppet can take over management of the switch, configuring authentication, changing the
default root password, and setting up interfaces and routing protocols.
Using the autoprovision Command
You can directly invoke your provisioning script by running the autoprovision command. You can
also use this command to enable and disable zero touch provisioning on the switch. Be sure to specify
the full path to the command, as in the examples below.
To enable zero touch provisioning, use the -e option:
cumulus@switch:~$ sudo /usr/lib/cumulus/autoprovision -e
To run the provisioning script, use the -u option and include the URL to the script:
cumulus@switch:~$ sudo /usr/lib/cumulus/autoprovision -u http://192.168.0.1/ztp.sh
To disable zero touch provisioning, use the -x option:
cumulus@switch:~$ sudo /usr/lib/cumulus/autoprovision -x
To enable startup discovery mode, without relying on DHCP when you boot the switch, use the -s
option:
cumulus@switch:~$ sudo /usr/lib/cumulus/autoprovision -s
Notes
During the development of a provisioning script, the switch may need to be reset.
You can use the Cumulus Linux cl-img-clear-overlay command to revert the image to its
original configuration.
You can use the Cumulus Linux cl-img-select -i command to cause the switch to reprovision
itself and install a network operating system again using ONIE.
You can trigger the zero touch provisioning process when eth0 is set to use DHCP and one of
the following events occurs:
Booting the switch
Plugging a cable into or unplugging it from the eth0 port
Disconnecting then reconnecting the switch’s power cord
Configuration Files
/var/lib/cumulus/autoprovision.conf
Netfilter - ACLs
Netfilter is the packet filtering framework in Cumulus Linux, as well as every other Linux distribution.
iptables, ip6tables and ebtables are userspace tools in Linux to administer filtering rules for IPv4
packets, IPv6 packets and Ethernet frames respectively. cl-acltool is the userspace tool to
administer filtering rules on Cumulus Linux, and is the only tool for configuring ACLs in Cumulus Linux.
cl-acltool operates on a series of configuration files, and uses iptables, ip6tables and ebtables
to install rules into the kernel. In addition to programming rules in the kernel, cl-acltool programs
rules in hardware for interfaces involving switch port interfaces, which iptables, ip6tables and
ebtables do not do on their own.
Contents
Contents (see page 115)
Commands (see page 115)
Files (see page 115)
Netfilter Framework in the Cumulus Linux Kernel (see page 116)
Limitations on Number of Rules (see page 116)
Enabling Nonatomic Updates (see page 117)
ebtables and Memory Spaces (see page 118)
Memory Spaces with Multiple Command Line Options (see page 118)
Installing Packet Filtering (ACL) Rules using cl-acltool (see page 118)
Specifying which Policy Files to Install (see page 121)
Managing ACL Rules with cl-acltool (see page 121)
Further Examples (see page 122)
cl-acltool and Network Troubleshooting (see page 122)
Policing Control Plane and Data Plane Traffic (see page 122)
Useful Links (see page 124)
Caveats and Errata (see page 124)
Not All Rules Supported (see page 124)
iptables Interactions with cl-acltool (see page 124)
Where to Assign Rules (see page 125)
Generic Error Message Displayed after ACL Rule Installation Failure (see page 125)
Commands
cl-acltool
ebtables
iptables
ip6tables
Files
/etc/cumulus/acl/policy.conf
/etc/cumulus/acl/policy.d/
Netfilter Framework in the Cumulus Linux Kernel
Netfilter uses a table-based system for packet filtering. Tables are hooks in the kernel for packet
filtering. Each table has a set of default chains, or categories of ACL rules. Each chain contains packet
filter rules.
The default table in Netfilter is the filter table. The three chains in the filter table are:
INPUT chain, for network traffic going to the switch
OUTPUT chain, for traffic emanating from the switch
FORWARD chain, for traffic being forwarded or routed through the switch
Cumulus Linux, like all Linux distributions, divides ACLs into chains. ACLs are handled both in hardware
and software depending on which chain you use.
Data to Filter        iptables Chain  Hardware Accelerated?
Data plane egress     FORWARD (-o)    Yes
Data plane ingress    FORWARD (-i)    Yes
Control plane input   INPUT           Yes
Control plane output  OUTPUT          No
Limitations on Number of Rules
The maximum number of rules that can be handled in hardware depends on the platform type
(Apollo2, Firebolt2, Triumph, Trident, Trident+ or Trident II) and on the mix of IPv4 and IPv6 rules.
See the HCL to determine which switches operate on these platforms.
Apollo2 and Triumph2 Limits
Direction  Atomic Mode  Atomic Mode  Nonatomic Mode  Nonatomic Mode
           IPv4 Rules   IPv6 Rules   IPv4 Rules      IPv6 Rules
Ingress    2048         1024         4096            2048
Egress     512          256          1024            512
Firebolt2 Limits
Direction  Atomic Mode  Atomic Mode  Nonatomic Mode  Nonatomic Mode
           IPv4 Rules   IPv6 Rules   IPv4 Rules      IPv6 Rules
Ingress    1024         512          2048            1024
Egress     512          256          512             256
Trident/Trident+ Limits
Direction  Atomic Mode  Atomic Mode  Nonatomic Mode  Nonatomic Mode
           IPv4 Rules   IPv6 Rules   IPv4 Rules      IPv6 Rules
Ingress    384          384          1024            1024
Egress     512          256          1024            512
Trident II Limits
Direction  Atomic Mode  Atomic Mode  Nonatomic Mode  Nonatomic Mode
           IPv4 Rules   IPv6 Rules   IPv4 Rules      IPv6 Rules
Ingress    1024         1024         2048            2048
Egress     512          256          1024            512
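When planning a policy, it can help to check a rule count against these limits before installing it. The Python sketch below encodes the figures from the tables above. The function names are illustrative, and the sketch treats each figure as an independent maximum, which is a simplification: it does not model how a mix of IPv4 and IPv6 rules shares the hardware.

```python
# (platform, direction) -> (atomic IPv4, atomic IPv6, nonatomic IPv4, nonatomic IPv6)
LIMITS = {
    ("Apollo2/Triumph2", "ingress"): (2048, 1024, 4096, 2048),
    ("Apollo2/Triumph2", "egress"): (512, 256, 1024, 512),
    ("Firebolt2", "ingress"): (1024, 512, 2048, 1024),
    ("Firebolt2", "egress"): (512, 256, 512, 256),
    ("Trident/Trident+", "ingress"): (384, 384, 1024, 1024),
    ("Trident/Trident+", "egress"): (512, 256, 1024, 512),
    ("Trident II", "ingress"): (1024, 1024, 2048, 2048),
    ("Trident II", "egress"): (512, 256, 1024, 512),
}


def max_rules(platform, direction, family, nonatomic=False):
    """Look up the rule limit for one platform, direction and address family."""
    row = LIMITS[(platform, direction)]
    index = (2 if nonatomic else 0) + (0 if family == "ipv4" else 1)
    return row[index]


def fits(platform, direction, family, count, nonatomic=False):
    """Return True if `count` rules stay within the tabulated limit."""
    return count <= max_rules(platform, direction, family, nonatomic)


print(fits("Trident II", "ingress", "ipv4", 1500))
print(fits("Trident II", "ingress", "ipv4", 1500, nonatomic=True))
```

In this example, 1500 IPv4 ingress rules exceed the Trident II atomic limit of 1024 but stay within the nonatomic limit of 2048.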
Enabling Nonatomic Updates
You can enable nonatomic updates for switchd, which offer better scaling because all hardware
resources are used to actively impact traffic. With atomic updates, half of the hardware resources are
on standby and do not actively impact traffic.
To always start switchd with nonatomic updates:
1. Edit /etc/cumulus/switchd.conf.
2. Add the following line to the file:
acl.non_atomic_update_mode = TRUE
3. Restart switchd:
cumulus@switch:~$ sudo service switchd restart
During nonatomic updates, traffic is stopped first, and enabled after the new configuration is
written into the hardware completely.
ebtables and Memory Spaces
ebtables rules are put into either the IPv4 or IPv6 memory space depending on whether the rule
utilizes IPv4 or IPv6 to make a decision. L2-only rules, which match the MAC address, are put into the
IPv4 memory space.
Memory Spaces with Multiple Command Line Options
INPUT and ingress (FORWARD -i) rules occupy the same memory space. A rule counts as ingress if the
-i option is set. If both the input and output options (-i and -o) are set, the rule is considered ingress
and shares that memory space. For example:
-A FORWARD -i swp1 -o swp2 -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT
If you set an output flag with the INPUT chain, you get an error. For example, running cl-acltool -i
on the following rule:
-A FORWARD,INPUT -i swp1 -o swp2 -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT
generates the following error:
error: line 2 : output interface specified with INPUT chain
error processing rule '-A FORWARD,INPUT -i swp1 -o swp2 -s 10.0.14.2
-d 10.0.15.8 -p tcp -j ACCEPT'
However, simply removing the -o option and interface would make it a valid rule.
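The classification described in this section can be written down as a short function. The following Python sketch is an interpretation of the text, not the logic cl-acltool actually uses; the function name memory_space is hypothetical, and it returns None for cases this section does not cover (for example, a FORWARD rule with neither -i nor -o).

```python
def memory_space(rule):
    """Classify a rule as 'ingress' or 'egress' based on chain and interface options."""
    tokens = rule.split()
    chains = tokens[tokens.index("-A") + 1].split(",")
    has_in = "-i" in tokens or "--in-interface" in tokens
    has_out = "-o" in tokens or "--out-interface" in tokens
    if "INPUT" in chains and has_out:
        raise ValueError("output interface specified with INPUT chain")
    if has_in or "INPUT" in chains:
        return "ingress"  # -i wins even when -o is also present
    if has_out:
        return "egress"
    return None  # not covered by the text above


print(memory_space("-A FORWARD -i swp1 -o swp2 -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT"))
```

Passing the FORWARD,INPUT rule from the example above to this function raises ValueError, mirroring the error that cl-acltool reports.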
Installing Packet Filtering (ACL) Rules using cl-acltool
cl-acltool takes access control list (ACL) rules input in files. Each ACL policy file contains iptables,
ip6tables and ebtables categories under the tags [iptables], [ip6tables] and [ebtables]
respectively.
Each rule in an ACL policy must be assigned to one of the rule categories above.
See man cl-acltool(5) for ACL rule details. For iptables rule syntax, see man iptables(8). For
ip6tables rule syntax, see man ip6tables(8). For ebtables rule syntax, see man ebtables(8).
See man cl-acltool(5) and man cl-acltool(8) for further details on using cl-acltool;
however some examples are listed below, and more are listed in the Cumulus Networks Help Center.
The default directory for ACL policy files is /etc/cumulus/acl/policy.d. By default, all *.rules files
in this directory are included in /etc/cumulus/acl/policy.conf. And by default all files included in
this policy.conf file are installed when the switch boots up.
Here is an example ACL policy file:
[iptables]
-A INPUT --in-interface swp1 -p tcp --dport 80 -j ACCEPT
-A FORWARD --in-interface swp1 -p tcp --dport 80 -j ACCEPT
[ip6tables]
-A INPUT --in-interface swp1 -p tcp --dport 80 -j ACCEPT
-A FORWARD --in-interface swp1 -p tcp --dport 80 -j ACCEPT
[ebtables]
-A INPUT -p IPv4 -j ACCEPT
-A FORWARD -p IPv4 -j ACCEPT
Variables can be used to specify chain and interface lists to ease administration of rules:
INGRESS = swp+
INPUT_PORT_CHAIN = INPUT,FORWARD
[iptables]
-A $INPUT_PORT_CHAIN --in-interface $INGRESS -p tcp --dport 80 -j ACCEPT
[ip6tables]
-A $INPUT_PORT_CHAIN --in-interface $INGRESS -p tcp --dport 80 -j ACCEPT
[ebtables]
-A INPUT -p IPv4 -j ACCEPT
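To make the effect of variables concrete, the Python sketch below models the expansion conceptually: each $NAME is replaced by its value, and a comma-separated chain list yields one rule per chain. This is an assumption inferred from the two example files above, which appear equivalent; it is not taken from the cl-acltool source, and the function name expand is hypothetical.

```python
import re


def expand(rule, variables):
    """Substitute $NAME variables, then emit one rule per listed chain."""
    text = re.sub(r"\$(\w+)", lambda match: variables[match.group(1)], rule)
    tokens = text.split()
    chains = tokens[1].split(",")  # tokens[0] is "-A"
    rest = " ".join(tokens[2:])
    return ["-A %s %s" % (chain, rest) for chain in chains]


variables = {"INGRESS": "swp+", "INPUT_PORT_CHAIN": "INPUT,FORWARD"}
rule = "-A $INPUT_PORT_CHAIN --in-interface $INGRESS -p tcp --dport 80 -j ACCEPT"
for line in expand(rule, variables):
    print(line)
```

Under this model, the single variable-based rule covers both the INPUT and FORWARD chains, which is why the variable form needs only one line per section.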
ACL rules for the system can be written into multiple files under the default
/etc/cumulus/acl/policy.d/ directory. The ordering of rules during installation follows the sorted order
of the files based on file names.
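Because installation order follows file names, a numeric prefix is a convenient way to control it. The Python sketch below previews that order, assuming a plain lexical sort of the names (the guide does not spell out the exact collation); install_order is an illustrative helper and the file names are examples.

```python
def install_order(filenames):
    """Return the .rules files in the order their rules would be installed."""
    return sorted(name for name in filenames if name.endswith(".rules"))


names = ["01sample_datapath.rules", "00sample_mgmt.rules", "notes.txt"]
print(install_order(names))
```

With a lexical sort, a name starting with 10 sorts before one starting with 9, so zero-padded prefixes such as 00, 01 and 10 keep the order predictable.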
Use multiple files to stack rules. The example below shows two rules files separating rules for
management and datapath traffic:
cumulus@switch:~$ ls /etc/cumulus/acl/policy.d/
00sample_mgmt.rules
01sample_datapath.rules
cumulus@switch:~$ cat /etc/cumulus/acl/policy.d/00sample_mgmt.rules
INGRESS_INTF = swp+
INGRESS_CHAIN = INPUT
[iptables]
# protect the switch management
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s 10.0.11.2 -d 10.0.12.8 -p tcp -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -d 10.0.16.8 -p udp -j DROP
cumulus@switch:~$ cat /etc/cumulus/acl/policy.d/01sample_datapath.rules
INGRESS_INTF = swp+
INGRESS_CHAIN = INPUT, FORWARD
[iptables]
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s 192.0.2.5 -p icmp -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s 192.0.2.6 -d 192.0.2.4 -j DROP
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s 192.0.2.2 -d 192.0.2.8 -j DROP
Install all ACL policies under a directory:
cumulus@switch:~$ sudo cl-acltool -i -P ./rules
Reading files under rules
Reading rule file ./rules/01_http_rules.txt ...
Processing rules in file ./rules/01_http_rules.txt ...
Installing acl policy ...
Done.
Install all rules and policies included in /etc/cumulus/acl/policy.conf:
cumulus@switch:~$ sudo cl-acltool -i
Specifying which Policy Files to Install
By default, any .rules file you configure in /etc/cumulus/acl/policy.d/ will be installed by
Cumulus Linux. To add other policy files to an ACL, you need to include them in
/etc/cumulus/acl/policy.conf. For example, in order for Cumulus Linux to install a rule in a policy file
called 01_new.acl, you would add include /etc/cumulus/acl/policy.d/01_new.acl to policy.conf, as in
this example:
cumulus@switch:~$ sudo vi /etc/cumulus/acl/policy.conf
#
# This file is a master file for acl policy file inclusion
#
# Note: This is not a file where you list acl rules.
#
# This file can contain:
# - include lines with acl policy files
#   example:
#     include <filepath>
#
# see manpage cl-acltool(5) and cl-acltool(8) for how to write policy files
#
include /etc/cumulus/acl/policy.d/*.rules
include /etc/cumulus/acl/policy.d/01_new.acl
Managing ACL Rules with cl-acltool
You manage Cumulus Linux ACLs with cl-acltool. Rules are first written to the iptables chains, as
described above, and then synced to hardware via switchd.
To examine the current state of chains and list all installed rules, run:
cumulus@switch:~$ sudo cl-acltool -L all
-------------------------------
Listing rules of type iptables:
-------------------------------
TABLE filter :
Chain INPUT (policy ACCEPT 90 packets, 14456 bytes)
pkts bytes target prot opt in out source destination
0 0 DROP all -- swp+ any 240.0.0.0/5 anywhere
0 0 DROP all -- swp+ any loopback/8 anywhere
0 0 DROP all -- swp+ any base-address.mcast.net/8 anywhere
0 0 DROP all -- swp+ any 255.255.255.255 anywhere
...
To list installed rules using native iptables, ip6tables and ebtables, run these commands:
cumulus@switch:~$ sudo iptables -L
cumulus@switch:~$ sudo ip6tables -L
cumulus@switch:~$ sudo ebtables -L
To flush all installed rules, run:
cumulus@switch:~$ sudo cl-acltool -F all
To flush only the IPv4 iptables rules, run:
cumulus@switch:~$ sudo cl-acltool -F ip
If the installation fails, ACL rules in the kernel and hardware are rolled back to their previous state.
Errors from programming rules in the kernel or BCM hardware are reported.
Further Examples
More examples demonstrating how to use cl-acltool are available in the Help Center.
cl-acltool and Network Troubleshooting
You use cl-acltool for both system diagnostics and troubleshooting the whole network. See
Network Troubleshooting (see page 82) for information on using ACLs for counting rules (see page 85)
as well as monitoring packets via SPAN and ERSPAN (see page 86).
Policing Control Plane and Data Plane Traffic
You can configure quality of service for traffic on both the control plane and the data plane. By using
QoS policers, you can rate limit traffic so incoming packets get dropped if they exceed specified
thresholds.
Use the POLICE target with iptables. POLICE takes these arguments:
--set-class value: Sets the system internal class of service queue configuration to value.
--set-rate value: Specifies the maximum rate in kilobytes (KB) or packets.
--set-burst value: Specifies the number of packets or kilobytes (KB) allowed to arrive
sequentially.
--set-mode string: Sets the mode in KB (kilobytes) or pkt (packets) for rate and burst size.
For example, to rate limit the incoming traffic on swp1 to 400 packets/second with a burst of 100
packets and set the class of the queue for the policed traffic to 0, add this rule to the
appropriate .rules file:
-A INPUT --in-interface swp1 -j POLICE --set-mode pkt --set-rate 400 --set-burst 100 --set-class 0
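If you generate policy files from a template, a small helper can assemble POLICE rules from the four arguments listed above. The Python sketch below only builds the rule text; the function name police_rule is hypothetical, and the accepted mode values (pkt and KB) are taken from the --set-mode description.

```python
def police_rule(chain, interface, mode, rate, burst, queue_class):
    """Build a POLICE rule line for a .rules file."""
    if mode not in ("pkt", "KB"):
        raise ValueError("mode must be 'pkt' or 'KB'")
    return (
        "-A %s --in-interface %s -j POLICE --set-mode %s "
        "--set-rate %d --set-burst %d --set-class %d"
        % (chain, interface, mode, rate, burst, queue_class)
    )


print(police_rule("INPUT", "swp1", "pkt", 400, 100, 0))
```

Called with the values from the example above, the helper reproduces the same rule line.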
Here is another example of control plane ACL rules to lock down the switch. This is specified in
/etc/cumulus/acl/policy.d/00control_plane.rules:
INGRESS_INTF = swp+
INGRESS_CHAIN = INPUT
INNFWD_CHAIN = INPUT,FORWARD
MARTIAN_SOURCES_4 = "240.0.0.0/5,127.0.0.0/8,224.0.0.0/8,255.255.255.255/32"
MARTIAN_SOURCES_6 = "ff00::/8,::/128,::ffff:0.0.0.0/96,::1/128"
#Custom Policy Section
SSH_SOURCES_4 = "192.168.0.0/24"
NTP_SERVERS_4 = "192.168.0.1/32,192.168.0.4/32"
DNS_SERVERS_4 = "192.168.0.1/32,192.168.0.4/32"
SNMP_SERVERS_4 = "192.168.0.1/32"
[iptables]
-A $INNFWD_CHAIN --in-interface $INGRESS_INTF -s $MARTIAN_SOURCES_4 -j DROP
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p ospf -j POLICE --set-mode pkt --set-rate 2000 --set-burst 2000 --set-class 7
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --dport bgp -j POLICE --set-mode pkt --set-rate 2000 --set-burst 2000 --set-class 7
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --sport bgp -j POLICE --set-mode pkt --set-rate 2000 --set-burst 2000 --set-class 7
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p icmp -j POLICE --set-mode pkt --set-rate 100 --set-burst 40 --set-class 2
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --dport bootps:bootpc -j POLICE --set-mode pkt --set-rate 100 --set-burst 100 --set-class 2
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --dport bootps:bootpc -j POLICE --set-mode pkt --set-rate 100 --set-burst 100 --set-class 2
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p igmp -j POLICE --set-mode pkt --set-rate 300 --set-burst 100 --set-class 6
# Custom policy
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --dport 22 -s $SSH_SOURCES_4 -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --sport 123 -s $NTP_SERVERS_4 -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --sport 53 -s $DNS_SERVERS_4 -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --dport 161 -s $SNMP_SERVERS_4 -j ACCEPT
# Allow UDP traceroute when we are the current TTL expired hop
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --dport 1024:65535 -m ttl --ttl-eq 1 -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -j DROP
Useful Links
http://www.netfilter.org/
http://www.netfilter.org/documentation/HOWTO//packet-filtering-HOWTO-6.html
Caveats and Errata
Not All Rules Supported
Please note that not all iptables and ebtables rules are fully supported. See man cl-acltool(5)
for more information.
Further, there is no way to implement or extend transit filtering in software, and there is no way to
hardware accelerate the OUTPUT chain. If the maximum number of rules for a particular table is
exceeded, cl-acltool -i generates the following error:
error: hw sync failed (sync_acl hardware installation failed)
Rolling back ..
failed.
iptables Interactions with cl-acltool
Since Cumulus Linux is a Linux operating system, the iptables commands can be used directly and
will work. However, you should consider using cl-acltool instead because:
Without using cl-acltool, rules are not installed into hardware.
Running cl-acltool -i (the installation command) will reset all rules and delete anything that
is not stored in /etc/cumulus/acl/policy.conf.
For example, the following command works:
cumulus@switch:~$ sudo iptables -A INPUT -p icmp --icmp-type echo-request -j DROP
The rule then appears when you run cl-acltool -L:
cumulus@switch:~$ sudo cl-acltool -L ip
-------------------------------
Listing rules of type iptables:
-------------------------------
TABLE filter :
Chain INPUT (policy ACCEPT 72 packets, 5236 bytes)
pkts bytes target prot opt in out source destination
0 0 DROP icmp -- any any anywhere anywhere icmp echo-request
However, running cl-acltool -i or rebooting the switch removes them. To ensure all rules that can be
in hardware are hardware accelerated, place them in a policy file included by
/etc/cumulus/acl/policy.conf and run cl-acltool -i.
Where to Assign Rules
If a switch port is assigned to a bond, any egress rules must be assigned to the bond.
When using the OUTPUT chain, rules must be assigned to the source. For example, if a rule is
assigned to the switch port in the direction of traffic but the source is a bridge (VLAN), the rule
does not affect the traffic; it must be applied to the bridge instead.
If all transit traffic needs to have a rule applied, use the FORWARD chain, not the OUTPUT chain.
Generic Error Message Displayed after ACL Rule Installation Failure
After an ACL rule installation failure, a generic error message like the following is displayed:
cumulus@switch:~$ sudo cl-acltool -i -p 00control_plane.rules
Using user provided rule file 00control_plane.rules
Reading rule file 00control_plane.rules ...
Processing rules in file 00control_plane.rules ...
error: hw sync failed (sync_acl hardware installation failed)
Installing acl policy... Rolling back ..
failed.
Configuring and Managing Network Interfaces
ifupdown is the network interface manager for Cumulus Linux. Cumulus Linux 2.1 and later uses an
updated version of this tool, ifupdown2.
For more information on network interfaces, see Understanding Network Interfaces (see page 154).
Keep the following points in mind before you start configuring interfaces using ifupdown2:
IPv4 and IPv6 addresses for an interface can be listed in the same iface section. For examples,
see /usr/share/doc/python-ifupdown2/examples/.
Legacy interface aliases provided a way to assign multiple IP addresses to the same interface. If
you assign multiple IP addresses on the same interface in Linux using iproute2 commands, do
not use legacy interface aliases, as they are only supported for backward compatibility with
ifupdown. They do get configured, but ifquery has problems recognizing them.
Do not confuse the term alias as it's used here with the interface description that is
also called "alias".
ifupdown2 only understands interfaces that were configured using ifupdown. Any interfaces
created with a command other than ifupdown (like brctl) must be de-configured in the same
manner.
Use globs for port lists wherever applicable. Regular expressions work as well, however regular
expressions require all matching interfaces to be present in the /etc/network/interfaces
file. And declaring all interfaces in the interfaces file leads to losing all the advantages that
built-in interfaces provide.
Extensions to ifquery help with validation and debugging.
By default, ifupdown is quiet; use the verbose option -v when you want to know what is going
on when bringing an interface down or up.
Contents
Contents (see page 126)
Commands (see page 127)
Man Pages (see page 127)
Configuration Files (see page 127)
Basic Commands (see page 127)
ifupdown2 Built-in Interfaces (see page 128)
ifupdown2 Interface Dependencies (see page 128)
ifup Handling of Upper (Parent) Interfaces (see page 131)
Bringing All auto Interfaces Up or Down (see page 132)
Configuring IP Addresses (see page 133)
Purging Existing IP Addresses on an Interface (see page 133)
Specifying User Commands (see page 134)
Sourcing Interface File Snippets (see page 134)
Using Globs for Port Lists (see page 135)
Using Templates (see page 135)
Adding Descriptions to Interfaces (see page 136)
Caveats and Errata (see page 136)
Useful Links (see page 137)
Commands
ifdown
ifquery
ifreload
ifup
mako-render
Man Pages
The following man pages have been updated for Cumulus Linux 2.1:
man ifdown(8)
man ifquery(8)
man ifreload(8)
man ifup(8)
man ifupdown-addons-interfaces(5)
man interfaces(5)
Configuration Files
/etc/network/interfaces
Basic Commands
To bring up an interface or apply changes to an existing interface, run:
cumulus@switch:~$ sudo ifup <ifname>
To bring down a single interface, run:
cumulus@switch:~$ sudo ifdown <ifname>
ifdown always deletes logical interfaces after bringing them down. Use the --admin-state
option if you only want to administratively bring the interface up or down.
ifupdown2 Built-in Interfaces
By default, ifupdown2 recognizes VLAN interfaces and physical interfaces that may appear as
dependents. There is no need to list them in the interfaces file unless they need a specific
configuration or they need to match a regular expression used in the interfaces file. Use globs to
avoid limitations with regular expressions.
For example, swp1.100 and swp2.100 below do not need an entry in the interfaces file:
auto br-100
iface br-100
address 10.0.12.2/24
address 2001:dad:beef::3/64
bridge-ports swp1.100 swp2.100
bridge-stp on
Once you specify a dependent interface in /etc/network/interfaces, ifupdown2 no
longer treats it as a built-in interface, so you must bring it up and down with ifup and ifdown
as you would any other non-dependent interface.
ifupdown2 Interface Dependencies
ifupdown2 understands interface dependency relationships. When ifup and ifdown are run with all
interfaces, they always run with all interfaces in dependency order. When run with the interface list on
the command line, the default behavior is to not run with dependents. But if there are any built-in
dependents, they will be brought up or down.
To run with dependents when you specify the interface list, use the --with-depends option.
--with-depends walks through all dependents in the dependency tree rooted at the interface you specify.
Consider the following example configuration:
auto bond1
iface bond1
address 100.0.0.2/16
bond-slaves swp29 swp30
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto bond2
iface bond2
address 100.0.0.5/16
bond-slaves swp31 swp32
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto br2001
iface br2001
address 12.0.1.3/24
bridge-ports bond1.2001 bond2.2001
bridge-stp on
Specifying ifup --with-depends br2001 brings up all dependents: bond1.2001, bond2.2001,
bond1, bond2, swp29, swp30, swp31, swp32.
Similarly, specifying ifdown --with-depends br2001 brings down all dependents: bond1.2001,
bond2.2001, bond1, bond2, swp29, swp30, swp31, swp32.
As mentioned earlier, ifdown always deletes logical interfaces after bringing them down.
Use the --admin-state option if you only want to administratively bring the interface up or
down. In terms of the above example, ifdown br2001 deletes br2001.
To guide you through which interfaces will be brought down and up, use the --print-dependency
option to get the list of dependents.
Use ifquery --print-dependency=list -a to get the dependency list of all interfaces:
cumulus@switch:~$ sudo ifquery --print-dependency=list -a
lo : None
eth0 : None
bond0 : ['swp25', 'swp26']
bond1 : ['swp29', 'swp30']
bond2 : ['swp31', 'swp32']
br0 : ['bond1', 'bond2']
bond1.2000 : ['bond1']
bond2.2000 : ['bond2']
br2000 : ['bond1.2000', 'bond2.2000']
bond1.2001 : ['bond1']
bond2.2001 : ['bond2']
br2001 : ['bond1.2001', 'bond2.2001']
swp40 : None
swp25 : None
swp26 : None
swp29 : None
swp30 : None
swp31 : None
swp32 : None
To print the dependency list of a single interface, use:
cumulus@switch:~$ sudo ifquery --print-dependency=list br2001
br2001 : ['bond1.2001', 'bond2.2001']
bond1.2001 : ['bond1']
bond2.2001 : ['bond2']
bond1 : ['swp29', 'swp30']
bond2 : ['swp31', 'swp32']
swp29 : None
swp30 : None
swp31 : None
swp32 : None
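The walk that --with-depends performs can be pictured with a short Python sketch. This is an illustration only, not the ifupdown2 implementation: the DEPENDENCIES mapping is copied by hand from the ifquery output above, and walk_dependents is a hypothetical helper name.

```python
# Dependency map copied by hand from the `ifquery --print-dependency=list br2001`
# output above. An empty list stands for "None".
DEPENDENCIES = {
    "br2001": ["bond1.2001", "bond2.2001"],
    "bond1.2001": ["bond1"],
    "bond2.2001": ["bond2"],
    "bond1": ["swp29", "swp30"],
    "bond2": ["swp31", "swp32"],
    "swp29": [], "swp30": [], "swp31": [], "swp32": [],
}


def walk_dependents(root, deps):
    """Return every interface below root, breadth-first, each listed once.

    Hypothetical helper: it only mimics the idea of walking the dependency
    tree rooted at one interface.
    """
    seen = []
    queue = list(deps.get(root, []))
    while queue:
        ifname = queue.pop(0)
        if ifname in seen:
            continue
        seen.append(ifname)
        queue.extend(deps.get(ifname, []))
    return seen


print(walk_dependents("br2001", DEPENDENCIES))
```

Each dependent appears exactly once in the result, even when it is reachable through more than one path.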
To print the dependency information of an interface in dot format:
cumulus@switch:~$ sudo ifquery --print-dependency=dot br2001
/* Generated by GvGen v.0.9 (http://software.inl.fr/trac/wiki/GvGen) */
digraph G {
compound=true;
node1 [label="br2001"];
node2 [label="bond1.2001"];
node3 [label="bond2.2001"];
node4 [label="bond1"];
node5 [label="bond2"];
node6 [label="swp29"];
node7 [label="swp30"];
node8 [label="swp31"];
node9 [label="swp32"];
node1->node2;
node1->node3;
node2->node4;
node3->node5;
node4->node6;
node4->node7;
node5->node8;
node5->node9;
}
You can use dot to render the graph on an external system where dot is installed.
To print the dependency information of the entire interfaces file:
cumulus@switch:~$ sudo ifquery --print-dependency=dot -a >interfaces_all.dot
ifup Handling of Upper (Parent) Interfaces
When you run ifup on a logical interface (like a bridge, bond or VLAN interface), if the ifup resulted in
the creation of the logical interface, by default it implicitly tries to execute on the interface's upper (or
parent) interfaces as well. This helps in most cases, especially when a bond is brought down and up, as
in the example below. This section describes the behavior of bringing up the upper interfaces.
Consider this example configuration:
auto br100
iface br100
bridge-ports bond1.100 bond2.100
auto bond1
iface bond1
bond-slaves swp1 swp2
If you run ifdown bond1, ifdown deletes bond1 and the VLAN interface on bond1 (bond1.100); it also
removes bond1 from the bridge br100. Next, when you run ifup bond1, it creates bond1 and the
VLAN interface on bond1 (bond1.100); it also executes ifup br100 to add the bond VLAN interface
(bond1.100) to the bridge br100.
As you can see above, implicitly bringing up the upper interface helps, but there can be cases where an
upper interface (like br100) is not in the right state, which can result in warnings. The warnings are
mostly harmless.
If you want to disable these warnings, you can disable the implicit upper interface handling by setting
skip_upperifaces=1 in /etc/network/ifupdown2/ifupdown2.conf.
With skip_upperifaces=1, you will have to explicitly execute ifup on the upper interfaces. In this
case, you will have to run ifup br100 after an ifup bond1 to add bond1 back to bridge br100.
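To see which upper interfaces need an explicit ifup when skip_upperifaces=1 is set, you can invert the dependency relationships. The Python sketch below is illustrative only: the LOWERS mapping is written by hand from the example configuration above, upper_interfaces is a hypothetical helper, and none of this is how ifupdown2 itself is implemented.

```python
# Which interfaces each logical interface is built on, written by hand from
# the example configuration (br100 over bond1.100/bond2.100, bond1 over swp1/swp2).
LOWERS = {
    "br100": ["bond1.100", "bond2.100"],
    "bond1.100": ["bond1"],
    "bond2.100": ["bond2"],
    "bond1": ["swp1", "swp2"],
}


def upper_interfaces(ifname, lowers):
    """Return the interfaces stacked on top of ifname, nearest first."""
    uppers = []
    frontier = [ifname]
    while frontier:
        current = frontier.pop(0)
        for upper, members in lowers.items():
            if current in members and upper not in uppers:
                uppers.append(upper)
                frontier.append(upper)
    return uppers


print(upper_interfaces("bond1", LOWERS))
```

For bond1 the sketch reports bond1.100 and then br100, which matches the manual ifup br100 step described above.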
Although specifying a subinterface like swp1.100 and then running ifup swp1.100 will also
result in the automatic creation of the swp1 interface in the kernel, Cumulus Networks
recommends you specify the parent interface swp1 as well. A parent interface is one where
any physical layer configuration can reside, such as link-speed 1000 or link-duplex
full.
It's important to note that if you only create swp1.100 and not swp1, then you cannot run
ifup swp1 since you did not specify it.
Bringing All auto Interfaces Up or Down
You can easily bring up or down all interfaces marked auto in /etc/network/interfaces. Use the -a
option. For further details, see individual man pages for ifup(8), ifdown(8), ifreload(8).
To administratively bring up all interfaces marked auto, run:
cumulus@switch:~$ sudo ifup -a --admin-state
To administratively bring down all interfaces marked auto, run:
cumulus@switch:~$ sudo ifdown -a --admin-state
To reload all network interfaces marked auto, use the ifreload command (which is equivalent to
running ifdown then ifup, except that it skips any configurations that didn't change):
cumulus@switch:~$ sudo ifreload -a
Configuring IP Addresses
In /etc/network/interfaces, list all IP addresses as shown below under the iface section (see
man interfaces for more information):
auto swp1
iface swp1
address 12.0.0.1/30
address 12.0.0.2/30
The address method and address family are not mandatory. They default to static and
inet/inet6, respectively, but inet/inet6 must be specified if you need to specify dhcp or loopback:
auto lo
iface lo inet loopback
You can specify both IPv4 and IPv6 addresses under the same iface section:
auto swp1
iface swp1
address 12.0.0.1/30
address 12.0.0.2/30
address 2001:dee:eeef:2::1/64
Purging Existing IP Addresses on an Interface
By default, ifupdown2 purges existing IP addresses on an interface. If you have other processes that
manage IP addresses for an interface, you can disable this feature by including the address-purge
setting in the interface's configuration. For example, add the following to the interface configuration in
/etc/network/interfaces:
auto swp2
iface swp2 inet static
address-purge no
Specifying User Commands
You can specify additional user commands in the interfaces file. As shown in the example below, the
interface stanzas in /etc/network/interfaces can have a command that runs at pre-up, up, post-up,
pre-down, down, and post-down:
auto swp1
iface swp1
address 12.0.0.1/30
up /sbin/foo bar
Any valid command can be hooked in the sequencing of bringing an interface up or down, although
commands should be limited in scope to network-related commands associated with the particular
interface.
For example, it wouldn't make sense to install some Debian package on ifup of swp1, even though
that is technically possible. See man interfaces for more details.
Sourcing Interface File Snippets
Sourcing interface files helps organize and manage the interfaces(5) file. For example:
cumulus@switch:~$ cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet dhcp
source /etc/network/interfaces.d/bond0
The contents of the sourced file used above are:
cumulus@switch:~$ cat /etc/network/interfaces.d/bond0
auto bond0
iface bond0
address 14.0.0.9/30
address 2001:ded:beef:2::1/64
bond-slaves swp25 swp26
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
Using Globs for Port Lists
Some modules support globs to describe port lists. You can use globs to specify bridge ports and bond
slaves:
auto br0
iface br0
bridge-ports glob swp1-6.100
auto br1
iface br1
bridge-ports glob swp7-9.100
swp11.100 glob swp15-18.100
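To check which ports a glob expression is expected to cover before you apply it, you can expand it by hand. The Python sketch below approximates the expansion for the simple numeric ranges used in the examples above. It is not the ifupdown2 parser: expand_glob and expand_port_list are hypothetical helpers, and only the name<start>-<end><suffix> form is handled.

```python
import re

# Matches tokens such as "swp1-6.100": a name, a numeric range, an optional suffix.
RANGE = re.compile(r"^(?P<prefix>[A-Za-z_]+)(?P<start>\d+)-(?P<end>\d+)(?P<suffix>.*)$")


def expand_glob(token):
    """Expand one range token; return it unchanged if it is not a range."""
    match = RANGE.match(token)
    if not match:
        return [token]
    start, end = int(match["start"]), int(match["end"])
    return [f"{match['prefix']}{n}{match['suffix']}" for n in range(start, end + 1)]


def expand_port_list(value):
    """Expand a bridge-ports or bond-slaves value that may contain 'glob' keywords."""
    ports = []
    tokens = value.split()
    i = 0
    while i < len(tokens):
        if tokens[i] == "glob" and i + 1 < len(tokens):
            ports.extend(expand_glob(tokens[i + 1]))
            i += 2
        else:
            ports.append(tokens[i])
            i += 1
    return ports


print(expand_port_list("glob swp7-9.100 swp11.100 glob swp15-18.100"))
```

Under these assumptions, the br1 example covers swp7.100 through swp9.100, swp11.100, and swp15.100 through swp18.100.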
Using Templates
ifupdown2 supports Mako-style templates. The Mako template engine is run over the interfaces file
before parsing.
Use the template to declare cookie-cutter bridges in the interfaces file:
%for v in [11,12]:
auto vlan${v}
iface vlan${v}
address 10.20.${v}.3/24
bridge-ports glob swp19-20.${v}
bridge-stp on
%endfor
And use it to declare addresses in the interfaces file:
%for i in [1,12]:
auto swp${i}
iface swp${i}
address 10.20.${i}.3/24
%endfor
Regarding Mako syntax, use square brackets ([1,12]) to specify a list of individual numbers
(in this case, 1 and 12). Use range(1,12) to specify a range of interfaces.
For more examples of configuring Mako templates, read this knowledge base article.
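Because Mako control lines are ordinary Python expressions, you can check what a list or range produces in a Python interpreter before using it in a template. The variable names below are illustrative only. Note that range() excludes its end value.

```python
# Square brackets list individual values; range() spans consecutive ones.
individual = [1, 12]
consecutive = list(range(1, 12))   # the end value, 12, is excluded
inclusive = list(range(1, 13))     # use 13 as the end value to include 12

print(individual)
print(consecutive)
print(["swp%d" % i for i in inclusive])
```

In a template, %for i in range(1,13): therefore covers swp1 through swp12, while range(1,12) stops at swp11.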
Adding Descriptions to Interfaces
You can add descriptions to the interfaces configured in /etc/network/interfaces by using the alias
keyword. For example:
auto swp1
iface swp1
alias swp1 hypervisor_port_1
You can query interface descriptions by running ip link show. The alias appears on the alias line:
cumulus@switch$ ip link show swp1
3: swp1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast
state DOWN mode DEFAULT qlen 500
link/ether aa:aa:aa:aa:aa:bc brd ff:ff:ff:ff:ff:ff
alias hypervisor_port_1
Interface descriptions also appear in the SNMP OID (see page 47) IF-MIB::ifAlias.
Caveats and Errata
While ifupdown2 supports the inclusion of multiple iface stanzas for the same interface, Cumulus
Networks recommends you use a single iface stanza for each interface, if possible.
There are cases where you must specify more than one iface stanza for the same interface. For
example, the configuration for a single interface can come from many places, like a template or a
sourced file.
If you do specify multiple iface stanzas for the same interface, make sure the stanzas do not specify
the same interface attributes. Otherwise, unexpected behavior can result.
For example, swp1 is configured in two places:
cumulus@switch:~$ cat /etc/network/interfaces
source /etc/interfaces.d/speed_settings
auto swp1
iface swp1
address 10.0.14.2/24
cumulus@switch:~$ cat /etc/interfaces.d/speed_settings
auto swp1
iface swp1
link-speed 1000
link-duplex full
ifupdown2 correctly parses a configuration like this because the same attributes are not specified in
multiple iface stanzas.
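A quick way to audit for overlapping attributes is to collect the attribute names used in each iface stanza and flag any that appear in more than one stanza for the same interface. The Python sketch below is a simplified illustration: it assumes plain interfaces files without templates, stanza_attributes and conflicting_attributes are hypothetical helpers, and it is not part of ifupdown2.

```python
from collections import defaultdict


def stanza_attributes(text):
    """Return a list of (ifname, set of attribute names), one per iface stanza."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "iface" and len(words) > 1:
            current = (words[1], set())
            stanzas.append(current)
        elif words[0] in ("auto", "source", "allow-hotplug"):
            current = None          # these lines end the previous stanza
        elif current is not None:
            current[1].add(words[0])
    return stanzas


def conflicting_attributes(*texts):
    """Map each interface to the attributes set in more than one of its stanzas."""
    counts = defaultdict(lambda: defaultdict(int))
    for text in texts:
        for ifname, attrs in stanza_attributes(text):
            for attr in attrs:
                counts[ifname][attr] += 1
    return {
        ifname: sorted(a for a, n in attrs.items() if n > 1)
        for ifname, attrs in counts.items()
        if any(n > 1 for n in attrs.values())
    }


MAIN = "auto swp1\niface swp1\n address 10.0.14.2/24\n"
SPEED = "auto swp1\niface swp1\n link-speed 1000\n link-duplex full\n"
print(conflicting_attributes(MAIN, SPEED))
```

For the two files shown above the result is empty, because address, link-speed and link-duplex each appear in only one stanza.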
Useful Links
http://wiki.debian.org/NetworkConfiguration
http://www.linuxfoundation.org/collaborate/workgroups/networking/bonding
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
http://www.linuxfoundation.org/collaborate/workgroups/networking/vlan
Layer 2 Features
Link Layer Discovery Protocol (see page 138)
Prescriptive Topology Manager (PTM) (see page 143)
Understanding Network Interfaces (see page 154)
Bonding (Link Aggregation) (see page 160)
Ethernet Bridging (VLANs) (see page 163)
VLAN Tagging (see page 176)
VLAN-aware Bridge Mode for Large-scale Layer 2 Environments (see page 184)
Network Virtualization (see page 195)
Integrating with VMware NSX (see page 196)
Integrating Hardware VTEPs with Midokura MidoNet and OpenStack
Configuring a VXLAN without a Controller (see page 209)
Lightweight Network Virtualization - LNV
Multi-Chassis Link Aggregation - CLAG - MLAG (see page 215)
LACP Bypass (see page 230)
Spanning Tree and Rapid Spanning Tree (see page 234)
Configuring Switch Port Attributes (see page 240)
Configuring Buffer and Queue Management (see page 245)
Virtual Router Redundancy (VRR) (see page 250)
IGMP and MLD Snooping (see page 255)
Link Layer Discovery Protocol
The lldpd daemon implements the IEEE802.1AB (Link Layer Discovery Protocol, or LLDP) standard.
LLDP allows you to know which ports are neighbors of a given port. By default, lldpd runs as a
daemon and is started at system boot. lldpd command line arguments are placed in
/etc/default/lldpd. lldpd configuration options are placed in /etc/lldpd.conf or under /etc/lldpd.d/.
For more details on the command line arguments and config options, please see man lldpd(8).
lldpd supports CDP (Cisco Discovery Protocol, v1 and v2). lldpd logs by default into
/var/log/daemon.log with an lldpd prefix.
lldpcli is the CLI tool to query the lldpd daemon for neighbors, statistics and other running
configuration information. See man lldpcli(8) for details.
Contents
Contents (see page 138)
Commands (see page 139)
Man Pages (see page 139)
Example lldpcli Commands (see page 139)
Persistent Configuration (see page 143)
Configuration Files (see page 143)
Useful Links (see page 143)
Caveats and Errata (see page 143)
Commands
lldpd (daemon)
lldpcli (interactive CLI)
Man Pages
man lldpd
man lldpcli
Example lldpcli Commands
To see all neighbors on all ports/interfaces:
cumulus@switch:~$ sudo lldpcli show neighbors
--------------------------------------------------------------------
LLDP neighbors:
--------------------------------------------------------------------
Interface:    eth0, via: CDPv1, RID: 72, Time: 0 day, 00:33:40
  Chassis:
    ChassisID:    local test-server-1
    SysName:      test-server-1
    SysDescr:     Linux running on
                  Linux 3.2.2+ #1 SMP Mon Jun 10 16:21:22 PDT 2013 ppc
    MgmtIP:       192.0.2.72
    Capability:   Router, on
  Port:
    PortID:       ifname eth1
--------------------------------------------------------------------
Interface:    swp1, via: CDPv1, RID: 87, Time: 0 day, 00:36:27
  Chassis:
    ChassisID:    local T1
    SysName:      T1
    SysDescr:     Linux running on
                  Cumulus Linux
    MgmtIP:       192.0.2.15
    Capability:   Router, on
  Port:
    PortID:       ifname swp1
    PortDescr:    swp1
--------------------------------------------------------------------
... and more (output truncated to fit this doc)
To see neighbors on specific ports:
cumulus@switch:~$ sudo lldpcli show neighbors ports swp1,swp2
--------------------------------------------------------------------
Interface:    swp1, via: CDPv1, RID: 87, Time: 0 day, 00:36:27
  Chassis:
    ChassisID:    local T1
    SysName:      T1
    SysDescr:     Linux running on
                  Cumulus Linux
    MgmtIP:       192.0.2.15
    Capability:   Router, on
  Port:
    PortID:       ifname swp1
    PortDescr:    swp1
--------------------------------------------------------------------
Interface:    swp2, via: CDPv1, RID: 123, Time: 0 day, 00:36:27
  Chassis:
    ChassisID:    local T2
    SysName:      T2
    SysDescr:     Linux running on
                  Cumulus Linux
    MgmtIP:       192.0.2.15
    Capability:   Router, on
  Port:
    PortID:       ifname swp1
    PortDescr:    swp1
To see lldpd statistics for all ports:
cumulus@switch:~$ sudo lldpcli show statistics
--------------------------------------------------------------------
LLDP statistics:
--------------------------------------------------------------------
Interface:    eth0
  Transmitted:  9423
  Received:     17634
  Discarded:    0
  Unrecognized: 0
  Ageout:       10
  Inserted:     20
  Deleted:      10
--------------------------------------------------------------------
Interface:    swp1
  Transmitted:  9423
  Received:     6264
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     2
  Deleted:      0
--------------------------------------------------------------------
Interface:    swp2
  Transmitted:  9423
  Received:     6264
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     2
  Deleted:      0
--------------------------------------------------------------------
Interface:    swp3
  Transmitted:  9423
  Received:     6265
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     2
  Deleted:      0
--------------------------------------------------------------------
... and more (output truncated to fit this document)
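If you want to post-process these counters, for example to list the interfaces that have aged out neighbors, a few lines of Python are enough. The sketch below is an illustration under stated assumptions: it assumes the plain-text layout of "name: value" lines grouped under an "Interface:" line, the SAMPLE text is abbreviated from the output above, and parse_statistics is a hypothetical helper, not part of lldpd.

```python
# Abbreviated sample in the assumed plain-text layout of `lldpcli show statistics`.
SAMPLE = """\
--------------------------------------------------------------------
LLDP statistics:
--------------------------------------------------------------------
Interface:    eth0
  Transmitted:  9423
  Received:     17634
  Discarded:    0
  Unrecognized: 0
  Ageout:       10
  Inserted:     20
  Deleted:      10
--------------------------------------------------------------------
Interface:    swp1
  Transmitted:  9423
  Received:     6264
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     2
  Deleted:      0
"""


def parse_statistics(text):
    """Return {interface: {counter: int}} from 'name: value' lines."""
    stats = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not value:
            continue  # separator rows and headings such as "LLDP statistics:"
        if key == "Interface":
            current = stats.setdefault(value, {})
        elif current is not None and value.isdigit():
            current[key] = int(value)
    return stats


stats = parse_statistics(SAMPLE)
aged_out = [name for name, counters in stats.items() if counters.get("Ageout", 0) > 0]
print(aged_out)
```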
To see lldpd statistics summary for all ports:
cumulus@switch:~$ sudo lldpcli show statistics summary
--------------------------------------------------------------------
LLDP Global statistics:
--------------------------------------------------------------------
Summary of stats:
  Transmitted:  648186
  Received:     437557
  Discarded:    0
  Unrecognized: 0
  Ageout:       10
  Inserted:     38
  Deleted:      10
To see the lldpd running configuration:
cumulus@switch:~$ sudo lldpcli show running-configuration
--------------------------------------------------------------------
Global configuration:
--------------------------------------------------------------------
Configuration:
  Transmit delay: 1
  Transmit hold: 4
  Receive mode: no
  Pattern for management addresses: (none)
  Interface pattern: (none)
  Interface pattern for chassis ID: (none)
  Override description with: (none)
  Override platform with: (none)
  Advertise version: yes
  Disable LLDP-MED inventory: yes
  LLDP-MED fast start mechanism: yes
  LLDP-MED fast start interval: 1
--------------------------------------------------------------------
To configure active interfaces:
lldpcli configure system interface pattern "swp*"
To configure inactive interfaces:
lldpcli configure system interface pattern-blacklist "eth0"
The active interface list always overrides the inactive interface list.
To reset any interface list to none:
lldpcli configure system interface pattern-blacklist ""
Persistent Configuration
You can make lldpd settings configured via the CLI persistent by adding them to
/etc/lldpd.conf or under /etc/lldpd.d/.
Here is an example persistent configuration:
cumulus@switch:~$ sudo cat /etc/lldpd.conf
configure lldp tx-interval 40
configure lldp tx-hold 3
configure system interface pattern-blacklist "eth0"
lldpd logs to /var/log/daemon.log with the lldpd prefix:
cumulus@switch:~$ sudo tail -f /var/log/daemon.log | grep lldp
Aug  7 17:26:17 switch lldpd[1712]: unable to get system name
Aug  7 17:26:17 switch lldpd[1712]: unable to get system name
Aug  7 17:26:17 switch lldpcli[1711]: lldpd should resume operations
Aug  7 17:26:32 switch lldpd[1805]: NET-SNMP version 5.4.3 AgentX subagent connected
Configuration Files
/etc/lldpd.conf
/etc/lldpd.d
/etc/default/lldpd
Useful Links
http://vincentbernat.github.io/lldpd/
http://en.wikipedia.org/wiki/Link_Layer_Discovery_Protocol
Caveats and Errata
Annex E (and hence Annex D) of IEEE802.1AB (lldp) is not supported.
Prescriptive Topology Manager - PTM
In data center topologies, correct cabling is time-consuming and error-prone. Prescriptive
Topology Manager (PTM) is a dynamic cabling verification tool that helps detect and eliminate such errors.
It takes a graphviz-DOT specified network cabling plan (something many operators already generate),
stored in a topology.dot file, and couples it with runtime information derived from LLDP to verify
that the cabling matches the specification. The check is performed on every link transition on each
node in the network. It also detects forwarding path failures using Bidirectional Forwarding Detection
(BFD).
You can customize the topology.dot file to control ptmd at both the global/network level and the
node/port level.
PTM runs as a daemon, named ptmd.
For more information, see man ptmd(8).
Contents
Contents (see page 144)
Supported Features (see page 144)
Configuration (see page 145)
Configuration Parameters (see page 145)
Scripts (see page 148)
Quagga Interaction (see page 149)
Enabling BFD in Quagga (see page 149)
ptmd service Commands (see page 150)
ptmctl Commands (see page 150)
ptmctl Examples (see page 150)
ptmctl Error Outputs (see page 153)
Configuration Files (see page 154)
Useful Links (see page 154)
Caveats and Errata (see page 154)
Supported Features
Topology verification using LLDP. ptmd creates a client connection to the LLDP daemon, lldpd,
and retrieves the neighbor relationship between the nodes/ports in the network and compares
them against the prescribed topology specified in the topology.dot file.
Only physical interfaces, like swp1 or eth0, are currently supported. Cumulus Linux does not
support specifying virtual interfaces like bonds or subinterfaces like eth0.200 in the topology
file.
Forwarding path failure detection using Bidirectional Forwarding Detection (BFD); however, the
Echo function, demand mode, and multihop routed paths are not supported. For more
information on how BFD operates in Cumulus Linux, see man ptmd(8).
BFD requires an IP address for any interface for which it is configured.
Integration with Quagga (PTM to Quagga notification).
Client management: ptmd creates an abstract named socket /var/run/ptmd.socket on
startup. Other applications can connect to this socket to receive notifications and send
commands.
Event notifications: see Scripts below.
User configuration via a topology.dot file; see Configuration below.
Configuration
ptmd verifies the physical network topology against a DOT-specified network graph file,
/etc/ptm.d/topology.dot. This file must be present or else ptmd will not start. You can specify an
alternate file using the -c option.
At startup, ptmd connects to lldpd, the LLDP daemon, over a Unix socket and retrieves the neighbor
name and port information. It then compares the retrieved port information with the configuration
information that it read from the topology file. If there is a match, then it is a PASS, else it is a FAIL.
PTM performs its LLDP neighbor check using the PortID ifname TLV information. Previously, it
used the PortID port description TLV information.
PTM also supports undirected graphs:
graph G {
node [shape=record];
graph [hostidtype="hostname", version="1:0", date="04/12/2013"];
edge [dir=none, len=1, headport=center, tailport=center];
//R1's connections - R1 is top-tier spine
"R1":"swp1" -- "R3":"swp3";
"R1":"swp2" -- "R4":"swp3";
}
It’s a good idea to always wrap the hostname in double quotes, like "www.example.com".
Otherwise, ptmd can fail if you specify a fully-qualified domain name as the hostname and do
not wrap it in double quotes.
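The PASS/FAIL comparison described in this section can be sketched in a few lines of Python. This is a simplified illustration, not ptmd's implementation: the regular expression only handles simple "host":"port" -- "host":"port" edges like the ones above, and expected_neighbors and check_port are hypothetical helpers. The observed neighbor would come from LLDP on a real switch. Here it is passed in by hand.

```python
import re

# Matches edges such as  "R1":"swp1" -- "R3":"swp3";  (quotes optional).
EDGE = re.compile(
    r'"?([\w.-]+)"?:"?([\w.-]+)"?\s*--\s*"?([\w.-]+)"?:"?([\w.-]+)"?'
)

TOPOLOGY = '''
graph G {
    "R1":"swp1" -- "R3":"swp3";
    "R1":"swp2" -- "R4":"swp3";
}
'''


def expected_neighbors(dot_text, hostname):
    """Map each local port of hostname to its prescribed (neighbor host, neighbor port)."""
    expected = {}
    for a_host, a_port, b_host, b_port in EDGE.findall(dot_text):
        if a_host == hostname:
            expected[a_port] = (b_host, b_port)
        elif b_host == hostname:
            expected[b_port] = (a_host, a_port)
    return expected


def check_port(expected, port, lldp_neighbor):
    """Return 'pass' when the observed neighbor matches the prescribed one."""
    if port in expected and expected[port] == lldp_neighbor:
        return "pass"
    return "fail"


plan = expected_neighbors(TOPOLOGY, "R1")
print(check_port(plan, "swp1", ("R3", "swp3")))
print(check_port(plan, "swp2", ("R3", "swp3")))
```

In this run swp1 matches the plan, while swp2 is cabled to R3 instead of the prescribed R4 and is reported as a failure.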
Configuration Parameters
You can configure ptmd parameters in the topology file. The parameters are classified as host-only,
global, per-port/node and templates.
Host-only Parameters
Host-only parameters apply to the entire host on which PTM is running. You can include the
hostnametype host-only parameter, which specifies whether PTM should use only the host name
(hostname) or the fully-qualified domain name (fqdn) while looking for the self-node in the graph file.
For example, in the graph file below, PTM will ignore the FQDN and only look for switch04, since that is
the host name of the switch it's running on:
graph G {
hostnametype="hostname"
BFD="upMinTx=150,requiredMinRx=250"
"cumulus":swp44 -- "switch04.cumulusnetworks.com":swp20
"cumulus":swp46 -- "switch04.cumulusnetworks.com":swp22
}
However, in this next example, PTM will compare using the FQDN and look for
switch05.cumulusnetworks.com, which is the FQDN of the switch it’s running on:
graph G {
hostnametype="fqdn"
"cumulus":swp44 -- "switch05.cumulusnetworks.com":swp20
"cumulus":swp46 -- "switch05.cumulusnetworks.com":swp22
}
Global Parameters
Global parameters apply to every port listed in the topology file. There are two global parameters: LLDP
and BFD. LLDP is enabled by default; if no keyword is present, default values are used for all ports.
However, BFD is disabled if no keyword is present, unless there is a per-port override configured. For
example:
graph G {
LLDP=""
BFD="upMinTx=150,requiredMinRx=250"
"cumulus":swp44 -- "qct-ly2-04":swp20
"cumulus":swp46 -- "qct-ly2-04":swp22
}
Per-port Parameters
Per-port parameters provide finer-grained control at the port level. These parameters override any
global or compiled defaults. For example:
graph G {
LLDP=""
BFD="upMinTx=300,requiredMinRx=100"
"cumulus":swp44 -- "qct-ly2-04":swp20 [BFD="upMinTx=150,requiredMinRx=250"]
"cumulus":swp46 -- "qct-ly2-04":swp22
}
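The override order can be illustrated with a short Python sketch. The compiled defaults are the values this guide lists for upMinTx, requiredMinRx and detectMult. Merging key by key is an assumption made for illustration, as the guide does not say whether a per-port string replaces the global string wholesale. parse_params and effective_bfd are hypothetical helpers, not part of ptmd.

```python
# Compiled defaults as documented in this guide (intervals in milliseconds).
COMPILED_DEFAULTS = {"upMinTx": 300, "requiredMinRx": 300, "detectMult": 3}


def parse_params(value):
    """Turn 'upMinTx=150,requiredMinRx=250' into {'upMinTx': 150, 'requiredMinRx': 250}."""
    params = {}
    for item in value.split(","):
        if not item.strip():
            continue
        key, _, number = item.partition("=")
        params[key.strip()] = int(number)
    return params


def effective_bfd(global_value, port_value=None):
    """Apply compiled defaults, then the global string, then the per-port string.

    Assumption: values are merged key by key, with later levels winning.
    """
    merged = dict(COMPILED_DEFAULTS)
    merged.update(parse_params(global_value))
    if port_value:
        merged.update(parse_params(port_value))
    return merged


GLOBAL_BFD = "upMinTx=300,requiredMinRx=100"
print(effective_bfd(GLOBAL_BFD, "upMinTx=150,requiredMinRx=250"))  # swp44
print(effective_bfd(GLOBAL_BFD))                                   # swp46
```

Under this assumption, swp44 uses its per-port values, while swp46 falls back to the global string. Both keep the compiled default for detectMult.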
Templates
Templates provide flexibility in choosing different parameter combinations and applying them to a
given port. A template instructs ptmd to reference a named parameter string instead of a default one.
There are two parameter strings ptmd supports:
bfdtmpl, which specifies a custom parameter tuple for BFD.
lldptmpl, which specifies a custom parameter tuple for LLDP.
For example:
graph G {
LLDP=""
BFD="upMinTx=300,requiredMinRx=100"
BFD1="upMinTx=200,requiredMinRx=200"
BFD2="upMinTx=100,requiredMinRx=300"
LLDP1="match_type=ifname"
LLDP2="match_type=portdescr"
"cumulus":swp44 -- "qct-ly2-04":swp20 [BFD="bfdtmpl=BFD1", LLDP="lldptmpl=LLDP1"]
"cumulus":swp46 -- "qct-ly2-04":swp22 [BFD="bfdtmpl=BFD2", LLDP="lldptmpl=LLDP2"]
"cumulus":swp46 -- "qct-ly2-04":swp22
}
In this example, LLDP1 and LLDP2 are templates for LLDP parameters, while BFD1 and BFD2 are
templates for BFD parameters.
Supported BFD and LLDP Parameters
ptmd supports the following BFD parameters:
upMinTx: the minimum transmit interval, which defaults to 300ms, specified in milliseconds.
requiredMinRx: the minimum interval between received BFD packets, which defaults to
300ms, specified in milliseconds.
detectMult: the detect multiplier, which defaults to 3, and can be any non-zero value.
The following is an example of a topology with BFD applied at the port level:
graph G {
"cumulus-1":swp44 -- "cumulus-2":swp20 [BFD="upMinTx=300,requiredMinRx=100"]
"cumulus-1":swp46 -- "cumulus-2":swp22 [BFD="detectMult=4"]
}
ptmd supports the following LLDP parameters:
match_type, which defaults to the interface name (ifname), but can accept a port description (
portdescr) instead if you want lldpd to compare the topology against the port description
instead of the interface name. You can set this parameter globally or at the per-port level.
match_hostname, which defaults to the host name (hostname), but enables PTM to match the
topology using the fully-qualified domain name (fqdn) supplied by LLDP.
The following is an example of a topology with LLDP applied at the port level:
graph G {
"cumulus-1":swp44 -- "cumulus-2":swp20 [LLDP="match_hostname=fqdn"]
"cumulus-1":swp46 -- "cumulus-2":swp22 [LLDP="match_type=portdescr"]
}
When you specify match_hostname=fqdn, ptmd will match the entire FQDN, like
cumulus-2.domain.com in the example below. If you do not specify anything for match_hostname, ptmd
will match based on hostname only, like cumulus-3 below, and ignore the rest of the domain name:
graph G {
"cumulus-1":swp44 -- "cumulus-2.domain.com":swp20 [LLDP="match_hostname=fqdn"]
"cumulus-1":swp46 -- "cumulus-3":swp22 [LLDP="match_type=portdescr"]
}
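The difference between the two match_hostname settings can be sketched in Python. This is an illustration under stated assumptions, not ptmd's code: hostnames_match is a hypothetical helper, and treating "hostname only" as a comparison of the first DNS label is an assumption.

```python
def hostnames_match(topology_name, lldp_name, match_hostname="hostname"):
    """Compare a name from the topology file with the name reported by LLDP.

    With "fqdn" the full names must be identical. Otherwise only the part
    before the first dot is compared (assumed meaning of "hostname only").
    """
    if match_hostname == "fqdn":
        return topology_name == lldp_name
    return topology_name.split(".")[0] == lldp_name.split(".")[0]


# Full FQDN comparison: the domain must match as well.
print(hostnames_match("cumulus-2.domain.com", "cumulus-2.domain.com", "fqdn"))
print(hostnames_match("cumulus-2.domain.com", "cumulus-2.example.net", "fqdn"))
# Default comparison: the domain part is ignored.
print(hostnames_match("cumulus-3", "cumulus-3.domain.com"))
```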
Scripts
ptmd executes scripts at /etc/ptm.d/if-topo-pass and /etc/ptm.d/if-topo-fail for each
interface that goes through a change, running if-topo-pass when an LLDP or BFD check passes and
running if-topo-fail when the check fails. The scripts receive an argument string that is the result
of the ptmctl command, described in ptmctl Commands below.
You should modify these default scripts as needed.
Quagga Interaction
The Quagga routing suite enables additional checks to ensure that routing adjacencies are formed only
on links that have connectivity conformant to the specification, as determined by ptmd. To enable the
check:
quagga# conf t
quagga (config)# ptm-enable
quagga (config)#
To disable the checks:
quagga# conf t
quagga (config)# no ptm-enable
quagga (config)#
When the ptm-enable flag is configured by the user, the zebra daemon connects to ptmd over a Unix
socket. Any time there is a change of status for an interface, ptmd sends notifications to zebra. Zebra
maintains a ptm-status flag per interface and evaluates routing adjacency based on this flag. To
check the per-interface ptm-status:
quagga# show interface swp1
Interface swp1 is up, line protocol is up
PTM status: pass
Description: T1
index 3 metric 1 mtu 1500
flags: <UP,BROADCAST,RUNNING,MULTICAST>
HWaddr: 44:38:39:00:27:1d
inet 192.0.2.1/31 broadcast 255.255.255.255
inet6 2001:DB8::271d/64
quagga#
Enabling BFD in Quagga
To enable BFD in Quagga, do the following, depending upon the protocol you are using (BGP or OSPF).
To enable BFD in Quagga when using BGP:
quagga# router bgp X
quagga# neighbor <neighbor ip> bfd
quagga#
To enable BFD in Quagga when using OSPF:
quagga# interface X
quagga# ip ospf bfd
quagga#
ptmd service Commands
PTM sends client notifications in CSV format.
cumulus@switch:~$ sudo service ptmd start|restart|force-reload: Starts or restarts the
ptmd service. The topology.dot file must be present in order for the service to start.
cumulus@switch:~$ sudo service ptmd reconfig: Instructs ptmd to read the topology.dot
file again without restarting, applying the new configuration to the running state.
cumulus@switch:~$ sudo service ptmd stop: Stops the ptmd service.
cumulus@switch:~$ sudo service ptmd status: Retrieves the current running state of ptmd.
ptmctl Commands
ptmctl is a client of ptmd; it retrieves the daemon’s operational state. It connects to ptmd over a Unix
socket and listens for notifications. ptmctl parses the CSV notifications sent by ptmd.
See man ptmctl for more information.
ptmctl Examples
For basic output, use ptmctl without any options:
cumulus@switch:~$ sudo ptmctl
------------------------------------------------------------------------------
port   cbl     BFD     BFD
       status  status  peer
------------------------------------------------------------------------------
swp45  pass    pass    5.5.5.5
swp46  fail    N/A     N/A
For more detailed output, use the -d option:
cumulus@switch:~$ sudo ptmctl -d
----------------------------------------------------------------------------------------
port   cbl     exp      act      sysname  portID  portDescr  match   last    BFD   BFD
       status  nbr      nbr                                  on      upd     Type  state
----------------------------------------------------------------------------------------
swp45  pass    h1:swp1  h1:swp1  h1       swp1    swp1       IfName  5m: 5s  N/A   N/A
swp46  fail    h2:swp1  h2:swp1  h2       swp1    swp1       IfName  5m: 5s  N/A   N/A

BFD   BFD       det_mult  tx_timeout  rx_timeout  echo_tx_timeout  echo_rx_timeout  max_hop_cnt
peer  DownDiag
N/A   N/A       N/A       N/A         N/A         N/A              N/A              N/A
N/A   N/A       N/A       N/A         N/A         N/A              N/A              N/A
To return information on active BFD sessions ptmd is tracking, use the -b option:
cumulus@switch:~$ sudo ptmctl -b
---------------------------------------------------------------
port   peer     state  diag  det_mult  tx_timeout  rx_timeout
---------------------------------------------------------------
swp45  5.5.5.5  Up     N/A   3         300000000   900000000
To return LLDP information, use the -l option. It returns only the active neighbors currently being
tracked by ptmd.
cumulus@switch:~$ sudo ptmctl -l
---------------------------------------------
port   sysname  portID  port   match   last
                        descr  on      upd
---------------------------------------------
swp45  h1       swp1    swp1   IfName  5m:59s
swp46  h2       swp1    swp1   IfName  5m:59s
To output ptmctl data in JSON format, use the -j option:
cumulus@switch:~$ sudo ptmctl -j
{
"swp45": {
"sysname": "h1",
"rx_timeout": "N/A",
"exp nbr": "h1:swp1",
"echo_tx_timeout": "N/A",
"tx_timeout": "N/A",
"last upd": "5m:54s",
"portDescr": "swp1",
"BFD DownDiag": "N/A",
"BFD state": "N/A",
"BFD Type": "N/A",
"cbl status": "pass",
"BFD peer": "N/A",
"act nbr": "h1:swp1",
"match on": "IfName",
"portID": "swp1",
"max_hop_cnt": "N/A",
"det_mult": "N/A",
"port": "swp45",
"echo_rx_timeout": "N/A"
},
"swp46": {
"sysname": "h2",
"rx_timeout": "N/A",
"exp nbr": "h2:swp1",
"echo_tx_timeout": "N/A",
"tx_timeout": "N/A",
"last upd": "5m:54s",
"portDescr": "swp1",
"BFD DownDiag": "N/A",
"BFD state": "N/A",
"BFD Type": "N/A",
"cbl status": "pass",
"BFD peer": "N/A",
"act nbr": "h2:swp1",
"match on": "IfName",
"portID": "swp1",
"max_hop_cnt": "N/A",
"det_mult": "N/A",
"port": "swp46",
"echo_rx_timeout": "N/A"
}
}
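Because the -j output is plain JSON, a script can consume it directly. The following is a minimal Python sketch, not part of the product: the SAMPLE text is a trimmed, hypothetical excerpt in the same shape as the output above (with swp46 set to fail for illustration), and the failing_ports helper is an assumed name. On a switch you would pass in the captured output of sudo ptmctl -j instead.

```python
import json

# Hypothetical, trimmed sample in the shape of the `ptmctl -j` output above.
# The "fail" value for swp46 is invented so the example has something to report.
SAMPLE = """
{
  "swp45": {"cbl status": "pass", "exp nbr": "h1:swp1", "act nbr": "h1:swp1"},
  "swp46": {"cbl status": "fail", "exp nbr": "h2:swp1", "act nbr": "N/A"}
}
"""


def failing_ports(ptmctl_json):
    """Return the sorted port names whose "cbl status" is not "pass"."""
    ports = json.loads(ptmctl_json)
    return sorted(
        name for name, fields in ports.items()
        if fields.get("cbl status") != "pass"
    )


if __name__ == "__main__":
    for port in failing_ports(SAMPLE):
        print("cabling check failed on", port)
```

Keying on "cbl status" keeps the check independent of the BFD fields, which report N/A when no BFD session is configured.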
ptmctl Error Outputs
If there are errors in the topology file or there isn’t a session, PTM will return appropriate outputs.
Typical error strings are:
Topology file error [/etc/ptm.d/topology.dot] [cannot find node cumulus] please check /var/log/ptmd.log for more info
Topology file error [/etc/ptm.d/topology.dot] [cannot open file (errno 2)] please check /var/log/ptmd.log for more info
No Hostname/MgmtIP found [Check LLDPD daemon status] please check /var/log/ptmd.log for more info
No BFD sessions. Check connections
No LLDP ports detected. Check connections
Unsupported command
For example:
cumulus@switch:~$ sudo ptmctl
-------------------------------------------------------------------------
cmd         error
-------------------------------------------------------------------------
get-status  Topology file error [/etc/ptm.d/topology.dot] [cannot open file
            (errno 2)] - please check /var/log/ptmd.log for more info
If you encounter errors with the topology.dot file, you can use dot (included in the Graphviz
package) to validate the syntax of the topology file.
Configuration Files
/etc/ptm.d/topology.dot
/etc/ptm.d/if-topo-pass
/etc/ptm.d/if-topo-fail
Useful Links
Bidirectional Forwarding Detection (BFD)
Graphviz
LLDP on Wikipedia
PTMd GitHub repo
Caveats and Errata
Prior to version 2.1, Cumulus Linux stored the ptmd configuration files in /etc/cumulus/ptm.d.
When you upgrade to version 2.1 or later, all the existing ptmd files are copied from their
original location to /etc/ptm.d with a dpkg-old extension, except for topology.dot, which
gets copied to /etc/ptm.d.
If you customized the if-topo-pass and if-topo-fail scripts, they are also copied with the dpkg-old
extension, and you must modify them so they can parse the CSV output correctly.
Sample if-topo-pass and if-topo-fail scripts are available in /etc/ptm.d. A sample
topology.dot file is available in /usr/share/doc/ptm/examples.
Understanding Network Interfaces
This chapter discusses the various network interfaces on a switch running Cumulus Linux.
Contents
Commands
Man Pages
Configuration Files
Interface Types
Link and Administrative State
Interface up/down
Configuring Network Interfaces Using ifupdown2
Settings
Port Speed and Duplexing
Auto-negotiation
MTU
Persistent Configuration
Addressing
Runtime Configuration
Statistics
Useful Links
Caveats and Errata
Commands
ethtool
ip
Man Pages
man ethtool
man interfaces
man ip
man ip addr
man ip link
Configuration Files
/etc/network/interfaces
Interface Types
Cumulus Linux exposes network interfaces for several types of physical and logical devices:
lo, network loopback device
ethN, switch management port(s), for out of band management only
swpN, switch front panel ports
(optional) brN, bridges (IEEE 802.1Q VLANs)
(optional) bondN, bonds (IEEE 802.3ad link aggregation trunks, or port channels)
Link and Administrative State
To see the current interface state:
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state
UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
In this example, swp1 is administratively UP and the physical link is UP (LOWER_UP flag).
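The two states can also be read programmatically from the flag list between the angle brackets. This is a small Python sketch under stated assumptions: link_state is a hypothetical helper, and SAMPLE simply repeats the output shown above. UP in the flag list reflects the administrative state, while LOWER_UP reflects the physical link.

```python
import re

# The `ip link show dev swp1` output from the example above.
SAMPLE = (
    "3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast "
    "state UP mode DEFAULT qlen 500\n"
    "    link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff\n"
)


def link_state(ip_link_output):
    """Return (admin_up, carrier_up) parsed from the <...> flag list."""
    match = re.search(r"<([^>]*)>", ip_link_output)
    if match is None:
        raise ValueError("no interface flags found")
    flags = set(match.group(1).split(","))
    # UP = administratively enabled; LOWER_UP = physical link detected.
    return ("UP" in flags, "LOWER_UP" in flags)


if __name__ == "__main__":
    admin_up, carrier_up = link_state(SAMPLE)
    print("admin up:", admin_up, "carrier up:", carrier_up)
```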
Interface up/down
To administratively bring an interface up or down, run:
cumulus@switch:~$ sudo ip link set dev swp1 {up|down}
If you specified manual as the address family, you must bring up that interface manually using
ifconfig. For example, if you configured a bridge like this:
auto bridge01
iface bridge01 inet manual
You can only bring it up by running ifconfig bridge01 up.
Configuring Network Interfaces Using ifupdown2
ifupdown is the network interface manager for Cumulus Linux. Cumulus Linux 2.1 and later uses an
updated version of this package, ifupdown2. To configure network interfaces, read Configuring and
Managing Network Interfaces (see page 125).
You should familiarize yourself with ifupdown2 before you begin configuring interfaces, as there are
some notable differences between the two versions. However, ifupdown2 is backward compatible
with ifupdown.
Settings
Port Speed and Duplexing
Cumulus Linux supports both half- and full-duplex configurations. Supported port speeds include 1G,
10G and 40G. Set the speeds in terms of Mbps, where the setting for 1G is 1000, 10G is 10000 and 40G is
40000.
If you specify the port speed in /etc/network/interfaces, you must also specify the
duplex mode setting along with it; otherwise, ethtool defaults to half duplex.
You can also configure these settings at run time, using ethtool. See Setting Port Speed, Duplexing, and
Auto-negotiation (see page 242) for more information.
Auto-negotiation
You can enable or disable auto-negotiation (that is, set it on or off) on a switch port.
MTU
Interface MTU applies to the management port, front panel port, bridge, VLAN subinterfaces, and
bonds.
Care must be taken to ensure there are no MTU mismatches in the conversation path. MTU
mismatches will result in dropped or truncated packets, degrading or blocking network
performance.
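One way to reason about a mismatch is to list the MTU of every interface along a conversation path and compare each against the smallest value. The Python sketch below is illustrative only; path_mtu_report and the hop list are hypothetical and nothing here queries a real switch.

```python
def path_mtu_report(hops):
    """Given (interface, mtu) pairs along one path, return the smallest MTU
    and the interfaces configured with a larger value."""
    smallest = min(mtu for _, mtu in hops)
    larger = [name for name, mtu in hops if mtu != smallest]
    return smallest, larger


if __name__ == "__main__":
    # Hypothetical path: two jumbo-frame interfaces and one left at 1500.
    hops = [("swp1", 9000), ("bond0", 9000), ("swp5", 1500)]
    smallest, larger = path_mtu_report(hops)
    print("effective path MTU:", smallest)
    print("interfaces with a larger MTU:", ", ".join(larger))
```

An empty second value means every interface on the path agrees on the MTU.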
To show MTU, use ip link show:
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state
UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
To set swp1 to Jumbo Frame MTU=9000, use ip link set:
cumulus@switch:~$ sudo ip link set dev swp1 mtu 9000
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state
UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
Persistent Configuration
A persistent configuration in /etc/network/interfaces demonstrating these settings looks like this:
auto swp1
iface swp1
address 10.1.1.1/24
mtu 9000
link-speed 1000
link-duplex full
link-autoneg off
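If you maintain many ports, a stanza like the one above can be generated rather than typed. The following Python sketch is an assumption-laden illustration: port_stanza and SPEEDS_MBPS are hypothetical names, and the four-space indentation is a choice made here. The sketch always writes link-duplex next to link-speed, following the note in Port Speed and Duplexing.

```python
# Speed labels mapped to the values used in /etc/network/interfaces.
SPEEDS_MBPS = {"1G": 1000, "10G": 10000, "40G": 40000}


def port_stanza(name, speed, duplex="full", autoneg="off", mtu=None, address=None):
    """Build an interfaces stanza that always pairs link-speed with link-duplex."""
    if speed not in SPEEDS_MBPS:
        raise ValueError("unsupported speed: %s" % speed)
    if duplex not in ("full", "half"):
        raise ValueError("duplex must be 'full' or 'half'")
    lines = ["auto %s" % name, "iface %s" % name]
    if address is not None:
        lines.append("    address %s" % address)
    if mtu is not None:
        lines.append("    mtu %d" % mtu)
    lines.append("    link-speed %d" % SPEEDS_MBPS[speed])
    lines.append("    link-duplex %s" % duplex)
    lines.append("    link-autoneg %s" % autoneg)
    return "\n".join(lines)


if __name__ == "__main__":
    print(port_stanza("swp1", "1G", mtu=9000, address="10.1.1.1/24"))
```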
Addressing
To add addresses to an interface, use ip addr add:
cumulus@switch:~$ sudo ip addr add 192.0.2.1/30 dev swp1
cumulus@switch:~$ sudo ip addr add 2001:DB8::1/126 dev swp1
To show the assigned address on an interface, use ip addr show:
cumulus@switch:~$ ip addr show dev swp1
3: swp1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UP qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
inet 192.0.2.1/30 scope global swp1
inet6 2001:DB8::1/126 scope global tentative
valid_lft forever preferred_lft forever
To remove an address from an interface, use ip addr del:
cumulus@switch:~$ sudo ip addr del 192.0.2.1/30 dev swp1
cumulus@switch:~$ sudo ip addr del 2001:DB8::1/126 dev swp1
Runtime Configuration
To make non-persistent changes to interfaces at runtime, use the ip and brctl commands directly.
For example, to assign another IPv4 address to swp1, use:
cumulus@switch:~$ sudo ip addr add 11.0.0.1/30 dev swp1
See man ip for full details on the options available to manage and query interfaces.
Statistics
High-level interface statistics are available with the ip -s link command:
cumulus@switch:~$ ip -s link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state
UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
RX: bytes  packets  errors  dropped overrun mcast
21780      242      0       0       0       242
TX: bytes  packets  errors  dropped carrier collsns
1145554    11325    0       0       0       0
Low-level interface statistics are available with ethtool:
cumulus@switch:~$ sudo ethtool -S swp1
NIC statistics:
HwIfInOctets: 21870
HwIfInUcastPkts: 0
HwIfInBcastPkts: 0
HwIfInMcastPkts: 243
HwIfOutOctets: 1148217
HwIfOutUcastPkts: 0
HwIfOutMcastPkts: 11353
HwIfOutBcastPkts: 0
HwIfInDiscards: 0
HwIfInL3Drops: 0
HwIfInBufferDrops: 0
HwIfInAclDrops: 0
HwIfInBlackholeDrops: 0
HwIfInDot3LengthErrors: 0
HwIfInErrors: 0
SoftInErrors: 0
SoftInDrops: 0
SoftInFrameErrors: 0
HwIfOutDiscards: 0
HwIfOutErrors: 0
HwIfOutQDrops: 0
HwIfOutNonQDrops: 0
SoftOutErrors: 0
SoftOutDrops: 0
SoftOutTxFifoFull: 0
HwIfOutQLen: 0
Useful Links
http://wiki.debian.org/NetworkConfiguration
http://www.linuxfoundation.org/collaborate/workgroups/networking/vlan
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
http://www.linuxfoundation.org/collaborate/workgroups/networking/bonding
Caveats and Errata
Switch Port Enumeration: Cumulus Linux numbers the front panel switch ports to match the
numbers silkscreened on the hardware. In most cases, switches on the Cumulus Linux HCL
start port enumeration at swp1. However, the Dell S4810 starts port enumeration at swp0.
Bonding - Link Aggregation
Linux bonding provides a method for aggregating multiple network interfaces (the slaves) into a single
logical bonded interface (the bond). Cumulus Linux bonding supports the IEEE 802.3ad link aggregation
mode. Link aggregation allows one or more links to be aggregated together to form a link aggregation
group (LAG), such that a media access control (MAC) client can treat the link aggregation group as if it
were a single link. The benefits of link aggregation are:
Linear scaling of bandwidth as links are added to LAG
Load balancing
Failover protection
Cumulus Linux LAG control protocol is LACP version 1.
Contents
Example: Bonding 4 Slaves
Hash Distribution
Configuration Files
Useful Links
Caveats and Errata
Example: Bonding 4 Slaves
In this example, front panel port interfaces swp1-swp4 are slaves in bond0 (swp5 and swp6 are not
part of bond0). The name of the bond is arbitrary as long as it follows Linux interface naming
guidelines, and is unique within the switch. The only bonding mode supported in Cumulus Linux is
802.3ad. There are several 802.3ad settings that can be applied to each bond:
bond-slaves: The list of slaves in the bond.
bond-mode: Must be set to 802.3ad.
bond-miimon: How often, in milliseconds, the link state of each slave is inspected for link failures.
It defaults to 0, but 100 is the recommended value.
bond-miimon must be defined in /etc/network/interfaces.
bond-use-carrier: How to determine link state.
bond-xmit-hash-policy: Hash method used to select the slave for a given packet; must be
set to layer3+4.
bond-lacp-rate: Rate to ask link partner to transmit LACP control packets.
bond-min-links: Specifies the minimum number of links that must be active before asserting
carrier on the bond. Minimum value is 1, but a value greater than 1 is useful if higher level
services need to ensure a minimum of aggregate bandwidth before putting the bond in service.
See Useful Links below for more details on settings.
To configure the bond, edit /etc/network/interfaces and add a stanza for bond0:
auto bond0
iface bond0
address 10.0.0.1/30
bond-slaves swp1 swp2 swp3 swp4
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
However, if you are intending that the bond become part of a bridge, you don’t need to specify an IP
address. The configuration would look like this:
auto bond0
iface bond0
bond-slaves glob swp1-4
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
See man interfaces for more information on /etc/network/interfaces.
Here the link state is sampled every 100 milliseconds (1/10 sec), and the LACP transmit rate is set to
fast (bond-lacp-rate 1). min_links is set to 1 to indicate the bond must have at least one active member
for the bond to assert carrier. If the number of active members drops below min_links, the bond will
appear to upper-level protocols as link-down. When the number of active links returns to greater than
or equal to min_links, the bond will become link-up.
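The min_links rule reduces to a single comparison. The Python sketch below only models the behavior described in the paragraph above; bond_carrier is a hypothetical function, not a call into the bonding driver.

```python
def bond_carrier(active_members, min_links=1):
    """Return True when the bond asserts carrier (link-up), following the
    rule above: active members must be greater than or equal to min_links."""
    if min_links < 1:
        raise ValueError("min_links must be at least 1")
    return active_members >= min_links


if __name__ == "__main__":
    # A four-slave bond that requires two active links to stay in service.
    for active in (4, 2, 1, 0):
        state = "link-up" if bond_carrier(active, min_links=2) else "link-down"
        print(active, "active member(s):", state)
```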
When networking is started on the switch, bond0 is created as MASTER and interfaces swp1-swp4 come up
in SLAVE mode, as seen in the ip link show command:
3: swp1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
master bond0 state UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
4: swp2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
master bond0 state UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
5: swp3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
master bond0 state UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
6: swp4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
master bond0 state UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
And
55: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP mode DEFAULT
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
All slave interfaces within a bond will have the same MAC address as the bond. Typically, the
first slave added to the bond donates its MAC address for the bond. The other slaves’ MAC
addresses are set to the bond MAC address. The bond MAC address is used as source MAC
address for all traffic leaving the bond, and provides a single destination MAC address to
address traffic to the bond.
Hash Distribution
Egress traffic through a bond is distributed to a slave based on a packet hash calculation. This
distribution provides load balancing over the slaves. The hash calculation uses packet header data to
pick which slave to transmit the packet. For IP traffic, IP header source and destination fields are used
in the calculation. For IP + TCP/UDP traffic, source and destination ports are included in the hash
calculation. Traffic for a given conversation flow will always hash to the same slave. Many flows will be
distributed over all the slaves to load balance the total traffic. In a failover event, the hash calculation is
adjusted to steer traffic over available slaves.
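To make the per-flow behavior concrete, here is a Python sketch of a layer3+4 style calculation. It uses the formula historically documented for the Linux bonding driver for unfragmented TCP/UDP over IPv4: (source port XOR destination port) XOR ((source IP XOR destination IP) AND 0xffff), modulo the slave count. Whether a given switch computes exactly this hash in hardware is an assumption; the function name and the sample addresses are hypothetical.

```python
import ipaddress


def layer3_4_slave(src_ip, dst_ip, src_port, dst_port, slave_count):
    """Pick a slave index from IPv4 addresses and TCP/UDP ports.

    Illustrative only: follows the formula in the Linux bonding driver
    documentation, which may differ from what switch hardware implements.
    """
    if slave_count < 1:
        raise ValueError("the bond needs at least one available slave")
    ip_xor = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    port_xor = src_port ^ dst_port
    return (port_xor ^ (ip_xor & 0xFFFF)) % slave_count


if __name__ == "__main__":
    flow = ("10.0.0.1", "10.0.0.2", 40000, 80)
    # The same flow maps to the same slave every time it is hashed.
    print("4 slaves ->", layer3_4_slave(*flow, slave_count=4))
    # After a failover only three slaves remain, so the flow is re-steered
    # within the smaller set.
    print("3 slaves ->", layer3_4_slave(*flow, slave_count=3))
```

Because the result depends only on header fields, packets of one conversation stay in order on one slave, while different conversations spread across the bond.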
Configuration Files
/etc/network/interfaces
Useful Links
http://www.linuxfoundation.org/collaborate/workgroups/networking/bonding
802.3ad (Accessible writeup)
Link aggregation from Wikipedia
Caveats and Errata
An interface cannot belong to multiple bonds.
Slave ports within a bond should all be set to the same speed/duplex, and should match the link
partner’s slave ports.
A bond cannot enslave VLAN subinterfaces. A bond can have subinterfaces, but not the other
way around.
Ethernet Bridging - VLANs
Ethernet bridges provide a means for hosts to communicate at layer 2. Bridge members can be
individual physical interfaces, bonds or logical interfaces that traverse an 802.1Q VLAN trunk.
Cumulus Linux 2.5.0 introduced a new method for configuring bridges that are VLAN-aware (see page 184).
The bridge driver in Cumulus Linux 2.5.x is capable of VLAN filtering, which allows for configurations
that are similar to incumbent network devices. While Cumulus Linux supports Ethernet bridges in
traditional mode, Cumulus Networks recommends using VLAN-aware (see page 184) mode unless you
are using VXLANs in your network.
For a comparison of traditional and VLAN-aware modes, read this knowledge base article.
You can configure both VLAN-aware and traditional mode bridges on the same network in
Cumulus Linux; however you should not have more than one VLAN-aware bridge on a given
switch. If you are implementing VXLANs (see page 209), you must use traditional bridge
mode.
Contents
Configuration Files
Commands
Creating a Bridge between Physical Interfaces
Creating the Bridge and Adding Interfaces
Showing and Verifying the Bridge Configuration
Examining MAC Addresses
Multiple Bridges
Configuring an SVI (Switch VLAN Interface)
Showing and Verifying the Bridge Configuration
Using Trunks in Traditional Bridging Mode
Trunk Example
Showing and Verifying the Trunk
Additional Examples
Configuration Files
Useful Links
Caveats and Errata
Configuration Files
/etc/network/interfaces
Commands
brctl
bridge
ip addr
ip link
Creating a Bridge between Physical Interfaces
The basic use of bridging is to connect all of the physical and logical interfaces in the system into a
single layer 2 domain.
Creating the Bridge and Adding Interfaces
You statically manage bridge configurations in /etc/network/interfaces. The following
configuration snippet details an example bridge used throughout this chapter, explicitly enabling
spanning tree (see page 234) and setting the bridge MAC address ageing timer. First, create a bridge
with a descriptive name of 15 characters or fewer. Then add the logical interfaces (bond0) and physical
interfaces (swp5, swp6) to assign to that bridge.
auto my_bridge
iface my_bridge
bridge-ports bond0 swp5 swp6
bridge-ageing 150
bridge-stp on
Keyword        Explanation
bridge-ports   List of logical and physical ports belonging to the logical bridge.
bridge-ageing  Maximum amount of time before a MAC address learned on the bridge expires from
               the bridge MAC cache. The default value is 300 seconds.
bridge-stp     Enables spanning tree protocol on this bridge. The default spanning tree mode is Per
               VLAN Rapid Spanning Tree Protocol (PVRST).
For more information on spanning-tree configurations see the configuration section:
Spanning Tree and Rapid Spanning Tree (see page 234).
To bring up the bridge my_bridge, use the ifreload command:
cumulus@switch:~$ sudo ifreload -a
Runtime Configuration (Advanced)
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
To create the bridge and interfaces on the bridge, run:
cumulus@switch:~$ sudo brctl addbr my_bridge
cumulus@switch:~$ sudo brctl addif my_bridge bond0 swp5 swp6
cumulus@switch:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
my_bridge       8000.44383900129b       yes             bond0
                                                        swp5
                                                        swp6
cumulus@switch:~$ sudo ip link set up dev my_bridge
cumulus@switch:~$ sudo ip link set up dev bond0
cumulus@switch:~$ for I in {5..6}; do
sudo ip link set up dev swp$I; done
Showing and Verifying the Bridge Configuration
cumulus@switch:~$ ip link show my_bridge
56: my_bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP mode DEFAULT
link/ether 44:38:39:00:12:9b brd ff:ff:ff:ff:ff:ff
Do not try to bridge the management port, eth0, with any switch ports (like swp0, swp1, and
so forth). For example, if you created a bridge with eth0 and swp1, it will not work.
Using netshow to Display Bridge Information
netshow is an add-on tool that is not installed in Cumulus Linux by default. Refer to this knowledge
base article for steps to install it.
cumulus@switch$ netshow interface bridge
--  Name       Speed  Mtu   Mode       Summary
--  ---------  -----  ----  ---------  -----------------------
UP  my_bridge  N/A    1500  Bridge/L2  Untagged: bond0, swp5-6
                                       Root Port: bond0
                                       VlanID: Untagged
Bridge Interface MAC Address and MTU
A bridge is a logical interface with a MAC address and an MTU (maximum transmission unit). The bridge
MTU is the minimum MTU among all its members. The bridge's MAC address is inherited from the first
interface that is added to the bridge as a member. The bridge MAC address remains unchanged until
the member interface is removed from the bridge, at which point the bridge will inherit from the next
member interface, if any. The bridge can also be assigned an IP address, as discussed later in this
section.
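The inheritance rules above can be summarized in a few lines. This Python sketch models only what the paragraph states (MTU is the minimum across members; the MAC address comes from the earliest-added member still present). The function name and the member MAC addresses and MTU values are hypothetical.

```python
def bridge_attributes(members):
    """members: list of (name, mac, mtu) tuples in the order they were added.

    Returns (bridge_mac, bridge_mtu), or (None, None) for an empty bridge.
    """
    if not members:
        return None, None
    bridge_mac = members[0][1]                     # inherited from the first member
    bridge_mtu = min(mtu for _, _, mtu in members)  # minimum across all members
    return bridge_mac, bridge_mtu


if __name__ == "__main__":
    members = [
        ("bond0", "44:38:39:00:12:9f", 9000),
        ("swp5", "44:38:39:00:12:a0", 1500),
        ("swp6", "44:38:39:00:12:a1", 9000),
    ]
    print(bridge_attributes(members))
    # Removing the first member makes the bridge inherit from the next one.
    print(bridge_attributes(members[1:]))
```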
Examining MAC Addresses
A bridge forwards frames by looking up the destination MAC address. A bridge learns the source MAC
address of a frame when the frame enters the bridge on an interface. After the MAC address is learned,
the bridge maintains an age for the MAC entry in the bridge table. The age is refreshed when a frame is
seen again with the same source MAC address. When a MAC is not seen for greater than the MAC
ageing time, the MAC address is deleted from the bridge table.
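The learn, refresh and expire cycle described above can be sketched as a small table keyed by MAC address. This Python model is a teaching aid under stated assumptions, not the bridge driver's implementation: the MacTable class and its method names are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class MacTable:
    """Toy model of a bridge MAC table with an ageing timer (in seconds)."""

    def __init__(self, ageing_time=300.0):
        self.ageing_time = ageing_time
        self._entries = {}  # mac -> (port, time the MAC was last seen)

    def learn(self, mac, port, now):
        """Learn or refresh a source MAC seen on a port at time `now`."""
        self._entries[mac] = (port, now)

    def expire(self, now):
        """Delete entries not seen for longer than the ageing time."""
        stale = [mac for mac, (_, seen) in self._entries.items()
                 if now - seen > self.ageing_time]
        for mac in stale:
            del self._entries[mac]

    def lookup(self, mac, now):
        """Return the port for a destination MAC, or None if unknown."""
        self.expire(now)
        entry = self._entries.get(mac)
        return entry[0] if entry else None


if __name__ == "__main__":
    table = MacTable(ageing_time=150)           # matches bridge-ageing 150
    table.learn("90:e2:ba:2c:b1:94", "swp2", now=0)
    print(table.lookup("90:e2:ba:2c:b1:94", now=100))   # still present
    print(table.lookup("90:e2:ba:2c:b1:94", now=200))   # aged out
```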
The following shows the MAC address table of the example bridge. Notice that the is local? column
indicates if the MAC address is the interface's own MAC address (is local is yes), or if it is learned on
the interface from a packet's source MAC (where is local is no):
cumulus@switch:~$ sudo brctl showmacs my_bridge
port name   mac addr            is local?   ageing timer
swp4        06:90:70:22:a6:2e   no          19.47
swp1        12:12:36:43:6f:9d   no          40.50
bond0       2a:95:22:94:d1:f0   no          1.98
swp1        44:38:39:00:12:9b   yes         0.00
swp2        44:38:39:00:12:9c   yes         0.00
swp3        44:38:39:00:12:9d   yes         0.00
swp4        44:38:39:00:12:9e   yes         0.00
bond0       44:38:39:00:12:9f   yes         0.00
swp2        90:e2:ba:2c:b1:94   no          12.84
swp2        a2:84:fe:fc:bf:cd   no          9.43
You can use the bridge fdb command to display the MAC address table as well:
cumulus@en-sw2$ bridge fdb show
70:72:cf:9d:4e:36 dev swp2 VLAN 0 master bridge-A permanent
70:72:cf:9d:4e:35 dev swp1 VLAN 0 master bridge-A permanent
70:72:cf:9d:4e:38 dev swp4 VLAN 0 master bridge-B permanent
70:72:cf:9d:4e:37 dev swp3 VLAN 0 master bridge-B permanent
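Output in this form is straightforward to post-process. The Python sketch below parses lines shaped like the example above into dictionaries. parse_fdb is a hypothetical helper, and it assumes every line starts with a MAC address followed by the keyword dev, as in the sample; other output layouts are not handled.

```python
# The `bridge fdb show` lines from the example above.
SAMPLE = """\
70:72:cf:9d:4e:36 dev swp2 VLAN 0 master bridge-A permanent
70:72:cf:9d:4e:35 dev swp1 VLAN 0 master bridge-A permanent
70:72:cf:9d:4e:38 dev swp4 VLAN 0 master bridge-B permanent
70:72:cf:9d:4e:37 dev swp3 VLAN 0 master bridge-B permanent
"""


def parse_fdb(output):
    """Parse lines of the form '<mac> dev <port> [VLAN <id>] [master <bridge>] [flags]'."""
    entries = []
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or tokens[1] != "dev":
            continue  # skip blank or unexpected lines
        entry = {"mac": tokens[0], "dev": tokens[2], "flags": []}
        rest = tokens[3:]
        i = 0
        while i < len(rest):
            word = rest[i].lower()
            if word in ("vlan", "master") and i + 1 < len(rest):
                entry[word] = rest[i + 1]   # keyword followed by its value
                i += 2
            else:
                entry["flags"].append(rest[i])  # e.g. "permanent"
                i += 1
        entries.append(entry)
    return entries


if __name__ == "__main__":
    for entry in parse_fdb(SAMPLE):
        print(entry["master"], entry["dev"], entry["mac"])
```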
You can clear a MAC address from the table using the bridge fdb command:
cumulus@switch:~$ sudo bridge fdb del 90:e2:ba:2c:b1:94 dev swp2
Multiple Bridges
Sometimes it is useful to logically divide a switch into multiple layer 2 domains, so that hosts in one
domain can communicate with other hosts in the same domain but not in other domains. You can
achieve this by configuring multiple bridges and putting different sets of interfaces in the different
bridges. In the following example, host-1 and host-2 are connected to the same bridge (bridge-A), while
host-3 and host-4 are connected to another bridge (bridge-B). host-1 and host-2 can communicate with
each other, so can host-3 and host-4, but host-1 and host-2 cannot communicate with host-3 and host-4.
To configure multiple bridges, edit /etc/network/interfaces:
auto bridge-A
iface bridge-A
bridge-ports swp1 swp2
bridge-stp on
auto bridge-B
iface bridge-B
bridge-ports swp3 swp4
bridge-stp on
To bring up the bridges bridge-A and bridge-B, use the ifreload command:
cumulus@switch:~$ sudo ifreload -a
Runtime Configuration (Advanced)
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
cumulus@switch:~$ sudo brctl addbr bridge-A
cumulus@switch:~$ sudo brctl addif bridge-A swp1 swp2
cumulus@switch:~$ sudo brctl addbr bridge-B
cumulus@switch:~$ sudo brctl addif bridge-B swp3 swp4
cumulus@switch:~$ for I in {1..4}; do
sudo ip link set up dev swp$I; done
cumulus@switch:~$ sudo ip link set up dev bridge-A
cumulus@switch:~$ sudo ip link set up dev bridge-B
cumulus@switch:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
bridge-A        8000.44383900129b       yes             swp1
                                                        swp2
bridge-B        8000.44383900129d       yes             swp3
                                                        swp4
cumulus@en-sw2$ ip link show bridge-A
97: bridge-A: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP mode DEFAULT
link/ether 70:72:cf:9d:4e:35 brd ff:ff:ff:ff:ff:ff
cumulus@en-sw2$ ip link show bridge-B
98: bridge-B: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP mode DEFAULT
link/ether 70:72:cf:9d:4e:37 brd ff:ff:ff:ff:ff:ff
Using netshow to Display the Bridges
netshow is an add-on tool that is not installed in Cumulus Linux by default. Refer to this knowledge
base article for steps to install it.
cumulus@switch$ netshow interface bridge
--  Name      Speed  Mtu   Mode       Summary
--  --------  -----  ----  ---------  ----------------
UP  bridge-A  N/A    1500  Bridge/L2  Untagged: swp1-2
                                      Root Port: swp2
                                      VlanID: Untagged
UP  bridge-B  N/A    1500  Bridge/L2  Untagged: swp3-4
                                      Root Port: swp3
                                      VlanID: Untagged
Configuring an SVI (Switch VLAN Interface)
A bridge creates a layer 2 forwarding domain for hosts to communicate. A bridge can be assigned an IP
address — typically of the same subnet as the hosts that are members of the bridge — and participate
in routing topologies. This enables hosts within a bridge to communicate with other hosts outside the
bridge through layer 3 routing.
When an interface is added to a bridge, it ceases to function as a router interface, and the IP
address on the interface, if any, becomes unreachable.
The configuration for the two bridges example looks like the following:
auto swp5
iface swp5
address 192.168.1.2/24
address 2001:DB8:1::2/64
auto bridge-A
iface bridge-A
address 192.168.2.1/24
address 2001:DB8:2::1/64
bridge-ports swp1 swp2
bridge-stp on
auto bridge-B
iface bridge-B
address 192.168.3.1/24
address 2001:DB8:3::1/64
bridge-ports swp3 swp4
bridge-stp on
To bring up swp5 and bridges bridge-A and bridge-B, use the ifreload command:
cumulus@switch:~$ sudo ifreload -a
Showing and Verifying the Bridge Configuration
cumulus@switch$ ip addr show bridge-A
106: bridge-A: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP
link/ether 70:72:cf:9d:4e:35 brd ff:ff:ff:ff:ff:ff
inet 192.168.2.1/24 scope global bridge-A
inet6 2001:db8:2::1/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::7272:cfff:fe9d:4e35/64 scope link
valid_lft forever preferred_lft forever
cumulus@switch$ ip addr show bridge-B
107: bridge-B: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP
link/ether 70:72:cf:9d:4e:37 brd ff:ff:ff:ff:ff:ff
inet 192.168.3.1/24 scope global bridge-B
inet6 2001:db8:3::1/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::7272:cfff:fe9d:4e37/64 scope link
valid_lft forever preferred_lft forever
To see all the routes on the switch use the ip route show command:
cumulus@switch$ ip route show
192.168.1.0/24 dev swp5 proto kernel scope link src 192.168.1.2 dead
192.168.2.0/24 dev bridge-A proto kernel scope link src 192.168.2.1
192.168.3.0/24 dev bridge-B proto kernel scope link src 192.168.3.1
Runtime Configuration (Advanced)
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
To add an IP address to a bridge:
cumulus@switch:~$ sudo ip addr add 192.0.2.101/24 dev bridge-A
cumulus@switch:~$ sudo ip addr add 192.0.2.102/24 dev bridge-B
Using netshow to Display the SVI
netshow is an add-on tool that is not installed in Cumulus Linux by default. Refer to this knowledge
base article for steps to install it.
cumulus@switch$ netshow interface bridge
--  Name      Speed  Mtu   Mode       Summary
--  --------  -----  ----  ---------  ------------------------------------
UP  bridge-A  N/A    1500  Bridge/L3  IP: 192.168.2.1/24, 2001:db8:2::1/64
                                      Untagged: swp1-2
                                      Root Port: swp2
                                      VlanID: Untagged
UP  bridge-B  N/A    1500  Bridge/L3  IP: 192.168.3.1/24, 2001:db8:3::1/64
                                      Untagged: swp3-4
                                      Root Port: swp3
                                      VlanID: Untagged
Using Trunks in Traditional Bridging Mode
The IEEE standard for trunking is 802.1Q. The 802.1Q specification adds a 4 byte header within the
Ethernet frame that identifies the VLAN of which the frame is a member.
802.1Q also identifies an untagged frame as belonging to the native VLAN (most network devices default
their native VLAN to 1). The concept of native, non-native, tagged or untagged has generated confusion
due to mixed terminology and vendor-specific implementations. Some clarification is in order:
A trunk port is a switch port configured to send and receive 802.1Q tagged frames.
A switch sending an untagged (bare Ethernet) frame on a trunk port is sending from the native
VLAN defined on the trunk port.
A switch sending a tagged frame on a trunk port is sending to the VLAN identified by the 802.1Q
tag.
A switch receiving an untagged (bare Ethernet) frame on a trunk port places that frame in the
native VLAN defined on the trunk port.
A switch receiving a tagged frame on a trunk port places that frame in the VLAN identified by the
802.1Q tag.
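The receive-side rules in this list amount to a two-way decision. The Python sketch below is illustrative only; classify_vlan is a hypothetical function and the default native VLAN of 1 reflects the common default mentioned above, not a fixed requirement.

```python
def classify_vlan(tag, native_vlan=1):
    """Return the VLAN a frame received on a trunk port is placed in.

    tag is the 802.1Q VLAN ID carried by the frame, or None when the frame
    is untagged (bare Ethernet).
    """
    if tag is None:
        return native_vlan          # untagged frames join the native VLAN
    if not 1 <= tag <= 4094:
        raise ValueError("VLAN ID out of range: %r" % tag)
    return tag                      # tagged frames join the VLAN in the tag


if __name__ == "__main__":
    print(classify_vlan(None))                 # untagged, native VLAN 1
    print(classify_vlan(100))                  # tagged for VLAN 100
    print(classify_vlan(None, native_vlan=2))  # peer using native VLAN 2
```

The last call shows why two switches with different native VLANs end up merging those VLANs: the same untagged frame lands in VLAN 1 on one switch and VLAN 2 on the other.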
A bridge in traditional mode has no concept of trunks, just tagged or untagged frames. With a trunk of
200 VLANs, there would need to be 199 bridges, each containing a tagged physical interface, and one
bridge containing the native untagged VLAN. See the examples below for more information.
The interaction of tagged and untagged frames on the same trunk often leads to undesired
and unexpected behavior. A switch that uses VLAN 1 for the native VLAN may send frames to
a switch that uses VLAN 2 for the native VLAN, thus merging those two VLANs and their
spanning tree state.
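Because traditional mode needs one bridge per VLAN, the configuration becomes repetitive as the VLAN count grows. The following is a minimal sketch of generating the stanzas, assuming a br-VLAN<id> naming scheme. gen_vlan_bridges is a hypothetical helper, not part of Cumulus Linux, and it covers only the tagged VLANs; the bridge for the native untagged VLAN would list the base ports and is not generated here.

```shell
# Print one traditional-mode bridge stanza per tagged VLAN.
# Usage: gen_vlan_bridges "PORT1 PORT2 ..." VID [VID ...]
gen_vlan_bridges() {
    ports=$1; shift
    for vid in "$@"; do
        # Build the member list: one VLAN subinterface per trunk port.
        members=""
        for p in $ports; do
            members="$members $p.$vid"
        done
        printf 'auto br-VLAN%s\n' "$vid"
        printf 'iface br-VLAN%s\n' "$vid"
        printf '    bridge-ports%s\n' "$members"
        printf '    bridge-stp on\n\n'
    done
}

# Example: two trunk ports carrying VLANs 100 and 200.
gen_vlan_bridges "swp1 swp2" 100 200
```

Review the generated text before adding it to /etc/network/interfaces; the sketch does no validation of port names or VLAN IDs.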
Trunk Example
Configure the following in /etc/network/interfaces:
auto br-VLAN100
iface br-VLAN100
bridge-ports swp1.100 swp2.100
bridge-stp on
auto br-VLAN200
iface br-VLAN200
bridge-ports swp1.200 swp2.200
bridge-stp on
To bring up br-VLAN100 and br-VLAN200, use the ifreload command:
cumulus@switch:~$ sudo ifreload -a
Showing and Verifying the Trunk
cumulus@en-sw2$ brctl show
bridge name     bridge id               STP enabled     interfaces
br-VLAN100      8000.7072cf9d4e35       no              swp1.100
                                                        swp2.100
br-VLAN200      8000.7072cf9d4e35       no              swp1.200
                                                        swp2.200
Using netshow to Display the Trunk
netshow is an add-on tool that is not installed in Cumulus Linux by default. Refer to this knowledge
base article for steps to install it.
cumulus@switch$ netshow interface bridge
    Name        Speed    Mtu    Mode       Summary
--  ----------  -------  -----  ---------  ----------------------
UP  br-VLAN100  N/A      1500   Bridge/L2  Tagged: swp1-2
                                           STP: rootSwitch(32768)
                                           VlanID: 100
UP  br-VLAN200  N/A      1500   Bridge/L2  Tagged: swp1-2
                                           STP: rootSwitch(32768)
                                           VlanID: 200
Additional Examples
You can find additional examples of VLAN tagging in this chapter (see page 176).
Configuration Files
/etc/network/interfaces
/etc/network/interfaces.d/
/etc/network/if-down.d/
/etc/network/if-post-down.d/
/etc/network/if-pre-up.d/
/etc/network/if-up.d/
Useful Links
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
http://www.linuxfoundation.org/collaborate/workgroups/networking/vlan
http://www.linuxjournal.com/article/8172
Caveats and Errata
The same bridge cannot contain multiple subinterfaces of the same port as members.
Attempting to apply such a configuration will result in an error.
VLAN Tagging
This article shows two examples of VLAN tagging, one basic and one more advanced.
They both demonstrate the streamlined interface configuration from ifupdown2. For more
information, see Configuring and Managing Network Interfaces (see page 125).
Contents
Contents (see page 176)
VLAN Tagging, a Basic Example (see page 176)
Persistent Configuration (see page 177)
VLAN Tagging, an Advanced Example (see page 177)
Persistent Configuration (see page 178)
VLAN Translation (see page 183)
VLAN Tagging, a Basic Example
A simple configuration demonstrating VLAN tagging involves two hosts connected to a switch.
host1 connects to swp1 with both untagged frames and with 802.1Q frames tagged for vlan100.
host2 connects to swp2 with 802.1Q frames tagged for vlan120 and vlan130.
Persistent Configuration
To configure the above example persistently, configure /etc/network/interfaces like this:
# Config for host1
auto swp1
iface swp1
auto swp1.100
iface swp1.100
# Config for host2
# swp2 must exist to create the .1Q subinterfaces, but it is not assigned an address
auto swp2
iface swp2
auto swp2.120
iface swp2.120
auto swp2.130
iface swp2.130
VLAN Tagging, an Advanced Example
This example of VLAN tagging is more complex, involving three hosts and two switches, with a number
of bridges and a bond connecting them all.
host1 connects to bridge br-untagged with bare Ethernet frames and to bridge br-tag100 with
802.1q frames tagged for vlan100.
host2 connects to bridge br-tag100 with 802.1q frames tagged for vlan100 and to bridge
br-vlan120 with 802.1q frames tagged for vlan120.
host3 connects to bridge br-vlan120 with 802.1q frames tagged for vlan120 and to bridge v130
with 802.1q frames tagged for vlan130.
bond2 carries tagged and untagged frames in this example.
Although not explicitly designated, the bridge member ports function as 802.1Q access ports and trunk
ports. In the example above, comparing Cumulus Linux with a traditional Cisco device:
swp1 is equivalent to a trunk port with untagged and vlan100.
swp2 is equivalent to a trunk port with vlan100 and vlan120.
swp3 is equivalent to a trunk port with vlan120 and vlan130.
bond2 is equivalent to an EtherChannel in trunk mode with untagged, vlan100, vlan120, and
vlan130.
Bridges br-untagged, br-tag100, br-vlan120, and v130 are equivalent to SVIs (switched virtual
interfaces).
Persistent Configuration
From /etc/network/interfaces:
# Config for host1 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# swp1 does not need an iface section unless it has a specific setting,
# it will be picked up as a dependent of swp1.100.
# And swp1 must exist in the system to create the .1q subinterfaces..
# but it is not applied to any bridge..or assigned an address.
auto swp1.100
iface swp1.100
# Config for host2
# swp2 does not need an iface section unless it has a specific setting,
# it will be picked up as a dependent of swp2.100 and swp2.120.
# And swp2 must exist in the system to create the .1q subinterfaces..
# but it is not applied to any bridge..or assigned an address.
auto swp2.100
iface swp2.100
auto swp2.120
iface swp2.120
# Config for host3
# swp3 does not need an iface section unless it has a specific setting,
# it will be picked up as a dependent of swp3.120 and swp3.130.
# And swp3 must exist in the system to create the .1q subinterfaces..
# but it is not applied to any bridge..or assigned an address.
auto swp3.120
iface swp3.120
auto swp3.130
iface swp3.130
# Configure the bond - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
auto bond2
iface bond2
bond-slaves glob swp4-7
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
# configure the bridges - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
auto br-untagged
iface br-untagged
address 10.0.0.1/24
bridge-ports swp1 bond2
bridge-stp on
auto br-tag100
iface br-tag100
address 10.0.100.1/24
bridge-ports swp1.100 swp2.100 bond2.100
bridge-stp on
auto br-vlan120
iface br-vlan120
address 10.0.120.1/24
bridge-ports swp2.120 swp3.120 bond2.120
bridge-stp on
auto v130
iface v130
address 10.0.130.1/24
bridge-ports swp3.130 bond2.130
bridge-stp on
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
To verify:
cumulus@switch:~$ sudo mstpctl showbridge br-tag100
br-tag100 CIST info
  enabled         yes
  bridge id       8.000.44:38:39:00:32:8B
  designated root 8.000.44:38:39:00:32:8B
  regional root   8.000.44:38:39:00:32:8B
  root port       none
  path cost     0          internal path cost   0
  max age       20         bridge max age       20
  forward delay 15         bridge forward delay 15
  tx hold count 6          max hops             20
  hello time    2          ageing time          300
  force protocol version     rstp
  time since topology change 333040s
  topology change count      1
  topology change            no
  topology change port       swp2.100
  last topology change port  None
cumulus@switch:~$ sudo mstpctl showportdetail br-tag100 | grep -B 2 state
br-tag100:bond2.100 CIST info
  enabled            yes                     role                 Designated
  port id            8.003                   state                forwarding
--
br-tag100:swp1.100 CIST info
  enabled            yes                     role                 Designated
  port id            8.001                   state                forwarding
--
br-tag100:swp2.100 CIST info
  enabled            yes                     role                 Designated
  port id            8.002                   state                forwarding
cumulus@switch:~$ cat /proc/net/vlan/config
VLAN Dev name    | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
bond2.100      | 100  | bond2
bond2.120      | 120  | bond2
bond2.130      | 130  | bond2
swp1.100       | 100  | swp1
swp2.100       | 100  | swp2
swp2.120       | 120  | swp2
swp3.120       | 120  | swp3
swp3.130       | 130  | swp3
cumulus@switch:~$ cat /proc/net/bonding/bond2
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 3
Number of ports: 4
Actor Key: 33
Partner Key: 33
Partner Mac Address: 44:38:39:00:32:cf
Slave Interface: swp4
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 44:38:39:00:32:8e
Aggregator ID: 3
Slave queue ID: 0
Slave Interface: swp5
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 44:38:39:00:32:8f
Aggregator ID: 3
Slave queue ID: 0
Slave Interface: swp6
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 44:38:39:00:32:90
Aggregator ID: 3
Slave queue ID: 0
Slave Interface: swp7
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 44:38:39:00:32:91
Aggregator ID: 3
Slave queue ID: 0
A single bridge cannot contain multiple subinterfaces of the same port as members.
Attempting to apply such a configuration will result in an error:
cumulus@switch:~$ sudo brctl addbr another_bridge
cumulus@switch:~$ sudo brctl addif another_bridge swp9 swp9.100
bridge cannot contain multiple subinterfaces of the same port: swp9, swp9.100
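A planned member list can be checked against this rule before it is applied. The following is a minimal sketch, assuming that the base port is the part of the interface name before the first dot (so swp9 and swp9.100 both belong to swp9). check_bridge_members is a hypothetical helper for illustration and does not replace the check the switch itself performs.

```shell
# Return 0 if no two members share a base port, 1 otherwise.
# The base port is the name before the first "." (swp9.100 -> swp9).
# Usage: check_bridge_members MEMBER [MEMBER ...]
check_bridge_members() {
    seen=" "
    for m in "$@"; do
        base=${m%%.*}
        case "$seen" in
            *" $base "*)
                echo "conflict: $base appears more than once" >&2
                return 1 ;;
        esac
        seen="$seen$base "
    done
    return 0
}
```

For example, check_bridge_members swp1.100 swp2.100 bond2.100 succeeds, while check_bridge_members swp9 swp9.100 reports a conflict and returns a non-zero status.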
VLAN Translation
By default, Cumulus Linux does not allow VLAN subinterfaces associated with different VLAN IDs to be
part of the same bridge. Base interfaces are not explicitly associated with any VLAN IDs and are exempt
from this restriction:
cumulus@switch:~$ sudo brctl addbr br_mix
cumulus@switch:~$ sudo ip link add link swp10 name swp10.100 type vlan id 100
cumulus@switch:~$ sudo ip link add link swp11 name swp11.200 type vlan id 200
cumulus@switch:~$ sudo brctl addif br_mix swp10.100 swp11.200
can't add swp11.200 to bridge br_mix: Invalid argument
In some cases, it may be useful to relax this restriction. For example, two servers may be connected to
the switch using VLAN trunks, but the VLAN numbering provisioned on the two servers is not
consistent. You can choose to bridge two VLAN subinterfaces with different VLAN IDs from the
servers. You do this by enabling the sysctl net.bridge.bridge-allow-multiple-vlans. Packets
entering a bridge from a member VLAN subinterface will egress another member VLAN subinterface
with the VLAN ID translated.
A bridge in VLAN-aware mode (see page 184) cannot have VLAN translation enabled for it;
only bridges configured in traditional mode can utilize VLAN translation.
The following example enables the VLAN translation sysctl:
cumulus@switch:~$ echo net.bridge.bridge-allow-multiple-vlans = 1 | sudo tee /etc/sysctl.d/multiple_vlans.conf
net.bridge.bridge-allow-multiple-vlans = 1
cumulus@switch:~$ sudo sysctl -p /etc/sysctl.d/multiple_vlans.conf
net.bridge.bridge-allow-multiple-vlans = 1
If the sysctl is enabled and you want to disable it, run the above example, setting the sysctl
net.bridge.bridge-allow-multiple-vlans to 0.
Once the sysctl is enabled, ports with different VLAN IDs can be added to the same bridge. In the
following example, packets entering the bridge br_mix from swp10.100 will be bridged to swp11.200
with the VLAN ID translated from 100 to 200:
cumulus@switch:~$ sudo brctl addif br_mix swp10.100 swp11.200
cumulus@switch:~$ sudo brctl show br_mix
bridge name     bridge id               STP enabled     interfaces
br_mix          8000.4438390032bd       yes             swp10.100
                                                        swp11.200
VLAN-aware Bridge Mode for Large-scale Layer 2 Environments
The Cumulus Linux bridge driver supports two configuration modes: one that is VLAN-aware, and one that
follows a more traditional Linux bridge model.
For traditional mode Linux bridges, the kernel supports VLANs in the form of VLAN subinterfaces.
Enabling bridging on multiple VLANs means configuring a bridge for each VLAN and, for each member
port on a bridge, creating one or more VLAN subinterfaces out of that port. This mode poses scalability
challenges in terms of configuration size as well as boot time and run time state management, when
the number of ports times the number of VLANs becomes large.
The VLAN-aware mode in Cumulus Linux implements a configuration model for large-scale L2
environments, with one single instance of Spanning Tree (see page 234). Each physical bridge
member port is configured with the list of allowed VLANs as well as its port VLAN ID (PVID or native
VLAN — see below). MAC address learning, filtering and forwarding are VLAN-aware. This significantly
reduces the configuration size, and eliminates the large overhead of managing the port/VLAN instances
as subinterfaces, replacing them with lightweight VLAN bitmaps and state updates.
You can configure both VLAN-aware and traditional mode bridges on the same network in
Cumulus Linux; however you should not have more than one VLAN-aware bridge on a given
switch. If you are implementing VXLANs (see page 209), you must use bridges in traditional
mode.
The following sections illustrate how to configure VLAN-aware bridges using iproute2 commands.
Contents
Contents (see page 185)
Creating the Bridge (see page 185)
Defining VLAN Memberships (see page 185)
Configuring Router Interfaces (see page 185)
Using the Show Commands (see page 186)
Configuring a VLAN-aware Bridge (see page 187)
Example Basic Configuration (see page 188)
Example Configuration with Access Ports and Pruned VLANs (see page 189)
Example Configuration with Bonds (see page 190)
Example CLAG Configuration (see page 193)
Caveats and Errata (see page 195)
Creating the Bridge
You need to configure only one VLAN-aware bridge, and you need to add only physical ports or bonds
to the bridge. Use ifupdown2 to create the configuration.
Defining VLAN Memberships
With the VLAN-aware bridge mode, VLAN membership is defined for each bridge member interface.
This includes the allowed VLAN list and the PVID of the interface (that is, its native or default VLAN). In the
example used throughout this section, bond0 and bond1 are trunk ports with a native VLAN of 10 and an
allowed VLAN list of 1-1000 and 1010-1020; swp5 is an access port with an access VLAN of 10.
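To make the interaction between the allowed VLAN list and the PVID concrete, the following shell sketch models how an ingress frame is classified on a member port. It is a simplified model written for this explanation, not the bridge driver's code: it assumes an untagged frame takes the port's PVID and that a frame whose VLAN is not in the allowed list is dropped. The helper names vlan_allowed and classify_ingress are hypothetical.

```shell
# Return 0 if VLAN ID $1 is in the allowed list $2, given as
# space-separated IDs or ranges, for example "1-1000 1010-1020".
vlan_allowed() {
    vid=$1
    for r in $2; do
        lo=${r%-*}; hi=${r#*-}      # a single ID such as "10" yields lo=hi=10
        if [ "$vid" -ge "$lo" ] && [ "$vid" -le "$hi" ]; then
            return 0
        fi
    done
    return 1
}

# Print the VLAN an ingress frame lands in, or "drop".
# $1 = VLAN ID from the 802.1Q tag ("" for an untagged frame)
# $2 = PVID of the port, $3 = allowed VLAN list of the port
classify_ingress() {
    tag=$1; pvid=$2; allowed=$3
    if [ -z "$tag" ]; then tag=$pvid; fi
    if vlan_allowed "$tag" "$allowed"; then echo "$tag"; else echo drop; fi
}

classify_ingress ""   10 "1-1000 1010-1020"   # untagged on bond0: prints 10
classify_ingress 1015 10 "1-1000 1010-1020"   # tagged 1015 on bond0: prints 1015
classify_ingress 1005 10 "1-1000 1010-1020"   # tagged 1005 on bond0: prints drop
classify_ingress ""   10 "10"                 # untagged on access port swp5: prints 10
```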
Configuring Router Interfaces
If any VLANs require layer 3 termination, you can configure a router interface as a VLAN
subinterface of the bridge device itself.
To continue with the previous example, say VLAN 10 and VLAN 1000 are layer 3 routed. You can create
the router interfaces by running:
cumulus@switch:~$ sudo ip link add link br name br.10 type vlan id 10
cumulus@switch:~$ sudo ip link add link br name br.1000 type vlan id 1000
Then you use the ip addr add command to assign an IP address to each interface. Note that in order
for the bridge to pass routed traffic on these two VLANs, you need to assign the VLANs in the bridge's
VLAN list. To do this, run:
cumulus@switch:~$ sudo bridge vlan add vid 10 dev br self
cumulus@switch:~$ sudo bridge vlan add vid 1000 dev br self
Using the Show Commands
To show all bridge VLANs:
cumulus@switch:~$ bridge vlan show
port    vlan ids
bond0    10 PVID Egress Untagged
         1-9
         11-1000
         1010-1020

bond1    10 PVID Egress Untagged
         1-9
         11-1000
         1010-1020

swp5     10 PVID Egress Untagged

br       10
         1000
To show membership of a particular VLAN:
cumulus@switch:~$ sudo bridge vlan show vlan 10
VLAN 10:
bond0 bond1 swp5 br0
To show MAC addresses, do one of the following:
cumulus@switch:~$ sudo brctl showmacs br | grep -v yes
port name mac addr              vlan    is local?       ageing timer
bond0     00:e0:ec:25:2f:5b     10      no                39.47
cumulus@switch:~$ sudo bridge fdb show | grep -v perm
00:e0:ec:25:2f:5b dev bond0 vlan 10 port 0
Configuring a VLAN-aware Bridge
To configure a VLAN-aware bridge, include the bridge-vlan-aware attribute, setting it to yes. Name
the bridge bridge to help ensure it is the only VLAN-aware bridge on the switch. The following attributes
are useful for configuring VLAN-aware bridges:
bridge-vlan-aware: set to yes to indicate that the bridge is VLAN-aware.
bridge-access: declares the access port.
bridge-pvid: specifies native VLANs if the ID is other than 1.
bridge-vids: declares the VLANs associated with this bridge.
For a definitive list of bridge attributes, run ifquery --syntax-help and look for the entries under
bridge, bridgevlan and mstpctl.
A basic configuration for a VLAN-aware bridge configured for STP that contains two switch ports looks
like this:
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports swp1 swp2
bridge-stp on
By default, the bridge port inherits the bridge VIDs. You can have a port override the bridge VIDs by
specifying port-specific VIDs, using the bridge-vids attribute in the port's own stanza.
As with traditional bridges, the bridge port membership and bridge attributes remain under bridge
configuration. But bridge port attributes reside under the ports themselves.
When configuring the VLAN attributes for the bridge, put the layer 2 attributes in a separate stanza
using this special VLAN interface: <bridge>.<vlanid/range>. You can specify a range of VLANs as well.
For example:
auto bridge.4094
vlan bridge.4094
address 172.16.101.100
hwaddress 44:38:39:ff:00:00
bridge-igmp-querier-src 172.16.101.1
Or:
auto bridge.[4094-4096]
vlan bridge.[4094-4096]
ATTRIBUTE VALUE
For switched virtual interface configurations, specify a regular bridge.vlanid device with the address
attribute:
auto bridge.4094
iface bridge.4094
address <ipaddr>
hwaddress <mac>
VLAN-aware bridges are backwards compatible with traditional bridge configurations.
Example Basic Configuration
The following is a basic example illustrating how to configure a VLAN-aware bridge using ifupdown2
(see page 125). Add this persistent configuration to /etc/network/interfaces.
Note the attributes used in the stanza:
The bridge-vlan-aware attribute is set to yes, indicating the bridge is VLAN-aware.
The glob keyword referenced in the bridge-ports attribute indicates that swp1 through
swp52 are part of the bridge, instead of enumerating them one by one.
STP (see page 234) is enabled on the bridge.
The bridge-vids attribute declares the VLANs associated with the bridge.
#
# vlan-aware bridge simple example
#
# 'bridge' is a vlan aware bridge with all ports (swp1-52).
# native vlan is by default 1
#
# 'bridge-vids' attribute is used to declare vlans.
# 'bridge-pvid' attribute is used to specify native vlans if other than 1
# 'bridge-access' attribute is used to declare access port
#
#
# ports swp1-swp52 are trunk ports which inherit vlans from 'bridge'
# ie vlans 310 700 707 712 850 910
#
# the following is a vlan aware bridge with ports swp1-swp52
# It has stp on
#
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports glob swp1-52
bridge-stp on
bridge-vids 310 700 707 712 850 910
Example Configuration with Access Ports and Pruned VLANs
The following example contains an access port and a switch port that is pruned; that is, it only sends
and receives traffic tagged to and from a specific set of VLANs declared by the bridge-vids attribute.
It also contains other switch ports that send and receive traffic from all the defined VLANs.
#
# vlan-aware bridge access ports and pruned vlan example
#
# 'bridge' is a vlan aware bridge with all ports (swp1-52).
# native vlan is by default 1
#
# 'bridge-vids' attribute is used to declare vlans.
# 'bridge-pvid' attribute is used to specify native vlans if other than 1
# 'bridge-access' attribute is used to declare access port
#
#
# The following is an access port to vlan 310, no trunking
auto swp1
iface swp1
bridge-access 310
mstpctl-portadminedge yes
mstpctl-bpduguard yes
# The following is a trunk port that is "pruned".
# native vlan is 1, but only .1q tags of 707, 712, 850 are
# sent and received
#
auto swp2
iface swp2
bridge-vids 707 712 850
mstpctl-portadminedge yes
mstpctl-bpduguard yes
# The following port is the trunk uplink and inherits all vlans
# from 'bridge'; bridge assurance is enabled using 'portnetwork' attribute
auto swp49
iface swp49
mstpctl-portpathcost 10
mstpctl-portnetwork yes
# The following port is the trunk uplink and inherits all vlans
# from 'bridge'; bridge assurance is enabled using 'portnetwork' attribute
auto swp50
iface swp50
mstpctl-portpathcost 0
mstpctl-portnetwork yes
#
# ports swp3-swp48 are trunk ports which inherit vlans from the 'bridge'
# ie vlans 310,700,707,712,850,910
#
# the following is a vlan aware bridge with ports swp1-swp52
# It has stp on
#
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports glob swp1-52
bridge-stp on
bridge-vids 310 700 707 712 850 910
Example Configuration with Bonds
This configuration demonstrates a VLAN-aware bridge with a large set of bonds. The per-port VLAN
stanzas for the switch ports are generated from a Mako template.
##
## vlan-aware bridge with bonds example
##
## uplink1, peerlink and downlink are bond interfaces.
## 'bridge' is a vlan aware bridge with ports uplink1, peerlink
## and downlink (swp2-20).
##
## native vlan is by default 1
##
## 'bridge-vids' attribute is used to declare vlans.
## 'bridge-pvid' attribute is used to specify native vlans if other than 1
## 'bridge-access' attribute is used to declare access port
##
auto lo
iface lo
auto eth0
iface eth0 inet dhcp
## bond interface
auto uplink1
iface uplink1
bond-slaves swp32
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
bridge-vids 2000-2079
## bond interface
auto peerlink
iface peerlink
bond-slaves swp30 swp31
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
bridge-vids 2000-2079 4094
## bond interface
auto downlink
iface downlink
bond-slaves swp1
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
bridge-vids 2000-2079
##
## Declare vlans for all swp ports
## swp2-20 get vlans from 2004 to 2022.
## The below uses mako templates to generate iface sections
## with vlans for swp ports
##
%for port, vlanid in zip(range(2, 21), range(2004, 2023)) :
auto swp${port}
iface swp${port}
bridge-vids ${vlanid}
%endfor
## svi vlan 4094
auto bridge.4094
iface bridge.4094
address 11.100.1.252/24
## l2 attributes for vlan 4094
auto bridge.4094
vlan bridge.4094
bridge-igmp-querier-src 172.16.101.1
##
## vlan-aware bridge
##
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports uplink1 peerlink downlink glob swp2-20
bridge-stp on
## svi peerlink vlan
auto peerlink.4094
iface peerlink.4094
address 192.168.10.1/30
broadcast 192.168.10.3
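The Mako loop in the configuration above pairs each switch port with one VLAN. Python's range() excludes its upper bound, so covering swp2 through swp20 and VLANs 2004 through 2022 requires upper bounds of 21 and 2023. The same pairing can be previewed from the shell with the sketch below; gen_port_vids is a hypothetical helper written for this example and is not part of ifupdown2.

```shell
# Print one iface stanza per port, pairing ports FIRST..LAST with
# consecutive VLAN IDs starting at FIRST_VLAN.
# Usage: gen_port_vids FIRST_PORT LAST_PORT FIRST_VLAN
gen_port_vids() {
    port=$1; last=$2; vlan=$3
    while [ "$port" -le "$last" ]; do
        printf 'auto swp%s\n' "$port"
        printf 'iface swp%s\n' "$port"
        printf '    bridge-vids %s\n\n' "$vlan"
        port=$((port + 1))
        vlan=$((vlan + 1))
    done
}

# swp2..swp20 receive VLANs 2004..2022, one VLAN per port.
gen_port_vids 2 20 2004
```

Comparing this output with the stanzas the template renders is a quick way to catch an off-by-one error in the range bounds.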
Example CLAG Configuration
The following configuration shows a VLAN-aware bridge used with Multi-Chassis Link Aggregation
(CLAG) (see page 215). Based on internal testing, Cumulus Networks recommends that you use 4094 for the
peerlink VLAN (peer-bond.4094 below) if possible.
#
# vlan-aware bridge with clag example
#
#
# 'bridge' is a vlan aware bridge with ports:
# 'peer-bond spine-bond glob host-bond-0[1-2]'
#
# All ports inherit 'vlans 10 20-23' from the 'bridge-vids' attribute
# under the bridge
#
# native vlan is by default 1
#
# 'bridge-vids' attribute is used to declare vlans.
# 'bridge-pvid' attribute is used to specify native vlans if other than 1
# 'bridge-access' attribute is used to declare access port
#
# 'spine-bond host-bond-0[1-2]' are clag bonds and will be considered by
# clagd for dual connection. clag-id has to be a non-zero and has to match
# across the peer switches for the bonds to become dual connected.
# spine bond
#
auto spine-bond
iface spine-bond
bond-slaves glob swp19-22
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 100
# mlag bond and peer interface
#
auto peer-bond
iface peer-bond
bond-slaves glob swp23-24
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
# sub-interface for clagd communication
#
auto peer-bond.4094
iface peer-bond.4094
address 169.254.0.1/30
clagd-peer-ip 169.254.0.2
clagd-sys-mac 44:38:39:ff:00:01
#clagd-priority 4096
# Please see man clagd for more options
# clagd-args --peerTimeout 30
# host ports
#
auto host-bond-01
iface host-bond-01
bond-slaves swp1
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1
auto host-bond-02
iface host-bond-02
bond-slaves swp2
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 2
# the bridge
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports peer-bond spine-bond glob host-bond-0[1-2]
bridge-stp on
Caveats and Errata
STP: Because Spanning Tree and Rapid Spanning Tree (see page 234) (STP) are enabled on a
per-bridge basis, VLAN-aware mode essentially supports a single instance of STP across all VLANs. A
common practice when using a single STP instance for all VLANs is to define every VLAN on each
switch in the spanning tree instance. mstpd continues to be the user space protocol daemon,
and Cumulus Linux supports RSTP.
IGMP snooping: IGMP snooping and group membership are supported on a per-VLAN basis,
though the IGMP snooping configuration (including enable/disable, mrouter port and so forth)
is defined on a per-bridge port basis.
VXLANs: Use the traditional configuration mode for VXLAN configuration (see page 209).
Reserved VLAN range: For hardware data plane internal operations, the switching silicon
requires VLANs for every physical port, Linux bridge, and layer 3 subinterface. Cumulus Linux
reserves a range of 700 VLANs by default; this range is 3300-3999. If any of your
user-defined VLANs conflict with the default reserved range, you can modify the range, as long as the
new range is a contiguous set of VLANs with IDs anywhere between 2 and 4094, and the
minimum size of the range is 300 VLANs:
1. Edit /etc/cumulus/switchd.conf, uncomment resv_vlan_range and specify the
new range.
2. Restart switchd (sudo service switchd restart) for the new range to take effect.
While restarting switchd, all running ports will flap and forwarding will be
interrupted.
VLAN translation: A bridge in VLAN-aware mode cannot have VLAN translation enabled for it;
only bridges configured in traditional mode (see page 163) can utilize VLAN translation.
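The reserved VLAN range rules above can be checked before editing switchd.conf. The following is a minimal sketch, assuming the bounds 2 and 4094 are inclusive and that the range is written as a start and end ID. valid_resv_range and in_resv_range are hypothetical helpers for this example; switchd performs its own validation.

```shell
# Return 0 if START..END is acceptable as a reserved range under the
# stated rules: IDs within 2..4094 and at least 300 VLANs long.
# Usage: valid_resv_range START END
valid_resv_range() {
    start=$1; end=$2
    [ "$start" -ge 2 ] && [ "$end" -le 4094 ] &&
        [ $((end - start + 1)) -ge 300 ]
}

# Return 0 if VLAN $1 falls inside the reserved range $2..$3,
# meaning a user-defined VLAN with that ID would conflict.
# Usage: in_resv_range VID START END
in_resv_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# The default range 3300-3999 spans 700 VLANs and passes the check.
valid_resv_range 3300 3999 && echo "default range is valid"
```

For example, in_resv_range 3500 3300 3999 succeeds, which signals that a user-defined VLAN 3500 would require moving the reserved range.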
Network Virtualization
Cumulus Linux supports these forms of network virtualization:
VXLAN, integrated with VMware NSX (see page 196)
VXLAN, integrated with Midokura MidoNet
VXLANs configured directly in Cumulus Linux (see page 209)
VXLAN (Virtual Extensible LAN) is a standard overlay protocol that abstracts logical virtual networks
from the physical network underneath. You can deploy simple and scalable layer 3 Clos architectures
while extending layer 2 segments over that layer 3 network.
VXLAN uses a VLAN-like encapsulation technique to encapsulate MAC-based layer 2 Ethernet frames
within layer 3 UDP packets. Each virtual network is a VXLAN logical L2 segment. VXLAN scales to 16
million segments – a 24-bit VXLAN network identifier (VNI ID) in the VXLAN header – for multi-tenancy.
Hosts on a given virtual network are joined together through an overlay protocol that initiates and
terminates tunnels at the edge of the multi-tenant network, typically the hypervisor vSwitch or top of
rack. These edge points are the VXLAN tunnel end points (VTEP).
Cumulus Linux can initiate and terminate VTEPs in hardware and supports wire-rate VXLAN with
Trident II platforms. VXLAN provides an efficient hashing scheme across IP fabric during the
encapsulation process; the source UDP port is unique, with the hash based on L2-L4 information from
the original frame. The UDP destination port is the standard port 4789.
Cumulus Linux includes the native Linux VXLAN kernel support and integrates with controller-based
overlay solutions like VMware NSX and Midokura MidoNet.
VXLAN is supported only on switches in the Cumulus Linux HCL using Trident II chipsets.
Commands
brctl
bridge fdb add|show
ip link add|show
ovs-pki
ovsdb-client
vtep-ctl
Useful Links
VXLAN IETF draft
ovsdb-server
Integrating with VMware NSX
Switches running Cumulus Linux can integrate with VMware NSX to act as VTEP gateways. The VMware
NSX controller provides consistent provisioning across virtual and physical server infrastructures.
Contents
Contents (see page 197)
Getting Started (see page 197)
Caveats and Errata (see page 197)
Bootstrapping the NSX Integration (see page 198)
Enabling the openvswitch-vtep Package (see page 198)
Using the Bootstrapping Script (see page 198)
Manually Bootstrapping the NSX Integration (see page 199)
Generating the Credentials Certificate (see page 199)
Configuring the Switch as a VTEP Gateway (see page 201)
Configuring the Transport Layer (see page 203)
Configuring the Logical Layer (see page 204)
Defining Logical Switches (see page 204)
Defining Logical Switch Ports (see page 206)
Verifying the VXLAN Configuration (see page 208)
Persistent VXLAN Configuration in NSX (see page 209)
Troubleshooting VXLANs in NSX (see page 209)
Getting Started
Before you integrate VXLANs with NSX, make sure you have the following components:
A switch (L2 gateway) with a Trident II chipset running Cumulus Linux 2.0 or later;
OVSDB server (ovsdb-server), included in Cumulus Linux 2.0 and later
VTEPd (ovs-vtepd), included in Cumulus Linux 2.0 and later
Integrating a VXLAN with NSX involves:
Bootstrapping the NSX Integration
Configuring the Transport Layer
Configuring the Logical Layer
Verifying the VXLAN Configuration
Once you finish the integration, you can make the configuration persistent across upgrades (see
Persistent VXLAN Configuration in NSX (see page 209) below).
Caveats and Errata
The switch with the sourcing VTEP must connect to a router.
There is no support for VXLAN routing in the Trident II chip; use a loopback interface or external
router.
Do not use 0 or 16777215 as the VNI ID, as they are reserved values under Cumulus Linux.
For more information about NSX, see the VMware NSX User Guide, version 4.0.0 or later.
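The VNI restriction in the caveats above follows from the 24-bit identifier: the field holds 2^24 = 16777216 values, numbered 0 through 16777215, and the first and last of these are the reserved ones. The following is a minimal sketch of a pre-check for a planned VNI; valid_vni and VNI_MAX are hypothetical names used only in this example.

```shell
# Largest value that fits in the 24-bit VNI field: 2^24 - 1 = 16777215.
VNI_MAX=$(( (1 << 24) - 1 ))

# Return 0 if $1 is usable as a VNI: it must fit in 24 bits and must not
# be one of the two reserved values (0 and 16777215).
# Usage: valid_vni VNI
valid_vni() {
    [ "$1" -gt 0 ] && [ "$1" -lt "$VNI_MAX" ]
}

valid_vni 10100 && echo "10100 is usable"
valid_vni 16777215 || echo "16777215 is reserved"
```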
Bootstrapping the NSX Integration
Before you start configuring the gateway service and logical switches and ports that comprise the
VXLAN, you need to complete some steps to bootstrap the process. You need to do the bootstrapping
just once, before you begin the integration.
Enabling the openvswitch-vtep Package
Before you start bootstrapping the integration, you need to enable the openvswitch-vtep package,
as it is disabled by default in Cumulus Linux.
1. In /etc/default/openvswitch-vtep, change the START option from no to yes:
cumulus@switch$ cat /etc/default/openvswitch-vtep
# This is a POSIX shell fragment                -*- sh -*-
# Start openvswitch at boot ? yes/no
START=yes
# FORCE_COREFILES: If 'yes' then core files will be enabled.
# FORCE_COREFILES=yes
# BRCOMPAT: If 'yes' and the openvswitch-brcompat package is installed, then
# Linux bridge compatibility will be enabled.
# BRCOMPAT=no
2. Start the daemon:
cumulus@switch$ sudo service openvswitch-vtep start
Make sure to include this file in your persistent configuration (see Persistent VXLAN Configuration in
NSX (see page 209) below) so it’s available after you upgrade Cumulus Linux.
Using the Bootstrapping Script
A script is available so you can do the bootstrapping automatically. For information, read man
vtep-bootstrap. The output of the script is displayed here:
In the above example, the following information was passed to the vtep-bootstrap script:
--credentials-path /var/lib/openvswitch: the path where the certificate and key
pairs for authenticating with the NSX controller are stored.
vtep7: the ID for the VTEP.
192.168.100.17: the IP address of the NSX controller.
172.16.20.157: the datapath IP address of the VTEP.
192.168.100.157: the IP address of the management interface on the switch.
These IP addresses will be used throughout the rest of the examples below.
Manually Bootstrapping the NSX Integration
If you don’t use the script, then you must:
Initialize the OVS database instance
Generate a certificate and key pair for authentication by NSX
Configure a switch as a VTEP gateway
These steps are described next.
Generating the Credentials Certificate
First, in Cumulus Linux, you must generate a certificate that the NSX controller uses for authentication.
1. In a terminal session connected to the switch, run the following commands:
cumulus@switch:~$ sudo ovs-pki init
Creating controllerca...
Creating switchca...
cumulus@switch:~$ sudo ovs-pki req+sign cumulus
cumulus-req.pem Wed Oct 23 05:32:49 UTC 2013
fingerprint b587c9fe36f09fb371750ab50c430485d33a174a
cumulus@switch:~$
cumulus@switch:~$ ls -l
total 12
-rw-r--r-- 1 root root 4028 Oct 23 05:32 cumulus-cert.pem
-rw------- 1 root root 1679 Oct 23 05:32 cumulus-privkey.pem
-rw-r--r-- 1 root root 3585 Oct 23 05:32 cumulus-req.pem
2. In /usr/share/openvswitch/scripts/ovs-ctl-vtep, make sure the lines containing
private-key, certificate and bootstrap-ca-cert point to the correct files; bootstrap-ca-cert is
obtained dynamically the first time the switch talks to the controller:
# Start ovsdb-server.
set ovsdb-server "$DB_FILE"
set "$@" -vANY:CONSOLE:EMER -vANY:SYSLOG:ERR -vANY:FILE:INFO
set "$@" --remote=punix:"$DB_SOCK"
set "$@" --remote=db:Global,managers
set "$@" --remote=ptcp:6633:$LOCALIP
set "$@" --private-key=/root/cumulus-privkey.pem
set "$@" --certificate=/root/cumulus-cert.pem
set "$@" --bootstrap-ca-cert=/root/controller.cacert
If files have been moved or regenerated, restart the OVSDB server and vtepd:
cumulus@switch:~$ sudo service openvswitch-vtep restart
3. Define the NSX controller cluster IP address in OVSDB. This causes the OVSDB server to start
contacting the NSX controller:
cumulus@switch:~$ sudo vtep-ctl set-manager ssl:192.168.100.17:6632
4. Define the local IP address on the VTEP for VXLAN tunnel termination. First, find the physical
switch name as recorded in OVSDB:
cumulus@switch:~$ sudo vtep-ctl list-ps
vtep7
Then set the tunnel source IP address of the VTEP. This is the datapath address of the VTEP,
which is typically an address on a loopback interface on the switch that is reachable from the
underlying L3 network:
cumulus@switch:~$ sudo vtep-ctl set Physical_Switch vtep7 tunnel_ips=172.16.20.157
Once you finish generating the certificate, keep the terminal session active, as you need to paste the
certificate into NSX Manager when you configure the VTEP gateway.
Configuring the Switch as a VTEP Gateway
After you create a certificate, connect to NSX Manager in a browser to configure a Cumulus Linux
switch as a VTEP gateway. In this example, the IP address of the NSX manager is 192.168.100.12.
1. In NSX Manager, add a new gateway. Click the Network Components tab, then the Transport
Layer category. Under Transport Node, click Add, then select Manually Enter All Fields. The
Create Gateway wizard appears.
2. In the Create Gateway dialog, select Gateway for the Transport Node Type, then click Next.
3. In the Display Name field, give the gateway a name, then click Next.
4. Enable the VTEP service. Select the VTEP Enabled checkbox, then click Next.
5. From the terminal session connected to the switch where you generated the certificate, copy the
certificate and paste it into the Security Certificate text field. Copy only the bottom portion,
including the BEGIN CERTIFICATE and END CERTIFICATE lines. For example, copy all the
highlighted text in the terminal:
And paste it into NSX Manager:
Then click Next.
6. In the Connectors dialog, click Add Connector to add a transport connector. This defines the
tunnel endpoint that terminates the VXLAN tunnel and connects NSX to the physical gateway.
You must choose a tunnel Transport Type of VXLAN. Choose an existing transport zone for the
connector, or click Create to create a new transport zone.
7. Define the connector’s IP address (that is, the underlay IP address on the switch for tunnel
termination).
8. Click OK to save the connector, then click Save to save the gateway.
Once communication is established between the switch and the controller, a controller.cacert file
will be downloaded onto the switch.
Verify the controller and switch handshake is successful. In a terminal connected to the switch, run this
command:
cumulus@switch:~$ sudo ovsdb-client dump -f list | grep -A 7 "Manager"
Manager table
_uuid            : 505f32af-9acb-4182-a315-022e405aa479
inactivity_probe : 30000
is_connected     : true
max_backoff      : []
other_config     : {}
status           : {sec_since_connect="18223", sec_since_disconnect="18225", state=ACTIVE}
target           : "ssl:192.168.100.17:6632"
Configuring the Transport Layer
After you finish bootstrapping the NSX integration, you need to configure the transport layer. For each
host-facing switch port that is to be associated with a VXLAN instance, define a Gateway Service for
the port.
1. In NSX Manager, add a new gateway service. Click the Network Components tab, then the
Services category. Under Gateway Service, click Add. The Create Gateway Service wizard
appears.
2. In the Create Gateway Service dialog, select VTEP L2 Gateway Service as the Gateway Service
Type.
3. Give the service a Display Name to represent the VTEP in NSX.
4. Click Add Gateway to associate the service with the gateway you created earlier.
5. In the Transport Node field, choose the name of the gateway you created earlier.
6. In the Port ID field, choose the physical port on the gateway (for example, swp10) that will
connect to a logical L2 segment and carry data traffic.
7. Click OK to save this gateway in the service, then click Save to save the gateway service.
The gateway service shows up as type VTEP L2 in NSX.
Next, you will configure the logical layer on NSX.
Configuring the Logical Layer
To complete the integration with NSX, you need to configure the logical layer, which requires defining a
logical switch (the VXLAN instance) and all the logical ports needed.
Defining Logical Switches
To define the logical switch, do the following:
1. In NSX Manager, add a new logical switch. Click the Network Components tab, then the Logical
Layer category. Under Logical Switch, click Add. The Create Logical Switch wizard appears.
2. In the Display Name field, enter a name for the logical switch, then click Next.
3. Under Replication Mode, select Service Nodes, then click Next.
4. Specify the transport zone bindings for the logical switch. Click Add Binding. The Create
Transport Zone Binding dialog appears.
5. In the Transport Type list, select VXLAN, then click OK to add the binding to the logical switch.
6. In the VNI field, assign the switch a VNI ID, then click OK.
Do not use 0 or 16777215 as the VNI ID, as they are reserved values under Cumulus
Linux.
7. Click Save to save the logical switch configuration.
Defining Logical Switch Ports
As the final step, define the logical switch ports. They can be virtual machine VIF interfaces from a
registered OVS, or a VTEP gateway service instance on this switch, as defined above in Configuring
the Transport Layer. A VLAN binding can be defined for each VTEP gateway service associated with the
particular logical switch.
To define the logical switch ports, do the following:
1. In NSX Manager, add a new logical switch port. Click the Network Components tab, then the
Logical Layer category. Under Logical Switch Port, click Add. The Create Logical Switch Port
wizard appears.
2. In the Logical Switch UUID list, select the logical switch you created above, then click Create.
3. In the Display Name field, give the port a name that indicates it is the port that connects the
gateway, then click Next.
4. In the Attachment Type list, select VTEP L2 Gateway.
5. In the VTEP L2 Gateway Service UUID list, choose the name of the gateway service you created
earlier.
6. In the VLAN list, you can optionally choose a VLAN if you wish to connect only traffic on a specific
VLAN of the physical network. Leave it blank to handle all traffic.
7. Click Save to save the logical switch port. Connectivity is established. Repeat this procedure for
each logical switch port you want to define.
Verifying the VXLAN Configuration
Once configured, you can verify the VXLAN configuration using these Cumulus Linux commands in a
terminal connected to the switch:
cumulus@switch1:~$ sudo ip -d link show vxln100
71: vxln100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
master br-vxln100 state UNKNOWN mode DEFAULT
link/ether d2:ca:78:bb:7c:9b brd ff:ff:ff:ff:ff:ff
vxlan id 100 local 172.16.20.157 port 32768 61000 nolearning ageing 300
svcnode 172.16.21.125
or
cumulus@switch1:~$ sudo bridge fdb show
52:54:00:ae:2a:e0 dev vxln100 dst 172.16.21.150 self permanent
d2:ca:78:bb:7c:9b dev vxln100 permanent
90:e2:ba:3f:ce:34 dev swp2s1.100
90:e2:ba:3f:ce:35 dev swp2s0.100
44:38:39:00:48:0e dev swp2s1.100 permanent
44:38:39:00:48:0d dev swp2s0.100 permanent
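When an FDB listing is long, it can be easier to pull out only the entries that point at a remote tunnel endpoint. The following Python sketch does that for text in the format shown above. The function name remote_vtep_entries is hypothetical, and the parser assumes each line has the form "<mac> dev <device> [dst <ip>] [flags]".

```python
def remote_vtep_entries(fdb_output, vxlan_dev):
    """Map MAC address -> remote tunnel IP for entries on vxlan_dev with a 'dst'."""
    entries = {}
    for line in fdb_output.splitlines():
        fields = line.split()
        # Expect: <mac> dev <device> [dst <ip>] [flags...]
        if len(fields) < 3 or fields[1] != "dev" or fields[2] != vxlan_dev:
            continue
        if "dst" in fields:
            idx = fields.index("dst")
            if idx + 1 < len(fields):
                entries[fields[0]] = fields[idx + 1]
    return entries


sample = """52:54:00:ae:2a:e0 dev vxln100 dst 172.16.21.150 self permanent
d2:ca:78:bb:7c:9b dev vxln100 permanent
90:e2:ba:3f:ce:34 dev swp2s1.100
"""
print(remote_vtep_entries(sample, "vxln100"))
```

Entries without a dst field, such as the local permanent entry for the VXLAN device itself, are skipped.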
Persistent VXLAN Configuration in NSX
If you want your VXLAN configuration to persist across upgrades of Cumulus Linux, you need to include
the following items in the persistent configuration. Use scp to copy the files to /mnt/persist:
/usr/share/openvswitch/scripts/ovs-ctl-vtep
Certificates and key pairs, as above
/etc/default/openvswitch-vtep
The ovsdb database file; the default is /var/lib/openvswitch/conf.db
Copying the ovsdb database file is optional; the persistent database file helps to
speed up convergence on a system upgrade. NSX Manager pushes any configuration
created or changed in NSX Manager when the connection with the VTEP is
reestablished, which overwrites the database file.
Troubleshooting VXLANs in NSX
Use ovsdb-client dump to troubleshoot issues on the switch. It verifies that the controller and switch
handshake is successful. This command works only for VXLANs integrated with NSX:
cumulus@switch:~$ sudo ovsdb-client dump -f list | grep -A 7 "Manager"
Manager table
_uuid            : 505f32af-9acb-4182-a315-022e405aa479
inactivity_probe : 30000
is_connected     : true
max_backoff      : []
other_config     : {}
status           : {sec_since_connect="18223", sec_since_disconnect="18225", state=ACTIVE}
target           : "ssl:192.168.100.17:6632"
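If you want to check the handshake from a script rather than by eye, the Manager record can be parsed into a dictionary. The Python sketch below is one possible approach, not a Cumulus Linux tool. It assumes list-format output with one "column : value" pair per line, and the function names manager_status and is_connected are hypothetical.

```python
def manager_status(dump_output):
    """Collect 'column : value' pairs from list-format ovsdb-client output."""
    fields = {}
    for line in dump_output.splitlines():
        # Split on the first colon only, so values such as
        # "ssl:192.168.100.17:6632" keep their own colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def is_connected(dump_output):
    """Return True when the Manager record reports is_connected as true."""
    return manager_status(dump_output).get("is_connected") == "true"


sample = """Manager table
_uuid            : 505f32af-9acb-4182-a315-022e405aa479
is_connected     : true
target           : "ssl:192.168.100.17:6632"
"""
print(is_connected(sample))
```

A result of False means either that the record reports a disconnected manager or that no is_connected column was found in the text.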
Configuring a VXLAN without a Controller
Cumulus Linux includes the native Linux VXLAN kernel support, without need for a controller like
VMware NSX or Midokura MidoNet. VXLAN constructs can be leveraged for rapid integration with
existing overlay solutions by simply translating the overlay controller instructions into a standard Linux
kernel VXLAN construct.
Contents
Contents (see page 209)
Requirements (see page 210)
Example VXLAN Configuration (see page 211)
Persistent Configuration Using ifupdown2 (see page 211)
Runtime Configuration (see page 212)
Persistent VXLAN Configuration in Cumulus Linux (see page 214)
Troubleshooting VXLANs in Cumulus Linux (see page 214)
Requirements
A VXLAN configuration requires:
A platform with hardware support for VXLAN: a switch with a Trident II chipset running Cumulus
Linux 2.0 or later.
A service to carry unknown destination, broadcast and multicast frames. As mentioned in the
VXLAN IETF documents, you can do this through various mechanisms such as a learning-based
control plane (like multicast) or through a central authority (like a service node).
For a basic VXLAN configuration, you should ensure that:
The VXLAN has a network identifier (VNI); do not use 0 or 16777215 as the VNI ID, as they are
reserved values under Cumulus Linux.
The VXLAN instance is modeled as a link (netdev).
The VXLAN link and local interfaces are added to bridge to create the association between port,
VLAN and VXLAN instance.
Each bridge on the switch has only one VXLAN interface. Cumulus Linux does not support more
than one VXLAN link in a bridge; however a switch can have multiple bridges.
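The VNI rule above is easy to check before you create a link. A VNI is a 24-bit value, so the possible range is 0 to 16777215, and the two end values are the ones reserved under Cumulus Linux. The Python sketch below encodes that check; the function name is_usable_vni is hypothetical.

```python
MAX_VNI = 2**24 - 1           # 16777215, the largest 24-bit value
RESERVED_VNIS = {0, MAX_VNI}  # reserved under Cumulus Linux, per the text above


def is_usable_vni(vni):
    """Return True if vni is a 24-bit integer that is not a reserved value."""
    if isinstance(vni, bool) or not isinstance(vni, int):
        return False
    return 0 <= vni <= MAX_VNI and vni not in RESERVED_VNIS


for candidate in (0, 10, 1000, 16777215):
    print(candidate, is_usable_vni(candidate))
```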
When using VXLAN without a controller, you use static ARP entries to reach hosts across the tunnel.
To configure a VXLAN in Cumulus Linux without a controller, run the following commands in a terminal
connected to the switch:
1. Create a VXLAN link:
cumulus@switch1:~$ sudo ip link add <name> type vxlan id <vni> local
<ip addr> [group <mcast group address>] [no] nolearning [ttl] [tos]
[dev] [port MIN MAX] [ageing <value>] [svcnode addr]
If you specify ageing, you must also specify the service node (svcnode).
2. Add a VXLAN link to a bridge:
cumulus@switch1:~$ sudo brctl addif br-vxlan <name>
3. Install a static MAC binding to a remote tunnel IP:
cumulus@switch1:~$ sudo bridge fdb add <mac addr> dev <device> dst <ip addr> vni <vni> port <port> via <device>
4. Show VXLAN link and FDB:
cumulus@switch1:~$ sudo ip -d link show
cumulus@switch1:~$ sudo bridge fdb show
Example VXLAN Configuration
Consider the following example:
You can recreate this configuration two ways:
By creating a persistent configuration using ifupdown2 (see page 125)
By creating a runtime configuration with ip commands
Pre-configuring remote MAC addresses does not scale. A better solution is to use an
integrated solution such as VMware NSX (see page 196).
Persistent Configuration Using ifupdown2
You can create a persistent configuration for your VXLANs in Cumulus Linux. Use ifupdown2 (see page
125) syntax when you add the following configuration to the /etc/network/interfaces file on
switch1:
auto vtep1000
iface vtep1000
    vxlan-id 1000
    vxlan-local-tunnelip 172.10.1.1

auto br-100
iface br-100
    bridge-ports swp1.100 swp2.100 vtep1000
    post-up bridge fdb add 00:00:10:00:00:0C dev vtep1000 dst 172.20.1.1 vni 1000
Next, add the following configuration to the /etc/network/interfaces file on switch2:
auto vtep1000
iface vtep1000
    vxlan-id 1000
    vxlan-local-tunnelip 172.20.1.1

auto br-100
iface br-100
    bridge-ports swp1.100 swp2.100 vtep1000
    post-up bridge fdb add 00:00:10:00:00:0A dev vtep1000 dst 172.10.1.1 vni 1000
    post-up bridge fdb add 00:00:10:00:00:0B dev vtep1000 dst 172.10.1.1 vni 1000
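Because the two switches differ only in the local tunnel IP and in the remote MAC bindings, the stanzas lend themselves to generation from a small set of parameters. The Python sketch below renders text in the same shape as the configuration above. It is an illustration only: the function name vxlan_stanzas and its parameters are hypothetical, and the vtep<VNI> interface name simply follows the naming used in this example.

```python
def vxlan_stanzas(vni, local_ip, bridge, ports, remote_macs):
    """Render ifupdown2 stanzas for one VXLAN link and the bridge that holds it.

    remote_macs is a list of (mac, remote_tunnel_ip) pairs.
    """
    vtep = "vtep{}".format(vni)  # naming convention taken from the example
    lines = [
        "auto " + vtep,
        "iface " + vtep,
        "    vxlan-id {}".format(vni),
        "    vxlan-local-tunnelip " + local_ip,
        "",
        "auto " + bridge,
        "iface " + bridge,
        "    bridge-ports {} {}".format(" ".join(ports), vtep),
    ]
    for mac, remote_ip in remote_macs:
        lines.append(
            "    post-up bridge fdb add {} dev {} dst {} vni {}".format(
                mac, vtep, remote_ip, vni
            )
        )
    return "\n".join(lines)


switch2 = vxlan_stanzas(
    1000,
    "172.20.1.1",
    "br-100",
    ["swp1.100", "swp2.100"],
    [("00:00:10:00:00:0A", "172.10.1.1"), ("00:00:10:00:00:0B", "172.10.1.1")],
)
print(switch2)
```

Review the generated text before adding it to /etc/network/interfaces; the sketch does not validate addresses or interface names.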
Runtime Configuration
1. Configure hosts A and B as part of the same tenant as C (VNI 10) on switch1. Hosts A and B are
part of VLAN 100. To configure the VTEP interface with VNI 10, run the following commands in a
terminal connected to switch1 running Cumulus Linux:
cumulus@switch1:~$ sudo ip link add link swp1 name swp1.100 type vlan id 100
cumulus@switch1:~$ sudo ip link add link swp2 name swp2.100 type vlan id 100
cumulus@switch1:~$ sudo ip link add vtep1000 type vxlan id 10 local 172.10.1.1 nolearning
cumulus@switch1:~$ sudo ip link set swp1 up
cumulus@switch1:~$ sudo ip link set swp2 up
cumulus@switch1:~$ sudo ip link set vtep1000 up
2. Configure VLAN 100 and VTEP 1000 to be part of the same bridge br-100 on switch1:
cumulus@switch1:~$ sudo brctl addbr br-100
cumulus@switch1:~$ sudo ip link set br-100 up
cumulus@switch1:~$ sudo brctl addif br-100 swp1.100 swp2.100
cumulus@switch1:~$ sudo brctl addif br-100 vtep1000
3. Install a static MAC binding to a remote tunnel IP, assuming the MAC address for host C is
00:00:10:00:00:0C:
cumulus@switch1:~$ sudo bridge fdb add 00:00:10:00:00:0C dev vtep1000 dst 172.20.1.1
4. Configure host C as part of the same tenant as hosts A and B on switch2:
cumulus@switch2:~$ sudo ip link add link swp1 name swp1.100 type vlan id 100
cumulus@switch2:~$ sudo ip link add name vtep1000 type vxlan id 10 local 172.20.1.1 nolearning
cumulus@switch2:~$ sudo ip link set swp1 up
cumulus@switch2:~$ sudo ip link set vtep1000 up
5. Configure VLAN 100 and VTEP 1000 to be part of the same bridge br-100 on switch2:
cumulus@switch2:~$ sudo brctl addbr br-100
cumulus@switch2:~$ sudo ip link set br-100 up
cumulus@switch2:~$ sudo brctl addif br-100 swp1.100
cumulus@switch2:~$ sudo brctl addif br-100 vtep1000
6. Install a static MAC binding to a remote tunnel IP on switch2, assuming the MAC address for host
A is 00:00:10:00:00:0A and the MAC address for host B is 00:00:10:00:00:0B:
cumulus@switch2:~$ sudo bridge fdb add 00:00:10:00:00:0A dev vtep1000 dst 172.10.1.1
cumulus@switch2:~$ sudo bridge fdb add 00:00:10:00:00:0B dev vtep1000 dst 172.10.1.1
7. Verify the configuration on switch1, then on switch2:
cumulus@switch1:~$ sudo ip -d link show
cumulus@switch1:~$ sudo bridge fdb show
cumulus@switch2:~$ sudo ip -d link show
cumulus@switch2:~$ sudo bridge fdb show
8. Set the static ARP entry for host C on host A:
root@hostA:~# sudo arp -s 10.1.1.3 00:00:10:00:00:0C
9. Set the static ARP entry for host C on host B:
root@hostB:~# sudo arp -s 10.1.1.3 00:00:10:00:00:0C
10. Set the static ARP entries for hosts A and B on host C:
root@hostC:~# arp -s 10.1.1.1 00:00:10:00:00:0A
root@hostC:~# arp -s 10.1.1.2 00:00:10:00:00:0B
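The switch-side steps in this procedure follow the same pattern on both switches, so the command list can be derived from a few parameters. The Python sketch below only builds the command strings; it does not run them. The function name runtime_commands and its parameters are hypothetical, and the commands are the same ip, brctl and bridge invocations used in the steps above (run each with sudo on the switch).

```python
def runtime_commands(ports, vlan, vtep, vni, local_ip, bridge, remote_macs):
    """Build the runtime commands for one switch, in the order used above.

    remote_macs is a list of (mac, remote_tunnel_ip) pairs.
    """
    cmds = []
    # VLAN subinterfaces on the host-facing ports.
    for port in ports:
        cmds.append(
            "ip link add link {p} name {p}.{v} type vlan id {v}".format(p=port, v=vlan)
        )
    # The VXLAN link itself.
    cmds.append(
        "ip link add {} type vxlan id {} local {} nolearning".format(vtep, vni, local_ip)
    )
    for port in ports:
        cmds.append("ip link set {} up".format(port))
    cmds.append("ip link set {} up".format(vtep))
    # Bridge that ties the VLAN subinterfaces to the VXLAN link.
    cmds.append("brctl addbr " + bridge)
    cmds.append("ip link set {} up".format(bridge))
    members = ["{}.{}".format(port, vlan) for port in ports] + [vtep]
    cmds.append("brctl addif {} {}".format(bridge, " ".join(members)))
    # Static MAC bindings to the remote tunnel IP.
    for mac, remote_ip in remote_macs:
        cmds.append("bridge fdb add {} dev {} dst {}".format(mac, vtep, remote_ip))
    return cmds


switch2_cmds = runtime_commands(
    ["swp1"], 100, "vtep1000", 10, "172.20.1.1", "br-100",
    [("00:00:10:00:00:0A", "172.10.1.1"), ("00:00:10:00:00:0B", "172.10.1.1")],
)
for cmd in switch2_cmds:
    print("sudo " + cmd)
```

The sketch adds all bridge members in a single brctl addif call, whereas the steps above add the VXLAN link in a separate call; the resulting bridge membership is intended to be the same.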
Persistent VXLAN Configuration in Cumulus Linux
In order for your VXLAN configuration to persist across Cumulus Linux upgrades (see Making
Configurations Persist across Upgrades (see page 97)), use scp to copy the following files to
/mnt/persist:
Certificates and keys, including controller.cacert, cumulus-privkey.pem, cumulus-cert.pem,
cumulus-req.pem
/var/lib/openvswitch/conf.db, the OVSDB database
Troubleshooting VXLANs in Cumulus Linux
Use the following commands to troubleshoot issues on the switch:
brctl show: Verifies the VXLAN configuration in a bridge:
cumulus@switch:~$ sudo brctl show
bridge name     bridge id            STP enabled    interfaces
br-vxln100      8000.44383900480d    no             swp2s0.100
                                                    swp2s1.100
                                                    vxln100
bridge fdb show: Displays the list of MAC addresses in an FDB:
cumulus@switch1:~$ sudo bridge fdb show
52:54:00:ae:2a:e0 dev vxln100 dst 172.16.21.150 self permanent
d2:ca:78:bb:7c:9b dev vxln100 permanent
90:e2:ba:3f:ce:34 dev swp2s1.100
90:e2:ba:3f:ce:35 dev swp2s0.100
44:38:39:00:48:0e dev swp2s1.100 permanent
44:38:39:00:48:0d dev swp2s0.100 permanent
ip -d link show: Displays information about the VXLAN link:
cumulus@switch1:~$ sudo ip -d link show vxln100
71: vxln100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-vxln100 state UNKNOWN mode DEFAULT
link/ether d2:ca:78:bb:7c:9b brd ff:ff:ff:ff:ff:ff
vxlan id 100 local 172.16.20.103 port 32768 61000 nolearning ageing 300 svcnode 172.16.21.125
Multi-Chassis Link Aggregation - CLAG - MLAG
Host HA is a set of L2 and L3 features supporting high availability for hosts, including Multi-Chassis Link
Aggregation (CLAG) for L2 and redistribute neighbor (an experimental L3 feature).
Multi-Chassis Link Aggregation, or CLAG, is the MLAG implementation in Cumulus Linux. CLAG enables
a server or switch with a two-port bond (such as a link aggregation group/LAG, EtherChannel, port
group, or trunk) to connect those ports to different switches and operate as if they are connected to a
single, logical switch. This provides greater redundancy and greater system throughput.
Dual-connected devices can create LACP bonds that contain links to each physical switch. Thus,
active-active links from the dual-connected devices are supported even though they are connected to two
different physical switches.
A basic setup looks like this:
The two switches, S1 and S2, known as peer switches, cooperate so that they appear as a single device
to host H1's bond. H1 distributes traffic between the two links to S1 and S2 in any manner that you
configure on the host. Similarly, traffic inbound to H1 can traverse S1 or S2 and arrive at H1.
Contents
Contents (see page 216)
CLAG Requirements (see page 217)
LACP and Dual-Connectedness (see page 218)
Understanding Switch Roles (see page 218)
Configuring CLAG (see page 219)
Configuring the Host or Switch (see page 219)
Configuring the Interfaces (see page 219)
Example CLAG Configuration (see page 220)
Using the clagd Command Line Interface (see page 224)
Peer Link Interfaces and the PROTO_DOWN State (see page 225)
Specifying a Backup Link (see page 225)
Monitoring Dual-Connected Peers (see page 226)
CLAG Best Practices (see page 227)
IGMP Snooping with CLAG (see page 227)
clagd Daemon Status Monitoring (see page 228)
STP Interoperability with CLAG (see page 229)
Debugging STP with CLAG (see page 229)
Best Practices for STP with CLAG (see page 230)
Troubleshooting CLAG (see page 230)
Caveats and Errata (see page 230)
Configuration Files (see page 230)
CLAG Requirements
CLAG has these requirements:
There must be a direct connection between the two peer switches implementing CLAG (S1 and
S2). This is typically a bond for increased reliability and bandwidth.
There must be only two peer switches in one CLAG configuration, but you can have multiple
configurations in a network (see below).
The peer switches implementing CLAG must be running Cumulus Linux version 2.5 or later.
You must specify a unique clag-id for every dual-connected bond on each peer switch; the
value must be between 1 and 65535 and must be the same on both peer switches in order for
the bond to be considered dual-connected.
The dual-connected devices (hosts or switches) must use LACP (IEEE 802.3ad/802.1ax) to form
the bond. The peer switches must also use LACP.
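The clag-id requirement can be checked mechanically once you know the bond-to-clag-id mapping on each peer. The Python sketch below is one way to do so. It is not part of clagd; the function name clag_id_problems is hypothetical, and it assumes that corresponding bonds carry the same name on both peer switches.

```python
def clag_id_problems(peer_a, peer_b):
    """Check clag-id values; each argument maps bond name -> clag-id.

    Assumes corresponding bonds have the same name on both peers.
    """
    problems = []
    for name, peer in (("peer A", peer_a), ("peer B", peer_b)):
        seen = {}
        for bond, clag_id in sorted(peer.items()):
            # The value must be between 1 and 65535.
            if not 1 <= clag_id <= 65535:
                problems.append(
                    "{}: {} has clag-id {} outside 1-65535".format(name, bond, clag_id)
                )
            # The value must be unique per dual-connected bond on a switch.
            if clag_id in seen:
                problems.append(
                    "{}: {} and {} share clag-id {}".format(
                        name, seen[clag_id], bond, clag_id
                    )
                )
            seen[clag_id] = bond
    # The value must be the same on both peers for the same bond.
    for bond in sorted(set(peer_a) & set(peer_b)):
        if peer_a[bond] != peer_b[bond]:
            problems.append(
                "{}: clag-id differs between peers ({} vs {})".format(
                    bond, peer_a[bond], peer_b[bond]
                )
            )
    return problems


print(clag_id_problems({"downlink1": 1, "downlink2": 2},
                       {"downlink1": 1, "downlink2": 3}))
```

An empty list means that none of the three checks found a problem.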
More elaborate configurations are also possible. The number of links between the host and the
switches can be greater than two, and does not have to be symmetrical:
Additionally, since S1 and S2 appear as a single switch to other bonding devices, pairs of CLAG switches
can also be connected to each other:
In this case, L1 and L2 are also CLAG peer switches, and thus present a two-port bond from a single
logical system to S1 and S2. S1 and S2 do the same as far as L1 and L2 are concerned.
LACP and Dual-Connectedness
In order for CLAG to operate correctly, the peer switches must know which links are dual-connected, or
are connected to the same host or switch. To do this, specify a clag-id for every dual-connected bond
on each peer switch; the clag-id must be the same for the corresponding bonds on both peer
switches. Link Aggregation Control Protocol (LACP), the IEEE standard protocol for managing bonds, is
used for verifying dual-connectedness. LACP runs on the dual-connected device and on each of the
peer switches. On the dual-connected device, the only configuration requirement is to create a bond
that will be managed by LACP.
On each of the peer switches the links connected to the dual-connected host or switch must be placed
in the bond. This is true even if the links are a single port on each peer switch, where each port is
placed into a bond, as shown below:
All of the dual-connected bonds on the peer switches have their system ID set to the CLAG system ID.
Therefore, from the point of view of the hosts, each of the links in its bond is connected to the same
system, and so the host will use both links.
Each peer switch periodically makes a list of the LACP partner MAC addresses of all of its bonds and
sends that list to its peer (using the clagd daemon; see below). The LACP partner MAC address is the
MAC address of the system at the other end of a bond, which in the figure above would be hosts H1
and H2. When a switch receives this list from its peer, it compares the list to the LACP partner MAC
addresses on its own bonds. If any matches are found and the clag-id for those bonds match, then that
bond is a dual-connected bond. You can also find the LACP partner MAC address in the
/sys/class/net/<bondname>/bonding/ad_partner_mac sysfs file for each bond.
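The comparison described above can be sketched in a few lines of Python. This is an illustration of the matching rule only, not the clagd implementation. The function name dual_connected is hypothetical, and each switch is represented as a mapping from bond name to a (LACP partner MAC, clag-id) pair.

```python
def dual_connected(local_bonds, peer_bonds):
    """Return local bonds whose partner MAC and clag-id both match a peer bond.

    Each argument maps bond name -> (LACP partner MAC, clag-id).
    """
    # MAC addresses are compared case-insensitively.
    peer_index = {(mac.lower(), clag_id) for mac, clag_id in peer_bonds.values()}
    return sorted(
        bond
        for bond, (mac, clag_id) in local_bonds.items()
        if (mac.lower(), clag_id) in peer_index
    )


s1 = {"bond1": ("00:02:00:00:00:01", 1), "bond2": ("00:02:00:00:00:02", 2)}
s2 = {"bond1": ("00:02:00:00:00:01", 1), "bond2": ("00:02:00:00:00:09", 2)}
print(dual_connected(s1, s2))
```

In this made-up data, bond2 on the two switches reports different partner MAC addresses, so only bond1 qualifies as dual-connected.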
Understanding Switch Roles
Each CLAG-enabled switch in the pair has a role. When the peering relationship is established between
the two switches, one switch will be in primary role, and the other one will be in secondary role.
By default, the role is determined by comparing the MAC addresses of the two sides of the peering link;
the switch with the lower MAC address assumes the primary role. You can override this by setting the
priority configuration, either by specifying the clagd-priority option in /etc/network/interfaces,
or by using clagctl. The switch with the lower priority value is given the primary role; the default
value is 32768, and the range is 0 to 65535. See the clagd(8) and clagctl(8) man pages.
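The election rule can be illustrated with a short Python sketch. It is a model of the rule as described here, not the clagd code. The function names are hypothetical, and the sketch assumes that when both priorities are equal (for example, both left at the default of 32768) the decision falls back to the lower peer-link MAC address.

```python
DEFAULT_PRIORITY = 32768


def mac_to_int(mac):
    """Convert a colon-separated MAC address to an integer for comparison."""
    return int(mac.replace(":", ""), 16)


def elect_primary(switch_a, switch_b):
    """Return the name of the switch that takes the primary role.

    Each switch is a (name, priority, peer-link MAC) tuple. The lower
    priority value wins; equal priorities fall back to the lower MAC
    (an assumption of this sketch).
    """
    def sort_key(switch):
        return (switch[1], mac_to_int(switch[2]))

    return min(switch_a, switch_b, key=sort_key)[0]


spine1 = ("spine1", 4096, "44:38:39:00:00:02")
spine2 = ("spine2", 8192, "44:38:39:00:00:01")
print(elect_primary(spine1, spine2))
```

With the priorities shown, spine1 is elected even though spine2 has the lower MAC address in this made-up data.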
When the clagd service exits during a switch reboot, or the service is stopped on the primary switch,
the peer switch in the secondary role becomes primary. If the primary switch goes down without
stopping the clagd daemon for any reason, or if the peer link goes down, the secondary switch does
not change its role. If the peer switch is determined to be not alive, the switch in the secondary
role rolls back the LACP system ID to the bond interface MAC address instead of the clagd-sys-mac,
while the switch in the primary role uses the clagd-sys-mac as the LACP system ID on the bonds.
When a CLAG-enabled switch is in the secondary role, it does not send STP BPDUs on dual-connected
links; it only sends BPDUs on single-connected links. The switch in the primary role sends STP BPDUs
on all single- and dual-connected links.
Configuring CLAG
Configuring CLAG involves:
Creating a bond that uses LACP on the dual-connected devices.
Configuring the interfaces, including bonds, VLANs, bridges and peer links, on each peer switch.
CLAG synchronizes the dynamic state between the two peer switches, but it does not
synchronize the switch configurations. After modifying the configuration of one peer switch,
you must make the same changes to the configuration on the other peer switch. This applies
to all configuration changes, including:
Port configuration: For example, VLAN membership, MTU, and bonding parameters.
Bridge configuration: For example, spanning tree parameters or bridge properties.
Static address entries: For example, static FDB entries and static IGMP entries.
QoS configuration: For example, ACL entries.
You can verify the configuration of VLAN membership using the clagctl -v verifyvlans
command.
Configuring the Host or Switch
On your dual-connected device, create a bond that uses LACP. The method you use varies with the type
of device you are configuring.
Configuring the Interfaces
Every interface that connects to the CLAG pair from a dual-connected device should be placed into a
bond (see page 160), even if the bond contains only a single link on a single physical switch (since the
CLAG pair contains two or more links). Single-attached hosts, also known as orphan ports, can be just a
member of the bridge.
Additionally, the fast mode of LACP should be configured on the bond to allow more timely updates of
the LACP state. These bonds will then be placed in a bridge, which will include the peer link between
the switches.
In order to enable communication between the clagd daemons on the peer switches, you should
choose an unused VLAN and assign an unrouteable link-local address to give the peer switches layer 3
connectivity between each other. To ensure that the VLAN is completely independent of the bridge and
spanning tree forwarding decisions, configure the VLAN as a VLAN subinterface on the peerlink bond
rather than the VLAN-aware bridge. Cumulus Networks recommends you use 4094 for the peerlink
VLAN (peerlink.4094 below) if possible.
For example, if peerlink is the inter-chassis bond, and VLAN 4094 is the peerlink VLAN, configure
peerlink.4094 using:
auto peerlink.4094
iface peerlink.4094
    address 169.254.1.1/30
    clagd-enable
    clagd-peer-ip 169.254.1.2
    clagd-sys-mac 44:39:39:FF:40:94
Then run sudo ifreload -a.
You do not need to add VLAN 4094 to the bridge VLAN list.
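A quick sanity check on the peerlink addressing is that the local address and the clagd-peer-ip are both link-local and fall in the same subnet. The Python sketch below uses the standard ipaddress module to test this; the function name check_peerlink is hypothetical.

```python
import ipaddress


def check_peerlink(local_cidr, peer_ip):
    """Check that the peerlink address and the peer IP are a usable pair."""
    iface = ipaddress.ip_interface(local_cidr)
    peer = ipaddress.ip_address(peer_ip)
    return {
        # Both ends should be in 169.254.0.0/16 (unrouteable link-local).
        "link_local": iface.ip.is_link_local and peer.is_link_local,
        # The peer must be reachable within the local subnet.
        "same_subnet": peer in iface.network,
        # The peer IP must not be the local address itself.
        "distinct": peer != iface.ip,
    }


print(check_peerlink("169.254.1.1/30", "169.254.1.2"))
```

For the example values above, all three checks pass.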
Example CLAG Configuration
An example configuration is included below. It configures two bonds for CLAG, each with a single port,
a peer link that is a bond with two member ports, and three VLANs on each port. You store the
configuration in /etc/network/interfaces on each peer switch.
Configuring these interfaces uses syntax from ifupdown2 and the VLAN-aware bridge driver mode (see
page 184). The bridges use these Cumulus Linux-specific keywords:
bridge-vids, which defines the allowed list of tagged 802.1q VLAN IDs for all bridge member
interfaces. You can specify non-contiguous ranges with a space-separated list, like
bridge-vids 100-200 300 400-500.
bridge-pvid, which defines the untagged VLAN ID for each port. This is commonly referred to
as the native VLAN.
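To see exactly which VLAN IDs a bridge-vids value covers, the space-separated list of IDs and ranges can be expanded into a set. The Python sketch below shows one way to do this; the function name expand_vids is hypothetical and is not part of ifupdown2.

```python
def expand_vids(spec):
    """Expand a bridge-vids value such as '100-200 300 400-500' into a set."""
    vids = set()
    for token in spec.split():
        start, sep, end = token.partition("-")
        first = int(start)
        # A token without a hyphen is a single VLAN ID.
        last = int(end) if sep else first
        if first > last:
            raise ValueError("descending range: " + token)
        vids.update(range(first, last + 1))
    return vids


vids = expand_vids("100-200 300 400-500")
print(len(vids), 300 in vids, 250 in vids)
```

Ranges are inclusive at both ends, so 100-200 contributes 101 VLAN IDs.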
The bridge configurations below indicate that each bond carries tagged frames on VLANs 1000 to 3000
but untagged frames on VLAN 1. Also, note how the VLAN subinterface used for clagd communication
is configured (peerlink.4094 in the sample configuration below).
At minimum, this VLAN subinterface should not be in your Layer 2 domain, and you should
give it a very high VLAN ID (up to 4094). Read more about the range of VLAN IDs you can use
(see page 195).
The configuration for the spines should look like the following (note that the clag-id and
clagd-sys-mac must be the same for the corresponding bonds on spine1 and spine2):
Spine1
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5), ifup(8)
#
# Please see /usr/share/doc/python-ifupdown2/examples/ for examples
#
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0
    address 10.0.0.1
    netmask 255.255.255.0

auto peerlink
iface peerlink
    bond-slaves swp31 swp32
    bond-mode 802.3ad
    bond-miimon 100
    bond-use-carrier 1
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4

auto peerlink.4094
iface peerlink.4094
    address 169.254.255.1
    netmask 255.255.255.0
    clagd-priority 4096
    clagd-peer-ip 169.254.255.2
    clagd-sys-mac 44:38:39:ff:00:01

# ToR pair #1
auto downlink1
iface downlink1
    bond-slaves swp29 swp30
    bond-mode 802.3ad
    bond-miimon 100
    bond-use-carrier 1
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4
    clag-id 1

# ToR pair #2
auto downlink2
iface downlink2
    bond-slaves swp27 swp28
    bond-mode 802.3ad
    bond-miimon 100
    bond-use-carrier 1
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4
    clag-id 2

auto br
iface br
    bridge-vlan-aware yes
    bridge-ports uplinkA peerlink downlink1 downlink2
    bridge-stp on
    bridge-vids 1000-3000
    bridge-pvid 1
    bridge-mcsnoop 1

Spine2
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5), ifup(8)
#
# Please see /usr/share/doc/python-ifupdown2/examples/ for examples
#
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0
    address 10.0.0.2
    netmask 255.255.255.0

auto peerlink
iface peerlink
    bond-slaves swp31 swp32
    bond-mode 802.3ad
    bond-miimon 100
    bond-use-carrier 1
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4

auto peerlink.4094
iface peerlink.4094
    address 169.254.255.2
    netmask 255.255.255.0
    clagd-priority 8192
    clagd-peer-ip 169.254.255.1
    clagd-sys-mac 44:38:39:ff:00:01

# ToR pair #1
auto downlink1
iface downlink1
    bond-slaves swp29 swp30
    bond-mode 802.3ad
    bond-miimon 100
    bond-use-carrier 1
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4
    clag-id 1

# ToR pair #2
auto downlink2
iface downlink2
    bond-slaves swp27 swp28
    bond-mode 802.3ad
    bond-miimon 100
    bond-use-carrier 1
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4
    clag-id 2

auto br
iface br
    bridge-vlan-aware yes
    bridge-ports uplinkA peerlink downlink1 downlink2
    bridge-stp on
    bridge-vids 1000-3000
    bridge-pvid 1
    bridge-mcsnoop 1
Here is an example configuration file for the leaf switches leaf1 and leaf2. Note that the clag-id and
clagd-sys-mac must be the same for the corresponding bonds on leaf1 and leaf2.
Leaf1
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0
address 10.0.0.3
netmask 255.255.255.0
auto spine1-2
iface spine1-2
bond-slaves swp49 swp50
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1
auto peerlink
iface peerlink
bond-slaves swp51 swp52
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto peerlink.4094
iface peerlink.4094
address 169.254.255.3
netmask 255.255.255.0
clagd-priority 4096
clagd-peer-ip 169.254.255.4
clagd-sys-mac 44:38:39:ff:01:02
auto host1
iface host1
bond-slaves swp1
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 2
mstpctl-portadminedge yes
mstpctl-bpduguard yes
auto host2
iface host2
bond-slaves swp2
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 3
mstpctl-portadminedge yes
mstpctl-bpduguard yes
auto br0
iface br0
bridge-vlan-aware yes
bridge-ports spine1-2 peerlink host1 host2
bridge-stp on
bridge-vids 1000-3000
bridge-pvid 1

Leaf2
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0
address 10.0.0.4
netmask 255.255.255.0
auto spine1-2
iface spine1-2
bond-slaves swp49 swp50
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 1
auto peerlink
iface peerlink
bond-slaves swp51 swp52
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto peerlink.4094
iface peerlink.4094
address 169.254.255.4
netmask 255.255.255.0
clagd-priority 8192
clagd-peer-ip 169.254.255.3
clagd-sys-mac 44:38:39:ff:01:02
auto host1
iface host1
bond-slaves swp1
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 2
mstpctl-portadminedge yes
mstpctl-bpduguard yes
auto host2
iface host2
bond-slaves swp2
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
clag-id 3
mstpctl-portadminedge yes
mstpctl-bpduguard yes
auto br0
iface br0
bridge-vlan-aware yes
bridge-ports spine1-2 peerlink host1 host2
bridge-stp on
bridge-vids 1000-3000
bridge-pvid 1
The configuration is almost identical, except for the IP addresses used for clagd management.
In the configurations above, the clagd-peer-ip and clagd-sys-mac parameters are
mandatory, while the rest are optional. When mandatory clagd commands are present
under a peer link subinterface, by default clagd-enable is treated as yes; to disable clagd
on the subinterface, set clagd-enable to no. Use clagd-priority to set the role of the
CLAG peer switch to primary or secondary. Each peer switch in a CLAG pair must have the
same clagd-sys-mac setting. Each clagd-sys-mac setting should be unique to each CLAG
pair in the network. For more details refer to man clagd.
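The rules in the note above can be sketched as a small consistency check. This is only an illustration and not part of clagd: the function name check_clag_pair and the dictionary layout are hypothetical, the values come from the sample spine configuration, and the check that each clagd-peer-ip points at the other switch's address is inferred from the samples rather than stated in the note.

```python
MANDATORY = ("clagd-peer-ip", "clagd-sys-mac")


def check_clag_pair(first, second):
    """Return a list of problems found in two peers' peer-link settings.

    Hypothetical helper: each argument is a dict holding the options found
    under the peer link subinterface stanza of one switch.
    """
    problems = []
    for label, cfg in (("first", first), ("second", second)):
        for key in MANDATORY:
            if key not in cfg:
                problems.append(f"{label} switch: missing mandatory {key}")
    # Both peers in a CLAG pair must share the same system MAC.
    if first.get("clagd-sys-mac") != second.get("clagd-sys-mac"):
        problems.append("clagd-sys-mac differs between the peers")
    # Inferred from the samples: each peer IP is the other switch's address.
    if first.get("clagd-peer-ip") != second.get("address"):
        problems.append("first switch: clagd-peer-ip is not the peer's address")
    if second.get("clagd-peer-ip") != first.get("address"):
        problems.append("second switch: clagd-peer-ip is not the peer's address")
    return problems


spine1 = {"address": "169.254.255.1", "clagd-peer-ip": "169.254.255.2",
          "clagd-sys-mac": "44:38:39:ff:00:01", "clagd-priority": "4096"}
spine2 = {"address": "169.254.255.2", "clagd-peer-ip": "169.254.255.1",
          "clagd-sys-mac": "44:38:39:ff:00:01", "clagd-priority": "8192"}
print(check_clag_pair(spine1, spine2))
```

An empty list means the two stanzas agree on the points the note calls out; uniqueness of clagd-sys-mac across different CLAG pairs is not checked here.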
Using the clagd Command Line Interface
A command line utility called clagctl is available for interacting with a running clagd daemon to get
status or alter operational behavior. For a detailed explanation of the utility, refer to the clagctl(8)
man page. The following is a sample output of the CLAG operational status displayed by the utility:
cumulus@switch$ clagctl
The peer is alive
Our Priority, ID, and Role: 8192 00:e0:ec:26:50:89 primary
Peer Priority, ID, and Role: 8192 00:e0:ec:27:49:f6 secondary
Peer Interface and IP: peerlink.4094 169.254.255.2
System MAC: 44:38:39:ff:00:01
Dual Attached Ports
Our Interface      Peer Interface     CLAG Id
----------------   ----------------   -------
downlink1          downlink1          1
downlink2          downlink2          2
Peer Link Interfaces and the PROTO_DOWN State
In addition to the standard UP and DOWN states, an interface that is a member of a CLAG bond or one
of its slaves can also be in a PROTO_DOWN state. When CLAG detects a problem that could result in
connectivity issues such as traffic black-holing or a network meltdown if the link carrier was left in an
UP state, it can put that interface into PROTO_DOWN state. Such connectivity issues include:
When the peer link goes down but the peer switch is up (that is, the backup link is active).
When the bond is configured with a CLAG ID, but the clagd service is not running (whether it
was deliberately stopped or simply died).
When a CLAG-enabled node is booted or rebooted, the CLAG bonds are placed in a
PROTO_DOWN state until the node establishes a connection to its peer switch, or five minutes
have elapsed.
Only Cumulus Linux can place an interface in PROTO_DOWN state. You cannot do this with any
administrative commands.
If a virtual link such as a bond or VXLAN goes into a PROTO_DOWN state, it results in a local OPER
DOWN state.
The following ip link show command output shows an interface in PROTO_DOWN state. Notice that
the link carrier is down (NO-CARRIER):
cumulus@switch:~$ sudo ip link show swp1
3: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP,PROTO_DOWN> mtu 1500 qdisc pfifo_fast master bond20 state DOWN mode DEFAULT qlen 500 protodown <MLAG>
link/ether 44:38:39:00:69:84 brd ff:ff:ff:ff:ff:ff
The PROTO_DOWN state is an experimental feature. As such, the name and format could
change in a future version of Cumulus Linux.
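As a rough illustration of how a monitoring script might detect this state, the sketch below extracts the flag list from the first line of ip link show output and looks for PROTO_DOWN. The function names are hypothetical, and because the feature is experimental, the flag name and output format assumed here may not hold in other releases.

```python
import re


def link_flags(ip_link_line):
    """Return the set of flags from the first <...> group of an ip link line."""
    match = re.search(r"<([^>]*)>", ip_link_line)
    if match is None:
        return set()
    return {flag.strip() for flag in match.group(1).split(",") if flag.strip()}


def is_proto_down(ip_link_line):
    """True when the PROTO_DOWN flag appears in the interface flag list."""
    return "PROTO_DOWN" in link_flags(ip_link_line)


# Sample line based on the output shown above.
sample = ("3: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP,PROTO_DOWN> "
          "mtu 1500 qdisc pfifo_fast master bond20 state DOWN")
print(is_proto_down(sample))
```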
Specifying a Backup Link
You can specify a backup link for your peer links in the event that the peer link goes down. When this
happens, the clagd service uses the backup link to check the health of the peer switch. To configure
this, edit /etc/network/interfaces and add clagd-backup-ip <ADDRESS> to the peer link
configuration. Here’s an example:
auto peerlink.4094
iface peerlink.4094
address 169.254.255.1
netmask 255.255.255.0
clagd-enable yes
clagd-priority 8192
clagd-peer-ip 169.254.255.2
clagd-backup-ip 10.0.1.50
clagd-sys-mac 44:38:39:ff:00:01
clagd-args ''
The backup IP address must be different from the peer link IP address (clagd-peer-ip
above). It must be reachable by a route that doesn’t use the peer link, and it must be in the
same network namespace as the peer link IP address.
Cumulus Networks recommends you use the switch's management IP address for this
purpose.
You can also specify the backup UDP port. The port defaults to 5342, but you can configure it as an
argument in clagd-args using --backupPort <PORT>.
auto peerlink.4094
iface peerlink.4094
address 169.254.255.1
netmask 255.255.255.0
clagd-enable yes
clagd-priority 8192
clagd-peer-ip 169.254.255.2
clagd-backup-ip 10.0.1.50
clagd-sys-mac 44:38:39:ff:00:01
clagd-args '--backupPort 5400'
You can see the backup IP address if you run clagctl:
cumulus@switch$ clagctl
The peer is alive
Our Priority, ID, and Role: 8192 00:e0:ec:26:50:89 primary
Peer Priority, ID, and Role: 8192 00:e0:ec:27:49:f6 secondary
Peer Interface and IP: peerlink.4094 169.254.255.2
Backup IP: 10.0.1.50
System MAC: 44:38:39:ff:00:01
Dual Attached Ports
Our Interface      Peer Interface     CLAG Id
----------------   ----------------   -------
downlink1          downlink1          1
downlink2          downlink2          2
Monitoring Dual-Connected Peers
Upon receipt of a valid message from its peer, the switch knows that clagd is alive and executing on
that peer. This causes clagd to change the system ID of each bond that was assigned a clag-id from
the default value (the MAC address of the bond) to the system ID assigned to both peer switches. This
makes the hosts connected to each switch act as if they are connected to the same system so that they
will use all ports within their bond. Additionally, clagd determines which bonds are dual-connected
and modifies the forwarding and learning behavior to accommodate these dual-connected bonds.
If the switch does not receive any messages from its peer for three update intervals, it assumes that
the peer switch is no longer acting as a CLAG peer. In this case, the switch reverts all configuration
changes so that it operates as a standard non-CLAG switch. This includes removing all statically
assigned MAC addresses, clearing the egress forwarding mask, and allowing addresses to move from any
port to the peer port. Once a message is again received from the peer, CLAG operation starts again as
described earlier. You can configure a custom timeout setting by adding --peerTimeout <VALUE> to
clagd-args in /etc/network/interfaces.
Once bonds are identified as dual-connected, clagd sends more information to the peer switch for
those bonds. The MAC addresses (and VLANs) that have been dynamically learned on those ports are
sent along with the LACP partner MAC address for each bond. When a switch receives MAC address
information from its peer, it adds MAC address entries on the corresponding ports. As the switch learns
and ages out MAC addresses, it informs the peer switch of these changes to its MAC address table so
that the peer can keep its table synchronized. Periodically, at 45% of the bridge ageing time, a switch
will send its entire MAC address table to the peer, so that peer switch can verify that its MAC address
table is properly synchronized.
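The two timing rules described above (the peer is considered gone after three update intervals without a message, and the full MAC address table is sent at 45% of the bridge ageing time) can be written out as simple arithmetic. This is a minimal sketch, not clagd's implementation: the function names are hypothetical, the interval values in the example are made up, and whether clagd uses a strict or inclusive comparison is an assumption.

```python
def peer_timed_out(last_message_time, now, update_interval, missed_intervals=3):
    """True once no peer message has arrived for missed_intervals intervals.

    All times are in seconds. The strict comparison is an assumption.
    """
    return (now - last_message_time) > missed_intervals * update_interval


def full_sync_period(bridge_ageing_time):
    """Seconds between full MAC table sends: 45% of the bridge ageing time."""
    return bridge_ageing_time * 45 / 100


# With a hypothetical 4-second update interval, the peer is declared gone
# only after more than 12 seconds of silence.
print(peer_timed_out(last_message_time=0, now=10, update_interval=4))
print(peer_timed_out(last_message_time=0, now=13, update_interval=4))
print(full_sync_period(200))
```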
The switch sends an update frequency value in the messages to its peer, which tells clagd how often
the peer will send these messages. You can configure a different frequency by adding --lacpPoll
<SECONDS> to clagd-args in /etc/network/interfaces.
CLAG Best Practices
For CLAG to function properly, the dual-connected hosts' interfaces should be configured identically on
the pair of peering switches. See the note above in the Configuring CLAG (see page 219) section.
IGMP Snooping with CLAG
IGMP snooping processes IGMP reports received on a bridge port in a bridge to identify hosts that are
configured to receive multicast traffic destined to that group. An IGMP query message received on a
port is used to identify the port that is connected to a router and configured to receive multicast traffic.
IGMP snooping is enabled by default on the bridge. IGMP snooping multicast database entries and
router port entries are synced to the peer CLAG switch. If there is no multicast router in the VLAN, the
IGMP querier can be configured on the switch to generate IGMP query messages by adding a
configuration like the following to /etc/network/interfaces:
auto br.100
vlan br.100
#igmp snooping is enabled by default, but is shown here for completeness
bridge-mcsnoop 1
# If you need to specify the querier IP address
bridge-igmp-querier-source 123.1.1.1
To display multicast group and router port information, use the bridge -d mdb show command:
cumulus@switch:~# sudo bridge -d mdb show
dev br port bond0 vlan 100 grp 234.1.1.1 temp
router ports on br: bond0
Runtime Configuration (Advanced)
cumulus@switch:~# sudo brctl setmcqv4src br 100 123.1.1.1
cumulus@switch:~# sudo brctl setmcquerier br 1
cumulus@switch:~# sudo brctl showmcqv4src br
vlan    querier address
100     123.1.1.1
clagd Daemon Status Monitoring
Due to the critical nature of the clagd daemon, an external process, called jdoo, continuously
monitors the status of clagd. If the clagd daemon dies or becomes unresponsive for any reason, the
jdoo process will get clagd up and running again. This monitoring is automatically configured and
enabled as long as clagd is enabled (that is, clagd-peer-ip and clagd-sys-mac are configured in
/etc/network/interfaces) and clagd has been started. When clagd is explicitly stopped, for example
with the service clagd stop command, monitoring of clagd is also stopped.
The jdoo process checks two things to make sure the clagd daemon is operating properly:
The result of the service clagd status command. If the command returns that clagd is
running, or that clagd is not configured to run, then jdoo does nothing. If service clagd
status returns that clagd is not running but was configured to run, jdoo will start the clagd
daemon. This check is performed every 30 seconds. Due to the way the jdoo process
implements this check, it may start the clagd process twice. This is harmless, since clagd
checks to make sure another instance is not already running when it begins executing. This is
indicated with a message in the clagd log file, /var/log/clagd.log.
The modification time of the /var/run/clagd.alive file. As clagd runs, it periodically
updates the modification time of the /var/run/clagd.alive file (by default, every 4 seconds).
If jdoo notices that this file’s modification time has not been updated within the last 4 minutes,
it will assume clagd is alive, but hung, and will restart clagd. If clagd is not enabled to run,
this check still occurs and jdoo will start clagd. But since clagd is not configured to run,
nothing will happen except that a message is written to the jdoo log file that it tried to start
clagd.
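The second check above amounts to comparing a file's modification time against a threshold. The sketch below shows that comparison in isolation; it is not how jdoo is implemented or configured, the function name is hypothetical, and a temporary file stands in for /var/run/clagd.alive so the example can run anywhere.

```python
import os
import tempfile
import time


def alive_file_is_stale(path, max_age_seconds=4 * 60, now=None):
    """True if the file at path was last modified more than max_age_seconds ago."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) > max_age_seconds


# Stand-in for /var/run/clagd.alive: a fresh file, then the same file
# with its modification time pushed five minutes into the past.
with tempfile.NamedTemporaryFile() as handle:
    fresh = alive_file_is_stale(handle.name)
    five_minutes_ago = time.time() - 300
    os.utime(handle.name, (five_minutes_ago, five_minutes_ago))
    stale = alive_file_is_stale(handle.name)

print(fresh, stale)
```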
You can check the status of clagd monitoring by using the jdoo summary command:
cumulus@switch:~$ sudo jdoo summary
The jdoo daemon 5.4 uptime: 15m
...
Program 'clagd'          Status ok
File 'clagd.alive'       Waiting
...
STP Interoperability with CLAG
Cumulus Networks recommends that you always enable STP in your layer 2 network.
Further, with CLAG, Cumulus Networks recommends you enable BPDU guard on the host-facing bond
interfaces. (For more information about BPDU guard, see BPDU Guard and Bridge Assurance (see page
238).)
Debugging STP with CLAG
/var/log/daemon.log has mstpd logs.
Run mstpctl debuglevel 3 to see CLAG-related logs in /var/log/daemon.log:
root@se3-sp1:~# mstpctl showportdetail br peer-bond
br:peer-bond CIST info
enabled              yes
role                 Designated
port id              8.008
state                forwarding
...............
bpdufilter port      no
clag ISL             yes
clag ISL Oper UP     yes
clag role            primary
clag dual conn mac   0:0:0:0:0:0
clag remote portID   F.FFF
clag system mac      44:38:39:ff:0:1
root@se3-sp1:~#
root@se3-sp1:~# mstpctl showportdetail br downlink-1
br:downlink-1 CIST info
enabled              yes
role                 Designated
port id              8.006
state                forwarding
..............
bpdufilter port      no
clag ISL             no
clag ISL Oper UP     no
clag role            primary
clag dual conn mac   0:0:0:3:11:1
clag remote portID   F.FFF
clag system mac      44:38:39:ff:0:1
root@se3-sp1:~#
Best Practices for STP with CLAG
The STP global configuration must be the same on both the switches.
The STP configuration for dual-connected ports should be the same on both peer switches.
Use mstpctl commands for all spanning tree configurations, including bridge priority, path cost
and so forth. Do not use brctl commands for spanning tree, except for brctl stp on/off,
as changes are not reflected to mstpd and can create conflicts.
Troubleshooting CLAG
By default, when clagd is running, it logs its status to the /var/log/clagd.log file and syslog.
Example log file output is below:
Jan 14 23:45:10 se3-sp1 clagd[3704]: Beginning execution of clagd version 1.0.0
Jan 14 23:45:10 se3-sp1 clagd[3704]: Invoked with: /usr/sbin/clagd --daemon 169.254.2.2 peer-bond.4000 44:38:39:ff:00:01 --priority 8192
Jan 14 23:45:11 se3-sp1 clagd[3995]: Role is now secondary
Jan 14 23:45:31 se3-sp1 clagd[3995]: Role is now primary
Jan 14 23:45:32 se3-sp1 clagd[3995]: The peer switch is active.
Jan 14 23:45:35 se3-sp1 clagd[3995]: downlink-1 is now dual connected.
Caveats and Errata
If both the backup and peer connectivity are lost within a 30-second window, the switch in the
secondary role misinterprets the event sequence, believing the peer switch is down, so it takes over as
the primary.
Configuration Files
/etc/network/interfaces
LACP Bypass
On Cumulus Linux, LACP Bypass is a feature that allows a bond (see page 160) configured in 802.3ad
mode to become active and forward traffic even when there is no LACP partner. A typical use case for
this feature is to enable a host, without the capability to run LACP, to PXE boot while connected to a
switch on a bond configured in 802.3ad mode. Once the preboot process finishes and the host is
capable of running LACP, the normal 802.3ad link aggregation operation will take over.
As a safeguard, a configurable timeout period can be enforced to limit the duration of the bypass. The
valid range of timeout period is 0 to 100 seconds; the default is 0 seconds, which indicates no timeout.
If no LACP partner is detected before the timeout period expires, the bond will become inactive and
stop forwarding traffic. Bringing down the slave interfaces and bringing them back up will restart the
cycle. At any point in time, receiving LACPDU on any of the slave interfaces will abort the bypass and
LACP protocol negotiation will take over. Enabling or disabling bypass during LACP exchange does not
affect the link aggregation operation.
In case a bond has multiple slave interfaces, the user can control which of them should go into bypass
in the following ways:
In a CLAG deployment (see page 215) where a host is dual-connected to two switches, the bond
on the switch with the CLAG primary role has a higher bypass priority than the bond on the
other switch with a secondary role.
On each switch, if a bond has multiple slave interfaces, a bypass priority value (default is 0) can
be configured on the interfaces; the one with higher numerical priority value wins. A string
comparison of the interface names will serve as a tiebreaker in case the priority values are
equal; the string with the lower ASCII values wins. Note that the priority value is significant
within a switch; there is no coordination between two switches in a CLAG peering relationship.
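The per-switch selection rule in the second item (highest numerical priority wins, and equal priorities fall back to the interface name with the lower ASCII values) can be sketched as follows. The function name is hypothetical and this is not the bonding driver's code; it only restates the rule.

```python
def pick_bypass_slave(priorities):
    """Pick the slave interface that should go into LACP bypass.

    priorities maps interface name to its bypass priority (default 0).
    Highest priority wins; ties go to the name that sorts first.
    """
    return min(priorities, key=lambda name: (-priorities[name], name))


print(pick_bypass_slave({"swp4": 2, "swp5": 1}))  # swp4 has the higher priority
print(pick_bypass_slave({"swp5": 0, "swp4": 0}))  # equal priorities, compare names
```

Python compares strings by code point, which matches an ASCII comparison for names such as swp4 and swp5. Note that this ordering is purely lexical, so swp10 would sort before swp4.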
Configuring LACP Bypass
In /etc/network/interfaces:
Enable bypass on the host facing bond by setting bond-lacp-bypass-allow to 1.
(Optional): Configure the timeout by setting bond-lacp-bypass-period to a valid value
different from the default; Cumulus Networks recommends you keep the default of 0.
(Optional): Configure the priority values if desired by setting bond-lacp-bypass-priority
with the values for each slave interface.
Default Configuration
The default configuration below shows LACP bypass as being enabled.
auto bond0
iface bond0
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-lacp-bypass-allow 1
bond-slaves swp4 swp5
The following shows that bond0 is in bypass state. It is operationally up
and swp4 is the active interface:
cumulus@switch$ ip link show bond0
7: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
link/ether 00:02:00:00:00:02 brd ff:ff:ff:ff:ff:ff
cumulus@switch$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: fast
Min links: 1
Aggregator selection policy (ad_select): stable
System Identification: 65535 00:02:00:00:00:02
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1
Actor Key: 33
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00
Fall back Info:
Allowed: 1
Timeout: 0
Slave Interface: swp4
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:02:00:00:00:02
Aggregator ID: 1
LACP bypass: on
Slave queue ID: 0
Slave Interface: swp5
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:02:00:00:00:01
Aggregator ID: 2
Slave queue ID: 0
Configuration with Optional Priority and Period
The following configuration shows LACP bypass enabled, with a period and priorities set.
auto bond0
iface bond0
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-lacp-bypass-allow 1
bond-slaves swp4 swp5
bond-lacp-bypass-period 100
bond-lacp-bypass-priority swp4=2 swp5=1
The following shows that swp4 bypass has expired and the bond is
operationally down:
cumulus@switch$ ip link show bond0
7: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT
link/ether 00:02:00:00:00:02 brd ff:ff:ff:ff:ff:ff
cumulus@switch$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: fast
Min links: 1
Aggregator selection policy (ad_select): stable
System Identification: 65535 00:02:00:00:00:02
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1
Actor Key: 33
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00
Fall back Info:
Allowed: 1
Timeout: 100
Slave Interface: swp4
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:02:00:00:00:02
Aggregator ID: 1
LACP bypass priority: 2
LACP bypass: expired
Slave queue ID: 0
Slave Interface: swp5
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:02:00:00:00:01
Aggregator ID: 2
Bypass priority: 1
Slave queue ID: 0
Spanning Tree and Rapid Spanning Tree
Spanning tree protocol (STP) is always recommended in layer 2 topologies, as it prevents bridge loops
and broadcast radiation on a bridged network.
mstpd is a daemon that implements IEEE 802.1D-2004 and IEEE 802.1Q-2011. Currently, STP is disabled
by default on the bridge in Cumulus Linux.
To enable STP, run brctl stp <bridge> on.
The STP modes Cumulus Linux supports vary depending upon which bridge driver mode (see
page 163) is in use. For a bridge configured in traditional mode, STP, RSTP, PVST and PVRST
are supported; with the default set to PVRST. VLAN-aware (see page 184) bridges only operate
in RSTP mode.
If a bridge running RSTP (802.1w) receives a common STP (802.1D) BPDU, it will automatically
fall back to 802.1D operation.
By default, mstpd starts off with Per VLAN Rapid Spanning Tree Protocol (PVRST), but if a peer sends
only STP BPDU, mstpd will fall back to Per VLAN Spanning Tree (PVST). mstpd can also be configured to
be only in STP mode, by setting setforcevers to STP.
Contents
Contents (see page 234)
Commands (see page 235)
PVST/PVRST (see page 235)
Creating a Bridge and Configuring STP (see page 235)
Configuring Spanning Tree Parameters (see page 237)
Persistent Configuration (see page 238)
BPDU Guard and Bridge Assurance (see page 238)
Enabling bpdufilter (see page 239)
Configuration Files (see page 239)
Man Pages (see page 240)
Useful Links (see page 240)
Caveats and Errata (see page 240)
Commands
brctl
mstpctl
mstpctl is a utility to configure STP. mstpd is started by default on bootup. mstpd logs and errors are
located in /var/log/daemon.log.
PVST/PVRST
Per VLAN Spanning Tree (PVST) creates a spanning tree instance for a bridge. Rapid PVST (PVRST)
supports RSTP enhancements for each spanning tree instance. You must create a bridge corresponding
to the untagged native/access VLAN, and all the physical switch ports must be part of the same VLAN.
When connected to a switch that has a native VLAN configuration, the native VLAN must be configured
to be VLAN 1 only.
Cumulus Linux supports the RSTP/PVRST/PVST modes of STP natively when the bridge is configured in
traditional mode (see page 163).
Creating a Bridge and Configuring STP
brctl is used to create the bridge, add bridge ports in the bridge and configure STP on the bridge.
mstpctl is used only when an admin needs to change the default configuration parameters for STP:
cumulus@switch:~$ sudo brctl addbr br2
cumulus@switch:~$ sudo brctl addif br2 swp1.101 swp4.101 swp5.101
cumulus@switch:~$ sudo brctl stp br2 on
cumulus@switch:~$ sudo ifconfig br2 up
To get the kernel bridge state, use:
cumulus@switch:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
br2             8000.001401010100       yes             swp1.101
                                                        swp4.101
                                                        swp5.101
To get the mstpd bridge state, use:
cumulus@switch:~$ sudo mstpctl showbridge br2
br2 CIST info
enabled                      yes
bridge id                    F.000.00:14:01:01:01:00
designated root              F.000.00:14:01:01:01:00
regional root                F.000.00:14:01:01:01:00
root port                    none
path cost                    0
internal path cost           0
max age                      20
bridge max age               20
forward delay                15
bridge forward delay         15
tx hold count                6
max hops                     20
hello time                   2
ageing time                  200
force protocol version       rstp
time since topology change   90843s
topology change count        4
topology change              no
topology change port         swp4.101
last topology change port    swp5.101
To get the mstpd bridge port state, use:
cumulus@switch:~$ sudo mstpctl showport br2
E swp1.101 8.001 forw F.000.00:14:01:01:01:00 F.000.00:14:01:01:01:00 8.001 Desg
  swp4.101 8.002 forw F.000.00:14:01:01:01:00 F.000.00:14:01:01:01:00 8.002 Desg
E swp5.101 8.003 forw F.000.00:14:01:01:01:00 F.000.00:14:01:01:01:00 8.003 Desg
cumulus@switch:~$ sudo mstpctl showportdetail br2 swp1.101
br2:swp1.101 CIST info
enabled                 yes
role                    Designated
port id                 8.001
state                   forwarding
external port cost      2000
admin external cost     0
internal port cost      2000
admin internal cost     0
designated root         F.000.00:14:01:01:01:00
dsgn external cost      0
dsgn regional root      F.000.00:14:01:01:01:00
dsgn internal cost      0
designated bridge       F.000.00:14:01:01:01:00
designated port         8.001
admin edge port         no
auto edge port          yes
oper edge port          yes
topology change ack     no
point-to-point          yes
admin point-to-point    auto
restricted role         no
restricted TCN          no
port hello time         2
disputed                no
bpdu guard port         no
bpdu guard error        no
network port            no
BA inconsistent         no
Num TX BPDU             45772
Num TX TCN              4
Num RX BPDU             0
Num RX TCN              0
Num Transition FWD      2
Num Transition BLK      2
Configuring Spanning Tree Parameters
For an explanation of these parameters, see the IEEE 802.1D, 802.1Q specifications and man mstpctl.
The bridge forward delay and max age settings must meet the condition 2 * (Bridge Forward Delay - 1
second) >= Bridge Max Age.
mstpd supports only long mode; that is, 32 bits for the Path cost:
cumulus@switch:~$ sudo mstpctl setmaxage br2 20
cumulus@switch:~$ sudo mstpctl setfdelay br2 15
cumulus@switch:~$ sudo mstpctl setmaxhops br2 20
cumulus@switch:~$ sudo mstpctl settxholdcount br2 6
cumulus@switch:~$ sudo mstpctl setforcevers br2 rstp
cumulus@switch:~$ sudo mstpctl settreeprio br2 0 32768
cumulus@switch:~$ sudo mstpctl sethello br2 2
cumulus@switch:~$ sudo mstpctl setportpathcost br2 swp1.101 0
cumulus@switch:~$ sudo mstpctl setportadminedge br2 swp1.101 no
cumulus@switch:~$ sudo mstpctl setportautoedge br2 swp1.101 yes
cumulus@switch:~$ sudo mstpctl setportp2p br2 swp1.101 no
cumulus@switch:~$ sudo mstpctl setportrestrrole br2 swp1.101 no
cumulus@switch:~$ sudo mstpctl setbpduguard br2 swp1.101 no
cumulus@switch:~$ sudo mstpctl setportrestrtcn br2 swp1.101 no
cumulus@switch:~$ sudo mstpctl setportnetwork br2 swp4.101 no
cumulus@switch:~$ sudo mstpctl settreeportprio br2 swp4.101 0 128
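Before changing the max age or forward delay values, it can help to confirm that the pair satisfies the condition stated at the start of this section. The check below is a minimal sketch with a hypothetical function name; both values are in seconds.

```python
def stp_timers_consistent(forward_delay, max_age):
    """Check the condition 2 * (forward delay - 1 second) >= max age."""
    return 2 * (forward_delay - 1) >= max_age


# The values used in the commands above: forward delay 15, max age 20.
print(stp_timers_consistent(forward_delay=15, max_age=20))
# A forward delay of 4 seconds is too short for a max age of 20 seconds.
print(stp_timers_consistent(forward_delay=4, max_age=20))
```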
Persistent Configuration
The persistent configuration for a bridge is set in /etc/network/interfaces. The configuration
below is for the example bridge above:
auto br0
iface br0 inet static
bridge-ports swp1 swp2 swp3
bridge-stp on
mstpctl-maxage 20
mstpctl-fdelay 15
mstpctl-maxhops 20
mstpctl-txholdcount 6
mstpctl-forcevers rstp
mstpctl-treeprio 32768
mstpctl-hello 2
mstpctl-portpathcost swp1=0 swp2=0
mstpctl-portadminedge swp1=no swp2=no
mstpctl-portautoedge swp1=yes swp2=yes
mstpctl-portp2p swp1=no swp2=no
mstpctl-portrestrrole swp1=no swp2=no
mstpctl-bpduguard swp1=no swp2=no
mstpctl-portrestrtcn swp1=no swp2=no
mstpctl-portnetwork swp1=no
mstpctl-treeportprio swp3=128
BPDU Guard and Bridge Assurance
If an end station is not required to influence the STP topology, you can configure BPDU (Bridge
Protocol Data Unit) guard on the port. If a BPDU is received on the port, STP brings down the port and
logs an error in /var/log/daemon.log. The admin must then correct the configuration on the end
station and manually bring the port back up:
cumulus@switch:~$ sudo mstpctl setbpduguard br1007 swp8.1007 yes
cumulus@switch:~$ sudo mstpctl showportdetail br1007 swp8.1007 | grep guard
bpdu guard port    yes    bpdu guard error    yes
cumulus@switch:~$ sudo grep -in error /var/log/daemon.log | grep mstp
mstpd: error, br1007:swp8.1007 Recvd BPDU on BPDU Guard Port - Port Down
cumulus@switch:~$ ifconfig swp8.1007
swp8.1007 Link encap:Ethernet  HWaddr 00:02:00:00:00:22
          BROADCAST MULTICAST  MTU:1500  Metric:1
On a point-to-point link where RSTP is running, if you want to detect unidirectional links and put the
port in discarding state (in error), you can enable bridge assurance on the port by enabling port type
network. The port would be in a bridge assurance inconsistent state until a BPDU is received from the
peer. Port type network needs to be configured on both the ends of the link:
cumulus@switch:~$ sudo mstpctl setportnetwork br1007 swp1.1007 yes
cumulus@switch:~$ sudo mstpctl showportdetail br1007 swp1.1007 | grep network
network port       yes    BA inconsistent    yes
cumulus@switch:~$ sudo grep -in assurance /var/log/daemon.log | grep mstp
1365:Jun 25 18:03:17 mstpd: br1007:swp1.1007 Bridge assurance inconsistent
Enabling bpdufilter
You can enable bpdufilter on a switch port, which filters BPDUs in both directions. This effectively
disables STP on the port.
To enable it, run the mstpctl command:
cumulus@switch:~$ sudo mstpctl setportbpdufilter br100 swp1.100=yes swp2.100=yes
To make the configuration persistent, add the following to /etc/network/interfaces under the
bridge port iface section example:
auto br100
iface br100
bridge-ports swp1.100 swp2.100
mstpctl-portbpdufilter swp1=yes swp2=yes
For more information, see the ifupdown-addons-interfaces(5) man page.
Configuration Files
/etc/network/interfaces
Man Pages
brctl(8)
bridge-utils-interfaces(5)
ifupdown-addons-interfaces(5)
mstpctl(8)
mstpctl-utils-interfaces(5)
Useful Links
The source code for mstpd/mstpctl was written by Vitalii Demianets and is hosted on SourceForge at the
URL below.
https://sourceforge.net/projects/mstpd/
http://en.wikipedia.org/wiki/Spanning_Tree_Protocol
Caveats and Errata
MSTP is not supported currently. However, interoperability with MSTP networks can be
accomplished using PVRSTP or PVSTP.
Configuring Switch Port Attributes
You configure attributes for the switch ports in the /etc/cumulus/ports.conf file. This file serves
four essential purposes:
1. A simplified mapping of each switch port number (swpN) to its speed setting.
2. The ability to combine (or gang) 4 contiguous 10G ports into a single 40G port.
3. The ability to split (or ungang) a 40G port into 4 logical 10G ports using a breakout cable. Note
that 40G switches with Trident II chipsets (see the HCL) have limitations on the number of logical
ports they can support. See Logical Switch Port Limitations (see page 244) below.
4. For platforms that have both SFP and RJ45 connectors on a port, it allows for selecting the active
path.
After you modify the configuration, restart switchd to push the new configuration (run sudo
service switchd restart; this interrupts network services). Restarting switchd processes the
configuration file and does the following:
1. Creates all the swpN devices that map to the switch ports xe0..xeN and assigns the appropriate
MAC addresses to them.
2. Generates /var/lib/cumulus/porttab, which details the mapping of the xe0..xeN switch
ports to the swpN device nodes.
3. Generates /var/lib/cumulus/phytab, which details the physical address mappings for each
xeN switch port and swpN device node.
Contents
Contents (see page 241)
Commands (see page 241)
Configuration Files (see page 241)
Querying SFP Port Information (see page 242)
Setting Port Speed, Duplexing, and Auto-negotiation (see page 243)
Persistent Configuration (see page 243)
Port Speed Limitations (see page 244)
Logical Switch Port Limitations (see page 244)
Caveats and Errata (see page 245)
Commands
service switchd restart
Configuration Files
As mentioned previously, switch port configurations are stored in /etc/cumulus/ports.conf. You
edit this file to change these configurations.
This configuration in /etc/cumulus/ports.conf shows four ports in 10G mode:
# SFP+ ports
#
# <port label 1-48> = [10G|40G/4]
1=10G
2=10G
3=10G
4=10G
The following configuration shows four 10G ports, starting with swp1, combined (ganged) into one 40G
port. This configuration results in device node swp1 followed by device node swp5 (swp2-4 are
skipped):
# SFP+ ports
#
# <port label 1-48> = [10G|40G/4]
1=40G/4
2=40G/4
3=40G/4
4=40G/4
5=10G
This configuration shows four 40G ports in 40G mode:
# QSFP+ ports
#
# <port label 49-52> = [4x10G|40G]
49=40G
50=40G
51=40G
52=40G
This configuration shows one unganged 40G port in 4x10G mode for four 10G logical ports. This results
in device node swp49 being replaced by swp49s0, swp49s1 .. swp49s3:
# QSFP+ ports
#
# <port label 49-52> = [4x10G|40G]
49=4x10G
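The naming rules in the examples above can be sketched in Python. This is an illustration only, not how switchd works internally: the function name swp_devices is hypothetical, and the sketch assumes that a ganged group always starts on a port whose label is a multiple of 4 plus 1, so only that port produces a device node.

```python
def swp_devices(ports_conf_text):
    """Map ports.conf entries to the swpN device names described above."""
    modes = {}
    for raw in ports_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or "=" not in line:
            continue
        label, mode = (part.strip() for part in line.split("=", 1))
        modes[int(label)] = mode

    devices = []
    for port in sorted(modes):
        mode = modes[port]
        if mode == "4x10G":
            # A split 40G port is replaced by four logical ports s0..s3.
            devices.extend(f"swp{port}s{lane}" for lane in range(4))
        elif mode == "40G/4":
            # Assumption: only the first port of each ganged group of four
            # (labels 1, 5, 9, ...) yields a device; the other three are skipped.
            if (port - 1) % 4 == 0:
                devices.append(f"swp{port}")
        else:
            devices.append(f"swp{port}")
    return devices


ganged = "1=40G/4\n2=40G/4\n3=40G/4\n4=40G/4\n5=10G"
print(swp_devices(ganged))
print(swp_devices("49=4x10G"))
```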
Querying SFP Port Information
You can verify SFP settings using ethtool -m. The following example shows the output for 1G and 10G
modules:
cumulus@switch:~# sudo ethtool -m | egrep '(swp|RXPower :|TXPower :|EthernetComplianceCode)'
swp1: SFP detected
EthernetComplianceCodes : 1000BASE-LX
RXPower : -10.4479dBm
TXPower : 18.0409dBm
swp3: SFP detected
10GEthernetComplianceCode : 10G Base-LR
RXPower : -3.2532dBm
TXPower : -2.0817dBm
Setting Port Speed, Duplexing, and Auto-negotiation
You use ethtool to configure auto-negotiation, duplexing and the speed for your switch ports. You
must specify both port speed and duplexing in the ethtool command; auto-negotiation is optional.
The following examples use swp1.
To set the port speed to 1G, run:
ethtool -s swp1 speed 1000 duplex full
To set the port speed to 10G, run:
ethtool -s swp1 speed 10000 duplex full
To set the duplex mode (full or half), run:
ethtool -s swp1 speed 10000 duplex full|half
To enable or disable auto-negotiation, run:
ethtool -s swp1 speed 10000 duplex full autoneg on|off
Persistent Configuration
You can create a persistent configuration for port speeds in /etc/network/interfaces. Add the
appropriate lines for each switch port stanza. For example:
auto swp1
iface swp1
address 10.1.1.1/24
mtu 9000
link-speed 10000
link-duplex full
link-autoneg off
In the above configuration, swp1 is configured with 10G port speed, full duplex, with auto-negotiation
disabled.
You should always specify speed and duplex modes in the interfaces file.
For more information about configuring network interfaces, see Understanding Network Interfaces
(see page 154) and Configuring and Managing Network Interfaces (see page 125).
Port Speed Limitations
You can configure switch ports to run at different speeds. In Cumulus Linux, you can configure:
A 1G switch port to run at 100Mbps using ethtool:
ethtool -s swp1 speed 100
A 10G switch port to run at 1Gbps using ethtool:
ethtool -s swp1 speed 1000
A 40G switch port to run at 10Gbps via the settings in /etc/cumulus/ports.conf — see
Configuration Files above.
Logical Switch Port Limitations
40G switches with Trident II chipsets (check the 40G Portfolio section of the HCL) can support a certain
number of logical ports, depending upon the manufacturer.
Before you configure any logical/unganged ports on a switch, check the limitations listed in
/etc/cumulus/ports.conf; this file is specific to each manufacturer.
For example, the Edge-Core AS6701-32X ports.conf file indicates the logical port limitation like this:
# ports.conf --
#
# configure port speed, aggregation, and subdivision.
#
# accton,as6701_32x has:
# 32 QSFP+ ports numbered 1-32
# These ports are configurable as 40G or split into 4x10G ports.
#
# The X pipeline covers QSFP ports 1-11, 24-26, 30-32 and the Y pipeline
# covers QSFP ports 12-23, 27-29.
#
# The Trident II chip can only handle 52 logical ports per pipeline.
#
# This means 12 is the maximum number of 40G ports you can ungang
# per pipeline. The 12 40G ports become 48 unganged 10G ports,
# plus the remaining 4 40G ports totals 52 logical ports for that
# pipeline.
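The arithmetic in that comment can be checked with a short Python sketch. It assumes 16 QSFP+ ports per pipeline, which is what the comment's own sum (12 unganged plus 4 remaining) implies; the function name max_unganged_ports is hypothetical. Each unganged port uses four logical ports instead of one, so n unganged ports in a pipeline of P ports use 4n + (P - n) logical ports.

```python
def max_unganged_ports(ports_in_pipeline, logical_limit=52, lanes=4):
    """Largest n with n*lanes + (ports_in_pipeline - n) <= logical_limit."""
    spare = logical_limit - ports_in_pipeline
    if spare < 0:
        raise ValueError("pipeline already exceeds the logical port limit")
    # Each unganged port costs (lanes - 1) logical ports beyond its original one.
    return min(ports_in_pipeline, spare // (lanes - 1))


n = max_unganged_ports(16)
print(n, n * 4 + (16 - n))
```

With 16 ports per pipeline this gives 12 unganged ports and 52 logical ports in total, matching the comment in ports.conf.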
Caveats and Errata
In the step that produces the swpN devices, the EEPROM is examined for the maximum MAC address
range. If the range allocated for this system is less than the total possible number of switch
ports (including splits), then the system is allocated a single MAC address across all ports and is
put in an L3-only mode.
The swpN names for combined ports are assigned with holes. For example, on a switch where
ports 5, 6, 7 and 8 are combined into one 40G port, you'll see swp1, 2, 3, 4, 5, 9, 10,... swp5 is the
combined port.
The link/activity lights for combined ports all blink in unison. For split ports, the LED values for
all ports are ORed: if one port of four has link/activity, the light is on/blinking.
Broadcom chips impose constraints on which ports can be combined. Cumulus Linux is
restricted to four adjacent ports where the first port is a multiple of 4 + 1 (swp1-4, swp5-8,
swp9-12, and so forth). This could change with other vendors or layouts.
Configuring Buffer and Queue Management
Hardware datapath configuration manages packet buffering, queueing, and scheduling in hardware.
There are two configuration input files:
/etc/cumulus/datapath/traffic.conf, which describes priority groups and assigns the
scheduling algorithm and weights
/etc/bcm.d/datapath/datapath.conf, which assigns buffer space and egress queues
Versions of these files prior to Cumulus Linux 2.1 are incompatible with Cumulus Linux 2.1
and later; using older files will cause switchd to fail to start and return an error that it cannot
find the /var/lib/cumulus/rc.datapath file.
Each packet is assigned to an ASIC Class of Service (CoS) value based on the packet’s priority value
stored in the 802.1p (Class of Service) or DSCP (Differentiated Services Code Point) header field. The
packet is assigned to a priority group based on the CoS value.
Priority groups include:
Control: Highest priority traffic
Service: Second-highest priority traffic
Lossless: Traffic protected by priority flow control
Bulk: All remaining traffic
A lossless traffic group is protected from packet drops by configuring the datapath to use priority
pause. A lossless priority group requires a port group configuration, which specifies the ports
configured for priority flow control and the additional buffer space assigned to each port for packets in
the lossless priority group.
The scheduler is configured to use a hybrid scheduling algorithm. It applies strict priority to control
traffic queues and a weighted round robin selection from the remaining queues. Unicast packets and
multicast packets with the same priority value are assigned to separate queues, which are assigned
equal scheduling weights.
Datapath configuration takes effect when you initialize switchd. Changes to the traffic.conf file
require you to restart switchd.
Contents
Contents (see page 246)
Commands (see page 246)
Configuration Files (see page 246)
Configuring Traffic Marking through ACL Rules (see page 247)
Configuring Link Pause (see page 248)
Useful Links (see page 250)
Caveats and Errata (see page 250)
Commands
If you modify the configuration in the /etc/cumulus/datapath/traffic.conf file, you must restart
switchd for the changes to take effect:
cumulus@switch:~$ sudo service switchd restart
Configuration Files
The following configuration applies to 10G and 40G switches only (any switch on the Trident, Trident+,
or Trident II platform).
/etc/cumulus/datapath/traffic.conf: The default datapath configuration file.
/etc/cumulus/datapath/custom_traffic.conf: An optional customized configuration file.
An example traffic configuration file:
cumulus@switch:~$ cat /etc/cumulus/datapath/traffic.conf
section: traffic
# packet priority source:
# -- 802.1p or dscp
packet priority source: 802.1p
# packet priority mapping to ingress priority values 0..7
packet priorities = (0), ingress priority: 0
packet priorities = (1), ingress priority: 1
packet priorities = (2), ingress priority: 2
packet priorities = (3), ingress priority: 3
packet priorities = (4), ingress priority: 4
packet priorities = (5), ingress priority: 5
packet priorities = (6), ingress priority: 6
packet priorities = (7), ingress priority: 7
# remark packet priority value
# -- 802.1p or none
remark packet priority: none
# traffic configurations:
# -- name: an arbitrary label
# -- type: lossless, control, service, or bulk packets
# -- priorities assigned to each group
# -- bandwidth percent (for the lossless traffic group only)
traffic group name: green,  type: bulk,     ingress priority values = (0,1,4,5,6)
traffic group name: blue,   type: service,  ingress priority values = (2)
traffic group name: yellow, type: lossless, ingress priority values = (3), bandwidth: 1000.0 Mb/s
traffic group name: red,    type: control,  ingress priority values = (7)
config_end
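To illustrate how an ingress priority selects a traffic group, here is a minimal Python sketch built from the example file. It is not a parser for traffic.conf: the table is hard-coded, the function names are hypothetical, and it assumes the green (bulk) group owns priorities 0, 1, 4, 5 and 6, which is what the example appears to intend.

```python
# Traffic groups from the example: name -> (type, ingress priority values).
TRAFFIC_GROUPS = {
    "green": ("bulk", (0, 1, 4, 5, 6)),  # assumed assignment, see note above
    "blue": ("service", (2,)),
    "yellow": ("lossless", (3,)),
    "red": ("control", (7,)),
}


def group_for_priority(ingress_priority):
    """Return (group name, group type) for an ingress priority 0..7."""
    for name, (group_type, priorities) in TRAFFIC_GROUPS.items():
        if ingress_priority in priorities:
            return name, group_type
    raise KeyError(f"no traffic group covers priority {ingress_priority}")


def uncovered_priorities():
    """List ingress priorities 0..7 that no traffic group claims."""
    covered = {p for _, prios in TRAFFIC_GROUPS.values() for p in prios}
    return sorted(set(range(8)) - covered)


print(group_for_priority(3))
print(uncovered_priorities())
```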
Configuring Traffic Marking through ACL Rules
You can mark traffic for egress packets through iptables or ip6tables rule classifications. To enable
these rules, you do one of the following:
Mark DSCP values in egress packets.
Mark 802.1p CoS values in egress packets.
To enable traffic marking, use cl-acltool. Add the -p option to specify the location of the policy file.
By default, if you don't include the -p option, cl-acltool looks for the policy file in
/etc/cumulus/acl/policy.d/.
The iptables-/ip6tables-based marking is supported via the following action extension:
-j SETQOS --set-dscp 10 --set-cos 5
You can specify the following options for the SETQOS target:
Option                   Description
--set-cos INT            Sets the datapath resource/queuing class value. Values are defined in IEEE_P802.1p.
--set-dscp value         Sets the DSCP field in the packet header to a value, which can be either a decimal or hex value.
--set-dscp-class class   Sets the DSCP field in the packet header to the value represented by the DiffServ class value. This class can be EF, BE or any of the CSxx or AFxx classes.
You can specify either --set-dscp or --set-dscp-class, but not both.
Here are two example rules:
[iptables]
-t mangle -A FORWARD --in-interface swp+ -p tcp --dport bgp -j SETQOS --set-dscp 10 --set-cos 5
[ip6tables]
-t mangle -A FORWARD --in-interface swp+ -j SETQOS --set-dscp 10
You can put the rule in either the mangle table or the default filter table; the mangle table and filter
table are put into separate TCAM slices in the hardware.
To put the rule in the mangle table, include -t mangle; to put the rule in the filter table, omit -t
mangle.
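The constraint that --set-dscp and --set-dscp-class are mutually exclusive can be captured in a small helper that builds the action part of a rule. This is a hypothetical Python sketch for generating policy file lines; setqos_action is not part of cl-acltool.

```python
def setqos_action(cos=None, dscp=None, dscp_class=None):
    """Build the '-j SETQOS ...' portion of a rule from the options above."""
    if dscp is not None and dscp_class is not None:
        raise ValueError("use either --set-dscp or --set-dscp-class, not both")
    parts = ["-j", "SETQOS"]
    if dscp is not None:
        parts += ["--set-dscp", str(dscp)]
    if dscp_class is not None:
        parts += ["--set-dscp-class", dscp_class]
    if cos is not None:
        parts += ["--set-cos", str(cos)]
    if len(parts) == 2:
        raise ValueError("specify at least one SETQOS option")
    return " ".join(parts)


print(setqos_action(cos=5, dscp=10))
```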
Configuring Link Pause
The PAUSE frame is a flow control mechanism that halts the transmission of the transmitter for a
specified period of time. A server or other network node within the data center may be receiving traffic
faster than it can handle it, thus the PAUSE frame. In Cumulus Linux, individual ports can be configured
to execute link pause by:
Transmitting pause frames when its ingress buffers become congested (TX pause enable) and
/or
Responding to received pause frames (RX pause enable).
As with buffer and queue management, you configure link pause by editing
/etc/cumulus/datapath/traffic.conf.
Here is an example configuration that turns on both types of link pause for swp2 and swp3:
# to configure pause on a group of ports:
# uncomment the link pause port group list
# add or replace a port group name to the list
# populate the port set, e.g.
# swp1-swp4,swp8,swp50s0-swp50s3
# enable pause frame transmit and/or pause frame receive
# link pause
link_pause.port_group_list = [port_group_0]
link_pause.port_group_0.port_set = swp2-swp3
link_pause.port_group_0.rx_enable = true
link_pause.port_group_0.tx_enable = true
A port group refers to one or more sequences of contiguous ports. Multiple port groups can be defined
by:
Adding a comma-separated list of port group names to the port_group_list.
Adding the port_set, rx_enable, and tx_enable configuration lines for each port group.
You can specify the set of ports in a port group in comma-separated sequences of contiguous ports;
you can see which ports are contiguous in /var/lib/cumulus/porttab . The syntax supports:
A single port (swp1s0 or swp5)
A sequence of regular swp ports (swp2-swp5)
A sequence within a breakout swp port (swp6s0-swp6s3)
A sequence of regular and breakout ports, provided they are all in a contiguous range. For
example:
...
swp2
swp3
swp4
swp5
swp6s0
swp6s1
swp6s2
swp6s3
swp7
...
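The port set syntax can be illustrated with a short Python sketch that expands a port_set value into individual port names. It handles single ports, ranges of regular ports and ranges within one breakout port. A range that mixes regular and breakout ports depends on the ordering in /var/lib/cumulus/porttab, so this sketch rejects it rather than guess. The function name expand_port_set is hypothetical.

```python
import re

PORT_RE = re.compile(r"^swp(\d+)(?:s(\d+))?$")


def expand_port_set(port_set):
    """Expand e.g. 'swp1-swp4,swp8' into a list of port names."""
    ports = []
    for item in port_set.split(","):
        first, _, last = item.strip().partition("-")
        start = PORT_RE.match(first)
        if start is None:
            raise ValueError(f"bad port name: {first!r}")
        if not last:
            ports.append(first)  # a single port such as swp5 or swp1s0
            continue
        end = PORT_RE.match(last)
        if end is None:
            raise ValueError(f"bad port name: {last!r}")
        s_port, s_lane = start.groups()
        e_port, e_lane = end.groups()
        if s_lane is None and e_lane is None:
            # Sequence of regular ports, e.g. swp2-swp5.
            ports += [f"swp{n}" for n in range(int(s_port), int(e_port) + 1)]
        elif s_lane is not None and e_lane is not None and s_port == e_port:
            # Sequence within one breakout port, e.g. swp6s0-swp6s3.
            ports += [f"swp{s_port}s{n}" for n in range(int(s_lane), int(e_lane) + 1)]
        else:
            raise ValueError(f"mixed range needs porttab ordering: {item!r}")
    return ports


print(expand_port_set("swp1-swp4,swp8,swp50s0-swp50s3"))
```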
Restart switchd to allow link pause configuration changes to take effect:
cumulus@switch:~$ sudo service switchd restart
Useful Links
iptables-extensions man page
Caveats and Errata
You can configure Quality of Service (QoS) for 10G and 40G switches only; that is, any switch on
the Trident, Trident+, or Trident II platform.
Virtual Router Redundancy - VRR
VRR provides virtualized router redundancy in network configurations, which enables the hosts to
communicate with any redundant router without:
Needing to be reconfigured
Having to run dynamic router protocols
Having to run router redundancy protocols
A basic VRR-enabled network configuration is shown below. The network consists of several hosts, two
routers running Cumulus Linux and configured with CLAG (see page 215), and the rest of the network:
An actual implementation will have many more server hosts and network connections than are shown
here. But this basic configuration provides a complete description of the important aspects of the VRR
setup.
Contents
Contents (see page 251)
Configuring the Network (see page 251)
Configuring the Hosts (see page 252)
Configuring the Routers (see page 252)
Other Network Connections (see page 253)
Handling ARP Requests (see page 253)
Monitoring Peer Links and Uplinks (see page 253)
Using ifplugd (see page 253)
Notes (see page 255)
Configuring the Network
Configuring this network is fairly straightforward. First create the bridge subinterface, then create the
secondary address for the virtual router. Configure each router with a bridge; edit each router’s
/etc/network/interfaces file and add a configuration like the following:
auto bridge.500
iface bridge.500
address 192.168.0.252/24
address-virtual 00:00:5e:00:01:01 192.168.0.254/24
Notice the simpler configuration of the bridge with ifupdown2. For more information, see
Configuring and Managing Network Interfaces (see page 125).
You should always use ifupdown2 to configure VRR, because it ensures correct ordering
when bringing up the virtual and physical interfaces and it works best with VLAN-aware
bridges (see page 184).
If you are using the non-VLAN-aware bridge driver, the configuration would look like this:
auto bridge500
iface bridge500
address 192.168.0.252/24
address-virtual 00:00:5e:00:01:01 192.168.0.254/24
bridge_ports bond1.100 bond2.100 bond3.100
The IP address assigned to the bridge is the unique address for the bridge. The parameters of this
configuration are:
bridge.500: 500 represents a VLAN subinterface of the bridge, sometimes called a switched
virtual interface, or SVI.
192.168.0.252/24: The unique IP address assigned to this bridge. It is unique because, unlike
the 192.168.0.254 address, it is assigned only to this bridge, not the bridge on the other router.
00:00:5e:00:01:01: The MAC address of the virtual router. This must be the same on all
virtual routers.
192.168.0.254/24: The IP address of the virtual router, including the routing prefix. This must
be the same on all the virtual routers and must match the default gateway address configured
on the servers as well as the size of the subnet.
address-virtual: This keyword enables and configures VRR.
The above bridge configuration enables VRR by creating a MAC VLAN interface on the SVI. This MAC
VLAN interface is:
Named bridge-500-v0, which is the name of the SVI with dots changed to dashes and "-v0"
appended to the end.
Assigned a MAC address of 00:00:5e:00:01:01.
Assigned an IP address of 192.168.0.254/24.
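The naming rule and the subnet requirement above can be expressed in a few lines of Python. This is an illustrative sketch only; both function names are hypothetical and the check uses the standard ipaddress module rather than anything specific to Cumulus Linux.

```python
import ipaddress


def vrr_macvlan_name(svi_name):
    """SVI name with dots changed to dashes and '-v0' appended."""
    return svi_name.replace(".", "-") + "-v0"


def same_subnet(unique_addr, virtual_addr):
    """True if both interface addresses fall in the same network and prefix."""
    unique = ipaddress.ip_interface(unique_addr)
    virtual = ipaddress.ip_interface(virtual_addr)
    return unique.network == virtual.network


print(vrr_macvlan_name("bridge.500"))
print(same_subnet("192.168.0.252/24", "192.168.0.254/24"))
```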
Configuring the Hosts
Each host should have two network interfaces. The routers configure the interfaces as bonds running
LACP; each host should also configure its two interfaces using teaming, port aggregation, port group, or
EtherChannel running LACP. Configure the hosts, either statically or via DHCP, with a gateway address
that is the IP address of the virtual router; this default gateway address never changes.
Configure the links between the hosts and the routers in active-active mode for First Hop Redundancy
Protocol.
If you are configuring VRR without CLAG (see page 215), use active-standby mode instead.
Configuring the Routers
The routers implement the layer 2 network interconnecting the hosts, as well as the redundant routers.
If you are using CLAG (see page 215), configure each router with a bridge interface, named bridge in our
example, with these different types of interfaces:
One bond interface to each host (swp1-swp5 in the image above).
One or more interfaces to each peer router (peerbond in the image above). Multiple inter-peer
links are typically bonded interfaces in order to accommodate higher bandwidth between the
routers and to offer link redundancy.
If you are not using CLAG, then the bridge should have one switch port interface to each host
instead of a bond.
Other Network Connections
Other interfaces on the router can connect to other subnets and are accessed through layer 3
forwarding (swp7 in the image above).
Handling ARP Requests
The entire purpose of this configuration is to have all the redundant routers respond to ARP requests
from hosts for the virtual router IP address (192.168.0.254 in the example above) with the virtual router
MAC address (00:00:5e:00:01:01 in the example above). All of the routers should respond in an identical
manner, but if one router fails, the other redundant routers will continue to respond in an identical
manner, leaving the hosts with the impression that nothing has changed.
Since the bridges in each of the redundant routers are connected, they will each receive and reply to
ARP requests for the virtual router IP address. Each ARP request made by a host will receive multiple
replies (typically two). But these replies will be identical and so the host that receives these replies will
not get confused over which response is "correct" and will either ignore replies after the first, or accept
them and overwrite the previous reply with identical information.
Monitoring Peer Links and Uplinks
When an uplink on a switch in active-active mode goes down, the peer link may get congested. When
this occurs, you should monitor the uplink and shut down all host-facing ports using ifplugd (or
another script).
When the peer link goes down in a CLAG environment, one of the switches becomes secondary and all
host-facing dual-connected bonds on it go down. The host-side bond sees two different system MAC
addresses, so only the link to the primary switch stays active on the host. If any traffic from outside
this environment goes to the secondary CLAG switch, that traffic is black-holed. To avoid this, use
ifplugd to shut down all the uplinks when the peer link goes down.
Using ifplugd
ifplugd is a link state monitoring daemon that can execute user-specified scripts on link transitions
(not admin-triggered transitions, but transitions when a cable is plugged in or removed).
Run the following commands to install the ifplugd service:
cumulus@switch:$ sudo apt-get update
cumulus@switch:$ sudo apt-get install ifplugd
Next, configure ifplugd. The example below indicates that when the peerbond goes down in a CLAG
environment, ifplugd brings down all the uplinks. Run the following ifplugd script on both the
primary and secondary CLAG (see page 215) switches.
To configure ifplugd, modify /etc/default/ifplugd and add the appropriate peerbond interface
name. /etc/default/ifplugd will look like this:
INTERFACES="peerbond"
HOTPLUG_INTERFACES=""
ARGS="-q -f -u0 -d1 -w -I"
SUSPEND_ACTION="stop"
Next, modify the /etc/ifplugd/action.d/ifupdown script.
#!/bin/sh
set -e
# $2 is the link event that ifplugd passes to the action script: "up" or "down".
case "$2" in
up)
    clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
    if [ "$clagrole" = "secondary" ]
    then
        # List all the interfaces below to bring up when clag peerbond comes up.
        for interface in swp1 bond1 bond3 bond4
        do
            echo "bringing up : $interface"
            ip link set "$interface" up
        done
    fi
    ;;
down)
    clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
    if [ "$clagrole" = "secondary" ]
    then
        # List all the interfaces below to bring down when clag peerbond goes down.
        for interface in swp1 bond1 bond3 bond4
        do
            echo "bringing down : $interface"
            ip link set "$interface" down
        done
    fi
    ;;
esac
Finally, restart ifplugd for your changes to take effect:
cumulus@switch:$ sudo service ifplugd restart
Notes
The default shell is /bin/sh, which is dash and not bash. This makes for faster execution of the
script since dash is small and quick, but consequently less featureful than bash. For example, it
doesn't handle multiple uplinks.
IGMP and MLD Snooping
IGMP (Internet Group Management Protocol) and MLD (Multicast Listener Discovery) snooping
functionality is implemented in the bridge driver in the kernel. IGMP snooping processes IGMP v1/v2/v3
reports received on a bridge port in a bridge to identify the hosts which would like to receive multicast
traffic destined to that group.
When an IGMPv2 leave message is received, a group specific query is sent to identify if there are any
other hosts interested in that group, before the group is deleted.
An IGMP query message received on a port is used to identify the port that is connected to a router and
is interested in receiving multicast traffic.
MLD snooping processes MLD v1/v2 reports, queries and v1 done messages for IPv6 groups. If IGMP or
MLD snooping is disabled, multicast traffic will be flooded to all the bridge ports in the bridge. The
multicast group IP address is mapped to a multicast MAC address and a forwarding entry is created
with a list of ports interested in receiving multicast traffic destined to that group.
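The mapping from a multicast group IP address to a multicast MAC address follows the standard Ethernet rules: an IPv4 group maps to 01:00:5e followed by the low 23 bits of the address, and an IPv6 group maps to 33:33 followed by the low 32 bits. A minimal Python sketch of that mapping (the function name multicast_mac is hypothetical and is not part of the bridge driver):

```python
import ipaddress


def multicast_mac(group):
    """Return the Ethernet multicast MAC for an IPv4 or IPv6 group address."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    if addr.version == 4:
        low = int(addr) & 0x7FFFFF  # low 23 bits of the IPv4 address
        octets = [0x01, 0x00, 0x5E, (low >> 16) & 0xFF, (low >> 8) & 0xFF, low & 0xFF]
    else:
        low = int(addr) & 0xFFFFFFFF  # low 32 bits of the IPv6 address
        octets = [0x33, 0x33, (low >> 24) & 0xFF, (low >> 16) & 0xFF,
                  (low >> 8) & 0xFF, low & 0xFF]
    return ":".join(f"{octet:02x}" for octet in octets)


for group in ("234.10.10.10", "238.39.20.86", "ff1a::9"):
    print(group, multicast_mac(group))
```

Because only 23 bits of an IPv4 group address are carried into the MAC address, several groups can share one MAC address.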
Contents
Contents (see page 256)
Commands (see page 256)
Creating a Bridge and Configuring IGMP/MLD Snooping (see page 256)
Configuring IGMP/MLD Snooping Parameters (see page 258)
Persistent Configuration (see page 259)
Querier and Fast Leave Configuration (see page 259)
Static Group and Router Port Configuration (see page 260)
Configuration Files (see page 260)
Man Pages (see page 260)
Useful Links (see page 261)
Commands
brctl
bridge
Creating a Bridge and Configuring IGMP/MLD Snooping
Create a bridge and add bridge ports to the bridge. IGMP and MLD snooping are enabled by default on
the bridge:
cumulus@switch:~$ sudo brctl addbr br0
cumulus@switch:~$ sudo brctl addif br0 swp1 swp2 swp3
cumulus@switch:~$ sudo ifconfig br0 up
To get the IGMP/MLD snooping bridge state, use:
cumulus@switch:~# sudo brctl showstp br0
br0
 bridge id              8000.7072cf8c272c
 designated root        8000.7072cf8c272c
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               0.00                 tcn timer                  0.00
 gc timer                  0.00                 topology change timer    263.70
 hash elasticity        4096                    hash max                4096
 mc last member count      2                    mc init query count        2
 mc router                 1                    mc snooping                1
 mc last member timer      1.00                 mc membership timer      260.00
 mc querier timer        255.00                 mc query interval        125.00
 mc response interval     10.00                 mc init query interval    31.25
 mc querier                0                    mc query ifaddr            0
 flags

swp1 (1)
 port id                8001                    state                forwarding
 designated root        8000.7072cf8c272c       path cost                  2
 designated bridge      8000.7072cf8c272c       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 mc router                 1                    mc fast leave              0
 flags

swp2 (2)
 port id                8002                    state                forwarding
 designated root        8000.7072cf8c272c       path cost                  2
 designated bridge      8000.7072cf8c272c       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 mc router                 1                    mc fast leave              0
 flags

swp3 (3)
 port id                8003                    state                forwarding
 designated root        8000.7072cf8c272c       path cost                  2
 designated bridge      8000.7072cf8c272c       message age timer          0.00
 designated port        8003                    forward delay timer        8.98
 designated cost           0                    hold timer                 0.00
 mc router                 1                    mc fast leave              0
 flags
To get the groups and bridge port state, use the bridge mdb show command. To display router ports as
well as group information, use the bridge -d mdb show command:
cumulus@switch:~# sudo bridge -d mdb show
dev br0 port swp2 grp 234.10.10.10 temp
dev br0 port swp1 grp 238.39.20.86 permanent
dev br0 port swp1 grp 234.1.1.1 temp
dev br0 port swp2 grp ff1a::9 permanent
router ports on br0: swp3
cumulus@switch:~# sudo bridge mdb show
dev br0 port swp2 grp 234.10.10.10 temp
dev br0 port swp1 grp 238.39.20.86 permanent
dev br0 port swp1 grp 234.1.1.1 temp
dev br0 port swp2 grp ff1a::9 permanent
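If you need to process this output in a script, the entries split cleanly on whitespace. The following Python sketch assumes the line format shown above ("dev BRIDGE port PORT grp GROUP STATE") and skips anything else, such as the "router ports on" line; the function names are hypothetical.

```python
def parse_mdb(output):
    """Parse 'bridge mdb show' lines into a list of dictionaries."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if (len(fields) == 7 and fields[0] == "dev"
                and fields[2] == "port" and fields[4] == "grp"):
            entries.append({
                "bridge": fields[1],
                "port": fields[3],
                "group": fields[5],
                "state": fields[6],  # "temp" or "permanent" in the example
            })
    return entries


def ports_for_group(entries, group):
    """Return the ports that have joined the given multicast group."""
    return [entry["port"] for entry in entries if entry["group"] == group]


sample = """dev br0 port swp2 grp 234.10.10.10 temp
dev br0 port swp1 grp 238.39.20.86 permanent
dev br0 port swp1 grp 234.1.1.1 temp
dev br0 port swp2 grp ff1a::9 permanent
router ports on br0: swp3"""

entries = parse_mdb(sample)
print(len(entries), ports_for_group(entries, "ff1a::9"))
```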
To disable IGMP and MLD snooping, use:
cumulus@switch:~$ sudo brctl setmcsnoop br0 0
Configuring IGMP/MLD Snooping Parameters
For an explanation of these parameters, see the brctl and bridge-utils-interfaces man pages:
cumulus@switch:~$ sudo brctl setmclmc br0 2
cumulus@switch:~$ sudo brctl setmcrouter br0 1
cumulus@switch:~$ sudo brctl setmcsqc br0 2
cumulus@switch:~$ sudo brctl sethashel br0 4096
cumulus@switch:~$ sudo brctl sethashmax br0 4096
cumulus@switch:~$ sudo brctl setmclmi br0 1
cumulus@switch:~$ sudo brctl setmcmi br0 260
cumulus@switch:~$ sudo brctl setmcqpi br0 255
cumulus@switch:~$ sudo brctl setmcqi br0 125
cumulus@switch:~$ sudo brctl setmcqri br0 10
cumulus@switch:~$ sudo brctl setmcsqi br0 31
Persistent Configuration
The configuration in /etc/network/interfaces below is for the example bridge above:
auto br0
iface br0 inet static
bridge-ports swp1 swp2 swp3
bridge-mclmc 2
bridge-mcrouter 1
bridge-mcsnoop 1
bridge-mcsqc 2
bridge-mcqifaddr 0
bridge-mcquerier 0
bridge-hashel 4096
bridge-hashmax 4096
bridge-mclmi 1
bridge-mcmi 260
bridge-mcqpi 255
bridge-mcqi 125
bridge-mcqri 10
bridge-mcsqi 31
bridge-portmcrouter swp1=1 swp2=1
bridge-portmcfl swp1=0 swp2=0
Querier and Fast Leave Configuration
If there is no multicast router in the VLAN, the IGMP/MLD snooping querier can be configured to
generate query messages.
To send queries with a non-zero IP address, configure an IP address on the bridge device, enable the
querier, then set mcqifaddr to 1 using brctl setmcqifaddr:
cumulus@switch:~# sudo brctl setmcquerier br0 1
cumulus@switch:~$ sudo brctl setmcqifaddr br0 1
If only one host is attached to each host port, fast leave can be configured on that port. When a leave
message is received on that port, no query messages will be sent and the group will be deleted
immediately:
cumulus@switch:~# sudo brctl setportmcfl br0 swp1 1
Static Group and Router Port Configuration
To configure a static permanent multicast group on a port, use:
cumulus@switch:~# sudo bridge mdb add dev br0 port swp2 grp ff1a::9 permanent
cumulus@switch:~# sudo bridge mdb add dev br0 port swp1 grp 238.39.20.86 permanent
A static temporary multicast group can also be configured on a port, which would be deleted after the
membership timer expires, if no report is received on that port:
cumulus@switch:~# sudo bridge mdb add dev br0 port swp1 grp 238.39.20.86 temp
To configure a static router port, use:
cumulus@switch:~# sudo brctl setportmcrouter br0 swp3 2
Configuration Files
/etc/network/interfaces
Man Pages
brctl(8)
bridge(8)
bridge-utils-interfaces(5)
Useful Links
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge#Snooping
https://tools.ietf.org/html/rfc4541
http://en.wikipedia.org/wiki/IGMP_snooping
http://tools.ietf.org/rfc/rfc2236.txt
http://tools.ietf.org/html/rfc3376
http://tools.ietf.org/search/rfc2710
http://tools.ietf.org/html/rfc3810
Layer 3 Features
Routing (see page 262)
Introduction to Routing Protocols (see page 267)
Network Topology (see page 269)
Quagga Overview (see page 271)
Configuring Quagga (see page 273)
Open Shortest Path First (OSPF) Protocol (see page 285)
Open Shortest Path First v3 (OSPFv3) Protocol (see page 294)
Configuring Border Gateway Protocol (BGP) (see page 297)
Hardware ECMP Hashing (see page 313)
Routing
This chapter discusses routing on switches running Cumulus Linux.
Contents
Contents (see page 262)
Commands (see page 262)
Static Routing via ip route (see page 262)
Persistently Adding a Static Route (see page 264)
Static Routing via quagga (see page 265)
Persistent Configuration (see page 266)
Supported Route Table Entries (see page 267)
Configuration Files (see page 267)
Useful Links (see page 267)
Caveats and Errata (see page 267)
Commands
ip route
Static Routing via ip route
The ip route command allows manipulating the kernel routing table directly from the Linux shell. See
man ip(8) for details. quagga monitors the kernel routing table changes and updates its own routing
table accordingly.
To display the routing table:
cumulus@switch:~$ ip route show
default via 10.0.1.2 dev eth0
10.0.1.0/24 dev eth0  proto kernel  scope link  src 10.0.1.52
192.0.2.0/24 dev swp1  proto kernel  scope link  src 192.0.2.12
192.0.2.10/24 via 192.0.2.1 dev swp1  proto zebra  metric 20
192.0.2.20/24  proto zebra  metric 20
        nexthop via 192.0.2.1  dev swp1 weight 1
        nexthop via 192.0.2.2  dev swp2 weight 1
192.0.2.30/24 via 192.0.2.1 dev swp1  proto zebra  metric 20
192.0.2.40/24 dev swp2  proto kernel  scope link  src 192.0.2.42
192.0.2.50/24 via 192.0.2.2 dev swp2  proto zebra  metric 20
192.0.2.60/24 via 192.0.2.2 dev swp2  proto zebra  metric 20
192.0.2.70/24  proto zebra  metric 30
        nexthop via 192.0.2.1  dev swp1 weight 1
        nexthop via 192.0.2.2  dev swp2 weight 1
198.51.100.0/24 dev swp3  proto kernel  scope link  src 198.51.100.1
198.51.100.10/24 dev swp4  proto kernel  scope link  src 198.51.100.11
198.51.100.20/24 dev br0  proto kernel  scope link  src 198.51.100.21
To add a static route (does not persist across reboots):
cumulus@switch:~$ sudo ip route add 203.0.113.0/24 via 198.51.100.2
cumulus@switch:~$ ip route
default via 10.0.1.2 dev eth0
10.0.1.0/24 dev eth0  proto kernel  scope link  src 10.0.1.52
192.0.2.0/24 dev swp1  proto kernel  scope link  src 192.0.2.12
192.0.2.10/24 via 192.0.2.1 dev swp1  proto zebra  metric 20
192.0.2.20/24  proto zebra  metric 20
        nexthop via 192.0.2.1  dev swp1 weight 1
        nexthop via 192.0.2.2  dev swp2 weight 1
192.0.2.30/24 via 192.0.2.1 dev swp1  proto zebra  metric 20
192.0.2.40/24 dev swp2  proto kernel  scope link  src 192.0.2.42
192.0.2.50/24 via 192.0.2.2 dev swp2  proto zebra  metric 20
192.0.2.60/24 via 192.0.2.2 dev swp2  proto zebra  metric 20
192.0.2.70/24  proto zebra  metric 30
        nexthop via 192.0.2.1  dev swp1 weight 1
        nexthop via 192.0.2.2  dev swp2 weight 1
198.51.100.0/24 dev swp3  proto kernel  scope link  src 198.51.100.1
198.51.100.10/24 dev swp4  proto kernel  scope link  src 198.51.100.11
198.51.100.20/24 dev br0  proto kernel  scope link  src 198.51.100.21
203.0.113.0/24 via 198.51.100.2 dev swp3
To delete a static route (does not persist across reboots):
cumulus@switch:~$ sudo ip route del 203.0.113.0/24
cumulus@switch:~$ ip route
default via 10.0.1.2 dev eth0
10.0.1.0/24 dev eth0  proto kernel  scope link  src 10.0.1.52
192.0.2.0/24 dev swp1  proto kernel  scope link  src 192.0.2.12
192.0.2.10/24 via 192.0.2.1 dev swp1  proto zebra  metric 20
192.0.2.20/24  proto zebra  metric 20
        nexthop via 192.0.2.1  dev swp1 weight 1
        nexthop via 192.0.2.2  dev swp2 weight 1
192.0.2.30/24 via 192.0.2.1 dev swp1  proto zebra  metric 20
192.0.2.40/24 dev swp2  proto kernel  scope link  src 192.0.2.42
192.0.2.50/24 via 192.0.2.2 dev swp2  proto zebra  metric 20
192.0.2.60/24 via 192.0.2.2 dev swp2  proto zebra  metric 20
192.0.2.70/24  proto zebra  metric 30
        nexthop via 192.0.2.1  dev swp1 weight 1
        nexthop via 192.0.2.2  dev swp2 weight 1
198.51.100.0/24 dev swp3  proto kernel  scope link  src 198.51.100.1
198.51.100.10/24 dev swp4  proto kernel  scope link  src 198.51.100.11
198.51.100.20/24 dev br0  proto kernel  scope link  src 198.51.100.21
Persistently Adding a Static Route
To add a static route persistently, add an up ip route add .. line to /etc/network/interfaces.
For example:
cumulus@switch:~$ cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
auto swp3
iface swp3
    address 198.51.100.1/24
    up ip route add 203.0.113.0/24 via 198.51.100.2
Notice the simpler configuration of swp3 due to ifupdown2. For more information, see
Configuring Network Interfaces with ifupdown (see page 125).
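The persistent route in the swp3 example works because the next hop, 198.51.100.2, lies inside the subnet configured on swp3. The following Python sketch is illustrative only and is not part of Cumulus Linux; the function name render_stanza is hypothetical. It renders a stanza of the same shape and rejects a next hop that is not on the interface's connected subnet.

```python
import ipaddress


def render_stanza(ifname, address, routes):
    """Return an /etc/network/interfaces stanza with persistent static routes.

    routes is a list of (prefix, next_hop) pairs. Raises ValueError when a
    next hop is not on the interface's connected subnet.
    """
    iface = ipaddress.ip_interface(address)
    lines = ["auto " + ifname, "iface " + ifname, "    address " + address]
    for prefix, next_hop in routes:
        # ip_network() rejects prefixes that have host bits set.
        network = ipaddress.ip_network(prefix)
        if ipaddress.ip_address(next_hop) not in iface.network:
            raise ValueError(next_hop + " is not on the subnet of " + ifname)
        lines.append("    up ip route add %s via %s" % (network, next_hop))
    return "\n".join(lines)


print(render_stanza("swp3", "198.51.100.1/24",
                    [("203.0.113.0/24", "198.51.100.2")]))
```

The check is deliberately strict; a route whose next hop is reached through another interface would need its own stanza.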
Static Routing via quagga
Static routes can also be managed via the quagga CLI. quagga adds the routes to its own routing
table and then installs them in the kernel routing table as well.
To add a static route (does not persist across reboots):
cumulus@switch:~$ sudo vtysh
Hello, this is Quagga (version 0.99.21).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
switch# conf t
switch(config)# ip route 203.0.113.0/24 198.51.100.2
switch(config)# end
switch# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel,
> - selected route, * - FIB route
K>* 0.0.0.0/0 via 10.0.1.2, eth0
C>* 10.0.1.0/24 is directly connected, eth0
O   192.0.2.0/24 [110/10] is directly connected, swp1, 00:13:25
C>* 192.0.2.0/24 is directly connected, swp1
O>* 192.0.2.10/24 [110/20] via 192.0.2.1, swp1, 00:13:09
O>* 192.0.2.20/24 [110/20] via 192.0.2.1, swp1, 00:13:09
  *                        via 192.0.2.41, swp2, 00:13:09
O>* 192.0.2.30/24 [110/20] via 192.0.2.1, swp1, 00:13:09
O   192.0.2.40/24 [110/10] is directly connected, swp2, 00:13:25
C>* 192.0.2.40/24 is directly connected, swp2
O>* 192.0.2.50/24 [110/20] via 192.0.2.41, swp2, 00:13:09
O>* 192.0.2.60/24 [110/20] via 192.0.2.41, swp2, 00:13:09
O>* 192.0.2.70/24 [110/30] via 192.0.2.1, swp1, 00:13:09
  *                        via 192.0.2.41, swp2, 00:13:09
O   198.51.100.0/24 [110/10] is directly connected, swp3, 00:13:22
C>* 198.51.100.0/24 is directly connected, swp3
O   198.51.100.10/24 [110/10] is directly connected, swp4, 00:13:22
C>* 198.51.100.10/24 is directly connected, swp4
O   198.51.100.20/24 [110/10] is directly connected, br0, 00:13:22
C>* 198.51.100.20/24 is directly connected, br0
S>* 203.0.113.0/24 [1/0] via 198.51.100.2, swp3
C>* 127.0.0.0/8 is directly connected, lo
To delete a static route (does not persist across reboot):
cumulus@switch:~$ sudo vtysh
Hello, this is Quagga (version 0.99.21).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
switch# conf t
switch(config)# no ip route 203.0.113.0/24 198.51.100.2
switch(config)# end
switch# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel,
> - selected route, * - FIB route
K>* 0.0.0.0/0 via 10.0.1.2, eth0
C>* 10.0.1.0/24 is directly connected, eth0
O   192.0.2.0/24 [110/10] is directly connected, swp1, 00:13:55
C>* 192.0.2.0/24 is directly connected, swp1
O>* 192.0.2.10/24 [110/20] via 192.0.2.1, swp1, 00:13:39
O>* 192.0.2.20/24 [110/20] via 192.0.2.1, swp1, 00:13:39
  *                        via 192.0.2.41, swp2, 00:13:39
O>* 192.0.2.30/24 [110/20] via 192.0.2.1, swp1, 00:13:39
O   192.0.2.40/24 [110/10] is directly connected, swp2, 00:13:55
C>* 192.0.2.40/24 is directly connected, swp2
O>* 192.0.2.50/24 [110/20] via 192.0.2.41, swp2, 00:13:39
O>* 192.0.2.60/24 [110/20] via 192.0.2.41, swp2, 00:13:39
O>* 192.0.2.70/24 [110/30] via 192.0.2.1, swp1, 00:13:39
  *                        via 192.0.2.41, swp2, 00:13:39
O   198.51.100.0/24 [110/10] is directly connected, swp3, 00:13:52
C>* 198.51.100.0/24 is directly connected, swp3
O   198.51.100.10/24 [110/10] is directly connected, swp4, 00:13:52
C>* 198.51.100.10/24 is directly connected, swp4
O   198.51.100.20/24 [110/10] is directly connected, br0, 00:13:52
C>* 198.51.100.20/24 is directly connected, br0
C>* 127.0.0.0/8 is directly connected, lo
switch#
Persistent Configuration
From the quagga CLI, the running configuration can be saved so it persists between reboots:
switch# write mem
Configuration saved to /etc/quagga/zebra.conf
switch# end
Supported Route Table Entries
Cumulus Linux supports different numbers of route entries, depending upon your switch platform
(Trident, Trident+, or Trident II; see the HCL) and whether the routes are IPv4 or IPv6.
In addition, switches on the Trident II platform are configured to manage route table entries using
Algorithm Longest Prefix Match (ALPM). In ALPM mode, the hardware can store significantly more route
entries.
The following numbers of routes are supported on Trident II switches with ALPM:
32K IPv4 routes
16K IPv6 routes
32K total routes (both IPv4 and IPv6)
The following numbers of routes are supported on Trident and Trident+ switches:
16K IPv4 routes
8K IPv6 routes
16K total routes (both IPv4 and IPv6)
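Because the IPv4, IPv6 and combined limits all apply at once, a planned route count has to satisfy three conditions. The Python sketch below is illustrative only; the platform keys and the function name fits are hypothetical, and it assumes that "K" means 1024 entries.

```python
K = 1024  # assumption: "K" in the limits above means 1024 entries

# Limits copied from the lists above; the dictionary keys are made-up labels.
LIMITS = {
    "trident2_alpm": {"ipv4": 32 * K, "ipv6": 16 * K, "total": 32 * K},
    "trident": {"ipv4": 16 * K, "ipv6": 8 * K, "total": 16 * K},
}


def fits(platform, ipv4_routes, ipv6_routes):
    """Return True when the route counts respect all three limits."""
    limit = LIMITS[platform]
    return (ipv4_routes <= limit["ipv4"]
            and ipv6_routes <= limit["ipv6"]
            and ipv4_routes + ipv6_routes <= limit["total"])


print(fits("trident2_alpm", 20000, 10000))
print(fits("trident", 12000, 6000))
```

In the second call each address family is within its own limit, yet the combined count exceeds the total, so the function returns False.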
Configuration Files
/etc/network/interfaces
/etc/quagga/zebra.conf
Useful Links
http://linux-ip.net/html/tools-ip-route.html
http://www.nongnu.org/quagga/docs/docs-info.html#Static-Route-Commands
Caveats and Errata
Static routes added via quagga can be deleted via the Linux shell. This operation, while possible,
should be avoided. Routes added by quagga should only be deleted by quagga; otherwise,
quagga might not be able to clean up all its internal state completely, and incorrect routing can
occur as a result.
Introduction to Routing Protocols
This chapter discusses the various routing protocols, and how to configure them.
Contents
Contents (see page 268)
Defining Routing Protocols (see page 268)
Configuring Routing Protocols (see page 268)
Protocol Tuning (see page 268)
Configuration Files (see page 269)
Defining Routing Protocols
A routing protocol dynamically computes reachability between various end points. This enables
communication to work around link and node failures, and additions and withdrawals of various
addresses.
IP routing protocols are typically distributed; that is, an instance of the routing protocol runs on each of
the routers in a network.
Cumulus Linux does not support running multiple instances of the same protocol on a router.
Distributed routing protocols compute reachability between end points by disseminating relevant
information and running a routing algorithm on this information to determine the routes to each end
station. To scale the amount of information that needs to be exchanged, routes are computed on
address prefixes rather than on every end point address.
Configuring Routing Protocols
A routing protocol needs to know three pieces of information, at a minimum:
Who am I (my identity)
To whom to disseminate information
What to disseminate
Most routing protocols use the concept of a router ID to identify a node. Different routing protocols
answer the last two questions differently.
The way they answer these questions affects the network design and thereby configuration. For
example, in a link-state protocol such as OSPF (see Open Shortest Path First (OSPF) Protocol (see page
285)) or IS-IS, complete local information (links and attached address prefixes) about a node is
disseminated to every other node in the network. Since the state that a node has to keep grows rapidly
in such a case, link-state protocols typically limit the number of nodes that communicate this way. They
allow for bigger networks to be built by breaking up a network into a set of smaller subnetworks (which
are called areas or levels), and by advertising summarized information about an area to other areas.
Besides the three critical pieces of information mentioned above, protocols have other parameters that
can be configured. These are usually specific to each protocol.
Protocol Tuning
Most protocols provide certain tunable parameters that are specific to convergence during changes.
Wikipedia defines convergence as the “state of a set of routers that have the same topological
information about the network in which they operate”. It is imperative that the routers in a network
have the same topological state for the proper functioning of a network. Without this, traffic can be
blackholed, and thus not reach its destination. It is normal for different routers to have differing
topological states during changes, but this difference should vanish as the routers exchange
information about the change and recompute the forwarding paths. Different protocols converge at
different speeds in the presence of changes.
A key factor that governs how quickly a routing protocol converges is the time it takes to detect the
change. For example, how quickly can a routing protocol be expected to act when there is a link failure?
Routing protocols classify changes into two kinds: hard changes such as link failures, and soft changes
such as a peer dying silently. They’re classified differently because protocols provide different
mechanisms for dealing with these failures.
It is important to configure the protocols to be notified immediately on link changes. This is also true
when a node goes down, causing all of its links to go down.
Even if a link doesn’t fail, a routing peer can crash. Usually this causes that router to delete the routes it
has computed. Worse, the crash can leave that router impervious to changes in the network, so that it
goes out of sync with the other routers because it no longer shares the same topological
information as its peers.
The most common way to detect a protocol peer dying is to detect the absence of a heartbeat. All
routing protocols send a heartbeat (or “hello”) packet periodically. When a node does not see a
consecutive set of these hello packets from a peer, it declares its peer dead and informs other routers
in the network about this. The period of each heartbeat and the number of heartbeats that need to be
missed before a peer is declared dead are two popular configurable parameters.
If you set these timers very low, the network can quickly become unstable under stressful
conditions: a router that is busy computing routing state may not send its heartbeats on time, or
heavy traffic may cause the hellos to be lost. Conversely, setting these timers very high also
causes traffic to be blackholed, because it takes much longer to detect peer failures. The default
values within each protocol are usually good enough for most networks.
Cumulus Networks recommends you do not adjust these settings.
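The trade-off between the two timer parameters is simple arithmetic: the time needed to declare a silent peer dead is roughly the heartbeat period multiplied by the number of heartbeats that must be missed. The Python sketch below is illustrative only; the function name and the sample values are assumptions, not defaults of any particular protocol.

```python
def detection_time(hello_interval_s, missed_hellos):
    """Approximate seconds before a silent peer is declared dead."""
    return hello_interval_s * missed_hellos


# Illustrative values only: a 30 second heartbeat and 3 missed heartbeats.
print(detection_time(30, 3))
# Shortening the heartbeat detects failures sooner but leaves less
# headroom for a busy router or a congested link.
print(detection_time(1, 3))
```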
Configuration Files
/etc/quagga/daemons
Network Topology
In computer networks, topology refers to the structure of interconnecting various nodes. Some
commonly used topologies in networks are star, hub and spoke, leaf and spine, and broadcast.
Contents
Contents (see page 269)
Clos Topologies (see page 269)
Over-Subscribed and Non-Blocking Configurations (see page 270)
Containing the Failure Domain (see page 271)
Load Balancing (see page 271)
Clos Topologies
The Clos, or fat tree, topology is used in the vast majority of modern data centers. This topology is
shown in the figure below. It is also commonly referred to as a leaf-spine topology. We use this
topology throughout the routing protocol guide.
This topology allows the building of networks of varying size using nodes of different port counts
and/or by increasing the tiers. The picture above is a three-tiered Clos network. We number the tiers from
the bottom to the top. Thus, in the picture, the lowermost layer is called tier 1 and the topmost tier is
called tier 3.
The number of end stations (such as servers) that can be attached to such a network is determined by
a very simple mathematical formula.
In a 2-tier network, if each node is made up of m ports, then the total number of end stations that can
be connected is m^2/2. In more general terms, if tier-1 nodes are m-port nodes and tier-2 nodes are
n-port nodes, then the total number of end stations that can be connected is (m*n)/2. In a three-tier
network, where tier-3 nodes are o-port nodes, the total number of end stations that can be connected
is (m*n*o)/2^(number of tiers-1), that is, (m*n*o)/4.
Let’s consider some practical examples. In many data centers, it is typical to connect 40 servers to a
top-of-rack (ToR) switch. The ToRs are all connected via a set of spine switches. If a ToR switch has 64 ports,
then after hooking up 40 ports to the servers, the remaining 24 ports can be hooked up to 24 spine
switches of the same link speed or to a smaller number of higher link speed switches. For example, if
the servers are all hooked up as 10GE links, then the ToRs can connect to the spine switches via 40G
links. So, instead of connecting to 24 spine switches with 10G links, the ToRs can connect to 6 spine
switches with each link being 40G. If the spine switches are also 64-port switches, then the total
number of end stations that can be connected is 2560 (40*64) stations.
In a three-tier network of 64-port switches, the total number of servers that can be connected is
(40*64*64)/2 = 81920. As you can see, this kind of topology can serve quite a large network with three
tiers.
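The capacity formulas and worked examples above can be checked with a few lines of Python. This is an illustrative sketch; the function names are hypothetical. The first function implements the general formula, in which every tier below the top splits its ports evenly between downlinks and uplinks. The second covers the practical case in which a fixed number of ToR ports face servers.

```python
from math import prod


def clos_capacity(port_counts):
    """End stations for tiers given bottom to top: product / 2^(tiers-1)."""
    return prod(port_counts) // 2 ** (len(port_counts) - 1)


def clos_capacity_with_servers(servers_per_tor, upper_tier_ports):
    """End stations when each ToR dedicates servers_per_tor ports to servers.

    upper_tier_ports lists the port counts of the tiers above the ToR,
    bottom to top; every upper tier except the topmost splits its ports
    evenly between downlinks and uplinks.
    """
    return (servers_per_tor * prod(upper_tier_ports)
            // 2 ** (len(upper_tier_ports) - 1))


print(clos_capacity([64, 64]))                     # m*n/2 with m = n = 64
print(clos_capacity_with_servers(40, [64]))        # 40*64
print(clos_capacity_with_servers(40, [64, 64]))    # (40*64*64)/2
```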
Over-Subscribed and Non-Blocking Configurations
In the above example, the network is over-subscribed; that is, 400G of bandwidth from end stations (40
servers * 10GE links) is serviced by only 240G of inter-rack bandwidth. The ratio of inter-rack bandwidth
to end-station bandwidth is 0.6 (240/400), which corresponds to an over-subscription ratio of 5:3 (400:240).
This can lead to congestion and hot spots in the network. If network operators instead connect 32
servers per rack, then 32 ports are left to be connected to spine switches. The network is then said to
be rearrangeably non-blocking: any server in a rack can talk to any other server in any other rack
without necessarily blocking traffic between other servers.
In such a network, the total number of servers that can be connected is (64*64)/2 = 2048. Similarly, a
three-tier version of the same network can serve up to (64*64*64)/4 = 65536 servers.
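Whether a rack is over-subscribed can be read off the ratio of uplink bandwidth to server-facing bandwidth. The Python sketch below is illustrative only, and the function name is hypothetical. It reproduces the 240/400 figure from the example and shows that the 32-server layout reaches a ratio of 1.

```python
from fractions import Fraction


def uplink_to_downlink(server_ports, server_gbps, uplink_ports, uplink_gbps):
    """Ratio of inter-rack bandwidth to end-station bandwidth for one ToR."""
    return Fraction(uplink_ports * uplink_gbps, server_ports * server_gbps)


# 40 servers at 10G with 24 uplinks at 10G, or equivalently 6 uplinks at 40G.
print(float(uplink_to_downlink(40, 10, 24, 10)))
print(float(uplink_to_downlink(40, 10, 6, 40)))
# 32 servers at 10G with 32 uplinks at 10G: no over-subscription.
print(float(uplink_to_downlink(32, 10, 32, 10)))
```

A ratio below 1 means the rack can offer more traffic than its uplinks can carry.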
Containing the Failure Domain
Traditional data centers were built using just two spine switches. This means that if one of those
switches fails, the network bandwidth is cut in half, thereby greatly increasing network congestion and
adversely affecting many applications. To avoid this, vendors typically try to make the spine switches
resilient to failures by providing such features as dual control line cards and attempting to make the
software highly available. However, as Douglas Adams famously noted, “>>>”. In many cases, HA is
among the top two or three causes of software failure (and thereby switch failure).
To support a fairly large network with just two spine switches also means that these switches have a
large port count. This can make the switches quite expensive.
If the number of spine switches is merely doubled, the effect of a single switch failure is
halved. With 8 spine switches, a single switch failure causes only a 12.5% reduction in
available bandwidth.
So, in modern data centers, people build networks with anywhere from 4 to 32 spine switches.
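The effect of the spine count on the failure domain follows directly from dividing the traffic evenly across the spines. The Python sketch below is illustrative only; the function name is hypothetical, and it assumes that every spine carries an equal share of the bandwidth.

```python
def bandwidth_lost(spine_count, failed=1):
    """Fraction of inter-rack bandwidth lost when `failed` spines go down."""
    return failed / spine_count


for spines in (2, 4, 8, 16, 32):
    print(spines, "spines:", format(bandwidth_lost(spines), ".2%"))
```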
Load Balancing
In a Clos network, traffic is load balanced across the multiple links using equal cost multi-pathing
(ECMP).
Routing algorithms compute shortest paths between two end stations where shortest is typically the
lowest path cost. Each link is assigned a metric or cost. By default, a link’s cost is a function of the link
speed. The higher the link speed, the lower its cost. A 10G link has a higher cost than a 40G or 100G
link, but a lower cost than a 1G link. Thus, the link cost is a measure of its traffic carrying capacity.
In the modern data center, the links between tiers of the network are homogeneous; that is, they have
the same characteristics (same speed and therefore link cost) as the other links. As a result, the first
hop router can pick any of the spine switches to forward a packet to its destination (assuming that
there is no link failure between the spine and the destination switch). Most routing protocols recognize
that there are multiple equal-cost paths to a destination and enable any of them to be selected for a
given traffic flow.
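One common way to select among equal-cost paths is to hash fields that identify a flow, so that all packets of one flow follow the same path while different flows spread across the spines. The Python sketch below only illustrates that idea. The choice of fields, the use of CRC32 and the function name are assumptions; the hash the switch hardware actually uses is described in Hardware ECMP Hashing (see page 313).

```python
import zlib


def pick_next_hop(flow, next_hops):
    """Map a flow to one of several equal-cost next hops.

    flow is (src_ip, dst_ip, protocol, src_port, dst_port). The same flow
    always maps to the same next hop, which keeps its packets in order.
    """
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


spines = ["192.0.2.1", "192.0.2.2"]
flow = ("198.51.100.11", "203.0.113.30", "tcp", 49152, 80)
print(pick_next_hop(flow, spines))
```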
Quagga Overview
Cumulus Linux uses quagga, an open source routing software suite, to provide the routing protocols
for dynamic routing. Cumulus Linux supports the latest Quagga version, 0.99.23.1. Quagga is a fork of
the GNU Zebra project.
Quagga provides many routing protocols, of which Cumulus Linux supports the following:
Open Shortest Path First (v2 (see page 285) and v3 (see page 294))
Border Gateway Protocol (see page 297)
Contents
Contents (see page 271)
Architecture (see page 272)
Zebra (see page 272)
Configuration Files (see page 272)
Useful Links (see page 273)
Architecture
As shown in the figure above, the Quagga routing suite consists of various protocol-specific daemons
and a protocol-independent daemon called zebra. Each protocol-specific daemon is
responsible for running the relevant protocol and building the routing table based on the information
exchanged.
It is not uncommon to have more than one protocol daemon running at the same time. For example, at
the edge of an enterprise, protocols internal to an enterprise (called IGP for Interior Gateway Protocol)
such as OSPF (see page 285) or RIP run alongside the protocols that connect an enterprise to the rest of
the world (called EGP or Exterior Gateway Protocol) such as BGP (see page 297).
zebra is the daemon that resolves the routes provided by multiple protocols (including static routes
specified by the user) and programs these routes in the Linux kernel via netlink. zebra
does more than this, of course.
Zebra
The quagga documentation defines zebra as the IP routing manager for quagga that “provides kernel
routing table updates, interface lookups, and redistribution of routes between different routing
protocols.”
Configuration Files
/etc/quagga/bgpd.conf
/etc/quagga/daemons
/etc/quagga/debian.conf
/etc/quagga/ospf6d.conf
/etc/quagga/ospfd.conf
/etc/quagga/vtysh.conf
/etc/quagga/zebra.conf
Useful Links
http://www.quagga.net/
http://packages.debian.org/quagga
Configuring Quagga
This section provides an overview of configuring quagga.
Before you run quagga, make sure all relevant daemons, such as zebra, are enabled in
/etc/quagga/daemons. After you change that file, restart quagga with service quagga restart.
Contents
Contents (see page 273)
Configuration Files (see page 274)
Starting Quagga (see page 274)
Understanding Integrated Configurations (see page 274)
Interface IP Addresses (see page 276)
Using the vtysh Modal CLI (see page 276)
Using the Cumulus Linux Non-Modal CLI (see page 280)
Comparing vtysh and Cumulus Linux Commands (see page 281)
Creating a New Neighbor (see page 281)
Redistributing Routing Information (see page 281)
Defining a Static Route (see page 281)
Configuring an IPv6 Interface (see page 282)
Enabling PTM (see page 282)
Configuring MTU in IPv6 Network Discovery (see page 282)
Logging OSPF Adjacency Changes (see page 283)
Setting OSPF Interface Priority (see page 283)
Configuring Timing for OSPF SPF Calculations (see page 283)
Configuring Hello Packet Intervals (see page 283)
Displaying OSPF Debugging Status (see page 284)
Displaying BGP Information (see page 284)
Useful Links (see page 284)
Configuration Files
At startup, quagga reads a set of files to determine the startup configuration. The files and what they
contain are specified below:
File          Description
Quagga.conf   The default, integrated, single configuration file for all quagga daemons.
daemons       Contains the list of quagga daemons that must be started.
zebra.conf    Configuration file for the zebra daemon.
ospfd.conf    Configuration file for the OSPFv2 daemon.
ospf6d.conf   Configuration file for the OSPFv3 daemon.
bgpd.conf     Configuration file for the BGP daemon.
Starting Quagga
Quagga does not start by default in Cumulus Linux 2.0.
Before you start quagga, modify /etc/quagga/daemons to enable the corresponding daemons:
zebra=yes (* this one is mandatory to bring the others up)
bgpd=yes
ospfd=yes
ospf6d=yes
ripd=no
ripngd=no
isisd=no
babeld=no
Then, start quagga:
cumulus@switch1:~$ sudo service quagga start
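Because zebra is mandatory for bringing up the other daemons, it can be worth checking the daemons file before starting quagga. The Python sketch below is illustrative only and is not a Cumulus Linux tool; the function names are hypothetical. It assumes the name=yes or name=no line format shown above and treats only the value yes as enabled.

```python
def enabled_daemons(text):
    """Return the daemon names set to "yes" in daemons-file text."""
    enabled = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Look only at the first word so that trailing remarks are ignored.
        words = value.split()
        if words and words[0] == "yes":
            enabled.append(name.strip())
    return enabled


def check_daemons(text):
    """Raise ValueError if a protocol daemon is enabled without zebra."""
    enabled = enabled_daemons(text)
    if enabled and "zebra" not in enabled:
        raise ValueError("zebra must be enabled to bring the others up")
    return enabled


sample = "zebra=yes\nbgpd=yes\nospfd=no\n"
print(check_daemons(sample))
```

To check a real switch, the text could be read from /etc/quagga/daemons instead of the sample string.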
Understanding Integrated Configurations
By default in Cumulus Linux, quagga saves the configuration of all daemons in a single integrated
configuration file, Quagga.conf.
You can disable this mode by running:
quagga(config)# no service integrated-vtysh-config
quagga(config)#
To enable the integrated configuration file mode again, run:
quagga(config)# service integrated-vtysh-config
quagga(config)#
If you disable the integrated configuration mode, quagga saves each daemon's configuration
in a separate file. For a daemon to start, at a minimum that daemon must be specified in the daemons
file and its daemon-specific configuration file must be present, even if that file is empty.
For example, to start bgpd, the daemons file needs to be formatted as follows, at minimum:
cumulus@switch:~$ sudo cat /etc/quagga/daemons
zebra=yes
bgpd=yes
The current configuration can be saved by running:
quagga# write mem
Building Configuration...
Integrated configuration saved to /etc/quagga/Quagga.conf
[OK]
You can use write file instead of write mem.
When the integrated configuration mode is disabled, the output looks like this:
quagga# write mem
Building Configuration...
Configuration saved to /etc/quagga/zebra.conf
Configuration saved to /etc/quagga/bgpd.conf
[OK]
The daemons file is not written using the write mem command.
Interface IP Addresses
Quagga inherits the IP addresses for the network interfaces from the /etc/network/interfaces file.
This is the recommended way to define the addresses. For more information, see Configuring IP
Addresses (see page 133).
Using the vtysh Modal CLI
Quagga provides a CLI – vtysh – for configuring and displaying the state of the protocols. It is invoked
by running:
cumulus@switch:~$ sudo vtysh
Hello, this is Quagga (version 0.99.21).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
quagga#
Launching vtysh brings you into zebra initially. From here, you can log into other protocol daemons,
such as bgpd, ospfd or babeld.
vtysh provides a Cisco-like modal CLI, and many of the commands are similar to Cisco IOS commands.
By modal CLI, we mean that there are different modes to the CLI, and certain commands are only
available within a specific mode. Configuration is available with the configure terminal command,
which is invoked thus:
quagga# configure terminal
quagga(config)#
The prompt displays the mode the CLI is in. For example, when the interface-specific commands are
invoked, the prompt changes to:
quagga(config)# interface swp1
quagga(config-if)#
When the routing protocol specific commands are invoked, the prompt changes to:
quagga(config)# router ospf
quagga(config-router)#
At any level, “?” displays the list of available top-level commands at that level:
quagga(config-if)# ?
  babel        Babel interface commands
  bandwidth    Set bandwidth informational parameter
  description  Interface specific description
  end          End current mode and change to enable mode
  exit         Exit current mode and down to previous mode
  ip           Interface Internet Protocol config commands
  ipv6         Interface IPv6 config commands
  isis         IS-IS commands
  link-detect  Enable link detection on interface
  list         Print command list
  mpls-te      MPLS-TE specific commands
  multicast    Set multicast flag to interface
  no           Negate a command or set its defaults
  ospf         OSPF interface commands
  quit         Exit current mode and down to previous mode
  shutdown     Shutdown the selected interface
?-based completion is also available to see the parameters that a command takes:
quagga(config-if)# bandwidth ?
  <1-10000000>  Bandwidth in kilobits
quagga(config-if)# ip ?
  address  Set the IP address of an interface
  irdp     Alter ICMP Router discovery preference this interface
  ospf     OSPF interface commands
  rip      Routing Information Protocol
  router   IP router interface commands
Displaying state can be done at any level, including the top level. For example, to see the routing table
as seen by zebra, you use:
quagga# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel,
> - selected route, * - FIB route
K>* 0.0.0.0/0 via 192.168.0.2, eth0
C>* 192.0.2.11/24 is directly connected, swp1
C>* 192.0.2.12/24 is directly connected, swp2
B>* 203.0.113.30/24 [200/0] via 192.0.2.2, swp1, 10:43:05
B>* 203.0.113.31/24 [200/0] via 192.0.2.2, swp1, 10:43:05
B>* 203.0.113.32/24 [200/0] via 192.0.2.2, swp1, 10:43:05
C>* 127.0.0.0/8 is directly connected, lo
C>* 192.168.0.0/24 is directly connected, eth0
To run the same command at a config level, you prepend do to it:
quagga(config-router)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel,
> - selected route, * - FIB route
K>* 0.0.0.0/0 via 192.168.0.2, eth0
C>* 192.0.2.11/24 is directly connected, swp1
C>* 192.0.2.12/24 is directly connected, swp2
B>* 203.0.113.30/24 [200/0] via 192.0.2.2, swp1, 10:43:05
B>* 203.0.113.31/24 [200/0] via 192.0.2.2, swp1, 10:43:05
B>* 203.0.113.32/24 [200/0] via 192.0.2.2, swp1, 10:43:05
C>* 127.0.0.0/8 is directly connected, lo
C>* 192.168.0.0/24 is directly connected, eth0
Running single commands with vtysh is possible using the -c option of vtysh:
cumulus@switch:~$ sudo vtysh -c 'sh ip route'
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel,
> - selected route, * - FIB route
K>* 0.0.0.0/0 via 192.168.0.2, eth0
C>* 192.0.2.11/24 is directly connected, swp1
C>* 192.0.2.12/24 is directly connected, swp2
B>* 203.0.113.30/24 [200/0] via 192.0.2.2, swp1, 11:05:10
B>* 203.0.113.31/24 [200/0] via 192.0.2.2, swp1, 11:05:10
B>* 203.0.113.32/24 [200/0] via 192.0.2.2, swp1, 11:05:10
C>* 127.0.0.0/8 is directly connected, lo
C>* 192.168.0.0/24 is directly connected, eth0
Running a command multiple levels down is done thus:
cumulus@switch:~$ sudo vtysh -c 'configure terminal' -c 'router ospf' -c 'area 0.0.0.1 range 10.10.10.0/24'
Notice that the commands also take a partial command name (for example, sh ip route above) as
long as the partial command name is not ambiguous:
cumulus@switch:~$ sudo vtysh -c 'sh ip r'
% Ambiguous command.
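When a script has to run a command several levels down, it helps to build the chain of -c options from a list of commands instead of writing the quoting by hand. The Python sketch below is illustrative only; the function name vtysh_argv is hypothetical. It builds the argument list and prints the equivalent shell command without running anything.

```python
import shlex


def vtysh_argv(commands):
    """Build a vtysh argument list with one -c option per command."""
    argv = ["vtysh"]
    for command in commands:
        argv += ["-c", command]
    return argv


argv = vtysh_argv(["configure terminal",
                   "router ospf",
                   "area 0.0.0.1 range 10.10.10.0/24"])
# shlex.quote adds shell quoting only where an argument needs it.
print(" ".join(shlex.quote(arg) for arg in argv))
```

The resulting list can be passed to subprocess.run, prefixed with sudo where required, so that no shell quoting is involved at all.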
A command or feature can be disabled by prepending the command with no. For example:
quagga(config-router)# no area 0.0.0.1 range 10.10.10.0/24
The current state of the configuration can be viewed via:
quagga# show running-config
Building configuration...
Current configuration:
!
hostname quagga
log file /media/node/zebra.log
log file /media/node/bgpd.log
log timestamp precision 6
!
service integrated-vtysh-config
!
password xxxxxx
enable password xxxxxx
!
interface eth0
 ipv6 nd suppress-ra
 link-detect
!
interface lo
 link-detect
!
interface swp1
 ipv6 nd suppress-ra
 link-detect
!
interface swp2
 ipv6 nd suppress-ra
 link-detect
!
router bgp 65000
 bgp router-id 0.0.0.9
 bgp log-neighbor-changes
 bgp scan-time 20
 network 29.0.1.0/24
 timers bgp 30 90
 neighbor tier-2 peer-group
 neighbor 192.0.2.2 remote-as 65000
 neighbor 192.0.2.2 ttl-security hops 1
 neighbor 192.0.2.2 advertisement-interval 30
 neighbor 192.0.2.2 timers 30 90
 neighbor 192.0.2.2 timers connect 30
 neighbor 192.0.2.2 next-hop-self
 neighbor 192.0.2.12 remote-as 65000
 neighbor 192.0.2.12 next-hop-self
 neighbor 203.0.113.1 remote-as 65000
!
ip forwarding
ipv6 forwarding
!
line vty
 exec-timeout 0 0
!
end
Using the Cumulus Linux Non-Modal CLI
The vtysh modal CLI can be difficult to work with and even more difficult to script. As an alternative,
Cumulus Linux contains a non-modal version of these commands, structured similarly to the Linux
ip command. The available commands are:
Command    Description
cl-bgp     BGP (see page 297) commands. See man cl-bgp for details.
cl-ospf    OSPFv2 (see page 285) commands. For example:
           cumulus@switch:~$ sudo cl-ospf area 0.0.0.1 range 10.10.10.0/24
cl-ospf6   OSPFv3 (see page 294) commands.
cl-ra      Route advertisement commands. See man cl-ra for details.
cl-rctl    Zebra and non-routing protocol-specific commands. See man cl-rctl for details.
Comparing vtysh and Cumulus Linux Commands
This section describes how you can use the various Cumulus Linux CLI commands to configure Quagga,
without using vtysh.
Creating a New Neighbor
To create a new neighbor under Quagga, you would run:
quagga(config)# router bgp 65002
quagga(config-router)# neighbor 14.0.0.22 remote-as 65007
To create a new neighbor with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-bgp as 65002 neighbor add 14.0.0.22 remote-as 65007
Redistributing Routing Information
To redistribute routing information from static route entries into BGP under Quagga, you would
run:
quagga(config)# router bgp 65002
quagga(config-router)# redistribute static
To redistribute routing information from static route entries into BGP with the Cumulus Linux CLI,
run:
cumulus@switch:~$ sudo cl-bgp as 65002 redistribute add static
Defining a Static Route
To define a static route under Quagga, you would run:
quagga(config)# ip route 155.1.2.20/24 br2 45
To define a static route with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-rctl ip route add 175.0.0.0/28 interface br1 distance 25
Configuring an IPv6 Interface
To configure an IPv6 address under Quagga, you would run:
quagga(config)# int br3
quagga(config-if)# ipv6 address 3002:2123:1234:1abc::21/64
To configure an IPv6 address with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-rctl interface add swp3 ipv6 address 3002:2123:abcd:2120::41/64
Enabling PTM
To enable topology checking (PTM) under Quagga, you would run:
quagga(config)# ptm-enable
To enable topology checking (PTM) with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-rctl ptm-enable set
Configuring MTU in IPv6 Neighbor Discovery
To configure MTU in IPv6 neighbor discovery for an interface under Quagga, you would run:
quagga(config)# int swp3
quagga(config-if)# ipv6 nd mtu 9000
To configure MTU in IPv6 neighbor discovery for an interface with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-ra interface swp3 set mtu 9000
Logging OSPF Adjacency Changes
To log OSPF adjacency changes under Quagga, you would run:
quagga(config)# router ospf
quagga(config-router)# router-id 2.0.0.21
quagga(config-router)# log-adjacency-changes
To log OSPF adjacency changes with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-ospf log-adjacency-changes set
cumulus@switch:~$ sudo cl-ospf router-id set 3.0.0.21
Setting OSPF Interface Priority
To set the OSPF interface priority under Quagga, you would run:
quagga(config)# int swp3
quagga(config-if)# ip ospf priority 120
To set the OSPF interface priority with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-ospf interface set swp3 priority 120
Configuring Timing for OSPF SPF Calculations
To configure timing for OSPF SPF calculations under Quagga, you would run:
quagga(config)# router ospf6
quagga(config-ospf6)# timer throttle spf 40 50 60
To configure timing for OSPF SPF calculations with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-ospf6 timer add throttle spf 40 50 60
Configuring Hello Packet Intervals
To configure the OSPF Hello packet interval in number of seconds for an interface under Quagga, you
would run:
quagga(config)# int swp4
quagga(config-if)# ipv6 ospf6 hello-interval 60
To configure the OSPF Hello packet interval in number of seconds for an interface with the Cumulus
Linux CLI, run:
cumulus@switch:~$ sudo cl-ospf6 interface set swp4 hello-interval 60
Displaying OSPF Debugging Status
To display OSPF debugging status under Quagga, you would run:
quagga# show debugging ospf
To display OSPF debugging status with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-ospf debug show
Displaying BGP Information
To display BGP information under Quagga, you would run:
quagga# show ip bgp summary
To display BGP information with the Cumulus Linux CLI, run:
cumulus@switch:~$ sudo cl-bgp summary
Useful Links
http://www.nongnu.org/quagga/docs/docs-info.html#BGP
http://www.nongnu.org/quagga/docs/docs-info.html#IPv6-Support
http://www.nongnu.org/quagga/docs/docs-info.html#Zebra
Open Shortest Path First - OSPF - Protocol
OSPFv2 is a link-state routing protocol for IPv4. OSPF maintains the view of the network topology
conceptually as a directed graph. Each router represents a vertex in the graph. Each link between
neighboring routers represents a unidirectional edge. Each link has an associated weight (called cost)
that is either automatically derived from its bandwidth or administratively assigned. Using the weighted
topology graph, each router computes a shortest path tree (SPT) with itself as the root, and applies the
results to build its forwarding table. The computation is generally referred to as SPF computation and
the resultant tree as the SPF tree.
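The SPF computation described above can be sketched with Dijkstra's algorithm in Python. This is an illustration only, not the Quagga ospfd code; the four-router topology, its costs, and the spf() helper are hypothetical.

```python
import heapq

def spf(graph, root):
    """Dijkstra's algorithm: return {node: (cost, parent)} for the tree rooted at root."""
    tree = {}
    heap = [(0, root, None)]             # (cost so far, node, parent in the tree)
    while heap:
        cost, node, parent = heapq.heappop(heap)
        if node in tree:                 # already reached over a cheaper path
            continue
        tree[node] = (cost, parent)
        for neighbor, link_cost in graph.get(node, {}).items():
            if neighbor not in tree:
                heapq.heappush(heap, (cost + link_cost, neighbor, node))
    return tree

# Hypothetical topology: each direction of a link is its own weighted edge.
GRAPH = {
    "R1": {"R2": 10, "R3": 10},
    "R2": {"R1": 10, "R4": 10},
    "R3": {"R1": 10, "R4": 30},
    "R4": {"R2": 10, "R3": 30},
}

for node, (cost, parent) in sorted(spf(GRAPH, "R1").items()):
    print(node, cost, parent)
```

Each router runs the same computation with itself as the root, so every router derives its own tree from the same weighted graph.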
An LSA (link-state advertisement) is the fundamental quantum of information that OSPF routers
exchange with each other. It seeds the graph building process on the node and triggers SPF
computation. LSAs originated by a node are distributed to all the other nodes in the network through a
mechanism called flooding. Flooding is done hop-by-hop. OSPF ensures reliability by using link-state
acknowledgement packets. The set of LSAs in a router’s memory is termed the link-state database (LSDB), a
representation of the network graph. Thus, OSPF ensures a consistent view of the LSDB on each node in
the network in a distributed fashion (eventual consistency model); this is key to the protocol’s
correctness.
Contents
Scalability and Areas (see page 285)
Configuring OSPFv2 (see page 286)
Activating the OSPF Daemon (see page 286)
Enabling OSPF (see page 287)
Defining (Custom) OSPF Parameters on the Interfaces (see page 289)
Scaling Tip: Summarization (see page 289)
Scaling Tip: Stub Areas (see page 290)
Configuration Tip: Unnumbered Interfaces (see page 291)
ECMP (see page 292)
Topology Changes and OSPF Reconvergence (see page 292)
Example Configurations (see page 292)
Debugging OSPF (see page 293)
Configuration Files (see page 294)
Supported RFCs (see page 294)
Useful Links (see page 294)
Scalability and Areas
An increase in the number of nodes affects OSPF scalability in the following ways:
Memory footprint to hold the entire network topology,
Flooding performance,
SPF computation efficiency.
The OSPF protocol advocates hierarchy as a divide and conquer approach to achieve high scale. The
topology may be divided into areas, resulting in a two-level hierarchy. Area 0 (or 0.0.0.0), called the
backbone area, is the top level of the hierarchy. Packets traveling from one non-zero area to another
must go via the backbone area. As an example, the leaf-spine topology we have been referring to in the
routing section can be divided into areas as follows:
Here are some points to note about areas and OSPF behavior:
Routers that have links to multiple areas are called area border routers (ABR). For example,
routers R3, R4, R5, R6 are ABRs in the diagram. An ABR performs a set of specialized tasks, such
as SPF computation per area and summarization of routes across areas.
Most of the LSAs have an area-level flooding scope. These include router LSA, network LSA, and
summary LSA.
In the diagram, we reused the same non-zero area address. This is fine since the area address
is only a scoping parameter provided to all routers within that area. It has no meaning outside
the area. Thus, in the cases where ABRs do not connect to multiple non-zero areas, the same
area address can be reused, which avoids the operational overhead of devising unique area
addresses.
Configuring OSPFv2
Configuring OSPF involves the following tasks:
Activating the OSPF daemon
Enabling OSPF
Defining (Custom) OSPF parameters on the interfaces
Activating the OSPF Daemon
1. Add the following to /etc/quagga/daemons:
ospfd=yes
2. Restart the quagga service to start the new daemons:
cumulus@switch:~$ sudo service quagga restart
Enabling OSPF
As we discussed in Introduction to Routing Protocols (see page 267), there are three steps to the
configuration:
1. Identifying the router with the router ID.
2. With whom should the router communicate?
3. What information (most notably the prefix reachability) to advertise?
There are two ways to achieve (2) and (3) in the Quagga OSPF:
1. Explicitly enable OSPF for each interface by configuring it under the interface configuration mode
(recommended):
R3# configure terminal
R3(config)# interface swp1
R3(config-if)# ip ospf area 0.0.0.0
Adding the ip ospf statement as shown above after the interface command both attempts to
bring up an adjacency with the peer across the interface and advertises the
prefixes assigned to that interface. If OSPF adjacency bringup is not desired, configure the
corresponding interfaces as passive. For example, in a data center topology, the host-facing
interfaces need not run OSPF; however, the corresponding IP addresses should still be advertised
to neighbors. You can do this using the passive-interface construct.
From the vtysh/quagga CLI:
R3# configure terminal
R3(config)# router ospf
R3(config-router)# passive-interface swp10
R3(config-router)# passive-interface swp11
Or use the passive-interface default command to make all interfaces passive, then
selectively remove certain interfaces to bring up protocol adjacency:
R3# configure terminal
R3(config)# router ospf
R3(config-router)# passive-interface default
R3(config-router)# no passive-interface swp1
This method simplifies the configuration by removing all mention of IP addresses from the
router configuration, enabling the same configuration to be applicable across devices of the
same function (such as leaf or spine), thus making it far easier to automate.
You must use this method with unnumbered interfaces, as discussed below (see page 291). It is
also recommended instead of using the network keyword discussed in option 2 below.
However, if you want to achieve step 3 (advertising) only, the quagga configuration provides
another method: redistribution. For example:
R3# configure terminal
R3(config)# router ospf
R3(config-router)# redistribute connected
Redistribution, however, unnecessarily loads the database with type-5 LSAs and should be
limited to generating real external prefixes (for example, prefixes learned from BGP). In general,
it is a good practice to generate local prefixes using network and/or passive-interface
statements.
2. The network statement under router ospf achieves both (2) and (3). The statement is specified with an IP
subnet prefix and an area address. All the interfaces on the router whose IP address matches
the network subnet are put into the specified area. The OSPF process starts bringing up peering
adjacencies on those interfaces. It also advertises the interface IP addresses, formatted into LSAs
(of various types), to the neighbors for proper reachability.
From the Cumulus Linux shell:
cumulus@switch:~$ sudo vtysh
Hello, this is Quagga (version 0.99.21).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
R3# configure terminal
R3(config)# router ospf
R3(config-router)# router-id 0.0.0.1
R3(config-router)# log-adjacency-changes detail
R3(config-router)# network 10.0.0.0/16 area 0.0.0.0
R3(config-router)# network 192.0.2.0/16 area 0.0.0.1
R3(config-router)#
Or through cl-ospf, from the Cumulus Linux shell:
cumulus@switch:~$ sudo cl-ospf router set id 0.0.0.1
cumulus@switch:~$ sudo cl-ospf router set log-adjacency-changes detail
cumulus@switch:~$ sudo cl-ospf router set network 10.0.0.0/16 area 0.0.0.0
cumulus@switch:~$ sudo cl-ospf router set network 192.0.2.0/16 area 0.0.0.1
The subnet in the network statement can be as coarse as needed to cover the largest number of
interfaces on the router that should run OSPF.
If you do not require OSPF adjacency, you should configure the corresponding interfaces as
passive, as described in the previous option.
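The way a network statement selects interfaces by subnet can be sketched in Python with the standard ipaddress module. This is an illustration, not the ospfd matching code: the interface addresses are hypothetical, and picking the most specific statement when several match is an assumption of this sketch.

```python
import ipaddress

# The two network statements from the example above, as (prefix, area) pairs.
STATEMENTS = [("10.0.0.0/16", "0.0.0.0"), ("192.0.2.0/16", "0.0.0.1")]

def area_for(interface_ip, statements=STATEMENTS):
    """Return the area of the most specific statement containing interface_ip, or None."""
    addr = ipaddress.ip_address(interface_ip)
    best = None
    for prefix, area in statements:
        # strict=False masks off host bits, so 192.0.2.0/16 is read as 192.0.0.0/16.
        net = ipaddress.ip_network(prefix, strict=False)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, area)
    return best[1] if best else None

# Hypothetical interface addresses.
for name, ip in [("swp1", "10.0.1.1"), ("swp2", "192.0.2.5"), ("eth0", "172.16.0.5")]:
    print(name, area_for(ip))
```

An interface whose address falls outside every statement, such as eth0 here, is not placed in any area and does not run OSPF.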
Defining (Custom) OSPF Parameters on the Interfaces
1. Network type, such as point-to-point, broadcast.
2. Timer tuning, like hello interval.
3. For unnumbered interfaces (see below), enable OSPF.
Using Quagga's vtysh:
R3(config)# interface swp1
R3(config-if)# ospf network point-to-point
R3(config-if)# ospf hello-interval 5
Or through cl-ospf, from the Cumulus Linux shell:
cumulus@switch:~$ sudo cl-ospf interface swp1 set network point-to-point
cumulus@switch:~$ sudo cl-ospf interface swp1 set hello-interval 5
The OSPF configuration is saved in /etc/quagga/ospfd.conf.
Scaling Tip: Summarization
By default, an ABR creates a summary (type-3) LSA for each route in an area and advertises it in
adjacent areas. Prefix range configuration optimizes this behavior by creating and advertising one
summary LSA for multiple routes.
To configure a range:
R3(config)# router ospf
R3(config-router)# area 0.0.0.1 range 30.0.0.0/8
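The effect of a range on what the ABR advertises can be sketched in Python. This is a simplified model rather than ospfd behavior in full: the route list is hypothetical, and the summary_prefixes() helper ignores metrics and LSA details.

```python
import ipaddress

def summary_prefixes(routes, ranges):
    """Return the prefixes advertised: one per matching range, plus uncovered routes."""
    configured = [ipaddress.ip_network(r) for r in ranges]
    advertised = set()
    for route in routes:
        net = ipaddress.ip_network(route)
        covering = [r for r in configured if net.subnet_of(r)]
        # A covered route is replaced by its range; others are advertised as-is.
        advertised.add(covering[0] if covering else net)
    return sorted(advertised)

# Hypothetical intra-area routes in area 0.0.0.1.
ROUTES = ["30.1.1.0/24", "30.1.2.0/24", "30.200.0.0/16", "40.0.0.0/24"]

for prefix in summary_prefixes(ROUTES, ["30.0.0.0/8"]):
    print(prefix)
```

In this sketch, three routes inside 30.0.0.0/8 collapse into a single advertised prefix, while 40.0.0.0/24 is still advertised individually.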
Summarize toward the backbone. The backbone receives the summarized routes and
injects them, already summarized, into the other areas.
Summarization can cause non-optimal forwarding of packets during failures. Here is an
example scenario:
As shown in the diagram, the ABRs in the right non-zero area summarize the host prefixes as
10.1.0.0/16. When the link between R5 and R10 fails, R5 advertises a worse metric for the summary
route. The metric of a summary route is the maximum of the metrics of the intra-area routes it
covers, and after the R5-R10 link fails, the metric for 10.1.2.0/24 increases at R5 because the path
becomes R5-R9-R6-R10. As a result, other backbone routers shift traffic destined to 10.1.0.0/16 towards R6.
This breaks ECMP and under-utilizes network capacity for traffic destined to 10.1.1.0/24.
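The arithmetic behind this scenario can be sketched in Python. The cost values below are hypothetical (the diagram's actual link costs are not given here); only the rule that the summary metric is the maximum of the covered intra-area metrics comes from the text above.

```python
def summary_metric(intra_area_costs):
    """Metric of a summary route: the maximum metric among the routes it covers."""
    return max(intra_area_costs.values())

def best_abrs(summary_by_abr):
    """Return the ABRs advertising the lowest summary metric (the ECMP set)."""
    best = min(summary_by_abr.values())
    return sorted(abr for abr, metric in summary_by_abr.items() if metric == best)

# Hypothetical intra-area costs as seen from each ABR.
r6_costs = {"10.1.1.0/24": 20, "10.1.2.0/24": 20}
r5_before = {"10.1.1.0/24": 20, "10.1.2.0/24": 20}
r5_after = {"10.1.1.0/24": 20, "10.1.2.0/24": 40}   # R5-R10 link has failed

print(best_abrs({"R5": summary_metric(r5_before), "R6": summary_metric(r6_costs)}))
print(best_abrs({"R5": summary_metric(r5_after), "R6": summary_metric(r6_costs)}))
```

After the failure, R5 still reaches 10.1.1.0/24 at the original cost, but because only the summary is visible to the backbone, R5 drops out of the equal-cost set for the whole 10.1.0.0/16 range.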
Scaling Tip: Stub Areas
Nodes in an area receive and store intra-area routing information and summarized information about
other areas from the ABRs. In particular, summarizing inter-area routes well through prefix range
configuration helps the routers scale and keeps the network stable.
Then there are external routes. External routes are the routes redistributed into OSPF from another
protocol. They have an AS-wide flooding scope. In many cases, external link states make up a large
percentage of the LSDB.
Stub areas alleviate this scaling problem. A stub area is an area that does not receive external route
advertisements.
To configure a stub area:
R3(config)# router ospf
R3(config-router)# area 0.0.0.1 stub
Stub areas still receive information about networks that belong to other areas of the same OSPF
domain. In particular, if summarization is not configured (or is not comprehensive), the information can
be overwhelming for the nodes. Totally stubby areas address this issue. Routers in totally stubby areas
keep in their LSDB only information about routing within their area, plus the default route.
To configure a totally stubby area:
R3(config)# router ospf
R3(config-router)# area 0.0.0.1 stub no-summary
Here is a brief tabular summary of the area type differences:
Type                    Behavior
Normal non-zero area    LSA types 1, 2, 3, 4 area-scoped, type 5 externals, inter-area routes summarized
Stub area               LSA types 1, 2, 3, 4 area-scoped, no type 5 externals, inter-area routes summarized
Totally stubby area     LSA types 1, 2 area-scoped, default summary, no type 3, 4, 5 LSA types allowed
Configuration Tip: Unnumbered Interfaces
Unnumbered interfaces are interfaces without unique IP addresses. In OSPFv2, configuring
unnumbered interfaces reduces the links between routers into pure topological elements, and thus
dramatically simplifies network configuration and reconfiguration. In addition, the routing database
contains only the real networks, so the memory footprint is reduced and SPF is faster.
Unnumbered is usable for point-to-point interfaces only.
If there is a network <network number>/<mask> area <area ID> command present in the
Quagga configuration, the ip ospf area <area ID> command is rejected with the error “Please
remove network command first.” This prevents you from configuring other areas on some of
the unnumbered interfaces. You can use either the network area command or the ip ospf area
command in the configuration, but not both.
Unless the Ethernet media is intended to be used as a LAN with multiple connected routers,
we recommend configuring the interface as point-to-point. It has the additional advantage of
a simplified adjacency state machine; there is no need for DR/BDR election and LSA reflection.
See RFC5309 for a more detailed discussion.
To configure an unnumbered interface, take the IP address of another interface (called the anchor) and
use that as the IP address of the unnumbered interface:
cumulus@switch:~$ sudo ifconfig lo 192.0.2.1/24
cumulus@switch:~$ sudo ifconfig swp1 192.0.2.1/24
cumulus@switch:~$ sudo ifconfig swp2 192.0.2.1/24
To enable OSPF on an unnumbered interface from within Quagga's vtysh:
R3(config)# interface swp1
R3(config-if)# ip ospf area 0.0.0.1
ECMP
During SPF computation for an area, if OSPF finds multiple paths with equal cost (metric), all those
paths are used for forwarding. For example, in the reference topology diagram, R8 uses both R3 and R4
as next hops to reach a destination attached to R9.
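How equal-cost paths yield several next hops can be sketched by extending Dijkstra's algorithm to remember every first hop that lies on a shortest path. This is an illustration, not the ospfd implementation; the links and costs are a simplified, hypothetical reading of the R8/R3/R4/R9 example, and all link costs are assumed to be positive.

```python
import heapq

def ecmp_next_hops(graph, root):
    """Return {destination: sorted first hops on all equal-cost shortest paths}."""
    dist = {root: 0}
    hops = {root: set()}
    done = set()
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for nbr, link_cost in graph[node].items():
            new_cost = cost + link_cost
            # Leaving the root, the first hop is the neighbor itself.
            via = {nbr} if node == root else hops[node]
            if nbr not in dist or new_cost < dist[nbr]:
                dist[nbr] = new_cost
                hops[nbr] = set(via)
                heapq.heappush(heap, (new_cost, nbr))
            elif new_cost == dist[nbr]:
                hops[nbr] |= via         # another path with the same cost
    return {n: sorted(h) for n, h in hops.items() if n != root}

# Hypothetical costs: R8 reaches R9 through either R3 or R4.
GRAPH = {
    "R8": {"R3": 10, "R4": 10},
    "R3": {"R8": 10, "R9": 10},
    "R4": {"R8": 10, "R9": 10},
    "R9": {"R3": 10, "R4": 10},
}

print(ecmp_next_hops(GRAPH, "R8")["R9"])
```

Raising the cost of one of the two paths in GRAPH leaves a single next hop for R9, which is the mechanism used when costing out a link.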
Topology Changes and OSPF Reconvergence
Topology changes usually occur due to one of four events:
1. Maintenance of a router node
2. Maintenance of a link
3. Failure of a router node
4. Failure of a link
For the maintenance events, operators typically raise the OSPF administrative weight of the link(s) to
ensure that all traffic is diverted from the link or the node (referred to as costing out). The speed of
reconvergence does not matter. Indeed, changing the OSPF cost causes LSAs to be reissued, but the
links remain in service during the SPF computation process of all routers in the network.
For the failure events, traffic may be lost during reconvergence; that is, until SPF on all nodes computes
an alternative path around the failed link or node to each of the destinations. The reconvergence time
depends on layer 1 failure detection capabilities and, in the worst case, on the OSPF DeadInterval timer.
Example Configurations
Example configuration for event 1, using vtysh:
R3(config)# router ospf
R3(config-router)# max-metric router-lsa administrative
Or, with the non-modal shell command approach:
cumulus@switch:~$ sudo cl-ospf router set max-metric router-lsa administrative
Example configuration for event 2, using vtysh:
R3(config)# interface swp1
R3(config-if)# ospf cost 65535
Or, with the non-modal shell command approach:
cumulus@switch:~$ sudo cl-ospf interface swp1 set cost 65535
Debugging OSPF
OperState lists all the commands to view the operational state of OSPF.
The three most important states while troubleshooting the protocol are:
1. Neighbors, with show ip ospf neighbor. This is the starting point to debug neighbor states
(also see tcpdump below).
2. Database, with show ip ospf database. This is the starting point to verify that the LSDB is, in
fact, synchronized across all routers in the network. For example, sweeping through the output
of show ip ospf database router taken from all routers in an area shows whether the
topology graph building process is complete; that is, whether every node has seen all the other nodes in
the area.
3. Routes, with show ip ospf route. This is the outcome of SPF computation that gets
downloaded to the forwarding table, and is the starting point to debug, for example, why an
OSPF route is not being forwarded correctly.
Compare the route output with kernel by using show ip route | grep zebra and
with the hardware entries using cl-route-check -V.
Using cl-ospf:
cumulus@switch:~$ sudo cl-ospf neighbor show [all | detail]
cumulus@switch:~$ sudo cl-ospf database show [asbr-summary | network |
opaque-area |
opaque-link | summary | external |
nssa-external | opaque-as | router]
cumulus@switch:~$ sudo cl-ospf route show
Debugging-OSPF lists all of the OSPF debug options.
Using cl-ospf:
Usage: cl-ospf debug { COMMAND | help }
COMMANDs
{ set | clear } (all | event | ism | ism [OBJECT] | lsa | lsa
[OBJECT] |
nsm | nsm [OBJECT] | nssa | packet | packet [OBJECT] |
zebra [OBJECT] | zebra all)
Using zebra under vtysh:
cumulus@switch:~$ sudo vtysh
R3# show [zebra]
IOBJECT := { events | status | timers }
OOBJECT := { interface | redistribute }
POBJECT := { all | dd | hello | ls-ack | ls-request | ls-update }
ZOBJECT := { all | events | kernel | packet | rib |
Using tcpdump to capture OSPF packets:
cumulus@switch:~$ sudo tcpdump -v -i swp1 ip proto ospf
Configuration Files
/etc/quagga/daemons
/etc/quagga/ospfd.conf
Supported RFCs
RFC2328
RFC3137
RFC5309
Useful Links
http://en.wikipedia.org/wiki/Open_Shortest_Path_First
http://www.nongnu.org/quagga/docs/docs-info.html#OSPFv2
Perlman, Radia (1999). Interconnections: Bridges, Routers, Switches, and Internetworking
Protocols (2 ed.). Addison-Wesley.
Moy, John T. OSPF: Anatomy of an Internet Routing Protocol. Addison-Wesley.
Open Shortest Path First v3 - OSPFv3 - Protocol
OSPFv3 is a revised version of OSPFv2 to support the IPv6 address family. Refer to Open Shortest Path
First (OSPF) Protocol (see page 285) for a discussion on the basic concepts, which remain the same
between the two versions.
OSPFv3 has changed the formatting in some of the packets and LSAs either as a necessity to support
IPv6 or to improve the protocol behavior based on OSPFv2 experience. Most notably, v3 defines a new
LSA, called intra-area prefix LSA to separate out the advertisement of stub networks attached to a
router from the router LSA. It is a clear separation of node topology from prefix reachability and lends
itself well to an optimized SPF computation.
IETF has defined extensions to OSPFv3 to support multiple address families (that is, both IPv6
and IPv4). Quagga (see page 271) does not support it yet.
Contents
Configuring OSPFv3 (see page 295)
Unnumbered Interfaces (see page 297)
Debugging OSPF (see page 297)
Configuration Files (see page 297)
Supported RFCs (see page 297)
Useful Links (see page 297)
Configuring OSPFv3
Configuring OSPFv3 involves the following tasks:
1. Activating the OSPF6 daemon:
a. Add the following line to /etc/quagga/daemons:
ospf6d=yes
b. Restart the quagga service to start the new daemons:
cumulus@switch:~$ sudo service quagga restart
2. Enabling OSPF6 and mapping interfaces to areas. From Quagga’s vtysh shell:
cumulus@switch:~$ sudo vtysh
Hello, this is Quagga (version 0.99.21).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
R3# configure terminal
R3(config)# router ospf6
R3(config-router)# router-id 0.0.0.1
R3(config-router)# log-adjacency-changes detail
R3(config-router)# interface swp1 area 0.0.0.0
R3(config-router)# interface swp2 area 0.0.0.1
R3(config-router)#
Or through cl-ospf6, from the Cumulus Linux shell:
cumulus@switch:~$ sudo cl-ospf6 router set id 0.0.0.1
cumulus@switch:~$ sudo cl-ospf6 router set log-adjacency-changes detail
cumulus@switch:~$ sudo cl-ospf6 interface swp1 set area 0.0.0.0
cumulus@switch:~$ sudo cl-ospf6 interface swp2 set area 0.0.0.1
3. Defining (custom) OSPF6 parameters on the interfaces:
a. Network type (such as point-to-point, broadcast)
b. Timer tuning (for example, hello interval)
Using Quagga’s vtysh:
R3(config)# interface swp1
R3(config-if)# ipv6 ospf6 network point-to-point
R3(config-if)# ipv6 ospf6 hello-interval 5
Or through cl-ospf6, from the Cumulus Linux shell:
cumulus@switch:~$ sudo cl-ospf6 interface swp1 set network point-to-point
cumulus@switch:~$ sudo cl-ospf6 interface swp1 set hello-interval 5
The OSPFv3 configuration is saved in /etc/quagga/ospf6d.conf.
Unnumbered Interfaces
Unlike OSPFv2, OSPFv3 intrinsically supports unnumbered interfaces. Forwarding to the next hop
router is done entirely using IPv6 link local addresses. Therefore, you are not required to configure any
global IPv6 address to interfaces between routers.
Debugging OSPF
See Debugging OSPF (see page 293) for OSPFv2 for the troubleshooting discussion. The equivalent
commands are:
cumulus@switch:~$ sudo vtysh
R3# show ipv6 ospf6 neighbor
R3# show ipv6 ospf6 database [detail | dump | internal |
as-external | group-membership |
inter-prefix | inter-router |
intra-prefix | link | network |
router | type-7 | * | adv-router |
linkstate-id | self-originated]
R3# show ipv6 ospf6 route
Another helpful command is show ipv6 ospf6 [area <id>] spf tree. It dumps the node
topology as computed by SPF to help visualize the network view.
Configuration Files
/etc/quagga/daemons
/etc/quagga/ospf6d.conf
Supported RFCs
RFC5340
RFC3137
Useful Links
http://en.wikipedia.org/wiki/Open_Shortest_Path_First
http://www.nongnu.org/quagga/docs/docs-info.html#OSPFv3
Configuring Border Gateway Protocol - BGP
BGP is the routing protocol that runs the Internet. It is an increasingly popular protocol for use in the
data center as it lends itself well to the rich interconnections in a Clos topology. Specifically:
Unlike OSPF, it does not require routing state to be refreshed periodically.
It is less chatty than its link-state siblings. For example, a link or node transition can result in a
bestpath change, causing BGP to send updates.
It is multi-protocol and extensible.
There are many robust vendor implementations.
The protocol is very mature and comes with many years of operational experience.
This IETF draft provides further details of the use of BGP within the data center.
Contents
Commands (see page 299)
Autonomous System Number (ASN) (see page 299)
eBGP and iBGP (see page 299)
Route Reflectors (see page 300)
ECMP with BGP (see page 300)
BGP for both IPv4 and IPv6 (see page 300)
Fast Convergence Design Considerations (see page 300)
Configuring BGP (see page 301)
Specifying the Interface Name in the neighbor Command (see page 303)
Troubleshooting Link-local Addresses (see page 303)
Configuration Tips (see page 305)
Using peer-group to Simplify Configuration (see page 305)
Preserving the AS_PATH Setting (see page 305)
Troubleshooting (see page 305)
Debugging Tip: Logging Neighbor State Changes (see page 308)
Enabling Read-only Mode (see page 309)
Applying a Route Map for Route Updates (see page 309)
Protocol Tuning (see page 310)
Converging Quickly On Link Failures (see page 310)
Converging Quickly On Soft Failures (see page 310)
Reconnecting Quickly (see page 311)
Advertisement Interval (see page 311)
Configuration Files (see page 312)
Useful Links (see page 312)
Caveats and Errata (see page 313)
ttl-security Issue (see page 313)
Commands
Cumulus Linux:
bgp
vtysh
Quagga:
bgp
neighbor
router
show
Autonomous System Number (ASN)
One of the key concepts in BGP is an autonomous system number or ASN. An autonomous system is
defined as a set of routers under a common administration. Since BGP was originally designed to peer
between independently managed enterprises and/or service providers, each such enterprise is treated
as an autonomous system, responsible for a set of network addresses. Each such autonomous system
is given a unique number called its ASN. ASNs are handed out by a central authority (ICANN). However,
ASNs between 64512 and 65534 are reserved for private use. Using BGP within the data center relies
on either using this number space or else using the single ASN you own.
The ASN is central to how BGP builds a forwarding topology. A BGP route advertisement carries with it
not only the originator’s ASN, but also the list of ASNs that this route advertisement has passed
through. When forwarding a route advertisement, a BGP speaker adds itself to this list. This list of ASNs
is called the AS path. BGP uses the AS path to detect and avoid loops.
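The AS path mechanism can be sketched in a few lines of Python. This is a simplified model, not bgpd code: the accept() and advertise() helpers and the ASNs are hypothetical, and prepending is shown for advertisements to an eBGP peer.

```python
def accept(local_asn, as_path):
    """Loop check: reject a route whose AS path already contains the local ASN."""
    return local_asn not in as_path

def advertise(local_asn, as_path):
    """Return the AS path sent to an eBGP peer: the local ASN is prepended."""
    return [local_asn] + list(as_path)

# Hypothetical route originated by AS 65001 and passed on by AS 65002.
path = advertise(65001, [])
path = advertise(65002, path)
print(path)

print(accept(65003, path))   # a new AS has not seen this route yet
print(accept(65001, path))   # the originator sees its own ASN and drops the route
```

Because every speaker that forwards the route leaves its ASN in the list, a route that comes back to an AS it already traversed is recognized and discarded.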
ASNs were originally 16-bit numbers, but were later modified to be 32-bit. Quagga supports both 16-bit
and 32-bit ASNs, but most implementations still run with 16-bit ASNs.
Private ASNs for 32-bit ASNs are a work in progress at the time of this writing.
eBGP and iBGP
When BGP is used to peer between autonomous systems, the peering is referred to as external BGP or
eBGP. When BGP is used within an autonomous system, the peering used is referred to as internal BGP
or iBGP. eBGP peers have different ASNs while iBGP peers have the same ASN.
While the heart of the protocol is the same when used as eBGP or iBGP, there is a key difference in the
protocol behavior between use as eBGP and iBGP: an iBGP node does not forward routing information
learned from one iBGP peer to another iBGP peer. It expects the originating iBGP peer to send this
information to all iBGP peers.
This implies that iBGP peers are all connected to each other. In a large network, this requirement can
quickly become unscalable. The most popular method to avoid this problem is to introduce a route
reflector.
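The scaling difference can be made concrete with a short calculation in Python. The switch counts are hypothetical, and the route reflector count assumes every client peers with every reflector and the reflectors are fully meshed among themselves.

```python
def full_mesh_sessions(nodes):
    """iBGP sessions needed when every node peers with every other node."""
    return nodes * (nodes - 1) // 2

def route_reflector_sessions(clients, reflectors):
    """Sessions when each client peers with each reflector (assumed design)."""
    return clients * reflectors + full_mesh_sessions(reflectors)

# Hypothetical fabric: 32 leaves acting as clients, 4 spines acting as reflectors.
print(full_mesh_sessions(32 + 4))
print(route_reflector_sessions(32, 4))
```

The full mesh grows quadratically with the number of nodes, while the route reflector design grows roughly linearly with the number of clients.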
Route Reflectors
Route reflectors are quite easy to understand in a Clos topology. In a two-tier Clos network, the leaf (or
tier 1) switches are the only ones connected to end stations. Consequently, the spines
themselves do not have any routes to announce. They merely reflect the routes announced by
one leaf to the other leaves. Thus, the spine switches function as route reflectors while the leaf
switches serve as route reflector clients.
In a three-tier network, the tier 2 nodes (or mid-tier spines) act as both route reflector servers and
route reflector clients. They act as route reflectors because they announce the routes learned from the
tier 1 nodes to other tier 1 nodes and to tier 3 nodes. They also act as route reflector clients to the tier
3 nodes, receiving routes learned from other tier 2 nodes. Tier 3 nodes act only as route reflectors.
In the following illustration, tier 2 node 2.1 is acting as a route reflector server, announcing the routes
between tier 1 nodes 1.1 and 1.2 to tier 1 node 1.3. It is also a route reflector client, learning the routes
between tier 2 nodes 2.2 and 2.3 from the tier 3 node, 3.1.
ECMP with BGP
If a BGP node hears a prefix p from multiple peers, it has all the information necessary to program the
routing table to forward traffic for that prefix p through all of these peers. Thus, BGP supports equal-cost multipathing.
BGP for both IPv4 and IPv6
Unlike OSPF, which has separate versions of the protocol to announce IPv4 and IPv6 routes, BGP is a
multi-protocol routing engine, capable of announcing both IPv4 and IPv6 prefixes. It supports
announcing IPv4 prefixes over an IPv4 session and IPv6 prefixes over an IPv6 session. It also supports
announcing prefixes of both these address families over a single IPv4 session or over a single IPv6
session.
Fast Convergence Design Considerations
Without getting into the why (see the IETF draft cited in Useful Links below that talks about BGP use
within the data center), we strongly recommend the following use of addresses in the design of a BGP-based data center network:
Use of interface addresses: Set up BGP sessions only using interface-scoped addresses. This
allows BGP to react quickly to link failures.
Use of next-hop-self: Every BGP node says that it knows how to forward traffic to the prefixes it
is announcing. This reduces the requirement to announce interface-specific addresses and
thereby reduces the size of the forwarding table.
Configuring BGP
1. Activate the BGP daemon:
Add the following line to /etc/quagga/daemons:
bgpd = yes
Touch an empty bgpd configuration file:
cumulus@switch:~$ sudo touch /etc/quagga/bgpd.conf
A slightly more useful configuration file would contain the following lines:
hostname R7
password *****
enable password *****
log timestamp precision 6
log file /var/log/quagga/bgpd.log
!
line vty
exec-timeout 0 0
!
The most important information here is the specification of the location of the log file,
where the BGP process can log debugging and other useful information. A common
convention is to store the log files under /var/log/quagga.
You must restart quagga when a new daemon is enabled:
cumulus@switch:~$ sudo service quagga restart
2. Identify the BGP node by assigning an ASN and router-id:
cumulus@switch:~$ sudo vtysh
Hello, this is Quagga (version 0.99.21).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
R7# configure terminal
R7(config)# router bgp 65000
R7(config-router)# bgp router-id 0.0.0.1
3. Specify to whom it must disseminate routing information:
R7(config-router)# neighbor 10.0.0.2 remote-as 65001
If it is an iBGP session, the remote-as is the same as the local AS:
R7(config-router)# neighbor 10.0.0.2 remote-as 65000
Specifying the peer’s IP address allows BGP to set up a TCP socket with this peer, but it doesn’t
distribute any prefixes to it, unless it is explicitly told that it must via the activate command:
R7(config-router)# address-family ipv4 unicast
R7(config-router-af)# neighbor 10.0.0.2 activate
R7(config-router-af)# exit
R7(config-router)# address-family ipv6
R7(config-router-af)# neighbor 2002:0a00:0002::0a00:0002 activate
R7(config-router-af)# exit
As you can see, activate has to be specified for each address family that is being announced by
the BGP session.
4. Specify some properties of the BGP session:
R7(config-router)# neighbor 10.0.0.2 next-hop-self
R7(config-router)# address-family ipv4 unicast
R7(config-router-af)# maximum-paths 64
For iBGP, the maximum-paths is selected by typing:
R7(config-router-af)# maximum-paths ibgp 64
If this is a route-reflector client, it can be specified as follows:
R3(config-router-af)# neighbor 10.0.0.1 route-reflector-client
It is node R3, the route reflector, on which the peer is specified as a client.
5. Specify what prefixes to originate:
R7(config-router)# address-family ipv4 unicast
R7(config-router-af)# network 192.0.2.0/24
R7(config-router-af)# network 203.0.113.1/24
Specifying the Interface Name in the neighbor Command
When you are configuring BGP for the neighbors of a given interface, you can specify the interface
name instead of its IP address. All the other neighbor command options remain the same.
This is equivalent to BGP peering to the link-local IPv6 address of the neighbor on the given interface.
The link-local address is learned via IPv6 neighbor discovery router advertisements.
Consider the following example configuration:
router bgp 65000
bgp router-id 0.0.0.1
neighbor swp1 interface
neighbor swp1 remote-as 65000
neighbor swp1 next-hop-self
!
address-family ipv6
neighbor swp1 activate
exit-address-family
Make sure that IPv6 neighbor discovery router advertisements are supported and not
suppressed. In Quagga, you can check this in the running configuration. Under the
interface configuration, use no ipv6 nd suppress-ra to remove router advertisement suppression.
We recommend that you set the router advertisement interval to a shorter value (ipv6 nd
ra-interval <>) to handle cases where a node comes up and misses the router advertisement
that relays the neighbor’s link-local address to BGP.
Troubleshooting Link-local Addresses
To verify that quagga learned the neighboring link-local IPv6 address via the IPv6 neighbor discovery
router advertisements on a given interface, use the show interface <if-name> command. If ipv6
nd suppress-ra isn’t enabled on both ends of the interface, then Neighbor address(s): should
have the other end’s link-local address. That is the address that BGP would use when BGP is enabled
on that interface.
Use vtysh to access the Quagga CLI, then verify the configuration:
cumulus@switch:~$ sudo vtysh
Hello, this is Quagga (version 0.99.21).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
R7# show interface swp1
Interface swp1 is up, line protocol is up
PTM status: disabled
Description: rut
index 3 metric 1 mtu 1500
flags: <UP,BROADCAST,RUNNING,MULTICAST>
HWaddr: 00:02:00:00:00:09
inet 11.0.0.1/24 broadcast 11.0.0.255
inet6 fe80::202:ff:fe00:9/64
ND advertised reachable time is 0 milliseconds
ND advertised retransmit interval is 0 milliseconds
ND router advertisements are sent every 600 seconds
ND router advertisements lifetime tracks ra-interval
ND router advertisement default router preference is medium
Hosts use stateless autoconfig for addresses.
Neighbor address(s):
inet6 fe80::4638:39ff:fe00:129b/128
Instead of the IPv6 address, the peering interface name is displayed in the show ip bgp summary
command and wherever else applicable:
R7# show ip bgp summary
BGP router identifier 0.0.0.1, local AS number 65000
RIB entries 1, using 112 bytes of memory
Peers 1, using 8712 bytes of memory
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
swp1            4 65000     161     170        0    0    0 00:02:28        0
Most of the show commands can take the interface name instead of the IP address, if that level of
specificity is needed:
R7# show ip bgp neighbors
  <cr>
  A.B.C.D   Neighbor to display information about
  WORD      Neighbor on bgp configured interface
  X:X::X:X  Neighbor to display information about
R7# show ip bgp neighbors swp1
Configuration Tips
Using peer-group to Simplify Configuration
When there are many peers to connect to, the amount of redundant configuration becomes
overwhelming. For example, repeating the activate and next-hop-self commands for even 60
neighbors makes for a very long configuration file. Using peer-group addresses this problem.
Instead of specifying properties of each individual peer, Quagga allows for defining one or more
peer-groups and associating all the attributes common to that peer session with a peer-group.
After doing this, the only task is to associate an IP address with a peer-group. Here is an example of
defining and using peer-groups:
R7(config-router)# neighbor tier-2 peer-group
R7(config-router)# neighbor tier-2 remote-as 65000
R7(config-router)# address-family ipv4 unicast
R7(config-router-af)# neighbor tier-2 activate
R7(config-router-af)# neighbor tier-2 next-hop-self
R7(config-router-af)# maximum-paths ibgp 64
R7(config-router-af)# exit
R7(config-router)# neighbor 10.0.0.2 peer-group tier-2
R7(config-router)# neighbor 192.0.2.2 peer-group tier-2
If you’re using eBGP, besides specifying the neighbor’s IP address, you also have to specify the
neighbor’s ASN, since it is different for each neighbor. In such a case, you wouldn’t specify the
remote-as for the peer-group.
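When the neighbor list is long, the peer-group lines for the iBGP case can be generated rather than typed. The following Python sketch is one possible way to do that; the function name, the indentation, and the ordering of the emitted lines are assumptions made for illustration, and the output should be compared against show running-config before it is used.

```python
def peer_group_config(local_as, group, neighbors, max_paths=64):
    """Build Quagga-style BGP lines that attach every neighbor to one peer-group.

    Only commands shown in this chapter are emitted. The layout is an
    assumption of this sketch, not output captured from a switch.
    """
    lines = [
        f"router bgp {local_as}",
        f" neighbor {group} peer-group",
        # iBGP: the remote AS equals the local AS, so it can live on the group.
        f" neighbor {group} remote-as {local_as}",
    ]
    # One short line per neighbor instead of repeating every attribute.
    lines += [f" neighbor {ip} peer-group {group}" for ip in neighbors]
    lines += [
        " address-family ipv4 unicast",
        f"  neighbor {group} activate",
        f"  neighbor {group} next-hop-self",
        f"  maximum-paths ibgp {max_paths}",
        " exit-address-family",
    ]
    return "\n".join(lines)


config_text = peer_group_config(65000, "tier-2", ["10.0.0.2", "192.0.2.2"])
print(config_text)
```

Adding a neighbor then costs one extra list element, while the shared attributes stay in a single place.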
Preserving the AS_PATH Setting
If you plan to use multipathing with the multipath-relax option, Quagga generates an AS_SET in
place of the current AS_PATH for the bestpath. This helps to prevent loops but is unusual behavior. To
preserve the AS_PATH setting, use the no-as-set option when configuring bestpath:
R7(config-router)# bgp bestpath as-path multipath-relax no-as-set
Troubleshooting
The most common starting point for troubleshooting BGP is to view the summary of the neighbors
the switch is connected to, along with some information about these connections. Sample output of
this command follows:
R7# show ip bgp summary
BGP router identifier 0.0.0.9, local AS number 65000
RIB entries 7, using 672 bytes of memory
Peers 2, using 9120 bytes of memory
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.2        4 65000      11      10        0    0    0 00:06:38        3
192.0.2.2       4 65000      11      10        0    0    0 00:06:38        3
Total number of neighbors 2
(Pop quiz: Are these iBGP or eBGP sessions? Hint: Look at the ASNs.)
It is also useful to view the routing table as defined by BGP:
R7# show ip bgp
BGP table version is 0, local router ID is 0.0.0.9
Status codes: s suppressed, d damped, h history, * valid, > best, i internal,
r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network          Next Hop            Metric LocPrf Weight Path
*> 192.0.2.29/24    0.0.0.0                  0         32768 i
*>i192.0.2.30/24    10.0.0.2                 0    100      0 i
* i                 192.0.2.2                0    100      0 i
*>i192.0.2.31/24    10.0.0.2                 0    100      0 i
* i                 192.0.2.2                0    100      0 i
*>i192.0.2.32/24    10.0.0.2                 0    100      0 i
* i                 192.0.2.2                0    100      0 i
Total number of prefixes 4
A more detailed breakdown of a specific neighbor can be obtained using show ip bgp neighbor
<neighbor ip address>:
R7# show ip bgp neighbor 10.0.0.2
BGP neighbor is 10.0.0.2, remote AS 65000, local AS 65000, internal link
BGP version 4, remote router ID 0.0.0.5
BGP state = Established, up for 00:14:03
Last read 14:52:31, hold time is 180, keepalive interval is 60 seconds
Neighbor capabilities:
4 Byte AS: advertised and received
Route refresh: advertised and received(old & new)
Address family IPv4 Unicast: advertised and received
Message statistics:
  Inq depth is 0
  Outq depth is 0
                       Sent       Rcvd
  Opens:                  1          1
  Notifications:          0          0
  Updates:                1          3
  Keepalives:            16         15
  Route Refresh:          0          0
  Capability:             0          0
  Total:                 18         19
Minimum time between advertisement runs is 5 seconds
For address family: IPv4 Unicast
NEXT_HOP is always this router
Community attribute sent to this neighbor(both)
3 accepted prefixes
Connections established 1; dropped 0
Last reset never
Local host: 10.0.0.1, Local port: 35258
Foreign host: 10.0.0.2, Foreign port: 179
Nexthop: 10.0.0.1
Nexthop global: fe80::202:ff:fe00:19
Nexthop local: ::
BGP connection: non shared network
Read thread: on
Write thread: off
To see the details of a specific route such as from whom it was received, to whom it was sent, and so
forth, use the show ip bgp <ip address/prefix> command:
R7# show ip bgp 192.0.2.0
BGP routing table entry for 192.0.2.0/24
Paths: (2 available, best #1, table Default-IP-Routing-Table)
Not advertised to any peer
Local
10.0.0.2 (metric 1) from 10.0.0.2 (0.0.0.10)
Origin IGP, metric 0, localpref 100, valid, internal, best
Originator: 0.0.0.10, Cluster list: 0.0.0.5
Last update: Mon Jul 8 10:12:17 2013
Local
192.0.2.2 (metric 1) from 192.0.2.2 (0.0.0.10)
Origin IGP, metric 0, localpref 100, valid, internal
Originator: 0.0.0.10, Cluster list: 0.0.0.6
Last update: Mon Jul 8 10:12:17 2013
This shows that the routing table prefix seen by BGP is 192.0.2.0/24, that this route was not advertised
to any neighbor, and that it was heard from two neighbors, 10.0.0.2 and 192.0.2.2.
Here is another output of the same command, on a different node in the network:
cumulus@switch:~$ sudo vtysh -c 'sh ip bgp 192.0.2.0'
BGP routing table entry for 192.0.2.0/24
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Advertised to non peer-group peers:
10.0.0.1 192.0.2.21 192.0.2.22
Local, (Received from a RR-client)
203.0.113.1 (metric 1) from 203.0.113.1 (0.0.0.10)
Origin IGP, metric 0, localpref 100, valid, internal, best
Last update: Mon Jul 8 09:07:41 2013
Debugging Tip: Logging Neighbor State Changes
It is very useful to log the changes that a neighbor goes through to troubleshoot any issues associated
with that neighbor. This is done using the log-neighbor-changes command:
R7(config-router)# bgp log-neighbor-changes
The output is sent to the specified log file, usually /var/log/quagga/bgpd.log, and looks like this:
2013/07/08 10:12:06.572827 BGP: %NOTIFICATION: sent to neighbor 10.0.0.2 6/3 (Cease/Peer Unconfigured) 0 bytes
2013/07/08 10:12:06.572954 BGP: Notification sent to neighbor 10.0.0.2: type 6/3
2013/07/08 10:12:16.682071 BGP: %ADJCHANGE: neighbor 192.0.2.2 Up
2013/07/08 10:12:16.682660 BGP: %ADJCHANGE: neighbor 10.0.0.2 Up
Enabling Read-only Mode
You can enable read-only mode, which takes effect when the BGP process restarts or is cleared
using clear ip bgp *. When enabled, read-only mode begins as soon as the first peer reaches the
established state, at which point a timer of <max-delay> seconds is started.
While in read-only mode, BGP doesn’t run best-path or generate any updates to its peers. This mode
continues until:
All the configured peers, except the shutdown peers, have sent an explicit EOR (End-Of-RIB) or
an implicit EOR. The first keep-alive after BGP has reached the established state is considered an
implicit EOR. If the <establish-wait> option is specified, then BGP will wait for peers to reach
the established state from the start of the update-delay until the <establish-wait> period
is over; that is, the minimum set of established peers for which EOR is expected would be peers
established during the establish-wait window, not necessarily all the configured neighbors.
The max-delay period is over.
Upon reaching either of these two conditions, BGP resumes the decision process and generates
updates to its peers.
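The two exit conditions can be summarized in a few lines of Python. This is only a sketch of the logic described above, not bgpd's implementation: the function name and the dictionary keys are invented for the example, and the <establish-wait> refinement is deliberately left out.

```python
def read_only_mode_over(elapsed_seconds, max_delay, peers):
    """Return True once BGP may resume best-path selection and updates.

    peers is a list of dicts with boolean keys 'shutdown' and
    'eor_received' (explicit or implicit End-Of-RIB). Both keys are
    hypothetical names used only in this sketch.
    """
    # Condition 2: the max-delay timer has expired. With the default of 0
    # this is immediately true, so the feature is effectively off.
    if elapsed_seconds >= max_delay:
        return True
    # Condition 1: every configured peer that is not shut down has sent EOR.
    active = [p for p in peers if not p["shutdown"]]
    return all(p["eor_received"] for p in active)


peers = [
    {"shutdown": False, "eor_received": True},
    {"shutdown": False, "eor_received": False},
    {"shutdown": True, "eor_received": False},
]
still_waiting = not read_only_mode_over(10, 300, peers)
```

In this example BGP is still waiting, because one active peer has not yet sent its EOR and the 300-second timer has not expired.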
To enable read-only mode:
cumulus@switch:$ sudo bgp update-delay <max-delay in seconds> [<establish-wait in seconds>]
The default <max-delay> is 0; that is, the feature is off by default.
Use output from show ip bgp summary for information about the state of the update delay.
This feature can be useful in reducing CPU/network usage as BGP restarts/clears. It’s particularly useful
in topologies where BGP learns a prefix from many peers. Intermediate best paths are possible for the
same prefix as peers get established and start receiving updates at different times. This feature is also
valuable if the network has a high number of such prefixes.
Applying a Route Map for Route Updates
You can apply a route map on route updates from BGP to Zebra. All the applicable match operations
are allowed, such as match on prefix, next-hop, communities, and so forth. Set operations for this
attach-point are limited to metric and next-hop only. Operations performed by this feature do not
affect BGP’s internal RIB.
Both IPv4 and IPv6 address families are supported. Route maps work on multi-paths as well. However,
the metric setting is based on the best path only.
To apply a route map for route updates:
cumulus@switch:$ sudo cl-bgp table-map <route-map-name>
Protocol Tuning
Converging Quickly On Link Failures
In the Clos topology, we recommend that you only use interface addresses to set up peering sessions.
This means that when the link fails, the BGP session is torn down immediately, triggering route updates
to propagate through the network quickly. This requires the following commands be enabled for all
links: link-detect and ttl-security hops <hops>. ttl-security hops specifies how many
hops away the neighbor is. For example, in a Clos topology, every peer is at most 1 hop away.
See Caveats and Errata below for information regarding ttl-security hops.
Here is an example:
cumulus@switch:~$ sudo vtysh
Hello, this is Quagga (version 0.99.21).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
R7# configure terminal
R7(config)# interface swp1
R7(config-if)# link-detect
R7(config-if)# exit
R7(config)# router bgp 65000
R7(config-router)# neighbor 10.0.0.2 ttl-security hops 1
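The effect of the hops value can be worked out with a little arithmetic. The Python sketch below assumes that ttl-security follows the Generalized TTL Security Mechanism (RFC 5082), in which the sender transmits with a TTL of 255 and the receiver accepts only packets whose TTL is at least 255 - hops + 1. The function names are illustrative only.

```python
MAX_TTL = 255  # highest value the 8-bit IP TTL field can hold


def min_accepted_ttl(hops):
    """Lowest TTL a packet may arrive with from a peer at most `hops` away."""
    if not 1 <= hops <= 254:
        raise ValueError("hops must be between 1 and 254")
    return MAX_TTL - hops + 1


def accept(received_ttl, hops):
    """Return True if a packet with this TTL passes the check."""
    return received_ttl >= min_accepted_ttl(hops)


# Directly connected peers in a Clos topology: hops is 1, so only
# packets that arrive with a TTL of 255 are accepted.
direct_peer_threshold = min_accepted_ttl(1)
```

With hops set to 1, a packet that has crossed even one router (arriving with a TTL of 254) is rejected.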
Converging Quickly On Soft Failures
It is possible that the link is up, but the neighboring BGP process is hung or has crashed. If a BGP
process crashes or hangs, Quagga’s watchquagga daemon, which monitors the various Quagga
daemons, attempts to restart it. BGP
itself has a keepalive timer that is exchanged between neighbors. By default, this keepalive timer is set
to 60 seconds. This time can be reduced, but doing so increases
the CPU load, especially when there are many neighbors. keepalive-time is the interval at
which the keepalive message is sent. hold-time is how long BGP waits without hearing from the
neighbor before it considers the connection invalid. It is usually set to 3 times the keepalive time.
Here is an example of reducing these timers:
R7(config-router)# neighbor 10.0.0.2 timers 30 90
We can make these the default for all BGP neighbors using a different command:
R7(config-router)# timers bgp 30 90
The following display snippet shows that the default values have been modified for this neighbor:
R7(config-router)# do show ip bgp neighbor 10.0.0.2
BGP neighbor is 10.0.0.2, remote AS 65000, local AS 65000, internal link
BGP version 4, remote router ID 0.0.0.5
BGP state = Established, up for 05:53:59
Last read 14:53:25, hold time is 180, keepalive interval is 60 seconds
Configured hold time is 90, keepalive interval is 30 seconds
....
When you’re in a configuration mode, such as when you’re configuring BGP parameters, you
can run any show command by adding do to the original command. For example, do show
ip bgp neighbor was shown above. Under a non-configuration mode, you’d simply run:
show ip bgp neighbor 10.0.0.2
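The relationship between the two timers can be sketched as follows. The 3x rule comes from the text above. The negotiation rule (a session uses the smaller of the two peers' hold times) comes from the BGP specification, RFC 4271, rather than from this guide. The function names are made up for the example.

```python
def hold_time_for(keepalive_seconds, multiplier=3):
    """Hold time derived from a keepalive interval using the usual 3x rule."""
    if keepalive_seconds <= 0:
        raise ValueError("keepalive must be positive")
    return keepalive_seconds * multiplier


def negotiated_hold_time(local_hold, peer_hold):
    """Per RFC 4271, a session uses the smaller of the two proposed hold times."""
    return min(local_hold, peer_hold)


default_hold = hold_time_for(60)   # default keepalive of 60 seconds
reduced_hold = hold_time_for(30)   # the reduced timers from the example
```

So a keepalive of 30 seconds pairs with a hold time of 90 seconds, matching the neighbor 10.0.0.2 timers 30 90 example.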
Reconnecting Quickly
A BGP process attempts to connect to a peer after a failure (or on startup) every connect-time
seconds. By default, this is 120 seconds. To modify this value, use:
R7(config-router)# neighbor 10.0.0.2 timers connect 30
This command has to be specified for each neighbor; peer-groups don’t support this option in Quagga.
Advertisement Interval
BGP by default chooses stability over fast convergence. This is very useful when routing for the
Internet. For example, unlike link-state protocols, BGP typically waits for a duration of
advertisement-interval seconds between sending consecutive updates to a neighbor. This helps ensure
that route flaps from an unstable neighbor are not propagated throughout the network. By default,
this is set to 30 seconds for an eBGP session and 5 seconds for an iBGP session. For very fast
convergence, set the timer to 0 seconds. You can modify this as follows:
R7(config-router)# neighbor 10.0.0.2 advertisement-interval 0
The following output shows the modified value:
R7(config-router)# do show ip bgp neighbor 10.0.0.2
BGP neighbor is 10.0.0.2, remote AS 65000, local AS 65000, internal link
BGP version 4, remote router ID 0.0.0.5
BGP state = Established, up for 06:01:49
Last read 14:53:15, hold time is 180, keepalive interval is 60 seconds
Configured hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
4 Byte AS: advertised and received
Route refresh: advertised and received(old & new)
Address family IPv4 Unicast: advertised and received
Message statistics:
  Inq depth is 0
  Outq depth is 0
                       Sent       Rcvd
  Opens:                  1          1
  Notifications:          0          0
  Updates:                1          3
  Keepalives:           363        362
  Route Refresh:          0          0
  Capability:             0          0
  Total:                365        366
Minimum time between advertisement runs is 0 seconds
....
This command is not supported with peer-groups.
See this IETF draft for more details on the use of this value.
Configuration Files
/etc/quagga/daemons
/etc/quagga/bgpd.conf
Useful Links
Wikipedia entry for BGP (includes list of useful RFCs)
Quagga online documentation for BGP (may not be up to date)
IETF draft discussing BGP use within data centers
Caveats and Errata
ttl-security Issue
Enabling ttl-security does not cause the hardware to be programmed with the relevant
information. This means that frames will come up to the CPU and be dropped there. It is
recommended that you use the cl-acltool command to explicitly add the relevant entry to
hardware.
For example, you can configure a file, like /etc/cumulus/acl/policy.d/01control_plane_bgp.rules,
with a rule like this for TTL:
INGRESS_INTF = swp1
INGRESS_CHAIN = INPUT, FORWARD
[iptables]
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --dport bgp -m ttl --ttl 255 -j POLICE --set-mode pkt --set-rate 2000 --set-burst 1000
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --dport bgp -j DROP
For more information about ACLs and cl-acltool, see Netfilter (ACLs) (see page 114).
This issue will be fixed in a future release of Cumulus Linux.
Hardware ECMP Hashing
Cumulus Linux uses hardware ECMP hashing (equal-cost multi-path routing). Hardware ECMP
automatically happens when multipath routes are added. You don’t have to do anything to enable this.
Contents
Contents (see page 313)
Understanding ECMP Hashing (see page 314)
Commands (see page 314)
Using cl-ecmpcalc (see page 314)
Caveats and Errata (see page 315)
Resilient Hashing (see page 315)
How Resilient Hashing Works (see page 315)
Configuring Resilient Hashing (see page 315)
Caveats (see page 316)
Useful Links (see page 316)
Understanding ECMP Hashing
Cumulus Linux hashes on the following fields:
Layer 3 protocol
Source interface
Destination IP address (IPv4 and IPv6)
Source IP address (IPv4 and IPv6)
For TCP/UDP frames, Cumulus Linux also hashes on:
Source port
Destination port
If the data in any of those fields varies, the switch may route the packet out of a different port in the ECMP
group.
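To make the idea concrete, the following Python sketch hashes the fields listed above and uses the result to pick a member of an ECMP group. It is illustrative only: the switch hardware uses its own hash functions and seed, whereas this example substitutes CRC-32 from the Python standard library, and the function name and field encoding are assumptions.

```python
import zlib


def pick_member(members, proto, in_intf, src_ip, dst_ip, sport=None, dport=None):
    """Choose an egress port for a flow by hashing its header fields.

    CRC-32 stands in for the hardware hash, so the port chosen here will
    generally not match what a real switch selects.
    """
    # Concatenate the hashed fields. Ports are None for non-TCP/UDP frames.
    key = "|".join(str(f) for f in (proto, in_intf, src_ip, dst_ip, sport, dport))
    # Reduce the hash to an index into the ECMP group.
    return members[zlib.crc32(key.encode("ascii")) % len(members)]


group = ["swp3", "swp4", "swp5", "swp6"]
first = pick_member(group, "tcp", "swp1", "10.0.0.1", "10.0.0.2", 20000, 80)
again = pick_member(group, "tcp", "swp1", "10.0.0.1", "10.0.0.2", 20000, 80)
```

Every packet of a flow carries the same field values, so the flow always maps to the same port. A second flow that differs in any field hashes independently and may or may not land on a different port.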
Commands
cl-ecmpcalc
Using cl-ecmpcalc
cl-ecmpcalc is a program to predict the egress port for a given IP frame. It operates in one of two
modes:
Online mode: Used when you run cl-ecmpcalc on a switch with switchd running. In
this mode, cl-ecmpcalc can query the hardware for the hashing configuration and hashing
algorithm. It can also look up the ECMP route for a given destination address to determine the
ECMP member count. This allows cl-ecmpcalc to give an answer in terms of the
swp interface.
Offline mode: Used when you run cl-ecmpcalc off of the switch, or with switchd
stopped. In offline mode, you have to specify the full hash configuration (which fields to hash
on, algorithm to use, ECMP group count) and all the required frame field content (IP addresses,
port numbers, and so forth). In this mode, since the configuration/frame information is
hypothetical, cl-ecmpcalc gives an answer in terms of ecmp group offset, which is the
index into an ECMP group table. This mode is useful for debugging and for testing new hash
configurations.
For example:
cumulus@switch:~$ sudo cl-ecmpcalc -p tcp -i swp1 -s 10.0.0.1 --sport 20000 -d 10.0.0.1 --dport 80
ecmpcalc: will query hardware
swp3
Example with a missing destination port:
cumulus@switch:~$ sudo cl-ecmpcalc -p tcp -i swp1 -s 10.0.0.1 --sport 20000 -d 10.0.0.1
ecmpcalc: will query hardware
usage: cl-ecmpcalc [-h] [-v] [-p PROTOCOL] [-s SRC] [--sport SPORT] [-d DST]
[--dport DPORT] [--vid VID] [-i IN_INTERFACE]
[--sportid SPORTID] [--smodid SMODID] [-o OUT_INTERFACE]
[--dportid DPORTID] [--dmodid DMODID] [--hardware]
[--nohardware] [-hs HASHSEED]
[-hf HASHFIELDS [HASHFIELDS ...]]
[--hashfunction {crc16-ccitt,crc16-bisync}] [-e EGRESS]
[-c MCOUNT]
cl-ecmpcalc: error: --sport and --dport required for TCP and UDP frames
Caveats and Errata
cl-ecmpcalc can only take input interfaces that can be converted to a single physical port in the port
tab file, like the physical switch ports (swp). Virtual interfaces like bridges, bonds, and subinterfaces
aren’t supported.
Resilient Hashing
In the Cumulus Linux default ECMP hashing mechanism, when a member link in an ECMP group fails,
existing flows on the unaffected links could be moved to other links. This can lead to packets being
delivered to end systems out of order. When resilient hashing is enabled and a member link fails,
existing flows on unaffected links remain on those same links; only the flows on an affected link are
moved to the unaffected links.
When a new member is added to an ECMP group, some of the existing flows are moved to the new link
from an existing link. Resilient hashing ensures that there is minimal disruption in flows when adding
or deleting a link.
How Resilient Hashing Works
An available link list tracks the available physical links. A pointer tracks which available link is next
in the list. The forwarding table then determines if the egress interface is an aggregate link in the
ECMP group. A flow identifier is determined based on the fields listed above (see page 314). Resilient
hashing comes into play here: a flow that is already mapped stays on its original physical link. If the
flow is not already mapped, it is mapped to the link the pointer is currently referencing, and the pointer
advances to the next link in the available link list. You can read more about resilient hashing on its
patent page.
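The mechanism described above can be modeled in a few lines of Python. This is a toy model under stated assumptions, not the hardware implementation: the class and method names are invented, a dictionary stands in for the flowset table, and only link removal is handled.

```python
class ResilientEcmpGroup:
    """Toy model of the available link list, pointer, and flow mapping."""

    def __init__(self, links):
        self.links = list(links)   # available link list
        self.next_index = 0        # pointer to the next available link
        self.flow_to_link = {}     # stand-in for the flowset table

    def link_for(self, flow_id):
        """Return the link for a flow, mapping it first if it is new."""
        if not self.links:
            raise RuntimeError("no links available")
        link = self.flow_to_link.get(flow_id)
        if link is None:
            # New flow: use the link under the pointer, then advance it.
            link = self.links[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.links)
            self.flow_to_link[flow_id] = link
        return link

    def remove_link(self, failed):
        """Drop a failed link and unmap only the flows that were using it."""
        self.links.remove(failed)
        self.next_index = self.next_index % len(self.links) if self.links else 0
        for flow_id in [f for f, l in self.flow_to_link.items() if l == failed]:
            del self.flow_to_link[flow_id]


group = ResilientEcmpGroup(["swp1", "swp2", "swp3"])
before = {flow: group.link_for(flow) for flow in ("a", "b", "c", "d")}
group.remove_link("swp2")
after = {flow: group.link_for(flow) for flow in ("a", "b", "c", "d")}
```

After swp2 is removed, flows a, c, and d stay on the links they were using, and only flow b, which was on swp2, is mapped again.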
Configuring Resilient Hashing
By default, resilient hashing is disabled. To enable resilient hashing, set resilient_hash_enable =
TRUE in /etc/cumulus/datapath/traffic.conf.
When resilient hashing is enabled, you can create up to 512 ECMP groups. Each Trident II system has
32,768 flowset table entries, and 128 flowset table entries are reserved for each ECMP group by
default. To change the number of resilient hashing flowset entries per ECMP group, set
resilient_hash_entries_ecmp in traffic.conf to one of 64, 128, 256, 512, or 1024.
The resilient hashing configuration is read from traffic.conf when switchd starts. If you modify
traffic.conf, restart switchd for the changes to take effect:
cumulus@switch:~$ sudo service switchd restart
For example, to enable resilient hashing and to reserve 256 entries per ECMP group, set the following
config in /etc/cumulus/datapath/traffic.conf:
# Enable resilient hashing
resilient_hash_enable = TRUE
# Resilient hashing flowset entries per ECMP group
# Valid values - 64, 128, 256, 512, 1024
resilient_hash_entries_ecmp = 256
To verify that configuration has been set, run:
cumulus@switch:~$ sudo cl-cfg -a switchd | grep resilient
traffic.resilient_hash_entries_ecmp = 256
traffic.resilient_hash_enable = TRUE
Caveats
Resilient hashing is supported only:
On switches with Trident II chipsets.
With unicast traffic.
For ECMP groups.
For IPv4 routes.
Useful Links
http://en.wikipedia.org/wiki/Equal-cost_multi-path_routing
Index
/
/mnt/persist 97
4
40G ports 244
logical limitations 244
8
802.1p 245
class of service 245
802.3ad link aggregation 230
A
ABRs 286
area border routers 286
access control lists 114
access ports 178
ACL policy files 118
ACL rules 247
ACLs 114
active-active mode 252
VRR 252
active image slot 90
active listener ports 74
active-standby mode 252
VRR 252
Algorithm Longest Prefix Match 267
routing 267
ALPM mode 267
routing 267
alternate image slot 90, 93, 96, 97
accessing 96
installing a new image 93
selecting 97
AOC cables 11
apt-get 104
area border routers 286
ABRs 286
arp cache 84
ARP requests 253
VRR 253
AS_PATH setting 305
BGP 305
ASN 299
autonomous system number 299
auto-negotiation 156, 243
switch ports 243
autonomous system number 299
BGP 299
autoprovision command 113
autoprovisioning 110
B
bestpath 305
BGP 305
BFD 144, 147
Bidirectional Forwarding Detection 144, 147
BGP 297, 300
Border Gateway Protocol 297
ECMP 300
Bidirectional Forwarding Detection 144
bonds 160, 230
LACP Bypass 230
boot recovery 37
bpdufilter 239
STP 239
BPDU guard 238
STP 238
brctl 13, 165, 166, 235, 256, 256
and STP 235
IGMP snooping 256
MLD snooping 256
bridges 163, 163, 164, 165, 166, 167, 167, 171, 173, 178, 178, 184, 195
access ports 178
adding interfaces 165, 166
adding IP addresses 171
IGMP snooping 195
MAC addresses 167
MTU 167
physical interfaces 164
trunk ports 178
untagged frames 173
VLAN-aware 163, 184
C
cable connectivity 11
cabling 144
Prescriptive Topology Manager 144
cl-acltool 85, 115, 247
CLAG 215, 225, 225, 225, 229, 252
and VRR 252
backup link 225
peer link states 225
PROTO_DOWN state 225
STP 229
clagctl 224
class of service 245
cl-bgp 280
cl-cfg 30, 56
cl-ecmpcalc 314
cl-img-clear-overlay 99, 99
cl-img-install 93
cl-img-pkg 102
cl-img-select 97, 98, 100, 100, 101
cl-license 10
cl-netstat 41
cl-ospf 280, 288
cl-ospf6 280, 295
Clos topology 270
cl-ra 280
cl-rctl 281
cl-resource-query 30, 42
cl-route-check 293
cl-support 33
convergence 269
routing 269
Cumulus Linux 6, 7, 93, 95, 99, 100, 100, 195, 209
installing 6, 93
reprovisioning 100
reserved VLAN ranges 195
reverting 99
uninstalling 100
upgrading 7, 95
VXLAN 209
cumulus user 18
D
DAC cables 11
daemons 74
datapath 245
datapath.conf 245
date 36
setting 36
deb 109
debugging 31
decode-syseeprom 45
differentiated services code point 245
dmidecode 46
dpkg 106
dpkg-reconfigure 35
DSCP 245
differentiated services code point 245
DSCP marking 247
dual-connected hosts 218
duplexing 243
switch ports 243
duplex interfaces 156
dynamic routing 149, 271
and PTM 149
quagga 271
E
eBGP 299
external BGP 299
ebtables 115, 118
memory spaces 118
ECMP 1, 271, 292, 300
BGP 300
equal cost multi-pathing 271
monitoring 1
OSPF 292
ECMP hashing 313, 315
resilient hashing 315
EGP 272
Exterior Gateway Protocol 272
equal cost multi-pathing 271
ECMP 271
equal-cost multi-path routing 313
ECMP hashing 313
ERSPAN 86
network troubleshooting 86
Ethernet management port 8
ethtool 39, 242
switch ports 242
external BGP 299
eBGP 299
F
fast convergence 300
BGP 300
fast leave 260
IGMP/MLD snooping 260
First Hop Redundancy Protocol 252
VRR 252
G
ganging 240
switch ports 240
globs 135
Graphviz 144
H
hardware 44
monitoring 44
hardware compatibility list 6
hash distribution 162
HCL 6
high availability 215, 271
host entries 42
monitoring 42
Host HA 215
hostname 8
hsflowd 51
hwclock 36
I
iBGP 299
internal BGP 299
ifdown 127
ifplugd 253
VRR 253
ifquery 78, 129
ifup 127
ifupdown 126
ifupdown2 77, 77, 77, 126, 133, 156, 176
excluding interfaces 77
logging 77
purging IP addresses 133
troubleshooting 77
VLAN tagging 176
IGMP snooping 195, 227, 255
MLAG 227
VLAN-aware bridges 195
IGP 272
Interior Gateway Protocol 272
image contents 102
image slots 90, 91, 91, 92
PowerPC 91
resizing 92
x86 91
installing 6
Cumulus Linux 6
interface counters 41
interface dependencies 128
interfaces 154, 157, 158
addresses 157
statistics 158
interface states 155
internal BGP 299
iBGP 299
ip6tables 115
IP addresses 133
purging 133
iproute2 81, 185
failures 81
iptables 115
IPv4 routes 300
BGP 300
IPv6 routes 300
BGP 300
J
jdoo 47, 228
L
LACP 160, 216
CLAG 216
LACP Bypass 230
layer 3 access ports 13
configuring 13
LDAP 26
leaf-spine topology 270
license 9
installing 9
link aggregation 160
Link Layer Discovery Protocol 138
link-local IPv6 addresses 303
BGP 303
link pause 248
datapath 248
link-state advertisement 285
link state monitoring 253
VRR 253
LLDP 138
lldpcli 139
lldpd 138, 145
load balancing 271
logging 77, 77
ifupdown2 77
networking service 77
logging neighbor state changes 308
BGP 308
logical switch 215
longest prefix match 1
routing 1
loopback interface 14
configuring 14
LSA 285
link-state advertisement 285
LSDB 285
link-state database 285
lshw 46
M
MAC entries 42
monitoring 42
Mako templates 80, 135
debugging 80
mangle table 248
ACL rules 248
memory spaces 118
ebtables 118
MLAG 215, 227
IGMP snooping 227
MLD snooping 255
monitoring 31, 35, 39, 42, 50, 53
network traffic 50
mount points 103
mstpctl 180, 235
MTU 81, 157, 167
bridges 167
failures 81
Multi-Chassis Link Aggregation 215
CLAG 215
multiple bridges 168
mz 84
traffic generator 84
N
name service switch 25
Netfilter 114
Net-SNMP 47
networking service 77
logging 77
network interfaces 154
network traffic 50
monitoring 50
network virtualization 195, 196, 209
VMware NSX 196
no-as-set 305
BGP 305
nonatomic updates 117
switchd 117
non-blocking networks 270
NSS 25
name service switch 25
NTP 37
time 37
ntpd 37
O
ONIE 6, 101
rescue mode 101
Open Network Install Environment 6
Open Shortest Path First Protocol 285, 295
OSPFv2 285
OSPFv3 295
open source contributions 5
OSPF 289, 291, 292, 294
ECMP 292
reconvergence 292
summary LSA 289
supported RFCs 294
unnumbered interfaces 291
ospf6d.conf 296
OSPFv2 285
OSPFv3 295, 297,
supported RFCs
unnumbered interfaces 297
overlayfs file system 103
over-subscribed networks 270
P
packages 104
managing 104
packet buffering 245
datapath 245
packet filtering 116
packet queueing 245
datapath 245
packet scheduling 245
datapath 245
PAM 25
pluggable authentication modules 25
parent interfaces 131
password 18
default 18
passwordless access 18
passwords 7
peer-groups 305
BGP 305
persistent configurations 97
Per VLAN Spanning Tree 235
PVST 235
ping 83
pluggable authentication modules 25
policy.conf 121
port lists 135
ports.conf 240, 241
port speed 243
switch ports 243
port speeds 156, 240
Prescriptive Topology Manager 144
primary image slot 90
priority groups 245
datapath 245
privileged commands 19
PROTO_DOWN state 225
CLAG 225
protocol tuning 268, 310
BGP 310
routing 268
PTM 144
Prescriptive Topology Manager 144
ptmctl 150
ptmd 144
PTM scripts 148
public community 49
PVRST 235
Rapid PVST 235
PVST 235
Per VLAN Spanning Tree 235
Q
QSFP 42
Quagga 149, 265, 271, 273
and PTM 149
configuring 273
dynamic routing 271
static routing 265
quality of service 250
querier 259
IGMP/MLD snooping 259
R
Rapid PVST 235
PVRST 235
read-only mode 309
BGP 309
recommended configuration 98
reconvergence 292
OSPF 292
remote access 17
repositories 108
other packages 108
rescue mode 101
reserved VLAN ranges 195
resilient hashing 315
restart 30
switchd 30
root user 7, 18
route advertisements 299
BGP 299
route entries 267
ALPM 267
limitations 267
route maps 309
BGP 309
route reflectors 300
BGP 300
routes 42
monitoring 42
routing protocols 268
RSTP 235
S
sensors command 46
serial console management 8
services 74
sFlow 50
sFlow visualization tools 53
SFP 42, 242
switch ports 242
single user mode 37
smonctl 49
smond 49
SNMP 47
snmpd 47
sources.list 108
SPAN 86
network troubleshooting 86
spanning tree parameters 237
Spanning Tree Protocol 184, 234
STP 234
VLAN-aware bridges 184
SSH 17
SSH keys 17
static routing 262, 265
with ip route 262
with Quagga 265
STP 229, 234
CLAG 229
Spanning Tree Protocol 234
stub areas 290
OSPF 290
sudo 18, 19
sudoers 19, 20
examples 20
summary LSA 289
OSPF 289
switchd 28, 30, 56, 117, 314
and ECMP hashing 314
configuring 28
counters 56
file system 28
nonatomic updates 117
restarting 30
switch ports 12, 240, 244
configuring 12, 240
ganging 240
logical limitations 244
system management 31
T
templates 135
time 36
setting 36
time zone 9, 35
topology 144, 269
data center 144
traceroute 83
traffic.conf 245
traffic distribution 162
traffic generator 84
mz 84
traffic marking 247
datapath 247
troubleshooting 31, 37
single user mode 37
trunk ports 173, 178
tzdata 35
U
U-Boot 6, 31
unnumbered interfaces 291, 297
OSPF 291
OSPFv3 297
untagged frames 173
bridges 173
upgrading 7
Cumulus Linux 7
user accounts 18
cumulus 18
root 18
user authentication 25
user commands 134
interfaces 134
V
virtual device counters 53, 56
monitoring 53
poll interval 56
VLAN statistics 56
virtual router redundancy 250
visudo 19
VLAN 53
statistics 53
VLAN-aware bridges 163, 184, 187, 188, 189, 190, 193, 195
basic example 188
configuring 187
IGMP snooping 195
Spanning Tree Protocol 184
with access ports and pruned VLANs 189
with bonds 190
with CLAG 193
VLAN tagging 176, 178
advanced example 178
basic example 176
VLAN translation 183
VRR 250
virtual router redundancy 250
VTEP 196, 197
vtysh 276
Quagga CLI 276
VXLAN 53, 195, 197, 209
no controller 209
statistics 53
VMware NSX 197
Z
zebra 272
routing 272
zero touch provisioning 110
ZTP 110