Link Aggregation
Link aggregation joins two or more network interfaces into a single logical link. Depending on the configuration, this can increase bandwidth, redundancy, or both.
In Ethernet, link aggregation is defined by the IEEE 802.3ad and, later, IEEE 802.1AX standards. The Link Aggregation Control Protocol (LACP) is the negotiation protocol specified by these standards. LACP can only be used with switches that support it.
Linux
Link aggregation on Linux is available through the Linux kernel Ethernet bonding driver, the Linux Team driver (which runs link validation, the LACP implementation, etc. in user space), or another application such as Open vSwitch.
Linux Ethernet bonding driver
The Linux Ethernet bonding driver is included with most common distros. The driver aggregates multiple Ethernet interfaces (slaves) into a single logical 'bonded' interface.
Docs: https://www.kernel.org/doc/html/latest/networking/bonding.html
Supported modes
Each bonded interface can be configured with one of the following modes.
Mode | Type | Description |
---|---|---|
0 | balance-rr | Round-robin policy; this is the default mode. Packets are transmitted in sequential order across all available interfaces. |
1 | active-backup | Only one slave is active at a time; another slave takes over if it fails. All slave interfaces share the same MAC address. |
2 | balance-xor | Transmits based on a hash of the packet's source and destination MAC addresses (layer2) or MAC and IP addresses (layer2+3), as selected by the xmit_hash_policy option. |
3 | broadcast | Everything is transmitted on all slave interfaces. |
4 | 802.3ad | IEEE 802.3ad Dynamic link aggregation policy. Used with a switch that supports IEEE 802.3ad dynamic link. Use this option if your switch is configured for 802.3ad / LACP. Some switches block all traffic for ports configured to use LACP until LACP is negotiated, so setting this option properly is very important. |
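5 | balance-tlb | Adaptive transmit load balancing. Outgoing traffic is distributed according to the current load on each slave. |
6 | balance-alb | Adaptive load balancing. Includes balance-tlb plus receive load balancing for IPv4 traffic, achieved through ARP negotiation. |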
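To check which of these modes a running bond actually uses, the bonding driver exposes the setting through sysfs. The sketch below assumes the usual /sys/class/net/<bond>/bonding/mode layout, where the file holds the mode name followed by its number (for example "802.3ad 4"). The bond_mode helper name and its optional second argument are made up for illustration.

```shell
# Print the mode name and number of a bond, e.g. "802.3ad (mode 4)".
# $1: bond name (default: bond0)
# $2: sysfs root, overridable for testing (default: /sys/class/net)
bond_mode() (
    bond=${1:-bond0}
    root=${2:-/sys/class/net}
    file="$root/$bond/bonding/mode"
    [ -r "$file" ] || { echo "cannot read $file" >&2; exit 1; }
    # The file holds "<name> <number>", for example "active-backup 1".
    read -r name number < "$file"
    printf '%s (mode %s)\n' "$name" "$number"
)
```

Calling bond_mode bond0 on a host with a bond0 interface prints the mode in use; the function returns non-zero if the bond does not exist.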
Network Manager
To create a new bond with Network Manager:
# nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad ip4 x.x.x.x/yy
# nmcli con mod bond0 bond.options mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
# nmcli con mod bond0 802-3-ethernet.mtu 9000
## Add slaves; ensure the slave interfaces are not used first.
# nmcli con del em2
# nmcli con del em3
# nmcli con add type bond-slave ifname em2 con-name em2 master bond0
# nmcli con add type bond-slave ifname em3 con-name em3 master bond0
## Bring up the bond
# nmcli con up bond0
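For a quick, non-persistent test without Network Manager, roughly the same bond can be built with iproute2. This is only a sketch under assumptions: it reuses the em2/em3 names and bond options from the commands above, leaves out the IP address and MTU, and the make_lacp_bond wrapper is hypothetical. By default it only prints the commands; set RUN=1 (as root) to execute them.

```shell
# Print (or, with RUN=1, execute) the iproute2 commands for an 802.3ad bond.
# Usage: make_lacp_bond BOND SLAVE...
make_lacp_bond() (
    bond=$1; shift
    # With RUN unset the commands are only echoed, so this is safe to try.
    run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }
    run ip link add "$bond" type bond mode 802.3ad miimon 100 \
        lacp_rate fast xmit_hash_policy layer2+3
    for slave in "$@"; do
        # Bring the slave down before enslaving it (the conservative order).
        run ip link set "$slave" down
        run ip link set "$slave" master "$bond"
    done
    run ip link set "$bond" up
)
```

For example, make_lacp_bond bond0 em2 em3 prints six ip link commands. A bond created this way does not survive a reboot, unlike the Network Manager connection.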
Checking on the bond status
You can see the current bonding status by reading /proc/net/bonding/bond0:
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
Primary Slave: None
Currently Active Slave: ib2
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Slave Interface: ib0
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: a0:00:00:2a:fe:80
Slave queue ID: 0
Slave Interface: ib1
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: a0:00:00:2c:fe:80
Slave queue ID: 0
Slave Interface: ib2
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:00:00:2a:fe:80
Slave queue ID: 0
Slave Interface: ib3
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:00:00:2c:fe:80
Slave queue ID: 0
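When a bond has many slaves, the status file is easier to scan as one line per slave. A minimal sketch, assuming the "Key: value" layout shown above; the bond_slaves function name is made up for illustration.

```shell
# Summarise a bonding status file as "<slave> <MII status> <failure count>".
# $1: status file (default: /proc/net/bonding/bond0)
bond_slaves() {
    awk -F': ' '
        $1 == "Slave Interface"     { slave = $2 }
        # "MII Status" also appears once for the bond itself, before any
        # slave section, so only record it once a slave has been seen.
        $1 == "MII Status" && slave { status = $2 }
        $1 == "Link Failure Count"  { print slave, status, $2 }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Run against the output above, bond_slaves would print "ib0 up 2", "ib1 up 1", "ib2 up 0" and "ib3 up 0", one per line.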
Open vSwitch
Open vSwitch can be used to create a bonded network interface as a port to a virtual bridge.
To create a new bond, use the add-bond subcommand with the arguments bridge-name bond-name interfaces options. For example:
# ovs-vsctl add-br nic0
# ovs-vsctl add-bond nic0 bond0 eth0 eth1 bond_mode=balance-tcp lacp=active
LACP options
Here are some common Open vSwitch options when working with LACP. More information can be found in the man page: man 5 ovs-vswitchd.conf.db
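Option | Description |
---|---|
lacp | LACP behavior for the port: active, passive, or off (the default). |
bond_mode | Bonding mode: active-backup (the default), balance-slb, or balance-tcp. |
other_config:bond-detect-mode | Link detection method: carrier (the default) or miimon. |
other_config:bond-miimon-interval | Sets the miimon interval in milliseconds (if using miimon as the detection method). |
bond_updelay | Milliseconds a link must remain up before it is considered up. Set to 0 to bring interfaces up immediately. |
bond_downdelay | Milliseconds a link must remain down before it is considered down. Set to 0 to take interfaces down immediately. |
other_config:lacp-fallback-ab | If true, fall back to active-backup mode when LACP negotiation fails. |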
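These options are columns (or other_config keys) of the Port record, so they can be changed on an existing bond with ovs-vsctl set port. A minimal sketch, assuming the bond0 port from the example above; the values are illustrative rather than recommendations, and the ovs_bond_opts wrapper is hypothetical. It only prints the command unless RUN=1 is set.

```shell
# Print (or, with RUN=1, execute) an ovs-vsctl call that sets the LACP-related
# options on a bonded port.
# $1: port name (default: bond0)
ovs_bond_opts() (
    port=${1:-bond0}
    run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }
    run ovs-vsctl set port "$port" \
        lacp=active \
        bond_mode=balance-tcp \
        bond_updelay=0 \
        bond_downdelay=0 \
        other_config:bond-detect-mode=miimon \
        other_config:bond-miimon-interval=100 \
        other_config:lacp-fallback-ab=true
)
```

Printing first makes it easy to review the exact command before applying it to a live bridge.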
To check on the status of your bond, run: ovs-appctl lacp/show
To bring a port down, run: ovs-ofctl mod-port br0 bond0 down
LACP remains disabled in Open vSwitch
After configuring a pair of interfaces into a bonded port in Open vSwitch, the port remains disabled. The bond status also shows that the actor state is defaulted and the may_enable parameter is false. The underlying network interfaces do show LACP packets, but LACP is never negotiated.
# ovs-appctl lacp/show
---- bond0 ----
status: active
sys_id: 6c:b3:11:13:58:d8
sys_priority: 65534
aggregation key: 5
lacp_time: slow
member: enp3s0f0: defaulted attached
port_id: 6
port_priority: 65535
may_enable: true
actor sys_id: 6c:b3:11:13:58:d8
actor sys_priority: 65534
actor port_id: 6
actor port_priority: 65535
actor key: 5
actor state: activity aggregation synchronized collecting distributing defaulted
partner sys_id: 00:00:00:00:00:00
partner sys_priority: 0
partner port_id: 0
partner port_priority: 0
partner key: 0
partner state:
member: enp3s0f1: defaulted attached
port_id: 5
port_priority: 65535
may_enable: true
actor sys_id: 6c:b3:11:13:58:d8
actor sys_priority: 65534
actor port_id: 5
actor port_priority: 65535
actor key: 5
actor state: activity aggregation synchronized collecting distributing defaulted
partner sys_id: 00:00:00:00:00:00
partner sys_priority: 0
partner port_id: 0
partner port_priority: 0
partner key: 0
partner state:
# ovs-appctl bond/show bond0
---- bond0 ----
bond_mode: balance-tcp
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
lacp_status: configured
lacp_fallback_ab: true
active-backup primary: <none>
active member mac: 00:00:00:00:00:00(none)
member enp3s0f0: disabled
may_enable: false
member enp3s0f1: disabled
may_enable: false
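To spot this symptom without reading the whole dump, the lacp/show output can be filtered for members whose partner was never learned. A rough sketch, assuming the output format shown above, where a member stuck in defaulted reports an all-zero partner sys_id; the lacp_defaulted_members name is made up for illustration.

```shell
# List bond members with an all-zero LACP partner, reading
# `ovs-appctl lacp/show` output from stdin.
lacp_defaulted_members() {
    awk '
        # Member header, e.g. "member: enp3s0f0: defaulted attached"
        $1 == "member:" { name = $2; sub(/:$/, "", name) }
        # As in the output above, an unnegotiated member shows a zero partner ID.
        $1 == "partner" && $2 == "sys_id:" && $3 == "00:00:00:00:00:00" { print name }
    '
}
```

Usage: ovs-appctl lacp/show bond0 | lacp_defaulted_members. With the output above this would list enp3s0f0 and enp3s0f1; an empty result means every member has learned a partner.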
Solution: The problem here is that the Intel X550T card is not supported by Open vSwitch. To confirm if this applies to you, run lspci:
# lspci|grep -i network
03:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)
03:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)
If this is the case, your next best option is to use Linux's LACP implementation described above (using nmcli or network-scripts to set up bond0 with LACP) and then add the bond0 device to Open vSwitch as a port (ovs-vsctl add-port nic0 bond0 vlan_mode=native-untagged).
Troubleshooting
Bond link keeps on randomly dying
I have a bond0 interface that randomly stops communicating with another device on the InfiniBand network. Here's some duct tape: the script below toggles the active slave down and up, which causes the bond to pick another active slave and fixes the network issue (at least temporarily). I run this script every 5 minutes via a cron job.
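#!/bin/bash
# Bounce the currently active slave so the bond fails over to another one.
fix_fail() {
    echo "Fixing a bad connection"
    # The line of interest looks like "Currently Active Slave: ib2".
    ActiveIb=$(awk '/Currently Active Slave/ {print $4}' /proc/net/bonding/bond0)
    /sbin/ifconfig "$ActiveIb" down
    /sbin/ifconfig "$ActiveIb" up
}
# Only apply the fix when ping reports that the peer is unreachable.
ping -W 2 -c 5 ems1-ib || fix_fail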
See Also
- https://en.wikipedia.org/wiki/Link_aggregation
- http://www.linuxhorizon.ro/bonding.html
- https://wiki.debian.org/Bonding
- https://www.interserver.net/tips/kb/network-bonding-types-network-bonding/