Link Aggregation

From Leo's Notes
Last edited on 28 September 2022, at 23:30.

Link aggregation joins two or more network interfaces into a single logical link. Depending on the configuration, this can increase bandwidth and throughput, add redundancy, or both.

In Ethernet, link aggregation was originally defined in IEEE 802.3ad and later moved to IEEE 802.1AX. The Link Aggregation Control Protocol (LACP) is the negotiation protocol specified by these standards. LACP can only be used when the switch at the other end of the link also supports it.

Linux

Link aggregation on Linux is available through the kernel Ethernet bonding driver, the Linux Team driver (which runs link validation, the LACP implementation, etc. in user space), or another application such as Open vSwitch.

Linux ethernet bonding driver

The Linux Ethernet bonding driver is included with most common distros. The driver aggregates multiple Ethernet interfaces (slaves) into a single logical 'bonded' interface.

Docs: https://www.kernel.org/doc/html/latest/networking/bonding.html

Supported modes

Each bonded interface can be configured with one of the following modes.

Mode Type Description
0 balance-rr Round-robin policy (the default mode). Packets are transmitted in sequential order across all available slaves.
1 active-backup Only one slave is active at a time; another slave takes over if the active one fails. All slave interfaces share the same MAC address.
2 balance-xor Transmits based on a hash of the packet's MAC addresses (layer2, the default) or MAC and IP addresses (layer2+3), as selected by the xmit_hash_policy option.
3 broadcast Transmits everything on all slave interfaces.
4 802.3ad IEEE 802.3ad Dynamic link aggregation policy. Used with a switch that supports IEEE 802.3ad dynamic link. Use this option if your switch is configured for 802.3ad / LACP. Some switches block all traffic for ports configured to use LACP until LACP is negotiated, so setting this option properly is very important.
5 balance-tlb Adaptive transmit load balancing. Outgoing traffic is distributed based on load on each slave.
6 balance-alb Adaptive load balancing. Includes balance-tlb plus receive load balancing for IPv4 traffic.
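
To make the balance-xor idea concrete, here is a rough sketch of a layer2-style hash. It is a simplification for illustration only: it XORs just the last byte of each MAC address and takes the result modulo the slave count, whereas the real driver mixes in more fields. The function name pick_slave and the MAC addresses are made up for the example.

```shell
#!/bin/bash
# Simplified illustration of a layer2-style XOR hash: XOR the last byte of the
# source and destination MAC addresses, then take the result modulo the number
# of slaves. The real driver mixes in more fields; this is only a sketch.
# pick_slave is a hypothetical helper name.
pick_slave() {
    local src_mac="$1" dst_mac="$2" slave_count="$3"
    local src_byte="${src_mac##*:}"   # last octet of the source MAC
    local dst_byte="${dst_mac##*:}"   # last octet of the destination MAC
    echo $(( (0x$src_byte ^ 0x$dst_byte) % $slave_count ))
}

# The same source/destination pair always maps to the same slave index.
pick_slave a0:00:00:2a:fe:80 a0:00:00:2c:fe:81 2   # prints 1
pick_slave a0:00:00:2a:fe:80 a0:00:00:2c:fe:82 2   # prints 0
```

Because the result depends only on the address pair, a single flow never gets spread across slaves; only traffic to different peers is balanced.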

Network Manager

To create a new bond with Network Manager:

# nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad ip4 x.x.x.x/yy
# nmcli con mod bond0 bond.options mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
# nmcli con mod bond0 802-3-ethernet.mtu 9000

## Add slaves; first ensure the slave interfaces are not already in use.
# nmcli con del em2
# nmcli con del em3
# nmcli con add type bond-slave ifname em2 con-name em2 master bond0
# nmcli con add type bond-slave ifname em3 con-name em3 master bond0

## Bring up the bond
# nmcli con up bond0
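
With more than a couple of slaves, the steps above get repetitive. The following is a minimal sketch that only prints the nmcli commands for a given bond and list of interfaces, so they can be reviewed before being run. The helper name print_bond_slave_cmds is hypothetical, and it assumes each interface's existing connection profile is named after the interface, as in the example above.

```shell
#!/bin/bash
# Print (not run) the nmcli commands needed to enslave interfaces to a bond.
# print_bond_slave_cmds is a hypothetical helper; review output before running.
print_bond_slave_cmds() {
    local bond="$1"; shift
    local ifname
    for ifname in "$@"; do
        # Remove any existing connection profile so the interface is free.
        echo "nmcli con del $ifname"
        echo "nmcli con add type bond-slave ifname $ifname con-name $ifname master $bond"
    done
    echo "nmcli con up $bond"
}

print_bond_slave_cmds bond0 em2 em3
```

Once the printed commands look right, the output can be piped to a root shell.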

Checking on the bond status

You can see the current bonding status by reading /proc/net/bonding/bond0:

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
 
Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
Primary Slave: None
Currently Active Slave: ib2
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
 
Slave Interface: ib0
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: a0:00:00:2a:fe:80
Slave queue ID: 0
 
Slave Interface: ib1
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: a0:00:00:2c:fe:80
Slave queue ID: 0
 
Slave Interface: ib2
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:00:00:2a:fe:80
Slave queue ID: 0
 
Slave Interface: ib3
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:00:00:2c:fe:80
Slave queue ID: 0
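
For scripting or monitoring, the status text can be boiled down to the active slave and the MII status of each slave. This is a small sketch that assumes the output layout shown above; bond_summary is a hypothetical helper name and it reads the status text from stdin.

```shell
#!/bin/bash
# Summarise bonding status text read from stdin: the active slave, then each
# slave with its MII status. bond_summary is a hypothetical helper name.
bond_summary() {
    awk -F': ' '
        /^Currently Active Slave:/ { print "active=" $2 }
        /^Slave Interface:/        { slave = $2 }
        # Only MII Status lines that follow a Slave Interface line belong to a slave.
        /^MII Status:/ && slave != "" { print slave "=" $2; slave = "" }
    '
}

# Usage on a live system (assumes a bond named bond0):
#   bond_summary < /proc/net/bonding/bond0
```

The first MII Status line in the file describes the bond itself, so the sketch skips it and only reports the per-slave lines.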

Open vSwitch

Open vSwitch can be used to create a bonded network interface as a port to a virtual bridge.

To create a new bond, use the add-bond subcommand, which takes the bridge name, the bond name, the member interfaces, and any options. For example:

# ovs-vsctl add-br nic0
# ovs-vsctl add-bond nic0 bond0 eth0 eth1 bond_mode=balance-tcp lacp=active

LACP options

Here are some common Open vSwitch options when working with LACP. More information can be found in the ovs-vswitchd.conf.db(5) man page (man 5 ovs-vswitchd.conf.db).

Option Description
lacp
  • active - ports can initiate LACP negotiations
  • passive - ports can participate in LACP negotiations if initiated by the switch
  • off - do not use LACP (default)
bond_mode
  • balance-slb - Balances traffic based on source MAC address and output VLAN
  • active-backup - Uses one member and fails over to a backup member (default)
  • balance-tcp - Requires 802.3ad / LACP to be enabled. Balances traffic based on L3/L4 information (e.g. IP addresses and ports).
other_config:bond-detect-mode
  • carrier - check by the link's carrier status
  • miimon - check by polling each interface's MII
other_config:bond-miimon-interval Sets the miimon interval (if using miimon as the detection method)
bond_updelay Milliseconds a link must remain up before it is considered up. Set to 0 to enable an interface as soon as its link comes up.
bond_downdelay Milliseconds a link must remain down before it is considered down. Set to 0 to disable an interface as soon as its link goes down.
other_config:lacp-fallback-ab
  • true - allows Open vSwitch to fall back to active-backup mode if the switch does not support LACP
  • false - the bond will be disabled if LACP is not supported by the switch (default).
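
These options can also be changed on an existing bond port with ovs-vsctl set port. Below is a minimal sketch that validates the lacp and bond_mode values against the lists above and prints the resulting command rather than running it. The helper name print_ovs_lacp_cmd is hypothetical, and enabling lacp-fallback-ab is just one possible choice for the example.

```shell
#!/bin/bash
# Print (not run) an ovs-vsctl command that applies LACP settings to an
# existing bond port. print_ovs_lacp_cmd is a hypothetical helper name.
print_ovs_lacp_cmd() {
    local port="$1" lacp="$2" mode="$3"
    case "$lacp" in
        active|passive|off) ;;
        *) echo "invalid lacp value: $lacp" >&2; return 1 ;;
    esac
    case "$mode" in
        balance-slb|active-backup|balance-tcp) ;;
        *) echo "invalid bond_mode value: $mode" >&2; return 1 ;;
    esac
    echo "ovs-vsctl set port $port lacp=$lacp bond_mode=$mode" \
         "other_config:lacp-fallback-ab=true"
}

print_ovs_lacp_cmd bond0 active balance-tcp
```

Rejecting unknown values up front catches typos before they reach the Open vSwitch database.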

To check on the status of your bond, run: ovs-appctl lacp/show

To bring a port down, run: ovs-ofctl mod-port br0 bond0 down

LACP remains disabled in Open vSwitch

After configuring a pair of interfaces into a bonded port in Open vSwitch, the port remains disabled. The LACP status (lacp/show) shows the actor state as defaulted with an all-zero partner, and the bond status (bond/show) reports may_enable: false for each member. LACP packets are visible on the underlying network interfaces, but LACP is never negotiated.

# ovs-appctl lacp/show
---- bond0 ----
  status: active
  sys_id: 6c:b3:11:13:58:d8
  sys_priority: 65534
  aggregation key: 5
  lacp_time: slow

member: enp3s0f0: defaulted attached
  port_id: 6
  port_priority: 65535
  may_enable: true

  actor sys_id: 6c:b3:11:13:58:d8
  actor sys_priority: 65534
  actor port_id: 6
  actor port_priority: 65535
  actor key: 5
  actor state: activity aggregation synchronized collecting distributing defaulted

  partner sys_id: 00:00:00:00:00:00
  partner sys_priority: 0
  partner port_id: 0
  partner port_priority: 0
  partner key: 0
  partner state:

member: enp3s0f1: defaulted attached
  port_id: 5
  port_priority: 65535
  may_enable: true

  actor sys_id: 6c:b3:11:13:58:d8
  actor sys_priority: 65534
  actor port_id: 5
  actor port_priority: 65535
  actor key: 5
  actor state: activity aggregation synchronized collecting distributing defaulted

  partner sys_id: 00:00:00:00:00:00
  partner sys_priority: 0
  partner port_id: 0
  partner port_priority: 0
  partner key: 0
  partner state:

# ovs-appctl bond/show bond0
---- bond0 ----
bond_mode: balance-tcp
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
lacp_status: configured
lacp_fallback_ab: true
active-backup primary: <none>
active member mac: 00:00:00:00:00:00(none)

member enp3s0f0: disabled
  may_enable: false

member enp3s0f1: disabled
  may_enable: false
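
The telltale sign in the output above is the all-zero partner sys_id: the member never recorded a partner on the switch side. A quick sketch for spotting such members automatically follows. It assumes the member: label used in the output above (other Open vSwitch versions may label these lines differently), and unnegotiated_members is a hypothetical helper name.

```shell
#!/bin/bash
# List bond members whose LACP partner system ID is all zeros, i.e. members
# that have not recorded an LACP partner. Reads lacp/show text on stdin.
# unnegotiated_members is a hypothetical helper name.
unnegotiated_members() {
    awk '
        $1 == "member:" { member = $2; sub(/:$/, "", member) }
        $1 == "partner" && $2 == "sys_id:" && $3 == "00:00:00:00:00:00" { print member }
    '
}

# Usage on a live system:
#   ovs-appctl lacp/show bond0 | unnegotiated_members
```

An empty result means every member has a real partner recorded.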

Solution: The problem here is that the Intel X550T card is not supported by Open vSwitch. To confirm whether this applies to you, run lspci:

# lspci|grep -i network
03:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)
03:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)

If this is the case, your next best option is to use Linux's LACP implementation described above (using nmcli or network-scripts to set up bond0 with LACP) and then add the bond0 device to Open vSwitch as an ordinary port (ovs-vsctl add-port nic0 bond0 vlan_mode=native-untagged).

Troubleshooting

Bond link keeps on randomly dying

I have a bond0 interface that randomly stops communicating with another device on the InfiniBand network. Here's some duct tape that toggles the active slave down and up. This makes the bond pick another active slave, which fixes the network issue (at least temporarily). I run this script every 5 minutes via a cron job.

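#!/bin/bash
# Bounce the active slave of bond0 when the peer stops answering pings,
# forcing the bond to fail over to another slave.

fix_fail() {
        echo "Fixing a bad connection"
        # Name of the currently active slave, e.g. ib2
        ActiveIb=$(awk '/Currently Active Slave/ {print $4}' /proc/net/bonding/bond0)
        # Nothing to do if the bond reports no active slave
        [ -n "$ActiveIb" ] || return 1
        /sbin/ifconfig "$ActiveIb" down
        /sbin/ifconfig "$ActiveIb" up
}

# Send 5 pings with a 2 second reply timeout; repair only if none are answered
ping -W 2 -c 5 ems1-ib || fix_fail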

See Also