Link aggregation joins two or more network interfaces into a single logical interface. Depending on the configuration, this can increase throughput, provide redundancy, or both.

Linux

The Linux kernel provides a bonding module that combines multiple network interfaces (so-called slaves) into a single virtual bonded interface. Several modes are supported:

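Mode                          Description
mode=0, mode=balance-rr       Round-robin policy: packets are transmitted on each slave in turn. This is the default mode.
mode=1, mode=active-backup    Only one slave is active at a time; another slave takes over if it fails. By default, all slave interfaces share the same MAC address.
mode=2, mode=balance-xor      XOR policy: the transmitting slave is selected by a hash of the packet (by default, source MAC XOR destination MAC).
mode=3, mode=broadcast        Everything is transmitted on all slave interfaces.
mode=4, mode=802.3ad          IEEE 802.3ad dynamic link aggregation, also known as LACP. Requires a switch that supports IEEE 802.3ad.
mode=5, mode=balance-tlb      Adaptive transmit load balancing: outgoing traffic is distributed according to the load on each slave.
mode=6, mode=balance-alb      Adaptive load balancing: balance-tlb plus receive load balancing for IPv4 traffic.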
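
As a rough illustration of how one of these modes is selected, the sketch below builds an active-backup bond with the iproute2 `ip` tool. The names bond0, eth0 and eth1 and the helper functions `run` and `make_bond` are placeholders introduced here, not part of any tool. The first command fails if an interface with the chosen name already exists. Setting DRY_RUN=1 only prints the commands, which is a safe way to review them before running as root.

```shell
#!/bin/bash
# Sketch: create a bond with iproute2.
# bond0, eth0 and eth1 are placeholder names; adjust to your system.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "$*"
        else
                "$@"
        fi
}

# Usage: make_bond <bond> <mode> <slave>...
make_bond() {
        local bond=$1 mode=$2
        shift 2
        # miimon 100 checks the link state of each slave every 100 ms.
        run ip link add "$bond" type bond mode "$mode" miimon 100
        local slave
        for slave in "$@"; do
                # Take the slave down before enslaving it.
                run ip link set "$slave" down
                run ip link set "$slave" master "$bond"
        done
        run ip link set "$bond" up
}

# Example: DRY_RUN=1 make_bond bond0 active-backup eth0 eth1
```

The mode argument takes the same names as in the table above (balance-rr, active-backup, 802.3ad and so on).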

You can see the current bonding status by reading /proc/net/bonding/bond0:

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
 
Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
Primary Slave: None
Currently Active Slave: ib2
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
 
Slave Interface: ib0
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: a0:00:00:2a:fe:80
Slave queue ID: 0
 
Slave Interface: ib1
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: a0:00:00:2c:fe:80
Slave queue ID: 0
 
Slave Interface: ib2
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:00:00:2a:fe:80
Slave queue ID: 0
 
Slave Interface: ib3
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:00:00:2c:fe:80
Slave queue ID: 0
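
With several slaves the full listing gets long. The following is a minimal sketch that condenses it to one line per slave; `bond_summary` is a hypothetical helper name, and the parsing assumes the field labels shown in the listing above (other driver versions may label fields differently).

```shell
#!/bin/bash
# Sketch: print one line per slave from a bonding status file.
# Usage: bond_summary [file]   (default: /proc/net/bonding/bond0)

bond_summary() {
        awk -F': ' '
                /^Currently Active Slave/ { active = $2 }
                /^Slave Interface/        { name = $2 }
                # MII Status also appears once for the bond itself, before any slave.
                /^MII Status/ && name     { status[name] = $2 }
                /^Link Failure Count/     { order[++n] = name; fails[name] = $2 }
                END {
                        for (i = 1; i <= n; i++) {
                                s = order[i]
                                printf "%s %s failures=%s%s\n", s, status[s], fails[s],
                                       (s == active ? " active" : "")
                        }
                }
        ' "${1:-/proc/net/bonding/bond0}"
}
```

For the listing above, this should print four lines, with ib2 marked as active and ib0 showing failures=2.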

Troubleshooting

Bond link keeps on randomly dying

I have a bond0 interface that randomly stops communicating with another device on the InfiniBand network. Here's some duct tape: the script below toggles the active slave down and up, which causes the bond to pick another active slave and fixes the network issue (at least temporarily). I run this script every 5 minutes as a cron job.

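#!/bin/bash
# Bounce the bond's active slave when the peer stops answering pings,
# so that the bonding driver fails over to another slave.

fix_fail() {
        echo "Fixing a bad connection"
        # Field 4 of "Currently Active Slave: ib2" is the interface name.
        ActiveIb=$(awk '/Currently Active Slave/ {print $4}' /proc/net/bonding/bond0)
        # Do nothing if the bond reports no active slave.
        [ -n "$ActiveIb" ] && [ "$ActiveIb" != "None" ] || return 1
        /sbin/ifconfig "$ActiveIb" down
        /sbin/ifconfig "$ActiveIb" up
}

# ems1-ib is the peer on the InfiniBand network: 5 pings, 2 second timeout each.
ping -W 2 -c 5 ems1-ib || fix_fail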
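
A possibly gentler variant, sketched here under the assumption of an active-backup bond named bond0, asks the bonding driver to switch to a standby slave through sysfs instead of taking the active interface down. The helper names `next_slave` and `switch_slave` are made up for this example. Whether this clears the underlying fault as well as a down/up cycle does is untested.

```shell
#!/bin/bash
# Sketch: fail over to another healthy slave instead of bouncing the active one.
# Assumes an active-backup bond named bond0; switch_slave must run as root.

# Print the first slave that is not active and whose MII status is "up".
# Usage: next_slave [file]   (default: /proc/net/bonding/bond0)
next_slave() {
        awk -F': ' '
                /^Currently Active Slave/ { active = $2 }
                /^Slave Interface/        { name = $2 }
                /^MII Status/ && name != "" && name != active && $2 == "up" { print name; exit }
        ' "${1:-/proc/net/bonding/bond0}"
}

switch_slave() {
        local next
        next=$(next_slave) || return 1
        [ -n "$next" ] || { echo "No standby slave with link up" >&2; return 1; }
        echo "Switching bond0 to $next"
        # Writing a slave name here asks the bonding driver to make it active.
        echo "$next" > /sys/class/net/bond0/bonding/active_slave
}

# Example use, with the same health check as the script above:
# ping -W 2 -c 5 ems1-ib || switch_slave
```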

See Also