
LACP (Link Aggregation Control Protocol) - Primer

Why This Matters

A single link is a single point of failure. LACP bonds multiple physical links into one logical channel, providing both bandwidth aggregation and failover. It is the standard way to connect servers to top-of-rack switches with redundancy. Misconfigured bonding is a common cause of mysterious packet loss, asymmetric routing, and failover that does not actually fail over.

Link aggregation (LAG, port channel, bond, team) combines multiple physical links:

- Bandwidth: N links provide up to N times the bandwidth
- Redundancy: If one link fails, traffic shifts to surviving links
- Single logical interface: Upper layers see one interface with one IP

Important: traffic is distributed across links via hashing, not round-robin. A single TCP flow uses one link. Aggregate bandwidth benefits come from many concurrent flows.

Fun fact: Link aggregation has many names across vendors: LAG (Link Aggregation Group), port-channel (Cisco), bond (Linux), team (the Linux team driver, common on Red Hat systems), and trunk (some vendors; confusingly, Cisco uses "trunk" for VLAN trunking, not link aggregation). The IEEE standard was originally 802.3ad (2000) and is now maintained as 802.1AX.

Gotcha: A 2-link bond does NOT give you 2x bandwidth for a single connection. Because hashing assigns a flow to one link, a single scp transfer uses only one link (1 Gbps on a 2x1G bond). You only see aggregate bandwidth with many concurrent flows. This surprises almost everyone the first time.
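The flow-pinning behaviour described above can be sketched with a toy model. This is only an illustration: the kernel does not use cksum, and the link_for_flow helper and the flow-key format are invented for this example. It shows that a fixed flow key always maps to the same link, while flows with different source ports can land on different links.

```shell
#!/bin/sh
# Toy model of flow hashing. NOT the kernel's real hash function.
# link_for_flow KEY NLINKS -> prints the index of the link the flow maps to.
link_for_flow() {
    # cksum yields a deterministic 32-bit CRC of the flow key
    crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $((crc % $2))
}

# One flow (fixed addresses and ports) always lands on the same link
flow="10.0.0.1:40000->10.0.0.2:22"
link_for_flow "$flow" 2
link_for_flow "$flow" 2

# Many flows (different source ports) can spread across both links
for sport in 40000 40001 40002 40003 40004 40005 40006 40007; do
    echo "sport=$sport link=$(link_for_flow "10.0.0.1:$sport->10.0.0.2:22" 2)"
done
```

The first two calls print the same index, which is why a single scp transfer never exceeds the speed of one member link.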

LACP (802.3ad)

LACP is the IEEE standard for dynamic link aggregation. Both sides negotiate and agree on which ports to bundle.

Modes

| Mode | Behavior |
|---|---|
| Active | Sends LACP PDUs, actively tries to form a bond |
| Passive | Responds to LACP PDUs but does not initiate |

At least one side must be active. Both sides passive = no bond formed.
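The active/passive rule can be written as a small truth table. A minimal sketch, with lacp_forms as a hypothetical helper name; it models only the rule stated above, not a real negotiation.

```shell
#!/bin/sh
# lacp_forms MODE_A MODE_B -> exit status 0 if negotiation can start
lacp_forms() {
    case "$1/$2" in
        active/active|active/passive|passive/active) return 0 ;;
        *) return 1 ;;
    esac
}

for pair in "active active" "active passive" "passive passive"; do
    set -- $pair    # split the pair into $1 and $2
    if lacp_forms "$1" "$2"; then
        echo "$1 + $2: bond can form"
    else
        echo "$1 + $2: no bond"
    fi
done
```

Only the passive + passive combination reports "no bond", because neither end ever sends the first LACPDU.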

LACP PDU Exchange

Partners exchange LACPDUs every 1 second (fast) or 30 seconds (slow). If 3 consecutive PDUs are missed, the link is removed from the aggregate.

Default trap: The default LACP rate is slow (30-second interval), so a failure that leaves the physical link up can take up to 90 seconds to detect (3 missed PDUs). Set lacp_rate fast (1-second interval, 3-second detection) in production unless the switch cannot support it; a 90-second detection window is unacceptable for most workloads. A plain loss of carrier is caught sooner by miimon, independent of the LACP rate.
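The detection window is simply three PDU intervals. A minimal sketch of that arithmetic, using a hypothetical helper named lacp_detect_seconds:

```shell
#!/bin/sh
# lacp_detect_seconds RATE -> worst-case seconds before a silent partner times out
lacp_detect_seconds() {
    case "$1" in
        fast) interval=1 ;;    # one LACPDU per second
        slow) interval=30 ;;   # one LACPDU every 30 seconds
        *) echo "unknown rate: $1" >&2; return 1 ;;
    esac
    echo $((3 * interval))     # three missed PDUs
}

echo "fast: $(lacp_detect_seconds fast)s"   # fast: 3s
echo "slow: $(lacp_detect_seconds slow)s"   # slow: 90s
```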

Linux Bonding

Bonding Modes

| Mode | Name | Description | Requires Switch Config? |
|---|---|---|---|
| 0 | balance-rr | Round-robin per packet | Yes (static LAG) |
| 1 | active-backup | One active, rest standby | No |
| 2 | balance-xor | Hash-based distribution | Yes (static LAG) |
| 3 | broadcast | Send on all links | Yes |
| 4 | 802.3ad | LACP dynamic aggregation | Yes (LACP) |
| 5 | balance-tlb | Adaptive TX load balance | No |
| 6 | balance-alb | Adaptive TX+RX load balance | No |

Remember: The two most common bonding modes for production: Mode 1 (active-backup) when you need simple redundancy without switch config, and Mode 4 (802.3ad/LACP) when you need both bandwidth and redundancy with switch cooperation. Mnemonic: "1 for simple, 4 for fast."

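Under the hood: Mode 4 (LACP) sends LACPDUs (Link Aggregation Control Protocol Data Units) to the Slow Protocols multicast MAC address 01:80:C2:00:00:02. Each PDU carries the sender's System ID (MAC), System Priority, Port Key, and Port Priority, together with what it has learned about its partner. Links are bundled only when they share the same local key and all report the same partner System ID and key, so ports cabled to two unrelated switches will not join one aggregate.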
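LACPDUs can be observed on the wire, which helps confirm that both ends are actually talking. The sketch below assumes tcpdump is installed and that it is run as root on a member interface; capture_lacpdus is a hypothetical wrapper name. It filters on the Slow Protocols EtherType, 0x8809, which LACP uses.

```shell
#!/bin/sh
# Slow Protocols EtherType; LACP frames are carried under it
SLOW_PROTO_ETHERTYPE=0x8809

# capture_lacpdus IFACE [COUNT]: print a few LACPDUs seen on a member NIC.
# Requires root and tcpdump. -e shows the Ethernet header, so the
# 01:80:c2:00:00:02 destination address is visible in the output.
capture_lacpdus() {
    tcpdump -i "$1" -e -nn -c "${2:-4}" ether proto "$SLOW_PROTO_ETHERTYPE"
}

# Example (not run here): capture_lacpdus eth0
```

With lacp_rate fast, frames from the partner should show up about once per second; seeing only locally sent frames suggests the switch side is not running LACP on that port.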

Mode 4 (802.3ad/LACP) Setup

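# Create bond interface (miimon enables carrier monitoring, in milliseconds)
ip link add bond0 type bond mode 802.3ad miimon 100

# Set LACP rate and hash policy (lacp_rate can only be changed while bond0 is down)
ip link set bond0 type bond lacp_rate fast
ip link set bond0 type bond xmit_hash_policy layer3+4

# Add member interfaces (a member must be down before it can join the bond)
ip link set eth0 down
ip link set eth1 down
ip link set eth0 master bond0
ip link set eth1 master bond0

# Bring up
ip link set bond0 up
ip addr add 10.0.0.1/24 dev bond0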

Mode 1 (active-backup) Setup

Simplest redundancy — no switch configuration needed:

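ip link add bond0 type bond mode active-backup miimon 100
# A member must be down before it can join the bond
ip link set eth0 down
ip link set eth1 down
ip link set eth0 master bond0
ip link set eth1 master bond0
ip link set bond0 up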

Persistent Configuration (systemd-networkd)

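# /etc/systemd/network/bond0.netdev
[NetDev]
Name=bond0
Kind=bond

[Bond]
Mode=802.3ad
LACPTransmitRate=fast
TransmitHashPolicy=layer3+4
MIIMonitorSec=100ms

# /etc/systemd/network/bond0.network
[Match]
Name=bond0

[Network]
Address=10.0.0.1/24

# /etc/systemd/network/bond0-members.network (example file name; attaches the member NICs)
[Match]
Name=eth0 eth1

[Network]
Bond=bond0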

Hash Policies

The hash policy determines how traffic is distributed across links:

| Policy | Hashes On | Use Case |
|---|---|---|
| layer2 | src/dst MAC | Simple, default |
| layer3+4 | src/dst IP + port | Best for IP traffic |
| layer2+3 | src/dst MAC + IP | Good general choice |
| encap3+4 | Inner headers (tunnels) | VXLAN/GRE environments |

# Set hash policy
ip link set bond0 type bond xmit_hash_policy layer3+4
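To see why the policy choice matters, it helps to look at which header fields each policy feeds into the hash. The sketch below is a simplified model, not the kernel's algorithm: policy_key is a hypothetical helper, and the MAC and IP addresses are made-up placeholders. It compares two TCP flows between the same pair of hosts that differ only in source port.

```shell
#!/bin/sh
# policy_key POLICY SMAC DMAC SIP DIP SPORT DPORT
# Prints the header fields a policy hashes on (simplified model).
policy_key() {
    case "$1" in
        layer2)   echo "$2>$3" ;;
        layer2+3) echo "$2>$3 $4>$5" ;;
        layer3+4) echo "$4>$5 $6>$7" ;;
        *) echo "unsupported policy: $1" >&2; return 1 ;;
    esac
}

# Same hosts for both flows: source MAC, destination MAC, source IP, destination IP
set -- aa:aa:aa:aa:aa:01 bb:bb:bb:bb:bb:02 10.0.0.1 10.0.0.2
for policy in layer2 layer2+3 layer3+4; do
    k1=$(policy_key "$policy" "$@" 40000 443)
    k2=$(policy_key "$policy" "$@" 40001 443)
    if [ "$k1" = "$k2" ]; then
        echo "$policy: same key, both flows share one link"
    else
        echo "$policy: different keys, flows may use different links"
    fi
done
```

Only layer3+4 produces different keys here, which is why it tends to spread traffic better between a small number of hosts.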

Monitoring and Troubleshooting

Check Bond Status

# Detailed bond status
cat /proc/net/bonding/bond0

# Quick check
ip link show bond0
ip link show eth0    # check "master bond0" in output

# LACP partner info
grep -A5 "Partner" /proc/net/bonding/bond0

Common Issues

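Bond formed but only one link active: Check switch LACP config. Both sides must agree on LAG membership. In /proc/net/bonding/bond0, compare the Aggregator ID of each member with the ID listed under "Active Aggregator Info"; a member with a different ID has not joined the active bundle and carries no traffic.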
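Spotting a member that has not joined the active bundle can be scripted. The sketch below assumes the status file lists an "Aggregator ID" line under "Active Aggregator Info" and another under each "Slave Interface" section; stray_members is a hypothetical helper, and the sample text is a trimmed, made-up stand-in for real output.

```shell
#!/bin/sh
# stray_members FILE: list members whose Aggregator ID differs from the active one
stray_members() {
    awk '
        /^Slave Interface:/ { slave = $3 }
        /Aggregator ID:/ {
            if (slave == "") active = $3           # seen before any member: active aggregator
            else if ($3 != active) print slave     # member sits in another aggregator
        }
    ' "$1"
}

# Demo on illustrative sample text (not captured from a real system)
sample=$(mktemp)
cat > "$sample" <<'EOF'
Active Aggregator Info:
	Aggregator ID: 1

Slave Interface: eth0
MII Status: up
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Aggregator ID: 2
EOF
stray_members "$sample"   # prints: eth1
rm -f "$sample"
```

On a live host the same function would be pointed at /proc/net/bonding/bond0.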

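Uneven traffic distribution: The hash policy can place most flows on one link. This is normal when there are only a few flows, or when the layer2 policy is used and all traffic goes to one MAC address such as a gateway. Switch to layer3+4 for better distribution. The bond's policy governs only what Linux transmits; return traffic is distributed by the switch's own hash setting.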

Failover not working: Check miimon (the link monitoring interval, in milliseconds). With miimon set to 0 and no ARP monitoring configured, the bond has no way to detect link failures.

Debug clue: When LACP bonding seems broken, first run cat /proc/net/bonding/bond0 and look for "Partner Mac Address: 00:00:00:00:00:00". An all-zero partner MAC means the Linux side is not receiving LACPDUs from the switch. Either (1) the switch port is not configured for LACP, (2) both ends are passive (the switch is in passive mode and the Linux bond has lacp_active turned off; it is on by default), or (3) there is a physical layer issue (cable, SFP).
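The all-zero partner check is easy to automate for monitoring. A minimal sketch, with partner_missing as a hypothetical helper; the demo runs against a one-line made-up sample, and on a real host the argument would be /proc/net/bonding/bond0.

```shell
#!/bin/sh
# partner_missing FILE: exit 0 if the status text shows an all-zero partner MAC
partner_missing() {
    grep -q 'Partner Mac Address: 00:00:00:00:00:00' "$1"
}

# Demo on an illustrative sample line
sample=$(mktemp)
printf '\tPartner Mac Address: 00:00:00:00:00:00\n' > "$sample"
if partner_missing "$sample"; then
    echo "no LACP partner: check switch LACP config, active/passive mode, cabling"
fi
rm -f "$sample"
```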

# Set link monitoring (milliseconds)
ip link set bond0 type bond miimon 100

# Or use ARP monitoring (for modes such as active-backup; not supported in 802.3ad, balance-tlb, or balance-alb)
ip link set bond0 type bond arp_interval 200
ip link set bond0 type bond arp_ip_target 10.0.0.254
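Whether any monitoring is configured can be read back from sysfs, where each bond exposes its options as files under /sys/class/net/<bond>/bonding/. The sketch below uses a hypothetical helper, monitoring_enabled, and exercises it against a throwaway directory that mimics the miimon and arp_interval files so it can run without a real bond.

```shell
#!/bin/sh
# monitoring_enabled DIR: DIR is a bond's sysfs directory,
# e.g. /sys/class/net/bond0/bonding. Exit 0 if miimon or arp_interval is non-zero.
monitoring_enabled() {
    miimon=$(cat "$1/miimon" 2>/dev/null || echo 0)
    arp=$(cat "$1/arp_interval" 2>/dev/null || echo 0)
    [ "${miimon:-0}" -gt 0 ] || [ "${arp:-0}" -gt 0 ]
}

# Demo against a temporary directory standing in for sysfs
demo=$(mktemp -d)
echo 0 > "$demo/miimon"
echo 0 > "$demo/arp_interval"
monitoring_enabled "$demo" || echo "no link monitoring configured"
echo 100 > "$demo/miimon"
monitoring_enabled "$demo" && echo "link monitoring active"
rm -rf "$demo"
```

On a live system the call would be monitoring_enabled /sys/class/net/bond0/bonding.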

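LACP rate mismatch: Each side's rate setting tells its partner how often to send LACPDUs, so a mismatch does not stop the bond from forming, but the two ends then detect failures on different timescales. Use the same rate (fast or slow) on both sides.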

Interview tip: "What is the difference between LACP and a static LAG?" LACP (802.3ad) dynamically negotiates the bond — both sides exchange PDUs to agree on which ports to bundle, and can detect one-sided failures (cable unplugged on far end). A static LAG has no negotiation — both sides are manually configured and have no health checking. If a cable is partially failed (link up but not passing traffic), LACP detects it; a static LAG does not.

NetworkManager (nmcli)

# Create bond
nmcli con add type bond con-name bond0 ifname bond0 \
  bond.options "mode=802.3ad,lacp_rate=fast,xmit_hash_policy=layer3+4,miimon=100"

# Add members
nmcli con add type ethernet con-name bond0-eth0 ifname eth0 master bond0
nmcli con add type ethernet con-name bond0-eth1 ifname eth1 master bond0

# Activate
nmcli con up bond0

Quick Reference

| Task | Command |
|---|---|
| Show bond status | cat /proc/net/bonding/bond0 |
| Create LACP bond | ip link add bond0 type bond mode 802.3ad |
| Add member | ip link set eth0 master bond0 |
| Set hash policy | ip link set bond0 type bond xmit_hash_policy layer3+4 |
| Set monitoring | ip link set bond0 type bond miimon 100 |
| nmcli bond | nmcli con add type bond ... |
