Introduction
This document describes scenarios under which polarization in port-channel load balancing could occur and provides suggestions on how to prevent them.
Prerequisites
Requirements
Cisco recommends that you have knowledge of these topics:
Components Used
This document is not restricted to specific software and hardware versions.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.
Background
Polarization is an issue where the hash algorithm selects certain paths in the network and leaves redundant paths unused.
Topology
Configuration
N7K1 and N7K2 form a vPC pair, and Po100, Po200, Po300, and Po301 are vPC port-channels.
N7K1 and N7K2 act as pure Layer 2 switches; no routing takes place on them.
All switches are running the same port-channel load-balancing algorithm.
The polarization issue is seen on traffic that leaves N7K1 and N7K2, irrespective of whether the source and destination are in the same VLAN (no routing) or in different VLANs with the routing done on N7K3 or N7K4.
Traffic Flow
The source sends multiple streams to the destination (with multiple source and destination IP addresses, and the L4 port information also varies from packet to packet). A good mix of traffic is used in order to ensure that in an ideal situation, the traffic is evenly distributed among the port-channel member interfaces.
The traffic from the source lands on N7K3/N7K4 and then goes via N7K1/N7K2 to the destination.
On each of N7K1 and N7K2, one member link of Po100 and one member link of Po200 sends out almost 99% of the traffic while the other member link remains nearly idle. That is, on each switch, one link among Eth4/2 and Eth4/3 carries 99% of the unicast traffic and the other carries less than 1%. Similarly, one link among Eth9/2 and Eth9/3 carries 99% of the traffic and the other carries less than 1%. The output in the Troubleshooting section shows the traffic on the Po100 and Po200 member interfaces on N7K1. Similar output can be seen on N7K2.
Irrespective of the type of port-channel load-balancing algorithm used, the issue can be seen as long as the same port-channel load-balancing algorithm is used on the N7K1/N7K2 pair and the N7K3/N7K4 pair. The command to check the port-channel load-balancing algorithm is shown here:
N7K1# show port-channel load-balance
Warning: Per Packet Load balance configuration has higher precedence
System config:
Non-IP: src-dst mac
IP: src-dst ip-l4port-vlan rotate 0
Port Channel Load-Balancing Configuration for all modules:
Module 1:
Non-IP: src-dst mac
IP: src-dst ip rotate 0
Module 2:
Non-IP: src-dst mac
IP: src-dst ip rotate 0
Module 3:
Non-IP: src-dst mac
IP: src-dst ip rotate 0
Module 4:
Non-IP: src-dst mac
IP: src-dst ip-l4port-vlan rotate 0
Module 7:
Non-IP: src-dst mac
IP: src-dst ip-l4port-vlan rotate 0
Module 8:
Non-IP: src-dst mac
IP: src-dst ip-l4port-vlan rotate 0
Module 9:
Non-IP: src-dst mac
IP: src-dst ip-l4port-vlan rotate 0
Troubleshooting
If uneven load balancing is seen on a port-channel, it can be because of polarization.
When traffic reaches the N7K3 and N7K4 switches, it is forwarded to the N7K1/N7K2 switches via Po300 of N7K3 and Po301 of N7K4. Here, the load-balancing algorithm takes effect: some flows are forwarded to N7K1 and other flows are forwarded to N7K2.
Initially, all the traffic enters N7K3/N7K4 on Eth1/1. Based on the source and destination IP addresses and the L4 port information, certain flows are hashed to the link toward N7K1 and other flows to the link toward N7K2. The hashing is based on the result bundle hash (RBH) value that the switch calculates. For simplicity, assume that, based on the load-balancing algorithm used, the switch divides the incoming traffic into two flows (flow X and flow Y). Flow X is sent out of one port-channel member link and flow Y is sent out of the other.
When the traffic lands on the N7K1/N7K2 pair, there are two possibilities. (Consider X and Y to be interchangeable.)
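The member-link selection described above can be pictured with a minimal Python sketch. It is only an illustration: SHA-256 stands in for the hardware hash, the modulo step stands in for the RBH-to-link mapping, and the function name select_member, the link labels, and the test flows are all hypothetical.

```python
import hashlib

def select_member(src_ip, dst_ip, src_port, dst_port, members):
    """Pick a port-channel member for a flow from a hash of its header fields."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    # Assumption: SHA-256 is a stand-in, not the hash the switch actually uses.
    bucket = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return members[bucket % len(members)]

members = ["link-to-N7K1", "link-to-N7K2"]  # hypothetical labels

# A varied mix of flows: addresses and L4 source ports change per flow.
flows = [(f"10.0.0.{i % 250 + 1}", f"10.1.0.{i % 200 + 1}", 1024 + i, 80)
         for i in range(1000)]

counts = {m: 0 for m in members}
for flow in flows:
    counts[select_member(*flow, members)] += 1

print(counts)  # with a varied mix, both links carry a roughly equal share
```

Every packet of a given flow maps to the same member link, while a varied mix of flows spreads across both links.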
Case 1:
N7K3 sent flow X to N7K1 and flow Y to N7K2
N7K4 sent flow Y to N7K1 and flow X to N7K2
Case 2:
N7K3 sent flow X to N7K1 and flow Y to N7K2
N7K4 sent flow X to N7K1 and flow Y to N7K2
In Case 1, N7K1 and N7K2 each receive both types of flows (flow X and flow Y). Even though they use the same port-channel load-balancing algorithm as N7K3/N7K4, no polarization is seen, because flow X and flow Y egress Po100 and Po200 on different member links. Hence, the traffic is better distributed among the port-channel member interfaces.
In Case 2, N7K1 receives only flow X and N7K2 receives only flow Y. This can create polarization if the port-channel load-balancing algorithm used on these switches is the same as the one used on the N7K3/N7K4 pair. Because N7K1 applies the same algorithm, it sends flow X on only one member link of Po100/Po200, and the other member link does not forward any traffic. Similarly, N7K2 sends flow Y on only one member link of Po100/Po200, and the other member link does not forward any traffic.
Since the traffic that N7K1 and N7K2 receive has already been classified by the upstream hash, each switch uses only one port-channel member link to send out all incoming traffic, and nothing is sent out of the other member link. If the incoming traffic rate exceeds the bandwidth of that single member link, the excess traffic can be dropped, because the other member link does not forward it.
A similar issue can be seen when more than two links are used in the port-channel. For example, if four links are used in a port-channel, then depending on how the flows hash, either no polarization occurs or partial polarization occurs, where only two of the four member links forward all the incoming traffic and the other two links forward nothing.
The polarization is a result of the design, so it is important to analyze the design in order to make sure that no polarization occurs. Output that indicates polarization on Po100 and Po200 on N7K1 is shown next (similar output can be seen on N7K2 as well).
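Case 2 can be reproduced with a small Python simulation. This is a sketch under stated assumptions: SHA-256 stands in for the switch hash, link selection is modelled as hash modulo the number of links, and the helper names flow_hash and pick and the flow list are hypothetical.

```python
import hashlib

def flow_hash(flow):
    # Assumption: SHA-256 over the flow fields stands in for the hardware hash.
    key = "|".join(map(str, flow)).encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

def pick(flow, n_links):
    # Assumption: the member link is chosen as hash modulo link count.
    return flow_hash(flow) % n_links

flows = [(f"10.0.{i // 250}.{i % 250 + 1}", "10.1.0.10", 1024 + i, 80)
         for i in range(1000)]

# Upstream tier (Case 2): link 0 leads to N7K1, link 1 leads to N7K2.
to_n7k1 = [f for f in flows if pick(f, 2) == 0]

# N7K1 applies the same hash to the same fields on a two-member Po100.
po100 = [0, 0]
for f in to_n7k1:
    po100[pick(f, 2)] += 1

print(len(to_n7k1), po100)  # every flow that reached N7K1 lands on member 0
```

Because the upstream switch already filtered on the hash result, the second application of the same hash cannot separate the flows again, so one member link stays idle.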
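The partial polarization on a four-member port-channel can be illustrated the same way. The sketch below assumes that the upstream switch splits flows across two links, that the downstream switch uses the same hash across four links, and that link selection is hash modulo link count; SHA-256 and the name flow_hash are stand-ins, not the actual implementation.

```python
import hashlib

def flow_hash(flow):
    # Assumption: SHA-256 stands in for the hash the switch computes.
    key = "|".join(map(str, flow)).encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

flows = [(f"10.0.{i // 250}.{i % 250 + 1}", "10.1.0.10", 1024 + i, 80)
         for i in range(1000)]

# Upstream: two links; only flows with an even hash reach this switch.
received = [f for f in flows if flow_hash(f) % 2 == 0]

# Downstream: same hash, four member links.
members = [0, 0, 0, 0]
for f in received:
    members[flow_hash(f) % 4] += 1

print(members)  # members 1 and 3 stay at zero
```

An even hash value can only be 0 or 2 modulo 4, so under these assumptions two of the four members carry all the traffic and the other two carry none.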
N7K1# show port-channel summary | i 200
200 Po200(SU) Eth LACP Eth9/2(P) Eth9/3(P)
N7K1# show port-channel traffic interface port-channel 200
NOTE: Clear the port-channel member counters to get accurate statistics
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
200 Eth9/2 0.0% 99.99% 44.44% 4.00% 0.0% 100.00%
200 Eth9/3 0.0% 0.00% 55.55% 96.00% 0.0% 0.0%
N7K1# show port-channel summary | i 100
100 Po100(SU) Eth LACP Eth4/2(P) Eth4/3(P)
N7K1# show port-channel traffic interface port-channel 100
NOTE: Clear the port-channel member counters to get accurate statistics
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
100 Eth4/2 0.0% 99.99% 40.55% 7.00% 0.0% 100.00%
100 Eth4/3 0.0% 0.00% 54.44% 93.00% 0.0% 0.0%
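As a convenience when many port-channels have to be checked, output like the above can be screened with a short script. The following Python sketch parses the Tx-Ucst column and flags a port-channel whose unicast transmit traffic sits almost entirely on one member. The function names and the 90% threshold are illustrative assumptions, not Cisco-defined values, and the parser expects the eight-column layout shown above.

```python
def tx_unicast_share(output):
    """Return {port: Tx-Ucst percent} from 'show port-channel traffic' text."""
    shares = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows: ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
        if len(fields) == 8 and fields[0].isdigit() and fields[1].startswith("Eth"):
            shares[fields[1]] = float(fields[3].rstrip("%"))
    return shares

def looks_polarized(shares, threshold=90.0):
    # Assumption: the threshold is an arbitrary illustrative cut-off.
    return len(shares) > 1 and max(shares.values()) >= threshold

sample = """\
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
200 Eth9/2 0.0% 99.99% 44.44% 4.00% 0.0% 100.00%
200 Eth9/3 0.0% 0.00% 55.55% 96.00% 0.0% 0.0%
"""

shares = tx_unicast_share(sample)
print(shares, looks_polarized(shares))
```

Clear the member counters before you collect the output, as the NOTE in the command output advises, so that the percentages reflect current traffic.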
Workarounds
Some of the workarounds used to ensure that polarization does not happen are described in this section.
- Proper Design: Since the main cause for polarization is improper design, it is best to ensure that we change the network design to make sure that there is no room for polarization in the topology.
If no changes to the design are possible, we can do the following.
- Use different port-channel load-balancing algorithms at each level of switches (one algorithm on the N7K1/N7K2 pair and a different algorithm on the N7K3/N7K4 pair). When the load-balancing algorithm is changed, the N7K1/N7K2 switches hash the incoming traffic on different information than the N7K3/N7K4 switches use. Hence, the outgoing traffic uses all the port-channel member links. (The choice of algorithm depends on the type of traffic that the switch receives.)
- If you want to use the same load-balancing algorithm, use different rotate values at each level of switches. The rotate command introduces randomness in the hashing algorithm by offsetting the hash input by a user-configured number of bytes, which helps to avoid polarization. (Use one rotate value for the N7K1/N7K2 pair and a different rotate value for the N7K3/N7K4 pair.)
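The effect of both workarounds can be sketched in Python. This is a model, not the switch implementation: SHA-256 stands in for the hardware hash, rotate is modelled as a byte rotation of the hash input, link selection is hash modulo link count, and the names flow_hash, egress_split, IP, and IP_L4 are hypothetical.

```python
import hashlib

def flow_hash(flow, fields, rotate=0):
    key = "|".join(str(flow[name]) for name in fields).encode()
    r = rotate % len(key)
    key = key[r:] + key[:r]  # assumption: 'rotate' modelled as a byte rotation
    # Assumption: SHA-256 stands in for the hash the switch computes.
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

IP = ("src_ip", "dst_ip")
IP_L4 = ("src_ip", "dst_ip", "src_port", "dst_port")

flows = [{"src_ip": f"10.0.{i // 250}.{i % 250 + 1}", "dst_ip": "10.1.0.10",
          "src_port": 1024 + i, "dst_port": 80} for i in range(1000)]

def egress_split(tier1, tier2):
    """Flows per Po100 member on N7K1, given (fields, rotate) for each tier."""
    counts = [0, 0]
    for f in flows:
        if flow_hash(f, *tier1) % 2 == 0:           # upstream sent it to N7K1
            counts[flow_hash(f, *tier2) % 2] += 1   # N7K1 picks a member
    return counts

same = egress_split((IP_L4, 0), (IP_L4, 0))         # identical settings
diff_algo = egress_split((IP_L4, 0), (IP, 0))       # different algorithm
diff_rotate = egress_split((IP_L4, 0), (IP_L4, 2))  # different rotate value

print(same, diff_algo, diff_rotate)
```

With identical settings, one member stays idle. With either a different field set or a different rotate value on the second tier, the second hash no longer mirrors the first, and both members carry traffic.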