Recently, I looked into an interesting issue for one of our customers. They were experiencing random packet drops: users accessing applications running in their NSX environment suffered intermittent disconnections.
They were running NSX-V 6.4. Their applications were connected to logical switches behind the DLR, and two perimeter edges running in Equal Cost Multi-Pathing (ECMP) mode provided north-south connectivity to the workloads. The topology is similar to the one below:
In this post, I will focus on two features that need your attention when configuring your edges in ECMP mode to avoid such packet dropping issues.
Disable ECMP-Edge firewall
The edge firewall is a stateful service: it performs stateful packet inspection and tracks the state of network connections. With ECMP, return traffic can arrive on a different edge than the one that forwarded the request. That edge has no matching connection state, so it may drop this asymmetric traffic. The firewall needs to be disabled for ECMP to operate correctly.
So the first rule of thumb is to disable the edge firewall on ECMP edges.
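To make the asymmetric case concrete, here is a minimal sketch in Python. It is a toy model and not how the NSX edge firewall is implemented: the class name, the edge names, and the IP addresses are all made up for illustration. It only shows why a per-edge connection table rejects a reply that comes back through the other edge.

```python
# Toy model (hypothetical, for illustration only): each edge keeps its OWN
# connection table, as a stateful firewall would.
class StatefulEdge:
    def __init__(self, name):
        self.name = name
        self.connections = set()  # flows this edge has seen being opened

    def outbound(self, src, dst):
        """Workload -> outside: record the flow and let it through."""
        self.connections.add((src, dst))
        return True

    def inbound(self, src, dst):
        """Outside -> workload: accept only replies to a flow this edge tracked."""
        return (dst, src) in self.connections


edge1 = StatefulEdge("edge-1")
edge2 = StatefulEdge("edge-2")

# Made-up addresses: an internal VM and an external client.
vm, client = "172.16.10.5", "198.51.100.7"

edge1.outbound(vm, client)              # request leaves via edge-1
symmetric = edge1.inbound(client, vm)   # reply returns via edge-1
asymmetric = edge2.inbound(client, vm)  # reply returns via edge-2 (ECMP)

print("reply via edge-1 accepted:", symmetric)   # True
print("reply via edge-2 accepted:", asymmetric)  # False
```

With ECMP, nothing guarantees that the reply takes the same edge as the request, so the second case is the one that shows up as random drops.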
Disable Reverse Path Filtering
This was the cause of our issue.
On an NSX Edge, Reverse Path Filtering (RPF) is enabled by default.
When RPF is enabled, the Edge forwards a packet only if it is received on the same interface that the Edge would use to send traffic back to the packet's source. If the route to the packet's source address goes through a different interface than the one the packet arrived on, the packet is dropped.
For more information, see the VMware KB article below:
https://kb.vmware.com/s/article/2127073
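The check described above can be sketched in a few lines of Python. This is a simplified illustration of a strict reverse path check, not the Edge's actual code path; the routing table, the interface roles, and the addresses are assumptions made up for the example.

```python
import ipaddress


def best_route(routes, address):
    """Longest-prefix match: return the interface of the most specific route."""
    addr = ipaddress.ip_address(address)
    matches = [(net, ifc) for net, ifc in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_accepts(routes, src_ip, in_interface):
    """Strict check: accept only if the route back to the source
    uses the same interface the packet arrived on."""
    return best_route(routes, src_ip) == in_interface


# Hypothetical routing table: default route on the uplink, one internal subnet.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "vNic_0"),
    (ipaddress.ip_network("172.16.10.0/24"), "vNic_1"),
]

# Source 172.16.10.5 is routed via vNic_1, so it passes only when it arrives there.
print(rpf_accepts(routes, "172.16.10.5", "vNic_1"))   # True
print(rpf_accepts(routes, "172.16.10.5", "vNic_0"))   # False, packet dropped
print(rpf_accepts(routes, "198.51.100.7", "vNic_0"))  # True, matches default route
```

In an asymmetric design, the second case is perfectly legitimate traffic, which is why the filter has to be turned off.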
So the second rule of thumb is to disable RPF on all Edges participating in an asymmetric routing environment.
To disable RPF via the GUI:
To disable RPF via the REST API, make the following API call to NSX Manager:
PUT https://NSX_mgr_IP/api/4.0/edges/<edge-ID>/systemcontrol/config
<systemControl>
  <property>sysctl.net.ipv4.conf.all.rp_filter=0</property>
  <property>sysctl.net.ipv4.conf.vNic_0.rp_filter=0</property>
  <property>sysctl.net.ipv4.conf.vNic_1.rp_filter=0</property>
  <property>sysctl.net.ipv4.conf.vNic_2.rp_filter=0</property>
  <property>sysctl.net.ipv4.conf.vNic_3.rp_filter=0</property>
</systemControl>
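If you have several edges to update, the request can be scripted. The sketch below uses only the Python standard library to build the same XML body and a PUT request for the URL shown above. The manager hostname, edge ID, and credentials are placeholders, and the use of HTTP basic authentication and an application/xml content type are my assumptions; verify them against the API guide for your NSX version. The script only builds the request and prints it; it does not send anything.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET


def build_rpf_payload(vnic_count=4):
    """Build the <systemControl> body that sets rp_filter=0 on 'all' and each vNic."""
    root = ET.Element("systemControl")
    names = ["all"] + [f"vNic_{i}" for i in range(vnic_count)]
    for name in names:
        prop = ET.SubElement(root, "property")
        prop.text = f"sysctl.net.ipv4.conf.{name}.rp_filter=0"
    return ET.tostring(root, encoding="unicode")


def build_request(manager, edge_id, user, password, vnic_count=4):
    """Prepare (but do not send) the PUT request for one edge."""
    url = f"https://{manager}/api/4.0/edges/{edge_id}/systemcontrol/config"
    # Assumption: NSX Manager accepts HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=build_rpf_payload(vnic_count).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/xml",  # assumption: XML request body
        },
    )


# Placeholder values; replace with your own manager, edge ID, and credentials.
req = build_request("nsx-manager.example.com", "edge-1", "admin", "changeme")
print(req.get_method(), req.full_url)
print(req.data.decode())
# To actually send it: urllib.request.urlopen(req), with TLS verification
# set up for your NSX Manager certificate.
```

Adjust vnic_count to match the interfaces actually configured on the edge; the default of 4 simply mirrors the vNic_0 to vNic_3 entries in the call above.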
In the command output, 0 means disabled and 1 means enabled.
To check the RPF dropped packet count:
This command shows you the number of packets being dropped by RPF if you are experiencing such an issue.
Conclusion
To avoid packet drops when running ECMP with asymmetric routing, always disable Reverse Path Filtering (RPF) and the firewall on your NSX edges.
Hope this post is informative,
Thank you for reading,
Mohamad Alhussein