Lately I found myself beating my head against the wall troubleshooting an issue on a nested VCF 4.0 SDDC environment which was not working properly. In this article I'm going to explain how the NSX-T GENEVE overlay tunnels were not coming up and what steps I had to take to fix the problem.
During bring-up of my SDDC lab (using VCF 4.0) I had two NSX-T 3.0 edge transport nodes running on the same management domain (cluster), sharing the same vDS (7.0) as the host transport nodes. Each host has only 2 physical NICs (pNICs). In design jargon this is referred to as a "collapsed Edge & Compute cluster".
More specifically, the edge NSX-T N-VDS is deployed on the native vSphere vDS, which is a new feature of vSphere 7.0+ (see the great article NSX-T on VDS 7 Guide for history and details).
I am deploying my SDDC with AVN (more about AVN in this post) and I have the following two T1 GW logical segments (known in the legacy Advanced UI as downlink logical router ports):
- 192.168.31.0/24 local to site A (region specific)
- 192.168.11.0/24 stretched across site A and B (cross region to use a VVD term)
During bring-up, Cloud Builder (CB) failed at validating network connectivity of said T1 segments because they were not reachable. Upon inspecting the NSX-T Manager it was clear that the GENEVE tunnels were not coming up; at least 50% of them weren't. Confusing, right?
From the edge transport node, running the command get bfd-sessions (filtered to specific fields for readability) confirmed the same. The VLAN tunnels were up because they do not require any encapsulation.
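For reference, this is the command run from the edge appliance console (the prompt name is illustrative):

```
nsx-edge-01> get bfd-sessions
```

In my case the sessions using VLAN encapsulation reported as up, while the GENEVE ones stayed down.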
The main consequence of the tunnels being down is that the BGP neighbourships can't form with the external L3 core (the neighbours being the T0 and T1 SR gateways), which is why CB could ping neither 192.168.31.254 nor 192.168.11.254.
I’ll spare you all the steps and tests I actually did and jump straight to the important point, which is:
This is a good article: https://blogs.vmware.com/networkvirtualization/2018/10/flexible-deployment-options-for-nsx-t-edge-vm.html. See the section "NSX-T Edge VM deployed on a compute host using N-VDS Logical switches in Collapsed Compute and Edge Cluster", which is exactly what a VCF 4.0 bring-up configures from my JSON configuration.
Here is the detail I missed that was causing all my problems:
The Compute TEP IP and Edge TEP IP must be in different VLANs. Hence, you need an additional VLAN on the underlay. N-S traffic from compute workloads is encapsulated in GENEVE and sent to the edge node with the source IP as the Compute TEP and the destination IP as the Edge TEP. Since these TEPs must sit in different VLANs/subnets, this traffic must be routed via the ToR.
This is also very well explained in the NSX-T 2.5 design guide: "This requirement is coming from protecting the host overlay from VM generating overlay traffic".
And there it is! I was trying to use the same VLAN (150, subnet 192.168.150.0/24) for both the edge and compute transport node TEPs. As soon as I provisioned a new VLAN (160, subnet 192.168.160.0/24) for the edge transport nodes, the tunnels immediately came up.
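The subnet distinction is easy to sanity-check with a few lines of Python (a throwaway sketch using the standard ipaddress module; the individual TEP addresses are illustrative picks from my two subnets):

```python
import ipaddress

def teps_routed_via_tor(host_tep: str, edge_tep: str, prefix: int = 24) -> bool:
    """True if the host TEP and edge TEP sit in different subnets,
    i.e. overlay traffic between them must be routed via the ToR."""
    host_net = ipaddress.ip_interface(f"{host_tep}/{prefix}").network
    edge_net = ipaddress.ip_interface(f"{edge_tep}/{prefix}").network
    return host_net != edge_net

# What I had originally: both TEPs on VLAN 150 -> same subnet (pre-3.1: tunnels down)
print(teps_routed_via_tor("192.168.150.11", "192.168.150.21"))  # False

# After moving the edge TEPs to VLAN 160 -> different subnets (tunnels up)
print(teps_routed_via_tor("192.168.150.11", "192.168.160.11"))  # True
```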
*** 19/12/2020 Update ***
This requirement has been lifted in NSX-T 3.1. From the release notes:
Inter-TEP communication within the same host: Edge TEP IP can be on the same subnet as the local hypervisor TEP. https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/rn/VMware-NSX-T-Data-Center-31-Release-Notes.html#whats_new
However, 3.1 isn't yet supported on VCF, not even on VCF 4.1, which is the latest version as I write this update to the article.