Topics covered in this article:
- NSX L2 VPN Overview
- NSX L2 VPN Use cases
- NSX L2 VPN Topology
- NSX L2 VPN Server Configuration
- NSX L2 VPN Client Configuration
- Testing L2 VPN Connectivity
NSX L2 VPN Overview
One cool feature of the NSX ESG (Edge Services Gateway) is L2 VPN, which lets you stretch an L2 subnet over L3, tunnelled through an SSL VPN. The two sites form an L2 adjacency, which can be either intra-DC (within the same data centre) or across data centre locations.
Back when I did my ICM class, NSX for vSphere 6.0 only supported stretching (read: trunking) a single VLAN, but as of NSX for vSphere 6.1 it’s possible to trunk multiple logical networks, whether that’s VLAN to VLAN, VLAN to VXLAN or VXLAN to VXLAN. It’s also possible to deploy a standalone ESG on a remote site that isn’t “NSX enabled”, in which case it connects to a VLAN (see the Use Cases section).
I won’t hide that I struggled at the beginning to understand the following two new concepts, primarily because I couldn’t find proper documentation or articles covering the topic, so I decided to wrap up my discoveries in this post.
- Trunk interface: allows multiple internal networks (either VLAN or VXLAN) to be trunked. The key word for me is internal! Having learnt networking from Cisco (CCNA), every time I hear the word trunk I think of an uplink, which is effectively what the trunk interface ends up doing. But because the interface type Uplink already existed in previous versions, this really confused me, until I realised that yes, it’s a trunk, but still for internal networks!
- Local Egress Optimization: this one is easier to understand. It enables the ESG to route any packets sent towards the Egress Optimization IP address locally, and to send everything else over the tunnel. Why? VM mobility! If the default gateway for the virtual machines (belonging to the subnets you’re stretching) is the same across the two sites, you need this setting to ensure traffic is routed locally on each site. Should you need to migrate a VM from one site to the other, you can do so without touching the guest OS network configuration. Nice, eh?!
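To make the Local Egress Optimization idea concrete, here is a minimal Python sketch of the decision as described in the bullet above. It is only a model for reasoning, not how the ESG is implemented: the function name and the "local"/"tunnel" labels are hypothetical, and 172.16.10.254 is the stretched gateway address used in the lab later in this article.

```python
from ipaddress import ip_address

# Hypothetical model of Local Egress Optimization, not NSX code.
# Assumption: the ESG holds a set of egress optimization IPs (the local
# default gateways) and treats every other destination as stretched traffic.
EGRESS_OPTIMIZATION_IPS = {ip_address("172.16.10.254")}


def egress_decision(dst_ip: str) -> str:
    """Return 'local' if the packet targets the local gateway, else 'tunnel'."""
    if ip_address(dst_ip) in EGRESS_OPTIMIZATION_IPS:
        return "local"   # handled by the ESG on this site
    return "tunnel"      # sent to the peer site over the L2 VPN


if __name__ == "__main__":
    for dst in ("172.16.10.254", "172.16.10.12"):
        print(dst, "->", egress_decision(dst))
```

Because both sites apply the same rule with the same gateway IP, a VM keeps its default gateway setting after a migration and still exits through the ESG of whichever site it lands on.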
NOTE: One “disadvantage” that I have noticed is that stretching a logical network requires its gateway interface to “reside” on the ESG, which means no DLR LIF: the interface will not be distributed, nor will it perform at line rate. In other words, instead of east-west traffic it’s north-south, as it used to be with the vCloud vShield Edge. If you have some tiers at the Distributed Logical Switch level and others at the ESG level, from a design standpoint pay attention to traffic going northbound, as the ESG will be your bottleneck!
That said: one, you could implement ECMP to relieve this bottleneck; two, I’m led to believe that L2 VPN is meant to be a temporary solution as opposed to a permanent one (think of migrations, for example). I’m looking forward to hearing comments on these two points.
NSX L2 VPN Use Cases
One big use case is “cloud bursting”, where a private cloud service bursts to a public cloud when demand spikes. Effectively, a hybrid cloud solution.
The following diagrams are taken from the NSX 6.1 Administration Guide.
As you can see in this scenario, VLAN 10 on site A is stretched to VXLAN 5010 on site B. Similarly, VLAN 11 is stretched to VXLAN 5011 on site B.
Again, this is an example where an NSX data centre is extended to a non-NSX data centre.
In this scenario, which could be used for a private cloud to private cloud migration or DR, VXLANs 5010 and 5011 have been stretched to site B and mapped to the same VNIs.
The concept is very similar to Cisco OTV. The nice thing is that you don’t need expensive Cisco hardware, because it’s all software (yay, SDN rocks!).
Sounds cool, eh? Let’s get into the nitty-gritty 🙂
NSX L2 VPN Topology
This is the topology I’m working with, which is a VXLAN to VXLAN extension. As you can see, I’m stretching VXLAN 5004 at Site B (Branch Web Tier) to VXLAN 5003 at Site A (Web Tier).
Let’s see how to configure it.
L2 VPN Server configuration (Site A)
Select the Edge Gateway > Manage > Settings > Interfaces. We need to create the Trunk interface and inside it the sub-interface mapped to the logical switch Web-Tier(5003).
- Select an unused interface (for me it’s vNIC2) > Edit. Select type Trunk and the distributed port group it connects to (here Mgmt_vDS_L2VPN_Trunk)
- Click on + to configure the sub-interface. This is where you map the sub-interface to the VXLAN (5003) and give it the IP address that will be the “stretched default gateway”.
Enable Send Redirect: I had no clue what this option was. The explanation in nsx_61_admin.pdf is pretty poor; it only says: “Enable Send Redirect to convey routing information to hosts”. What the heck does this mean?
Turns out this option enables ICMP Redirect on the Edge, which means the ESG will inform the ESXi hosts that a better route to a particular subnet (in my case 172.16.10.0/24) is available, and that the best gateway to reach it is 172.16.10.254. Subsequent packets from ESXi destined for hosts residing on the stretched subnet 172.16.10.0/24 will be sent directly to 172.16.10.254. Although it may sound like a nice feature, from a security perspective it’s not. Why? Attackers could send forged redirects to alter hosts’ routing tables and divert their traffic. That’s why this option is disabled by default. There’s a good article from Cisco that explains this in more detail.
I need to thank NSX guru Michael Haines (Snr. Architect within NSBU at VMware) for clarifying this to me! Thanks Michael 🙂
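For the curious, this is roughly what an ICMP Redirect looks like on the wire. The sketch below only builds the message bytes following the RFC 792 layout (type 5, a code, a checksum, the gateway address, then the original IP header plus the first 8 bytes of its payload). It does not send anything and says nothing about how the ESG generates redirects internally; the function names are my own, and the 20-byte IP header length assumes no IP options.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum over 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_icmp_redirect(gateway_ip: str, original_datagram: bytes,
                        code: int = 1) -> bytes:
    """Build an ICMP Redirect (type 5); code 1 means 'redirect for host'."""
    gateway = socket.inet_aton(gateway_ip)
    # Assumption: 20-byte IP header (no options) + first 8 payload bytes.
    payload = original_datagram[:28]
    header = struct.pack("!BBH4s", 5, code, 0, gateway)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBH4s", 5, code, checksum, gateway) + payload


if __name__ == "__main__":
    # Placeholder datagram of zeros, purely for illustration.
    message = build_icmp_redirect("172.16.10.254", bytes(28))
    print(message.hex())
```

Nothing in this message authenticates the sender, which is why a host that honours redirects can be steered by anyone able to inject one on the segment.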
- After clicking on Trunk, the configuration should look like this:
- Create your self-signed SSL certificate for the VPN. Settings > Certificates > Actions > Generate CSR (Certificate Signing Request)
- Fill in the usual certificate details
- With the CSR selected > Actions > Self Sign Certificate, then enter the number of days the certificate should be valid for. This is how it should look:
- VPN tab > L2 VPN. Under L2VPN Mode select Server, then click Change
Decide which IP address the VPN Server should listen on, the port, the encryption algorithm and the self-signed certificate
- Under Site Configuration, click the + symbol; this is where we configure the site details. The username and password must match on the client side.
and it should look like this
- Last, Enable the VPN and Publish the changes
- Because the VPN Client isn’t configured yet, the tunnel is expected to be in a down state. Click on Show L2VPN Statistics to see this
L2 VPN Client configuration (Site B)
- On the ESG on site B acting as VPN Client, repeat the first three steps performed for the Server to create a Trunk interface with the same IP address, 172.16.10.254, mapped to VXLAN 5004 (Branch Web Tier)
- VPN tab > L2VPN > set L2VPN Mode to Client, then click Change. Here we enter the Server listener IP, select the stretched interface, set the egress optimization gateway address to 172.16.10.254 and provide the same credentials used on the server.
Enable the service and publish the changes. If you’re lucky, after a few seconds fetching the tunnel status should show the tunnel as UP
- Likewise on the Server side
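If the tunnel refuses to come up, the first things worth comparing are the values that have to agree between the two Edges. The sketch below is a plain Python checklist mirroring this lab, not an NSX API client: the class and field names are hypothetical, and the listener IP and credentials are placeholders (only 172.16.10.254 comes from the configuration above).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServerSide:
    listener_ip: str        # IP the L2 VPN server listens on
    user_id: str
    password: str
    sub_interface_ip: str   # stretched default gateway on the server trunk


@dataclass
class ClientSide:
    server_address: str     # must point at the server listener IP
    user_id: str
    password: str
    sub_interface_ip: str   # same gateway IP on the client trunk
    egress_optimization_ip: str


def find_mismatches(server: ServerSide, client: ClientSide) -> List[str]:
    """Return human-readable problems; an empty list means all checks pass."""
    problems = []
    if client.server_address != server.listener_ip:
        problems.append("client does not point at the server listener IP")
    if (client.user_id, client.password) != (server.user_id, server.password):
        problems.append("credentials differ between server and client")
    if client.sub_interface_ip != server.sub_interface_ip:
        problems.append("stretched gateway IP differs between sites")
    if client.egress_optimization_ip != client.sub_interface_ip:
        problems.append("egress optimization IP is not the local gateway")
    return problems


if __name__ == "__main__":
    # Placeholder listener IP and credentials, for illustration only.
    server = ServerSide("192.0.2.10", "l2vpn-user", "example-only",
                        "172.16.10.254")
    client = ClientSide("192.0.2.10", "l2vpn-user", "example-only",
                        "172.16.10.254", "172.16.10.254")
    print(find_mismatches(server, client) or "no mismatches found")
```

The last two checks reflect how this particular lab is set up (same gateway IP on both trunks, used as the egress optimization address); other designs may differ.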
So here we go! We have the L2 VPN up and running. Time to test it!
Testing L2 VPN Connectivity
Again, this is the topology
If the L2 VPN tunnel is up I should be able to ping web-sv-03a (172.16.10.12) from web-sv-01a (172.16.10.10)
Duplicate packets (DUP!) are expected because the environment is nested (Promiscuous Mode set to Accept).
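The reason this ping is a meaningful test is that both VMs sit in the same /24, so no routing is involved: the echo request has to cross the tunnel at layer 2. A small sketch using Python's standard ipaddress module confirms that with the addresses from this lab (the dictionary and function names are just for illustration):

```python
from ipaddress import ip_address, ip_network

# Addresses taken from this lab; the names are only labels.
STRETCHED_SUBNET = ip_network("172.16.10.0/24")
HOSTS = {
    "web-sv-01a": "172.16.10.10",
    "web-sv-03a": "172.16.10.12",
    "gateway": "172.16.10.254",
}


def on_stretched_subnet(ip: str) -> bool:
    """True if the address belongs to the stretched subnet."""
    return ip_address(ip) in STRETCHED_SUBNET


if __name__ == "__main__":
    for name, ip in HOSTS.items():
        print(f"{name:12} {ip:15} same subnet: {on_stretched_subnet(ip)}")
```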
Checking the ARP table, we can see that 172.16.10.254 is mapped to MAC address 00:50:56:a1:11:cb, which belongs to the Trunk interface (vNIC2) of the Perimeter ESG (VPN Server).
Now from web-sv-03a:
and again let’s check the ARP table:
This time 172.16.10.254 is mapped to MAC address 00:50:56:a1:4a:bc which is the Branch ESG vNIC2
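The two ARP tables are the proof that egress optimization is doing its job: same gateway IP, different MAC on each site. As a rough sketch, the comparison can be expressed like this in Python, using the MAC addresses observed above (the data structure and function name are hypothetical, and the entries are typed in by hand rather than parsed from any command output):

```python
from typing import Dict

# ARP entries as observed on each VM in this lab, keyed by VM name.
ARP_VIEWS: Dict[str, Dict[str, str]] = {
    "web-sv-01a": {"172.16.10.254": "00:50:56:a1:11:cb"},  # Perimeter ESG
    "web-sv-03a": {"172.16.10.254": "00:50:56:a1:4a:bc"},  # Branch ESG
}


def gateway_is_local_per_site(views: Dict[str, Dict[str, str]],
                              gateway_ip: str) -> bool:
    """True if every VM resolved the gateway, each to a different MAC."""
    macs = [view.get(gateway_ip, "").lower() for view in views.values()]
    return all(macs) and len(set(macs)) == len(macs)


if __name__ == "__main__":
    print(gateway_is_local_per_site(ARP_VIEWS, "172.16.10.254"))
```

If both VMs had resolved 172.16.10.254 to the same MAC, one site would be sending its routed traffic across the tunnel to the other site's ESG.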
This concludes this article on L2 VPN on NSX but stay tuned as more will come! 🙂