NSX for Newbies – Part 6: Distributed Logical Router (dLR)

Topics covered in this article

  • dLR concepts
  • dLR deployment
  • dLR deployment verification steps

dLR concepts

What is a Distributed Logical Router (dLR)?

As with a traditional modular Cisco switch, a Distributed Logical Router (dLR) is made up of two distinct elements:

  • Control Plane, represented by a virtual machine called the Logical Router (LR) Control VM. Dynamic routing protocols such as OSPF, BGP and IS-IS run between the Control VM and the upper layer, which in NSX is represented by the NSX Edge Gateway.
  • Data Plane (the "line cards"), represented by routing functionality at the hypervisor level, achieved by installing kernel modules (VIBs). I covered this in the post Introduction to NSX. A quick way to check this on a host is shown below.
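If you want to confirm the data-plane piece is actually in place, you can check the installed VIBs and loaded kernel modules on an ESXi host. This is only a minimal sketch: I'm assuming the VIB name esx-vsip and the kernel module name vdrb used by the NSX builds in my lab, so adjust the grep patterns to your release.

esxcli software vib list | grep esx-vsip

vmkload_mod -l | grep vdrb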

So what's the big deal? Well, with a traditional approach the L3 traffic from the hypervisor always has to go northbound to an external router, whether that's your physical L3 core switch or a virtual appliance running somewhere; this process is called hairpinning.
It's a sub-optimal path, we could say.
By moving the routing functionality into the hypervisor (kernel level) we effectively remove this sub-optimal path, because with the dLR each ESXi host can route between L3 subnets at line rate (or nearly line rate). The kind of traffic the dLR optimises and takes care of is VM to VM (or server to server), normally known as East-West traffic. In this logical diagram (sorry, I'm not using official VMware icons) you can see what I just described.
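A simple way to see the effect (a purely hypothetical example with made-up addresses, assuming two VMs on different logical switches behind the same dLR and traceroute installed in the guest) is to traceroute from a web VM, say 172.16.10.11, to an app VM, say 172.16.20.11: the only routed hop you should see is the dLR LIF on the local host, rather than a trip up to a physical router and back.

traceroute 172.16.20.11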

As you can imagine, in a typical (and increasingly common) 3-tier application there is a lot of interaction between the tiers: web server to application, application to database. Therefore having an optimised path with higher throughput is essential in modern SDN datacenters. It should be noted that the dLR kernel modules route between VXLAN (logical) subnets, as opposed to VLANs. Brad Hedlund wrote a great article that explains the details.

LR Control VM

As mentioned before, the Control VM is the control plane and it doesn't perform any routing, so if it dies virtual machine traffic keeps flowing. Learnt routes are pushed down to the hypervisors in a process that can be summarised as follows (a quick verification sketch follows the list):

  1. The NSX Edge Gateway (EGW) learns a new route from the "external world"
  2. The LR Control VM learns this route because it is a "neighbor" (adjacency) talking to the EGW via the Protocol Address
  3. The LR Control VM passes the new route to the NSX Controller(s)
  4. The NSX Controller pushes (in a secure manner) the new route to the ESXi hosts via the User World Agent (UWA), and the route gets installed on every host
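Each hop in this chain can be spot-checked. As a rough sketch (exact commands can vary slightly between NSX versions): on the LR Control VM console the learnt routes should show up in the routing table, and on the hosts they should show up in the net-vdr routing table covered in the verification section further down.

show ip route

net-vdr -l --route <dlr-name>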

Logical Interfaces (LIFs)

From the diagram you can see the dLR has several Logical Interfaces.

  • Internal LIFs, which act as the default gateway for each logical switch (web, app, db)
  • Uplink LIF, which connects to the "northbound world" for north-south traffic

LIFs are distributed to all ESXi hosts with the same IP address, and every host maintains an ARP table for every connected LIF.
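If you want to see the LIFs as they have been instantiated on a host, net-vdr can list them as well. Treat the option below as an assumption based on my lab notes (the instance name comes from the net-vdr -l -I output shown later in this post); the exact syntax may differ between NSX builds, so check the usage text net-vdr prints when run without arguments.

net-vdr -l --lif CloudLab+edge-6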

DLR Deployment

Distributed Logical Router configuration

From NSX Edges, click the + symbol. I like to have SSH access to the Control VM, so I'm enabling it. 🙂

Select the destination for the virtual machine

Click OK, Next

Management Interface Configuration: this is not a LIF, it's local to the Control VM and does not require an IP address. Even if you configure one, you wouldn't be able to reach it over a routed path because Reverse Path Forwarding (RPF) is enabled. Dmitri Kalintsev wrote a good article that explains this concept in more detail.
Configure interfaces on this NSX Edge: using the + symbol, repeat the wizard until you have created all the LIFs required in your environment.
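Purely as an illustration (names and addresses are made up for a lab, not taken from a real design), a three-tier setup typically ends up with something like:

  • Web LIF – 172.16.10.1/24 – Internal – connected to the Web logical switch
  • App LIF – 172.16.20.1/24 – Internal – connected to the App logical switch
  • DB LIF – 172.16.30.1/24 – Internal – connected to the DB logical switch
  • Uplink LIF – Uplink type – connected to a transit logical switch towards the EGW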

Next, optionally configure a default gateway for the dLR. This would typically be the EGW IP address.

Next, review and finish.
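If you like double-checking outside the UI, the NSX Manager REST API can list the deployed edges, and the new distributed router should appear there. A minimal sketch, assuming the NSX-v edge endpoint /api/4.0/edges and an NSX Manager reachable as nsxmgr (hostname and user are placeholders):

curl -k -u admin https://nsxmgr/api/4.0/edges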

This screenshot actually has a typo: the Management Interface should be connected to Mgmt_vDS_Mgmt, not Mgmt_vDS_vMotion.

 At the end of the deployment you should see all the network adapters connected.

In my environment I'm running NSX 6.1.1 and, although I can see all the LIF IP addresses assigned to the Control VM, the network adapters aren't actually listed.
I'm not sure whether this is expected behaviour in 6.1 or a bug, but I clearly remember that on 6.0 all the network adapters were listed.
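If you hit the same thing and want to confirm which port groups the Control VM's vNICs actually landed on, you can ask the ESXi host running the Control VM directly. A small sketch (the world ID comes from the first command's output for your Control VM):

esxcli network vm list

esxcli network vm port list -w <world-id>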



6.1 screenshot



6.0 screenshot

dLR deployment verification/troubleshooting steps

From NSX Controller run

show control-cluster logical-routers instance all

This command shows all the dLR instances and the corresponding hosts (VTEPs) that have joined them.
This is VERY useful when, for example, routes are not propagated from the Control VM to the hosts. I had this problem while configuring static routes on the Control VM for north-south connectivity: they were not pushed to the hosts, so the virtual machines could not talk to anything north of the Edge Gateway.
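While you are on the controller it is also worth checking which routes it holds for that dLR instance. Treat this as an assumption and verify against the CLI help on your controller build, but the logical-routers namespace should also accept a routes sub-command taking the LR instance ID shown by the command above:

show control-cluster logical-routers routes <lr-instance-id>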

From this screenshot host 192.168.110.52 is missing. It so happens that the NSX Controller and the LR Control VM were running on 192.168.110.52.
In this situation, and to confirm that this was actually the root of my problem, I migrated all the virtual machines running on 192.168.110.52 to 192.168.110.51 and BOOM! The static route I had set on the Control VM immediately appeared in the ESXi routing table.
This means something is wrong/needs fixing on that host!
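When a host is missing like this, another useful check is the control-plane channel itself, i.e. whether the User World Agent (netcpa) on that host has its connections to the controllers. A quick sketch, assuming the default controller port TCP 1234 and an NSX-prepared host:

esxcli network ip connection list | grep 1234

/etc/init.d/netcpad status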

How do you check the ESXi routing table?

1) From ESXi, find the dLR name with the command

net-vdr -l -I

2) Check the routing table installed on the host coming from that dLR

net-vdr -l --route CloudLab+edge-6

Running this command on the problematic host mentioned above returned an empty routing table, as follows:

From the NSX Controller

show control-cluster logical-switches vni 5001
show control-cluster logical-switches mac-table 5001
show control-cluster logical-switches arp-table 5001

The first command lists all the VTEPs that have joined VNI 5001 (in my case the Web tier).
The second lists the MAC addresses of the virtual machines that have joined VNI 5001 and are powered on.
The third command is the same as the previous one but shows the ARP entries, i.e. the IP-to-MAC resolution.
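The host-side counterpart of these checks is the esxcli vxlan namespace that the NSX VIBs add. As a sketch (the vDS name is a placeholder for whatever your compute vDS is called, and the namespace is only present once the host has been prepared for NSX):

esxcli network vswitch dvs vmware vxlan network list --vds-name <compute-vds-name>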

In the next post I’m going to cover the NSX Edge Gateway. Stay tuned!


11 Comments

  1. After creating the DLR and connecting different subnets, is manual or dynamic routing needed for proper routing between VMs connected to different logical switches? My setup is below

    1) VM1 – App 172.16.20.11
    2) VM2 – Web 172.16.10.11
    3) VM3 – DB 172.16.30.11

    DLR interfaces: 172.16.20.1, 172.16.10.1, 172.16.30.1

    From the VMs I can ping only their gateway, i.e. the LIF on the DLR. I cannot ping between VMs. Do I have to configure routing on the DLR?

    • Hi,

      Did you manage to solve that problem? I am having the same issue with NSX 6.2. I have traffic reaching the DLR and all the checks mentioned above are fine, but I am not able to pass any east-west traffic between 2 VMs on different subnets.

      Regards!

      • Hi Martin, can your VMs ping their default gateways? Have the ESXi hosts joined the VNIs?

        • Hi Giuliano,

          Yes, my VMs are able to ping their gateway but they are not able to ping each other. On the other hand I have noticed that from the DLR I am not able to ping the VMs, which is strange…

          I do believe that my ESXi hosts have joined; please find my output below:

          [root@localhost:~] net-vdr -l -I

          VDR Instance Information :
          ---------------------------

          Vdr Name: default+edge-10
          Vdr Id: 0x00001389
          Number of Lifs: 3
          Number of Routes: 4
          State: Enabled
          Controller IP: 192.168.200.18
          Control Plane IP: 192.168.200.24
          Control Plane Active: Yes
          Num unique nexthops: 1
          Generation Number: 0
          Edge Active: No

          [root@localhost:~] net-vdr -l --route default+edge-10

          VDR default+edge-10 Route Table
          Legend: [U: Up], [G: Gateway], [C: Connected], [I: Interface]
          Legend: [H: Host], [F: Soft Flush] [!: Reject] [E: ECMP]

          Destination GenMask Gateway Flags Ref Origin UpTime Interface
          ----------- ------- ------- ----- --- ------ ------ ---------
          0.0.0.0 0.0.0.0 192.168.200.1 UG 1 AUTO 66943 138900000002
          10.10.10.0 255.255.255.0 0.0.0.0 UCI 1 MANUAL 60983 13890000000a
          10.10.20.0 255.255.255.0 0.0.0.0 UCI 1 MANUAL 60752 13890000000b
          192.168.200.0 255.255.255.0 0.0.0.0 UCI 1 MANUAL 69108 138900000002

  2. Hi Sony,

    Have you checked whether your ESXi hosts have joined the VNIs? (See my troubleshooting section; have you tried that already?)

    Routing between VNIs (logical switches) does not require further configuration.

  3. Hi all, I had a problem when I was deploying the DLR. I cannot see the HA interface configuration (IP address), so I cannot assign an IP for management. Maybe that is the problem. Please help me solve this bug. Thanks all. I use vSphere ESXi 6.0, NSX 6.2.

  4. Dear all,

    It looks like I have solved my problem. It turned out to be a routing problem on my Linux test machines. As soon as the routing was correct, the expected behaviour was achieved.

    Thanks a lot!
    Martin

    • Hi Martin. I am running into exactly the same issue as you. I cannot ping between VMs on different logical networks. They can ping their own default gateway, and can even ping the default gateway of the target VM. When I log into the DLR, I cannot ping a VM IP. So, although traffic makes it to the DLR, it's a black hole from there.
      So the issue is either at the DLR, or some routing issue with my Windows VMs. What routing problem did you come across with your Linux VMs?

  5. Hello everyone,
    I have a problem when deploying a UDLR (not a DLR): in the HA Configuration section, under Configure interfaces of this NSX Edge
    (Add interface > Connect to > Select), no logical switch is shown; the list is empty.


  2. NSX Link-O-Rama | vcdx133.com (Pingback)