Problem
After an unexpected power-off, my vRealize Suite Lifecycle Manager appliance came back up in this state:
No networking detected. Please login and run /opt/vmware/share/vami/vami_config_net
Knowing that the networking properties are set via vApp options, my first thought was to shut down the appliance and check that the settings were all still there. They were.
I powered the appliance back on and tried /opt/vmware/share/vami/vami_config_net, but I got nowhere. The reconfiguration threw the following error:
[ERROR] Network FAILED to restart using new parameters eth0 (Errno 2 : )
Something was smelly with the network.
Checking the port group in vCenter, the vNIC showed as connected. However, most of the port IDs were in a blocked state.
In my case the distributed switch is managed by NSX-T Manager. After reading the KB article "Ports connected to N-VDS show in a Blocked State" (66796), I suspected the cause was the one the article describes:
In the event where the Transport Node loses connectivity to the controller, the controller will start a 24 hour countdown. After 24 hours, if the communication between the transport node and the controller is still down, the controller will purge the Transport Node information from its database.
Now if the host/controller communication is restored after 24 hours have expired, the controllers push an empty dataset to the host, resulting in all the ports on the ESXi host connected to N-VDS to go in a “Port Blocked state”.
This KB, however, points out that the problem was fixed in 2.4.1 and later, so I was inclined to rule it out for my environment, which runs 3.1.3. Moreover, the problem only manifested itself after I powered all the virtual machines back on. That said, it did make me think about CCP to LCP (central control plane to local control plane) communication and the possibility that something was “stuck” with the virtual machine port ID status at the host level.
Long story short: I fixed the problem by simply switching the VM vNIC to a different port group, applying the setting and then reverting to the original NSX-T port group. This evidently triggered a sync, and the update previously pushed by the CCP was eventually refreshed. I had to apply this workaround to all VMs attached to this port group.
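With more than a handful of VMs, clicking through this by hand gets tedious, so the order of operations could be scripted. The following is only a minimal Python sketch of that ordering, not something I ran in my environment. The names `Nic`, `plan_bounce` and `apply_plan`, as well as the port group names, are hypothetical. The `reconnect` callable is a placeholder for whatever actually reconfigures the vNIC (vSphere Client, PowerCLI or an API client), which is deliberately left out here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Nic:
    """A VM network adapter and the port group it is attached to (hypothetical model)."""
    vm_name: str
    label: str
    port_group: str


# One step = (vm_name, nic_label, port_group_to_attach_to)
Step = Tuple[str, str, str]


def plan_bounce(nics: Iterable[Nic], affected_pg: str, temp_pg: str) -> List[Step]:
    """Return the ordered steps that move every vNIC on affected_pg
    to temp_pg and then straight back to affected_pg."""
    if affected_pg == temp_pg:
        raise ValueError("temporary port group must differ from the affected one")
    steps: List[Step] = []
    for nic in nics:
        if nic.port_group != affected_pg:
            continue  # vNICs on other port groups are left untouched
        steps.append((nic.vm_name, nic.label, temp_pg))      # move away
        steps.append((nic.vm_name, nic.label, affected_pg))  # revert
    return steps


def apply_plan(steps: Iterable[Step], reconnect: Callable[[str, str, str], None]) -> int:
    """Run each step through the caller-supplied reconnect function.
    Returns the number of steps executed."""
    count = 0
    for vm_name, label, port_group in steps:
        reconnect(vm_name, label, port_group)
        count += 1
    return count


if __name__ == "__main__":
    # Example inventory; all names are made up for illustration.
    inventory = [
        Nic("vrslcm", "Network adapter 1", "nsx-mgmt-segment"),
        Nic("other-vm", "Network adapter 1", "some-other-pg"),
    ]
    plan = plan_bounce(inventory, "nsx-mgmt-segment", "temp-pg")
    # Dry run: print each step instead of touching any VM.
    apply_plan(plan, lambda vm, nic, pg: print(f"{vm}: {nic} -> {pg}"))
```

Keeping the plan separate from the function that applies it means the steps can be printed and reviewed as a dry run before anything is changed.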