How to fix ESXi hosts showing out of sync with a distributed switch

Recently I was in charge of designing and implementing a new vSphere and vSAN infrastructure for a customer.

Everything went fine, except that we experienced many network outages caused by misconfigured firewall virtual appliances that the whole infrastructure relies on.

As a result, all ESXi hosts were disconnected from vCenter, and this happened many times!

Anyway, the network team eventually fixed the issue and the network became stable, but I have to say that I’m not a fan of firewall virtual appliances, as they bring limitations and complexity. Unfortunately, those outages had a bad impact on the virtualization network: warnings appeared showing some ESXi hosts out of sync with the distributed switch.

Distributed Switch Warning

This might not look like a big deal at first, but this kind of issue needs to be addressed to avoid network problems in the future.

Basically, this warning means that some ESXi hosts hold a different configuration than the other hosts and the distributed switch. As a result, any configuration change you make to the vDS might not be applied on the hosts that are out of sync.

Now let’s fix that network!

The first thing I tried was clicking the “Rectify” button, hoping that would solve the issue, but it didn’t. By the way, I have implemented distributed switches many times and had never seen this error before!

Distributed Switch Configuration Issue Details

The second thing to do was to go to each ESXi host mentioned in the warning message and click the “Rectify” button there. But again, no improvement.

Then I decided to put the two hosts in maintenance mode and reboot them. As I was running out of time, I also opened a case with VMware just in case. After rebooting both ESXi hosts, the warning disappeared, and later on I received a call from VMware.

The support engineer I talked to said that what I had done so far was the right thing to do, but that there were other steps and checks to perform. He also confirmed that rebooting the hosts generally fixes things once and for all.

After connecting to each out-of-sync ESXi host, we found the following in the logs:

2020-06-30T13:15:27.663Z warning hostd[2585928] [Originator@6876 sub=Hostsvc.NetworkProvider opID=1908c39c-4525 user=vpxuser] Skip saving dvport SW-DVS-PRD-00-134 to /vmfs/volumes/vsan:52759aa41b9a53da-cc69130ab780d0b8/fe13f25e-5f3f-acc3-97c1-1402ec77cfb4/.dvsData/50 1a 79 90 b3 01 ab e3-f6 1c a7 26 30 e2 b7 71/134: failed to create dir

This error indicates that the dvPort “134” couldn’t be saved to the directory shown above because the host failed to create that directory.
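If several hosts or ports are affected, it can help to pull the relevant pieces out of each warning instead of reading them by eye. Here is a minimal sketch in Python, assuming the message always has the same wording as the line quoted above (other hostd builds may phrase it differently). The function name and the dictionary keys are my own and are not part of any VMware tool.

```python
import re

# Matches the hostd warning quoted above. Assumption: the wording
# "Skip saving dvport <name> to <path>: failed to create dir" is stable.
SKIP_SAVING = re.compile(
    r"Skip saving dvport (?P<dvport>\S+) to (?P<path>.+): failed to create dir\s*$"
)


def parse_skip_saving(line):
    """Return details of a 'Skip saving dvport' warning, or None if no match."""
    match = SKIP_SAVING.search(line)
    if match is None:
        return None
    port_dir = match.group("path")
    # The last path component is the dvPort number (134 in the example).
    parent_dir, _, port_id = port_dir.rpartition("/")
    return {
        "dvport": match.group("dvport"),
        "port_id": port_id,
        "port_dir": port_dir,
        "parent_dir": parent_dir,
    }
```

Feeding it the log line above gives the port number to note for the later steps, along with the directory the host could not create.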

Apparently this is something that can occur with a vSAN datastore. In that situation, you need to manually create the folder on each out-of-sync host with the command below:

mkdir /vmfs/volumes/vsan:52759aa41b9a53da-cc69130ab780d0b8/fe13f25e-5f3f-acc3-97c1-1402ec77cfb4/

Then click on “Rectify” on each host, and this should solve the issue.
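Since the path differs on every datastore, the mkdir command above has to be rebuilt from each host's own log line. The sketch below shows one way to derive it, assuming the folder to create is always the one that contains `.dvsData`, as in my case. The helper name is hypothetical, and it only builds the command string. It does not run anything on the host.

```python
import shlex


def mkdir_command_for(port_dir):
    """Build the mkdir command for the folder that contains .dvsData.

    port_dir is the full path taken from the 'failed to create dir' warning.
    """
    base, marker, _ = port_dir.partition("/.dvsData/")
    if not marker:
        raise ValueError("path does not contain a .dvsData component")
    # shlex.quote only adds quotes when the path needs them (e.g. spaces).
    return "mkdir " + shlex.quote(base + "/")
```

Called with the path from the warning shown earlier, this produces the same command I ran by hand.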

If none of the methods above worked, there is one last thing to try: performing a task that both ESXi and vCenter are aware of, to force them to re-synchronize. The whole procedure, taken from a VMware KB article, is reproduced below.

Method #1 (using the vCenter Flash Client): 

  1. Create a new blank VM (no OS needed)
  2. Edit the VM to put the VM in the portgroup from Step #5 above, click Ok
  3. Edit the VM again, expand the “Network Adapter” section
  4. In the “Port ID” field, enter the dvPort number noted in Step #2
  5. Power the VM up
  6. Migrate the VM to the host listed in Step #2
  7. Navigate to Flash Client > Networking > vDS > Ports, click on “Start Monitoring Port State” button
  8. Migrate the VM off the host listed in Step #2
  9. Refresh the Flash Client, the host should now be off the ‘out of sync’ list from Step #1

Method #2:

  1. Create a new blank VM (no OS needed) 
  2. Edit the VM to put the VM in the portgroup from Step #5 above, click Ok
  3. SSH into the host with the blank VM
  4. Navigate to the blank VM’s .vmx file
  5. Edit the .vmx file with “vi” editor
  6. Change the value of “ethernet0.dvs.portId” to the port number from Step #2 above
  7. Power the VM up
  8. Migrate the VM to the host listed in Step #2
  9. Migrate the VM back off this host
  10. Refresh the vCenter Client

  • Note: If there is a VM using the dvPort from Step #2, try the following:
  1. Create a new portgroup with identical settings (VLAN, Teaming Policy, etc) to the portgroup from Step #5
  2. Edit the VM, and put it in the new portgroup
  3. After a short wait, refresh the vSphere Client to see if the host is in sync
  4. If the host is still out of sync, migrate the VM to a different host in the cluster
  5. If the host is still out of sync, try Method #1 or #2 above now that the dvPort from Step #2 is unused 
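For Method #2, the .vmx edit can be scripted instead of done by hand in vi. Below is a minimal sketch in Python, assuming the usual `key = "value"` layout of a .vmx file. The function name and the sample content in the usage note are hypothetical. It works on the file's text only, so reading and writing the actual file, and taking a backup copy first, are left to you.

```python
import re


def set_vmx_value(vmx_text, key, value):
    """Return vmx_text with `key` set to `value`, adding the key if missing."""
    new_line = '{} = "{}"'.format(key, value)
    # Match the whole line that defines the key, e.g. ethernet0.dvs.portId = "12"
    pattern = re.compile(r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE)
    if pattern.search(vmx_text):
        # A function replacement avoids backslash interpretation in new_line.
        return pattern.sub(lambda _match: new_line, vmx_text)
    # Key not present yet: append it on its own line.
    if vmx_text and not vmx_text.endswith("\n"):
        vmx_text += "\n"
    return vmx_text + new_line + "\n"
```

For example, `set_vmx_value(text, "ethernet0.dvs.portId", "134")` rewrites the existing portId line and leaves every other line of the file untouched.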

Note #1:  The vMotions on/off the problematic hosts are critical.  The goal is to perform a task (related to the dvPort in conflict) that both ESXi and vCenter are aware of to force them both to re-synchronize. 

Note #2:  The “Start Monitoring Port State” button is only available in the Flash Client.  This button may be of some help getting ESXi/VC to synchronize the dvPort(s).

Well, I hope this blog post will be useful to others who run into the same issue.
