How to fix ESXi hosts showing out of sync with a distributed switch

Recently I was in charge of designing and implementing a new vSphere and vSAN infrastructure for a customer.

Everything went fine, except that we experienced many network outages caused by misconfigured firewall virtual appliances that the whole infrastructure relies on.

As a result, all ESXi hosts got disconnected from vCenter, and this happened many times!

Anyway, the network team eventually fixed the issue and the network was finally stable, but I have to say that I’m not a fan of firewall virtual appliances, as they bring some limitations and complexity. Unfortunately, those outages had a bad impact on the virtualization network: warnings appeared showing some ESXi hosts out of sync with the distributed switch.

Distributed Switch Warning

This might not look like a big deal at first, but this kind of issue definitely needs to be addressed to avoid network problems in the future.

Basically, this error means that some ESXi hosts have a different configuration than the other hosts and the distributed switch. As a result, each time you make a configuration change to the vDS, the modification might not be applied to the hosts that are out of sync.

Now let’s fix that network!

The first thing I tried was to click on the “Rectify” button, hoping that it would solve the issue, but it didn’t. By the way, I have implemented distributed switches many times and had never seen that error before!

Distributed Switch Configuration Issue Details

The second thing to do was to go to each ESXi host mentioned in the warning message and click on the “Rectify” button. But again, no improvement.

Then I decided to put these 2 hosts in maintenance mode and reboot them. As I was running out of time, I also opened a case with VMware just in case. After rebooting both ESXi hosts the warning disappeared, and later on I received a call from VMware.

The support engineer I talked to said that what I had done so far was the right thing to do, but that there were other steps and checks to perform. He also confirmed that rebooting the hosts generally fixes things once and for all.

By connecting to each out-of-sync ESXi host, we found the following in the logs:

2020-06-30T13:15:27.663Z warning hostd[2585928] [Originator@6876 sub=Hostsvc.NetworkProvider opID=1908c39c-4525 user=vpxuser] Skip saving dvport SW-DVS-PRD-00-134 to /vmfs/volumes/vsan:52759aa41b9a53da-cc69130ab780d0b8/fe13f25e-5f3f-acc3-97c1-1402ec77cfb4/.dvsData/50 1a 79 90 b3 01 ab e3-f6 1c a7 26 30 e2 b7 71/134: failed to create dir

This error indicates that the dvport “134” couldn’t be saved to the directory shown above because the host failed to create that directory.

Apparently this can happen with a vSAN datastore. In that situation, the folder has to be created manually on each out-of-sync host with the command below:
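To find every affected path rather than reading the log by eye, the warning can be parsed with a couple of small shell helpers. This is only a sketch: the function names `extract_dvport_path` and `dvsdata_parent` are made up for illustration, and the pattern assumes the warning always has the same wording as the single line quoted above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ESXi or of the VMware KB.

# Read log lines on stdin and print the path that hostd failed to save a
# dvport to, i.e. the text between " to " and ": failed to create dir".
# Lines that do not match the warning are ignored.
extract_dvport_path() {
    sed -n 's/.*Skip saving dvport [^ ]* to \(.*\): failed to create dir.*/\1/p'
}

# Print the part of a path that precedes "/.dvsData/", which is the folder
# expected to contain the .dvsData directory.
dvsdata_parent() {
    printf '%s\n' "${1%%/.dvsData/*}"
}
```

On a host, something like `grep 'Skip saving dvport' /var/log/hostd.log | extract_dvport_path | sort -u` should list each affected path once. The log location is an assumption: the line is tagged `hostd`, but which file it came from was not recorded. Note that the path contains spaces, so it needs to be quoted wherever it is reused.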

mkdir /vmfs/volumes/vsan:52759aa41b9a53da-cc69130ab780d0b8/fe13f25e-5f3f-acc3-97c1-1402ec77cfb4/

Then click on “Rectify” on each host, and this should solve the issue.
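Since the folder has to be created on several hosts, and possibly again later, it can help to wrap the creation in a check so that running it twice is harmless. The following is a minimal sketch; `ensure_dir` is a hypothetical helper name, and it simply wraps the same `mkdir` used above without any vSAN-specific handling.

```shell
#!/bin/sh
# Hypothetical helper: create a directory only if it does not exist yet.
# Prints what it did; returns non-zero if mkdir fails.
ensure_dir() {
    if [ -d "$1" ]; then
        echo "exists: $1"
    else
        mkdir "$1" && echo "created: $1"
    fi
}
```

It would be called with the quoted path from the log, for example `ensure_dir "/vmfs/volumes/vsan:52759aa41b9a53da-cc69130ab780d0b8/fe13f25e-5f3f-acc3-97c1-1402ec77cfb4/"`. The “Rectify” step still has to be done afterwards in the client.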

If none of the methods above worked, there is one last thing to try: performing a task that both ESXi and vCenter are aware of, which forces them to re-synchronize. The whole procedure, taken from a VMware KB article, is reproduced below. The step numbers it mentions (Step #1, #2 and #5) appear to refer to earlier steps of that KB article, which are not reproduced here.

Method #1 (using the vCenter Flash Client): 

  1. Create a new blank VM (no OS needed)
  2. Edit the VM to put the VM in the portgroup from Step #5 above, click Ok
  3. Edit the VM again, expand the “Network Adapter” section
  4. In the “Port ID” field, enter the dvPort number noted in Step #2
  5. Power the VM up
  6. Migrate the VM to the host listed in Step #2
  7. Navigate to Flash Client > Networking > vDS > Ports, click on “Start Monitoring Port State” button
  8. Migrate the VM off the host listed in Step #2
  9. Refresh the Flash Client, the host should now be off the ‘out of sync’ list from Step #1

Method #2:

  1. Create a new blank VM (no OS needed) 
  2. Edit the VM to put the VM in the portgroup from Step #5 above, click Ok
  3. SSH into the host with the blank VM
  4. Navigate to the blank VM’s .vmx file
  5. Edit the .vmx file with “vi” editor
  6. Change the value of “ethernet0.dvs.portId” to the port number from Step #2 above
  7. Power the VM up
  8. Migrate the VM to the host listed in Step #2
  9. Migrate the VM back off this host
  10. Refresh the vCenter Client
  • Note: If there is a VM using the dvPort from Step #2, try the following: 
  1. Create a new portgroup with identical settings (VLAN, Teaming Policy, etc) to the portgroup from Step #5
  2. Edit the VM, and put it in the new portgroup
  3. After a short wait, refresh the vSphere Client to see if the host is in sync
  4. If the host is still out of sync, migrate the VM to a different host in the cluster
  5. If the host is still out of sync, try Method #1 or #2 above now that the dvPort from Step #2 is unused 
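Steps 4 to 6 of Method #2 (editing the .vmx file by hand with vi) can also be sketched as a small shell function. This is an illustration under assumptions, not something taken from the KB: `set_dvs_port_id` is a made-up name, and the function assumes the entry is stored in the usual .vmx form `ethernet0.dvs.portId = "NNN"`. As in the steps above, the VM is expected to be powered off while the file is changed.

```shell
#!/bin/sh
# Hypothetical helper: set ethernet0.dvs.portId in a .vmx file.
# Usage: set_dvs_port_id /path/to/vm.vmx 134
# A copy of the original file is kept next to it with a .bak suffix.
set_dvs_port_id() {
    vmx=$1
    port=$2

    # Accept only a plain number as port ID.
    case $port in
        ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
    esac

    # Refuse to continue if the key is not present in the file.
    grep -q '^ethernet0\.dvs\.portId' "$vmx" || {
        echo "ethernet0.dvs.portId not found in $vmx" >&2
        return 1
    }

    # Back up the file, then rewrite only the value of that key.
    cp "$vmx" "$vmx.bak" &&
    sed "s/^\(ethernet0\.dvs\.portId[[:space:]]*=[[:space:]]*\).*/\1\"$port\"/" \
        "$vmx.bak" > "$vmx"
}
```

The remaining steps (powering the VM up and migrating it on and off the host) still have to be carried out as described in the method.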

Note #1:  The vMotions on/off the problematic hosts are critical.  The goal is to perform a task (related to the dvPort in conflict) that both ESXi and vCenter are aware of to force them both to re-synchronize. 

Note #2:  The “Start Monitoring Port State” button is only available in the Flash Client.  This button may be of some help getting ESXi/VC to synchronize the dvPort(s).

Well, I hope this blog post will be useful to others who might encounter the same issue.

8 thoughts on “How to fix ESXi hosts showing out of sync with a distributed switch”

  1. AStraube

    Hello,

    creating the folder really helped and the error disappeared, many thanks, but now I have a little problem with the appearance of this created directory in the vSAN datastore. I know it’s just cosmetic, but how do I get rid of the folder, or how can I hide it from the vSphere Web Client? It was possible for me to delete the folder, but after a reboot the DSwitch error reappeared. After recreating the folder, the error disappeared again, but the folder is shown again in the vSAN datastore.

    Greetings

    1. Teddy Post author

      Hi AStraube,

      Thanks for your message.
      I’m sorry for my late answer, I’ve been pretty busy lately.

      Please note that you’re not supposed to delete that folder.
      In this particular scenario, the folder was actually not properly created by the vSAN nodes out of sync, and this is why we had to do it manually.

      Hope that helps.

      1. AStraube

        Hello and thanks for the reply.

        I reread my question and it seems that I asked how to delete it, which was a little bit wrong. What I really wanted to know is how I can make the folder invisible in the vSphere Web Client. Because I created the folder manually, the folder with the long hash-like name is visible if I open the content of the datastore in datastore->files. Is there a way to turn it into a system folder that is invisible, like the other system folders with a dot as first character (like .folder)? That’s what I meant by cosmetics.

        Thanks in advance

    1. Teddy Post author

      Hi Cacophony,

      Thanks for your question.

      I indeed did not mention which logs I was referring to, sorry, my bad.
      I honestly cannot remember which log it was, but as far as I remember it was one of the regular logs such as hostd.log, vpxa.log, etc. Sorry for not being more helpful here.

  2. Perry

    I have been dealing with this issue for quite a while now and manually creating the directories in the vSAN datastore resolved the issue for me. Thank You for posting this information. It was a great help.
