vSAN cluster configuration consistency – Network Configuration is out of Sync – Health Test Failed

In this blog post we will discuss one of the most common situations we may face in production environments regarding vSAN cluster configuration consistency. This health check validates whether the hosts in the cluster, and their disks, have a configuration that is consistent with the cluster.

While working on a production vSAN cluster, I saw that the vSAN cluster configuration consistency health check was showing a warning in vSAN Health. When I checked the reason for the inconsistency, I found that the network configuration was out of sync. Given that message, only two possible causes came to mind: a) the vSAN VMkernel port, or b) the unicast agent list.

The screenshot below shows the error message:

In the screenshot above, the error is highlighted and the hostname it applies to is listed.

To resolve this issue, you can click the “Remediate inconsistent configuration” button at the top right. However, before going ahead you should check what is actually misconfigured.

I initially suspected that the vSAN VMkernel port might be causing the issue, but if that were the case the host would have been in a partitioned state, which was not the case here. Next, I listed the unicast agent entries by running the command below on this host:

[root@ESXI001:~] localcli vsan cluster unicastagent list
NodeUuid                   IsWitness  Supports Unicast  IP Address   Port   Iface Name
--------------------------------------------------------------------------------------
53edfd6f-d759-xxxxxxxxxxx  0          true              10.10.10.30  12321
53ee5c46-a821-xxxxxxxxxx   0          true              10.10.10.31  12321
599642d7-a4bc-xxxxxxxxxx   0          true              10.10.10.32  12321
5ad8a937-e78f-xxxxxxxxxxx  0          true              10.10.10.33  12321  vmk0
58e215a5-7243-xxxxxxxxxx   0          true              10.10.10.34  12321  vmk0

I found that, in this 5-node vSAN cluster, 5 unicast entries were listed on this host, whereas each of the other hosts showed only 4 entries, which is the correct number. As you know, unicast agent entries define cluster membership: each node holds an entry for each of its neighbors, but not for itself. On this host I found an entry for the local node in its own unicast agent list, and that is what caused the inconsistency in the cluster.
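The check described above can be sketched in a few lines of Python. This is only an illustration, not a VMware tool: the function names are made up for this post, the column layout is assumed to match the listing above, and the local host's vSAN IP is assumed to be 10.10.10.34 (the address used in the removal command later in this post).

```python
import re

# Sample rows modeled on the listing above (UUIDs are masked in the post).
SAMPLE = """\
NodeUuid                   IsWitness  Supports Unicast  IP Address   Port   Iface Name
--------------------------------------------------------------------------------------
53edfd6f-d759-xxxxxxxxxxx  0          true              10.10.10.30  12321
53ee5c46-a821-xxxxxxxxxx   0          true              10.10.10.31  12321
599642d7-a4bc-xxxxxxxxxx   0          true              10.10.10.32  12321
5ad8a937-e78f-xxxxxxxxxxx  0          true              10.10.10.33  12321  vmk0
58e215a5-7243-xxxxxxxxxx   0          true              10.10.10.34  12321  vmk0
"""

# One data row: uuid, witness flag, unicast flag, IPv4 address, port, optional interface.
ROW = re.compile(
    r"^(?P<uuid>\S+)\s+(?P<witness>[01])\s+(?P<unicast>true|false)\s+"
    r"(?P<ip>\d+\.\d+\.\d+\.\d+)\s+(?P<port>\d+)(?:\s+(?P<iface>\S+))?\s*$"
)


def parse_unicast_agents(text):
    """Return one dict per data row; header, separator and blank lines are skipped."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def has_expected_count(entries, cluster_size):
    """A host should list every other node, so cluster_size - 1 entries."""
    return len(entries) == cluster_size - 1


def find_self_entries(entries, local_ip):
    """Entries that point at the host's own vSAN IP should not be in its list."""
    return [entry for entry in entries if entry["ip"] == local_ip]


if __name__ == "__main__":
    agents = parse_unicast_agents(SAMPLE)
    print("entries found:", len(agents))
    print("count as expected for 5 nodes:", has_expected_count(agents, 5))
    for entry in find_self_entries(agents, "10.10.10.34"):  # assumed local vSAN IP
        print("entry for the local node:", entry["uuid"], entry["ip"])
```

To use it against a real host, you would paste the output of the list command in place of SAMPLE and pass the host's own vSAN IP.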

So this entry needs to be removed, and there are two ways to do it. First, you can remove the entry manually by running the commands below; second, you can remediate the cluster from vCenter, which corrects the configuration automatically. Either way, you first need to set an advanced parameter so that the host accepts cluster member list updates.

[root@ESXI001:~] esxcfg-advcfg -g /VSAN/IgnoreClusterMemberListUpdates

Value of IgnoreClusterMemberListUpdates is 1 (on this host the value was 1, which makes the host ignore cluster member list updates)

Run the command below to set it to 0, which allows the configuration to be changed:

[root@ESXI001:~] esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates

Value of IgnoreClusterMemberListUpdates is 0
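If you want to check this setting across several hosts, the output line shown above is easy to parse. Below is a minimal Python sketch, assuming the output has exactly the "Value of <name> is <number>" form shown in this post; the function names are hypothetical, not part of any VMware tooling.

```python
import re


def parse_advcfg_value(output):
    """Extract (option name, integer value) from esxcfg-advcfg style output."""
    match = re.search(r"Value of (\S+) is (-?\d+)", output)
    if match is None:
        raise ValueError("unexpected esxcfg-advcfg output: %r" % output)
    return match.group(1), int(match.group(2))


def accepts_member_list_updates(output):
    """Per this post, 0 means member list changes are allowed on the host."""
    _, value = parse_advcfg_value(output)
    return value == 0


if __name__ == "__main__":
    print(parse_advcfg_value("Value of IgnoreClusterMemberListUpdates is 0"))
```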

Now you can remediate the cluster by clicking “Remediate inconsistent configuration”, or remove the entry manually by running the following command on the affected host:

esxcli vsan cluster unicastagent remove -t node -a 10.10.10.34

Note: Make sure the host is in maintenance mode (MM) with the “Ensure accessibility” option before performing this activity. Once the activity is complete, you can set IgnoreClusterMemberListUpdates back to 1.
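To keep the order of the manual steps straight, here is a small Python sketch that only assembles the command strings used in this post, in sequence. It does not run anything on a host. The function name is made up for illustration, and the final "-s 1" command is an assumption derived from the "-s 0" form shown earlier. Putting the host into maintenance mode is a precondition and is deliberately left out of the list.

```python
import ipaddress

OPTION = "/VSAN/IgnoreClusterMemberListUpdates"


def manual_removal_plan(stale_ip):
    """Return the commands from this post, in order, for removing one entry.

    Precondition (not generated here): the host is already in maintenance
    mode with "Ensure accessibility".
    """
    ipaddress.ip_address(stale_ip)  # raises ValueError for a malformed address
    return [
        "esxcfg-advcfg -g " + OPTION,    # check the current value
        "esxcfg-advcfg -s 0 " + OPTION,  # allow the member list to change
        "esxcli vsan cluster unicastagent remove -t node -a " + stale_ip,
        "localcli vsan cluster unicastagent list",  # verify the entry is gone
        "esxcfg-advcfg -s 1 " + OPTION,  # assumed: restore the earlier value
    ]


if __name__ == "__main__":
    for step, command in enumerate(manual_removal_plan("10.10.10.34"), start=1):
        print(step, command)
```

Printing the plan first and running each command by hand keeps you in control of every step, which is what you want on a production vSAN host.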

I hope this blog post has been informative for you. Thank you for reading!!

Happy learning!!
