How to Change All of the vSAN VMkernel Port IP Addresses in a vSAN Cluster

I have received many queries about how to safely change the vSAN VMkernel configuration, and about how administrators running a 1G vSAN network can switch to a 10G vSAN network without a downtime window in production.

This post explains what steps to take before changing the vSAN VMkernel IP address (or making any similar modification) so that you avoid a cluster network partition.

Let’s consider an example where you are running the vSAN VMkernel network on 1G and want to move it to a 10G vSAN VMkernel network without downtime.

The procedure includes the following steps:

1.) It is assumed that a 10G switch has been set up in the virtual datacenter and connected to the vSAN nodes.

2.) Log in to vSphere ⇒ select the vSAN cluster hosts and create the new VMkernel ports on all the ESXi hosts (a CLI sketch follows).
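The same interface can also be created from the ESXi shell, assuming a standard vSwitch port group. The port group name (vSAN-10G) and the IP details below are illustrative assumptions; substitute your own values:

# Create the new VMkernel interface on the 10G port group (name assumed for illustration)
[root@ESXI01:~] esxcli network ip interface add --interface-name=vmk3 --portgroup-name=vSAN-10G
# Assign a static IP on the new vSAN subnet (example addressing)
[root@ESXI01:~] esxcli network ip interface ipv4 set --interface-name=vmk3 --ipv4=10.10.10.2 --netmask=255.255.255.0 --type=static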

3.) Enable the vSAN service on the new VMkernel port and attach the 10G uplink to the port group (see the command sketch below).
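From the command line, tagging the new interface for vSAN traffic is a single standard esxcli call (vmk3 here is the interface created in the previous step):

# Enable the vSAN service on the new VMkernel port
[root@ESXI01:~] esxcli vsan network ip add --interface-name=vmk3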

4.) Now, use the vmkping utility to ping the other hosts over the specific vSAN VMkernel port.

e.g.  

[root@ESXI01:~] vmkping -I vmk3 10.10.10.1
PING 10.10.10.1 (10.10.10.1): 56 data bytes
64 bytes from 10.10.10.1: icmp_seq=0 ttl=64 time=0.122 ms
64 bytes from 10.10.10.1: icmp_seq=1 ttl=64 time=0.108 ms
64 bytes from 10.10.10.1: icmp_seq=2 ttl=64 time=0.121 ms
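If the new 10G port groups use jumbo frames, it is also worth validating the MTU end to end before cutting over; a sketch with the don't-fragment flag, assuming a 9000-byte MTU, would look like this:

# 8972-byte payload + headers = 9000 MTU; -d sets the don't-fragment bit
[root@ESXI01:~] vmkping -I vmk3 -d -s 8972 10.10.10.1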

5.) Check that all the ESXi hosts in the cluster can ping each other on vmk3, which confirms that the new VMkernel port is set up correctly.

6.) To make sure that vSAN is now aware of both VMkernel ports, run the commands below to confirm.

localcli vsan network list (it will show two VMkernel ports)

[root@ESXI01:~] localcli vsan network list

Interface:
   VmkNic Name: vmk1
   IP Protocol: IP
   Interface UUID: 1692e35b-2e05-63a2-543c-20677c1c44b0
   Agent Group Multicast Address: 224.2.3.4
   Agent Group IPv6 Multicast Address: ff19::2:3:4
   Agent Group Multicast Port: 23451
   Master Group Multicast Address: 224.1.2.3
   Master Group IPv6 Multicast Address: ff19::1:2:3
   Master Group Multicast Port: 12345
   Host Unicast Channel Bound Port: 12321
   Multicast TTL: 5
   Traffic Type: vsan

Interface:
   VmkNic Name: vmk3
   IP Protocol: IP
   Interface UUID: 0803e05b-25c4-1617-0d50-20677c1c44b0
   Agent Group Multicast Address: 224.2.3.4
   Agent Group IPv6 Multicast Address: ff19::2:3:4
   Agent Group Multicast Port: 23451
   Master Group Multicast Address: 224.1.2.3
   Master Group IPv6 Multicast Address: ff19::1:2:3
   Master Group Multicast Port: 12345
   Host Unicast Channel Bound Port: 12321
   Multicast TTL: 5
   Traffic Type: vsan

[root@ESXI01:~] localcli vsan cluster unicastagent list

The above command will list all the neighbor vSAN VMkernel port entries.

7.) Now, to test that the new vSAN VMkernel port will work in production, move one ESXi host into maintenance mode (with Ensure Accessibility) and untick the vSAN service on the original VMkernel port, vmk1 in this case (a CLI sketch follows).
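Both actions can also be done from the ESXi shell with standard esxcli commands; this is a sketch of the same test, not necessarily the exact procedure used in the UI:

# Enter maintenance mode with the Ensure Accessibility vSAN data evacuation mode
[root@ESXI01:~] esxcli system maintenanceMode set --enable true --vsanmode ensureObjectAccessibility
# Remove the vSAN traffic tag from the original VMkernel port
[root@ESXI01:~] esxcli vsan network remove --interface-name=vmk1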

8.) Select the cluster, go to the vSAN health section, and retest the health plugin. Check whether a network partition is taking place and whether the ESXi node shows as partitioned (a host-side check is sketched below).
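Cluster membership can also be verified from the host itself; if the host still reports the same Sub-Cluster UUID and the full member count in the output, it has not been partitioned:

# Show this host's view of vSAN cluster membership
[root@ESXI01:~] esxcli vsan cluster get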

9.) If all the nodes in the cluster show up in the same partition group, you are good to perform the same change in production without any downtime, which means the vSAN nodes are failing over to the new vSAN VMkernel IP (a per-host rollout sketch follows).
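For the production rollout, the per-host sequence would be to remove the vSAN tag from the old interface and, once the cluster health is reconfirmed, delete the old interface itself; vmk1 is the old 1G interface assumed here:

# Untag vSAN traffic from the old VMkernel port
[root@ESXI01:~] esxcli vsan network remove --interface-name=vmk1
# Once cluster health is confirmed, delete the old VMkernel interface
[root@ESXI01:~] esxcli network ip interface remove --interface-name=vmk1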

10.) If a network partition does take place even though the prerequisites above are in place, it can be a network switch issue (meaning vSAN traffic is not failing over to the new VMkernel port).

11.) In one scenario, I found the 1G network was on a Cisco switch and the 10G network was on an HP switch. When we compared the characteristics of both switches, we found that the HP switch was using a network protection feature and the Cisco switch was not. We disabled network protection on the HP switch, and the vSAN VMkernel failed over to the new port without any network partition.


I hope this post has been informative for you.

Happy learning!!
