vSAN iSCSI Target (VIT) High Availability – Deep Dive (Part 2)

In the previous post, we covered the basic VIT configuration: creating targets, LUNs and initiator groups. We also looked at object placement, the configuration files and a little bit of high availability.

Today I am going to walk through VIT HA in detail, with scenarios. Before going ahead, let's look at what high availability means in VIT and what type of high availability is supported as of today.

vSAN iSCSI Target LUNs are made highly available to the initiator through MPIO on Windows or Linux. From the target portal's point of view, only one host is the active target at any time and the other hosts are standby. A standby host does not become active unless the target owner fails. The active target is the one that provides LUN access to the initiator. If the target owner fails, target ownership is transferred to one of the standby hosts, which then serves the LUN. When the owner fails, the initiator retries its request, is directed to the newly elected target owner, and that owner reopens LUN access for the initiator.

As discussed in the previous post, a target contains LUNs and each LUN is backed by a VMDK. Ownership is therefore defined per target, not per LUN: if a target has multiple LUNs, all of them have the same owner as the target. In VIT, the target owner can be thought of as the DOM owner of the target's namespace object.

The initiator can access the LUN only through the target owner, and only while that owner is healthy and alive. A login through any other host is rejected and redirected to the owner, as the logs later in this post show.
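To make the active/standby behaviour concrete, here is a toy model in Python. It is only a sketch of the rules described above, not vSAN code: the class name `ToyVitCluster`, its methods and especially the election rule are made up for illustration, and the real owner election is done by vSAN itself.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVitCluster:
    """Toy model of active/standby target ownership (not vSAN code)."""
    hosts: list                      # every host that exposes the target portal
    owner: str                       # the single active target owner
    down: set = field(default_factory=set)

    def login(self, host):
        """Return what an initiator would see when logging in via `host`."""
        if host in self.down:
            return ("unreachable", None)
        if host == self.owner:
            return ("accepted", host)       # owner serves all LUNs of the target
        return ("redirected", self.owner)   # standby hosts never serve I/O

    def fail(self, host):
        """Mark a host as failed; elect a new owner only if the owner failed."""
        self.down.add(host)
        if host == self.owner:
            alive = [h for h in self.hosts if h not in self.down]
            if not alive:
                raise RuntimeError("no surviving host to own the target")
            # Hypothetical election rule (arbitrary pick); vSAN decides this.
            self.owner = alive[-1]

    def recover(self, host):
        """Bring a host back as a standby."""
        self.down.discard(host)


cluster = ToyVitCluster(hosts=["blr1", "blr2", "blr3"], owner="blr2")
print(cluster.login("blr1"))   # standby host points the initiator at the owner
cluster.fail("blr1")           # standby failure: ownership does not change
print(cluster.owner)
cluster.recover("blr1")
cluster.fail("blr2")           # owner failure: ownership moves to a standby
print(cluster.owner, cluster.login("blr1"))
```

The point of the model is that a standby failure never touches ownership, while an owner failure moves the whole target, with all of its LUNs, to another host.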

 

Now I will simulate HA failures in my lab to see how VIT HA works in practice.

This is the initiator client. I have already configured its iSCSI initiator to connect to the target owner, and it has access to the iSCSI LUN, as shown below.

 

In the iSCSI initiator properties, you can see that I have added two target portals, blr1 and blr2.

 

The screenshot below shows that the initiator has two sessions to the target.

 

This screenshot shows the MPIO details: the two target portals are configured in an active/passive arrangement. Under Session ID, one session is active and the other is standby, which means the initiator has completed its first request to locate the target owner and has established an active session with it.

On the vSAN side, we can clearly see that the target owner is blr2.vhabit.com and that it is providing LUN access to the initiator.

I started a file copy from my local computer to the iSCSI LUN to check whether the LUN disconnects when a host fails. In the first scenario, I simulate a host failure by rebooting host blr1.

 

In this case, the failure of blr1 did not affect LUN accessibility, because the target owner is blr2 and it kept serving the initiator.

Let's move to the second scenario, where blr2 (the target owner) fails.

I started the copy again to check LUN accessibility.

In this screenshot we can clearly see that the target owner, blr2, went down, so ownership has to be transferred to another host.

Let's look at the Web Client. In the vCenter inventory, blr2.vhabit.com is not responding and the I/O owner of the target has changed to blr3.vhabit.com, which means target ownership moved from blr2 to blr3. The LUN is still accessible to the initiator with minimal disruption, and the copy is still in progress.

Here is what happened in the background: blr2 went down and the initiator lost access to the target owner. It retried through blr1, and blr1 redirected the initiator to blr3, the newly elected owner of the iSCSI target. blr3 then resumed LUN access for the initiator.
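The initiator side of this flow can be sketched in a few lines of Python. This is only an illustration of the retry-and-follow-redirect idea, not how the Windows or Linux initiator is implemented: `find_target_owner` and the `probe` callback are hypothetical names, and `probe` stands in for a real iSCSI login attempt.

```python
def find_target_owner(portals, probe, max_hops=4):
    """Walk the configured portals until some host accepts the login.

    `probe(addr)` is a hypothetical stand-in for an iSCSI login attempt.
    It returns ("accepted", None), ("redirected", new_addr) or
    ("unreachable", None).
    """
    for portal in portals:
        addr = portal
        for _ in range(max_hops):        # bound the length of a redirect chain
            status, target = probe(addr)
            if status == "accepted":
                return addr
            if status == "redirected":
                addr = target            # follow the redirect to the owner
                continue
            break                        # unreachable: try the next portal
    return None


# Lab state after blr2 failed and blr3 became the target owner.
lab = {
    "blr1": ("redirected", "blr3"),
    "blr2": ("unreachable", None),
    "blr3": ("accepted", None),
}


def probe(addr):
    # Unknown addresses are treated as unreachable.
    return lab.get(addr, ("unreachable", None))


# The initiator only knows the two portals it was configured with.
print(find_target_owner(["blr2", "blr1"], probe))
```

Note that blr3 was never configured as a portal on the initiator; it is reached purely through the redirect handed out by blr1.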

After this, the copy completed successfully and the initiator kept its access to the LUN.

Now let's look at the logs for more detail:

blr1.vhabit.com- vitd.log:

2018-08-29T17:12:17Z vitd[2098782]: VITD: Thread-0x12126ba700 accepted connection from 192.168.2.101; portal group "pg-vmk0-3260"      {Initiator requested for LUN access}

2018-08-29T17:12:17Z vitd[2098782]: VITD: Thread-0x12126ba700 192.168.2.101 (iqn.1991-05.com.microsoft:jumpserver.vhabit.com): initiator requests to connect to target "iqn.1998-01.com.vmware:5a258a8b-fc34-a175-6d85-765c44a1e2f2"; auth-group "2ea7825b-8ff5-bef7-f158-005056015198"   {Initiator has provided target address to host blr1}

2018-08-29T17:12:17Z vitd[2098782]: VITD: Thread-0x12126ba700 192.168.2.101 (iqn.1991-05.com.microsoft:jumpserver.vhabit.com): operational parameter negotiation done; transitioning to Full Feature Phase  {Discovery, authentication, authorization took place}

2018-08-29T17:12:17Z vitd[2098782]: VITD: Thread-0x12126ba700 192.168.2.101 (iqn.1991-05.com.microsoft:jumpserver.vhabit.com): VitdGetTargetAddr: target owner for target iqn.1998-01.com.vmware:5a258a8b-fc34-a175-6d85-765c44a1e2f2 is 5b6d9702-0ffc-a594-302d-00505601519e

{Ownership transferred to UUID 5b6d9702-0ffc-a594-302d-00505601519e}

2018-08-29T17:12:17Z vitd[2098782]: VITD: Thread-0x12126ba700 192.168.2.101 (iqn.1991-05.com.microsoft:jumpserver.vhabit.com): Got redirect IP address and port number: 192.168.2.103:3260,

{IP address of new target owner is 192.168.2.103 which is blr3.vhabit.com}

2018-08-29T17:12:17Z vitd[2098782]: VITD: Thread-0x12126ba700 192.168.2.101 (iqn.1991-05.com.microsoft:jumpserver.vhabit.com): The connection is redirected. Drop the connection!
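If you want to pull the redirect details out of vitd.log without reading every line, a small script helps. The sketch below assumes the two message formats look exactly like the excerpts above; the regular expressions and the `parse_redirect` helper are my own and may need adjusting for other builds.

```python
import re

# Patterns derived from the vitd.log excerpts shown above (assumed stable).
OWNER_RE = re.compile(r"target owner for target (\S+) is ([0-9a-f-]+)")
REDIRECT_RE = re.compile(
    r"Got redirect IP address and port number: ([\d.]+):(\d+)"
)


def parse_redirect(lines):
    """Collect target IQN, owner UUID and redirect address from log lines."""
    info = {}
    for line in lines:
        m = OWNER_RE.search(line)
        if m:
            info["target"], info["owner_uuid"] = m.groups()
        m = REDIRECT_RE.search(line)
        if m:
            info["redirect_ip"] = m.group(1)
            info["redirect_port"] = int(m.group(2))
    return info


sample = [
    "VitdGetTargetAddr: target owner for target "
    "iqn.1998-01.com.vmware:5a258a8b-fc34-a175-6d85-765c44a1e2f2 "
    "is 5b6d9702-0ffc-a594-302d-00505601519e",
    "Got redirect IP address and port number: 192.168.2.103:3260,",
]
result = parse_redirect(sample)
print(result)
```

On a real host you would feed it the lines of the vitd.log file instead of the `sample` list.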

To map the host UUID to a hostname:

[root@blr1:~] cmmds-tool find -t HOSTNAME -u 5b6d9702-0ffc-a594-302d-00505601519e

owner=5b6d9702-0ffc-a594-302d-00505601519e(Health: Healthy) uuid=5b6d9702-0ffc-a594-302d-00505601519e type=HOSTNAME rev=0 minHostVer=0 [content = ("blr3")], errorStr=(null)
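The same lookup can be scripted. The sketch below only parses a line of text in the format shown above; it does not call cmmds-tool itself, and the helper name `parse_hostname_entry` is made up for this example.

```python
import re


def parse_hostname_entry(text):
    """Extract UUID, health and hostname from one HOSTNAME entry line."""
    uuid = re.search(r"\buuid=([0-9a-f-]+)", text)
    health = re.search(r"\(Health: (\w+)\)", text)
    name = re.search(r'content = \("([^"]+)"\)', text)
    return {
        "uuid": uuid.group(1) if uuid else None,
        "health": health.group(1) if health else None,
        "hostname": name.group(1) if name else None,
    }


# Output line copied from the cmmds-tool run above.
line = (
    "owner=5b6d9702-0ffc-a594-302d-00505601519e(Health: Healthy) "
    "uuid=5b6d9702-0ffc-a594-302d-00505601519e type=HOSTNAME rev=0 "
    'minHostVer=0 [content = ("blr3")], errorStr=(null)'
)
entry = parse_hostname_entry(line)
print(entry["uuid"], "->", entry["hostname"], entry["health"])
```

Combined with the owner UUID from vitd.log, this tells you that the new target owner is blr3.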

 

We can expect enhancements to the VIT HA design in future releases. Until then, you can play around with the vSAN iSCSI Target in your lab.

It is an interesting topic 🙂

I hope this has been informative for you. Please share if you like the article.

Thanks for reading!!
