MSCS clustering is widely used with VMware ESX. It can be built with Windows VMs on both sides, or with a Windows VM on one side and a physical Windows server on the other. Recently we tested a Windows 2008 MSCS cluster on vSphere 4 to simulate a customer’s environment. The two-node cluster has one Win2008 VM on each of two ESX servers. We found three tricky problems that you might run into as well.
1. IPv6 creates duplicate tunnel adapters
IPv6 is enabled by default in Windows 2008. Running the “ipconfig /all” command shows a long list of “Tunnel adapter” entries like the ones below.
Tunnel adapter Local Area Connection* 9:
Media State . . . . . . . . . . . : Media disconnected
Connection-specific DNS Suffix . :
Tunnel adapter Local Area Connection* 9:
Connection-specific DNS Suffix . :
Link-local IPv6 Address . . . . . : fe80::5ede:192.168.0.6%11
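Tunnel adapters that share a name, like the two “Local Area Connection* 9” entries above, can be spotted with a few lines of script rather than by eye. This is only a minimal sketch: it assumes English-language “ipconfig /all” output saved as text, and the function name and sample string are made up for illustration.

```python
import re
from collections import Counter

# Matches header lines such as "Tunnel adapter Local Area Connection* 9:"
TUNNEL_HEADER = re.compile(r"^Tunnel adapter (.+):\s*$")

def duplicate_tunnel_adapters(ipconfig_text):
    """Return tunnel adapter names that appear more than once, sorted."""
    names = []
    for line in ipconfig_text.splitlines():
        match = TUNNEL_HEADER.match(line.strip())
        if match:
            names.append(match.group(1))
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)

# Sample text modelled on the output shown above (not a full capture).
sample = """\
Tunnel adapter Local Area Connection* 9:
   Media State . . . . . . . . . . . : Media disconnected
Tunnel adapter Local Area Connection* 9:
   Link-local IPv6 Address . . . . . : fe80::5ede:192.168.0.6%11
"""
print(duplicate_tunnel_adapters(sample))
```

On a real node the text would come from running “ipconfig /all” and capturing its output; an empty list means no duplicate tunnel adapter names were found.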
In some circumstances, leftovers from an earlier configuration can leave tunnel adapters with the same name. When we then set up a failover cluster on a node, the configuration test reports a “LAN issue” saying duplicate adapters were found. The quick fix seemed to be deleting the tunnel adapters, so I searched the registry for “Local Area Connection” and deleted the matching entries under “\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Network”.
The weird thing is that “ipconfig /all” still showed them after that. After some googling, I found that these adapters stay visible until IPv6 is disabled. There are two ways of doing this: disable the “Teredo Tunneling Pseudo-Interface”, or disable IPv6 in the registry. Only the second method worked for me; if you know how to make the first one work, please share. Here are the brief steps:
a. Disable Teredo
Computer Management->Device Manager->Show Hidden Devices(Under “View” in Menu)->Network adapters->Teredo Tunneling Pseudo-Interface->Right-click and choose “Disable”
b. Disable IPv6 in Registry
1. Go to the registry key “\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\TCPIP6\Parameters”
2. Right-click and create a new DWORD (32-bit) Value called “DisabledComponents”
3. Double-click the value and type “ffffffff” with the base set to hexadecimal (some sources write it as “0xffffffff”, which is the same value with the hex prefix)
4. Reboot Windows
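The registry steps above can also be captured as a .reg file, so the change is repeatable on the second node. The sketch below only builds the file text; the function name is hypothetical, and it assumes regedit will accept the file once it is saved (for example as a file with a .reg extension) and imported by hand. The reboot is still needed afterwards.

```python
# Key and value name are taken from the steps above.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\TCPIP6\Parameters"

def build_reg_file(value=0xFFFFFFFF):
    """Return .reg file text that sets DisabledComponents to a 32-bit DWORD."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        # .reg files write DWORDs as eight hex digits without the 0x prefix
        f'"DisabledComponents"=dword:{value:08x}',
        "",
    ]
    return "\r\n".join(lines)

print(build_reg_file())
```

Writing the DWORD as eight hex digits also makes it clear that “ffffffff” and “0xffffffff” are one and the same setting.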
2. How to add the RDM correctly to the second node
Suppose we have two ESX servers, ESX1 and ESX2. The W2K8 VM on ESX1 is VM1, and the W2K8 VM on ESX2 is VM2. MSCS relies on “shared storage”: the RDM attached to VM1 fails over to VM2 if VM1 or ESX1 fails. This requires masking the RDM LUN to both ESX1 and ESX2 with the same host LUN ID. One extra thing we must not ignore is that VM1’s LUN must be visible to ESX2 as well. The MSCS setup on VM2 requires us to choose “Use existing device” when adding the RDM LUN, which means browsing to VM1’s LUN and selecting the RDM’s vmdk file in that datastore. So the shared storage includes VM1’s boot LUN as well. Here are the procedures:
1. Mask VM1’s boot LUN and the RDM LUN to both ESX1 and ESX2. Give the RDM LUN the same host LUN ID on both ESX servers; VM1’s boot LUN doesn’t need the same LUN ID. Make sure the boot LUN appears as a datastore on ESX2 under the “Configuration” tab.
2. On ESX1, add the RDM to VM1 on a new SCSI controller. The RDM type is “Pass-through (Physical mode)”, the controller type is SAS for W2K8, and the compatibility mode is set to “Physical” (details in the next part).
3. Add the RDM to VM2: use the existing device, browse the datastore, choose VM1’s LUN, and select the RDM’s VMDK. The SCSI controller settings are the same as on VM1.
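The masking rules in step 1 are easy to get wrong, so it can help to write them down as a check before touching the VMs. This is just a sketch under stated assumptions: the LUN names and IDs below are invented, and the dictionary is something you would fill in by hand from your array’s masking view, not something pulled from ESX.

```python
def check_masking(lun_ids_by_host, rdm_luns, boot_luns):
    """Return a list of problems with how LUNs are presented to the ESX hosts.

    lun_ids_by_host maps host name -> {LUN name: host LUN ID}.
    RDM LUNs must have the same host LUN ID on every host.
    Boot LUNs only need to be visible on every host.
    """
    problems = []
    for lun in rdm_luns:
        ids = [luns.get(lun) for luns in lun_ids_by_host.values()]
        if None in ids:
            problems.append(f"{lun}: not visible on every host")
        elif len(set(ids)) > 1:
            problems.append(f"{lun}: host LUN IDs differ {ids}")
    for lun in boot_luns:
        if any(lun not in luns for luns in lun_ids_by_host.values()):
            problems.append(f"{lun}: not visible on every host")
    return problems

# Hypothetical inventory: boot LUN IDs differ (allowed), RDM IDs match.
inventory = {
    "ESX1": {"vm1_boot": 0, "rdm_shared": 5},
    "ESX2": {"vm1_boot": 3, "rdm_shared": 5},
}
print(check_masking(inventory, rdm_luns=["rdm_shared"], boot_luns=["vm1_boot"]))
```

An empty list means the inventory satisfies both rules from step 1; anything else names the LUN to fix on the array side.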
3. Pay attention to your RDM’s SCSI controller
An MSCS cluster requires the RDM LUNs to sit on a separate SCSI controller from the boot LUN’s controller. Normally we pick 1:0 as the device node when adding the RDM, which creates a second controller. The catch is that the required controller type differs between Windows versions. According to the VMware white paper “Setup for Failover Clustering and Microsoft Cluster Service”, there are three settings:
Windows 2000 ………………………….. LSI Logic Parallel
Windows 2003 ………………………….. LSI Logic Parallel
Windows 2008 ………………………….. LSI Logic SAS
So make sure your Windows 2008 VM uses SAS (the default controller type is “LSI Logic Parallel”). With the wrong setting you can still set up the cluster, but the configuration test shows warnings that the cluster has no shared storage. And remember that SAS is required for the RDM SCSI controller AND for the boot disk.
Although we got MSCS running on ESX 4, it conflicts with other vSphere 4 features such as vMotion and FT. So analyze your current setup and requirements before you go for MSCS.
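One way to double-check the controller type without clicking through the VM settings is to look at the VM’s .vmx file. The sketch below assumes the controller type is stored in lines like scsi1.virtualDev, with “lsilogic” meaning LSI Logic Parallel and “lsisas1068” meaning LSI Logic SAS; please verify that against a .vmx from your own environment. The function name and sample text are made up. It checks every controller, which follows the note above for Windows 2008; for Windows 2000 and 2003 the table only speaks about the shared-disk controller.

```python
import re

# Mapping from the table above to the assumed .vmx values.
EXPECTED_VIRTUAL_DEV = {
    "Windows 2000": "lsilogic",    # LSI Logic Parallel
    "Windows 2003": "lsilogic",    # LSI Logic Parallel
    "Windows 2008": "lsisas1068",  # LSI Logic SAS
}

# Matches lines such as: scsi1.virtualDev = "lsilogic"
VIRTUAL_DEV_LINE = re.compile(r'^(scsi\d+)\.virtualDev\s*=\s*"([^"]*)"\s*$')

def wrong_controllers(vmx_text, windows_version):
    """Return {controller: type} for controllers that do not match the table."""
    expected = EXPECTED_VIRTUAL_DEV[windows_version]
    wrong = {}
    for line in vmx_text.splitlines():
        match = VIRTUAL_DEV_LINE.match(line.strip())
        if match and match.group(2).lower() != expected:
            wrong[match.group(1)] = match.group(2)
    return wrong

# Hypothetical excerpt: boot controller is SAS, RDM controller is still Parallel.
sample_vmx = '''\
scsi0.present = "TRUE"
scsi0.virtualDev = "lsisas1068"
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
'''
print(wrong_controllers(sample_vmx, "Windows 2008"))
```

An empty dictionary means every controller found in the file matches the type listed for that Windows version.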
Reference:
VMware — Setup for Failover Clustering and Microsoft Cluster Service