
Add or remove the vmnic on the vSwitch to recover the network connection

Have you ever accidentally removed the vmnic from your only vSwitch? Or added a vmnic to your vSwitch and assigned an IP from a different network? Either mistake will take down your current network connection to the ESX host. What is the workaround?

First, use KVM or iLO to log in to your server console remotely. If you don't have that, you will have to connect a monitor and keyboard to the server.

Next, let's use the ESX commands to add or remove the vmnic from the vSwitch.

ESX has a useful command, "esxcfg-vswitch", which can assign, remove and modify vSwitches along with their port groups, service console and vmnics.

If you deleted the vmnic, add it back to a vSwitch with:

#esxcfg-vswitch -L vmnicN vSwitchN (vmnicN is the vmnic number and vSwitchN is the vSwitch number)

Or, if you want to remove an unnecessary vmnic from a vSwitch, use the command:

#esxcfg-vswitch -U vmnicN vSwitchN

Sometimes we also need to add an uplink to a particular port group under a vSwitch:

#esxcfg-vswitch -M vmnicN vSwitchN -p PortGroupN (PortGroupN is the port group name)

Take note that PortGroupN can also be the service console port group.

After these simple steps, your connection should be back and all GUI operations can continue.
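To put it all together, a minimal recovery sequence from the console could look like the one below. This is only a sketch; vSwitch0, vmnic0 and the "Service Console" port group name are assumptions, so substitute your own names.

#esxcfg-vswitch -l (list the vSwitches, port groups and current uplinks)

#esxcfg-vswitch -L vmnic0 vSwitch0 (link vmnic0 back to vSwitch0)

#esxcfg-vswitch -l (confirm vmnic0 now shows up under vSwitch0)

#esxcfg-vswitch -M vmnic0 vSwitch0 -p "Service Console" (optionally pin the uplink to the service console port group)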

Windows 2008 MSCS Clustering Setup on vSphere 4

MSCS clustering is quite popular with VMware ESX. It can be built with Windows VMs on both sides, or with a Windows VM on one side and a physical Windows server on the other. Recently we ran some tests of a Windows 2008 MSCS cluster with vSphere 4, simulating a customer's environment. The 2-node cluster has one Windows 2008 VM under each ESX server, which means we need two ESX servers. We found three tricky problems you might see as well.

1. IPv6 creates duplicate tunnel adapters

IPv6 is enabled in Windows 2008 by default, and running the "ipconfig /all" command shows a long list of "Tunnel adapters" like the ones below.

Tunnel adapter Local Area Connection* 9:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix . :

Tunnel adapter Local Area Connection* 9:

   Connection-specific DNS Suffix . :
   Link-local IPv6 Address . . . . . : fe80::5ede:192.168.0.6%11

In some circumstances there can be tunnel adapters with the same name, left over from earlier configuration. The configuration test then reports a "LAN issue" saying duplicate adapters were found when we set up a failover cluster on one node. The quick fix is to delete the tunnel adapters, so I searched for "Local Area Connection" in the registry and deleted the entries under "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Network".

The weird thing is, I could still see them in "ipconfig /all" after that. After some googling I found these adapters stay visible unless IPv6 is disabled. There are two ways of doing this: disable the "Teredo Tunneling Pseudo-Interface", or disable IPv6 in the registry. However, only the second method worked for me; please offer ideas on how to make the first method work. Here are the brief steps:

a. Disable Teredo

Computer Management -> Device Manager -> Show Hidden Devices (under "View" in the menu) -> Network adapters -> Teredo Tunneling Pseudo-Interface -> right-click and choose "Disable"

b. Disable IPv6 in Registry

1. Go to the registry key "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters"

2. Right-click and create a new DWORD (32-bit) Value called "DisabledComponents"

3. Double-click the value and enter "ffffffff" in hexadecimal format (some sources write it as "0xffffffff", which is the same value)

4. Reboot Windows
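If you prefer the command line over regedit, the same value can be created from an elevated command prompt with a single reg add command. This is only an alternative sketch of the manual steps above, not something I tested during this setup:

reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters /v DisabledComponents /t REG_DWORD /d 0xffffffff /f

A reboot is still needed afterwards, just like with the manual method.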

2. How to add the RDM correctly to the second node

Suppose we have two ESX servers, ESX1 and ESX2. The Windows 2008 VM on ESX1 is VM1, and the one on ESX2 is VM2. MSCS relies on shared storage: the RDM attached to VM1 must fail over to VM2 if VM1 or ESX1 fails. This requires us to mask the RDM LUN to both ESX1 and ESX2 with the same host LUN ID. One extra thing we must not ignore is that VM1's boot LUN must be visible to ESX2 as well, because the MSCS setup on VM2 requires us to choose "Use existing device" when adding the RDM, which means we have to browse to VM1's LUN and pick the RDM's vmdk file inside that datastore. So the shared storage includes VM1's boot LUN as well. Here are the steps:

1. Mask VM1's boot LUN and an RDM LUN to both ESX1 and ESX2. Give the RDM LUN the same host LUN ID on both ESX servers; VM1's boot LUN does not need the same ID. Make sure the boot LUN's datastore appears on ESX2 under the "Configuration" tab.

2. Add the RDM to VM1 on ESX1 with a new SCSI controller. The RDM type is pass-through (physical compatibility mode), and the controller type is LSI Logic SAS for Windows 2008 (the details are described in the next part).

3. Add the RDM to VM2 using the existing device: browse the datastore, choose VM1's LUN, and select the RDM's VMDK. The SCSI controller settings are the same as on VM1.
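As a side note, what VM2 points to is just the RDM mapping file, a small vmdk that lives in VM1's datastore, while the data itself stays on the RDM LUN. If you ever need to create such a mapping file from the service console rather than the GUI, a physical compatibility mode RDM can be generated with vmkfstools. The device ID and datastore path below are placeholders for illustration only:

#vmkfstools -z /vmfs/devices/disks/naa.<RDM_LUN_ID> /vmfs/volumes/<VM1_datastore>/VM1/VM1_rdm.vmdk (creates a pass-through RDM mapping file in VM1's folder)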

3. Pay attention to your RDM’s SCSI controller

The MSCS cluster requires the RDM LUNs to use a separate SCSI controller, different from the boot disk's SCSI controller. Normally we add the RDM at SCSI address 1:0, which subsequently creates another controller. The problem is that this SCSI controller type needs to be different for failover clusters on different Windows versions. According to the VMware white paper "Setup for Failover Clustering and Microsoft Cluster Service", there are three settings:

Windows 2000 ………………………….. LSI Logic Parallel

Windows 2003 ………………………….. LSI Logic Parallel

Windows 2008 ………………………….. LSI Logic SAS

So you need to make sure your Windows 2008 VMs are using LSI Logic SAS (the default controller type is "LSI Logic Parallel"). Without the correct setting you can still set up the cluster, but you will see warning messages in the configuration test telling you the cluster has no shared storage. And remember that SAS is required for the RDM's SCSI controller AND for the boot disk's controller.
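For a rough idea of what this looks like under the hood, these are the kind of .vmx entries the second controller and the RDM end up with in a cluster-across-boxes setup. The values are illustrative only; configure them through the vSphere Client edit-settings dialog rather than editing the file by hand:

scsi1.present = "TRUE"
scsi1.virtualDev = "lsisas1068"
scsi1.sharedBus = "physical"
scsi1:0.present = "TRUE"
scsi1:0.fileName = "VM1_rdm.vmdk"

Here "lsisas1068" is the LSI Logic SAS controller, and the "physical" bus sharing is what lets the two ESX hosts share the disk for a cluster across boxes.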

Although we managed to set up MSCS on ESX 4, it conflicts with other vSphere 4 features such as vMotion and FT. Therefore it's better to analyze your current setup and requirements before you go for MSCS.

Reference:

VMware — Setup for Failover Clustering and Microsoft Cluster Service

vSphere 4 VLAN configuration on Nexus 5020

Recently I have been doing feature testing for the Nexus 1000v, a virtual switch that integrates into vCenter in the form of a VM. One of the prerequisites for setting up the 1000v is having several VLANs: one for management, one for packet and another for control. I had always used a standard vSwitch without VLAN tagging in my previous tests, so I wanted to test VLANs with two ESX machines first and set up the 1000v VM later. And of course, I ran into some tricky issues.

Let me talk about my setup first:

  • 2 ESX 4 U1 servers
  • Each server has one 1G Ethernet card and 1 FCoE CNA card
  • The Ethernet card connects to the public network and the CNA connects to a Nexus 5020 on a private network
  • The ESX servers and the VMs boot from an EMC SAN

I created one VM on each ESX host on the private network, using the CNA's 10G Ethernet connection. Then I created VLAN 5 on the Nexus 5020 and set both VMs to use VLAN 5, but found the two could not ping each other. Then I realized trunking needs to be enabled on the switch ports carrying the VLAN. ESX supports three kinds of VLAN tagging: Virtual Machine Guest Tagging (VGT mode), External Switch Tagging (EST mode) and ESX Virtual Switch Tagging (VST mode). VST mode is policy-based and easy to configure, but it still needs port trunking. VMware seems to have only ESX 3.5 VLAN documents, which also recommend enabling "spanning-tree portfast" on the ports, so I did that. You also need to specify the allowed VLANs (or a range) on the port, otherwise traffic will be blocked. After these steps, the VMs could ping each other.

I then arbitrarily changed the VLAN tag to VLAN 6, which I had not created on the Nexus 5020, and the VMs could not ping each other again. I logged in to the Nexus switch, created VLAN 6, added it to the allowed list, and the problem was gone. So another tip: you still need to create the VLAN on the physical switch before you use any of the VLAN tagging methods.
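On the ESX side, VST mode only needs the VLAN ID set on the VM port group, which can be done from the vSphere Client or from the service console. A quick sketch, assuming a port group named "VM Private" on vSwitch1 (both names are just examples, not my actual configuration):

#esxcfg-vswitch -v 5 -p "VM Private" vSwitch1 (tag the port group with VLAN 5)

#esxcfg-vswitch -l (the VLAN column should now show 5 for that port group)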

To summarize, here are the commands to configure the VLAN on the Nexus 5020 in my case:

NX-5020# config

NX-5020(config)# vlan 5    //create the VLAN

NX-5020(config)# show vlan    //make sure the VLAN is created

NX-5020(config)# show run    //check the Ethernet interface settings of your target port

(I recommend simply clearing the settings on the port for easier troubleshooting)

NX-5020(config)# interface Ethernet PortID    //select the port

NX-5020(config-if)# switchport mode trunk    //enable trunking mode

NX-5020(config-if)# spanning-tree port type edge trunk    //enable portfast on the trunk; you will probably see a warning message, just ignore it

NX-5020(config-if)# switchport trunk allowed vlan ID1,ID2,...    //allow the VLANs; you can also use "all" to allow all VLANs
I am now configuring my Nexus 1000v and still figuring out the topology; I will post an update once it's done. I hope this post clears up some of the confusion around vSphere 4 VLAN configuration on the Nexus 5020.