Unsolved
May 21st, 2013 06:00
MSCS on VMware using MirrorView - failure on Array power off
Hi All
We have a Windows Cluster (2008 R2) running on VMware 5.0, using MirrorView/S (FLARE 30) to replicate the LUNs across two CLARiiON CX4-480 arrays.
Unfortunately on a fairly regular basis we have to power down our Primary machine room (due to local building development).
The cluster and MirrorView are failed over and all seems to work correctly from the Secondary array. However, when the Primary array is powered off, the cluster fails and the application goes offline.
This configuration was tested successfully and implemented when we were running VMware 4.1 (vSphere and ESXi). We have since upgraded to VMware 5.0 (vSphere and ESXi) and we get these issues. However, testing did not include turning off the Primary array!
The Windows guy is talking about editing mappings files to remove references to the opposite Data Centre.
If anyone has any suggestions or thoughts, or has experience of something similar, that would be great.
Many thanks in anticipation
Jo
UWEadmin
17 Posts
0
May 21st, 2013 07:00
Apologies, but is your suggestion that VMware check the mappings files? Are they the MSCS files you are referring to?
Sorry I am a UNIX gal......
AnkitMehta
1.4K Posts
0
May 21st, 2013 07:00
1. Get a Backup Powersource. (Generator etc.)
2. Refer to http://pubs.vmware.com/vsphere-50/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-50-mscs-guide.pdf and http://www.vmware.com/pdf/clariion_wp_eng.pdf
3. Refer to attached document as well.
I am with the Windows guy at the moment. But if you could help me, we can work together (here on ECN):
1. Open a case with VMware Technical Support and check/verify the mappings
2. Post a diagram.
3. More details about licensing on VMware end/environment.
1 Attachment
h5588-emc-mirrorview-cluster-enabler-wp.pdf
AnkitMehta
1.4K Posts
0
May 21st, 2013 07:00
So if I understood this correctly, your major concern is the loss of service when the Primary SAN (CLARiiON) goes down: although MV/S is configured, the target array doesn't take over as it should?
UWEadmin
17 Posts
0
May 21st, 2013 07:00
Thanks Ankit
1. The Business is working on it - slowly!
2. Will take a good look at these now.
3. Windows guy couldn't get it installed! We tried to engage help from EMC / reseller but he was not open to assistance.
A case was opened by the Windows guy with VMware, 13282791302 (now closed), but it only led to VMware KB article "vSphere handling of LUNs detected as snapshot LUNs (1011387)".
We do have two calls currently open with EMC and VMware, linked to a single VM host running Linux that was on a MirrorView LUN. We had problems with it when we failed back (the host could see the LUN but the datastore was not recognised). VMware have sent us the same KB article as was sent to the Windows guy for his call. EMC have sent us procedures that incorporate the steps in the above VMware KB, and these work fine on a test config. We have now cloned our Linux VM, which was also set up whilst we were running vSphere 4.1 (ESX 4.1) and has not worked since upgrading to vSphere 5.0 (ESXi 5.0), and we are going to try the KB steps to see if we can replicate the issue.
However, none of the above seems to relate to a loss of service when the Primary SAN goes down. This seems to be a separate issue which, try as I might, I cannot get the Windows guy to go through with me.
I will dig out a diagram, neutralize it and post.
We were running vSphere 4.1 and ESX 4.1. The environment was upgraded to vSphere 5.0 and ESXi 5.0 after the configuration was installed and tested. We have both the 5.0 Standard and Enterprise Plus licenses.
Thanks again
Jo
UWEadmin
17 Posts
0
May 24th, 2013 04:00
Hi Ankit,
Yes that's the main concern.
We know the power outage is coming. We fail over the cluster and MirrorView/S, and all is seemingly running fine. Then the Primary SAN is powered off and everything fails.
Thanks
Jo
UWEadmin
17 Posts
0
May 24th, 2013 05:00
Haha
I have had a chat with VMware and I think I now understand.
The problem is that once the Primary SAN is shut down, VMware marks the LUNs as deactivated snapshots (VMware KB 1011387), which are read-only.
As the cluster can no longer write to the LUNs, the application fails and all falls over.
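For anyone following along, the snapshot-LUN detection described in that KB can be sketched as a toy model. This is a simplified illustration in Python, not VMware's actual logic: the `Lun` class, its field names and the identifier strings are all made up for the example. The assumption is that a replicated LUN carries the same VMFS header as its source but is presented under a different device identifier, and that mismatch is what gets it flagged as a snapshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lun:
    # Identifier the array presents for this LUN (hypothetical value).
    device_id: str
    # Identifier recorded in the VMFS header when the datastore was created.
    header_device_id: str


def looks_like_snapshot(lun: Lun) -> bool:
    """Toy check: a mismatch between the presented device ID and the ID
    stored in the volume header is treated as a snapshot/replica."""
    return lun.device_id != lun.header_device_id


# The primary LUN was formatted in place, so both identifiers agree.
primary = Lun(device_id="naa.primary", header_device_id="naa.primary")

# The MirrorView secondary holds a block-level copy of the header,
# but the secondary array presents it under its own identifier.
secondary = Lun(device_id="naa.secondary",
                header_device_id=primary.header_device_id)

print(looks_like_snapshot(primary))
print(looks_like_snapshot(secondary))
```

On the host itself, volumes detected this way are listed by `esxcli storage vmfs snapshot list` on ESXi 5.x, which is the starting point for the mount or resignature steps in the KB.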
I believe the problem is that whilst we fail the MirrorView/S over we do not stop the replication.
Unless we shut down a SAN, we cannot replicate the problem, unless someone knows how to simulate a SAN power off to certain LUNs without actually powering off a SAN?
I am now off to find the procedure for stopping MirrorView/S replication when shutting down a SAN.
Thanks
Jo
AnkitMehta
1.4K Posts
1
May 24th, 2013 05:00
Hi Jo,
IMHO, this looks like a complex configuration issue which needs to be resolved after verifying the necessary support logs. When it comes to EMC CLARiiON and EMC Support boundaries, Technical Support will ask for:
1. SP Collects for both CLARiiON arrays
2. vm-support for ESX (they'll try to collaborate with the internal VMware team and see if we can come up with any suggestion)
I am only referring you to Support because I do not want to give you a blind guess which may or may not work out! I would rather suggest this route, which will assure you sound sleep in future.
Are both of these storage units under a Support Contract, or at least one? Please respond with Yes/No so that I can post an appropriate response.
UWEadmin
17 Posts
0
May 24th, 2013 05:00
They are both under support.
I have a call logged with VMware and we have a related call with EMC.
Our EMC account manager is trying to get all connected and move forward.
There are lots of politics this end which I am trying to leave out of all of this.
It is after all a technical problem that is sure to have a technical answer
Thanks
Jo
UWEadmin
17 Posts
0
May 24th, 2013 05:00
Attached is the design diagram.
There was an attempt to install MirrorView CE but the Windows guy did not get on with the install documents.
We have therefore used MirrorView/S instead.
Thanks
Jo
1 Attachment
New Blackboard Design diagram.vsd
AnkitMehta
1.4K Posts
0
May 24th, 2013 06:00
Hi Jo
Can you please PM the EMC Support Ticket Number to me, glen, TimH (Timothy Hughes, right?) and Mark? Time permitting, one of us will check this or will try to engage the right support engineer with the appropriate skill set!
glen, TimH: can you please check this? Sounds intriguing and complex! Let's do this!
Seems like one of those times when I want to run to John Nehiley for help!
UWEadmin
17 Posts
0
May 24th, 2013 10:00
Hi
The forum's Send Mail says you need to be a connection/friend first. Follow me and I can send it to you, unless there is another way.
Thanks
Jo
AnkitMehta
1.4K Posts
1
May 27th, 2013 03:00
Hi Jo,
My apologies for taking some time to respond. I was working with the relevant people on how to proceed with this and waiting for a response I could share.
My sincere apologies that you were not able to PM me and that I had to be a connection before you could PM or email me. It seems that after the Jive upgrade one cannot PM a contributor directly unless he/she is a connection. That is definitely feedback for the ECN Team (Alan Z., Brace Rennels).
Anyhow, I have discussed this with the relevant group and we will be creating a Private Discussion where we can work on this issue here on ECN. Once the issue is fixed, we will post the resolution and a brief summary here.
Ankit