Unsolved
13 Posts
1
3586
July 13th, 2013 23:00
What is the procedure for a proactive hot spare?
One disk failed in our environment (CX3-80). We received about 500 warnings reporting soft media errors, plus a critical error telling us to initiate a proactive hot spare. However, due to poor monitoring we missed them and the disk failed. Although we had 2 hot spares configured, of the same type and a larger size, FLARE still did not initiate a hot spare automatically. My questions are below.
1) Why was the hot spare not initiated automatically by FLARE when the disk failed? Please note that I have checked, and the hot spare configuration meets all the requirements of the hot spare algorithm, i.e. size, type and location.
2) Can a proactive hot spare also be initiated manually after the disk has failed, since in our case the hot spare was not initiated automatically?
3) Can a proactive hot spare be performed during business hours, without a maintenance window?
4) Will performing a proactive hot spare automatically redirect host access to the new location (the hot spare disk), and after the replacement will the data be copied automatically from the hot spare to the new disk?
I have already referred to this KB article on the subject:
http://www.emc.com/collateral/software/white-papers/c1069-clariion-global-hot-spares-ldv.pdf
Thanks in advance....
Vipin VK
2 Intern
•
812 Posts
1
July 14th, 2013 00:00
If you have enough hot spare disks configured for all the disk types in the system, the failed disk should by now have been replaced by one of them. Please check the disk summary and the status of the faulty disk and the hot spares.
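As a rough illustration of that check, here is a minimal Python sketch of an eligibility test along the lines of the size, type and location criteria mentioned in the question. The `Disk` fields, the state strings, the slot naming and the rules in `eligible_spares` are all assumptions made for illustration; this is not FLARE's actual selection algorithm, and the whitepaper linked in the question remains the authority on how each criterion is weighed.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    slot: str         # hypothetical "bus_enclosure_disk" label, e.g. "0_0_5"
    drive_type: str   # e.g. "FC" or "SATA"
    capacity_gb: int
    state: str        # illustrative state string, e.g. "Hot Spare Ready"


def eligible_spares(failed, spares):
    """Return spares passing the simplified checks, same-bus spares first."""
    candidates = [
        s for s in spares
        if s.state == "Hot Spare Ready"            # spare is idle, not in use
        and s.drive_type == failed.drive_type      # same drive type
        and s.capacity_gb >= failed.capacity_gb    # at least as large
    ]
    failed_bus = failed.slot.split("_")[0]
    # Assumption: prefer a spare on the same bus, then the smallest that fits.
    return sorted(
        candidates,
        key=lambda s: (s.slot.split("_")[0] != failed_bus, s.capacity_gb),
    )
```

If a function like this returns an empty list for the faulted disk, the spare configuration itself is the first thing to look at; if it returns candidates and no spare engaged, the cause is more likely elsewhere.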
When a disk logs a large number of soft SCSI errors (the threshold is decided by FLARE OE), the system raises a notification to initiate a proactive copy. When the drive is about to fail, the system invokes a hot spare for it, and once the copy is complete the disk is marked as failed.
A proactive copy can cause a small performance impact, depending on the configuration and the rebuild priority settings, but it does not require a maintenance window. Once the disk has failed, we cannot do a proactive copy. Once the copy is initiated, FLARE manages the writes and directs them to the right disk. After the replacement, FLARE initiates a copy back (equalization) automatically.
All of this is explained in the article you mentioned.
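To make the sequence above easier to follow, here is a small Python sketch that models it as a state machine. The state names, event names and the `next_state` helper are my own simplification of the description in this post, not FLARE internals.

```python
from enum import Enum, auto


class State(Enum):
    HEALTHY = auto()
    PROACTIVE_COPY = auto()     # copying from the suspect disk to a hot spare
    REBUILD_TO_SPARE = auto()   # disk failed outright, spare rebuilt from RAID
    RUNNING_ON_SPARE = auto()   # original disk marked failed, spare serves I/O
    EQUALIZING = auto()         # copy back from spare to the replacement disk


# Allowed transitions, keyed by (current state, event). Hypothetical names.
TRANSITIONS = {
    (State.HEALTHY, "soft_error_threshold"): State.PROACTIVE_COPY,
    (State.HEALTHY, "disk_failed"): State.REBUILD_TO_SPARE,
    (State.PROACTIVE_COPY, "copy_done"): State.RUNNING_ON_SPARE,
    (State.REBUILD_TO_SPARE, "rebuild_done"): State.RUNNING_ON_SPARE,
    (State.RUNNING_ON_SPARE, "disk_replaced"): State.EQUALIZING,
    (State.EQUALIZING, "copy_done"): State.HEALTHY,
}


def next_state(state, event):
    """Return the state that follows `event`, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None
```

Note that there is no path into `PROACTIVE_COPY` from any state reached after `disk_failed`, which is the point made above: once the disk has failed, the only route to the spare is a normal rebuild.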
Steavson
9 Posts
0
July 15th, 2013 11:00
Please check whether the drive fault was detected by both SPs. If not, the hot spare will not be invoked. You can check this using the SP collect file.
kelleg
4.5K Posts
1
July 16th, 2013 08:00
Proactive Copy to Hot Spare (PaCO) will not start under a number of conditions, but none of these appear to apply to your situation as described. The only way to be certain what occurred would be to examine the spcollects from the time of the failure to determine why the copy to the hot spare was not engaged.
Make sure that the version of FLARE that you are running is the most up-to-date. There have been patches over the years to address some issues with the operation of PaCO.
2) Can a proactive hot spare also be initiated manually after the disk has failed, since in our case the hot spare was not initiated automatically?
If the disk fails, then normal RAID protection should take place - if the RAID group that contains the disk is RAID protected, then the Hot Spare should engage automatically
3) Can a proactive hot spare be performed during business hours, without a maintenance window?
This would be the same as if a disk failed and the Hot Spare engaged - production will be impacted as with any disk replacement
4) Will performing a proactive hot spare automatically redirect host access to the new location (the hot spare disk), and after the replacement will the data be copied automatically from the hot spare to the new disk?
Yes - this is the way it works
Glen
Vipin VK
2 Intern
•
812 Posts
0
July 16th, 2013 09:00
I mentioned in my comment that we cannot do a proactive copy once the disk has failed, because a copy performed after the failure is no longer proactive.
kelleg
4.5K Posts
0
July 16th, 2013 09:00
Sorry - missed that - too many words for old eyes
glen
Vipin VK
2 Intern
•
812 Posts
0
July 16th, 2013 10:00
No problems..
Thanks for a very good explanation.