November 24th, 2017 14:00
VNX5500 - 3 drives failed and 2 hot spare drives replaced them. How is that possible? Can a single hot spare drive replace more than one drive?
Hello,
A customer is planning to replace his old VNX5500 with a Unity 400. But first, we need to solve the following problem:
We took SPCollects from his VNX5500 and we see that 3 SAS 300 GB 15k rpm drives have failed, and 2 hot spare drives of the same type, TLA number and speed are replacing them.
The Issues tab shows the following:
| Critical | Disk 0_1_12 is not enabled. | Replace failed disk or call your service provider. | |
| Critical | Disk 0_1_4 is not enabled. | Replace failed disk or call your service provider. | |
| Critical | Hot spare 0_6_12 is in use. | Replace failed disk or call your service provider. | |
| Critical | Hot spare 0_0_14 is in use. | Replace failed disk or call your service provider. | |
| Critical | Disk 0_0_2 is not enabled. | Replace failed disk or call your service provider. | |
The "Hot Spare Replacing" column on the "Physical Disk Details" tab shows the following:
| Firmware Revision | State | Hot Spare Replacing |
| TC3Q/2264 | Hot Spare Ready | Inactive |
| ES0F | Enabled | 0_1_12, 0_1_4, 0_0_2 |
| C7A0 | Hot Spare Ready | Inactive |
| CS18 | Hot Spare Ready | Inactive |
| ES0F | Enabled | 0_1_12, 0_1_4, 0_0_2 |
It seems that each of the two hot spare drives is replacing all three failed drives. I have never seen this before. As I understand it, a hot spare drive can only replace one failed drive (1:1).
Is this normal behavior?
I would appreciate any help on this matter.

kelleg
November 30th, 2017 10:00
That output looks odd. One hot spare cannot replace two different disks, so something else is going on. I would recommend opening a service request with EMC to find out what the actual issue is.
If you have replacement disks available, you should probably replace the disks listed as failed, but first make sure that the failed disks are not in the same RAID group (double fault).
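The double-fault check above can be sketched in a few lines of Python. This is only an illustration: the `raid_group_of` mapping and its RAID group names are hypothetical placeholders, and the real disk-to-RAID-group assignment has to be read from Unisphere or the SPCollect.

```python
from collections import defaultdict


def find_double_faults(failed_disks, raid_group_of):
    """Return the RAID groups that contain more than one failed disk."""
    by_group = defaultdict(list)
    for disk in failed_disks:
        by_group[raid_group_of[disk]].append(disk)
    # Keep only groups with two or more failed members.
    return {rg: disks for rg, disks in by_group.items() if len(disks) > 1}


# Failed disks as listed in the Issues tab.
failed = ["0_1_12", "0_1_4", "0_0_2"]

# Hypothetical mapping for illustration only; not taken from the array.
raid_group_of = {"0_1_12": "RG-A", "0_1_4": "RG-B", "0_0_2": "RG-C"}

double_faults = find_double_faults(failed, raid_group_of)
print(double_faults or "No RAID group has more than one failed disk")
```

An empty result means every failed disk sits in a different RAID group, so replacing them one at a time should not hit a double fault.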
glen
Chelino
November 30th, 2017 11:00
Hello Glen,
Yesterday we accessed the Unisphere client directly and, as you stated, the hot spare drives are replacing the failed drives on a one-to-one basis. Maybe the process that converts the SPCollect to an Excel output has errors.
Indeed, the faulted drives belong to different RAID groups.
First Hot Spare
Second Hot Spare
The problem we see is that one of the failed drives is a vault drive, and we are not able to copy it to a hot spare drive. We suppose that a vault drive cannot be copied to a hot spare drive. Is that correct?
In the case of the vault drive, I guess it is safe to replace it, since the vault drives have mirrored protection, right?
Thanks!
ZaphodB
December 5th, 2017 07:00
The vault drives are location dependent, so they do not engage a spare to rebuild the vault data. If the disks were also part of a pool or RAID group (not best practice, but not prohibited), then that additional data would cause a spare to be engaged.
It looks like these are 300GB disks, and in that case the vault disks have no room on them for anything but vault data, so technically they do not need a spare.
You do need to replace that vault disk to get the array healthy. If the array contains more hot spares than just the two that are in use, you could remove one of the idle ones and use it to replace the vault disk.
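As a rough model of the rule described above, here is a small Python sketch. It is not array code: the `FailedDisk` class and both function names are made up for illustration, and treating 0_0_2 as the failed vault disk is an assumption based on its slot.

```python
from dataclasses import dataclass


@dataclass
class FailedDisk:
    disk_id: str
    is_vault: bool
    has_raid_or_pool_data: bool


def spare_expected(disk):
    # Vault data is tied to the slot and is not rebuilt to a spare;
    # only RAID group or pool data on the disk engages one.
    return disk.has_raid_or_pool_data


def must_replace_in_same_slot(disk):
    # A vault disk is location dependent, so the replacement goes in its slot.
    return disk.is_vault


# Assumption: 0_0_2 is the vault disk and holds vault data only (300GB disk).
vault_disk = FailedDisk("0_0_2", is_vault=True, has_raid_or_pool_data=False)
data_disk = FailedDisk("0_1_4", is_vault=False, has_raid_or_pool_data=True)

for disk in (vault_disk, data_disk):
    print(disk.disk_id, "spare expected:", spare_expected(disk))
```

Under these assumptions only the two data disks engage a spare, which matches three failed disks with two hot spares in use.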