June 24th, 2016 13:00

Drive in Predictive Failure state on PowerEdge R510 server running ESXi 5.5

Hi,

I would like to know if I can hot swap a drive that is in a predicted failure state. I'm fairly sure I set the array up as RAID 5 years ago, but I'm not sure whether I assigned a hot spare to it.

1. I would like to determine whether I configured a hot spare. How can I check this on an R510 running ESXi 5.5?

2. If a hot spare is set up, what is the best option? If I let the drive fail, will the hot spare kick in and rebuild?

3. Or should I hot swap the predicted-failure drive directly on the live production server, and will it then begin the rebuild process?

Many thanks

RT

Moderator

June 28th, 2016 12:00

GeneSysComputing,

When a drive is flagged as a Predicted Failure and is still online, you have to force that drive offline first. Do NOT hot swap the drive while it is ONLINE. A Predicted Failure flag indicates that the drive has exceeded a threshold of bad blocks; by rebuilding off that drive you run the risk of transferring those bad blocks into the Virtual Disk, causing a double fault or punctured stripe.

You can use OpenManage Server Administrator (OMSA) to view that drive and force it offline if need be.
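The rule above can be sketched as a small decision function. This is only an illustration in Python: the `PhysicalDisk` type, the `next_action` name, and the state labels are hypothetical and are not part of OMSA or any Dell tool.

```python
from dataclasses import dataclass


@dataclass
class PhysicalDisk:
    slot: int
    state: str               # illustrative labels, e.g. "Online", "Offline", "Failed"
    predicted_failure: bool


def next_action(disk: PhysicalDisk) -> str:
    """Suggest the next step for one disk, following the advice in this reply."""
    state = disk.state.strip().lower()
    if state == "online" and disk.predicted_failure:
        # Pulling an online predicted-failure drive is what the advice warns against.
        return "force offline, then replace"
    if state in ("offline", "failed"):
        # The drive is already out of the array, so it can be swapped.
        return "replace"
    if state == "online":
        return "no action"
    # Anything else (for example a drive that is rebuilding) needs a human decision.
    return "review manually"


print(next_action(PhysicalDisk(slot=4, state="Online", predicted_failure=True)))
```

The point of the sketch is the ordering: the drive's state is checked before any physical swap is suggested.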

You can read more on installing the OMSA VIB in this post; although it covers ESXi 5.0, the steps remain the same. The downloads have changed, though: the VIB file needed is located here, and the OMSA Web version is located here.

Let me know if this helps.

June 29th, 2016 16:00

Sorry - which post has the instructions?

Moderator

June 30th, 2016 05:00

I apologize. I have added the link to the post above and have also included it here.

July 4th, 2016 22:00

OK, a couple of problems now:

I had to reboot the server to get the VIB to load.

After that I get the following:

1. A message on the workstation: "Server Administrator Web Server 8.2.0 is not supported with this Server Instrumentation Version 8.3.0. Refer to the Dell OpenManage Installation and Security User's Guide for details on how to upgrade the Server Administrator Web Server."

2. After the reboot, a second drive has gone to predicted failure.

:-(
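One plausible reading of the version message is that the Web Server must be at least as new as the Server Instrumentation it connects to. A minimal Python sketch of that check follows. The rule itself is an assumption, the function names are hypothetical, and the actual compatibility rules are in the Dell guide the message refers to.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '8.2.0' into a comparable tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def web_server_supported(web_server: str, instrumentation: str) -> bool:
    # Assumption: the Web Server must be the same version as, or newer than,
    # the Server Instrumentation on the managed host.
    return parse_version(web_server) >= parse_version(instrumentation)


# The pair of versions quoted in the message above.
print(web_server_supported("8.2.0", "8.3.0"))
```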

July 4th, 2016 22:00

OK, I got to see the array via OMSA by installing a newer version of the desktop software. At present I see that disk 4, the predicted failure disk, is in the "Rebuilding" state.

I have 6 physical disks in total: 0, 1, 2, 3, 4, 5. The original predicted failure was on disk 4. Now I get a DRIVE FAULT on disk 5 (not a predicted failure as I initially said); this is displayed in vSphere. However, disk 5 is not present in OMSA.

Any ideas as to what is going on here, and what the next step should be?

Moderator

July 5th, 2016 05:00

Just to confirm, are you currently able to get to the OS? What is the Virtual Disk's status currently?

If you are able to get to the OS, the first thing I would do is stop the Predicted Failure drive from rebuilding, as that can create the double fault I described. I would power down, remove the drive, then power up and insert a fresh replacement.

Lastly, what firmware version is on the controller, as well as on the drives? We will see whether out-of-date firmware is the cause.

Let me know.

July 5th, 2016 07:00

Yes, I can get to the OS. In vSphere, I have Drive 4 in Predictive Failure and Drive 5 as Drive Fault. The Virtual Disk status in OMSA is READY. (The rebuild completed overnight on Drive 4, and it is ONLINE with a predicted failure indicated in OMSA.)

I don't have the replacement drive yet; it should be arriving this afternoon.

OMSA tells me the firmware is 6.3.0-0001, driver version 00.00.05.34-9vmw

The drives 0,1,2,3,4 are revision 1AJ30001

Drive 5 is not showing up in OMSA.

Suggestions?

Moderator

July 6th, 2016 08:00

Thank you. When the replacement arrives, I would suggest that you force the predicted failure drive 4 offline. Once it is offline, insert the replacement and it should rebuild cleanly, without the predicted failure flag transferring over. Regarding drive 5, what model drive is it? I can verify whether there is an update available for it.

July 6th, 2016 09:00

Thanks Chris,

Please confirm that the following sequence is correct:

1. Use OMSA to put Drive 4 (the predicted failure drive) OFFLINE.

2. Pull Drive 4 and replace it with the new drive.

3. Let the new drive rebuild?

4. Once the rebuild is complete, pull Drive 5 (the fault drive) and get you the details from the drive label?

5. Leave Slot 5 empty.

Let me know and I will proceed as soon as I hear back from you.

Cheers

July 6th, 2016 12:00

Thanks Chris - will keep you posted.

Expecting the drive to arrive today.

Moderator

July 6th, 2016 12:00

Perform steps 1, 2, and 3. Then we will update the controller before resorting to steps 4 and 5.

There is a 6.3.3-002 firmware update for the controller here.

Let's see if that resolves the issue before moving on to steps 4 and 5.
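As a rough illustration of why the controller counts as out of date, the two firmware strings from this thread can be compared numerically. This is a Python sketch only; the helper names `firmware_key` and `update_available` are hypothetical, and it assumes the version strings are purely numeric fields separated by dots and a dash.

```python
import re
from typing import Tuple


def firmware_key(version: str) -> Tuple[int, ...]:
    """Turn a string like '6.3.0-0001' into a comparable tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", version.strip()))


def update_available(installed: str, offered: str) -> bool:
    """Return True when the offered firmware is newer than the installed one."""
    return firmware_key(offered) > firmware_key(installed)


# Installed version as reported by OMSA earlier in the thread,
# and the update version mentioned in this reply.
print(update_available("6.3.0-0001", "6.3.3-002"))
```

Comparing integer tuples avoids the pitfalls of plain string comparison when fields have leading zeros or different widths.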
