Andrew_P76

1 Rookie

•

16 Posts

0

2848

February 18th, 2020 09:00

SC040 Stuck - Can't delete disk or rebalance RAID

Hello,

I have an old Compellent that is no longer covered under hardware/software support. Two disks failed in one enclosure and a third is in predictive failure, so I'm panicking haha. (The SC040 isn't able to send email alerts, so we hadn't checked in a long time, our bad, and we assumed a light would appear on the physical disk when it went down.)

I've replaced the two failed drives (03-00 and 03-08).

03-00 appears twice in the Disks > 15K > Assigned folder
- One disk was a spare, Status = Down, Health = Predicted Failure
- Next disk for 03-00 shows Control Type = Unmanaged, Status = Up, Health = Healthy
03-08 appears twice in the Disks > 15K > Assigned folder
- One disk was Status = Down, Health = TooManyMedium Error, Path Alert: Disk 54 all paths to device are down
- Next disk for 03-08 shows Control Type = Transitioning, Status = Up, Health = Healthy

If I try to delete the failed disks, the system times out and says the task cannot be completed in the time given (2-3 minutes before it times out).

For the 03-00 healthy disk, I have the option to "revert to managed". However, that fails with error:

Summary: An error occurred trying to modify the disk object (identified as ''). Cause: Rollback DiskManager: Disk Request … Set Source: API Leader: 1 State: 6 CheckDeleteStatus RetVal: Failure Idx: 71

I tried to rebalance raid, but that task has been running for 12 days now and is still at 0%. When I check the Background Processes, there are 8 processes in total. 2 are "in progress", both at 0%. There are 2 other processes in the Stopped state:

Raid Scrub Progress 16% Progress Message xxx of xxx block, start time 09/13/2017

Raid Rebuild Progress 76% Progress Message xxx of xxx block, start time 07/05/2018

So it seems the system is stuck but I'm not sure how to proceed. Any tips? I'm not a Compellent guru so as much advice as possible is greatly appreciated.

Thank you.

Responses(5)

Andrew_P76

1 Rookie

•

16 Posts

0

February 18th, 2020 12:00

Hello,

I have an old SC040 on old firmware (6.3.10) that isn't on hw/sw support. It seems the system is stuck when trying to complete some tasks/jobs. Any tips? I'm no Compellent guru, so the more detail you can give, the better.

Issue(s)

On the same enclosure: 2 failed disks (one is a spare), 1 predictive failure disk
Replaced the 2 failed disks
- Both of the failed disks cannot be deleted - get a timeout error (no error ID or further details)
- Tried to do a raid rebalance, this has been running for 12 days and is stuck at 0%
- One disk has the option to "Revert to Managed", when I try this I get error:
  Summary: An error occurred trying to modify the disk object (identified as ''). Cause: Rollback DiskManager:Disk Request: 1408528467 Set Source: API Leader: 1 State 6 CheckDeleteStatus RetVal: Failure Idx: 71

When I view the background processes I see:

8 processes (1x from 2017, 2x from 2018, 5x from 2019)
- Stop, Raid Scrub, Progress 16%, xx of xxx blocks, scrub raid device 188, 2017
- Stop, Raid Rebuild, Progress 76%, xx of xxx blocks, rebuild raid device 189 extent 5, 2018
- Stop, Raid Scrub, Progress 0%, , scrub raid device 189, 2018
- Stop, Raid Rebuild, Progress 0%, , rebuild raid device 205 extent 19, 2019
- In progress, Raid Rebuild, Progress 0%, xx of xx blocks, rebuild raid device 206 extent 19, 2019
- Stop, Raid Rebuild, Progress 0%, , rebuild raid device 207 extent 19, 2019
- Stop, Raid Rebuild, Progress 0%, , rebuild raid device 208 extent 19, 2019
- In progress, Raid Rebuild, Progress 0%, , rebuild raid device 204 extent 19, 2019
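
To make the list above easier to reason about, here is a minimal Python sketch that tallies the processes by state and picks out the ones that are "In progress" but sitting at 0%. The data is transcribed by hand from the list; nothing is read from the SC040. The `Task` class and the helper names `count_by_state` and `stalled` are made up for illustration and are not part of any Compellent tool or API.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One background process row, as transcribed from the list above."""
    state: str      # "Stop" or "In progress"
    kind: str       # "Raid Scrub" or "Raid Rebuild"
    progress: int   # percent complete as displayed
    target: str     # raid device (and extent, if shown)
    year: int       # start year as displayed


# Hand-copied from the post; hypothetical structure, not an export format.
TASKS = [
    Task("Stop", "Raid Scrub", 16, "raid device 188", 2017),
    Task("Stop", "Raid Rebuild", 76, "raid device 189 extent 5", 2018),
    Task("Stop", "Raid Scrub", 0, "raid device 189", 2018),
    Task("Stop", "Raid Rebuild", 0, "raid device 205 extent 19", 2019),
    Task("In progress", "Raid Rebuild", 0, "raid device 206 extent 19", 2019),
    Task("Stop", "Raid Rebuild", 0, "raid device 207 extent 19", 2019),
    Task("Stop", "Raid Rebuild", 0, "raid device 208 extent 19", 2019),
    Task("In progress", "Raid Rebuild", 0, "raid device 204 extent 19", 2019),
]


def count_by_state(tasks):
    """Return a dict mapping each state to the number of tasks in it."""
    counts = {}
    for task in tasks:
        counts[task.state] = counts.get(task.state, 0) + 1
    return counts


def stalled(tasks):
    """Return tasks that are marked in progress but show 0% progress."""
    return [t for t in tasks if t.state == "In progress" and t.progress == 0]


if __name__ == "__main__":
    print(count_by_state(TASKS))
    for task in stalled(TASKS):
        print("stalled:", task.kind, task.target, task.year)
```

Both "In progress" rebuilds (raid devices 206 and 204) show up as stalled, and everything else is stopped, which matches the feeling that nothing is actually moving.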

Andrew_P76

1 Rookie

•

16 Posts

0

February 18th, 2020 12:00

Sorry for the duplicate post. The first was marked as spam and took a while to be corrected/posted by a moderator.

DELL-Sam L

Moderator

•

7.6K Posts

0

February 18th, 2020 13:00

Hello Andrew_P76,

What SCOS version is currently on your SC040? Here are the steps needed to replace a drive and do a RAID rebalance. If you are still using Enterprise Manager, these steps will not work.

https://dell.to/328ow80

Manage Unassigned Disks

https://dell.to/2u8OmvO

Please let us know if you have any other questions.

Andrew_P76

1 Rookie

•

16 Posts

0

February 18th, 2020 14:00

v6.3.10

The 1st link provided is for a newer version, so we don't have those options.

When I physically replaced the two failed disks, they did appear as unmanaged. I was able to assign them to the Assigned folder, and that worked. But now the display shows the old failed disks (03-00 and 03-08) and the new disks in those same slots (these have an OK status).

Trying to delete the remaining failed disks times out. After a few minutes of the system trying to delete them, it shows a timeout message saying something like the system was not able to delete the disks in a timely fashion, and gives the option to continue waiting or close the window. I've tried continuing on one disk for about an hour with the same results.

Thanks for the tips, any other ideas?

I was trying to find documentation for consoling into the machine to see if I can get any more details or clues on errors and how to proceed. Or perhaps a full reboot of both controllers may help?

Dual controllers
3 enclosures (7k disks, 10k disks, 15k disks)

DELL-Sam L

Moderator

•

7.6K Posts

0

February 19th, 2020 10:00

Hello Andrew_P76,

Thanks for your SCOS version. Based on what you are seeing, there is no way to resolve this without contacting support. With your rebuild still sitting at 0%, support will have to look in the secure console to see what is going on.

Please let us know if you have any other questions.
