1 Rookie
•
10 Posts
0
856
December 29th, 2022 15:00
The same disk failed 3 times
We have a CT-SC4020 storage system that is just out of warranty. One disk has failed three times since the middle of December 2022. Each time, we ordered a replacement disk from our service partner and had it installed, but it failed again after one or two days.
The most recent replacement was on the afternoon of 27th December, and it failed again last night, a little over a day later. I checked the log file and found the following:
FRU ticket created for failed device Disk 154 Position[2-5] Alert Critical No FRU Ticket Alert Hardware
Today I soft-reseated the disk and brought it back online, although the partner said that is not advised. In any case, I am wondering whether there are any tools we can use for further troubleshooting, as it is unlikely that all three new disks are faulty.
Many thanks in advance
Shermaine
DiegoLopez
4 Operator
•
2.7K Posts
1
December 30th, 2022 02:00
Hello @flyheart,
It's difficult to say what could be causing the issue without more information. If we focus on a pure hardware issue, the most likely causes of behavior like this are outdated firmware or incompatible drives. So first make sure all firmware is up to date: SC Storage Customer Notification: HDD and SSD Firmware Best Practices in SC Series Systems (https://dell.to/3ZbOqVx). This applies even to new disks, as they may initially have outdated firmware.
I don't think incompatible drives are the problem if the partner recommended the disks, but double-check that they are compatible.
I strongly suggest you open a case with phone Support and provide them with the logs, so they can check what may be causing the error.
Regards.
tianchou
1 Rookie
•
93 Posts
1
January 1st, 2023 19:00
If you have a serial cable, you can check disk status with the disk show and disk get index commands. They will give you the most detailed disk status and health information.
DELL-Joey C
Moderator
•
3.9K Posts
1
January 4th, 2023 21:00
Hi @flyheart,
I agree with Diego that you should open a case with phone support so the storage logs can be analyzed. Most of the time, a hardware replacement resolves a returned{PathFailure} error, but it has not in your case, so there could be a deeper issue. It could also be a firmware issue. There are sets of commands that can be run, but only by level 2 engineering after log analysis.
flyheart
1 Rookie
•
10 Posts
0
December 30th, 2022 10:00
Hi Diego,
Thanks for the recommendation, I will check the document. I soft-reseated the disk yesterday, and when I checked it this morning it was still online. However, there are a lot of predictive failure entries in the log pointing to the same disk, each saying returned{PathFailure}.
As I said, the disk has been replaced three times with a new disk, so this log message makes me think the problem might not be the disk but something in the enclosure.
Thanks again
SM
flyheart
1 Rookie
•
10 Posts
0
January 4th, 2023 14:00
The disk is now back in the storage system. I am not sure whether I can run any commands from the storage console to find more information.
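One way to check the enclosure-versus-disk idea raised earlier in this thread, without console access, is to tally alerts by slot position in an exported log. The following is only a minimal sketch: the pattern is taken from the single alert line quoted above, the reading of Position[2-5] as enclosure 2, slot 5 is an assumption, and the function name tally_by_position is hypothetical. Real log exports may be formatted differently.

```python
import re
from collections import Counter

# Pattern based on the one alert line quoted in this thread.
# Assumption: "Position[2-5]" means enclosure 2, slot 5.
ALERT_RE = re.compile(r"Disk (\d+) Position\[(\d+)-(\d+)\]")


def tally_by_position(lines):
    """Count matching alert lines per (enclosure, slot) and collect
    the disk indexes reported at each position."""
    counts = Counter()
    disks = {}
    for line in lines:
        match = ALERT_RE.search(line)
        if match is None:
            continue  # not an alert line in the assumed format
        disk, enclosure, slot = (int(g) for g in match.groups())
        position = (enclosure, slot)
        counts[position] += 1
        disks.setdefault(position, set()).add(disk)
    return counts, disks


if __name__ == "__main__":
    sample = [
        "FRU ticket created for failed device Disk 154 Position[2-5] "
        "Alert Critical No FRU Ticket Alert Hardware",
    ]
    counts, disks = tally_by_position(sample)
    for position, n in counts.most_common():
        print(position, n, sorted(disks[position]))
```

If one position keeps accumulating alerts across several replacement disks while neighboring slots stay clean, that would be consistent with a slot, backplane, or path problem rather than a bad drive. It is not proof, so the logs should still go to Support.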