
August 29th, 2023 17:58

iDRAC: SSD Media Wearout indicator is inconsistent

I'm doing some testing with SSD media wearout where I am writing excessive data to the drive in an attempt to trigger an SSD Wear Threshold alert from iDRAC. The SSD in question is a(n):

Dell Certified 240GB Intel D3-S4610. (https://www.solidigm.com/products/data-center/d3/s4610.html)

From my research, this drive should be rated for 1.6 PBW, or about 7,000 full rewrites of its 240 GB capacity. That is a rough estimate, since I don't have every detail of the rating.
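As a rough sanity check on that estimate, the arithmetic can be sketched as below. This is only an illustration: it assumes decimal units (1 PB = 10^15 bytes), takes the 1.6 PBW figure from the post at face value, borrows the ~300 MB/s write rate mentioned later in the thread, and ignores write amplification. The function names are made up for this example.

```python
# Sketch only: decimal units, no write amplification, figures taken from the thread.
RATED_BYTES_WRITTEN = 1.6e15     # 1.6 PBW, as quoted in the post (not verified)
CAPACITY_BYTES = 240e9           # 240 GB drive
WRITE_RATE_BYTES_PER_S = 300e6   # ~300 MB/s, from a later reply in the thread


def full_drive_writes(rated_bytes_written, capacity_bytes):
    """Number of times the whole drive can be rewritten within the rating."""
    return rated_bytes_written / capacity_bytes


def hours_per_percent(rated_bytes_written, write_rate_bytes_per_s):
    """Hours of sustained host writes needed to consume 1% of the rating."""
    bytes_per_percent = rated_bytes_written / 100
    return bytes_per_percent / write_rate_bytes_per_s / 3600


if __name__ == "__main__":
    rewrites = full_drive_writes(RATED_BYTES_WRITTEN, CAPACITY_BYTES)
    hours = hours_per_percent(RATED_BYTES_WRITTEN, WRITE_RATE_BYTES_PER_S)
    print(f"{rewrites:.0f} full drive writes")
    print(f"{hours:.1f} hours of writes per 1% of rated endurance")
```

Under those assumptions this works out to roughly 6,667 rewrites and about 14.8 hours of sustained writes per 1% of rated endurance. Real wear will differ because the drive's internal writes are not the same as host writes.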

My concern is this: while running the test, I can watch with perccli as the worst erase count (0xAD) increases as expected and 0xF5 decreases. 0xF5 is the metric iDRAC uses for SSD wearout, according to https://www.dell.com/support/manuals/en-in/idrac9-lifecycle-controller-v4.x-series/idrac9_4.32.15.00_rn/storage-and-storage-controllers?guid=guid-ff39fa50-6112-4c86-93c7-64bec4d3349e&lang=en-us .

However, even while 0xF5 shows a normalized value of 99%, iDRAC is still reporting 100% Remaining Rated Write Endurance. This would only make sense if iDRAC used 0xE9 for this reporting as that still shows 100%.
I could be wrong here, but 0xAD seems to also show the Write Endurance in its normalized value byte and, at the time of writing this, is down to 98.

Is iDRAC wrong and/or slow to update these values, or is there some behind-the-scenes math going on that my mortal eyes are not allowed to see?

PowerEdge R740XD; iDRAC is:

14G Monolithic
0.01
6.00.30.00

S.M.A.R.T. data before testing began:

Smart Data Info /c0/e64/s9 =
01 00 01 0e 00 82 82 bf 66 00 00 00 00 00 05 33
00 64 64 00 00 00 00 00 00 00 09 32 00 64 64 c5
4b 00 00 00 00 00 0c 32 00 64 64 18 00 00 00 00
00 00 0d 1e 00 82 82 bf 66 00 00 0e 00 00 aa 33
00 64 64 00 00 00 00 00 00 00 ad 12 00 64 64 1d 
00 00 00 00 00 00 ae 32 00 64 64 18 00 00 00 00
00 00 af 12 00 64 64 1a 00 00 00 00 00 00 b3 33
00 64 64 00 00 00 00 00 00 00 b4 32 00 64 64 e1
0d 00 00 00 00 00 b5 3a 00 64 64 00 00 00 00 00
00 00 b6 3a 00 64 64 00 00 00 00 00 00 00 b8 32
00 64 64 00 00 00 00 00 00 00 c2 22 00 64 64 1b
00 00 00 00 00 00 c3 32 00 64 64 00 00 00 00 00
00 00 c5 12 00 64 64 00 00 00 00 00 00 00 c6 10
00 64 64 00 00 00 00 00 00 00 c7 3e 00 64 64 00
00 00 00 00 00 00 c9 33 00 64 64 ee 09 ff ff 1c
00 00 ca 27 00 64 64 00 00 00 00 00 00 00 e9 32
00 64 64 d1 a6 00 00 00 00 00 ea 32 00 64 64 00
00 00 00 00 00 00 eb 0b 00 64 64 d1 a6 00 00 00
00 00 f1 32 00 64 64 d1 a6 00 00 00 00 00 f2 32
00 64 64 be 07 0d 00 00 00 00 f5 32 00 64 64 64
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 02 00 24 00 00 79
03 00 01 80 01 3c 3c 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7f




S.M.A.R.T. data currently:

Smart Data Info /c0/e64/s9 =
01 00 01 0e 00 82 82 a9 a7 87 05 00 00 00 05 33
00 64 64 00 00 00 00 00 00 00 09 32 00 64 64 24
4c 00 00 00 00 00 0c 32 00 64 64 18 00 00 00 00
00 00 0d 1e 00 5a 5a a9 a7 87 05 b7 0d 00 aa 33
00 64 64 00 00 00 00 00 00 00 ad 12 00 62 62 cd
00 00 00 00 00 00 ae 32 00 64 64 18 00 00 00 00
00 00 af 12 00 64 64 ba 00 00 00 00 00 00 b3 33
00 64 64 00 00 00 00 00 00 00 b4 32 00 64 64 e1
0d 00 00 00 00 00 b5 3a 00 64 64 00 00 00 00 00
00 00 b6 3a 00 64 64 00 00 00 00 00 00 00 b8 32
00 64 64 00 00 00 00 00 00 00 c2 22 00 64 64 21
00 00 00 00 00 00 c3 32 00 64 64 00 00 00 00 00
00 00 c5 12 00 64 64 00 00 00 00 00 00 00 c6 10
00 64 64 00 00 00 00 00 00 00 c7 3e 00 64 64 00
00 00 00 00 00 00 c9 33 00 64 64 ee 09 ff ff 1c
00 00 ca 27 00 64 64 00 00 00 00 00 00 00 e9 32
00 64 64 7d bf 18 00 00 00 00 ea 32 00 64 64 00
00 00 00 00 00 00 eb 0b 00 64 64 7d bf 18 00 00
00 00 f1 32 00 64 64 7d bf 18 00 00 00 00 f2 32
00 64 64 46 0d 0d 00 00 00 00 f5 32 00 63 63 63
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 02 00 24 00 00 79
03 00 01 80 01 3c 3c 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d7
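
For anyone who wants to read these dumps without counting bytes by hand, here is a minimal sketch of a decoder. It assumes perccli is printing the raw ATA SMART data page in the usual layout: a 2-byte revision, then up to 30 entries of 12 bytes each (1-byte attribute ID, 2-byte flags, normalized value, worst value, 6-byte little-endian raw value, 1 reserved byte). The function name is hypothetical, and the sample below is only the header plus the 0xAD, 0xE9, and 0xF5 entries copied from the "currently" dump above, not the full page.

```python
def parse_smart_attributes(hex_text):
    """Decode an ATA SMART attribute table given as whitespace-separated hex bytes.

    Assumed layout: 2-byte revision, then up to 30 entries of 12 bytes each.
    Returns {attribute_id: {"flags", "value", "worst", "raw"}}.
    """
    data = bytes.fromhex("".join(hex_text.split()))
    attrs = {}
    for offset in range(2, 2 + 30 * 12, 12):
        entry = data[offset:offset + 12]
        if len(entry) < 12 or entry[0] == 0:
            continue  # truncated input or unused slot
        attrs[entry[0]] = {
            "flags": int.from_bytes(entry[1:3], "little"),
            "value": entry[3],   # normalized current value
            "worst": entry[4],   # normalized worst value
            "raw": int.from_bytes(entry[5:11], "little"),
        }
    return attrs


# Header plus three entries copied from the "currently" dump in this post.
SAMPLE = """
01 00
ad 12 00 62 62 cd 00 00 00 00 00 00
e9 32 00 64 64 7d bf 18 00 00 00 00
f5 32 00 63 63 63 00 00 00 00 00 00
"""

if __name__ == "__main__":
    for attr_id, fields in parse_smart_attributes(SAMPLE).items():
        print(f"0x{attr_id:02X} value={fields['value']} "
              f"worst={fields['worst']} raw={fields['raw']}")
```

On that excerpt the normalized values come out as 98 for 0xAD, 100 for 0xE9, and 99 for 0xF5, which matches the numbers quoted in the post. Pasting a full 512-byte dump into the function should work the same way if the layout assumption holds.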

October 2nd, 2023 18:22

For anyone looking to use this for monitoring their SSD %Life, it did eventually update itself (took maybe 2 weeks, which should be more than frequent enough for normal use cases).

Moderator


August 30th, 2023 02:00

Hello, thanks for choosing Dell. Are you using a Dell part? If so, could you please give us the part number? In the meantime, is your SSD's firmware up to date? Is iDRAC's as well? https://dell.to/3EfZtU2

September 1st, 2023 13:55

Apologies for my late update. This drive is holding up better than I anticipated, even at ~300MB/s writes.

Updated iDRAC from 6.00.30.00 to 7.00.00.00 and the SSD percentage updated correctly; however, this appears to have been triggered by the iDRAC reboot rather than by iDRAC tracking the drive's wear as it decreases. I brought the SSD down to 96% and iDRAC still shows 97%. Planning to leave it over the weekend to see if iDRAC updates on its own.

Side note for anyone doing similar testing with the Media Wearout Threshold feature: After some time, iDRAC eventually alerted (30 minutes ago) for write endurance on my drive (likely the weekly poll for SATA/SAS drives). A feature showing when the last poll occurred would be greatly appreciated on future releases if possible.

September 5th, 2023 15:03

I have received an alert for each of the 3 days of the weekend for the drive being below the set threshold; however, I let the wearout script run over the weekend and iDRAC still reports 97% while according to perccli it should be 93%.

Is the "THRESHOLD POLLING INTERVAL" (defined in table 1 here) an alerting interval or a %Life polling interval? The former appears wrong because I'm receiving daily alerts for a SAS/SATA SSD.

Moderator


September 5th, 2023 15:20

Atkina1747,

 

I believe it describes the interval used between alert queries.

As for iDRAC, I think the issue is that it isn't keeping up with the details in real time. In other words, iDRAC isn't showing the latest information, but rather the information provided during its last query.

 

Let me know if this helps.

 

DELL-Chris H

Social Media and Communities Professional

Dell Technologies | Enterprise Support Services

