4 Posts
0
604
August 29th, 2023 17:58
iDRAC: SSD Media Wearout indicator is inconsistent
I'm testing SSD media wearout by writing excessive amounts of data to a drive in an attempt to trigger an SSD Wear Threshold alert from iDRAC. The SSD in question is a:
Dell Certified 240GB Intel D3-S4610. (https://www.solidigm.com/products/data-center/d3/s4610.html)
From my research, this drive should be rated for 1.6 PBW (about 7,000 rewrites of the drive's capacity, give or take, since I don't know all the details).
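As a rough sanity check on the figures above, the arithmetic can be sketched as follows. The 1.6 PBW rating and 240GB capacity are the numbers quoted in this post; decimal units (1 PB = 10^15 bytes) and wear that scales linearly with host writes are assumptions, so treat the output as an estimate only.

```python
# Figures quoted in the post; decimal units are an assumption.
RATED_PBW = 1.6      # rated endurance, petabytes written
CAPACITY_GB = 240    # drive capacity, gigabytes


def full_drive_rewrites(rated_pbw: float, capacity_gb: float) -> float:
    """Number of full-capacity overwrites that fit inside the rating."""
    return (rated_pbw * 10**15) / (capacity_gb * 10**9)


rewrites = full_drive_rewrites(RATED_PBW, CAPACITY_GB)
tb_per_percent = RATED_PBW * 1000 / 100  # TB of writes per 1% of the rating

print(f"{rewrites:,.0f} full-drive rewrites")
print(f"{tb_per_percent:.0f} TB written per 1% of rated endurance")
```

Under these assumptions the result lands a little under 7,000 rewrites, which is where the "about 7,000" figure comes from.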
My concern is this: while conducting the test, I can use perccli to watch the worst erase count (0xAD) increase as expected and 0xF5 decrease (0xF5 is the metric iDRAC uses for SSD wearout, according to https://www.dell.com/support/manuals/en-in/idrac9-lifecycle-controller-v4.x-series/idrac9_4.32.15.00_rn/storage-and-storage-controllers?guid=guid-ff39fa50-6112-4c86-93c7-64bec4d3349e&lang=en-us ).
However, even while 0xF5 shows a normalized value of 99%, iDRAC is still reporting 100% Remaining Rated Write Endurance. This would only make sense if iDRAC used 0xE9 for this reporting as that still shows 100%.
I could be wrong here, but 0xAD also seems to show the write endurance in its normalized value byte and, at the time of writing, is down to 98.
Is iDRAC wrong and/or slow to update these values, or is there some behind-the-scenes math going on that my mortal eyes are not allowed to see?
PowerEdge R740XD; iDRAC is 14G Monolithic, hardware version 0.01, firmware version 6.00.30.00.
S.M.A.R.T. data before testing began:
Smart Data Info /c0/e64/s9 =
01 00 01 0e 00 82 82 bf 66 00 00 00 00 00 05 33
00 64 64 00 00 00 00 00 00 00 09 32 00 64 64 c5
4b 00 00 00 00 00 0c 32 00 64 64 18 00 00 00 00
00 00 0d 1e 00 82 82 bf 66 00 00 0e 00 00 aa 33
00 64 64 00 00 00 00 00 00 00 ad 12 00 64 64 1d
00 00 00 00 00 00 ae 32 00 64 64 18 00 00 00 00
00 00 af 12 00 64 64 1a 00 00 00 00 00 00 b3 33
00 64 64 00 00 00 00 00 00 00 b4 32 00 64 64 e1
0d 00 00 00 00 00 b5 3a 00 64 64 00 00 00 00 00
00 00 b6 3a 00 64 64 00 00 00 00 00 00 00 b8 32
00 64 64 00 00 00 00 00 00 00 c2 22 00 64 64 1b
00 00 00 00 00 00 c3 32 00 64 64 00 00 00 00 00
00 00 c5 12 00 64 64 00 00 00 00 00 00 00 c6 10
00 64 64 00 00 00 00 00 00 00 c7 3e 00 64 64 00
00 00 00 00 00 00 c9 33 00 64 64 ee 09 ff ff 1c
00 00 ca 27 00 64 64 00 00 00 00 00 00 00 e9 32
00 64 64 d1 a6 00 00 00 00 00 ea 32 00 64 64 00
00 00 00 00 00 00 eb 0b 00 64 64 d1 a6 00 00 00
00 00 f1 32 00 64 64 d1 a6 00 00 00 00 00 f2 32
00 64 64 be 07 0d 00 00 00 00 f5 32 00 64 64 64
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 02 00 24 00 00 79
03 00 01 80 01 3c 3c 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7f
S.M.A.R.T. data currently:
Smart Data Info /c0/e64/s9 =
01 00 01 0e 00 82 82 a9 a7 87 05 00 00 00 05 33
00 64 64 00 00 00 00 00 00 00 09 32 00 64 64 24
4c 00 00 00 00 00 0c 32 00 64 64 18 00 00 00 00
00 00 0d 1e 00 5a 5a a9 a7 87 05 b7 0d 00 aa 33
00 64 64 00 00 00 00 00 00 00 ad 12 00 62 62 cd
00 00 00 00 00 00 ae 32 00 64 64 18 00 00 00 00
00 00 af 12 00 64 64 ba 00 00 00 00 00 00 b3 33
00 64 64 00 00 00 00 00 00 00 b4 32 00 64 64 e1
0d 00 00 00 00 00 b5 3a 00 64 64 00 00 00 00 00
00 00 b6 3a 00 64 64 00 00 00 00 00 00 00 b8 32
00 64 64 00 00 00 00 00 00 00 c2 22 00 64 64 21
00 00 00 00 00 00 c3 32 00 64 64 00 00 00 00 00
00 00 c5 12 00 64 64 00 00 00 00 00 00 00 c6 10
00 64 64 00 00 00 00 00 00 00 c7 3e 00 64 64 00
00 00 00 00 00 00 c9 33 00 64 64 ee 09 ff ff 1c
00 00 ca 27 00 64 64 00 00 00 00 00 00 00 e9 32
00 64 64 7d bf 18 00 00 00 00 ea 32 00 64 64 00
00 00 00 00 00 00 eb 0b 00 64 64 7d bf 18 00 00
00 00 f1 32 00 64 64 7d bf 18 00 00 00 00 f2 32
00 64 64 46 0d 0d 00 00 00 00 f5 32 00 63 63 63
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 02 00 24 00 00 79
03 00 01 80 01 3c 3c 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d7
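For anyone who wants to decode dumps like the two above rather than read the bytes by eye, here is a minimal sketch. It assumes the commonly documented ATA SMART data layout (a 2-byte revision followed by up to 30 entries of 12 bytes each: ID, 2-byte flags, normalized value, worst value, 6-byte little-endian raw value, 1 reserved byte), which matches how the bytes above line up. The names `SmartAttr` and `parse_smart_dump` are made up for illustration, and the meaning of each attribute remains vendor-specific. The sample contains only the 0xAD, 0xE9 and 0xF5 entries copied from the current dump; the full perccli output can be pasted in instead.

```python
from typing import Dict, NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    flags: int
    value: int  # normalized current value
    worst: int  # normalized worst value
    raw: int    # 6-byte raw value, little-endian


def parse_smart_dump(hex_text: str) -> Dict[int, SmartAttr]:
    """Parse a whitespace-separated hex dump of the SMART data sector.

    Assumed layout: 2-byte revision, then up to 30 entries of 12 bytes:
    id(1) flags(2) value(1) worst(1) raw(6) reserved(1).
    """
    data = bytes.fromhex(hex_text)  # ignores ASCII whitespace (Python 3.7+)
    attrs: Dict[int, SmartAttr] = {}
    for off in range(2, 2 + 30 * 12, 12):
        entry = data[off:off + 12]
        if len(entry) < 12 or entry[0] == 0:  # short read or unused slot
            continue
        attrs[entry[0]] = SmartAttr(
            attr_id=entry[0],
            flags=int.from_bytes(entry[1:3], "little"),
            value=entry[3],
            worst=entry[4],
            raw=int.from_bytes(entry[5:11], "little"),
        )
    return attrs


# Revision bytes plus three entries copied from the "currently" dump above.
SAMPLE = """
01 00
ad 12 00 62 62 cd 00 00 00 00 00 00
e9 32 00 64 64 7d bf 18 00 00 00 00
f5 32 00 63 63 63 00 00 00 00 00 00
"""

attrs = parse_smart_dump(SAMPLE)
for attr_id in (0xAD, 0xE9, 0xF5):
    a = attrs[attr_id]
    print(f"0x{attr_id:02X}: value={a.value} worst={a.worst} raw={a.raw}")
```

Decoded this way, the current dump gives normalized values of 98 for 0xAD, 100 for 0xE9 and 99 for 0xF5, which are the numbers discussed above.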
atkina1747
4 Posts
0
October 2nd, 2023 18:22
For anyone looking to use this for monitoring their SSD %Life, it did eventually update itself (took maybe 2 weeks, which should be more than frequent enough for normal use cases).
DELL-Young E
Moderator
5.1K Posts
0
August 30th, 2023 02:00
Hello, thanks for choosing Dell. Are you using a Dell part? If so, could you please give us the part number? In the meantime, is your SSD's firmware up to date? Is iDRAC's as well? https://dell.to/3EfZtU2
atkina1747
4 Posts
0
September 1st, 2023 13:55
Apologies for my late update. This drive is holding up better than I anticipated, even at ~300MB/s writes.
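To put the ~300MB/s figure in perspective, here is a rough sketch of how long each percent of rated endurance should take at that rate. It assumes the 1.6 PBW rating from the first post, a sustained 300 MB/s, decimal units, and wear that tracks host writes linearly (write amplification is ignored), so the real drive may behave differently.

```python
# Assumptions: 1.6 PBW rating, sustained 300 MB/s, decimal units,
# and wear proportional to host writes (no write amplification).
RATED_BYTES = 1.6e15
WRITE_RATE_BPS = 300e6


def hours_per_percent(rated_bytes: float, rate_bps: float) -> float:
    """Hours of sustained writing needed to use 1% of the rating."""
    return rated_bytes / 100 / rate_bps / 3600


def days_to_rated_limit(rated_bytes: float, rate_bps: float) -> float:
    """Days of sustained writing needed to reach the full rating."""
    return rated_bytes / rate_bps / 86400


print(f"{hours_per_percent(RATED_BYTES, WRITE_RATE_BPS):.1f} hours per 1%")
print(f"{days_to_rated_limit(RATED_BYTES, WRITE_RATE_BPS):.1f} days to the rated limit")
```

That works out to a bit under 15 hours per percent and roughly two months of continuous writing to reach the rating.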
Updated iDRAC from 6.00.30.00 to 7.00.00.00 and the SSD % updated correctly; however, this appears to have been caused by the iDRAC reboot rather than by iDRAC tracking the SSD life as it decreases. I brought the SSD down to 96% and iDRAC still shows 97%. I plan to leave it over the weekend to see if iDRAC updates on its own.
Side note for anyone doing similar testing with the Media Wearout Threshold feature: after some time, iDRAC eventually alerted (30 minutes ago) for write endurance on my drive (likely the weekly poll for SATA/SAS drives). A feature showing when the last poll occurred would be greatly appreciated in future releases if possible.
atkina1747
4 Posts
0
September 5th, 2023 15:03
I have received an alert for each of the 3 days of the weekend for the drive being below the set threshold; however, I let the wearout script run over the weekend and iDRAC still reports 97% while according to perccli it should be 93%.
Is the "THRESHOLD POLLING INTERVAL" (defined in table 1 here) an alerting interval or a %Life polling interval? The former appears wrong because I'm receiving daily alerts for a SAS/SATA SSD.
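For tracking how far iDRAC lags behind the drive, the readings reported in this thread can be logged and compared with a small sketch like the one below. The `Reading` type, the `stale_readings` helper, and the 1-point tolerance are all made up for illustration; the two sample rows are the values from the September 1st and September 5th posts.

```python
from datetime import date
from typing import List, NamedTuple


class Reading(NamedTuple):
    day: date
    idrac_pct: int  # Remaining Rated Write Endurance shown by iDRAC
    smart_pct: int  # normalized value read from the drive via perccli


def stale_readings(readings: List[Reading], tolerance: int = 1) -> List[Reading]:
    """Return readings where iDRAC and SMART differ by more than `tolerance` points."""
    return [r for r in readings if abs(r.idrac_pct - r.smart_pct) > tolerance]


readings = [
    Reading(date(2023, 9, 1), idrac_pct=97, smart_pct=96),
    Reading(date(2023, 9, 5), idrac_pct=97, smart_pct=93),
]

for r in stale_readings(readings):
    lag = r.idrac_pct - r.smart_pct
    print(f"{r.day}: iDRAC {r.idrac_pct}% vs SMART {r.smart_pct}% (lag {lag} points)")
```

Logging both values at each check makes it easier to tell an alerting interval apart from a refresh interval: if alerts arrive daily while the iDRAC percentage stays flat, the two are evidently on different schedules.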