1 Rookie
•
8 Posts
0
132
February 26th, 2025 00:22
MD1400 Not showing any drives after power loss
Good Day,
We have an out-of-warranty MD1400 that stopped working after a power failure in our environment. There is no blue light on the front and both EMM cards show solid amber. The connected H830 shows the chassis but no slots or drives.
Connecting to the maintenance console over serial gives the messages "Locked Down! 03:000009" and "WARNING!!! Boot Error Code Present: Major: 0x03 Minor: 0x000009".
Reviewing the logs shows that the manufacturing region in the flash table is empty. This appears to be the case on both EMMs (logs below). dtrace also shows “FRU reading validation failed.”
Does anyone have any suggestions on how to fix this?
Thanks
Logs
WARNING!!! Boot Error Code Present: Major: 0x03 Minor: 0x000009
Locked Down! 03:000009 >showlogs
<0:00:00:25.061>:CONFIG:Mfg image in flash is unreadable
Locked Down! 03:000009 >flashtblinfo
FLASH ALGORITHM: AMD
FLASH MODE: 16 BIT
FLASH SIZE: 16 MB
START ADDRESS: 0x10000000
FLASH SECTOR COUNT: 128
SECTOR MAP:
================================================================================
ERASE BLOCK REGION    SECTOR SIZE(KB)    ERASE BLK SECTOR COUNT
================================================================================
0                     128                128
================================================================================
FLASH REGION TABLE:
================================================================================
REGION  REGION         OFFSET    START   SECTOR  SIZE  IS
ID      TYPE                     SECTOR  COUNT   (KB)  VALID
================================================================================
0       BOOTLOADER     0x000000  0       12      1536  Y
1       FW COPY 1      0x180000  12      12      1536  Y
2       FW COPY 2      0x300000  24      12      1536  Y
3       MANUFACTURING  0x480000  36      1       128   EMPTY
4       CONFIG 1       0x4a0000  37      1       128   Y
5       CONFIG 2       0x4c0000  38      1       128   EMPTY
6       LOG 1          0x4e0000  39      1       128   -
7       LOG 2          0x500000  40      1       128   -
8       OEM 1          0x520000  41      1       128   EMPTY
9       OEM 2          0x540000  42      1       128   EMPTY
10      COREDUMP       0x560000  43      1       128   -
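As a sanity check on the region table above, each region's offset should equal its start sector multiplied by the 128 KB sector size. The following is a minimal Python sketch of that arithmetic; the rows are transcribed by hand from the flashtblinfo output, and the tuple layout and function names are illustrative assumptions, not part of any Dell tooling.

```python
SECTOR_SIZE = 128 * 1024  # 128 KB per sector, taken from the sector map above

# (id, type, offset, start_sector, sector_count, is_valid), copied from flashtblinfo
REGIONS = [
    (0, "BOOTLOADER", 0x000000, 0, 12, "Y"),
    (1, "FW COPY 1", 0x180000, 12, 12, "Y"),
    (2, "FW COPY 2", 0x300000, 24, 12, "Y"),
    (3, "MANUFACTURING", 0x480000, 36, 1, "EMPTY"),
    (4, "CONFIG 1", 0x4A0000, 37, 1, "Y"),
    (5, "CONFIG 2", 0x4C0000, 38, 1, "EMPTY"),
    (6, "LOG 1", 0x4E0000, 39, 1, "-"),
    (7, "LOG 2", 0x500000, 40, 1, "-"),
    (8, "OEM 1", 0x520000, 41, 1, "EMPTY"),
    (9, "OEM 2", 0x540000, 42, 1, "EMPTY"),
    (10, "COREDUMP", 0x560000, 43, 1, "-"),
]


def layout_errors(regions):
    """Return the ids of regions whose offset is not start_sector * SECTOR_SIZE."""
    return [r[0] for r in regions if r[2] != r[3] * SECTOR_SIZE]


def empty_regions(regions):
    """Return the type names of regions reported as EMPTY."""
    return [r[1] for r in regions if r[5] == "EMPTY"]


if __name__ == "__main__":
    print("layout errors:", layout_errors(REGIONS))
    print("empty regions:", empty_regions(REGIONS))
```

The offsets all line up with the sector numbers, so the table itself looks internally consistent. Whether EMPTY is the normal state for the CONFIG 2 and OEM regions on a healthy EMM is not something this output can tell us.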
Locked Down! 03:000009 >dtrace
TM_ANALYZER-0x01 @Mon Feb 24 11:09:10 2025: Setting Enclosure SES to Critical! - analyzer.c:1831
TM_CORE-0x40 @Thu Jan 1 00:00:00 1970: FW Start Up - Boot Reason 0x0 - dellCore.c:74
TM_FRU-0x01 @Thu Jan 1 00:00:01 1970: FRU reading validation failed. EepromId 0 Verify Stat 0x80FFFFFF - platformSensor.c:324
TM_SENSOR-0x01 @Thu Jan 1 00:00:01 1970: FRU read failed! on EEPROM 0: 0x2 - fru.c:3833
TM_I2C-0x01 @Thu Jan 1 00:00:02 1970: Max Retry attempts exhuasted addr A4. Marking I2C port 05 as failed Status 0x5. - twi.c:570
TM_FRU-0x01 @Thu Jan 1 00:00:02 1970: FRU i2c failed. Return value is 0x5 - platformSensor.c:294
TM_SENSOR-0x01 @Thu Jan 1 00:00:02 1970: FRU read failed! on EEPROM 4: 0x1 - fru.c:3833
TM_ASSERT-0x00 @Thu Jan 1 00:00:02 1970: Current CM FRU is invalid. Lockdown! - bluemoonInterface.c:169
TM_ASSERT-0x00 @Thu Jan 1 00:00:02 1970: Lockdown by peer state: 0 - bluemoonInterface.c:271
TM_CORE-0x01 @Thu Jan 1 00:00:02 1970: Setting Lockdown - No Peer Failover. - sspTargetUtil.c:175
TM_CORE-0x40 @Thu Jan 1 00:00:00 1970: FW Start Up - Boot Reason 0x0 - dellCore.c:74
TM_FRU-0x01 @Thu Jan 1 00:00:01 1970: FRU reading validation failed. EepromId 0 Verify Stat 0x80FFFFFF - platformSensor.c:324
TM_SENSOR-0x01 @Thu Jan 1 00:00:01 1970: FRU read failed! on EEPROM 0: 0x2 - fru.c:3833
TM_I2C-0x01 @Thu Jan 1 00:00:02 1970: Max Retry attempts exhuasted addr A4. Marking I2C port 05 as failed Status 0x5. - twi.c:570
TM_FRU-0x01 @Thu Jan 1 00:00:02 1970: FRU i2c failed. Return value is 0x5 - platformSensor.c:294
TM_SENSOR-0x01 @Thu Jan 1 00:00:02 1970: FRU read failed! on EEPROM 4: 0x1 - fru.c:3833
TM_ASSERT-0x00 @Thu Jan 1 00:00:02 1970: Current CM FRU is invalid. Lockdown! - bluemoonInterface.c:169
TM_ASSERT-0x00 @Thu Jan 1 00:00:02 1970: Lockdown by peer state: 0 - bluemoonInterface.c:271
TM_CORE-0x01 @Thu Jan 1 00:00:02 1970: Setting Lockdown - No Peer Failover. - sspTargetUtil.c:175
TM_CORE-0x40 @Thu Jan 1 00:00:00 1970: FW Start Up - Boot Reason 0x0 - dellCore.c:74
TM_FRU-0x01 @Thu Jan 1 00:00:01 1970: FRU reading validation failed. EepromId 0 Verify Stat 0x80FFFFFF - platformSensor.c:324
TM_SENSOR-0x01 @Thu Jan 1 00:00:01 1970: FRU read failed! on EEPROM 0: 0x2 - fru.c:3833
TM_I2C-0x01 @Thu Jan 1 00:00:02 1970: Max Retry attempts exhuasted addr A4. Marking I2C port 05 as failed Status 0x5. - twi.c:570
TM_FRU-0x01 @Thu Jan 1 00:00:02 1970: FRU i2c failed. Return value is 0x5 - platformSensor.c:294
TM_SENSOR-0x01 @Thu Jan 1 00:00:02 1970: FRU read failed! on EEPROM 4: 0x1 - fru.c:3833
TM_ASSERT-0x00 @Thu Jan 1 00:00:02 1970: Current CM FRU is invalid. Lockdown! - bluemoonInterface.c:169
TM_ASSERT-0x00 @Thu Jan 1 00:00:02 1970: Lockdown by peer state: 0 - bluemoonInterface.c:271
TM_CORE-0x01 @Thu Jan 1 00:00:02 1970: Setting Lockdown - No Peer Failover. - sspTargetUtil.c:175
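The dtrace output above repeats the same sequence for each boot attempt, which makes it hard to read at a glance. Here is a rough Python sketch for splitting it into boot cycles; the line format is inferred only from the lines pasted above, and the regular expression and function names are my own assumptions rather than a documented log format.

```python
import re

# Format inferred from the pasted lines:
#   MODULE-LEVEL @TIMESTAMP: MESSAGE - file.c:LINE
LINE_RE = re.compile(
    r"^(?P<module>TM_\w+)-(?P<level>0x[0-9A-Fa-f]+) "
    r"@(?P<when>.+? \d{4}): (?P<msg>.*) - (?P<file>\S+):(?P<line>\d+)$"
)

SAMPLE = """\
TM_ANALYZER-0x01 @Mon Feb 24 11:09:10 2025: Setting Enclosure SES to Critical! - analyzer.c:1831
TM_CORE-0x40 @Thu Jan 1 00:00:00 1970: FW Start Up - Boot Reason 0x0 - dellCore.c:74
TM_FRU-0x01 @Thu Jan 1 00:00:01 1970: FRU reading validation failed. EepromId 0 Verify Stat 0x80FFFFFF - platformSensor.c:324
TM_I2C-0x01 @Thu Jan 1 00:00:02 1970: Max Retry attempts exhuasted addr A4. Marking I2C port 05 as failed Status 0x5. - twi.c:570
TM_CORE-0x01 @Thu Jan 1 00:00:02 1970: Setting Lockdown - No Peer Failover. - sspTargetUtil.c:175
TM_CORE-0x40 @Thu Jan 1 00:00:00 1970: FW Start Up - Boot Reason 0x0 - dellCore.c:74
"""


def parse(text):
    """Parse dtrace lines into dicts; lines that do not match are skipped."""
    matches = (LINE_RE.match(line.strip()) for line in text.splitlines())
    return [m.groupdict() for m in matches if m]


def boot_cycles(entries):
    """Group entries into boot attempts, starting a new group at each 'FW Start Up'."""
    cycles = []
    for entry in entries:
        if entry["msg"].startswith("FW Start Up") or not cycles:
            cycles.append([])
        cycles[-1].append(entry)
    return cycles


if __name__ == "__main__":
    for number, cycle in enumerate(boot_cycles(parse(SAMPLE))):
        print(f"cycle {number}:")
        for entry in cycle:
            print(f"  {entry['module']:<12} {entry['file']}:{entry['line']}  {entry['msg']}")
```

Grouped this way, the full log reads as one entry from before the power loss followed by three identical boot attempts, each ending in lockdown after the FRU and I2C read failures.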


DELL-Joey C
Moderator
•
4.1K Posts
1
February 26th, 2025 08:27
Hi,
This is a rare error on an MD1400 in my experience. Lockdowns are usually seen on MD3XXX arrays, where the storage goes into lockdown mode if the controller reboots more than 5 times in a row. From the error message, the controller seems to have difficulty booting from its flash, which holds the firmware image. Have you tried removing the primary controller from its slot and booting from the secondary controller alone? Also, can you confirm that the storage is powered on and given a few minutes to boot before the server is turned on?
It's also odd that the dtrace errors after the first log entry are all dated 1970.
The following command clears the lockdown on MD3XXX; I'm unsure whether it works on the MD1400.
lemClearLockdown
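On the 1970 dates mentioned above: one plausible reading is that the EMM clock simply has not been set yet that early in boot, so it counts seconds from the Unix epoch. This is only a guess based on how Unix-style clocks usually behave; nothing in this thread confirms how the EMM firmware keeps time. A small Python sketch of what second 0, 1 and 2 of such a clock look like (the helper name is hypothetical):

```python
from datetime import datetime, timezone


def from_epoch(seconds):
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    # A clock that starts counting at zero reports dates on 1 January 1970.
    for seconds in (0, 1, 2):
        print(seconds, from_epoch(seconds).isoformat())
```

If that reading is right, the 1970 timestamps would be the seconds since each power-on rather than a sign of log corruption.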
DELL-Joey C
Moderator
•
4.1K Posts
1
February 28th, 2025 07:32
Hi,
I've checked, and there isn't any command or other way to reset the sensors. For Sensors 3 and 5, the instruction is to replace the EMM. But you have verified that both EMMs work in the secondary slot, which points to the slot being faulty. The backplane is the hardware that connects the EMMs for communication. Most likely the backplane is faulty, as 3 of the sensors are neither communicating nor providing readings.
RuddJ
1 Rookie
•
8 Posts
0
February 27th, 2025 00:00
Thank you very much for your suggestion Joey.
I had already tried swapping the controllers and got the same error, but I had not tried the secondary slot on its own, as the manual says to put a single controller in slot 1.
I put a single controller in the secondary slot and it booted OK with a green EMM LED, and when I powered on the server a few minutes later the drives were visible. When I then put the other controller into slot 1, the slot 1 controller ended up in lockdown with a solid amber LED while the secondary stayed green.
The weird part is that I then shut down the server, removed both controllers and put the other EMM in slot 2, and it worked fine as well. So both controllers seem to work OK in slot 2 but both show as failed in slot 1, so something appears to be causing slot 1 to misbehave. At least the data can be accessed again through the secondary controller.
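The swap test above can be written down as a small truth table: a component is suspect only if it fails in every combination it was tried in. A minimal Python sketch of that reasoning follows; the labels "EMM-A" and "EMM-B" are placeholders for the two cards, and the function is illustrative only.

```python
def isolate_fault(results):
    """results maps (emm, slot) -> True if that EMM booted OK in that slot.

    Returns (suspect_emms, suspect_slots): items that failed in every test
    they took part in.
    """
    emms = {emm for emm, _ in results}
    slots = {slot for _, slot in results}
    suspect_emms = sorted(
        emm for emm in emms
        if not any(ok for (e, _), ok in results.items() if e == emm)
    )
    suspect_slots = sorted(
        slot for slot in slots
        if not any(ok for (_, s), ok in results.items() if s == slot)
    )
    return suspect_emms, suspect_slots


# Outcomes as described in this post: both cards fail in slot 1, both work in slot 2.
RESULTS = {
    ("EMM-A", 1): False,
    ("EMM-A", 2): True,
    ("EMM-B", 1): False,
    ("EMM-B", 2): True,
}

if __name__ == "__main__":
    print(isolate_fault(RESULTS))
```

With these results no EMM is suspect and only slot 1 is, which matches the conclusion that the cards themselves are fine.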
Any idea on a way to get slot 1 working again?
I also tried lemClearLockdown, but unfortunately it came up as an unknown command.
The logs were an extract, and that first line was the last entry before the power loss; I had to trim them to fit under the 10K character limit. I'll put the new logs at the bottom of this post.
Just wanted to say another big thank you for helping to get it partially working.
Logs only secondary controller:
DELL-Joey C
Moderator
•
4.1K Posts
1
February 27th, 2025 04:16
Hi,
Well, it's good that at least we know the issue is isolated to the slot and not the EMM.
The errors keep pointing to I2C, which led me to search our internal records for any reported solution.
There are reports of I2C issues on the MD1200 where a firmware update and power supply troubleshooting resolved the issue. One may ask, why the power supply? The I2C bus carries signals from every hardware component and communicates through the EMM, so swapping the power supplies may help. Next is the firmware: try updating each EMM through the working slot. These are the 2 findings that resolved the MD1200 I2C errors.
RuddJ
1 Rookie
•
8 Posts
0
February 27th, 2025 05:09
Thanks Joey,
I have swapped the 2 power supplies and re-applied the firmware to both EMM cards; however, when I plug the primary card back in it still shows solid amber.
Looking at the I2C bus, the code appears to point to a temperature sensor: the one on bus 5 shows as failed and the other as not present. I'm not sure whether this is normal or may be the cause.
Updated logs:
DELL-Joey C
Moderator
•
4.1K Posts
1
February 27th, 2025 07:45
Hi,
Alright, I was hoping that would resolve the issue. That's a good finding on the sensor ID. I did some research on MD1400 sensor ID placement, and it seems Sensor ID 0 is the control panel. Here is the control panel removal procedure; you could try reseating the card, but it is not hot-swappable, so the storage will need to be powered off first. https://dell.to/3XkuyQp
If you have OMSA installed, check under Storage > Enclosure, where there is a table of Temperature Probes. There should be 7 (0 to 6).
RuddJ
1 Rookie
•
8 Posts
0
February 28th, 2025 04:50
Thanks Joey, I can see the temperature probes from the iDRAC of the attached R330; it shows a reading of 0 °C for Sensor 0, and Sensors 3 & 5 as missing.
In the CLI, sensors 0, 3 & 5 have failed, and sensors 3 & 5 also show "Is Present: NO".
I removed the control panel to check it, blew a small bit of dust off the front and refitted it, but the errors are still showing.
iDrac Screenshot
RuddJ
1 Rookie
•
8 Posts
0
March 4th, 2025 05:53
@DELL-Joey C Thanks for all your help Joey. We will look at replacing the unit, but at least thanks to your help it can partially work in the meantime.
Thanks.