1 Rookie

 • 

20 Posts

April 27th, 2021 07:00

R730 stuck POST after BIOS update

Hey,

As happens more and more often, I have inherited a project from a client with limited in-house IT resources and knowledge.

My client wanted to bring his R730xd up to current firmware and BIOS, but thought he could jump from a very old BIOS straight to the latest one and hope for the best.

 

Result: an R730xd that hangs during POST, with BIOS version 2.12.1 on screen and the iDRAC IP reported as ....

However, the iDRAC had already been configured, so we can still connect to it and try to resolve everything from there.

In the iDRAC the system still reports BIOS 1.2.10 (weird, no?)

Tests done:

Reset NVRAM and removed the BIOS battery.

Update to an older BIOS from iDRAC => the job remains in the scheduled state; the planned reboot is done manually (because we can't get past the first screen in POST).

Rollback to an older BIOS from iDRAC => the BIOS is not visible in the rollback list.

Tried disconnecting everything but the PERC; kept CPUs and memory in the system.

Currently the iDRAC reports the following when going into update or rollback:

SUP0108: A firmware update operation is already in progress. Wait for the update operation to conclude and then re-try.

After a few minutes the message goes away and we can access those pages again, but the error returns some time later.

server information details from iDRAC:

BIOS Version: 1.2.10
Firmware Version: 2.75.100.75
Lifecycle Controller Firmware: 2.75.100.75

iDRAC has been updated to 2.75.100.75 through iDRAC itself.

My theory is that the BIOS update to 2.12.1 is somehow still running in the background: it does not show up in the queued jobs, but it keeps failing and prevents any other BIOS update or rollback from being executed.

Maybe racadm can help, but the problem is that a BIOS update gets scheduled for the next reboot and then stays in that state forever (even after several reboots, complete disconnects from power, etc.).

I am now stuck with a problem which I didn't create but want to solve.

The unit is at the end of its 5-year warranty. I don't know whether it can be extended or whether we can solve this matter in another way.

1 Rookie

 • 

20 Posts

January 9th, 2022 05:00

In the end we replaced the whole motherboard and the issue was resolved. The old board was probably stuck in the BIOS update, and we never found a way to get it out of that update loop.

4 Operator

 • 

3K Posts

April 27th, 2021 08:00

Can you check the POST code (Overview -> Server -> Troubleshooting -> Post Code) and the Lifecycle Controller logs (Overview -> Server -> Logs -> Lifecycle Logs) in iDRAC and see whether they contain any information about the failure?

1 Rookie

 • 

20 Posts

April 28th, 2021 10:00

Post code is stuck at 0x6 Multiprocessor initialization

Lifecycle logs:

2021-04-27T14:36:15+0200 PSU0800
Power Supply 2: Status = 0x00, IOUT = 0x0, VOUT= 0x0, TEMP= 0x0, FAN = 0x0, INPUT= 0x0.

2021-04-27T14:36:15+0200 PSU0800
Power Supply 1: Status = 0x00, IOUT = 0x0, VOUT= 0x0, TEMP= 0x0, FAN = 0x0, INPUT= 0x0.

2021-04-27T14:25:29+0200 SYS1003
System CPU Resetting.

2021-04-27T14:25:29+0200 SYS1001
System is turning off.

2021-04-27T14:19:27+0200 RAC1195
User root via IP 192.168.0.186 requested state / configuration change to ServiceModule using GUI.

2021-04-27T14:17:50+0200 RAC1195
User root via IP 192.168.0.186 requested state / configuration change to iDRAC Information using GUI.

2021-04-27T14:14:45+0200 USR0030
Successfully logged in using root, from 192.168.0.186 and GUI.

2021-04-27T14:14:08+0200 USR0032
The session for root from 192.168.0.186 using GUI is logged off.

2021-04-27T14:02:55+0200 USR0030
Successfully logged in using root, from 192.168.0.84 and GUI.

2021-04-27T13:42:35+0200 SYS1000
System is turning on.

2021-04-27T13:42:27+0200 RAC0702
Requested system powercycle.

2021-04-27T13:42:26+0200 SYS1003
System CPU Resetting.

2021-04-27T13:42:26+0200 SYS1001
System is turning off.

2021-04-27T13:42:21+0200 RAC1195
User root via IP 192.168.0.186 requested state / configuration change to Power Control using GUI.

2021-04-27T13:37:41+0200 RED030
Reboot is complete.

2021-04-27T13:37:38+0200 SYS1000
System is turning on.

2021-04-27T13:37:32+0200 RAC0701
Requested system powerup.

2021-04-27T13:37:30+0200 RAC0704
Requested system powerdown.

2021-04-27T13:37:29+0200 SYS1003
System CPU Resetting.

2021-04-27T13:37:29+0200 SYS1001
System is turning off.

2021-04-27T13:37:23+0200 RAC1195
User root via IP 192.168.0.186 requested state / configuration change to Power Control using GUI.

2021-04-27T13:30:35+0200 SEL0004
Log cleared.

2021-04-27T13:30:35+0200 SEL0014
The System Event Log (SEL) was cleared by root from 192.168.0.186.

2021-04-27T13:30:35+0200 RAC1195
User root via IP 192.168.0.186 requested state / configuration change to SEL using GUI.

When I connect through SSH and run racadm fwupdate -s to try to interrupt it, it says no updates are in progress, but a few minutes later it says a firmware update is in progress. I feel like the BIOS update is somehow starting and not starting at the same time, but I don't know how to interrupt it properly (and, ideally, get the server to boot normally again).
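
One way to pin down a status that flips back and forth like this is to sample it at a fixed interval and keep only the moments where the answer changes. The following is a minimal Python sketch, not a tested procedure. It assumes the iDRAC accepts `racadm fwupdate -s` over SSH, and the host name `idrac.example` is a placeholder that does not come from this thread. The exact wording of the status output is not assumed either: the code only compares the raw text between samples.

```python
import subprocess
import time
from datetime import datetime


def run_status(host="idrac.example", user="root"):
    """Run `racadm fwupdate -s` on the iDRAC over SSH and return the raw output.

    The host name and user are placeholders, not values taken from the thread.
    """
    result = subprocess.run(
        ["ssh", f"{user}@{host}", "racadm", "fwupdate", "-s"],
        capture_output=True, text=True, timeout=60,
    )
    return result.stdout.strip()


def watch(run, samples, delay, sleep=time.sleep, now=datetime.now):
    """Call run() `samples` times and return (timestamp, output) for each change."""
    changes = []
    last = None
    for i in range(samples):
        out = run()
        if out != last:
            # Record only transitions, so a long quiet period stays one entry.
            changes.append((now(), out))
            last = out
        if i < samples - 1:
            sleep(delay)
    return changes
```

Calling `watch(run_status, samples=60, delay=30)` would sample every 30 seconds for roughly half an hour. The recorded timestamps can then be compared with the Lifecycle log to see whether each "update in progress" window lines up with a reboot or a job being scheduled.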

1 Rookie

 • 

20 Posts

April 28th, 2021 13:00

Other noteworthy Lifecycle log entries:
2021-04-19T18:42:43+0200 VDR7
Virtual Disk 1 on Integrated RAID Controller 1 has failed.

2021-04-19T18:42:43+0200 VDR7
Virtual Disk 0 on Integrated RAID Controller 1 has failed.

2021-04-27T12:04:07+0200 SUP1906
Firmware update successful.

2021-04-27T12:02:30+0200 SUP1905
Firmware update programming flash.

2021-04-27T12:02:30+0200 SUP1903
Firmware update verify image headers.

2021-04-27T12:01:57+0200 SUP1904
Firmware update checksumming image.

2021-04-27T12:01:57+0200 SUP1911
Firmware update initialization complete.

2021-04-27T12:01:56+0200 SUP1901
Firmware update initializing.

2021-04-21T19:09:14+0200 SUP0518
Successfully updated the PowerEdge BIOS firmware to version 2.12.1.

2021-04-21T19:06:10+0200 SUP0516
Updating firmware for PowerEdge BIOS to version 2.12.1.
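
Because the entries above are pasted newest-first and mix several days, it can help to sort them oldest-first and filter on the SUP (update) message IDs. The Python sketch below is only an illustration. It assumes the two-line layout seen in this thread (a timestamp and message ID on one line, the message text on the next) and has not been checked against other Lifecycle log export formats.

```python
import re
from datetime import datetime

# Header line of an entry: timestamp with UTC offset, then the message ID.
HEADER = re.compile(r"^\s*(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{4})\s+(\S+)\s*$")


def parse_lclog(text):
    """Return a chronologically sorted list of (timestamp, message_id, message)."""
    entries = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S%z")
            current = [stamp, match.group(2), ""]
            entries.append(current)
        elif current is not None and line.strip():
            # Message text follows the header; join wrapped lines with a space.
            current[2] = (current[2] + " " + line.strip()).strip()
    return sorted((tuple(e) for e in entries), key=lambda e: e[0])


# Three entries copied from the log above.
pasted = """\
2021-04-27T12:04:07+0200 SUP1906
Firmware update successful.

2021-04-21T19:09:14+0200 SUP0518
Successfully updated the PowerEdge BIOS firmware to version 2.12.1.

2021-04-21T19:06:10+0200 SUP0516
Updating firmware for PowerEdge BIOS to version 2.12.1.
"""

for stamp, msg_id, message in parse_lclog(pasted):
    if msg_id.startswith("SUP"):
        print(stamp.isoformat(), msg_id, message)
```

Read in that order, the log records the BIOS 2.12.1 update as successful on 2021-04-21, followed by a separate SUP19xx firmware update sequence on 2021-04-27.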

4 Operator

 • 

2.9K Posts

April 28th, 2021 14:00

Hello,

 

If it isn't allowing you to continue through POST at this point, I'd consider performing a power drain of the system, first. This would be accomplished by shutting the server off, unplugging it from the supply, then holding the power button down for ~10 seconds. You can then attempt to bring it back up as normal.

 

I'm not familiar with the error code you're seeing, though an image may provide the context needed to provide any further troubleshooting recommendations. 

 

The messages about the failed virtual disks wouldn't have anything to do with an inability to POST, though they could certainly prevent you from booting the OS. I'm not seeing anything in those entries that would stand out as a problem, aside from the virtual disk entries.

1 Rookie

 • 

20 Posts

April 29th, 2021 02:00

Hey Dylan,

 

I had already shut down and disconnected the system several times during previous tests (also to see whether an actual cold boot makes a difference).

I just did it again, including 20 seconds of keeping the power button pressed, just to be sure.

 

The screen that comes up almost immediately on a warm boot shows the flashed BIOS; the rollback screen doesn't show any BIOS; the overview screen shows the old BIOS.

I pushed an intermediate BIOS update (to 1.5.6) to the system through iDRAC yesterday and did a warm boot, but the update is still sitting in the queue as scheduled, while the reboot job shows as completed.

I will now complete the very cold boot and hope the actual BIOS upgrade job will start now!

Attachments: Capturedell.PNG, Capturedell2.PNG, IMG_20210429_105953.jpg

 

1 Rookie

 • 

20 Posts

April 29th, 2021 02:00

After the cold boot, the scheduled BIOS upgrade job to 1.5.6 remains in the queue doing nothing.

But I did notice another thing: in my rollback overview the BIOS is visible again, but the backplane 2.20 entry seems to have gone missing.

I have now cleared the job queue and started a rollback job to 1.2.10, and I hope the rollback actually gets initiated. Afterwards I would attempt to go to the iDRAC version that is available below and then slowly upgrade step by step through the yearly BIOS versions (but that will remain a plan until we can solve this).

Attachment: Capturedell3.PNG
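
For planning a stepwise path like this, note that BIOS versions have to be compared numerically per component and not as plain strings (as text, "1.12.10" would sort before "1.2.10"). The small Python sketch below illustrates this using only the versions mentioned in this thread. A real list of intermediate releases would have to come from Dell's download page, and whether each step can be installed directly is not something this code can tell.

```python
def version_key(version):
    """Turn '1.2.10' into (1, 2, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def upgrade_path(current, available):
    """Return the available versions newer than `current`, oldest first."""
    newer = {v for v in available if version_key(v) > version_key(current)}
    return sorted(newer, key=version_key)


# Only versions that appear in this thread; not a complete release list.
print(upgrade_path("1.2.10", ["2.12.1", "1.5.6", "1.2.10"]))
```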

1 Rookie

 • 

20 Posts

April 29th, 2021 04:00

After the cold boot the BIOS rollback job remains scheduled. I do see my upload of 1.5.6 sitting available now, but any BIOS update or rollback job is stuck in the queue, probably because the server is stuck on that POST screen?

Attachments: Capturedell4.PNG, Capturedell5.PNG

Moderator

 • 

3.8K Posts

April 29th, 2021 07:00

Yes, I think this is the reason. Can you verify where it is stuck?
Thanks

Marco

1 Rookie

 • 

20 Posts

April 30th, 2021 11:00

How can I verify this?

 

I can say the following:

1) racadm fwupdate -s sometimes reports that an update is running, and sometimes it doesn't

2) the POST code is stuck at 0x6 Multiprocessor Initialization

3) sometimes the update and/or rollback pages in the iDRAC show an informational message that a firmware update is in progress

 

I have no idea how to check 'where' it is stuck.

4 Operator

 • 

2.9K Posts

April 30th, 2021 12:00

Hi dendo,

 

I think Marco is looking more for an image of the processor initialization error you're getting. An image of that can provide some very useful context to help with troubleshooting.

 

One thing we can still try is shutting the server off, then using the jumpers to clear NVRAM, but I'd like to get an image of the error first, if that's possible. Is it hanging at a black screen when it gives you that value, or are you seeing it elsewhere?

1 Rookie

 • 

20 Posts

May 1st, 2021 10:00

So to clarify:

1) the server 'screen' hangs on the BIOS 2.12.1 / iDRAC .... screen shown a few posts earlier

IMG_20210429_105953.jpg

It has been sitting on this screen from the moment the server starts to POST (so on a cold boot after around 30 seconds to a minute, on a warm boot after about 5 seconds?)

and I have been looking at this screen for far too long.

Do notice: the screen states 2.12.1 even though the iDRAC is still reporting BIOS 1.2.10 (and shows a firmware update in progress from time to time)

2) the multiprocessor initialization where it hangs is the 'Post Code' shown on the Troubleshooting tab in iDRAC

dendob_0-1619891365892.png

Does that answer the question?

Kind Regards

 

 

Moderator

 • 

2.8K Posts

May 3rd, 2021 00:00

Hi,

As far as I can tell from the thread, you have already tried many of the possible steps. If you had access to the Lifecycle Controller, you could do it from the Repurpose or Retire section, as described here: https://dell.to/2QSXno5

 

Can you also try resetting iDRAC? I don't think it's directly relevant, but it's worth trying. If the system stops responding during POST, press and hold the system ID button for more than five seconds to enter BIOS progress mode. To reset iDRAC (if not disabled in F2 iDRAC setup), press and hold the button for more than 15 seconds.

 

If there is no problem with the motherboard, I recommend trying an NVRAM clear. The procedure is described on pages 182-183 of https://dell.to/3vBIC7R

1 Rookie

 • 

20 Posts

May 3rd, 2021 12:00

Hey Erman,

 

I have access to the iDRAC, but not to the Lifecycle Controller during POST/boot.

I can update the iDRAC to a later version, but that has not made a difference.

The NVRAM has already been cleared once (I will do it again) => jumper + removing the battery.

What does 'BIOS progress mode' entail? My Google-fu can't find a decent answer.

Which of the buttons is actually the system ID button?

The one on the front of the R730 on the left side? Or the one on the back at the far left?

I will report on the iDRAC reset and NVRAM clear tomorrow, when I have physical access to the machine once more.

Kind Regards

1 Rookie

 • 

20 Posts

May 11th, 2021 03:00

I have tested pressing the i button for 5-6 seconds and nothing changed visually.

I have tested it for 15+ seconds as well, again without any change.

I will now leave the system alone for 30 minutes (so as not to interrupt anything, in case something is happening)

and then try the NVRAM clear (even though I have tried that before).

 
