1 Rookie

 • 

2 Posts

1958

March 9th, 2022 23:00

Continue boot after watchdog reset

Hi,

We've enabled "OS Watchdog Timer" on all our R540/R740 servers and reset the timer every 2 minutes from our Linux OS. If the OS locks up, the watchdog correctly power-cycles the system.
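For readers unfamiliar with this kind of setup, here is a minimal sketch of what such a timer-reset loop could look like. It is not the script used in this post, which was not shared. It assumes the BIOS/iDRAC watchdog is exposed to Linux as the standard watchdog device `/dev/watchdog` (on PowerEdge servers this is typically done through an IPMI watchdog driver, but that is an assumption here). The function name `pet_watchdog` and its parameters are hypothetical.

```python
import time


def pet_watchdog(device_path="/dev/watchdog", interval_s=120, rounds=None,
                 sleep=time.sleep):
    """Reset the watchdog countdown every interval_s seconds.

    device_path is an assumption: the standard Linux watchdog device node.
    rounds=None loops forever; a finite number is mainly useful for testing.
    Returns the number of resets written.
    """
    count = 0
    # Opening the device normally arms the watchdog; every write resets it.
    with open(device_path, "wb", buffering=0) as dev:
        while rounds is None or count < rounds:
            dev.write(b"\0")
            count += 1
            if rounds is None or count < rounds:
                sleep(interval_s)
        # "Magic close": writing 'V' before closing asks the driver to disarm
        # the watchdog. Drivers configured with nowayout ignore this.
        dev.write(b"V")
    return count
```

Calling `pet_watchdog()` with the defaults would loop forever with a 2-minute interval, matching the interval mentioned above. If the loop stops because the OS hangs, no further writes arrive and the watchdog fires.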

However, after the watchdog power cycle the system stops at the POST screen with a message that the watchdog power-cycled the system due to an error.

That's not helpful, because the whole idea of the watchdog timer is to restart a locked-up server so that it boots into the OS again and continues working.

How can we make the servers continue booting without interaction after a power cycle reset?

cu,

Frank

Moderator

 • 

5.1K Posts

January 6th, 2025 01:27

Hello, as Dell does not officially support Rocky 9, the question has to be directed to the OS provider, who can say whether they support the R550.

This is the list of what has been tested and verified by Dell: https://dell.to/4j1oCoR

 

Hope this helps.

 

Happy New Year!

4 Operator

 • 

3K Posts

March 13th, 2022 21:00

Can you share a screenshot of the error message? If the server is not on the latest firmware, can you also flash the latest iDRAC and BIOS firmware and check the behavior?

1 Rookie

 • 

2 Posts

March 14th, 2022 00:00

I cannot at the moment, because I would have to trigger the watchdog reset (by stopping the reset script), and I cannot easily do this on a production server. It might take some days.

But I think it has to do with the "F1/F2 prompt on error" option, which is set to "enabled". According to the log, the system considers a power cycle by the watchdog an error (instead of e.g. a warning), and therefore stops and waits for acknowledgement by F1 to continue booting. As far as I remember, the message said I had to press F1 to continue booting.

However, this doesn't make sense. The watchdog power cycle is there to ensure that a crashed server is restarted and continues to run without manual intervention. If a watchdog reset has to be acknowledged manually, I could reset the crashed server manually as well.

I guess setting the "F1/F2 prompt on error" to "disabled" would solve this, but then it wouldn't stop at any error, which is not a good idea.

So, there should be a setting that allows boot to continue after a watchdog power cycle has happened. Otherwise, the watchdog is completely useless.

Btw, all firmware etc. is at the latest release, as we upgrade automatically.

cu,
Frank

 

1 Message

August 19th, 2024 11:28

Can confirm this is happening on our R7525 systems.

With the default BIOS configuration, issuing a restart from the OS results in the BIOS stopping and requiring F1/F2 input, stating that the system was restarted due to a watchdog timeout. To my knowledge we haven't configured any watchdogs. We are running Rocky 9.

Moderator

 • 

2.8K Posts

August 19th, 2024 13:09

Hello,

Rocky 9 may be equivalent to RHEL 9, but it is still not among the supported OSs: https://dell.to/4dNsdn3 The communication between the operating system and the BIOS/firmware might cause these watchdog timeout errors. If there is a BIOS update, please apply it first. You can also check whether the operating system runs any watchdog services by default that could be interacting with the hardware watchdog.
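As a rough illustration of such a check, the sketch below lists the watchdog devices the Linux kernel has registered and their reported state. It assumes the kernel exposes watchdog attributes under `/sys/class/watchdog` (this depends on the kernel's watchdog sysfs support being enabled, so an empty result does not prove that no watchdog is active). The function name `list_watchdogs` is hypothetical.

```python
from pathlib import Path


def list_watchdogs(sysfs_root="/sys/class/watchdog"):
    """Return one dict per registered watchdog device.

    sysfs_root is an assumption about where the kernel exposes watchdog
    devices. Attributes that cannot be read are reported as None.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    devices = []
    for entry in sorted(root.iterdir()):
        info = {"name": entry.name}
        for attr in ("identity", "state", "timeout"):
            try:
                info[attr] = (entry / attr).read_text().strip()
            except OSError:
                # Attribute missing or not readable on this kernel/driver.
                info[attr] = None
        devices.append(info)
    return devices
```

A device whose `state` reads `active` is currently armed. On systemd-based distributions it may also be worth checking whether `RuntimeWatchdogSec=` is set in the systemd configuration, since systemd can drive the hardware watchdog itself.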

 

Hope that helps!

1 Rookie

 • 

1 Message

November 28th, 2024 09:41

This is not an OS-related issue. We use RHEL on our side and still see this message on boot if we trigger the watchdog. There should be a default action to take after a watchdog trigger.

Moderator

 • 

3.8K Posts

November 28th, 2024 21:46

Hello,

Which PowerEdge model, please? And which BIOS version?

Thanks

1 Rookie

 • 

2 Posts

January 5th, 2025 20:46

We have the same issue on an R550.

1.15.2

7.10.90.00

Running OLVM (latest) on OEL 8.10 (Red Hat 8 fork).

We also have R740 servers which do not have this issue.

1 Rookie

 • 

2 Posts

February 16th, 2025 22:04

@DIXM​ 
Note: according to the log history, this was also occurring when running VMware 7. We are installing Red Hat 8 now to reconfirm that the issue is with the hardware and not the OS.

1 Rookie

 • 

1 Message

February 25th, 2025 09:54

@viewless- wispy I am having the same problem as you. I use a Dell EMC PowerEdge T140 server. At the moment I do not have a solution to this error, and the server is often disconnected.
