Unsolved

1 Rookie • 3 Posts

September 11th, 2025 19:51

PowerEdge R7525 cores getting stuck at 400MHz

I have an R7525 with two EPYC 73F3 processors and 512GB of memory. Two months ago, we noticed performance had slowed to a crawl and was starting to affect the workload. Eventually I figured out all 64 cores were stuck at 400MHz. I found a forum post about a similar issue, which said that reseating each power supply, one at a time, had fixed it.

Today, I noticed the problem's happening again, but I caught it earlier. I can see that most cores are stuck at 400MHz, but the stuck cores move around. E.g. I'll look at cpuinfo and see processor 0 core 0 is stuck at 400MHz, wait a few seconds, look at cpuinfo again and see processor 0 core 0 is now running normally at 3.5GHz. I did three runs of this, and there were 48, 42, and 48 cores at 400MHz. After doing the power supply trick, no cores were left at 400MHz.
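For anyone wanting to repeat the counting step, here is a minimal sketch of one way to tally slow cores from cpuinfo. It assumes an x86 Linux `/proc/cpuinfo` with `cpu MHz` lines; the function names and the 450 MHz cutoff are illustrative choices, not something taken from the runs above.

```python
import re

# Matches lines such as "cpu MHz\t\t: 400.000" in /proc/cpuinfo on x86 Linux.
_MHZ_LINE = re.compile(r"^cpu MHz\s*:\s*([0-9.]+)\s*$", re.MULTILINE)


def parse_cpu_mhz(cpuinfo_text):
    """Return the reported frequency (MHz) of each logical CPU, in file order."""
    return [float(value) for value in _MHZ_LINE.findall(cpuinfo_text)]


def count_stuck(freqs_mhz, threshold_mhz=450.0):
    """Count logical CPUs at or below the threshold (an assumed cutoff)."""
    return sum(1 for freq in freqs_mhz if freq <= threshold_mhz)


def report(path="/proc/cpuinfo"):
    """Read cpuinfo once and summarize how many logical CPUs look stuck."""
    with open(path) as handle:
        freqs = parse_cpu_mhz(handle.read())
    return f"{count_stuck(freqs)} of {len(freqs)} logical CPUs at or below 450 MHz"
```

Calling `print(report())` a few times, several seconds apart, gives the kind of per-run counts quoted above. Because the slow cores move around, the total per run is more informative than watching any single core.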

I'm going to check for firmware upgrades tonight, but we have an identical server with the exact same hardware/OS/firmware version that doesn't have this problem, so I don't have high hopes. I may try swapping the PSUs between them as well.

Has anyone else seen this, or have any troubleshooting ideas?

1 Rookie • 3 Posts

September 11th, 2025 19:54

Forgot to mention:

  • OS: Debian 12
  • Kernel: 6.8.12-14-pve
  • PSU: dual 1400W, FW 00.18.29
  • BIOS version: 2.18.1
  • iDRAC version: 7.20.30.00

Moderator • 3.9K Posts

September 12th, 2025 01:00

Hi,

There are some reports that a faulty CPU can cause the processor to throttle to a lower speed, with the speed restored after a reboot or PSU troubleshooting. Check in the BIOS whether C-States are disabled; if not, try disabling them and monitor. Checking that the firmware is up to date is the correct step. Are the servers connected to a UPS? Yes, try swapping the PSUs between this server and the identical one you mentioned; that could help determine whether a PSU is faulty. Would it also be possible to test with only one PSU at a time?

1 Rookie • 3 Posts

September 15th, 2025 22:08

I updated the firmware and swapped the PSUs out, so we'll see what happens. We aren't able to run with just one PSU due to the loss of redundancy.

I haven't checked C-State but I believe it's enabled. If the problem comes up again I'll disable it. We have alerting for this now too so there shouldn't be any surprises.
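For reference, a rough sketch of the kind of alert rule that could sit behind this. It is an assumption, not the exact check we run: `should_alert`, `min_stuck`, and `consecutive` are hypothetical names and defaults. The idea is to alert only when several samples in a row show slow cores, so a single odd reading does not page anyone.

```python
def should_alert(stuck_counts, min_stuck=1, consecutive=3):
    """Return True if the last `consecutive` samples each saw >= min_stuck slow CPUs.

    stuck_counts is a list of per-sample totals, oldest first, for example
    the number of logical CPUs reported at roughly 400 MHz in each sample.
    """
    if len(stuck_counts) < consecutive:
        # Not enough history yet to make a decision.
        return False
    return all(count >= min_stuck for count in stuck_counts[-consecutive:])
```

With three samples like the 48, 42, and 48 seen earlier, this rule would fire; a single nonzero sample surrounded by zeros would not.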

Thanks Joey!
