1 Rookie

2 Posts

March 2nd, 2025 16:50

Dell PowerEdge R940 server does not start

*Errors identified:*

1. *The system board Pfault fail-safe voltage is outside of range:*

2. *CPU 4 VSA PG voltage is outside of range:*

3. *CPU 4 VCORE PG voltage is outside of range:*

4. *The chassis is open while the power is off*

Moderator

3.5K Posts

March 3rd, 2025 09:30

Okay, Martinho. It looks like your Dell PowerEdge R940 is failing to start and you've identified a few voltage-related errors, along with a chassis open alert. Let's break down what these errors likely mean and how to troubleshoot them.

Understanding the Errors

  • "The system board Pfault fail-safe voltage is outside of range": This is a critical error. "Pfault" likely refers to a power fault. The system board's fail-safe voltage is a protective mechanism, and if it's out of range, it indicates a potential power delivery problem on the motherboard itself. This is often a sign of a hardware failure.

  • "CPU 4 VSA PG voltage is outside of range": "VSA" most likely refers to the CPU's system agent (uncore) supply rail, called VCCSA on Intel Xeon Scalable processors, and "PG" means Power Good. The error means the voltage regulator feeding that rail on CPU 4 is not asserting its power-good signal, either because the rail voltage is out of range or because the signal itself is faulty. This could prevent CPU 4 from initializing or cause instability.

  • "CPU 4 VCORE PG voltage is outside of range": "VCORE" is the main voltage supplied to the CPU core. Similar to the VSA error, this indicates a problem with the voltage being supplied to CPU 4's core. Again, this is either a voltage level problem or a faulty power good signal.

  • "The chassis is open while the power is off": This is likely a separate issue. The server detects that the chassis cover is not properly closed. While seemingly minor, some servers will refuse to power on if the chassis is open as a safety precaution.
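To make the grouping above concrete, here is a minimal Python sketch that sorts the four messages from the original post by the component they point at. The function names (`classify`, `group_by_component`) and the category labels are made up for illustration; this is not a Dell tool, and it only recognizes the message wording quoted in this thread.

```python
import re
from collections import defaultdict

# Messages copied from the original post.
MESSAGES = [
    "The system board Pfault fail-safe voltage is outside of range",
    "CPU 4 VSA PG voltage is outside of range",
    "CPU 4 VCORE PG voltage is outside of range",
    "The chassis is open while the power is off",
]

# Matches per-CPU rail messages such as "CPU 4 VCORE PG voltage ...".
CPU_RAIL = re.compile(r"^CPU (\d+) (\w+) PG voltage is outside of range")


def classify(message: str) -> tuple[str, str]:
    """Return a (component, detail) pair for one log message."""
    text = message.strip().rstrip(".:*")
    match = CPU_RAIL.match(text)
    if match:
        return f"CPU {match.group(1)}", f"{match.group(2)} power-good"
    if text.lower().startswith("the system board"):
        return "system board", "fail-safe voltage"
    if "chassis is open" in text.lower():
        return "chassis", "intrusion"
    return "unknown", text


def group_by_component(messages):
    """Group the classified details under their component."""
    grouped = defaultdict(list)
    for message in messages:
        component, detail = classify(message)
        grouped[component].append(detail)
    return dict(grouped)


if __name__ == "__main__":
    for component, details in group_by_component(MESSAGES).items():
        print(f"{component}: {', '.join(details)}")
```

Seen this way, two of the four messages point at the same CPU and one at the board that powers it, which is why the power-delivery path to CPU 4 deserves the most attention.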

Troubleshooting Steps

Given the errors you're seeing, here's a systematic approach to troubleshooting:

  1. Chassis Closure:

    • Ensure the chassis is fully and correctly closed and that any chassis intrusion detection switches are properly engaged. Sometimes these switches can be finicky. This is the easiest thing to check and eliminate.
  2. Power Supply Check:

    • Redundancy: The R940 has redundant power supplies. Are all power supplies installed and lit up correctly? If one has failed, that could be contributing to voltage issues, even if the others are working.
    • Known Good PSU: If possible, test with known good power supplies. Swap them in one at a time to see if the errors clear.
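    • Power Supply Voltages: Check the PSU status and the voltage sensor readings in the iDRAC web interface first. The R940's hot-plug power supplies are not designed for easy probing, so only measure output voltages with a multimeter if you're comfortable doing so and have the correct voltage specifications from Dell's documentation.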
  3. Reseat Components:

    • CPU 4: Carefully reseat CPU 4 and its associated heatsink. Ensure proper thermal paste application when you reinstall the heatsink.
    • RAM: Reseat the RAM modules associated with CPU 4. Consult the R940 documentation to identify which slots are used by CPU 4.
    • Power Connectors: Reseat all power connectors on the motherboard, paying special attention to those near CPU 4 and the main power connectors.
  4. Inspect for Physical Damage:

    • Carefully examine the motherboard for any signs of physical damage, such as:
      • Burned components
      • Swollen capacitors
      • Damaged traces
    • Check for any signs of liquid spills or corrosion.
  5. Minimal Configuration:

    • Try booting the server with the absolute minimum configuration:
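      • The minimum supported number of CPUs, leaving out CPU 4 to see whether the remaining CPUs allow the system to POST (the R940 is specified for two or four processors, so in practice this means CPUs 1 and 2 only; confirm in the installation and service manual)
      • The minimum amount of RAM (in the correct slots for the installed CPUs)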
      • Disconnect all unnecessary peripherals (hard drives, expansion cards, etc.)
  6. Firmware Update (Less Likely to Help):

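    • While less likely to be the root cause, try updating the iDRAC firmware and BIOS to the latest versions. Because the system isn't booting, do this from the iDRAC web interface, which runs on standby power as long as the server is plugged in. Note that a staged BIOS update is normally applied only during the next host boot, so it may not complete until the server can power on.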
  7. Dell Support:

    • Given the nature of these errors, especially the system board Pfault, it's highly likely you have a hardware failure. Contact Dell support. They may have specific diagnostics or be able to guide you through more advanced troubleshooting. Provide them with the error messages you're seeing.
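For the power supply and voltage checks in the steps above, iDRAC can also report sensor status over its Redfish interface, which is handy when collecting details for Dell support. The sketch below assumes the standard Redfish Power resource layout (a `Voltages` list whose entries carry `Name` and `Status`), which on iDRAC9 is typically served at `/redfish/v1/Chassis/System.Embedded.1/Power`; verify the path on your firmware. The sample data and sensor names are invented for illustration and are not readings from this server.

```python
def unhealthy_voltages(power_resource: dict) -> list[tuple[str, str]]:
    """Return (sensor name, health) for each enabled voltage sensor that is not OK."""
    flagged = []
    for sensor in power_resource.get("Voltages", []):
        status = sensor.get("Status") or {}
        # Skip sensors for hardware that is absent or disabled.
        if status.get("State") != "Enabled":
            continue
        health = status.get("Health")
        if health != "OK":
            flagged.append((sensor.get("Name", "unnamed"), str(health)))
    return flagged


# Hypothetical excerpt of a Power resource; names and values are made up.
SAMPLE_POWER = {
    "Voltages": [
        {"Name": "CPU1 VCORE PG", "Status": {"State": "Enabled", "Health": "OK"}},
        {"Name": "CPU4 VCORE PG", "Status": {"State": "Enabled", "Health": "Critical"}},
        {"Name": "CPU4 VSA PG", "Status": {"State": "Enabled", "Health": "Critical"}},
        {"Name": "PS3 Voltage", "Status": {"State": "Absent", "Health": None}},
    ]
}

if __name__ == "__main__":
    for name, health in unhealthy_voltages(SAMPLE_POWER):
        print(f"{name}: {health}")
```

In practice you would fetch the JSON from the iDRAC with any HTTPS client and pass the parsed dictionary to `unhealthy_voltages`; the list it returns is a compact summary to include in a support case.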

Likely Causes and Next Steps

  • Most Likely: A faulty motherboard is the most likely culprit, especially with the "system board Pfault" error.
  • Possible: A failing power supply (or power supplies) could also be the cause, or contributing factor.
  • Less Likely (but Possible): A faulty CPU 4 or a problem with the voltage regulator circuitry for CPU 4.

Important Safety Notes:

  • Always disconnect the server from the power source before working on internal components.
  • Use an ESD wrist strap to prevent electrostatic discharge damage.
  • Refer to the Dell PowerEdge R940 service manual for detailed instructions on component removal and installation.

Let me know the results of these steps. If you can provide more information about the server's configuration (e.g., how many PSUs, when the problem started, any recent changes), it might help narrow down the possibilities. Good luck!
