1 Rookie

 • 

18 Posts

40

January 6th, 2025 03:47

Please see the log for my Dell PowerEdge T320 and explain the errors you find in it

Please see my Dell PowerEdge T320 Lifecycle Controller Log and explain the errors found in it. Here's the Google Drive link I created for it:

https://drive.google.com/file/d/1-B2UmJOS8mf1vNODAMCdpDCBn0qdtKe1/view?usp=drivesdk

I am planning to use this machine as a family file server for old family photos, thanks!

Moderator

 • 

4.6K Posts

January 7th, 2025 21:45

Hello,

 

These are the memory errors Erman was referring to. You can see them in your Lifecycle Controller log:

 

2024-11-08 17:19:42        MEM0001        Multi-bit memory errors detected on a memory device at location(s) DIMM_A1.

2024-11-08 17:19:33        CPU0704        CPU 1 machine check error detected.

2024-11-08 12:15:30        MEM0001        Multi-bit memory errors detected on a memory device at location(s) DIMM_A1.

2024-11-08 12:15:22        CPU0704        CPU 1 machine check error detected.

2024-11-08 12:14:44        MEM0001        Multi-bit memory errors detected on a memory device at location(s) DIMM_A1.

2024-11-08 12:14:35        CPU0704        CPU 1 machine check error detected.

2024-11-08 12:14:33        MEM0001        Multi-bit memory errors detected on a memory device at location(s) DIMM_A1.

 

I'll note that I don't see them after November 2024. Did you already replace the memory?
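
If you want to tally entries like these yourself, here is a minimal Python sketch. It assumes the log lines are laid out exactly as pasted above (timestamp, message ID, message text, separated by whitespace); an actual Lifecycle Controller export may use a different format, and the names `LOG_LINES` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Entries copied from the excerpt above. The three-column layout
# (timestamp, message ID, message) is an assumption based on how
# the lines appear in this post, not on the raw export format.
LOG_LINES = [
    "2024-11-08 17:19:42        MEM0001        Multi-bit memory errors detected on a memory device at location(s) DIMM_A1.",
    "2024-11-08 17:19:33        CPU0704        CPU 1 machine check error detected.",
    "2024-11-08 12:15:30        MEM0001        Multi-bit memory errors detected on a memory device at location(s) DIMM_A1.",
    "2024-11-08 12:15:22        CPU0704        CPU 1 machine check error detected.",
    "2024-11-08 12:14:44        MEM0001        Multi-bit memory errors detected on a memory device at location(s) DIMM_A1.",
    "2024-11-08 12:14:35        CPU0704        CPU 1 machine check error detected.",
    "2024-11-08 12:14:33        MEM0001        Multi-bit memory errors detected on a memory device at location(s) DIMM_A1.",
]

# timestamp, message ID (letters followed by digits), free-text message
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+([A-Z]+\d+)\s+(.*)$")
# DIMM locations as they are written in the messages, e.g. DIMM_A1
DIMM = re.compile(r"DIMM_[A-Z]\d+")


def summarize(lines):
    """Count entries per message ID and per DIMM location mentioned."""
    by_id = Counter()
    by_dimm = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip anything that is not a log entry
        _, msg_id, message = match.groups()
        by_id[msg_id] += 1
        for slot in DIMM.findall(message):
            by_dimm[slot] += 1
    return by_id, by_dimm


by_id, by_dimm = summarize(LOG_LINES)
print(dict(by_id))    # {'MEM0001': 4, 'CPU0704': 3}
print(dict(by_dimm))  # {'DIMM_A1': 4}
```

For the seven entries quoted here, every memory error names DIMM_A1, which is why that slot is the one to test first.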

Moderator

 • 

2.9K Posts

January 6th, 2025 10:53

Hello,

First, keep the firmware updated to reduce memory errors and prolong DIMM life. For memory errors on Dell PowerEdge systems:

Troubleshooting Steps:

  1. Identify Errors: Single-bit (SBE) or multi-bit errors (MBE) might not be due to the DIMM itself.
  2. Swap Testing: Swap memory DIMMs in different sockets, channels, banks, and controllers to find the faulty one.

Methods:

  • Method 1: Swap DIMM A1 with A9 (different channel and bank).
  • Method 2: Swap DIMM A1 with B1 (different memory controller).
  • Method 3: Swap the whole bank (A1, A2, A3 with B1, B2, B3).
  • Method 4: Swap the whole channel (A1, A4, A7 with B1, B4, B7).

Interpreting Results:

  • If the error moves with the DIMM, the DIMM is faulty.
  • If the error stays in the same socket, the system board or CPU might be faulty.
  • If the error moves with the CPU, replace the CPU.
  • If the error stays with the socket, replace the system board.
  • If the error appears on a different DIMM, another DIMM might be faulty.

Hope that helps!

 

1 Rookie

 • 

18 Posts

January 7th, 2025 20:59

Hi, I'm an IT person. I was just wondering what errors you found in the Lifecycle Controller Log that I shared above?

1 Rookie

 • 

18 Posts

March 6th, 2025 02:25

So it's DIMM A1 that the memory errors are coming from?

Moderator

 • 

4K Posts

March 6th, 2025 06:43

Hi,

 

If you have already updated iDRAC/LCC and the BIOS and carried out the troubleshooting steps Erman suggested, and the error follows the DIMM, then the DIMM is faulty. If the error stays with the slot, the fault is in the system board or, sometimes, the CPU.
