Unsolved

1 Rookie

 • 

12 Posts

October 21st, 2024 14:59

r740xd with 24x nvme backplane troubles.

I have a server that was damaged in shipping; Riser 2 had broken connectors. I have replaced the riser and it seems to be working fine. However, I am having huge problems getting all the NVMe slots functioning.

First of all, I am getting this error on bootup: HWC2003 the storage BP2 power cable is not connected.

Now, the server has no midplane, which as far as I can google is where the BP2 power connector is supposed to go (I don't have one), so I am looking for an explanation for this error.

And no matter what I try, I cannot get the 2? backplanes in the front detected. Backplane 1 is listed as having 12 slots, so I am guessing there should be a Backplane 2 with the second half of the slots?

The extender connected to the second half of the backplane is also not recognized properly. It is listed in the inventory, but it does not show up under Storage in iDRAC. Only the one connected to the first half of the backplane shows up there.

I have tried switching places of the extenders, but same result. Tried several different 16x pcie slots and the same symptom follows. 

At the moment I am leaning towards one of the switch chips on the backplane being defective. But I haven't been able to find definitive proof that it is dead, mostly because of the following:

Some of the NVMe drives in the second half of the slots are detected in TrueNAS, but NOT in iDRAC storage. And some of them show up in inventory as NVMe drives connected to Backplane 17 or 19 or some such nonsense.

Also, the drives connected to the "working" part of the backplane are listed as "Ready" in idrac storage, (not online), and they are NOT detected in TrueNAS. 

I'm highly confused about this so I'm trying to reach out to someone with more experience with these servers.

Moderator

 • 

4.4K Posts

October 21st, 2024 19:37

Hello,

 

I'm sorry to see that. Fixing a system that was damaged in shipping can be trying.

Does the replacement Riser2 have the same part number as the original Riser 2 in the system?

 

Check these places for BP2 cable and see if they are the ones you are looking for.

 

Page 182 connection #19

Backplane 2 signal connector

#19 J_BP_SIG2 Backplane 2 signal connector

   Check #6 on page 155. Figure 153. for the Cable routing – 24 x 2.5 inch NVMe drive backplane

https://dell.to/3UhXd7w

 

 

Page 182 connection #22

BP2 appears to be Riser 2 to system board

#22 J_BP2 (RSR2_225W) Backplane 2 power connector (Riser 2 PCIe 225 W power)

https://dell.to/3UhXd7w

 

Along with damaged connectors did you also verify any cable damage?

 

 

For the hard drives; did you have a Foreign configuration on the PERC controller that you could import?

Page 265 Importing foreign configuration using iDRAC web interface

https://dell.to/3UhXd7w

 

You may need to get the backplane issue resolved to see all the drives.

1 Rookie

 • 

12 Posts

October 22nd, 2024 14:26

@DELL-Charles R​ 

Does the replacement Riser2 have the same part number as the original Riser 2 in the system?

 

Yes I verified that it has the exact same part number.

Page 182 connection #19

Backplane 2 signal connector

#19 J_BP_SIG2 Backplane 2 signal connector

   Check #6 on page 155. Figure 153. for the Cable routing – 24 x 2.5 inch NVMe drive backplane

https://dell.to/3UhXd7w

 

This I have checked and verified continuity with a multimeter. And also the server will not power on with either of the signal wires disconnected or either of the power cables disconnected.

 

Page 182 connection #22

BP2 appears to be Riser 2 to system board

#22 J_BP2 (RSR2_225W) Backplane 2 power connector (Riser 2 PCIe 225 W power)

https://dell.to/3UhXd7w

 

I read that same thing, but it's very unspecific. Does this mean that the BP2 power socket is powered by the same traces that power Riser 2? (I'm guessing "system board" is their version of motherboard.)

Along with damaged connectors did you also verify any cable damage?

 

No cable damage as far as I can tell, I have not tried to test continuity of all the pcie extenders though. 

For the hard drives; did you have a Foreign configuration on the PERC controller that you could import?

Page 265 Importing foreign configuration using iDRAC web interface

https://dell.to/3UhXd7w

I have no PERC controller at all, just pcie extenders. But also, I have no prior configs.

I have screenshots from the seller of the server identifying drives 1-24 and both pcie extenders in iDRAC, now it only shows the extender connected to the first half of the front backplane.

This remains the case if I switch the extenders and the pcie slots. Which leads me towards the second half of the front backplane having an issue. 

HOWEVER. I did make a discovery last night: there was a minor break in the rearmost connector for Riser 2 on the motherboard. We're talking so small that I had to use a magnifying glass to notice it, and that means that there might have been 2-3 pins that missed their intended connections on Riser 2.

This shouldn't really matter since I have no extenders plugged into Riser 2 atm. Unless, this riser gets power from the same place as the BP2 socket on the motherboard. Which MIGHT have made a short or something that makes the motherboard think that there is a middle backplane connected, which MAYBE? disables the front second half of the nvme backplane?

This I hope you can shed some light on. Does the second half of the front NVMe backplane get disabled (downgraded to SAS?) if the mid backplane is detected?

(edited)

Moderator

 • 

4.4K Posts

October 22nd, 2024 15:08

Hello,

 

If you didn't have BP2 to Riser 2 originally, I wouldn't think you need it. But the damage to the socket, and especially to the system board (motherboard), could be causing issues in the multilayer PCB. I can't say what all could be damaged.

 

I don't know why the signal wires disconnected would cause the system not to power on.

 

You could try bringing the system down to minimum-to-POST configuration for troubleshooting.

 

Minimum to POST (this is the minimum set of components needed to POST. If you get a successful POST, put things back a little at a time until you find the faulting component)

 

The minimum components to allow the PowerEdge R740xd to complete POST are:

 

* One processor (CPU) in socket processor 1

* One memory module (DIMM) in socket A1

* One power supply unit

* Right control panel and cable (for power button functionality)

* System board

 

 

Remove anything not on that list: DVD, Hard drives, PERC controller, backplane, network card, NIC cable, any pcie cards, keyboard, mouse, USB devices, …. anything not on the list remove.

 

If it does not post then the issue is with one of those minimum to post components.

If you get successful POST, put things back a little at a time until you find the faulting component.

 

 

You might check whether you can get a replacement under the shipping company's insurance.

1 Rookie

 • 

12 Posts

October 22nd, 2024 16:06

@DELL-Charles R​ 

I've already been down the minimum viable product route a few times.

There is no cable from the motherboard to Riser 2 (or any of the risers).

You need both CPUs plugged in to access both backplanes, since they are connected to different CPUs (they need to be, because of the length limitations of the extender cables). I don't really have any problems posting, except for that nagging BP2 power connector fault.

The problem is that the different Dell documentation gives differing info as to what BP2 is. Is it the second half of the front backplane? Or is it the mid backplane? In certain places in Lifecycle I believe I found them listed as Backplane 0, 1 and 2. But the motherboard power connectors for the front backplane are labeled BP1 and BP3.

So is this BP2 error even related to the front backplane at all? Or is it simply a red herring?

But again, can you tell me how the server would behave if I connected a mid backplane to the BP2 power and a PERC? Would it change the front backplane from 24x NVMe to 12x NVMe + something else (SAS?)?

ex: https://www.reddit.com/r/homelab/comments/15rls28/r740xd_with_24_nvme_drives_working_dell_parts_list/

This guy had some of the same errors when building his server to R740xd specs, and that is exactly the wiring I have in mine (minus the PERC and SAS cables).

Notice that Figure 16 from the manual lists: 

24 x 2.5 inch drive backplane with up to 24 NVMe or up to 8 SAS/SATA drives with NVMe

Could you tell me what that means? There are only 24 slots for drives, so how are the SAS/SATA drives supposed to be incorporated? Could it be that only a few slots in the second half of the backplane are NVMe compatible if the motherboard thinks that there is a PERC / midplane connected?

Moderator

 • 

4.4K Posts

October 22nd, 2024 18:36

Hello,

 

 

Has anything changed in the original configuration, particularly in the backplane configuration?

 

 

I ask that because we don't support field changing of backplane configuration as there are no kits for that.

 

Looks like #21 and #22 are power for GPU:

#21 - J_BP0 (RSR3_225W) Backplane 0 power connector (Riser 3 PCIe 225 W power)

#22 - J_BP2 (RSR2_225W) Backplane 2 power connector (Riser 2 PCIe 225 W power)

The riser power connector provides additional power up to 225 W to support PCIe cards that are rated above 75 W - 300 W

 

 

From page 145 can you tell me which cable configuration your system has?

And which backplane you have from page 136?

https://dell.to/3UhXd7w

 

1 Rookie

 • 

12 Posts

October 23rd, 2024 06:13

@DELL-Charles R​ 

Nothing has been changed; it's an original 24x NVMe server.

Looks like #21 and #22 are power for GPU:

#21 - J_BP0 (RSR3_225W) Backplane 0 power connector (Riser 3 PCIe 225 W power)

#22 - J_BP2 (RSR2_225W) Backplane 2 power connector (Riser 2 PCIe 225 W power)

The riser power connector provides additional power up to 225 W to support PCIe cards that are rated above 75 W - 300 W

That was my suspicion as well, but then why is the server whining about not having this connector "connected"? I do not have any cards at all plugged into Riser 2. (ATM).

Can I surmise that "Backplane 0" is for the optional rear backplane, and that Backplane 2 is the mid backplane? So the front would be 1 and 3? 

From page 145 can you tell me which cable configuration your system has?

And which backplane you have from page 136?

I have the cables from page 155 Figure 153.

And I have the backplane from page 136 Figure 125.

(and notice that it has a SAS connector, hence why I'm asking what the behavior would be if this is plugged in, it is not well explained in the manual as far as I can tell.)

Moderator

 • 

2.8K Posts

October 23rd, 2024 07:20

Hi,

According to the manual, the riser power connectors (#21 and #22) are there to give extra power to PCIe cards that need more than 75W. Since you don’t have any cards in Riser 2, the problem might be with the backplane setup. About the backplane numbers, you’re right! They usually go like this:

front backplane (closest to the front of the server), mid backplane (Backplane 2, as you mentioned), rear backplane (closest to the back of the server)

So, if you’re not using the mid backplane, the server might be complaining because it expects it to be connected. It’s hard for me to give a definitive answer; I'm just brainstorming. Let's see what Charles thinks.

Moderator

 • 

4.4K Posts

October 23rd, 2024 20:38

Hello,

 

Try this:   check the backplane mode through the iDRAC GUI: iDRAC > Storage > Overview > Enclosures

 

See what difference it makes in Split mode and Unified mode

 

After setting it, you will need to save, exit and shut down (Power Cycle).

A Restart won’t be enough for the server to apply the change.

 

Configuring backplane mode using web interface

https://dell.to/40hITQ3

 

Note an additional reboot of the DRAC and Host power cycle is needed:

"To avoid inventory issues, in case of any back plane cable connection changes you require additional iDRAC reboot and Host power cycle"

1 Rookie

 • 

12 Posts

October 24th, 2024 15:12

@DELL-Erman O​ 

Thanks for chipping in, but that didn't really answer what I am wondering. Fair enough, the mid plane might be listed as BP2, but what is the rear plane then? (0 or 3)? And are the 2 fronts 0-1 or 1 and 3?

1 Rookie

 • 

12 Posts

October 24th, 2024 15:15

@DELL-Charles R​ 

Aw, I almost got my hopes up that there was a setting I had missed. But sadly, that option is not there, and I have this info listed in advanced properties:

Backplane 1 of PCIe Extender in PCIe Slot 8

......

Enclosure Split Mode Capability: Not Capable

And if you're curious, here is the full list of properties.

Advanced Properties
Device Description: Backplane 1 of PCIe Extender in PCIe Slot 8
Connector: 0
Enclosure Position: Not Applicable
Enclosure Location: Right
Bay ID: 1
Firmware Version: 4.35
SAS Address: Not Applicable
Enclosure Split Mode Capability: Not Capable
PCI Express Generation: Gen 3

Again, the other extender/backplane does not show up :(

(edited)

Moderator

 • 

4.4K Posts

October 24th, 2024 15:36

Hello,

 

BP 0 is Rear, BP 1 is Front and BP2 is Mid tray

 

My understanding is you have all original parts, nothing changed, added or removed and system was working without error.

Then the shipping damage occurred and it started reporting errors?

Have you put everything back in and connected as originally shipped?

We shouldn't be needing any other cables or hardware if it was working in original configuration.

 

Have you tried or could you try running the built in diagnostics and see if it has any information about this issue?

Diagnostics:

Boot to  F11 on Dell Splash screen, selecting  Boot Manager -> System Utilities -> Launch Dell Diagnostics.  Note any messages and continue testing.

1 Rookie

 • 

12 Posts

October 24th, 2024 18:35

@DELL-Charles R​ 

Yes, it's all original as far as I know, and the seller says everything was working. He sent me screenshots, among them one of both extenders being listed in iDRAC. Now it will not list both.

I have run diagnostics several times, but they finish without error except for a warning about there being errors in the event log (the BP2 power thing).

But if I check the Results tab manually and go through the PCIe training portion, some things look fishy. For instance:

It lists what I believe is the second part of the front bay, as Bay0.

But it lists all the slots (12) as Link not active, Bay0 Slot XX empty. (which they are atm)

But then if I move to Configuration (hw config), Backplane 0 = Not installed, BP0 power and signal = not installed. (So here I believe that Backplane 0 is NOT the same as is listed in the Results of the scan.)

The PCIe extender plugged into the second half of the front backplane is shown, but it's not listed with full info like the other one. It is missing a few details for some reason.

My gut is telling me that there are no "faults" with the extender or the backplane, but that it has somehow been "disabled" because of the BP2 power error. 

BTW: The error is shown a bit different in the Diagnostic Results. Might be a more informative msg:  Critical. BP2 Power Cable: Cable sensor, configuration error was asserted.

Sorry about the messy post, but lots of info in diagnostic, and no way to export it?

Moderator

 • 

4.4K Posts

October 24th, 2024 20:28

Hello,

 

Are you able to contact the seller and ask if the system is all original parts as purchased or if there have been any changes?

 

Did you look in the iDRAC System Event Log and the LifeCycle Controller log to see if the BP2 error existed before you took delivery?

 

Try reseating all cables to the backplane, on both ends, and check for any cable damage. (I know, again.)

 

I did see this NOTE: For 24x2.5 inch 24 NVMe drive configuration, the PCIe bridge cards must be installed in slots 3 (CPU1) and 4 (CPU2).

Page 101: https://dell.to/4dVqtYA

Slot 8 is not validated for those bridge cards. That makes me wonder if the supplier made some custom configurations. You may want to ask them.

 

1 Rookie

 • 

12 Posts

October 25th, 2024 10:48

@DELL-Charles R​ 

Are you able to contact the seller and ask if the system is all original parts as purchased or if there have been any changes?

 

Did you look in the iDRAC System Event Log and the LifeCycle Controller log to see if the BP2 error existed before you took delivery?

 

Try reseating all cables to the backplane, on both ends, and check for any cable damage. (I know, again.)

He says the server is all original, and I tend to believe him. The log was cleared during the troubleshooting I have performed, except for an old SupportAssist scan that suggested everything was working. This has also been reset since, so I can't go back and double check.

And he had some screenshots of the system up and running with no errors and both extenders listed. 

Cables have been plugged/replugged an unknown number of times by now ;)

I even tried to check the continuity of the extender cables, but they aren't "straight through", meaning the pin layout is different on one end and the other, so very hard to do any actual checking.

I did see this NOTE: For 24x2.5 inch 24 NVMe drive configuration, the PCIe bridge cards must be installed in slots 3 (CPU1) and 4 (CPU2).

I also read this, which is why I have been trying to fix the damage to the Riser 2 connector on the motherboard (the damage of course had to be on the exact pins that go to slot 4), but it's simply too small to handle without special equipment, so I am probably gonna need a new motherboard to know for sure what the issue is. :( 

Now it's getting very expensive throwing money at the problem to see what sticks....

The only extender showing up is the one in slot 8, so I suspect that "must be" isn't completely true.

BTW: originally the extenders were in slot 3 and 4. 

1 Rookie

 • 

12 Posts

November 16th, 2024 13:00

I'm back..

So, I found myself a replacement motherboard, and now all the risers have proper connections. It did not, however, solve my issues.

Now I have an error saying BP0 power is not connected, and the same issue with the extenders is still present.

Only one of the extenders shows up in iDRAC under Storage -> Controllers, and any disk connected to the other extender will not show up in the iDRAC inventory.

They DO however show up in TrueNAS, so the connection is working.
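For cross-checking what the OS sees against the iDRAC inventory, here is a minimal sketch. It assumes a Linux-based install (TrueNAS SCALE, not the FreeBSD-based CORE) and the usual `/sys/class/nvme` layout with `model`, `serial` and `address` attributes. The function names are made up for illustration, and the iDRAC serials would have to be copied by hand from the inventory page.

```python
import os


def list_nvme_controllers(sysfs_root="/sys/class/nvme"):
    """Map each NVMe controller name to its model, serial and PCI address."""
    controllers = {}
    if not os.path.isdir(sysfs_root):
        return controllers  # no NVMe class directory (or not a Linux host)
    for name in sorted(os.listdir(sysfs_root)):
        info = {}
        for attr in ("model", "serial", "address"):
            path = os.path.join(sysfs_root, name, attr)
            try:
                with open(path) as fh:
                    info[attr] = fh.read().strip()
            except OSError:
                info[attr] = None  # attribute missing or unreadable
        controllers[name] = info
    return controllers


def missing_from_idrac(os_serials, idrac_serials):
    """Serials the OS sees that are absent from the (hand-copied) iDRAC list."""
    return sorted(set(os_serials) - set(idrac_serials))


if __name__ == "__main__":
    for name, info in list_nvme_controllers().items():
        print(name, info["address"], info["model"], info["serial"])
```

The PCI address printed for each drive shows which root port or switch it hangs off, which may help tell whether the drives missing from iDRAC all sit behind the same extender.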

It seems to me that there is a fault with one of the switch chips on the backplane that makes the system not detect it, so only half the backplane shows up in hw inventory.

I tried switching the extender cables from one side to the other which didn't change any behavior.

So I'm still wondering why the server is still complaining about BP0 power not being installed, and which connector this physically is supposed to be. It does list the BP0 signal cable as installed.
