Unsolved
December 20th, 2017 18:00
Dell Compellent SC4020 with Nexus 9372 Networking
I know this is a very narrow question for this audience, but I pray someone out there is running a similar configuration and can help me.
We own the SC4020 and have it connected to a pair of Nexus 9372 switches. Every version of NX-OS after nxos.7.0.3.I2.1 causes an issue for us, to the point where we are stuck and cannot upgrade. We have engaged both Cisco and Dell concerning this issue.
The issue is that after upgrading past the version mentioned, we see intermittent timeouts with jumbo frames. These correlate with the virtual machines (Hyper-V 2012 R2) that run from the storage freezing. Different actions trigger the timeouts, such as rebooting a virtual machine.
Is anyone out there running these two together successfully and willing, out of the kindness of their heart, to share their configuration with us?
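For anyone trying to reproduce the jumbo frame timeouts described above, here is a minimal sketch of how an end-to-end jumbo path is commonly tested from a Windows host: ping with the Don't Fragment flag and the largest payload that fits in one packet. The MTU of 9000 and the target address 192.0.2.10 are assumptions for illustration only; neither is stated in this thread. The helper names are hypothetical.

```python
from __future__ import annotations

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_icmp_payload(mtu: int) -> int:
    """Return the largest ICMP echo payload that fits in one IPv4 packet at this MTU."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an IPv4 ICMP echo")
    return payload


def windows_ping_command(target: str, mtu: int, count: int = 4) -> list[str]:
    """Build a Windows ping command: -f sets Don't Fragment, -l sets the payload size."""
    return ["ping", "-f", "-l", str(max_icmp_payload(mtu)), "-n", str(count), target]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation placeholder, not an address from this thread.
    # MTU 9000 is an assumed jumbo value; use whatever the hosts are configured for.
    print(" ".join(windows_ping_command("192.0.2.10", mtu=9000)))
```

If a ping at the full payload size fails or times out while a standard-size ping succeeds, some hop in the path is probably not passing jumbo frames, which may help narrow down where the problem sits.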
compellentnexus
December 28th, 2017 08:00
I also wanted to mention that the Hyper-V hosts are connected to 2372 FEXes, which are then connected to the Nexus 9372 parent switches, in case that strikes a chord with anyone out there and their personal experience.
dell-richard g
December 30th, 2017 22:00
A couple of items to consider:
1. I assume that the SC4020 is connected to the 9372PX, correct? (as opposed to the FEX)
2. Are you enabling Rx flow control? (Tx flow control is not necessary.)
3. The "timeouts" you mention may be related to heavy packet drops on the FEX or Nexus switch. Start by checking for packet drops on all switch ports connected to the 4020 and on the server ports. In addition, check for any CRC errors on the same ports (i.e. show interfaces | gr CRC).
If this all worked before the firmware upgrade, then it is possible that there were changes to the buffer/memory management optimization (again, check for packet drops on each switch).
I assume that the SC4020 is not under heavy load, so link/connection issues are also possible. Verify that the switch ports are not bouncing up/down.
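As a rough sketch of the counter check suggested above, the snippet below pulls a few error and discard counters out of saved interface output and reports the non-zero ones. The sample text and the list of counter names are illustrative assumptions, not output captured from the switches in this thread; the real format varies by platform and release, so the patterns would likely need adjusting.

```python
from __future__ import annotations

import re

# Counter labels to look for. This list is an assumption chosen for illustration.
WATCHED = ("CRC", "input error", "input discard", "output discard")

# Hypothetical sample of saved interface counter text (not from a real switch).
SAMPLE = """\
    0 runts  0 giants  3 CRC  0 no buffer
    3 input error  0 short frame  0 overrun
    0 input with dribble  12 input discard
"""


def parse_counters(text: str) -> dict[str, int]:
    """Return the first value found for each watched counter label."""
    counters: dict[str, int] = {}
    for name in WATCHED:
        match = re.search(r"(\d+)\s+" + re.escape(name) + r"\b", text)
        if match:
            counters[name] = int(match.group(1))
    return counters


def nonzero_counters(text: str) -> dict[str, int]:
    """Keep only the counters that are above zero."""
    return {name: value for name, value in parse_counters(text).items() if value > 0}


if __name__ == "__main__":
    for name, value in nonzero_counters(SAMPLE).items():
        print(f"{name}: {value}")
```

Comparing the counters before and after a test action (for example, booting a VM) is usually more telling than a single snapshot, since it shows which counters move when the timeouts occur.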
compellentnexus
January 1st, 2018 19:00
Answers:
1. Yes, the controllers are connected directly to the 9372PX; the Hyper-V host goes into the top-of-rack FEX switches.
2. We followed the Dell best practice guide recently published for the Nexus and Compellent: flow control receive on and priority-flow-control mode off.
3. We were sent a full set of equipment and built a lab, so this host only has two virtual machines and there is zero load in this environment. Simply booting a VM causes timeouts. No errors, although Dell did see some checksum errors on their end, and there are a couple of bugs that had me concerned, but an upgrade of the SC4020 did not help.
SCOS-41806
Storage system iSCSI ports might hibernate when a failed TCP checksum is received on an inbound packet.
SCOS-41452
SCOS-22260
iSCSI I/O cards fail to honor an MTU value change made on a router.
Workaround: Contact Dell Technical Support.
This is a problem on any NX-OS release after nxos.7.0.3.I2.2a.bin. I've tried so many versions, and it seems it will never work again. My theory is that the older releases were allowing something they shouldn't have, and a true fix stopped whatever was allowing it to work. But again, bypassing the FEX works; it is just not a viable solution in my network. The FEXes are 2232.