Unsolved

7 Posts


August 27th, 2021 07:00

PS4210XV performance degradation after upgrading from ESXi 6.7 to 7.0.2

We upgraded three ESXi 6.7 hosts to 7.0.2 (Dell PE R6516), all talking iSCSI over 2x 10 Gbit NICs to the PS4210XV. After the upgrade, read performance was degraded.
A synthetic test with four subtests (dd read and write with 512-byte and 1 MB block sizes) gives approx. 500-600 MB/s on 6.7. With 7.0.2 the write rate stays the same, but reads drop to approx. 40 MB/s. A fresh installation instead of an upgrade gives about 180 MB/s for reads, still only about a third of the original performance.
We reinstalled one host back to 6.7 and got the original, good performance.
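For reference, the test is roughly of this form when run inside a Linux VM on the affected datastore (the file paths, counts and direct-I/O flags below are illustrative assumptions, not our exact script):

  dd if=/dev/zero of=/vmtest/ddtest.bin bs=1M count=4096 oflag=direct        # sequential write, 1M blocks
  dd if=/vmtest/ddtest.bin of=/dev/null bs=1M iflag=direct                   # sequential read, 1M blocks
  dd if=/dev/zero of=/vmtest/ddtest.bin bs=512 count=1000000 oflag=direct    # write, 512-byte blocks
  dd if=/vmtest/ddtest.bin of=/dev/null bs=512 iflag=direct                  # read, 512-byte blocks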

Has anyone had similar experiences and perhaps a solution?

Moderator

 • 

7.7K Posts

August 27th, 2021 16:00

Hello Wia,

What is the current version of firmware that your PS4210XV is running?

4 Operator

 • 

1.5K Posts

August 29th, 2021 04:00

Hello, 

 I believe you have a case open on this issue, and the firmware is 10.x?

 ESXi v7.x is not a supported, certified OS for the Dell PS Series SANs, though I have not heard of other customers who upgraded having the issue you describe. Since the performance returns after downgrading, I would first ensure that all the recommended best practices are in place: MPIO, Delayed ACK, login timeout, etc. Also check that the VMs don't share multiple VMDKs on a single virtual SCSI adapter. If your VMs are using the paravirtual driver, you might want to try the LSI one instead.

 When you are testing, has anyone used ESXTOP and expanded the disk I/O stats to see where the delay is greatest? In ESXTOP you can split the disk latency into subsets that show whether the delay is really at the storage or at the ESXi kernel level.

 https://kb.vmware.com/s/article/1008205
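 For anyone reproducing this, the workflow is roughly as follows (field names per the KB article above; adapter and device names will differ per host):

  esxtop                 # from the ESXi shell
  # press 'd' for the disk adapter view, 'u' for the per-device view
  # press 'f' to toggle the latency field columns if they are not visible
  # DAVG/cmd = latency at the device (array + fabric)
  # KAVG/cmd = latency inside the ESXi storage stack
  # GAVG/cmd = DAVG + KAVG, i.e. what the guest experiences
  # A high DAVG points at the array/network path, a high KAVG at the host side.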

 Regards, 

Don

7 Posts

August 31st, 2021 02:00

The EQL firmware is V10.0.3.
Yes, I have an open case.

It is listed in the HCL.

We have reproduced this with an upgraded ESXi as well as with a fresh install. The fresh install follows the best practices recommended by EQL and VMware.

My goal is to learn whether other EqualLogic users are testing this or have already run into the same issue.

7 Posts

August 31st, 2021 07:00

We tested this and a lot of other best practices. We always had MPIO with Round Robin and no jumbo frames; we also tested with the FIXED policy and with jumbo frames.
The configuration is the same as on 6.7 because it was a migration install. We also tested with a fresh install.

Dell and VMware will do nothing here; the entry in the HCL is "incorrect".

Thus, I hope that others do not run into the same problems in their production environments, and I will end the discussion here.

4 Operator

 • 

1.5K Posts

August 31st, 2021 07:00

Hi, 

 I have not seen any other EQL customer report a problem upgrading to ESXi v7.x, and I have a small cluster in my lab that hasn't shown any issues either. That said, it's not a qualified OS for the Dell PS Series SANs.

 I would suggest using VMware Round Robin MPIO with the IOs-per-path value changed to 3, rather than FIXED or the default Round Robin value of 1000 IOs per path.
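 A minimal sketch of setting that from the ESXi shell (the naa ID is a placeholder for the EQL volume's device ID):

  esxcli storage nmp device list        # find the EQL device IDs and current path selection policy
  esxcli storage nmp device set --device=naa.6xxxxxxxxxxxxxxxx --psp=VMW_PSP_RR
  esxcli storage nmp psp roundrobin deviceconfig set --device=naa.6xxxxxxxxxxxxxxxx --type=iops --iops=3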

 With regards, 

Don

7 Posts

August 31st, 2021 08:00

Broadcom BCM57416 NetXtreme-E 10GBASE-T RDMA Ethernet Controller, driver bnxtnet
SW iSCSI

4 Operator

 • 

1.5K Posts

August 31st, 2021 08:00

Hello, 

  The entry isn't "incorrect". Dell EMC didn't submit qualification test results for EQL and ESXi v7.x. VMware's default is always going to be FIXED unless the vendor supplies the results of the QA certification suite; then the onus is on the storage vendor, not VMware, to support that MPIO mode. The same goes for the Dell MEM MPIO enhancement: Dell has to do the testing and certification, then provide that to VMware to get the security key needed to install it as a certified extension.

 Jumbo frames are very helpful to reduce CPU load and get a little extra performance. Since the array works fine with v6.7, that suggests the array and switches are working correctly. My hunch is a network driver. What kind of network cards are you using? If you are using the Broadcom iSCSI offload, have you tried just using the SW iSCSI adapter? I've seen firmware/driver mismatches cause issues in the past with dependent iSCSI HW adapters, and on some 10GbE Intel NICs I believe interrupt coalescing caused issues.
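 As a quick reference, the driver and firmware versions in play can be checked from the ESXi shell roughly like this (the vmnic number and target IP are placeholders):

  esxcli network nic list                   # uplinks, drivers, link state
  esxcli network nic get -n vmnic2          # driver and firmware details for one iSCSI uplink
  esxcli software vib list | grep -i bnxt   # installed bnxtnet driver VIB version
  vmkping -d -s 8972 <EQL-group-IP>         # verify jumbo frames end to end, if enabled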

 Regards, 

Don  

 

 

4 Operator

 • 

1.5K Posts

August 31st, 2021 09:00

Hello, 

 When did you last upgrade the servers themselves, as opposed to the VMware OS? If you can, I would suggest upgrading to the current versions and trying again. If that fails, then switch to using the iSCSI offload of those Broadcoms. Just remember to set all the best practices on each adapter first, before trying to discover volumes. You'll have to unbind the SW iSCSI adapter and bind each HW iSCSI adapter individually.
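 Roughly, that rebinding looks like this from the ESXi shell (the vmhba and vmk numbers are placeholders for the Broadcom offload adapters and the iSCSI vmkernel ports):

  esxcli iscsi adapter list                                         # offload adapters show up as extra vmhba entries
  esxcli iscsi networkportal remove --adapter=vmhba65 --nic=vmk1    # unbind from the SW iSCSI adapter
  esxcli iscsi networkportal remove --adapter=vmhba65 --nic=vmk2
  esxcli iscsi networkportal add --adapter=vmhba66 --nic=vmk1       # bind one vmkernel port per HW adapter
  esxcli iscsi networkportal add --adapter=vmhba67 --nic=vmk2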

 Regards, 
Don

7 Posts

September 3rd, 2021 01:00

We have reverted the three servers to 6.7 and have the old, good performance again.

One last interesting test I did: on one of the 6.7 servers I installed a nested ESXi, first as version 6.7, then upgraded it to 7.0.2, configured as far as possible like the physical servers. The performance is the same in both(!) cases, slightly worse due to the double virtualization, but READ and WRITE are as expected here.

So from my point of view the problem lies in the interaction between the 7.x iSCSI stack, the Broadcom BCM57416 driver, and the EQL.

4 Operator

 • 

1.5K Posts

September 5th, 2021 04:00

Hello, 

  Thank you for the update.  That's a very interesting test.  

I don't see it being on the EQL side, but rather with ESXi v7 and the Broadcom. Did you look into upgrading the BIOS and firmware on that server?

  Regards, 

Don
