May 31st, 2013 01:00

Unexpected reboot. Segfault in ishy4022.sys

Hi,

I am looking for suggestions on how to isolate the cause of frequent reboots of an SP on a Clariion CX4 system. Over the past month we have observed three unexpected reboots of an SP on a production system. The problem started exactly a month ago.

The relevant message from the NT Application Log (viewed through the SP application log) is the same in all instances:

The Storage Processor rebooted unexpectedly @ TIME on DATE: BugCheck 0, {0000000000000000, 0000000003107000, fffffadfa69ad298, 0000000000001772}, Failing Instruction: 0xfffffadf9e765ac9 in ishy4022.sys loaded @ 0xfffffadf9e761000   76008106
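
For anyone comparing several of these entries, here is a minimal Python sketch that pulls the fields out of the message above and computes how far into the driver image the failing instruction sits. The regular expression and the `parse_bugcheck` helper are my own illustration, not an EMC tool, and they assume the message always has exactly the layout shown above.

```python
import re

# The log entry quoted above, with the TIME/DATE placeholders left as-is.
MESSAGE = (
    "The Storage Processor rebooted unexpectedly @ TIME on DATE: BugCheck 0, "
    "{0000000000000000, 0000000003107000, fffffadfa69ad298, 0000000000001772}, "
    "Failing Instruction: 0xfffffadf9e765ac9 in ishy4022.sys "
    "loaded @ 0xfffffadf9e761000   76008106"
)

# Assumed layout: code, four hex parameters in braces, failing instruction
# address, module name, module load address.
PATTERN = re.compile(
    r"BugCheck\s+(?P<code>[0-9A-Fa-f]+),\s*"
    r"\{(?P<params>[^}]*)\},\s*"
    r"Failing Instruction:\s*0x(?P<ip>[0-9A-Fa-f]+)\s+in\s+(?P<module>\S+)\s+"
    r"loaded @\s*0x(?P<base>[0-9A-Fa-f]+)"
)


def parse_bugcheck(message):
    """Return the bugcheck fields as a dict, or None if the layout differs."""
    match = PATTERN.search(message)
    if match is None:
        return None
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    return {
        "code": int(match.group("code"), 16),
        "params": [int(p.strip(), 16) for p in match.group("params").split(",")],
        "module": match.group("module"),
        # Offset of the failing instruction from the module load address.
        "offset": ip - base,
    }


info = parse_bugcheck(MESSAGE)
print(info["module"], hex(info["offset"]))
```

If the module and offset are identical across all three reboots, that at least suggests the same code path in the driver is failing each time, which may be worth mentioning to support.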

The admins who manage some of the Windows 2012 hosts connected to the system have also complained that their hosts did not pick up some of the LUNs trespassed to the other controller, which is causing them a lot of pain. I do not know the details of their setup, except that they are using PowerPath over FC.

The driver name ishy4022.sys does not look like a regular Windows driver that would be present in the FLARE installation, so it must be something specific to the Clariion.

We are running FLARE version 4.30.000.5.517 with PROM version 5.30.0.

Thank you,

  Dmitri

June 2nd, 2013 00:00

dchubarov,

This requires support to review.  To expedite, knowing that they will ask for it, obtain the dump file (*.dmp.zip) in addition to the usual SP collects.

emc196023: "How to gather the dump file from a CLARiiON Storage Processor after a FLARE bugcheck"

4.5K Posts

June 7th, 2013 09:00

Was your question answered correctly? If so, please remember to mark your question Answered when you get the correct answer and award points to the person providing the answer. This helps others searching for a similar issue.

glen

9 Posts

June 8th, 2013 06:00

Following the helpful comment from Christopher, I obtained the kernel dump and the bugcheck analysis file. The SP collects and the dumps were sent to support, and in the meantime the analysis text file provided the hints needed to get an idea of what was going on.

ishy4022.sys is the driver for the QLogic ISP 4022 TCP offload engine, which performs full iSCSI offload. The problem is therefore likely related to the Ethernet fabric: something there caused the driver to panic, which led to the bugcheck reboot of the SP.

The failure of the Windows machines connected over FC to trespass to the other controller is probably an unrelated issue, and additional testing should be performed.

Further research is probably above my station. Thank you for the helpful advice.

4.5K Posts

June 10th, 2013 10:00

Two items you can check:

1. The failover mode setting on the array, when running FLARE release 30 or higher with Windows 2008/2012 hosts, should be 4 (ALUA). On the hosts, PowerPath should configure itself.

2. See which version of PowerPath is being used. There are some issues with release 5.7, and the previous release works, but I am not sure which version is needed to support Windows 2012. They should check the PowerPath release notes to ensure they have the correct version and any hot fixes.

Support Solution emc254967 contains information about using the host agent and other information about host connections. There should also be a document for connecting a CX4 to Windows servers, "EMC Host Connectivity Guide for Windows", which should contain the latest information for connecting Win2012 hosts to the array.


https://support.emc.com/docu5134_Host_Connectivity_Guide_for_Windows.pdf?language=en_US


glen
