November 15th, 2013 07:00

cx300i burped!


YIKES!

Our CX300i just dropped and rebooted. I got the following errors, in this sequence:

(I removed most of the N/A fields and reformatted to try to cut down on the size of the post.)

7:44     Event Code:0x9

Description:The device, \Device\Scsi\fcdmtl3, did not respond within the timeout period.   00 00 10 00 01 00 66 00 00 00 00 00 09 00 04 c0 01 01 00 50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 07 00 00 00

Subsystem:APM00060903193     Device:N/A     SP:N/A     Host:cx300_1_spb     Source:fcdmtl

07:44:11 AM     Event Code:0x873     Description:Flare's ATM detects one CMI connection is down.

Subsystem:APM00060903193     Device:SP B     SP:SPB      Host:cx300_1_spb

07:44:11 AM     Event Code:0x908     Description:Fault - Cache Disabling

Subsystem:APM00060903193     Device:SP B     SP:SPB

07:44:12 AM     Event Code:0x9

Description:The device, \Device\Scsi\fcdmtl5, did not respond within the timeout period.   00 00 10 00 01 00 66 00 00 00 00 00 09 00 04 c0 01 01 00 50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 07 00 00 00

Subsystem:APM00060903193     Device:N/A     SP:N/A     Host:cx300_1_spb     Source:fcdmtl

07:44:12 AM     Event Code:0x944     Description:Hard Peer Bus Error

Subsystem:APM00060903193     Device:SP B     SP:SPB     Host:cx300_1_spb

Source:N/A     Category:N/A     Log:Storage Array     Sense Key:0x2     Ext Code1:0xebd77c5c     Ext Code2:0x0

07:44:12 AM     Event Code:0x944     Description:Hard Peer Bus Error     Subsystem:APM00060903193

Device:SP B     SP:SPB     Host:cx300_1_spb    

Source:N/A         Category:N/A     Log:Storage Array     Sense Key:0x1     Ext Code1:0xebd77cc0     Ext Code2:0x0     Type:Error

07:44:12 AM     Event Code:0x40004001     Description:#THREADO: Peer died in Run:  1073774611   40 00 40 01     

Subsystem:APM00060903193     Device:N/A     SP:N/A     Host:cx300_1_spb

Source:MessageDispatcher     Category:NT Application Log     Log:NT Application Log    

Type:Warning

07:44:12 AM     Event Code:0xa23     Description:Peer SP Down.

Subsystem:APM00060903193     Device:SP A     SP:SPB     Host:cx300_1_spb

Source:N/A     Category:N/A     Log:Storage Array     Sense Key:0x3     Ext Code1:0x0     Ext Code2:0x0

Type:Critical Error

07:44:12 AM     Event Code:0x944     Description:Hard Peer Bus Error

Subsystem:APM00060903193     Device:SP B         SP:SPB     Host:cx300_1_spb

Source:N/A     Category:N/A     Log:Storage Array         Sense Key:0x13     Ext Code1:0xebd77bf8     Ext Code2:0x0

Type:Error

07:44:58 AM     Event Code:0x2580   

Description:Storage Array Faulted Bus 0 Enclosure 0 : Faulted Bus 0 Enclosure 0 SPS A : Removed SP A : Removed

Subsystem:APM00060903193     Device:N/A     SP:N/A     Host:cx300_1_spb     Source:N/A         Category:N/A     Log:Application

Type:Error

07:44:58 AM     Event Code:0x1

Description:EV_HBAPort::_handleHBASPStateChanges() - list lengths differ, 3 4  

Subsystem:APM00060903193     Device:N/A     SP:N/A     Host:cx300_1_spb     Source:Navisphere Agent

Category:NT Application Log     Log:NT Application Log    

Type:Warning

07:44:58 AM     Event Code:0x1     Description:Cabling status is unknown  

Subsystem:APM00060903193     Device:N/A     SP:N/A     Host:cx300_1_spb

Source:Navisphere Agent     Category:NT Application Log     Log:NT Application Log

Type:Warning

7:44:58 AM     Event Code:0x1     Description:EV_Object::~EV_Object, entries: 1  

Subsystem:APM00060903193     Device:N/A     SP:N/A     Host:cx300_1_spb     Source:Navisphere Agent

Category:NT Application Log     Log:NT Application Log

Type:Warning

07:44:58 AM     Event Code:0x6     Description:11/15/13 07:44:58 SP A - SP has been removed on host

Subsystem:APM00060903193     Device:SP A     SP:N/A     Host:cx300_1_spb

Type:Error

07:48:28 AM     Event Code:0x7404

Description:Standby Power Supply (Bus 0 Enclosure 0 SPS A) is faulted. See Navisphere Manager Alerts for details.

Subsystem:APM00060903193     Device:Enclosure 0 SPS A     SP:N/A     Host:cx300_1_spb     Log:Application

Type:Error

07:48:32 AM     Event Code:0x7409

Description:Disk Processore Enclosure (Bus 0 Enclosure 0) is faulted. Servers may have lost access to disk drives in this storage system. See Navisphere Manager Alerts for details.

Subsystem:APM00060903193     Host:cx300_1_spb     Log:Application

Type:Error

07:48:38 AM     Event Code:0x743a

Description:Navisphere can no longer manage (SP A). This does not impact server I/O to the storage system. See Navisphere Manager Alerts for details.

Subsystem:APM00060903193         Host:cx300_1_spb     Log:Application

Type:Error

07:48:38 AM     Event Code:0x720e

Description:Initiator (iqn.1991-05.com.microsoft:Srv062n.dom.com) on Server (srv062N.dom.com) registered with the storage system is now inactive. It does not have a working physical connection. See Navisphere Manager for details.

Subsystem:APM00060903193    Host:cx300_1_spb     Log:Application

Type:Warning

(The above message repeated for several servers)
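For anyone trying to line these entries up, here is a minimal sketch of pulling the time, event code, and description out of the pasted format. It assumes the layout used in this post (time, then `Event Code:`, then an optional `Description:` on the same line, with fields separated by runs of spaces). The names `ENTRY` and `parse_events` are made up for illustration and are not part of Navisphere or any EMC tool.

```python
import re

# A few entries copied from the log above, in the same layout as the post.
LOG = """\
07:44:11 AM     Event Code:0x873     Description:Flare's ATM detects one CMI connection is down.
07:44:11 AM     Event Code:0x908     Description:Fault - Cache Disabling
07:44:12 AM     Event Code:0x944     Description:Hard Peer Bus Error     Subsystem:APM00060903193
07:44:12 AM     Event Code:0xa23     Description:Peer SP Down.
07:48:28 AM     Event Code:0x7404
"""

# Hypothetical pattern for this post's layout: a time, an optional AM/PM,
# "Event Code:<hex>", and optionally "Description:<text>" that ends at a
# run of two or more spaces or at the end of the line.
ENTRY = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}(?::\d{2})?)(?:\s+[AP]M)?\s+"
    r"Event Code:(?P<code>0x[0-9a-fA-F]+)"
    r"(?:\s+Description:(?P<desc>.*?)(?:\s{2,}|$))?"
)


def parse_events(text):
    """Return (time, event_code, description) tuples in log order."""
    events = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            code = int(match.group("code"), 16)
            events.append((match.group("time"), code, match.group("desc")))
    return events


events = parse_events(LOG)
for time, code, desc in events:
    # Entries whose description sits on a following line come back as None.
    print(time, hex(code), desc or "(description on a following line)")
```

Lines that do not start with a timestamp (the Subsystem/Source/Type continuation lines) are simply skipped, so this only gives the sequence of event codes, not the full record.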

After the outage, which lasted about 5 minutes, all servers reconnected, mostly with no problems. One server showed empty folders for all of its shares and had to be rebooted, at which point it was OK too.

Can someone explain what happened, and whether remedial action is needed? The SAN appears to be functioning normally now, with no alerts.

What I think I can guess from the errors is:

\Device\Scsi\fcdmtl3 did not respond in a timely fashion (is this the vault area?)

A CMI rebooted SPA?

Of course as a result of that, SPB could not see SPA

Caches were disabled, etc

SPS A has an error (probably unrelated?)

Things come back up and servers reregister

The errors seem to indicate it still doesn't like SPA, but I don't see any actual evidence of that. There are no trespassed LUNs and SPA status looks fine.

Thanks for any insight you can provide!


November 22nd, 2013 13:00

From this message, it appears that SPA may have rebooted. Hard Peer Bus Error usually means that the SP reporting the error cannot talk to the other SP.

07:44:12 AM     Event Code:0x944     Description:Hard Peer Bus Error

Subsystem:APM00060903193     Device:SP B     SP:SPB     Host:cx300_1_spb

Source:N/A     Category:N/A     Log:Storage Array     Sense Key:0x2     Ext Code1:0xebd77c5c     Ext Code2:0x0

But without the complete SPcollects, it would be difficult to determine why this occurred. Check the version of FLARE running on the array and make sure you are running the most current version; 26.032 is the current version for the CX300i.

glen
