May 14th, 2012 01:00

Event Number 801 | Soft SCSI Bus Error

Hi All,

We are getting frequent errors, as shown below.

We have received around 800 emails so far.

Time Stamp 05/11/12 11:27:20 (GMT) Event Number 801

Severity Warning Host SAPSAN

Storage Array FCNMA091500008 SPA Device Enclosure 2 Disk 7

Description Soft SCSI Bus Error


Time Stamp 05/11/12 11:27:15 (GMT) Event Number 801

Severity Warning Host SAPSAN

Storage Array FCNMA091500008 SPA Device Enclosure 2 Disk 7

Description Soft SCSI Bus Error


Time Stamp 05/11/12 11:27:16 (GMT) Event Number 801

Severity Warning Host OEM-5EVAP15NUU0

Storage Array FCNMA091500008 SPB Device Enclosure 2 Disk 7

Description Soft SCSI Bus Error
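To see whether the alerts cluster on one disk, the notification text can be tallied per SP and device. The following is only a minimal sketch: it assumes the emails have been saved as plain text in the layout shown above, and the names `DEVICE_LINE` and `count_alerts` are hypothetical, not part of any EMC tool.

```python
import re
from collections import Counter

# Matches the "Storage Array <serial> SPA|SPB Device <location>" line of an alert.
DEVICE_LINE = re.compile(
    r"Storage Array\s+(?P<array>\S+)\s+(?P<sp>SP[AB])\s+Device\s+(?P<device>.+)"
)


def count_alerts(text: str) -> Counter:
    """Count alerts per (SP, device) pair found in the saved alert text."""
    counts = Counter()
    for line in text.splitlines():
        match = DEVICE_LINE.search(line)
        if match:
            counts[(match.group("sp"), match.group("device").strip())] += 1
    return counts


# The three alerts quoted in this post, reduced to the lines that matter.
SAMPLE = """\
Time Stamp 05/11/12 11:27:20 (GMT) Event Number 801
Storage Array FCNMA091500008 SPA Device Enclosure 2 Disk 7
Time Stamp 05/11/12 11:27:15 (GMT) Event Number 801
Storage Array FCNMA091500008 SPA Device Enclosure 2 Disk 7
Time Stamp 05/11/12 11:27:16 (GMT) Event Number 801
Storage Array FCNMA091500008 SPB Device Enclosure 2 Disk 7
"""

if __name__ == "__main__":
    for (sp, device), n in count_alerts(SAMPLE).most_common():
        print(sp, device, n)
```

Run against the full set of emails, a tally like this shows at a glance whether every alert names the same enclosure and disk and whether both SPs are reporting it.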

The support team says we should disable event 801 in the templates.

Is that the proper way to handle it?

- Ashish K

433 Posts

May 15th, 2012 04:00

1394 notifications... Please go ahead with a proactive drive replacement. I hope it's still triggering on the same disk. Get the drive replaced.

142 Posts

May 14th, 2012 02:00

Hi Anirudha,

We have already raised a service request for this.

They say this error can be ignored.

We recently upgraded the firmware on the AX4-5 storage array.

Since then, this error has been occurring continuously.

They forwarded an article on disabling the event alerts, but I don't think that is a proper solution.

- Ashish K

May 14th, 2012 02:00

Hi,

If this 801 event has been generated about 800 times, it could really be a concern you should look into.

Raise an SR with EMC support, upload the SP collects, and request a thorough analysis of the logs.

A soft SCSI bus error could be related to a faulty LCC cable or a faulty LCC; it could also be related to a bad disk.

An 801 event means that a SCSI operation failed and needed to be retried. The event indicates that the retry succeeded.

Thanks

Anirudh

433 Posts

May 14th, 2012 03:00

Please do not disable the alert. This drive is about to fail, as the event has triggered more than 800 times. Also, the extension code here is:

0x09, which means the drive reported a hardware error. {Here your alert indicates Proactive Drive Replacement.}

You can wait a while longer until the drive itself fails.

Also, please confirm the latest FLARE code on the array.

142 Posts

May 14th, 2012 03:00

We are getting this error after upgrading FLARE to the version below:

[Screenshot: Error1.png]

Screenshot of the error:

[Screenshot: Error.png]

I don't know the latest firmware for the AX4-5.


142 Posts

May 14th, 2012 03:00

It was procured from Dell, so the SR is logged with Dell.

The following is the reply I got by mail from the support representative.

There is an EMC advisory which states the alerts are benign and can be safely ignored.

We can try to disable the 801 event in the storage template.

Please find steps below.

Product: CLARiiON CX Series

801 Soft SCSI Bus errors are not creating service requests.

Error code: 0x801 Soft SCSI Bus Error

801 Soft SCSI Bus errors are now filtered to prevent generating service requests.

Error Code: 801 0x9: Proactive Drive Replacement is now an 803 error code in FLARE code versions 14 and later.

All 801 errors are now treated as benign and will no longer create a service request.

801 0x9 calls for a proactive drive replacement. This error now dials home as an 803 error code in FLARE code versions 14 and later.

This filter has been implemented to reduce the data unavailability situations that can be caused by unnecessary drive and part replacements. 

Future FLARE code revisions will have the 801 error code unchecked in the dial-home template by default

If you are receiving emails from the array concerning 801 errors, uncheck 801 errors in the dial-home template by following these steps:

1.     Log in to Navisphere.

2.     From the Enterprise Storage Window, select the Monitors tab.

3.     Press Ctrl+Shift+F12 and use the password: XXXXXXX

4.     Expand the Templates folder by clicking the '+' symbol.

5.     Right-click the desired Call Home Template. (The template being used is usually the template labeled with the newest FLARE code revision number unless otherwise specified. Example: Call_Home_Template_6.xx.0  where xx is the FLARE code revision number.)

6.     In the Templates Properties window General tab, click the Advanced... button.

7.     Expand the Warnings and then expand Basic Array Features.

8.     Scroll down and uncheck the 0x0801 Soft SCSI Error check box.
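The handling described in this advisory, combined with the advice in the replies about repeated alerts, can be sketched as a small triage function. This is an illustration only, not EMC's or Dell's actual logic: the `Event` type, the `triage` function, the integer `flare_version` and the `ESCALATE_AFTER` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    code: int       # event number, e.g. 0x801 or 0x803
    extension: int  # extended status code, e.g. 0x6 or 0x9


# Hypothetical threshold: the thread only says that roughly 800 to 1394
# repeats on a single disk is worth escalating, not where the line sits.
ESCALATE_AFTER = 500


def triage(event: Event, flare_version: int, repeat_count: int) -> str:
    """Return 'proactive-replacement', 'escalate' or 'benign'."""
    # Per the advisory, 803 is the code that calls for proactive replacement.
    if event.code == 0x803:
        return "proactive-replacement"
    if event.code == 0x801:
        # Before FLARE 14, 801 with extension 0x9 carried that meaning;
        # from version 14 on it is reported as 803 instead.
        if event.extension == 0x9 and flare_version < 14:
            return "proactive-replacement"
        # All other 801 events are treated as benign by the advisory, but a
        # flood on one drive is still worth raising with support.
        if repeat_count >= ESCALATE_AFTER:
            return "escalate"
        return "benign"
    return "escalate"  # anything else is outside the scope of this sketch
```

For the case in this thread, `triage(Event(0x801, 0x6), 23, 1394)` returns `"escalate"`, where `23` is only a stand-in for the FLARE release and the mapping from bundle numbers to "version 14" is not something the advisory spells out.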

- Ashish K

433 Posts

May 14th, 2012 03:00

If support has analysed the logs and checked the extended code for this error, then it might be safe to ignore. May I have the SR# so that I can take a look as well?

433 Posts

May 14th, 2012 04:00

Yes, you are on the latest one: AX4-5-Series FLARE OE Bundle 02.23.050.5.711

I think you will have to monitor it for the next 24 hours. I am pretty sure the drive will be flagged as Failed, and then you will have to get it replaced.

Thanks

1.4K Posts

May 14th, 2012 10:00

You can ask for a proactive replacement of the disk in this case.

Moreover, disabling CLARAlert for a particular event is for support personnel only.

Disabling alerts for hardware-related errors is not recommended; if you receive multiple notifications, contact Technical Support for assistance.

As Rupal mentioned, you may wait 24 hours and see whether the drive fails. If it does not, you may still go for a disk replacement.

142 Posts

May 14th, 2012 20:00

OK.

We will monitor the behavior of the system until this evening.

@AnkitM: The strange thing is, they also shared the Engineering Mode password...
Oops... I forgot to remove it from the post....

I will update you both on the outcome.

- Ashish K

142 Posts

May 15th, 2012 04:00

@Anirudha :

Primus emc71072 covers the EMC CX and VNX Series. We have an AX Series array, i.e. the AX4-5.

Also, the error has a Sense Key of 0x0, and the Primus has no description for that.

[Screenshot: SCSI Error.png]

@Rupal:

Yes, it's triggering on the same disk.

But Dell Support is saying:

Event ‘801’ doesn’t hold any significance at all.  EMC Primus (emc121692)

@All :

Why is there so much confusion about error 0x801?

- Ashish K

142 Posts

May 15th, 2012 04:00

Hi,

The drive is still working fine. We have received 1394 email notifications for this.

We got the following reply from the support team:

Event ‘801’ is treated as benign (ignorable) by EMC and we have a primus for that which I shared earlier.

Event ‘801’ doesn’t hold any significance at all. You can simply disable it if you don’t want any further event notifications for ‘801’

For proactive replacement of a drive, we look for event code ‘803’ without which we cannot replace the drive proactively.

If this drive reaches a threshold, it should automatically trigger ‘803’. Please let us know.

Should we disable the Event Alert?

Thanks

- Ashish K

May 15th, 2012 04:00

Hi Ashish,

Please see this Primus article for more details on 801 events.

Primus # emc71072

Disabling this event is not recommended. If the event keeps repeating, it could be a back-end loop issue caused by a faulty LCC cable, a faulty LCC, or even a bad disk.

I would suggest escalating this issue to the next level at Dell for further analysis. 1394 email notifications for the same issue could be really serious.

Thanks

Anirudh

1.4K Posts

May 15th, 2012 11:00

This is the extension code for your error, with its description:

0x06 - Command timeout. The SP sent a command to the drive, and the drive did not respond in time.

It is mentioned in the Primus.

Generally, it is considered benign and can be ignored. However, considering the number of notifications you are receiving for one particular drive, we can make an exception and replace the drive.
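For anyone scripting against these alerts, the extension codes quoted in this thread can be kept in a small lookup table. This is a sketch only: it holds just the two codes that replies here describe (0x06 and 0x09), the Primus article presumably lists more, and the names `EXTENSION_CODES` and `describe_extension` are hypothetical.

```python
# Extension codes for event 801 as described in this thread only.
EXTENSION_CODES = {
    0x06: "Command timeout: the SP sent a command to the drive and "
          "the drive did not respond in time.",
    0x09: "The drive reported a hardware error.",
}


def describe_extension(code: int) -> str:
    """Return the description quoted in the thread, or a fallback message."""
    return EXTENSION_CODES.get(code, f"0x{code:02X}: not described in this thread")
```

An unknown code falls through to the fallback string rather than raising an error, so a log scan does not stop at the first code that is not covered here.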

Ashish K wrote:

OK.

We will monitor the behavior of the system until this evening.

@AnkitM: The strange thing is, they also shared the Engineering Mode password...
Oops... I forgot to remove it from the post....

I will update you both on the outcome.

- Ashish K

Sadly, even Google knows that!

433 Posts

May 16th, 2012 06:00

I checked the latest version of the call home templates and concluded that you need a new call home template for your array, as alert 801 is no longer considered in the latest version. Your case would still be treated as an exception, though, as it is flooding you with over 1000 alerts.

You can get the latest template at: EMC Service Partners Web > Software > CLARiiON Service Tools > Call Home Template.

May I know the current status of the alert?

Thanks
