May 5th, 2012 09:00

Interface down alarm re-appearing infinitely!!

We have SMARTS 8.1.2 installed on our Solaris 9 servers. Some interfaces show "Interface Down" alarms that clear after approximately 2-3 polling cycles and then re-appear. Even if a user acknowledges an alarm, it re-appears on the console because it was cleared automatically in the meantime. The automatic clear happens because in some polling cycles SMARTS is unable to get the correct status from the interface, so the status changes from DOWN to UNKNOWN and the Interface Down alarm clears. When SMARTS obtains the correct status again in a later polling cycle, the status changes back to DOWN and the alarm re-appears.

Can somebody help me resolve this issue? Why does an alarm clear when the status changes to UNKNOWN? Shouldn't it clear only when the status of the interface is UP?

August 29th, 2012 17:00

We recently saw this occur for another customer. To a certain extent this can be seen as expected behavior of the application. When we cannot obtain a valid status from the device, the OperState and AdminState transition to UNKNOWN, which has the effect of clearing the alarm. The design intent was that, 99% of the time, a system-level Unresponsive would occur right afterwards and replace the root cause in question.

The problem we observed was that, for semi-random flows, we would not get the response before the timeout. This wasn't systematic, so it didn't trigger the Unresponsive for the system, and generally the device would recover on the next polling cycle, which would re-activate the alarm.
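To make the effect concrete, here is a minimal Python sketch of the behavior described above. It is not SMARTS code; the names `Status`, `alarm_timeline` and `count_notifications` are made up for illustration, and the only rule it assumes is the one stated in this thread: the alarm is active only while the polled status is DOWN, so an UNKNOWN poll clears it.

```python
from enum import Enum


class Status(Enum):
    UP = "UP"
    DOWN = "DOWN"
    UNKNOWN = "UNKNOWN"


def alarm_timeline(polls):
    """Alarm state after each poll: active only while the status is DOWN.

    Assumption taken from the thread: UNKNOWN clears the alarm just like UP.
    """
    return [status is Status.DOWN for status in polls]


def count_notifications(timeline):
    """Count how often the alarm is raised (inactive -> active transitions)."""
    raised = 0
    previous = False
    for active in timeline:
        if active and not previous:
            raised += 1
        previous = active
    return raised


# Hypothetical poll results: the interface is really down the whole time,
# but two polling cycles time out and come back as UNKNOWN.
polls = [
    Status.DOWN, Status.DOWN, Status.UNKNOWN,
    Status.DOWN, Status.DOWN, Status.UNKNOWN,
    Status.DOWN,
]

timeline = alarm_timeline(polls)
print(timeline)
print(count_notifications(timeline))
```

With this made-up sequence the same outage raises the alarm three times, which matches the "clears after 2-3 polling cycles and re-appears" pattern reported in the original post.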

A packet capture during the cycle to the problematic device would help identify the problem. 

As a note, this design isn't new - the behavior should be the same going back 12+ years. 

September 13th, 2012 09:00

I have had something similar.

On our system the devices were Cisco 800 series routers running an old IOS, and the interfaces were ATM sub-interfaces.

Snoop the problem device and see if it sends a NoSuchInstance response to either the admin or oper state request.

We got this response after Smarts did performance polling prior to the admin or oper state request.

This only seems to have appeared after moving to 8.1.2 from V7.

Try disabling performance polling for this interface.
