May 1st, 2011 13:00

NARG alarms coming as root cause.

Hi all,

I have an issue where a Port Down causes the associated NARG to go down, and the NARG Down alarm is raised with IsRoot = Yes. I would expect NARG Down to appear as an impact of Port Down instead.

The Port Down alarm is derived from the model file using the definition below.

interface BT_Port: NetworkAdapter
"Generic Port object, which will be used in the devices which are not allowed to discover."
{
        propagate attribute boolean or BT_System_Status = BT_Switch,PartOf,IsUnresponsive;
        propagate attribute boolean BT_Card_Status = BT_Card,RealizedBy,IsCardNotOperating;
        propagate attribute boolean BT_Parent_Card_Status = BT_Card,RealizedBy,BT_Parent_Card_Status;
        computed attribute boolean BT_Port_Status = !BT_System_Status && !BT_Card_Status && IsNetworkAdapterNotOperating && !BT_Parent_Card_Status;
        event BT_PortMightBeDown "Based on system, card and port status" = BT_Port_Status;
        refine problem Down "Port is Down" => BT_PortMightBeDown, BT_PortMightBeDown explains;
        export Down;
        void insertPortValues(in string mtosi, in string btvendor, in string btmodel, in string location, in string resState, in string desc, in string displayname, in string deviceid)
                definition:
                BT_MTOSI = ((mtosi == "") ? BT_MTOSI : mtosi) , BT_Vendor = ((btvendor == "") ? BT_Vendor : btvendor) , BT_Model = ((btmodel == "") ? BT_Model : btmodel) , BT_Location = ((location == "") ? BT_Location : location),DisplayName = ((displayname == "") ? DisplayName : displayname),Description = ((desc == "") ? Description : desc),DeviceID = ((deviceid == "") ? DeviceID : deviceid),BT_Resource_Status = ((resState == "") ? BT_Resource_Status : resState);

}

Below is the ASL script I ran, followed by its output.

START {

} do {

   print("The problem ".class."::".instance."::".event);

   print("explainedBy    ".getExplainedBy(class,instance,event))?LOG,IGNORE;

   print("explains       ".getExplains(class,instance,event))?LOG,IGNORE;

   print("causes         ".getCauses(class,instance,event))?LOG,IGNORE;

   print("causedBy       ".getClosure(class,instance,event))?LOG,IGNORE;

   stop();

}

./bin/sm_adapter --server=N-ND-APM --broker=<> -Dclass=BT_Port -Dinstance=PORT-3035/pdh1/6 -Devent=Down getExplains.asl | tee

The problem BT_Port::PORT-3035/pdh1/6::Down

explainedBy    {  }

explains       { BT_Port::PORT-3035/pdh1/6::BT_PortMightBeDown }

causes         {  }

causedBy       { BT_Port::PORT-3035/pdh1/6::BT_PortMightBeDown }

[root@linbgl150 smarts]#

Can someone please tell me what is missing in the model file? NARG correlation has stopped working since this change.

Thanks in advance.

May 17th, 2011 04:00

I wanted to post an email discussion with Vishal (the original poster) that finally resolved the issue. Hopefully this will benefit others. Note that Vishal took option 1 from the section below.

Vishal,

    This seems kind of silly, but the Dynamic Model you have produced should inherently be part of the codebook as it stands today.  However, there are two alternatives I can think of.

1. Make sure you include the pre-existing NetworkAdapterImpact symptom in your refined root cause.

Your current MODEL will also activate the BT_Port::Down in cases where Unstable or Disabled would have been previously notified.  If that is your intention, then proceed.

NOTE: Dynamic Model doesn't like the notion of including symptoms into the causality that are not defined in the current scope of your MODEL, so some editing of the LDM may be required.  What you really want to say is:

        refine problem Down "Port is Down" => BT_PortMightBeDown, BT_PortMightBeDown explains, NetworkAdapterImpact explains;
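
To illustrate why the extra "explains" clause matters, here is a toy Python sketch of the general codebook idea. It is not how the Smarts correlation engine is implemented. The function name, the dictionary layout, and the assumption that the NARG alarm hangs off the NetworkAdapterImpact symptom are all made up for illustration. The point is only that a symptom which no active problem lists in its signature is left unexplained, so whatever depends on it is reported on its own.

```python
def unexplained(observed, active_problems, codebook):
    """Return observed symptoms that no active problem claims to explain."""
    explained = set()
    for problem in active_problems:
        explained |= codebook.get(problem, set())
    return sorted(set(observed) - explained)


# Hypothetical signatures for the refined problem, before and after option 1.
before = {"BT_Port::Down": {"BT_PortMightBeDown"}}
after = {"BT_Port::Down": {"BT_PortMightBeDown", "NetworkAdapterImpact"}}

observed = ["BT_PortMightBeDown", "NetworkAdapterImpact"]
active = ["BT_Port::Down"]

print(unexplained(observed, active, before))  # ['NetworkAdapterImpact']
print(unexplained(observed, active, after))   # []
```

With the "before" signature the impact symptom is orphaned, which loosely matches the empty "causes" output in the original post; with the "after" signature it is covered by the port problem.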

2. Adjust the existing value of AprioriProbability_Down to include the change to computation.  The existing definition (to prevent overlap of Down/Unstable/Disabled) is:

    computed attribute float AprioriProbability_Down
        =  (((Status == DOWN ||
              Status == NOTPRESENT ||
             (Status == TESTING && !SuppressTestingNotifications)) &&
            !IsFlapping)
          ? 0.0001 : 0.0)
          else 0.0;

    Your new value would be (again assuming you are collapsing Down/Disabled/Unstable):

    refine computed AprioriProbability_Down
      = (BT_Port_Status ? 0.0001 : 0.0) else 0.0;

    Remove the event and problem declarations and it should work just fine.
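
For anyone comparing the two definitions, the boolean logic can be sketched in Python as below. This is only a restatement of the expressions quoted in this thread, not Smarts code: the function and parameter names are hypothetical, statuses are modelled as plain strings, and the trailing "else 0.0" fallback of the MODEL expression is not modelled.

```python
APRIORI = 0.0001  # value used in both MODEL expressions above


def bt_port_status(system_status, card_status, adapter_not_operating, parent_card_status):
    """Mirror of the BT_Port_Status expression from the original post."""
    return (not system_status and not card_status
            and adapter_not_operating and not parent_card_status)


def apriori_down_existing(status, suppress_testing_notifications, is_flapping):
    """Existing rule: Down-like status and not flapping."""
    down_like = (status in ("DOWN", "NOTPRESENT")
                 or (status == "TESTING" and not suppress_testing_notifications))
    return APRIORI if (down_like and not is_flapping) else 0.0


def apriori_down_refined(port_status):
    """Refined rule from option 2: driven only by the computed port status."""
    return APRIORI if port_status else 0.0


# A port whose adapter is not operating while system and cards are healthy.
status = bt_port_status(False, False, True, False)
print(status, apriori_down_refined(status))
```

Note that the refined version no longer looks at IsFlapping or the TESTING case, which is the "collapsing Down/Disabled/Unstable" behaviour mentioned above.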

Regards,

Bill
