March 5th, 2010 03:00

Leveraging Ionix IT Operations Intelligence to make your Private Cloud more efficient using APIs: EMC World Developer Track 2010 - Boston 10-13 May

Understand how to expand the Ionix IT Operations Intelligence suite to automate IT

  • Extend the capabilities of the suite through Dynamic Modeling and discovery
  • Calculate the business impact of IT problems


March 12th, 2010 05:00

To hopefully generate some additional interest in this topic, let me present one of the use cases I saw on the forums and tried to answer (with a practical example).  This will touch on some of the background you will need to perform some of these complicated tasks, as well as to debug existing ones.  Specifically, I wanted to touch on the ability to link (abstract) messages into our root-cause analysis (RCA).

There was a prior discussion here on this same topic:

https://community.emc.com/docs/DOC-1212

but I really wanted to provide at least one practical way to solve it.  There are three attachments to this message that together form a solution for taking Alcatel-SAM generated messages and tying them into the RCA.

What those files do is create a new NotificationList with an appropriate filter, an internal Notification List subscriber to receive and process messages as they come into the SMART Adapter Platform, and finally the tiny bit of ASL to create the aggregate notification itself.

As you can see, there are probably fewer than 200 lines across all of the files (most of which is formatting).  That is very little code to perform a fairly complicated task.  This exemplifies what we hope for most in our platform: the ability for almost trivial amounts of code to perform something really interesting.  For me, the best part of this example is that it doesn't need to know what the final RCA is.  It simply ties into an existing symptom and lets the RCA look for an appropriate explanation.

For the discussion next time, think about the following questions:

*  How did I determine what symptom I would want to tie my RCA to?

*  What is the difference between a problem *explaining* a symptom vs. *causing* a symptom?

With any good software toolkit, there is always more than one way to solve a particular problem.  For those of you interested in more experimentation, all of what I outlined can also be done inside Notification Manager.  Take a look at the documentation on Powerlink:

http://powerlink.emc.com/km/live1/en_US/Offering_Technical/Technical_Documentation/300-009-537.pdf

Best of all, Notification Manager is practically free for those people with SAM and/or the SMART Adapter Platform.

3 Attachments


April 12th, 2010 05:00

Continuing on from the last discussion, there are special semantics in the MODEL language that control the difference between a problem explaining a symptom and a problem causing a symptom.

In general, the explanation set is a slightly wider net of symptoms: while they don't add any detail to the computation of the problem, they can be attributed to the problem when it occurs.  This distinction is particularly useful in two scenarios:

*  Business Impacts - because they are implied rather than measured, they should not be part of the RCA, but can be intuited and explained if a problem exists.

*  Dynamic Topology Environments

I am going to spend the majority of this post covering the second one as it gives developers a powerful tool to avoid frequent codebook recomputation in dynamic environments.  There are three points to remember as we go through this discussion:

1. SAM doesn't treat events any differently than problems when they arrive.  An event without an RCA is effectively an RCA in its own right in SAM.

2. SAM uses the explanation tree (getExplains/getExplainedBy) to populate the Causes/CausedBy relationship between notifications.

3. Explanations are topology based rather than codebook based.

Consider the following simplified MODEL:

interface PhysicalTrunk : Cable {

    problem Down
       "This trunk is not functioning properly"
        -> DownSymptom,
            DownImpact
              explains;


    // Local symptoms
    symptom DownSymptom
        "The symptoms caused when the physical trunk is not functioning "
        "properly"
        = OperationallyDownSymptom;

    symptom DownImpact
        "The impact caused when the physical trunk is not functioning "
        "properly"
        =  OperationallyDownSymptom
              explains,
          ConnectedVirtualCircuitDownImpact
             explains;

    propagate symptom ConnectedVirtualCircuitDownImpact
        "The impact caused on the Virtual Circuits running through "
        "this logical trunk."
        = PVC, Underlying, DownImpact;

}

interface PVC : ICIM_VirtualCircuit
        "A specialization of ICIM_VirtualCircuit that represents a permanent "
        "circuit implemented over a Wide-area network"
{

    event Down

        = OperStatus == ICIM_NetworkAdapter::DOWN;

    attribute ICIM_NetworkAdapter::icim_adapter_status_e OperStatus
        "Operational status of the port: UP, DOWN or UNKNOWN"
        = ICIM_NetworkAdapter::UNKNOWN;

    symptom DownImpact
        = Down
            explains;

}

There are other pieces to the puzzle here, but I wanted to highlight the fact that the PVC::Down event is NOT part of the RCA for the PhysicalTrunk, but is part of the explanation.  The value here is that when PhysicalTrunk::Down is reported, the getExplains() call made by SAM will traverse the Underlying relationship and provide a list of all of the PVC::Down event signatures back to SAM.  SAM will create the Causes/CausedBy relationship between the PhysicalTrunk::Down and the PVC::Down events if those events both exist in SAM.

As a result, I can change the layering of the PhysicalTrunk and the PVC in real time and my explanation in SAM stays correct, without once touching the codebook.
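To make the idea concrete outside of MODEL, here is a toy Python sketch of the behavior described above.  It is not the SAM or MODEL API: the names Topology, get_explains and link_causes are hypothetical stand-ins, and the only assumption is what the post states, namely that the explanation is computed by walking the current Underlying relationship and that SAM links only events that actually exist.

```python
from dataclasses import dataclass, field


@dataclass
class Topology:
    """Hypothetical stand-in for the Underlying relationship (trunk -> PVCs)."""
    underlying: dict = field(default_factory=dict)

    def layer(self, trunk: str, pvc: str) -> None:
        # A PVC runs over one trunk at a time in this toy model, so
        # re-layering first removes it from wherever it was before.
        for pvcs in self.underlying.values():
            pvcs.discard(pvc)
        self.underlying.setdefault(trunk, set()).add(pvc)


def get_explains(topology: Topology, trunk: str) -> set:
    """Event signatures explained by PhysicalTrunk::Down on `trunk`.

    Computed from the topology at call time; nothing is precomputed.
    """
    return {("PVC", pvc, "Down") for pvc in topology.underlying.get(trunk, set())}


def link_causes(topology: Topology, active: set, trunk: str) -> set:
    """Events that would be linked as caused by the trunk problem.

    A link is made only when both the problem and the explained event
    are currently active notifications.
    """
    problem = ("PhysicalTrunk", trunk, "Down")
    if problem not in active:
        return set()
    return get_explains(topology, trunk) & active


topo = Topology()
topo.layer("trunk-1", "pvc-a")
topo.layer("trunk-1", "pvc-b")
topo.layer("trunk-2", "pvc-c")

active = {
    ("PhysicalTrunk", "trunk-1", "Down"),
    ("PVC", "pvc-a", "Down"),
    ("PVC", "pvc-c", "Down"),
}

before = link_causes(topo, active, "trunk-1")

# Re-layer pvc-c onto trunk-1; only the topology changes.
topo.layer("trunk-1", "pvc-c")
after = link_causes(topo, active, "trunk-1")
```

In this sketch, `before` contains only the pvc-a event, while `after` also picks up pvc-c once it has been re-layered.  pvc-b never appears because no Down event exists for it.  The set of linked events follows the topology, and nothing resembling a codebook is rebuilt along the way.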

Next time we will talk about how you can provide multiple thresholds with a bit of Dynamic MODEL.  As always, feel free to suggest topics to cover.


May 20th, 2010 13:00

Thanks for the informative session -- popular topic, even on the last day of EMC World...  Presentation material attached below...

Kuhhirte_Ionix.jpg

1 Attachment
