June 26th, 2014 06:00

Predictive and Self-healing Data Center Management

Existing data center monitoring and reporting tools can raise an alert only after a problem has already occurred, and Root Cause Analysis (RCA) begins only once the issue has been raised. The result is reactive rather than proactive troubleshooting and customer service.


Examples of typical problems include:
•Sudden IT service outage
•Slowing device performance
•Disk space shortage
•Reactive customer service
•Extensive security breaches
•Downtime of mission-critical applications

Utilization and event details for network, storage, and virtualized IT resources can be obtained from multiple sources. This data can be used to generate reports on observed events, which describe the problems that occur in a production setup.


Taking this data a step further makes it possible to predict problems from the observed events and statistical parameters, analyze their impact on the data center, and alert users before a user or service is affected. With this model, a problem can be identified in advance by applying statistical analysis to network and storage parameters, and the prediction can then be used to take corrective action before service is affected.
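As a rough illustration of what "statistical analysis on storage parameters" could look like, the sketch below fits a least-squares trend line to disk-usage samples and estimates how long until the disk fills. The post does not say which statistical method the authors use, so the linear trend, the function names `hours_until_full` and `should_alert`, and the 24-hour lead time are all assumptions made for this example.

```python
from typing import Optional, Sequence


def hours_until_full(usage_pct: Sequence[float],
                     interval_hours: float = 1.0,
                     capacity_pct: float = 100.0) -> Optional[float]:
    """Estimate hours until usage reaches capacity (hypothetical helper).

    Fits an ordinary least-squares line to equally spaced samples and
    extrapolates from the most recent sample using the fitted slope.
    Returns None when there are too few samples or usage is not growing.
    """
    n = len(usage_pct)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(usage_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage_pct))
    slope = sxy / sxx  # percentage points gained per sample interval
    if slope <= 0:
        return None
    remaining = max(capacity_pct - usage_pct[-1], 0.0)
    return remaining / slope * interval_hours


def should_alert(usage_pct: Sequence[float],
                 lead_time_hours: float = 24.0,
                 interval_hours: float = 1.0) -> bool:
    """Alert when the projected time to full is within the lead time (assumed policy)."""
    eta = hours_until_full(usage_pct, interval_hours)
    return eta is not None and eta <= lead_time_hours


if __name__ == "__main__":
    samples = [70.0, 72.0, 74.0, 76.0, 78.0]  # made-up hourly readings, in percent
    print(hours_until_full(samples), should_alert(samples))
```

A production system would likely need something more robust than a straight line, such as seasonality handling or confidence intervals, but the same shape applies: forecast from history, compare against a lead time, and trigger the corrective action before the threshold is crossed.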


In this Knowledge Sharing article, Gnanasoundari Soundarajan, Vijay Gadwal, and Radhakrishna Vijayan describe a solution to achieve high availability in the data center through a predictive and self-healing technique.


Benefits of adopting this approach include:

•Avoiding critical unexpected issues rather than performing RCA after the issue arises.
•Predicting problems and providing proactive solutions rather than reactive fixes.
•Taking quick corrective action based on the prediction.
•Using historical trending data to predict patterns.
•Reduced yearly downtime cost and data center maintenance cost.
•Increased reliability of data center components through early prediction and correction of issues.
•Better preparation of networks, storage, and virtual infrastructure inside a data center to handle a flood of events, latency issues, and breakdowns.
•Higher customer satisfaction, which drives increased product revenues.
•Improved availability of business-critical applications.

Read the full article.
