Every business is challenged by its IT infrastructure landscape, which is extraordinarily complex and heterogeneous. Managing a growing number of data centers and edge systems can be a nightmare. It’s not easy to detect unhealthy systems in a timely manner when there are dozens, hundreds, or even thousands of them. Troubleshooting to identify the source of a problem and then finding the right advice to fix it is even more difficult. As a result, ITOps, DevOps, and the business all suffer.
There is a technical solution: AIOps software.
Using telemetry data and patented machine learning algorithms, our CloudIQ AIOps software proactively monitors infrastructure health issues, identifies probable causes, provides help articles and recommendations for remediation, and has APIs for triggering automated actions.
There are three main reasons why IT teams use CloudIQ to monitor and analyze their infrastructure. This blog focuses on the first.
- It proactively monitors system health and provides recommendations, so you can reduce risk
- It makes predictions, so you can plan ahead
- It saves you time, so you can be more productive
Technical Solution: Proactive Health Monitoring and Recommendations
CloudIQ provides a proactive health score for all Dell infrastructure systems in one consolidated view. The view can be filtered by system type (e.g., server, storage array) and sorted by health score, business unit, location, and other attributes to help you prioritize actions.
The health score for each system is based on the issues impacting it, including component, configuration, capacity, performance, and data protection issues. Each issue has an intelligently weighted impact value that is specific to the system type and model. The health score view also shows health changes over time as well as active issues, their descriptions, and recommendations for resolution.
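To illustrate the filter-and-sort idea, here is a minimal Python sketch. It is not the CloudIQ API: the `SystemHealth` record, its fields, the `prioritize` function, and the sample data are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SystemHealth:
    """Hypothetical record for one monitored system (not a CloudIQ type)."""
    name: str
    system_type: str
    location: str
    health_score: int


def prioritize(systems: List[SystemHealth],
               system_type: Optional[str] = None) -> List[SystemHealth]:
    """Optionally filter by system type, then list the lowest scores first."""
    selected = [s for s in systems
                if system_type is None or s.system_type == system_type]
    return sorted(selected, key=lambda s: s.health_score)


# Made-up sample data, for illustration only.
systems = [
    SystemHealth("array-01", "storage array", "Austin", 95),
    SystemHealth("server-07", "server", "Austin", 60),
    SystemHealth("array-02", "storage array", "Cork", 70),
    SystemHealth("server-03", "server", "Cork", 100),
]

for system in prioritize(systems, system_type="storage array"):
    print(system.name, system.health_score)
```

Sorting in ascending order puts the systems that most need attention at the top of the list, which is the point of a prioritized view.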
Three main factors distinguish CloudIQ health monitoring:
- Telemetry data collected at frequent intervals is continuously monitored, and a health score engine driven by patented algorithms uses it to calculate the health score. In addition, the engine monitors the predictive output of some of the analytical capabilities, which can trigger a new health score and a health notification.
- It archives historical health score changes for up to 24 months so you can identify recurring problems. For each health change event, it displays a description of the issue and a recommendation for resolution.
- Health scores are available in a consolidated user interface and common format for the entire infrastructure stack across a broad spectrum of technologies, including data storage, networking, server, data protection and hyperconverged infrastructure, as well as data storage as-a-service and data protection in public clouds.
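As a rough illustration of how archived health changes can help surface recurring problems, the following Python sketch counts repeated issues per system. The event layout (`system`, `issue`, `date`), the `recurring_issues` helper, and the sample events are assumptions made for this example. They do not represent CloudIQ's data model or API.

```python
from collections import Counter
from datetime import date


def recurring_issues(events, min_occurrences=2):
    """Return (system, issue) pairs seen at least min_occurrences times."""
    counts = Counter((e["system"], e["issue"]) for e in events)
    return {key: n for key, n in counts.items() if n >= min_occurrences}


# Made-up health change history, for illustration only.
events = [
    {"system": "array-02", "issue": "capacity", "date": date(2023, 1, 10)},
    {"system": "array-02", "issue": "capacity", "date": date(2023, 4, 2)},
    {"system": "server-07", "issue": "component", "date": date(2023, 2, 15)},
    {"system": "array-02", "issue": "capacity", "date": date(2023, 7, 21)},
]

for (system, issue), n in recurring_issues(events).items():
    print(f"{system}: '{issue}' issue recurred {n} times")
```

A longer history window makes slow-repeating patterns like this easier to see, because a single occurrence looks the same as a one-off problem.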
Health score categories include Poor, Fair, and Good ratings. Not all score impacts are equal, and the health score engine recognizes that by only subtracting…