A key differentiator of Moogsoft is Situations: both the technology used to generate them and the ground-breaking collaborative workflow built around them.
However, some folks struggle to make the break from alert-based management to situational management. This is understandable in many cases, especially where vast amounts of time and technical expertise have been invested in fine-tuning alert-based systems.
A fairly common question I get asked, especially in mature environments, is: “What happens if our alerts cannot be clustered? How can you reduce our workload then?”
To tackle that question, you first have to understand where it’s coming from.
That question usually stems from the assumption that the environment is so well tuned that only “actionable” alerts are being presented. Therefore, there is no opportunity to reduce the workload.
Firstly, it would be remiss not to address the issue of overly aggressive filtering. If it’s indeed true that only truly actionable alerts are being allowed through, then we at Moogsoft would contend that you are over-filtering. As a consequence, you are missing critical precursors to problems, as well as data that would assist in problem resolution, which negatively impacts Mean Time to Detect and Mean Time to Resolution, respectively.
This new concept of MORE alerts, but in FEWER situations, can feel a little counter-intuitive at first.
The next question is whether alerts can be aggregated into situations at all. Moogsoft’s uniquely powerful approach to situation creation, which uses multiple real-time algorithmic techniques to cluster the alert flow, is radically changing the way people manage vast, rapidly changing virtualized environments. But there remains the question: In a more static (perhaps legacy) environment, if only actionable alerts are being processed, is there a case for clustering?
4 Reasons for Managing Situations Over ‘Actionable’ Alerts
Firstly, there’s the power of small numbers, or “small data” if you will. While we all get excited when we see hundreds of alerts clustered into a situation, does that mean a small cluster size isn’t valuable? No! If you simply group two related alerts together, that’s a volume reduction of 50%. And when you have customers cost-justifying their investment in Moog with a 23% reduction, 50% is great!
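To make the arithmetic concrete, here is a minimal sketch of how that reduction could be calculated. The function name `volume_reduction` is made up for illustration and is not part of any Moogsoft product; it simply compares the number of alerts with the number of situations an operator ends up handling.

```python
def volume_reduction(alert_count: int, situation_count: int) -> float:
    """Fraction of work items removed when alerts are handled as situations.

    Hypothetical helper for illustration only, not a Moogsoft API.
    """
    if alert_count <= 0:
        raise ValueError("alert_count must be positive")
    return 1 - situation_count / alert_count


# Two related alerts grouped into a single situation.
print(f"{volume_reduction(2, 1):.0%}")
```

With two alerts and one situation this works out to the 50% mentioned above; larger clusters only push the figure higher.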
Secondly, just because alerts are actionable, it doesn’t mean they can’t be clustered into a situation. An example I saw was a situation containing 12 e-mail notifications saying ATMs were down, each requiring acknowledgment and timely confirmation that remedial action was being taken. But are they really 12 discrete operator actions? Or can all 12 be dealt with simultaneously?
Thirdly, even in the unlikely case that these alerts really are unrelated, perhaps they can still belong in the same situation. I know, it sounds unlikely, but take the case of multiple servers requiring a re-boot. The servers are unrelated, on different segments, performing different services, but they are all related in terms of workflow: they require a re-boot. So you put them in the same situation, and issue a single command to re-start all servers in that situation.
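As a rough illustration of grouping by workflow rather than by cause, the sketch below buckets alerts by the action they require and then walks the "reboot" bucket once. Everything here is an assumption made for the example: the alert fields (`host`, `segment`, `action`), the host names, and the `group_by_action` helper are hypothetical, and this is plain dictionary grouping, not Moogsoft's clustering algorithms or API.

```python
from collections import defaultdict

# Hypothetical alerts: unrelated servers, segments, and services.
alerts = [
    {"host": "web-01", "segment": "dmz", "action": "reboot"},
    {"host": "db-07", "segment": "core", "action": "reboot"},
    {"host": "app-12", "segment": "core", "action": "clear-disk"},
    {"host": "mail-03", "segment": "office", "action": "reboot"},
]


def group_by_action(alerts):
    """Group alert hosts by the remedial action they need (illustrative only)."""
    situations = defaultdict(list)
    for alert in alerts:
        situations[alert["action"]].append(alert["host"])
    return dict(situations)


situations = group_by_action(alerts)

# One operator step covers every server in the "reboot" situation.
for host in situations["reboot"]:
    print(f"would restart {host}")
```

Three of the four alerts collapse into a single piece of work even though the servers themselves have nothing in common.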
And finally, are you REALLY sure these alerts can’t be clustered? You see, that’s one of the characteristics of Unsupervised Machine Learning: it can tell you things you didn’t know.
Get started today with a free trial of Incident.MOOG—a next generation approach to IT Operations and Event Management. Driven by real-time data science, Incident.MOOG helps IT Operations and Development teams detect anomalies across your production stack of applications, infrastructure and monitoring tools all under a single pane of glass.