AIOps and Smart Alerting
Will Cappelli | December 5, 2019

Smart Alerting is not enough. Effective deployment of AIOps requires an independent platform capable of interacting with all technologies along the path from signal to response.

Within the context of AIOps, artificial intelligence can be applied to logs, metrics, and event records. In all contexts, AI delivers insight. The appropriate response to this insight is typically remedial action, whether that action is carried out by a team of human agents, robots, or some combination of the two. The path from signal to response typically involves IT monitoring systems, databases, analytical platforms, help desks, and smart alerting systems, among other tools.

Along the way from signal to response, there are many opportunities for the application of intelligence.

Over the last two years, the vendors that supply each of these technology components have been adding AI functionality. Not surprisingly, each defines AIOps in a way that places its own technology at the center.

We at Moogsoft believe, however, that each of these technology-specific approaches to AIOps is fundamentally mistaken.

The effective deployment of AI in IT Operations instead requires an independent platform capable of interacting with all of these technologies along the path from signal to response. Why? Well, the main problem is that each technology component adds latency, and much of this latency results from the simple fact that human users must intervene and interpret each tool’s results.

Let’s examine the limitations of smart alerting systems, as an example. But first, it’s important to understand how all the dimensions of AIOps fit together.

The Latency of Moving from Insight to Decision

Human intervention itself takes two forms. First, insights based on input must be derived. Second, decisions based on these insights must be made. The process of going from input to insights takes time, as does the process from insights to decision. AI, by automating both of these processes, promises to significantly reduce decision-making time.

If one were to automate the processes of insight and decision-making at each of the technology points along the way, some latency reduction could be achieved. But there would be much duplication of effort, and maybe even some inconsistencies introduced. These would ultimately need to be resolved. However, by treating the path from signal to response as an integrated whole, and by automating the end-to-end processes of insight and decision, a much more radical and far-reaching reduction in latency can be achieved.

Warping the Five AIOps Functions

Artificial intelligence itself comprises five primary functions:

  1. Selection of data from a tsunami of incoming signals
  2. Discovery of patterns in the selected data
  3. Inference from those patterns
  4. Communication of the results of these inferences
  5. Automated execution of remedial responses
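
As a rough illustration of how the five functions chain together, here is a minimal Python sketch. It is not Moogsoft's implementation: the `Signal` fields, the severity scale, the thresholds, and the "restart-service" action are all hypothetical, chosen only to show one stage feeding the next.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Signal:
    source: str    # hypothetical: the host or service that emitted the signal
    severity: int  # hypothetical scale: 0 (informational) to 5 (critical)
    message: str


def select(signals, min_severity=3):
    """1. Data selection: keep only signals at or above a severity cutoff."""
    return [s for s in signals if s.severity >= min_severity]


def discover_patterns(signals):
    """2. Pattern discovery: count the selected signals per source."""
    return Counter(s.source for s in signals)


def infer(patterns, threshold=2):
    """3. Inference: a source with repeated signals is a likely incident."""
    return sorted(src for src, count in patterns.items() if count >= threshold)


def communicate(incidents):
    """4. Communication: turn each inferred incident into a notification."""
    return [f"Possible incident on {src}" for src in incidents]


def remediate(incidents):
    """5. Remediation: map each incident to an illustrative automated action."""
    return {src: "restart-service" for src in incidents}


def run_pipeline(signals):
    """Run all five functions end to end, with no human hand-off in between."""
    incidents = infer(discover_patterns(select(signals)))
    return communicate(incidents), remediate(incidents)
```

Calling `run_pipeline` on a list of `Signal` objects returns the notifications and the proposed actions together. The point of the sketch is that each function consumes the previous one's output directly, which is where the latency saving described earlier would come from.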

One of the major ways that technology-specific AIOps limits its value is by emphasizing some of these dimensions at the expense of others, depending on the vendor's domain.

For example, vendors of monitoring systems tend to emphasize data selection and pattern discovery at the expense of inference, communications, and remediation. On the other hand, database vendors focus almost exclusively on pattern discovery and inference. Analytics vendors tend to focus on inference and some aspects of communication. Help desk vendors stress remediation.

Another way that technology-specific AIOps tends to limit or distort value is through the shared tendency to overstate the significance of its domain at the expense of other domain-specific technology solutions.

The Limitations of Smart Alerting

Now let’s take a deeper look at how vendors of smart alerting systems tend to deploy AIOps.

Smart alerting system vendors largely focus only on the fourth dimension of AIOps: communication. Historically, the signals or messages originated from outside of the platform. In general, smart alerting systems have been built of two basic components:

  1. A mechanism for communicating messages or signals to the appropriate recipients
  2. A mechanism for authoring and enforcing rules that guide the delivery of messages or signals of specified types to the appropriate recipients
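
The two components above can be sketched in a few lines of Python. This is an illustrative toy, not any vendor's product: the `Alert` fields, the `AlertRouter` class, the first-match-wins rule order, and the in-memory `outbox` standing in for real delivery are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Alert:
    kind: str      # hypothetical category, e.g. "network" or "database"
    severity: int  # hypothetical scale: 0 (informational) to 5 (critical)
    text: str


@dataclass
class Rule:
    matches: Callable[[Alert], bool]  # predicate over an alert's properties
    recipients: List[str]


class AlertRouter:
    def __init__(self, default_recipients):
        self.rules = []
        self.default_recipients = list(default_recipients)
        self.outbox = []  # (recipient, text) pairs stand in for real delivery

    def add_rule(self, matches, recipients):
        """Component 2: author a rule that guides delivery."""
        self.rules.append(Rule(matches, list(recipients)))

    def route(self, alert):
        """Enforce the rules: the first matching rule wins, else the default."""
        for rule in self.rules:
            if rule.matches(alert):
                return rule.recipients
        return self.default_recipients

    def deliver(self, alert):
        """Component 1: communicate the message to the chosen recipients."""
        recipients = self.route(alert)
        for recipient in recipients:
            self.outbox.append((recipient, alert.text))
        return recipients
```

Here the rules are static and hand-written. The AI-based extensions described next amount to replacing `route` with something learned from the alert's data properties and from past outcomes.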

In the middle of the last decade, smart alerting vendors began to use communication-based AI algorithms in a couple of ways. First, they sought to determine the appropriate recipient, method, and path of delivery of a signal or message based purely on its data properties. Second, they sought to improve this determination of recipient, path, and method over time.

Since 2016 or so, interest in the application of AI to the entire range of IT Operations functions has grown significantly. Smart alerting vendors naturally have attempted to extend the scope of their own ‘smart’ capabilities. These product extensions have taken one of two paths.

Some attempted to push their data ingestion closer to the source, in effect bypassing monitoring systems, databases, and help desks to get system and network data directly into the platform. Others have attempted to expand the scope of their own AI capabilities to include some modicum of data selection, pattern discovery, and inference. This is often available as an add-on to their core communications functions.

Each form of smart alerting product extension has its problems. With regard to pushing data ingestion closer to the source, the vendors' own algorithms end up working with a highly uninformative and very noisy data set, because most smart alerting vendors are not willing to completely recreate the domain-specific knowledge that informs any given application, infrastructure, network, or storage monitoring platform.

With regard to extending into other algorithmic functions, a smart alerting vendor could choose to recreate algorithms addressing any of the four remaining AI dimensions. Or it could seek to build out new algorithms from its existing stock. Of course, the problem here is that the optimization of recipient choice and path selection is very different from the selection of significant data sets, pattern discovery, and inference.

Furthermore, smart alerting platforms tend to be centralized. This is especially problematic now that the emergence of modular, distributed, dynamic, and ephemeral architectures is dictating that such algorithms be applied close to the point of data generation.

Why an AIOps Platform Approach Is Best

Moogsoft argues that the most insightful decision-making results from the balanced choreography of all five dimensions of AIOps. It’s important to keep in mind the many years of development and significant amount of IP protection that stand behind our patented, original algorithms.

An AIOps platform vendor like Moogsoft instead exploits the work already done by the monitoring vendors. This is not only a question of technology. With good reason, enterprises tend to shy away from a “rip & replace” approach to new functionality deployment. Instead they try to take advantage of existing technologies, both to minimize the costs of disruption and to avoid offsetting the advantages of a new functionality by simultaneously deploying an immature version of an older solution.

These considerations suggest a more reasonable approach. Why not deploy an AIOps platform to act as a bridge between monitoring systems and smart alerting platforms?

This would allow the enterprise looking to deploy AIOps to take full advantage of the richness of data coming in through monitoring systems, with their domain knowledge-based data selection. At the same time, this would ensure that data gets appropriately analyzed and delivered to the smart alerting system.

Enterprises have always aspired to integrate the various tasks and technologies that together constitute IT Operations. The digital transformation of business has made this integration a necessity.

AI is indeed a means of making that integration a reality. But only if AI is deployed in an even-handed way across all technology silos. Any deployment overly centered on a specific domain threatens to reinforce the fragmentation of IT Operations, rather than mend it.

Moogsoft is a pioneer and leading provider of AIOps solutions that help IT teams work faster and smarter. With patented AI analyzing billions of events daily across the world’s most complex IT environments, the Moogsoft AIOps Platform helps the world’s top enterprises avoid outages, automate service assurance, and accelerate digital transformation initiatives.

About the author

Will Cappelli

Will studied math and philosophy at university, has been involved in the IT industry for over 30 years, and for most of his professional life has focused on both AI and IT operations management technology and practices. As an analyst at Gartner he is widely credited with having been the first to define the AIOps market and has recently joined Moogsoft as CTO, EMEA and VP of Product Strategy. In his spare time, he dabbles in ancient languages.
