AIOps: Time to Sit Up, Observe and Listen
Helen Beal | September 8, 2021

GigaOm’s latest Radar for AIOps solutions has just been released and it makes for compelling reading for anyone trying to maximize organizational performance in our digital world. Particularly if you’re down with DevOps.


Like Gartner, which has defined categories for domain-specific and domain-agnostic solutions in the AIOps space, GigaOm recognizes that there are two entry points for vendors in this market. They are either monitoring providers (generally APM) bolting AI onto their solutions, or pure-play AI specialists. The report describes the latter group as “a cohort of startups that have delivered purpose-built AIOps solutions.” Moogsoft is in this group, as it has a purposefully designed AIOps solution with advanced data science at its core.

Am I getting hung up on this, and does it really matter? I think it does when we consider this: “new upstarts have been investing in more modern functions.” I don’t think it’s right to describe Moogsoft as an “upstart” or a “startup.” It could be off-putting to prospective clients to put Moogsoft in a category that, while maybe “cool,” also suggests a company that is just finding its feet, highly experimental, and not properly mature. With a ten-year pedigree, an army of AI and data scientists, and an enviable customer list including Fannie Mae, FiServ, GoDaddy, Key Bank, SAP, Verizon, and WorldPay, Moogsoft is a very solid bet.

But let’s get back to the report and, more importantly, the tech. Let’s talk about those “more modern functions.” When organizations digitally transform (which they all are doing, or must do, to keep competing), their main tools are cloud and DevOps. This means distributed computing - containers, microservices, APIs galore, serverless, service mesh, and edge. It means hybrid and multi-cloud. It means CI/CD pipelines that massively accelerate the delivery of changes into production. It means telemetry everywhere to satisfy observability demands, and consequently big data. Perhaps it would be better to say HUGE data.

The report also touches on a pet topic of mine: other use cases for AIOps. Traditionally, we’ve firmly put AIOps in the incident management box. However, as other research analysts have also noted, there are other use cases, notably customer experience. GigaOm describes how an AIOps tool consumes data from multiple sources: End User Monitoring (EUM), system or infrastructure monitoring (including cloud, natch), and application monitoring (including the API infrastructure that microservices rely on). It’s the EUM that I’m most interested in. As the report says:

“EUM or RUM (Real User Monitoring) is typically an APM requirement and not something the AIOps tool does by itself. AIOps should consume and use the end-user data no matter the source.”

Totally! AIOps’ job is to collect the data and make sense of it so humans can act on the insights. This widens the outcomes we can expect from AIOps: from noise reduction that lowers MTTR, to insights into customer experience that go beyond “service is slow or unavailable” to “I like using this service”. As DevOps evolves towards Value Stream Management (VSM), teams are increasingly concerned with optimizing the flow of value and the realization of that value in their customers’ hands. AIOps can help with the flow: apply it to your CI/CD pipeline or DevOps toolchain (including ITSM/service desk) and observe that. It can help with value realization too, through all that quantitative data about how your customers are interacting with the services you provide. You can think about qualitative data as well - part of the reason data is HUGE is all the unstructured data being created (80%+ in a typical business). AI can analyze that unstructured data too - think sentiment analysis.

GigaOm also points out how AIOps helps DevOps teams perform live risk management by connecting the dots on those rapid changes (after all, most incidents occur after a change):

“The AIOps tool can see the dev toolchain, including integration with traditional DevOps tooling. This includes the ability to see the outcomes of the continuous deployment process of a DevOps toolchain and correlate that with ITSM change requests validated by the CMDB, so it is always monitoring what is, and not what was.”
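To make "connecting the dots" concrete, here is a minimal sketch of change correlation: given an incident, list the deployments to the same service that landed shortly before it opened. It is an illustration only, not how GigaOm or any vendor implements this. The `Deployment` and `Incident` types, the `suspect_changes` function, the one-hour window, and the sample data are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deployment:
    # Hypothetical record of one change pushed by a CD pipeline.
    service: str
    version: str
    deployed_at: datetime


@dataclass(frozen=True)
class Incident:
    # Hypothetical record of one incident raised against a service.
    service: str
    summary: str
    opened_at: datetime


def suspect_changes(incident, deployments, window=timedelta(hours=1)):
    """Return deployments to the incident's service that landed within
    `window` before the incident opened, most recent first."""
    candidates = [
        d for d in deployments
        if d.service == incident.service
        and timedelta(0) <= incident.opened_at - d.deployed_at <= window
    ]
    return sorted(candidates, key=lambda d: d.deployed_at, reverse=True)


# Made-up sample data for illustration.
deployments = [
    Deployment("checkout", "1.4.0", datetime(2021, 9, 8, 9, 0)),
    Deployment("checkout", "1.4.1", datetime(2021, 9, 8, 13, 40)),
    Deployment("search", "2.0.3", datetime(2021, 9, 8, 13, 50)),
]
incident = Incident("checkout", "latency spike", datetime(2021, 9, 8, 14, 5))

suspects = suspect_changes(incident, deployments)
print([d.version for d in suspects])  # prints ['1.4.1']
```

A real tool would weigh far more signals than a time window and a service name, but even this simple join shows why seeing the deployment stream alongside incidents matters.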

In a recent report by Research in Action, the author, Eveline Oehrli, renames AIOps to AIPA (AI Predictive Analytics). This is partly because of the new use cases we just looked at, but also because of what GigaOm describes in its ‘Automation’ key criterion:

“The ability to onboard new applications and create useful analysis with minimal human intervention, with the extensibility to automate remediation for well-known processes. Includes: Proactive/self-healing operations, meaning that the AIOps tool is able to solve problems automatically and without human intervention, either through external or internal orchestration tooling or leveraging an automated ticketing system.”

There’s a fine balance here. One consideration is the difference between predictive and proactive. Take a listen to this webcast with Oehrli and me, where we discuss this in detail. As a rule, we shouldn’t be able to predict a failure, because if we suspect it, arguably we should have already fixed it. But AIOps has the power to show us trends we don’t suspect and give us insights into unknown unknowns. This means our capabilities expand beyond incident remediation into prioritizing and paying down technical debt, and combining these efforts with chaos engineering.

From a self-healing perspective, people generally want to tread carefully. There’s a trust issue in letting the machines fix themselves. What if they get it wrong? The approach to building trust in our machines is the same as building trust with our fellow humans. We do it slowly, over time, in increments bedded in truth and proof. It’s an empirical and data-driven process: we try something, check that it worked, and then try a bit more. One thing to avoid is building automated fixes where we should have fixed the underlying problem: the sticking-plaster effect.
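One way to picture that incremental approach is a small ledger that only lets a fix run unattended once it has proven itself under human supervision. This is a sketch under my own assumptions, not a feature of any particular product: the `TrustLedger` class, the three-success threshold, and the `restart_cache` fix name are all hypothetical.

```python
class TrustLedger:
    """Tracks consecutive verified successes per automated fix.

    A fix runs with human approval until it has succeeded
    `required_successes` times in a row; any failure resets its streak.
    """

    def __init__(self, required_successes=3):
        self.required_successes = required_successes
        self._streak = {}

    def record(self, fix_name, succeeded):
        # Extend the streak on success, reset it to zero on failure.
        if succeeded:
            self._streak[fix_name] = self._streak.get(fix_name, 0) + 1
        else:
            self._streak[fix_name] = 0

    def mode(self, fix_name):
        # Unknown fixes have a streak of zero and so need approval.
        if self._streak.get(fix_name, 0) >= self.required_successes:
            return "automatic"
        return "needs_approval"


ledger = TrustLedger(required_successes=3)
print(ledger.mode("restart_cache"))   # prints needs_approval

for _ in range(3):
    ledger.record("restart_cache", succeeded=True)
print(ledger.mode("restart_cache"))   # prints automatic

ledger.record("restart_cache", succeeded=False)
print(ledger.mode("restart_cache"))   # prints needs_approval
```

Resetting on a single failure is deliberately conservative. The same data also tells you how often a fix fires, which is a useful prompt to ask whether the underlying problem should be fixed instead.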

It might seem that adding an AIOps tool means yet another burden on already overworked technology teams, but as the report explains, this technology, when built in the right way:

  • Provides visibility and controls into work flowing across the value stream

  • Connects Dev and Ops and accelerates their processes

  • Makes DevOps toolchains safer, faster, and more scalable

  • Destroys toil through noise reduction and auto-remediation

Moogsoft is a leader and outperformer in this category because this is what it was built for - it’s a core competency, not an add-on. Competitors are scrambling to catch up with its AI and ML capabilities and its integrations, notably DevOps and security tooling, while Moogsoft continues to forge ahead laying out the paths to the future.

Moogsoft is the AI-driven observability leader that provides intelligent monitoring solutions for smart DevOps. Moogsoft delivers the most advanced cloud-native, self-service platform for software engineers, developers and operators to instantly see everything, know what’s wrong and fix things faster.

About the author

Helen Beal

Helen Beal is a DevOps and Ways of Working coach, Chief Ambassador at DevOps Institute and an Ambassador for the Continuous Delivery Foundation. She provides strategic advisory services to DevOps industry leaders and is an analyst at Accelerated Strategies Group. She hosts the Day-to-Day DevOps webinar series for BrightTalk, speaks regularly on DevOps topics, is a DevOps editor for InfoQ and also writes for a number of other online platforms. Outside of DevOps she is an ecologist and novelist.
