A maxim in our industry is that when something stops working, the first question IT support asks is “what did you change?” It’s the right question. Almost 20 years ago, industry analysts reported that the majority of outages were caused by change.

About 10 years ago, I was working on a software solution for troubleshooting an IP-PBX system. As with many systems, issues often stemmed from ill-advised changes. This led to a technique in which regular configuration “snapshots” were taken, so that when a problem surfaced, a “diff” of the current configuration against the last known good one could be presented to the engineer as part of the ticket.
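To make the snapshot-and-diff idea concrete, here is a minimal sketch in Python using the standard library's `difflib`. It is not the original tool; the function name `config_diff` and the sample IP-PBX settings are hypothetical and exist only for illustration.

```python
import difflib


def config_diff(last_known_good: str, current: str) -> list[str]:
    """Return unified-diff lines between two configuration snapshots."""
    return list(
        difflib.unified_diff(
            last_known_good.splitlines(),
            current.splitlines(),
            fromfile="last_known_good",
            tofile="current",
            lineterm="",
        )
    )


# Hypothetical snapshots: one taken while the system was healthy,
# one taken when the ticket was opened.
good_snapshot = "codec=g711\nmax_calls=100\ntrunk=sip1"
current_snapshot = "codec=g729\nmax_calls=100\ntrunk=sip1"

diff_lines = config_diff(good_snapshot, current_snapshot)
print("\n".join(diff_lines))
```

The diff shows which settings differ between the two snapshots, but nothing about when each change happened or what happened in between.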

It was well received, but that was 10 years ago.

  • The velocity of change (and the number of dimensions of configuration) has increased dramatically
  • The snapshot approach lacks granularity: a diff shows only the net difference between two snapshots, not when each change was made or what happened in between
  • You may no longer be analyzing changes made by a human in a maintenance window, but changes made by an autonomic system in real time

Here’s a better approach:

  • Source change notifications
  • Correlate them in real time with developing situations in the infrastructure
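The two steps above can be sketched as follows. This is a deliberately simple illustration, not a description of how Incident.MOOG works: it assumes change notifications and alerts both carry a timestamp and a target name, and it treats any change to the same target within a fixed window before the alert as a candidate cause. The `ChangeEvent`, `Alert`, and `correlate` names, and the 15-minute window, are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ChangeEvent:
    timestamp: datetime
    target: str        # hypothetical: host or service that was changed
    description: str


@dataclass
class Alert:
    timestamp: datetime
    target: str
    message: str


def correlate(alert: Alert, changes: list[ChangeEvent],
              window: timedelta = timedelta(minutes=15)) -> list[ChangeEvent]:
    """Return changes to the alert's target made within `window` before the
    alert, most recent first."""
    candidates = [
        c for c in changes
        if c.target == alert.target
        and timedelta(0) <= alert.timestamp - c.timestamp <= window
    ]
    return sorted(candidates, key=lambda c: c.timestamp, reverse=True)


# Hypothetical change notifications, as they might arrive from an
# orchestration or configuration-management system.
changes = [
    ChangeEvent(datetime(2024, 1, 1, 9, 0), "pbx-01", "firmware upgrade"),
    ChangeEvent(datetime(2024, 1, 1, 11, 50), "pbx-01", "codec changed to g729"),
    ChangeEvent(datetime(2024, 1, 1, 11, 55), "web-07", "autoscaler added node"),
]
alert = Alert(datetime(2024, 1, 1, 12, 0), "pbx-01", "call quality degraded")

suspects = correlate(alert, changes)
for change in suspects:
    print(change.timestamp.isoformat(), change.description)
```

Because every change arrives as its own timestamped event, the engineer sees the specific change that preceded the alert rather than a net difference between two snapshots. A real system would need a richer notion of relatedness than an exact target match.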

Now that’s 21st-century agility!

This is one of the difficult challenges that Incident.MOOG solves today.

 

