A maxim in our industry is that when something stops working, the first question IT support asks is “what did you change?” It’s the right question. Almost 20 years ago, industry analysts reported that the majority of outages were caused by change.
About 10 years ago, I was working on a software solution for troubleshooting an IP-PBX system. As with many systems, issues often stemmed from ill-advised changes. This led to a technique in which regular configuration “snapshots” were taken, so that when a problem surfaced, a “diff” of the current configuration against the last known good one could be presented to the engineer as part of the ticket.
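To make the snapshot idea concrete, here is a minimal sketch in Python using the standard library's `difflib`. The configuration keys and the `config_diff` helper are invented for illustration; this is not the code from that original tool.

```python
import difflib


def config_diff(last_known_good: str, current: str) -> list[str]:
    """Return a unified diff between two configuration snapshots."""
    return list(
        difflib.unified_diff(
            last_known_good.splitlines(),
            current.splitlines(),
            fromfile="last_known_good",
            tofile="current",
            lineterm="",
        )
    )


# Hypothetical snapshots of an IP-PBX configuration (made-up keys).
good = "codec=g711\nmax_calls=100\nsip_port=5060"
now = "codec=g729\nmax_calls=100\nsip_port=5060"

diff_lines = config_diff(good, now)
print("\n".join(diff_lines))
```

Attached to a ticket, the lines starting with `-` and `+` tell the engineer at a glance which settings differ from the last known good snapshot.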
It was well received, but that was 10 years ago. Since then:
- The velocity of change (and the dimensions of configuration) has increased dramatically
- The snapshot approach lacks granularity
- You may no longer be analyzing changes made by a human in a maintenance window, but changes made by an autonomic system in real time
Here’s a better approach:
- Source change notifications
- Correlate them in real time with developing situations in the infrastructure
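As a rough illustration of those two steps, the sketch below keeps recent change notifications per resource and, when an alert arrives, returns the changes on the same resource that fall inside a time window. Everything here (`ChangeEvent`, `Alert`, `ChangeCorrelator`, the 10-minute window, matching on resource name) is an assumption made for the example; it is not how Incident.MOOG is implemented, and it assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class ChangeEvent:
    timestamp: float  # seconds since some epoch
    resource: str     # e.g. a host or service name
    description: str


@dataclass
class Alert:
    timestamp: float
    resource: str
    message: str


class ChangeCorrelator:
    """Hypothetical correlator: matches alerts to recent changes on the same resource."""

    def __init__(self, window_seconds: float = 900.0):
        self.window = window_seconds
        self._recent = defaultdict(deque)  # resource -> changes, oldest first

    def on_change(self, change: ChangeEvent) -> None:
        # Step 1: source the change notification as it happens.
        self._recent[change.resource].append(change)

    def on_alert(self, alert: Alert) -> list[ChangeEvent]:
        # Step 2: correlate the alert with changes still inside the window.
        changes = self._recent[alert.resource]
        while changes and alert.timestamp - changes[0].timestamp > self.window:
            changes.popleft()  # discard changes that have aged out
        return [c for c in changes if c.timestamp <= alert.timestamp]


correlator = ChangeCorrelator(window_seconds=600)
correlator.on_change(ChangeEvent(1000.0, "pbx-01", "dial plan updated"))
correlator.on_change(ChangeEvent(1200.0, "router-07", "ACL modified"))

suspects = correlator.on_alert(Alert(1300.0, "pbx-01", "call setup failures rising"))
for change in suspects:
    print(f"{change.resource}: {change.description}")
```

Unlike a periodic snapshot, each change is captured individually with its own timestamp, so the granularity is that of the notifications themselves.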
Now that’s 21st Century agility!
This is one of the difficult challenges that Incident.MOOG solves today.
If you want to learn more about Incident.MOOG and change management, my colleague Stephen Hart, CTO International at Moogsoft, will be addressing the topic of new approaches to change and configuration management in two upcoming events:
June 3: Innovise-ESM webinar, Situational Awareness: Enabling Incident Detection During Change Windows
I also plan to blog more about this in the near future!