Chief Complexity Officer
Dominic Wellington | June 2, 2016

What to do when there just aren’t enough hours in the day?

When it comes to the wider trends of the IT market, there is never any shortage of far-ranging pronouncements. Without data, though, all these represent is somebody’s opinion. This is why it is always especially interesting to see a report based on a survey of actual users.

In this case, what caught my eye is a report by Trustmarque, entitled “The CIO Challenge: Simplifying IT Support.” The authors canvassed 200 UK CIOs and senior IT decision makers from large enterprises, a sample that should be large enough to be meaningful.

After some introductory material, the first big eye-catching finding is that 86% of CIOs think IT management has become more complex. I can confirm this finding based on my own anecdotal experience: Every IT decision maker I talk to complains of rising complexity in their IT environment. This is a trend that has been building for years now, starting with virtualization of the compute layer, adding self-service provisioning under the nebulous label of cloud computing, and now bringing network virtualization to the mainstream as well.

The Hidden Costs of Complexity

All of these technological shifts make individual management operations easier: you can provision a server with a click of a mouse instead of going through a months-long process of procurement, delivery, installation, and configuration. The increased abstraction does not actually reduce complexity, though; it only hides it. The friction in the old process of adding infrastructure also gave Operations teams the chance to get a handle on new components, not to mention making sure that old components were decommissioned on time. And if a physical server was forgotten anyway, eventually someone would trip over the old, dust-covered box in the corner of the server room (or behind the drywall) and ask what it was for.

No such comfort is available to the administrators of virtual infrastructure. Zombie servers can churn away for years, forgotten by their users but never decommissioned, consuming resources and offering security holes and compliance pitfalls the whole time.

The old way of dealing with this would have been to give the most junior member of the Operations team a clipboard and a pair of running shoes, but that doesn’t really cut it in the modern world.

It gets even worse when you start trying to operate this type of infrastructure at scale. The old world had plenty of recipes for a well-run data center, most of which hid among their ingredients an innocent reference to a complete and up-to-date CMDB or other euphemism for inventory.

Sound of a Needle Skipping on a Record

This type of approach worked fine around the time of my first sysadmin gig, when we kept our infrastructure diagram as a Microsoft Visio printout on the cubicle wall. It doesn’t work now. There are not enough junior operators left to update the diagram, let alone printers fast enough to commit it to paper before it’s obsolete.

This complexity is eating IT organizations alive. It is no surprise, then, that 77% of the survey respondents have an explicit goal of getting away from “supporting ‘run the business’ IT,” and investing more in “transforming the organization through innovative technology projects.”

This is where a new approach can help. Instead of running the NOC the old way and trying to pare off a percentage point here or there at the edges, try looking at a new way of keeping the lights on in the data center. Stop throwing good money after bad in the Borgesian pursuit of an inventory that matches the reality of what it documents. After all, that inventory is a means to an end: being able to sift and prioritize incoming events according to their technical and business impact.

What if there were a way to do that—without needing to put in the effort to build, and worse, maintain the inventory?

That is exactly what Moogsoft’s approach does. Our technology can cut down on event storms by automatically identifying and surfacing only significant events, and then clustering them together into groups of related events. Each one of these clusters, or Situations, describes a particular issue that is occurring, grouping together all of the events that are symptoms of that issue.
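To make the idea concrete, here is a deliberately simplified sketch of the general pattern: filter out insignificant events, then group the rest by how close they are in time and how similar their messages are, with no inventory lookup involved. This is not Moogsoft's actual algorithm. The Event fields, the severity threshold, the time window, and the word-overlap similarity measure are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since an arbitrary start (hypothetical field)
    source: str       # host or component that emitted the event
    severity: int     # 0 = informational ... 5 = critical (assumed scale)
    message: str


def similarity(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two messages."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def build_situations(events, min_severity=3, window=120.0, min_similarity=0.3):
    """Drop low-severity noise, then cluster the remaining events.

    An event joins an existing group when it arrives within `window` seconds
    of that group's latest event and its message is similar enough to it.
    Otherwise it starts a new group.
    """
    situations = []  # each situation is a list of events, oldest first
    significant = [e for e in events if e.severity >= min_severity]
    for event in sorted(significant, key=lambda e: e.timestamp):
        for group in situations:
            last = group[-1]
            close_in_time = event.timestamp - last.timestamp <= window
            if close_in_time and similarity(event.message, last.message) >= min_similarity:
                group.append(event)
                break
        else:
            situations.append([event])
    return situations


events = [
    Event(0, "db-01", 4, "checkout database connection timeout"),
    Event(5, "web-03", 3, "checkout request timeout"),
    Event(7, "web-07", 1, "search cache warm-up complete"),  # below threshold
    Event(30, "lb-01", 5, "checkout backend timeout"),
    Event(40, "cache-02", 4, "search index disk full"),
    Event(900, "db-01", 4, "checkout database connection timeout"),  # too late
]

situations = build_situations(events)
for number, group in enumerate(situations, start=1):
    print(number, [e.source for e in group])
```

With this sample data, the three checkout timeouts that arrive within seconds of each other land in one group, the unrelated disk-full event gets its own, and the timeout that recurs fifteen minutes later starts a fresh group. A production system would need far more robust measures of significance and relatedness than a severity cutoff and word overlap, but the shape of the output is the point: operators look at a handful of groups rather than a stream of individual events.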

This means that instead of the operations team having to watch a screen and wait to see an event they need to react to, they will be notified proactively and invited to collaborate on a real, actionable Situation. This reversal is how CIOs can reduce the drain on their resources that basic IT support represents. Instead of throwing bodies at vaguely defined problems, specialists can work together effectively to resolve specific issues.

The resources freed up by this inversion of the legacy approach to IT support can be dedicated to improving service, responding in a timely manner to new requests from the business and from end users. As expectations of IT agility increase, organizations will struggle to deliver – unless they have already done the foundational work to ensure that routine operations can be taken care of with minimal effort.

The only way to deal with the challenge of increasing complexity and accelerating change is to embrace it.

About the author

Dominic Wellington

Dominic Wellington is the Director of Strategic Architecture at Moogsoft. He has been involved in IT operations for a number of years, working in fields as diverse as SecOps, cloud computing, and data center automation.
