Systems administration is as easy as riding a bike. Except the bike is on fire, and you’re on fire, and everything is on fire, and you’re in hell. While running from fire to fire is normal for many IT teams, it doesn’t have to be that way. AIOps can help.
IT automation is the oft-touted generic solution to all systems administration woes. For decades, IT vendors have been promising one form of IT automation or another. Blah, blah: “Buy our (overpriced) product, and we’ll save you time! Time is money, so really, you’re not paying anything!”
As a practicing systems administrator, I quickly reject anything that remotely sounds like this. The classic OpEx argument gets a little tired after a while.
The flip side of this is that I measure the value of an IT product in the amount of sleep it saves (or costs) me. In my world, money is intangible. It arrives in my bank account and is gone mere seconds later. Sleep — or more accurately, the lack thereof — is something I understand quite viscerally.
When some OpEx-saving software solution does somehow manage to save me notable amounts of time and effort, my emotions are mixed. I hate the constant firefighting, so yay for things that save me time. I have also been trained to be paranoid and anxious about everything, so “the robots is after me jerb” never ceases to tickle my amygdala. So where does AIOps fit into all of this?
Automate or Perish
Here is the hard truth: none of us gets out of using automation. There are too many workloads under management already, too many security threats, too much technical debt, and too few of us to tackle it all. I don’t care where you work or whom you work for; I will stand by that statement as universal. There is always more for systems administrators to do, and the only way we succeed at getting more done is by automating as much as possible. Not later, right now.
IT automation can take two paths: it can be something accessible to all administrators (current and future) within an organization, or it can be a bunch of custom wizardry that makes you an indispensable superhero. The latter approach works for exactly as long as it takes the business side of your organization to realize how overwhelmingly screwed they are if you get hit by a bus and don’t show up for work.
Business leaders fear the bus factor the way systems administrators fear untested data protection. They measure reduction of the corporate bus factor in nights of sleep. TL;DR: you won’t win that battle, so the whole “custom wizardry” approach to automation is not a good look.
Whatever automation we implement has to be approachable by other sysadmins. This usually means baking the automation we use into vendor-provided frameworks, using well-defined standards, or otherwise putting real effort into making our automation solutions approachable.
The foundation of IT automation is actionable insights about the solutions we seek to automate. Standards play an important role, and certainly interoperability is important, but all automation begins with having inputs to automate against.
Traditional automation starting points usually rely on basic responses to monitoring solutions. Has a given sensor gone above or below a pre-set threshold? If so, act on it. This approach is crude, but has been surprisingly effective for years.
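To make the threshold approach concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the ThresholdRule class, the restart_service action, and the 0 to 90 limits are hypothetical and do not come from any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ThresholdRule:
    """Hypothetical rule: run an action when a reading leaves [low, high]."""
    low: float
    high: float
    action: Callable[[float], str]

    def evaluate(self, reading: float) -> Optional[str]:
        # Act only when the reading is outside the pre-set limits.
        if reading < self.low or reading > self.high:
            return self.action(reading)
        return None


def restart_service(reading: float) -> str:
    # Placeholder for a real remediation step (assumed, not a real API).
    return f"restart triggered at {reading}"


# Example: CPU utilisation in percent, with an assumed upper limit of 90.
rule = ThresholdRule(low=0.0, high=90.0, action=restart_service)
print(rule.evaluate(50.0))  # within limits, so no action is taken
print(rule.evaluate(95.0))  # above the limit, so the action runs
```

The rule knows nothing about context: 89 is always fine and 91 is always a problem, which is exactly the crudeness described above.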
Unfortunately, crude only gets you so far. Just as household thermostats are far more efficient when they’re more flexible, the efficacy of IT automation solutions scales with the capability of the monitoring solutions powering them. The more nuanced monitoring solutions become – the more capable they are of automated correlation, root cause analysis, and detecting deviations from baseline behaviour without false positives – the more useful they become for automating critical IT services.
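As a rough illustration of what “deviation from baseline” can mean, the sketch below flags readings that sit far from a rolling average of recent values. This is a simple z-score check, not how any specific AIOps product works; the BaselineDetector class, window size, and z_limit are all assumed for the example.

```python
from collections import deque
from statistics import mean, pstdev


class BaselineDetector:
    """Hypothetical detector: flags readings far from a rolling baseline."""

    def __init__(self, window: int = 20, z_limit: float = 3.0,
                 min_samples: int = 5) -> None:
        self.history = deque(maxlen=window)  # recent "normal" readings
        self.z_limit = z_limit
        self.min_samples = min_samples

    def observe(self, value: float) -> bool:
        """Return True if value deviates from the learned baseline."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma == 0:
                # A perfectly flat baseline: any change is a deviation.
                anomalous = value != mu
            else:
                anomalous = abs(value - mu) / sigma > self.z_limit
        if not anomalous:
            # Only normal readings update the baseline. This keeps outliers
            # from polluting it, at the cost of adapting slowly to a
            # genuinely new normal.
            self.history.append(value)
        return anomalous


detector = BaselineDetector(window=10)
normal = [50.0, 51.0, 49.0, 50.0, 52.0, 48.0, 50.0]
flags = [detector.observe(v) for v in normal]
spike = detector.observe(70.0)
```

Here a reading of 70 is flagged even though it would pass a static limit of 90, because it is far outside what this particular system normally does. Real tools layer correlation and root cause analysis on top of this kind of check.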
This is where AIOps fits in. It’s a better grade of monitoring: adaptive, constantly learning, and all around more capable than the primitive predecessors that most of us are still using. AIOps isn’t a replacement for humans; it’s just another tool – a fire hose hooked up to the mains instead of the personal fire extinguishers we’ve all been using until now.
The human is still needed to direct the battle against the flames, but having better tools to resolve the mayhem surely does help.
About the author: Trevor Pott
Trevor Pott is a full-time nerd from Edmonton, Alberta, Canada. He is cofounder of eGeek Consulting Ltd. and splits his time between systems administration, consulting, and technology writing. As a consultant he helps Silicon Valley startups better understand systems administrators.