It’s a new year, the change freeze has thawed (although you really shouldn’t be doing a change freeze anyway), and it’s time to get back to work. Well, first everyone had to ask for their passwords to be reset, but once that was dealt with, it was time to get back to work.
We already got our 2018 predictions done at the end of last year; our own Richard Whitehead offered some in APM Digest:
Driven, in part by a motivation to be cloud-native, we will see a shift in focus from “outside-in” APM, to instrumentation being provided at source by the app developer. And just as network monitoring is now the domain of the switch vendor, Platform and Infrastructure-as-a-Service providers will step up the level of native instrumentation too.
Happy New Year! Here’s A Bunch Of New Infrastructure For You To Manage
Richard’s prediction was focused on data sources for performance management, but there are other consequences of the ongoing transition away from “dumb” infrastructure and towards self-monitoring and even self-configuring infrastructure components. Telco operators I speak to expect a tenfold increase in the number of logical devices in their care, and they are concerned about their ability to manage this sudden increase.
In the same way, data center operators are in the middle of their own transitions towards cloud models, but are worried about their ability to keep up with cloud-phase competitors who started with a clean slate. When doing things according to accepted best practices is turning into a drag on business agility, something has to change.
AI and machine learning (ML) appear to promise a way out of the trap of ever-increasing numbers of devices and ever-accelerating rates of change. The problem is that these are relatively new fields, and it is far from clear how to apply them effectively. Monsanto determined that a 99% failure rate with a current slate of 50-plus deep-learning projects is acceptable because “that 1% is going to bring exponential gain.” Enterprise IT cannot afford those odds: the lights have to stay on, apps must continue to be served, SLAs must be respected.
Enterprise IT Operations Are Ideally Placed To Take Advantage Of AI & ML
Fortunately, enterprise IT offers a much more favorable field of research than Monsanto’s labs have to deal with — partly because it is more circumscribed. Data are getting much simpler to acquire, as Richard noted above. Infrastructure these days generally comes with its own instrumentation built in, so the problem is not “monitoring” as it used to be done, but rather sifting and interpreting massive volumes of data being generated every second. Also, the formats are getting more unified over time. Scientists at Monsanto would probably give a limb for something like a RESTful API!
Another advantage for enterprise IT is that AI & ML techniques can be evaluated against real-world data without needing to plant a field and wait for harvest time. Even partial results are helpful when they can be evaluated intelligently, and discussed in real time by expert human operators. The combined results can then be fed back into the learning system to improve its results for the next iteration — and those iterations can be extremely rapid.
This combination of straightforward machine learning and human collaboration is what we mean here by “supervised learning”: the system learns from feedback supplied by human operators, as opposed to the “unsupervised” black-box version that learns from the data alone. Enterprise IT gains an additional advantage here. One of the problems that complexity has brought is increasing specialization, and as a consequence the number of people and teams involved in any but the most trivial problems has mushroomed. People in all of these very different roles may struggle to build a common view of a problem from their separate perspectives and tools. With AI to assemble that holistic view, it becomes much easier for human specialists to work together and get things done.
How To Avoid Failing With AI/ML?
All of this may sound too good to be true, and at risk of winding up among the 99% of AI/ML projects that fail. The way to mitigate that is to look for techniques and products that have worked elsewhere, and for a proven track record in the same type of environment. As discussed, Enterprise IT Operations is a relatively constrained domain, so we don’t need to look for general AI, just for narrowly focused learning models that are applicable in this space.
This also makes it easy to build out test scenarios and validate a proposed AI solution before committing resources (effort, time, and money). Simply show all of the data to the AI model, give it a couple of rounds of feedback and configuration, and compare its results with what you had before. This should be enough to give you at least indicative results.
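As a rough illustration of that kind of trial, here is a minimal Python sketch. Everything in it is an assumption made up for the example: the `Event` record, the sample data, and the `group_events` function, which uses a simple greedy text-similarity grouping as a stand-in for whatever model you are evaluating. It is not how any particular product works. It only shows the shape of the comparison: a baseline of one ticket per event, a first pass of the candidate, and a second pass after an operator's feedback has been turned into a configuration change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: int        # seconds since the start of the sample (hypothetical field)
    host: str
    message: str


def jaccard(a: str, b: str) -> float:
    """Token-set similarity between two messages, from 0.0 to 1.0."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def group_events(events, window, min_similarity):
    """Greedy grouping: an event joins the first group whose most recent
    event is within `window` seconds and has a similar enough message."""
    groups = []
    for ev in sorted(events, key=lambda e: e.ts):
        for group in groups:
            last = group[-1]
            close_in_time = ev.ts - last.ts <= window
            if close_in_time and jaccard(ev.message, last.message) >= min_similarity:
                group.append(ev)
                break
        else:
            # No existing group matched, so this event starts a new one.
            groups.append([ev])
    return groups


def noise_reduction(events, groups):
    """Fraction of items an operator no longer has to review one by one."""
    return 1 - len(groups) / len(events)


# Made-up sample standing in for a slice of real monitoring data.
events = [
    Event(0, "db01", "disk latency high on db01"),
    Event(5, "db01", "disk latency high on db01"),
    Event(8, "app01", "timeout connecting to db01"),
    Event(12, "app02", "timeout connecting to db01"),
    Event(300, "web01", "certificate expires soon"),
]

# What you had before: every event becomes its own ticket.
baseline = [[ev] for ev in events]

# Round 1: conservative settings, only near-identical messages are grouped.
round1 = group_events(events, window=60, min_similarity=0.9)

# Round 2: an operator points out that the timeouts and the disk latency
# belong to one incident, so the similarity threshold is relaxed.
round2 = group_events(events, window=60, min_similarity=0.1)

for name, groups in [("baseline", baseline), ("round 1", round1), ("round 2", round2)]:
    print(f"{name}: {len(groups)} items to review, "
          f"noise reduction {noise_reduction(events, groups):.0%}")
```

A real evaluation would use far more data and a measure of correctness as well as volume, since grouping unrelated events together is worse than not grouping at all.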
Beyond that, start thinking about long-term adoption, too. That shift from unsupervised to supervised AI is where a lot of value is locked up. ML on its own can only learn so much, but by combining ML with human expertise, and making it easier for those human specialists to work more effectively both with the AI and with each other, both the humans and the machines will improve faster and give better results.
How Can Enterprise IT Operations Get Real Value From AI & ML?
The good news is that Moogsoft can satisfy these criteria in 2018, setting you up to start on your own road to AIOps. The analysts at CB Insights have built a list of the 100 most interesting AI companies out there. There are some interesting use cases listed, but when it comes to Enterprise IT Operations, Moogsoft is the only IT company to make the list. Gartner have also recognized Moogsoft as the only new technology vendor to satisfy all of their criteria for the emerging field of AI for IT Operations (AIOps).
We are not sitting on our hands, either. We have a very aggressive R&D program, including collaborations with academia, and we make sure to update our product frequently to make those advances available to our users.
Finally, and perhaps most importantly, Moogsoft technology has been proven through the success of our customers around the world and across many industries. We have also learned from those customers to produce roadmaps for adoption, including partnerships that combine Moogsoft technology with expert outside advisors such as Deloitte.
We would love to work with you to set up your Enterprise IT Operations for success in 2018. Ask us about AIOps, and how best to adopt it in your own organization.
About the author Dominic Wellington
Dominic Wellington is the Director of Strategic Architecture at Moogsoft. He has been involved in IT operations for a number of years, working in fields as diverse as SecOps, cloud computing, and data center automation.