How AI Helps IT Ops Pros Work Remotely
Will Cappelli | April 2, 2020

IT Ops teams working from home can use AI to collaborate and communicate better

While the COVID-19 pandemic reshapes work processes, digitalization is allowing businesses to adjust to the fluid situation. The deployment of AI in IT operations is a good case study of this.

Human beings’ social dimension needs cultivation. Otherwise, people become unhappy and perform ineffectively. Beyond that, many tasks require social interaction to be executed successfully, including in IT operations.

Although remote work removes physical co-presence, social media creates a certain connectedness. Adding AI to one’s technology arsenal mitigates much of the damage from the lack of physical co-presence while adding new advantages.

Five Types of AI

AI is composed of five distinct types of algorithms: data selection and normalization algorithms, pattern discovery algorithms, causal analysis and inference algorithms, communication and collaboration algorithms, and robotic process automation algorithms. Each of these algorithmic types mitigates a loss in effectiveness caused by a lack of physical co-presence.

AI and the Many Eyes

IT operations professionals must contend with high-volume streams of rapidly changing, noisy, high-dimensional data. Inevitably, any single human being is likely to miss many significant data items. IT operations teams have historically been grouped together physically in a data center to ensure that many eyes are trained on the same data stream. In a remote work scenario, communication among team members will be much slower, and real-time correction of mistaken observations will be almost impossible. Data selection and normalization algorithms (the first type of AI) can compensate by automating the initial observations and streamlining the data sets by removing noise and redundancy. The team of IT operators has less data to deal with, and that data is of higher quality. Hence, while the many eyes cannot work together as rapidly and effectively as when physically grouped in a data center, much less will be missed. In fact, overall team performance will likely improve.
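To make "removing noise and redundancy" concrete, here is a minimal Python sketch. It is an illustration only, not any product's algorithm: the `Event` fields, the severity threshold, and the 60-second window are all assumptions chosen for the example. It drops low-severity events and collapses repeats of the same alert that arrive close together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape; real monitoring payloads will differ.
    source: str       # host or service that emitted the event
    check: str        # name of the check that fired
    severity: int     # 0 = informational ... 5 = critical (assumed scale)
    timestamp: float  # seconds since some reference point


def reduce_events(events, min_severity=2, window=60.0):
    """Drop low-severity noise and collapse repeats of the same
    (source, check) pair that arrive within `window` seconds of the
    previous occurrence."""
    last_seen = {}  # (source, check) -> timestamp of most recent occurrence
    kept = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.severity < min_severity:
            continue  # noise: below the severity threshold
        key = (ev.source, ev.check)
        prev = last_seen.get(key)
        last_seen[key] = ev.timestamp
        if prev is not None and ev.timestamp - prev <= window:
            continue  # redundancy: a repeat of a recent alert
        kept.append(ev)
    return kept
```

Because the window slides with each repeat, an alert that keeps firing every few seconds is shown to the team once rather than hundreds of times. A fixed threshold and window are the simplest possible choice; a learned approach would tune both from the data.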

AI and Diversity

With data selected, it’s time to search for the patterns that give that data its meaning. Because pattern discovery is a question of classification, physical co-presence plays an interesting, although indirect, role here. In the last five years, enterprises have strived to diversify their workforces, in part because data classification has subjective roots and diverse cultural perspectives yield more meaningful, actionable classification. Now, the inclusion of employees from marginalized communities requires managing an enterprise’s physical spaces and the physical interactions of the now diversified team. In a remote work scenario, this isn’t necessary, which potentially reduces the involvement of marginalized community members and hence limits the diversity of perspectives on data classification problems. By automating the pattern discovery process, AI (here, the second type) does two things. First, it gives an IT operations team a head start on any data classification problem so it can compensate for any reduction in the employee-diversity element. Second, by rendering the classification impersonal, it mutes cultural distortions in the classification process.

AI and the Myopia of Isolation

With data classified and correlated, the next step is figuring out the root causes of the issues that have surfaced through the classification. In modern IT systems, causes and effects are not simple. A cause is often composed of multiple events occurring in many different components across many different layers of the stack. Physical co-presence allows IT operations sub-teams to collaborate on correlated clusters containing events from many different regions of the IT infrastructure to determine which of the correlations are causal relationships and hence actionable. In a remote work scenario, individuals will have far less opportunity for contact and cross-domain analysis of emerging situations. They will tend to retreat into their own silos of expertise, and problems will increasingly be viewed from that perspective. Automated causal analysis and inference (the third type of AI) takes a first pass at separating causation from mere correlation and can mitigate the impact of isolation-induced myopia. Furthermore, it will likely be more effective than human beings in suggesting the broad scope of events that constitute a cross-domain cause and the equally broad scope of multi-domain events that are the effect of that cause.

AI and Collaboration

After conducting causal analysis on the data, operations teams must communicate results to the individuals in charge of what takes place in the IT environment. In a remote work scenario, it’s difficult to quickly find those individuals and gather them together with access to the right information. Then complex communication is required to define and coordinate the appropriate activities. This is hard even when people are sitting in a conference room. In the case of remote work, it is almost impossible. Communication and coordination enablement (the fourth type of AI) can help here in obvious ways. Smart alerting can bring the right people together, while NLP and semantic clarification technologies, along with smart visualization tools, can ease the flow of communication. Difficulties will remain, but AI here can minimize the negative impact and in some cases improve both process and results.
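As a rough illustration of how alerting might "bring the right people together," the Python sketch below uses a plain ownership lookup as a stand-in for smarter, learned routing. The `ownership` mapping, the component names, and the function name are hypothetical and exist only for this example.

```python
def responders_for(incident_components, ownership):
    """Return the sorted, de-duplicated list of people who own any
    component touched by the incident.

    `ownership` maps a component name to the people responsible for it.
    Components with no known owner contribute nobody.
    """
    people = set()
    for component in incident_components:
        people.update(ownership.get(component, ()))
    return sorted(people)


# Hypothetical ownership table, for illustration only.
ownership = {
    "db": ["ana"],
    "network": ["ana", "bo"],
    "cache": ["cy"],
}

# An incident spanning the database and the network pages both owners once.
print(responders_for(["db", "network"], ownership))
```

A static table like this only answers who owns what. The harder part in practice, which the sketch leaves out, is deciding which components an incident actually touches.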

AI and the Closed Loop

Once everything has been communicated and an action plan adopted, it’s time to execute that plan. Smart automation (robotic process automation, the fifth type of AI) is a growing requirement in any case, but a remote work scenario makes direct oversight of remedial action taken on a system impossible. Many partially automated responses will now have to be almost fully automated because people cannot be assembled to deal with the issue. A big barrier to ‘closing the IT operations loop’ has been trust. Can an IT operations team trust a robot or a software agent to make appropriate changes without human intervention? Remote work stands a good chance of rendering that question of trust moot.

In summary, remote work presents five challenges to IT operations teams and, while those challenges will remain, the five types of AI, working together, can lessen their adverse impact.

For an in-depth discussion about this topic, attend my webinar “Virtualize the NOC: Accelerate Your Transition to Remote IT Ops with AIOps” on Wed., April 8 at 10 am PT.

About the author


Will Cappelli

Will studied math and philosophy at university, has been involved in the IT industry for over 30 years, and for most of his professional life has focused on both AI and IT operations management technology and practices. As an analyst at Gartner, he is widely credited with having been the first to define the AIOps market before joining Moogsoft as Field CTO. In his spare time, he dabbles in ancient languages.
