Communications Provider Accelerates Incident Detection and Resolution using AIOps from Moogsoft

Alert volume also dropped by 90%-plus, and customer-impacting incidents fell by 30%

Overview

This leading outsourcer of cloud-based communications and collaboration solutions for enterprises had lost visibility and control over its IT environment.


“I bought Moogsoft to gain insight into our alerts so that I can sleep better at night.”

– Director of Technology


Key Challenges

This 10-year-old organization used a variety of system monitoring and management tools, such as Microsoft SCOM (System Center Operations Manager), Splunk, SolarWinds, and Cacti, as well as various homegrown solutions, to create email notifications for its operations teams. With only about 15 people across the NOC, systems operations, infrastructure, and applications teams, managing incidents proactively was a big challenge.

Through SCOM, operations teams had visibility into only 40% of the total alert volume. The remaining alerts were turned off to avoid further alert overload. From these email alerts, 300 to 400 tickets were created each week for the NOC team to manage, but 70% of these tickets were closed without any action taken. Furthermore, every P1/P2 incident triggered an all-hands conference call.

It took the operations teams about two hours to detect incidents and another two hours to resolve them. They were operating reactively — over 70% of incidents were detected by customers first.

“Because there was such a high volume of alerts, we could only look at critical alerts when things were breaking,” the NOC manager said. “The ‘lows’ and ‘mediums’ that could be leading to problems would always be missed. It was like firefighting.”

“Because of SCOM’s server-level focus, it was very difficult to determine whether a larger part of the environment was being affected as a whole, since we were just concentrating on alerts coming in from one server,” the NOC manager said.

After years of challenges, they decided to evaluate Moogsoft.

Moogsoft

Today, all data from across their toolsets, including SCOM, feed into Moogsoft, which is now a direct interface into their ticketing system.

“We are using the same tools but the way in which we are using them has completely changed. We have turned on all alerts and are sending everything to Moogsoft for full visibility,” the NOC manager said.

Moogsoft has helped this organization achieve a 90% reduction in workloads, a 30% reduction in customer-identified incidents, a 75% reduction in MTTD (mean time to detect), and a 25% reduction in MTTR (mean time to resolve).
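
To put the MTTD and MTTR percentages in concrete terms, the short Python sketch below applies them to the roughly two-hour detection and resolution times quoted earlier. This is illustrative arithmetic only: it assumes the reductions were measured against those two-hour baselines, which the case study does not state explicitly, and the function and variable names are hypothetical.

```python
# Baselines quoted earlier in the case study: about two hours each.
BASELINE_MTTD_HOURS = 2.0
BASELINE_MTTR_HOURS = 2.0


def after_reduction(baseline_hours: float, reduction_pct: float) -> float:
    """Return what remains of a baseline after cutting it by reduction_pct percent."""
    return baseline_hours * (1 - reduction_pct / 100)


# Assumption: the 75% and 25% reductions apply to the two-hour baselines.
mttd_after = after_reduction(BASELINE_MTTD_HOURS, 75)
mttr_after = after_reduction(BASELINE_MTTR_HOURS, 25)

print(f"Implied MTTD: {mttd_after * 60:.0f} minutes")
print(f"Implied MTTR: {mttr_after * 60:.0f} minutes")
```

Under that assumption, detection would drop from about two hours to about 30 minutes, and resolution from about two hours to about 90 minutes.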

CUSTOMER PROFILE


Industry

  • Cloud-based communications solutions

Key Challenges

  • Lengthy MTTD and MTTR
  • Many disparate tools and event sources
  • Lack of event correlation
  • 1000s of email alerts per day
  • 300-400 tickets per week

Business Impact

  • 70% of incidents detected by customers
  • Frequent, hours-long service-impacting incidents
  • Significant productivity loss

Moogsoft Business Benefits

  • >90% reduction in daily alerts
  • 75% reduction in MTTD
  • 25% reduction in MTTR
  • 30% reduction in customer-identified incidents
  • Dramatic increase in Level-1 operator productivity

Integrations

  • Microsoft SCOM
  • SolarWinds
  • Splunk
  • Cacti
  • Homegrown solutions
