Digital Insurance Provider Executes Public Cloud Migration using AIOps from Moogsoft

Overview

This digital insurer provides around 30 million customers worldwide with insurance, savings, and investment products. They combine strong life insurance, general insurance, and asset management businesses under one powerful brand.


“Moogsoft has enabled our Cloud Migration by transforming our Operations from Incident-Focused to Service-Focused”

– Event Management & Analytics Manager


Key Challenges

With just 13 people, this insurer’s operations teams are responsible for managing service quality across all digital applications. To gain visibility into their production stack, the operations teams were using AppDynamics, Splunk, BMC End User Monitoring and HP OpenView.

These operators were manually analyzing 500 monitoring alerts per day (out of tens of thousands) via an Exchange mailbox. “We had to disable many of the alerts because it was killing the productivity of our level 1 operators”, said the tools architect. Despite this alert restriction, one day the tools architect realized the team had a bigger problem:

“I was sitting close to our level one operations team and was listening to two conversations our support operators were having. One operator was looking at AppDynamics and speaking to application support level 2 saying ‘we’ve lost application server connectivity’ and another operator was on the phone with our network provider saying ‘the firewall has gone down.’ It took more than 30 minutes for both operators to realize that they were, in fact, investigating the same issue. It was at this point that I realized we lacked basic insight and event correlation across our toolsets”, said the tools architect.

In addition to lacking correlation across toolsets, the operations teams struggled to understand application impact when incidents occurred. “We were investigating incidents, but we didn’t understand Service Impact,” said the Event Management & Analytics manager.

Moogsoft Solution

As part of evaluating Moogsoft, the restrictions on all alerts were lifted. “In the first week, we generated 64,493 events. Moogsoft was able to reduce this volume down to 447 unique alerts, and correlate these alerts into 49 actionable incidents for our level one operators, representing a 99% event reduction and 10X increase in their productivity”, said the tools architect. Further, in the insurer’s side-by-side comparison test, Moogsoft detected a production incident one hour earlier than the level one operators, who were still using their Exchange mailbox process.
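The reduction figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the three first-week counts from the quote; the helper name `reduction_pct` is illustrative and not part of any Moogsoft tooling.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage by which a count shrinks going from `before` to `after`."""
    return 100.0 * (1 - after / before)


# First-week figures quoted by the tools architect.
events, alerts, incidents = 64_493, 447, 49

print(f"events -> alerts:    {reduction_pct(events, alerts):.1f}%")
print(f"alerts -> incidents: {reduction_pct(alerts, incidents):.1f}%")
print(f"events -> incidents: {reduction_pct(events, incidents):.1f}%")
```

Deduplicating events into alerts removes about 99.3% of the volume, and clustering alerts into incidents removes about 89% of what remains. Both results are in line with the rounded 99% and 90% figures cited in this case study.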

Today, Moogsoft is the single pane of glass into the health of this insurer’s IT production stack. “We don’t have to worry about fine-tuning our monitoring tools to avoid false positives and mastering thresholds because Moogsoft catches everything,” said the Event Management & Analytics Manager.

This year, the insurer has onboarded thousands of new applications and begun migrating those services to the cloud without adding headcount. Over the next several months, it plans to migrate 180 additional applications to the public cloud, again without adding a single operations head.

Domain

  • Online Insurance Provider

Key Challenges

  • Overwhelmed by production event volume
  • Visibility limited by disabled alerting
  • Reactively addressing incidents
  • Lack of event correlation across tools
  • Difficult to understand service impact
  • High service costs of BMC implementation

Business Impact

  • Unable to transact quotes or policies during service interruptions
  • Frequent customer-reported incidents
  • Lack of incident awareness increased time to resolution
  • Unable to scale the business and migrate to public cloud due to instability

Moogsoft Business Benefits

  • Reduced alert volumes by 99%
  • Clustered alerts into Incidents, reducing workload by 90%
  • Detected production incidents one hour earlier than existing processes
  • Increased Level 1 operator productivity by 10X
  • Increased alert visibility by 10X
  • Enabled onboarding of thousands of applications without adding headcount
  • Enabled migration of traditional apps to the public cloud without adding headcount

Monitoring Ecosystems

  • AppDynamics
  • HP OpenView
  • Splunk
  • BMC End User Monitoring
