Software Development Provider
Ensures uptime and availability for customers around the globe
March 1, 2022
Overview
A no-code application development company needed a way to ensure uptime and availability for its thousands of customers across the globe, who depend on its platform to build and run the applications that support their businesses.
“Moogsoft is much more flexible and feature-rich than the other AIOps vendors on the market.”
– Director of Cloud Platform
Business Challenges
With just four people, the company’s AIOps team is responsible for maintaining the availability of thousands of applications that run on AWS and are used by tens of millions of people. Not only would downtime lead to a bad customer experience, but it could also have significant financial consequences due to the contractual SLAs the company has with its customers.
Leading up to the launch of a new cloud-native platform that uses containers and Kubernetes, the customer recognized that there would be a significant increase in complexity and volume of telemetry data – and a higher risk for performance issues. For them, hiring more people wasn’t the answer. They knew they had to automate.
The customer spent six months building a system that routed alerts using AWS Lambda functions, but it wasn’t accurate and provided zero correlation capabilities, so every alert created a ticket.
They knew they needed a better solution since they were growing fast and the alert volume was quickly increasing. To address these concerns, the customer turned to Moogsoft to replace their home-built solution.
Moogsoft Solution
The customer chose Moogsoft because it helps them automate the incident management lifecycle by prioritizing processes and workflows.
By clustering related alerts, Moogsoft’s correlation capabilities reduced alert noise by 85% and the number of incidents by 80%. As the customer’s tools and telemetry data grow in volume and it moves to multi-cloud, Moogsoft integrates easily with those tools and helps the small team determine what’s happening in the generated data, why, and what action to take.
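To make the effect of clustering concrete, the sketch below groups alerts from the same service that arrive close together in time into a single incident, instead of opening one ticket per alert. It is only an illustration under stated assumptions: the `Alert` and `Incident` types, the `correlate` function, and the five-minute window are hypothetical, and this simple time-window rule is not Moogsoft’s correlation algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    service: str      # hypothetical field: the service that raised the alert
    timestamp: float  # seconds since an arbitrary starting point


@dataclass
class Incident:
    service: str
    alerts: list = field(default_factory=list)


def correlate(alerts, window_seconds=300.0):
    """Group alerts per service when each arrives within
    window_seconds of the previous alert in the same incident."""
    incidents = []
    latest_by_service = {}  # service -> most recent incident for it
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = latest_by_service.get(alert.service)
        if (current is not None
                and alert.timestamp - current.alerts[-1].timestamp <= window_seconds):
            # Close enough in time: treat as part of the same incident.
            current.alerts.append(alert)
        else:
            # Too far apart (or first alert for this service): new incident.
            current = Incident(service=alert.service, alerts=[alert])
            incidents.append(current)
            latest_by_service[alert.service] = current
    return incidents


# Made-up sample data for illustration only.
alerts = [
    Alert("checkout", 0), Alert("checkout", 30), Alert("checkout", 90),
    Alert("search", 100), Alert("search", 120),
    Alert("checkout", 1000),
]
incidents = correlate(alerts)
print(f"{len(alerts)} alerts -> {len(incidents)} incidents")
```

With one ticket per alert, the sample data would open six tickets; grouping by service and time opens three. A production correlation engine would typically weigh more signals than these two.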
According to their Director of Cloud Platform, “Moogsoft is much more flexible and feature-rich than the other AIOps vendors on the market. Most notably, our ability to interact with the platform via an API and the workflow automation capabilities have been game-changers for us. We’re only a small team, but Moogsoft helps us feel much larger.”
Domain
- Software Development Provider
Key Challenges
- Maintaining a mix of thousands of monolithic and new microservices-based apps with a small team
- Too much noise – 100+ signals to alert on
- Lack of customer visibility
- Maintaining custom Lambda functions for DIY
Business Impact
- 10% failure rate of homegrown AWS Lambda alert routing system created missed incidents and customer frustration
- Every alert that fired created a ticket, which a growing customer base made unsustainable
Moogsoft Business Benefits
- 85% alert noise reduction
- 80% incident reduction by clustering alerts
- Automation vs. people to scale
- Out-of-the-box integrations (AWS, PagerDuty, Slack, etc.)
- Flexibility of the workflow engine and data catalog: event routing and event severity are tracked in data catalogs
Monitoring Ecosystems
- CloudWatch
- Elastic
- PagerDuty
- Zendesk
- Slack
- Grafana Cloud