Due to demand from our customers, Moogsoft now has out-of-the-box integration with Amazon CloudWatch. This allows customers to ingest, correlate and analyze events in real time across all of their AWS resources, applications and existing monitoring tools like New Relic.
AWS CloudWatch in the Grand Scheme of Things
Amazon CloudWatch is your primary tool for monitoring the AWS resources and applications running on your Amazon infrastructure. It gives you visibility into EC2 instances, Elastic Load Balancers, EBS volumes, RDS database instances, SQS queues, and SNS topics.
When an Incident occurs and one of your cloud applications is impacted, you might be used to seeing hundreds or thousands of CloudWatch alerts. Analyzing and understanding them can take hours, and your end users are impacted the whole time. With Moogsoft, there’s a better way.
Moogsoft Brings Correlation and Collaboration to CloudWatch
With Moogsoft, alert storms are reduced and contextualized in real time, allowing Incidents to be simplified and resolved in just minutes.
Moogsoft does this by (1) de-duplicating and blacklisting unwanted CloudWatch events and (2) using machine learning to correlate CloudWatch alerts in real time and create individual clusters (‘Situations’) of alerts that capture the full narrative of an Incident, from beginning to end.
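To make the two steps concrete, here is a minimal sketch in Python. It is not Moogsoft's implementation: a simple time-window rule stands in for the machine learning correlation, and the `Alert` fields, the `reduce_alerts` function, and the 300-second window are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert shape; real CloudWatch events carry many more fields.
    timestamp: float  # seconds since some reference point
    source: str       # e.g. an alarm name
    resource: str     # e.g. an instance or queue identifier
    message: str


def reduce_alerts(alerts, blocked_sources, window_seconds=300.0):
    """Drop blocked and duplicate alerts, then group the rest by time proximity."""
    seen = set()
    kept = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        # Step 1a: discard alerts from sources the operator has blocked.
        if alert.source in blocked_sources:
            continue
        # Step 1b: de-duplicate on everything except the timestamp.
        key = (alert.source, alert.resource, alert.message)
        if key in seen:
            continue
        seen.add(key)
        kept.append(alert)

    # Step 2 (stand-in for correlation): start a new cluster whenever the
    # gap since the previous kept alert exceeds the window.
    clusters = []
    for alert in kept:
        if clusters and alert.timestamp - clusters[-1][-1].timestamp <= window_seconds:
            clusters[-1].append(alert)
        else:
            clusters.append([alert])
    return clusters
```

Each returned list plays the role of one cluster of related alerts. A real correlation engine would weigh far more than arrival time, such as topology and alert text similarity.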
However, the full narrative of an Incident can’t always be assembled from the alerts of a single monitoring tool. In fact, that’s rarely the case when major Incidents occur at large enterprises and service providers. Fortunately, Moogsoft can ingest events and alerts from your entire production stack and correlate them in real time to give you full situational awareness.
As a basic example, let’s say that you have an S3 storage issue and one of your applications can’t write to it. CloudWatch allows you to look at the impact through the storm of application events that come in. But what came first? How do you differentiate between the cause and the symptoms?
With Moogsoft, you can see exactly how this Incident unfolded, in the order the alerts arrived.
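The idea of reading an Incident as a timeline can be sketched in a few lines of Python. This is only an illustration under assumptions: the `build_timeline` function and the dictionary keys `time`, `source`, and `message` are hypothetical, and the earliest alert is merely a candidate cause, not proof of one.

```python
from datetime import datetime


def build_timeline(alerts):
    """Return (offset_seconds, source, message) tuples ordered by arrival time."""
    ordered = sorted(alerts, key=lambda a: a["time"])
    if not ordered:
        return []
    start = ordered[0]["time"]
    # Offsets are measured from the first alert, so the candidate cause sits at 0.0.
    return [
        ((a["time"] - start).total_seconds(), a["source"], a["message"])
        for a in ordered
    ]


if __name__ == "__main__":
    # Made-up example data mirroring the S3 scenario described above.
    sample = [
        {"time": datetime(2024, 1, 1, 12, 0, 45), "source": "app", "message": "write failed"},
        {"time": datetime(2024, 1, 1, 12, 0, 0), "source": "s3", "message": "storage errors"},
    ]
    for offset, source, message in build_timeline(sample):
        print(f"+{offset:.0f}s  {source}: {message}")
```

Sorting by arrival time puts the storage alert ahead of the application errors it triggered, which is the distinction between cause and symptoms that the example is about.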
Furthermore, Moogsoft facilitates collaboration across teams. When a Situation occurs, all relevant cross-domain stakeholders are automatically notified by Moogsoft so that they can come together to communicate and collaborate around remediation. Whether insights are shared, command scripts are executed, or resolving steps are revealed, all activity in the Situation Room is archived for future reference when similar Situations are identified by Moogsoft.
With Moogsoft, operators gain a clear, concise and actionable workload, tailored to each user so that Incidents are resolved as quickly as possible. The result is a massive reduction in the Mean Time to Detect (MTTD), Mean Time to Resolve (MTTR), and overall business disruption.