I recently spoke with one of Moogsoft’s newer customers. Only a few minutes into the conversation, it became clear that AppDynamics was a key element of their monitoring stack. They felt that AppDynamics was adding clear value and were particularly fond of its easy setup and the out-of-the-box value of its automatic response time baselining capability.
For example, if a new deployment is pushed into production and someone forgets to apply an index on the database, several queries might take minutes to complete instead of seconds. AppDynamics will immediately detect this abnormality and send alerts about what business transactions were breached, along with what application and database nodes were impacted. These alerts generated by AppDynamics give you critical pieces of the puzzle to fully understand and fix the underlying issue.
The problem, however, is that an application has hundreds of components, so a single point of failure inevitably produces a storm of alerts. This is the reality of any application monitoring tool. According to this customer, when thresholds were breached, operators would often receive thousands of alerts, making it quite difficult to identify whether one or several issues were occurring and whether they were coming from the application, its transactions or the underlying nodes.
This customer purchased Moogsoft, in part, to de-duplicate and correlate AppDynamics alerts with the rest of their networking and infrastructure alerts generated from several of their other monitoring tools. Today, we are pleased to announce the general release of the AppDynamics LAM for Incident.MOOG, making the integration of both products work out-of-the-box for our mutual customers.
From conversations with several mutual customers, it became clear that they see three important use cases for the AppDynamics-Moogsoft integration:
- De-duplication of Alerts
- Correlation of Alerts with other network and infrastructure sources and toolsets
- Ability to drill into AppDynamics from Moogsoft while maintaining alert context
De-Duplication of Alerts
AppDynamics is quite intelligent in how it detects application anomalies. It can learn the normal response times of every metric in your application environment and provide deep visibility into the code execution so that you can get to probable cause faster. It also provides powerful visualizations to allow you to better understand the health and resource utilization of your applications.
Automatic baselining in the context of AppDynamics means that it can auto-configure the appropriate thresholds for each transaction related to an application. This is great, but if you have a single point of failure or slowdown in the application, then it’s entirely possible that you’ll get multiple alerts fired for every transaction, node or health rule that is breached.
For example, suppose login transactions normally take 2 seconds. What happens when the response time rises to 3 seconds? AppDynamics will identify this as anomalous and fire an alert, then fire another at each regular interval until the response time returns to its normal baseline.
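To make the 2-second login example concrete, here is a minimal Python sketch of baseline-driven alerting. It is an illustration under stated assumptions, not how AppDynamics computes its baselines: the mean-plus-three-standard-deviations threshold, the re-alert interval and the names `is_anomalous` and `alert_times` are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, observed, num_stdevs=3.0):
    """Return True if `observed` sits more than `num_stdevs` standard
    deviations above the mean of `history` (assumed baseline rule)."""
    threshold = mean(history) + num_stdevs * stdev(history)
    return observed > threshold


def alert_times(samples, history, realert_every=3):
    """Given (minute, response_seconds) samples, return the minutes at which
    an alert fires: once on a breach, then again every `realert_every`
    minutes while the breach continues. Returning to baseline resets it."""
    alerts = []
    last_alert = None
    for minute, value in samples:
        if not is_anomalous(history, value):
            last_alert = None  # back to baseline, so the next breach alerts at once
            continue
        if last_alert is None or minute - last_alert >= realert_every:
            alerts.append(minute)
            last_alert = minute
    return alerts


# Hypothetical login response times (seconds) that hover around 2 seconds.
login_history = [2.0, 1.9, 2.1, 2.0, 2.05, 1.95]

# A slowdown to about 3 seconds from minute 1 to 4, a recovery, then a relapse.
samples = [(0, 2.0), (1, 3.0), (2, 3.0), (3, 3.1), (4, 3.0), (5, 2.0), (6, 3.0)]

print(is_anomalous(login_history, 3.0))     # True
print(alert_times(samples, login_history))  # [1, 4, 6]
```

Even in this toy case, one sustained slowdown on one transaction yields several alerts. Multiply that by every transaction, node and health rule touched by the same failure and the alert storm described above follows.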
Think of Moogsoft as noise-canceling headphones for these application-related alerts. By leveraging our patent-pending natural-language-processing technology, Moogsoft can reduce alert noise by 99% via de-duplication and blacklisting of unwanted alerts, all in real time. We only surface the unique alerts that are related to an anomaly. In the case of this particular customer, they quantified the effect of Moogsoft’s de-duplication capability as a 10-fold increase in operator productivity!
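The basic idea of de-duplication can be sketched in a few lines of Python. This is a deliberately simple stand-in and not Moogsoft’s technique: it collapses repeats by an exact key match rather than by natural-language processing, and the `Alert` fields, the key choice and the sample events are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    source: str      # monitoring tool that raised the event
    node: str        # affected node
    check: str       # what was breached
    timestamp: int   # time of the most recent occurrence (seconds)
    count: int = 1   # how many raw events were folded into this alert


def deduplicate(events):
    """Collapse raw events that share (source, node, check) into a single
    alert that carries an occurrence count and the latest timestamp."""
    unique = {}
    for source, node, check, timestamp in events:
        key = (source, node, check)
        if key in unique:
            unique[key].count += 1
            unique[key].timestamp = max(unique[key].timestamp, timestamp)
        else:
            unique[key] = Alert(source, node, check, timestamp)
    return list(unique.values())


# Hypothetical raw events: the same login breach repeats three times.
events = [
    ("appdynamics", "app-node-1", "login response time", 100),
    ("appdynamics", "db-node-1", "slow query", 130),
    ("appdynamics", "app-node-1", "login response time", 160),
    ("appdynamics", "app-node-1", "login response time", 220),
]

for alert in deduplicate(events):
    print(alert.node, "|", alert.check, "| seen", alert.count, "time(s)")
```

Four raw events become two alerts here, with the repeat count preserved so the operator still knows the breach is ongoing.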
Alert Correlation across all Domains for Situational Awareness
AppDynamics is really good at detecting application performance deviations, but is that deviation caused by inefficient code? Or is it caused by infrastructure and/or network components?
Imagine that there’s a problem with a storage array, or router, or network device, or LDAP server, or firewall. The application will be impacted, even though it may not be the root cause of the problem.
When speaking with the aforementioned customer, they talked through an incident where a firewall went down. One operator was logged into AppDynamics, sifting through a storm of transaction alerts, application response time events, exception events, server connectivity events, etc. He told level-2 application support that they lost application server connectivity. Meanwhile, another operator was on the phone with their network provider telling them that the firewall had gone down. It took both operators 30 minutes to realize that they were investigating the same issue.
While all of the generated AppDynamics alerts were legitimate, they all related to something external and were merely symptoms of what the application was experiencing. The truth is that detecting a firewall issue using AppDynamics is tough. It helps to have full situational awareness of what is happening when AppDynamics fires an alert.
Moogsoft uses unsupervised machine learning algorithms to correlate AppDynamics alerts with the rest of your network and infrastructure to offer what we call Situational Awareness. By looking at AppDynamics alerts alongside the related network and infrastructure alerts correlated by Moogsoft, you now see the complete picture and can easily determine where the root cause is.
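As a rough illustration of cross-domain correlation, the Python sketch below groups alerts from different tools when they occur close together in time. Simple time-window grouping is swapped in here for the unsupervised machine learning described above, so treat it as a toy model only; the field names, the 120-second window and the sample alerts (loosely modeled on the firewall incident) are assumptions.

```python
def correlate_by_time(alerts, window_seconds=120):
    """Group alerts into candidate situations: an alert joins the current
    group if it arrives within `window_seconds` of that group's most
    recent alert, otherwise it starts a new group."""
    situations = []
    for alert in sorted(alerts, key=lambda a: a["timestamp"]):
        if situations and alert["timestamp"] - situations[-1][-1]["timestamp"] <= window_seconds:
            situations[-1].append(alert)
        else:
            situations.append([alert])
    return situations


# Hypothetical alerts from two tools during a firewall outage, plus one
# unrelated alert much later.
alerts = [
    {"source": "appdynamics", "text": "app server connectivity lost", "timestamp": 1030},
    {"source": "network", "text": "firewall down", "timestamp": 1000},
    {"source": "appdynamics", "text": "login response time breached", "timestamp": 1045},
    {"source": "infrastructure", "text": "disk usage high", "timestamp": 5000},
]

for number, situation in enumerate(correlate_by_time(alerts), start=1):
    sources = sorted({alert["source"] for alert in situation})
    print(f"situation {number}: {len(situation)} alert(s) from {sources}")
```

With the firewall alert and the AppDynamics alerts in the same group, the two operators in the story above would have been looking at one situation instead of two separate investigations.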
Drilling into AppDynamics from Moogsoft while Maintaining Alert Context
Operators can drill from the Moogsoft UI (the anomaly) directly into the AppDynamics UI, while maintaining the situation context (e.g. time and hostname). Moogsoft provides a full production workbench to run all of your tools in the context of a situation, automating the drill-down and troubleshooting process for the user. This saves a huge amount of time and helps to accelerate the remediation and resolution process.
To get more information about Moogsoft AIOps integration with AppDynamics, download the Moogsoft / AppDynamics Partner Solution Note.