Moogsoft’s CEO and CTO address how a new partnership with Datadog tackles the issues of pager fatigue and lost productivity.
Nobody will dispute that a common goal of DevOps pros and SREs, and really any company today, is to delight their customers more by disappointing them less. This was the theme of a recent live webinar focused on announcing a new game-changing partnership between Datadog and Moogsoft.
The live session featured remarks by Moogsoft CEO Phil Tee and CTO Dave Casper on bringing together the best of the two technologies through a new, seamless integration. They also showed what the integration offers customers of both companies in a live demo and audience Q&A.
“Telemetry is a great thing, but that’s a lot of data,” said Tee. “When you try to shortcut the proper steps to correlate logs, traces and metrics, what you end up with is pager fatigue. Teams are getting deluged with messages on Slack and everything else.”
He went on to explain how the new partnership aims to deepen the utility of the Datadog application for its customers. While Datadog has long been a reliable and valuable part of the stack, Moogsoft is adding best-in-class capabilities for deep correlation, automation and workflow.
“The two products together are a killer combination,” Tee quipped. “When you put these two products together, you will shrink the downtime in your applications.”
Seeing is Believing
Never content to let words act alone, the experts quickly set about demonstrating just how easily Datadog users could combine Moogsoft and Datadog, as Tee put it, “in the time it takes to make a cappuccino.”
During the demo, Casper showed how quickly a Datadog user can request a trial of the Moogsoft Observability Cloud from within the Datadog UI.
“It’s actually quite easy,” said Casper as he opened up a browser to set about the simple steps to connect the integration via the Datadog Marketplace. “It’s just a few steps that take really less than a minute to set this up. There’s a free trial and you get 14 days to try it out.”
On the Moogsoft side, he walked through the types of data that Moogsoft collects, including metrics, which are automatically put through an anomaly detection process at the instant of ingestion.
“If something becomes anomalous, we have a severity-based set of alerts that will happen with that,” said Casper. “With this, we combine actual alerts, and they come from multiple systems.”
“Typical users will have to look through all of these alerts and figure out which ones to make sense of. We take both these alerts and metrics and combine them through our easy-to-set-up correlation into what we call incidents.”
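To make the idea of anomaly detection at ingestion with severity-based alerts concrete, here is a minimal sketch in Python. It is not Moogsoft's algorithm, which the webinar does not detail; it assumes a simple z-score test against a short trailing window, and the function names, thresholds and window size are all hypothetical.

```python
from statistics import mean, stdev


def classify(value, history, warn=2.0, crit=3.0):
    """Label a new metric sample as 'clear', 'warning' or 'critical'.

    Assumption: a sample is anomalous when it sits several standard
    deviations away from the mean of recent history. The thresholds
    are illustrative, not taken from any product.
    """
    if len(history) < 2:
        return "clear"  # not enough data to judge yet
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a surprise.
        return "clear" if value == mu else "critical"
    z = abs(value - mu) / sigma
    if z >= crit:
        return "critical"
    if z >= warn:
        return "warning"
    return "clear"


def detect(samples, window=5):
    """Classify each sample as it arrives, using only earlier samples."""
    results = []
    history = []
    for value in samples:
        results.append(classify(value, history[-window:]))
        history.append(value)
    return results


if __name__ == "__main__":
    cpu = [10, 11, 10, 12, 11, 10, 30]
    for value, severity in zip(cpu, detect(cpu)):
        print(value, severity)
```

Each sample is judged the moment it arrives, using only the samples that came before it, which is the sense in which detection happens "at ingestion" in this sketch.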
Casper went on to show how incidents are fed into a dedicated console that displays the important and recent incidents users need to work through, helping them work more efficiently.
“It’s to reduce MTTR, to get people working efficiently,” said Casper. “Instead of getting multiple tickets that might have been worked on in isolation, you can actually have it all in one place.”
“We’re pulling in Datadog events, Datadog metrics, combining it with everything else, that goes downstream to the correlation and then we pop the incidents straight back into Datadog.”
The full demo, as well as an in-depth explanation by Tee of the algorithms at work in the Moogsoft Observability Cloud, is available in the on-demand version of the webinar.
The following is a recap of the audience Q&A that wrapped up the session.
How can this save time and money, and how does it help with reducing the number of tickets and on-call pages?
Tee: They are incredibly intimately related. The short answer to how you save money is actually part of the answer to the question about reducing on-call pages.
Fundamentally this is a productivity play. What we are doing here is using our AI to group together alerts which we believe are related to each other and causal. It’s a common thing that when something goes wrong, you don’t just get one alert. Typically you’ll get a scatter of data – hundreds of alerts and thousands of thresholds being broken.
If you don’t have something correlating those for you, each one of those potentially could end up in a call to action going out to an SRE. Of course, that creates noise and confusion. All of that leads to more downtime, more disruption, more disappointed customers. The top and bottom of that is that it is both cost and lost revenue.
How we solve this problem is by delivering actionability of the data producing pager fatigue, and getting people to the actual root cause. The way in which we do that correlation is AI.
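The grouping Tee describes, where a scatter of related alerts becomes one incident instead of many pages, can be illustrated with a small Python sketch. Moogsoft does this correlation with AI; the version below swaps in a much simpler rule purely for illustration: alerts from the same service that arrive within a fixed time window of each other are placed in the same incident. The Alert and Incident classes, the field names and the 300-second window are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point
    service: str      # hypothetical attribute used for grouping
    message: str


@dataclass
class Incident:
    service: str
    alerts: list = field(default_factory=list)


def correlate(alerts, window=300.0):
    """Group alerts into incidents by service and time proximity.

    Rule-based stand-in for real correlation: an alert joins the most
    recent incident for its service if it arrives within `window`
    seconds of that incident's latest alert; otherwise it opens a new
    incident.
    """
    incidents = []
    latest_by_service = {}
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        incident = latest_by_service.get(alert.service)
        if (incident is not None
                and alert.timestamp - incident.alerts[-1].timestamp <= window):
            incident.alerts.append(alert)
        else:
            incident = Incident(service=alert.service, alerts=[alert])
            incidents.append(incident)
            latest_by_service[alert.service] = incident
    return incidents


if __name__ == "__main__":
    raw = [
        Alert(0, "db", "connection pool exhausted"),
        Alert(60, "db", "query latency high"),
        Alert(90, "web", "5xx rate high"),
        Alert(1000, "db", "replica lag"),
    ]
    for incident in correlate(raw):
        print(incident.service, len(incident.alerts))
```

In this example four alerts collapse into three incidents, so an on-call engineer would see three items rather than four. With the hundreds of alerts Tee mentions, the reduction would depend entirely on how well the correlation rule matches the real relationships between alerts.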
How is this new integration different from the one that was previously available between Datadog and Moogsoft?
Casper: The high-level theme is sharing insights across systems where the unit of currency is an incident. There’s quite a lot of flexibility that offers a workflow of choice, but the main idea is this: the existing integration you have now would have been pulling data via an API from Datadog into Moogsoft to apply the analysis and come up with incidents that are then sent off to an app like ServiceNow.
This new integration is additionally sharing that incident back to Datadog so that everyone is really sharing the same insights, and there is flexibility — but at the end of the day it’s all just shared.
How can you bring other data sources into Moogsoft that aren’t covered by Datadog?
Casper: In Moogsoft, it’s as easy as clicking on our integrations tab. In this, there are three ways of getting data into Moogsoft. The main goal here is to get as much data into the analytics as possible to get the best outcome.
First, we have an agent that’s really easy to install in about three seconds and automatically starts discovering systems and anomalies across them. It also has its own SDK to extend it to anything else you want.
Second, there’s a large list of integrations built off the shelf. For instance, with all 88 AWS services.
Third, you can send data to us. We have an API for metrics, one for events, and you can also extend the Moogsoft REST API to create your own new integration.
Is there a sweet spot on where the integration best fits? Are there any limitations?
Casper: We certainly are able to handle, and want to handle, the data you have, but we have learned through working with many customers that the best idea is not to take a year to spin up an instance using professional services for a big bang.
What we recommend is to start small with one SRE or DevOps team, get the data in quickly and start getting immediate results. That helps start a snowball effect that can help grow adoption.
So, the sweet spot is to send us whatever data you have to quickly start getting value, then we can grow together.
Do I have to be an existing Datadog customer to access Moogsoft?
You don’t. If you want to get it through the marketplace then you can certainly do that, however of course both our products are available separately, whichever way you want to purchase it.
The full webinar, Datadog Expands Your Monitoring Reach with Moogsoft Observability Cloud, is available to watch on demand. Also sign up for a free trial of Moogsoft Observability Cloud and see first-hand what intelligent observability can do for you!
About the author
David is Moogsoft's Director, PR and Corporate Communications. He's been helping technology companies tell their stories for 15 years. A former journalist with the Sacramento Bee, David began his career helping the Bee's technology desk understand the rising tide of dot-com PR pitches clouding journalists' view of how the Internet was to transform business. An enterprise technology PR practitioner since his first day in the business, David started his media relations career introducing Oracle's early application servers and developer network to the enterprise market. His experience includes client work with PayPal, Taleo, Nokia, Juniper Networks, Brocade, Trend Micro and VA Linux/OSDN.