Last month, Monitorama held its fifth Open Source Monitoring Conference & Hackathon in Portland, OR.
From dealing with a widespread power outage that made the whole conference relocate, to inhaling Blue Star Donuts (#BlueberryBourbonBasil was the best) on the last day, Monitorama was a great event, filled with many surprises.
The content of the show was just as good as that doughnut, too — I learned to hate the #oncallselfie with Alice Goldfuss (@alicegoldfuss), learned from the mistakes made by The Vasa with Pete Cheslock (@petecheslock), and with Charity Majors (@mipsytipsy) I learned of the seismic shift happening in monitoring today, and how the tools of the future must be interrogatory, exploratory, and made for an unpredictable world.
This conference was also the perfect event for a Moogsoft survey because — shocker — everyone we spoke to works in the monitoring world. So how did we get the attendees to take our survey? Simple. We offered them a chance to win a Star Wars X-Wing from Lego.
To give you a sense of who we surveyed, here are some of the titles held by those who contributed to our survey: SRE and DevOps Engineers, Systems Engineers, Cloud Engineers, and Technical Architects (we plied them with Bluetooth speakers). So what did these monitoring folks tell us?
Here are our key findings:
- The top 3 monitoring challenges are alert noise / fatigue / volume, collaboration across teams, and alert correlation across all tools.
- The average level of alert/event volume per month most commonly cited by people we surveyed was in the hundreds.
- On a scale of 1-10 — 10 being the most proactive company ever — most companies said they are a 7.
- The top 3 most used monitoring tools are Logstash (from the Elastic Stack) and Nagios, with SolarWinds and New Relic tied for third.
The Most Interesting Fact
The most interesting piece of data we picked up was that 66.67% of those surveyed have 5-10 monitoring tools in place, and yet 61.90% of them are still struggling with alert noise / fatigue / volume.
This raises the question: Are their tools really working?
It gets even more interesting. When we asked them — on a scale of 1-10, 1 being the most reactive company ever, 10 being the most proactive company ever — where they ranked their companies, over 50% said their company is fairly proactive (7) when it comes to alert/event management.
But we have a problem here: Referring back to their answers to the first question, how can these companies be fairly proactive and still be drowning in alerts? Can they really be both?
As you ponder that conundrum, have a look at the rest of the survey questions.
Over 76% of participants said that they don’t use an event manager / manager-of-managers (MoM) platform. Could this be because the current tools, like IBM Netcool, HP OMi, SCOM and others, aren’t helping with their problems of alert noise?
It’s no surprise to see New Relic, AppDynamics and Dynatrace at the top. Why? Because they’re the leaders in Gartner’s Magic Quadrant for Application Performance Monitoring Suites for 2017.
SolarWinds crushed all of the other network monitoring tools, which aligns with Gartner’s assessment that it’s a “challenger” in the network monitoring space. It’s also interesting to note that all three of Gartner’s referenced “leaders” in the space were non-existent in our Monitorama survey results — not one respondent was using NetScout, Viavi, or Riverbed.
So this made me wonder: Do network teams really attend events like Monitorama? Or are they stuck in silos?
As expected, survey respondents listed Nagios and SolarWinds as, respectively, the first and second most used infrastructure monitoring tools. But the most fascinating thing about this question was that over 28% of respondents said they didn’t have any monitoring tool for their infrastructure. Is infra monitoring a thing of the past?
Logstash, part of the Elastic Stack, beat out Splunk and every other log monitoring tool among the surveyed Monitorama attendees. It’s important to note that the survey respondents who said they have Splunk also told us that it’s getting “too expensive” to maintain.
Over 71% of respondents said that they don’t use a synthetic monitoring tool. Why was there a serious lack of synthetic tools at Monitorama?
Like most of our customers, the majority of the participants at Monitorama use PagerDuty to notify the right teams.
It’s interesting that the dinosaurs, HP & BMC, were nowhere to be seen. Is it safe to say that the old way of ticketing is dying out?
Zero surprise here — almost all of our respondents use Slack to communicate internally. (Personal Note: Monitorama had the best Slack channel I’ve ever been a part of.)
Every language I asked about was being used by the companies at Monitorama. It should be noted that many respondents also wrote in “Perl.” So it’s still alive and kickin’!
It’s no surprise that AWS dominated among cloud technologies at Monitorama; it’s exactly the right crowd. Monitorama and Gartner confirm what we already know: AWS and Microsoft are the cloud leaders, by far.
The companies represented at Monitorama clearly have best-of-breed monitoring tools (AppDynamics, New Relic, Nagios, Splunk), yet over 60% of them are still struggling with alert volumes.
This fits perfectly with the theory that most companies are still dealing with a siloed production stack: the APM tool isn’t talking to the NPM tool, and the log monitoring tool isn’t talking to the synthetic monitoring tool. None of these tools are working together to combat the huge amount of alert noise that these companies are suffering from.
These companies need to break down the barriers, smash the silos, and connect their entire monitoring ecosystem.