This is the twelfth chapter in The Observability Odyssey, a book exploring the role that intelligent observability plays in the day-to-day life of smart teams. In this chapter, our SRE, Dinesh, brings together his colleagues with an interest in AI.
There’s an old adage in DevOps: ‘Give the developers a pager and put them on call’. As an SRE I find this interesting. For one thing, we don’t have pagers anymore; we all have smartphones, so this seems like something prehistoric. For another, are we trying to punish our developers? Slap them into learning something? It doesn’t sit well with me. I’ve been both a developer and an IT ops guy - it’s what, I’m told, makes me a perfect SRE. In our new world, we build it, we own it. We’re aiming for autonomous, multifunctional teams - not silos handing off to one another.
Anyway, rant over. For now. I mention all of this because Derek, the new interim CIO, said it to me and James the other day. I actually thought James might explode. He was showing Derek and the CEO the research to support our investments in AIOps (or predictive analytics?) and there was a lot of nodding up until that point. Let’s just put it this way: the pair of us are very hopeful that Derek is very much interim and that someone else arrives who’s somewhat more forward-looking. James, for example, who has become remarkably protective of his developer friends, and I applaud that.
In our ‘debrief’ (read: three beers each and a lot off our chests) I gave myself an action item to set up a community of practice. We already have a few for agile, DevOps and cloud and they’re going quite well. I’ve dropped into them and what impressed me was that a bunch of people are interested enough to voluntarily turn up and make things happen. They’re a little bit underground at the moment, not CXO-approved if you like, but the middle-management are supportive. I like anything a bit subversive and this smacks of the grassroots movement we need. If we’re going to be successful in this disrupted digital economy, we have to have autonomy. We can’t exist in a command and control culture and be empowered to participate in change. We’re taking the reins.
My community of practice, of course, is going to be AI. What I’m hoping is to tap into a whole host of humans I didn’t know shared my interest and also bolster James’ run for president. I mean CIO. I think he’d be amazing. That guy knows how to get stuff done. He’s not bothered about being everyone’s mate. He takes no prisoners. And he really, genuinely cares about doing things right and can see the future. He has some serious vision and he inspires me. I like working with him and if he gets that role, it could mean great things for me too, as we are genuine buddies and I know he likes the way I work too.
So when he said the AI community of practice was a good idea, I went all-in on it on the strength of his blessing. Here’s how our inaugural get-together went, two weeks after I posted the plan and date across a bunch of company Slack channels and told everyone within earshot what I was up to.
It was an experiment, so we had a hypothesis. A simple one: if we build it, they will come. We said twenty people would turn up to the meeting. We made it a Friday afternoon in the office and I forked out for donuts. We were going for the fun and friendly vibe. Forty-eight people turned up, so the experiment was considered a success.
We should probably have had a hypothesis around outcomes too, and I was kicking myself for not having one, as it’s become a bit of a mantra for me and James: outcomes over outputs. The number of people is an output. I had a bunch of topics I thought we could talk about, but I also wanted to let the community decide what they wanted. I wanted it to be self-organizing.
It’s not that I regret that decision, but we did go in a direction I wasn’t anticipating and wasn’t prepared for. Enter Duena. First impression? I was a little scared, I’m not embarrassed to say. Also, how have I not met Duena before? She is a force to be reckoned with. And obsessed with AI. I feel like I’ve met my match and, let’s put it this way, not all of my thoughts are strictly professional.
Duena came along with an axe to grind, and the donuts weren’t calming her down. What was bugging her was PagerDuty. Specifically, how she felt it was still too noisy, despite the salesperson there telling her their event intelligence was sorting out the constant stream of alerts she was receiving.
This was why I hadn’t met Duena before. She works on one of our very recently acquired products and this was her first time in our new offices. She’d come in especially for this. I don’t mind saying I was flattered. But it’s not like it was about me, it was about the topic.
I’m getting off track. We don’t use PagerDuty anywhere else in C&Js so this was new. Did I tell you she’s also an SRE? It just gets better and better. Even better when a quick Google showed me I had a prime opportunity to gallop in like a white knight on a stallion. I have a feeling she probably wouldn’t like that analogy so I’ll keep it to myself. When she looked like she’d set out the fullness of her complaint, I stepped in.
“Duena,” I said, such a nice name, I’d never said it before. “We’ve been experimenting with an AIOps tool, Moogsoft, here for quite some time now. Way beyond experimentation really; we’ve got quite the portfolio of outcomes and evidence for the improvements it’s made.” James, sitting next to me, nodded. “When Moogsoft surfaces an incident, it can send it to PagerDuty in real time. Based on the insights Moogsoft’s algorithms derive from the underlying data, PagerDuty knows which teams and people need to take action and which need to be kept up to date.
“Users have context and can acknowledge and escalate and can add comments and notes directly from PagerDuty. Moogsoft AIOps’ Situation Room allows all users to share a consistent view, while both platforms stay in sync throughout the lifecycle of the incident. Once the incident is resolved, PagerDuty streamlines post-mortems to speed up future response, by leveraging Moogsoft’s historical knowledge of prior, similar incidents. It sounds like we could make this experience a lot better for you. Really get that noise out of the way.”
“Sounds great,” she said, putting her glasses on and opening her laptop.
“No time like the present!” I said. I liked this. This is how I work too. I got my laptop up and running as well. “So the Moogsoft and PagerDuty integration enables bidirectional communication between Moogsoft and PagerDuty. The integration allows you to send events to PagerDuty from Moogsoft Enterprise alerts and Situations. Each event relates to a service that creates a PagerDuty incident. Notes you add to a PagerDuty incident appear in Moogsoft Enterprise as collaboration posts in related Situations. Likewise, posts you add to a Situation appear in the PagerDuty incident notes.”
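As an aside, here is a rough sketch of what sending a Situation to PagerDuty as an event could look like. It builds a trigger event in the shape the PagerDuty Events API v2 expects (`routing_key`, `event_action`, `dedup_key`, and a `payload` with `summary`, `source` and `severity`). The Situation fields, the severity mapping and the function name are assumptions made for illustration; the real integration handles this internally and may well do it differently.

```python
import json

# Hypothetical mapping from Moogsoft-style severity names to the four
# severities the PagerDuty Events API v2 accepts.
SEVERITY_MAP = {
    "critical": "critical",
    "major": "error",
    "minor": "warning",
    "warning": "warning",
    "clear": "info",
}


def situation_to_pagerduty_event(situation, routing_key):
    """Build an Events API v2 'trigger' event from a Situation-like dict."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # Reusing the Situation id as the dedup_key keeps repeat sends
        # attached to the same PagerDuty incident.
        "dedup_key": f"moogsoft-situation-{situation['id']}",
        "payload": {
            "summary": situation["description"],
            "source": situation["service"],
            # Unknown severities fall back to the least urgent level.
            "severity": SEVERITY_MAP.get(situation["severity"], "info"),
        },
    }


# Illustrative Situation; the field names are assumptions, not Moogsoft's schema.
example = {
    "id": 42,
    "description": "Checkout latency above threshold",
    "service": "checkout-api",
    "severity": "major",
}
event = situation_to_pagerduty_event(example, routing_key="YOUR_INTEGRATION_KEY")
print(json.dumps(event, indent=2))
```

The sketch only builds the event body. Actually delivering it would mean an HTTP POST to the Events API, which is left out here so the example runs without credentials.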
“That all sounds straightforward,” Duena said. “I’m assuming you’ll install this as the Moogsoft admin?”
“Yes, I’ll drive it but we both have things to do. Do you have the user role and an API User Token for it? We need that to query services for integrations and to create integrations and extensions via the API.”
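For the curious, a minimal sketch of how an API User Token might be used to query services through the PagerDuty REST API. It prepares a `GET /services` request with the token in the `Authorization` header but does not send it. The function name and the placeholder token are assumptions for illustration, not part of the integration itself.

```python
from urllib.request import Request

PAGERDUTY_API = "https://api.pagerduty.com"


def build_list_services_request(api_token):
    """Prepare (but do not send) a request that lists PagerDuty services."""
    return Request(
        f"{PAGERDUTY_API}/services",
        headers={
            # PagerDuty REST API tokens go in this header format.
            "Authorization": f"Token token={api_token}",
            "Accept": "application/vnd.pagerduty+json;version=2",
        },
        method="GET",
    )


# Placeholder token; a real one would come from the PagerDuty user settings.
request = build_list_services_request("YOUR_API_USER_TOKEN")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would perform the call. It is omitted so the sketch runs without a real token.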
“Check,” said Duena.
“You also need to have the responder team role to create, acknowledge, and resolve incidents.
“I’ll add you as a user to Moogsoft and give you a role that can access alerts and make sure the situation tools have the "moolet_informs" permission.”
“Received and logging in,” she confirmed. This lady is fast. I spun my laptop around and moved closer so she could see the screen.
“These integrations are so fast. First, we go to the Integrations tab. Then find PagerDuty in the notification and collaboration section. It needs a unique integration name - we’ll just use the default. Now you add the connection details for your PagerDuty system and we’ll configure the service mappings.” As she leaned in to use my keyboard, our shoulders touched for a moment.
“I’m just going to enable it to automatically create integrations and extensions and see what happens. We can tweak it later, right?” She looked at me as she asked. My heart skipped a beat.
“Of course. The integration creates extensions and integrations with the name "Moogsoft-Integration" and updates the URL of any existing extensions of the same name to this value. You should find this sorts out your noise problem and you’ll probably end up finding out a lot more about your systems too. I’d love to hear how you get on. Maybe at our next community of practice meeting?”
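The create-or-update behaviour Dinesh describes can be sketched as a small pure function: any existing extension named "Moogsoft-Integration" has its URL updated, and one is created if none exists. The dict shape, the `endpoint_url` field and the example URLs are assumptions for illustration. This is not the integration's actual code.

```python
INTEGRATION_NAME = "Moogsoft-Integration"


def upsert_extension(extensions, endpoint_url):
    """Point every extension named INTEGRATION_NAME at endpoint_url,
    adding a new one if no extension with that name exists."""
    updated, found = [], False
    for ext in extensions:
        if ext["name"] == INTEGRATION_NAME:
            # Copy the extension and update only its URL.
            updated.append({**ext, "endpoint_url": endpoint_url})
            found = True
        else:
            updated.append(ext)
    if not found:
        updated.append({"name": INTEGRATION_NAME, "endpoint_url": endpoint_url})
    return updated


# Hypothetical starting state: one unrelated extension, one stale Moogsoft one.
existing = [
    {"name": "Other-Tool", "endpoint_url": "https://other.example.com/hook"},
    {"name": INTEGRATION_NAME, "endpoint_url": "https://old.example.com/hook"},
]
result = upsert_extension(existing, "https://moogsoft.example.com/hook")
print(result)
```

Returning a new list rather than mutating the input makes it easy to compare before and after, which suits Duena's plan to enable it first and tweak it later.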
It turned out that wasn’t going to be soon enough. We’re going to dinner Saturday night. She wants to know more about my AI Ph.D. I want to know more about her.
About the author
Helen Beal is a DevOps and Ways of Working coach, Chief Ambassador at DevOps Institute and an Ambassador for the Continuous Delivery Foundation. She provides strategic advisory services to DevOps industry leaders and is an analyst at Accelerated Strategies Group. She hosts the Day-to-Day DevOps webinar series for BrightTalk, speaks regularly on DevOps topics, is a DevOps editor for InfoQ and also writes for a number of other online platforms. Outside of DevOps she is an ecologist and novelist.