Human bottlenecks are a core challenge in IT incident management. How is the role of the service desk ticket changing in 2017?
In the spirit of controversial blog posts, I pose the question: what is the point of a service desk ticket in 2017?
Enterprise IT orgs have incident management processes built around service desk tickets, yet, when you ask why that’s the case, the answer is something like, “compliance requires us to track service requests and disruptions” or “it’s the way we’ve always done it.”
In conversations over the last several months at conferences like CiscoLive, Monitorama, and Velocity, as well as with customers and prospects, it’s apparent that people are feeling constrained by their ticketing systems and are starting to question why they use service desk tickets in the first place.
In this blog post, I will explore why the service desk ticket is so fundamental to traditional IT Incident Management, how the role of the service desk ticket is changing, and the alternative options that enterprise IT organizations are now investigating.
Why the Service Desk Ticket is Fundamental to Traditional IT Incident Management
To understand the service desk and the trouble ticket, we need to understand ITIL. ITIL was created in the 1980s by the Central Computer and Telecommunications Agency (CCTA), a British government agency later folded into the Office of Government Commerce (OGC), to provide a strict set of processes across areas like Incident Management, Change Management, and Problem Management. As a trusted and structured system that has been followed for decades, ITIL still influences most enterprise IT orgs today.
ITIL defines the service desk as a primary function of Incident/Service Management, serving as the single point of contact for users to interact with IT staff. When an incident occurs (incident = unplanned disruption of service), the service desk is contacted by users, and the support staff are then responsible for addressing the incident based on existing service models or escalation protocol.
To avoid constant back-and-forth email and loss of context as incidents escalate across teams, ticketing systems became the institutionalized ‘Systems of Record’ for Incident Management. The ITIL workflow for Incident Management goes as follows:
- Identify Incident
The service desk learns about a disruption by various means – phone, email, chat or even automated notifications from monitoring tools. The support staff then identifies that the disruption is indeed an incident, as opposed to a request or change proposal.
- Create Ticket
The service desk staff creates a ticket with all details of the incident: user’s name, date, time, description, etc. Tickets are also often created automatically when certain alert types fire.
- Categorize Ticket
The service desk staff then adds metadata to the ticket by determining the appropriate category and sub-category, for example, the “Storage” category and the “Disk-full” sub-category.
- Prioritize Ticket
Based on the severity of impact to users and the urgency of the issue, the ticket is prioritized. Depending on details like the allocated priority of an incident (P1, P2, P3, P4) and its criticality (Minor, Major, Critical), each organization treats the ticket differently, with its own escalation and resolution process.
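The create, categorize, and prioritize steps above can be sketched in a few lines of Python. This is only an illustration, not the data model of any particular ticketing product: the `Ticket` fields, the three-level impact and urgency scales, and the `PRIORITY_MATRIX` mapping are all assumptions made for the example, and every organization defines its own matrix.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical impact x urgency matrix. The P1-P4 labels come from the text;
# the mapping itself is an assumption for this sketch.
PRIORITY_MATRIX = {
    ("high", "high"): "P1",
    ("high", "medium"): "P2",
    ("medium", "high"): "P2",
    ("high", "low"): "P3",
    ("medium", "medium"): "P3",
    ("low", "high"): "P3",
    ("medium", "low"): "P4",
    ("low", "medium"): "P4",
    ("low", "low"): "P4",
}


@dataclass
class Ticket:
    """A minimal ticket record with illustrative fields."""
    user: str
    description: str
    category: str
    sub_category: str
    impact: str   # "high", "medium", or "low"
    urgency: str  # "high", "medium", or "low"
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def priority(self) -> str:
        # Look up the priority from impact and urgency.
        return PRIORITY_MATRIX[(self.impact, self.urgency)]


ticket = Ticket(
    user="jdoe",
    description="Volume /data is full",
    category="Storage",
    sub_category="Disk-full",
    impact="high",
    urgency="medium",
)
print(ticket.priority)  # P2
```

Even in this toy form, every field has to be filled in and judged by someone before response can begin, which is where the human effort goes.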
Now that the ticket exists, it’s time to respond to it. The ITIL workflow for incident response is as follows:
- Diagnose – Identify the type of incident and the resources required to resolve it
- Escalate – If advanced support is needed, escalate to the appropriate teams
- Diagnose – Re-investigate the incident and the resources required to resolve it
- Resolve – Execute resolution steps and restore service
- Close – Once restoration of service is confirmed, close the ticket
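The response workflow above behaves like a small state machine. The sketch below is one hedged way to model it in Python; the `State` names and the `TRANSITIONS` table are assumptions chosen to mirror the list, not part of any ITIL specification or ticketing tool.

```python
from enum import Enum


class State(Enum):
    DIAGNOSING = "diagnosing"
    ESCALATED = "escalated"
    RESOLVED = "resolved"
    CLOSED = "closed"


# Allowed moves between states (an assumption for this sketch).
TRANSITIONS = {
    State.DIAGNOSING: {State.ESCALATED, State.RESOLVED},
    State.ESCALATED: {State.DIAGNOSING},  # the new team re-investigates
    State.RESOLVED: {State.CLOSED},       # close once restoration is confirmed
    State.CLOSED: set(),
}


def advance(current: State, target: State) -> State:
    """Return the target state, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


# Walk one incident through the full workflow, including an escalation.
history = [State.DIAGNOSING]
for step in (State.ESCALATED, State.DIAGNOSING, State.RESOLVED, State.CLOSED):
    history.append(advance(history[-1], step))

print(" -> ".join(s.value for s in history))
# diagnosing -> escalated -> diagnosing -> resolved -> closed
```

Each arrow in that output is a handoff that a person has to perform and record.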
The Problem with Service Desk Tickets
What do Formula 1 and IT Incident Management have in common? Detecting and responding to issues at speed is crucial to success. Like IT Ops and DevOps, Formula 1 staff have monitoring tools that provide real-time telemetry around the performance of what matters most. But when something breaks, they don’t take a few minutes to create a ticket. They react at the scene to resolve the issue, and then they debrief afterward to figure out how to prevent it from recurring.
The fundamental flaw with service desk tickets is that they are constrained by people. It’s up to humans to understand the full context of an incident, escalate it to the right people, and document everything along the way. The modern service desk ticket is simply a series of completed fields and a history of an incident’s life cycle, without any real context. As Kalyan Kumar, CTO of HCL Technologies, says in the AIOps Fireside Chat, “Ticketing systems are just multi-user Excel sheets that send emails.”
So why has this worked so well in the past?
Ten to fifteen years ago, most businesses weren’t software businesses; they were brick and mortar. The number of applications, the size of the infrastructure, and the number of teams across an organization were all small compared to today. Because of that simplicity, you could properly manage incidents across an enterprise with tickets.
Today, every business is a software business, and ‘Enterprise IT’ now means tens of thousands of apps, hundreds of thousands of servers, and change occurring faster than humans can react. Keeping track of service disruptions with tickets and humans is a problem. We know this because we hear it from enterprises across verticals.
As an example, consider the ticket metrics from a Fortune 100 enterprise that I recently met with.
The reality is that this organization creates a ticket for any potential sign of service disruption, whether or not one is needed. Once a ticket is open, it needs to be closed, which means it is investigated, escalated, and triaged repeatedly. This becomes a serious issue when every potential disruption turns into a ticket, 99% of the tickets are non-actionable, and 30% are duplicates.
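To make those two percentages concrete, here is a rough sketch of how duplicate and non-actionable rates could be measured from a ticket export. The ticket records are toy data invented for this example (they are not the Fortune 100 organization’s figures), and treating two tickets as duplicates when they share a `host` and `check` is a simplifying assumption.

```python
from collections import Counter

# Toy tickets, invented purely for illustration.
tickets = [
    {"id": 1, "host": "db01", "check": "disk-full", "actionable": True},
    {"id": 2, "host": "db01", "check": "disk-full", "actionable": False},
    {"id": 3, "host": "web07", "check": "cpu-high", "actionable": False},
    {"id": 4, "host": "db01", "check": "disk-full", "actionable": False},
    {"id": 5, "host": "web09", "check": "ping-loss", "actionable": False},
]


def ticket_noise(tickets):
    """Return the share of duplicate and non-actionable tickets."""
    if not tickets:
        return {"duplicate_rate": 0.0, "non_actionable_rate": 0.0}
    seen = Counter()
    duplicates = 0
    for t in tickets:
        key = (t["host"], t["check"])  # assumed duplicate fingerprint
        if seen[key]:
            duplicates += 1  # an earlier ticket already covers this issue
        seen[key] += 1
    non_actionable = sum(not t["actionable"] for t in tickets)
    total = len(tickets)
    return {
        "duplicate_rate": duplicates / total,
        "non_actionable_rate": non_actionable / total,
    }


print(ticket_noise(tickets))
# {'duplicate_rate': 0.4, 'non_actionable_rate': 0.8}
```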
The organization wastes human resources, addresses incidents reactively, and sees its service quality take a hit.
How Will Organizations Track Incidents and Workflow Without Tickets?
Unlike a service desk ticket, an IT incident is a living and breathing thing. With the introduction of Artificial Intelligence and Machine Learning to IT Incident Management, humans are no longer the bottleneck. Service disruptions can be automatically detected from patterns across event and alert messages. End-user calls to the service desk can be automatically captured and tied to related event and alert messages for greater context.
Improved context means that non-actionable or false messages are closed, duplicates are thrown away, and subject matter experts are automatically inferred and notified, eliminating the need for manual and unnecessary escalations.
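As a rough illustration of grouping related alerts into a single incident, the sketch below uses a simple rule: alerts for the same service that arrive within a time window of each other belong together. This rule-based stand-in is an assumption made for the example. It is not how any particular AIOps platform works (those rely on machine learning over far richer signals), and the `Alert` and `Incident` types and the 300-second window are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point
    service: str
    message: str


@dataclass
class Incident:
    service: str
    alerts: list = field(default_factory=list)


def correlate(alerts, window=300.0):
    """Group alerts per service when they arrive within `window` seconds."""
    incidents = []
    latest = {}  # most recent incident per service
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        incident = latest.get(alert.service)
        gap_too_large = (
            incident is not None
            and alert.timestamp - incident.alerts[-1].timestamp > window
        )
        if incident is None or gap_too_large:
            # Start a new incident for this service.
            incident = Incident(service=alert.service)
            incidents.append(incident)
            latest[alert.service] = incident
        incident.alerts.append(alert)
    return incidents


alerts = [
    Alert(0, "storage", "disk 90% full"),
    Alert(60, "storage", "disk 95% full"),
    Alert(90, "web", "5xx spike"),
    Alert(1000, "storage", "disk full"),
]
for inc in correlate(alerts):
    print(inc.service, len(inc.alerts))
# storage 2
# web 1
# storage 1
```

Four alerts collapse into three incidents here; with a traditional workflow, each alert could have become its own ticket.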
Gartner refers to this category of technology as Artificial Intelligence for IT Operations (AIOps). Leading enterprise IT organizations are now investigating AIOps platforms as a way to develop more agile service desks and eliminate bottlenecks across teams.
If you had said 10 years ago that you wanted to run your business on the cloud, you would have been fired because of compliance restrictions. Today, everyone is moving to the cloud. Tickets aren’t so different…
What Will Happen to the Service Desk?
Based on my conversations with IT professionals this year, I’m predicting a dramatic shift in the role of service desk tickets in enterprise incident management. That being said, some might be wondering: will the service desk ticket ever go away?
The answer is no. Auditors will always exist, and they need an audit trail. Due purely to compliance requirements, large IT organizations will always use ticketing systems as their Systems of Record, just not as their Systems of Engagement.
About the author
Sahil Khanna is a Sr. Product Marketing Manager at Moogsoft, where he focuses on the emergence of Algorithmic IT Operations. In his free time, Sahil enjoys banging on drums and participating in high-stakes bets.