When you think about Event lists, Event consoles and the so-called single pane of glass idea, be cognizant of a behavioral science called Situation Awareness.
Think about this.
- Every morning, you have a one-hour commute to work.
- You quickly check your calendar for any pending meetings or appointments for the day.
- Then you take a look at the weather and traffic report for your local area.
What you’re discerning is any potential issue that could keep you from having a smooth and timely commute.
You hop in your car and head out knowing what the weather is like and whether there are traffic slowdowns and pressure points. Along the way, you may check the traffic report on a local radio station to see if conditions have changed. You adjust your route accordingly. What you develop in your mind is a picture of your commute situation. You stay aware of changes in conditions so that you can adapt and minimize the impact of those changes. Monitoring and managing networks is not much different.
In the model that Dr. Mica Endsley has identified, there are three parts to Situation Awareness:
- Level 1: Perception. Perceive the status, attributes and dynamics of the elements in your environment; recognize changes and cues, and be aware of the current states.
- Level 2: Comprehension. Group the disjointed elements using pattern recognition, interpretation and evaluation, and develop a comprehensive picture of what’s going on.
- Level 3: Projection. Project future actions of the elements in your environment by perceiving, comprehending and pulling the Situation forward in time to determine future states and the cause and effect of actions.
In IT Support, situation awareness is dependent upon instrumentation and getting the data transformed into information and subsequently, knowledge.
IT Application of Situation Awareness
Within the context of your organization, there are several types of Situation Awareness to consider.
Think of the contexts where situation awareness can be applied as a behavior:
- As an individual
- As a team
- As an organization
In the context of the battlefield, individual Situation Awareness is enabled by the Heads Up Display (HUD) in a modern fighter jet. Within the HUD, you also see your team as they move and engage. And from battlefield management, you see the area threats, resources, and localities.
Management Domain Situational Awareness
When you look at the products in the realm of Event Management in IT Operations, you can develop your own Situational Awareness related to how these products perform.
Here are a few questions you need to ask yourself:
- Is there significant delay from the time something happens until the time you see it?
- Do you only see cues or events that are very select?
- Do you wholesale ignore events that are not in the incident list?
- Are you missing visibility in your environment?
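The first and last of these questions can be put into numbers if your events carry two timestamps. Here is a minimal sketch, assuming hypothetical event records with an `occurred` time from the source and a `displayed` time from the console (the field names and sample data are made up for illustration, not taken from any product):

```python
from datetime import datetime
from statistics import median

# Hypothetical event records: when the condition occurred at the source
# and when it showed up in the console (None = never shown).
events = [
    {"id": "e1", "occurred": "2024-01-01T08:00:00", "displayed": "2024-01-01T08:00:20"},
    {"id": "e2", "occurred": "2024-01-01T08:05:00", "displayed": "2024-01-01T08:09:00"},
    {"id": "e3", "occurred": "2024-01-01T08:10:00", "displayed": None},
]

def visibility_report(events):
    """Summarize display delay and list events that never reached the console."""
    delays = []
    unseen = []
    for e in events:
        if e["displayed"] is None:
            unseen.append(e["id"])
            continue
        occurred = datetime.fromisoformat(e["occurred"])
        displayed = datetime.fromisoformat(e["displayed"])
        delays.append((displayed - occurred).total_seconds())
    return {
        "median_delay_s": median(delays) if delays else None,
        "max_delay_s": max(delays) if delays else None,
        "unseen": unseen,
    }

report = visibility_report(events)
print(report)
```

A long maximum delay or a non-empty `unseen` list is a hint that you may not even have Level 1 (Perception) for part of your environment.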
Given the aforementioned definitions, what level of Situation Awareness do you think your support organization achieves? Don’t be surprised if you don’t even have Level 1.
There are many support organizations that are blind to events and tend to wait for the phone to ring before they respond. And ironically, some spend thousands of dollars to get there!
Temporality, Discrete Events versus Situations
Situation Awareness is not about discrete events in time.
- Discrete events can confuse and obfuscate situations, as there is little concept of association beyond when an event is received and processed
- Situation awareness is about ascertaining what is going on, responding to the situations, and staying on top of the result
- Situation awareness focuses on workflows and processes that occur over a time period.
Because event displays are typically presented in time slices, some of the awareness of the time domain can be masked from users. Situations LIVE within a time domain natively.
Take a good look at your events display. Is this the events display that your Level 1 personnel work from? What do you see?
- How different is your event list from all events?
- Are you only looking at a discrete list of events? All too often, it is easy to narrow the scope of what’s presented to a very limited view.
- You assume that Level 1 personnel do not have the skills to respond to events outside of this very finite list
- And you assume that they only need knowledge of these discrete events
However, when you do this, you lose awareness of all of the other problems in the environment. Some environments even go so far as to discard, and not even log, events outside the realm of what they want to present. They simply ignore the rest.
Part of this is done to avoid overwhelming the management system. Part of it is to filter out supposed noise. Do you only respond to critical events? If you do, it means you are probably assuming that the folks on Level 1 lack the ability to prioritize. I’ve actually seen an implementation where all actionable events were black and white, with no severity.
Problems come in all kinds of flavors, severities, and impacts.
What are you doing about non-critical events? Are you completely ignoring them? Can you go through and even determine which events are still active? There could be volumes of issues occurring in your environment and you have no clue. However, your customers, your end users, or your executive management may know full well there is a problem.
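To make the "which events are still active?" question concrete, here is a minimal sketch. It assumes a hypothetical raw event log in which every record carries a source, a check name, a severity, and a `problem` or `clear` state. Those field names are illustrative only; your own event sources will look different.

```python
from collections import Counter

# Hypothetical raw event log, in arrival order. "clear" marks a resolved condition.
log = [
    {"source": "router-1", "check": "link", "severity": "critical", "state": "problem"},
    {"source": "db-2", "check": "disk", "severity": "warning", "state": "problem"},
    {"source": "router-1", "check": "link", "severity": "critical", "state": "clear"},
    {"source": "app-3", "check": "latency", "severity": "minor", "state": "problem"},
]

def active_events(log):
    """Return the latest unresolved event per (source, check) pair."""
    active = {}
    for event in log:
        key = (event["source"], event["check"])
        if event["state"] == "clear":
            active.pop(key, None)   # condition resolved, drop it
        else:
            active[key] = event     # new or repeated problem, keep the latest
    return list(active.values())

still_active = active_events(log)
by_severity = Counter(e["severity"] for e in still_active)
print(by_severity)
```

In this made-up sample, the only critical event has already cleared, so a console filtered to critical events would look perfectly healthy while a warning and a minor problem are still open.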
Enterprise Management is about WORKFLOW, not handling events. What are you solving with your current Event console?
You need the ability to tell if one event is related to another. And you need the ability to respond to situations, not just the discrete, finite problems you have already defined.
If all you can handle are the problems you have defined, then with all of the other things that can crop up, you are vehemently naked and should be afraid.
How do you achieve some level of Situation Awareness when you cannot comprehend your situations?
- With Incident.MOOG, you set up all of your event streams to feed into the system.
- The data is fed in, applied against several algorithms, and organized as situations.
Behind each situation are the events related to the situation, history, and activities related to the given situation. As a situation evolves, events related to the situation are added to the situation. And it gives you indications as to whether the situation is growing or declining. In essence, the role of the Level 1 person becomes that of the Moderator. They orchestrate the workflow behind the Situation through to resolution. They work to get the right managers, customer service representatives, and engineers involved and working on the Situation.
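To illustrate the general idea of events accumulating into a situation that grows or declines, here is a toy sketch. It is not how Incident.MOOG works: the grouping rule (same service, arriving within `max_gap` seconds), the `Situation` class, and the `trend` heuristic are all assumptions made up for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Situation:
    service: str
    events: list = field(default_factory=list)  # (timestamp_s, description)

    def trend(self, now, window=300):
        """Compare the event count in the latest window with the window before it."""
        recent = sum(1 for t, _ in self.events if now - window < t <= now)
        previous = sum(1 for t, _ in self.events
                       if now - 2 * window < t <= now - window)
        if recent > previous:
            return "growing"
        if recent < previous:
            return "declining"
        return "steady"

def correlate(events, max_gap=600):
    """Attach each event to the open situation for its service, or open a new one."""
    situations = []
    open_by_service = {}
    for ts, service, desc in sorted(events):
        current = open_by_service.get(service)
        # Open a new situation if none exists or the last event is too old.
        if current is None or ts - current.events[-1][0] > max_gap:
            current = Situation(service)
            situations.append(current)
            open_by_service[service] = current
        current.events.append((ts, desc))
    return situations

# Hypothetical events: (seconds since start, affected service, description)
events = [
    (100, "billing", "db latency high"),
    (400, "billing", "api timeouts"),
    (450, "billing", "queue backlog"),
    (500, "billing", "checkout errors"),
    (520, "email", "smtp retry"),
]
situations = correlate(events)
for s in situations:
    print(s.service, len(s.events), s.trend(now=600))
```

The point of the sketch is the shape of the data: the Level 1 person works one billing situation with four related events and a trend, not four unrelated rows in a list.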
There’s a lot more I can say about this but this blog is already pretty long! If you want more information on Incident.MOOG and situation management, send an email to firstname.lastname@example.org.
You may like to listen to my friend Richard Whitehead explain this concept in the webinar Four Steps to Transforming IT Operations.
I’d like to hear your thoughts! — Dougie