At Moogsoft, we are always developing new Link Access Modules (LAMs) to make it easier for our customers to import new data streams into Incident.MOOG for analysis and correlation. This week, we are excited to announce a LAM for Microsoft’s System Center Operations Manager (SCOM), making it easy for Incident.MOOG to ingest events and alerts from the SCOM platform and all of its management packs (MPs). Developed for and initially in use at some of our larger financial services customers, the new SCOM LAM should be of high interest to most enterprise IT shops, especially given the continued ubiquity of Microsoft servers and networked-application software products in use today.
Microsoft’s SCOM is essentially a cross-platform data center management system for operating systems and hypervisors. SCOM uses a single interface that shows state, health, and performance information for Exchange Server, SQL Server, and pretty much any Microsoft application running on Microsoft servers. Additionally, SCOM provides configurable event and alert triggering according to some of the availability, performance, configuration or security conditions identified.
SCOM can be extended by importing MPs that define what it monitors. By default, SCOM only monitors some basic OS-related services, but new MPs can be imported to monitor services such as SQL Server, SharePoint, Apache, Tomcat, VMware and SUSE Linux. Many Microsoft products ship with their own MPs, and many non-Microsoft software companies write MPs for their products as well. SCOM can then generate streams of events and alerts from all of the software infrastructure that it watches over.
Unfortunately, SCOM has very little intelligence when it comes to analyzing and correlating across all the data generated. SCOM also does a poor job of streamlining the workflow to diagnose and remediate incidents that occur. While SCOM can set event and alert triggers, it possesses limited event management capabilities, and lacks the features that a next-generation Manager of Managers (MoM) can provide. As a result, enterprise IT Ops teams become quite frustrated when trying to use SCOM to get early warning of unfolding incidents, or for forensic analysis to troubleshoot incidents.
In summary, the major limitations of SCOM include:
- No clustering or correlation of related events
- No algorithmic approach to detecting anomalies
- No holistic, cross-domain perspective of the data it generates
- No socialized workflow to drive collaboration and accelerate remediation
Incident.MOOG to the rescue
This is where Moogsoft’s Incident.MOOG comes into play. Incident.MOOG sits on top of SCOM and other data streams across the “IT stack”, ingesting the event feeds from SCOM along with the feeds from other tools, applications, and infrastructure domains. Incident.MOOG then applies automated algorithms to clean and contextualize the aggregated data, grouping tens or hundreds of related events into single, anomalous Situations. Incident.MOOG does this across domains, correlating the events and alarms across the entire IT environment, providing causal context, and making it easier to see how an incident is unfolding.
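To make the idea of grouping related events concrete, here is a minimal Python sketch. It is not Incident.MOOG's algorithm, which this post does not describe; it assumes a deliberately naive heuristic in which events arriving within a fixed time window of one another are treated as related. The `Event` and `Situation` classes, the `group_into_situations` function, the `window_seconds` parameter, and the sample events are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    timestamp: float  # seconds since some reference point
    source: str       # feed the event came from, e.g. "SCOM" (illustrative)
    host: str
    message: str


@dataclass
class Situation:
    events: list = field(default_factory=list)

    @property
    def last_seen(self) -> float:
        # Timestamp of the most recent event in this group
        return max(e.timestamp for e in self.events)


def group_into_situations(events, window_seconds=60.0):
    """Naive time-proximity grouping (an assumption for illustration):
    an event joins the current group if it arrives within
    window_seconds of that group's latest event; otherwise it
    starts a new group."""
    situations = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if situations and event.timestamp - situations[-1].last_seen <= window_seconds:
            situations[-1].events.append(event)
        else:
            situations.append(Situation(events=[event]))
    return situations


# Hypothetical events from two different feeds
events = [
    Event(0.0, "SCOM", "sql01", "Disk latency high"),
    Event(20.0, "SCOM", "sql01", "SQL Server not responding"),
    Event(45.0, "netmon", "core-sw1", "Port flapping"),
    Event(600.0, "SCOM", "exch02", "Mail queue growing"),
]
situations = group_into_situations(events)
for number, situation in enumerate(situations, start=1):
    print(number, [e.message for e in situation.events])
```

In this toy run, the first three events land in one group even though they come from two different feeds, while the much later event forms its own group. A real correlation engine would weigh far more than timestamps, such as topology, event text, and historical patterns.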
Once an incident has been detected, Incident.MOOG opens a Situation Room for each service-affecting Situation. The Situation Room serves as a virtual war room to which the appropriate stakeholders are automatically summoned. This enables IT professionals to view the same clustered narrative, share communications, compare results from third-party tools, and streamline workflow, ultimately allowing everyone to work better together. With Incident.MOOG, Microsoft-based IT shops can now remediate problems faster and restore services more quickly, before end users complain.