Event Correlation and Analysis Market Definition and Architecture Description, 2009
Debra Curtis, David Williams
When embarking on an event correlation and analysis (ECA) project, it's important to choose the right event management specialist products, manager of managers (MoM) and business service management (BSM) options, and to bring together the appropriate event sources and data types within an ECA architecture that can support them.
- IT operational efficiencies can be improved by consolidating a broad range of event data to understand how IT elements affect each other, and how IT events affect the services that IT provides to the business.
- It is a good practice to push as much element-specific event correlation as possible down to the lowest tiers of the ECA architecture, even as far down as the managed element itself.
- Not all event management consoles can act as a MoM; don't assume that Simple Network Management Protocol integration will suffice.
- Implement a tiered event correlation architecture, where event management products in each IT technology domain filter and pass data on to allow IT operations to view only the most important multidomain data at the highest-level MoM or BSM console.
- Identify the information you need at the highest-level MoM to help define which products pass data, which data is passed, when data is passed, and how data is filtered and correlated, as well as to highlight gaps in monitoring that will need to be filled.
- Preprocess event data at lower tiers in the ECA architecture, so as not to overload the MoM or BSM console.
- To reach full value, an ECA product must provide information about the effect that events have on IT services. We therefore recommend giving priority consideration to integration with a configuration management database (CMDB), IT service dependency mapping tools and BSM tools, which can provide service impact information and pass data back to the supporting event consoles to help IT support personnel prioritize which issues to address first.
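The tiered approach described in the points above can be illustrated with a minimal sketch. It is not drawn from any specific product: the class names (`DomainManager`, `ManagerOfManagers`), the 1-to-5 severity scale, the forwarding cutoff and the `service_map` dictionary (standing in for CMDB or dependency-mapping data) are all assumptions made for illustration. Each domain-level manager removes duplicates and low-severity noise before forwarding, and the top tier ranks what remains by service impact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    domain: str    # technology domain, e.g. "network" or "server"
    element: str   # managed element that raised the event
    severity: int  # assumed scale: 1 = informational ... 5 = critical
    message: str


class DomainManager:
    """Hypothetical lower-tier event manager for one technology domain."""

    def __init__(self, domain, forward_at=4):
        self.domain = domain
        self.forward_at = forward_at  # assumed cutoff for forwarding upward
        self._seen = set()

    def process(self, events):
        """Drop duplicates and low-severity events; return the rest."""
        forwarded = []
        for ev in events:
            key = (ev.element, ev.message)
            if key in self._seen:
                continue  # duplicate of an event already handled at this tier
            self._seen.add(key)
            if ev.severity >= self.forward_at:
                forwarded.append(ev)
        return forwarded


class ManagerOfManagers:
    """Hypothetical top tier: enriches events with service impact."""

    def __init__(self, service_map):
        # element -> affected services; stands in for CMDB/dependency data
        self.service_map = service_map

    def receive(self, events):
        rows = [(ev, self.service_map.get(ev.element, [])) for ev in events]
        # Events touching the most services come first, then higher severity.
        return sorted(rows, key=lambda r: (len(r[1]), r[0].severity), reverse=True)


service_map = {"db01": ["billing", "orders"], "sw-edge-3": ["orders"]}
network = DomainManager("network")
server = DomainManager("server")

raw_network = [
    Event("network", "sw-edge-3", 5, "link down"),
    Event("network", "sw-edge-3", 5, "link down"),       # duplicate
    Event("network", "sw-core-1", 2, "interface flap"),  # below cutoff
]
raw_server = [Event("server", "db01", 4, "disk full")]

mom = ManagerOfManagers(service_map)
console = mom.receive(network.process(raw_network) + server.process(raw_server))
for ev, services in console:
    print(ev.element, ev.message, "affects", services)
```

Of the four raw events, only two reach the top-level console, and the database event is listed first because it affects two services even though its severity is lower. A real deployment would apply far richer correlation rules at each tier; the point here is only the division of work between tiers.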
There are two categories of events that an ECA product needs to accept:
- Discrete state changes in a managed element, sent asynchronously from the managed element, an agent installed on the managed element or another IT event-monitoring or ECA product.
- Threshold breaches indicating that a managed element is no longer operating within "normal" parameters, sent asynchronously from the managed element, an agent installed on the managed element or a separate performance-monitoring product.
"Normal" can be based on a predefined, default, out-of-the-box threshold; a customer-defined, customized setting; or a dynamic, measured baseline.
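As a rough illustration of the two event categories and the three ways of defining "normal," consider the sketch below. Everything named in it is hypothetical: the 90% default limit, the precedence order (customer setting, then dynamic baseline, then default), and the choice of mean plus three standard deviations as the dynamic baseline are assumptions for the example, not a description of how any particular ECA product behaves.

```python
from statistics import mean, stdev

DEFAULT_CPU_LIMIT = 90.0  # hypothetical out-of-the-box threshold (percent)


def resolve_limit(history, custom_limit=None, k=3.0):
    """Return the upper bound of 'normal' for a metric.

    Assumed precedence: a customer-defined setting wins; otherwise a dynamic
    baseline (mean + k * standard deviation) is used when at least two
    samples exist; otherwise the default threshold applies.
    """
    if custom_limit is not None:
        return custom_limit
    if len(history) >= 2:
        return mean(history) + k * stdev(history)
    return DEFAULT_CPU_LIMIT


def threshold_event(element, value, history, custom_limit=None):
    """Emit a threshold-breach event, or None if the value is within normal."""
    limit = resolve_limit(history, custom_limit)
    if value > limit:
        return {"type": "threshold_breach", "element": element,
                "value": value, "limit": limit}
    return None


def state_change_event(element, old_state, new_state):
    """Emit a discrete state-change event, or None if nothing changed."""
    if old_state != new_state:
        return {"type": "state_change", "element": element,
                "from": old_state, "to": new_state}
    return None


history = [40, 42, 38, 41, 39]  # recent CPU utilization samples (percent)

dynamic = threshold_event("db01", 55, history)                  # measured baseline
custom = threshold_event("db01", 55, history, custom_limit=60)  # customer setting
default = threshold_event("db01", 55, [])                       # default threshold
change = state_change_event("sw-edge-3", "up", "down")
```

With this sample history, a reading of 55% breaches the dynamic baseline but neither the customer-defined limit of 60% nor the 90% default, which shows how the same measurement can be an event under one definition of "normal" and not under another.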