Fault Management

Fault Management is the process of collecting and processing events that arrive from various sources. NOC provides a flexible event processing pipeline split into cleanly separated stages:

  • Collection - Collects events from external sources, such as Syslog, SNMP traps, active probes, and metric thresholds, and injects them into the event processing pipeline.
  • Classification - Strips device-dependent specifics from events and replaces them with generalized Event Classes. NOC recognizes about 300 event classes out of the box.
  • Correlation - Analyzes events that may open or close alarms, applies rule-based and topology-based correlation, raises and clears alarms, and calculates service impact.
  • Escalation - Rule-based alarm processing, notification, and escalation to external trouble ticket systems.

Each stage is processed by a separate set of microservices, so the number of workers can be adjusted to match the actual workload. Multi-stage processing lets monitoring staff focus on the actual problems that cause service degradation.
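The flow through the four stages can be illustrated with a small, self-contained sketch. This is not NOC's actual code or API: every name here (`RawEvent`, `classify`, `Correlator`, `escalate`, `run_pipeline`, the substring rules, and the "Link Down" / "Link Up" class names) is a hypothetical stand-in chosen for illustration, and real classification and correlation rules are far richer than substring matching.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawEvent:
    """Collection output: an event as received from a source (hypothetical shape)."""
    source: str   # e.g. "syslog" or "snmp_trap"
    device: str
    message: str


@dataclass
class ClassifiedEvent:
    """Classification output: device-specific text replaced by a general class."""
    device: str
    event_class: str


@dataclass
class Alarm:
    device: str
    alarm_class: str
    escalated: bool = False


# Hypothetical classification rules: (lowercase substring, event class).
RULES = [
    ("link down", "Link Down"),
    ("link up", "Link Up"),
]


def classify(event: RawEvent) -> Optional[ClassifiedEvent]:
    """Map a raw message to a generalized event class, or None if unknown."""
    text = event.message.lower()
    for pattern, event_class in RULES:
        if pattern in text:
            return ClassifiedEvent(event.device, event_class)
    return None


class Correlator:
    """Opens an alarm on 'Link Down' and clears it on 'Link Up'."""

    def __init__(self) -> None:
        self.active: dict[tuple[str, str], Alarm] = {}

    def process(self, event: ClassifiedEvent) -> Optional[Alarm]:
        key = (event.device, "Link Down")
        if event.event_class == "Link Down":
            # Repeated opening events map onto the same active alarm.
            return self.active.setdefault(key, Alarm(event.device, "Link Down"))
        if event.event_class == "Link Up":
            self.active.pop(key, None)  # closing event clears the alarm
        return None


def escalate(alarm: Alarm, tickets: list[str]) -> None:
    """Create one ticket per alarm; the list stands in for an external system."""
    if not alarm.escalated:
        alarm.escalated = True
        tickets.append(f"{alarm.alarm_class} on {alarm.device}")


def run_pipeline(raw_events: list[RawEvent]):
    """Push collected events through classification, correlation, escalation."""
    correlator = Correlator()
    tickets: list[str] = []
    for raw in raw_events:
        event = classify(raw)
        if event is None:
            continue  # unclassified events are dropped in this sketch
        alarm = correlator.process(event)
        if alarm is not None:
            escalate(alarm, tickets)
    return correlator.active, tickets
```

In this sketch, two "link down" reports for the same device arriving via different sources produce a single alarm and a single ticket, and a later "link up" event clears the alarm. In a deployment like the one described above, each stage would run as its own set of workers rather than as functions in one process.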

See Also