Message Storm Protection¶
Message storm protection is intended for use in collector services, that is, services that receive messages from devices over the UDP protocol or any other. At the moment, this condition is satisfied by the services
syslogcollector and `trapcollector'.
Message Storm protection allows you to block messages or perform other actions in accordance with the policy when too many messages arrive from any of the devices without preventing the arrival of messages from other devices.
The service class must have an attribute containing device configurations, which includes at least the following fields: * id - ID of the managed object * partition - partition * storm_policy - storm protection policy * storm_threshold - storm protection message threshold
Its name is set in the module
stormprotection.py by the COLLECTOR_CONFIG_ATTRNAME constant.
Protection is provided by using the
noc.core.service.stormprotection.StormProtection. When initializing an object of this class, a periodically repeating process is started - round of verification. During the verification round, data is accumulated on the number of incoming messages from each of the devices. At the end of the round, the fact that the number of received messages exceeds the threshold for enabling protection for this device is determined. If the threshold is exceeded, then actions are taken in accordance with the storm protection policy for this device. This information is stored and further used during the next round in relation to all messages coming to the service collector.
The above actions are performed automatically in the 'StormProtection' class. In the service itself, in addition to creating the
StormProtection object and initializing it, it is necessary to execute the
process_messages method when receiving each message. It will perform the following actions: * increase the message counter for the device by one * performs actions in accordance with the current status of exceeding the device's message threshold - raises an alarm if it has not been raised yet * as a result, it will return a flag - whether the message needs to be blocked in the service or not
The message is blocked in the service itself by not sending the message, for example by exiting the appropriate method.
Example code for using message storm protection:
# creating and initializing `StormProtection` instance class TrapCollectorService(FastAPIService): async def on_activate(self): ... self.storm_protection = StormProtection( config.trapcollector.storm_round_duration, config.trapcollector.storm_threshold_reduction, config.trapcollector.storm_record_ttl, TRAPCOLLECTOR_STORM_ALARM_CLASS, ) self.storm_protection.initialize() # message processing and blocking if necessary class TrapServer(UDPServer): def on_read(self, data: bytes, address: Tuple[str, int]): ... if cfg.storm_policy != "D": need_block = self.service.storm_protection.process_message(address) if need_block: return
Description of methods of the
initialize()- initialize the object
pocess_message(ip_address: str)- message processing for the specified device. Performs the following actions:
- increments the message counter for the device by one
- performs actions in accordance with the current status of exceeding the device message threshold - raises an accident if it has not been raised yet
- returns a sign as a result - whether the message needs to be blocked in the service or not
Additional actions of the
In the object of the
StormProtection class, the following additional actions are also performed automatically, if necessary: * The alarm is closed if the protection from the device has been removed * Records about devices from which messages have stopped coming are deleted from the device table (internal to the
In the global settings (for a specific service): * storm_round_duration - the duration of the verification round in seconds * storm_threshold_reduction - the ratio of the shutdown threshold to the protection activation threshold * storm_record_ttl - maximum lifetime (in rounds) of the device record in case of absence of messages
In the device settings (SourceConfig): * storm_policy - storm protection policy - disable (Disabled - D) - block sender (Block - B) - raise alarm (Raise Alarm - R) - block sender and raise alarm (Block & Raise Alarm - A) * storm_threshold - threshold number of messages per round to enable protection