NOC 19.1¶
In accordance with our Release Policy, we are proud to present release 19.1.
The 19.1 release contains 272 bugfixes, optimisations and improvements.
Highlights¶
Usability¶
NOC Theme¶
19.1 introduces a genuine NOC theme intended to replace the venerable ExtJS gray theme. The new flat theme is based on the Triton theme and uses NOC-branded colors. The NOC theme can be activated via config on a per-installation basis. We expect to make it the default several releases later.
Collection Sharing¶
Collections are a vital part of NOC. We gratefully appreciate any contributions. To make the contribution process easier, we have added a Share button directly to the JSON preview. Enable collection sharing in the config and create collection Merge Requests directly from the NOC interface with a single click.
New fm.alarm¶
The alarm console was thoroughly reworked. Current filter settings are stored in the URL and can be shared with other users. Additional filters on services and subscribers were also added.
New runcommands¶
The Run Commands interface was simplified. The left panel is now hidden and the working area was enlarged. The list of objects can be modified directly from the commands panel. A configurable command logging option was added to the mrt service.
Alarm acknowledgement¶
Alarms can be acknowledged by a user to show that the alarm has been seen and is now under investigation.
Integration¶
We continue to move towards better integration with external systems. Our first priority is to clean up and document the API used by external systems to communicate with NOC.
NBI¶
A new nbi service has been introduced. The nbi service hosts the Northbound Interface API, which allows upper-level systems to access NOC's data.
The objectmetrics API <api-nbi-objectmetrics> for requesting metrics has been introduced.
DataStream¶
The DataStream service <services-datastream> got a lot of improvements:
- alarm datastream <api-datastream-alarm> for realtime alarm status streaming
- managedobject datastream <api-datastream-managedobject> got an asset part containing hardware inventory data
API Key ACL¶
API Key <reference-apikey> got an additional ACL, which allows restricting the source addresses for particular keys.
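The idea of a source-address ACL can be illustrated with a minimal sketch. It is not NOC's implementation: the function name `is_allowed` and the list-of-prefixes ACL format are assumptions made for illustration only, and the sketch simply denies any address that matches no prefix.

```python
import ipaddress


def is_allowed(source_address, acl_prefixes):
    """Return True if source_address belongs to at least one ACL prefix.

    Hypothetical helper: the ACL is assumed to be a list of CIDR strings.
    """
    address = ipaddress.ip_address(source_address)
    return any(
        address in ipaddress.ip_network(prefix) for prefix in acl_prefixes
    )


# Example ACL for one key: a whole network plus a single host
acl = ["10.0.0.0/8", "192.0.2.10/32"]
inside = is_allowed("10.1.2.3", acl)
outside = is_allowed("192.0.2.11", acl)
```

A request carrying a valid key would then be rejected when its source address is outside every listed prefix.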
Threshold Profiles¶
Threshold processing became more flexible. Instead of four fixed levels (low error, low warning, high warning and high error), an arbitrary number of levels can be configured via Threshold Profiles. Arbitrary actions can be set for each threshold violation, including:
- raising of alarm
- sending of notification
- calling handlers
The threshold closing condition can differ from the opening one, allowing hysteresis to suppress unnecessary flapping.
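To show why separate opening and closing conditions suppress flapping, here is a minimal sketch of a single threshold with hysteresis. The class and parameter names (`HysteresisThreshold`, `open_at`, `close_at`) are hypothetical and do not reflect the Threshold Profile model.

```python
class HysteresisThreshold:
    """One threshold that opens at `open_at` and closes below `close_at`."""

    def __init__(self, open_at, close_at):
        if close_at > open_at:
            raise ValueError("close_at must not exceed open_at")
        self.open_at = open_at
        self.close_at = close_at
        self.active = False

    def update(self, value):
        """Feed a new measurement; return True while the threshold is violated."""
        if not self.active and value >= self.open_at:
            self.active = True
        elif self.active and value < self.close_at:
            self.active = False
        return self.active


threshold = HysteresisThreshold(open_at=90, close_at=80)
states = [threshold.update(v) for v in [85, 91, 89, 92, 85, 79, 85]]
# states == [False, True, True, True, True, False, False]
```

With a single level at 90 the same series would open and close the violation twice. With the closing level at 80 it opens once and closes once.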
Syslog archiving¶
Starting from 19.1, NOC can be used as a long-term syslog archive solution. ManagedObjectProfile got an additional Syslog Archive Policy setting. When enabled, the syslogcollector <service-syslogcollector> service mirrors all received syslog messages to the long-term analytic ClickHouse database. ClickHouse supports replication, applies transparent compression and has very modest IOPS requirements, making it ideal for high-load storage.
Collected messages can be queried both through the BI interface and with direct SQL queries.
STP Topology metrics¶
STP topology change metrics are supported out of the box. Device dashboards can show topology changes on graphs, and further analytics can be applied. In combination with BI analytics, network operators get a valuable tool to investigate short-term traffic disruption problems in large networks.
New platform detection policy¶
Behavior on new platform detection became configurable. The previous behavior was to create the platform automatically, which can cause headaches in particular cases. Now the following options can be configured in Managed Object Profile:
- Create - preserve previous behavior and create new platform automatically (default)
- Alarm - raise umbrella alarm and stop discovery
Firmware Policy¶
Behavior on firmware policy violation also became configurable. ManagedObjectProfile allows configuring the following options:
- Ignore - do nothing (default)
- Ignore&Stop - Stop discovery
- Raise Alarm - Raise umbrella alarm
- Raise&Stop - Raise umbrella alarm and stop discovery
New Profiles¶
19.1 adds support for TV optical-to-RF converters widely used in cable TV networks. Two profiles have been introduced:
- IRE-Polus.Taros
- Vector.Lambda
In addition, an NSN.TIMOS <profile-NSM.TIMOS> profile became available.
Performance, Scalability and optimisations¶
Caps Profile¶
caps discovery <discovery-box-caps> used to collect all known capabilities for the platform. Sometimes this is not the desired behavior, so Caps Profiles are introduced. Caps Profiles allow enabling or disabling the checking of a particular group of capabilities. A group of capabilities can be explicitly enabled, disabled, or enabled only if required for the configured topology discovery.
High-precision timers¶
19.1 contains a time.perf_counter backport to Python 2.7. perf_counter uses CPU counters to measure time intervals. It is about 2x faster than time.time and allows finer granularity in time interval measurements (time.time changes only ~64 times per second). This greatly increases the precision of span interval measurements and of ping RTT metrics.
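As an illustration of interval measurement with a high-resolution clock, the sketch below uses the standard `time.perf_counter` from Python 3 rather than NOC's Python 2.7 backport. The helper name `timed` is hypothetical.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds).

    perf_counter is monotonic, so the difference of two readings
    is a valid interval even if the wall clock is adjusted meanwhile.
    """
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = timed(sum, range(100000))
```

Only the difference between two `perf_counter` readings is meaningful; the absolute value has no defined reference point.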
Pymongo connection pool tuning¶
Our investigations showed that the current pymongo connection pool implementation has a design flaw that leads to a pool connection poisoning problem under the common NOC workload: once opened, a mongo connection from discovery is never closed, leaving lots of connections open after spikes of load. We have implemented our own connection pool and submitted a pull request to the pymongo project (see LIFO connection pool policy).
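The effect of the checkout order can be shown with a toy simulation. This is not pymongo's code or our pool implementation; the function `idle_after_spike` and its parameters are assumptions chosen for illustration. After a spike leaves several connections in the pool, a steady trickle of one request at a time either keeps reusing the same connection (LIFO) or cycles through all of them (FIFO).

```python
from collections import deque


def idle_after_spike(policy, spike=5, steady_requests=10, idle_timeout=5):
    """Count pooled connections idle for longer than idle_timeout.

    policy: "lifo" reuses the most recently returned connection,
            "fifo" reuses the least recently returned one.
    """
    # Each idle entry is (connection_id, last_used_tick).
    idle = deque((conn_id, 0) for conn_id in range(spike))
    for tick in range(1, steady_requests + 1):
        conn_id, _ = idle.pop() if policy == "lifo" else idle.popleft()
        idle.append((conn_id, tick))  # return the connection after use
    now = steady_requests
    return sum(1 for _, last_used in idle if now - last_used > idle_timeout)


lifo_expired = idle_after_spike("lifo")  # 4: only one connection stays busy
fifo_expired = idle_after_spike("fifo")  # 0: every connection looks recently used
```

In this model the LIFO order lets an idle timeout reclaim the four surplus connections, while the FIFO order keeps all five alive indefinitely.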
ClickHouse table cleanup policy¶
ClickHouse table retention policy may be configured on a per-table basis. Partition dropping is automated and may be called manually or from cron.
Redis cache backend¶
Our investigations showed that memcached is prone to randomly forgetting keys while enough memory is available. This leads to random loss of discovery job state, which resets the state of measured SNMP counters, loses random metrics and leaves empty gaps in Grafana dashboards. The problem is hard to diagnose and the only cure is to restart the memcached process. The problem lies deep in memcached's internal architecture and is unlikely to be fixed.
So we have introduced support for a Redis cache backend. We will decide whether to make it the default cache backend after a testing period.
SO_REUSEPORT & SO_FREEBIND for collectors¶
The syslogcollector <service-syslogcollector> and trapcollector <service-trapcollector> services support the SO_REUSEPORT and SO_FREEBIND options for listeners.
SO_REUSEPORT allows several collector processes to share a single port using in-kernel load balancing, greatly improving collector throughput.
SO_FREEBIND allows binding to a non-existent address, opening support for floating virtual addresses for collectors (VRRP, CARP, etc.) and adding the necessary level of redundancy.
In combination with the new Syslog Archive <release-19.1-syslog-archive> and ClickHouse table cleanup policy <release-19.1-clickhouse-cleanup> features, NOC can be turned into a high-performance syslog archiving solution.
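The port sharing part can be sketched with plain sockets. This is a minimal example assuming Linux, where `socket.SO_REUSEPORT` is available. It is not the collectors' listener code, and the helper name `open_shared_udp_listener` is hypothetical.

```python
import socket


def open_shared_udp_listener(address, port):
    """Open a UDP socket that other sockets may bind alongside on the same port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Must be set before bind() on every socket that shares the port
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((address, port))
    return sock


# Port 0 asks the kernel for a free port; the second socket joins it
first = open_shared_udp_listener("127.0.0.1", 0)
shared_port = first.getsockname()[1]
second = open_shared_udp_listener("127.0.0.1", shared_port)
second_port = second.getsockname()[1]
first.close()
second.close()
```

In a real deployment each socket would live in a separate collector process, and the kernel would distribute incoming datagrams between them.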
GridVCS¶
GridVCS is NOC's high-performance redundant version control system used to store device configuration history. The 19.1 release introduces several improvements to the GridVCS subsystem.
- Built-in compression - though Mongo's WiredTiger uses transparent compression at the storage level, explicit compression at the GridVCS level reduces both disk usage and database server traffic.
- Previous releases used mercurial's mdiff to calculate config deltas. 19.1 uses the BSDIFF4 format by default. During our tests BSDIFF4 showed better results in speed and delta size.
The ./noc gridvcs <man-gridvcs> command got an additional compress subcommand, which applies both compression and BSDIFF4 deltas to already collected data. While it can take some time for large storages, it can free up significant disk space.
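Why explicit compression pays off for device configs can be seen in a small sketch using Python's standard `zlib` module. The helper names and the sample config are made up for illustration. GridVCS internals are not shown, and BSDIFF4 deltas are left out because they need a third-party library.

```python
import zlib


def compress_config(text):
    """Compress a config before it is sent to the database."""
    return zlib.compress(text.encode("utf-8"))


def decompress_config(blob):
    """Restore the original config text."""
    return zlib.decompress(blob).decode("utf-8")


# Device configs are highly repetitive, so they compress well
config = "".join(
    "interface GigabitEthernet0/{}\n switchport mode access\n".format(i)
    for i in range(48)
)
blob = compress_config(config)
restored = decompress_config(blob)
```

Because the data is compressed before it leaves the application, the saving applies to network traffic to the database server as well as to disk usage.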
API improvements¶
profile.py¶
SA profiles <profiles> used to live in the __init__.py file. Our code style advises keeping __init__.py empty for various reasons. Some features, like profile loading from custom, will not work with __init__.py anyway.
So starting with 19.1 it is recommended to place the profile's code into the profile.py file. Loading from __init__.py is still supported, but it is a good time to plan the migration of custom profiles.
OIDRule: High-order scale functions¶
Metrics scale can be defined as high-order functions, i.e. functions returning other functions. This greatly increases the flexibility of the scaling subsystem and allows external configuration of scaling processing.
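The concept of a function returning another function can be sketched as follows. The names `scale` and `to_bits` are hypothetical and do not represent the OIDRule API. The point is only that the scaling factor is fixed at configuration time, while the returned function is applied to each measured value.

```python
def scale(factor):
    """Return a scaling function that multiplies a raw value by `factor`."""

    def apply(value):
        return value * factor

    return apply


# The factor comes from configuration; the result is applied per measurement
to_bits = scale(8)  # e.g. convert octets to bits
bits = to_bits(1500)
```

Since the outer call takes plain parameters, the same scaling code can be reused with different factors supplied from outside.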
IPAM seen propagation¶
Workflow's seen signal can be configured to propagate up to the parent prefixes. Address and Prefix profiles got a new Seen propagation policy setting, which determines whether the parent prefix is notified when a child element is seen by discovery.
A common usage pattern is to propagate seen to aggregate prefixes to get notified when an aggregate becomes used.
Phone workflow¶
The phone module got full-blown workflow support. Each phone number and phone range has its own state, which can be changed manually or via external signals.
Breaking Changes¶
Migration¶
New features¶
MR | Title |
---|---|
MR1515 | Add estimate param to job command. |
MR1525 | Collection sharing |
MR1498 | DataStream: asset part of ManagedObject |
MR1516 | APIKey ACL |
MR1518 | Add export/import to ./noc beef command. |
MR1514 | Configurable behavior on new platforms and firmware policy violations |
MR1512 | new fm-alarm |
MR1508 | IRE-Polus.Taros profile |
MR1507 | Summary glyph display order |
MR1501 | Add Errors Out and Discards In for ddash |
MR1595 | Add periodic diagnostic to alarm diagnostic. |
MR1460 | ThresholdProfile: Flexible thresholds configuration |
MR1497 | Alarm acknowledge/unacknowledge |
MR1491 | network stp topology changes on graph |
MR1476 | GridVCS: bsdiff4 patches and zlib compression |
MR1432 | Add initial support for NSN.TIMOS profile |
MR1475 | High-precision timers |
MR1458 | Add Network \| STP \| Topology Changes metric . |
MR1455 | CapsProfile |
MR1396 | redis cache backend |
MR1404 | #794: IPAM seen propagation policy |
MR1384 | card: project card |
MR1390 | #942: Remove Root container |
MR1352 | #694 ClickHouse table cleaning policy |
MR1363 | Vector.Lambda profile |
MR1283 | NOC theme |
MR1336 | OIDRule: High-order scale functions |
MR1338 | #539 Syslog archiving |
MR1255 | nbi service |
MR1345 | #497 syslogcollector/trapcollector: SO_REUSEPORT and IP_FREEBIND support |
MR1252 | datastream: Alarm datastream |
MR1226 | #636 Phone Workflow integraton |
MR1113 | Profiles should be moved to profile.py |
Improvements¶
MR | Title |
---|---|
MR1534 | Set default loglevel on command to info. |
MR1535 | Update RU translation. |
MR1527 | FM Alarms localization |
MR1529 | Add full_name to PlatformApplication query fields. |
MR1522 | Update/report interface status3 |
MR1510 | Update DLink.DxS profile |
MR1556 | Update Rotek.BT profile (get_version) |
MR1539 | Update settings by snmp requests for Dlink.DxS |
MR1500 | Update Juniper.JUNOS profile |
MR1503 | Speedup NetworkSegment Service Summary count. |
MR1502 | Update Report for Interfaces Status |
MR1490 | Generic.get_chassis_id disable Multicast MAC address check. |
MR1494 | SKS.SKS and BDCOM.IOS config volatile. |
MR1488 | Add platform to Linksys.SPS2xx profile. |
MR1451 | Unified loader interface |
MR1485 | Add caps profile to managedobject profile ETL loader. |
MR1484 | Add to Linksys.SPS24xx platform OID |
MR1434 | ./noc dnszone import: Parse complex \$TTL directives |
MR1452 | Move methods from SegmentTopology to BaseTopology |
MR1449 | inv.networksegment: Bulk fields calculation |
MR1454 | Add to_python method to ClickHouse model. |
MR1466 | Add to Huawei.VRP profile get Serial Number attributes. |
MR1453 | ResourceGroup: TreeCombo |
MR1461 | Add config_volatile to Orion.NOS and SKS.SKS |
MR1447 | Increase query interval for core.pm.utils function. |
MR1417 | Extendable Generic.get_chassis_id script |
MR1441 | Add patern more to Huawei.MA5600T profile. |
MR1440 | Optimize reportalarmdetail and reportobjectdetail. |
MR1439 | Update/eltex mes execute snmp |
MR1437 | Delete aggregateinterface bi model |
MR1420 | Add dynamically loader BI models. |
MR1418 | RepoPreview MVVC |
MR1427 | Migrate Alstec.24xx.get_metrics to new model. |
MR1414 | networkx 2.2 and improvend spring layout implementation |
MR1413 | dns.dnsserver: Remove sync field |
MR1400 | requests 2.20.0 |
MR1392 | Diverged permissions |
MR1382 | #961 Process All addresses and Loopback address syslog/trap source types |
MR1408 | Add Generic.get_vlans and get_switchport scripts. |
MR1409 | Add get_lldp_snmp capabilities for Cisco.IOS |
MR1410 | Change Iface Name OID for get_ifindexes Plante.WCDG profile |
MR1374 | migrate inv map to leafletjs |
MR1381 | #971 trapcollector: Gentler handling of BER decoding errors |
MR1371 | dnszone: Ignore addresses with missed FQDNs |
MR1369 | Add theme variable to login page render. |
MR1368 | Add "Up/10M" to reportcolumndatasource for report object detail. |
MR1391 | CODEOWNERS file |
MR1353 | #788 Try to determine VRF's for DHCP address discovery |
MR1361 | DataStream: Load from custom |
MR1251 | Customized PyMongo connection pool |
MR1397 | Juniper.junos |
MR1398 | auto logout remove msg |
MR1385 | Dead code cleanup |
MR1284 | runcommands refactoring |
MR1375 | Cleanup pyrule from classifier trigger. |
MR1341 | theme body padding for form |
MR1362 | Add convert ifname for MA4000 |
MR1349 | Cleanup AlliedTelesis profiles. |
MR1346 | snmp: Try to negotiate broken error_index |
MR1344 | Add Interface packets dashboard in MO dash. |
MR1318 | Migrate ReportProfileCheck report to ReportStat Backend. |
MR1228 | Move numpy import to parse_table_header in lib/text. |
MR1316 | Additional LLDP constants and caps conversion functions |
MR1324 | Add TZ parameter to NBI query. |
MR1126 | #260 add password widget |
MR1322 | Add get_lldp_neighbors and get_capabilities for Qtech2500 profile |
MR1264 | Add clean to events command. |
MR1307 | Update Alcatel.OS62xx profile |
MR1285 | Hp.1910 |
MR1190 | Update Rotek.RTBSv1 profile |
MR1297 | Add Rotek.RTBSv1.get_metrics script. |
MR1296 | add get_config script for Dlink.DVG profile |
MR1291 | Extend job command. |
MR1276 | Add clean_id_bson to alarm datastream. |
MR1274 | threadpool: Cleanup worker result just after setting future |
MR1286 | Add late_alarm metric to seflmon fm collector. |
MR1249 | Profile.cli_retries_super_password parameter |
MR1250 | perm: response layout |
MR1229 | ldap: Additional check of username format |
MR1214 | Add telemetry to MRT service. |
MR1244 | Add physical iface count metrics to selfmon. |
MR1216 | Add vv (very verbose parameter) to test command. |
Bugfixes¶
MR | Title |
---|---|
MR1487 | Use ch_escape function on syslogcollector. |
MR1478 | Fix Report Unknown Model Summary. |
MR1477 | Fix Generic.get_capabilities snmp_v1 |
MR1474 | Fix load metric priority. Profile first, Generic second. |
MR1473 | Fix Radio and SLA graph template for CH use. |
MR1481 | Fix displaying platform in some Cisco Stackable switches |
MR1479 | Fix Rotek RTBSv1 Tx Power metric |
MR1438 | Fix Huawei.VRP.get_mac_address_table script |
MR1422 | Fix MikroTik.RouterOS.get_interface_status_ex script |
MR1462 | Fix heavy cpu load on show vlan command |
MR1469 | Fix Huawei.VRP.get_version SerialNumber rogue chart. |
MR1467 | Fix DLink.DxS profile |
MR1463 | Fix Extreme.XOS.get_interfaces script |
MR1465 | Fix PrefixBookmark import loop. |
MR1464 | Fix selfmon FM metric name. |
MR1457 | Fix getting single oid from multiple metrics. |
MR1444 | Fix Iskratel.MSAN profile |
MR1450 | Fix Orion.NOS.get_lldp_neighbors script |
MR1433 | Fix Cisco.IOSXR profile |
MR1436 | Fix Cisco.NXOS.get_arp script |
MR1448 | Fix c.id in card.base.f_object_location. |
MR1445 | login button width fixed |
MR1459 | Lambda fix metrics |
MR1468 | Huawei.VRP.get_version strip serial number. |
MR1435 | InfiNet fix init.py pattern_prompt |
MR1426 | inv.map fix performance |
MR1443 | Fix Object.get_coordinate_zoom method. |
MR1428 | Fix Huawei.MA5600T profile |
MR1430 | Fix Alstec.24xx metric name. |
MR1289 | Fix Juniper.JUNOS.get_lldp_neighbors Parameter 'remote_port' required. |
MR1423 | Fix managedobject and object card for delete Root. |
MR1429 | Fix avs Object.get_address_text method |
MR1424 | Fix getting container path in Alarm Web and Card. |
MR1425 | Fix typo in ManagedObject console UI. |
MR1483 | Fix Raisecom.ROS.get_lldp_neighbors script |
MR1395 | Fix container field type when remove Root. |
MR1401 | ip.ipam: Fix prefix style |
MR1411 | Fix Add Objects to Maintenance from SA !582 |
MR1386 | fix error "Отсутствуют адреса линка" in dns.reportmissedp2p |
MR1405 | Fix Discovery Problem Detail report trace. |
MR1394 | Fix get_lldp_neighbors by SNMP |
MR1407 | Fix Plantet.WGSD Profile |
MR1403 | #976 Fix closing of already closed session |
MR1406 | Fix avs environments graph tmpl 148 |
MR1402 | jsloader fixed |
MR1399 | Fix Ubiquiti profile and Generic.get_interfaces(get_bulk) |
MR1389 | Fix Report Discovery Poison |
MR1378 | Fix theme variable in desktop.html template. |
MR1379 | Fix etl managedobject resourcegroup |
MR1367 | Fix prompt in Rotek.RTBS.v1 profile. |
MR1366 | Fix workflow CH dictionary. |
MR1365 | Fix selfmon FM collector. |
MR1364 | Fix update operation for superuser on secret field. |
MR1376 | noc/noc#952 Fix metric path for Environment metric scope. |
MR1310 | #964 Fix SA sessions leaking |
MR1357 | Natex_fix_sn |
MR1355 | Cisco_fix_snmp |
MR1370 | Increase ManagedObject cache version for syslog archive field. |
MR1356 | Fix Interface name Eltex.MES |
MR1354 | Fix Interface name QSW2500 |
MR1335 | Fix get_interfaces, add reth aenet |
MR1343 | Fix profilecheckdetail. |
MR1342 | Fix secret field. |
MR1351 | InfiNet-fix-get_version |
MR1350 | Fix get_interfaces for Telindus profile |
MR1348 | Fix stacked packets graph. |
MR1360 | Fix Interface name ROS |
MR1326 | Fix ch_state ch datasource. |
MR1332 | Fix Span Card view from ClickHouse data. |
MR1331 | Fix Huawei.MA5600T.get_cpe. |
MR1328 | Fix Cisco.IOS.get_lldp_neighbors regex |
MR1327 | Fix get_interfaces for Rotek.RTBSv1, add rule for platform RT-BS24 |
MR1325 | Fix CLIPS engine in slots. |
MR1320 | Fix SNMP Trap OID Resolver |
MR1323 | Fix get_interfaces for QSW2500 (dowwn -> down) |
MR1269 | Fix Juniper.JUNOSe.get_interfaces script |
MR1278 | Fix Huawei.MA5600T.get_cpe ValueError. |
MR1314 | Fix Generic.get_chassis_id script |
MR1306 | Fix AlliedTelesis.AT8000S.get_interfaces script |
MR1313 | Fix Cisco.IOS.get_version for ME series |
MR1262 | Fix Raisecom.RCIOS password prompt matching |
MR1238 | Fix Juniper.JUNOS profile |
MR1279 | Fixes empty range list in discoveryid. |
MR1305 | Fix Rotek.RTBS profiles. |
MR1304 | Fix some attributes for Span in MRT serivce |
MR1303 | Fix selfmon escalator metrics. |
MR1300 | fm.eventclassificationrule: Fix creating from event |
MR1295 | Fix ./noc mib lookup |
MR1298 | Fix custom metrics path in Generic.get_metrics. |
MR1290 | Fix custom metrics. |
MR1225 | noc/noc#954 Fix Cisco.IOS.get_inventory script |
MR1275 | Fix InfiNet.WANFlexX.get_lldp_neighbors script |
MR1281 | Delete quit() in script |
MR1280 | Fit get_config |
MR1277 | Fix Zhone.Bitstorm.get_interfaces script |
MR1254 | Fix InfiNet.WANFlexX.get_interfaces script |
MR1272 | Fix vendor name in SAE script credentials. |
MR1246 | Fix Huawei.VRP pager |
MR1268 | Fix scheme migrations |
MR1245 | Fix Huawei.VRP3 prompt match |
MR1259 | fix_error_web |
MR1258 | Fix managed_object_platform migration. |
MR1260 | Fix pm.util.get_objects_metrics if object_profile metrics empty. |
MR1253 | Fix path in radius(services) |
MR1203 | Fix prompt pattern in Eltex.DSLAM profile |
MR1247 | Fix consul resolver index handling |
MR1239 | #911 consul: Fix faulty state caused by changes in consul timeout behavior |
MR1237 | #956 fix web scripts |
MR1221 | Fix Generic.get_lldp_neighbors script |
MR1243 | Fix now shift for selfmon task late. |
MR1231 | noc/noc#946 Fix ManagedObject web console. |
MR1235 | Fix futurize in SLA probe. |
MR1234 | Fix Huawei.MA5600T.get_cpe. |
MR1220 | Fix Generic.get_interfaces script |
MR1204 | Fix Raisecom.ROS.get_interfaces script |
MR1215 | Fix platform field in Platform Card. |
MR1210 | ManagedObject datastream: Fix links property. capabilities property |
MR1212 | Fix save empty metrics threshold in ManagedObjectProfile UI. |
MR1211 | Fix interface validation errors in Huawei.VRP, Siklu.EH, Zhone.Bitstorm. |
MR1317 | sa.managedobjectprofile: Fix text |
MR1340 | noc/noc#966 |
MR1294 | selfmon typo in mo |
MR1105 | #856 Rack view fix |
MR1208 | #947 Fix MAC ranges optimization |