
Nightingale, a cloud-native monitoring system: recent new features that address several production pain points

2022-06-21 16:50:00 InfoQ

Introduction

Nightingale is an advanced open-source, cloud-native monitoring and analysis system. It uses an all-in-one design that combines data collection, visualization, monitoring and alerting, and data analysis. It integrates closely with the cloud-native ecosystem and provides out-of-the-box, enterprise-grade monitoring, analysis, and alerting capabilities. On May 11, 2022, it was donated to the Open Source Development Committee of the China Computer Federation (CCF ODC), becoming the first open-source project that CCF ODC accepted as a donation after its founding.

Before we begin

Nightingale is positioned as an enterprise-grade Prometheus. That is not to say Prometheus is not good. For example, if your team has set up Prometheus for its own use and everyone is comfortable writing YAML configuration, that works very well (leaving the learning cost aside). However, if your team wants to build greater influence within the company and make this metrics-monitoring capability available to other teams, there is more work to do. Typical examples:

  • You need a web UI. You cannot let everyone edit the YAML files directly, or things will easily get messed up, especially since YAML is an indentation-sensitive format.
  • You want best practices built into the platform so that others can use it out of the box. After all, not every team is as fluent with Prometheus as yours.
  • You need to connect to multiple Prometheus clusters. A single Prometheus instance has a capacity ceiling, so enterprises often split it by business or by region, which requires one system that can connect to multiple Prometheus clusters.

Of course, Nightingale is more than a web UI for Prometheus. It can also:

  • Provide alert muting, subscription rules, and more feature-rich alert rules
  • Support fault self-healing: a script can be executed automatically when an alert fires
  • Provide alert event management, historical archiving, and an aggregated view of active alerts
  • Provide out-of-the-box alert rules and monitoring dashboards that can be imported and used directly
  • Provide quick views for browsing monitoring data, so charts can be viewed just by clicking
  • And more

Recent updates


Business groups are a management concept in Nightingale. Larger companies may have thousands of alert rules and hundreds of dashboards. Listing them in a flat table is clearly unmanageable, so Nightingale introduced business groups to organize these rules and dashboards. Recent update: you can enable automatic tagging for a business group, so that monitoring data reported by machines belonging to that group is automatically tagged with the label busigroup=xx, which is more convenient.
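The automatic tagging described above can be pictured with a minimal Python sketch. This is not Nightingale's implementation: the function name, the host-to-group mapping, and the `ident` label used to identify the reporting machine are all assumptions made for illustration. Only the `busigroup` label name comes from the text.

```python
def tag_with_business_group(sample_labels, host_to_group, ident_key="ident"):
    """Return a copy of the labels, adding busigroup=<group> if the host is in a group.

    host_to_group and ident_key are hypothetical; they stand in for whatever
    Nightingale uses internally to know which machine belongs to which group.
    """
    labels = dict(sample_labels)  # do not mutate the caller's dict
    group = host_to_group.get(labels.get(ident_key))
    if group is not None:
        labels["busigroup"] = group
    return labels


# Hypothetical mapping of machines to business groups.
host_to_group = {"web-01": "payments", "db-01": "storage"}

tagged = tag_with_business_group(
    {"__name__": "cpu_usage", "ident": "web-01"}, host_to_group
)
untagged = tag_with_business_group(
    {"__name__": "cpu_usage", "ident": "unknown-host"}, host_to_group
)
print(tagged)
print(untagged)
```

With the label attached at ingestion time, dashboards and alert rules can filter on `busigroup` without anyone maintaining per-machine label lists by hand.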

Quick views. This is a relatively large change: the previous object view has been removed. The reasoning is as follows. For machines and devices, we want a list to browse: click different machines to view different monitored objects, or view the monitoring data of several machines at once, with no typing required at any point, only clicks. Since machines and devices have this need, MySQL instances, Redis instances, MQ instances, switches, and so on have it too. We therefore upgraded the object view into quick views, in which you can define your own custom views. It is a small functional innovation.

Dashboards. The new version adds support for more chart types and can import Grafana dashboards directly. Because Nightingale's dashboard configuration is not fully consistent with Grafana's, an import may not be complete, but common chart types work.

Alert sending. Email, DingTalk, WeCom, and Feishu are built into Nightingale's code as sending channels. If you want to customize the sending mechanism, you can use a Python script, a webhook, Redis pub/sub, or code loaded from a dynamic link library, among other options, which makes it easy to integrate with internal enterprise systems.
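As a rough illustration of the custom-sender idea, the Python sketch below turns an alert event into a JSON body that could be posted to an internal webhook. The event fields (`rule_name`, `is_recovered`, `tags`) and the `{"text": ...}` body shape are hypothetical and are not Nightingale's actual payload format; check the project's documentation for the real schema before relying on any of these names.

```python
import json


def format_alert(event):
    """Render a one-line message from a (hypothetical) alert event dict."""
    status = "recovered" if event.get("is_recovered") else "triggered"
    # Sort tags so the message is stable regardless of dict ordering.
    tags = ",".join(f"{k}={v}" for k, v in sorted(event.get("tags", {}).items()))
    return f"[{status}] {event['rule_name']} ({tags})"


def build_webhook_body(event):
    """Serialize the message as a JSON body for an internal webhook (assumed shape)."""
    return json.dumps({"text": format_alert(event)})


body = build_webhook_body(
    {
        "rule_name": "cpu_high",
        "is_recovered": False,
        "tags": {"ident": "web-01", "busigroup": "payments"},
    }
)
print(body)
```

Keeping the formatting separate from the transport means the same function could serve a script-based sender or a webhook receiver.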

In addition, the new version supports a maximum number of alert notifications. Previous versions already supported a channel silence time, also called the repeat-send interval, but many users said that was not enough: some low-severity alerts only need to be sent two or three times (whereas for a high-severity alert that has not recovered, you want notifications to keep going out at a set frequency). The maximum notification count was introduced for this reason.
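How a repeat interval and a maximum notification count interact can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not Nightingale's code: the class names, the field names, and the convention that a maximum of 0 means "unlimited" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NotifyPolicy:
    repeat_interval_s: int   # minimum seconds between repeated notifications
    max_notifications: int   # assumed convention: 0 means no limit


@dataclass
class AlertState:
    sent_count: int = 0
    last_sent_at: Optional[float] = None


def should_notify(policy, state, now):
    """Decide whether a still-firing alert may send another notification."""
    if policy.max_notifications > 0 and state.sent_count >= policy.max_notifications:
        return False  # the cap has been reached
    if state.last_sent_at is not None and now - state.last_sent_at < policy.repeat_interval_s:
        return False  # still inside the silence window
    return True


def simulate(policy, duration_s, step_s):
    """Evaluate a continuously firing alert every step_s seconds; return send times."""
    state = AlertState()
    sent = []
    for now in range(0, duration_s, step_s):
        if should_notify(policy, state, now):
            state.sent_count += 1
            state.last_sent_at = now
            sent.append(now)
    return sent


# Low-severity alert: every 10 minutes, at most 3 times.
low = simulate(NotifyPolicy(repeat_interval_s=600, max_notifications=3), 3600, 60)
# High-severity alert: every 10 minutes, with no cap.
high = simulate(NotifyPolicy(repeat_interval_s=600, max_notifications=0), 3600, 60)
print(low)        # [0, 600, 1200]
print(len(high))  # 6
```

Over one hour of continuous firing, the capped policy stops after three notifications while the uncapped one keeps sending every ten minutes, which matches the low-severity versus high-severity distinction described above.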

Alert aggregation display. This is a small innovation. To locate problems more easily, we usually aggregate along the time dimension, for example by looking at all alert events generated around 2 p.m. this afternoon. Analyzing them can reveal the root cause. However, aggregation by time alone is not enough. Aggregation by different tags should also be supported, so the new version introduces the active alert card view, which supports aggregation by tags and event attributes. This feature has been well received.
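Aggregating active alerts by a tag or by an event attribute amounts to a group-by. The Python sketch below shows the idea under assumed data shapes: the event dictionaries, their field names (`rule`, `severity`, `tags`), and the `(none)` bucket are hypothetical, not Nightingale's event schema.

```python
from collections import defaultdict


def aggregate_events(events, key):
    """Group events by a top-level attribute or, failing that, by a tag named key."""
    groups = defaultdict(list)
    for event in events:
        if key in event:
            value = event[key]                                 # event attribute
        else:
            value = event.get("tags", {}).get(key, "(none)")   # tag lookup
        groups[value].append(event)
    return dict(groups)


# Hypothetical active alert events.
events = [
    {"rule": "cpu_high", "severity": 2, "tags": {"busigroup": "payments", "ident": "web-01"}},
    {"rule": "disk_full", "severity": 1, "tags": {"busigroup": "storage", "ident": "db-01"}},
    {"rule": "mem_high", "severity": 2, "tags": {"busigroup": "payments", "ident": "web-02"}},
]

by_group = aggregate_events(events, "busigroup")   # aggregate by tag
by_severity = aggregate_events(events, "severity")  # aggregate by event attribute

# Each "card" shows how many active alerts fall into one bucket.
cards = {name: len(items) for name, items in by_group.items()}
print(cards)  # {'payments': 2, 'storage': 1}
```

Switching the `key` argument changes the dimension of the cards, so the same set of active alerts can be viewed per business group, per machine, or per severity.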

Those are the features Nightingale has updated recently. You are welcome to try them out. If you have any questions, you can open an issue.

Copyright notice
This article was created by [InfoQ]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/172/202206211532532088.html