What is hot data detection?

2022-06-24 16:36:00 Programmer fish skin

If data had to be sorted like garbage, what kind of garbage would hot data be?

Hello everyone, I'm fish skin. Today I'll share a bit of technical knowledge.

As we all know, websites and applications cannot run without data. For enterprises in particular, business data is their lifeblood.

But sometimes, piling all the data together and processing it uniformly cannot meet our requirements for performance and storage space. So we need to classify the data to suit different business needs and application scenarios.

One way to classify data is to divide it into "hot data" and "cold data", and sometimes even "warm data"!

Just like garbage sorting~

Let's first talk about what hot data is!

What is hot data?

As the name suggests, hot data is data that is very popular and frequently accessed.

For example, a news item on a trending list may receive thousands of visits per second.

Depending on its characteristics, hot data can be divided into two categories:

  • Expected: the data is known in advance to become hot, for example hot products endorsed by internet celebrities in big promotions that are announced ahead of time. A certain e-commerce site's Double 11 shopping festival is the best example.
  • Unexpected: access to the data suddenly soars! The cause may be a malicious attack, web crawlers, or content that unexpectedly goes viral. For example, when big news suddenly breaks, Weibo may crash if it has not had time to put protections in place.

To deal with hot data, we usually use caching: the data is stored in memory in advance as K/V (key-value) pairs.

Figure: key-value pairs

When we need to access cached data, we look up the corresponding value by a key string.
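
As a rough illustration, here is a minimal Python sketch of that lookup. The dict `cache`, the helper `load_from_database`, and the key `app:1001` are made-up names for this example, not part of any real system.

```python
# Hypothetical in-memory K/V cache: a plain dict mapping key strings to values.
cache = {}


def load_from_database(key):
    # Stand-in for a slow database query (an assumption for illustration).
    return f"value-for-{key}"


def get(key):
    """Look up a value by its key string, filling the cache on a miss."""
    if key in cache:
        return cache[key]            # cache hit: served from memory
    value = load_from_database(key)  # cache miss: fall back to the slow source
    cache[key] = value
    return value


first = get("app:1001")   # miss: loads the value and stores it
second = get("app:1001")  # hit: read straight from memory
```

A real cache would also need expiry and a size limit, which this sketch leaves out.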

A frequently accessed key is also called a hot key. Hot key is a broad concept that is not limited to caching systems. For example, all of the following are hot keys:

  1. A primary key that is frequently accessed in a database, such as the appId of a popular application
  2. A key that is frequently accessed in a K/V caching system
  3. Request information from malicious attacks or bot traffic, such as a user's userId or a machine's IP
  4. A frequently accessed interface address, such as the app information service /app/query
  5. The frequency at which a single user accesses an interface, such as userId + /app/query
  6. The frequency at which a machine accesses an interface, such as IP + /app/query
  7. The frequency at which a user accesses specific content of an interface, such as userId + /app/query + appId
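
The combined keys in the list above (such as userId + /app/query) could be built roughly like this. The function name `build_hot_key`, the `|` separator, and the sample values are assumptions made for this sketch.

```python
def build_hot_key(*dimensions):
    """Join the dimensions of one access into a single key string."""
    return "|".join(str(d) for d in dimensions)


# Item 4: the interface address alone.
interface_key = build_hot_key("/app/query")
# Item 5: one user accessing one interface.
user_key = build_hot_key("user42", "/app/query")
# Item 7: one user accessing specific content of one interface.
content_key = build_hot_key("user42", "/app/query", "app1001")
```

Once every kind of access is reduced to a string like this, the same counting logic can be reused for all of them.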

Now that we know what hot data is, let's talk about hot data detection, that is, the technology for finding hot data.

Why detect hot data?

The reasons for detecting hot data are simple:

1. Improving performance

If you use a distributed cache, reads still require network communication, which adds time overhead. If hot data can be cached locally in advance (that is, preheated), the machine can read it much faster and the pressure on the underlying cache cluster is reduced.

Of course, this does not mean all data should be stored locally. The more cache levels there are, the more complex updates become and the greater the risk of data inconsistency!
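
To make the idea of preheating concrete, here is a small sketch of a two-level read. The two dicts only simulate a local cache and a distributed cache. The names `local_cache`, `remote_cache`, `preheat`, and `read` are hypothetical, and no real network call is made.

```python
local_cache = {}                       # per-process memory, no network hop
remote_cache = {"news:1": "headline"}  # stand-in for a distributed cache cluster


def preheat(hot_keys):
    """Copy known hot keys from the remote cache into local memory."""
    for key in hot_keys:
        if key in remote_cache:
            local_cache[key] = remote_cache[key]


def read(key):
    """Return (value, source): try local memory first, then the remote cache."""
    if key in local_cache:
        return local_cache[key], "local"
    return remote_cache.get(key), "remote"


before = read("news:1")  # served by the remote cache
preheat(["news:1"])
after = read("news:1")   # now served locally
```

Note that once a value is copied locally, an update to `remote_cache` is no longer visible through `read`, which is exactly the inconsistency risk mentioned above.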

2. Avoiding risk

Unexpected hot data (hot keys) may bring great risks to the business. The risks fall into two levels:

Risks to the data layer

Under normal circumstances, a single Redis instance can support about 100,000 QPS (queries per second), and concurrency can be increased further with a cluster. For systems with ordinary concurrency, a Redis cache is enough. But if some product data suddenly goes viral, or malicious requests arrive, the QPS for that one key may soar to millions or even tens of millions! Under the single-threaded working mode of older Redis versions, this causes normal requests to queue up and fail to get a timely response, and in severe cases the entire sharded cluster is paralyzed.

There is another situation: when a hot key suddenly expires, a large number of requests hit the fragile database directly and can bring it down!

Risks to application services

Each application can accept and process only a limited number of requests per unit time. Under an attack by malicious requests, malicious users take up a large share of the request-processing resources on their own, so other normal, perfectly harmless users cannot get a timely response.

Figure: request queuing caused by malicious requests

Therefore, we need a dynamic hot key detection mechanism that discovers unexpected hot data as soon as it appears and applies special handling to it, such as local caching, rejecting malicious users, or interface rate limiting / degradation. This avoids possible risks while improving data access performance.
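
As one example of such special handling, the sketch below rejects requests whose userId + interface key has been flagged as hot. It assumes the set `hot_user_keys` is filled by some detection mechanism, and all names here are hypothetical. HTTP 429 (Too Many Requests) is used as the rejection status.

```python
# Assumed to be filled by the hot key detection mechanism.
hot_user_keys = set()


def handle_request(user_id, path):
    """Reject the request if this user + interface pair is flagged as hot."""
    key = f"{user_id}|{path}"
    if key in hot_user_keys:
        return 429, "Too Many Requests"
    return 200, "OK"


normal = handle_request("user42", "/app/query")
hot_user_keys.add("user42|/app/query")  # detection flags this user
blocked = handle_request("user42", "/app/query")
```

Whether to reject, rate limit, or degrade is a business decision. The point is only that the handling is driven by the detected hot keys.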

So how do we detect hot data?

How to detect hot data?

First, we need to define a threshold or rule for "hot": how hot counts as hot?

It can be defined from experience, or from the average heat of the system's data. For example, data accessed 1,000 times within 1 second is hot data.
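
Such a rule could be written down roughly as follows. The class `HotRule`, the helper `rule_from_average`, and the multiplier of 10 are illustrative assumptions. Only the "1,000 accesses in 1 second" default comes from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HotRule:
    threshold: int = 1000        # accesses needed within the window
    window_seconds: float = 1.0  # length of the window

    def is_hot(self, accesses_in_window):
        return accesses_in_window >= self.threshold


def rule_from_average(average_accesses, multiplier=10):
    """Derive a threshold from the system's average heat (multiplier is a guess)."""
    return HotRule(threshold=int(average_accesses * multiplier))


default_rule = HotRule()
derived_rule = rule_from_average(50)  # threshold becomes 500
```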

For standalone applications, detecting hot data is simple: create a sliding window counter locally for each key, count the total number of accesses per unit time (the frequency), and store the detected hot keys in a collection.

Figure: the sliding window
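
A per-key sliding window counter for a standalone application can be sketched as follows. This is a simplified version that stores one timestamp per access. The class name `SlidingWindowDetector` and its parameters are assumptions for this example, and the caller passes in the current time so the behavior is easy to follow.

```python
from collections import defaultdict, deque


class SlidingWindowDetector:
    def __init__(self, threshold=1000, window_seconds=1.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # key -> timestamps of recent accesses
        self.hot_keys = set()           # collection of detected hot keys

    def record(self, key, now):
        """Record one access at time `now`; return True if the key is hot."""
        window = self.hits[key]
        window.append(now)
        # Drop accesses that have slid out of the window.
        while window and window[0] <= now - self.window_seconds:
            window.popleft()
        if len(window) >= self.threshold:
            self.hot_keys.add(key)
        return key in self.hot_keys


detector = SlidingWindowDetector(threshold=3, window_seconds=1.0)
results = [detector.record("news:1", t) for t in (0.0, 0.5, 2.0, 2.1, 2.2)]
```

The first two accesses have left the window by the time the third arrives, so the key only becomes hot on the fifth access. In this sketch a key stays in `hot_keys` once detected. Cooling keys down again, and using fixed-size time buckets instead of one timestamp per access to save memory, are left out.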

For distributed applications, however, accesses to a hot key are spread across different machines and cannot be counted independently on each machine. So an independent, centralized hot key computing unit is needed.

Hot data detection can thus be divided into four steps: rule configuration, hot key reporting, hot key statistics, and hot key push:

  1. Rule configuration: specify the reporting conditions for hot keys and mark out the keys that need to be monitored
  2. Hot key reporting: each machine reports its own key access data to the centralized computing unit
  3. Hot key statistics: collect the information reported by each application instance and use the sliding window algorithm to calculate the heat of each key
  4. Hot key push: when a key's heat reaches the configured threshold, push the hot key information to all application instances, and each instance caches the key's value locally

Figure: reporting and calculation
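
The four steps above can be simulated in a single process as follows. This is only a sketch under simplifying assumptions: reports are direct method calls instead of network messages, the rule is a key prefix plus a threshold, and the names `Worker`, `AppInstance`, and `store` are hypothetical rather than taken from any real framework.

```python
from collections import defaultdict, deque


class AppInstance:
    def __init__(self, name):
        self.name = name
        self.local_cache = {}

    def on_hot_key(self, key, value):
        # Step 4 (receiving side): cache the pushed hot key locally.
        self.local_cache[key] = value


class Worker:
    """Centralized computing unit that aggregates reports from all instances."""

    def __init__(self, prefixes, threshold, window_seconds, store):
        self.prefixes = tuple(prefixes)  # Step 1: which keys are monitored
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.store = store               # stand-in for the data source
        self.instances = []
        self.hits = defaultdict(deque)
        self.pushed = set()

    def register(self, instance):
        self.instances.append(instance)

    def report(self, key, now):
        """Step 2: an instance reports one access. Returns True if a push happened."""
        if not key.startswith(self.prefixes):
            return False
        window = self.hits[key]          # Step 3: sliding window statistics
        window.append(now)
        while window and window[0] <= now - self.window_seconds:
            window.popleft()
        if len(window) >= self.threshold and key not in self.pushed:
            self.pushed.add(key)         # Step 4: push to every instance once
            for instance in self.instances:
                instance.on_hot_key(key, self.store.get(key))
            return True
        return False


store = {"app:1": "App One"}
worker = Worker(["app:"], threshold=2, window_seconds=1.0, store=store)
a, b = AppInstance("a"), AppInstance("b")
worker.register(a)
worker.register(b)
pushes = [worker.report(k, t) for k, t in
          [("other:1", 0.0), ("app:1", 0.0), ("app:1", 0.1)]]
```

Here the two accesses to `app:1` could have come from different instances, yet the worker still sees the combined count, which is the reason for centralizing the statistics.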

After the steps above, a basic hot key detection mechanism is complete. However, hot data detection systems often face complex business scenarios, and there are other issues to consider, such as handling key expiration.

Figure: Youzan TMC hot key detection design

To handle high-concurrency scenarios, the design of a hot key detection framework should also focus on the following indicators:

  1. Real-time: given how suddenly a hot key can appear (perhaps within as little as 1 millisecond), hot keys must be detected and pushed in real time
  2. High performance: the framework should stay lightweight and fast, effectively reducing costs
  3. Accuracy: accurately detect the hot keys that match the rules, with no misses and no false alarms
  4. Consistency: keep hot keys consistent between the application instances and their local caches, with no data errors
  5. Scalability: when the number of keys to be counted is very large, the centralized computing cluster can scale horizontally

In addition, an excellent hot key detection framework should be easy to integrate, non-intrusive to business code, and dynamically configurable, and should support hot rule updates and visual management.


Finally, readers who want to learn more can look at JD's open-source hot key detection framework JD-hotkey and Youzan's TMC. Both are very cleverly designed.

I have written analyses of these two frameworks before, and I will tidy them up and share them when I get the chance.

Copyright notice
This article was written by [Programmer fish skin]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2021/04/20210415215901620o.html