What is hot data detection?
2022-06-24 16:36:00 【Programmer fish skin】
If data had to be sorted the way garbage is, which bin would hot data go in?
Hello everyone, I'm Fish Skin. Today I'd like to share a bit of technical knowledge.
As we all know, websites and applications cannot run without data. For enterprises in particular, business data is their lifeblood.
But lumping all the data together and processing it uniformly may not meet our requirements for performance and storage space. So we need to classify the data to suit different business needs and application scenarios.
One way to classify data is to divide it into "hot data", "cold data", and even "warm data"!
Just like garbage sorting~
Let's start with what hot data is!
What is hot data?
As the name suggests, hot data is data that is very popular and frequently accessed.
For example, a news item on a trending list may receive thousands of visits per second.
Based on its characteristics, hot data falls into two categories:
- Expected: the data is known in advance to become popular. For example, products endorsed by internet celebrities in a pre-announced sales promotion; the Double 11 shopping festival of a certain e-commerce platform is the best example.
- Unexpected: access to the data suddenly soars! The cause may be a malicious attack, web crawlers, or content that unexpectedly goes viral. For example, when big news breaks, a microblogging platform that has not had time to put protections in place may crash.
To cope with hot data, we usually turn to caching: the data is stored in memory ahead of time as K/V (key-value) pairs.
To access cached data, we look up the corresponding value by a key string.
A frequently accessed key is also called a hot key. Hot key is a broad concept that is not limited to caching systems. For example, all of the following are hot keys:
- A primary key that is frequently accessed in a database, such as the appId of a popular application
- A frequently accessed key in a K/V caching system
- Request information from malicious attacks or bot traffic, such as a user's userId or a machine's IP
- A frequently accessed interface address, such as the app information service /app/query
- How often a single user accesses an interface, such as userId + /app/query
- How often a machine accesses an interface, such as IP + /app/query
- How often a user accesses specific content through an interface, such as userId + /app/query + appId
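The combined keys in the list above are just strings built from several request dimensions. Here is a minimal sketch of counting them; the `hot_key` helper, the `name=value|...` format, and the sample request log are all made up for illustration and do not come from any particular framework.

```python
from collections import Counter


def hot_key(**dimensions):
    """Join named dimensions into one key string, e.g. 'user=1|path=/app/query'."""
    return "|".join(f"{name}={value}" for name, value in dimensions.items())


# Hypothetical request log: who called which interface for which app.
requests = [
    {"user": 1001, "ip": "10.0.0.1", "path": "/app/query", "app": 7},
    {"user": 1001, "ip": "10.0.0.1", "path": "/app/query", "app": 7},
    {"user": 2002, "ip": "10.0.0.2", "path": "/app/query", "app": 9},
]

counts = Counter()
for r in requests:
    # Interface address only
    counts[hot_key(path=r["path"])] += 1
    # userId + interface
    counts[hot_key(user=r["user"], path=r["path"])] += 1
    # IP + interface
    counts[hot_key(ip=r["ip"], path=r["path"])] += 1
    # userId + interface + appId
    counts[hot_key(user=r["user"], path=r["path"], app=r["app"])] += 1

for key, count in counts.most_common(3):
    print(key, count)
```

Prefixing every part with its dimension name keeps a userId and an IP that happen to look alike from colliding in the same counter.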
Now that we know what hot data is, let's talk about hot data detection, that is, the technology for "finding hot data".
Why detect hot data?
The reasons for detecting hot data are simple:
1. Improving performance
With a distributed cache, reads still require network communication, which adds time overhead. If hot data can be cached locally in advance, that is, preheated, the machine can read the data much faster and the pressure on the underlying cache cluster is reduced.
Of course, this does not mean all data should be stored locally. The more cache levels there are, the more complex updates become and the greater the risk of data inconsistency!
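To make the preheating idea concrete, here is a rough sketch of a local cache sitting in front of a distributed one. The `TwoLevelCache` class, its `remote_get` callback, and the 5-second TTL are assumptions for illustration; the short TTL is one simple way to limit how long a stale local copy can live, not the only one.

```python
import time


class TwoLevelCache:
    """Local in-memory cache for hot keys in front of a remote (distributed) cache."""

    def __init__(self, remote_get, ttl_seconds=5.0, clock=time.monotonic):
        self._remote_get = remote_get  # hypothetical callable: key -> value
        self._ttl = ttl_seconds
        self._clock = clock
        self._local = {}  # key -> (value, expires_at)

    def preheat(self, key, value):
        """Store a hot key's value locally so later reads skip the network."""
        self._local[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._local.get(key)
        if entry is not None:
            value, expires_at = entry
            if self._clock() < expires_at:
                return value  # local hit, no network round trip
            del self._local[key]  # local copy is too old, drop it
        # Not a preheated hot key (or it expired): fall back to the remote cache.
        return self._remote_get(key)


# Usage: only keys that were explicitly preheated are served locally.
cache = TwoLevelCache(remote_get=lambda key: f"remote value of {key}")
cache.preheat("app:42", "local value of app:42")
print(cache.get("app:42"))
print(cache.get("app:7"))
```

Only hot keys are kept locally here; everything else still goes to the remote cache, which keeps the local copy small and the inconsistency window bounded by the TTL.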
2. Avoiding risk
Unexpected hot data (hot keys) can pose great risks to the business. The risks fall into two layers:
Risks to the data layer
Under normal circumstances, a single Redis instance can handle roughly 100,000 QPS (queries per second), and a cluster can raise the concurrency further. For systems with ordinary concurrency, a Redis cache is enough. But if a product suddenly becomes a hit, or malicious requests come in, the QPS for that one key may soar to millions or even tens of millions! Under the single-threaded working model of older Redis versions, normal requests will queue up and fail to get timely responses; in severe cases the entire sharded cluster can be paralyzed.
There is another case: when a hot key suddenly expires, a large number of requests hit the fragile database directly and may bring it down!
Risks to application services
Each application can accept and process only a limited number of requests per unit of time. If it is attacked with malicious requests, a malicious user can monopolize a large share of the request-processing resources, so that other harmless, normal users cannot get timely responses.
Therefore, we need a dynamic hot key detection mechanism that discovers unexpected hot data as soon as it appears and applies special handling to it, such as local caching, rejecting malicious users, and interface rate limiting or degradation. This avoids possible risks while improving data access performance.
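As a rough illustration of "special handling", the sketch below picks an action according to the kind of hot key that was detected. The key prefixes (`user:`, `ip:`, `api:`) and the policy itself are assumptions invented for this example; a real system would decide based on its own rules.

```python
from enum import Enum


class Action(Enum):
    CACHE_LOCALLY = "cache_locally"
    REJECT = "reject"
    RATE_LIMIT = "rate_limit"


def choose_action(hot_key: str) -> Action:
    """Pick a handling strategy for a detected hot key (hypothetical policy)."""
    if hot_key.startswith(("user:", "ip:")):
        # A single user or machine is flooding us: treat as suspicious.
        return Action.REJECT
    if hot_key.startswith("api:"):
        # A whole interface is hot: limit or degrade it.
        return Action.RATE_LIMIT
    # Otherwise assume it is ordinary popular data: serve it from local cache.
    return Action.CACHE_LOCALLY


for key in ["user:1001", "api:/app/query", "app:42"]:
    print(key, choose_action(key).value)
```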
So how do we detect hot data?
How to detect hot data?
First, we need to define a threshold or rule for "hot": how hot is hot enough?
It can be set from experience or from the average heat of the system's data. For example, data accessed 1,000 times within 1 second counts as hot data.
For a single-machine application, detecting hot data is simple: create a sliding window counter locally for each key, count the total number of accesses per unit of time (the frequency), and store the detected hot keys in a collection.
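The single-machine approach can be sketched roughly as follows. The `SlidingWindowDetector` class and its parameter names are made up for illustration; the defaults mirror the "1,000 accesses in 1 second" example. This version stores one timestamp per access, which is easy to read but memory-hungry, so treat it as a sketch rather than something tuned for production. It also never removes a key from the hot set once it is added.

```python
import time
from collections import defaultdict, deque


class SlidingWindowDetector:
    """Per-key sliding window counter for a single machine."""

    def __init__(self, threshold=1000, window_seconds=1.0, clock=time.monotonic):
        self.threshold = threshold
        self.window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of recent accesses
        self.hot_keys = set()            # collection of detected hot keys

    def record(self, key):
        """Record one access; return True if the key is (now) considered hot."""
        now = self._clock()
        hits = self._hits[key]
        hits.append(now)
        # Drop accesses that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.threshold:
            self.hot_keys.add(key)
        return key in self.hot_keys


# Usage with a tiny threshold so the effect is visible.
detector = SlidingWindowDetector(threshold=3, window_seconds=1.0)
for _ in range(3):
    detector.record("app:42")
detector.record("app:7")
print(detector.hot_keys)
```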
For distributed applications, accesses to a hot key are spread across different machines and cannot be counted independently on any one of them. Therefore, an independent, centralized hot key computing unit is needed.
Hot data detection can thus be divided into four steps: rule configuration, hot key reporting, hot key statistics, and hot key pushing:
- Rule configuration: specify the reporting conditions for hot keys and mark out the keys that need to be monitored
- Hot key reporting: each machine reports the access status of its own keys to the centralized computing unit
- Hot key statistics: collect the information reported by each application instance and use a sliding window algorithm to compute each key's heat
- Hot key pushing: when a key's heat reaches the configured value, push the hot key information to all application instances, and each instance caches the key's value locally.
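The four steps above can be simulated in a single process as a rough sketch. `HotKeyCenter`, `AppInstance`, the prefix-based rule, and the explicit `now` timestamps are all assumptions for illustration; real frameworks communicate over the network, and this sketch assumes reports arrive in time order.

```python
from collections import defaultdict, deque


class AppInstance:
    """Stand-in for one application instance that receives pushed hot keys."""

    def __init__(self, name):
        self.name = name
        self.local_hot_keys = set()

    def on_hot_key(self, key):
        # A real instance would now cache the key's value locally.
        self.local_hot_keys.add(key)


class HotKeyCenter:
    """Centralized computing unit: applies rules, aggregates reports, pushes hot keys."""

    def __init__(self, prefixes, threshold, window_seconds):
        self.prefixes = tuple(prefixes)      # step 1: rule, only these keys are monitored
        self.threshold = threshold
        self.window = window_seconds
        self._reports = defaultdict(deque)   # key -> (timestamp, count) reports
        self._totals = defaultdict(int)      # key -> accesses inside the window
        self._instances = []
        self._pushed = set()

    def register(self, instance):
        self._instances.append(instance)

    def report(self, key, count, now):
        """Step 2: an instance reports `count` accesses to `key` at time `now`."""
        if not key.startswith(self.prefixes):
            return  # not covered by any rule
        # Step 3: sliding window statistics across all instances.
        reports = self._reports[key]
        reports.append((now, count))
        self._totals[key] += count
        while reports and reports[0][0] <= now - self.window:
            _, old_count = reports.popleft()
            self._totals[key] -= old_count
        # Step 4: push once when the heat reaches the configured value.
        if self._totals[key] >= self.threshold and key not in self._pushed:
            self._pushed.add(key)
            for instance in self._instances:
                instance.on_hot_key(key)


center = HotKeyCenter(prefixes=["app:"], threshold=1000, window_seconds=1.0)
a, b = AppInstance("a"), AppInstance("b")
center.register(a)
center.register(b)
center.report("app:42", 600, now=0.1)  # reported by instance a
center.report("app:42", 600, now=0.2)  # reported by instance b
print(a.local_hot_keys, b.local_hot_keys)
```

Neither instance sees 1,000 accesses on its own, but the center sees 1,200 within the window, which is exactly why the counting has to be centralized.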
With the steps above, a basic hot key detection mechanism is complete. In practice, hot data detection systems often face complex business scenarios, and there are other issues to consider, such as handling key expiry.
To handle high-concurrency scenarios, the design of a hot key detection framework should also focus on the following indicators:
- Real-time: given how suddenly hot keys appear (possibly within as little as 1 millisecond), hot keys must be detected and pushed in real time
- High performance: the framework should stay lightweight and high-performing to keep costs down
- Accuracy: detect exactly the hot keys that match the rules, with no misses and no false positives
- Consistency: ensure the hot keys held by application instances and their local caches stay consistent, with no data errors
- Scalability: when the number of keys to be counted is very large, the centralized computing cluster can be scaled horizontally
In addition, an excellent hot key detection framework should be easy to integrate, non-intrusive to business code, dynamically configurable, support hot updates of rules, and offer visual management.
Finally, readers who want to learn more can look at JD's open-source hot key detection framework JD-hotkey and Youzan's TMC; both are cleverly designed.
I have written analyses of these two frameworks before and will tidy them up and share them when I get the chance.