当前位置：网站首页>Redis cluster data skew

Redis cluster data skew

2022-06-24 08:35:00 【Java interview 365】

Redis Of Cluster Cluster data skew

Preface

Redis Of Cluster Cluster is also called slice cluster ,Cluster All instances of are master nodes , The cluster adopts Hashi trough （hash Slot） To handle the mapping relationship between instances , There are... In total in the cluster 16384 Hash slot , The default form is to 16384 A hash slot is allocated to all nodes , Each instance node is allocated a hash slot similar to a data partition , Each key value is in accordance with CRC16 The algorithm obtains the hash value and then pairs it with 16384 modulus **CRC16（key）mod 16384 ** Through the final result, we get key Hash slot location where the value exists , The structure diagram is shown below

The advantage of this is that multiple nodes provide read-write Services , Simultaneous adoption Cluster There is no redundant data storage for data fragmentation , There is no need to configure the sentry cluster, which can be configured by Cluster The internal election of the cluster ensures high availability , This shows that Cluster Cluster has great advantages , but Cluster Clustering can easily lead to a problem Data skew .

Data skew is generally divided into two cases

** The amount of data is skewed ：** The data distribution of the instance is uneven , As a result, there are too many data on an instance , This leads to high access pressure for this instance .
** Data access ：** The instance data is evenly distributed , But the hot data is concentrated on one instance , As a result, the instance data access pressure is high .

The consequence of data skew is that the instance is under great pressure to process requests , Slow processing of requests , It may even cause the instance to crash , So how to avoid ？ It is necessary to analyze the specific causes of these two situations .

The amount of data is skewed

bigkey Cause to tilt

bigkey The causes are generally value Great value （String type ） perhaps value Save a large amount of set data （ Collection types ） This will lead to excessive instance memory consumption , Yes bigkey The operation of will cause the instance IO Thread blocking , If bigkey Frequent access will affect access to other key values of the instance , So we should try to avoid bigkey The birth of .

Slot Hash slot allocation is uneven

Slot Hash slots can be allocated automatically or manually , If it is automatically allocated, it will be evenly allocated to each instance , As shown below

However, the machines in the production cluster environment may not all have the same configuration , In other words, some machines with good performance may be allocated more Slot Hash slot , Some machines with poor performance require less allocation Slot Hash slot , In the allocation process, we may not know the corresponding relationship between the data and the Hashi slot , Then it is likely that a large amount of data will belong to Slot The hash slot is allocated to the same instance , This causes data skew .

Hash Tag Causing data skew

What is? Hash Tag Well ？

Hash Tag Is added to the key value pair key A pair of curly braces , This pair of curly braces will key Part of the value , The client is calculating key Of CRC16 Only the contents in curly brackets will be calculated when hashing the value , If there are no curly braces, the whole key It's worth it CRC16 value .

Such as key The value is user:order:{1001} Then what is in curly brackets is called Hash Tag That is to say 1001, In the calculation key Of CRC16 Time is also called calculation 1001 Of CRC16 Value instead of calculating user:order:{1001}.

Use Hash Tag The feature of is that different key values are as long as Hash Tag The same can be mapped to the same Slot Hash slot , So it must be in an instance .

Hash Tag What scenarios are they generally used in ？

==Redis Cluster Cross instance transactions and range queries are not supported ==, If the required data of the query exists in different instances , The consequence is that you need to query the instances one by one in the business code to get the results , Obviously, the efficiency is greatly reduced , So we can use it Hash Tag Centralize the data required by the business into one Slot Hash slot , This makes it easy to implement transactions and range queries .

If used blindly Hash Tag It is possible to cause a large amount of data to accumulate on one instance , This causes the number to skew , This requires a trade-off between efficiency and data skew , If because Hash Tag Cause data skew , Then we should give priority to avoiding data skew , Better not to use Hash Tag, Because transactions and range queries can be handled on the client side , Data skew may cause instance instability .

Data access

The root cause of data access skew is the existence of hot data , Such as e-commerce seckill commodity information , This situation may be the instance pressure caused by several key value accesses , Redistribution Slot Hash slot can not solve the problem caused by data access skew , We can use multiple copies .

What does multi copy mean ？ Is to copy multiple copies of hot data , Of each copy key Add a random access prefix , Make sure this key Data from other replicas will not be mapped to the same slot Hash slot .

What does that mean ？ for example Cluster There is 8 An example , Business key by abc,0-7 Instance No key They correspond to each other 0abc,1abc,2abc,3abc,… 7abc, Query this on the client key value abc when , By creating a 0 To 7 The random number , And abc Put it all together and ask redis.

But here's the thing slot Hash slots need to be allocated to different instances to achieve the effect , So the central idea of this solution is Distribute hot data to every instance , Share the pressure .

Another requirement for using this solution is Data needs to be read-only , If the data is writable, it also needs to increase the resource consumption of maintaining each copy , Clearly not appropriate .

原网站

版权声明
本文为[Java interview 365]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/175/202206240605522056.html