当前位置:网站首页>Redis cluster data skew
Redis cluster data skew
2022-06-24 08:35:00 【Java interview 365】
Redis Of Cluster Cluster data skew
Preface
Redis Of Cluster Cluster is also called slice cluster ,Cluster All instances of are master nodes , The cluster adopts Hashi trough (hash Slot) To handle the mapping relationship between instances , There are... In total in the cluster 16384 Hash slot , The default form is to 16384 A hash slot is allocated to all nodes , Each instance node is allocated a hash slot similar to a data partition , Each key value is in accordance with CRC16 The algorithm obtains the hash value and then pairs it with 16384 modulus **CRC16(key)mod 16384 ** Through the final result, we get key Hash slot location where the value exists , The structure diagram is shown below

The advantage of this is that multiple nodes provide read-write Services , Simultaneous adoption Cluster There is no redundant data storage for data fragmentation , There is no need to configure the sentry cluster, which can be configured by Cluster The internal election of the cluster ensures high availability , This shows that Cluster Cluster has great advantages , but Cluster Clustering can easily lead to a problem Data skew .
Data skew is generally divided into two cases
** The amount of data is skewed :** The data distribution of the instance is uneven , As a result, there are too many data on an instance , This leads to high access pressure for this instance .
** Data access :** The instance data is evenly distributed , But the hot data is concentrated on one instance , As a result, the instance data access pressure is high .
The consequence of data skew is that the instance is under great pressure to process requests , Slow processing of requests , It may even cause the instance to crash , So how to avoid ? It is necessary to analyze the specific causes of these two situations .
The amount of data is skewed
bigkey Cause to tilt
bigkey The causes are generally value Great value (String type ) perhaps value Save a large amount of set data ( Collection types ) This will lead to excessive instance memory consumption , Yes bigkey The operation of will cause the instance IO Thread blocking , If bigkey Frequent access will affect access to other key values of the instance , So we should try to avoid bigkey The birth of .
Slot Hash slot allocation is uneven
Slot Hash slots can be allocated automatically or manually , If it is automatically allocated, it will be evenly allocated to each instance , As shown below

However, the machines in the production cluster environment may not all have the same configuration , In other words, some machines with good performance may be allocated more Slot Hash slot , Some machines with poor performance require less allocation Slot Hash slot , In the allocation process, we may not know the corresponding relationship between the data and the Hashi slot , Then it is likely that a large amount of data will belong to Slot The hash slot is allocated to the same instance , This causes data skew .
Hash Tag Causing data skew
What is? Hash Tag Well ?
Hash Tag Is added to the key value pair key A pair of curly braces , This pair of curly braces will key Part of the value , The client is calculating key Of CRC16 Only the contents in curly brackets will be calculated when hashing the value , If there are no curly braces, the whole key It's worth it CRC16 value .
Such as key The value is user:order:{1001} Then what is in curly brackets is called Hash Tag That is to say 1001, In the calculation key Of CRC16 Time is also called calculation 1001 Of CRC16 Value instead of calculating user:order:{1001}.
Use Hash Tag The feature of is that different key values are as long as Hash Tag The same can be mapped to the same Slot Hash slot , So it must be in an instance .
Hash Tag What scenarios are they generally used in ?
==Redis Cluster Cross instance transactions and range queries are not supported ==, If the required data of the query exists in different instances , The consequence is that you need to query the instances one by one in the business code to get the results , Obviously, the efficiency is greatly reduced , So we can use it Hash Tag Centralize the data required by the business into one Slot Hash slot , This makes it easy to implement transactions and range queries .
If used blindly Hash Tag It is possible to cause a large amount of data to accumulate on one instance , This causes the number to skew , This requires a trade-off between efficiency and data skew , If because Hash Tag Cause data skew , Then we should give priority to avoiding data skew , Better not to use Hash Tag, Because transactions and range queries can be handled on the client side , Data skew may cause instance instability .
Data access
The root cause of data access skew is the existence of hot data , Such as e-commerce seckill commodity information , This situation may be the instance pressure caused by several key value accesses , Redistribution Slot Hash slot can not solve the problem caused by data access skew , We can use multiple copies .
What does multi copy mean ? Is to copy multiple copies of hot data , Of each copy key Add a random access prefix , Make sure this key Data from other replicas will not be mapped to the same slot Hash slot .
What does that mean ? for example Cluster There is 8 An example , Business key by abc,0-7 Instance No key They correspond to each other 0abc,1abc,2abc,3abc,… 7abc, Query this on the client key value abc when , By creating a 0 To 7 The random number , And abc Put it all together and ask redis.
But here's the thing slot Hash slots need to be allocated to different instances to achieve the effect , So the central idea of this solution is Distribute hot data to every instance , Share the pressure .
Another requirement for using this solution is Data needs to be read-only , If the data is writable, it also needs to increase the resource consumption of maintaining each copy , Clearly not appropriate .
边栏推荐
- Robot acceleration level task priority inverse kinematics
- 12--合并两个有序链表
- Opencv get (propid) common values
- QT writing security video monitoring system 36 onvif continuous movement
- Review SGI STL secondary space configurator (internal storage pool) | notes for personal use
- LabVIEW查找n个元素数组中的质数
- Introduction to RCNN, fast RCNN and fast RCNN
- 李白最经典的20首诗排行榜
- Promise的使用场景
- Nodejs redlock notes
猜你喜欢

日本大阪大学万伟伟研究员介绍基于WRS系统机器人的快速集成方法和应用

ZUCC_ Principles of compiling language and compilation_ Experiment 03 getting started with compiler

LabVIEW查找n个元素数组中的质数

2021-03-16 comp9021 class 9 notes

jwt(json web token)

LabVIEW finds prime numbers in an array of n elements

ZUCC_编译语言原理与编译_实验02 FSharp OCaml语言

MAYA重新拓布

Question 3 - MessageBox pop-up box, modify the default background color

2022年制冷与空调设备运行操作上岗证题库及模拟考试
随机推荐
Review SGI STL secondary space configurator (internal storage pool) | notes for personal use
JUC personal simple notes
2021-03-11 comp9021 class 8 notes
[acnoi2022] I have done it, but I can't
Markdown 实现文内链接跳转
SQL intra statement operation
貸款五級分類
Which is the first poem of Tang Dynasty?
How to replace the web player easyplayerproactivex Key in OCX?
[untitled]
PAT 1157:校庆
新技术实战,一步步用Activity Results API封装权限申请库
IIS build wordpress5.7 manually
QPS, TPS, concurrent users, throughput relationship
2022年流动式起重机司机特种作业证考试题库及在线模拟考试
12--合并两个有序链表
[graduation season] Hello stranger, this is a pink letter
(PKCS1) RSA 公私钥 pem 文件解析
JS to get the last element of the array
Redis的Cluster集群数据倾斜