Redis cluster data skew
2022-06-24 08:35:00 【Java interview 365】
Data skew in Redis Cluster
Preface
Redis Cluster is also known as a sharded cluster. Every instance that holds data is a master node, and the cluster uses hash slots to map keys to instances. There are 16384 hash slots in total, and by default they are divided evenly among the nodes, so each instance holds a set of hash slots that acts like a data partition. For every key, Redis computes a hash with the CRC16 algorithm and takes it modulo 16384, **CRC16(key) mod 16384**; the result is the hash slot the key belongs to. The structure is shown in the diagram below.
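The slot calculation can be sketched in a few lines of Python. This is an illustration rather than Redis's own code: it assumes that `binascii.crc_hqx` with an initial value of 0 matches the CRC16 variant Redis Cluster uses (XMODEM, polynomial 0x1021), which agrees with the check value 0x31C3 for "123456789" given in the Redis Cluster specification. The function names and sample keys are made up for the example, and hash tags are ignored here.

```python
import binascii

SLOT_COUNT = 16384  # total number of hash slots in a Redis Cluster


def crc16(data: bytes) -> int:
    # crc_hqx with initial value 0 is CRC-16/XMODEM (polynomial 0x1021).
    return binascii.crc_hqx(data, 0)


def key_slot(key: str) -> int:
    # Hash the whole key and reduce it to a slot number in [0, 16383].
    return crc16(key.encode("utf-8")) % SLOT_COUNT


if __name__ == "__main__":
    # Hypothetical keys, only to show that each key lands in exactly one slot.
    for key in ("user:1001", "user:1002", "order:1001"):
        print(key, "->", key_slot(key))
```

On a running cluster the same value can be checked with the `CLUSTER KEYSLOT` command.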

The advantage is that several nodes serve reads and writes at the same time, and because the data is sharded, no instance stores a redundant copy of the whole data set. There is also no need to deploy a separate Sentinel cluster: the cluster's internal election keeps it highly available. Redis Cluster therefore has clear advantages, but it is also prone to one problem: data skew.
Data skew generally falls into two cases:
**Data volume skew:** data is distributed unevenly across instances, so one instance holds far more data than the others and comes under heavy access pressure.
**Data access skew:** data is distributed evenly across instances, but the hot data is concentrated on one instance, so that instance comes under heavy access pressure.
The consequence of data skew is that the affected instance is under heavy load, processes requests slowly, and may even crash. How can it be avoided? That requires looking at the specific causes of each case.
Data volume skew
Skew caused by bigkeys
A bigkey is usually a key whose value is very large (String type) or whose value holds a large number of elements (collection types). It makes the instance consume too much memory, and operations on it can block the thread that executes commands. If a bigkey is accessed frequently, access to the other keys on the same instance suffers as well, so bigkeys should be avoided wherever possible.
Uneven hash slot allocation
Hash slots can be allocated automatically or manually. With automatic allocation they are spread evenly across the instances, as shown below.

However, the machines in a production cluster do not necessarily share the same configuration. More powerful machines may be given more hash slots and weaker machines fewer. During allocation we may not know which data maps to which hash slot, so the slots that hold a large share of the data can easily end up on the same instance, which causes data skew.
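To make the manual case concrete, here is a minimal sketch of splitting the 16384 slots in proportion to a per-node weight. The node names, the weights, and the `allocate_slots` helper are hypothetical. The sketch only computes slot ranges; the actual assignment would still be done with the cluster's own tooling, for example `CLUSTER ADDSLOTS`.

```python
SLOT_COUNT = 16384  # total number of hash slots in a Redis Cluster


def allocate_slots(weights: dict) -> dict:
    """Split the slot range into contiguous blocks proportional to each weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    ranges = {}
    start = 0
    cumulative = 0
    for node, weight in weights.items():
        cumulative += weight
        # Integer arithmetic on the cumulative weight guarantees that the
        # last block ends exactly at SLOT_COUNT, with no gaps or overlaps.
        end = SLOT_COUNT * cumulative // total
        ranges[node] = range(start, end)
        start = end
    return ranges


if __name__ == "__main__":
    # Hypothetical cluster: node-a has twice the capacity of the others.
    for node, slots in allocate_slots({"node-a": 2, "node-b": 1, "node-c": 1}).items():
        print(node, slots.start, slots.stop - 1, len(slots))
```

A split like this balances slot counts, not data volume: if the keys that carry most of the data happen to fall in one node's range, the skew described above still appears.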
Data skew caused by Hash Tags
What is a Hash Tag?
A Hash Tag is a pair of curly braces placed in the key of a key-value pair, enclosing part of the key. When the client computes the CRC16 hash of the key, it hashes only the content inside the braces; if the key has no braces, it hashes the whole key.
For example, if the key is user:order:{1001}, the content inside the braces, 1001, is the Hash Tag, and the CRC16 value is computed over 1001 rather than over user:order:{1001}.
The point of a Hash Tag is that different keys with the same Hash Tag map to the same hash slot, and therefore always live on the same instance.
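The Hash Tag rule can be sketched as follows. The sketch follows the rule in the Redis Cluster specification as I understand it: only the text between the first `{` and the first `}` after it is hashed, and an empty `{}` means the whole key is hashed. It again assumes that `binascii.crc_hqx` with an initial value of 0 matches Redis's CRC16. The helper names and the key user:profile:{1001} are made up for the example.

```python
import binascii

SLOT_COUNT = 16384  # total number of hash slots in a Redis Cluster


def hash_tag(key: str) -> str:
    """Return the part of the key that is hashed."""
    start = key.find("{")
    if start == -1:
        return key  # no opening brace: hash the whole key
    end = key.find("}", start + 1)
    if end == -1 or end == start + 1:
        return key  # no closing brace, or empty "{}": hash the whole key
    return key[start + 1:end]


def key_slot(key: str) -> int:
    # CRC-16/XMODEM of the hashed part, reduced to a slot number.
    return binascii.crc_hqx(hash_tag(key).encode("utf-8"), 0) % SLOT_COUNT


if __name__ == "__main__":
    # Both keys share the tag 1001, so they map to the same slot.
    for key in ("user:order:{1001}", "user:profile:{1001}", "user:order:1001"):
        print(key, "->", hash_tag(key), "->", key_slot(key))
```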
When are Hash Tags typically used?
==Redis Cluster does not support cross-instance transactions or range queries==. If the data a query needs is spread over different instances, the business code has to query the instances one by one to assemble the result, which is clearly much less efficient. A Hash Tag can be used to gather the data a business operation needs into one hash slot, which makes transactions and range queries easy to implement.
Used blindly, Hash Tags can pile a large amount of data onto one instance and cause data volume skew, so there is a trade-off between efficiency and data skew. If Hash Tags cause data skew, avoiding the skew should take priority and it is better not to use them: transactions and range queries can be handled on the client side, whereas data skew can make an instance unstable.
Data access skew
The root cause of data access skew is hot data, such as product information in an e-commerce flash sale (seckill). Here the pressure on the instance may come from access to just a few keys, so reallocating hash slots does not solve the problem. Instead, we can keep multiple copies of the data.
What does multiple copies mean? The hot data is copied several times, and a random prefix is added to the key of each copy so that the copies are not mapped to the same hash slot.
For example, suppose the cluster has 8 instances and the business key is abc. The keys for instances 0 to 7 are 0abc, 1abc, 2abc, 3abc, … 7abc. To read abc, the client generates a random number from 0 to 7, prepends it to abc, and sends the request for that key to Redis.
Note that the hash slots of these keys must be allocated to different instances for this to work. The central idea of the solution is to distribute the hot data across every instance and share the pressure.
Another requirement is that the data must be read-only. If the data is writable, every copy also has to be kept in sync, and the extra resource cost makes the approach unsuitable.
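The client side of the multi-copy approach might look like the sketch below. It is a minimal illustration under stated assumptions: 8 copies, the key abc from the example above, and hypothetical helper names (`copy_keys`, `pick_copy`). It builds key names and computes their slots but does not talk to a Redis server, and it assumes `binascii.crc_hqx` with an initial value of 0 matches Redis's CRC16.

```python
import binascii
import random

SLOT_COUNT = 16384  # total number of hash slots in a Redis Cluster
COPIES = 8          # assumed number of copies, one per instance


def key_slot(key: str) -> int:
    # Whole-key CRC-16/XMODEM; the keys used here contain no hash tag.
    return binascii.crc_hqx(key.encode("utf-8"), 0) % SLOT_COUNT


def copy_keys(base_key: str, copies: int = COPIES) -> list:
    # Keys under which the hot value is stored: 0abc, 1abc, ..., 7abc.
    return [f"{i}{base_key}" for i in range(copies)]


def pick_copy(base_key: str, copies: int = COPIES) -> str:
    # On each read, choose one copy at random to spread the load.
    return f"{random.randrange(copies)}{base_key}"


if __name__ == "__main__":
    # Print the slot of every copy so the slot-to-instance mapping can be checked.
    for key in copy_keys("abc"):
        print(key, "-> slot", key_slot(key))
    print("read from:", pick_copy("abc"))
```

The prefixes alone do not guarantee that the copies land on different instances, so the printed slots should be compared with the cluster's slot allocation before relying on the scheme.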