Data skew analysis of a Redis sharded cluster

2022-06-23 01:26:00 【ZhanLi】

How to deal with data skew in Redis

What is data skew

If a Redis deployment uses a sharded cluster such as Redis Cluster or Codis, data is distributed across the instances and stored there according to certain rules.

There are two kinds of data skew:

1. Data volume skew: in some cases data is unevenly distributed across instances, and one instance holds much more data than the others.

2. Data access skew: the amount of data on each instance is roughly the same, but one instance holds hot data that is accessed very frequently.

When data skew occurs, the instance nodes that hold more data or receive more requests carry a heavier load and respond more slowly. In severe cases they run out of memory and crash.

Data volume skew

Data volume skew means that data is unevenly distributed across instances, with a large share of it on one instance.

There are three main causes of data volume skew:

1. bigkeys;

2. Uneven Slot distribution;

3. Hash Tags.

Let's look at them one by one.

Skew caused by bigkeys

What is a bigkey: a key whose value is very large, or whose collection has a very large number of members or list elements, is called a bigkey. For example:

A STRING key whose value is 5MB (the value is too large)
A LIST key with 20000 elements (too many elements)
A ZSET key with 10000 members (too many members)
A HASH key with only 1000 members whose values total 100MB (the members are too large)

If an instance holds bigkeys, the data in the cluster may become skewed.

Problems caused by bigkeys

Uneven memory usage: in a sharded cluster, bigkeys easily lead to uneven memory usage across instance nodes;
Network congestion: reading a bigkey consumes more network traffic and may affect the Redis server;
Slow expiry and deletion: bigkeys are slow not only to read and write but also to delete, so deleting an expired bigkey is time consuming;
Hard to migrate: because of the large amount of data, backup and restore are prone to blocking and failure;

How to avoid bigkeys

bigkeys can be handled in the following two ways:

1. Optimize the data structure

1. Compress large values;
2. Split collections: split a large collection into smaller collections (for example, sharded by time) or into individual keys.

2. Store bigkeys with another technology;

Use another form of storage, such as a CDN or a document database like MongoDB.
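As a rough illustration of splitting a collection, the sketch below routes each field of one large hash to one of several smaller hashes. The names sub_hash_key and split_hash, the base key "profile", and the bucket count of 16 are assumptions made for this example; they are not part of Redis or of any client library.

```python
import zlib


def sub_hash_key(base_key: str, field: str, buckets: int = 16) -> str:
    """Return the name of the smaller hash that should hold `field`."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(field.encode("utf-8")) % buckets
    return f"{base_key}:{bucket}"


def split_hash(base_key: str, items: dict, buckets: int = 16) -> dict:
    """Group field/value pairs by the sub-hash they would be written to."""
    groups = {}
    for field, value in items.items():
        groups.setdefault(sub_hash_key(base_key, field, buckets), {})[field] = value
    return groups


# A hypothetical big hash with 1000 fields, split into at most 16 smaller ones.
big = {f"user:{i}": i for i in range(1000)}
parts = split_hash("profile", big, buckets=16)
print(len(parts), max(len(p) for p in parts.values()))
```

Because the sub-hash names are different keys, a sharded cluster can place them on different instances. A reader only needs the field name to recompute which sub-hash to query.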

Skew caused by uneven Slot distribution

Redis Cluster, for example, assigns data to instances through Slots:

1. Redis Cluster uses hash slots to distribute keys across instances. A sharded cluster has 16384 hash slots, which work like data partitions; each key-value pair is mapped to a hash slot based on its key;

2. For a key, a 16-bit value is first computed with the CRC16 algorithm; this 16-bit value is then taken modulo 16384, giving a number in the range 0~16383, and each number identifies the hash slot with that number.

3. The hash slots are then allocated across all instances. For example, with N instances and an even allocation, each instance holds about 16384/N slots.
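The three steps above can be sketched as follows. This is a minimal illustration: it uses the XMODEM variant of CRC16 (polynomial 0x1021, initial value 0) named in the Redis Cluster specification and hashes the whole key, with no Hash Tag handling. The even_slot_ranges helper is a hypothetical way to split the slots evenly and is not necessarily the exact split a real cluster tool produces.

```python
HASH_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0, computed bit by bit."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a whole key to a slot number in 0..16383."""
    return crc16_xmodem(key.encode("utf-8")) % HASH_SLOTS


def even_slot_ranges(n: int) -> list:
    """Split 0..16383 into n contiguous ranges whose sizes differ by at most one."""
    base, extra = divmod(HASH_SLOTS, n)
    ranges, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


print(key_slot("foo"))
print(even_slot_ranges(3))
```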

If Slots are unevenly distributed, some instances hold much more data than others, which leads to data skew.

When this happens, we can use the migration commands to move Slots to other instances.
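Before migrating anything, it helps to see which instances hold more than their share of Slots. The sketch below assumes the slot ranges of each node have already been collected (for example from the output of CLUSTER NODES) into a plain dictionary. The node names, the layout, and the 10% tolerance are made up for illustration. Slot count is only a proxy for data volume, since Slots can hold very different amounts of data.

```python
def slot_counts(assignments: dict) -> dict:
    """assignments maps a node name to its list of inclusive (start, end) slot ranges."""
    return {
        node: sum(end - start + 1 for start, end in ranges)
        for node, ranges in assignments.items()
    }


def overloaded_nodes(assignments: dict, tolerance: float = 0.10) -> dict:
    """Return nodes holding more than (1 + tolerance) times the even share of slots."""
    counts = slot_counts(assignments)
    even_share = sum(counts.values()) / len(counts)
    return {node: n for node, n in counts.items() if n > even_share * (1 + tolerance)}


# Hypothetical layout in which node-a owns far more slots than the others.
layout = {
    "node-a": [(0, 9999)],
    "node-b": [(10000, 13191)],
    "node-c": [(13192, 16383)],
}
print(slot_counts(layout))
print(overloaded_nodes(layout))
```

Nodes reported here are candidates for moving Slots away, for instance with redis-cli --cluster reshard or redis-cli --cluster rebalance.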

Skew caused by Hash Tags

Hash Tags are used in a Redis cluster to store data that shares a certain fixed feature on the same instance. A Hash Tag is written by adding a pair of {} to the key, for example test{1}.

With a Hash Tag, the client computes the CRC16 of only the content inside {}. Without a Hash Tag, the client computes the CRC16 of the whole key.

Key	Hash computation	Corresponding Slot
user:info:{3231}	CRC16('3231') mod 16384	1024
user:info:{5328}	CRC16('5328') mod 16384	3210
user:order:{3231}	CRC16('3231') mod 16384	1024
user:order:{5328}	CRC16('5328') mod 16384	3210

In this way, a Hash Tag keeps data with a certain fixed feature on one instance, so there is no need to query the instances in the cluster one by one.
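The Hash Tag rule can be sketched as below. This is an illustration rather than the code of any particular client. It follows the rule in the Redis Cluster specification: only the text between the first { and the first } after it is hashed, and an empty {} is ignored. Python's binascii.crc_hqx with an initial value of 0 is used as the CRC16 function, and hash_tag_part is a name chosen for this example.

```python
from binascii import crc_hqx


def hash_tag_part(key: str) -> str:
    """Return the part of the key that is hashed: the Hash Tag content, or the whole key."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        # An empty tag such as "{}" is ignored and the whole key is hashed.
        if end != -1 and end > start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    # crc_hqx uses the 0x1021 polynomial; with initial value 0 it is CRC16 XMODEM.
    return crc_hqx(hash_tag_part(key).encode("utf-8"), 0) % 16384


for k in ("user:info:{3231}", "user:order:{3231}", "user:info:{5328}"):
    print(k, key_slot(k))
```

The point to check is that keys sharing a tag print the same slot. The numbers printed by this sketch need not match the slot values listed in the table above.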

For example, Redis Cluster and Codis do not support cross-instance transactions or range queries. When a business application needs them and the data is spread over several instances, it has to read the data into the business layer for transaction processing, or query each instance one by one to assemble the range query result.

The potential problem with Hash Tags is that a large amount of data may be mapped to the same instance, which skews the cluster's data and unbalances its load.

So evaluate carefully before using Hash Tags: can the business requirement be met without them? If they are unavoidable, estimate the amount of data involved and try to avoid data skew.

Data access skew

The amount of data on each cluster instance is roughly the same, but one instance holds hot data that is accessed very frequently. This is data access skew.

The main culprit of data access skew is the Hot Key.

In a sharded cluster, each key ends up on one fixed Redis instance. If one key receives far more accesses than the others over a period of time, the Redis instance holding that key receives excessive request traffic. That instance is prone to overload and stalls, and may even go down.

Common situations that produce Hot Keys:

1. Hot news events;

2. Bargain items in a flash sale;

How to discover Hot Keys

1. Estimate in advance;

Predict Hot Keys in advance based on business experience;

2. Collect on the client side;

Add command statistics in the client to find Hot Keys;

3. Use Redis built-in commands;

Use the monitor command to count hot keys (not recommended: under high concurrency it risks blowing up Redis memory);

The hotkeys option: since Redis 4.0.3, redis-cli provides a hot key discovery feature, enabled by running redis-cli with the --hotkeys option. However, if there are many keys, it runs slowly.

4. Collect at the Proxy layer

If the cluster architecture includes a proxy, statistics can be gathered in the proxy.

5. Capture and analyze packets yourself

Redis clients talk to the server over TCP using the RESP protocol. You can write your own program that listens on the port, parses the data according to the RESP rules, and analyzes it. The drawbacks are high development cost, difficult maintenance, and possible packet loss.
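To give a feel for the parsing involved in the last approach, here is a small sketch that decodes client requests in RESP (an array of bulk strings) and counts how often each key appears. It assumes the captured bytes contain only complete commands and that the first argument of each command is the key, which does not hold for every Redis command. The function names and the sample bytes are made up for illustration.

```python
from collections import Counter


def parse_resp_command(buf: bytes):
    """Parse one RESP array of bulk strings; return (args, bytes_consumed)."""
    def read_line(pos):
        end = buf.index(b"\r\n", pos)
        return buf[pos:end], end + 2

    header, pos = read_line(0)
    if not header.startswith(b"*"):
        raise ValueError("expected a RESP array")
    args = []
    for _ in range(int(header[1:])):
        length_line, pos = read_line(pos)
        if not length_line.startswith(b"$"):
            raise ValueError("expected a bulk string")
        length = int(length_line[1:])
        args.append(buf[pos:pos + length])
        pos += length + 2  # skip the payload and its trailing CRLF
    return args, pos


def count_keys(stream: bytes) -> Counter:
    """Count the first argument (assumed to be the key) of every command in the stream."""
    counts, pos = Counter(), 0
    while pos < len(stream):
        args, used = parse_resp_command(stream[pos:])
        if len(args) >= 2:
            counts[args[1].decode()] += 1
        pos += used
    return counts


# Hand-written sample traffic: GET foo, SET bar 1, GET foo.
captured = (
    b"*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n"
    b"*3\r\n$3\r\nSET\r\n$3\r\nbar\r\n$1\r\n1\r\n"
    b"*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n"
)
print(count_keys(captured).most_common(1))
```

The same Counter-based tally would also fit client-side collection, where the client already knows the command and key and no parsing is needed.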

How to solve Hot Keys

Once a Hot Key has been identified, how should it be handled?

1. Spread the key out;

For example:

Suppose there is a hot key named Hot-key-test. It can be split into Hot-key-test1, Hot-key-test2... and these keys are distributed across multiple instance nodes. When a client accesses the data, it picks a key with a random suffix, so traffic is spread across different instances and no single cache node is overloaded.

In general, you can add a suffix or prefix so that one hot key becomes N * M keys, where N is the number of Redis instances and M is a multiplier. Access to one Redis key becomes access to N * M Redis keys, which are sharded across different instances and spread the traffic over all of them.

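// Number of copies: a multiple of the instance count N
const M = N * 2
// Generate a random number
random = GenRandom(0, M)
// Build the key of the chosen copy
bakHotKey = hotKey + "_" + random
data = redis.GET(bakHotKey)
if data == NULL {
    // Cache miss: load from the database and write the copy back
    data = GetFromDB()
    // Random jitter on the expiry keeps the copies from expiring together
    redis.SET(bakHotKey, data, expireTime + GenRandom(0,5))
}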
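To see why the copies help, the sketch below computes the hash slot of each copy of a hot key. It assumes the copies are named with a numeric suffix after an underscore, as in the pseudocode above, and uses binascii.crc_hqx with an initial value of 0 as the CRC16 function. The helper names copy_keys and pick_copy are hypothetical.

```python
import random
from binascii import crc_hqx


def copy_keys(hot_key: str, copies: int) -> list:
    """Names of the copies of one hot key: hot_key_0 .. hot_key_{copies-1}."""
    return [f"{hot_key}_{i}" for i in range(copies)]


def key_slot(key: str) -> int:
    # CRC16 (XMODEM variant) of the whole key, modulo the 16384 hash slots.
    return crc_hqx(key.encode("utf-8"), 0) % 16384


def pick_copy(hot_key: str, copies: int, rng=random) -> str:
    """Choose one copy at random for a read."""
    return f"{hot_key}_{rng.randrange(copies)}"


keys = copy_keys("Hot-key-test", 8)
slots = {k: key_slot(k) for k in keys}
print(slots)
print(pick_copy("Hot-key-test", 8))
```

Each copy lands in a different slot, but different slots can still belong to the same instance. Whether the traffic really spreads over all instances depends on how the slots are assigned, which is why the text suggests using several copies per instance.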

2. Use a local cache;

The business side can also cache hot keys locally to reduce the load on the remote cache.
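A local cache can be as simple as a dictionary with an expiry time per entry. The sketch below is one possible shape, not a production cache. The LocalCache class and fetch_from_redis are names invented for this example, and fetch_from_redis stands in for a real remote read.

```python
import time


class LocalCache:
    """A tiny in-process cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) on a miss or after expiry."""
        hit = self.entries.get(key)
        now = self.clock()
        if hit is not None and hit[1] > now:
            return hit[0]
        value = loader(key)
        self.entries[key] = (value, now + self.ttl)
        return value


remote_calls = []


def fetch_from_redis(key):
    # Stand-in for a real remote read such as a Redis GET.
    remote_calls.append(key)
    return f"value-of-{key}"


cache = LocalCache(ttl_seconds=60.0)
for _ in range(1000):
    cache.get("Hot-key-test", fetch_from_redis)
print(len(remote_calls))
```

A short TTL bounds how stale the local copy can be. Like the multi-copy approach, this fits data that is read far more often than it is written.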

One thing to note: the multi-copy approach for hot data only suits read-only hot data. If the hot data is both read and written, the multi-copy approach is not suitable, because keeping the copies consistent incurs extra cost.

For hot data that is both read and written, we have to add resources to the instance itself, for example by using a higher-spec machine, to handle the heavy access load.

Summary

1. There are two kinds of data skew;

1. Data volume skew: in some cases data is unevenly distributed across instances, and one instance holds much more data than the others.
2. Data access skew: the amount of data on each instance is roughly the same, but one instance holds hot data that is accessed very frequently.

2. There are three main causes of data volume skew;