Data system partition design - partition rebalancing
2022-07-25 15:47:00 【JavaEdge】
As a business grows rapidly, the database has to change with it:
- Query load increases, requiring more CPU to handle it
- Data volume grows, requiring more disk and memory to store it
- Nodes may fail, requiring other nodes to take over their load

All of these changes require data and requests to be moved from one node to another. This process of shifting load from one node in the cluster to another is called rebalancing. Whatever the partitioning scheme, rebalancing should usually satisfy at least the following:
- After rebalancing, load, data storage, and read/write requests should be distributed more evenly across the cluster
- While rebalancing is in progress, the database should continue to serve reads and writes normally
- Unnecessary data movement should be avoided, both to speed up rebalancing and to minimize the impact on network and disk I/O
4.1 Rebalancing strategies
4.1.1 A cautionary tale: hash mod N
As mentioned for Figure 3, it is best to divide the range of hash values into intervals and assign each interval to a partition.
Then why not simply use mod (the % operator in Java)? For example, hash(key) mod 10 returns a number between 0 and 9. If we have 10 nodes numbered 0 to 9, this seems like the easiest way to assign each key to a node.
The problem is that if the number of nodes N changes, most keys will have to move from one node to another. Suppose hash(key) = 123456. With 10 nodes, the key starts out on node 6 (because 123456 mod 10 = 6):
- When the cluster grows to 11 nodes, the key must move to node 3 (because 123456 mod 11 = 3)
- When the cluster grows to 12 nodes, the key must move to node 0 (because 123456 mod 12 = 0)

Such frequent migration makes rebalancing excessively expensive.
The key, therefore, is to minimize the amount of data that has to be moved.
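The cost of hash mod N is easy to measure with a quick simulation. Below is a minimal Python sketch (the key set and the `node_for` helper are made up for illustration; a stable digest stands in for hash(key), since Python's built-in hash() is randomized per process):

```python
import hashlib

def node_for(key: str, n_nodes: int) -> int:
    # Deterministic stand-in for hash(key): a stable digest of the key.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % n_nodes

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if node_for(k, 10) != node_for(k, 11))
print(f"{moved / len(keys):.0%} of keys migrate")  # roughly 90%
```

Going from 10 to 11 nodes, a key stays put only if its hash happens to give the same remainder under both moduli, which is true for only about 1 in 11 keys, so nearly the entire dataset moves.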
4.1.2 A fixed number of partitions
Fortunately, there is a simple solution: create many more partitions than there are nodes, and assign several partitions to each node. For example, in a cluster of 10 nodes, the database might be logically divided into 1,000 partitions, so that about 100 partitions are assigned to each node.
If a new node joins the cluster, it can steal a few partitions from every existing node until the partitions are evenly distributed again (see Figure 6). If a node is removed from the cluster, the reverse happens.
Only entire partitions are migrated between nodes. The total number of partitions stays the same, and so does the mapping from keys to partitions; the only thing that changes is which node each partition lives on. The change is not instantaneous, since transferring data over the network takes time, so while the transfer is in progress the old partition continues to serve reads and writes.
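The key property of this scheme, that key → partition stays fixed while only partition → node changes, can be sketched as follows (the partition count, round-robin assignment, and function names are all hypothetical):

```python
import hashlib

N_PARTITIONS = 1000  # chosen when the DB is created, fixed thereafter

def partition_of(key: str) -> int:
    # key -> partition never changes, no matter how many nodes exist
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % N_PARTITIONS

# partition -> node assignment: 1,000 partitions spread over 10 nodes
assignment = {p: p % 10 for p in range(N_PARTITIONS)}

def add_node(assignment: dict, new_node: int) -> int:
    """The new node steals partitions until shares are even again."""
    n_nodes = new_node + 1
    target = len(assignment) // n_nodes   # fair share per node
    moved = 0
    for p in sorted(assignment):
        if moved >= target:
            break
        if assignment[p] != new_node:
            assignment[p] = new_node      # whole partitions move; keys never remap
            moved += 1
    return moved

moved = add_node(assignment, new_node=10)
print(moved)  # 90: only ~1/11 of the partitions migrate
```

Contrast this with the hash mod N approach: adding the 11th node here moves only about 9% of the data instead of about 90%.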
In principle, you can also account for heterogeneous hardware in the cluster: assign more partitions to more powerful nodes so that they take a larger share of the load. Elasticsearch and Couchbase use this kind of rebalancing.
With this strategy, the number of partitions is usually fixed when the database is first created and never changes afterward. Although partitions could in principle be split or merged, a fixed number of partitions makes operations simpler, so many fixed-partition databases choose not to support splitting. The partition count set at initialization is therefore the maximum number of nodes you can ever have, so you should plan for future growth and set it high enough. On the other hand, each partition carries some management overhead, so setting the number too high has side effects as well.
If the total size of the dataset is hard to predict (it may start small but grow dramatically over time), choosing the right number of partitions is difficult. Since the number of partitions is fixed, the actual size of each partition grows in proportion to the total amount of data in the cluster:
- If partitions become very large, rebalancing and recovery from node failures become very expensive
- If partitions are too small, there are too many of them, and the overhead adds up

Partition sizes should be "just right"; with a fixed number of partitions but a widely varying total data volume, it is hard to hit that sweet spot.
4.1.3 Dynamic partitioning
For databases that partition by key range, a badly chosen set of boundaries could leave all the data crowded into one partition while the others sit nearly empty. Fixed boundaries with a fixed number of partitions would then be very inconvenient: manually reconfiguring partition boundaries is tedious.
For this reason, key-range-partitioned databases such as HBase create partitions dynamically:
- When a partition grows beyond a configured threshold (10 GB by default in HBase), it is split into two partitions, each taking roughly half of the data
- Conversely, if a lot of data is deleted and a partition shrinks below some threshold, it is merged with an adjacent partition
This is somewhat similar to how a B-tree splits its nodes.
Each partition is assigned to one node, and each node can host multiple partitions, just as with a fixed number of partitions. After a large partition splits, one of its halves can be transferred to another node to balance the load. In HBase, partition files are transferred through HDFS, the underlying distributed file system.
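The split behavior described above can be sketched in a few lines of Python (a toy in-memory model: the class and function names are made up, and a tiny threshold of 10 keys stands in for HBase's 10 GB default):

```python
SPLIT_THRESHOLD = 10  # toy stand-in for HBase's default of 10 GB

class Partition:
    def __init__(self, lo: str, hi: str):
        self.lo, self.hi = lo, hi   # half-open key range [lo, hi)
        self.keys: list[str] = []

partitions = [Partition("a", "z")]  # an empty DB starts with one partition

def insert(key: str) -> None:
    p = next(p for p in partitions if p.lo <= key < p.hi)
    p.keys.append(key)
    if len(p.keys) >= SPLIT_THRESHOLD:
        # split at the median key, so each half keeps ~half the data
        mid = sorted(p.keys)[len(p.keys) // 2]
        left, right = Partition(p.lo, mid), Partition(mid, p.hi)
        left.keys = [k for k in p.keys if k < mid]
        right.keys = [k for k in p.keys if k >= mid]
        i = partitions.index(p)
        partitions[i:i + 1] = [left, right]

for k in "badcfehgji":  # the 10th insert triggers a split
    insert(k)
print([(p.lo, p.hi) for p in partitions])  # [('a', 'f'), ('f', 'z')]
```

After the split, either half could be migrated to another node, while the key ranges themselves continue to route lookups correctly.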
One advantage of dynamic partitioning is that the number of partitions automatically adapts to the total data volume:
- With only a small amount of data, a few partitions suffice, and the overhead stays small
- With a huge amount of data, the size of each individual partition is capped at a configurable maximum
However, an empty database starts out with a single partition, because there is no prior information from which to derive partition boundaries. The dataset may start small, and until it grows past the first split point, all writes must be handled by a single node while the other nodes sit idle. To mitigate this, HBase and MongoDB allow an initial set of partitions to be configured on an empty database (pre-splitting). With key-range partitioning, pre-splitting requires knowing the distribution of keys in advance.
Dynamic partitioning is not limited to key-range partitioning; it can equally be applied to hash partitioning. Since version 2.4, MongoDB supports both range and hash partitioning, and both can be partitioned dynamically.
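Pre-splitting can be sketched as deriving boundary keys from a sample of the expected key distribution (the helper name and the evenly-spaced-quantile strategy below are illustrative, not any particular database's API):

```python
def presplit_boundaries(sample_keys: list[str], n_partitions: int) -> list[str]:
    """Pick n_partitions - 1 boundary keys at evenly spaced quantiles
    of the sampled key distribution."""
    s = sorted(sample_keys)
    step = len(s) // n_partitions
    return [s[i * step] for i in range(1, n_partitions)]

# e.g. a sample of the keys we expect the application to write
sample = [f"user{i:04d}" for i in range(1000)]
print(presplit_boundaries(sample, 4))  # ['user0250', 'user0500', 'user0750']
```

With these boundaries configured up front, writes are spread over several nodes from day one instead of hammering a single initial partition.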
4.1.4 Partitioning proportionally to the number of nodes
- With dynamic partitioning, the number of partitions is proportional to the size of the dataset, since splitting and merging keep each partition's size between a fixed minimum and maximum
- With a fixed number of partitions, the size of each partition is proportional to the size of the dataset

In both cases, the number of partitions is independent of the number of nodes.
Cassandra adopts a third option: make the number of partitions proportional to the number of nodes in the cluster, i.e. a fixed number of partitions per node. In this case, the size of each partition grows in proportion to the dataset while the number of nodes stays unchanged, but when you add nodes the partitions shrink again. Since a larger data volume generally requires more nodes to store it, this approach also keeps the size of each partition fairly stable.
When a new node joins the cluster, it randomly picks a fixed number of existing partitions to split, taking ownership of half of each of those partitions and leaving the other half in place. Random selection can produce unfair splits, but averaged over a larger number of partitions (Cassandra defaults to 256 partitions per node), the new node ends up taking a fair share of the load from the existing nodes. Cassandra 3.0 introduced an improved rebalancing algorithm that avoids unfair splits.
Picking partition boundaries randomly requires hash-based partitioning, so that the boundaries can be chosen from the range of numbers produced by the hash function. This approach also corresponds most closely to the original definition of consistent hashing.
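A toy model of this scheme (token counts, ring size, and names are made up; Cassandra's real default is 256 tokens per node): each node claims a fixed number of random positions on the hash ring, and each new position splits whichever existing range it lands in:

```python
import random

TOKENS_PER_NODE = 8   # toy value; Cassandra defaults to 256 per node
RING = 2 ** 32        # size of the hash ring (assumption for the sketch)

random.seed(42)       # deterministic for the example
ring: dict[int, str] = {}  # token (ring position) -> owning node

def add_node(node_id: str) -> None:
    # Each random token splits the existing range it lands in,
    # handing roughly half of that range to the new node.
    for _ in range(TOKENS_PER_NODE):
        ring[random.randrange(RING)] = node_id

for n in ["n1", "n2", "n3"]:
    add_node(n)

counts: dict[str, int] = {}
for node in ring.values():
    counts[node] = counts.get(node, 0) + 1
print(counts)  # each node owns 8 tokens, hence ~equal expected load
```

Because every node owns the same number of tokens, each node's expected share of the ring (and thus of the data) is equal, and adding a node simply shrinks everyone's ranges a little.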
4.2 Operations: manual or automatic rebalancing
Should rebalancing be performed automatically or manually?
There is a spectrum between fully automatic rebalancing (the system decides on its own when to migrate partitions from one node to another, with no human intervention) and fully manual (the partition-to-node mapping is explicitly configured by an administrator). For example, Couchbase automatically generates a suggested partition assignment, but an administrator must confirm it before it takes effect.
Fully automatic rebalancing is convenient, since it requires little operational work beyond normal maintenance, but it can be unpredictable. Rebalancing is an expensive operation: it requires rerouting requests and migrating large volumes of data from one node to another. If it goes wrong, it can overload the network or the nodes and degrade the performance of other requests.
Combining automatic rebalancing with automatic failure detection can also be dangerous. Suppose one node is overloaded and temporarily slow to respond. The other nodes conclude that the overloaded node has failed and automatically rebalance the cluster to move load away from it. Objectively, this puts even more load on the struggling node, on the other nodes, and on the network, making the situation worse and potentially triggering a cascading failure.
For this reason, it is usually wise to keep a human in the rebalancing loop. This is somewhat slower than a fully automatic process, but it can effectively prevent surprises.