Data system partition design - partition rebalancing
2022-07-25 15:47:00 【JavaEdge】
As a business grows rapidly, the database has to change with it:
- Query load increases, requiring more CPU to handle it
- Data volume grows, requiring more disk and memory to store it
- Nodes may fail, requiring other nodes to take over their load

All of these changes require data and requests to be moved from one node to another. This process of shifting load from one node in the cluster to another is called rebalancing. Whatever the partitioning scheme, rebalancing should usually satisfy at least the following:
- After rebalancing, load, data storage, and read/write requests should be distributed more evenly across the cluster
- While rebalancing is in progress, the database should continue to serve reads and writes normally
- Unnecessary data movement should be avoided, both to speed up rebalancing and to minimize the impact on network and disk I/O
4.1 Rebalancing strategies
4.1.1 A cautionary tale: hash mod N
As mentioned for Figure 3, it is best to divide the range of hash values into intervals and assign each interval to a partition.
Then why not simply use mod (the % operator in Java)? For example, hash(key) mod 10 returns a number between 0 and 9. If we have 10 nodes numbered 0 to 9, this seems like the easiest way to assign each key to a node.
The problem is that if the number of nodes N changes, most keys will have to move from one node to another. Suppose hash(key) = 123456. With 10 nodes, the key starts out on node 6 (because 123456 mod 10 = 6):
- When the cluster grows to 11 nodes, the key must move to node 3 (because 123456 mod 11 = 3)
- When the cluster grows to 12 nodes, the key must move to node 0 (because 123456 mod 12 = 0)

Such frequent migration makes rebalancing excessively expensive.
The key, therefore, is to minimize the amount of data that has to be moved.
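The cost of hash mod N is easy to measure with a quick simulation. Below is a minimal Python sketch (the key set and the `node_for` helper are made up for illustration; a stable digest stands in for hash(key), since Python's built-in hash() is randomized per process):

```python
import hashlib

def node_for(key: str, n_nodes: int) -> int:
    # Deterministic stand-in for hash(key): a stable digest of the key.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % n_nodes

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if node_for(k, 10) != node_for(k, 11))
print(f"{moved / len(keys):.0%} of keys migrate")  # roughly 90%
```

Going from 10 to 11 nodes, a key stays put only if its hash happens to give the same remainder under both moduli, which is true for only about 1 in 11 keys, so nearly the entire dataset moves.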
4.1.2 A fixed number of partitions
Fortunately, there is a simple solution: create many more partitions than there are nodes, and assign several partitions to each node. For example, in a cluster of 10 nodes, the database might be logically divided into 1,000 partitions, so that about 100 partitions are assigned to each node.
If a new node joins the cluster, it can steal a few partitions from every existing node until the partitions are evenly distributed again (see Figure 6). If a node is removed from the cluster, the reverse happens.
Only entire partitions are migrated between nodes. The total number of partitions stays the same, and so does the mapping from keys to partitions; the only thing that changes is which node each partition lives on. The change is not instantaneous, since transferring data over the network takes time, so while the transfer is in progress the old partition continues to serve reads and writes.
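The key property of this scheme, that key → partition stays fixed while only partition → node changes, can be sketched as follows (the partition count, round-robin assignment, and function names are all hypothetical):

```python
import hashlib

N_PARTITIONS = 1000  # chosen when the DB is created, fixed thereafter

def partition_of(key: str) -> int:
    # key -> partition never changes, no matter how many nodes exist
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % N_PARTITIONS

# partition -> node assignment: 1,000 partitions spread over 10 nodes
assignment = {p: p % 10 for p in range(N_PARTITIONS)}

def add_node(assignment: dict, new_node: int) -> int:
    """The new node steals partitions until shares are even again."""
    n_nodes = new_node + 1
    target = len(assignment) // n_nodes   # fair share per node
    moved = 0
    for p in sorted(assignment):
        if moved >= target:
            break
        if assignment[p] != new_node:
            assignment[p] = new_node      # whole partitions move; keys never remap
            moved += 1
    return moved

moved = add_node(assignment, new_node=10)
print(moved)  # 90: only ~1/11 of the partitions migrate
```

Contrast this with the hash mod N approach: adding the 11th node here moves only about 9% of the data instead of about 90%.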
In principle, you can also account for heterogeneous hardware in the cluster: assign more partitions to more powerful nodes so that they take a larger share of the load. Elasticsearch and Couchbase use this kind of rebalancing.
With this strategy, the number of partitions is usually fixed when the database is first created and never changes afterward. Although partitions could in principle be split or merged, a fixed number of partitions makes operations simpler, so many fixed-partition databases choose not to support splitting. The partition count set at initialization is therefore the maximum number of nodes you can ever have, so you should plan for future growth and set it high enough. On the other hand, each partition carries some management overhead, so setting the number too high has side effects as well.
If the total size of the dataset is hard to predict (it may start small but grow dramatically over time), choosing the right number of partitions is difficult. Since the number of partitions is fixed, the actual size of each partition grows in proportion to the total amount of data in the cluster:
- If partitions become very large, rebalancing and recovery from node failures become very expensive
- If partitions are too small, there are too many of them, and the overhead adds up

Partition sizes should be "just right"; with a fixed number of partitions but a widely varying total data volume, it is hard to hit that sweet spot.
4.1.3 Dynamic partitioning
For databases that partition by key range, a badly chosen set of boundaries could leave all the data crowded into one partition while the others sit nearly empty. Fixed boundaries with a fixed number of partitions would then be very inconvenient: manually reconfiguring partition boundaries is tedious.
For this reason, key-range-partitioned databases such as HBase create partitions dynamically:
- When a partition grows beyond a configured threshold (10 GB by default in HBase), it is split into two partitions, each taking roughly half of the data
- Conversely, if a lot of data is deleted and a partition shrinks below some threshold, it is merged with an adjacent partition
This is somewhat similar to how a B-tree splits its nodes.
Each partition is assigned to one node, and each node can host multiple partitions, just as with a fixed number of partitions. After a large partition splits, one of its halves can be transferred to another node to balance the load. In HBase, partition files are transferred through HDFS, the underlying distributed file system.
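The split behavior described above can be sketched in a few lines of Python (a toy in-memory model: the class and function names are made up, and a tiny threshold of 10 keys stands in for HBase's 10 GB default):

```python
SPLIT_THRESHOLD = 10  # toy stand-in for HBase's default of 10 GB

class Partition:
    def __init__(self, lo: str, hi: str):
        self.lo, self.hi = lo, hi   # half-open key range [lo, hi)
        self.keys: list[str] = []

partitions = [Partition("a", "z")]  # an empty DB starts with one partition

def insert(key: str) -> None:
    p = next(p for p in partitions if p.lo <= key < p.hi)
    p.keys.append(key)
    if len(p.keys) >= SPLIT_THRESHOLD:
        # split at the median key, so each half keeps ~half the data
        mid = sorted(p.keys)[len(p.keys) // 2]
        left, right = Partition(p.lo, mid), Partition(mid, p.hi)
        left.keys = [k for k in p.keys if k < mid]
        right.keys = [k for k in p.keys if k >= mid]
        i = partitions.index(p)
        partitions[i:i + 1] = [left, right]

for k in "badcfehgji":  # the 10th insert triggers a split
    insert(k)
print([(p.lo, p.hi) for p in partitions])  # [('a', 'f'), ('f', 'z')]
```

After the split, either half could be migrated to another node, while the key ranges themselves continue to route lookups correctly.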
One advantage of dynamic partitioning is that the number of partitions automatically adapts to the total data volume:
- With only a small amount of data, a few partitions suffice, and the overhead stays small
- With a huge amount of data, the size of each individual partition is capped at a configurable maximum
However, an empty database starts out with a single partition, because there is no prior information from which to derive partition boundaries. The dataset may start small, and until it grows past the first split point, all writes must be handled by a single node while the other nodes sit idle. To mitigate this, HBase and MongoDB allow an initial set of partitions to be configured on an empty database (pre-splitting). With key-range partitioning, pre-splitting requires knowing the distribution of keys in advance.
Dynamic partitioning is not limited to key-range partitioning; it can equally be applied to hash partitioning. Since version 2.4, MongoDB supports both range and hash partitioning, and both can be partitioned dynamically.
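Pre-splitting can be sketched as deriving boundary keys from a sample of the expected key distribution (the helper name and the evenly-spaced-quantile strategy below are illustrative, not any particular database's API):

```python
def presplit_boundaries(sample_keys: list[str], n_partitions: int) -> list[str]:
    """Pick n_partitions - 1 boundary keys at evenly spaced quantiles
    of the sampled key distribution."""
    s = sorted(sample_keys)
    step = len(s) // n_partitions
    return [s[i * step] for i in range(1, n_partitions)]

# e.g. a sample of the keys we expect the application to write
sample = [f"user{i:04d}" for i in range(1000)]
print(presplit_boundaries(sample, 4))  # ['user0250', 'user0500', 'user0750']
```

With these boundaries configured up front, writes are spread over several nodes from day one instead of hammering a single initial partition.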
4.1.4 Partitioning proportionally to the number of nodes
- With dynamic partitioning, the number of partitions is proportional to the size of the dataset, since splitting and merging keep each partition's size between a fixed minimum and maximum
- With a fixed number of partitions, the size of each partition is proportional to the size of the dataset

In both cases, the number of partitions is independent of the number of nodes.
Cassandra adopts a third option: make the number of partitions proportional to the number of nodes in the cluster, i.e. a fixed number of partitions per node. In this case, the size of each partition grows in proportion to the dataset while the number of nodes stays unchanged, but when you add nodes the partitions shrink again. Since a larger data volume generally requires more nodes to store it, this approach also keeps the size of each partition fairly stable.
When a new node joins the cluster, it randomly picks a fixed number of existing partitions to split, taking ownership of half of each of those partitions and leaving the other half in place. Random selection can produce unfair splits, but averaged over a larger number of partitions (Cassandra defaults to 256 partitions per node), the new node ends up taking a fair share of the load from the existing nodes. Cassandra 3.0 introduced an improved rebalancing algorithm that avoids unfair splits.
Picking partition boundaries randomly requires hash-based partitioning, so that the boundaries can be chosen from the range of numbers produced by the hash function. This approach also corresponds most closely to the original definition of consistent hashing.
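A toy model of this scheme (token counts, ring size, and names are made up; Cassandra's real default is 256 tokens per node): each node claims a fixed number of random positions on the hash ring, and each new position splits whichever existing range it lands in:

```python
import random

TOKENS_PER_NODE = 8   # toy value; Cassandra defaults to 256 per node
RING = 2 ** 32        # size of the hash ring (assumption for the sketch)

random.seed(42)       # deterministic for the example
ring: dict[int, str] = {}  # token (ring position) -> owning node

def add_node(node_id: str) -> None:
    # Each random token splits the existing range it lands in,
    # handing roughly half of that range to the new node.
    for _ in range(TOKENS_PER_NODE):
        ring[random.randrange(RING)] = node_id

for n in ["n1", "n2", "n3"]:
    add_node(n)

counts: dict[str, int] = {}
for node in ring.values():
    counts[node] = counts.get(node, 0) + 1
print(counts)  # each node owns 8 tokens, hence ~equal expected load
```

Because every node owns the same number of tokens, each node's expected share of the ring (and thus of the data) is equal, and adding a node simply shrinks everyone's ranges a little.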
4.2 Operations: manual or automatic rebalancing
Should rebalancing be performed automatically or manually?
There is a spectrum between fully automatic rebalancing (the system decides on its own when to migrate partitions from one node to another, with no human intervention) and fully manual (the partition-to-node mapping is explicitly configured by an administrator). For example, Couchbase automatically generates a suggested partition assignment, but an administrator must confirm it before it takes effect.
Fully automatic rebalancing is convenient, since it requires little operational work beyond normal maintenance, but it can be unpredictable. Rebalancing is an expensive operation: it requires rerouting requests and migrating large volumes of data from one node to another. If it goes wrong, it can overload the network or the nodes and degrade the performance of other requests.
Combining automatic rebalancing with automatic failure detection can also be dangerous. Suppose one node is overloaded and temporarily slow to respond. The other nodes conclude that the overloaded node has failed and automatically rebalance the cluster to move load away from it. Objectively, this puts even more load on the struggling node, on the other nodes, and on the network, making the situation worse and potentially triggering a cascading failure.
For this reason, it is usually wise to keep a human in the rebalancing loop. This is somewhat slower than a fully automatic process, but it can effectively prevent surprises.