当前位置:网站首页>B tree and b+ tree

B tree and b+ tree

2022-06-23 00:53:00 Fried cat


One 、 B Trees

1、B The concept of tree

B A tree is an absolutely balanced multiway lookup tree ,B The maximum number of subtrees of all nodes in the tree is called B The steps of the tree , use m Express . One m Step B If the tree is not null, The following properties must be satisfied :

  • The number of each node in the tree is m-1 Key words , At most m Kezi tree .
  • Except for the root node , All non leaf nodes contain at least [m/2]-1 Key words , So at least [m-2] Kezi tree .( Leaf nodes are all null).
  • The root node keyword can be less than [m/2]-1, There can be no subtree , If there is a subtree, at least 2 individual , because B Trees are absolute Balance tree , The height difference is 0.
  • The non leaf node structure is :
    [ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-1NTkptFU-1655360655348)(C:\Users\ZhaLeMaoDeMao\AppData\Roaming\Typora\typora-user-images\image-20220615105523125.png)]

k Represent keyword , also k1 < k2 < k3 ······< kn

p Represents the subtree pointed to on both sides of the keyword .

problem n Keywords m rank B Trees :

1、 How many leaf nodes are there

n+1

2、 What's the minimum height

The fattest tree is the smallest , That is, each node satisfies the maximum keyword m-1

m^h

Height The maximum number of nodes Maximum number of keywords
11 5 1 5^1 51 -1
25 5 2 5^2 52 -1
325 5 3 5^3 53 -1
h m h m^h mh m h m^h mh-1

therefore The minimum height is m h m^h mh-1 >= n

namely h >= $\log_m (n+1)$

3、 What is the maximum height

The thinnest tree is the largest , That is, each node has the minimum number of key words [m/2]-1, Each layer meets the minimum number of points [m/2], Set the minimum number of nodes as k, Then the number of key words is k-1.

Height The minimum number of nodes Minimum number of key words
111
225=4+1
3617=12+4+1
h2 k h − 2 k^{h-2} kh22 k h − 2 k^{h-2} kh2(k-1)

therefore h The number of high keywords is 1+2( k 0 k^0 k0+ k 1 k^1 k1+ k 2 k^2 k2········ k h − 2 k^{h-2} kh2 )*(k-1)<= n

Simplify to :1+2( k h − 1 k^{h-1} kh1 -1)<=n

h <= $\log_k\frac{n+1}{2}$+1

2、B Insertion of trees

Mainly in the process of division , If you don't need to split, just put it in .

Split method :

​ 1、 From the middle, that is [m/2] The position of the split , Divide keywords into two parts , The keyword on the left is still in the original node , The keyword on the right is placed on the new node of the same layer , The middle position is inserted into the parent node of the original node .

​ 2、 If after the previous step , The parent node has exceeded the upper limit of keywords , Split the parent node , Until it reaches The root node , If The root node also exceeds the limit , The root node continues to split , Generate a new root node , The height of the tree plus one .

Demonstrate the insertion process : First , Declare a fifth order B Trees

One 、 Insert 30、40、20、60

 Insert picture description here

Two 、 Continue to insert 40, It needs to split up , First, follow the above Split method 1 perform , And then execute Split method 2, Generate a new root node

[ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-fb1ky8Dx-1655360655353)(C:\Users\ZhaLeMaoDeMao\AppData\Roaming\Typora\typora-user-images\image-20220616141341872.png)]

3、 ... and 、 Continue to insert 23、28

 Insert picture description here

Four 、 Continue to insert 25, Split up , according to Split method 1.

 Insert picture description here

5、 ... and 、 Continue to insert 22、24、26、29

 Insert picture description here

6、 ... and 、 Continue to insert 21、27、 At this time The root node keyword has reached the maximum value

 Insert picture description here

7、 ... and 、 Continue to insert 55、70
 Insert picture description here

8、 ... and 、 Continue to insert 80, here The rightmost subtree splits ,60 Move to the parent node , But the position of the parent node has reached the maximum , Split again to produce a new root node .

 Insert picture description here

3、B Delete tree

1、 if B The deleted node of the tree is located in the terminal node , And the number of key words of the node Greater than [m/2]-1, Then delete it directly .

2、 if B The deleted node of the tree is located in the terminal node , But the number of key words in this node be equal to [m/2]-1, Then to the sibling node ( Left or right ) Node keyword Greater than [m/2]-1 The node of borrows a keyword , But it is not borrowed directly, but Take the node of the parent node , There are three situations

​ 2.1、 Keyword of the deleted node Are smaller than the parent node , Only the right brother , take The smallest keyword of the parent node Put it in that position , The smallest keyword of the right brother Put it in the parent node .

​ 2.2、 Keyword of the deleted node Are larger than the parent node , Only the left brother , take The largest keyword of the parent node Put it in that position , Left brother's biggest keyword Put it in the parent node .

​ 2.3、 The keyword of the deleted node is located in The middle of the parent node , There are right brothers , And left brother , If you borrow from brother youyou , In fact, that is 2.1; If you borrow from brother Zuo , In fact, that is 2.2.

3、 if B The deleted node of the tree is located in the terminal node , And the number of keywords in this node is equal to [m/2]-1, The keyword of the sibling node adjacent to the node is also equal to [m/2]-1, Add the keywords in the parent node to the current node and adjacent nodes ( Split keywords ) A merger . If the merge results in the parent node keyword being less than [m/2]-1, Then traverse upward with the parent node , All the way to the root .

4、 If the keyword to be deleted is not in the terminal node , Then use the Direct precursor perhaps Direct follow-up Instead of keywords , Up to the terminal node , Turn the problem into a terminal node .

​ Direct precursor : Current keyword The left node is the rightmost Key words of .

​ Direct follow-up : Current keyword The leftmost node on the right Key words of .

Demonstration process : B The tree is shown in the figure below :

 Insert picture description here

One 、 Delete 42, Compliance 1 , Delete directly

 Insert picture description here

Two 、 Delete 33, Compliance 2

 Insert picture description here

3、 ... and 、 Delete 29, Compliance 3

 Insert picture description here

Four 、 Delete 35, Corresponding situation 4

 Insert picture description here

5、 ... and 、 Delete 40, Corresponding situation 2, Move the split node down , The maximum value of the left sibling node moves up

[ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-B2RP1A9o-1655446089821)(C:\Users\ZhaLeMaoDeMao\AppData\Roaming\Typora\typora-user-images\image-20220617140607796.png)]

Two 、B+ Trees

1、B+ The concept of tree

B+ Trees and B The trees are very similar , First look at The same thing :

  • The root node has at least one element

  • Non root node range :[m/2] <= k < = m-1

Difference :

  • B+ Treelike Only leaf nodes store data , Non leaf nodes do not store data , This is the B The biggest difference about trees .
  • In the internal node key All in order from small to large , For one of the internal nodes key, All in the left tree key All smaller than it , In the right subtree key Are greater than or equal to it . The records in the leaf node are also in accordance with key The size of .
  • Each leaf node has a pointer to the adjacent leaf node , Leaf nodes themselves are linked in large order according to the size of keywords .
  • The parent node holds the index of the first element of the right child .

 Insert picture description here

2、B+ Insertion of trees

​ B+ Trees and B The insertion of trees is very similar , The difference is

When the number of node elements is greater than m-1 When , Split into two parts according to the middle element , The intermediate elements are split to the parent node for index storage , however , The middle element itself splits the right part , Leaf nodes are connected by pointers .

Demonstrate the insertion process : First declare a 5 rank B+ Trees

One 、 Insert... In turn 10、20、30、40

 Insert picture description here

Two 、 Insert 50 , Split up

 Insert picture description here

3、 ... and 、 Continue to insert 45、60, Split up

 Insert picture description here

3、B+ Delete tree

With the above B Trees are like , A little different .

1、 The number of nodes is greater than [m/2] Delete directly , If the parent node keyword points to the node where the deleted keyword is located , Update parent node index .

2、 Number of deleted nodes be equal to [m/2], Elements of sibling nodes Greater than [m/2], The leaf node has a pointer , When borrowing elements from sibling nodes , No need to go through the parent node , Instead, you can move directly through the sibling node , Then update the index of the parent node .

3、 The number of deleted nodes is equal to [m/2], And the sibling node element is not greater than [m/2], Then merge the current node and the sibling node , Then update the index of the parent node .

Example : Make a statement 5 rank B+ Trees

One 、 The initial is as shown in the figure

 Insert picture description here

Two 、 Delete 60, Just delete it

 Insert picture description here

3、 ... and 、 Delete 45, Need merger , And update the parent node keyword ( Actually, it was deleted )

 Insert picture description here

Four 、 Delete 25

 Insert picture description here

summary

B+ Trees are relative to B Trees have some advantages , It can be summed up in the following points .

  • A single node stores more elements , Make the query IO Fewer times , So it makes it more suitable to be a database MySQL The underlying data structure of .
  • All queries need to find leaf nodes , Query performance is stable , and B Trees , Each node can find data , So it's not stable .
  • All the leaf nodes form an ordered list , It's easier to find .
原网站

版权声明
本文为[Fried cat]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/173/202206221438195342.html