1. A Deep Dive into TiDB: A First Look at TiDB
2022-06-24 07:04:00 【luozhiyun】
Please credit the source when reposting. This article was first published on luozhiyun's blog: https://www.luozhiyun.com/archives/584
This is the first article in my study of TiDB. It introduces TiDB's overall architecture and the features it supports. I am curious about the details too, so stay tuned and I will cover them one by one.
Why study TiDB?
I have wanted to learn about TiDB for a long time, but I kept putting off facing such a large codebase. Our project team has been using MySQL, but the data volume keeps growing and some tables have reached hundreds of millions of rows. Data at that scale is already a heavy load for MySQL, so I went looking for a distributed database.
That led me to TDSQL, the financial-grade database promoted by Tencent. TDSQL is a database cluster system built on the MariaDB kernel together with open-source components such as mysql-proxy and ZooKeeper. Based on MySQL's semi-synchronous replication, it adds many kernel-level optimizations that significantly improve performance and data consistency. It is fully compatible with MySQL syntax and supports a Shard mode (middleware-based database and table sharding) and a NoShard mode (single instance).
Sharding a table with TDSQL's Shard mode does sacrifice some functionality and is fairly invasive to the application. For example, every table must define a shard-key, and some distributed-transaction scenarios may hit bottlenecks.
Against this background I started looking into NewSQL databases, and TiDB is a representative NewSQL product.
Many people may not have heard of NewSQL databases, so here is a brief explanation. A fairly common definition of NewSQL is: a relational database that is compatible with traditional single-node databases such as MySQL, scales horizontally, replicates data with strong consistency, supports distributed transactions, and separates storage from compute.
Introduction to TiDB
According to the official introduction, TiDB has the following advantages:
- Supports elastic scale-out and scale-in;
- Supports SQL and is compatible with most MySQL syntax, so it can directly replace MySQL in most scenarios;
- Highly available by default, with automatic data repair and failover;
- Supports ACID transactions.
As the architecture diagram shows, TiDB consists mainly of TiDB Server, PD (Placement Driver) Server, and storage nodes.
- TiDB Server: stores no data itself. It accepts client connections, parses SQL, and forwards the actual data read requests to the underlying storage nodes;
- PD (Placement Driver) Server: stores the real-time data distribution of every TiKV node and the overall topology of the cluster, and allocates transaction IDs for distributed transactions. It also issues data scheduling commands to specific TiKV nodes;
- Storage nodes: made up of two kinds of components, TiKV Server and TiFlash:
  - TiKV: a distributed Key-Value storage engine that provides transactions;
  - TiFlash: built for OLAP scenarios. It relies on ClickHouse to implement efficient columnar computation.
TiDB Storage
This section mainly covers TiKV. You can think of TiKV as a huge Map that stores Key-Value data, with the data persisted to disk through RocksDB.
The data stored in TiKV is also distributed: it is replicated to multiple machines through the Raft protocol (for more on Raft, see this article: https://www.luozhiyun.com/archives/287).
Every data change in TiKV is recorded as a Raft log entry. Raft's log replication synchronizes the data safely and reliably to every node in the replication group. When a minority of machines go down, the Raft protocol itself automatically restores the missing replicas, so the business does not notice.
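The reason a minority of failed machines goes unnoticed is Raft's majority rule: a log entry counts as committed once more than half of the replicas have persisted it. The following is a minimal sketch of that arithmetic only, not TiKV's Raft implementation; the function names are made up for illustration.

```python
def majority(replica_count: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def is_committed(acks: int, replica_count: int) -> bool:
    """A log entry is committed once a majority of replicas has persisted it."""
    return acks >= majority(replica_count)


def tolerated_failures(replica_count: int) -> int:
    """Number of replicas that may be down while a majority is still reachable."""
    return replica_count - majority(replica_count)
```

With three replicas, `tolerated_failures(3)` is 1: one machine can be lost and writes still reach the two replicas needed for a majority.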
TiKV divides the whole Key-Value space into many segments. Each segment is a run of consecutive Keys and is called a Region. By default each Region holds up to 144 MB of data. Here is an official diagram:
When a Region grows beyond a size limit (144 MB by default), TiKV splits it into two or more Regions so that Region sizes stay roughly equal. Likewise, when a Region shrinks because of a large number of delete requests, TiKV merges two small adjacent Regions into one.
Once the data is divided into Regions, TiKV tries to keep the number of Regions served by each node roughly equal, and performs Raft replication and membership management with the Region as the unit.
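To make the Region idea concrete, here is a small sketch of locating the Region that owns a key and splitting an oversized Region. It assumes string keys and a sorted list of Regions covering the whole key space; the `Region` class, `find_region`, and `split_if_needed` are hypothetical names, and real TiKV works on byte keys and picks split points itself.

```python
import bisect
from dataclasses import dataclass

MAX_REGION_BYTES = 144 * 1024 * 1024  # split threshold quoted in the text


@dataclass
class Region:
    start_key: str       # inclusive
    end_key: str         # exclusive; "" means "no upper bound"
    size_bytes: int = 0


def find_region(regions, key):
    """Return the Region whose [start_key, end_key) range contains key.

    Assumes regions are sorted by start_key, the first one starts at "",
    and together they cover the whole key space.
    """
    starts = [r.start_key for r in regions]
    idx = bisect.bisect_right(starts, key) - 1
    return regions[idx]


def split_if_needed(region, split_key):
    """Split a Region at split_key if it is larger than the threshold."""
    if region.size_bytes <= MAX_REGION_BYTES:
        return [region]
    half = region.size_bytes // 2
    left = Region(region.start_key, split_key, half)
    right = Region(split_key, region.end_key, region.size_bytes - half)
    return [left, right]
```

Because each Region is a contiguous key range, finding the owner of a key is a binary search over the start keys rather than a hash lookup.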
Mapping Table Data to Key-Value
TiDB stores its data in TiKV, but a table in a relational database may have many columns, so the column data of each row has to be mapped into a (Key, Value) pair.
For example, take this table:
CREATE TABLE User (
    ID int,
    Name varchar(20),
    Role varchar(20),
    Age int,
    PRIMARY KEY (ID),
    KEY idxAge (Age)
);
The table contains three rows:
1, "TiDB", "SQL Layer", 10
2, "TiKV", "KV Engine", 20
3, "PD", "Manager", 30
When these rows are stored in TiKV, a key is built for each of them. For the primary key and unique indexes, each entry carries the table's unique ID and the row's RowID. The three rows above are encoded as:
t10_r1 --> ["TiDB", "SQL Layer", 10]
t10_r2 --> ["TiKV", "KV Engine", 20]
t10_r3 --> ["PD", "Manager", 30]
In the key, t is the TableID prefix, and t10 means the table's unique ID is 10. r is the RowID prefix: r1 means the row's RowID is 1, r2 means the RowID is 2, and so on.
An ordinary secondary index does not have to satisfy a uniqueness constraint, so one index value may correspond to multiple rows, and the matching RowIDs have to be looked up by a range over the index value. The idxAge index on the data above maps to:
t10_i1_10_1 --> null
t10_i1_20_2 --> null
t10_i1_30_3 --> null
The key above follows the pattern: t<TableID>_i<IndexID>_<index value>_<RowID>.
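The key layouts described above can be reproduced with a few lines of code. This sketch mirrors the readable notation used in this article; the helper names are hypothetical, and the real encoding in TiDB is binary rather than these strings.

```python
def row_key(table_id: int, row_id: int) -> str:
    """Key for a row: t<TableID>_r<RowID>."""
    return f"t{table_id}_r{row_id}"


def index_key(table_id: int, index_id: int, index_value, row_id: int) -> str:
    """Key for a non-unique index entry: t<TableID>_i<IndexID>_<value>_<RowID>."""
    return f"t{table_id}_i{index_id}_{index_value}_{row_id}"


def encode_table(table_id: int, index_id: int, rows):
    """Map (ID, Name, Role, Age) rows to KV pairs, indexing the Age column."""
    kv = {}
    for row_id, name, role, age in rows:
        kv[row_key(table_id, row_id)] = [name, role, age]
        kv[index_key(table_id, index_id, age, row_id)] = None  # index entries carry no value
    return kv


rows = [
    (1, "TiDB", "SQL Layer", 10),
    (2, "TiKV", "KV Engine", 20),
    (3, "PD", "Manager", 30),
]
kv = encode_table(table_id=10, index_id=1, rows=rows)
```

Putting the RowID at the end of the index key is what lets several rows share one index value: all entries for the same Age sort next to each other and can be read with one range scan.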
As you can see, for both unique and secondary indexes, data is located by building keys according to these mapping rules. Take this SQL statement, for example:
select count(*) from user where name = "TiDB"
The where condition does not use an index, so every row in the table has to be read and checked for whether its name field is TiDB. The execution flow is:
- Construct the Key Range, that is, the range of data to scan. In this example it is the whole table, so the Key Range is [0, MaxInt64);
- Scan the Key Range and read the data;
- Filter the data that was read: evaluate the expression name = "TiDB". If it is true, return the row upward; otherwise discard the row;
- Compute Count(*): for every row that qualifies, add it to the Count(*) result.
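The four steps above can be sketched in a few lines. This is a single-process illustration that assumes rows are held in a dict keyed by RowID; `scan` and `count_by_name` are hypothetical helpers, not TiDB functions.

```python
MAX_INT64 = 2**63 - 1


def scan(rows, start, end):
    """Yield (row_id, row) for row IDs in [start, end), in key order."""
    for row_id in sorted(rows):
        if start <= row_id < end:
            yield row_id, rows[row_id]


def count_by_name(rows, name):
    key_range = (0, MAX_INT64)               # step 1: full-table Key Range
    count = 0
    for _, row in scan(rows, *key_range):    # step 2: scan the range
        if row["Name"] == name:              # step 3: evaluate the filter
            count += 1                       # step 4: accumulate Count(*)
    return count


rows = {
    1: {"Name": "TiDB", "Role": "SQL Layer", "Age": 10},
    2: {"Name": "TiKV", "Role": "KV Engine", "Age": 20},
    3: {"Name": "PD", "Role": "Manager", "Age": 30},
}
```

Every row is read even though only one matches, which is exactly the cost of a condition that cannot use an index.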
SQL Execution Process
- Parser & validator: parses the SQL text into structured data, namely an abstract syntax tree (AST), and then validates the AST;
- Logical Optimize (logical optimization): applies a set of optimization rules to the input logical execution plan in order, so that the overall plan gets better. Examples: decorrelating correlated subqueries, Max/Min elimination, predicate push-down, and Join reordering;
- Physical Optimize (physical optimization): builds a physical execution plan from the logical plan produced in the previous stage. The optimizer chooses a concrete physical implementation for each operator in the logical plan. One logical operator may have several physical implementations; for example, LogicalAggregate can be implemented with the hash-based HashAggregate or with the streaming StreamAggregate;
- Coprocessor: in TiDB, computation is done per Region. The SQL layer works out the Key Range of the data to process, splits it into several Key Ranges according to the Region information obtained from PD, and sends the requests to the corresponding Regions. The module on each Region's TiKV that reads the data and performs the computation is called the Coprocessor;
- TiDB Executor: TiDB merges and aggregates the data returned by the Regions into the final result.
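The Coprocessor and Executor steps can be pictured as "split the range by Region, count in each Region, then add up the partial results". Here is a hedged sketch under simplifying assumptions: keys are integer RowIDs, Region boundaries are given as a sorted list of (start, end) pairs, and all function names are invented for illustration.

```python
def split_range_by_regions(key_range, region_bounds):
    """Cut [start, end) into one sub-range per Region it overlaps."""
    start, end = key_range
    tasks = []
    for r_start, r_end in region_bounds:
        lo, hi = max(start, r_start), min(end, r_end)
        if lo < hi:
            tasks.append((lo, hi))
    return tasks


def coprocessor_count(rows, sub_range, predicate):
    """What one Region would do: filter its rows and return a partial count."""
    lo, hi = sub_range
    return sum(1 for row_id, row in rows.items()
               if lo <= row_id < hi and predicate(row))


def distributed_count(rows, key_range, region_bounds, predicate):
    """What the SQL layer would do: fan out per Region, then sum the partials."""
    tasks = split_range_by_regions(key_range, region_bounds)
    partials = [coprocessor_count(rows, task, predicate) for task in tasks]
    return sum(partials)


names = {1: "TiDB", 2: "TiKV", 3: "PD"}
bounds = [(0, 2), (2, 100)]
total = distributed_count(names, (0, 50), bounds, lambda name: name == "TiDB")
```

Pushing the filter and the count down to each Region means only one small number per Region travels back to the SQL layer instead of every row.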
Transactions
For a distributed database, distributed transactions are one of the key features. TiDB implements distributed transactions at the snapshot isolation level and supports both pessimistic and optimistic transactions.
- Optimistic transactions: for workloads where transactions do not conflict, or where a transaction is allowed to fail on a data conflict, and maximum performance is the goal.
- Pessimistic transactions: for workloads where transactions do conflict and the commit success rate matters. Because of the locking, performance is worse than with optimistic transactions.
Optimistic Transactions
TiDB uses two-phase commit (Two-Phase Commit) to guarantee the atomicity of distributed transactions. It has two phases, Prewrite and Commit:
- Prewrite
  - TiDB picks one Key from the data to be written as the Primary Key of the current transaction;
  - TiDB sends Prewrite requests concurrently to all TiKV nodes involved;
  - TiKV checks the data version information for conflicts and locks the data that passes the check;
  - TiDB receives all Prewrite responses, and every Prewrite has succeeded;
- Commit
  - TiDB sends the second-phase commit to TiKV;
  - After receiving the Commit operation, TiKV checks that the lock exists and cleans up the locks left by the Prewrite phase.
With the optimistic transaction model, transactions fail to commit easily in high-conflict scenarios.
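The two phases above can be modelled with a small in-memory store. This is a heavily simplified sketch, not TiKV's implementation: timestamps are plain integers passed in by the caller, Prewrite runs sequentially instead of concurrently, and the `Store` class and `commit_transaction` function are hypothetical.

```python
class WriteConflict(Exception):
    pass


class Store:
    def __init__(self):
        self.locks = {}    # key -> (primary_key, start_ts)
        self.data = {}     # (key, start_ts) -> value
        self.writes = {}   # key -> list of (commit_ts, start_ts)

    def prewrite(self, key, value, primary, start_ts):
        if key in self.locks:
            raise WriteConflict(f"{key} is locked by another transaction")
        if any(commit_ts >= start_ts for commit_ts, _ in self.writes.get(key, [])):
            raise WriteConflict(f"{key} was committed after this transaction started")
        self.locks[key] = (primary, start_ts)
        self.data[(key, start_ts)] = value

    def commit(self, key, start_ts, commit_ts):
        lock = self.locks.get(key)
        if lock is None or lock[1] != start_ts:
            raise RuntimeError(f"lock for {key} is missing")
        self.writes.setdefault(key, []).append((commit_ts, start_ts))
        del self.locks[key]          # clean up the Prewrite lock

    def rollback(self, key, start_ts):
        if self.locks.get(key, (None, None))[1] == start_ts:
            del self.locks[key]
            self.data.pop((key, start_ts), None)


def commit_transaction(store, mutations, start_ts, commit_ts):
    """Run Prewrite on every key, then Commit; return False on a conflict."""
    keys = list(mutations)
    primary = keys[0]                # pick one key as the Primary Key
    prewritten = []
    try:
        for key in keys:
            store.prewrite(key, mutations[key], primary, start_ts)
            prewritten.append(key)
    except WriteConflict:
        for key in prewritten:
            store.rollback(key, start_ts)
        return False
    store.commit(primary, start_ts, commit_ts)   # primary first
    for key in keys[1:]:
        store.commit(key, start_ts, commit_ts)
    return True
```

A transaction that started before another one committed on the same key is rejected at Prewrite, which is why optimistic commits fail often when many transactions touch the same rows.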
Pessimistic Transactions
A pessimistic transaction adds an Acquire Pessimistic Lock phase before Prewrite, which avoids conflicts during Prewrite:
- Every DML statement acquires pessimistic locks, and the locks are written to TiKV;
- Constraints are checked when the pessimistic locks are acquired;
- A pessimistic lock holds no data, only the lock. It only prevents other transactions from modifying the same Key and does not block reads;
- At commit time, the presence of the pessimistic locks guarantees that Prewrite will not hit a Write Conflict, so the commit is guaranteed to succeed.
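The locking behaviour in the list above can be illustrated with a tiny lock table. This is a sketch under stated assumptions, not TiDB's mechanism: a conflicting request raises immediately here (a real system would typically wait for the lock), and `PessimisticLockTable` and `read` are hypothetical names.

```python
class LockConflict(Exception):
    pass


class PessimisticLockTable:
    def __init__(self):
        self.owners = {}   # key -> id of the transaction holding the lock

    def acquire(self, txn_id, key):
        """Called for each key a DML statement touches."""
        owner = self.owners.get(key)
        if owner is not None and owner != txn_id:
            raise LockConflict(f"{key} is locked by transaction {owner}")
        self.owners[key] = txn_id    # the lock stores no row data

    def release_all(self, txn_id):
        """Called when the transaction commits or rolls back."""
        for key in [k for k, o in self.owners.items() if o == txn_id]:
            del self.owners[key]


def read(data, key):
    """Reads never consult the lock table, so pessimistic locks do not block them."""
    return data.get(key)
```

Because the conflict surfaces when the DML statement runs rather than at commit, a transaction that holds all its locks can go through Prewrite without a write conflict.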
Summary
This article gave a general overview of TiDB as a distributed relational database: its overall architecture, how it stores data in Key-Value form, how it executes SQL, and how it supports transactions as a relational database.
Reference
https://www.infoq.cn/article/mfttecc4y3qc1egnnfym
https://pingcap.com/cases-cn/user-case-webank/
https://docs.pingcap.com/zh/tidb/stable/tidb-architecture
https://pingcap.com/blog-cn/tidb-internal-1/
https://pingcap.com/blog-cn/tidb-internal-2/
https://pingcap.com/blog-cn/tidb-internal-3/
https://docs.pingcap.com/zh/tidb/stable/tidb-best-practices