What needs special attention when a single DML statement inserts, updates, deletes, or fetches millions of rows?
2022-06-22 23:49:00 【Kunlunbase Kunlun database】
Preface
Any node in a distributed computing and storage system can become overloaded or run short of compute or storage resources, and the network can be slow or temporarily unreachable. Any of these can cause an operation to time out.
While an operation in a distributed system waits for a remote node to respond, it usually holds various resources. It cannot wait indefinitely, or the whole system would gradually block and stall.
Timeout control is therefore a problem every distributed system has to solve, and solving it badly leads to a stalled system that cannot work properly.
A brief introduction to the timeout control mechanism of the Kunlun distributed database
The Kunlun distributed database has the following timeout control variables:
Some are on the compute node. All compute-node timeout variables live in the configuration file of the compute node instance. They can be modified as needed, after which the parameters of the running instance are refreshed.
Others are on the storage node. Storage-node timeout variables live in the storage node's configuration file. You can edit that file, or execute a set statement on a compute node or a storage node to change the corresponding variable.
In general, users do not need to modify these variables, because we have tuned the configuration parameters of the compute node and the storage node for the common case.
In special scenarios, however, you still need to modify these timeout variables.
A typical scenario is inserting, updating, or deleting millions of rows or more in a single DML statement, or a single select statement returning millions of rows or more.
Examples include a logical import of a large table or of a full data set, updating every row of a very large table, a data analysis (OLAP) query that has to scan a very large table, and a programmer or DBA who plans to drop the database and run.
In these scenarios, it is best to estimate the amount of data to be inserted, updated, deleted, or read and increase the timeout values below in advance. This ensures that the statements can run to completion and are not mistaken by the timeout mechanism for statements that have hung and then terminated early.
Alternatively, the user can try the operation first and increase these timeout values after getting a timeout error.
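As a rough illustration of "estimating in advance", the sketch below turns a row count into a timeout value. The function name, the safety factor, and the 50-second floor are assumptions made for this example and are not part of Kunlun. The throughput figure has to come from a measurement on your own cluster.

```python
import math


def estimate_timeout_seconds(row_count, rows_per_second,
                             safety_factor=2.0, floor_seconds=50):
    """Return a timeout in whole seconds that should cover row_count rows.

    rows_per_second is a throughput measured on your own cluster.
    safety_factor and floor_seconds are illustrative assumptions.
    """
    if row_count < 0 or rows_per_second <= 0 or safety_factor < 1:
        raise ValueError("invalid row count, throughput, or safety factor")
    # Expected run time, padded by the safety factor and rounded up.
    estimated = math.ceil(row_count / rows_per_second * safety_factor)
    # Never go below the floor, so small jobs keep a sensible timeout.
    return max(floor_seconds, estimated)


if __name__ == "__main__":
    # 5 million rows at a measured 20,000 rows per second.
    print(estimate_timeout_seconds(5_000_000, 20_000))
```

The result is only a starting point. If the statement still times out, raise the value and retry, as described above.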
Let's take a look at all the timeout control variables of the Kunlun distributed database.
Timeout variables of the compute node
1. statement_timeout: statement timeout.
If the total execution time of a query on the compute node exceeds this limit, the statement is rolled back.
For example, if the compute node takes too long to join tables using partial data returned by the storage cluster, the statement is eventually stopped once the timeout limit is reached (default 100 seconds).
2. mysql_read_timeout and mysql_write_timeout: read and write timeouts for communication between the compute node and storage nodes or metadata nodes.
If a read waits longer than mysql_read_timeout or a write waits longer than mysql_write_timeout, the MySQL client library used by the compute node reports an error and returns from the read or write wait, so the statement is terminated early.
If an insert statement sent to the compute node inserts one million rows, or a select statement returns millions of rows from the storage node, it is better to increase these two variables. By default both are 50 seconds.
In addition, in this case you also need to increase the mysql_max_packet_size variable so that such large packets can be sent to the storage node correctly.
3. lock_timeout: the time the compute node waits for a table lock.
Concurrently executed insert, delete, update, and select statements take mutually compatible table locks, so they do not wait for each other.
But while an alter table statement is running, other connections on the same compute node cannot execute DML statements on that table. They cannot wait indefinitely: if they do not obtain the lock within this time, they return an error (default 100 seconds).
4. log_min_duration_statement: statements that run longer than this are recorded in the log file as slow queries.
If you want to insert tens of thousands of rows or more in each insert statement, be sure to increase this variable. Otherwise a large amount of data will be written to the log file, which can cause the compute node to run out of disk space (default 10 seconds).
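The defaults listed above can be collected into a small checklist. The sketch below is a hypothetical helper, not a Kunlun tool: given the expected run time of a statement, it lists the compute-node variables whose limit would be reached. It deliberately treats every variable as a bound on the whole statement, which is a simplification. lock_timeout is left out because it limits how long other connections wait, not the large statement itself.

```python
# Default values in seconds, as quoted in the list above.
COMPUTE_NODE_DEFAULTS = {
    "statement_timeout": 100,
    "mysql_read_timeout": 50,
    "mysql_write_timeout": 50,
    "log_min_duration_statement": 10,
}


def variables_to_review(expected_seconds, current=None):
    """Return the names of variables whose limit is at or below expected_seconds.

    current optionally overrides the defaults with the values actually
    configured on the instance (a dict of name -> seconds).
    """
    settings = dict(COMPUTE_NODE_DEFAULTS)
    if current:
        settings.update(current)
    return sorted(name for name, limit in settings.items()
                  if expected_seconds >= limit)


if __name__ == "__main__":
    # A bulk insert expected to run for about two minutes.
    for name in variables_to_review(120):
        print("review before running:", name)
```

Passing the instance's real settings through `current` keeps the check honest once some values have already been raised.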
Timeout variables of the storage node
1. lock_wait_timeout: the lock timeout variable of the MySQL server layer.
This is the maximum time to wait for a server-layer table lock. If a DDL statement such as alter table is running on a table, every DML statement against that table blocks for at most this many seconds and returns an error if it still cannot obtain the table lock.
In MySQL 8.0, the most common operations that once had to lock the entire table, such as adding columns and adding indexes, no longer hold a long table lock and have become online DDL, so the default of 5 seconds is generally enough.
2. innodb_lock_wait_timeout: the lock timeout variable of MySQL InnoDB, the maximum time to wait for an InnoDB row lock.
A DML statement that waits longer than this reports an error and returns.
If you update a whole table that holds a very large amount of data, for example hundreds of GB or more, the update statement locks a large number of rows for a long time. Other transactions will then usually hit lock timeouts unless their innodb_lock_wait_timeout is increased (default 20 seconds).
3. If the storage cluster uses MySQL Group Replication (MGR) for high availability, you also need to increase the MGR timeout control variables group_replication_member_expel_timeout, group_replication_components_stop_timeout, and group_replication_unreachable_majority_timeout. Otherwise MGR may wrongly judge a node to be down and trigger a primary/standby switchover, or the primary node may lose contact with the standbys and become unable to accept writes.
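Since storage-node variables can be changed with a set statement, the two lock timeouts in items 1 and 2 can be adjusted without editing the configuration file. The sketch below only builds the statement text. The helper name and its validation rules are assumptions for illustration, and the generated statement still has to be executed through your usual client connection.

```python
# Storage-node lock timeouts named in items 1 and 2; values are in seconds.
LOCK_TIMEOUT_VARIABLES = {"lock_wait_timeout", "innodb_lock_wait_timeout"}


def render_set_statement(name, seconds, scope="SESSION"):
    """Build a MySQL SET statement for one of the lock timeout variables."""
    if name not in LOCK_TIMEOUT_VARIABLES:
        raise ValueError(f"unsupported variable: {name}")
    if scope not in ("SESSION", "GLOBAL"):
        raise ValueError(f"unsupported scope: {scope}")
    # Reject bools, floats, and values below one second.
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds < 1:
        raise ValueError("seconds must be a positive integer")
    return f"SET {scope} {name} = {seconds};"


if __name__ == "__main__":
    # Give other transactions ten minutes while a full-table update runs.
    print(render_set_statement("innodb_lock_wait_timeout", 600))
```

Restricting the helper to a fixed set of names keeps arbitrary text out of the generated SQL. The MGR variables in item 3 are left out of this sketch.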
Conclusion
The Kunlun distributed database has a comprehensive timeout control mechanism. Every inter-node communication path has a timeout, so every operation has an upper bound on how long it can take. This ensures that the system state keeps advancing and that system resources remain available to serve further requests.
The project is open source
【GitHub:】
https://github.com/zettadb
【Gitee:】
https://gitee.com/zettadb
THE END