[MySQL] Paging queries over millions of rows in MySQL, and how to optimize them
2022-06-26 05:25:00 【weixin_43224…】
Method 1: Use a plain SQL statement directly
Statement style: in MySQL, you can use: SELECT * FROM table_name LIMIT M, N
Applicable scenario: small data sets (hundreds to thousands of rows)
Reason/drawback: it causes a full table scan, so it gets slow, and some databases return the result set in an unstable order (e.g. one run returns 1,2,3 and another returns 2,1,3). LIMIT takes N rows starting at position M of the result set and discards the rest.
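For reference, a minimal example of the M, N form (your_table is a stand-in table name):

-- Skips the first 10 rows of the (unordered) result set and returns the next 5;
-- equivalent to LIMIT 5 OFFSET 10.
SELECT * FROM your_table LIMIT 10, 5;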
Method 2: Create a primary key or unique index and page through the index (assume 10 rows per page)
Statement style: in MySQL, you can use: SELECT * FROM table_name WHERE id_pk > (pageNum*10) LIMIT M
Applicable scenario: large data sets (tens of thousands of rows)
Reason: this is an index scan, so it is fast. A reader pointed out a flaw: because the rows are not returned sorted by id_pk, some rows can be missed; the only fix is Method 3.
Method 3: Reorder based on the index
Statement style: in MySQL, you can use: SELECT * FROM table_name WHERE id_pk > (pageNum*10) ORDER BY id_pk ASC LIMIT M
Applicable scenario: large data sets (tens of thousands of rows). The ORDER BY column should ideally be the primary key or a unique key, so that the ORDER BY can be satisfied by the index and the result set is stable (for what "stable" means, see Method 1).
Reason: an index scan, so it is fast. But at the time, MySQL's index-assisted ordering only worked for ASC, not DESC (DESC was faked by scanning the index forward; real descending indexes were still a wish, and eventually arrived in MySQL 8.0).
Method 4: Use PREPARE on top of the index
The first placeholder stands for pageNum, the second for the number of rows per page.
Statement style: in MySQL, you can use: PREPARE stmt_name FROM 'SELECT * FROM table_name WHERE id_pk > (? * ?) ORDER BY id_pk ASC LIMIT M'
Applicable scenario: very large data sets
Reason: an index scan, so it is fast; a prepared statement is also somewhat faster than an ordinary query, since the statement is parsed only once.
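A minimal runnable sketch of this method, assuming a table your_table with an integer primary key id_pk and 10 rows per page (the statement name and user variables are mine):

PREPARE page_stmt FROM
  'SELECT * FROM your_table WHERE id_pk > ? * ? ORDER BY id_pk ASC LIMIT 10';
SET @page_num = 5, @page_size = 10;   -- with contiguous ids, returns rows 51-60
EXECUTE page_stmt USING @page_num, @page_size;
DEALLOCATE PREPARE page_stmt;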
Method 5: Exploit the fact that MySQL can satisfy ORDER BY from an index to locate the first rows quickly and avoid a full table scan
For example: read row tuples 1000 to 1019 (pk is the primary key/unique key; this reads pk values 1000-1019 when the values are contiguous).
SELECT * FROM your_table WHERE pk>=1000 ORDER BY pk ASC LIMIT 0,20
Method 6: Use a "subquery/join + index" to quickly locate the position of the first tuple, then read the tuples from there.
For example (id is the primary key/unique key; $page and $pagesize are variables):
Subquery example:
SELECT * FROM your_table WHERE id <=
(SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1)
ORDER BY id DESC LIMIT $pagesize;
Join example:
SELECT * FROM your_table AS t1
JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1) AS t2
WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT $pagesize;
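To make the placeholders concrete, here is the join form with $page = 3 and $pagesize = 20 substituted by hand (a sketch on the same assumed your_table):

-- Newest first: the derived table skips (3-1)*20 = 40 rows to find the id at
-- the top of page 3, then the outer query reads 20 rows from that id downward.
SELECT t1.*
FROM your_table AS t1
JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT 40, 1) AS t2
WHERE t1.id <= t2.id
ORDER BY t1.id DESC
LIMIT 20;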
When MySQL pages over big data with LIMIT, query efficiency drops as the page number grows.
1. Test experiment
Using the limit start, count paging statement directly, which is also the method my program uses:
select * from product limit start, count
When the starting offset is small, the query has no performance problem. Let's look at the execution times when paging starts from offsets 10, 100, 1000, and 10000 (20 rows per page). The results are as follows:
select * from product limit 10, 20      0.016 seconds
select * from product limit 100, 20     0.016 seconds
select * from product limit 1000, 20    0.047 seconds
select * from product limit 10000, 20   0.094 seconds
We can see that as the starting record grows, the time grows with it, which shows that the cost of the limit paging statement is closely tied to the starting offset. So let's move the starting record to 400,000, about half of the table's records:
select * from product limit 400000, 20   3.229 seconds
Now let's look at the time it takes to fetch the last page of records:
select * from product limit 866613, 20   37.44 seconds
Obviously, this kind of latency is intolerable for the largest page numbers.
From this we can also draw two conclusions:
1. The query time of a limit statement is proportional to the position of the starting record.
2. MySQL's limit statement is very convenient, but it is not suitable for direct use on tables with many records.
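One way to see where the time goes (a hedged sketch; the exact plan varies by MySQL version and storage engine):

-- EXPLAIN shows that MySQL cannot jump straight to the offset: it must read
-- and discard the first 866613 rows before returning 20, so the row estimate
-- covers nearly the whole table.
EXPLAIN SELECT * FROM product LIMIT 866613, 20;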
2. Performance optimization for the limit paging problem
Use a covering index to speed up paging queries
As we all know, if a query uses an index and selects only the indexed columns (a covering index), it runs fast.
That's because the index lookup is optimized and the requested data lives in the index itself, so there is no extra lookup of the row by its address, which saves a lot of time. In addition, MySQL caches indexes, which helps further under high concurrency.
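A quick way to check whether a query is covered (a sketch; output details vary by version):

-- When only indexed columns are selected, EXPLAIN's Extra column shows
-- "Using index", meaning the table rows are never touched.
EXPLAIN SELECT id FROM product LIMIT 866613, 20;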
In our case, the id field is the primary key, so it is naturally covered by the default primary key index. Now let's see the effect of using the covering index.
This time we query the data on the last page using the covering index, i.e. selecting only the id column:
select id from product limit 866613, 20   0.2 seconds
Compared with the earlier 37.44 seconds, that is more than a 100x speedup.
So if we also want to query all the columns, there are two ways: one uses the id >= form, the other uses a join. Here are the actual results:
SELECT * FROM product WHERE id >= (select id from product limit 866613, 1) limit 20
The query time is 0.2 seconds!
Another way to write it:
SELECT * FROM product a JOIN (select id from product limit 866613, 20) b ON a.ID = b.id
The query time is also very short!
3. Composite index optimization method
How high can MySQL's performance actually go? MySQL is a database absolutely fit for DBA-level experts to play with. For a small system with 10,000 news articles, you can write it however you like and develop it quickly with whatever framework. But once the data volume reaches 100,000, a million, or tens of millions of rows, is its performance still that high? One small mistake may force the whole system to be rewritten, or worse, leave the system unable to run at all! Okay, enough preamble.
Let the facts speak. Look at an example:
The table collect has four fields (id, title, info, vtype): title is fixed-length, info is text, id is auto-increment, and vtype is a tinyint with an index on it. This is a simple model of a basic news system. Now fill it with data: 100,000 news records. The collect table ends up with 100,000 records and takes about 1.6 GB on disk.
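For concreteness, a hedged reconstruction of the collect table (column sizes and the storage engine are not stated in the article and are assumptions; MyISAM was MySQL's default engine in that era):

CREATE TABLE collect (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title CHAR(100) NOT NULL,   -- fixed-length, per the article
  info  TEXT,                 -- per the article
  vtype TINYINT NOT NULL,
  KEY idx_vtype (vtype)       -- vtype is indexed, per the article
) ENGINE=MyISAM;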
OK, now look at this SQL statement:
select id,title from collect limit 1000,10;
Very fast; basically OK in 0.01 seconds. Now look at the following:
select id,title from collect limit 90000,10;
Paging from 90,000. The result?
It takes 8-9 seconds to finish. My god, what's wrong? Actually, to optimize this, you can find answers online. Look at the following statement:
select id from collect order by id limit 90000,10;
Very fast: OK in 0.04 seconds. Why? Because ordering by the id primary key and walking its index is naturally fast. The fix found online is:
select id,title from collect where id>=(select id from collect order by id limit 90000,1) limit 10;
This is the result of first locating the id through its index. But if the problem gets a bit more complicated, it's over. Look at the following statement:
select id from collect where vtype=1 order by id limit 90000,10; is very slow: it took 8-9 seconds!
I believe many people will feel, as I did, like collapsing at this point. Isn't vtype indexed? How can it be slow? It's all well and good that vtype has an index, but if you directly run
select id from collect where vtype=1 limit 1000,10;
it's very fast, basically 0.05 seconds. But scale the offset up 90 times to start from 90,000, and even linear scaling would give 0.05*90 = 4.5 seconds, the same order of magnitude as the measured 8-9 seconds.
From this point, some people proposed splitting the table, the same idea as the discuz forum. The idea is as follows:
Build an index table t(id, title, vtype), set it to fixed-length rows, do the pagination on it, page out the result ids, then go to collect to fetch info. Is it feasible? The experiment will tell.
100,000 records go into t(id, title, vtype), giving a data table of about 20 MB. Using
select id from t where vtype=1 order by id limit 90000,10;
it is fast: basically it runs in 0.1-0.2 seconds. Why? I guess it's because collect has so much data that paging has a long way to travel; limit is bound entirely to the size of the data table. In fact this is still a full table scan; it's only fast because the data volume is small, just 100,000 rows. OK, let's run a crazy experiment: add a million rows and test the performance. With 10 times the data, the t table immediately grows past 200 MB, and it is fixed-length. The same query statement still completes in 0.1-0.2 seconds! No performance problem with splitting the table?
Wrong! Our limit was still 90,000, so it was fast. Make it big: start from 900,000.
select id from t where vtype=1 order by id limit 900000,10;
Look at the result: 1-2 seconds! Why?
Still this long, very depressing! Some say fixed-length rows improve limit performance, and at first I believed it too: since each record has a fixed length, MySQL should be able to compute the position of row 900,000, right? But we overestimated MySQL's intelligence; it is not a commercial database, and it turns out fixed-length vs. variable-length has little effect on limit. No wonder people say discuz gets slow at a million records; now I believe it. This is a matter of database design!
Can't MySQL break the one-million-row barrier??? Is a million pages really its limit?
The answer is: NO. The reason it can't break a million is only that we didn't design MySQL properly. Below is a non-split-table approach and a crazy test: a single table with a million records and a 10 GB database, paged fast!
Okay, our test goes back to the collect table. The conclusion from the tests so far:
At 300,000 records, the split-table approach is feasible; beyond 300,000 the speed becomes unbearable! Of course, split tables plus my method would be absolutely perfect. But my method alone solves it perfectly without splitting any tables!
The answer is: a composite index! Once, while designing MySQL indexes, I stumbled on the fact that the index name can be arbitrary and that several fields can be selected into one index. What's that good for?
The query from earlier,
select id from collect order by id limit 90000,10;
was so fast because it walks the index, but once a where clause is added the index is no longer used. With a let's-give-it-a-try attitude, I added an index like search(vtype, id).
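The statement for adding such an index (the index name search and the column order are the author's):

ALTER TABLE collect ADD INDEX search (vtype, id);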
Then I tested:
select id from collect where vtype=1 limit 90000,10;
Very fast! Done in 0.04 seconds!
Tested again:
select id ,title from collect where vtype=1 limit 90000,10;
Very regrettably: 8-9 seconds, it didn't use the search index!
Tested again with search(id, vtype): still the select id statement, and, regrettably, 0.5 seconds.
To sum up: if there is a where condition and you want to page with limit, you must design an index that puts the where column first and the primary key used by limit second, and you must select only the primary key!
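A hedged sketch that follows this rule and still returns more than the primary key, by pairing the index-only page lookup with a self-join on the primary key (same collect table and search(vtype, id) index as above):

-- The derived table pages over the covering index search(vtype, id) only;
-- the outer join then fetches title for just those 10 ids.
SELECT c.id, c.title
FROM collect AS c
JOIN (SELECT id FROM collect WHERE vtype = 1 ORDER BY id LIMIT 90000, 10) AS p
  ON c.id = p.id
ORDER BY c.id;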
That solves the paging problem perfectly. Being able to return ids quickly gives hope for optimizing limit; by this logic, a limit in the millions can finish in 0.0x seconds. It seems that SQL statement optimization and indexing are extremely important!