[MySQL] Paging query methods for million-row MySQL tables and their optimization
Method 1: Use a plain SQL LIMIT statement directly
Statement style: in MySQL, use: SELECT * FROM table_name LIMIT M, N
Suitable scenario: small data sets (hundreds to thousands of tuples)
Reason / drawback: this is a full table scan, so it gets slow. Also, without an ORDER BY the result set order is not guaranteed to be stable (one run may return rows 1, 2, 3 and another 2, 1, 3). LIMIT takes N rows starting at offset M of the result set; everything before the offset is still read and then discarded.
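As a minimal sketch of this style, assuming a hypothetical product table: page 3 at 10 rows per page skips the first 20 rows.
-- Page 3 at 10 rows per page: offset M = 20, count N = 10.
-- The server still reads the first 20 rows and throws them away.
SELECT * FROM product LIMIT 20, 10;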
Method 2: Create a primary key or unique index and seek on the index (assume 10 rows per page)
Statement style: in MySQL, use: SELECT * FROM table_name WHERE id_pk > (pageNum*10) LIMIT M
Suitable scenario: large data sets (tens of thousands of tuples)
Reason: an index seek replaces the full scan, so it is fast. A caveat raised by a reader: because the returned data is not sorted by pk_id, rows can be missed across pages; this can only be fixed by Method 3.
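A hedged sketch of this method, assuming a hypothetical product table whose auto-increment primary key id has no gaps: fetching the page after the first 40 rows.
-- Seek past the first 4 pages (40 rows) via the primary key index,
-- then read the next 10 rows. Assumes dense, gap-free id values;
-- without an ORDER BY the row order is not guaranteed (see Method 3).
SELECT * FROM product WHERE id > 40 LIMIT 10;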
Method 3: Reorder based on the index
Statement style: in MySQL, use: SELECT * FROM table_name WHERE id_pk > (pageNum*10) ORDER BY id_pk ASC LIMIT M
Suitable scenario: large data sets (tens of thousands of tuples). The ORDER BY column should be the primary key or a unique key, so that the ORDER BY can be resolved by the index and the result set is stable (for what "stable" means, see Method 1).
Reason: an index scan, so it is fast. Note that at the time of writing, MySQL's index-backed sort only worked for ASC, not DESC (the DESC keyword was accepted but no real descending index scan was performed; true DESC support was expected in a future release).
Method 4: Use a prepared statement based on the index
The first placeholder is pageNum; the second placeholder is the number of tuples per page.
Statement style: in MySQL, something like: PREPARE stmt_name FROM 'SELECT * FROM table_name WHERE id_pk > (? * ?) ORDER BY id_pk ASC LIMIT M'
Suitable scenario: very large data sets
Reason: an index scan, so it is fast, and a prepared statement is somewhat faster than an ordinary query because the statement is parsed only once.
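The statement style above is schematic; actual MySQL PREPARE syntax takes the query text as a string and binds parameters through user variables at EXECUTE time. A runnable sketch, again assuming a hypothetical product table:
-- Parsed once; the article's pageNum * pageSize formula implies 0-based page numbers.
PREPARE page_stmt FROM
  'SELECT * FROM product WHERE id > (? * ?) ORDER BY id ASC LIMIT 10';
SET @pageNum = 4, @pageSize = 10;   -- 0-based page 4, i.e. the fifth page
EXECUTE page_stmt USING @pageNum, @pageSize;
DEALLOCATE PREPARE page_stmt;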
Method 5: Use MySQL's index-backed ORDER BY to locate a slice of tuples quickly and avoid a full table scan
For example: read tuples in rows 1000 to 1019 (pk is the primary key / unique key):
SELECT * FROM your_table WHERE pk >= 1000 ORDER BY pk ASC LIMIT 0,20
Method 6: Use a "subquery / join + index" to locate the position of a boundary tuple quickly, then read the tuples from there.
For example (id is the primary key / unique key; $page and $pagesize are application variables):
Subquery example:
SELECT * FROM your_table WHERE id <=
(SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1)
ORDER BY id DESC LIMIT $pagesize;
Join example:
SELECT * FROM your_table AS t1
JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1) AS t2
WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT $pagesize;
When MySQL paginates a large table with LIMIT, the query gets slower as the page number grows.
Test experiment
1. Using the limit start, count paging statement directly, which is also the method I use in my program:
select * from product limit start, count
When the starting offset is small, the query has no performance problem. Let's look at the execution time when paging starts from offsets 10, 100, 1000, and 10000 (20 rows per page), as follows:
select * from product limit 10, 20      -- 0.016 seconds
select * from product limit 100, 20     -- 0.016 seconds
select * from product limit 1000, 20    -- 0.047 seconds
select * from product limit 10000, 20   -- 0.094 seconds
We can see that as the starting record grows, the time grows with it. This shows that the cost of the limit paging statement is strongly tied to the starting offset. So let's move the starting record to 400,000, roughly half of our total records, and look again:
select * from product limit 400000, 20   -- 3.229 seconds
Now look at the time taken to fetch the last page of records:
select * from product limit 866613, 20   -- 37.44 seconds
Obviously this kind of latency is intolerable for the last pages.
We can also draw two conclusions:
- The query time of a limit statement grows in proportion to the position of the starting record.
- MySQL's limit statement is very convenient, but it is not suitable for direct use on tables with many records.
2. Performance optimization for the limit paging problem
Use a covering index to speed up paging queries
We all know that if a query goes through an index and selects only the columns contained in that index (a covering index), it comes back quickly. The index search follows an optimized access path, and the data it needs sits in the index itself, so there is no need to chase row addresses for the rest of the row, which saves a lot of time. In addition, MySQL has an index cache, and under high concurrency a cached index performs even better.
In our case, we know the id field is the primary key, so it is naturally covered by the default primary key index. Let's see how a covering index performs. This time we query the data on the last page (using the covering index, i.e., selecting only the id column), as follows:
select id from product limit 866613, 20   -- 0.2 seconds
Compared with 37.44 seconds, that is roughly a 100-fold speedup. So if we also want all the columns, there are two approaches: one uses the id >= form, the other uses a join. Let's look at both in practice:
SELECT * FROM product WHERE ID >= (select id from product limit 866613, 1) limit 20
The query time is 0.2 seconds!
Another way to write it:
SELECT * FROM product a JOIN (select id from product limit 866613, 20) b ON a.ID = b.id
The query time is also very short!
3. Composite index optimization method
How far can MySQL's performance really be pushed? MySQL is a database that definitely rewards DBA-level expertise. For a small system posting 10,000 news items, any framework lets you develop quickly. But once the data volume reaches 100,000, a million, or tens of millions, does the performance hold up? One small mistake can force a rewrite of the whole system, or worse, leave the system unable to run at all! Okay, enough talk.
Let the facts speak; look at the example:
The data table collect has 4 fields (id, title, info, vtype): title is fixed-length, info is text, id is auto-increment, and vtype is a tinyint with an index on it. This is a simple model of a basic news system. Now fill it with data: 100,000 news records. The collect table ends up with 100,000 rows and takes about 1.6 GB on disk.
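A hedged reconstruction of that schema (the column sizes and storage engine are assumptions; the article only names the fields):
CREATE TABLE collect (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- auto-increment
  title CHAR(100)    NOT NULL,  -- fixed-length, as described
  info  TEXT,
  vtype TINYINT      NOT NULL,
  KEY idx_vtype (vtype)         -- vtype is indexed
) ENGINE = MyISAM;              -- assumed: the fixed-row-format reasoning below suggests MyISAM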
OK, now look at this SQL statement:
select id,title from collect limit 1000,10;
Fast: basically 0.01 seconds. Now look at the following:
select id,title from collect limit 90000,10;
Paging from 90,000 — the result?
It takes 8-9 seconds to complete. My god, what went wrong? Actually, the way to optimize this case can be found online. Look at this statement:
select id from collect order by id limit 90000,10;
Fast: 0.04 seconds. Why? Because locating by the id primary key index is naturally fast. The fix circulating online goes like this:
select id,title from collect where id>=(select id from collect order by id limit 90000,1) limit 10;
This is the result of locating by the id index first. But make the problem slightly more complicated and the trick falls apart. Look at this statement:
select id from collect where vtype=1 order by id limit 90000,10; is very slow, taking 8-9 seconds!
I believe many people will feel a sense of collapse here, just as I did. Isn't vtype indexed? How can it be slow? Having vtype indexed is all very well: a direct
select id from collect where vtype=1 limit 1000,10;
is very fast, basically 0.05 seconds. But scale that up 90 times to start from 90,000, and you would expect around 0.05 * 90 = 4.5 seconds; the measured 8-9 seconds is an order of magnitude worse.
At this point someone proposed the idea of splitting the table, the same idea as the discuz forum. The idea is as follows:
Build an index table t(id, title, vtype), make it fixed-length, do the paging on it, then take the paged results and fetch info from collect, as sketched below. Is it feasible? The experiment will tell.
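A hedged sketch of that split-table design (the DDL and the two-step fetch are assumptions based on the description above):
-- Narrow, fixed-length index table used only for paging.
CREATE TABLE t (
  id    INT UNSIGNED NOT NULL PRIMARY KEY,
  title CHAR(100)    NOT NULL,
  vtype TINYINT      NOT NULL,
  KEY idx_vtype (vtype)
) ENGINE = MyISAM;

-- Step 1: page on the narrow table.
SELECT id, title FROM t WHERE vtype = 1 ORDER BY id LIMIT 90000, 10;
-- Step 2: fetch the wide info column from collect for the ids returned
-- by step 1 (the id values here are just placeholders):
SELECT id, info FROM collect WHERE id IN (90001, 90002, 90003);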
Put the 100,000 records into t(id, title, vtype); the table is about 20 MB. Run
select id from t where vtype=1 order by id limit 90000,10;
Very fast: basically 0.1-0.2 seconds. Why? I guess it's because collect carries too much data, so paging over it has a long way to travel: the cost of limit is entirely tied to the size of the data table. In fact this is still a full table scan; it is only fast because the data volume is small, just 100,000 rows. OK, let's run a crazy experiment: grow it to 1,000,000 rows and test the performance. With 10 times the data, the t table is over 200 MB, still fixed-length. The same query statement completes in 0.1-0.2 seconds! No performance problem with the split-table approach, then?
Wrong! Because our limit offset is still 90,000, it stays fast. Make it big: start from 900,000.
select id from t where vtype=1 order by id limit 900000,10;
Look at the result: 1-2 seconds! Why?
Still this slow even after splitting; very frustrating! People say fixed-length rows improve limit performance, and at first I believed it: since every record has a fixed length, MySQL should be able to compute the position of row 900,000 arithmetically. But we overestimated MySQL's intelligence: it is not a commercial database, and it turns out fixed-length versus variable-length makes little difference to limit. No wonder people say discuz slows down once it reaches 1,000,000 records; I believe it now. This comes down to database design!
Can't MySQL get past 1,000,000 records??? Is 1,000,000 really the ceiling for paging?
The answer is: NO. Failing to get past 1,000,000 rows is our own poor MySQL design, not a MySQL limit. Below is a non-split-table approach and a crazy test: a single table with 1,000,000 records in a 10 GB database, paginated fast!
OK, our test returns to the collect table. The conclusion from the tests so far:
With 300,000 rows, the split-table approach is feasible; beyond 300,000 the times become unbearable! Of course, combining split tables with the method below would be absolutely perfect. But my method alone solves it perfectly without splitting any tables!
The answer is: a composite index! Once, while designing MySQL indexes, I noticed in passing that you can name an index whatever you like and pull several fields into it. What is that good for?
In the beginning, the query
select id from collect order by id limit 90000,10;
was fast purely because of the index, but once a where clause is added, the index is no longer used. With a try-it-and-see attitude, I added an index like search(vtype, id).
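In DDL terms, a hedged sketch (the article does not show the exact statement):
-- Hypothetical statement for the composite index described above:
-- the where column first, the paging primary key second.
ALTER TABLE collect ADD INDEX search (vtype, id);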
Then test
select id from collect where vtype=1 limit 90000,10;
Very fast: 0.04 seconds to complete!
Retest:
select id ,title from collect where vtype=1 limit 90000,10;
Very regrettably, 8-9 seconds: it didn't use the search index!
Retest with search(id, vtype), still the select id statement: regrettably, 0.5 seconds.
Summing up: when there is a where condition and you want to page with limit, you must design a composite index that puts the where column in the first position and the primary key used by limit in the second position, and you can select only the primary key!
That solves the paging problem perfectly. Getting ids back quickly is the key to optimizing limit; by this logic, a limit into the millions can finish in 0.0x seconds. It seems MySQL statement optimization and index design really matter!
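Putting the whole recipe together, a hedged sketch assuming the collect table and the search(vtype, id) index from above: page over the covering composite index first, then join back for the full rows.
-- The derived table touches only the covering index search(vtype, id):
-- it satisfies the where, the order by, and the select list (id alone).
-- The outer join then fetches the wide columns for just 10 rows.
SELECT c.id, c.title, c.info, c.vtype
FROM collect AS c
JOIN (
  SELECT id FROM collect
  WHERE vtype = 1
  ORDER BY id
  LIMIT 900000, 10
) AS p ON c.id = p.id;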