
[MySQL] MySQL paging queries over millions of rows and how to optimize them


Method 1: Use a plain SQL statement directly

  • Statement form: in MySQL, the following can be used: SELECT * FROM table_name LIMIT M,N

  • Applicable scenario: suitable for small amounts of data (hundreds to thousands of rows)

  • Reason / drawback: this is a full table scan, so it gets slow, and some databases return the result set in an unstable order (e.g., one run returns 1,2,3, another returns 2,1,3). LIMIT takes N rows starting at offset M of the result set and discards the rest (see the sketch below).
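A minimal sketch of Method 1 (your_table and the 10-row page size are assumptions for illustration): for page p, the offset M is (p-1)*10.

-- page 3: skip rows 1-20, return rows 21-30;
-- the server still generates and discards the skipped rows
SELECT * FROM your_table LIMIT 20, 10;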

Method 2: Create a primary key or unique index and use it (assume 10 rows per page)

  • Statement form: in MySQL, the following can be used: SELECT * FROM table_name WHERE id_pk > (pageNum*10) LIMIT M

  • Applicable scenario: suitable for large amounts of data (tens of thousands of rows)

  • Reason: this is an index scan, so it is fast. A reader pointed out: because the data is not sorted by pk_id, rows can be missed in some cases, so Method 3 must be used instead.

Method 3: Reorder based on the index

  • Statement form: in MySQL, the following can be used: SELECT * FROM table_name WHERE id_pk > (pageNum*10) ORDER BY id_pk ASC LIMIT M

  • Applicable scenario: suitable for large amounts of data (tens of thousands of rows). The ORDER BY column should preferably be the primary key or a unique key, so that the ORDER BY operation can be eliminated by using the index while the result set stays stable (for what stability means, see Method 1). See the sketch after this list.

  • Reason: this is an index scan, so it is fast. Note that when this was written, MySQL's index-assisted sort only worked for ASC, not DESC (DESC in an index definition was parsed but ignored; true descending indexes only arrived in MySQL 8.0).
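A sketch of Method 3 with concrete values (your_table is an assumed name, the page size is 10, and id_pk is assumed to be a dense auto-increment key, since the WHERE clause derives the boundary from the page number):

-- page 4 (pageNum = 3): skip ids 1-30 through the index instead of an offset
SELECT * FROM your_table WHERE id_pk > 30 ORDER BY id_pk ASC LIMIT 10;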

Method 4: Use a prepared statement based on the index

The first placeholder represents pageNum, the second represents the number of rows per page.

  • Statement form: in MySQL, the following can be used: PREPARE stmt_name FROM 'SELECT * FROM table_name WHERE id_pk > (?*?) ORDER BY id_pk ASC LIMIT M'

  • Applicable scenario: large amounts of data

  • Reason: this is an index scan, so it is fast, and a prepared statement is somewhat faster than an equivalent ad-hoc query (see the sketch below).
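A minimal runnable sketch of Method 4 (your_table, id_pk, the variable names, and the fixed page size of 10 are assumptions for illustration; note that PREPARE takes the statement text as a string):

PREPARE page_stmt FROM
  'SELECT * FROM your_table WHERE id_pk > ? * ? ORDER BY id_pk ASC LIMIT 10';
SET @pageNum = 5;    -- first placeholder: page number
SET @pageSize = 10;  -- second placeholder: rows per page
EXECUTE page_stmt USING @pageNum, @pageSize;
DEALLOCATE PREPARE page_stmt;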

Method 5: Use MySQL's index-backed ORDER BY to locate part of the tuples quickly and avoid a full table scan

For example: read rows 1000 to 1019 (pk is the primary key / unique key).

SELECT * FROM your_table WHERE pk >= 1000 ORDER BY pk ASC LIMIT 0,20
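Following the same idea, the next page does not need a growing offset: feed the largest pk of the current page back into the WHERE clause (a sketch; 1019 is the last pk returned above, assuming dense keys):

SELECT * FROM your_table WHERE pk > 1019 ORDER BY pk ASC LIMIT 20;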

Method 6: Use a "subquery / join + index" to locate the position of a tuple quickly, then read the tuples.

For example (id is the primary key / unique key; $page and $pagesize are application-side variables):

Subquery example:

SELECT * FROM your_table WHERE id <=
(SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1)
ORDER BY id DESC
LIMIT $pagesize;

Join example:

SELECT * FROM your_table AS t1
JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1) AS t2
WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT $pagesize;
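For concreteness, with $page = 3 and $pagesize = 20, ($page-1)*$pagesize expands to 40 on the application side, so the join form becomes (a sketch):

SELECT * FROM your_table AS t1
JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT 40, 1) AS t2
WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT 20;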

When MySQL paginates a large table with limit, the query becomes less efficient as the page number grows.

Test experiment

  1. Directly use a limit start, count paging statement, which is also the method I use in my program:

select * from product limit start, count

When the starting offset is small, the query has no performance problem. Let's look at the execution times when paging starts from offsets 10, 100, 1000, and 10000 (fetching 20 rows per page).

as follows :

select * from product limit 10, 20      0.016 seconds
select * from product limit 100, 20     0.016 seconds
select * from product limit 1000, 20    0.047 seconds
select * from product limit 10000, 20   0.094 seconds

We can see that as the starting record grows, the time grows with it. This shows that the cost of the limit paging statement is closely tied to the starting offset. So let's move the starting record to 400,000, roughly half of the records:

select * from product limit 400000, 20   3.229 seconds

Now let's look at the time taken to fetch the last page of records:

select * from product limit 866613, 20   37.44 seconds

Clearly this kind of time is intolerable for the last pages.

From this we can also draw two conclusions:

  1. The query time of a limit statement is proportional to the position of the starting record.

  2. MySQL's limit statement is very convenient, but it is not suitable for direct use on tables with many records.

2. Performance optimization for the limit paging problem

Use a covering index to speed up paging queries

As we all know, if a query uses an index and touches only the indexed columns (a covering index), it runs fast.

This is because index lookups have optimized search algorithms, and the data sits in the index itself, so there is no need to go back to the table for the row data, which saves a lot of time. In addition, MySQL has index caches, so under high concurrency a covering index benefits even more from caching.
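One way to confirm that a covering index is in play (a sketch against the product table used in the tests below): EXPLAIN reports "Using index" in its Extra column when the query is satisfied from the index alone.

-- Extra: "Using index" means no table rows are read, only the index
EXPLAIN SELECT id FROM product LIMIT 866613, 20;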

In our case, we know the id field is the primary key, so it naturally carries the default primary key index. Let's look at the effect of using a covering index.

This time we query the data on the last page (using a covering index, selecting only the id column), as follows:

select id from product limit 866613, 20   0.2 seconds

Compared with 37.44 seconds, this is over 100 times faster.

So if we also want to query all the columns, there are two ways: one is the id >= form, the other uses a join. Let's look at the actual results:

SELECT * FROM product WHERE ID >= (select id from product limit 866613, 1) limit 20

The query time is 0.2 seconds!

Another way of writing it:

SELECT * FROM product a JOIN (select id from product limit 866613, 20) b ON a.ID = b.id

The query time is also very short!

3. Composite index optimization method

How high can MySQL's performance really go? MySQL is a database that rewards DBA-level expertise. For a small system with 10,000 news articles, you can write the queries any way you like and use any framework for rapid development. But once the data volume reaches 100,000, or millions to tens of millions of rows, is the performance still that high? One small mistake may force the whole system to be rewritten, or worse, stop it from working at all. Okay, enough talk.

Let the facts speak. Look at this example:

The data table collect has 4 fields (id, title, info, vtype), where title is fixed-length, info is text, id is auto-increment, and vtype is a tinyint with an index on it. This is a simple model of a basic news system. Now fill it with data: 100,000 news articles. In the end collect holds 100,000 records, and the table takes up 1.6 GB of disk.
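For reference, a sketch of a collect definition matching that description (the exact column widths and engine are assumptions, not given in the original):

CREATE TABLE collect (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- auto-increment id
  title CHAR(100)    NOT NULL,                             -- fixed-length title, width assumed
  info  TEXT,                                              -- news body
  vtype TINYINT      NOT NULL,                             -- type flag
  INDEX idx_vtype (vtype)                                  -- the index mentioned above
);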

OK, look at the following SQL statement:

select id,title from collect limit 1000,10;

Very fast; it basically finishes in 0.01 seconds. Now look at the following:

select id,title from collect limit 90000,10;

Paging from 90,000 — what's the result?

It takes 8-9 seconds to complete. My god, what's wrong? Actually, you can find answers online on how to optimize this. Look at the following statement:

select id from collect order by id limit 90000,10;

Very fast: 0.04 seconds. Why? Because walking the primary key index on id is of course fast. The improved version circulating online is:

select id,title from collect where id>=(select id from collect order by id limit 90000,1) limit 10;

That works off the result of the id index lookup. But once the problem gets a bit more complicated, it falls apart. Look at the following statement:

select id from collect where vtype=1 order by id limit 90000,10; is very slow: it took 8-9 seconds!

I believe many people, like me, will feel a sense of collapse at this point. Isn't vtype indexed? How can it be slow? It's true that vtype is indexed; if you directly run

select id from collect where vtype=1 limit 1000,10;

it is very fast, basically 0.05 seconds. But scale that up 90 times to start from 90,000, and even linear scaling predicts 0.05*90 = 4.5 seconds, which is on the same order of magnitude as the measured 8-9 seconds.

From here someone proposed the idea of splitting tables, the same idea behind the Discuz forum software. The idea is as follows:

Build an index table t(id, title, vtype) with fixed-length rows, do the pagination on it, then take the paged ids into collect to fetch info. Is it feasible? The experiment will tell.
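A sketch of that index table (column widths assumed to mirror collect; every column is fixed-length, so the row length is fixed):

CREATE TABLE t (
  id    INT UNSIGNED NOT NULL PRIMARY KEY,  -- same id as in collect
  title CHAR(100)    NOT NULL,              -- fixed-length copy of the title
  vtype TINYINT      NOT NULL,
  INDEX idx_vtype (vtype)
);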

Put 100,000 records into t(id, title, vtype); the table size is about 20 MB. Using

select id from t where vtype=1 order by id limit 90000,10;

it is fast, basically finishing in 0.1-0.2 seconds. Why? I guess it's because collect has so much data that paging has a long way to travel; the cost of limit is entirely tied to the size of the table. In fact this is still a full table scan, fast only because the data volume is small, just 100,000 rows. OK, let's do a crazy experiment: grow it to 1 million rows and test performance. With 10 times the data, the t table immediately reaches 200+ MB, still fixed-length. The same query statement now completes in 0.1-0.2 seconds! So the split-table approach has no performance problem?

Wrong! Our limit was still 90,000, which is why it was fast. Make it big: start from 900,000.

select id from t where vtype=1 order by id limit 900000,10;

Look at the result: 1-2 seconds! Why?

Still this long, very depressing! Some said fixed-length rows would improve limit performance, and at first I believed it too: since every record has a fixed length, MySQL should be able to compute the position of row 900,000 directly, right? But we overestimated MySQL's intelligence; it is not a commercial database, and as it turns out, fixed length versus variable length has little impact on limit. No wonder people say Discuz becomes slow once it reaches 1 million records. I believe that's true; it comes down to database design!

Can't MySQL break through the 1 million barrier??? Is 1 million records really the limit?

The answer is: NO. The reason it can't break 1 million is simply a failure to design for MySQL. Below is the approach without splitting tables, a crazy test! One table handles 1 million records and a 10 GB database. Here is how to paginate it fast!

Okay, our test goes back to the collect table. The conclusion so far:

With 300,000 records it is feasible to use the split-table approach; beyond 300,000 the speed becomes unbearable! Of course, split tables combined with the method below would be absolutely perfect. But with my method alone, it can be solved perfectly without splitting tables at all!

The answer is: a composite index! Once, while designing MySQL indexes, I noticed in passing that an index can be given any name and can include several fields. What is that good for?

At the beginning,

select id from collect order by id limit 90000,10;

was so fast because it walked the index. But add a where clause and the index is no longer used. With a try-it attitude, I added an index of the form search(vtype, id).
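A sketch of that statement (the index name search comes from the text above):

-- composite index: where column first, primary key second
ALTER TABLE collect ADD INDEX search (vtype, id);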

Then test:

select id from collect where vtype=1 limit 90000,10;

Very fast! It completes in 0.04 seconds!

Test again:

select id, title from collect where vtype=1 limit 90000,10;

Very regrettable: 8-9 seconds. It didn't use the search index! (With title added, the index no longer covers the query, so every scanned entry needs a row lookup.)

Test again with search(id, vtype): still the select id statement, and regrettably it takes 0.5 seconds.

To sum up: if there is a where condition and you want to page with limit, you must design an index that puts the where column first and the primary key used by limit second, and you must select only the primary key!

This solves the paging problem perfectly. If ids can be returned quickly, there is hope of optimizing limit; by this logic, a limit over millions of rows can finish in 0.0x seconds. It seems MySQL statement optimization and indexing matter a great deal!
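Putting the rules together, a sketch of the full pattern against the collect table: page through ids inside the covering search(vtype, id) index, then join back for the remaining columns.

SELECT c.id, c.title
FROM collect AS c
JOIN (
  -- covered by search(vtype, id): equality on vtype, order by id
  SELECT id FROM collect WHERE vtype = 1 ORDER BY id LIMIT 90000, 10
) AS t ON c.id = t.id
ORDER BY c.id;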
