当前位置：网站首页>In and exceptions, count (*) query optimization

In and exceptions, count (*) query optimization

2022-06-26 18:05:00 【Time is light, you are safe】

One 、in and exsits

1.1 principle

Small tables drive large tables , That is, small datasets drive large datasets

1.2 in

Applicable scenario ： When B The data set of the table is less than A When the data set of the table ,in be better than exists

select * from A where id in (select id from B)

       Think about it first , It's in brackets SQL First or outside first ？？？
       In the case of large numbers, it is in parentheses SQL Execute first , There are also some special cases , Related to the amount of data , It may be that the external personnel execute first .
       Here we discuss the first execution in brackets .
       In fact, it is to execute the SQL, And then put the outside SQL Put it in there SQL Line by line correlation comparison in the executed result set .

 # Equivalent to 
 for(select id from B){
     
  select * from A where A.id = B.id 
 }

B The smaller the amount of data in the table ,for The fewer cycles

1.3 exsits

Applicable scenario ： When A The data set of the table is less than B When the data set of the table ,exists be better than in

select * from A where exists (select 1 from B where B.id = A.id)

Based on the above in Of SQL, Above SQL It is the outer statement that executes first .
In fact, it's to put the outside first SQL Statement result set , And then put it into the sub query line by line B Do conditional verification , See if you can find the data （ return true or false）, return true The data will be put into the final result set .

 # Equivalent to : 
for(select * from A){
     
   select * from B where B.id = A.id 
} 
#A Table and B Tabular ID Fields should be indexed

1、EXISTS (subquery) Only return TRUE or FALSE, So in the subquery SELECT * It can also be used. SELECT 1 Replace , The official saying is that the actual implementation will Ignore SELECT detailed list , So there's no difference
2、EXISTS The actual execution process of the subquery may have been optimized, rather than the item by item comparison in our understanding
3、EXISTS Subqueries can often also be used JOIN Instead of , What's the best need to be analyzed in detail

Two 、count(*)

The table structure is as follows ：
Insert picture description here
Temporarily Closed mysql The query cache , To see sql The real time of multiple executions

set global query_cache_size=0; 
set global query_cache_type=0;

Now there are four things SQL

select count(1) from employees;
select count(id) from employees;
select count(name) from employees; 
select count(*) from employees;

Which one SQL The query efficiency will be a little higher ？？ yes count( Constant )、count( Primary key )、count( Non primary key fields )、 still count(*) Well ？

2.1 Under the analysis of count( Non primary key fields )

select count(name) from employees;

name Is a non primary key field , In the union index , So the bottom layer will actually scan the non primary key tree of the joint index , Scan one by one , Every time you scan one, the value +1, When it comes to null Value will not be added 1

2.2 Under the analysis of count( Primary key field )

select count(id) from employees;

id Is the primary key field , In the primary key index , So the bottom layer actually scans the primary key index tree , Scan one by one , Every time you scan one, the value +1,（ Because the primary key cannot be empty , Therefore, null values are not considered here ）.

2.3 count Comparison of different fields

Come here , You can think about ：
①、count(name) It is better to be more efficient count(id) Be more efficient ？
answer count(name) High efficiency . Before MySQL It is mentioned in the index introduction of , because Secondary indexes store less data than primary indexes （ The secondary index only stores the primary key data of the corresponding data , The primary key index stores all of the data Field ）, Query performance should be higher

②、count(name) Efficient or count(1) Efficient ？
answer ：count(1) High efficiency .count(1) and count(name) The final use is the secondary index , In general, the performance is similar , however count(1) Only the index tree will be traversed , Will not put name Take out the fields , and count(name) When traversing the index tree , The corresponding fields will be taken out , The bottom layer may encode and decode fields , The difference between the two is the time .

③、count(id) Efficient or count(*) Efficient ？
This may be difficult to understand , You can use first explain Look at the results , It is also a scanned secondary index tree , So relatively speaking count(id) Efficient .

therefore , In general ：count(1) > count(name) > count(*) > count(id)
(ps： stay MySQL5.7 In the version of the ,count(id) Optimized , In fact, it is the secondary index , Instead of the primary key index )

summary ：
four sql The efficiency of implementation is almost the same , The difference lies in According to a certain field count The statistics field will not be null value The data line

2.4 Common optimizations

2.4.1 Inquire about mysql The total number of rows maintained by yourself

about myisam The table of the storage engine does not have where Conditions of the count Query performance is very high , because myisam The total number of rows in the table of the storage engine will be mysql Stored on disk , The query does not need to calculate
Insert picture description here
about innodb The table that stores the engine ,mysql The total number of record rows of the table will not be stored , Inquire about count Real time computing is required

2.4.2 show table status

If you just need to know Of the total number of rows in the table Estimated value It can be used as follows sql Inquire about , A high performance
Insert picture description here

2.4.3 Maintain the total to Redis in

When inserting or deleting table data rows, maintain redis The total number of rows in the table key The count of ( use incr or decr command ), But this way Maybe not , It is difficult to guarantee table operation and redis Transactional consistency of operations

2.4.4 Add database count table

Maintain the count table while inserting or deleting table data rows , Let them Operate in the same transaction

More optimization , You can go to Paging query 、JOIN Association query optimization To view the .

原网站

版权声明
本文为[Time is light, you are safe]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/177/202206261753537304.html