Why is my query still slow even though it uses an index?
2022-06-23 02:27:00 【Qingyang】
Link to the original text cnblogs.com/jackyfei/p/12122767.html
Students often ask: why does a SQL statement sometimes still end up in the slow query log even though it uses an index? Today we will start from this question and talk about indexes and slow queries.
A quick digression first: I think teams should use ORM sensibly. Sensible use means taking advantage of ORM's strengths in object-oriented modeling and write operations, while avoiding the possible pitfalls of join queries (unless, of course, your Linq skills are very strong). ORM hides too much of the underlying DB knowledge, which is not a good thing for programmers, so teams that pursue ultimate performance but do not understand their ORM well should be especially cautious.
Case analysis
Let's get down to business. For the experiment, I created the following table:
CREATE TABLE `t`( `id` int(11) NOT NULL, `a` int(11) DEFAULT NULL, PRIMARY KEY(`id`), KEY `a`(`a`) ) ENGINE=InnoDB;
In this table, id is the primary key index and a has an ordinary (secondary) index.
First, whether a SQL statement counts as a slow query is determined by its execution time. MySQL compares the statement's execution time with the long_query_time system parameter; if the statement takes longer than that, it is recorded in the slow query log. The default value of this parameter is 10 seconds. In production, of course, we would not set it that high. It is usually set to 1 second, and for latency-sensitive services it may be set to less than 1 second.
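As a minimal sketch of how this threshold is usually inspected and changed on a running server (the 1-second value follows the text above; adjust it to your own workload):

```sql
-- Show the current threshold (in seconds) and the slow query log settings.
SHOW VARIABLES LIKE 'long_query_time';
SHOW VARIABLES LIKE 'slow_query_log%';

-- Lower the threshold to 1 second and enable the log.
-- SET GLOBAL needs sufficient privileges, and the new long_query_time
-- only applies to connections opened after the change.
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log = ON;
```

Values set this way do not survive a server restart unless they are also written to the configuration file. Fractional values such as 0.5 are accepted for the sub-second case.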
Whether a statement used an index during execution can be checked with explain: in its output, the key column is not NULL when an index was used.
Let's take a look. For explain select * from t; the key is NULL.
(Figure 1)
For explain select * from t where id=2; the key is PRIMARY, which is what we usually call using the primary key index.
(Figure 2)
For explain select a from t; the key is a, which indicates that index a was used.
(Figure 3)
Although the key is not NULL for the latter two queries, the last one actually scans the entire index tree of a.
Suppose this table holds 1 million rows. The statement in Figure 2 still executes very quickly, but the one in Figure 3 is bound to be slow. In a more extreme case, for example when CPU pressure on this database is very high, even the execution time of the second statement may exceed long_query_time, and it will enter the slow query log.
So we can draw a conclusion: there is no necessary relationship between using an index and entering the slow query log. Using an index only describes how a SQL statement is executed. Whether it enters the slow query log is determined by its execution time, and that execution time may be affected by various external factors. In other words, your statement may still be slow even with an index.
The drawback of full index scans
If we look at this problem at a deeper level, there is actually another question to clarify: what does using an index really mean?
We all know that InnoDB tables are index-organized: all the data is stored on index trees. Take the table t above, which contains two indexes, a primary key index and an ordinary index. In InnoDB, the row data lives in the primary key index. As shown in the figure:
You can see that all the data is on the primary key index. Logically speaking, every query on an InnoDB table uses at least one index. So now let me ask you a question: if you execute select * from t where id>0, do you think this statement uses an index?
The explain output for this statement shows PRIMARY as the key. But from the data you know that this statement must have done a full scan. The optimizer, however, considers that executing this statement requires using the primary key index to locate the first value that satisfies id>0, so it counts as using an index.
So even if the key in the explain result is not NULL, the statement may in fact be a full table scan. Strictly speaking, in InnoDB only one case can be called not using an index: starting from the leftmost leaf node of the primary key index and scanning the entire index tree to the right.
In other words, "not using an index" is not an accurate description.
You can say full table scan to indicate that a query traverses the entire primary key index tree;
You can also say full index scan for a query like select a from t;, which scans the entire index tree of a;
And a statement like select * from t where id=2 is what we usually mean by using an index: it uses the index's fast lookup capability and effectively reduces the number of rows scanned.
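One way to tell these cases apart is to look beyond the key column. The statements below are a minimal sketch run against the table t from above. The column meanings in the comments follow the MySQL EXPLAIN documentation, and the plan actually chosen may vary with the server version and the data:

```sql
-- In EXPLAIN output, read `type` and `rows` together with `key`:
--   type = ALL                  full table scan (whole primary key index tree)
--   type = index                full index scan (whole tree of the index in `key`)
--   type = range / ref / const  the index is used to narrow the search
--   rows                        the optimizer's estimate of rows to examine
EXPLAIN SELECT * FROM t;
EXPLAIN SELECT a FROM t;
EXPLAIN SELECT * FROM t WHERE id > 0;
EXPLAIN SELECT * FROM t WHERE id = 2;
```

A non-NULL key combined with a rows estimate close to the size of the table is the signal that the statement still reads almost everything.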
Index filtering must be good enough
From the analysis above, we know that full index scans make queries slow. Next, let's talk about index filtering.
Suppose you maintain a table that records the basic information of China's 1.4 billion people, and you need to find the names and basic information of everyone aged between 10 and 15. Your statement would be: select * from t_people where age between 10 and 15.
Looking at this statement, you would certainly create an index on the age field, otherwise it is a full table scan. But you will find that even after adding the index, the statement is still slow, because more than 100 million rows may match.
Let's look at how this table is organized after the index is created:
The execution flow of this statement is as follows:
Search the age index tree for the first record with age equal to 10, get its primary key id, fetch the whole row from the primary key index by id, and return it as part of the result set;
Scan to the right on the age index, take the next id, fetch the whole row from the primary key index, and return it as part of the result set;
Repeat the previous step until the first record with age greater than 15 is met.
Look at this statement: although it uses the index, it scans more than 100 million rows. So now you know that when we talk about using indexes, what we really care about is the number of rows scanned.
For a large table, having an index is not enough; the index must also filter well enough.
Take age in the example just now: its filtering is not good enough. When designing the table structure, we need to make sure our indexes filter well enough, that is, their selectivity is high enough.
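A rough way to judge filtering before adding an index is to compare the number of distinct values with the number of rows. The queries below are a sketch only. They assume the t_people table and its age column from the example, and each of them reads the whole table, so on a table of this size you would run them against a sample or a replica:

```sql
-- Ratio of distinct values to total rows: close to 1 filters well,
-- close to 0 filters poorly.
SELECT COUNT(DISTINCT age) / COUNT(*) AS age_selectivity
FROM t_people;

-- How many rows the range in the example would actually touch.
SELECT COUNT(*) AS rows_in_range
FROM t_people
WHERE age BETWEEN 10 AND 15;
```

If the second number is a large fraction of the table, an index on age alone will not make the original statement fast, whatever explain says about the key.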
The cost of going back to the table
So if filtering is good, does that mean the number of rows scanned by the query must be small?
Let's take another example:
Suppose your statement is select * from t_people where name='张三' and age=8 (张三 is the name Zhang San).
The t_people table has a composite index on name and age. The filtering of this composite index should be good: it can quickly find the first child named Zhang San aged 8, and of course there shouldn't be many such children, so very few rows are scanned to the right and the query is very efficient.
But the filtering of a query is not necessarily the same as that of the index. Suppose your requirement now is to find all children whose name begins with the character 张 (Zhang) and who are 8 years old.
How would you write the statement? Obviously you would write: select * from t_people where name like '张%' and age=8;
In MySQL 5.5 and earlier versions, the execution flow of this statement is as follows:
First, on the composite index, find the first record whose name field begins with Zhang, take out its primary key id, then go to the primary key index tree and fetch the whole row by id;
Check whether the age field equals 8; if so, return the row as part of the result set, otherwise discard it;
Continue traversing to the right on the composite index, repeating the table lookup and the check, until reaching the first record on the composite index tree whose name does not begin with Zhang.
Fetching the whole row from the primary key index by id is called going back to the table. In this execution flow, the most time-consuming step is going back to the table. Suppose 80 million people in the country have names beginning with Zhang; then this process goes back to the table 80 million times. When locating the first record, only the leftmost prefix of the composite index can be used, which is known as the leftmost prefix principle.
As you can see, this execution flow goes back to the table a great many times and performs poorly. Is there a way to optimize it?
MySQL 5.6 introduced the index condition pushdown optimization. Let's look at the execution flow with this optimization:
First, on the composite index tree, find the first record whose name field begins with Zhang and check, on the index record itself, whether age is 8; if so, go back to the table, fetch the whole row, and return it as part of the result set; if not, discard it;
Continue traversing to the right on the composite index tree, checking the age field and going back to the table only when needed, until reaching the first record on the composite index tree whose name does not begin with Zhang.
The difference from the previous flow is that the condition age equal to 8 is pushed down into the traversal of the composite index, which reduces the number of times we go back to the table. Suppose that among the people whose names begin with Zhang, 1 million are 8-year-old children. This query still traverses 80 million entries on the composite index, but only goes back to the table 1 million times.
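To see whether index condition pushdown applies to this query, you can compare the plans with the optimization switched on and off. This is a sketch under assumptions: the index name idx_name_age is hypothetical (the text only says that a composite index on name and age exists), and the literal '张%' stands for names beginning with Zhang:

```sql
-- Hypothetical name for the composite index described in the text.
ALTER TABLE t_people ADD INDEX idx_name_age (name, age);

-- With pushdown enabled (the default since MySQL 5.6), the Extra column
-- of EXPLAIN reports "Using index condition" when the optimization is used.
SET SESSION optimizer_switch = 'index_condition_pushdown=on';
EXPLAIN SELECT * FROM t_people WHERE name LIKE '张%' AND age = 8;

-- With pushdown disabled, age = 8 is checked only after the row has
-- been fetched from the primary key index, as in the 5.5 flow.
SET SESSION optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM t_people WHERE name LIKE '张%' AND age = 8;
```

Both plans scan the same range of the composite index; what changes is how many of those entries lead to a lookup on the primary key index.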
Virtual columns
As you can see, the optimization works quite well, but it still cannot get around the leftmost prefix principle, so 80 million rows on the composite index still have to be scanned. Is there any further optimization?
We can consider building a composite index on the first character of the name and age. This can be done with virtual columns, introduced in MySQL 5.7. The SQL statement to modify the table structure is:
alter table t_people add name_first varchar(2) generated always as (left(name,1)), add index(name_first,age);
The resulting table definition looks like this:
CREATE TABLE `t_people`( `id` int(11) DEFAULT NULL, `name` varchar(20) DEFAULT NULL, `age` int(11) DEFAULT NULL, `name_first` varchar(2) GENERATED ALWAYS AS (left(`name`,1)) VIRTUAL, KEY `name_first`(`name_first`,`age`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
First of all, he is people Create a field called name_first Virtual columns of , And then to name_first and age Create a union index on , also , Let the value of this virtual column always equal to name The first two bytes of the field , Virtual columns cannot specify values when inserting data , You can't change it when you update it , Its value is automatically generated according to the definition , stay name When the field is modified, it will also be automatically modified .
With this new joint index , We're looking for number 1 A word is Zhang , And the age is 8 When I was a kid , This SQL The sentence can be written like this :select * from t_people where name_first=' Zhang ' and age=8.
So the execution of this statement , Just scan the joint index 100 Line ten thousand , And go back to the table 100 Ten thousand times , The essence of this optimization is that we create a more compact index , To speed up the query process .
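A short sketch of how the virtual column behaves once it exists, assuming t_people has the id, name and age columns plus the name_first column generated from name as described above. The sample rows are made up for illustration:

```sql
-- name_first is omitted from the column list; MySQL derives it from name.
INSERT INTO t_people (id, name, age) VALUES (1, '张三', 8);

-- Changing name updates name_first automatically.
UPDATE t_people SET name = '李四' WHERE id = 1;

-- Supplying a value for the generated column is rejected by MySQL:
-- INSERT INTO t_people (id, name, age, name_first) VALUES (2, '张三', 8, '张');

-- The composite index on (name_first, age) should now appear in `key`.
EXPLAIN SELECT * FROM t_people WHERE name_first = '张' AND age = 8;
```

Because the column is VIRTUAL, its values are not stored in the row itself; they are materialized only in the index built on it, which is what keeps this approach compact.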
Summary
This article introduced the basic structure of indexes and some basic ideas of query optimization. Now you know that statements using an index can still be slow queries, and that query optimization is often the process of reducing the number of rows scanned.
Slow queries can be summed up in the following situations:
- Full table scan
- Full index scan
- Poor index filtering
- The cost of frequently going back to the table