MySQL queries are slow: besides indexes, what else can you do?
2022-07-24 20:59:00 【Trouvailless】
I've been a proficient Ctrl+C, Ctrl+V CRUD developer for years.
Why are MySQL queries slow? This question comes up all the time in day-to-day development, and it is a high-frequency interview question too.
When we hit it, we usually assume the index is to blame.
But apart from indexes, what other factors can make database queries slow?
And which operations can improve MySQL's query performance?
In today's article, let's go through the scenarios that slow database queries down, along with the reasons and the fixes.
Database query process
First, let's look at what a query statement goes through once it arrives.
Say we have a table like this:
CREATE TABLE `user` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `name` varchar(100) NOT NULL DEFAULT '' COMMENT 'name',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT 'age',
  `gender` int(8) NOT NULL DEFAULT '0' COMMENT 'gender',
  PRIMARY KEY (`id`),
  KEY `idx_age` (`age`),
  KEY `idx_gender` (`gender`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The application code we usually write (in Go, C++ and so on) is the client.
Under the hood, the client takes the username and password and tries to establish a long-lived TCP connection to MySQL.
MySQL's connection management module takes care of this connection.
Once the connection is established, the client can execute a SQL query, for example:
select * from user where gender = 1 and age = 100;
The client sends the SQL statement to MySQL over this network connection.
After MySQL receives the statement, the analyzer first checks it for syntax errors. For example, if you drop an l from select and write slect, you get the error You have an error in your SQL syntax;, which people like me know all too well.
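For instance, something roughly like this (the exact wording varies a little by MySQL version):
mysql> slect * from user;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'slect * from user' at line 1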
Next comes the optimizer, which chooses which index to use according to a set of rules.
After that, the executor calls the interface functions of the storage engine.

(Figure: MySQL architecture)
The storage engine is a pluggable component: it is the part of MySQL that actually fetches rows and returns them. You can use MyISAM, which does not support transactions, or swap in InnoDB, which does. The engine is specified when the table is created, for example:
CREATE TABLE `user` ( ... ) ENGINE=InnoDB;
The most commonly used engine today is InnoDB, so let's focus on it.
In InnoDB, operating directly on disk would be slow, so a layer of memory called the buffer pool is added to speed things up. It holds many memory pages, each 16KB by default; some pages contain the row data you see in the table, and others contain index information.

(Figure: the buffer pool and the disk)
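If you want to confirm the page size on your own instance, a quick check looks like this (16384 bytes, i.e. 16KB, is the default, though it is configurable when the instance is initialized):
mysql> show variables like 'innodb_page_size';
On a default installation this returns 16384.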
When the query reaches InnoDB, it uses the index chosen by the optimizer to look up the corresponding index pages; any index page that is not in the buffer pool is loaded from disk. The index pages then point to the exact location of the data pages, and if those data pages are not in the buffer pool either, they are loaded from disk as well.
That is how we end up with the rows we want.

(Figure: relationship between index pages and disk pages)
Finally, the result set is returned to the client.
Slow query analysis
If any step in this process is slow, we can turn on profiling to see which stage the time is spent in.
mysql> set profiling=ON;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show variables like 'profiling';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| profiling     | ON    |
+---------------+-------+
1 row in set (0.00 sec)
Then execute SQL statements as usual.
Their execution times are recorded; to see which statements have been recorded, run show profiles;
mysql> show profiles;
+----------+------------+----------------------------------------------------+
| Query_ID | Duration   | Query                                              |
+----------+------------+----------------------------------------------------+
|        1 | 0.06811025 | select * from user where age>=60                   |
|        2 | 0.00151375 | select * from user where gender = 2 and age = 80   |
|        3 | 0.00230425 | select * from user where gender = 2 and age = 60   |
|        4 | 0.00070400 | select * from user where gender = 2 and age = 100  |
|        5 | 0.07797650 | select * from user where age!=60                   |
+----------+------------+----------------------------------------------------+
5 rows in set, 1 warning (0.00 sec)
Note the query_id above. For example, select * from user where age>=60 has query_id 1. To see a stage-by-stage time breakdown for that SQL statement, run the following command.
mysql> show profile for query 1;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000074 |
| checking permissions | 0.000010 |
| Opening tables       | 0.000034 |
| init                 | 0.000032 |
| System lock          | 0.000027 |
| optimizing           | 0.000020 |
| statistics           | 0.000058 |
| preparing            | 0.000018 |
| executing            | 0.000013 |
| Sending data         | 0.067701 |
| end                  | 0.000021 |
| query end            | 0.000015 |
| closing tables       | 0.000014 |
| freeing items        | 0.000047 |
| cleaning up          | 0.000027 |
+----------------------+----------+
15 rows in set, 1 warning (0.00 sec)
From these items you can see exactly where the time goes. Here Sending data takes by far the longest: it is the stage where the executor starts reading rows and sending them to the client. Since tens of thousands of rows match my query, it is expected that this stage dominates.
In practice, most of the time usually ends up in the Sending data stage, and when that stage is slow the first thing to suspect is the index.
Index-related causes
Index problems can usually be analyzed with the explain command, which shows which indexes are used and roughly how many rows will be scanned.
In the optimizer phase, MySQL decides which index will make the query fastest.
It mainly weighs factors such as:
- how many rows (rows) would be scanned if this index were chosen
- how many 16KB pages would have to be read to fetch those rows
- whether a secondary index requires going back to the table (the primary key index does not), and how expensive that back-to-table step is
Going back to the SQL from show profile, let's run explain select * from user where age>=60 and take a look.

(Figure: explain output for the SQL)
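The screenshot is not reproduced here, but the output would look roughly like this (illustrative values only; the exact columns and the row estimate depend on your MySQL version and data):
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | user  | ALL  | idx_age       | NULL | NULL    | NULL | 99999 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+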
In this statement, type is ALL, which means a full table scan. possible_keys lists the indexes that could be used, here the ordinary index on age. But key, the column showing the index actually chosen, is NULL. In other words, this SQL does not use the index at all; it scans the whole table.
The reason is that too many rows in the table match the condition. If the age index were used, the matching entries would have to be read from it, and since age is an ordinary (secondary) index, each hit would also require going back to the table via the primary key to locate the corresponding data page. At that point a straight scan over the primary key is cheaper, so the optimizer settles on the full table scan.
Of course, this is just one example. In practice it is quite common for MySQL to use no index at all, or to use an index other than the one you expected. There are many ways an index can fail to be used, such as not-equal conditions or implicit type conversions; you have probably memorized plenty of those interview answers already, so I won't repeat them.
Instead, let's talk about two problems you are likely to hit in production.
The index used is not the one you expect
There are some special situations in real development. A table may start out with little data and few indexes, and a SQL statement uses exactly the index you expect. But as time passes, more developers touch the table, the data grows, and perhaps some redundant indexes get added; the optimizer may then pick another index that does not match your expectation, and the query suddenly slows down.
This kind of problem is easy to fix: you can specify the index explicitly with force index. For example:

(Figure: specifying the index with force index)
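The statement in the screenshot is not reproduced here, but with the example table it would look roughly like this (a sketch using the idx_age index defined above):
select * from user force index(idx_age) where age >= 60;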
Running explain again shows that with force index added, the SQL now uses the idx_age index.
Still slow even though an index is used
Some SQL statements clearly use an index according to explain, yet they are still slow. There are generally two situations.
The first is that the index selectivity is too low. For example, using the full URL of a web page as an index: at a glance the values all start with the same domain name, so if the prefix index is not long enough, using it is about as slow as a full table scan. The right approach is to make the indexed value more selective, for example by stripping the domain and indexing only the trailing URI part.

(Figure: prefix index with low selectivity)
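As a rough sketch (the page table and its url and uri columns are hypothetical, not part of the article's schema):
-- a short prefix of a full URL is mostly the shared domain name, so it barely narrows anything down
alter table page add key idx_url (url(20));
-- indexing a more selective column, such as the URI with the domain stripped off, works much better
alter table page add key idx_uri (uri(64));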
The second is that too many rows in the table match the index condition. Here the field to watch in explain is rows.
It is an estimate of how many rows the query will need to examine; it is not exact, but it reflects the rough order of magnitude.
When it is very large, the following situations are common.
If the field should essentially be unique, such as a phone number, there should not normally be many duplicates. A huge match count may mean your code is doing a large number of duplicate inserts, in which case you need to review the logic, or add a unique index as a constraint.
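For instance (a sketch; the phone column is hypothetical and is not part of the example user table):
alter table user add unique key uk_phone (phone);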
If the field genuinely matches a lot of data, ask whether you really need all of it. If not, add a limit clause. If you really do need all of it, don't fetch it in one go: the data may be small enough today that pulling ten or twenty thousand rows at once is no problem, but once it grows to a hundred thousand, a single fetch won't hold up. You need to fetch in batches: sort with order by id, take one batch, and use the largest id in that batch as the starting point for the next one, as sketched below.
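A minimal sketch of that batching pattern (the batch size and the {last_max_id} placeholder are illustrative):
-- first batch
select * from user where age >= 60 order by id limit 1000;
-- subsequent batches: start after the largest id seen in the previous batch
select * from user where age >= 60 and id > {last_max_id} order by id limit 1000;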
Too few connections
That covers the index-related causes. Now let's look at what else, besides indexes, can limit query speed.
As we saw, MySQL's server layer has a connection management module whose job is to manage the long-lived connections between clients and MySQL.
Under normal circumstances, if the client and the server have only one connection between them, then after issuing a query the client can only block and wait for the result. With a large number of concurrent requests, every later request has to wait for the earlier ones to finish before it can even start.

(Figure: too few connections cause SQL to block)
That is why our applications (Go or Java services, for example) sometimes log SQL statements that took several minutes, even though running the same statement by hand finishes in milliseconds: those statements spent the time waiting for the SQL ahead of them to complete.
How do we fix it?
If we can open more connections, requests can execute concurrently, and later ones no longer have to wait so long.

(Figure: more connections speed up SQL execution)
The number of connections can be too small on either side: the database or the client.
The number of database connections is too small
MySQL's default max_connections is quite small (151 in recent versions), and it can be raised to a much higher value.
You can change the database's maximum number of connections by setting the max_connections parameter.
mysql> set global max_connections= 500;
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 500   |
+-----------------+-------+
1 row in set (0.00 sec)
The commands above raise the maximum number of connections to 500.
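Note that set global does not survive a server restart. To make the change permanent, also set it in the MySQL configuration file, or on MySQL 8.0+ use set persist (a sketch, assuming 8.0 or later):
set persist max_connections = 500;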
The number of connections on the application side is too small
You've raised the database's connection limit, but nothing seems to change? Plenty of SQL still takes minutes, or even times out?
Then the problem is probably that your application side (the Go or Java service, i.e. the MySQL client) has too few connections.
The connection between the application and MySQL is a long-lived TCP connection, and TCP needs a three-way handshake to set up and a four-way teardown to close. Re-establishing a new connection for every SQL statement means constant handshaking and waving goodbye, which is very time-consuming. So applications normally create a long-lived connection pool: when a connection is done, it goes back into the pool, and the next SQL statement just fishes one out of the pool and reuses it. Very eco-friendly.

(Figure: how a connection pool works)
The code we write usually goes through a third-party ORM library to operate the database, and any mature ORM library will absolutely have a connection pool.
That pool has a size, which caps the number of connections your application can hold. If the pool is smaller than the database's limit, raising the database's maximum number of connections alone won't help.
Generally you just need to check your ORM library's documentation for how to set the pool size; it is only a few lines of code. For example, this is how it is set with gorm in Go:
func Init() {
    // conn is the MySQL DSN and config is a *gorm.Config
    db, err := gorm.Open(mysql.Open(conn), config)
    if err != nil {
        panic(err)
    }
    sqlDB, err := db.DB()
    if err != nil {
        panic(err)
    }
    // SetMaxIdleConns sets the maximum number of idle connections kept in the pool
    sqlDB.SetMaxIdleConns(200)
    // SetMaxOpenConns sets the maximum number of open connections to the database
    sqlDB.SetMaxOpenConns(1000)
}
The buffer pool is too small
With more connections, things got faster.
But an interviewer once asked me: is there any other way to make it faster still?
Then you frown, pretend to think hard, and say: yes, there is.
In the query-process walkthrough earlier, we saw that once a query enters InnoDB there is a layer of memory, the buffer pool, that loads disk data pages into memory pages. If the page you need is already in the buffer pool, it can be returned directly; otherwise a disk IO is required, which is slow.
In other words, the larger the buffer pool, the more data pages it can hold, and the more likely a query is to hit the buffer pool, so queries naturally get faster.
You can check the buffer pool size, in bytes, with the following command.
mysql> show global variables like 'innodb_buffer_pool_size';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 134217728 |
+-------------------------+-----------+
1 row in set (0.01 sec)
That is 128MB.
If you want to make it a bit bigger, you can run:
mysql> set global innodb_buffer_pool_size = 536870912;
Query OK, 0 rows affected (0.01 sec)

mysql> show global variables like 'innodb_buffer_pool_size';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 536870912 |
+-------------------------+-----------+
1 row in set (0.01 sec)
This increases the buffer pool to 512MB.
However, if the buffer pool is already a reasonable size and the query is slow for some other reason, resizing it is pointless.
Which raises the question:
how do I know whether the buffer pool is too small?
We can look at the buffer pool's cache hit rate.

(Figure: checking the buffer pool hit rate)
Running show status like 'Innodb_buffer_pool_%'; shows some buffer-pool-related counters.
Innodb_buffer_pool_read_requests is the number of logical read requests.
Innodb_buffer_pool_reads is the number of reads that had to go to the physical disk.
So the buffer pool hit rate can be calculated as:
buffer pool hit rate = (1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) * 100%
With the numbers from my screenshot that is (1 - 405/2278354) * 100% ≈ 99.98%, which is a very healthy hit rate.
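If you don't want to do the arithmetic by hand, a query along these lines computes it directly (a sketch assuming MySQL 5.7+ with the performance_schema enabled):
select (1 - r.VARIABLE_VALUE / q.VARIABLE_VALUE) * 100 as hit_rate_pct
from performance_schema.global_status r, performance_schema.global_status q
where r.VARIABLE_NAME = 'Innodb_buffer_pool_reads'
  and q.VARIABLE_NAME = 'Innodb_buffer_pool_read_requests';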
In general the buffer pool hit rate stays above 99%. If it drops below that, consider increasing innodb_buffer_pool_size.
You can also put this hit rate into your monitoring, so that when SQL slows down in the middle of the night, you can still locate the cause the next morning. Very comfortable.
What else can you do?
What we discussed earlier was a buffer pool added at the storage engine layer to cache memory pages, which speeds up queries.
By the same logic, the server layer can also add a cache that stores the result of a query directly, so that the identical query next time returns immediately. Sounds wonderful.
In theory, a cache hit really does speed up the query. But the feature is very limited: its biggest problem is that as soon as a table is updated, every cached result for that table is invalidated, so frequently updated tables cause constant cache invalidation. The feature only suits tables that are rarely updated.
On top of that, this feature, the query cache, was killed off in MySQL 8.0. So it is fine to bring up in conversation, but there is no need to actually use it in production.
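If you are still on MySQL 5.7 or earlier and want to check whether it is even enabled, something like this works (a sketch; these variables no longer exist in 8.0):
show variables like 'query_cache%';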

(Figure: the query cache was removed)
Summary
Slow queries are usually an index problem: the wrong index may have been chosen, or the query may simply match too many rows.
Too few connections on the client side or the database side limits how many SQL statements can run concurrently; increasing the connection counts speeds things up.
InnoDB has a memory layer, the buffer pool, that speeds up queries. Its hit rate is normally above 99%; if it drops below that, consider increasing the buffer pool size, which also helps.
The query cache (query cache) can genuinely speed up queries, but it is generally not recommended: its restrictions are severe, and MySQL removed it in 8.0.
Finally
Lately the read counts on my original articles have been dropping steadily, and I've been so worried I toss and turn at night.
So I have an immature request.
I've been away from Guangdong for a long time, and it's been a long time since anyone called me handsome.
Could you call me a handsome boy in the comments?
It's such a kind and simple wish. Can you grant it?