MySQL InnoDB and MyISAM

2022-06-24 16:06:00 【Silent storage】

InnoDB

InnoDB is a general-purpose storage engine that balances high reliability and high performance. Its architecture has two parts: in-memory structures and on-disk structures. InnoDB uses a log-first (write-ahead logging) strategy: it modifies data in memory first and records the transaction in the redo log, turning random writes into sequential I/O so that transactions commit efficiently.

Once the redo log record has been written to disk, the transaction can be acknowledged to the user as complete. At that point the data itself may have been modified only in memory and not yet flushed to the data files. If the machine crashes before the pages reach disk, the in-memory changes are lost and have to be rebuilt from the log.

InnoDB uses the redo log to guarantee data consistency. It takes periodic checkpoints, which ensure that every page change logged before the checkpoint has been flushed to disk, so crash recovery only needs to replay the log starting from the last checkpoint.
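The interplay of redo log, checkpoint, and recovery can be sketched as a toy model. This is only an illustration under simplifying assumptions, not InnoDB's implementation: the class ToyEngine and its methods are hypothetical names, a "page" is a single key/value pair, and the LSN is just the position of a record in the log.

```python
class ToyEngine:
    """Toy write-ahead log (hypothetical model, not InnoDB code)."""

    def __init__(self):
        self.disk_pages = {}     # durable page contents
        self.memory_pages = {}   # in-memory copies, lost on a crash
        self.redo_log = []       # durable, append-only list of (lsn, key, value)
        self.checkpoint_lsn = 0  # every change up to this LSN is on disk

    def commit_write(self, key, value):
        lsn = len(self.redo_log) + 1
        self.redo_log.append((lsn, key, value))  # sequential log write comes first
        self.memory_pages[key] = value           # the page is changed in memory only
        return lsn

    def checkpoint(self):
        # Flush the in-memory pages, then advance the checkpoint.
        self.disk_pages.update(self.memory_pages)
        self.checkpoint_lsn = len(self.redo_log)

    def crash(self):
        self.memory_pages = {}

    def recover(self):
        # Replay only the log records written after the last checkpoint.
        replayed = 0
        for lsn, key, value in self.redo_log:
            if lsn > self.checkpoint_lsn:
                self.disk_pages[key] = value
                replayed += 1
        self.memory_pages = dict(self.disk_pages)
        return replayed
```

In this model a write committed after the last checkpoint survives a crash because recover() replays it from the log, while writes covered by the checkpoint are skipped.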

Main advantages

Its DML operations follow the ACID model, with transactions featuring commit, rollback, and crash-recovery capabilities to protect user data.
Row-level locking and Oracle-style consistent reads increase multi-user concurrency and performance.
InnoDB tables arrange data on disk to optimize queries based on the primary key. Each InnoDB table has a primary key index, called the clustered index, that organizes the data to minimize I/O for primary key lookups.
To maintain data integrity, InnoDB supports FOREIGN KEY constraints. With foreign keys, inserts, updates, and deletes are checked to ensure they do not cause inconsistencies across related tables.

ACID model

ACID: Atomicity, Consistency, Isolation, Durability

1. Atomicity mainly involves InnoDB transactions. Related MySQL features:

The autocommit setting
The COMMIT statement
The ROLLBACK statement

2. Consistency mainly involves internal InnoDB processing that protects data from corruption and from system crashes. Related MySQL features:

The InnoDB doublewrite buffer
InnoDB crash recovery

3. Isolation also mainly involves InnoDB transactions, in particular the isolation level that applies to each transaction. Related MySQL features:

The autocommit setting
The SET TRANSACTION ISOLATION LEVEL statement
The InnoDB locking mechanism

4. Durability involves MySQL software features interacting with the particular hardware configuration. Because of the many possibilities depending on the capabilities of the CPU, network, and storage devices, this is the most complicated aspect to give concrete guidelines for. Related features:

The doublewrite buffer
The innodb_flush_log_at_trx_commit variable
The sync_binlog variable
The innodb_file_per_table variable
The write buffer in a storage device, such as a disk drive, SSD, or RAID array
A battery-backed cache in a storage device
The operating system used to run MySQL, in particular its support for the fsync() system call
An uninterruptible power supply (UPS) protecting the electrical power to all servers and storage devices that run MySQL and store MySQL data
The backup strategy, such as the frequency and types of backups and the backup retention periods
For distributed or hosted data applications, the characteristics of the data centers where the MySQL server hardware is located, and the network connections between data centers

Multi-versioning

InnoDB is a multi-version storage engine. It keeps information about old versions of changed rows to support transactional features such as concurrency and rollback.

InnoDB uses the information in the rollback segment to perform the undo operations needed in a transaction rollback. It also uses this information to build earlier versions of a row for a consistent read.

InnoDB adds three fields to each row stored in the database:

A 6-byte DB_TRX_ID field indicates the transaction identifier of the last transaction that inserted or updated the row. A delete is treated internally as an update in which a special bit in the row is set to mark it as deleted.
A 7-byte DB_ROLL_PTR field, called the roll pointer, points to an undo log record written to the rollback segment. If the row was updated, the undo log record contains the information needed to rebuild the content of the row before the update.
A 6-byte DB_ROW_ID field contains a row ID that increases monotonically as new rows are inserted. If InnoDB generates a clustered index automatically, the index contains row ID values. Otherwise, the DB_ROW_ID column does not appear in any index.

Undo logs in the rollback segment are divided into insert and update undo logs. Insert undo logs are needed only for transaction rollback and can be discarded as soon as the transaction commits. Update undo logs are also used in consistent reads, so they can be discarded only when no transaction remains for which InnoDB has assigned a snapshot that might need the information in the update undo log to build an earlier version of a row.
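As a rough illustration of how a consistent read can follow the roll pointer back to an older row version, here is a minimal sketch. It is a simplified model rather than InnoDB's actual code: RowVersion, ReadView, and consistent_read are hypothetical names, and the visibility rule is reduced to three checks.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RowVersion:
    value: str
    trx_id: int                               # plays the role of DB_TRX_ID
    roll_ptr: Optional["RowVersion"] = None   # plays the role of DB_ROLL_PTR


@dataclass
class ReadView:
    creator_trx_id: int
    next_trx_id: int                          # ids >= this began after the snapshot
    active_trx_ids: Set[int] = field(default_factory=set)  # uncommitted at snapshot time

    def can_see(self, trx_id: int) -> bool:
        if trx_id == self.creator_trx_id:
            return True                       # a transaction sees its own changes
        if trx_id >= self.next_trx_id:
            return False                      # written after the snapshot was taken
        return trx_id not in self.active_trx_ids


def consistent_read(latest: Optional[RowVersion], view: ReadView) -> Optional[str]:
    """Walk the version chain until a version visible to the view is found."""
    version = latest
    while version is not None:
        if view.can_see(version.trx_id):
            return version.value
        version = version.roll_ptr            # step back to the previous version
    return None                               # the row did not exist for this snapshot
```

For example, with versions written by transactions 10, 20, and 30, a reader whose snapshot was taken while transaction 20 was still active and before transaction 30 started ends up at the version written by transaction 10.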

Multi-versioning and secondary indexes

InnoDB multi-version concurrency control (MVCC) treats secondary indexes differently from clustered indexes. Records in a clustered index are updated in place, and their hidden system columns point to undo log entries from which earlier versions of records can be reconstructed. Unlike clustered index records, secondary index records do not contain hidden system columns and are not updated in place.

When a secondary index column is updated, old secondary index records are delete-marked, new records are inserted, and delete-marked records are eventually purged.
When a secondary index record is delete-marked or the secondary index page is updated by a newer transaction, InnoDB looks up the database record in the clustered index. In the clustered index, the record's DB_TRX_ID is checked, and the correct version of the record is retrieved from the undo log if the record was modified after the reading transaction started.

If a secondary index record is delete-marked or the secondary index page is updated by a newer transaction, the covering index technique is not used. Instead of returning values from the index structure, InnoDB looks up the record in the clustered index.

Official architecture

In-memory structures

1. Buffer pool

The buffer pool is an area in main memory where InnoDB caches table and index data as it is accessed. The buffer pool lets frequently used data be accessed directly from memory, which speeds up processing. On dedicated servers, up to 80% of physical memory is often assigned to the buffer pool.

For efficiency of high-volume read operations, the buffer pool is divided into pages that can potentially hold multiple rows. For efficiency of cache management, the buffer pool is implemented as a linked list of pages; data that is rarely used is aged out of the cache using a variation of the least recently used (LRU) algorithm.

1.1 Buffer pool LRU algorithm

The buffer pool is managed as a list using a variation of the LRU algorithm. When room is needed to add a new page to the buffer pool, the least recently used page is evicted and the new page is added to the middle of the list.

This midpoint insertion strategy treats the list as two sublists:

Head: a sublist of new ("young") pages that were accessed recently
Tail: a sublist of old pages that were accessed less recently

By default, the algorithm operates as follows:

3/8 of the buffer pool is devoted to the old sublist.
The midpoint of the list is the boundary where the tail of the new sublist meets the head of the old sublist.
When InnoDB reads a page into the buffer pool, it initially inserts it at the midpoint (the head of the old sublist). A page can be read because it is required for a user-initiated operation such as an SQL query, or as part of a read-ahead operation performed automatically by InnoDB.
Accessing a page in the old sublist makes it "young", moving it to the head of the new sublist. If the page was read because a user-initiated operation required it, the first access occurs immediately and the page is made young. If the page was read because of a read-ahead operation, the first access does not occur immediately and might not occur at all before the page is evicted.
As the database operates, pages in the buffer pool that are not accessed "age" by moving toward the tail of the list. Pages in both the new and old sublists age as other pages are made new. Pages in the old sublist also age as pages are inserted at the midpoint. Eventually, a page that remains unused reaches the tail of the old sublist and is evicted.
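The midpoint insertion strategy can be sketched with two Python lists. This is a toy model and not InnoDB's implementation: MidpointLRU is a hypothetical class, and as a modeling assumption a page is made young only by an access that comes after the one that loaded it.

```python
class MidpointLRU:
    """Toy LRU list with midpoint insertion (hypothetical model)."""

    def __init__(self, capacity, old_fraction=3 / 8):
        self.capacity = capacity
        self.old_fraction = old_fraction
        self.young = []  # index 0 is the head (most recently used)
        self.old = []    # index 0 is the midpoint, the last item is the tail

    def access(self, page):
        """Access a page and return the evicted page, if any."""
        evicted = None
        if page in self.young:
            self.young.remove(page)
            self.young.insert(0, page)        # refresh its position at the head
        elif page in self.old:
            self.old.remove(page)
            self.young.insert(0, page)        # an access makes an old page young
        else:
            if len(self.young) + len(self.old) >= self.capacity:
                # Evict from the tail of the old sublist.
                evicted = self.old.pop() if self.old else self.young.pop()
            self.old.insert(0, page)          # new pages enter at the midpoint
        self._rebalance()
        return evicted

    def _rebalance(self):
        # Keep roughly old_fraction of the cached pages in the old sublist
        # by demoting pages from the tail of the young sublist.
        target = int((len(self.young) + len(self.old)) * self.old_fraction)
        while len(self.old) < target and self.young:
            self.old.insert(0, self.young.pop())
```

With this model, a page that has been made young survives a long scan of pages that are each touched only once, because the scanned pages enter at the midpoint and are evicted from the old sublist.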

1.2 Buffer pool configuration

You can configure various aspects of the buffer pool to improve performance:

Ideally, set the size of the buffer pool to as large a value as practical, leaving enough memory for other processes on the server to run without excessive paging. The larger the buffer pool, the more InnoDB acts like an in-memory database, reading data from disk once and then accessing it from memory during subsequent reads.
On 64-bit systems with sufficient memory, you can split the buffer pool into multiple parts to minimize contention for memory structures among concurrent operations.
You can keep frequently accessed data in memory regardless of sudden spikes of activity from operations that would bring large amounts of infrequently accessed data into the buffer pool.
You can control how and when read-ahead requests are performed to prefetch pages into the buffer pool asynchronously, in anticipation that the pages will be needed soon.
You can control when background flushing occurs and whether the flushing rate is dynamically adjusted based on workload.
You can configure how InnoDB preserves the current buffer pool state to avoid a lengthy warmup period after a server restart.

2. Change buffer

The change buffer is a special data structure that caches changes to secondary index pages when those pages are not in the buffer pool. The buffered changes, which may result from INSERT, UPDATE, or DELETE operations (DML), are merged later when the pages are loaded into the buffer pool by other read operations.

Unlike clustered indexes, secondary indexes are usually nonunique, and inserts into secondary indexes happen in a relatively random order. Similarly, deletes and updates may affect secondary index pages that are not adjacent in the index tree. Merging cached changes later, when affected pages are read into the buffer pool by other operations, avoids the substantial random-access I/O that would be needed to read secondary index pages into the buffer pool from disk.

Periodically, the purge operation that runs when the system is mostly idle, or during a slow shutdown, writes the updated index pages to disk. The purge operation can write disk blocks for a series of index values more efficiently than if each value were written to disk immediately.

In memory, the change buffer occupies part of the buffer pool. On disk, the change buffer is part of the system tablespace, where index changes are buffered when the database server is shut down.
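The deferred merge can be sketched as follows. This is a toy model under stated assumptions, not how InnoDB stores the change buffer: ToyChangeBuffer is a hypothetical class, a secondary index page is a plain set of entries, and only "insert" and "delete" operations are modeled.

```python
class ToyChangeBuffer:
    """Toy model of buffering secondary index changes (hypothetical)."""

    def __init__(self, disk_pages):
        self.disk_pages = disk_pages  # page number -> set of index entries on disk
        self.buffer_pool = {}         # pages currently cached in memory
        self.pending = {}             # page number -> buffered (op, entry) changes
        self.disk_reads = 0

    def change(self, page_no, op, entry):
        if page_no in self.buffer_pool:
            self._apply(self.buffer_pool[page_no], op, entry)
        else:
            # The page is not cached: remember the change instead of reading it.
            self.pending.setdefault(page_no, []).append((op, entry))

    def read_page(self, page_no):
        if page_no not in self.buffer_pool:
            self.disk_reads += 1
            page = set(self.disk_pages[page_no])
            # Merge the buffered changes now that the page is in memory anyway.
            for op, entry in self.pending.pop(page_no, []):
                self._apply(page, op, entry)
            self.buffer_pool[page_no] = page
        return self.buffer_pool[page_no]

    @staticmethod
    def _apply(page, op, entry):
        if op == "insert":
            page.add(entry)
        elif op == "delete":
            page.discard(entry)
```

In this sketch two changes to an uncached page cost no disk read at all; the single read happens later, when some other operation needs the page, and the merge is applied at that point.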

The type of data cached in the change buffer is governed by the innodb_change_buffering variable.

Permitted innodb_change_buffering values:

all: the default value; buffer inserts, delete-marking operations, and purges
none: do not buffer any operations
inserts: buffer insert operations
deletes: buffer delete-marking operations
changes: buffer both inserts and delete-marking operations
purges: buffer the physical deletion operations that happen in the background

3. Adaptive hash index

The adaptive hash index enables InnoDB to perform more like an in-memory database on systems with an appropriate combination of workload and sufficient buffer pool memory, without sacrificing transactional features or reliability. The adaptive hash index is enabled by the innodb_adaptive_hash_index variable, or turned off at server startup with --skip-innodb-adaptive-hash-index.

4. Log buffer

The log buffer is the memory area that holds data to be written to the log files on disk. Its size is defined by the innodb_log_buffer_size variable; the default size is 16MB.

The contents of the log buffer are periodically flushed to disk. A large log buffer enables large transactions to run without needing to write redo log data to disk before the transactions commit. If you have transactions that update, insert, or delete many rows, increasing the size of the log buffer saves disk I/O.

On-disk structures

1. Creating InnoDB tables

CREATE TABLE t1 (a INT, b VARCHAR (20), PRIMARY KEY (a)) ENGINE=InnoDB;

The ENGINE=InnoDB clause is not required when InnoDB is defined as the default storage engine, which it is by default.

To determine the default storage engine on a MySQL server instance, issue the following statement:

mysql> SELECT @@default_storage_engine;
+--------------------------+
| @@default_storage_engine |
+--------------------------+
| InnoDB                   |
+--------------------------+

By default, an InnoDB table is created in a file-per-table tablespace.

2. .frm files

MySQL stores data dictionary information for tables in .frm files in database directories. Unlike other MySQL storage engines, InnoDB also encodes information about the table in its own internal data dictionary inside the system tablespace. When MySQL drops a table or a database, it deletes one or more .frm files as well as the corresponding entries inside the InnoDB data dictionary. This applies to MySQL 5.7 and earlier; MySQL 8.0 replaced .frm files with a transactional data dictionary.

You cannot move an InnoDB table between databases simply by moving the .frm file.

3. Row formats

The row format of an InnoDB table determines how its rows are physically stored on disk.

InnoDB supports four row formats, each with different storage characteristics: REDUNDANT, COMPACT, DYNAMIC (the default), and COMPRESSED.

4. Primary keys

Choose a primary key with the following characteristics:

Columns that are referenced by the most important queries
Columns that are never left blank
Columns that never have duplicate values
Columns that rarely if ever change value once inserted

5. Viewing InnoDB table properties

To view the properties of an InnoDB table, issue a SHOW TABLE STATUS statement:

mysql> SHOW TABLE STATUS FROM test LIKE 't%' \G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 0
 Avg_row_length: 0
    Data_length: 16384
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2021-02-18 12:18:28
    Update_time: NULL
     Check_time: NULL
      Collation: utf8mb4_0900_ai_ci
       Checksum: NULL
 Create_options: 
        Comment:

You can also access InnoDB table properties by querying the InnoDB Information Schema system tables (INNODB_SYS_TABLES in MySQL 5.7, renamed INNODB_TABLES in MySQL 8.0):

mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE NAME='test/t1' \G
*************************** 1. row ***************************
     TABLE_ID: 45
         NAME: test/t1
         FLAG: 1
       N_COLS: 5
        SPACE: 35
  FILE_FORMAT: Barracuda
   ROW_FORMAT: Dynamic
ZIP_PAGE_SIZE: 0
   SPACE_TYPE: Single

Indexes

Every InnoDB table has a special index called the clustered index, which stores the row data.

When you define a PRIMARY KEY on a table, InnoDB uses it as the clustered index.
If you do not define a PRIMARY KEY for a table, InnoDB uses the first UNIQUE index whose key columns are all defined as NOT NULL as the clustered index.
If a table has no PRIMARY KEY or suitable UNIQUE index, InnoDB generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column that contains row ID values.

Accessing a row through the clustered index is fast because the index search leads directly to the page that contains the row data. If a table is large, the clustered index architecture often saves a disk I/O operation compared with storage organizations that store row data on a different page from the index record.

1. How secondary indexes relate to the clustered index

Indexes other than the clustered index are known as secondary indexes. Each record in a secondary index contains the primary key columns for the row as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.
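The two-step lookup can be illustrated with plain dictionaries standing in for the B-trees. The table contents and the function names find_by_email and find_id_by_email are hypothetical and only serve the illustration.

```python
# Clustered index: primary key -> full row (hypothetical sample data).
clustered_index = {
    1: {"id": 1, "email": "ann@example.com", "name": "Ann"},
    2: {"id": 2, "email": "bob@example.com", "name": "Bob"},
}

# Secondary index on email: indexed column value -> primary key value.
email_index = {row["email"]: pk for pk, row in clustered_index.items()}


def find_by_email(email):
    """Fetch a full row through the secondary index."""
    pk = email_index.get(email)   # step 1: the secondary index yields the primary key
    if pk is None:
        return None
    return clustered_index[pk]    # step 2: the clustered index yields the row


def find_id_by_email(email):
    """A query that needs only indexed columns can stop after step 1."""
    return email_index.get(email)
```

The second function corresponds to a query that is answered from the secondary index alone, while the first one needs the extra lookup in the clustered index to reach the remaining columns.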

2. The physical structure of an index

With the exception of spatial indexes, InnoDB indexes are B-tree data structures.

Spatial indexes use R-trees, which are specialized data structures for indexing multi-dimensional data. Index records are stored in the leaf pages of their B-tree or R-tree data structure. The default size of an index page is 16KB.

When new records are inserted into an InnoDB clustered index, InnoDB tries to leave 1/16 of the page free for future insertions and updates of index records. If index records are inserted in sequential order (ascending or descending), the resulting index pages are about 15/16 full. If records are inserted in random order, the pages are from 1/2 to 15/16 full.
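A quick back-of-the-envelope calculation shows what these fill factors mean for the number of leaf pages. The helper names are hypothetical, and the estimate ignores page headers, trailers, and per-record overhead, so it is only a rough lower bound.

```python
PAGE_SIZE = 16 * 1024  # default index page size in bytes


def filled_bytes(numerator, denominator, page_size=PAGE_SIZE):
    """Bytes of a page occupied at a fill factor of numerator/denominator."""
    return page_size * numerator // denominator


def pages_needed(total_record_bytes, numerator, denominator):
    """Rough number of leaf pages needed at the given fill factor."""
    per_page = filled_bytes(numerator, denominator)
    return -(-total_record_bytes // per_page)  # ceiling division


sequential = pages_needed(1_000_000, 15, 16)   # pages about 15/16 full
worst_random = pages_needed(1_000_000, 1, 2)   # pages only 1/2 full
```

Under these assumptions, a page that is 15/16 full holds 15360 bytes of records, and the same data spread over half-full pages needs nearly twice as many pages.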

InnoDB locking

InnoDB implements standard row-level locking with two types of locks: shared (S) locks and exclusive (X) locks.

A shared (S) lock permits the transaction that holds the lock to read a row.
An exclusive (X) lock permits the transaction that holds the lock to update or delete a row.

InnoDB supports multiple granularity locking, which permits row locks and table locks to coexist.

To make locking at multiple granularity levels practical, InnoDB uses intention locks. An intention lock is a table-level lock that indicates which type of lock (shared or exclusive) a transaction will require later for rows in the table.
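The way table-level lock types interact is usually summarized as a compatibility matrix over X, IX, S, and IS (intention exclusive and intention shared). The sketch below encodes that matrix; the function names compatible and can_grant are hypothetical helpers, not part of any MySQL API.

```python
# Unordered pairs of table-level lock types that can be held at the same time.
# Every other combination conflicts (X conflicts with everything).
_COMPATIBLE = {
    frozenset(pair)
    for pair in [("IS", "IS"), ("IS", "IX"), ("IS", "S"), ("IX", "IX"), ("S", "S")]
}


def compatible(held, requested):
    """True if a requested lock type can coexist with a held lock type."""
    return frozenset((held, requested)) in _COMPATIBLE


def can_grant(requested, held_locks):
    """A lock is granted only if it is compatible with every lock already held."""
    return all(compatible(held, requested) for held in held_locks)
```

For instance, two transactions can both hold IX on the same table because they may still lock different rows, whereas a table-level S lock has to wait while another transaction holds IX.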

Transaction model

InnoDB offers all four transaction isolation levels described by the SQL standard: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE.

The default isolation level for InnoDB is REPEATABLE READ.

Isolation level	Read consistency	Dirty read	Non-repeatable read	Phantom read
READ UNCOMMITTED	Lowest level; only guarantees that physically corrupted data is not read	yes	yes	yes
READ COMMITTED	Statement level	no	yes	yes
REPEATABLE READ	Transaction level	no	no	yes
SERIALIZABLE	Highest level; transaction level	no	no	no

READ UNCOMMITTED

Dirty reads are allowed: a transaction may read data that other sessions have modified but not yet committed.

READ COMMITTED

Only committed data can be read.

REPEATABLE READ

Repeatable reads. Queries within the same transaction read from the snapshot established by the first read in that transaction. This is the InnoDB default level. In the SQL standard, this isolation level eliminates non-repeatable reads, but phantom reads are still possible.

SERIALIZABLE

Serialized transactions. Conflicting transactions are executed one after another: a later transaction must wait for the earlier one to finish before it can proceed.
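The difference between statement-level and transaction-level read consistency comes down to when the snapshot is taken. The sketch below models only that timing and nothing else; ToyStore and ToyTransaction are hypothetical classes, and locking, writes inside the reading transaction, and phantom rows are all left out.

```python
class ToyStore:
    """Holds only committed values (hypothetical model)."""

    def __init__(self):
        self.committed = {}

    def commit(self, key, value):
        self.committed[key] = value


class ToyTransaction:
    def __init__(self, store, isolation):
        if isolation not in ("READ COMMITTED", "REPEATABLE READ"):
            raise ValueError("this sketch models only two isolation levels")
        self.store = store
        self.isolation = isolation
        self.snapshot = None

    def read(self, key):
        if self.isolation == "READ COMMITTED":
            # Statement level: every read sees the latest committed data.
            return self.store.committed.get(key)
        if self.snapshot is None:
            # Transaction level: the first read fixes the snapshot.
            self.snapshot = dict(self.store.committed)
        return self.snapshot.get(key)
```

If another session commits a new value between two reads, the READ COMMITTED transaction sees the new value on its second read (a non-repeatable read), while the REPEATABLE READ transaction keeps returning the value from its snapshot.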

Deadlock

A deadlock is a situation in which different transactions are unable to proceed because each holds a lock that the other needs. Because both transactions are waiting for a resource to become available, neither ever releases the locks it holds.

A deadlock can occur when transactions lock rows in multiple tables (through statements such as UPDATE or SELECT ... FOR UPDATE) but in the opposite order. A deadlock can also occur when such statements lock ranges of index records and gaps, with each transaction acquiring some locks but not others because of a timing issue.

The possibility of deadlocks is not affected by the isolation level, because the isolation level changes the behavior of read operations, while deadlocks occur because of write operations.

When deadlock detection is enabled (the default) and a deadlock does occur, InnoDB detects the condition and rolls back one of the transactions (the victim).
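Deadlock detection is commonly described as finding a cycle in a wait-for graph. The sketch below shows that idea in its simplest form; it is not InnoDB's detector. The function names are hypothetical, and picking the transaction that modified the fewest rows as the victim is a simplification of the documented preference for rolling back small transactions.

```python
def find_cycle(waits_for):
    """Return a list of transactions forming a wait cycle, or None.

    waits_for maps a transaction to the set of transactions whose
    locks it is waiting for.
    """
    done = set()

    def visit(trx, path):
        if trx in path:
            return path[path.index(trx):]      # walked back into our own path
        if trx in done:
            return None
        for blocker in sorted(waits_for.get(trx, ())):
            cycle = visit(blocker, path + [trx])
            if cycle:
                return cycle
        done.add(trx)                           # no cycle is reachable from trx
        return None

    for trx in sorted(waits_for):
        cycle = visit(trx, [])
        if cycle:
            return cycle
    return None


def choose_victim(cycle, rows_modified):
    """Pick the transaction with the fewest modified rows (simplified rule)."""
    return min(cycle, key=lambda trx: (rows_modified.get(trx, 0), trx))
```

Two transactions that lock the same two tables in opposite order produce the cycle T1 -> T2 -> T1; rolling back one of them lets the other continue.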

MyISAM

MyISAM tables use B-tree indexes. Each MyISAM table is stored on disk in three files. The file names begin with the table name and have an extension that indicates the file type.

A .frm file stores the table format.
The data file has an .MYD (MYData) extension.
The index file has an .MYI (MYIndex) extension.

Creating a MyISAM table:

CREATE TABLE t (i INT) ENGINE = MYISAM;

Characteristics of MyISAM tables:

Feature	Supported
B-tree indexes	yes
Backup/point-in-time recovery	yes
Cluster database support	no
Clustered indexes	no
Compressed data	yes
Data caches	no
Encrypted data	yes
Foreign key support	no
Full-text search indexes	yes
Geospatial data type support	yes
Geospatial indexing support	yes
Hash indexes	no
Index caches	yes
MVCC	no
Replication support	yes
Storage limits	256TB
T-tree indexes	no
Update statistics for data dictionary	yes

Features supported by MyISAM:

Support for a true VARCHAR type; a VARCHAR column starts with a length stored in one or two bytes.
Tables with VARCHAR columns may have fixed or dynamic row length.
The sum of the lengths of the VARCHAR and CHAR columns in a table may be up to 64KB.
Arbitrary-length UNIQUE constraints.

Table storage formats

1. Static tables

Static format is the default for MyISAM tables. It is used when the table contains no variable-length columns (VARCHAR, VARBINARY, BLOB, or TEXT). Each row is stored using a fixed number of bytes.

Of the three MyISAM storage formats, static format is the simplest and most secure (least subject to corruption).

CHAR and VARCHAR columns are space-padded to the specified column width, although the column type is not altered. BINARY and VARBINARY columns are padded with 0x00 bytes to the column width.
NULL columns require additional space in the row to record whether their values are NULL. Each NULL column takes one bit extra, rounded up to the nearest byte.
Very quick.
Easy to cache.
Easy to reconstruct after a crash, because rows are located in fixed positions.
Usually requires more disk space than dynamic-format tables.
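Fixed-length rows are what make lookups and crash repair simple: the position of a row follows from its number. The arithmetic below is a simplified sketch; the helper names are hypothetical, and any per-row header bytes that the real file format adds are ignored.

```python
def null_bitmap_bytes(nullable_columns):
    """One bit per nullable column, rounded up to whole bytes."""
    return (nullable_columns + 7) // 8


def fixed_row_length(column_widths, nullable_columns):
    """Simplified row length: padded column widths plus the NULL bitmap."""
    return sum(column_widths) + null_bitmap_bytes(nullable_columns)


def row_offset(row_number, row_length):
    """Byte offset of a row in the data file, counting rows from zero."""
    return row_number * row_length
```

For example, three columns of 4, 10, and 8 bytes with two nullable columns give a 23-byte row in this model, so row number 3 starts at byte 69.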

2. Dynamic tables

Dynamic storage format is used if a MyISAM table contains any variable-length columns (VARCHAR, VARBINARY, BLOB, or TEXT), or if the table was created with the ROW_FORMAT=DYNAMIC table option.

Dynamic format is a little more complex than static format because each row has a header that indicates how long it is. A row can become fragmented (stored in noncontiguous pieces) when it is made longer as a result of an update.

All string columns are dynamic except those with a length less than four.
Each row is preceded by a bitmap that indicates which columns contain the empty string (for string columns) or zero (for numeric columns).
NULL columns require additional space in the row to record whether their values are NULL. Each NULL column takes one bit extra, rounded up to the nearest byte.
Much less disk space is usually required than for fixed-length tables.
More difficult than static-format tables to reconstruct after a crash, because rows may be fragmented into many pieces and links (fragments) may be missing.

3. Compressed tables

Compressed storage format is a read-only format that is generated with the myisampack tool. Compressed tables can be uncompressed with myisamchk.

Compressed tables take very little disk space.
Each row is compressed separately, so there is very little access overhead.
Can be used for fixed-length or dynamic-length rows.

MyISAM table problems

Even though the MyISAM table format is very reliable (all changes to a table made by an SQL statement are written before the statement returns), a table can still become corrupted if any of the following events occur:

The mysqld process is killed in the middle of a write.
An unexpected computer shutdown occurs.
Hardware failures.
An external program (such as myisamchk) is used to modify a table that is being modified by the server at the same time.
A software bug in the MySQL or MyISAM code.

Typical symptoms of a corrupt table are:

You get the following error while selecting data from the table: Incorrect key file for table: '...'. Try to repair it
Queries do not find rows in the table or return incomplete results.

Differences

	MyISAM	InnoDB
Storage	Each MyISAM table is stored on disk in three files whose names begin with the table name and whose extensions indicate the file type: .frm (table definition), .MYD (data file), and .MYI (index file)	Disk-based resources are the InnoDB tablespace data files and log files. The size of an InnoDB table is bounded by the tablespace and operating system file size limits; the often-quoted 2GB figure applies only to older file systems
Transactions	MyISAM manages nontransactional tables. It provides high-speed storage and retrieval as well as full-text search. If the application mostly executes SELECT queries, MyISAM can be the better choice	Supports four transaction isolation levels, rollback, crash recovery, and multi-version concurrency control, with ACID-compliant transactions. If the application executes many INSERT or UPDATE operations, use InnoDB, which improves the performance of concurrent multi-user operations
SELECT, UPDATE, INSERT, DELETE	For workloads dominated by SELECT, MyISAM can be the better choice	Good support for INSERT and UPDATE; on DELETE, InnoDB does not re-create the table but deletes rows one by one
Row count of a table	MyISAM simply reads the stored row count. When the COUNT(*) statement includes a WHERE condition, both engines behave the same way	InnoDB does not store the exact number of rows in a table, so COUNT(*) has to scan the table to count the rows
Locks	Table-level locks	Row-level locks. InnoDB row locks are not absolute: if MySQL cannot determine the range to scan when executing a statement (for example because no index can be used), InnoDB locks every row it scans, which in effect locks the whole table
Indexes	MyISAM (heap-organized table) uses nonclustered indexes; index and data files are separate, rows are stored unordered, and only indexes are cached	InnoDB (index-organized table) uses a clustered index; the index is the data, rows are stored in primary key order, and both indexes and data are cached
Concurrency	Reads and writes block each other: a write blocks reads and a read blocks writes, but reads do not block other reads	Read/write blocking depends on the transaction isolation level

Choosing an engine

Differences between the two storage engines:

InnoDB supports transactions; MyISAM does not. This matters a great deal: transactions are a higher-level way of processing in which a series of inserts, deletes, and updates can be rolled back and restored if any of them fails, which MyISAM cannot do.
MyISAM suits applications dominated by queries and inserts; InnoDB suits applications with frequent modifications and higher safety requirements.
InnoDB supports foreign keys; MyISAM does not.
MyISAM was the default engine before MySQL 5.5; since MySQL 5.5 InnoDB is the default, and MyISAM has to be specified explicitly.
InnoDB did not support FULLTEXT indexes before MySQL 5.6; later versions support them.
InnoDB does not store the number of rows in a table, so for SELECT COUNT(*) FROM table InnoDB has to scan the table to count the rows, whereas MyISAM simply reads the stored row count. Note that when the COUNT(*) statement includes a WHERE condition, MyISAM also has to scan the table.
For an AUTO_INCREMENT column, InnoDB requires an index in which that column is the first column, whereas in a MyISAM table it can be a secondary column of a multiple-column index.
When a whole table is emptied with DELETE, InnoDB deletes rows one by one, which is very slow; MyISAM re-creates the table.
InnoDB supports row locks, although in some cases the whole table is locked, for example UPDATE table SET a=1 WHERE user LIKE '%lee%'.

MyISAM	InnoDB
Transaction support not needed (not supported)	Transaction support needed (good transactional features)
Relatively low concurrency (because of the locking mechanism)	Row-level locking adapts well to high concurrency, provided queries go through an index
Relatively few data changes (because of blocking); mostly reads	Frequent data updates
Data consistency requirements are not very high	High data consistency requirements
--	With plenty of memory, InnoDB's better caching can raise memory utilization and minimize disk I/O