MySQL InnoDB and MyISAM

2022-06-24 16:06:00 Silent storage

InnoDB

InnoDB is a general-purpose storage engine that balances high reliability and high performance. Its architecture has two parts: in-memory structures and on-disk structures. InnoDB uses a log-first (write-ahead logging) strategy: it modifies data in memory first and records the transaction in the redo log, which turns random writes into sequential I/O so that transactions commit efficiently.

Once the log record has been written to disk, the transaction can be reported to the user as complete. At that point the data itself may have been modified only in memory and not yet written to the data files. If the machine crashes before the data reaches disk, the in-memory changes are lost.

InnoDB uses the redo log to make those changes recoverable. It also takes checkpoints periodically: every change logged before the checkpoint is guaranteed to have been written to disk, so the next recovery only needs to replay the log starting from the checkpoint.
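The log-first commit and checkpoint-based recovery described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: the class `ToyEngine` and all of its attribute names are hypothetical, "disk" is a plain dictionary, and nothing here reflects InnoDB's real redo log format.

```python
class ToyEngine:
    """Toy write-ahead log with checkpoints (hypothetical, not InnoDB's format)."""

    def __init__(self):
        self.disk_pages = {}     # durable data files
        self.redo_log = []       # durable, append-only log (sequential I/O)
        self.checkpoint_lsn = 0  # log entries before this index are already in disk_pages
        self.buffer = {}         # volatile in-memory pages

    def commit(self, key, value):
        self.redo_log.append((key, value))  # the log record is made durable ...
        self.buffer[key] = value            # ... while the data changes only in memory

    def checkpoint(self):
        self.disk_pages.update(self.buffer)       # flush modified pages to the data files
        self.checkpoint_lsn = len(self.redo_log)  # recovery can start from here

    def crash(self):
        self.buffer = {}  # memory is lost; disk_pages and redo_log survive

    def recover(self):
        self.buffer = dict(self.disk_pages)
        replayed = self.redo_log[self.checkpoint_lsn:]
        for key, value in replayed:               # replay only the log after the checkpoint
            self.buffer[key] = value
        return len(replayed)


engine = ToyEngine()
engine.commit("a", 1)
engine.checkpoint()
engine.commit("b", 2)
engine.commit("a", 3)
engine.crash()
replayed_entries = engine.recover()
```

In this sketch the two commits made after the checkpoint are the only entries replayed, and the recovered state contains both of them even though they never reached the data files.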

Main advantages

  • Its DML operations follow the ACID model, with transactions that support commit, rollback, and crash recovery to protect user data.
  • Row-level locking and Oracle-style consistent reads increase multi-user concurrency and performance.
  • InnoDB tables arrange data on disk to optimize queries based on the primary key. Every InnoDB table has a primary key index, called the clustered index, that organizes the data to minimize I/O for primary key lookups.
  • To maintain data integrity, InnoDB supports FOREIGN KEY constraints. With foreign keys, inserts, updates, and deletes are checked to ensure they do not cause inconsistencies between related tables.

ACID Model

ACID stands for atomicity, consistency, isolation, and durability.

1、 Atomicity mainly involves InnoDB transactions. Related MySQL features include:

  • The autocommit setting
  • The COMMIT statement
  • The ROLLBACK statement

2、 Consistency mainly involves the internal InnoDB processing that protects data from corruption and from system crashes. Related MySQL features include:

  • The InnoDB doublewrite buffer
  • InnoDB crash recovery

3、 Isolation is also achieved mainly through InnoDB transactions, in particular the isolation level that applies to each transaction. Related MySQL features include:

  • The autocommit setting
  • The SET TRANSACTION ISOLATION LEVEL statement
  • InnoDB locking

4、 Durability involves MySQL software features interacting with the hardware configuration. Because so much depends on the capabilities of the CPU, network, and storage devices, this is the most complicated aspect to give concrete guidelines for. Related factors include:

  • The InnoDB doublewrite buffer
  • The innodb_flush_log_at_trx_commit variable
  • The sync_binlog variable
  • The innodb_file_per_table variable
  • The write buffer in a storage device, such as a disk drive, SSD, or RAID array
  • A battery-backed cache in a storage device
  • The operating system used to run MySQL, in particular its support for the fsync() system call
  • An uninterruptible power supply (UPS) protecting the power to all servers and storage devices that run MySQL and store MySQL data
  • The backup strategy, such as the frequency and types of backups and the backup retention period
  • For distributed or hosted data applications, the characteristics of the data centers where the MySQL server hardware is located, and the network connections between data centers

Multi-versioning

InnoDB is a multi-version storage engine. It keeps information about old versions of changed rows to support transactional features such as concurrency and rollback.

InnoDB uses the information in the rollback segment to perform the undo operations needed for a transaction rollback. It also uses this information to build earlier versions of a row for a consistent read.

InnoDB adds three fields to each row stored in the database:

  • A 6-byte DB_TRX_ID field holds the identifier of the last transaction that inserted or updated the row. A delete is treated internally as an update in which a special bit in the row is set to mark it as deleted.
  • A 7-byte DB_ROLL_PTR field, called the roll pointer, points to an undo log record written to the rollback segment. If the row was updated, the undo log record contains the information needed to rebuild the content of the row before the update.
  • A 6-byte DB_ROW_ID field contains a row ID that increases monotonically as new rows are inserted. If InnoDB generates a clustered index automatically, the index contains row ID values. Otherwise, the DB_ROW_ID column does not appear in any index.

Undo logs in the rollback segment are divided into insert and update undo logs. Insert undo logs are needed only for transaction rollback and can be discarded as soon as the transaction commits. Update undo logs are also used in consistent reads, so they can be discarded only when no transaction remains for which InnoDB has assigned a snapshot that might need the information in the update undo log to build an earlier version of a row.
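How a consistent read walks from the current row back through older versions can be sketched as follows. This is a deliberately simplified model: `RowVersion`, `update`, and `consistent_read` are hypothetical names, and visibility is reduced to a single transaction ID comparison, whereas a real InnoDB read view also tracks which transactions were still active when the snapshot was taken.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    value: str
    trx_id: int                           # plays the role of DB_TRX_ID
    older: Optional["RowVersion"] = None  # plays the role of DB_ROLL_PTR


def update(row, new_value, trx_id):
    """Create the new current version and keep a pointer to the previous one."""
    return RowVersion(new_value, trx_id, older=row)


def consistent_read(row, snapshot_trx_id):
    """Return the newest value written by a transaction no newer than the snapshot.

    Simplifying assumption: a version is visible if trx_id <= snapshot_trx_id.
    """
    version = row
    while version is not None and version.trx_id > snapshot_trx_id:
        version = version.older  # follow the roll pointer into the undo chain
    return None if version is None else version.value


row = RowVersion("v1", trx_id=10)
row = update(row, "v2", trx_id=20)
row = update(row, "v3", trx_id=30)
seen_by_old_reader = consistent_read(row, snapshot_trx_id=25)
```

A reader whose snapshot predates transaction 30 skips the newest version and rebuilds the row as transaction 20 left it; a reader older than every version sees no row at all.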

Multi-versioning and secondary indexes

InnoDB multi-version concurrency control (MVCC) treats secondary indexes differently from clustered indexes. Records in a clustered index are updated in place, and their hidden system columns point to undo log entries from which earlier versions of records can be rebuilt. Unlike clustered index records, secondary index records do not contain hidden system columns and are not updated in place.

  • When a secondary index column is updated, the old secondary index records are delete-marked, new records are inserted, and the delete-marked records are eventually purged.
  • When a secondary index record is delete-marked or the secondary index page is updated by a newer transaction, InnoDB looks up the database record in the clustered index. There the record's DB_TRX_ID is checked, and if the record was modified after the reading transaction started, the correct version of the record is retrieved from the undo log.

If a secondary index record is delete-marked or the secondary index page is updated by a newer transaction, the covering index technique is not used. Instead of returning values from the index structure, InnoDB looks up the record in the clustered index.

Official architecture

Memory structure

1、 Buffer pool

The buffer pool is an area of main memory where InnoDB caches table and index data as it is accessed. The buffer pool lets frequently used data be accessed directly from memory, which speeds up processing. On dedicated servers, up to 80% of physical memory is often assigned to the buffer pool.

To make high-volume read operations efficient, the buffer pool is divided into pages that can each hold multiple rows. To make cache management efficient, the buffer pool is implemented as a linked list of pages; rarely used data is aged out of the cache using a variation of the least recently used (LRU) algorithm.

1.1、 Buffer pool LRU algorithm

The buffer pool is managed as a list using a variation of the LRU algorithm. When room is needed to add a new page to the buffer pool, the least recently used page is evicted and the new page is added to the middle of the list.

This midpoint insertion strategy treats the list as two sublists:

  • At the head, a sublist of new ("young") pages that were accessed recently
  • At the tail, a sublist of old pages that were accessed less recently

By default, the algorithm operates as follows:

  • 3/8 of the buffer pool is devoted to the old sublist.
  • The midpoint of the list is the boundary where the tail of the new sublist meets the head of the old sublist.
  • When InnoDB reads a page into the buffer pool, it initially inserts it at the midpoint (the head of the old sublist). A page can be read because a user-initiated operation such as an SQL query requires it, or as part of a read-ahead operation performed automatically by InnoDB.
  • Accessing a page in the old sublist makes it "young" and moves it to the head of the new sublist. If the page was read because a user-initiated operation required it, the first access occurs immediately and the page is made young. If the page was read by a read-ahead operation, the first access does not occur immediately and might not occur at all before the page is evicted.
  • As the database operates, pages in the buffer pool that are not accessed "age" by moving toward the tail of the list. Pages in both the new and old sublists age as other pages are made new. Pages in the old sublist also age as pages are inserted at the midpoint. Eventually, a page that remains unused reaches the tail of the old sublist and is evicted.
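The midpoint insertion strategy above can be modeled in a few lines. This is a toy sketch, not InnoDB's implementation: the class `MidpointLRU` and its methods are hypothetical, the list is a plain Python list, and the old sublist is simply taken to be the last 3/8 of the list, rounded down.

```python
class MidpointLRU:
    """Toy midpoint-insertion LRU list; index 0 is the head of the new sublist."""

    def __init__(self, capacity, old_fraction=3 / 8):
        self.capacity = capacity
        self.old_fraction = old_fraction
        self.pages = []  # last element is the tail of the old sublist

    def midpoint(self):
        """Index of the head of the old sublist."""
        return len(self.pages) - int(len(self.pages) * self.old_fraction)

    def read_in(self, page):
        """Load a page from disk: evict from the tail if full, insert at the midpoint."""
        evicted = None
        if len(self.pages) >= self.capacity:
            evicted = self.pages.pop()
        self.pages.insert(self.midpoint(), page)
        return evicted

    def access(self, page):
        """Touch a cached page: it becomes young and moves to the head."""
        self.pages.remove(page)  # raises ValueError if the page is not cached
        self.pages.insert(0, page)


cache = MidpointLRU(capacity=8)
for n in range(8):
    cache.read_in(f"p{n}")          # fill the pool
for hot in ("p0", "p1", "p2"):
    cache.access(hot)               # these pages are actually used
for n in range(20):
    cache.read_in(f"s{n}")          # a large scan touches each page only once
```

After the scan, the three pages that were accessed are still at the head of the list, because pages read once enter at the midpoint and are evicted from the old sublist without ever displacing the young pages. A plain LRU list of the same size would have evicted them.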

1.2、 Buffer pool configuration

You can improve performance by configuring various aspects of the buffer pool:

  • Set the buffer pool size as large as is practical while leaving enough memory for other processes on the server to run without excessive paging. The larger the buffer pool, the more InnoDB behaves like an in-memory database, reading data from disk once and then accessing it from memory during subsequent reads.
  • On 64-bit systems with sufficient memory, you can split the buffer pool into multiple parts to minimize contention for memory structures among concurrent operations.
  • You can keep frequently accessed data in memory regardless of sudden spikes of activity from operations that would bring large amounts of infrequently accessed data into the buffer pool.
  • You can control how and when read-ahead requests are performed to prefetch pages into the buffer pool asynchronously, in anticipation that the pages will be needed soon.
  • You can control when background flushing occurs and whether the flushing rate is adjusted dynamically based on workload.
  • You can configure how InnoDB preserves the current buffer pool state to avoid a lengthy warm-up period after a server restart.

2、 Change buffer

The change buffer is a special data structure that caches changes to secondary index pages when those pages are not in the buffer pool. The buffered changes, which may result from INSERT, UPDATE, or DELETE operations (DML), are merged later when other read operations load the pages into the buffer pool.

Unlike clustered indexes, secondary indexes are usually non-unique, and inserts into secondary indexes happen in a relatively random order. Similarly, deletes and updates may affect secondary index pages that are not adjacent in the index tree. Merging the cached changes later, when other operations read the affected pages into the buffer pool, avoids the large amount of random-access I/O that would otherwise be needed to read secondary index pages from disk into the buffer pool.

Periodically, the purge operation that runs when the system is mostly idle, or during a slow shutdown, writes the updated index pages to disk. The purge operation can write disk blocks for a series of index values more efficiently than if each value were written to disk immediately.

In memory, the change buffer occupies part of the buffer pool. On disk, the change buffer is part of the system tablespace, where index changes are buffered while the database server is shut down.
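The idea of deferring secondary index changes until the page is read anyway can be sketched with a toy model. Everything here is an assumption for illustration: `ToyChangeBuffer`, its fields, and the representation of an index page as a set of entries are hypothetical and far simpler than the real structure.

```python
class ToyChangeBuffer:
    """Toy change buffer: defer changes to index pages that are not cached."""

    def __init__(self, disk_pages):
        self.disk_pages = disk_pages  # page number -> set of index entries on disk
        self.buffer_pool = {}         # pages currently cached in memory
        self.pending = {}             # page number -> list of (operation, entry)
        self.disk_reads = 0

    def modify(self, page_no, op, entry):
        if page_no in self.buffer_pool:
            self._apply(self.buffer_pool[page_no], op, entry)
        else:
            # Page is not cached: remember the change instead of reading the page.
            self.pending.setdefault(page_no, []).append((op, entry))

    def read_page(self, page_no):
        if page_no not in self.buffer_pool:
            self.disk_reads += 1
            page = set(self.disk_pages[page_no])
            for op, entry in self.pending.pop(page_no, []):
                self._apply(page, op, entry)  # merge buffered changes on load
            self.buffer_pool[page_no] = page
        return self.buffer_pool[page_no]

    @staticmethod
    def _apply(page, op, entry):
        if op == "insert":
            page.add(entry)
        else:  # any other operation is treated as a delete in this sketch
            page.discard(entry)


cb = ToyChangeBuffer({1: {"a"}, 2: {"x"}})
cb.modify(1, "insert", "b")
cb.modify(1, "delete", "a")
cb.modify(2, "insert", "y")
reads_before_merge = cb.disk_reads
merged_page = cb.read_page(1)
```

The three modifications cost no disk reads at all; the single read of page 1 that some later query needs anyway is also the moment its two buffered changes are merged.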

The type of data cached in the change buffer is governed by the innodb_change_buffering variable.

Permitted innodb_change_buffering values:

  • all: the default value; buffer inserts, delete-marking operations, and purges
  • none: do not buffer any operations
  • inserts: buffer insert operations
  • deletes: buffer delete-marking operations
  • changes: buffer both inserts and delete-marking operations
  • purges: buffer the physical deletion operations that happen in the background

3、 Adaptive hash index

The adaptive hash index lets InnoDB perform more like an in-memory database on systems with an appropriate combination of workload and sufficient buffer pool memory, without sacrificing transactional features or reliability. It is enabled by the innodb_adaptive_hash_index variable, or turned off at server startup with --skip-innodb-adaptive-hash-index.

4、 Log buffer

The log buffer is the memory area that holds data to be written to the log files on disk. Its size is defined by the innodb_log_buffer_size variable, and the default size is 16MB.

The contents of the log buffer are periodically flushed to disk. A large log buffer lets large transactions run without writing redo log data to disk before they commit. If you have transactions that update, insert, or delete many rows, increasing the size of the log buffer saves disk I/O.

On-disk structures

1、 Creating an InnoDB table

CREATE TABLE t1 (a INT, b VARCHAR (20), PRIMARY KEY (a)) ENGINE=InnoDB;

The ENGINE=InnoDB clause is not required when InnoDB is defined as the default storage engine, which it is by default.

To determine the default storage engine on a MySQL server instance, issue the following statement:

mysql> SELECT @@default_storage_engine;
+--------------------------+
| @@default_storage_engine |
+--------------------------+
| InnoDB                   |
+--------------------------+

By default, an InnoDB table is created in a file-per-table tablespace.

2、 .frm files

MySQL stores the data dictionary information for a table in an .frm file in the database directory. Unlike other MySQL storage engines, InnoDB also encodes information about the table in its own internal data dictionary inside the system tablespace. When MySQL drops a table or a database, it deletes one or more .frm files as well as the corresponding entries in the InnoDB data dictionary. This describes MySQL 5.7 and earlier; MySQL 8.0 removed .frm files in favor of a transactional data dictionary.

You cannot move InnoDB tables between databases simply by moving the .frm files.

3、 Row format

The row format of an InnoDB table determines how its rows are physically stored on disk.

InnoDB supports four row formats, each with different storage characteristics: REDUNDANT, COMPACT, DYNAMIC (the default), and COMPRESSED.

4、 Primary key

Choose a primary key with these characteristics:

  • Columns that are referenced by the most important queries
  • Columns that are never left blank
  • Columns that never have duplicate values
  • Columns that rarely change value once inserted

5、 Viewing InnoDB table properties

To view the properties of an InnoDB table, issue a SHOW TABLE STATUS statement:

mysql> SHOW TABLE STATUS FROM test LIKE 't%' \G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 0
 Avg_row_length: 0
    Data_length: 16384
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2021-02-18 12:18:28
    Update_time: NULL
     Check_time: NULL
      Collation: utf8mb4_0900_ai_ci
       Checksum: NULL
 Create_options: 
        Comment:

You can also access InnoDB table properties by querying the InnoDB Information Schema system tables (the table below is named INNODB_SYS_TABLES in MySQL 5.7 and was renamed INNODB_TABLES in MySQL 8.0):

mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE NAME='test/t1' \G
*************************** 1. row ***************************
     TABLE_ID: 45
         NAME: test/t1
         FLAG: 1
       N_COLS: 5
        SPACE: 35
  FILE_FORMAT: Barracuda
   ROW_FORMAT: Dynamic
ZIP_PAGE_SIZE: 0
   SPACE_TYPE: Single

Indexes

Every InnoDB table has a special index, called the clustered index, that stores the row data.

  • When you define a PRIMARY KEY on a table, InnoDB uses it as the clustered index.
  • If you do not define a PRIMARY KEY for a table, InnoDB uses the first UNIQUE index whose key columns are all defined as NOT NULL as the clustered index.
  • If a table has no PRIMARY KEY and no suitable UNIQUE index, InnoDB generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column that contains row ID values.

Accessing a row through the clustered index is fast because the index search leads directly to the page that contains the row data. For a large table, the clustered index architecture often saves a disk I/O operation compared with storage organizations that keep row data on a different page from the index record.

1、 The relationship between secondary indexes and the clustered index

Indexes other than the clustered index are called secondary indexes. Each record in a secondary index contains the primary key columns for the row as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.
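The two-step lookup through a secondary index can be illustrated with plain Python containers. The table contents and the names `clustered`, `secondary_city`, and `find_by_city` are hypothetical; a dictionary and a sorted list stand in for the B-trees.

```python
from bisect import bisect_left

# Clustered index: primary key -> full row (the row data lives in the index).
clustered = {
    1: {"id": 1, "name": "alice", "city": "Oslo"},
    2: {"id": 2, "name": "bob", "city": "Lima"},
    3: {"id": 3, "name": "carol", "city": "Oslo"},
}

# Secondary index on city: sorted (city, primary key) pairs, with no row data.
secondary_city = sorted((row["city"], pk) for pk, row in clustered.items())


def find_by_city(city):
    """Look up rows by city using the secondary index, then the clustered index."""
    rows = []
    i = bisect_left(secondary_city, (city,))  # first entry for this city, if any
    while i < len(secondary_city) and secondary_city[i][0] == city:
        pk = secondary_city[i][1]   # step 1: the secondary index yields the primary key
        rows.append(clustered[pk])  # step 2: the clustered index yields the row
        i += 1
    return rows


oslo_names = [row["name"] for row in find_by_city("Oslo")]
```

Because every secondary index entry carries the primary key, a long primary key makes every secondary index larger, which is one reason short primary keys are usually preferred.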

2、 The physical structure of an index

With the exception of spatial indexes, InnoDB indexes are B-tree data structures.

Spatial indexes use R-trees, which are specialized data structures for indexing multi-dimensional data. Index records are stored in the leaf pages of their B-tree or R-tree data structure. The default size of an index page is 16KB.

When new records are inserted into an InnoDB clustered index, InnoDB tries to leave 1/16 of the page free for future insertions and updates of the index records. If index records are inserted in sequential order (ascending or descending), the resulting index pages are about 15/16 full. If records are inserted in random order, the pages are from 1/2 to 15/16 full.
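To get a feel for what the fill factors above mean, here is a rough back-of-the-envelope estimate. It is a sketch under stated assumptions: the function `pages_needed` is hypothetical, records are taken to be fixed-size, and page headers, per-record overhead, and non-leaf pages are ignored, so the results are only indicative.

```python
def pages_needed(rows, record_bytes, fill_num, fill_den, page_bytes=16 * 1024):
    """Estimate leaf pages for `rows` records at a fill factor of fill_num/fill_den."""
    usable = page_bytes * fill_num // fill_den      # bytes of each page actually used
    per_page = usable // record_bytes               # whole records that fit
    return (rows + per_page - 1) // per_page        # round up to whole pages


# One million 100-byte records in 16KB pages.
sequential = pages_needed(1_000_000, 100, 15, 16)  # pages about 15/16 full
random_worst = pages_needed(1_000_000, 100, 1, 2)  # pages only 1/2 full
```

Under these assumptions the sequential load needs roughly half as many pages as the worst random case, which is the practical argument for monotonically increasing primary keys.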

InnoDB locking

InnoDB implements standard row-level locking with two types of locks: shared (S) locks and exclusive (X) locks.

  • A shared lock permits the transaction that holds it to read a row
  • An exclusive lock permits the transaction that holds it to update or delete a row

InnoDB supports multiple granularity locking, which permits row locks and table locks to coexist.

To make multiple granularity locking practical, InnoDB uses intention locks. An intention lock is a table-level lock that indicates which type of lock (shared or exclusive) a transaction will later require on rows in the table.
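The interaction between table-level shared (S), exclusive (X), intention shared (IS), and intention exclusive (IX) locks is usually summarized as a compatibility matrix. The sketch below encodes that matrix; the helper `can_grant` is a hypothetical name, and the model ignores lock queues and waiting.

```python
# COMPATIBLE[requested][held] is True when the requested table-level lock
# can be granted while another transaction holds the other lock.
COMPATIBLE = {
    "X":  {"X": False, "IX": False, "S": False, "IS": False},
    "IX": {"X": False, "IX": True,  "S": False, "IS": True},
    "S":  {"X": False, "IX": False, "S": True,  "IS": True},
    "IS": {"X": False, "IX": True,  "S": True,  "IS": True},
}


def can_grant(requested, held_by_others):
    """A lock is granted only if it is compatible with every lock held by others."""
    return all(COMPATIBLE[requested][held] for held in held_by_others)


two_writers_ok = can_grant("IX", ["IX"])      # two transactions updating different rows
table_read_blocked = can_grant("S", ["IX"])   # a table-level S lock must wait for a writer
```

Intention locks never conflict with each other, which is why many transactions can hold row locks in the same table at once; they only block requests that would lock the whole table.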

Transaction model

InnoDB offers all four transaction isolation levels described by the SQL standard: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE.

The default isolation level for InnoDB is REPEATABLE READ.

Isolation level    Read data consistency                         Dirty read   Non-repeatable read   Phantom read
READ UNCOMMITTED   Lowest; only avoids physically corrupt data   yes          yes                   yes
READ COMMITTED     Statement level                               no           yes                   yes
REPEATABLE READ    Transaction level                             no           no                    yes
SERIALIZABLE       Highest; transaction level                    no           no                    no

READ UNCOMMITTED

Dirty reads are allowed: a transaction may read data modified by uncommitted transactions in other sessions.

READ COMMITTED

Only committed data can be read.

REPEATABLE READ

Reads are repeatable: queries within the same transaction see the data as it was when the transaction started. This is the InnoDB default level. In the SQL standard, this isolation level eliminates non-repeatable reads, but phantom reads are still possible.

SERIALIZABLE

Transactions are serialized: they behave as if executed one at a time, so a later transaction runs only after the previous one has completed.

Deadlock

A deadlock is a situation in which different transactions cannot proceed because each holds a lock that the other needs. Because both transactions are waiting for a resource to become available, neither ever releases the locks it holds.

A deadlock can occur when transactions lock rows in multiple tables (through statements such as UPDATE or SELECT ... FOR UPDATE) but in the opposite order. A deadlock can also occur when such statements lock ranges of index records and gaps, with each transaction acquiring some locks but not others because of a timing issue.

The possibility of deadlocks is not affected by the isolation level, because the isolation level changes the behavior of read operations, whereas deadlocks occur because of write operations.

When deadlock detection is enabled (the default) and a deadlock does occur, InnoDB detects the condition and rolls back one of the transactions (the victim).
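Deadlock detection is commonly explained as finding a cycle in a wait-for graph, where each edge says "this transaction waits for a lock held by that one". The sketch below shows that idea only; `find_deadlock` is a hypothetical function, each transaction is assumed to wait for at most one other transaction, and how InnoDB actually picks the victim is not modeled.

```python
def find_deadlock(waits_for):
    """Return the transactions that form a wait cycle, or None if there is none.

    waits_for maps a transaction to the single transaction it is waiting for.
    """
    for start in waits_for:
        seen = []
        trx = start
        # Follow the chain of waits until it ends or revisits a transaction.
        while trx in waits_for and trx not in seen:
            seen.append(trx)
            trx = waits_for[trx]
        if trx in seen:
            return seen[seen.index(trx):]  # the part of the chain that loops
    return None


# T1 locked row A and wants row B; T2 locked row B and wants row A.
cycle = find_deadlock({"T1": "T2", "T2": "T1"})
# T1 waits for T2, which waits for T3, which is running: no deadlock.
no_cycle = find_deadlock({"T1": "T2", "T2": "T3"})
```

Once a cycle is found, rolling back any one transaction in it releases its locks and lets the others continue, which is what choosing a victim achieves.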

MyISAM

MyISAM tables use B-tree indexes. Each MyISAM table is stored on disk in three files whose names begin with the table name and whose extensions indicate the file type.

  • An .frm file stores the table format
  • The data file has an .MYD (MYData) extension
  • The index file has an .MYI (MYIndex) extension

Creating a MyISAM table:

CREATE TABLE t (i INT) ENGINE = MYISAM;

MyISAM table features:

Feature                                 Supported
B-tree indexes                          yes
Backup / point-in-time recovery         yes
Cluster database support                no
Clustered indexes                       no
Compressed data                         yes
Data caches                             no
Encrypted data                          yes
Foreign key support                     no
Full-text search indexes                yes
Geospatial data type support            yes
Geospatial indexing support             yes
Hash indexes                            no
Index caches                            yes
MVCC                                    no
Replication support                     yes
Storage limits                          256TB
T-tree indexes                          no
Update statistics for data dictionary   yes

MyISAM also supports the following:

  • A true VARCHAR type; a VARCHAR column starts with a length stored in one or two bytes.
  • Tables with VARCHAR columns may have fixed or dynamic row length.
  • The sum of the lengths of the VARCHAR and CHAR columns in a table may be up to 64KB.
  • UNIQUE constraints of arbitrary length.

Table storage format

1、 Static table

Static format is the default for MyISAM tables. It is used when the table contains no variable-length columns (VARCHAR, VARBINARY, BLOB, or TEXT). Each row is stored using a fixed number of bytes.

Of the three MyISAM storage formats, static format is the simplest and most secure (least subject to corruption).

  • CHAR and VARCHAR columns are space-padded to the specified column width, although the column type is not altered. BINARY and VARBINARY columns are padded with 0x00 bytes to the column width.
  • NULL columns require additional space in the row to record whether their values are NULL. Each NULL column takes one extra bit, rounded up to the nearest byte.
  • Very quick
  • Easy to cache
  • Easy to reconstruct after a crash, because rows are located in fixed positions
  • Usually requires more disk space than dynamic-format tables

2、 Dynamic table

Dynamic storage format is used if a table contains any variable-length columns (VARCHAR, VARBINARY, BLOB, or TEXT), or if the table was created with the ROW_FORMAT=DYNAMIC table option.

Dynamic format is a little more complex than static format because each row has a header that indicates how long it is. A row can become fragmented (stored in noncontiguous pieces) when an update makes it longer.

  • All string columns are dynamic except those with a length of less than four characters.
  • Each row is preceded by a bitmap that indicates which columns contain the empty string (for string columns) or zero (for numeric columns).
  • NULL columns require additional space in the row to record whether their values are NULL. Each NULL column takes one extra bit, rounded up to the nearest byte.
  • Usually requires much less disk space than fixed-length tables.
  • More difficult to reconstruct after a crash than static-format tables, because rows may be fragmented into many pieces and links (fragments) may be missing.

3、 Compressed table

Compressed storage format is a read-only format generated with the myisampack tool. Compressed tables can be uncompressed with myisamchk.

  • Compressed tables take very little disk space
  • Each row is compressed separately, so there is very little access overhead
  • Can be used for fixed-length or dynamic-length rows

MyISAM table problems

Even though the MyISAM table format is very reliable (all changes made to a table by an SQL statement are written before the statement returns), a table can still become corrupted if any of the following events occur:

  • The mysqld process is killed in the middle of a write
  • An unexpected computer shutdown occurs
  • Hardware fails
  • An external program (such as myisamchk) modifies a table that is being modified by the server at the same time
  • There is a software bug in the MySQL or MyISAM code

Typical symptoms of a corrupt table are:

  • The following error appears while selecting data from the table: Incorrect key file for table: '...'. Try to repair it
  • Queries do not find rows in the table or return incomplete results

Differences

Storage

  • MyISAM: each table is stored on disk in three files whose names begin with the table name and whose extensions indicate the file type: an .frm file for the table definition, an .MYD file for the data, and an .MYI file for the indexes.
  • InnoDB: the disk-based resources are the InnoDB tablespace data files and log files. The size of an InnoDB table is bounded by the maximum file size of the operating system, which is 2GB only on older systems and file systems.

Transactions

  • MyISAM: manages non-transactional tables. It provides high-speed storage and retrieval as well as full-text search. If the application mostly executes SELECT queries, MyISAM can be the better choice.
  • InnoDB: supports the four transaction isolation levels, rollback, crash recovery, and multi-version concurrency control, which together provide ACID transactions. If the application executes many INSERT or UPDATE operations, use InnoDB, which improves the performance of concurrent multi-user operations.

SELECT, UPDATE, INSERT, DELETE

  • MyISAM: the better choice when SELECT statements dominate.
  • InnoDB: handles INSERT and UPDATE well. On DELETE, InnoDB does not re-create the table; it deletes row by row.

Exact row count of a table

  • MyISAM: simply reads the stored row count. When the COUNT(*) statement has a WHERE condition, both engines do the same work.
  • InnoDB: does not store the exact row count, so COUNT(*) has to scan the table to count the rows.

Locks

  • MyISAM: table-level locks.
  • InnoDB: row-level locks, although they are not absolute. If MySQL cannot determine the range to scan when executing an SQL statement (for example, because no index can be used), InnoDB ends up locking the whole table.

Indexes

  • MyISAM (heap-organized table): uses non-clustered indexes, keeps the index separate from the data file, stores rows in no particular order, and can cache only indexes.
  • InnoDB (index-organized table): uses a clustered index, so the index is the data and rows are stored in key order; it can cache both indexes and data.

Concurrency

  • MyISAM: reads and writes block each other. A write blocks reads and a read blocks writes, but reads do not block other reads.
  • InnoDB: whether reads and writes block each other depends on the transaction isolation level.

Choosing an engine

The differences between the two storage engines:

  • InnoDB supports transactions and MyISAM does not, which is very important. A transaction is a higher-level way of processing: in a series of inserts, deletes, and updates, any error can be rolled back and the data restored, which MyISAM cannot do.
  • MyISAM suits applications dominated by queries and inserts; InnoDB suits applications with frequent modifications and higher safety requirements.
  • InnoDB supports foreign keys; MyISAM does not.
  • MyISAM was the default engine only before MySQL 5.5. Since MySQL 5.5, InnoDB is the default and MyISAM has to be specified explicitly.
  • InnoDB did not support FULLTEXT indexes before MySQL 5.6; it has supported them since MySQL 5.6.
  • InnoDB does not store the number of rows in a table, so for a statement such as select count(*) from table it has to scan the table to count the rows, whereas MyISAM simply reads the stored row count. Note that when the count(*) statement has a where condition, MyISAM also has to scan the table.
  • For an AUTO_INCREMENT column, InnoDB requires the column to be the first (or only) column of an index, whereas in a MyISAM table it can be a secondary column in a multiple-column index.
  • When a whole table is emptied with DELETE FROM table, InnoDB deletes row by row, which is slow, whereas MyISAM rebuilds the table.
  • InnoDB supports row locks, although in some cases the whole table is locked, as in update table set a=1 where user like '%lee%'.

MyISAM fits when:

  • Transaction support is not needed (MyISAM does not provide it)
  • Concurrency is relatively low (because of the locking mechanism)
  • Data changes are relatively few (because of blocking) and the workload is mainly reads
  • Data consistency requirements are not very high

InnoDB fits when:

  • Transaction support is needed (InnoDB has good transactional characteristics)
  • Concurrency is high; row-level locking adapts well to it, but you need to make sure queries go through an index
  • Data is updated frequently
  • Data consistency requirements are high
  • The hardware has plenty of memory; InnoDB's better caching can improve memory utilization and minimize disk I/O

Copyright notice
This article was written by [Silent storage]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/175/202206241545217957.html