MySQL field type and corresponding length & bytes

2022-06-22 02:58:00 Floating stream

Bytes

A brief history of character encoding

  • Americans were the first to encode English letters. The result was ASCII, the earliest such code, which uses the low 7 bits of a byte to represent 128 characters.
  • Europeans later found that 128 characters were not enough: French, for example, has letters with diacritical marks that need their own codes. They put the highest bit to use as well, so European encodings generally use a full byte and can represent at most 256 characters. European and American languages have few characters, so few bits are needed to encode them.
  • Even with so few bits, different countries and regions adopted different character encodings. Codes 0–127 mean the same thing everywhere, but codes 128–255 are interpreted in entirely different ways. The same binary value stands for different characters: 135, for instance, is a different symbol in the French, Hebrew, and Russian encodings.
  • Chinese is harder still. There are more than 100,000 Chinese characters, and 256 codes do not come close. GB2312 was created to encode them; it typically uses 2 bytes per character, which covers the most commonly used Chinese characters. Two bytes can distinguish at most 65,536 values, which explains why some characters that appear in the Xinhua dictionary cannot be displayed on a computer unless they are handled specially.
  • With every region using its own character set, how can the world interoperate? When a Russian sends an email to a Chinese recipient and the two sides use different encodings, the message shows up as garbled text. Unicode was invented to unify them: it includes all the symbols in the world and gives each one a unique code. Unicode now has room for more than one million symbols, each with a distinct code, so all languages can be exchanged and a single web page can display text from different countries at the same time.
  • However, Unicode only unifies the code assigned to each character; it does not specify how that code is stored. Computer architectures differ in byte order (little-endian versus big-endian), and a computer also needs some way to tell Unicode text from ASCII text. If Unicode simply required every symbol to occupy three or four bytes, every English letter would have to be padded with two or three zero bytes, and text files would grow to two or three times their size, a huge waste of storage. As a consequence, several storage formats for Unicode emerged.
  • With the rise of the Internet, web pages had to display all kinds of characters, so a unified encoding became essential. UTF-8 is one of the most important implementations of Unicode; others include UTF-16 and UTF-32. UTF-8 is not a fixed-width encoding but a variable-length one: it uses 1 to 4 bytes per symbol, depending on the symbol. The design is clever. If the first bit of a byte is 0, that byte is a complete character on its own. If the first byte of a character starts with 1, the number of consecutive leading 1 bits tells how many bytes the character occupies, and each of the following bytes starts with 10.
  • Note that a character's Unicode code and its UTF-8 storage representation are different. For example, the Unicode code of the character "严" (yan) is 4E25, while its UTF-8 encoding is E4B8A5. This shows that UTF-8 takes storage into account as well as the code itself: E4B8A5 is 4E25 with the storage marker bits added.
  • UTF-8 uses one to four bytes to encode each character. The 128 ASCII characters (Unicode range U+0000 to U+007F) need just one byte. Latin letters with diacritics, Greek, Cyrillic, Armenian, Hebrew, Arabic, Syriac, and Thaana (Maldivian) (Unicode range U+0080 to U+07FF) need two bytes. The other characters in the Basic Multilingual Plane (BMP), which include the common CJK characters, use three bytes. Characters in the Unicode supplementary planes are encoded in four bytes.
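
The bit layout described in the bullets above can be sketched in a few lines of Python. This is only an illustration: the function names `utf8_encode` and `utf8_sequence_length` are made up for this example, and real code would normally just call `str.encode("utf-8")`.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point into UTF-8 by hand (illustrative helper)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if code_point <= 0x7F:       # 1 byte:  0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:      # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def utf8_sequence_length(lead_byte: int) -> int:
    """Return how many bytes a character occupies, judged from its first byte."""
    if (lead_byte & 0x80) == 0:  # first bit is 0: a single-byte character
        return 1
    length = 0
    mask = 0x80
    while lead_byte & mask:      # count the consecutive leading 1 bits
        length += 1
        mask >>= 1
    if length == 1 or length > 4:
        raise ValueError("not a valid UTF-8 lead byte")
    return length


encoded = utf8_encode(0x4E25)    # the character "严"
print(encoded.hex().upper(), utf8_sequence_length(encoded[0]))
```

For U+4E25 the three-byte branch applies, which is how 4E25 becomes E4B8A5: the 16 bits of the code are split into groups of 4, 6, and 6 bits and placed behind the markers 1110, 10, and 10.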

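English letters:

Encoding      Bytes
GB2312        1
GBK           1
GB18030       1
ISO-8859-1    1
UTF-8         1
UTF-16        4
UTF-16BE      2
UTF-16LE      2

Chinese characters:

Encoding      Bytes
GB2312        2
GBK           2
GB18030       2
ISO-8859-1    1
UTF-8         3
UTF-16        4
UTF-16BE      2
UTF-16LE      2

Note: the figure of 4 for UTF-16 includes the 2-byte byte order mark (BOM) that encoders commonly prepend when no byte order is specified; the character itself takes 2 bytes. ISO-8859-1 cannot represent Chinese characters, so the 1 byte listed for it is the substitute character (typically "?") written in place of the original.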
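
The two tables can be reproduced with Python's built-in codecs. This is a small sketch: the sample characters "a" and "中" and the helper name `byte_lengths` are choices made for this example, not part of the original article.

```python
# Python codec names corresponding to the encodings listed in the tables.
ENCODINGS = ["gb2312", "gbk", "gb18030", "iso-8859-1",
             "utf-8", "utf-16", "utf-16-be", "utf-16-le"]


def byte_lengths(text: str) -> dict:
    """Map each encoding name to the number of bytes `text` occupies in it.

    errors="replace" substitutes "?" for characters an encoding cannot
    represent, which is why a Chinese character counts as 1 byte in ISO-8859-1.
    """
    return {name: len(text.encode(name, errors="replace")) for name in ENCODINGS}


for sample in ("a", "中"):
    print(sample, byte_lengths(sample))
```

The plain "utf-16" codec prepends a 2-byte byte order mark, so a single character encodes to 4 bytes, while "utf-16-be" and "utf-16-le" fix the byte order and need only 2.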

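MySQL field types and their storage lengths

Numeric types

Column type         Storage required
TINYINT             1 byte
SMALLINT            2 bytes
MEDIUMINT           3 bytes
INT                 4 bytes
INTEGER             4 bytes
BIGINT              8 bytes
FLOAT(X)            4 bytes if X <= 24, 8 bytes if 25 <= X <= 53
FLOAT               4 bytes
DOUBLE              8 bytes
DOUBLE PRECISION    8 bytes
REAL                8 bytes
DECIMAL(M,D)        M+2 bytes if D > 0, M+1 bytes if D = 0 (D+2 if M < D)
NUMERIC(M,D)        M+2 bytes if D > 0, M+1 bytes if D = 0 (D+2 if M < D)

Note: the DECIMAL and NUMERIC figures describe the string-based format used before MySQL 5.0.3. Later versions use a packed binary format that stores every 9 decimal digits in 4 bytes.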
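
The byte counts translate directly into value ranges, and the DECIMAL size can be computed from M and D. The sketch below assumes the packed binary DECIMAL format that MySQL has used since 5.0.3 (9 digits per 4 bytes, leftover digits rounded up to 1 to 4 bytes); the helper names `integer_range` and `decimal_bytes` are made up for this example.

```python
# Storage size in bytes of each MySQL integer type.
INTEGER_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}

# Bytes needed for 0..8 leftover decimal digits in the packed DECIMAL format.
LEFTOVER_BYTES = [0, 1, 1, 2, 2, 3, 3, 4, 4]


def integer_range(type_name: str, unsigned: bool = False) -> tuple:
    """Return (min, max) for an integer type, derived from its byte count."""
    bits = 8 * INTEGER_BYTES[type_name]
    if unsigned:
        return (0, 2 ** bits - 1)
    return (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)


def decimal_bytes(m: int, d: int) -> int:
    """Storage for DECIMAL(m, d), assuming the packed format of MySQL 5.0.3+."""
    if not (1 <= m <= 65 and 0 <= d <= 30 and d <= m):
        raise ValueError("expected 1 <= M <= 65 and 0 <= D <= min(30, M)")

    def part(digits: int) -> int:
        # Each full group of 9 digits takes 4 bytes; the rest is looked up.
        return 4 * (digits // 9) + LEFTOVER_BYTES[digits % 9]

    # Integer and fractional parts are stored separately.
    return part(m - d) + part(d)


print(integer_range("TINYINT"), integer_range("INT", unsigned=True))
print(decimal_bytes(10, 2), decimal_bytes(20, 6))
```

The integer and fractional digits are packed separately, so DECIMAL(20,6) needs 7 bytes for its 14 integer digits plus 3 bytes for its 6 fractional digits.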

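Date and time types

Column type    Storage required
DATE           3 bytes
DATETIME       8 bytes
TIMESTAMP      4 bytes
TIME           3 bytes
YEAR           1 byte

Note: these are the sizes before MySQL 5.6.4. From 5.6.4 on, DATETIME takes 5 bytes, and DATETIME, TIMESTAMP, and TIME each add 0 to 3 bytes for fractional seconds.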
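
The 4-byte size of TIMESTAMP is also what limits its range. The sketch below assumes the value is a count of seconds since the Unix epoch capped at 2^31 - 1, and uses the MySQL 5.6.4+ rule for fractional seconds (1 extra byte per two digits of precision); the names `temporal_bytes` and `timestamp_upper_bound` are made up for this example.

```python
from datetime import datetime, timezone

# Base sizes in MySQL 5.6.4 and later, before fractional-seconds storage.
BASE_BYTES = {"DATETIME": 5, "TIMESTAMP": 4, "TIME": 3}


def temporal_bytes(type_name: str, fsp: int = 0) -> int:
    """Storage for a temporal column with `fsp` fractional-second digits (0-6)."""
    if not 0 <= fsp <= 6:
        raise ValueError("fractional seconds precision must be between 0 and 6")
    # Precision 1-2 adds 1 byte, 3-4 adds 2 bytes, 5-6 adds 3 bytes.
    return BASE_BYTES[type_name] + (fsp + 1) // 2


def timestamp_upper_bound() -> datetime:
    """Largest instant a 4-byte seconds counter capped at 2**31 - 1 can hold."""
    return datetime.fromtimestamp(2 ** 31 - 1, tz=timezone.utc)


print(temporal_bytes("DATETIME"), temporal_bytes("DATETIME", 6))
print(timestamp_upper_bound().isoformat())
```

DATETIME is not bound by this limit, because it stores the calendar fields rather than a seconds counter.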

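String types

Column type                      Storage required
CHAR(M)                          M bytes, 0 <= M <= 255
VARCHAR(M)                       L+1 bytes, where L <= M and 0 <= M <= 255
TINYBLOB, TINYTEXT               L+1 bytes, where L < 2^8
BLOB, TEXT                       L+2 bytes, where L < 2^16
MEDIUMBLOB, MEDIUMTEXT           L+3 bytes, where L < 2^24
LONGBLOB, LONGTEXT               L+4 bytes, where L < 2^32
ENUM('value1', 'value2', …)      1 or 2 bytes, depending on the number of enumeration values (maximum 65,535)
SET('value1', 'value2', …)       1, 2, 3, 4, or 8 bytes, depending on the number of set members (maximum 64)

Note: L is the actual length of the stored value in bytes, and M is the declared length in characters. The CHAR and VARCHAR rows assume a single-byte character set; with a multi-byte character set, CHAR(M) can take up to M × w bytes, where w is the maximum number of bytes per character. Since MySQL 5.0.3, VARCHAR(M) accepts M up to 65,535 (subject to the row size limit) and takes L+2 bytes when the column can hold more than 255 bytes.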
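
The rules in this table combine with the character encodings discussed earlier: L is measured in bytes, so it depends on the character set. The sketch below assumes MySQL 5.0.3 or later and a utf8mb4 column (approximated here by Python's "utf-8" codec, at most 4 bytes per character); the helper names are made up for this example.

```python
def varchar_bytes(value: str, m: int, encoding: str = "utf-8",
                  max_char_bytes: int = 4) -> int:
    """Bytes used to store `value` in a VARCHAR(m) column (MySQL 5.0.3+ rule)."""
    if len(value) > m:
        raise ValueError("value is longer than the declared column length")
    data_length = len(value.encode(encoding))   # L, the actual byte length
    # The length prefix is 1 byte if the column can never exceed 255 bytes,
    # otherwise 2 bytes.
    prefix = 1 if m * max_char_bytes <= 255 else 2
    return data_length + prefix


def enum_bytes(value_count: int) -> int:
    """1 byte for up to 255 enumeration values, 2 bytes for up to 65,535."""
    if not 1 <= value_count <= 65535:
        raise ValueError("ENUM supports 1 to 65,535 values")
    return 1 if value_count <= 255 else 2


def set_bytes(member_count: int) -> int:
    """One bit per member, rounded up to 1, 2, 3, 4, or 8 bytes."""
    if not 1 <= member_count <= 64:
        raise ValueError("SET supports 1 to 64 members")
    needed = (member_count + 7) // 8
    return next(size for size in (1, 2, 3, 4, 8) if size >= needed)


print(varchar_bytes("abc", 10), varchar_bytes("中文", 255))
print(enum_bytes(3), enum_bytes(256), set_bytes(9), set_bytes(33))
```

Two Chinese characters in a VARCHAR(255) utf8mb4 column occupy 6 data bytes plus a 2-byte prefix, while the same column declared as VARCHAR(10) would need only a 1-byte prefix.
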
References:
https://blog.csdn.net/yaomingyang/article/details/79374209
https://blog.csdn.net/lijinzhou2017/article/details/81062877