[MySQL] Pitfall notes: a utf8mb4 table that still cannot store emoji
2022-07-24 01:56:00 【morris131】
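Symptom
A column's character set takes precedence over the table's, and the table's takes precedence over the database's. In theory, then, a table whose character set is utf8mb4 should be able to store emoji. Is that really the case?
The MySQL table's character set was already utf8mb4, yet writing a 4-byte emoji through JDBC failed, while inserting the same emoji with a plain SQL statement from the command line succeeded.
Example:
Table structure: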
CREATE TABLE `user_info` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(11) NOT NULL,
`age` int(4) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Writing through JDBC:
User user = new User();
user.setName("\uD83D\uDC8B");
user.setAge(18);
userMapper.insertUser(user);
The resulting error:
org.springframework.jdbc.UncategorizedSQLException:
### Error updating database. Cause: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1
### The error may involve com.wakzz.database.persistence.UserMapper.insertUser-Inline
### The error occurred while setting parameters
### SQL: insert into user_info (name, age) values (?, ? )
### Cause: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1
; uncategorized SQLException; SQL state [HY000]; error code [1366]; Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1; nested exception is java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:89)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:73)
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:446)
at com.sun.proxy.$Proxy81.insert(Unknown Source)
at org.mybatis.spring.SqlSessionTemplate.insert(SqlSessionTemplate.java:278)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:58)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:59)
at com.sun.proxy.$Proxy92.insertUser(Unknown Source)
Writing from the command line succeeds:
mysql> insert into user_info (name,age) values ('💋',18);
Query OK, 1 row affected
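Cause
Cause: the JDBC driver (MySQL Connector/J) reads the server's character_set_server value and automatically issues SET NAMES to set the character set for the whole connection. The intent is to pick up the server's character set configuration automatically, so that less configuration is needed on the JDBC client. If character_set_server is utf8, the driver sets the connection character set to utf8, which in MySQL holds at most 3 bytes per character, so a 4-byte emoji cannot be written even when the table itself is utf8mb4.
Official release notes: https://dev.mysql.com/doc/relnotes/connector-j/5.1/en/news-5-1-13.html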
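To see why a utf8 connection rejects this value, it helps to look at the bytes. The sketch below encodes the same string used in the JDBC example and checks whether it contains a code point above U+FFFF, which is what needs 4 bytes in UTF-8 and therefore utf8mb4 in MySQL. The class and method names are hypothetical and are not part of the original project.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original project.
final class EmojiBytes {

    // Returns the UTF-8 bytes of s in the escaped form MySQL uses in error 1366.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // True if s contains a supplementary code point (above U+FFFF),
    // which takes 4 bytes in UTF-8 and does not fit MySQL's 3-byte utf8.
    static boolean needsUtf8mb4(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String kiss = "\uD83D\uDC8B"; // same value as user.setName(...) in the example
        System.out.println(utf8Hex(kiss));       // \xF0\x9F\x92\x8B
        System.out.println(needsUtf8mb4(kiss));  // true
        System.out.println(needsUtf8mb4("abc")); // false
    }
}
```

The first line of output matches the byte sequence reported in the Incorrect string value error.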

Looking at the Connector/J source code:
// realJavaEncoding is the characterEncoding value specified in the URL
if (realJavaEncoding.equalsIgnoreCase("UTF-8") || realJavaEncoding.equalsIgnoreCase("UTF8")) {
// charset names are case-sensitive
// read the MySQL server's character_set_server
boolean useutf8mb4 = CharsetMapping.UTF8MB4_INDEXES.contains(this.session.getServerDefaultCollationIndex());
if (!this.useOldUTF8Behavior.getValue()) {
if (dontCheckServerMatch || !this.session.characterSetNamesMatches("utf8") || (!this.session.characterSetNamesMatches("utf8mb4"))) {
// execute SET NAMES xxx
execSQL(null, "SET NAMES " + (useutf8mb4 ? "utf8mb4" : "utf8"), -1, null, false, this.database, null, false);
this.session.getServerVariables().put("character_set_client", useutf8mb4 ? "utf8mb4" : "utf8");
this.session.getServerVariables().put("character_set_connection", useutf8mb4 ? "utf8mb4" : "utf8");
}
} else {
execSQL(null, "SET NAMES latin1", -1, null, false, this.database, null, false);
this.session.getServerVariables().put("character_set_client", "latin1");
this.session.getServerVariables().put("character_set_connection", "latin1");
}
this.characterEncoding.setValue(realJavaEncoding);
}
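After fetching the MySQL server variables, the driver resolves the character set as follows:
- When character_set_server is utf8, it executes SET NAMES utf8
- When character_set_server is utf8mb4, it executes SET NAMES utf8mb4
Testing SET NAMES from the command line confirms this: even though the table's character set is utf8mb4, running SET NAMES utf8 first makes a 4-byte character fail to insert. This reproduces the failure seen when writing the emoji through JDBC.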
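The rule in the two bullets can be written as a small function. This is a simplified model of the driver logic quoted earlier, not the driver's own code: the class and method names are hypothetical, and the real driver decides from the server's default collation index rather than from the character set name.

```java
import java.util.Locale;

// Hypothetical, simplified model of the Connector/J behaviour described above.
final class SetNamesModel {

    // Returns the SET NAMES statement issued for a UTF-8 connection,
    // given the server's character_set_server value.
    static String setNamesFor(String characterSetServer) {
        boolean useUtf8mb4 = "utf8mb4".equals(characterSetServer.toLowerCase(Locale.ROOT));
        return "SET NAMES " + (useUtf8mb4 ? "utf8mb4" : "utf8");
    }

    public static void main(String[] args) {
        System.out.println(setNamesFor("utf8"));    // SET NAMES utf8
        System.out.println(setNamesFor("utf8mb4")); // SET NAMES utf8mb4
    }
}
```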
mysql> SET NAMES utf8;
Query OK, 0 rows affected
mysql> insert into user_info (name,age) values ('💋',18);
1366 - Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1
mysql> SET NAMES utf8mb4;
Query OK, 0 rows affected
mysql> insert into user_info (name,age) values ('💋',18);
Query OK, 1 row affected
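Solutions
Change character_set_server to utf8mb4
Edit the MySQL configuration file my.cnf and add the following setting under the [mysqld] section:
character_set_server = utf8mb4
The database instance must be restarted for the change to take effect.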
Character sets before the change
mysql> show variables like "%char%";
+--------------------------+----------------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/soft/mysql-5.6.31/share/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set
mysql> SHOW VARIABLES LIKE 'collation%';
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | utf8_general_ci |
+----------------------+--------------------+
3 rows in set
Character sets after the change
mysql> show variables like "%char%";
+--------------------------+----------------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| character_sets_dir | /usr/soft/mysql-5.6.31/share/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set
mysql> SHOW VARIABLES LIKE 'collation%';
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | utf8mb4_general_ci |
+----------------------+--------------------+
3 rows in set
Manually set the connection encoding to utf8mb4
Setting the connection encoding through JDBC:
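Connection conn = DriverManager.getConnection(url, userName, password);
// SET NAMES returns no result set, so call execute() rather than executeQuery()
try (Statement stmt = conn.createStatement()) { stmt.execute("SET NAMES utf8mb4"); }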
In a Spring project, the connection can be obtained from the ThreadLocal:
// Look up the connection bound to the current transaction for this DataSource
ConnectionHolder connectionHolder = (ConnectionHolder) TransactionSynchronizationManager.getResource(dataSource);
Connection connection = connectionHolder.getConnection();
// SET NAMES returns no result set, so call execute() rather than executeQuery()
try (PreparedStatement preparedStatement = connection.prepareStatement("SET NAMES utf8mb4")) {
    preparedStatement.execute();
} catch (SQLException e) {
    e.printStackTrace();
}
Here dataSource can be injected by Spring:
@Autowired
private DataSource dataSource;
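Note that this approach requires an active transaction; otherwise no connection is bound to the ThreadLocal and the lookup finds nothing.
Finally, here is the SQL for changing the character set of a database, a table, and a column: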
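-- Change the database's default character set
alter database DBNAME DEFAULT CHARACTER SET utf8mb4;
-- Change the table's character set. CONVERT TO also converts the existing character columns;
-- ALTER TABLE ... DEFAULT CHARACTER SET, by contrast, applies only to columns added afterwards.
-- character_name is a placeholder for the target character set, e.g. utf8mb4.
alter table tbl_name convert to character set character_name;
-- Change a column's character set
alter table tbl_name change col1 col1 varchar(20) CHARACTER SET utf8mb4;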
Encode the string before storing it and decode it after reading
Encode the string when writing it to the database:
String encode = URLEncoder.encode("就🧑", StandardCharsets.UTF_8.name());
Decode the string when reading it back from the database:
URLDecoder.decode(encode, StandardCharsets.UTF_8.name());
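As a rough illustration of this workaround, the sketch below wraps the two calls in a pair of helpers and round-trips the emoji from the earlier example. The class and method names are hypothetical. Note that the encoded form is pure ASCII but longer than the original: the single emoji becomes 12 characters, which would no longer fit the varchar(11) name column shown at the top, so the column width has to allow for the expansion.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; names are not from the original post.
final class EmojiCodec {

    // Percent-encodes the text so that only ASCII characters reach the database.
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Reverses encode() after the value is read back.
    static String decode(String stored) {
        try {
            return URLDecoder.decode(stored, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String raw = "\uD83D\uDC8B";
        String stored = encode(raw);
        System.out.println(stored);                     // %F0%9F%92%8B
        System.out.println(stored.length());            // 12
        System.out.println(decode(stored).equals(raw)); // true
    }
}
```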