测试对比 , utf8 和 utf8mb4 在空间占用上的区别
版本:mysql5.7.25官方社区版,默认utf8字符集
mysql [localhost:5725] {root} ((none)) > status ;
--------------
/root/opt/mysql/5.7.25/bin/mysql Ver 14.14 Distrib 5.7.25, for linux-glibc2.12 (x86_64) using EditLine wrapper
Connection id:6
Current database:
Current user:root@localhost
SSL:Not in use
Current pager:stdout
Using outfile:''
Using delimiter:;
Server version:5.7.25-log MySQL Community Server (GPL)
Protocol version:10
Connection:Localhost via UNIX socket
Server characterset:utf8
Db characterset:utf8
Client characterset:utf8
Conn. characterset:utf8
UNIX socket:/tmp/mysql_sandbox5725.sock
Uptime:4 hours 58 min 57 sec
Threads: 1 Questions: 131 Slow queries: 0 Opens: 121 Flush tables: 1 Open tables: 107 Queries per second avg: 0.007
--------------
mysql [localhost:5725] {root} ((none)) > show global variables like 'char%';
+--------------------------+----------------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /root/opt/mysql/5.7.25/share/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set (0.00 sec)
# 创建2张测试表并造些测试数据
mysql [localhost:5725] {root} (test) > CREATE TABLE `t1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'pk',
`name` varchar(10) NOT NULL DEFAULT '' COMMENT 'name',
`age` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'age',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
mysql [localhost:5725] {root} (test) > CREATE TABLE `t2` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'pk',
`name` varchar(10) NOT NULL DEFAULT '' COMMENT 'name',
`age` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'age',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
insert into t1 (name,age) values('abcde12345',25);
insert into t1 (name,age) select name,age from t1 ; 用脚本多执行几次这个sql
insert into t2 (name,age) values('abcde12345',25);
insert into t2 (name,age) select name,age from t2 ; 用脚本多执行几次这个sql
select max(id) from t1;
+----------+
| max(id) |
+----------+
| 17301353 |
+----------+
select max(id) from t2;
+----------+
| max(id) |
+----------+
| 17301353 |
+----------+
# 加个索引
alter table t1 add index idx_name (name);
alter table t2 add index idx_name (name);
# 查看2张表的体积
mysql [localhost:5725] {root} (test) > show table status like 't1' \G
*************************** 1. row ***************************
Name: t1
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 16349147
Avg_row_length: 38
Data_length: 631242752
Max_data_length: 0
Index_length: 0
Data_free: 4194304
Auto_increment: 17366761
Create_time: 2019-05-07 10:26:57
Update_time: 2019-05-07 10:22:35
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)
mysql [localhost:5725] {root} (test) > show table status like 't2' \G
*************************** 1. row ***************************
Name: t2
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 16357733
Avg_row_length: 37
Data_length: 618659840
Max_data_length: 0
Index_length: 0
Data_free: 4194304
Auto_increment: 17366761
Create_time: 2019-05-07 10:29:16
Update_time: 2019-05-07 10:24:09
Check_time: NULL
Collation: utf8mb4_general_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)
# 查看表物理文件占用的大小
[root@Dev-K8s-N1 /root/sandboxes/msb_5_7_25/data/test ]
# ll *.ibd
-rw-r----- 1 root root 1098907648 May 7 10:26 t1.ibd
-rw-r----- 1 root root 1098907648 May 7 10:29 t2.ibd
可以看到存相同的的数据时候, utf8mb4 并没有比 utf8 多占空间。 同时,utf8mb4 能支持emoji表情字符,因此强烈推荐建表时候使用utf8mb4字符集(注意代码连接数据库的时候也要改成utf8mb4的才行)
另外, MySQL8.0中,字符集默认就是 utf8mb4了。 大势所趋。新上的数据库,赶紧都用 utf8mb4 吧。