今日推荐: 数据库设计的三范式与反范式:优化数据结构,提升数据库性能
文章链接: https://cloud.tencent.com/developer/article/2469671
推荐语: 数据库的性能受很多方式的影响,设计良好的结构是至关重要,良好的设计来源于良好的原则,该文就介绍了数据库设计的规则,并结合原则给出了实际的示例和为什么要这样设计,值得时时参考。
下面开始今天对于MySQL索引的一些实践
在实际项目中, 往往是由多人协同开发,在项目的更新迭代中难免有时候会由不同的开发人员在同一张表上建立了不同名字但实际上重复列的索引, 之前我们说过每个索引都会占用空间,并且在插入新数据,更新和删除已有数据的时候都需要维护索引,所以去除掉这些重复的索引很有必要。怎么样方便的找到这些多余的索引呢,有一个很有用的工具pt-duplicate-key-checker
在employees表上创建如下的索引
mysql> create index idx_name on employees(last_name, first_name);
Query OK, 0 rows affected (3.28 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_last_name on employees(last_name);
Query OK, 0 rows affected (2.39 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_last_and_first_name on employees(last_name, first_name);
Query OK, 0 rows affected, 1 warning (4.64 sec)
Records: 0 Duplicates: 0 Warnings: 1Percona Toolkit是Percona开发的用于管理MySQL的工具套件,其中一个工具就是我们这里要用到的pt-duplicate-key-checker,安装也非常方便。
以MacOS上的安装为例
brew install percona-toolkit安装完成后检查版本,目前的版本是 3.6.0
MacBook-Pro:~ hongyan$ pt-duplicate-key-checker --version
pt-duplicate-key-checker 3.6.0安装完成以后我们便可以使用该工具来检查现有数据库中的重复索引
pt-duplicate-key-checker --database employees --user username --password password
其中
--database 后面的参数是我们需要检查的数据库名
--user 后面的参数是数据库实际的用户名
--password 后面的参数是数据库该用户名实际对应的密码
接下来我们来看该命令对于刚刚我们所建立的三个索引的运行结果
MacBook-Pro:~ hongyan$ pt-duplicate-key-checker --database employees --user username --password password
#######################################################################
# employees.employees
# ########################################################################
# idx_name is a duplicate of idx_last_and_first_name
# Key definitions:
# KEY `idx_name` (`last_name`,`first_name`),
# KEY `idx_last_and_first_name` (`last_name`,`first_name`)
# Column types:
# `last_name` varchar(16) not null
# `first_name` varchar(14) not null
# To remove this duplicate index, execute:
ALTER TABLE `employees`.`employees` DROP INDEX `idx_name`;
# idx_last_name is a left-prefix of idx_last_and_first_name
# Key definitions:
# KEY `idx_last_name` (`last_name`),
# KEY `idx_last_and_first_name` (`last_name`,`first_name`)
# Column types:
# `last_name` varchar(16) not null
# `first_name` varchar(14) not null
# To remove this duplicate index, execute:
ALTER TABLE `employees`.`employees` DROP INDEX `idx_last_name`;
# ########################################################################
# Summary of indexes
# ########################################################################
# Size Duplicate Indexes 56580670
# Total Duplicate Indexes 2
# Total Indexes 19从该输出我们可以看到
的最左前缀,通过我们前一篇文章MySQL性能优化 - 索引匹配 - 最左前缀 索引匹配原则,所以这里的idx_last_name是一个冗余索引,并紧接着给出了如果fix该冗余索引。
最后给出了此次检查的一些summary, 也就是此次总共检查了19个索引,总共有2个重复的索引,并且重复索引占用的空间是56580670字节, 可以用以下命令查看employees表每个索引占的空间
mysql> select index_name, sum(stat_value) * @@innodb_page_size size
-> from mysql.innodb_index_stats
-> where stat_name = 'size'
-> and database_name = 'employees'
-> and table_name = 'employees'
-> group by index_name;
+-------------------------+----------+
| index_name | size |
+-------------------------+----------+
| PRIMARY | 34127872 |
| idx_last_and_first_name | 8929280 |
| idx_last_name | 6832128 |
| idx_name | 8929280 |
+-------------------------+----------+
4 rows in set (0.01 sec)最后我们按照该结果删除重复索引
mysql> ALTER TABLE `employees`.`employees` DROP INDEX `idx_name`;
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE `employees`.`employees` DROP INDEX `idx_last_name`;
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0再运行一次检查指令
MacBook-Pro:~ hongyan$ pt-duplicate-key-checker --database employees --user username --password password
# ########################################################################
# Summary of indexes
# ########################################################################
# Total Indexes 17从这次命令的运行结果来看,总索引的条数变成了17,也没有了重复索引。
快在实际项目中去试一下这个好用的工具吧,我们的目标 是让SQL飞起来 O(∩_∩)O哈哈~
在刚刚的实践中,该检查是指定的本地的项目,所以并没有指明host是什么,所以当然这个小工具还有其它的选项
MacBook-Pro:~ hongyan$ pt-duplicate-key-checker --help
pt-duplicate-key-checker examines MySQL tables for duplicate or redundant
indexes and foreign keys. Connection options are read from MySQL option files.
For more details, please use the --help option, or try 'perldoc
/usr/local/Cellar/percona-toolkit/3.6.0_2/libexec/bin/pt-duplicate-key-checker'
for complete documentation.
Usage: pt-duplicate-key-checker [OPTIONS] [DSN]
Options:
--all-structs Compare indexes with different structs (BTREE, HASH,
etc)
--ask-pass Prompt for a password when connecting to MySQL
--charset=s -A Default character set
--[no]clustered PK columns appended to secondary key is duplicate (
default yes)
--config=A Read this comma-separated list of config files; if
specified, this must be the first option on the command
line
--databases=h -d Check only this comma-separated list of databases
--defaults-file=s -F Only read mysql options from the given file
--engines=h -e Check only tables whose storage engine is in this comma-
separated list
--help Show help and exit
--host=s -h Connect to host
--ignore-databases=H Ignore this comma-separated list of databases
--ignore-engines=H Ignore this comma-separated list of storage engines
--ignore-order Ignore index order so KEY(a,b) duplicates KEY(b,a)
--ignore-tables=H Ignore this comma-separated list of tables
--key-types=s Check for duplicate f=foreign keys, k=keys or fk=both (
default fk)
--password=s -p Password to use when connecting
--pid=s Create the given PID file
--port=i -P Port number to use for connection
--set-vars=A Set the MySQL variables in this comma-separated list of
variable=value pairs
--socket=s -S Socket file to use for connection
--[no]sql Print DROP KEY statement for each duplicate key (
default yes)
--[no]summary Print summary of indexes at end of output (default yes)
--tables=h -t Check only this comma-separated list of tables
--user=s -u User for login if not current user
--verbose -v Output all keys and/or foreign keys found, not just
redundant ones
--version Show version and exit
--[no]version-check Check for the latest version of Percona Toolkit, MySQL,
and other programs (default yes)
Option types: s=string, i=integer, f=float, h/H/a/A=comma-separated list, d=DSN, z=size, m=time
Rules:
This tool accepts additional command-line arguments. Refer to the SYNOPSIS and usage information for details.
DSN syntax is key=value[,key=value...] Allowable DSN keys:
KEY COPY MEANING
=== ==== =============================================
A yes Default character set
D yes Default database
F yes Only read default options from the given file
P yes Port number to use for connection
S yes Socket file to use for connection
h yes Connect to host
p yes Password to use when connecting
u yes User for login if not current user
If the DSN is a bareword, the word is treated as the 'h' key.
Options and values after processing arguments:
--all-structs FALSE
--ask-pass FALSE
--charset (No value)
--clustered TRUE
--config /etc/percona-toolkit/percona-toolkit.conf,/etc/percona-toolkit/pt-duplicate-key-checker.conf,/Users/lihongyan/.percona-toolkit.conf,/Users/lihongyan/.pt-duplicate-key-checker.conf
--databases (No value)
--defaults-file (No value)
--engines (No value)
--help TRUE
--host (No value)
--ignore-databases
--ignore-engines
--ignore-order FALSE
--ignore-tables
--key-types fk
--password (No value)
--pid (No value)
--port (No value)
--set-vars
--socket (No value)
--sql TRUE
--summary TRUE
--tables (No value)
--user (No value)
--verbose FALSE
--version FALSE
--version-check TRUE--all-structs 默认情况下是FALSE, 因为BTREE索引可能会覆盖作为FULLTEXT索引的列,这种情况下并不是真正的重复,因为是不同的索引结构
--clustered 默认情况下是TRUE, 如果主键索引的列是一个非主键索引的后缀,则认为这两个索引是重复的。比如我们再在employees表上创建一个last_name, first_name, emp_no三列组成的索引,其中emp_no是主键索引
mysql> create index idx_name_no on employees(last_name, first_name, emp_no);
Query OK, 0 rows affected (1.35 sec)
Records: 0 Duplicates: 0 Warnings: 0我们查看一下目前employees表的索引情况
mysql> show index from employees;
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| employees | 0 | PRIMARY | 1 | emp_no | A | 297793 | NULL | NULL | | BTREE | | | YES | NULL |
| employees | 1 | idx_name_no | 1 | last_name | A | 1633 | NULL | NULL | | BTREE | | | YES | NULL |
| employees | 1 | idx_name_no | 2 | first_name | A | 282095 | NULL | NULL | | BTREE | | | YES | NULL |
| employees | 1 | idx_name_no | 3 | emp_no | A | 297793 | NULL | NULL | | BTREE | | | YES | NULL |
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
4 rows in set (0.01 sec)
那我们运行检查工具的时候会得到以下结果
MacBook-Pro:~ hongyan$ pt-duplicate-key-checker --database=employees --user=username --password=password
# ########################################################################
# employees.employees
# ########################################################################
# Key idx_name_no ends with a prefix of the clustered index
# Key definitions:
# KEY `idx_name_no` (`last_name`,`first_name`,`emp_no`)
# PRIMARY KEY (`emp_no`),
# Column types:
# `last_name` varchar(16) not null
# `first_name` varchar(14) not null
# `emp_no` int not null
# To shorten this duplicate clustered index, execute:
ALTER TABLE `employees`.`employees` DROP INDEX `idx_name_no`, ADD INDEX `idx_name_no` (`last_name`,`first_name`);
# ########################################################################
# Summary of indexes
# ########################################################################
# Size Duplicate Indexes 38117504
# Total Duplicate Indexes 1
# Total Indexes 16我们看到工具给出的建议是去掉idx_name_no这个索引,然后加上只包含last_name, first_name的索引, 如果我们不想让工具认为这种情况是重复索引,则加上参数--noclustered, 如下所示
lihongyans-MacBook-Pro:~ lihongyan$ pt-duplicate-key-checker --database=employees --user=root --password=Kangju0610 --noclustered
# ########################################################################
# Summary of indexes
# ########################################################################
# Total Indexes 16可以看到这里并没有检出重复索引
--host 指定MySQL server所在的主机
--port 指定server所在的端口
更多信息可以参考 https://docs.percona.com/percona-toolkit/pt-duplicate-key-checker.html
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。