Hash Join作为表连接的基础连接类型,各大关系型数据库(譬如Oracle、sqlserver、Postgres等)很早都支持了Hash Join这种连接类型。作为关系型数据库领域的领袖,Oracle数据库支持三种主流的连接类型:Nested Loop Join、Hash Join 和 Sort Merge Join。而作为最流行的关系型数据库的MySQL 却一直没有支持Hash Join,这点一直为人诟病。千呼万唤始出来,MySQL 8.0.18开始终于支持了Hash Join的连接算法。MySQL 8.0 的所有新特性中,Hash Join 曾经最让我期待的一个新特性。
MySQL 8.0之前,MySQL数据库仅支持嵌套循环链接Nested loop Join一种连接算法。Nested Loop Join 是一个双重循环的结构,对于连接的两张表,先循环遍历外层驱动表,对于驱动表的每一条记录,内层循环遍历被驱动表来判断是否符合连接条件;假设驱动表存储在M个page上有m条记录,被驱动表在N个page上有n条记录,那么,Nested Loop Join的IO成本为:M+(m*N)。嵌套循环连接的IO复杂度是很高的。
MySQL 8.0中的Hash Join可以通过Hash表的方式来降低IO复杂度。Hash Join 算法先遍历驱动表,根据表的连接条件作为key值在内存中建立一张hash表,对于被驱动表的每一条记录也根据连接条件计算hash值,验证hash值与hash表中的值是否匹配来完成连接。假设驱动表存储在M个page上有m条记录,被驱动表在N个page上有n条记录,那么,Hash Join的IO成本为:M+N,另外内存中的IO成本为2M(一次写hash表,一次匹配时的读hash表)。可以看出,Hash Join比Nested Loop Join的IO复杂度要低很多。
执行Hash Join一般包含两个过程,hash表的构建过程(build),和hash表的探测过程(probe)。
以Oracle官方的例子来描述一下Hash Join的具体实现。
# SQL连接查询示例
SELECT
given_name, country_name
FROM
persons JOIN countries ON persons.country_id = countries.country_id;
遍历驱动表,以连接条件为key,查询需要的列作为value,在内存中创建hash表;
逐行遍历被驱动表,对于被驱动表的每条记录,根据连接条件计算hash值,并在内存hash表中查找匹配记录,如果找到匹配记录则输出,否则跳过,知道遍历完所有被驱动表的记录。
基础的hash join要求在内存中装载整个驱动表(或者驱动表中满足谓词过滤条件的结果集),所以一般选择参与连接的两张表中记录数较小的表或者经过谓词过滤后结果集较小的表作为驱动表,尽量使得驱动表结果集的hash表能够全部装载到内存。 在MySQL 8.0中,构建hash表能够使用的内存大小由参数join_buffer_size控制,默认256K,线上环境调整为2M左右,依据具体业务场景而定。如果hash表大小超过join_buffer_size,那么hash join就需要调整为On-disk Hash Join。
On-disk Hash Join为了控制内存占用,将外表分成若干片段执行,使得内存能够容纳单个分片。每当外表填充满hash表时就截断build过程。然后,针对每个被截断的分片,都执行一遍内表全量数据的Proble过程。假设外表分成了k片,那么将扫描k次内表,总体IO成本是3M+kN。在MySQL 8.0.22之后又对On-disk Hash Join进行了一些优化,分别对驱动表和被驱动表构建hash表分布在磁盘的分片文件中,然后对相同分片编号(连接键相同)的分片中的数据再进行hash join;这种优化后的代价是,外表和内表在build阶段进行一次读IO和一次写IO,在probe阶段进行了一次读IO,所以整体IO成本是3*(M+N)。总体来说,On-disk Hash Join的性能就会差很多了。
hash join可以用于内连接、外连接、半连接、反连接的等值或非等值连接。根据hash join与nested loop join的算法对比,hash join可以显著减少被驱动表的循环扫描次数,IO复杂度更低,所以适用于相对大数据量的表连接。
1)误区一: Hash Join 只适用于两张大表之间的连接。 hash join适用于两张表中符合连接条件的结果集较大的表连接场景的优化,跟表的数据量级大小关系并不大。
2)误区二:Hash Join 仅仅在完全没有索引的情况下才能够显著提高性能的论断。 首先,对于连接列有索引的表之间连接的场景下,hash join也可以显著提高性能;其次,hash join连接的同时也可以使用表上的谓词过滤条件对应列上的索引,并非hash join就不能走索引。
稍后举例论证。
在以下场景,hash join的连接效率比nested loop join的效率要高很多:
测试场景:两张只有1万条记录的测试表之间关联,并且关联列c上有索引。
# 表结构如下,记录数1w条
CREATE TABLE `t1w` (
`id` int NOT NULL AUTO_INCREMENT,
`k` int NOT NULL DEFAULT '0',
`c` varchar(60) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `idx_c` (`c`)
) ENGINE=InnoDB AUTO_INCREMENT=10847 DEFAULT CHARSET=latin1
# 测试表连接SQL
MySQL [sbtest]> explain select count(*) from t1w t1, t2w t2 where t1.c=t2.c;
MySQL 5.7中,执行计划为Nested loop join,通过关联列c上的索引idx_c的嵌套循环连接,1w条记录的两张表关联,执行耗时 0.02秒。删除关联列c上的索引idx_c后,嵌套循环连接执行耗时13.32秒。
# MySQL 5.7中执行计划为Nested loop join
MySQL [sbtest]> explain select count(*) from t1w t1, t2w t2 where t1.c=t2.c;
+----+-------------+-------+------------+-------+---------------+-------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------+---------+-------------+------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | index | idx_c | idx_c | 62 | NULL | 9830 | 100.00 | Using index |
| 1 | SIMPLE | t2 | NULL | ref | idx_c | idx_c | 62 | sbtest.t1.c | 1 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-------+---------+-------------+------+----------+-------------+
2 rows in set, 1 warning (0.06 sec)
# MySQL 5.7中执行计划为Nested loop join,
MySQL [sbtest]> select count(*) from t1w t1, t2w t2 where t1.c=t2.c;
+----------+
| count(*) |
+----------+
| 10002 |
+----------+
1 row in set (0.02 sec)
MySQL [sbtest]> alter table t1w drop index idx_c;
MySQL [sbtest]> alter table t2w drop index idx_c;
MySQL [sbtest]> select count(*) from t1w t1, t2w t2 where t1.c=t2.c;
+----------+
| count(*) |
+----------+
| 10002 |
+----------+
1 row in set (13.32 sec)
MySQL 8.0中,执行计划为hash join,1w条记录的两张表关联,执行耗时 0.01秒。
# MySQL 8.0中
MySQL [sbtest]> explain analyze select count(*) from t1w t1, t2w t2 where t1.c=t2.c;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Aggregate: count(0) (actual time=13.000..13.001 rows=1 loops=1)
-> Inner hash join (t2.c = t1.c) (cost=19650475.26 rows=10073) (actual time=4.991..12.469 rows=10002 loops=1)
-> Table scan on t2 (cost=0.00 rows=19507) (actual time=0.027..3.999 rows=20000 loops=1)
-> Hash
-> Table scan on t1 (cost=1031.55 rows=10073) (actual time=0.041..2.042 rows=10000 loops=1)
|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)
MySQL [sbtest]> select count(*) from t1w t1, t2w t2 where t1.c=t2.c;
+----------+
| count(*) |
+----------+
| 10002 |
+----------+
1 row in set (0.01 sec)
小结:通过对比MySQL 5.7和8.0的执行计划和执行耗时,对于两张1万条记录的表连接,hash join的表连接耗时0.01秒,连接列有索引的情况下nested loop join耗时0.02秒,连接列没有索引的情况下执行耗时13.32秒。可以看出,hash join算法即使在1万条记录的表连接场景下,执行效率比nested loop join要高很多。
测试场景:一张100w记录和一张1000w记录的测试表连接,谓词条件对应列上存在索引,连接列上也存在索引。
# 测试表t1m 和 t10m分别有100w和1000w条记录,并且谓词条件k上有索引,关联列c上有索引
CREATE TABLE `t1m` (
`id` int NOT NULL AUTO_INCREMENT,
`k` int NOT NULL DEFAULT '0',
`c` varchar(60) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `idx_k` (`k`),
KEY `idx_c` (`c`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
# 表关联查询SQL
select t1.c,t2.pad from t1m t1, t10m t2 where t1.c=t2.c and t1.k between 10000000 and 10100000 and t2.k between 10000000 and 10010000;
MySQL 8.0 对于表连接列上存在索引的情况下,默认会使用nested loop join连接,这时执行耗时为24.33秒。
# 连接列上存在索引时,执行计划为nested loop join连接
MySQL [sbtest]> explain analyze select t1.c,t2.pad from t1m t1, t10m t2 where t1.c=t2.c and t1.k between 10000000 and 10100000 and t2.k between 10000000 and 10010000;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Nested loop inner join (cost=8227.09 rows=1914) (actual time=24324.946..24324.946 rows=0 loops=1)
-> Index range scan on t2 using idx_k, with index condition: (t2.k between 10000000 and 10010000) (cost=4528.16 rows=10062) (actual time=0.785..78.070 rows=10062 loops=1)
-> Filter: (t1.k between 10000000 and 10100000) (cost=0.26 rows=0) (actual time=2.410..2.410 rows=0 loops=10062)
-> Index lookup on t1 using idx_c (c=t2.c), with index condition: (t1.c = t2.c) (cost=0.26 rows=1) (actual time=2.409..2.409 rows=0 loops=10062)
|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (24.33 sec)
MySQL 8.0 将连接列上的索引设置为隐藏索引,强制执行计划走hash join,先通过谓词条件对应列上的索引扫描返回结果集,然后对索引过滤后的结果集构建hash表进行hash join连接,这时执行耗时为0.38秒。
MySQL [sbtest]> alter table t10m alter index idx_c invisible;
MySQL [sbtest]> alter table t1m alter index idx_c invisible;
MySQL [sbtest]> flush tables;
MySQL [sbtest]> explain analyze select t1.c,t2.pad from t1m t1, t10m t2 where t1.c=t2.c and t1.k between 10000000 and 10100000 and t2.k between 10000000 and 10010000;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Inner hash join (t1.c = t2.c) (cost=80350110.88 rows=17855217) (actual time=377.007..377.007 rows=0 loops=1)
-> Index range scan on t1 using idx_k, with index condition: (t1.k between 10000000 and 10100000) (cost=62109.34 rows=177454) (actual time=0.440..297.338 rows=99712 loops=1)
-> Hash
-> Index range scan on t2 using idx_k, with index condition: (t2.k between 10000000 and 10010000) (cost=4528.16 rows=10062) (actual time=0.785..59.710 rows=10062 loops=1)
|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.38 sec)
小结:MySQL 8.0中hash join连接的同时也可以使用表上的谓词过滤条件对应列上的索引,并非hash join就不能走索引。并且,hash join连接的执行效率比nested loop join要高很多。
这个测试案例也反映了一个问题,就是MySQL 8.0中如果两张表的连接列上存在索引,那么优化器就会去选择nested loop join连接方式,从优化器的cost可以看出优化器认为这种场景下的hash join成本比nested loop join 高1w倍,但实际情况下,hash join的执行效率比nested loop join高64倍。MySQL 对于hash join的执行成本评估还存在巨大的问题。
Hash Join 作为MySQL 8.0 曾经最让人期待的新特性,终于还是与大家见面了,在一些场景下Hash Join能够显著地提升表连接性能。不过在MySQL 8.0中,如果表连接列存在索引,那么优化器就不会走到Hash Join的连接算法,并且相较于其他数据库的Hash Join实现也还有一些待优化的地方。我们期待未来MySQL能够持续对Hash Join算法进行优化。