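People often claim that LIMIT hurts SQL query performance. In fact, a LIMIT clause by itself does not slow a query down; if it has any effect, it is a positive one. In particular, a LIMIT inside a subquery caps the size of the intermediate result set and thereby reduces the amount of data that later steps have to process. This article discusses how to push down the LIMIT clause as an optimization.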
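Like predicate pushdown, LIMIT clause pushdown moves the LIMIT clause as far down the query as possible so that rows are discarded early. This shrinks the intermediate result sets, reduces the data that subsequent computation has to handle, and so improves query performance.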
In the following example, the outer query has a LIMIT clause that can be pushed down into the inner queries:
select *
from (select c_nationkey nation, 'C' as type, count(1) num
      from customer
      group by c_nationkey
      union
      select s_nationkey nation, 'S' as type, count(1) num
      from supplier
      group by nation) as nation_s
order by nation limit 20, 10
The rewritten SQL is as follows:
select *
from (
      (select customer.c_nationkey as nation, 'C' as `type`, count(1) as num
       from customer
       group by customer.c_nationkey
       order by customer.c_nationkey limit 30)
      union
      (select supplier.s_nationkey as nation, 'S' as `type`, count(1) as num
       from supplier
       group by supplier.s_nationkey
       order by supplier.s_nationkey limit 30)) as nation_s
order by nation_s.nation limit 20, 10
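The optimized execution plan shows that a new LIMIT node has been added to each of the two subqueries before the UNION operation. Each node caps its intermediate result at 30 rows (offset + limit, that is, 20 + 10), which improves performance for both the upstream and downstream nodes.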
LIMIT clause pushdown alone cut the total execution time from 176.93ms to 3.54ms, a 4898.02% improvement in overall performance (about 50 times faster).
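To make the offset + limit rule concrete, here is a minimal sketch that checks the rewrite on toy data using Python's built-in sqlite3 module. It is only an illustration under several assumptions, not how PawSQL implements the optimization: the table contents are made up, and the helper names (`build_db`, `original_sql`, `pushed_down_sql`, `run`) are hypothetical. SQLite does not accept parenthesized SELECTs inside a UNION, so each branch is wrapped in `select * from (...)` instead. The outer query also sorts by `type` as a tiebreaker so that the two result sets can be compared row by row. The pushdown is safe here because each branch's GROUP BY already yields distinct rows, so every row in the first offset + limit rows of the union must also be in the first offset + limit rows of its own branch.

```python
import sqlite3


def build_db(n_nations=50, rows_per_table=2000):
    """Create an in-memory database with toy customer/supplier tables (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table customer (c_custkey integer primary key, c_nationkey integer)")
    conn.execute("create table supplier (s_suppkey integer primary key, s_nationkey integer)")
    conn.executemany(
        "insert into customer values (?, ?)",
        [(i, (i * 7) % n_nations) for i in range(rows_per_table)],
    )
    conn.executemany(
        "insert into supplier values (?, ?)",
        [(i, (i * 11) % n_nations) for i in range(rows_per_table)],
    )
    return conn


def original_sql(offset, count):
    """Outer LIMIT only: both branches are fully aggregated before the UNION."""
    return f"""
        select *
        from (select c_nationkey as nation, 'C' as type, count(1) as num
              from customer
              group by c_nationkey
              union
              select s_nationkey as nation, 'S' as type, count(1) as num
              from supplier
              group by s_nationkey) as nation_s
        order by nation, type limit {int(offset)}, {int(count)}
    """


def pushed_down_sql(offset, count):
    """Same query with ORDER BY ... LIMIT (offset + count) pushed into each branch."""
    inner_limit = int(offset) + int(count)  # a branch must keep offset + count rows
    return f"""
        select *
        from (select * from (select c_nationkey as nation, 'C' as type, count(1) as num
                             from customer
                             group by c_nationkey
                             order by c_nationkey limit {inner_limit})
              union
              select * from (select s_nationkey as nation, 'S' as type, count(1) as num
                             from supplier
                             group by s_nationkey
                             order by s_nationkey limit {inner_limit})) as nation_s
        order by nation, type limit {int(offset)}, {int(count)}
    """


def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_db()
    before = run(conn, original_sql(20, 10))
    after = run(conn, pushed_down_sql(20, 10))
    print("same rows:", before == after, "| row count:", len(after))
```

Note that the inner limit is the sum of the outer offset and limit, not the limit alone: pushing only `limit 10` into the branches would drop rows that the outer `limit 20, 10` still needs.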
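PawSQL enables LIMIT clause pushdown by default for all databases.
The execution plan visualization tool used in this article is PawSQL Explain Visualizer, which supports MySQL, PostgreSQL, openGauss, and other databases.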
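PawSQL focuses on automated and intelligent database performance optimization and supports MySQL, PostgreSQL, openGauss, and other databases. The SQL optimization products it provides include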