第62篇-学习并分析“一次巧妙的SQL优化改写”

老虎刘

发布于 2022-06-22 10:02:12

23400

代码可运行

文章被收录于专栏：老虎刘谈oracle性能优化老虎刘谈oracle性能优化

运行总次数：0

代码可运行

之前在手机上看到某公众号一篇标题为”一次巧妙的SQL优化改写“文章，深入浅出的分析了一个通过改写得到很大提升的案例（24分钟到1.86秒），非常不错。最近在电脑上重新分析这个案例时，又有新体会：

原SQL：

select count(*) 
from jbpm_testtest1 t
join jbpm_testtest2 pi
  on t.procinst_=Pi.id_
inner join ccform_testtest3 d
  on d.cf_id = pi.id_   
where ( t.end_= (select max(end_) 
                 from jbpm_testtest1 s
                 where s.procinst_=d.cf_id 
                       and d.cf_gdsta in ('K') 
                ) 
        or (t.end_ is null and t.isopen =1 )
       );

该文章最后给出的改写SQL：

with V as
(select procinst_,max(end_) as max_end_ 
 from jbpm_testtest1
 group by procinst_
)
select 
      sum(case when (t.end_ is null and t.isopen=1) or 
                      t.end_ = v.max_end_ 
                     then 1 else 0 end) cnt
from jbpm_testtest1 t
join   jbpm_testtest2 pi
  on t.procinst_=Pi.id_
inner join ccform_testtest3 d
  on d.cf_id = pi.id_
left join v on 
 (d.cf_id=v.procinst_ and d.cf_gdsta in ('K')) 
;

这个改写看起来比较巧妙的地方是select 部分的sum case when，利用了只求count，不返回具体行列内容的特点，直接计算做sum。起优化作用的还是把where部分的标量子查询改成外关联。

我认为这个“巧妙改写“写成下面形式可能更易读一些，这种写法从性能上没有什么损失，如果要返回具体的结果集，也方便修改：

with V as
(
 select procinst_,max(end_) as max_end_ 
 from jbpm_testtest1
 group by procinst_
)
select count(*)
from
( select t.procinst_,t.end_,t.isopen,d.cf_id,d.cf_gdsta,v.max_end_
    from jbpm_testtest1 t
    join jbpm_testtest2 pi
     on t.procinst_=Pi.id_
   inner join ccform_testtest3 d
     on d.cf_id = pi.id_
    left join v on 
     (d.cf_id=v.procinst_ and d.cf_gdsta in ('K')) 
) where (end_ is null and isopen=1) or (end_ = max_end_ )
;

如果我们再深入分析一下原的改写后的SQL，发现还有进一步提升效率的空间，即开头with部分的group by，其实是不需要的，去掉之后可以减少一次关联，下面是老虎刘给出的改写方法：

select count(*) 
from
( select  t.procinst_,t.end_ , t.isopen,d.cf_id,d.cf_gdsta,
        max(end_) over (partition by t.procinst_) as max_end_
from jbpm_testtest1 t
join jbpm_testtest2 pi
  on t.procinst_=Pi.id_
inner join ccform_testtest3 d
  on d.cf_id = pi.id_
) where (end_ is null and isopen=1) or (cf_id=procinst_ and end_=max_end_ and cf_gdsta in ('K')) 
;