之前在手机上看到某公众号一篇标题为”一次巧妙的SQL优化改写“文章,深入浅出的分析了一个通过改写得到很大提升的案例(24分钟到1.86秒),非常不错。最近在电脑上重新分析这个案例时,又有新体会:
原SQL:
select count(*)
from jbpm_testtest1 t
join jbpm_testtest2 pi
on t.procinst_=Pi.id_
inner join ccform_testtest3 d
on d.cf_id = pi.id_
where ( t.end_= (select max(end_)
from jbpm_testtest1 s
where s.procinst_=d.cf_id
and d.cf_gdsta in ('K')
)
or (t.end_ is null and t.isopen =1 )
);
该文章最后给出的改写SQL:
with V as
(select procinst_,max(end_) as max_end_
from jbpm_testtest1
group by procinst_
)
select
sum(case when (t.end_ is null and t.isopen=1) or
t.end_ = v.max_end_
then 1 else 0 end) cnt
from jbpm_testtest1 t
join jbpm_testtest2 pi
on t.procinst_=Pi.id_
inner join ccform_testtest3 d
on d.cf_id = pi.id_
left join v on
(d.cf_id=v.procinst_ and d.cf_gdsta in ('K'))
;
这个改写看起来比较巧妙的地方是select 部分的sum case when,利用了只求count,不返回具体行列内容的特点,直接计算做sum。起优化作用的还是把where部分的标量子查询改成外关联。
我认为这个“巧妙改写“写成下面形式可能更易读一些,这种写法从性能上没有什么损失,如果要返回具体的结果集,也方便修改:
with V as
(
select procinst_,max(end_) as max_end_
from jbpm_testtest1
group by procinst_
)
select count(*)
from
( select t.procinst_,t.end_,t.isopen,d.cf_id,d.cf_gdsta,v.max_end_
from jbpm_testtest1 t
join jbpm_testtest2 pi
on t.procinst_=Pi.id_
inner join ccform_testtest3 d
on d.cf_id = pi.id_
left join v on
(d.cf_id=v.procinst_ and d.cf_gdsta in ('K'))
) where (end_ is null and isopen=1) or (end_ = max_end_ )
;
如果我们再深入分析一下原的改写后的SQL,发现还有进一步提升效率的空间,即开头with部分的group by,其实是不需要的,去掉之后可以减少一次关联,下面是老虎刘给出的改写方法:
select count(*)
from
( select t.procinst_,t.end_ , t.isopen,d.cf_id,d.cf_gdsta,
max(end_) over (partition by t.procinst_) as max_end_
from jbpm_testtest1 t
join jbpm_testtest2 pi
on t.procinst_=Pi.id_
inner join ccform_testtest3 d
on d.cf_id = pi.id_
) where (end_ is null and isopen=1) or (cf_id=procinst_ and end_=max_end_ and cf_gdsta in ('K'))
;
不妥之处,欢迎指正。
(完)
本文分享自 老虎刘谈oracle性能优化 微信公众号,前往查看
如有侵权,请联系 cloudcommunity@tencent.com 删除。
本文参与 腾讯云自媒体同步曝光计划 ,欢迎热爱写作的你一起参与!