
PostgreSQL SQL Optimization, the Art-of-War Way: 140x Faster After Tuning

AustinDatabases
Published 2025-02-07 15:13:49

Today's realization: no step in life is wasted, and right or wrong, it all counts. After finishing something we often regret it: "if only I had chosen A back then, but I chose B," and the regret keeps replaying. Yet our eyes face forward. A choice made is made; a mistake made is made. Absorb the lesson and don't repeat it. At the time I too stood in the fog, just as lost, and even given another chance I would still have chosen B. Rather than regret, keep practicing --- a message from a real-world "Shen Gongbao".

SQL optimization is a core theme of a DBA's work. In this article I want to lay out the methods of SQL optimization and the problems in how programmers write SQL. If anything is imprecise, I welcome corrections from the experts.

A DBA should be clear about what SQL itself is and what it accomplishes:

1 SQL is the mandatory path for a program's data input and output.

2 Among the limited options a program has for fetching data, SQL is the most effective extraction mechanism.

3 Besides its essential data-retrieval role, SQL also performs data processing and computation.

4 SQL may implement part of a program's functionality, sometimes even a core part.

5 SQL is the hotspot of most programs and is critical to how well a program runs.

Once this is understood, the relationship between how well SQL runs and the program itself becomes clear.

Why can't SQL be written right the first time?

1 Program-generated SQL. When code is mechanically translated into SQL, the result obviously cannot be written in an optimized form. This is acceptable while the system is in its early running stage, but under high load these statements become the point where problems erupt.

2 Indexes built haphazardly. This is very common, and index quality determines how SQL runs: a good index helps SQL run fast, a bad one stands in its way. Typical haphazard patterns:

1 Creating one index per column.

2 Cramming many columns into a single index.

3 Ignoring the statement's access pattern and the order and usage of the queried columns when building the index.

4 Building duplicate indexes (Oracle rejects these; most other databases do not).

5 Letting useless indexes linger after the statements they served have been retired.

3 The business keeps changing, SQL grows from simple to complex, and programmers write SQL as if it were application code, hoping one statement can satisfy every requirement (this is the key source of problems).

Below we use one SQL statement and its execution plan to show how to quickly find the optimization points of a PostgreSQL query. First, the statement.

This statement has several problems.

1 The statement nests data sources, i.e. select * from (select * from ...), commonly called a subquery. A subquery is not a problem in itself; what matters is the complexity of the SQL inside it and the amount of data left after its filters. If the filtered data set is large, a temporary result set gets materialized, and overly complex SQL makes it harder for the planner to produce an accurate plan. That is why, when writing SQL, we prefer to express the relationships between tables with joins and obtain the unions, intersections, and other results from those relationships.
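As a minimal sketch of this point (the tables t_order and t_customer are hypothetical, not from the statement below), the same result can usually be written either as a nested data source or as a flat join, and the join form gives the planner more freedom to reorder and choose access paths:

```sql
-- Nested form: the inner SELECT may be materialized if it yields many rows.
SELECT o.order_id, o.amount
FROM (SELECT * FROM t_order WHERE status = 'PAID') AS o
WHERE o.customer_id IN (SELECT id FROM t_customer WHERE region = 'EU');

-- Equivalent join form (assuming t_customer.id is unique):
-- one flat query the planner can optimize as a whole.
SELECT o.order_id, o.amount
FROM t_order AS o
JOIN t_customer AS c ON c.id = o.customer_id
WHERE o.status = 'PAID'
  AND c.region = 'EU';
```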

SELECT
  ata.planId,
  ata.detail_di,
  ata.item_di,
  ata.item_code,
  ata.item_name,
  ata.item_pinyin,
  ata.unit_di,
  ata.unit_name,
  ata.enable_muti_size,
  ata.small_class_di,
  ata.small_class_code,
  ata.small_class_name,
  ata.lane_di,
  ata.lane_name,
  ata.standardTime,
  ata.warnTime,
  SUM(ata.maxQty) AS maxQty
FROM
  (
    SELECT
      kifd.itemfilter_di AS planId,
      kifd.detail_di,
      ai.di AS item_di,
      ai.code AS item_code,
      ai.NAME AS item_name,
      ai.pinyin AS item_pinyin,
      ai.unit_di,
      f_get_unit_name (ai.unit_di) AS unit_name,
      ai.enable_muti_size,
      aic2."di" AS small_class_di,
      aic2.code AS small_class_code,
      aic2.NAME AS small_class_name,
      kifd.lane_di,
      kifl.lane_name AS lane_name,
      ki.standard_time AS standardTime,
      ki.warn_time AS warnTime,
      bmcd.maxqty AS maxQty
    FROM
      kc_itemfilter AS kif
      INNER JOIN kc_itemfilter_detail AS kifd ON kif.di = kifd.itemfilter_di
      AND kifd.setting_mode = 0
      LEFT OUTER JOIN kc_itemfilter_lane AS kifl ON kifd.lane_di = kifl.di
      INNER JOIN _item AS ai ON ai.di = kifd.detail_di
      AND ai.delflg = 0
      AND ai.is_enable = TRUE
      AND ai.is_package = FALSE
      AND ai.create__di = '160539'
      INNER JOIN _item_class AS aic2 ON ai.small_class_di = aic2.di
      AND aic2.delflg = 0
      AND aic2.LEVEL = 1
      AND aic2.create__di = '160539'
      LEFT JOIN kc_item AS ki ON ai.di = ki.item_di
      AND ki.item_size_di = -1
      AND ki.delflg = 0
      AND ki.belong__di = '160540'
      LEFT JOIN bm_machine_combine_dishes AS bmcd ON ai.di = bmcd.item_di
      AND bmcd.size_di = -1
      AND bmcd.delflg = 0
      AND bmcd.create__di = ai.create__di
    WHERE
      kifd.itemfilter_di IN ('16053900000000001')
    UNION
    SELECT
      kifd.itemfilter_di AS planId,
      kifd.detail_di,
      ai.di AS item_di,
      ai.code AS item_code,
      ai.NAME AS item_name,
      ai.pinyin AS item_pinyin,
      ai.unit_di,
      f_get_unit_name (ai.unit_di) AS unit_name,
      ai.enable_muti_size,
      aic2."di" AS small_class_di,
      aic2.code AS small_class_code,
      aic2.NAME AS small_class_name,
      kifd.lane_di,
      kifl.lane_name AS lane_name,
      ki.standard_time AS standardTime,
      ki.warn_time AS warnTime,
      bmcd.maxqty AS maxQty
    FROM
      kc_itemfilter AS kif
      INNER JOIN kc_itemfilter_detail AS kifd ON kif.di = kifd.itemfilter_di
      AND kifd.setting_mode <> 0
      LEFT OUTER JOIN kc_itemfilter_lane AS kifl ON kifd.lane_di = kifl.di
      INNER JOIN _item AS ai ON ai.small_class_di = kifd.detail_di
      AND ai.delflg = 0
      AND ai.is_enable = TRUE
      AND ai.is_package = FALSE
      AND ai.create__di = '160539'
      INNER JOIN _item_class AS aic2 ON ai.small_class_di = aic2.di
      AND aic2.delflg = 0
      AND aic2.LEVEL = 1
      AND aic2.create__di = '160539'
      LEFT JOIN kc_item AS ki ON ai.di = ki.item_di
      AND ki.item_size_di = -1
      AND ki.delflg = 0
      AND ki.belong__di = '160540'
      LEFT JOIN bm_machine_combine_dishes AS bmcd ON ai.di = bmcd.item_di
      AND bmcd.size_di = -1
      AND bmcd.delflg = 0
      AND bmcd.create__di = ai.create__di
    WHERE
      kifd.itemfilter_di IN ('16053900000001')
  ) AS ata
GROUP BY
  ata.planId,
  ata.detail_di,
  ata.item_di,
  ata.item_code,
  ata.item_name,
  ata.item_pinyin,
  ata.unit_di,
  ata.unit_name,
  ata.enable_muti_size,
  ata.small_class_di,
  ata.small_class_code,
  ata.small_class_name,
  ata.lane_di,
  ata.lane_name,
  ata.standardTime,
  ata.warnTime
ORDER BY
  small_class_code
                                                                                                                                      QUERY PLAN                                                                                                                                       
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate  (cost=4296.87..4296.98 rows=2 width=861) (actual time=2067.727..2068.947 rows=1068 loops=1)
 Group Key: aic2.code, kifd.itemfilter_di, kifd.detail_di, ai.di, ai.code, ai.name, ai.pinyin, ai.unit_di, (f_get_unit_name(ai.unit_di, false)), ai.enable_muti_size, aic2.di, aic2.name, kifd.lane_di, kifl.lane_name, ki.standard_time, ki.warn_time
 ->  Sort  (cost=4296.87..4296.87 rows=2 width=857) (actual time=2067.714..2067.804 rows=1068 loops=1)
       Sort Key: aic2.code, kifd.itemfilter_di, kifd.detail_di, ai.di, ai.code, ai.name, ai.pinyin, ai.unit_di, (f_get_unit_name(ai.unit_di, false)), ai.enable_muti_size, aic2.di, aic2.name, kifd.lane_di, kifl.lane_name, ki.standard_time, ki.warn_time
       Sort Method: quicksort  Memory: 200kB
       ->  Unique  (cost=4296.75..4296.84 rows=2 width=857) (actual time=2065.780..2066.441 rows=1068 loops=1)
             ->  Sort  (cost=4296.75..4296.75 rows=2 width=857) (actual time=2065.778..2065.865 rows=1068 loops=1)
                   Sort Key: kifd.itemfilter_di, kifd.detail_di, ai.di, ai.code, ai.name, ai.pinyin, ai.unit_di, (f_get_unit_name(ai.unit_di, false)), ai.enable_muti_size, aic2.di, aic2.code, aic2.name, kifd.lane_di, kifl.lane_name, ki.standard_time, ki.warn_time, bmcd.maxqty
                   Sort Method: quicksort  Memory: 200kB
                   ->  Append  (cost=736.35..4296.74 rows=2 width=857) (actual time=3.874..2063.680 rows=1068 loops=1)
                         ->  Nested Loop Left Join  (cost=736.35..2148.02 rows=1 width=145) (actual time=3.873..2060.342 rows=1068 loops=1)
                               Join Filter: ((bmcd.create__di = ai.create__di) AND (ai.di = bmcd.item_di))
                               ->  Nested Loop Left Join  (cost=736.35..2051.28 rows=1 width=113) (actual time=2.937..1589.088 rows=1068 loops=1)
                                     Join Filter: (ai.di = ki.item_di)
                                     Rows Removed by Join Filter: 1167324
                                     ->  Nested Loop Left Join  (cost=18.37..697.16 rows=1 width=109) (actual time=0.621..15.154 rows=1068 loops=1)
                                           ->  Nested Loop  (cost=18.23..697.00 rows=1 width=101) (actual time=0.615..13.895 rows=1068 loops=1)
                                                 ->  Nested Loop  (cost=18.09..695.73 rows=1 width=101) (actual time=0.593..10.341 rows=1068 loops=1)
                                                       ->  Hash Join  (cost=17.80..692.98 rows=2 width=77) (actual time=0.075..3.503 rows=1088 loops=1)
                                                             Hash Cond: (ai.small_class_di = aic2.di)
                                                             ->  Index Scan using dix__item on _item ai  (cost=0.29..674.36 rows=425 width=57) (actual time=0.020..2.010 rows=1088 loops=1)
                                                                   Index Cond: (create__di = 160539)
                                                                   Filter: (is_enable AND (NOT is_package) AND (delflg = 0))
                                                                   Rows Removed by Filter: 7
                                                             ->  Hash  (cost=17.35..17.35 rows=13 width=28) (actual time=0.039..0.040 rows=18 loops=1)
                                                                   Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                                   ->  Index Scan using dix__item_class on _item_class aic2  (cost=0.28..17.35 rows=13 width=28) (actual time=0.012..0.032 rows=18 loops=1)
                                                                         Index Cond: (create__di = 160539)
                                                                         Filter: ((delflg = 0) AND (level = 1))
                                                                         Rows Removed by Filter: 14
                                                       ->  Index Scan using pk_kc_itemfilter_detail on kc_itemfilter_detail kifd  (cost=0.29..1.37 rows=1 width=24) (actual time=0.005..0.005 rows=1 loops=1088)
                                                             Index Cond: ((itemfilter_di = '16053900000000001'::bigint) AND (detail_di = ai.di))
                                                             Filter: (setting_mode = 0)
                                                 ->  Index Only Scan using pk_kc_itemfilter on kc_itemfilter kif  (cost=0.14..1.26 rows=1 width=8) (actual time=0.002..0.003 rows=1 loops=1068)
                                                       Index Cond: (di = '16053900000000001'::bigint)
                                                       Heap Fetches: 1068
                                           ->  Index Scan using pk_kc_itemfilter_lane on kc_itemfilter_lane kifl  (cost=0.14..0.16 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=1068)
                                                 Index Cond: (di = kifd.lane_di)
                                     ->  Bitmap Heap Scan on kc_item ki  (cost=717.98..1347.36 rows=541 width=12) (actual time=1.095..1.380 rows=1094 loops=1068)
                                           Recheck Cond: (belong__di = 160540)
                                           Filter: ((item_size_di = '-1'::integer) AND (delflg = 0))
                                           Rows Removed by Filter: 18
                                           Heap Blocks: exact=52332
                                           ->  Bitmap Index Scan on pk_kc_item  (cost=0.00..717.84 rows=1102 width=0) (actual time=1.086..1.086 rows=1150 loops=1068)
                                                 Index Cond: (belong__di = 160540)
                               ->  Seq Scan on bm_machine_combine_dishes bmcd  (cost=0.00..96.48 rows=1 width=16) (actual time=0.410..0.410 rows=0 loops=1068)
                                     Filter: ((size_di = '-1'::integer) AND (delflg = 0) AND (create__di = 160539))
                                     Rows Removed by Filter: 3012
                         ->  Nested Loop Left Join  (cost=736.35..2148.69 rows=1 width=145) (actual time=2.957..2.961 rows=0 loops=1)
                               Join Filter: ((bmcd_1.create__di = ai_1.create__di) AND (ai_1.di = bmcd_1.item_di))
                               ->  Nested Loop Left Join  (cost=736.35..2051.94 rows=1 width=113) (actual time=2.956..2.960 rows=0 loops=1)
                                     Join Filter: (ai_1.di = ki_1.item_di)
                                     ->  Nested Loop Left Join  (cost=18.37..697.82 rows=1 width=109) (actual time=2.955..2.959 rows=0 loops=1)
                                           ->  Nested Loop  (cost=18.23..697.23 rows=1 width=101) (actual time=2.955..2.958 rows=0 loops=1)
                                                 ->  Nested Loop  (cost=18.09..695.96 rows=1 width=101) (actual time=2.955..2.957 rows=0 loops=1)
                                                       ->  Hash Join  (cost=17.80..692.98 rows=2 width=85) (actual time=0.077..1.179 rows=1088 loops=1)
                                                             Hash Cond: (ai_1.small_class_di = aic2_1.di)
                                                             ->  Index Scan using dix__item on _item ai_1  (cost=0.29..674.36 rows=425 width=57) (actual time=0.020..0.788 rows=1088 loops=1)
                                                                   Index Cond: (create__di = 160539)
                                                                   Filter: (is_enable AND (NOT is_package) AND (delflg = 0))
                                                                   Rows Removed by Filter: 7
                                                             ->  Hash  (cost=17.35..17.35 rows=13 width=28) (actual time=0.042..0.043 rows=18 loops=1)
                                                                   Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                                   ->  Index Scan using dix__item_class on _item_class aic2_1  (cost=0.28..17.35 rows=13 width=28) (actual time=0.012..0.030 rows=18 loops=1)
                                                                         Index Cond: (create__di = 160539)
                                                                         Filter: ((delflg = 0) AND (level = 1))
                                                                         Rows Removed by Filter: 14
                                                       ->  Index Scan using pk_kc_itemfilter_detail on kc_itemfilter_detail kifd_1  (cost=0.29..1.37 rows=1 width=24) (actual time=0.001..0.001 rows=0 loops=1088)
                                                             Index Cond: ((itemfilter_di = '16053900000000001'::bigint) AND (detail_di = ai_1.small_class_di))
                                                             Filter: (setting_mode <> 0)
                                                 ->  Index Only Scan using pk_kc_itemfilter on kc_itemfilter kif_1  (cost=0.14..1.26 rows=1 width=8) (never executed)
                                                       Index Cond: (di = '16053900000000001'::bigint)
                                                       Heap Fetches: 0
                                           ->  Index Scan using pk_kc_itemfilter_lane on kc_itemfilter_lane kifl_1  (cost=0.14..0.57 rows=1 width=16) (never executed)
                                                 Index Cond: (di = kifd_1.lane_di)
                                     ->  Bitmap Heap Scan on kc_item ki_1  (cost=717.98..1347.36 rows=541 width=12) (never executed)
                                           Recheck Cond: (belong__di = 160540)
                                           Filter: ((item_size_di = '-1'::integer) AND (delflg = 0))
                                           ->  Bitmap Index Scan on pk_kc_item  (cost=0.00..717.84 rows=1102 width=0) (never executed)
                                                 Index Cond: (belong__di = 160540)
                               ->  Seq Scan on bm_machine_combine_dishes bmcd_1  (cost=0.00..96.48 rows=1 width=16) (never executed)
                                     Filter: ((size_di = '-1'::integer) AND (delflg = 0) AND (create__di = 160539))
                                     Filter: ((size_di = '-1'::integer) AND (delflg = 0) AND (create__di = 160539))
Planning Time: 7.995 ms

Faced with such a big pile of SQL and execution plan, anyone flinches at first. But to optimize SQL quickly we need a method; with one, 60%-80% of problems can be solved fast.

1 Identify the keywords.

GroupAggregate: in PostgreSQL this node performs aggregation, which indicates a GROUP BY or a similar grouping/aggregate operation.

Sort: the query result needs to be sorted.

Unique: there is a deduplication step (here, from the UNION).

Append: multiple result sets are being concatenated.

Nested Loop Join: for each row of the outer input, the inner input is scanned for matches; with no usable join condition this degenerates into a Cartesian product.

Bitmap Heap Scan: a bitmap built from one or more indexes drives a batched scan of the heap pages.

Index Scan: rows are located by traversing an index.

Seq Scan: a full table scan.

There is one more part I always watch when analyzing a PostgreSQL plan: Rows Removed by Filter.

1 If a large number of rows are removed, a lot of non-matching data was read and then discarded in that step. That usually points to a missing or wrong index, and it generates a lot of I/O.

2 Knowing which rows were removed tells you whether a matching index is missing and whether the join strategy is reasonable.

So here is my methodology, or framework, for optimization:

1 Look for Seq Scan and start there.

2 Look at the nodes where Rows Removed by Filter is large.

3 Find the places where cost suddenly jumps.
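In practice all three checks are read off a single EXPLAIN (ANALYZE) run; a sketch (the statement itself is a placeholder):

```sql
-- ANALYZE executes the query and reports actual row counts and timings;
-- BUFFERS adds block I/O, so heavy filtering shows up as extra reads.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the statement under investigation

-- Then scan the output top-down for:
--   "Seq Scan"                         -> candidate for a missing index
--   large "Rows Removed by Filter"     -> filter not covered by an index
--   a node whose cost or actual time
--   dwarfs its children                -> the hotspot
```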

Now let's apply this methodology to the SQL above and find some of its problems.

1 Look at Seq Scan first. In this execution plan, Seq Scan occurs in two places (both on bm_machine_combine_dishes; the second one was never executed).

Following the methodology: the first such table has no usable index. Let's analyze the table, decide whether an index is needed, and after adding it watch how the execution plan changes.
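The article does not show the exact index that was added; based on the Seq Scan's filter (size_di = -1 AND delflg = 0 AND create__di = 160539) and the join on item_di, a plausible candidate would be:

```sql
-- Hypothetical: reconstructed from the plan's filter and join columns,
-- not necessarily the index the author actually created.
CREATE INDEX CONCURRENTLY idx_bmcd_create_item
    ON bm_machine_combine_dishes (create__di, item_di)
    WHERE size_di = -1 AND delflg = 0;
```

The partial-index predicate keeps the index small by indexing only the rows this query can match.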

Adding an index for this part alone cut the runtime from over 2 seconds to 0.8 seconds, 60% of the optimization goal. Scanning the new execution plan, there is no Seq Scan left. Next step: find the parts where Rows Removed by Filter is large.

Then we go after the second item, Rows Removed by Filter. I quickly located the biggest filtering offender (the kc_item join, whose join filter discarded over 1.16 million rows) and followed the thread straight to the source.

Examining the indexes for that part confirmed that one was indeed missing. After my second index, the query was 59 times faster than the original overall.
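Again the article does not show the index; from the plan, kc_item is probed 1068 times and each probe bitmap-scans roughly 1100 rows that the join filter then throws away. An index letting the nested loop seek directly on the join key would be a plausible fix:

```sql
-- Hypothetical: lets LEFT JOIN kc_item AS ki ON ai.di = ki.item_di probe
-- by item_di instead of re-reading every row with belong__di = 160540.
CREATE INDEX CONCURRENTLY idx_kc_item_item_belong
    ON kc_item (item_di, belong__di)
    WHERE item_size_di = -1 AND delflg = 0;
```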

At this point we could keep optimizing, but SQL tuning is like life: don't chase quick wins too hard. Too many indexes will not necessarily help other operations. The SQL is now down to 14 ms; what more could you want? Time to call it a day.

With this method we DBAs can quickly tune even nasty PostgreSQL SQL and solve more than 80% of SQL problems. The art-of-war summary: seize the key points, follow the thread, strike the vitals, fix the critical problem:

1 Check for full table scans.

2 Check the number of filtered-out rows.

3 Watch for places where cost suddenly jumps.

Of course SQL optimization is not really this simple, but simple methods solve 80% of the problems, and solving the problem is what ultimately matters. Techniques such as SQL rewriting, logic changes, predicate substitution, pushing conditions earlier, or installing an extension to get SQL hints in PostgreSQL are, in most situations, not the everyday tools; in some cases you must understand the business and get deep into it before you can bring out the sharper tricks.
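For reference, the hint-extension approach mentioned above usually means pg_hint_plan, which reads planner hints from a leading comment; a sketch (the tables and index name are hypothetical):

```sql
LOAD 'pg_hint_plan';

/*+ IndexScan(o idx_order_customer) NestLoop(o c) */
SELECT o.order_id
FROM t_order AS o
JOIN t_customer AS c ON c.id = o.customer_id
WHERE c.region = 'EU';
```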

Originally published 2025-02-06 on the AustinDatabases WeChat official account; shared via the Tencent Cloud self-media sync program.