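I have a query that runs slowly. I'm fairly sure the bottleneck is the sequential scans in the plan, so I'd like to build appropriate indexes and/or rearrange the query to improve this.
Here is my query (and here is a fiddle with a schema and test data):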
SELECT conversations.id, max(messages.timestamp) AS latest_message
FROM conversations
JOIN messages ON conversations.id = messages.cid
GROUP BY conversations.id
ORDER BY latest_message;
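I have built indexes on all the relevant columns, as well as two-column indexes on cid and timestamp in both column orders, but to no avail. The sequential scans remain: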
Sort  (cost=31.31..31.56 rows=100 width=12)
  Sort Key: (max(messages."timestamp"))
  ->  HashAggregate  (cost=26.99..27.99 rows=100 width=12)
        Group Key: conversations.id
        ->  Hash Join  (cost=3.25..21.99 rows=1000 width=12)
              Hash Cond: (messages.cid = conversations.id)
              ->  Seq Scan on messages  (cost=0.00..16.00 rows=1000 width=12)
              ->  Hash  (cost=2.00..2.00 rows=100 width=4)
                    ->  Seq Scan on conversations  (cost=0.00..2.00 rows=100 width=4)
How can I improve this query, and/or what indexes can I build to get rid of these sequential scans?
Posted on 2020-05-04 23:44:38
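You don't need the JOIN. Assuming every messages.cid refers to an existing conversation (for example, through a foreign key), the messages table alone contains everything the query returns: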
SELECT m.cid, max(m.timestamp) as latest_message
FROM messages m
GROUP BY m.cid
ORDER BY latest_message;
This should be able to use an index on messages(cid, timestamp desc). However, writing the query as:
SELECT DISTINCT ON (m.cid) m.*
FROM messages m
ORDER BY m.cid, m.timestamp DESC;
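would use the same index. Note that this form returns the entire latest message row for each conversation, ordered by cid rather than by the latest timestamp.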
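As a rough sketch of how to try this out: the statements below create the suggested index, check the plan PostgreSQL actually picks, and re-sort the DISTINCT ON result by latest timestamp so it matches the ordering of the original query. The index name messages_cid_timestamp_idx and the subquery alias latest are made up for illustration, and the sketch assumes timestamp is never NULL. Keep in mind that the plan in the question estimates only 1000 rows in messages; on a table that small the planner may reasonably keep choosing a sequential scan because reading the whole table is cheaper than going through an index.

```sql
-- Hypothetical index name; the column list is the one suggested in the answer.
CREATE INDEX messages_cid_timestamp_idx
    ON messages (cid, "timestamp" DESC);

-- Refresh planner statistics so the estimates reflect the current data.
ANALYZE messages;

-- Show the plan that is actually chosen, with real timings.
EXPLAIN (ANALYZE)
SELECT DISTINCT ON (m.cid) m.*
FROM messages m
ORDER BY m.cid, m."timestamp" DESC;

-- Same result shape as the original query: one row per conversation,
-- sorted by the latest message time.
-- Assumption: "timestamp" is NOT NULL. With DESC, PostgreSQL sorts NULLs
-- first by default, so a NULL timestamp would be picked by DISTINCT ON,
-- whereas max() ignores NULLs.
SELECT cid, "timestamp" AS latest_message
FROM (
    SELECT DISTINCT ON (m.cid) m.cid, m."timestamp"
    FROM messages m
    ORDER BY m.cid, m."timestamp" DESC
) AS latest
ORDER BY latest_message;

-- Diagnostic only, not for production: discourage sequential scans in this
-- session to see whether an index-based plan is possible at all.
SET enable_seqscan = off;
EXPLAIN
SELECT m.cid, max(m."timestamp") AS latest_message
FROM messages m
GROUP BY m.cid
ORDER BY latest_message;
RESET enable_seqscan;
```

If the index plan only appears with enable_seqscan turned off, the sequential scan is probably the cheaper choice at the current table size, and the comparison is worth repeating once the table holds a realistic amount of data.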
https://stackoverflow.com/questions/61596064