Hazard Pointers:
================
There was previously one hazard pointer (hp) based scan in the buffer pool. With
this work we'll have four hp based scans.
We use the concept of a hazard pointer in a somewhat reversed sense.
In the academic definition, a thread declares the pointer it is working on as a
hazard pointer, and as long as that thread is not done no other thread is
allowed to do anything with the pointee.
In our case, we declare the pointer as a hazard pointer, and if any other
thread attempts to work on it, it is allowed to do so, but it must first adjust
the hazard pointer to the next valid value. We use hazard pointers solely for
reverse traversal of lists within a buffer pool instance.
Example:
For each buffer pool instance we'll have:
The last two types of iterators will be capped by LRU->old, i.e., if we reach
the end of the 'old' pages within the LRU list we'll reset to the tail of the LRU.
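The "reversed" hazard pointer idea above can be sketched as follows. This is a
minimal single-threaded model, not InnoDB's actual API: the names
``HazardPointer``, ``adjust`` and ``remove_block`` are illustrative. The
scanning thread publishes the block it will visit next; any thread about to
remove that very block must first advance the hazard pointer to the next valid
block in scan order (toward the head, since the traversal is in reverse).

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Illustrative stand-in for a buffer-pool list iterator with a hazard pointer.
struct HazardPointer {
    std::list<int>* lst = nullptr;
    std::list<int>::iterator pos;  // block the scanner will process next

    void set(std::list<int>* l, std::list<int>::iterator it) {
        lst = l;
        pos = it;
    }

    // Called by any thread about to remove 'it' from the list: if it is the
    // hazard-pointed block, move the hazard pointer to the next valid value.
    // The reverse scan moves tail -> head, so "next valid" is the element
    // closer to the head; end() here means "scan finished".
    void adjust(std::list<int>::iterator it) {
        if (lst != nullptr && pos == it) {
            pos = (pos == lst->begin()) ? lst->end() : std::prev(pos);
        }
    }
};

// A remover is allowed to take the block, but must adjust the hazard
// pointer first, so the scanner never dereferences a freed block.
void remove_block(std::list<int>& l, HazardPointer& hp,
                  std::list<int>::iterator it) {
    hp.adjust(it);
    l.erase(it);
}
```

In the real buffer pool the adjustment happens under ``buf_pool::mutex``; this
sketch only shows the pointer-adjustment contract.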
Page Eviction from LRU to free list:
====================================
We evict pages from the LRU list to the free list when:
The current scheme is quite involved. During an LRU batch, if we hit a page
that is good for eviction (i.e., it is clean) we evict it, putting it on the
free list. If, however, it is dirty, we flush it but do not put it on the free
list. This is because we release buf_pool::mutex during flushing and we do not
want to restart the scan for eviction once the flush of that page completes.
We simply leave the page to be evicted in the next iteration.
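The current behaviour can be modelled by this hedged sketch (``Page``,
``lru_batch_step`` and the flat vectors are stand-ins, not InnoDB structures):
a clean page is evicted to the free list immediately, while a dirty page is
only flushed and left on the LRU list for a later iteration.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a buffer page; 'dirty' means it needs a flush before eviction.
struct Page {
    int id;
    bool dirty;
};

// One iteration of the current LRU batch logic: evict clean pages, flush
// dirty ones but leave them on the LRU list for the next iteration.
void lru_batch_step(std::vector<Page>& lru, std::vector<int>& free_list) {
    std::vector<Page> remaining;
    for (Page p : lru) {
        if (!p.dirty) {
            free_list.push_back(p.id);  // clean: evict to free list now
        } else {
            p.dirty = false;            // dirty: flush it ...
            remaining.push_back(p);     // ... but leave eviction for later
        }
    }
    lru = remaining;
}
```

Note how a dirty page needs two iterations before it reaches the free list;
that rescan is exactly what the new scheme below removes.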
In the case of single page flushing we do:
In the new scheme of things we'll move the eviction to the IO completion
routine (typically run in an IO helper thread). This means we can do both
flushing and eviction in a single pass, for both LRU batch flushing and single
page flushing.
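The new scheme might look roughly like this sketch (again with illustrative
names; ``io_complete`` is called synchronously here, whereas in the server it
would run in an IO helper thread): the completion routine itself moves the
now-clean page to the free list, so one pass over the LRU tail handles clean
and dirty pages alike.

```cpp
#include <cassert>
#include <vector>

struct Page {
    int id;
    bool dirty;
};

struct Pool {
    std::vector<Page> lru;
    std::vector<int> free_list;

    // "IO completion routine": once the write finishes the page is clean
    // and is evicted to the free list immediately.
    void io_complete(Page p) {
        p.dirty = false;
        free_list.push_back(p.id);
    }

    // Single pass over the LRU tail: clean pages are evicted directly,
    // dirty pages are flushed and evicted by the completion routine.
    void lru_pass() {
        for (const Page& p : lru) {
            if (!p.dirty) {
                free_list.push_back(p.id);  // evict directly
            } else {
                io_complete(p);  // flush; eviction happens on completion
            }
        }
        lru.clear();
    }
};
```

The key point is that no page is left behind for a later rescan; every page
touched by the pass ends up on the free list once its IO (if any) completes.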
LRU batches in single pass:
===========================
Currently we divide LRU batches into smaller chunks of 100 pages. Why?
We do this because we want to go through quick iterations of flushing and
eviction as described above. This ensures that threads waiting for a clean
page don't have to wait very long in the case of a very large LRU batch, i.e.,
a high value of innodb_lru_scan_depth.
However, in the new scheme of things the eviction is done in the same pass as
the flushing. That means we can have an arbitrarily large LRU batch, and the
clean blocks will become available to the waiting threads as and when they are
flushed (because they are evicted in the IO completion routine).
Therefore, the logic to divide an LRU batch into smaller chunks is no longer
needed.
No wait to end LRU batch:
=========================
The threads looking for a victim don't need to wait for the LRU batch to end.
If they can find a page in the free list, that is fine; otherwise they can
retry, since pages become available as and when they are flushed. If that
doesn't help, they should try single page flushing.
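The retry logic above could be sketched as follows. ``get_free_page``,
``max_retries`` and the ``single_page_flush`` callback are assumptions for
illustration, not the server's real interface: the thread polls the free list
a bounded number of times (pages appear there as IO completions evict them),
then falls back to flushing a single page itself.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A thread needing a clean block polls the free list instead of waiting for
// the whole LRU batch to end; if retries are exhausted it falls back to a
// single page flush, modelled here as a callback producing one page id.
int get_free_page(std::vector<int>& free_list, int max_retries,
                  const std::function<int()>& single_page_flush) {
    for (int i = 0; i < max_retries; ++i) {
        if (!free_list.empty()) {
            int id = free_list.back();  // replenished as IO completes
            free_list.pop_back();
            return id;
        }
        // the real code would yield or sleep briefly between retries
    }
    return single_page_flush();  // last resort: free one page ourselves
}
```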
Other Changes:
==============
One other thing that showed up in perf profiles when working on a high-end
system is that when searching for a replaceable page we call
buf_LRU_free_page(). This function computes the fold value and grabs
page_hash::lock before checking whether or not the page is replaceable. It is
called very frequently, so we should only call it when
buf_flush_ready_for_replace() returns true.
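The fix amounts to gating the expensive call behind the cheap check. In this
sketch the bodies of both functions are simplified stand-ins (the real
buf_LRU_free_page() takes a buf_page_t and does far more); the point is only
the call-ordering, with a counter standing in for the fold computation and
page_hash::lock acquisition that we want to avoid.

```cpp
#include <cassert>

// Simplified stand-in for a buffer block: replaceable iff clean and unpinned.
struct Block {
    bool clean;
    bool pinned;
};

// Cheap check done under buf_pool::mutex only.
bool buf_flush_ready_for_replace(const Block& b) {
    return b.clean && !b.pinned;
}

int g_expensive_calls = 0;  // counts entries into the costly path

// Stand-in for the expensive path: in the real code this computes the fold
// value and grabs page_hash::lock before re-checking replaceability.
bool buf_LRU_free_page(Block& b) {
    ++g_expensive_calls;
    return buf_flush_ready_for_replace(b);
}

// The fix: take the expensive path only when the cheap check passes.
bool try_replace(Block& b) {
    return buf_flush_ready_for_replace(b) && buf_LRU_free_page(b);
}
```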
Changes based on Performance meeting discussions:
=================================================