Linux 内核有个机制叫OOM killer(Out-Of-Memory killer),该机制会监控那些占用内存过大,尤其是瞬间很快消耗大量内存的进程,为了防止内存耗尽而内核会把该进程杀掉。
oom killer是计算出选择哪个进程kill呢?我们先来看一下kernel提供给用户态的/proc下的一些参数: /proc/[pid]/oom_adj ,该pid进程被oom killer杀掉的权重,介于 [-17,15]之间,越高的权重,意味着更可能被oom killer选中,-17表示禁止被kill掉。 /proc/[pid]/oom_score,当前该pid进程的被kill的分数,越高的分数意味着越可能被kill,这个数值是根据oom_adj运算后的结果,是oom_killer的主要参考。
sysctl 下有2个可配置选项:
触发oom killer时/var/log/message打印了进程的score:
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758297] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758311] [ 399] 0 399 2709 133 2 -17 -1000 udevd
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758314] [ 810] 0 810 2847 43 0 0 0 svscanboot
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758317] [ 824] 0 824 1039 21 0 0 0 svscan
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758320] [ 825] 0 825 993 17 1 0 0 readproctitle
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758322] [ 826] 0 826 996 16 0 0 0 supervise
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758325] [ 827] 0 827 996 17 0 0 0 supervise
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758327] [ 828] 0 828 996 16 0 0 0 supervise
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758330] [ 829] 0 829 996 17 2 0 0 supervise
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758333] [ 830] 0 830 6471 152 0 0 0 run
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.758335] [ 831] 99 831 1032 21 0 0 0 multilog
所以,如果想修改被oom killer选中的概率,禁止或者给oom_adj最小或偏小的值,也可以通过sysctl调节oom killer行为
grep "Out of memory" /var/log/messages