xen-devel.lists.xenproject.org archive mirror
From: James Dingwall <james.dingwall@zynstra.com>
To: Bob Liu <bob.liu@oracle.com>
Cc: xen-devel@lists.xen.org
Subject: Re: Kernel 3.11 / 3.12 OOM killer and Xen ballooning
Date: Thu, 26 Dec 2013 08:42:23 +0000
Message-ID: <52BBEBEF.8040509@zynstra.com>
In-Reply-To: <52B3B6D7.50606@oracle.com>

Bob Liu wrote:
> On 12/20/2013 03:08 AM, James Dingwall wrote:
>> Bob Liu wrote:
>>> On 12/12/2013 12:30 AM, James Dingwall wrote:
>>>> Bob Liu wrote:
>>>>> On 12/10/2013 11:27 PM, Konrad Rzeszutek Wilk wrote:
>>>>>> On Tue, Dec 10, 2013 at 02:52:40PM +0000, James Dingwall wrote:
>>>>>>> Konrad Rzeszutek Wilk wrote:
>>>>>>>> On Mon, Dec 09, 2013 at 05:50:29PM +0000, James Dingwall wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Since 3.11 I have noticed that the OOM killer quite frequently
>>>>>>>>> triggers in my Xen guest domains, which use ballooning to
>>>>>>>>> increase/decrease their memory allocation according to their
>>>>>>>>> requirements.  One example domain I have has a maximum memory
>>>>>>>>> setting of ~1.5Gb but usually idles at ~300Mb; it is also
>>>>>>>>> configured with 2Gb of swap, which is almost 100% free.
>>>>>>>>>
>>>>>>>>> # free
>>>>>>>>>                 total       used       free     shared    buffers     cached
>>>>>>>>> Mem:        272080     248108      23972          0       1448      63064
>>>>>>>>> -/+ buffers/cache:     183596      88484
>>>>>>>>> Swap:      2097148          8    2097140
>>>>>>>>>
>>>>>>>>> There is plenty of free memory in the hypervisor to balloon up to
>>>>>>>>> the maximum size:
>>>>>>>>> # xl info | grep free_mem
>>>>>>>>> free_memory            : 14923
>>>>>>>>>
>>>>>>>>> An example trace (they are always the same) from the oom killer in
>>>>>>>>> 3.12 is added below.  So far I have not been able to reproduce this
>>>>>>>>> at will, so it is difficult to start bisecting to see whether a
>>>>>>>>> particular change introduced this.  However, it does seem that the
>>>>>>>>> behaviour is wrong because a) ballooning could give the guest more
>>>>>>>>> memory, and b) there is lots of swap available which could be used as a
>>>>>>>>> fallback.
>>>>>> Keep in mind that swap with tmem is not really swap any more. Heh, that
>>>>>> sounds odd - but basically pages that are destined for swap end up
>>>>>> going into the tmem code, which pipes them up to the hypervisor.
>>>>>>
>>>>>>>>> If other information could help or there are more tests that I
>>>>>>>>> could run then please let me know.
>>>>>>>> I presume you have enabled 'tmem' both in the hypervisor and in
>>>>>>>> the guest, right?
>>>>>>> Yes, domU and dom0 both have the tmem module loaded, and tmem
>>>>>>> tmem_dedup=on tmem_compress=on is given on the Xen command line.
>>>>>> Excellent. The odd thing is that your swap is not used that much, but
>>>>>> it should be (as that is part of what the self-balloon is supposed to
>>>>>> do).
>>>>>>
>>>>>> Bob, you had a patch for the logic of how self-balloon is supposed
>>>>>> to account for the slab - would this be relevant to this problem?
>>>>>>
>>>>> Perhaps; I have attached the patch.
>>>>> James, could you please apply it and try your application again? You
>>>>> have to rebuild the guest kernel.
>>>>> Oh, and also take a look at whether frontswap is in use; you can check
>>>>> it by watching "cat /sys/kernel/debug/frontswap/*".
>>>> I have tested this patch with a workload where I have previously seen
>>>> failures and so far so good.  I'll try to keep a guest with it stressed
>>>> to see if I do get any problems.  I don't know if it is expected but I
>>> By the way, besides the longer kswapd times, does this patch work well
>>> during your stress testing?
>>>
>>> Have you seen the OOM killer triggered quite frequently again? (with
>>> selfshrink=true)
>>>
>>> Thanks,
>>> -Bob
>> It was looking good until today (selfshrink=true).  The trace below is
>> from a compile of subversion; it looks like the memory has ballooned
>> to almost the maximum permissible, but even under pressure the swap disk
>> has hardly come into use.
>>
> So without selfshrink the swap disk does get used a lot?
>
> If that's the case, I'm afraid the frontswap-selfshrink in
> xen-selfballoon is doing something incorrect.
>
> Could you please try this patch, which makes the frontswap-selfshrink
> slower and adds a printk for debugging.
> Please still keep selfshrink=true in your test; it can be with or without
> my previous patch.
> Thanks a lot!
>
The OOM trace below was triggered during a compile of gcc.  I have the
full dmesg from boot, which shows all the printks; please let me know if
you would like to see that.
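
For reference, this is roughly how the frontswap counters and the balloon
target can be sampled alongside those printks.  The debugfs path is the one
Bob suggested earlier; the xen_memory sysfs path is an assumption based on my
reading of the xen-balloon sysfs code, so adjust it if your tree differs:

    # sample frontswap and balloon state every 30s (run as root in the guest)
    while sleep 30; do
        date
        grep . /sys/kernel/debug/frontswap/*                       # counter name:value pairs
        cat /sys/devices/system/xen_memory/xen_memory0/target_kb   # balloon target in kB
        free | grep -E '^(Mem|Swap)'
    done >> /var/log/balloon-trace.log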

James


[504372.929678] frontswap selfshrink 5424 pages
[504403.018185] frontswap selfshrink 5152 pages
[504433.124844] frontswap selfshrink 4894 pages
[504468.335358] frontswap selfshrink 12791 pages
[504498.536467] frontswap selfshrink 12152 pages
[504533.813484] frontswap selfshrink 19751 pages
[504589.067299] frontswap selfshrink 19043 pages
[504638.441894] cc1plus invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
[504638.441902] CPU: 1 PID: 21506 Comm: cc1plus Tainted: G W    3.12.5 #88
[504638.441905]  ffff88001ca406f8 ffff880002c0fa58 ffffffff8148f200 ffff88001f90e8e8
[504638.441909]  ffff88001ca401c0 ffff880002c0faf8 ffffffff8148ccf7 ffff880002c0faa8
[504638.441912]  ffffffff810f8d97 ffff880002c0fa88 ffffffff81006dc8 ffff880002c0fa98
[504638.441917] Call Trace:
[504638.441928]  [<ffffffff8148f200>] dump_stack+0x46/0x58
[504638.441932]  [<ffffffff8148ccf7>] dump_header.isra.9+0x6d/0x1cc
[504638.441938]  [<ffffffff810f8d97>] ? super_cache_count+0xa8/0xb8
[504638.441943]  [<ffffffff81006dc8>] ? xen_clocksource_read+0x20/0x22
[504638.441946]  [<ffffffff81006ea9>] ? xen_clocksource_get_cycles+0x9/0xb
[504638.441951]  [<ffffffff81494abe>] ? _raw_spin_unlock_irqrestore+0x47/0x62
[504638.441957]  [<ffffffff81296b27>] ? ___ratelimit+0xcb/0xe8
[504638.441962]  [<ffffffff810b2bbf>] oom_kill_process+0x70/0x2fd
[504638.441966]  [<ffffffff810bca0e>] ? zone_reclaimable+0x11/0x1e
[504638.441970]  [<ffffffff81048779>] ? has_ns_capability_noaudit+0x12/0x19
[504638.441973]  [<ffffffff81048792>] ? has_capability_noaudit+0x12/0x14
[504638.441976]  [<ffffffff810b32de>] out_of_memory+0x31b/0x34e
[504638.441981]  [<ffffffff810b7438>] __alloc_pages_nodemask+0x65b/0x792
[504638.441985]  [<ffffffff810e3da3>] alloc_pages_vma+0xd0/0x10c
[504638.441988]  [<ffffffff81003f69>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[504638.441993]  [<ffffffff810cf7cd>] handle_mm_fault+0x6d4/0xd54
[504638.441996]  [<ffffffff81006dc8>] ? xen_clocksource_read+0x20/0x22
[504638.441999]  [<ffffffff810115d2>] ? sched_clock+0x9/0xd
[504638.442005]  [<ffffffff8106772f>] ? sched_clock_local+0x12/0x75
[504638.442008]  [<ffffffff8106823b>] ? arch_vtime_task_switch+0x81/0x86
[504638.442013]  [<ffffffff81037f40>] __do_page_fault+0x3d8/0x437
[504638.442016]  [<ffffffff81062f1e>] ? finish_task_switch+0xe8/0x144
[504638.442018]  [<ffffffff810115d2>] ? sched_clock+0x9/0xd
[504638.442021]  [<ffffffff8106772f>] ? sched_clock_local+0x12/0x75
[504638.442025]  [<ffffffff810a45cc>] ? __acct_update_integrals+0xb4/0xbf
[504638.442028]  [<ffffffff810a493f>] ? acct_account_cputime+0x17/0x19
[504638.442030]  [<ffffffff81067c28>] ? account_user_time+0x67/0x92
[504638.442033]  [<ffffffff8106811b>] ? vtime_account_user+0x4d/0x52
[504638.442036]  [<ffffffff81037fd8>] do_page_fault+0x1a/0x5a
[504638.442041]  [<ffffffff810a065f>] ? rcu_user_enter+0xe/0x10
[504638.442044]  [<ffffffff81495158>] page_fault+0x28/0x30
[504638.442046] Mem-Info:
[504638.442048] Node 0 DMA per-cpu:
[504638.442050] CPU    0: hi:    0, btch:   1 usd:   0
[504638.442052] CPU    1: hi:    0, btch:   1 usd:   0
[504638.442053] Node 0 DMA32 per-cpu:
[504638.442055] CPU    0: hi:  186, btch:  31 usd:  26
[504638.442057] CPU    1: hi:  186, btch:  31 usd:  72
[504638.442058] Node 0 Normal per-cpu:
[504638.442060] CPU    0: hi:    0, btch:   1 usd:   0
[504638.442061] CPU    1: hi:    0, btch:   1 usd:   0
[504638.442067] active_anon:103684 inactive_anon:103733 isolated_anon:0
  active_file:10897 inactive_file:15059 isolated_file:0
  unevictable:0 dirty:1 writeback:0 unstable:0
  free:1164 slab_reclaimable:2356 slab_unreclaimable:3421
  mapped:4413 shmem:200 pagetables:2699 bounce:0
  free_cma:0 totalram:249264 balloontarget:315406
[504638.442069] Node 0 DMA free:1964kB min:88kB low:108kB high:132kB active_anon:4664kB inactive_anon:4736kB active_file:628kB inactive_file:1420kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB managed:15084kB mlocked:0kB dirty:0kB writeback:0kB mapped:228kB shmem:0kB slab_reclaimable:184kB slab_unreclaimable:324kB kernel_stack:48kB pagetables:256kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:21824 all_unreclaimable? yes
[504638.442078] lowmem_reserve[]: 0 469 469 469
[504638.442081] Node 0 DMA32 free:2692kB min:2728kB low:3408kB high:4092kB active_anon:175172kB inactive_anon:175184kB active_file:21244kB inactive_file:35340kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:507904kB managed:458288kB mlocked:0kB dirty:0kB writeback:0kB mapped:7764kB shmem:676kB slab_reclaimable:6628kB slab_unreclaimable:11496kB kernel_stack:1720kB pagetables:8444kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:613279 all_unreclaimable? yes
[504638.442088] lowmem_reserve[]: 0 0 0 0
[504638.442091] Node 0 Normal free:0kB min:0kB low:0kB high:0kB active_anon:234900kB inactive_anon:235012kB active_file:21716kB inactive_file:23476kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:523684kB mlocked:0kB dirty:4kB writeback:0kB mapped:9660kB shmem:124kB slab_reclaimable:2612kB slab_unreclaimable:1864kB kernel_stack:136kB pagetables:2096kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:773613 all_unreclaimable? yes
[504638.442098] lowmem_reserve[]: 0 0 0 0
[504638.442101] Node 0 DMA: 1*4kB (R) 3*8kB (R) 1*16kB (R) 0*32kB 0*64kB 1*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1964kB
[504638.442114] Node 0 DMA32: 673*4kB (UE) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2692kB
[504638.442123] Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[504638.442131] 22294 total pagecache pages
[504638.442133] 11197 pages in swap cache
[504638.442135] Swap cache stats: add 3449125, delete 3437928, find 590699/956067
[504638.442136] Free swap  = 1868108kB
[504638.442137] Total swap = 2097148kB
[504638.452335] 262143 pages RAM
[504638.452336] 6697 pages reserved
[504638.452337] 558286 pages shared
[504638.452338] 239987 pages non-shared
[504638.452340] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
<snip process list>
[504638.452596] Out of memory: Kill process 21506 (cc1plus) score 123 or sacrifice child
[504638.452598] Killed process 21506 (cc1plus) total-vm:543168kB, anon-rss:350300kB, file-rss:9520kB
[504659.367289] frontswap selfshrink 18428 pages
[504689.415694] frontswap selfshrink 479 pages
[504719.462401] frontswap selfshrink 456 pages
[504749.506876] frontswap selfshrink 434 pages
[504779.558204] frontswap selfshrink 406 pages
[504809.604425] frontswap selfshrink 386 pages
[504839.654849] frontswap selfshrink 367 pages
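
If it helps narrow things down, one comparison run could be done with
frontswap-selfshrinking switched off at runtime rather than via the boot
parameter, along these lines.  The selfballoon sysfs group and attribute
names are assumptions from my reading of drivers/xen/xen-selfballoon.c, so
they may need adjusting:

    # compare a gcc build with the frontswap selfshrink worker disabled
    SB=/sys/devices/system/xen_memory/xen_memory0/selfballoon
    cat $SB/frontswap_selfshrinking      # expect 1 while selfshrinking is active
    echo 0 > $SB/frontswap_selfshrinking # stop the selfshrink work item
    # ... rerun the build, watching swap and the frontswap counters ...
    echo 1 > $SB/frontswap_selfshrinking # re-enable afterwards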

Thread overview: 41+ messages
2013-12-09 17:50 Kernel 3.11 / 3.12 OOM killer and Xen ballooning James Dingwall
2013-12-09 21:48 ` Konrad Rzeszutek Wilk
2013-12-10 14:52   ` James Dingwall
2013-12-10 15:27     ` Konrad Rzeszutek Wilk
2013-12-11  7:22       ` Bob Liu
2013-12-11  9:25         ` James Dingwall
2013-12-11  9:54           ` Bob Liu
2013-12-11 10:16             ` James Dingwall
2013-12-11 16:30         ` James Dingwall
2013-12-12  1:03           ` Bob Liu
2013-12-13 16:59             ` James Dingwall
2013-12-17  6:11               ` Bob Liu
2013-12-18 12:04           ` Bob Liu
2013-12-19 19:08             ` James Dingwall
2013-12-20  3:17               ` Bob Liu
2013-12-20 12:22                 ` James Dingwall
2013-12-26  8:42                 ` James Dingwall [this message]
2014-01-02  6:25                   ` Bob Liu
2014-01-07  9:21                     ` James Dingwall
2014-01-09 10:48                       ` Bob Liu
2014-01-09 10:54                         ` James Dingwall
2014-01-09 11:04                         ` James Dingwall
2014-01-15  8:49                         ` James Dingwall
2014-01-15 14:41                           ` Bob Liu
2014-01-15 16:35                             ` James Dingwall
2014-01-16  1:22                               ` Bob Liu
2014-01-16 10:52                                 ` James Dingwall
2014-01-28 17:15                                 ` James Dingwall
2014-01-29 14:35                                   ` Bob Liu
2014-01-29 14:45                                     ` James Dingwall
2014-01-31 16:56                                       ` Konrad Rzeszutek Wilk
2014-02-03  9:49                                         ` Daniel Kiper
2014-02-03 10:30                                           ` Konrad Rzeszutek Wilk
2014-02-03 11:20                                           ` James Dingwall
2014-02-03 14:00                                             ` Daniel Kiper
2013-12-10  8:16 ` Jan Beulich
2013-12-10 14:01   ` James Dingwall
2013-12-10 14:25     ` Jan Beulich
2013-12-10 14:52       ` James Dingwall
2013-12-10 14:59         ` Jan Beulich
2013-12-10 15:16           ` James Dingwall
