From: James Dingwall <james.dingwall@zynstra.com>
To: Bob Liu <bob.liu@oracle.com>
Cc: xen-devel@lists.xen.org
Subject: Re: Kernel 3.11 / 3.12 OOM killer and Xen ballooning
Date: Thu, 26 Dec 2013 08:42:23 +0000
Message-ID: <52BBEBEF.8040509@zynstra.com>
In-Reply-To: <52B3B6D7.50606@oracle.com>

Bob Liu wrote:
> On 12/20/2013 03:08 AM, James Dingwall wrote:
>> Bob Liu wrote:
>>> On 12/12/2013 12:30 AM, James Dingwall wrote:
>>>> Bob Liu wrote:
>>>>> On 12/10/2013 11:27 PM, Konrad Rzeszutek Wilk wrote:
>>>>>> On Tue, Dec 10, 2013 at 02:52:40PM +0000, James Dingwall wrote:
>>>>>>> Konrad Rzeszutek Wilk wrote:
>>>>>>>> On Mon, Dec 09, 2013 at 05:50:29PM +0000, James Dingwall wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Since 3.11 I have noticed that the OOM killer quite frequently
>>>>>>>>> triggers in my Xen guest domains which use ballooning to
>>>>>>>>> increase/decrease their memory allocation according to their
>>>>>>>>> requirements.  One example domain I have has a maximum memory
>>>>>>>>> setting of ~1.5GB but it usually idles at ~300MB; it is also
>>>>>>>>> configured with 2GB swap, which is almost 100% free.
>>>>>>>>>
>>>>>>>>> # free
>>>>>>>>>                 total       used       free     shared    buffers     cached
>>>>>>>>> Mem:        272080     248108      23972          0 1448      63064
>>>>>>>>> -/+ buffers/cache:     183596      88484
>>>>>>>>> Swap:      2097148          8    2097140
>>>>>>>>>
>>>>>>>>> There is plenty of available free memory in the hypervisor to
>>>>>>>>> balloon to the maximum size:
>>>>>>>>> # xl info | grep free_mem
>>>>>>>>> free_memory            : 14923
>>>>>>>>>
>>>>>>>>> An example trace (they are always the same) from the oom killer in
>>>>>>>>> 3.12 is added below.  So far I have not been able to reproduce this
>>>>>>>>> at will so it is difficult to start bisecting it to see if a
>>>>>>>>> particular change introduced this.  However it does seem that the
>>>>>>>>> behaviour is wrong because a) ballooning could give the guest more
>>>>>>>>> memory, b) there is lots of swap available which could be used as a
>>>>>>>>> fallback.
>>>>>> Keep in mind that swap with tmem is actually no longer swap. Heh, that
>>>>>> sounds odd - but basically pages that are destined for swap end up
>>>>>> going into the tmem code, which pipes them up to the hypervisor.
>>>>>>
>>>>>>>>> If other information could help or there are more tests that I could
>>>>>>>>> run then please let me know.
>>>>>>>> I presume you have enabled 'tmem' both in the hypervisor and in
>>>>>>>> the guest right?
>>>>>>> Yes, domU and dom0 both have the tmem module loaded and  tmem
>>>>>>> tmem_dedup=on tmem_compress=on is given on the xen command line.
>>>>>> Excellent. The odd thing is that your swap is not used that much, but
>>>>>> it should be (as that is part of what the self-balloon is supposed to
>>>>>> do).
>>>>>>
>>>>>> Bob, you had a patch for the logic of how self-balloon is supposed
>>>>>> to account for the slab - would this be relevant to this problem?
>>>>>>
>>>>> Perhaps, I have attached the patch.
>>>>> James, could you please apply it and try your application again? You
>>>>> have to rebuild the guest kernel.
>>>>> Oh, and also take a look at whether frontswap is in use, you can check
>>>>> it by watching "cat /sys/kernel/debug/frontswap/*".
>>>> I have tested this patch with a workload where I have previously seen
>>>> failures and so far so good.  I'll try to keep a guest with it stressed
>>>> to see if I do get any problems.  I don't know if it is expected but I
>>> By the way, besides kswapd taking longer, does this patch work well
>>> during your stress testing?
>>>
>>> Have you seen the OOM killer triggered quite frequently again? (with
>>> selfshrink=true)
>>>
>>> Thanks,
>>> -Bob
>> It was looking good until today (selfshrink=true).  The trace below is
>> from a compile of subversion; it looks like the memory has ballooned
>> to almost the maximum permissible but even under pressure the swap disk
>> has hardly come into use.
>>
> So without selfshrink the swap disk can be used a lot?
>
> If that's the case, I'm afraid the frontswap-selfshrink in
> xen-selfballoon did something incorrect.
>
> Could you please try this patch, which makes the frontswap-selfshrink
> slower and adds a printk for debugging?
> Please still keep selfshrink=true in your test; you can run it with or
> without my previous patch.
> Thanks a lot!
>
The OOM trace below was triggered during a compile of gcc.  I have the
full dmesg from boot, which shows all the printks; please let me know if
you would like to see it.

James


[504372.929678] frontswap selfshrink 5424 pages
[504403.018185] frontswap selfshrink 5152 pages
[504433.124844] frontswap selfshrink 4894 pages
[504468.335358] frontswap selfshrink 12791 pages
[504498.536467] frontswap selfshrink 12152 pages
[504533.813484] frontswap selfshrink 19751 pages
[504589.067299] frontswap selfshrink 19043 pages
[504638.441894] cc1plus invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
[504638.441902] CPU: 1 PID: 21506 Comm: cc1plus Tainted: G W    3.12.5 #88
[504638.441905]  ffff88001ca406f8 ffff880002c0fa58 ffffffff8148f200 ffff88001f90e8e8
[504638.441909]  ffff88001ca401c0 ffff880002c0faf8 ffffffff8148ccf7 ffff880002c0faa8
[504638.441912]  ffffffff810f8d97 ffff880002c0fa88 ffffffff81006dc8 ffff880002c0fa98
[504638.441917] Call Trace:
[504638.441928]  [<ffffffff8148f200>] dump_stack+0x46/0x58
[504638.441932]  [<ffffffff8148ccf7>] dump_header.isra.9+0x6d/0x1cc
[504638.441938]  [<ffffffff810f8d97>] ? super_cache_count+0xa8/0xb8
[504638.441943]  [<ffffffff81006dc8>] ? xen_clocksource_read+0x20/0x22
[504638.441946]  [<ffffffff81006ea9>] ? xen_clocksource_get_cycles+0x9/0xb
[504638.441951]  [<ffffffff81494abe>] ? _raw_spin_unlock_irqrestore+0x47/0x62
[504638.441957]  [<ffffffff81296b27>] ? ___ratelimit+0xcb/0xe8
[504638.441962]  [<ffffffff810b2bbf>] oom_kill_process+0x70/0x2fd
[504638.441966]  [<ffffffff810bca0e>] ? zone_reclaimable+0x11/0x1e
[504638.441970]  [<ffffffff81048779>] ? has_ns_capability_noaudit+0x12/0x19
[504638.441973]  [<ffffffff81048792>] ? has_capability_noaudit+0x12/0x14
[504638.441976]  [<ffffffff810b32de>] out_of_memory+0x31b/0x34e
[504638.441981]  [<ffffffff810b7438>] __alloc_pages_nodemask+0x65b/0x792
[504638.441985]  [<ffffffff810e3da3>] alloc_pages_vma+0xd0/0x10c
[504638.441988]  [<ffffffff81003f69>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[504638.441993]  [<ffffffff810cf7cd>] handle_mm_fault+0x6d4/0xd54
[504638.441996]  [<ffffffff81006dc8>] ? xen_clocksource_read+0x20/0x22
[504638.441999]  [<ffffffff810115d2>] ? sched_clock+0x9/0xd
[504638.442005]  [<ffffffff8106772f>] ? sched_clock_local+0x12/0x75
[504638.442008]  [<ffffffff8106823b>] ? arch_vtime_task_switch+0x81/0x86
[504638.442013]  [<ffffffff81037f40>] __do_page_fault+0x3d8/0x437
[504638.442016]  [<ffffffff81062f1e>] ? finish_task_switch+0xe8/0x144
[504638.442018]  [<ffffffff810115d2>] ? sched_clock+0x9/0xd
[504638.442021]  [<ffffffff8106772f>] ? sched_clock_local+0x12/0x75
[504638.442025]  [<ffffffff810a45cc>] ? __acct_update_integrals+0xb4/0xbf
[504638.442028]  [<ffffffff810a493f>] ? acct_account_cputime+0x17/0x19
[504638.442030]  [<ffffffff81067c28>] ? account_user_time+0x67/0x92
[504638.442033]  [<ffffffff8106811b>] ? vtime_account_user+0x4d/0x52
[504638.442036]  [<ffffffff81037fd8>] do_page_fault+0x1a/0x5a
[504638.442041]  [<ffffffff810a065f>] ? rcu_user_enter+0xe/0x10
[504638.442044]  [<ffffffff81495158>] page_fault+0x28/0x30
[504638.442046] Mem-Info:
[504638.442048] Node 0 DMA per-cpu:
[504638.442050] CPU    0: hi:    0, btch:   1 usd:   0
[504638.442052] CPU    1: hi:    0, btch:   1 usd:   0
[504638.442053] Node 0 DMA32 per-cpu:
[504638.442055] CPU    0: hi:  186, btch:  31 usd:  26
[504638.442057] CPU    1: hi:  186, btch:  31 usd:  72
[504638.442058] Node 0 Normal per-cpu:
[504638.442060] CPU    0: hi:    0, btch:   1 usd:   0
[504638.442061] CPU    1: hi:    0, btch:   1 usd:   0
[504638.442067] active_anon:103684 inactive_anon:103733 isolated_anon:0
  active_file:10897 inactive_file:15059 isolated_file:0
  unevictable:0 dirty:1 writeback:0 unstable:0
  free:1164 slab_reclaimable:2356 slab_unreclaimable:3421
  mapped:4413 shmem:200 pagetables:2699 bounce:0
  free_cma:0 totalram:249264 balloontarget:315406
[504638.442069] Node 0 DMA free:1964kB min:88kB low:108kB high:132kB active_anon:4664kB inactive_anon:4736kB active_file:628kB inactive_file:1420kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB managed:15084kB mlocked:0kB dirty:0kB writeback:0kB mapped:228kB shmem:0kB slab_reclaimable:184kB slab_unreclaimable:324kB kernel_stack:48kB pagetables:256kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:21824 all_unreclaimable? yes
[504638.442078] lowmem_reserve[]: 0 469 469 469
[504638.442081] Node 0 DMA32 free:2692kB min:2728kB low:3408kB high:4092kB active_anon:175172kB inactive_anon:175184kB active_file:21244kB inactive_file:35340kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:507904kB managed:458288kB mlocked:0kB dirty:0kB writeback:0kB mapped:7764kB shmem:676kB slab_reclaimable:6628kB slab_unreclaimable:11496kB kernel_stack:1720kB pagetables:8444kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:613279 all_unreclaimable? yes
[504638.442088] lowmem_reserve[]: 0 0 0 0
[504638.442091] Node 0 Normal free:0kB min:0kB low:0kB high:0kB active_anon:234900kB inactive_anon:235012kB active_file:21716kB inactive_file:23476kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:523684kB mlocked:0kB dirty:4kB writeback:0kB mapped:9660kB shmem:124kB slab_reclaimable:2612kB slab_unreclaimable:1864kB kernel_stack:136kB pagetables:2096kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:773613 all_unreclaimable? yes
[504638.442098] lowmem_reserve[]: 0 0 0 0
[504638.442101] Node 0 DMA: 1*4kB (R) 3*8kB (R) 1*16kB (R) 0*32kB 0*64kB 1*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1964kB
[504638.442114] Node 0 DMA32: 673*4kB (UE) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2692kB
[504638.442123] Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[504638.442131] 22294 total pagecache pages
[504638.442133] 11197 pages in swap cache
[504638.442135] Swap cache stats: add 3449125, delete 3437928, find 590699/956067
[504638.442136] Free swap  = 1868108kB
[504638.442137] Total swap = 2097148kB
[504638.452335] 262143 pages RAM
[504638.452336] 6697 pages reserved
[504638.452337] 558286 pages shared
[504638.452338] 239987 pages non-shared
[504638.452340] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
<snip process list>
[504638.452596] Out of memory: Kill process 21506 (cc1plus) score 123 or sacrifice child
[504638.452598] Killed process 21506 (cc1plus) total-vm:543168kB, anon-rss:350300kB, file-rss:9520kB
[504659.367289] frontswap selfshrink 18428 pages
[504689.415694] frontswap selfshrink 479 pages
[504719.462401] frontswap selfshrink 456 pages
[504749.506876] frontswap selfshrink 434 pages
[504779.558204] frontswap selfshrink 406 pages
[504809.604425] frontswap selfshrink 386 pages
[504839.654849] frontswap selfshrink 367 pages
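
A log like the one above can be summarised mechanically. The following is a minimal Python sketch, assuming the line formats are exactly as they appear in this excerpt (the "frontswap selfshrink N pages" debug printk and the per-zone "Node N ZONE free:...kB min:...kB" lines). The function names selfshrink_events and zones_below_min are hypothetical helpers introduced only for illustration; they are not part of any kernel or Xen tooling.

```python
import re

# Assumed formats, copied from the dmesg excerpt in this message.
SELFSHRINK_RE = re.compile(r"\[\s*(\d+\.\d+)\] frontswap selfshrink (\d+) pages")
ZONE_RE = re.compile(r"Node (\d+) (\w+) free:(\d+)kB min:(\d+)kB")


def selfshrink_events(log):
    """Return (timestamp_seconds, pages) for each selfshrink debug line."""
    return [(float(ts), int(pages)) for ts, pages in SELFSHRINK_RE.findall(log)]


def zones_below_min(log):
    """Return the names of zones whose free memory is below the min watermark."""
    return [
        zone
        for _node, zone, free_kb, min_kb in ZONE_RE.findall(log)
        if int(free_kb) < int(min_kb)
    ]


# A few lines taken from the trace above (zone lines truncated for brevity).
sample = """\
[504589.067299] frontswap selfshrink 19043 pages
[504638.442069] Node 0 DMA free:1964kB min:88kB low:108kB high:132kB
[504638.442081] Node 0 DMA32 free:2692kB min:2728kB low:3408kB
[504638.442091] Node 0 Normal free:0kB min:0kB low:0kB high:0kB
[504659.367289] frontswap selfshrink 18428 pages
"""

if __name__ == "__main__":
    print(selfshrink_events(sample))
    print(zones_below_min(sample))
```

On the sample lines, the only zone reported is DMA32, where free (2692kB) is below the min watermark (2728kB). The same functions could be run over a full dmesg capture, for example the output of `dmesg` saved to a file.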
