From mboxrd@z Thu Jan 1 00:00:00 1970
From: James Dingwall
Subject: Re: Kernel 3.11 / 3.12 OOM killer and Xen ballooning
Date: Thu, 19 Dec 2013 19:08:47 +0000
Message-ID: <52B3443F.5060704@zynstra.com>
References: <52A602E5.3080300@zynstra.com>
 <20131209214816.GA3000@phenom.dumpdata.com>
 <52A72AB8.9060707@zynstra.com>
 <20131210152746.GF3184@phenom.dumpdata.com>
 <52A812B0.6060607@oracle.com>
 <52A89334.3090007@zynstra.com>
 <52B18F44.2030500@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <52B18F44.2030500@oracle.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Bob Liu
Cc: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

Bob Liu wrote:
> On 12/12/2013 12:30 AM, James Dingwall wrote:
>> Bob Liu wrote:
>>> On 12/10/2013 11:27 PM, Konrad Rzeszutek Wilk wrote:
>>>> On Tue, Dec 10, 2013 at 02:52:40PM +0000, James Dingwall wrote:
>>>>> Konrad Rzeszutek Wilk wrote:
>>>>>> On Mon, Dec 09, 2013 at 05:50:29PM +0000, James Dingwall wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Since 3.11 I have noticed that the OOM killer quite frequently
>>>>>>> triggers in my Xen guest domains which use ballooning to
>>>>>>> increase/decrease their memory allocation according to their
>>>>>>> requirements. One example domain I have has a maximum memory
>>>>>>> setting of ~1.5Gb but it usually idles at ~300Mb, it is also
>>>>>>> configured with 2Gb swap which is almost 100% free.
>>>>>>>
>>>>>>> # free
>>>>>>>              total       used       free     shared    buffers     cached
>>>>>>> Mem:        272080     248108      23972          0       1448      63064
>>>>>>> -/+ buffers/cache:     183596      88484
>>>>>>> Swap:      2097148          8    2097140
>>>>>>>
>>>>>>> There is plenty of available free memory in the hypervisor to
>>>>>>> balloon to the maximum size:
>>>>>>> # xl info | grep free_mem
>>>>>>> free_memory : 14923
>>>>>>>
>>>>>>> An example trace (they are always the same) from the oom killer in
>>>>>>> 3.12 is added below. So far I have not been able to reproduce this
>>>>>>> at will so it is difficult to start bisecting it to see if a
>>>>>>> particular change introduced this. However it does seem that the
>>>>>>> behaviour is wrong because a) ballooning could give the guest more
>>>>>>> memory, b) there is lots of swap available which could be used as a
>>>>>>> fallback.
>>>> Keep in mind that swap with tmem is actually no more swap. Heh, that
>>>> sounds odd -but basically pages that are destined for swap end up
>>>> going in the tmem code which pipes them up to the hypervisor.
>>>>
>>>>>>> If other information could help or there are more tests that I could
>>>>>>> run then please let me know.
>>>>>> I presume you have enabled 'tmem' both in the hypervisor and in
>>>>>> the guest right?
>>>>> Yes, domU and dom0 both have the tmem module loaded and tmem
>>>>> tmem_dedup=on tmem_compress=on is given on the xen command line.
>>>> Excellent. The odd thing is that your swap is not used that much, but
>>>> it should be (as that is part of what the self-balloon is suppose to
>>>> do).
>>>>
>>>> Bob, you had a patch for the logic of how self-balloon is suppose
>>>> to account for the slab - would this be relevant to this problem?
>>>>
>>> Perhaps, I have attached the patch.
>>> James, could you please apply it and try your application again? You
>>> have to rebuild the guest kernel.
>>> Oh, and also take a look at whether frontswap is in use, you can check
>>> it by watching "cat /sys/kernel/debug/frontswap/*".
>> I have tested this patch with a workload where I have previously seen
>> failures and so far so good. I'll try to keep a guest with it stressed
>> to see if I do get any problems. I don't know if it is expected but I
> By the way, besides longer time of kswapd, is this patch work well
> during your stress testing?
>
> Have you seen the OOM killer triggered quite frequently again?(with
> selfshrink=true)
>
> Thanks,
> -Bob

It was looking good until today (selfshrink=true). The trace below is
from a compile of subversion. It looks like the memory has ballooned to
almost the maximum permissible, but even under pressure the swap disk
has hardly come into use.

James

[76253.420363] javac invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
[76253.420371] CPU: 0 PID: 4995 Comm: javac Tainted: G W 3.12.5 #87
[76253.420374] ffff88001289fcb8 ffff880100ed1a58 ffffffff8148f1e0 ffff88001f80e8e8
[76253.420378] ffff88001289f780 ffff880100ed1af8 ffffffff8148ccd7 ffff880100ed1aa8
[76253.420381] ffffffff810f8d97 ffff880100ed1a88 ffffffff81006dc8 ffff880100ed1a98
[76253.420385] Call Trace:
[76253.420402] [] dump_stack+0x46/0x58
[76253.420406] [] dump_header.isra.9+0x6d/0x1cc
[76253.420412] [] ? super_cache_count+0xa8/0xb8
[76253.420417] [] ? xen_clocksource_read+0x20/0x22
[76253.420421] [] ? xen_clocksource_get_cycles+0x9/0xb
[76253.420426] [] ? _raw_spin_unlock_irqrestore+0x47/0x62
[76253.420432] [] ? ___ratelimit+0xcb/0xe8
[76253.420437] [] oom_kill_process+0x70/0x2fd
[76253.420441] [] ? zone_reclaimable+0x11/0x1e
[76253.420446] [] ? has_ns_capability_noaudit+0x12/0x19
[76253.420449] [] ? has_capability_noaudit+0x12/0x14
[76253.420452] [] out_of_memory+0x31b/0x34e
[76253.420456] [] __alloc_pages_nodemask+0x65b/0x792
[76253.420460] [] alloc_pages_vma+0xd0/0x10c
[76253.420464] [] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[76253.420468] [] handle_mm_fault+0x6d4/0xd54
[76253.420471] [] ? change_protection+0x4d7/0x66c
[76253.420474] [] ? error_exit+0x2a/0x60
[76253.420480] [] __do_page_fault+0x3d8/0x437
[76253.420483] [] ? xen_clocksource_read+0x20/0x22
[76253.420487] [] ? sched_clock+0x9/0xd
[76253.420493] [] ? sched_clock_local+0x12/0x75
[76253.420497] [] ? __acct_update_integrals+0xb4/0xbf
[76253.420499] [] ? acct_account_cputime+0x17/0x19
[76253.420503] [] ? account_user_time+0x67/0x92
[76253.420506] [] ? vtime_account_user+0x4d/0x52
[76253.420510] [] do_page_fault+0x1a/0x5a
[76253.420512] [] page_fault+0x28/0x30
[76253.420514] Mem-Info:
[76253.420515] Node 0 DMA per-cpu:
[76253.420518] CPU 0: hi: 0, btch: 1 usd: 0
[76253.420520] CPU 1: hi: 0, btch: 1 usd: 0
[76253.420521] Node 0 DMA32 per-cpu:
[76253.420523] CPU 0: hi: 186, btch: 31 usd: 137
[76253.420525] CPU 1: hi: 186, btch: 31 usd: 110
[76253.420526] Node 0 Normal per-cpu:
[76253.420528] CPU 0: hi: 0, btch: 1 usd: 0
[76253.420529] CPU 1: hi: 0, btch: 1 usd: 0
[76253.420535] active_anon:28274 inactive_anon:35997 isolated_anon:0
 active_file:10839 inactive_file:15340 isolated_file:0
 unevictable:0 dirty:0 writeback:2 unstable:0
 free:1142 slab_reclaimable:3001 slab_unreclaimable:3771
 mapped:3635 shmem:140 pagetables:2168 bounce:0
 free_cma:0 totalram:109276 balloontarget:107948
[76253.420537] Node 0 DMA free:1952kB min:88kB low:108kB high:132kB active_anon:976kB inactive_anon:1368kB active_file:692kB inactive_file:916kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB managed:7576kB mlocked:0kB dirty:0kB writeback:0kB mapped:196kB shmem:0kB slab_reclaimable:220kB slab_unreclaimable:304kB kernel_stack:112kB pagetables:64kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:6633 all_unreclaimable? yes
[76253.420546] lowmem_reserve[]: 0 469 469 469
[76253.420549] Node 0 DMA32 free:2616kB min:2728kB low:3408kB high:4092kB active_anon:31720kB inactive_anon:62120kB active_file:15304kB inactive_file:32692kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:507904kB managed:205076kB mlocked:0kB dirty:0kB writeback:0kB mapped:4784kB shmem:528kB slab_reclaimable:7908kB slab_unreclaimable:13148kB kernel_stack:1624kB pagetables:7500kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:215144 all_unreclaimable? yes
[76253.420557] lowmem_reserve[]: 0 0 0 0
[76253.420560] Node 0 Normal free:0kB min:0kB low:0kB high:0kB active_anon:80400kB inactive_anon:80500kB active_file:27360kB inactive_file:27752kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:224452kB mlocked:0kB dirty:0kB writeback:8kB mapped:9560kB shmem:32kB slab_reclaimable:3876kB slab_unreclaimable:1632kB kernel_stack:136kB pagetables:1108kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:324394 all_unreclaimable? yes
[76253.420567] lowmem_reserve[]: 0 0 0 0
[76253.420570] Node 0 DMA: 1*4kB (U) 2*8kB (R) 1*16kB (R) 2*32kB (R) 1*64kB (R) 0*128kB 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1956kB
[76253.420584] Node 0 DMA32: 626*4kB (UEM) 14*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2616kB
[76253.420594] Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[76253.420603] 11432 total pagecache pages
[76253.420605] 427 pages in swap cache
[76253.420606] Swap cache stats: add 97961, delete 97534, find 10712/12435
[76253.420607] Free swap = 2065332kB
[76253.420608] Total swap = 2097148kB
[76253.434007] 425983 pages RAM
[76253.434008] 170422 pages reserved
[76253.434009] 546002 pages shared
[76253.434010] 246013 pages non-shared
[76253.434243] Out of memory: Kill process 4989 (javac) score 36 or sacrifice child
[76253.434248] Killed process 4989 (javac) total-vm:1194836kB, anon-rss:79368kB, file-rss:9908kB
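
For anyone reading the per-zone free lists in the trace above, the following is a minimal Python sketch (not part of the original report) that re-adds each "count*size" entry and compares the sum with the total the kernel printed. The helper name summarize and the BUDDY_LINES list are illustrative assumptions; the three input strings are copied from the trace with the timestamps removed.

```python
import re

# Free-list lines copied from the OOM trace (timestamps stripped).
BUDDY_LINES = [
    "Node 0 DMA: 1*4kB (U) 2*8kB (R) 1*16kB (R) 2*32kB (R) 1*64kB (R) "
    "0*128kB 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1956kB",
    "Node 0 DMA32: 626*4kB (UEM) 14*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB "
    "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2616kB",
    "Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
    "0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB",
]

ENTRY = re.compile(r"(\d+)\*(\d+)kB")   # e.g. "626*4kB" -> ("626", "4")
TOTAL = re.compile(r"=\s*(\d+)kB\s*$")  # trailing "= 2616kB"


def summarize(line):
    """Return (zone, computed_kB, reported_kB, largest_free_block_kB)."""
    zone = line.split(":", 1)[0]
    blocks = [(int(count), int(size)) for count, size in ENTRY.findall(line)]
    computed = sum(count * size for count, size in blocks)
    reported = int(TOTAL.search(line).group(1))
    # Largest block size that has at least one free block, 0 if none.
    largest = max((size for count, size in blocks if count), default=0)
    return zone, computed, reported, largest


if __name__ == "__main__":
    for line in BUDDY_LINES:
        zone, computed, reported, largest = summarize(line)
        print(f"{zone}: computed={computed}kB reported={reported}kB "
              f"largest free block={largest}kB")
```

On these three lines the computed sums agree with the reported totals. The largest free block in DMA32 is 8kB, and the Normal zone has no free blocks at all.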