From mboxrd@z Thu Jan 1 00:00:00 1970
From: James Dingwall
Subject: Re: Kernel 3.11 / 3.12 OOM killer and Xen ballooning
Date: Thu, 9 Jan 2014 11:04:58 +0000
Message-ID: <52CE825A.9090006@zynstra.com>
References: <52A602E5.3080300@zynstra.com>
 <20131209214816.GA3000@phenom.dumpdata.com>
 <52A72AB8.9060707@zynstra.com>
 <20131210152746.GF3184@phenom.dumpdata.com>
 <52A812B0.6060607@oracle.com>
 <52A89334.3090007@zynstra.com>
 <52B18F44.2030500@oracle.com>
 <52B3443F.5060704@zynstra.com>
 <52B3B6D7.50606@oracle.com>
 <52BBEBEF.8040509@zynstra.com>
 <52C50661.7060900@oracle.com>
 <52CBC700.1060602@zynstra.com>
 <52CE7E67.5080708@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <52CE7E67.5080708@oracle.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Bob Liu
Cc: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

Bob Liu wrote:
> On 01/07/2014 05:21 PM, James Dingwall wrote:
>> Bob Liu wrote:
>>> Could you confirm that this problem doesn't exist if loading tmem with
>>> selfshrinking=0 during compile gcc? It seems that you are compiling
>>> different packages during your testing.
>>> This will help to figure out whether selfshrinking is the root cause.
>> Got an oom with selfshrinking=0, again during a gcc compile.
>> Unfortunately I don't have a single test case which demonstrates the
>> problem but as I mentioned before it will generally show up under
>> compiles of large packages such as glibc, kdelibs, gcc etc.
>>
> So the root cause is not because enabled selfshrinking.
> Then what I can think of is that the xen-selfballoon driver was too
> aggressive, too many pages were ballooned out which caused heavy memory
> pressure to guest OS.
> And kswapd started to reclaim page until most of pages were
> unreclaimable(all_unreclaimable=yes for all zones), then OOM Killer was
> triggered.
> In theory the balloon driver should give back ballooned out pages to
> guest OS, but I'm afraid this procedure is not fast enough.
>
> My suggestion is reserve a min memory for your guest OS so that the
> xen-selfballoon won't be so aggressive.
> You can do it through parameters selfballoon_reserved_mb or
> selfballoon_min_usable_mb.
>
>> I don't know if this is a separate or related issue but over the
>> holidays I also had a problem with six of the guests on my system where
>> kswapd was running at 100% and had clocked up >9000 minutes of cpu time
>> even though there was otherwise no load on them. Of the guests I
>> restarted yesterday in this state two have already got in to the same
>> state again, they are running a kernel with the first patch that you sent.

As soon as I echoed 32 to both of these (originally 0):

/sys/devices/system/xen_memory/xen_memory0/selfballoon/selfballoon_reserved_mb
/sys/devices/system/xen_memory/xen_memory0/selfballoon/selfballoon_min_usable_mb

the kswapd process stopped running at 100%. Unfortunately I didn't
check between the two commands to see if one by itself made a
difference, but I'll look for that next time.

> Could you get the meminfo in guest OS at that time?
After

> cat /proc/meminfo
MemTotal: 397028 kB
MemFree: 163756 kB
Buffers: 1260 kB
Cached: 129284 kB
SwapCached: 132 kB
Active: 22664 kB
Inactive: 159576 kB
Active(anon): 8004 kB
Inactive(anon): 44412 kB
Active(file): 14660 kB
Inactive(file): 115164 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 2097148 kB
SwapFree: 2096896 kB
Dirty: 20 kB
Writeback: 0 kB
AnonPages: 51640 kB
Mapped: 14136 kB
Shmem: 720 kB
Slab: 19492 kB
SReclaimable: 7692 kB
SUnreclaim: 11800 kB
KernelStack: 1816 kB
PageTables: 7928 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 2295660 kB
Committed_AS: 338552 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 9020 kB
VmallocChunk: 34359716408 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
DirectMap4k: 1048576 kB
DirectMap2M: 0 kB

> cat /proc/vmstat
nr_free_pages 40916
nr_alloc_batch 0
nr_inactive_anon 11102
nr_active_anon 2009
nr_inactive_file 28791
nr_active_file 3665
nr_unevictable 0
nr_mlock 0
nr_anon_pages 12904
nr_mapped 3534
nr_file_pages 32669
nr_dirty 5
nr_writeback 0
nr_slab_reclaimable 1923
nr_slab_unreclaimable 2945
nr_page_table_pages 1982
nr_kernel_stack 227
nr_unstable 0
nr_bounce 0
nr_vmscan_write 781891
nr_vmscan_immediate_reclaim 6245
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 180
nr_dirtied 86609
nr_written 861010
numa_hit 8353372
numa_miss 0
numa_foreign 0
numa_interleave 0
numa_local 8353372
numa_other 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
nr_dirty_threshold 16991
nr_dirty_background_threshold 8495
pgpgin 2044575
pgpgout 645866
pswpin 123
pswpout 153
pgalloc_dma 164944
pgalloc_dma32 7347917
pgalloc_normal 1032559
pgalloc_movable 0
pgfree 8586607
pgactivate 2012718
pgdeactivate 2276721
pgfault 7295414
pgmajfault 345301
pgrefill_dma 55271
pgrefill_dma32 2263007
pgrefill_normal 1771
pgrefill_movable 0
pgsteal_kswapd_dma 44880
pgsteal_kswapd_dma32 2587500
pgsteal_kswapd_normal 0
pgsteal_kswapd_movable 0
pgsteal_direct_dma 0
pgsteal_direct_dma32 37
pgsteal_direct_normal 0
pgsteal_direct_movable 0
pgscan_kswapd_dma 204749
pgscan_kswapd_dma32 4477230
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 39
pgscan_direct_normal 0
pgscan_direct_movable 0
pgscan_direct_throttle 0
zone_reclaim_failed 0
pginodesteal 0
slabs_scanned 2720128
kswapd_inodesteal 41065
kswapd_low_wmark_hit_quickly 14897
kswapd_high_wmark_hit_quickly 116697740
pageoutrun 116717997
allocstall 1
pgrotated 8497
numa_pte_updates 0
numa_huge_pte_updates 0
numa_hint_faults 0
numa_hint_faults_local 0
numa_pages_migrated 0
pgmigrate_success 0
pgmigrate_fail 0
compact_migrate_scanned 0
compact_free_scanned 0
compact_isolated 0
compact_stall 0
compact_fail 0
compact_success 0
unevictable_pgs_culled 29365
unevictable_pgs_scanned 0
unevictable_pgs_rescued 29145
unevictable_pgs_mlocked 29550
unevictable_pgs_munlocked 29550
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
thp_fault_alloc 0
thp_fault_fallback 0
thp_collapse_alloc 0
thp_collapse_alloc_failed 0
thp_split 0
thp_zero_page_alloc 0
thp_zero_page_alloc_failed 0
nr_tlb_remote_flush 10780
nr_tlb_remote_flush_received 21564
nr_tlb_local_flush_all 66247
nr_tlb_local_flush_one 1446496

>
> Thanks,
> -Bob
>
>> /sys/module/tmem/parameters/cleancache Y
>> /sys/module/tmem/parameters/frontswap Y
>> /sys/module/tmem/parameters/selfballooning Y
>> /sys/module/tmem/parameters/selfshrinking N
>>
>> James
>>
>> [ 8212.940520] cc1plus invoked oom-killer: gfp_mask=0x200da, order=0,
>> oom_score_adj=0
>> [ 8212.940529] CPU: 1 PID: 23678 Comm: cc1plus Tainted: G W 3.12.5 #88
>> [ 8212.940532] ffff88001e38cdf8 ffff88000094f968 ffffffff8148f200
>> ffff88001f90e8e8
>> [ 8212.940536] ffff88001e38c8c0 ffff88000094fa08 ffffffff8148ccf7
>> ffff88000094f9b8
>> [ 8212.940538] ffffffff810f8d97 ffff88000094f998 ffffffff81006dc8
>> ffff88000094f9a8
>> [ 8212.940542] Call Trace:
>> [ 8212.940554] [] dump_stack+0x46/0x58
>> [ 8212.940558] [] dump_header.isra.9+0x6d/0x1cc
>> [ 8212.940564] [] ? super_cache_count+0xa8/0xb8
>> [ 8212.940569] [] ? xen_clocksource_read+0x20/0x22
>> [ 8212.940573] [] ? xen_clocksource_get_cycles+0x9/0xb
>> [ 8212.940578] [] ?
>> _raw_spin_unlock_irqrestore+0x47/0x62
>> [ 8212.940583] [] ? ___ratelimit+0xcb/0xe8
>> [ 8212.940588] [] oom_kill_process+0x70/0x2fd
>> [ 8212.940592] [] ? zone_reclaimable+0x11/0x1e
>> [ 8212.940597] [] ? has_ns_capability_noaudit+0x12/0x19
>> [ 8212.940600] [] ? has_capability_noaudit+0x12/0x14
>> [ 8212.940603] [] out_of_memory+0x31b/0x34e
>> [ 8212.940608] [] __alloc_pages_nodemask+0x65b/0x792
>> [ 8212.940612] [] alloc_pages_vma+0xd0/0x10c
>> [ 8212.940617] [] read_swap_cache_async+0x70/0x120
>> [ 8212.940620] [] swapin_readahead+0x90/0xd4
>> [ 8212.940623] [] ? pte_mfn_to_pfn+0x59/0xcb
>> [ 8212.940627] [] handle_mm_fault+0x8a4/0xd54
>> [ 8212.940630] [] ? xen_clocksource_read+0x20/0x22
>> [ 8212.940634] [] ? sched_clock+0x9/0xd
>> [ 8212.940638] [] ? sched_clock_local+0x12/0x75
>> [ 8212.940641] [] ? arch_vtime_task_switch+0x81/0x86
>> [ 8212.940646] [] __do_page_fault+0x3d8/0x437
>> [ 8212.940649] [] ? xen_clocksource_read+0x20/0x22
>> [ 8212.940652] [] ? sched_clock+0x9/0xd
>> [ 8212.940654] [] ? sched_clock_local+0x12/0x75
>> [ 8212.940658] [] ? __acct_update_integrals+0xb4/0xbf
>> [ 8212.940661] [] ? acct_account_cputime+0x17/0x19
>> [ 8212.940663] [] ? account_user_time+0x67/0x92
>> [ 8212.940666] [] ? vtime_account_user+0x4d/0x52
>> [ 8212.940669] [] do_page_fault+0x1a/0x5a
>> [ 8212.940674] [] ? rcu_user_enter+0xe/0x10
>> [ 8212.940677] [] page_fault+0x28/0x30
>> [ 8212.940679] Mem-Info:
>> [ 8212.940681] Node 0 DMA per-cpu:
>> [ 8212.940684] CPU 0: hi: 0, btch: 1 usd: 0
>> [ 8212.940685] CPU 1: hi: 0, btch: 1 usd: 0
>> [ 8212.940686] Node 0 DMA32 per-cpu:
>> [ 8212.940688] CPU 0: hi: 186, btch: 31 usd: 116
>> [ 8212.940690] CPU 1: hi: 186, btch: 31 usd: 124
>> [ 8212.940691] Node 0 Normal per-cpu:
>> [ 8212.940693] CPU 0: hi: 0, btch: 1 usd: 0
>> [ 8212.940694] CPU 1: hi: 0, btch: 1 usd: 0
>> [ 8212.940700] active_anon:105765 inactive_anon:105882 isolated_anon:0
>> active_file:8412 inactive_file:8612 isolated_file:0
>> unevictable:0 dirty:0 writeback:0 unstable:0
>> free:1143 slab_reclaimable:3575 slab_unreclaimable:3464
>> mapped:3792 shmem:6 pagetables:2534 bounce:0
>> free_cma:0 totalram:246132 balloontarget:306242
>> [ 8212.940702] Node 0 DMA free:1964kB min:88kB low:108kB high:132kB
>> active_anon:5092kB inactive_anon:5328kB active_file:416kB
>> inactive_file:608kB unevictable:0kB isolated(anon):0kB
>> isolated(file):0kB present:15996kB managed:15392kB mlocked:0kB dirty:0kB
>> writeback:0kB mapped:320kB shmem:0kB slab_reclaimable:252kB
>> slab_unreclaimable:492kB kernel_stack:120kB pagetables:252kB
>> unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB
>> pages_scanned:26951 all_unreclaimable? yes
>> [ 8212.940711] lowmem_reserve[]: 0 469 469 469
>> [ 8212.940715] Node 0 DMA32 free:2608kB min:2728kB low:3408kB
>> high:4092kB active_anon:181456kB inactive_anon:181528kB
>> active_file:22296kB inactive_file:22644kB unevictable:0kB
>> isolated(anon):0kB isolated(file):0kB present:507904kB managed:466364kB
>> mlocked:0kB dirty:0kB writeback:0kB mapped:8628kB shmem:20kB
>> slab_reclaimable:10756kB slab_unreclaimable:12548kB kernel_stack:1688kB
>> pagetables:8876kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB
>> pages_scanned:612393 all_unreclaimable? yes
>> [ 8212.940722] lowmem_reserve[]: 0 0 0 0
>> [ 8212.940725] Node 0 Normal free:0kB min:0kB low:0kB high:0kB
>> active_anon:236512kB inactive_anon:236672kB active_file:10936kB
>> inactive_file:11196kB unevictable:0kB isolated(anon):0kB
>> isolated(file):0kB present:524288kB managed:502772kB mlocked:0kB
>> dirty:0kB writeback:0kB mapped:6220kB shmem:4kB slab_reclaimable:3292kB
>> slab_unreclaimable:816kB kernel_stack:64kB pagetables:1008kB
>> unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB
>> pages_scanned:745963 all_unreclaimable? yes
>> [ 8212.940732] lowmem_reserve[]: 0 0 0 0
>> [ 8212.940735] Node 0 DMA: 1*4kB (R) 0*8kB 4*16kB (R) 1*32kB (R) 1*64kB
>> (R) 2*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1956kB
>> [ 8212.940747] Node 0 DMA32: 652*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB
>> 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2608kB
>> [ 8212.940756] Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
>> 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
>> [ 8212.940765] 16847 total pagecache pages
>> [ 8212.940766] 8381 pages in swap cache
>> [ 8212.940768] Swap cache stats: add 741397, delete 733016, find
>> 250268/342284
>> [ 8212.940769] Free swap = 1925576kB
>> [ 8212.940770] Total swap = 2097148kB
>> [ 8212.951044] 262143 pages RAM
>> [ 8212.951046] 11939 pages reserved
>> [ 8212.951047] 540820 pages shared
>> [ 8212.951048] 240248 pages non-shared
>> [ 8212.951050] [ pid ] uid tgid total_vm rss nr_ptes swapents
>> oom_score_adj name
>>
>> [ 8212.951310] Out of memory: Kill process 23721 (cc1plus) score 119 or
>> sacrifice child
>> [ 8212.951313] Killed process 23721 (cc1plus) total-vm:530268kB,
>> anon-rss:350980kB, file-rss:9408kB
>> [54810.683658] kjournald starting. Commit interval 5 seconds
>> [54810.684381] EXT3-fs (xvda1): using internal journal
>> [54810.684402] EXT3-fs (xvda1): mounted filesystem with writeback data mode
>>

--
*James Dingwall*

Script Monkey

Zynstra is a private limited company registered in England and Wales
(registered number 07864369). Our registered office is 5 New Street
Square, London, EC4A 3TW and our headquarters are at Bath Ventures,
Broad Quay, Bath, BA1 1UD.