From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bob Liu
Subject: Re: Kernel 3.11 / 3.12 OOM killer and Xen ballooning
Date: Fri, 20 Dec 2013 11:17:43 +0800
Message-ID: <52B3B6D7.50606@oracle.com>
References: <52A602E5.3080300@zynstra.com>
 <20131209214816.GA3000@phenom.dumpdata.com>
 <52A72AB8.9060707@zynstra.com>
 <20131210152746.GF3184@phenom.dumpdata.com>
 <52A812B0.6060607@oracle.com>
 <52A89334.3090007@zynstra.com>
 <52B18F44.2030500@oracle.com>
 <52B3443F.5060704@zynstra.com>
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------010100040508030703000003"
Return-path:
In-Reply-To: <52B3443F.5060704@zynstra.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: James Dingwall
Cc: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------010100040508030703000003
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 12/20/2013 03:08 AM, James Dingwall wrote:
> Bob Liu wrote:
>> On 12/12/2013 12:30 AM, James Dingwall wrote:
>>> Bob Liu wrote:
>>>> On 12/10/2013 11:27 PM, Konrad Rzeszutek Wilk wrote:
>>>>> On Tue, Dec 10, 2013 at 02:52:40PM +0000, James Dingwall wrote:
>>>>>> Konrad Rzeszutek Wilk wrote:
>>>>>>> On Mon, Dec 09, 2013 at 05:50:29PM +0000, James Dingwall wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Since 3.11 I have noticed that the OOM killer quite frequently
>>>>>>>> triggers in my Xen guest domains which use ballooning to
>>>>>>>> increase/decrease their memory allocation according to their
>>>>>>>> requirements. One example domain I have has a maximum memory
>>>>>>>> setting of ~1.5Gb but it usually idles at ~300Mb, it is also
>>>>>>>> configured with 2Gb swap which is almost 100% free.
>>>>>>>>
>>>>>>>> # free
>>>>>>>>              total       used       free     shared    buffers     cached
>>>>>>>> Mem:        272080     248108      23972          0       1448      63064
>>>>>>>> -/+ buffers/cache:     183596      88484
>>>>>>>> Swap:      2097148          8    2097140
>>>>>>>>
>>>>>>>> There is plenty of available free memory in the hypervisor to
>>>>>>>> balloon to the maximum size:
>>>>>>>> # xl info | grep free_mem
>>>>>>>> free_memory : 14923
>>>>>>>>
>>>>>>>> An example trace (they are always the same) from the oom killer in
>>>>>>>> 3.12 is added below. So far I have not been able to reproduce this
>>>>>>>> at will so it is difficult to start bisecting it to see if a
>>>>>>>> particular change introduced this. However it does seem that the
>>>>>>>> behaviour is wrong because a) ballooning could give the guest more
>>>>>>>> memory, b) there is lots of swap available which could be used as a
>>>>>>>> fallback.
>>>>> Keep in mind that swap with tmem is actually no longer swap. Heh, that
>>>>> sounds odd - but basically pages that are destined for swap end up
>>>>> going in the tmem code which pipes them up to the hypervisor.
>>>>>
>>>>>>>> If other information could help or there are more tests that I could
>>>>>>>> run then please let me know.
>>>>>>> I presume you have enabled 'tmem' both in the hypervisor and in
>>>>>>> the guest, right?
>>>>>> Yes, domU and dom0 both have the tmem module loaded and tmem
>>>>>> tmem_dedup=on tmem_compress=on is given on the xen command line.
>>>>> Excellent. The odd thing is that your swap is not used that much, but
>>>>> it should be (as that is part of what the self-balloon is supposed to
>>>>> do).
>>>>>
>>>>> Bob, you had a patch for the logic of how self-balloon is supposed
>>>>> to account for the slab - would this be relevant to this problem?
>>>>>
>>>> Perhaps, I have attached the patch.
>>>> James, could you please apply it and try your application again? You
>>>> have to rebuild the guest kernel.
>>>> Oh, and also take a look at whether frontswap is in use, you can check
>>>> it by watching "cat /sys/kernel/debug/frontswap/*".
>>> I have tested this patch with a workload where I have previously seen
>>> failures and so far so good. I'll try to keep a guest with it stressed
>>> to see if I do get any problems. I don't know if it is expected but I
>> By the way, besides kswapd running longer, does this patch work well
>> during your stress testing?
>>
>> Have you seen the OOM killer triggered quite frequently again (with
>> selfshrink=true)?
>>
>> Thanks,
>> -Bob
> It was looking good until today (selfshrink=true). The trace below is
> during a compile of subversion, it looks like the memory has ballooned
> to almost the maximum permissible but even under pressure the swap disk
> has hardly come into use.
>

So without selfshrink, can the swap disk be used a lot?

If that's the case, I'm afraid the frontswap-selfshrink in
xen-selfballoon did something incorrect.

Could you please try this patch, which makes the frontswap-selfshrink
slower and adds a printk for debugging?
Please still keep selfshrink=true in your test; you can run it with or
without my previous patch.
Thanks a lot!

-Bob

--------------010100040508030703000003
Content-Type: text/x-patch;
 name="a.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="a.patch"

diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 21e18c1..6e9bf0b 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -133,7 +133,7 @@ static unsigned int frontswap_hysteresis __read_mostly = 20;
  * frontswap selfshrinking should commence. Note that selfshrinking does
  * not use a separate worker thread.
  */
-static unsigned int frontswap_inertia __read_mostly = 3;
+static unsigned int frontswap_inertia __read_mostly = 6;
 
 /* Countdown to next invocation of frontswap_shrink() */
 static unsigned long frontswap_inertia_counter;
@@ -170,6 +170,8 @@ static void frontswap_selfshrink(void)
 		tgt_frontswap_pages = cur_frontswap_pages -
 			(cur_frontswap_pages / frontswap_hysteresis);
 	frontswap_shrink(tgt_frontswap_pages);
+	printk("frontswap selfshrink %ld pages\n", tgt_frontswap_pages);
+
 	frontswap_inertia_counter = frontswap_inertia;
 }
 #endif /* CONFIG_FRONTSWAP */

--------------010100040508030703000003
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--------------010100040508030703000003--
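
For anyone wanting to see what raising frontswap_inertia from 3 to 6 does
to the shrink cadence, here is a rough userspace sketch of the countdown.
It is not the kernel code: the struct, shrink_tick() and simulate() are
names made up for illustration, the early-return conditions are my
recollection of the parts of frontswap_selfshrink() that the patch does
not show (so treat them as assumptions), and it assumes every shrink
fully succeeds and no new pages enter frontswap in between.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model of the selfshrink countdown; all names are illustrative. */
struct shrink_model {
	unsigned long cur_pages;   /* frontswap pages seen on this tick */
	unsigned long last_pages;  /* frontswap pages seen on the previous tick */
	unsigned long counter;     /* ticks left before the next shrink */
	unsigned int inertia;      /* ticks to wait (3 before the patch, 6 after) */
	unsigned int hysteresis;   /* each shrink drops 1/hysteresis of the pages */
};

/*
 * One worker tick. Returns 1 and stores the new target in *target when a
 * shrink is due; otherwise returns 0 and leaves *target untouched.
 */
static int shrink_tick(struct shrink_model *m, unsigned long observed,
		       unsigned long *target)
{
	assert(m->hysteresis != 0);
	m->last_pages = m->cur_pages;
	m->cur_pages = observed;

	/* Assumed: nothing stored, or usage still growing, restarts the countdown. */
	if (!m->cur_pages || m->cur_pages > m->last_pages) {
		m->counter = m->inertia;
		return 0;
	}
	/* Assumed: while the countdown is running, do nothing on this tick. */
	if (m->counter && --m->counter)
		return 0;

	/* Target computation as in the second hunk's context lines. */
	if (m->cur_pages <= m->hysteresis)
		*target = 0;
	else
		*target = m->cur_pages - m->cur_pages / m->hysteresis;
	m->counter = m->inertia;	/* re-arm the countdown after a shrink */
	return 1;
}

/* Run `ticks` ticks starting from `start` pages; return the number of shrinks. */
static unsigned int simulate(unsigned int inertia, unsigned int hysteresis,
			     unsigned long start, unsigned int ticks,
			     unsigned long *final_pages)
{
	struct shrink_model m = { .inertia = inertia, .hysteresis = hysteresis };
	unsigned long pages = start, target = 0;
	unsigned int shrinks = 0;

	for (unsigned int i = 0; i < ticks; i++) {
		if (shrink_tick(&m, pages, &target)) {
			pages = target;	/* assume the shrink fully succeeds */
			shrinks++;
		}
	}
	*final_pages = pages;
	return shrinks;
}

#ifdef SHRINK_MODEL_MAIN
int main(void)
{
	unsigned long left;
	unsigned int n;

	n = simulate(3, 20, 1000, 12, &left);
	printf("inertia=3: %u shrinks, %lu pages left\n", n, left);
	n = simulate(6, 20, 1000, 12, &left);
	printf("inertia=6: %u shrinks, %lu pages left\n", n, left);
	return 0;
}
#endif
```

Build with -DSHRINK_MODEL_MAIN to get the small demo main(). Under these
assumptions, 12 ticks starting from 1000 pages give three shrinks with
inertia 3 and only one with inertia 6, which is the "slower" selfshrink
the patch is aiming for.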