From mboxrd@z Thu Jan 1 00:00:00 1970
From: James Dingwall
Subject: Re: Kernel 3.11 / 3.12 OOM killer and Xen ballooning
Date: Wed, 15 Jan 2014 08:49:11 +0000
Message-ID: <52D64B87.6000400@zynstra.com>
References: <52A602E5.3080300@zynstra.com>
 <20131209214816.GA3000@phenom.dumpdata.com>
 <52A72AB8.9060707@zynstra.com>
 <20131210152746.GF3184@phenom.dumpdata.com>
 <52A812B0.6060607@oracle.com>
 <52A89334.3090007@zynstra.com>
 <52B18F44.2030500@oracle.com>
 <52B3443F.5060704@zynstra.com>
 <52B3B6D7.50606@oracle.com>
 <52BBEBEF.8040509@zynstra.com>
 <52C50661.7060900@oracle.com>
 <52CBC700.1060602@zynstra.com>
 <52CE7E67.5080708@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <52CE7E67.5080708@oracle.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Bob Liu
Cc: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

Bob Liu wrote:
> On 01/07/2014 05:21 PM, James Dingwall wrote:
>> Bob Liu wrote:
>>> Could you confirm that this problem doesn't exist if loading tmem with
>>> selfshrinking=0 during compile gcc? It seems that you are compiling
>>> different packages during your testing.
>>> This will help to figure out whether selfshrinking is the root cause.
>> Got an oom with selfshrinking=0, again during a gcc compile.
>> Unfortunately I don't have a single test case which demonstrates the
>> problem but as I mentioned before it will generally show up under
>> compiles of large packages such as glibc, kdelibs, gcc etc.
>>
> So the root cause is not because enabled selfshrinking.
> Then what I can think of is that the xen-selfballoon driver was too
> aggressive, too many pages were ballooned out which caused heavy memory
> pressure to guest OS.
> And kswapd started to reclaim page until most of pages were
> unreclaimable(all_unreclaimable=yes for all zones), then OOM Killer was
> triggered.
> In theory the balloon driver should give back ballooned out pages to
> guest OS, but I'm afraid this procedure is not fast enough.
>
> My suggestion is reserve a min memory for your guest OS so that the
> xen-selfballoon won't be so aggressive.
> You can do it through parameters selfballoon_reserved_mb or
> selfballoon_min_usable_mb.

I am still getting OOM errors with both of these set to 32, so I'll try
another bump to 64. Even if I do find values which prevent it, I think
that is more of a workaround than a fix, because it still suggests that
swap is not being used when ballooning can no longer satisfy the
request.

I've also got an Ubuntu Saucy (3.11 kernel) guest running on the dom0
with tmem activated, so I'm going to see if I can find a comparable
workload and check whether I get the same issue with a different kernel
configuration.

James
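
For reference, the bump to 64 mentioned above could be scripted inside the guest roughly as follows. This is only a sketch under assumptions: the sysfs directory below is where these tunables are usually exposed on 3.x kernels but has not been confirmed in this thread, and the helper name set_selfballoon_floor is hypothetical. Only the two parameter names come from the discussion.

```python
from pathlib import Path

# ASSUMPTION: usual sysfs location of the selfballoon tunables; check
# that this directory exists on the guest before relying on it.
ASSUMED_SYSFS_DIR = Path("/sys/devices/system/xen_memory/xen_memory0/selfballoon")

# The two parameters named in the thread.
TUNABLES = ("selfballoon_reserved_mb", "selfballoon_min_usable_mb")


def set_selfballoon_floor(mb, sysfs_dir=ASSUMED_SYSFS_DIR):
    """Write the same value (in MB) to both tunables and return the values read back.

    Hypothetical helper for illustration; writing to the real sysfs
    nodes needs root inside the guest.
    """
    if mb < 0:
        raise ValueError("mb must be non-negative")
    read_back = {}
    for name in TUNABLES:
        node = Path(sysfs_dir) / name
        node.write_text(f"{mb}\n")
        # Read the value back so a rejected write is noticed.
        read_back[name] = int(node.read_text().strip())
    return read_back
```

Calling set_selfballoon_floor(64) as root would apply the next value to be tried. Passing a different sysfs_dir lets the helper be exercised against a scratch directory first.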