From: James Dingwall <james.dingwall@zynstra.com>
To: Bob Liu <bob.liu@oracle.com>
Cc: xen-devel@lists.xen.org
Subject: Re: Kernel 3.11 / 3.12 OOM killer and Xen ballooning
Date: Tue, 28 Jan 2014 17:15:00 +0000
Message-ID: <52E7E594.2050104@zynstra.com>
In-Reply-To: <52D7346A.3000300@oracle.com>

Hi Bob,

Bob Liu wrote:
> On 01/16/2014 12:35 AM, James Dingwall wrote:
>> Bob Liu wrote:
>>> On 01/15/2014 04:49 PM, James Dingwall wrote:
>>>> Bob Liu wrote:
>>>>> On 01/07/2014 05:21 PM, James Dingwall wrote:
>>>>>> Bob Liu wrote:
>>>>>>> Could you confirm that this problem doesn't exist if you load tmem
>>>>>>> with selfshrinking=0 while compiling gcc? It seems that you are
>>>>>>> compiling different packages during your testing.
>>>>>>> This will help to figure out whether selfshrinking is the root cause.
>>>>>> Got an oom with selfshrinking=0, again during a gcc compile.
>>>>>> Unfortunately I don't have a single test case which demonstrates the
>>>>>> problem but as I mentioned before it will generally show up under
>>>>>> compiles of large packages such as glibc, kdelibs, gcc etc.
>>>>>>
>>>>> So the root cause is not that selfshrinking is enabled.
>>>>> Then what I can think of is that the xen-selfballoon driver was too
>>>>> aggressive: too many pages were ballooned out, which caused heavy
>>>>> memory pressure on the guest OS.
>>>>> kswapd then started to reclaim pages until most pages were
>>>>> unreclaimable (all_unreclaimable=yes for all zones), and the OOM
>>>>> killer was triggered.
>>>>> In theory the balloon driver should give ballooned-out pages back to
>>>>> the guest OS, but I'm afraid this procedure is not fast enough.
>>>>>
>>>>> My suggestion is to reserve a minimum amount of memory for your
>>>>> guest OS so that xen-selfballoon won't be so aggressive.
>>>>> You can do it through the parameters selfballoon_reserved_mb or
>>>>> selfballoon_min_usable_mb.
>>>> I am still getting OOM errors with both of these set to 32 so I'll try
>>>> another bump to 64. I think that if I do find values which prevent it
>>>> though then it is more of a work around than a fix because it still
>>>> suggests that swap is not being used when ballooning is no longer
>>> Yes, it's like a work around. But I don't think there is a better way.
>>>
>>> From the recent OOM log you reported:
>>> [ 8212.940769] Free swap = 1925576kB
>>> [ 8212.940770] Total swap = 2097148kB
>>>
>>> [504638.442136] Free swap = 1868108kB
>>> [504638.442137] Total swap = 2097148kB
>>>
>>> 171572kB and 229040kB of data were swapped out to the swap disk. I think
>>> these are already significant amounts for a guest OS that has only
>>> ~300M of usable memory.
>>> The guest OS can't find any more pages suitable for swapping after so
>>> many pages have been swapped out, although at that time the swap device
>>> still has enough space.
>>>
>>> The OOM may not be triggered if the balloon driver can give memory back
>>> to the guest OS fast enough, but I think that is unrealistic.
>>> So the best way is to reserve more memory for the guest OS.
>>>
>>>> capable of satisfying the request. I've also got an Ubuntu Saucy (3.11
>>>> kernel) guest running on the dom0 with tmem activated so I'm going to
>>>> see if I can find a comparable workload to see if I get the same issue
>>>> with a different kernel configuration.
>>>>
>> I've done a bit more testing and seem to have found an extra condition
>> which is affecting the OOM behaviour in my guests. All my Gentoo guests
>> have swap space backed by a dm-crypt volume and if I remove this layer
>> then things seem to be behaving much more reliably. In my Ubuntu guests
>> I have plain swap space and so far I haven't been able to trigger an OOM
>> condition. Is it possible that it is the dm-crypt layer failing to get
>> working memory when swapping something in/out and causing the error?
>>
> One possible reason is that the dm layer and the related dm target driver
> occupy a significant amount of memory and there is no way for
> xen-selfballoon to know this. So the selfballoon driver ballooned out
> memory that the system really requires.
>
> I have made a patch that reserves an extra 10% of the original total
> memory; this way I think we can make the system much more reliable in
> all cases.
> Could you please test it? You don't need to set
> selfballoon_reserved_mb yourself any more.
I have to say that with this patch the situation has definitely
improved. I have been running it with 3.12.[78] and 3.13 and pushing it
quite hard for the last 10 days or so. Unfortunately yesterday I got an
OOM during a compile (link) of webkit-gtk. I think your patch is part
of the solution but I'm not sure if the other bit is simply to be more
generous with the guest memory allocation or something else. Testing
with memory = 512 and no tmem I get an OOM with the same compile; with
memory = 1024 and no tmem the compile completes ok (both cases without
maxmem). As my domains are usually started with memory = 512 and
maxmem = 1024 it seems that there should be sufficient memory with my
default parameters. Also, as an experiment, I set memory = 1024 and
removed maxmem; with tmem activated I then see "[ 3393.884105]
xen:balloon: reserve_additional_memory: add_memory() failed: -17"
printed many times in the guest kernel log.
Regards,
James
[456770.748827] Mem-Info:
[456770.748829] Node 0 DMA per-cpu:
[456770.748833] CPU 0: hi: 0, btch: 1 usd: 0
[456770.748835] CPU 1: hi: 0, btch: 1 usd: 0
[456770.748836] Node 0 DMA32 per-cpu:
[456770.748838] CPU 0: hi: 186, btch: 31 usd: 173
[456770.748840] CPU 1: hi: 186, btch: 31 usd: 120
[456770.748846] active_anon:91431 inactive_anon:96269 isolated_anon:0
active_file:13286 inactive_file:31256 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
free:1155 slab_reclaimable:7001 slab_unreclaimable:3932
mapped:2300 shmem:88 pagetables:2576 bounce:0
free_cma:0 totalram:255578 balloontarget:327320
[456770.748849] Node 0 DMA free:1956kB min:88kB low:108kB high:132kB
active_anon:3128kB inactive_anon:3328kB active_file:1888kB
inactive_file:2088kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:15996kB managed:15912kB mlocked:0kB dirty:0kB
writeback:0kB mapped:32kB shmem:0kB slab_reclaimable:684kB
slab_unreclaimable:720kB kernel_stack:72kB pagetables:488kB unstable:0kB
bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:17841
all_unreclaimable? yes
[456770.748863] lowmem_reserve[]: 0 469 469 469
[456770.748866] Node 0 DMA32 free:2664kB min:2728kB low:3408kB
high:4092kB active_anon:362596kB inactive_anon:381748kB
active_file:51256kB inactive_file:122936kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:1032192kB
managed:1006400kB mlocked:0kB dirty:0kB writeback:0kB mapped:9168kB
shmem:352kB slab_reclaimable:27320kB slab_unreclaimable:15008kB
kernel_stack:1784kB pagetables:9816kB unstable:0kB bounce:0kB
free_cma:0kB writeback_tmp:0kB pages_scanned:1382021 all_unreclaimable? yes
[456770.748874] lowmem_reserve[]: 0 0 0 0
[456770.748877] Node 0 DMA: 1*4kB (R) 0*8kB 0*16kB 5*32kB (R) 2*64kB (R)
1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1956kB
[456770.748890] Node 0 DMA32: 666*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2664kB
[456770.748899] 48556 total pagecache pages
[456770.748901] 35203 pages in swap cache
[456770.748903] Swap cache stats: add 358621, delete 323418, find
206319/224002
[456770.748904] Free swap = 1671532kB
[456770.748905] Total swap = 2097148kB
[456770.748906] 262047 pages RAM
[456770.748907] 0 pages HighMem/MovableOnly
[456770.748908] 6448 pages reserved
<snip process list>
[456770.749070] Out of memory: Kill process 28271 (ld) score 110 or
sacrifice child
[456770.749073] Killed process 28271 (ld) total-vm:358488kB,
anon-rss:324588kB, file-rss:1456kB
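
As a cross-check of the figures in the log above, the arithmetic can be
sketched in a few lines of Python. This is only an illustration and not
part of the original report: the 4 kB page size is an assumption (it is
consistent with the log, since free:1155 pages matches the 1956kB +
2664kB reported free in the two zones), and the helper names are made up
for the example.

```python
# Sanity-check arithmetic for the OOM report above (illustrative only).
PAGE_KB = 4  # assumption: 4 kB pages, as on x86


def pages_to_kb(pages):
    """Convert a page count from Mem-Info into kB."""
    return pages * PAGE_KB


def swap_used_kb(total_kb, free_kb):
    """Swap in use is total swap minus free swap."""
    return total_kb - free_kb


# Values copied from the log above.
free_pages = 1155
zone_free_kb = {"DMA": 1956, "DMA32": 2664}
dma32_min_kb = 2728
total_swap_kb = 2097148
free_swap_kb = 1671532

# The global free page count should agree with the per-zone free totals.
print(pages_to_kb(free_pages), sum(zone_free_kb.values()))

# Is DMA32 free memory below its min watermark?
print(zone_free_kb["DMA32"] < dma32_min_kb)

# Swap in use, in kB and as a percentage of the swap device.
used = swap_used_kb(total_swap_kb, free_swap_kb)
print(used, round(100 * used / total_swap_kb, 1))
```

With these values DMA32 free memory is below its min watermark while
425616kB of the 2097148kB swap device is in use. The same subtraction
reproduces the 171572kB and 229040kB figures quoted earlier in the
thread.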
>
>
> xen_selfballoon_deaggressive.patch
>
>
> diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
> index 21e18c1..8f33254 100644
> --- a/drivers/xen/xen-selfballoon.c
> +++ b/drivers/xen/xen-selfballoon.c
> @@ -175,6 +175,7 @@ static void frontswap_selfshrink(void)
>  #endif /* CONFIG_FRONTSWAP */
>
>  #define MB2PAGES(mb)	((mb) << (20 - PAGE_SHIFT))
> +#define PAGES2MB(pages)	((pages) >> (20 - PAGE_SHIFT))
>
>  /*
>   * Use current balloon size, the goal (vm_committed_as), and hysteresis
> @@ -525,6 +526,7 @@ EXPORT_SYMBOL(register_xen_selfballooning);
>  int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
>  {
>  	bool enable = false;
> +	unsigned long reserve_pages;
>
>  	if (!xen_domain())
>  		return -ENODEV;
> @@ -549,6 +551,26 @@ int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
>  	if (!enable)
>  		return -ENODEV;
>
> +	/*
> +	 * Give selfballoon_reserved_mb a default value (10% of total ram pages)
> +	 * to make selfballoon not so aggressive.
> +	 *
> +	 * There are two reasons:
> +	 * 1) The goal_page doesn't contain some pages used by kernel space,
> +	 *    like slab cache and pages used by device drivers.
> +	 *
> +	 * 2) The balloon driver may not give back memory to guest OS fast
> +	 *    enough when the workload suddenly acquires a lot of memory.
> +	 *
> +	 * In both cases, the guest OS will suffer from memory pressure and
> +	 * OOM killer may be triggered.
> +	 * By reserving extra 10% of total ram pages, we can keep the system
> +	 * much more reliable and more responsive in some cases.
> +	 */
> +	if (!selfballoon_reserved_mb) {
> +		reserve_pages = totalram_pages / 10;
> +		selfballoon_reserved_mb = PAGES2MB(reserve_pages);
> +	}
>  	schedule_delayed_work(&selfballoon_worker, selfballoon_interval * HZ);
>
>  	return 0;
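
To get a feel for the default the patch above would pick, its arithmetic
can be mirrored in user space. This is a minimal Python sketch, assuming
4 kB pages (PAGE_SHIFT = 12). The function names are hypothetical and
the page counts are only examples: 131072 pages is a nominal 512 MB, and
255578 is the totalram figure from the log above, which need not equal
totalram_pages at init time.

```python
# User-space mirror of the patch's default reserve calculation (sketch).
PAGE_SHIFT = 12  # assumption: 4 kB pages, as on x86


def mb_to_pages(mb):
    """Same shift as the kernel's MB2PAGES() macro."""
    return mb << (20 - PAGE_SHIFT)


def pages_to_mb(pages):
    """Same shift as the PAGES2MB() macro added by the patch (rounds down)."""
    return pages >> (20 - PAGE_SHIFT)


def default_reserved_mb(totalram_pages, selfballoon_reserved_mb=0):
    """Return the reserve in MB, defaulting to 10% of total RAM pages.

    As in the patch, a non-zero value that was already set is kept.
    """
    if not selfballoon_reserved_mb:
        reserve_pages = totalram_pages // 10
        selfballoon_reserved_mb = pages_to_mb(reserve_pages)
    return selfballoon_reserved_mb


for pages in (131072, 255578):
    print(pages, "pages ->", default_reserved_mb(pages), "MB reserved")
```

For a nominal 512 MB guest this gives 51 MB, and for 255578 pages it
gives 99 MB. A value that was already set by hand (such as the 32 tried
earlier in the thread) is left alone, and the shift rounds down, so the
reserve is slightly under 10%.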
--
*James Dingwall*
Script Monkey