xen-devel.lists.xenproject.org archive mirror
* auto-ballooning crashing Dom0?
@ 2012-08-02 13:41 Andre Przywara
  2012-08-02 14:45 ` Ian Jackson
  0 siblings, 1 reply; 3+ messages in thread
From: Andre Przywara @ 2012-08-02 13:41 UTC (permalink / raw)
  To: Ian Campbell, Ian Jackson; +Cc: xen-devel

Hi,

During some experiments with many guests, Dom0 keeps crashing because it
runs out of memory. Actually the OOM killer goes 'round and kills random
things, preferably qemu-dm's ;-)
The box in question has 128GB of memory, I start with dom0_mem=8192M (or 
16384M, doesn't matter). I also used "dom0_mem=8192M,min:1536M", but 
that didn't make any difference. Xen is c/s 25688.

Then I start some guests with 2GB each. This works fine until about 55
guests, then xl starts refusing to start further guests (which would be
OK). But sometimes starting a guest succeeds (even after having failed
before), and it has obviously taken precious memory away from Dom0. With
around 55 guests, Dom0 has about 500MB in use.
The whole Dom0 is in trouble then: I get "fork: cannot allocate memory"
messages for a simple "ls" and have to reboot the box.
This is with autoballoon=1 in xl.conf (the commented-out default).
Setting it to 0 works, but is obviously not a real option as a default.

I found the hardcoded 128MB limit in libxl_internal.h; I guess this is
way too small for this type of machine.

Either we change this to something higher (768 MB worked for me) or we 
make this a config option in xl.conf (like it was in xend-config.sxp)

Another option would be to make it dynamic, by looking at the memory
currently used in Dom0 and not ballooning down below 110% or so of it.
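
The dynamic variant could be sketched roughly as follows. This is only an
illustration in Python, not libxl code: the function names, the
headroom_percent parameter, and the choice to keep the static minimum as a
lower bound are all assumptions. The 128MB default is the hardcoded limit
mentioned above, and 10% headroom corresponds to the "110% or so".

```python
def dom0_balloon_floor_kb(used_kb, static_min_kb=128 * 1024, headroom_percent=10):
    """Lowest size (KiB) Dom0 may be ballooned down to (hypothetical helper).

    used_kb is the memory currently in use inside Dom0. The floor is that
    value plus some headroom, but never below the static minimum.
    """
    if used_kb < 0 or headroom_percent < 0:
        raise ValueError("used_kb and headroom_percent must be non-negative")
    # Integer arithmetic avoids floating-point rounding surprises.
    dynamic_floor_kb = used_kb * (100 + headroom_percent) // 100
    return max(static_min_kb, dynamic_floor_kb)


def clamp_dom0_target_kb(requested_kb, used_kb):
    """Raise a requested Dom0 balloon target to the floor if it is too low."""
    return max(requested_kb, dom0_balloon_floor_kb(used_kb))
```

With about 500MB in use, such a floor would stop ballooning at roughly
550MB instead of 128MB; a toolstack using it would refuse the guest start
rather than squeeze Dom0 any further.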

Sadly (well..) I am about to leave for vacation, so no patch this time;
I leave this as an exercise for the tool buffs ;-)

In any case, we should still do something for Xen 4.2, as I guess people
dislike a crashing Dom0 tearing down all the domains with it...

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany

* Re: auto-ballooning crashing Dom0?
  2012-08-02 13:41 auto-ballooning crashing Dom0? Andre Przywara
@ 2012-08-02 14:45 ` Ian Jackson
  2013-06-06 13:20   ` Alex Bligh
  0 siblings, 1 reply; 3+ messages in thread
From: Ian Jackson @ 2012-08-02 14:45 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Ian Campbell, xen-devel

Andre Przywara writes ("auto-ballooning crashing Dom0?"):
> During some experiments with many guests, Dom0 keeps crashing because it
> runs out of memory. Actually the OOM killer goes 'round and kills random
> things, preferably qemu-dm's ;-)
> The box in question has 128GB of memory, I start with dom0_mem=8192M (or 
> 16384M, doesn't matter). I also used "dom0_mem=8192M,min:1536M", but 
> that didn't make any difference. Xen is c/s 25688.

I have seen similar effects occasionally but have usually been too busy
in the middle of something else to do anything about it.  The
autoballooning arrangements aren't very good TBH and we are intending
to improve things in 4.3.

> Either we change this to something higher (768 MB worked for me) or we 
> make this a config option in xl.conf (like it was in xend-config.sxp)

Certainly it should be a config option.

> Another option would be to make it dynamic, by looking at the memory
> currently used in Dom0 and not ballooning down below 110% or so of it.

That would be a possibility.

> In any case, we should still do something for Xen 4.2, as I guess people
> dislike a crashing Dom0 tearing down all the domains with it...

Yes.

Ian.

* Re: auto-ballooning crashing Dom0?
  2012-08-02 14:45 ` Ian Jackson
@ 2013-06-06 13:20   ` Alex Bligh
  0 siblings, 0 replies; 3+ messages in thread
From: Alex Bligh @ 2013-06-06 13:20 UTC (permalink / raw)
  To: Ian Jackson, Andre Przywara; +Cc: Ian Campbell, Alex Bligh, xen-devel

(Resurrecting a thread from the past)

--On 2 August 2012 15:45:00 +0100 Ian Jackson <Ian.Jackson@eu.citrix.com> 
wrote:

> Andre Przywara writes ("auto-ballooning crashing Dom0?"):
>> During some experiments with many guests, Dom0 keeps crashing because it
>> runs out of memory. Actually the OOM killer goes 'round and kills random
>> things, preferably qemu-dm's ;-)
>> The box in question has 128GB of memory, I start with dom0_mem=8192M (or
>> 16384M, doesn't matter). I also used "dom0_mem=8192M,min:1536M", but
>> that didn't make any difference. Xen is c/s 25688.
>
> I have seen similar effects occasionally but have usually been too busy
> in the middle of something else to do anything about it.  The
> autoballooning arrangements aren't very good TBH and we are intending
> to improve things in 4.3.
>
>> Either we change this to something higher (768 MB worked for me) or we
>> make this a config option in xl.conf (like it was in xend-config.sxp)
>
> Certainly it should be a config option.
>
>> Another option would be to make it dynamic, by looking at the memory
>> currently used in Dom0 and not ballooning down below 110% or so of it.
>
> That would be a possibility.
>
>> In any case, we should still do something for Xen 4.2, as I guess people
>> dislike a crashing Dom0 tearing down all the domains with it...
>
> Yes.
>

We got hit by this. Our fix is to turn off autoballooning of dom0.

However, for the record, Xen 4.2.2 seems to behave very strangely here.
Things worked with 2.5GB of Dom0 RAM, failed with 3GB, but worked again
with 4GB. The symptom was a hang while initialising the balloon driver.

Our 'compounding factor' is that we run with a large initrd that stays
alive during normal running as a ramdisk. Those pages are marked as
buffer/page cache (I forget which) but never get flushed. I suspect
this confuses the free memory calculations as xen's balloon driver
thinks there are pages that can be freed that actually can't.
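
To illustrate the suspected miscount, here is a minimal Python sketch. It
assumes the ramdisk pages are counted under Cached in /proc/meminfo and
also appear under Shmem or Unevictable; those field names exist in
/proc/meminfo, but whether the Xen tooling actually estimates free memory
this way is not confirmed here, and the helper names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (KiB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_estimates_kb(fields):
    """Return (naive, conservative) estimates of memory that could be freed.

    naive treats all buffers and page cache as freeable. conservative
    subtracts Shmem and Unevictable pages, which sit in the page cache
    but cannot simply be dropped (assumption: this covers the ramdisk).
    """
    free = fields.get("MemFree", 0)
    buffers = fields.get("Buffers", 0)
    cached = fields.get("Cached", 0)
    pinned = fields.get("Shmem", 0) + fields.get("Unevictable", 0)
    naive = free + buffers + cached
    conservative = free + buffers + max(0, cached - pinned)
    return naive, conservative
```

On a Dom0 it could be fed with open("/proc/meminfo").read(); a large gap
between the two estimates would be consistent with the confusion
described above.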

We'll probably turn autoballooning off with Xen 4.3 too (as our dom0
memory usage is pretty static), but the problem might affect that
release as well.

-- 
Alex Bligh
