* [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
@ 2008-04-04 15:07 Jan Beulich
2008-04-05 21:39 ` Keir Fraser
0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2008-04-04 15:07 UTC (permalink / raw)
To: xen-devel; +Cc: Ky Srinivasan, Kurt Garloff
From: K.Y. Srinivasan <ksrinivasan@novell.com>
Reasonable is hard to judge; we don't want to disallow small domains.
But the system needs a reasonable amount of memory to perform its
duties, set up tables, etc. If, on the other hand, the admin is able
to set up and correctly boot a very small domain, there's no point
in forcing it to be larger.
We end up with some kind of logarithmic function, approximated.
Memory changes are logged, so making domains too small should at least
result in a trace.
As usual, written and tested on 2.6.25-rc8 and 2.6.16.60 and made to apply
to the 2.6.18 tree without further testing.
Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Index: head-2008-02-20/drivers/xen/balloon/balloon.c
===================================================================
--- head-2008-02-20.orig/drivers/xen/balloon/balloon.c 2008-02-20 10:32:43.000000000 +0100
+++ head-2008-02-20/drivers/xen/balloon/balloon.c 2008-02-20 10:40:54.000000000 +0100
@@ -194,6 +194,42 @@ static unsigned long current_target(void
 	return target;
 }
 
+static unsigned long minimum_target(void)
+{
+	unsigned long min_pages;
+	unsigned long curr_pages = current_target();
+#ifndef CONFIG_XEN
+#define max_pfn totalram_pages
+#endif
+
+#define MB2PAGES(mb) ((mb) << (20 - PAGE_SHIFT))
+	/* Simple continuous piecewise linear function:
+	 *  max MiB -> min MiB	gradient
+	 *       0	   0
+	 *      16	  16
+	 *      32	  24
+	 *     128	  72	(1/2)
+	 *     512	 168	(1/4)
+	 *    2048	 360	(1/8)
+	 *    8192	 552	(1/32)
+	 *   32768	1320
+	 *  131072	4392
+	 */
+	if (max_pfn < MB2PAGES(128))
+		min_pages = MB2PAGES(8) + (max_pfn >> 1);
+	else if (max_pfn < MB2PAGES(512))
+		min_pages = MB2PAGES(40) + (max_pfn >> 2);
+	else if (max_pfn < MB2PAGES(2048))
+		min_pages = MB2PAGES(104) + (max_pfn >> 3);
+	else
+		min_pages = MB2PAGES(296) + (max_pfn >> 5);
+#undef MB2PAGES
+
+	/* Don't enforce growth */
+	return min_pages < curr_pages ? min_pages : curr_pages;
+#undef max_pfn
+}
+
 static int increase_reservation(unsigned long nr_pages)
 {
 	unsigned long pfn, i, flags;
@@ -382,6 +418,17 @@ static void balloon_process(struct work_
 /* Resets the Xen limit, sets new target, and kicks off processing. */
 void balloon_set_new_target(unsigned long target)
 {
+	/* First make sure that we are not lowering the value below the
+	 * "minimum".
+	 */
+	unsigned long min_pages = minimum_target();
+
+	if (target < min_pages)
+		target = min_pages;
+
+	printk(KERN_INFO "Setting mem allocation to %lu kiB\n",
+	       PAGES2KB(target));
+
 	/* No need for lock. Not read-modify-write updates. */
 	bs.hard_limit = ~0UL;
 	bs.target_pages = target;
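
The piecewise function in minimum_target() can be checked outside the kernel. The sketch below is a userspace approximation, not the driver code: it works in MiB instead of pages, and the names floor_mib() and minimum_target_mib() are made up for illustration. For the sizes listed in the patch comment it should yield the same minimums.

```c
#include <stdint.h>

/* Userspace sketch of the patch's piecewise-linear floor, in MiB.
 * max_mib stands in for max_pfn expressed in MiB (an assumption made
 * for readability; the patch itself works in pages). */
static uint64_t floor_mib(uint64_t max_mib)
{
	if (max_mib < 128)
		return 8 + (max_mib >> 1);
	if (max_mib < 512)
		return 40 + (max_mib >> 2);
	if (max_mib < 2048)
		return 104 + (max_mib >> 3);
	return 296 + (max_mib >> 5);
}

/* Mirrors "Don't enforce growth": the result never exceeds the
 * current target, here passed in as cur_mib. */
static uint64_t minimum_target_mib(uint64_t max_mib, uint64_t cur_mib)
{
	uint64_t min_mib = floor_mib(max_mib);

	return min_mib < cur_mib ? min_mib : cur_mib;
}
```

Note that floor_mib(0) evaluates to 8 rather than the 0 shown in the comment table; in the patch, the final comparison against current_target() keeps that from forcing a small domain to grow.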
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-04-04 15:07 [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit Jan Beulich
@ 2008-04-05 21:39 ` Keir Fraser
2008-04-07 7:10 ` [PATCH] linux/balloon: don't allow ballooningdown " Jan Beulich
0 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2008-04-05 21:39 UTC (permalink / raw)
To: Jan Beulich, xen-devel; +Cc: Ky Srinivasan, Kurt Garloff
On 4/4/08 16:07, "Jan Beulich" <jbeulich@novell.com> wrote:
> +#ifndef CONFIG_XEN
> +#define max_pfn totalram_pages
> +#endif
This is silly. We modify totalram_pages as we balloon up and down, so this
really isn't very max_pfn-like after ballooning gets under way.
So I've applied the patch but I made it a no-op if !defined(CONFIG_XEN),
until/unless someone comes up with a better alternative to totalram_pages.
Possibly just latching totalram_pages when we install the balloon driver
would be sufficient?
-- Keir
* Re: [PATCH] linux/balloon: don't allow ballooningdown a domain below a reasonable limit
2008-04-05 21:39 ` Keir Fraser
@ 2008-04-07 7:10 ` Jan Beulich
2008-04-29 18:35 ` Dan Magenheimer
0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2008-04-07 7:10 UTC (permalink / raw)
To: Keir Fraser; +Cc: Ky Srinivasan, xen-devel, Kurt Garloff
>>> Keir Fraser <keir.fraser@eu.citrix.com> 05.04.08 23:39 >>>
>On 4/4/08 16:07, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>> +#ifndef CONFIG_XEN
>> +#define max_pfn totalram_pages
>> +#endif
>
>This is silly. We modify totalram_pages as we balloon up and down, so this
>really isn't very max_pfn-like after ballooning gets under way.
Indeed. It's been a very long time since I last had to touch this patch, so
I can only assume that it was originally meant to address a build problem,
and then got forgotten about.
>So I've applied the patch but I made it a no-op if !defined(CONFIG_XEN),
>until/unless someone comes up with a better alternative to totalram_pages.
>Possibly just latching totalram_pages when we install the balloon driver
>would be sufficient?
That would be one option, though not exactly representing what is
intended here - the minimum memory requirement depends (at least for
FLATMEM) much more on the size of the 'struct page' array than on the
part of the array that's actually valid memory.
Since max_mapnr doesn't get initialized for x86-64 and end_pfn is no
longer being exported in 2.6.25, num_physpages would seem to be
the only other alternative.
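
To put a rough number on the 'struct page' point: with FLATMEM the array has one entry per page frame up to max_pfn, so its size is fixed by max_pfn no matter how far the domain is ballooned down. A small sketch of the arithmetic follows. The per-entry size is a parameter because sizeof(struct page) depends on architecture and configuration, and mem_map_bytes() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Bytes needed for a flat 'struct page' array covering max_pfn frames.
 * page_struct_bytes is supplied by the caller; any concrete value
 * passed in is an assumption about the target kernel. */
static uint64_t mem_map_bytes(uint64_t max_pfn, uint64_t page_struct_bytes)
{
	return max_pfn * page_struct_bytes;
}
```

For example, a 1 GiB domain with 4 KiB pages has 262144 frames; with an assumed 32-byte entry (not a figure from this thread), mem_map_bytes(262144, 32) comes to 8 MiB.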
Jan
* RE: [PATCH] linux/balloon: don't allow ballooningdown a domain below a reasonable limit
2008-04-07 7:10 ` [PATCH] linux/balloon: don't allow ballooningdown " Jan Beulich
@ 2008-04-29 18:35 ` Dan Magenheimer
2008-04-30 6:29 ` [PATCH] linux/balloon: don't allow ballooningdowna " Jan Beulich
0 siblings, 1 reply; 29+ messages in thread
From: Dan Magenheimer @ 2008-04-29 18:35 UTC (permalink / raw)
To: Jan Beulich, Keir Fraser
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, Kurt Garloff
I made some actual measurements of the results of this algorithm
(on a RHEL5u1-32bit guest).
memory= Minimum
128 75776kB
256 108544kB
512 173056kB
1024 238592kB
This corresponds to the expected values in the source comment.
However, I wonder if the algorithm is too
conservative for large(r) memory domains. With
a light load (i.e. continuously compiling Xen),
memory utilization rarely exceeds 72MB, regardless
of the max memory (at least in the above tested values).
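
Converting the measured minimums to MiB gives 74, 106, 169 and 233, slightly above the 72, 104, 168 and 232 that the formula yields for exactly 128, 256, 512 and 1024 MiB. The figures match the formula exactly if max_pfn sits 8 MiB above the memory= value. That offset is an inference from the four numbers, not something stated in this thread. A minimal sketch of the check, with floor_mib() as a hypothetical userspace rendering of the patch's formula:

```c
#include <stdint.h>

/* Hypothetical MiB-based rendering of the patch's piecewise formula. */
static uint64_t floor_mib(uint64_t max_mib)
{
	if (max_mib < 128)
		return 8 + (max_mib >> 1);
	if (max_mib < 512)
		return 40 + (max_mib >> 2);
	if (max_mib < 2048)
		return 104 + (max_mib >> 3);
	return 296 + (max_mib >> 5);
}

/* Convert a reported minimum from kB to MiB (integer division). */
static uint64_t kb_to_mib(uint64_t kb)
{
	return kb / 1024;
}
```

With these, kb_to_mib(75776) and floor_mib(128 + 8) can be compared directly, and likewise for the other three rows.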
* RE: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-04-29 18:35 ` Dan Magenheimer
@ 2008-04-30 6:29 ` Jan Beulich
2008-04-30 16:04 ` Dan Magenheimer
0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2008-04-30 6:29 UTC (permalink / raw)
To: dan.magenheimer@oracle.com
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, Keir Fraser,
KurtGarloff
>>> "Dan Magenheimer" <dan.magenheimer@oracle.com> 29.04.08 20:35 >>>
>I made some actual measurements of the results of this algorithm
>(on a RHEL5u1-32bit guest).
>
>memory= Minimum
>128 75776kB
>256 108544kB
>512 173056kB
>1024 238592kB
>
>This corresponds to expected values in the source comment
>However, I wonder if the algorithm is probably too
>conservative for large(r) memory domains. With
>a light load (i.e. continuously compiling Xen),
>memory utilization rarely exceeds 72MB, regardless
>of the max memory (at least in the above tested values).
Sure, this was (in different wording) also stated in the comment
that came with the patch. A more precise estimate would certainly
be welcome, but I'm afraid it is going to come with a much higher
(complexity) price tag. Unless you have something simple and
obvious in mind that we simply didn't spot...
Jan
* RE: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-04-30 6:29 ` [PATCH] linux/balloon: don't allow ballooningdowna " Jan Beulich
@ 2008-04-30 16:04 ` Dan Magenheimer
2008-04-30 23:49 ` Dan Magenheimer
0 siblings, 1 reply; 29+ messages in thread
From: Dan Magenheimer @ 2008-04-30 16:04 UTC (permalink / raw)
To: Jan Beulich
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, Keir Fraser,
KurtGarloff
Hi Jan --
Thanks for the reply. I see the comment now... it didn't
find its way into the source.
I will definitely be working on tuning this estimate
as I am working on maximizing the number of domains
that can be run on a system and this is a constraint.
As a quick-and-dirty test, I just divided the result
of your algorithm (on a 512MB domain) by two and the
maximally-ballooned kernel still ran fine (with
86528kB instead of 173056kB).
Could you explain the logic behind your current algorithm?
I understand you are trying to estimate the additional
kernel data structure space with the addition of the
max_pfn computation but don't understand why this
is a good estimator. I also am wondering how you chose
the magic values for x in MB2PAGES(x). And also if
you have any tests/workloads you might have used to evaluate
the algorithm.
Thanks,
Dan
* RE: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-04-30 16:04 ` Dan Magenheimer
@ 2008-04-30 23:49 ` Dan Magenheimer
2008-05-01 7:01 ` Keir Fraser
0 siblings, 1 reply; 29+ messages in thread
From: Dan Magenheimer @ 2008-04-30 23:49 UTC (permalink / raw)
To: Jan Beulich
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, Keir Fraser,
KurtGarloff
OK, I think I am understanding it a bit better:
the max_pfn part is just adding in some "slop"
which is a fraction of total main memory which
is growing smaller (roughly logarithmically)
as memory grows larger. I'm still not sure about
the magic values in MB2PAGES though... I'm guessing
these were gathered somehow experimentally?
With the "divide result of your algorithm by two",
I was able to get thirteen 512MB domains (idle
for now) running on a 2GB system.
I'm experimenting now with an algorithm which starts
with vm_committed_space* and adds back in a (for
now) fixed fraction of 1/32 of total physical
memory.
Dan
* Alas this is not exported so won't work in a module,
but it seems to be a pretty good estimator of active
virtual memory usage.
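
In code, the experimental estimate described above amounts to something like the following sketch. It takes the committed space as a plain argument (the kernel symbol is not exported, as noted), and the function name is hypothetical.

```c
#include <stdint.h>

/* Estimate = committed space plus a fixed 1/32 of total physical
 * memory, all in pages. Both inputs are supplied by the caller. */
static uint64_t experimental_target_pages(uint64_t committed_pages,
					   uint64_t total_pages)
{
	return committed_pages + (total_pages >> 5);
}
```

For a 512MB domain with 4 KiB pages (131072 pages) and 72MB committed (18432 pages), this gives 18432 + 4096 = 22528 pages, i.e. 88MB.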
* Re: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-04-30 23:49 ` Dan Magenheimer
@ 2008-05-01 7:01 ` Keir Fraser
2008-05-01 14:44 ` Dan Magenheimer
2008-05-01 16:36 ` Alan Cox
0 siblings, 2 replies; 29+ messages in thread
From: Keir Fraser @ 2008-05-01 7:01 UTC (permalink / raw)
To: dan.magenheimer@oracle.com, Jan Beulich
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, KurtGarloff
On 1/5/08 00:49, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
> I'm experimenting now with an algorithm which starts
> with vm_committed_space* and adds back in a (for
> now) fixed fraction of 1/32 of total physical
> memory.
>
> Dan
>
> * Alas this is not exported so won't work in a module,
> but it seems to be a pretty good estimator of active
> virtual memory usage.
Can't vm_committed_space grow *bigger* than available memory when using
swap? It may indicate the 'static' demand for memory (in terms of requested
allocations so far) but we're interested in dynamic demand (how much of that
allocated memory is actually used), which afaics vm_committed_space doesn't
appear very useful for.
Some indication of paging churn, or average age of pages in memory, or
something like that, would seem more useful (albeit arbitrarily harder to
work out from within the balloon driver!).
Maybe I'm missing something. I've only had a quick look at usage of
vm_committed_space.
-- Keir
* RE: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 7:01 ` Keir Fraser
@ 2008-05-01 14:44 ` Dan Magenheimer
2008-05-01 16:36 ` Alan Cox
1 sibling, 0 replies; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-01 14:44 UTC (permalink / raw)
To: Keir Fraser, Jan Beulich
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, KurtGarloff
> Can't vm_committed_space grow *bigger* than available memory
> when using
> swap? It may indicate the 'static' demand for memory (in
> terms of requested
> allocations so far) but we're interested in dynamic demand
> (how much of that
> allocated memory is actually used), which afaics
> vm_committed_space doesn't
> appear very useful for.
>
> Some indication of paging churn, or average age of pages in memory, or
> something like that, would seem more useful (albeit
> arbitrarily harder to
> work out from within the balloon driver!).
>
> Maybe I'm missing something. I've only had a quick look at usage of
> vm_committed_space.
You are right about vm_committed_space growing bigger. I was
too terse: I was trying to use it in a min(x,y,z) calculation
with the existing algorithm, but apparently it becomes too small
sometimes (e.g. during shutdown) and I observed OOM problems.
However vm_committed_space seems to be a pretty good dynamic demand
indicator in general. Try watch'ing it in a window while doing
other tasks on a machine/VM, e.g.:
watch grep Committed_AS /proc/meminfo
It grows rapidly when memory hogs are launched and shrinks
rapidly when the machine goes idle. So it seems to work
well for at least a first approximation for selfballooning.
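
The same value can be pulled out programmatically. A minimal userspace sketch, assuming the caller has already read /proc/meminfo into a NUL-terminated string; committed_as_kb() is a made-up helper, not an existing API:

```c
#include <stdio.h>
#include <string.h>

/* Return the Committed_AS value in kB from a /proc/meminfo-style
 * buffer, or -1 if the field is missing or malformed. */
static long committed_as_kb(const char *meminfo)
{
	const char *key = "Committed_AS:";
	const char *p = strstr(meminfo, key);
	long kb;

	if (!p)
		return -1;
	/* %ld skips the padding between the key and the number. */
	if (sscanf(p + strlen(key), "%ld", &kb) != 1)
		return -1;
	return kb;
}
```

Polling this once a second would give roughly what the watch command shows.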
Dan
* Re: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 7:01 ` Keir Fraser
2008-05-01 14:44 ` Dan Magenheimer
@ 2008-05-01 16:36 ` Alan Cox
2008-05-01 16:56 ` Keir Fraser
2008-05-01 16:59 ` Dan Magenheimer
1 sibling, 2 replies; 29+ messages in thread
From: Alan Cox @ 2008-05-01 16:36 UTC (permalink / raw)
To: Keir Fraser
Cc: dan.magenheimer@oracle.com, xen-devel@lists.xensource.com,
Ky Srinivasan, Jan Beulich, KurtGarloff
> Can't vm_committed_space grow *bigger* than available memory when using
> swap? It may indicate the 'static' demand for memory
vm_committed_space measures address space commitment for anonymous
objects. It doesn't measure memory or total virtual commitment.
Alan
* Re: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 16:36 ` Alan Cox
@ 2008-05-01 16:56 ` Keir Fraser
2008-05-01 20:05 ` Alan Cox
2008-05-01 16:59 ` Dan Magenheimer
1 sibling, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2008-05-01 16:56 UTC (permalink / raw)
To: Alan Cox
Cc: dan.magenheimer@oracle.com, xen-devel@lists.xensource.com,
Ky Srinivasan, Jan Beulich, KurtGarloff
On 1/5/08 17:36, "Alan Cox" <alan@lxorguk.ukuu.org.uk> wrote:
>> Can't vm_committed_space grow *bigger* than available memory when using
>> swap? It may indicate the 'static' demand for memory
>
> vm_committed_space measures address space commitment for anonymous
> objects. It doesn't measure memory or total virtual commitment.
i.e., things which will occupy swap space (if you have swap configured)?
-- Keir
* Re: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 16:56 ` Keir Fraser
@ 2008-05-01 20:05 ` Alan Cox
0 siblings, 0 replies; 29+ messages in thread
From: Alan Cox @ 2008-05-01 20:05 UTC (permalink / raw)
To: Keir Fraser
Cc: dan.magenheimer@oracle.com, xen-devel@lists.xensource.com,
Ky Srinivasan, Jan Beulich, KurtGarloff
> > vm_committed_space measures address space commitment for anonymous
> > objects. It doesn't measure memory or total virtual commitment.
>
> i.e., things which will occupy swap space (if you have swap configured)?
Yes. It measures swap commitment so that you can run with overcommit
prevention.
* RE: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 16:36 ` Alan Cox
2008-05-01 16:56 ` Keir Fraser
@ 2008-05-01 16:59 ` Dan Magenheimer
2008-05-01 21:18 ` Keir Fraser
1 sibling, 1 reply; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-01 16:59 UTC (permalink / raw)
To: Alan Cox, Keir Fraser
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, KurtGarloff,
Jan Beulich
> It doesn't measure memory or total virtual commitment.
Hmm... in my experiments, it seems to do exactly that.
I wrote a simple "eatmem" program that uses a random
amount of memory (writing to the first byte in each page)
for a random amount of time (printing out the random values),
and watched Committed_AS in /proc/meminfo and it seems
to track well.
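
The eatmem source was not posted; the following is only a guess at its core loop, with hypothetical names. It allocates a buffer and writes one byte per page-sized stride so that the memory is actually backed rather than merely reserved. The random sizes and hold times are left out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate `bytes` of memory and dirty one byte per page-sized stride.
 * Returns the number of writes performed (0 on failure or for an
 * empty request). On success *out holds the buffer; the caller frees it. */
static size_t eat_pages(size_t bytes, size_t page_size, char **out)
{
	char *buf;
	size_t off, touched = 0;

	*out = NULL;
	if (bytes == 0 || page_size == 0)
		return 0;
	buf = malloc(bytes);
	if (!buf)
		return 0;
	for (off = 0; off < bytes; off += page_size) {
		buf[off] = 1;	/* the write forces the page to be backed */
		touched++;
	}
	*out = buf;
	return touched;
}
```

On Linux the page size would normally come from sysconf(_SC_PAGESIZE) rather than a constant.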
* Re: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 16:59 ` Dan Magenheimer
@ 2008-05-01 21:18 ` Keir Fraser
2008-05-01 23:03 ` Alan Cox
0 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2008-05-01 21:18 UTC (permalink / raw)
To: dan.magenheimer@oracle.com, Alan Cox
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, KurtGarloff,
Jan Beulich
On 1/5/08 17:59, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
>> It doesn't measure memory or total virtual commitment.
>
> Hmm... in my experiments, it seems to do exactly that.
> I wrote a simple "eatmem" program that uses a random
> amount of memory (writing to the first byte in each page)
> for a random amount of time (printing out the random values),
> and watched Committed_AS in /proc/meminfo and it seems
> to track well.
This makes sense since it does track swap commitment, and the pages your
program allocates are anonymous and hence would be backed by swap.
But still it seems to me there is a difference between memory commitment and
dynamic memory pressure. And I would say that ballooning should be
influenced by the latter. For example, if your program allocates a random
amount of memory and dirties it all once, that ultimately will take up swap
space long term but it doesn't increase memory pressure unless the pages are
in active use by your application. What matters is the collective working
set across processes.
It might be the case though that, in practice, vm_committed_space is a
reasonable predictor for working set for some common types of workload. Many
applications probably keep their heaps fairly warm and hence in main memory.
-- Keir
* Re: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 21:18 ` Keir Fraser
@ 2008-05-01 23:03 ` Alan Cox
2008-05-01 23:27 ` Dan Magenheimer
0 siblings, 1 reply; 29+ messages in thread
From: Alan Cox @ 2008-05-01 23:03 UTC (permalink / raw)
To: Keir Fraser
Cc: dan.magenheimer@oracle.com, xen-devel@lists.xensource.com,
Ky Srinivasan, Jan Beulich, KurtGarloff
> It might be the case though that, in practice, vm_committed_space is a
> reasonable predictor for working set for some common types of workload. Many
> applications probably keep their heaps fairly warm and hence in main memory.
I am dubious. What vm_committed_space does allow you to do however if you
watch it in Xen is to actually ensure you never balloon out a virtual
machine to the point you make it start killing stuff off.
There are better ways to measure memory pressure - sizes of the various
active and inactive lists etc. OLPC has code for this and a proposed
memory pressure notifier feature that they use to allow user space apps
to cleanup.
* RE: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 23:03 ` Alan Cox
@ 2008-05-01 23:27 ` Dan Magenheimer
2008-05-02 7:05 ` Keir Fraser
0 siblings, 1 reply; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-01 23:27 UTC (permalink / raw)
To: Alan Cox, Keir Fraser
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, KurtGarloff,
Jan Beulich
> > It might be the case though that, in practice, vm_committed_space is a
> > reasonable predictor for working set for some common types of workload. Many
> > applications probably keep their heaps fairly warm and hence in main memory.
Yes, and it also seems to be an "advance" predictor so that
the balloon driver has an opportunity to release pages to
a soon-to-be-memory-hungry domain before it really needs them
(or in case it might).
> I am dubious. What vm_committed_space does allow you to do
> however if you
> watch it in Xen is to actually ensure you never balloon out a virtual
> machine to the point you make it start killing stuff off.
Oddly, this is the opposite of what I observed. When I replaced
Jan's minimum_target calculation in the balloon driver with just
vm_committed_space (accidentally), I got lots of OOM's when I
was shutting down. So vm_committed_space CAN shrink too far
sometimes and needs to be lower-bounded by some other calculated
minimum.
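
One way to express that lower bound, as a sketch with hypothetical names: take the committed space as the demand estimate, raise it to a separately computed floor (for example the result of minimum_target()), and cap it at the domain's maximum.

```c
#include <stdint.h>

/* Clamp a demand estimate between a floor and the domain maximum.
 * All values are in pages and supplied by the caller. */
static uint64_t bounded_target_pages(uint64_t committed_pages,
				      uint64_t floor_pages,
				      uint64_t max_pages)
{
	uint64_t target = committed_pages > floor_pages ?
			  committed_pages : floor_pages;

	return target < max_pages ? target : max_pages;
}
```

During shutdown, when the committed space collapses, the floor takes over and the target stops shrinking.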
> There are better ways to measure memory pressure - sizes of the various
> active and inactive lists etc. OLPC has code for this and a proposed
> memory pressure notifier feature that they use to allow user space apps
> to cleanup.
Definitely agreed that vm_committed_space is just intended to be
a first order approximation. It also has the advantage of having
been around awhile so that the balloon driver will work with many
distros rather than require a TBD memory pressure indicator.
Alan, do you have a pointer to the proposed OLPC code?
Keir, I'm working on some xenbus support and will submit an
updated patch, probably early next week. I also have a horrible
hack to work in a module, but I suspect that will get jettisoned. ;-)
Thanks,
Dan
* Re: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-01 23:27 ` Dan Magenheimer
@ 2008-05-02 7:05 ` Keir Fraser
2008-05-03 13:53 ` Dan Magenheimer
0 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2008-05-02 7:05 UTC (permalink / raw)
To: dan.magenheimer@oracle.com, Alan Cox
Cc: Ky Srinivasan, xen-devel@lists.xensource.com, KurtGarloff,
Jan Beulich
On 2/5/08 00:27, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
>> There are better ways to measure memory pressure - sizes of the various
>> active and inactive lists etc. OLPC has code for this and a proposed
>> memory pressure notifier feature that they use to allow user space apps
>> to cleanup.
>
> Definitely agreed that vm_committed_space is just intended to be
> a first order approximation. It also has the advantage of having
> been around awhile so that the balloon driver will work with many
> distros rather than require a TBD memory pressure indicator.
>
> Alan, do you have a pointer to the proposed OLPC code?
>
> Keir, I'm working on some xenbus support and will submit an
> updated patch, probably early next week. I also have a horrible
> hack to work in a module, but I suspect that will get jettisoned. ;-)
Okay, I will comment when I see it. I'm not sure what design you are working
on: perhaps extract memory stats from the guest via xenbus and implement
ballooning policy in dom0? That is what I would prefer to see. The Novell
ballooning-limit checks were only added as a safety backstop in the guest
itself. Apart from tweaking it to make it less conservative where we are
confident that is safe, any more complicated policy, and of course any
cross-domain global optimisation, doesn't belong in the individual guests.
-- Keir
* RE: [PATCH] linux/balloon: don't allow ballooningdowna domain below a reasonable limit
2008-05-02 7:05 ` Keir Fraser
@ 2008-05-03 13:53 ` Dan Magenheimer
2008-05-03 14:11 ` Keir Fraser
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-03 13:53 UTC (permalink / raw)
To: Keir Fraser, xen-devel@lists.xensource.com
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Okay, I will comment when I see it. I'm not sure what design you are working
> on: perhaps extract memory stats from the guest via xenbus and implement
> ballooning policy in dom0? That is what I would prefer to see. The Novell
> ballooning-limit checks were only added as a safety backstop in the guest
> itself. Apart from tweaking it to make it less conservative where we are
> confident that is safe, any more complicated policy, and of course any
> cross-domain global optimisation, doesn't belong in the
> individual guests.
(cc's removed due to topic drift)
I was planning on providing both Model C and Model D (see below),
but let me know if you will only accept Model C (or even Model B)
and I will adjust accordingly.
===============
MODEL A (current):
Domain 0 sez: "Hey guest A, I have no clue how much memory you have
(though you may or may not have obeyed a previous request) or how much
you need, but change your memory usage to 150MB"
Guest A (silently): "(Silly domain 0 wants me to reduce my memory
usage to 150MB but my minimum is 160MB. Well, I guess I'll do
my best.)"
===============
MODEL B (guest provides info when prodded):
Domain 0 sez: "Hey guest A, tell me how much memory you have and how
much you need"
Guest A sez: "I have 198MB but I really only need 129MB"
Domain 0 sez: "Guest A, reduce your memory usage to 129MB"
Guest A (silently): "(My min is 150MB but I'll do my best)"
Domain 0 sez: "Hey guest A, tell me how much memory you have and how
much you need"
Guest A sez: "I have 150MB but I really only need 129MB"
[etc]
===============
MODEL C (guest provides info regularly):
Guest A sez: "I have 198 MB, I really only need 180MB, and my
minimum is 150MB. I'll provide another update in a second."
[one second later]
Guest A sez: "I have 198 MB, I really only need 129MB, and my
minimum is 150MB. I'll provide another update in a second."
Domain 0 sez: "Guest A, reduce your memory to 150MB"
Guest A (silently): "(ballooning down now to 150MB)"
[one second later]
Guest A sez: "I have 150MB, I really need 250MB and my minimum
is 150MB. I'll provide another update in a second."
Domain 0 sez: "Guest A, increase your memory to 250MB"
===============
MODEL D (autoballooning):
Domain 0 sez: "Hey Guest A, do the right thing with your memory"
Guest A sez: "I have 198MB, I really only need 129MB, and my
minimum is 150MB"
Guest A (silently): "(ballooning down now to 150MB)"
[one second later]
Guest A sez: "I have 150MB, I really need 250MB, and my
minimum is 150MB"
Guest A (silently): "(ballooning up now to 250MB... oops looks
like I can't get that much but I'll take what I can get)"
[one second later]
Guest A sez: "I have 200MB, I really need 300MB, and my
minimum is 150MB"
Guest A (silently): "(ballooning up now to 250MB... oops looks
like I can't get any more... time to start swapping)"
===================================
Thanks... for the memory
I really could use more / My throughput's on the floor
The balloon is flat / My swap disk's fat / I've O-O-M's in store
Overcommitted we are
(with apologies to the late great Bob Hope)
* Re: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-03 13:53 ` Dan Magenheimer
@ 2008-05-03 14:11 ` Keir Fraser
2008-05-03 19:27 ` Dan Magenheimer
2008-05-03 17:32 ` Mark Williamson
2008-05-12 22:19 ` [PATCH] linux/balloon: don't allow ballooning down a " Ian Pratt
2 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2008-05-03 14:11 UTC (permalink / raw)
To: dan.magenheimer@oracle.com, xen-devel@lists.xensource.com
On 3/5/08 14:53, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>> Okay, I will comment when I see it. I'm not sure what design you are working
>> on: perhaps extract memory stats from the guest via xenbus and implement
>> ballooning policy in dom0? That is what I would prefer to see. The Novell
>> ballooning-limit checks were only added as a safety backstop in the guest
>> itself. Apart from tweaking it to make it less conservative where we are
>> confident that is safe, any more complicated policy, and of course any
>> cross-domain global optimisation, doesn't belong in the
>> individual guests.
>
> (cc's removed due to topic drift)
>
> I was planning on providing both Model C and Model D (see below),
> but let me know if you will only accept Model C (or even Model B)
> and I will adjust accordingly.
I don't know that what you are trying to do is, in full generality,
tractable. Who knows how valuable the memory pages belonging to a domU are,
relative to other domains? Just because they are only buffer-cache pages,
for example, may not necessarily mean we want the domU to aggressively page
them out every N seconds.
At least if you can extract some measure of memory pressure from each domU
(e.g., paging frequency, size of active/inactive page lists) dom0 can then
make some global optimisation periodically based on e.g., relative
priorities of domains.
Not that your approach is not applicable for some scenarios. As long as it's
a switchable option, perhaps it is the kind of thing to let users vote on.
-- Keir
* RE: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-03 14:11 ` Keir Fraser
@ 2008-05-03 19:27 ` Dan Magenheimer
0 siblings, 0 replies; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-03 19:27 UTC (permalink / raw)
To: Keir Fraser, xen-devel@lists.xensource.com
> > I was planning on providing both Model C and Model D (see below),
> > but let me know if you will only accept Model C (or even Model B)
> > and I will adjust accordingly.
>
> I don't know that what you are trying to do is, in full generality,
> tractable. Who knows how valuable the memory pages belonging to a domU are,
> relative to other domains? Just because they are only buffer-cache pages,
> for example, may not necessarily mean we want the domU to aggressively page
> them out every N seconds.
>
> At least if you can extract some measure of memory pressure from each domU
> (e.g., paging frequency, size of active/inactive page lists) dom0 can then
> make some global optimisation periodically based on e.g., relative
> priorities of domains.
>
> Not that your approach is not applicable for some scenarios. As long as it's
> a switchable option, perhaps it is the kind of thing to let users vote on.
I agree the general case is not tractable. But I think the
basic concept is useful and an initial implementation may
serve as a good foundation for later tuning. The initial
selfballooning policy is simply "if I have extra memory,
give it up; and if I need memory, ask for it back"; where
"extra" and "need" are admittedly poorly estimated (but are
good enough for some workloads) and the grant-memory-to-
asking-domain policy is first-come-first-served. Any
more complex policy certainly requires more information
to be passed from the guest to domain0.
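The policy described above can be written as one decision step. What follows is only a rough sketch under stated assumptions, not code from the patch: the function name, its parameters and the MB units are hypothetical, "need" stands for whatever estimate the guest uses, and first-come-first-served is modelled by capping growth at whatever the hypervisor has free at that moment.

```c
/* Hypothetical helper, not from the patch: one step of the
 * "give up extra, ask for it back when needed" selfballooning policy.
 *
 * have_mb - memory the guest currently holds
 * need_mb - the guest's (admittedly rough) estimate of what it needs
 * min_mb  - floor below which the guest refuses to balloon
 * free_mb - memory the hypervisor can hand out right now
 *
 * Returns the size the guest should balloon to.
 */
static unsigned long selfballoon_target_mb(unsigned long have_mb,
                                           unsigned long need_mb,
                                           unsigned long min_mb,
                                           unsigned long free_mb)
{
    /* Never aim below the guest's own minimum. */
    unsigned long want = need_mb > min_mb ? need_mb : min_mb;

    if (want <= have_mb)
        return want;                /* extra memory: give it up */

    /* Need more: first come, first served, so take what is free. */
    if (want - have_mb > free_mb)
        return have_mb + free_mb;
    return want;
}
```

With the Model D numbers, a guest holding 198MB that needs 129MB with a 150MB minimum balloons down to 150MB; a guest holding 150MB that needs 250MB while only 50MB is free ends up at 200MB.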
OK, I will submit a patch to cover both Model C and D.
Still tracking down a bug or two...
Dan
* Re: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-03 13:53 ` Dan Magenheimer
2008-05-03 14:11 ` Keir Fraser
@ 2008-05-03 17:32 ` Mark Williamson
2008-05-03 19:43 ` Dan Magenheimer
2008-05-12 22:19 ` [PATCH] linux/balloon: don't allow ballooning down a " Ian Pratt
2 siblings, 1 reply; 29+ messages in thread
From: Mark Williamson @ 2008-05-03 17:32 UTC (permalink / raw)
To: xen-devel, dan.magenheimer@oracle.com; +Cc: Keir Fraser
Part of this work could usefully (if you fancy it) involve keeping Xend
informed about how much memory is being used. Some of your scenarios
probably require this, others don't. Right now, Xend gets a confused idea of
the memory usage in circumstances such as the administrator of a guest
ballooning it down using the /proc interface.
> ===============
> MODEL C (guest provides info regularly):
>
> Guest A sez: "I have 198 MB, I really only need 180MB, and my
> minimum is 150MB. I'll provide another update in a second."
> [one second later]
> Guest A sez: "I have 198 MB, I really only need 129MB, and my
> minimum is 150MB. I'll provide another update in a second."
> Domain 0 sez: "Guest A, reduce your memory to 150MB"
> Guest A (silently): "(ballooning down now to 150MB)"
> [one second later]
> Guest A sez: "I have 150MB, I really need 250MB and my minimum
> is 150MB. I'll provide another update in a second."
> Domain 0 sez: "Guest A, increase your memory to 250MB"
This has the advantage that it can interact with a policy implemented in dom0.
e.g. it's easy to switch on and off automatic memory sizing for different
machines, externally set per-domain minimums and maximums, weight the
priority should multiple domains compete for the same memory, etc.
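As a rough illustration of how a dom0-side policy could sit on top of Model C, here is a minimal sketch. The structures, field names and the clamping order are assumptions made for illustration; nothing here is an existing Xend or xenstore interface.

```c
/* Hypothetical report a guest might publish each interval (Model C). */
struct guest_report {
    unsigned long have_mb;  /* current allocation */
    unsigned long need_mb;  /* guest's own estimate of what it needs */
    unsigned long min_mb;   /* guest's safety floor */
};

/* Hypothetical per-domain limits set externally by the administrator. */
struct domain_limits {
    unsigned long floor_mb;
    unsigned long ceiling_mb;
};

/* Pick the next target for one guest: start from what the guest says
 * it needs, respect the guest's own minimum and the administrator's
 * floor, then cap at the administrator's ceiling. In this sketch the
 * ceiling wins if it conflicts with the guest's minimum. */
static unsigned long dom0_pick_target_mb(const struct guest_report *r,
                                         const struct domain_limits *l)
{
    unsigned long t = r->need_mb;

    if (t < r->min_mb)
        t = r->min_mb;
    if (t < l->floor_mb)
        t = l->floor_mb;
    if (t > l->ceiling_mb)
        t = l->ceiling_mb;
    return t;
}
```

For the quoted Model C exchange, a report of (198MB, 129MB, 150MB) yields a target of 150MB and a later report of (150MB, 250MB, 150MB) yields 250MB, provided the administrator's ceiling allows it. Weighting between competing domains would need a further step that is not shown here.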
> MODEL D (autoballooning):
>
> Domain 0 sez: "Hey Guest A, do the right thing with your memory"
> Guest A sez: "I have 198MB, I really only need 129MB, and my
> minimum is 150MB"
> Guest A (silently): "(ballooning down now to 150MB)"
> [one second later]
> Guest A sez: "I have 150MB, I really need 250MB, and my
> minimum is 150MB"
> Guest A (silently): "(ballooning up now to 250MB... oops looks
> like I can't get that much but I'll take what I can get)"
> [one second later]
> Guest A sez: "I have 200MB, I really need 300MB, and my
> minimum is 150MB"
> Guest A (silently): "(ballooning up now to 250MB... oops looks
> like I can't get any more... time to start swapping)"
Presumably you'd intend to switch this on and off via Xenstore? And perhaps
control the parameters of the autoballooning via Xenstore too?
Cheers,
Mark
--
Push Me Pull You - Distributed SCM tool (http://www.cl.cam.ac.uk/~maw48/pmpu/)
* RE: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-03 17:32 ` Mark Williamson
@ 2008-05-03 19:43 ` Dan Magenheimer
0 siblings, 0 replies; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-03 19:43 UTC (permalink / raw)
To: Mark Williamson, xen-devel@lists.xensource.com; +Cc: Keir Fraser
> > MODEL C (guest provides info regularly):
>
> This has the advantage that it can interact with a policy implemented in dom0.
> e.g. it's easy to switch on and off automatic memory sizing for different
> machines, externally set per-domain minimums and maximums, weight the
> priority should multiple domains compete for the same memory, etc.
Yep, this needs some kind of manager on the domain0 side
for best (better?) results. The disadvantage is that the
communication has latency so by the time domain0 reacts
to a "I have idle memory" message with a "balloon down" message,
the VM might already need ballooning up.
Still this is really the only way to do any reasonably
complex cross-domain memory load balancing.
> > MODEL D (autoballooning):
(oops... I meant selfballooning... autoballooning for
domain0 already exists and seems different so I created
a different term)
> Presumably you'd intend to switch this on and off via Xenstore? And perhaps
> control the parameters of the autoballooning via Xenstore too?
Exactly. I don't have parameters in the initial patch, but
some way of controlling "balloon down slowly but balloon up
quickly" is a reasonable next step.
In the end, I envision some cross between the two where a
guest can react quickly to short-term memory pressure but
domain0 still manages each guest's dynamic memory "bracket"
based on QoS goals. But not in the first patch ;-)
Dan
* RE: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-03 13:53 ` Dan Magenheimer
2008-05-03 14:11 ` Keir Fraser
2008-05-03 17:32 ` Mark Williamson
@ 2008-05-12 22:19 ` Ian Pratt
2008-05-12 23:34 ` Dan Magenheimer
2008-05-13 10:35 ` Markus Hochholdinger
2 siblings, 2 replies; 29+ messages in thread
From: Ian Pratt @ 2008-05-12 22:19 UTC (permalink / raw)
To: Dan Magenheimer, Keir Fraser, xen-devel; +Cc: Ian Pratt
> I was planning on providing both Model C and Model D (see below),
> but let me know if you will only accept Model C (or even Model B)
> and I will adjust accordingly.
I think all these models are wrong :-)
'free' guest memory is often serving useful purposes such as acting as a
buffer cache etc, so ballooning it out unnecessarily is probably not a
good thing. Model D might work better if we had a way of giving up
memory in a way that wasn't 'final' i.e. we could surrender pages back
to xen, but would get a ticket with which we could ask Xen if it still
had the page, and if xen hadn't zeroed them and handed them to someone
else we could get the original page back. Hence, we could treat pages
handed back to xen as a kind of 'unreliable swap device'.
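The 'ticket' idea can be modelled in a few lines. This is a toy sketch only, assuming a small fixed pool of slots on the host side and modelling a surrendered page as a copy; `page_surrender`, `page_reclaim`, `host_reuse_slot` and the generation counter are hypothetical names for illustration, not an existing Xen interface.

```c
#include <string.h>

#define PAGE_BYTES 4096
#define POOL_SLOTS 4

/* A ticket names a slot plus the generation it was stored under. */
struct ticket {
    unsigned int slot;
    unsigned int gen;
};

struct slot {
    unsigned char data[PAGE_BYTES];
    unsigned int gen;   /* bumped whenever the slot changes hands */
    int in_use;
};

static struct slot pool[POOL_SLOTS];

/* Guest surrenders a page. Returns 0 and fills *t on success,
 * -1 if the host has no room (the page is simply gone). */
static int page_surrender(const unsigned char *page, struct ticket *t)
{
    unsigned int i;

    for (i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            memcpy(pool[i].data, page, PAGE_BYTES);
            pool[i].in_use = 1;
            t->slot = i;
            t->gen = pool[i].gen;
            return 0;
        }
    }
    return -1;
}

/* Host hands the slot to someone else: zero it and invalidate
 * any outstanding ticket by bumping the generation. */
static void host_reuse_slot(unsigned int slot)
{
    memset(pool[slot].data, 0, PAGE_BYTES);
    pool[slot].gen++;
    pool[slot].in_use = 0;
}

/* Guest asks for its page back. Returns 0 and copies the data out
 * if the host still has it, -1 if the ticket is stale. */
static int page_reclaim(const struct ticket *t, unsigned char *page)
{
    struct slot *s;

    if (t->slot >= POOL_SLOTS)
        return -1;
    s = &pool[t->slot];
    if (!s->in_use || s->gen != t->gen)
        return -1;
    memcpy(page, s->data, PAGE_BYTES);
    s->in_use = 0;  /* the page is back with the guest */
    s->gen++;       /* tickets are single-use in this sketch */
    return 0;
}
```

A failed `page_reclaim` is the 'unreliable' part: the guest has to treat it like a miss and fetch the contents from wherever they really live, such as disk.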
Even if we had such extensions, I'm not sure that having every domain
eagerly surrender memory to xen is necessarily the best approach. It may
be better to have domains just indicate to domain0 whether they are in a
position to release memory, or whether they could actively benefit from
more, and then have domain0 act as arbiter.
Ian
* RE: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-12 22:19 ` [PATCH] linux/balloon: don't allow ballooning down a " Ian Pratt
@ 2008-05-12 23:34 ` Dan Magenheimer
2008-05-13 10:35 ` Markus Hochholdinger
1 sibling, 0 replies; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-12 23:34 UTC (permalink / raw)
To: Ian Pratt, Keir Fraser, xen-devel@lists.xensource.com
Thanks for the thoughtful reply, Ian!
> > I was planning on providing both Model C and Model D (see below),
> > but let me know if you will only accept Model C (or even Model B)
> > and I will adjust accordingly.
>
> I think all these models are wrong :-)
Yes, well, I think allowing guests to unproductively hoard
idle physical memory is also wrong. :-)
> 'free' guest memory is often serving useful purposes such as
> acting as a buffer cache etc, so ballooning it out unnecessarily
> is probably not a good thing.
Depends on the domain and workload. If the working set of
the domain is much smaller than physical memory, then letting
the domain fill its buffer cache on the odd chance that it
might use one or more of those pages again -- especially when
there are other 'memory-starved' domains -- is probably not
a good thing either.
> Model D might work better if we had a way of giving up
> memory in a way that wasn't 'final' i.e. we could surrender pages back
> to xen, but would get a ticket with which we could ask Xen if it still
> had the page, and if xen hadn't zeroed them and handed them to someone
> else we could get the original page back. Hence, we could treat pages
> handed back to xen as a kind of 'unreliable swap device'.
Cool! Yes, this would be a nice addition and would make a great
research project. I think you are positing that a large percentage
of the pages would be handed back and thus taking them away 'permanently'
is not a good idea. I wouldn't dispute that there are many domains
and workloads where this is true. But I *would* argue that there
are also many domains and workloads where the percentage would
be very small, and that taking them away permanently wouldn't
be noticeable.
So certainly Model D shouldn't be mandated for all domains; but
providing it as an option seems reasonable to me. Also, with
adequate hysteresis built in, we give the domain plenty of time
to change its mind before pressuring it to give away its most
precious buffer cache pages, while enforcing that it give away
its least-likely-to-be-reused pages. So in a sense, a high
downhysteresis value essentially provides the same 'unreliable
swap device' -- but each domain is far better able to implement
a reasonable 'victim' algorithm than is domain0.
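To make the hysteresis idea concrete, here is a minimal sketch, assuming the balloon target moves a fixed fraction of the way toward the goal on every interval. The function and the `up_hysteresis` parameter are hypothetical; only the term "downhysteresis" comes from the text above.

```c
/* Hypothetical sketch: move the balloon target part of the way toward
 * the goal on each interval. A larger divisor means slower movement;
 * a divisor of 1 jumps straight to the goal. Both divisors are
 * assumed to be at least 1. */
static unsigned long hysteresis_step_mb(unsigned long cur_mb,
                                        unsigned long goal_mb,
                                        unsigned int down_hysteresis,
                                        unsigned int up_hysteresis)
{
    if (cur_mb > goal_mb)   /* shrinking: give memory up slowly */
        return cur_mb - (cur_mb - goal_mb) / down_hysteresis;
    if (cur_mb < goal_mb)   /* growing: get memory back quickly */
        return cur_mb + (goal_mb - cur_mb) / up_hysteresis;
    return cur_mb;
}
```

With a down divisor of 8 and an up divisor of 1, a guest at 198MB with a goal of 150MB moves only to 192MB in the first interval, while a guest at 150MB with a goal of 250MB goes straight to 250MB. Because of integer division, this sketch stops shrinking once the remaining gap is smaller than the divisor.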
> Even if we had such extensions, I'm not sure that having every domain
> eagerly surrender memory to xen is necessarily the best
> approach. It may
> be better to have domains just indicate to domain0 whether
> they are in a
> position to release memory, or whether they could actively
> benefit from
> more, and then have domain0 act as arbiter.
The proposed implementation defaults to exactly that: Each domain
now provides "I'm in a position to release memory" or "I released
too much and I need some back" to domain0. Though one can argue
about the quality/accuracy of the data provided in the proposed
implementation, some believe it's a reasonable first approximation
(and, indeed, that it OVER-estimates the true working set).
The selfballooning part simply serves as a quick-and-dirty
first-come-first-served policy that could just as easily be
implemented in domain0 (with latency), but also serves as a nice
standalone overcommit demo which also may be "good enough" for
some real world environments where supporting more simultaneous
virtual machines is more important than a small loss in
responsiveness.
Honestly, I think this is a rather elegant way to get around the
"semantic gap" and double-paging. Indeed, I'm thinking that OS's
that are becoming increasingly virtualization-conscious should
all understand that memory is a valuable shareable resource,
and should provide "idle memory" metrics and APIs to allow
a virtualization system to manage it appropriately. Another good
research topic?
Thanks!
Dan
* Re: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-12 22:19 ` [PATCH] linux/balloon: don't allow ballooning down a " Ian Pratt
2008-05-12 23:34 ` Dan Magenheimer
@ 2008-05-13 10:35 ` Markus Hochholdinger
1 sibling, 0 replies; 29+ messages in thread
From: Markus Hochholdinger @ 2008-05-13 10:35 UTC (permalink / raw)
To: xen-devel
Hi,
On Tuesday, 13 May 2008 00:19, Ian Pratt wrote:
> > I was planning on providing both Model C and Model D (see below),
> > but let me know if you will only accept Model C (or even Model B)
> > and I will adjust accordingly.
> I think all these models are wrong :-)
> 'free' guest memory is often serving useful purposes such as acting as a
> buffer cache etc, so ballooning it out unnecessarily is probably not a
> good thing. Model D might work better if we had a way of giving up
> memory in a way that wasn't 'final' i.e. we could surrender pages back
> to xen, but would get a ticket with which we could ask Xen if it still
> had the page, and if xen hadn't zeroed them and handed them to someone
> else we could get the original page back. Hence, we could treat pages
> handed back to xen as a kind of 'unreliable swap device'.
I'm running an older Xen 3.0.3 and do the ballooning manually. My manual
considerations are based on
* real used memory (used - buffers - cache)
* real free memory (without buffers and without cache)
Real free memory should be given away immediately because it really is unused.
In a normal situation I give a domU twice its real used memory so the domU can
use cache and buffers.
If I need more RAM, I look at how much each domU has swapped. I take more
memory away from domUs that have not used swap, until they begin to swap.
As I have better data (RRD graphs over a long period) than Xen can have, I can
better judge the need over a longer period.
But perhaps something like the following could be used:
* Minimum memory should be at least the real used memory plus 10 percent of
this.
* Maximum memory should be given in the configuration file so that memory can
be adjusted manually. Alternatively, the maximum memory should be twice the
real used memory.
* For memory pressure:
Let Xen adjust between min and max memory. Say we have a value of x between
min and max memory for each domU. Let Xen give the same percentage of each x
to each domU until all memory is fully used or the needed memory (perhaps
for starting a new domU) is free. (The percentage could climb over 100%
if a lot of memory is available and the domUs need it.)
* For a lot of free memory:
Let Xen adjust memory towards maximum memory, but don't let more than 10
percent of total memory be free.
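One possible reading of the rules above, written out as a sketch. It assumes that "x" is the span between a domU's minimum and maximum, that the minimum is the real used memory plus 10 percent, and that the maximum is twice the real used memory; the structure and function names are hypothetical, and the "keep at most 10 percent free" rule is left out.

```c
#include <stddef.h>

struct domu {
    unsigned long used_mb;   /* "real used" memory: used - buffers - cache */
    unsigned long target_mb; /* output: memory to assign to this domU */
};

static unsigned long domu_min_mb(const struct domu *d)
{
    return d->used_mb + d->used_mb / 10;    /* used plus 10 percent */
}

static unsigned long domu_max_mb(const struct domu *d)
{
    return 2 * d->used_mb;                  /* twice the used memory */
}

/* Give every domU its minimum plus the same percentage of its
 * min..max span, with the percentage chosen so the total fits into
 * avail_mb. The percentage may exceed 100 when memory is plentiful.
 * If avail_mb does not even cover the minimums, every domU gets its
 * minimum and the host stays overcommitted. */
static void share_memory(struct domu *d, size_t n, unsigned long avail_mb)
{
    unsigned long sum_min = 0, sum_span = 0, pct = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        sum_min += domu_min_mb(&d[i]);
        sum_span += domu_max_mb(&d[i]) - domu_min_mb(&d[i]);
    }
    if (avail_mb > sum_min && sum_span > 0)
        pct = (avail_mb - sum_min) * 100 / sum_span;

    for (i = 0; i < n; i++) {
        unsigned long span = domu_max_mb(&d[i]) - domu_min_mb(&d[i]);

        d[i].target_mb = domu_min_mb(&d[i]) + span * pct / 100;
    }
}
```

For two domUs using 100MB and 1000MB with 1705MB to distribute, the shared percentage comes out at 50, giving targets of 155MB and 1550MB.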
Using percentages across the domUs should balance better between small
(e.g. 64MB) and big (e.g. 8096MB) domUs. If a lot of swap is used in a domU
and the domU gets more RAM, the domU will automatically move swap to RAM as
more RAM arrives. If a lot of swap is allocated but not heavily used, it will
perhaps not influence performance much, and it will also not influence
memory distribution.
I know I'm no kernel hacker, and perhaps my assumptions are stupid from a
hacker's point of view. But this is what I do manually in real life, and it
works very well for me (for about one hundred domUs), so I hope this could be
helpful for the people coding the auto-ballooning stuff.
--
greetings
eMHa
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* RE: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
@ 2008-05-02 19:30 Jan Beulich
2008-05-02 22:22 ` Dan Magenheimer
2008-05-09 20:38 ` Dan Magenheimer
0 siblings, 2 replies; 29+ messages in thread
From: Jan Beulich @ 2008-05-02 19:30 UTC (permalink / raw)
To: dan.magenheimer; +Cc: Ky Srinivasan, xen-devel, keir.fraser, garloff
>>> "Dan Magenheimer" <dan.magenheimer@oracle.com> 05/01/08 2:00 AM >>>
>OK, I think I am understanding it a bit better:
>the max_pfn part is just adding in some "slop"
>which is a fraction of total main memory which
>is growing smaller (roughly logarithmically)
>as memory grows larger. I'm still not sure about
>the magic values in MB2PAGES though... I'm guessing
>these were gathered somehow experimentally?
I have to defer to the original author here - Kurt?
>With the "divide result of your algorithm by two",
>I was able to get thirteen 512MB domains (idle
>for now) running on a 2GB system.
You mean ballooned-down domains, right? Perhaps using your
self-ballooning change? I have to admit I'm a little nervous
about attempting to overcommit memory in this way in a
production environment, but as long as this depends on a
decision of the operator it's certainly a good option to have.
Jan
* RE: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-02 19:30 [PATCH] linux/balloon: don't allow ballooning down a " Jan Beulich
@ 2008-05-02 22:22 ` Dan Magenheimer
2008-05-03 13:24 ` Goswin von Brederlow
2008-05-09 20:38 ` Dan Magenheimer
1 sibling, 1 reply; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-02 22:22 UTC (permalink / raw)
To: Jan Beulich
Cc: Ky Srinivasan, xen-devel@lists.xensource.com,
keir.fraser@eu.citrix.com, garloff@suse.de
> >OK, I think I am understanding it a bit better:
> >the max_pfn part is just adding in some "slop"
> >which is a fraction of total main memory which
> >is growing smaller (roughly logarithmically)
> >as memory grows larger. I'm still not sure about
> >the magic values in MB2PAGES though... I'm guessing
> >these were gathered somehow experimentally?
>
> I have to defer to the original author here - Kurt?
Eagerly awaiting... In addition to cutting it
in half, I subtracted another 10MB (in a memory=512
domain) and still didn't see any OOMs, though my
testing was admittedly limited.
> >With the "divide result of your algorithm by two",
> >I was able to get thirteen 512MB domains (idle
> >for now) running on a 2GB system.
>
> You mean ballooned-down domains, right? Perhaps using your
> self-ballooning change? I have to admit I'm a little nervous
> about attempting to overcommit memory in this way in a
> production environment, but as long as this depends on a
> decision of the operator it's certainly a good option to have.
Yes, ballooned-down domains. In fact with minimum_target()
modified as above (half of algorithm minus 10MB) and
a variable load (repeating { compile xen; sleep(30<rand<541) }),
I got fifteen 512MB domains running on a 2GB system.
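For reference, the floor being relaxed here can be sketched in MiB rather than pages. The first two branches follow the patch; the last two are inferred from the table in the patch comment and are therefore an assumption, as are the function names and the clamp at zero. The table's 0 -> 0 row is not reproduced by this sketch.

```c
/* Minimum balloon target in MiB for a domain whose maximum is
 * max_mib. Piecewise linear, with the slope halving or quartering
 * as the domain grows. The branches for 512 MiB and above are
 * reconstructed from the table in the patch comment. */
static unsigned long min_target_mib(unsigned long max_mib)
{
    if (max_mib < 128)
        return 8 + (max_mib >> 1);      /* gradient 1/2 */
    if (max_mib < 512)
        return 40 + (max_mib >> 2);     /* gradient 1/4 */
    if (max_mib < 2048)
        return 104 + (max_mib >> 3);    /* gradient 1/8 */
    return 296 + (max_mib >> 5);        /* gradient 1/32 */
}

/* The experiment described above: half of the computed floor minus
 * another 10 MiB. Clamping at zero is an assumption to avoid
 * unsigned wrap-around for very small domains. */
static unsigned long relaxed_min_target_mib(unsigned long max_mib)
{
    unsigned long half = min_target_mib(max_mib) / 2;

    return half > 10 ? half - 10 : 0;
}
```

For a memory=512 domain the unmodified floor is 168 MiB and the relaxed one is 74 MiB, which is consistent with fitting many more ballooned-down 512MB domains into 2GB.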
Agreed that there are many environments where this kind
of ballooning would cause performance problems (or worse).
However, there are certainly some environments (and some
competitive situations ;-) where one might choose to
trade off performance to run more VMs per physical machine.
Dan
* Re: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-02 22:22 ` Dan Magenheimer
@ 2008-05-03 13:24 ` Goswin von Brederlow
0 siblings, 0 replies; 29+ messages in thread
From: Goswin von Brederlow @ 2008-05-03 13:24 UTC (permalink / raw)
To: dan.magenheimer@oracle.com
Cc: Ky Srinivasan, xen-devel@lists.xensource.com,
keir.fraser@eu.citrix.com, Jan Beulich, garloff@suse.de
"Dan Magenheimer" <dan.magenheimer@oracle.com> writes:
>> >OK, I think I am understanding it a bit better:
>> >the max_pfn part is just adding in some "slop"
>> >which is a fraction of total main memory which
>> >is growing smaller (roughly logarithmically)
>> >as memory grows larger. I'm still not sure about
>> >the magic values in MB2PAGES though... I'm guessing
>> >these were gathered somehow experimentally?
>>
>> I have to defer to the original author here - Kurt?
>
> Eagerly awaiting... In addition to cutting it
> in half, I subtracted another 10MB (in a memory=512
> domain) and still didn't see any OOMs, though my
> testing was admittedly limited.
>
>> >With the "divide result of your algorithm by two",
>> >I was able to get thirteen 512MB domains (idle
>> >for now) running on a 2GB system.
>>
>> You mean ballooned-down domains, right? Perhaps using your
>> self-ballooning change? I have to admit I'm a little nervous
>> about attempting to overcommit memory in this way in a
>> production environment, but as long as this depends on a
>> decision of the operator it's certainly a good option to have.
>
> Yes, ballooned-down domains. In fact with minimum_target()
> modified as above (half of algorithm minus 10MB) and
> a variable load (repeating { compile xen; sleep(30<rand<541) }),
> I got fifteen 512MB domains running on a 2GB systems.
>
> Agreed that there are many environments where this kind
> of ballooning would cause performance problems (or worse).
> However, there are certainly some environments (and some
> competitive situations ;-) where one might choose to
> tradeoff performance to run more VMs per physical machine.
>
> Dan
I for example have 6 domains to compile software for 32bit and 64bit
for Debian stable, testing and unstable each. They can benefit from
more memory when they build gcc for example. But 99.9% of the time
they just wait for something to do.
It would be real nice to autoballoon them down when idle. In fact I
would like to auto balloon them so that the domain that swaps least
gets sized down and the domain that swaps most gets the pages. To a
lesser degree cpu and IO usage should also play a part.
Best regards,
Goswin
* RE: [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit
2008-05-02 19:30 [PATCH] linux/balloon: don't allow ballooning down a " Jan Beulich
2008-05-02 22:22 ` Dan Magenheimer
@ 2008-05-09 20:38 ` Dan Magenheimer
1 sibling, 0 replies; 29+ messages in thread
From: Dan Magenheimer @ 2008-05-09 20:38 UTC (permalink / raw)
To: Jan Beulich
Cc: Ky Srinivasan, xen-devel@lists.xensource.com,
keir.fraser@eu.citrix.com, garloff@suse.de
Hmmm... it appears to me that minimum_target() doesn't
work when balloon.c is built as a module (always returns 0).
Can you confirm/deny?
Also, still hoping to get more info on the algorithm
used in minimum_target().
Thanks,
Dan
Thread overview: 29+ messages
2008-04-04 15:07 [PATCH] linux/balloon: don't allow ballooning down a domain below a reasonable limit Jan Beulich
2008-04-05 21:39 ` Keir Fraser
2008-04-07 7:10 ` [PATCH] linux/balloon: don't allow ballooning down " Jan Beulich
2008-04-29 18:35 ` Dan Magenheimer
2008-04-30 6:29 ` [PATCH] linux/balloon: don't allow ballooning down a " Jan Beulich
2008-04-30 16:04 ` Dan Magenheimer
2008-04-30 23:49 ` Dan Magenheimer
2008-05-01 7:01 ` Keir Fraser
2008-05-01 14:44 ` Dan Magenheimer
2008-05-01 16:36 ` Alan Cox
2008-05-01 16:56 ` Keir Fraser
2008-05-01 20:05 ` Alan Cox
2008-05-01 16:59 ` Dan Magenheimer
2008-05-01 21:18 ` Keir Fraser
2008-05-01 23:03 ` Alan Cox
2008-05-01 23:27 ` Dan Magenheimer
2008-05-02 7:05 ` Keir Fraser
2008-05-03 13:53 ` Dan Magenheimer
2008-05-03 14:11 ` Keir Fraser
2008-05-03 19:27 ` Dan Magenheimer
2008-05-03 17:32 ` Mark Williamson
2008-05-03 19:43 ` Dan Magenheimer
2008-05-12 22:19 ` [PATCH] linux/balloon: don't allow ballooning down a " Ian Pratt
2008-05-12 23:34 ` Dan Magenheimer
2008-05-13 10:35 ` Markus Hochholdinger
-- strict thread matches above, loose matches on Subject: below --
2008-05-02 19:30 [PATCH] linux/balloon: don't allow ballooning down a " Jan Beulich
2008-05-02 22:22 ` Dan Magenheimer
2008-05-03 13:24 ` Goswin von Brederlow
2008-05-09 20:38 ` Dan Magenheimer