* [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
@ 2011-09-27 15:03 Dan Magenheimer
2011-09-27 15:57 ` David Vrabel
0 siblings, 1 reply; 5+ messages in thread
From: Dan Magenheimer @ 2011-09-27 15:03 UTC (permalink / raw)
To: Konrad Wilk, linux-kernel; +Cc: xen-devel, David Vrabel, Jeremy Fitzhardinge
Note: This patch is also now in a git tree at:
git://oss.oracle.com/git/djm/tmem.git#selfballoon-fix-v2
The balloon driver's "current_pages" is very different from
totalram_pages. Self-ballooning needs to be driven by
the latter. Also, Committed_AS doesn't account for pages
used by the kernel so enforce a floor for when there are little
or no user-space threads using memory. The floor function
includes a "safety margin" tuneable in case we discover later
that the floor function is still too aggressive in some workloads,
though likely it will not be needed.
Changes since version 1:
- tuneable safety margin added
[v2: konrad.wilk@oracle.com: make safety margin tuneable]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 6ea852e..ceaada6 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -93,6 +93,12 @@ static unsigned int selfballoon_uphysteresis __read_mostly = 1;
/* In HZ, controls frequency of worker invocation. */
static unsigned int selfballoon_interval __read_mostly = 5;
+/*
+ * Scale factor for safety margin for minimum selfballooning target for balloon.
+ * Default adds about 3% (4/128) of max_pfn.
+ */
+static unsigned int selfballoon_safety_margin = 4;
+
static void selfballoon_process(struct work_struct *work);
static DECLARE_DELAYED_WORK(selfballoon_worker, selfballoon_process);
@@ -195,14 +201,13 @@ __setup("selfballooning", xen_selfballooning_setup);
*/
static void selfballoon_process(struct work_struct *work)
{
- unsigned long cur_pages, goal_pages, tgt_pages;
+ unsigned long cur_pages, goal_pages, tgt_pages, floor_pages;
bool reset_timer = false;
if (xen_selfballooning_enabled) {
- cur_pages = balloon_stats.current_pages;
+ cur_pages = totalram_pages;
tgt_pages = cur_pages; /* default is no change */
- goal_pages = percpu_counter_read_positive(&vm_committed_as) +
- balloon_stats.current_pages - totalram_pages;
+ goal_pages = percpu_counter_read_positive(&vm_committed_as);
#ifdef CONFIG_FRONTSWAP
/* allow space for frontswap pages to be repatriated */
if (frontswap_selfshrinking && frontswap_enabled)
@@ -217,7 +222,13 @@ static void selfballoon_process(struct work_struct *work)
((goal_pages - cur_pages) /
selfballoon_uphysteresis);
/* else if cur_pages == goal_pages, no change */
- balloon_set_new_target(tgt_pages);
+ floor_pages = totalreserve_pages +
+ ((roundup_pow_of_two(max_pfn) / 128) *
+ selfballoon_safety_margin);
+ if (tgt_pages < floor_pages)
+ tgt_pages = floor_pages;
+ balloon_set_new_target(tgt_pages +
+ balloon_stats.current_pages - totalram_pages);
reset_timer = true;
}
#ifdef CONFIG_FRONTSWAP
@@ -340,6 +351,30 @@ static ssize_t store_selfballoon_uphys(struct sys_device *dev,
static SYSDEV_ATTR(selfballoon_uphysteresis, S_IRUGO | S_IWUSR,
show_selfballoon_uphys, store_selfballoon_uphys);
+SELFBALLOON_SHOW(selfballoon_safety_margin, "%d\n", selfballoon_safety_margin);
+
+static ssize_t store_selfballoon_safety_margin(struct sys_device *dev,
+ struct sysdev_attribute *attr,
+ const char *buf,
+ size_t count)
+{
+ unsigned long val;
+ int err;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+ err = strict_strtoul(buf, 10, &val);
+ if (err || val == 0 || val > 128)
+ return -EINVAL;
+ selfballoon_safety_margin = val;
+ return count;
+}
+
+static SYSDEV_ATTR(selfballoon_safety_margin, S_IRUGO | S_IWUSR,
+ show_selfballoon_safety_margin,
+ store_selfballoon_safety_margin);
+
+
#ifdef CONFIG_FRONTSWAP
SELFBALLOON_SHOW(frontswap_selfshrinking, "%d\n", frontswap_selfshrinking);
@@ -421,6 +456,7 @@ static struct attribute *selfballoon_attrs[] = {
&attr_selfballoon_interval.attr,
&attr_selfballoon_downhysteresis.attr,
&attr_selfballoon_uphysteresis.attr,
+ &attr_selfballoon_safety_margin.attr,
#ifdef CONFIG_FRONTSWAP
&attr_frontswap_selfshrinking.attr,
&attr_frontswap_hysteresis.attr,
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
2011-09-27 15:03 [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far Dan Magenheimer
@ 2011-09-27 15:57 ` David Vrabel
2011-09-27 16:19 ` Dan Magenheimer
0 siblings, 1 reply; 5+ messages in thread
From: David Vrabel @ 2011-09-27 15:57 UTC (permalink / raw)
To: Dan Magenheimer
Cc: Konrad Wilk, linux-kernel@vger.kernel.org,
xen-devel@lists.xensource.com, Jeremy Fitzhardinge
On 27/09/11 16:03, Dan Magenheimer wrote:
> Note: This patch is also now in a git tree at:
>
> git://oss.oracle.com/git/djm/tmem.git#selfballoon-fix-v2
>
> The balloon driver's "current_pages" is very different from
> totalram_pages. Self-ballooning needs to be driven by
> the latter.
I don't think this part of the change makes any difference. It looks like it
rearranges the maths without changing the end result (other than
slightly increasing the rate of change).
I think this (partial, untested) patch is equivalent:
diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 1b4afd8..b207e89 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -194,14 +194,15 @@ __setup("selfballooning", xen_selfballooning_setup);
*/
static void selfballoon_process(struct work_struct *work)
{
- unsigned long cur_pages, goal_pages, tgt_pages;
+ unsigned long cur_pages, goal_pages, tgt_pages, rsvd_pages, floor_pages;
bool reset_timer = false;
if (xen_selfballooning_enabled) {
cur_pages = balloon_stats.current_pages;
tgt_pages = cur_pages; /* default is no change */
- goal_pages = percpu_counter_read_positive(&vm_committed_as) +
- balloon_stats.current_pages - totalram_pages;
+ rsvd_pages = cur_pages - totalram_pages;
+ goal_pages = percpu_counter_read_positive(&vm_committed_as)
+ + rsvd_pages;
#ifdef CONFIG_FRONTSWAP
/* allow space for frontswap pages to be repatriated */
if (frontswap_selfshrinking && frontswap_enabled)
@@ -216,6 +217,11 @@ static void selfballoon_process(struct work_struct *work)
((goal_pages - cur_pages) /
selfballoon_uphysteresis);
/* else if cur_pages == goal_pages, no change */
+ floor_pages = rsvd_pages + totalreserve_pages +
+ ((roundup_pow_of_two(max_pfn) / 128) *
+ selfballoon_safety_margin);
+ if (tgt_pages < floor_pages)
+ tgt_pages = floor_pages;
balloon_set_new_target(tgt_pages);
reset_timer = true;
}
> Also, Committed_AS doesn't account for pages
> used by the kernel so enforce a floor for when there are little
> or no user-space threads using memory. The floor function
> includes a "safety margin" tuneable in case we discover later
> that the floor function is still too aggressive in some workloads,
> though likely it will not be needed.
The sysfs file isn't documented (but then neither are any of the other
(self-)balloon driver sysfs files).
I don't think "safety_margin" is the right name. Perhaps,
"min_reservation_ratio" or something like that?
David
> Changes since version 1:
> - tuneable safety margin added
>
> [v2: konrad.wilk@oracle.com: make safety margin tuneable]
>
> Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
>
> diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
> index 6ea852e..ceaada6 100644
> --- a/drivers/xen/xen-selfballoon.c
> +++ b/drivers/xen/xen-selfballoon.c
> @@ -93,6 +93,12 @@ static unsigned int selfballoon_uphysteresis __read_mostly = 1;
> /* In HZ, controls frequency of worker invocation. */
> static unsigned int selfballoon_interval __read_mostly = 5;
>
> +/*
> + * Scale factor for safety margin for minimum selfballooning target for balloon.
> + * Default adds about 3% (4/128) of max_pfn.
> + */
> +static unsigned int selfballoon_safety_margin = 4;
> +
> static void selfballoon_process(struct work_struct *work);
> static DECLARE_DELAYED_WORK(selfballoon_worker, selfballoon_process);
>
> @@ -195,14 +201,13 @@ __setup("selfballooning", xen_selfballooning_setup);
> */
> static void selfballoon_process(struct work_struct *work)
> {
> - unsigned long cur_pages, goal_pages, tgt_pages;
> + unsigned long cur_pages, goal_pages, tgt_pages, floor_pages;
> bool reset_timer = false;
>
> if (xen_selfballooning_enabled) {
> - cur_pages = balloon_stats.current_pages;
> + cur_pages = totalram_pages;
> tgt_pages = cur_pages; /* default is no change */
> - goal_pages = percpu_counter_read_positive(&vm_committed_as) +
> - balloon_stats.current_pages - totalram_pages;
> + goal_pages = percpu_counter_read_positive(&vm_committed_as);
> #ifdef CONFIG_FRONTSWAP
> /* allow space for frontswap pages to be repatriated */
> if (frontswap_selfshrinking && frontswap_enabled)
> @@ -217,7 +222,13 @@ static void selfballoon_process(struct work_struct *work)
> ((goal_pages - cur_pages) /
> selfballoon_uphysteresis);
> /* else if cur_pages == goal_pages, no change */
> - balloon_set_new_target(tgt_pages);
> + floor_pages = totalreserve_pages +
> + ((roundup_pow_of_two(max_pfn) / 128) *
> + selfballoon_safety_margin);
> + if (tgt_pages < floor_pages)
> + tgt_pages = floor_pages;
> + balloon_set_new_target(tgt_pages +
> + balloon_stats.current_pages - totalram_pages);
> reset_timer = true;
> }
> #ifdef CONFIG_FRONTSWAP
> @@ -340,6 +351,30 @@ static ssize_t store_selfballoon_uphys(struct sys_device *dev,
> static SYSDEV_ATTR(selfballoon_uphysteresis, S_IRUGO | S_IWUSR,
> show_selfballoon_uphys, store_selfballoon_uphys);
>
> +SELFBALLOON_SHOW(selfballoon_safety_margin, "%d\n", selfballoon_safety_margin);
> +
> +static ssize_t store_selfballoon_safety_margin(struct sys_device *dev,
> + struct sysdev_attribute *attr,
> + const char *buf,
> + size_t count)
> +{
> + unsigned long val;
> + int err;
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
> + err = strict_strtoul(buf, 10, &val);
> + if (err || val == 0 || val > 128)
> + return -EINVAL;
> + selfballoon_safety_margin = val;
> + return count;
> +}
> +
> +static SYSDEV_ATTR(selfballoon_safety_margin, S_IRUGO | S_IWUSR,
> + show_selfballoon_safety_margin,
> + store_selfballoon_safety_margin);
> +
> +
> #ifdef CONFIG_FRONTSWAP
> SELFBALLOON_SHOW(frontswap_selfshrinking, "%d\n", frontswap_selfshrinking);
>
> @@ -421,6 +456,7 @@ static struct attribute *selfballoon_attrs[] = {
> &attr_selfballoon_interval.attr,
> &attr_selfballoon_downhysteresis.attr,
> &attr_selfballoon_uphysteresis.attr,
> + &attr_selfballoon_safety_margin.attr,
> #ifdef CONFIG_FRONTSWAP
> &attr_frontswap_selfshrinking.attr,
> &attr_frontswap_hysteresis.attr,
^ permalink raw reply related [flat|nested] 5+ messages in thread
* RE: [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
2011-09-27 15:57 ` David Vrabel
@ 2011-09-27 16:19 ` Dan Magenheimer
2011-09-27 17:08 ` David Vrabel
0 siblings, 1 reply; 5+ messages in thread
From: Dan Magenheimer @ 2011-09-27 16:19 UTC (permalink / raw)
To: David Vrabel; +Cc: Konrad Wilk, linux-kernel, xen-devel, Jeremy Fitzhardinge
> From: David Vrabel [mailto:david.vrabel@citrix.com]
> Subject: Re: [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
>
> On 27/09/11 16:03, Dan Magenheimer wrote:
> > Note: This patch is also now in a git tree at:
> >
> > git://oss.oracle.com/git/djm/tmem.git#selfballoon-fix-v2
> >
> > The balloon driver's "current_pages" is very different from
> > totalram_pages. Self-ballooning needs to be driven by
> > the latter.
Hi David --
Thanks for the feedback!
> I don't think this part of the change makes any difference. It looks like it
> rearranges the maths without changing the end result (other than
> slightly increasing the rate of change).
> I think this (partial, untested) patch is equivalent:
Actually it does. The key difference is the parameter to the
call to balloon_set_new_target. The math in my patch is
done in "internal" math (e.g. kernel-relevant variables)
and the math in your patch is done in "external" math (e.g.
Xen-relevant variables). Balloon_set_new_target requires
"external" math, so I convert at the point of call.
> The sysfs file isn't documented (but then neither are any of the other
> (self-)balloon driver sysfs files).
Yep. This is a bug fix, so I'm not trying to fix all the sins
of others (and myself). Since you are familiar with the
meaning of all the core balloon driver variables exposed through
sysfs, perhaps you might submit a patch to document them
and/or suggest which ones should be in debugfs instead?
> I don't think "safety_margin" is the right name. Perhaps,
> "min_reservation_ratio" or something like that?
Yeah, I struggled with the name because the concept that
the variable implements is pretty complex. I finally decided
on safety_margin because I think it will draw the attention
of a user who has reason to look for it. I don't expect that
it will be used anyway, but it is there in case I am wrong.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
2011-09-27 16:19 ` Dan Magenheimer
@ 2011-09-27 17:08 ` David Vrabel
2011-09-27 20:25 ` Dan Magenheimer
0 siblings, 1 reply; 5+ messages in thread
From: David Vrabel @ 2011-09-27 17:08 UTC (permalink / raw)
To: Dan Magenheimer
Cc: Konrad Wilk, linux-kernel@vger.kernel.org,
xen-devel@lists.xensource.com, Jeremy Fitzhardinge
On 27/09/11 17:19, Dan Magenheimer wrote:
>> From: David Vrabel [mailto:david.vrabel@citrix.com]
>> Subject: Re: [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
>>
>> On 27/09/11 16:03, Dan Magenheimer wrote:
>>> Note: This patch is also now in a git tree at:
>>>
>>> git://oss.oracle.com/git/djm/tmem.git#selfballoon-fix-v2
>>>
>>> The balloon driver's "current_pages" is very different from
>>> totalram_pages. Self-ballooning needs to be driven by
>>> the latter.
>
> Hi David --
>
> Thanks for the feedback!
>
>> I don't think this part of the change makes any difference. It looks like it
>> rearranges the maths without changing the end result (other than
>> slightly increasing the rate of change).
>> I think this (partial, untested) patch is equivalent:
>
> Actually it does.
Really?
Both patched and unpatched the new target, S, is (eventually):
S = V + F + C - T
where V is vm_committed_as, F is frontswap_curr_pages(), C is
balloon_stats.current_pages, and T = totalram_pages.
Perhaps the refactoring of the maths is a good idea (I don't think so)
but it shouldn't be part of this patch and it shouldn't be described as
a fix.
David
^ permalink raw reply [flat|nested] 5+ messages in thread
* RE: [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
2011-09-27 17:08 ` David Vrabel
@ 2011-09-27 20:25 ` Dan Magenheimer
0 siblings, 0 replies; 5+ messages in thread
From: Dan Magenheimer @ 2011-09-27 20:25 UTC (permalink / raw)
To: David Vrabel; +Cc: Konrad Wilk, linux-kernel, xen-devel, Jeremy Fitzhardinge
> From: David Vrabel [mailto:david.vrabel@citrix.com]
> Cc: Konrad Wilk; linux-kernel@vger.kernel.org; xen-devel@lists.xensource.com; Jeremy Fitzhardinge
> Subject: Re: [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
>
> On 27/09/11 17:19, Dan Magenheimer wrote:
> >> From: David Vrabel [mailto:david.vrabel@citrix.com]
> >> Subject: Re: [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far
> >>
> >> On 27/09/11 16:03, Dan Magenheimer wrote:
> >>> Note: This patch is also now in a git tree at:
> >>>
> >>> git://oss.oracle.com/git/djm/tmem.git#selfballoon-fix-v2
> >>>
> >>> The balloon driver's "current_pages" is very different from
> >>> totalram_pages. Self-ballooning needs to be driven by
> >>> the latter.
> >
> > Hi David --
> >
> > Thanks for the feedback!
> >
> >> I don't think this part of the change makes any difference. It looks like it
> >> rearranges the maths without changing the end result (other than
> >> slightly increasing the rate of change).
> >> I think this (partial, untested) patch is equivalent:
> >
> > Actually it does.
>
> Really?
>
> Both patched and unpatched the new target, S, is (eventually):
>
> S = V + F + C - T
>
> where V is vm_committed_as, F is frontswap_curr_pages(), C is
> balloon_stats.current_pages, and T = totalram_pages.
Sorry, in my haste to shoot off a quick reply while my mind
was somewhere else, I see my reply was poor and misleading.
Yes, "S", the value passed to balloon_set_new_target(), is
the same in most cases. However, it is V+F, not S, that must
be compared against M (= the floor function); the "target"
of selfballooning, the value that the kernel cares about
(not the value that Xen cares about) is max(V+F,M). This
gets converted to Xen-cares-about, IOW:
S = max(V+F,M) + C - T
where S is passed to balloon_set_new_target.
> Perhaps the refactoring of the maths is a good idea (I don't think so)
> but it shouldn't be part of this patch and it shouldn't be described as
> a fix.
The refactored version makes sense now from a kernel perspective,
though I can see how it might be confusing from a Xen perspective,
especially to a balloon driver expert such as yourself.
It is most definitely a fix because the formula is different
and OOMs that previously happened no longer happen. I don't
think the commit comment describes the *refactoring* as a fix,
just says that self-ballooning needs to be driven by kernel-
cares-about values (even if it has to interface to the
Xen balloon driver with a Xen-cares-about parameter).
Hopefully that makes more sense?
Dan
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2011-09-27 20:26 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-09-27 15:03 [PATCH v2] xen: Fix selfballooning and ensure it doesn't go too far Dan Magenheimer
2011-09-27 15:57 ` David Vrabel
2011-09-27 16:19 ` Dan Magenheimer
2011-09-27 17:08 ` David Vrabel
2011-09-27 20:25 ` Dan Magenheimer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).