* [PATCH] x86: expose XENMEM_get_pod_target to subject domain
@ 2014-02-24 13:31 Jan Beulich
2014-02-24 17:24 ` George Dunlap
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2014-02-24 13:31 UTC (permalink / raw)
To: xen-devel; +Cc: dgdegra, Keir Fraser
[-- Attachment #1: Type: text/plain, Size: 2799 bytes --]
Not having got any satisfactory suggestions on the inquiry on how to
determine the amount a PoD guest needs to balloon down by (see
http://lists.xenproject.org/archives/html/xen-devel/2014-01/msg01524.html
and the thread following it), expose XENMEM_get_pod_target such that
the guest can use it for this purpose.
Also leverage some cleanup potential resulting from this change.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4800,7 +4800,6 @@ long arch_memory_op(int op, XEN_GUEST_HA
{
xen_pod_target_t target;
struct domain *d;
- struct p2m_domain *p2m;
if ( copy_from_guest(&target, arg, 1) )
return -EFAULT;
@@ -4810,23 +4809,17 @@ long arch_memory_op(int op, XEN_GUEST_HA
return -ESRCH;
if ( op == XENMEM_set_pod_target )
- rc = xsm_set_pod_target(XSM_PRIV, d);
- else
- rc = xsm_get_pod_target(XSM_PRIV, d);
-
- if ( rc != 0 )
- goto pod_target_out_unlock;
-
- if ( op == XENMEM_set_pod_target )
{
- if ( target.target_pages > d->max_pages )
- {
+ rc = xsm_set_pod_target(XSM_PRIV, d);
+ if ( rc )
+ /* nothing */;
+ else if ( target.target_pages > d->max_pages )
rc = -EINVAL;
- goto pod_target_out_unlock;
- }
-
- rc = p2m_pod_set_mem_target(d, target.target_pages);
+ else
+ rc = p2m_pod_set_mem_target(d, target.target_pages);
}
+ else
+ rc = xsm_get_pod_target(XSM_TARGET, d);
if ( rc == -EAGAIN )
{
@@ -4835,19 +4828,16 @@ long arch_memory_op(int op, XEN_GUEST_HA
}
else if ( rc >= 0 )
{
- p2m = p2m_get_hostp2m(d);
+ const struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
target.tot_pages = d->tot_pages;
target.pod_cache_pages = p2m->pod.count;
target.pod_entries = p2m->pod.entry_count;
if ( __copy_to_guest(arg, &target, 1) )
- {
rc= -EFAULT;
- goto pod_target_out_unlock;
- }
}
- pod_target_out_unlock:
rcu_unlock_domain(d);
return rc;
}
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -307,7 +307,7 @@ static XSM_INLINE char *xsm_show_securit
static XSM_INLINE int xsm_get_pod_target(XSM_DEFAULT_ARG struct domain *d)
{
- XSM_ASSERT_ACTION(XSM_PRIV);
+ XSM_ASSERT_ACTION(XSM_TARGET);
return xsm_default_action(action, current->domain, d);
}
[-- Attachment #2: expose-get_pod_target.patch --]
[-- Type: text/plain, Size: 2847 bytes --]
x86: expose XENMEM_get_pod_target to subject domain
Not having got any satisfactory suggestions on the inquiry on how to
determine the amount a PoD guest needs to balloon down by (see
http://lists.xenproject.org/archives/html/xen-devel/2014-01/msg01524.html
and the thread following it), expose XENMEM_get_pod_target such that
the guest can use it for this purpose.
Also leverage some cleanup potential resulting from this change.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4800,7 +4800,6 @@ long arch_memory_op(int op, XEN_GUEST_HA
{
xen_pod_target_t target;
struct domain *d;
- struct p2m_domain *p2m;
if ( copy_from_guest(&target, arg, 1) )
return -EFAULT;
@@ -4810,23 +4809,17 @@ long arch_memory_op(int op, XEN_GUEST_HA
return -ESRCH;
if ( op == XENMEM_set_pod_target )
- rc = xsm_set_pod_target(XSM_PRIV, d);
- else
- rc = xsm_get_pod_target(XSM_PRIV, d);
-
- if ( rc != 0 )
- goto pod_target_out_unlock;
-
- if ( op == XENMEM_set_pod_target )
{
- if ( target.target_pages > d->max_pages )
- {
+ rc = xsm_set_pod_target(XSM_PRIV, d);
+ if ( rc )
+ /* nothing */;
+ else if ( target.target_pages > d->max_pages )
rc = -EINVAL;
- goto pod_target_out_unlock;
- }
-
- rc = p2m_pod_set_mem_target(d, target.target_pages);
+ else
+ rc = p2m_pod_set_mem_target(d, target.target_pages);
}
+ else
+ rc = xsm_get_pod_target(XSM_TARGET, d);
if ( rc == -EAGAIN )
{
@@ -4835,19 +4828,16 @@ long arch_memory_op(int op, XEN_GUEST_HA
}
else if ( rc >= 0 )
{
- p2m = p2m_get_hostp2m(d);
+ const struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
target.tot_pages = d->tot_pages;
target.pod_cache_pages = p2m->pod.count;
target.pod_entries = p2m->pod.entry_count;
if ( __copy_to_guest(arg, &target, 1) )
- {
rc= -EFAULT;
- goto pod_target_out_unlock;
- }
}
- pod_target_out_unlock:
rcu_unlock_domain(d);
return rc;
}
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -307,7 +307,7 @@ static XSM_INLINE char *xsm_show_securit
static XSM_INLINE int xsm_get_pod_target(XSM_DEFAULT_ARG struct domain *d)
{
- XSM_ASSERT_ACTION(XSM_PRIV);
+ XSM_ASSERT_ACTION(XSM_TARGET);
return xsm_default_action(action, current->domain, d);
}
[-- Attachment #3: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-02-24 13:31 [PATCH] x86: expose XENMEM_get_pod_target to subject domain Jan Beulich
@ 2014-02-24 17:24 ` George Dunlap
2014-02-25 8:03 ` Jan Beulich
0 siblings, 1 reply; 12+ messages in thread
From: George Dunlap @ 2014-02-24 17:24 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Daniel De Graaf, Keir Fraser
On Mon, Feb 24, 2014 at 1:31 PM, Jan Beulich <JBeulich@suse.com> wrote:
> Not having got any satisfactory suggestions on the inquiry on how to
> determine the amount a PoD guest needs to balloon down by (see
> http://lists.xenproject.org/archives/html/xen-devel/2014-01/msg01524.html
> and the thread following it), expose XENMEM_get_pod_target such that
> the guest can use it for this purpose.
So in theory the bug you're seeing has nothing to do with PoD -- it
just has to do with a different interpretation that the balloon driver
and Xen may have as to what "target" means. Is that right? The only
difference is that in the PoD case, not ballooning down enough can be
deadly to the domain; whereas in the non-PoD case, the worst that can
happen is that the toolstack has less memory left over on the host
than it may have expected.
I don't like the idea of exposing specific PoD information to the
guest -- PoD should be completely transparent to the guest. If we
make it PoD-specific, we may end up with a different sized domain
depending on whether we booted with PoD mode or not.
Is the real problem that there's no way to determine the number of
potentially non-empty pfn ranges? If we exposed the number of
non-empty p2m ranges (either ram or PoD), then the guest could compare
that to the target and balloon down as necessary, no?
> Also leverage some cleanup potential resulting from this change.
Cleanup should generally be done in separate patches, so that one
change can be reviewed at a time.
-George
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-02-24 17:24 ` George Dunlap
@ 2014-02-25 8:03 ` Jan Beulich
2014-02-26 9:25 ` Ian Campbell
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2014-02-25 8:03 UTC (permalink / raw)
To: George Dunlap; +Cc: xen-devel, Daniel De Graaf, Keir Fraser
>>> On 24.02.14 at 18:24, George Dunlap <dunlapg@umich.edu> wrote:
> On Mon, Feb 24, 2014 at 1:31 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> Not having got any satisfactory suggestions on the inquiry on how to
>> determine the amount a PoD guest needs to balloon down by (see
>> http://lists.xenproject.org/archives/html/xen-devel/2014-01/msg01524.html
>> and the thread following it), expose XENMEM_get_pod_target such that
>> the guest can use it for this purpose.
>
> So in theory the bug you're seeing has nothing to do with PoD -- it
> just has to do with a different interpretation that the balloon driver
> and Xen may have as to what "target" means. Is that right? The only
> difference is that in the PoD case, not ballooning down enough can be
> deadly to the domain; whereas in the non-PoD case, the worst that can
> happen is that the toolstack has less memory left over on the host
> than it may have expected.
To me this is very much a dependency on whether PoD is in use,
precisely because of the deadliness of ballooning out too little in
that case.
> I don't like the idea of exposing specific PoD information to the
> guest -- PoD should be completely transparent to the guest. If we
> make it PoD-specific, we may end up with a different sized domain
> depending on whether we booted with PoD mode or not.
>
> Is the real problem that there's no way to determine the number of
> potentially non-empty pfn ranges? If we exposed the number of
> non-empty p2m ranges (either ram or PoD), then the guest could compare
> that to the target and balloon down as necessary, no?
The only thing the guest really cares about from this hypercall is
the result of
.tot_pages + .pod_entries - .pod_cache_pages
so exposing anything to the guest that allows it to calculate this
value would be sufficient. It just seems odd to me to invent
something new if we already have what we need, as long as
exposing that information as individual pieces (instead of the
accumulated result) is not a security risk.
Anyway, I also consider it odd to complain about this now when
the referenced discussion has happened weeks ago, with no
useful result.
>> Also leverage some cleanup potential resulting from this change.
>
> Cleanup should generally be done in separate patches, so that one
> change can be reviewed at a time.
Honestly I think that's a matter of taste - I personally dislike leaving
unclean code in place when the cleanup isn't harmful to the actual
change's understandability. Of course larger cleanup actions should
go by themselves.
Jan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-02-25 8:03 ` Jan Beulich
@ 2014-02-26 9:25 ` Ian Campbell
2014-02-26 9:46 ` Jan Beulich
0 siblings, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2014-02-26 9:25 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, George Dunlap, Keir Fraser, Daniel De Graaf
On Tue, 2014-02-25 at 08:03 +0000, Jan Beulich wrote:
> Anyway, I also consider it odd to complain about this now when
> the referenced discussion has happened weeks ago, with no
> useful result.
IIRC George was on vacation and/or travelling for a conference back then
(or maybe he was just back in town and buried in email, with the same
affect)
Anyway, since George knows PoD better than any of us I think it is worth
revisiting things with his input.
I would certainly prefer a solution which exposed some more semantically
meaningful information (from the guest's PoV) which lets it do the right
thing rather than just exposing the PoD internal accounting directly,
since that seems like rather an implementation detail to me. e.g. what
if we decide to make the PoD cache shared between a bunch of related
guests (totally hypothetical) or back it with the normal page sharing or
paging mechanisms? We don't want those sorts of changes to create a
visible guest ABI difference.
Ian.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-02-26 9:25 ` Ian Campbell
@ 2014-02-26 9:46 ` Jan Beulich
2014-02-26 10:28 ` Ian Campbell
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2014-02-26 9:46 UTC (permalink / raw)
To: Ian Campbell, George Dunlap; +Cc: xen-devel, Daniel De Graaf, Keir Fraser
>>> On 26.02.14 at 10:25, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2014-02-25 at 08:03 +0000, Jan Beulich wrote:
>> Anyway, I also consider it odd to complain about this now when
>> the referenced discussion has happened weeks ago, with no
>> useful result.
>
> IIRC George was on vacation and/or travelling for a conference back then
> (or maybe he was just back in town and buried in email, with the same
> affect)
>
> Anyway, since George knows PoD better than any of us I think it is worth
> revisiting things with his input.
>
> I would certainly prefer a solution which exposed some more semantically
> meaningful information (from the guest's PoV) which lets it do the right
> thing rather than just exposing the PoD internal accounting directly,
> since that seems like rather an implementation detail to me. e.g. what
> if we decide to make the PoD cache shared between a bunch of related
> guests (totally hypothetical) or back it with the normal page sharing or
> paging mechanisms? We don't want those sorts of changes to create a
> visible guest ABI difference.
So what's the counter proposal then?
Jan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-02-26 9:46 ` Jan Beulich
@ 2014-02-26 10:28 ` Ian Campbell
2014-04-01 16:55 ` George Dunlap
0 siblings, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2014-02-26 10:28 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, George Dunlap, Keir Fraser, Daniel De Graaf
On Wed, 2014-02-26 at 09:46 +0000, Jan Beulich wrote:
> >>> On 26.02.14 at 10:25, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Tue, 2014-02-25 at 08:03 +0000, Jan Beulich wrote:
> >> Anyway, I also consider it odd to complain about this now when
> >> the referenced discussion has happened weeks ago, with no
> >> useful result.
> >
> > IIRC George was on vacation and/or travelling for a conference back then
> > (or maybe he was just back in town and buried in email, with the same
> > affect)
> >
> > Anyway, since George knows PoD better than any of us I think it is worth
> > revisiting things with his input.
> >
> > I would certainly prefer a solution which exposed some more semantically
> > meaningful information (from the guest's PoV) which lets it do the right
> > thing rather than just exposing the PoD internal accounting directly,
> > since that seems like rather an implementation detail to me. e.g. what
> > if we decide to make the PoD cache shared between a bunch of related
> > guests (totally hypothetical) or back it with the normal page sharing or
> > paging mechanisms? We don't want those sorts of changes to create a
> > visible guest ABI difference.
>
> So what's the counter proposal then?
My expectation was that was why George is asking questions... Since he
knows PoD I'm hoping he will have some ideas.
Ian.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-02-26 10:28 ` Ian Campbell
@ 2014-04-01 16:55 ` George Dunlap
2014-04-01 17:05 ` George Dunlap
0 siblings, 1 reply; 12+ messages in thread
From: George Dunlap @ 2014-04-01 16:55 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel, Daniel De Graaf, Keir Fraser, Jan Beulich
On Wed, Feb 26, 2014 at 10:28 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2014-02-26 at 09:46 +0000, Jan Beulich wrote:
>> >>> On 26.02.14 at 10:25, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > On Tue, 2014-02-25 at 08:03 +0000, Jan Beulich wrote:
>> >> Anyway, I also consider it odd to complain about this now when
>> >> the referenced discussion has happened weeks ago, with no
>> >> useful result.
>> >
>> > IIRC George was on vacation and/or travelling for a conference back then
>> > (or maybe he was just back in town and buried in email, with the same
>> > affect)
>> >
>> > Anyway, since George knows PoD better than any of us I think it is worth
>> > revisiting things with his input.
>> >
>> > I would certainly prefer a solution which exposed some more semantically
>> > meaningful information (from the guest's PoV) which lets it do the right
>> > thing rather than just exposing the PoD internal accounting directly,
>> > since that seems like rather an implementation detail to me. e.g. what
>> > if we decide to make the PoD cache shared between a bunch of related
>> > guests (totally hypothetical) or back it with the normal page sharing or
>> > paging mechanisms? We don't want those sorts of changes to create a
>> > visible guest ABI difference.
>>
>> So what's the counter proposal then?
>
> My expectation was that was why George is asking questions... Since he
> knows PoD I'm hoping he will have some ideas.
Sorry, this thread managed to fall out of my normal mechanism to mark
a thread as important to come back to.
So the basic issue, regardless of PoD, is that the guest is told to
balloon down to X, by handing back (T - X) pages to Xen. But it
doesn't actually know what T is, for a number of reasons; mainly due
to having holes in lowmem for vga ram and things like that. When PoD
is enabled, this may cause a guest to end up killed as a result of
accidentally not ballooning down enough; but even on non-PoD systems,
this would lead to a small mismatch between the toolstack's idea of
how much a VM was using (or should be using) and how much the guest is
using.
It looks like libxl anyway already writes "static-max" into xenstore,
right next to the balloon target. Is there any reason we can't use
the toolstack-written static-max to calculate how big a balloon we
need to inflate?
If it's not accruate or consistent at the moment, we should be able to
make it consistent going forward.
-George
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-04-01 16:55 ` George Dunlap
@ 2014-04-01 17:05 ` George Dunlap
2014-04-02 8:52 ` Jan Beulich
0 siblings, 1 reply; 12+ messages in thread
From: George Dunlap @ 2014-04-01 17:05 UTC (permalink / raw)
To: Ian Campbell
Cc: Keir Fraser, Stefano Stabellini, Jan Beulich, xen-devel,
Daniel De Graaf
On Tue, Apr 1, 2014 at 5:55 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> On Wed, Feb 26, 2014 at 10:28 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> On Wed, 2014-02-26 at 09:46 +0000, Jan Beulich wrote:
>>> >>> On 26.02.14 at 10:25, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>>> > On Tue, 2014-02-25 at 08:03 +0000, Jan Beulich wrote:
>>> >> Anyway, I also consider it odd to complain about this now when
>>> >> the referenced discussion has happened weeks ago, with no
>>> >> useful result.
>>> >
>>> > IIRC George was on vacation and/or travelling for a conference back then
>>> > (or maybe he was just back in town and buried in email, with the same
>>> > affect)
>>> >
>>> > Anyway, since George knows PoD better than any of us I think it is worth
>>> > revisiting things with his input.
>>> >
>>> > I would certainly prefer a solution which exposed some more semantically
>>> > meaningful information (from the guest's PoV) which lets it do the right
>>> > thing rather than just exposing the PoD internal accounting directly,
>>> > since that seems like rather an implementation detail to me. e.g. what
>>> > if we decide to make the PoD cache shared between a bunch of related
>>> > guests (totally hypothetical) or back it with the normal page sharing or
>>> > paging mechanisms? We don't want those sorts of changes to create a
>>> > visible guest ABI difference.
>>>
>>> So what's the counter proposal then?
>>
>> My expectation was that was why George is asking questions... Since he
>> knows PoD I'm hoping he will have some ideas.
>
> Sorry, this thread managed to fall out of my normal mechanism to mark
> a thread as important to come back to.
>
> So the basic issue, regardless of PoD, is that the guest is told to
> balloon down to X, by handing back (T - X) pages to Xen. But it
> doesn't actually know what T is, for a number of reasons; mainly due
> to having holes in lowmem for vga ram and things like that. When PoD
> is enabled, this may cause a guest to end up killed as a result of
> accidentally not ballooning down enough; but even on non-PoD systems,
> this would lead to a small mismatch between the toolstack's idea of
> how much a VM was using (or should be using) and how much the guest is
> using.
>
> It looks like libxl anyway already writes "static-max" into xenstore,
> right next to the balloon target. Is there any reason we can't use
> the toolstack-written static-max to calculate how big a balloon we
> need to inflate?
>
> If it's not accruate or consistent at the moment, we should be able to
> make it consistent going forward.
One potential problem I see at the moment is that during domain
creation, "static-max" is set to maxmem, but "target" is set to
"memory-video_memkb"; but the actual number of pages will be memory -
VGA_HOLE, which probably won't match either one. Then in
libxl_set_memory_target(), it's just straight-up set to "target".
This may lead to a rather unexpected situation where after starting
with memory=2048, setting the target to 1024, and then setting it back
to 2048, the guest has a different amount of memory than it started
with. (Or perhaps the balloon driver will be running up against the
memory ceiling.)
It would be ideal if the balloon driver could just release (static-max
- target) memory and be assured that everything would be OK.
-George
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-04-01 17:05 ` George Dunlap
@ 2014-04-02 8:52 ` Jan Beulich
2014-04-02 14:24 ` George Dunlap
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2014-04-02 8:52 UTC (permalink / raw)
To: George Dunlap
Cc: Keir Fraser, Ian Campbell, Stefano Stabellini, xen-devel,
Daniel De Graaf
>>> On 01.04.14 at 19:05, <George.Dunlap@eu.citrix.com> wrote:
> On Tue, Apr 1, 2014 at 5:55 PM, George Dunlap
> <George.Dunlap@eu.citrix.com> wrote:
>> So the basic issue, regardless of PoD, is that the guest is told to
>> balloon down to X, by handing back (T - X) pages to Xen. But it
>> doesn't actually know what T is, for a number of reasons; mainly due
>> to having holes in lowmem for vga ram and things like that. When PoD
>> is enabled, this may cause a guest to end up killed as a result of
>> accidentally not ballooning down enough; but even on non-PoD systems,
>> this would lead to a small mismatch between the toolstack's idea of
>> how much a VM was using (or should be using) and how much the guest is
>> using.
>>
>> It looks like libxl anyway already writes "static-max" into xenstore,
>> right next to the balloon target. Is there any reason we can't use
>> the toolstack-written static-max to calculate how big a balloon we
>> need to inflate?
>>
>> If it's not accruate or consistent at the moment, we should be able to
>> make it consistent going forward.
>
> One potential problem I see at the moment is that during domain
> creation, "static-max" is set to maxmem, but "target" is set to
> "memory-video_memkb"; but the actual number of pages will be memory -
> VGA_HOLE, which probably won't match either one. Then in
> libxl_set_memory_target(), it's just straight-up set to "target".
> This may lead to a rather unexpected situation where after starting
> with memory=2048, setting the target to 1024, and then setting it back
> to 2048, the guest has a different amount of memory than it started
> with. (Or perhaps the balloon driver will be running up against the
> memory ceiling.)
>
> It would be ideal if the balloon driver could just release (static-max
> - target) memory and be assured that everything would be OK.
Another problem is that "target" (or in fact anything in xenstore) isn't
tightly coupled with what the hypervisor knows for the domain. Yet
the main goal (preventing to be killed due to ballooning down to much)
depends on only the hypervisor's knowledge, and hence I think that
only the hypervisor should be consulted to find out.
Jan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-04-02 8:52 ` Jan Beulich
@ 2014-04-02 14:24 ` George Dunlap
2014-04-02 15:02 ` Jan Beulich
0 siblings, 1 reply; 12+ messages in thread
From: George Dunlap @ 2014-04-02 14:24 UTC (permalink / raw)
To: Jan Beulich
Cc: Keir Fraser, Ian Campbell, Stefano Stabellini, xen-devel,
Daniel De Graaf
On 04/02/2014 09:52 AM, Jan Beulich wrote:
>>>> On 01.04.14 at 19:05, <George.Dunlap@eu.citrix.com> wrote:
>> On Tue, Apr 1, 2014 at 5:55 PM, George Dunlap
>> <George.Dunlap@eu.citrix.com> wrote:
>>> So the basic issue, regardless of PoD, is that the guest is told to
>>> balloon down to X, by handing back (T - X) pages to Xen. But it
>>> doesn't actually know what T is, for a number of reasons; mainly due
>>> to having holes in lowmem for vga ram and things like that. When PoD
>>> is enabled, this may cause a guest to end up killed as a result of
>>> accidentally not ballooning down enough; but even on non-PoD systems,
>>> this would lead to a small mismatch between the toolstack's idea of
>>> how much a VM was using (or should be using) and how much the guest is
>>> using.
>>>
>>> It looks like libxl anyway already writes "static-max" into xenstore,
>>> right next to the balloon target. Is there any reason we can't use
>>> the toolstack-written static-max to calculate how big a balloon we
>>> need to inflate?
>>>
>>> If it's not accruate or consistent at the moment, we should be able to
>>> make it consistent going forward.
>> One potential problem I see at the moment is that during domain
>> creation, "static-max" is set to maxmem, but "target" is set to
>> "memory-video_memkb"; but the actual number of pages will be memory -
>> VGA_HOLE, which probably won't match either one. Then in
>> libxl_set_memory_target(), it's just straight-up set to "target".
>> This may lead to a rather unexpected situation where after starting
>> with memory=2048, setting the target to 1024, and then setting it back
>> to 2048, the guest has a different amount of memory than it started
>> with. (Or perhaps the balloon driver will be running up against the
>> memory ceiling.)
>>
>> It would be ideal if the balloon driver could just release (static-max
>> - target) memory and be assured that everything would be OK.
> Another problem is that "target" (or in fact anything in xenstore) isn't
> tightly coupled with what the hypervisor knows for the domain. Yet
> the main goal (preventing to be killed due to ballooning down to much)
> depends on only the hypervisor's knowledge, and hence I think that
> only the hypervisor should be consulted to find out.
I understand the sentiment; but as I said, the real problem is a lack of
clarity about what exactly the toolstack is asking the VM to do. This
is obviously a particular problem in the case of PoD, but it's still a
problem even for non-PoD guests; it's just that the outcomes are less
severe. If we solve the general problem, we'll solve the PoD problem.
The other thing is that the whole point of PoD is to be transparent to
the guest. Xen is already careful in how it handles post-creation
adjustments to the PoD size -- always increasing the PoD cache, never
decreasing it -- specifically so that the guest doesn't need to know.
What should really happen at some point is for PoD to just become a
special case of swapping. In a sense, that's almost the same issue: you
could have a situation where the toolstack asks a guest to balloon down,
and the guest does so; but not as low as the toolstack expected, so the
toolstack labels the guest as "misbehaving" and tells Xen to swap out
pages until it reaches what the toolstack thinks is the correct value.
The guest won't crash, but performance will be impacted.
The target in xenstore could be made tightly coupled: if the toolstack
always wrote into xenstore exactly what it reported to Xen, then it
would be the same.
Alternately, since now Xen is involved with ballooning targets --
whether you're doing PoD or swapping -- maybe we should consider moving
the "target" into Xen entirely. Then there would be no chance for
"drift", as Xen and the balloon driver would be working from the same
data. This would be basically repurposing the get/set pod_target
hypercall to something specifically for ballooning.
-George
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-04-02 14:24 ` George Dunlap
@ 2014-04-02 15:02 ` Jan Beulich
2014-04-02 15:10 ` George Dunlap
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2014-04-02 15:02 UTC (permalink / raw)
To: George Dunlap
Cc: KeirFraser, Ian Campbell, Stefano Stabellini, xen-devel,
Daniel De Graaf
>>> On 02.04.14 at 16:24, <george.dunlap@eu.citrix.com> wrote:
> I understand the sentiment; but as I said, the real problem is a lack of
> clarity about what exactly the toolstack is asking the VM to do. This
> is obviously a particular problem in the case of PoD, but it's still a
> problem even for non-PoD guests; it's just that the outcomes are less
> severe. If we solve the general problem, we'll solve the PoD problem.
>
> The other thing is that the whole point of PoD is to be transparent to
> the guest. Xen is already careful in how it handles post-creation
> adjustments to the PoD size -- always increasing the PoD cache, never
> decreasing it -- specifically so that the guest doesn't need to know.
>
> What should really happen at some point is for PoD to just become a
> special case of swapping. In a sense, that's almost the same issue: you
> could have a situation where the toolstack asks a guest to balloon down,
> and the guest does so; but not as low as the toolstack expected, so the
> toolstack labels the guest as "misbehaving" and tells Xen to swap out
> pages until it reaches what the toolstack thinks is the correct value.
> The guest won't crash, but performance will be impacted.
>
> The target in xenstore could be made tightly coupled: if the toolstack
> always wrote into xenstore exactly what it reported to Xen, then it
> would be the same.
>
> Alternately, since now Xen is involved with ballooning targets --
> whether you're doing PoD or swapping -- maybe we should consider moving
> the "target" into Xen entirely. Then there would be no chance for
> "drift", as Xen and the balloon driver would be working from the same
> data. This would be basically repurposing the get/set pod_target
> hypercall to something specifically for ballooning.
This all reads like something that won't happen soon, and wouldn't
likely to be reasonably backportable. Yet we have the problem in
shipping code, and hence alongside a proper long term solution we
should also (and perhaps first) try to find a simple and sufficiently
correct short term one. (But yes, the present "balloon down much
further than needed" model might be perceived as that short term
one, albeit personally I don't like it.)
Jan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
2014-04-02 15:02 ` Jan Beulich
@ 2014-04-02 15:10 ` George Dunlap
0 siblings, 0 replies; 12+ messages in thread
From: George Dunlap @ 2014-04-02 15:10 UTC (permalink / raw)
To: Jan Beulich
Cc: KeirFraser, Ian Campbell, Stefano Stabellini, xen-devel,
Daniel De Graaf
On 04/02/2014 04:02 PM, Jan Beulich wrote:
>>>> On 02.04.14 at 16:24, <george.dunlap@eu.citrix.com> wrote:
>> I understand the sentiment; but as I said, the real problem is a lack of
>> clarity about what exactly the toolstack is asking the VM to do. This
>> is obviously a particular problem in the case of PoD, but it's still a
>> problem even for non-PoD guests; it's just that the outcomes are less
>> severe. If we solve the general problem, we'll solve the PoD problem.
>>
>> The other thing is that the whole point of PoD is to be transparent to
>> the guest. Xen is already careful in how it handles post-creation
>> adjustments to the PoD size -- always increasing the PoD cache, never
>> decreasing it -- specifically so that the guest doesn't need to know.
>>
>> What should really happen at some point is for PoD to just become a
>> special case of swapping. In a sense, that's almost the same issue: you
>> could have a situation where the toolstack asks a guest to balloon down,
>> and the guest does so; but not as low as the toolstack expected, so the
>> toolstack labels the guest as "misbehaving" and tells Xen to swap out
>> pages until it reaches what the toolstack thinks is the correct value.
>> The guest won't crash, but performance will be impacted.
>>
>> The target in xenstore could be made tightly coupled: if the toolstack
>> always wrote into xenstore exactly what it reported to Xen, then it
>> would be the same.
>>
>> Alternately, since now Xen is involved with ballooning targets --
>> whether you're doing PoD or swapping -- maybe we should consider moving
>> the "target" into Xen entirely. Then there would be no chance for
>> "drift", as Xen and the balloon driver would be working from the same
>> data. This would be basically repurposing the get/set pod_target
>> hypercall to something specifically for ballooning.
> This all reads like something that won't happen soon, and wouldn't
> likely to be reasonably backportable. Yet we have the problem in
> shipping code, and hence alongside a proper long term solution we
> should also (and perhaps first) try to find a simple and sufficiently
> correct short term one. (But yes, the present "balloon down much
> further than needed" model might be perceived as that short term
> one, albeit personally I don't like it.)
There are three things I mentioned:
1. Make the static-max / target something rational and useable by the
balloon driver
2. Move "target" from xenstore into the hypervisor, and make a proper
interface for it there.
3. Re-implement PoD as a special case of hypervisor swap
#3 is unlikely to happen soon; but it's not a solution to your problem
anyway. It just changes the failure mode from "guest crashes" to "guest
experiences performance degradation".
Either #1 or #2 should be straightforward to implement and backport; #1
would probably be the easiest to backport. (Yet another reason to
prefer it over #2.)
-George
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-04-02 15:10 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-24 13:31 [PATCH] x86: expose XENMEM_get_pod_target to subject domain Jan Beulich
2014-02-24 17:24 ` George Dunlap
2014-02-25 8:03 ` Jan Beulich
2014-02-26 9:25 ` Ian Campbell
2014-02-26 9:46 ` Jan Beulich
2014-02-26 10:28 ` Ian Campbell
2014-04-01 16:55 ` George Dunlap
2014-04-01 17:05 ` George Dunlap
2014-04-02 8:52 ` Jan Beulich
2014-04-02 14:24 ` George Dunlap
2014-04-02 15:02 ` Jan Beulich
2014-04-02 15:10 ` George Dunlap
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.