* [PATCH] xen: reduce severity of message about using v1 grant tables.
@ 2009-12-02 19:28 Ian Campbell
2009-12-02 19:33 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2009-12-02 19:28 UTC (permalink / raw)
To: xen-devel; +Cc: Steven Smith, Jeremy Fitzhardinge, Ian Campbell, Ian Campbell
From: Ian Campbell <ijc@hellion.org.uk>
It's hardly the end of the world...
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Steven Smith <Steven.Smith@eu.citrix.com>
---
drivers/xen/grant-table.c | 19 ++++++++++---------
1 files changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 2240adf..85ce951 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -699,21 +699,22 @@ static void gnttab_request_version(void)
gsv.version = 2;
rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
- if (rc == 0) {
+ if (rc == 0)
grant_table_version = 2;
- printk(KERN_NOTICE "Using V2 grant tables.\n");
- } else {
+ else {
if (grant_table_version == 2) {
- /* If we've already used version 2 features,
- but then suddenly discover that they're not
- available (e.g. migrating to an older
- version of Xen), almost unbounded badness
- can happen. */
+ /*
+ * If we've already used version 2 features,
+ * but then suddenly discover that they're not
+ * available (e.g. migrating to an older
+ * version of Xen), almost unbounded badness
+ * can happen.
+ */
panic("we need grant tables version 2, but only version 1 is available");
}
grant_table_version = 1;
- printk(KERN_WARNING "Using legacy V1 grant tables; upgrade to a newer version of Xen.\n");
}
+ printk(KERN_INFO "Grant tables using version %d layout.\n", grant_table_version);
}
static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
--
1.6.5.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-02 19:28 [PATCH] xen: reduce severity of message about using v1 grant tables Ian Campbell
@ 2009-12-02 19:33 ` Jeremy Fitzhardinge
2009-12-02 19:54 ` Ian Campbell
0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2009-12-02 19:33 UTC (permalink / raw)
To: Ian Campbell; +Cc: Steven Smith, xen-devel, Ian Campbell
On 12/02/09 11:28, Ian Campbell wrote:
> From: Ian Campbell <ijc@hellion.org.uk>
>
> It's hardly the end of the world...
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Cc: Jeremy Fitzhardinge <jeremy@goop.org>
> Cc: Steven Smith <Steven.Smith@eu.citrix.com>
> ---
> drivers/xen/grant-table.c | 19 ++++++++++---------
> 1 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> index 2240adf..85ce951 100644
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -699,21 +699,22 @@ static void gnttab_request_version(void)
>
> gsv.version = 2;
> rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
> - if (rc == 0) {
> + if (rc == 0)
> grant_table_version = 2;
> - printk(KERN_NOTICE "Using V2 grant tables.\n");
> - } else {
> + else {
> if (grant_table_version == 2) {
> - /* If we've already used version 2 features,
> - but then suddenly discover that they're not
> - available (e.g. migrating to an older
> - version of Xen), almost unbounded badness
> - can happen. */
> + /*
> + * If we've already used version 2 features,
> + * but then suddenly discover that they're not
> + * available (e.g. migrating to an older
> + * version of Xen), almost unbounded badness
> + * can happen.
> + */
> panic("we need grant tables version 2, but only version 1 is available");
>
Does it really need to be a panic? Can't we just start failing all
future operations? Seems bad to take out the whole machine if we can
just get away with crippling one device (especially if it can be
recovered by downing it and re-upping a new one with nc1 and/or gt1).
J
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-02 19:33 ` Jeremy Fitzhardinge
@ 2009-12-02 19:54 ` Ian Campbell
2009-12-03 10:28 ` Ian Campbell
0 siblings, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2009-12-02 19:54 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Smith, xen-devel@lists.xensource.com
On Wed, 2009-12-02 at 19:33 +0000, Jeremy Fitzhardinge wrote:
> On 12/02/09 11:28, Ian Campbell wrote:
> > From: Ian Campbell <ijc@hellion.org.uk>
> >
> > It's hardly the end of the world...
> >
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Jeremy Fitzhardinge <jeremy@goop.org>
> > Cc: Steven Smith <Steven.Smith@eu.citrix.com>
> > ---
> > drivers/xen/grant-table.c | 19 ++++++++++---------
> > 1 files changed, 10 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> > index 2240adf..85ce951 100644
> > --- a/drivers/xen/grant-table.c
> > +++ b/drivers/xen/grant-table.c
> > @@ -699,21 +699,22 @@ static void gnttab_request_version(void)
> >
> > gsv.version = 2;
> > rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
> > - if (rc == 0) {
> > + if (rc == 0)
> > grant_table_version = 2;
> > - printk(KERN_NOTICE "Using V2 grant tables.\n");
> > - } else {
> > + else {
> > if (grant_table_version == 2) {
> > - /* If we've already used version 2 features,
> > - but then suddenly discover that they're not
> > - available (e.g. migrating to an older
> > - version of Xen), almost unbounded badness
> > - can happen. */
> > + /*
> > + * If we've already used version 2 features,
> > + * but then suddenly discover that they're not
> > + * available (e.g. migrating to an older
> > + * version of Xen), almost unbounded badness
> > + * can happen.
> > + */
> > panic("we need grant tables version 2, but only version 1 is available");
> >
>
> Does it really need to be a panic? Can't we just start failing all
> future operations? Seems bad to take out the whole machine if we can
> just get away with crippling one device (especially if it can be
> recovered by downing it and re-upping a new one with nc1 and/or gt1).
Wouldn't there be (failing) grant table ops on the down path?
In any case doesn't it effect all devices since they all use the same
grant table?
Ian.
>
> J
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-02 19:54 ` Ian Campbell
@ 2009-12-03 10:28 ` Ian Campbell
2009-12-03 12:01 ` Steven Smith
2009-12-03 20:10 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 12+ messages in thread
From: Ian Campbell @ 2009-12-03 10:28 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Steven Smith, xen-devel@lists.xensource.com
On Wed, 2009-12-02 at 19:54 +0000, Ian Campbell wrote:
>
> > Does it really need to be a panic? Can't we just start failing all
> > future operations? Seems bad to take out the whole machine if we
> can
> > just get away with crippling one device (especially if it can be
> > recovered by downing it and re-upping a new one with nc1 and/or
> gt1).
>
> Wouldn't there be (failing) grant table ops on the down path?
>
> In any case doesn't it effect all devices since they all use the same
> grant table?
Oh, I see what you meant... in the proper resume case (as opposed to the
cancelled suspend/checkpoint case I was thinking of) there should be no
grant tables in use at this point so most devices should, in theory, be
able to reconnect using v1 grants, any drivers which require v2 grant
tables need to check for them in their resume hook as well as at start
of day.
Unfortunately frontend devices tear down their grant entries after the
resume rather than before the suspend (I presume this has to do with
faster checkpointing?) which means they could be trying to clear an
entry of the wrong layout, leading the unbounded badness that the
comment refers to.
I think the choices are basically:
* Always latch to either v1 or v2 at start of day, if we can't get
the version we want then panic (this is a stronger restriction
than the current code which will try to upgrade to v2 on resume)
* Write v1<->v2 layout transformations called on gnttab resume
before the devices get a chance to try and unmap their old
entries. Would need to handle v2 entries sing feature which are
not expressible in v1.
I'm tempted to go with the former for simplicity, it enables migration
to a newer version of Xen (the guest will just keep using v1) but will
not allow migration back to an older version of Xen, which is not
something we generally support anyway.
Ian.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-03 10:28 ` Ian Campbell
@ 2009-12-03 12:01 ` Steven Smith
2009-12-03 12:15 ` Ian Campbell
2009-12-03 20:13 ` Jeremy Fitzhardinge
2009-12-03 20:10 ` Jeremy Fitzhardinge
1 sibling, 2 replies; 12+ messages in thread
From: Steven Smith @ 2009-12-03 12:01 UTC (permalink / raw)
To: Ian Campbell
Cc: Steven Smith, Jeremy Fitzhardinge, xen-devel@lists.xensource.com
[-- Attachment #1.1: Type: text/plain, Size: 7318 bytes --]
> Oh, I see what you meant... in the proper resume case (as opposed to the
> cancelled suspend/checkpoint case I was thinking of) there should be no
> grant tables in use at this point so most devices should, in theory, be
> able to reconnect using v1 grants, any drivers which require v2 grant
> tables need to check for them in their resume hook as well as at start
> of day.
>
> Unfortunately frontend devices tear down their grant entries after the
> resume rather than before the suspend (I presume this has to do with
> faster checkpointing?) which means they could be trying to clear an
> entry of the wrong layout, leading the unbounded badness that the
> comment refers to.
Actually, I think that'd be okay. Drivers should be clearing grant
table entries through the gnttab_end_* functions, which always check
grant_table_version and do the right thing.
gnttab_grant_foreign_access_ref_{subpage,trans} would BUG(), but that
could be turned into an error return easily enough. Handling it
everywhere would be a pain, though.
The only other potential subtlety is handling a suspend while some
other vcpu is doing grant table operations, but I think the
stop_machine() is sufficient protection against that.
So I think a downgrade might actually be safe (despite what I said
before). It can only happen if you migrate from a new Xen to an older
one, which I didn't think we supported anyway, but it could be made to
work.
> I think the choices are basically:
> * Always latch to either v1 or v2 at start of day, if we can't get
> the version we want then panic (this is a stronger restriction
> than the current code which will try to upgrade to v2 on resume)
Yeah, that probably counts as a bug in the current code. If we boot
on a Xen which only supports v1 tabled then we should probably stick
with v1 until we shut down.
panic()ing when v2 isn't available sounds like overkill, though.
Would something like this work?
Allow downgrades from v2 grant tables to v1.
Completely untested, beyond checking that it actually compiles.
Signed-off-by: Steven Smith <sos22@cam.ac.uk.>
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index fd619b4..fa971e2 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -210,36 +210,40 @@ int gnttab_grant_foreign_access(domid_t domid, unsigned long frame,
}
EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access);
-void gnttab_grant_foreign_access_ref_subpage(grant_ref_t ref, domid_t domid,
- unsigned long frame, int flags,
- unsigned page_off,
- unsigned length)
+int gnttab_grant_foreign_access_ref_subpage(grant_ref_t ref, domid_t domid,
+ unsigned long frame, int flags,
+ unsigned page_off,
+ unsigned length)
{
BUG_ON(flags & (GTF_accept_transfer | GTF_reading |
GTF_writing | GTF_sub_page | GTF_permit_access));
- BUG_ON(grant_table_version == 1);
+ if (grant_table_version == 1)
+ return -EOPNOTSUPP;
shared.v2[ref].sub_page.frame = frame;
shared.v2[ref].sub_page.page_off = page_off;
shared.v2[ref].sub_page.length = length;
shared.v2[ref].hdr.domid = domid;
wmb();
shared.v2[ref].hdr.flags = GTF_permit_access | GTF_sub_page | flags;
+ return 0;
}
EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access_ref_subpage);
-void gnttab_grant_foreign_access_ref_trans(grant_ref_t ref, domid_t domid,
- int flags,
- domid_t trans_domid,
- grant_ref_t trans_gref)
+int gnttab_grant_foreign_access_ref_trans(grant_ref_t ref, domid_t domid,
+ int flags,
+ domid_t trans_domid,
+ grant_ref_t trans_gref)
{
BUG_ON(flags & (GTF_accept_transfer | GTF_reading |
GTF_writing | GTF_sub_page | GTF_permit_access));
- BUG_ON(grant_table_version == 1);
+ if (grant_table_version == 1)
+ return -EOPNOTSUPP;
shared.v2[ref].transitive.trans_domid = trans_domid;
shared.v2[ref].transitive.gref = trans_gref;
shared.v2[ref].hdr.domid = domid;
wmb();
shared.v2[ref].hdr.flags = GTF_permit_access | GTF_transitive | flags;
+ return 0;
}
EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access_ref_trans);
@@ -588,25 +592,40 @@ static inline unsigned max_nr_grant_status_frames(void)
return nr_status_frames(max_nr_grant_frames());
}
+/* Called whenever there's some possibility that the hypervisor
+ disagrees with us about what grant table version we're using
+ (i.e. start of day and following a resume). */
static void gnttab_request_version(void)
{
int rc;
struct gnttab_set_version gsv;
- gsv.version = 2;
- rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
- if (rc == 0) {
- grant_table_version = 2;
- } else {
- if (grant_table_version == 2) {
- /* If we've already used version 2 features,
- but then suddenly discover that they're not
- available (e.g. migrating to an older
- version of Xen), almost unbounded badness
- can happen. */
- panic("we need grant tables version 2, but only version 1 is available");
+ if (grant_table_version == 0 || grant_table_version == 2) {
+ /* Try to get v2 grant tables. */
+ gsv.version = 2;
+ rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
+ if (rc == 0) {
+ /* Success */
+ grant_table_version = 2;
+ } else {
+ if (grant_table_version == 2) {
+ /* We've seen v2 features before, but
+ they're no longer available.
+ That's not disastrous, but it is
+ pretty bad. */
+ printk(KERN_WARNING "Downgrading from v2 grant tables to v1; expect some device misbehaviour.\n");
+ }
+ grant_table_version = 1;
}
- grant_table_version = 1;
+ }
+ if (grant_table_version == 1) {
+ /* This shouldn't be necessary (Xen should default to
+ v1, and we've not asked for anything else), but do
+ it anyway for sanity. */
+ gsv.version = 1;
+ /* If this fails, it probably just means the
+ hypervisor only supports v1, so we're fine. */
+ (void)HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
}
}
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index cf316d7..85f2d52 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -65,14 +65,14 @@ int gnttab_resume(void);
int gnttab_grant_foreign_access(domid_t domid, unsigned long frame,
int flags);
-void gnttab_grant_foreign_access_ref_subpage(grant_ref_t ref, domid_t domid,
- unsigned long frame, int flags,
- unsigned page_off,
- unsigned length);
-void gnttab_grant_foreign_access_ref_trans(grant_ref_t ref, domid_t domid,
- int flags,
- domid_t trans_domid,
- grant_ref_t trans_gref);
+int gnttab_grant_foreign_access_ref_subpage(grant_ref_t ref, domid_t domid,
+ unsigned long frame, int flags,
+ unsigned page_off,
+ unsigned length);
+int gnttab_grant_foreign_access_ref_trans(grant_ref_t ref, domid_t domid,
+ int flags,
+ domid_t trans_domid,
+ grant_ref_t trans_gref);
/*
* Are sub-page grants available on this version of Xen? Returns 1 if
[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-03 12:01 ` Steven Smith
@ 2009-12-03 12:15 ` Ian Campbell
2009-12-03 14:47 ` Steven Smith
2009-12-03 20:15 ` Jeremy Fitzhardinge
2009-12-03 20:13 ` Jeremy Fitzhardinge
1 sibling, 2 replies; 12+ messages in thread
From: Ian Campbell @ 2009-12-03 12:15 UTC (permalink / raw)
To: Steven Smith; +Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com
On Thu, 2009-12-03 at 12:01 +0000, Steven Smith wrote:
> > Oh, I see what you meant... in the proper resume case (as opposed to the
> > cancelled suspend/checkpoint case I was thinking of) there should be no
> > grant tables in use at this point so most devices should, in theory, be
> > able to reconnect using v1 grants, any drivers which require v2 grant
> > tables need to check for them in their resume hook as well as at start
> > of day.
> >
> > Unfortunately frontend devices tear down their grant entries after the
> > resume rather than before the suspend (I presume this has to do with
> > faster checkpointing?) which means they could be trying to clear an
> > entry of the wrong layout, leading the unbounded badness that the
> > comment refers to.
> Actually, I think that'd be okay. Drivers should be clearing grant
> table entries through the gnttab_end_* functions, which always check
> grant_table_version and do the right thing.
In the case where you have gone v1->v2 then the entries would have been
setup while grante_table_version==1 layout but by the time gnttab_end_*
is called grant_table_version==2 and the wrong thing happens.
>
> > I think the choices are basically:
> > * Always latch to either v1 or v2 at start of day, if we can't get
> > the version we want then panic (this is a stronger restriction
> > than the current code which will try to upgrade to v2 on resume)
> Yeah, that probably counts as a bug in the current code. If we boot
> on a Xen which only supports v1 tabled then we should probably stick
> with v1 until we shut down.
>
> panic()ing when v2 isn't available sounds like overkill, though.
It's only if you've already used v2 in this guest.
> Would something like this work?
I don't think so, because I don't think changing the grant table layout
is safe.
I'm experimenting with this:
Subject: xen: latch grant-table layout at start of day.
It is not possible to switch grant table layout over a save restore
since entries setup before with the old layout cannot be torn down under
the new layout.
Also add a grant_table.version parameter to allow the user to force a
particular layout (e.g. if they know they need to migrate to v1 only
hosts)
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index c653970..a449ef4 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -77,6 +77,7 @@ static grant_status_t *grstatus;
static struct gnttab_free_callback *gnttab_free_callback_list;
static int grant_table_version;
+module_param_named(version, grant_table_version, int, 0);
static int gnttab_expand(unsigned int req_entries);
@@ -697,24 +698,29 @@ static void gnttab_request_version(void)
int rc;
struct gnttab_set_version gsv;
- gsv.version = 2;
+ /*
+ * If we have already used a particular layout (e.g. before
+ * suspend/resume) then we must continue to use that
+ * version. Otherwise try and use v2.
+ */
+ gsv.version = grant_table_version ? : 2;
rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
- if (rc == 0)
- grant_table_version = 2;
- else {
- if (grant_table_version == 2) {
- /*
- * If we've already used version 2 features,
- * but then suddenly discover that they're not
- * available (e.g. migrating to an older
- * version of Xen), almost unbounded badness
- * can happen.
- */
- panic("we need grant tables version 2, but only version 1 is
available");
- }
- grant_table_version = 1;
- }
- printk(KERN_INFO "Grant tables using version %d layout.\n",
grant_table_version);
+ BUG_ON(rc == -EFAULT);
+
+ /*
+ * Compatibility with hypervisors which do not support >= v2
+ * grant tables.
+ */
+ if (rc == -ENOSYS)
+ gsv.version = 1;
+ else if (rc != 0)
+ printk(KERN_WARNING "Error %d requesting v%d grant table layout, got
v%d\n",
+ rc, grant_table_version ? : 2, gsv.version);
+
+ if (grant_table_version && grant_table_version != gsv.version)
+ panic("unable to set required v%d grant table layout",
grant_table_version);
+
+ grant_table_version = gsv.version;
}
static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
@@ -964,7 +970,7 @@ static int __devinit gnttab_init(void)
gnttab_free_count = nr_init_grefs - NR_RESERVED_ENTRIES;
gnttab_free_head = NR_RESERVED_ENTRIES;
- printk("Grant table initialized\n");
+ printk("Grant table initialized using v%d layout\n",
grant_table_version);
return 0;
ini_nomem:
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-03 12:15 ` Ian Campbell
@ 2009-12-03 14:47 ` Steven Smith
2009-12-03 20:15 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 12+ messages in thread
From: Steven Smith @ 2009-12-03 14:47 UTC (permalink / raw)
To: Ian Campbell
Cc: Steven Smith, Jeremy Fitzhardinge, xen-devel@lists.xensource.com
[-- Attachment #1.1: Type: text/plain, Size: 2873 bytes --]
> > > Oh, I see what you meant... in the proper resume case (as opposed to the
> > > cancelled suspend/checkpoint case I was thinking of) there should be no
> > > grant tables in use at this point so most devices should, in theory, be
> > > able to reconnect using v1 grants, any drivers which require v2 grant
> > > tables need to check for them in their resume hook as well as at start
> > > of day.
> > >
> > > Unfortunately frontend devices tear down their grant entries after the
> > > resume rather than before the suspend (I presume this has to do with
> > > faster checkpointing?) which means they could be trying to clear an
> > > entry of the wrong layout, leading the unbounded badness that the
> > > comment refers to.
> > Actually, I think that'd be okay. Drivers should be clearing grant
> > table entries through the gnttab_end_* functions, which always check
> > grant_table_version and do the right thing.
> In the case where you have gone v1->v2 then the entries would have been
> setup while grante_table_version==1 layout but by the time gnttab_end_*
> is called grant_table_version==2 and the wrong thing happens.
Hmm... I still think we're probably okay (in the narrowest possible
sense). Changing grant table version memset()s the shared tables to
zero, which effectively revokes all the grants, so all we need is for
the gnttab_end_*_ref operations to be no-ops when applied to zeroed
grant table slots, and a quick check of the code suggests that they
are.
But, yeah, it's not something we want to be doing anyway, whether it's
safe or not.
> > > I think the choices are basically:
> > > * Always latch to either v1 or v2 at start of day, if we can't get
> > > the version we want then panic (this is a stronger restriction
> > > than the current code which will try to upgrade to v2 on resume)
> > Yeah, that probably counts as a bug in the current code. If we boot
> > on a Xen which only supports v1 tabled then we should probably stick
> > with v1 until we shut down.
> >
> > panic()ing when v2 isn't available sounds like overkill, though.
> It's only if you've already used v2 in this guest.
Ah, okay.
> > Would something like this work?
>
> I don't think so, because I don't think changing the grant table layout
> is safe.
>
> I'm experimenting with this:
>
> Subject: xen: latch grant-table layout at start of day.
>
> It is not possible to switch grant table layout over a save restore
> since entries setup before with the old layout cannot be torn down under
> the new layout.
>
> Also add a grant_table.version parameter to allow the user to force a
> particular layout (e.g. if they know they need to migrate to v1 only
> hosts)
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Steven Smith <sos22@cam.ac.uk>
Steven.
[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-03 10:28 ` Ian Campbell
2009-12-03 12:01 ` Steven Smith
@ 2009-12-03 20:10 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2009-12-03 20:10 UTC (permalink / raw)
To: Ian Campbell; +Cc: Steven Smith, xen-devel@lists.xensource.com
On 12/03/09 02:28, Ian Campbell wrote:
> On Wed, 2009-12-02 at 19:54 +0000, Ian Campbell wrote:
>
>>
>>> Does it really need to be a panic? Can't we just start failing all
>>> future operations? Seems bad to take out the whole machine if we
>>>
>> can
>>
>>> just get away with crippling one device (especially if it can be
>>> recovered by downing it and re-upping a new one with nc1 and/or
>>>
>> gt1).
>>
>> Wouldn't there be (failing) grant table ops on the down path?
>>
>> In any case doesn't it effect all devices since they all use the same
>> grant table?
>>
> Oh, I see what you meant... in the proper resume case (as opposed to the
> cancelled suspend/checkpoint case I was thinking of) there should be no
> grant tables in use at this point so most devices should, in theory, be
> able to reconnect using v1 grants, any drivers which require v2 grant
> tables need to check for them in their resume hook as well as at start
> of day.
>
> Unfortunately frontend devices tear down their grant entries after the
> resume rather than before the suspend (I presume this has to do with
> faster checkpointing?) which means they could be trying to clear an
> entry of the wrong layout, leading the unbounded badness that the
> comment refers to.
>
I think the reason frontends don't do anything before suspend is because
they need to cope with backends going away at any moment, and a
suspend/migrate is just a special case of that. But a normal backend
restart won't change the grant table format, whereas a resume/migrate
can, so it does make sense to take advantage of the suspend callback.
Also I think originally there wasn't a suspend callback, but there is
now that we're using the device model.
I don't know how it affects performance, but I guess it would require
checkpoints to do a full teardown/reconnect so that the checkpointed
image can cope.
On the other hand, on resume, there are no existing grants, so the
device can just ignore any grant state it currently has established and
do it all afresh with the current grant mechanism, no?
> I think the choices are basically:
> * Always latch to either v1 or v2 at start of day, if we can't get
> the version we want then panic (this is a stronger restriction
> than the current code which will try to upgrade to v2 on resume)
> * Write v1<->v2 layout transformations called on gnttab resume
> before the devices get a chance to try and unmap their old
> entries. Would need to handle v2 entries sing feature which are
> not expressible in v1.
>
> I'm tempted to go with the former for simplicity, it enables migration
> to a newer version of Xen (the guest will just keep using v1) but will
> not allow migration back to an older version of Xen, which is not
> something we generally support anyway.
>
Yeah, given the "no downgrade" rule we don't need to solve it in the
most general way.
J
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-03 12:01 ` Steven Smith
2009-12-03 12:15 ` Ian Campbell
@ 2009-12-03 20:13 ` Jeremy Fitzhardinge
2009-12-03 21:23 ` Ian Campbell
1 sibling, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2009-12-03 20:13 UTC (permalink / raw)
To: Steven Smith; +Cc: Ian Campbell, Steven Smith, xen-devel@lists.xensource.com
On 12/03/09 04:01, Steven Smith wrote:
>> Oh, I see what you meant... in the proper resume case (as opposed to the
>> cancelled suspend/checkpoint case I was thinking of) there should be no
>> grant tables in use at this point so most devices should, in theory, be
>> able to reconnect using v1 grants, any drivers which require v2 grant
>> tables need to check for them in their resume hook as well as at start
>> of day.
>>
>> Unfortunately frontend devices tear down their grant entries after the
>> resume rather than before the suspend (I presume this has to do with
>> faster checkpointing?) which means they could be trying to clear an
>> entry of the wrong layout, leading the unbounded badness that the
>> comment refers to.
>>
> Actually, I think that'd be okay. Drivers should be clearing grant
> table entries through the gnttab_end_* functions, which always check
> grant_table_version and do the right thing.
> gnttab_grant_foreign_access_ref_{subpage,trans} would BUG(), but that
> could be turned into an error return easily enough. Handling it
> everywhere would be a pain, though.
>
> The only other potential subtlety is handling a suspend while some
> other vcpu is doing grant table operations, but I think the
> stop_machine() is sufficient protection against that.
>
Yes. We already have to be careful to prevent pte updates from being
preempted by suspend, so grant operations are similar. (When
CONFIG_PREEMPT is enabled, it is freeze/thaw which provides this guarantee.)
> BUG_ON(flags & (GTF_accept_transfer | GTF_reading |
> GTF_writing | GTF_sub_page | GTF_permit_access));
> - BUG_ON(grant_table_version == 1);
> + if (grant_table_version == 1)
>
Rather than having all these version tests, would it make sense to have
a grant_ops vector?
J
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-03 12:15 ` Ian Campbell
2009-12-03 14:47 ` Steven Smith
@ 2009-12-03 20:15 ` Jeremy Fitzhardinge
2009-12-03 21:22 ` Ian Campbell
1 sibling, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2009-12-03 20:15 UTC (permalink / raw)
To: Ian Campbell; +Cc: Steven Smith, xen-devel@lists.xensource.com
On 12/03/09 04:15, Ian Campbell wrote:
> I'm experimenting with this:
>
> Subject: xen: latch grant-table layout at start of day.
>
> It is not possible to switch grant table layout over a save restore
> since entries setup before with the old layout cannot be torn down under
> the new layout.
>
Doesn't resume implicitly tear them all down?
J
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-03 20:15 ` Jeremy Fitzhardinge
@ 2009-12-03 21:22 ` Ian Campbell
0 siblings, 0 replies; 12+ messages in thread
From: Ian Campbell @ 2009-12-03 21:22 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Steven Smith, xen-devel@lists.xensource.com
On Thu, 2009-12-03 at 20:15 +0000, Jeremy Fitzhardinge wrote:
> On 12/03/09 04:15, Ian Campbell wrote:
> > I'm experimenting with this:
> >
> > Subject: xen: latch grant-table layout at start of day.
> >
> > It is not possible to switch grant table layout over a save restore
> > since entries setup before with the old layout cannot be torn down under
> > the new layout.
> >
>
> Doesn't resume implicitly tear them all down?
Yes, but the devices will try and tear them down again in their resume
handlers. I think the conclusion was (see Steven's last mail) that since
changing gnttab version memsets the array to zero and gnttab_end_*
happen to be safe (by inspection) when used on a zeroed slot it would be
ok.
I think it's just safest to avoid the whole issue, we don't care about
the downgrade case and in the upgrade case the guest isn't going to have
any devices running which need v2 so can just as well stick with v1.
Ian.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
2009-12-03 20:13 ` Jeremy Fitzhardinge
@ 2009-12-03 21:23 ` Ian Campbell
0 siblings, 0 replies; 12+ messages in thread
From: Ian Campbell @ 2009-12-03 21:23 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Steven Smith, xen-devel@lists.xensource.com
On Thu, 2009-12-03 at 20:13 +0000, Jeremy Fitzhardinge wrote:
>
> > BUG_ON(flags & (GTF_accept_transfer | GTF_reading |
> > GTF_writing | GTF_sub_page |
> GTF_permit_access));
> > - BUG_ON(grant_table_version == 1);
> > + if (grant_table_version == 1)
> >
>
> Rather than having all these version tests, would it make sense to
> have a grant_ops vector?
I was looking at these today and reckon you could go a long way with a
couple of accessor macros towards avoiding most of the version checks.
Ian.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2009-12-03 21:23 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-02 19:28 [PATCH] xen: reduce severity of message about using v1 grant tables Ian Campbell
2009-12-02 19:33 ` Jeremy Fitzhardinge
2009-12-02 19:54 ` Ian Campbell
2009-12-03 10:28 ` Ian Campbell
2009-12-03 12:01 ` Steven Smith
2009-12-03 12:15 ` Ian Campbell
2009-12-03 14:47 ` Steven Smith
2009-12-03 20:15 ` Jeremy Fitzhardinge
2009-12-03 21:22 ` Ian Campbell
2009-12-03 20:13 ` Jeremy Fitzhardinge
2009-12-03 21:23 ` Ian Campbell
2009-12-03 20:10 ` Jeremy Fitzhardinge
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.