xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
@ 2013-07-10 19:19 Andrew Cooper
  2013-07-11  9:46 ` George Dunlap
  2013-07-11  9:55 ` Ian Campbell
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Cooper @ 2013-07-10 19:19 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell

Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
  "libxc: provide notification of final checkpoint to restore end"
broke migration from any version of Xen using tools from prior to that commit

Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
tools xc_domain_restore() to start reading the qemu save record, as
ctx->last_checkpoint is 0.

The failure looks like:
  xc: error: Max batch size exceeded (1970103633). Giving up.
where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"

Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
opposite function regresson for anyone using the current behaviour of
save_callbacks->checkpoint().

The only safe fix is to rely on the toolstack to provide this information.

Passing 0 results in unchanged behaviour, while passing nonzero means "the
other end of the migration stream does not know about
XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxc/xc_domain_restore.c |    3 ++-
 tools/libxc/xc_nomigrate.c      |    2 +-
 tools/libxc/xenguest.h          |    4 +++-
 tools/libxl/libxl_save_helper.c |    2 +-
 tools/xcutils/xc_restore.c      |    2 +-
 5 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index 63d36cd..40c9282 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -1401,7 +1401,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       domid_t store_domid, unsigned int console_evtchn,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
-                      int no_incr_generationid,
+                      int no_incr_generationid, int last_checkpoint_unaware,
                       unsigned long *vm_generationid_addr,
                       struct restore_callbacks *callbacks)
 {
@@ -1472,6 +1472,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
 
     ctx->superpages = superpages;
     ctx->hvm = hvm;
+    ctx->last_checkpoint = !!last_checkpoint_unaware;
 
     ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
 
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 73e7566..1d3aec5 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       domid_t store_domid, unsigned int console_evtchn,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
-                      int no_incr_generationid,
+                      int no_incr_generationid, int last_checkpoint_unaware,
                       unsigned long *vm_generationid_addr,
                       struct restore_callbacks *callbacks)
 {
diff --git a/tools/libxc/xenguest.h b/tools/libxc/xenguest.h
index 4714bd2..977dde3 100644
--- a/tools/libxc/xenguest.h
+++ b/tools/libxc/xenguest.h
@@ -114,6 +114,8 @@ struct restore_callbacks {
  * @parm pae non-zero if this HVM domain has PAE support enabled
  * @parm superpages non-zero to allocate guest memory with superpages
  * @parm no_incr_generationid non-zero if generation id is NOT to be incremented
+ * @parm last_checkpoint_unaware non-zero if far end of the migration is unaware
+ *       of the XC_SAVE_ID_LAST_CHECKPOINT hunk in the migration stream.
  * @parm vm_generationid_addr returned with the address of the generation id buffer
  * @parm callbacks non-NULL to receive a callback to restore toolstack
  *       specific data
@@ -124,7 +126,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       domid_t store_domid, unsigned int console_evtchn,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
-                      int no_incr_generationid,
+                      int no_incr_generationid, int last_checkpoint_unaware,
                       unsigned long *vm_generationid_addr,
                       struct restore_callbacks *callbacks);
 /**
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 772251a..8f14b8b 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -264,7 +264,7 @@ int main(int argc, char **argv)
         r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
                               store_domid, console_evtchn, &console_mfn,
                               console_domid, hvm, pae, superpages,
-                              no_incr_genidad, &genidad,
+                              no_incr_genidad, 0, &genidad,
                               &helper_restore_callbacks);
         helper_stub_restore_results(store_mfn,console_mfn,genidad,0);
         complete(r);
diff --git a/tools/xcutils/xc_restore.c b/tools/xcutils/xc_restore.c
index 35d725c..e657c7f 100644
--- a/tools/xcutils/xc_restore.c
+++ b/tools/xcutils/xc_restore.c
@@ -52,7 +52,7 @@ main(int argc, char **argv)
 
     ret = xc_domain_restore(xch, io_fd, domid, store_evtchn, &store_mfn, 0,
                             console_evtchn, &console_mfn, 0, hvm, pae, superpages,
-                            0, NULL, NULL);
+                            0, 0, NULL, NULL);
 
     if ( ret == 0 )
     {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
  2013-07-10 19:19 [PATCH] tools/migrate: Fix regression when migrating from older version of Xen Andrew Cooper
@ 2013-07-11  9:46 ` George Dunlap
  2013-07-11 11:09   ` Andrew Cooper
  2013-07-11  9:55 ` Ian Campbell
  1 sibling, 1 reply; 9+ messages in thread
From: George Dunlap @ 2013-07-11  9:46 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Ian Campbell, Xen-devel

On Wed, Jul 10, 2013 at 8:19 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>   "libxc: provide notification of final checkpoint to restore end"
> broke migration from any version of Xen using tools from prior to that commit
>
> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
> tools xc_domain_restore() to start reading the qemu save record, as
> ctx->last_checkpoint is 0.
>
> The failure looks like:
>   xc: error: Max batch size exceeded (1970103633). Giving up.
> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
>
> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
> opposite function regresson for anyone using the current behaviour of
> save_callbacks->checkpoint().
>
> The only safe fix is to rely on the toolstack to provide this information.
>
> Passing 0 results in unchanged behaviour, while passing nonzero means "the
> other end of the migration stream does not know about
> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Shouldn't there also be a way to actually use the flag?

 -George

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
  2013-07-10 19:19 [PATCH] tools/migrate: Fix regression when migrating from older version of Xen Andrew Cooper
  2013-07-11  9:46 ` George Dunlap
@ 2013-07-11  9:55 ` Ian Campbell
  2013-07-11 11:19   ` Andrew Cooper
  1 sibling, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-07-11  9:55 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel

On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>   "libxc: provide notification of final checkpoint to restore end"
> broke migration from any version of Xen using tools from prior to that commit
> 
> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
> tools xc_domain_restore() to start reading the qemu save record, as
> ctx->last_checkpoint is 0.

Can the receive not distinguish the case where it is receiving a
checkpoint stream from a regular one? Either via some property of the
stream or the parameters with which it was called?

ctx->last_checkpoint should (be made to) only matter if you are doing
check pointing and I think we can mandate that checking pointing is only
supported between like versions of Xen. TBH it's a bit odd to send the
LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
done.

Ian.
> 
> The failure looks like:
>   xc: error: Max batch size exceeded (1970103633). Giving up.
> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
> 
> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
> opposite function regresson for anyone using the current behaviour of
> save_callbacks->checkpoint().
> 
> The only safe fix is to rely on the toolstack to provide this information.
> 
> Passing 0 results in unchanged behaviour, while passing nonzero means "the
> other end of the migration stream does not know about
> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  tools/libxc/xc_domain_restore.c |    3 ++-
>  tools/libxc/xc_nomigrate.c      |    2 +-
>  tools/libxc/xenguest.h          |    4 +++-
>  tools/libxl/libxl_save_helper.c |    2 +-
>  tools/xcutils/xc_restore.c      |    2 +-
>  5 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
> index 63d36cd..40c9282 100644
> --- a/tools/libxc/xc_domain_restore.c
> +++ b/tools/libxc/xc_domain_restore.c
> @@ -1401,7 +1401,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>                        domid_t store_domid, unsigned int console_evtchn,
>                        unsigned long *console_mfn, domid_t console_domid,
>                        unsigned int hvm, unsigned int pae, int superpages,
> -                      int no_incr_generationid,
> +                      int no_incr_generationid, int last_checkpoint_unaware,
>                        unsigned long *vm_generationid_addr,
>                        struct restore_callbacks *callbacks)
>  {
> @@ -1472,6 +1472,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>  
>      ctx->superpages = superpages;
>      ctx->hvm = hvm;
> +    ctx->last_checkpoint = !!last_checkpoint_unaware;
>  
>      ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
>  
> diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
> index 73e7566..1d3aec5 100644
> --- a/tools/libxc/xc_nomigrate.c
> +++ b/tools/libxc/xc_nomigrate.c
> @@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>                        domid_t store_domid, unsigned int console_evtchn,
>                        unsigned long *console_mfn, domid_t console_domid,
>                        unsigned int hvm, unsigned int pae, int superpages,
> -                      int no_incr_generationid,
> +                      int no_incr_generationid, int last_checkpoint_unaware,
>                        unsigned long *vm_generationid_addr,
>                        struct restore_callbacks *callbacks)
>  {
> diff --git a/tools/libxc/xenguest.h b/tools/libxc/xenguest.h
> index 4714bd2..977dde3 100644
> --- a/tools/libxc/xenguest.h
> +++ b/tools/libxc/xenguest.h
> @@ -114,6 +114,8 @@ struct restore_callbacks {
>   * @parm pae non-zero if this HVM domain has PAE support enabled
>   * @parm superpages non-zero to allocate guest memory with superpages
>   * @parm no_incr_generationid non-zero if generation id is NOT to be incremented
> + * @parm last_checkpoint_unaware non-zero if far end of the migration is unaware
> + *       of the XC_SAVE_ID_LAST_CHECKPOINT hunk in the migration stream.
>   * @parm vm_generationid_addr returned with the address of the generation id buffer
>   * @parm callbacks non-NULL to receive a callback to restore toolstack
>   *       specific data
> @@ -124,7 +126,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>                        domid_t store_domid, unsigned int console_evtchn,
>                        unsigned long *console_mfn, domid_t console_domid,
>                        unsigned int hvm, unsigned int pae, int superpages,
> -                      int no_incr_generationid,
> +                      int no_incr_generationid, int last_checkpoint_unaware,
>                        unsigned long *vm_generationid_addr,
>                        struct restore_callbacks *callbacks);
>  /**
> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
> index 772251a..8f14b8b 100644
> --- a/tools/libxl/libxl_save_helper.c
> +++ b/tools/libxl/libxl_save_helper.c
> @@ -264,7 +264,7 @@ int main(int argc, char **argv)
>          r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
>                                store_domid, console_evtchn, &console_mfn,
>                                console_domid, hvm, pae, superpages,
> -                              no_incr_genidad, &genidad,
> +                              no_incr_genidad, 0, &genidad,
>                                &helper_restore_callbacks);
>          helper_stub_restore_results(store_mfn,console_mfn,genidad,0);
>          complete(r);
> diff --git a/tools/xcutils/xc_restore.c b/tools/xcutils/xc_restore.c
> index 35d725c..e657c7f 100644
> --- a/tools/xcutils/xc_restore.c
> +++ b/tools/xcutils/xc_restore.c
> @@ -52,7 +52,7 @@ main(int argc, char **argv)
>  
>      ret = xc_domain_restore(xch, io_fd, domid, store_evtchn, &store_mfn, 0,
>                              console_evtchn, &console_mfn, 0, hvm, pae, superpages,
> -                            0, NULL, NULL);
> +                            0, 0, NULL, NULL);
>  
>      if ( ret == 0 )
>      {

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
  2013-07-11  9:46 ` George Dunlap
@ 2013-07-11 11:09   ` Andrew Cooper
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2013-07-11 11:09 UTC (permalink / raw)
  To: George Dunlap; +Cc: Ian Jackson, Ian Campbell, Xen-devel

On 11/07/13 10:46, George Dunlap wrote:
> On Wed, Jul 10, 2013 at 8:19 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>>   "libxc: provide notification of final checkpoint to restore end"
>> broke migration from any version of Xen using tools from prior to that commit
>>
>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>> tools xc_domain_restore() to start reading the qemu save record, as
>> ctx->last_checkpoint is 0.
>>
>> The failure looks like:
>>   xc: error: Max batch size exceeded (1970103633). Giving up.
>> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
>>
>> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
>> opposite function regresson for anyone using the current behaviour of
>> save_callbacks->checkpoint().
>>
>> The only safe fix is to rely on the toolstack to provide this information.
>>
>> Passing 0 results in unchanged behaviour, while passing nonzero means "the
>> other end of the migration stream does not know about
>> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Shouldn't there also be a way to actually use the flag?
>
>  -George

How do you mean? Plumb it up through to xl?

xl (in any non-experemental form) post-dates this regression.  xend
predates it, but I can't find any way of working out whether the far end
predates the regression.

~Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
  2013-07-11  9:55 ` Ian Campbell
@ 2013-07-11 11:19   ` Andrew Cooper
  2013-07-11 11:43     ` Ian Campbell
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2013-07-11 11:19 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Xen-devel

On 11/07/13 10:55, Ian Campbell wrote:
> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>>   "libxc: provide notification of final checkpoint to restore end"
>> broke migration from any version of Xen using tools from prior to that commit
>>
>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>> tools xc_domain_restore() to start reading the qemu save record, as
>> ctx->last_checkpoint is 0.
> Can the receive not distinguish the case where it is receiving a
> checkpoint stream from a regular one? Either via some property of the
> stream or the parameters with which it was called?
>
> ctx->last_checkpoint should (be made to) only matter if you are doing
> check pointing and I think we can mandate that checking pointing is only
> supported between like versions of Xen. TBH it's a bit odd to send the
> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
> done.
>
> Ian.

Sending LAST_CHECKPOINT is the only thing which has allowed normal
migration to work since this regression.

I cant find any way of distinguishing a normal vs checkpointed migrate
from the stream alone.  The save_callback->checkpoint function pointer
may or may not be filled in by a toolstack, but is completely
out-of-band as far as the migration stream is concerned.

~Andrew

>> The failure looks like:
>>   xc: error: Max batch size exceeded (1970103633). Giving up.
>> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
>>
>> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
>> opposite function regresson for anyone using the current behaviour of
>> save_callbacks->checkpoint().
>>
>> The only safe fix is to rely on the toolstack to provide this information.
>>
>> Passing 0 results in unchanged behaviour, while passing nonzero means "the
>> other end of the migration stream does not know about
>> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>>  tools/libxc/xc_domain_restore.c |    3 ++-
>>  tools/libxc/xc_nomigrate.c      |    2 +-
>>  tools/libxc/xenguest.h          |    4 +++-
>>  tools/libxl/libxl_save_helper.c |    2 +-
>>  tools/xcutils/xc_restore.c      |    2 +-
>>  5 files changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
>> index 63d36cd..40c9282 100644
>> --- a/tools/libxc/xc_domain_restore.c
>> +++ b/tools/libxc/xc_domain_restore.c
>> @@ -1401,7 +1401,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>                        domid_t store_domid, unsigned int console_evtchn,
>>                        unsigned long *console_mfn, domid_t console_domid,
>>                        unsigned int hvm, unsigned int pae, int superpages,
>> -                      int no_incr_generationid,
>> +                      int no_incr_generationid, int last_checkpoint_unaware,
>>                        unsigned long *vm_generationid_addr,
>>                        struct restore_callbacks *callbacks)
>>  {
>> @@ -1472,6 +1472,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>  
>>      ctx->superpages = superpages;
>>      ctx->hvm = hvm;
>> +    ctx->last_checkpoint = !!last_checkpoint_unaware;
>>  
>>      ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
>>  
>> diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
>> index 73e7566..1d3aec5 100644
>> --- a/tools/libxc/xc_nomigrate.c
>> +++ b/tools/libxc/xc_nomigrate.c
>> @@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>                        domid_t store_domid, unsigned int console_evtchn,
>>                        unsigned long *console_mfn, domid_t console_domid,
>>                        unsigned int hvm, unsigned int pae, int superpages,
>> -                      int no_incr_generationid,
>> +                      int no_incr_generationid, int last_checkpoint_unaware,
>>                        unsigned long *vm_generationid_addr,
>>                        struct restore_callbacks *callbacks)
>>  {
>> diff --git a/tools/libxc/xenguest.h b/tools/libxc/xenguest.h
>> index 4714bd2..977dde3 100644
>> --- a/tools/libxc/xenguest.h
>> +++ b/tools/libxc/xenguest.h
>> @@ -114,6 +114,8 @@ struct restore_callbacks {
>>   * @parm pae non-zero if this HVM domain has PAE support enabled
>>   * @parm superpages non-zero to allocate guest memory with superpages
>>   * @parm no_incr_generationid non-zero if generation id is NOT to be incremented
>> + * @parm last_checkpoint_unaware non-zero if far end of the migration is unaware
>> + *       of the XC_SAVE_ID_LAST_CHECKPOINT hunk in the migration stream.
>>   * @parm vm_generationid_addr returned with the address of the generation id buffer
>>   * @parm callbacks non-NULL to receive a callback to restore toolstack
>>   *       specific data
>> @@ -124,7 +126,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>                        domid_t store_domid, unsigned int console_evtchn,
>>                        unsigned long *console_mfn, domid_t console_domid,
>>                        unsigned int hvm, unsigned int pae, int superpages,
>> -                      int no_incr_generationid,
>> +                      int no_incr_generationid, int last_checkpoint_unaware,
>>                        unsigned long *vm_generationid_addr,
>>                        struct restore_callbacks *callbacks);
>>  /**
>> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
>> index 772251a..8f14b8b 100644
>> --- a/tools/libxl/libxl_save_helper.c
>> +++ b/tools/libxl/libxl_save_helper.c
>> @@ -264,7 +264,7 @@ int main(int argc, char **argv)
>>          r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
>>                                store_domid, console_evtchn, &console_mfn,
>>                                console_domid, hvm, pae, superpages,
>> -                              no_incr_genidad, &genidad,
>> +                              no_incr_genidad, 0, &genidad,
>>                                &helper_restore_callbacks);
>>          helper_stub_restore_results(store_mfn,console_mfn,genidad,0);
>>          complete(r);
>> diff --git a/tools/xcutils/xc_restore.c b/tools/xcutils/xc_restore.c
>> index 35d725c..e657c7f 100644
>> --- a/tools/xcutils/xc_restore.c
>> +++ b/tools/xcutils/xc_restore.c
>> @@ -52,7 +52,7 @@ main(int argc, char **argv)
>>  
>>      ret = xc_domain_restore(xch, io_fd, domid, store_evtchn, &store_mfn, 0,
>>                              console_evtchn, &console_mfn, 0, hvm, pae, superpages,
>> -                            0, NULL, NULL);
>> +                            0, 0, NULL, NULL);
>>  
>>      if ( ret == 0 )
>>      {
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
  2013-07-11 11:19   ` Andrew Cooper
@ 2013-07-11 11:43     ` Ian Campbell
  2013-07-15 14:24       ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-07-11 11:43 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel

On Thu, 2013-07-11 at 12:19 +0100, Andrew Cooper wrote:
> On 11/07/13 10:55, Ian Campbell wrote:
> > On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
> >> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
> >>   "libxc: provide notification of final checkpoint to restore end"
> >> broke migration from any version of Xen using tools from prior to that commit
> >>
> >> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
> >> tools xc_domain_restore() to start reading the qemu save record, as
> >> ctx->last_checkpoint is 0.
> > Can the receive not distinguish the case where it is receiving a
> > checkpoint stream from a regular one? Either via some property of the
> > stream or the parameters with which it was called?
> >
> > ctx->last_checkpoint should (be made to) only matter if you are doing
> > check pointing and I think we can mandate that checking pointing is only
> > supported between like versions of Xen. TBH it's a bit odd to send the
> > LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
> > done.
> >
> > Ian.
> 
> Sending LAST_CHECKPOINT is the only thing which has allowed normal
> migration to work since this regression.
> 
> I cant find any way of distinguishing a normal vs checkpointed migrate
> from the stream alone.  The save_callback->checkpoint function pointer
> may or may not be filled in by a toolstack, but is completely
> out-of-band as far as the migration stream is concerned.

I suppose your new last_generation-aware flag is similar to the "is a
checkpoint restore" lag which I was thinking of, but it somewhat easier
for the toolstack to know that it is receiving a checkpoint vs. normal
migration compared with magically knowing the version on the other end
of the connection.

Ian.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
  2013-07-11 11:43     ` Ian Campbell
@ 2013-07-15 14:24       ` Andrew Cooper
  2013-07-16  9:32         ` Ian Campbell
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2013-07-15 14:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Xen-devel

On 11/07/13 12:43, Ian Campbell wrote:
> On Thu, 2013-07-11 at 12:19 +0100, Andrew Cooper wrote:
>> On 11/07/13 10:55, Ian Campbell wrote:
>>> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
>>>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>>>>   "libxc: provide notification of final checkpoint to restore end"
>>>> broke migration from any version of Xen using tools from prior to that commit
>>>>
>>>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>>>> tools xc_domain_restore() to start reading the qemu save record, as
>>>> ctx->last_checkpoint is 0.
>>> Can the receive not distinguish the case where it is receiving a
>>> checkpoint stream from a regular one? Either via some property of the
>>> stream or the parameters with which it was called?
>>>
>>> ctx->last_checkpoint should (be made to) only matter if you are doing
>>> check pointing and I think we can mandate that checking pointing is only
>>> supported between like versions of Xen. TBH it's a bit odd to send the
>>> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
>>> done.
>>>
>>> Ian.
>> Sending LAST_CHECKPOINT is the only thing which has allowed normal
>> migration to work since this regression.
>>
>> I cant find any way of distinguishing a normal vs checkpointed migrate
>> from the stream alone.  The save_callback->checkpoint function pointer
>> may or may not be filled in by a toolstack, but is completely
>> out-of-band as far as the migration stream is concerned.
> I suppose your new last_generation-aware flag is similar to the "is a
> checkpoint restore" lag which I was thinking of, but it somewhat easier
> for the toolstack to know that it is receiving a checkpoint vs. normal
> migration compared with magically knowing the version on the other end
> of the connection.
>
> Ian.
>

So is this an indication of this patch being a good fix for the problem?
(I guess this is a pseudo-ping given no specific objection)

~Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
  2013-07-15 14:24       ` Andrew Cooper
@ 2013-07-16  9:32         ` Ian Campbell
  2013-07-16  9:32           ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-07-16  9:32 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel

On Mon, 2013-07-15 at 15:24 +0100, Andrew Cooper wrote:
> On 11/07/13 12:43, Ian Campbell wrote:
> > On Thu, 2013-07-11 at 12:19 +0100, Andrew Cooper wrote:
> >> On 11/07/13 10:55, Ian Campbell wrote:
> >>> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
> >>>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
> >>>>   "libxc: provide notification of final checkpoint to restore end"
> >>>> broke migration from any version of Xen using tools from prior to that commit
> >>>>
> >>>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
> >>>> tools xc_domain_restore() to start reading the qemu save record, as
> >>>> ctx->last_checkpoint is 0.
> >>> Can the receive not distinguish the case where it is receiving a
> >>> checkpoint stream from a regular one? Either via some property of the
> >>> stream or the parameters with which it was called?
> >>>
> >>> ctx->last_checkpoint should (be made to) only matter if you are doing
> >>> check pointing and I think we can mandate that checking pointing is only
> >>> supported between like versions of Xen. TBH it's a bit odd to send the
> >>> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
> >>> done.
> >>>
> >>> Ian.
> >> Sending LAST_CHECKPOINT is the only thing which has allowed normal
> >> migration to work since this regression.
> >>
> >> I cant find any way of distinguishing a normal vs checkpointed migrate
> >> from the stream alone.  The save_callback->checkpoint function pointer
> >> may or may not be filled in by a toolstack, but is completely
> >> out-of-band as far as the migration stream is concerned.
> > I suppose your new last_generation-aware flag is similar to the "is a
> > checkpoint restore" lag which I was thinking of, but it somewhat easier
> > for the toolstack to know that it is receiving a checkpoint vs. normal
> > migration compared with magically knowing the version on the other end
> > of the connection.
> >
> > Ian.
> >
> 
> So is this an indication of this patch being a good fix for the problem?
> (I guess this is a pseudo-ping given no specific objection)

If it implements the semantics I describe then all you need to do is
change the name.

The reason I prefer a "incoming checkpointed stream" flag over a
"understands last checkpoint" is that the former is something the
toolstack knows about independently of the versions of Xen on either end
of the socket.

Ian.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
  2013-07-16  9:32         ` Ian Campbell
@ 2013-07-16  9:32           ` Andrew Cooper
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2013-07-16  9:32 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Xen-devel

On 16/07/13 10:32, Ian Campbell wrote:
> On Mon, 2013-07-15 at 15:24 +0100, Andrew Cooper wrote:
>> On 11/07/13 12:43, Ian Campbell wrote:
>>> On Thu, 2013-07-11 at 12:19 +0100, Andrew Cooper wrote:
>>>> On 11/07/13 10:55, Ian Campbell wrote:
>>>>> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
>>>>>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>>>>>>   "libxc: provide notification of final checkpoint to restore end"
>>>>>> broke migration from any version of Xen using tools from prior to that commit
>>>>>>
>>>>>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>>>>>> tools xc_domain_restore() to start reading the qemu save record, as
>>>>>> ctx->last_checkpoint is 0.
>>>>> Can the receive not distinguish the case where it is receiving a
>>>>> checkpoint stream from a regular one? Either via some property of the
>>>>> stream or the parameters with which it was called?
>>>>>
>>>>> ctx->last_checkpoint should (be made to) only matter if you are doing
>>>>> check pointing and I think we can mandate that checking pointing is only
>>>>> supported between like versions of Xen. TBH it's a bit odd to send the
>>>>> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
>>>>> done.
>>>>>
>>>>> Ian.
>>>> Sending LAST_CHECKPOINT is the only thing which has allowed normal
>>>> migration to work since this regression.
>>>>
>>>> I cant find any way of distinguishing a normal vs checkpointed migrate
>>>> from the stream alone.  The save_callback->checkpoint function pointer
>>>> may or may not be filled in by a toolstack, but is completely
>>>> out-of-band as far as the migration stream is concerned.
>>> I suppose your new last_generation-aware flag is similar to the "is a
>>> checkpoint restore" lag which I was thinking of, but it somewhat easier
>>> for the toolstack to know that it is receiving a checkpoint vs. normal
>>> migration compared with magically knowing the version on the other end
>>> of the connection.
>>>
>>> Ian.
>>>
>> So is this an indication of this patch being a good fix for the problem?
>> (I guess this is a pseudo-ping given no specific objection)
> If it implements the semantics I describe then all you need to do is
> change the name.
>
> The reason I prefer a "incoming checkpointed stream" flag over a
> "understands last checkpoint" is that the former is something the
> toolstack knows about independently of the versions of Xen on either end
> of the socket.
>
> Ian.
>
>

Ok - I will respin to that effect.

~Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-07-16  9:32 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-07-10 19:19 [PATCH] tools/migrate: Fix regression when migrating from older version of Xen Andrew Cooper
2013-07-11  9:46 ` George Dunlap
2013-07-11 11:09   ` Andrew Cooper
2013-07-11  9:55 ` Ian Campbell
2013-07-11 11:19   ` Andrew Cooper
2013-07-11 11:43     ` Ian Campbell
2013-07-15 14:24       ` Andrew Cooper
2013-07-16  9:32         ` Ian Campbell
2013-07-16  9:32           ` Andrew Cooper

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).