* [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
@ 2013-07-10 19:19 Andrew Cooper
2013-07-11 9:46 ` George Dunlap
2013-07-11 9:55 ` Ian Campbell
0 siblings, 2 replies; 9+ messages in thread
From: Andrew Cooper @ 2013-07-10 19:19 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell
Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
"libxc: provide notification of final checkpoint to restore end"
broke migration from any version of Xen using tools from prior to that commit
Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
tools xc_domain_restore() to start reading the qemu save record, as
ctx->last_checkpoint is 0.
The failure looks like:
xc: error: Max batch size exceeded (1970103633). Giving up.
where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
opposite function regresson for anyone using the current behaviour of
save_callbacks->checkpoint().
The only safe fix is to rely on the toolstack to provide this information.
Passing 0 results in unchanged behaviour, while passing nonzero means "the
other end of the migration stream does not know about
XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
tools/libxc/xc_domain_restore.c | 3 ++-
tools/libxc/xc_nomigrate.c | 2 +-
tools/libxc/xenguest.h | 4 +++-
tools/libxl/libxl_save_helper.c | 2 +-
tools/xcutils/xc_restore.c | 2 +-
5 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index 63d36cd..40c9282 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -1401,7 +1401,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
domid_t store_domid, unsigned int console_evtchn,
unsigned long *console_mfn, domid_t console_domid,
unsigned int hvm, unsigned int pae, int superpages,
- int no_incr_generationid,
+ int no_incr_generationid, int last_checkpoint_unaware,
unsigned long *vm_generationid_addr,
struct restore_callbacks *callbacks)
{
@@ -1472,6 +1472,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
ctx->superpages = superpages;
ctx->hvm = hvm;
+ ctx->last_checkpoint = !!last_checkpoint_unaware;
ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 73e7566..1d3aec5 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
domid_t store_domid, unsigned int console_evtchn,
unsigned long *console_mfn, domid_t console_domid,
unsigned int hvm, unsigned int pae, int superpages,
- int no_incr_generationid,
+ int no_incr_generationid, int last_checkpoint_unaware,
unsigned long *vm_generationid_addr,
struct restore_callbacks *callbacks)
{
diff --git a/tools/libxc/xenguest.h b/tools/libxc/xenguest.h
index 4714bd2..977dde3 100644
--- a/tools/libxc/xenguest.h
+++ b/tools/libxc/xenguest.h
@@ -114,6 +114,8 @@ struct restore_callbacks {
* @parm pae non-zero if this HVM domain has PAE support enabled
* @parm superpages non-zero to allocate guest memory with superpages
* @parm no_incr_generationid non-zero if generation id is NOT to be incremented
+ * @parm last_checkpoint_unaware non-zero if far end of the migration is unaware
+ * of the XC_SAVE_ID_LAST_CHECKPOINT hunk in the migration stream.
* @parm vm_generationid_addr returned with the address of the generation id buffer
* @parm callbacks non-NULL to receive a callback to restore toolstack
* specific data
@@ -124,7 +126,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
domid_t store_domid, unsigned int console_evtchn,
unsigned long *console_mfn, domid_t console_domid,
unsigned int hvm, unsigned int pae, int superpages,
- int no_incr_generationid,
+ int no_incr_generationid, int last_checkpoint_unaware,
unsigned long *vm_generationid_addr,
struct restore_callbacks *callbacks);
/**
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 772251a..8f14b8b 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -264,7 +264,7 @@ int main(int argc, char **argv)
r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
store_domid, console_evtchn, &console_mfn,
console_domid, hvm, pae, superpages,
- no_incr_genidad, &genidad,
+ no_incr_genidad, 0, &genidad,
&helper_restore_callbacks);
helper_stub_restore_results(store_mfn,console_mfn,genidad,0);
complete(r);
diff --git a/tools/xcutils/xc_restore.c b/tools/xcutils/xc_restore.c
index 35d725c..e657c7f 100644
--- a/tools/xcutils/xc_restore.c
+++ b/tools/xcutils/xc_restore.c
@@ -52,7 +52,7 @@ main(int argc, char **argv)
ret = xc_domain_restore(xch, io_fd, domid, store_evtchn, &store_mfn, 0,
console_evtchn, &console_mfn, 0, hvm, pae, superpages,
- 0, NULL, NULL);
+ 0, 0, NULL, NULL);
if ( ret == 0 )
{
--
1.7.10.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
2013-07-10 19:19 [PATCH] tools/migrate: Fix regression when migrating from older version of Xen Andrew Cooper
@ 2013-07-11 9:46 ` George Dunlap
2013-07-11 11:09 ` Andrew Cooper
2013-07-11 9:55 ` Ian Campbell
1 sibling, 1 reply; 9+ messages in thread
From: George Dunlap @ 2013-07-11 9:46 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Ian Jackson, Ian Campbell, Xen-devel
On Wed, Jul 10, 2013 at 8:19 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
> "libxc: provide notification of final checkpoint to restore end"
> broke migration from any version of Xen using tools from prior to that commit
>
> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
> tools xc_domain_restore() to start reading the qemu save record, as
> ctx->last_checkpoint is 0.
>
> The failure looks like:
> xc: error: Max batch size exceeded (1970103633). Giving up.
> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
>
> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
> opposite function regresson for anyone using the current behaviour of
> save_callbacks->checkpoint().
>
> The only safe fix is to rely on the toolstack to provide this information.
>
> Passing 0 results in unchanged behaviour, while passing nonzero means "the
> other end of the migration stream does not know about
> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Shouldn't there also be a way to actually use the flag?
-George
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
2013-07-11 9:46 ` George Dunlap
@ 2013-07-11 11:09 ` Andrew Cooper
0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2013-07-11 11:09 UTC (permalink / raw)
To: George Dunlap; +Cc: Ian Jackson, Ian Campbell, Xen-devel
On 11/07/13 10:46, George Dunlap wrote:
> On Wed, Jul 10, 2013 at 8:19 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>> "libxc: provide notification of final checkpoint to restore end"
>> broke migration from any version of Xen using tools from prior to that commit
>>
>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>> tools xc_domain_restore() to start reading the qemu save record, as
>> ctx->last_checkpoint is 0.
>>
>> The failure looks like:
>> xc: error: Max batch size exceeded (1970103633). Giving up.
>> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
>>
>> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
>> opposite function regresson for anyone using the current behaviour of
>> save_callbacks->checkpoint().
>>
>> The only safe fix is to rely on the toolstack to provide this information.
>>
>> Passing 0 results in unchanged behaviour, while passing nonzero means "the
>> other end of the migration stream does not know about
>> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Shouldn't there also be a way to actually use the flag?
>
> -George
How do you mean? Plumb it up through to xl?
xl (in any non-experemental form) post-dates this regression. xend
predates it, but I can't find any way of working out whether the far end
predates the regression.
~Andrew
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
2013-07-10 19:19 [PATCH] tools/migrate: Fix regression when migrating from older version of Xen Andrew Cooper
2013-07-11 9:46 ` George Dunlap
@ 2013-07-11 9:55 ` Ian Campbell
2013-07-11 11:19 ` Andrew Cooper
1 sibling, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-07-11 9:55 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel
On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
> "libxc: provide notification of final checkpoint to restore end"
> broke migration from any version of Xen using tools from prior to that commit
>
> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
> tools xc_domain_restore() to start reading the qemu save record, as
> ctx->last_checkpoint is 0.
Can the receive not distinguish the case where it is receiving a
checkpoint stream from a regular one? Either via some property of the
stream or the parameters with which it was called?
ctx->last_checkpoint should (be made to) only matter if you are doing
check pointing and I think we can mandate that checking pointing is only
supported between like versions of Xen. TBH it's a bit odd to send the
LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
done.
Ian.
>
> The failure looks like:
> xc: error: Max batch size exceeded (1970103633). Giving up.
> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
>
> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
> opposite function regresson for anyone using the current behaviour of
> save_callbacks->checkpoint().
>
> The only safe fix is to rely on the toolstack to provide this information.
>
> Passing 0 results in unchanged behaviour, while passing nonzero means "the
> other end of the migration stream does not know about
> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> tools/libxc/xc_domain_restore.c | 3 ++-
> tools/libxc/xc_nomigrate.c | 2 +-
> tools/libxc/xenguest.h | 4 +++-
> tools/libxl/libxl_save_helper.c | 2 +-
> tools/xcutils/xc_restore.c | 2 +-
> 5 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
> index 63d36cd..40c9282 100644
> --- a/tools/libxc/xc_domain_restore.c
> +++ b/tools/libxc/xc_domain_restore.c
> @@ -1401,7 +1401,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
> domid_t store_domid, unsigned int console_evtchn,
> unsigned long *console_mfn, domid_t console_domid,
> unsigned int hvm, unsigned int pae, int superpages,
> - int no_incr_generationid,
> + int no_incr_generationid, int last_checkpoint_unaware,
> unsigned long *vm_generationid_addr,
> struct restore_callbacks *callbacks)
> {
> @@ -1472,6 +1472,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>
> ctx->superpages = superpages;
> ctx->hvm = hvm;
> + ctx->last_checkpoint = !!last_checkpoint_unaware;
>
> ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
>
> diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
> index 73e7566..1d3aec5 100644
> --- a/tools/libxc/xc_nomigrate.c
> +++ b/tools/libxc/xc_nomigrate.c
> @@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
> domid_t store_domid, unsigned int console_evtchn,
> unsigned long *console_mfn, domid_t console_domid,
> unsigned int hvm, unsigned int pae, int superpages,
> - int no_incr_generationid,
> + int no_incr_generationid, int last_checkpoint_unaware,
> unsigned long *vm_generationid_addr,
> struct restore_callbacks *callbacks)
> {
> diff --git a/tools/libxc/xenguest.h b/tools/libxc/xenguest.h
> index 4714bd2..977dde3 100644
> --- a/tools/libxc/xenguest.h
> +++ b/tools/libxc/xenguest.h
> @@ -114,6 +114,8 @@ struct restore_callbacks {
> * @parm pae non-zero if this HVM domain has PAE support enabled
> * @parm superpages non-zero to allocate guest memory with superpages
> * @parm no_incr_generationid non-zero if generation id is NOT to be incremented
> + * @parm last_checkpoint_unaware non-zero if far end of the migration is unaware
> + * of the XC_SAVE_ID_LAST_CHECKPOINT hunk in the migration stream.
> * @parm vm_generationid_addr returned with the address of the generation id buffer
> * @parm callbacks non-NULL to receive a callback to restore toolstack
> * specific data
> @@ -124,7 +126,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
> domid_t store_domid, unsigned int console_evtchn,
> unsigned long *console_mfn, domid_t console_domid,
> unsigned int hvm, unsigned int pae, int superpages,
> - int no_incr_generationid,
> + int no_incr_generationid, int last_checkpoint_unaware,
> unsigned long *vm_generationid_addr,
> struct restore_callbacks *callbacks);
> /**
> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
> index 772251a..8f14b8b 100644
> --- a/tools/libxl/libxl_save_helper.c
> +++ b/tools/libxl/libxl_save_helper.c
> @@ -264,7 +264,7 @@ int main(int argc, char **argv)
> r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
> store_domid, console_evtchn, &console_mfn,
> console_domid, hvm, pae, superpages,
> - no_incr_genidad, &genidad,
> + no_incr_genidad, 0, &genidad,
> &helper_restore_callbacks);
> helper_stub_restore_results(store_mfn,console_mfn,genidad,0);
> complete(r);
> diff --git a/tools/xcutils/xc_restore.c b/tools/xcutils/xc_restore.c
> index 35d725c..e657c7f 100644
> --- a/tools/xcutils/xc_restore.c
> +++ b/tools/xcutils/xc_restore.c
> @@ -52,7 +52,7 @@ main(int argc, char **argv)
>
> ret = xc_domain_restore(xch, io_fd, domid, store_evtchn, &store_mfn, 0,
> console_evtchn, &console_mfn, 0, hvm, pae, superpages,
> - 0, NULL, NULL);
> + 0, 0, NULL, NULL);
>
> if ( ret == 0 )
> {
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
2013-07-11 9:55 ` Ian Campbell
@ 2013-07-11 11:19 ` Andrew Cooper
2013-07-11 11:43 ` Ian Campbell
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2013-07-11 11:19 UTC (permalink / raw)
To: Ian Campbell; +Cc: Ian Jackson, Xen-devel
On 11/07/13 10:55, Ian Campbell wrote:
> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>> "libxc: provide notification of final checkpoint to restore end"
>> broke migration from any version of Xen using tools from prior to that commit
>>
>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>> tools xc_domain_restore() to start reading the qemu save record, as
>> ctx->last_checkpoint is 0.
> Can the receive not distinguish the case where it is receiving a
> checkpoint stream from a regular one? Either via some property of the
> stream or the parameters with which it was called?
>
> ctx->last_checkpoint should (be made to) only matter if you are doing
> check pointing and I think we can mandate that checking pointing is only
> supported between like versions of Xen. TBH it's a bit odd to send the
> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
> done.
>
> Ian.
Sending LAST_CHECKPOINT is the only thing which has allowed normal
migration to work since this regression.
I cant find any way of distinguishing a normal vs checkpointed migrate
from the stream alone. The save_callback->checkpoint function pointer
may or may not be filled in by a toolstack, but is completely
out-of-band as far as the migration stream is concerned.
~Andrew
>> The failure looks like:
>> xc: error: Max batch size exceeded (1970103633). Giving up.
>> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
>>
>> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
>> opposite function regresson for anyone using the current behaviour of
>> save_callbacks->checkpoint().
>>
>> The only safe fix is to rely on the toolstack to provide this information.
>>
>> Passing 0 results in unchanged behaviour, while passing nonzero means "the
>> other end of the migration stream does not know about
>> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> tools/libxc/xc_domain_restore.c | 3 ++-
>> tools/libxc/xc_nomigrate.c | 2 +-
>> tools/libxc/xenguest.h | 4 +++-
>> tools/libxl/libxl_save_helper.c | 2 +-
>> tools/xcutils/xc_restore.c | 2 +-
>> 5 files changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
>> index 63d36cd..40c9282 100644
>> --- a/tools/libxc/xc_domain_restore.c
>> +++ b/tools/libxc/xc_domain_restore.c
>> @@ -1401,7 +1401,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>> domid_t store_domid, unsigned int console_evtchn,
>> unsigned long *console_mfn, domid_t console_domid,
>> unsigned int hvm, unsigned int pae, int superpages,
>> - int no_incr_generationid,
>> + int no_incr_generationid, int last_checkpoint_unaware,
>> unsigned long *vm_generationid_addr,
>> struct restore_callbacks *callbacks)
>> {
>> @@ -1472,6 +1472,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>
>> ctx->superpages = superpages;
>> ctx->hvm = hvm;
>> + ctx->last_checkpoint = !!last_checkpoint_unaware;
>>
>> ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
>>
>> diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
>> index 73e7566..1d3aec5 100644
>> --- a/tools/libxc/xc_nomigrate.c
>> +++ b/tools/libxc/xc_nomigrate.c
>> @@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>> domid_t store_domid, unsigned int console_evtchn,
>> unsigned long *console_mfn, domid_t console_domid,
>> unsigned int hvm, unsigned int pae, int superpages,
>> - int no_incr_generationid,
>> + int no_incr_generationid, int last_checkpoint_unaware,
>> unsigned long *vm_generationid_addr,
>> struct restore_callbacks *callbacks)
>> {
>> diff --git a/tools/libxc/xenguest.h b/tools/libxc/xenguest.h
>> index 4714bd2..977dde3 100644
>> --- a/tools/libxc/xenguest.h
>> +++ b/tools/libxc/xenguest.h
>> @@ -114,6 +114,8 @@ struct restore_callbacks {
>> * @parm pae non-zero if this HVM domain has PAE support enabled
>> * @parm superpages non-zero to allocate guest memory with superpages
>> * @parm no_incr_generationid non-zero if generation id is NOT to be incremented
>> + * @parm last_checkpoint_unaware non-zero if far end of the migration is unaware
>> + * of the XC_SAVE_ID_LAST_CHECKPOINT hunk in the migration stream.
>> * @parm vm_generationid_addr returned with the address of the generation id buffer
>> * @parm callbacks non-NULL to receive a callback to restore toolstack
>> * specific data
>> @@ -124,7 +126,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>> domid_t store_domid, unsigned int console_evtchn,
>> unsigned long *console_mfn, domid_t console_domid,
>> unsigned int hvm, unsigned int pae, int superpages,
>> - int no_incr_generationid,
>> + int no_incr_generationid, int last_checkpoint_unaware,
>> unsigned long *vm_generationid_addr,
>> struct restore_callbacks *callbacks);
>> /**
>> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
>> index 772251a..8f14b8b 100644
>> --- a/tools/libxl/libxl_save_helper.c
>> +++ b/tools/libxl/libxl_save_helper.c
>> @@ -264,7 +264,7 @@ int main(int argc, char **argv)
>> r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
>> store_domid, console_evtchn, &console_mfn,
>> console_domid, hvm, pae, superpages,
>> - no_incr_genidad, &genidad,
>> + no_incr_genidad, 0, &genidad,
>> &helper_restore_callbacks);
>> helper_stub_restore_results(store_mfn,console_mfn,genidad,0);
>> complete(r);
>> diff --git a/tools/xcutils/xc_restore.c b/tools/xcutils/xc_restore.c
>> index 35d725c..e657c7f 100644
>> --- a/tools/xcutils/xc_restore.c
>> +++ b/tools/xcutils/xc_restore.c
>> @@ -52,7 +52,7 @@ main(int argc, char **argv)
>>
>> ret = xc_domain_restore(xch, io_fd, domid, store_evtchn, &store_mfn, 0,
>> console_evtchn, &console_mfn, 0, hvm, pae, superpages,
>> - 0, NULL, NULL);
>> + 0, 0, NULL, NULL);
>>
>> if ( ret == 0 )
>> {
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
2013-07-11 11:19 ` Andrew Cooper
@ 2013-07-11 11:43 ` Ian Campbell
2013-07-15 14:24 ` Andrew Cooper
0 siblings, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-07-11 11:43 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel
On Thu, 2013-07-11 at 12:19 +0100, Andrew Cooper wrote:
> On 11/07/13 10:55, Ian Campbell wrote:
> > On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
> >> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
> >> "libxc: provide notification of final checkpoint to restore end"
> >> broke migration from any version of Xen using tools from prior to that commit
> >>
> >> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
> >> tools xc_domain_restore() to start reading the qemu save record, as
> >> ctx->last_checkpoint is 0.
> > Can the receive not distinguish the case where it is receiving a
> > checkpoint stream from a regular one? Either via some property of the
> > stream or the parameters with which it was called?
> >
> > ctx->last_checkpoint should (be made to) only matter if you are doing
> > check pointing and I think we can mandate that checking pointing is only
> > supported between like versions of Xen. TBH it's a bit odd to send the
> > LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
> > done.
> >
> > Ian.
>
> Sending LAST_CHECKPOINT is the only thing which has allowed normal
> migration to work since this regression.
>
> I cant find any way of distinguishing a normal vs checkpointed migrate
> from the stream alone. The save_callback->checkpoint function pointer
> may or may not be filled in by a toolstack, but is completely
> out-of-band as far as the migration stream is concerned.
I suppose your new last_generation-aware flag is similar to the "is a
checkpoint restore" lag which I was thinking of, but it somewhat easier
for the toolstack to know that it is receiving a checkpoint vs. normal
migration compared with magically knowing the version on the other end
of the connection.
Ian.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
2013-07-11 11:43 ` Ian Campbell
@ 2013-07-15 14:24 ` Andrew Cooper
2013-07-16 9:32 ` Ian Campbell
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2013-07-15 14:24 UTC (permalink / raw)
To: Ian Campbell; +Cc: Ian Jackson, Xen-devel
On 11/07/13 12:43, Ian Campbell wrote:
> On Thu, 2013-07-11 at 12:19 +0100, Andrew Cooper wrote:
>> On 11/07/13 10:55, Ian Campbell wrote:
>>> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
>>>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>>>> "libxc: provide notification of final checkpoint to restore end"
>>>> broke migration from any version of Xen using tools from prior to that commit
>>>>
>>>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>>>> tools xc_domain_restore() to start reading the qemu save record, as
>>>> ctx->last_checkpoint is 0.
>>> Can the receive not distinguish the case where it is receiving a
>>> checkpoint stream from a regular one? Either via some property of the
>>> stream or the parameters with which it was called?
>>>
>>> ctx->last_checkpoint should (be made to) only matter if you are doing
>>> check pointing and I think we can mandate that checking pointing is only
>>> supported between like versions of Xen. TBH it's a bit odd to send the
>>> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
>>> done.
>>>
>>> Ian.
>> Sending LAST_CHECKPOINT is the only thing which has allowed normal
>> migration to work since this regression.
>>
>> I cant find any way of distinguishing a normal vs checkpointed migrate
>> from the stream alone. The save_callback->checkpoint function pointer
>> may or may not be filled in by a toolstack, but is completely
>> out-of-band as far as the migration stream is concerned.
> I suppose your new last_generation-aware flag is similar to the "is a
> checkpoint restore" lag which I was thinking of, but it somewhat easier
> for the toolstack to know that it is receiving a checkpoint vs. normal
> migration compared with magically knowing the version on the other end
> of the connection.
>
> Ian.
>
So is this an indication of this patch being a good fix for the problem?
(I guess this is a pseudo-ping given no specific objection)
~Andrew
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
2013-07-15 14:24 ` Andrew Cooper
@ 2013-07-16 9:32 ` Ian Campbell
2013-07-16 9:32 ` Andrew Cooper
0 siblings, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-07-16 9:32 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel
On Mon, 2013-07-15 at 15:24 +0100, Andrew Cooper wrote:
> On 11/07/13 12:43, Ian Campbell wrote:
> > On Thu, 2013-07-11 at 12:19 +0100, Andrew Cooper wrote:
> >> On 11/07/13 10:55, Ian Campbell wrote:
> >>> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
> >>>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
> >>>> "libxc: provide notification of final checkpoint to restore end"
> >>>> broke migration from any version of Xen using tools from prior to that commit
> >>>>
> >>>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
> >>>> tools xc_domain_restore() to start reading the qemu save record, as
> >>>> ctx->last_checkpoint is 0.
> >>> Can the receive not distinguish the case where it is receiving a
> >>> checkpoint stream from a regular one? Either via some property of the
> >>> stream or the parameters with which it was called?
> >>>
> >>> ctx->last_checkpoint should (be made to) only matter if you are doing
> >>> check pointing and I think we can mandate that checking pointing is only
> >>> supported between like versions of Xen. TBH it's a bit odd to send the
> >>> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
> >>> done.
> >>>
> >>> Ian.
> >> Sending LAST_CHECKPOINT is the only thing which has allowed normal
> >> migration to work since this regression.
> >>
> >> I cant find any way of distinguishing a normal vs checkpointed migrate
> >> from the stream alone. The save_callback->checkpoint function pointer
> >> may or may not be filled in by a toolstack, but is completely
> >> out-of-band as far as the migration stream is concerned.
> > I suppose your new last_generation-aware flag is similar to the "is a
> > checkpoint restore" lag which I was thinking of, but it somewhat easier
> > for the toolstack to know that it is receiving a checkpoint vs. normal
> > migration compared with magically knowing the version on the other end
> > of the connection.
> >
> > Ian.
> >
>
> So is this an indication of this patch being a good fix for the problem?
> (I guess this is a pseudo-ping given no specific objection)
If it implements the semantics I describe then all you need to do is
change the name.
The reason I prefer a "incoming checkpointed stream" flag over a
"understands last checkpoint" is that the former is something the
toolstack knows about independently of the versions of Xen on either end
of the socket.
Ian.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] tools/migrate: Fix regression when migrating from older version of Xen
2013-07-16 9:32 ` Ian Campbell
@ 2013-07-16 9:32 ` Andrew Cooper
0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2013-07-16 9:32 UTC (permalink / raw)
To: Ian Campbell; +Cc: Ian Jackson, Xen-devel
On 16/07/13 10:32, Ian Campbell wrote:
> On Mon, 2013-07-15 at 15:24 +0100, Andrew Cooper wrote:
>> On 11/07/13 12:43, Ian Campbell wrote:
>>> On Thu, 2013-07-11 at 12:19 +0100, Andrew Cooper wrote:
>>>> On 11/07/13 10:55, Ian Campbell wrote:
>>>>> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
>>>>>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>>>>>> "libxc: provide notification of final checkpoint to restore end"
>>>>>> broke migration from any version of Xen using tools from prior to that commit
>>>>>>
>>>>>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>>>>>> tools xc_domain_restore() to start reading the qemu save record, as
>>>>>> ctx->last_checkpoint is 0.
>>>>> Can the receive not distinguish the case where it is receiving a
>>>>> checkpoint stream from a regular one? Either via some property of the
>>>>> stream or the parameters with which it was called?
>>>>>
>>>>> ctx->last_checkpoint should (be made to) only matter if you are doing
>>>>> check pointing and I think we can mandate that checking pointing is only
>>>>> supported between like versions of Xen. TBH it's a bit odd to send the
>>>>> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
>>>>> done.
>>>>>
>>>>> Ian.
>>>> Sending LAST_CHECKPOINT is the only thing which has allowed normal
>>>> migration to work since this regression.
>>>>
>>>> I cant find any way of distinguishing a normal vs checkpointed migrate
>>>> from the stream alone. The save_callback->checkpoint function pointer
>>>> may or may not be filled in by a toolstack, but is completely
>>>> out-of-band as far as the migration stream is concerned.
>>> I suppose your new last_generation-aware flag is similar to the "is a
>>> checkpoint restore" lag which I was thinking of, but it somewhat easier
>>> for the toolstack to know that it is receiving a checkpoint vs. normal
>>> migration compared with magically knowing the version on the other end
>>> of the connection.
>>>
>>> Ian.
>>>
>> So is this an indication of this patch being a good fix for the problem?
>> (I guess this is a pseudo-ping given no specific objection)
> If it implements the semantics I describe then all you need to do is
> change the name.
>
> The reason I prefer a "incoming checkpointed stream" flag over a
> "understands last checkpoint" is that the former is something the
> toolstack knows about independently of the versions of Xen on either end
> of the socket.
>
> Ian.
>
>
Ok - I will respin to that effect.
~Andrew
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2013-07-16 9:32 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-07-10 19:19 [PATCH] tools/migrate: Fix regression when migrating from older version of Xen Andrew Cooper
2013-07-11 9:46 ` George Dunlap
2013-07-11 11:09 ` Andrew Cooper
2013-07-11 9:55 ` Ian Campbell
2013-07-11 11:19 ` Andrew Cooper
2013-07-11 11:43 ` Ian Campbell
2013-07-15 14:24 ` Andrew Cooper
2013-07-16 9:32 ` Ian Campbell
2013-07-16 9:32 ` Andrew Cooper
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).