* [Planning for Xen-4.6] Migration v2
@ 2014-11-25 19:54 Andrew Cooper
  2014-11-26  8:09 ` Olaf Hering
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Andrew Cooper @ 2014-11-25 19:54 UTC
  To: Xen-devel List
  Cc: Juergen Gross, Wei Liu, Ian Campbell, Tim Deegan, Ian Jackson,
	Ross Lagerwall, David Vrabel, Jan Beulich, Shriram Rajagopalan,
	Hongyang Yang

Hello,

The purpose of this email is to plan how to progress the migration v2
series through to being merged.  I believe I have CC'd everyone with a
specific interest in this area, but apologies if I have missed anyone.

Migration v2 is in exclusive use in XenServer 6.5.  We primarily
developed migration v2 because we needed a 32bit -> 64bit toolstack
upgrade path.  The code has all the features XenServer previously
supported, and we consider it fully baked and without any known bugs,
including transparent legacy-to-v2 conversion on upgrade.

We did endeavour to get migration v2 into Xen 4.5, but regrettably this
did not happen.  A consequence of this, along with the code being in
XenServer 6.5, is that the wire format is now set in stone.  Luckily, it
has been explicitly designed to be easy to extend in a forward
compatible manner, so this is not a problem moving forward.
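
To illustrate what "easy to extend in a forward compatible manner" can
mean for a stream format, here is a minimal sketch of a record reader
that ignores optional records it does not recognise.  Everything in it
is an assumption made for illustration (the 8-byte header, the padding
rule, the OPTIONAL_FLAG bit and the record type numbers); it is not
taken from this thread and should not be read as the actual v2 layout.

```python
import io
import struct

OPTIONAL_FLAG = 0x80000000  # hypothetical: top bit marks a skippable record
KNOWN_TYPES = {0x1, 0x2}    # hypothetical types this receiver understands


def read_records(stream):
    """Yield (type, body) for known records, skipping unknown optional ones."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of stream
        if len(header) < 8:
            raise ValueError("truncated record header")
        rec_type, length = struct.unpack("<II", header)
        padded = (length + 7) & ~7  # bodies are padded to 8 bytes in this sketch
        body = stream.read(padded)
        if len(body) < padded:
            raise ValueError("truncated record body")
        if rec_type in KNOWN_TYPES:
            yield rec_type, body[:length]
        elif rec_type & OPTIONAL_FLAG:
            continue  # newer sender, older receiver: safe to ignore
        else:
            raise ValueError(f"unknown mandatory record {rec_type:#x}")


def make_record(rec_type, body):
    """Serialise one record: header, body, then zero padding."""
    pad = (-len(body)) % 8
    return struct.pack("<II", rec_type, len(body)) + body + b"\0" * pad


def parse(data):
    """Convenience wrapper: parse a complete byte string."""
    return list(read_records(io.BytesIO(data)))
```

The point of the design is that a sender can add new optional record
types later without breaking receivers that predate them, while an
unknown mandatory record still fails loudly.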

The expectation is that the migration v2 code will completely replace
the existing migration code, which will involve removing
xc_domain_save.c and xc_domain_restore.c, as well as assorted other
orphaned code in libxenctrl and libxenguest.

There are 3 areas of concern which have been identified so far.

1) TMEM support

Migration v2 doesn't currently have any tmem migration support.  The
maintainers have been asked whether they actually expect legacy tmem
migration to work, but I have not heard any reply yet.  At the very
least, migration v2 tmem support would want some new thought put into
the wire protocol.  I am hoping that, as TMEM is still tech preview and
still in the process of having XSA-15 fixed, working tmem migration v2
is not insisted upon as a prerequisite.

2) Remus/COLO support

Migration v2 doesn't currently have any Remus support.  There was a
draft series which added Remus support, and showed that it was
particularly simple to add Remus support to migration v2.  I integrated
several bugfixes as a side effect of that series, but the actual Remus
content needed a refresh.  This got delayed behind the Remus libxl
effort.  It is my hope that the Remus maintainers can refresh that
series and provide assistance while testing.

3) Libxl and xl support

Libxl and xl have as many problems as the libxc code did when it comes
to incompatible wire formats and layering violations.  In particular, it
is not possible to determine the bitness of the sending
libxl-saverestore-helper, meaning that legacy conversion requires active
administrator input, or at least a passive assumption that the bitness
is the same.

There is an xl/libxl part of the migration v2 series which attempts to
rectify this all in one go, as there is no alternative way of doing so. 
The libxl section of the series is certainly not yet complete, but
specific queries to the maintainers have thus far gone unanswered.  On
the other hand, the series does basically WorksForMe, including
transparent legacy upgrade, suggesting that it is at least in an
appropriate ballpark.


*) Specific non-requirements:

There have been issues identified with dynamic (in a p2m sense) guests
and migration, which results in failed migration or image corruption. 
While these issues certainly want fixing, they are bugs which exist in
the legacy code.  As such, they are not prerequisites to fix before v2
can be accepted.


Anyway, it is my hope that this planning email can help get things on
track to start pursuing active development again as soon as the 4.6 dev
window opens, with the aim to get all the code merged as early as
possible in the dev window to allow as much testing as possible.

~Andrew


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-25 19:54 [Planning for Xen-4.6] Migration v2 Andrew Cooper
@ 2014-11-26  8:09 ` Olaf Hering
  2014-11-26 13:17   ` Andrew Cooper
  2014-11-26 16:44 ` Ian Campbell
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Olaf Hering @ 2014-11-26  8:09 UTC
  To: Andrew Cooper
  Cc: Juergen Gross, Wei Liu, Ian Campbell, Ian Jackson, Tim Deegan,
	Xen-devel List, Ross Lagerwall, David Vrabel, Jan Beulich,
	Shriram Rajagopalan, Hongyang Yang

On Tue, Nov 25, Andrew Cooper wrote:

> The purpose of this email is to plan how to progress the migration v2
> series through to being merged.  I believe I have CC'd everyone with a
> specific interest in this area, but apologies if I have missed anyone.

While you mow that lawn, did you guys think of handling downtime of the
migrated VM? I added some knobs to abort migration in a very libxc
specific way. What I would like to see is a simple user interface for
virsh/xl to control the downtime. See the thread "limit downtime during
life migration from xl/virsh":

http://lists.xenproject.org/archives/html/xen-devel/2014-03/msg00785.html


Olaf


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-26  8:09 ` Olaf Hering
@ 2014-11-26 13:17   ` Andrew Cooper
  2014-11-26 14:53     ` Olaf Hering
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2014-11-26 13:17 UTC
  To: Olaf Hering
  Cc: Juergen Gross, Wei Liu, Ian Campbell, Ian Jackson, Tim Deegan,
	Xen-devel List, Ross Lagerwall, David Vrabel, Jan Beulich,
	Shriram Rajagopalan, Hongyang Yang

On 26/11/14 08:09, Olaf Hering wrote:
> On Tue, Nov 25, Andrew Cooper wrote:
>
>> The purpose of this email is to plan how to progress the migration v2
>> series through to being merged.  I believe I have CC'd everyone with a
>> specific interest in this area, but apologies if I have missed anyone.
> While you mow that lawn,

Lawns cease to be called lawns when they get this gnarly.  Paddock
perhaps? :)

> did you guys think of handling downtime of the
> migrated VM?

This again is in contradiction to the primary purpose of "Make something
which is no less functional than legacy migration".  Having said that,
migration downtime is an area we will be looking into in XenServer,
although at the moment the bottlenecks are elsewhere in the system.

> I added some knobs to abort migration in a very libxc
> specific way. What I would like to see is a simple user interface for
> virsh/xl to control the downtime. See the thread "limit downtime during
> life migration from xl/virsh":
>
> http://lists.xenproject.org/archives/html/xen-devel/2014-03/msg00785.html

The v2 code is substantially more efficient than the legacy code.  It
makes fewer hypercalls, it doesn't attempt to map frames it's not
planning to send or frames which it knows don't exist, and it makes an
order of magnitude fewer syscalls as part of actually moving data.  Side
by side in otherwise identical situations, v2 is noticeably faster than
legacy.  A gut feel on small VMs during dev testing would be between 1/4
and 1/3 faster, but we also have various measurements at a higher
level which show an improvement all round.  In particular, total time to
suspend a 128GB HVM guest is down by 60% when the *only* difference in
the system is the algorithm used in libxenguest.

Also, the current knobs in migration v2 are different.  Legacy has max
iterations and max factor as knobs (where max factor is frankly a
ludicrous parameter in my opinion), which are replaced with a dirty
threshold in v2.  v2 currently has these values hard coded to 5
iterations and 50 (or fewer) dirty frames before pause.  This has
proved to be better

It is certainly my hope going forward that different knobs can be
exposed.  One thing I think would be interesting is some proper
calculations of the delta in the dirty set, and offering a threshold
which chooses between "pause and complete" or "abort the migration and
complain that the VM is too active".
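
As a rough sketch of the decision described above, the following shows
the two hard-coded values (5 iterations, 50 dirty frames) together with
one possible shape for the "pause or abort" threshold.  The function
name, the min_shrink parameter and the exact way iterations are counted
are assumptions made for illustration, not what libxenguest does.

```python
from enum import Enum

MAX_ITERATIONS = 5    # from the text: hard-coded iteration limit
DIRTY_THRESHOLD = 50  # from the text: pause at this many (or fewer) dirty frames


class Action(Enum):
    CONTINUE = "send another live iteration"
    PAUSE = "pause and complete"
    ABORT = "abort: VM is too active"


def next_action(iteration, dirty_frames, prev_dirty_frames=None,
                min_shrink=None):
    """Decide what to do after one live iteration.

    min_shrink is a hypothetical knob: the minimum fraction by which the
    dirty set must have shrunk since the previous iteration.  With the
    default of None the behaviour is "always pause at the limit".
    """
    if dirty_frames <= DIRTY_THRESHOLD:
        return Action.PAUSE  # little left to send, downtime will be short
    if iteration >= MAX_ITERATIONS:
        too_active = (
            min_shrink is not None
            and prev_dirty_frames
            and dirty_frames > prev_dirty_frames * (1 - min_shrink)
        )
        return Action.ABORT if too_active else Action.PAUSE
    return Action.CONTINUE
```

For example, next_action(5, 10000, prev_dirty_frames=10500,
min_shrink=0.25) returns Action.ABORT, because the dirty set shrank by
less than a quarter, whereas the same call without min_shrink returns
Action.PAUSE.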

~Andrew


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-26 13:17   ` Andrew Cooper
@ 2014-11-26 14:53     ` Olaf Hering
  0 siblings, 0 replies; 10+ messages in thread
From: Olaf Hering @ 2014-11-26 14:53 UTC
  To: Andrew Cooper
  Cc: Juergen Gross, Wei Liu, Ian Campbell, Ian Jackson, Tim Deegan,
	Xen-devel List, Ross Lagerwall, David Vrabel, Jan Beulich,
	Shriram Rajagopalan, Hongyang Yang

On Wed, Nov 26, Andrew Cooper wrote:

> It is certainly my hope going forward that different knobs can be
> exposed.  One thing I think would be interesting is some proper
> calculations of the delta in the dirty set, and offering a threshold
> which chooses between "pause and complete" or "abort the migration and
> complain that the VM is too active".

The "pause and complete" step is what causes unexpected time jumps in
the guest. Would be nice if that can be controlled with a knob.

Olaf


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-25 19:54 [Planning for Xen-4.6] Migration v2 Andrew Cooper
  2014-11-26  8:09 ` Olaf Hering
@ 2014-11-26 16:44 ` Ian Campbell
  2014-11-26 17:22   ` Andrew Cooper
  2014-11-26 16:50 ` Ian Campbell
  2014-11-27  8:33 ` Hongyang Yang
  3 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2014-11-26 16:44 UTC
  To: Andrew Cooper, Ian Jackson
  Cc: Juergen Gross, Wei Liu, Tim Deegan, Xen-devel List,
	Ross Lagerwall, David Vrabel, Jan Beulich, Shriram Rajagopalan,
	Hongyang Yang

On Tue, 2014-11-25 at 19:54 +0000, Andrew Cooper wrote:
> There is an xl/libxl part of the migration v2 series which attempts to
> rectify this all in one go, as there is no alternative way of doing so. 
> The libxl section of the series is certainly not yet complete, but
> specific queries to the maintainers have thus far gone unanswered.  On
> the other hand, the series does basically WorksForMe, including
> transparent legacy upgrade, suggesting that it is at least in an
> appropriate ballpark.

Is this, from "[PATCH 27/29] [VERY RFC] tools/libxl: Support restoring
legacy streams":

        This WorksForMe in the success case, but the error handling is certainly lacking.
        
        Specifically, the conversion script's output fd can't be closed until the v2
        read loop has exited (cleanly or otherwise), without risking a close()/open()
        race silently replacing the fd behind the loop's back.

        However, it can't be closed when the read loop exits, as the conversion script
        child might still be alive, and would prefer terminating cleanly to failing
        with a bad FD.

        Obviously, having one error handler block for the success/failure of the other
        side is a no-go, and would still involve preselecting which was expected to
        exit first.

        Does anyone have any clever ideas of how to asynchronously collect the events
        "the conversion script has exited", "the save helper has exited" and "the v2
        read loop has finished" given the available infrastructure, to kick off a
        combined cleanup of all 3?

? I said then:
        
        This is probably one for Ian when he gets back, but a state machine
        which is cranked in response to the callbacks from the various
        completion events might be one way to approach this.
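
A minimal sketch of such a state machine, assuming each of the three
completion callbacks can call one shared object.  The class name and
the event names are hypothetical and this is not libxl code; it only
shows the shape of "run the combined cleanup once, after the last of
the three events, in whatever order they arrive".

```python
class CombinedCleanup:
    """Collect three completion events, then run one cleanup exactly once."""

    # Hypothetical event names for the three things that must finish.
    EVENTS = frozenset({"conversion_exited", "helper_exited", "read_loop_done"})

    def __init__(self, cleanup):
        self._pending = set(self.EVENTS)
        self._failed = False
        self._cleanup = cleanup  # called with True if any event reported failure

    def complete(self, event, ok=True):
        """Record one completion; returns True if this call ran the cleanup."""
        if event not in self.EVENTS:
            raise ValueError(f"unknown event {event!r}")
        if event not in self._pending:
            raise RuntimeError(f"event {event!r} reported twice")
        self._pending.discard(event)
        self._failed = self._failed or not ok
        if self._pending:
            return False  # still waiting for the others, keep the fd open
        self._cleanup(self._failed)  # the only place the fd would be closed
        return True
```

Because the fd is only closed inside the cleanup, neither the read loop
nor the conversion child can see it disappear early, and no handler has
to guess which side exits first.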
        
Prodding Ian again (by moving to the To: line...)

Were there any other questions? I've had a scrobble through the bit of v7
which 00/29 suggests might contain them, but that's the only one I saw.

Ian. 


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-25 19:54 [Planning for Xen-4.6] Migration v2 Andrew Cooper
  2014-11-26  8:09 ` Olaf Hering
  2014-11-26 16:44 ` Ian Campbell
@ 2014-11-26 16:50 ` Ian Campbell
  2014-11-26 17:39   ` Andrew Cooper
  2014-11-27  8:33 ` Hongyang Yang
  3 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2014-11-26 16:50 UTC
  To: Andrew Cooper
  Cc: Juergen Gross, Wei Liu, Tim Deegan, Ian Jackson, Xen-devel List,
	Ross Lagerwall, David Vrabel, Jan Beulich, Shriram Rajagopalan,
	Hongyang Yang

On Tue, 2014-11-25 at 19:54 +0000, Andrew Cooper wrote:
> 3) Libxl and xl support
> 
> Libxl and xl have as many problems as the libxc code did when it comes
> to incompatible wire formats and layering violations.  In particular, it
> is not possible to determine the bitness of the sending
> libxl-saverestore-helper, meaning that legacy conversion requires active
> administrator input, or at least a passive assumption that the bitness
> is the same.

IOW when migrating legacy->new we have the same restriction as we do
today in the purely legacy world, which is that the two dom0's must
have matching bit widths?

IMHO this is fine. It essentially means that for xl users there is some
delayed gratification wrt the promise of migration between non-alike
dom0s. The migration from 4.5(legacy)->4.6(v2) won't support such
migrations, but the next step from 4.6(v2)->4.7(v2) will.

Ian.


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-26 16:44 ` Ian Campbell
@ 2014-11-26 17:22   ` Andrew Cooper
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cooper @ 2014-11-26 17:22 UTC
  To: Ian Campbell, Ian Jackson
  Cc: Juergen Gross, Wei Liu, Tim Deegan, Xen-devel List,
	Ross Lagerwall, David Vrabel, Jan Beulich, Shriram Rajagopalan,
	Hongyang Yang

On 26/11/14 16:44, Ian Campbell wrote:
> On Tue, 2014-11-25 at 19:54 +0000, Andrew Cooper wrote:
>> There is an xl/libxl part of the migration v2 series which attempts to
>> rectify this all in one go, as there is no alternative way of doing so. 
>> The libxl section of the series is certainly not yet complete, but
>> specific queries to the maintainers have thus far gone unanswered.  On
>> the other hand, the series does basically WorksForMe, including
>> transparent legacy upgrade, suggesting that it is at least in an
>> appropriate ballpark.
> Is this, from "[PATCH 27/29] [VERY RFC] tools/libxl: Support restoring
> legacy streams":
>
>         This WorksForMe in the success case, but the error handling is certainly lacking.
>         
>         Specifically, the conversion script's output fd can't be closed until the v2
>         read loop has exited (cleanly or otherwise), without risking a close()/open()
>         race silently replacing the fd behind the loop's back.
>
>         However, it can't be closed when the read loop exits, as the conversion script
>         child might still be alive, and would prefer terminating cleanly to failing
>         with a bad FD.
>
>         Obviously, having one error handler block for the success/failure of the other
>         side is a no-go, and would still involve preselecting which was expected to
>         exit first.
>
>         Does anyone have any clever ideas of how to asynchronously collect the events
>         "the conversion script has exited", "the save helper has exited" and "the v2
>         read loop has finished" given the available infrastructure, to kick off a
>         combined cleanup of all 3?
>
> ? I said then:
>         
>         This is probably one for Ian when he gets back, but a state machine
>         which is cranked in response to the callbacks from the various
>         completion events might be one way to approach this.
>         
> Prodding Ian again (by moving to the To: line...)
>
> Were there any other questions? I've had a scrobble through the bit of v7
> which 00/29 suggests might contain them, but that's the only one I saw.

I think that was the main one.  To the best of my understanding, there
is nowhere else in libxl which currently does multiple child management.

There were some other questions about the datacopier stuff, but that has
changed given the Remus libxl changes.  I am happy for those to be
deferred by one iteration of the series and see what falls out in the wash.

~Andrew


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-26 16:50 ` Ian Campbell
@ 2014-11-26 17:39   ` Andrew Cooper
  2014-11-27  8:46     ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2014-11-26 17:39 UTC
  To: Ian Campbell
  Cc: Juergen Gross, Wei Liu, Tim Deegan, Ian Jackson, Xen-devel List,
	Ross Lagerwall, David Vrabel, Jan Beulich, Shriram Rajagopalan,
	Hongyang Yang

On 26/11/14 16:50, Ian Campbell wrote:
> On Tue, 2014-11-25 at 19:54 +0000, Andrew Cooper wrote:
>> 3) Libxl and xl support
>>
>> Libxl and xl have as many problems as the libxc code did when it comes
>> to incompatible wire formats and layering violations.  In particular, it
>> is not possible to determine the bitness of the sending
>> libxl-saverestore-helper, meaning that legacy conversion requires active
>> administrator input, or at least a passive assumption that the bitness
>> is the same.
> IOW when migrating legacy->new we have the same restriction as we do
> today in the purely legacy world, which is that the two dom0's must
> have matching bit widths?

The legacy->new conversion removes bitness from the equation, but the
bitness of the legacy side is an input parameter to conversion.

For XenServer, this is easy, as all older versions of XenServer are
32bit.  This version, and future versions will use the new format, where
bitness is specifically irrelevant.

For xl, this is harder.  There exist both 32 and 64bit versions doing
legacy migration, and on the receiving side it is impossible to
determine, given only the incoming stream.

>
> IMHO this is fine. It essentially means that for xl users there is some
> delayed gratification wrt the promise of migration between non-alike
> dom0s. The migration from 4.5(legacy)->4.6(v2) won't support such
> migrations, but the next step from 4.6(v2)->4.7(v2) will.

Two options exist.

1) Assume that the sending bitness is the same as the receiving
bitness.  This is already the status quo, and will require that the two
dom0s are the same width.

2) Allow the administrator to specify the bitness of the sending side. 
In this case, xl 4.5(legacy)->4.6(v2) works even cross-bitness.
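
The two options differ only in where the legacy sender's width comes
from before it is handed to the conversion step.  A small sketch,
assuming a hypothetical administrator-supplied override (no such xl
option exists in this thread; the function and parameter names are
made up for illustration):

```python
import struct


def native_width():
    """Width in bits of a pointer in the process doing the restore."""
    return struct.calcsize("P") * 8


def sender_width(admin_override=None, receiver_width=None):
    """Pick the bitness to pass to legacy stream conversion.

    admin_override is hypothetical: a value the administrator supplies
    because the incoming legacy stream cannot reveal it.
    """
    if receiver_width is None:
        receiver_width = native_width()
    if admin_override is None:
        return receiver_width  # option 1: assume sender matches receiver
    if admin_override not in (32, 64):
        raise ValueError("bitness must be 32 or 64")
    return admin_override      # option 2: trust the administrator
```

With no override, a 64bit receiver treats the stream as 64bit, which is
the status quo.  With sender_width(32, receiver_width=64) the same
receiver would convert a stream from a 32bit legacy sender.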

~Andrew


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-25 19:54 [Planning for Xen-4.6] Migration v2 Andrew Cooper
                   ` (2 preceding siblings ...)
  2014-11-26 16:50 ` Ian Campbell
@ 2014-11-27  8:33 ` Hongyang Yang
  3 siblings, 0 replies; 10+ messages in thread
From: Hongyang Yang @ 2014-11-27  8:33 UTC
  To: Andrew Cooper, Xen-devel List
  Cc: Juergen Gross, Wei Liu, Ian Campbell, Tim Deegan, Ian Jackson,
	Ross Lagerwall, David Vrabel, Jan Beulich, Shriram Rajagopalan



On 11/26/2014 03:54 AM, Andrew Cooper wrote:
> Hello,
>
> The purpose of this email is to plan how to progress the migration v2
> series through to being merged.  I believe I have CC'd everyone with a
> specific interest in this area, but apologies if I have missed anyone.
>
> Migration v2 is in exclusive use in XenServer 6.5.  We primarily
> developed migration v2 because we needed a 32bit -> 64bit toolstack
> upgrade path.  The code has all the features XenServer previously
> supported, and we consider it fully baked and without any known bugs,
> including transparent legacy-to-v2 conversion on upgrade.
>
> We did endeavour to get migration v2 into Xen 4.5, but regrettably this
> did not happen.  A consequence of this, along with the code being in
> XenServer 6.5, is that the wire format is now set in stone.  Luckily, it
> has been explicitly designed to be easy to extend in a forward
> compatible manner, so this is not a problem moving forward.
>
> The expectation is that the migration v2 code will completely replace
> the existing migration code, which will involve removing
> xc_domain_save.c and xc_domain_restore.c, as well as assorted other
> orphaned code in libxenctrl and libxenguest.
>
> There are 3 areas of concern which have been identified so far.
>
> 1) TMEM support
>
> Migration v2 doesn't currently have any tmem migration support.  The
> maintainers have been asked whether they actually expect legacy tmem
> migration to work, but I have not heard any reply yet.  At the very
> least, migration v2 tmem support would want some new thought put into
> the wire protocol.  I am hoping that, as TMEM is still tech preview and
> still in the process of having XSA-15 fixed, working tmem migration v2
> is not insisted upon as a prerequisite.
>
> 2) Remus/COLO support
>
> Migration v2 doesn't currently have any Remus support.  There was a
> draft series which added Remus support, and showed that it was
> particularly simple to add Remus support to migration v2.  I integrated
> several bugfixes as a side effect of that series, but the actual Remus
> content needed a refresh.  This got delayed behind the Remus libxl
> effort.  It is my hope that the Remus maintainers can refresh that
> series and provide assistance while testing.

Sure, I'm planning to refresh the patches as soon as the Xen 4.6 merge
window opens.  I am also going to start work on the libxl side, because
the libxl part of migration v2 is already done (although not fully
finished?).  We also hope COLO support will go into Xen 4.6.

>
> 3) Libxl and xl support
>
> Libxl and xl have as many problems as the libxc code did when it comes
> to incompatible wire formats and layering violations.  In particular, it
> is not possible to determine the bitness of the sending
> libxl-saverestore-helper, meaning that legacy conversion requires active
> administrator input, or at least a passive assumption that the bitness
> is the same.
>
> There is an xl/libxl part of the migration v2 series which attempts to
> rectify this all in one go, as there is no alternative way of doing so.
> The libxl section of the series is certainly not yet complete, but
> specific queries to the maintainers have thus far gone unanswered.  On
> the other hand, the series does basically WorksForMe, including
> transparent legacy upgrade, suggesting that it is at least in an
> appropriate ballpark.
>
>
> *) Specific non-requirements:
>
> There have been issues identified with dynamic (in a p2m sense) guests
> and migration, which results in failed migration or image corruption.
> While these issues certainly want fixing, they are bugs which exist in
> the legacy code.  As such, they are not prerequisites to fix before v2
> can be accepted.
>
>
> Anyway, it is my hope that this planning email can help get things on
> track to start pursuing active development again as soon as the 4.6 dev
> window opens, with the aim to get all the code merged as early as
> possible in the dev window to allow as much testing as possible.
>
> ~Andrew
>

-- 
Thanks,
Yang.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [Planning for Xen-4.6] Migration v2
  2014-11-26 17:39   ` Andrew Cooper
@ 2014-11-27  8:46     ` Ian Campbell
  0 siblings, 0 replies; 10+ messages in thread
From: Ian Campbell @ 2014-11-27  8:46 UTC
  To: Andrew Cooper
  Cc: Juergen Gross, Wei Liu, Tim Deegan, Ian Jackson, Xen-devel List,
	Ross Lagerwall, David Vrabel, Jan Beulich, Shriram Rajagopalan,
	Hongyang Yang

On Wed, 2014-11-26 at 17:39 +0000, Andrew Cooper wrote:
> > IMHO this is fine. It essentially means that for xl users there is some
> > delayed gratification wrt the promise of migration between non-alike
> > dom0s. The migration from 4.5(legacy)->4.6(v2) won't support such
> > migrations, but the next step from 4.6(v2)->4.7(v2) will.
> 
> Two options exist.
> 
> 1) Assume that the sending bitness is the same as the receiving
> bitness.  This is already the status quo, and will require that the two
> dom0s are the same width.

As I said above I think this is absolutely acceptable as a transitional
step.

> 2) Allow the administrator to specify the bitness of the sending side. 
> In this case, xl 4.5(legacy)->4.6(v2) works even cross-bitness.

If this is trivial to plumb in and you are motivated to do so then this
seems like a reasonable enough "stretch goal".

Ian.
