* Migration between different bitness toolstacks
From: Andrew Cooper @ 2014-01-14 14:57 UTC
To: Xen-devel List, Ian Campbell, Ian Jackson
Hello,
As part of XenServer's attempt to move to a 64bit dom0, we have
encountered a sizeable flaw in xc_domain_{save,restore}().
Migration of a VM from a 32bit toolstack to a 64bit toolstack fails with:
  xc: detail: xc_domain_restore: starting restore of new domid 1
  xc: detail: xc_domain_restore: p2m_size = ffffffff00010000
  xc: error: Couldn't allocate p2m_frame_list array: Internal error
  xc: detail: Restore exit of domid 1 with rc=1
This is caused by
  RDEXACT(io_fd, &dinfo->p2m_size, sizeof(unsigned long))
where sizeof(unsigned long) is different between the source and destination.
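To illustrate the failure mode, here is a minimal sketch that does not use libxc at all. It assumes an x86 (little-endian) stream layout, and it assumes that the word following p2m_size in the stream happens to be 0xffffffff, which is consistent with the log above but is an assumption. The helper names (store_le, load_le, read_as_64bit) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode the low n (<= 8) bytes of v in little-endian order. */
static void store_le(unsigned char *p, uint64_t v, size_t n)
{
    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Decode n (<= 8) little-endian bytes into a 64-bit value. */
static uint64_t load_le(const unsigned char *p, size_t n)
{
    uint64_t v = 0;

    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/*
 * Hypothetical model of the bug: a 32-bit sender writes p2m_size as
 * 4 bytes, followed by the next 4-byte word of the stream. A 64-bit
 * restorer then reads sizeof(unsigned long) == 8 bytes and so swallows
 * the following word as the upper half of p2m_size.
 */
static uint64_t read_as_64bit(uint32_t p2m_size, uint32_t next_word)
{
    unsigned char stream[8];

    store_le(stream, p2m_size, 4);       /* what the sender meant */
    store_le(stream + 4, next_word, 4);  /* unrelated following data */
    return load_le(stream, 8);           /* what the restorer sees */
}
```

Under those assumptions, read_as_64bit(0x00010000, 0xffffffff) yields 0xffffffff00010000, the same value as in the log, and the restorer then tries to allocate an absurdly large p2m_frame_list.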
It is unreasonable for the format of the migration stream to rely on the
bitness of the toolstack, which should be completely transparent as far
as "motion of a VM" is concerned. Furthermore, the same issue occurs
with suspend/resume where the stream gets written to a file in the meantime.
A quick grep across the code shows several other items in the migration
stream which depend on toolstack bitness.
There is no way to divine whether the far side of the migration stream
is 32 or 64 bit, which is now vital information required to read the
stream correctly.
As a result, it is not obvious how best to fix this with backwards
compatibility in mind.
~Andrew
* Re: Migration between different bitness toolstacks
From: Ian Campbell @ 2014-01-14 15:27 UTC
To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel List
On Tue, 2014-01-14 at 14:57 +0000, Andrew Cooper wrote:
> As a result, it is not obvious how best to fix this with backwards
> compatibility in mind.
You clearly aren't devious enough ;-)
It's pretty nasty, and it might turn out to take us more than one
release to resolve, meaning we might have to insert something into the
save stream in one release but not be able to fully use it until the
release after, but I think it can be done in one release...
Given that the very first thing in the migration stream is "unsigned
long p2m_size" we pretty much have to play some horrible trick with it.
For example, what if we say that if the first 4 bytes of the stream are
0xffffffff then the stream is in "64-bit clean" mode -- which means that
all subsequent unsigned longs are actually 64-bit, including p2m_size
which follows immediately after the 0xffffffff magic number. If the
first 4 bytes are not 0xffffffff then this is a normal native word size
stream and those are 4 bytes of the p2m size.
On a 64-bit restorer you would have to read 4 bytes and if it is not
0xffffffff read another 4 and combine them to get the actual p2m size,
if the first 4 are 0xffffffff then you continue as normal with an 8 byte
read to get the p2m size.
On a 32-bit restorer, well, I guess you get the idea.
It's gross, but backwards compat can be like that...
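The marker trick described above could be sketched roughly as follows. This is only an illustration under stated assumptions: little-endian (x86) byte order, a hypothetical helper parse_p2m_size() that is not part of libxc, and a native_word parameter standing in for the restorer's own sizeof(unsigned long). Note that a legacy stream is still interpreted with the restorer's word size, so this only helps for streams written by a sender that emits the marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Marker value proposed in the text: first 4 bytes all ones. */
#define STREAM_64BIT_CLEAN_MAGIC 0xffffffffu

/* Decode n (<= 8) little-endian bytes; x86 byte order is assumed. */
static uint64_t load_le(const unsigned char *p, size_t n)
{
    uint64_t v = 0;

    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/*
 * Hypothetical helper: parse p2m_size from the start of a stream.
 * native_word is the restorer's sizeof(unsigned long) (4 or 8) and is
 * only used for legacy streams without the marker.
 * Returns the number of bytes consumed, or 0 if buf is too short.
 */
static size_t parse_p2m_size(const unsigned char *buf, size_t len,
                             size_t native_word, uint64_t *p2m_size)
{
    if (len < 4)
        return 0;

    if (load_le(buf, 4) == STREAM_64BIT_CLEAN_MAGIC) {
        /* "64-bit clean" stream: an 8-byte p2m_size follows the marker. */
        if (len < 12)
            return 0;
        *p2m_size = load_le(buf + 4, 8);
        return 12;
    }

    /* Legacy stream: the first native_word bytes are the p2m size. */
    if (len < native_word)
        return 0;
    *p2m_size = load_le(buf, native_word);
    return native_word;
}
```

The scheme relies on no real guest having a p2m size whose low 32 bits are 0xffffffff, which seems safe in practice but is a property of the data rather than of the format.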
Slightly more flexible would be to take 4-bytes = 0xffffffff to indicate
that an "extended-info" section follows, containing a non-optional
non-PV specific "stream info" block, which initially would simply
indicate that the stream was either 64-bit clean or would indicate what
sizeof(unsigned long) it uses. Since you would naturally design this
"stream info" block to be extensible it could be used in the future to
dig out of other holes.
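As a rough sketch of what such a "stream info" block might look like on the wire: the layout below is entirely hypothetical (nothing like it is defined in the thread or in libxc). It assumes the 0xffffffff marker, then a 4-byte body length, then a body whose first field is the sender's sizeof(unsigned long), all little-endian. The explicit length is what would make it extensible, since an older restorer can skip fields it does not know about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_INFO_MARKER 0xffffffffu  /* marker value from the text */

/* Hypothetical decoded form of the "stream info" block. */
struct stream_info {
    uint32_t ulong_size;  /* sender's sizeof(unsigned long): 4 or 8 */
};

/* Decode a 4-byte little-endian value. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical parser. Assumed layout:
 *   u32 marker (0xffffffff) | u32 body_len | body (body_len bytes)
 * where the body starts with u32 ulong_size.
 * Returns the offset of the first byte after the block, or 0 if the
 * block is absent or malformed.
 */
static size_t parse_stream_info(const unsigned char *buf, size_t len,
                                struct stream_info *info)
{
    uint32_t body_len;

    assert(info != NULL);
    if (len < 8 || load_le32(buf) != EXT_INFO_MARKER)
        return 0;

    body_len = load_le32(buf + 4);
    if (body_len < 4 || body_len > len - 8)
        return 0;

    info->ulong_size = load_le32(buf + 8);
    if (info->ulong_size != 4 && info->ulong_size != 8)
        return 0;

    /* Skip any trailing fields this restorer does not understand. */
    return 8 + (size_t)body_len;
}
```

A restorer would call this first and, if it returns 0, fall back to treating the stream as a legacy native-word-size stream.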
Ian.
* Re: Migration between different bitness toolstacks
From: Jan Beulich @ 2014-01-14 16:05 UTC
To: Andrew Cooper, Ian Campbell, Ian Jackson; +Cc: Xen-devel List
>>> On 14.01.14 at 15:57, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> As part of XenServer's attempt to move to a 64bit dom0, we have
> encountered a sizeable flaw in xc_domain_{save,restore}().
>
> Migration of a VM from a 32bit toolstack to a 64bit toolstack fails with:
>
> xc: detail: xc_domain_restore: starting restore of new domid 1
> xc: detail: xc_domain_restore: p2m_size = ffffffff00010000
> xc: error: Couldn't allocate p2m_frame_list array: Internal error
> xc: detail: Restore exit of domid 1 with rc=1
>
> This is caused by
>
> RDEXACT(io_fd, &dinfo->p2m_size, sizeof(unsigned long))
>
> where sizeof(unsigned long) is different between the source and destination.
>
>
> It is unreasonable for the format of the migration stream to rely on the
> bitness of the toolstack, which should be completely transparent as far
> as "motion of a VM" is concerned. Furthermore, the same issue occurs
> with suspend/resume where the stream gets written to a file in the meantime.
>
> A quick grep across the code shows several other items in the migration
> stream which depend on toolstack bitness.
>
> There is no way to divine whether the far side of the migration stream
> is 32 or 64 bit, which is now vital information required to read the
> stream correctly.
And I think, even if x86 doesn't care, differing endianness should
be dealt with at the same time.
Jan
* Re: Migration between different bitness toolstacks
From: Ian Campbell @ 2014-01-14 16:18 UTC
To: Jan Beulich; +Cc: Andrew Cooper, Ian Jackson, Xen-devel List
On Tue, 2014-01-14 at 16:05 +0000, Jan Beulich wrote:
> And I think, even if x86 doesn't care, differing endianness should
> be dealt with at the same time.
FWIW I'm not currently expecting ARM to reuse
tools/libxc/xc_domain_{save,restore}.c.
It might be worth putting the effort into making the ARM code be cleaner
and supportable with a sensible protocol so that other future ports can
reuse it. Potentially even x86 could one day switch, although the old
code would have to remain for compat purposes.
Ian.
* Re: Migration between different bitness toolstacks
From: Roger Pau Monné @ 2014-01-14 16:30 UTC
To: Ian Campbell, Jan Beulich; +Cc: Andrew Cooper, Ian Jackson, Xen-devel List
On 14/01/14 17:18, Ian Campbell wrote:
> FWIW I'm not currently expecting ARM to reuse
> tools/libxc/xc_domain_{save,restore}.c.
>
> It might be worth putting the effort into making the ARM code be cleaner
> and supportable with a sensible protocol so that other future ports can
> reuse it. Potentially even x86 could one day switch, although the old
> code would have to remain for compat purposes.
If we only guarantee migration support between n and n+1 (so for example
4.2 to 4.3, but not 4.2 to 4.4), the old code could go away at some point.
http://wiki.xen.org/wiki/Xen_Version_Compatibility
* Re: Migration between different bitness toolstacks
From: Ian Campbell @ 2014-01-14 16:34 UTC
To: Roger Pau Monné
Cc: Andrew Cooper, Ian Jackson, Jan Beulich, Xen-devel List
On Tue, 2014-01-14 at 17:30 +0100, Roger Pau Monné wrote:
> If we only guarantee migration support between n and n+1 (so for example
> 4.2 to 4.3, but not 4.2 to 4.4), the old code could go away at some point.
>
> http://wiki.xen.org/wiki/Xen_Version_Compatibility
We've historically not deliberately broken it, but given a clean
break we could perhaps remove the old code eventually.
Ian.