* live saving of domU
@ 2006-05-10 15:40 Jayesh Salvi
2006-05-10 16:32 ` Ewan Mellor
0 siblings, 1 reply; 13+ messages in thread
From: Jayesh Salvi @ 2006-05-10 15:40 UTC (permalink / raw)
To: xen-devel
Hi,
Could anyone tell me why 'xm save' has the live parameter set to false by
default? From yesterday's patch ([PATCH] [XenD] Migration-related change) I
guess this parameter has been renamed to network.
This is the piece of code I am talking about:
XendDomain.py:
    def domain_save(self, domid, dst):
        """Start saving a domain to file.
        @param dst: destination file
        """
        try:
            dominfo = self.domain_lookup_by_name_or_id_nr(domid)
            if not dominfo:
                raise XendInvalidDomain(str(domid))
            if dominfo.getDomid() == PRIV_DOMAIN:
                raise XendError("Cannot save privileged domain %i" % domid)
            fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
            try:
--->            # For now we don't support 'live checkpoint'
--->            return XendCheckpoint.save(fd, dominfo, False)
            finally:
                os.close(fd)
I am interested in saving the state of a virtual machine to a file, but want
it to continue running. I want to back up the state of the machine, so I want
this to be an unintrusive operation. I would like to pause the domU and save
it to a file while still keeping it in memory. After the save to file is done
I will unpause the domU.
I don't see why this shouldn't be possible if live migration works so well.
Is it possible? Are there any hidden xm options for doing live save, if not
where can I tweak this?
Please let me know.
Thanks,
--
Jayesh
------------------------------------------------------------------------
Everything you can imagine is real
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 15:40 live saving of domU Jayesh Salvi
@ 2006-05-10 16:32 ` Ewan Mellor
2006-05-10 16:59 ` Anthony Liguori
2006-05-10 19:00 ` Jayesh Salvi
0 siblings, 2 replies; 13+ messages in thread
From: Ewan Mellor @ 2006-05-10 16:32 UTC (permalink / raw)
To: Jayesh Salvi; +Cc: xen-devel
On Wed, May 10, 2006 at 10:40:31AM -0500, Jayesh Salvi wrote:
> Hi,
>
> Could anyone tell me, why 'xm save' has live parameter set to false by
> default. From yesterday's patch ([PATCH] [XenD] Migration-related change)
> i guess this paramter is renamed to network.
This parameter has not been renamed -- the rename was for a similar flag
passed in to the device migration code, but the live flag for migration
remains.
> [Snip]
>
> I am interested in saving the state of a virtual machine to a file, but
> want to continue it running. I want to backup the state of the machine, so
> I want this to be unintrusive operation. I would like to pause the domU
> and save it to file but keeping it still in memory. After save to file is
> done I will unpause the domU.
>
> I don't see why this shouldn't be possible if live migration works so
> well.
The reason it's not supported at the moment is this: if you take a snapshot of
a VM, then run for a bit, and then try and run the snapshot against the same
filesystem that you were using before, you will inevitably corrupt the
filesystem.
If you had a way to snapshot the storage at the same time as the VM, then you
could make live snapshotting of VMs work properly. As it is, this is not
integrated into Xend at the moment, and not supported because of the danger of
filesystem corruption if you don't know what you are doing.
Ewan.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 16:32 ` Ewan Mellor
@ 2006-05-10 16:59 ` Anthony Liguori
2006-05-10 19:06 ` Jayesh Salvi
2006-05-11 8:25 ` Jacob Gorm Hansen
2006-05-10 19:00 ` Jayesh Salvi
1 sibling, 2 replies; 13+ messages in thread
From: Anthony Liguori @ 2006-05-10 16:59 UTC (permalink / raw)
To: Ewan Mellor; +Cc: xen-devel, Jayesh Salvi
Ewan Mellor wrote:
> On Wed, May 10, 2006 at 10:40:31AM -0500, Jayesh Salvi wrote:
>
>
>> Hi,
>>
>> Could anyone tell me, why 'xm save' has live parameter set to false by
>> default. From yesterday's patch ([PATCH] [XenD] Migration-related change)
>> i guess this paramter is renamed to network.
>>
>
> This parameter has not been renamed -- the rename was for a similar flag
> passed in to the device migration code, but the live flag for migration
> remains.
>
>
>> [Snip]
>>
>> I am interested in saving the state of a virtual machine to a file, but
>> want to continue it running. I want to backup the state of the machine, so
>> I want this to be unintrusive operation. I would like to pause the domU
>> and save it to file but keeping it still in memory. After save to file is
>> done I will unpause the domU.
>>
>> I don't see why this shouldn't be possible if live migration works so
>> well.
>>
>
> The reason it's not supported at the moment is this: if you take a snapshot of
> a VM, then run for a bit, and then try and run the snapshot against the same
> filesystem that you were using before, you will inevitably corrupt the
> filesystem.
>
Moreover, you cannot dump the state of a domain after a pause and expect
it to ever run again.
Guests are aware of the physical addresses of the memory that's been
allocated to them. Because of this, to save a domain's state in a
restorable way you need the guest to "canonicalize" itself. The only
way to do this today is through a suspend operation which happens to be
a subop of shutdown. Shutdowns are non-recoverable so you cannot use
this as a snapshotting mechanism.
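To make the canonicalization point concrete, here is a toy sketch (not Xen
code). Page-table entries in a paravirtualized guest hold machine frame
numbers (MFNs), so a restorable image needs them rewritten as guest
pseudo-physical frame numbers (PFNs), and rewritten again against whatever
machine frames the restored domain is given. The 12-bit flag layout, the
dictionaries standing in for the translation tables, and the function names
are all assumptions made for illustration.

```python
PAGE_SHIFT = 12                      # assumed: frame number sits above 12 flag bits
FLAG_MASK = (1 << PAGE_SHIFT) - 1


def canonicalize(ptes, m2p):
    """Rewrite each PTE's machine frame number (MFN) as a guest PFN."""
    return [(m2p[pte >> PAGE_SHIFT] << PAGE_SHIFT) | (pte & FLAG_MASK)
            for pte in ptes]


def uncanonicalize(ptes, p2m):
    """Rewrite each PTE's PFN as an MFN of the newly allocated memory."""
    return [(p2m[pte >> PAGE_SHIFT] << PAGE_SHIFT) | (pte & FLAG_MASK)
            for pte in ptes]


# Before the save, the guest's PFNs 0..2 are backed by these machine frames.
old_p2m = {0: 0x5000, 1: 0x7100, 2: 0x9A00}
old_m2p = {mfn: pfn for pfn, mfn in old_p2m.items()}
# After the restore, the same PFNs are backed by different machine frames.
new_p2m = {0: 0x1200, 1: 0x1201, 2: 0x3400}

live_ptes = [(0x7100 << PAGE_SHIFT) | 0x067, (0x9A00 << PAGE_SHIFT) | 0x025]
saved_ptes = canonicalize(live_ptes, old_m2p)        # safe to write to a file
restored_ptes = uncanonicalize(saved_ptes, new_p2m)  # valid on the new frames
```

An image that still contained the old MFNs would point at machine frames the
restored domain no longer owns, which is why a plain pause-and-dump is not
restorable.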
The closest thing you can achieve is a localhost migration. There are
some caveats to this, of course. The first is that you need to have as
much memory as the domain has available since you'll have a copy of the
domain created briefly while the migration takes place. Migrations are
also quite intrusive since they involve tearing down and bringing up all
the devices.
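A localhost migration of this kind could be driven from a small script. The
sketch below only builds the 'xm migrate' command line and prints it unless
told otherwise; the helper names and the dry-run default are hypothetical, and
it assumes Xend's relocation server is enabled and accepts connections from
localhost.

```python
import subprocess


def localhost_migrate_cmd(domain, live=True):
    """Build an 'xm migrate' command that targets the local host."""
    cmd = ["xm", "migrate"]
    if live:
        cmd.append("--live")
    cmd += [str(domain), "localhost"]
    return cmd


def run(cmd, dry_run=True):
    """Print the command, or execute it when dry_run is False."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.call(cmd)


run(localhost_migrate_cmd("vm1"))
```

Remember the caveat above: the host needs enough free memory to hold a second
copy of the domain while the migration is in flight.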
I've gotten a lot of requests for lightweight checkpointing. AFAIK,
no one is actually working on it though.
Regards,
Anthony Liguori
> If you had a way to snapshot the storage at the same time as the VM, then you
> could make live snapshotting of VMs work properly. As it is, this is not
> integrated into Xend at the moment, and not supported because of the danger of
> filesystem corruption if you don't know what you are doing.
>
> Ewan.
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 16:59 ` Anthony Liguori
@ 2006-05-10 19:06 ` Jayesh Salvi
2006-05-10 19:30 ` Anthony Liguori
2006-05-11 8:25 ` Jacob Gorm Hansen
1 sibling, 1 reply; 13+ messages in thread
From: Jayesh Salvi @ 2006-05-10 19:06 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, Ewan Mellor
On 5/10/06, Anthony Liguori <aliguori@us.ibm.com> wrote:
>
> Ewan Mellor wrote:
> > On Wed, May 10, 2006 at 10:40:31AM -0500, Jayesh Salvi wrote:
> >
> >
> >> Hi,
> >>
> >> Could anyone tell me, why 'xm save' has live parameter set to false by
> >> default. From yesterday's patch ([PATCH] [XenD] Migration-related
> change)
> >> i guess this paramter is renamed to network.
> >>
> >
> > This parameter has not been renamed -- the rename was for a similar flag
> > passed in to the device migration code, but the live flag for migration
> > remains.
> >
> >
> >> [Snip]
> >>
> >> I am interested in saving the state of a virtual machine to a file, but
> >> want to continue it running. I want to backup the state of the machine,
> so
> >> I want this to be unintrusive operation. I would like to pause the domU
> >> and save it to file but keeping it still in memory. After save to file
> is
> >> done I will unpause the domU.
> >>
> >> I don't see why this shouldn't be possible if live migration works so
> >> well.
> >>
> >
> > The reason it's not supported at the moment is this: if you take a
> snapshot of
> > a VM, then run for a bit, and then try and run the snapshot against the
> same
> > filesystem that you were using before, you will inevitably corrupt the
> > filesystem.
> >
>
> Moreover, you cannot dump the state of a domain after a pause and expect
> it to ever run again.
>
> Guests are aware of the physical addresses of the memory that's been
> allocated to them. Because of this, to save a domain's state in a
> restorable way you need the guest to "canonicalize" itself. The only
> way to do this today is through a suspend operation which happens to be
> a subop of shutdown. Shutdowns are non-recoverable so you cannot use
> this as a snapshotting mechanism.
Thanks, that was informative. But the shutdown you are referring to here is
not the traditional shutdown of the domU, right? I mean, 'xm shutdown' will do
a proper shutdown of the domU OS; the shutdown you are referring to while
doing 'xm save' is different from that, I hope!
> The closest thing you can achieve is a localhost migration. There are
> some caveats to this, of course. The first is that you need to have as
> much memory as the domain has available since you'll have a copy of the
> domain created briefly while the migration takes place. Migrations are
> also quite intrusive since they involve tearing down and bringing up all
> the devices.
>
> I've gotten a lot of requests for light weight checkpointing. AFAIK,
> noone is actually working on it though.
>
> Regards,
>
> Anthony Liguori
>
> > If you had a way to snapshot the storage at the same time as the VM,
> then you
> > could make live snapshotting of VMs work properly. As it is, this is
> not
> > integrated into Xend at the moment, and not supported because of the
> danger of
> > filesystem corruption if you don't know what you are doing.
> >
> > Ewan.
> >
> >
>
>
--
Jayesh
------------------------------------------------------------------------
Everything you can imagine is real
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 19:06 ` Jayesh Salvi
@ 2006-05-10 19:30 ` Anthony Liguori
2006-05-11 1:29 ` Jayesh Salvi
0 siblings, 1 reply; 13+ messages in thread
From: Anthony Liguori @ 2006-05-10 19:30 UTC (permalink / raw)
To: Jayesh Salvi; +Cc: xen-devel, Ewan Mellor
Jayesh Salvi wrote:
>
>
> On 5/10/06, *Anthony Liguori* <aliguori@us.ibm.com
> <mailto:aliguori@us.ibm.com>> wrote:
>
> Guests are aware of the physical addresses of the memory that's been
> allocated to them. Because of this, to save a domain's state in a
> restorable way you need the guest to "canonicalize" itself. The only
> way to do this today is through a suspend operation which happens
> to be
> a subop of shutdown. Shutdowns are non-recoverable so you cannot use
> this as a snapshotting mechanism.
>
>
> Thanks, that was informative. But the shutdown you are refering to
> here is not the traditional shutdown of domU right? I mean 'xm
> shutdown' will do proper shutdown of domU OS, the shutdown you are
> referring while doing 'xm save' is different from that, I hope!
From the perspective of the hypervisor, there isn't a significant
difference between doing an "xm shutdown" or an "xm save". Don't read
too much into that though b/c the vast majority of both operations are
performed in userspace. Unfortunately, this lack of distinction in the
hypervisor is what currently prevents lightweight checkpointing.
You could, theoretically, add another entry point in the guest kernel
and use some xenstore magic to force the domain to spin in a loop after
canonicalizing, pause it, switch a register value, then unpause it.
Something like (in pseudo-code):
    while (eax == 0) halt;
That goes right after canonicalizing. Then pause, and do a
getvcpuinfo/setvcpuinfo to change eax to 1. Then unpause. If you added
another suspend entry point you could probably even avoid tearing down
the devices...
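The handshake described above can be mimicked in a few lines of ordinary
threaded code. This is a toy simulation only: the ToyGuest class, the use of a
thread as the "guest", and the log list standing in for domain state are all
invented for illustration, and nothing here touches a real domain or
hypercall.

```python
import threading
import time


class ToyGuest:
    def __init__(self):
        self.eax = 0                         # stands in for the guest register
        self.canonical = threading.Event()   # set once the guest has canonicalized
        self.log = []

    def checkpoint_entry(self):
        """The extra entry point: canonicalize, then spin until eax changes."""
        self.log.append("canonicalized")
        self.canonical.set()
        while self.eax == 0:                 # while (eax == 0) halt;
            time.sleep(0.001)
        self.log.append("running again")


def take_checkpoint(guest):
    """The tools' side: wait, copy the state out, then release the guest."""
    guest.canonical.wait()                   # guest is now spinning in a known state
    image = list(guest.log)                  # "pause" and save the state
    guest.eax = 1                            # the register switch described above
    return image                             # "unpause": the loop exits by itself


guest = ToyGuest()
worker = threading.Thread(target=guest.checkpoint_entry)
worker.start()
image = take_checkpoint(guest)
worker.join()
```

The saved image reflects the guest at the spin loop, and the guest carries on
afterwards without ever having gone through a shutdown.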
Regards,
Anthony Liguori
> The closest thing you can achieve is a localhost migration. There are
> some caveats to this, of course. The first is that you need to
> have as
> much memory as the domain has available since you'll have a copy
> of the
> domain created briefly while the migration takes
> place. Migrations are
> also quite intrusive since they involve tearing down and bringing
> up all
> the devices.
>
> I've gotten a lot of requests for light weight checkpointing. AFAIK,
> noone is actually working on it though.
>
> Regards,
>
> Anthony Liguori
>
> > If you had a way to snapshot the storage at the same time as the
> VM, then you
> > could make live snapshotting of VMs work properly. As it is,
> this is not
> > integrated into Xend at the moment, and not supported because of
> the danger of
> > filesystem corruption if you don't know what you are doing.
> >
> > Ewan.
> >
> >
>
>
>
>
> --
> Jayesh
> ------------------------------------------------------------------------
> Everything you can imagine is real
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 19:30 ` Anthony Liguori
@ 2006-05-11 1:29 ` Jayesh Salvi
2006-05-11 1:32 ` Anthony Liguori
0 siblings, 1 reply; 13+ messages in thread
From: Jayesh Salvi @ 2006-05-11 1:29 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, Ewan Mellor
On 5/10/06, Anthony Liguori <aliguori@us.ibm.com> wrote:
>
> Jayesh Salvi wrote:
> >
> >
> > On 5/10/06, *Anthony Liguori* <aliguori@us.ibm.com
> > <mailto:aliguori@us.ibm.com>> wrote:
> >
> > Guests are aware of the physical addresses of the memory that's been
> > allocated to them. Because of this, to save a domain's state in a
> > restorable way you need the guest to "canonicalize" itself. The
> only
> > way to do this today is through a suspend operation which happens
> > to be
> > a subop of shutdown. Shutdowns are non-recoverable so you cannot
> use
> > this as a snapshotting mechanism.
> >
> >
> > Thanks, that was informative. But the shutdown you are refering to
> > here is not the traditional shutdown of domU right? I mean 'xm
> > shutdown' will do proper shutdown of domU OS, the shutdown you are
> > referring while doing 'xm save' is different from that, I hope!
>
> From the perspective of the hypervisor, there isn't a significant
> difference between doing an "xm shutdown" or an "xm save". Don't read
> too much into that though b/c the vast majority of both operations are
> performed in userspace. Unfortunately, this lack of distinction in the
> hypervisor is what currently prevents light weight check pointing.
>
> You could, theoritically, add another entry point in the guest kernel
> and use some xenstore magic to force the domain to spin in a loop after
> canonicalizing, pause it, switch a register value, then unpause it.
> Something like (in psuedo-code):
>
> while (eax == 0) halt;
>
> Right after canonicalizing. Then pause, and do a
> getvcpuinfo/setvcpuinfo to change eax to 1. Then unpause. If you added
> another suspend entry point you could probably even avoid tearing down
> the devices...
While reading more on the topic of backing up virtual machines, I came across
the problems and solutions in the VMware world. You can check out the VMware
Consolidated Backup (VCB) feature introduced in ESX Server. (
http://blog.baeke.info/blog/_archives/2006/3/23/1836968.html).
VCB addresses some different backup problems in virtual environments,
but these points are relevant to our current thread of conversation:
VCB does provide a mechanism to quiesce the virtual machine. The virtual disk
is mounted on something called a junction point, and a Windows-specific
snapshot mechanism is used to take a snapshot of the disk.
Traditional software is then used to back up this snapshot / junction
point.
From this I guess that, sooner or later, we will need such a quiescing
mechanism working in Xen.
On a slightly different note, it is amusing that VMware has to do some
tricky dance of junction points, snapshots, etc. Maybe VMware virtual machines
principally use loopback file systems for virtual disks, and that's why they
have a problem viewing the same file system from dom0 and domU (consistent
or inconsistent). Xen allows devices to be used as the boot disk. Hence the
file system is visible from dom0 and domU in the same way (at least RO from
dom0). I might be wrong. Correct me if you think so.
> Regards,
>
> Anthony Liguori
>
> > The closest thing you can achieve is a localhost migration. There
> are
> > some caveats to this, of course. The first is that you need to
> > have as
> > much memory as the domain has available since you'll have a copy
> > of the
> > domain created briefly while the migration takes
> > place. Migrations are
> > also quite intrusive since they involve tearing down and bringing
> > up all
> > the devices.
> >
> > I've gotten a lot of requests for light weight
> checkpointing. AFAIK,
> > noone is actually working on it though.
> >
> > Regards,
> >
> > Anthony Liguori
> >
> > > If you had a way to snapshot the storage at the same time as the
> > VM, then you
> > > could make live snapshotting of VMs work properly. As it is,
> > this is not
> > > integrated into Xend at the moment, and not supported because of
> > the danger of
> > > filesystem corruption if you don't know what you are doing.
> > >
> > > Ewan.
> > >
> > >
> >
> >
> >
> >
> > --
> > Jayesh
> > ------------------------------------------------------------------------
> > Everything you can imagine is real
>
>
--
Jayesh
------------------------------------------------------------------------
Everything you can imagine is real
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-11 1:29 ` Jayesh Salvi
@ 2006-05-11 1:32 ` Anthony Liguori
0 siblings, 0 replies; 13+ messages in thread
From: Anthony Liguori @ 2006-05-11 1:32 UTC (permalink / raw)
To: Jayesh Salvi; +Cc: xen-devel, Ewan Mellor
Jayesh Salvi wrote:
>
> On a little different note, it is amusing to note that VMWare has to
> do some tricky dance of junction point/snapshot, etc. May be VMWare
> virtual machines principally use loopback file systems for virtual
> disk - that's why they have problem in viewing the same file system
> from dom0 and domU (consistent or inconsistent). Xen allows devices to
> be used as boot disk. Hence the file system is visible from dom0 and
> domU the same way (at least RO from dom0). I might be wrong. Correct
> me if you think so.
Keep in mind, VMware and Xen are very different. VMware emulates a
device. That device has state, so I presume they need to get their
device into a well-defined state. We don't have this problem as we can
just detach and reattach all of the devices.
Of course, VMware has the advantage that they use shadow paging, so
they do not have any of the canonicalization problems that we have.
Regards,
Anthony Liguori
>
> --
> Jayesh
> ------------------------------------------------------------------------
> Everything you can imagine is real
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 16:59 ` Anthony Liguori
2006-05-10 19:06 ` Jayesh Salvi
@ 2006-05-11 8:25 ` Jacob Gorm Hansen
1 sibling, 0 replies; 13+ messages in thread
From: Jacob Gorm Hansen @ 2006-05-11 8:25 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, Ewan Mellor, Jayesh Salvi
On 5/10/06, Anthony Liguori <aliguori@us.ibm.com> wrote:
> Moreover, you cannot dump the state of a domain after a pause and expect
> it to ever run again.
>
> Guests are aware of the physical addresses of the memory that's been
> allocated to them. Because of this, to save a domain's state in a
> restorable way you need the guest to "canonicalize" itself. The only
> way to do this today is through a suspend operation which happens to be
> a subop of shutdown. Shutdowns are non-recoverable so you cannot use
> this as a snapshotting mechanism.
> I've gotten a lot of requests for light weight checkpointing. AFAIK,
> noone is actually working on it though.
Hi,
I've got this running for xenlinux, using self-checkpointing. I solve
the canonicalization problem by having a radix-tree which maps mfns to
offsets in the pfn->mfn table, and upon resume use this to remap all
the page tables.
It works both across the network and to a block device, as long as
it's opened with O_DIRECT.
I am working on bringing the patch up to date with Xen 3.0.2 at the
moment. It would be great to have the driver model better support
resume after suspend of xenbus and devices in the same domU.
Jacob
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 16:32 ` Ewan Mellor
2006-05-10 16:59 ` Anthony Liguori
@ 2006-05-10 19:00 ` Jayesh Salvi
1 sibling, 0 replies; 13+ messages in thread
From: Jayesh Salvi @ 2006-05-10 19:00 UTC (permalink / raw)
To: Ewan Mellor; +Cc: xen-devel
On 5/10/06, Ewan Mellor <ewan@xensource.com> wrote:
>
> On Wed, May 10, 2006 at 10:40:31AM -0500, Jayesh Salvi wrote:
>
> > Hi,
> >
> > Could anyone tell me, why 'xm save' has live parameter set to false by
> > default. From yesterday's patch ([PATCH] [XenD] Migration-related
> change)
> > i guess this paramter is renamed to network.
>
> This parameter has not been renamed -- the rename was for a similar flag
> passed in to the device migration code, but the live flag for migration
> remains.
>
> > [Snip]
> >
> > I am interested in saving the state of a virtual machine to a file, but
> > want to continue it running. I want to backup the state of the machine,
> so
> > I want this to be unintrusive operation. I would like to pause the domU
> > and save it to file but keeping it still in memory. After save to file
> is
> > done I will unpause the domU.
> >
> > I don't see why this shouldn't be possible if live migration works so
> > well.
>
> The reason it's not supported at the moment is this: if you take a
> snapshot of
> a VM, then run for a bit, and then try and run the snapshot against the
> same
> filesystem that you were using before, you will inevitably corrupt the
> filesystem.
>
> If you had a way to snapshot the storage at the same time as the VM, then
> you
> could make live snapshotting of VMs work properly. As it is, this is not
> integrated into Xend at the moment, and not supported because of the
> danger of
> filesystem corruption if you don't know what you are doing.
OK. So as I understand it, we will need a snapshotting mechanism that takes a
snapshot of the storage and of the domU state in one single atomic operation.
I will study the suspend, pause, migrate and shutdown semantics and see what
would be the best way to stop the domU from modifying the disk "for a while",
during which we can take a consistent snapshot of the storage. If nothing else
works, an explicit 'xm save' followed by a behind-the-curtain 'xm restore'
will do the job, I guess, though it will be a somewhat intrusive solution.
If anyone has further ideas along these lines, I am all ears.
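The save-then-restore fallback could look roughly like the sketch below, with
the storage snapshot taken while the domain is down. It only assembles the
command lines and prints them by default. The function names are hypothetical,
and the storage snapshot command is an assumption (nothing in this thread says
which storage layer is in use), so it is passed in by the caller.

```python
import subprocess


def checkpoint_commands(domain, image_path, snapshot_cmd=None):
    """Return the commands for: save domU, snapshot storage, restore domU."""
    cmds = [["xm", "save", str(domain), image_path]]
    if snapshot_cmd is not None:
        # Runs while the domain is not executing, so the disk is not changing.
        cmds.append(list(snapshot_cmd))
    cmds.append(["xm", "restore", image_path])
    return cmds


def run_all(cmds, dry_run=True):
    """Print each command, or execute it and stop on the first failure."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)


# Hypothetical example: the domU disk is an LVM volume named /dev/vg0/vm1-disk.
lvm_snapshot = ["lvcreate", "--snapshot", "--size", "1G",
                "--name", "vm1-snap", "/dev/vg0/vm1-disk"]
run_all(checkpoint_commands("vm1", "/var/tmp/vm1.chk", lvm_snapshot))
```

Note that the image file has to be copied somewhere safe along with the
storage snapshot; restoring the image later against a disk that has moved on
leads to the corruption Ewan described.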
Thanks!
> Ewan.
>
--
Jayesh
------------------------------------------------------------------------
Everything you can imagine is real
^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <E1FdtIP-0000Id-VQ@host-192-168-0-1-bcn-london>]
* Re: live saving of domU
[not found] <E1FdtIP-0000Id-VQ@host-192-168-0-1-bcn-london>
@ 2006-05-10 19:14 ` Andres Lagar Cavilla
2006-05-10 19:40 ` Anthony Liguori
0 siblings, 1 reply; 13+ messages in thread
From: Andres Lagar Cavilla @ 2006-05-10 19:14 UTC (permalink / raw)
To: xen-devel
Anthony Liguori wrote
>Moreover, you cannot dump the state of a domain after a pause and expect
>it to ever run again.
>
>Guests are aware of the physical addresses of the memory that's been
>allocated to them. Because of this, to save a domain's state in a
>restorable way you need the guest to "canonicalize" itself. The only
>way to do this today is through a suspend operation which happens to be
>a subop of shutdown. Shutdowns are non-recoverable so you cannot use
>this as a snapshotting mechanism.
>
>
My understanding is that the guest only canonicalizes the store and
console mfn's and places them on the shared info frame which is passed
to the suspend hypercall. The rest of the canonicalizations are done by
dom0 user-space code (xc_linux_save).
The guest never really shuts down: it issues the suspend hypercall and
waits for it to return. This could happen months later when the domain
is resumed :) The suspend hypercall executing in xen is the one that
pauses all vcpus and kills the domain.
Is it feasible to use a different hypercall that pauses the domain but
doesn't kill it, and once xc_linux_save is done checkpointing have it
issue a dom0_op that unpauses the domain?
For filesystem corruption you're gonna have to hack up your own thing.
Probably a CoW solution, where you begin a new "epoch" when resuming
from the checkpoint.
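One way to picture the copy-on-write "epoch" idea is the toy block store
below. It is a sketch under stated assumptions, not an existing Xen or device
mapper facility: the EpochDisk class and its methods are invented here, and
blocks are plain dictionary entries rather than real disk sectors.

```python
class EpochDisk:
    """Toy copy-on-write store: each epoch holds only the blocks written in it."""

    def __init__(self, base):
        self.epochs = [dict(base)]   # epoch 0 is the base image
        self.parent = [None]         # parent[e] is the epoch that e was forked from
        self.current = 0

    def _fork(self, epoch):
        self.epochs.append({})
        self.parent.append(epoch)
        self.current = len(self.epochs) - 1

    def checkpoint(self):
        """Freeze the current epoch; later writes land in a fresh child epoch."""
        frozen = self.current
        self._fork(frozen)
        return frozen

    def resume_from(self, epoch):
        """Start a new epoch on top of a frozen one, as when restoring a VM."""
        self._fork(epoch)

    def write(self, block, data):
        self.epochs[self.current][block] = data

    def read(self, block):
        epoch = self.current
        while epoch is not None:
            if block in self.epochs[epoch]:
                return self.epochs[epoch][block]
            epoch = self.parent[epoch]
        raise KeyError(block)


disk = EpochDisk({0: "superblock v1"})
snap = disk.checkpoint()             # taken together with the VM checkpoint
disk.write(0, "superblock v2")       # the VM keeps running and modifies the disk
after_run = disk.read(0)
disk.resume_from(snap)               # restoring the VM image starts a new epoch
after_resume = disk.read(0)
```

The restored VM sees the disk exactly as it was at the checkpoint, and the
writes made after the checkpoint stay isolated in their own epoch.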
Andres
>The closest thing you can achieve is a localhost migration. There are
>some caveats to this, of course. The first is that you need to have as
>much memory as the domain has available since you'll have a copy of the
>domain created briefly while the migration takes place. Migrations are
>also quite intrusive since they involve tearing down and bringing up all
>the devices.
>
>I've gotten a lot of requests for light weight checkpointing. AFAIK,
>noone is actually working on it though.
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 19:14 ` Andres Lagar Cavilla
@ 2006-05-10 19:40 ` Anthony Liguori
2006-05-10 20:42 ` Andres Lagar Cavilla
0 siblings, 1 reply; 13+ messages in thread
From: Anthony Liguori @ 2006-05-10 19:40 UTC (permalink / raw)
To: Andres Lagar Cavilla; +Cc: xen-devel
Andres Lagar Cavilla wrote:
> Anthony Liguori wrote
>
>> Moreover, you cannot dump the state of a domain after a pause and
>> expect it to ever run again.
>>
>> Guests are aware of the physical addresses of the memory that's been
>> allocated to them. Because of this, to save a domain's state in a
>> restorable way you need the guest to "canonicalize" itself. The only
>> way to do this today is through a suspend operation which happens to
>> be a subop of shutdown. Shutdowns are non-recoverable so you cannot
>> use this as a snapshotting mechanism.
>>
>>
> My understanding is that the guest only canonicalizes the store and
> console mfn's and places them on the shared info frame which is passed
> to the suspend hypercall. The rest of the canonicalizations are done
> by dom0 user-space code (xc_linux_save).
Sort of. When you pause a domain, it could be doing something like a
PTE update in which case it has a PFN in a register (or on the stack
somewhere). Part of the reason for having a suspend entry point in the
kernel is to ensure that we're in a consistent state.
> The guest never really shuts down: it issues the suspend hypercall and
> waits for it to return. This could happen months later when the domain
> is resumed :) The suspend hypercall executing in xen is the one that
> pauses all vcpus and kills the domain.
Actually, take a look at what HYPERVISOR_suspend is:
static inline int
HYPERVISOR_suspend(
    unsigned long srec)
{
    struct sched_shutdown sched_shutdown = {
        .reason = SHUTDOWN_suspend
    };
    int rc = _hypercall3(int, sched_op, SCHEDOP_shutdown,
                         &sched_shutdown, srec);
    if (rc == -ENOSYS)
        rc = _hypercall3(int, sched_op_compat, SCHEDOP_shutdown,
                         SHUTDOWN_suspend, srec);
    return rc;
}
It's just a shutdown op. It's the same hypercall as reboot/halt and the
hypervisor doesn't do anything differently for these calls. What
happens next is that the hypervisor stops scheduling the domain and sets
the 's' flag in the domain's state. The userspace tools will see that
the domain has now "suspended" by checking the shutdown reason and begin
the teardown process.
This of course implies that the userspace tools have to remember which
domains have issued which requests (so they know who to check these
things for). This is why we have the Xend daemon--to keep track of this
information.
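As a rough illustration of that bookkeeping, the sketch below shows how a
toolstack might combine the shutdown reason with its own record of outstanding
requests. It is not Xend's actual logic: the next_action function, the pending
dictionary, and the returned strings are invented, and the numeric reason
codes follow Xen's public sched.h as I understand it, so treat them as
assumptions too.

```python
# Shutdown reason codes (assumed to match Xen's public sched.h).
SHUTDOWN_poweroff, SHUTDOWN_reboot, SHUTDOWN_suspend, SHUTDOWN_crash = 0, 1, 2, 3


def next_action(domid, reason, pending):
    """Decide what a hypothetical toolstack does once a domain has shut down.

    pending maps domid -> the request the tools issued earlier, e.g. "save".
    """
    if reason == SHUTDOWN_suspend:
        request = pending.get(domid)
        if request in ("save", "migrate"):
            return "write out image for %s, then tear down" % request
        return "unexpected suspend"
    if reason == SHUTDOWN_reboot:
        return "tear down, then recreate"
    if reason == SHUTDOWN_poweroff:
        return "tear down"
    if reason == SHUTDOWN_crash:
        return "tear down (crashed)"
    raise ValueError("unknown shutdown reason %r" % (reason,))


pending = {7: "save"}                # the tools asked domain 7 to suspend for a save
print(next_action(7, SHUTDOWN_suspend, pending))
```

The point is that the hypervisor only reports "shut down, reason suspend";
whether that means a save, a migration, or something nobody asked for is
knowledge that lives in the tools.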
> Is it feasible to use a different hypercall that pauses the domain but
> doesn't kill it, and once xc_linux_save is done checkpointing have it
> issue a dom0_op that unpauses the domain?
A domain is "killed" with a dom0_op of domain_destroy which is invoked
by Xend. The problem with checkpointing is that once the 's' bit has
been set on a domain, there's no way to unset that bit.
> For filesystem corruption you're gonna have to hack up your own thing.
> Probably a CoW solution, where you begin a new "epoch" when resuming
> from the checkpoint.
Especially with something like dm-userspace on the horizon, this would
(hopefully) soon be a moot issue. It just leaves us with this pesky
problem of suspend.
Regards,
Anthony Liguori
> Andres
>
>> The closest thing you can achieve is a localhost migration. There
>> are some caveats to this, of course. The first is that you need to
>> have as much memory as the domain has available since you'll have a
>> copy of the domain created briefly while the migration takes place.
>> Migrations are also quite intrusive since they involve tearing down
>> and bringing up all the devices.
>>
>> I've gotten a lot of requests for light weight checkpointing. AFAIK,
>> noone is actually working on it though.
>>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: live saving of domU
2006-05-10 19:40 ` Anthony Liguori
@ 2006-05-10 20:42 ` Andres Lagar Cavilla
2006-05-10 21:05 ` Anthony Liguori
0 siblings, 1 reply; 13+ messages in thread
From: Andres Lagar Cavilla @ 2006-05-10 20:42 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel
>> My understanding is that the guest only canonicalizes the store and
>> console mfn's and places them on the shared info frame which is
>> passed to the suspend hypercall. The rest of the canonicalizations
>> are done by dom0 user-space code (xc_linux_save).
>
>
> Sort of. When you pause a domain, it could be doing something like a
> PTE update in which case it has a PFN in a register (or on the stack
> somewhere). Part of the reason for having a suspend entry point in
> the kernel is to ensure that we're in a consistent state.
Does the guest kernel do anything beyond what's in __do_suspend in reboot.c?
>> The guest never really shuts down: it issues the suspend hypercall
>> and waits for it to return. This could happen months later when the
>> domain is resumed :) The suspend hypercall executing in xen is the
>> one that pauses all vcpus and kills the domain.
>
> Actually, take a look at what HYPERVISOR_suspend is:
>
> It's just a shutdown op.
But it doesn't have to be. The hypercall could only pause the domain,
and let the user-space tools unpause (no 's' bit -> no domain/devices
teardown) when checkpointing is over. The guest kernel can't tell the
difference: it returns from the hypercall and life goes on, as long as
the devices are still there. That's what I was referring to with:
>> Is it feasible to use a different hypercall that pauses the domain
>> but doesn't kill it, and once xc_linux_save is done checkpointing
>> have it issue a dom0_op that unpauses the domain?
>
> A domain is "killed" with a dom0_op of domain_destroy which is invoked
> by Xend. The problem with checkpointing is that once the 's' bit has
> been set on a domain, there's no way to unset that bit.
As I said a few lines up, let's not set the 's' bit for lightweight
checkpoints. This is likely to cause a lot of special casing for
xend/xenstore, right?
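
To make the proposal concrete, here is a minimal sketch of the pause / save / unpause flow as a toy model. Everything in it (ToyDomain, lightweight_checkpoint, the shutdown flag standing in for the 's' bit) is a hypothetical name for illustration only; none of it is a real Xen, libxc or xend interface, and copying a list stands in for whatever xc_linux_save would write.

```python
# Toy model only: none of these names are Xen or xend APIs.
class ToyDomain:
    def __init__(self, domid, memory):
        self.domid = domid
        self.memory = list(memory)   # stand-in for guest pages
        self.paused = False
        self.shutdown = False        # models the 's' bit: set once, never cleared

    def pause(self):
        self.paused = True

    def unpause(self):
        if self.shutdown:
            raise RuntimeError("domain is shut down; it cannot be unpaused")
        self.paused = False

    def suspend_via_shutdown(self):
        # Models the current path: suspend is a shutdown op, so 's' gets set.
        self.paused = True
        self.shutdown = True


def lightweight_checkpoint(dom):
    """Pause, copy the state, unpause. The 's' flag is never touched."""
    dom.pause()
    try:
        image = list(dom.memory)     # stand-in for the saved image
    finally:
        dom.unpause()
    return image


dom = ToyDomain(domid=1, memory=[10, 20, 30])
image = lightweight_checkpoint(dom)
dom.memory[0] = 99                   # the guest keeps running afterwards
```

The point of the sketch is only the asymmetry: a plain pause is reversible, whereas a domain that went through suspend_via_shutdown can no longer be unpaused, which is why the checkpoint path has to avoid setting that flag.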
Andres
* Re: live saving of domU
2006-05-10 20:42 ` Andres Lagar Cavilla
@ 2006-05-10 21:05 ` Anthony Liguori
0 siblings, 0 replies; 13+ messages in thread
From: Anthony Liguori @ 2006-05-10 21:05 UTC (permalink / raw)
To: Andres Lagar Cavilla; +Cc: xen-devel
Andres Lagar Cavilla wrote:
>>> My understanding is that the guest only canonicalizes the store and
>>> console mfn's and places them on the shared info frame which is
>>> passed to the suspend hypercall. The rest of the canonicalizations
>>> are done by dom0 user-space code (xc_linux_save).
>>
>>
>> Sort of. When you pause a domain, it could be doing something like a
>> PTE update in which case it has a PFN in a register (or on the stack
>> somewhere). Part of the reason for having a suspend entry point in
>> the kernel is to ensure that we're in a consistent state.
>
> Does the guest kernel do anything beyond what's in __do_suspend in
> reboot.c?
Nothing that isn't reachable from that function.
>>> The guest never really shuts down: it issues the suspend hypercall
>>> and waits for it to return. This could happen months later when the
>>> domain is resumed :) The suspend hypercall executing in xen is the
>>> one that pauses all vcpus and kills the domain.
>>
>> Actually, take a look at what HYPERVISOR_suspend is:
>>
>> It's just a shutdown op.
>
> But it doesn't have to be. The hypercall could only pause the domain,
> and let the user-space tools unpause (no 's' bit -> no domain/devices
> teardown) when checkpointing is over. The guest kernel can't tell the
> difference: it returns from the hypercall and life goes on, as long as
> the devices are still there. That's what I was referring to with:
It could, but you have a number of other problems you have to solve.
How do you signal to userspace that the domain is suspended? You could
introduce another VIRQ perhaps, or extend the state. The __do_suspend
path also assumes that the devices are being cycled, so you would need
Xend to participate in this process. How devices interact would need
some careful thinking.
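
As a rough illustration of the signalling question, here is a toy handshake in which the guest announces that it is quiesced and the tools only save after seeing that announcement. The names (ToyGuest, checkpoint, the two Event objects) are hypothetical; a threading.Event merely stands in for whatever mechanism (a new VIRQ, an extended state) would really be used, and device handling is not modelled at all.

```python
import threading


class ToyGuest:
    """Toy stand-in for a guest running its suspend path."""

    def __init__(self):
        self.suspended = threading.Event()  # hypothetical "I am quiesced" signal
        self.resume = threading.Event()     # hypothetical "carry on" signal
        self.log = []

    def run_suspend_path(self):
        self.log.append("guest: quiesced")
        self.suspended.set()                # tell the tools we are consistent
        self.resume.wait()                  # block until the tools are done
        self.log.append("guest: resumed")


def checkpoint(guest, timeout=5.0):
    """Wait for the guest's signal, save, then let the guest continue."""
    worker = threading.Thread(target=guest.run_suspend_path)
    worker.start()
    if not guest.suspended.wait(timeout):
        guest.resume.set()                  # do not leave the guest blocked
        worker.join()
        raise TimeoutError("guest never signalled that it was suspended")
    guest.log.append("tools: saved")        # stand-in for writing the image
    guest.resume.set()
    worker.join()
    return guest.log


events = checkpoint(ToyGuest())
```

The ordering is the whole point: the save step can only run between the guest's signal and its resume, which is the guarantee a real notification channel would have to provide.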
>>> Is it feasible to use a different hypercall that pauses the domain
>>> but doesn't kill it, and once xc_linux_save is done checkpointing
>>> have it issue a dom0_op that unpauses the domain?
>>
>> A domain is "killed" with a dom0_op of domain_destroy which is
>> invoked by Xend. The problem with checkpointing is that once the 's'
>> bit has been set on a domain, there's no way to unset that bit.
>
> As I said a few lines up, let's not set the 's' bit for lightweight
> checkpoints. This is likely to cause a lot of special casing for
> xend/xenstore, right?
Yeah, there are a lot of bits of userspace code that would be affected.
I hope this isn't discouraging; I certainly think it's worth the effort.
Regards,
Anthony Liguori
> Andres
Thread overview: 13+ messages (newest: 2006-05-11 8:25 UTC)
2006-05-10 15:40 live saving of domU Jayesh Salvi
2006-05-10 16:32 ` Ewan Mellor
2006-05-10 16:59 ` Anthony Liguori
2006-05-10 19:06 ` Jayesh Salvi
2006-05-10 19:30 ` Anthony Liguori
2006-05-11 1:29 ` Jayesh Salvi
2006-05-11 1:32 ` Anthony Liguori
2006-05-11 8:25 ` Jacob Gorm Hansen
2006-05-10 19:00 ` Jayesh Salvi
[not found] <E1FdtIP-0000Id-VQ@host-192-168-0-1-bcn-london>
2006-05-10 19:14 ` Andres Lagar Cavilla
2006-05-10 19:40 ` Anthony Liguori
2006-05-10 20:42 ` Andres Lagar Cavilla
2006-05-10 21:05 ` Anthony Liguori