* [RFC V6] Libxl Domain Snapshot API Design
From: Chunyan Liu @ 2014-08-27 7:22 UTC
To: xen-devel; +Cc: jfehlig, ian.jackson, Ian.Campbell, Chunyan Liu
Since Bamvor left SUSE and now works on ARM servers, I'd like to continue
this work and make progress. Following the discussion of V5, which mainly
focused on the API design, here is the updated API design. Any further
suggestions are welcome!

Main changes since V5:
  * libxl_disk_snapshot: reuse libxl_device_disk rather than specifying path
    and format separately in the structure. It now includes two
    libxl_device_disk members: one describes the original disk, the other
    describes the external snapshot if the snapshot is external.
  * Define common APIs for creating, deleting and reverting domain
    snapshots, rather than a group of functions for disk snapshot operations.
  * Remove the APIs for loading/storing/deleting snapshot config.

V5 is here:
http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg00893.html
V5 about API Design is here:
http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg00897.html
===========================================================================
Libxl Domain Snapshot API
libxl_domain_snapshot = Struct("domain_snapshot",[
    ("name",          string),  /* snapshot name */
    ("description",   string),  /* snapshot description */
    ("creation_time", uint64),  /* creation time, in seconds */

    /* save memory or not. "false" means disk-only snapshot */
    ("memory",        bool),

    /* memory state file when snapshot is external */
    ("memory_file",   string),

    /* Array to store disk snapshot info. */
    ("disks", Array(libxl_disk_snapshot, "num_disks")),
    ])

libxl_disk_snapshot = Struct("disk_snapshot",[
    ("disk",     libxl_device_disk),  /* original disk */
    ("name",     string),             /* snapshot name */
    ("external", bool),               /* external snapshot or not */

    /* external snapshot info, including file path and format, etc.
     * if "external" is false, this will be "NULL".
     */
    ("external_sn", libxl_device_disk),
    ])

enum libxlDomainSnapshotCreateFlags {
    /* disk snapshot, not system checkpoint */
    LIBXL_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY = 1,

    /* create the snapshot while the guest is running */
    LIBXL_DOMAIN_SNAPSHOT_CREATE_LIVE = 2,
};

enum libxlDomainSnapshotDeleteFlags {
    LIBXL_DOMAIN_SNAPSHOT_DELETE_CHILDREN = 1, /* delete children too */
};

enum libxlDomainSnapshotRevertFlags {
    LIBXL_DOMAIN_SNAPSHOT_REVERT_RUNNING = 1, /* run after revert */
    LIBXL_DOMAIN_SNAPSHOT_REVERT_PAUSED  = 2, /* pause after revert */
    LIBXL_DOMAIN_SNAPSHOT_REVERT_FORCE   = 4, /* force revert */
};

int libxl_domain_snapshot_create(libxl_ctx *ctx, const char *domname,
                                 libxl_domain_snapshot *snapshot,
                                 unsigned int flags);

    Creates a new snapshot of a domain based on the snapshot config contained
    in @snapshot.

    If @flags includes LIBXL_DOMAIN_SNAPSHOT_CREATE_LIVE, then the domain is
    not paused while the snapshot is created, as in live migration. This
    increases the size of the memory dump file but reduces guest downtime.
    This flag is only supported for external checkpoints.

    If @flags includes LIBXL_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY, then the
    snapshot is limited to the disks described in @snapshot, and no VM state
    is saved. This is not supported for an active guest.

    ctx: context
    domname: domain name
    snapshot: configuration of domain snapshot
    flags: bitwise-OR of libxlDomainSnapshotCreateFlags
    Returns: 0 on success, -1 on failure

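
As an illustration only, the flag rules described above could be checked roughly as follows. The helper name snapshot_create_check_flags and its domain_active and memory_file parameters are hypothetical and not part of the proposal. Rejecting LIVE together with DISK_ONLY is one reading of "only supported for external checkpoints", not something the proposal states.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values as listed in the proposal. */
enum {
    LIBXL_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY = 1,
    LIBXL_DOMAIN_SNAPSHOT_CREATE_LIVE      = 2
};

/*
 * Hypothetical helper: returns 0 if the flag combination is acceptable
 * under the rules in the text, -1 otherwise.
 */
int snapshot_create_check_flags(unsigned int flags, bool domain_active,
                                const char *memory_file)
{
    const unsigned int known = LIBXL_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY |
                               LIBXL_DOMAIN_SNAPSHOT_CREATE_LIVE;
    bool disk_only = (flags & LIBXL_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) != 0;
    bool live      = (flags & LIBXL_DOMAIN_SNAPSHOT_CREATE_LIVE) != 0;

    /* Reject bits that the proposal does not define. */
    if (flags & ~known)
        return -1;

    /* DISK_ONLY is not supported for an active guest. */
    if (disk_only && domain_active)
        return -1;

    /*
     * LIVE is only supported for external checkpoints. Assumption: that
     * means memory is saved (so not DISK_ONLY) to a separate memory_file.
     */
    if (live && (disk_only || memory_file == NULL))
        return -1;

    return 0;
}
```

A caller would run such a check before doing any work, so that an unsupported combination fails early with -1.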
int libxl_domain_snapshot_delete(libxl_ctx *ctx, const char *domname,
                                 const char *snapshot_name,
                                 unsigned int flags);

    Deletes a snapshot.

    If @flags is 0, then only this snapshot is deleted, and changes from this
    snapshot are automatically merged into its child snapshots.

    If @flags includes LIBXL_DOMAIN_SNAPSHOT_DELETE_CHILDREN, then this
    snapshot and any descendant snapshots are deleted.

    ctx: context
    domname: domain name
    snapshot_name: snapshot name
    flags: bitwise-OR of supported libxlDomainSnapshotDeleteFlags
    Returns: 0 on success, -1 on error.

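
To make the two delete modes concrete, here is a small sketch on an in-memory snapshot tree. The struct snap_node record and the function names are hypothetical, since the proposal does not define how the parent/child relationship is stored. Merging of disk data is reduced to re-parenting the children.

```c
#include <stdbool.h>
#include <stddef.h>

#define LIBXL_DOMAIN_SNAPSHOT_DELETE_CHILDREN 1u

/* Hypothetical bookkeeping record; assumes the parent links form a tree. */
struct snap_node {
    int  parent;   /* index of the parent snapshot, -1 for a root */
    bool deleted;
};

/* Returns true if node i is 'target' itself or one of its descendants. */
bool snap_is_descendant_or_self(const struct snap_node *nodes, int i, int target)
{
    while (i != -1) {
        if (i == target)
            return true;
        i = nodes[i].parent;
    }
    return false;
}

/* Returns the number of snapshots removed, or -1 on bad arguments. */
int snap_tree_delete(struct snap_node *nodes, size_t count, int target,
                     unsigned int flags)
{
    int removed = 0;
    size_t i;

    if (target < 0 || (size_t)target >= count || nodes[target].deleted)
        return -1;
    if (flags & ~LIBXL_DOMAIN_SNAPSHOT_DELETE_CHILDREN)
        return -1;

    if (flags & LIBXL_DOMAIN_SNAPSHOT_DELETE_CHILDREN) {
        /* Remove the snapshot and every descendant. */
        for (i = 0; i < count; i++) {
            if (!nodes[i].deleted &&
                snap_is_descendant_or_self(nodes, (int)i, target)) {
                nodes[i].deleted = true;
                removed++;
            }
        }
    } else {
        /* Remove only this snapshot; children move up to its parent,
         * which is where a real implementation would merge the data. */
        for (i = 0; i < count; i++) {
            if (!nodes[i].deleted && nodes[i].parent == target)
                nodes[i].parent = nodes[target].parent;
        }
        nodes[target].deleted = true;
        removed = 1;
    }
    return removed;
}
```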
int libxl_domain_snapshot_revert(libxl_ctx *ctx, const char *domname,
                                 const char *snapshot_name,
                                 unsigned int flags);

    Reverts the domain to a given snapshot.

    Normally, the domain reverts to the state it was in when the snapshot
    was taken (whether inactive, running, or paused).

    If @flags includes LIBXL_DOMAIN_SNAPSHOT_REVERT_RUNNING, then the snapshot
    state is overridden to guarantee a running domain after the revert.

    If @flags includes LIBXL_DOMAIN_SNAPSHOT_REVERT_PAUSED, then the snapshot
    state is overridden to guarantee a paused domain after the revert.

    ctx: context
    domname: domain name
    snapshot_name: snapshot name
    flags: bitwise-OR of supported libxlDomainSnapshotRevertFlags
    Returns: 0 on success, -1 on error.
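
The state selection described above might look like the following sketch. The enum snapshot_domain_state and the helper name are hypothetical. Treating RUNNING and PAUSED together as an error is an assumption, since the proposal does not say what should happen, and the FORCE flag is left out because its semantics are not described.

```c
#include <stdbool.h>

/* Flag values as listed in the proposal. */
enum {
    LIBXL_DOMAIN_SNAPSHOT_REVERT_RUNNING = 1,
    LIBXL_DOMAIN_SNAPSHOT_REVERT_PAUSED  = 2,
    LIBXL_DOMAIN_SNAPSHOT_REVERT_FORCE   = 4
};

/* Hypothetical representation of the state recorded with a snapshot. */
enum snapshot_domain_state {
    SNAPSHOT_STATE_INACTIVE,
    SNAPSHOT_STATE_RUNNING,
    SNAPSHOT_STATE_PAUSED
};

/*
 * Picks the state the domain should be in after the revert.
 * Returns 0 and stores the state in *result, or -1 if RUNNING and
 * PAUSED are both requested (assumed to be invalid).
 */
int snapshot_revert_final_state(enum snapshot_domain_state saved,
                                unsigned int flags,
                                enum snapshot_domain_state *result)
{
    bool want_running = (flags & LIBXL_DOMAIN_SNAPSHOT_REVERT_RUNNING) != 0;
    bool want_paused  = (flags & LIBXL_DOMAIN_SNAPSHOT_REVERT_PAUSED) != 0;

    if (want_running && want_paused)
        return -1;

    if (want_running)
        *result = SNAPSHOT_STATE_RUNNING;
    else if (want_paused)
        *result = SNAPSHOT_STATE_PAUSED;
    else
        *result = saved;   /* default: same state as when the snapshot was taken */

    return 0;
}
```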
* Re: [RFC V6] Libxl Domain Snapshot API Design
From: Chun Yan Liu @ 2014-09-05 3:18 UTC
To: xen-devel, Chun Yan Liu; +Cc: Jim Fehlig, ian.jackson, Ian.Campbell
Hi, Ian J. & Ian C.,
Do you have any comments on the new structures and new functions?
In particular the libxl_disk_snapshot structure, which was one of the most
disputed parts of the last version. If there are no objections, I'll start
writing code.
Thanks,
Chunyan
* Re: [RFC V6] Libxl Domain Snapshot API Design
From: Ian Campbell @ 2014-09-05 12:44 UTC
To: Chunyan Liu; +Cc: jfehlig, ian.jackson, xen-devel
On Wed, 2014-08-27 at 15:22 +0800, Chunyan Liu wrote:
> Since Bamvor left SUSE and turns to work on ARM server, I'd like to continue
> this work and make progress. Following the discussion about V5, which mainly
> focused on the API design, here post the updated API design. Thanks for any
> of your further suggestions!
>
> Main changes to V5:
> * libxl_disk_snapshot: reuse libxl_device_disk rather than specify path,
> format separately in the structure. Including two libxl_device_disk
> components, one is to indicate the original disk info, one is to
> indicate the external snapshot info if it is 'external snapshot'.
> * define common APIs for domain snapshot creating/deleting/reverting,
> rather than a group of functions for disk snapshot operations.
> * remove those APIs for loading/storing/deleting snapshot config.
Please could you say a few words on what a snapshot actually is (i.e.
what are its component parts), I think I know but it would be good to
make sure we are all on the same page.
This might include a description of what "internal" vs "external"
snapshots are (see below).
>
> V5 is here:
> http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg00893.html
> V5 about API Design is here:
> http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg00897.html
>
> ===========================================================================
> Libxl Domain Snapshot API
>
> libxl_domain_snapshot = Struct("domain_snapshot",[
Please can you say which of these needs to be filled in by the caller of
libxl_domain_snapshot_create and which are filled in by that function
(i.e. which fields are inputs and which are outputs).
My guess is that most of these are inputs.
> ("name", string), /* snapshot name */
> ("description", string), /* snapshot description */
> ("creation_time", uint64), /* creation time, in seconds */
Are these necessary at the libxl level? They seem like the sort of thing
the toolstack ought to be keeping as part of its overall snapshot
tracking.
> /* save memory or not. "false" means disk-only snapshot */
> ("memory", bool),
>
> /* memory state file when snapshot is external */
Under what circumstances is the snapshot external? What is it external
to?
> ("memory_file", string),
>
> /* Array to store disk snapshot info. */
> ("disks", Array(libxl_disk_snapshot, "num_disks")),
> ])
>
> libxl_disk_snapshot = Struct("disk_snapshot",[
> ("disk", libxl_device_disk), /* orignal disk */
"original"
> ("name", string), /* snapshot name */
> ("external", bool), /* external snapshot or not */
>
> /* external snapshot info, including file path and format, etc.
> * if "external" is false, this will be "NULL".
> */
> ("external_sn", libxl_device_disk),
What does the "sn" suffix stand for?
> int libxl_domain_snapshot_create(libxl_ctx *ctx, const char *domname,
Should take a domid not a name, for consistency with all the other libxl
functions.
> libxl_domain_snapshot *snapshot,
> unsigned int flags);
>
> Creates a new snapshot of a domain based on the snapshot config contained
> in @snapshot.
>
> If @flags includes LIBXL_DOMAIN_SNAPSHOT_CREATE_LIVE, then the domain is not
> paused while creating the snapshot, like live migration. This increases size
> of the memory dump file, but reducess downtime of the guest. Only support
> this flag during external checkpoints.
>
> If @flags includes LIBXL_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY, then the snapshot
> will be limited to the disks described in @snapshot, and no VM state will
> be saved. For an active guest, this is not supported.
Isn't this redundant with the ->memory field of the snapshot object?
> ctx: context
> domname: domain name
> snapshot: configuration of domain snapshot
> flags: bitwise-OR of libxlDomainSnapshotCreateFlags
> Returns: 0 on success, -1 on failure
>
>
> int libxl_domain_snapshot_delete(libxl_ctx *ctx, const char *domname,
Hrm, this suggests that libxl will have some mechanism for managing
snapshots, is that right?
I don't think libxl should have that functionality since that is the
toolstack's responsibility to manage the snapshots once they are
created, using whatever means it likes.
libxl should be providing the mechanisms ("take a snapshot and put it
here") but not the policies ("snapshots live in this directory, have
this lifecycle and this format").
Similar to how libxl provides a way to say "take these bits and present
them as a disk to the guest", but it leaves the management of image
files to the toolstack (or in the case of xl the actual user).
> const char *snapshot_name,
If this were a libxl_domain_snapshot object I could just about imagine
that this would be useful helper which just iterated over the files
referenced by the snapshot and removed them. I'm not sure how useful
that helper would be in practice though (depends on the toolstack's
actual requirements).
> unsigned int flags);
>
> Delete a snapshot.
>
> If @flags is 0, then just this snapshot is deleted, and changes from this
> snapshot are automatically merged into children snapshots.
>
> If @flags includes LIBXL_DOMAIN_SNAPSHOT_DELETE_CHILDREN, then this snapshot
> and any descendant snapshots are deleted.
This definitely sounds like toolstack level functionality to me.
> ctx: context
> domname: domain name
> snapshot_name: snapshot name
> flags: bitwise-OR of supported libxlDomainSnapshotDeleteFlags
> Returns: 0 on success, -1 on error.
>
> int libxl_disk_snapshot_revert(libxl_ctx *ctx, const char *domname,
Should take a domid.
> const char *snapshot_name,
The input here should be a libxl_domain_snapshot object I think.
(Otherwise libxl would have to track/manage snapshot names)
> unsigned int flags);
>
> Revert the domain to a given snapshot.
>
> Normally, the domain will revert to the same state the domain was in while
> the snapshot was taken (whether inactive, running, or paused).
What is the distinction between inactive and paused?
>
> If @flags includes LIBXL_DOMAIN_SNAPSHOT_REVERT_RUNNING, then overrides the
> snapshot state to guarantee a running domain after the revert.
>
> If @flags includes LIBXL_DOMAIN_SNAPSHOT_REVERT_PAUSED, then guarantees a
> paused domain after the revert.
>
> ctx: context
> domname: domain name
> snapshot_name: snapshot name
> flags: bitwise-OR of supported libxlDomainSnapshotRevertFlags
> Returns: 0 on success, -1 on error.
* Re: [RFC V6] Libxl Domain Snapshot API Design
From: Chun Yan Liu @ 2014-09-09 8:43 UTC
To: Ian Campbell; +Cc: Jim Fehlig, ian.jackson, xen-devel
>>> On 9/5/2014 at 08:44 PM, in message
<1409921073.10156.82.camel@kazak.uk.xensource.com>, Ian Campbell
<Ian.Campbell@citrix.com> wrote:
> On Wed, 2014-08-27 at 15:22 +0800, Chunyan Liu wrote:
> > Since Bamvor left SUSE and turns to work on ARM server, I'd like to
> continue
> > this work and make progress. Following the discussion about V5, which
> mainly
> > focused on the API design, here post the updated API design. Thanks for any
> > of your further suggestions!
> >
> > Main changes to V5:
> > * libxl_disk_snapshot: reuse libxl_device_disk rather than specify path,
> > format separately in the structure. Including two libxl_device_disk
> > components, one is to indicate the original disk info, one is to
> > indicate the external snapshot info if it is 'external snapshot'.
> > * define common APIs for domain snapshot creating/deleting/reverting,
> > rather than a group of functions for disk snapshot operations.
> > * remove those APIs for loading/storing/deleting snapshot config.
>
> Please could you say a few words on what a snapshot actually is (i.e.
> what are its component parts), I think I know but it would be good to
> make sure we are all on the same page.
>
> This might include a description of what "internal" vs "external"
> snapshots are (see below).
Thanks, I'll include a description in the next version.
>
> >
> > V5 is here:
> > http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg00893.html
> > V5 about API Design is here:
> > http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg00897.html
> >
> > ===========================================================================
> > Libxl Domain Snapshot API
> >
> > libxl_domain_snapshot = Struct("domain_snapshot",[
>
> Please can you say which of these needs to be filled in by the caller of
> libxl_domain_snapshot_create and which are filled in by that function
> (i.e. which fields are inputs and which are outputs).
Thanks, I'll update.
>
> My guess is that most of these are inputs.
Yes. This relates to libxl_domain_snapshot_create much as libxl_domain_config
relates to libxl_domain_create. Internally there may be another structure
(perhaps libxl_domain_snapshot_obj) to store domain snapshot info, including
this information plus parent and children info.
>
> > ("name", string), /* snapshot name */
Input. non-NULL. Should be prepared by application.
> > ("description", string), /* snapshot description */
Input. Could be NULL.
> > ("creation_time", uint64), /* creation time, in seconds */
Input. non-NULL.
>
> Are these necessary at the libxl level? They seem like the sort of thing
> the toolstack ought to be keeping as part of its overall snapshot
> tracking.
This information is needed when displaying snapshot info. It could work like
domain listing: xl can list all domains whether they were created by xl or
virsh, and xl snapshot-list could likewise list domain snapshots created by
either xl or virsh. To show complete snapshot information, I think it's
better to include these fields at the libxl level.
>
> > /* save memory or not. "false" means disk-only snapshot */
> > ("memory", bool);
Input. Should be filled.
> >
> > /* memory state file when snapshot is external */
>
> Under what circumstances is the snapshot external? What is it external
> to?
Internal versus external snapshots are closely tied to disk snapshots:
An internal snapshot means the disk snapshot info is stored within the disk
image itself. This is possible for a qcow2 disk.
An external snapshot means the disk snapshot info is stored in another file,
separate from the original disk backend. After the snapshot, the arrangement
is like a backing file. Some disk backend types, such as the 'raw' format,
only support external snapshots.
For the memory state file, the design is:
if the memory state file is NULL, the memory state is piggy-backed on the
internal disk snapshot state;
if the memory state file is not NULL, the memory state is 'external', and a
separate file holds the VM memory state.
I'm not sure the former case is suitable for Xen. Probably only the
'external' case is doable, as with 'xl save', which always stores the memory
state in a separate file.
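
A minimal sketch of the memory_file convention just described, for illustration only. The struct snapshot_memory_cfg is a hypothetical two-field subset of libxl_domain_snapshot, and the enum and function names are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of libxl_domain_snapshot: only the memory fields. */
struct snapshot_memory_cfg {
    bool        memory;        /* false means disk-only snapshot */
    const char *memory_file;   /* NULL means no separate memory file */
};

enum memory_state_kind {
    MEMORY_STATE_NONE,       /* disk-only, no memory state saved */
    MEMORY_STATE_INTERNAL,   /* piggy-backed on an internal disk snapshot */
    MEMORY_STATE_EXTERNAL    /* saved to its own file, as 'xl save' does */
};

/* Classifies where the memory state would end up under this design. */
enum memory_state_kind classify_memory_state(const struct snapshot_memory_cfg *cfg)
{
    if (!cfg->memory)
        return MEMORY_STATE_NONE;
    return cfg->memory_file != NULL ? MEMORY_STATE_EXTERNAL
                                    : MEMORY_STATE_INTERNAL;
}
```

If only the external case turns out to be doable on Xen, the MEMORY_STATE_INTERNAL branch would simply become an error.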
>
> > ("memory_file", string),
Input. By design it could be NULL. But as mentioned above, in the Xen code we
would probably always need a path in order to reuse the current functions in
libxl and libxc.
> >
> > /* Array to store disk snapshot info. */
> > ("disks", Array(libxl_disk_snapshot, "num_disks")),
> > ])
> >
> > libxl_disk_snapshot = Struct("disk_snapshot",[
> > ("disk", libxl_device_disk), /* orignal disk */
>
> "original"
>
> > ("name", string), /* snapshot name */
> > ("external", bool), /* external snapshot or not
> */
> >
> > /* external snapshot info, including file path and format, etc.
> > * if "external" is false, this will be "NULL".
> > */
> > ("external_sn", libxl_device_disk),
>
> What does the "sn" suffix stand for?
'snapshot'.
This structure was discussed many times for the last version. As input, the
following information should be filled in:
target device, like 'vda'; ('libxl_device_disk disk' should include that)
disk snapshot name;
external or internal;
if external, external type;
if external, external path;
(these last two should be included in 'libxl_device_disk external_sn')
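
The input list above can be sketched as a completeness check. The structs below are reduced stand-ins rather than the real libxl types: only vdev, pdev_path and format are kept, and format is a plain string here purely for brevity. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for libxl_device_disk, reduced to the fields discussed here. */
struct device_disk_stub {
    const char *vdev;        /* target device, e.g. "vda" */
    const char *pdev_path;   /* path of the backing image */
    const char *format;      /* e.g. "qcow2" or "raw" (a string in this sketch) */
};

/* Stand-in mirroring the proposed libxl_disk_snapshot layout. */
struct disk_snapshot_stub {
    struct device_disk_stub disk;         /* original disk */
    const char             *name;         /* disk snapshot name */
    bool                    external;     /* external snapshot or not */
    struct device_disk_stub external_sn;  /* only meaningful if external */
};

/* Returns 0 if all required inputs are present, -1 otherwise. */
int disk_snapshot_check_inputs(const struct disk_snapshot_stub *s)
{
    /* Target device and snapshot name are always required. */
    if (s->disk.vdev == NULL || s->name == NULL)
        return -1;

    /* An external snapshot additionally needs its type and path. */
    if (s->external &&
        (s->external_sn.format == NULL || s->external_sn.pdev_path == NULL))
        return -1;

    return 0;
}
```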
>
> > int libxl_domain_snapshot_create(libxl_ctx *ctx, const char *domname,
>
> Should take a domid not a name, for consistency with all the other libxl
> functions.
There is one problem: if the domain is not active (not started), a domain
snapshot can still be taken in disk-only mode, but no domid exists in that
case.
>
> > libxl_domain_snapshot *snapshot,
> > unsigned int flags);
> >
> > Creates a new snapshot of a domain based on the snapshot config
> contained
> > in @snapshot.
> >
> > If @flags includes LIBXL_DOMAIN_SNAPSHOT_CREATE_LIVE, then the domain
> is not
> > paused while creating the snapshot, like live migration. This increases
> size
> > of the memory dump file, but reducess downtime of the guest. Only
> support
> > this flag during external checkpoints.
> >
> > If @flags includes LIBXL_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY, then the
> snapshot
> > will be limited to the disks described in @snapshot, and no VM state
> will
> > be saved. For an active guest, this is not supported.
>
> Isn't this redundant with the ->memory field of the snapshot object?
Yes, it can be removed. It was only there to stay consistent with the libvirt
flags.
>
> > ctx: context
> > domname: domain name
> > snapshot: configuration of domain snapshot
> > flags: bitwise-OR of libxlDomainSnapshotCreateFlags
> > Returns: 0 on success, -1 on failure
> >
> >
> > int libxl_domain_snapshot_delete(libxl_ctx *ctx, const char *domname,
>
> Hrm, this suggests that libxl will have some mechanism for managing
> snapshots, is that right?
>
> I don't think libxl should have that functionality since that is the
> toolstack's responsibility to manage the snapshots once they are
> created, using whatever means it likes.
To delete an internal domain snapshot, the internal disk snapshot has to be
deleted. That should be done by calling a qmp command. The libvirt libxl
driver doesn't have that ability, so it needs to call a libxl API to do it.
>
> libxl should be providing the mechanisms ("take a snapshot and put it
> here") but not the policies ("snapshots live in this directory, have
> this lifecycle and this format").
>
> Similar to how libxl provides a way to say "take these bits and present
> them as a disk to the guest", but it leaves the management of image
> files to the toolstack (or in the case of xl the actual user).
>
> > const char *snapshot_name,
Thanks for pointing this out.
This is not what we intended. I realize the parameter here should not be
a snapshot name but probably a libxl_domain_snapshot. The application needs
to supply path info too, as in libxl_domain_snapshot_create.
>
> If this were a libxl_domain_snapshot object I could just about imagine
> that this would be useful helper which just iterated over the files
> referenced by the snapshot and removed them. I'm not sure how useful
> that helper would be in practice though (depends on the toolstack's
> actual requirements).
>
> > unsigned int flags);
> >
> > Delete a snapshot.
> >
> > If @flags is 0, then just this snapshot is deleted, and changes from
> this
> > snapshot are automatically merged into children snapshots.
> >
> > If @flags includes LIBXL_DOMAIN_SNAPSHOT_DELETE_CHILDREN, then this
> snapshot
> > and any descendant snapshots are deleted.
>
> This definitely sounds like toolstack level functionality to me.
>
>
> > ctx: context
> > domname: domain name
> > snapshot_name: snapshot name
> > flags: bitwise-OR of supported libxlDomainSnapshotDeleteFlags
> > Returns: 0 on success, -1 on error.
> >
> > int libxl_disk_snapshot_revert(libxl_ctx *ctx, const char *domname,
>
> Should take a domid.
>
> > const char *snapshot_name,
>
> The input here should be a libxl_domain_snapshot object I think.
> (Otherwise libxl would have to track/manage snapshot names)
Yes. I realized that.
>
> > unsigned int flags);
> >
> > Revert the domain to a given snapshot.
> >
> > Normally, the domain will revert to the same state the domain was in
> while
> > the snapshot was taken (whether inactive, running, or paused).
>
> What is the distinction between inactive and paused?
Inactive means not started at all.
A disk-only snapshot can still be created in this state.
>
> >
> > If @flags includes LIBXL_DOMAIN_SNAPSHOT_REVERT_RUNNING, then overrides
> the
> > snapshot state to guarantee a running domain after the revert.
> >
> > If @flags includes LIBXL_DOMAIN_SNAPSHOT_REVERT_PAUSED, then guarantees
> a
> > paused domain after the revert.
> >
> > ctx: context
> > domname: domain name
> > snapshot_name: snapshot name
> > flags: bitwise-OR of supported libxlDomainSnapshotRevertFlags
> > Returns: 0 on success, -1 on error.
>
>
>
>
* Re: [RFC V6] Libxl Domain Snapshot API Design
From: Wei Liu @ 2014-09-09 9:28 UTC
To: Chun Yan Liu; +Cc: wei.liu2, Jim Fehlig, ian.jackson, Ian Campbell, xen-devel
On Tue, Sep 09, 2014 at 02:43:09AM -0600, Chun Yan Liu wrote:
[...]
> >
> > > int libxl_domain_snapshot_create(libxl_ctx *ctx, const char *domname,
> >
> > Should take a domid not a name, for consistency with all the other libxl
> > functions.
>
> There is one problem:
> If domain is not active (not started), domain snapshot can also be done at
> disk-only mode. But domid does not exist in this case.
>
From reading this email my understanding of "not active" is that a
domain is not even created. I don't think libxl can associate a string
(domname) with any particular domain configuration. So IMO you should
provide a libxl_domain_config instead.
This libxl_domain_config can be retrieved either via libxl (with the new
API on the way) or parsing an "inactive" config. Then you can set your
flag to indicate whether it's a disk only snapshot or the other one.
Wei.
* Re: [RFC V6] Libxl Domain Snapshot API Design
From: Ian Campbell @ 2014-09-09 10:13 UTC
To: Chun Yan Liu; +Cc: Jim Fehlig, ian.jackson, xen-devel
On Tue, 2014-09-09 at 02:43 -0600, Chun Yan Liu wrote:
> Generally
> it could be like creating domain, xl can list all domains created by xl or virsh,
> xl snapshot-list could list domain snapshots created by xl or virsh too. To
> show complete snapshot information, I think it's better to include these
> at libxl level.
I think we have a fundamental disconnect in what we consider a snapshot
to be like here and at what level of the toolstack hierarchy they exist,
so I'm going to focus on just this one bit for now since there isn't
much point in moving forward with the rest until we've come to an
agreement on this.
My view is that libxl is primarily concerned with domains which are
actually running and the operations which can be performed on them.
Therefore it provides mechanisms for listing all running domains which
xl and libvirt etc can use to list all running domains.
However libxl does not concern itself with domains which are not
currently running, it simply has no idea about them. This knowledge of
non-running domains exists only at the higher levels of the toolstack.
(this is why I said carefully libvirt in the previous paragraph and not
virsh, since virsh accesses the higher level toolstack and can therefore
list non-running domains too)
With xl this is exposed quite fundamentally since a non running domain
exists only in the cfg file and disk images, which the user managed by
hand. When you save a domain it is to a user provided file, and after
that point libxl has no further knowledge of it.
With libvirt this manifests as libvirt keeping track of every domain's
configuration even when the domain is not running as part of its own
state. I'm not 100% sure where a saved domain goes with libvirt but once
it is saved I believe libxl no longer knows about the domain and it only
exists as part of the libvirt state.
A second consequence of this is that libxl has no concept of storage
management, i.e. it doesn't know anything about disk images except when
it is asked to attach one particular disk image to a domain. With xl
users do storage management by hand (with lvcreate and qemu-img etc)
whereas libvirt has its own storage management which it uses.
In my view a snapshot is more like a saved domain than a running one. As
such once the snapshot has been created then libxl should not need to
know anything more about it. The snapshotted domain of course keeps
running and libxl knows about that, but the snapshot itself is no longer
libxl's concern.
What this means is that in order to implement "xl snapshot-list" then
*xl*, not libxl, would need to manage those snapshots itself. Perhaps by
growing a whole bunch of snapshot and storage management functionality,
but more likely by passing this responsibility on to the user (i.e. "xl
snapshot-list" becomes "run ls on the directory where you store these
things").
I think virsh snapshot-list is more interesting, but AIUI libvirt's
datamodel already includes all of the storage and snapshot management
which is required here. It already keeps the state for non-running
domains and it already has infrastructure for managing disks and it
already knows about snapshotting of domains at a high level etc.
So I don't think there is any need (or even desire) for libxl to
replicate any of that functionality. It should concern itself with
taking a snapshot when asked and then forget all about it.
I think things work the same with e.g. qemu and other libvirt backends.
A qemu process only exists to track an actual running domain, all of the
other state including the configuration when the domain is not running
etc is all handled at the libvirt layer. When you take a snapshot of a
qemu backed domain then after that snapshot has happened the qemu
process knows no more about it. So in this way of thinking qemu and
libxl are functionally equivalent bits i.e. things which are used to
instantiate actual running domains and perform operations on them.
Does that make sense?
Perhaps my understanding of the libvirt datamodel is incorrect or
incomplete, so please point out if this is the case.
Ian.
* Re: [RFC V6] Libxl Domain Snapshot API Design
2014-09-09 10:13 ` Ian Campbell
@ 2014-09-10 5:53 ` Chun Yan Liu
2014-09-10 12:17 ` Ian Campbell
0 siblings, 1 reply; 11+ messages in thread
From: Chun Yan Liu @ 2014-09-10 5:53 UTC (permalink / raw)
To: Ian Campbell; +Cc: Jim Fehlig, ian.jackson, xen-devel
>>> On 9/9/2014 at 06:13 PM, in message
<1410257637.8217.94.camel@kazak.uk.xensource.com>, Ian Campbell
<Ian.Campbell@citrix.com> wrote:
> On Tue, 2014-09-09 at 02:43 -0600, Chun Yan Liu wrote:
> > Generally
> > it could be like creating domain, xl can list all domains created by xl or virsh,
> > xl snapshot-list could list domain snapshots created by xl or virsh too. To
> > show complete snapshot information, I think it's better to include these
> > at libxl level.
>
> I think we have a fundamental disconnect in what we consider a snapshot
> to be like here and at what level of the toolstack hierarchy they exist,
> so I'm going to focus on just this one bit for now since there isn't
> much point in moving forward with the rest until we've come to an
> agreement on this.
>
> My view is that libxl is primarily concerned with domains which are
> actually running and the operations which can be performed on them.
> Therefore it provides mechanisms for listing all running domains which
> xl and libvirt etc can use to list all running domains.
>
> However libxl does not concern itself with domains which are not
> currently running, it simply has no idea about them. This knowledge of
> non-running domains exists only at the higher levels of the toolstack.
> (this is why I said carefully libvirt in the previous paragraph and not
> virsh, since virsh accesses the higher level toolstack and can therefore
> list non-running domains too)
>
> With xl this is exposed quite fundamentally since a non running domain
> exists only in the cfg file and disk images, which the user managed by
> hand. When you save a domain it is to a user provided file, and after
> that point libxl has no further knowledge of it.
>
> With libvirt this manifests as libvirt keeping track of every domain's
> configuration even when the domain is not running as part of its own
> state. I'm not 100% sure where a saved domain goes with libvirt but once
> it is saved I believe libxl no longer knows about the domain and it only
> exists as part of the libvirt state.
>
> A second consequence of this is that libxl has no concept of storage
> management, i.e. it doesn't know anything about disk images except when
> it is asked to attach one particular disk image to a domain. With xl
> users do storage management by hand (with lvcreate and qemu-img etc)
> whereas libvirt has its own storage management which it uses.
>
> In my view a snapshot is more like a saved domain than a running one. As
> such once the snapshot has been created then libxl should not need to
> know anything more about it. The snapshotted domain of course keeps
> running and libxl knows about that, but the snapshot itself is no longer
> libxl's concern.
>
> What this means is that in order to implement "xl snapshot-list",
> *xl*, not libxl, would need to manage those snapshots itself. Perhaps by
> growing a whole bunch of snapshot and storage management functionality,
> but more likely by passing this responsibility on to the user (i.e. "xl
> snapshot-list" becomes "run ls on the directory where you store these
> things").
>
> I think virsh snapshot-list is more interesting, but AIUI libvirt's
> datamodel already includes all of the storage and snapshot management
> which is required here. It already keeps the state for non-running
> domains and it already has infrastructure for managing disks and it
> already knows about snapshotting of domains at a high level etc.
Yes, that's right.
>
> So I don't think there is any need (or even desire) for libxl to
> replicate any of that functionality. It should concern itself with
> taking a snapshot when asked and then forget all about it.
>
> I think things work the same with e.g. qemu and other libvirt backends.
> A qemu process only exists to track an actual running domain, all of the
> other state including the configuration when the domain is not running
> etc is all handled at the libvirt layer. When you take a snapshot of a
> qemu backed domain then after that snapshot has happened the qemu
> process knows no more about it. So in this way of thinking qemu and
> libxl are functionally equivalent bits i.e. things which are used to
> instantiate actual running domains and perform operations on them.
>
> Does that make sense?
Totally agree with you. I had mixed up the roles of xl and libxl. It's correct
that libxl should only do what the application asks it to do and then
forget everything. Maintaining state and other information is not libxl's
job; it's the job of the higher-level application. Libvirt does have a data
model to keep track of snapshot info, so it doesn't need libxl to do that.
xl can maintain snapshot info at the application level.
The only problem is snapshot-delete:
deleting a disk snapshot requires a qmp command, and the libvirt libxl driver
cannot issue that itself (short of adding qmp-related helper functions, which
would largely duplicate libxl_qmp.c, I think).
If we supply a libxl_domain_snapshot_delete API, it cannot handle the
snapshot chain (parent, children, sibling) without info about all snapshots.
Or, if we supply a libxl_disk_snapshot_delete API, it would be odd for that
to be the only disk snapshot API.
>
> Perhaps my understanding of the libvirt datamodel is incorrect or
> incomplete, so please point out if this is the case.
>
> Ian.
>
>
>
* Re: [RFC V6] Libxl Domain Snapshot API Design
2014-09-10 5:53 ` Chun Yan Liu
@ 2014-09-10 12:17 ` Ian Campbell
2014-09-11 8:30 ` Chun Yan Liu
0 siblings, 1 reply; 11+ messages in thread
From: Ian Campbell @ 2014-09-10 12:17 UTC (permalink / raw)
To: Chun Yan Liu; +Cc: Jim Fehlig, ian.jackson, xen-devel
On Tue, 2014-09-09 at 23:53 -0600, Chun Yan Liu wrote:
> > So I don't think there is any need (or even desire) for libxl to
> > replicate any of that functionality. It should concern itself with
> > taking a snapshot when asked and then forget all about it.
> >
> > I think things work the same with e.g. qemu and other libvirt backends.
> > A qemu process only exists to track an actual running domain, all of the
> > other state including the configuration when the domain is not running
> > etc is all handled at the libvirt layer. When you take a snapshot of a
> > qemu backed domain then after that snapshot has happened the qemu
> > process knows no more about it. So in this way of thinking qemu and
> > libxl are functionally equivalent bits i.e. things which are used to
> > instantiate actual running domains and perform operations on them.
> >
> > Does that make sense?
>
> Totally agree with you. I've mixed something in xl and libxl. It's correct
> that as libxl it should only do what application asks it to do and then
> forget everything. Maintaining state and other information is not libxl's
> work, it's the work of high level application. Libvirt does have datamodel
> to keep tracking snapshot info, it doesn't need libxl to do that. xl can
> maintain snapshot info in application level.
Great! I'm glad we are on the same page.
> The only problem is snapshot-delete:
> to delete disk snapshot, it has to call qmp command, libvirt libxl driver
> cannot do that itself (except adding qmp related helper functions, quite
> a duplicate with libxl_qmp.c I think).
What is this qmp command, what does it do and what arguments does it
require?
Perhaps it's something like "the snapshot chain of $disk may have
changed, go and figure out what to do"? Or is it more complicated than
that?
> If we supply libxl_domain_snapshot_delete API, to handle snapshot-chain
> (parent, children, sibling), it doesn't work without info about all snapshots.
> Or, if we supply libxl_disk_snapshot_delete API, that's also weird with this
> only-one disk snapshot API.
So, the issue here is deleting a snapshot where there is an actively
running domain using some other snapshot in the chain, is that right?
Specifically we aren't interested in deleting a snapshot which is
actually being used right now? Would that case be handled by destroying
the domain in question first?
So in the simplest case we might have:
BASE -> SNAPSHOT A -> SNAPSHOT B
And there may be a domain running which is using any of BASE, A or B? Do
we then need to be able to delete either of the other two snapshots?
If a domain is running using SNAPSHOT A then I presume SNAPSHOT B can be
trivially deleted without needing to communicate with that domain.
If a domain is running using SNAPSHOT B and we want to delete SNAPSHOT A
then the domain needs to know that it should rescan the chain to
discover that it has become:
BASE -> SNAPSHOT B
Or is the qemu associated with the domain actually responsible for
making that happen (i.e. folding the contents of SNAPSHOT A into either
BASE or SNAPSHOT B as appropriate at runtime)? Or does that happen
elsewhere?
I think there are more complex cases with trees of snapshots etc (e.g):
----- SNAPSHOT A
'
'
BASE ----x
,
,
----- SNAPSHOT B
But let's deal with the simple linear case first ;-)
Ian.
* Re: [RFC V6] Libxl Domain Snapshot API Design
2014-09-10 12:17 ` Ian Campbell
@ 2014-09-11 8:30 ` Chun Yan Liu
2014-09-24 12:21 ` Ian Campbell
0 siblings, 1 reply; 11+ messages in thread
From: Chun Yan Liu @ 2014-09-11 8:30 UTC (permalink / raw)
To: Ian Campbell; +Cc: Jim Fehlig, ian.jackson, xen-devel
>>> On 9/10/2014 at 08:17 PM, in message
<1410351477.8217.375.camel@kazak.uk.xensource.com>, Ian Campbell
<Ian.Campbell@citrix.com> wrote:
> On Tue, 2014-09-09 at 23:53 -0600, Chun Yan Liu wrote:
> > > So I don't think there is any need (or even desire) for libxl to
> > > replicate any of that functionality. It should concern itself with
> > > taking a snapshot when asked and then forget all about it.
> > >
> > > I think things work the same with e.g. qemu and other libvirt backends.
> > > A qemu process only exists to track an actual running domain, all of the
> > > other state including the configuration when the domain is not running
> > > etc is all handled at the libvirt layer. When you take a snapshot of a
> > > qemu backed domain then after that snapshot has happened the qemu
> > > process knows no more about it. So in this way of thinking qemu and
> > > libxl are functionally equivalent bits i.e. things which are used to
> > > instantiate actual running domains and perform operations on them.
> > >
> > > Does that make sense?
> >
> > Totally agree with you. I've mixed something in xl and libxl. It's correct
> > that as libxl it should only do what application asks it to do and then
> > forget everything. Maintaining state and other information is not libxl's
> > work, it's the work of high level application. Libvirt does have datamodel
> > to keep tracking snapshot info, it doesn't need libxl to do that. xl can
> > maintain snapshot info in application level.
>
> Great! I'm glad we are on the same page.
>
> > The only problem is snapshot-delete:
> > to delete disk snapshot, it has to call qmp command, libvirt libxl driver
> > cannot do that itself (except adding qmp related helper functions, quite
> > a duplicate with libxl_qmp.c I think).
>
> What is this qmp command, what does it do and what arguments does it
> require?
The qmp command is 'blockdev-snapshot-delete-internal-sync'.
Its purpose is to delete an internal snapshot stored within the disk
image. E.g. for the qcow2 format, an internal disk snapshot stores its
snapshot-related info within the image itself. To delete such a snapshot, one
has to communicate with qemu (through a qmp command) and let the qcow2 driver
handle it.
{ "execute": "blockdev-snapshot-delete-internal-sync",
"arguments": { "device": "ide-hd0",
"name": "snapshot0" }
}
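
As a minimal sketch (not libxl code), the request above could be assembled
like this by whichever layer ends up talking to qemu. The command name and
the "device"/"name" arguments are the ones shown above; the helper name
build_delete_internal_snapshot is made up for illustration, and no socket
handling or reply parsing is shown.

```python
import json


def build_delete_internal_snapshot(device, name):
    """Return the QMP request for deleting an internal snapshot as a JSON string.

    `device` is the qemu block device id and `name` the internal snapshot
    name, as in the example message above. Hypothetical helper.
    """
    request = {
        "execute": "blockdev-snapshot-delete-internal-sync",
        "arguments": {"device": device, "name": name},
    }
    return json.dumps(request)


# Build the same request as the hand-written example.
request = build_delete_internal_snapshot("ide-hd0", "snapshot0")
```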
>
> Perhaps it's something like "the snapshot chain of $disk may have
> changed, go and figure out what to do"? Or is it more complicated than
> that?
>
> > If we supply libxl_domain_snapshot_delete API, to handle snapshot-chain
> > (parent, children, sibling), it doesn't work without info about all snapshots.
> > Or, if we supply libxl_disk_snapshot_delete API, that's also weird with this
> > only-one disk snapshot API.
>
> So, the issue here is deleting a snapshot where there is an actively
> running domain using some other snapshot in the chain, is that right?
I meant to say:
BASE -> SNAPSHOT A -> SNAPSHOT B
When the user wants to delete SNAPSHOT A, if a flag indicates deleting
all children, then SNAPSHOT B should be deleted too; without this flag,
by default all changes in SNAPSHOT A should be merged into
SNAPSHOT B. In either case, the delete operation needs to know the
relationship between SNAPSHOT A and SNAPSHOT B.
Well, after thinking again, I think maybe there is no problem here:
libxl supplies a libxl_domain_snapshot_delete API, which does the disk
snapshot-delete work. At the disk level, 'all changes to SNAPSHOT A should
be merged into SNAPSHOT B' is ensured by qemu automatically; libxl
doesn't need to take care of that.
Then xl or libvirt can do everything else: they scan the domain snapshot
information to decide which snapshots are related and should be deleted,
and refresh the domain snapshot chain relationship after the work.
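
To make the split concrete, here is a rough sketch of the application-side
part (xl or libvirt), i.e. deciding which snapshots a delete request covers.
It assumes the application keeps a simple name -> parent map; that map and
the function name snapshots_to_delete are illustrative assumptions, not
existing xl or libvirt code.

```python
def snapshots_to_delete(parents, target, with_children=False):
    """Return the snapshot names to delete, children before parents.

    `parents` maps each snapshot name to its parent's name (None for the
    root). Without `with_children` only `target` is returned; merging its
    changes into its child is left to the disk backend.
    """
    if target not in parents:
        raise KeyError(target)
    if not with_children:
        return [target]

    # Invert the parent map so descendants can be walked.
    children = {}
    for name, parent in parents.items():
        children.setdefault(parent, []).append(name)

    order = []

    def visit(name):
        for child in sorted(children.get(name, [])):
            visit(child)
        order.append(name)  # post-order: a snapshot comes after its children

    visit(target)
    return order


chain = {"BASE": None, "A": "BASE", "B": "A"}
plan = snapshots_to_delete(chain, "A", with_children=True)
```

With the linear chain above, the plan deletes B first and A second; the
application would then issue one delete call per entry.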
>
> Specifically we aren't interested in deleting a snapshot which is
> actually being used right now? Would that case be handled by destroying
> the domain in question first?
>
> So in the simplest case we might have:
>
> BASE -> SNAPSHOT A -> SNAPSHOT B
>
> And there may be a domain running which is using any of BASE, A or B? Do
> we then need to be able to delete either of the other two snapshots?
>
> If a domain is running using SNAPSHOT A then I presume SNAPSHOT B can be
> trivially deleted without needing to communicate with that domain.
>
> If a domain is running using SNAPSHOT B and we want to delete SNAPSHOT A
> then the domain needs to know that it should rescan the chain to
> discover that it has become:
>
> BASE -> SNAPSHOT B
>
> Or is the qemu associated with the domain actually responsible for
> making that happen (i.e. folding the contents of SNAPSHOT A into either
> BASE or SNAPSHOT B as appropriate at runtime)? Or does that happen
> elsewhere?
If SNAPSHOT B is based on SNAPSHOT A, then when A is deleted, the changes
made in A will be merged into B. Qemu can ensure that; at least for a qcow2
disk, the qcow2 driver can ensure that.
But at the domain level, it is up to the higher-level application to refresh
the domain snapshot chain relationship. In libvirt, it is qemu_driver that
refreshes the chain relationship and updates the related qemuDomainSnapshotObj
(updating .parent, .sibling, .children, etc.).
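
As an illustration of that bookkeeping step only (this is not libvirt's
implementation), refreshing the chain after deleting one snapshot amounts to
re-attaching its children to its parent. The name -> parent map and the
function name forget_snapshot are assumptions made for this sketch.

```python
def forget_snapshot(parents, deleted):
    """Return a new name -> parent map with `deleted` removed.

    Children of the deleted snapshot are re-attached to its parent, which
    mirrors BASE -> A -> B becoming BASE -> B once A is gone.
    """
    new_parent = parents[deleted]
    return {
        name: (new_parent if parent == deleted else parent)
        for name, parent in parents.items()
        if name != deleted
    }


chain = {"BASE": None, "A": "BASE", "B": "A"}
refreshed = forget_snapshot(chain, "A")
```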
>
> I think there are more complex cases with tree's of snapshots etc (e.g):
> ----- SNAPSHOT A
> '
> '
> BASE ----x
> ,
> ,
> ----- SNAPSHOT B
>
> But lets deal with the simple linear case first ;-)
>
> Ian.
>
>
>
* Re: [RFC V6] Libxl Domain Snapshot API Design
2014-09-11 8:30 ` Chun Yan Liu
@ 2014-09-24 12:21 ` Ian Campbell
2014-09-26 3:00 ` Chun Yan Liu
0 siblings, 1 reply; 11+ messages in thread
From: Ian Campbell @ 2014-09-24 12:21 UTC (permalink / raw)
To: Chun Yan Liu; +Cc: Jim Fehlig, ian.jackson, xen-devel
On Thu, 2014-09-11 at 02:30 -0600, Chun Yan Liu wrote:
Sorry, I've just realised I lost track of this thread.
>
> >>> On 9/10/2014 at 08:17 PM, in message
> <1410351477.8217.375.camel@kazak.uk.xensource.com>, Ian Campbell
> <Ian.Campbell@citrix.com> wrote:
> > On Tue, 2014-09-09 at 23:53 -0600, Chun Yan Liu wrote:
> > > > So I don't think there is any need (or even desire) for libxl to
> > > > replicate any of that functionality. It should concern itself with
> > > > taking a snapshot when asked and then forget all about it.
> > > >
> > > > I think things work the same with e.g. qemu and other libvirt backends.
> > > > A qemu process only exists to track an actual running domain, all of the
> > > > other state including the configuration when the domain is not running
> > > > etc is all handled at the libvirt layer. When you take a snapshot of a
> > > > qemu backed domain then after that snapshot has happened the qemu
> > > > process knows no more about it. So in this way of thinking qemu and
> > > > libxl are functionally equivalent bits i.e. things which are used to
> > > > instantiate actual running domains and perform operations on them.
> > > >
> > > > Does that make sense?
> > >
> > > Totally agree with you. I've mixed something in xl and libxl. It's correct
> > > that as libxl it should only do what application asks it to do and then
> > > forget everything. Maintaining state and other information is not libxl's
> > > work, it's the work of high level application. Libvirt does have datamodel
> > > to keep tracking snapshot info, it doesn't need libxl to do that. xl can
> > > maintain snapshot info in application level.
> >
> > Great! I'm glad we are on the same page.
> >
> > > The only problem is snapshot-delete:
> > > to delete disk snapshot, it has to call qmp command, libvirt libxl driver
> > > cannot do that itself (except adding qmp related helper functions, quite
> > > a duplicate with libxl_qmp.c I think).
> >
> > What is this qmp command, what does it do and what arguments does it
> > require?
>
> The qmp command is 'blockdev-snapshot-delete-internal-sync'.
> Purpose of this qmp command is to delete an internal snapshot within the
> disk. E.g. for qcow2 format, a internal disk snapshot will store snapshot
> related info within the disk. To delete this snapshot, one has to communicate
> with qemu (through qmp command), let qcow2 driver to handle that.
>
> { "execute": "blockdev-snapshot-delete-internal-sync",
> "arguments": { "device": "ide-hd0",
> "name": "snapshot0" }
> }
Thanks, I can see why this would be necessary if a domain was actively
using the snapshot chain.
Presumably if there is no active domain the calling application will
take some action directly to delete the snapshot (such as using qemu-img
perhaps?)
I suppose there is an added wrinkle which is that there might in fact be
multiple active domains using different snapshots within the chain
contained in the same qcow2 file? Or if qcow2 outlaws this we should
either consider a more general container or explicitly decide that we
don't support this sort of thing. TBH I'm not sure how multiple qemu
processes accessing the same qcow2 file could work, at least for
internal snapshots, and external ones I suppose are pretty trivial to
deal with.
So at the libxl layer we would want a function which takes a domid and a
libxl_device_disk and some sort of "handle" to the snapshot to remove.
I say "handle" rather than "string containing the name" since different
containers and/or backend types might have different manifestations of
such things, so it would have to be somewhat opaque from the libxl API
point of view.
I *think* (but I'm not sure) that the handle will be a function of the
disk format only and not the disk backend. My reasoning is that libxl
will need to know for each backend type how to speak to it (e.g. qdisk
== qmp) to delete a snapshot and will forward the handle on in the
request it makes to the backend, which will necessarily understand how
snapshots are referenced for that disk container. Does that make
sense?
Given that it seems like the libxl function should take some sort of
parameter which depends on the disk format, we could just make that a
void*, but perhaps something more structured like:
libxl_snapshot_handle = Struct("snapshot_handle", [
    ("u", KeyedUnion(None, libxl_disk_format, "format",
        [("qcow2", Struct(None, [("name", string)])),
        ])),
    ])
So for qcow2 you would set
handle.format = LIBXL_DISK_FORMAT_QCOW2
handle.u.qcow2.name = "snapshot0"
and pass the result to the function.
handle.format would duplicate disk.format, and it would be an error for
them to be mismatched. Maybe we could just use an IDL Union instead of a
KeyedUnion to avoid that. KeyedUnion is nice because it enforces a
relationship between the related enum and union, but maybe in this case
it is uglier than it is helpful.
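
To illustrate the mismatch check, here is a small sketch of a format-keyed
handle together with the "handle format must match disk format" rule. All
names here (DiskFormat, Qcow2Handle, check_handle) are invented for the
example and are not part of libxl.

```python
from dataclasses import dataclass
from enum import Enum


class DiskFormat(Enum):
    RAW = "raw"
    QCOW2 = "qcow2"


@dataclass(frozen=True)
class Qcow2Handle:
    """qcow2 refers to an internal snapshot by name."""
    name: str


# Which disk format each handle type belongs to (the "key" of the union).
HANDLE_FORMAT = {Qcow2Handle: DiskFormat.QCOW2}


def check_handle(disk_format, handle):
    """Return `handle` if it matches `disk_format`, otherwise raise."""
    expected = HANDLE_FORMAT.get(type(handle))
    if expected is None:
        raise TypeError("unknown snapshot handle type: %r" % (handle,))
    if expected is not disk_format:
        raise ValueError("handle is for %s but disk is %s"
                         % (expected.value, disk_format.value))
    return handle


handle = check_handle(DiskFormat.QCOW2, Qcow2Handle("snapshot0"))
```

Deriving the format from the handle type, as done here, is roughly the
plain Union option: there is no second format field that could disagree
with the one on the disk.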
> > Perhaps it's something like "the snapshot chain of $disk may have
> > changed, go and figure out what to do"? Or is it more complicated than
> > that?
> >
> > > If we supply libxl_domain_snapshot_delete API, to handle snapshot-chain
> > > (parent, children, sibling), it doesn't work without info about all snapshots.
> > > Or, if we supply libxl_disk_snapshot_delete API, that's also weird with this
> > > only-one disk snapshot API.
> >
> > So, the issue here is deleting a snapshot where there is an actively
> > running domain using some other snapshot in the chain, is that right?
>
> I meant to say:
> BASE -> SNAPSHOT A -> SNAPSHOT B
> When user want to delete SNAPSHOT A, if flag indicates deleting
> all children, then SNAPSHOT B should be deleted too; if without this flag,
> then by default all changes to SNAPSHOT A should be merged to
> SNAPSHOT B. In any case, it should know the relationship between
> SNAPSHOT A and SNAPSHOT B.
If you want to delete SNAPSHOT A and all of its children then there
cannot be any active domains using A or its children, so would it be
necessary to go via libxl at all? (as hinted above perhaps such things
are better handled directly in the application).
If you are deleting A only and B is in active use then merging A into B
would be the job of B's backend, via the call/mechanism proposed above.
Does that seem reasonable?
> Well, after thinking again, I think maybe there is no problem here:
> libxl supplies libxl_domain_snapshot_delete API, that could help doing disk
> snapshot-delete work. In disk level, 'all changes to SNAPSHOT A should
> be merged to SNAPSHOT B' can be ensured by qemu automatically, libxl
> doesn't need to take care of that.
> Then xl or libvirt can do all other things, they scan the domain snapshot
> information to decide which snapshot is related and should be deleted,
> and refresh domain snapshot chain relationship after the work.
IOW it would be the application's responsibility to iterate over the
chain and delete individual things? That sounds reasonable and makes it
easier for us to ignore the possibility of multiple VMs at the libxl API
layer.
> > Specifically we aren't interested in deleting a snapshot which is
> > actually being used right now? Would that case be handled by destroying
> > the domain in question first?
> >
> > So in the simplest case we might have:
> >
> > BASE -> SNAPSHOT A -> SNAPSHOT B
> >
> > And there may be a domain running which is using any of BASE, A or B? Do
> > we then need to be able to delete either of the other two snapshots?
> >
> > If a domain is running using SNAPSHOT A then I presume SNAPSHOT B can be
> > trivially deleted without needing to communicate with that domain.
> >
> > If a domain is running using SNAPSHOT B and we want to delete SNAPSHOT A
> > then the domain needs to know that it should rescan the chain to
> > discover that it has become:
> >
> > BASE -> SNAPSHOT B
> >
> > Or is the qemu associated with the domain actually responsible for
> > making that happen (i.e. folding the contents of SNAPSHOT A into either
> > BASE or SNAPSHOT B as appropriate at runtime)? Or does that happen
> > elsewhere?
>
> If SNAPSHOT B is based on SNAPSHOT A, when A is deleted, changes made
> to A will be merged to B. Qemu can ensure that, at least for a qcow2 disk,
> qcow2 driver can ensure that.
Ack.
> But from Domain level, it is high level application to refresh the domain
> snapshot chain relationship. In libvirt, it is qemu_driver to rerefresh the
> chain relationship and update related qemuDomainSnapshotObj (update
> information of .parent, .sibling, .children, etc.).
Is the "qemu_driver" here a "qemu storage driver" or a "qemu vm
driver" (is there even such a distinction?)
Ian.
* Re: [RFC V6] Libxl Domain Snapshot API Design
2014-09-24 12:21 ` Ian Campbell
@ 2014-09-26 3:00 ` Chun Yan Liu
0 siblings, 0 replies; 11+ messages in thread
From: Chun Yan Liu @ 2014-09-26 3:00 UTC (permalink / raw)
To: Ian Campbell; +Cc: Jim Fehlig, ian.jackson, xen-devel
>>> On 9/24/2014 at 08:21 PM, in message
<1411561287.28127.22.camel@kazak.uk.xensource.com>, Ian Campbell
<Ian.Campbell@citrix.com> wrote:
> On Thu, 2014-09-11 at 02:30 -0600, Chun Yan Liu wrote:
>
> Sorry, I've just realised I lost track of this thread.
> >
> > >>> On 9/10/2014 at 08:17 PM, in message
> > <1410351477.8217.375.camel@kazak.uk.xensource.com>, Ian Campbell
> > <Ian.Campbell@citrix.com> wrote:
> > > On Tue, 2014-09-09 at 23:53 -0600, Chun Yan Liu wrote:
> > > > > So I don't think there is any need (or even desire) for libxl to
> > > > > replicate any of that functionality. It should concern itself with
> > > > > taking a snapshot when asked and then forget all about it.
> > > > >
> > > > > > I think things work the same with e.g. qemu and other libvirt backends.
> > > > > > A qemu process only exists to track an actual running domain, all of the
> > > > > > other state including the configuration when the domain is not running
> > > > > etc is all handled at the libvirt layer. When you take a snapshot of a
> > > > > qemu backed domain then after that snapshot has happened the qemu
> > > > > process knows no more about it. So in this way of thinking qemu and
> > > > > libxl are functionally equivalent bits i.e. things which are used to
> > > > > instantiate actual running domains and perform operations on them.
> > > > >
> > > > > Does that make sense?
> > > >
> > > > > Totally agree with you. I've mixed something in xl and libxl. It's correct
> > > > > that as libxl it should only do what application asks it to do and then
> > > > > forget everything. Maintaining state and other information is not libxl's
> > > > > work, it's the work of high level application. Libvirt does have datamodel
> > > > > to keep tracking snapshot info, it doesn't need libxl to do that. xl can
> > > > > maintain snapshot info in application level.
> > >
> > > Great! I'm glad we are on the same page.
> > >
> > > > The only problem is snapshot-delete:
> > > > > to delete disk snapshot, it has to call qmp command, libvirt libxl driver
> > > > > cannot do that itself (except adding qmp related helper functions, quite
> > > > > a duplicate with libxl_qmp.c I think).
> > >
> > > What is this qmp command, what does it do and what arguments does it
> > > require?
> >
> > The qmp command is 'blockdev-snapshot-delete-internal-sync'.
> > Purpose of this qmp command is to delete an internal snapshot within the
> > disk. E.g. for qcow2 format, a internal disk snapshot will store snapshot
> > > related info within the disk. To delete this snapshot, one has to communicate
> > > with qemu (through qmp command), let qcow2 driver to handle that.
> >
> > { "execute": "blockdev-snapshot-delete-internal-sync",
> > "arguments": { "device": "ide-hd0",
> > "name": "snapshot0" }
> > }
>
> Thanks, I can see why this would be necessary if a domain was actively
> using the snapshot chain.
>
> Presumably if there is no active domain the calling application will
> take some action directly to delete the snapshot (such as using qemu-img
> perhaps?)
Yes, without a qemu process, qmp does not work. I think the application can
only call the qemu-img command to delete the internal disk snapshot.
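
A rough sketch of that decision, assuming the caller already knows whether a
domain is actively using the image: with a running domain the delete goes
through qmp, otherwise through "qemu-img snapshot -d". The function name
plan_internal_snapshot_delete and the (kind, payload) return shape are made
up for illustration; nothing is executed here.

```python
def plan_internal_snapshot_delete(domain_running, image_path, device, name):
    """Describe how to delete an internal snapshot, without doing it.

    Returns ("qmp", request_dict) when a qemu process owns the image, or
    ("exec", argv_list) for the offline qemu-img path.
    """
    if domain_running:
        return ("qmp", {
            "execute": "blockdev-snapshot-delete-internal-sync",
            "arguments": {"device": device, "name": name},
        })
    # No qemu process to talk to: operate on the image file directly.
    return ("exec", ["qemu-img", "snapshot", "-d", name, image_path])


online = plan_internal_snapshot_delete(True, "/images/vm.qcow2", "ide-hd0", "snapshot0")
offline = plan_internal_snapshot_delete(False, "/images/vm.qcow2", "ide-hd0", "snapshot0")
```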
>
> I suppose there is an added wrinkle which is that there might in fact be
> multiple active domains using different snapshots within the chain
> contained in the same qcow2 file? Or if qcow2 outlaws this we should
> either consider a more general container or explicitly decide that we
> don't support this sort of thing. TBH I'm not sure how multiple qemu
> processes accessing the same qcow2 file could work,
I don't know if there is a problem; at least without snapshot support, we
don't allow different domains to access the same disk image (unless it is
'shared'). So, regarding 'multiple active domains using different snapshots
within the chain contained in the same qcow2 file', maybe it's better not to
support it.
> at least for
> internal snapshots, and external ones I suppose are pretty trivial to
> deal with.
>
> So at the libxl layer we would want a function which takes a domid and a
> libxl_device_disk and some sort of "handle" to the snapshot to remove.
>
> I say "handle" rather than "string containing the name" since different
> containers and/or backend types might have different manifestations of
> such things, so it would have to be somewhat opaque from the libxl API
> point of view.
>
> I *think* (but I'm not sure) that the handle will be a function of the
> disk format only and not the disk backend. My reasoning is that libxl
> will need to know for each backend type how to speak to it (e.g. qdisk
> == qmp) to delete a snapshot and will forward the handle on in the
> request it makes to the backend, which will necessarily understand the
> how snapshots are referenced for that disk container. Does that make
> sense?
>
> Given that it seems like the libxl function should take some sort of
> parameter which depends on the disk format, we could just make that a
> void*, but perhaps something more structured like:
>
> libxl_snapshot_handle = Struct("snapshot_handle", [
> ("u", KeyedUnion(None, libxl_disk_format, "format",
> [("qcow2", Struct(None, [("name", string)]),
> ])
> ])
>
> So for qcow you would set
> handle.format = LIBXL_DISK_FORMAT_QCOW2
> handle.qcow2.name = "snapshot0"
> and pass the result to the function.
>
> handle.format would duplicate disk.format, and it would be an error for
> them to be mismatched. Maybe we could just use an IDL Union instead of a
> KeyedUnion to avoid that. KeyedUnion is nice because it enforces a
> relationship between the related enum and union, but maybe in this case
> it is uglier than it is helpful.
Well, I think if the 'handle' is just there to provide the format and the
snapshot name, maybe we don't need it. Like the other libxl snapshot APIs,
the function can take a domid and a libxl_domain_snapshot (now I think
libxl_domain_snapshot_args may be a better name) as parameters. In
libxl_domain_snapshot, we could have:
{
memory, //bool
memory_location, //if 'memory' is true, then this is not NULL.
num_disks, //how many disk need to handle
disks[], //disk snapshot information
/* each disk is a structure of libxl_disk_snapshot.
libxl_disk_snapshot {
libxl_device_disk,
snapshot_name,
external, //bool
external_format,
external_path,
}
*/
}
For an internal disk snapshot, the disk format can be found from
libxl_device_disk, and snapshot_name is also included in this structure.
So this structure is enough.
Am I missing something in your concern?
And what do you think about the structure? I plan to use it
to accept the inputs for all three APIs: libxl_domain_snapshot_create,
libxl_domain_snapshot_delete, and libxl_domain_snapshot_revert.
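
To make the structure above a bit more concrete, here is a rough C sketch of
it together with the constraints it implies. This is only an illustration:
the sketch_* names, the cut-down disk type and the check function are
assumptions for the example, not what the libxl IDL would actually generate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types would come from the libxl IDL. */
typedef enum { FMT_UNKNOWN, FMT_RAW, FMT_QCOW2 } sketch_disk_format;

typedef struct {
    const char *pdev_path;       /* backing file or device */
    sketch_disk_format format;   /* format of the original disk */
} sketch_device_disk;

typedef struct {
    sketch_device_disk disk;            /* original disk */
    const char *snapshot_name;
    bool external;
    sketch_disk_format external_format; /* only meaningful if external */
    const char *external_path;          /* only meaningful if external */
} sketch_disk_snapshot;

typedef struct {
    bool memory;
    const char *memory_location; /* must not be NULL if memory is true */
    int num_disks;
    sketch_disk_snapshot *disks;
} sketch_domain_snapshot_args;

/* Format that applies to a disk snapshot: for an internal snapshot it is
 * simply the format of the original disk. */
sketch_disk_format sketch_snapshot_effective_format(const sketch_disk_snapshot *d)
{
    assert(d != NULL);
    return d->external ? d->external_format : d->disk.format;
}

/* Returns 0 if the arguments are self-consistent, -1 otherwise. */
int sketch_snapshot_args_check(const sketch_domain_snapshot_args *a)
{
    if (a == NULL || a->num_disks < 0)
        return -1;
    if (a->memory && a->memory_location == NULL)
        return -1;
    if (a->num_disks > 0 && a->disks == NULL)
        return -1;
    for (int i = 0; i < a->num_disks; i++) {
        const sketch_disk_snapshot *d = &a->disks[i];
        if (d->snapshot_name == NULL)
            return -1;
        if (d->external && d->external_path == NULL)
            return -1;
    }
    return 0;
}
```

The same argument structure could then be handed to create, delete and
revert, with each of them running a check like this first.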
>
> > > Perhaps it's something like "the snapshot chain of $disk may have
> > > changed, go and figure out what to do"? Or is it more complicated than
> > > that?
> > >
> > > > If we supply libxl_domain_snapshot_delete API, to handle snapshot-chain
> > > > (parent, children, sibling), it doesn't work without info about all
> > > > snapshots.
> > > > Or, if we supply libxl_disk_snapshot_delete API, that's also weird with
> > > > this only-one disk snapshot API.
> > >
> > > So, the issue here is deleting a snapshot where there is an actively
> > > running domain using some other snapshot in the chain, is that right?
> >
> > I meant to say:
> > BASE -> SNAPSHOT A -> SNAPSHOT B
> > When the user wants to delete SNAPSHOT A, if a flag indicates deleting
> > all children, then SNAPSHOT B should be deleted too; without this flag,
> > all changes to SNAPSHOT A should by default be merged into
> > SNAPSHOT B. In either case, it needs to know the relationship between
> > SNAPSHOT A and SNAPSHOT B.
>
> If you want to delete SNAPSHOT A and all of its children then there
> cannot be any active domains using A or its children, so would it be
> necessary to go via libxl at all? (as hinted above perhaps such things
> are better handled directly in the application).
Right, this is clear now. The libxl_domain_snapshot_delete function
deletes only one snapshot, as described by the libxl_domain_snapshot(_args?)
input: it deletes the memory state file and then the related disk
snapshots. Everything else is maintained by the application, for example
whether the children VM snapshots need to be deleted too; if so, the app
calls libxl_domain_snapshot_delete again with another libxl_domain_snapshot
input.
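
As a rough sketch of that split, assuming the application keeps its own tree
of snapshot records: app_snapshot, delete_one and app_delete_with_children
are hypothetical names, and delete_one only stands in for a call to the
proposed libxl_domain_snapshot_delete.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical application-side record; libxl itself keeps no such tree. */
typedef struct app_snapshot {
    const char *name;
    struct app_snapshot *children[MAX_CHILDREN];
    int num_children;
    int deleted;
} app_snapshot;

/* Stand-in for the proposed libxl call: handles exactly one snapshot. */
static int delete_one(app_snapshot *s)
{
    s->deleted = 1;
    return 0;
}

/* Delete s and all of its descendants, children first.
 * Returns the number of snapshots deleted, or -1 on the first failure. */
int app_delete_with_children(app_snapshot *s)
{
    assert(s != NULL);
    int count = 0;
    for (int i = 0; i < s->num_children; i++) {
        int n = app_delete_with_children(s->children[i]);
        if (n < 0)
            return -1;
        count += n;
    }
    if (delete_one(s) != 0)
        return -1;
    return count + 1;
}
```

Deleting children first means a failure part-way through never leaves a
child behind whose parent is already gone.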
>
> If you are deleting A only and B is in active use then merging A into B
> would be the job of B's backend, via the call/mechanism proposed above.
> Does that seem reasonable?
Yes, at least for qcow2 there is no problem, and no extra work needs to be
done. The qemu qcow2 snapshot mechanism ensures the data integrity
of snapshot B.
>
> > Well, after thinking again, I think maybe there is no problem here:
> > libxl supplies the libxl_domain_snapshot_delete API, which does the disk
> > snapshot-delete work. At the disk level, 'all changes to SNAPSHOT A should
> > be merged to SNAPSHOT B' is ensured by qemu automatically, so libxl
> > doesn't need to take care of that.
> > Then xl or libvirt can do everything else: they scan the domain snapshot
> > information to decide which snapshots are related and should be deleted,
> > and refresh the domain snapshot chain relationship after the work.
>
> IOW it would be the application's responsibility to iterate over the
> chain and delete individual things? That sounds reasonable and makes it
> easier for us to ignore the possibility of multiple VMs at the libxl API
> layer.
Right.
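
For the "delete A only" case, the bookkeeping left to the application could
be as small as re-pointing parent links, so that BASE -> A -> B becomes
BASE -> B. A minimal sketch, with chain_node and chain_unlink as
hypothetical application-side names (the data merge itself is left to the
disk backend, as discussed above):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical application-side view of a snapshot chain. */
typedef struct chain_node {
    const char *name;
    struct chain_node *parent;   /* NULL for BASE */
} chain_node;

/* Drop victim from the chain: every node whose parent was victim is
 * re-pointed at victim's parent. Returns the number of nodes reparented. */
int chain_unlink(chain_node **nodes, int n, chain_node *victim)
{
    assert(nodes != NULL && victim != NULL);
    int moved = 0;
    for (int i = 0; i < n; i++) {
        if (nodes[i] != victim && nodes[i]->parent == victim) {
            nodes[i]->parent = victim->parent;
            moved++;
        }
    }
    victim->parent = NULL;
    return moved;
}
```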
>
> > > Specifically we aren't interested in deleting a snapshot which is
> > > actually being used right now? Would that case be handled by destroying
> > > the domain in question first?
> > >
> > > So in the simplest case we might have:
> > >
> > > BASE -> SNAPSHOT A -> SNAPSHOT B
> > >
> > > And there may be a domain running which is using any of BASE, A or B? Do
> > > we then need to be able to delete either of the other two snapshots?
> > >
> > > If a domain is running using SNAPSHOT A then I presume SNAPSHOT B can be
> > > trivially deleted without needing to communicate with that domain.
> > >
> > > If a domain is running using SNAPSHOT B and we want to delete SNAPSHOT A
> > > then the domain needs to know that it should rescan the chain to
> > > discover that it has become:
> > >
> > > BASE -> SNAPSHOT B
> > >
> > > Or is the qemu associated with the domain actually responsible for
> > > making that happen (i.e. folding the contents of SNAPSHOT A into either
> > > BASE or SNAPSHOT B as appropriate at runtime)? Or does that happen
> > > elsewhere?
> >
> > If SNAPSHOT B is based on SNAPSHOT A, then when A is deleted, changes made
> > to A will be merged into B. Qemu can ensure that; at least for a qcow2
> > disk, the qcow2 driver does.
>
> Ack.
>
> > But at the domain level, it is the high-level application that refreshes
> > the domain snapshot chain relationship. In libvirt, it is qemu_driver that
> > refreshes the chain relationship and updates the related
> > qemuDomainSnapshotObj (the .parent, .sibling, .children information, etc.).
>
> Is the "qemu_driver" here a "qemu storage driver" or a "qemu vm
> driver" (is there even such a distinction?)
Here, it's the qemu hypervisor driver, which manages all KVM VMs.
Thanks,
Chunyan
>
> Ian.
>
>
>
Thread overview: 11+ messages
2014-08-27 7:22 [RFC V6] Libxl Domain Snapshot API Design Chunyan Liu
2014-09-05 3:18 ` Chun Yan Liu
2014-09-05 12:44 ` Ian Campbell
2014-09-09 8:43 ` Chun Yan Liu
2014-09-09 9:28 ` Wei Liu
2014-09-09 10:13 ` Ian Campbell
2014-09-10 5:53 ` Chun Yan Liu
2014-09-10 12:17 ` Ian Campbell
2014-09-11 8:30 ` Chun Yan Liu
2014-09-24 12:21 ` Ian Campbell
2014-09-26 3:00 ` Chun Yan Liu