* [RFC V8 0/3] domain snapshot document
@ 2014-11-10 8:17 Chunyan Liu
2014-11-10 8:17 ` [RFC V8 1/3] libxl domain snapshot introduction Chunyan Liu
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: Chunyan Liu @ 2014-11-10 8:17 UTC (permalink / raw)
To: xen-devel; +Cc: Ian.Jackson, jfehlig, wei.liu2, Ian.Campbell, Chunyan Liu
This is a high-level document for the domain snapshot design, including
the libxl API design and the basic xl interface design.
Changes to V7:
- In libxl API:
* remove libxl_domain_snapshot_revert; let the application
do the work itself.
* remove disk-only support.
- In xl interface:
* change the incorrectly named libxl_foo structures to xl_foo structures,
* remove all disk-only syntax.
- Other changes according to Ian's comments.
V7 is here:
http://lists.xen.org/archives/html/xen-devel/2014-10/msg01163.html

* [RFC V8 1/3] libxl domain snapshot introduction
2014-11-10 8:17 [RFC V8 0/3] domain snapshot document Chunyan Liu
@ 2014-11-10 8:17 ` Chunyan Liu
2014-11-10 8:17 ` [RFC V8 2/3] libxl domain snapshot API design Chunyan Liu
2014-11-10 8:17 ` [RFC V8 3/3] xl snapshot-xxx Design Chunyan Liu
2 siblings, 0 replies; 24+ messages in thread
From: Chunyan Liu @ 2014-11-10 8:17 UTC (permalink / raw)
To: xen-devel; +Cc: Ian.Jackson, jfehlig, wei.liu2, Ian.Campbell, Chunyan Liu

Changes to V7:
* update the wording a little to avoid the confusion in v6.

===========================================================================
A domain snapshot includes disk snapshots and the saved domain state.
The domain can later be resumed to the exact state it was in when the
snapshot was created. This kind of snapshot is also referred to as a
domain checkpoint or system checkpoint. It is consistent.

A disk snapshot on its own is inconsistent if the domain is running.
In libxl, even when the domain is paused, no operation flushes data to
disk. So, for an active domain (a domain that has been started), taking
a disk-only snapshot and later resuming from it is as if the guest had
crashed. For this reason, we won't support disk-only snapshots in libxl.

Within a domain snapshot, a disk snapshot can be "internal" (as in the
qcow2 format, where snapshot and delta are both in one image file) or
"external" (snapshot in one file, delta in another).

Four types of operations are expected:

"domain snapshot create": save the domain state and take disk snapshots.
"domain snapshot revert": roll back to the state of the indicated snapshot.
"domain snapshot delete": delete the indicated domain snapshot.
"domain snapshot list": list domain snapshot information.

* [RFC V8 2/3] libxl domain snapshot API design
2014-11-10 8:17 [RFC V8 0/3] domain snapshot document Chunyan Liu
2014-11-10 8:17 ` [RFC V8 1/3] libxl domain snapshot introduction Chunyan Liu
@ 2014-11-10 8:17 ` Chunyan Liu
2014-11-10 17:04 ` George Dunlap
2014-12-05 16:06 ` Wei Liu
2014-11-10 8:17 ` [RFC V8 3/3] xl snapshot-xxx Design Chunyan Liu
2 siblings, 2 replies; 24+ messages in thread
From: Chunyan Liu @ 2014-11-10 8:17 UTC (permalink / raw)
To: xen-devel; +Cc: Ian.Jackson, jfehlig, wei.liu2, Ian.Campbell, Chunyan Liu

Changes to V7:
* remove the libxl_domain_snapshot_revert API.
* about libxl_domain_snapshot_delete: the disk snapshot part could be
  extracted to libxlu if we support many kinds of disk backend types in
  future. Words added to the docs, but not expanded on. The basic goal
  is to support only raw and qcow2.
* remove all disk-only descriptions. Disk-only snapshots won't be
  supported in xl and libxl.

===========================================================================
Libxl Domain Snapshot API

1. New Structures

    libxl_disk_snapshot = Struct("disk_snapshot",[
        # target disk
        ("disk",            libxl_device_disk),

        # disk snapshot name
        ("name",            string),

        # internal/external disk snapshot?
        ("external",        bool),

        # for an external disk snapshot, specify the following two fields
        ("external_format", string),
        ("external_path",   string),
        ])

    libxl_domain_snapshot_args = Struct("domain_snapshot_args",[
        # memory state path
        ("memory_path",     string),

        # array to store disk snapshot info
        ("disks",           Array(libxl_disk_snapshot, "num_disks")),
        ])

2. New Functions

    int libxl_domain_snapshot_create(libxl_ctx *ctx, int domid,
                                     libxl_domain_snapshot_args *snapshot,
                                     bool live);

    Creates a new snapshot of a domain based on the snapshot config
    contained in @snapshot: saves the domain and takes the disk snapshots.

        ctx (INPUT): context
        domid (INPUT): domain id
        snapshot (INPUT): configuration of the domain snapshot
        live (INPUT): live snapshot or not
        Returns: 0 on success, -1 on failure

    live:
        true or false.
        When live is 'true', the domain is not paused while the snapshot
        is being created, as in live migration. This increases the size
        of the memory dump file but reduces the downtime of the guest.

    snapshot:
        memory_path: path to save the memory state to.
        num_disks: number of disks to take a disk snapshot of.
        disks: array of disk snapshot configurations, with num_disks
               members.
            disk (libxl_device_disk): identifies the target disk.
            name: snapshot name.
            external: true or false.
                'false' means internal disk snapshot; external_format
                and external_path are ignored.
                'true' means external disk snapshot; external_path must
                be provided and external_format may be provided.
            external_format:
                Used only when 'external' is true. If not provided, the
                default format appropriate to the backend file is used.
                Ignored when 'external' is false.
            external_path:
                Must be provided when 'external' is true.
                Ignored when 'external' is false.

    int libxl_domain_snapshot_delete(libxl_ctx *ctx, int domid,
                                     libxl_domain_snapshot_args *snapshot);

    Deletes a snapshot. This deletes the related memory state file and
    disk snapshots.

        ctx (INPUT): context
        domid (INPUT): domain id
        snapshot (INPUT): domain snapshot related info
        Returns: 0 on success, -1 on error.

    Each input has the same meaning as for libxl_domain_snapshot_create.

    A libxl_domain_snapshot_revert API will not be provided, because
    reverting to a domain snapshot can be done by destroying the domain
    and then restoring a new domain from the snapshot. Applications can
    do that themselves. If it were wrapped in a libxl API, libvirt would
    not be aware of the new domain, which would cause problems in
    libvirt and do more harm than good.

3. Function Implementation

    libxl_domain_snapshot_create:
        1). validate the args
        2). save domain memory through save-domain
        3). take the disk snapshots by qmp command (if the domain is
            active) or qemu-img command (if the domain is inactive).

    libxl_domain_snapshot_delete:
        1). validate the args
        2). remove the memory state file.
        3). delete the disk snapshots (for an internal disk snapshot,
            through qmp command or qemu-img command).

    To handle disk snapshots (create or delete) for different disk
    backend types, this work could be extracted to libxlu in future.
    (The base goal is to support raw and qcow2 types only, as kvm
    currently does.)

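As a rough illustration of step "1). validate the args" above, the rules stated for `external`, `external_format` and `external_path` could be checked as in the sketch below. This is only a Python stand-in for the proposed IDL structs: the class names, the `validate_snapshot_args` helper, and the use of a plain string for the target disk are assumptions made for illustration, not part of libxl.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DiskSnapshot:
    """Hypothetical Python stand-in for the proposed libxl_disk_snapshot."""
    disk: str                              # stand-in for libxl_device_disk (e.g. a vdev name)
    name: str                              # disk snapshot name
    external: bool = False                 # False: internal snapshot, True: external
    external_format: Optional[str] = None  # optional, only used when external is True
    external_path: Optional[str] = None    # required when external is True


@dataclass
class DomainSnapshotArgs:
    """Hypothetical stand-in for the proposed libxl_domain_snapshot_args."""
    memory_path: str
    disks: List[DiskSnapshot] = field(default_factory=list)


def validate_snapshot_args(args):
    """Return a list of problems; an empty list means the args look valid."""
    problems = []
    if not args.memory_path:
        problems.append("memory_path is required")
    for d in args.disks:
        if not d.name:
            problems.append(f"{d.disk}: snapshot name is required")
        # Per the parameter description: external_path must be given for
        # external snapshots; both external_* fields are ignored otherwise.
        if d.external and not d.external_path:
            problems.append(
                f"{d.disk}: external_path is required when external is true")
    return problems
```

Returning a list of problems rather than failing on the first one is a design choice of this sketch; the proposal itself only specifies 0 on success and -1 on failure.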
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-11-10 8:17 ` [RFC V8 2/3] libxl domain snapshot API design Chunyan Liu
@ 2014-11-10 17:04 ` George Dunlap
2014-11-11 8:07 ` Chun Yan Liu
2014-12-05 16:06 ` Wei Liu
1 sibling, 1 reply; 24+ messages in thread
From: George Dunlap @ 2014-11-10 17:04 UTC (permalink / raw)
To: Chunyan Liu
Cc: Ian Jackson, Jim Fehlig, Wei Liu, Ian Campbell, xen-devel@lists.xen.org

On Mon, Nov 10, 2014 at 8:17 AM, Chunyan Liu <cyliu@suse.com> wrote:
> 3. Function Implementation
>
> libxl_domain_snapshot_create:
> 1). check args validation
> 2). save domain memory through save-domain
> 3). take disk snapshot by qmp command (if domain is active) or qemu-img
> command (if domain is inactive).

By "active" here, do you mean "live" (vs paused)?

> libxl_domain_snapshot_delete:
> 1). check args validation
> 2). remove memory state file.
> 3). delete disk snapshot. (for internal disk snapshot, through qmp
> command or qemu-img command)

Out of curiosity, why is this necessary? Is libxl keeping track of
the snapshots somewhere? Or qemu?

Or to put it a different way, since the caller knows the filenames,
why can't the caller just erase the files themselves?

-George

* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-11-10 17:04 ` George Dunlap
@ 2014-11-11 8:07 ` Chun Yan Liu
2014-11-13 3:07 ` Chun Yan Liu
0 siblings, 1 reply; 24+ messages in thread
From: Chun Yan Liu @ 2014-11-11 8:07 UTC (permalink / raw)
To: George Dunlap
Cc: Ian Jackson, Jim Fehlig, Wei Liu, Ian Campbell, xen-devel@lists.xen.org

>>> On 11/11/2014 at 01:04 AM, in message
<CAFLBxZZVqZxUouciujSTP-GJsUOquofUK6dy1K2rNXuEEb4Ekw@mail.gmail.com>, George
Dunlap <dunlapg@umich.edu> wrote:
> On Mon, Nov 10, 2014 at 8:17 AM, Chunyan Liu <cyliu@suse.com> wrote:
> > 3. Function Implementation
> >
> > libxl_domain_snapshot_create:
> > 1). check args validation
> > 2). save domain memory through save-domain
> > 3). take disk snapshot by qmp command (if domain is active) or qemu-img
> > command (if domain is inactive).
>
> By "active" here, do you mean "live" (vs paused)?

It means the domain is started (whether it is running or paused), as
opposed to a domain that libvirt has defined but not started.
I should update this step to:
    3). take disk snapshot by qmp command
since libxl only handles active domains.

> > libxl_domain_snapshot_delete:
> > 1). check args validation
> > 2). remove memory state file.
> > 3). delete disk snapshot. (for internal disk snapshot, through qmp
> > command or qemu-img command)
>
> Out of curiosity, why is this necessary? Is libxl keeping track of
> the snapshots somewhere? Or qemu?
>
> Or to put it a different way, since the caller knows the filenames,
> why can't the caller just erase the files themselves?

Ian asked the same question. The only reason I proposed an API is so
that xl and libvirt can share the code. In future, when many other disk
backend types are supported, there would otherwise be a lot of repeated
code. But as Ian mentioned on the last version, handling many disk
backend types might be better placed in libxlu. If both of you object,
I'll remove this API.

- Chunyan

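To make "take disk snapshot by qmp command" a little more concrete, the sketch below builds the QMP messages that could be sent to qemu for a qcow2 disk. The thread does not say which QMP commands would be used; `blockdev-snapshot-internal-sync`, `blockdev-snapshot-sync` and `blockdev-snapshot-delete-internal-sync` are existing QEMU commands chosen here as an assumption, and the helper names and the device name are hypothetical.

```python
def qmp_disk_snapshot_create(device, name, external=False,
                             external_path=None, external_format=None):
    """Build (but do not send) a QMP message that snapshots one disk.

    Assumption: internal snapshots map to blockdev-snapshot-internal-sync
    and external ones to blockdev-snapshot-sync.
    """
    if not external:
        return {"execute": "blockdev-snapshot-internal-sync",
                "arguments": {"device": device, "name": name}}
    if not external_path:
        raise ValueError("external_path is required for an external snapshot")
    arguments = {"device": device, "snapshot-file": external_path}
    # Leave "format" out when unset so that qemu applies its own default.
    if external_format:
        arguments["format"] = external_format
    return {"execute": "blockdev-snapshot-sync", "arguments": arguments}


def qmp_disk_snapshot_delete_internal(device, name):
    """Build a QMP message that deletes an internal snapshot by name."""
    return {"execute": "blockdev-snapshot-delete-internal-sync",
            "arguments": {"device": device, "name": name}}
```

A caller would serialise the returned dict as JSON and write it to the qemu QMP socket of the domain; how libxl would expose that is exactly the open question in this thread.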
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-11-11 8:07 ` Chun Yan Liu
@ 2014-11-13 3:07 ` Chun Yan Liu
2014-11-13 11:41 ` Ian Campbell
0 siblings, 1 reply; 24+ messages in thread
From: Chun Yan Liu @ 2014-11-13 3:07 UTC (permalink / raw)
To: Chun Yan Liu, George Dunlap
Cc: Ian Jackson, Jim Fehlig, Wei Liu, Ian Campbell, xen-devel@lists.xen.org

>>> On 11/11/2014 at 04:07 PM, in message
<5462343C020000660007880A@soto.provo.novell.com>, "Chun Yan Liu"
<cyliu@suse.com> wrote:
> >>> On 11/11/2014 at 01:04 AM, in message
> <CAFLBxZZVqZxUouciujSTP-GJsUOquofUK6dy1K2rNXuEEb4Ekw@mail.gmail.com>, George
> Dunlap <dunlapg@umich.edu> wrote:
> > On Mon, Nov 10, 2014 at 8:17 AM, Chunyan Liu <cyliu@suse.com> wrote:
> > > 3. Function Implementation
> > >
> > > libxl_domain_snapshot_create:
> > > 1). check args validation
> > > 2). save domain memory through save-domain
> > > 3). take disk snapshot by qmp command (if domain is active) or qemu-img
> > > command (if domain is inactive).
> >
> > By "active" here, do you mean "live" (vs paused)?
>
> Means the domain is started (no matter is running or paused).
> vs (libvirt defines a domain but not started).
> Here, I should update this to:
> 3). take disk snapshot by qmp command
> libxl only handles active domain.
>
> > > libxl_domain_snapshot_delete:
> > > 1). check args validation
> > > 2). remove memory state file.
> > > 3). delete disk snapshot. (for internal disk snapshot, through qmp
> > > command or qemu-img command)
> >
> > Out of curiosity, why is this necessary? Is libxl keeping track of
> > the snapshots somewhere? Or qemu?
> >
> > Or to put it a different way, since the caller knows the filenames,
> > why can't the caller just erase the files themselves?
>
> Ian asks the same question. The only reason I propose an API is:
> xl and libvirt can share the code. And in future, when support many other
> disk backend types, there is much repeated code. But as Ian mentioned in
> last version, for handling many disk backend types, maybe better placed in
> libxlu. Well, if both of you object, I'll remove this API.

Similarly to snapshot delete, the work of libxl_domain_snapshot_create
is in fact to save memory via domain_save and to take the disk snapshots
(by qmp; qemu-img can also do that). Considering xl alone, a new libxl
API is not strictly necessary. But xl and libvirt could share the code
if it were wrapped in an API. So, which is preferred? Any opinions?

- Chunyan

* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-11-13 3:07 ` Chun Yan Liu
@ 2014-11-13 11:41 ` Ian Campbell
2014-11-25 9:08 ` Chun Yan Liu
0 siblings, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2014-11-13 11:41 UTC (permalink / raw)
To: Chun Yan Liu
Cc: Ian Jackson, George Dunlap, Wei Liu, Jim Fehlig, xen-devel@lists.xen.org

On Wed, 2014-11-12 at 20:07 -0700, Chun Yan Liu wrote:
> > > By "active" here, do you mean "live" (vs paused)?
> > Means the domain is started (no matter is running or paused).
> > vs (libvirt defines a domain but not started).
> > Here, I should update this to:
> > 3). take disk snapshot by qmp command
> > libxl only handles active domain.

I think the problem here is that different components in the system use
different terminology for things or even different concepts (e.g. libxl
has no inherent concept of inactive vs active domains, because it only
concerns itself with active domains).

Perhaps a glossary defining these things would help (also see below).

> > > > libxl_domain_snapshot_delete:
> > > > 1). check args validation
> > > > 2). remove memory state file.
> > > > 3). delete disk snapshot. (for internal disk snapshot, through qmp
> > > > command or qemu-img command)
> > >
> > > Out of curiosity, why is this necessary? Is libxl keeping track of
> > > the snapshots somewhere? Or qemu?
> > >
> > > Or to put it a different way, since the caller knows the filenames,
> > > why can't the caller just erase the files themselves?
> >
> > Ian asks the same question. The only reason I propose an API is:
> > xl and libvirt can share the code. And in future, when support many other
> > disk backend types, there is much repeated code. But as Ian mentioned in
> > last version, for handling many disk backend types, maybe better placed in
> > libxlu. Well, if both of you object, I'll remove this API.

I think the reason we are having these same discussions over again is
that this proposal is focusing on the libxl API (e.g. the details of
what functions exist and what parameters they take) without an
introductory section which provides a broad overview of the
architecture, containing e.g. things like:

      * What the general requirements for domain snapshotting are;
      * What are the constraints which we are operating under; e.g.
        libvirt or xl design requirements
      * What the various components are (and which, possibly multiple,
        entities provide them) and where the various responsibilities
        lie.

I think we've teased a lot of this sort of thing out in past iterations
but without having it written down here I think we are all having
trouble agreeing (or remembering that we've agreed) that the API makes
sense because we all have different ideas about what the higher level
architecture/abstraction should look like.

See for example
http://xenbits.xen.org/people/dvrabel/event-channels-H.pdf or
http://lists.xen.org/archives/html/xen-devel/2014-10/msg03235.html (you
don't necessarily need to go all out on that level of formality, but
they provide some examples of the sorts of higher level design I'm
talking about)

I think it would also help with the glossary question above since it
would help define the terms.

I'm sorry for not observing this sooner.

Ian.

* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-11-13 11:41 ` Ian Campbell
@ 2014-11-25 9:08 ` Chun Yan Liu
2014-11-28 15:43 ` Ian Campbell
0 siblings, 1 reply; 24+ messages in thread
From: Chun Yan Liu @ 2014-11-25 9:08 UTC (permalink / raw)
To: Ian Campbell
Cc: Ian Jackson, Jim Fehlig, Wei Liu, George Dunlap, xen-devel@lists.xen.org

Hi, Ian,

According to the previous discussion, snapshot delete and revert are
likely to be done by the high-level application itself, so no libxl API
would be supplied for them. I'm wondering whether snapshot create needs
a new common API. Its main work is in fact to save the domain and take
disk snapshots, which xl can do too.

I have written down an overview of the snapshot work (see below).
The question is: do we need to export an API, and if so, what kind?
While updating Bamvor's code, I came to think that xl can do all the
work, and libvirt can do it too, even without libxl's help.

Of course, some things would be easier to use if put in libxl. For the
domain snapshot info structure, for example, gentype.py would directly
generate useful init/dispose/to_json/from_json functions. Or the disk
snapshot part could be extracted and placed in libxl or libxlu.

Any suggestions about which parts would be better extracted as libxl
API, and which not?

Thanks,
Chunyan

------------------------------------------------------------------------------------------------------
libxl domain snapshot overview

0. Glossary

* Active domain: a domain that has been created and started.
* Inactive domain: a domain that has been created but not started.
* Domain snapshot:
  A domain snapshot is a system checkpoint of a domain. It contains
  the memory status at the checkpoint and the disk status.
* Disk-only snapshot:
  A disk-only snapshot keeps only the status of the disk, without
  saving the memory status. It is a special kind of domain snapshot.
  It is valid when the domain is inactive, or when the domain is paused
  and all cached data has been flushed to disk. Otherwise, a disk-only
  snapshot captures a useless, inconsistent state.

1. Purpose

A domain snapshot is a system checkpoint of a domain. Later, one can
roll the domain back to that checkpoint. It is a very useful backup
function. A domain snapshot contains the memory status at the
checkpoint and the disk status (which we call the disk snapshot).

Domain snapshot functionality should include:
* create a domain snapshot
* roll back (or "revert") to a domain snapshot
* delete a domain snapshot
* list all domain snapshots

What domain snapshot supports and does not support:
* supports live snapshots
* supports internal and external disk snapshots
* supports different disk backend types
* supports chained snapshots
* does not support taking a snapshot while the domain is shutting down
  or dying
* does not support disk-only snapshots [1]

[1]
This is different from libvirt.
xl is concerned only with active domains, and even when a domain is
paused, no operation flushes data to disk. So taking a disk-only
snapshot and later resuming from it is as if the guest had crashed.
For this reason, a disk-only snapshot is meaningless to xl and should
not be supported.

libvirt has both active and inactive domains. For active domains, as
with xl, taking a disk-only snapshot is meaningless, but for inactive
domains a disk-only snapshot is valid and should be supported.

2. Requirements

General requirements:
* ability to save/restore domain memory
* ability to create/delete/apply disk snapshots [2]
* ability to parse the user config file
* ability to save/load/update domain snapshot metadata (also called
  domain snapshot info). The metadata includes at least:
  snapshot name, create time, description, memory state file,
  disk snapshot info, parent (in the snapshot chain), current (is
  currently applied)

[2] Disk snapshot requirements:
* external tools: qemu-img, lvcreate, vhd-util, etc.
* As a basic goal, we support the 'raw' and 'qcow2' backend types only.
  That only requires qemu:
  use a libxl qmp command (better) or "qemu-img"

3. Interaction with other operations

Generally, when a domain is deleted, all of its snapshots should be
deleted first.

4. General workflow

Create a snapshot:
* parse the user cfg file if one is passed in
* validate the parameters
* check that the snapshot operation is allowed
* save the domain, writing the memory status to a file (refer to:
  save_domain)
* take the disk snapshots (call qmp command)
* snapshot chain info:
  - get the domain snapshot list (this retrieves all snapshot metadata
    files and returns a list)
  - check whether the domain is currently on some snapshot; if yes, that
    snapshot is the 'parent' of our snapshot.
* save the snapshot metadata to a json file
  (saving/loading/retrieving snapshot metadata files is similar to
  saving/loading libxl domain config files.)

Delete a snapshot:
* get the snapshot info (retrieve the corresponding snapshot metadata
  file and parse it into snapshot info)
* according to the options, get the snapshot chain info
  - get the domain snapshot list (retrieves all snapshot metadata files
    and returns a list)
  - find the parent and children of this snapshot
* delete this snapshot, or this snapshot plus its children (according
  to the options)
  - remove the memory state file (unlink)
  - delete the disk snapshots (call qmp command)
  - update the snapshot metadata files of the children (if not deleted),
    changing 'parent'.
  - delete the snapshot metadata file of this snapshot

Revert:
* get the snapshot info (retrieve the corresponding snapshot metadata
  file and parse it into snapshot info)
* destroy this domain
* create a new domain from the snapshot info
  - apply the disk snapshots (qemu-img)
  - a process like restoring a domain
* update the snapshot metadata, setting 'current'.

List:
* get the snapshot info list (retrieves all snapshot metadata files and
  returns a list)
* print it in a certain format according to the info list

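The metadata handling described in the "General workflow" of the overview (a 'parent' and a 'current' field per snapshot, stored as JSON, with children re-parented when a snapshot in the middle of a chain is deleted) might look roughly like the sketch below. All names are hypothetical. Two points are assumptions rather than statements from the overview: a newly created snapshot becomes the 'current' one, and children of a deleted snapshot are attached to its parent (the overview only says their 'parent' is changed).

```python
import json


def create_snapshot(snapshots, name, memory_path):
    """Record a new snapshot; its parent is the currently applied snapshot."""
    parent = next((s["name"] for s in snapshots.values() if s["current"]), None)
    # Assumption: the new snapshot becomes the current one.
    for s in snapshots.values():
        s["current"] = False
    snapshots[name] = {"name": name, "memory_path": memory_path,
                       "parent": parent, "current": True}
    return snapshots[name]


def delete_snapshot(snapshots, name):
    """Drop one snapshot's metadata and re-parent its children."""
    removed = snapshots.pop(name)
    for s in snapshots.values():
        if s["parent"] == name:
            # Assumption: children move up to the deleted snapshot's parent.
            s["parent"] = removed["parent"]
    return removed


def dump_metadata(snapshots):
    """Serialise the metadata, e.g. for one JSON file per domain."""
    return json.dumps(snapshots, sort_keys=True, indent=2)


def load_metadata(text):
    """Parse metadata previously written by dump_metadata."""
    return json.loads(text)
```

Removing the memory state file and the disk snapshots themselves is deliberately left out; this sketch only covers the bookkeeping that the overview assigns to the snapshot metadata files.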
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-11-25 9:08 ` Chun Yan Liu
@ 2014-11-28 15:43 ` Ian Campbell
2014-12-03 6:14 ` Chun Yan Liu
0 siblings, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2014-11-28 15:43 UTC (permalink / raw)
To: Chun Yan Liu
Cc: Ian Jackson, Jim Fehlig, Wei Liu, George Dunlap, xen-devel@lists.xen.org

On Tue, 2014-11-25 at 02:08 -0700, Chun Yan Liu wrote:
> Hi, Ian,
>
> According to previous discussion, snapshot delete and revert are
> inclined to be done by high level application itself, won't supply a
> libxl API.

I thought you had explained a scenario where the toolstack needed to be
at least aware of delete, specifically when you are deleting a snapshot
from the middle of an active chain.

Maybe that's not "snapshot delete API in libxl" though, but rather a
notification API which the toolstack can use to tell libxl something is
going on.

> I'm wondering snapshot create need a new common API?
> In fact its main work is save domain and take disk snapshot, xl can
> do it too.

I don't believe xl can take a disk snapshot of an active disk, it
doesn't have the machinery to deal with that sort of thing, nor should
it, this is exactly the sort of thing which libxl is provided to deal
with.

Also, libxl is driving the migration/memory snapshot, and I think the
disk snapshot fundamentally needs to be involved in that process, not
done separately by the toolstack.

> I just write down an overview of the snapshot work (see below).
> The problem is: do we need to export API? What kind of API?
> In updating Bamvor's code, I think xl can do all the work, libvirt can
> do the work too even without libxl's help.
>
> Of course, there are some thing if put in libxl, it will be easier to
> use, like the domain snapshot info structure, gentype.py will
> directly generate useful init/dispose/to_json/from_json functions.
> Or the disk snapshot part can be extracted and placed in libxl or libxlu.
>
> Any suggestions about which part is better to be extracted as libxl
> API or better not?
>
> Thanks,
> Chunyan
>
> ------------------------------------------------------------------------------------------------------
> libxl domain snapshot overview

Just to be 100% clear: This is an overview of a domain snapshot
architecture for a toolstack which uses libxl. A bunch of the things
described here belong to the toolstack and not to libxl itself.

I've tried to read with that in mind but a complete document should
mention this and be careful to be clear about the distinction where it
matters.

> 0. Glossary
[...]
> * not support disk-only snapshot [1].
>
> [1]
> This is different from "libvirt".
> To xl, it only concerns active domains, and even when domain
> is paused, there is no data flush to disk operation. So, take
> a disk-only snapshot and then resume, it is as if the guest
> had crashed. For this reason, disk-only snapshot is meaningless
> to xl. Should not support.
>
> To libvirt, it has active domains and inactive domains, for
> the active domains, as "xl", it's meaningless to take disk-only
> snapshot, but for inactive domains, disk-only snapshot is valid.
> Should support.

Do you mean to say here that disk-only snapshots are not supported in
some toolstacks, or in no toolstack? Or are you just saying that libxl
doesn't need to support them because they only apply to inactive
domains?

In either case it seems to me like your footnote is saying that you *do*
want to support disk-only snapshots, at least in some stacks and/or
configurations.

I think you probably mean to say that disk-only snapshots of *active*
domains are not supported. Whereas disk-only snapshots of inactive
domains may or may not be depending on the toolstack.

> 2. Requirements
>
> General Requirements:
> * ability to save/restore domain memory
> * ability to create/delete/apply disk snapshot [2]
> * ability to parse user config file
> * ability to save/load/update domain snapshot metadata (or called
> domain snapshot info, the metadata at least includes:
> snapshot name, create time, description, memory state file,
> disk snapshot info, parent (in snapshot chain), current (is
> currently applied))
>
> [2] Disk snapshot requirements:
> * external tools: qemu-img, lvcreate, vhd-util, etc.
> * For a basic goal, we support 'raw' and 'qcow2' backend types only.
> Then only requires qemu:
> use libxl qmp command (better) or "qemu-img"

You should leave these implementation details for a later section, in
this context they just invite quibbling about whether things belong in
libxl etc and whether qmp commands are "better".

The rest looks ok, but without the remainder of the design described in
terms of the concepts given here it's hard to comment further.

I'd suggest putting this all into one coherent document (not 3 as
before) which starts by describing the terminology (section 0 in your
mail which I'm replying to now), then gives an overview of the
architecture (the rest of that mail), then describe which components
(libxl, toolstack, etc) implement each bit of the architecture, then
describe the libxl API which makes this possible (covered in previous
mails I think).

I think you have most of the words either here or from the other mails,
they just need putting together into a single thing and going through to
make sure that they use the same terminology and describe the same
things etc.

Please take a look at
http://xenbits.xen.org/people/dvrabel/event-channels-H.pdf or
http://lists.xen.org/archives/html/xen-devel/2014-10/msg03235.html for
examples of the sort of cohesive document I mean.

Ian.

* Re: [RFC V8 2/3] libxl domain snapshot API design 2014-11-28 15:43 ` Ian Campbell @ 2014-12-03 6:14 ` Chun Yan Liu 2014-12-05 14:02 ` Ian Campbell 0 siblings, 1 reply; 24+ messages in thread From: Chun Yan Liu @ 2014-12-03 6:14 UTC (permalink / raw) To: Ian Campbell Cc: Ian Jackson, Jim Fehlig, Wei Liu, George Dunlap, xen-devel@lists.xen.org >>> On 11/28/2014 at 11:43 PM, in message <1417189409.23604.62.camel@citrix.com>, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Tue, 2014-11-25 at 02:08 -0700, Chun Yan Liu wrote: > > Hi, Ian, > > > > According to previous discussion, snapshot delete and revert are > > inclined to be done by high level application itself, won't supply a > > libxl API. > > I thought you had explained a scenario where the toolstack needed to be > at least aware of delete, specifically when you are deleting a snapshot > from the middle of an active chain. The reason why I post such an overview here before sending next version is: I'm puzzled about what should be in libxl and what in toolstack after previous discussion. So posted here to seek some ideas or agreement first. It's not a full design, not break down to libxl and toolstack yet. > > Maybe that's not "snapshot delete API in libxl" though, but rather a > notification API which the toolstack can use to tell libxl something is > going on. About notification API, after looking at lvm, vhd-util and qcow2, I don't think we need it. No extra work needs to do to handle disk snapshot chain. lvm: doesn't support snapshot of snapshot. vhd-util: backing file chain, external snapshot. Don't need to delete the disk snapshot when deleting domain snapshot. qcow2: * internal disk snapshot: each snapshot increases the refcount of data, deleting snapshot only decrease the refcount, won't affect other snapshots. * external disk snapshot: same as vhd-util, backing file chain. Don't need to delete disk snapshot when deleting domain snapshot. > > > I'm wondering snapshot create need a new common API? 
> > In fact its main work is save domain and take disk snapshot, xl can
> > do it too.

For saving memory, there is already API for that. The missing part is
taking disk snapshot.

>
> I don't believe xl can take a disk snapshot of an active disk, it
> doesn't have the machinery to deal with that sort of thing, nor should
> it, this is exactly the sort of thing which libxl is provided to deal
> with.

Like delete a disk snapshot, xl can call external command to do that
(e.g. qemu-img). But it's better to call qmp to do that.

Anyway, if for domain snapshot create, we should put creating disk
snapshot process in libxl, then for domain snapshot delete, we
should put deleting disk snapshot process in libxl. That is, in libxl
there should be:
    libxl_disk_snapshot_create (which handles creating disk snapshot)
    libxl_disk_snapshot_delete (which handles deleting disk snapshot)

Otherwise I would think it's weird to have in libxl:
    libxl_domain_snapshot_create (wrap saving memory [already has API]
                                  and creating disk snapshot)
    libxl_disk_snapshot_delete (deleting disk snapshot)

>
> Also, libxl is driving the migration/memory snapshot, and I think the
> disk snapshot fundamentally needs to be involved in that process, not
> done separately by the toolstack.
>
> > I just write down an overview of the snapshot work (see below).
> > The problem is: do we need to export API? What kind of API?
> > In updating Bamvor's code, I think xl can do all the work, libvirt can
> > do the work too even without libxl's help.
> >
> > Of course, there are some thing if put in libxl, it will be easier to
> > use, like the domain snapshot info structure, gentypes.py will
> > directly generate useful init/dispose/to_json/from_json functions.
> > Or the disk snapshot part can be extracted and placed in libxl or libxlu.
And about the snapshot json file store and retrieve, using
gentypes.py to autogenerate xx_to_json and xx_from_json functions
is very convenient, there would be a group of functions
set/get/update/delete_snapshot_metadata based on that.
But I didn't see other such usage in xl, and it's not proper to
place in libxl. Anywhere could it be placed but used by xl?
Wei might have some ideas about this?

-Chunyan

> >
> > Any suggestions about which part is better to be extracted as libxl
> > API or better not?
> >
> > Thanks,
> > Chunyan
> >
> >
> -----------------------------------------------------------------------------
> -------------------------
> > libxl domain snapshot overview
>
> Just to be 100% clear: This is an overview of a domain snapshot
> architecture for a toolstack which uses libxl. A bunch of the things
> described here belong to the toolstack and not to libxl itself.
>
> I've tried to read with that in mind but a complete document should
> mention this and be careful to be clear about the distinction where it
> matters.
>
> > 0. Glossary
> [...]
> > * not support disk-only snapshot [1].
> >
> > [1]
> > This is different from "libvirt".
> > To xl, it only concerns active domains, and even when domain
> > is paused, there is no data flush to disk operation. So, take
> > a disk-only snapshot and then resume, it is as if the guest
> > had crashed. For this reason, disk-only snapshot is meaningless
> > to xl. Should not support.
> >
> > To libvirt, it has active domains and inactive domains, for
> > the active domains, as "xl", it's meaningless to take disk-only
> > snapshot, but for inactive domains, disk-only snapshot is valid.
> > Should support.
>
> Do you mean to say here that disk-only snapshots are not supported in
> some toolstacks, or in no toolstack? Or are you just saying that libxl
> doesn't need to support them because they only apply to inactive
> domains?
>
> In either case it seems to me like your footnote is saying that you *do*
> want to support disk-only snapshots, at least in some stacks and/or
> configurations.
>
> I think you probably mean to say that disk-only snapshots of *active*
> domains are not supported. Whereas disk-only snapshots of inactive
> domains may or may not be depending on the toolstack.
>
> >
> > 2. Requirements
> >
> > General Requirements:
> >   * ability to save/restore domain memory
> >   * ability to create/delete/apply disk snapshot [2]
> >   * ability to parse user config file
> >   * ability to save/load/update domain snapshot metadata (or called
> >     domain snapshot info, the metadata at least includes:
> >     snapshot name, create time, description, memory state file,
> >     disk snapshot info, parent (in snapshot chain), current (is
> >     currently applied))
> >
> > [2] Disk snapshot requirements:
> >   * external tools: qemu-img, lvcreate, vhd-util, etc.
> >   * For a basic goal, we support 'raw' and 'qcow2' backend types only.
> >     Then only requires qemu:
> >     use libxl qmp command (better) or "qemu-img"
>
> You should leave these implementation details for a later section, in
> this context they just invite quibbling about whether things belong in
> libxl etc and whether qmp commands are "better".
>
> The rest looks ok, but without the remainder of the design described in
> terms of the concepts given here it's hard to comment further.
>
> I'd suggest putting this all into one coherent document (not 3 as
> before) which starts by describing the terminology (section 0 in your
> mail which I'm replying to now), then gives an overview of the
> architecture (the rest of that mail), then describe which components
> (libxl, toolstack, etc) implement each bit of the architecture, then
> describe the libxl API which makes this possible (covered in previous
> mails I think).
>
> I think you have most of the words either here or from the other mails,
> they just need putting together into a single thing and going through to
> make sure that they use the same terminology and describe the same
> things etc.
>
> Please take a look at
> http://xenbits.xen.org/people/dvrabel/event-channels-H.pdf or
> http://lists.xen.org/archives/html/xen-devel/2014-10/msg03235.html for
> examples of the sort of cohesive document I mean.
>
> Ian.
>
>
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-03 6:14 ` Chun Yan Liu
@ 2014-12-05 14:02 ` Ian Campbell
0 siblings, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2014-12-05 14:02 UTC (permalink / raw)
To: Chun Yan Liu
Cc: Ian Jackson, Jim Fehlig, Wei Liu, George Dunlap, xen-devel@lists.xen.org

On Tue, 2014-12-02 at 23:14 -0700, Chun Yan Liu wrote:
>
> >>> On 11/28/2014 at 11:43 PM, in message <1417189409.23604.62.camel@citrix.com>,
> Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Tue, 2014-11-25 at 02:08 -0700, Chun Yan Liu wrote:
> > > Hi, Ian,
> > >
> > > According to previous discussion, snapshot delete and revert are
> > > inclined to be done by high level application itself, won't supply a
> > > libxl API.
> >
> > I thought you had explained a scenario where the toolstack needed to be
> > at least aware of delete, specifically when you are deleting a snapshot
> > from the middle of an active chain.
>
> The reason why I post such an overview here before sending next
> version is: I'm puzzled about what should be in libxl and what
> in toolstack after previous discussion. So posted here to seek
> some ideas or agreement first. It's not a full design, not broken
> down to libxl and toolstack yet.

I guess I thought we had gotten closer to this than we actually have.

> > Maybe that's not "snapshot delete API in libxl" though, but rather a
> > notification API which the toolstack can use to tell libxl something is
> > going on.
>
> About notification API, after looking at lvm, vhd-util and qcow2,
> I don't think we need it. No extra work needs to be done to handle
> disk snapshot chain.
> lvm: doesn't support snapshot of snapshot.
> vhd-util: backing file chain, external snapshot. Don't need to
>   delete the disk snapshot when deleting domain snapshot.
> qcow2:
>   * internal disk snapshot: each snapshot increases the refcount
>     of data, deleting snapshot only decreases the refcount, won't
>     affect other snapshots.
>   * external disk snapshot: same as vhd-util, backing file chain.
>     Don't need to delete disk snapshot when deleting domain snapshot.

You don't need to, but might a toolstack (or user) want to consolidate
anyway, e.g. to reduce chain length? (which might otherwise be overly
long.)

> > I don't believe xl can take a disk snapshot of an active disk, it
> > doesn't have the machinery to deal with that sort of thing, nor should
> > it, this is exactly the sort of thing which libxl is provided to deal
> > with.
>
> Like delete a disk snapshot, xl can call external command to do that
> (e.g. qemu-img). But it's better to call qmp to do that.

The toolstack (xl or libvirt) doesn't have direct access to qmp, it
would have to go via a libxl API, for an Active domain at least.
qemu-img is the right answer for an Inactive domain.

Secondly, the disk snapshot has to happen while the domain is
paused/quiesced for consistency. This happens deep in the bowels of the
libxl save/restore code. So either libxl has to do the disk snapshots at
the same time or we need a callback to the toolstack in order for it to
make the snapshots.

> Anyway, if for domain snapshot create, we should put creating disk
> snapshot process in libxl, then for domain snapshot delete, we
> should put deleting disk snapshot process in libxl. That is, in libxl
> there should be:
>     libxl_disk_snapshot_create (which handles creating disk snapshot)
>     libxl_disk_snapshot_delete (which handles deleting disk snapshot)
>
> Otherwise I would think it's weird to have in libxl:
>     libxl_domain_snapshot_create (wrap saving memory [already has API]
>                                   and creating disk snapshot)
>     libxl_disk_snapshot_delete (deleting disk snapshot)

The create and delete cases are subtly different, so it may be that the
API ends up asymmetric.

The create mechanism (whichever one it is) operates on a single Active
domain and is reasonably well defined.

The delete operation however can potentially operate on multiple Active
domains, e.g. 2 domains are running with a common ancestor snapshot
which is being removed. How would the delete interface deal with this
case? In particular without libxl becoming involved in "storage
management".

The reason I'm thinking of a "delete notify" style interface for Active
domains is that it then applies to a single Active domain at a time. If
multiple domains are affected by a snapshot deletion then the
notification is called multiple times.

> And about the snapshot json file store and retrieve, using
> gentypes.py to autogenerate xx_to_json and xx_from_json functions
> is very convenient, there would be a group of functions
> set/get/update/delete_snapshot_metadata based on that.
> But I didn't see other such usage in xl, and it's not proper to
> place in libxl. Anywhere could it be placed but used by xl?
> Wei might have some ideas about this?

xl hasn't needed to use the autogeneration infrastructure to date, but
there's no reason why it couldn't do so if there was a need. Just create
xl_types.idl and hook it into the Makefile.

It would be harder to extend this to other toolstacks, but I suspect we
don't need to.

Ian.
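
As an illustration of the "delete notify" shape described above, here is
a minimal sketch in C. Everything in it is hypothetical: the
active_domain struct, snap_delete_notify() and notify_affected_domains()
are invented names for this example and are not libxl interfaces. It
assumes the toolstack already knows which snapshots sit in each Active
domain's disk chain, and only shows the "one notification per affected
Active domain" pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CHAIN 8

/* Hypothetical toolstack-side record of an Active domain and the
 * snapshot names currently present in its disk backing chain. */
typedef struct active_domain {
    uint32_t domid;
    const char *chain[MAX_CHAIN]; /* snapshot names, NULL-terminated */
    int notified;                 /* notifications received so far */
} active_domain;

/* Hypothetical per-domain notification. A real implementation would
 * tell libxl that a snapshot in this one domain's chain is going away;
 * this stub only counts the call. */
static int snap_delete_notify(active_domain *d, const char *snapshot)
{
    assert(d != NULL && snapshot != NULL);
    d->notified++;
    return 0;
}

/* Returns 1 if the named snapshot appears in the domain's chain. */
static int domain_uses_snapshot(const active_domain *d, const char *snapshot)
{
    for (size_t i = 0; i < MAX_CHAIN && d->chain[i] != NULL; i++)
        if (strcmp(d->chain[i], snapshot) == 0)
            return 1;
    return 0;
}

/* Toolstack-side loop: the notification is issued once per affected
 * Active domain. Returns the number of domains notified, or -1 on the
 * first failure. */
static int notify_affected_domains(active_domain *doms, size_t n,
                                   const char *snapshot)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        if (!domain_uses_snapshot(&doms[i], snapshot))
            continue;
        if (snap_delete_notify(&doms[i], snapshot) != 0)
            return -1;
        count++;
    }
    return count;
}
```

With two domains sharing a common ancestor snapshot, the loop calls the
notification twice, once per domain, and the storage-management decision
(which snapshots exist and who uses them) stays in the toolstack.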
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-11-10 8:17 ` [RFC V8 2/3] libxl domain snapshot API design Chunyan Liu
2014-11-10 17:04 ` George Dunlap
@ 2014-12-05 16:06 ` Wei Liu
2014-12-05 16:11 ` Ian Campbell
2014-12-08 7:34 ` Chun Yan Liu
1 sibling, 2 replies; 24+ messages in thread
From: Wei Liu @ 2014-12-05 16:06 UTC (permalink / raw)
To: Chunyan Liu; +Cc: Ian.Jackson, jfehlig, wei.liu2, Ian.Campbell, xen-devel

I have to admit I'm confused by the back and forth discussion. It's hard
to justify the design of new API without knowing what the constraints
and requirements are from your PoV.

Here are my two cents, not about details, but about general constraints.

There are two layers, one is user of libxl (clients -- xl, libvirt etc)
and libxl (the library itself).

1. it's better to *not* have storage management in libxl.

It's likely that clients can have their own management functionality
already. I'm told that libvirt has that as well as XAPI. Having this
functionality in libxl is a bit redundant and requires lots of work
(enlighten libxl on what a disk looks like and call out to various
utilities).

2. it's *not* a requirement for xl to have the capability to manage
   snapshots.

It's the same argument that xl has no idea on how to manage snapshots
created by "xl save". This should ease your concern on having to
duplicate code for libvirt and xl. IMHO the xl only needs to have the
capability to create a snapshot and create a domain from a snapshot. The
downside is that now xl and libvirt are disconnected, but I think it's
fine. The argument is that you're not allowed to run two toolstacks on
the same host (think about xl and xend in previous releases).

Do these two constraints make your work easier (or harder)?

Regarding JSON API, as Ian said, feel free to hook it up to libxlu.

Wei.
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-05 16:06 ` Wei Liu
@ 2014-12-05 16:11 ` Ian Campbell
2014-12-05 16:22 ` Wei Liu
2014-12-08 7:34 ` Chun Yan Liu
1 sibling, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2014-12-05 16:11 UTC (permalink / raw)
To: Wei Liu; +Cc: Ian.Jackson, jfehlig, Chunyan Liu, xen-devel

On Fri, 2014-12-05 at 16:06 +0000, Wei Liu wrote:
> Regarding JSON API, as Ian said, feel free to hook it up to libxlu.

*If* it is useful to multiple toolstacks but not suitable for libxl then
libxlu would be the right place.

As I understood things the need for JSON here was xl specific, and it is
IMHO fine for xl to also use the idl infrastructure, without needing to
launder it via libxlu.

Ian.
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-05 16:11 ` Ian Campbell
@ 2014-12-05 16:22 ` Wei Liu
0 siblings, 0 replies; 24+ messages in thread
From: Wei Liu @ 2014-12-05 16:22 UTC (permalink / raw)
To: Ian Campbell; +Cc: Ian.Jackson, jfehlig, Wei Liu, Chunyan Liu, xen-devel

On Fri, Dec 05, 2014 at 04:11:48PM +0000, Ian Campbell wrote:
> On Fri, 2014-12-05 at 16:06 +0000, Wei Liu wrote:
> > Regarding JSON API, as Ian said, feel free to hook it up to libxlu.
>
> *If* it is useful to multiple toolstacks but not suitable for libxl then
> libxlu would be the right place.
>
> As I understood things the need for JSON here was xl specific, and it is
> IMHO fine for xl to also use the idl infrastructure, without needing to
> launder it via libxlu.
>

Hmm... I was thinking that if, by any chance, Chunyan wants to unify
xl's and libvirt's knowledge of a domain snapshot, it can go into
libxlu. I'm no libvirt expert though. If libvirt doesn't need this then
putting it in xl is enough.

Wei.

> Ian.
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-05 16:06 ` Wei Liu
2014-12-05 16:11 ` Ian Campbell
@ 2014-12-08 7:34 ` Chun Yan Liu
2014-12-08 11:12 ` Wei Liu
1 sibling, 1 reply; 24+ messages in thread
From: Chun Yan Liu @ 2014-12-08 7:34 UTC (permalink / raw)
To: Wei Liu; +Cc: Ian.Jackson, Jim Fehlig, Ian.Campbell, xen-devel

>>> On 12/6/2014 at 12:06 AM, in message
<20141205160615.GA24938@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
wrote:
> I have to admit I'm confused by the back and forth discussion. It's hard
> to justify the design of new API without knowing what the constraints
> and requirements are from your PoV.
>
> Here are my two cents, not about details, but about general constraints.
>
> There are two layers, one is user of libxl (clients -- xl, libvirt etc)
> and libxl (the library itself).
>
> 1. it's better to *not* have storage management in libxl.
>
> It's likely that clients can have their own management functionality
> already. I'm told that libvirt has that as well as XAPI. Having this
> functionality in libxl is a bit redundant and requires lots of work
> (enlighten libxl on what a disk looks like and call out to various
> utilities).

Thanks Wei and Ian for your reply. We did have much discussion around
can/cannot (e.g. xl can finish disk snapshot?) and should/should not
(e.g. disk snapshot process should not be in xl? domain_snapshot_delete
should not be in libxl?), and it got confusing because we have different
ideas. So, settling it down is helpful.

Talking about libvirt, it does provide storage management but through
storage pools and volumes. But usually, we don't use storage pool/vol
but directly use backend files, then libvirt storage driver can not
manage them. And for libvirt vol, functionality in storage driver is
limited, at least 'snapshot' cannot be done. So, libvirt side also
needs to add code to handle disk snapshot, quite like xl does.
Following the constraint that it's better NOT to supply disk snapshot
functions in libxl, then we let xl and libvirt do that by themselves
separately, that's OK.

Then I think NO new API needs to be exported in libxl, since:
* saving/restoring memory, there are already APIs.
* disk snapshot work is xl internal, can be put in xl (or xlu).
* handle JSON files is xl internal, can be put in xl.
(these are the main work vm snapshot handles).

Right?

This is quite different from previous document, so better to confirm.

>
> 2. it's *not* a requirement for xl to have the capability to manage
>    snapshots.
>
> It's the same argument that xl has no idea on how to manage snapshots
> created by "xl save". This should ease your concern on having to
> duplicate code for libvirt and xl. IMHO the xl only needs to have the
> capability to create a snapshot and create a domain from a snapshot.

This way it's much easier since we don't need to maintain the snapshot
info in file and don't need to take care of snapshot chain. But I doubt if
that's good?
1. from user's side, it's a very common request to list all snapshots.
2. now for kvm, virsh supplies snapshot-create/delete/list/revert,
   Is it good the xl only supply snapshot-create/revert? After all,
   it's more complicated for user to take care of memory saving file
   and disk snapshot info than 'xl save' (user only needs to take
   care of memory state file).

> The
> downside is that now xl and libvirt are disconnected, but I think it's
> fine.

Two things here:
1. connect xl and libvirt, then will need to manage snapshot info in libxl (or
   libxlu) That's not preferred since the initial design. This is not the point
   we discuss here.
2. for xl only, list snapshots and delete snapshots, also need to manage
   snapshot info (in xl)

Considering manage snapshot info in xl, only question is about idl and
gentypes.py, expected structure is as following and expected to be saved
into json file, but it contains xl namespace and libxl namespace things,
gentypes.py will have problem. Better ideas?

typedef struct xl_domain_snapshot {
    char * name;
    char * description;
    uint64_t creation_time;
    char * memory_path;
    int num_disks;
    libxl_disk_snapshot *disks;
    char *parent;
    bool *current;
} xl_domain_snapshot;

Thanks a lot!
Chunyan

> The argument is that you're not allowed to run two toolstacks on
> the same host (think about xl and xend in previous releases).
>
> Do these two constraints make your work easier (or harder)?
>
> Regarding JSON API, as Ian said, feel free to hook it up to libxlu.
>
> Wei.
>
>
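
One way to picture snapshot metadata that stays entirely in the xl
namespace is the sketch below. It is hand-written C, not gentypes.py
output, and it rests on stated assumptions: xl_disk_snapshot is a
hypothetical xl-side record used in place of libxl_disk_snapshot, its
two fields are invented for illustration, and "current" is treated as a
plain flag rather than the pointer shown in the proposal. The init and
dispose helpers only mimic the style of the generated functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical xl-side disk snapshot record, so the struct below does
 * not reach into the libxl namespace. Field names are illustrative. */
typedef struct xl_disk_snapshot {
    char *device; /* guest device name */
    char *path;   /* external snapshot file or internal snapshot name */
} xl_disk_snapshot;

typedef struct xl_domain_snapshot {
    char *name;
    char *description;
    uint64_t creation_time;
    char *memory_path;
    int num_disks;
    xl_disk_snapshot *disks;
    char *parent;
    bool current; /* assumption: a flag, not a pointer */
} xl_domain_snapshot;

/* Portable string copy (strdup is POSIX rather than ISO C). */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

static void xl_domain_snapshot_init(xl_domain_snapshot *s)
{
    memset(s, 0, sizeof(*s));
}

/* Frees everything owned by the struct and resets it to the init state. */
static void xl_domain_snapshot_dispose(xl_domain_snapshot *s)
{
    if (!s)
        return;
    for (int i = 0; i < s->num_disks; i++) {
        free(s->disks[i].device);
        free(s->disks[i].path);
    }
    free(s->disks);
    free(s->name);
    free(s->description);
    free(s->memory_path);
    free(s->parent);
    xl_domain_snapshot_init(s);
}

/* Appends one disk record, copying both strings. Returns 0 on success. */
static int xl_domain_snapshot_add_disk(xl_domain_snapshot *s,
                                       const char *device, const char *path)
{
    xl_disk_snapshot *grown =
        realloc(s->disks, (size_t)(s->num_disks + 1) * sizeof(*grown));
    if (!grown)
        return -1;
    s->disks = grown;

    char *dev_copy = dup_string(device);
    char *path_copy = dup_string(path);
    if (!dev_copy || !path_copy) {
        free(dev_copy);
        free(path_copy);
        return -1;
    }
    grown[s->num_disks].device = dev_copy;
    grown[s->num_disks].path = path_copy;
    s->num_disks++;
    return 0;
}
```

Keeping the disk record in the xl namespace means the whole structure
could be described in a single xl-side IDL file; any conversion to a
library-side type would happen at the call boundary.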
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-08 7:34 ` Chun Yan Liu
@ 2014-12-08 11:12 ` Wei Liu
2014-12-08 11:24 ` Wei Liu
2014-12-09 5:04 ` Chun Yan Liu
0 siblings, 2 replies; 24+ messages in thread
From: Wei Liu @ 2014-12-08 11:12 UTC (permalink / raw)
To: Chun Yan Liu; +Cc: Ian.Jackson, Jim Fehlig, Wei Liu, Ian.Campbell, xen-devel

On Mon, Dec 08, 2014 at 12:34:47AM -0700, Chun Yan Liu wrote:
>
>
> >>> On 12/6/2014 at 12:06 AM, in message
> <20141205160615.GA24938@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
> wrote:
> > I have to admit I'm confused by the back and forth discussion. It's hard
> > to justify the design of new API without knowing what the constraints
> > and requirements are from your PoV.
> >
> > Here are my two cents, not about details, but about general constraints.
> >
> > There are two layers, one is user of libxl (clients -- xl, libvirt etc)
> > and libxl (the library itself).
> >
> > 1. it's better to *not* have storage management in libxl.
> >
> > It's likely that clients can have their own management functionality
> > already. I'm told that libvirt has that as well as XAPI. Having this
> > functionality in libxl is a bit redundant and requires lots of work
> > (enlighten libxl on what a disk looks like and call out to various
> > utilities).
>
> Thanks Wei and Ian for your reply. We did have much discussion around
> can/cannot (e.g. xl can finish disk snapshot?) and should/shouldnot
> (e.g. disk snapshot process should not in xl? domain_snapshot_delete
> should not in libxl?), and confusing because have different ideas. So,
> settling it down is helpful.
>
> Talking about libvirt, it does provide storage management but through
> storage pools and volumes. But usually, we don't use storage pool/vol
> but directly use backend files, then libvirt storage driver can not
> manage them. And for libvirt vol, functionality in storage driver is
> limited, at least 'snapshot' cannot be done. So, libvirt side also
> needs to add codes to handle disk snapshot, quite like xl does.
>

OK, so I take it that libvirt can be completely out the picture? I mean,
it's not a requirement for you to integrate with libvirt?

I was thinking a stack like this when I replied:

    libvirt: manages snapshot (including storage snapshot)
    libxl
    [other lower level stuffs]

While I read from your reply, libvirt doesn't have that functionality
(or very limited), so you would like to do things like:

    libvirt (or your homebrew toolstack)
    xl (xl or libxl manages domain snapshots)
    libxl
    [other lower level stuffs]

That's why you spent loads of time discussing with Ian what should be
done where, right?

> Following the constraint that it's better NOT to supply disk snapshot
> functions in libxl, then we let xl and libvirt do that by themselve
> separately, that's OK.
>
> Then I think NO new API needs to be exported in libxl, since:
> * saving/restoring memory, there are already APIs.

The principle is that if existing API doesn't work good enough for you
we will consider adding a new one.

We probably need a new API. If you want to do a live snapshot, we would
need to notify xl that we are in the middle of pausing and resuming a
domain.

> * disk snapshot work is xl internal, can be put in xl (or xlu).
> * handle JSON files is xl internal, can be put in xl.

Yes.

> (these are the main work vm snapshot handles).
>
> Right?
>
> This is quite different from previous document, so better to confirm.
>
> >
> > 2. it's *not* a requirement for xl to have the capability to manage
> >    snapshots.
> >
> > It's the same argument that xl has no idea on how to manage snapshots
> > created by "xl save". This should ease your concern on having to
> > duplicate code for libvirt and xl. IMHO the xl only needs to have the
> > capability to create a snapshot and create a domain from a snapshot.
>
> This way it's much easier since we don't need to maintain the snapshot
> info in file and don't need to take care of snapshot chain. But I doubt if
> that's good?
> 1. from user's side, it's a very common request to list all snapshots.
> 2. now for kvm, virsh supplies snapshot-create/delete/list/revert,
>    Is it good the xl only supply snapshot-create/revert? After all,
>    it's more complicated for user to take care of memory saving file
>    and disk snapshot info then 'xl save' (user only needs to take
>    care of memory state file).
>

However the current architecture for libvirt to use libxl is like

    libvirt
    libxl
    [other lower level stuffs]

So implementing snapshot management in xl cannot work for you either.
It's not part of the current architecture.

Not that I'm against the idea of managing domain snapshot in xl, I'm
trying to reduce workload here.

> > The
> > downside is that now xl and libvirt are disconnected, but I think it's
> > fine.
>
> Two things here:
> 1. connect xl and libvirt, then will need to manage snapshot info in libxl (or
>    libxlu) That's not preferred since the initial design. This is not the point
>    we discuss here.
> 2. for xl only, list snapshots and delete snapshots, also need to manage
>    snapshot info (in xl)
>
> Considering manage snapshot info in xl, only question is about idl and
> gentypes.py, expected structure is as following and expected to be saved
> into json file, but it contains xl namespace and libxl namespace things,
> gentypes.py will have problem. Better ideas?
>
> typedef struct xl_domain_snapshot {
>     char * name;
>     char * description;
>     uint64_t creation_time;
>     char * memory_path;
>     int num_disks;
>     libxl_disk_snapshot *disks;
>     char *parent;
>     bool *current;
> } xl_domain_snapshot;
>

The libxl_disk_snapshot suggests that you still want storage management
inside libxl, which should probably be in libxlu?

    libvirt
    libxl libxlu
    [other lower level stuffs]

Wei.

> Thanks a lot!
> Chunyan
>
> > The arguement is that you're not allowed to run two toolstack on
> > the same host (think about xl and xend in previous releases).
> >
> > Do these two constraints make your work easier (or harder)?
> >
> > Regarding JSON API, as Ian said, feel free to hook it up to libxlu.
> >
> > Wei.
> >
> >
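
The ordering constraint raised in this message (disk snapshots have to
be taken between pausing and resuming the domain) can be sketched as a
callback pattern in C. This is only an illustration under stated
assumptions: snapshot_create_sketch(), disk_snapshot_cb and the stub_*
helpers are invented for this example and are not libxl functions, and
the "paused" step stands in for whatever existing save path leaves the
domain quiesced with its memory saved.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STEPS 8

/* Steps recorded by the stubs so the ordering can be inspected. */
enum step { STEP_PAUSED, STEP_DISK_SNAPSHOT, STEP_RESUMED };

typedef struct step_log {
    enum step steps[MAX_STEPS];
    size_t n;
} step_log;

static void log_step(step_log *log, enum step s)
{
    assert(log->n < MAX_STEPS);
    log->steps[log->n++] = s;
}

/* Hypothetical toolstack callback, invoked while the domain is paused. */
typedef int (*disk_snapshot_cb)(step_log *log, void *opaque);

/* Stubs standing in for the real save/pause and resume paths. */
static int stub_save_and_pause(step_log *log)
{
    log_step(log, STEP_PAUSED);
    return 0;
}

static int stub_resume(step_log *log)
{
    log_step(log, STEP_RESUMED);
    return 0;
}

/* Pause, let the toolstack take its disk snapshots, then resume.
 * The domain is resumed even if the callback fails, so a failed
 * snapshot does not leave the guest paused. */
static int snapshot_create_sketch(step_log *log, disk_snapshot_cb cb,
                                  void *opaque)
{
    int rc = stub_save_and_pause(log);
    if (rc != 0)
        return rc;

    rc = cb(log, opaque);

    int rc_resume = stub_resume(log);
    return rc != 0 ? rc : rc_resume;
}

/* Example callback: records the step and returns the status passed in
 * through opaque (0 when opaque is NULL). */
static int example_disk_cb(step_log *log, void *opaque)
{
    log_step(log, STEP_DISK_SNAPSHOT);
    return opaque ? *(int *)opaque : 0;
}
```

Whether the callback lives in libxl or the toolstack drives the three
steps itself is exactly the open question in the thread; the sketch only
fixes the order of operations.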
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-08 11:12 ` Wei Liu
@ 2014-12-08 11:24 ` Wei Liu
2014-12-09 5:04 ` Chun Yan Liu
1 sibling, 0 replies; 24+ messages in thread
From: Wei Liu @ 2014-12-08 11:24 UTC (permalink / raw)
To: Chun Yan Liu; +Cc: Ian.Jackson, Jim Fehlig, Wei Liu, Ian.Campbell, xen-devel

On Mon, Dec 08, 2014 at 11:12:14AM +0000, Wei Liu wrote:
> On Mon, Dec 08, 2014 at 12:34:47AM -0700, Chun Yan Liu wrote:
> >
> >
> > >>> On 12/6/2014 at 12:06 AM, in message
> > <20141205160615.GA24938@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
> > wrote:
> > > I have to admit I'm confused by the back and forth discussion. It's hard
> > > to justify the design of new API without knowing what the constraints
> > > and requirements are from your PoV.
> > >
> > > Here are my two cents, not about details, but about general constraints.
> > >
> > > There are two layers, one is user of libxl (clients -- xl, libvirt etc)
> > > and libxl (the library itself).
> > >
> > > 1. it's better to *not* have storage management in libxl.
> > >
> > > It's likely that clients can have their own management functionality
> > > already. I'm told that libvirt has that as well as XAPI. Having this
> > > functionality in libxl is a bit redundant and requires lots of work
> > > (enlighten libxl on what a disk looks like and call out to various
> > > utilities).
> >
> > Thanks Wei and Ian for your reply. We did have much discussion around
> > can/cannot (e.g. xl can finish disk snapshot?) and should/shouldnot
> > (e.g. disk snapshot process should not in xl? domain_snapshot_delete
> > should not in libxl?), and confusing because have different ideas. So,
> > settling it down is helpful.
> >
> > Talking about libvirt, it does provide storage management but through
> > storage pools and volumes. But usually, we don't use storage pool/vol
> > but directly use backend files, then libvirt storage driver can not
> > manage them. And for libvirt vol, functionality in storage driver is
> > limited, at least 'snapshot' cannot be done. So, libvirt side also
> > needs to add codes to handle disk snapshot, quite like xl does.
> >
>
> OK, so I take it that libvirt can be completely out the picture? I mean,
> it's not a requirement for you to integrate with libvirt?
>
> I was thinking a stack like this when I replied:
>
>     libvirt: manages snapshot (including storage snapshot)
>     libxl
>     [other lower level stuffs]
>
> While I read from your reply, libvirt doesn't have that functionality
> (or very limited), so you would like to do things like:
>
>     libvirt (or your homebrew toolstack)
>     xl (xl or libxl manages domain snapshots)
>     libxl
>     [other lower level stuffs]
>

Note, I'm in no way endorsing this approach. This is my understanding
(or misunderstanding) of what you intended to do. I started drawing
pictures so that we can understand each other better.

Wei.
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-08 11:12 ` Wei Liu
2014-12-08 11:24 ` Wei Liu
@ 2014-12-09 5:04 ` Chun Yan Liu
2014-12-09 11:11 ` Ian Campbell
1 sibling, 1 reply; 24+ messages in thread
From: Chun Yan Liu @ 2014-12-09 5:04 UTC (permalink / raw)
To: Wei Liu; +Cc: Ian.Jackson, Jim Fehlig, Ian.Campbell, xen-devel

>>> On 12/8/2014 at 07:12 PM, in message
<20141208111214.GC17128@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
wrote:
> On Mon, Dec 08, 2014 at 12:34:47AM -0700, Chun Yan Liu wrote:
> >
> >
> > >>> On 12/6/2014 at 12:06 AM, in message
> > <20141205160615.GA24938@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
> > wrote:
> > > I have to admit I'm confused by the back and forth discussion. It's hard
> > > to justify the design of new API without knowing what the constraints
> > > and requirements are from your PoV.
> > >
> > > Here are my two cents, not about details, but about general constraints.
> > >
> > > There are two layers, one is user of libxl (clients -- xl, libvirt etc)
> > > and libxl (the library itself).
> > >
> > > 1. it's better to *not* have storage management in libxl.
> > >
> > > It's likely that clients can have their own management functionality
> > > already. I'm told that libvirt has that as well as XAPI. Having this
> > > functionality in libxl is a bit redundant and requires lots of work
> > > (enlighten libxl on what a disk looks like and call out to various
> > > utilities).
> >
> > Thanks Wei and Ian for your reply. We did have much discussion around
> > can/cannot (e.g. xl can finish disk snapshot?) and should/shouldnot
> > (e.g. disk snapshot process should not in xl? domain_snapshot_delete
> > should not in libxl?), and confusing because have different ideas. So,
> > settling it down is helpful.
> >
> > Talking about libvirt, it does provide storage management but through
> > storage pools and volumes. But usually, we don't use storage pool/vol
> > but directly use backend files, then libvirt storage driver can not
> > manage them. And for libvirt vol, functionality in storage driver is
> > limited, at least 'snapshot' cannot be done. So, libvirt side also
> > needs to add codes to handle disk snapshot, quite like xl does.
> >
>
> OK, so I take it that libvirt can be completely out the picture? I mean,
> it's not a requirement for you to integrate with libvirt?
>
> I was thinking a stack like this when I replied:
>
>     libvirt: manages snapshot (including storage snapshot)
>     libxl
>     [other lower level stuffs]
>
> While I read from your reply, libvirt doesn't have that functionality
> (or very limited), so you would like to do things like:
>
>     libvirt (or your homebrew toolstack)
>     xl (xl or libxl manages domain snapshots)
>     libxl
>     [other lower level stuffs]
>
> That's why you spent loads of time discussing with Ian what should be
> done where, right?

Partly. At least for domain disk snapshot create/delete, I prefer using
qmp commands instead of calling qemu-img one by one. Using qmp commands,
libvirt will need libxl's help. Of course, if libxl doesn't supply that,
libvirt can call qemu-img on each disk one by one, not preferred but
can do.

>
> > Following the constraint that it's better NOT to supply disk snapshot
> > functions in libxl, then we let xl and libvirt do that by themselve
> > separately, that's OK.
> >
> > Then I think NO new API needs to be exported in libxl, since:
> > * saving/restoring memory, there are already APIs.
>
> The principle is that if existing API doesn't work good enough for you
> we will consider adding a new one.
>
> We probably need a new API. If you want to do a live snapshot, we would
> need to notify xl that we are in the middle of pausing and resuming a
> domain.

This is where we discussed a lot. Do we really need
libxl_domain_snapshot_create API? Or can xl do the work?
Even for live snapshot, after calling libxl_domain_suspend with LIVE flags, memory is saved and domain is paused. xl then can call disk snapshot functions to finish disk snapshots, after all of that, call libxl_domain_unpause to unpause the domain. So I don't think xl has any trouble to do that. In case there is some misunderstanding, please point out. > > > * disk snapshot work is xl internal, can be put in xl (or xlu). > > * handle JSON files is xl internal, can be put in xl. > > Yes. > > > (these are the main work vm snapshot handles). > > > > Right? > > > > This is quite different from previous document, so better to confirm. > > > > > > > > 2. it's *not* a requirement for xl to have the capability to manage > > > snapshots. > > > > > > It's the same arguement that xl has no idea on how to manage snapshots > > > created by "xl save". This should ease your concern on having to > > > duplicate code for libvirt and xl. IMHO the xl only needs to have the > > > capability to create a snapshot and create a domain from a snapshot. > > > > This way it's much easier since we don't need to maintain the snapshot > > info in file and don't need to take care of snapshot chain. But I doubt if > > that's good? > > 1. from user's side, it's a very common request to list all snapshots. > > 2. now for kvm, virsh supplies snapshot-create/delete/list/revert, > > Is it good the xl only supply snapshot-create/revert? After all, > > it's more complicated for user to take care of memory saving file > > and disk snapshot info then 'xl save' (user only needs to take > > care of memory state file). > > > > However the current architecture for libvirt to use libxl is like > > libvirt > libxl > [other lower level stuffs] > > So implementing snapshot management in xl cannot work for you either. > It's not part of the current architecture. You are right. I understand you are trying to suggest a way to ease the job. Here just to make clear this way is really better and finally acceptable? 
:-) Just IMO, I think xl snapshot-list is wanted, that means managing snapshots in xl is needed. > > Not that I'm against the idea of managing domain snapshot in xl, I'm > trying to reduce workload here. > > > > The > > > downside is that now xl and libvirt are disconnected, but I think it's > > > fine. > > > > Two things here: > > 1. connect xl and libvirt, then will need to manage snapshot info in libxl > (or > > libxlu) That's not preferred since the initial design. This is not the > point > > we discuss here. > > 2. for xl only, list snapshots and delete snapshots, also need to manage > > snapshot info (in xl) > > > > Considering manage snapshot info in xl, only question is about idl and > > gentypes.py, expected structure is as following and expected to be saved > > into json file, but it contains xl namespace and libxl namespace things, > > gentypes.py will have problem. Better ideas? > > > > typedef struct xl_domain_snapshot { > > char * name; > > char * description; > > uint64_t creation_time; > > char * memory_path; > > int num_disks; > > libxl_disk_snapshot *disks; > > char *parent; > > bool *current; > > } xl_domain_snapshot; > > > > The libxl_disk_snapshot suggests that you still want storage management > inside libxl, which should probably be in libxlu? Yeah. I may put it in libxlu. Question is the same, cross two name spaces. But now I think we can turn a way to do it. In xl_domain_snapshot, define xl_disk_snapshot, then before calling libxlu API, turning it into libxlu_disk_snapshot. Thanks for your concern and suggestions. Appreciate it a lot! - Chunyan > > libvirt > libxl libxlu > [other lower level stuffs] > > Wei. > > > Thanks a lot! > > Chunyan > > > > > The arguement is that you're not allowed to run two toolstack on > > > the same host (think about xl and xend in previous releases). > > > > > > Do these two constraints make your work easier (or harder)? > > > > > > Regarding JSON API, as Ian said, feel free to hook it up to libxlu. 
> > > > > > Wei. > > > > > > > > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-09 5:04 ` Chun Yan Liu
@ 2014-12-09 11:11 ` Ian Campbell
2014-12-10 3:46 ` Chun Yan Liu
0 siblings, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2014-12-09 11:11 UTC (permalink / raw)
To: Chun Yan Liu; +Cc: Ian.Jackson, Jim Fehlig, Wei Liu, xen-devel

On Mon, 2014-12-08 at 22:04 -0700, Chun Yan Liu wrote:
> Partly. At least for domain disk snapshot create/delete, I prefer using
> qmp commands instead of calling qemu-img one by one. Using qmp
> commands, libvirt will need libxl's help. Of course, if libxl doesn't
> supply that, libvirt can call qemu-img to each disk one by one,
> not preferred but can do.

You can't use qmp unless the domain is active, for an inactive domain
there is no qemu to talk to, so you have to use qemu-img anyway in that
case. Does libvirt not have existing code to do all this sort of thing?
(I thought so, but it turns out I may be wrong, see below).

And for an active domain I expect that *must* use qmp, since it seems
unlikely that you can go around changing things under the feet of an
active process (maybe I'm wrong?).

> > > Following the constraint that it's better NOT to supply disk snapshot
> > > functions in libxl, then we let xl and libvirt do that by themselves
> > > separately, that's OK.
> > >
> > > Then I think NO new API needs to be exported in libxl, since:
> > > * saving/restoring memory, there are already APIs.
> >
> > The principle is that if existing API doesn't work good enough for you
> > we will consider adding a new one.
> >
> > We probably need a new API. If you want to do a live snapshot, we would
> > need to notify xl that we are in the middle of pausing and resuming a
> > domain.
>
> This is where we discussed a lot. Do we really need
> libxl_domain_snapshot_create API? or does xl can do the work?
>
> Even for live snapshot, after calling libxl_domain_suspend with LIVE flags,
> memory is saved and domain is paused. xl then can call disk snapshot
> functions to finish disk snapshots, after all of that, call libxl_domain_unpause
> to unpause the domain. So I don't think xl has any trouble to do that.
> In case there is some misunderstanding, please point out.

My mistake, I incorrectly remembered that libxl_domain_suspend would
destroy (for save or migrate) or resume (for checkpoint) the guest before
returning. Having refreshed my memory I see that you are correct: it
returns with the domain paused and it is up to the toolstack to resume
or destroy it as it wishes. Sorry for the confusion.

Given that it does seem like the toolstack could indeed take the
disk snapshots itself without an additional API.

> > However the current architecture for libvirt to use libxl is like
> >
> > libvirt
> > libxl
> > [other lower level stuffs]
> >
> > So implementing snapshot management in xl cannot work for you either.
> > It's not part of the current architecture.

This is correct, xl should not be involved in a libvirt control stack,
it is orthogonal.

> You are right. I understand you are trying to suggest a way to ease the job.
> Here just to make clear this way is really better and finally acceptable? :-)
> Just IMO, I think xl snapshot-list is wanted, that means managing snapshots
> in xl is needed.

The xl idiom is that you do this sort of operation with existing CLI
commands e.g. ls /var/lib/vm-images/*.qcow2, lvs, qemu-img etc.

Adding snapshot-list to xl would be a whole load of work to create a
bunch of infrastructure which you do not need to do.

My understanding was that your primary aim here was to enable snapshots
via libvirt and that xl support was just a nice to have. Is that right?

> > Not that I'm against the idea of managing domain snapshot in xl, I'm
> > trying to reduce workload here.
> >
> > > > The
> > > > downside is that now xl and libvirt are disconnected, but I think it's
> > > > fine.
> > >
> > > Two things here:
> > > 1. connect xl and libvirt, then will need to manage snapshot info in libxl (or
> > > libxlu) That's not preferred since the initial design. This is not the point
> > > we discuss here.
> > > 2. for xl only, list snapshots and delete snapshots, also need to manage
> > > snapshot info (in xl)
> > >
> > > Considering manage snapshot info in xl, only question is about idl and
> > > gentypes.py, expected structure is as following and expected to be saved
> > > into json file, but it contains xl namespace and libxl namespace things,
> > > gentypes.py will have problem. Better ideas?
> > >
> > > typedef struct xl_domain_snapshot {
> > >     char * name;
> > >     char * description;
> > >     uint64_t creation_time;
> > >     char * memory_path;
> > >     int num_disks;
> > >     libxl_disk_snapshot *disks;
> > >     char *parent;
> > >     bool *current;
> > > } xl_domain_snapshot;
> > >
> >
> > The libxl_disk_snapshot suggests that you still want storage management
> > inside libxl, which should probably be in libxlu?
>
> Yeah. I may put it in libxlu.

This depends on who the consumers of this datastructure are:

If xl only -> put it in xl itself.
If libvirt+xl -> put it in libxlu.

My understanding was that libvirt already has data structures for
dealing with snapshots, but this was based entirely on the commands
listed by:
    virsh help | grep -E pool-\|snapshot-
which seemed to me to be pretty feature rich and suggested that libvirt
has a great deal of support for storage and snapshot management already.

If libvirt already has generic infrastructure for managing snapshots
then IMHO you should use it, not reimplement it on the Xen side
(whether in libxl, libxlu or xl), the additions to Xen should be limited
to providing the underlying functionality which libvirt's generic code
requires from the backend.

However, Wei has suggested to me that perhaps libvirt's snapshotting
capabilities are not as generic internally as I might have imagined and
that it is up to each backend driver to reinvent things, is that true?

If Wei's suggestion is correct then it may turn out that it is useful to
put some of the new generic code which you would need to write into
libxlu.

Ian.

^ permalink raw reply [flat|nested] 24+ messages in thread
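The ordering agreed on in the message above (save memory with the domain left stopped, take the disk snapshots in the toolstack, then let the guest continue) can be sketched roughly as follows. This is only an illustration: the three callables are hypothetical stand-ins for the libxl suspend call, the toolstack's own disk snapshot code and the final resume/unpause call, and the choice to always resume the guest when the disk step fails is an assumption, not something the thread decides.

```python
from typing import Callable


def snapshot_running_domain(
    save_memory_live: Callable[[], None],
    snapshot_disks: Callable[[], None],
    resume_domain: Callable[[], None],
) -> None:
    """Run the three snapshot steps in the order discussed in the thread.

    All three arguments are hypothetical callables supplied by the
    toolstack; none of them is a real libxl binding.
    """
    # Step 1: stands in for the libxl suspend call with the live flag.
    # Per the thread it returns with memory saved and the guest stopped.
    save_memory_live()
    try:
        # Step 2: toolstack-side disk snapshots, taken while the guest
        # cannot write to its disks.
        snapshot_disks()
    finally:
        # Step 3 (assumption): let the guest continue even if step 2
        # raised, so a failed snapshot does not leave it stopped.
        resume_domain()
```

Passing plain callables keeps the sketch independent of whether xl or libvirt drives the sequence.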
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-09 11:11 ` Ian Campbell
@ 2014-12-10 3:46 ` Chun Yan Liu
2014-12-12 16:22 ` Ian Campbell
0 siblings, 1 reply; 24+ messages in thread
From: Chun Yan Liu @ 2014-12-10 3:46 UTC (permalink / raw)
To: Ian Campbell; +Cc: Ian.Jackson, Jim Fehlig, Wei Liu, xen-devel

>>> On 12/9/2014 at 07:11 PM, in message <1418123518.14361.20.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2014-12-08 at 22:04 -0700, Chun Yan Liu wrote:
> > Partly. At least for domain disk snapshot create/delete, I prefer using
> > qmp commands instead of calling qemu-img one by one. Using qmp
> > commands, libvirt will need libxl's help. Of course, if libxl doesn't
> > supply that, libvirt can call qemu-img to each disk one by one,
> > not preferred but can do.
>
> You can't use qmp unless the domain is active, for an inactive domain
> there is no qemu to talk to, so you have to use qemu-img anyway in that
> case. Does libvirt not have existing code to do all this sort of thing?
> (I thought so, but it turns out I may be wrong, see below).

No. Even in libvirt/qemu_driver (for kvm), it does the work itself through
qemu monitor commands.

>
> And for an active domain I expect that *must* use qmp, since it seems
> unlikely that you can go around changing things under the feet of an
> active process (maybe I'm wrong?).

For active domain, I tried 'qemu-img snapshot' after pausing a domain,
the commands succeeded. But I also think using qmp commands is better
since qemu supplies transaction qmp, it avoids the trouble to roll back
status when using qemu-img to do disk snapshot one by one but only part of
disks succeed.

So, if disk snapshot functions can be provided to both libvirt and xl usage,
it's very helpful to libvirt side. In this way, I may prefer to put disk snapshot
functions to libxlu.

>
> > > > Following the constraint that it's better NOT to supply disk snapshot
> > > > functions in libxl, then we let xl and libvirt do that by themselves
> > > > separately, that's OK.
> > > >
> > > > Then I think NO new API needs to be exported in libxl, since:
> > > > * saving/restoring memory, there are already APIs.
> > >
> > > The principle is that if existing API doesn't work good enough for you
> > > we will consider adding a new one.
> > >
> > > We probably need a new API. If you want to do a live snapshot, we would
> > > need to notify xl that we are in the middle of pausing and resuming a
> > > domain.
> >
> > This is where we discussed a lot. Do we really need
> > libxl_domain_snapshot_create API? or does xl can do the work?
> >
> > Even for live snapshot, after calling libxl_domain_suspend with LIVE flags,
> > memory is saved and domain is paused. xl then can call disk snapshot
> > functions to finish disk snapshots, after all of that, call libxl_domain_unpause
> > to unpause the domain. So I don't think xl has any trouble to do that.
> > In case there is some misunderstanding, please point out.
>
> My mistake, I incorrectly remembered that libxl_domain_suspend would
> destroy (for save or migrate) or resume (for checkpoint) the guest before
> returning. Having refreshed my memory I see that you are correct: it
> returns with the domain paused and it is up to the toolstack to resume
> or destroy it as it wishes. Sorry for the confusion.
>
> Given that it does seem like the toolstack could indeed take the
> disk snapshots itself without an additional API.
>
> > > However the current architecture for libvirt to use libxl is like
> > >
> > > libvirt
> > > libxl
> > > [other lower level stuffs]
> > >
> > > So implementing snapshot management in xl cannot work for you either.
> > > It's not part of the current architecture.
>
> This is correct, xl should not be involved in a libvirt control stack,
> it is orthogonal.
>
> > You are right. I understand you are trying to suggest a way to ease the job.
> > Here just to make clear this way is really better and finally acceptable? :-)
> > Just IMO, I think xl snapshot-list is wanted, that means managing snapshots
> > in xl is needed.
>
> The xl idiom is that you do this sort of operation with existing CLI
> commands e.g. ls /var/lib/vm-images/*.qcow2, lvs, qemu-img etc.
>
> Adding snapshot-list to xl would be a whole load of work to create a
> bunch of infrastructure which you do not need to do.
>
> My understanding was that your primary aim here was to enable snapshots
> via libvirt and that xl support was just a nice to have. Is that right?

We hope both :-)

Libvirt side already has some codes as I know and hopes to integrate with
libxl to enable snapshots. Of course the two toolstacks can have some
differences in commands, that's OK.

Libvirt side, to use unified virsh commands, it will supply
snapshot-create/delete/revert/list.

Xl side, if it's better to supply snapshot-create/revert, we can implement
like that. Then it IS much easier since no need to manage snapshots in xl,
then no save/retrieve json file things and no snapshot chain things. Do
we want/decide to follow this?

>
> > > Not that I'm against the idea of managing domain snapshot in xl, I'm
> > > trying to reduce workload here.
> > >
> > > > > The
> > > > > downside is that now xl and libvirt are disconnected, but I think it's
> > > > > fine.
> > > >
> > > > Two things here:
> > > > 1. connect xl and libvirt, then will need to manage snapshot info in libxl (or
> > > > libxlu) That's not preferred since the initial design. This is not the point
> > > > we discuss here.
> > > > 2. for xl only, list snapshots and delete snapshots, also need to manage
> > > > snapshot info (in xl)
> > > >
> > > > Considering manage snapshot info in xl, only question is about idl and
> > > > gentypes.py, expected structure is as following and expected to be saved
> > > > into json file, but it contains xl namespace and libxl namespace things,
> > > > gentypes.py will have problem. Better ideas?
> > > >
> > > > typedef struct xl_domain_snapshot {
> > > >     char * name;
> > > >     char * description;
> > > >     uint64_t creation_time;
> > > >     char * memory_path;
> > > >     int num_disks;
> > > >     libxl_disk_snapshot *disks;
> > > >     char *parent;
> > > >     bool *current;
> > > > } xl_domain_snapshot;
> > > >
> > >
> > > The libxl_disk_snapshot suggests that you still want storage management
> > > inside libxl, which should probably be in libxlu?
> >
> > Yeah. I may put it in libxlu.
>
> This depends on who the consumers of this datastructure are:
>
> If xl only -> put it in xl itself.
> If libvirt+xl -> put it in libxlu.
>
> My understanding was that libvirt already has data structures for
> dealing with snapshots, but this was based entirely on the commands
> listed by:
>     virsh help | grep -E pool-\|snapshot-
> which seemed to me to be pretty feature rich and suggested that libvirt
> has a great deal of support for storage and snapshot management already.
>

Oh, I didn't say clearly. Here we mean libxl_disk_snapshot should be
libxlu_disk_snapshot, that is, disk snapshot functions better in libxlu.
Not mean xl_domain_snapshot. If managing snapshots, it will be in xl.

Libvirt has its own data structures about managing domain snapshots.

Thanks,
Chunyan

> If libvirt already has generic infrastructure for managing snapshots
> then IMHO you should use it, not reimplement it on the Xen side
> (whether in libxl, libxlu or xl), the additions to Xen should be limited
> to providing the underlying functionality which libvirt's generic code
> requires from the backend.
>
> However, Wei has suggested to me that perhaps libvirt's snapshotting
> capabilities are not as generic internally as I might have imagined and
> that it is up to each backend driver to reinvent things, is that true?
>
> If Wei's suggestion is correct then it may turn out that it is useful to
> put some of the new generic code which you would need to write into
> libxlu.
>
> Ian.
>
>

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-10 3:46 ` Chun Yan Liu
@ 2014-12-12 16:22 ` Ian Campbell
2014-12-15 3:13 ` Chun Yan Liu
0 siblings, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2014-12-12 16:22 UTC (permalink / raw)
To: Chun Yan Liu; +Cc: Ian.Jackson, Jim Fehlig, Wei Liu, xen-devel

On Tue, 2014-12-09 at 20:46 -0700, Chun Yan Liu wrote:
>
> >>> On 12/9/2014 at 07:11 PM, in message <1418123518.14361.20.camel@citrix.com>,
> Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2014-12-08 at 22:04 -0700, Chun Yan Liu wrote:
> > > Partly. At least for domain disk snapshot create/delete, I prefer using
> > > qmp commands instead of calling qemu-img one by one. Using qmp
> > > commands, libvirt will need libxl's help. Of course, if libxl doesn't
> > > supply that, libvirt can call qemu-img to each disk one by one,
> > > not preferred but can do.
> >
> > You can't use qmp unless the domain is active, for an inactive domain
> > there is no qemu to talk to, so you have to use qemu-img anyway in that
> > case. Does libvirt not have existing code to do all this sort of thing?
> > (I thought so, but it turns out I may be wrong, see below).
>
> No. Even in libvirt/qemu_driver (for kvm), it does the work itself through
> qemu monitor commands.

Is this just the code for the actual act of taking a snapshot or is it a
complete snapshotting infrastructure in the driver itself?

I would hope/assume that there was a split between the common code which
drives everything and tracks all the state etc and the specific driver
backend which is used to make state changes to active domains. Is that
the case or is everything snapshot related in the libvirt qemu_driver?

> > And for an active domain I expect that *must* use qmp, since it seems
> > unlikely that you can go around changing things under the feet of an
> > active process (maybe I'm wrong?).
>
> For active domain, I tried 'qemu-img snapshot' after pausing a domain,
> the commands succeeded. But I also think using qmp commands is better
> since qemu supplies transaction qmp, it avoids the trouble to roll back
> status when using qemu-img to do disk snapshot one by one but only part of
> disks succeed.

Yes, using qmp for an active domain seems sensible.

But you can't use qmp on an inactive domain. Does libvirt deal with this
in common code or does it require two code paths in the backend driver,
one for active and one for inactive domains?

> So, if disk snapshot functions can be provided to both libvirt and xl usage,
> it's very helpful to libvirt side. In this way, I may prefer to put disk snapshot
> functions to libxlu.

The actual command to snapshot a disk of an active+paused domain is fine
to go into libxl. In fact due to the proposed use of qmp it would have
to be.

Anything to do with the subsequent management of snapshots most likely
doesn't belong in libxl. Whether that stuff belongs in libxlu, xl or
libvirt depends on what scope there is for multiple toolstacks to use a
given helper function.

> > My understanding was that your primary aim here was to enable snapshots
> > via libvirt and that xl support was just a nice to have. Is that right?
>
> We hope both :-)

OK, thanks for clarifying.

> Libvirt side already has some codes as I know and hopes to integrate with
> libxl to enable snapshots. Of course the two toolstacks can have some
> differences in commands, that's OK.
>
> Libvirt side, to use unified virsh commands, it will supply
> snapshot-create/delete/revert/list.

This is what I expected you were aiming for.

> Xl side, if it's better to supply snapshot-create/revert, we can implement
> like that. Then it IS much easier since no need to manage snapshots in xl,
> then no save/retrieve json file things and no snapshot chain things. Do
> we want/decide to follow this?

The xl snapshot functionality should be kept as simple as possible and
following the existing xl idioms of managing storage and saved VM images
via existing CLI command (qemu-img, lvcreate, ls, mv, cp etc).

Ian.

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC V8 2/3] libxl domain snapshot API design
2014-12-12 16:22 ` Ian Campbell
@ 2014-12-15 3:13 ` Chun Yan Liu
0 siblings, 0 replies; 24+ messages in thread
From: Chun Yan Liu @ 2014-12-15 3:13 UTC (permalink / raw)
To: Ian Campbell; +Cc: Ian.Jackson, Jim Fehlig, Wei Liu, xen-devel

>>> On 12/13/2014 at 12:22 AM, in message <1418401323.16425.28.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2014-12-09 at 20:46 -0700, Chun Yan Liu wrote:
> >
> > >>> On 12/9/2014 at 07:11 PM, in message <1418123518.14361.20.camel@citrix.com>,
> > Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > > On Mon, 2014-12-08 at 22:04 -0700, Chun Yan Liu wrote:
> > > > Partly. At least for domain disk snapshot create/delete, I prefer using
> > > > qmp commands instead of calling qemu-img one by one. Using qmp
> > > > commands, libvirt will need libxl's help. Of course, if libxl doesn't
> > > > supply that, libvirt can call qemu-img to each disk one by one,
> > > > not preferred but can do.
> > >
> > > You can't use qmp unless the domain is active, for an inactive domain
> > > there is no qemu to talk to, so you have to use qemu-img anyway in that
> > > case. Does libvirt not have existing code to do all this sort of thing?
> > > (I thought so, but it turns out I may be wrong, see below).
> >
> > No. Even in libvirt/qemu_driver (for kvm), it does the work itself through
> > qemu monitor commands.
>
> Is this just the code for the actual act of taking a snapshot or is it a
> complete snapshotting infrastructure in the driver itself?
>
> I would hope/assume that there was a split between the common code which
> drives everything and tracks all the state etc and the specific driver
> backend which is used to make state changes to active domains. Is that
> the case or is everything snapshot related in the libvirt qemu_driver?

Everything snapshot related is done in libvirt qemu driver. But data
structures about managing domain snapshots are common, so each hypervisor
driver can share.

>
> > > And for an active domain I expect that *must* use qmp, since it seems
> > > unlikely that you can go around changing things under the feet of an
> > > active process (maybe I'm wrong?).
> >
> > For active domain, I tried 'qemu-img snapshot' after pausing a domain,
> > the commands succeeded. But I also think using qmp commands is better
> > since qemu supplies transaction qmp, it avoids the trouble to roll back
> > status when using qemu-img to do disk snapshot one by one but only part of
> > disks succeed.
>
> Yes, using qmp for an active domain seems sensible.
>
> But you can't use qmp on an inactive domain. Does libvirt deal with this
> in common code or does it require two code paths in the backend driver,
> one for active and one for inactive domains?

Taking libvirt qemu driver for example, it goes two code paths for active
domain and inactive domain. For inactive domain, it calls qemu-img command
to do the job. For active domain, calls qmp commands through qemu monitor.

>
> > So, if disk snapshot functions can be provided to both libvirt and xl usage,
> > it's very helpful to libvirt side. In this way, I may prefer to put disk snapshot
> > functions to libxlu.
>
> The actual command to snapshot a disk of an active+paused domain is fine
> to go into libxl. In fact due to the proposed use of qmp it would have
> to be.
>
> Anything to do with the subsequent management of snapshots most likely
> doesn't belong in libxl. Whether that stuff belongs in libxlu, xl or
> libvirt depends on what scope there is for multiple toolstacks to use a
> given helper function.

OK. Thanks.

>
> > > My understanding was that your primary aim here was to enable snapshots
> > > via libvirt and that xl support was just a nice to have. Is that right?
> >
> > We hope both :-)
>
> OK, thanks for clarifying.
>
> > Libvirt side already has some codes as I know and hopes to integrate with
> > libxl to enable snapshots. Of course the two toolstacks can have some
> > differences in commands, that's OK.
> >
> > Libvirt side, to use unified virsh commands, it will supply
> > snapshot-create/delete/revert/list.
>
> This is what I expected you were aiming for.
>
> > Xl side, if it's better to supply snapshot-create/revert, we can implement
> > like that. Then it IS much easier since no need to manage snapshots in xl,
> > then no save/retrieve json file things and no snapshot chain things. Do
> > we want/decide to follow this?
>
> The xl snapshot functionality should be kept as simple as possible and
> following the existing xl idioms of managing storage and saved VM images
> via existing CLI command (qemu-img, lvcreate, ls, mv, cp etc).

Got it. Thanks. So I'll update document.

Chunyan

>
> Ian.
>
>

^ permalink raw reply [flat|nested] 24+ messages in thread
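The two code paths described in the message above (qemu-img for an inactive domain, a single QMP transaction for an active one) could look roughly like the sketch below. It only builds the command lines or the QMP message and does not run anything. The helper name, the device names and the choice of internal snapshots are assumptions made for illustration; `qemu-img snapshot -c` and the QMP `transaction` command with `blockdev-snapshot-internal-sync` actions are believed to exist in QEMU, but should be checked against the QEMU version in use.

```python
from typing import Dict, List, Union

QmpMessage = Dict[str, object]
CommandList = List[List[str]]


def plan_disk_snapshots(
    active: bool, disks: Dict[str, str], snapshot_name: str
) -> Union[QmpMessage, CommandList]:
    """Return what a toolstack would send or run for internal snapshots.

    `disks` maps a qemu block device name to its image path
    (hypothetical values, e.g. {"xvda": "/images/a.qcow2"}).
    """
    if active:
        # Active domain: one QMP transaction covering every disk, so
        # either all snapshots are taken or none is.
        actions = [
            {
                "type": "blockdev-snapshot-internal-sync",
                "data": {"device": device, "name": snapshot_name},
            }
            for device in sorted(disks)
        ]
        return {"execute": "transaction", "arguments": {"actions": actions}}

    # Inactive domain: there is no qemu process to talk to, so fall back
    # to one qemu-img invocation per image file. A failure part-way
    # through leaves earlier snapshots in place and needs manual rollback.
    return [
        ["qemu-img", "snapshot", "-c", snapshot_name, disks[device]]
        for device in sorted(disks)
    ]
```

Returning a plan instead of executing it keeps the difference between the two paths visible: one atomic message versus a list of independent commands.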
* [RFC V8 3/3] xl snapshot-xxx Design
2014-11-10 8:17 [RFC V8 0/3] domain snapshot document Chunyan Liu
2014-11-10 8:17 ` [RFC V8 1/3] libxl domain snapshot introduction Chunyan Liu
2014-11-10 8:17 ` [RFC V8 2/3] libxl domain snapshot API design Chunyan Liu
@ 2014-11-10 8:17 ` Chunyan Liu
2 siblings, 0 replies; 24+ messages in thread
From: Chunyan Liu @ 2014-11-10 8:17 UTC (permalink / raw)
To: xen-devel; +Cc: Ian.Jackson, jfehlig, wei.liu2, Ian.Campbell, Chunyan Liu

Changes to V7:
* change wrong libxl_domain_snapshot_info naming to xl_domain_snapshot_info
* remove all disk-only syntax
* update xl snapshot-revert implementation

===========================================================================
1. xl commandline interface design

xl snapshot-create:
  Create a snapshot (disk and RAM) of a domain.

  SYNOPSIS:
    snapshot-create <domain> [<cfgfile>] [--name <string>] [--live]

  OPTIONS:
    --name <string>   snapshot name
    --live            take a live snapshot

  If --live is given, the domain is not paused while the snapshot is
  being created, as in live migration. This increases the size of the
  memory dump file, but reduces downtime of the guest.

  If --name is not given, a default name is generated from the
  creation time.

  If @cfgfile is specified, cfgfile takes priority (e.g. if --name
  specifies a name and cfgfile also contains a name, the name in
  cfgfile is used).

xl snapshot-delete:
  Delete a snapshot (disk and RAM) of a domain.

  SYNOPSIS:
    snapshot-delete <domain> <snapshotname> [--children] [--children-only]

  By default, just this snapshot is deleted, and changes from this
  snapshot are automatically merged into children snapshots.

  OPTIONS:
    --children        delete snapshot and all children
    --children-only   delete children but not snapshot

  If --children is given, this snapshot and all descendant snapshots
  are deleted.

  If --children-only is given, only the descendant snapshots are
  deleted; this snapshot is kept.
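As a rough illustration of the snapshot-delete options described above, the following sketch picks which snapshots would be removed for the default case, --children and --children-only. The in-memory `children` mapping, the function names, the children-before-parents ordering and the rejection of both flags together are all assumptions made for this example; the design text does not specify them.

```python
from typing import Dict, List


def descendants(children: Dict[str, List[str]], name: str) -> List[str]:
    """List all descendants of `name`, deepest snapshots first."""
    found: List[str] = []
    for child in children.get(name, []):
        found.extend(descendants(children, child))
        found.append(child)
    return found


def select_for_delete(
    children: Dict[str, List[str]],
    name: str,
    with_children: bool = False,
    children_only: bool = False,
) -> List[str]:
    """Return the snapshot names to delete, children before parents."""
    if with_children and children_only:
        # Assumption: the design does not say how the two flags combine,
        # so this sketch refuses the combination.
        raise ValueError("--children and --children-only are exclusive")
    if children_only:
        return descendants(children, name)
    if with_children:
        return descendants(children, name) + [name]
    # Default: only the named snapshot is removed.
    return [name]
```

With a tree where `a` has children `b` and `c`, and `b` has child `d`, the default selects only `a`, while --children selects `d`, `b`, `c` and then `a`.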
xl snapshot-revert:
  Revert domain to status of a snapshot.

  SYNOPSIS:
    snapshot-revert <domain> <snapshotname> [--running] [--paused] [--force]

  OPTIONS:
    --running    after reverting, change state to running
    --paused     after reverting, change state to paused
    --force      try harder on risky reverts

    Normally, the domain reverts to the state it was in when the
    snapshot was taken (running or paused).

    If --running is given, it overrides the snapshot state and
    guarantees a running domain after the revert.
    If --paused is given, it guarantees a paused domain after the revert.


xl snapshot-list:
  List snapshots for a domain.

  SYNOPSIS:
    snapshot-list <domain> [--parent] [--internal] [--external] [--tree] [--name]

  OPTIONS:
    --internal   filter by internal snapshots
    --external   filter by external snapshots
    --tree       list snapshots in a tree
    --parent     add a column showing parent snapshot
    --name       list snapshot names only


2. cfgfile syntax

"xl snapshot-create" supports creating a VM snapshot with a user-provided
configuration file. The configuration file syntax is as below:

#snapshot name. If user doesn't provide a VM snapshot name, xl will generate
#a name automatically by the creation time.
name=""

#snapshot description. Default is NULL.
description=""

#memory location. This field should be filled when memory=1. Default is NULL.
memory_path=""

#disk snapshot information
disks=['sda,1,qcow2,/tmp/sda_snapshot.qcow2','sdb,1,qcow2,/tmp/sdb_snapshot.qcow2']
or
disks=['sda,0','sdb,0']

disk syntax:
'target device, external disk snapshot?, external format, external path'


3. xl structure to maintain VM snapshot info

xl_domain_snapshot_info = Struct("domain_snapshot_info",[
    # snapshot name
    ("name",          string),
    ("create_time",   string),
    ("description",   string),
    # memory path and disk snapshot info
    ("snapshot_args", libxl_domain_snapshot_args),
    # parent snapshot name
    ("parent",        string),
    # array to store all children snapshot name
    ("children",      Array(string, "num_children")),
    ])

According to xl_domain_snapshot_info, a json file will be saved on disk.


4. xl snapshot-xxx implementation details

"xl snapshot-create"
    1), parse args or domain snapshot configuration file.
    2), fill info in libxl_domain_snapshot_args struct according to
        options or config file.
    3), call libxl_domain_snapshot_create()
    4), fill info in xl_domain_snapshot_info.
    5), save snapshot info in json file under
        "/var/lib/xen/snapshots/domain_uuid"

"xl snapshot-list"
    1), read all domain snapshot related json files under
        "/var/lib/xen/snapshots/domain_uuid". Parse each file and fill in
        an xl_domain_snapshot_info struct.
    2), display information from those xl_domain_snapshot_info(s)

"xl snapshot-delete"
    1), read snapshot json file from
        "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>.libxl-json",
        parse the file and fill in xl_domain_snapshot_info
    2), according to parent/children info in xl_domain_snapshot_info and
        commandline options, decide which domain snapshots are to be
        deleted. To delete each domain snapshot, fill in
        libxl_domain_snapshot_args and call libxl_domain_snapshot_delete().
    3), refresh parent/children relationship, delete the json files of
        the snapshots that were deleted.

"xl snapshot-revert"
    1), read snapshot json file from
        "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>.libxl-json",
        parse the file and fill in xl_domain_snapshot_info.
    2), destroy current domain
    3), according to the info in xl_domain_snapshot_info, create a new
        domain from snapshot.


Interact with other operations:

All snapshots should be deleted before deleting a domain.
This will affect xl destroy/shutdown/save/migrate, which need a related
check and error reporting.
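To make the cfgfile disk syntax from section 2 ('target device, external disk snapshot?, external format, external path') concrete, here is a minimal Python sketch of how one disks=[...] entry might be parsed. It is only an illustration: the DiskSnapshotSpec type and parse_disk_snapshot_spec function are hypothetical names, xl itself would do this in C, and the rules that an internal entry has exactly two fields and an external entry exactly four are assumptions read off the two examples in the design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskSnapshotSpec:
    """One parsed entry of the disks=[...] list (hypothetical type)."""
    target: str                              # target device, e.g. "sda"
    external: bool                           # True for an external snapshot
    external_format: Optional[str] = None    # e.g. "qcow2", external only
    external_path: Optional[str] = None      # external only


def parse_disk_snapshot_spec(spec: str) -> DiskSnapshotSpec:
    """Parse 'target device, external?, external format, external path'."""
    fields = [field.strip() for field in spec.split(",")]
    if len(fields) < 2 or not fields[0]:
        raise ValueError(f"need at least 'target,external?': {spec!r}")
    target, flag = fields[0], fields[1]
    if flag not in ("0", "1"):
        raise ValueError(f"external flag must be 0 or 1: {spec!r}")
    if flag == "0":
        # Assumption: an internal snapshot carries no format or path.
        if len(fields) != 2:
            raise ValueError(f"unexpected extra fields: {spec!r}")
        return DiskSnapshotSpec(target, False)
    # Assumption: an external snapshot needs both a format and a path.
    if len(fields) != 4 or not fields[2] or not fields[3]:
        raise ValueError(f"external snapshot needs format and path: {spec!r}")
    return DiskSnapshotSpec(target, True, fields[2], fields[3])
```

A plain split on "," means a path containing a comma would be rejected; a real parser would need a quoting rule for that case.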
[parent not found: <547F479A0200006600041019@soto.provo.novell.com>]
* Re: [RFC V8 2/3] libxl domain snapshot API design
[not found] <547F479A0200006600041019@soto.provo.novell.com>
@ 2014-12-03 6:26 ` Chun Yan Liu
0 siblings, 0 replies; 24+ messages in thread
From: Chun Yan Liu @ 2014-12-03 6:26 UTC (permalink / raw)
To: Ian.Campbell; +Cc: Ian.Jackson, Jim Fehlig, wei.liu2, dunlapg, xen-devel

>>> On 11/28/2014 at 11:43 PM, in message <1417189409.23604.62.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2014-11-25 at 02:08 -0700, Chun Yan Liu wrote:
> > Hi, Ian,
> >
> > According to previous discussion, snapshot delete and revert are
> > inclined to be done by high level application itself, won't supply a
> > libxl API.
>
> I thought you had explained a scenario where the toolstack needed to be
> at least aware of delete, specifically when you are deleting a snapshot
> from the middle of an active chain.

The reason I posted such an overview here before sending the next version
is that, after the previous discussion, I'm puzzled about what should be
in libxl and what in the toolstack. So I posted it here to seek some ideas
or agreement first. It's not a full design and is not broken down into
libxl and toolstack parts yet.

>
> Maybe that's not "snapshot delete API in libxl" though, but rather a
> notification API which the toolstack can use to tell libxl something is
> going on.

About the notification API: after looking at lvm, vhd-util and qcow2, I
don't think we need it. No extra work is needed to handle the disk
snapshot chain.

lvm: doesn't support snapshot of snapshot.
vhd-util: backing file chain, external snapshot. No need to delete the
          disk snapshot when deleting a domain snapshot.
qcow2:
  * internal disk snapshot: each snapshot increases the refcount of the
    data; deleting a snapshot only decreases the refcount and won't
    affect other snapshots.
  * external disk snapshot: same as vhd-util, backing file chain. No need
    to delete the disk snapshot when deleting a domain snapshot.

>
> > I'm wondering snapshot create need a new common API?
> > In fact its main work is save domain and take disk snapshot, xl can
> > do it too.

For saving memory, there is already an API for that. The missing part is
taking the disk snapshot.

>
> I don't believe xl can take a disk snapshot of an active disk, it
> doesn't have the machinery to deal with that sort of thing, nor should
> it, this is exactly the sort of thing which libxl is provided to deal
> with.

As with deleting a disk snapshot, xl can call an external command to do
that (e.g. qemu-img). But it's better to call qmp to do that.

Anyway, if for domain snapshot create we put the disk snapshot creation
process in libxl, then for domain snapshot delete we should put the disk
snapshot deletion process in libxl too.
That is, in libxl there should be:
    libxl_disk_snapshot_create (which handles creating disk snapshot)
    libxl_disk_snapshot_delete (which handles deleting disk snapshot)
Otherwise I would think it's weird to have in libxl:
    libxl_domain_snapshot_create (wrap saving memory [already has API]
                                  and creating disk snapshot)
    libxl_disk_snapshot_delete (deleting disk snapshot)

>
> Also, libxl is driving the migration/memory snapshot, and I think the
> disk snapshot fundamentally needs to be involved in that process, not
> done separately by the toolstack.
>
> > I just write down an overview of the snapshot work (see below).
> > The problem is: do we need to export API? What kind of API?
> > In updating Bamvor's code, I think xl can do all the work, libvirt can
> > do the work too even without libxl's help.
> >
> > Of course, there are some thing if put in libxl, it will be easier to
> > use, like the domain snapshot info structure, gentype.py will
> > directly generate useful init/dispose/to_json/from_json functions.
> > Or the disk snapshot part can be extracted and placed in libxl or libxlu.

About storing and retrieving the snapshot json file: using gentype.py to
autogenerate the xx_to_json and xx_from_json functions is very convenient,
and there would be a group of functions
set/get/update/delete_snapshot_metadata based on that. But I didn't see
other such usage in xl, and it's not proper to place it in libxl. Is there
anywhere it could be placed and still be used by xl? Wei might have some
ideas about this?

-Chunyan

> >
> > Any suggestions about which part is better to be extracted as libxl
> > API or better not?
> >
> > Thanks,
> > Chunyan
> >
> >
> -----------------------------------------------------------------------------
> -------------------------
> > libxl domain snapshot overview
>
> Just to be 100% clear: This is an overview of a domain snapshot
> architecture for a toolstack which uses libxl. A bunch of the things
> described here belong to the toolstack and not to libxl itself.
>
> I've tried to read with that in mind but a complete document should
> mention this and be careful to be clear about the distinction where it
> matters.
>
> > 0. Glossary
> [...]
> > * not support disk-only snapshot [1].
> >
> > [1]
> > This is different from "libvirt".
> > To xl, it only concerns active domains, and even when domain
> > is paused, there is no data flush to disk operation. So, take
> > a disk-only snapshot and then resume, it is as if the guest
> > had crashed. For this reason, disk-only snapshot is meaningless
> > to xl. Should not support.
> >
> > To libvirt, it has active domains and inactive domains, for
> > the active domains, as "xl", it's meaning less to take disk-only
> > snapshot, but for inactive domains, disk-only snapshot is valid.
> > Should support.
>
> Do you mean to say here that disk-only snapshots are not supported in
> some toolstacks, or in no toolstack? Or are you just saying that libxl
> doesn't need to support them because they only apply to inactive
> domains?
>
> In either case it seems to me like your footnote is saying that you *do*
> want to support disk-only snapshots, at least in some stacks and/or
> configurations.
>
> I think you probably mean to say that disk-only snapshots of *active*
> domains are not supported. Whereas disk-only snapshots of inactive
> domains may or may not be depending on the toolstack.
> >
> > 2. Requirements
> >
> > General Requirements:
> > * ability to save/restore domain memory
> > * ability to create/delete/apply disk snapshot [2]
> > * ability to parse user config file
> > * ability to save/load/update domain snapshot metadata (or called
> >   domain snapshot info, the metadata at least includes:
> >   snapshot name, create time, description, memory state file,
> >   disk snapshot info, parent (in snapshot chain), current (is
> >   currently applied))
> >
> > [2] Disk snapshot requirements:
> > * external tools: qemu-img, lvcreate, vhd-util, etc.
> > * For a basic goal, we support 'raw' and 'qcow2' backend types only.
> >   Then only requires qemu:
> >   use libxl qmp command (better) or "qemu-img"
>
> You should leave these implementation details for a later section, in
> this context they just invite quibbling about whether things belong in
> libxl etc and whether qmp commands are "better".
>
> The rest looks ok, but without the remainder of the design described in
> terms of the concepts given here it's hard to comment further.
>
> I'd suggest putting this all into one coherent document (not 3 as
> before) which starts by describing the terminology (section 0 in your
> mail which I'm replying to now), then gives an overview of the
> architecture (the rest of that mail), then describe which components
> (libxl, toolstack, etc) implement each bit of the architecture, then
> describe the libxl API which makes this possible (covered in previous
> mails I think).
>
> I think you have most of the words either here or from the other mails,
> they just need putting together into a single thing and going through to
> make sure that they use the same terminology and describe the same
> things etc.
>
> Please take a look at
> http://xenbits.xen.org/people/dvrabel/event-channels-H.pdf or
> http://lists.xen.org/archives/html/xen-devel/2014-10/msg03235.html for
> examples of the sort of cohesive document I mean.
>
> Ian.
>
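To illustrate the set/get/update/delete_snapshot_metadata helpers mentioned in this reply, here is a minimal Python sketch of a toolstack-side json store. It is a sketch under assumptions, not the proposed implementation: xl would do this in C with the gentype.py-generated to_json/from_json functions, the file layout follows the "snapshotdata-<snapshot_name>.libxl-json" naming from the xl design mail, the field names (name, parent, children) are taken from the xl_domain_snapshot_info sketch, and re-parenting the children of a deleted snapshot to its parent is an assumed policy that the thread does not specify.

```python
import json
from pathlib import Path
from typing import List

# Directory named in the xl design; tests can pass a temporary root instead.
SNAPSHOT_ROOT = Path("/var/lib/xen/snapshots")


def metadata_path(root: Path, domain_uuid: str, name: str) -> Path:
    """Location of one snapshot's json file under the per-domain directory."""
    return root / domain_uuid / f"snapshotdata-{name}.libxl-json"


def set_snapshot_metadata(root: Path, domain_uuid: str, info: dict) -> None:
    """Write (or overwrite) the json file for the snapshot named info['name']."""
    path = metadata_path(root, domain_uuid, info["name"])
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(info, indent=2))


def get_snapshot_metadata(root: Path, domain_uuid: str, name: str) -> dict:
    """Load one snapshot's metadata; raises FileNotFoundError if missing."""
    return json.loads(metadata_path(root, domain_uuid, name).read_text())


def update_snapshot_metadata(root: Path, domain_uuid: str, name: str,
                             **changes) -> None:
    """Read, modify and rewrite selected fields of one snapshot."""
    info = get_snapshot_metadata(root, domain_uuid, name)
    info.update(changes)
    set_snapshot_metadata(root, domain_uuid, info)


def list_snapshot_metadata(root: Path, domain_uuid: str) -> List[dict]:
    """Load every snapshot json file of a domain, sorted by file name."""
    directory = root / domain_uuid
    if not directory.is_dir():
        return []
    files = sorted(directory.glob("snapshotdata-*.libxl-json"))
    return [json.loads(path.read_text()) for path in files]


def delete_snapshot_metadata(root: Path, domain_uuid: str, name: str) -> None:
    """Remove one snapshot's json file and refresh parent/children links."""
    info = get_snapshot_metadata(root, domain_uuid, name)
    parent = info.get("parent")
    children = info.get("children", [])
    # Assumed policy: children of the deleted snapshot move up to its parent.
    for child in children:
        update_snapshot_metadata(root, domain_uuid, child, parent=parent)
    if parent:
        siblings = get_snapshot_metadata(root, domain_uuid, parent)["children"]
        kept = [c for c in siblings if c != name] + children
        update_snapshot_metadata(root, domain_uuid, parent, children=kept)
    metadata_path(root, domain_uuid, name).unlink()
```

Only the metadata is touched here; deleting or merging the disk snapshots themselves is the separate question discussed above.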
Thread overview: 24+ messages
2014-11-10 8:17 [RFC V8 0/3] domain snapshot document Chunyan Liu
2014-11-10 8:17 ` [RFC V8 1/3] libxl domain snapshot introduction Chunyan Liu
2014-11-10 8:17 ` [RFC V8 2/3] libxl domain snapshot API design Chunyan Liu
2014-11-10 17:04 ` George Dunlap
2014-11-11 8:07 ` Chun Yan Liu
2014-11-13 3:07 ` Chun Yan Liu
2014-11-13 11:41 ` Ian Campbell
2014-11-25 9:08 ` Chun Yan Liu
2014-11-28 15:43 ` Ian Campbell
2014-12-03 6:14 ` Chun Yan Liu
2014-12-05 14:02 ` Ian Campbell
2014-12-05 16:06 ` Wei Liu
2014-12-05 16:11 ` Ian Campbell
2014-12-05 16:22 ` Wei Liu
2014-12-08 7:34 ` Chun Yan Liu
2014-12-08 11:12 ` Wei Liu
2014-12-08 11:24 ` Wei Liu
2014-12-09 5:04 ` Chun Yan Liu
2014-12-09 11:11 ` Ian Campbell
2014-12-10 3:46 ` Chun Yan Liu
2014-12-12 16:22 ` Ian Campbell
2014-12-15 3:13 ` Chun Yan Liu
2014-11-10 8:17 ` [RFC V8 3/3] xl snapshot-xxx Design Chunyan Liu
[not found] <547F479A0200006600041019@soto.provo.novell.com>
2014-12-03 6:26 ` [RFC V8 2/3] libxl domain snapshot API design Chun Yan Liu