* [Qemu-devel] Qemu and Changed Block Tracking @ 2017-02-21 12:43 Peter Lieven 2017-02-21 15:11 ` Eric Blake 2017-02-21 21:13 ` John Snow 0 siblings, 2 replies; 15+ messages in thread From: Peter Lieven @ 2017-02-21 12:43 UTC (permalink / raw) To: qemu-devel@nongnu.org Hi, has anyone ever thought about implementing something like VMware CBT in Qemu? https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020128 Thanks, Peter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-21 12:43 [Qemu-devel] Qemu and Changed Block Tracking Peter Lieven @ 2017-02-21 15:11 ` Eric Blake 2017-02-21 21:13 ` John Snow 1 sibling, 0 replies; 15+ messages in thread From: Eric Blake @ 2017-02-21 15:11 UTC (permalink / raw) To: Peter Lieven, qemu-devel@nongnu.org [-- Attachment #1: Type: text/plain, Size: 598 bytes --] On 02/21/2017 06:43 AM, Peter Lieven wrote: > Hi, > > > is there anyone ever thought about implementing something like VMware > CBT in Qemu? > > > https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020128 Yes; in fact, the work on persistent dirty bitmaps and on NBD BLOCK_STATUS reporting is what we envision as the building blocks for an upper layer software to be able to grab CBT information on which blocks are dirty. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-21 12:43 [Qemu-devel] Qemu and Changed Block Tracking Peter Lieven 2017-02-21 15:11 ` Eric Blake @ 2017-02-21 21:13 ` John Snow 2017-02-22 8:45 ` Peter Lieven 1 sibling, 1 reply; 15+ messages in thread From: John Snow @ 2017-02-21 21:13 UTC (permalink / raw) To: Peter Lieven, qemu-devel@nongnu.org On 02/21/2017 07:43 AM, Peter Lieven wrote: > Hi, > > > is there anyone ever thought about implementing something like VMware > CBT in Qemu? > > > https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020128 > > > > Thanks, > Peter > > A bit outdated now, but: http://wiki.qemu-project.org/Features/IncrementalBackup and also a summary I wrote not too far back (PDF): https://drive.google.com/file/d/0B3CFr1TuHydWalVJaEdPaE5PbFE and I'm sure the Virtuozzo developers could chime in on this subject, but basically we do have something similar in the works, as eblake says. --js ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-21 21:13 ` John Snow @ 2017-02-22 8:45 ` Peter Lieven 2017-02-22 12:32 ` Eric Blake 2017-02-22 21:17 ` John Snow 0 siblings, 2 replies; 15+ messages in thread From: Peter Lieven @ 2017-02-22 8:45 UTC (permalink / raw) To: John Snow, qemu-devel@nongnu.org, Christian Theune On 02/21/2017 10:13 PM, John Snow wrote: > > On 02/21/2017 07:43 AM, Peter Lieven wrote: >> Hi, >> >> >> is there anyone ever thought about implementing something like VMware >> CBT in Qemu? >> >> >> https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020128 >> >> >> >> Thanks, >> Peter >> >> > A bit outdated now, but: > http://wiki.qemu-project.org/Features/IncrementalBackup > > and also a summary I wrote not too far back (PDF): > https://drive.google.com/file/d/0B3CFr1TuHydWalVJaEdPaE5PbFE > > and I'm sure the Virtuozzo developers could chime in on this subject, > but basically we do have something similar in the works, as eblake says. Hi John, Hi Erik, thanks for your feedback. Are you both the ones working primarily on this topic? If there is anything to review or help needed, please let me know. My 2 cents: one thing I had in mind, if image fleecing is not available but fetching the dirty bitmap externally is, would be a feature to put a write lock on a block device. Write lock means: drain all pending writes and queue all further writes until unlock (as if they were throttled to zero). This could help fetch consistent backups from the storage device (thinking of an iSCSI SAN) without the help of the hypervisor to actually transfer data (no load on the frontend network or the host). What would further be needed is a write generation for each block, not just a dirty bitmap. 
In this case something like this via QMP (and external software) should work:
---8<---
gen = write generation of last backup (or 0 for full backup)
do {
    nextgen = fetch current write generation (via QMP)
    dirtymap = send all blocks whose write generation is greater than 'gen' (via QMP)
    dirtycnt = 0
    foreach block in dirtymap {
        copy to backup via external software
        dirtycnt++
    }
    gen = nextgen
} while (dirtycnt >= X)  <--- to achieve this a throttling or similar might be needed

fsfreeze (optional)
write lock (via QMP)
backupgen = fetch current write generation (via QMP)
dirtymap = send all blocks whose write generation is greater than 'gen' (via QMP)
foreach block in dirtymap {
    copy to backup via external software
}
unlock (via QMP)
fsthaw (optional)
--->8---
As far as I understand, CBT in VMware is not just a dirty bitmap but also write generation tracking for blocks (64 KB size or whatever). Peter ^ permalink raw reply [flat|nested] 15+ messages in thread
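Peter's proposed loop can be sketched as a small, self-contained simulation. Everything here is hypothetical: no QMP commands for write generations exist, so a plain dict stands in for the per-block generations the proposed queries would return, and "copy via external software" is a dict copy. It only illustrates the converge-then-freeze control flow (iterate while many blocks change per pass, then do one final pass under the simulated write lock).

```python
# Illustrative simulation of the proposed converge-then-freeze backup loop.
# The "storage" dict stands in for the per-block write generations that the
# hypothetical QMP commands would report; nothing here talks to a real QEMU.

def incremental_backup(storage, last_gen, threshold=8):
    """Copy blocks dirtied since last_gen, looping until the number of
    blocks changed per pass drops below `threshold`, then do a final
    (conceptually write-locked) pass. Returns (backup, new_generation)."""
    backup = {}
    gen = last_gen
    while True:
        nextgen = storage["current_gen"]            # fetch current write generation
        dirty = [b for b, g in storage["blocks"].items() if g > gen]
        for block in dirty:
            backup[block] = storage["data"][block]  # copy via "external software"
        gen = nextgen
        if len(dirty) < threshold:                  # converged: enter freeze window
            break
    # final pass under the (simulated) write lock: no new writes can land
    for block, g in storage["blocks"].items():
        if g > gen:
            backup[block] = storage["data"][block]
    return backup, storage["current_gen"]

storage = {
    "current_gen": 5,
    "blocks": {0: 1, 1: 5, 2: 3, 3: 0},   # block -> generation of its last write
    "data": {0: b"a", 1: b"b", 2: b"c", 3: b"d"},
}
backup, gen = incremental_backup(storage, last_gen=2)
print(sorted(backup), gen)
```

With `last_gen=2`, only blocks 1 and 2 (generations 5 and 3) are copied, mirroring the incremental case in the pseudocode above.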
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-22 8:45 ` Peter Lieven @ 2017-02-22 12:32 ` Eric Blake 2017-02-23 14:27 ` Peter Lieven 2017-02-22 21:17 ` John Snow 1 sibling, 1 reply; 15+ messages in thread From: Eric Blake @ 2017-02-22 12:32 UTC (permalink / raw) To: Peter Lieven, John Snow, qemu-devel@nongnu.org, Christian Theune [-- Attachment #1: Type: text/plain, Size: 3760 bytes --] On 02/22/2017 02:45 AM, Peter Lieven wrote: >> A bit outdated now, but: >> http://wiki.qemu-project.org/Features/IncrementalBackup >> >> and also a summary I wrote not too far back (PDF): >> https://drive.google.com/file/d/0B3CFr1TuHydWalVJaEdPaE5PbFE >> >> and I'm sure the Virtuozzo developers could chime in on this subject, >> but basically we do have something similar in the works, as eblake says. > > Hi John, Hi Erik, It's Eric, but you're not the first to make that typo :) > > thanks for your feedback. Are you both the ones working primary on this topic? > If there is anything to review or help needed, please let me know. > > My 2 cents: > I thing I had in mind if there is no image fleecing available, but fetching the dirty bitmap > from external would be a feauture to put a write lock on a block device. The whole idea is to use a dirty bitmap coupled with image fleecing, where the point-in-time of the image fleecing is done at a window where the guest I/O is quiescent in order to get a stable fleecing point. We already support write locks (guest quiesence) using qga to do fsfreeze. You want the time that guest I/O is frozen to be as small as possible (in particular, the Windows implementation of quiescence will fail if you hold things frozen for more than a couple of seconds). Right now, the qcow2 image format does not track write generations, and I don't think we plan on adding that directly into qcow2. 
However, you can externally simulate write generations by keeping track of how many image fleecing points you have created (each fleecing point is another write generation). > In this case something like this via QMP (and external software) should work: > ---8<--- > gen = write generation of last backup (or 0 for full backup) > do { > nextgen = fetch current write generation (via QMP) > dirtymap = send all block whose write generation is greater than 'gen' (via QMP) No, we are NOT going to send dirty information via QMP. Rather, we are going to send it via NBD's extension NBD_CMD_BLOCK_STATUS. The idea is that a client connects and asks which qemu blocks are dirty, then uses that information to read only the dirty blocks. > dirtycnt = 0 > foreach block in dirtymap { > copy to backup via external software > dirtycnt++ > } > gen = nextgen > } while (dirtycnt < X) <--- to achieve this a thorttling or similar might be needed > > fsfreeze (optional) > write lock (via QMP) > backupgen = fetch current write generation (via QMP) > dirtymap = send all block whose write generation is greater than 'gen' (via QMP) > foreach block in dirtymap { > copy to backup via external software > } > unlock (via QMP) > fsthaw (optional) > --->8--- That is too long for the guest to be frozen. 
Rather, the flow is more like:

set up bitmap0 to track all writes since last point in time
fsfreeze (optional)
transaction to pivot to new bitmap1 (effectively freezing bitmap0 as the point in time we are interested in)
fsthaw
connect via NBD with a request to view the data at the bitmap0 point in time - read the bitmap, then read the sectors that the bitmap says are dirty
clean up bitmap0 (qemu can finally delete any point-in-time sectors that were copied off due to any writes after the thaw)

> As far as I understand CBT in VMware is not just only a dirty bitmap, but also a write generation tracking for blocks (size 64kb or whatever) Write generation is a matter of tracking which bitmaps and points in time you fleeced from. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
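The pivot step in Eric's flow maps onto a QMP `transaction` whose actions create the successor bitmap and freeze the old one atomically. A sketch of the payload an external tool might send, with made-up node/bitmap names; the action names follow later QEMU releases (at the time of this thread the disable action was still experimental and `x-` prefixed), so treat this as an assumption-laden sketch rather than the thread's actual interface:

```python
import json

def pivot_transaction(node, old_bitmap, new_bitmap):
    """Build a QMP 'transaction' that atomically starts tracking writes
    into a new bitmap while freezing the old one as the point in time.
    Node and bitmap names here are hypothetical."""
    return {
        "execute": "transaction",
        "arguments": {
            "actions": [
                # create the successor bitmap that records writes from now on
                {"type": "block-dirty-bitmap-add",
                 "data": {"node": node, "name": new_bitmap}},
                # freeze the old bitmap at this point in time
                {"type": "block-dirty-bitmap-disable",
                 "data": {"node": node, "name": old_bitmap}},
            ]
        },
    }

cmd = pivot_transaction("drive0", "bitmap0", "bitmap1")
print(json.dumps(cmd, indent=2))
```

Because both actions sit in one transaction, no write can land between "bitmap0 stops growing" and "bitmap1 starts", which is exactly the property the fsfreeze/fsthaw window relies on.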
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-22 12:32 ` Eric Blake @ 2017-02-23 14:27 ` Peter Lieven 2017-02-24 21:31 ` John Snow 0 siblings, 1 reply; 15+ messages in thread From: Peter Lieven @ 2017-02-23 14:27 UTC (permalink / raw) To: Eric Blake, John Snow, qemu-devel@nongnu.org, Christian Theune Am 22.02.2017 um 13:32 schrieb Eric Blake: > On 02/22/2017 02:45 AM, Peter Lieven wrote: >>> A bit outdated now, but: >>> http://wiki.qemu-project.org/Features/IncrementalBackup >>> >>> and also a summary I wrote not too far back (PDF): >>> https://drive.google.com/file/d/0B3CFr1TuHydWalVJaEdPaE5PbFE >>> >>> and I'm sure the Virtuozzo developers could chime in on this subject, >>> but basically we do have something similar in the works, as eblake says. >> Hi John, Hi Erik, > It's Eric, but you're not the first to make that typo :) > >> thanks for your feedback. Are you both the ones working primary on this topic? >> If there is anything to review or help needed, please let me know. >> >> My 2 cents: >> I thing I had in mind if there is no image fleecing available, but fetching the dirty bitmap >> from external would be a feauture to put a write lock on a block device. > The whole idea is to use a dirty bitmap coupled with image fleecing, > where the point-in-time of the image fleecing is done at a window where > the guest I/O is quiescent in order to get a stable fleecing point. We > already support write locks (guest quiesence) using qga to do fsfreeze. > You want the time that guest I/O is frozen to be as small as possible > (in particular, the Windows implementation of quiescence will fail if > you hold things frozen for more than a couple of seconds). > > Right now, the qcow2 image format does not track write generations, and > I don't think we plan on adding that directly into qcow2. 
However, you > can externally simulate write generations by keeping track of how many > image fleecing points you have created (each fleecing point is another > write generation). > > >> In this case something like this via QMP (and external software) should work: >> ---8<--- >> gen = write generation of last backup (or 0 for full backup) >> do { >> nextgen = fetch current write generation (via QMP) >> dirtymap = send all block whose write generation is greater than 'gen' (via QMP) > No, we are NOT going to send dirty information via QMP. Rather, we are > going to send it via NBD's extension NBD_CMD_BLOCK_STATUS. The idea is > that a client connects and asks which qemu blocks are dirty, then uses > that information to read only the dirty blocks. I understand that for the case of local storage, connecting via NBD to Qemu to grab a snapshot might be a good idea, but consider that you have a NAS for your vServer images, be it NFS, iSCSI, Ceph or whatever. In an enterprise scenario I would generally expect to have a NAS rather than local storage. When you are going to back up your vServer (full or incremental) you shuffle all the traffic through Qemu and the node running the vServer. In this case you run all the traffic over the wire twice: NAS -> Node -> Qemu -> Backup Server. But the Backup Server could instead connect to the NAS directly, avoiding load on the frontend LAN and the Qemu node. I would like to find a nice solution for this scenario. If not in the first step, it would maybe be good to have this in mind when implementing dirty block tracking. Peter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-23 14:27 ` Peter Lieven @ 2017-02-24 21:31 ` John Snow 2017-02-24 21:44 ` Eric Blake 0 siblings, 1 reply; 15+ messages in thread From: John Snow @ 2017-02-24 21:31 UTC (permalink / raw) To: Peter Lieven, Eric Blake, qemu-devel@nongnu.org, Christian Theune On 02/23/2017 09:27 AM, Peter Lieven wrote: > Am 22.02.2017 um 13:32 schrieb Eric Blake: >> On 02/22/2017 02:45 AM, Peter Lieven wrote: >>>> A bit outdated now, but: >>>> http://wiki.qemu-project.org/Features/IncrementalBackup >>>> >>>> and also a summary I wrote not too far back (PDF): >>>> https://drive.google.com/file/d/0B3CFr1TuHydWalVJaEdPaE5PbFE >>>> >>>> and I'm sure the Virtuozzo developers could chime in on this subject, >>>> but basically we do have something similar in the works, as eblake >>>> says. >>> Hi John, Hi Erik, >> It's Eric, but you're not the first to make that typo :) >> >>> thanks for your feedback. Are you both the ones working primary on >>> this topic? >>> If there is anything to review or help needed, please let me know. >>> >>> My 2 cents: >>> I thing I had in mind if there is no image fleecing available, but >>> fetching the dirty bitmap >>> from external would be a feauture to put a write lock on a block device. >> The whole idea is to use a dirty bitmap coupled with image fleecing, >> where the point-in-time of the image fleecing is done at a window where >> the guest I/O is quiescent in order to get a stable fleecing point. We >> already support write locks (guest quiesence) using qga to do fsfreeze. >> You want the time that guest I/O is frozen to be as small as possible >> (in particular, the Windows implementation of quiescence will fail if >> you hold things frozen for more than a couple of seconds). >> >> Right now, the qcow2 image format does not track write generations, and >> I don't think we plan on adding that directly into qcow2. 
However, you >> can externally simulate write generations by keeping track of how many >> image fleecing points you have created (each fleecing point is another >> write generation). >> >> >>> In this case something like this via QMP (and external software) >>> should work: >>> ---8<--- >>> gen = write generation of last backup (or 0 for full backup) >>> do { >>> nextgen = fetch current write generation (via QMP) >>> dirtymap = send all block whose write generation is greater >>> than 'gen' (via QMP) >> No, we are NOT going to send dirty information via QMP. Rather, we are >> going to send it via NBD's extension NBD_CMD_BLOCK_STATUS. The idea is >> that a client connects and asks which qemu blocks are dirty, then uses >> that information to read only the dirty blocks. > > I understand, that for the case of local storage connecting via NBD to > Qemu to grep a snapshot > might be a good idea, but consider that you have a NAS for your vServer > images. May it be NFS, > iSCSI, CEPH or whatever. In an enterprise scenario I would generally > except to have a NAS rather > than local storage. > > When you are going to backup your vServer (full or incremental) you > shuffle all the traffic through > Qemu and your Node running the vServer. In this case you run all the > traffic over the wire twice. > > NAS -> Node -> Qemu - > Backup Server > > But the Backup Server could instead connect to the NAS directly avoiding > load on the frontent LAN > and the Qemu Node. > In a live backup I don't see how you will be removing QEMU from the data transfer loop. QEMU is the only process that knows what the correct view of the image is, and needs to facilitate. It's not safe to copy the blocks directly without QEMU's mediation. --js > I would like to find a nice solution for this scenario. If not in the > first step it would maybe be good to > have this in mind when implementing a dirty block tracking. > > Peter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-24 21:31 ` John Snow @ 2017-02-24 21:44 ` Eric Blake 2017-02-26 20:41 ` Peter Lieven 2017-02-27 20:39 ` John Snow 0 siblings, 2 replies; 15+ messages in thread From: Eric Blake @ 2017-02-24 21:44 UTC (permalink / raw) To: John Snow, Peter Lieven, qemu-devel@nongnu.org, Christian Theune [-- Attachment #1: Type: text/plain, Size: 1932 bytes --] On 02/24/2017 03:31 PM, John Snow wrote: >> >> But the Backup Server could instead connect to the NAS directly avoiding >> load on the frontent LAN >> and the Qemu Node. >> > > In a live backup I don't see how you will be removing QEMU from the data > transfer loop. QEMU is the only process that knows what the correct view > of the image is, and needs to facilitate. > > It's not safe to copy the blocks directly without QEMU's mediation. Although we may already have enough tools in place to help achieve that: create a temporary qcow2 wrapper around the primary image via external snapshot, so that the primary image is now read-only in qemu; then use whatever block-status mechanism (whether the NBD block status extension, or directly reading from a persistent bitmap) to facilitate whatever more efficient offline transfer of just the relevant portions of that main file, then live block-commit to get qemu to start writing to the file again. In other words, any time your algorithm wants to cause an I/O freeze to a particular file, the solution is to add a qcow2 external snapshot followed by a live commit. 
So tweaking the proposal from a few mails ago:

fsfreeze (optional)
create qcow2 snapshot wrapper as a write lock (via QMP)
fsthaw - now with no risk of violating guest timing constraints
dirtymap = find all blocks that are dirty since last backup (via named bitmap/NBD block status)
foreach block in dirtymap {
    copy to backup via external software
}
live commit image (via QMP)

The window where guest I/O is frozen is small (the freeze/snapshot create/thaw steps can be done in less than a second), while the window where you are extracting incremental backup data is longer (during that time, guest I/O is happening into a wrapper qcow2 file). -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
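The "wrapper" and "live commit" steps in Eric's sequence correspond to the existing QMP commands `blockdev-snapshot-sync` and `block-commit`. A rough sketch of the two payloads an external backup driver might send; the device and file names are invented, and exact argument sets vary across QEMU versions, so this is a sketch rather than a recipe:

```python
import json

# Hypothetical device/file names; the two QMP commands themselves are real.
DEVICE = "drive0"

def snapshot_wrapper(overlay_file):
    """Make the primary image read-only by pointing guest writes at a
    fresh qcow2 overlay - the 'write lock' of the proposal above."""
    return {"execute": "blockdev-snapshot-sync",
            "arguments": {"device": DEVICE,
                          "snapshot-file": overlay_file,
                          "format": "qcow2"}}

def live_commit():
    """Merge the overlay back into the primary image once the backup
    has been copied off, so qemu writes to the original file again."""
    return {"execute": "block-commit",
            "arguments": {"device": DEVICE}}

for cmd in (snapshot_wrapper("/var/tmp/wrap.qcow2"), live_commit()):
    print(json.dumps(cmd))
```

The external copy step (reading the now read-only primary image directly from the NAS) happens between the two commands, which is what keeps the guest-frozen window to the sub-second snapshot creation.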
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-24 21:44 ` Eric Blake @ 2017-02-26 20:41 ` Peter Lieven 2017-02-27 16:56 ` Eric Blake 2017-02-27 20:39 ` John Snow 1 sibling, 1 reply; 15+ messages in thread From: Peter Lieven @ 2017-02-26 20:41 UTC (permalink / raw) To: Eric Blake; +Cc: John Snow, qemu-devel@nongnu.org, Christian Theune > Am 24.02.2017 um 22:44 schrieb Eric Blake <eblake@redhat.com>: > > On 02/24/2017 03:31 PM, John Snow wrote: >>> >>> But the Backup Server could instead connect to the NAS directly avoiding >>> load on the frontent LAN >>> and the Qemu Node. >>> >> >> In a live backup I don't see how you will be removing QEMU from the data >> transfer loop. QEMU is the only process that knows what the correct view >> of the image is, and needs to facilitate. >> >> It's not safe to copy the blocks directly without QEMU's mediation. > > Although we may already have enough tools in place to help achieve that: > create a temporary qcow2 wrapper around the primary image via external > snapshot, so that the primary image is now read-only in qemu; then use > whatever block-status mechanism (whether the NBD block status extension, > or directly reading from a persistent bitmap) to facilitate whatever > more efficient offline transfer of just the relevant portions of that > main file, then live block-commit to get qemu to start writing to the > file again. > > In other words, any time your algorithm wants to cause an I/O freeze to > a particular file, the solution is to add a qcow2 external snapshot > followed by a live commit. 
> So tweaking the proposal a few mails ago: > > fsfreeze (optional) > create qcow2 snapshot wrapper as a write lock (via QMP) > fsthaw - now with no risk of violating guest timing constraints > dirtymap = find all blocks that are dirty since last backup (via named > bitmap/NBD block status) > foreach block in dirtymap { > copy to backup via external software > } > live commit image (via QMP) > > The window where guest I/O is frozen is small (the freeze/snapshot > create/thaw steps can be done in less than a second), while the window > where you are extracting incremental backup data is longer (during that > time, guest I/O is happening into a wrapper qcow2 file). The live-snapshot/live-commit stuff could indeed help in my scenario. If I understand correctly, this is something that already works today, correct? If I have taken a live-snapshot, are live-migration and stop/start of the VM still possible? What about live-migration and start/stop during live-commit? I am not talking about the dirty bitmap tracking; I understand that persistence and live-migration support are still in the works, I am just interested in the snapshot/commit part. Thanks Peter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-26 20:41 ` Peter Lieven @ 2017-02-27 16:56 ` Eric Blake 0 siblings, 0 replies; 15+ messages in thread From: Eric Blake @ 2017-02-27 16:56 UTC (permalink / raw) To: Peter Lieven; +Cc: John Snow, qemu-devel@nongnu.org, Christian Theune [-- Attachment #1: Type: text/plain, Size: 1045 bytes --] On 02/26/2017 02:41 PM, Peter Lieven wrote: > The live-snapshot/live-commit stuff could indeed help in my scenario. If I understand correctly this is > something that already works today, correct? If I have taken a live-snapshot, is live-migration and > stop/start of the VM still possible? What about live-migration and start/stop during live-commit? Yes, a guest can be started or stopped while migration and/or live-commit are underway. You probably have to keep the qemu process around (stopping the guest but keeping qemu alive is different than stopping qemu altogether), which is where the persistence factors into it (once we have persistent bitmaps, then stopping qemu altogether becomes possible). > I don’t talk about the dirty bitmap tracking I understand that persistence and live-migration support > is still in the works, I’m just interested in the snapshot/commit part. > > Thanks > Peter > -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-24 21:44 ` Eric Blake 2017-02-26 20:41 ` Peter Lieven @ 2017-02-27 20:39 ` John Snow 1 sibling, 0 replies; 15+ messages in thread From: John Snow @ 2017-02-27 20:39 UTC (permalink / raw) To: Eric Blake, Peter Lieven, qemu-devel@nongnu.org, Christian Theune On 02/24/2017 04:44 PM, Eric Blake wrote: > On 02/24/2017 03:31 PM, John Snow wrote: >>> >>> But the Backup Server could instead connect to the NAS directly avoiding >>> load on the frontent LAN >>> and the Qemu Node. >>> >> >> In a live backup I don't see how you will be removing QEMU from the data >> transfer loop. QEMU is the only process that knows what the correct view >> of the image is, and needs to facilitate. >> >> It's not safe to copy the blocks directly without QEMU's mediation. > > Although we may already have enough tools in place to help achieve that: > create a temporary qcow2 wrapper around the primary image via external > snapshot, so that the primary image is now read-only in qemu; then use > whatever block-status mechanism (whether the NBD block status extension, > or directly reading from a persistent bitmap) to facilitate whatever > more efficient offline transfer of just the relevant portions of that > main file, then live block-commit to get qemu to start writing to the > file again. > Right, really good point. We can just turn the "live" backup into a not-live one (kind of!) to work around the constraint. In this case, creating the external snapshot should probably create a "new" bitmap on the root, leaving the old one behind on the backing file. This avoids spurious copies of data that hasn't changed in the backing file, and makes clearing the bitmap on success easier for us. Once the snapshots are re-merged, we can merge their respective bitmaps again. This can work in some scenarios, sure! 
We may have to be careful about how exactly bitmaps fork when you create new external snapshots, but that does seem workable and (possibly) the most performant, if that's a concern. --js > In other words, any time your algorithm wants to cause an I/O freeze to > a particular file, the solution is to add a qcow2 external snapshot > followed by a live commit. > > So tweaking the proposal a few mails ago: > > fsfreeze (optional) > create qcow2 snapshot wrapper as a write lock (via QMP) > fsthaw - now with no risk of violating guest timing constraints > dirtymap = find all blocks that are dirty since last backup (via named > bitmap/NBD block status) > foreach block in dirtymap { > copy to backup via external software > } > live commit image (via QMP) > > The window where guest I/O is frozen is small (the freeze/snapshot > create/thaw steps can be done in less than a second), while the window > where you are extracting incremental backup data is longer (during that > time, guest I/O is happening into a wrapper qcow2 file). > ^ permalink raw reply [flat|nested] 15+ messages in thread
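The bitmap fork/re-merge John describes later grew a first-class QMP command: in QEMU 4.0+ the "merge their respective bitmaps again" step can be expressed with `block-dirty-bitmap-merge`. That command did not exist when this thread was written, so this is a forward-looking sketch with made-up node and bitmap names:

```python
import json

def merge_bitmaps(node, target, sources):
    """After the snapshot chain is committed back together, fold the
    bitmap left on the overlay into the one on the base image.
    Node and bitmap names here are hypothetical."""
    return {"execute": "block-dirty-bitmap-merge",
            "arguments": {"node": node,
                          "target": target,
                          "bitmaps": sources}}

cmd = merge_bitmaps("drive0", "bitmap0", ["bitmap0-overlay"])
print(json.dumps(cmd))
```

After this merge, `bitmap0` again covers every write since the last backup, regardless of whether the write landed in the base image or in the (now committed) overlay.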
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-22 8:45 ` Peter Lieven 2017-02-22 12:32 ` Eric Blake @ 2017-02-22 21:17 ` John Snow 2017-02-23 14:29 ` Peter Lieven 1 sibling, 1 reply; 15+ messages in thread From: John Snow @ 2017-02-22 21:17 UTC (permalink / raw) To: Peter Lieven, qemu-devel@nongnu.org, Christian Theune On 02/22/2017 03:45 AM, Peter Lieven wrote: > > Am 21.02.2017 um 22:13 schrieb John Snow: >> >> On 02/21/2017 07:43 AM, Peter Lieven wrote: >>> Hi, >>> >>> >>> is there anyone ever thought about implementing something like VMware >>> CBT in Qemu? >>> >>> >>> https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020128 >>> >>> >>> >>> Thanks, >>> Peter >>> >>> >> A bit outdated now, but: >> http://wiki.qemu-project.org/Features/IncrementalBackup >> >> and also a summary I wrote not too far back (PDF): >> https://drive.google.com/file/d/0B3CFr1TuHydWalVJaEdPaE5PbFE >> >> and I'm sure the Virtuozzo developers could chime in on this subject, >> but basically we do have something similar in the works, as eblake says. > > Hi John, Hi Erik, > > thanks for your feedback. Are you both the ones working primary on this topic? > If there is anything to review or help needed, please let me know. > I've been working on incremental backups; Fam and I now co-maintain block/dirty-bitmap.c. Vladimir Sementsov-Ogievskiy has been working on bitmap persistence and migration from Virtuozzo; as well as the NBD specification amendment to allow us to fleece images with dirty bitmaps. (Check the wiki and the whitepaper I linked!) Eric has been guiding the review process for the NBD side of things. > My 2 cents: > I thing I had in mind if there is no image fleecing available, but fetching the dirty bitmap > from external would be a feauture to put a write lock on a block device. > Write lock means, drain all pending writes and queue all further writes until unlock (as if they > were throttled to zero). 
This could help fetch consistent backups from storage device (thinking of iSCSI SAN) without > the help of the hypervisor to actually transfer data (no load in the frontend network or the host). What would further > be needed is a write generation for each block, not just only a dirty bitmap. > > In this case something like this via QMP (and external software) should work: > ---8<--- > gen = write generation of last backup (or 0 for full backup) > do { > nextgen = fetch current write generation (via QMP) As Eric said, there's a lot of hostility to using QMP as a metadata transmission protocol. > dirtymap = send all block whose write generation is greater than 'gen' (via QMP) > dirtycnt = 0 > foreach block in dirtymap { > copy to backup via external software > dirtycnt++ > } > gen = nextgen > } while (dirtycnt < X) <--- to achieve this a thorttling or similar might be needed > > fsfreeze (optional) > write lock (via QMP) > backupgen = fetch current write generation (via QMP) > dirtymap = send all block whose write generation is greater than 'gen' (via QMP) > foreach block in dirtymap { > copy to backup via external software > } > unlock (via QMP) > fsthaw (optional) > --->8--- > > As far as I understand CBT in VMware is not just only a dirty bitmap, but also a write generation tracking for blocks (size 64kb or whatever) > I think at the moment I'm worried about getting the basic features out the door, but I'm not opposed to adding fancier features if there's justification or demand for them. > Peter > --js ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking 2017-02-22 21:17 ` John Snow @ 2017-02-23 14:29 ` Peter Lieven 2017-02-23 19:34 ` John Snow 0 siblings, 1 reply; 15+ messages in thread From: Peter Lieven @ 2017-02-23 14:29 UTC (permalink / raw) To: John Snow, qemu-devel@nongnu.org, Christian Theune Am 22.02.2017 um 22:17 schrieb John Snow: > > On 02/22/2017 03:45 AM, Peter Lieven wrote: >> Am 21.02.2017 um 22:13 schrieb John Snow: >>> On 02/21/2017 07:43 AM, Peter Lieven wrote: >>>> Hi, >>>> >>>> >>>> is there anyone ever thought about implementing something like VMware >>>> CBT in Qemu? >>>> >>>> >>>> https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020128 >>>> >>>> >>>> >>>> Thanks, >>>> Peter >>>> >>>> >>> A bit outdated now, but: >>> http://wiki.qemu-project.org/Features/IncrementalBackup >>> >>> and also a summary I wrote not too far back (PDF): >>> https://drive.google.com/file/d/0B3CFr1TuHydWalVJaEdPaE5PbFE >>> >>> and I'm sure the Virtuozzo developers could chime in on this subject, >>> but basically we do have something similar in the works, as eblake says. >> Hi John, Hi Erik, >> >> thanks for your feedback. Are you both the ones working primary on this topic? >> If there is anything to review or help needed, please let me know. >> > I've been working on incremental backups; Fam and I now co-maintain > block/dirty-bitmap.c. > > Vladimir Sementsov-Ogievskiy has been working on bitmap persistence and > migration from Virtuozzo; as well as the NBD specification amendment to > allow us to fleece images with dirty bitmaps. > > (Check the wiki and the whitepaper I linked!) > > Eric has been guiding the review process for the NBD side of things. > >> My 2 cents: >> I thing I had in mind if there is no image fleecing available, but fetching the dirty bitmap >> from external would be a feauture to put a write lock on a block device. 
>> Write lock means: drain all pending writes and queue all further writes until unlock (as if they
>> were throttled to zero). This could help fetch consistent backups from a storage device (thinking of an iSCSI SAN) without
>> the help of the hypervisor to actually transfer data (no load on the frontend network or the host). What would further
>> be needed is a write generation for each block, not just a dirty bitmap.
>>
>> In this case something like this via QMP (and external software) should work:
>> ---8<---
>> gen = write generation of last backup (or 0 for full backup)
>> do {
>>     nextgen = fetch current write generation (via QMP)
> As Eric said, there's a lot of hostility to using QMP as a metadata
> transmission protocol.
>
>>     dirtymap = send all blocks whose write generation is greater than 'gen' (via QMP)
>>     dirtycnt = 0
>>     foreach block in dirtymap {
>>         copy to backup via external software
>>         dirtycnt++
>>     }
>>     gen = nextgen
>> } while (dirtycnt > X)    <--- to achieve this, throttling or similar might be needed
>>
>> fsfreeze (optional)
>> write lock (via QMP)
>> backupgen = fetch current write generation (via QMP)
>> dirtymap = send all blocks whose write generation is greater than 'gen' (via QMP)
>> foreach block in dirtymap {
>>     copy to backup via external software
>> }
>> unlock (via QMP)
>> fsthaw (optional)
>> --->8---
>>
>> As far as I understand, CBT in VMware is not just a dirty bitmap, but also write generation tracking for blocks (of 64 kB or whatever size).
>>
> I think at the moment I'm worried about getting the basic features out
> the door, but I'm not opposed to adding fancier features if there's
> justification or demand for them.

Sure, the basic features are most important. I was just thinking of the
above scenario to interact with a NAS and have Qemu's "help" to create
incremental backups.

Peter

^ permalink raw reply	[flat|nested] 15+ messages in thread
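For comparison with the write-generation proposal, the dirty-bitmap workflow John refers to (described on the IncrementalBackup wiki page linked above) is driven by QMP commands along these lines; the node name and target file are illustrative:

```python
import json

# Create a dirty bitmap on the drive; QEMU records every write to
# "drive0" in "bitmap0" from this point on.
create_bitmap = {
    "execute": "block-dirty-bitmap-add",
    "arguments": {"node": "drive0", "name": "bitmap0"},
}

# Later, copy out only the clusters the bitmap marked dirty; on
# success the bitmap is reset so it tracks the next delta.
incremental = {
    "execute": "drive-backup",
    "arguments": {
        "device": "drive0",
        "sync": "incremental",
        "bitmap": "bitmap0",
        "target": "inc.0.qcow2",
        "format": "qcow2",
        "mode": "existing",
    },
}

# A management tool would send these over the QMP socket as JSON:
for cmd in (create_bitmap, incremental):
    print(json.dumps(cmd))
```

Unlike the generation-counter scheme, the copy here is performed by QEMU itself rather than by external software reading the SAN directly — which is exactly the trade-off Peter is questioning.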
* Re: [Qemu-devel] Qemu and Changed Block Tracking
  2017-02-23 14:29 ` Peter Lieven
@ 2017-02-23 19:34   ` John Snow
  2017-02-24  7:59     ` Peter Lieven
  0 siblings, 1 reply; 15+ messages in thread
From: John Snow @ 2017-02-23 19:34 UTC (permalink / raw)
To: Peter Lieven, qemu-devel@nongnu.org, Christian Theune

On 02/23/2017 09:29 AM, Peter Lieven wrote:
> [...]
>
> Sure, the basic features are most important. I was just thinking of the
> above scenario to interact with a NAS and have Qemu's "help"
> to create incremental backups.
>
> Peter

If you get the chance to read the white paper I linked to you, please
let me know which use cases we might not be able to cover that you feel
other programs might offer.

I can also make a point to CC you on future upstream discussions as they
happen.

Thanks,

--js

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Qemu and Changed Block Tracking
  2017-02-23 19:34 ` John Snow
@ 2017-02-24  7:59   ` Peter Lieven
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Lieven @ 2017-02-24 7:59 UTC (permalink / raw)
To: John Snow, qemu-devel@nongnu.org, Christian Theune

On 02/23/2017 08:34 PM, John Snow wrote:
> [...]
>> Sure, the basic features are most important. I was just thinking of the
>> above scenario to interact with a NAS and have Qemu's "help"
>> to create incremental backups.
>>
>> Peter
> If you get the chance to read the white paper I linked to you, please
> let me know which use cases we might not be able to cover that you feel
> other programs might offer.

Will do. I have only had a short glimpse so far.

>
> I can also make a point to CC you on future upstream discussions as they
> happen.

Yes, please.

Peter

^ permalink raw reply	[flat|nested] 15+ messages in thread
end of thread, other threads: [~2017-02-27 20:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-21 12:43 [Qemu-devel] Qemu and Changed Block Tracking Peter Lieven
2017-02-21 15:11 ` Eric Blake
2017-02-21 21:13 ` John Snow
2017-02-22  8:45   ` Peter Lieven
2017-02-22 12:32     ` Eric Blake
2017-02-23 14:27       ` Peter Lieven
2017-02-24 21:31         ` John Snow
2017-02-24 21:44           ` Eric Blake
2017-02-26 20:41             ` Peter Lieven
2017-02-27 16:56               ` Eric Blake
2017-02-27 20:39                 ` John Snow
2017-02-22 21:17   ` John Snow
2017-02-23 14:29     ` Peter Lieven
2017-02-23 19:34       ` John Snow
2017-02-24  7:59         ` Peter Lieven