From: Eric Blake
To: Bharadwaj Rayala, qemu-discuss@nongnu.org, qemu-devel@nongnu.org
Cc: kashyap.cv@gmail.com, Suman Swaroop, kchamart@redhat.com, John Snow
Date: Tue, 22 Jan 2019 15:54:54 -0600
Subject: Re: [Qemu-devel] Incremental drive-backup with dirty bitmaps

On 1/22/19 1:29 PM, Bharadwaj Rayala wrote:
> Hi,
>
> TL(Can't)R: I am trying to figure out a workflow for doing incremental
> drive-backups using dirty bitmaps. It feels like qemu lacks some
> essential features to achieve it.
>
> I am trying to build a backup workflow (program) using drive-backup along
> with dirty bitmaps to take backups of kvm vms. Either the pull or the
> push model works for me. Since the drive-backup push model is already
> implemented, I am going forward with it. I am not able to figure out a
> few details and couldn't find any documentation around it.
> Any help would be appreciated.
>
> Context: I would like to take recoverable, consistent, incremental
> backups of kvm vms whose disks are backed by either qcow2 or raw images.
> Let's say there is a vm, vm1, with drive1 backed by the image chain
> (A <-- B). These are the rough steps I would like to do.
>
> Method 1:
> Backup:
> 1. Perform a full backup using `drive-backup(drive1, sync=full, dest =
> /nfs/vm1/drive1)`. Use a transaction to do
> `block-dirty-bitmap-add(drive1, bitmap1)`. Store the vm config separately.
> 2. Perform an incremental backup using `drive-backup(drive1,
> sync=incremental, mode=existing, bitmap=bitmap1, dest=/nfs/vm1/drive1)`.
> Store the vm config separately.
> 3. Rinse and repeat.
> Recovery (just the latest backup, incremental not required):
>     Copy the full qcow2 from nfs to host storage. Spawn a new vm with the
> same vm config.
> Temporary quick recovery:
>     Create a new qcow2 layer on top of the existing /nfs/vm1/drive1 on the
> nfs storage itself. Spawn a new vm with its disk on the nfs storage.

Sounds like it should work; using qemu to push the backup out.

> Issues I face:
> 1. Does the drive-backup stall for the whole time the block job is in
> progress? This is a strict no for me. I did not find any documentation
> regarding it but a powerpoint presentation (from kashyapc) mentioning it.
> (Assuming yes!)

The drive-backup is running in parallel to the guest. I'm not sure what
stalls you are seeing - but as qemu is doing all the work, it DOES have
to service both guest requests and the work to copy out the backup;
also, if you have known-inefficient lseek() situations, there may be
cases where qemu is doing a lousy job (there's work underway on the list
to improve qemu's caching of lseek() data).

> 2. Is the backup consistent? Are the drive file-systems quiesced on
> backup? (Assuming no!)

If you want the file systems quiesced on backup, then merely bracket
your transaction that kicks off the drive-backup inside guest-agent
commands that freeze and thaw the disk. So, consistency is not the
default (because it requires trusting the guest), but it is possible.

>
> To achieve both of the above, one hack I could think of was to take a
> snapshot and read from the snapshot.
>
> Method 2:
> 1. Perform a full backup using `drive-backup(drive1, sync=full, dest =
> /nfs/vm1/drive1)`. Use a transaction to do
> `block-dirty-bitmap-add(drive1, bitmap1)`. Store the vm config separately.
> 2. Perform the incremental backup by
>     a. Add bitmap2 to drive1: `block-dirty-bitmap-add(drive1, bitmap2)`.
>     b. Take a vm snapshot with drive1 (exclude memory, quiesce). The
> drive1 image chain is now A <-- B <-- C.
>     c. Take the incremental using bitmap1 but using data from node B:
> `drive-backup(*#nodeB*, sync=incremental, mode=existing, bitmap=bitmap1,
> dest=/nfs/vm1/drive1)`
>     d. Delete bitmap1: `block-dirty-bitmap-delete(drive1, bitmap1)`
>     e. Delete the vm snapshot on drive1. The drive1 image chain is now
> A <-- B.
>     f. bitmap2 now tracks the changes from incremental 1 to incremental 2.
>
> The drawback of this method (had it worked) would be that incremental
> backups would contain dirty blocks that are a superset of the actual
> blocks that changed between the snapshot and the last snapshot.
> (Incremental x would contain blocks that changed while the incremental
> x-1 backup was in progress.) But there are no correctness issues.
>
>
> *I cannot do this because drive-backup does not allow the bitmap and the
> node that the bitmap is attached to, to be different. :( *

It might, as long as the bitmap is found on the backing chain (I'm a bit
fuzzier on that case, but KNOW that for pull-mode backups, my libvirt
code is definitely relying on being able to access the bitmap from the
backing file of the BDS being exported over NBD).
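
For reference, the freeze/thaw bracketing suggested above for
consistency can be sketched roughly as follows. This is only an
illustration, not something tested against a live qemu: it builds the
JSON messages and prints them rather than talking to any socket. The
helper names (full_backup_transaction, quiesced_sequence), the "qmp" and
"guest-agent" channel labels, and the qcow2 target format are
assumptions made for the example; the command names (transaction,
block-dirty-bitmap-add, drive-backup, guest-fsfreeze-freeze,
guest-fsfreeze-thaw) are the existing QMP and guest-agent commands.

```python
import json


def full_backup_transaction(device, target, bitmap):
    """Build a QMP 'transaction' that adds a bitmap and starts a full backup.

    Grouping both actions in one transaction means the new bitmap starts
    tracking writes at the same point in time the backup job starts from.
    """
    return {
        "execute": "transaction",
        "arguments": {
            "actions": [
                {
                    "type": "block-dirty-bitmap-add",
                    "data": {"node": device, "name": bitmap},
                },
                {
                    "type": "drive-backup",
                    "data": {
                        "device": device,
                        "target": target,
                        "sync": "full",
                        "format": "qcow2",  # assumed target format
                    },
                },
            ]
        },
    }


def quiesced_sequence(transaction):
    """Return (channel, message) pairs: freeze, start the backup, thaw.

    The channel labels are hypothetical; they only indicate whether the
    message goes to the guest agent or to the QMP monitor.
    """
    return [
        ("guest-agent", {"execute": "guest-fsfreeze-freeze"}),
        ("qmp", transaction),
        ("guest-agent", {"execute": "guest-fsfreeze-thaw"}),
    ]


if __name__ == "__main__":
    txn = full_backup_transaction("drive1", "/nfs/vm1/drive1", "bitmap1")
    for channel, message in quiesced_sequence(txn):
        print(channel, json.dumps(message))
```

The thaw can be sent as soon as the transaction command returns, since
the backup's point in time is fixed when the job starts; the guest does
not need to stay frozen while the copy runs.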

> Some other issues I was facing that I worked around:
> 1. Let's say I have to back up a vm with 2 disks (both at a fixed point
> in time, either both fail or both pass). To atomically do a bitmap-add
> and drive-backup(sync=full) I can use transactions. To achieve a backup
> at a fixed point in time, I can use a transaction with multiple
> drive-backups. To either fail the whole backup or succeed (when multiple
> drives are present), I can use completion-mode = grouped. But then I
> can't combine them as it's not supported, i.e., do a
> Transaction{drive-backup(drive1), dirty-bitmap-add(drive1,
> bitmap1), drive-backup(drive2), dirty-bitmap-add(drive2, bitmap1),
> completion-mode=grouped}.

What error message are you getting? I'm not surprised if
completion-mode=grouped isn't playing nicely with bitmaps in
transactions, although that is something we should fix.

> Workaround: Create bitmaps first, then take the full backup. Effect:
> Incrementals would be a small superset of the actual changed blocks.
> 2. Why do I need to dismiss old jobs to start a new job on a node? I want
> to retain the block-job end state for a day before I clear them. So I
> set auto-dismiss to false. This does not allow new jobs to run unless
> the old job is dismissed, even if state=concluded.

Yes, there is probably more work needed to make parallel jobs do what
people want.

> Workaround: no workaround, store the end-job-status somewhere else.
> 3. Is there a way pre 2.12 to achieve auto-finalize = false in a
> transaction? Can I somehow add a dummy block job that will only finish
> when I want to finalise the actual 2 disks' block jobs? My backup
> workflow needs to run on envs pre 2.12.

Ouch - backups pre-2.12 have issues.
If I had not read this paragraph, my recommendation would be to stick to
3.1 and use pull-mode backups (where you use NBD to learn which portions
of the image were dirtied, and pull those portions of the disk over NBD
rather than qemu pushing them); I even have a working demo of
preliminary libvirt code driving that, which I presented at last year's
KVM Forum.

> Workaround: Could not achieve this. So if an incremental fails after
> the block jobs succeed but before I can ensure success (I have to do
> some metadata operations on my side), I retry with sync=full mode.
>
>
> *So what is the recommended way of taking backups with incremental
> bitmaps?*
> Thank you for taking the time to read through this.
>
> Best,
> Bharadwaj.
>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org