From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <54E76CF5.50303@redhat.com>
Date: Fri, 20 Feb 2015 12:20:53 -0500
From: John Snow
References: <1423865338-8576-1-git-send-email-jsnow@redhat.com> <20150220110920.GD3867@stefanha-thinkpad.redhat.com>
In-Reply-To: <20150220110920.GD3867@stefanha-thinkpad.redhat.com>
Subject: Re: [Qemu-devel] [PATCH v13 00/17] block: incremental backup series
To: Stefan Hajnoczi
Cc: kwolf@redhat.com, famz@redhat.com, armbru@redhat.com, qemu-devel@nongnu.org, vsementsov@parallels.com, mreitz@redhat.com

On 02/20/2015 06:09 AM, Stefan Hajnoczi wrote:
> On Fri, Feb 13, 2015 at 05:08:41PM -0500, John Snow wrote:
>> This series requires: [PATCH v3] blkdebug: fix "once" rule
>>
>> Welcome to the "incremental backup" newsletter, where we discuss
>> exciting developments in non-redundant backup technology.
>> This issue is called the "Max Reitz Fixup" issue.
>>
>> This patchset enables the in-memory part of the incremental backup
>> feature. There are two series on the mailing list now by Vladimir
>> Sementsov-Ogievskiy that enable the migration and persistence of
>> dirty bitmaps.
>>
>> This series and most patches were originally by Fam Zheng.
>
> Please add docs/incremental-backup.txt explaining how the commands are
> intended to be used to perform backups. The QAPI documentation is
> sparse and QMP clients would probably need to read the code to
> understand how these commands work.
>

OK. I wrote a markdown-formatted file that explains it all pretty well;
should I check it in as-is, or should I convert it to some other format?

https://github.com/jnsnow/qemu/blob/bitmap-demo/docs/bitmaps.md

> I'm not sure I understand the need for all the commands: add, remove,
> enable, disable, clear. Why are these commands necessary?

add/remove are self-explanatory.

clear allows you to re-sync a bitmap to a full backup after you have
already been running for some time. See the readme for the use case.
Yes, you *COULD* delete and re-create the bitmap with add/remove, but
why? Clear is nice; I stand by it.

enable/disable: not necessarily useful for the simple case at this
exact moment; they can be used to track writes during a period of time
if desired. You could use them with transactions to record activity
through different periods of time. They were originally used for what
the "frozen" case covers now, but I left them in. I could stand to
part with them if you think they detract more than they help. They
seemed like useful primitives. My default action will still be to
leave them in.

> I think just add and remove should be enough:
>
> Initial full backup
> -------------------
> Use transaction to add bitmaps to all drives atomically (i.e.
> point-in-time snapshot across all drives) and launch drive-backup
> (full) on all drives.
>
> In your patch the bitmap starts disabled. In my example bitmaps are
> always enabled unless they have a successor.

No, they don't:

    bitmap->disabled = false;

> Incremental backup
> ------------------
> Use transaction to launch drive-backup (bitmap mode) on all drives.
>
> If backup completes successfully we'll be able to run the same command
> again for the next incremental backup.
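For the record, the "initial full backup" workflow above looks roughly
like this over QMP. This is only a sketch: the action and field names
("block-dirty-bitmap-add", the "node"/"name" arguments, and the
hypothetical device name "drive0" and target filename) are assumptions
based on this series' QAPI and may not match the exact spelling in v13:

```
-> { "execute": "transaction",
     "arguments": {
       "actions": [
         { "type": "block-dirty-bitmap-add",
           "data": { "node": "drive0", "name": "bitmap0" } },
         { "type": "drive-backup",
           "data": { "device": "drive0",
                     "target": "full_0.qcow2",
                     "format": "qcow2",
                     "sync": "full" } }
       ]
     }
   }
<- { "return": {} }
```

Because both actions are in one transaction, the bitmap's creation and
the full backup share a single point in time, so the bitmap records
exactly the writes that occur after the full backup's snapshot point.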
>
> If backup fails I see a problem when taking consistent incremental
> backups across multiple drives:
>
> Imagine 2 drives (A and B). Transaction is used to launch drive-backup
> on both with their respective dirty bitmaps. drive-backup A fails and
> merges successor dirty bits so nothing is lost, but what do we do with
> drive-backup B?
>
> drive-backup B could be in-progress or it could have completed before
> drive-backup A failed. Now we're in trouble because we need to take
> consistent snapshots across all drives but we've thrown away the dirty
> bitmap for drive B!

Just re-run the transaction. If one failed but one succeeded, just run
it again. The one that succeeded before will now have a trivial
incremental backup, and the one that failed will have a new valid
incremental backup. The two new incrementals will be point-in-time
consistent. E.g.:

[full_a] <-- [incremental_a_0: FAILED]
[full_b] <-- [incremental_b_0: SUCCESS]

then re-run:

[full_a] <------------------------ [incremental_a_1]
[full_b] <-- [incremental_b_0] <-- [incremental_b_1]

If the extra image in the chain for the one that succeeded is
problematic, you can use other tools to consolidate them later.

Either way, how do you propose getting a consistent snapshot across
multiple drives after a failure? The only recovery option is to create
a new snapshot *later*; you can never go back to what was, just as you
can't for single drives.

libvirt can tell which portions of the transaction ultimately succeeded
by the BLOCK_JOB_COMPLETED events that it receives back, one per drive.
For drives that complete successfully, it can make a mental note that
it has a new proper incremental. For drives that do not, it can make a
note that it needs to try for a new consistent drives-wide incremental,
erasing the half-baked attempt that got created last time.

I am not convinced there is a problem.
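The "re-run the transaction" recovery is literally the same QMP command
issued again. A sketch, with hypothetical device names, bitmap names,
and target filenames; the exact sync-mode string and field names are
assumptions and may differ from what v13 of the series actually spells
them:

```
-> { "execute": "transaction",
     "arguments": {
       "actions": [
         { "type": "drive-backup",
           "data": { "device": "drive_a", "bitmap": "bitmap0",
                     "sync": "incremental",
                     "target": "incremental_a_1.qcow2",
                     "format": "qcow2", "mode": "existing" } },
         { "type": "drive-backup",
           "data": { "device": "drive_b", "bitmap": "bitmap0",
                     "sync": "incremental",
                     "target": "incremental_b_1.qcow2",
                     "format": "qcow2", "mode": "existing" } }
       ]
     }
   }
<- { "return": {} }
```

On failure, a drive's successor bits are merged back into its bitmap,
so the dirty information survives; on success, the bitmap is cleared as
of the transaction's point in time. That is why a simple re-run yields
two incrementals that are consistent with each other.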
Since we don't want filename management (&c) as part of this solution
inside of QEMU, there is nothing left to do except within libvirt.

> Stopping incremental backup
> ---------------------------
> Use transaction to remove bitmaps on all drives.

This case is easy.

> Finally, what about bdrv_truncate()? When the disk size changes the
> dirty bitmaps keep the old size. Seems likely to cause problems. Have
> you thought about this case?

Only just recently, when reviewing Vladimir's bitmap persistence
patches, actually.

> Stefan