From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <546B6EFE.5000108@redhat.com>
Date: Tue, 18 Nov 2014 11:08:30 -0500
From: John Snow
References: <545CB9CE.9000302@parallels.com>
 <20141108071919.GB4940@fam-t430.nay.redhat.com>
 <54607427.8040404@parallels.com> <5462327C.5080704@redhat.com>
 <5464B80E.6060201@parallels.com> <546A88DD.10006@redhat.com>
 <546B254D.2020808@parallels.com> <546B44F4.9020301@parallels.com>
In-Reply-To: <546B44F4.9020301@parallels.com>
Subject: Re: [Qemu-devel] [PATCH v6 00/10] block: Incremental backup series
To: Vladimir Sementsov-Ogievskiy, Fam Zheng
Cc: "Denis V. Lunev", stefanha@redhat.com, qemu-devel@nongnu.org

On 11/18/2014 08:09 AM, Vladimir Sementsov-Ogievskiy wrote:
>> (3) Data Integrity
>>
>> The dirty flag could work something like:
>>
>> - If, on first open, the file has the dirty flag set, we need to
>>   discard the bitmap data because we can no longer trust it.
>> - If the bitmap file is clean, proceed as normal, but take a lock
>>   against any of the bitmap functions to prevent them from marking any
>>   bits dirty.
>> - On first write to a clean persistent bitmap, delay the write until
>>   we can mark the bitmap as dirty first.
>>   This incurs a write penalty when we try to use the bitmap at first...
>> - Unlock the bitmap functions and allow them to mark blocks as needed.
>> - At some point, based on a sync policy, re-commit the dirty
>>   information to the file and mark the file as clean once more and
>>   re-take the persistence lock.
> Correct me if I'm wrong.
>
> #Read bitmap:
> read in blockdev_init, before any write to device, so no lock is needed.
>
> #Set bits in bitmap:
> if bitmap.dirty_flag:
>     set bits
> else:
>     LOCK
>     set bits
>     set bitmap.dirty_flag
>     set dirty_flag in bitmap file
>     UNLOCK
>
> #Sync:
> if not bitmap.dirty_flag:
>     skip sync
> else:
>     LOCK
>     save one of the bitmap levels (saving the last one takes too long
>     and is not a very good idea, because it is fast-updating)
>     unset dirty_flag in bitmap file
>     unset bitmap.dirty_flag
>     UNLOCK
>
> #Last sync in bdrv_close:
> Just save the last bitmap level and unset dirty_flag in bitmap file
>
> Also, I'm not quite sure about locking. As I understand it, coroutines
> in qemu do not run in parallel, so is locking required? Or will the
> sync timer not be coroutine-based?
>
> Best regards,
> Vladimir

I may have been too informal. I just meant a lock or barrier to prevent
actual IO throughput until we can confirm the dirty flag has been
adjusted to indicate that the persistent bitmap is now officially out of
date. Nothing fancy.

I wasn't trying to imply that we needed threading protection, just
"locking" the IO until we can configure the bitmap as we need it to be.

> On 18.11.2014 13:54, Vladimir Sementsov-Ogievskiy wrote:
>>
>>> (2) File Format
>>>
>>> Some standard file magic, which includes:
>>>
>>> - Some magic byte(s)
>>> - Dirty flag. Needed to tell if we can trust this data or not.
>>> - The size of the bitmap
>>> - The granularity of the bitmap
>>> - The offset to the first sector of bitmap data (Maybe? It can't hurt
>>>   if we give ourselves a sector's worth to write metadata within.)
>>> - Data starting at... PAGESIZE?
>> - The name of the bitmap, and also the size of this name
>>
>>>
>>> (5) Partial Persistence
>>>
>>> We did not discuss only saving higher levels of the bitmap. What's
>>> the primary benefit you're seeking?
>> Hmm. It may be used for faster sync. Maybe save some of the bitmap
>> levels on a timer while the VM is running, and save the last level on
>> shutdown?
>>
>> CC qemu-devel - ok.
>>
>> Best regards,
>> Vladimir
>>
>> On 18.11.2014 02:46, John Snow wrote:
>>>
>>>
>>> On 11/13/2014 08:54 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> Hi
>>>>
>>>> I'd just like to start working on a persistent dirty bitmap. My
>>>> thoughts about it are the following:
>>>> - qemu -drive file=file,dirty_bitmap=file
>>>>   so, the bitmap will be loaded with drive open and saved with drive
>>>>   close.
>>>> - save only the meaningful (the last) level of the bitmap, restore
>>>>   all levels on bitmap loading
>>>> - bool parameter "persistent" for bdrv_create_dirty_bitmap and
>>>>   BdrvDirtyBitmap
>>>> - internal dirty_bitmaps, saved in qcow2 file
>>>>
>>>> Best regards,
>>>> Vladimir
>>>
>>> I am thinking:
>>>
>>> (1) Command Lines
>>>
>>> If you enable dirty bitmaps and give it a file that doesn't exist, it
>>> should error out on you.
>>>
>>> If you enable dirty bitmaps and give it a file that's blank, it
>>> understands that it is to create a persistent bitmap file in this
>>> location and it should enable persistence.
>>>
>>> If a bitmap file is given and it has valid magic, this should imply
>>> persistence.
>>>
>>> I am hesitant to have it auto-create files that don't already exist
>>> in case the files become large in size and a misconfiguration leads
>>> to repeated creation of these files that get orphaned in random
>>> folders. Perhaps we can add a create=auto flag or similar to allow
>>> this behavior if wanted.
>>>
>>> (2) File Format
>>>
>>> Some standard file magic, which includes:
>>>
>>> - Some magic byte(s)
>>> - Dirty flag. Needed to tell if we can trust this data or not.
>>> - The size of the bitmap
>>> - The granularity of the bitmap
>>> - The offset to the first sector of bitmap data (Maybe? It can't hurt
>>>   if we give ourselves a sector's worth to write metadata within.)
>>> - Data starting at... PAGESIZE?
>>>
>>> (3) Data Integrity
>>>
>>> The dirty flag could work something like:
>>>
>>> - If, on first open, the file has the dirty flag set, we need to
>>>   discard the bitmap data because we can no longer trust it.
>>> - If the bitmap file is clean, proceed as normal, but take a lock
>>>   against any of the bitmap functions to prevent them from marking any
>>>   bits dirty.
>>> - On first write to a clean persistent bitmap, delay the write until
>>>   we can mark the bitmap as dirty first. This incurs a write penalty
>>>   when we try to use the bitmap at first...
>>> - Unlock the bitmap functions and allow them to mark blocks as needed.
>>> - At some point, based on a sync policy, re-commit the dirty
>>>   information to the file and mark the file as clean once more and
>>>   re-take the persistence lock.
>>>
>>> (4) Synchronization Policy
>>>
>>> - Sync after so many bits become dirty in the bitmap, either as an
>>>   absolute threshold or a density percentage?
>>> - Sync periodically on a fixed timer?
>>> - Sync periodically opportunistically when I/O utilization becomes
>>>   relatively low? (With some sort of starvation prevention timer?)
>>> - Sync only at shutdown?
>>>
>>> In discussing with Stefan, I think we rather liked the idea of a
>>> timer that tries to re-commit the block data during lulls in the I/O.
>>>
>>> (5) Partial Persistence
>>>
>>> We did not discuss only saving higher levels of the bitmap. What's
>>> the primary benefit you're seeking?
>>>
>>> (6) Inclusion as qcow2 Metadata
>>>
>>> And lastly, we did discuss the inclusion of the bitmap as qcow2
>>> metadata, but decided qcow2 wasn't our principal target for the
>>> format, to allow generality to other file formats.
>>> We didn't really discuss the idea of having it as an option or an
>>> extension, and I don't (off the top of my head) have any arguments
>>> against it, but I will likely not work on it myself.
>>>
>>>
>>> You didn't CC qemu-devel on this (so I won't!), but perhaps we should
>>> re-send out our ideas to the wider list for feedback before we
>>> proceed any further. Maybe we can split the work if we agree upon a
>>> design.
>>>
>>> Thanks!
>>> --js
>>>
>>> P.S.: I'm still cleaning up Fam's first patchset based on Max's and
>>> your feedback. Hope to have it out by the end of this week.
>>>
>>>> On 11.11.2014 18:59, John Snow wrote:
>>>>>
>>>>>
>>>>> On 11/10/2014 03:15 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> Hi Fam, hi Jorn.
>>>>>>
>>>>>> Jagane's project - http://wiki.qemu.org/Features/Livebackup
>>>>>>
>>>>>> In two words:
>>>>>> Normal delta - as in qemu: while backing up, we save all new
>>>>>> writes to a separate virtual disk, the delta. When the backup is
>>>>>> done, we can merge the delta back into the original image.
>>>>>> Reverse delta - while backing up, we don't stop writing to the
>>>>>> original image (and qemu works with it, not with the delta), but
>>>>>> before every write we copy the corresponding block (if the backup
>>>>>> needs it) to a separate virtual disk (the reverse delta). So, for
>>>>>> the backup, if the current block has not been rewritten, we take
>>>>>> it from the original file; otherwise we take it from the reverse
>>>>>> delta. The benefit is that, theoretically, we'll have half the
>>>>>> overhead I/O of "normal delta backup + merge", because part
>>>>>> (about 0.5) of the blocks that are rewritten during the backup
>>>>>> will already have been backed up before the rewrite, and so will
>>>>>> not go to the delta. Also, there should be methods to improve
>>>>>> this coefficient by choosing rules about which blocks to back up
>>>>>> first.
>>>>>>
>>>>>> Is anyone working on persistent (while the VM is turned off)
>>>>>> bitmaps now? If not, are there any recommendations?
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir
>>>>>>
>>>>>
>>>>> Hi Vladimir:
>>>>>
>>>>> I recently inherited Fam's backup series and I intended to start
>>>>> reviewing the comments this week. After I sift through that
>>>>> situation, the next step for me is persistent dirty bitmap support.
>>>>> I had been working on AHCI and IDE an awful lot lately, so it will
>>>>> take me a little bit to switch over.
>>>>>
>>>>> I don't have any existing notions of how I intend to implement it,
>>>>> so if you'd like to discuss any concerns you have about that
>>>>> subsystem, we can talk :)
>>>>>
>>>>> Thanks,
>>>>> --John
>>>>>
>>>>>> On 08.11.2014 10:19, Fam Zheng wrote:
>>>>>>> On Fri, 11/07 15:23, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> Glad to see you working on the backup series again. Some time
>>>>>>>> ago I fixed v4 to be mergeable, but unfortunately didn't publish
>>>>>>>> it. Actually, I need (and I'm going to develop) incremental
>>>>>>>> backup with reverse delta (like the abandoned project
>>>>>>>> http://wiki.qemu.org/Features/Livebackup by Jagane). And I've
>>>>>>>> already fixed Jagane's project to work with current Qemu. But
>>>>>>>> the code is bad. Can you give me a piece of advice about what I
>>>>>>>> should pay attention to? I have a very superficial understanding
>>>>>>>> of qemu block devices, block jobs, etc.
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>> Thank you for reviewing my series.
>>>>>>>
>>>>>>> I haven't seen Jagane's series, so could you please explain what
>>>>>>> "reverse delta" is?
>>>>>>>
>>>>>>> John Snow (CC'ed) will take over my work on incremental backup,
>>>>>>> so I think you can coordinate.
>>>>>>> I'll probably count on John to work on your comments too; in the
>>>>>>> future I'll follow the progress and review.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Fam
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>
-- 
--js
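
For illustration only, here is a minimal Python sketch of the dirty-flag
protocol from "(3) Data Integrity" combined with the header fields listed
under "(2) File Format". Nothing in it is QEMU code or an agreed format: the
magic bytes, the field widths, the 4096-byte data offset, the class name
PersistentBitmap and the helper new_blank_file are all assumptions made up
for the example. It follows the ordering John describes (flag the file as
dirty on disk before the first bit changes) rather than the ordering in
Vladimir's pseudocode, and it interprets "discard the bitmap data" as
"treat every block as dirty", which is also an assumption.

```python
import os
import struct
import tempfile

# Hypothetical layout, not an actual QEMU format: magic bytes, dirty flag,
# bitmap size in bits, granularity in bytes, offset of the bitmap data.
HEADER = struct.Struct("<4sBQIQ")
MAGIC = b"DBMP"      # made-up magic
DATA_OFFSET = 4096   # assumed: reserve space for metadata before the data


class PersistentBitmap:
    """Toy model of a bitmap file guarded by an on-disk dirty flag."""

    def __init__(self, path, nbits, granularity=65536):
        self.path, self.nbits, self.granularity = path, nbits, granularity
        self.bits = bytearray((nbits + 7) // 8)
        if os.path.getsize(path) == 0:
            # Blank file: create a clean persistent bitmap in place.
            self.dirty_flag = True
            self.sync()
        else:
            self._load()

    def _load(self):
        with open(self.path, "rb") as f:
            magic, dirty, nbits, gran, offset = HEADER.unpack(
                f.read(HEADER.size))
            if magic != MAGIC or nbits != self.nbits:
                raise ValueError("not a matching bitmap file")
            self.granularity = gran
            if dirty:
                # Flag left set (e.g. after a crash): discard the stored
                # data. Assumption: treat every block as dirty instead.
                self.bits = bytearray(b"\xff" * len(self.bits))
            else:
                f.seek(offset)
                self.bits = bytearray(f.read(len(self.bits)))
        self.dirty_flag = bool(dirty)

    def _write_header(self, f, dirty):
        f.seek(0)
        f.write(HEADER.pack(MAGIC, int(dirty), self.nbits,
                            self.granularity, DATA_OFFSET))
        f.flush()
        os.fsync(f.fileno())

    def set_bit(self, i):
        if not self.dirty_flag:
            # First write to a clean bitmap: flag the file as dirty on
            # disk before any bit changes in memory.
            with open(self.path, "r+b") as f:
                self._write_header(f, dirty=True)
            self.dirty_flag = True
        self.bits[i // 8] |= 1 << (i % 8)

    def is_set(self, i):
        return bool(self.bits[i // 8] & (1 << (i % 8)))

    def sync(self):
        if not self.dirty_flag:
            return False                 # nothing to write back
        with open(self.path, "r+b") as f:
            f.seek(DATA_OFFSET)
            f.write(self.bits)           # bitmap data first ...
            f.flush()
            os.fsync(f.fileno())
            self._write_header(f, dirty=False)   # ... then mark clean
        self.dirty_flag = False
        return True


def new_blank_file():
    """Return the path of an empty temporary file (helper for the demo)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    return path
```

Opening a blank file creates a clean bitmap; set_bit followed by sync leaves
a file that reloads with the same bits, while skipping the sync (as a crash
would) leaves the flag set, so the next open distrusts the stored data. When
to call sync (timer, threshold, I/O lull, shutdown) is the policy question
from "(4) Synchronization Policy" and is left out of the sketch.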