From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46547)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Un16l-0001rk-Ar for qemu-devel@nongnu.org;
    Thu, 13 Jun 2013 02:34:10 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1Un16i-0002Ts-4I for qemu-devel@nongnu.org;
    Thu, 13 Jun 2013 02:34:07 -0400
Received: from mx1.redhat.com ([209.132.183.28]:23522)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Un16h-0002Tn-Su for qemu-devel@nongnu.org;
    Thu, 13 Jun 2013 02:34:04 -0400
Date: Thu, 13 Jun 2013 14:33:40 +0800
From: Fam Zheng
Message-ID: <20130613063340.GA16044@localhost.nay.redhat.com>
References: <1369917299-5725-1-git-send-email-stefanha@redhat.com>
    <1369917299-5725-4-git-send-email-stefanha@redhat.com>
    <20130606035618.GA24375@localhost.nay.redhat.com>
    <20130606080513.GA13466@stefanha-thinkpad.redhat.com>
    <20130606085649.GA15648@localhost.nay.redhat.com>
    <20130607071812.GA16953@stefanha-thinkpad.redhat.com>
    <51B960BA.4050801@linux.vnet.ibm.com>
    <51B961A4.1060608@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <51B961A4.1060608@linux.vnet.ibm.com>
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
To: Wenchao Xia
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel@nongnu.org, imain@redhat.com,
    Stefan Hajnoczi, Paolo Bonzini, dietmar@proxmox.com

On Thu, 06/13 14:07, Wenchao Xia wrote:
> On 2013-6-13 14:03, Wenchao Xia wrote:
> >On 2013-6-7 15:18, Stefan Hajnoczi wrote:
> >>On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
> >>>On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
> >>>>On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
> >>>>>On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
> >>>>>>+
> >>>>>>+static int coroutine_fn backup_before_write_notify(
> >>>>>>+        NotifierWithReturn *notifier,
> >>>>>>+        void *opaque)
> >>>>>>+{
> >>>>>>+    BdrvTrackedRequest *req = opaque;
> >>>>>>+
> >>>>>>+    return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
> >>>>>>+}
> >>>>>
> >>>>>I'm wondering if we can see the logic here with a backing hd
> >>>>>relationship? req->bs is a backing file of job->target, but guest is
> >>>>>going to write to it, so we need to COW down the data to job->target
> >>>>>before overwritting (i.e. cluster is not allocated in child).
> >>>>>
> >>>>>I think if we do this in block layer, there's not much necessity for a
> >>>>>before-write notifier here (although it may be useful for other
> >>>>>cases):
> >>>>>
> >>>>>    in bdrv_write:
> >>>>>        for child in req->bs->open_children
> >>>>>            if not child->is_allocated(req->sectors)
> >>>>>                do COW to child
> >>>>>
> >>>>>The advantage of this is that we won't need to start block-backup
> >>>>>job in sync mode "none" to do point-in-time snapshot (image
> >>>>>fleecing), and we get writable snapshot (possibility to open backing
> >>>>>file writable and write to it safely) as a by-product.
> >>>>>
> >>>>>But we will need to keep track of parent<->child of block states,
> >>>>>and we still need to take care of overlapping writing between block
> >>>>>job and guest request.
> >>>>
> >>>>There's one catch here: bs->target may not support backing files, it
> >>>>can be a raw file, for example.  We'll only use backing files for
> >>>>point-in-time snapshots but other use cases might not.  raw doesn't
> >>>>really implement is_allocated(), so the whole concept would have to
> >>>>change a little:
> >>>
> >>>Another use case may be parent modification.
> >>>Suppose we have
> >>>
> >>>               ,--- child1.qcow2
> >>>  parent.qcow2 <
> >>>               `--- child2.qcow2
> >>>
> >>>We can use parent.qcow2 as block device in QEMU without breaking
> >>>child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
> >>>
> >>>  $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
> >>>
> >>>Then we open the three images and setup parent_bs->open_children, the
> >>>children are protected from being corrupted.
> >>>
> >>>>
> >>>>bs->open_children becomes independent of backing files - any
> >>>>BlockDriverState can be added to this list.  ->is_allocated() basically
> >>>>becomes the bitmap that we keep in the block job.
> >>>
> >>>Yes. But it is possible to keep a bitmap for raw (and those don't
> >>>implement is_allocated()) in block layer too, or in overlay: could
> >>>add-cow by Dongxu Wang help here?
> >>
> >>Yes absolutely.
> >>
> >>Stefan
> >>
> >   One advantage of external backup, or backing up chain, is that it
> >holds 'Delta' data only and is small enough. If it is changed toward a
> >'full' data writable snapshot, it become bigger. With backup chain
> >qemu-img can restore/clone a writable and usable one, So I don't
> >think adding that in qemu emulator helps much, and it will make things
> >more complicit.... user won't care who is doing the job, qemu or
> >qemu-img.
> >
>   I mean that "get writable snapshot (possibility to open backing file
> writable and write to it safely) as a by-product." in this series, is
> not very valuable.
>

I'm not selling writable snapshots. My point was just that the semantics
of block-backup, getting a point-in-time snapshot, inherently work like
a backing chain, but one where writing to the parent (the guest drive)
will not break its children (our thin PIT snapshot). If we see it this
way, COW is not so specific to a block job like block-backup; it can be
generic in the backing chain logic.
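
To make the idea concrete, here is a rough sketch in C of what a generic
copy-before-write in the backing chain could look like. It is a toy
in-memory model, not QEMU code: ToyImage, toy_add_child(), toy_read()
and toy_write() are hypothetical names, one byte stands for a whole
cluster, and the allocated[] array plays the role of is_allocated() (or
of the bitmap the block job keeps). It also ignores the overlapping
request problem between block job and guest mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

#define NB_CLUSTERS  8
#define MAX_CHILDREN 4

/* Hypothetical toy image: one byte of data per cluster. */
typedef struct ToyImage ToyImage;
struct ToyImage {
    unsigned char data[NB_CLUSTERS];
    bool allocated[NB_CLUSTERS];            /* stands in for is_allocated() */
    ToyImage *backing;                      /* parent image, or NULL */
    ToyImage *open_children[MAX_CHILDREN];  /* images that use us as backing */
    int nb_children;
};

/* Register child as an open child of parent. */
static void toy_add_child(ToyImage *parent, ToyImage *child)
{
    assert(parent->nb_children < MAX_CHILDREN);
    child->backing = parent;
    parent->open_children[parent->nb_children++] = child;
}

/* Read a cluster, falling through to the backing image when unallocated. */
static unsigned char toy_read(const ToyImage *img, int cluster)
{
    assert(cluster >= 0 && cluster < NB_CLUSTERS);
    while (img->backing && !img->allocated[cluster]) {
        img = img->backing;
    }
    return img->data[cluster];
}

/*
 * Write a cluster. Before overwriting, copy the old contents down to
 * every open child that has not allocated this cluster yet, so that the
 * children keep seeing the data they saw before the write.
 */
static void toy_write(ToyImage *img, int cluster, unsigned char value)
{
    assert(cluster >= 0 && cluster < NB_CLUSTERS);
    unsigned char old = toy_read(img, cluster);

    for (int i = 0; i < img->nb_children; i++) {
        ToyImage *child = img->open_children[i];
        if (!child->allocated[cluster]) {
            child->data[cluster] = old;     /* do COW to child */
            child->allocated[cluster] = true;
        }
    }
    img->data[cluster] = value;
    img->allocated[cluster] = true;
}
```

With a parent and one child registered through toy_add_child(), writing
0x22 over a parent cluster that held 0x11 leaves the child still reading
0x11, and only that one cluster becomes allocated in the child, so the
snapshot stays thin.
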
Though, the value in a writable snapshot is that we can actually
_modify_ a backing image in place, rather than forking the chain to
write to the new child. This is not supported with qemu or qemu-img now:
once you create a child with the image as backing file, you mustn't
modify it.

-- 
Fam