qemu-devel.nongnu.org archive mirror
From: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
To: Stefan Hajnoczi <stefanha@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@gmail.com>,
	qemu-devel@nongnu.org, imain@redhat.com,
	Paolo Bonzini <pbonzini@redhat.com>,
	dietmar@proxmox.com, Fam Zheng <famz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
Date: Thu, 13 Jun 2013 16:02:03 +0800
Message-ID: <51B97C7B.4030708@linux.vnet.ibm.com>
In-Reply-To: <20130613063340.GA16044@localhost.nay.redhat.com>

On 2013-6-13 14:33, Fam Zheng wrote:
> On Thu, 06/13 14:07, Wenchao Xia wrote:
>> On 2013-6-13 14:03, Wenchao Xia wrote:
>>> On 2013-6-7 15:18, Stefan Hajnoczi wrote:
>>>> On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
>>>>> On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
>>>>>> On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
>>>>>>> On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
>>>>>>>> +
>>>>>>>> +static int coroutine_fn backup_before_write_notify(
>>>>>>>> +        NotifierWithReturn *notifier,
>>>>>>>> +        void *opaque)
>>>>>>>> +{
>>>>>>>> +    BdrvTrackedRequest *req = opaque;
>>>>>>>> +
>>>>>>>> +    return backup_do_cow(req->bs, req->sector_num,
>>>>>>>> +                         req->nb_sectors, NULL);
>>>>>>>> +}
>>>>>>>
>>>>>>> I'm wondering if we can see the logic here with a backing hd
>>>>>>> relationship?  req->bs is a backing file of job->target, but guest is
>>>>>>> going to write to it, so we need to COW down the data to job->target
>>>>>>> before overwriting (i.e. the cluster is not allocated in the child).
>>>>>>>
>>>>>>> I think if we do this in block layer, there's not much necessity for a
>>>>>>> before-write notifier here (although it may be useful for other
>>>>>>> cases):
>>>>>>>
>>>>>>>      in bdrv_write:
>>>>>>>      for child in req->bs->open_children
>>>>>>>          if not child->is_allocated(req->sectors)
>>>>>>>              do COW to child
>>>>>>>
>>>>>>> The advantage of this is that we won't need to start a block-backup
>>>>>>> job in sync mode "none" to do a point-in-time snapshot (image
>>>>>>> fleecing), and we get a writable snapshot (possibility to open the
>>>>>>> backing file writable and write to it safely) as a by-product.
>>>>>>>
>>>>>>> But we will need to keep track of the parent<->child relationship of
>>>>>>> block states, and we still need to take care of overlapping writes
>>>>>>> between the block job and guest requests.
>>>>>>
>>>>>> There's one catch here: bs->target may not support backing files, it
>>>>>> can be a raw file, for example.  We'll only use backing files for
>>>>>> point-in-time snapshots but other use cases might not.  raw doesn't
>>>>>> really implement is_allocated(), so the whole concept would have to
>>>>>> change a little:
>>>>>
>>>>> Another use case may be parent modification. Suppose we have
>>>>>
>>>>>                      ,--- child1.qcow2
>>>>>      parent.qcow2  <
>>>>>                      `--- child2.qcow2
>>>>>
>>>>> We can use parent.qcow2 as block device in QEMU without breaking
>>>>> child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
>>>>>
>>>>>    $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
>>>>>
>>>>> Then we open the three images and setup parent_bs->open_children, the
>>>>> children are protected from being corrupted.
>>>>>
>>>>>>
>>>>>> bs->open_children becomes independent of backing files - any
>>>>>> BlockDriverState can be added to this list.  ->is_allocated() basically
>>>>>> becomes the bitmap that we keep in the block job.
>>>>>
>>>>> Yes. But it is possible to keep a bitmap for raw (and those that
>>>>> don't implement is_allocated()) in the block layer too, or in an
>>>>> overlay: could add-cow by Dongxu Wang help here?
>>>>
>>>> Yes absolutely.
>>>>
>>>> Stefan
>>>>
>>>    One advantage of external backup, or a backup chain, is that it
>>> holds 'Delta' data only and is small enough. If it is changed toward a
>>> 'full' data writable snapshot, it becomes bigger. With a backup chain
>>> qemu-img can restore/clone a writable and usable one, so I don't
>>> think adding that in the qemu emulator helps much, and it will make
>>> things more complicated.... the user won't care who is doing the job,
>>> qemu or qemu-img.
>>>
>>    I mean that "get writable snapshot (possibility to open backing file
>> writable and write to it safely) as a by-product." in this series, is
>> not very valuable.
>>
>
> I'm not selling writable snapshot, my point was just that the semantics
> of block-backup, getting a point-in-time snapshot, inherently work like a
> backing chain, except that writing to the parent (guest drive) will not
> break its children (our thin PIT snapshot). If we see it this way, COW is
> not so specific to a block job like block-backup, it can be generic in
> the backing chain logic.
>
   OK, a similar thing happens in drive-mirror, if you treat it as a
backing chain. To do it, the following assumptions need to be removed:
1 The top *bs is the active one.
2 Guest write requests go only to the top *bs.
3 *bs->backing_hd does not care about *bs (also a hidden change: maybe
an image should remember its children as well as its parent).

   Actually it will change the chain relationship from a single top-down
direction into both directions. I think a separate series is needed to
do that.
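
   To make the three assumptions above concrete, here is a rough toy
model in C of a chain that is linked in both directions. It is only a
sketch, not QEMU code: ToyImage, toy_attach_child(), toy_read() and
toy_write() are hypothetical names, the per-cluster "allocated" array
stands in for is_allocated() or a bitmap kept by the block layer, and
the cluster count and size are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_CLUSTERS     8   /* clusters per toy image (arbitrary) */
#define TOY_CLUSTER_SIZE 4   /* bytes per cluster (arbitrary) */
#define TOY_MAX_CHILDREN 4

/* Hypothetical stand-in for a BlockDriverState; not a QEMU type. */
typedef struct ToyImage {
    unsigned char data[TOY_CLUSTERS][TOY_CLUSTER_SIZE];
    bool allocated[TOY_CLUSTERS];                /* plays the role of is_allocated() */
    struct ToyImage *backing;                    /* child -> parent (existing direction) */
    struct ToyImage *children[TOY_MAX_CHILDREN]; /* parent -> children (new direction) */
    size_t nb_children;
} ToyImage;

/* Link the chain in both directions. */
static void toy_attach_child(ToyImage *parent, ToyImage *child)
{
    assert(parent->nb_children < TOY_MAX_CHILDREN);
    child->backing = parent;
    parent->children[parent->nb_children++] = child;
}

/* Read through the chain: own cluster if allocated, else the backing image's. */
static const unsigned char *toy_read(const ToyImage *img, size_t cluster)
{
    assert(cluster < TOY_CLUSTERS);
    while (!img->allocated[cluster] && img->backing) {
        img = img->backing;
    }
    return img->data[cluster];
}

/*
 * Write to any image in the chain, not only the top one. Before the
 * cluster is overwritten, its current contents are copied into every
 * child that does not have the cluster allocated yet (COW to child).
 */
static void toy_write(ToyImage *img, size_t cluster, const unsigned char *buf)
{
    assert(cluster < TOY_CLUSTERS);
    for (size_t i = 0; i < img->nb_children; i++) {
        ToyImage *child = img->children[i];
        if (!child->allocated[cluster]) {
            memcpy(child->data[cluster], toy_read(img, cluster),
                   TOY_CLUSTER_SIZE);
            child->allocated[cluster] = true;
        }
    }
    memcpy(img->data[cluster], buf, TOY_CLUSTER_SIZE);
    img->allocated[cluster] = true;
}
```

   In this model a guest write to the parent leaves a thin child showing
the old data, and only the overwritten clusters get allocated in the
child. Overlapping requests and error handling are left out entirely.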

> Though, the value in a writable snapshot is that we can actually
> _modify_ a backing image in place, rather than forking the chain to

   I can't see a good use case that requires modifying a backing image in
place; to me a snapshot means a read-only one with a time stamp in the past.
Personally I think forbidding modification of a snapshot is correct:
in the design there should be pre-defined concepts to avoid
chaos in the architecture.

> write to the new child. This is not supported with qemu or qemu-img now,
> once you create a child with the image as backing file, you mustn't
> modify it.
>




-- 
Best Regards

Wenchao Xia
