qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "fam@euphon.net" <fam@euphon.net>,
	"kwolf@redhat.com" <kwolf@redhat.com>,
	Denis Lunev <den@virtuozzo.com>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	"jsnow@redhat.com" <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v8 4/7] block: introduce backup-top filter driver
Date: Fri, 14 Jun 2019 22:03:20 +0200	[thread overview]
Message-ID: <46f18bb5-f123-b20a-7cb9-97caedae8290@redhat.com> (raw)
In-Reply-To: <46ad242a-3aa5-bfa1-1d64-5f2e98f4ed32@virtuozzo.com>


[-- Attachment #1.1: Type: text/plain, Size: 6902 bytes --]

On 14.06.19 18:22, Vladimir Sementsov-Ogievskiy wrote:
> 14.06.2019 15:57, Max Reitz wrote:
>> On 14.06.19 11:04, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.06.2019 18:57, Max Reitz wrote:
>>>> On 29.05.19 17:46, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Backup-top filter does copy-before-write operation. It should be
>>>>> inserted above active disk and has a target node for CBW, like the
>>>>> following:
>>>>>
>>>>>       +-------+
>>>>>       | Guest |
>>>>>       +-------+
>>>>>           |r,w
>>>>>           v
>>>>>       +------------+  target   +---------------+
>>>>>       | backup_top |---------->| target(qcow2) |
>>>>>       +------------+   CBW     +---------------+
>>>>>           |
>>>>> backing |r,w
>>>>>           v
>>>>>       +-------------+
>>>>>       | Active disk |
>>>>>       +-------------+
>>>>>
>>>>> The driver will be used in backup instead of write-notifiers.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>> ---
>>>>>    block/backup-top.h  |  64 +++++++++
>>>>>    block/backup-top.c  | 322 ++++++++++++++++++++++++++++++++++++++++++++
>>>>>    block/Makefile.objs |   2 +
>>>>>    3 files changed, 388 insertions(+)
>>>>>    create mode 100644 block/backup-top.h
>>>>>    create mode 100644 block/backup-top.c
>>>>>
>>>>> diff --git a/block/backup-top.h b/block/backup-top.h
>>>>> new file mode 100644
>>>>> index 0000000000..788e18c358
>>>>> --- /dev/null
>>>>> +++ b/block/backup-top.h
>>
>> [...]
>>
>>>>> +/*
>>>>> + * bdrv_backup_top_append
>>>>> + *
>>>>> + * Append backup_top filter node above @source node. @target node will receive
>>>>> + * the data backed up during CBE operations. New filter together with @target
>>>>> + * node are attached to @source aio context.
>>>>> + *
>>>>> + * The resulting filter node is implicit.
>>>>
>>>> Why?  It’s just as easy for the caller to just make it implicit if it
>>>> should be.  (And usually the caller should decide that.)
>>>
>>> Personally, I don't know what are the reasons for filters to bi implicit or not,
>>> I just made it like other job-filters.. I can move making-implicit to the caller
>>> or drop it at all (if it will work).
>>
>> Nodes are implicit if they haven’t been added consciously by the user.
>> A node added by a block job can be non-implicit, too, as mirror shows;
>> If the user specifies the filter-node-name option, they will know about
>> the node, thus it is no longer implicit.
>>
>> If the user doesn’t know about the node (they didn’t give the
>> filter-node-name option), the node is implicit.
>>
> 
> Ok, I understand it. But it doesn't show, why it should be implicit?
> Isn't it less buggy to make all filters explicit? We don't hide implicit nodes
> from query-named-block-nodes (the only interface to explore the whole graph for now)
> anyway. And we can't absolutely hide side effects of additional node in the graph.

Well, we try, at least.  At least we hide them from query-block.

> So, is there any real benefit of supporting separation into implicit and explicit filters?
> It seems for me that it only complicates things...
> In other words, what will break if we make all filters explicit?

The theory is that qemu may decide to add nodes at any point, but at
least when managing chains etc., they may not be visible to the user.  I
don’t think we can get rid of them so easily.

One example that isn’t implemented yet is copy-on-read.  In theory,
specifying copy-on-read=on for -drive should create an implicit COR node
on top.  The user shouldn’t see that node when inspecting the drive or
when issuing block jobs on it, etc.  And this node has to stay there
when the user does e.g. an active commit somewhere down the chain.

That sounds like a horrible ordeal to implement, so it hasn’t been done
yet.  Maybe it never will.  It isn’t that bad for the job filters,
because they generally freeze the block graph, so there is no problem
with potential modifications.

All in all I do think having implicit nodes makes sense.  Maybe not so
much now, but in the future (if someone implements converting -drive COR
and throttle options to implicit nodes...).

>> [...]
>>
>>>>> +static int coroutine_fn backup_top_co_flush(BlockDriverState *bs)
>>>>> +{
>>>>> +    if (!bs->backing) {
>>>>> +        return 0;
>>>>> +    }
>>>>> +
>>>>> +    return bdrv_co_flush(bs->backing->bs);
>>>>
>>>> Should we flush the target, too?
>>>
>>> Hm, you've asked it already, on previous version :)
>>
>> I wasn’t sure...
>>
>>> Backup don't do it,
>>> so I just keep old behavior. And what is the reason to flush backup target
>>> on any guest flush?
>>
>> Hm, well, what’s the reason not to do it?
> 
> guest flushes will be slowed down?

Hm, the user could specify cache=unsafe if they don’t care.  Which gives
me second thoughs... [1]

>> Also, there are not only guest flushes.  bdrv_flush_all() exists, which
>> is called when the guest is stopped.  So who is going to flush the
>> target if not its parent?
>>
>> [...]
> 
> Backup job? But filter should not relay on it..

Hm.  Backup job actually doesn’t sound that wrong.

> But should really filter do that job, or it is here only to do CBW? Maybe target
> must have another parent which will care about flushing.
> 
> Ok, I think I'll flush target here too for safety, and leave a comment, that something
> smarter would be better.
> (or, if we decide to flush all children by default in generic code, I'll drop this handler)

[1] Thinking more about this, for normal backups the target file does
not reflect a valid disk state until the backup is done; just like for
qemu-img convert.  Just like qemu-img convert, there is therefore no
reason to flush the target until the job is done.

But there is also the other case of image fleecing.  In this case, the
target will have another parent, so bdrv_flush_all() will be done by
someone.  Still, it intuitively makes sense to me that in this case, the
backup-top filter should flush the target if the guest decides to flush
the device (so that the target stays consistent on disk).

backup-top currently cannot differentiate between the cases, but maybe
that is generally a useful hint to give to it?  (That is, whether the
target has a consistent state or not.)

[...]

>>>>> +        }
>>>>> +    } else {
>>>>> +        /* Source child */
>>>>> +        if (perm & BLK_PERM_WRITE) {
>>>>
>>>> Or WRITE_UNCHANGED, no?
>>>
>>> Why? We don't need doing CBW for unchanged write.
>>
>> But we will do it still, won’t we?
>>
>> (If an unchanging write comes in, this driver will handle it just like a
>> normal write, will it not?)
> 
> No, it will not, I check this flag in backup_top_co_pwritev

Oops.  My bad.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2019-06-14 20:09 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-29 15:46 [Qemu-devel] [PATCH v8 0/7] backup-top filter driver for backup Vladimir Sementsov-Ogievskiy
2019-05-29 15:46 ` [Qemu-devel] [PATCH v8 1/7] block: teach bdrv_debug_breakpoint skip filters with backing Vladimir Sementsov-Ogievskiy
2019-06-13 13:43   ` Max Reitz
2019-05-29 15:46 ` [Qemu-devel] [PATCH v8 2/7] block: swap operation order in bdrv_append Vladimir Sementsov-Ogievskiy
2019-06-13 13:45   ` Max Reitz
2019-06-13 14:02     ` Vladimir Sementsov-Ogievskiy
2019-05-29 15:46 ` [Qemu-devel] [PATCH v8 3/7] block: allow not one child for implicit node Vladimir Sementsov-Ogievskiy
2019-06-13 13:51   ` Max Reitz
2019-05-29 15:46 ` [Qemu-devel] [PATCH v8 4/7] block: introduce backup-top filter driver Vladimir Sementsov-Ogievskiy
2019-06-13 15:57   ` Max Reitz
2019-06-14  9:04     ` Vladimir Sementsov-Ogievskiy
2019-06-14 12:57       ` Max Reitz
2019-06-14 16:22         ` Vladimir Sementsov-Ogievskiy
2019-06-14 20:03           ` Max Reitz [this message]
2019-06-17 10:36             ` Vladimir Sementsov-Ogievskiy
2019-06-17 14:56               ` Max Reitz
2019-06-17 15:53                 ` Kevin Wolf
2019-06-17 16:01                   ` Max Reitz
2019-06-17 16:25                     ` Kevin Wolf
2019-06-18  7:19                       ` Vladimir Sementsov-Ogievskiy
2019-06-18  8:20                         ` Kevin Wolf
2019-06-18  8:29                           ` Vladimir Sementsov-Ogievskiy
2019-06-18  7:25                 ` Vladimir Sementsov-Ogievskiy
2019-05-29 15:46 ` [Qemu-devel] [PATCH v8 5/7] block/io: refactor wait_serialising_requests Vladimir Sementsov-Ogievskiy
2019-05-29 15:46 ` [Qemu-devel] [PATCH v8 6/7] block: add lock/unlock range functions Vladimir Sementsov-Ogievskiy
2019-06-13 16:31   ` Max Reitz
2019-05-29 15:46 ` [Qemu-devel] [PATCH v8 7/7] block/backup: use backup-top instead of write notifiers Vladimir Sementsov-Ogievskiy
2019-06-13 18:02   ` Max Reitz
2019-06-14  9:14     ` Vladimir Sementsov-Ogievskiy
2019-05-30 13:25 ` [Qemu-devel] [PATCH v8 0/7] backup-top filter driver for backup Vladimir Sementsov-Ogievskiy
2019-06-13 16:08 ` no-reply
2019-06-13 16:41 ` no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46f18bb5-f123-b20a-7cb9-97caedae8290@redhat.com \
    --to=mreitz@redhat.com \
    --cc=den@virtuozzo.com \
    --cc=fam@euphon.net \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).