From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: Kevin Wolf <kwolf@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH v4 02/11] block: Filtered children access functions
Date: Wed, 17 Apr 2019 18:22:05 +0200 [thread overview]
Message-ID: <f7fa24d8-7608-ecc8-475c-42e3d58ba272@redhat.com> (raw)
In-Reply-To: <5a335159-8a98-5c60-657d-920e1eb81065@virtuozzo.com>
[-- Attachment #1: Type: text/plain, Size: 11465 bytes --]
On 16.04.19 12:02, Vladimir Sementsov-Ogievskiy wrote:
> 10.04.2019 23:20, Max Reitz wrote:
>> What bs->file and bs->backing mean depends on the node. For filter
>> nodes, both signify a node that will eventually receive all R/W
>> accesses. For format nodes, bs->file contains metadata and data, and
>> bs->backing will not receive writes -- instead, writes are COWed to
>> bs->file. Usually.
>>
>> In any case, it is not trivial to guess what a child means exactly with
>> our currently limited form of expression. It is better to introduce
>> some functions that actually guarantee a meaning:
>>
>> - bdrv_filtered_cow_child() will return the child that receives requests
>> filtered through COW. That is, reads may or may not be forwarded
>> (depending on the overlay's allocation status), but writes never go to
>> this child.
>>
>> - bdrv_filtered_rw_child() will return the child that receives requests
>> filtered through some very plain process. Reads and writes issued to
>> the parent will go to the child as well (although timing, etc. may be
>> modified).
>>
>> - All drivers but quorum (but quorum is pretty opaque to the general
>> block layer anyway) always only have one of these children: All read
>> requests must be served from the filtered_rw_child (if it exists), so
>> if there was a filtered_cow_child in addition, it would not receive
>> any requests at all.
>> (The closest here is mirror, where all requests are passed on to the
>> source, but with write-blocking, write requests are "COWed" to the
>> target. But that just means that the target is a special child that
>> cannot be introspected by the generic block layer functions, and that
>> source is a filtered_rw_child.)
>> Therefore, we can also add bdrv_filtered_child() which returns that
>> one child (or NULL, if there is no filtered child).
>>
>> Also, many places in the current block layer should be skipping filters
>> (all filters or just the ones added implicitly, it depends) when going
>> through a block node chain. They do not do that currently, but this
>> patch makes them.
>>
>> One example for this is qemu-img map, which should skip filters and only
>> look at the COW elements in the graph. The change to iotest 204's
>> reference output shows how using blkdebug on top of a COW node used to
>> make qemu-img map disregard the rest of the backing chain, but with this
>> patch, the allocation in the base image is reported correctly.
>>
>> Furthermore, a note should be made that sometimes we do want to access
>> bs->backing directly. This is whenever the operation in question is not
>> about accessing the COW child, but the "backing" child, be it COW or
>> not. This is the case in functions such as bdrv_open_backing_file() or
>> whenever we have to deal with the special behavior of @backing as a
>> blockdev option, which is that it does not default to null like all
>> other child references do.
>>
>> Finally, the query functions (query-block and query-named-block-nodes)
>> are modified to return any filtered child under "backing", not just
>> bs->backing or COW children. This is so that filters do not interrupt
>> the reported backing chain. This changes the output of iotest 184, as
>> the throttled node now appears as a backing child.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>> qapi/block-core.json | 4 +
>> include/block/block.h | 1 +
>> include/block/block_int.h | 40 +++++--
>> block.c | 210 +++++++++++++++++++++++++++------
>> block/backup.c | 8 +-
>> block/block-backend.c | 16 ++-
>> block/commit.c | 33 +++---
>> block/io.c | 45 ++++---
>> block/mirror.c | 21 ++--
>> block/qapi.c | 30 +++--
>> block/stream.c | 13 +-
>> blockdev.c | 88 +++++++++++---
>> migration/block-dirty-bitmap.c | 4 +-
>> nbd/server.c | 6 +-
>> qemu-img.c | 29 ++---
>> tests/qemu-iotests/184.out | 7 +-
>> tests/qemu-iotests/204.out | 1 +
>> 17 files changed, 411 insertions(+), 145 deletions(-)
>
> really huge... didn't you consider conversion file-by-file?
Frankly, no, I just didn’t consider it.
Hm. I don’t know, 30-patch series always look so frightening.
>> diff --git a/block.c b/block.c
>> index 16615bc876..e8f6febda0 100644
>> --- a/block.c
>> +++ b/block.c
>
> [..]
>
>>
>> @@ -3467,14 +3469,17 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
>> /*
>> * Find the "actual" backing file by skipping all links that point
>> * to an implicit node, if any (e.g. a commit filter node).
>> + * We cannot use any of the bdrv_skip_*() functions here because
>> + * those return the first explicit node, while we are looking for
>> + * its overlay here.
>> */
>> overlay_bs = bs;
>> - while (backing_bs(overlay_bs) && backing_bs(overlay_bs)->implicit) {
>> - overlay_bs = backing_bs(overlay_bs);
>> + while (overlay_bs->backing && bdrv_filtered_bs(overlay_bs)->implicit) {
>
> So, you don't want to skip implicit filters with 'file' child? Then, why not to use
> child_bs(overlay_bs->backing), like in following if condition?
I think it was an artifact of writing the patch. I started with
bdrv_filtered_bs() and then realized this depends on ->backing,
actually. There was no functional difference so I left it as it was.
But you’re right, it is more clear to use child_bs(overlay_bs->backing)
isntead.
> Could we instead make backing-based filters equal to file-based, to make it possible
> to use file-based filters in backing-chain related scenarios (like upcoming copy-on-read
> filter for stream)? So, to expand backing-chain concept to include filters with file child?
If I understand you correctly, that’s basically the purpose of this
series and especially this patch here. As far as it is possible and
reasonable, I want filters that use bs->backing and bs->file behave the
same.
However, there are cases where this is not possible and
bdrv_reopen_parse_backing() is one such case. bs->backing and bs->file
correspond to QAPI names, namely 'backing' and 'file'. If that
distinction was already visible to the user, we cannot change it now.
We definitely cannot make file-based filters use bs->backing now because
you can create them over QAPI and they use 'file' as their child name.
Can we make backing-based filters use bs->file? Seems more likely,
because all of them are implicit nodes, so the user usually doesn’t see
them. But usually isn’t always; they do become user-visible once the
user specifies a node-name for mirror or commit.
I found it more reasonable to introduce new functions that explicitly
express what kind of child they expect and then apply them everywhere as
I saw fit, instead of making the mirror/commit filter drivers use
bs->file and hope it works; not least because I’d still have to go
through the whole block layer and check every instance of bs->backing to
see whether it really needs bs->backing or whether it should use either
of bs->backing or bs->file.
>> + overlay_bs = bdrv_filtered_bs(overlay_bs);
>> }
>>
>> /* If we want to replace the backing file we need some extra checks */
>> - if (new_backing_bs != backing_bs(overlay_bs)) {
>> + if (new_backing_bs != child_bs(overlay_bs->backing)) { > /* Check for implicit nodes between bs and its backing file */
>> if (bs != overlay_bs) {
>> error_setg(errp, "Cannot change backing link if '%s' has "
>
> [..]
>
>> @@ -4203,8 +4208,8 @@ int bdrv_change_backing_file(BlockDriverState *bs,
>> BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
>> BlockDriverState *bs)
>> {
>> - while (active && bs != backing_bs(active)) {
>> - active = backing_bs(active);
>> + while (active && bs != bdrv_filtered_bs(active)) {
>
> hmm and here you actually support backing-chain with file-child-based filters in it..
Yes, because this is not about the QAPI 'backing' link. This function
should continue to work even if there are filters in the backing chain.
>> + active = bdrv_filtered_bs(active);
>> }
>>
>> return active;
>> @@ -4226,11 +4231,11 @@ bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
>> {
>> BlockDriverState *i;
>>
>> - for (i = bs; i != base; i = backing_bs(i)) {
>> + for (i = bs; i != base; i = child_bs(i->backing)) {
>
> and here don't..
Yes, because this function is about the QAPI 'backing' link.
>> if (i->backing && i->backing->frozen) {
>> error_setg(errp, "Cannot change '%s' link from '%s' to '%s'",
>> i->backing->name, i->node_name,
>> - backing_bs(i)->node_name);
>> + i->backing->bs->node_name);
>> return true;
>> }
>> }
>
> [..]
>
>> +static BlockDriverState *bdrv_skip_filters(BlockDriverState *bs,
>> + bool stop_on_explicit_filter)
>> +{
>> + BdrvChild *filtered;
>> +
>> + if (!bs) {
>> + return NULL;
>> + }
>> +
>> + while (!(stop_on_explicit_filter && !bs->implicit)) {
>
> you may save some characters and extra operators by
>
> bool skip_explicit
> ...
> while (skip_explicit || bs->implicit) {
But is it really simpler?
>> + filtered = bdrv_filtered_rw_child(bs);
>> + if (!filtered) {
>> + break;
>> + }
>> + bs = filtered->bs;
>> + }
>> + /*
>> + * Note that this treats nodes with bs->drv == NULL as not being
>> + * R/W filters (bs->drv == NULL should be replaced by something
>> + * else anyway).
>> + * The advantage of this behavior is that this function will thus
>> + * always return a non-NULL value (given a non-NULL @bs).
>> + */
>> +
>> + return bs;
>> +}
>> +
>> +/*
>> + * Return the first BDS that has not been added implicitly or that
>> + * does not have an RW-filtered child down the chain starting from @bs
>> + * (including @bs itself).
>> + */
>> +BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs)
>> +{
>> + return bdrv_skip_filters(bs, true);
>> +}
>> +
>> +/*
>> + * Return the first BDS that does not have an RW-filtered child down
>> + * the chain starting from @bs (including @bs itself).
>> + */
>> +BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs)
>> +{
>> + return bdrv_skip_filters(bs, false);
>> +}
>> +
>> +/*
>> + * For a backing chain, return the first non-filter backing image.
>
> or second, if we start from filter
Hm, in a sense. Maybe:
> For a backing chain, return the first non-filter backing image of the
> first non-filter image.
?
>> + */
>> +BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs)
>> +{
>> + return bdrv_skip_rw_filters(bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs)));
>> +}
>
>
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2019-04-17 16:22 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-10 20:20 [Qemu-devel] [PATCH v4 00/11] block: Deal with filters Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 01/11] block: Mark commit and mirror as filter drivers Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 02/11] block: Filtered children access functions Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-16 10:02 ` Vladimir Sementsov-Ogievskiy
2019-04-16 10:02 ` Vladimir Sementsov-Ogievskiy
2019-04-17 16:22 ` Max Reitz [this message]
2019-04-17 16:22 ` Max Reitz
2019-04-18 8:36 ` Vladimir Sementsov-Ogievskiy
2019-04-18 8:36 ` Vladimir Sementsov-Ogievskiy
2019-04-24 15:23 ` Max Reitz
2019-04-24 15:23 ` Max Reitz
2019-04-19 10:23 ` Vladimir Sementsov-Ogievskiy
2019-04-19 10:23 ` Vladimir Sementsov-Ogievskiy
2019-04-24 16:36 ` Max Reitz
2019-04-24 16:36 ` Max Reitz
2019-05-07 9:32 ` Vladimir Sementsov-Ogievskiy
2019-05-07 13:15 ` Max Reitz
2019-05-07 13:33 ` Vladimir Sementsov-Ogievskiy
2019-05-31 16:26 ` Max Reitz
2019-05-31 17:02 ` Max Reitz
2019-05-07 13:30 ` Vladimir Sementsov-Ogievskiy
2019-05-07 15:13 ` Max Reitz
2019-05-17 11:50 ` Vladimir Sementsov-Ogievskiy
2019-05-23 14:49 ` Max Reitz
2019-05-23 15:08 ` Vladimir Sementsov-Ogievskiy
2019-05-23 15:56 ` Max Reitz
2019-05-17 14:50 ` Vladimir Sementsov-Ogievskiy
2019-05-23 17:27 ` Max Reitz
2019-05-24 8:12 ` Vladimir Sementsov-Ogievskiy
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 03/11] block: Storage child access function Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-05-20 10:41 ` Vladimir Sementsov-Ogievskiy
2019-05-28 18:09 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 04/11] block: Inline bdrv_co_block_status_from_*() Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-05-21 8:57 ` Vladimir Sementsov-Ogievskiy
2019-05-28 17:58 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 05/11] block: Fix check_to_replace_node() Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 06/11] iotests: Add tests for mirror @replaces loops Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 07/11] block: Leave BDS.backing_file constant Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 08/11] iotests: Add filter commit test cases Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 09/11] iotests: Add filter mirror " Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 10/11] iotests: Add test for commit in sub directory Max Reitz
2019-04-10 20:20 ` Max Reitz
2019-04-10 20:20 ` [Qemu-devel] [PATCH v4 11/11] iotests: Test committing to overridden backing Max Reitz
2019-04-10 20:20 ` Max Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f7fa24d8-7608-ecc8-475c-42e3d58ba272@redhat.com \
--to=mreitz@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).