From: Kevin Wolf <kwolf@redhat.com>
To: Max Reitz <mreitz@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [PATCH v3 03/33] block: Add BdrvChildRole and BdrvChildRoleBits
Date: Tue, 5 May 2020 14:54:13 +0200 [thread overview]
Message-ID: <20200505125413.GK5759@linux.fritz.box> (raw)
In-Reply-To: <aa2c60ac-0e41-bf3b-d81d-fe504c42c172@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 10098 bytes --]
Am 05.05.2020 um 13:59 hat Max Reitz geschrieben:
> On 05.05.20 13:19, Kevin Wolf wrote:
> > Am 18.02.2020 um 13:42 hat Max Reitz geschrieben:
> >> This mask will supplement BdrvChildClass when it comes to what role (or
> >> combination of roles) a child takes for its parent. It consists of
> >> BdrvChildRoleBits values (which is an enum).
> >>
> >> Because empty enums are not allowed, let us just start with it filled.
> >>
> >> Signed-off-by: Max Reitz <mreitz@redhat.com>
> >> ---
> >> include/block/block.h | 38 ++++++++++++++++++++++++++++++++++++++
> >> 1 file changed, 38 insertions(+)
> >>
> >> diff --git a/include/block/block.h b/include/block/block.h
> >> index fd89eb6c75..8c23948d08 100644
> >> --- a/include/block/block.h
> >> +++ b/include/block/block.h
> >> @@ -268,6 +268,44 @@ enum {
> >> DEFAULT_PERM_UNCHANGED = BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH,
> >> };
> >>
> >> +enum BdrvChildRoleBits {
> >> + /* Child stores data */
> >> + BDRV_CHILD_DATA = (1 << 0),
> >> +
> >> + /* Child stores metadata */
> >> + BDRV_CHILD_METADATA = (1 << 1),
> >> +
> >> + /*
> >> + * A child to which the parent forwards all reads and writes. It
> >
> > Is "_all_ reads and writes" really required? Imagine a caching block
> > driver, should it not be considered a filter because it may just
> > complete the requests from its cache rather than asking the child?
>
> Hm. The important thing is that parent and child always show exactly
> the same data, which is the second part of the definition given here.
> So maybe we should flip both sentences, e.g.:
>
> “A child which always presents exactly the same visibile data as the
> parent, e.g. by virtue of the parent forwarding all reads and writes.”
>
> ?
Turns it into an example, which is a good way of explaining things that
are commonly, but not universally true. I like it.
> >> + * must present exactly the same visible data as the parent.
> >> + * Any node may have at most one filtered child at a time.
> >> + */
> >> + BDRV_CHILD_FILTERED = (1 << 2),
> >> +
> >> + /*
> >> + * Child from which to read all data that isn’t allocated in the
> >> + * parent (i.e., the backing child); such data is copied to the
> >> + * parent through COW (and optionally COR).
> >> + */
> >> + BDRV_CHILD_COW = (1 << 3),
> >> +
> >> + /*
> >> + * The primary child. For most drivers, this is the child whose
> >> + * filename applies best to the parent node.
> >> + * Each parent must give this flag to no more than one child at a
> >> + * time.
> >> + */
> >> + BDRV_CHILD_PRIMARY = (1 << 4),
> >
> > And I assume some drivers like quorum don't set it on any child.
>
> I thought “no more than one” implies that.
Technically no, though it's probably the first assumption of most
people. It's just a suggestion for clarification, feel free to ignore
it.
> >> + /* Useful combination of flags */
> >> + BDRV_CHILD_IMAGE = BDRV_CHILD_DATA
> >> + | BDRV_CHILD_METADATA
> >> + | BDRV_CHILD_PRIMARY,
> >> +};
> >> +
> >> +/* Mask of BdrvChildRoleBits values */
> >> +typedef unsigned int BdrvChildRole;
> >> +
> >> char *bdrv_perm_names(uint64_t perm);
> >> uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm);
> >
> > The list intuitively makes sense to me. Let me try to think of some
> > interesting cases to see whether the documentation is complete or
> > whether it could be improved.
> >
> >
> > qcow2 is what everyone has in mind, so it should be obvious:
> >
> > * Without a data file:
> > * file: BDRV_CHILD_IMAGE
> > * backing: BDRV_CHILD_COW
> >
> > * With a data file:
> > * file: BDRV_CHILD_PRIMARY | BDRV_CHILD_METADATA
> > * data-file: BDRV_CHILD_DATA
> > * backing: BDRV_CHILD_COW
> >
> >
> > We can use VMDK to make things a bit more interesting:
> >
> > * file: BDRV_CHILD_PRIMARY | BDRV_CHILD_METADATA
> > * extents.*: BDRV_CHILD_METADATA | BDRV_CHILD_DATA
> > * backing: BDRV_CHILD_COW
> >
> > In other words, we can have multiple data and metadata children. Is this
> > correct or should extents not be marked as metadata? (Checked the final
> > code: yes we do have multiple of them in vmdk.) Should this be mentioned
> > in the documentation?
>
> If the extents contain metadata (I thought not, but I think I was just
> wrong; sparse extents must contain their respective mapping structures),
> then yes, they should all be marked as metadata children.
>
> I’m not sure whether that needs to be mentioned explicitly in the doc,
> because “Child stores metadata” seems sufficient to me.
When you're the author, the meaning of everything is clear to you. :-)
In case of doubt, I would be more explicit so that the comment gives a
clear guideline for which role to use in which scenario.
> > Do we then also want to allow multiple BDRV_CHILD_COW children? We don't
> > currently have a driver that needs it, but maybe it would be consistent
> > with DATA and METADATA then. However, it would contradict the
> > documentation that it's the "Child from which to read all data".
>
> Yes. I would revisit that problem when the need arises.
>
> It seems to me like this would open a whole can of worms, just like
> allowing multiple filtered children does.
Okay. Shall we document it explicitly like we do for the filter role?
> > blkverify:
> >
> > * x-image: BDRV_CHILD_PRIMARY | BDRV_CHILD_DATA | BDRV_CHILD_FILTERED
> > * x-raw: BDRV_CHILD_DATA | BDRV_CHILD_FILTERED
> >
> > Hm, according to the documentation, this doesn't work, FILTERED can be
> > set only for one node. But the condition ("the parent forwards all reads
> > and writes") applies to both children. I think the documentation should
> > mention what needs to be done in such cases.
>
> I don’t know. blkverify is a rare exception by design, because it can
> abort when both children don’t match. (I suppose we could theoretically
> have a quorum mode where a child gets ejected once a mismatch is
> detected, but that isn’t the case now.)
Well, yes, this is exceptional. I would ignore that property for
assigning roles because when it comes to play, roles don't matter any
more because the whole process is gone. So...
> Furthermore, I would argue that blkverify actually expects the formatted
> image to sometimes differ from the raw image, if anything, because the
> format driver is to be tested. This is the reason why I chose x-raw to
> be the filtered child.
...I don't think this case is relevant. If blkverify returns something,
both children have the same data.
> So there is no general instruction on what to do in such cases that I
> followed here, I specifically chose one child based on what blkverify is
> and what it’s supposed to do. Therefore, I can’t really give a general
> instruction on “what needs to be done in such cases”.
Maybe the missing part for me is what FILTERED is even used for. I
assume it's for skipping over filters in certain functions in the
generic block layer?
In this case, maybe the right answer is that...
> > For blkverify, both
> > children are not equal in intention, so I guess the "real" filtered
> > child is x-image. But for quorum, you can't make any such distinction. I
> > assume the recommendation should be not to set FILTERED for any child
> > then.
>
> Quorum just isn’t a filter driver.
...blkverify isn't one either because performing an operation on only
one child usually won't be correct.
> > Looking at the final code... Hm, your choice looks quite different: You
> > don't have DATA for x-raw, but you make it the PRIMARY and FILTERED
> > child. I think PRIMARY/FILTERED is just a bug (e.g. getlength and flush
> > being forwarded only to x-image show that it's primary).
>
> I rather consider getlength() a special case. Ideally, we’d forward
> getlength() to both and compare the results; however, image formats
> might have different size resolution than raw files, so there could be a
> difference, but it’d be irrelevant.
>
> It makes then sense to forward it to the formatted image, because
> generally formats have byte resolution for the disk size, whereas for
> raw files it depends on caching and the filesystem, I think.
>
> As for flush, yes, why do we forward it only to x-image? Why is “the
> raw file not important”?
Because it's the copy that is used to check whether the main image is
correct. If things break, you just create a new copy. At least that's
how blkverify was supposed to be used.
In fact, I guess in the typical use cases for blkverify, cache=unsafe is
enough anyway because you would start over from scratch, so... not a
strong argument.
> > I do wonder
> > whether I have a different interpretation of DATA than you, though.
>
> I never set DATA for FILTERED, because I consider FILTERED to be
> stronger than DATA, so once FILTERED is set, it doesn’t matter whether
> DATA is set or not. I suppose that should either be mentioned in the
> documentation, or we decide that we should always set DATA regardless.
Either option should be fine. I guess documenting it is less work.
> > Also, the comparison makes me wonder whether FILTERED always implies
> > PRIMARY? Would there ever be a scenario where a child is FILTERED, but
> > not PRIMARY?
>
> I don’t know. I suppose it does. But what’s the implication?
*shrug* I was just asking to see if I understand things right. We could
document it, but I don't have a good reason why we must do that.
Maybe the more relevant question would be if a FILTERED child must be
the only child to avoid the problems we're discussing for blkverify. But
I think I already answered that question for myself with "no", so
probably not much use asking it.
Kevin
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2020-05-05 12:55 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-18 12:42 [PATCH v3 00/33] block: Introduce real BdrvChildRole Max Reitz
2020-02-18 12:42 ` [PATCH v3 01/33] block: Add BlockDriver.is_format Max Reitz
2020-02-18 12:42 ` [PATCH v3 02/33] block: Rename BdrvChildRole to BdrvChildClass Max Reitz
2020-02-18 12:42 ` [PATCH v3 03/33] block: Add BdrvChildRole and BdrvChildRoleBits Max Reitz
2020-02-18 13:05 ` Eric Blake
2020-05-05 11:19 ` Kevin Wolf
2020-05-05 11:59 ` Max Reitz
2020-05-05 12:54 ` Kevin Wolf [this message]
2020-05-05 13:20 ` Max Reitz
2020-05-05 13:38 ` Kevin Wolf
2020-02-18 12:42 ` [PATCH v3 04/33] block: Add BdrvChildRole to BdrvChild Max Reitz
2020-02-18 13:06 ` Eric Blake
2020-02-18 12:42 ` [PATCH v3 05/33] block: Pass BdrvChildRole to bdrv_child_perm() Max Reitz
2020-02-18 12:42 ` [PATCH v3 06/33] block: Pass BdrvChildRole to .inherit_options() Max Reitz
2020-02-18 12:42 ` [PATCH v3 07/33] block: Pass parent_is_format " Max Reitz
2020-02-18 12:42 ` [PATCH v3 08/33] block: Rename bdrv_inherited_options() Max Reitz
2020-02-18 12:42 ` [PATCH v3 09/33] block: Add generic bdrv_inherited_options() Max Reitz
2020-05-06 10:37 ` Kevin Wolf
2020-05-06 13:11 ` Kevin Wolf
2020-05-07 9:18 ` Max Reitz
2020-05-07 8:49 ` Max Reitz
2020-05-07 11:19 ` Kevin Wolf
2020-05-07 11:34 ` Max Reitz
2020-02-18 12:42 ` [PATCH v3 10/33] block: Use bdrv_inherited_options() Max Reitz
2020-02-18 12:42 ` [PATCH v3 11/33] block: Unify bdrv_child_cb_attach() Max Reitz
2020-02-18 12:42 ` [PATCH v3 12/33] block: Unify bdrv_child_cb_detach() Max Reitz
2020-05-06 12:41 ` Kevin Wolf
2020-05-07 9:09 ` Max Reitz
2020-02-18 12:42 ` [PATCH v3 13/33] block: Add child_of_bds Max Reitz
2020-05-06 12:59 ` Kevin Wolf
2020-02-18 12:42 ` [PATCH v3 14/33] block: Distinguish paths in *_format_default_perms Max Reitz
2020-02-18 12:42 ` [PATCH v3 15/33] block: Pull out bdrv_default_perms_for_backing() Max Reitz
2020-05-06 13:21 ` Kevin Wolf
2020-05-07 9:19 ` Max Reitz
2020-02-18 12:42 ` [PATCH v3 16/33] block: Pull out bdrv_default_perms_for_storage() Max Reitz
2020-02-18 12:42 ` [PATCH v3 17/33] block: Split bdrv_default_perms_for_storage() Max Reitz
2020-02-18 12:42 ` [PATCH v3 18/33] block: Add bdrv_default_perms() Max Reitz
2020-05-06 13:47 ` Kevin Wolf
2020-05-07 9:26 ` Max Reitz
2020-02-18 12:42 ` [PATCH v3 19/33] raw-format: Split raw_read_options() Max Reitz
2020-02-18 12:42 ` [PATCH v3 20/33] block: Switch child_format users to child_of_bds Max Reitz
2020-02-18 13:10 ` Eric Blake
2020-02-18 12:42 ` [PATCH v3 21/33] block: Drop child_format Max Reitz
2020-02-18 12:42 ` [PATCH v3 22/33] block: Make backing files child_of_bds children Max Reitz
2020-05-06 16:37 ` Kevin Wolf
2020-05-07 9:28 ` Max Reitz
2020-02-18 12:42 ` [PATCH v3 23/33] block: Drop child_backing Max Reitz
2020-02-18 12:42 ` [PATCH v3 24/33] block: Make format drivers use child_of_bds Max Reitz
2020-02-18 12:42 ` [PATCH v3 25/33] block: Make filter " Max Reitz
2020-02-18 12:42 ` [PATCH v3 26/33] block: Use child_of_bds in remaining places Max Reitz
2020-05-06 17:04 ` Kevin Wolf
2020-05-07 9:33 ` Max Reitz
2020-05-07 11:32 ` Kevin Wolf
2020-02-18 12:42 ` [PATCH v3 27/33] tests: Use child_of_bds instead of child_file Max Reitz
2020-02-18 12:42 ` [PATCH v3 28/33] block: Use bdrv_default_perms() Max Reitz
2020-02-18 12:42 ` [PATCH v3 29/33] block: Make bdrv_filter_default_perms() static Max Reitz
2020-02-18 12:42 ` [PATCH v3 30/33] block: Drop bdrv_format_default_perms() Max Reitz
2020-02-18 12:42 ` [PATCH v3 31/33] block: Drop child_file Max Reitz
2020-02-18 12:42 ` [PATCH v3 32/33] block: Pass BdrvChildRole in remaining cases Max Reitz
2020-05-06 17:13 ` Kevin Wolf
2020-05-07 9:36 ` Max Reitz
2020-05-07 11:40 ` Kevin Wolf
2020-02-18 12:42 ` [PATCH v3 33/33] block: Drop @child_class from bdrv_child_perm() Max Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200505125413.GK5759@linux.fritz.box \
--to=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).