qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Hanna Reitz <hreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>, qemu-block@nongnu.org
Cc: eesposit@redhat.com, stefanha@redhat.com, pbonzini@redhat.com,
	qemu-devel@nongnu.org
Subject: Re: [PATCH 12/13] block: Don't poll in bdrv_replace_child_noperm()
Date: Mon, 14 Nov 2022 21:22:32 +0100	[thread overview]
Message-ID: <b998dc15-4c83-4f3a-c79e-edd78562ec3f@redhat.com> (raw)
In-Reply-To: <20221108123738.530873-13-kwolf@redhat.com>

On 08.11.22 13:37, Kevin Wolf wrote:
> In order to make sure that bdrv_replace_child_noperm() doesn't have to
> poll any more, get rid of the bdrv_parent_drained_begin_single() call.
>
> This is possible now because we can require that the child is already
> drained when the function is called (it better be, having in-flight
> requests while modifying the graph isn't going to end well!) and we
> don't call the parent drain callbacks more than once.
>
> The additional drain calls needed in callers cause the test case to run
> its code in the drain handler too early (bdrv_attach_child() drains
> now), so modify it to only enable the code after the test setup has
> completed.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   include/block/block-io.h     |  8 ++++
>   block.c                      | 72 +++++++++++++++++++++++++-----------
>   block/io.c                   |  2 +-
>   tests/unit/test-bdrv-drain.c | 10 +++++
>   4 files changed, 70 insertions(+), 22 deletions(-)

I find this change complicated.  I understand it’s the point of the 
series, but I find it difficult to grasp.  But I guess there can be no 
drain series without such a patch.

As usual, I was very skeptical of the code at first, and over time 
slowly realized that I’m mostly confused by the comments, and the code 
seems fine.  Ah, well.

[...]

> diff --git a/block.c b/block.c
> index 5f5f79cd16..12039e9b8a 100644
> --- a/block.c
> +++ b/block.c

[...]

> @@ -2414,12 +2428,20 @@ static TransactionActionDrv bdrv_replace_child_drv = {
>    *
>    * Note: real unref of old_bs is done only on commit.
>    *
> + * Both child and new_bs (if non-NULL) must be drained. new_bs must be kept
> + * drained until the transaction is completed (this automatically implies that
> + * child remains drained, too).

I find “child” extremely ambiguous.  The problem is that there generally 
is no way to drain a BdrvChild object, is there?  You can only drain the 
BDS in it, which then drains the parent through the BdrvChild object.  
Historically, I don’t think there was ever a place where we cared about 
the BdrvChild object between the two to be drained, was there?  I mean, 
now there apparently is, in bdrv_child_attach_common(), but that’s a 
different story.

So the problem is that “draining a BdrvChild object” generally appears 
in the context of bdrv_parent_drained_*() functions, i.e. actually 
functions draining the parent.  Which makes it a bit confusing to refer 
to a BdrvChild object just as “child”.

I know that “child” here refers to the variable (or does it not?), but 
that is why I really prefer marking variables that are just plain 
English words, e.g. as @child or `child`, so it’s clear they are a name 
and not a noun.

In any case, because the concept is generally to drain the `child->bs` 
instead of the BdrvChild object directly, I understand the comment to 
mean: “Both the old child (`child->bs`) and `new_bs` (if non-NULL) must 
be drained.  `new_bs` must be kept drained until the transaction is 
completed.  This implies that the parent too will be kept drained until 
the transaction is completed by the BdrvChild object `child`.”

Or am I misunderstanding something, and the distinction between `child` 
and `child->bs` and the parent node is important here? (Would be good to 
know. :))

> + *
>    * The function doesn't update permissions, caller is responsible for this.
>    */
>   static void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
>                                       Transaction *tran)
>   {
>       BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
> +
> +    assert(child->parent_quiesce_counter);
> +    assert(!new_bs || new_bs->quiesce_counter);
> +
>       *s = (BdrvReplaceChildState) {
>           .child = child,
>           .old_bs = child->bs,
> @@ -2818,6 +2840,12 @@ static void bdrv_replace_child_noperm(BdrvChild *child,

This function now has its callers fulfill kind of a complicated 
contract.  I would prefer that to be written out in a doc comment, 
especially because it sounds like the assertions can’t cover everything 
(i.e. callers that remove a child are required to have stopped issuing 
requests to that child, but they are free to do that in any way they 
want, so no assertion will check for it here).

>       int new_bs_quiesce_counter;
>   
>       assert(!child->frozen);
> +    /*
> +     * When removing the child, it's the callers responsibility to make sure
> +     * that no requests are in flight any more. Usually the parent is drained,
> +     * but not through child->parent_quiesce_counter.
> +     */

When I see a comment above an assertion, I immediately assume it is 
going to describe what the assertion checks.  Unless I’m 
misunderstanding something (again?), this comment actually describes 
what the assertion *does not* check.  I find that confusing, especially 
because the comment leads with “it’s the caller’s responsibility”, which 
to me implies “and that’s why we check it here in this assertion”, 
because assertions are there to verify that contracts are met.

The assertion verifies that the parent must be drained (through @child), 
unless the child is removed, which case isn’t covered by the assertion.  
That “isn’t covered” is then described by the comment, right?

I’d prefer the comment to lead with describing what the assertion does 
check, and then transitioning to “But in case the child is removed, we 
ignore that, and just note that it’s the caller’s responsibility to...”.

Also, the comment doesn’t explicitly say why we don’t check it in the 
assertion.  It says “usually” and “child->parent_quiesce_counter”, which 
implies “can’t get any information from child->parent_quiesce_counter, 
and regardless, callers can do what they want do achieve quiescing in 
regards to this child, so there’s nothing we can check”.  It feels like 
we can just say outright that there’s an informal contract that we can’t 
formally verify here, but callers naturally still must adhere to it.  It 
would be interesting to know (and note) why that is, though, i.e. why we 
can’t have parents be drained through the BdrvChild object for the child 
that is being removed.

I understand the intention behind the assertion to be: “We require the 
parent not to have in-flight requests to the BdrvChild object 
manipulated here.  In most cases, we verify that by requiring the parent 
be drained through this BdrvChild object.  However, when a child is 
being removed, we skip formal verification, because we leave callers 
free in deciding how to ensure that no requests are in flight.  Usually, 
they will still have the parent be drained (even if not through this 
BdrvChild object), but we don’t require that.”

I may well be wrong, but then it would be good for a comment to correct 
me. :)

(Interestingly, because bdrv_replace_child_noperm() no longer polls 
itself, it can’t know for sure that `child->parent_quiesce_counter > 0` 
means that there are no requests in flight.)

> +    assert(!new_bs || child->parent_quiesce_counter);
>       assert(old_bs != new_bs);
>       GLOBAL_STATE_CODE();

[...]

> @@ -2865,9 +2875,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
>       /*
>        * If the old child node was drained but the new one is not, allow

This now also covers the case where there was no old child node, but the 
parent was simply drained via an empty BdrvChild by the caller.

>        * requests to come in only after the new node has been attached.
> -     *
> -     * Update new_bs_quiesce_counter because bdrv_parent_drained_begin_single()
> -     * polls, which could have changed the value.
>        */
>       new_bs_quiesce_counter = (new_bs ? new_bs->quiesce_counter : 0);
>       if (!new_bs_quiesce_counter && child->parent_quiesce_counter) {
> @@ -3004,6 +3011,12 @@ static BdrvChild *bdrv_attach_child_common(BlockDriverState *child_bs,
>       }
>   
>       bdrv_ref(child_bs);
> +    /*
> +     * Let every new BdrvChild start drained, inserting it in the graph with
> +     * bdrv_replace_child_noperm() will undrain it if the child node is not
> +     * drained. The child was only just created, so polling is not necessary.

I feel like this is hiding some complexity.  Unless I missed something, 
draining a BdrvChild always meant draining the parent. But here, it 
absolutely does not mean that, and maybe that deserves a big warning sign?

Beginning a drain without poll means quiescing.  You assert that there 
can be no requests to the new child, which I agree on[1].  The 
combination of no new requests coming in, and no requests being there at 
this point is what being drained means.  So @new_child is indeed “drained”.

But the parent isn’t drained, because it isn’t polled.  There may still 
be requests in flight to its other children.  That’s really interesting, 
and I found it extremely confusing until I wrote ten paragraphs in reply 
here and scrapped most of them again.  Whenever I find this to be my 
reaction to something, I really wish for a detailed comment that 
explains the situation.

I would like the comment to:
- Expand on what “only just created” means.  As it’s written, that could 
mean relying on a race condition.  At which point would the parent be 
able to send requests?  (I assume either the .attach() in 
bdrv_replace_child_noperm(), or when this function returns, whichever 
comes first.  (The former always comes first.))
- Say in more detail that calling bdrv_parent_drained_begin_single() 
without polling will quiesce the parent, preventing new requests from 
appearing.
- Note that because there are no requests in flight, and because no new 
requests can then appear, the BdrvChild is drained.
- Note that the parent is only quiesced, not drained, and may still have 
requests in flight to other children, but naturally we don’t care about 
them.

I feel like the comment tries to hide all that complexity simply by 
avoiding the word “parent”.

[1] As far as I can piece together, no requests to the new child can 
have started yet, because this function creates the BdrvChild object, so 
before it is returned to the caller (or BdrvChildClass.attach() is 
called in bdrv_replace_child_noperm()), the block driver won’t generate 
requests to it.

Hanna

> +     */
> +    bdrv_parent_drained_begin_single(new_child, false);
>       bdrv_replace_child_noperm(new_child, child_bs);
>   
>       BdrvAttachChildCommonState *s = g_new(BdrvAttachChildCommonState, 1);



  parent reply	other threads:[~2022-11-14 23:59 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-08 12:37 [PATCH 00/13] block: Simplify drain Kevin Wolf
2022-11-08 12:37 ` [PATCH 01/13] qed: Don't yield in bdrv_qed_co_drain_begin() Kevin Wolf
2022-11-09  9:21   ` Vladimir Sementsov-Ogievskiy
2022-11-09  9:27   ` Vladimir Sementsov-Ogievskiy
2022-11-09 12:22     ` Kevin Wolf
2022-11-09 21:49   ` Stefan Hajnoczi
2022-11-10 11:07     ` Kevin Wolf
2022-11-11 11:14   ` Emanuele Giuseppe Esposito
2022-11-14 18:16   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 02/13] test-bdrv-drain: Don't yield in .bdrv_co_drained_begin/end() Kevin Wolf
2022-11-09 10:50   ` Vladimir Sementsov-Ogievskiy
2022-11-09 12:28     ` Kevin Wolf
2022-11-09 13:45   ` Vladimir Sementsov-Ogievskiy
2022-11-11 11:14   ` Emanuele Giuseppe Esposito
2022-11-14 18:16   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 03/13] block: Revert .bdrv_drained_begin/end to non-coroutine_fn Kevin Wolf
2022-11-09 14:29   ` Vladimir Sementsov-Ogievskiy
2022-11-09 22:13   ` Stefan Hajnoczi
2022-11-11 11:14   ` Emanuele Giuseppe Esposito
2022-11-14 18:17   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 04/13] block: Remove drained_end_counter Kevin Wolf
2022-11-09 14:44   ` Vladimir Sementsov-Ogievskiy
2022-11-11 16:37     ` Kevin Wolf
2022-11-11 11:15   ` Emanuele Giuseppe Esposito
2022-11-14 18:19   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 05/13] block: Inline bdrv_drain_invoke() Kevin Wolf
2022-11-09 15:34   ` Vladimir Sementsov-Ogievskiy
2022-11-10 19:48   ` Stefan Hajnoczi
2022-11-11 11:15   ` Emanuele Giuseppe Esposito
2022-11-14 18:19   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 06/13] block: Drain invidual nodes during reopen Kevin Wolf
2022-11-09 16:00   ` Vladimir Sementsov-Ogievskiy
2022-11-11 16:54     ` Kevin Wolf
2022-11-08 12:37 ` [PATCH 07/13] block: Don't use subtree drains in bdrv_drop_intermediate() Kevin Wolf
2022-11-09 16:18   ` Vladimir Sementsov-Ogievskiy
2022-11-14 18:20   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 08/13] stream: Replace subtree drain with a single node drain Kevin Wolf
2022-11-09 16:52   ` Vladimir Sementsov-Ogievskiy
2022-11-10 10:16     ` Kevin Wolf
2022-11-10 11:25       ` Vladimir Sementsov-Ogievskiy
2022-11-10 17:27         ` Kevin Wolf
2022-11-14 18:21   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 09/13] block: Remove subtree drains Kevin Wolf
2022-11-09 17:22   ` Vladimir Sementsov-Ogievskiy
2022-11-14 18:22   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 10/13] block: Call drain callbacks only once Kevin Wolf
2022-11-09 18:05   ` Vladimir Sementsov-Ogievskiy
2022-11-14 12:32     ` Kevin Wolf
2022-11-09 18:54   ` Vladimir Sementsov-Ogievskiy
2022-11-14 18:23   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 11/13] block: Remove ignore_bds_parents parameter from drain functions Kevin Wolf
2022-11-09 18:57   ` Vladimir Sementsov-Ogievskiy
2022-11-14 18:23   ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 12/13] block: Don't poll in bdrv_replace_child_noperm() Kevin Wolf
2022-11-11 11:21   ` Emanuele Giuseppe Esposito
2022-11-14 20:22   ` Hanna Reitz [this message]
2022-11-17 13:27     ` Kevin Wolf
2022-11-08 12:37 ` [PATCH 13/13] block: Remove poll parameter from bdrv_parent_drained_begin_single() Kevin Wolf
2022-11-14 20:24   ` Hanna Reitz
2022-11-10 20:13 ` [PATCH 00/13] block: Simplify drain Stefan Hajnoczi
2022-11-11 11:23 ` Emanuele Giuseppe Esposito

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b998dc15-4c83-4f3a-c79e-edd78562ec3f@redhat.com \
    --to=hreitz@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).