qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Emanuele Giuseppe Esposito <eesposit@redhat.com>, qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <fam@euphon.net>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org, Hanna Reitz <hreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	John Snow <jsnow@redhat.com>
Subject: Re: [PATCH 10/12] block.c: add subtree_drains where needed
Date: Wed, 19 Jan 2022 10:47:15 +0100	[thread overview]
Message-ID: <3d99fc75-9b7a-a55c-3587-b1c1ce07b6f4@redhat.com> (raw)
In-Reply-To: <20220118162738.1366281-11-eesposit@redhat.com>

Subject doesn't say what needs them: "block: drain block devices across 
graph modifications"

On 1/18/22 17:27, Emanuele Giuseppe Esposito wrote:
> Protect bdrv_replace_child_noperm, as it modifies the
> graph by adding/removing elements to .children and .parents
> list of a bs. Use the newly introduced
> bdrv_subtree_drained_{begin/end}_unlocked drains to achieve
> that and be free from the aiocontext lock.
> 
> One important criteria to keep in mind is that if the caller of
> bdrv_replace_child_noperm creates a transaction, we need to make sure that the
> whole transaction is under the same drain block.

Okay, this is the important part and it should be mentioned in patch 8 
as well.  It should also be in a comment above bdrv_replace_child_noperm().

> This is imperative, as having
> multiple drains also in the .abort() class of functions causes discrepancies
> in the drained counters (as nodes are put back into the original positions),
> making it really hard to retourn all to zero and leaving the code very buggy.
> See https://patchew.org/QEMU/20211213104014.69858-1-eesposit@redhat.com/
> for more explanations.
> 
> Unfortunately we still need to have bdrv_subtree_drained_begin/end
> in bdrv_detach_child() releasing and then holding the AioContext
> lock, since it later invokes bdrv_try_set_aio_context() that is
> not safe yet. Once all is cleaned up, we can also remove the
> acquire/release locks in job_unref, artificially added because of this.

About this:

> +         * TODO: this is called by job_unref with lock held, because
> +         * afterwards it calls bdrv_try_set_aio_context.
> +         * Once all of this is fixed, take care of removing
> +         * the aiocontext lock and make this function _unlocked.

It may be clear to you, but it's quite cryptic:

- which lock is held by job_unref()?  Also, would it make more sense to 
talk about block_job_free() rather than job_unref()?  I can't quite 
follow where the AioContext lock is taken.

- what is "all of this", and what do you mean by "not safe yet"?  Do 
both refer to bdrv_try_set_aio_context() needing the AioContext lock?

- what is "this function" (that should become _unlocked)?

I think you could also split the patch in multiple parts for different 
call chains.  In particular bdrv_set_backing_hd can be merged with the 
patch to bdrv_reopen_parse_file_or_backing, since both of them deal with 
bdrv_set_file_or_backing_noperm.

Paolo

> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block.c | 50 ++++++++++++++++++++++++++++++++++++++++++++------
>   1 file changed, 44 insertions(+), 6 deletions(-)
> 
> diff --git a/block.c b/block.c
> index fcc44a49a0..6196c95aae 100644
> --- a/block.c
> +++ b/block.c
> @@ -3114,8 +3114,22 @@ static void bdrv_detach_child(BdrvChild **childp)
>       BlockDriverState *old_bs = (*childp)->bs;
>   
>       assert(qemu_in_main_thread());
> +    if (old_bs) {
> +        /*
> +         * TODO: this is called by job_unref with lock held, because
> +         * afterwards it calls bdrv_try_set_aio_context.
> +         * Once all of this is fixed, take care of removing
> +         * the aiocontext lock and make this function _unlocked.
> +         */
> +        bdrv_subtree_drained_begin(old_bs);
> +    }
> +
>       bdrv_replace_child_noperm(childp, NULL, true);
>   
> +    if (old_bs) {
> +        bdrv_subtree_drained_end(old_bs);
> +    }
> +
>       if (old_bs) {
>           /*
>            * Update permissions for old node. We're just taking a parent away, so
> @@ -3154,6 +3168,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>       Transaction *tran = tran_new();
>   
>       assert(qemu_in_main_thread());
> +    bdrv_subtree_drained_begin_unlocked(child_bs);
>   
>       ret = bdrv_attach_child_common(child_bs, child_name, child_class,
>                                      child_role, perm, shared_perm, opaque,
> @@ -3168,6 +3183,7 @@ out:
>       tran_finalize(tran, ret);
>       /* child is unset on failure by bdrv_attach_child_common_abort() */
>       assert((ret < 0) == !child);
> +    bdrv_subtree_drained_end_unlocked(child_bs);
>   
>       bdrv_unref(child_bs);
>       return child;
> @@ -3197,6 +3213,9 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>   
>       assert(qemu_in_main_thread());
>   
> +    bdrv_subtree_drained_begin_unlocked(parent_bs);
> +    bdrv_subtree_drained_begin_unlocked(child_bs);
> +
>       ret = bdrv_attach_child_noperm(parent_bs, child_bs, child_name, child_class,
>                                      child_role, &child, tran, errp);
>       if (ret < 0) {
> @@ -3211,6 +3230,9 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>   out:
>       tran_finalize(tran, ret);
>       /* child is unset on failure by bdrv_attach_child_common_abort() */
> +    bdrv_subtree_drained_end_unlocked(child_bs);
> +    bdrv_subtree_drained_end_unlocked(parent_bs);
> +
>       assert((ret < 0) == !child);
>   
>       bdrv_unref(child_bs);
> @@ -3456,6 +3478,11 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>   
>       assert(qemu_in_main_thread());
>   
> +    bdrv_subtree_drained_begin_unlocked(bs);
> +    if (backing_hd) {
> +        bdrv_subtree_drained_begin_unlocked(backing_hd);
> +    }
> +
>       ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
>       if (ret < 0) {
>           goto out;
> @@ -3464,6 +3491,10 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>       ret = bdrv_refresh_perms(bs, errp);
>   out:
>       tran_finalize(tran, ret);
> +    if (backing_hd) {
> +        bdrv_subtree_drained_end_unlocked(backing_hd);
> +    }
> +    bdrv_subtree_drained_end_unlocked(bs);
>   
>       return ret;
>   }
> @@ -5266,7 +5297,8 @@ static int bdrv_replace_node_common(BlockDriverState *from,
>   
>       assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>       assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
> -    bdrv_drained_begin(from);
> +    bdrv_subtree_drained_begin_unlocked(from);
> +    bdrv_subtree_drained_begin_unlocked(to);
>   
>       /*
>        * Do the replacement without permission update.
> @@ -5298,7 +5330,8 @@ static int bdrv_replace_node_common(BlockDriverState *from,
>   out:
>       tran_finalize(tran, ret);
>   
> -    bdrv_drained_end(from);
> +    bdrv_subtree_drained_end_unlocked(to);
> +    bdrv_subtree_drained_end_unlocked(from);
>       bdrv_unref(from);
>   
>       return ret;
> @@ -5345,6 +5378,9 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>   
>       assert(!bs_new->backing);
>   
> +    bdrv_subtree_drained_begin_unlocked(bs_new);
> +    bdrv_subtree_drained_begin_unlocked(bs_top);
> +
>       ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
>                                      &child_of_bds, bdrv_backing_role(bs_new),
>                                      &bs_new->backing, tran, errp);
> @@ -5360,6 +5396,8 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>       ret = bdrv_refresh_perms(bs_new, errp);
>   out:
>       tran_finalize(tran, ret);
> +    bdrv_subtree_drained_end_unlocked(bs_top);
> +    bdrv_subtree_drained_end_unlocked(bs_new);
>   
>       bdrv_refresh_limits(bs_top, NULL, NULL);
>   
> @@ -5379,8 +5417,8 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
>       assert(qemu_in_main_thread());
>   
>       bdrv_ref(old_bs);
> -    bdrv_drained_begin(old_bs);
> -    bdrv_drained_begin(new_bs);
> +    bdrv_subtree_drained_begin_unlocked(old_bs);
> +    bdrv_subtree_drained_begin_unlocked(new_bs);
>   
>       bdrv_replace_child_tran(&child, new_bs, tran, true);
>       /* @new_bs must have been non-NULL, so @child must not have been freed */
> @@ -5394,8 +5432,8 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
>   
>       tran_finalize(tran, ret);
>   
> -    bdrv_drained_end(old_bs);
> -    bdrv_drained_end(new_bs);
> +    bdrv_subtree_drained_end_unlocked(new_bs);
> +    bdrv_subtree_drained_end_unlocked(old_bs);
>       bdrv_unref(old_bs);
>   
>       return ret;



  reply	other threads:[~2022-01-19  9:51 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-18 16:27 [PATCH 00/12] Removal of Aiocontext lock through drains: protect bdrv_replace_child_noperm Emanuele Giuseppe Esposito
2022-01-18 16:27 ` [PATCH 01/12] introduce BDRV_POLL_WHILE_UNLOCKED Emanuele Giuseppe Esposito
2022-01-26 10:49   ` Stefan Hajnoczi
2022-02-03 13:57     ` Emanuele Giuseppe Esposito
2022-02-04 12:13       ` Paolo Bonzini
2022-01-18 16:27 ` [PATCH 02/12] block/io.c: make bdrv_do_drained_begin_quiesce static and introduce bdrv_drained_begin_no_poll Emanuele Giuseppe Esposito
2022-01-19  9:11   ` Paolo Bonzini
2022-01-18 16:27 ` [PATCH 03/12] block.c: bdrv_replace_child_noperm: first remove the child, and then call ->detach() Emanuele Giuseppe Esposito
2022-01-18 16:27 ` [PATCH 04/12] block.c: bdrv_replace_child_noperm: first call ->attach(), and then add child Emanuele Giuseppe Esposito
2022-01-18 16:27 ` [PATCH 05/12] test-bdrv-drain.c: adapt test to the coming subtree drains Emanuele Giuseppe Esposito
2022-01-19  9:18   ` Paolo Bonzini
2022-02-03 11:41     ` Emanuele Giuseppe Esposito
2022-01-18 16:27 ` [PATCH 06/12] test-bdrv-drain.c: remove test_detach_by_parent_cb() Emanuele Giuseppe Esposito
2022-01-18 16:27 ` [PATCH 07/12] block/io.c: introduce bdrv_subtree_drained_{begin/end}_unlocked Emanuele Giuseppe Esposito
2022-01-19  9:52   ` Paolo Bonzini
2022-01-26 11:04   ` Stefan Hajnoczi
2022-01-18 16:27 ` [PATCH 08/12] reopen: add a transaction to drain_end nodes picked in bdrv_reopen_parse_file_or_backing Emanuele Giuseppe Esposito
2022-01-19  9:33   ` Paolo Bonzini
2022-01-26 11:16   ` Stefan Hajnoczi
2022-01-18 16:27 ` [PATCH 09/12] jobs: ensure sleep in job_sleep_ns is fully performed Emanuele Giuseppe Esposito
2022-01-26 11:21   ` Stefan Hajnoczi
2022-02-03 14:21     ` Emanuele Giuseppe Esposito
2022-01-18 16:27 ` [PATCH 10/12] block.c: add subtree_drains where needed Emanuele Giuseppe Esposito
2022-01-19  9:47   ` Paolo Bonzini [this message]
2022-02-03 13:13     ` Emanuele Giuseppe Esposito
2022-02-01 14:47   ` Vladimir Sementsov-Ogievskiy
2022-02-02 15:37     ` Emanuele Giuseppe Esposito
2022-02-02 17:38       ` Paolo Bonzini
2022-02-03 10:09         ` Emanuele Giuseppe Esposito
2022-02-04  9:49       ` Vladimir Sementsov-Ogievskiy
2022-02-04 13:30         ` Emanuele Giuseppe Esposito
2022-02-04 14:03           ` Vladimir Sementsov-Ogievskiy
2022-01-18 16:27 ` [PATCH 11/12] block/io.c: fully enable assert_bdrv_graph_writable Emanuele Giuseppe Esposito
2022-01-18 16:27 ` [PATCH 12/12] block.c: additional assert qemu in main tread Emanuele Giuseppe Esposito
2022-01-19  9:51 ` [PATCH 00/12] Removal of Aiocontext lock through drains: protect bdrv_replace_child_noperm Paolo Bonzini
2022-01-26 11:29 ` Stefan Hajnoczi
2022-01-27 13:46   ` Paolo Bonzini
2022-01-28 12:20     ` Emanuele Giuseppe Esposito

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3d99fc75-9b7a-a55c-3587-b1c1ce07b6f4@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).