From: Paolo Bonzini <pbonzini@redhat.com>
To: Emanuele Giuseppe Esposito <eesposit@redhat.com>, qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <fam@euphon.net>,
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
qemu-devel@nongnu.org, Hanna Reitz <hreitz@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
John Snow <jsnow@redhat.com>
Subject: Re: [PATCH 10/12] block.c: add subtree_drains where needed
Date: Wed, 19 Jan 2022 10:47:15 +0100
Message-ID: <3d99fc75-9b7a-a55c-3587-b1c1ce07b6f4@redhat.com>
In-Reply-To: <20220118162738.1366281-11-eesposit@redhat.com>
The subject doesn't say what needs the drains. I would suggest something
like "block: drain block devices across graph modifications".
On 1/18/22 17:27, Emanuele Giuseppe Esposito wrote:
> Protect bdrv_replace_child_noperm, as it modifies the
> graph by adding/removing elements to .children and .parents
> list of a bs. Use the newly introduced
> bdrv_subtree_drained_{begin/end}_unlocked drains to achieve
> that and be free from the aiocontext lock.
>
> One important criterion to keep in mind is that if the caller of
> bdrv_replace_child_noperm creates a transaction, we need to make sure that the
> whole transaction is under the same drain block.
Okay, this is the important part and it should be mentioned in patch 8
as well. It should also be in a comment above bdrv_replace_child_noperm().
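
To make the rule concrete, here is a toy model of it. This is only a sketch
and not QEMU code: Node, Tran, replace_child(), set_child() and the other
helpers are hypothetical stand-ins, and only a per-node drain counter is
modelled (no subtrees, no polling, no AioContext). The point it shows is that
the abort path modifies the graph too, so the drained section has to extend
past tran_finalize().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a block node; only the drain counter is modelled. */
typedef struct Node {
    int quiesce_counter;        /* > 0 means the node is drained */
    struct Node *child;         /* a single child is enough here */
} Node;

static void drained_begin(Node *n)
{
    n->quiesce_counter++;
}

static void drained_end(Node *n)
{
    assert(n->quiesce_counter > 0);
    n->quiesce_counter--;
}

/* Graph modification: only legal while the parent is drained. */
static void replace_child(Node *parent, Node *new_child)
{
    assert(parent->quiesce_counter > 0);
    parent->child = new_child;
}

/* Minimal transaction: remembers one replacement so it can be undone. */
typedef struct Tran {
    Node *parent;
    Node *old_child;
} Tran;

static void replace_child_tran(Tran *t, Node *parent, Node *new_child)
{
    t->parent = parent;
    t->old_child = parent->child;
    replace_child(parent, new_child);
}

/* ret < 0 aborts: the old child is put back, which modifies the graph again. */
static void tran_finalize(Tran *t, int ret)
{
    if (ret < 0 && t->parent) {
        replace_child(t->parent, t->old_child);
    }
}

/* One drained section covers prepare and commit/abort alike. */
int set_child(Node *parent, Node *new_child, int fail)
{
    Tran t = { NULL, NULL };
    int ret;

    drained_begin(parent);
    drained_begin(new_child);

    replace_child_tran(&t, parent, new_child);
    ret = fail ? -1 : 0;        /* stands in for a later step failing */
    tran_finalize(&t, ret);     /* still drained, so abort is safe */

    drained_end(new_child);     /* end in reverse order of begin */
    drained_end(parent);
    return ret;
}
```

If the two drained_end() calls were moved before tran_finalize(), the abort
path would trip the assertion in replace_child(). Adding a second drain pair
inside the abort path instead is what makes the counters hard to bring back
to zero.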
> This is imperative, as having
> multiple drains also in the .abort() class of functions causes discrepancies
> in the drained counters (as nodes are put back into the original positions),
> making it really hard to return all to zero and leaving the code very buggy.
> See https://patchew.org/QEMU/20211213104014.69858-1-eesposit@redhat.com/
> for more explanations.
>
> Unfortunately we still need to have bdrv_subtree_drained_begin/end
> in bdrv_detach_child() releasing and then holding the AioContext
> lock, since it later invokes bdrv_try_set_aio_context() that is
> not safe yet. Once all is cleaned up, we can also remove the
> acquire/release locks in job_unref, artificially added because of this.
About this:
> + * TODO: this is called by job_unref with lock held, because
> + * afterwards it calls bdrv_try_set_aio_context.
> + * Once all of this is fixed, take care of removing
> + * the aiocontext lock and make this function _unlocked.
It may be clear to you, but it's quite cryptic:
- which lock is held by job_unref()? Also, would it make more sense to
talk about block_job_free() rather than job_unref()? I can't quite
follow where the AioContext lock is taken.
- what is "all of this", and what do you mean by "not safe yet"? Do
both refer to bdrv_try_set_aio_context() needing the AioContext lock?
- what is "this function" (that should become _unlocked)?
I think you could also split the patch into multiple parts for different
call chains. In particular bdrv_set_backing_hd can be merged with the
patch to bdrv_reopen_parse_file_or_backing, since both of them deal with
bdrv_set_file_or_backing_noperm.
Paolo
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
> block.c | 50 ++++++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 44 insertions(+), 6 deletions(-)
>
> diff --git a/block.c b/block.c
> index fcc44a49a0..6196c95aae 100644
> --- a/block.c
> +++ b/block.c
> @@ -3114,8 +3114,22 @@ static void bdrv_detach_child(BdrvChild **childp)
>      BlockDriverState *old_bs = (*childp)->bs;
>
>      assert(qemu_in_main_thread());
> +    if (old_bs) {
> +        /*
> +         * TODO: this is called by job_unref with lock held, because
> +         * afterwards it calls bdrv_try_set_aio_context.
> +         * Once all of this is fixed, take care of removing
> +         * the aiocontext lock and make this function _unlocked.
> +         */
> +        bdrv_subtree_drained_begin(old_bs);
> +    }
> +
>      bdrv_replace_child_noperm(childp, NULL, true);
>
> +    if (old_bs) {
> +        bdrv_subtree_drained_end(old_bs);
> +    }
> +
>      if (old_bs) {
>          /*
>           * Update permissions for old node. We're just taking a parent away, so
> @@ -3154,6 +3168,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>      Transaction *tran = tran_new();
>
>      assert(qemu_in_main_thread());
> +    bdrv_subtree_drained_begin_unlocked(child_bs);
>
>      ret = bdrv_attach_child_common(child_bs, child_name, child_class,
>                                     child_role, perm, shared_perm, opaque,
> @@ -3168,6 +3183,7 @@ out:
>      tran_finalize(tran, ret);
>      /* child is unset on failure by bdrv_attach_child_common_abort() */
>      assert((ret < 0) == !child);
> +    bdrv_subtree_drained_end_unlocked(child_bs);
>
>      bdrv_unref(child_bs);
>      return child;
> @@ -3197,6 +3213,9 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>
>      assert(qemu_in_main_thread());
>
> +    bdrv_subtree_drained_begin_unlocked(parent_bs);
> +    bdrv_subtree_drained_begin_unlocked(child_bs);
> +
>      ret = bdrv_attach_child_noperm(parent_bs, child_bs, child_name, child_class,
>                                     child_role, &child, tran, errp);
>      if (ret < 0) {
> @@ -3211,6 +3230,9 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>  out:
>      tran_finalize(tran, ret);
>      /* child is unset on failure by bdrv_attach_child_common_abort() */
> +    bdrv_subtree_drained_end_unlocked(child_bs);
> +    bdrv_subtree_drained_end_unlocked(parent_bs);
> +
>      assert((ret < 0) == !child);
>
>      bdrv_unref(child_bs);
> @@ -3456,6 +3478,11 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>
>      assert(qemu_in_main_thread());
>
> +    bdrv_subtree_drained_begin_unlocked(bs);
> +    if (backing_hd) {
> +        bdrv_subtree_drained_begin_unlocked(backing_hd);
> +    }
> +
>      ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
>      if (ret < 0) {
>          goto out;
> @@ -3464,6 +3491,10 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>      ret = bdrv_refresh_perms(bs, errp);
>  out:
>      tran_finalize(tran, ret);
> +    if (backing_hd) {
> +        bdrv_subtree_drained_end_unlocked(backing_hd);
> +    }
> +    bdrv_subtree_drained_end_unlocked(bs);
>
>      return ret;
>  }
> @@ -5266,7 +5297,8 @@ static int bdrv_replace_node_common(BlockDriverState *from,
>
>      assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>      assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
> -    bdrv_drained_begin(from);
> +    bdrv_subtree_drained_begin_unlocked(from);
> +    bdrv_subtree_drained_begin_unlocked(to);
>
>      /*
>       * Do the replacement without permission update.
> @@ -5298,7 +5330,8 @@ static int bdrv_replace_node_common(BlockDriverState *from,
>  out:
>      tran_finalize(tran, ret);
>
> -    bdrv_drained_end(from);
> +    bdrv_subtree_drained_end_unlocked(to);
> +    bdrv_subtree_drained_end_unlocked(from);
>      bdrv_unref(from);
>
>      return ret;
> @@ -5345,6 +5378,9 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>
>      assert(!bs_new->backing);
>
> +    bdrv_subtree_drained_begin_unlocked(bs_new);
> +    bdrv_subtree_drained_begin_unlocked(bs_top);
> +
>      ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
>                                     &child_of_bds, bdrv_backing_role(bs_new),
>                                     &bs_new->backing, tran, errp);
> @@ -5360,6 +5396,8 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>      ret = bdrv_refresh_perms(bs_new, errp);
>  out:
>      tran_finalize(tran, ret);
> +    bdrv_subtree_drained_end_unlocked(bs_top);
> +    bdrv_subtree_drained_end_unlocked(bs_new);
>
>      bdrv_refresh_limits(bs_top, NULL, NULL);
>
> @@ -5379,8 +5417,8 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
>      assert(qemu_in_main_thread());
>
>      bdrv_ref(old_bs);
> -    bdrv_drained_begin(old_bs);
> -    bdrv_drained_begin(new_bs);
> +    bdrv_subtree_drained_begin_unlocked(old_bs);
> +    bdrv_subtree_drained_begin_unlocked(new_bs);
>
>      bdrv_replace_child_tran(&child, new_bs, tran, true);
>      /* @new_bs must have been non-NULL, so @child must not have been freed */
> @@ -5394,8 +5432,8 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
>
>      tran_finalize(tran, ret);
>
> -    bdrv_drained_end(old_bs);
> -    bdrv_drained_end(new_bs);
> +    bdrv_subtree_drained_end_unlocked(new_bs);
> +    bdrv_subtree_drained_end_unlocked(old_bs);
>      bdrv_unref(old_bs);
>
>      return ret;