From: Kevin Wolf <kwolf@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-block@nongnu.org, stefanha@redhat.com, eesposit@redhat.com,
pbonzini@redhat.com, vsementsov@yandex-team.ru,
qemu-devel@nongnu.org
Subject: Re: [PATCH 16/24] block: Mark bdrv_replace_node() GRAPH_WRLOCK
Date: Fri, 3 Nov 2023 11:32:39 +0100
Message-ID: <ZUTMRxsxLbw4OePX@redhat.com>
In-Reply-To: <3dndhoo6fq2pes3dldplykyg7svuwyfntix5txvotr3zpklnly@gf6yi37ijtmm>

On 27.10.2023 at 23:33, Eric Blake wrote:
> On Fri, Oct 27, 2023 at 05:53:25PM +0200, Kevin Wolf wrote:
> > Instead of taking the writer lock internally, require callers to already
> > hold it when calling bdrv_replace_node(). Its callers may already want
> > to hold the graph lock and so wouldn't be able to call functions that
> > take it internally.
> >
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> > include/block/block-global-state.h | 6 ++++--
> > block.c | 26 +++++++-------------------
> > block/commit.c | 13 +++++++++++--
> > block/mirror.c | 26 ++++++++++++++++----------
> > blockdev.c | 5 +++++
> > tests/unit/test-bdrv-drain.c | 6 ++++++
> > tests/unit/test-bdrv-graph-mod.c | 13 +++++++++++--
> > 7 files changed, 60 insertions(+), 35 deletions(-)
> >
> > +++ b/block.c
> > @@ -5484,25 +5484,7 @@ out:
> > int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
> > Error **errp)
> > {
> > - int ret;
> > -
> > - GLOBAL_STATE_CODE();
> > -
> > - /* Make sure that @from doesn't go away until we have successfully attached
> > - * all of its parents to @to. */
>
> Useful comment that you just moved here in the previous patch...
>
> > - bdrv_ref(from);
> > - bdrv_drained_begin(from);
> > - bdrv_drained_begin(to);
> > - bdrv_graph_wrlock(to);
> > -
> > - ret = bdrv_replace_node_common(from, to, true, false, errp);
> > -
> > - bdrv_graph_wrunlock();
> > - bdrv_drained_end(to);
> > - bdrv_drained_end(from);
> > - bdrv_unref(from);
> > -
> > - return ret;
> > + return bdrv_replace_node_common(from, to, true, false, errp);
> > }
> >
> > int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
> > @@ -5717,9 +5699,15 @@ BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *options,
> > goto fail;
> > }
> >
> > + bdrv_ref(bs);
>
> ...but now it is gone. Intentional?

I figured it was obvious enough that bdrv_ref() is always called to make
sure that the node doesn't go away too early, but I can add it back.

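As an illustration of why the extra reference matters, here is a minimal
sketch of the pattern. RefNode, ref_node(), unref_node() and
replace_keeps_node_alive() are hypothetical stand-ins invented for this
example, not QEMU's types or functions; the sketch only assumes that moving
the last parent away drops the last ordinary reference to the old node.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted node; not QEMU's BDS. */
typedef struct RefNode {
    int refcnt;
    bool freed;   /* set instead of actually freeing, so we can inspect it */
} RefNode;

static void ref_node(RefNode *n)
{
    n->refcnt++;
}

static void unref_node(RefNode *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        n->freed = true;   /* real code would release the node here */
    }
}

/*
 * Model of a replace operation: the only parent of 'from' moves to another
 * node, which drops the parent's reference. Returns whether 'from' was
 * still alive right after the parent moved away.
 */
static bool replace_keeps_node_alive(bool take_extra_ref)
{
    RefNode from = { .refcnt = 1, .freed = false };   /* one parent ref */
    bool alive_after_move;

    if (take_extra_ref) {
        ref_node(&from);      /* plays the role of bdrv_ref(from) */
    }

    unref_node(&from);        /* last parent detaches from 'from' */
    alive_after_move = !from.freed;

    if (take_extra_ref) {
        unref_node(&from);    /* plays the role of bdrv_unref(from) */
    }
    return alive_after_move;
}
```

In this model, replace_keeps_node_alive(true) reports that the node survived
the move, while replace_keeps_node_alive(false) reports that it was gone as
soon as its last parent left.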
> > bdrv_drained_begin(bs);
> > + bdrv_drained_begin(new_node_bs);
> > + bdrv_graph_wrlock(new_node_bs);
> > ret = bdrv_replace_node(bs, new_node_bs, errp);
> > + bdrv_graph_wrunlock();
> > + bdrv_drained_end(new_node_bs);
> > bdrv_drained_end(bs);
> > + bdrv_unref(bs);
> >
> > if (ret < 0) {
> > error_prepend(errp, "Could not replace node: ");
> > diff --git a/block/commit.c b/block/commit.c
> > index d92af02ead..2fecdce86f 100644
> > --- a/block/commit.c
> > +++ b/block/commit.c
> > @@ -68,6 +68,7 @@ static void commit_abort(Job *job)
> > {
> > CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
> > BlockDriverState *top_bs = blk_bs(s->top);
> > + BlockDriverState *commit_top_backing_bs;
> >
> > if (s->chain_frozen) {
> > bdrv_graph_rdlock_main_loop();
> > @@ -94,8 +95,12 @@ static void commit_abort(Job *job)
> > * XXX Can (or should) we somehow keep 'consistent read' blocked even
> > * after the failed/cancelled commit job is gone? If we already wrote
> > * something to base, the intermediate images aren't valid any more. */
> > - bdrv_replace_node(s->commit_top_bs, s->commit_top_bs->backing->bs,
> > - &error_abort);
> > + commit_top_backing_bs = s->commit_top_bs->backing->bs;
> > + bdrv_drained_begin(commit_top_backing_bs);
> > + bdrv_graph_wrlock(commit_top_backing_bs);
>
> Here, and elsewhere in the patch, drained_begin/end is outside
> wr(un)lock...
>
> > + bdrv_replace_node(s->commit_top_bs, commit_top_backing_bs, &error_abort);
> > + bdrv_graph_wrunlock();
> > + bdrv_drained_end(commit_top_backing_bs);
> >
> > bdrv_unref(s->commit_top_bs);
> > bdrv_unref(top_bs);
> > @@ -425,7 +430,11 @@ fail:
> > /* commit_top_bs has to be replaced after deleting the block job,
> > * otherwise this would fail because of lack of permissions. */
> > if (commit_top_bs) {
> > + bdrv_graph_wrlock(top);
> > + bdrv_drained_begin(top);
> > bdrv_replace_node(commit_top_bs, top, &error_abort);
> > + bdrv_drained_end(top);
> > + bdrv_graph_wrunlock();
>
> ...but here you do it in the opposite order. Intentional?

No, this is actually wrong. bdrv_drained_begin() has a nested event
loop, and running a nested event loop while holding the graph lock can
cause deadlocks, so it's forbidden. Thanks for catching this!

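The ordering rule can be sketched with a small mock. Everything prefixed
with mock_ below is a hypothetical stand-in written for this example, not
the QEMU API: the mock drain merely records whether it was entered while the
mock writer lock was held, which is the situation described above as
forbidden.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the graph writer lock and drain. */
static bool mock_wrlock_held;
static int mock_violations;

static void mock_graph_wrlock(void)   { mock_wrlock_held = true; }
static void mock_graph_wrunlock(void) { mock_wrlock_held = false; }

/*
 * The real bdrv_drained_begin() runs a nested event loop. This mock only
 * counts the cases where it is entered with the writer lock held.
 */
static void mock_drained_begin(void)
{
    if (mock_wrlock_held) {
        mock_violations++;
    }
}

static void mock_drained_end(void)
{
}

/* Drain first, then lock: the order used in commit_abort() in the patch. */
static int replace_drain_outside_lock(void)
{
    mock_violations = 0;
    mock_drained_begin();
    mock_graph_wrlock();
    /* the bdrv_replace_node() call would go here */
    mock_graph_wrunlock();
    mock_drained_end();
    return mock_violations;
}

/* Lock first, then drain: the order flagged in the "fail:" path. */
static int replace_drain_inside_lock(void)
{
    mock_violations = 0;
    mock_graph_wrlock();
    mock_drained_begin();
    /* the bdrv_replace_node() call would go here */
    mock_drained_end();
    mock_graph_wrunlock();
    return mock_violations;
}
```

Both functions return the number of times the mock drain was entered under
the lock, so the first one returns 0 and the second returns 1.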
> > +++ b/tests/unit/test-bdrv-drain.c
> > @@ -2000,7 +2000,13 @@ static void do_test_replace_child_mid_drain(int old_drain_count,
> > parent_s->was_undrained = false;
> >
> > g_assert(parent_bs->quiesce_counter == old_drain_count);
> > + bdrv_drained_begin(old_child_bs);
> > + bdrv_drained_begin(new_child_bs);
> > + bdrv_graph_wrlock(NULL);
>
> Why is this locking on NULL instead of new_child_bs?

The parameter for bdrv_graph_wrlock() is a BDS whose AioContext is
locked and needs to be temporarily unlocked to avoid deadlocks. We don't
hold any AioContext lock here, so NULL is right.

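A rough model of that parameter, assuming only the behaviour described in
the paragraph above. SketchCtx, SketchNode and sketch_graph_wrlock() are
hypothetical names made up for this example; the real implementation is not
shown in this thread and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an AioContext with a recursive lock count. */
typedef struct SketchCtx {
    int lock_count;             /* how often the caller holds the lock */
    int released_during_wait;   /* how often the lock was dropped for a wait */
} SketchCtx;

/* Hypothetical stand-in for a node that lives in some context. */
typedef struct SketchNode {
    SketchCtx *ctx;
} SketchNode;

/*
 * If 'bs' is non-NULL, the caller is assumed to hold bs's context lock. It
 * is dropped while waiting for readers and taken again afterwards. With
 * NULL there is no context lock to drop.
 */
static void sketch_graph_wrlock(SketchNode *bs)
{
    SketchCtx *ctx = bs ? bs->ctx : NULL;

    if (ctx) {
        assert(ctx->lock_count > 0);   /* caller must really hold it */
        ctx->lock_count--;
        ctx->released_during_wait++;
    }

    /* ... waiting for all readers to leave would happen here ... */

    if (ctx) {
        ctx->lock_count++;
    }
}

/* Caller holds the context lock and passes the node. Returns 1 on success. */
static int sketch_lock_with_held_ctx(void)
{
    SketchCtx ctx = { .lock_count = 1, .released_during_wait = 0 };
    SketchNode node = { .ctx = &ctx };

    sketch_graph_wrlock(&node);
    return ctx.lock_count == 1 && ctx.released_during_wait == 1;
}

/* Caller holds no context lock, as in the unit test, and passes NULL. */
static int sketch_lock_without_ctx(void)
{
    sketch_graph_wrlock(NULL);
    return 1;
}
```

In this model, passing a node whose context lock is not actually held trips
the assertion, which is why the unit test passes NULL instead of
new_child_bs.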
> > bdrv_replace_node(old_child_bs, new_child_bs, &error_abort);
> > + bdrv_graph_wrunlock();
> > + bdrv_drained_end(new_child_bs);
> > + bdrv_drained_end(old_child_bs);
> > g_assert(parent_bs->quiesce_counter == new_drain_count);
> >
> > if (!old_drain_count && !new_drain_count) {

Since the two comments above are the only issues you found in the review,
I'll just fix them directly while applying the series.

Kevin