qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-block@nongnu.org, stefanha@redhat.com, eesposit@redhat.com,
	pbonzini@redhat.com, vsementsov@yandex-team.ru,
	qemu-devel@nongnu.org
Subject: Re: [PATCH 16/24] block: Mark bdrv_replace_node() GRAPH_WRLOCK
Date: Fri, 3 Nov 2023 11:32:39 +0100	[thread overview]
Message-ID: <ZUTMRxsxLbw4OePX@redhat.com> (raw)
In-Reply-To: <3dndhoo6fq2pes3dldplykyg7svuwyfntix5txvotr3zpklnly@gf6yi37ijtmm>

Am 27.10.2023 um 23:33 hat Eric Blake geschrieben:
> On Fri, Oct 27, 2023 at 05:53:25PM +0200, Kevin Wolf wrote:
> > Instead of taking the writer lock internally, require callers to already
> > hold it when calling bdrv_replace_node(). Its callers may already want
> > to hold the graph lock and so wouldn't be able to call functions that
> > take it internally.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  include/block/block-global-state.h |  6 ++++--
> >  block.c                            | 26 +++++++-------------------
> >  block/commit.c                     | 13 +++++++++++--
> >  block/mirror.c                     | 26 ++++++++++++++++----------
> >  blockdev.c                         |  5 +++++
> >  tests/unit/test-bdrv-drain.c       |  6 ++++++
> >  tests/unit/test-bdrv-graph-mod.c   | 13 +++++++++++--
> >  7 files changed, 60 insertions(+), 35 deletions(-)
> > 
> > +++ b/block.c
> > @@ -5484,25 +5484,7 @@ out:
> >  int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
> >                        Error **errp)
> >  {
> > -    int ret;
> > -
> > -    GLOBAL_STATE_CODE();
> > -
> > -    /* Make sure that @from doesn't go away until we have successfully attached
> > -     * all of its parents to @to. */
> 
> Useful comment that you just moved here in the previous patch...
> 
> > -    bdrv_ref(from);
> > -    bdrv_drained_begin(from);
> > -    bdrv_drained_begin(to);
> > -    bdrv_graph_wrlock(to);
> > -
> > -    ret = bdrv_replace_node_common(from, to, true, false, errp);
> > -
> > -    bdrv_graph_wrunlock();
> > -    bdrv_drained_end(to);
> > -    bdrv_drained_end(from);
> > -    bdrv_unref(from);
> > -
> > -    return ret;
> > +    return bdrv_replace_node_common(from, to, true, false, errp);
> >  }
> >  
> >  int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
> > @@ -5717,9 +5699,15 @@ BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *options,
> >          goto fail;
> >      }
> >  
> > +    bdrv_ref(bs);
> 
> ...but now it is gone.  Intentional?

I figured it was obvious enough that bdrv_ref() is always called to make
sure that the node doesn't go away too early, but I can add it back.

> >      bdrv_drained_begin(bs);
> > +    bdrv_drained_begin(new_node_bs);
> > +    bdrv_graph_wrlock(new_node_bs);
> >      ret = bdrv_replace_node(bs, new_node_bs, errp);
> > +    bdrv_graph_wrunlock();
> > +    bdrv_drained_end(new_node_bs);
> >      bdrv_drained_end(bs);
> > +    bdrv_unref(bs);
> >  
> >      if (ret < 0) {
> >          error_prepend(errp, "Could not replace node: ");
> > diff --git a/block/commit.c b/block/commit.c
> > index d92af02ead..2fecdce86f 100644
> > --- a/block/commit.c
> > +++ b/block/commit.c
> > @@ -68,6 +68,7 @@ static void commit_abort(Job *job)
> >  {
> >      CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
> >      BlockDriverState *top_bs = blk_bs(s->top);
> > +    BlockDriverState *commit_top_backing_bs;
> >  
> >      if (s->chain_frozen) {
> >          bdrv_graph_rdlock_main_loop();
> > @@ -94,8 +95,12 @@ static void commit_abort(Job *job)
> >       * XXX Can (or should) we somehow keep 'consistent read' blocked even
> >       * after the failed/cancelled commit job is gone? If we already wrote
> >       * something to base, the intermediate images aren't valid any more. */
> > -    bdrv_replace_node(s->commit_top_bs, s->commit_top_bs->backing->bs,
> > -                      &error_abort);
> > +    commit_top_backing_bs = s->commit_top_bs->backing->bs;
> > +    bdrv_drained_begin(commit_top_backing_bs);
> > +    bdrv_graph_wrlock(commit_top_backing_bs);
> 
> Here, and elsewhere in the patch, drained_begin/end is outside
> wr(un)lock...
> 
> > +    bdrv_replace_node(s->commit_top_bs, commit_top_backing_bs, &error_abort);
> > +    bdrv_graph_wrunlock();
> > +    bdrv_drained_end(commit_top_backing_bs);
> >  
> >      bdrv_unref(s->commit_top_bs);
> >      bdrv_unref(top_bs);
> > @@ -425,7 +430,11 @@ fail:
> >      /* commit_top_bs has to be replaced after deleting the block job,
> >       * otherwise this would fail because of lack of permissions. */
> >      if (commit_top_bs) {
> > +        bdrv_graph_wrlock(top);
> > +        bdrv_drained_begin(top);
> >          bdrv_replace_node(commit_top_bs, top, &error_abort);
> > +        bdrv_drained_end(top);
> > +        bdrv_graph_wrunlock();
> 
> ...but here you do it in the opposite order.  Intentional?

No, this is actually wrong. bdrv_drained_begin() has a nested event
loop, and running a nested event loop while holding the graph lock can
cause deadlocks, so it's forbidden. Thanks for catching this!

> > +++ b/tests/unit/test-bdrv-drain.c
> > @@ -2000,7 +2000,13 @@ static void do_test_replace_child_mid_drain(int old_drain_count,
> >      parent_s->was_undrained = false;
> >  
> >      g_assert(parent_bs->quiesce_counter == old_drain_count);
> > +    bdrv_drained_begin(old_child_bs);
> > +    bdrv_drained_begin(new_child_bs);
> > +    bdrv_graph_wrlock(NULL);
> 
> Why is this locking on NULL instead of new_child_bs?

The parameter for bdrv_graph_wrlock() is a BDS whose AioContext is
locked and needs to be temporarily unlocked to avoid deadlocks. We don't
hold any AioContext lock here, so NULL is right.

> >      bdrv_replace_node(old_child_bs, new_child_bs, &error_abort);
> > +    bdrv_graph_wrunlock();
> > +    bdrv_drained_end(new_child_bs);
> > +    bdrv_drained_end(old_child_bs);
> >      g_assert(parent_bs->quiesce_counter == new_drain_count);
> >  
> >      if (!old_drain_count && !new_drain_count) {

Since the two comments above are the only thing you found in the review,
I'll just directly fix them while applying the series.

Kevin



  reply	other threads:[~2023-11-03 10:33 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-27 15:53 [PATCH 00/24] block: Graph locking part 6 (bs->file/backing) Kevin Wolf
2023-10-27 15:53 ` [PATCH 01/24] block: Mark bdrv_probe_blocksizes() and callers GRAPH_RDLOCK Kevin Wolf
2023-10-27 19:40   ` Eric Blake
2023-10-27 15:53 ` [PATCH 02/24] block: Mark bdrv_has_zero_init() " Kevin Wolf
2023-10-27 19:59   ` Eric Blake
2023-10-27 15:53 ` [PATCH 03/24] block: Mark bdrv_filter_bs() " Kevin Wolf
2023-10-27 20:02   ` Eric Blake
2023-10-27 20:45     ` Eric Blake
2023-10-27 15:53 ` [PATCH 04/24] block: Mark bdrv_root_attach_child() GRAPH_WRLOCK Kevin Wolf
2023-10-27 20:22   ` Eric Blake
2023-11-03  9:45     ` Kevin Wolf
2023-11-03 12:33       ` Eric Blake
2023-10-27 15:53 ` [PATCH 05/24] block: Mark block_job_add_bdrv() GRAPH_WRLOCK Kevin Wolf
2023-10-27 20:27   ` Eric Blake
2023-10-27 15:53 ` [PATCH 06/24] block: Mark bdrv_filter_or_cow_bs() and callers GRAPH_RDLOCK Kevin Wolf
2023-10-27 20:33   ` Eric Blake
2023-10-27 15:53 ` [PATCH 07/24] block: Mark bdrv_skip_implicit_filters() " Kevin Wolf
2023-10-27 20:37   ` Eric Blake
2023-10-27 15:53 ` [PATCH 08/24] block: Mark bdrv_skip_filters() " Kevin Wolf
2023-10-27 20:52   ` Eric Blake
2023-10-27 15:53 ` [PATCH 09/24] block: Mark bdrv_(un)freeze_backing_chain() " Kevin Wolf
2023-10-27 21:00   ` Eric Blake
2023-11-03  9:54     ` Kevin Wolf
2023-10-27 15:53 ` [PATCH 10/24] block: Mark bdrv_chain_contains() " Kevin Wolf
2023-10-27 21:17   ` Eric Blake
2023-10-27 15:53 ` [PATCH 11/24] block: Mark bdrv_filter_child() " Kevin Wolf
2023-10-27 21:19   ` Eric Blake
2023-10-27 15:53 ` [PATCH 12/24] block: Mark bdrv_cow_child() " Kevin Wolf
2023-10-27 21:20   ` Eric Blake
2023-10-27 15:53 ` [PATCH 13/24] block: Mark bdrv_set_backing_hd_drained() GRAPH_WRLOCK Kevin Wolf
2023-10-27 21:22   ` Eric Blake
2023-10-27 15:53 ` [PATCH 14/24] block: Inline bdrv_set_backing_noperm() Kevin Wolf
2023-10-27 21:23   ` Eric Blake
2023-10-27 15:53 ` [PATCH 15/24] block: Mark bdrv_replace_node_common() GRAPH_WRLOCK Kevin Wolf
2023-10-27 21:27   ` Eric Blake
2023-10-27 15:53 ` [PATCH 16/24] block: Mark bdrv_replace_node() GRAPH_WRLOCK Kevin Wolf
2023-10-27 21:33   ` Eric Blake
2023-11-03 10:32     ` Kevin Wolf [this message]
2023-11-03 12:37       ` Eric Blake
2023-10-27 15:53 ` [PATCH 17/24] block: Protect bs->backing with graph_lock Kevin Wolf
2023-10-27 21:46   ` Eric Blake
2023-10-27 15:53 ` [PATCH 18/24] blkverify: Add locking for request_fn Kevin Wolf
2023-10-30 13:51   ` Eric Blake
2023-10-27 15:53 ` [PATCH 19/24] block: Introduce bdrv_co_change_backing_file() Kevin Wolf
2023-10-30 13:57   ` Eric Blake
2023-11-03 10:33     ` Kevin Wolf
2023-11-03 12:38       ` Eric Blake
2023-10-27 15:53 ` [PATCH 20/24] block: Add missing GRAPH_RDLOCK annotations Kevin Wolf
2023-10-30 21:19   ` Eric Blake
2023-10-27 15:53 ` [PATCH 21/24] qcow2: Take locks for accessing bs->file Kevin Wolf
2023-10-30 21:25   ` Eric Blake
2023-10-27 15:53 ` [PATCH 22/24] vhdx: " Kevin Wolf
2023-10-30 21:26   ` Eric Blake
2023-10-27 15:53 ` [PATCH 23/24] block: Take graph lock for most of .bdrv_open Kevin Wolf
2023-10-30 21:34   ` Eric Blake
2023-11-03 10:05     ` Kevin Wolf
2023-10-27 15:53 ` [PATCH 24/24] block: Protect bs->file with graph_lock Kevin Wolf
2023-10-30 21:37   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZUTMRxsxLbw4OePX@redhat.com \
    --to=kwolf@redhat.com \
    --cc=eblake@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).