All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-block@nongnu.org, stefanha@redhat.com, eesposit@redhat.com,
	pbonzini@redhat.com, vsementsov@yandex-team.ru,
	qemu-devel@nongnu.org
Subject: Re: [PATCH 16/24] block: Mark bdrv_replace_node() GRAPH_WRLOCK
Date: Fri, 3 Nov 2023 11:32:39 +0100	[thread overview]
Message-ID: <ZUTMRxsxLbw4OePX@redhat.com> (raw)
In-Reply-To: <3dndhoo6fq2pes3dldplykyg7svuwyfntix5txvotr3zpklnly@gf6yi37ijtmm>

Am 27.10.2023 um 23:33 hat Eric Blake geschrieben:
> On Fri, Oct 27, 2023 at 05:53:25PM +0200, Kevin Wolf wrote:
> > Instead of taking the writer lock internally, require callers to already
> > hold it when calling bdrv_replace_node(). Its callers may already want
> > to hold the graph lock and so wouldn't be able to call functions that
> > take it internally.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  include/block/block-global-state.h |  6 ++++--
> >  block.c                            | 26 +++++++-------------------
> >  block/commit.c                     | 13 +++++++++++--
> >  block/mirror.c                     | 26 ++++++++++++++++----------
> >  blockdev.c                         |  5 +++++
> >  tests/unit/test-bdrv-drain.c       |  6 ++++++
> >  tests/unit/test-bdrv-graph-mod.c   | 13 +++++++++++--
> >  7 files changed, 60 insertions(+), 35 deletions(-)
> > 
> > +++ b/block.c
> > @@ -5484,25 +5484,7 @@ out:
> >  int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
> >                        Error **errp)
> >  {
> > -    int ret;
> > -
> > -    GLOBAL_STATE_CODE();
> > -
> > -    /* Make sure that @from doesn't go away until we have successfully attached
> > -     * all of its parents to @to. */
> 
> Useful comment that you just moved here in the previous patch...
> 
> > -    bdrv_ref(from);
> > -    bdrv_drained_begin(from);
> > -    bdrv_drained_begin(to);
> > -    bdrv_graph_wrlock(to);
> > -
> > -    ret = bdrv_replace_node_common(from, to, true, false, errp);
> > -
> > -    bdrv_graph_wrunlock();
> > -    bdrv_drained_end(to);
> > -    bdrv_drained_end(from);
> > -    bdrv_unref(from);
> > -
> > -    return ret;
> > +    return bdrv_replace_node_common(from, to, true, false, errp);
> >  }
> >  
> >  int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
> > @@ -5717,9 +5699,15 @@ BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *options,
> >          goto fail;
> >      }
> >  
> > +    bdrv_ref(bs);
> 
> ...but now it is gone.  Intentional?

I figured it was obvious enough that bdrv_ref() is always called to make
sure that the node doesn't go away too early, but I can add it back.

> >      bdrv_drained_begin(bs);
> > +    bdrv_drained_begin(new_node_bs);
> > +    bdrv_graph_wrlock(new_node_bs);
> >      ret = bdrv_replace_node(bs, new_node_bs, errp);
> > +    bdrv_graph_wrunlock();
> > +    bdrv_drained_end(new_node_bs);
> >      bdrv_drained_end(bs);
> > +    bdrv_unref(bs);
> >  
> >      if (ret < 0) {
> >          error_prepend(errp, "Could not replace node: ");
> > diff --git a/block/commit.c b/block/commit.c
> > index d92af02ead..2fecdce86f 100644
> > --- a/block/commit.c
> > +++ b/block/commit.c
> > @@ -68,6 +68,7 @@ static void commit_abort(Job *job)
> >  {
> >      CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
> >      BlockDriverState *top_bs = blk_bs(s->top);
> > +    BlockDriverState *commit_top_backing_bs;
> >  
> >      if (s->chain_frozen) {
> >          bdrv_graph_rdlock_main_loop();
> > @@ -94,8 +95,12 @@ static void commit_abort(Job *job)
> >       * XXX Can (or should) we somehow keep 'consistent read' blocked even
> >       * after the failed/cancelled commit job is gone? If we already wrote
> >       * something to base, the intermediate images aren't valid any more. */
> > -    bdrv_replace_node(s->commit_top_bs, s->commit_top_bs->backing->bs,
> > -                      &error_abort);
> > +    commit_top_backing_bs = s->commit_top_bs->backing->bs;
> > +    bdrv_drained_begin(commit_top_backing_bs);
> > +    bdrv_graph_wrlock(commit_top_backing_bs);
> 
> Here, and elsewhere in the patch, drained_begin/end is outside
> wr(un)lock...
> 
> > +    bdrv_replace_node(s->commit_top_bs, commit_top_backing_bs, &error_abort);
> > +    bdrv_graph_wrunlock();
> > +    bdrv_drained_end(commit_top_backing_bs);
> >  
> >      bdrv_unref(s->commit_top_bs);
> >      bdrv_unref(top_bs);
> > @@ -425,7 +430,11 @@ fail:
> >      /* commit_top_bs has to be replaced after deleting the block job,
> >       * otherwise this would fail because of lack of permissions. */
> >      if (commit_top_bs) {
> > +        bdrv_graph_wrlock(top);
> > +        bdrv_drained_begin(top);
> >          bdrv_replace_node(commit_top_bs, top, &error_abort);
> > +        bdrv_drained_end(top);
> > +        bdrv_graph_wrunlock();
> 
> ...but here you do it in the opposite order.  Intentional?

No, this is actually wrong. bdrv_drained_begin() has a nested event
loop, and running a nested event loop while holding the graph lock can
cause deadlocks, so it's forbidden. Thanks for catching this!

> > +++ b/tests/unit/test-bdrv-drain.c
> > @@ -2000,7 +2000,13 @@ static void do_test_replace_child_mid_drain(int old_drain_count,
> >      parent_s->was_undrained = false;
> >  
> >      g_assert(parent_bs->quiesce_counter == old_drain_count);
> > +    bdrv_drained_begin(old_child_bs);
> > +    bdrv_drained_begin(new_child_bs);
> > +    bdrv_graph_wrlock(NULL);
> 
> Why is this locking on NULL instead of new_child_bs?

The parameter for bdrv_graph_wrlock() is a BDS whose AioContext is
locked and needs to be temporarily unlocked to avoid deadlocks. We don't
hold any AioContext lock here, so NULL is right.

> >      bdrv_replace_node(old_child_bs, new_child_bs, &error_abort);
> > +    bdrv_graph_wrunlock();
> > +    bdrv_drained_end(new_child_bs);
> > +    bdrv_drained_end(old_child_bs);
> >      g_assert(parent_bs->quiesce_counter == new_drain_count);
> >  
> >      if (!old_drain_count && !new_drain_count) {

Since the two comments above are the only thing you found in the review,
I'll just directly fix them while applying the series.

Kevin



  reply	other threads:[~2023-11-03 10:33 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-27 15:53 [PATCH 00/24] block: Graph locking part 6 (bs->file/backing) Kevin Wolf
2023-10-27 15:53 ` [PATCH 01/24] block: Mark bdrv_probe_blocksizes() and callers GRAPH_RDLOCK Kevin Wolf
2023-10-27 19:40   ` Eric Blake
2023-10-27 15:53 ` [PATCH 02/24] block: Mark bdrv_has_zero_init() " Kevin Wolf
2023-10-27 19:59   ` Eric Blake
2023-10-27 15:53 ` [PATCH 03/24] block: Mark bdrv_filter_bs() " Kevin Wolf
2023-10-27 20:02   ` Eric Blake
2023-10-27 20:45     ` Eric Blake
2023-10-27 15:53 ` [PATCH 04/24] block: Mark bdrv_root_attach_child() GRAPH_WRLOCK Kevin Wolf
2023-10-27 20:22   ` Eric Blake
2023-11-03  9:45     ` Kevin Wolf
2023-11-03 12:33       ` Eric Blake
2023-10-27 15:53 ` [PATCH 05/24] block: Mark block_job_add_bdrv() GRAPH_WRLOCK Kevin Wolf
2023-10-27 20:27   ` Eric Blake
2023-10-27 15:53 ` [PATCH 06/24] block: Mark bdrv_filter_or_cow_bs() and callers GRAPH_RDLOCK Kevin Wolf
2023-10-27 20:33   ` Eric Blake
2023-10-27 15:53 ` [PATCH 07/24] block: Mark bdrv_skip_implicit_filters() " Kevin Wolf
2023-10-27 20:37   ` Eric Blake
2023-10-27 15:53 ` [PATCH 08/24] block: Mark bdrv_skip_filters() " Kevin Wolf
2023-10-27 20:52   ` Eric Blake
2023-10-27 15:53 ` [PATCH 09/24] block: Mark bdrv_(un)freeze_backing_chain() " Kevin Wolf
2023-10-27 21:00   ` Eric Blake
2023-11-03  9:54     ` Kevin Wolf
2023-10-27 15:53 ` [PATCH 10/24] block: Mark bdrv_chain_contains() " Kevin Wolf
2023-10-27 21:17   ` Eric Blake
2023-10-27 15:53 ` [PATCH 11/24] block: Mark bdrv_filter_child() " Kevin Wolf
2023-10-27 21:19   ` Eric Blake
2023-10-27 15:53 ` [PATCH 12/24] block: Mark bdrv_cow_child() " Kevin Wolf
2023-10-27 21:20   ` Eric Blake
2023-10-27 15:53 ` [PATCH 13/24] block: Mark bdrv_set_backing_hd_drained() GRAPH_WRLOCK Kevin Wolf
2023-10-27 21:22   ` Eric Blake
2023-10-27 15:53 ` [PATCH 14/24] block: Inline bdrv_set_backing_noperm() Kevin Wolf
2023-10-27 21:23   ` Eric Blake
2023-10-27 15:53 ` [PATCH 15/24] block: Mark bdrv_replace_node_common() GRAPH_WRLOCK Kevin Wolf
2023-10-27 21:27   ` Eric Blake
2023-10-27 15:53 ` [PATCH 16/24] block: Mark bdrv_replace_node() GRAPH_WRLOCK Kevin Wolf
2023-10-27 21:33   ` Eric Blake
2023-11-03 10:32     ` Kevin Wolf [this message]
2023-11-03 12:37       ` Eric Blake
2023-10-27 15:53 ` [PATCH 17/24] block: Protect bs->backing with graph_lock Kevin Wolf
2023-10-27 21:46   ` Eric Blake
2023-10-27 15:53 ` [PATCH 18/24] blkverify: Add locking for request_fn Kevin Wolf
2023-10-30 13:51   ` Eric Blake
2023-10-27 15:53 ` [PATCH 19/24] block: Introduce bdrv_co_change_backing_file() Kevin Wolf
2023-10-30 13:57   ` Eric Blake
2023-11-03 10:33     ` Kevin Wolf
2023-11-03 12:38       ` Eric Blake
2023-10-27 15:53 ` [PATCH 20/24] block: Add missing GRAPH_RDLOCK annotations Kevin Wolf
2023-10-30 21:19   ` Eric Blake
2023-10-27 15:53 ` [PATCH 21/24] qcow2: Take locks for accessing bs->file Kevin Wolf
2023-10-30 21:25   ` Eric Blake
2023-10-27 15:53 ` [PATCH 22/24] vhdx: " Kevin Wolf
2023-10-30 21:26   ` Eric Blake
2023-10-27 15:53 ` [PATCH 23/24] block: Take graph lock for most of .bdrv_open Kevin Wolf
2023-10-30 21:34   ` Eric Blake
2023-11-03 10:05     ` Kevin Wolf
2023-10-27 15:53 ` [PATCH 24/24] block: Protect bs->file with graph_lock Kevin Wolf
2023-10-30 21:37   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZUTMRxsxLbw4OePX@redhat.com \
    --to=kwolf@redhat.com \
    --cc=eblake@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.