qemu-devel.nongnu.org archive mirror
From: Kevin Wolf <kwolf@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org, hreitz@redhat.com,
	den@openvz.org, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] block: bdrv_set_backing_hd(): use drained section
Date: Thu, 27 Jan 2022 15:13:02 +0100	[thread overview]
Message-ID: <YfKobj+ZpzIxLasz@redhat.com>
In-Reply-To: <4aa42545-e0da-2a15-110e-3d7b2d8cd273@virtuozzo.com>

Am 25.01.2022 um 11:12 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 25.01.2022 12:24, Paolo Bonzini wrote:
> > On 1/24/22 18:37, Vladimir Sementsov-Ogievskiy wrote:
> > > Graph modifications should be done in a drained section. The
> > > stream_prepare() handler of the block-stream job calls
> > > bdrv_set_backing_hd() without a drained section, so it's
> > > theoretically possible that some I/O request interleaves with the
> > > graph modification and uses outdated pointers to removed block
> > > nodes.
> > > 
> > > Some other callers also use bdrv_set_backing_hd() without caring
> > > about drained sections, so it seems best to create the drained
> > > section inside bdrv_set_backing_hd() itself.
> > 
> > Emanuele has a similar patch in his series to protect all graph
> > modifications with drains:
> > 
> > @@ -3456,6 +3478,11 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
> > 
> >       assert(qemu_in_main_thread());
> > 
> > +    bdrv_subtree_drained_begin_unlocked(bs);
> > +    if (backing_hd) {
> > +        bdrv_subtree_drained_begin_unlocked(backing_hd);
> > +    }
> > +
> >       ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
> >       if (ret < 0) {
> >           goto out;
> > @@ -3464,6 +3491,10 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
> >       ret = bdrv_refresh_perms(bs, errp);
> >   out:
> >       tran_finalize(tran, ret);
> > +    if (backing_hd) {
> > +        bdrv_subtree_drained_end_unlocked(backing_hd);
> > +    }
> > +    bdrv_subtree_drained_end_unlocked(bs);
> > 
> >       return ret;
> >   }
> > 
> > so the idea at least is correct.
> > 
> > I don't object to fixing this independently, but please check
> > 1) if a subtree drain would be more appropriate, 2) whether
> > backing_hd should be drained as well, 3) whether we're guaranteed
> > to be holding the AioContext lock as required for
> > bdrv_drained_begin/end.
> > 
> 
> Hmm.
> 
> 1. A subtree drain of backing_hd alone will not help: bs itself is not
> drained, so we may still have in-flight requests in bs touching the
> old bs->backing.
> 
> 2. I think a non-recursive drain of bs is enough. We modify only the
> bs node, so that is the one we should drain. backing_hd itself is not
> modified; if backing_hd participates in some other backing chain, that
> chain isn't touched, and in-flight requests there are not broken by
> the modification, so why drain it? The same goes for the old
> bs->backing and the other children of bs. We don't care about
> in-flight requests in the subtree that are not part of a request in
> bs. So if there are no in-flight requests in bs, we can modify bs
> without caring about requests in the subtree.

I agree on both points. Emanuele's patch seems to be doing unnecessary
work there.
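
Just to spell it out, the minimal version following that reasoning would
look something like this (untested sketch, draining only bs itself
rather than the subtree, and dropping the drain of backing_hd):

    int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
                            Error **errp)
    {
        int ret;
        Transaction *tran = tran_new();

        assert(qemu_in_main_thread());

        /* Drain only bs: it is the only node whose children change, so
         * quiescing its in-flight requests is sufficient. */
        bdrv_drained_begin(bs);

        ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
        if (ret < 0) {
            goto out;
        }

        ret = bdrv_refresh_perms(bs, errp);
    out:
        tran_finalize(tran, ret);
        bdrv_drained_end(bs);

        return ret;
    }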

> 3. Jobs are bound to an AioContext, so I believe they take care to
> hold the AioContext lock. For example, job_prepare may be called
> through job_exit(), which does aio_context_acquire(job->aio_context),
> or through job_cancel(), which also seems to be called under
> aio_context_acquire(). So in general we do take care of it, and of
> course bdrv_set_backing_hd() must be called with the AioContext lock
> held; if some code path doesn't do that, it's a bug.

We do have some code that does exactly that: in the main thread, we
often don't hold the AioContext lock, but only the BQL. I find it quite
ugly, but it works as long as the node is in the main AioContext.

One path where this is relevant is bdrv_open_inherit() ->
bdrv_open_backing_file() -> bdrv_set_backing_hd(). This one is harmless
because we know that we just created the new node in the main
AioContext.

All the other paths seem to come either from jobs (which take the
AioContext lock, as you explained) or directly from monitor commands,
which I just checked take the lock as well.
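
(For reference, the caller pattern I'd expect from such a job or monitor
path, sketched from memory rather than copied from a specific call site:

    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_context_acquire(ctx);
    bdrv_set_backing_hd(bs, backing_hd, errp);
    aio_context_release(ctx);

The bdrv_drained_begin/end pair inside bdrv_set_backing_hd() relies on
this lock being held by its caller.)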

Kevin




Thread overview: 7+ messages
2022-01-24 17:37 [PATCH] block: bdrv_set_backing_hd(): use drained section Vladimir Sementsov-Ogievskiy
2022-01-25  9:24 ` Paolo Bonzini
2022-01-25 10:12   ` Vladimir Sementsov-Ogievskiy
2022-01-27 14:13     ` Kevin Wolf [this message]
2022-01-28 14:12       ` Emanuele Giuseppe Esposito
2022-02-01 11:00         ` Vladimir Sementsov-Ogievskiy
2022-01-27 14:14 ` Kevin Wolf
