From: Kevin Wolf <kwolf@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: qemu-block@nongnu.org, eesposit@redhat.com, stefanha@redhat.com,
hreitz@redhat.com, pbonzini@redhat.com, qemu-devel@nongnu.org
Subject: Re: [PATCH 08/13] stream: Replace subtree drain with a single node drain
Date: Thu, 10 Nov 2022 18:27:46 +0100 [thread overview]
Message-ID: <Y200ktGQEUKHjZgW@redhat.com> (raw)
In-Reply-To: <ad3b4250-a0e3-10fa-aea4-4e98a6e10ac5@yandex-team.ru>
Am 10.11.2022 um 12:25 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 11/10/22 13:16, Kevin Wolf wrote:
> > Am 09.11.2022 um 17:52 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > On 11/8/22 15:37, Kevin Wolf wrote:
> > > > The subtree drain was introduced in commit b1e1af394d9 as a way to avoid
> > > > graph changes between finding the base node and changing the block graph
> > > > as necessary on completion of the image streaming job.
> > > >
> > > > The block graph could change between these two points because
> > > > bdrv_set_backing_hd() first drains the parent node, which involved
> > > > polling and can do anything.
> > > >
> > > > Subtree draining was an imperfect way to make this less likely (because
> > > > with it, fewer callbacks are called during this window). Everyone agreed
> > > > that it's not really the right solution, and it was only committed as a
> > > > stopgap solution.
> > > >
> > > > This replaces the subtree drain with a solution that simply drains the
> > > > parent node before we try to find the base node, and then call a version
> > > > of bdrv_set_backing_hd() that doesn't drain, but just asserts that the
> > > > parent node is already drained.
> > > >
> > > > This way, any graph changes caused by draining happen before we start
> > > > looking at the graph and things stay consistent between finding the base
> > > > node and changing the graph.
> > > >
> > > > Signed-off-by: Kevin Wolf<kwolf@redhat.com>
> > > [..]
> > >
> > > > base = bdrv_filter_or_cow_bs(s->above_base);
> > > > - if (base) {
> > > > - bdrv_ref(base);
> > > > - }
> > > > -
> > > > unfiltered_base = bdrv_skip_filters(base);
> > > > if (bdrv_cow_child(unfiltered_bs)) {
> > > > @@ -82,7 +85,7 @@ static int stream_prepare(Job *job)
> > > > }
> > > > }
> > > > - bdrv_set_backing_hd(unfiltered_bs, base, &local_err);
> > > > + bdrv_set_backing_hd_drained(unfiltered_bs, base, &local_err);
> > > > ret = bdrv_change_backing_file(unfiltered_bs, base_id, base_fmt, false);
> > > If we have yield points / polls during bdrv_set_backing_hd_drained()
> > > and bdrv_change_backing_file(), it's still bad and another
> > > graph-modifying operation may interleave. But b1e1af394d9 reports only
> > > polling in bdrv_set_backing_hd(), so I think it's OK to not care about
> > > other cases.
> > At this point in the series, bdrv_replace_child_noperm() can indeed
> > still poll. I'm not sure how bad it is, but at this point we're already
> > reconfiguring the graph with two specific nodes and somehow this poll
> > hasn't caused problems in the past. Anyway, at the end of the series,
> > there isn't be any polling left in bdrv_set_backing_hd_drained(), as far
> > as I can tell.
> >
> > bdrv_change_backing_file() will certainly poll because it does I/O to
> > the image file. However, the change to the graph is completed at that
> > point, so I don't think it's a problem. Do you think it would be worth
> > putting a comment before bdrv_change_backing_file() that mentions that
> > the graph may change again from here on, but we've completed the graph
> > change?
> >
>
> Comment won't hurt. I think theoretically that's possible that we
>
> 1. change the graph
> 2. yield in bdrv_change_backing_file
> 3. switch to another graph-modifying operation, change backing file and do another bdrv_change_backing_file()
> 4. return to bdrv_change_backing_file() of [2] and write wrong backing file to metadata
>
> And the only solution for such things that I can imagine is a kind of
> global graph-modifying lock, which should be held around the whole
> graph modifying operation, including writing metadata.
Actually, I don't think this is the case. The problem that you get here
is just that we haven't really defined what happens when you get two
concurrent .bdrv_change_backing_file requests. To solve this, you don't
need to lock the whole graph, you just need to order the updates at the
block driver level instead of doing them in parallel, so that we know
that the last .bdrv_change_backing_file call wins. I think taking
s->lock in qcow2 would already achieve this (but still lock more than is
strictly necessary).
> Probably, we shouldn't care until we have real bug reports of it.
> Actually I hope that the only user who start stream and commit jobs in
> parallel on same backing-chain is our iotests :)
Yes, it sounds very theoretical. :-)
Kevin
next prev parent reply other threads:[~2022-11-10 17:28 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-08 12:37 [PATCH 00/13] block: Simplify drain Kevin Wolf
2022-11-08 12:37 ` [PATCH 01/13] qed: Don't yield in bdrv_qed_co_drain_begin() Kevin Wolf
2022-11-09 9:21 ` Vladimir Sementsov-Ogievskiy
2022-11-09 9:27 ` Vladimir Sementsov-Ogievskiy
2022-11-09 12:22 ` Kevin Wolf
2022-11-09 21:49 ` Stefan Hajnoczi
2022-11-10 11:07 ` Kevin Wolf
2022-11-11 11:14 ` Emanuele Giuseppe Esposito
2022-11-14 18:16 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 02/13] test-bdrv-drain: Don't yield in .bdrv_co_drained_begin/end() Kevin Wolf
2022-11-09 10:50 ` Vladimir Sementsov-Ogievskiy
2022-11-09 12:28 ` Kevin Wolf
2022-11-09 13:45 ` Vladimir Sementsov-Ogievskiy
2022-11-11 11:14 ` Emanuele Giuseppe Esposito
2022-11-14 18:16 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 03/13] block: Revert .bdrv_drained_begin/end to non-coroutine_fn Kevin Wolf
2022-11-09 14:29 ` Vladimir Sementsov-Ogievskiy
2022-11-09 22:13 ` Stefan Hajnoczi
2022-11-11 11:14 ` Emanuele Giuseppe Esposito
2022-11-14 18:17 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 04/13] block: Remove drained_end_counter Kevin Wolf
2022-11-09 14:44 ` Vladimir Sementsov-Ogievskiy
2022-11-11 16:37 ` Kevin Wolf
2022-11-11 11:15 ` Emanuele Giuseppe Esposito
2022-11-14 18:19 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 05/13] block: Inline bdrv_drain_invoke() Kevin Wolf
2022-11-09 15:34 ` Vladimir Sementsov-Ogievskiy
2022-11-10 19:48 ` Stefan Hajnoczi
2022-11-11 11:15 ` Emanuele Giuseppe Esposito
2022-11-14 18:19 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 06/13] block: Drain invidual nodes during reopen Kevin Wolf
2022-11-09 16:00 ` Vladimir Sementsov-Ogievskiy
2022-11-11 16:54 ` Kevin Wolf
2022-11-08 12:37 ` [PATCH 07/13] block: Don't use subtree drains in bdrv_drop_intermediate() Kevin Wolf
2022-11-09 16:18 ` Vladimir Sementsov-Ogievskiy
2022-11-14 18:20 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 08/13] stream: Replace subtree drain with a single node drain Kevin Wolf
2022-11-09 16:52 ` Vladimir Sementsov-Ogievskiy
2022-11-10 10:16 ` Kevin Wolf
2022-11-10 11:25 ` Vladimir Sementsov-Ogievskiy
2022-11-10 17:27 ` Kevin Wolf [this message]
2022-11-14 18:21 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 09/13] block: Remove subtree drains Kevin Wolf
2022-11-09 17:22 ` Vladimir Sementsov-Ogievskiy
2022-11-14 18:22 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 10/13] block: Call drain callbacks only once Kevin Wolf
2022-11-09 18:05 ` Vladimir Sementsov-Ogievskiy
2022-11-14 12:32 ` Kevin Wolf
2022-11-09 18:54 ` Vladimir Sementsov-Ogievskiy
2022-11-14 18:23 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 11/13] block: Remove ignore_bds_parents parameter from drain functions Kevin Wolf
2022-11-09 18:57 ` Vladimir Sementsov-Ogievskiy
2022-11-14 18:23 ` Hanna Reitz
2022-11-08 12:37 ` [PATCH 12/13] block: Don't poll in bdrv_replace_child_noperm() Kevin Wolf
2022-11-11 11:21 ` Emanuele Giuseppe Esposito
2022-11-14 20:22 ` Hanna Reitz
2022-11-17 13:27 ` Kevin Wolf
2022-11-08 12:37 ` [PATCH 13/13] block: Remove poll parameter from bdrv_parent_drained_begin_single() Kevin Wolf
2022-11-14 20:24 ` Hanna Reitz
2022-11-10 20:13 ` [PATCH 00/13] block: Simplify drain Stefan Hajnoczi
2022-11-11 11:23 ` Emanuele Giuseppe Esposito
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y200ktGQEUKHjZgW@redhat.com \
--to=kwolf@redhat.com \
--cc=eesposit@redhat.com \
--cc=hreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).