From: Kevin Wolf <kwolf@redhat.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, den@virtuozzo.com,
andrey.drobyshev@virtuozzo.com, hreitz@redhat.com,
stefanha@redhat.com, eblake@redhat.com, jsnow@redhat.com,
vsementsov@yandex-team.ru
Subject: Re: [PATCH 09/11] block: move drain out of bdrv_change_aio_context()
Date: Wed, 14 May 2025 21:45:32 +0200 [thread overview]
Message-ID: <aCTy3NoJE2zSY47O@redhat.com> (raw)
In-Reply-To: <20250508140936.3344485-10-f.ebner@proxmox.com>
Am 08.05.2025 um 16:09 hat Fiona Ebner geschrieben:
> While bdrv_drained_begin() is now marked as GRAPH_UNLOCKED, TSA is not
> able to catch a remaining instance where the function is called with
> the graph lock held in bdrv_change_aio_context(). This is because the
> call is preceded by a bdrv_graph_rdunlock_main_loop(), but for callers
> that hold the exclusive lock that does not actually release the lock.
And releasing it wouldn't be correct anyway.
> In combination with block-stream, there is a deadlock that can happen
> because of this [0]. In particular, it can happen that
> main thread IO thread
> 1. acquires write lock
> in blk_co_do_preadv_part():
> 2. have non-zero blk->in_flight
> 3. try to acquire read lock
> 4. begin drain
>
> Steps 3 and 4 might be switched. Draining will poll and get stuck,
> because it will see the non-zero in_flight counter. But the IO thread
> will not make any progress either, because it cannot acquire the read
> lock.
>
> More granular draining is not trivially possible, because
> bdrv_change_aio_context() can recursively call itself e.g. via
> bdrv_child_change_aio_context().
>
> [0]: https://lore.kernel.org/qemu-devel/73839c04-7616-407e-b057-80ca69e63f51@virtuozzo.com/
>
> Reported-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> block.c | 91 ++++++++++++++++++++----------
> block/backup.c | 2 +
> block/blklogwrites.c | 4 ++
> block/blkverify.c | 2 +
> block/block-backend.c | 4 ++
> block/commit.c | 4 ++
> block/mirror.c | 5 ++
> block/qcow2.c | 2 +
> block/quorum.c | 4 ++
> block/replication.c | 7 +++
> block/snapshot.c | 2 +
> block/stream.c | 10 ++--
> block/vmdk.c | 10 ++++
> blockdev.c | 17 ++++--
> blockjob.c | 6 ++
> include/block/block-global-state.h | 8 ++-
> tests/unit/test-bdrv-drain.c | 18 ++++++
> tests/unit/test-bdrv-graph-mod.c | 10 ++++
> 18 files changed, 164 insertions(+), 42 deletions(-)
>
> diff --git a/block.c b/block.c
> index 10052027cd..d84ac2f394 100644
> --- a/block.c
> +++ b/block.c
> @@ -1720,12 +1720,14 @@ bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv, const char *node_name,
> open_failed:
> bs->drv = NULL;
>
> + bdrv_drain_all_begin();
> bdrv_graph_wrlock();
> if (bs->file != NULL) {
> bdrv_unref_child(bs, bs->file);
> assert(!bs->file);
> }
> bdrv_graph_wrunlock();
> + bdrv_drain_all_end();
I feel this patch is doing too much at once, pushing drain up three
levels to all callers of bdrv_unref_child().
Please split it, doing only one level per patch, i.e. the first patch
only marks bdrv_change_aio_context() GRAPH_RDLOCK, removes the double
locking and moves the drain (which becomes drain_all) to
bdrv_try_change_aio_context(). Then the next patch marks that one
GRAPH_RDLOCK, etc.
This will make it more obvious that each step is done correctly. For
example, checking this hunk I noticed that bdrv_unref_child() doesn't
document that child->bs must be drained, but doing everything at once,
it's not obvious if and which other functions have the same problem.
> @@ -7702,14 +7699,12 @@ static bool bdrv_change_aio_context(BlockDriverState *bs, AioContext *ctx,
> }
>
> /*
> - * Change bs's and recursively all of its parents' and children's AioContext
> - * to the given new context, returning an error if that isn't possible.
> - *
> - * If ignore_child is not NULL, that child (and its subgraph) will not
> - * be touched.
> + * Use bdrv_try_change_aio_context() or bdrv_try_change_aio_context_locked().
> */
> -int bdrv_try_change_aio_context(BlockDriverState *bs, AioContext *ctx,
> - BdrvChild *ignore_child, Error **errp)
> +static int bdrv_try_change_aio_context_common(BlockDriverState *bs,
> + AioContext *ctx,
> + BdrvChild *ignore_child,
> + Error **errp)
This is exactly the same as bdrv_try_change_aio_context_locked(), so
this should be called bdrv_try_change_aio_context_locked() and the
wrapper below should go away.
Don't forget to add TSA annotations to functions whose interface you
touch (but I already mentioned this above).
> {
> Transaction *tran;
> GHashTable *visited;
I'll defer review of the rest of the patch to the next version when it's
split. As it is, it's too hard to understand where each specific change
comes from. (That is, I could probably work it out for each individual
hunk, but I'd lose the big picture of what I'm really looking at and if
the changes are complete.)
Kevin
next prev parent reply other threads:[~2025-05-14 19:46 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-08 14:09 [RFC 00/11] do not drain while holding the graph lock Fiona Ebner
2025-05-08 14:09 ` [PATCH 01/11] block: remove outdated comments about AioContext locking Fiona Ebner
2025-05-14 16:22 ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 02/11] block: move drain outside of read-locked bdrv_reopen_queue_child() Fiona Ebner
2025-05-14 16:36 ` Kevin Wolf
2025-05-15 11:48 ` Fiona Ebner
2025-05-15 12:28 ` Kevin Wolf
2025-05-15 13:41 ` Fiona Ebner
2025-05-08 14:09 ` [PATCH 03/11] block/snapshot: move drain outside of read-locked bdrv_snapshot_delete() Fiona Ebner
2025-05-14 16:52 ` Kevin Wolf
2025-05-15 12:03 ` Fiona Ebner
2025-05-08 14:09 ` [PATCH 04/11] block: drain while unlocked in bdrv_reopen_parse_file_or_backing() Fiona Ebner
2025-05-14 16:59 ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 05/11] block: move drain outside of read-locked bdrv_inactivate_recurse() Fiona Ebner
2025-05-14 17:06 ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 06/11] blockdev: drain while unlocked in internal_snapshot_action() Fiona Ebner
2025-05-14 17:26 ` Kevin Wolf
2025-05-19 12:37 ` Fiona Ebner
2025-05-08 14:09 ` [PATCH 07/11] blockdev: drain while unlocked in external_snapshot_action() Fiona Ebner
2025-05-08 14:09 ` [PATCH 08/11] block: mark bdrv_drained_begin() as GRAPH_UNLOCKED Fiona Ebner
2025-05-14 17:50 ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 09/11] block: move drain out of bdrv_change_aio_context() Fiona Ebner
2025-05-14 19:45 ` Kevin Wolf [this message]
2025-05-08 14:09 ` [PATCH 10/11] block/graph-lock: add drain flag to bdrv_graph_wr{, un}lock Fiona Ebner
2025-05-14 19:54 ` [PATCH 10/11] block/graph-lock: add drain flag to bdrv_graph_wr{,un}lock Kevin Wolf
2025-05-19 12:10 ` Fiona Ebner
2025-05-20 6:09 ` Fiona Ebner
2025-05-20 8:42 ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 11/11] iotests/graph-changes-while-io: add test case with removal of lower snapshot Fiona Ebner
2025-05-14 20:14 ` Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aCTy3NoJE2zSY47O@redhat.com \
--to=kwolf@redhat.com \
--cc=andrey.drobyshev@virtuozzo.com \
--cc=den@virtuozzo.com \
--cc=eblake@redhat.com \
--cc=f.ebner@proxmox.com \
--cc=hreitz@redhat.com \
--cc=jsnow@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).