From: Fiona Ebner <f.ebner@proxmox.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, den@virtuozzo.com,
andrey.drobyshev@virtuozzo.com, hreitz@redhat.com,
stefanha@redhat.com, eblake@redhat.com, jsnow@redhat.com,
vsementsov@yandex-team.ru, xiechanglong.d@gmail.com,
wencongyang2@huawei.com, berto@igalia.com, fam@euphon.net,
ari@tuxera.com
Subject: [PATCH v3 00/24] block: do not drain while holding the graph lock
Date: Mon, 26 May 2025 15:21:16 +0200 [thread overview]
Message-ID: <20250526132140.1641377-1-f.ebner@proxmox.com> (raw)
Previous discussions:
v2: [0]
v1: [1]
Changes in v3:
* Also add bdrv_drain_all_begin() and bdrv_drain_all() as
GRAPH_UNLOCKED.
* Extend drained section in bdrv_try_change_aio_context() until after
the transaction is finalized.
* Also mark definition of static bdrv_change_aio_context() as
GRAPH_RDLOCK, not only the declaration.
* Add comments for bdrv_{child,parent}_change_aio_context() and
change_aio_ctx().
* Improve commit messages: typos/language/clarification.
Changes in v2:
* Split the big patch moving the drain outside of
bdrv_change_aio_context(), mark functions along the way with graph
lock annotations.
* In {internal,external}_snapshot_action, check that associated bs did
not change after drain and re-acquiring the lock.
* Improve error handling using goto where appropriate.
* Add bdrv_graph_wrlock_drained() convenience wrapper rather than
adding a flag argument.
* Don't use atomics to access bs->quiesce_counter field, add a patch
to adapt the two existing places that used atomics.
* Re-use 'top' image for graph-changes-while-io test case and rename
the other image to 'mid'. Remove the image files after the test.
* Use "must be" instead of "needs to be" in documentation, use single
line comments where possible.
* Remove yet another outdated comment.
* I did not add Kevin's R-b for the patch marking bdrv_drained_begin()
GRAPH_RDLOCK, as the earlier patches/preconditions changed.
This series is an attempt to fix a deadlock issue reported by Andrey
here [2].
bdrv_drained_begin() polls and is not allowed to be called with the
block graph lock held. Mark the function as GRAPH_UNLOCKED.
This alone does not catch the issue reported by Andrey, because there
is a bdrv_graph_rdunlock_main_loop() before bdrv_drained_begin() in
the function bdrv_change_aio_context(). That unlock is of course
ineffective if the exclusive lock is held, but it prevents TSA from
finding the issue.
Thus the bdrv_drained_begin() call from inside
bdrv_change_aio_context() needs to be moved up the call stack before
acquiring the locks. This is the bulk of the series.
Granular draining is not trivially possible, because many of the
affected functions can recursively call themselves.
In place where bdrv_drained_begin() calls were removed, assertions
are added, checking the quiesced_counter to ensure that the nodes
already got drained further up in the call stack.
NOTE:
there are pre-existing test failures on current master, e.g. '240' for
all formats, '295 296 inactive-node-nbd luks-detached-header' for luks
and 'mirror-sparse' for raw. For me, the failures do not change after
this series. The 'mirror-sparse' failure is setup-specific and there
is already a patch available [3].
[0]: https://lore.kernel.org/qemu-devel/20250520103012.424311-1-f.ebner@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/20250508140936.3344485-1-f.ebner@proxmox.com/
[2]: https://lore.kernel.org/qemu-devel/73839c04-7616-407e-b057-80ca69e63f51@virtuozzo.com/
[3]: https://lore.kernel.org/qemu-devel/20250523163041.2548675-7-eblake@redhat.com/
Andrey Drobyshev (1):
iotests/graph-changes-while-io: add test case with removal of lower
snapshot
Fiona Ebner (23):
block: remove outdated comments about AioContext locking
block: move drain outside of read-locked bdrv_reopen_queue_child()
block/snapshot: move drain outside of read-locked
bdrv_snapshot_delete()
block: move drain outside of read-locked bdrv_inactivate_recurse()
block: mark bdrv_parent_change_aio_context() GRAPH_RDLOCK
block: mark change_aio_ctx() callback and instances as
GRAPH_RDLOCK(_PTR)
block: mark bdrv_child_change_aio_context() GRAPH_RDLOCK
block: move drain outside of bdrv_change_aio_context() and mark
GRAPH_RDLOCK
block: move drain outside of bdrv_try_change_aio_context()
block: move drain outside of bdrv_attach_child_common(_abort)()
block: move drain outside of bdrv_set_backing_hd_drained()
block: move drain outside of bdrv_root_attach_child()
block: move drain outside of bdrv_attach_child()
block: move drain outside of quorum_add_child()
block: move drain outside of bdrv_root_unref_child()
block: move drain outside of quorum_del_child()
blockdev: drain while unlocked in internal_snapshot_action()
blockdev: drain while unlocked in external_snapshot_action()
block: mark bdrv_drained_begin() and friends as GRAPH_UNLOCKED
iotests/graph-changes-while-io: remove image file after test
block/io: remove duplicate GLOBAL_STATE_CODE() in
bdrv_do_drained_end()
block: never use atomics to access bs->quiesce_counter
block: add bdrv_graph_wrlock_drained() convenience wrapper
block.c | 184 +++++++++++-------
block/backup.c | 2 +-
block/blklogwrites.c | 4 +-
block/blkverify.c | 2 +-
block/block-backend.c | 10 +-
block/commit.c | 2 +-
block/graph-lock.c | 40 +++-
block/io.c | 8 +-
block/mirror.c | 2 +-
block/qcow2.c | 2 +-
block/quorum.c | 4 +-
block/replication.c | 4 +-
block/snapshot.c | 28 +--
block/stream.c | 8 +-
block/vmdk.c | 10 +-
blockdev.c | 78 ++++++--
blockjob.c | 10 +-
include/block/block-global-state.h | 19 +-
include/block/block-io.h | 2 +-
include/block/block_int-common.h | 20 +-
include/block/graph-lock.h | 11 ++
qemu-img.c | 2 +
.../qemu-iotests/tests/graph-changes-while-io | 102 +++++++++-
.../tests/graph-changes-while-io.out | 4 +-
tests/unit/test-bdrv-drain.c | 22 +--
tests/unit/test-bdrv-graph-mod.c | 10 +-
26 files changed, 412 insertions(+), 178 deletions(-)
--
2.39.5
next reply other threads:[~2025-05-26 13:22 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-26 13:21 Fiona Ebner [this message]
2025-05-26 13:21 ` [PATCH v3 01/24] block: remove outdated comments about AioContext locking Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 02/24] block: move drain outside of read-locked bdrv_reopen_queue_child() Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 03/24] block/snapshot: move drain outside of read-locked bdrv_snapshot_delete() Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 04/24] block: move drain outside of read-locked bdrv_inactivate_recurse() Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 05/24] block: mark bdrv_parent_change_aio_context() GRAPH_RDLOCK Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 06/24] block: mark change_aio_ctx() callback and instances as GRAPH_RDLOCK(_PTR) Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 07/24] block: mark bdrv_child_change_aio_context() GRAPH_RDLOCK Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 08/24] block: move drain outside of bdrv_change_aio_context() and mark GRAPH_RDLOCK Fiona Ebner
2025-05-27 15:11 ` Kevin Wolf
2025-05-26 13:21 ` [PATCH v3 09/24] block: move drain outside of bdrv_try_change_aio_context() Fiona Ebner
2025-05-27 15:12 ` Kevin Wolf
2025-05-26 13:21 ` [PATCH v3 10/24] block: move drain outside of bdrv_attach_child_common(_abort)() Fiona Ebner
2025-05-27 15:23 ` Kevin Wolf
2025-05-28 11:41 ` Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 11/24] block: move drain outside of bdrv_set_backing_hd_drained() Fiona Ebner
2025-05-27 15:29 ` Kevin Wolf
2025-05-28 13:09 ` Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 12/24] block: move drain outside of bdrv_root_attach_child() Fiona Ebner
2025-05-27 17:16 ` Kevin Wolf
2025-05-26 13:21 ` [PATCH v3 13/24] block: move drain outside of bdrv_attach_child() Fiona Ebner
2025-05-27 17:20 ` Kevin Wolf
2025-05-26 13:21 ` [PATCH v3 14/24] block: move drain outside of quorum_add_child() Fiona Ebner
2025-05-27 17:23 ` Kevin Wolf
2025-05-26 13:21 ` [PATCH v3 15/24] block: move drain outside of bdrv_root_unref_child() Fiona Ebner
2025-05-27 17:50 ` Kevin Wolf
2025-05-26 13:21 ` [PATCH v3 16/24] block: move drain outside of quorum_del_child() Fiona Ebner
2025-05-27 17:52 ` Kevin Wolf
2025-05-26 13:21 ` [PATCH v3 17/24] blockdev: drain while unlocked in internal_snapshot_action() Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 18/24] blockdev: drain while unlocked in external_snapshot_action() Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 19/24] block: mark bdrv_drained_begin() and friends as GRAPH_UNLOCKED Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 20/24] iotests/graph-changes-while-io: remove image file after test Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 21/24] iotests/graph-changes-while-io: add test case with removal of lower snapshot Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 22/24] block/io: remove duplicate GLOBAL_STATE_CODE() in bdrv_do_drained_end() Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 23/24] block: never use atomics to access bs->quiesce_counter Fiona Ebner
2025-05-26 13:21 ` [PATCH v3 24/24] block: add bdrv_graph_wrlock_drained() convenience wrapper Fiona Ebner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250526132140.1641377-1-f.ebner@proxmox.com \
--to=f.ebner@proxmox.com \
--cc=andrey.drobyshev@virtuozzo.com \
--cc=ari@tuxera.com \
--cc=berto@igalia.com \
--cc=den@virtuozzo.com \
--cc=eblake@redhat.com \
--cc=fam@euphon.net \
--cc=hreitz@redhat.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@yandex-team.ru \
--cc=wencongyang2@huawei.com \
--cc=xiechanglong.d@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).