From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, qemu-devel@nongnu.org
Subject: [PULL 05/28] block: Introduce bdrv_schedule_unref()
Date: Fri, 15 Sep 2023 16:43:21 +0200
Message-ID: <20230915144344.238596-6-kwolf@redhat.com>
In-Reply-To: <20230915144344.238596-1-kwolf@redhat.com>

bdrv_unref() is called by a lot of places that need to hold the graph
lock (it naturally happens in the context of operations that change the
graph). However, bdrv_unref() takes the graph writer lock internally, so
it can't actually be called while already holding a graph lock without
causing a deadlock.
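
For illustration, the problematic pattern looks roughly like this (a
sketch, not code from this series; old_bs and the surrounding steps
stand in for any caller that modifies the graph):

    bdrv_graph_wrlock(NULL);    /* take the graph writer lock */
    /* ... detach old_bs from the graph ... */
    bdrv_unref(old_bs);         /* takes the writer lock again internally
                                 * and drains -> deadlock under the lock */
    bdrv_graph_wrunlock();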

bdrv_unref() also can't just become GRAPH_WRLOCK because it drains the
node before closing it, and draining requires that the graph is
unlocked.

The solution is to defer deleting the node until we don't hold the lock
any more and draining is possible again.

Note that keeping images open for longer than necessary can create
problems, too: you can't open an image again before it is really closed
(if image locking didn't prevent it, it would cause corruption).
Reopening an image immediately after closing it happens at least during
bdrv_open() and bdrv_co_create().

In order to solve this problem, make sure to run the deferred unref in
bdrv_graph_wrunlock(), i.e. the first possible place where we can drain
again. This is also why bdrv_schedule_unref() is marked GRAPH_WRLOCK.
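
The intended pattern then becomes the following (again a sketch with a
hypothetical caller; only the bdrv_* functions are real):

    bdrv_graph_wrlock(NULL);
    /* ... modify the graph, dropping the last reference to old_bs ... */
    bdrv_schedule_unref(old_bs);    /* defers bdrv_unref() in a one-shot BH */
    bdrv_graph_wrunlock();          /* releases the lock, then polls BHs, so
                                     * the deferred unref (and any drain in
                                     * bdrv_close()) runs here */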

The output of iotest 051 is updated because the additional polling
changes the order of HMP output, resulting in a new "(qemu)" prompt in
the test output that was previously on a separate line and filtered out.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230911094620.45040-6-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/block-global-state.h | 1 +
block.c | 17 +++++++++++++++++
block/graph-lock.c | 26 +++++++++++++++++++-------
tests/qemu-iotests/051.pc.out | 6 +++---
4 files changed, 40 insertions(+), 10 deletions(-)
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index f347199bff..e570799f85 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -224,6 +224,7 @@ void bdrv_img_create(const char *filename, const char *fmt,
void bdrv_ref(BlockDriverState *bs);
void no_coroutine_fn bdrv_unref(BlockDriverState *bs);
void coroutine_fn no_co_wrapper bdrv_co_unref(BlockDriverState *bs);
+void GRAPH_WRLOCK bdrv_schedule_unref(BlockDriverState *bs);
void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child);
BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
BlockDriverState *child_bs,
diff --git a/block.c b/block.c
index 9029ddd9ff..c8ac7cfac4 100644
--- a/block.c
+++ b/block.c
@@ -7044,6 +7044,23 @@ void bdrv_unref(BlockDriverState *bs)
}
}
+/*
+ * Release a BlockDriverState reference while holding the graph write lock.
+ *
+ * Calling bdrv_unref() directly is forbidden while holding the graph lock
+ * because bdrv_close() both involves polling and taking the graph lock
+ * internally. bdrv_schedule_unref() instead delays decreasing the refcount and
+ * possibly closing @bs until the graph lock is released.
+ */
+void bdrv_schedule_unref(BlockDriverState *bs)
+{
+ if (!bs) {
+ return;
+ }
+ aio_bh_schedule_oneshot(qemu_get_aio_context(),
+ (QEMUBHFunc *) bdrv_unref, bs);
+}
+
struct BdrvOpBlocker {
Error *reason;
QLIST_ENTRY(BdrvOpBlocker) list;
diff --git a/block/graph-lock.c b/block/graph-lock.c
index f357a2c0b1..58a799065f 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -163,17 +163,29 @@ void bdrv_graph_wrlock(BlockDriverState *bs)
void bdrv_graph_wrunlock(void)
{
GLOBAL_STATE_CODE();
- QEMU_LOCK_GUARD(&aio_context_list_lock);
assert(qatomic_read(&has_writer));
+ WITH_QEMU_LOCK_GUARD(&aio_context_list_lock) {
+ /*
+ * No need for memory barriers, this works in pair with
+ * the slow path of rdlock() and both take the lock.
+ */
+ qatomic_store_release(&has_writer, 0);
+
+ /* Wake up all coroutines that are waiting to read the graph */
+ qemu_co_enter_all(&reader_queue, &aio_context_list_lock);
+ }
+
/*
- * No need for memory barriers, this works in pair with
- * the slow path of rdlock() and both take the lock.
+     * Run any BHs that were scheduled during the wrlock section and that
+     * callers might expect to have finished (in particular, this is important
+     * for bdrv_schedule_unref()).
+     *
+     * Do this only after restarting coroutines so that nested event loops in
+     * BHs don't deadlock if their condition relies on the coroutine making
+     * progress.
*/
- qatomic_store_release(&has_writer, 0);
-
- /* Wake up all coroutine that are waiting to read the graph */
- qemu_co_enter_all(&reader_queue, &aio_context_list_lock);
+ aio_bh_poll(qemu_get_aio_context());
}
void coroutine_fn bdrv_graph_co_rdlock(void)
diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
index 4d4af5a486..7e10c5fa1b 100644
--- a/tests/qemu-iotests/051.pc.out
+++ b/tests/qemu-iotests/051.pc.out
@@ -169,11 +169,11 @@ QEMU_PROG: -device scsi-hd,drive=disk: Device needs media, but drive is empty
Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device ide-hd,drive=disk,share-rw=on
QEMU X.Y.Z monitor - type 'help' for more information
-QEMU_PROG: -device ide-hd,drive=disk,share-rw=on: Cannot change iothread of active block backend
+(qemu) QEMU_PROG: -device ide-hd,drive=disk,share-rw=on: Cannot change iothread of active block backend
Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device virtio-blk-pci,drive=disk,share-rw=on
QEMU X.Y.Z monitor - type 'help' for more information
-QEMU_PROG: -device virtio-blk-pci,drive=disk,share-rw=on: Cannot change iothread of active block backend
+(qemu) QEMU_PROG: -device virtio-blk-pci,drive=disk,share-rw=on: Cannot change iothread of active block backend
Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device lsi53c895a,id=lsi0 -device scsi-hd,bus=lsi0.0,drive=disk,share-rw=on
QEMU X.Y.Z monitor - type 'help' for more information
@@ -185,7 +185,7 @@ QEMU X.Y.Z monitor - type 'help' for more information
Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device virtio-blk-pci,drive=disk,iothread=thread0,share-rw=on
QEMU X.Y.Z monitor - type 'help' for more information
-QEMU_PROG: -device virtio-blk-pci,drive=disk,iothread=thread0,share-rw=on: Cannot change iothread of active block backend
+(qemu) QEMU_PROG: -device virtio-blk-pci,drive=disk,iothread=thread0,share-rw=on: Cannot change iothread of active block backend
Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device virtio-scsi,id=virtio-scsi1,iothread=thread0 -device scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on
QEMU X.Y.Z monitor - type 'help' for more information
--
2.41.0