qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Fiona Ebner <f.ebner@proxmox.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, den@virtuozzo.com,
	andrey.drobyshev@virtuozzo.com, hreitz@redhat.com,
	stefanha@redhat.com, eblake@redhat.com, jsnow@redhat.com,
	vsementsov@yandex-team.ru, xiechanglong.d@gmail.com,
	wencongyang2@huawei.com, berto@igalia.com, fam@euphon.net,
	ari@tuxera.com
Subject: [PATCH v4 10/48] block: move drain outside of bdrv_attach_child_common(_abort)()
Date: Fri, 30 May 2025 17:10:47 +0200	[thread overview]
Message-ID: <20250530151125.955508-11-f.ebner@proxmox.com> (raw)
In-Reply-To: <20250530151125.955508-1-f.ebner@proxmox.com>

This is part of resolving the deadlock mentioned in commit "block:
move draining out of bdrv_change_aio_context() and mark GRAPH_RDLOCK".

The function bdrv_attach_child_common_abort() is used only as the
abort callback in bdrv_attach_child_common_drv transactions, so the
tran_finalize() calls of such transactions need to be in drained
sections too.

All code paths are covered:
The bdrv_attach_child_common_drv transactions are only used in
bdrv_attach_child_common(), so it is enough to check callers of
bdrv_attach_child_common() following the transactions.

bdrv_attach_child_common() is called by:
1. bdrv_attach_child_noperm(), which does not finalize the
   transaction yet.
2. bdrv_root_attach_child(), where a drained section is introduced.

bdrv_attach_child_noperm() is called by:
1. bdrv_attach_child(), where a drained section is introduced.
2. bdrv_set_file_or_backing_noperm(), which does not finalize the
   transaction yet.
3. bdrv_append(), where a drained section is introduced.

bdrv_set_file_or_backing_noperm() is called by:
1. bdrv_set_backing_hd_drained(), where a drained section is
   introduced.
2. bdrv_reopen_parse_file_or_backing(), which does not finalize the
   transaction yet. Draining the old child bs currently happens under
   the graph lock there. This is replaced with an assertion, because
   the drain will be moved further up to the caller.

bdrv_reopen_parse_file_or_backing() is called by:
1. bdrv_reopen_prepare(), which does not finalize the transaction yet.

bdrv_reopen_prepare() is called by:
1. bdrv_reopen_multiple(), which does finalize the transaction. It is
   called after bdrv_reopen_queue(), which starts a drained section.
   The drained section ends, when bdrv_reopen_queue_free() is called
   at the end of bdrv_reopen_multiple().

This resolves all code paths.

The functions bdrv_set_backing_hd_drained(), bdrv_attach_child() and
bdrv_root_attach_child() run under the graph lock, so they are not
actually allowed to drain. This will be addressed in the following
commits.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v4:
* Where applicable, document requirement to drain all nodes from
  before calling the function until finalizing the transaction.

 block.c | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index 3aaacabf7f..46eb2fe449 100644
--- a/block.c
+++ b/block.c
@@ -3028,10 +3028,8 @@ static void GRAPH_WRLOCK bdrv_attach_child_common_abort(void *opaque)
     bdrv_replace_child_noperm(s->child, NULL);
 
     if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
-        bdrv_drain_all_begin();
         bdrv_try_change_aio_context_locked(bs, s->old_child_ctx, NULL,
                                            &error_abort);
-        bdrv_drain_all_end();
     }
 
     if (bdrv_child_get_parent_aio_context(s->child) != s->old_parent_ctx) {
@@ -3043,10 +3041,8 @@ static void GRAPH_WRLOCK bdrv_attach_child_common_abort(void *opaque)
 
         /* No need to visit `child`, because it has been detached already */
         visited = g_hash_table_new(NULL, NULL);
-        bdrv_drain_all_begin();
         ret = s->child->klass->change_aio_ctx(s->child, s->old_parent_ctx,
                                               visited, tran, &error_abort);
-        bdrv_drain_all_end();
         g_hash_table_destroy(visited);
 
         /* transaction is supposed to always succeed */
@@ -3075,6 +3071,9 @@ static TransactionActionDrv bdrv_attach_child_common_drv = {
  *
  * Both @parent_bs and @child_bs can move to a different AioContext in this
  * function.
+ *
+ * All block nodes must be drained before this function is called until after
+ * the transaction is finalized.
  */
 static BdrvChild * GRAPH_WRLOCK
 bdrv_attach_child_common(BlockDriverState *child_bs,
@@ -3118,10 +3117,8 @@ bdrv_attach_child_common(BlockDriverState *child_bs,
     parent_ctx = bdrv_child_get_parent_aio_context(new_child);
     if (child_ctx != parent_ctx) {
         Error *local_err = NULL;
-        bdrv_drain_all_begin();
         int ret = bdrv_try_change_aio_context_locked(child_bs, parent_ctx, NULL,
                                                      &local_err);
-        bdrv_drain_all_end();
 
         if (ret < 0 && child_class->change_aio_ctx) {
             Transaction *aio_ctx_tran = tran_new();
@@ -3129,11 +3126,9 @@ bdrv_attach_child_common(BlockDriverState *child_bs,
             bool ret_child;
 
             g_hash_table_add(visited, new_child);
-            bdrv_drain_all_begin();
             ret_child = child_class->change_aio_ctx(new_child, child_ctx,
                                                     visited, aio_ctx_tran,
                                                     NULL);
-            bdrv_drain_all_end();
             if (ret_child == true) {
                 error_free(local_err);
                 ret = 0;
@@ -3189,6 +3184,9 @@ bdrv_attach_child_common(BlockDriverState *child_bs,
  *
  * After calling this function, the transaction @tran may only be completed
  * while holding a writer lock for the graph.
+ *
+ * All block nodes must be drained before this function is called until after
+ * the transaction is finalized.
  */
 static BdrvChild * GRAPH_WRLOCK
 bdrv_attach_child_noperm(BlockDriverState *parent_bs,
@@ -3244,6 +3242,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
 
     GLOBAL_STATE_CODE();
 
+    bdrv_drain_all_begin();
     child = bdrv_attach_child_common(child_bs, child_name, child_class,
                                    child_role, perm, shared_perm, opaque,
                                    tran, errp);
@@ -3256,6 +3255,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
 
 out:
     tran_finalize(tran, ret);
+    bdrv_drain_all_end();
 
     bdrv_schedule_unref(child_bs);
 
@@ -3283,6 +3283,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
 
     GLOBAL_STATE_CODE();
 
+    bdrv_drain_all_begin();
     child = bdrv_attach_child_noperm(parent_bs, child_bs, child_name,
                                      child_class, child_role, tran, errp);
     if (!child) {
@@ -3297,6 +3298,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
 
 out:
     tran_finalize(tran, ret);
+    bdrv_drain_all_end();
 
     bdrv_schedule_unref(child_bs);
 
@@ -3465,6 +3467,9 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState *bs)
  *
  * After calling this function, the transaction @tran may only be completed
  * while holding a writer lock for the graph.
+ *
+ * All block nodes must be drained before this function is called until after
+ * the transaction is finalized.
  */
 static int GRAPH_WRLOCK
 bdrv_set_file_or_backing_noperm(BlockDriverState *parent_bs,
@@ -3573,6 +3578,7 @@ int bdrv_set_backing_hd_drained(BlockDriverState *bs,
         assert(bs->backing->bs->quiesce_counter > 0);
     }
 
+    bdrv_drain_all_begin();
     ret = bdrv_set_file_or_backing_noperm(bs, backing_hd, true, tran, errp);
     if (ret < 0) {
         goto out;
@@ -3581,6 +3587,7 @@ int bdrv_set_backing_hd_drained(BlockDriverState *bs,
     ret = bdrv_refresh_perms(bs, tran, errp);
 out:
     tran_finalize(tran, ret);
+    bdrv_drain_all_end();
     return ret;
 }
 
@@ -4721,6 +4728,9 @@ int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
  * Return 0 on success, otherwise return < 0 and set @errp.
  *
  * @reopen_state->bs can move to a different AioContext in this function.
+ *
+ * All block nodes must be drained before this function is called until after
+ * the transaction is finalized.
  */
 static int GRAPH_UNLOCKED
 bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
@@ -4814,7 +4824,7 @@ bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
 
     if (old_child_bs) {
         bdrv_ref(old_child_bs);
-        bdrv_drained_begin(old_child_bs);
+        assert(old_child_bs->quiesce_counter > 0);
     }
 
     bdrv_graph_rdunlock_main_loop();
@@ -4826,7 +4836,6 @@ bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
     bdrv_graph_wrunlock();
 
     if (old_child_bs) {
-        bdrv_drained_end(old_child_bs);
         bdrv_unref(old_child_bs);
     }
 
@@ -4855,6 +4864,9 @@ out_rdlock:
  *
  * After calling this function, the transaction @change_child_tran may only be
  * completed while holding a writer lock for the graph.
+ *
+ * All block nodes must be drained before this function is called until after
+ * the transaction is finalized.
  */
 static int GRAPH_UNLOCKED
 bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
@@ -5501,9 +5513,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
     assert(!bs_new->backing);
     bdrv_graph_rdunlock_main_loop();
 
-    bdrv_drained_begin(bs_top);
-    bdrv_drained_begin(bs_new);
-
+    bdrv_drain_all_begin();
     bdrv_graph_wrlock();
 
     child = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
@@ -5525,9 +5535,7 @@ out:
 
     bdrv_refresh_limits(bs_top, NULL, NULL);
     bdrv_graph_wrunlock();
-
-    bdrv_drained_end(bs_top);
-    bdrv_drained_end(bs_new);
+    bdrv_drain_all_end();
 
     return ret;
 }
-- 
2.39.5




  parent reply	other threads:[~2025-05-30 15:14 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-30 15:10 [PATCH v4 00/48] block: do not drain while holding the graph lock Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 01/48] block: remove outdated comments about AioContext locking Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 02/48] block: move drain outside of read-locked bdrv_reopen_queue_child() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 03/48] block/snapshot: move drain outside of read-locked bdrv_snapshot_delete() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 04/48] block: move drain outside of read-locked bdrv_inactivate_recurse() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 05/48] block: mark bdrv_parent_change_aio_context() GRAPH_RDLOCK Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 06/48] block: mark change_aio_ctx() callback and instances as GRAPH_RDLOCK(_PTR) Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 07/48] block: mark bdrv_child_change_aio_context() GRAPH_RDLOCK Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 08/48] block: move drain outside of bdrv_change_aio_context() and mark GRAPH_RDLOCK Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 09/48] block: move drain outside of bdrv_try_change_aio_context() Fiona Ebner
2025-05-30 15:10 ` Fiona Ebner [this message]
2025-05-30 15:10 ` [PATCH v4 11/48] block: move drain outside of bdrv_set_backing_hd_drained() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 12/48] block: move drain outside of bdrv_root_attach_child() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 13/48] block: move drain outside of bdrv_attach_child() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 14/48] block: move drain outside of quorum_add_child() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 15/48] block: move drain outside of bdrv_root_unref_child() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 16/48] block: move drain outside of quorum_del_child() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 17/48] blockdev: drain while unlocked in internal_snapshot_action() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 18/48] blockdev: drain while unlocked in external_snapshot_action() Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 19/48] block: mark bdrv_drained_begin() and friends as GRAPH_UNLOCKED Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 20/48] iotests/graph-changes-while-io: remove image file after test Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 21/48] iotests/graph-changes-while-io: add test case with removal of lower snapshot Fiona Ebner
2025-05-30 15:10 ` [PATCH v4 22/48] block/io: remove duplicate GLOBAL_STATE_CODE() in bdrv_do_drained_end() Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 23/48] block: never use atomics to access bs->quiesce_counter Fiona Ebner
2025-06-02 14:45   ` Fiona Ebner
2025-07-01 11:24     ` Kevin Wolf
2025-05-30 15:11 ` [PATCH v4 24/48] block: add bdrv_graph_wrlock_drained() convenience wrapper Fiona Ebner
2025-07-01 11:37   ` Kevin Wolf
2025-05-30 15:11 ` [PATCH v4 25/48] block/mirror: switch to bdrv_set_backing_hd_drained() variant Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 26/48] block/commit: " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 27/48] block: call bdrv_set_backing_hd() while unlocked in bdrv_open_backing_file() Fiona Ebner
2025-07-01 13:13   ` Kevin Wolf
2025-05-30 15:11 ` [PATCH v4 28/48] block: mark bdrv_set_backing_hd() as GRAPH_UNLOCKED Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 29/48] blockdev: avoid locking and draining multiple times in external_snapshot_abort() Fiona Ebner
2025-06-02  8:56   ` Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 30/48] block: drop wrapper for bdrv_set_backing_hd_drained() Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 31/48] block-backend: mark blk_drain_all() as GRAPH_UNLOCKED Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 32/48] block/snapshot: mark bdrv_all_delete_snapshot() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 33/48] block/stream: mark stream_prepare() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 34/48] block: mark bdrv_reopen_queue() and bdrv_reopen_multiple() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 35/48] block: mark bdrv_inactivate() as GRAPH_RDLOCK and move drain to callers Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 36/48] block: mark bdrv_inactivate_all() as GRAPH_UNLOCKED Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 37/48] block: mark blk_remove_bs() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 38/48] block: mark blk_drain() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 39/48] block-backend: mark blk_io_limits_disable() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 40/48] block/commit: mark commit_abort() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 41/48] block: mark bdrv_new() " Fiona Ebner
2025-07-01 16:55   ` Kevin Wolf
2025-05-30 15:11 ` [PATCH v4 42/48] block: mark bdrv_replace_child_bs() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 43/48] block: mark bdrv_insert_node() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 44/48] block: mark bdrv_drop_intermediate() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 45/48] block: mark bdrv_close_all() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 46/48] block: mark bdrv_close() " Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 47/48] block: mark bdrv_open_child_common() and its callers GRAPH_UNLOCKED Fiona Ebner
2025-05-30 15:11 ` [PATCH v4 48/48] blockjob: mark block_job_remove_all_bdrv() as GRAPH_UNLOCKED Fiona Ebner
2025-06-03 14:54 ` [PATCH v4 00/48] block: do not drain while holding the graph lock Kevin Wolf
2025-06-04  7:38   ` Fiona Ebner
2025-07-01 17:16 ` Kevin Wolf
2025-07-14 13:43   ` Kevin Wolf
2025-07-15 13:24     ` Fiona Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250530151125.955508-11-f.ebner@proxmox.com \
    --to=f.ebner@proxmox.com \
    --cc=andrey.drobyshev@virtuozzo.com \
    --cc=ari@tuxera.com \
    --cc=berto@igalia.com \
    --cc=den@virtuozzo.com \
    --cc=eblake@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@yandex-team.ru \
    --cc=wencongyang2@huawei.com \
    --cc=xiechanglong.d@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).