qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] fix a few blockjob-related deadlocks when using iothread
@ 2023-10-19 13:19 Fiona Ebner
  2023-10-19 13:19 ` [PATCH 1/3] blockjob: drop AioContext lock before calling bdrv_graph_wrlock() Fiona Ebner
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Fiona Ebner @ 2023-10-19 13:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, vsementsov, jsnow, hreitz, kwolf, pbonzini,
	t.lamprecht

The discussion leading up to this series can be found here:
https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg00729.html

Fiona Ebner (3):
  blockjob: drop AioContext lock before calling bdrv_graph_wrlock()
  block: avoid deadlock during bdrv_graph_wrlock() in bdrv_close()
  blockdev: mirror: avoid deadlock when using iothread in certain cases

 block.c    |  2 +-
 blockdev.c | 14 ++++++++++++--
 blockjob.c |  2 ++
 3 files changed, 15 insertions(+), 3 deletions(-)

-- 
2.39.2




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] blockjob: drop AioContext lock before calling bdrv_graph_wrlock()
  2023-10-19 13:19 [PATCH 0/3] fix a few blockjob-related deadlocks when using iothread Fiona Ebner
@ 2023-10-19 13:19 ` Fiona Ebner
  2023-10-19 13:19 ` [PATCH 2/3] block: avoid potential deadlock during bdrv_graph_wrlock() in bdrv_close() Fiona Ebner
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Fiona Ebner @ 2023-10-19 13:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, vsementsov, jsnow, hreitz, kwolf, pbonzini,
	t.lamprecht

Same rationale as in 31b2ddfea3 ("graph-lock: Unlock the AioContext
while polling"). Otherwise, a deadlock can happen.

The alternative would be to pass a BlockDriverState along to
bdrv_graph_wrlock(), but there is no BlockDriverState readily
available and it's also better conceptually, because the lock is held
for the job.

The function is always called with the job's AioContext lock held, via
one of the .abort, .clean, .free or .prepare job driver functions.
Thus, it's safe to drop it.

While mirror_exit_common() does hold a second AioContext lock while
calling block_job_remove_all_bdrv(), that is for the main thread's
AioContext and does not need to be dropped (bdrv_graph_wrlock(bs) also
skips dropping the lock if bdrv_get_aio_context(bs) ==
qemu_get_aio_context()).

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 blockjob.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/blockjob.c b/blockjob.c
index 807f992b59..953dc1b6dc 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -198,7 +198,9 @@ void block_job_remove_all_bdrv(BlockJob *job)
      * one to make sure that such a concurrent access does not attempt
      * to process an already freed BdrvChild.
      */
+    aio_context_release(job->job.aio_context);
     bdrv_graph_wrlock(NULL);
+    aio_context_acquire(job->job.aio_context);
     while (job->nodes) {
         GSList *l = job->nodes;
         BdrvChild *c = l->data;
-- 
2.39.2




^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] block: avoid potential deadlock during bdrv_graph_wrlock() in bdrv_close()
  2023-10-19 13:19 [PATCH 0/3] fix a few blockjob-related deadlocks when using iothread Fiona Ebner
  2023-10-19 13:19 ` [PATCH 1/3] blockjob: drop AioContext lock before calling bdrv_graph_wrlock() Fiona Ebner
@ 2023-10-19 13:19 ` Fiona Ebner
  2023-10-19 13:19 ` [PATCH 3/3] blockdev: mirror: avoid potential deadlock when using iothread Fiona Ebner
  2023-10-25 12:01 ` [PATCH 0/3] fix a few blockjob-related deadlocks " Kevin Wolf
  3 siblings, 0 replies; 6+ messages in thread
From: Fiona Ebner @ 2023-10-19 13:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, vsementsov, jsnow, hreitz, kwolf, pbonzini,
	t.lamprecht

by passing the BlockDriverState along, so the held AioContext can be
dropped before polling. See commit 31b2ddfea3 ("graph-lock: Unlock the
AioContext while polling") which introduced this functionality for
more information.

The only way to reach bdrv_close() is via bdrv_unref() and for calling
that the BlockDriverState's AioContext lock is supposed to be held.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Is it safe to expect callers of bdrv_unref() to hold the appropriate
AioContext lock?

 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index f9cf05ddcf..a527aa1a4c 100644
--- a/block.c
+++ b/block.c
@@ -5200,7 +5200,7 @@ static void bdrv_close(BlockDriverState *bs)
         bs->drv = NULL;
     }
 
-    bdrv_graph_wrlock(NULL);
+    bdrv_graph_wrlock(bs);
     QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
         bdrv_unref_child(bs, child);
     }
-- 
2.39.2




^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] blockdev: mirror: avoid potential deadlock when using iothread
  2023-10-19 13:19 [PATCH 0/3] fix a few blockjob-related deadlocks when using iothread Fiona Ebner
  2023-10-19 13:19 ` [PATCH 1/3] blockjob: drop AioContext lock before calling bdrv_graph_wrlock() Fiona Ebner
  2023-10-19 13:19 ` [PATCH 2/3] block: avoid potential deadlock during bdrv_graph_wrlock() in bdrv_close() Fiona Ebner
@ 2023-10-19 13:19 ` Fiona Ebner
  2023-10-20  8:46   ` Fiona Ebner
  2023-10-25 12:01 ` [PATCH 0/3] fix a few blockjob-related deadlocks " Kevin Wolf
  3 siblings, 1 reply; 6+ messages in thread
From: Fiona Ebner @ 2023-10-19 13:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, vsementsov, jsnow, hreitz, kwolf, pbonzini,
	t.lamprecht

The bdrv_getlength() function is a generated co-wrapper and uses
AIO_WAIT_WHILE() to wait for the spawned coroutine. AIO_WAIT_WHILE()
expects the lock to be acquired exactly once.

This can happen when the source node is explicitly specified as the
@replaces parameter or if there is a filter on top of the source node.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 blockdev.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a01c62596b..877e3a26d4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2968,6 +2968,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
 
     if (replaces) {
         BlockDriverState *to_replace_bs;
+        AioContext *aio_context;
         AioContext *replace_aio_context;
         int64_t bs_size, replace_size;
 
@@ -2982,10 +2983,19 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
             return;
         }
 
+        aio_context = bdrv_get_aio_context(bs);
         replace_aio_context = bdrv_get_aio_context(to_replace_bs);
-        aio_context_acquire(replace_aio_context);
+        /*
+         * bdrv_getlength() is a co-wrapper and uses AIO_WAIT_WHILE. Be sure not
+         * to acquire the same AioContext twice.
+         */
+        if (replace_aio_context != aio_context) {
+            aio_context_acquire(replace_aio_context);
+        }
         replace_size = bdrv_getlength(to_replace_bs);
-        aio_context_release(replace_aio_context);
+        if (replace_aio_context != aio_context) {
+            aio_context_release(replace_aio_context);
+        }
 
         if (replace_size < 0) {
             error_setg_errno(errp, -replace_size,
-- 
2.39.2




^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 3/3] blockdev: mirror: avoid potential deadlock when using iothread
  2023-10-19 13:19 ` [PATCH 3/3] blockdev: mirror: avoid potential deadlock when using iothread Fiona Ebner
@ 2023-10-20  8:46   ` Fiona Ebner
  0 siblings, 0 replies; 6+ messages in thread
From: Fiona Ebner @ 2023-10-20  8:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, vsementsov, jsnow, hreitz, kwolf, pbonzini,
	t.lamprecht

Am 19.10.23 um 15:19 schrieb Fiona Ebner:
> The bdrv_getlength() function is a generated co-wrapper and uses
> AIO_WAIT_WHILE() to wait for the spawned coroutine. AIO_WAIT_WHILE()
> expects the lock to be acquired exactly once.
> 
> This can happen when the source node is explicitly specified as the
> @replaces parameter or if there is a filter on top of the source node.

Correction: this should read "or if the source node is a filter node".



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/3] fix a few blockjob-related deadlocks when using iothread
  2023-10-19 13:19 [PATCH 0/3] fix a few blockjob-related deadlocks when using iothread Fiona Ebner
                   ` (2 preceding siblings ...)
  2023-10-19 13:19 ` [PATCH 3/3] blockdev: mirror: avoid potential deadlock when using iothread Fiona Ebner
@ 2023-10-25 12:01 ` Kevin Wolf
  3 siblings, 0 replies; 6+ messages in thread
From: Kevin Wolf @ 2023-10-25 12:01 UTC (permalink / raw)
  To: Fiona Ebner
  Cc: qemu-devel, qemu-block, vsementsov, jsnow, hreitz, pbonzini,
	t.lamprecht

Am 19.10.2023 um 15:19 hat Fiona Ebner geschrieben:
> The discussion leading up to this series can be found here:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg00729.html
> 
> Fiona Ebner (3):
>   blockjob: drop AioContext lock before calling bdrv_graph_wrlock()
>   block: avoid deadlock during bdrv_graph_wrlock() in bdrv_close()
>   blockdev: mirror: avoid deadlock when using iothread in certain cases

Thanks, applied to the block branch.

Kevin



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-10-25 12:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-19 13:19 [PATCH 0/3] fix a few blockjob-related deadlocks when using iothread Fiona Ebner
2023-10-19 13:19 ` [PATCH 1/3] blockjob: drop AioContext lock before calling bdrv_graph_wrlock() Fiona Ebner
2023-10-19 13:19 ` [PATCH 2/3] block: avoid potential deadlock during bdrv_graph_wrlock() in bdrv_close() Fiona Ebner
2023-10-19 13:19 ` [PATCH 3/3] blockdev: mirror: avoid potential deadlock when using iothread Fiona Ebner
2023-10-20  8:46   ` Fiona Ebner
2023-10-25 12:01 ` [PATCH 0/3] fix a few blockjob-related deadlocks " Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).