public inbox for qemu-devel@nongnu.org
* [PATCH] block/nfs: Do not enter coroutine from CB
@ 2026-01-02 15:32 Hanna Czenczek
  2026-03-06 13:32 ` Kevin Wolf
  0 siblings, 1 reply; 2+ messages in thread
From: Hanna Czenczek @ 2026-01-02 15:32 UTC
  To: qemu-block
  Cc: qemu-devel, Hanna Czenczek, Peter Lieven, Kevin Wolf, qemu-stable

The reasoning I gave for why it would be safe to call aio_co_wake()
despite holding the mutex was wrong: It is true that the current request
will not re-acquire the mutex, but a subsequent request in the same
coroutine can.  Because the mutex is a non-coroutine mutex, this will
result in a deadlock.

Therefore, we must either not enter the coroutine here (only scheduling
it), or release the mutex around aio_co_wake().  I opt for the former,
as it is the behavior prior to the offending commit, and so seems safe
to do.

Fixes: deb35c129b859b9bec70fd42f856a0b7c1dc6e61
       ("nfs: Run co BH CB in the coroutine’s AioContext")
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/2622#note_2965097035
Cc: qemu-stable@nongnu.org
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
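Note (illustration only, not part of the commit): the deadlock described
above can be modelled outside of QEMU with a plain pthread mutex.  The
sketch below is a simplified model under stated assumptions, not the
block/nfs.c code: Client, resume_and_submit_next(), completion_cb() and
service() are hypothetical names, a boolean flag stands in for scheduling
the coroutine, and pthread_mutex_trylock() is used so that the problematic
case is counted instead of actually hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the NFS client state; not a QEMU type. */
typedef struct Client {
    pthread_mutex_t mutex;  /* plain, non-recursive mutex */
    bool pending;           /* a wake-up was deferred ("scheduled") */
    int would_deadlock;     /* follow-up requests that found the mutex held */
    int submitted;          /* follow-up requests that got the mutex */
} Client;

static void client_init(Client *c)
{
    pthread_mutex_init(&c->mutex, NULL);  /* default attributes */
    c->pending = false;
    c->would_deadlock = 0;
    c->submitted = 0;
}

/*
 * Models the request coroutine resuming and immediately submitting a
 * follow-up request, which needs the mutex.  A blocking lock would hang
 * if the mutex is still held by this thread, so trylock is used to
 * record that case instead.
 */
static void resume_and_submit_next(Client *c)
{
    if (pthread_mutex_trylock(&c->mutex) == EBUSY) {
        c->would_deadlock++;
        return;
    }
    c->submitted++;
    pthread_mutex_unlock(&c->mutex);
}

/* Models the completion callback; the caller holds the mutex. */
static void completion_cb(Client *c, bool enter_directly)
{
    if (enter_directly) {
        /* Like entering the coroutine right away, mutex still held. */
        resume_and_submit_next(c);
    } else {
        /* Like only scheduling the coroutine for later. */
        c->pending = true;
    }
}

/*
 * Models the FD handler: take the mutex, run the callback, release the
 * mutex.  Deferred wake-ups only run after the mutex has been released.
 */
static void service(Client *c, bool enter_directly)
{
    pthread_mutex_lock(&c->mutex);
    completion_cb(c, enter_directly);
    pthread_mutex_unlock(&c->mutex);

    if (c->pending) {
        c->pending = false;
        resume_and_submit_next(c);
    }
}
```

Calling service() with enter_directly set to true on a freshly
initialised Client leaves would_deadlock at 1, which corresponds to the
hang in the report; with false, the follow-up request is submitted
normally because it only runs once the mutex has been dropped.
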
 block/nfs.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/block/nfs.c b/block/nfs.c
index 1d3a34a30c..b78f4f86e8 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -249,14 +249,15 @@ nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
     }
 
     /*
-     * Safe to call: nfs_service(), which called us, is only run from the FD
-     * handlers, never from the request coroutine.  The request coroutine in
-     * turn will yield unconditionally.
-     * No need to release the lock, even if we directly enter the coroutine, as
-     * the lock is never re-taken after yielding.  (Note: If we do enter the
-     * coroutine, @task will probably be dangling once aio_co_wake() returns.)
+     * Using aio_co_wake() here could re-enter the coroutine directly, while we
+     * still hold the mutex.  The current request will not attempt to re-take
+     * the mutex, so that is fine; but if the same coroutine then goes on to
+     * submit another request, that new request will try to re-take the mutex,
+     * resulting in a deadlock.
+     * To prevent that, only schedule the coroutine so it will be entered later,
+     * with the mutex released.
      */
-    aio_co_wake(task->co);
+    aio_co_schedule(qemu_coroutine_get_aio_context(task->co), task->co);
 }
 
 static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, int64_t offset,
@@ -716,8 +717,8 @@ nfs_get_allocated_file_size_cb(int ret, struct nfs_context *nfs, void *data,
     if (task->ret < 0) {
         error_report("NFS Error: %s", nfs_get_error(nfs));
     }
-    /* Safe to call, see nfs_co_generic_cb() */
-    aio_co_wake(task->co);
+    /* Must not use aio_co_wake(), see nfs_co_generic_cb() */
+    aio_co_schedule(qemu_coroutine_get_aio_context(task->co), task->co);
 }
 
 static int64_t coroutine_fn nfs_co_get_allocated_file_size(BlockDriverState *bs)
-- 
2.52.0


