qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Subject: [PULL 11/15] qio: Add QIONetListener API for using AioContext
Date: Thu, 13 Nov 2025 16:55:07 -0600	[thread overview]
Message-ID: <20251113225531.1077484-28-eblake@redhat.com> (raw)
In-Reply-To: <20251113225531.1077484-17-eblake@redhat.com>

The user calling himself "John Doe" reported a deadlock when
attempting to use qemu-storage-daemon to serve both a base file over
NBD, and a qcow2 file with that NBD export as its backing file, from
the same process, even though it worked just fine when there were two
q-s-d processes.  The bulk of the NBD server code properly uses
coroutines to make progress in an event-driven manner, but the code
for spawning a new coroutine at the point when listen(2) detects a new
client was hard-coded to use the global GMainContext; in other words,
the callback that triggers nbd_client_new to let the server start the
negotiation sequence with the client requires the main loop to be
making progress.  However, the code for bdrv_open of a qcow2 image
with an NBD backing file uses an AIO_WAIT_WHILE nested event loop to
ensure that the entire qcow2 backing chain is either fully loaded or
rejected, without any side effects from the main loop causing unwanted
changes to the disk being loaded (in short, an AioContext represents
the set of actions that are known to be safe while handling block
layer I/O, while excluding any other pending actions in the global
main loop with potentially larger risk of unwanted side effects).

This creates a classic case of deadlock: the server can't progress to
the point of accept(2)ing the client to write to the NBD socket
because the main loop is being starved until the AIO_WAIT_WHILE
completes the bdrv_open, but the AIO_WAIT_WHILE can't progress because
it is blocked on the client coroutine stuck in a read() of the
expected magic number from the server side of the socket.

This patch adds a new API to allow clients to opt in to listening via
an AioContext rather than a GMainContext.  This will allow NBD to fix
the deadlock by performing all actions during bdrv_open in the main
loop AioContext.

Technical debt warning: I would have loved to utilize a notify
function with AioContext to guarantee that we don't finalize listener
due to an object_unref if there is any callback still running (the way
GSource does), but wiring up notify functions into AioContext is a
bigger task that will be deferred to a later QEMU release.  But for
solving the NBD deadlock, it is sufficient to note that the QMP
commands for enabling and disabling the NBD server are really the only
points where we want to change the listener's callback.  Furthermore,
those commands are serviced in the main loop, which is the same
AioContext that is also listening for connections.  Since a thread
cannot interrupt itself, we are ensured that at the point where we are
changing the watch, there are no callbacks active.  This is NOT as
powerful as the GSource cross-thread safety, but sufficient for the
needs of today.

An upcoming patch will then add a unit test (kept separate to make it
easier to rearrange the series to demonstrate the deadlock without
this patch).

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/3169
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20251113011625.878876-26-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 include/io/net-listener.h | 21 +++++++++++++
 io/net-listener.c         | 66 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 84 insertions(+), 3 deletions(-)

diff --git a/include/io/net-listener.h b/include/io/net-listener.h
index 93608f7fe89..fdc5f1c81ad 100644
--- a/include/io/net-listener.h
+++ b/include/io/net-listener.h
@@ -152,6 +152,27 @@ void qio_net_listener_set_client_func(QIONetListener *listener,
                                       gpointer data,
                                       GDestroyNotify notify);

+/**
+ * qio_net_listener_set_client_aio_func:
+ * @listener: the network listener object
+ * @func: the callback function
+ * @data: opaque data to pass to @func
+ * @context: AioContext that @func will be bound to; if NULL, this will
+ *           will use qemu_get_aio_context().
+ *
+ * Similar to qio_net_listener_set_client_func_full(), except that the polling
+ * will be done by an AioContext rather than a GMainContext.
+ *
+ * Because AioContext does not (yet) support a clean way to deregister
+ * a callback from one thread while another thread might be in that
+ * callback, this function is only safe to call from the thread
+ * currently associated with @context.
+ */
+void qio_net_listener_set_client_aio_func(QIONetListener *listener,
+                                          QIONetListenerClientFunc func,
+                                          void *data,
+                                          AioContext *context);
+
 /**
  * qio_net_listener_wait_client:
  * @listener: the network listener object
diff --git a/io/net-listener.c b/io/net-listener.c
index 49399ec926a..9410d72da9c 100644
--- a/io/net-listener.c
+++ b/io/net-listener.c
@@ -85,6 +85,17 @@ static gboolean qio_net_listener_channel_func(QIOChannel *ioc,
 }


+static void qio_net_listener_aio_func(void *opaque)
+{
+    QIONetListenerSource *data = opaque;
+
+    assert(data->io_source == NULL);
+    assert(data->listener->aio_context != NULL);
+    qio_net_listener_channel_func(QIO_CHANNEL(data->sioc), G_IO_IN,
+                                  data->listener);
+}
+
+
 int qio_net_listener_open_sync(QIONetListener *listener,
                                SocketAddress *addr,
                                int num,
@@ -157,8 +168,26 @@ qio_net_listener_watch(QIONetListener *listener, size_t i, const char *caller)
                 qio_net_listener_channel_func,
                 listener, (GDestroyNotify)object_unref, listener->context);
         } else {
-            /* The user passed an AioContext. Not supported yet. */
-            g_assert_not_reached();
+            /*
+             * The user passed an AioContext.  At this point,
+             * AioContext lacks a clean way to call a notify function
+             * to release a final reference after any callback is
+             * complete.  But we asserted earlier that the async
+             * callback is changed only from the thread associated
+             * with aio_context, which means no other thread is in the
+             * middle of running the callback when we are changing the
+             * refcount on listener here.  Therefore, a single
+             * reference here is sufficient to ensure listener is not
+             * finalized during the callback.
+             */
+            assert(listener->context == NULL);
+            if (i == 0) {
+                object_ref(OBJECT(listener));
+            }
+            qio_channel_set_aio_fd_handler(
+                QIO_CHANNEL(listener->source[i]->sioc),
+                listener->aio_context, qio_net_listener_aio_func,
+                NULL, NULL, listener->source[i]);
         }
     }
 }
@@ -184,7 +213,13 @@ qio_net_listener_unwatch(QIONetListener *listener, const char *caller)
                 listener->source[i]->io_source = NULL;
             }
         } else {
-            g_assert_not_reached();
+            assert(listener->context == NULL);
+            qio_channel_set_aio_fd_handler(
+                QIO_CHANNEL(listener->source[i]->sioc),
+                listener->aio_context, NULL, NULL, NULL, NULL);
+            if (i == listener->nsioc - 1) {
+                object_unref(OBJECT(listener));
+            }
         }
     }
 }
@@ -259,6 +294,31 @@ void qio_net_listener_set_client_func(QIONetListener *listener,
                                               notify, NULL, NULL);
 }

+void qio_net_listener_set_client_aio_func(QIONetListener *listener,
+                                          QIONetListenerClientFunc func,
+                                          void *data,
+                                          AioContext *context)
+{
+    if (!context) {
+        assert(qemu_in_main_thread());
+        context = qemu_get_aio_context();
+    } else {
+        /*
+         * TODO: The API was intentionally designed to allow a caller
+         * to pass an alternative AioContext for future expansion;
+         * however, actually implementating that is not possible
+         * without notify callbacks wired into AioContext similar to
+         * how they work in GSource.  So for now, this code hard-codes
+         * the knowledge that the only client needing AioContext is
+         * the NBD server, which uses the global context and does not
+         * suffer from cross-thread safety issues.
+         */
+        g_assert_not_reached();
+    }
+    qio_net_listener_set_client_func_internal(listener, func, data,
+                                              NULL, NULL, context);
+}
+
 struct QIONetListenerClientWaitData {
     QIOChannelSocket *sioc;
     GMainLoop *loop;
-- 
2.51.1



  parent reply	other threads:[~2025-11-13 22:58 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-13 22:54 [PULL for-10.2 00/15] NBD patches for 2025-11-13 Eric Blake
2025-11-13 22:54 ` [PULL 01/15] iotests: Drop execute permissions on vvfat.out Eric Blake
2025-11-13 22:54 ` [PULL 02/15] qio: Add trace points to net_listener Eric Blake
2025-11-13 22:54 ` [PULL 03/15] qio: Unwatch before notify in QIONetListener Eric Blake
2025-11-13 22:55 ` [PULL 04/15] qio: Remember context of qio_net_listener_set_client_func_full Eric Blake
2025-11-13 22:55 ` [PULL 05/15] qio: Protect NetListener callback with mutex Eric Blake
2025-11-13 22:55 ` [PULL 06/15] qio: Minor optimization when callback function is unchanged Eric Blake
2025-11-13 22:55 ` [PULL 07/15] qio: Factor out helpers qio_net_listener_[un]watch Eric Blake
2025-11-13 22:55 ` [PULL 08/15] chardev: Reuse channel's cached local address Eric Blake
2025-11-13 22:55 ` [PULL 09/15] qio: Provide accessor around QIONetListener->sioc Eric Blake
2025-11-13 22:55 ` [PULL 10/15] qio: Prepare NetListener to use AioContext Eric Blake
2025-11-13 22:55 ` Eric Blake [this message]
2025-11-13 22:55 ` [PULL 12/15] nbd: Avoid deadlock in client connecting to same-process server Eric Blake
2025-11-13 22:55 ` [PULL 13/15] iotests: Add coverage of recent NBD qio deadlock fix Eric Blake
2025-11-13 22:55 ` [PULL 14/15] tests/qemu-iotests: Fix broken grep command in iotest 207 Eric Blake
2025-11-13 22:55 ` [PULL 15/15] tests/qemu-iotest: fix iotest 024 with qed images Eric Blake
2025-11-14 16:57 ` [PULL for-10.2 00/15] NBD patches for 2025-11-13 Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251113225531.1077484-28-eblake@redhat.com \
    --to=eblake@redhat.com \
    --cc=berrange@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).