qemu-devel.nongnu.org archive mirror
* [PATCH v4 0/2] nbd/server: Quiesce coroutines on context switch
@ 2021-02-01 12:50 Sergio Lopez
  2021-02-01 12:50 ` [PATCH v4 1/2] block: Avoid processing BDS twice in bdrv_set_aio_context_ignore() Sergio Lopez
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Sergio Lopez @ 2021-02-01 12:50 UTC
  To: qemu-devel; +Cc: Kevin Wolf, Sergio Lopez, qemu-block, Max Reitz

This series allows the NBD server to properly switch between AIO contexts,
having quiesced recv_coroutine and send_coroutine before doing the transition.

We need this because we send back devices running in IO Thread owned contexts
to the main context when stopping the data plane, something that can happen
multiple times during the lifetime of a VM (usually during the boot sequence or
on a reboot), and we drag the NBD server of the corresponding export with it.

While at it, also fix a problem caused by a cross-dependency between
closing the export's client connections and draining the block
layer. The visible effect of this problem was QEMU hanging when
the guest requests a power off while there's an active NBD client.

v4:
 - Call blk_exp_close_all() from qemu-nbd and qemu-storage-daemon
 too. (Kevin Wolf)

v3:
 - Drop already merged "block: Honor blk_set_aio_context() context
 requirements" and "nbd/server: Quiesce coroutines on context switch"
 - Change the strategy for avoiding processing BDS twice to adding
 every child and parent to the ignore list in advance before
 processing them. (Kevin Wolf)
 - Replace "nbd/server: Quiesce coroutines on context switch" with
 "block: move blk_exp_close_all() to qemu_cleanup()"

v2:
 - Replace "virtio-blk: Acquire context while switching them on
 dataplane start" with "block: Honor blk_set_aio_context() context
 requirements" (Kevin Wolf)
 - Add "block: Avoid processing BDS twice in
 bdrv_set_aio_context_ignore()"
 - Add "block: Close block exports in two steps"
 - Rename nbd_read_eof() to nbd_server_read_eof() (Eric Blake)
 - Fix double space and typo in comment. (Eric Blake)

Sergio Lopez (2):
  block: Avoid processing BDS twice in bdrv_set_aio_context_ignore()
  block: move blk_exp_close_all() to qemu_cleanup()

 block.c                              | 35 +++++++++++++++++++++-------
 qemu-nbd.c                           |  1 +
 softmmu/runstate.c                   |  9 +++++++
 storage-daemon/qemu-storage-daemon.c |  1 +
 4 files changed, 38 insertions(+), 8 deletions(-)

-- 
2.26.2





* [PATCH v4 1/2] block: Avoid processing BDS twice in bdrv_set_aio_context_ignore()
From: Sergio Lopez @ 2021-02-01 12:50 UTC
  To: qemu-devel; +Cc: Kevin Wolf, Sergio Lopez, qemu-block, Max Reitz

Some graphs may contain an indirect reference to the first BDS in the
chain, one that can be reached while walking the graph bottom->up from
one of its children.

Double-processing of a BDS is especially problematic for the
aio_notifiers, as they might attempt to work on both the old and the
new AIO contexts.

To avoid this problem, add every child and parent to the ignore list
before actually processing them.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
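Not part of the commit: a small standalone C model of the reordering
described above, in case it helps review. It is a sketch, not QEMU code.
The diamond graph, struct walk, walk_eager(), walk_deferred() and
processed_count() are all made up for illustration. A bool array stands
in for the GSList ignore list, children and parents are handled in one
pass instead of two, and the model assumes the function returns early for
a node that already has the new context, which the hunks below do not
show.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N_NODES 4
#define N_EDGES 4

/* Hypothetical stand-in for a BdrvChild: one parent/child link. */
struct edge { int parent; int child; };

/* Diamond: top(0) -> a(1) -> base(3) and top(0) -> b(2) -> base(3). */
static const struct edge edges[N_EDGES] = {
    {0, 1}, {0, 2}, {1, 3}, {2, 3},
};

struct walk {
    bool ignore[N_EDGES];    /* stands in for the ignore list */
    bool switched[N_NODES];  /* node already has the new context */
    int processed[N_NODES];  /* how often the node's context was switched */
};

/* Old strategy: mark an edge and follow it right away. */
static void walk_eager(struct walk *w, int node)
{
    assert(node >= 0 && node < N_NODES);
    if (w->switched[node]) {
        return;
    }
    for (int e = 0; e < N_EDGES; e++) {
        bool incident = edges[e].parent == node || edges[e].child == node;
        if (incident && !w->ignore[e]) {
            int other = edges[e].parent == node ? edges[e].child
                                                : edges[e].parent;
            w->ignore[e] = true;
            walk_eager(w, other);
        }
    }
    w->switched[node] = true;
    w->processed[node]++;
}

/* New strategy: mark every edge of this node first, then follow them. */
static void walk_deferred(struct walk *w, int node)
{
    bool todo[N_EDGES] = { false };

    assert(node >= 0 && node < N_NODES);
    if (w->switched[node]) {
        return;
    }
    for (int e = 0; e < N_EDGES; e++) {
        bool incident = edges[e].parent == node || edges[e].child == node;
        if (incident && !w->ignore[e]) {
            w->ignore[e] = true;
            todo[e] = true;
        }
    }
    for (int e = 0; e < N_EDGES; e++) {
        if (todo[e]) {
            int other = edges[e].parent == node ? edges[e].child
                                                : edges[e].parent;
            walk_deferred(w, other);
        }
    }
    w->switched[node] = true;
    w->processed[node]++;
}

/* Walk from the top node; report how often `node` was processed. */
static int processed_count(void (*walk_fn)(struct walk *, int), int node)
{
    struct walk w;

    memset(&w, 0, sizeof(w));
    walk_fn(&w, 0);
    return w.processed[node];
}
```

In this model, processed_count(walk_eager, 0) is 2: the walk re-enters
the top node through the top->b edge while the first call is still in
progress. processed_count(walk_deferred, 0) is 1, because both edges of
the top node are already on the ignore list before any recursion starts.
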
 block.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index 8b9d457546..3da99312db 100644
--- a/block.c
+++ b/block.c
@@ -6414,7 +6414,10 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
                                  AioContext *new_context, GSList **ignore)
 {
     AioContext *old_context = bdrv_get_aio_context(bs);
-    BdrvChild *child;
+    GSList *children_to_process = NULL;
+    GSList *parents_to_process = NULL;
+    GSList *entry;
+    BdrvChild *child, *parent;
 
     g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
 
@@ -6429,16 +6432,33 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
             continue;
         }
         *ignore = g_slist_prepend(*ignore, child);
-        bdrv_set_aio_context_ignore(child->bs, new_context, ignore);
+        children_to_process = g_slist_prepend(children_to_process, child);
     }
-    QLIST_FOREACH(child, &bs->parents, next_parent) {
-        if (g_slist_find(*ignore, child)) {
+
+    QLIST_FOREACH(parent, &bs->parents, next_parent) {
+        if (g_slist_find(*ignore, parent)) {
             continue;
         }
-        assert(child->klass->set_aio_ctx);
-        *ignore = g_slist_prepend(*ignore, child);
-        child->klass->set_aio_ctx(child, new_context, ignore);
+        *ignore = g_slist_prepend(*ignore, parent);
+        parents_to_process = g_slist_prepend(parents_to_process, parent);
+    }
+
+    for (entry = children_to_process;
+         entry != NULL;
+         entry = g_slist_next(entry)) {
+        child = entry->data;
+        bdrv_set_aio_context_ignore(child->bs, new_context, ignore);
+    }
+    g_slist_free(children_to_process);
+
+    for (entry = parents_to_process;
+         entry != NULL;
+         entry = g_slist_next(entry)) {
+        parent = entry->data;
+        assert(parent->klass->set_aio_ctx);
+        parent->klass->set_aio_ctx(parent, new_context, ignore);
     }
+    g_slist_free(parents_to_process);
 
     bdrv_detach_aio_context(bs);
 
-- 
2.26.2




* [PATCH v4 2/2] block: move blk_exp_close_all() to qemu_cleanup()
From: Sergio Lopez @ 2021-02-01 12:50 UTC
  To: qemu-devel; +Cc: Kevin Wolf, Sergio Lopez, qemu-block, Max Reitz

Move blk_exp_close_all() from bdrv_close_all() to qemu_cleanup(), before
bdrv_drain_all_begin().

Export drivers may have coroutines yielding at some point in the block
layer, so we need to shut them down before draining the block layer,
as otherwise they may get stuck in blk_wait_while_drained().

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900505
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
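Not part of the commit: a toy C model of why the ordering matters. It is
only a sketch under simplifying assumptions and does not use any QEMU
API. The toy_* names and both structs are hypothetical; the one thing
taken from the description above is that a request arriving while the
block layer is drained waits instead of completing, so an export with
such a request cannot finish closing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the block backend's drain state. */
struct toy_backend {
    int quiesce_counter;  /* > 0 while a drained section is open */
};

/* Hypothetical stand-in for one export with an active client. */
struct toy_export {
    int pending;  /* requests issued but not finished */
    int parked;   /* requests waiting for the drained section to end */
};

/* A request either completes at once or parks until the drain ends. */
static void toy_submit(struct toy_export *exp, const struct toy_backend *be)
{
    exp->pending++;
    if (be->quiesce_counter > 0) {
        exp->parked++;   /* models the wait in blk_wait_while_drained() */
    } else {
        exp->pending--;  /* models normal completion */
    }
    assert(exp->pending >= 0);
}

/* Closing can only finish once no request is pending. */
static bool toy_close(const struct toy_export *exp)
{
    return exp->pending == 0;
}

/* Returns true if shutdown completes, false if it would hang. */
static bool toy_shutdown(bool close_before_drain)
{
    struct toy_backend be = { 0 };
    struct toy_export exp = { 0, 0 };
    bool closed;

    if (close_before_drain) {
        toy_submit(&exp, &be);     /* client still active: completes */
        closed = toy_close(&exp);  /* nothing pending, close finishes */
        be.quiesce_counter++;      /* drain begins afterwards */
    } else {
        be.quiesce_counter++;      /* drain begins first */
        toy_submit(&exp, &be);     /* client still active: parks */
        closed = toy_close(&exp);  /* request never finishes */
    }
    return closed;
}
```

Here toy_shutdown(true) models the order after this patch and returns
true, while toy_shutdown(false) models closing the exports inside the
drained section and returns false.
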
 block.c                              | 1 -
 qemu-nbd.c                           | 1 +
 softmmu/runstate.c                   | 9 +++++++++
 storage-daemon/qemu-storage-daemon.c | 1 +
 4 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 3da99312db..9682c82fa8 100644
--- a/block.c
+++ b/block.c
@@ -4435,7 +4435,6 @@ static void bdrv_close(BlockDriverState *bs)
 void bdrv_close_all(void)
 {
     assert(job_next(NULL) == NULL);
-    blk_exp_close_all();
 
     /* Drop references from requests still in flight, such as canceled block
      * jobs whose AIO context has not been polled yet */
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 0d513cb38c..608c63e82a 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -503,6 +503,7 @@ static const char *socket_activation_validate_opts(const char *device,
 static void qemu_nbd_shutdown(void)
 {
     job_cancel_sync_all();
+    blk_exp_close_all();
     bdrv_close_all();
 }
 
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index 6177693a30..ac4b2e2540 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -25,6 +25,7 @@
 #include "qemu/osdep.h"
 #include "audio/audio.h"
 #include "block/block.h"
+#include "block/export.h"
 #include "chardev/char.h"
 #include "crypto/cipher.h"
 #include "crypto/init.h"
@@ -783,6 +784,14 @@ void qemu_cleanup(void)
      */
     migration_shutdown();
 
+    /*
+     * Close the exports before draining the block layer. The export
+     * drivers may have coroutines yielding on it, so we need to clean
+     * them up before the drain, as otherwise they may get stuck in
+     * blk_wait_while_drained().
+     */
+    blk_exp_close_all();
+
     /*
      * We must cancel all block jobs while the block layer is drained,
      * or cancelling will be affected by throttling and thus may block
diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c
index e0c87edbdd..d8d172cc60 100644
--- a/storage-daemon/qemu-storage-daemon.c
+++ b/storage-daemon/qemu-storage-daemon.c
@@ -314,6 +314,7 @@ int main(int argc, char *argv[])
         main_loop_wait(false);
     }
 
+    blk_exp_close_all();
     bdrv_drain_all_begin();
     bdrv_close_all();
 
-- 
2.26.2




* Re: [PATCH v4 0/2] nbd/server: Quiesce coroutines on context switch
From: Kevin Wolf @ 2021-02-01 15:31 UTC
  To: Sergio Lopez; +Cc: qemu-devel, qemu-block, Max Reitz

On 01.02.2021 at 13:50, Sergio Lopez wrote:
> This series allows the NBD server to properly switch between AIO contexts,
> having quiesced recv_coroutine and send_coroutine before doing the transition.
> 
> We need this because we send back devices running in IO Thread owned contexts
> to the main context when stopping the data plane, something that can happen
> multiple times during the lifetime of a VM (usually during the boot sequence or
> on a reboot), and we drag the NBD server of the corresponding export with it.
> 
> While at it, also fix a problem caused by a cross-dependency between
> closing the export's client connections and draining the block
> layer. The visible effect of this problem was QEMU hanging when
> the guest requests a power off while there's an active NBD client.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>



