* [PATCH v3 0/5] block-backend: process I/O in the current AioContext
@ 2023-09-12 23:10 Stefan Hajnoczi
  2023-09-12 23:10 ` [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context() Stefan Hajnoczi
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2023-09-12 23:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Klaus Jensen, kwolf, Cleber Rosa, John Snow, Hanna Reitz,
	Philippe Mathieu-Daudé, Keith Busch, Paolo Bonzini, Peter Xu,
	qemu-block, Stefan Hajnoczi, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

v3
- Add Patch 2 to fix a race condition in test-bdrv-drain. This was the CI
  failure that bumped this patch series from Kevin's pull request.
- Add missing 051.pc.out file. I tried qemu-system-aarch64 to see if 051.out
  also needs to be updated, but no changes were necessary. [Kevin]
v2
- Add patch to remove AIOCBInfo->get_aio_context() [Kevin]
- Add patch to use qemu_get_current_aio_context() in block-coroutine-wrapper so
  that the wrappers use the current AioContext instead of
  bdrv_get_aio_context().

Switch blk_aio_*() APIs over to multi-queue by using
qemu_get_current_aio_context() instead of blk_get_aio_context(). This change
will allow devices to process I/O in multiple IOThreads in the future.

The final patch requires my QIOChannel AioContext series to pass
tests/qemu-iotests/check -qcow2 281 because the nbd block driver is now
accessed from the main loop thread in addition to the IOThread:
https://lore.kernel.org/qemu-devel/20230823234504.1387239-1-stefanha@redhat.com/T/#t

Based-on: 20230823234504.1387239-1-stefanha@redhat.com

Stefan Hajnoczi (5):
  block: remove AIOCBInfo->get_aio_context()
  test-bdrv-drain: avoid race with BH in IOThread drain test
  block-backend: process I/O in the current AioContext
  block-backend: process zoned requests in the current AioContext
  block-coroutine-wrapper: use qemu_get_current_aio_context()

 include/block/aio.h                |  1 -
 include/block/block-global-state.h |  2 ++
 include/block/block-io.h           |  1 -
 block/block-backend.c              | 35 ++++++++----------------------
 block/io.c                         | 23 +++++++-------------
 hw/nvme/ctrl.c                     |  7 ------
 softmmu/dma-helpers.c              |  8 -------
 tests/unit/test-bdrv-drain.c       |  8 +++++++
 util/thread-pool.c                 |  8 -------
 scripts/block-coroutine-wrapper.py |  6 ++---
 tests/qemu-iotests/051.pc.out      |  4 ++--
 11 files changed, 31 insertions(+), 72 deletions(-)

-- 
2.41.0




* [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context()
  2023-09-12 23:10 [PATCH v3 0/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
@ 2023-09-12 23:10 ` Stefan Hajnoczi
  2023-09-13 15:59   ` Eric Blake
  2023-09-14  6:44   ` Klaus Jensen
  2023-09-12 23:10 ` [PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test Stefan Hajnoczi
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2023-09-12 23:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Klaus Jensen, kwolf, Cleber Rosa, John Snow, Hanna Reitz,
	Philippe Mathieu-Daudé, Keith Busch, Paolo Bonzini, Peter Xu,
	qemu-block, Stefan Hajnoczi, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

The synchronous bdrv_aio_cancel() function needs the acb's AioContext so
it can call aio_poll() to wait for cancellation.

It turns out that all users run under the BQL in the main AioContext, so
this callback is not needed.

Remove the callback, mark bdrv_aio_cancel() GLOBAL_STATE_CODE just like
its blk_aio_cancel() caller, and poll the main loop AioContext.

The purpose of this cleanup is to identify bdrv_aio_cancel() as an API
that does not work with the multi-queue block layer.
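
For illustration, a minimal sketch of the calling contract that this patch
leaves behind (the function name is made up; the asserted context matches
the new code):

  static void cancel_example(BlockAIOCB *acb)
  {
      GLOBAL_STATE_CODE();   /* main loop thread with the BQL held */
      bdrv_aio_cancel(acb);  /* polls the main loop AioContext until done */
  }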

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/block/aio.h                |  1 -
 include/block/block-global-state.h |  2 ++
 include/block/block-io.h           |  1 -
 block/block-backend.c              | 17 -----------------
 block/io.c                         | 23 ++++++++---------------
 hw/nvme/ctrl.c                     |  7 -------
 softmmu/dma-helpers.c              |  8 --------
 util/thread-pool.c                 |  8 --------
 8 files changed, 10 insertions(+), 57 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 32042e8905..bcc165c974 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -31,7 +31,6 @@ typedef void BlockCompletionFunc(void *opaque, int ret);
 
 typedef struct AIOCBInfo {
     void (*cancel_async)(BlockAIOCB *acb);
-    AioContext *(*get_aio_context)(BlockAIOCB *acb);
     size_t aiocb_size;
 } AIOCBInfo;
 
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index f347199bff..ac2a605ef5 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -185,6 +185,8 @@ void bdrv_drain_all_begin_nopoll(void);
 void bdrv_drain_all_end(void);
 void bdrv_drain_all(void);
 
+void bdrv_aio_cancel(BlockAIOCB *acb);
+
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
 BlockDriverState *bdrv_find_node(const char *node_name);
diff --git a/include/block/block-io.h b/include/block/block-io.h
index 6db48f2d35..f1c796a1ce 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -101,7 +101,6 @@ bdrv_co_delete_file_noerr(BlockDriverState *bs);
 
 
 /* async block I/O */
-void bdrv_aio_cancel(BlockAIOCB *acb);
 void bdrv_aio_cancel_async(BlockAIOCB *acb);
 
 /* sg packet commands */
diff --git a/block/block-backend.c b/block/block-backend.c
index 4009ed5fed..a77295a198 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -33,8 +33,6 @@
 
 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
 
-static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb);
-
 typedef struct BlockBackendAioNotifier {
     void (*attached_aio_context)(AioContext *new_context, void *opaque);
     void (*detach_aio_context)(void *opaque);
@@ -103,7 +101,6 @@ typedef struct BlockBackendAIOCB {
 } BlockBackendAIOCB;
 
 static const AIOCBInfo block_backend_aiocb_info = {
-    .get_aio_context = blk_aiocb_get_aio_context,
     .aiocb_size = sizeof(BlockBackendAIOCB),
 };
 
@@ -1545,16 +1542,8 @@ typedef struct BlkAioEmAIOCB {
     bool has_returned;
 } BlkAioEmAIOCB;
 
-static AioContext *blk_aio_em_aiocb_get_aio_context(BlockAIOCB *acb_)
-{
-    BlkAioEmAIOCB *acb = container_of(acb_, BlkAioEmAIOCB, common);
-
-    return blk_get_aio_context(acb->rwco.blk);
-}
-
 static const AIOCBInfo blk_aio_em_aiocb_info = {
     .aiocb_size         = sizeof(BlkAioEmAIOCB),
-    .get_aio_context    = blk_aio_em_aiocb_get_aio_context,
 };
 
 static void blk_aio_complete(BlkAioEmAIOCB *acb)
@@ -2434,12 +2423,6 @@ AioContext *blk_get_aio_context(BlockBackend *blk)
     return blk->ctx;
 }
 
-static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb)
-{
-    BlockBackendAIOCB *blk_acb = DO_UPCAST(BlockBackendAIOCB, common, acb);
-    return blk_get_aio_context(blk_acb->blk);
-}
-
 int blk_set_aio_context(BlockBackend *blk, AioContext *new_context,
                         Error **errp)
 {
diff --git a/block/io.c b/block/io.c
index ba23a9bcd3..209a6da0c8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2950,25 +2950,18 @@ int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
 /**************************************************************/
 /* async I/Os */
 
+/**
+ * Synchronously cancels an acb. Must be called with the BQL held and the acb
+ * must be processed with the BQL held too (IOThreads are not allowed).
+ *
+ * Use bdrv_aio_cancel_async() instead when possible.
+ */
 void bdrv_aio_cancel(BlockAIOCB *acb)
 {
-    IO_CODE();
+    GLOBAL_STATE_CODE();
     qemu_aio_ref(acb);
     bdrv_aio_cancel_async(acb);
-    while (acb->refcnt > 1) {
-        if (acb->aiocb_info->get_aio_context) {
-            aio_poll(acb->aiocb_info->get_aio_context(acb), true);
-        } else if (acb->bs) {
-            /* qemu_aio_ref and qemu_aio_unref are not thread-safe, so
-             * assert that we're not using an I/O thread.  Thread-safe
-             * code should use bdrv_aio_cancel_async exclusively.
-             */
-            assert(bdrv_get_aio_context(acb->bs) == qemu_get_aio_context());
-            aio_poll(bdrv_get_aio_context(acb->bs), true);
-        } else {
-            abort();
-        }
-    }
+    AIO_WAIT_WHILE_UNLOCKED(NULL, acb->refcnt > 1);
     qemu_aio_unref(acb);
 }
 
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 539d273553..ee7273daa1 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -2130,11 +2130,6 @@ static inline bool nvme_is_write(NvmeRequest *req)
            rw->opcode == NVME_CMD_WRITE_ZEROES;
 }
 
-static AioContext *nvme_get_aio_context(BlockAIOCB *acb)
-{
-    return qemu_get_aio_context();
-}
-
 static void nvme_misc_cb(void *opaque, int ret)
 {
     NvmeRequest *req = opaque;
@@ -3302,7 +3297,6 @@ static void nvme_flush_cancel(BlockAIOCB *acb)
 static const AIOCBInfo nvme_flush_aiocb_info = {
     .aiocb_size = sizeof(NvmeFlushAIOCB),
     .cancel_async = nvme_flush_cancel,
-    .get_aio_context = nvme_get_aio_context,
 };
 
 static void nvme_do_flush(NvmeFlushAIOCB *iocb);
@@ -6478,7 +6472,6 @@ static void nvme_format_cancel(BlockAIOCB *aiocb)
 static const AIOCBInfo nvme_format_aiocb_info = {
     .aiocb_size = sizeof(NvmeFormatAIOCB),
     .cancel_async = nvme_format_cancel,
-    .get_aio_context = nvme_get_aio_context,
 };
 
 static void nvme_format_set(NvmeNamespace *ns, uint8_t lbaf, uint8_t mset,
diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
index 2463964805..36211acc7e 100644
--- a/softmmu/dma-helpers.c
+++ b/softmmu/dma-helpers.c
@@ -206,17 +206,9 @@ static void dma_aio_cancel(BlockAIOCB *acb)
     }
 }
 
-static AioContext *dma_get_aio_context(BlockAIOCB *acb)
-{
-    DMAAIOCB *dbs = container_of(acb, DMAAIOCB, common);
-
-    return dbs->ctx;
-}
-
 static const AIOCBInfo dma_aiocb_info = {
     .aiocb_size         = sizeof(DMAAIOCB),
     .cancel_async       = dma_aio_cancel,
-    .get_aio_context    = dma_get_aio_context,
 };
 
 BlockAIOCB *dma_blk_io(AioContext *ctx,
diff --git a/util/thread-pool.c b/util/thread-pool.c
index e3d8292d14..22f9ba3286 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -228,17 +228,9 @@ static void thread_pool_cancel(BlockAIOCB *acb)
 
 }
 
-static AioContext *thread_pool_get_aio_context(BlockAIOCB *acb)
-{
-    ThreadPoolElement *elem = (ThreadPoolElement *)acb;
-    ThreadPool *pool = elem->pool;
-    return pool->ctx;
-}
-
 static const AIOCBInfo thread_pool_aiocb_info = {
     .aiocb_size         = sizeof(ThreadPoolElement),
     .cancel_async       = thread_pool_cancel,
-    .get_aio_context    = thread_pool_get_aio_context,
 };
 
 BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg,
-- 
2.41.0




* [PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test
  2023-09-12 23:10 [PATCH v3 0/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
  2023-09-12 23:10 ` [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context() Stefan Hajnoczi
@ 2023-09-12 23:10 ` Stefan Hajnoczi
  2023-09-13 16:08   ` Eric Blake
  2023-09-12 23:10 ` [PATCH v3 3/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2023-09-12 23:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Klaus Jensen, kwolf, Cleber Rosa, John Snow, Hanna Reitz,
	Philippe Mathieu-Daudé, Keith Busch, Paolo Bonzini, Peter Xu,
	qemu-block, Stefan Hajnoczi, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

This patch fixes a race condition in test-bdrv-drain that is difficult
to reproduce. test-bdrv-drain sometimes fails without an error message
on the block pull request sent by Kevin Wolf on Sep 4, 2023. I was able
to reproduce it locally and found that "block-backend: process I/O in
the current AioContext" (in this patch series) is the first commit where
it reproduces.

I do not know why "block-backend: process I/O in the current AioContext"
exposes this bug. It might be related to the fact that the test's preadv
request runs in the main thread instead of IOThread a after my commit.
That might simply change the timing of the test.

Now on to the race condition in test-bdrv-drain. The main thread
schedules a BH in IOThread a and then drains the BDS:

  aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);

  /* The request is running on the IOThread a. Draining its block device
   * will make sure that it has completed as far as the BDS is concerned,
   * but the drain in this thread can continue immediately after
   * bdrv_dec_in_flight() and aio_ret might be assigned only slightly
   * later. */
  do_drain_begin(drain_type, bs);

If the BH completes before do_drain_begin() then there is nothing to
worry about.

If the BH invokes bdrv_flush() before do_drain_begin(), then
do_drain_begin() waits for it to complete.

The problematic case is when do_drain_begin() runs before the BH enters
bdrv_flush(). Then do_drain_begin() misses the BH and the drain
mechanism has failed in quiescing I/O.

Fix this by incrementing the in_flight counter so that do_drain_begin()
waits for test_iothread_main_thread_bh().
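
Sketched as a general idiom (bh_fn and its opaque wiring are illustrative;
the in-flight helpers are the real ones used by the fix):

  static void bh_fn(void *opaque)
  {
      BlockDriverState *bs = opaque;
      /* work that the drain section must not miss */
      bdrv_dec_in_flight(bs);                 /* let the drain complete */
  }

  /* caller, before scheduling the cross-thread BH: */
  bdrv_inc_in_flight(bs);                     /* drain now waits for us */
  aio_bh_schedule_oneshot(ctx_a, bh_fn, bs);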

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/unit/test-bdrv-drain.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index ccc453c29e..67a79aa3f0 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -512,6 +512,7 @@ static void test_iothread_main_thread_bh(void *opaque)
      * executed during drain, otherwise this would deadlock. */
     aio_context_acquire(bdrv_get_aio_context(data->bs));
     bdrv_flush(data->bs);
+    bdrv_dec_in_flight(data->bs); /* incremented by test_iothread_common() */
     aio_context_release(bdrv_get_aio_context(data->bs));
 }
 
@@ -583,6 +584,13 @@ static void test_iothread_common(enum drain_type drain_type, int drain_thread)
             aio_context_acquire(ctx_a);
         }
 
+        /*
+         * Increment in_flight so that do_drain_begin() waits for
+         * test_iothread_main_thread_bh(). This prevents the race between
+         * test_iothread_main_thread_bh() in IOThread a and do_drain_begin() in
+         * this thread. test_iothread_main_thread_bh() decrements in_flight.
+         */
+        bdrv_inc_in_flight(bs);
         aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);
 
         /* The request is running on the IOThread a. Draining its block device
-- 
2.41.0




* [PATCH v3 3/5] block-backend: process I/O in the current AioContext
  2023-09-12 23:10 [PATCH v3 0/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
  2023-09-12 23:10 ` [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context() Stefan Hajnoczi
  2023-09-12 23:10 ` [PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test Stefan Hajnoczi
@ 2023-09-12 23:10 ` Stefan Hajnoczi
  2023-09-13 16:20   ` Eric Blake
  2023-09-12 23:10 ` [PATCH v3 4/5] block-backend: process zoned requests " Stefan Hajnoczi
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2023-09-12 23:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Klaus Jensen, kwolf, Cleber Rosa, John Snow, Hanna Reitz,
	Philippe Mathieu-Daudé, Keith Busch, Paolo Bonzini, Peter Xu,
	qemu-block, Stefan Hajnoczi, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

Switch blk_aio_*() APIs over to multi-queue by using
qemu_get_current_aio_context() instead of blk_get_aio_context(). This
change will allow devices to process I/O in multiple IOThreads in the
future.

I audited existing blk_aio_*() callers:
- migration/block.c: blk_mig_lock() protects the data accessed by the
  completion callback.
- The remaining emulated devices and exports run with
  qemu_get_aio_context() == blk_get_aio_context().
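
The visible effect, as a hedged sketch (callback and variable names are
invented):

  static void my_complete_cb(void *opaque, int ret)
  {
      /* Now runs in the AioContext of the thread that submitted the
       * request, not necessarily in blk_get_aio_context(blk). */
  }

  /* Called from an IOThread, so my_complete_cb runs in that IOThread: */
  blk_aio_preadv(blk, 0, &qiov, 0, my_complete_cb, NULL);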

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/block-backend.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index a77295a198..4863be5691 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1530,7 +1530,7 @@ BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
     acb->blk = blk;
     acb->ret = ret;
 
-    replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
+    replay_bh_schedule_oneshot_event(qemu_get_current_aio_context(),
                                      error_callback_bh, acb);
     return &acb->common;
 }
@@ -1584,11 +1584,11 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset,
     acb->has_returned = false;
 
     co = qemu_coroutine_create(co_entry, acb);
-    aio_co_enter(blk_get_aio_context(blk), co);
+    aio_co_enter(qemu_get_current_aio_context(), co);
 
     acb->has_returned = true;
     if (acb->rwco.ret != NOT_DONE) {
-        replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
+        replay_bh_schedule_oneshot_event(qemu_get_current_aio_context(),
                                          blk_aio_complete_bh, acb);
     }
 
-- 
2.41.0




* [PATCH v3 4/5] block-backend: process zoned requests in the current AioContext
  2023-09-12 23:10 [PATCH v3 0/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2023-09-12 23:10 ` [PATCH v3 3/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
@ 2023-09-12 23:10 ` Stefan Hajnoczi
  2023-09-13 16:22   ` Eric Blake
  2023-09-12 23:10 ` [PATCH v3 5/5] block-coroutine-wrapper: use qemu_get_current_aio_context() Stefan Hajnoczi
  2023-09-15 14:39 ` [PATCH v3 0/5] block-backend: process I/O in the current AioContext Kevin Wolf
  5 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2023-09-12 23:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Klaus Jensen, kwolf, Cleber Rosa, John Snow, Hanna Reitz,
	Philippe Mathieu-Daudé, Keith Busch, Paolo Bonzini, Peter Xu,
	qemu-block, Stefan Hajnoczi, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

Process zoned requests in the current thread's AioContext instead of in
the BlockBackend's AioContext.

There is no need to use the BlockBackend's AioContext thanks to CoMutex
bs->wps->colock, which protects zone metadata.
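
Roughly, a zoned request coroutine serializes on that lock regardless of
which AioContext it runs in (sketch only; the field names are those from
the commit message):

  qemu_co_mutex_lock(&bs->wps->colock);
  /* read or update the zone write pointer state */
  qemu_co_mutex_unlock(&bs->wps->colock);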

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/block-backend.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 4863be5691..427ebcc0e4 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1890,11 +1890,11 @@ BlockAIOCB *blk_aio_zone_report(BlockBackend *blk, int64_t offset,
     acb->has_returned = false;
 
     co = qemu_coroutine_create(blk_aio_zone_report_entry, acb);
-    aio_co_enter(blk_get_aio_context(blk), co);
+    aio_co_enter(qemu_get_current_aio_context(), co);
 
     acb->has_returned = true;
     if (acb->rwco.ret != NOT_DONE) {
-        replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
+        replay_bh_schedule_oneshot_event(qemu_get_current_aio_context(),
                                          blk_aio_complete_bh, acb);
     }
 
@@ -1931,11 +1931,11 @@ BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
     acb->has_returned = false;
 
     co = qemu_coroutine_create(blk_aio_zone_mgmt_entry, acb);
-    aio_co_enter(blk_get_aio_context(blk), co);
+    aio_co_enter(qemu_get_current_aio_context(), co);
 
     acb->has_returned = true;
     if (acb->rwco.ret != NOT_DONE) {
-        replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
+        replay_bh_schedule_oneshot_event(qemu_get_current_aio_context(),
                                          blk_aio_complete_bh, acb);
     }
 
@@ -1971,10 +1971,10 @@ BlockAIOCB *blk_aio_zone_append(BlockBackend *blk, int64_t *offset,
     acb->has_returned = false;
 
     co = qemu_coroutine_create(blk_aio_zone_append_entry, acb);
-    aio_co_enter(blk_get_aio_context(blk), co);
+    aio_co_enter(qemu_get_current_aio_context(), co);
     acb->has_returned = true;
     if (acb->rwco.ret != NOT_DONE) {
-        replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
+        replay_bh_schedule_oneshot_event(qemu_get_current_aio_context(),
                                          blk_aio_complete_bh, acb);
     }
 
-- 
2.41.0




* [PATCH v3 5/5] block-coroutine-wrapper: use qemu_get_current_aio_context()
  2023-09-12 23:10 [PATCH v3 0/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2023-09-12 23:10 ` [PATCH v3 4/5] block-backend: process zoned requests " Stefan Hajnoczi
@ 2023-09-12 23:10 ` Stefan Hajnoczi
  2023-09-13 16:23   ` Eric Blake
  2023-09-15 14:39 ` [PATCH v3 0/5] block-backend: process I/O in the current AioContext Kevin Wolf
  5 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2023-09-12 23:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Klaus Jensen, kwolf, Cleber Rosa, John Snow, Hanna Reitz,
	Philippe Mathieu-Daudé, Keith Busch, Paolo Bonzini, Peter Xu,
	qemu-block, Stefan Hajnoczi, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

Use qemu_get_current_aio_context() in mixed wrappers and coroutine
wrappers so that code runs in the caller's AioContext instead of moving
to the BlockDriverState's AioContext. This change is necessary for the
multi-queue block layer where any thread can call into the block layer.

Most wrappers are IO_CODE where it's safe to use the current AioContext
nowadays. BlockDrivers and the core block layer use their own locks and
no longer depend on the AioContext lock for thread-safety.

The bdrv_create() wrapper invokes GLOBAL_STATE code. Using the current
AioContext is safe because this code is only called with the BQL held
from the main loop thread.
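
For reference, a generated mixed wrapper now looks roughly like this
(simplified; the struct and entry-point names are illustrative):

  int bdrv_flush(BlockDriverState *bs)
  {
      if (qemu_in_coroutine()) {
          return bdrv_co_flush(bs);
      } else {
          BdrvFlushCo s = {
              .poll_state.ctx = qemu_get_current_aio_context(),
              .poll_state.in_progress = true,
              .bs = bs,
          };
          s.poll_state.co = qemu_coroutine_create(bdrv_flush_co_entry, &s);
          bdrv_poll_co(&s.poll_state);
          return s.ret;
      }
  }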

The output of qemu-iotests 051 is sensitive to event loop activity.
Update the output because the monitor BH runs at a different time,
causing prompts to be printed differently in the output.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 scripts/block-coroutine-wrapper.py | 6 ++----
 tests/qemu-iotests/051.pc.out      | 4 ++--
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/scripts/block-coroutine-wrapper.py b/scripts/block-coroutine-wrapper.py
index d4a183db61..f93fe154c3 100644
--- a/scripts/block-coroutine-wrapper.py
+++ b/scripts/block-coroutine-wrapper.py
@@ -88,8 +88,6 @@ def __init__(self, wrapper_type: str, return_type: str, name: str,
                 raise ValueError(f"no_co function can't be rdlock: {self.name}")
             self.target_name = f'{subsystem}_{subname}'
 
-        self.ctx = self.gen_ctx()
-
         self.get_result = 's->ret = '
         self.ret = 'return s.ret;'
         self.co_ret = 'return '
@@ -162,7 +160,7 @@ def create_mixed_wrapper(func: FuncDecl) -> str:
         {func.co_ret}{name}({ func.gen_list('{name}') });
     }} else {{
         {struct_name} s = {{
-            .poll_state.ctx = {func.ctx},
+            .poll_state.ctx = qemu_get_current_aio_context(),
             .poll_state.in_progress = true,
 
 { func.gen_block('            .{name} = {name},') }
@@ -186,7 +184,7 @@ def create_co_wrapper(func: FuncDecl) -> str:
 {func.return_type} {func.name}({ func.gen_list('{decl}') })
 {{
     {struct_name} s = {{
-        .poll_state.ctx = {func.ctx},
+        .poll_state.ctx = qemu_get_current_aio_context(),
         .poll_state.in_progress = true,
 
 { func.gen_block('        .{name} = {name},') }
diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
index 4d4af5a486..650cfed8e2 100644
--- a/tests/qemu-iotests/051.pc.out
+++ b/tests/qemu-iotests/051.pc.out
@@ -177,11 +177,11 @@ QEMU_PROG: -device virtio-blk-pci,drive=disk,share-rw=on: Cannot change iothread
 
 Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device lsi53c895a,id=lsi0 -device scsi-hd,bus=lsi0.0,drive=disk,share-rw=on
 QEMU X.Y.Z monitor - type 'help' for more information
-(qemu) QEMU_PROG: -device scsi-hd,bus=lsi0.0,drive=disk,share-rw=on: HBA does not support iothreads
+QEMU_PROG: -device scsi-hd,bus=lsi0.0,drive=disk,share-rw=on: HBA does not support iothreads
 
 Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device virtio-scsi,id=virtio-scsi1 -device scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on
 QEMU X.Y.Z monitor - type 'help' for more information
-(qemu) QEMU_PROG: -device scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on: Cannot change iothread of active block backend
+QEMU_PROG: -device scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on: Cannot change iothread of active block backend
 
 Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device virtio-blk-pci,drive=disk,iothread=thread0,share-rw=on
 QEMU X.Y.Z monitor - type 'help' for more information
-- 
2.41.0




* Re: [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context()
  2023-09-12 23:10 ` [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context() Stefan Hajnoczi
@ 2023-09-13 15:59   ` Eric Blake
  2023-09-14  6:44   ` Klaus Jensen
  1 sibling, 0 replies; 14+ messages in thread
From: Eric Blake @ 2023-09-13 15:59 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Klaus Jensen, kwolf, Cleber Rosa, John Snow,
	Hanna Reitz, Philippe Mathieu-Daudé, Keith Busch,
	Paolo Bonzini, Peter Xu, qemu-block, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

On Tue, Sep 12, 2023 at 07:10:33PM -0400, Stefan Hajnoczi wrote:
> The synchronous bdrv_aio_cancel() function needs the acb's AioContext so
> it can call aio_poll() to wait for cancellation.
> 
> It turns out that all users run under the BQL in the main AioContext, so
> this callback is not needed.
> 
> Remove the callback, mark bdrv_aio_cancel() GLOBAL_STATE_CODE just like
> its blk_aio_cancel() caller, and poll the main loop AioContext.
> 
> The purpose of this cleanup is to identify bdrv_aio_cancel() as an API
> that does not work with the multi-queue block layer.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




* Re: [PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test
  2023-09-12 23:10 ` [PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test Stefan Hajnoczi
@ 2023-09-13 16:08   ` Eric Blake
  2023-09-14 12:01     ` Stefan Hajnoczi
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Blake @ 2023-09-13 16:08 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Klaus Jensen, kwolf, Cleber Rosa, John Snow,
	Hanna Reitz, Philippe Mathieu-Daudé, Keith Busch,
	Paolo Bonzini, Peter Xu, qemu-block, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

On Tue, Sep 12, 2023 at 07:10:34PM -0400, Stefan Hajnoczi wrote:
> This patch fixes a race condition in test-bdrv-drain that is difficult
> to reproduce. test-bdrv-drain sometimes fails without an error message
> on the block pull request sent by Kevin Wolf on Sep 4, 2023. I was able
> to reproduce it locally and found that "block-backend: process I/O in
> the current AioContext" (in this patch series) is the first commit where
> it reproduces.
> 
> I do not know why "block-backend: process I/O in the current AioContext"
> exposes this bug. It might be related to the fact that the test's preadv
> request runs in the main thread instead of IOThread a after my commit.

In reading the commit message before the impacted code, my first
thought was that you had a typo of an extra word (that is, something
to fix by s/a //), but reading further, a better fix would be calling
attention to the fact that you are referencing a specific named
thread, as in s/IOThread a/IOThread A/...

> That might simply change the timing of the test.
> 
> Now on to the race condition in test-bdrv-drain. The main thread
> schedules a BH in IOThread a and then drains the BDS:

...and another spot with the same parse issue...

> 
>   aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);
> 
>   /* The request is running on the IOThread a. Draining its block device

...but here you were quoting from the existing code base, which is
where I finally realized it was more than just your commit message.

>    * will make sure that it has completed as far as the BDS is concerned,
>    * but the drain in this thread can continue immediately after
>    * bdrv_dec_in_flight() and aio_ret might be assigned only slightly
>    * later. */
>   do_drain_begin(drain_type, bs);
> 
> If the BH completes before do_drain_begin() then there is nothing to
> worry about.
> 
> If the BH invokes bdrv_flush() before do_drain_begin(), then
> do_drain_begin() waits for it to complete.
> 
> The problematic case is when do_drain_begin() runs before the BH enters
> bdrv_flush(). Then do_drain_begin() misses the BH and the drain
> mechanism has failed in quiescing I/O.
> 
> Fix this by incrementing the in_flight counter so that do_drain_begin()
> waits for test_iothread_main_thread_bh().
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  tests/unit/test-bdrv-drain.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
> index ccc453c29e..67a79aa3f0 100644
> --- a/tests/unit/test-bdrv-drain.c
> +++ b/tests/unit/test-bdrv-drain.c
> @@ -512,6 +512,7 @@ static void test_iothread_main_thread_bh(void *opaque)
>       * executed during drain, otherwise this would deadlock. */
>      aio_context_acquire(bdrv_get_aio_context(data->bs));
>      bdrv_flush(data->bs);
> +    bdrv_dec_in_flight(data->bs); /* incremented by test_iothread_common() */
>      aio_context_release(bdrv_get_aio_context(data->bs));
>  }
>  
> @@ -583,6 +584,13 @@ static void test_iothread_common(enum drain_type drain_type, int drain_thread)
>              aio_context_acquire(ctx_a);
>          }
>  
> +        /*
> +         * Increment in_flight so that do_drain_begin() waits for
> +         * test_iothread_main_thread_bh(). This prevents the race between
> +         * test_iothread_main_thread_bh() in IOThread a and do_drain_begin() in
> +         * this thread. test_iothread_main_thread_bh() decrements in_flight.
> +         */
> +        bdrv_inc_in_flight(bs);
>          aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);
>  
>          /* The request is running on the IOThread a. Draining its block device

and indeed, your commit message is consistent with the current code's
naming convention.  If you have reason to respin, a pre-req patch to
change the case before adding more references might be nice, but I
won't insist.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




* Re: [PATCH v3 3/5] block-backend: process I/O in the current AioContext
  2023-09-12 23:10 ` [PATCH v3 3/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
@ 2023-09-13 16:20   ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2023-09-13 16:20 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Klaus Jensen, kwolf, Cleber Rosa, John Snow,
	Hanna Reitz, Philippe Mathieu-Daudé, Keith Busch,
	Paolo Bonzini, Peter Xu, qemu-block, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

On Tue, Sep 12, 2023 at 07:10:35PM -0400, Stefan Hajnoczi wrote:
> Switch blk_aio_*() APIs over to multi-queue by using
> qemu_get_current_aio_context() instead of blk_get_aio_context(). This
> change will allow devices to process I/O in multiple IOThreads in the
> future.
> 
> I audited existing blk_aio_*() callers:
> - migration/block.c: blk_mig_lock() protects the data accessed by the
>   completion callback.
> - The remaining emulated devices and exports run with
>   qemu_get_aio_context() == blk_get_aio_context().
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/block-backend.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




* Re: [PATCH v3 4/5] block-backend: process zoned requests in the current AioContext
  2023-09-12 23:10 ` [PATCH v3 4/5] block-backend: process zoned requests " Stefan Hajnoczi
@ 2023-09-13 16:22   ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2023-09-13 16:22 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Klaus Jensen, kwolf, Cleber Rosa, John Snow,
	Hanna Reitz, Philippe Mathieu-Daudé, Keith Busch,
	Paolo Bonzini, Peter Xu, qemu-block, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

On Tue, Sep 12, 2023 at 07:10:36PM -0400, Stefan Hajnoczi wrote:
> Process zoned requests in the current thread's AioContext instead of in
> the BlockBackend's AioContext.
> 
> There is no need to use the BlockBackend's AioContext thanks to CoMutex
> bs->wps->colock, which protects zone metadata.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/block-backend.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




* Re: [PATCH v3 5/5] block-coroutine-wrapper: use qemu_get_current_aio_context()
  2023-09-12 23:10 ` [PATCH v3 5/5] block-coroutine-wrapper: use qemu_get_current_aio_context() Stefan Hajnoczi
@ 2023-09-13 16:23   ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2023-09-13 16:23 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Klaus Jensen, kwolf, Cleber Rosa, John Snow,
	Hanna Reitz, Philippe Mathieu-Daudé, Keith Busch,
	Paolo Bonzini, Peter Xu, qemu-block, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng

On Tue, Sep 12, 2023 at 07:10:37PM -0400, Stefan Hajnoczi wrote:
> Use qemu_get_current_aio_context() in mixed wrappers and coroutine
> wrappers so that code runs in the caller's AioContext instead of moving
> to the BlockDriverState's AioContext. This change is necessary for the
> multi-queue block layer where any thread can call into the block layer.
> 
> Most wrappers are IO_CODE where it's safe to use the current AioContext
> nowadays. BlockDrivers and the core block layer use their own locks and
> no longer depend on the AioContext lock for thread-safety.
> 
> The bdrv_create() wrapper invokes GLOBAL_STATE code. Using the current
> AioContext is safe because this code is only called with the BQL held
> from the main loop thread.
> 
> The output of qemu-iotests 051 is sensitive to event loop activity.
> Update the output because the monitor BH runs at a different time,
> causing prompts to be printed differently in the output.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  scripts/block-coroutine-wrapper.py | 6 ++----
>  tests/qemu-iotests/051.pc.out      | 4 ++--
>  2 files changed, 4 insertions(+), 6 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




* Re: [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context()
  2023-09-12 23:10 ` [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context() Stefan Hajnoczi
  2023-09-13 15:59   ` Eric Blake
@ 2023-09-14  6:44   ` Klaus Jensen
  1 sibling, 0 replies; 14+ messages in thread
From: Klaus Jensen @ 2023-09-14  6:44 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, kwolf, Cleber Rosa, John Snow, Hanna Reitz,
	Philippe Mathieu-Daudé, Keith Busch, Paolo Bonzini, Peter Xu,
	qemu-block, Vladimir Sementsov-Ogievskiy, David Hildenbrand,
	Fam Zheng


On Sep 12 19:10, Stefan Hajnoczi wrote:
> The synchronous bdrv_aio_cancel() function needs the acb's AioContext so
> it can call aio_poll() to wait for cancellation.
> 
> It turns out that all users run under the BQL in the main AioContext, so
> this callback is not needed.
> 
> Remove the callback, mark bdrv_aio_cancel() GLOBAL_STATE_CODE just like
> its blk_aio_cancel() caller, and poll the main loop AioContext.
> 
> The purpose of this cleanup is to identify bdrv_aio_cancel() as an API
> that does not work with the multi-queue block layer.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  include/block/aio.h                |  1 -
>  include/block/block-global-state.h |  2 ++
>  include/block/block-io.h           |  1 -
>  block/block-backend.c              | 17 -----------------
>  block/io.c                         | 23 ++++++++---------------
>  hw/nvme/ctrl.c                     |  7 -------
>  softmmu/dma-helpers.c              |  8 --------
>  util/thread-pool.c                 |  8 --------
>  8 files changed, 10 insertions(+), 57 deletions(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 539d273553..ee7273daa1 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -2130,11 +2130,6 @@ static inline bool nvme_is_write(NvmeRequest *req)
>             rw->opcode == NVME_CMD_WRITE_ZEROES;
>  }
>  
> -static AioContext *nvme_get_aio_context(BlockAIOCB *acb)
> -{
> -    return qemu_get_aio_context();
> -}
> -
>  static void nvme_misc_cb(void *opaque, int ret)
>  {
>      NvmeRequest *req = opaque;
> @@ -3302,7 +3297,6 @@ static void nvme_flush_cancel(BlockAIOCB *acb)
>  static const AIOCBInfo nvme_flush_aiocb_info = {
>      .aiocb_size = sizeof(NvmeFlushAIOCB),
>      .cancel_async = nvme_flush_cancel,
> -    .get_aio_context = nvme_get_aio_context,
>  };
>  
>  static void nvme_do_flush(NvmeFlushAIOCB *iocb);
> @@ -6478,7 +6472,6 @@ static void nvme_format_cancel(BlockAIOCB *aiocb)
>  static const AIOCBInfo nvme_format_aiocb_info = {
>      .aiocb_size = sizeof(NvmeFormatAIOCB),
>      .cancel_async = nvme_format_cancel,
> -    .get_aio_context = nvme_get_aio_context,
>  };
>  
>  static void nvme_format_set(NvmeNamespace *ns, uint8_t lbaf, uint8_t mset,

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>



* Re: [PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test
  2023-09-13 16:08   ` Eric Blake
@ 2023-09-14 12:01     ` Stefan Hajnoczi
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 12:01 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, Klaus Jensen, kwolf, Cleber Rosa, John Snow,
	Hanna Reitz, Philippe Mathieu-Daudé, Keith Busch,
	Paolo Bonzini, Peter Xu, qemu-block, Vladimir Sementsov-Ogievskiy,
	David Hildenbrand, Fam Zheng


On Wed, Sep 13, 2023 at 11:08:54AM -0500, Eric Blake wrote:
> On Tue, Sep 12, 2023 at 07:10:34PM -0400, Stefan Hajnoczi wrote:
> > This patch fixes a race condition in test-bdrv-drain that is difficult
> > to reproduce. test-bdrv-drain sometimes fails without an error message
> > on the block pull request sent by Kevin Wolf on Sep 4, 2023. I was able
> > to reproduce it locally and found that "block-backend: process I/O in
> > the current AioContext" (in this patch series) is the first commit where
> > it reproduces.
> > 
> > I do not know why "block-backend: process I/O in the current AioContext"
> > exposes this bug. It might be related to the fact that the test's preadv
> > request runs in the main thread instead of IOThread a after my commit.
> 
> In reading the commit message before the impacted code, my first
> thought was that you had a typo of an extra word (that is, something
> to fix by s/a //), but reading further, a better fix would be calling
> attention to the fact that you are referencing a specific named
> thread, as in s/IOThread a/IOThread A/...
> 
> > That might simply change the timing of the test.
> > 
> > Now on to the race condition in test-bdrv-drain. The main thread
> > schedules a BH in IOThread a and then drains the BDS:
> 
> ...and another spot with the same parse issue...
> 
> > 
> >   aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);
> > 
> >   /* The request is running on the IOThread a. Draining its block device
> 
> ...but here you were quoting from the existing code base, which is
> where I finally realized it was more than just your commit message.
> 
> >    * will make sure that it has completed as far as the BDS is concerned,
> >    * but the drain in this thread can continue immediately after
> >    * bdrv_dec_in_flight() and aio_ret might be assigned only slightly
> >    * later. */
> >   do_drain_begin(drain_type, bs);
> > 
> > If the BH completes before do_drain_begin() then there is nothing to
> > worry about.
> > 
> > If the BH invokes bdrv_flush() before do_drain_begin(), then
> > do_drain_begin() waits for it to complete.
> > 
> > The problematic case is when do_drain_begin() runs before the BH enters
> > bdrv_flush(). Then do_drain_begin() misses the BH and the drain
> > mechanism has failed in quiescing I/O.
> > 
> > Fix this by incrementing the in_flight counter so that do_drain_begin()
> > waits for test_iothread_main_thread_bh().
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  tests/unit/test-bdrv-drain.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
> > index ccc453c29e..67a79aa3f0 100644
> > --- a/tests/unit/test-bdrv-drain.c
> > +++ b/tests/unit/test-bdrv-drain.c
> > @@ -512,6 +512,7 @@ static void test_iothread_main_thread_bh(void *opaque)
> >       * executed during drain, otherwise this would deadlock. */
> >      aio_context_acquire(bdrv_get_aio_context(data->bs));
> >      bdrv_flush(data->bs);
> > +    bdrv_dec_in_flight(data->bs); /* incremented by test_iothread_common() */
> >      aio_context_release(bdrv_get_aio_context(data->bs));
> >  }
> >  
> > @@ -583,6 +584,13 @@ static void test_iothread_common(enum drain_type drain_type, int drain_thread)
> >              aio_context_acquire(ctx_a);
> >          }
> >  
> > +        /*
> > +         * Increment in_flight so that do_drain_begin() waits for
> > +         * test_iothread_main_thread_bh(). This prevents the race between
> > +         * test_iothread_main_thread_bh() in IOThread a and do_drain_begin() in
> > +         * this thread. test_iothread_main_thread_bh() decrements in_flight.
> > +         */
> > +        bdrv_inc_in_flight(bs);
> >          aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);
> >  
> >          /* The request is running on the IOThread a. Draining its block device
> 
> and indeed, your commit message is consistent with the current code's
> naming convention.  If you have reason to respin, a pre-req patch to
> change the case before adding more references might be nice, but I
> won't insist.
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>

Sorry about that. It is confusing.

Stefan



* Re: [PATCH v3 0/5] block-backend: process I/O in the current AioContext
  2023-09-12 23:10 [PATCH v3 0/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2023-09-12 23:10 ` [PATCH v3 5/5] block-coroutine-wrapper: use qemu_get_current_aio_context() Stefan Hajnoczi
@ 2023-09-15 14:39 ` Kevin Wolf
  5 siblings, 0 replies; 14+ messages in thread
From: Kevin Wolf @ 2023-09-15 14:39 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Klaus Jensen, Cleber Rosa, John Snow, Hanna Reitz,
	Philippe Mathieu-Daudé, Keith Busch, Paolo Bonzini, Peter Xu,
	qemu-block, Vladimir Sementsov-Ogievskiy, David Hildenbrand,
	Fam Zheng

On 13.09.2023 at 01:10, Stefan Hajnoczi wrote:
> v3
> - Add Patch 2 to fix a race condition in test-bdrv-drain. This was the CI
>   failure that bumped this patch series from Kevin's pull request.
> - Add missing 051.pc.out file. I tried qemu-system-aarch64 to see of 051.out
>   also needs to be updated, but no changes were necessary. [Kevin]
> v2
> - Add patch to remove AIOCBInfo->get_aio_context() [Kevin]
> - Add patch to use qemu_get_current_aio_context() in block-coroutine-wrapper so
>   that the wrappers use the current AioContext instead of
>   bdrv_get_aio_context().
> 
> Switch blk_aio_*() APIs over to multi-queue by using
> qemu_get_current_aio_context() instead of blk_get_aio_context(). This change
> will allow devices to process I/O in multiple IOThreads in the future.
> 
> The final patch requires my QIOChannel AioContext series to pass
> tests/qemu-iotests/check -qcow2 281 because the nbd block driver is now
> accessed from the main loop thread in addition to the IOThread:
> https://lore.kernel.org/qemu-devel/20230823234504.1387239-1-stefanha@redhat.com/T/#t
> 
> Based-on: 20230823234504.1387239-1-stefanha@redhat.com

Thanks, applied to the block branch.

Kevin



