qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer
@ 2023-09-14 14:00 Stefan Hajnoczi
  2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

The virtio-blk device will soon be able to assign virtqueues to IOThreads,
eliminating the single IOThread bottleneck. In order to do that, the I/O code
path must support running in multiple threads.

This patch series removes the AioContext lock from the virtio-blk I/O code
path, adds thread-safety where it is required, and ensures that Linux AIO and
io_uring are available regardless of which thread calls into the block driver.
With these changes virtio-blk is ready for the iothread-vq-mapping feature,
which will be introduced in the next patch series.

Based-on: 20230913200045.1024233-1-stefanha@redhat.com ("[PATCH v3 0/4] virtio-blk: use blk_io_plug_call() instead of notification BH")
Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")

Stefan Hajnoczi (4):
  block/file-posix: set up Linux AIO and io_uring in the current thread
  virtio-blk: add lock to protect s->rq
  virtio-blk: don't lock AioContext in the completion code path
  virtio-blk: don't lock AioContext in the submission code path

 include/hw/virtio/virtio-blk.h |   3 +-
 block/file-posix.c             |  99 +++++++++++++++---------------
 hw/block/virtio-blk.c          | 106 +++++++++++++++------------------
 3 files changed, 98 insertions(+), 110 deletions(-)

-- 
2.41.0



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread
  2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
@ 2023-09-14 14:00 ` Stefan Hajnoczi
  2023-09-14 15:47   ` Eric Blake
  2023-09-14 14:00 ` [PATCH 2/4] virtio-blk: add lock to protect s->rq Stefan Hajnoczi
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

The file-posix block driver currently only sets up Linux AIO and
io_uring in the BDS's AioContext. In the multi-queue block layer we must
be able to submit I/O requests in AioContexts that do not have Linux AIO
and io_uring set up yet since any thread can call into the block driver.

Set up Linux AIO and io_uring for the current AioContext during request
submission. We lose the ability to return an error from
.bdrv_file_open() when Linux AIO and io_uring setup fails (e.g. due to
resource limits). Instead the user only gets warnings and we fall back
to aio=threads. This is still better than a fatal error after startup.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/file-posix.c | 99 ++++++++++++++++++++++------------------------
 1 file changed, 47 insertions(+), 52 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 4757914ac0..e9dbb87c57 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -713,17 +713,11 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
 
 #ifdef CONFIG_LINUX_AIO
      /* Currently Linux does AIO only for files opened with O_DIRECT */
-    if (s->use_linux_aio) {
-        if (!(s->open_flags & O_DIRECT)) {
-            error_setg(errp, "aio=native was specified, but it requires "
-                             "cache.direct=on, which was not specified.");
-            ret = -EINVAL;
-            goto fail;
-        }
-        if (!aio_setup_linux_aio(bdrv_get_aio_context(bs), errp)) {
-            error_prepend(errp, "Unable to use native AIO: ");
-            goto fail;
-        }
+    if (s->use_linux_aio && !(s->open_flags & O_DIRECT)) {
+        error_setg(errp, "aio=native was specified, but it requires "
+                         "cache.direct=on, which was not specified.");
+        ret = -EINVAL;
+        goto fail;
     }
 #else
     if (s->use_linux_aio) {
@@ -734,14 +728,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
     }
 #endif /* !defined(CONFIG_LINUX_AIO) */
 
-#ifdef CONFIG_LINUX_IO_URING
-    if (s->use_linux_io_uring) {
-        if (!aio_setup_linux_io_uring(bdrv_get_aio_context(bs), errp)) {
-            error_prepend(errp, "Unable to use io_uring: ");
-            goto fail;
-        }
-    }
-#else
+#ifndef CONFIG_LINUX_IO_URING
     if (s->use_linux_io_uring) {
         error_setg(errp, "aio=io_uring was specified, but is not supported "
                          "in this build.");
@@ -2442,6 +2429,44 @@ static bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov)
     return true;
 }
 
+static inline bool raw_check_linux_io_uring(BDRVRawState *s)
+{
+    Error *local_err = NULL;
+    AioContext *ctx;
+
+    if (!s->use_linux_io_uring) {
+        return false;
+    }
+
+    ctx = qemu_get_current_aio_context();
+    if (unlikely(!aio_setup_linux_io_uring(ctx, &local_err))) {
+        error_reportf_err(local_err, "Unable to use linux io_uring, "
+                                     "falling back to thread pool: ");
+        s->use_linux_io_uring = false;
+        return false;
+    }
+    return true;
+}
+
+static inline bool raw_check_linux_aio(BDRVRawState *s)
+{
+    Error *local_err = NULL;
+    AioContext *ctx;
+
+    if (!s->use_linux_aio) {
+        return false;
+    }
+
+    ctx = qemu_get_current_aio_context();
+    if (unlikely(!aio_setup_linux_aio(ctx, &local_err))) {
+        error_reportf_err(local_err, "Unable to use Linux AIO, "
+                                     "falling back to thread pool: ");
+        s->use_linux_aio = false;
+        return false;
+    }
+    return true;
+}
+
 static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
                                    uint64_t bytes, QEMUIOVector *qiov, int type)
 {
@@ -2470,13 +2495,13 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
     if (s->needs_alignment && !bdrv_qiov_is_aligned(bs, qiov)) {
         type |= QEMU_AIO_MISALIGNED;
 #ifdef CONFIG_LINUX_IO_URING
-    } else if (s->use_linux_io_uring) {
+    } else if (raw_check_linux_io_uring(s)) {
         assert(qiov->size == bytes);
         ret = luring_co_submit(bs, s->fd, offset, qiov, type);
         goto out;
 #endif
 #ifdef CONFIG_LINUX_AIO
-    } else if (s->use_linux_aio) {
+    } else if (raw_check_linux_aio(s)) {
         assert(qiov->size == bytes);
         ret = laio_co_submit(s->fd, offset, qiov, type,
                               s->aio_max_batch);
@@ -2566,39 +2591,13 @@ static int coroutine_fn raw_co_flush_to_disk(BlockDriverState *bs)
     };
 
 #ifdef CONFIG_LINUX_IO_URING
-    if (s->use_linux_io_uring) {
+    if (raw_check_linux_io_uring(s)) {
         return luring_co_submit(bs, s->fd, 0, NULL, QEMU_AIO_FLUSH);
     }
 #endif
     return raw_thread_pool_submit(handle_aiocb_flush, &acb);
 }
 
-static void raw_aio_attach_aio_context(BlockDriverState *bs,
-                                       AioContext *new_context)
-{
-    BDRVRawState __attribute__((unused)) *s = bs->opaque;
-#ifdef CONFIG_LINUX_AIO
-    if (s->use_linux_aio) {
-        Error *local_err = NULL;
-        if (!aio_setup_linux_aio(new_context, &local_err)) {
-            error_reportf_err(local_err, "Unable to use native AIO, "
-                                         "falling back to thread pool: ");
-            s->use_linux_aio = false;
-        }
-    }
-#endif
-#ifdef CONFIG_LINUX_IO_URING
-    if (s->use_linux_io_uring) {
-        Error *local_err = NULL;
-        if (!aio_setup_linux_io_uring(new_context, &local_err)) {
-            error_reportf_err(local_err, "Unable to use linux io_uring, "
-                                         "falling back to thread pool: ");
-            s->use_linux_io_uring = false;
-        }
-    }
-#endif
-}
-
 static void raw_close(BlockDriverState *bs)
 {
     BDRVRawState *s = bs->opaque;
@@ -3897,7 +3896,6 @@ BlockDriver bdrv_file = {
     .bdrv_co_copy_range_from = raw_co_copy_range_from,
     .bdrv_co_copy_range_to  = raw_co_copy_range_to,
     .bdrv_refresh_limits = raw_refresh_limits,
-    .bdrv_attach_aio_context = raw_aio_attach_aio_context,
 
     .bdrv_co_truncate                   = raw_co_truncate,
     .bdrv_co_getlength                  = raw_co_getlength,
@@ -4267,7 +4265,6 @@ static BlockDriver bdrv_host_device = {
     .bdrv_co_copy_range_from = raw_co_copy_range_from,
     .bdrv_co_copy_range_to  = raw_co_copy_range_to,
     .bdrv_refresh_limits = raw_refresh_limits,
-    .bdrv_attach_aio_context = raw_aio_attach_aio_context,
 
     .bdrv_co_truncate                   = raw_co_truncate,
     .bdrv_co_getlength                  = raw_co_getlength,
@@ -4403,7 +4400,6 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_co_pwritev        = raw_co_pwritev,
     .bdrv_co_flush_to_disk  = raw_co_flush_to_disk,
     .bdrv_refresh_limits    = cdrom_refresh_limits,
-    .bdrv_attach_aio_context = raw_aio_attach_aio_context,
 
     .bdrv_co_truncate                   = raw_co_truncate,
     .bdrv_co_getlength                  = raw_co_getlength,
@@ -4529,7 +4525,6 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_co_pwritev        = raw_co_pwritev,
     .bdrv_co_flush_to_disk  = raw_co_flush_to_disk,
     .bdrv_refresh_limits    = cdrom_refresh_limits,
-    .bdrv_attach_aio_context = raw_aio_attach_aio_context,
 
     .bdrv_co_truncate                   = raw_co_truncate,
     .bdrv_co_getlength                  = raw_co_getlength,
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] virtio-blk: add lock to protect s->rq
  2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
  2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
@ 2023-09-14 14:00 ` Stefan Hajnoczi
  2023-09-14 16:13   ` Eric Blake
  2023-09-14 14:01 ` [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path Stefan Hajnoczi
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

s->rq is accessed from IO_CODE and GLOBAL_STATE_CODE. Introduce a lock
to protect s->rq and eliminate reliance on the AioContext lock.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/hw/virtio/virtio-blk.h |  3 +-
 hw/block/virtio-blk.c          | 67 +++++++++++++++++++++++-----------
 2 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index dafec432ce..9881009c22 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -54,7 +54,8 @@ struct VirtIOBlockReq;
 struct VirtIOBlock {
     VirtIODevice parent_obj;
     BlockBackend *blk;
-    void *rq;
+    QemuMutex rq_lock;
+    void *rq; /* protected by rq_lock */
     VirtIOBlkConf conf;
     unsigned short sector_mask;
     bool original_wce;
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index a1f8e15522..ee38e089bc 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -82,8 +82,11 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
         /* Break the link as the next request is going to be parsed from the
          * ring again. Otherwise we may end up doing a double completion! */
         req->mr_next = NULL;
-        req->next = s->rq;
-        s->rq = req;
+
+        WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+            req->next = s->rq;
+            s->rq = req;
+        }
     } else if (action == BLOCK_ERROR_ACTION_REPORT) {
         virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
         if (acct_failed) {
@@ -1183,10 +1186,13 @@ static void virtio_blk_dma_restart_bh(void *opaque)
 {
     VirtIOBlock *s = opaque;
 
-    VirtIOBlockReq *req = s->rq;
+    VirtIOBlockReq *req;
     MultiReqBuffer mrb = {};
 
-    s->rq = NULL;
+    WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+        req = s->rq;
+        s->rq = NULL;
+    }
 
     aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
     while (req) {
@@ -1238,22 +1244,29 @@ static void virtio_blk_reset(VirtIODevice *vdev)
     AioContext *ctx;
     VirtIOBlockReq *req;
 
+    /* Dataplane has stopped... */
+    assert(!s->dataplane_started);
+
+    /* ...but requests may still be in flight. */
     ctx = blk_get_aio_context(s->blk);
     aio_context_acquire(ctx);
     blk_drain(s->blk);
+    aio_context_release(ctx);
 
     /* We drop queued requests after blk_drain() because blk_drain() itself can
      * produce them. */
-    while (s->rq) {
-        req = s->rq;
-        s->rq = req->next;
-        virtqueue_detach_element(req->vq, &req->elem, 0);
-        virtio_blk_free_request(req);
+    WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+        while (s->rq) {
+            req = s->rq;
+            s->rq = req->next;
+
+            /* No other threads can access req->vq here */
+            virtqueue_detach_element(req->vq, &req->elem, 0);
+
+            virtio_blk_free_request(req);
+        }
     }
 
-    aio_context_release(ctx);
-
-    assert(!s->dataplane_started);
     blk_set_enable_write_cache(s->blk, s->original_wce);
 }
 
@@ -1443,18 +1456,22 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
 static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
 {
     VirtIOBlock *s = VIRTIO_BLK(vdev);
-    VirtIOBlockReq *req = s->rq;
 
-    while (req) {
-        qemu_put_sbyte(f, 1);
+    WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+        VirtIOBlockReq *req = s->rq;
 
-        if (s->conf.num_queues > 1) {
-            qemu_put_be32(f, virtio_get_queue_index(req->vq));
+        while (req) {
+            qemu_put_sbyte(f, 1);
+
+            if (s->conf.num_queues > 1) {
+                qemu_put_be32(f, virtio_get_queue_index(req->vq));
+            }
+
+            qemu_put_virtqueue_element(vdev, f, &req->elem);
+            req = req->next;
         }
-
-        qemu_put_virtqueue_element(vdev, f, &req->elem);
-        req = req->next;
     }
+
     qemu_put_sbyte(f, 0);
 }
 
@@ -1480,8 +1497,11 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
 
         req = qemu_get_virtqueue_element(vdev, f, sizeof(VirtIOBlockReq));
         virtio_blk_init_request(s, virtio_get_queue(vdev, vq_idx), req);
-        req->next = s->rq;
-        s->rq = req;
+
+        WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+            req->next = s->rq;
+            s->rq = req;
+        }
     }
 
     return 0;
@@ -1628,6 +1648,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
                                             s->host_features);
     virtio_init(vdev, VIRTIO_ID_BLOCK, s->config_size);
 
+    qemu_mutex_init(&s->rq_lock);
+
     s->blk = conf->conf.blk;
     s->rq = NULL;
     s->sector_mask = (s->conf.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
@@ -1679,6 +1701,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev)
         virtio_del_queue(vdev, i);
     }
     qemu_coroutine_dec_pool_size(conf->num_queues * conf->queue_size / 2);
+    qemu_mutex_destroy(&s->rq_lock);
     blk_ram_registrar_destroy(&s->blk_ram_registrar);
     qemu_del_vm_change_state_handler(s->change);
     blockdev_mark_auto_del(s->blk);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path
  2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
  2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
  2023-09-14 14:00 ` [PATCH 2/4] virtio-blk: add lock to protect s->rq Stefan Hajnoczi
@ 2023-09-14 14:01 ` Stefan Hajnoczi
  2023-09-14 16:17   ` Eric Blake
  2023-09-14 14:01 ` [PATCH 4/4] virtio-blk: don't lock AioContext in the submission " Stefan Hajnoczi
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

Nothing in the completion code path relies on the AioContext lock
anymore. Virtqueues are only accessed from one thread at any moment and
the s->rq global state is protected by its own lock now.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/block/virtio-blk.c | 34 ++++------------------------------
 1 file changed, 4 insertions(+), 30 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index ee38e089bc..f5315df042 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -105,7 +105,6 @@ static void virtio_blk_rw_complete(void *opaque, int ret)
     VirtIOBlock *s = next->dev;
     VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
     while (next) {
         VirtIOBlockReq *req = next;
         next = req->mr_next;
@@ -138,7 +137,6 @@ static void virtio_blk_rw_complete(void *opaque, int ret)
         block_acct_done(blk_get_stats(s->blk), &req->acct);
         virtio_blk_free_request(req);
     }
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
 }
 
 static void virtio_blk_flush_complete(void *opaque, int ret)
@@ -146,19 +144,13 @@ static void virtio_blk_flush_complete(void *opaque, int ret)
     VirtIOBlockReq *req = opaque;
     VirtIOBlock *s = req->dev;
 
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
-    if (ret) {
-        if (virtio_blk_handle_rw_error(req, -ret, 0, true)) {
-            goto out;
-        }
+    if (ret && virtio_blk_handle_rw_error(req, -ret, 0, true)) {
+        return;
     }
 
     virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
     block_acct_done(blk_get_stats(s->blk), &req->acct);
     virtio_blk_free_request(req);
-
-out:
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
 }
 
 static void virtio_blk_discard_write_zeroes_complete(void *opaque, int ret)
@@ -168,11 +160,8 @@ static void virtio_blk_discard_write_zeroes_complete(void *opaque, int ret)
     bool is_write_zeroes = (virtio_ldl_p(VIRTIO_DEVICE(s), &req->out.type) &
                             ~VIRTIO_BLK_T_BARRIER) == VIRTIO_BLK_T_WRITE_ZEROES;
 
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
-    if (ret) {
-        if (virtio_blk_handle_rw_error(req, -ret, false, is_write_zeroes)) {
-            goto out;
-        }
+    if (ret && virtio_blk_handle_rw_error(req, -ret, false, is_write_zeroes)) {
+        return;
     }
 
     virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
@@ -180,9 +169,6 @@ static void virtio_blk_discard_write_zeroes_complete(void *opaque, int ret)
         block_acct_done(blk_get_stats(s->blk), &req->acct);
     }
     virtio_blk_free_request(req);
-
-out:
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
 }
 
 #ifdef __linux__
@@ -229,10 +215,8 @@ static void virtio_blk_ioctl_complete(void *opaque, int status)
     virtio_stl_p(vdev, &scsi->data_len, hdr->dxfer_len);
 
 out:
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
     virtio_blk_req_complete(req, status);
     virtio_blk_free_request(req);
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
     g_free(ioctl_req);
 }
 
@@ -672,7 +656,6 @@ static void virtio_blk_zone_report_complete(void *opaque, int ret)
 {
     ZoneCmdData *data = opaque;
     VirtIOBlockReq *req = data->req;
-    VirtIOBlock *s = req->dev;
     VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
     struct iovec *in_iov = data->in_iov;
     unsigned in_num = data->in_num;
@@ -763,10 +746,8 @@ static void virtio_blk_zone_report_complete(void *opaque, int ret)
     }
 
 out:
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
     virtio_blk_req_complete(req, err_status);
     virtio_blk_free_request(req);
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
     g_free(data->zone_report_data.zones);
     g_free(data);
 }
@@ -829,10 +810,8 @@ static void virtio_blk_zone_mgmt_complete(void *opaque, int ret)
         err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
     }
 
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
     virtio_blk_req_complete(req, err_status);
     virtio_blk_free_request(req);
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
 }
 
 static int virtio_blk_handle_zone_mgmt(VirtIOBlockReq *req, BlockZoneOp op)
@@ -882,7 +861,6 @@ static void virtio_blk_zone_append_complete(void *opaque, int ret)
 {
     ZoneCmdData *data = opaque;
     VirtIOBlockReq *req = data->req;
-    VirtIOBlock *s = req->dev;
     VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
     int64_t append_sector, n;
     uint8_t err_status = VIRTIO_BLK_S_OK;
@@ -905,10 +883,8 @@ static void virtio_blk_zone_append_complete(void *opaque, int ret)
     trace_virtio_blk_zone_append_complete(vdev, req, append_sector, ret);
 
 out:
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
     virtio_blk_req_complete(req, err_status);
     virtio_blk_free_request(req);
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
     g_free(data);
 }
 
@@ -944,10 +920,8 @@ static int virtio_blk_handle_zone_append(VirtIOBlockReq *req,
     return 0;
 
 out:
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
     virtio_blk_req_complete(req, err_status);
     virtio_blk_free_request(req);
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
     return err_status;
 }
 
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] virtio-blk: don't lock AioContext in the submission code path
  2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2023-09-14 14:01 ` [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path Stefan Hajnoczi
@ 2023-09-14 14:01 ` Stefan Hajnoczi
  2023-09-14 17:27   ` Eric Blake
  2023-09-15 11:17 ` [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Michael S. Tsirkin
  2023-12-19 14:09 ` Kevin Wolf
  5 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

There is no need to acquire the AioContext lock around blk_aio_*() or
blk_get_geometry() anymore. I/O plugging (defer_call()) also does not
require the AioContext lock anymore.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/block/virtio-blk.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f5315df042..e110f9718b 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1111,7 +1111,6 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
     MultiReqBuffer mrb = {};
     bool suppress_notifications = virtio_queue_get_notification(vq);
 
-    aio_context_acquire(blk_get_aio_context(s->blk));
     defer_call_begin();
 
     do {
@@ -1137,7 +1136,6 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
     }
 
     defer_call_end();
-    aio_context_release(blk_get_aio_context(s->blk));
 }
 
 static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
@@ -1168,7 +1166,6 @@ static void virtio_blk_dma_restart_bh(void *opaque)
         s->rq = NULL;
     }
 
-    aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
     while (req) {
         VirtIOBlockReq *next = req->next;
         if (virtio_blk_handle_request(req, &mrb)) {
@@ -1192,8 +1189,6 @@ static void virtio_blk_dma_restart_bh(void *opaque)
 
     /* Paired with inc in virtio_blk_dma_restart_cb() */
     blk_dec_in_flight(s->conf.conf.blk);
-
-    aio_context_release(blk_get_aio_context(s->conf.conf.blk));
 }
 
 static void virtio_blk_dma_restart_cb(void *opaque, bool running,
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread
  2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
@ 2023-09-14 15:47   ` Eric Blake
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2023-09-14 15:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

On Thu, Sep 14, 2023 at 10:00:58AM -0400, Stefan Hajnoczi wrote:
> The file-posix block driver currently only sets up Linux AIO and
> io_uring in the BDS's AioContext. In the multi-queue block layer we must
> be able to submit I/O requests in AioContexts that do not have Linux AIO
> and io_uring set up yet since any thread can call into the block driver.
> 
> Set up Linux AIO and io_uring for the current AioContext during request
> submission. We lose the ability to return an error from
> .bdrv_file_open() when Linux AIO and io_uring setup fails (e.g. due to
> resource limits). Instead the user only gets warnings and we fall back
> to aio=threads. This is still better than a fatal error after startup.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/file-posix.c | 99 ++++++++++++++++++++++------------------------
>  1 file changed, 47 insertions(+), 52 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 4757914ac0..e9dbb87c57 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -713,17 +713,11 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>  
>  #ifdef CONFIG_LINUX_AIO
>       /* Currently Linux does AIO only for files opened with O_DIRECT */
> -    if (s->use_linux_aio) {
> -        if (!(s->open_flags & O_DIRECT)) {
> -            error_setg(errp, "aio=native was specified, but it requires "
> -                             "cache.direct=on, which was not specified.");
> -            ret = -EINVAL;
> -            goto fail;
> -        }
> -        if (!aio_setup_linux_aio(bdrv_get_aio_context(bs), errp)) {

We were previously doing setup only once during open...

> +static inline bool raw_check_linux_io_uring(BDRVRawState *s)
> +{
> +    Error *local_err = NULL;
> +    AioContext *ctx;
> +
> +    if (!s->use_linux_io_uring) {
> +        return false;
> +    }
> +
> +    ctx = qemu_get_current_aio_context();
> +    if (unlikely(!aio_setup_linux_io_uring(ctx, &local_err))) {

...now you're doing it on ever I/O request.  I had to check that setup
is idempotent; thankfully it is (once ctx has an associated linux_aio
or io_uring, setup is a no-op for subsequent calls).

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] virtio-blk: add lock to protect s->rq
  2023-09-14 14:00 ` [PATCH 2/4] virtio-blk: add lock to protect s->rq Stefan Hajnoczi
@ 2023-09-14 16:13   ` Eric Blake
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2023-09-14 16:13 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

On Thu, Sep 14, 2023 at 10:00:59AM -0400, Stefan Hajnoczi wrote:
> s->rq is accessed from IO_CODE and GLOBAL_STATE_CODE. Introduce a lock
> to protect s->rq and eliminate reliance on the AioContext lock.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  include/hw/virtio/virtio-blk.h |  3 +-
>  hw/block/virtio-blk.c          | 67 +++++++++++++++++++++++-----------
>  2 files changed, 47 insertions(+), 23 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path
  2023-09-14 14:01 ` [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path Stefan Hajnoczi
@ 2023-09-14 16:17   ` Eric Blake
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2023-09-14 16:17 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

On Thu, Sep 14, 2023 at 10:01:00AM -0400, Stefan Hajnoczi wrote:
> Nothing in the completion code path relies on the AioContext lock
> anymore. Virtqueues are only accessed from one thread at any moment and
> the s->rq global state is protected by its own lock now.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  hw/block/virtio-blk.c | 34 ++++------------------------------
>  1 file changed, 4 insertions(+), 30 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] virtio-blk: don't lock AioContext in the submission code path
  2023-09-14 14:01 ` [PATCH 4/4] virtio-blk: don't lock AioContext in the submission " Stefan Hajnoczi
@ 2023-09-14 17:27   ` Eric Blake
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2023-09-14 17:27 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz,
	Michael S. Tsirkin

On Thu, Sep 14, 2023 at 10:01:01AM -0400, Stefan Hajnoczi wrote:
> There is no need to acquire the AioContext lock around blk_aio_*() or
> blk_get_geometry() anymore. I/O plugging (defer_call()) also does not
> require the AioContext lock anymore.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  hw/block/virtio-blk.c | 5 -----
>  1 file changed, 5 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer
  2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2023-09-14 14:01 ` [PATCH 4/4] virtio-blk: don't lock AioContext in the submission " Stefan Hajnoczi
@ 2023-09-15 11:17 ` Michael S. Tsirkin
  2023-12-19 14:09 ` Kevin Wolf
  5 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2023-09-15 11:17 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz

On Thu, Sep 14, 2023 at 10:00:57AM -0400, Stefan Hajnoczi wrote:
> The virtio-blk device will soon be able to assign virtqueues to IOThreads,
> eliminating the single IOThread bottleneck. In order to do that, the I/O code
> path must support running in multiple threads.
> 
> This patch series removes the AioContext lock from the virtio-blk I/O code
> path, adds thread-safety where it is required, and ensures that Linux AIO and
> io_uring are available regardless of which thread calls into the block driver.
> With these changes virtio-blk is ready for the iothread-vq-mapping feature,
> which will be introduced in the next patch series.
> 
> Based-on: 20230913200045.1024233-1-stefanha@redhat.com ("[PATCH v3 0/4] virtio-blk: use blk_io_plug_call() instead of notification BH")
> Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")


virtio bits:

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

feel free to merge

> Stefan Hajnoczi (4):
>   block/file-posix: set up Linux AIO and io_uring in the current thread
>   virtio-blk: add lock to protect s->rq
>   virtio-blk: don't lock AioContext in the completion code path
>   virtio-blk: don't lock AioContext in the submission code path
> 
>  include/hw/virtio/virtio-blk.h |   3 +-
>  block/file-posix.c             |  99 +++++++++++++++---------------
>  hw/block/virtio-blk.c          | 106 +++++++++++++++------------------
>  3 files changed, 98 insertions(+), 110 deletions(-)
> 
> -- 
> 2.41.0



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer
  2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2023-09-15 11:17 ` [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Michael S. Tsirkin
@ 2023-12-19 14:09 ` Kevin Wolf
  5 siblings, 0 replies; 11+ messages in thread
From: Kevin Wolf @ 2023-12-19 14:09 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block, Hanna Reitz, Michael S. Tsirkin

Am 14.09.2023 um 16:00 hat Stefan Hajnoczi geschrieben:
> The virtio-blk device will soon be able to assign virtqueues to IOThreads,
> eliminating the single IOThread bottleneck. In order to do that, the I/O code
> path must support running in multiple threads.
> 
> This patch series removes the AioContext lock from the virtio-blk I/O code
> path, adds thread-safety where it is required, and ensures that Linux AIO and
> io_uring are available regardless of which thread calls into the block driver.
> With these changes virtio-blk is ready for the iothread-vq-mapping feature,
> which will be introduced in the next patch series.
> 
> Based-on: 20230913200045.1024233-1-stefanha@redhat.com ("[PATCH v3 0/4] virtio-blk: use blk_io_plug_call() instead of notification BH")
> Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")
> 
> Stefan Hajnoczi (4):
>   block/file-posix: set up Linux AIO and io_uring in the current thread
>   virtio-blk: add lock to protect s->rq
>   virtio-blk: don't lock AioContext in the completion code path
>   virtio-blk: don't lock AioContext in the submission code path

Thanks, applied to the block branch.

Kevin



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-12-19 14:11 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
2023-09-14 15:47   ` Eric Blake
2023-09-14 14:00 ` [PATCH 2/4] virtio-blk: add lock to protect s->rq Stefan Hajnoczi
2023-09-14 16:13   ` Eric Blake
2023-09-14 14:01 ` [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path Stefan Hajnoczi
2023-09-14 16:17   ` Eric Blake
2023-09-14 14:01 ` [PATCH 4/4] virtio-blk: don't lock AioContext in the submission " Stefan Hajnoczi
2023-09-14 17:27   ` Eric Blake
2023-09-15 11:17 ` [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Michael S. Tsirkin
2023-12-19 14:09 ` Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).