* [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer
@ 2023-09-14 14:00 Stefan Hajnoczi
2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
` (5 more replies)
0 siblings, 6 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:00 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
Michael S. Tsirkin
The virtio-blk device will soon be able to assign virtqueues to IOThreads,
eliminating the single IOThread bottleneck. In order to do that, the I/O code
path must support running in multiple threads.
This patch series removes the AioContext lock from the virtio-blk I/O code
path, adds thread-safety where it is required, and ensures that Linux AIO and
io_uring are available regardless of which thread calls into the block driver.
With these changes virtio-blk is ready for the iothread-vq-mapping feature,
which will be introduced in the next patch series.
Based-on: 20230913200045.1024233-1-stefanha@redhat.com ("[PATCH v3 0/4] virtio-blk: use blk_io_plug_call() instead of notification BH")
Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")
Stefan Hajnoczi (4):
block/file-posix: set up Linux AIO and io_uring in the current thread
virtio-blk: add lock to protect s->rq
virtio-blk: don't lock AioContext in the completion code path
virtio-blk: don't lock AioContext in the submission code path
include/hw/virtio/virtio-blk.h | 3 +-
block/file-posix.c | 99 +++++++++++++++---------------
hw/block/virtio-blk.c | 106 +++++++++++++++------------------
3 files changed, 98 insertions(+), 110 deletions(-)
--
2.41.0
* [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread
2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
@ 2023-09-14 14:00 ` Stefan Hajnoczi
2023-09-14 15:47 ` Eric Blake
2023-09-14 14:00 ` [PATCH 2/4] virtio-blk: add lock to protect s->rq Stefan Hajnoczi
` (4 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:00 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
Michael S. Tsirkin
The file-posix block driver currently only sets up Linux AIO and
io_uring in the BDS's AioContext. In the multi-queue block layer we must
be able to submit I/O requests in AioContexts that do not have Linux AIO
and io_uring set up yet since any thread can call into the block driver.
Set up Linux AIO and io_uring for the current AioContext during request
submission. We lose the ability to return an error from
.bdrv_file_open() when Linux AIO and io_uring setup fails (e.g. due to
resource limits). Instead the user only gets warnings and we fall back
to aio=threads. This is still better than a fatal error after startup.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/file-posix.c | 99 ++++++++++++++++++++++------------------------
1 file changed, 47 insertions(+), 52 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 4757914ac0..e9dbb87c57 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -713,17 +713,11 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
#ifdef CONFIG_LINUX_AIO
/* Currently Linux does AIO only for files opened with O_DIRECT */
- if (s->use_linux_aio) {
- if (!(s->open_flags & O_DIRECT)) {
- error_setg(errp, "aio=native was specified, but it requires "
- "cache.direct=on, which was not specified.");
- ret = -EINVAL;
- goto fail;
- }
- if (!aio_setup_linux_aio(bdrv_get_aio_context(bs), errp)) {
- error_prepend(errp, "Unable to use native AIO: ");
- goto fail;
- }
+ if (s->use_linux_aio && !(s->open_flags & O_DIRECT)) {
+ error_setg(errp, "aio=native was specified, but it requires "
+ "cache.direct=on, which was not specified.");
+ ret = -EINVAL;
+ goto fail;
}
#else
if (s->use_linux_aio) {
@@ -734,14 +728,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
}
#endif /* !defined(CONFIG_LINUX_AIO) */
-#ifdef CONFIG_LINUX_IO_URING
- if (s->use_linux_io_uring) {
- if (!aio_setup_linux_io_uring(bdrv_get_aio_context(bs), errp)) {
- error_prepend(errp, "Unable to use io_uring: ");
- goto fail;
- }
- }
-#else
+#ifndef CONFIG_LINUX_IO_URING
if (s->use_linux_io_uring) {
error_setg(errp, "aio=io_uring was specified, but is not supported "
"in this build.");
@@ -2442,6 +2429,44 @@ static bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov)
return true;
}
+static inline bool raw_check_linux_io_uring(BDRVRawState *s)
+{
+ Error *local_err = NULL;
+ AioContext *ctx;
+
+ if (!s->use_linux_io_uring) {
+ return false;
+ }
+
+ ctx = qemu_get_current_aio_context();
+ if (unlikely(!aio_setup_linux_io_uring(ctx, &local_err))) {
+ error_reportf_err(local_err, "Unable to use linux io_uring, "
+ "falling back to thread pool: ");
+ s->use_linux_io_uring = false;
+ return false;
+ }
+ return true;
+}
+
+static inline bool raw_check_linux_aio(BDRVRawState *s)
+{
+ Error *local_err = NULL;
+ AioContext *ctx;
+
+ if (!s->use_linux_aio) {
+ return false;
+ }
+
+ ctx = qemu_get_current_aio_context();
+ if (unlikely(!aio_setup_linux_aio(ctx, &local_err))) {
+ error_reportf_err(local_err, "Unable to use Linux AIO, "
+ "falling back to thread pool: ");
+ s->use_linux_aio = false;
+ return false;
+ }
+ return true;
+}
+
static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov, int type)
{
@@ -2470,13 +2495,13 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
if (s->needs_alignment && !bdrv_qiov_is_aligned(bs, qiov)) {
type |= QEMU_AIO_MISALIGNED;
#ifdef CONFIG_LINUX_IO_URING
- } else if (s->use_linux_io_uring) {
+ } else if (raw_check_linux_io_uring(s)) {
assert(qiov->size == bytes);
ret = luring_co_submit(bs, s->fd, offset, qiov, type);
goto out;
#endif
#ifdef CONFIG_LINUX_AIO
- } else if (s->use_linux_aio) {
+ } else if (raw_check_linux_aio(s)) {
assert(qiov->size == bytes);
ret = laio_co_submit(s->fd, offset, qiov, type,
s->aio_max_batch);
@@ -2566,39 +2591,13 @@ static int coroutine_fn raw_co_flush_to_disk(BlockDriverState *bs)
};
#ifdef CONFIG_LINUX_IO_URING
- if (s->use_linux_io_uring) {
+ if (raw_check_linux_io_uring(s)) {
return luring_co_submit(bs, s->fd, 0, NULL, QEMU_AIO_FLUSH);
}
#endif
return raw_thread_pool_submit(handle_aiocb_flush, &acb);
}
-static void raw_aio_attach_aio_context(BlockDriverState *bs,
- AioContext *new_context)
-{
- BDRVRawState __attribute__((unused)) *s = bs->opaque;
-#ifdef CONFIG_LINUX_AIO
- if (s->use_linux_aio) {
- Error *local_err = NULL;
- if (!aio_setup_linux_aio(new_context, &local_err)) {
- error_reportf_err(local_err, "Unable to use native AIO, "
- "falling back to thread pool: ");
- s->use_linux_aio = false;
- }
- }
-#endif
-#ifdef CONFIG_LINUX_IO_URING
- if (s->use_linux_io_uring) {
- Error *local_err = NULL;
- if (!aio_setup_linux_io_uring(new_context, &local_err)) {
- error_reportf_err(local_err, "Unable to use linux io_uring, "
- "falling back to thread pool: ");
- s->use_linux_io_uring = false;
- }
- }
-#endif
-}
-
static void raw_close(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
@@ -3897,7 +3896,6 @@ BlockDriver bdrv_file = {
.bdrv_co_copy_range_from = raw_co_copy_range_from,
.bdrv_co_copy_range_to = raw_co_copy_range_to,
.bdrv_refresh_limits = raw_refresh_limits,
- .bdrv_attach_aio_context = raw_aio_attach_aio_context,
.bdrv_co_truncate = raw_co_truncate,
.bdrv_co_getlength = raw_co_getlength,
@@ -4267,7 +4265,6 @@ static BlockDriver bdrv_host_device = {
.bdrv_co_copy_range_from = raw_co_copy_range_from,
.bdrv_co_copy_range_to = raw_co_copy_range_to,
.bdrv_refresh_limits = raw_refresh_limits,
- .bdrv_attach_aio_context = raw_aio_attach_aio_context,
.bdrv_co_truncate = raw_co_truncate,
.bdrv_co_getlength = raw_co_getlength,
@@ -4403,7 +4400,6 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_co_pwritev = raw_co_pwritev,
.bdrv_co_flush_to_disk = raw_co_flush_to_disk,
.bdrv_refresh_limits = cdrom_refresh_limits,
- .bdrv_attach_aio_context = raw_aio_attach_aio_context,
.bdrv_co_truncate = raw_co_truncate,
.bdrv_co_getlength = raw_co_getlength,
@@ -4529,7 +4525,6 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_co_pwritev = raw_co_pwritev,
.bdrv_co_flush_to_disk = raw_co_flush_to_disk,
.bdrv_refresh_limits = cdrom_refresh_limits,
- .bdrv_attach_aio_context = raw_aio_attach_aio_context,
.bdrv_co_truncate = raw_co_truncate,
.bdrv_co_getlength = raw_co_getlength,
--
2.41.0
* [PATCH 2/4] virtio-blk: add lock to protect s->rq
2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
@ 2023-09-14 14:00 ` Stefan Hajnoczi
2023-09-14 16:13 ` Eric Blake
2023-09-14 14:01 ` [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path Stefan Hajnoczi
` (3 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:00 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
Michael S. Tsirkin
s->rq is accessed from IO_CODE and GLOBAL_STATE_CODE. Introduce a lock
to protect s->rq and eliminate reliance on the AioContext lock.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
include/hw/virtio/virtio-blk.h | 3 +-
hw/block/virtio-blk.c | 67 +++++++++++++++++++++++-----------
2 files changed, 47 insertions(+), 23 deletions(-)
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index dafec432ce..9881009c22 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -54,7 +54,8 @@ struct VirtIOBlockReq;
struct VirtIOBlock {
VirtIODevice parent_obj;
BlockBackend *blk;
- void *rq;
+ QemuMutex rq_lock;
+ void *rq; /* protected by rq_lock */
VirtIOBlkConf conf;
unsigned short sector_mask;
bool original_wce;
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index a1f8e15522..ee38e089bc 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -82,8 +82,11 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
/* Break the link as the next request is going to be parsed from the
* ring again. Otherwise we may end up doing a double completion! */
req->mr_next = NULL;
- req->next = s->rq;
- s->rq = req;
+
+ WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+ req->next = s->rq;
+ s->rq = req;
+ }
} else if (action == BLOCK_ERROR_ACTION_REPORT) {
virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
if (acct_failed) {
@@ -1183,10 +1186,13 @@ static void virtio_blk_dma_restart_bh(void *opaque)
{
VirtIOBlock *s = opaque;
- VirtIOBlockReq *req = s->rq;
+ VirtIOBlockReq *req;
MultiReqBuffer mrb = {};
- s->rq = NULL;
+ WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+ req = s->rq;
+ s->rq = NULL;
+ }
aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
while (req) {
@@ -1238,22 +1244,29 @@ static void virtio_blk_reset(VirtIODevice *vdev)
AioContext *ctx;
VirtIOBlockReq *req;
+ /* Dataplane has stopped... */
+ assert(!s->dataplane_started);
+
+ /* ...but requests may still be in flight. */
ctx = blk_get_aio_context(s->blk);
aio_context_acquire(ctx);
blk_drain(s->blk);
+ aio_context_release(ctx);
/* We drop queued requests after blk_drain() because blk_drain() itself can
* produce them. */
- while (s->rq) {
- req = s->rq;
- s->rq = req->next;
- virtqueue_detach_element(req->vq, &req->elem, 0);
- virtio_blk_free_request(req);
+ WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+ while (s->rq) {
+ req = s->rq;
+ s->rq = req->next;
+
+ /* No other threads can access req->vq here */
+ virtqueue_detach_element(req->vq, &req->elem, 0);
+
+ virtio_blk_free_request(req);
+ }
}
- aio_context_release(ctx);
-
- assert(!s->dataplane_started);
blk_set_enable_write_cache(s->blk, s->original_wce);
}
@@ -1443,18 +1456,22 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
{
VirtIOBlock *s = VIRTIO_BLK(vdev);
- VirtIOBlockReq *req = s->rq;
- while (req) {
- qemu_put_sbyte(f, 1);
+ WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+ VirtIOBlockReq *req = s->rq;
- if (s->conf.num_queues > 1) {
- qemu_put_be32(f, virtio_get_queue_index(req->vq));
+ while (req) {
+ qemu_put_sbyte(f, 1);
+
+ if (s->conf.num_queues > 1) {
+ qemu_put_be32(f, virtio_get_queue_index(req->vq));
+ }
+
+ qemu_put_virtqueue_element(vdev, f, &req->elem);
+ req = req->next;
}
-
- qemu_put_virtqueue_element(vdev, f, &req->elem);
- req = req->next;
}
+
qemu_put_sbyte(f, 0);
}
@@ -1480,8 +1497,11 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
req = qemu_get_virtqueue_element(vdev, f, sizeof(VirtIOBlockReq));
virtio_blk_init_request(s, virtio_get_queue(vdev, vq_idx), req);
- req->next = s->rq;
- s->rq = req;
+
+ WITH_QEMU_LOCK_GUARD(&s->rq_lock) {
+ req->next = s->rq;
+ s->rq = req;
+ }
}
return 0;
@@ -1628,6 +1648,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
s->host_features);
virtio_init(vdev, VIRTIO_ID_BLOCK, s->config_size);
+ qemu_mutex_init(&s->rq_lock);
+
s->blk = conf->conf.blk;
s->rq = NULL;
s->sector_mask = (s->conf.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
@@ -1679,6 +1701,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev)
virtio_del_queue(vdev, i);
}
qemu_coroutine_dec_pool_size(conf->num_queues * conf->queue_size / 2);
+ qemu_mutex_destroy(&s->rq_lock);
blk_ram_registrar_destroy(&s->blk_ram_registrar);
qemu_del_vm_change_state_handler(s->change);
blockdev_mark_auto_del(s->blk);
--
2.41.0
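
The locking pattern this patch applies to s->rq can be sketched outside of QEMU as follows. This is only a minimal illustration, not the patch's code: it swaps QEMU's QemuMutex and WITH_QEMU_LOCK_GUARD() for a plain pthread mutex, and DemoReq, DemoDev and the demo_* functions are hypothetical names. It shows the two operations the patch guards: pushing a failed request onto the list, and detaching the whole list under the lock so that it can be processed without holding the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for VirtIOBlockReq: a singly linked request. */
typedef struct DemoReq {
    struct DemoReq *next;
    int id;
} DemoReq;

/* Hypothetical stand-in for VirtIOBlock. */
typedef struct {
    pthread_mutex_t rq_lock;
    DemoReq *rq;               /* protected by rq_lock */
} DemoDev;

static void demo_dev_init(DemoDev *d)
{
    pthread_mutex_init(&d->rq_lock, NULL);
    d->rq = NULL;
}

static void demo_dev_destroy(DemoDev *d)
{
    pthread_mutex_destroy(&d->rq_lock);
}

/* Push a request onto the head of the list (error path, migration load). */
static void demo_queue_req(DemoDev *d, DemoReq *req)
{
    pthread_mutex_lock(&d->rq_lock);
    req->next = d->rq;
    d->rq = req;
    pthread_mutex_unlock(&d->rq_lock);
}

/*
 * Detach the whole list while holding the lock and return its head.
 * The caller then walks the detached list without the lock, which is
 * the shape of the restart path in the patch.
 */
static DemoReq *demo_take_all(DemoDev *d)
{
    DemoReq *head;

    pthread_mutex_lock(&d->rq_lock);
    head = d->rq;
    d->rq = NULL;
    pthread_mutex_unlock(&d->rq_lock);
    return head;
}
```

Detaching the list keeps the critical section short: the lock covers only the pointer swap, not the request handling that follows.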
* [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path
2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
2023-09-14 14:00 ` [PATCH 2/4] virtio-blk: add lock to protect s->rq Stefan Hajnoczi
@ 2023-09-14 14:01 ` Stefan Hajnoczi
2023-09-14 16:17 ` Eric Blake
2023-09-14 14:01 ` [PATCH 4/4] virtio-blk: don't lock AioContext in the submission " Stefan Hajnoczi
` (2 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:01 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
Michael S. Tsirkin
Nothing in the completion code path relies on the AioContext lock
anymore. Virtqueues are only accessed from one thread at any moment and
the s->rq global state is protected by its own lock now.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
hw/block/virtio-blk.c | 34 ++++------------------------------
1 file changed, 4 insertions(+), 30 deletions(-)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index ee38e089bc..f5315df042 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -105,7 +105,6 @@ static void virtio_blk_rw_complete(void *opaque, int ret)
VirtIOBlock *s = next->dev;
VirtIODevice *vdev = VIRTIO_DEVICE(s);
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
while (next) {
VirtIOBlockReq *req = next;
next = req->mr_next;
@@ -138,7 +137,6 @@ static void virtio_blk_rw_complete(void *opaque, int ret)
block_acct_done(blk_get_stats(s->blk), &req->acct);
virtio_blk_free_request(req);
}
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
}
static void virtio_blk_flush_complete(void *opaque, int ret)
@@ -146,19 +144,13 @@ static void virtio_blk_flush_complete(void *opaque, int ret)
VirtIOBlockReq *req = opaque;
VirtIOBlock *s = req->dev;
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
- if (ret) {
- if (virtio_blk_handle_rw_error(req, -ret, 0, true)) {
- goto out;
- }
+ if (ret && virtio_blk_handle_rw_error(req, -ret, 0, true)) {
+ return;
}
virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
block_acct_done(blk_get_stats(s->blk), &req->acct);
virtio_blk_free_request(req);
-
-out:
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
}
static void virtio_blk_discard_write_zeroes_complete(void *opaque, int ret)
@@ -168,11 +160,8 @@ static void virtio_blk_discard_write_zeroes_complete(void *opaque, int ret)
bool is_write_zeroes = (virtio_ldl_p(VIRTIO_DEVICE(s), &req->out.type) &
~VIRTIO_BLK_T_BARRIER) == VIRTIO_BLK_T_WRITE_ZEROES;
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
- if (ret) {
- if (virtio_blk_handle_rw_error(req, -ret, false, is_write_zeroes)) {
- goto out;
- }
+ if (ret && virtio_blk_handle_rw_error(req, -ret, false, is_write_zeroes)) {
+ return;
}
virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
@@ -180,9 +169,6 @@ static void virtio_blk_discard_write_zeroes_complete(void *opaque, int ret)
block_acct_done(blk_get_stats(s->blk), &req->acct);
}
virtio_blk_free_request(req);
-
-out:
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
}
#ifdef __linux__
@@ -229,10 +215,8 @@ static void virtio_blk_ioctl_complete(void *opaque, int status)
virtio_stl_p(vdev, &scsi->data_len, hdr->dxfer_len);
out:
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
virtio_blk_req_complete(req, status);
virtio_blk_free_request(req);
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
g_free(ioctl_req);
}
@@ -672,7 +656,6 @@ static void virtio_blk_zone_report_complete(void *opaque, int ret)
{
ZoneCmdData *data = opaque;
VirtIOBlockReq *req = data->req;
- VirtIOBlock *s = req->dev;
VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
struct iovec *in_iov = data->in_iov;
unsigned in_num = data->in_num;
@@ -763,10 +746,8 @@ static void virtio_blk_zone_report_complete(void *opaque, int ret)
}
out:
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
virtio_blk_req_complete(req, err_status);
virtio_blk_free_request(req);
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
g_free(data->zone_report_data.zones);
g_free(data);
}
@@ -829,10 +810,8 @@ static void virtio_blk_zone_mgmt_complete(void *opaque, int ret)
err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
}
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
virtio_blk_req_complete(req, err_status);
virtio_blk_free_request(req);
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
}
static int virtio_blk_handle_zone_mgmt(VirtIOBlockReq *req, BlockZoneOp op)
@@ -882,7 +861,6 @@ static void virtio_blk_zone_append_complete(void *opaque, int ret)
{
ZoneCmdData *data = opaque;
VirtIOBlockReq *req = data->req;
- VirtIOBlock *s = req->dev;
VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
int64_t append_sector, n;
uint8_t err_status = VIRTIO_BLK_S_OK;
@@ -905,10 +883,8 @@ static void virtio_blk_zone_append_complete(void *opaque, int ret)
trace_virtio_blk_zone_append_complete(vdev, req, append_sector, ret);
out:
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
virtio_blk_req_complete(req, err_status);
virtio_blk_free_request(req);
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
g_free(data);
}
@@ -944,10 +920,8 @@ static int virtio_blk_handle_zone_append(VirtIOBlockReq *req,
return 0;
out:
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
virtio_blk_req_complete(req, err_status);
virtio_blk_free_request(req);
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
return err_status;
}
--
2.41.0
* [PATCH 4/4] virtio-blk: don't lock AioContext in the submission code path
2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
` (2 preceding siblings ...)
2023-09-14 14:01 ` [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path Stefan Hajnoczi
@ 2023-09-14 14:01 ` Stefan Hajnoczi
2023-09-14 17:27 ` Eric Blake
2023-09-15 11:17 ` [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Michael S. Tsirkin
2023-12-19 14:09 ` Kevin Wolf
5 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2023-09-14 14:01 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block, Hanna Reitz,
Michael S. Tsirkin
There is no need to acquire the AioContext lock around blk_aio_*() or
blk_get_geometry() anymore. I/O plugging (defer_call()) also does not
require the AioContext lock anymore.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
hw/block/virtio-blk.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f5315df042..e110f9718b 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1111,7 +1111,6 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
MultiReqBuffer mrb = {};
bool suppress_notifications = virtio_queue_get_notification(vq);
- aio_context_acquire(blk_get_aio_context(s->blk));
defer_call_begin();
do {
@@ -1137,7 +1136,6 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
}
defer_call_end();
- aio_context_release(blk_get_aio_context(s->blk));
}
static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
@@ -1168,7 +1166,6 @@ static void virtio_blk_dma_restart_bh(void *opaque)
s->rq = NULL;
}
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
while (req) {
VirtIOBlockReq *next = req->next;
if (virtio_blk_handle_request(req, &mrb)) {
@@ -1192,8 +1189,6 @@ static void virtio_blk_dma_restart_bh(void *opaque)
/* Paired with inc in virtio_blk_dma_restart_cb() */
blk_dec_in_flight(s->conf.conf.blk);
-
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
}
static void virtio_blk_dma_restart_cb(void *opaque, bool running,
--
2.41.0
* Re: [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread
2023-09-14 14:00 ` [PATCH 1/4] block/file-posix: set up Linux AIO and io_uring in the current thread Stefan Hajnoczi
@ 2023-09-14 15:47 ` Eric Blake
0 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2023-09-14 15:47 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz,
Michael S. Tsirkin
On Thu, Sep 14, 2023 at 10:00:58AM -0400, Stefan Hajnoczi wrote:
> The file-posix block driver currently only sets up Linux AIO and
> io_uring in the BDS's AioContext. In the multi-queue block layer we must
> be able to submit I/O requests in AioContexts that do not have Linux AIO
> and io_uring set up yet since any thread can call into the block driver.
>
> Set up Linux AIO and io_uring for the current AioContext during request
> submission. We lose the ability to return an error from
> .bdrv_file_open() when Linux AIO and io_uring setup fails (e.g. due to
> resource limits). Instead the user only gets warnings and we fall back
> to aio=threads. This is still better than a fatal error after startup.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> block/file-posix.c | 99 ++++++++++++++++++++++------------------------
> 1 file changed, 47 insertions(+), 52 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 4757914ac0..e9dbb87c57 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -713,17 +713,11 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>
> #ifdef CONFIG_LINUX_AIO
> /* Currently Linux does AIO only for files opened with O_DIRECT */
> - if (s->use_linux_aio) {
> - if (!(s->open_flags & O_DIRECT)) {
> - error_setg(errp, "aio=native was specified, but it requires "
> - "cache.direct=on, which was not specified.");
> - ret = -EINVAL;
> - goto fail;
> - }
> - if (!aio_setup_linux_aio(bdrv_get_aio_context(bs), errp)) {
We were previously doing setup only once during open...
> +static inline bool raw_check_linux_io_uring(BDRVRawState *s)
> +{
> + Error *local_err = NULL;
> + AioContext *ctx;
> +
> + if (!s->use_linux_io_uring) {
> + return false;
> + }
> +
> + ctx = qemu_get_current_aio_context();
> + if (unlikely(!aio_setup_linux_io_uring(ctx, &local_err))) {
...now you're doing it on every I/O request. I had to check that setup
is idempotent; thankfully it is (once ctx has an associated linux_aio
or io_uring, setup is a no-op for subsequent calls).
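
The property being relied on here can be sketched roughly like this. It is an illustration only, not QEMU code: DemoContext, DemoState and the demo_* functions are hypothetical stand-ins for AioContext, BDRVRawState and the aio_setup_*()/raw_check_*() helpers, and the fail_setup flag merely simulates a setup failure such as hitting a resource limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an AioContext (one per thread). */
typedef struct {
    bool engine_ready;   /* set once the per-context engine exists */
    int setup_calls;     /* counts real initialisations, for illustration */
    bool fail_setup;     /* simulates a setup failure (e.g. resource limits) */
} DemoContext;

/* Hypothetical stand-in for BDRVRawState. */
typedef struct {
    bool use_native;     /* the user asked for a native AIO engine */
} DemoState;

/* Idempotent setup: real work happens only on the first successful call. */
static bool demo_setup_engine(DemoContext *ctx)
{
    if (ctx->engine_ready) {
        return true;     /* already set up for this context: no-op */
    }
    if (ctx->fail_setup) {
        return false;
    }
    ctx->setup_calls++;
    ctx->engine_ready = true;
    return true;
}

/*
 * Called on every request with the current thread's context. Returns
 * true if the native engine can be used; on setup failure the flag is
 * cleared so later requests take the thread pool path instead.
 */
static bool demo_check_engine(DemoState *s, DemoContext *ctx)
{
    if (!s->use_native) {
        return false;
    }
    if (!demo_setup_engine(ctx)) {
        s->use_native = false;
        return false;
    }
    return true;
}
```

In this sketch the per-request cost after the first call is two flag checks, and a failure in any one context disables the native engine for the whole device state, which matches the fallback behaviour described in the commit message.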
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
* Re: [PATCH 2/4] virtio-blk: add lock to protect s->rq
2023-09-14 14:00 ` [PATCH 2/4] virtio-blk: add lock to protect s->rq Stefan Hajnoczi
@ 2023-09-14 16:13 ` Eric Blake
0 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2023-09-14 16:13 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz,
Michael S. Tsirkin
On Thu, Sep 14, 2023 at 10:00:59AM -0400, Stefan Hajnoczi wrote:
> s->rq is accessed from IO_CODE and GLOBAL_STATE_CODE. Introduce a lock
> to protect s->rq and eliminate reliance on the AioContext lock.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> include/hw/virtio/virtio-blk.h | 3 +-
> hw/block/virtio-blk.c | 67 +++++++++++++++++++++++-----------
> 2 files changed, 47 insertions(+), 23 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
* Re: [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path
2023-09-14 14:01 ` [PATCH 3/4] virtio-blk: don't lock AioContext in the completion code path Stefan Hajnoczi
@ 2023-09-14 16:17 ` Eric Blake
0 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2023-09-14 16:17 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz,
Michael S. Tsirkin
On Thu, Sep 14, 2023 at 10:01:00AM -0400, Stefan Hajnoczi wrote:
> Nothing in the completion code path relies on the AioContext lock
> anymore. Virtqueues are only accessed from one thread at any moment and
> the s->rq global state is protected by its own lock now.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> hw/block/virtio-blk.c | 34 ++++------------------------------
> 1 file changed, 4 insertions(+), 30 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
* Re: [PATCH 4/4] virtio-blk: don't lock AioContext in the submission code path
2023-09-14 14:01 ` [PATCH 4/4] virtio-blk: don't lock AioContext in the submission " Stefan Hajnoczi
@ 2023-09-14 17:27 ` Eric Blake
0 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2023-09-14 17:27 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz,
Michael S. Tsirkin
On Thu, Sep 14, 2023 at 10:01:01AM -0400, Stefan Hajnoczi wrote:
> There is no need to acquire the AioContext lock around blk_aio_*() or
> blk_get_geometry() anymore. I/O plugging (defer_call()) also does not
> require the AioContext lock anymore.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> hw/block/virtio-blk.c | 5 -----
> 1 file changed, 5 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
* Re: [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer
2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
` (3 preceding siblings ...)
2023-09-14 14:01 ` [PATCH 4/4] virtio-blk: don't lock AioContext in the submission " Stefan Hajnoczi
@ 2023-09-15 11:17 ` Michael S. Tsirkin
2023-12-19 14:09 ` Kevin Wolf
5 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2023-09-15 11:17 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz
On Thu, Sep 14, 2023 at 10:00:57AM -0400, Stefan Hajnoczi wrote:
> The virtio-blk device will soon be able to assign virtqueues to IOThreads,
> eliminating the single IOThread bottleneck. In order to do that, the I/O code
> path must support running in multiple threads.
>
> This patch series removes the AioContext lock from the virtio-blk I/O code
> path, adds thread-safety where it is required, and ensures that Linux AIO and
> io_uring are available regardless of which thread calls into the block driver.
> With these changes virtio-blk is ready for the iothread-vq-mapping feature,
> which will be introduced in the next patch series.
>
> Based-on: 20230913200045.1024233-1-stefanha@redhat.com ("[PATCH v3 0/4] virtio-blk: use blk_io_plug_call() instead of notification BH")
> Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")
virtio bits:
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
feel free to merge
> Stefan Hajnoczi (4):
> block/file-posix: set up Linux AIO and io_uring in the current thread
> virtio-blk: add lock to protect s->rq
> virtio-blk: don't lock AioContext in the completion code path
> virtio-blk: don't lock AioContext in the submission code path
>
> include/hw/virtio/virtio-blk.h | 3 +-
> block/file-posix.c | 99 +++++++++++++++---------------
> hw/block/virtio-blk.c | 106 +++++++++++++++------------------
> 3 files changed, 98 insertions(+), 110 deletions(-)
>
> --
> 2.41.0
* Re: [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer
2023-09-14 14:00 [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Stefan Hajnoczi
` (4 preceding siblings ...)
2023-09-15 11:17 ` [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer Michael S. Tsirkin
@ 2023-12-19 14:09 ` Kevin Wolf
5 siblings, 0 replies; 11+ messages in thread
From: Kevin Wolf @ 2023-12-19 14:09 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block, Hanna Reitz, Michael S. Tsirkin
On 14.09.2023 at 16:00, Stefan Hajnoczi wrote:
> The virtio-blk device will soon be able to assign virtqueues to IOThreads,
> eliminating the single IOThread bottleneck. In order to do that, the I/O code
> path must support running in multiple threads.
>
> This patch series removes the AioContext lock from the virtio-blk I/O code
> path, adds thread-safety where it is required, and ensures that Linux AIO and
> io_uring are available regardless of which thread calls into the block driver.
> With these changes virtio-blk is ready for the iothread-vq-mapping feature,
> which will be introduced in the next patch series.
>
> Based-on: 20230913200045.1024233-1-stefanha@redhat.com ("[PATCH v3 0/4] virtio-blk: use blk_io_plug_call() instead of notification BH")
> Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")
>
> Stefan Hajnoczi (4):
> block/file-posix: set up Linux AIO and io_uring in the current thread
> virtio-blk: add lock to protect s->rq
> virtio-blk: don't lock AioContext in the completion code path
> virtio-blk: don't lock AioContext in the submission code path
Thanks, applied to the block branch.
Kevin