* [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT
@ 2025-09-28 13:29 Ming Lei
2025-09-28 13:29 ` [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec() Ming Lei
` (6 more replies)
0 siblings, 7 replies; 22+ messages in thread
From: Ming Lei @ 2025-09-28 13:29 UTC (permalink / raw)
To: Jens Axboe, linux-block
Cc: Mikulas Patocka, Zhaoyang Huang, Dave Chinner, linux-fsdevel,
Ming Lei
Hello Jens,
This patchset improves loop aio perf by using IOCB_NOWAIT to avoid queueing
aio commands to the workqueue context; it also refactors lo_rw_aio() a bit.
In my test VM, loop disk perf becomes very close to that of the backing block
device (nvme/mq virtio-scsi).
Mikulas verified that this approach improves 12-job sequential rw IO by ~5X
and, together with the loop MQ change, basically solves the reported problem:
https://lore.kernel.org/linux-block/a8e5c76a-231f-07d1-a394-847de930f638@redhat.com/
Zhaoyang Huang also mentioned that it may fix a performance issue in their
Android use case.
The loop MQ change will be posted as a standalone patch, because it requires
a losetup change.
V4:
- rebase
- re-organize and make it more readable
V3:
- add reviewed-by tag
- rename variable & improve commit log & comment on 5/5 (Christoph)
V2:
- patch style fix & cleanup (Christoph)
- fix randwrite perf regression on sparse backing file
- drop MQ change
Ming Lei (6):
loop: add helper lo_cmd_nr_bvec()
loop: add helper lo_rw_aio_prep()
loop: add lo_submit_rw_aio()
loop: move command blkcg/memcg initialization into loop_queue_work
loop: try to handle loop aio command via NOWAIT IO first
loop: add hint for handling aio via IOCB_NOWAIT
drivers/block/loop.c | 227 +++++++++++++++++++++++++++++++++++--------
1 file changed, 188 insertions(+), 39 deletions(-)
--
2.47.0
* [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec()
2025-09-28 13:29 [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT Ming Lei
@ 2025-09-28 13:29 ` Ming Lei
2025-10-03 7:04 ` Christoph Hellwig
2025-09-28 13:29 ` [PATCH V4 2/6] loop: add helper lo_rw_aio_prep() Ming Lei
` (5 subsequent siblings)
6 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2025-09-28 13:29 UTC (permalink / raw)
To: Jens Axboe, linux-block
Cc: Mikulas Patocka, Zhaoyang Huang, Dave Chinner, linux-fsdevel,
Ming Lei
Add the helper lo_cmd_nr_bvec() in preparation for refactoring lo_rw_aio().
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/loop.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 053a086d547e..af443651dff5 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -337,6 +337,19 @@ static void lo_rw_aio_complete(struct kiocb *iocb, long ret)
lo_rw_aio_do_completion(cmd);
}
+static inline unsigned lo_cmd_nr_bvec(struct loop_cmd *cmd)
+{
+ struct request *rq = blk_mq_rq_from_pdu(cmd);
+ struct req_iterator rq_iter;
+ struct bio_vec tmp;
+ int nr_bvec = 0;
+
+ rq_for_each_bvec(tmp, rq, rq_iter)
+ nr_bvec++;
+
+ return nr_bvec;
+}
+
static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
loff_t pos, int rw)
{
@@ -348,12 +361,9 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
struct file *file = lo->lo_backing_file;
struct bio_vec tmp;
unsigned int offset;
- int nr_bvec = 0;
+ int nr_bvec = lo_cmd_nr_bvec(cmd);
int ret;
- rq_for_each_bvec(tmp, rq, rq_iter)
- nr_bvec++;
-
if (rq->bio != rq->biotail) {
bvec = kmalloc_array(nr_bvec, sizeof(struct bio_vec),
--
2.47.0
* [PATCH V4 2/6] loop: add helper lo_rw_aio_prep()
2025-09-28 13:29 [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT Ming Lei
2025-09-28 13:29 ` [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec() Ming Lei
@ 2025-09-28 13:29 ` Ming Lei
2025-10-03 7:04 ` Christoph Hellwig
2025-09-28 13:29 ` [PATCH V4 3/6] loop: add lo_submit_rw_aio() Ming Lei
` (4 subsequent siblings)
6 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2025-09-28 13:29 UTC (permalink / raw)
To: Jens Axboe, linux-block
Cc: Mikulas Patocka, Zhaoyang Huang, Dave Chinner, linux-fsdevel,
Ming Lei
Add helper lo_rw_aio_prep() to make lo_rw_aio() more readable.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/loop.c | 63 ++++++++++++++++++++++++++++----------------
1 file changed, 40 insertions(+), 23 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index af443651dff5..b065892106a6 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -350,21 +350,15 @@ static inline unsigned lo_cmd_nr_bvec(struct loop_cmd *cmd)
return nr_bvec;
}
-static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
- loff_t pos, int rw)
+static int lo_rw_aio_prep(struct loop_device *lo, struct loop_cmd *cmd,
+ unsigned nr_bvec, loff_t pos)
{
- struct iov_iter iter;
- struct req_iterator rq_iter;
- struct bio_vec *bvec;
struct request *rq = blk_mq_rq_from_pdu(cmd);
- struct bio *bio = rq->bio;
- struct file *file = lo->lo_backing_file;
- struct bio_vec tmp;
- unsigned int offset;
- int nr_bvec = lo_cmd_nr_bvec(cmd);
- int ret;
if (rq->bio != rq->biotail) {
+ struct req_iterator rq_iter;
+ struct bio_vec *bvec;
+ struct bio_vec tmp;
bvec = kmalloc_array(nr_bvec, sizeof(struct bio_vec),
GFP_NOIO);
@@ -382,8 +376,42 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
*bvec = tmp;
bvec++;
}
- bvec = cmd->bvec;
+ } else {
+ cmd->bvec = NULL;
+ }
+
+ cmd->iocb.ki_pos = pos;
+ cmd->iocb.ki_filp = lo->lo_backing_file;
+ cmd->iocb.ki_ioprio = req_get_ioprio(rq);
+ if (cmd->use_aio) {
+ cmd->iocb.ki_complete = lo_rw_aio_complete;
+ cmd->iocb.ki_flags = IOCB_DIRECT;
+ } else {
+ cmd->iocb.ki_complete = NULL;
+ cmd->iocb.ki_flags = 0;
+ }
+ return 0;
+}
+
+static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
+ loff_t pos, int rw)
+{
+ struct iov_iter iter;
+ struct bio_vec *bvec;
+ struct request *rq = blk_mq_rq_from_pdu(cmd);
+ struct bio *bio = rq->bio;
+ struct file *file = lo->lo_backing_file;
+ unsigned int offset;
+ int nr_bvec = lo_cmd_nr_bvec(cmd);
+ int ret;
+
+ ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
+ if (unlikely(ret))
+ return ret;
+
+ if (cmd->bvec) {
offset = 0;
+ bvec = cmd->bvec;
} else {
/*
* Same here, this bio may be started from the middle of the
@@ -398,17 +426,6 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
iov_iter_bvec(&iter, rw, bvec, nr_bvec, blk_rq_bytes(rq));
iter.iov_offset = offset;
- cmd->iocb.ki_pos = pos;
- cmd->iocb.ki_filp = file;
- cmd->iocb.ki_ioprio = req_get_ioprio(rq);
- if (cmd->use_aio) {
- cmd->iocb.ki_complete = lo_rw_aio_complete;
- cmd->iocb.ki_flags = IOCB_DIRECT;
- } else {
- cmd->iocb.ki_complete = NULL;
- cmd->iocb.ki_flags = 0;
- }
-
if (rw == ITER_SOURCE) {
kiocb_start_write(&cmd->iocb);
ret = file->f_op->write_iter(&cmd->iocb, &iter);
--
2.47.0
* [PATCH V4 3/6] loop: add lo_submit_rw_aio()
2025-09-28 13:29 [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT Ming Lei
2025-09-28 13:29 ` [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec() Ming Lei
2025-09-28 13:29 ` [PATCH V4 2/6] loop: add helper lo_rw_aio_prep() Ming Lei
@ 2025-09-28 13:29 ` Ming Lei
2025-10-03 7:04 ` Christoph Hellwig
2025-09-28 13:29 ` [PATCH V4 4/6] loop: move command blkcg/memcg initialization into loop_queue_work Ming Lei
` (3 subsequent siblings)
6 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2025-09-28 13:29 UTC (permalink / raw)
To: Jens Axboe, linux-block
Cc: Mikulas Patocka, Zhaoyang Huang, Dave Chinner, linux-fsdevel,
Ming Lei
Add lo_submit_rw_aio() and refactor lo_rw_aio().
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/loop.c | 41 ++++++++++++++++++++++++-----------------
1 file changed, 24 insertions(+), 17 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index b065892106a6..3ab910572bd9 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -393,38 +393,32 @@ static int lo_rw_aio_prep(struct loop_device *lo, struct loop_cmd *cmd,
return 0;
}
-static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
- loff_t pos, int rw)
+static int lo_submit_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
+ int nr_bvec, int rw)
{
- struct iov_iter iter;
- struct bio_vec *bvec;
struct request *rq = blk_mq_rq_from_pdu(cmd);
- struct bio *bio = rq->bio;
struct file *file = lo->lo_backing_file;
- unsigned int offset;
- int nr_bvec = lo_cmd_nr_bvec(cmd);
+ struct iov_iter iter;
int ret;
- ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
- if (unlikely(ret))
- return ret;
-
if (cmd->bvec) {
- offset = 0;
- bvec = cmd->bvec;
+ iov_iter_bvec(&iter, rw, cmd->bvec, nr_bvec, blk_rq_bytes(rq));
+ iter.iov_offset = 0;
} else {
+ struct bio *bio = rq->bio;
+ struct bio_vec *bvec = __bvec_iter_bvec(bio->bi_io_vec,
+ bio->bi_iter);
+
/*
* Same here, this bio may be started from the middle of the
* 'bvec' because of bio splitting, so offset from the bvec
* must be passed to iov iterator
*/
- offset = bio->bi_iter.bi_bvec_done;
- bvec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
+ iov_iter_bvec(&iter, rw, bvec, nr_bvec, blk_rq_bytes(rq));
+ iter.iov_offset = bio->bi_iter.bi_bvec_done;
}
atomic_set(&cmd->ref, 2);
- iov_iter_bvec(&iter, rw, bvec, nr_bvec, blk_rq_bytes(rq));
- iter.iov_offset = offset;
if (rw == ITER_SOURCE) {
kiocb_start_write(&cmd->iocb);
@@ -433,7 +427,20 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
ret = file->f_op->read_iter(&cmd->iocb, &iter);
lo_rw_aio_do_completion(cmd);
+ return ret;
+}
+
+static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
+ loff_t pos, int rw)
+{
+ int nr_bvec = lo_cmd_nr_bvec(cmd);
+ int ret;
+
+ ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
+ if (unlikely(ret))
+ return ret;
+ ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
if (ret != -EIOCBQUEUED)
lo_rw_aio_complete(&cmd->iocb, ret);
return -EIOCBQUEUED;
--
2.47.0
* [PATCH V4 4/6] loop: move command blkcg/memcg initialization into loop_queue_work
2025-09-28 13:29 [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT Ming Lei
` (2 preceding siblings ...)
2025-09-28 13:29 ` [PATCH V4 3/6] loop: add lo_submit_rw_aio() Ming Lei
@ 2025-09-28 13:29 ` Ming Lei
2025-09-28 13:29 ` [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first Ming Lei
` (2 subsequent siblings)
6 siblings, 0 replies; 22+ messages in thread
From: Ming Lei @ 2025-09-28 13:29 UTC (permalink / raw)
To: Jens Axboe, linux-block
Cc: Mikulas Patocka, Zhaoyang Huang, Dave Chinner, linux-fsdevel,
Ming Lei, Christoph Hellwig
Move the loop command blkcg/memcg initialization into loop_queue_work(), in
preparation for handling loop IO commands via IOCB_NOWAIT.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/loop.c | 32 +++++++++++++++++---------------
1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 3ab910572bd9..99eec0a25dbc 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -829,11 +829,28 @@ static inline int queue_on_root_worker(struct cgroup_subsys_state *css)
static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd)
{
+ struct request __maybe_unused *rq = blk_mq_rq_from_pdu(cmd);
struct rb_node **node, *parent = NULL;
struct loop_worker *cur_worker, *worker = NULL;
struct work_struct *work;
struct list_head *cmd_list;
+ /* always use the first bio's css */
+ cmd->blkcg_css = NULL;
+ cmd->memcg_css = NULL;
+#ifdef CONFIG_BLK_CGROUP
+ if (rq->bio) {
+ cmd->blkcg_css = bio_blkcg_css(rq->bio);
+#ifdef CONFIG_MEMCG
+ if (cmd->blkcg_css) {
+ cmd->memcg_css =
+ cgroup_get_e_css(cmd->blkcg_css->cgroup,
+ &memory_cgrp_subsys);
+ }
+#endif
+ }
+#endif
+
spin_lock_irq(&lo->lo_work_lock);
if (queue_on_root_worker(cmd->blkcg_css))
@@ -1903,21 +1920,6 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
break;
}
- /* always use the first bio's css */
- cmd->blkcg_css = NULL;
- cmd->memcg_css = NULL;
-#ifdef CONFIG_BLK_CGROUP
- if (rq->bio) {
- cmd->blkcg_css = bio_blkcg_css(rq->bio);
-#ifdef CONFIG_MEMCG
- if (cmd->blkcg_css) {
- cmd->memcg_css =
- cgroup_get_e_css(cmd->blkcg_css->cgroup,
- &memory_cgrp_subsys);
- }
-#endif
- }
-#endif
loop_queue_work(lo, cmd);
return BLK_STS_OK;
--
2.47.0
* [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first
2025-09-28 13:29 [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT Ming Lei
` (3 preceding siblings ...)
2025-09-28 13:29 ` [PATCH V4 4/6] loop: move command blkcg/memcg initialization into loop_queue_work Ming Lei
@ 2025-09-28 13:29 ` Ming Lei
2025-09-29 6:44 ` Yu Kuai
2025-09-28 13:29 ` [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT Ming Lei
2025-09-28 18:42 ` [syzbot ci] Re: loop: improve loop aio perf by IOCB_NOWAIT syzbot ci
6 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2025-09-28 13:29 UTC (permalink / raw)
To: Jens Axboe, linux-block
Cc: Mikulas Patocka, Zhaoyang Huang, Dave Chinner, linux-fsdevel,
Ming Lei
Try to handle the loop aio command via NOWAIT IO first, so that we can avoid
queueing the aio command to the workqueue. This is usually a big win when the
FS block mapping is stable; Mikulas verified [1] that this approach improves
IO perf by close to 5X in a 12-job sequential read/write test, in which the
FS block mapping is indeed stable.
Fall back to the workqueue in case of -EAGAIN. This may bring a little cost
from the first retry, and when running the following write test over a loop
device backed by a sparse file, the effect on randwrite is obvious:
```
truncate -s 4G 1.img #1.img is created on XFS/virtio-scsi
losetup -f 1.img --direct-io=on
fio --direct=1 --bs=4k --runtime=40 --time_based --numjobs=1 --ioengine=libaio \
--iodepth=16 --group_reporting=1 --filename=/dev/loop0 -name=job --rw=$RW
```
- RW=randwrite: obvious IOPS drop observed
- RW=write: a small drop (5% - 10%)
This perf drop on randwrite over a sparse file will be addressed in the
following patch.
BLK_MQ_F_BLOCKING has to be set because calling into .read_iter() or
.write_iter() might sleep even with NOWAIT; the only effect is that the RCU
read lock is replaced with an SRCU read lock.
Link: https://lore.kernel.org/linux-block/a8e5c76a-231f-07d1-a394-847de930f638@redhat.com/ [1]
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/loop.c | 62 ++++++++++++++++++++++++++++++++++++++++----
1 file changed, 57 insertions(+), 5 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 99eec0a25dbc..57e33553695b 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -90,6 +90,8 @@ struct loop_cmd {
#define LOOP_IDLE_WORKER_TIMEOUT (60 * HZ)
#define LOOP_DEFAULT_HW_Q_DEPTH 128
+static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd);
+
static DEFINE_IDR(loop_index_idr);
static DEFINE_MUTEX(loop_ctl_mutex);
static DEFINE_MUTEX(loop_validate_mutex);
@@ -321,6 +323,15 @@ static void lo_rw_aio_do_completion(struct loop_cmd *cmd)
if (!atomic_dec_and_test(&cmd->ref))
return;
+
+ /* -EAGAIN could be returned from bdev's ->ki_complete */
+ if (cmd->ret == -EAGAIN) {
+ struct loop_device *lo = rq->q->queuedata;
+
+ loop_queue_work(lo, cmd);
+ return;
+ }
+
kfree(cmd->bvec);
cmd->bvec = NULL;
if (req_op(rq) == REQ_OP_WRITE)
@@ -436,16 +447,40 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
int nr_bvec = lo_cmd_nr_bvec(cmd);
int ret;
- ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
- if (unlikely(ret))
- return ret;
+ /* prepared already for aio from nowait code path */
+ if (!cmd->use_aio) {
+ ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
+ if (unlikely(ret))
+ goto fail;
+ }
+ cmd->iocb.ki_flags &= ~IOCB_NOWAIT;
ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
+fail:
if (ret != -EIOCBQUEUED)
lo_rw_aio_complete(&cmd->iocb, ret);
return -EIOCBQUEUED;
}
+static int lo_rw_aio_nowait(struct loop_device *lo, struct loop_cmd *cmd,
+ int rw)
+{
+ struct request *rq = blk_mq_rq_from_pdu(cmd);
+ loff_t pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
+ int nr_bvec = lo_cmd_nr_bvec(cmd);
+ int ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
+
+ if (unlikely(ret))
+ goto fail;
+
+ cmd->iocb.ki_flags |= IOCB_NOWAIT;
+ ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
+fail:
+ if (ret != -EIOCBQUEUED && ret != -EAGAIN)
+ lo_rw_aio_complete(&cmd->iocb, ret);
+ return ret;
+}
+
static int do_req_filebacked(struct loop_device *lo, struct request *rq)
{
struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
@@ -1903,6 +1938,7 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
struct request *rq = bd->rq;
struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
struct loop_device *lo = rq->q->queuedata;
+ int rw = 0;
blk_mq_start_request(rq);
@@ -1915,9 +1951,24 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
case REQ_OP_WRITE_ZEROES:
cmd->use_aio = false;
break;
- default:
+ case REQ_OP_READ:
+ rw = ITER_DEST;
+ cmd->use_aio = lo->lo_flags & LO_FLAGS_DIRECT_IO;
+ break;
+ case REQ_OP_WRITE:
+ rw = ITER_SOURCE;
cmd->use_aio = lo->lo_flags & LO_FLAGS_DIRECT_IO;
break;
+ default:
+ return BLK_STS_IOERR;
+ }
+
+ if (cmd->use_aio) {
+ int res = lo_rw_aio_nowait(lo, cmd, rw);
+
+ if (res != -EAGAIN)
+ return BLK_STS_OK;
+ /* fallback to workqueue for handling aio */
}
loop_queue_work(lo, cmd);
@@ -2069,7 +2120,8 @@ static int loop_add(int i)
lo->tag_set.queue_depth = hw_queue_depth;
lo->tag_set.numa_node = NUMA_NO_NODE;
lo->tag_set.cmd_size = sizeof(struct loop_cmd);
- lo->tag_set.flags = BLK_MQ_F_STACKING | BLK_MQ_F_NO_SCHED_BY_DEFAULT;
+ lo->tag_set.flags = BLK_MQ_F_STACKING | BLK_MQ_F_NO_SCHED_BY_DEFAULT |
+ BLK_MQ_F_BLOCKING;
lo->tag_set.driver_data = lo;
err = blk_mq_alloc_tag_set(&lo->tag_set);
--
2.47.0
* [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-09-28 13:29 [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT Ming Lei
` (4 preceding siblings ...)
2025-09-28 13:29 ` [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first Ming Lei
@ 2025-09-28 13:29 ` Ming Lei
2025-10-03 7:06 ` Christoph Hellwig
2025-09-28 18:42 ` [syzbot ci] Re: loop: improve loop aio perf by IOCB_NOWAIT syzbot ci
6 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2025-09-28 13:29 UTC (permalink / raw)
To: Jens Axboe, linux-block
Cc: Mikulas Patocka, Zhaoyang Huang, Dave Chinner, linux-fsdevel,
Ming Lei
Add a hint for deciding when to use IOCB_NOWAIT to handle a loop aio command,
to avoid causing a write (especially randwrite) perf regression on a sparse
backing file.
Try IOCB_NOWAIT in the following situations:
- the backing file is a block device
OR
- the aio command is a READ
OR
- there aren't any queued blocking async WRITEs, because then NOWAIT won't
contend with a blocking WRITE, which often implies an exclusive lock
With this simple policy, the randwrite/write perf regression on a sparse
backing file is fixed.
Link: https://lore.kernel.org/dm-devel/7d6ae2c9-df8e-50d0-7ad6-b787cb3cfab4@redhat.com/
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/loop.c | 61 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 57e33553695b..911262b648ce 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -68,6 +68,7 @@ struct loop_device {
struct rb_root worker_tree;
struct timer_list timer;
bool sysfs_inited;
+ unsigned lo_nr_blocking_writes;
struct request_queue *lo_queue;
struct blk_mq_tag_set tag_set;
@@ -462,6 +463,33 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
return -EIOCBQUEUED;
}
+static inline bool lo_aio_try_nowait(struct loop_device *lo,
+ struct loop_cmd *cmd)
+{
+ struct file *file = lo->lo_backing_file;
+ struct inode *inode = file->f_mapping->host;
+ struct request *rq = blk_mq_rq_from_pdu(cmd);
+
+ /* NOWAIT works fine for backing block device */
+ if (S_ISBLK(inode->i_mode))
+ return true;
+
+ /*
+ * NOWAIT is supposed to be fine for READ without contending with
+ * blocking WRITE
+ */
+ if (req_op(rq) == REQ_OP_READ)
+ return true;
+
+ /*
+ * If there is any queued non-NOWAIT async WRITE , don't try new
+ * NOWAIT WRITE for avoiding contention
+ *
+ * Here we focus on handling stable FS block mapping via NOWAIT
+ */
+ return READ_ONCE(lo->lo_nr_blocking_writes) == 0;
+}
+
static int lo_rw_aio_nowait(struct loop_device *lo, struct loop_cmd *cmd,
int rw)
{
@@ -473,6 +501,9 @@ static int lo_rw_aio_nowait(struct loop_device *lo, struct loop_cmd *cmd,
if (unlikely(ret))
goto fail;
+ if (!lo_aio_try_nowait(lo, cmd))
+ return -EAGAIN;
+
cmd->iocb.ki_flags |= IOCB_NOWAIT;
ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
fail:
@@ -773,12 +804,19 @@ static ssize_t loop_attr_dio_show(struct loop_device *lo, char *buf)
return sysfs_emit(buf, "%s\n", dio ? "1" : "0");
}
+static ssize_t loop_attr_nr_blocking_writes_show(struct loop_device *lo,
+ char *buf)
+{
+ return sysfs_emit(buf, "%u\n", lo->lo_nr_blocking_writes);
+}
+
LOOP_ATTR_RO(backing_file);
LOOP_ATTR_RO(offset);
LOOP_ATTR_RO(sizelimit);
LOOP_ATTR_RO(autoclear);
LOOP_ATTR_RO(partscan);
LOOP_ATTR_RO(dio);
+LOOP_ATTR_RO(nr_blocking_writes);
static struct attribute *loop_attrs[] = {
&loop_attr_backing_file.attr,
@@ -787,6 +825,7 @@ static struct attribute *loop_attrs[] = {
&loop_attr_autoclear.attr,
&loop_attr_partscan.attr,
&loop_attr_dio.attr,
+ &loop_attr_nr_blocking_writes.attr,
NULL,
};
@@ -862,6 +901,24 @@ static inline int queue_on_root_worker(struct cgroup_subsys_state *css)
}
#endif
+static inline void loop_inc_blocking_writes(struct loop_device *lo,
+ struct loop_cmd *cmd)
+{
+ lockdep_assert_held(&lo->lo_mutex);
+
+ if (req_op(blk_mq_rq_from_pdu(cmd)) == REQ_OP_WRITE)
+ lo->lo_nr_blocking_writes += 1;
+}
+
+static inline void loop_dec_blocking_writes(struct loop_device *lo,
+ struct loop_cmd *cmd)
+{
+ lockdep_assert_held(&lo->lo_mutex);
+
+ if (req_op(blk_mq_rq_from_pdu(cmd)) == REQ_OP_WRITE)
+ lo->lo_nr_blocking_writes -= 1;
+}
+
static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd)
{
struct request __maybe_unused *rq = blk_mq_rq_from_pdu(cmd);
@@ -944,6 +1001,8 @@ static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd)
work = &lo->rootcg_work;
cmd_list = &lo->rootcg_cmd_list;
}
+ if (cmd->use_aio)
+ loop_inc_blocking_writes(lo, cmd);
list_add_tail(&cmd->list_entry, cmd_list);
queue_work(lo->workqueue, work);
spin_unlock_irq(&lo->lo_work_lock);
@@ -2042,6 +2101,8 @@ static void loop_process_work(struct loop_worker *worker,
cond_resched();
spin_lock_irq(&lo->lo_work_lock);
+ if (cmd->use_aio)
+ loop_dec_blocking_writes(lo, cmd);
}
/*
--
2.47.0
* [syzbot ci] Re: loop: improve loop aio perf by IOCB_NOWAIT
2025-09-28 13:29 [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT Ming Lei
` (5 preceding siblings ...)
2025-09-28 13:29 ` [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT Ming Lei
@ 2025-09-28 18:42 ` syzbot ci
2025-09-29 1:13 ` Ming Lei
6 siblings, 1 reply; 22+ messages in thread
From: syzbot ci @ 2025-09-28 18:42 UTC (permalink / raw)
To: axboe, dchinner, hch, linux-block, linux-fsdevel, ming.lei,
mpatocka, zhaoyang.huang
Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v1] loop: improve loop aio perf by IOCB_NOWAIT
https://lore.kernel.org/all/20250928132927.3672537-1-ming.lei@redhat.com
* [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec()
* [PATCH V4 2/6] loop: add helper lo_rw_aio_prep()
* [PATCH V4 3/6] loop: add lo_submit_rw_aio()
* [PATCH V4 4/6] loop: move command blkcg/memcg initialization into loop_queue_work
* [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first
* [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
and found the following issue:
WARNING in lo_submit_rw_aio
Full report is available here:
https://ci.syzbot.org/series/0ffdb6b4-a5fe-48da-9473-d2a926e780bd
***
WARNING in lo_submit_rw_aio
tree: torvalds
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
base: 07e27ad16399afcd693be20211b0dfae63e0615f
arch: amd64
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config: https://ci.syzbot.org/builds/3aba003b-2400-4e88-9a31-c09ab4e41a84/config
C repro: https://ci.syzbot.org/findings/dc97454c-d87b-41f5-a44a-7182e666cfd5/c_repro
syz repro: https://ci.syzbot.org/findings/dc97454c-d87b-41f5-a44a-7182e666cfd5/syz_repro
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5958 at drivers/block/loop.c:907 loop_inc_blocking_writes drivers/block/loop.c:907 [inline]
WARNING: CPU: 0 PID: 5958 at drivers/block/loop.c:907 loop_queue_work+0xb3b/0xc30 drivers/block/loop.c:1005
Modules linked in:
CPU: 0 UID: 0 PID: 5958 Comm: udevd Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:loop_inc_blocking_writes drivers/block/loop.c:907 [inline]
RIP: 0010:loop_queue_work+0xb3b/0xc30 drivers/block/loop.c:1005
Code: 33 bf 08 00 00 00 4c 89 ea e8 c1 89 7e fb 4c 89 f7 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d e9 cb 87 71 05 e8 26 36 b4 fb 90 <0f> 0b 90 e9 4e fe ff ff e8 18 36 b4 fb 48 83 c5 18 48 89 e8 48 c1
RSP: 0018:ffffc9000340ef38 EFLAGS: 00010093
RAX: ffffffff860b776a RBX: ffff88802187c000 RCX: ffff88810cee3980
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffff52000681dc4 R12: ffff88802187c158
R13: ffff88802187c110 R14: ffff888021989460 R15: ffff888021989418
FS: 00007f649c5a4c80(0000) GS:ffff8880b8612000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056520d53f000 CR3: 0000000109e48000 CR4: 00000000000006f0
Call Trace:
<TASK>
lo_submit_rw_aio+0x493/0x620 drivers/block/loop.c:441
lo_rw_aio_nowait drivers/block/loop.c:508 [inline]
loop_queue_rq+0x64d/0x840 drivers/block/loop.c:2026
__blk_mq_issue_directly block/blk-mq.c:2695 [inline]
blk_mq_request_issue_directly+0x3c1/0x710 block/blk-mq.c:2782
blk_mq_issue_direct+0x2a0/0x660 block/blk-mq.c:2803
blk_mq_dispatch_queue_requests+0x621/0x800 block/blk-mq.c:2878
blk_mq_flush_plug_list+0x432/0x550 block/blk-mq.c:2961
__blk_flush_plug+0x3d3/0x4b0 block/blk-core.c:1220
blk_finish_plug+0x5e/0x90 block/blk-core.c:1247
read_pages+0x3b2/0x580 mm/readahead.c:173
page_cache_ra_unbounded+0x6b0/0x7b0 mm/readahead.c:297
do_page_cache_ra mm/readahead.c:327 [inline]
force_page_cache_ra mm/readahead.c:356 [inline]
page_cache_sync_ra+0x3b9/0xb10 mm/readahead.c:572
filemap_get_pages+0x43c/0x1ea0 mm/filemap.c:2603
filemap_read+0x3f6/0x11a0 mm/filemap.c:2712
blkdev_read_iter+0x30a/0x440 block/fops.c:852
new_sync_read fs/read_write.c:491 [inline]
vfs_read+0x55a/0xa30 fs/read_write.c:572
ksys_read+0x145/0x250 fs/read_write.c:715
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f649c116b6a
Code: 00 3d 00 00 41 00 75 0d 50 48 8d 3d 2d 08 0a 00 e8 ea 7d 01 00 31 c0 e9 07 ff ff ff 64 8b 04 25 18 00 00 00 85 c0 75 1b 0f 05 <48> 3d 00 f0 ff ff 76 6c 48 8b 15 8f a2 0d 00 f7 d8 64 89 02 48 83
RSP: 002b:00007ffd6597a888 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000040000 RCX: 00007f649c116b6a
RDX: 0000000000040000 RSI: 000056520d500438 RDI: 0000000000000009
RBP: 0000000000040000 R08: 000056520d500410 R09: 0000000000000010
R10: 0000000000004011 R11: 0000000000000246 R12: 000056520d500410
R13: 000056520d500428 R14: 000056520d3cf9c8 R15: 000056520d3cf970
</TASK>
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
* Re: [syzbot ci] Re: loop: improve loop aio perf by IOCB_NOWAIT
2025-09-28 18:42 ` [syzbot ci] Re: loop: improve loop aio perf by IOCB_NOWAIT syzbot ci
@ 2025-09-29 1:13 ` Ming Lei
0 siblings, 0 replies; 22+ messages in thread
From: Ming Lei @ 2025-09-29 1:13 UTC (permalink / raw)
To: syzbot ci
Cc: axboe, dchinner, hch, linux-block, linux-fsdevel, mpatocka,
zhaoyang.huang, syzbot, syzkaller-bugs
On Sun, Sep 28, 2025 at 11:42:20AM -0700, syzbot ci wrote:
> syzbot ci has tested the following series
>
> [v1] loop: improve loop aio perf by IOCB_NOWAIT
> https://lore.kernel.org/all/20250928132927.3672537-1-ming.lei@redhat.com
> * [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec()
> * [PATCH V4 2/6] loop: add helper lo_rw_aio_prep()
> * [PATCH V4 3/6] loop: add lo_submit_rw_aio()
> * [PATCH V4 4/6] loop: move command blkcg/memcg initialization into loop_queue_work
> * [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first
> * [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
>
> and found the following issue:
> WARNING in lo_submit_rw_aio
>
> Full report is available here:
> https://ci.syzbot.org/series/0ffdb6b4-a5fe-48da-9473-d2a926e780bd
>
> ***
>
> WARNING in lo_submit_rw_aio
>
> tree: torvalds
> URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
> base: 07e27ad16399afcd693be20211b0dfae63e0615f
> arch: amd64
> compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
> config: https://ci.syzbot.org/builds/3aba003b-2400-4e88-9a31-c09ab4e41a84/config
> C repro: https://ci.syzbot.org/findings/dc97454c-d87b-41f5-a44a-7182e666cfd5/c_repro
> syz repro: https://ci.syzbot.org/findings/dc97454c-d87b-41f5-a44a-7182e666cfd5/syz_repro
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 5958 at drivers/block/loop.c:907 loop_inc_blocking_writes drivers/block/loop.c:907 [inline]
Thanks for your report!
It looks like the wrong lock is asserted, and the following change can fix it:
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 911262b648ce..f3372bf35fd5 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -904,7 +904,7 @@ static inline int queue_on_root_worker(struct cgroup_subsys_state *css)
static inline void loop_inc_blocking_writes(struct loop_device *lo,
struct loop_cmd *cmd)
{
- lockdep_assert_held(&lo->lo_mutex);
+ lockdep_assert_held(&lo->lo_work_lock);
if (req_op(blk_mq_rq_from_pdu(cmd)) == REQ_OP_WRITE)
lo->lo_nr_blocking_writes += 1;
@@ -913,7 +913,7 @@ static inline void loop_inc_blocking_writes(struct loop_device *lo,
static inline void loop_dec_blocking_writes(struct loop_device *lo,
struct loop_cmd *cmd)
{
- lockdep_assert_held(&lo->lo_mutex);
+ lockdep_assert_held(&lo->lo_work_lock);
if (req_op(blk_mq_rq_from_pdu(cmd)) == REQ_OP_WRITE)
lo->lo_nr_blocking_writes -= 1;
Thanks,
Ming
* Re: [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first
2025-09-28 13:29 ` [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first Ming Lei
@ 2025-09-29 6:44 ` Yu Kuai
2025-09-29 9:18 ` Ming Lei
0 siblings, 1 reply; 22+ messages in thread
From: Yu Kuai @ 2025-09-29 6:44 UTC (permalink / raw)
To: Ming Lei, Jens Axboe, linux-block
Cc: Mikulas Patocka, Zhaoyang Huang, Dave Chinner, linux-fsdevel,
yukuai (C)
Hi,
On 2025/09/28 21:29, Ming Lei wrote:
> Try to handle loop aio command via NOWAIT IO first, then we can avoid to
> queue the aio command into workqueue. This is usually one big win in
> case that FS block mapping is stable, Mikulas verified [1] that this way
> improves IO perf by close to 5X in 12jobs sequential read/write test,
> in which FS block mapping is just stable.
>
> Fallback to workqueue in case of -EAGAIN. This way may bring a little
> cost from the 1st retry, but when running the following write test over
> loop/sparse_file, the actual effect on randwrite is obvious:
>
> ```
> truncate -s 4G 1.img #1.img is created on XFS/virtio-scsi
> losetup -f 1.img --direct-io=on
> fio --direct=1 --bs=4k --runtime=40 --time_based --numjobs=1 --ioengine=libaio \
> --iodepth=16 --group_reporting=1 --filename=/dev/loop0 -name=job --rw=$RW
> ```
>
> - RW=randwrite: obvious IOPS drop observed
> - RW=write: a little drop(%5 - 10%)
>
> This perf drop on randwrite over sparse file will be addressed in the
> following patch.
>
> BLK_MQ_F_BLOCKING has to be set for calling into .read_iter() or .write_iter()
> which might sleep even though it is NOWAIT, and the only effect is that rcu read
> lock is replaced with srcu read lock.
>
> Link: https://lore.kernel.org/linux-block/a8e5c76a-231f-07d1-a394-847de930f638@redhat.com/ [1]
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> drivers/block/loop.c | 62 ++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 57 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 99eec0a25dbc..57e33553695b 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -90,6 +90,8 @@ struct loop_cmd {
> #define LOOP_IDLE_WORKER_TIMEOUT (60 * HZ)
> #define LOOP_DEFAULT_HW_Q_DEPTH 128
>
> +static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd);
> +
> static DEFINE_IDR(loop_index_idr);
> static DEFINE_MUTEX(loop_ctl_mutex);
> static DEFINE_MUTEX(loop_validate_mutex);
> @@ -321,6 +323,15 @@ static void lo_rw_aio_do_completion(struct loop_cmd *cmd)
>
> if (!atomic_dec_and_test(&cmd->ref))
> return;
> +
> + /* -EAGAIN could be returned from bdev's ->ki_complete */
> + if (cmd->ret == -EAGAIN) {
> + struct loop_device *lo = rq->q->queuedata;
> +
> + loop_queue_work(lo, cmd);
> + return;
> + }
> +
> kfree(cmd->bvec);
> cmd->bvec = NULL;
> if (req_op(rq) == REQ_OP_WRITE)
> @@ -436,16 +447,40 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
> int nr_bvec = lo_cmd_nr_bvec(cmd);
> int ret;
>
> - ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> - if (unlikely(ret))
> - return ret;
> + /* prepared already for aio from nowait code path */
> + if (!cmd->use_aio) {
> + ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> + if (unlikely(ret))
> + goto fail;
> + }
>
> + cmd->iocb.ki_flags &= ~IOCB_NOWAIT;
> ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
> +fail:
> if (ret != -EIOCBQUEUED)
> lo_rw_aio_complete(&cmd->iocb, ret);
> return -EIOCBQUEUED;
> }
>
> +static int lo_rw_aio_nowait(struct loop_device *lo, struct loop_cmd *cmd,
> + int rw)
> +{
> + struct request *rq = blk_mq_rq_from_pdu(cmd);
> + loff_t pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
> + int nr_bvec = lo_cmd_nr_bvec(cmd);
> + int ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> +
> + if (unlikely(ret))
> + goto fail;
> +
> + cmd->iocb.ki_flags |= IOCB_NOWAIT;
> + ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
Should you also check whether the backing device/file supports nowait?
Otherwise the bio will fail with BLK_STS_NOTSUPP from submit_bio_noacct().
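For reference, the check in submit_bio_noacct() is roughly the following
(quoting from memory, please double-check the exact code):
	if ((bio->bi_opf & REQ_NOWAIT) && !bdev_nowait(bio->bi_bdev))
		goto not_supported;	/* ends up as BLK_STS_NOTSUPP */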
Thanks,
Kuai
> +fail:
> + if (ret != -EIOCBQUEUED && ret != -EAGAIN)
> + lo_rw_aio_complete(&cmd->iocb, ret);
> + return ret;
> +}
> +
> static int do_req_filebacked(struct loop_device *lo, struct request *rq)
> {
> struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
> @@ -1903,6 +1938,7 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
> struct request *rq = bd->rq;
> struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
> struct loop_device *lo = rq->q->queuedata;
> + int rw = 0;
>
> blk_mq_start_request(rq);
>
> @@ -1915,9 +1951,24 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
> case REQ_OP_WRITE_ZEROES:
> cmd->use_aio = false;
> break;
> - default:
> + case REQ_OP_READ:
> + rw = ITER_DEST;
> + cmd->use_aio = lo->lo_flags & LO_FLAGS_DIRECT_IO;
> + break;
> + case REQ_OP_WRITE:
> + rw = ITER_SOURCE;
> cmd->use_aio = lo->lo_flags & LO_FLAGS_DIRECT_IO;
> break;
> + default:
> + return BLK_STS_IOERR;
> + }
> +
> + if (cmd->use_aio) {
> + int res = lo_rw_aio_nowait(lo, cmd, rw);
> +
> + if (res != -EAGAIN)
> + return BLK_STS_OK;
> + /* fallback to workqueue for handling aio */
> }
>
> loop_queue_work(lo, cmd);
> @@ -2069,7 +2120,8 @@ static int loop_add(int i)
> lo->tag_set.queue_depth = hw_queue_depth;
> lo->tag_set.numa_node = NUMA_NO_NODE;
> lo->tag_set.cmd_size = sizeof(struct loop_cmd);
> - lo->tag_set.flags = BLK_MQ_F_STACKING | BLK_MQ_F_NO_SCHED_BY_DEFAULT;
> + lo->tag_set.flags = BLK_MQ_F_STACKING | BLK_MQ_F_NO_SCHED_BY_DEFAULT |
> + BLK_MQ_F_BLOCKING;
> lo->tag_set.driver_data = lo;
>
> err = blk_mq_alloc_tag_set(&lo->tag_set);
>
* Re: [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first
2025-09-29 6:44 ` Yu Kuai
@ 2025-09-29 9:18 ` Ming Lei
0 siblings, 0 replies; 22+ messages in thread
From: Ming Lei @ 2025-09-29 9:18 UTC (permalink / raw)
To: Yu Kuai
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel, yukuai (C)
On Mon, Sep 29, 2025 at 02:44:53PM +0800, Yu Kuai wrote:
> Hi,
>
> On 2025/09/28 21:29, Ming Lei wrote:
> > Try to handle loop aio command via NOWAIT IO first, then we can avoid to
> > queue the aio command into workqueue. This is usually one big win in
> > case that FS block mapping is stable, Mikulas verified [1] that this way
> > improves IO perf by close to 5X in 12jobs sequential read/write test,
> > in which FS block mapping is just stable.
> >
> > Fallback to workqueue in case of -EAGAIN. This way may bring a little
> > cost from the 1st retry, but when running the following write test over
> > loop/sparse_file, the actual effect on randwrite is obvious:
> >
> > ```
> > truncate -s 4G 1.img #1.img is created on XFS/virtio-scsi
> > losetup -f 1.img --direct-io=on
> > fio --direct=1 --bs=4k --runtime=40 --time_based --numjobs=1 --ioengine=libaio \
> > --iodepth=16 --group_reporting=1 --filename=/dev/loop0 -name=job --rw=$RW
> > ```
> >
> > - RW=randwrite: obvious IOPS drop observed
> > - RW=write: a little drop(%5 - 10%)
> >
> > This perf drop on randwrite over sparse file will be addressed in the
> > following patch.
> >
> > BLK_MQ_F_BLOCKING has to be set for calling into .read_iter() or .write_iter()
> > which might sleep even though it is NOWAIT, and the only effect is that rcu read
> > lock is replaced with srcu read lock.
> >
> > Link: https://lore.kernel.org/linux-block/a8e5c76a-231f-07d1-a394-847de930f638@redhat.com/ [1]
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> > drivers/block/loop.c | 62 ++++++++++++++++++++++++++++++++++++++++----
> > 1 file changed, 57 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index 99eec0a25dbc..57e33553695b 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -90,6 +90,8 @@ struct loop_cmd {
> > #define LOOP_IDLE_WORKER_TIMEOUT (60 * HZ)
> > #define LOOP_DEFAULT_HW_Q_DEPTH 128
> > +static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd);
> > +
> > static DEFINE_IDR(loop_index_idr);
> > static DEFINE_MUTEX(loop_ctl_mutex);
> > static DEFINE_MUTEX(loop_validate_mutex);
> > @@ -321,6 +323,15 @@ static void lo_rw_aio_do_completion(struct loop_cmd *cmd)
> > if (!atomic_dec_and_test(&cmd->ref))
> > return;
> > +
> > + /* -EAGAIN could be returned from bdev's ->ki_complete */
> > + if (cmd->ret == -EAGAIN) {
> > + struct loop_device *lo = rq->q->queuedata;
> > +
> > + loop_queue_work(lo, cmd);
> > + return;
> > + }
> > +
> > kfree(cmd->bvec);
> > cmd->bvec = NULL;
> > if (req_op(rq) == REQ_OP_WRITE)
> > @@ -436,16 +447,40 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
> > int nr_bvec = lo_cmd_nr_bvec(cmd);
> > int ret;
> > - ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> > - if (unlikely(ret))
> > - return ret;
> > + /* prepared already for aio from nowait code path */
> > + if (!cmd->use_aio) {
> > + ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> > + if (unlikely(ret))
> > + goto fail;
> > + }
> > + cmd->iocb.ki_flags &= ~IOCB_NOWAIT;
> > ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
> > +fail:
> > if (ret != -EIOCBQUEUED)
> > lo_rw_aio_complete(&cmd->iocb, ret);
> > return -EIOCBQUEUED;
> > }
> > +static int lo_rw_aio_nowait(struct loop_device *lo, struct loop_cmd *cmd,
> > + int rw)
> > +{
> > + struct request *rq = blk_mq_rq_from_pdu(cmd);
> > + loff_t pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
> > + int nr_bvec = lo_cmd_nr_bvec(cmd);
> > + int ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> > +
> > + if (unlikely(ret))
> > + goto fail;
> > +
> > + cmd->iocb.ki_flags |= IOCB_NOWAIT;
> > + ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
>
> Should you also check if backing device/file support nowait? Otherwise
> bio will fail with BLK_STS_NOTSUPP from submit_bio_noacct().
Good catch; NOWAIT should only be applied when FMODE_NOWAIT is set. Will add
the check.
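Probably just an early bail-out in lo_rw_aio_nowait(), something like the
following untested sketch (exact placement may change in the next version):
	/* only try NOWAIT when the backing file advertises support for it */
	if (!(lo->lo_backing_file->f_mode & FMODE_NOWAIT))
		return -EAGAIN;
so that such commands simply keep taking the existing workqueue path.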
Thanks,
Ming
* Re: [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec()
2025-09-28 13:29 ` [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec() Ming Lei
@ 2025-10-03 7:04 ` Christoph Hellwig
0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-10-03 7:04 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH V4 2/6] loop: add helper lo_rw_aio_prep()
2025-09-28 13:29 ` [PATCH V4 2/6] loop: add helper lo_rw_aio_prep() Ming Lei
@ 2025-10-03 7:04 ` Christoph Hellwig
0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-10-03 7:04 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel
On Sun, Sep 28, 2025 at 09:29:21PM +0800, Ming Lei wrote:
> Add helper lo_rw_aio_prep() to make lo_rw_aio() more readable.
Does it? The patch looks OK, but the reasoning here is a bit weak.
* Re: [PATCH V4 3/6] loop: add lo_submit_rw_aio()
2025-09-28 13:29 ` [PATCH V4 3/6] loop: add lo_submit_rw_aio() Ming Lei
@ 2025-10-03 7:04 ` Christoph Hellwig
0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-10-03 7:04 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel
On Sun, Sep 28, 2025 at 09:29:22PM +0800, Ming Lei wrote:
> Add lo_submit_rw_aio() and refactor lo_rw_aio().
Same.
* Re: [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-09-28 13:29 ` [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT Ming Lei
@ 2025-10-03 7:06 ` Christoph Hellwig
2025-10-06 14:18 ` Ming Lei
0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2025-10-03 7:06 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel
On Sun, Sep 28, 2025 at 09:29:25PM +0800, Ming Lei wrote:
> - there isn't any queued blocking async WRITEs, because NOWAIT won't cause
> contention with blocking WRITE, which often implies exclusive lock
Isn't this a generic thing we should be doing in core code so that
it applies to io_uring I/O as well?
* Re: [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-10-03 7:06 ` Christoph Hellwig
@ 2025-10-06 14:18 ` Ming Lei
2025-10-07 6:33 ` Christoph Hellwig
0 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2025-10-06 14:18 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel
On Fri, Oct 03, 2025 at 12:06:44AM -0700, Christoph Hellwig wrote:
> On Sun, Sep 28, 2025 at 09:29:25PM +0800, Ming Lei wrote:
> > - there isn't any queued blocking async WRITEs, because NOWAIT won't cause
> > contention with blocking WRITE, which often implies exclusive lock
>
> Isn't this a generic thing we should be doing in core code so that
> it applies to io_uring I/O as well?
No.
It is just a policy of whether to use NOWAIT or not; so far:
- RWF_NOWAIT can be set from preadv2()/pwritev2()
- it is used for handling io_uring FS read/write
Even though loop's situation is similar to io_uring's, the two are different
subsystems, and there is no `core code` shared by both; more importantly, it
is just one policy (use it or not), and each subsystem can make its own
decision based on its internals.
Thanks,
Ming
* Re: [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-10-06 14:18 ` Ming Lei
@ 2025-10-07 6:33 ` Christoph Hellwig
2025-10-07 12:15 ` Ming Lei
0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2025-10-07 6:33 UTC (permalink / raw)
To: Ming Lei
Cc: Christoph Hellwig, Jens Axboe, linux-block, Mikulas Patocka,
Zhaoyang Huang, Dave Chinner, linux-fsdevel, io-uring
On Mon, Oct 06, 2025 at 10:18:12PM +0800, Ming Lei wrote:
> On Fri, Oct 03, 2025 at 12:06:44AM -0700, Christoph Hellwig wrote:
> > On Sun, Sep 28, 2025 at 09:29:25PM +0800, Ming Lei wrote:
> > > - there isn't any queued blocking async WRITEs, because NOWAIT won't cause
> > > contention with blocking WRITE, which often implies exclusive lock
> >
> > Isn't this a generic thing we should be doing in core code so that
> > it applies to io_uring I/O as well?
>
> No.
>
> It is just policy of using NOWAIT or not, so far:
>
> - RWF_NOWAIT can be set from preadv/pwritev
>
> - used for handling io_uring FS read/write
>
> Even though loop's situation is similar with io-uring, however, both two are
> different subsystem, and there is nothing `core code` for both, more importantly
> it is just one policy: use it or not use it, each subsystem can make its
> own decision based on subsystem internal.
I fail to parse what you say here. You are encoding special magic
about what underlying file systems do in an upper layer. I'd much
rather have a flag similar to FOP_DIO_PARALLEL_WRITE that makes this
limitation clear rather than open-coding it in the loop driver while
leaving the primary user of RWF_NOWAIT out in the cold.
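To illustrate (a purely hypothetical sketch, the flag name is made up and
nothing like it exists today):
	/* hypothetical: set by file systems that can allocate blocks
	 * (fill holes, extend i_size) without blocking under IOCB_NOWAIT
	 */
	#define FOP_NOWAIT_BLOCK_ALLOC	(1 << 7)
	static inline bool file_can_alloc_nowait(struct file *file)
	{
		return file->f_op->fop_flags & FOP_NOWAIT_BLOCK_ALLOC;
	}
Common code could then consult something like this before deciding whether to
try IOCB_NOWAIT for a write, instead of the loop driver guessing.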
* Re: [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-10-07 6:33 ` Christoph Hellwig
@ 2025-10-07 12:15 ` Ming Lei
2025-10-08 5:56 ` Christoph Hellwig
0 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2025-10-07 12:15 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel, io-uring
On Mon, Oct 06, 2025 at 11:33:17PM -0700, Christoph Hellwig wrote:
> On Mon, Oct 06, 2025 at 10:18:12PM +0800, Ming Lei wrote:
> > On Fri, Oct 03, 2025 at 12:06:44AM -0700, Christoph Hellwig wrote:
> > > On Sun, Sep 28, 2025 at 09:29:25PM +0800, Ming Lei wrote:
> > > > - there isn't any queued blocking async WRITEs, because NOWAIT won't cause
> > > > contention with blocking WRITE, which often implies exclusive lock
> > >
> > > Isn't this a generic thing we should be doing in core code so that
> > > it applies to io_uring I/O as well?
> >
> > No.
> >
> > It is just policy of using NOWAIT or not, so far:
> >
> > - RWF_NOWAIT can be set from preadv/pwritev
> >
> > - used for handling io_uring FS read/write
> >
> > Even though loop's situation is similar with io-uring, however, both two are
> > different subsystem, and there is nothing `core code` for both, more importantly
> > it is just one policy: use it or not use it, each subsystem can make its
> > own decision based on subsystem internal.
>
> I fail to parse what you say here. You are encoding special magic
> about what underlying file systems do in an upper layer. I'd much
NOWAIT is obviously an interface provided by the FS; here loop just wants to
try NOWAIT first in the block layer dispatch context to avoid the extra wq
scheduling latency.
But for a write on a sparse file, trying NOWAIT first may bring extra retry
cost; that is why the hint is added. It is very coarse, but it avoids the
potential regression.
> rather have a flag similar FOP_DIO_PARALLEL_WRITE that makes this
> limitation clear rather then opencoding it in the loop driver while
What is the limitation?
> leabing the primary user of RWF_NOWAIT out in the cold.
FOP_DIO_PARALLEL_WRITE is a static FS feature, but here it is FS
runtime behavior, such as whether the write can block because of space
allocation, so it can't be expressed by one static flag.
io_uring shares nothing with loop in this area; it is just a policy of
whether to use NOWAIT or not. I don't understand why you insist on covering
both from the FS internals...
Thanks,
Ming
* Re: [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-10-07 12:15 ` Ming Lei
@ 2025-10-08 5:56 ` Christoph Hellwig
2025-10-09 1:25 ` Ming Lei
0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2025-10-08 5:56 UTC (permalink / raw)
To: Ming Lei
Cc: Christoph Hellwig, Jens Axboe, linux-block, Mikulas Patocka,
Zhaoyang Huang, Dave Chinner, linux-fsdevel, io-uring
On Tue, Oct 07, 2025 at 08:15:05PM +0800, Ming Lei wrote:
> NOWAIT is obviously interface provided by FS, here loop just wants to try
> NOWAIT first in block layer dispatch context for avoiding the extra wq
> schedule latency.
Yes.
> But for write on sparse file, trying NOWAIT first may bring extra retry
> cost, that is why the hint is added. It is very coarse, but potential
> regression can be avoided.
And that is absolutely not a property of loop, and loop should not have
to know about it. So this logic needs to be in common code, preferably
triggered by an fs flag. Note that this isn't about holes - it is about
allocating blocks. For most file systems, filling holes or extending
past i_size is what requires allocating blocks. But for out-of-place
write file systems like btrfs, or zoned XFS, we always need to allocate
blocks for now. I have work that I need to finish off that allows
non-blocking block allocation in zoned XFS, at which point you won't
need this. I think some of this might already be true for network file
systems.
>
> > rather have a flag similar FOP_DIO_PARALLEL_WRITE that makes this
> > limitation clear rather then opencoding it in the loop driver while
>
> What is the limitation?
See above.
> > leabing the primary user of RWF_NOWAIT out in the cold.
>
> FOP_DIO_PARALLEL_WRITE is one static FS feature,
It actually isn't :( I need to move it to be a bit more dynamic on a
per-file basis.
> but here it is FS
> runtime behavior, such as if the write can be blocked because of space
> allocation, so it can't be done by one static flag.
Yes, that's why you want a flag to indicate that a file, or maybe file
operations instance can do non-blocking fill of blocks. But that's
for the future, for now I just want your logic lifted to common code
and shared with io_uring so that we don't have weird hardcoded
assumptions about file system behavior inside the loop driver.
> io-uring shares nothing with loop in this area, it is just one policy wrt.
> use NOWAIT or not. I don't understand why you insist on covering both
> from FS internal...
It's really about all IOCB_NOWAIT users, io_uring being the prime one,
and the one that we can actually easily write tests for.
* Re: [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-10-08 5:56 ` Christoph Hellwig
@ 2025-10-09 1:25 ` Ming Lei
2025-10-13 6:26 ` Christoph Hellwig
0 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2025-10-09 1:25 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel, io-uring
On Tue, Oct 07, 2025 at 10:56:01PM -0700, Christoph Hellwig wrote:
> On Tue, Oct 07, 2025 at 08:15:05PM +0800, Ming Lei wrote:
> > NOWAIT is obviously interface provided by FS, here loop just wants to try
> > NOWAIT first in block layer dispatch context for avoiding the extra wq
> > schedule latency.
>
> Yes.
>
> > But for write on sparse file, trying NOWAIT first may bring extra retry
> > cost, that is why the hint is added. It is very coarse, but potential
> > regression can be avoided.
>
> And that is absolutely not a property of loop, and loop should not have
> to know about. So this logic needs to be in common code, preferably
> triggered by a fs flag. Note that this isn't about holes - it is about
> allocating blocks. For most file systems filling holes or extending
> past i_size is what requires allocating blocks. But for a out of place
> write file systems like btrfs, or zoned xfs we always need to allocate
> blocks for now. But I have work that I need to finish off that allows
> for non-blocking block allocation in zoned XFS, at which point you
> don't need this. I think some of this might be true for network file
> systems already.
Firstly, this FS flag isn't available yet; if it is added, we may take it
into account, and it is just one check, so it shouldn't be a blocker for this
loop perf improvement.
Secondly, it isn't enough to replace the nowait decision on the user side;
one case is overwrite, which is a nice use case for nowait.
>
> >
> > > rather have a flag similar FOP_DIO_PARALLEL_WRITE that makes this
> > > limitation clear rather then opencoding it in the loop driver while
> >
> > What is the limitation?
>
> See above.
>
> > > leabing the primary user of RWF_NOWAIT out in the cold.
> >
> > FOP_DIO_PARALLEL_WRITE is one static FS feature,
>
> It actually isn't :( I need to move it to be a bit more dynamic on a
> per-file basis.
>
> > but here it is FS
> > runtime behavior, such as if the write can be blocked because of space
> > allocation, so it can't be done by one static flag.
>
> Yes, that's why you want a flag to indicate that a file, or maybe file
> operations instance can do non-blocking fill of blocks. But that's
> for the future, for now I just want your logic lifted to common code
> and shared with io_uring so that we don't have weird hardcoded
> assumptions about file system behavior inside the loop driver.
As I mentioned, the hint in this patch is very loop-specific, for avoiding a
potential write perf regression, and it just works for loop's case.
It can't be applied to io_uring, otherwise a perf regression could be caused
for overwrites from an io_uring application.
So I don't know what the exact common code or logic for both loop and
io_uring would be.
Thanks,
Ming
* Re: [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-10-09 1:25 ` Ming Lei
@ 2025-10-13 6:26 ` Christoph Hellwig
2025-10-13 8:26 ` Ming Lei
0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2025-10-13 6:26 UTC (permalink / raw)
To: Ming Lei
Cc: Christoph Hellwig, Jens Axboe, linux-block, Mikulas Patocka,
Zhaoyang Huang, Dave Chinner, linux-fsdevel, io-uring
On Thu, Oct 09, 2025 at 09:25:47AM +0800, Ming Lei wrote:
> Firstly this FS flag isn't available, if it is added, we may take it into
> account, and it is just one check, which shouldn't be blocker for this
> loop perf improvement.
>
> Secondly it isn't enough to replace nowait decision from user side, one
> case is overwrite, which is a nice usecase for nowait.
Yes. But right now you are hardcoding heuristics in what is overall a
very minor user of RWF_NOWAIT instead of sorting this out properly.
* Re: [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT
2025-10-13 6:26 ` Christoph Hellwig
@ 2025-10-13 8:26 ` Ming Lei
0 siblings, 0 replies; 22+ messages in thread
From: Ming Lei @ 2025-10-13 8:26 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, Mikulas Patocka, Zhaoyang Huang,
Dave Chinner, linux-fsdevel, io-uring
On Mon, Oct 13, 2025 at 2:28 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Oct 09, 2025 at 09:25:47AM +0800, Ming Lei wrote:
> > Firstly this FS flag isn't available, if it is added, we may take it into
> > account, and it is just one check, which shouldn't be blocker for this
> > loop perf improvement.
> >
> > Secondly it isn't enough to replace nowait decision from user side, one
> > case is overwrite, which is a nice usecase for nowait.
>
> Yes. But right now you are hardcoding heuristics which is overall a
> very minor user of RWF_NOWAIT instead of sorting this out properly.
Yes, that is why I call the hint loop-specific; it isn't perfect, it is just
for avoiding a potential regression from taking nowait.
Given that the improvement is big, and the perf issue has been
reported several times, I'd suggest taking it this way first and
documenting that it can be improved in the future.
Thanks,