linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Yu Kuai <yukuai1@huaweicloud.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	Mikulas Patocka <mpatocka@redhat.com>,
	Zhaoyang Huang <zhaoyang.huang@unisoc.com>,
	Dave Chinner <dchinner@redhat.com>,
	linux-fsdevel@vger.kernel.org, "yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first
Date: Mon, 29 Sep 2025 17:18:51 +0800	[thread overview]
Message-ID: <aNpOiQbgrSukFaUT@fedora> (raw)
In-Reply-To: <d043680f-1d7a-bcb8-2588-4eae403f050d@huaweicloud.com>

On Mon, Sep 29, 2025 at 02:44:53PM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2025/09/28 21:29, Ming Lei 写道:
> > Try to handle loop aio command via NOWAIT IO first, then we can avoid to
> > queue the aio command into workqueue. This is usually one big win in
> > case that FS block mapping is stable, Mikulas verified [1] that this way
> > improves IO perf by close to 5X in 12jobs sequential read/write test,
> > in which FS block mapping is just stable.
> > 
> > Fallback to workqueue in case of -EAGAIN. This way may bring a little
> > cost from the 1st retry, but when running the following write test over
> > loop/sparse_file, the actual effect on randwrite is obvious:
> > 
> > ```
> > truncate -s 4G 1.img    #1.img is created on XFS/virtio-scsi
> > losetup -f 1.img --direct-io=on
> > fio --direct=1 --bs=4k --runtime=40 --time_based --numjobs=1 --ioengine=libaio \
> > 	--iodepth=16 --group_reporting=1 --filename=/dev/loop0 -name=job --rw=$RW
> > ```
> > 
> > - RW=randwrite: obvious IOPS drop observed
> > - RW=write: a little drop(%5 - 10%)
> > 
> > This perf drop on randwrite over sparse file will be addressed in the
> > following patch.
> > 
> > BLK_MQ_F_BLOCKING has to be set for calling into .read_iter() or .write_iter()
> > which might sleep even though it is NOWAIT, and the only effect is that rcu read
> > lock is replaced with srcu read lock.
> > 
> > Link: https://lore.kernel.org/linux-block/a8e5c76a-231f-07d1-a394-847de930f638@redhat.com/ [1]
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   drivers/block/loop.c | 62 ++++++++++++++++++++++++++++++++++++++++----
> >   1 file changed, 57 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index 99eec0a25dbc..57e33553695b 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -90,6 +90,8 @@ struct loop_cmd {
> >   #define LOOP_IDLE_WORKER_TIMEOUT (60 * HZ)
> >   #define LOOP_DEFAULT_HW_Q_DEPTH 128
> > +static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd);
> > +
> >   static DEFINE_IDR(loop_index_idr);
> >   static DEFINE_MUTEX(loop_ctl_mutex);
> >   static DEFINE_MUTEX(loop_validate_mutex);
> > @@ -321,6 +323,15 @@ static void lo_rw_aio_do_completion(struct loop_cmd *cmd)
> >   	if (!atomic_dec_and_test(&cmd->ref))
> >   		return;
> > +
> > +	/* -EAGAIN could be returned from bdev's ->ki_complete */
> > +	if (cmd->ret == -EAGAIN) {
> > +		struct loop_device *lo = rq->q->queuedata;
> > +
> > +		loop_queue_work(lo, cmd);
> > +		return;
> > +	}
> > +
> >   	kfree(cmd->bvec);
> >   	cmd->bvec = NULL;
> >   	if (req_op(rq) == REQ_OP_WRITE)
> > @@ -436,16 +447,40 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
> >   	int nr_bvec = lo_cmd_nr_bvec(cmd);
> >   	int ret;
> > -	ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> > -	if (unlikely(ret))
> > -		return ret;
> > +	/* prepared already for aio from nowait code path */
> > +	if (!cmd->use_aio) {
> > +		ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> > +		if (unlikely(ret))
> > +			goto fail;
> > +	}
> > +	cmd->iocb.ki_flags &= ~IOCB_NOWAIT;
> >   	ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
> > +fail:
> >   	if (ret != -EIOCBQUEUED)
> >   		lo_rw_aio_complete(&cmd->iocb, ret);
> >   	return -EIOCBQUEUED;
> >   }
> > +static int lo_rw_aio_nowait(struct loop_device *lo, struct loop_cmd *cmd,
> > +			    int rw)
> > +{
> > +	struct request *rq = blk_mq_rq_from_pdu(cmd);
> > +	loff_t pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
> > +	int nr_bvec = lo_cmd_nr_bvec(cmd);
> > +	int ret = lo_rw_aio_prep(lo, cmd, nr_bvec, pos);
> > +
> > +	if (unlikely(ret))
> > +		goto fail;
> > +
> > +	cmd->iocb.ki_flags |= IOCB_NOWAIT;
> > +	ret = lo_submit_rw_aio(lo, cmd, nr_bvec, rw);
> 
> Should you also check if backing device/file support nowait? Otherwise
> bio will fail with BLK_STS_NOTSUPP from submit_bio_noacct().

Good catch, nowait should only be applied in case of FMODE_NOWAIT, will add the
check.


Thanks,
Ming


  reply	other threads:[~2025-09-29  9:19 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-28 13:29 [PATCH V4 0/6] loop: improve loop aio perf by IOCB_NOWAIT Ming Lei
2025-09-28 13:29 ` [PATCH V4 1/6] loop: add helper lo_cmd_nr_bvec() Ming Lei
2025-10-03  7:04   ` Christoph Hellwig
2025-09-28 13:29 ` [PATCH V4 2/6] loop: add helper lo_rw_aio_prep() Ming Lei
2025-10-03  7:04   ` Christoph Hellwig
2025-09-28 13:29 ` [PATCH V4 3/6] loop: add lo_submit_rw_aio() Ming Lei
2025-10-03  7:04   ` Christoph Hellwig
2025-09-28 13:29 ` [PATCH V4 4/6] loop: move command blkcg/memcg initialization into loop_queue_work Ming Lei
2025-09-28 13:29 ` [PATCH V4 5/6] loop: try to handle loop aio command via NOWAIT IO first Ming Lei
2025-09-29  6:44   ` Yu Kuai
2025-09-29  9:18     ` Ming Lei [this message]
2025-09-28 13:29 ` [PATCH V4 6/6] loop: add hint for handling aio via IOCB_NOWAIT Ming Lei
2025-10-03  7:06   ` Christoph Hellwig
2025-10-06 14:18     ` Ming Lei
2025-10-07  6:33       ` Christoph Hellwig
2025-10-07 12:15         ` Ming Lei
2025-10-08  5:56           ` Christoph Hellwig
2025-10-09  1:25             ` Ming Lei
2025-10-13  6:26               ` Christoph Hellwig
2025-10-13  8:26                 ` Ming Lei
2025-09-28 18:42 ` [syzbot ci] Re: loop: improve loop aio perf by IOCB_NOWAIT syzbot ci
2025-09-29  1:13   ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aNpOiQbgrSukFaUT@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=dchinner@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=yukuai1@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    --cc=zhaoyang.huang@unisoc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).