From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Keith Busch <kbusch@kernel.org>,
Uday Shankar <ushankar@purestorage.com>
Subject: Re: [PATCH 6/8] ublk: implement ->queue_rqs()
Date: Thu, 27 Mar 2025 17:15:02 +0800 [thread overview]
Message-ID: <Z-UXFu6q1apYYUBb@fedora> (raw)
In-Reply-To: <CADUfDZoBTkNypLXsVFMsZLSJGw3i7VxnQ=POodmYm=E4HPU=SA@mail.gmail.com>
On Wed, Mar 26, 2025 at 02:30:56PM -0700, Caleb Sander Mateos wrote:
> On Mon, Mar 24, 2025 at 6:49 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > Implement ->queue_rqs() for improving perf in case of MQ.
> >
> > In this way, we just need to call io_uring_cmd_complete_in_task() once for
> > one batch, then both io_uring and ublk server can get exact batch from
> > client side.
> >
> > Follows IOPS improvement:
> >
> > - tests
> >
> > tools/testing/selftests/ublk/kublk add -t null -q 2 [-z]
> >
> > fio/t/io_uring -p0 /dev/ublkb0
> >
> > - results:
> >
> > more than 10% IOPS boost observed
> >
> > Pass all ublk selftests, especially the io dispatch order test.
> >
> > Cc: Uday Shankar <ushankar@purestorage.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> > drivers/block/ublk_drv.c | 85 ++++++++++++++++++++++++++++++++++++----
> > 1 file changed, 77 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > index 53a463681a41..86621fde7fde 100644
> > --- a/drivers/block/ublk_drv.c
> > +++ b/drivers/block/ublk_drv.c
> > @@ -83,6 +83,7 @@ struct ublk_rq_data {
> > struct ublk_uring_cmd_pdu {
> > struct ublk_queue *ubq;
> > u16 tag;
> > + struct rq_list list;
> > };
> >
> > /*
> > @@ -1258,6 +1259,32 @@ static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
> > io_uring_cmd_complete_in_task(io->cmd, ublk_rq_task_work_cb);
> > }
> >
> > +static void ublk_cmd_list_tw_cb(struct io_uring_cmd *cmd,
> > + unsigned int issue_flags)
> > +{
> > + struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
> > + struct ublk_queue *ubq = pdu->ubq;
> > + struct request *rq;
> > +
> > + while ((rq = rq_list_pop(&pdu->list))) {
> > + struct ublk_io *io = &ubq->ios[rq->tag];
> > +
> > + ublk_rq_task_work_cb(io->cmd, issue_flags);
>
> ublk_rq_task_work_cb() is duplicating the lookup of ubq, rq, and io.
> Could you factor out a helper that takes those values instead of cmd?
>
> > + }
> > +}
> > +
> > +static void ublk_queue_cmd_list(struct ublk_queue *ubq, struct rq_list *l)
> > +{
> > + struct request *rq = l->head;
> > + struct ublk_io *io = &ubq->ios[rq->tag];
> > + struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(io->cmd);
> > +
> > + pdu->ubq = ubq;
>
> Why does pdu->ubq need to be set here but not in ublk_queue_cmd()? I
> would have thought it would already be set to ubq because pdu comes
> from a rq belonging to this ubq.
>
> > + pdu->list = *l;
> > + rq_list_init(l);
> > + io_uring_cmd_complete_in_task(io->cmd, ublk_cmd_list_tw_cb);
>
> Could store io->cmd in a variable to avoid looking it up twice.
>
> > +}
> > +
> > static enum blk_eh_timer_return ublk_timeout(struct request *rq)
> > {
> > struct ublk_queue *ubq = rq->mq_hctx->driver_data;
> > @@ -1296,16 +1323,13 @@ static enum blk_eh_timer_return ublk_timeout(struct request *rq)
> > return BLK_EH_RESET_TIMER;
> > }
> >
> > -static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
> > - const struct blk_mq_queue_data *bd)
> > +static blk_status_t ublk_prep_rq_batch(struct request *rq)
>
> naming nit: why "batch"?
All were handled in my local version.
>
> > {
> > - struct ublk_queue *ubq = hctx->driver_data;
> > - struct request *rq = bd->rq;
> > + struct ublk_queue *ubq = rq->mq_hctx->driver_data;
> > blk_status_t res;
> >
> > - if (unlikely(ubq->fail_io)) {
> > + if (unlikely(ubq->fail_io))
> > return BLK_STS_TARGET;
> > - }
> >
> > /* fill iod to slot in io cmd buffer */
> > res = ublk_setup_iod(ubq, rq);
> > @@ -1324,17 +1348,58 @@ static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
> > if (ublk_nosrv_should_queue_io(ubq) && unlikely(ubq->force_abort))
> > return BLK_STS_IOERR;
> >
> > + if (unlikely(ubq->canceling))
> > + return BLK_STS_IOERR;
>
> Why is ubq->cancelling treated differently for ->queue_rq() vs. ->queue_rqs()?
It is same, just ublk_queue_rqs() becomes simpler by letting ->queue_rqs()
to handle ubq->canceling.
Here it is really something which need to comment for ubq->canceling
handling, which has to be done after ->fail_io/->force_abort is dealt
with, otherwise IO hang is caused when removing disk.
That is one bug introduced in this patch.
>
> > +
> > + blk_mq_start_request(rq);
> > + return BLK_STS_OK;
> > +}
> > +
> > +static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
> > + const struct blk_mq_queue_data *bd)
> > +{
> > + struct ublk_queue *ubq = hctx->driver_data;
> > + struct request *rq = bd->rq;
> > + blk_status_t res;
> > +
> > if (unlikely(ubq->canceling)) {
> > __ublk_abort_rq(ubq, rq);
> > return BLK_STS_OK;
> > }
> >
> > - blk_mq_start_request(bd->rq);
> > - ublk_queue_cmd(ubq, rq);
> > + res = ublk_prep_rq_batch(rq);
> > + if (res != BLK_STS_OK)
> > + return res;
> >
> > + ublk_queue_cmd(ubq, rq);
> > return BLK_STS_OK;
> > }
> >
> > +static void ublk_queue_rqs(struct rq_list *rqlist)
> > +{
> > + struct rq_list requeue_list = { };
> > + struct rq_list submit_list = { };
> > + struct ublk_queue *ubq = NULL;
> > + struct request *req;
> > +
> > + while ((req = rq_list_pop(rqlist))) {
> > + struct ublk_queue *this_q = req->mq_hctx->driver_data;
> > +
> > + if (ubq && ubq != this_q && !rq_list_empty(&submit_list))
> > + ublk_queue_cmd_list(ubq, &submit_list);
> > + ubq = this_q;
>
> Probably could avoid the extra ->driver_data dereference on every rq
> by comparing the mq_hctx pointers instead. The ->driver_data
> dereference could be moved to the ublk_queue_cmd_list() calls.
Yes.
>
> > +
> > + if (ublk_prep_rq_batch(req) == BLK_STS_OK)
> > + rq_list_add_tail(&submit_list, req);
> > + else
> > + rq_list_add_tail(&requeue_list, req);
> > + }
> > +
> > + if (ubq && !rq_list_empty(&submit_list))
> > + ublk_queue_cmd_list(ubq, &submit_list);
> > + *rqlist = requeue_list;
> > +}
> > +
> > static int ublk_init_hctx(struct blk_mq_hw_ctx *hctx, void *driver_data,
> > unsigned int hctx_idx)
> > {
> > @@ -1347,6 +1412,7 @@ static int ublk_init_hctx(struct blk_mq_hw_ctx *hctx, void *driver_data,
> >
> > static const struct blk_mq_ops ublk_mq_ops = {
> > .queue_rq = ublk_queue_rq,
> > + .queue_rqs = ublk_queue_rqs,
> > .init_hctx = ublk_init_hctx,
> > .timeout = ublk_timeout,
> > };
> > @@ -3147,6 +3213,9 @@ static int __init ublk_init(void)
> > BUILD_BUG_ON((u64)UBLKSRV_IO_BUF_OFFSET +
> > UBLKSRV_IO_BUF_TOTAL_SIZE < UBLKSRV_IO_BUF_OFFSET);
> >
> > + BUILD_BUG_ON(sizeof(struct ublk_uring_cmd_pdu) >
> > + sizeof_field(struct io_uring_cmd, pdu));
>
> Looks like Uday also suggested this, but if you change
> ublk_get_uring_cmd_pdu() to use io_uring_cmd_to_pdu(), you get this
> check for free.
I have followed Uday's suggestion to patch ublk_get_uring_cmd_pdu(),
and will send out v2 soon.
Thanks,
Ming
next prev parent reply other threads:[~2025-03-27 9:15 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-24 13:48 [PATCH 0/8] ublk: cleanup & improvement & zc follow-up Ming Lei
2025-03-24 13:48 ` [PATCH 1/8] ublk: remove two unused fields from 'struct ublk_queue' Ming Lei
2025-03-24 14:32 ` Caleb Sander Mateos
2025-03-24 13:48 ` [PATCH 2/8] ublk: add helper of ublk_need_map_io() Ming Lei
2025-03-24 15:01 ` Caleb Sander Mateos
2025-03-24 13:48 ` [PATCH 3/8] ublk: truncate io command result Ming Lei
2025-03-24 15:51 ` Caleb Sander Mateos
2025-03-25 0:50 ` Ming Lei
2025-03-24 13:48 ` [PATCH 4/8] ublk: add segment parameter Ming Lei
2025-03-24 22:26 ` Caleb Sander Mateos
2025-03-25 1:15 ` Ming Lei
2025-03-25 19:43 ` Caleb Sander Mateos
2025-03-26 2:17 ` Ming Lei
2025-03-26 16:43 ` Caleb Sander Mateos
2025-03-27 1:49 ` Ming Lei
2025-03-24 13:49 ` [PATCH 5/8] ublk: document zero copy feature Ming Lei
2025-03-25 15:26 ` Caleb Sander Mateos
2025-03-26 2:29 ` Ming Lei
2025-03-26 16:48 ` Caleb Sander Mateos
2025-03-24 13:49 ` [PATCH 6/8] ublk: implement ->queue_rqs() Ming Lei
2025-03-24 17:07 ` Uday Shankar
2025-03-25 1:02 ` Ming Lei
2025-03-26 21:30 ` Caleb Sander Mateos
2025-03-27 9:15 ` Ming Lei [this message]
2025-03-24 13:49 ` [PATCH 7/8] selftests: ublk: add more tests for covering MQ Ming Lei
2025-03-24 13:49 ` [PATCH 8/8] selftests: ublk: add test for checking zero copy related parameter Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z-UXFu6q1apYYUBb@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=csander@purestorage.com \
--cc=kbusch@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=ushankar@purestorage.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).