Re: [PATCH] ublk_drv: don't call task_work_add for queueing io commands

linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Ming Lei <ming.lei@redhat.com>
To: Ziyang Zhang <ZiyangZhang@linux.alibaba.com>
Cc: linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH] ublk_drv: don't call task_work_add for queueing io commands
Date: Tue, 25 Oct 2022 15:19:09 +0800	[thread overview]
Message-ID: <Y1eN7VqaJrCVJZjb@T590> (raw)
In-Reply-To: <1816adbd-fa1c-71e0-652d-483d47697254@linux.alibaba.com>

On Tue, Oct 25, 2022 at 11:15:57AM +0800, Ziyang Zhang wrote:
> On 2022/10/24 21:20, Ming Lei wrote:
> > Hello Ziyang,
> > 
> > On Mon, Oct 24, 2022 at 05:48:51PM +0800, Ziyang Zhang wrote:
> >> On 2022/10/23 17:38, Ming Lei wrote:
> >>> task_work_add() is used for waking ubq daemon task with one batch
> >>> of io requests/commands queued. However, task_work_add() isn't
> >>> exported for module code, and it is still debatable if the symbol
> >>> should be exported.
> >>>
> >>> Fortunately we still have io_uring_cmd_complete_in_task() which just
> >>> can't handle batched wakeup for us.
> >>>
> >>> Add one one llist into ublk_queue and call io_uring_cmd_complete_in_task()
> >>> via current command for running them via task work.
> >>>
> >>> This way cleans up current code a lot, meantime allow us to wakeup
> >>> ubq daemon task after queueing batched requests/io commands.
> >>>
> >>
> >>
> >> Hi, Ming
> >>
> >> This patch works and I have run some tests to compare current version(ucmd)
> >> with your patch(ucmd-batch).
> >>
> >> iodepth=128 numjobs=1 direct=1 bs=4k
> >>
> >> --------------------------------------------
> >> ublk loop target, the backend is a file.
> >> IOPS(k)
> >>
> >> type		ucmd		ucmd-batch
> >> seq-read	54.7		54.2	
> >> rand-read	52.8		52.0
> >>
> >> --------------------------------------------
> >> ublk null target
> >> IOPS(k)
> >>
> >> type		ucmd		ucmd-batch
> >> seq-read	257		257
> >> rand-read	252		253
> >>
> >>
> >> I find that io_req_task_work_add() puts task_work node into a llist
> >> first, then it may call task_work_add() to run batched task_works. So do we really
> >> need such llist in ublk_drv? I think io_uring has already considered task_work batch
> >> optimization.
> >>
> >> BTW, task_work_add() in ublk_drv achieves
> >> higher IOPS(about 5-10% on my machine) than io_uring_cmd_complete_in_task()
> >> in ublk_drv.
> > 
> > Yeah, that is same with my observation, and motivation of this patch is
> > to get same performance with task_work_add by building ublk_drv as
> > module. One win of task_work_add() is that we get exact batching info
> > meantime only send TWA_SIGNAL_NO_IPI for whole batch, that is basically
> > what the patch is doing, but needs help of the following ublksrv patch:
> > 
> > https://github.com/ming1/ubdsrv/commit/dce6d1d222023c1641292713b311ced01e6dc548
> > 
> > which sets IORING_SETUP_COOP_TASKRUN for ublksrv's uring, then
> > io_uring_cmd_complete_in_task will notify via TWA_SIGNAL_NO_IPI, and 5+%
> > IOPS boost is observed on loop/001 by putting image on SSD in my test
> > VM.
> 
> Hi Ming,
> 
> I have added this ublksrv patch and run the above test again.
> I have also run ublksrv test: loop/001. Please check them.
> 
> Intel(R) Xeon(R) Platinum 8369B CPU @ 2.70GHz 16 cores
> 64GB MEM, CentOS 8, kernel 6.0+
> 
> --------
> fio test
> 
> iodepth=128 numjobs=1 direct=1 bs=4k
> 
> ucmd: without your kernel patch. Run io_uring_cmd_complete_in_task()
> for each blk-mq rq.
> 
> ucmd-batch: with your kernel patch. Run io_uring_cmd_complete_in_task()
> for the last blk-mq rq.
> 
> --------------------------------------------
> ublk loop target, the backend is a file.
> 
> IOPS(k)
> 
> type		ucmd		ucmd-batch
> seq-read	54.1		53.7
> rand-read	52.0		52.0
> 
> --------------------------------------------
> ublk null target
> IOPS(k)
> 
> type		ucmd		ucmd-batch
> seq-read	272		265
> rand-read	262		260
> 
> ------------
> ublksrv test
> 
> -------------
> ucmd
> 
> running loop/001
>         fio (ublk/loop( -f /root/work/ubdsrv/tests/tmp/ublk_loop_1G_BZ85U), libaio, bs 4k, dio, hw queues:1, uring_comp: 0, get_data: 0)...
>         randwrite: jobs 1, iops 66737
>         randread: jobs 1, iops 64935
>         randrw: jobs 1, iops read 32694 write 32710
>         rw(512k): jobs 1, iops read 772 write 819
> 
> -------------
> ucmd-batch
> 
> running loop/001
>         fio (ublk/loop( -f /root/work/ubdsrv/tests/tmp/ublk_loop_1G_F56a3), libaio, bs 4k, dio, hw queues:1, uring_comp: 0, get_data: 0)...
>         randwrite: jobs 1, iops 66720
>         randread: jobs 1, iops 65015
>         randrw: jobs 1, iops read 32743 write 32759
>         rw(512k): jobs 1, iops read 771 write 817
> 
> 
> It seems that manually putting rqs into llist and calling
> io_uring_cmd_complete_in_task() while handling the last rq does
> not improve IOPS much.
> 
> io_req_task_work_add() puts task_work node into a internal llist
> first, then it may call task_work_add() to run batched task_works.
> IMO, io_uring has already done such batch optimization and ublk_drv
> does not need to add such llist.

The difference is just how batching is handled, looks blk-mq's batch info
doesn't matter any more. In my test, looks the perf improvement is mainly
made by enabling IORING_SETUP_COOP_TASKRUN in ublksrv.

Can you check if enabling IORING_SETUP_COOP_TASKRUN only can reach
same perf with task_work_add()(ublk_drv is builtin) when building
ublk_drv as module?


Thanks,
Ming

next prev parent reply	other threads:[~2022-10-25  7:19 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-23  9:38 [PATCH] ublk_drv: don't call task_work_add for queueing io commands Ming Lei
2022-10-24  9:48 ` Ziyang Zhang
2022-10-24 13:20   ` Ming Lei
2022-10-25  3:15     ` Ziyang Zhang
2022-10-25  7:19       ` Ming Lei [this message]
2022-10-25  7:46         ` Ziyang Zhang
2022-10-25  8:43           ` Ziyang Zhang
2022-10-25 15:17             ` Ming Lei
2022-10-26 10:32               ` Ziyang Zhang
2022-10-26 11:29                 ` Ming Lei
2022-10-27  3:00                   ` Ziyang Zhang
2022-10-27 15:38                     ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1eN7VqaJrCVJZjb@T590 \
    --to=ming.lei@redhat.com \
    --cc=ZiyangZhang@linux.alibaba.com \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).