Re: [PATCH v2 0/5] blk-mq-sched: support request batch dispatching for sq elevator


From: Ming Lei <ming.lei@redhat.com>
To: Yu Kuai <yukuai1@huaweicloud.com>
Cc: dlemoal@kernel.org, hare@suse.de, jack@suse.cz, tj@kernel.org,
	josef@toxicpanda.com, axboe@kernel.dk, yukuai3@huawei.com,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, yi.zhang@huawei.com,
	yangerkun@huawei.com, johnny.chenyi@huawei.com
Subject: Re: [PATCH v2 0/5] blk-mq-sched: support request batch dispatching for sq elevator
Date: Thu, 31 Jul 2025 16:18:06 +0800
Message-ID: <aIsmvj_lxLA6ZaWe@fedora>
In-Reply-To: <20250730082207.4031744-1-yukuai1@huaweicloud.com>

On Wed, Jul 30, 2025 at 04:22:02PM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
> 
> Changes from v1:
>  - the ioc changes are sent separately;
>  - reorder patches 1-3 as suggested by Damien;
> 
> Currently, both mq-deadline and bfq have a global spin lock that is
> grabbed inside elevator methods like dispatch_request, insert_requests,
> and bio_merge. This global lock is the main reason mq-deadline and
> bfq don't scale well.
> 
> For the dispatch_request method, the current behavior is to dispatch one
> request at a time. With multiple dispatching contexts, this behavior, on
> the one hand, introduces intense lock contention:
> 
> t1:                     t2:                     t3:
> lock                    lock                    lock
> // grab lock
> ops.dispatch_request
> unlock
>                         // grab lock
>                         ops.dispatch_request
>                         unlock
>                                                 // grab lock
>                                                 ops.dispatch_request
>                                                 unlock
> 
> On the other hand, it messes up the request dispatch order:
> t1:
>
> lock
> rq1 = ops.dispatch_request
> unlock
>                         t2:
>                         lock
>                         rq2 = ops.dispatch_request
>                         unlock
>
> lock
> rq3 = ops.dispatch_request
> unlock
>
>                         lock
>                         rq4 = ops.dispatch_request
>                         unlock
>
> // rq1, rq3 issue to disk
>                         // rq2, rq4 issue to disk
> 
> In this case, the elevator dispatch order is rq 1-2-3-4; however,
> the order on disk is rq 1-3-2-4, so the order of rq2 and rq3 is reversed.
> 
> While dispatching a request, blk_mq_get_dispatch_budget() and
> blk_mq_get_driver_tag() must be called, and they are not ready to be
> called inside elevator methods, hence introducing a new method like
> dispatch_requests is not possible.
> 
> In conclusion, this set factors the global lock out of the
> dispatch_request method, and supports request batch dispatch by calling
> the method multiple times while holding the lock.
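
The ordering effect described in the quoted text can be reproduced with a small user-space model. This is only a sketch, not kernel code: the function name `simulate`, the round-robin lock hand-off, and the rule that each context issues its collected requests only after the elevator queue is drained are all assumptions made for illustration, not taken from the patch set.

```python
from collections import deque


def simulate(num_requests, num_contexts, batch):
    """Return the order in which requests reach the disk.

    Model assumptions (not from the patch set): contexts take the elevator
    lock in round-robin order, and each context issues everything it has
    collected only after the elevator queue is empty.
    """
    elevator = deque(range(1, num_requests + 1))  # elevator dispatch order
    collected = [[] for _ in range(num_contexts)]

    ctx = 0
    while elevator:
        # "lock": one context owns the elevator for this turn
        for _ in range(batch):
            if not elevator:
                break
            collected[ctx].append(elevator.popleft())  # ops.dispatch_request
        # "unlock": the next context gets its turn
        ctx = (ctx + 1) % num_contexts

    # Each context issues its own requests to the disk, one context after another.
    disk_order = []
    for requests in collected:
        disk_order.extend(requests)
    return disk_order


if __name__ == "__main__":
    print(simulate(4, 2, batch=1))  # one request per lock hold
    print(simulate(4, 2, batch=2))  # two requests per lock hold
```

With one request per lock hold, the model yields the 1-3-2-4 disk order from the quoted trace; with two requests per lock hold, the elevator order 1-2-3-4 is preserved. The model says nothing about merge behavior.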
> 
> nullblk setup:
> modprobe null_blk nr_devices=0 &&
>     udevadm settle &&
>     cd /sys/kernel/config/nullb &&
>     mkdir nullb0 &&
>     cd nullb0 &&
>     echo 0 > completion_nsec &&
>     echo 512 > blocksize &&
>     echo 0 > home_node &&
>     echo 0 > irqmode &&
>     echo 128 > submit_queues &&
>     echo 1024 > hw_queue_depth &&
>     echo 1024 > size &&
>     echo 0 > memory_backed &&
>     echo 2 > queue_mode &&
>     echo 1 > power ||
>     exit $?
> 
> Test script:
> fio -filename=/dev/$disk -name=test -rw=randwrite -bs=4k -iodepth=32 \
>   -numjobs=16 --iodepth_batch_submit=8 --iodepth_batch_complete=8 \
>   -direct=1 -ioengine=io_uring -group_reporting -time_based -runtime=30
> 
> Test result: iops
> 
> |                 | deadline | bfq      |
> | --------------- | -------- | -------- |
> | before this set | 263k     | 124k     |
> | after this set  | 475k     | 292k     |

Batch dispatch may hurt I/O merge performance, which is important for an
elevator, so please provide test data on real HDDs and SSDs instead of
null_blk only. It would be ideal if a merge-sensitive workload were
evaluated as well.



Thanks,
Ming

