[PATCH 0/6] blk-mq-sched: support request batch dispatching for sq elevator


From: Yu Kuai <yukuai1@huaweicloud.com>
To: dlemoal@kernel.org, hare@suse.de, tj@kernel.org,
	josef@toxicpanda.com, axboe@kernel.dk, yukuai3@huawei.com
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, yukuai1@huaweicloud.com,
	yi.zhang@huawei.com, yangerkun@huawei.com,
	johnny.chenyi@huawei.com
Subject: [PATCH 0/6] blk-mq-sched: support request batch dispatching for sq elevator
Date: Tue, 22 Jul 2025 15:24:25 +0800	[thread overview]
Message-ID: <20250722072431.610354-1-yukuai1@huaweicloud.com> (raw)

From: Yu Kuai <yukuai3@huawei.com>

Currently, both mq-deadline and bfq have a global spin lock that is
grabbed inside elevator methods like dispatch_request, insert_requests,
and bio_merge. This global lock is the main reason mq-deadline and
bfq don't scale well.

The dispatch_request method currently dispatches one request at a time.
With multiple dispatching contexts, this behavior, on the one hand,
introduces intense lock contention:

t1:                     t2:                     t3:
lock                    lock                    lock
// grab lock
ops.dispatch_request
unlock
                        // grab lock
                        ops.dispatch_request
                        unlock
                                                // grab lock
                                                ops.dispatch_request
                                                unlock

On the other hand, it messes up the request dispatch order:

t1:                     t2:
lock
rq1 = ops.dispatch_request
unlock
                        lock
                        rq2 = ops.dispatch_request
                        unlock
lock
rq3 = ops.dispatch_request
unlock
                        lock
                        rq4 = ops.dispatch_request
                        unlock

// rq1, rq3 issue to disk
                        // rq2, rq4 issue to disk

In this case, the elevator dispatch order is rq 1-2-3-4, but the order
seen by the disk is rq 1-3-2-4: rq2 and rq3 are swapped.
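
The inversion can be reproduced with a small user-space model. This is only
a sketch under stated assumptions, not kernel code: simulate_dispatch(),
NR_RQ and NR_CTX are hypothetical names, the two contexts are assumed to take
strict turns holding the lock, and t1 is assumed to issue its requests before
t2, as in the diagram above.

```c
#include <assert.h>

#define NR_RQ  4   /* requests queued in the toy elevator: rq1..rq4 */
#define NR_CTX 2   /* dispatching contexts: t1 and t2 */

/*
 * Hypothetical model: the contexts take turns holding the elevator lock.
 * Each turn removes up to `batch` requests in elevator order. Afterwards
 * every context issues what it collected, t1 first. The resulting on-disk
 * order is written to disk_order[].
 */
static void simulate_dispatch(int batch, int disk_order[NR_RQ])
{
	int local[NR_CTX][NR_RQ];
	int nr_local[NR_CTX] = { 0 };
	int next_rq = 1;
	int ctx = 0;
	int pos = 0;

	assert(batch >= 1);

	/* Dispatch phase: one lock hold per iteration of the outer loop. */
	while (next_rq <= NR_RQ) {
		for (int i = 0; i < batch && next_rq <= NR_RQ; i++)
			local[ctx][nr_local[ctx]++] = next_rq++;
		ctx = (ctx + 1) % NR_CTX;
	}

	/* Issue phase: each context sends its own requests to the disk. */
	for (ctx = 0; ctx < NR_CTX; ctx++)
		for (int i = 0; i < nr_local[ctx]; i++)
			disk_order[pos++] = local[ctx][i];
}
```

With batch set to 1 the model yields the 1-3-2-4 order described above; with
batch set to 2 each context takes two consecutive requests per lock hold and
the disk sees 1-2-3-4.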

While dispatching a request, blk_mq_get_dispatch_budget() and
blk_mq_get_driver_tag() must be called, and they are not ready to be
called inside elevator methods, hence introducing a new method like
dispatch_requests is not possible.

In conclusion, this set factors the global lock out of the dispatch_request
method, and supports request batch dispatching by calling the method
multiple times while holding the lock.
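
The general shape of "take the lock once, call the single-request method
several times" might look like the following user-space sketch. Everything
here is hypothetical and for illustration only: struct toy_elevator and the
toy_*() helpers are not kernel types or functions, the lock is a plain flag
with a counter, and budget and driver tag handling are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the elevator and its lock; not a kernel type. */
struct toy_elevator {
	int queued;          /* requests still waiting in the elevator */
	int next_id;         /* id returned by the next dispatch */
	int lock_acquired;   /* how many times the lock was taken */
	bool locked;
};

static void toy_lock(struct toy_elevator *e)
{
	assert(!e->locked);
	e->locked = true;
	e->lock_acquired++;
}

static void toy_unlock(struct toy_elevator *e)
{
	assert(e->locked);
	e->locked = false;
}

/* Models dispatch_request with the lock factored out: caller must hold it. */
static int toy_dispatch_request(struct toy_elevator *e)
{
	assert(e->locked);
	if (e->queued == 0)
		return -1;   /* nothing left to dispatch */
	e->queued--;
	return e->next_id++;
}

/*
 * Take the lock once and call the single-request method up to max_batch
 * times. Returns the number of request ids stored in out[].
 */
static int toy_dispatch_batch(struct toy_elevator *e, int max_batch, int *out)
{
	int n = 0;

	toy_lock(e);
	while (n < max_batch) {
		int rq = toy_dispatch_request(e);

		if (rq < 0)
			break;
		out[n++] = rq;
	}
	toy_unlock(e);
	return n;
}
```

In this model, draining eight queued requests with max_batch set to 8 takes
the lock once, while draining them one at a time takes it once per request
plus once more to find the queue empty.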

nullblk setup:
modprobe null_blk nr_devices=0 &&
    udevadm settle &&
    cd /sys/kernel/config/nullb &&
    mkdir nullb0 &&
    cd nullb0 &&
    echo 0 > completion_nsec &&
    echo 512 > blocksize &&
    echo 0 > home_node &&
    echo 0 > irqmode &&
    echo 128 > submit_queues &&
    echo 1024 > hw_queue_depth &&
    echo 1024 > size &&
    echo 0 > memory_backed &&
    echo 2 > queue_mode &&
    echo 1 > power ||
    exit $?

Test script:
fio -filename=/dev/$disk -name=test -rw=randwrite -bs=4k -iodepth=32 \
  -numjobs=16 --iodepth_batch_submit=8 --iodepth_batch_complete=8 \
  -direct=1 -ioengine=io_uring -group_reporting -time_based -runtime=30

Test result (IOPS):

|                 | deadline | bfq      |
| --------------- | -------- | -------- |
| before this set | 263k     | 124k     |
| after this set  | 475k     | 292k     |

Yu Kuai (6):
  mq-deadline: switch to use high layer elevator lock
  block, bfq: don't grab queue_lock from io path
  block, bfq: switch to use elevator lock
  elevator: factor elevator lock out of dispatch_request method
  blk-mq-sched: refactor __blk_mq_do_dispatch_sched()
  blk-mq-sched: support request batch dispatching for sq elevator

 block/bfq-cgroup.c   |   4 +-
 block/bfq-iosched.c  |  73 ++++++-------
 block/bfq-iosched.h  |   2 +-
 block/blk-ioc.c      |  43 +++++++-
 block/blk-mq-sched.c | 240 ++++++++++++++++++++++++++++++-------------
 block/blk-mq.h       |  21 ++++
 block/blk.h          |   2 +-
 block/elevator.c     |   1 +
 block/elevator.h     |   4 +-
 block/mq-deadline.c  |  58 +++++------
 10 files changed, 293 insertions(+), 155 deletions(-)

-- 
2.39.2

Thread overview: 19+ messages
2025-07-22  7:24 Yu Kuai [this message]
2025-07-22  7:24 ` [PATCH 1/6] mq-deadline: switch to use high layer elevator lock Yu Kuai
2025-07-23  1:46   ` Damien Le Moal
2025-07-23  2:07     ` Yu Kuai
2025-07-23  2:38       ` Damien Le Moal
2025-07-22  7:24 ` [PATCH 2/6] block, bfq: don't grab queue_lock from io path Yu Kuai
2025-07-23  1:52   ` Damien Le Moal
2025-07-23  2:04     ` Yu Kuai
2025-07-22  7:24 ` [PATCH 3/6] block, bfq: switch to use elevator lock Yu Kuai
2025-07-23  1:53   ` Damien Le Moal
2025-07-22  7:24 ` [PATCH 4/6] elevator: factor elevator lock out of dispatch_request method Yu Kuai
2025-07-23  1:59   ` Damien Le Moal
2025-07-23  2:17     ` Yu Kuai
2025-07-23  2:42       ` Damien Le Moal
2025-07-23  2:51         ` Yu Kuai
2025-07-23  4:34           ` Damien Le Moal
2025-07-23  6:10             ` Yu Kuai
2025-07-22  7:24 ` [PATCH 5/6] blk-mq-sched: refactor __blk_mq_do_dispatch_sched() Yu Kuai
2025-07-22  7:24 ` [PATCH 6/6] blk-mq-sched: support request batch dispatching for sq elevator Yu Kuai
