From: Yu Kuai <yukuai1@huaweicloud.com>
To: dlemoal@kernel.org, hare@suse.de, tj@kernel.org,
josef@toxicpanda.com, axboe@kernel.dk, yukuai3@huawei.com
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, yukuai1@huaweicloud.com,
yi.zhang@huawei.com, yangerkun@huawei.com,
johnny.chenyi@huawei.com
Subject: [PATCH 0/6] blk-mq-sched: support request batch dispatching for sq elevator
Date: Tue, 22 Jul 2025 15:24:25 +0800
Message-ID: <20250722072431.610354-1-yukuai1@huaweicloud.com>
From: Yu Kuai <yukuai3@huawei.com>

Currently, both mq-deadline and bfq have a global spin lock that is
grabbed inside elevator methods like dispatch_request, insert_requests,
and bio_merge. This global lock is the main reason mq-deadline and
bfq can't scale very well.

For the dispatch_request method, the current behavior is to dispatch one
request at a time. With multiple dispatching contexts, this behavior, on
the one hand, introduces intense lock contention:

t1:                     t2:                     t3:
lock                    lock                    lock
// grab lock
ops.dispatch_request
unlock
                        // grab lock
                        ops.dispatch_request
                        unlock
                                                // grab lock
                                                ops.dispatch_request
                                                unlock

on the other hand, it messes up the request dispatching order:

t1:
lock
rq1 = ops.dispatch_request
unlock
                        t2:
                        lock
                        rq2 = ops.dispatch_request
                        unlock
lock
rq3 = ops.dispatch_request
unlock
                        lock
                        rq4 = ops.dispatch_request
                        unlock

// rq1, rq3 issue to disk
                        // rq2, rq4 issue to disk

In this case, the elevator dispatch order is rq 1-2-3-4; however, the
order seen by the disk is rq 1-3-2-4: the order of rq2 and rq3 is inverted.
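
The interleaving above can be reproduced with a small deterministic model.
This is only a userspace sketch, not kernel code: toy_elevator,
toy_dispatch_request() and simulate() are hypothetical names, the two
contexts are assumed to alternate strictly on the lock, and context 0 is
assumed to issue to the disk before context 1. The per_hold parameter is
also an assumption of the model; it sets how many requests a context takes
per lock hold.

```c
#include <assert.h>

#define NR_RQ  4   /* requests queued in the toy elevator */
#define NR_CTX 2   /* dispatching contexts (t1 and t2 in the diagram) */

/* Toy elevator that hands out request ids 1..NR_RQ in its intended order. */
struct toy_elevator {
    int next_rq;
};

/* Returns the next request id, or 0 when the elevator is empty. */
static int toy_dispatch_request(struct toy_elevator *e)
{
    return e->next_rq <= NR_RQ ? e->next_rq++ : 0;
}

/*
 * Contexts alternate on the lock and take up to per_hold requests per
 * lock hold. Afterwards each context issues what it collected, context 0
 * first. disk_order[] receives the order in which requests reach the disk.
 */
static void simulate(int per_hold, int disk_order[NR_RQ])
{
    struct toy_elevator e = { .next_rq = 1 };
    int held[NR_CTX][NR_RQ];
    int nr_held[NR_CTX] = { 0 };
    int ctx = 0, rq, i, n = 0;

    assert(per_hold > 0);

    while (e.next_rq <= NR_RQ) {
        /* lock */
        for (i = 0; i < per_hold; i++) {
            rq = toy_dispatch_request(&e);
            if (!rq)
                break;
            held[ctx][nr_held[ctx]++] = rq;
        }
        /* unlock, the other context wins the lock next */
        ctx = (ctx + 1) % NR_CTX;
    }

    for (ctx = 0; ctx < NR_CTX; ctx++)
        for (i = 0; i < nr_held[ctx]; i++)
            disk_order[n++] = held[ctx][i];
}
```

Under these assumptions, simulate(1, order) fills order with 1, 3, 2, 4,
matching the diagram, while simulate(2, order) yields 1, 2, 3, 4 because
each context takes two consecutive requests under a single lock hold.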

While dispatching a request, blk_mq_get_dispatch_budget() and
blk_mq_get_driver_tag() must be called, and they are not ready to be
called inside elevator methods, hence introducing a new method like
dispatch_requests is not possible.

In conclusion, this set factors the global lock out of the dispatch_request
method, and supports request batch dispatch by calling the method
multiple times while holding the lock.
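
As a rough userspace illustration of that idea (not the code in this set),
the sketch below moves the lock out of a toy dispatch method and lets the
caller pull several requests per lock hold. Every name in it is
hypothetical, a pthread mutex stands in for the elevator lock, and a plain
integer stands in for the dispatch budget. Reserving the budget before
taking the lock is an assumption of the sketch, chosen to keep the critical
section short; the actual ordering in the patches may differ.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct toy_rq {
    int id;
};

/* Toy single-queue elevator; all names here are hypothetical. */
struct toy_elevator {
    pthread_mutex_t lock;           /* stands in for the elevator lock */
    struct toy_rq *fifo;            /* pending requests in dispatch order */
    size_t head, nr;
    unsigned int lock_acquisitions; /* how often the lock was taken */
};

/* Elevator method: the caller already holds e->lock, so no locking here. */
static struct toy_rq *toy_dispatch_request(struct toy_elevator *e)
{
    return e->head < e->nr ? &e->fifo[e->head++] : NULL;
}

/*
 * Dispatch up to max requests with one lock round trip.
 * *budget models a per-request resource that the caller must reserve
 * outside the elevator method. Returns the number of requests in out[].
 */
static size_t toy_dispatch_batch(struct toy_elevator *e, int *budget,
                                 struct toy_rq **out, size_t max)
{
    size_t reserved = 0, n = 0;

    assert(e && budget && out);

    /* Reserve budget in the caller, before entering the elevator. */
    while (reserved < max && *budget > 0) {
        (*budget)--;
        reserved++;
    }

    pthread_mutex_lock(&e->lock);
    e->lock_acquisitions++;
    while (n < reserved) {
        struct toy_rq *rq = toy_dispatch_request(e);

        if (!rq)
            break;
        out[n++] = rq;
    }
    pthread_mutex_unlock(&e->lock);

    /* Give back the budget that was reserved but not used. */
    *budget += (int)(reserved - n);
    return n;
}
```

With four queued requests and a budget of 3, one call returns requests 1 to
3 in elevator order after a single lock acquisition; dispatching them one
at a time would have taken the lock three times.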
nullblk setup:
modprobe null_blk nr_devices=0 &&
udevadm settle &&
cd /sys/kernel/config/nullb &&
mkdir nullb0 &&
cd nullb0 &&
echo 0 > completion_nsec &&
echo 512 > blocksize &&
echo 0 > home_node &&
echo 0 > irqmode &&
echo 128 > submit_queues &&
echo 1024 > hw_queue_depth &&
echo 1024 > size &&
echo 0 > memory_backed &&
echo 2 > queue_mode &&
echo 1 > power ||
exit $?
Test script:
fio -filename=/dev/$disk -name=test -rw=randwrite -bs=4k -iodepth=32 \
-numjobs=16 --iodepth_batch_submit=8 --iodepth_batch_complete=8 \
-direct=1 -ioengine=io_uring -group_reporting -time_based -runtime=30
Test result: iops
| | deadline | bfq |
| --------------- | -------- | -------- |
| before this set | 263k | 124k |
| after this set | 475k | 292k |
Yu Kuai (6):
  mq-deadline: switch to use high layer elevator lock
  block, bfq: don't grab queue_lock from io path
  block, bfq: switch to use elevator lock
  elevator: factor elevator lock out of dispatch_request method
  blk-mq-sched: refactor __blk_mq_do_dispatch_sched()
  blk-mq-sched: support request batch dispatching for sq elevator

 block/bfq-cgroup.c   |   4 +-
 block/bfq-iosched.c  |  73 ++++++-------
 block/bfq-iosched.h  |   2 +-
 block/blk-ioc.c      |  43 +++++++-
 block/blk-mq-sched.c | 240 ++++++++++++++++++++++++++++++-------------
 block/blk-mq.h       |  21 ++++
 block/blk.h          |   2 +-
 block/elevator.c     |   1 +
 block/elevator.h     |   4 +-
 block/mq-deadline.c  |  58 +++++------
 10 files changed, 293 insertions(+), 155 deletions(-)
--
2.39.2