cgroups.vger.kernel.org archive mirror
* [PATCH v2 0/5] blk-mq-sched: support request batch dispatching for sq elevator
@ 2025-07-30  8:22 Yu Kuai
From: Yu Kuai @ 2025-07-30  8:22 UTC
  To: dlemoal, hare, jack, tj, josef, axboe, yukuai3
  Cc: cgroups, linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun,
	johnny.chenyi

From: Yu Kuai <yukuai3@huawei.com>

Changes from v1:
 - the ioc changes are sent separately;
 - reorder patches 1-3 as suggested by Damien;

Currently, both mq-deadline and bfq have a global spin lock that is
grabbed inside elevator methods like dispatch_request, insert_requests,
and bio_merge. This global lock is the main reason mq-deadline and bfq
can't scale very well.

The dispatch_request method currently dispatches one request at a time.
With multiple dispatching contexts, this behavior on the one hand
introduces intense lock contention:

t1:                     t2:                     t3:
lock                    lock                    lock
// grab lock
ops.dispatch_request
unlock
                        // grab lock
                        ops.dispatch_request
                        unlock
                                                // grab lock
                                                ops.dispatch_request
                                                unlock

and on the other hand messes up the request dispatch order:
t1:

lock
rq1 = ops.dispatch_request
unlock
                        t2:
                        lock
                        rq2 = ops.dispatch_request
                        unlock

lock
rq3 = ops.dispatch_request
unlock

                        lock
                        rq4 = ops.dispatch_request
                        unlock

//rq1,rq3 issue to disk
                        // rq2, rq4 issue to disk

In this case, the elevator dispatch order is rq 1-2-3-4; however, the
order on disk is rq 1-3-2-4: rq2 and rq3 are inverted.
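
The interleaving above can be reproduced with a small user-space toy
model. This is only a sketch, not kernel code: simulate_issue_order(),
NR_RQ and NR_CTX are hypothetical names, and the model assumes that the
contexts take the lock in strict turns and that t1 issues its requests
before t2, as in the diagram.

```c
#include <assert.h>
#include <stddef.h>

#define NR_RQ  4	/* requests queued in the toy elevator: rq1..rq4 */
#define NR_CTX 2	/* dispatch contexts: t1 and t2 */

/*
 * Toy model (assumption, not kernel code): the elevator hands out
 * requests 1..NR_RQ in order. The contexts take turns holding the lock,
 * and on each turn a context pulls up to 'batch' requests into its
 * private list. Afterwards each context issues its private list to the
 * disk, context 0 first. Returns the number of requests issued.
 */
size_t simulate_issue_order(size_t batch, int issued[NR_RQ])
{
	int lists[NR_CTX][NR_RQ];
	size_t len[NR_CTX] = { 0 };
	int next_rq = 1;
	size_t ctx = 0;
	size_t n = 0;

	assert(batch >= 1);

	while (next_rq <= NR_RQ) {
		/* lock */
		for (size_t i = 0; i < batch && next_rq <= NR_RQ; i++)
			lists[ctx][len[ctx]++] = next_rq++;
		/* unlock */
		ctx = (ctx + 1) % NR_CTX;
	}

	/* each context issues what it pulled, in the order it pulled it */
	for (size_t c = 0; c < NR_CTX; c++)
		for (size_t i = 0; i < len[c]; i++)
			issued[n++] = lists[c][i];

	return n;
}
```

With batch = 1 the model issues rq 1-3-2-4, matching the diagram; with
batch = 2 each context pulls two consecutive requests per lock hold and
the issue order is rq 1-2-3-4.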

While dispatching a request, blk_mq_get_dispatch_budget() and
blk_mq_get_driver_tag() must be called, and they are not ready to be
called inside elevator methods, hence introducing a new method like
dispatch_requests is not possible.

In conclusion, this set factors the global lock out of the
dispatch_request method, and supports request batch dispatching by calling
the method multiple times while holding the lock.
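
As a rough illustration of the idea, the following user-space sketch
takes a lock once and calls a single-request dispatch method several
times before releasing it. All names (struct toy_elevator, toy_lock(),
toy_dispatch_request(), toy_dispatch_batch(), toy_drain()) are
hypothetical stand-ins, the lock is only a counter, and the budget and
driver tag handling of the real series is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_BATCH 64

/* Hypothetical stand-in for an elevator and its lock; not a kernel struct. */
struct toy_elevator {
	int next_rq;		/* next request id to hand out */
	int last_rq;		/* last queued request id */
	int locked;		/* 1 while the toy lock is held */
	size_t lock_count;	/* how many times the lock was taken */
};

void toy_lock(struct toy_elevator *e)
{
	assert(!e->locked);
	e->locked = 1;
	e->lock_count++;
}

void toy_unlock(struct toy_elevator *e)
{
	assert(e->locked);
	e->locked = 0;
}

/* Stand-in for ops.dispatch_request: caller holds the lock; 0 means empty. */
int toy_dispatch_request(struct toy_elevator *e)
{
	assert(e->locked);
	if (e->next_rq > e->last_rq)
		return 0;
	return e->next_rq++;
}

/* Pull up to max_batch requests under a single lock hold. */
size_t toy_dispatch_batch(struct toy_elevator *e, int *out, size_t max_batch)
{
	size_t n = 0;

	toy_lock(e);
	while (n < max_batch) {
		int rq = toy_dispatch_request(e);

		if (!rq)
			break;
		out[n++] = rq;
	}
	toy_unlock(e);

	return n;
}

/* Drain nr_rq requests and return the total number of lock acquisitions. */
size_t toy_drain(int nr_rq, size_t max_batch)
{
	struct toy_elevator e = { .next_rq = 1, .last_rq = nr_rq };
	int out[TOY_MAX_BATCH];

	assert(max_batch >= 1 && max_batch <= TOY_MAX_BATCH);
	while (toy_dispatch_batch(&e, out, max_batch) > 0)
		;

	return e.lock_count;
}
```

In this toy, draining 32 requests takes the lock 33 times with a batch
size of 1 (the last acquisition finds the elevator empty) and 5 times
with a batch size of 8.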

null_blk setup:
modprobe null_blk nr_devices=0 &&
    udevadm settle &&
    cd /sys/kernel/config/nullb &&
    mkdir nullb0 &&
    cd nullb0 &&
    echo 0 > completion_nsec &&
    echo 512 > blocksize &&
    echo 0 > home_node &&
    echo 0 > irqmode &&
    echo 128 > submit_queues &&
    echo 1024 > hw_queue_depth &&
    echo 1024 > size &&
    echo 0 > memory_backed &&
    echo 2 > queue_mode &&
    echo 1 > power ||
    exit $?

Test script:
fio -filename=/dev/$disk -name=test -rw=randwrite -bs=4k -iodepth=32 \
  -numjobs=16 --iodepth_batch_submit=8 --iodepth_batch_complete=8 \
  -direct=1 -ioengine=io_uring -group_reporting -time_based -runtime=30

Test result (IOPS):

|                 | deadline | bfq      |
| --------------- | -------- | -------- |
| before this set | 263k     | 124k     |
| after this set  | 475k     | 292k     |

Yu Kuai (5):
  blk-mq-sched: introduce high level elevator lock
  mq-deadline: switch to use elevator lock
  block, bfq: switch to use elevator lock
  blk-mq-sched: refactor __blk_mq_do_dispatch_sched()
  blk-mq-sched: support request batch dispatching for sq elevator

 block/bfq-cgroup.c   |   4 +-
 block/bfq-iosched.c  |  49 +++++----
 block/bfq-iosched.h  |   2 +-
 block/blk-mq-sched.c | 241 ++++++++++++++++++++++++++++++-------------
 block/blk-mq.h       |  21 ++++
 block/elevator.c     |   1 +
 block/elevator.h     |   4 +-
 block/mq-deadline.c  |  58 +++++------
 8 files changed, 248 insertions(+), 132 deletions(-)

-- 
2.39.2



Thread overview: 26+ messages
2025-07-30  8:22 [PATCH v2 0/5] blk-mq-sched: support request batch dispatching for sq elevator Yu Kuai
2025-07-30  8:22 ` [PATCH v2 1/5] blk-mq-sched: introduce high level elevator lock Yu Kuai
2025-07-30 17:19   ` Bart Van Assche
2025-07-30 17:59     ` Yu Kuai
2025-07-31  6:17   ` Hannes Reinecke
2025-07-30  8:22 ` [PATCH v2 2/5] mq-deadline: switch to use " Yu Kuai
2025-07-30 17:21   ` Bart Van Assche
2025-07-30 18:01     ` Yu Kuai
2025-07-30 18:10       ` Bart Van Assche
2025-07-31  6:20   ` Hannes Reinecke
2025-07-31  6:22     ` Damien Le Moal
2025-07-31  6:32       ` Yu Kuai
2025-07-31  7:04         ` Damien Le Moal
2025-07-31  7:14           ` Yu Kuai
2025-07-30  8:22 ` [PATCH v2 3/5] block, bfq: " Yu Kuai
2025-07-30 17:24   ` Bart Van Assche
2025-07-31  6:22   ` Hannes Reinecke
2025-07-30  8:22 ` [PATCH v2 4/5] blk-mq-sched: refactor __blk_mq_do_dispatch_sched() Yu Kuai
2025-07-30 18:32   ` Bart Van Assche
2025-07-31  0:49     ` Yu Kuai
2025-07-30  8:22 ` [PATCH v2 5/5] blk-mq-sched: support request batch dispatching for sq elevator Yu Kuai
2025-07-31  8:18 ` [PATCH v2 0/5] " Ming Lei
2025-07-31  8:42   ` Yu Kuai
2025-07-31  9:25     ` Ming Lei
2025-07-31  9:33       ` Yu Kuai
2025-07-31 10:22         ` Ming Lei
