From: Yu Kuai <yukuai1@huaweicloud.com>
To: colyli@kernel.org, linux-raid@vger.kernel.org
Cc: linux-block@vger.kernel.org, "yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH 1/2] block: ignore underlying non-stack devices io_opt
Date: Mon, 18 Aug 2025 09:14:09 +0800
Message-ID: <a3a98a81-16e9-2f3c-b6e5-c83a0055c784@huaweicloud.com>
In-Reply-To: <20250817152645.7115-1-colyli@kernel.org>
Hi,
On 2025/08/17 23:26, colyli@kernel.org wrote:
> From: Coly Li <colyli@kernel.org>
>
> This patch adds a new flag, BLK_FLAG_STACK_IO_OPT, for stacked block
> devices. If a stacked block device such as md raid5 declares its own
> io_opt and does not want blk_stack_limits() to change it based on the
> io_opt of underlying non-stacked block devices, it can set
> BLK_FLAG_STACK_IO_OPT in limits.flags. blk_stack_limits() then skips
> lcm_not_zero(t->io_opt, b->io_opt).
>
It would be better to refer to this thread in the commit message:
https://lore.kernel.org/all/ywsfp3lqnijgig6yrlv2ztxram6ohf5z4yfeebswjkvp2dzisd@f5ikoyo3sfq5/
It shows that scsi and mdraid define io_opt differently.
> For md raid5, it is necessary to keep a proper io_opt size for better
> I/O throughput.
>
> Signed-off-by: Coly Li <colyli@kernel.org>
> ---
> block/blk-settings.c | 6 +++++-
> drivers/md/raid5.c | 1 +
> include/linux/blkdev.h | 3 +++
> 3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 07874e9b609f..46ee538b2be9 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -782,6 +782,7 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
> t->features &= ~BLK_FEAT_POLL;
>
> t->flags |= (b->flags & BLK_FLAG_MISALIGNED);
> + t->flags |= (b->flags & BLK_FLAG_STACK_IO_OPT);
>
> t->max_sectors = min_not_zero(t->max_sectors, b->max_sectors);
> t->max_user_sectors = min_not_zero(t->max_user_sectors,
> @@ -839,7 +840,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>  				     b->physical_block_size);
>
>  	t->io_min = max(t->io_min, b->io_min);
> -	t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
> +	if (!t->io_opt || !(t->flags & BLK_FLAG_STACK_IO_OPT) ||
> +	    (b->flags & BLK_FLAG_STACK_IO_OPT))
> +		t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
> +
>  	t->dma_alignment = max(t->dma_alignment, b->dma_alignment);
>
>  	/* Set non-power-of-2 compatible chunk_sectors boundary */
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 023649fe2476..989acd8abd98 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -7730,6 +7730,7 @@ static int raid5_set_limits(struct mddev *mddev)
> lim.io_min = mddev->chunk_sectors << 9;
> lim.io_opt = lim.io_min * (conf->raid_disks - conf->max_degraded);
> lim.features |= BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE;
> + lim.flags |= BLK_FLAG_STACK_IO_OPT;
> lim.discard_granularity = stripe;
> lim.max_write_zeroes_sectors = 0;
> mddev_stack_rdev_limits(mddev, &lim, 0);
And I think raid0/raid1/raid10 should all set this flag as well.
Thanks,
Kuai
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 95886b404b16..a22c7cea9836 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -366,6 +366,9 @@ typedef unsigned int __bitwise blk_flags_t;
> /* passthrough command IO accounting */
> #define BLK_FLAG_IOSTATS_PASSTHROUGH ((__force blk_flags_t)(1u << 2))
>
> +/* ignore underlying non-stack devices io_opt */
> +#define BLK_FLAG_STACK_IO_OPT ((__force blk_flags_t)(1u << 3))
> +
> struct queue_limits {
> blk_features_t features;
> blk_flags_t flags;
>
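To make the effect of the quoted blk_stack_limits() change concrete, here is a small userspace sketch of the io_opt stacking rule. It is not kernel code: struct demo_limits, the demo_* helpers and STACK_IO_OPT_FLAG are stand-ins invented for illustration, and the 64 KiB chunk size, the 4-disk raid5 and the 1 MiB member io_opt are assumed example values, not numbers taken from the patch or the thread.

```c
#include <assert.h>

/* Userspace stand-in for the flag added by the patch (hypothetical name). */
#define STACK_IO_OPT_FLAG (1u << 3)

/* Hypothetical, heavily reduced stand-in for struct queue_limits. */
struct demo_limits {
	unsigned int flags;
	unsigned int io_opt;	/* bytes */
};

static unsigned int demo_gcd(unsigned int a, unsigned int b)
{
	while (b) {
		unsigned int r = a % b;

		a = b;
		b = r;
	}
	return a;
}

/* Same idea as lcm_not_zero(): if one side is 0, return the other side. */
static unsigned int demo_lcm_not_zero(unsigned int a, unsigned int b)
{
	if (!a || !b)
		return a ? a : b;
	return a / demo_gcd(a, b) * b;
}

/* Arithmetic from raid5_set_limits(): chunk size times number of data disks. */
static unsigned int demo_raid5_io_opt(unsigned int chunk_bytes,
				      unsigned int raid_disks,
				      unsigned int max_degraded)
{
	assert(raid_disks > max_degraded);
	return chunk_bytes * (raid_disks - max_degraded);
}

/* The io_opt part of the patched blk_stack_limits(), as quoted above. */
static void demo_stack_io_opt(struct demo_limits *t,
			      const struct demo_limits *b)
{
	t->flags |= b->flags & STACK_IO_OPT_FLAG;

	if (!t->io_opt || !(t->flags & STACK_IO_OPT_FLAG) ||
	    (b->flags & STACK_IO_OPT_FLAG))
		t->io_opt = demo_lcm_not_zero(t->io_opt, b->io_opt);
}

/* Stack one non-stacked member under a 4-disk raid5 with 64 KiB chunks. */
static unsigned int demo_result(int use_flag, unsigned int member_io_opt)
{
	struct demo_limits top = {
		.flags = use_flag ? STACK_IO_OPT_FLAG : 0,
		.io_opt = demo_raid5_io_opt(64 * 1024, 4, 1),
	};
	struct demo_limits member = { .flags = 0, .io_opt = member_io_opt };

	demo_stack_io_opt(&top, &member);
	return top.io_opt;
}
```

With these assumed values, demo_result(0, 1024 * 1024) gives 3145728 (the lcm of 192 KiB and 1 MiB), while demo_result(1, 1024 * 1024) keeps the raid5 value of 196608. Note that a member which itself carries the flag still goes through the lcm path, which matches the third condition in the quoted hunk.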