From: John Garry <john.g.garry@oracle.com>
To: Nilay Shroff <nilay@linux.ibm.com>, linux-block@vger.kernel.org
Cc: hch@lst.de, martin.petersen@oracle.com, axboe@kernel.dk,
ojaswin@linux.ibm.com, gjoyce@ibm.com
Subject: Re: [PATCH] block: fix atomic write limits for stacked devices
Date: Thu, 5 Jun 2025 10:01:51 +0100
Message-ID: <4ab6d5a2-1780-4e81-8ea1-e5d93d651dc5@oracle.com>
In-Reply-To: <01e38aba-21ec-4507-8e5f-392838e8b937@linux.ibm.com>
On 04/06/2025 16:09, Nilay Shroff wrote:
>> I need to test further, but maybe we can change the check to this:
>>
>> 	if (t->io_min <= SECTOR_SIZE || t->io_min == t->physical_block_size) {
>> 		/* No chunk sectors, so use bottom device values directly */
>> 		t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
>> 		t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
>> 		t->atomic_write_hw_max = b->atomic_write_hw_max;
>> 		return true;
>> 	}
> How about instead adding a new BLK_FEAT_STRIPED flag and then use it here while
> setting atomic limits as shown below:
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a000daafbfb4..bf5d35282d42 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>  	    !blk_stack_atomic_writes_boundary_head(t, b))
>  		return false;
>
> -	if (t->io_min <= SECTOR_SIZE) {
> -		/* No chunk sectors, so use bottom device values directly */
> +	if (t->io_min <= SECTOR_SIZE || !(t->features & BLK_FEAT_STRIPED)) {
> +		/*
> +		 * If there are no chunk sectors, or if the top device does not
> +		 * advertise the STRIPED feature (i.e., it's not a striped
> +		 * device like md-raid0 or dm-stripe), then we directly inherit
> +		 * the atomic write capabilities from the underlying (bottom)
> +		 * device.
> +		 */
>  		t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
>  		t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
>  		t->atomic_write_hw_max = b->atomic_write_hw_max;
>
> I tested the above change with md-raid0 and dm-stripe setups and it seems
> to be working well. What do you think?
I would hope that we don't require this complexity.
I think that this check should be fine:
if (t->io_min <= t->physical_block_size) {
}
But I have found a way to break atomic writes for raid10 on mainline
with that, namely when physical_block_size > chunk size. In that case
the atomic write unit max comes directly from the bottom device's
atomic write unit max, when it should be limited by the chunk size.
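
To make that failure mode concrete, here is a rough userspace sketch. It is
not kernel code: struct limits_model, inherits_directly(), limit_by_chunk()
and stacked_unit_max() are hypothetical names made up for illustration, all
values are in bytes, and the halving loop is only one simple way of limiting
a power-of-two unit max by a chunk size.

```c
#include <stdbool.h>

#define SECTOR_SIZE 512u

/* Hypothetical, cut-down stand-in for struct queue_limits (bytes throughout). */
struct limits_model {
	unsigned int io_min;
	unsigned int physical_block_size;
	unsigned int atomic_write_hw_unit_max;	/* assumed to be a power of two */
};

/* The check under discussion: when true, bottom device values are used directly. */
static bool inherits_directly(const struct limits_model *t)
{
	return t->io_min <= t->physical_block_size;
}

/*
 * Illustrative only: largest power-of-two unit, no bigger than the bottom
 * device's unit max, that divides the chunk size evenly.
 */
static unsigned int limit_by_chunk(unsigned int unit_max, unsigned int chunk_bytes)
{
	while (unit_max > SECTOR_SIZE && chunk_bytes % unit_max)
		unit_max /= 2;
	return unit_max;
}

/* Top-level atomic write unit max that this model would end up with. */
static unsigned int stacked_unit_max(const struct limits_model *t,
				     const struct limits_model *b,
				     unsigned int chunk_bytes)
{
	if (inherits_directly(t))
		return b->atomic_write_hw_unit_max;	/* chunk size never consulted */
	return limit_by_chunk(b->atomic_write_hw_unit_max, chunk_bytes);
}
```

Assuming io_min = 4K, physical_block_size = 16K, a 4K chunk and a bottom
device unit max of 16K, stacked_unit_max() takes the inherit branch and
returns 16K, whereas limit_by_chunk() on the same inputs gives 4K.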
Let me send a patch for that, and we can go from there - ok?
Thanks,
John