* [PATCH] block: fix atomic write limits for stacked devices
@ 2025-06-03 11:27 Nilay Shroff
2025-06-03 12:17 ` John Garry
0 siblings, 1 reply; 7+ messages in thread
From: Nilay Shroff @ 2025-06-03 11:27 UTC (permalink / raw)
To: linux-block; +Cc: john.g.garry, hch, martin.petersen, axboe, ojaswin, gjoyce
The current logic applies the bottom device's atomic write limits to
the stacked (top) device only if the top device does not support chunk
sectors. However, this approach is too restrictive.
We should also propagate the bottom device's atomic write limits to the
stacked device if atomic_write_hw_unit_{min,max} of the bottom device
are aligned with the top device's chunk size (io_min). Failing to do so
may unnecessarily reduce the stacked device's atomic write limits to
values much smaller than what the hardware can actually support.
For example, on an NVMe disk with the following controller capability:
$ nvme id-ctrl /dev/nvme1 | grep awupf
awupf : 63
Without the patch applied,
The bottom device (nvme1c1n1) atomic queue limits:
$ /sys/block/nvme1c1n1/queue/atomic_write_boundary_bytes:0
$ /sys/block/nvme1c1n1/queue/atomic_write_max_bytes:262144
$ /sys/block/nvme1c1n1/queue/atomic_write_unit_max_bytes:262144
$ /sys/block/nvme1c1n1/queue/atomic_write_unit_min_bytes:4096
The top device (nvme1n1) atomic queue limits:
$ /sys/block/nvme1n1/queue/atomic_write_boundary_bytes:0
$ /sys/block/nvme1n1/queue/atomic_write_max_bytes:4096
$ /sys/block/nvme1n1/queue/atomic_write_unit_max_bytes:4096
$ /sys/block/nvme1n1/queue/atomic_write_unit_min_bytes:4096
With this patch applied,
The top device (nvme1n1) atomic queue limits:
/sys/block/nvme1n1/queue/atomic_write_boundary_bytes:0
/sys/block/nvme1n1/queue/atomic_write_max_bytes:262144
/sys/block/nvme1n1/queue/atomic_write_unit_max_bytes:262144
/sys/block/nvme1n1/queue/atomic_write_unit_min_bytes:4096
This change ensures that the stacked device retains optimal atomic write
capability when alignment permits, improving overall performance and
correctness.
Reported-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Fixes: d7f36dc446e8 ("block: Support atomic writes limits for stacked devices")
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
block/blk-settings.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index a000daafbfb4..35c1354dd5ae 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
!blk_stack_atomic_writes_boundary_head(t, b))
return false;
- if (t->io_min <= SECTOR_SIZE) {
- /* No chunk sectors, so use bottom device values directly */
+ if (t->io_min <= SECTOR_SIZE ||
+ (!(t->atomic_write_hw_unit_max % t->io_min) &&
+ !(t->atomic_write_hw_unit_min % t->io_min))) {
+ /*
+ * If there are no chunk sectors, or if b->atomic_write_hw_unit
+ * _{min, max} are aligned to the chunk size (t->io_min), then
+ * use the bottom device's values directly.
+ */
t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
t->atomic_write_hw_max = b->atomic_write_hw_max;
--
2.49.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] block: fix atomic write limits for stacked devices
2025-06-03 11:27 [PATCH] block: fix atomic write limits for stacked devices Nilay Shroff
@ 2025-06-03 12:17 ` John Garry
2025-06-03 15:16 ` Nilay Shroff
0 siblings, 1 reply; 7+ messages in thread
From: John Garry @ 2025-06-03 12:17 UTC (permalink / raw)
To: Nilay Shroff, linux-block; +Cc: hch, martin.petersen, axboe, ojaswin, gjoyce
On 03/06/2025 12:27, Nilay Shroff wrote:
> The current logic applies the bottom device's atomic write limits to
> the stacked (top) device only if the top device does not support chunk
> sectors. However, this approach is too restrictive.
Note that the assumption is that md raid0/10 or dm-stripe chunk size is
in io_min. It's not ideal, but that's the field which always holds
chunk_sectors for those drivers.
max_hw_sectors would seem to be more appropriate to me...
>
> We should also propagate the bottom device's atomic write limits to the
> stacked device if atomic_write_hw_unit_{min,max} of the bottom device
> are aligned with the top device's chunk size (io_min). Failing to do so
> may unnecessarily reduce the stacked device's atomic write limits to
> values much smaller than what the hardware can actually support.
>
> For example, on an NVMe disk with the following controller capability:
> $ nvme id-ctrl /dev/nvme1 | grep awupf
> awupf : 63
>
> Without the patch applied,
>
> The bottom device (nvme1c1n1) atomic queue limits:
> $ /sys/block/nvme1c1n1/queue/atomic_write_boundary_bytes:0
> $ /sys/block/nvme1c1n1/queue/atomic_write_max_bytes:262144
> $ /sys/block/nvme1c1n1/queue/atomic_write_unit_max_bytes:262144
> $ /sys/block/nvme1c1n1/queue/atomic_write_unit_min_bytes:4096
>
> The top device (nvme1n1) atomic queue limits:
> $ /sys/block/nvme1n1/queue/atomic_write_boundary_bytes:0
> $ /sys/block/nvme1n1/queue/atomic_write_max_bytes:4096
> $ /sys/block/nvme1n1/queue/atomic_write_unit_max_bytes:4096
> $ /sys/block/nvme1n1/queue/atomic_write_unit_min_bytes:4096
>
> With this patch applied,
>
> The top device (nvme1n1) atomic queue limits:
> /sys/block/nvme1n1/queue/atomic_write_boundary_bytes:0
> /sys/block/nvme1n1/queue/atomic_write_max_bytes:262144
> /sys/block/nvme1n1/queue/atomic_write_unit_max_bytes:262144
> /sys/block/nvme1n1/queue/atomic_write_unit_min_bytes:4096
>
> This change ensures that the stacked device retains optimal atomic write
> capability when alignment permits, improving overall performance and
> correctness.
>
> Reported-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Fixes: d7f36dc446e8 ("block: Support atomic writes limits for stacked devices")
> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
> ---
> block/blk-settings.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a000daafbfb4..35c1354dd5ae 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
> !blk_stack_atomic_writes_boundary_head(t, b))
> return false;
>
> - if (t->io_min <= SECTOR_SIZE) {
> - /* No chunk sectors, so use bottom device values directly */
> + if (t->io_min <= SECTOR_SIZE ||
> + (!(t->atomic_write_hw_unit_max % t->io_min) &&
> + !(t->atomic_write_hw_unit_min % t->io_min))) {
So will this now break md raid0/10 or dm stripe when t->io_min is set (>
SECTOR_SIZE)? I mean, for md raid0/10 or dm-stripe, we should be taking
the chunk size into account there and we now don't seem to be doing so now.
What is the value of top device io_min and physical_block_size in your
example?
> + /*
> + * If there are no chunk sectors, or if b->atomic_write_hw_unit
> + * _{min, max} are aligned to the chunk size (t->io_min), then
> + * use the bottom device's values directly.
> + */
> t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
> t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
> t->atomic_write_hw_max = b->atomic_write_hw_max;
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] block: fix atomic write limits for stacked devices
2025-06-03 12:17 ` John Garry
@ 2025-06-03 15:16 ` Nilay Shroff
2025-06-04 7:29 ` John Garry
0 siblings, 1 reply; 7+ messages in thread
From: Nilay Shroff @ 2025-06-03 15:16 UTC (permalink / raw)
To: John Garry, linux-block; +Cc: hch, martin.petersen, axboe, ojaswin, gjoyce
On 6/3/25 5:47 PM, John Garry wrote:
> On 03/06/2025 12:27, Nilay Shroff wrote:
>> The current logic applies the bottom device's atomic write limits to
>> the stacked (top) device only if the top device does not support chunk
>> sectors. However, this approach is too restrictive.
>
> Note that the assumption is that md raid0/10 or dm-stripe chunk size is in io_min. It's not ideal, but that's the field which always holds chunk_sectors for those drivers.
>
> max_hw_sectors would seem to be more appropriate to me...
>
>>
>> We should also propagate the bottom device's atomic write limits to the
>> stacked device if atomic_write_hw_unit_{min,max} of the bottom device
>> are aligned with the top device's chunk size (io_min). Failing to do so
>> may unnecessarily reduce the stacked device's atomic write limits to
>> values much smaller than what the hardware can actually support.
>>
>> For example, on an NVMe disk with the following controller capability:
>> $ nvme id-ctrl /dev/nvme1 | grep awupf
>> awupf : 63
>>
>> Without the patch applied,
>>
>> The bottom device (nvme1c1n1) atomic queue limits:
>> $ /sys/block/nvme1c1n1/queue/atomic_write_boundary_bytes:0
>> $ /sys/block/nvme1c1n1/queue/atomic_write_max_bytes:262144
>> $ /sys/block/nvme1c1n1/queue/atomic_write_unit_max_bytes:262144
>> $ /sys/block/nvme1c1n1/queue/atomic_write_unit_min_bytes:4096
>>
>> The top device (nvme1n1) atomic queue limits:
>> $ /sys/block/nvme1n1/queue/atomic_write_boundary_bytes:0
>> $ /sys/block/nvme1n1/queue/atomic_write_max_bytes:4096
>> $ /sys/block/nvme1n1/queue/atomic_write_unit_max_bytes:4096
>> $ /sys/block/nvme1n1/queue/atomic_write_unit_min_bytes:4096
>>
>> With this patch applied,
>>
>> The top device (nvme1n1) atomic queue limits:
>> /sys/block/nvme1n1/queue/atomic_write_boundary_bytes:0
>> /sys/block/nvme1n1/queue/atomic_write_max_bytes:262144
>> /sys/block/nvme1n1/queue/atomic_write_unit_max_bytes:262144
>> /sys/block/nvme1n1/queue/atomic_write_unit_min_bytes:4096
>>
>> This change ensures that the stacked device retains optimal atomic write
>> capability when alignment permits, improving overall performance and
>> correctness.
>>
>> Reported-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Fixes: d7f36dc446e8 ("block: Support atomic writes limits for stacked devices")
>> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
>> ---
>> block/blk-settings.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/blk-settings.c b/block/blk-settings.c
>> index a000daafbfb4..35c1354dd5ae 100644
>> --- a/block/blk-settings.c
>> +++ b/block/blk-settings.c
>> @@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>> !blk_stack_atomic_writes_boundary_head(t, b))
>> return false;
>> - if (t->io_min <= SECTOR_SIZE) {
>> - /* No chunk sectors, so use bottom device values directly */
>> + if (t->io_min <= SECTOR_SIZE ||
>> + (!(t->atomic_write_hw_unit_max % t->io_min) &&
>> + !(t->atomic_write_hw_unit_min % t->io_min))) {
>
> So will this now break md raid0/10 or dm stripe when t->io_min is set (> SECTOR_SIZE)? I mean, for md raid0/10 or dm-stripe, we should be taking the chunk size into account there and we now don't seem to be doing so now.
>
Shouldn't it be work good if we ensure that a bottom device atomic write unit min/max are
aligned with the top device chunk sectors then top device could simply copy and use the
bottom device atomic write limits directly? Or do we have a special case for raid0/10 and
dm-strip which can't handle atomic write if chunk size for stacked device is greater than
SECTOR_SIZE?
BTW there's a typo in the above change, we should have the above if check written as below
(my bad):
if (t->io_min <= SECTOR_SIZE ||
(!(b->atomic_write_hw_unit_max % t->io_min) &&
!(b->atomic_write_hw_unit_min % t->io_min))) {
...
...
> What is the value of top device io_min and physical_block_size in your example?
The NVme disk which I am using has both t->io_min and t->physical_block_size set
to 4096.
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] block: fix atomic write limits for stacked devices
2025-06-03 15:16 ` Nilay Shroff
@ 2025-06-04 7:29 ` John Garry
2025-06-04 15:09 ` Nilay Shroff
0 siblings, 1 reply; 7+ messages in thread
From: John Garry @ 2025-06-04 7:29 UTC (permalink / raw)
To: Nilay Shroff, linux-block; +Cc: hch, martin.petersen, axboe, ojaswin, gjoyce
On 03/06/2025 16:16, Nilay Shroff wrote:
>>> diff --git a/block/blk-settings.c b/block/blk-settings.c
>>> index a000daafbfb4..35c1354dd5ae 100644
>>> --- a/block/blk-settings.c
>>> +++ b/block/blk-settings.c
>>> @@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>>> !blk_stack_atomic_writes_boundary_head(t, b))
>>> return false;
>>> - if (t->io_min <= SECTOR_SIZE) {
>>> - /* No chunk sectors, so use bottom device values directly */
>>> + if (t->io_min <= SECTOR_SIZE ||
>>> + (!(t->atomic_write_hw_unit_max % t->io_min) &&
>>> + !(t->atomic_write_hw_unit_min % t->io_min))) {
>> So will this now break md raid0/10 or dm stripe when t->io_min is set (> SECTOR_SIZE)? I mean, for md raid0/10 or dm-stripe, we should be taking the chunk size into account there and we now don't seem to be doing so now.
>>
> Shouldn't it be work good if we ensure that a bottom device atomic write unit min/max are
> aligned with the top device chunk sectors then top device could simply copy and use the
> bottom device atomic write limits directly?
You need to be more specific when you say "aligned".
Consider chunk sectors for md raid0 is 16KB and
b->atomic_write_hw_unit_max is 32KB. Then we must reduce
t->atomic_write_hw_unit_max to 16KB (so cannot use the value in
b->atomic_write_hw_unit_max directly).
> Or do we have a special case for raid0/10 and
> dm-strip which can't handle atomic write if chunk size for stacked device is greater than
> SECTOR_SIZE?
>
> BTW there's a typo in the above change, we should have the above if check written as below
> (my bad):
> if (t->io_min <= SECTOR_SIZE ||
> (!(b->atomic_write_hw_unit_max % t->io_min) &&
> !(b->atomic_write_hw_unit_min % t->io_min))) {
> ...
> ...
>
>> What is the value of top device io_min and physical_block_size in your example?
> The NVme disk which I am using has both t->io_min and t->physical_block_size set
> to 4096.
I need to test further, but maybe we can change the check to this:
if (t->io_min <= SECTOR_SIZE || t->io_min == t->physical_block_size) {
/* No chunk sectors, so use bottom device values directly */
t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
t->atomic_write_hw_max = b->atomic_write_hw_max;
return true;
}
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] block: fix atomic write limits for stacked devices
2025-06-04 7:29 ` John Garry
@ 2025-06-04 15:09 ` Nilay Shroff
2025-06-05 9:01 ` John Garry
0 siblings, 1 reply; 7+ messages in thread
From: Nilay Shroff @ 2025-06-04 15:09 UTC (permalink / raw)
To: John Garry, linux-block; +Cc: hch, martin.petersen, axboe, ojaswin, gjoyce
On 6/4/25 12:59 PM, John Garry wrote:
> On 03/06/2025 16:16, Nilay Shroff wrote:
>>>> diff --git a/block/blk-settings.c b/block/blk-settings.c
>>>> index a000daafbfb4..35c1354dd5ae 100644
>>>> --- a/block/blk-settings.c
>>>> +++ b/block/blk-settings.c
>>>> @@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>>>> !blk_stack_atomic_writes_boundary_head(t, b))
>>>> return false;
>>>> - if (t->io_min <= SECTOR_SIZE) {
>>>> - /* No chunk sectors, so use bottom device values directly */
>>>> + if (t->io_min <= SECTOR_SIZE ||
>>>> + (!(t->atomic_write_hw_unit_max % t->io_min) &&
>>>> + !(t->atomic_write_hw_unit_min % t->io_min))) {
>>> So will this now break md raid0/10 or dm stripe when t->io_min is set (> SECTOR_SIZE)? I mean, for md raid0/10 or dm-stripe, we should be taking the chunk size into account there and we now don't seem to be doing so now.
>>>
>> Shouldn't it be work good if we ensure that a bottom device atomic write unit min/max are
>> aligned with the top device chunk sectors then top device could simply copy and use the
>> bottom device atomic write limits directly?
>
> You need to be more specific when you say "aligned".
>
I meant to say bottom device atomic write unit min/max are multiples of top device chunk sectors .
> Consider chunk sectors for md raid0 is 16KB and b->atomic_write_hw_unit_max is 32KB. Then we must reduce t->atomic_write_hw_unit_max to 16KB (so cannot use the value in b->atomic_write_hw_unit_max directly).
Okay, I think I understood your concerns here.
>
>> Or do we have a special case for raid0/10 and
>> dm-strip which can't handle atomic write if chunk size for stacked device is greater than
>> SECTOR_SIZE?
>>
>> BTW there's a typo in the above change, we should have the above if check written as below
>> (my bad):
>> if (t->io_min <= SECTOR_SIZE ||
>> (!(b->atomic_write_hw_unit_max % t->io_min) &&
>> !(b->atomic_write_hw_unit_min % t->io_min))) {
>> ...
>> ...
>>
>>> What is the value of top device io_min and physical_block_size in your example?
>> The NVme disk which I am using has both t->io_min and t->physical_block_size set
>> to 4096.
>
> I need to test further, but maybe we can change the check to this:
>
> if (t->io_min <= SECTOR_SIZE || t->io_min == t->physical_block_size) {
> /* No chunk sectors, so use bottom device values directly */
> t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
> t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
> t->atomic_write_hw_max = b->atomic_write_hw_max;
> return true;
> }
How about instead adding a new BLK_FEAT_STRIPED flag and then use it here while
setting atomic limits as shown below:
diff --git a/block/blk-settings.c b/block/blk-settings.c
index a000daafbfb4..bf5d35282d42 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
!blk_stack_atomic_writes_boundary_head(t, b))
return false;
- if (t->io_min <= SECTOR_SIZE) {
- /* No chunk sectors, so use bottom device values directly */
+ if (t->io_min <= SECTOR_SIZE || !(t->features & BLK_FEAT_STRIPED)) {
+ /*
+ * If there are no chunk sectors, or if the top device does not
+ * advertise the STRIPED feature (i.e., it's not a striped
+ * device like md-raid0 or dm-stripe), then we directly inherit
+ * the atomic write capabilities from the underlying (bottom)
+ * device.
+ */
t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
t->atomic_write_hw_max = b->atomic_write_hw_max;
I tested the above change with md-raid0 and dm-strip setup and seems to
be working well. What do you think?
Thanks,
--Nilay
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] block: fix atomic write limits for stacked devices
2025-06-04 15:09 ` Nilay Shroff
@ 2025-06-05 9:01 ` John Garry
2025-06-05 9:50 ` Nilay Shroff
0 siblings, 1 reply; 7+ messages in thread
From: John Garry @ 2025-06-05 9:01 UTC (permalink / raw)
To: Nilay Shroff, linux-block; +Cc: hch, martin.petersen, axboe, ojaswin, gjoyce
On 04/06/2025 16:09, Nilay Shroff wrote:
>> I need to test further, but maybe we can change the check to this:
>>
>> if (t->io_min <= SECTOR_SIZE || t->io_min == t->physical_block_size) {
>> /* No chunk sectors, so use bottom device values directly */
>> t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
>> t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
>> t->atomic_write_hw_max = b->atomic_write_hw_max;
>> return true;
>> }
> How about instead adding a new BLK_FEAT_STRIPED flag and then use it here while
> setting atomic limits as shown below:
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a000daafbfb4..bf5d35282d42 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
> !blk_stack_atomic_writes_boundary_head(t, b))
> return false;
>
> - if (t->io_min <= SECTOR_SIZE) {
> - /* No chunk sectors, so use bottom device values directly */
> + if (t->io_min <= SECTOR_SIZE || !(t->features & BLK_FEAT_STRIPED)) {
> + /*
> + * If there are no chunk sectors, or if the top device does not
> + * advertise the STRIPED feature (i.e., it's not a striped
> + * device like md-raid0 or dm-stripe), then we directly inherit
> + * the atomic write capabilities from the underlying (bottom)
> + * device.
> + */
> t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
> t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
> t->atomic_write_hw_max = b->atomic_write_hw_max;
>
> I tested the above change with md-raid0 and dm-strip setup and seems to
> be working well. What do you think?
I would hope that we don't require this complexity.
I think that this check should be fine:
if (t->io_min <= t->physical_block_size) {
}
But I have found a method to break atomic writes for raid10 on mainline
with that - that is if I have physical_block_size > chunk size. This
ends up that atomic write unit max comes directly from the bottom device
atomic write unit max, and it should be limited by chunk size.
Let me send a patch for that, and we can go from there - ok?
Thanks,
John
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] block: fix atomic write limits for stacked devices
2025-06-05 9:01 ` John Garry
@ 2025-06-05 9:50 ` Nilay Shroff
0 siblings, 0 replies; 7+ messages in thread
From: Nilay Shroff @ 2025-06-05 9:50 UTC (permalink / raw)
To: John Garry, linux-block; +Cc: hch, martin.petersen, axboe, ojaswin, gjoyce
On 6/5/25 2:31 PM, John Garry wrote:
> On 04/06/2025 16:09, Nilay Shroff wrote:
>>> I need to test further, but maybe we can change the check to this:
>>>
>>> if (t->io_min <= SECTOR_SIZE || t->io_min == t->physical_block_size) {
>>> /* No chunk sectors, so use bottom device values directly */
>>> t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
>>> t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
>>> t->atomic_write_hw_max = b->atomic_write_hw_max;
>>> return true;
>>> }
>> How about instead adding a new BLK_FEAT_STRIPED flag and then use it here while
>> setting atomic limits as shown below:
>>
>> diff --git a/block/blk-settings.c b/block/blk-settings.c
>> index a000daafbfb4..bf5d35282d42 100644
>> --- a/block/blk-settings.c
>> +++ b/block/blk-settings.c
>> @@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>> !blk_stack_atomic_writes_boundary_head(t, b))
>> return false;
>> - if (t->io_min <= SECTOR_SIZE) {
>> - /* No chunk sectors, so use bottom device values directly */
>> + if (t->io_min <= SECTOR_SIZE || !(t->features & BLK_FEAT_STRIPED)) {
>> + /*
>> + * If there are no chunk sectors, or if the top device does not
>> + * advertise the STRIPED feature (i.e., it's not a striped
>> + * device like md-raid0 or dm-stripe), then we directly inherit
>> + * the atomic write capabilities from the underlying (bottom)
>> + * device.
>> + */
>> t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
>> t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
>> t->atomic_write_hw_max = b->atomic_write_hw_max;
>>
>> I tested the above change with md-raid0 and dm-strip setup and seems to
>> be working well. What do you think?
>
> I would hope that we don't require this complexity.
>
> I think that this check should be fine:
>
> if (t->io_min <= t->physical_block_size) {
>
> }
>
> But I have found a method to break atomic writes for raid10 on mainline with that - that is if I have physical_block_size > chunk size. This ends up that atomic write unit max comes directly from the bottom device atomic write unit max, and it should be limited by chunk size.
>
> Let me send a patch for that, and we can go from there - ok?
Yeah sure, we need to fix this. Please send your changes and I'd be glad to
help review and test your changes on my setup.
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-06-05 9:51 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-03 11:27 [PATCH] block: fix atomic write limits for stacked devices Nilay Shroff
2025-06-03 12:17 ` John Garry
2025-06-03 15:16 ` Nilay Shroff
2025-06-04 7:29 ` John Garry
2025-06-04 15:09 ` Nilay Shroff
2025-06-05 9:01 ` John Garry
2025-06-05 9:50 ` Nilay Shroff
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox