linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size
@ 2025-06-18  8:37 John Garry
  2025-06-18  8:37 ` [PATCH v2 1/5] block: sanitize chunk_sectors for atomic write limits John Garry
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: John Garry @ 2025-06-18  8:37 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

This value in io_min is used to configure any atomic write limit for the
stacked device. The idea is that the atomic write unit max is a
power-of-2 factor of the stripe size, and the stripe size is available
in io_min.

Using io_min causes issues, as:
a. it may be mutated
b. the check for io_min being set for determining if we are dealing with
a striped device is hard to get right, as reported in [0].

This series now sets chunk_sectors limit to share stripe size.

[0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781

Based on v6.16-rc2

Differences to RFC:
- sanitize chunk_sectors for atomic write limits
- set chunk_sectors in stripe_io_hints()

John Garry (5):
  block: sanitize chunk_sectors for atomic write limits
  md/raid0: set chunk_sectors limit
  md/raid10: set chunk_sectors limit
  dm-stripe: limit chunk_sectors to the stripe size
  block: use chunk_sectors when evaluating stacked atomic write limits

 block/blk-settings.c   | 60 ++++++++++++++++++++++++++----------------
 drivers/md/dm-stripe.c |  1 +
 drivers/md/raid0.c     |  1 +
 drivers/md/raid10.c    |  1 +
 4 files changed, 40 insertions(+), 23 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/5] block: sanitize chunk_sectors for atomic write limits
  2025-06-18  8:37 [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size John Garry
@ 2025-06-18  8:37 ` John Garry
  2025-06-20 14:30   ` Nilay Shroff
  2025-06-18  8:37 ` [PATCH v2 2/5] md/raid0: set chunk_sectors limit John Garry
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: John Garry @ 2025-06-18  8:37 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

Currently we just ensure that a non-zero value in chunk_sectors aligns
with any atomic write boundary, as the blk boundary functionality uses
both these values.

However it is also improper to have atomic write unit max > chunk_sectors
(for non-zero chunk_sectors), as this would lead to splitting of atomic
write bios (which is disallowed).

Sanitize atomic write unit max against chunk_sectors to avoid any
potential problems.

Fixes: d00eea91deaf3 ("block: Add extra checks in blk_validate_atomic_write_limits()")
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 block/blk-settings.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index a000daafbfb4..7ca21fb32598 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -180,7 +180,7 @@ static void blk_atomic_writes_update_limits(struct queue_limits *lim)
 
 static void blk_validate_atomic_write_limits(struct queue_limits *lim)
 {
-	unsigned int boundary_sectors;
+	unsigned int boundary_sectors, chunk_bytes;
 
 	if (!(lim->features & BLK_FEAT_ATOMIC_WRITES))
 		goto unsupported;
@@ -202,6 +202,13 @@ static void blk_validate_atomic_write_limits(struct queue_limits *lim)
 			 lim->atomic_write_hw_max))
 		goto unsupported;
 
+	chunk_bytes = lim->chunk_sectors << SECTOR_SHIFT;
+	if (chunk_bytes) {
+		if (WARN_ON_ONCE(lim->atomic_write_hw_unit_max >
+			chunk_bytes))
+			goto unsupported;
+	}
+
 	boundary_sectors = lim->atomic_write_hw_boundary >> SECTOR_SHIFT;
 
 	if (boundary_sectors) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/5] md/raid0: set chunk_sectors limit
  2025-06-18  8:37 [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size John Garry
  2025-06-18  8:37 ` [PATCH v2 1/5] block: sanitize chunk_sectors for atomic write limits John Garry
@ 2025-06-18  8:37 ` John Garry
  2025-06-20 14:31   ` Nilay Shroff
  2025-07-02  9:33   ` Yu Kuai
  2025-06-18  8:37 ` [PATCH v2 3/5] md/raid10: " John Garry
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: John Garry @ 2025-06-18  8:37 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

Currently we use min io size as the chunk size when deciding on the
atomic write size limits - see blk_stack_atomic_writes_head().

The limit min_io size is not a reliable value to store the chunk size, as
this may be mutated by the block stacking code. Such an example would be
for the min io size less than the physical block size, and the min io size
is raised to the physical block size - see blk_stack_limits().

The block stacking limits will rely on chunk_sectors in future,
so set this value (to the chunk size).

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/md/raid0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index d8f639f4ae12..cbe2a9054cb9 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -384,6 +384,7 @@ static int raid0_set_limits(struct mddev *mddev)
 	lim.max_write_zeroes_sectors = mddev->chunk_sectors;
 	lim.io_min = mddev->chunk_sectors << 9;
 	lim.io_opt = lim.io_min * mddev->raid_disks;
+	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/5] md/raid10: set chunk_sectors limit
  2025-06-18  8:37 [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size John Garry
  2025-06-18  8:37 ` [PATCH v2 1/5] block: sanitize chunk_sectors for atomic write limits John Garry
  2025-06-18  8:37 ` [PATCH v2 2/5] md/raid0: set chunk_sectors limit John Garry
@ 2025-06-18  8:37 ` John Garry
  2025-06-20 14:32   ` Nilay Shroff
  2025-07-02  9:33   ` Yu Kuai
  2025-06-18  8:37 ` [PATCH v2 4/5] dm-stripe: limit chunk_sectors to the stripe size John Garry
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: John Garry @ 2025-06-18  8:37 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

Same as done for raid0, set chunk_sectors limit to appropriately set the
atomic write size limit.

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/md/raid10.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index b74780af4c22..97065bb26f43 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4004,6 +4004,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
 	md_init_stacking_limits(&lim);
 	lim.max_write_zeroes_sectors = 0;
 	lim.io_min = mddev->chunk_sectors << 9;
+	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/5] dm-stripe: limit chunk_sectors to the stripe size
  2025-06-18  8:37 [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size John Garry
                   ` (2 preceding siblings ...)
  2025-06-18  8:37 ` [PATCH v2 3/5] md/raid10: " John Garry
@ 2025-06-18  8:37 ` John Garry
  2025-06-20 14:33   ` Nilay Shroff
  2025-06-23  9:49   ` Mikulas Patocka
  2025-06-18  8:37 ` [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: John Garry @ 2025-06-18  8:37 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

Same as done for raid0, set chunk_sectors limit to appropriately set the
atomic write size limit.

Setting chunk_sectors limit in this way overrides the stacked limit
already calculated based on the bottom device limits. This is ok, as
when any bios are sent to the bottom devices, the block layer will still
respect the bottom device chunk_sectors.

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/md/dm-stripe.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index a7dc04bd55e5..5bbbdf8fc1bd 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -458,6 +458,7 @@ static void stripe_io_hints(struct dm_target *ti,
 	struct stripe_c *sc = ti->private;
 	unsigned int chunk_size = sc->chunk_size << SECTOR_SHIFT;
 
+	limits->chunk_sectors = sc->chunk_size;
 	limits->io_min = chunk_size;
 	limits->io_opt = chunk_size * sc->stripes;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits
  2025-06-18  8:37 [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size John Garry
                   ` (3 preceding siblings ...)
  2025-06-18  8:37 ` [PATCH v2 4/5] dm-stripe: limit chunk_sectors to the stripe size John Garry
@ 2025-06-18  8:37 ` John Garry
  2025-06-20  2:40   ` Martin K. Petersen
  2025-06-20 14:33   ` Nilay Shroff
  2025-06-20 14:29 ` [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size Nilay Shroff
  2025-06-26  9:36 ` John Garry
  6 siblings, 2 replies; 19+ messages in thread
From: John Garry @ 2025-06-18  8:37 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

The atomic write unit max value is limited by any stacked device stripe
size.

It is required that the atomic write unit is a power-of-2 factor of the
stripe size.

Currently we use io_min limit to hold the stripe size, and check for a
io_min <= SECTOR_SIZE when deciding if we have a striped stacked device.

Nilay reports that this causes a problem when the physical block size is
greater than SECTOR_SIZE [0].

Furthermore, io_min may be mutated when stacking devices, and this makes
it a poor candidate to hold the stripe size. Such an example (of when
io_min may change) would be when the io_min is less than the physical
block size.

Use chunk_sectors to hold the stripe size, which is more appropriate.

[0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 block/blk-settings.c | 51 +++++++++++++++++++++++++-------------------
 1 file changed, 29 insertions(+), 22 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 7ca21fb32598..20d3563f5d3f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -596,41 +596,47 @@ static bool blk_stack_atomic_writes_boundary_head(struct queue_limits *t,
 	return true;
 }
 
+static inline unsigned int max_pow_of_two_factor(const unsigned int nr)
+{
+	return 1 << (ffs(nr) - 1);
+}
 
-/* Check stacking of first bottom device */
-static bool blk_stack_atomic_writes_head(struct queue_limits *t,
-				struct queue_limits *b)
+static void blk_stack_atomic_writes_chunk_sectors(struct queue_limits *t)
 {
-	if (b->atomic_write_hw_boundary &&
-	    !blk_stack_atomic_writes_boundary_head(t, b))
-		return false;
+	unsigned int chunk_bytes = t->chunk_sectors << SECTOR_SHIFT;
 
-	if (t->io_min <= SECTOR_SIZE) {
-		/* No chunk sectors, so use bottom device values directly */
-		t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
-		t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
-		t->atomic_write_hw_max = b->atomic_write_hw_max;
-		return true;
-	}
+	if (!t->chunk_sectors)
+		return;
 
 	/*
 	 * Find values for limits which work for chunk size.
 	 * b->atomic_write_hw_unit_{min, max} may not be aligned with chunk
-	 * size (t->io_min), as chunk size is not restricted to a power-of-2.
+	 * size, as the chunk size is not restricted to a power-of-2.
 	 * So we need to find highest power-of-2 which works for the chunk
 	 * size.
-	 * As an example scenario, we could have b->unit_max = 16K and
-	 * t->io_min = 24K. For this case, reduce t->unit_max to a value
-	 * aligned with both limits, i.e. 8K in this example.
+	 * As an example scenario, we could have t->unit_max = 16K and
+	 * t->chunk_sectors = 24KB. For this case, reduce t->unit_max to a
+	 * value aligned with both limits, i.e. 8K in this example.
 	 */
-	t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
-	while (t->io_min % t->atomic_write_hw_unit_max)
-		t->atomic_write_hw_unit_max /= 2;
+	t->atomic_write_hw_unit_max = min(t->atomic_write_hw_unit_max,
+					max_pow_of_two_factor(chunk_bytes));
 
-	t->atomic_write_hw_unit_min = min(b->atomic_write_hw_unit_min,
+	t->atomic_write_hw_unit_min = min(t->atomic_write_hw_unit_min,
 					  t->atomic_write_hw_unit_max);
-	t->atomic_write_hw_max = min(b->atomic_write_hw_max, t->io_min);
+	t->atomic_write_hw_max = min(t->atomic_write_hw_max, chunk_bytes);
+}
 
+/* Check stacking of first bottom device */
+static bool blk_stack_atomic_writes_head(struct queue_limits *t,
+				struct queue_limits *b)
+{
+	if (b->atomic_write_hw_boundary &&
+	    !blk_stack_atomic_writes_boundary_head(t, b))
+		return false;
+
+	t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
+	t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
+	t->atomic_write_hw_max = b->atomic_write_hw_max;
 	return true;
 }
 
@@ -658,6 +664,7 @@ static void blk_stack_atomic_writes_limits(struct queue_limits *t,
 
 	if (!blk_stack_atomic_writes_head(t, b))
 		goto unsupported;
+	blk_stack_atomic_writes_chunk_sectors(t);
 	return;
 
 unsupported:
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits
  2025-06-18  8:37 ` [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
@ 2025-06-20  2:40   ` Martin K. Petersen
  2025-06-20 11:35     ` John Garry
  2025-06-20 14:33   ` Nilay Shroff
  1 sibling, 1 reply; 19+ messages in thread
From: Martin K. Petersen @ 2025-06-20  2:40 UTC (permalink / raw)
  To: John Garry
  Cc: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe,
	dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen


John,

> Furthermore, io_min may be mutated when stacking devices, and this
> makes it a poor candidate to hold the stripe size. Such an example (of
> when io_min may change) would be when the io_min is less than the
> physical block size.

io_min is not allowed to be smaller than the physical_block_size. How
did we end up violating that requirement?

  logical_block_size <= physical_block_size <= io_min <= io_opt

-- 
Martin K. Petersen

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits
  2025-06-20  2:40   ` Martin K. Petersen
@ 2025-06-20 11:35     ` John Garry
  0 siblings, 0 replies; 19+ messages in thread
From: John Garry @ 2025-06-20 11:35 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe,
	dm-devel, linux-kernel, linux-raid, linux-block, ojaswin

On 20/06/2025 03:40, Martin K. Petersen wrote:
>> Furthermore, io_min may be mutated when stacking devices, and this
>> makes it a poor candidate to hold the stripe size. Such an example (of
>> when io_min may change) would be when the io_min is less than the
>> physical block size.
> io_min is not allowed to be smaller than the physical_block_size. How
> did we end up violating that requirement?
> 
>    logical_block_size <= physical_block_size <= io_min <= io_opt

I should have been a bit less ambiguous in my words.

I meant that if we try to set the stacked device io_min (from the stripe 
size) less than the bottom device phys block size, then this leads to 
the stacked device io_min being set to the bottom device phys block 
size. That's what I mean by mutating. And that's why it's a bad idea to 
assume that the stripe size is in io_min.

Having said that, we should probably reject this even being allowed – we 
should not have physical blocks straddling stripes.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size
  2025-06-18  8:37 [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size John Garry
                   ` (4 preceding siblings ...)
  2025-06-18  8:37 ` [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
@ 2025-06-20 14:29 ` Nilay Shroff
  2025-06-26  9:36 ` John Garry
  6 siblings, 0 replies; 19+ messages in thread
From: Nilay Shroff @ 2025-06-20 14:29 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen



On 6/18/25 2:07 PM, John Garry wrote:
> This value in io_min is used to configure any atomic write limit for the
> stacked device. The idea is that the atomic write unit max is a
> power-of-2 factor of the stripe size, and the stripe size is available
> in io_min.
> 
> Using io_min causes issues, as:
> a. it may be mutated
> b. the check for io_min being set for determining if we are dealing with
> a striped device is hard to get right, as reported in [0].
> 
> This series now sets chunk_sectors limit to share stripe size.
> 
> [0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781
> 
> Based on v6.16-rc2

I have validated this patchset using an NVMe disk supporting atomic write and
native NVMe multipath. I have also validated dm-stripe and raid configuration.
Overall the patchset looks good to me and fixes the issue I posted[1] earlier 
with my NVMe disk.

[1]: https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] block: sanitize chunk_sectors for atomic write limits
  2025-06-18  8:37 ` [PATCH v2 1/5] block: sanitize chunk_sectors for atomic write limits John Garry
@ 2025-06-20 14:30   ` Nilay Shroff
  0 siblings, 0 replies; 19+ messages in thread
From: Nilay Shroff @ 2025-06-20 14:30 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen



On 6/18/25 2:07 PM, John Garry wrote:
> Currently we just ensure that a non-zero value in chunk_sectors aligns
> with any atomic write boundary, as the blk boundary functionality uses
> both these values.
> 
> However it is also improper to have atomic write unit max > chunk_sectors
> (for non-zero chunk_sectors), as this would lead to splitting of atomic
> write bios (which is disallowed).
> 
> Sanitize atomic write unit max against chunk_sectors to avoid any
> potential problems.
> 
> Fixes: d00eea91deaf3 ("block: Add extra checks in blk_validate_atomic_write_limits()")
> Signed-off-by: John Garry <john.g.garry@oracle.com>

Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/5] md/raid0: set chunk_sectors limit
  2025-06-18  8:37 ` [PATCH v2 2/5] md/raid0: set chunk_sectors limit John Garry
@ 2025-06-20 14:31   ` Nilay Shroff
  2025-07-02  9:33   ` Yu Kuai
  1 sibling, 0 replies; 19+ messages in thread
From: Nilay Shroff @ 2025-06-20 14:31 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen



On 6/18/25 2:07 PM, John Garry wrote:
> Currently we use min io size as the chunk size when deciding on the
> atomic write size limits - see blk_stack_atomic_writes_head().
> 
> The limit min_io size is not a reliable value to store the chunk size, as
> this may be mutated by the block stacking code. Such an example would be
> for the min io size less than the physical block size, and the min io size
> is raised to the physical block size - see blk_stack_limits().
> 
> The block stacking limits will rely on chunk_sectors in future,
> so set this value (to the chunk size).
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>

Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/5] md/raid10: set chunk_sectors limit
  2025-06-18  8:37 ` [PATCH v2 3/5] md/raid10: " John Garry
@ 2025-06-20 14:32   ` Nilay Shroff
  2025-07-02  9:33   ` Yu Kuai
  1 sibling, 0 replies; 19+ messages in thread
From: Nilay Shroff @ 2025-06-20 14:32 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen



On 6/18/25 2:07 PM, John Garry wrote:
> Same as done for raid0, set chunk_sectors limit to appropriately set the
> atomic write size limit.
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>

Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/5] dm-stripe: limit chunk_sectors to the stripe size
  2025-06-18  8:37 ` [PATCH v2 4/5] dm-stripe: limit chunk_sectors to the stripe size John Garry
@ 2025-06-20 14:33   ` Nilay Shroff
  2025-06-23  9:49   ` Mikulas Patocka
  1 sibling, 0 replies; 19+ messages in thread
From: Nilay Shroff @ 2025-06-20 14:33 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen



On 6/18/25 2:07 PM, John Garry wrote:
> Same as done for raid0, set chunk_sectors limit to appropriately set the
> atomic write size limit.
> 
> Setting chunk_sectors limit in this way overrides the stacked limit
> already calculated based on the bottom device limits. This is ok, as
> when any bios are sent to the bottom devices, the block layer will still
> respect the bottom device chunk_sectors.
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>

Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits
  2025-06-18  8:37 ` [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
  2025-06-20  2:40   ` Martin K. Petersen
@ 2025-06-20 14:33   ` Nilay Shroff
  1 sibling, 0 replies; 19+ messages in thread
From: Nilay Shroff @ 2025-06-20 14:33 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen



On 6/18/25 2:07 PM, John Garry wrote:
> The atomic write unit max value is limited by any stacked device stripe
> size.
> 
> It is required that the atomic write unit is a power-of-2 factor of the
> stripe size.
> 
> Currently we use io_min limit to hold the stripe size, and check for a
> io_min <= SECTOR_SIZE when deciding if we have a striped stacked device.
> 
> Nilay reports that this causes a problem when the physical block size is
> greater than SECTOR_SIZE [0].
> 
> Furthermore, io_min may be mutated when stacking devices, and this makes
> it a poor candidate to hold the stripe size. Such an example (of when
> io_min may change) would be when the io_min is less than the physical
> block size.
> 
> Use chunk_sectors to hold the stripe size, which is more appropriate.
> 
> [0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>

Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/5] dm-stripe: limit chunk_sectors to the stripe size
  2025-06-18  8:37 ` [PATCH v2 4/5] dm-stripe: limit chunk_sectors to the stripe size John Garry
  2025-06-20 14:33   ` Nilay Shroff
@ 2025-06-23  9:49   ` Mikulas Patocka
  1 sibling, 0 replies; 19+ messages in thread
From: Mikulas Patocka @ 2025-06-23  9:49 UTC (permalink / raw)
  To: John Garry
  Cc: agk, snitzer, song, yukuai3, hch, nilay, axboe, dm-devel,
	linux-kernel, linux-raid, linux-block, ojaswin, martin.petersen



On Wed, 18 Jun 2025, John Garry wrote:

> Same as done for raid0, set chunk_sectors limit to appropriately set the
> atomic write size limit.
> 
> Setting chunk_sectors limit in this way overrides the stacked limit
> already calculated based on the bottom device limits. This is ok, as
> when any bios are sent to the bottom devices, the block layer will still
> respect the bottom device chunk_sectors.
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>  drivers/md/dm-stripe.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
> index a7dc04bd55e5..5bbbdf8fc1bd 100644
> --- a/drivers/md/dm-stripe.c
> +++ b/drivers/md/dm-stripe.c
> @@ -458,6 +458,7 @@ static void stripe_io_hints(struct dm_target *ti,
>  	struct stripe_c *sc = ti->private;
>  	unsigned int chunk_size = sc->chunk_size << SECTOR_SHIFT;
>  
> +	limits->chunk_sectors = sc->chunk_size;
>  	limits->io_min = chunk_size;
>  	limits->io_opt = chunk_size * sc->stripes;
>  }
> -- 
> 2.31.1

Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size
  2025-06-18  8:37 [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size John Garry
                   ` (5 preceding siblings ...)
  2025-06-20 14:29 ` [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size Nilay Shroff
@ 2025-06-26  9:36 ` John Garry
  2025-07-02  8:28   ` John Garry
  6 siblings, 1 reply; 19+ messages in thread
From: John Garry @ 2025-06-26  9:36 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen

On 18/06/2025 09:37, John Garry wrote:
> This value in io_min is used to configure any atomic write limit for the
> stacked device. The idea is that the atomic write unit max is a
> power-of-2 factor of the stripe size, and the stripe size is available
> in io_min.
> 
> Using io_min causes issues, as:
> a. it may be mutated
> b. the check for io_min being set for determining if we are dealing with
> a striped device is hard to get right, as reported in [0].
> 
> This series now sets chunk_sectors limit to share stripe size.

Any more comments here? It would be good to have the md/raid0 and 
md/raid10 changes checked by the md maintainers.

Thanks

> 
> [0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781
> 
> Based on v6.16-rc2
> 
> Differences to RFC:
> - sanitize chunk_sectors for atomic write limits
> - set chunk_sectors in stripe_io_hints()
> 
> John Garry (5):
>    block: sanitize chunk_sectors for atomic write limits
>    md/raid0: set chunk_sectors limit
>    md/raid10: set chunk_sectors limit
>    dm-stripe: limit chunk_sectors to the stripe size
>    block: use chunk_sectors when evaluating stacked atomic write limits
> 
>   block/blk-settings.c   | 60 ++++++++++++++++++++++++++----------------
>   drivers/md/dm-stripe.c |  1 +
>   drivers/md/raid0.c     |  1 +
>   drivers/md/raid10.c    |  1 +
>   4 files changed, 40 insertions(+), 23 deletions(-)
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size
  2025-06-26  9:36 ` John Garry
@ 2025-07-02  8:28   ` John Garry
  0 siblings, 0 replies; 19+ messages in thread
From: John Garry @ 2025-07-02  8:28 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen

On 26/06/2025 10:36, John Garry wrote:
 > On 18/06/2025 09:37, John Garry wrote:

Hi Jens,

Could you kindly consider picking up this series via the block tree?

I was hoping for a maintainer ack on the md raid0/1/10 stuff, but it's 
quite a straightforward change there.

Cheers,
John

>> This value in io_min is used to configure any atomic write limit for the
>> stacked device. The idea is that the atomic write unit max is a
>> power-of-2 factor of the stripe size, and the stripe size is available
>> in io_min.
>>
>> Using io_min causes issues, as:
>> a. it may be mutated
>> b. the check for io_min being set for determining if we are dealing with
>> a striped device is hard to get right, as reported in [0].
>>
>> This series now sets chunk_sectors limit to share stripe size.
> 
> Any more comments here? It would be good to have the md/raid0 and md/ 
> raid10 changes checked by the md maintainers.
> 
> Thanks
> 
>>
>> [0] https://urldefense.com/v3/__https://lore.kernel.org/linux- 
>> block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/ 
>> *mecca17129f72811137d3c2f1e477634e77f06781__;Iw!!ACWV5N9M2RV99hQ! 
>> I8diqGp3zAZO162eEpQ1SuUsrvAMTWzhbHUSxxn23h3TLcRRTAs3LUDanOeWiK2osXVfFD0HHw4PioWzfd6MhbOnyw$
>> Based on v6.16-rc2
>>
>> Differences to RFC:
>> - sanitize chunk_sectors for atomic write limits
>> - set chunk_sectors in stripe_io_hints()
>>
>> John Garry (5):
>>    block: sanitize chunk_sectors for atomic write limits
>>    md/raid0: set chunk_sectors limit
>>    md/raid10: set chunk_sectors limit
>>    dm-stripe: limit chunk_sectors to the stripe size
>>    block: use chunk_sectors when evaluating stacked atomic write limits
>>
>>   block/blk-settings.c   | 60 ++++++++++++++++++++++++++----------------
>>   drivers/md/dm-stripe.c |  1 +
>>   drivers/md/raid0.c     |  1 +
>>   drivers/md/raid10.c    |  1 +
>>   4 files changed, 40 insertions(+), 23 deletions(-) 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/5] md/raid0: set chunk_sectors limit
  2025-06-18  8:37 ` [PATCH v2 2/5] md/raid0: set chunk_sectors limit John Garry
  2025-06-20 14:31   ` Nilay Shroff
@ 2025-07-02  9:33   ` Yu Kuai
  1 sibling, 0 replies; 19+ messages in thread
From: Yu Kuai @ 2025-07-02  9:33 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, yukuai (C)

在 2025/06/18 16:37, John Garry 写道:
> Currently we use min io size as the chunk size when deciding on the
> atomic write size limits - see blk_stack_atomic_writes_head().
> 
> The limit min_io size is not a reliable value to store the chunk size, as
> this may be mutated by the block stacking code. Such an example would be
> for the min io size less than the physical block size, and the min io size
> is raised to the physical block size - see blk_stack_limits().
> 
> The block stacking limits will rely on chunk_sectors in future,
> so set this value (to the chunk size).
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>   drivers/md/raid0.c | 1 +
>   1 file changed, 1 insertion(+)
> 

Sorry about the delay.

Reviewed-by: Yu Kuai <yukuai3@huawei.com>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index d8f639f4ae12..cbe2a9054cb9 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -384,6 +384,7 @@ static int raid0_set_limits(struct mddev *mddev)
>   	lim.max_write_zeroes_sectors = mddev->chunk_sectors;
>   	lim.io_min = mddev->chunk_sectors << 9;
>   	lim.io_opt = lim.io_min * mddev->raid_disks;
> +	lim.chunk_sectors = mddev->chunk_sectors;
>   	lim.features |= BLK_FEAT_ATOMIC_WRITES;
>   	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
>   	if (err)
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/5] md/raid10: set chunk_sectors limit
  2025-06-18  8:37 ` [PATCH v2 3/5] md/raid10: " John Garry
  2025-06-20 14:32   ` Nilay Shroff
@ 2025-07-02  9:33   ` Yu Kuai
  1 sibling, 0 replies; 19+ messages in thread
From: Yu Kuai @ 2025-07-02  9:33 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, yukuai (C)

在 2025/06/18 16:37, John Garry 写道:
> Same as done for raid0, set chunk_sectors limit to appropriately set the
> atomic write size limit.
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>   drivers/md/raid10.c | 1 +
>   1 file changed, 1 insertion(+)
> 

Reviewed-by: Yu Kuai <yukuai3@huawei.com>

> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index b74780af4c22..97065bb26f43 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -4004,6 +4004,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
>   	md_init_stacking_limits(&lim);
>   	lim.max_write_zeroes_sectors = 0;
>   	lim.io_min = mddev->chunk_sectors << 9;
> +	lim.chunk_sectors = mddev->chunk_sectors;
>   	lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
>   	lim.features |= BLK_FEAT_ATOMIC_WRITES;
>   	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2025-07-02  9:33 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-18  8:37 [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size John Garry
2025-06-18  8:37 ` [PATCH v2 1/5] block: sanitize chunk_sectors for atomic write limits John Garry
2025-06-20 14:30   ` Nilay Shroff
2025-06-18  8:37 ` [PATCH v2 2/5] md/raid0: set chunk_sectors limit John Garry
2025-06-20 14:31   ` Nilay Shroff
2025-07-02  9:33   ` Yu Kuai
2025-06-18  8:37 ` [PATCH v2 3/5] md/raid10: " John Garry
2025-06-20 14:32   ` Nilay Shroff
2025-07-02  9:33   ` Yu Kuai
2025-06-18  8:37 ` [PATCH v2 4/5] dm-stripe: limit chunk_sectors to the stripe size John Garry
2025-06-20 14:33   ` Nilay Shroff
2025-06-23  9:49   ` Mikulas Patocka
2025-06-18  8:37 ` [PATCH v2 5/5] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
2025-06-20  2:40   ` Martin K. Petersen
2025-06-20 11:35     ` John Garry
2025-06-20 14:33   ` Nilay Shroff
2025-06-20 14:29 ` [PATCH v2 0/5] block/md/dm: set chunk_sectors from stacked dev stripe size Nilay Shroff
2025-06-26  9:36 ` John Garry
2025-07-02  8:28   ` John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).