linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/4] block: set chunk_sectors from stacked dev stripe size
@ 2025-06-05 15:08 John Garry
  2025-06-05 15:08 ` [PATCH RFC 1/4] md/raid0: set chunk_sectors limit John Garry
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: John Garry @ 2025-06-05 15:08 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

Currently md raid0/10 and dm-stripe sets io_min limit from stripe width.

This value in io_min is used to configure any atomic write limit for the
stacked device. The idea is that the atomic write unit max is a
power-of-2 factor of the stripe width.

Using io_min causes issues, as:
a. it may be mutated
b. the check for io_min being set for determining if we are dealing with
a striped device is hard to get right, as reported in [0].

This series now sets chunk_sectors limit to share stripe size.

Sending as an RFC as I need to test more and I am not totally confident
that using chunk_sectors is the right approach.

Apologies for any delays in response, as I will be OoO soon, and I
wanted to send this to share with Nilay.

[0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781

John Garry (4):
  md/raid0: set chunk_sectors limit
  md/raid10: set chunk_sectors limit
  dm-stripe: limit chunk_sectors to the stripe size
  block: use chunk_sectors when evaluating stacked atomic write limits

 block/blk-settings.c          | 8 +++++---
 drivers/md/dm-stripe.c        | 3 ++-
 drivers/md/dm-table.c         | 4 ++++
 drivers/md/raid0.c            | 1 +
 drivers/md/raid10.c           | 1 +
 include/linux/device-mapper.h | 3 +++
 6 files changed, 16 insertions(+), 4 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH RFC 1/4] md/raid0: set chunk_sectors limit
  2025-06-05 15:08 [PATCH RFC 0/4] block: set chunk_sectors from stacked dev stripe size John Garry
@ 2025-06-05 15:08 ` John Garry
  2025-06-05 15:08 ` [PATCH RFC 2/4] md/raid10: " John Garry
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: John Garry @ 2025-06-05 15:08 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

Currently we use min io size as the chunk size when deciding on the
atomic write size limits - see blk_stack_atomic_writes_head().

The limits min_io_size is not a reliable value to store the chunk
size, as this may be mutated by the block stacking code. Such an example
would be for the min io size less than the physical block size, and the
min io size is raised to the physical block size - see blk_stack_limits().

The block stacking limits will rely on chunk_sectors in future,
so set this value (to the chunk size).

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/md/raid0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index d8f639f4ae12..cbe2a9054cb9 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -384,6 +384,7 @@ static int raid0_set_limits(struct mddev *mddev)
 	lim.max_write_zeroes_sectors = mddev->chunk_sectors;
 	lim.io_min = mddev->chunk_sectors << 9;
 	lim.io_opt = lim.io_min * mddev->raid_disks;
+	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH RFC 2/4] md/raid10: set chunk_sectors limit
  2025-06-05 15:08 [PATCH RFC 0/4] block: set chunk_sectors from stacked dev stripe size John Garry
  2025-06-05 15:08 ` [PATCH RFC 1/4] md/raid0: set chunk_sectors limit John Garry
@ 2025-06-05 15:08 ` John Garry
  2025-06-05 15:08 ` [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size John Garry
  2025-06-05 15:08 ` [PATCH RFC 4/4] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
  3 siblings, 0 replies; 11+ messages in thread
From: John Garry @ 2025-06-05 15:08 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

Same as done for raid0, set chunk_sectors limit to appropriately set the
atomic write size limit.

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/md/raid10.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index dce06bf65016..60abe063f248 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4003,6 +4003,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
 	md_init_stacking_limits(&lim);
 	lim.max_write_zeroes_sectors = 0;
 	lim.io_min = mddev->chunk_sectors << 9;
+	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size
  2025-06-05 15:08 [PATCH RFC 0/4] block: set chunk_sectors from stacked dev stripe size John Garry
  2025-06-05 15:08 ` [PATCH RFC 1/4] md/raid0: set chunk_sectors limit John Garry
  2025-06-05 15:08 ` [PATCH RFC 2/4] md/raid10: " John Garry
@ 2025-06-05 15:08 ` John Garry
  2025-06-06 15:16   ` Nilay Shroff
  2025-06-09 15:19   ` Mikulas Patocka
  2025-06-05 15:08 ` [PATCH RFC 4/4] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
  3 siblings, 2 replies; 11+ messages in thread
From: John Garry @ 2025-06-05 15:08 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

Currently we use min io size as the chunk size when deciding on the limit
of atomic write size.

Using min io size is not reliable, as this may be mutated when stacking
the bottom device limits.

The block stacking limits will rely on chunk_sectors in future, so set
this value (to the chunk size).

Introduce a flag - DM_TARGET_STRIPED - and check this in
dm_set_device_limits() when setting this limit.

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/md/dm-stripe.c        | 3 ++-
 drivers/md/dm-table.c         | 4 ++++
 include/linux/device-mapper.h | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index a7dc04bd55e5..c30df6715149 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -466,7 +466,8 @@ static struct target_type stripe_target = {
 	.name   = "striped",
 	.version = {1, 7, 0},
 	.features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_NOWAIT |
-		    DM_TARGET_ATOMIC_WRITES | DM_TARGET_PASSES_CRYPTO,
+		    DM_TARGET_ATOMIC_WRITES | DM_TARGET_PASSES_CRYPTO |
+		    DM_TARGET_STRIPED,
 	.module = THIS_MODULE,
 	.ctr    = stripe_ctr,
 	.dtr    = stripe_dtr,
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 24a857ff6d0b..4f1f7173740c 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -430,6 +430,10 @@ static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
 		return 0;
 	}
 
+	/* For striped types, limit the chunk_sectors to the chunk size */
+	if (dm_target_supports_striped(ti->type))
+		limits->chunk_sectors = len >> SECTOR_SHIFT;
+
 	mutex_lock(&q->limits_lock);
 	/*
 	 * BLK_FEAT_ATOMIC_WRITES is not inherited from the bottom device in
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index cb95951547ab..a863523b69ee 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -309,6 +309,9 @@ struct target_type {
 #define DM_TARGET_ATOMIC_WRITES		0x00000400
 #define dm_target_supports_atomic_writes(type) ((type)->features & DM_TARGET_ATOMIC_WRITES)
 
+#define DM_TARGET_STRIPED		0x00000800
+#define dm_target_supports_striped(type) ((type)->features & DM_TARGET_STRIPED)
+
 struct dm_target {
 	struct dm_table *table;
 	struct target_type *type;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH RFC 4/4] block: use chunk_sectors when evaluating stacked atomic write limits
  2025-06-05 15:08 [PATCH RFC 0/4] block: set chunk_sectors from stacked dev stripe size John Garry
                   ` (2 preceding siblings ...)
  2025-06-05 15:08 ` [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size John Garry
@ 2025-06-05 15:08 ` John Garry
  2025-06-06 15:23   ` Nilay Shroff
  3 siblings, 1 reply; 11+ messages in thread
From: John Garry @ 2025-06-05 15:08 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, yukuai3, hch, nilay, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen, John Garry

The atomic write unit max is limited by any stack device stripe size.

It is required that the atomic write unit is a power-of-2 factor of the
stripe size.

Currently we use io_min limit to hold the stripe size, and check for a
io_min <= SECTOR_SIZE when deciding if we have a striped stacked device.

Nilay reports that this causes a problem when the physical block size is
greater than SECTOR_SIZE [0].

Furthermore, io_min may be mutated when stacking devices, and this makes
it a poor candidate to hold the stripe size. Such an example would be
when the io_min is less than the physical block size.

Use chunk_sectors to hold the stripe size, which is more appropriate.

[0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 block/blk-settings.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index a000daafbfb4..5b0f1a854e81 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -594,11 +594,13 @@ static bool blk_stack_atomic_writes_boundary_head(struct queue_limits *t,
 static bool blk_stack_atomic_writes_head(struct queue_limits *t,
 				struct queue_limits *b)
 {
+	unsigned int chunk_size = t->chunk_sectors << SECTOR_SHIFT;
+
 	if (b->atomic_write_hw_boundary &&
 	    !blk_stack_atomic_writes_boundary_head(t, b))
 		return false;
 
-	if (t->io_min <= SECTOR_SIZE) {
+	if (!t->chunk_sectors) {
 		/* No chunk sectors, so use bottom device values directly */
 		t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
 		t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
@@ -617,12 +619,12 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
 	 * aligned with both limits, i.e. 8K in this example.
 	 */
 	t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
-	while (t->io_min % t->atomic_write_hw_unit_max)
+	while (chunk_size % t->atomic_write_hw_unit_max)
 		t->atomic_write_hw_unit_max /= 2;
 
 	t->atomic_write_hw_unit_min = min(b->atomic_write_hw_unit_min,
 					  t->atomic_write_hw_unit_max);
-	t->atomic_write_hw_max = min(b->atomic_write_hw_max, t->io_min);
+	t->atomic_write_hw_max = min(b->atomic_write_hw_max, chunk_size);
 
 	return true;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size
  2025-06-05 15:08 ` [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size John Garry
@ 2025-06-06 15:16   ` Nilay Shroff
  2025-06-12 10:01     ` John Garry
  2025-06-09 15:19   ` Mikulas Patocka
  1 sibling, 1 reply; 11+ messages in thread
From: Nilay Shroff @ 2025-06-06 15:16 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen



On 6/5/25 8:38 PM, John Garry wrote:
> Currently we use min io size as the chunk size when deciding on the limit
> of atomic write size.
> 
> Using min io size is not reliable, as this may be mutated when stacking
> the bottom device limits.
> 
> The block stacking limits will rely on chunk_sectors in future, so set
> this value (to the chunk size).
> 
> Introduce a flag - DM_TARGET_STRIPED - and check this in
> dm_set_device_limits() when setting this limit.
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>  drivers/md/dm-stripe.c        | 3 ++-
>  drivers/md/dm-table.c         | 4 ++++
>  include/linux/device-mapper.h | 3 +++
>  3 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
> index a7dc04bd55e5..c30df6715149 100644
> --- a/drivers/md/dm-stripe.c
> +++ b/drivers/md/dm-stripe.c
> @@ -466,7 +466,8 @@ static struct target_type stripe_target = {
>  	.name   = "striped",
>  	.version = {1, 7, 0},
>  	.features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_NOWAIT |
> -		    DM_TARGET_ATOMIC_WRITES | DM_TARGET_PASSES_CRYPTO,
> +		    DM_TARGET_ATOMIC_WRITES | DM_TARGET_PASSES_CRYPTO |
> +		    DM_TARGET_STRIPED,
>  	.module = THIS_MODULE,
>  	.ctr    = stripe_ctr,
>  	.dtr    = stripe_dtr,
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 24a857ff6d0b..4f1f7173740c 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -430,6 +430,10 @@ static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
>  		return 0;
>  	}
>  
> +	/* For striped types, limit the chunk_sectors to the chunk size */
> +	if (dm_target_supports_striped(ti->type))
> +		limits->chunk_sectors = len >> SECTOR_SHIFT;
> +
I think here "len" refers to the total size of dm target and not the 
chunk sectors. So we need to modify this and take into account chunk sectors..
We can get chunk sectors, for example, like this:

	struct stripe_c *sc = ti->private;
	limits->chunk_sectors = sc->chunk_size;

But again struct stripe_c is private to dm-stripe.c and so we can't access it
here directly in dm-table.c Better we add a new callback function for dm target
type under struct target_type and then use that callback to get chunk sector. 

struct target_type stripe_target = {
        ...
        .chunk_sectors = stripe_chunk_sectors,
        ...
}

Thanks,
--Nilay


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH RFC 4/4] block: use chunk_sectors when evaluating stacked atomic write limits
  2025-06-05 15:08 ` [PATCH RFC 4/4] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
@ 2025-06-06 15:23   ` Nilay Shroff
  2025-06-12  9:17     ` John Garry
  0 siblings, 1 reply; 11+ messages in thread
From: Nilay Shroff @ 2025-06-06 15:23 UTC (permalink / raw)
  To: John Garry, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen



On 6/5/25 8:38 PM, John Garry wrote:
> The atomic write unit max is limited by any stack device stripe size.
> 
> It is required that the atomic write unit is a power-of-2 factor of the
> stripe size.
> 
> Currently we use io_min limit to hold the stripe size, and check for a
> io_min <= SECTOR_SIZE when deciding if we have a striped stacked device.
> 
> Nilay reports that this causes a problem when the physical block size is
> greater than SECTOR_SIZE [0].
> 
> Furthermore, io_min may be mutated when stacking devices, and this makes
> it a poor candidate to hold the stripe size. Such an example would be
> when the io_min is less than the physical block size.
> 
> Use chunk_sectors to hold the stripe size, which is more appropriate.
> 
> [0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@linux.ibm.com/T/#mecca17129f72811137d3c2f1e477634e77f06781
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>  block/blk-settings.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a000daafbfb4..5b0f1a854e81 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -594,11 +594,13 @@ static bool blk_stack_atomic_writes_boundary_head(struct queue_limits *t,
>  static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>  				struct queue_limits *b)
>  {
> +	unsigned int chunk_size = t->chunk_sectors << SECTOR_SHIFT;
> +
>  	if (b->atomic_write_hw_boundary &&
>  	    !blk_stack_atomic_writes_boundary_head(t, b))
>  		return false;
>  
> -	if (t->io_min <= SECTOR_SIZE) {
> +	if (!t->chunk_sectors) {
>  		/* No chunk sectors, so use bottom device values directly */
>  		t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
>  		t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
> @@ -617,12 +619,12 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>  	 * aligned with both limits, i.e. 8K in this example.
>  	 */
>  	t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
> -	while (t->io_min % t->atomic_write_hw_unit_max)
> +	while (chunk_size % t->atomic_write_hw_unit_max)
>  		t->atomic_write_hw_unit_max /= 2;
>  
>  	t->atomic_write_hw_unit_min = min(b->atomic_write_hw_unit_min,
>  					  t->atomic_write_hw_unit_max);
> -	t->atomic_write_hw_max = min(b->atomic_write_hw_max, t->io_min);
> +	t->atomic_write_hw_max = min(b->atomic_write_hw_max, chunk_size);
>  
>  	return true;
>  }

This works well with my NVMe disk which supports atomic writes however the only
concern is what if in case t->chunk_sectors is also defined for NVMe disk? 
I see that nvme_set_chunk_sectors() initializes the chunk_sectors for NVMe. 
The value which is assigned to lim->chunk_sectors in nvme_set_chunk_sectors()
represents "noiob" (i.e. Namespace Optimal I/O Boundary). My disk has "noiob" 
set to zero but in case if it's non-zero then would it break the above logic
for NVMe atomic writes?

Thanks,
--Nilay



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size
  2025-06-05 15:08 ` [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size John Garry
  2025-06-06 15:16   ` Nilay Shroff
@ 2025-06-09 15:19   ` Mikulas Patocka
  2025-06-12  9:15     ` John Garry
  1 sibling, 1 reply; 11+ messages in thread
From: Mikulas Patocka @ 2025-06-09 15:19 UTC (permalink / raw)
  To: John Garry
  Cc: agk, snitzer, song, yukuai3, hch, nilay, axboe, dm-devel,
	linux-kernel, linux-raid, linux-block, ojaswin, martin.petersen



On Thu, 5 Jun 2025, John Garry wrote:

> Currently we use min io size as the chunk size when deciding on the limit
> of atomic write size.
> 
> Using min io size is not reliable, as this may be mutated when stacking
> the bottom device limits.
> 
> The block stacking limits will rely on chunk_sectors in future, so set
> this value (to the chunk size).
> 
> Introduce a flag - DM_TARGET_STRIPED - and check this in
> dm_set_device_limits() when setting this limit.
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>  drivers/md/dm-stripe.c        | 3 ++-
>  drivers/md/dm-table.c         | 4 ++++
>  include/linux/device-mapper.h | 3 +++
>  3 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
> index a7dc04bd55e5..c30df6715149 100644
> --- a/drivers/md/dm-stripe.c
> +++ b/drivers/md/dm-stripe.c
> @@ -466,7 +466,8 @@ static struct target_type stripe_target = {
>  	.name   = "striped",
>  	.version = {1, 7, 0},
>  	.features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_NOWAIT |
> -		    DM_TARGET_ATOMIC_WRITES | DM_TARGET_PASSES_CRYPTO,
> +		    DM_TARGET_ATOMIC_WRITES | DM_TARGET_PASSES_CRYPTO |
> +		    DM_TARGET_STRIPED,
>  	.module = THIS_MODULE,
>  	.ctr    = stripe_ctr,
>  	.dtr    = stripe_dtr,
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 24a857ff6d0b..4f1f7173740c 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -430,6 +430,10 @@ static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
>  		return 0;
>  	}
>  
> +	/* For striped types, limit the chunk_sectors to the chunk size */
> +	if (dm_target_supports_striped(ti->type))
> +		limits->chunk_sectors = len >> SECTOR_SHIFT;
> +

len is already in sectors, so why do we shift it right?

Could this logic be moved to the function stripe_io_hints, so that we 
don't have to add a new flag for that and that we don't have to modify the 
generic dm code?

Mikulas

>  	mutex_lock(&q->limits_lock);
>  	/*
>  	 * BLK_FEAT_ATOMIC_WRITES is not inherited from the bottom device in
> diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
> index cb95951547ab..a863523b69ee 100644
> --- a/include/linux/device-mapper.h
> +++ b/include/linux/device-mapper.h
> @@ -309,6 +309,9 @@ struct target_type {
>  #define DM_TARGET_ATOMIC_WRITES		0x00000400
>  #define dm_target_supports_atomic_writes(type) ((type)->features & DM_TARGET_ATOMIC_WRITES)
>  
> +#define DM_TARGET_STRIPED		0x00000800
> +#define dm_target_supports_striped(type) ((type)->features & DM_TARGET_STRIPED)
> +
>  struct dm_target {
>  	struct dm_table *table;
>  	struct target_type *type;
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size
  2025-06-09 15:19   ` Mikulas Patocka
@ 2025-06-12  9:15     ` John Garry
  0 siblings, 0 replies; 11+ messages in thread
From: John Garry @ 2025-06-12  9:15 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: agk, snitzer, song, yukuai3, hch, nilay, axboe, dm-devel,
	linux-kernel, linux-raid, linux-block, ojaswin, martin.petersen

On 09/06/2025 16:19, Mikulas Patocka wrote:
>> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
>> index 24a857ff6d0b..4f1f7173740c 100644
>> --- a/drivers/md/dm-table.c
>> +++ b/drivers/md/dm-table.c
>> @@ -430,6 +430,10 @@ static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
>>   		return 0;
>>   	}
>>   
>> +	/* For striped types, limit the chunk_sectors to the chunk size */
>> +	if (dm_target_supports_striped(ti->type))
>> +		limits->chunk_sectors = len >> SECTOR_SHIFT;
>> +
> len is already in sectors, so why do we shift it right?

Actually what I am passing is not the proper value at all. len holds the 
sc->stripe_width, and that seems to be md dev size / # stripes. I think 
that we want chunk_size, right?

> 
> Could this logic be moved to the function stripe_io_hints, so that we
> don't have to add a new flag for that and that we don't have to modify the
> generic dm code?
> 

That would be better. I am going to have to change 
blk_stack_atomic_writes_limits() to work for that, but I think that code 
needs to change anyway if the bottom device has its own chunk_sectors 
(as Nilay mentioned about 4/4).

Thanks,
John

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH RFC 4/4] block: use chunk_sectors when evaluating stacked atomic write limits
  2025-06-06 15:23   ` Nilay Shroff
@ 2025-06-12  9:17     ` John Garry
  0 siblings, 0 replies; 11+ messages in thread
From: John Garry @ 2025-06-12  9:17 UTC (permalink / raw)
  To: Nilay Shroff, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen

On 06/06/2025 16:23, Nilay Shroff wrote:
>>   	t->atomic_write_hw_unit_min = min(b->atomic_write_hw_unit_min,
>>   					  t->atomic_write_hw_unit_max);
>> -	t->atomic_write_hw_max = min(b->atomic_write_hw_max, t->io_min);
>> +	t->atomic_write_hw_max = min(b->atomic_write_hw_max, chunk_size);
>>   
>>   	return true;
>>   }
> This works well with my NVMe disk which supports atomic writes however the only
> concern is what if in case t->chunk_sectors is also defined for NVMe disk?
> I see that nvme_set_chunk_sectors() initializes the chunk_sectors for NVMe.
> The value which is assigned to lim->chunk_sectors in nvme_set_chunk_sectors()
> represents "noiob" (i.e. Namespace Optimal I/O Boundary). My disk has "noiob"
> set to zero but in case if it's non-zero then would it break the above logic
> for NVMe atomic writes?

Yeah, I think that I need to change the code to account for the bottom 
device setting chunk_sectors.

Thanks,
John

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size
  2025-06-06 15:16   ` Nilay Shroff
@ 2025-06-12 10:01     ` John Garry
  0 siblings, 0 replies; 11+ messages in thread
From: John Garry @ 2025-06-12 10:01 UTC (permalink / raw)
  To: Nilay Shroff, agk, snitzer, mpatocka, song, yukuai3, hch, axboe
  Cc: dm-devel, linux-kernel, linux-raid, linux-block, ojaswin,
	martin.petersen

On 06/06/2025 16:16, Nilay Shroff wrote:
>> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
>> index 24a857ff6d0b..4f1f7173740c 100644
>> --- a/drivers/md/dm-table.c
>> +++ b/drivers/md/dm-table.c
>> @@ -430,6 +430,10 @@ static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
>>   		return 0;
>>   	}
>>   
>> +	/* For striped types, limit the chunk_sectors to the chunk size */
>> +	if (dm_target_supports_striped(ti->type))
>> +		limits->chunk_sectors = len >> SECTOR_SHIFT;
>> +
> I think here "len" refers to the total size of dm target and not the
> chunk sectors. So we need to modify this and take into account chunk sectors..
> We can get chunk sectors, for example, like this:
> 
> 	struct stripe_c *sc = ti->private;
> 	limits->chunk_sectors = sc->chunk_size;

right

I find that the terminology used in the dm code for stripe width and 
size a bit confusing.

> 
> But again struct stripe_c is private to dm-stripe.c and so we can't access it
> here directly in dm-table.c Better we add a new callback function for dm target
> type under struct target_type and then use that callback to get chunk sector.
> 
> struct target_type stripe_target = {
>          ...
>          .chunk_sectors = stripe_chunk_sectors,
>          ...
> }


Please see reply to Mikulas on this same topic.

Thanks,
John

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-06-12 10:01 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-05 15:08 [PATCH RFC 0/4] block: set chunk_sectors from stacked dev stripe size John Garry
2025-06-05 15:08 ` [PATCH RFC 1/4] md/raid0: set chunk_sectors limit John Garry
2025-06-05 15:08 ` [PATCH RFC 2/4] md/raid10: " John Garry
2025-06-05 15:08 ` [PATCH RFC 3/4] dm-stripe: limit chunk_sectors to the stripe size John Garry
2025-06-06 15:16   ` Nilay Shroff
2025-06-12 10:01     ` John Garry
2025-06-09 15:19   ` Mikulas Patocka
2025-06-12  9:15     ` John Garry
2025-06-05 15:08 ` [PATCH RFC 4/4] block: use chunk_sectors when evaluating stacked atomic write limits John Garry
2025-06-06 15:23   ` Nilay Shroff
2025-06-12  9:17     ` John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).