* [PATCH RFC 0/5] device mapper atomic write support
@ 2025-01-06 12:41 John Garry
2025-01-06 12:41 ` [PATCH 1/5] block: Ensure start sector is aligned for stacking atomic writes John Garry
` (5 more replies)
0 siblings, 6 replies; 20+ messages in thread
From: John Garry @ 2025-01-06 12:41 UTC (permalink / raw)
To: axboe, agk, snitzer, hch
Cc: mpatocka, martin.petersen, linux-block, dm-devel, linux-kernel,
John Garry
This series introduces initial device mapper atomic write support.
Since we already support stacking atomic writes limits, it's quite
straightforward to support.
Only dm-linear is supported for now, but other personalities could
be supported.
Patch #1 is a proper fix, but the rest of the series is RFC - this is
because I have not fully tested and we are close to the end of this
development cycle.
Based on v6.13-rc6
John Garry (5):
block: Ensure start sector is aligned for stacking atomic writes
block: Change blk_stack_atomic_writes_limits() unit_min check
dm-table: Atomic writes support
dm: Ensure cloned bio is same length for atomic write
dm-linear: Enable atomic writes
block/blk-settings.c | 9 ++++++---
drivers/md/dm-linear.c | 3 ++-
drivers/md/dm-table.c | 12 ++++++++++++
drivers/md/dm.c | 3 +++
include/linux/blkdev.h | 21 ++++++++++++---------
include/linux/device-mapper.h | 3 +++
6 files changed, 38 insertions(+), 13 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/5] block: Ensure start sector is aligned for stacking atomic writes
2025-01-06 12:41 [PATCH RFC 0/5] device mapper atomic write support John Garry
@ 2025-01-06 12:41 ` John Garry
2025-01-06 15:36 ` Christoph Hellwig
2025-01-06 12:41 ` [PATCH RFC 2/5] block: Change blk_stack_atomic_writes_limits() unit_min check John Garry
` (4 subsequent siblings)
5 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2025-01-06 12:41 UTC (permalink / raw)
To: axboe, agk, snitzer, hch
Cc: mpatocka, martin.petersen, linux-block, dm-devel, linux-kernel,
John Garry
For stacking atomic writes, ensure that the start sector is aligned with
the device atomic write unit min and any boundary. Otherwise, we may
permit misaligned atomic writes.
Rework bdev_can_atomic_write() into a common helper to resuse the
alignment check. There also use atomic_write_hw_unit_min, which is more
proper (than atomic_write_unit_min).
Fixes: d7f36dc446e89 ("block: Support atomic writes limits for stacked devices")
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
block/blk-settings.c | 7 +++++--
include/linux/blkdev.h | 21 ++++++++++++---------
2 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 8f09e33f41f6..a8dd5c097b8a 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -584,7 +584,7 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
}
static void blk_stack_atomic_writes_limits(struct queue_limits *t,
- struct queue_limits *b)
+ struct queue_limits *b, sector_t start)
{
if (!(t->features & BLK_FEAT_ATOMIC_WRITES_STACKED))
goto unsupported;
@@ -592,6 +592,9 @@ static void blk_stack_atomic_writes_limits(struct queue_limits *t,
if (!b->atomic_write_unit_min)
goto unsupported;
+ if (!blk_atomic_write_start_sect_aligned(start, b))
+ goto unsupported;
+
/*
* If atomic_write_hw_max is set, we have already stacked 1x bottom
* device, so check for compliance.
@@ -774,7 +777,7 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
t->zone_write_granularity = 0;
t->max_zone_append_sectors = 0;
}
- blk_stack_atomic_writes_limits(t, b);
+ blk_stack_atomic_writes_limits(t, b, start);
return ret;
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 378d3a1a22fc..b9776d469781 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1699,6 +1699,15 @@ struct io_comp_batch {
void (*complete)(struct io_comp_batch *);
};
+static inline bool blk_atomic_write_start_sect_aligned(sector_t sector,
+ struct queue_limits *limits)
+{
+ unsigned int alignment = max(limits->atomic_write_hw_unit_min,
+ limits->atomic_write_hw_boundary);
+
+ return IS_ALIGNED(sector, alignment >> SECTOR_SHIFT);
+}
+
static inline bool bdev_can_atomic_write(struct block_device *bdev)
{
struct request_queue *bd_queue = bdev->bd_queue;
@@ -1707,15 +1716,9 @@ static inline bool bdev_can_atomic_write(struct block_device *bdev)
if (!limits->atomic_write_unit_min)
return false;
- if (bdev_is_partition(bdev)) {
- sector_t bd_start_sect = bdev->bd_start_sect;
- unsigned int alignment =
- max(limits->atomic_write_unit_min,
- limits->atomic_write_hw_boundary);
-
- if (!IS_ALIGNED(bd_start_sect, alignment >> SECTOR_SHIFT))
- return false;
- }
+ if (bdev_is_partition(bdev))
+ return blk_atomic_write_start_sect_aligned(bdev->bd_start_sect,
+ limits);
return true;
}
--
2.31.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH RFC 2/5] block: Change blk_stack_atomic_writes_limits() unit_min check
2025-01-06 12:41 [PATCH RFC 0/5] device mapper atomic write support John Garry
2025-01-06 12:41 ` [PATCH 1/5] block: Ensure start sector is aligned for stacking atomic writes John Garry
@ 2025-01-06 12:41 ` John Garry
2025-01-06 15:37 ` Christoph Hellwig
2025-01-06 12:41 ` [PATCH RFC 3/5] dm-table: Atomic writes support John Garry
` (3 subsequent siblings)
5 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2025-01-06 12:41 UTC (permalink / raw)
To: axboe, agk, snitzer, hch
Cc: mpatocka, martin.petersen, linux-block, dm-devel, linux-kernel,
John Garry
The current check in blk_stack_atomic_writes_limits() for a bottom device
supporting atomic writes is to verify that limit atomic_write_unit_min is
non-zero.
This would cause a problem for device mapper queue limits calculation. This
is because it uses a temporary queue_limits structure to stack the limits,
before finally commiting the limits update.
The value of atomic_write_unit_min for the temporary queue_limits
structure is never evaluated and so cannot be used, so use limit
atomic_write_hw_unit_min.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
block/blk-settings.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index a8dd5c097b8a..d4fccd09e237 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -589,7 +589,7 @@ static void blk_stack_atomic_writes_limits(struct queue_limits *t,
if (!(t->features & BLK_FEAT_ATOMIC_WRITES_STACKED))
goto unsupported;
- if (!b->atomic_write_unit_min)
+ if (!b->atomic_write_hw_unit_min)
goto unsupported;
if (!blk_atomic_write_start_sect_aligned(start, b))
--
2.31.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH RFC 3/5] dm-table: Atomic writes support
2025-01-06 12:41 [PATCH RFC 0/5] device mapper atomic write support John Garry
2025-01-06 12:41 ` [PATCH 1/5] block: Ensure start sector is aligned for stacking atomic writes John Garry
2025-01-06 12:41 ` [PATCH RFC 2/5] block: Change blk_stack_atomic_writes_limits() unit_min check John Garry
@ 2025-01-06 12:41 ` John Garry
2025-01-06 17:49 ` Mike Snitzer
2025-01-06 12:41 ` [PATCH RFC 4/5] dm: Ensure cloned bio is same length for atomic write John Garry
` (2 subsequent siblings)
5 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2025-01-06 12:41 UTC (permalink / raw)
To: axboe, agk, snitzer, hch
Cc: mpatocka, martin.petersen, linux-block, dm-devel, linux-kernel,
John Garry
Support stacking atomic write limits for DM devices.
All the pre-existing code in blk_stack_atomic_writes_limits() already takes
care of finding the aggregate limits from the bottom devices.
Feature flag DM_TARGET_ATOMIC_WRITES is introduced so that atomic writes
can be enabled on personalities selectively. This is to ensure that atomic
writes are only enabled when verified to be working properly (for a
specific personality). In addition, it just may not make sense to enable
atomic writes on some personalities (so this flag also helps there).
When testing for bottom device atomic writes support, only the bdev
queue limits are tested. There is no need to test the bottom bdev
start sector (like which bdev_can_atomic_write() does), as this would
already be checked in the dm_calculate_queue_limits() -> ..
iterate_devices() -> dm_set_device_limits() -> blk_stack_limits()
callchain.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
drivers/md/dm-table.c | 12 ++++++++++++
include/linux/device-mapper.h | 3 +++
2 files changed, 15 insertions(+)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index bd8b796ae683..1e0b7e364546 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1593,6 +1593,7 @@ int dm_calculate_queue_limits(struct dm_table *t,
struct queue_limits ti_limits;
unsigned int zone_sectors = 0;
bool zoned = false;
+ bool atomic_writes = true;
dm_set_stacking_limits(limits);
@@ -1602,8 +1603,12 @@ int dm_calculate_queue_limits(struct dm_table *t,
if (!dm_target_passes_integrity(ti->type))
t->integrity_supported = false;
+ if (!dm_target_supports_atomic_writes(ti->type))
+ atomic_writes = false;
}
+ if (atomic_writes)
+ limits->features |= BLK_FEAT_ATOMIC_WRITES_STACKED;
for (unsigned int i = 0; i < t->num_targets; i++) {
struct dm_target *ti = dm_table_get_target(t, i);
@@ -1616,6 +1621,13 @@ int dm_calculate_queue_limits(struct dm_table *t,
goto combine_limits;
}
+ /*
+ * dm_set_device_limits() -> blk_stack_limits() considers
+ * ti_limits as 'top', so set BLK_FEAT_ATOMIC_WRITES_STACKED
+ * here also.
+ */
+ if (atomic_writes)
+ ti_limits.features |= BLK_FEAT_ATOMIC_WRITES_STACKED;
/*
* Combine queue limits of all the devices this target uses.
*/
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 8321f65897f3..bcc6d7b69470 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -299,6 +299,9 @@ struct target_type {
#define dm_target_supports_mixed_zoned_model(type) (false)
#endif
+#define DM_TARGET_ATOMIC_WRITES 0x00000400
+#define dm_target_supports_atomic_writes(type) ((type)->features & DM_TARGET_ATOMIC_WRITES)
+
struct dm_target {
struct dm_table *table;
struct target_type *type;
--
2.31.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH RFC 4/5] dm: Ensure cloned bio is same length for atomic write
2025-01-06 12:41 [PATCH RFC 0/5] device mapper atomic write support John Garry
` (2 preceding siblings ...)
2025-01-06 12:41 ` [PATCH RFC 3/5] dm-table: Atomic writes support John Garry
@ 2025-01-06 12:41 ` John Garry
2025-01-06 12:41 ` [PATCH RFC 5/5] dm-linear: Enable atomic writes John Garry
2025-01-06 17:26 ` [PATCH RFC 0/5] device mapper atomic write support Mike Snitzer
5 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2025-01-06 12:41 UTC (permalink / raw)
To: axboe, agk, snitzer, hch
Cc: mpatocka, martin.petersen, linux-block, dm-devel, linux-kernel,
John Garry
For an atomic writes, a cloned bio must be same length as the original bio,
i.e. no splitting.
Error in case it is not.
Per-dm device queue limits should be setup to ensure that this does not
happen, but error this case as an insurance policy.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
drivers/md/dm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 12ecf07a3841..e26c73fb365a 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1746,6 +1746,9 @@ static blk_status_t __split_and_process_bio(struct clone_info *ci)
ci->submit_as_polled = !!(ci->bio->bi_opf & REQ_POLLED);
len = min_t(sector_t, max_io_len(ti, ci->sector), ci->sector_count);
+ if (ci->bio->bi_opf & REQ_ATOMIC && len != ci->sector_count)
+ return BLK_STS_IOERR;
+
setup_split_accounting(ci, len);
if (unlikely(ci->bio->bi_opf & REQ_NOWAIT)) {
--
2.31.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH RFC 5/5] dm-linear: Enable atomic writes
2025-01-06 12:41 [PATCH RFC 0/5] device mapper atomic write support John Garry
` (3 preceding siblings ...)
2025-01-06 12:41 ` [PATCH RFC 4/5] dm: Ensure cloned bio is same length for atomic write John Garry
@ 2025-01-06 12:41 ` John Garry
2025-01-06 17:26 ` [PATCH RFC 0/5] device mapper atomic write support Mike Snitzer
5 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2025-01-06 12:41 UTC (permalink / raw)
To: axboe, agk, snitzer, hch
Cc: mpatocka, martin.petersen, linux-block, dm-devel, linux-kernel,
John Garry
Set feature flag DM_TARGET_ATOMIC_WRITES.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
drivers/md/dm-linear.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 49fb0f684193..351f4ee83997 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -201,7 +201,8 @@ static struct target_type linear_target = {
.name = "linear",
.version = {1, 4, 0},
.features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_NOWAIT |
- DM_TARGET_ZONED_HM | DM_TARGET_PASSES_CRYPTO,
+ DM_TARGET_ZONED_HM | DM_TARGET_PASSES_CRYPTO |
+ DM_TARGET_ATOMIC_WRITES,
.report_zones = linear_report_zones,
.module = THIS_MODULE,
.ctr = linear_ctr,
--
2.31.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/5] block: Ensure start sector is aligned for stacking atomic writes
2025-01-06 12:41 ` [PATCH 1/5] block: Ensure start sector is aligned for stacking atomic writes John Garry
@ 2025-01-06 15:36 ` Christoph Hellwig
0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2025-01-06 15:36 UTC (permalink / raw)
To: John Garry
Cc: axboe, agk, snitzer, hch, mpatocka, martin.petersen, linux-block,
dm-devel, linux-kernel
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 2/5] block: Change blk_stack_atomic_writes_limits() unit_min check
2025-01-06 12:41 ` [PATCH RFC 2/5] block: Change blk_stack_atomic_writes_limits() unit_min check John Garry
@ 2025-01-06 15:37 ` Christoph Hellwig
0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2025-01-06 15:37 UTC (permalink / raw)
To: John Garry
Cc: axboe, agk, snitzer, hch, mpatocka, martin.petersen, linux-block,
dm-devel, linux-kernel
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-06 12:41 [PATCH RFC 0/5] device mapper atomic write support John Garry
` (4 preceding siblings ...)
2025-01-06 12:41 ` [PATCH RFC 5/5] dm-linear: Enable atomic writes John Garry
@ 2025-01-06 17:26 ` Mike Snitzer
2025-01-06 18:14 ` John Garry
5 siblings, 1 reply; 20+ messages in thread
From: Mike Snitzer @ 2025-01-06 17:26 UTC (permalink / raw)
To: John Garry
Cc: axboe, agk, hch, mpatocka, martin.petersen, linux-block, dm-devel,
linux-kernel
On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
> This series introduces initial device mapper atomic write support.
>
> Since we already support stacking atomic writes limits, it's quite
> straightforward to support.
>
> Only dm-linear is supported for now, but other personalities could
> be supported.
>
> Patch #1 is a proper fix, but the rest of the series is RFC - this is
> because I have not fully tested and we are close to the end of this
> development cycle.
In general, looks reasonable. But I would prefer to see atomic write
support added to dm-striped as well. Not that I have some need, but
because it will help verify the correctness of the general stacking
code changes (in both block and DM core). I wrote and/or fixed a fair
amount of the non-atomic block limits stacking code over the
years.. so this is just me trying to inform this effort based on
limits stacking gotchas we've experienced to this point.
Looks like adding dm-striped support would just need to ensure that
the chunk_size is multiple of atomic write size (so chunk_size >=
atomic write size).
Relative to linear, testing limits stacking in terms of linear should
also verify that concatenated volumes work.
Thanks,
Mike
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 3/5] dm-table: Atomic writes support
2025-01-06 12:41 ` [PATCH RFC 3/5] dm-table: Atomic writes support John Garry
@ 2025-01-06 17:49 ` Mike Snitzer
2025-01-06 18:18 ` John Garry
0 siblings, 1 reply; 20+ messages in thread
From: Mike Snitzer @ 2025-01-06 17:49 UTC (permalink / raw)
To: John Garry
Cc: axboe, agk, hch, mpatocka, martin.petersen, linux-block, dm-devel,
linux-kernel
On Mon, Jan 06, 2025 at 12:41:17PM +0000, John Garry wrote:
> Support stacking atomic write limits for DM devices.
>
> All the pre-existing code in blk_stack_atomic_writes_limits() already takes
> care of finding the aggregate limits from the bottom devices.
>
> Feature flag DM_TARGET_ATOMIC_WRITES is introduced so that atomic writes
> can be enabled on personalities selectively. This is to ensure that atomic
> writes are only enabled when verified to be working properly (for a
> specific personality). In addition, it just may not make sense to enable
> atomic writes on some personalities (so this flag also helps there).
>
> When testing for bottom device atomic writes support, only the bdev
> queue limits are tested. There is no need to test the bottom bdev
> start sector (like which bdev_can_atomic_write() does), as this would
> already be checked in the dm_calculate_queue_limits() -> ..
> iterate_devices() -> dm_set_device_limits() -> blk_stack_limits()
> callchain.
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
> drivers/md/dm-table.c | 12 ++++++++++++
> include/linux/device-mapper.h | 3 +++
> 2 files changed, 15 insertions(+)
>
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index bd8b796ae683..1e0b7e364546 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -1593,6 +1593,7 @@ int dm_calculate_queue_limits(struct dm_table *t,
> struct queue_limits ti_limits;
> unsigned int zone_sectors = 0;
> bool zoned = false;
> + bool atomic_writes = true;
>
> dm_set_stacking_limits(limits);
>
> @@ -1602,8 +1603,12 @@ int dm_calculate_queue_limits(struct dm_table *t,
>
> if (!dm_target_passes_integrity(ti->type))
> t->integrity_supported = false;
> + if (!dm_target_supports_atomic_writes(ti->type))
> + atomic_writes = false;
> }
>
> + if (atomic_writes)
> + limits->features |= BLK_FEAT_ATOMIC_WRITES_STACKED;
> for (unsigned int i = 0; i < t->num_targets; i++) {
> struct dm_target *ti = dm_table_get_target(t, i);
>
> @@ -1616,6 +1621,13 @@ int dm_calculate_queue_limits(struct dm_table *t,
> goto combine_limits;
> }
>
> + /*
> + * dm_set_device_limits() -> blk_stack_limits() considers
> + * ti_limits as 'top', so set BLK_FEAT_ATOMIC_WRITES_STACKED
> + * here also.
> + */
> + if (atomic_writes)
> + ti_limits.features |= BLK_FEAT_ATOMIC_WRITES_STACKED;
> /*
> * Combine queue limits of all the devices this target uses.
> */
You're referring to this code that immediately follows this ^ comment
which stacks up the limits of a target's potential to have N component
data devices:
ti->type->iterate_devices(ti, dm_set_device_limits,
&ti_limits);
Your comment and redundant feature flag setting is feels wrong. I'd
expect code comparable to what is done for zoned, e.g.:
if (!zoned && (ti_limits.features & BLK_FEAT_ZONED)) {
/*
* After stacking all limits, validate all devices
* in table support this zoned model and zone sectors.
*/
zoned = (ti_limits.features & BLK_FEAT_ZONED);
zone_sectors = ti_limits.chunk_sectors;
}
Meaning, for zoned devices, a side-effect of the
ti->type->iterate_devices() call (and N blk_stack_limits calls) is
ti_limits.features having BLK_FEAT_ZONED enabled. Why wouldn't the same
side-effect happen for BLK_FEAT_ATOMIC_WRITES_STACKED (speaks to
blk_stack_limits being different/wrong for atomic writes support)?
Just feels not quite right... but I could be wrong, please see if
there is any "there" there ;)
Thanks,
Mike
> diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
> index 8321f65897f3..bcc6d7b69470 100644
> --- a/include/linux/device-mapper.h
> +++ b/include/linux/device-mapper.h
> @@ -299,6 +299,9 @@ struct target_type {
> #define dm_target_supports_mixed_zoned_model(type) (false)
> #endif
>
> +#define DM_TARGET_ATOMIC_WRITES 0x00000400
> +#define dm_target_supports_atomic_writes(type) ((type)->features & DM_TARGET_ATOMIC_WRITES)
> +
> struct dm_target {
> struct dm_table *table;
> struct target_type *type;
> --
> 2.31.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-06 17:26 ` [PATCH RFC 0/5] device mapper atomic write support Mike Snitzer
@ 2025-01-06 18:14 ` John Garry
2025-01-07 17:13 ` Mikulas Patocka
0 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2025-01-06 18:14 UTC (permalink / raw)
To: Mike Snitzer
Cc: axboe, agk, hch, mpatocka, martin.petersen, linux-block, dm-devel,
linux-kernel
On 06/01/2025 17:26, Mike Snitzer wrote:
> On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
>> This series introduces initial device mapper atomic write support.
>>
>> Since we already support stacking atomic writes limits, it's quite
>> straightforward to support.
>>
>> Only dm-linear is supported for now, but other personalities could
>> be supported.
>>
>> Patch #1 is a proper fix, but the rest of the series is RFC - this is
>> because I have not fully tested and we are close to the end of this
>> development cycle.
> In general, looks reasonable. But I would prefer to see atomic write
> support added to dm-striped as well. Not that I have some need, but
> because it will help verify the correctness of the general stacking
> code changes (in both block and DM core).
That should be fine. We already have md raid0 support working (for
atomic writes), so I would expect much of the required support is
already available.
> I wrote and/or fixed a fair
> amount of the non-atomic block limits stacking code over the
> years.. so this is just me trying to inform this effort based on
> limits stacking gotchas we've experienced to this point.
Yeah, understood. And that is why I am on the lookup for points at which
we try to split atomic writes in the submission patch. The only reason
that it should happen is due to the limits being incorrectly calculated.
>
> Looks like adding dm-striped support would just need to ensure that
> the chunk_size is multiple of atomic write size (so chunk_size >=
> atomic write size).
Right, so the block queue limits code already will throttle the atomic
write max so that chunk_size % atomic write upper limit == 0.
>
> Relative to linear, testing limits stacking in terms of linear should
> also verify that concatenated volumes work.
ok,
Thanks,
John
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 3/5] dm-table: Atomic writes support
2025-01-06 17:49 ` Mike Snitzer
@ 2025-01-06 18:18 ` John Garry
0 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2025-01-06 18:18 UTC (permalink / raw)
To: Mike Snitzer
Cc: axboe, agk, hch, mpatocka, martin.petersen, linux-block, dm-devel,
linux-kernel
On 06/01/2025 17:49, Mike Snitzer wrote:
>> +++ b/drivers/md/dm-table.c
>> @@ -1593,6 +1593,7 @@ int dm_calculate_queue_limits(struct dm_table *t,
>> struct queue_limits ti_limits;
>> unsigned int zone_sectors = 0;
>> bool zoned = false;
>> + bool atomic_writes = true;
>>
>> dm_set_stacking_limits(limits);
>>
>> @@ -1602,8 +1603,12 @@ int dm_calculate_queue_limits(struct dm_table *t,
>>
>> if (!dm_target_passes_integrity(ti->type))
>> t->integrity_supported = false;
>> + if (!dm_target_supports_atomic_writes(ti->type))
>> + atomic_writes = false;
>> }
>>
>> + if (atomic_writes)
>> + limits->features |= BLK_FEAT_ATOMIC_WRITES_STACKED;
>> for (unsigned int i = 0; i < t->num_targets; i++) {
>> struct dm_target *ti = dm_table_get_target(t, i);
>>
>> @@ -1616,6 +1621,13 @@ int dm_calculate_queue_limits(struct dm_table *t,
>> goto combine_limits;
>> }
>>
>> + /*
>> + * dm_set_device_limits() -> blk_stack_limits() considers
>> + * ti_limits as 'top', so set BLK_FEAT_ATOMIC_WRITES_STACKED
>> + * here also.
>> + */
>> + if (atomic_writes)
>> + ti_limits.features |= BLK_FEAT_ATOMIC_WRITES_STACKED;
>> /*
>> * Combine queue limits of all the devices this target uses.
>> */
> You're referring to this code that immediately follows this ^ comment
> which stacks up the limits of a target's potential to have N component
> data devices:
>
> ti->type->iterate_devices(ti, dm_set_device_limits,
> &ti_limits);
>
> Your comment and redundant feature flag setting is feels wrong. I'd
> expect code comparable to what is done for zoned, e.g.:
>
> if (!zoned && (ti_limits.features & BLK_FEAT_ZONED)) {
> /*
> * After stacking all limits, validate all devices
> * in table support this zoned model and zone sectors.
> */
> zoned = (ti_limits.features & BLK_FEAT_ZONED);
> zone_sectors = ti_limits.chunk_sectors;
> }
>
> Meaning, for zoned devices, a side-effect of the
> ti->type->iterate_devices() call (and N blk_stack_limits calls) is
> ti_limits.features having BLK_FEAT_ZONED enabled. Why wouldn't the same
> side-effect happen for BLK_FEAT_ATOMIC_WRITES_STACKED (speaks to
> blk_stack_limits being different/wrong for atomic writes support)?
ok, do I admit that my code did not feel quite right, so I will check
the zoned code as a reference.
>
> Just feels not quite right... but I could be wrong, please see if
> there is any "there" there
Will do.
Cheers,
John
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-06 18:14 ` John Garry
@ 2025-01-07 17:13 ` Mikulas Patocka
2025-01-07 17:58 ` John Garry
2025-01-16 11:28 ` John Garry
0 siblings, 2 replies; 20+ messages in thread
From: Mikulas Patocka @ 2025-01-07 17:13 UTC (permalink / raw)
To: John Garry
Cc: Mike Snitzer, axboe, agk, hch, martin.petersen, linux-block,
dm-devel, linux-kernel
On Mon, 6 Jan 2025, John Garry wrote:
> On 06/01/2025 17:26, Mike Snitzer wrote:
> > On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
> > > This series introduces initial device mapper atomic write support.
> > >
> > > Since we already support stacking atomic writes limits, it's quite
> > > straightforward to support.
> > >
> > > Only dm-linear is supported for now, but other personalities could
> > > be supported.
> > >
> > > Patch #1 is a proper fix, but the rest of the series is RFC - this is
> > > because I have not fully tested and we are close to the end of this
> > > development cycle.
> > In general, looks reasonable. But I would prefer to see atomic write
> > support added to dm-striped as well. Not that I have some need, but
> > because it will help verify the correctness of the general stacking
> > code changes (in both block and DM core).
>
> That should be fine. We already have md raid0 support working (for atomic
> writes), so I would expect much of the required support is already available.
BTW. could it be possible to add dm-mirror support as well? dm-mirror is
used when the user moves the logical volume to another physical volume, so
it would be nice if this worked without resulting in not-supported errors.
dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
the function do_write() -> dm_io()), I looked at the code and it seems
that the support for atomic writes in dm-mirror and dm-io would be
straightforward.
Another possibility would be dm-snapshot support, assuming that the atomic
i/o size <= snapshot chunk size, the support should be easy - i.e. just
pass the flag REQ_ATOMIC through. Perhaps it could be supported for
dm-thin as well.
Mikulas
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-07 17:13 ` Mikulas Patocka
@ 2025-01-07 17:58 ` John Garry
2025-01-07 18:56 ` Mikulas Patocka
2025-01-16 11:28 ` John Garry
1 sibling, 1 reply; 20+ messages in thread
From: John Garry @ 2025-01-07 17:58 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Mike Snitzer, axboe, agk, hch, martin.petersen, linux-block,
dm-devel, linux-kernel
On 07/01/2025 17:13, Mikulas Patocka wrote:
> On Mon, 6 Jan 2025, John Garry wrote:
>
>> On 06/01/2025 17:26, Mike Snitzer wrote:
>>> On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
>>>> This series introduces initial device mapper atomic write support.
>>>>
>>>> Since we already support stacking atomic writes limits, it's quite
>>>> straightforward to support.
>>>>
>>>> Only dm-linear is supported for now, but other personalities could
>>>> be supported.
>>>>
>>>> Patch #1 is a proper fix, but the rest of the series is RFC - this is
>>>> because I have not fully tested and we are close to the end of this
>>>> development cycle.
>>> In general, looks reasonable. But I would prefer to see atomic write
>>> support added to dm-striped as well. Not that I have some need, but
>>> because it will help verify the correctness of the general stacking
>>> code changes (in both block and DM core).
>> That should be fine. We already have md raid0 support working (for atomic
>> writes), so I would expect much of the required support is already available.
> BTW. could it be possible to add dm-mirror support as well? dm-mirror is
> used when the user moves the logical volume to another physical volume, so
> it would be nice if this worked without resulting in not-supported errors.
>
> dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> the function do_write() -> dm_io()), I looked at the code and it seems
> that the support for atomic writes in dm-mirror and dm-io would be
> straightforward.
FWIW, we do support atomic writes for md raid1. The key principle is
that we atomically write to each disk. Obviously we cannot write to
multiple disks atomically. So the copies in each mirror may be
out-of-sync after an unexpected power fail, but that is ok as either
will have all of old or new data, which is what we guarantee.
>
> Another possibility would be dm-snapshot support, assuming that the atomic
> i/o size <= snapshot chunk size, the support should be easy - i.e. just
> pass the flag REQ_ATOMIC through. Perhaps it could be supported for
> dm-thin as well.
Do you think that there will be users for these?
atomic writes provide guarantees for users, and it would be hard to
detect when these guarantees become broken through software bugs. I
would be just concerned that we enable atomic writes for many of these
more complicated personalities, and they are not actively used and break.
Thanks,
John
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-07 17:58 ` John Garry
@ 2025-01-07 18:56 ` Mikulas Patocka
0 siblings, 0 replies; 20+ messages in thread
From: Mikulas Patocka @ 2025-01-07 18:56 UTC (permalink / raw)
To: John Garry, Joe Thornber
Cc: Mike Snitzer, axboe, agk, hch, martin.petersen, linux-block,
dm-devel, linux-kernel
On Tue, 7 Jan 2025, John Garry wrote:
> On 07/01/2025 17:13, Mikulas Patocka wrote:
> > On Mon, 6 Jan 2025, John Garry wrote:
> >
> > BTW. could it be possible to add dm-mirror support as well? dm-mirror is
> > used when the user moves the logical volume to another physical volume, so
> > it would be nice if this worked without resulting in not-supported errors.
> >
> > dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> > the function do_write() -> dm_io()), I looked at the code and it seems
> > that the support for atomic writes in dm-mirror and dm-io would be
> > straightforward.
>
> FWIW, we do support atomic writes for md raid1. The key principle is that we
> atomically write to each disk. Obviously we cannot write to multiple disks
> atomically. So the copies in each mirror may be out-of-sync after an
> unexpected power fail, but that is ok as either will have all of old or new
> data, which is what we guarantee.
Yes - something like that can be implemented for dm-mirror too.
> > Another possibility would be dm-snapshot support, assuming that the atomic
> > i/o size <= snapshot chunk size, the support should be easy - i.e. just
> > pass the flag REQ_ATOMIC through. Perhaps it could be supported for
> > dm-thin as well.
>
> Do you think that there will be users for these?
>
> atomic writes provide guarantees for users, and it would be hard to detect
> when these guarantees become broken through software bugs. I would be just
> concerned that we enable atomic writes for many of these more complicated
> personalities, and they are not actively used and break.
>
> Thanks,
> John
dm-snapshot is not much used, but dm-thin is. I added Joe to the
recipients list, so that he can decide whether dm-thin should support
atomic writes or not.
Mikulas
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-07 17:13 ` Mikulas Patocka
2025-01-07 17:58 ` John Garry
@ 2025-01-16 11:28 ` John Garry
2025-01-16 12:59 ` Mikulas Patocka
1 sibling, 1 reply; 20+ messages in thread
From: John Garry @ 2025-01-16 11:28 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Mike Snitzer, axboe, agk, hch, martin.petersen, linux-block,
dm-devel, linux-kernel
On 07/01/2025 17:13, Mikulas Patocka wrote:
>
>
> On Mon, 6 Jan 2025, John Garry wrote:
>
>> On 06/01/2025 17:26, Mike Snitzer wrote:
>>> On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
>>>> This series introduces initial device mapper atomic write support.
>>>>
>>>> Since we already support stacking atomic writes limits, it's quite
>>>> straightforward to support.
>>>>
>>>> Only dm-linear is supported for now, but other personalities could
>>>> be supported.
>>>>
>>>> Patch #1 is a proper fix, but the rest of the series is RFC - this is
>>>> because I have not fully tested and we are close to the end of this
>>>> development cycle.
>>> In general, looks reasonable. But I would prefer to see atomic write
>>> support added to dm-striped as well. Not that I have some need, but
>>> because it will help verify the correctness of the general stacking
>>> code changes (in both block and DM core).
>>
>> That should be fine. We already have md raid0 support working (for atomic
>> writes), so I would expect much of the required support is already available.
>
> BTW. could it be possible to add dm-mirror support as well? dm-mirror is
> used when the user moves the logical volume to another physical volume, so
> it would be nice if this worked without resulting in not-supported errors.
>
> dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> the function do_write() -> dm_io()), I looked at the code and it seems
> that the support for atomic writes in dm-mirror and dm-io would be
> straightforward.
I tried this out, and it seems to work ok.
However, I need to set DM_TARGET_ATOMIC_WRITES in the
mirror_target.features member, like:
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index 9511dae5b556..913a92c55904 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -1485,6 +1485,7 @@ static struct target_type mirror_target = {
.name = "mirror",
.version = {1, 14, 0},
.module = THIS_MODULE,
+ .features = DM_TARGET_ATOMIC_WRITES,
.ctr = mirror_ctr,
.dtr = mirror_dtr,
.map = mirror_map,
Is this the right thing to do? I ask, as none of the other DM_TARGET*
flags are set already, which makes me suspicious.
Thanks,
John
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-16 11:28 ` John Garry
@ 2025-01-16 12:59 ` Mikulas Patocka
2025-01-16 13:03 ` John Garry
0 siblings, 1 reply; 20+ messages in thread
From: Mikulas Patocka @ 2025-01-16 12:59 UTC (permalink / raw)
To: John Garry
Cc: Mike Snitzer, axboe, agk, hch, martin.petersen, linux-block,
dm-devel, linux-kernel
On Thu, 16 Jan 2025, John Garry wrote:
> On 07/01/2025 17:13, Mikulas Patocka wrote:
> >
> >
> > On Mon, 6 Jan 2025, John Garry wrote:
> >
> > BTW. could it be possible to add dm-mirror support as well? dm-mirror is
> > used when the user moves the logical volume to another physical volume, so
> > it would be nice if this worked without resulting in not-supported errors.
> >
> > dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> > the function do_write() -> dm_io()), I looked at the code and it seems
> > that the support for atomic writes in dm-mirror and dm-io would be
> > straightforward.
>
> I tried this out, and it seems to work ok.
>
> However, I need to set DM_TARGET_ATOMIC_WRITES in the mirror_target.features
> member, like:
>
> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> index 9511dae5b556..913a92c55904 100644
> --- a/drivers/md/dm-raid1.c
> +++ b/drivers/md/dm-raid1.c
> @@ -1485,6 +1485,7 @@ static struct target_type mirror_target = {
> .name = "mirror",
> .version = {1, 14, 0},
> .module = THIS_MODULE,
> + .features = DM_TARGET_ATOMIC_WRITES,
> .ctr = mirror_ctr,
> .dtr = mirror_dtr,
> .map = mirror_map,
>
>
> Is this the right thing to do? I ask, as none of the other DM_TARGET* flags
> are set already, which makes me suspicious.
>
> Thanks,
> John
Yes - that's right. I suggest that you verify that the atomic flag is
really passed through the dm-raid1.c and dm-io.c stack. Add a printk that
tests if REQ_ATOMIC is set to the function do_region in dm-io.c just
before "submit_bio(bio)".
Alternatively, you can use blktrace to test if the REQ_ATOMIC is passed
through correctly.
Mikulas
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-16 12:59 ` Mikulas Patocka
@ 2025-01-16 13:03 ` John Garry
2025-01-16 14:58 ` Mikulas Patocka
0 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2025-01-16 13:03 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Mike Snitzer, axboe, agk, hch, martin.petersen, linux-block,
dm-devel, linux-kernel
On 16/01/2025 12:59, Mikulas Patocka wrote:
>>> dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
>>> the function do_write() -> dm_io()), I looked at the code and it seems
>>> that the support for atomic writes in dm-mirror and dm-io would be
>>> straightforward.
>> I tried this out, and it seems to work ok.
>>
>> However, I need to set DM_TARGET_ATOMIC_WRITES in the mirror_target.features
>> member, like:
>>
>> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
>> index 9511dae5b556..913a92c55904 100644
>> --- a/drivers/md/dm-raid1.c
>> +++ b/drivers/md/dm-raid1.c
>> @@ -1485,6 +1485,7 @@ static struct target_type mirror_target = {
>> .name = "mirror",
>> .version = {1, 14, 0},
>> .module = THIS_MODULE,
>> + .features = DM_TARGET_ATOMIC_WRITES,
>> .ctr = mirror_ctr,
>> .dtr = mirror_dtr,
>> .map = mirror_map,
>>
>>
>> Is this the right thing to do? I ask, as none of the other DM_TARGET* flags
>> are set already, which makes me suspicious.
>>
>> Thanks,
>> John
> Yes - that's right. I suggest that you verify that the atomic flag is
> really passed through the dm-raid1.c and dm-io.c stack. Add a printk that
> tests if REQ_ATOMIC is set to the function do_region in dm-io.c just
> before "submit_bio(bio)".
>
> Alternatively, you can use blktrace to test if the REQ_ATOMIC is passed
> through correctly.
Yes, it is passed ok.
JFYI, I can also verify proper atomic write functionality on /dev/dmX
with fio in verify mode.
Thanks,
John
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-16 13:03 ` John Garry
@ 2025-01-16 14:58 ` Mikulas Patocka
2025-01-16 15:01 ` John Garry
0 siblings, 1 reply; 20+ messages in thread
From: Mikulas Patocka @ 2025-01-16 14:58 UTC (permalink / raw)
To: John Garry
Cc: Mike Snitzer, axboe, agk, hch, martin.petersen, linux-block,
dm-devel, linux-kernel
On Thu, 16 Jan 2025, John Garry wrote:
> On 16/01/2025 12:59, Mikulas Patocka wrote:
> > > > dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> > > > the function do_write() -> dm_io()), I looked at the code and it seems
> > > > that the support for atomic writes in dm-mirror and dm-io would be
> > > > straightforward.
> > > I tried this out, and it seems to work ok.
> > >
> > > However, I need to set DM_TARGET_ATOMIC_WRITES in the
> > > mirror_target.features
> > > member, like:
> > >
> > > diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> > > index 9511dae5b556..913a92c55904 100644
> > > --- a/drivers/md/dm-raid1.c
> > > +++ b/drivers/md/dm-raid1.c
> > > @@ -1485,6 +1485,7 @@ static struct target_type mirror_target = {
> > > .name = "mirror",
> > > .version = {1, 14, 0},
> > > .module = THIS_MODULE,
> > > + .features = DM_TARGET_ATOMIC_WRITES,
> > > .ctr = mirror_ctr,
> > > .dtr = mirror_dtr,
> > > .map = mirror_map,
> > >
> > >
> > > Is this the right thing to do? I ask, as none of the other DM_TARGET*
> > > flags
> > > are set already, which makes me suspicious.
> > >
> > > Thanks,
> > > John
> > Yes - that's right. I suggest that you verify that the atomic flag is
> > really passed through the dm-raid1.c and dm-io.c stack. Add a printk that
> > tests if REQ_ATOMIC is set to the function do_region in dm-io.c just
> > before "submit_bio(bio)".
> >
> > Alternatively, you can use blktrace to test if the REQ_ATOMIC is passed
> > through correctly.
>
> Yes, it is passed ok.
>
> JFYI, I can also verify proper atomic write functionality on /dev/dmX with fio
> in verify mode.
>
> Thanks,
> John
Yes - so please send version 2 of the patches and I will stage them for
this merge window.
Mikulas
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 0/5] device mapper atomic write support
2025-01-16 14:58 ` Mikulas Patocka
@ 2025-01-16 15:01 ` John Garry
0 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2025-01-16 15:01 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Mike Snitzer, axboe, agk, hch, martin.petersen, linux-block,
dm-devel, linux-kernel
On 16/01/2025 14:58, Mikulas Patocka wrote:
> Yes - so please send version 2 of the patches and I will stage them for
> this merge window.
I'll send a v2 today, however I made some block changes based on the
feedback from Mike on the dm-table changes in v1.
So prob quite late for this cycle, considering that it touches many
trees ...
Cheers,
John
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-01-16 15:02 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-06 12:41 [PATCH RFC 0/5] device mapper atomic write support John Garry
2025-01-06 12:41 ` [PATCH 1/5] block: Ensure start sector is aligned for stacking atomic writes John Garry
2025-01-06 15:36 ` Christoph Hellwig
2025-01-06 12:41 ` [PATCH RFC 2/5] block: Change blk_stack_atomic_writes_limits() unit_min check John Garry
2025-01-06 15:37 ` Christoph Hellwig
2025-01-06 12:41 ` [PATCH RFC 3/5] dm-table: Atomic writes support John Garry
2025-01-06 17:49 ` Mike Snitzer
2025-01-06 18:18 ` John Garry
2025-01-06 12:41 ` [PATCH RFC 4/5] dm: Ensure cloned bio is same length for atomic write John Garry
2025-01-06 12:41 ` [PATCH RFC 5/5] dm-linear: Enable atomic writes John Garry
2025-01-06 17:26 ` [PATCH RFC 0/5] device mapper atomic write support Mike Snitzer
2025-01-06 18:14 ` John Garry
2025-01-07 17:13 ` Mikulas Patocka
2025-01-07 17:58 ` John Garry
2025-01-07 18:56 ` Mikulas Patocka
2025-01-16 11:28 ` John Garry
2025-01-16 12:59 ` Mikulas Patocka
2025-01-16 13:03 ` John Garry
2025-01-16 14:58 ` Mikulas Patocka
2025-01-16 15:01 ` John Garry
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).