* [PATCH v2 0/2] Fix DM zone resource limits stacking
@ 2024-06-05 2:24 Damien Le Moal
2024-06-05 2:24 ` [PATCH v2 1/2] block: Imporve checks on zone resource limits Damien Le Moal
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Damien Le Moal @ 2024-06-05 2:24 UTC (permalink / raw)
To: Jens Axboe, linux-block, dm-devel, Mike Snitzer, Mikulas Patocka
Cc: Christoph Hellwig, Benjamin Marzinski
This is V2 of the patch 4/4 of the series "Zone write plugging and DM
zone fixes". This patch fixes DM zone resource limits stacking (max open
zones and max active zones limits). Patch 1 is new and is added to help
catch problems and eventual regressions of the handling of these limits.
Changes from v1:
- Added patch 1
- Modified patch 2 to not cap the limits for a target with the number
of sequential zones mapped but rather to use the device limits as is
when more zones than the limits are mapped and 0 otherwise (no
limits).
Damien Le Moal (2):
block: Imporve checks on zone resource limits
dm: Improve zone resource limits handling
block/blk-settings.c | 4 +
block/blk-zoned.c | 5 +
drivers/md/dm-zone.c | 220 +++++++++++++++++++++++++++++++++++--------
3 files changed, 192 insertions(+), 37 deletions(-)
--
2.45.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/2] block: Imporve checks on zone resource limits
2024-06-05 2:24 [PATCH v2 0/2] Fix DM zone resource limits stacking Damien Le Moal
@ 2024-06-05 2:24 ` Damien Le Moal
2024-06-05 4:17 ` Christoph Hellwig
2024-06-05 2:24 ` [PATCH v2 2/2] dm: Improve zone resource limits handling Damien Le Moal
2024-06-05 2:26 ` [PATCH v2 0/2] Fix DM zone resource limits stacking Damien Le Moal
2 siblings, 1 reply; 8+ messages in thread
From: Damien Le Moal @ 2024-06-05 2:24 UTC (permalink / raw)
To: Jens Axboe, linux-block, dm-devel, Mike Snitzer, Mikulas Patocka
Cc: Christoph Hellwig, Benjamin Marzinski
Make sure that the zone resource limits of a zoned block device are
correct by checking that:
(a) If the device has a max active zones limit, make sure that the max
open zones limit is lower than the max active zones limit.
(b) If the device has a max open zones or a max active zones limit,
check that the limits are lower than the number of sequential zones
of the device.
For (a), a check is added to blk_validate_zoned_limits(). For (b), given
that we need to number of sequential zones of the device, this check is
added to disk_update_zone_resources(). This is safe to do as that
function is executed with the queue frozen and the check executed after
queue_limits_start_update() with the queue limits lock held.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
block/blk-settings.c | 4 ++++
block/blk-zoned.c | 5 +++++
2 files changed, 9 insertions(+)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index effeb9a639bb..a79c57376ef7 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -80,6 +80,10 @@ static int blk_validate_zoned_limits(struct queue_limits *lim)
if (WARN_ON_ONCE(!IS_ENABLED(CONFIG_BLK_DEV_ZONED)))
return -EINVAL;
+ if (lim->max_active_zones &&
+ WARN_ON_ONCE(lim->max_open_zones > lim->max_active_zones))
+ lim->max_open_zones = lim->max_active_zones;
+
if (lim->zone_write_granularity < lim->logical_block_size)
lim->zone_write_granularity = lim->logical_block_size;
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 52abebf56027..2af4d5ca81d2 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -1660,6 +1660,11 @@ static int disk_update_zone_resources(struct gendisk *disk,
lim = queue_limits_start_update(q);
nr_seq_zones = disk->nr_zones - nr_conv_zones;
+ if (WARN_ON_ONCE(lim.max_active_zones > nr_seq_zones))
+ lim.max_active_zones = 0;
+ if (WARN_ON_ONCE(lim.max_open_zones > nr_seq_zones))
+ lim.max_open_zones = 0;
+
pool_size = max(lim.max_open_zones, lim.max_active_zones);
if (!pool_size)
pool_size = min(BLK_ZONE_WPLUG_DEFAULT_POOL_SIZE, nr_seq_zones);
--
2.45.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/2] dm: Improve zone resource limits handling
2024-06-05 2:24 [PATCH v2 0/2] Fix DM zone resource limits stacking Damien Le Moal
2024-06-05 2:24 ` [PATCH v2 1/2] block: Imporve checks on zone resource limits Damien Le Moal
@ 2024-06-05 2:24 ` Damien Le Moal
2024-06-05 4:23 ` Christoph Hellwig
2024-06-05 2:26 ` [PATCH v2 0/2] Fix DM zone resource limits stacking Damien Le Moal
2 siblings, 1 reply; 8+ messages in thread
From: Damien Le Moal @ 2024-06-05 2:24 UTC (permalink / raw)
To: Jens Axboe, linux-block, dm-devel, Mike Snitzer, Mikulas Patocka
Cc: Christoph Hellwig, Benjamin Marzinski
The generic stacking of limits implemented in the block layer cannot
correctly handle stacking of zone resource limits (max open zones and
max active zones) because these limits are for an entire device but the
stacking may be for a portion of that device (e.g. a dm-linear target
that does not cover an entire block device). As a result, when DM
devices are created on top of zoned block devices, the DM device never
has any zone resource limits advertized, which is only correct if all
underlying target devices also have no zone resource limits.
If at least one target device has resource limits, the user may see
either performance issues (if the max open zone limit of the device is
exceeded) or write I/O errors if the max active zone limit of one of
the underlying target devices is exceeded.
While it is very difficult to correctly and reliably stack zone resource
limits in general, cases where targets are not sharing zone resources of
the same device can be dealt with relatively easily. Such situation
happens when a target maps all sequential zones of a zoned block device:
for such mapping, other targets mapping other parts of the same zoned
block device can only contain conventional zones and thus will not
require any zone resource to correctly handle write operations.
For a mapped device constructed with such targets, which includes mapped
devices constructed with targets mapping entire zoned block devices, the
zone resource limits can be reliably determined using the non-zero
minimum of the zone resource limits of all targets.
For mapped devices that include targets partially mapping the set of
sequential write required zones of zoned block devices, instead of
advertizing no zone resource limits, it is also better to set the mapped
device limits to the non-zero minimum of the limits of all targets. In
this case the limits for a target depend on the number of sequential
zones being mapped: if this number of zone is larger than the limits,
then the limits of the device apply and can be used. If on the other
hand the target maps a number of zones smaller than the limits, then no
limits is needed and we can assume that the target has no limits (limits
set to 0).
This commit improves zone resource limits handling as described above
using the function dm_set_zone_resource_limits(). This function is
executed from dm_set_zones_restrictions() and iterates the targets of a
mapped device to evaluate the max open and max active zone limits. This
relies on an internal "stacking" of the limits of the target devices
combined with a direct counting of the number of sequential zones
mapped by the targets.
1) For a target mapping an entire zoned block device, the limits for the
target are set to the limits of the device.
2) For a target partially mapping a zoned block device, the number of
mapped sequential zones is used to determine the limits: if the
target maps more sequential write required zones than the device
limits, then the limits of the device are used as-is. If the number
of mapped sequential zones is lower than the limits, then we assume
that the target has no limits (limits set to 0).
target can reliably inherit the device limits. Otherwise, the target
limits are set to the device limits capped by the number of mapped
sequential zones.
With this evaluation done for each target, the mapped device zone
resource limits are evaluated as the non-zero minimum of the limits of
all the targets.
For configurations resulting in unreliable limits, i.e. a table
containing a target partially mapping a zoned device, a warning message
is issued.
The counting of mapped sequential zones for the target is done using the
new function dm_device_count_zones() which performs a report zones on
the entire block device with the callback dm_device_count_zones_cb().
This count of mapped sequential zones is used to determine if the mapped
device contains only conventional zones. This allows simplifying
dm_set_zones_restrictions() to not do a report zones. For mapped devices
mapping only conventional zones, dm_set_zone_resource_limits() changes
the mapped device to a regular device.
To further cleanup dm_set_zones_restrictions(), the message about the
type of zone append (native or emulated) is moved inside
dm_revalidate_zones().
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
drivers/md/dm-zone.c | 220 +++++++++++++++++++++++++++++++++++--------
1 file changed, 183 insertions(+), 37 deletions(-)
diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
index 5d66d916730e..2d41b748de87 100644
--- a/drivers/md/dm-zone.c
+++ b/drivers/md/dm-zone.c
@@ -145,17 +145,180 @@ bool dm_is_zone_write(struct mapped_device *md, struct bio *bio)
}
}
+struct dm_device_zone_count {
+ sector_t start;
+ sector_t len;
+ unsigned int total_nr_seq_zones;
+ unsigned int target_nr_seq_zones;
+};
+
/*
- * Count conventional zones of a mapped zoned device. If the device
- * only has conventional zones, do not expose it as zoned.
+ * Count the total number of and the number of mapped sequential zones of a
+ * target zoned device.
*/
-static int dm_check_zoned_cb(struct blk_zone *zone, unsigned int idx,
- void *data)
+static int dm_device_count_zones_cb(struct blk_zone *zone,
+ unsigned int idx, void *data)
+{
+ struct dm_device_zone_count *zc = data;
+
+ if (zone->type != BLK_ZONE_TYPE_CONVENTIONAL) {
+ zc->total_nr_seq_zones++;
+ if (zone->start >= zc->start &&
+ zone->start < zc->start + zc->len)
+ zc->target_nr_seq_zones++;
+ }
+
+ return 0;
+}
+
+static int dm_device_count_zones(struct dm_dev *dev,
+ struct dm_device_zone_count *zc)
+{
+ int ret;
+
+ ret = blkdev_report_zones(dev->bdev, 0, UINT_MAX,
+ dm_device_count_zones_cb, zc);
+ if (ret < 0)
+ return ret;
+ if (!ret)
+ return -EIO;
+ return 0;
+}
+
+struct dm_zone_resource_limits {
+ unsigned int mapped_nr_seq_zones;
+ unsigned int max_open_zones;
+ unsigned int max_active_zones;
+ bool reliable_limits;
+};
+
+static int device_get_zone_resource_limits(struct dm_target *ti,
+ struct dm_dev *dev, sector_t start,
+ sector_t len, void *data)
+{
+ struct dm_zone_resource_limits *zlim = data;
+ struct gendisk *disk = dev->bdev->bd_disk;
+ unsigned int max_open_zones, max_active_zones;
+ int ret;
+ struct dm_device_zone_count zc = {
+ .start = start,
+ .len = len,
+ };
+
+ /*
+ * If the target is not the whole device, the device zone resources may
+ * be shared between different targets. Check this by counting the
+ * number of mapped sequential zones: if this number is smaller than the
+ * total number of sequential zones of the target device, then resource
+ * sharing may happen and the zone limits will not be reliable.
+ */
+ ret = dm_device_count_zones(dev, &zc);
+ if (ret) {
+ DMERR("Count device %s zones failed",
+ disk->disk_name);
+ return ret;
+ }
+
+ zlim->mapped_nr_seq_zones += zc.target_nr_seq_zones;
+
+ /*
+ * If the target does not map any sequential zones, then we do not need
+ * any zone resource limits.
+ */
+ if (!zc.target_nr_seq_zones)
+ return 0;
+
+ /*
+ * If the target does not map all sequential zones, the limits
+ * will not be reliable.
+ */
+ if (zc.target_nr_seq_zones < zc.total_nr_seq_zones)
+ zlim->reliable_limits = false;
+
+ /*
+ * If the target maps less sequential zones than the limit values, then
+ * we do not have limits for this target.
+ */
+ max_active_zones = disk->queue->limits.max_active_zones;
+ if (max_active_zones >= zc.target_nr_seq_zones)
+ max_active_zones = 0;
+ zlim->max_active_zones =
+ min_not_zero(max_active_zones, zlim->max_active_zones);
+
+ max_open_zones = disk->queue->limits.max_open_zones;
+ if (max_open_zones >= zc.target_nr_seq_zones)
+ max_open_zones = 0;
+ zlim->max_open_zones =
+ min_not_zero(max_open_zones, zlim->max_open_zones);
+
+ /* We cannot have more open zones than active zones. */
+ zlim->max_open_zones =
+ min(zlim->max_open_zones, zlim->max_active_zones);
+
+ return 0;
+}
+
+static int dm_set_zone_resource_limits(struct mapped_device *md,
+ struct dm_table *t, struct queue_limits *lim)
{
- unsigned int *nr_conv_zones = data;
+ struct gendisk *disk = md->disk;
+ struct dm_zone_resource_limits zlim = {
+ .max_open_zones = UINT_MAX,
+ .max_active_zones = UINT_MAX,
+ .reliable_limits = true,
+ };
- if (zone->type == BLK_ZONE_TYPE_CONVENTIONAL)
- (*nr_conv_zones)++;
+ /* Get the zone resource limits from the targets. */
+ for (unsigned int i = 0; i < t->num_targets; i++) {
+ struct dm_target *ti = dm_table_get_target(t, i);
+
+ if (!ti->type->iterate_devices ||
+ ti->type->iterate_devices(ti,
+ device_get_zone_resource_limits, &zlim)) {
+ DMERR("Could not determine %s zone resource limits",
+ md->disk->disk_name);
+ return -ENODEV;
+ }
+ }
+
+ /*
+ * If we only have conventional zones mapped, expose the mapped device
+ + as a regular device.
+ */
+ if (!zlim.mapped_nr_seq_zones) {
+ lim->max_open_zones = 0;
+ lim->max_active_zones = 0;
+ lim->max_zone_append_sectors = 0;
+ lim->zone_write_granularity = 0;
+ lim->zoned = false;
+ clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+ md->nr_zones = 0;
+ disk->nr_zones = 0;
+ return 0;
+ }
+
+ /*
+ * Warn if the mapped device is partially using zone resources of the
+ * target devices as that leads to unreliable limits, i.e. if another
+ * mapped device uses the same underlying devices, we cannot enforce
+ * zone limits to guarantee that writing will not lead to errors.
+ * Note that we really should return an error for such case but there is
+ * no easy way to find out if another mapped device uses the same
+ * underlying zoned devices.
+ */
+ if (!zlim.reliable_limits)
+ DMWARN("%s zone resource limits may be unreliable",
+ disk->disk_name);
+
+ if (zlim.max_open_zones >= zlim.mapped_nr_seq_zones)
+ lim->max_open_zones = 0;
+ else
+ lim->max_open_zones = zlim.max_open_zones;
+
+ if (zlim.max_active_zones >= zlim.mapped_nr_seq_zones)
+ lim->max_active_zones = 0;
+ else
+ lim->max_active_zones = zlim.max_active_zones;
return 0;
}
@@ -172,8 +335,13 @@ static int dm_revalidate_zones(struct mapped_device *md, struct dm_table *t)
int ret;
/* Revalidate only if something changed. */
- if (!disk->nr_zones || disk->nr_zones != md->nr_zones)
+ if (!disk->nr_zones || disk->nr_zones != md->nr_zones) {
+ DMINFO("%s using %s zone append",
+ md->disk->disk_name,
+ queue_emulates_zone_append(disk->queue) ?
+ "emulated" : "native");
md->nr_zones = 0;
+ }
if (md->nr_zones)
return 0;
@@ -224,8 +392,6 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
struct queue_limits *lim)
{
struct mapped_device *md = t->md;
- struct gendisk *disk = md->disk;
- unsigned int nr_conv_zones = 0;
int ret;
/*
@@ -244,36 +410,16 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
return 0;
/*
- * Count conventional zones to check that the mapped device will indeed
- * have sequential write required zones.
+ * Determine the max open and max active zone limits for the mapped
+ * device. For a mapped device containing only conventional zones, the
+ * mapped device is changed to be a regular block device, so exit early
+ * for such case.
*/
- md->zone_revalidate_map = t;
- ret = dm_blk_report_zones(disk, 0, UINT_MAX,
- dm_check_zoned_cb, &nr_conv_zones);
- md->zone_revalidate_map = NULL;
- if (ret < 0) {
- DMERR("Check zoned failed %d", ret);
+ ret = dm_set_zone_resource_limits(md, t, lim);
+ if (ret)
return ret;
- }
-
- /*
- * If we only have conventional zones, expose the mapped device as
- * a regular device.
- */
- if (nr_conv_zones >= ret) {
- lim->max_open_zones = 0;
- lim->max_active_zones = 0;
- lim->zoned = false;
- clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
- disk->nr_zones = 0;
+ if (!lim->zoned)
return 0;
- }
-
- if (!md->disk->nr_zones) {
- DMINFO("%s using %s zone append",
- md->disk->disk_name,
- queue_emulates_zone_append(q) ? "emulated" : "native");
- }
ret = dm_revalidate_zones(md, t);
if (ret < 0)
--
2.45.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 0/2] Fix DM zone resource limits stacking
2024-06-05 2:24 [PATCH v2 0/2] Fix DM zone resource limits stacking Damien Le Moal
2024-06-05 2:24 ` [PATCH v2 1/2] block: Imporve checks on zone resource limits Damien Le Moal
2024-06-05 2:24 ` [PATCH v2 2/2] dm: Improve zone resource limits handling Damien Le Moal
@ 2024-06-05 2:26 ` Damien Le Moal
2 siblings, 0 replies; 8+ messages in thread
From: Damien Le Moal @ 2024-06-05 2:26 UTC (permalink / raw)
To: Jens Axboe, linux-block, dm-devel, Mike Snitzer, Mikulas Patocka
Cc: Christoph Hellwig, Benjamin Marzinski, Shin'ichiro Kawasaki
On 6/5/24 11:24, Damien Le Moal wrote:
> This is V2 of the patch 4/4 of the series "Zone write plugging and DM
> zone fixes". This patch fixes DM zone resource limits stacking (max open
> zones and max active zones limits). Patch 1 is new and is added to help
> catch problems and eventual regressions of the handling of these limits.
I forgot to mention that I am working with Shin'ichiro to add blktests cases to
extend zbd test group coverage to DM zone resource limits.
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] block: Imporve checks on zone resource limits
2024-06-05 2:24 ` [PATCH v2 1/2] block: Imporve checks on zone resource limits Damien Le Moal
@ 2024-06-05 4:17 ` Christoph Hellwig
2024-06-05 4:52 ` Damien Le Moal
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2024-06-05 4:17 UTC (permalink / raw)
To: Damien Le Moal
Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer, Mikulas Patocka,
Christoph Hellwig, Benjamin Marzinski
improve is misspelled in the subject.
> @@ -80,6 +80,10 @@ static int blk_validate_zoned_limits(struct queue_limits *lim)
> if (WARN_ON_ONCE(!IS_ENABLED(CONFIG_BLK_DEV_ZONED)))
> return -EINVAL;
>
> + if (lim->max_active_zones &&
> + WARN_ON_ONCE(lim->max_open_zones > lim->max_active_zones))
> + lim->max_open_zones = lim->max_active_zones;
Given how active zones are defined this is an error condition, and
should return -EINVAL.
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 52abebf56027..2af4d5ca81d2 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -1660,6 +1660,11 @@ static int disk_update_zone_resources(struct gendisk *disk,
> lim = queue_limits_start_update(q);
>
> nr_seq_zones = disk->nr_zones - nr_conv_zones;
> + if (WARN_ON_ONCE(lim.max_active_zones > nr_seq_zones))
> + lim.max_active_zones = 0;
> + if (WARN_ON_ONCE(lim.max_open_zones > nr_seq_zones))
> + lim.max_open_zones = 0;
Why would you warn about this? Offering an open/active limit larger
than the number of sequential zones is a pretty natural condition
for certain corner cases (e.g. create only a tiny namespace on a ZNS
SSD). This could also use a code comment explaining why the limit
is adjusted.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] dm: Improve zone resource limits handling
2024-06-05 2:24 ` [PATCH v2 2/2] dm: Improve zone resource limits handling Damien Le Moal
@ 2024-06-05 4:23 ` Christoph Hellwig
2024-06-05 4:55 ` Damien Le Moal
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2024-06-05 4:23 UTC (permalink / raw)
To: Damien Le Moal
Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer, Mikulas Patocka,
Christoph Hellwig, Benjamin Marzinski
On Wed, Jun 05, 2024 at 11:24:45AM +0900, Damien Le Moal wrote:
> The generic stacking of limits implemented in the block layer cannot
> correctly handle stacking of zone resource limits (max open zones and
> max active zones)
... for DM. All other limits stacking ends up in a single top device.
> + /*
> + * If the target does not map all sequential zones, the limits
> + * will not be reliable.
> + */
> + if (zc.target_nr_seq_zones < zc.total_nr_seq_zones)
> + zlim->reliable_limits = false;
> +
> + /*
> + * If the target maps less sequential zones than the limit values, then
> + * we do not have limits for this target.
> + */
> + max_active_zones = disk->queue->limits.max_active_zones;
> + if (max_active_zones >= zc.target_nr_seq_zones)
> + max_active_zones = 0;
> + zlim->max_active_zones =
> + min_not_zero(max_active_zones, zlim->max_active_zones);
> +
> + max_open_zones = disk->queue->limits.max_open_zones;
> + if (max_open_zones >= zc.target_nr_seq_zones)
> + max_open_zones = 0;
> + zlim->max_open_zones =
> + min_not_zero(max_open_zones, zlim->max_open_zones);
Given that your previous patch already caps max_open/active_zones to the
number of sequential zones, duplicating this here should not be needed.
> + /* We cannot have more open zones than active zones. */
> + zlim->max_open_zones =
> + min(zlim->max_open_zones, zlim->max_active_zones);
Same question about the capping as in patch 1, and same comment about
the duplication as above.
> + if (zlim.max_open_zones >= zlim.mapped_nr_seq_zones)
> + lim->max_open_zones = 0;
> + else
> + lim->max_open_zones = zlim.max_open_zones;
> +
> + if (zlim.max_active_zones >= zlim.mapped_nr_seq_zones)
> + lim->max_active_zones = 0;
> + else
> + lim->max_active_zones = zlim.max_active_zones;
And once more here.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] block: Imporve checks on zone resource limits
2024-06-05 4:17 ` Christoph Hellwig
@ 2024-06-05 4:52 ` Damien Le Moal
0 siblings, 0 replies; 8+ messages in thread
From: Damien Le Moal @ 2024-06-05 4:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer, Mikulas Patocka,
Benjamin Marzinski
On 6/5/24 13:17, Christoph Hellwig wrote:
> improve is misspelled in the subject.
>
>> @@ -80,6 +80,10 @@ static int blk_validate_zoned_limits(struct queue_limits *lim)
>> if (WARN_ON_ONCE(!IS_ENABLED(CONFIG_BLK_DEV_ZONED)))
>> return -EINVAL;
>>
>> + if (lim->max_active_zones &&
>> + WARN_ON_ONCE(lim->max_open_zones > lim->max_active_zones))
>> + lim->max_open_zones = lim->max_active_zones;
>
> Given how active zones are defined this is an error condition, and
> should return -EINVAL.
>
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 52abebf56027..2af4d5ca81d2 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -1660,6 +1660,11 @@ static int disk_update_zone_resources(struct gendisk *disk,
>> lim = queue_limits_start_update(q);
>>
>> nr_seq_zones = disk->nr_zones - nr_conv_zones;
>> + if (WARN_ON_ONCE(lim.max_active_zones > nr_seq_zones))
>> + lim.max_active_zones = 0;
>> + if (WARN_ON_ONCE(lim.max_open_zones > nr_seq_zones))
>> + lim.max_open_zones = 0;
>
> Why would you warn about this? Offering an open/active limit larger
> than the number of sequential zones is a pretty natural condition
> for certain corner cases (e.g. create only a tiny namespace on a ZNS
> SSD). This could also use a code comment explaining why the limit
> is adjusted.
Right. I actually did not consider that case, which is indeed valid given that
for nvme, the limits are per controller, not namespace (which is a very
unfortunate design flaw...).
I will remove the warn and add a comment.
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] dm: Improve zone resource limits handling
2024-06-05 4:23 ` Christoph Hellwig
@ 2024-06-05 4:55 ` Damien Le Moal
0 siblings, 0 replies; 8+ messages in thread
From: Damien Le Moal @ 2024-06-05 4:55 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer, Mikulas Patocka,
Benjamin Marzinski
On 6/5/24 13:23, Christoph Hellwig wrote:
> On Wed, Jun 05, 2024 at 11:24:45AM +0900, Damien Le Moal wrote:
>> The generic stacking of limits implemented in the block layer cannot
>> correctly handle stacking of zone resource limits (max open zones and
>> max active zones)
>
> ... for DM. All other limits stacking ends up in a single top device.
I know. And I do not see your point here.
>
>> + /*
>> + * If the target does not map all sequential zones, the limits
>> + * will not be reliable.
>> + */
>> + if (zc.target_nr_seq_zones < zc.total_nr_seq_zones)
>> + zlim->reliable_limits = false;
>> +
>> + /*
>> + * If the target maps less sequential zones than the limit values, then
>> + * we do not have limits for this target.
>> + */
>> + max_active_zones = disk->queue->limits.max_active_zones;
>> + if (max_active_zones >= zc.target_nr_seq_zones)
>> + max_active_zones = 0;
>> + zlim->max_active_zones =
>> + min_not_zero(max_active_zones, zlim->max_active_zones);
>> +
>> + max_open_zones = disk->queue->limits.max_open_zones;
>> + if (max_open_zones >= zc.target_nr_seq_zones)
>> + max_open_zones = 0;
>> + zlim->max_open_zones =
>> + min_not_zero(max_open_zones, zlim->max_open_zones);
>
> Given that your previous patch already caps max_open/active_zones to the
> number of sequential zones, duplicating this here should not be needed.
Indeed. Will remove this.
>
>> + /* We cannot have more open zones than active zones. */
>> + zlim->max_open_zones =
>> + min(zlim->max_open_zones, zlim->max_active_zones);
>
> Same question about the capping as in patch 1, and same comment about
> the duplication as above.
Yes, we can remove this one too.
>
>> + if (zlim.max_open_zones >= zlim.mapped_nr_seq_zones)
>> + lim->max_open_zones = 0;
>> + else
>> + lim->max_open_zones = zlim.max_open_zones;
>> +
>> + if (zlim.max_active_zones >= zlim.mapped_nr_seq_zones)
>> + lim->max_active_zones = 0;
>> + else
>> + lim->max_active_zones = zlim.max_active_zones;
>
> And once more here.
Yep.
>
>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-06-05 4:55 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-05 2:24 [PATCH v2 0/2] Fix DM zone resource limits stacking Damien Le Moal
2024-06-05 2:24 ` [PATCH v2 1/2] block: Imporve checks on zone resource limits Damien Le Moal
2024-06-05 4:17 ` Christoph Hellwig
2024-06-05 4:52 ` Damien Le Moal
2024-06-05 2:24 ` [PATCH v2 2/2] dm: Improve zone resource limits handling Damien Le Moal
2024-06-05 4:23 ` Christoph Hellwig
2024-06-05 4:55 ` Damien Le Moal
2024-06-05 2:26 ` [PATCH v2 0/2] Fix DM zone resource limits stacking Damien Le Moal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox