* [PATCH 1/2] xfs: refactor xfs_mount_zones
2026-03-31 15:26 fix handling of too many open zones at mount time v2 Christoph Hellwig
@ 2026-03-31 15:26 ` Christoph Hellwig
2026-03-31 19:37 ` Damien Le Moal
2026-03-31 15:26 ` [PATCH 2/2] xfs: handle too many open zones when mounting Christoph Hellwig
2026-04-07 13:38 ` fix handling of too many open zones at mount time v2 Carlos Maiolino
2 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2026-03-31 15:26 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: Damien Le Moal, Hans Holmberg, linux-xfs
xfs_mount_zones has grown a bit too big and unorganized. Split the
zone reporting loop into a separate helper, hiding the rtg variable
there. Print the mount message last, and also keep the VFS writeback
chunk size last instead of in the middle of the logic to calculate
the free/available blocks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
---
fs/xfs/xfs_zone_alloc.c | 54 ++++++++++++++++++++++++++---------------
1 file changed, 34 insertions(+), 20 deletions(-)
diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index 06e2cb79030e..e9f1d9d08620 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -1230,6 +1230,29 @@ xfs_free_zone_info(
kfree(zi);
}
+static int
+xfs_report_zones(
+ struct xfs_mount *mp,
+ struct xfs_init_zones *iz)
+{
+ struct xfs_rtgroup *rtg = NULL;
+
+ while ((rtg = xfs_rtgroup_next(mp, rtg))) {
+ xfs_rgblock_t write_pointer;
+ int error;
+
+ error = xfs_query_write_pointer(iz, rtg, &write_pointer);
+ if (!error)
+ error = xfs_init_zone(iz, rtg, write_pointer);
+ if (error) {
+ xfs_rtgroup_rele(rtg);
+ return error;
+ }
+ }
+
+ return 0;
+}
+
int
xfs_mount_zones(
struct xfs_mount *mp)
@@ -1238,7 +1261,6 @@ xfs_mount_zones(
.zone_capacity = mp->m_groups[XG_TYPE_RTG].blocks,
.zone_size = xfs_rtgroup_raw_size(mp),
};
- struct xfs_rtgroup *rtg = NULL;
int error;
if (!mp->m_rtdev_targp) {
@@ -1268,9 +1290,13 @@ xfs_mount_zones(
if (!mp->m_zone_info)
return -ENOMEM;
- xfs_info(mp, "%u zones of %u blocks (%u max open zones)",
- mp->m_sb.sb_rgcount, iz.zone_capacity, mp->m_max_open_zones);
- trace_xfs_zones_mount(mp);
+ error = xfs_report_zones(mp, &iz);
+ if (error)
+ goto out_free_zone_info;
+
+ xfs_set_freecounter(mp, XC_FREE_RTAVAILABLE, iz.available);
+ xfs_set_freecounter(mp, XC_FREE_RTEXTENTS,
+ iz.available + iz.reclaimable);
/*
* The writeback code switches between inodes regularly to provide
@@ -1296,22 +1322,6 @@ xfs_mount_zones(
XFS_FSB_TO_B(mp, min(iz.zone_capacity, XFS_MAX_BMBT_EXTLEN)) >>
PAGE_SHIFT;
- while ((rtg = xfs_rtgroup_next(mp, rtg))) {
- xfs_rgblock_t write_pointer;
-
- error = xfs_query_write_pointer(&iz, rtg, &write_pointer);
- if (!error)
- error = xfs_init_zone(&iz, rtg, write_pointer);
- if (error) {
- xfs_rtgroup_rele(rtg);
- goto out_free_zone_info;
- }
- }
-
- xfs_set_freecounter(mp, XC_FREE_RTAVAILABLE, iz.available);
- xfs_set_freecounter(mp, XC_FREE_RTEXTENTS,
- iz.available + iz.reclaimable);
-
/*
* The user may configure GC to free up a percentage of unused blocks.
* By default this is 0. GC will always trigger at the minimum level
@@ -1322,6 +1332,10 @@ xfs_mount_zones(
error = xfs_zone_gc_mount(mp);
if (error)
goto out_free_zone_info;
+
+ xfs_info(mp, "%u zones of %u blocks (%u max open zones)",
+ mp->m_sb.sb_rgcount, iz.zone_capacity, mp->m_max_open_zones);
+ trace_xfs_zones_mount(mp);
return 0;
out_free_zone_info:
--
2.47.3
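
Editorial sketch, not part of the patch: the new helper keeps the early-exit rule of the old loop, where the group reference held by the iterator has to be dropped by hand when the walk stops before the end. The userspace model below illustrates that rule under the assumption that the iterator drops the reference on the previous group and takes one on the next. All names in it (struct group, group_next, report_groups, outstanding_refs) are hypothetical stand-ins and not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted realtime group. */
struct group {
	int refcount;	/* references currently held by iterators */
	int fail;	/* nonzero: per-group setup fails with this error */
};

#define NR_GROUPS 4
static struct group groups[NR_GROUPS];

/*
 * Drop the reference on prev (if any) and return the following group with a
 * new reference held, or NULL once the end of the table is reached.
 */
static struct group *group_next(struct group *prev)
{
	size_t next = prev ? (size_t)(prev - groups) + 1 : 0;

	if (prev) {
		assert(prev->refcount > 0);
		prev->refcount--;
	}
	if (next >= NR_GROUPS)
		return NULL;
	groups[next].refcount++;
	return &groups[next];
}

/* Walk all groups; on failure release the current reference before returning. */
static int report_groups(void)
{
	struct group *g = NULL;

	while ((g = group_next(g))) {
		if (g->fail) {
			int error = g->fail;

			/* Early exit: the iterator will not drop this one. */
			g->refcount--;
			return error;
		}
	}
	return 0;
}

/* Sum of all outstanding references; 0 means nothing leaked. */
static int outstanding_refs(void)
{
	int total = 0;

	for (size_t i = 0; i < NR_GROUPS; i++)
		total += groups[i].refcount;
	return total;
}
```

In this model a caller can check outstanding_refs() after both a successful and a failed walk; it should be zero in either case, which is the property the explicit release in the error branch preserves.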
* Re: [PATCH 1/2] xfs: refactor xfs_mount_zones
2026-03-31 15:26 ` [PATCH 1/2] xfs: refactor xfs_mount_zones Christoph Hellwig
@ 2026-03-31 19:37 ` Damien Le Moal
0 siblings, 0 replies; 6+ messages in thread
From: Damien Le Moal @ 2026-03-31 19:37 UTC (permalink / raw)
To: Christoph Hellwig, Carlos Maiolino; +Cc: Hans Holmberg, linux-xfs
On 4/1/26 00:26, Christoph Hellwig wrote:
> xfs_mount_zones has grown a bit too big and unorganized. Split the
> zone reporting loop into a separate helper, hiding the rtg variable
> there. Print the mount message last, and also keep the VFS writeback
> chunk size last instead of in the middle of the logic to calculate
> the free/available blocks.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
--
Damien Le Moal
Western Digital Research
* [PATCH 2/2] xfs: handle too many open zones when mounting
2026-03-31 15:26 fix handling of too many open zones at mount time v2 Christoph Hellwig
2026-03-31 15:26 ` [PATCH 1/2] xfs: refactor xfs_mount_zones Christoph Hellwig
@ 2026-03-31 15:26 ` Christoph Hellwig
2026-03-31 19:38 ` Damien Le Moal
2026-04-07 13:38 ` fix handling of too many open zones at mount time v2 Carlos Maiolino
2 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2026-03-31 15:26 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: Damien Le Moal, Hans Holmberg, linux-xfs
When running on conventional zones or devices, the zoned allocator does
not have a real write pointer, but instead fakes it up at mount time
based on the last block recorded in the rmap. This can create spurious
"open" zones when the last written blocks in a conventional zone are
invalidated. Add a loop to the mount code to find the conventional zone
with the highest used block in the rmap tree and "finish" it until we
are below the open zones limit.
While we're at it, also error out if there are too many open sequential
zones, which can only happen when the user overrode the max open zones
limit (or with really buggy hardware reducing the limit, but not much
we can do about that).
Fixes: 4e4d52075577 ("xfs: add the zoned space allocator")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
---
fs/xfs/xfs_trace.h | 1 +
fs/xfs/xfs_zone_alloc.c | 75 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 76 insertions(+)
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 60d1e605dfa5..c5ad26a1d7bb 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -461,6 +461,7 @@ DEFINE_EVENT(xfs_zone_alloc_class, name, \
DEFINE_ZONE_ALLOC_EVENT(xfs_zone_record_blocks);
DEFINE_ZONE_ALLOC_EVENT(xfs_zone_skip_blocks);
DEFINE_ZONE_ALLOC_EVENT(xfs_zone_alloc_blocks);
+DEFINE_ZONE_ALLOC_EVENT(xfs_zone_spurious_open);
TRACE_EVENT(xfs_zone_gc_select_victim,
TP_PROTO(struct xfs_rtgroup *rtg, unsigned int bucket),
diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index e9f1d9d08620..5f8b6cbeebfd 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -1253,6 +1253,77 @@ xfs_report_zones(
return 0;
}
+static inline bool
+xfs_zone_is_conv(
+ struct xfs_rtgroup *rtg)
+{
+ return !bdev_zone_is_seq(rtg_mount(rtg)->m_rtdev_targp->bt_bdev,
+ xfs_gbno_to_daddr(rtg_group(rtg), 0));
+}
+
+static struct xfs_open_zone *
+xfs_find_fullest_conventional_open_zone(
+ struct xfs_mount *mp)
+{
+ struct xfs_zone_info *zi = mp->m_zone_info;
+ struct xfs_open_zone *found = NULL, *oz;
+
+ spin_lock(&zi->zi_open_zones_lock);
+ list_for_each_entry(oz, &zi->zi_open_zones, oz_entry) {
+ if (!xfs_zone_is_conv(oz->oz_rtg))
+ continue;
+ if (!found || oz->oz_allocated > found->oz_allocated)
+ found = oz;
+ }
+ spin_unlock(&zi->zi_open_zones_lock);
+
+ return found;
+}
+
+/*
+ * Find the fullest conventional zones and remove them from the open zone pool
+ * until we are at the open zone limit.
+ *
+ * We can end up with spurious "open" zones when the last blocks in a fully
+ * written zone were invalidated, as there is no write pointer for conventional
+ * zones.
+ *
+ * If we are still over the limit when there is no conventional open zone left,
+ * the user overrode the max open zones limit using the max_open_zones mount
+ * option and we should fail.
+ */
+static int
+xfs_finish_spurious_open_zones(
+ struct xfs_mount *mp,
+ struct xfs_init_zones *iz)
+{
+ struct xfs_zone_info *zi = mp->m_zone_info;
+
+ while (zi->zi_nr_open_zones > mp->m_max_open_zones) {
+ struct xfs_open_zone *oz;
+ xfs_filblks_t adjust;
+
+ oz = xfs_find_fullest_conventional_open_zone(mp);
+ if (!oz) {
+ xfs_err(mp,
+"too many open zones for max_open_zones limit (%u/%u)",
+ zi->zi_nr_open_zones, mp->m_max_open_zones);
+ return -EINVAL;
+ }
+
+ xfs_rtgroup_lock(oz->oz_rtg, XFS_RTGLOCK_RMAP);
+ adjust = rtg_blocks(oz->oz_rtg) - oz->oz_written;
+ trace_xfs_zone_spurious_open(oz, oz->oz_written, adjust);
+ oz->oz_written = rtg_blocks(oz->oz_rtg);
+ xfs_open_zone_mark_full(oz);
+ xfs_rtgroup_unlock(oz->oz_rtg, XFS_RTGLOCK_RMAP);
+ iz->available -= adjust;
+ iz->reclaimable += adjust;
+ }
+
+ return 0;
+}
+
int
xfs_mount_zones(
struct xfs_mount *mp)
@@ -1294,6 +1365,10 @@ xfs_mount_zones(
if (error)
goto out_free_zone_info;
+ error = xfs_finish_spurious_open_zones(mp, &iz);
+ if (error)
+ goto out_free_zone_info;
+
xfs_set_freecounter(mp, XC_FREE_RTAVAILABLE, iz.available);
xfs_set_freecounter(mp, XC_FREE_RTEXTENTS,
iz.available + iz.reclaimable);
--
2.47.3
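
Editorial sketch, not part of the patch: the loop added above can be modelled in userspace to show how the accounting moves. While the number of open zones exceeds the limit, the fullest open conventional zone is treated as full, and its unwritten remainder moves from the available counter to the reclaimable counter, so their sum stays constant. The types and names below (struct zone, struct counters, finish_spurious_open) are hypothetical simplifications: there is no locking or tracing, the model compares written blocks where the patch compares oz_allocated, and it returns -1 where the kernel code returns -EINVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a zone; not the kernel structure. */
struct zone {
	bool		open;		/* counted against the open zone limit */
	bool		conventional;	/* no hardware write pointer */
	uint64_t	capacity;	/* usable blocks in the zone */
	uint64_t	written;	/* blocks below the (faked) write pointer */
};

struct counters {
	uint64_t	available;	/* blocks that can still be written */
	uint64_t	reclaimable;	/* blocks that need reclaim before reuse */
};

static size_t count_open(const struct zone *zones, size_t nr)
{
	size_t open = 0;

	for (size_t i = 0; i < nr; i++)
		if (zones[i].open)
			open++;
	return open;
}

/* Return the open conventional zone with the most written blocks, or NULL. */
static struct zone *find_fullest_conv(struct zone *zones, size_t nr)
{
	struct zone *found = NULL;

	for (size_t i = 0; i < nr; i++) {
		struct zone *z = &zones[i];

		if (!z->open || !z->conventional)
			continue;
		if (!found || z->written > found->written)
			found = z;
	}
	return found;
}

/*
 * Finish the fullest conventional open zones until the limit is met.
 * Returns 0 on success, or -1 if only sequential zones remain open while
 * the count is still over the limit.
 */
static int finish_spurious_open(struct zone *zones, size_t nr,
		size_t max_open, struct counters *c)
{
	while (count_open(zones, nr) > max_open) {
		struct zone *z = find_fullest_conv(zones, nr);
		uint64_t adjust;

		if (!z)
			return -1;

		assert(z->written <= z->capacity);
		adjust = z->capacity - z->written;
		z->written = z->capacity;	/* pretend the zone is full */
		z->open = false;
		c->available -= adjust;		/* no longer writable ... */
		c->reclaimable += adjust;	/* ... until reclaimed */
	}
	return 0;
}
```

For example, with two open conventional zones holding 90 and 40 written blocks out of 100, one open sequential zone, and a limit of 2, the model finishes only the zone with 90 written blocks and moves its 10 remaining blocks from available to reclaimable.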
* Re: [PATCH 2/2] xfs: handle too many open zones when mounting
2026-03-31 15:26 ` [PATCH 2/2] xfs: handle too many open zones when mounting Christoph Hellwig
@ 2026-03-31 19:38 ` Damien Le Moal
0 siblings, 0 replies; 6+ messages in thread
From: Damien Le Moal @ 2026-03-31 19:38 UTC (permalink / raw)
To: Christoph Hellwig, Carlos Maiolino; +Cc: Hans Holmberg, linux-xfs
On 4/1/26 00:26, Christoph Hellwig wrote:
> When running on conventional zones or devices, the zoned allocator does
> not have a real write pointer, but instead fakes it up at mount time
> based on the last block recorded in the rmap. This can create spurious
> "open" zones when the last written blocks in a conventional zone are
> invalidated. Add a loop to the mount code to find the conventional zone
> with the highest used block in the rmap tree and "finish" it until we
> are below the open zones limit.
>
> While we're at it, also error out if there are too many open sequential
> zones, which can only happen when the user overrode the max open zones
> limit (or with really buggy hardware reducing the limit, but not much
> we can do about that).
>
> Fixes: 4e4d52075577 ("xfs: add the zoned space allocator")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
--
Damien Le Moal
Western Digital Research
* Re: fix handling of too many open zones at mount time v2
2026-03-31 15:26 fix handling of too many open zones at mount time v2 Christoph Hellwig
2026-03-31 15:26 ` [PATCH 1/2] xfs: refactor xfs_mount_zones Christoph Hellwig
2026-03-31 15:26 ` [PATCH 2/2] xfs: handle too many open zones when mounting Christoph Hellwig
@ 2026-04-07 13:38 ` Carlos Maiolino
2 siblings, 0 replies; 6+ messages in thread
From: Carlos Maiolino @ 2026-04-07 13:38 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Damien Le Moal, Hans Holmberg, linux-xfs
On Tue, 31 Mar 2026 17:26:04 +0200, Christoph Hellwig wrote:
> because there is no actual write pointer when running the zoned allocator
> on conventional devices or zones, we can see spurious extra open zones
> when the last blocks in a written zone have been invalidated. This
> series adds code to handle that case and remove these spurious extra
> zones. It also fixes up the mountinfo code for open zones to be
> more easy to parse, and adds a new sysfs file reporting the currently
> open zones, which makes it easier to use the value in tests.
>
> [...]
Applied to for-next, thanks!
[1/2] xfs: refactor xfs_mount_zones
commit: 02367990bdcbeabb0ffd3e8e227e5f79a04186fc
[2/2] xfs: handle too many open zones when mounting
commit: c6584888864e36d6225a6c16d8c39fd2aa9a45d8
Best regards,
--
Carlos Maiolino <cem@kernel.org>