public inbox for linux-xfs@vger.kernel.org
* refactor zone reporting
@ 2026-01-09 17:20 Christoph Hellwig
  2026-01-09 17:20 ` [PATCH 1/6] xfs: add missing forward declaration in xfs_zones.h Christoph Hellwig
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-09 17:20 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Damien Le Moal, linux-xfs

Hi all,

this series refactors the zone reporting code so that it is more
clearly split between sanity checking the reported hardware zone
information, and the XFS zoned RT information.  This reduces the
code size and removes an iteration over all RTGs at boot time.

It will also allow smarter checking of hardware zones and RTG
allocation information in repair once ported to userspace.

I've also included Damien's xfsprogs patch to make xfs_zones.h
compile better standalone as it touches the same area.

Diffstat:
 libxfs/xfs_rtgroup.h |   15 ++++
 libxfs/xfs_zones.c   |  142 ++++++++++--------------------------------
 libxfs/xfs_zones.h   |    6 +
 xfs_zone_alloc.c     |  171 +++++++++++++++++++++++++++++----------------------
 4 files changed, 152 insertions(+), 182 deletions(-)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/6] xfs: add missing forward declaration in xfs_zones.h
  2026-01-09 17:20 refactor zone reporting Christoph Hellwig
@ 2026-01-09 17:20 ` Christoph Hellwig
  2026-01-10  0:50   ` Darrick J. Wong
  2026-01-09 17:20 ` [PATCH 2/6] xfs: add a xfs_rtgroup_raw_size helper Christoph Hellwig
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-09 17:20 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Damien Le Moal, linux-xfs

From: Damien Le Moal <dlemoal@kernel.org>

Add the missing forward declaration for struct blk_zone in xfs_zones.h.
This avoids compilation errors that depend on the order of header file
inclusion.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_zones.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xfs/libxfs/xfs_zones.h b/fs/xfs/libxfs/xfs_zones.h
index 5fefd132e002..df10a34da71d 100644
--- a/fs/xfs/libxfs/xfs_zones.h
+++ b/fs/xfs/libxfs/xfs_zones.h
@@ -3,6 +3,7 @@
 #define _LIBXFS_ZONES_H
 
 struct xfs_rtgroup;
+struct blk_zone;
 
 /*
  * In order to guarantee forward progress for GC we need to reserve at least
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/6] xfs: add a xfs_rtgroup_raw_size helper
  2026-01-09 17:20 refactor zone reporting Christoph Hellwig
  2026-01-09 17:20 ` [PATCH 1/6] xfs: add missing forward declaration in xfs_zones.h Christoph Hellwig
@ 2026-01-09 17:20 ` Christoph Hellwig
  2026-01-10  1:00   ` Darrick J. Wong
  2026-01-09 17:20 ` [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone Christoph Hellwig
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-09 17:20 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Damien Le Moal, linux-xfs

Add a helper to figure out the on-disk size of a group, accounting for the
XFS_SB_FEAT_INCOMPAT_ZONE_GAPS feature if needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_rtgroup.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h
index 73cace4d25c7..c0b9f9f2c413 100644
--- a/fs/xfs/libxfs/xfs_rtgroup.h
+++ b/fs/xfs/libxfs/xfs_rtgroup.h
@@ -371,4 +371,19 @@ xfs_rtgs_to_rfsbs(
 	return xfs_groups_to_rfsbs(mp, nr_groups, XG_TYPE_RTG);
 }
 
+/*
+ * Return the "raw" size of a group on the hardware device.  This includes the
+ * daddr gaps present for XFS_SB_FEAT_INCOMPAT_ZONE_GAPS file systems.
+ */
+static inline xfs_rgblock_t
+xfs_rtgroup_raw_size(
+	struct xfs_mount	*mp)
+{
+	struct xfs_groups	*g = &mp->m_groups[XG_TYPE_RTG];
+
+	if (g->has_daddr_gaps)
+		return 1U << g->blklog;
+	return g->blocks;
+}
+
 #endif /* __LIBXFS_RTGROUP_H */
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread
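
Illustration (not part of the patch): a minimal userspace sketch of the
arithmetic behind xfs_rtgroup_raw_size.  The struct demo_groups type and the
demo_rtgroup_raw_size name are hypothetical stand-ins for the kernel
structures, and the sketch assumes that blklog is the log2 of the
power-of-two stride that groups occupy in the address space when the zone
gaps feature is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct xfs_groups fields the helper reads. */
struct demo_groups {
	uint32_t	blocks;		/* usable blocks per group */
	uint8_t		blklog;		/* assumed: log2 of the group stride */
	bool		has_daddr_gaps;	/* set for zone gaps file systems */
};

/*
 * Return the size a group occupies on the device, in file system blocks.
 * With gaps the group is padded to a power of two; without gaps the raw
 * size is the usable size.
 */
static uint32_t
demo_rtgroup_raw_size(const struct demo_groups *g)
{
	assert(g->blklog < 32);		/* keep the shift well defined */

	if (g->has_daddr_gaps)
		return UINT32_C(1) << g->blklog;
	return g->blocks;
}
```

For a group with 3000 usable blocks and blklog 12, the sketch reports a raw
size of 4096 blocks when gaps are present and 3000 blocks otherwise.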

* [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-09 17:20 refactor zone reporting Christoph Hellwig
  2026-01-09 17:20 ` [PATCH 1/6] xfs: add missing forward declaration in xfs_zones.h Christoph Hellwig
  2026-01-09 17:20 ` [PATCH 2/6] xfs: add a xfs_rtgroup_raw_size helper Christoph Hellwig
@ 2026-01-09 17:20 ` Christoph Hellwig
  2026-01-10  1:11   ` Darrick J. Wong
  2026-01-12 10:15   ` Damien Le Moal
  2026-01-09 17:20 ` [PATCH 4/6] xfs: split and refactor zone validation Christoph Hellwig
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-09 17:20 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Damien Le Moal, linux-xfs

Move the two methods to query the write pointer out of xfs_init_zone into
the callers, so that xfs_init_zone doesn't have to bother with the
blk_zone structure and instead operates purely at the XFS realtime group
level.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_zone_alloc.c | 66 +++++++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 29 deletions(-)

diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index bbcf21704ea0..013228eab0ac 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -981,43 +981,43 @@ struct xfs_init_zones {
 	uint64_t		reclaimable;
 };
 
+/*
+ * For sequential write required zones, we restart writing at the hardware write
+ * pointer.
+ *
+ * For conventional zones or conventional devices we have query the rmap to
+ * find the highest recorded block and set the write pointer to the block after
+ * that.  In case of a power loss this misses blocks where the data I/O has
+ * completed but not recorded in the rmap yet, and it also rewrites blocks if
+ * the most recently written ones got deleted again before unmount, but this is
+ * the best we can do without hardware support.
+ */
+static xfs_rgblock_t
+xfs_rmap_write_pointer(
+	struct xfs_rtgroup	*rtg)
+{
+	xfs_rgblock_t		highest_rgbno;
+
+	xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
+	highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
+	xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
+
+	if (highest_rgbno == NULLRGBLOCK)
+		return 0;
+	return highest_rgbno + 1;
+}
+
 static int
 xfs_init_zone(
 	struct xfs_init_zones	*iz,
 	struct xfs_rtgroup	*rtg,
-	struct blk_zone		*zone)
+	xfs_rgblock_t		write_pointer)
 {
 	struct xfs_mount	*mp = rtg_mount(rtg);
 	struct xfs_zone_info	*zi = mp->m_zone_info;
 	uint32_t		used = rtg_rmap(rtg)->i_used_blocks;
-	xfs_rgblock_t		write_pointer, highest_rgbno;
 	int			error;
 
-	if (zone && !xfs_zone_validate(zone, rtg, &write_pointer))
-		return -EFSCORRUPTED;
-
-	/*
-	 * For sequential write required zones we retrieved the hardware write
-	 * pointer above.
-	 *
-	 * For conventional zones or conventional devices we don't have that
-	 * luxury.  Instead query the rmap to find the highest recorded block
-	 * and set the write pointer to the block after that.  In case of a
-	 * power loss this misses blocks where the data I/O has completed but
-	 * not recorded in the rmap yet, and it also rewrites blocks if the most
-	 * recently written ones got deleted again before unmount, but this is
-	 * the best we can do without hardware support.
-	 */
-	if (!zone || zone->cond == BLK_ZONE_COND_NOT_WP) {
-		xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
-		highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
-		if (highest_rgbno == NULLRGBLOCK)
-			write_pointer = 0;
-		else
-			write_pointer = highest_rgbno + 1;
-		xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
-	}
-
 	/*
 	 * If there are no used blocks, but the zone is not in empty state yet
 	 * we lost power before the zoned reset.  In that case finish the work
@@ -1066,6 +1066,7 @@ xfs_get_zone_info_cb(
 	struct xfs_mount	*mp = iz->mp;
 	xfs_fsblock_t		zsbno = xfs_daddr_to_rtb(mp, zone->start);
 	xfs_rgnumber_t		rgno;
+	xfs_rgblock_t		write_pointer;
 	struct xfs_rtgroup	*rtg;
 	int			error;
 
@@ -1080,7 +1081,13 @@ xfs_get_zone_info_cb(
 		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
 		return -EFSCORRUPTED;
 	}
-	error = xfs_init_zone(iz, rtg, zone);
+	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {
+		xfs_rtgroup_rele(rtg);
+		return -EFSCORRUPTED;
+	}
+	if (zone->cond == BLK_ZONE_COND_NOT_WP)
+		write_pointer = xfs_rmap_write_pointer(rtg);
+	error = xfs_init_zone(iz, rtg, write_pointer);
 	xfs_rtgroup_rele(rtg);
 	return error;
 }
@@ -1290,7 +1297,8 @@ xfs_mount_zones(
 		struct xfs_rtgroup	*rtg = NULL;
 
 		while ((rtg = xfs_rtgroup_next(mp, rtg))) {
-			error = xfs_init_zone(&iz, rtg, NULL);
+			error = xfs_init_zone(&iz, rtg,
+					xfs_rmap_write_pointer(rtg));
 			if (error) {
 				xfs_rtgroup_rele(rtg);
 				goto out_free_zone_info;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread
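
Illustration (not part of the patch): a small userspace sketch of the rmap
fallback described in the comment above xfs_rmap_write_pointer.  The
struct demo_extent array is a hypothetical stand-in for the rmap records of
one group; the kernel code instead asks xfs_rtrmap_highest_rgbno for the
highest recorded block under the rmap lock.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one recorded mapping inside a group. */
struct demo_extent {
	uint32_t	start;	/* first block of the extent */
	uint32_t	len;	/* number of blocks, assumed non-zero */
};

/*
 * Derive a write pointer for a zone without hardware support: the block
 * after the highest recorded one, or 0 if nothing is recorded.
 */
static uint32_t
demo_rmap_write_pointer(const struct demo_extent *recs, size_t nr)
{
	uint32_t	wp = 0;
	size_t		i;

	for (i = 0; i < nr; i++) {
		uint32_t	end = recs[i].start + recs[i].len;

		if (end > wp)
			wp = end;
	}
	return wp;
}
```

As the comment notes, blocks written but not yet recorded at the time of a
power loss are invisible to this approach, so the result is a lower bound on
what the hardware actually holds.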

* [PATCH 4/6] xfs: split and refactor zone validation
  2026-01-09 17:20 refactor zone reporting Christoph Hellwig
                   ` (2 preceding siblings ...)
  2026-01-09 17:20 ` [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone Christoph Hellwig
@ 2026-01-09 17:20 ` Christoph Hellwig
  2026-01-10  1:44   ` Darrick J. Wong
  2026-01-09 17:20 ` [PATCH 5/6] xfs: check that used blocks are smaller than the write pointer Christoph Hellwig
  2026-01-09 17:20 ` [PATCH 6/6] xfs: use blkdev_get_zone_info to simply zone reporting Christoph Hellwig
  5 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-09 17:20 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Damien Le Moal, linux-xfs

Currently xfs_zone_validate mixes validating the software zone state in
the XFS realtime group with validating the hardware state reported in
struct blk_zone and deriving the write pointer from that.

Move all code that works on the realtime group to xfs_init_zone, and only
keep the hardware state validation in xfs_zone_validate.  This makes the
code clearer, and allows for better reuse in userspace.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_zones.c | 142 ++++++++++----------------------------
 fs/xfs/libxfs/xfs_zones.h |   5 +-
 fs/xfs/xfs_zone_alloc.c   |  26 ++++++-
 3 files changed, 63 insertions(+), 110 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_zones.c b/fs/xfs/libxfs/xfs_zones.c
index b40f71f878b5..8d54452744ae 100644
--- a/fs/xfs/libxfs/xfs_zones.c
+++ b/fs/xfs/libxfs/xfs_zones.c
@@ -14,174 +14,102 @@
 #include "xfs_rtgroup.h"
 #include "xfs_zones.h"
 
-static bool
-xfs_zone_validate_empty(
-	struct blk_zone		*zone,
-	struct xfs_rtgroup	*rtg,
-	xfs_rgblock_t		*write_pointer)
-{
-	struct xfs_mount	*mp = rtg_mount(rtg);
-
-	if (rtg_rmap(rtg)->i_used_blocks > 0) {
-		xfs_warn(mp, "empty zone %u has non-zero used counter (0x%x).",
-			 rtg_rgno(rtg), rtg_rmap(rtg)->i_used_blocks);
-		return false;
-	}
-
-	*write_pointer = 0;
-	return true;
-}
-
-static bool
-xfs_zone_validate_wp(
-	struct blk_zone		*zone,
-	struct xfs_rtgroup	*rtg,
-	xfs_rgblock_t		*write_pointer)
-{
-	struct xfs_mount	*mp = rtg_mount(rtg);
-	xfs_rtblock_t		wp_fsb = xfs_daddr_to_rtb(mp, zone->wp);
-
-	if (rtg_rmap(rtg)->i_used_blocks > rtg->rtg_extents) {
-		xfs_warn(mp, "zone %u has too large used counter (0x%x).",
-			 rtg_rgno(rtg), rtg_rmap(rtg)->i_used_blocks);
-		return false;
-	}
-
-	if (xfs_rtb_to_rgno(mp, wp_fsb) != rtg_rgno(rtg)) {
-		xfs_warn(mp, "zone %u write pointer (0x%llx) outside of zone.",
-			 rtg_rgno(rtg), wp_fsb);
-		return false;
-	}
-
-	*write_pointer = xfs_rtb_to_rgbno(mp, wp_fsb);
-	if (*write_pointer >= rtg->rtg_extents) {
-		xfs_warn(mp, "zone %u has invalid write pointer (0x%x).",
-			 rtg_rgno(rtg), *write_pointer);
-		return false;
-	}
-
-	return true;
-}
-
-static bool
-xfs_zone_validate_full(
-	struct blk_zone		*zone,
-	struct xfs_rtgroup	*rtg,
-	xfs_rgblock_t		*write_pointer)
-{
-	struct xfs_mount	*mp = rtg_mount(rtg);
-
-	if (rtg_rmap(rtg)->i_used_blocks > rtg->rtg_extents) {
-		xfs_warn(mp, "zone %u has too large used counter (0x%x).",
-			 rtg_rgno(rtg), rtg_rmap(rtg)->i_used_blocks);
-		return false;
-	}
-
-	*write_pointer = rtg->rtg_extents;
-	return true;
-}
-
 static bool
 xfs_zone_validate_seq(
+	struct xfs_mount	*mp,
 	struct blk_zone		*zone,
-	struct xfs_rtgroup	*rtg,
+	unsigned int		zone_no,
 	xfs_rgblock_t		*write_pointer)
 {
-	struct xfs_mount	*mp = rtg_mount(rtg);
-
 	switch (zone->cond) {
 	case BLK_ZONE_COND_EMPTY:
-		return xfs_zone_validate_empty(zone, rtg, write_pointer);
+		*write_pointer = 0;
+		return true;
 	case BLK_ZONE_COND_IMP_OPEN:
 	case BLK_ZONE_COND_EXP_OPEN:
 	case BLK_ZONE_COND_CLOSED:
 	case BLK_ZONE_COND_ACTIVE:
-		return xfs_zone_validate_wp(zone, rtg, write_pointer);
+		if (zone->wp < zone->start ||
+		    zone->wp >= zone->start + zone->capacity) {
+			xfs_warn(mp,
+	"zone %u write pointer (%llu) outside of zone.",
+				zone_no, zone->wp);
+			return false;
+		}
+
+		*write_pointer = XFS_BB_TO_FSB(mp, zone->wp - zone->start);
+		return true;
 	case BLK_ZONE_COND_FULL:
-		return xfs_zone_validate_full(zone, rtg, write_pointer);
+		*write_pointer = XFS_BB_TO_FSB(mp, zone->capacity);
+		return true;
 	case BLK_ZONE_COND_NOT_WP:
 	case BLK_ZONE_COND_OFFLINE:
 	case BLK_ZONE_COND_READONLY:
 		xfs_warn(mp, "zone %u has unsupported zone condition 0x%x.",
-			rtg_rgno(rtg), zone->cond);
+			zone_no, zone->cond);
 		return false;
 	default:
 		xfs_warn(mp, "zone %u has unknown zone condition 0x%x.",
-			rtg_rgno(rtg), zone->cond);
+			zone_no, zone->cond);
 		return false;
 	}
 }
 
 static bool
 xfs_zone_validate_conv(
+	struct xfs_mount	*mp,
 	struct blk_zone		*zone,
-	struct xfs_rtgroup	*rtg)
+	unsigned int		zone_no)
 {
-	struct xfs_mount	*mp = rtg_mount(rtg);
-
 	switch (zone->cond) {
 	case BLK_ZONE_COND_NOT_WP:
 		return true;
 	default:
 		xfs_warn(mp,
 "conventional zone %u has unsupported zone condition 0x%x.",
-			 rtg_rgno(rtg), zone->cond);
+			 zone_no, zone->cond);
 		return false;
 	}
 }
 
 bool
 xfs_zone_validate(
+	struct xfs_mount	*mp,
 	struct blk_zone		*zone,
-	struct xfs_rtgroup	*rtg,
+	unsigned int		zone_no,
+	uint32_t		expected_size,
+	uint32_t		expected_capacity,
 	xfs_rgblock_t		*write_pointer)
 {
-	struct xfs_mount	*mp = rtg_mount(rtg);
-	struct xfs_groups	*g = &mp->m_groups[XG_TYPE_RTG];
-	uint32_t		expected_size;
-
 	/*
 	 * Check that the zone capacity matches the rtgroup size stored in the
 	 * superblock.  Note that all zones including the last one must have a
 	 * uniform capacity.
 	 */
-	if (XFS_BB_TO_FSB(mp, zone->capacity) != g->blocks) {
+	if (XFS_BB_TO_FSB(mp, zone->capacity) != expected_capacity) {
 		xfs_warn(mp,
-"zone %u capacity (0x%llx) does not match RT group size (0x%x).",
-			rtg_rgno(rtg), XFS_BB_TO_FSB(mp, zone->capacity),
-			g->blocks);
+"zone %u capacity (%llu) does not match RT group size (%u).",
+			zone_no, XFS_BB_TO_FSB(mp, zone->capacity),
+			expected_capacity);
 		return false;
 	}
 
-	if (g->has_daddr_gaps) {
-		expected_size = 1 << g->blklog;
-	} else {
-		if (zone->len != zone->capacity) {
-			xfs_warn(mp,
-"zone %u has capacity != size ((0x%llx vs 0x%llx)",
-				rtg_rgno(rtg),
-				XFS_BB_TO_FSB(mp, zone->len),
-				XFS_BB_TO_FSB(mp, zone->capacity));
-			return false;
-		}
-		expected_size = g->blocks;
-	}
-
 	if (XFS_BB_TO_FSB(mp, zone->len) != expected_size) {
 		xfs_warn(mp,
-"zone %u length (0x%llx) does match geometry (0x%x).",
-			rtg_rgno(rtg), XFS_BB_TO_FSB(mp, zone->len),
+"zone %u length (%llu) does not match geometry (%u).",
+			zone_no, XFS_BB_TO_FSB(mp, zone->len),
 			expected_size);
+		return false;
 	}
 
 	switch (zone->type) {
 	case BLK_ZONE_TYPE_CONVENTIONAL:
-		return xfs_zone_validate_conv(zone, rtg);
+		return xfs_zone_validate_conv(mp, zone, zone_no);
 	case BLK_ZONE_TYPE_SEQWRITE_REQ:
-		return xfs_zone_validate_seq(zone, rtg, write_pointer);
+		return xfs_zone_validate_seq(mp, zone, zone_no, write_pointer);
 	default:
 		xfs_warn(mp, "zoned %u has unsupported type 0x%x.",
-			rtg_rgno(rtg), zone->type);
+			zone_no, zone->type);
 		return false;
 	}
 }
diff --git a/fs/xfs/libxfs/xfs_zones.h b/fs/xfs/libxfs/xfs_zones.h
index df10a34da71d..b5b3df04a066 100644
--- a/fs/xfs/libxfs/xfs_zones.h
+++ b/fs/xfs/libxfs/xfs_zones.h
@@ -37,7 +37,8 @@ struct blk_zone;
  */
 #define XFS_DEFAULT_MAX_OPEN_ZONES	128
 
-bool xfs_zone_validate(struct blk_zone *zone, struct xfs_rtgroup *rtg,
-	xfs_rgblock_t *write_pointer);
+bool xfs_zone_validate(struct xfs_mount *mp, struct blk_zone *zone,
+	unsigned int zone_no, uint32_t expected_size,
+	uint32_t expected_capacity, xfs_rgblock_t *write_pointer);
 
 #endif /* _LIBXFS_ZONES_H */
diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index 013228eab0ac..d8df219fd3b4 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -977,6 +977,8 @@ xfs_free_open_zones(
 
 struct xfs_init_zones {
 	struct xfs_mount	*mp;
+	uint32_t		zone_size;
+	uint32_t		zone_capacity;
 	uint64_t		available;
 	uint64_t		reclaimable;
 };
@@ -1018,6 +1020,25 @@ xfs_init_zone(
 	uint32_t		used = rtg_rmap(rtg)->i_used_blocks;
 	int			error;
 
+	if (write_pointer > rtg->rtg_extents) {
+		xfs_warn(mp, "zone %u has invalid write pointer (0x%x).",
+			 rtg_rgno(rtg), write_pointer);
+		return -EFSCORRUPTED;
+	}
+
+	if (used > rtg->rtg_extents) {
+		xfs_warn(mp,
+"zone %u has used counter (0x%x) larger than zone capacity (0x%llx).",
+			 rtg_rgno(rtg), used, rtg->rtg_extents);
+		return -EFSCORRUPTED;
+	}
+
+	if (write_pointer == 0 && used != 0) {
+		xfs_warn(mp, "empty zone %u has non-zero used counter (0x%x).",
+			rtg_rgno(rtg), used);
+		return -EFSCORRUPTED;
+	}
+
 	/*
 	 * If there are no used blocks, but the zone is not in empty state yet
 	 * we lost power before the zoned reset.  In that case finish the work
@@ -1081,7 +1102,8 @@ xfs_get_zone_info_cb(
 		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
 		return -EFSCORRUPTED;
 	}
-	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {
+	if (!xfs_zone_validate(mp, zone, idx, iz->zone_size,
+			iz->zone_capacity, &write_pointer)) {
 		xfs_rtgroup_rele(rtg);
 		return -EFSCORRUPTED;
 	}
@@ -1227,6 +1249,8 @@ xfs_mount_zones(
 {
 	struct xfs_init_zones	iz = {
 		.mp		= mp,
+		.zone_capacity	= mp->m_groups[XG_TYPE_RTG].blocks,
+		.zone_size	= xfs_rtgroup_raw_size(mp),
 	};
 	struct xfs_buftarg	*bt = mp->m_rtdev_targp;
 	xfs_extlen_t		zone_blocks = mp->m_groups[XG_TYPE_RTG].blocks;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread
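
Illustration (not part of the patch): a userspace sketch of how the
refactored xfs_zone_validate_seq derives the write pointer from hardware
state alone.  The demo_* types, the reduced set of conditions and the
sectors_per_block parameter are hypothetical simplifications; the kernel
code works on struct blk_zone and converts with XFS_BB_TO_FSB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced, hypothetical set of zone conditions. */
enum demo_cond { DEMO_COND_EMPTY, DEMO_COND_OPEN, DEMO_COND_FULL };

/* Hypothetical stand-in for struct blk_zone; all values in 512-byte sectors. */
struct demo_zone {
	uint64_t	start;
	uint64_t	capacity;
	uint64_t	wp;
	enum demo_cond	cond;
};

/* Compute the zone-relative write pointer in blocks; false if invalid. */
static bool
demo_seq_write_pointer(const struct demo_zone *z, uint32_t sectors_per_block,
		uint32_t *write_pointer)
{
	switch (z->cond) {
	case DEMO_COND_EMPTY:
		*write_pointer = 0;
		return true;
	case DEMO_COND_OPEN:
		/* A partially written zone must point inside its capacity. */
		if (z->wp < z->start || z->wp >= z->start + z->capacity)
			return false;
		*write_pointer =
			(uint32_t)((z->wp - z->start) / sectors_per_block);
		return true;
	case DEMO_COND_FULL:
		*write_pointer = (uint32_t)(z->capacity / sectors_per_block);
		return true;
	}
	return false;
}
```

Nothing in this function looks at the realtime group, which is the point of
the split: the comparison against the used block counter happens later, in
the caller that owns the XFS state.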

* [PATCH 5/6] xfs: check that used blocks are smaller than the write pointer
  2026-01-09 17:20 refactor zone reporting Christoph Hellwig
                   ` (3 preceding siblings ...)
  2026-01-09 17:20 ` [PATCH 4/6] xfs: split and refactor zone validation Christoph Hellwig
@ 2026-01-09 17:20 ` Christoph Hellwig
  2026-01-10  1:25   ` Darrick J. Wong
  2026-01-09 17:20 ` [PATCH 6/6] xfs: use blkdev_get_zone_info to simply zone reporting Christoph Hellwig
  5 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-09 17:20 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Damien Le Moal, linux-xfs

Any used block must have been written, so reject a used block count that
is larger than the write pointer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_zone_alloc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index d8df219fd3b4..00260f70242f 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -1033,6 +1033,13 @@ xfs_init_zone(
 		return -EFSCORRUPTED;
 	}
 
+	if (used > write_pointer) {
+		xfs_warn(mp,
+"zone %u has used counter (0x%x) larger than write pointer (0x%x).",
+			 rtg_rgno(rtg), used, write_pointer);
+		return -EFSCORRUPTED;
+	}
+
 	if (write_pointer == 0 && used != 0) {
 		xfs_warn(mp, "empty zone %u has non-zero used counter (0x%x).",
 			rtg_rgno(rtg), used);
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread
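
Illustration (not part of the patch): a userspace sketch of the ordering
constraints that xfs_init_zone enforces after this change, assuming all
three values are expressed in blocks relative to the start of the zone.
The demo_check_zone_counters name and the -1 return value are hypothetical;
the kernel code warns and returns -EFSCORRUPTED.

```c
#include <stdint.h>

/*
 * Check used <= write_pointer <= capacity for a single zone.
 * Returns 0 if the counters are consistent and -1 otherwise.
 */
static int
demo_check_zone_counters(uint32_t capacity, uint32_t write_pointer,
		uint32_t used)
{
	if (write_pointer > capacity)
		return -1;	/* write pointer beyond the zone */
	if (used > capacity)
		return -1;	/* more used blocks than the zone holds */
	if (used > write_pointer)
		return -1;	/* used blocks that were never written */
	return 0;
}
```

In this sketch an empty zone (write pointer 0) with a non-zero used counter
is already rejected by the last comparison.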

* [PATCH 6/6] xfs: use blkdev_get_zone_info to simply zone reporting
  2026-01-09 17:20 refactor zone reporting Christoph Hellwig
                   ` (4 preceding siblings ...)
  2026-01-09 17:20 ` [PATCH 5/6] xfs: check that used blocks are smaller than the write pointer Christoph Hellwig
@ 2026-01-09 17:20 ` Christoph Hellwig
  2026-01-10  1:28   ` Darrick J. Wong
  2026-01-13 10:33   ` Damien Le Moal
  5 siblings, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-09 17:20 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Damien Le Moal, linux-xfs

Unwind the callback-based programming model by querying the cached
zone information using blkdev_get_zone_info.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_zone_alloc.c | 104 +++++++++++++++++-----------------------
 1 file changed, 45 insertions(+), 59 deletions(-)

diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index 00260f70242f..2849be19369e 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -976,7 +976,6 @@ xfs_free_open_zones(
 }
 
 struct xfs_init_zones {
-	struct xfs_mount	*mp;
 	uint32_t		zone_size;
 	uint32_t		zone_capacity;
 	uint64_t		available;
@@ -1009,6 +1008,39 @@ xfs_rmap_write_pointer(
 	return highest_rgbno + 1;
 }
 
+static int
+xfs_query_write_pointer(
+	struct xfs_init_zones	*iz,
+	struct xfs_rtgroup	*rtg,
+	xfs_rgblock_t		*write_pointer)
+{
+	struct xfs_mount	*mp = rtg_mount(rtg);
+	struct block_device	*bdev = mp->m_rtdev_targp->bt_bdev;
+	sector_t		start = xfs_gbno_to_daddr(&rtg->rtg_group, 0);
+	struct blk_zone		zone = {
+		.cond	= BLK_ZONE_COND_NOT_WP,
+	};
+	int			error;
+
+	if (bdev_is_zoned(bdev)) {
+		error = blkdev_get_zone_info(bdev, start, &zone);
+		if (error)
+			return error;
+		if (zone.start != start) {
+			xfs_warn(mp, "mismatched zone start: 0x%llx/0x%llx.",
+				zone.start, start);
+			return -EFSCORRUPTED;
+		}
+		if (!xfs_zone_validate(mp, &zone, rtg_rgno(rtg), iz->zone_size,
+				iz->zone_capacity, write_pointer))
+			return -EFSCORRUPTED;
+	}
+
+	if (zone.cond == BLK_ZONE_COND_NOT_WP)
+		*write_pointer = xfs_rmap_write_pointer(rtg);
+	return 0;
+}
+
 static int
 xfs_init_zone(
 	struct xfs_init_zones	*iz,
@@ -1084,43 +1116,6 @@ xfs_init_zone(
 	return 0;
 }
 
-static int
-xfs_get_zone_info_cb(
-	struct blk_zone		*zone,
-	unsigned int		idx,
-	void			*data)
-{
-	struct xfs_init_zones	*iz = data;
-	struct xfs_mount	*mp = iz->mp;
-	xfs_fsblock_t		zsbno = xfs_daddr_to_rtb(mp, zone->start);
-	xfs_rgnumber_t		rgno;
-	xfs_rgblock_t		write_pointer;
-	struct xfs_rtgroup	*rtg;
-	int			error;
-
-	if (xfs_rtb_to_rgbno(mp, zsbno) != 0) {
-		xfs_warn(mp, "mismatched zone start 0x%llx.", zsbno);
-		return -EFSCORRUPTED;
-	}
-
-	rgno = xfs_rtb_to_rgno(mp, zsbno);
-	rtg = xfs_rtgroup_grab(mp, rgno);
-	if (!rtg) {
-		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
-		return -EFSCORRUPTED;
-	}
-	if (!xfs_zone_validate(mp, zone, idx, iz->zone_size,
-			iz->zone_capacity, &write_pointer)) {
-		xfs_rtgroup_rele(rtg);
-		return -EFSCORRUPTED;
-	}
-	if (zone->cond == BLK_ZONE_COND_NOT_WP)
-		write_pointer = xfs_rmap_write_pointer(rtg);
-	error = xfs_init_zone(iz, rtg, write_pointer);
-	xfs_rtgroup_rele(rtg);
-	return error;
-}
-
 /*
  * Calculate the max open zone limit based on the of number of backing zones
  * available.
@@ -1255,15 +1250,13 @@ xfs_mount_zones(
 	struct xfs_mount	*mp)
 {
 	struct xfs_init_zones	iz = {
-		.mp		= mp,
 		.zone_capacity	= mp->m_groups[XG_TYPE_RTG].blocks,
 		.zone_size	= xfs_rtgroup_raw_size(mp),
 	};
-	struct xfs_buftarg	*bt = mp->m_rtdev_targp;
-	xfs_extlen_t		zone_blocks = mp->m_groups[XG_TYPE_RTG].blocks;
+	struct xfs_rtgroup	*rtg = NULL;
 	int			error;
 
-	if (!bt) {
+	if (!mp->m_rtdev_targp) {
 		xfs_notice(mp, "RT device missing.");
 		return -EINVAL;
 	}
@@ -1291,7 +1284,7 @@ xfs_mount_zones(
 		return -ENOMEM;
 
 	xfs_info(mp, "%u zones of %u blocks (%u max open zones)",
-		 mp->m_sb.sb_rgcount, zone_blocks, mp->m_max_open_zones);
+		 mp->m_sb.sb_rgcount, iz.zone_capacity, mp->m_max_open_zones);
 	trace_xfs_zones_mount(mp);
 
 	/*
@@ -1315,25 +1308,18 @@ xfs_mount_zones(
 	 * or beneficial.
 	 */
 	mp->m_super->s_min_writeback_pages =
-		XFS_FSB_TO_B(mp, min(zone_blocks, XFS_MAX_BMBT_EXTLEN)) >>
+		XFS_FSB_TO_B(mp, min(iz.zone_capacity, XFS_MAX_BMBT_EXTLEN)) >>
 			PAGE_SHIFT;
 
-	if (bdev_is_zoned(bt->bt_bdev)) {
-		error = blkdev_report_zones_cached(bt->bt_bdev,
-				XFS_FSB_TO_BB(mp, mp->m_sb.sb_rtstart),
-				mp->m_sb.sb_rgcount, xfs_get_zone_info_cb, &iz);
-		if (error < 0)
+	while ((rtg = xfs_rtgroup_next(mp, rtg))) {
+		xfs_rgblock_t		write_pointer;
+
+		error = xfs_query_write_pointer(&iz, rtg, &write_pointer);
+		if (!error)
+			error = xfs_init_zone(&iz, rtg, write_pointer);
+		if (error) {
+			xfs_rtgroup_rele(rtg);
 			goto out_free_zone_info;
-	} else {
-		struct xfs_rtgroup	*rtg = NULL;
-
-		while ((rtg = xfs_rtgroup_next(mp, rtg))) {
-			error = xfs_init_zone(&iz, rtg,
-					xfs_rmap_write_pointer(rtg));
-			if (error) {
-				xfs_rtgroup_rele(rtg);
-				goto out_free_zone_info;
-			}
 		}
 	}
 
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/6] xfs: add missing forward declaration in xfs_zones.h
  2026-01-09 17:20 ` [PATCH 1/6] xfs: add missing forward declaration in xfs_zones.h Christoph Hellwig
@ 2026-01-10  0:50   ` Darrick J. Wong
  0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2026-01-10  0:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Damien Le Moal, linux-xfs

On Fri, Jan 09, 2026 at 06:20:46PM +0100, Christoph Hellwig wrote:
> From: Damien Le Moal <dlemoal@kernel.org>
> 
> Add the missing forward declaration for struct blk_zone in xfs_zones.h.
> This avoids compilation errors that depend on the order of header file
> inclusion.
> 
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

LGTM
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/xfs/libxfs/xfs_zones.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_zones.h b/fs/xfs/libxfs/xfs_zones.h
> index 5fefd132e002..df10a34da71d 100644
> --- a/fs/xfs/libxfs/xfs_zones.h
> +++ b/fs/xfs/libxfs/xfs_zones.h
> @@ -3,6 +3,7 @@
>  #define _LIBXFS_ZONES_H
>  
>  struct xfs_rtgroup;
> +struct blk_zone;
>  
>  /*
>   * In order to guarantee forward progress for GC we need to reserve at least
> -- 
> 2.47.3
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] xfs: add a xfs_rtgroup_raw_size helper
  2026-01-09 17:20 ` [PATCH 2/6] xfs: add a xfs_rtgroup_raw_size helper Christoph Hellwig
@ 2026-01-10  1:00   ` Darrick J. Wong
  0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2026-01-10  1:00 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Damien Le Moal, linux-xfs

On Fri, Jan 09, 2026 at 06:20:47PM +0100, Christoph Hellwig wrote:
> Add a helper to figure out the on-disk size of a group, accounting for the
> XFS_SB_FEAT_INCOMPAT_ZONE_GAPS feature if needed.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good to me,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/xfs/libxfs/xfs_rtgroup.h | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h
> index 73cace4d25c7..c0b9f9f2c413 100644
> --- a/fs/xfs/libxfs/xfs_rtgroup.h
> +++ b/fs/xfs/libxfs/xfs_rtgroup.h
> @@ -371,4 +371,19 @@ xfs_rtgs_to_rfsbs(
>  	return xfs_groups_to_rfsbs(mp, nr_groups, XG_TYPE_RTG);
>  }
>  
> +/*
> + * Return the "raw" size of a group on the hardware device.  This includes the
> + * daddr gaps present for XFS_SB_FEAT_INCOMPAT_ZONE_GAPS file systems.
> + */
> +static inline xfs_rgblock_t
> +xfs_rtgroup_raw_size(
> +	struct xfs_mount	*mp)
> +{
> +	struct xfs_groups	*g = &mp->m_groups[XG_TYPE_RTG];
> +
> +	if (g->has_daddr_gaps)
> +		return 1U << g->blklog;
> +	return g->blocks;
> +}
> +
>  #endif /* __LIBXFS_RTGROUP_H */
> -- 
> 2.47.3
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-09 17:20 ` [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone Christoph Hellwig
@ 2026-01-10  1:11   ` Darrick J. Wong
  2026-01-12 10:15   ` Damien Le Moal
  1 sibling, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2026-01-10  1:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Damien Le Moal, linux-xfs

On Fri, Jan 09, 2026 at 06:20:48PM +0100, Christoph Hellwig wrote:
> Move the two methods to query the write pointer out of xfs_init_zone into
> the callers, so that xfs_init_zone doesn't have to bother with the
> blk_zone structure and instead operates purely at the XFS realtime group
> level.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Ok, so this change is decoupling zone initialization (aka
xfs_init_zone) from struct blk_zone so that now zone initialization
itself doesn't have to know how to call the stuff in linux/blkzoned.h.

That's a nice restructuring, so
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D


> ---
>  fs/xfs/xfs_zone_alloc.c | 66 +++++++++++++++++++++++------------------
>  1 file changed, 37 insertions(+), 29 deletions(-)
> 
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index bbcf21704ea0..013228eab0ac 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -981,43 +981,43 @@ struct xfs_init_zones {
>  	uint64_t		reclaimable;
>  };
>  
> +/*
> + * For sequential write required zones, we restart writing at the hardware write
> + * pointer.
> + *
> + * For conventional zones or conventional devices we have query the rmap to
> + * find the highest recorded block and set the write pointer to the block after
> + * that.  In case of a power loss this misses blocks where the data I/O has
> + * completed but not recorded in the rmap yet, and it also rewrites blocks if
> + * the most recently written ones got deleted again before unmount, but this is
> + * the best we can do without hardware support.
> + */
> +static xfs_rgblock_t
> +xfs_rmap_write_pointer(
> +	struct xfs_rtgroup	*rtg)
> +{
> +	xfs_rgblock_t		highest_rgbno;
> +
> +	xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
> +	highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
> +	xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
> +
> +	if (highest_rgbno == NULLRGBLOCK)
> +		return 0;
> +	return highest_rgbno + 1;
> +}
> +
>  static int
>  xfs_init_zone(
>  	struct xfs_init_zones	*iz,
>  	struct xfs_rtgroup	*rtg,
> -	struct blk_zone		*zone)
> +	xfs_rgblock_t		write_pointer)
>  {
>  	struct xfs_mount	*mp = rtg_mount(rtg);
>  	struct xfs_zone_info	*zi = mp->m_zone_info;
>  	uint32_t		used = rtg_rmap(rtg)->i_used_blocks;
> -	xfs_rgblock_t		write_pointer, highest_rgbno;
>  	int			error;
>  
> -	if (zone && !xfs_zone_validate(zone, rtg, &write_pointer))
> -		return -EFSCORRUPTED;
> -
> -	/*
> -	 * For sequential write required zones we retrieved the hardware write
> -	 * pointer above.
> -	 *
> -	 * For conventional zones or conventional devices we don't have that
> -	 * luxury.  Instead query the rmap to find the highest recorded block
> -	 * and set the write pointer to the block after that.  In case of a
> -	 * power loss this misses blocks where the data I/O has completed but
> -	 * not recorded in the rmap yet, and it also rewrites blocks if the most
> -	 * recently written ones got deleted again before unmount, but this is
> -	 * the best we can do without hardware support.
> -	 */
> -	if (!zone || zone->cond == BLK_ZONE_COND_NOT_WP) {
> -		xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
> -		highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
> -		if (highest_rgbno == NULLRGBLOCK)
> -			write_pointer = 0;
> -		else
> -			write_pointer = highest_rgbno + 1;
> -		xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
> -	}
> -
>  	/*
>  	 * If there are no used blocks, but the zone is not in empty state yet
>  	 * we lost power before the zoned reset.  In that case finish the work
> @@ -1066,6 +1066,7 @@ xfs_get_zone_info_cb(
>  	struct xfs_mount	*mp = iz->mp;
>  	xfs_fsblock_t		zsbno = xfs_daddr_to_rtb(mp, zone->start);
>  	xfs_rgnumber_t		rgno;
> +	xfs_rgblock_t		write_pointer;
>  	struct xfs_rtgroup	*rtg;
>  	int			error;
>  
> @@ -1080,7 +1081,13 @@ xfs_get_zone_info_cb(
>  		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
>  		return -EFSCORRUPTED;
>  	}
> -	error = xfs_init_zone(iz, rtg, zone);
> +	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {
> +		xfs_rtgroup_rele(rtg);
> +		return -EFSCORRUPTED;
> +	}
> +	if (zone->cond == BLK_ZONE_COND_NOT_WP)
> +		write_pointer = xfs_rmap_write_pointer(rtg);
> +	error = xfs_init_zone(iz, rtg, write_pointer);
>  	xfs_rtgroup_rele(rtg);
>  	return error;
>  }
> @@ -1290,7 +1297,8 @@ xfs_mount_zones(
>  		struct xfs_rtgroup	*rtg = NULL;
>  
>  		while ((rtg = xfs_rtgroup_next(mp, rtg))) {
> -			error = xfs_init_zone(&iz, rtg, NULL);
> +			error = xfs_init_zone(&iz, rtg,
> +					xfs_rmap_write_pointer(rtg));
>  			if (error) {
>  				xfs_rtgroup_rele(rtg);
>  				goto out_free_zone_info;
> -- 
> 2.47.3
> 
> 

* Re: [PATCH 5/6] xfs: check that used blocks are smaller than the write pointer
  2026-01-09 17:20 ` [PATCH 5/6] xfs: check that used blocks are smaller than the write pointer Christoph Hellwig
@ 2026-01-10  1:25   ` Darrick J. Wong
  0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2026-01-10  1:25 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Damien Le Moal, linux-xfs

On Fri, Jan 09, 2026 at 06:20:50PM +0100, Christoph Hellwig wrote:
> Any used block must have been written, this reject used blocks > write
> pointer.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Oops, we missed that :/
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
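
For anyone skimming the thread, the ordering of the counter checks after
this patch can be sketched as a standalone function.  This is only an
illustration and not the kernel code: check_zone_counters() and the enum
are made-up names, and "capacity" stands in for rtg->rtg_extents.

```c
#include <assert.h>	/* only needed when wiring this into a quick test */
#include <stdint.h>

/* Hypothetical result codes, one per warning in the real code. */
enum zone_check {
	ZONE_OK = 0,
	ZONE_BAD_WRITE_POINTER,
	ZONE_USED_EXCEEDS_CAPACITY,
	ZONE_USED_EXCEEDS_WRITE_POINTER,
};

/*
 * All values are in blocks relative to the start of the zone.
 * The invariant being enforced is: used <= write_pointer <= capacity.
 */
enum zone_check
check_zone_counters(uint32_t capacity, uint32_t write_pointer, uint32_t used)
{
	/* The write pointer may sit at the end of the zone, but not past it. */
	if (write_pointer > capacity)
		return ZONE_BAD_WRITE_POINTER;

	/* A zone cannot hold more live blocks than it has room for. */
	if (used > capacity)
		return ZONE_USED_EXCEEDS_CAPACITY;

	/* Any used block must have been written first. */
	if (used > write_pointer)
		return ZONE_USED_EXCEEDS_WRITE_POINTER;

	return ZONE_OK;
}
```

In this sketch an empty zone (write_pointer == 0) with a non-zero used
counter is already caught by the last check, so it needs no case of its
own here.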

--D

> ---
>  fs/xfs/xfs_zone_alloc.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index d8df219fd3b4..00260f70242f 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -1033,6 +1033,13 @@ xfs_init_zone(
>  		return -EFSCORRUPTED;
>  	}
>  
> +	if (used > write_pointer) {
> +		xfs_warn(mp,
> +"zone %u has used counter (0x%x) larger than write pointer (0x%x).",
> +			 rtg_rgno(rtg), used, write_pointer);
> +		return -EFSCORRUPTED;
> +	}
> +
>  	if (write_pointer == 0 && used != 0) {
>  		xfs_warn(mp, "empty zone %u has non-zero used counter (0x%x).",
>  			rtg_rgno(rtg), used);
> -- 
> 2.47.3
> 
> 

* Re: [PATCH 6/6] xfs: use blkdev_get_zone_info to simply zone reporting
  2026-01-09 17:20 ` [PATCH 6/6] xfs: use blkdev_get_zone_info to simply zone reporting Christoph Hellwig
@ 2026-01-10  1:28   ` Darrick J. Wong
  2026-01-13 10:33   ` Damien Le Moal
  1 sibling, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2026-01-10  1:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Damien Le Moal, linux-xfs

On Fri, Jan 09, 2026 at 06:20:51PM +0100, Christoph Hellwig wrote:
> Unwind the callback based programming model by querying the cached
> zone information using blkdev_get_zone_info.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Ok, so now I see what's going on here -- the libxfs zone code does the
validation, but it's up to the code in fs/xfs/ (or libxfs/init.c in
userspace) to find the zone information.  Let's hope the cached zone
information reduces the noticeable(ish) mount delays on some of my zoned
hardware.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_zone_alloc.c | 104 +++++++++++++++++-----------------------
>  1 file changed, 45 insertions(+), 59 deletions(-)
> 
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index 00260f70242f..2849be19369e 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -976,7 +976,6 @@ xfs_free_open_zones(
>  }
>  
>  struct xfs_init_zones {
> -	struct xfs_mount	*mp;
>  	uint32_t		zone_size;
>  	uint32_t		zone_capacity;
>  	uint64_t		available;
> @@ -1009,6 +1008,39 @@ xfs_rmap_write_pointer(
>  	return highest_rgbno + 1;
>  }
>  
> +static int
> +xfs_query_write_pointer(
> +	struct xfs_init_zones	*iz,
> +	struct xfs_rtgroup	*rtg,
> +	xfs_rgblock_t		*write_pointer)
> +{
> +	struct xfs_mount	*mp = rtg_mount(rtg);
> +	struct block_device	*bdev = mp->m_rtdev_targp->bt_bdev;
> +	sector_t		start = xfs_gbno_to_daddr(&rtg->rtg_group, 0);
> +	struct blk_zone		zone = {
> +		.cond	= BLK_ZONE_COND_NOT_WP,
> +	};
> +	int			error;
> +
> +	if (bdev_is_zoned(bdev)) {
> +		error = blkdev_get_zone_info(bdev, start, &zone);
> +		if (error)
> +			return error;
> +		if (zone.start != start) {
> +			xfs_warn(mp, "mismatched zone start: 0x%llx/0x%llx.",
> +				zone.start, start);
> +			return -EFSCORRUPTED;
> +		}
> +		if (!xfs_zone_validate(mp, &zone, rtg_rgno(rtg), iz->zone_size,
> +				iz->zone_capacity, write_pointer))
> +			return -EFSCORRUPTED;
> +	}
> +
> +	if (zone.cond == BLK_ZONE_COND_NOT_WP)
> +		*write_pointer = xfs_rmap_write_pointer(rtg);
> +	return 0;
> +}
> +
>  static int
>  xfs_init_zone(
>  	struct xfs_init_zones	*iz,
> @@ -1084,43 +1116,6 @@ xfs_init_zone(
>  	return 0;
>  }
>  
> -static int
> -xfs_get_zone_info_cb(
> -	struct blk_zone		*zone,
> -	unsigned int		idx,
> -	void			*data)
> -{
> -	struct xfs_init_zones	*iz = data;
> -	struct xfs_mount	*mp = iz->mp;
> -	xfs_fsblock_t		zsbno = xfs_daddr_to_rtb(mp, zone->start);
> -	xfs_rgnumber_t		rgno;
> -	xfs_rgblock_t		write_pointer;
> -	struct xfs_rtgroup	*rtg;
> -	int			error;
> -
> -	if (xfs_rtb_to_rgbno(mp, zsbno) != 0) {
> -		xfs_warn(mp, "mismatched zone start 0x%llx.", zsbno);
> -		return -EFSCORRUPTED;
> -	}
> -
> -	rgno = xfs_rtb_to_rgno(mp, zsbno);
> -	rtg = xfs_rtgroup_grab(mp, rgno);
> -	if (!rtg) {
> -		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
> -		return -EFSCORRUPTED;
> -	}
> -	if (!xfs_zone_validate(mp, zone, idx, iz->zone_size,
> -			iz->zone_capacity, &write_pointer)) {
> -		xfs_rtgroup_rele(rtg);
> -		return -EFSCORRUPTED;
> -	}
> -	if (zone->cond == BLK_ZONE_COND_NOT_WP)
> -		write_pointer = xfs_rmap_write_pointer(rtg);
> -	error = xfs_init_zone(iz, rtg, write_pointer);
> -	xfs_rtgroup_rele(rtg);
> -	return error;
> -}
> -
>  /*
>   * Calculate the max open zone limit based on the of number of backing zones
>   * available.
> @@ -1255,15 +1250,13 @@ xfs_mount_zones(
>  	struct xfs_mount	*mp)
>  {
>  	struct xfs_init_zones	iz = {
> -		.mp		= mp,
>  		.zone_capacity	= mp->m_groups[XG_TYPE_RTG].blocks,
>  		.zone_size	= xfs_rtgroup_raw_size(mp),
>  	};
> -	struct xfs_buftarg	*bt = mp->m_rtdev_targp;
> -	xfs_extlen_t		zone_blocks = mp->m_groups[XG_TYPE_RTG].blocks;
> +	struct xfs_rtgroup	*rtg = NULL;
>  	int			error;
>  
> -	if (!bt) {
> +	if (!mp->m_rtdev_targp) {
>  		xfs_notice(mp, "RT device missing.");
>  		return -EINVAL;
>  	}
> @@ -1291,7 +1284,7 @@ xfs_mount_zones(
>  		return -ENOMEM;
>  
>  	xfs_info(mp, "%u zones of %u blocks (%u max open zones)",
> -		 mp->m_sb.sb_rgcount, zone_blocks, mp->m_max_open_zones);
> +		 mp->m_sb.sb_rgcount, iz.zone_capacity, mp->m_max_open_zones);
>  	trace_xfs_zones_mount(mp);
>  
>  	/*
> @@ -1315,25 +1308,18 @@ xfs_mount_zones(
>  	 * or beneficial.
>  	 */
>  	mp->m_super->s_min_writeback_pages =
> -		XFS_FSB_TO_B(mp, min(zone_blocks, XFS_MAX_BMBT_EXTLEN)) >>
> +		XFS_FSB_TO_B(mp, min(iz.zone_capacity, XFS_MAX_BMBT_EXTLEN)) >>
>  			PAGE_SHIFT;
>  
> -	if (bdev_is_zoned(bt->bt_bdev)) {
> -		error = blkdev_report_zones_cached(bt->bt_bdev,
> -				XFS_FSB_TO_BB(mp, mp->m_sb.sb_rtstart),
> -				mp->m_sb.sb_rgcount, xfs_get_zone_info_cb, &iz);
> -		if (error < 0)
> +	while ((rtg = xfs_rtgroup_next(mp, rtg))) {
> +		xfs_rgblock_t		write_pointer;
> +
> +		error = xfs_query_write_pointer(&iz, rtg, &write_pointer);
> +		if (!error)
> +			error = xfs_init_zone(&iz, rtg, write_pointer);
> +		if (error) {
> +			xfs_rtgroup_rele(rtg);
>  			goto out_free_zone_info;
> -	} else {
> -		struct xfs_rtgroup	*rtg = NULL;
> -
> -		while ((rtg = xfs_rtgroup_next(mp, rtg))) {
> -			error = xfs_init_zone(&iz, rtg,
> -					xfs_rmap_write_pointer(rtg));
> -			if (error) {
> -				xfs_rtgroup_rele(rtg);
> -				goto out_free_zone_info;
> -			}
>  		}
>  	}
>  
> -- 
> 2.47.3
> 
> 

* Re: [PATCH 4/6] xfs: split and refactor zone validation
  2026-01-09 17:20 ` [PATCH 4/6] xfs: split and refactor zone validation Christoph Hellwig
@ 2026-01-10  1:44   ` Darrick J. Wong
  2026-01-12 10:12     ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2026-01-10  1:44 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Damien Le Moal, linux-xfs

On Fri, Jan 09, 2026 at 06:20:49PM +0100, Christoph Hellwig wrote:
> Currently xfs_zone_validate mixes validating the software zone state in
> the XFS realtime group with validating the hardware state reported in
> struct blk_zone and deriving the write pointer from that.
> 
> Move all code that works on the realtime group to xfs_init_zone, and only
> keep the hardware state validation in xfs_zone_validate.  This makes the
> code more clear, and allows for better reuse in userspace.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Hrmm.  There's a lot going on in this patch.  The code changes here are
a lot of shuffling code around, and I think the end result is that there
are (a) fewer small functions; (b) discovering the write pointer moves
towards xfs_init_zone; and (c) here and elsewhere the validation of that
write pointer shifts towards libxfs...?

If so, then I think I understand what's going on here well enough to say
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D


> ---
>  fs/xfs/libxfs/xfs_zones.c | 142 ++++++++++----------------------------
>  fs/xfs/libxfs/xfs_zones.h |   5 +-
>  fs/xfs/xfs_zone_alloc.c   |  26 ++++++-
>  3 files changed, 63 insertions(+), 110 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_zones.c b/fs/xfs/libxfs/xfs_zones.c
> index b40f71f878b5..8d54452744ae 100644
> --- a/fs/xfs/libxfs/xfs_zones.c
> +++ b/fs/xfs/libxfs/xfs_zones.c
> @@ -14,174 +14,102 @@
>  #include "xfs_rtgroup.h"
>  #include "xfs_zones.h"
>  
> -static bool
> -xfs_zone_validate_empty(
> -	struct blk_zone		*zone,
> -	struct xfs_rtgroup	*rtg,
> -	xfs_rgblock_t		*write_pointer)
> -{
> -	struct xfs_mount	*mp = rtg_mount(rtg);
> -
> -	if (rtg_rmap(rtg)->i_used_blocks > 0) {
> -		xfs_warn(mp, "empty zone %u has non-zero used counter (0x%x).",
> -			 rtg_rgno(rtg), rtg_rmap(rtg)->i_used_blocks);
> -		return false;
> -	}
> -
> -	*write_pointer = 0;
> -	return true;
> -}
> -
> -static bool
> -xfs_zone_validate_wp(
> -	struct blk_zone		*zone,
> -	struct xfs_rtgroup	*rtg,
> -	xfs_rgblock_t		*write_pointer)
> -{
> -	struct xfs_mount	*mp = rtg_mount(rtg);
> -	xfs_rtblock_t		wp_fsb = xfs_daddr_to_rtb(mp, zone->wp);
> -
> -	if (rtg_rmap(rtg)->i_used_blocks > rtg->rtg_extents) {
> -		xfs_warn(mp, "zone %u has too large used counter (0x%x).",
> -			 rtg_rgno(rtg), rtg_rmap(rtg)->i_used_blocks);
> -		return false;
> -	}
> -
> -	if (xfs_rtb_to_rgno(mp, wp_fsb) != rtg_rgno(rtg)) {
> -		xfs_warn(mp, "zone %u write pointer (0x%llx) outside of zone.",
> -			 rtg_rgno(rtg), wp_fsb);
> -		return false;
> -	}
> -
> -	*write_pointer = xfs_rtb_to_rgbno(mp, wp_fsb);
> -	if (*write_pointer >= rtg->rtg_extents) {
> -		xfs_warn(mp, "zone %u has invalid write pointer (0x%x).",
> -			 rtg_rgno(rtg), *write_pointer);
> -		return false;
> -	}
> -
> -	return true;
> -}
> -
> -static bool
> -xfs_zone_validate_full(
> -	struct blk_zone		*zone,
> -	struct xfs_rtgroup	*rtg,
> -	xfs_rgblock_t		*write_pointer)
> -{
> -	struct xfs_mount	*mp = rtg_mount(rtg);
> -
> -	if (rtg_rmap(rtg)->i_used_blocks > rtg->rtg_extents) {
> -		xfs_warn(mp, "zone %u has too large used counter (0x%x).",
> -			 rtg_rgno(rtg), rtg_rmap(rtg)->i_used_blocks);
> -		return false;
> -	}
> -
> -	*write_pointer = rtg->rtg_extents;
> -	return true;
> -}
> -
>  static bool
>  xfs_zone_validate_seq(
> +	struct xfs_mount	*mp,
>  	struct blk_zone		*zone,
> -	struct xfs_rtgroup	*rtg,
> +	unsigned int		zone_no,
>  	xfs_rgblock_t		*write_pointer)
>  {
> -	struct xfs_mount	*mp = rtg_mount(rtg);
> -
>  	switch (zone->cond) {
>  	case BLK_ZONE_COND_EMPTY:
> -		return xfs_zone_validate_empty(zone, rtg, write_pointer);
> +		*write_pointer = 0;
> +		return true;
>  	case BLK_ZONE_COND_IMP_OPEN:
>  	case BLK_ZONE_COND_EXP_OPEN:
>  	case BLK_ZONE_COND_CLOSED:
>  	case BLK_ZONE_COND_ACTIVE:
> -		return xfs_zone_validate_wp(zone, rtg, write_pointer);
> +		if (zone->wp < zone->start ||
> +		    zone->wp >= zone->start + zone->capacity) {
> +			xfs_warn(mp,
> +	"zone %u write pointer (%llu) outside of zone.",
> +				zone_no, zone->wp);
> +			return false;
> +		}
> +
> +		*write_pointer = XFS_BB_TO_FSB(mp, zone->wp - zone->start);
> +		return true;
>  	case BLK_ZONE_COND_FULL:
> -		return xfs_zone_validate_full(zone, rtg, write_pointer);
> +		*write_pointer = XFS_BB_TO_FSB(mp, zone->capacity);
> +		return true;
>  	case BLK_ZONE_COND_NOT_WP:
>  	case BLK_ZONE_COND_OFFLINE:
>  	case BLK_ZONE_COND_READONLY:
>  		xfs_warn(mp, "zone %u has unsupported zone condition 0x%x.",
> -			rtg_rgno(rtg), zone->cond);
> +			zone_no, zone->cond);
>  		return false;
>  	default:
>  		xfs_warn(mp, "zone %u has unknown zone condition 0x%x.",
> -			rtg_rgno(rtg), zone->cond);
> +			zone_no, zone->cond);
>  		return false;
>  	}
>  }
>  
>  static bool
>  xfs_zone_validate_conv(
> +	struct xfs_mount	*mp,
>  	struct blk_zone		*zone,
> -	struct xfs_rtgroup	*rtg)
> +	unsigned int		zone_no)
>  {
> -	struct xfs_mount	*mp = rtg_mount(rtg);
> -
>  	switch (zone->cond) {
>  	case BLK_ZONE_COND_NOT_WP:
>  		return true;
>  	default:
>  		xfs_warn(mp,
>  "conventional zone %u has unsupported zone condition 0x%x.",
> -			 rtg_rgno(rtg), zone->cond);
> +			 zone_no, zone->cond);
>  		return false;
>  	}
>  }
>  
>  bool
>  xfs_zone_validate(
> +	struct xfs_mount	*mp,
>  	struct blk_zone		*zone,
> -	struct xfs_rtgroup	*rtg,
> +	unsigned int		zone_no,
> +	uint32_t		expected_size,
> +	uint32_t		expected_capacity,
>  	xfs_rgblock_t		*write_pointer)
>  {
> -	struct xfs_mount	*mp = rtg_mount(rtg);
> -	struct xfs_groups	*g = &mp->m_groups[XG_TYPE_RTG];
> -	uint32_t		expected_size;
> -
>  	/*
>  	 * Check that the zone capacity matches the rtgroup size stored in the
>  	 * superblock.  Note that all zones including the last one must have a
>  	 * uniform capacity.
>  	 */
> -	if (XFS_BB_TO_FSB(mp, zone->capacity) != g->blocks) {
> +	if (XFS_BB_TO_FSB(mp, zone->capacity) != expected_capacity) {
>  		xfs_warn(mp,
> -"zone %u capacity (0x%llx) does not match RT group size (0x%x).",
> -			rtg_rgno(rtg), XFS_BB_TO_FSB(mp, zone->capacity),
> -			g->blocks);
> +"zone %u capacity (%llu) does not match RT group size (%u).",
> +			zone_no, XFS_BB_TO_FSB(mp, zone->capacity),
> +			expected_capacity);
>  		return false;
>  	}
>  
> -	if (g->has_daddr_gaps) {
> -		expected_size = 1 << g->blklog;
> -	} else {
> -		if (zone->len != zone->capacity) {
> -			xfs_warn(mp,
> -"zone %u has capacity != size ((0x%llx vs 0x%llx)",
> -				rtg_rgno(rtg),
> -				XFS_BB_TO_FSB(mp, zone->len),
> -				XFS_BB_TO_FSB(mp, zone->capacity));
> -			return false;
> -		}
> -		expected_size = g->blocks;
> -	}
> -
>  	if (XFS_BB_TO_FSB(mp, zone->len) != expected_size) {
>  		xfs_warn(mp,
> -"zone %u length (0x%llx) does match geometry (0x%x).",
> -			rtg_rgno(rtg), XFS_BB_TO_FSB(mp, zone->len),
> +"zone %u length (%llu) does not match geometry (%u).",
> +			zone_no, XFS_BB_TO_FSB(mp, zone->len),
>  			expected_size);
> +		return false;
>  	}
>  
>  	switch (zone->type) {
>  	case BLK_ZONE_TYPE_CONVENTIONAL:
> -		return xfs_zone_validate_conv(zone, rtg);
> +		return xfs_zone_validate_conv(mp, zone, zone_no);
>  	case BLK_ZONE_TYPE_SEQWRITE_REQ:
> -		return xfs_zone_validate_seq(zone, rtg, write_pointer);
> +		return xfs_zone_validate_seq(mp, zone, zone_no, write_pointer);
>  	default:
>  		xfs_warn(mp, "zoned %u has unsupported type 0x%x.",
> -			rtg_rgno(rtg), zone->type);
> +			zone_no, zone->type);
>  		return false;
>  	}
>  }
> diff --git a/fs/xfs/libxfs/xfs_zones.h b/fs/xfs/libxfs/xfs_zones.h
> index df10a34da71d..b5b3df04a066 100644
> --- a/fs/xfs/libxfs/xfs_zones.h
> +++ b/fs/xfs/libxfs/xfs_zones.h
> @@ -37,7 +37,8 @@ struct blk_zone;
>   */
>  #define XFS_DEFAULT_MAX_OPEN_ZONES	128
>  
> -bool xfs_zone_validate(struct blk_zone *zone, struct xfs_rtgroup *rtg,
> -	xfs_rgblock_t *write_pointer);
> +bool xfs_zone_validate(struct xfs_mount *mp, struct blk_zone *zone,
> +	unsigned int zone_no, uint32_t expected_size,
> +	uint32_t expected_capacity, xfs_rgblock_t *write_pointer);
>  
>  #endif /* _LIBXFS_ZONES_H */
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index 013228eab0ac..d8df219fd3b4 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -977,6 +977,8 @@ xfs_free_open_zones(
>  
>  struct xfs_init_zones {
>  	struct xfs_mount	*mp;
> +	uint32_t		zone_size;
> +	uint32_t		zone_capacity;
>  	uint64_t		available;
>  	uint64_t		reclaimable;
>  };
> @@ -1018,6 +1020,25 @@ xfs_init_zone(
>  	uint32_t		used = rtg_rmap(rtg)->i_used_blocks;
>  	int			error;
>  
> +	if (write_pointer > rtg->rtg_extents) {
> +		xfs_warn(mp, "zone %u has invalid write pointer (0x%x).",
> +			 rtg_rgno(rtg), write_pointer);
> +		return -EFSCORRUPTED;
> +	}
> +
> +	if (used > rtg->rtg_extents) {
> +		xfs_warn(mp,
> +"zone %u has used counter (0x%x) larger than zone capacity (0x%llx).",
> +			 rtg_rgno(rtg), used, rtg->rtg_extents);
> +		return -EFSCORRUPTED;
> +	}
> +
> +	if (write_pointer == 0 && used != 0) {
> +		xfs_warn(mp, "empty zone %u has non-zero used counter (0x%x).",
> +			rtg_rgno(rtg), used);
> +		return -EFSCORRUPTED;
> +	}
> +
>  	/*
>  	 * If there are no used blocks, but the zone is not in empty state yet
>  	 * we lost power before the zoned reset.  In that case finish the work
> @@ -1081,7 +1102,8 @@ xfs_get_zone_info_cb(
>  		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
>  		return -EFSCORRUPTED;
>  	}
> -	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {
> +	if (!xfs_zone_validate(mp, zone, idx, iz->zone_size,
> +			iz->zone_capacity, &write_pointer)) {
>  		xfs_rtgroup_rele(rtg);
>  		return -EFSCORRUPTED;
>  	}
> @@ -1227,6 +1249,8 @@ xfs_mount_zones(
>  {
>  	struct xfs_init_zones	iz = {
>  		.mp		= mp,
> +		.zone_capacity	= mp->m_groups[XG_TYPE_RTG].blocks,
> +		.zone_size	= xfs_rtgroup_raw_size(mp),
>  	};
>  	struct xfs_buftarg	*bt = mp->m_rtdev_targp;
>  	xfs_extlen_t		zone_blocks = mp->m_groups[XG_TYPE_RTG].blocks;
> -- 
> 2.47.3
> 
> 

* Re: [PATCH 4/6] xfs: split and refactor zone validation
  2026-01-10  1:44   ` Darrick J. Wong
@ 2026-01-12 10:12     ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-12 10:12 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Carlos Maiolino, Damien Le Moal, linux-xfs

On Fri, Jan 09, 2026 at 05:44:13PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 09, 2026 at 06:20:49PM +0100, Christoph Hellwig wrote:
> > Currently xfs_zone_validate mixes validating the software zone state in
> > the XFS realtime group with validating the hardware state reported in
> > struct blk_zone and deriving the write pointer from that.
> > 
> > Move all code that works on the realtime group to xfs_init_zone, and only
> > keep the hardware state validation in xfs_zone_validate.  This makes the
> > code more clear, and allows for better reuse in userspace.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Hrmm.  There's a lot going on in this patch.  The code changes here are
> a lot of shuffling code around, and I think the end result is that there
> are (a) fewer small functions; (b) discovering the write pointer moves
> towards xfs_init_zone; and (c) here and elsewhere the validation of that
> write pointer shifts towards libxfs...?

Yeah.  I initially had this split up a bit more, but that made things
even harder to follow.

* Re: [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-09 17:20 ` [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone Christoph Hellwig
  2026-01-10  1:11   ` Darrick J. Wong
@ 2026-01-12 10:15   ` Damien Le Moal
  2026-01-12 21:50     ` Darrick J. Wong
  2026-01-13  7:47     ` Christoph Hellwig
  1 sibling, 2 replies; 23+ messages in thread
From: Damien Le Moal @ 2026-01-12 10:15 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: linux-xfs

On 1/9/26 18:20, Christoph Hellwig wrote:
> Move the two methods to query the write pointer out of xfs_init_zone into
> the callers, so that xfs_init_zone doesn't have to bother with the
> blk_zone structure and instead operates purely at the XFS realtime group
> level.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_zone_alloc.c | 66 +++++++++++++++++++++++------------------
>  1 file changed, 37 insertions(+), 29 deletions(-)
> 
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index bbcf21704ea0..013228eab0ac 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -981,43 +981,43 @@ struct xfs_init_zones {
>  	uint64_t		reclaimable;
>  };
>  
> +/*
> + * For sequential write required zones, we restart writing at the hardware write
> + * pointer.
> + *
> + * For conventional zones or conventional devices we have query the rmap to
> + * find the highest recorded block and set the write pointer to the block after
> + * that.  In case of a power loss this misses blocks where the data I/O has
> + * completed but not recorded in the rmap yet, and it also rewrites blocks if
> + * the most recently written ones got deleted again before unmount, but this is
> + * the best we can do without hardware support.
> + */

I find this comment and the function name confusing since we are not looking at
a zone write pointer at all. So maybe rename this to something like:

xfs_rmap_get_highest_rgbno()

? Also, I think the comment block should go...


> +static xfs_rgblock_t
> +xfs_rmap_write_pointer(
> +	struct xfs_rtgroup	*rtg)
> +{
> +	xfs_rgblock_t		highest_rgbno;
> +
> +	xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
> +	highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
> +	xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
> +
> +	if (highest_rgbno == NULLRGBLOCK)
> +		return 0;
> +	return highest_rgbno + 1;
> +}

[...]

>  	/*
>  	 * If there are no used blocks, but the zone is not in empty state yet
>  	 * we lost power before the zoned reset.  In that case finish the work
> @@ -1066,6 +1066,7 @@ xfs_get_zone_info_cb(
>  	struct xfs_mount	*mp = iz->mp;
>  	xfs_fsblock_t		zsbno = xfs_daddr_to_rtb(mp, zone->start);
>  	xfs_rgnumber_t		rgno;
> +	xfs_rgblock_t		write_pointer;
>  	struct xfs_rtgroup	*rtg;
>  	int			error;
>  
> @@ -1080,7 +1081,13 @@ xfs_get_zone_info_cb(
>  		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
>  		return -EFSCORRUPTED;
>  	}
> -	error = xfs_init_zone(iz, rtg, zone);

...here.
This code is also hard to follow without a comment indicating that write_pointer
is not set by xfs_zone_validate() for conventional zones. Ideally, we should
move the call to xfs_rmap_write_pointer() into xfs_zone_validate(). That would be
cleaner, no?

> +	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {
> +		xfs_rtgroup_rele(rtg);
> +		return -EFSCORRUPTED;
> +	}
> +	if (zone->cond == BLK_ZONE_COND_NOT_WP)
> +		write_pointer = xfs_rmap_write_pointer(rtg);
> +	error = xfs_init_zone(iz, rtg, write_pointer);
>  	xfs_rtgroup_rele(rtg);
>  	return error;
>  }
> @@ -1290,7 +1297,8 @@ xfs_mount_zones(
>  		struct xfs_rtgroup	*rtg = NULL;
>  
>  		while ((rtg = xfs_rtgroup_next(mp, rtg))) {
> -			error = xfs_init_zone(&iz, rtg, NULL);
> +			error = xfs_init_zone(&iz, rtg,
> +					xfs_rmap_write_pointer(rtg));
>  			if (error) {
>  				xfs_rtgroup_rele(rtg);
>  				goto out_free_zone_info;


-- 
Damien Le Moal
Western Digital Research

* Re: [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-12 10:15   ` Damien Le Moal
@ 2026-01-12 21:50     ` Darrick J. Wong
  2026-01-13  7:47       ` Christoph Hellwig
  2026-01-13  7:47     ` Christoph Hellwig
  1 sibling, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2026-01-12 21:50 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Christoph Hellwig, Carlos Maiolino, linux-xfs

On Mon, Jan 12, 2026 at 11:15:07AM +0100, Damien Le Moal wrote:
> On 1/9/26 18:20, Christoph Hellwig wrote:
> > Move the two methods to query the write pointer out of xfs_init_zone into
> > the callers, so that xfs_init_zone doesn't have to bother with the
> > blk_zone structure and instead operates purely at the XFS realtime group
> > level.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  fs/xfs/xfs_zone_alloc.c | 66 +++++++++++++++++++++++------------------
> >  1 file changed, 37 insertions(+), 29 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> > index bbcf21704ea0..013228eab0ac 100644
> > --- a/fs/xfs/xfs_zone_alloc.c
> > +++ b/fs/xfs/xfs_zone_alloc.c
> > @@ -981,43 +981,43 @@ struct xfs_init_zones {
> >  	uint64_t		reclaimable;
> >  };
> >  
> > +/*
> > + * For sequential write required zones, we restart writing at the hardware write
> > + * pointer.
> > + *
> > + * For conventional zones or conventional devices we have query the rmap to
> > + * find the highest recorded block and set the write pointer to the block after
> > + * that.  In case of a power loss this misses blocks where the data I/O has
> > + * completed but not recorded in the rmap yet, and it also rewrites blocks if
> > + * the most recently written ones got deleted again before unmount, but this is
> > + * the best we can do without hardware support.
> > + */
> 
> I find this comment and the function name confusing since we are not looking at
> a zone write pointer at all. So maybe rename this to something like:
> 
> xfs_rmap_get_highest_rgbno()
> 
> ? Also, I think the comment block should go...
> 
> > +static xfs_rgblock_t
> > +xfs_rmap_write_pointer(
> > +	struct xfs_rtgroup	*rtg)
> > +{
> > +	xfs_rgblock_t		highest_rgbno;
> > +
> > +	xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
> > +	highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
> > +	xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
> > +
> > +	if (highest_rgbno == NULLRGBLOCK)
> > +		return 0;
> > +	return highest_rgbno + 1;
> > +}
> 
> [...]
> 
> >  	/*
> >  	 * If there are no used blocks, but the zone is not in empty state yet
> >  	 * we lost power before the zoned reset.  In that case finish the work
> > @@ -1066,6 +1066,7 @@ xfs_get_zone_info_cb(
> >  	struct xfs_mount	*mp = iz->mp;
> >  	xfs_fsblock_t		zsbno = xfs_daddr_to_rtb(mp, zone->start);
> >  	xfs_rgnumber_t		rgno;
> > +	xfs_rgblock_t		write_pointer;
> >  	struct xfs_rtgroup	*rtg;
> >  	int			error;
> >  
> > @@ -1080,7 +1081,13 @@ xfs_get_zone_info_cb(
> >  		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
> >  		return -EFSCORRUPTED;
> >  	}
> > -	error = xfs_init_zone(iz, rtg, zone);
> 
> ...here.
> This code is also hard to follow without a comment indicating that write_pointer
> is not set by xfs_zone_validate() for conventional zones. Ideally, we should
> move the call to xfs_rmap_write_pointer() in xfs_zone_validate(). That would be
> cleaner, no ?
> 
> > +	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {

I had wondered by the time I got to the end of this series if this
function should be renamed to xfs_validate_hw_zone() or something like
that?

--D

> > +		xfs_rtgroup_rele(rtg);
> > +		return -EFSCORRUPTED;
> > +	}
> > +	if (zone->cond == BLK_ZONE_COND_NOT_WP)
> > +		write_pointer = xfs_rmap_write_pointer(rtg);
> > +	error = xfs_init_zone(iz, rtg, write_pointer);
> >  	xfs_rtgroup_rele(rtg);
> >  	return error;
> >  }
> > @@ -1290,7 +1297,8 @@ xfs_mount_zones(
> >  		struct xfs_rtgroup	*rtg = NULL;
> >  
> >  		while ((rtg = xfs_rtgroup_next(mp, rtg))) {
> > -			error = xfs_init_zone(&iz, rtg, NULL);
> > +			error = xfs_init_zone(&iz, rtg,
> > +					xfs_rmap_write_pointer(rtg));
> >  			if (error) {
> >  				xfs_rtgroup_rele(rtg);
> >  				goto out_free_zone_info;
> 
> 
> -- 
> Damien Le Moal
> Western Digital Research
> 

* Re: [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-12 10:15   ` Damien Le Moal
  2026-01-12 21:50     ` Darrick J. Wong
@ 2026-01-13  7:47     ` Christoph Hellwig
  2026-01-13  9:27       ` Damien Le Moal
  1 sibling, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-13  7:47 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Christoph Hellwig, Carlos Maiolino, linux-xfs

On Mon, Jan 12, 2026 at 11:15:07AM +0100, Damien Le Moal wrote:
> > + * pointer.
> > + *
> > + * For conventional zones or conventional devices we have query the rmap to
> > + * find the highest recorded block and set the write pointer to the block after
> > + * that.  In case of a power loss this misses blocks where the data I/O has
> > + * completed but not recorded in the rmap yet, and it also rewrites blocks if
> > + * the most recently written ones got deleted again before unmount, but this is
> > + * the best we can do without hardware support.
> > + */
> 
> I find this comment and the function name confusing since we are not looking at
> a zone write pointer at all. So maybe rename this to something like:
> 
> xfs_rmap_get_highest_rgbno()

Well, we're still trying to make up a write pointer.  I've renamed
it to include estimate, and in the revised series this goes away
as a separate helper.  But what is confusing about the comment?

> ? Also, I think the comment block should go...

In the updated version this goes away as a separate function and I
think the comment gets into a better place before a function that
queries the hardware or estimated rmap write pointer.
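
To make the estimate concrete, here is a minimal standalone sketch of the
two sources for the write pointer.  The names are invented for
illustration (NO_BLOCK stands in for NULLRGBLOCK), the rmap lookup is
reduced to a plain argument, and locking is left out entirely:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for NULLRGBLOCK: the rmap has no record for this group. */
#define NO_BLOCK	UINT32_MAX

/*
 * Without a hardware write pointer, restart writing at the block after
 * the highest one recorded in the rmap, or at 0 if nothing is recorded.
 */
uint32_t
estimate_write_pointer(uint32_t highest_rgbno)
{
	if (highest_rgbno == NO_BLOCK)
		return 0;
	return highest_rgbno + 1;
}

/*
 * Prefer the hardware write pointer when the zone has one; otherwise
 * fall back to the rmap based estimate.
 */
uint32_t
pick_write_pointer(bool hw_has_wp, uint32_t hw_wp, uint32_t highest_rgbno)
{
	/* A real hardware write pointer is never the "no block" marker. */
	assert(!hw_has_wp || hw_wp != NO_BLOCK);

	if (hw_has_wp)
		return hw_wp;
	return estimate_write_pointer(highest_rgbno);
}
```

The estimate can be too low after a power loss and too high after
deletions, as the comment in the patch says.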

> This code is also hard to follow without a comment indicating that write_pointer
> is not set by xfs_zone_validate() for conventional zones. Ideally, we should
> move the call to xfs_rmap_write_pointer() in xfs_zone_validate(). That would be
> cleaner, no ?

No.  xfs_zone_validate is about to become entirely about the blk_zone
and not XFS internal information.


* Re: [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-12 21:50     ` Darrick J. Wong
@ 2026-01-13  7:47       ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-13  7:47 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Damien Le Moal, Christoph Hellwig, Carlos Maiolino, linux-xfs

On Mon, Jan 12, 2026 at 01:50:08PM -0800, Darrick J. Wong wrote:
> > > +	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {
> 
> I had wondered by the time I got to the end of this series if this
> function should be renamed to xfs_validate_hw_zone() or something like
> that?

I've renamed it to xfs_validate_blk_zone to match the struct name.


* Re: [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-13  7:47     ` Christoph Hellwig
@ 2026-01-13  9:27       ` Damien Le Moal
  0 siblings, 0 replies; 23+ messages in thread
From: Damien Le Moal @ 2026-01-13  9:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, linux-xfs

On 2026/01/13 8:47, Christoph Hellwig wrote:
> On Mon, Jan 12, 2026 at 11:15:07AM +0100, Damien Le Moal wrote:
>>> + * pointer.
>>> + *
>>> + * For conventional zones or conventional devices we have query the rmap to
>>> + * find the highest recorded block and set the write pointer to the block after
>>> + * that.  In case of a power loss this misses blocks where the data I/O has
>>> + * completed but not recorded in the rmap yet, and it also rewrites blocks if
>>> + * the most recently written ones got deleted again before unmount, but this is
>>> + * the best we can do without hardware support.
>>> + */
>>
>> I find this comment and the function name confusing since we are not looking at
>> a zone write pointer at all. So maybe rename this to something like:
>>
>> xfs_rmap_get_highest_rgbno()
> 
> Well, we're still trying to make up a write pointer.  I've renamed
> it to include estimate, and in the revised series this goes away
> as a separate helper.  But what is confusing about the comment?

It talks about zone types but the function code looks only at block groups, not
struct blk_zone.

> 
>> ? Also, I think the comment block should go...
> 
> In the updated version this goes away as a separate function and I
> think the comment gets into a better place before a function that
> queries the hardware or estimated rmap write pointer.
> 
>> This code is also hard to follow without a comment indicating that write_pointer
>> is not set by xfs_zone_validate() for conventional zones. Ideally, we should
>> move the call to xfs_rmap_write_pointer() in xfs_zone_validate(). That would be
>> cleaner, no ?
> 
> No.  xfs_zone_validate is about to become entirely about the blk_zone
> and not XFS internal information.

OK.


-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 6/6] xfs: use blkdev_get_zone_info to simply zone reporting
  2026-01-09 17:20 ` [PATCH 6/6] xfs: use blkdev_get_zone_info to simply zone reporting Christoph Hellwig
  2026-01-10  1:28   ` Darrick J. Wong
@ 2026-01-13 10:33   ` Damien Le Moal
  1 sibling, 0 replies; 23+ messages in thread
From: Damien Le Moal @ 2026-01-13 10:33 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: linux-xfs

On 1/9/26 18:20, Christoph Hellwig wrote:
> Unwind the callback based programming model by querying the cached
> zone information using blkdev_get_zone_info.

In the title: s/simply/simplify


-- 
Damien Le Moal
Western Digital Research


* [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-14  6:53 refactor zone reporting v2 Christoph Hellwig
@ 2026-01-14  6:53 ` Christoph Hellwig
  2026-01-14 10:00   ` Damien Le Moal
  2026-01-16 14:16   ` Carlos Maiolino
  0 siblings, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2026-01-14  6:53 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Damien Le Moal, Darrick J. Wong, linux-xfs

Move the two methods to query the write pointer out of xfs_init_zone into
the callers, so that xfs_init_zone doesn't have to bother with the
blk_zone structure and instead operates purely at the XFS realtime group
level.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_zone_alloc.c | 66 +++++++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 29 deletions(-)

diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index 4ca7769b5adb..87243644d88e 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -981,43 +981,43 @@ struct xfs_init_zones {
 	uint64_t		reclaimable;
 };
 
+/*
+ * For sequential write required zones, we restart writing at the hardware write
+ * pointer returned by xfs_zone_validate().
+ *
+ * For conventional zones or conventional devices we have query the rmap to
+ * find the highest recorded block and set the write pointer to the block after
+ * that.  In case of a power loss this misses blocks where the data I/O has
+ * completed but not recorded in the rmap yet, and it also rewrites blocks if
+ * the most recently written ones got deleted again before unmount, but this is
+ * the best we can do without hardware support.
+ */
+static xfs_rgblock_t
+xfs_rmap_estimate_write_pointer(
+	struct xfs_rtgroup	*rtg)
+{
+	xfs_rgblock_t		highest_rgbno;
+
+	xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
+	highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
+	xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
+
+	if (highest_rgbno == NULLRGBLOCK)
+		return 0;
+	return highest_rgbno + 1;
+}
+
 static int
 xfs_init_zone(
 	struct xfs_init_zones	*iz,
 	struct xfs_rtgroup	*rtg,
-	struct blk_zone		*zone)
+	xfs_rgblock_t		write_pointer)
 {
 	struct xfs_mount	*mp = rtg_mount(rtg);
 	struct xfs_zone_info	*zi = mp->m_zone_info;
 	uint32_t		used = rtg_rmap(rtg)->i_used_blocks;
-	xfs_rgblock_t		write_pointer, highest_rgbno;
 	int			error;
 
-	if (zone && !xfs_zone_validate(zone, rtg, &write_pointer))
-		return -EFSCORRUPTED;
-
-	/*
-	 * For sequential write required zones we retrieved the hardware write
-	 * pointer above.
-	 *
-	 * For conventional zones or conventional devices we don't have that
-	 * luxury.  Instead query the rmap to find the highest recorded block
-	 * and set the write pointer to the block after that.  In case of a
-	 * power loss this misses blocks where the data I/O has completed but
-	 * not recorded in the rmap yet, and it also rewrites blocks if the most
-	 * recently written ones got deleted again before unmount, but this is
-	 * the best we can do without hardware support.
-	 */
-	if (!zone || zone->cond == BLK_ZONE_COND_NOT_WP) {
-		xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
-		highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
-		if (highest_rgbno == NULLRGBLOCK)
-			write_pointer = 0;
-		else
-			write_pointer = highest_rgbno + 1;
-		xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
-	}
-
 	/*
 	 * If there are no used blocks, but the zone is not in empty state yet
 	 * we lost power before the zoned reset.  In that case finish the work
@@ -1066,6 +1066,7 @@ xfs_get_zone_info_cb(
 	struct xfs_mount	*mp = iz->mp;
 	xfs_fsblock_t		zsbno = xfs_daddr_to_rtb(mp, zone->start);
 	xfs_rgnumber_t		rgno;
+	xfs_rgblock_t		write_pointer;
 	struct xfs_rtgroup	*rtg;
 	int			error;
 
@@ -1080,7 +1081,13 @@ xfs_get_zone_info_cb(
 		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
 		return -EFSCORRUPTED;
 	}
-	error = xfs_init_zone(iz, rtg, zone);
+	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {
+		xfs_rtgroup_rele(rtg);
+		return -EFSCORRUPTED;
+	}
+	if (zone->cond == BLK_ZONE_COND_NOT_WP)
+		write_pointer = xfs_rmap_estimate_write_pointer(rtg);
+	error = xfs_init_zone(iz, rtg, write_pointer);
 	xfs_rtgroup_rele(rtg);
 	return error;
 }
@@ -1290,7 +1297,8 @@ xfs_mount_zones(
 		struct xfs_rtgroup	*rtg = NULL;
 
 		while ((rtg = xfs_rtgroup_next(mp, rtg))) {
-			error = xfs_init_zone(&iz, rtg, NULL);
+			error = xfs_init_zone(&iz, rtg,
+					xfs_rmap_estimate_write_pointer(rtg));
 			if (error) {
 				xfs_rtgroup_rele(rtg);
 				goto out_free_zone_info;
-- 
2.47.3
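
The estimate documented in the comment block of this patch can be tried out in isolation with the following minimal sketch. It assumes a simplified in-memory list of extents instead of the real rtrmap btree, takes no locks, and uses hypothetical sketch_* names throughout; SKETCH_NULLRGBLOCK merely plays the role of NULLRGBLOCK as the "nothing recorded" sentinel.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the NULLRGBLOCK sentinel: no block is recorded at all. */
#define SKETCH_NULLRGBLOCK	UINT32_MAX

/* Hypothetical, simplified rmap record describing one extent of blocks. */
struct sketch_rmap_extent {
	uint32_t	start;	/* first block of the extent */
	uint32_t	len;	/* number of blocks, assumed to be non-zero */
};

/* Return the highest recorded block, or SKETCH_NULLRGBLOCK if there is none. */
uint32_t
sketch_highest_rgbno(
	const struct sketch_rmap_extent	*recs,
	size_t				nr)
{
	uint32_t	highest = SKETCH_NULLRGBLOCK;
	size_t		i;

	for (i = 0; i < nr; i++) {
		uint32_t	last = recs[i].start + recs[i].len - 1;

		if (highest == SKETCH_NULLRGBLOCK || last > highest)
			highest = last;
	}
	return highest;
}

/* Estimate the write pointer as the block after the highest recorded one. */
uint32_t
sketch_estimate_write_pointer(
	const struct sketch_rmap_extent	*recs,
	size_t				nr)
{
	uint32_t	highest = sketch_highest_rgbno(recs, nr);

	if (highest == SKETCH_NULLRGBLOCK)
		return 0;
	return highest + 1;
}
```

For records covering blocks 0-7 and 16-19 the estimate is 20.  If the extent at 16-19 had been deleted again before unmount, only 0-7 would be recorded and the estimate would drop to 8, so blocks that were already written once get rewritten.  That is the second limitation the comment describes.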



* Re: [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-14  6:53 ` [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone Christoph Hellwig
@ 2026-01-14 10:00   ` Damien Le Moal
  2026-01-16 14:16   ` Carlos Maiolino
  1 sibling, 0 replies; 23+ messages in thread
From: Damien Le Moal @ 2026-01-14 10:00 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs

On 1/14/26 07:53, Christoph Hellwig wrote:
> Move the two methods to query the write pointer out of xfs_init_zone into
> the callers, so that xfs_init_zone doesn't have to bother with the
> blk_zone structure and instead operates purely at the XFS realtime group
> level.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

One nit below. Otherwise, looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

> ---
>  fs/xfs/xfs_zone_alloc.c | 66 +++++++++++++++++++++++------------------
>  1 file changed, 37 insertions(+), 29 deletions(-)
> 
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index 4ca7769b5adb..87243644d88e 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -981,43 +981,43 @@ struct xfs_init_zones {
>  	uint64_t		reclaimable;
>  };
>  
> +/*
> + * For sequential write required zones, we restart writing at the hardware write
> + * pointer returned by xfs_zone_validate().
> + *
> + * For conventional zones or conventional devices we have query the rmap to

Nit:
s/we have query/we have to query (or "we must")

> + * find the highest recorded block and set the write pointer to the block after
> + * that.  In case of a power loss this misses blocks where the data I/O has
> + * completed but not recorded in the rmap yet, and it also rewrites blocks if
> + * the most recently written ones got deleted again before unmount, but this is
> + * the best we can do without hardware support.
> + */



-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone
  2026-01-14  6:53 ` [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone Christoph Hellwig
  2026-01-14 10:00   ` Damien Le Moal
@ 2026-01-16 14:16   ` Carlos Maiolino
  1 sibling, 0 replies; 23+ messages in thread
From: Carlos Maiolino @ 2026-01-16 14:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Damien Le Moal, Darrick J. Wong, linux-xfs

On Wed, Jan 14, 2026 at 07:53:26AM +0100, Christoph Hellwig wrote:
> Move the two methods to query the write pointer out of xfs_init_zone into
> the callers, so that xfs_init_zone doesn't have to bother with the
> blk_zone structure and instead operates purely at the XFS realtime group
> level.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
>  fs/xfs/xfs_zone_alloc.c | 66 +++++++++++++++++++++++------------------
>  1 file changed, 37 insertions(+), 29 deletions(-)
> 
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index 4ca7769b5adb..87243644d88e 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -981,43 +981,43 @@ struct xfs_init_zones {
>  	uint64_t		reclaimable;
>  };
>  
> +/*
> + * For sequential write required zones, we restart writing at the hardware write
> + * pointer returned by xfs_zone_validate().
> + *
> + * For conventional zones or conventional devices we have query the rmap to
> + * find the highest recorded block and set the write pointer to the block after
> + * that.  In case of a power loss this misses blocks where the data I/O has
> + * completed but not recorded in the rmap yet, and it also rewrites blocks if
> + * the most recently written ones got deleted again before unmount, but this is
> + * the best we can do without hardware support.
> + */
> +static xfs_rgblock_t
> +xfs_rmap_estimate_write_pointer(
> +	struct xfs_rtgroup	*rtg)
> +{
> +	xfs_rgblock_t		highest_rgbno;
> +
> +	xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
> +	highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
> +	xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
> +
> +	if (highest_rgbno == NULLRGBLOCK)
> +		return 0;
> +	return highest_rgbno + 1;
> +}
> +
>  static int
>  xfs_init_zone(
>  	struct xfs_init_zones	*iz,
>  	struct xfs_rtgroup	*rtg,
> -	struct blk_zone		*zone)
> +	xfs_rgblock_t		write_pointer)
>  {
>  	struct xfs_mount	*mp = rtg_mount(rtg);
>  	struct xfs_zone_info	*zi = mp->m_zone_info;
>  	uint32_t		used = rtg_rmap(rtg)->i_used_blocks;
> -	xfs_rgblock_t		write_pointer, highest_rgbno;
>  	int			error;
>  
> -	if (zone && !xfs_zone_validate(zone, rtg, &write_pointer))
> -		return -EFSCORRUPTED;
> -
> -	/*
> -	 * For sequential write required zones we retrieved the hardware write
> -	 * pointer above.
> -	 *
> -	 * For conventional zones or conventional devices we don't have that
> -	 * luxury.  Instead query the rmap to find the highest recorded block
> -	 * and set the write pointer to the block after that.  In case of a
> -	 * power loss this misses blocks where the data I/O has completed but
> -	 * not recorded in the rmap yet, and it also rewrites blocks if the most
> -	 * recently written ones got deleted again before unmount, but this is
> -	 * the best we can do without hardware support.
> -	 */
> -	if (!zone || zone->cond == BLK_ZONE_COND_NOT_WP) {
> -		xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
> -		highest_rgbno = xfs_rtrmap_highest_rgbno(rtg);
> -		if (highest_rgbno == NULLRGBLOCK)
> -			write_pointer = 0;
> -		else
> -			write_pointer = highest_rgbno + 1;
> -		xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP);
> -	}
> -
>  	/*
>  	 * If there are no used blocks, but the zone is not in empty state yet
>  	 * we lost power before the zoned reset.  In that case finish the work
> @@ -1066,6 +1066,7 @@ xfs_get_zone_info_cb(
>  	struct xfs_mount	*mp = iz->mp;
>  	xfs_fsblock_t		zsbno = xfs_daddr_to_rtb(mp, zone->start);
>  	xfs_rgnumber_t		rgno;
> +	xfs_rgblock_t		write_pointer;
>  	struct xfs_rtgroup	*rtg;
>  	int			error;
>  
> @@ -1080,7 +1081,13 @@ xfs_get_zone_info_cb(
>  		xfs_warn(mp, "realtime group not found for zone %u.", rgno);
>  		return -EFSCORRUPTED;
>  	}
> -	error = xfs_init_zone(iz, rtg, zone);
> +	if (!xfs_zone_validate(zone, rtg, &write_pointer)) {
> +		xfs_rtgroup_rele(rtg);
> +		return -EFSCORRUPTED;
> +	}
> +	if (zone->cond == BLK_ZONE_COND_NOT_WP)
> +		write_pointer = xfs_rmap_estimate_write_pointer(rtg);
> +	error = xfs_init_zone(iz, rtg, write_pointer);
>  	xfs_rtgroup_rele(rtg);
>  	return error;
>  }
> @@ -1290,7 +1297,8 @@ xfs_mount_zones(
>  		struct xfs_rtgroup	*rtg = NULL;
>  
>  		while ((rtg = xfs_rtgroup_next(mp, rtg))) {
> -			error = xfs_init_zone(&iz, rtg, NULL);
> +			error = xfs_init_zone(&iz, rtg,
> +					xfs_rmap_estimate_write_pointer(rtg));
>  			if (error) {
>  				xfs_rtgroup_rele(rtg);
>  				goto out_free_zone_info;
> -- 
> 2.47.3
> 

With Damien's comment in place:

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

> 


Thread overview: 23+ messages
2026-01-09 17:20 refactor zone reporting Christoph Hellwig
2026-01-09 17:20 ` [PATCH 1/6] xfs: add missing forward declaration in xfs_zones.h Christoph Hellwig
2026-01-10  0:50   ` Darrick J. Wong
2026-01-09 17:20 ` [PATCH 2/6] xfs: add a xfs_rtgroup_raw_size helper Christoph Hellwig
2026-01-10  1:00   ` Darrick J. Wong
2026-01-09 17:20 ` [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone Christoph Hellwig
2026-01-10  1:11   ` Darrick J. Wong
2026-01-12 10:15   ` Damien Le Moal
2026-01-12 21:50     ` Darrick J. Wong
2026-01-13  7:47       ` Christoph Hellwig
2026-01-13  7:47     ` Christoph Hellwig
2026-01-13  9:27       ` Damien Le Moal
2026-01-09 17:20 ` [PATCH 4/6] xfs: split and refactor zone validation Christoph Hellwig
2026-01-10  1:44   ` Darrick J. Wong
2026-01-12 10:12     ` Christoph Hellwig
2026-01-09 17:20 ` [PATCH 5/6] xfs: check that used blocks are smaller than the write pointer Christoph Hellwig
2026-01-10  1:25   ` Darrick J. Wong
2026-01-09 17:20 ` [PATCH 6/6] xfs: use blkdev_get_zone_info to simply zone reporting Christoph Hellwig
2026-01-10  1:28   ` Darrick J. Wong
2026-01-13 10:33   ` Damien Le Moal
  -- strict thread matches above, loose matches on Subject: below --
2026-01-14  6:53 refactor zone reporting v2 Christoph Hellwig
2026-01-14  6:53 ` [PATCH 3/6] xfs: pass the write pointer to xfs_init_zone Christoph Hellwig
2026-01-14 10:00   ` Damien Le Moal
2026-01-16 14:16   ` Carlos Maiolino
