linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* misc cleanups
@ 2025-06-17 10:51 Christoph Hellwig
  2025-06-17 10:51 ` [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb Christoph Hellwig
                   ` (6 more replies)
  0 siblings, 7 replies; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-17 10:51 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

Hi all,

this series has a bunch of cleanups, mostly around the mount code and
triggered by various recent changes in the area.

Diffstat:
 xfs_buf.c     |   42 +++++-------------------
 xfs_buf.h     |   23 +------------
 xfs_buf_mem.c |    2 -
 xfs_file.c    |    2 -
 xfs_inode.h   |    2 -
 xfs_iomap.c   |    2 -
 xfs_iops.c    |    2 -
 xfs_mount.c   |   98 +++++++++++++++++++++++-----------------------------------
 xfs_super.c   |   12 +------
 xfs_trace.h   |   31 ++++++++----------
 10 files changed, 74 insertions(+), 142 deletions(-)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb
  2025-06-17 10:51 misc cleanups Christoph Hellwig
@ 2025-06-17 10:51 ` Christoph Hellwig
  2025-07-01 14:55   ` Darrick J. Wong
  2025-06-17 10:52 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-17 10:51 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

The initial sb read is always for a device logical block size
buffer.  The device logical block size is provided in the
bt_logical_sectorsize in struct buftarg, so use that instead of the
confusingly named xfs_getsize_buftarg buffer that reads it from the bdev.

Update the comments surrounding the code to better describe what is going
on.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf.h   |  1 -
 fs/xfs/xfs_mount.c | 21 +++++++++++----------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 9d2ab567cf81..294dd9d61dbb 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -376,7 +376,6 @@ extern void xfs_buftarg_wait(struct xfs_buftarg *);
 extern void xfs_buftarg_drain(struct xfs_buftarg *);
 int xfs_configure_buftarg(struct xfs_buftarg *btp, unsigned int sectorsize);
 
-#define xfs_getsize_buftarg(buftarg)	block_size((buftarg)->bt_bdev)
 #define xfs_readonly_buftarg(buftarg)	bdev_read_only((buftarg)->bt_bdev)
 
 int xfs_buf_reverify(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 29276fe60df9..047100b080aa 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -171,19 +171,16 @@ xfs_readsb(
 	ASSERT(mp->m_ddev_targp != NULL);
 
 	/*
-	 * For the initial read, we must guess at the sector
-	 * size based on the block device.  It's enough to
-	 * get the sb_sectsize out of the superblock and
-	 * then reread with the proper length.
-	 * We don't verify it yet, because it may not be complete.
+	 * In the first pass, use the device sector size to just read enough
+	 * of the superblock to extract the XFS sector size.
+	 *
+	 * The device sector size must be smaller than or equal to the XFS
+	 * sector size and thus we can always read the superblock.  Once we know
+	 * the XFS sector size, re-read it and run the buffer verifier.
 	 */
-	sector_size = xfs_getsize_buftarg(mp->m_ddev_targp);
+	sector_size = mp->m_ddev_targp->bt_logical_sectorsize;
 	buf_ops = NULL;
 
-	/*
-	 * Allocate a (locked) buffer to hold the superblock. This will be kept
-	 * around at all times to optimize access to the superblock.
-	 */
 reread:
 	error = xfs_buf_read_uncached(mp->m_ddev_targp, XFS_SB_DADDR,
 				      BTOBB(sector_size), &bp, buf_ops);
@@ -247,6 +244,10 @@ xfs_readsb(
 	/* no need to be quiet anymore, so reset the buf ops */
 	bp->b_ops = &xfs_sb_buf_ops;
 
+	/*
+	 * Keep a pointer of the sb buffer around instead of caching it in the
+	 * buffer cache because we access it frequently.
+	 */
 	mp->m_sb_bp = bp;
 	xfs_buf_unlock(bp);
 	return 0;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
  2025-06-17 10:51 misc cleanups Christoph Hellwig
  2025-06-17 10:51 ` [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb Christoph Hellwig
@ 2025-06-17 10:52 ` Christoph Hellwig
  2025-06-17 12:09   ` John Garry
  2025-06-17 10:52 ` [PATCH 3/7] xfs: remove the call to bdev_validate_blocksize " Christoph Hellwig
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-17 10:52 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

This extra call is no needed as xfs_alloc_buftarg already calls
sync_blockdev.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 8af83bd161f9..91647a43e1b2 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1744,8 +1744,7 @@ xfs_configure_buftarg(
 	 */
 	if (bdev_can_atomic_write(btp->bt_bdev))
 		xfs_configure_buftarg_atomic_writes(btp);
-
-	return sync_blockdev(btp->bt_bdev);
+	return 0;
 }
 
 int
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 3/7] xfs: remove the call to bdev_validate_blocksize in xfs_configure_buftarg
  2025-06-17 10:51 misc cleanups Christoph Hellwig
  2025-06-17 10:51 ` [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb Christoph Hellwig
  2025-06-17 10:52 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
@ 2025-06-17 10:52 ` Christoph Hellwig
  2025-06-17 23:40   ` Dave Chinner
  2025-06-17 10:52 ` [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max Christoph Hellwig
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-17 10:52 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

All checks the checks done in bdev_validate_blocksize are already
performed in xfs_readsb and xfs_validate_sb_common.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 91647a43e1b2..0f15837724e7 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1722,22 +1722,12 @@ xfs_configure_buftarg(
 	struct xfs_buftarg	*btp,
 	unsigned int		sectorsize)
 {
-	int			error;
-
 	ASSERT(btp->bt_bdev != NULL);
 
 	/* Set up metadata sector size info */
 	btp->bt_meta_sectorsize = sectorsize;
 	btp->bt_meta_sectormask = sectorsize - 1;
 
-	error = bdev_validate_blocksize(btp->bt_bdev, sectorsize);
-	if (error) {
-		xfs_warn(btp->bt_mount,
-			"Cannot use blocksize %u on device %pg, err %d",
-			sectorsize, btp->bt_bdev, error);
-		return -EINVAL;
-	}
-
 	/*
 	 * Flush the block device pagecache so our bios see anything dirtied
 	 * before mount.
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max
  2025-06-17 10:51 misc cleanups Christoph Hellwig
                   ` (2 preceding siblings ...)
  2025-06-17 10:52 ` [PATCH 3/7] xfs: remove the call to bdev_validate_blocksize " Christoph Hellwig
@ 2025-06-17 10:52 ` Christoph Hellwig
  2025-06-17 11:44   ` John Garry
  2025-06-17 10:52 ` [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields Christoph Hellwig
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-17 10:52 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

This function and the helpers used by it duplicate the same logic for AGs
and RTGs.  Use the xfs_group_type enum to unify both variants.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_mount.c | 77 +++++++++++++++++-----------------------------
 fs/xfs/xfs_trace.h | 31 +++++++++----------
 2 files changed, 43 insertions(+), 65 deletions(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 047100b080aa..929b1f3349dc 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -679,68 +679,47 @@ static inline unsigned int max_pow_of_two_factor(const unsigned int nr)
 }
 
 /*
- * If the data device advertises atomic write support, limit the size of data
- * device atomic writes to the greatest power-of-two factor of the AG size so
- * that every atomic write unit aligns with the start of every AG.  This is
- * required so that the per-AG allocations for an atomic write will always be
+ * If the underlying device advertises atomic write support, limit the size of
+ * atomic writes to the greatest power-of-two factor of the group size so
+ * that every atomic write unit aligns with the start of every group.  This is
+ * required so that the allocations for an atomic write will always be
  * aligned compatibly with the alignment requirements of the storage.
  *
- * If the data device doesn't advertise atomic writes, then there are no
- * alignment restrictions and the largest out-of-place write we can do
- * ourselves is the number of blocks that user files can allocate from any AG.
+ * If the device doesn't advertise atomic writes, then there are no alignment
+ * restrictions and the largest out-of-place write we can do ourselves is the
+ * number of blocks that user files can allocate from any group.
  */
-static inline xfs_extlen_t xfs_calc_perag_awu_max(struct xfs_mount *mp)
-{
-	if (mp->m_ddev_targp->bt_bdev_awu_min > 0)
-		return max_pow_of_two_factor(mp->m_sb.sb_agblocks);
-	return rounddown_pow_of_two(mp->m_ag_max_usable);
-}
-
-/*
- * Reflink on the realtime device requires rtgroups, and atomic writes require
- * reflink.
- *
- * If the realtime device advertises atomic write support, limit the size of
- * data device atomic writes to the greatest power-of-two factor of the rtgroup
- * size so that every atomic write unit aligns with the start of every rtgroup.
- * This is required so that the per-rtgroup allocations for an atomic write
- * will always be aligned compatibly with the alignment requirements of the
- * storage.
- *
- * If the rt device doesn't advertise atomic writes, then there are no
- * alignment restrictions and the largest out-of-place write we can do
- * ourselves is the number of blocks that user files can allocate from any
- * rtgroup.
- */
-static inline xfs_extlen_t xfs_calc_rtgroup_awu_max(struct xfs_mount *mp)
+static xfs_extlen_t
+xfs_calc_group_awu_max(
+	struct xfs_mount	*mp,
+	enum xfs_group_type	type)
 {
-	struct xfs_groups	*rgs = &mp->m_groups[XG_TYPE_RTG];
+	struct xfs_groups	*g = &mp->m_groups[type];
+	struct xfs_buftarg	*btp = type == XG_TYPE_RTG ?
+			mp->m_rtdev_targp : mp->m_ddev_targp;
 
-	if (rgs->blocks == 0)
+	if (g->blocks == 0)
 		return 0;
-	if (mp->m_rtdev_targp && mp->m_rtdev_targp->bt_bdev_awu_min > 0)
-		return max_pow_of_two_factor(rgs->blocks);
-	return rounddown_pow_of_two(rgs->blocks);
+	if (btp && btp->bt_bdev_awu_min > 0)
+		return max_pow_of_two_factor(g->blocks);
+	return rounddown_pow_of_two(g->blocks);
 }
 
 /* Compute the maximum atomic write unit size for each section. */
 static inline void
 xfs_calc_atomic_write_unit_max(
-	struct xfs_mount	*mp)
+	struct xfs_mount	*mp,
+	enum xfs_group_type	type)
 {
-	struct xfs_groups	*ags = &mp->m_groups[XG_TYPE_AG];
-	struct xfs_groups	*rgs = &mp->m_groups[XG_TYPE_RTG];
+	struct xfs_groups	*g = &mp->m_groups[type];
 
 	const xfs_extlen_t	max_write = xfs_calc_atomic_write_max(mp);
 	const xfs_extlen_t	max_ioend = xfs_reflink_max_atomic_cow(mp);
-	const xfs_extlen_t	max_agsize = xfs_calc_perag_awu_max(mp);
-	const xfs_extlen_t	max_rgsize = xfs_calc_rtgroup_awu_max(mp);
-
-	ags->awu_max = min3(max_write, max_ioend, max_agsize);
-	rgs->awu_max = min3(max_write, max_ioend, max_rgsize);
+	const xfs_extlen_t	max_gsize = xfs_calc_group_awu_max(mp, type);
 
-	trace_xfs_calc_atomic_write_unit_max(mp, max_write, max_ioend,
-			max_agsize, max_rgsize);
+	g->awu_max = min3(max_write, max_ioend, max_gsize);
+	trace_xfs_calc_atomic_write_unit_max(mp, type, max_write, max_ioend,
+			max_gsize, g->awu_max);
 }
 
 /*
@@ -758,7 +737,8 @@ xfs_set_max_atomic_write_opt(
 		max(mp->m_groups[XG_TYPE_AG].blocks,
 		    mp->m_groups[XG_TYPE_RTG].blocks);
 	const xfs_extlen_t	max_group_write =
-		max(xfs_calc_perag_awu_max(mp), xfs_calc_rtgroup_awu_max(mp));
+		max(xfs_calc_group_awu_max(mp, XG_TYPE_AG),
+		    xfs_calc_group_awu_max(mp, XG_TYPE_RTG));
 	int			error;
 
 	if (new_max_bytes == 0)
@@ -814,7 +794,8 @@ xfs_set_max_atomic_write_opt(
 		return error;
 	}
 
-	xfs_calc_atomic_write_unit_max(mp);
+	xfs_calc_atomic_write_unit_max(mp, XG_TYPE_AG);
+	xfs_calc_atomic_write_unit_max(mp, XG_TYPE_RTG);
 	mp->m_awu_max_bytes = new_max_bytes;
 	return 0;
 }
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 01d284a1c759..a45de5d89933 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -171,36 +171,33 @@ DEFINE_ATTR_LIST_EVENT(xfs_attr_leaf_list);
 DEFINE_ATTR_LIST_EVENT(xfs_attr_node_list);
 
 TRACE_EVENT(xfs_calc_atomic_write_unit_max,
-	TP_PROTO(struct xfs_mount *mp, unsigned int max_write,
-		 unsigned int max_ioend, unsigned int max_agsize,
-		 unsigned int max_rgsize),
-	TP_ARGS(mp, max_write, max_ioend, max_agsize, max_rgsize),
+	TP_PROTO(struct xfs_mount *mp, enum xfs_group_type type,
+		 unsigned int max_write, unsigned int max_ioend,
+		 unsigned int max_gsize, unsigned int awu_max),
+	TP_ARGS(mp, type, max_write, max_ioend, max_gsize, awu_max),
 	TP_STRUCT__entry(
 		__field(dev_t, dev)
+		__field(enum xfs_group_type, type)
 		__field(unsigned int, max_write)
 		__field(unsigned int, max_ioend)
-		__field(unsigned int, max_agsize)
-		__field(unsigned int, max_rgsize)
-		__field(unsigned int, data_awu_max)
-		__field(unsigned int, rt_awu_max)
+		__field(unsigned int, max_gsize)
+		__field(unsigned int, awu_max)
 	),
 	TP_fast_assign(
 		__entry->dev = mp->m_super->s_dev;
+		__entry->type = type;
 		__entry->max_write = max_write;
 		__entry->max_ioend = max_ioend;
-		__entry->max_agsize = max_agsize;
-		__entry->max_rgsize = max_rgsize;
-		__entry->data_awu_max = mp->m_groups[XG_TYPE_AG].awu_max;
-		__entry->rt_awu_max = mp->m_groups[XG_TYPE_RTG].awu_max;
+		__entry->max_gsize = max_gsize;
+		__entry->awu_max = awu_max;
 	),
-	TP_printk("dev %d:%d max_write %u max_ioend %u max_agsize %u max_rgsize %u data_awu_max %u rt_awu_max %u",
+	TP_printk("dev %d:%d %s max_write %u max_ioend %u max_gsize %u awu_max %u",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __print_symbolic(__entry->type, XG_TYPE_STRINGS),
 		  __entry->max_write,
 		  __entry->max_ioend,
-		  __entry->max_agsize,
-		  __entry->max_rgsize,
-		  __entry->data_awu_max,
-		  __entry->rt_awu_max)
+		  __entry->max_gsize,
+		  __entry->awu_max)
 );
 
 TRACE_EVENT(xfs_calc_max_atomic_write_fsblocks,
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields
  2025-06-17 10:51 misc cleanups Christoph Hellwig
                   ` (3 preceding siblings ...)
  2025-06-17 10:52 ` [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max Christoph Hellwig
@ 2025-06-17 10:52 ` Christoph Hellwig
  2025-06-17 12:02   ` John Garry
  2025-06-17 10:52 ` [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field Christoph Hellwig
  2025-06-17 10:52 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
  6 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-17 10:52 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

The extra bdev_ is weird, so drop it.  The maximum size is based on the
bdev hardware limits, so add a hw_ component instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf.c   | 4 ++--
 fs/xfs/xfs_buf.h   | 4 ++--
 fs/xfs/xfs_file.c  | 2 +-
 fs/xfs/xfs_inode.h | 2 +-
 fs/xfs/xfs_iomap.c | 2 +-
 fs/xfs/xfs_iops.c  | 2 +-
 fs/xfs/xfs_mount.c | 2 +-
 7 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 0f15837724e7..abd66665fb8c 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1712,8 +1712,8 @@ xfs_configure_buftarg_atomic_writes(
 		max_bytes = 0;
 	}
 
-	btp->bt_bdev_awu_min = min_bytes;
-	btp->bt_bdev_awu_max = max_bytes;
+	btp->bt_awu_min = min_bytes;
+	btp->bt_awu_max_hw = max_bytes;
 }
 
 /* Configure a buffer target that abstracts a block device. */
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 294dd9d61dbb..6a96780d841b 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -113,8 +113,8 @@ struct xfs_buftarg {
 	struct ratelimit_state	bt_ioerror_rl;
 
 	/* Atomic write unit values, bytes */
-	unsigned int		bt_bdev_awu_min;
-	unsigned int		bt_bdev_awu_max;
+	unsigned int		bt_awu_min;
+	unsigned int		bt_awu_max_hw;
 
 	/* built-in cache, if we're not using the perag one */
 	struct xfs_buf_cache	bt_cache[];
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 48254a72071b..6c1ccb5a2c82 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -752,7 +752,7 @@ xfs_file_dio_write_atomic(
 	 * HW offload should be faster, so try that first if it is already
 	 * known that the write length is not too large.
 	 */
-	if (ocount > xfs_inode_buftarg(ip)->bt_bdev_awu_max)
+	if (ocount > xfs_inode_buftarg(ip)->bt_awu_max_hw)
 		dops = &xfs_atomic_write_cow_iomap_ops;
 	else
 		dops = &xfs_direct_write_iomap_ops;
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index d7e2b902ef5c..39660d7aa3bb 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -358,7 +358,7 @@ static inline bool xfs_inode_has_bigrtalloc(const struct xfs_inode *ip)
 
 static inline bool xfs_inode_can_hw_atomic_write(const struct xfs_inode *ip)
 {
-	return xfs_inode_buftarg(ip)->bt_bdev_awu_max > 0;
+	return xfs_inode_buftarg(ip)->bt_awu_max_hw > 0;
 }
 
 /*
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index ff05e6b1b0bb..a2387d5918a9 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -827,7 +827,7 @@ xfs_bmap_hw_atomic_write_possible(
 	/*
 	 * The ->iomap_begin caller should ensure this, but check anyway.
 	 */
-	return len <= xfs_inode_buftarg(ip)->bt_bdev_awu_max;
+	return len <= xfs_inode_buftarg(ip)->bt_awu_max_hw;
 }
 
 static int
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 8cddbb7c149b..dc14c853475e 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -665,7 +665,7 @@ xfs_get_atomic_write_max_opt(
 	 * less than our out of place write limit, but we don't want to exceed
 	 * the awu_max.
 	 */
-	return min(awu_max, xfs_inode_buftarg(ip)->bt_bdev_awu_max);
+	return min(awu_max, xfs_inode_buftarg(ip)->bt_awu_max_hw);
 }
 
 static void
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 929b1f3349dc..70554f90c210 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -700,7 +700,7 @@ xfs_calc_group_awu_max(
 
 	if (g->blocks == 0)
 		return 0;
-	if (btp && btp->bt_bdev_awu_min > 0)
+	if (btp && btp->bt_awu_min > 0)
 		return max_pow_of_two_factor(g->blocks);
 	return rounddown_pow_of_two(g->blocks);
 }
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field
  2025-06-17 10:51 misc cleanups Christoph Hellwig
                   ` (4 preceding siblings ...)
  2025-06-17 10:52 ` [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields Christoph Hellwig
@ 2025-06-17 10:52 ` Christoph Hellwig
  2025-06-17 10:52 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
  6 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-17 10:52 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

And use bt_file for both bdev and shmem backed buftargs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf.c | 4 ++--
 fs/xfs/xfs_buf.h | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index abd66665fb8c..c8f0f8fe433a 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1683,7 +1683,7 @@ xfs_free_buftarg(
 	fs_put_dax(btp->bt_daxdev, btp->bt_mount);
 	/* the main block device is closed by kill_block_super */
 	if (btp->bt_bdev != btp->bt_mount->m_super->s_bdev)
-		bdev_fput(btp->bt_bdev_file);
+		bdev_fput(btp->bt_file);
 	kfree(btp);
 }
 
@@ -1792,7 +1792,7 @@ xfs_alloc_buftarg(
 	btp = kzalloc(sizeof(*btp), GFP_KERNEL | __GFP_NOFAIL);
 
 	btp->bt_mount = mp;
-	btp->bt_bdev_file = bdev_file;
+	btp->bt_file = bdev_file;
 	btp->bt_bdev = file_bdev(bdev_file);
 	btp->bt_dev = btp->bt_bdev->bd_dev;
 	btp->bt_daxdev = fs_dax_get_by_bdev(btp->bt_bdev, &btp->bt_dax_part_off,
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 6a96780d841b..adc97351f12a 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -94,7 +94,6 @@ void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
  */
 struct xfs_buftarg {
 	dev_t			bt_dev;
-	struct file		*bt_bdev_file;
 	struct block_device	*bt_bdev;
 	struct dax_device	*bt_daxdev;
 	struct file		*bt_file;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-17 10:51 misc cleanups Christoph Hellwig
                   ` (5 preceding siblings ...)
  2025-06-17 10:52 ` [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field Christoph Hellwig
@ 2025-06-17 10:52 ` Christoph Hellwig
  2025-06-17 12:15   ` John Garry
                     ` (2 more replies)
  6 siblings, 3 replies; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-17 10:52 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

The file system only has a single file system sector size.  Read that
from the in-core super block to avoid confusion about the two different
"sector sizes" stored in the buftarg.  Note that this loosens the
alignment asserts for memory backed buftargs that set the page size here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf.c     | 21 +++++----------------
 fs/xfs/xfs_buf.h     | 17 +----------------
 fs/xfs/xfs_buf_mem.c |  2 --
 fs/xfs/xfs_super.c   | 12 +++---------
 4 files changed, 9 insertions(+), 43 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index c8f0f8fe433a..0acac5302c54 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -387,11 +387,12 @@ xfs_buf_map_verify(
 	struct xfs_buftarg	*btp,
 	struct xfs_buf_map	*map)
 {
+	unsigned int		sectsize = btp->bt_mount->m_sb.sb_sectsize;
 	xfs_daddr_t		eofs;
 
 	/* Check for IOs smaller than the sector size / not sector aligned */
-	ASSERT(!(BBTOB(map->bm_len) < btp->bt_meta_sectorsize));
-	ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)btp->bt_meta_sectormask));
+	ASSERT(!(BBTOB(map->bm_len) < sectsize));
+	ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)(sectsize - 1)));
 
 	/*
 	 * Corrupted block numbers can get through to here, unfortunately, so we
@@ -1719,15 +1720,10 @@ xfs_configure_buftarg_atomic_writes(
 /* Configure a buffer target that abstracts a block device. */
 int
 xfs_configure_buftarg(
-	struct xfs_buftarg	*btp,
-	unsigned int		sectorsize)
+	struct xfs_buftarg	*btp)
 {
 	ASSERT(btp->bt_bdev != NULL);
 
-	/* Set up metadata sector size info */
-	btp->bt_meta_sectorsize = sectorsize;
-	btp->bt_meta_sectormask = sectorsize - 1;
-
 	/*
 	 * Flush the block device pagecache so our bios see anything dirtied
 	 * before mount.
@@ -1806,14 +1802,7 @@ xfs_alloc_buftarg(
 	if (error)
 		goto error_free;
 
-	/*
-	 * When allocating the buftargs we have not yet read the super block and
-	 * thus don't know the file system sector size yet.
-	 */
-	btp->bt_meta_sectorsize = bdev_logical_block_size(btp->bt_bdev);
-	btp->bt_meta_sectormask = btp->bt_meta_sectorsize - 1;
-
-	error = xfs_init_buftarg(btp, btp->bt_meta_sectorsize,
+	error = xfs_init_buftarg(btp, bdev_logical_block_size(btp->bt_bdev),
 				mp->m_super->s_id);
 	if (error)
 		goto error_free;
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index adc97351f12a..ec17baed2cbb 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -79,19 +79,6 @@ struct xfs_buf_cache {
 int xfs_buf_cache_init(struct xfs_buf_cache *bch);
 void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
 
-/*
- * The xfs_buftarg contains 2 notions of "sector size" -
- *
- * 1) The metadata sector size, which is the minimum unit and
- *    alignment of IO which will be performed by metadata operations.
- * 2) The device logical sector size
- *
- * The first is specified at mkfs time, and is stored on-disk in the
- * superblock's sb_sectsize.
- *
- * The latter is derived from the underlying device, and controls direct IO
- * alignment constraints.
- */
 struct xfs_buftarg {
 	dev_t			bt_dev;
 	struct block_device	*bt_bdev;
@@ -99,8 +86,6 @@ struct xfs_buftarg {
 	struct file		*bt_file;
 	u64			bt_dax_part_off;
 	struct xfs_mount	*bt_mount;
-	unsigned int		bt_meta_sectorsize;
-	size_t			bt_meta_sectormask;
 	size_t			bt_logical_sectorsize;
 	size_t			bt_logical_sectormask;
 
@@ -373,7 +358,7 @@ struct xfs_buftarg *xfs_alloc_buftarg(struct xfs_mount *mp,
 extern void xfs_free_buftarg(struct xfs_buftarg *);
 extern void xfs_buftarg_wait(struct xfs_buftarg *);
 extern void xfs_buftarg_drain(struct xfs_buftarg *);
-int xfs_configure_buftarg(struct xfs_buftarg *btp, unsigned int sectorsize);
+int xfs_configure_buftarg(struct xfs_buftarg *btp);
 
 #define xfs_readonly_buftarg(buftarg)	bdev_read_only((buftarg)->bt_bdev)
 
diff --git a/fs/xfs/xfs_buf_mem.c b/fs/xfs/xfs_buf_mem.c
index dcbfa274e06d..46f527750d34 100644
--- a/fs/xfs/xfs_buf_mem.c
+++ b/fs/xfs/xfs_buf_mem.c
@@ -90,8 +90,6 @@ xmbuf_alloc(
 	btp->bt_dev = (dev_t)-1U;
 	btp->bt_bdev = NULL; /* in-memory buftargs have no bdev */
 	btp->bt_file = file;
-	btp->bt_meta_sectorsize = XMBUF_BLOCKSIZE;
-	btp->bt_meta_sectormask = XMBUF_BLOCKSIZE - 1;
 
 	error = xfs_init_buftarg(btp, XMBUF_BLOCKSIZE, descr);
 	if (error)
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index bb0a82635a77..9067a6977627 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -541,17 +541,12 @@ xfs_setup_devices(
 {
 	int			error;
 
-	error = xfs_configure_buftarg(mp->m_ddev_targp, mp->m_sb.sb_sectsize);
+	error = xfs_configure_buftarg(mp->m_ddev_targp);
 	if (error)
 		return error;
 
 	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp) {
-		unsigned int	log_sector_size = BBSIZE;
-
-		if (xfs_has_sector(mp))
-			log_sector_size = mp->m_sb.sb_logsectsize;
-		error = xfs_configure_buftarg(mp->m_logdev_targp,
-					    log_sector_size);
+		error = xfs_configure_buftarg(mp->m_logdev_targp);
 		if (error)
 			return error;
 	}
@@ -564,8 +559,7 @@ xfs_setup_devices(
 		}
 		mp->m_rtdev_targp = mp->m_ddev_targp;
 	} else if (mp->m_rtname) {
-		error = xfs_configure_buftarg(mp->m_rtdev_targp,
-					    mp->m_sb.sb_sectsize);
+		error = xfs_configure_buftarg(mp->m_rtdev_targp);
 		if (error)
 			return error;
 	}
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max
  2025-06-17 10:52 ` [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max Christoph Hellwig
@ 2025-06-17 11:44   ` John Garry
  2025-06-18  5:08     ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: John Garry @ 2025-06-17 11:44 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs

On 17/06/2025 11:52, Christoph Hellwig wrote:
> This function and the helpers used by it duplicate the same logic for AGs
> and RTGs.  Use the xfs_group_type enum to unify both variants.
> 

This looks ok, and regardless of minor comments:

Reviewed-by: John Garry <john.g.garry@oracle.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   fs/xfs/xfs_mount.c | 77 +++++++++++++++++-----------------------------
>   fs/xfs/xfs_trace.h | 31 +++++++++----------
>   2 files changed, 43 insertions(+), 65 deletions(-)
> 
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 047100b080aa..929b1f3349dc 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -679,68 +679,47 @@ static inline unsigned int max_pow_of_two_factor(const unsigned int nr)
>   }
>   
>   /*
> - * If the data device advertises atomic write support, limit the size of data
> - * device atomic writes to the greatest power-of-two factor of the AG size so
> - * that every atomic write unit aligns with the start of every AG.  This is
> - * required so that the per-AG allocations for an atomic write will always be
> + * If the underlying device advertises atomic write support, limit the size of
> + * atomic writes to the greatest power-of-two factor of the group size so
> + * that every atomic write unit aligns with the start of every group.  This is
> + * required so that the allocations for an atomic write will always be
>    * aligned compatibly with the alignment requirements of the storage.
>    *
> - * If the data device doesn't advertise atomic writes, then there are no
> - * alignment restrictions and the largest out-of-place write we can do
> - * ourselves is the number of blocks that user files can allocate from any AG.
> + * If the device doesn't advertise atomic writes, then there are no alignment
> + * restrictions and the largest out-of-place write we can do ourselves is the
> + * number of blocks that user files can allocate from any group.
>    */
> -static inline xfs_extlen_t xfs_calc_perag_awu_max(struct xfs_mount *mp)
> -{
> -	if (mp->m_ddev_targp->bt_bdev_awu_min > 0)
> -		return max_pow_of_two_factor(mp->m_sb.sb_agblocks);
> -	return rounddown_pow_of_two(mp->m_ag_max_usable);
> -}
> -
> -/*
> - * Reflink on the realtime device requires rtgroups, and atomic writes require
> - * reflink.
> - *
> - * If the realtime device advertises atomic write support, limit the size of
> - * data device atomic writes to the greatest power-of-two factor of the rtgroup
> - * size so that every atomic write unit aligns with the start of every rtgroup.
> - * This is required so that the per-rtgroup allocations for an atomic write
> - * will always be aligned compatibly with the alignment requirements of the
> - * storage.
> - *
> - * If the rt device doesn't advertise atomic writes, then there are no
> - * alignment restrictions and the largest out-of-place write we can do
> - * ourselves is the number of blocks that user files can allocate from any
> - * rtgroup.
> - */
> -static inline xfs_extlen_t xfs_calc_rtgroup_awu_max(struct xfs_mount *mp)
> +static xfs_extlen_t
> +xfs_calc_group_awu_max(
> +	struct xfs_mount	*mp,
> +	enum xfs_group_type	type)
>   {
> -	struct xfs_groups	*rgs = &mp->m_groups[XG_TYPE_RTG];
> +	struct xfs_groups	*g = &mp->m_groups[type];
> +	struct xfs_buftarg	*btp = type == XG_TYPE_RTG ?
> +			mp->m_rtdev_targp : mp->m_ddev_targp;

Could this be made a bit more readable?

>   
> -	if (rgs->blocks == 0)
> +	if (g->blocks == 0)
>   		return 0;
> -	if (mp->m_rtdev_targp && mp->m_rtdev_targp->bt_bdev_awu_min > 0)
> -		return max_pow_of_two_factor(rgs->blocks);
> -	return rounddown_pow_of_two(rgs->blocks);
> +	if (btp && btp->bt_bdev_awu_min > 0)

Is it actually logically possible that g->blocks != 0 and btp == NULL? 
That's really a comment on the current rt handling.

> +		return max_pow_of_two_factor(g->blocks);
> +	return rounddown_pow_of_two(g->blocks);

note: We may be able to improve this calc in future, as I think that 
awu_max_opt only should be limited by max_pow_of_two_factor(g->blocks) 
and awu_max should be limited by rounddown_pow_of_two(g->blocks). I need 
to check the code more... it would not increase awu_max_opt, which 
people like me care about.

>   }
>   
>   /* Compute the maximum atomic write unit size for each section. */
>   static inline void
>   xfs_calc_atomic_write_unit_max(
> -	struct xfs_mount	*mp)
> +	struct xfs_mount	*mp,
> +	enum xfs_group_type	type)
>   {
> -	struct xfs_groups	*ags = &mp->m_groups[XG_TYPE_AG];
> -	struct xfs_groups	*rgs = &mp->m_groups[XG_TYPE_RTG];
> +	struct xfs_groups	*g = &mp->m_groups[type];
>   
>   	const xfs_extlen_t	max_write = xfs_calc_atomic_write_max(mp);
>   	const xfs_extlen_t	max_ioend = xfs_reflink_max_atomic_cow(mp);
> -	const xfs_extlen_t	max_agsize = xfs_calc_perag_awu_max(mp);
> -	const xfs_extlen_t	max_rgsize = xfs_calc_rtgroup_awu_max(mp);
> -
> -	ags->awu_max = min3(max_write, max_ioend, max_agsize);
> -	rgs->awu_max = min3(max_write, max_ioend, max_rgsize);
> +	const xfs_extlen_t	max_gsize = xfs_calc_group_awu_max(mp, type);
>   
> -	trace_xfs_calc_atomic_write_unit_max(mp, max_write, max_ioend,
> -			max_agsize, max_rgsize);
> +	g->awu_max = min3(max_write, max_ioend, max_gsize);
> +	trace_xfs_calc_atomic_write_unit_max(mp, type, max_write, max_ioend,
> +			max_gsize, g->awu_max);
>   }
>   
>   /*
> @@ -758,7 +737,8 @@ xfs_set_max_atomic_write_opt(
>   		max(mp->m_groups[XG_TYPE_AG].blocks,
>   		    mp->m_groups[XG_TYPE_RTG].blocks);
>   	const xfs_extlen_t	max_group_write =
> -		max(xfs_calc_perag_awu_max(mp), xfs_calc_rtgroup_awu_max(mp));
> +		max(xfs_calc_group_awu_max(mp, XG_TYPE_AG),
> +		    xfs_calc_group_awu_max(mp, XG_TYPE_RTG));
>   	int			error;
>   
>   	if (new_max_bytes == 0)
> @@ -814,7 +794,8 @@ xfs_set_max_atomic_write_opt(
>   		return error;
>   	}
>   
> -	xfs_calc_atomic_write_unit_max(mp);
> +	xfs_calc_atomic_write_unit_max(mp, XG_TYPE_AG);
> +	xfs_calc_atomic_write_unit_max(mp, XG_TYPE_RTG);
>   	mp->m_awu_max_bytes = new_max_bytes;
>   	return 0;
>   }
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index 01d284a1c759..a45de5d89933 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -171,36 +171,33 @@ DEFINE_ATTR_LIST_EVENT(xfs_attr_leaf_list);
>   DEFINE_ATTR_LIST_EVENT(xfs_attr_node_list);
>   
>   TRACE_EVENT(xfs_calc_atomic_write_unit_max,
> -	TP_PROTO(struct xfs_mount *mp, unsigned int max_write,
> -		 unsigned int max_ioend, unsigned int max_agsize,
> -		 unsigned int max_rgsize),
> -	TP_ARGS(mp, max_write, max_ioend, max_agsize, max_rgsize),
> +	TP_PROTO(struct xfs_mount *mp, enum xfs_group_type type,
> +		 unsigned int max_write, unsigned int max_ioend,
> +		 unsigned int max_gsize, unsigned int awu_max),
> +	TP_ARGS(mp, type, max_write, max_ioend, max_gsize, awu_max),
>   	TP_STRUCT__entry(
>   		__field(dev_t, dev)
> +		__field(enum xfs_group_type, type)
>   		__field(unsigned int, max_write)
>   		__field(unsigned int, max_ioend)
> -		__field(unsigned int, max_agsize)
> -		__field(unsigned int, max_rgsize)
> -		__field(unsigned int, data_awu_max)
> -		__field(unsigned int, rt_awu_max)
> +		__field(unsigned int, max_gsize)
> +		__field(unsigned int, awu_max)
>   	),
>   	TP_fast_assign(
>   		__entry->dev = mp->m_super->s_dev;
> +		__entry->type = type;
>   		__entry->max_write = max_write;
>   		__entry->max_ioend = max_ioend;
> -		__entry->max_agsize = max_agsize;
> -		__entry->max_rgsize = max_rgsize;
> -		__entry->data_awu_max = mp->m_groups[XG_TYPE_AG].awu_max;
> -		__entry->rt_awu_max = mp->m_groups[XG_TYPE_RTG].awu_max;
> +		__entry->max_gsize = max_gsize;
> +		__entry->awu_max = awu_max;
>   	),
> -	TP_printk("dev %d:%d max_write %u max_ioend %u max_agsize %u max_rgsize %u data_awu_max %u rt_awu_max %u",
> +	TP_printk("dev %d:%d %s max_write %u max_ioend %u max_gsize %u awu_max %u",
>   		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __print_symbolic(__entry->type, XG_TYPE_STRINGS),

nice

>   		  __entry->max_write,
>   		  __entry->max_ioend,
> -		  __entry->max_agsize,
> -		  __entry->max_rgsize,
> -		  __entry->data_awu_max,
> -		  __entry->rt_awu_max)
> +		  __entry->max_gsize,
> +		  __entry->awu_max)
>   );
>   
>   TRACE_EVENT(xfs_calc_max_atomic_write_fsblocks,


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields
  2025-06-17 10:52 ` [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields Christoph Hellwig
@ 2025-06-17 12:02   ` John Garry
  2025-06-18  5:10     ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: John Garry @ 2025-06-17 12:02 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs

On 17/06/2025 11:52, Christoph Hellwig wrote:
> The extra bdev_ is weird, so drop it.  The maximum size is based on the
> bdev hardware limits, so add a hw_ component instead.

but the min is also based on hw limits, no?

I also note that we have request queue limits atomic_write_unit_max and 
atomic_write_hw_unit_max

and bt_awu_max_hw is written with request queue limit atomic_write_unit_max

But I don't think that this will cause confusion.

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   fs/xfs/xfs_buf.c   | 4 ++--
>   fs/xfs/xfs_buf.h   | 4 ++--
>   fs/xfs/xfs_file.c  | 2 +-
>   fs/xfs/xfs_inode.h | 2 +-
>   fs/xfs/xfs_iomap.c | 2 +-
>   fs/xfs/xfs_iops.c  | 2 +-
>   fs/xfs/xfs_mount.c | 2 +-
>   7 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 0f15837724e7..abd66665fb8c 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1712,8 +1712,8 @@ xfs_configure_buftarg_atomic_writes(
>   		max_bytes = 0;
>   	}
>   
> -	btp->bt_bdev_awu_min = min_bytes;
> -	btp->bt_bdev_awu_max = max_bytes;
> +	btp->bt_awu_min = min_bytes;
> +	btp->bt_awu_max_hw = max_bytes;>   }
>   

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
  2025-06-17 10:52 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
@ 2025-06-17 12:09   ` John Garry
  2025-06-24 14:07     ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: John Garry @ 2025-06-17 12:09 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs

On 17/06/2025 11:52, Christoph Hellwig wrote:
> This extra call is no needed as xfs_alloc_buftarg already calls
> sync_blockdev.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   fs/xfs/xfs_buf.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 8af83bd161f9..91647a43e1b2 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1744,8 +1744,7 @@ xfs_configure_buftarg(
>   	 */
>   	if (bdev_can_atomic_write(btp->bt_bdev))
>   		xfs_configure_buftarg_atomic_writes(btp);
> -
> -	return sync_blockdev(btp->bt_bdev);
> +	return 0;

we only ever return 0 now, so we can get rid of the return code

>   }
>   
>   int


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-17 10:52 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
@ 2025-06-17 12:15   ` John Garry
  2025-06-17 12:21     ` John Garry
  2025-06-17 23:51   ` Dave Chinner
  2025-06-18  8:09   ` kernel test robot
  2 siblings, 1 reply; 36+ messages in thread
From: John Garry @ 2025-06-17 12:15 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs

On 17/06/2025 11:52, Christoph Hellwig wrote:
>   xfs_configure_buftarg(
> -	struct xfs_buftarg	*btp,
> -	unsigned int		sectorsize)
> +	struct xfs_buftarg	*btp)
>   {

This is now just really a wrapper for calling 
xfs_configure_buftarg_atomic_write() - is that ok?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-17 12:15   ` John Garry
@ 2025-06-17 12:21     ` John Garry
  2025-06-18  5:11       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: John Garry @ 2025-06-17 12:21 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs

On 17/06/2025 13:15, John Garry wrote:
> On 17/06/2025 11:52, Christoph Hellwig wrote:
>>   xfs_configure_buftarg(
>> -    struct xfs_buftarg    *btp,
>> -    unsigned int        sectorsize)
>> +    struct xfs_buftarg    *btp)
>>   {
> 
> This is now just really a wrapper for calling 
> xfs_configure_buftarg_atomic_write() - is that ok?

And the bdev_can_atomic_write() call [not shown] is superfluous already, 
as we have that same check in bdev_atomic_write_unit_{min,max}_bytes()

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/7] xfs: remove the call to bdev_validate_blocksize in xfs_configure_buftarg
  2025-06-17 10:52 ` [PATCH 3/7] xfs: remove the call to bdev_validate_blocksize " Christoph Hellwig
@ 2025-06-17 23:40   ` Dave Chinner
  2025-06-18  5:05     ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Dave Chinner @ 2025-06-17 23:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, Darrick J. Wong, linux-xfs

On Tue, Jun 17, 2025 at 12:52:01PM +0200, Christoph Hellwig wrote:
> All checks the checks done in bdev_validate_blocksize are already
> performed in xfs_readsb and xfs_validate_sb_common.

For the data device, yes. I don't obviously see anywhere else that
we check the fs external log dev or rt device sector size against
the block device sector size, so unless I'm just being blind it
seems to me that this check in xfs_configure_buftarg() is still
necessary for those devices.

-Dave.

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-17 10:52 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
  2025-06-17 12:15   ` John Garry
@ 2025-06-17 23:51   ` Dave Chinner
  2025-06-18  5:15     ` Christoph Hellwig
  2025-06-18  8:09   ` kernel test robot
  2 siblings, 1 reply; 36+ messages in thread
From: Dave Chinner @ 2025-06-17 23:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, Darrick J. Wong, linux-xfs

On Tue, Jun 17, 2025 at 12:52:05PM +0200, Christoph Hellwig wrote:
> The file system only has a single file system sector size.

The external log device can have a different sector size to
the rest of the filesystem. This series looks like it removes the
ability to validate that the log device sector size in teh
superblock is valid for the backing device....

At minimum, the series description needs to be more than "this is
just a code cleanup" because it really changes the way block devices
are opened, configured and validated against the on-disk information
in the superblock....

-Dave.

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/7] xfs: remove the call to bdev_validate_blocksize in xfs_configure_buftarg
  2025-06-17 23:40   ` Dave Chinner
@ 2025-06-18  5:05     ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-18  5:05 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Carlos Maiolino, John Garry, Darrick J. Wong,
	linux-xfs

On Wed, Jun 18, 2025 at 09:40:20AM +1000, Dave Chinner wrote:
> On Tue, Jun 17, 2025 at 12:52:01PM +0200, Christoph Hellwig wrote:
> > All checks the checks done in bdev_validate_blocksize are already
> > performed in xfs_readsb and xfs_validate_sb_common.
> 
> For the data device, yes. I don't obviously see anywhere else that
> we check the fs external log dev or rt device sector size against
> the block device sector size, so unless I'm just being blind it
> seems to me that this check in xfs_configure_buftarg() is still
> necessary for those devices.

bdev_validate_blocksize does two things:
 
 - call blk_validate_block_size to enure the passed in block size
   is a power of two, > 512 bytes and < BLK_MAX_BLOCK_SIZE
 - check that the passed in size is larger than the sector size

In xfs_setup_devices both the main and RT device pass sb_sectsize,
so the first part is common for them, the log device passes
either BBSIZE or sb_logsectsize.

XFS verifies the is power of two for both sb fields.  The
BLK_MAX_BLOCK_SIZE is not relevant as we don't set the block
size in the block layer.  But yes, we are missing the
XFS (log)sectorsize >= lba size check for the RT and log device,
I'll fix it up.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max
  2025-06-17 11:44   ` John Garry
@ 2025-06-18  5:08     ` Christoph Hellwig
  2025-06-18  6:28       ` John Garry
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-18  5:08 UTC (permalink / raw)
  To: John Garry; +Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Tue, Jun 17, 2025 at 12:44:57PM +0100, John Garry wrote:
>> -	struct xfs_groups	*rgs = &mp->m_groups[XG_TYPE_RTG];
>> +	struct xfs_groups	*g = &mp->m_groups[type];
>> +	struct xfs_buftarg	*btp = type == XG_TYPE_RTG ?
>> +			mp->m_rtdev_targp : mp->m_ddev_targp;
>
> Could this be made a bit more readable?

Suggestions welcome.

>
>>   -	if (rgs->blocks == 0)
>> +	if (g->blocks == 0)
>>   		return 0;
>> -	if (mp->m_rtdev_targp && mp->m_rtdev_targp->bt_bdev_awu_min > 0)
>> -		return max_pow_of_two_factor(rgs->blocks);
>> -	return rounddown_pow_of_two(rgs->blocks);
>> +	if (btp && btp->bt_bdev_awu_min > 0)
>
> Is it actually logically possible that g->blocks != 0 and btp == NULL? 
> That's really a comment on the current rt handling.

No.  For the data device btp is never NULL, and for the RT device we
reject the mount early on in this case, see xfs_rtmount_readsb and
xfs_rtmount_init.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields
  2025-06-17 12:02   ` John Garry
@ 2025-06-18  5:10     ` Christoph Hellwig
  2025-06-18  6:23       ` John Garry
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-18  5:10 UTC (permalink / raw)
  To: John Garry; +Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Tue, Jun 17, 2025 at 01:02:16PM +0100, John Garry wrote:
> On 17/06/2025 11:52, Christoph Hellwig wrote:
>> The extra bdev_ is weird, so drop it.  The maximum size is based on the
>> bdev hardware limits, so add a hw_ component instead.
>
> but the min is also based on hw limits, no?

Yes.

> I also note that we have request queue limits atomic_write_unit_max and 
> atomic_write_hw_unit_max
>
> and bt_awu_max_hw is written with request queue limit atomic_write_unit_max
>
> But I don't think that this will cause confusion.

Should we switch to the request_queue names instead of the nvme
spec names here entirely?


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-17 12:21     ` John Garry
@ 2025-06-18  5:11       ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-18  5:11 UTC (permalink / raw)
  To: John Garry; +Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Tue, Jun 17, 2025 at 01:21:38PM +0100, John Garry wrote:
> On 17/06/2025 13:15, John Garry wrote:
>> On 17/06/2025 11:52, Christoph Hellwig wrote:
>>>   xfs_configure_buftarg(
>>> -    struct xfs_buftarg    *btp,
>>> -    unsigned int        sectorsize)
>>> +    struct xfs_buftarg    *btp)
>>>   {
>>
>> This is now just really a wrapper for calling 
>> xfs_configure_buftarg_atomic_write() - is that ok?
>
> And the bdev_can_atomic_write() call [not shown] is superfluous already, as 
> we have that same check in bdev_atomic_write_unit_{min,max}_bytes()

I'll look into it, that should simplify things further.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-17 23:51   ` Dave Chinner
@ 2025-06-18  5:15     ` Christoph Hellwig
  2025-06-19  2:42       ` Dave Chinner
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-18  5:15 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Carlos Maiolino, John Garry, Darrick J. Wong,
	linux-xfs

On Wed, Jun 18, 2025 at 09:51:10AM +1000, Dave Chinner wrote:
> On Tue, Jun 17, 2025 at 12:52:05PM +0200, Christoph Hellwig wrote:
> > The file system only has a single file system sector size.
> 
> The external log device can have a different sector size to
> the rest of the filesystem. This series looks like it removes the
> ability to validate that the log device sector size in teh
> superblock is valid for the backing device....

I don't follow.  Do you mean it remove the future possibility to do this?
Even then it would be better to do this directly based off the superblock
and not use a field in the buftarg currently only used for cached buffers
(which aren't used on anything but the main device).

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields
  2025-06-18  5:10     ` Christoph Hellwig
@ 2025-06-18  6:23       ` John Garry
  0 siblings, 0 replies; 36+ messages in thread
From: John Garry @ 2025-06-18  6:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Darrick J. Wong, linux-xfs

On 18/06/2025 06:10, Christoph Hellwig wrote:
> On Tue, Jun 17, 2025 at 01:02:16PM +0100, John Garry wrote:
>> On 17/06/2025 11:52, Christoph Hellwig wrote:
>>> The extra bdev_ is weird, so drop it.  The maximum size is based on the
>>> bdev hardware limits, so add a hw_ component instead.
>> but the min is also based on hw limits, no?
> Yes.
> 
>> I also note that we have request queue limits atomic_write_unit_max and
>> atomic_write_hw_unit_max
>>
>> and bt_awu_max_hw is written with request queue limit atomic_write_unit_max
>>
>> But I don't think that this will cause confusion.
> Should we switch to the request_queue names instead of the nvme
> spec names here entirely?
I think what you suggest is fine (with bt_awu_{min,max}_hw). It's 
concise, while the request_queue names are a bit wordy. Maybe even the 
_hw can be dropped (from bt_awu_{min,max}_hw) and a comment added to 
their definition in xfs_buftarg.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max
  2025-06-18  5:08     ` Christoph Hellwig
@ 2025-06-18  6:28       ` John Garry
  2025-06-24 14:09         ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: John Garry @ 2025-06-18  6:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Darrick J. Wong, linux-xfs

On 18/06/2025 06:08, Christoph Hellwig wrote:
>>> -	struct xfs_groups	*rgs = &mp->m_groups[XG_TYPE_RTG];
>>> +	struct xfs_groups	*g = &mp->m_groups[type];
>>> +	struct xfs_buftarg	*btp = type == XG_TYPE_RTG ?
>>> +			mp->m_rtdev_targp : mp->m_ddev_targp;
>> Could this be made a bit more readable?
> Suggestions welcome.

I thought that you did not like the ternary operator :)

Using an if-else would bloat the code, so I suppose what you have is ok.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-17 10:52 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
  2025-06-17 12:15   ` John Garry
  2025-06-17 23:51   ` Dave Chinner
@ 2025-06-18  8:09   ` kernel test robot
  2 siblings, 0 replies; 36+ messages in thread
From: kernel test robot @ 2025-06-18  8:09 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino
  Cc: oe-kbuild-all, John Garry, Darrick J. Wong, linux-xfs

Hi Christoph,

kernel test robot noticed the following build warnings:

[auto build test WARNING on xfs-linux/for-next]
[also build test WARNING on linus/master v6.16-rc2 next-20250617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Christoph-Hellwig/xfs-remove-the-call-to-sync_blockdev-in-xfs_configure_buftarg/20250617-195226
base:   https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git for-next
patch link:    https://lore.kernel.org/r/20250617105238.3393499-8-hch%40lst.de
patch subject: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
config: s390-randconfig-r072-20250618 (https://download.01.org/0day-ci/archive/20250618/202506181720.cz0T8fSK-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250618/202506181720.cz0T8fSK-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506181720.cz0T8fSK-lkp@intel.com/

All warnings (new ones prefixed by >>):

   fs/xfs/xfs_buf.c: In function 'xfs_buf_map_verify':
>> fs/xfs/xfs_buf.c:390:16: warning: unused variable 'sectsize' [-Wunused-variable]
     unsigned int  sectsize = btp->bt_mount->m_sb.sb_sectsize;
                   ^~~~~~~~


vim +/sectsize +390 fs/xfs/xfs_buf.c

   384	
   385	static int
   386	xfs_buf_map_verify(
   387		struct xfs_buftarg	*btp,
   388		struct xfs_buf_map	*map)
   389	{
 > 390		unsigned int		sectsize = btp->bt_mount->m_sb.sb_sectsize;
   391		xfs_daddr_t		eofs;
   392	
   393		/* Check for IOs smaller than the sector size / not sector aligned */
   394		ASSERT(!(BBTOB(map->bm_len) < sectsize));
   395		ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)(sectsize - 1)));
   396	
   397		/*
   398		 * Corrupted block numbers can get through to here, unfortunately, so we
   399		 * have to check that the buffer falls within the filesystem bounds.
   400		 */
   401		eofs = XFS_FSB_TO_BB(btp->bt_mount, btp->bt_mount->m_sb.sb_dblocks);
   402		if (map->bm_bn < 0 || map->bm_bn >= eofs) {
   403			xfs_alert(btp->bt_mount,
   404				  "%s: daddr 0x%llx out of range, EOFS 0x%llx",
   405				  __func__, map->bm_bn, eofs);
   406			WARN_ON(1);
   407			return -EFSCORRUPTED;
   408		}
   409		return 0;
   410	}
   411	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-18  5:15     ` Christoph Hellwig
@ 2025-06-19  2:42       ` Dave Chinner
  2025-06-24 14:11         ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Dave Chinner @ 2025-06-19  2:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, Darrick J. Wong, linux-xfs

On Wed, Jun 18, 2025 at 07:15:09AM +0200, Christoph Hellwig wrote:
> On Wed, Jun 18, 2025 at 09:51:10AM +1000, Dave Chinner wrote:
> > On Tue, Jun 17, 2025 at 12:52:05PM +0200, Christoph Hellwig wrote:
> > > The file system only has a single file system sector size.
> > 
> > The external log device can have a different sector size to
> > the rest of the filesystem. This series looks like it removes the
> > ability to validate that the log device sector size in teh
> > superblock is valid for the backing device....
> 
> I don't follow.  Do you mean it remove the future possibility to do this?

No, I mean that this:

# mkfs.xfs -l sectsize=512,logdev=/dev/nvme1n1 -d sectsize=4k ....  /dev/nvme0n1

is an valid filesystem configuration and has been for a long, long
time. i.e. the logdev does not have to have the same physical sector
size support as the data device.

If the above filesystem was moved to new devices where the external
log device also had a minimum LBA sector size of 4kB, that
filesystem must not mount. The 512 byte sector size for the journal
means journal IO is aligned and rounded to 512 byte boundaries, not
4kB boundaries like the new underlying device requires. IOWs, that
fs config is unusable on that new device config.

This sort of sector size mismatch is currently caught by the
xfs_setup_devices() -> xfs_configure_buftarg() ->
bdev_validate_blocksize() path. That validation path is what you are
removing in this series, so you are introducing regressions in device
sector size validation during mount....

> Even then it would be better to do this directly based off the superblock
> and not use a field in the buftarg currently only used for cached buffers
> (which aren't used on anything but the main device).

IDGI. The sector size passed to xfs_configure_buftarg() by
xfs_setup_devices() for the bdev LBA size check comes directly from
the superblock. You're advocating that we do exactly what the code
you are removing already does....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
  2025-06-17 12:09   ` John Garry
@ 2025-06-24 14:07     ` Christoph Hellwig
  2025-06-24 14:46       ` John Garry
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-24 14:07 UTC (permalink / raw)
  To: John Garry; +Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Tue, Jun 17, 2025 at 01:09:43PM +0100, John Garry wrote:
>> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
>> index 8af83bd161f9..91647a43e1b2 100644
>> --- a/fs/xfs/xfs_buf.c
>> +++ b/fs/xfs/xfs_buf.c
>> @@ -1744,8 +1744,7 @@ xfs_configure_buftarg(
>>   	 */
>>   	if (bdev_can_atomic_write(btp->bt_bdev))
>>   		xfs_configure_buftarg_atomic_writes(btp);
>> -
>> -	return sync_blockdev(btp->bt_bdev);
>> +	return 0;
>
> we only ever return 0 now, so we can get rid of the return code

The call to bdev_validate_blocksize above the diff context can still
return an error.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max
  2025-06-18  6:28       ` John Garry
@ 2025-06-24 14:09         ` Christoph Hellwig
  2025-06-24 15:03           ` John Garry
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-24 14:09 UTC (permalink / raw)
  To: John Garry; +Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Wed, Jun 18, 2025 at 07:28:19AM +0100, John Garry wrote:
> On 18/06/2025 06:08, Christoph Hellwig wrote:
>>>> -	struct xfs_groups	*rgs = &mp->m_groups[XG_TYPE_RTG];
>>>> +	struct xfs_groups	*g = &mp->m_groups[type];
>>>> +	struct xfs_buftarg	*btp = type == XG_TYPE_RTG ?
>>>> +			mp->m_rtdev_targp : mp->m_ddev_targp;
>>> Could this be made a bit more readable?
>> Suggestions welcome.
>
> I thought that you did not like the ternary operator :)
>
> Using an if-else would bloat the code, so I suppose what you have is ok.

Initializing variables at declaration time is one of the few sane-ish
use cases for it.  Although maybe a little helpers might be even better.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-19  2:42       ` Dave Chinner
@ 2025-06-24 14:11         ` Christoph Hellwig
  2025-06-24 23:55           ` Dave Chinner
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-24 14:11 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Carlos Maiolino, John Garry, Darrick J. Wong,
	linux-xfs

On Thu, Jun 19, 2025 at 12:42:39PM +1000, Dave Chinner wrote:
> > > The external log device can have a different sector size to
> > > the rest of the filesystem. This series looks like it removes the
> > > ability to validate that the log device sector size in teh
> > > superblock is valid for the backing device....
> > 
> > I don't follow.  Do you mean it remove the future possibility to do this?
> 
> No, I mean that this:
> 
> # mkfs.xfs -l sectsize=512,logdev=/dev/nvme1n1 -d sectsize=4k ....  /dev/nvme0n1
> 
> is an valid filesystem configuration and has been for a long, long
> time. i.e. the logdev does not have to have the same physical sector
> size support as the data device.

Sure, and I've never disagreed.  But you'd not explained how that is
relevant for this patch.  The bt_meta_sectorsize is only used for
asserting the alignment of cached buffers, and we place no buffers
(cached or uncached) on the log device ever.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
  2025-06-24 14:07     ` Christoph Hellwig
@ 2025-06-24 14:46       ` John Garry
  0 siblings, 0 replies; 36+ messages in thread
From: John Garry @ 2025-06-24 14:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Darrick J. Wong, linux-xfs

On 24/06/2025 15:07, Christoph Hellwig wrote:
> On Tue, Jun 17, 2025 at 01:09:43PM +0100, John Garry wrote:
>>> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
>>> index 8af83bd161f9..91647a43e1b2 100644
>>> --- a/fs/xfs/xfs_buf.c
>>> +++ b/fs/xfs/xfs_buf.c
>>> @@ -1744,8 +1744,7 @@ xfs_configure_buftarg(
>>>    	 */
>>>    	if (bdev_can_atomic_write(btp->bt_bdev))
>>>    		xfs_configure_buftarg_atomic_writes(btp);
>>> -
>>> -	return sync_blockdev(btp->bt_bdev);
>>> +	return 0;
>>
>> we only ever return 0 now, so we can get rid of the return code
> 
> The call to bdev_validate_blocksize above the diff context can still
> return an error.
> 

Right, but I think that my comment would be correct after the next 
patch. I just noticed this when I applied your series, but commented on 
the incorrect patch.

But I also did mention that maybe we can get rid of 
xfs_configure_buftarg() since it is effectively a wrapper now. The 
ASSERT (call in that function) looks pointless to me.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max
  2025-06-24 14:09         ` Christoph Hellwig
@ 2025-06-24 15:03           ` John Garry
  0 siblings, 0 replies; 36+ messages in thread
From: John Garry @ 2025-06-24 15:03 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Darrick J. Wong, linux-xfs

On 24/06/2025 15:09, Christoph Hellwig wrote:
> On Wed, Jun 18, 2025 at 07:28:19AM +0100, John Garry wrote:
>> On 18/06/2025 06:08, Christoph Hellwig wrote:
>>>>> -	struct xfs_groups	*rgs = &mp->m_groups[XG_TYPE_RTG];
>>>>> +	struct xfs_groups	*g = &mp->m_groups[type];
>>>>> +	struct xfs_buftarg	*btp = type == XG_TYPE_RTG ?
>>>>> +			mp->m_rtdev_targp : mp->m_ddev_targp;
>>>> Could this be made a bit more readable?
>>> Suggestions welcome.
>>
>> I thought that you did not like the ternary operator :)
>>
>> Using an if-else would bloat the code, so I suppose what you have is ok.
> 
> Initializing variables at declaration time is one of the few sane-ish
> use cases for it.  Although maybe a little helpers might be even better.

As you prob noticed, xfs_dax_notify_dev_failure() and (effectively) 
xfs_group_bdev() has this same pattern.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-24 14:11         ` Christoph Hellwig
@ 2025-06-24 23:55           ` Dave Chinner
  2025-06-25  6:23             ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Dave Chinner @ 2025-06-24 23:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, Darrick J. Wong, linux-xfs

On Tue, Jun 24, 2025 at 04:11:17PM +0200, Christoph Hellwig wrote:
> On Thu, Jun 19, 2025 at 12:42:39PM +1000, Dave Chinner wrote:
> > > > The external log device can have a different sector size to
> > > > the rest of the filesystem. This series looks like it removes the
> > > > ability to validate that the log device sector size in teh
> > > > superblock is valid for the backing device....
> > > 
> > > I don't follow.  Do you mean it remove the future possibility to do this?
> > 
> > No, I mean that this:
> > 
> > # mkfs.xfs -l sectsize=512,logdev=/dev/nvme1n1 -d sectsize=4k ....  /dev/nvme0n1
> > 
> > is an valid filesystem configuration and has been for a long, long
> > time. i.e. the logdev does not have to have the same physical sector
> > size support as the data device.
> 
> Sure, and I've never disagreed.  But you'd not explained how that is
> relevant for this patch.  The bt_meta_sectorsize is only used for
> asserting the alignment of cached buffers, and we place no buffers
> (cached or uncached) on the log device ever.

Again: I am not talking about the bt_meta_sectorsize removal being a
problem!

I will repeat it once more: this patchset removes the check that
guarantees the the underlying block device has a sector size that is
valid for the sector size the filesystem devices are configured to
use. That is not acceptible - a 512 byte sector filesystem device
must not be able to mount on a hard 4kB sector device because the
moment we do a 512 byte aligned IO to the log device, the bdev will
give an EIO error and we'll shut down the filesystem.

We check for sector size mismatches at device open time so we don't
get into this nasty situation. You are removing those sector size
checks in this patch set. This is an obvious functional regression,
and that is the part of this change that I'm NACKing. 

Christoph, I explained this in the email you are replying to, and
you have simply ignored it all. Please at least read the entire
email before you respond this time!

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
  2025-06-24 23:55           ` Dave Chinner
@ 2025-06-25  6:23             ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2025-06-25  6:23 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Carlos Maiolino, John Garry, Darrick J. Wong,
	linux-xfs

On Wed, Jun 25, 2025 at 09:55:28AM +1000, Dave Chinner wrote:
> Again: I am not talking about the bt_meta_sectorsize removal being a
> problem!

Why do you comment on the patch doing that then?

> I will repeat it once more: this patchset removes the check that
> guarantees the the underlying block device has a sector size that is
> valid for the sector size the filesystem devices are configured to
> use. That is not acceptible - a 512 byte sector filesystem device
> must not be able to mount on a hard 4kB sector device because the
> moment we do a 512 byte aligned IO to the log device, the bdev will
> give an EIO error and we'll shut down the filesystem.

Yes, and I've already agree to not drop that check after you initially
pointed that out.

So I'm really confused on what you are trying to comment on for this
patch.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
  2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
@ 2025-07-01 10:40 ` Christoph Hellwig
  2025-07-01 10:53   ` John Garry
  2025-07-01 11:03   ` John Garry
  0 siblings, 2 replies; 36+ messages in thread
From: Christoph Hellwig @ 2025-07-01 10:40 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs

This extra call is no needed as xfs_alloc_buftarg already calls
sync_blockdev.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index ba5bd6031ece..7a05310da895 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1744,8 +1744,7 @@ xfs_configure_buftarg(
 	 */
 	if (bdev_can_atomic_write(btp->bt_bdev))
 		xfs_configure_buftarg_atomic_writes(btp);
-
-	return sync_blockdev(btp->bt_bdev);
+	return 0;
 }
 
 int
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
  2025-07-01 10:40 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
@ 2025-07-01 10:53   ` John Garry
  2025-07-01 11:03   ` John Garry
  1 sibling, 0 replies; 36+ messages in thread
From: John Garry @ 2025-07-01 10:53 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs

On 01/07/2025 11:40, Christoph Hellwig wrote:
> This extra call is no needed as xfs_alloc_buftarg already calls

not needed

> sync_blockdev.
> 
> Signed-off-by: Christoph Hellwig<hch@lst.de>

seems fine, so FWIW:

Reviewed-by: John Garry <john.g.garry@oracle.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
  2025-07-01 10:40 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
  2025-07-01 10:53   ` John Garry
@ 2025-07-01 11:03   ` John Garry
  2025-07-03 13:42     ` Christoph Hellwig
  1 sibling, 1 reply; 36+ messages in thread
From: John Garry @ 2025-07-01 11:03 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs

On 01/07/2025 11:40, Christoph Hellwig wrote:
> This extra call is no needed as xfs_alloc_buftarg already calls
> sync_blockdev.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   fs/xfs/xfs_buf.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index ba5bd6031ece..7a05310da895 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1744,8 +1744,7 @@ xfs_configure_buftarg(
>   	 */
>   	if (bdev_can_atomic_write(btp->bt_bdev))
>   		xfs_configure_buftarg_atomic_writes(btp);
> -
> -	return sync_blockdev(btp->bt_bdev);
> +	return 0;
>   }
>   
>   int

BTW, the comment would need to be removed as well:

	/*
	 * Flush the block device pagecache so our bios see anything dirtied
	 * before mount.
	 */
	if (bdev_can_atomic_write(btp->bt_bdev))
		xfs_configure_buftarg_atomic_writes(btp);

	return sync_blockdev(btp->bt_bdev);
}

This all seems to have come from 6e7d71b3a0, which is a strange 
resolution to me.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb
  2025-06-17 10:51 ` [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb Christoph Hellwig
@ 2025-07-01 14:55   ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2025-07-01 14:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, linux-xfs

On Tue, Jun 17, 2025 at 12:51:59PM +0200, Christoph Hellwig wrote:
> The initial sb read is always for a device logical block size
> buffer.  The device logical block size is provided in the
> bt_logical_sectorsize in struct buftarg, so use that instead of the
> confusingly named xfs_getsize_buftarg buffer that reads it from the bdev.
> 
> Update the comments surrounding the code to better describe what is going
> on.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_buf.h   |  1 -
>  fs/xfs/xfs_mount.c | 21 +++++++++++----------
>  2 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index 9d2ab567cf81..294dd9d61dbb 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -376,7 +376,6 @@ extern void xfs_buftarg_wait(struct xfs_buftarg *);
>  extern void xfs_buftarg_drain(struct xfs_buftarg *);
>  int xfs_configure_buftarg(struct xfs_buftarg *btp, unsigned int sectorsize);
>  
> -#define xfs_getsize_buftarg(buftarg)	block_size((buftarg)->bt_bdev)
>  #define xfs_readonly_buftarg(buftarg)	bdev_read_only((buftarg)->bt_bdev)
>  
>  int xfs_buf_reverify(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 29276fe60df9..047100b080aa 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -171,19 +171,16 @@ xfs_readsb(
>  	ASSERT(mp->m_ddev_targp != NULL);
>  
>  	/*
> -	 * For the initial read, we must guess at the sector
> -	 * size based on the block device.  It's enough to
> -	 * get the sb_sectsize out of the superblock and
> -	 * then reread with the proper length.
> -	 * We don't verify it yet, because it may not be complete.
> +	 * In the first pass, use the device sector size to just read enough
> +	 * of the superblock to extract the XFS sector size.
> +	 *
> +	 * The device sector size must be smaller than or equal to the XFS
> +	 * sector size and thus we can always read the superblock.  Once we know
> +	 * the XFS sector size, re-read it and run the buffer verifier.
>  	 */
> -	sector_size = xfs_getsize_buftarg(mp->m_ddev_targp);
> +	sector_size = mp->m_ddev_targp->bt_logical_sectorsize;
>  	buf_ops = NULL;
>  
> -	/*
> -	 * Allocate a (locked) buffer to hold the superblock. This will be kept
> -	 * around at all times to optimize access to the superblock.
> -	 */
>  reread:
>  	error = xfs_buf_read_uncached(mp->m_ddev_targp, XFS_SB_DADDR,
>  				      BTOBB(sector_size), &bp, buf_ops);
> @@ -247,6 +244,10 @@ xfs_readsb(
>  	/* no need to be quiet anymore, so reset the buf ops */
>  	bp->b_ops = &xfs_sb_buf_ops;
>  
> +	/*
> +	 * Keep a pointer of the sb buffer around instead of caching it in the
> +	 * buffer cache because we access it frequently.
> +	 */
>  	mp->m_sb_bp = bp;
>  	xfs_buf_unlock(bp);
>  	return 0;
> -- 
> 2.47.2
> 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
  2025-07-01 11:03   ` John Garry
@ 2025-07-03 13:42     ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2025-07-03 13:42 UTC (permalink / raw)
  To: John Garry; +Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Tue, Jul 01, 2025 at 12:03:53PM +0100, John Garry wrote:
> BTW, the comment would need to be removed as well:
>
> 	/*
> 	 * Flush the block device pagecache so our bios see anything dirtied
> 	 * before mount.
> 	 */
> 	if (bdev_can_atomic_write(btp->bt_bdev))
> 		xfs_configure_buftarg_atomic_writes(btp);
>
> 	return sync_blockdev(btp->bt_bdev);
> }
>
> This all seems to have come from 6e7d71b3a0, which is a strange resolution 
> to me.

Yes, including having the comment in the wrong place.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2025-07-03 13:42 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-17 10:51 misc cleanups Christoph Hellwig
2025-06-17 10:51 ` [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb Christoph Hellwig
2025-07-01 14:55   ` Darrick J. Wong
2025-06-17 10:52 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
2025-06-17 12:09   ` John Garry
2025-06-24 14:07     ` Christoph Hellwig
2025-06-24 14:46       ` John Garry
2025-06-17 10:52 ` [PATCH 3/7] xfs: remove the call to bdev_validate_blocksize " Christoph Hellwig
2025-06-17 23:40   ` Dave Chinner
2025-06-18  5:05     ` Christoph Hellwig
2025-06-17 10:52 ` [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max Christoph Hellwig
2025-06-17 11:44   ` John Garry
2025-06-18  5:08     ` Christoph Hellwig
2025-06-18  6:28       ` John Garry
2025-06-24 14:09         ` Christoph Hellwig
2025-06-24 15:03           ` John Garry
2025-06-17 10:52 ` [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields Christoph Hellwig
2025-06-17 12:02   ` John Garry
2025-06-18  5:10     ` Christoph Hellwig
2025-06-18  6:23       ` John Garry
2025-06-17 10:52 ` [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field Christoph Hellwig
2025-06-17 10:52 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
2025-06-17 12:15   ` John Garry
2025-06-17 12:21     ` John Garry
2025-06-18  5:11       ` Christoph Hellwig
2025-06-17 23:51   ` Dave Chinner
2025-06-18  5:15     ` Christoph Hellwig
2025-06-19  2:42       ` Dave Chinner
2025-06-24 14:11         ` Christoph Hellwig
2025-06-24 23:55           ` Dave Chinner
2025-06-25  6:23             ` Christoph Hellwig
2025-06-18  8:09   ` kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
2025-07-01 10:40 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
2025-07-01 10:53   ` John Garry
2025-07-01 11:03   ` John Garry
2025-07-03 13:42     ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).