* [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb
2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
@ 2025-07-01 10:40 ` Christoph Hellwig
2025-07-01 16:43 ` Darrick J. Wong
2025-07-01 10:40 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
` (5 subsequent siblings)
6 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-01 10:40 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs
The initial sb read is always for a device logical block size
buffer. The device logical block size is provided in the
bt_logical_sectorsize in struct buftarg, so use that instead of the
confusingly named xfs_getsize_buftarg buffer that reads it from the bdev.
Update the comments surrounding the code to better describe what is going
on.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_buf.h | 1 -
fs/xfs/xfs_mount.c | 21 +++++++++++----------
2 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 15fc56948346..73a9686110e8 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -375,7 +375,6 @@ extern void xfs_buftarg_wait(struct xfs_buftarg *);
extern void xfs_buftarg_drain(struct xfs_buftarg *);
int xfs_configure_buftarg(struct xfs_buftarg *btp, unsigned int sectorsize);
-#define xfs_getsize_buftarg(buftarg) block_size((buftarg)->bt_bdev)
#define xfs_readonly_buftarg(buftarg) bdev_read_only((buftarg)->bt_bdev)
int xfs_buf_reverify(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 29276fe60df9..047100b080aa 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -171,19 +171,16 @@ xfs_readsb(
ASSERT(mp->m_ddev_targp != NULL);
/*
- * For the initial read, we must guess at the sector
- * size based on the block device. It's enough to
- * get the sb_sectsize out of the superblock and
- * then reread with the proper length.
- * We don't verify it yet, because it may not be complete.
+ * In the first pass, use the device sector size to just read enough
+ * of the superblock to extract the XFS sector size.
+ *
+ * The device sector size must be smaller than or equal to the XFS
+ * sector size and thus we can always read the superblock. Once we know
+ * the XFS sector size, re-read it and run the buffer verifier.
*/
- sector_size = xfs_getsize_buftarg(mp->m_ddev_targp);
+ sector_size = mp->m_ddev_targp->bt_logical_sectorsize;
buf_ops = NULL;
- /*
- * Allocate a (locked) buffer to hold the superblock. This will be kept
- * around at all times to optimize access to the superblock.
- */
reread:
error = xfs_buf_read_uncached(mp->m_ddev_targp, XFS_SB_DADDR,
BTOBB(sector_size), &bp, buf_ops);
@@ -247,6 +244,10 @@ xfs_readsb(
/* no need to be quiet anymore, so reset the buf ops */
bp->b_ops = &xfs_sb_buf_ops;
+ /*
+ * Keep a pointer of the sb buffer around instead of caching it in the
+ * buffer cache because we access it frequently.
+ */
mp->m_sb_bp = bp;
xfs_buf_unlock(bp);
return 0;
--
2.47.2
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb
2025-07-01 10:40 ` [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb Christoph Hellwig
@ 2025-07-01 16:43 ` Darrick J. Wong
0 siblings, 0 replies; 33+ messages in thread
From: Darrick J. Wong @ 2025-07-01 16:43 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, linux-xfs
On Tue, Jul 01, 2025 at 12:40:35PM +0200, Christoph Hellwig wrote:
> The initial sb read is always for a device logical block size
> buffer. The device logical block size is provided in the
> bt_logical_sectorsize in struct buftarg, so use that instead of the
> confusingly named xfs_getsize_buftarg buffer that reads it from the bdev.
>
> Update the comments surrounding the code to better describe what is going
> on.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_buf.h | 1 -
> fs/xfs/xfs_mount.c | 21 +++++++++++----------
> 2 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index 15fc56948346..73a9686110e8 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -375,7 +375,6 @@ extern void xfs_buftarg_wait(struct xfs_buftarg *);
> extern void xfs_buftarg_drain(struct xfs_buftarg *);
> int xfs_configure_buftarg(struct xfs_buftarg *btp, unsigned int sectorsize);
>
> -#define xfs_getsize_buftarg(buftarg) block_size((buftarg)->bt_bdev)
> #define xfs_readonly_buftarg(buftarg) bdev_read_only((buftarg)->bt_bdev)
>
> int xfs_buf_reverify(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 29276fe60df9..047100b080aa 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -171,19 +171,16 @@ xfs_readsb(
> ASSERT(mp->m_ddev_targp != NULL);
>
> /*
> - * For the initial read, we must guess at the sector
> - * size based on the block device. It's enough to
> - * get the sb_sectsize out of the superblock and
> - * then reread with the proper length.
> - * We don't verify it yet, because it may not be complete.
> + * In the first pass, use the device sector size to just read enough
> + * of the superblock to extract the XFS sector size.
> + *
> + * The device sector size must be smaller than or equal to the XFS
> + * sector size and thus we can always read the superblock. Once we know
> + * the XFS sector size, re-read it and run the buffer verifier.
> */
> - sector_size = xfs_getsize_buftarg(mp->m_ddev_targp);
> + sector_size = mp->m_ddev_targp->bt_logical_sectorsize;
> buf_ops = NULL;
>
> - /*
> - * Allocate a (locked) buffer to hold the superblock. This will be kept
> - * around at all times to optimize access to the superblock.
> - */
> reread:
> error = xfs_buf_read_uncached(mp->m_ddev_targp, XFS_SB_DADDR,
> BTOBB(sector_size), &bp, buf_ops);
> @@ -247,6 +244,10 @@ xfs_readsb(
> /* no need to be quiet anymore, so reset the buf ops */
> bp->b_ops = &xfs_sb_buf_ops;
>
> + /*
> + * Keep a pointer of the sb buffer around instead of caching it in the
> + * buffer cache because we access it frequently.
> + */
> mp->m_sb_bp = bp;
> xfs_buf_unlock(bp);
> return 0;
> --
> 2.47.2
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
2025-07-01 10:40 ` [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb Christoph Hellwig
@ 2025-07-01 10:40 ` Christoph Hellwig
2025-07-01 10:53 ` John Garry
2025-07-01 11:03 ` John Garry
2025-07-01 10:40 ` [PATCH 3/7] xfs: add a xfs_group_type_buftarg helper Christoph Hellwig
` (4 subsequent siblings)
6 siblings, 2 replies; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-01 10:40 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs
This extra call is no needed as xfs_alloc_buftarg already calls
sync_blockdev.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_buf.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index ba5bd6031ece..7a05310da895 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1744,8 +1744,7 @@ xfs_configure_buftarg(
*/
if (bdev_can_atomic_write(btp->bt_bdev))
xfs_configure_buftarg_atomic_writes(btp);
-
- return sync_blockdev(btp->bt_bdev);
+ return 0;
}
int
--
2.47.2
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
2025-07-01 10:40 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
@ 2025-07-01 10:53 ` John Garry
2025-07-01 11:03 ` John Garry
1 sibling, 0 replies; 33+ messages in thread
From: John Garry @ 2025-07-01 10:53 UTC (permalink / raw)
To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs
On 01/07/2025 11:40, Christoph Hellwig wrote:
> This extra call is no needed as xfs_alloc_buftarg already calls
not needed
> sync_blockdev.
>
> Signed-off-by: Christoph Hellwig<hch@lst.de>
seems fine, so FWIW:
Reviewed-by: John Garry <john.g.garry@oracle.com>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
2025-07-01 10:40 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
2025-07-01 10:53 ` John Garry
@ 2025-07-01 11:03 ` John Garry
2025-07-03 13:42 ` Christoph Hellwig
1 sibling, 1 reply; 33+ messages in thread
From: John Garry @ 2025-07-01 11:03 UTC (permalink / raw)
To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs
On 01/07/2025 11:40, Christoph Hellwig wrote:
> This extra call is no needed as xfs_alloc_buftarg already calls
> sync_blockdev.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_buf.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index ba5bd6031ece..7a05310da895 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1744,8 +1744,7 @@ xfs_configure_buftarg(
> */
> if (bdev_can_atomic_write(btp->bt_bdev))
> xfs_configure_buftarg_atomic_writes(btp);
> -
> - return sync_blockdev(btp->bt_bdev);
> + return 0;
> }
>
> int
BTW, the comment would need to be removed as well:
/*
* Flush the block device pagecache so our bios see anything dirtied
* before mount.
*/
if (bdev_can_atomic_write(btp->bt_bdev))
xfs_configure_buftarg_atomic_writes(btp);
return sync_blockdev(btp->bt_bdev);
}
This all seems to have come from 6e7d71b3a0, which is a strange
resolution to me.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg
2025-07-01 11:03 ` John Garry
@ 2025-07-03 13:42 ` Christoph Hellwig
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-03 13:42 UTC (permalink / raw)
To: John Garry; +Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs
On Tue, Jul 01, 2025 at 12:03:53PM +0100, John Garry wrote:
> BTW, the comment would need to be removed as well:
>
> /*
> * Flush the block device pagecache so our bios see anything dirtied
> * before mount.
> */
> if (bdev_can_atomic_write(btp->bt_bdev))
> xfs_configure_buftarg_atomic_writes(btp);
>
> return sync_blockdev(btp->bt_bdev);
> }
>
> This all seems to have come from 6e7d71b3a0, which is a strange resolution
> to me.
Yes, including having the comment in the wrong place.
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 3/7] xfs: add a xfs_group_type_buftarg helper
2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
2025-07-01 10:40 ` [PATCH 1/7] xfs: clean up the initial read logic in xfs_readsb Christoph Hellwig
2025-07-01 10:40 ` [PATCH 2/7] xfs: remove the call to sync_blockdev in xfs_configure_buftarg Christoph Hellwig
@ 2025-07-01 10:40 ` Christoph Hellwig
2025-07-01 11:05 ` John Garry
2025-07-01 16:43 ` Darrick J. Wong
2025-07-01 10:40 ` [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max Christoph Hellwig
` (3 subsequent siblings)
6 siblings, 2 replies; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-01 10:40 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs
Generalize the xfs_group_type helper in the discard code to return a buftarg
and move it to xfs_mount.h, and use the result in xfs_dax_notify_dev_failure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_discard.c | 29 +++++++----------------------
fs/xfs/xfs_mount.h | 17 +++++++++++++++++
fs/xfs/xfs_notify_failure.c | 3 +--
3 files changed, 25 insertions(+), 24 deletions(-)
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index 94d0873bcd62..603d51365645 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -103,24 +103,6 @@ xfs_discard_endio(
bio_put(bio);
}
-static inline struct block_device *
-xfs_group_bdev(
- const struct xfs_group *xg)
-{
- struct xfs_mount *mp = xg->xg_mount;
-
- switch (xg->xg_type) {
- case XG_TYPE_AG:
- return mp->m_ddev_targp->bt_bdev;
- case XG_TYPE_RTG:
- return mp->m_rtdev_targp->bt_bdev;
- default:
- ASSERT(0);
- break;
- }
- return NULL;
-}
-
/*
* Walk the discard list and issue discards on all the busy extents in the
* list. We plug and chain the bios so that we only need a single completion
@@ -138,11 +120,14 @@ xfs_discard_extents(
blk_start_plug(&plug);
list_for_each_entry(busyp, &extents->extent_list, list) {
- trace_xfs_discard_extent(busyp->group, busyp->bno,
- busyp->length);
+ struct xfs_group *xg = busyp->group;
+ struct xfs_buftarg *btp =
+ xfs_group_type_buftarg(xg->xg_mount, xg->xg_type);
+
+ trace_xfs_discard_extent(xg, busyp->bno, busyp->length);
- error = __blkdev_issue_discard(xfs_group_bdev(busyp->group),
- xfs_gbno_to_daddr(busyp->group, busyp->bno),
+ error = __blkdev_issue_discard(btp->bt_bdev,
+ xfs_gbno_to_daddr(xg, busyp->bno),
XFS_FSB_TO_BB(mp, busyp->length),
GFP_KERNEL, &bio);
if (error && error != -EOPNOTSUPP) {
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index d85084f9f317..97de44c32272 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -802,4 +802,21 @@ static inline void xfs_mod_sb_delalloc(struct xfs_mount *mp, int64_t delta)
int xfs_set_max_atomic_write_opt(struct xfs_mount *mp,
unsigned long long new_max_bytes);
+static inline struct xfs_buftarg *
+xfs_group_type_buftarg(
+ struct xfs_mount *mp,
+ enum xfs_group_type type)
+{
+ switch (type) {
+ case XG_TYPE_AG:
+ return mp->m_ddev_targp;
+ case XG_TYPE_RTG:
+ return mp->m_rtdev_targp;
+ default:
+ ASSERT(0);
+ break;
+ }
+ return NULL;
+}
+
#endif /* __XFS_MOUNT_H__ */
diff --git a/fs/xfs/xfs_notify_failure.c b/fs/xfs/xfs_notify_failure.c
index 3545dc1d953c..42e9c72b85c0 100644
--- a/fs/xfs/xfs_notify_failure.c
+++ b/fs/xfs/xfs_notify_failure.c
@@ -253,8 +253,7 @@ xfs_dax_notify_dev_failure(
return -EOPNOTSUPP;
}
- error = xfs_dax_translate_range(type == XG_TYPE_RTG ?
- mp->m_rtdev_targp : mp->m_ddev_targp,
+ error = xfs_dax_translate_range(xfs_group_type_buftarg(mp, type),
offset, len, &daddr, &bblen);
if (error)
return error;
--
2.47.2
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 3/7] xfs: add a xfs_group_type_buftarg helper
2025-07-01 10:40 ` [PATCH 3/7] xfs: add a xfs_group_type_buftarg helper Christoph Hellwig
@ 2025-07-01 11:05 ` John Garry
2025-07-01 16:43 ` Darrick J. Wong
1 sibling, 0 replies; 33+ messages in thread
From: John Garry @ 2025-07-01 11:05 UTC (permalink / raw)
To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs
On 01/07/2025 11:40, Christoph Hellwig wrote:
> Generalize the xfs_group_type helper in the discard code to return a buftarg
> and move it to xfs_mount.h, and use the result in xfs_dax_notify_dev_failure.
>
> Signed-off-by: Christoph Hellwig<hch@lst.de>
looks ok, so FWIW:
Reviewed-by: John Garry <john.g.garry@oracle.com>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 3/7] xfs: add a xfs_group_type_buftarg helper
2025-07-01 10:40 ` [PATCH 3/7] xfs: add a xfs_group_type_buftarg helper Christoph Hellwig
2025-07-01 11:05 ` John Garry
@ 2025-07-01 16:43 ` Darrick J. Wong
1 sibling, 0 replies; 33+ messages in thread
From: Darrick J. Wong @ 2025-07-01 16:43 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, linux-xfs
On Tue, Jul 01, 2025 at 12:40:37PM +0200, Christoph Hellwig wrote:
> Generalize the xfs_group_type helper in the discard code to return a buftarg
> and move it to xfs_mount.h, and use the result in xfs_dax_notify_dev_failure.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good to me,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_discard.c | 29 +++++++----------------------
> fs/xfs/xfs_mount.h | 17 +++++++++++++++++
> fs/xfs/xfs_notify_failure.c | 3 +--
> 3 files changed, 25 insertions(+), 24 deletions(-)
>
> diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
> index 94d0873bcd62..603d51365645 100644
> --- a/fs/xfs/xfs_discard.c
> +++ b/fs/xfs/xfs_discard.c
> @@ -103,24 +103,6 @@ xfs_discard_endio(
> bio_put(bio);
> }
>
> -static inline struct block_device *
> -xfs_group_bdev(
> - const struct xfs_group *xg)
> -{
> - struct xfs_mount *mp = xg->xg_mount;
> -
> - switch (xg->xg_type) {
> - case XG_TYPE_AG:
> - return mp->m_ddev_targp->bt_bdev;
> - case XG_TYPE_RTG:
> - return mp->m_rtdev_targp->bt_bdev;
> - default:
> - ASSERT(0);
> - break;
> - }
> - return NULL;
> -}
> -
> /*
> * Walk the discard list and issue discards on all the busy extents in the
> * list. We plug and chain the bios so that we only need a single completion
> @@ -138,11 +120,14 @@ xfs_discard_extents(
>
> blk_start_plug(&plug);
> list_for_each_entry(busyp, &extents->extent_list, list) {
> - trace_xfs_discard_extent(busyp->group, busyp->bno,
> - busyp->length);
> + struct xfs_group *xg = busyp->group;
> + struct xfs_buftarg *btp =
> + xfs_group_type_buftarg(xg->xg_mount, xg->xg_type);
> +
> + trace_xfs_discard_extent(xg, busyp->bno, busyp->length);
>
> - error = __blkdev_issue_discard(xfs_group_bdev(busyp->group),
> - xfs_gbno_to_daddr(busyp->group, busyp->bno),
> + error = __blkdev_issue_discard(btp->bt_bdev,
> + xfs_gbno_to_daddr(xg, busyp->bno),
> XFS_FSB_TO_BB(mp, busyp->length),
> GFP_KERNEL, &bio);
> if (error && error != -EOPNOTSUPP) {
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index d85084f9f317..97de44c32272 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -802,4 +802,21 @@ static inline void xfs_mod_sb_delalloc(struct xfs_mount *mp, int64_t delta)
> int xfs_set_max_atomic_write_opt(struct xfs_mount *mp,
> unsigned long long new_max_bytes);
>
> +static inline struct xfs_buftarg *
> +xfs_group_type_buftarg(
> + struct xfs_mount *mp,
> + enum xfs_group_type type)
> +{
> + switch (type) {
> + case XG_TYPE_AG:
> + return mp->m_ddev_targp;
> + case XG_TYPE_RTG:
> + return mp->m_rtdev_targp;
> + default:
> + ASSERT(0);
> + break;
> + }
> + return NULL;
> +}
> +
> #endif /* __XFS_MOUNT_H__ */
> diff --git a/fs/xfs/xfs_notify_failure.c b/fs/xfs/xfs_notify_failure.c
> index 3545dc1d953c..42e9c72b85c0 100644
> --- a/fs/xfs/xfs_notify_failure.c
> +++ b/fs/xfs/xfs_notify_failure.c
> @@ -253,8 +253,7 @@ xfs_dax_notify_dev_failure(
> return -EOPNOTSUPP;
> }
>
> - error = xfs_dax_translate_range(type == XG_TYPE_RTG ?
> - mp->m_rtdev_targp : mp->m_ddev_targp,
> + error = xfs_dax_translate_range(xfs_group_type_buftarg(mp, type),
> offset, len, &daddr, &bblen);
> if (error)
> return error;
> --
> 2.47.2
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max
2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
` (2 preceding siblings ...)
2025-07-01 10:40 ` [PATCH 3/7] xfs: add a xfs_group_type_buftarg helper Christoph Hellwig
@ 2025-07-01 10:40 ` Christoph Hellwig
2025-07-01 16:53 ` Darrick J. Wong
2025-07-01 10:40 ` [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields Christoph Hellwig
` (2 subsequent siblings)
6 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-01 10:40 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs
This function and the helpers used by it duplicate the same logic for AGs
and RTGs. Use the xfs_group_type enum to unify both variants.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
---
fs/xfs/xfs_mount.c | 76 +++++++++++++++++-----------------------------
fs/xfs/xfs_trace.h | 31 +++++++++----------
2 files changed, 42 insertions(+), 65 deletions(-)
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 047100b080aa..99fbb22bad4c 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -679,68 +679,46 @@ static inline unsigned int max_pow_of_two_factor(const unsigned int nr)
}
/*
- * If the data device advertises atomic write support, limit the size of data
- * device atomic writes to the greatest power-of-two factor of the AG size so
- * that every atomic write unit aligns with the start of every AG. This is
- * required so that the per-AG allocations for an atomic write will always be
+ * If the underlying device advertises atomic write support, limit the size of
+ * atomic writes to the greatest power-of-two factor of the group size so
+ * that every atomic write unit aligns with the start of every group. This is
+ * required so that the allocations for an atomic write will always be
* aligned compatibly with the alignment requirements of the storage.
*
- * If the data device doesn't advertise atomic writes, then there are no
- * alignment restrictions and the largest out-of-place write we can do
- * ourselves is the number of blocks that user files can allocate from any AG.
+ * If the device doesn't advertise atomic writes, then there are no alignment
+ * restrictions and the largest out-of-place write we can do ourselves is the
+ * number of blocks that user files can allocate from any group.
*/
-static inline xfs_extlen_t xfs_calc_perag_awu_max(struct xfs_mount *mp)
-{
- if (mp->m_ddev_targp->bt_bdev_awu_min > 0)
- return max_pow_of_two_factor(mp->m_sb.sb_agblocks);
- return rounddown_pow_of_two(mp->m_ag_max_usable);
-}
-
-/*
- * Reflink on the realtime device requires rtgroups, and atomic writes require
- * reflink.
- *
- * If the realtime device advertises atomic write support, limit the size of
- * data device atomic writes to the greatest power-of-two factor of the rtgroup
- * size so that every atomic write unit aligns with the start of every rtgroup.
- * This is required so that the per-rtgroup allocations for an atomic write
- * will always be aligned compatibly with the alignment requirements of the
- * storage.
- *
- * If the rt device doesn't advertise atomic writes, then there are no
- * alignment restrictions and the largest out-of-place write we can do
- * ourselves is the number of blocks that user files can allocate from any
- * rtgroup.
- */
-static inline xfs_extlen_t xfs_calc_rtgroup_awu_max(struct xfs_mount *mp)
+static xfs_extlen_t
+xfs_calc_group_awu_max(
+ struct xfs_mount *mp,
+ enum xfs_group_type type)
{
- struct xfs_groups *rgs = &mp->m_groups[XG_TYPE_RTG];
+ struct xfs_groups *g = &mp->m_groups[type];
+ struct xfs_buftarg *btp = xfs_group_type_buftarg(mp, type);
- if (rgs->blocks == 0)
+ if (g->blocks == 0)
return 0;
- if (mp->m_rtdev_targp && mp->m_rtdev_targp->bt_bdev_awu_min > 0)
- return max_pow_of_two_factor(rgs->blocks);
- return rounddown_pow_of_two(rgs->blocks);
+ if (btp && btp->bt_bdev_awu_min > 0)
+ return max_pow_of_two_factor(g->blocks);
+ return rounddown_pow_of_two(g->blocks);
}
/* Compute the maximum atomic write unit size for each section. */
static inline void
xfs_calc_atomic_write_unit_max(
- struct xfs_mount *mp)
+ struct xfs_mount *mp,
+ enum xfs_group_type type)
{
- struct xfs_groups *ags = &mp->m_groups[XG_TYPE_AG];
- struct xfs_groups *rgs = &mp->m_groups[XG_TYPE_RTG];
+ struct xfs_groups *g = &mp->m_groups[type];
const xfs_extlen_t max_write = xfs_calc_atomic_write_max(mp);
const xfs_extlen_t max_ioend = xfs_reflink_max_atomic_cow(mp);
- const xfs_extlen_t max_agsize = xfs_calc_perag_awu_max(mp);
- const xfs_extlen_t max_rgsize = xfs_calc_rtgroup_awu_max(mp);
-
- ags->awu_max = min3(max_write, max_ioend, max_agsize);
- rgs->awu_max = min3(max_write, max_ioend, max_rgsize);
+ const xfs_extlen_t max_gsize = xfs_calc_group_awu_max(mp, type);
- trace_xfs_calc_atomic_write_unit_max(mp, max_write, max_ioend,
- max_agsize, max_rgsize);
+ g->awu_max = min3(max_write, max_ioend, max_gsize);
+ trace_xfs_calc_atomic_write_unit_max(mp, type, max_write, max_ioend,
+ max_gsize, g->awu_max);
}
/*
@@ -758,7 +736,8 @@ xfs_set_max_atomic_write_opt(
max(mp->m_groups[XG_TYPE_AG].blocks,
mp->m_groups[XG_TYPE_RTG].blocks);
const xfs_extlen_t max_group_write =
- max(xfs_calc_perag_awu_max(mp), xfs_calc_rtgroup_awu_max(mp));
+ max(xfs_calc_group_awu_max(mp, XG_TYPE_AG),
+ xfs_calc_group_awu_max(mp, XG_TYPE_RTG));
int error;
if (new_max_bytes == 0)
@@ -814,7 +793,8 @@ xfs_set_max_atomic_write_opt(
return error;
}
- xfs_calc_atomic_write_unit_max(mp);
+ xfs_calc_atomic_write_unit_max(mp, XG_TYPE_AG);
+ xfs_calc_atomic_write_unit_max(mp, XG_TYPE_RTG);
mp->m_awu_max_bytes = new_max_bytes;
return 0;
}
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index aae0d0ef84e0..6addebd764b0 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -171,36 +171,33 @@ DEFINE_ATTR_LIST_EVENT(xfs_attr_leaf_list);
DEFINE_ATTR_LIST_EVENT(xfs_attr_node_list);
TRACE_EVENT(xfs_calc_atomic_write_unit_max,
- TP_PROTO(struct xfs_mount *mp, unsigned int max_write,
- unsigned int max_ioend, unsigned int max_agsize,
- unsigned int max_rgsize),
- TP_ARGS(mp, max_write, max_ioend, max_agsize, max_rgsize),
+ TP_PROTO(struct xfs_mount *mp, enum xfs_group_type type,
+ unsigned int max_write, unsigned int max_ioend,
+ unsigned int max_gsize, unsigned int awu_max),
+ TP_ARGS(mp, type, max_write, max_ioend, max_gsize, awu_max),
TP_STRUCT__entry(
__field(dev_t, dev)
+ __field(enum xfs_group_type, type)
__field(unsigned int, max_write)
__field(unsigned int, max_ioend)
- __field(unsigned int, max_agsize)
- __field(unsigned int, max_rgsize)
- __field(unsigned int, data_awu_max)
- __field(unsigned int, rt_awu_max)
+ __field(unsigned int, max_gsize)
+ __field(unsigned int, awu_max)
),
TP_fast_assign(
__entry->dev = mp->m_super->s_dev;
+ __entry->type = type;
__entry->max_write = max_write;
__entry->max_ioend = max_ioend;
- __entry->max_agsize = max_agsize;
- __entry->max_rgsize = max_rgsize;
- __entry->data_awu_max = mp->m_groups[XG_TYPE_AG].awu_max;
- __entry->rt_awu_max = mp->m_groups[XG_TYPE_RTG].awu_max;
+ __entry->max_gsize = max_gsize;
+ __entry->awu_max = awu_max;
),
- TP_printk("dev %d:%d max_write %u max_ioend %u max_agsize %u max_rgsize %u data_awu_max %u rt_awu_max %u",
+ TP_printk("dev %d:%d %s max_write %u max_ioend %u max_gsize %u awu_max %u",
MAJOR(__entry->dev), MINOR(__entry->dev),
+ __print_symbolic(__entry->type, XG_TYPE_STRINGS),
__entry->max_write,
__entry->max_ioend,
- __entry->max_agsize,
- __entry->max_rgsize,
- __entry->data_awu_max,
- __entry->rt_awu_max)
+ __entry->max_gsize,
+ __entry->awu_max)
);
TRACE_EVENT(xfs_calc_max_atomic_write_fsblocks,
--
2.47.2
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max
2025-07-01 10:40 ` [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max Christoph Hellwig
@ 2025-07-01 16:53 ` Darrick J. Wong
0 siblings, 0 replies; 33+ messages in thread
From: Darrick J. Wong @ 2025-07-01 16:53 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, linux-xfs
On Tue, Jul 01, 2025 at 12:40:38PM +0200, Christoph Hellwig wrote:
> This function and the helpers used by it duplicate the same logic for AGs
> and RTGs. Use the xfs_group_type enum to unify both variants.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: John Garry <john.g.garry@oracle.com>
LGTM,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_mount.c | 76 +++++++++++++++++-----------------------------
> fs/xfs/xfs_trace.h | 31 +++++++++----------
> 2 files changed, 42 insertions(+), 65 deletions(-)
>
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 047100b080aa..99fbb22bad4c 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -679,68 +679,46 @@ static inline unsigned int max_pow_of_two_factor(const unsigned int nr)
> }
>
> /*
> - * If the data device advertises atomic write support, limit the size of data
> - * device atomic writes to the greatest power-of-two factor of the AG size so
> - * that every atomic write unit aligns with the start of every AG. This is
> - * required so that the per-AG allocations for an atomic write will always be
> + * If the underlying device advertises atomic write support, limit the size of
> + * atomic writes to the greatest power-of-two factor of the group size so
> + * that every atomic write unit aligns with the start of every group. This is
> + * required so that the allocations for an atomic write will always be
> * aligned compatibly with the alignment requirements of the storage.
> *
> - * If the data device doesn't advertise atomic writes, then there are no
> - * alignment restrictions and the largest out-of-place write we can do
> - * ourselves is the number of blocks that user files can allocate from any AG.
> + * If the device doesn't advertise atomic writes, then there are no alignment
> + * restrictions and the largest out-of-place write we can do ourselves is the
> + * number of blocks that user files can allocate from any group.
> */
> -static inline xfs_extlen_t xfs_calc_perag_awu_max(struct xfs_mount *mp)
> -{
> - if (mp->m_ddev_targp->bt_bdev_awu_min > 0)
> - return max_pow_of_two_factor(mp->m_sb.sb_agblocks);
> - return rounddown_pow_of_two(mp->m_ag_max_usable);
> -}
> -
> -/*
> - * Reflink on the realtime device requires rtgroups, and atomic writes require
> - * reflink.
> - *
> - * If the realtime device advertises atomic write support, limit the size of
> - * data device atomic writes to the greatest power-of-two factor of the rtgroup
> - * size so that every atomic write unit aligns with the start of every rtgroup.
> - * This is required so that the per-rtgroup allocations for an atomic write
> - * will always be aligned compatibly with the alignment requirements of the
> - * storage.
> - *
> - * If the rt device doesn't advertise atomic writes, then there are no
> - * alignment restrictions and the largest out-of-place write we can do
> - * ourselves is the number of blocks that user files can allocate from any
> - * rtgroup.
> - */
> -static inline xfs_extlen_t xfs_calc_rtgroup_awu_max(struct xfs_mount *mp)
> +static xfs_extlen_t
> +xfs_calc_group_awu_max(
> + struct xfs_mount *mp,
> + enum xfs_group_type type)
> {
> - struct xfs_groups *rgs = &mp->m_groups[XG_TYPE_RTG];
> + struct xfs_groups *g = &mp->m_groups[type];
> + struct xfs_buftarg *btp = xfs_group_type_buftarg(mp, type);
>
> - if (rgs->blocks == 0)
> + if (g->blocks == 0)
> return 0;
> - if (mp->m_rtdev_targp && mp->m_rtdev_targp->bt_bdev_awu_min > 0)
> - return max_pow_of_two_factor(rgs->blocks);
> - return rounddown_pow_of_two(rgs->blocks);
> + if (btp && btp->bt_bdev_awu_min > 0)
> + return max_pow_of_two_factor(g->blocks);
> + return rounddown_pow_of_two(g->blocks);
> }
>
> /* Compute the maximum atomic write unit size for each section. */
> static inline void
> xfs_calc_atomic_write_unit_max(
> - struct xfs_mount *mp)
> + struct xfs_mount *mp,
> + enum xfs_group_type type)
> {
> - struct xfs_groups *ags = &mp->m_groups[XG_TYPE_AG];
> - struct xfs_groups *rgs = &mp->m_groups[XG_TYPE_RTG];
> + struct xfs_groups *g = &mp->m_groups[type];
>
> const xfs_extlen_t max_write = xfs_calc_atomic_write_max(mp);
> const xfs_extlen_t max_ioend = xfs_reflink_max_atomic_cow(mp);
> - const xfs_extlen_t max_agsize = xfs_calc_perag_awu_max(mp);
> - const xfs_extlen_t max_rgsize = xfs_calc_rtgroup_awu_max(mp);
> -
> - ags->awu_max = min3(max_write, max_ioend, max_agsize);
> - rgs->awu_max = min3(max_write, max_ioend, max_rgsize);
> + const xfs_extlen_t max_gsize = xfs_calc_group_awu_max(mp, type);
>
> - trace_xfs_calc_atomic_write_unit_max(mp, max_write, max_ioend,
> - max_agsize, max_rgsize);
> + g->awu_max = min3(max_write, max_ioend, max_gsize);
> + trace_xfs_calc_atomic_write_unit_max(mp, type, max_write, max_ioend,
> + max_gsize, g->awu_max);
> }
>
> /*
> @@ -758,7 +736,8 @@ xfs_set_max_atomic_write_opt(
> max(mp->m_groups[XG_TYPE_AG].blocks,
> mp->m_groups[XG_TYPE_RTG].blocks);
> const xfs_extlen_t max_group_write =
> - max(xfs_calc_perag_awu_max(mp), xfs_calc_rtgroup_awu_max(mp));
> + max(xfs_calc_group_awu_max(mp, XG_TYPE_AG),
> + xfs_calc_group_awu_max(mp, XG_TYPE_RTG));
> int error;
>
> if (new_max_bytes == 0)
> @@ -814,7 +793,8 @@ xfs_set_max_atomic_write_opt(
> return error;
> }
>
> - xfs_calc_atomic_write_unit_max(mp);
> + xfs_calc_atomic_write_unit_max(mp, XG_TYPE_AG);
> + xfs_calc_atomic_write_unit_max(mp, XG_TYPE_RTG);
> mp->m_awu_max_bytes = new_max_bytes;
> return 0;
> }
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index aae0d0ef84e0..6addebd764b0 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -171,36 +171,33 @@ DEFINE_ATTR_LIST_EVENT(xfs_attr_leaf_list);
> DEFINE_ATTR_LIST_EVENT(xfs_attr_node_list);
>
> TRACE_EVENT(xfs_calc_atomic_write_unit_max,
> - TP_PROTO(struct xfs_mount *mp, unsigned int max_write,
> - unsigned int max_ioend, unsigned int max_agsize,
> - unsigned int max_rgsize),
> - TP_ARGS(mp, max_write, max_ioend, max_agsize, max_rgsize),
> + TP_PROTO(struct xfs_mount *mp, enum xfs_group_type type,
> + unsigned int max_write, unsigned int max_ioend,
> + unsigned int max_gsize, unsigned int awu_max),
> + TP_ARGS(mp, type, max_write, max_ioend, max_gsize, awu_max),
> TP_STRUCT__entry(
> __field(dev_t, dev)
> + __field(enum xfs_group_type, type)
> __field(unsigned int, max_write)
> __field(unsigned int, max_ioend)
> - __field(unsigned int, max_agsize)
> - __field(unsigned int, max_rgsize)
> - __field(unsigned int, data_awu_max)
> - __field(unsigned int, rt_awu_max)
> + __field(unsigned int, max_gsize)
> + __field(unsigned int, awu_max)
> ),
> TP_fast_assign(
> __entry->dev = mp->m_super->s_dev;
> + __entry->type = type;
> __entry->max_write = max_write;
> __entry->max_ioend = max_ioend;
> - __entry->max_agsize = max_agsize;
> - __entry->max_rgsize = max_rgsize;
> - __entry->data_awu_max = mp->m_groups[XG_TYPE_AG].awu_max;
> - __entry->rt_awu_max = mp->m_groups[XG_TYPE_RTG].awu_max;
> + __entry->max_gsize = max_gsize;
> + __entry->awu_max = awu_max;
> ),
> - TP_printk("dev %d:%d max_write %u max_ioend %u max_agsize %u max_rgsize %u data_awu_max %u rt_awu_max %u",
> + TP_printk("dev %d:%d %s max_write %u max_ioend %u max_gsize %u awu_max %u",
> MAJOR(__entry->dev), MINOR(__entry->dev),
> + __print_symbolic(__entry->type, XG_TYPE_STRINGS),
> __entry->max_write,
> __entry->max_ioend,
> - __entry->max_agsize,
> - __entry->max_rgsize,
> - __entry->data_awu_max,
> - __entry->rt_awu_max)
> + __entry->max_gsize,
> + __entry->awu_max)
> );
>
> TRACE_EVENT(xfs_calc_max_atomic_write_fsblocks,
> --
> 2.47.2
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields
2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
` (3 preceding siblings ...)
2025-07-01 10:40 ` [PATCH 4/7] xfs: refactor xfs_calc_atomic_write_unit_max Christoph Hellwig
@ 2025-07-01 10:40 ` Christoph Hellwig
2025-07-01 10:54 ` John Garry
2025-07-01 16:45 ` Darrick J. Wong
2025-07-01 10:40 ` [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field Christoph Hellwig
2025-07-01 10:40 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
6 siblings, 2 replies; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-01 10:40 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs
The extra bdev_ is weird, so drop it. Also improve the comment to make
it clear these are the hardware limits.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_buf.c | 4 ++--
fs/xfs/xfs_buf.h | 6 +++---
fs/xfs/xfs_file.c | 2 +-
fs/xfs/xfs_inode.h | 2 +-
fs/xfs/xfs_iomap.c | 2 +-
fs/xfs/xfs_iops.c | 2 +-
fs/xfs/xfs_mount.c | 2 +-
7 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 7a05310da895..661f6c70e9d0 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1712,8 +1712,8 @@ xfs_configure_buftarg_atomic_writes(
max_bytes = 0;
}
- btp->bt_bdev_awu_min = min_bytes;
- btp->bt_bdev_awu_max = max_bytes;
+ btp->bt_awu_min = min_bytes;
+ btp->bt_awu_max = max_bytes;
}
/* Configure a buffer target that abstracts a block device. */
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 73a9686110e8..7987a6d64874 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -112,9 +112,9 @@ struct xfs_buftarg {
struct percpu_counter bt_readahead_count;
struct ratelimit_state bt_ioerror_rl;
- /* Atomic write unit values, bytes */
- unsigned int bt_bdev_awu_min;
- unsigned int bt_bdev_awu_max;
+ /* Hardware atomic write unit values, bytes */
+ unsigned int bt_awu_min;
+ unsigned int bt_awu_max;
/* built-in cache, if we're not using the perag one */
struct xfs_buf_cache bt_cache[];
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 48254a72071b..377fc9077781 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -752,7 +752,7 @@ xfs_file_dio_write_atomic(
* HW offload should be faster, so try that first if it is already
* known that the write length is not too large.
*/
- if (ocount > xfs_inode_buftarg(ip)->bt_bdev_awu_max)
+ if (ocount > xfs_inode_buftarg(ip)->bt_awu_max)
dops = &xfs_atomic_write_cow_iomap_ops;
else
dops = &xfs_direct_write_iomap_ops;
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index d7e2b902ef5c..07fbdcc4cbf5 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -358,7 +358,7 @@ static inline bool xfs_inode_has_bigrtalloc(const struct xfs_inode *ip)
static inline bool xfs_inode_can_hw_atomic_write(const struct xfs_inode *ip)
{
- return xfs_inode_buftarg(ip)->bt_bdev_awu_max > 0;
+ return xfs_inode_buftarg(ip)->bt_awu_max > 0;
}
/*
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index ff05e6b1b0bb..ec30b78bf5c4 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -827,7 +827,7 @@ xfs_bmap_hw_atomic_write_possible(
/*
* The ->iomap_begin caller should ensure this, but check anyway.
*/
- return len <= xfs_inode_buftarg(ip)->bt_bdev_awu_max;
+ return len <= xfs_inode_buftarg(ip)->bt_awu_max;
}
static int
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 8cddbb7c149b..01e597290eb5 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -665,7 +665,7 @@ xfs_get_atomic_write_max_opt(
* less than our out of place write limit, but we don't want to exceed
* the awu_max.
*/
- return min(awu_max, xfs_inode_buftarg(ip)->bt_bdev_awu_max);
+ return min(awu_max, xfs_inode_buftarg(ip)->bt_awu_max);
}
static void
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 99fbb22bad4c..0b690bc119d7 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -699,7 +699,7 @@ xfs_calc_group_awu_max(
if (g->blocks == 0)
return 0;
- if (btp && btp->bt_bdev_awu_min > 0)
+ if (btp && btp->bt_awu_min > 0)
return max_pow_of_two_factor(g->blocks);
return rounddown_pow_of_two(g->blocks);
}
--
2.47.2
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields
2025-07-01 10:40 ` [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields Christoph Hellwig
@ 2025-07-01 10:54 ` John Garry
2025-07-01 16:45 ` Darrick J. Wong
1 sibling, 0 replies; 33+ messages in thread
From: John Garry @ 2025-07-01 10:54 UTC (permalink / raw)
To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs
On 01/07/2025 11:40, Christoph Hellwig wrote:
> The extra bdev_ is weird, so drop it. Also improve the comment to make
> it clear these are the hardware limits.
>
> Signed-off-by: Christoph Hellwig<hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields
2025-07-01 10:40 ` [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields Christoph Hellwig
2025-07-01 10:54 ` John Garry
@ 2025-07-01 16:45 ` Darrick J. Wong
1 sibling, 0 replies; 33+ messages in thread
From: Darrick J. Wong @ 2025-07-01 16:45 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, linux-xfs
On Tue, Jul 01, 2025 at 12:40:39PM +0200, Christoph Hellwig wrote:
> The extra bdev_ is weird, so drop it. Also improve the comment to make
> it clear these are the hardware limits.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks fine, thanks for the clarification in the xfs_buftarg definition
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_buf.c | 4 ++--
> fs/xfs/xfs_buf.h | 6 +++---
> fs/xfs/xfs_file.c | 2 +-
> fs/xfs/xfs_inode.h | 2 +-
> fs/xfs/xfs_iomap.c | 2 +-
> fs/xfs/xfs_iops.c | 2 +-
> fs/xfs/xfs_mount.c | 2 +-
> 7 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 7a05310da895..661f6c70e9d0 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1712,8 +1712,8 @@ xfs_configure_buftarg_atomic_writes(
> max_bytes = 0;
> }
>
> - btp->bt_bdev_awu_min = min_bytes;
> - btp->bt_bdev_awu_max = max_bytes;
> + btp->bt_awu_min = min_bytes;
> + btp->bt_awu_max = max_bytes;
> }
>
> /* Configure a buffer target that abstracts a block device. */
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index 73a9686110e8..7987a6d64874 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -112,9 +112,9 @@ struct xfs_buftarg {
> struct percpu_counter bt_readahead_count;
> struct ratelimit_state bt_ioerror_rl;
>
> - /* Atomic write unit values, bytes */
> - unsigned int bt_bdev_awu_min;
> - unsigned int bt_bdev_awu_max;
> + /* Hardware atomic write unit values, bytes */
> + unsigned int bt_awu_min;
> + unsigned int bt_awu_max;
>
> /* built-in cache, if we're not using the perag one */
> struct xfs_buf_cache bt_cache[];
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 48254a72071b..377fc9077781 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -752,7 +752,7 @@ xfs_file_dio_write_atomic(
> * HW offload should be faster, so try that first if it is already
> * known that the write length is not too large.
> */
> - if (ocount > xfs_inode_buftarg(ip)->bt_bdev_awu_max)
> + if (ocount > xfs_inode_buftarg(ip)->bt_awu_max)
> dops = &xfs_atomic_write_cow_iomap_ops;
> else
> dops = &xfs_direct_write_iomap_ops;
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index d7e2b902ef5c..07fbdcc4cbf5 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -358,7 +358,7 @@ static inline bool xfs_inode_has_bigrtalloc(const struct xfs_inode *ip)
>
> static inline bool xfs_inode_can_hw_atomic_write(const struct xfs_inode *ip)
> {
> - return xfs_inode_buftarg(ip)->bt_bdev_awu_max > 0;
> + return xfs_inode_buftarg(ip)->bt_awu_max > 0;
> }
>
> /*
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index ff05e6b1b0bb..ec30b78bf5c4 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -827,7 +827,7 @@ xfs_bmap_hw_atomic_write_possible(
> /*
> * The ->iomap_begin caller should ensure this, but check anyway.
> */
> - return len <= xfs_inode_buftarg(ip)->bt_bdev_awu_max;
> + return len <= xfs_inode_buftarg(ip)->bt_awu_max;
> }
>
> static int
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 8cddbb7c149b..01e597290eb5 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -665,7 +665,7 @@ xfs_get_atomic_write_max_opt(
> * less than our out of place write limit, but we don't want to exceed
> * the awu_max.
> */
> - return min(awu_max, xfs_inode_buftarg(ip)->bt_bdev_awu_max);
> + return min(awu_max, xfs_inode_buftarg(ip)->bt_awu_max);
> }
>
> static void
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 99fbb22bad4c..0b690bc119d7 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -699,7 +699,7 @@ xfs_calc_group_awu_max(
>
> if (g->blocks == 0)
> return 0;
> - if (btp && btp->bt_bdev_awu_min > 0)
> + if (btp && btp->bt_awu_min > 0)
> return max_pow_of_two_factor(g->blocks);
> return rounddown_pow_of_two(g->blocks);
> }
> --
> 2.47.2
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field
2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
` (4 preceding siblings ...)
2025-07-01 10:40 ` [PATCH 5/7] xfs: rename the bt_bdev_* buftarg fields Christoph Hellwig
@ 2025-07-01 10:40 ` Christoph Hellwig
2025-07-01 11:09 ` John Garry
2025-07-01 16:46 ` Darrick J. Wong
2025-07-01 10:40 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
6 siblings, 2 replies; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-01 10:40 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs
And use bt_file for both bdev and shmem backed buftargs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_buf.c | 4 ++--
fs/xfs/xfs_buf.h | 1 -
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 661f6c70e9d0..b73da43f489c 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1683,7 +1683,7 @@ xfs_free_buftarg(
fs_put_dax(btp->bt_daxdev, btp->bt_mount);
/* the main block device is closed by kill_block_super */
if (btp->bt_bdev != btp->bt_mount->m_super->s_bdev)
- bdev_fput(btp->bt_bdev_file);
+ bdev_fput(btp->bt_file);
kfree(btp);
}
@@ -1802,7 +1802,7 @@ xfs_alloc_buftarg(
btp = kzalloc(sizeof(*btp), GFP_KERNEL | __GFP_NOFAIL);
btp->bt_mount = mp;
- btp->bt_bdev_file = bdev_file;
+ btp->bt_file = bdev_file;
btp->bt_bdev = file_bdev(bdev_file);
btp->bt_dev = btp->bt_bdev->bd_dev;
btp->bt_daxdev = fs_dax_get_by_bdev(btp->bt_bdev, &btp->bt_dax_part_off,
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 7987a6d64874..b269e115d9ac 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -94,7 +94,6 @@ void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
*/
struct xfs_buftarg {
dev_t bt_dev;
- struct file *bt_bdev_file;
struct block_device *bt_bdev;
struct dax_device *bt_daxdev;
struct file *bt_file;
--
2.47.2
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field
2025-07-01 10:40 ` [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field Christoph Hellwig
@ 2025-07-01 11:09 ` John Garry
2025-07-01 16:46 ` Darrick J. Wong
1 sibling, 0 replies; 33+ messages in thread
From: John Garry @ 2025-07-01 11:09 UTC (permalink / raw)
To: Christoph Hellwig, Carlos Maiolino; +Cc: Darrick J. Wong, linux-xfs
On 01/07/2025 11:40, Christoph Hellwig wrote:
> And use bt_file for both bdev and shmem backed buftargs.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
looks ok, so FWIW:
Reviewed-by: John Garry <john.g.garry@oracle.com>
> ---
> fs/xfs/xfs_buf.c | 4 ++--
> fs/xfs/xfs_buf.h | 1 -
> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 661f6c70e9d0..b73da43f489c 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1683,7 +1683,7 @@ xfs_free_buftarg(
> fs_put_dax(btp->bt_daxdev, btp->bt_mount);
> /* the main block device is closed by kill_block_super */
> if (btp->bt_bdev != btp->bt_mount->m_super->s_bdev)
> - bdev_fput(btp->bt_bdev_file);
> + bdev_fput(btp->bt_file);
> kfree(btp);
> }
>
> @@ -1802,7 +1802,7 @@ xfs_alloc_buftarg(
> btp = kzalloc(sizeof(*btp), GFP_KERNEL | __GFP_NOFAIL);
>
> btp->bt_mount = mp;
> - btp->bt_bdev_file = bdev_file;
> + btp->bt_file = bdev_file;
> btp->bt_bdev = file_bdev(bdev_file);
> btp->bt_dev = btp->bt_bdev->bd_dev;
> btp->bt_daxdev = fs_dax_get_by_bdev(btp->bt_bdev, &btp->bt_dax_part_off,
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index 7987a6d64874..b269e115d9ac 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -94,7 +94,6 @@ void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
> */
> struct xfs_buftarg {
> dev_t bt_dev;
> - struct file *bt_bdev_file;
> struct block_device *bt_bdev;
> struct dax_device *bt_daxdev;
> struct file *bt_file;
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field
2025-07-01 10:40 ` [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field Christoph Hellwig
2025-07-01 11:09 ` John Garry
@ 2025-07-01 16:46 ` Darrick J. Wong
1 sibling, 0 replies; 33+ messages in thread
From: Darrick J. Wong @ 2025-07-01 16:46 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, linux-xfs
On Tue, Jul 01, 2025 at 12:40:40PM +0200, Christoph Hellwig wrote:
> And use bt_file for both bdev and shmem backed buftargs.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Woot, that's a nice savings!
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_buf.c | 4 ++--
> fs/xfs/xfs_buf.h | 1 -
> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 661f6c70e9d0..b73da43f489c 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1683,7 +1683,7 @@ xfs_free_buftarg(
> fs_put_dax(btp->bt_daxdev, btp->bt_mount);
> /* the main block device is closed by kill_block_super */
> if (btp->bt_bdev != btp->bt_mount->m_super->s_bdev)
> - bdev_fput(btp->bt_bdev_file);
> + bdev_fput(btp->bt_file);
> kfree(btp);
> }
>
> @@ -1802,7 +1802,7 @@ xfs_alloc_buftarg(
> btp = kzalloc(sizeof(*btp), GFP_KERNEL | __GFP_NOFAIL);
>
> btp->bt_mount = mp;
> - btp->bt_bdev_file = bdev_file;
> + btp->bt_file = bdev_file;
> btp->bt_bdev = file_bdev(bdev_file);
> btp->bt_dev = btp->bt_bdev->bd_dev;
> btp->bt_daxdev = fs_dax_get_by_bdev(btp->bt_bdev, &btp->bt_dax_part_off,
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index 7987a6d64874..b269e115d9ac 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -94,7 +94,6 @@ void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
> */
> struct xfs_buftarg {
> dev_t bt_dev;
> - struct file *bt_bdev_file;
> struct block_device *bt_bdev;
> struct dax_device *bt_daxdev;
> struct file *bt_file;
> --
> 2.47.2
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
2025-07-01 10:40 misc cleanups v2 Christoph Hellwig
` (5 preceding siblings ...)
2025-07-01 10:40 ` [PATCH 6/7] xfs: remove the bt_bdev_file buftarg field Christoph Hellwig
@ 2025-07-01 10:40 ` Christoph Hellwig
2025-07-01 16:51 ` Darrick J. Wong
2025-07-01 22:55 ` Dave Chinner
6 siblings, 2 replies; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-01 10:40 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: John Garry, Darrick J. Wong, linux-xfs
The file system only has a single file system sector size. Read that
from the in-core super block to avoid confusion about the two different
"sector sizes" stored in the buftarg. Note that this loosens the
alignment asserts for memory backed buftargs that set the page size here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_buf.c | 20 +++++---------------
fs/xfs/xfs_buf.h | 15 ---------------
fs/xfs/xfs_buf_mem.c | 2 --
3 files changed, 5 insertions(+), 32 deletions(-)
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index b73da43f489c..0f20d9514d0d 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -387,17 +387,18 @@ xfs_buf_map_verify(
struct xfs_buftarg *btp,
struct xfs_buf_map *map)
{
+ struct xfs_mount *mp = btp->bt_mount;
xfs_daddr_t eofs;
/* Check for IOs smaller than the sector size / not sector aligned */
- ASSERT(!(BBTOB(map->bm_len) < btp->bt_meta_sectorsize));
- ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)btp->bt_meta_sectormask));
+ ASSERT(!(BBTOB(map->bm_len) < mp->m_sb.sb_sectsize));
+ ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)(mp->m_sb.sb_sectsize - 1)));
/*
* Corrupted block numbers can get through to here, unfortunately, so we
* have to check that the buffer falls within the filesystem bounds.
*/
- eofs = XFS_FSB_TO_BB(btp->bt_mount, btp->bt_mount->m_sb.sb_dblocks);
+ eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks);
if (map->bm_bn < 0 || map->bm_bn >= eofs) {
xfs_alert(btp->bt_mount,
"%s: daddr 0x%llx out of range, EOFS 0x%llx",
@@ -1726,10 +1727,6 @@ xfs_configure_buftarg(
ASSERT(btp->bt_bdev != NULL);
- /* Set up metadata sector size info */
- btp->bt_meta_sectorsize = sectorsize;
- btp->bt_meta_sectormask = sectorsize - 1;
-
error = bdev_validate_blocksize(btp->bt_bdev, sectorsize);
if (error) {
xfs_warn(btp->bt_mount,
@@ -1816,14 +1813,7 @@ xfs_alloc_buftarg(
if (error)
goto error_free;
- /*
- * When allocating the buftargs we have not yet read the super block and
- * thus don't know the file system sector size yet.
- */
- btp->bt_meta_sectorsize = bdev_logical_block_size(btp->bt_bdev);
- btp->bt_meta_sectormask = btp->bt_meta_sectorsize - 1;
-
- error = xfs_init_buftarg(btp, btp->bt_meta_sectorsize,
+ error = xfs_init_buftarg(btp, bdev_logical_block_size(btp->bt_bdev),
mp->m_super->s_id);
if (error)
goto error_free;
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index b269e115d9ac..8edfd9ed799e 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -79,19 +79,6 @@ struct xfs_buf_cache {
int xfs_buf_cache_init(struct xfs_buf_cache *bch);
void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
-/*
- * The xfs_buftarg contains 2 notions of "sector size" -
- *
- * 1) The metadata sector size, which is the minimum unit and
- * alignment of IO which will be performed by metadata operations.
- * 2) The device logical sector size
- *
- * The first is specified at mkfs time, and is stored on-disk in the
- * superblock's sb_sectsize.
- *
- * The latter is derived from the underlying device, and controls direct IO
- * alignment constraints.
- */
struct xfs_buftarg {
dev_t bt_dev;
struct block_device *bt_bdev;
@@ -99,8 +86,6 @@ struct xfs_buftarg {
struct file *bt_file;
u64 bt_dax_part_off;
struct xfs_mount *bt_mount;
- unsigned int bt_meta_sectorsize;
- size_t bt_meta_sectormask;
size_t bt_logical_sectorsize;
size_t bt_logical_sectormask;
diff --git a/fs/xfs/xfs_buf_mem.c b/fs/xfs/xfs_buf_mem.c
index dcbfa274e06d..46f527750d34 100644
--- a/fs/xfs/xfs_buf_mem.c
+++ b/fs/xfs/xfs_buf_mem.c
@@ -90,8 +90,6 @@ xmbuf_alloc(
btp->bt_dev = (dev_t)-1U;
btp->bt_bdev = NULL; /* in-memory buftargs have no bdev */
btp->bt_file = file;
- btp->bt_meta_sectorsize = XMBUF_BLOCKSIZE;
- btp->bt_meta_sectormask = XMBUF_BLOCKSIZE - 1;
error = xfs_init_buftarg(btp, XMBUF_BLOCKSIZE, descr);
if (error)
--
2.47.2
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
2025-07-01 10:40 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
@ 2025-07-01 16:51 ` Darrick J. Wong
2025-07-01 22:55 ` Dave Chinner
1 sibling, 0 replies; 33+ messages in thread
From: Darrick J. Wong @ 2025-07-01 16:51 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, linux-xfs
On Tue, Jul 01, 2025 at 12:40:41PM +0200, Christoph Hellwig wrote:
> The file system only has a single file system sector size. Read that
> from the in-core super block to avoid confusion about the two different
> "sector sizes" stored in the buftarg.
Yeah, that whole bt_meta_sectorsize stuff was confusing.
> Note that this loosens the
> alignment asserts for memory backed buftargs that set the page size here.
Doesn't matter at all since shmem files are byte addressable anyway. :)
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_buf.c | 20 +++++---------------
> fs/xfs/xfs_buf.h | 15 ---------------
> fs/xfs/xfs_buf_mem.c | 2 --
> 3 files changed, 5 insertions(+), 32 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index b73da43f489c..0f20d9514d0d 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -387,17 +387,18 @@ xfs_buf_map_verify(
> struct xfs_buftarg *btp,
> struct xfs_buf_map *map)
> {
> + struct xfs_mount *mp = btp->bt_mount;
> xfs_daddr_t eofs;
>
> /* Check for IOs smaller than the sector size / not sector aligned */
> - ASSERT(!(BBTOB(map->bm_len) < btp->bt_meta_sectorsize));
> - ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)btp->bt_meta_sectormask));
> + ASSERT(!(BBTOB(map->bm_len) < mp->m_sb.sb_sectsize));
Urgh, should the polarity of this be reversed?
ASSERT(BBTOB(map->bm_len) >= mp->m_sb.sb_sectsize);
Though I don't really care enough to block the patch so
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> + ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)(mp->m_sb.sb_sectsize - 1)));
>
> /*
> * Corrupted block numbers can get through to here, unfortunately, so we
> * have to check that the buffer falls within the filesystem bounds.
> */
> - eofs = XFS_FSB_TO_BB(btp->bt_mount, btp->bt_mount->m_sb.sb_dblocks);
> + eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks);
> if (map->bm_bn < 0 || map->bm_bn >= eofs) {
> xfs_alert(btp->bt_mount,
> "%s: daddr 0x%llx out of range, EOFS 0x%llx",
> @@ -1726,10 +1727,6 @@ xfs_configure_buftarg(
>
> ASSERT(btp->bt_bdev != NULL);
>
> - /* Set up metadata sector size info */
> - btp->bt_meta_sectorsize = sectorsize;
> - btp->bt_meta_sectormask = sectorsize - 1;
> -
> error = bdev_validate_blocksize(btp->bt_bdev, sectorsize);
> if (error) {
> xfs_warn(btp->bt_mount,
> @@ -1816,14 +1813,7 @@ xfs_alloc_buftarg(
> if (error)
> goto error_free;
>
> - /*
> - * When allocating the buftargs we have not yet read the super block and
> - * thus don't know the file system sector size yet.
> - */
> - btp->bt_meta_sectorsize = bdev_logical_block_size(btp->bt_bdev);
> - btp->bt_meta_sectormask = btp->bt_meta_sectorsize - 1;
> -
> - error = xfs_init_buftarg(btp, btp->bt_meta_sectorsize,
> + error = xfs_init_buftarg(btp, bdev_logical_block_size(btp->bt_bdev),
> mp->m_super->s_id);
> if (error)
> goto error_free;
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index b269e115d9ac..8edfd9ed799e 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -79,19 +79,6 @@ struct xfs_buf_cache {
> int xfs_buf_cache_init(struct xfs_buf_cache *bch);
> void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
>
> -/*
> - * The xfs_buftarg contains 2 notions of "sector size" -
> - *
> - * 1) The metadata sector size, which is the minimum unit and
> - * alignment of IO which will be performed by metadata operations.
> - * 2) The device logical sector size
> - *
> - * The first is specified at mkfs time, and is stored on-disk in the
> - * superblock's sb_sectsize.
> - *
> - * The latter is derived from the underlying device, and controls direct IO
> - * alignment constraints.
> - */
> struct xfs_buftarg {
> dev_t bt_dev;
> struct block_device *bt_bdev;
> @@ -99,8 +86,6 @@ struct xfs_buftarg {
> struct file *bt_file;
> u64 bt_dax_part_off;
> struct xfs_mount *bt_mount;
> - unsigned int bt_meta_sectorsize;
> - size_t bt_meta_sectormask;
> size_t bt_logical_sectorsize;
> size_t bt_logical_sectormask;
>
> diff --git a/fs/xfs/xfs_buf_mem.c b/fs/xfs/xfs_buf_mem.c
> index dcbfa274e06d..46f527750d34 100644
> --- a/fs/xfs/xfs_buf_mem.c
> +++ b/fs/xfs/xfs_buf_mem.c
> @@ -90,8 +90,6 @@ xmbuf_alloc(
> btp->bt_dev = (dev_t)-1U;
> btp->bt_bdev = NULL; /* in-memory buftargs have no bdev */
> btp->bt_file = file;
> - btp->bt_meta_sectorsize = XMBUF_BLOCKSIZE;
> - btp->bt_meta_sectormask = XMBUF_BLOCKSIZE - 1;
>
> error = xfs_init_buftarg(btp, XMBUF_BLOCKSIZE, descr);
> if (error)
> --
> 2.47.2
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
2025-07-01 10:40 ` [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg Christoph Hellwig
2025-07-01 16:51 ` Darrick J. Wong
@ 2025-07-01 22:55 ` Dave Chinner
2025-07-03 13:44 ` Christoph Hellwig
1 sibling, 1 reply; 33+ messages in thread
From: Dave Chinner @ 2025-07-01 22:55 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Carlos Maiolino, John Garry, Darrick J. Wong, linux-xfs
On Tue, Jul 01, 2025 at 12:40:41PM +0200, Christoph Hellwig wrote:
> The file system only has a single file system sector size. Read that
This is still an incorrect assertion.
We still have the exact same notion of two different sector sizes
for the data device - one for metadata alignment (from the sb) and
a different one for data (direct IO) alignment (from the bdev).
Removing the abstraction and the comment that explains this does not
make this fundamental data vs metadata sector size difference
within the data device buftarg and betweeen different buftargs
go away.
All you are doing is changing the way this differentiation is
implemented - it's making the data device metadata sector size an
implicit property of *all* buftargs, rather than an explicit
property defined at buftarg instantiation.
IOWs, you are removing an explicit abstraction that exists for
correctness and replacing it with a fixed value that isn't correct
for all buftargs in the filesystem.
> Read that
> from the in-core super block to avoid confusion about the two different
> "sector sizes" stored in the buftarg. Note that this loosens the
> alignment asserts for memory backed buftargs that set the page size here.
I think you got that wrong, because ....
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_buf.c | 20 +++++---------------
> fs/xfs/xfs_buf.h | 15 ---------------
> fs/xfs/xfs_buf_mem.c | 2 --
> 3 files changed, 5 insertions(+), 32 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index b73da43f489c..0f20d9514d0d 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -387,17 +387,18 @@ xfs_buf_map_verify(
> struct xfs_buftarg *btp,
> struct xfs_buf_map *map)
> {
> + struct xfs_mount *mp = btp->bt_mount;
> xfs_daddr_t eofs;
>
> /* Check for IOs smaller than the sector size / not sector aligned */
> - ASSERT(!(BBTOB(map->bm_len) < btp->bt_meta_sectorsize));
> - ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)btp->bt_meta_sectormask));
> + ASSERT(!(BBTOB(map->bm_len) < mp->m_sb.sb_sectsize));
> + ASSERT(!(BBTOB(map->bm_bn) & (xfs_off_t)(mp->m_sb.sb_sectsize - 1)));
.... these asserts will fire for single block XMBUF buffers because
PAGE_SIZE < mp->m_sb.sb_sectsize when the metadata sector size is
larger than PAGE_SIZE....
Fundamentally, using mp->m_sb.sb_sectsize for all buftargs is simply
wrong. Buftargs have different sector sizes, and we should continue
to encapsulate this in the buftarg.
Also, I can see no obvious reason for getting rid of this
abstraction and none has been given. It doesn't make the code any
cleaner, and it introduces an incorrect implicit assumption in the
buffer cache code (i.e. that every buftarg has the same sector
size). Hence I don't think we actually should be "cleaning up" this
code like this...
Now, if there's some other functional reason for doing this change
that hasn't been stated, let's talk about the change in that
context. Can you explain why this is necessary?
But as a straight cleanup it makes no sense...
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 7/7] xfs: remove the bt_meta_sectorsize field in struct buftarg
2025-07-01 22:55 ` Dave Chinner
@ 2025-07-03 13:44 ` Christoph Hellwig
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2025-07-03 13:44 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, Carlos Maiolino, John Garry, Darrick J. Wong,
linux-xfs
On Wed, Jul 02, 2025 at 08:55:06AM +1000, Dave Chinner wrote:
> .... these asserts will fire for single block XMBUF buffers because
> PAGE_SIZE < mp->m_sb.sb_sectsize when the metadata sector size is
> larger than PAGE_SIZE....
Oh right, we now support metadata sector sizes > PAGE_SIZE.
> Also, I can see no obvious reason for getting rid of this
> abstraction and none has been given. It doesn't make the code any
> cleaner, and it introduces an incorrect implicit assumption in the
> buffer cache code (i.e. that every buftarg has the same sector
> size). Hence I don't think we actually should be "cleaning up" this
> code like this...
>
> Now, if there's some other functional reason for doing this change
> that hasn't been stated, let's talk about the change in that
> context. Can you explain why this is necessary?
Mostly because they way they were used is horribly confusing.
The worst part is fixed up in patch 1, but I wanted to go one step
futher. I guess I'll just drop this patch and we'll have to live
with the two sector sizes.
^ permalink raw reply [flat|nested] 33+ messages in thread