linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET] xfs: improve online repair reap calculations
@ 2025-08-28 14:28 Darrick J. Wong
  2025-08-28 14:28 ` [PATCH 1/9] xfs: prepare reaping code for dynamic limits Darrick J. Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:28 UTC (permalink / raw)
  To: cem, djwong; +Cc: stable, hch, linux-xfs

Hi all,

A few months ago, the multi-fsblock untorn writes patchset added a bunch
of log intent item helper functions to estimate the number of intent
items that could be added to a particular transaction.  Those helpers
enabled us to compute a safe upper bound on the number of blocks that
could be written in an untorn fashion with filesystem-provided out of
place writes.

Currently, the online fsck code employs static limits on the number of
intent items that it's willing to accrue to a single transaction when
it's trying to reap what it thinks are the old blocks from a corrupt
structure.  There have been no problems reported with this approach
after years of testing, but static limits are scary and gross because
overestimating the intent item limit could result in transaction
overflows and dead filesystems; and underestimating causes unnecessary
overhead.

This series uses the new log intent item size helpers to estimate the
limits dynamically based on worst-case per-block repair work vs. the
size of the scrub transaction.  After several months of testing this,
there don't seem to be any problems here either.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fix-scrub-reap-calculations
---
Commits in this patchset:
 * xfs: prepare reaping code for dynamic limits
 * xfs: convert the ifork reap code to use xreap_state
 * xfs: use deferred intent items for reaping crosslinked blocks
 * xfs: compute per-AG extent reap limits dynamically
 * xfs: compute data device CoW staging extent reap limits dynamically
 * xfs: compute realtime device CoW staging extent reap limits dynamically
 * xfs: compute file mapping reap limits dynamically
 * xfs: remove static reap limits
 * xfs: use deferred reaping for data device cow extents
---
 fs/xfs/scrub/repair.h |    8 -
 fs/xfs/scrub/trace.h  |   45 ++++
 fs/xfs/scrub/newbt.c  |    7 +
 fs/xfs/scrub/reap.c   |  622 +++++++++++++++++++++++++++++++++++++++----------
 fs/xfs/scrub/trace.c  |    1 
 5 files changed, 552 insertions(+), 131 deletions(-)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/9] xfs: prepare reaping code for dynamic limits
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
@ 2025-08-28 14:28 ` Darrick J. Wong
  2025-09-02  6:23   ` Christoph Hellwig
  2025-08-28 14:28 ` [PATCH 2/9] xfs: convert the ifork reap code to use xreap_state Darrick J. Wong
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:28 UTC (permalink / raw)
  To: cem, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

The online repair block reaping code employs static limits to decide if
it's time to roll the transaction or finish the deferred item chains to
avoid overflowing the scrub transaction's reservation.  However, the
use of static limits aren't great -- btree blocks are assumed to be
scattered around the AG and the buffers need to be invalidated, whereas
COW staging extents are usually contiguous and do not have buffers.  We
would like to configure the limits dynamically.

To get ready for this, reorganize struct xreap_state to store dynamic
limits, and add helpers to hide some of the details of how the limits
are enforced.  Also rename the "xreap roll" functions to include the
word "binval" because they only exist to decide when we should roll the
transaction to deal with buffer invalidations.

No functional changes intended here.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/reap.c |  149 +++++++++++++++++++++++++++------------------------
 1 file changed, 79 insertions(+), 70 deletions(-)


diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index 8703897c0a9ccb..7877ef427990eb 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -95,17 +95,17 @@ struct xreap_state {
 	const struct xfs_owner_info	*oinfo;
 	enum xfs_ag_resv_type		resv;
 
-	/* If true, roll the transaction before reaping the next extent. */
-	bool				force_roll;
-
-	/* Number of deferred reaps attached to the current transaction. */
-	unsigned int			deferred;
-
 	/* Number of invalidated buffers logged to the current transaction. */
-	unsigned int			invalidated;
+	unsigned int			nr_binval;
 
-	/* Number of deferred reaps queued during the whole reap sequence. */
-	unsigned long long		total_deferred;
+	/* Maximum number of buffers we can invalidate in a single tx. */
+	unsigned int			max_binval;
+
+	/* Number of deferred reaps attached to the current transaction. */
+	unsigned int			nr_deferred;
+
+	/* Maximum number of intents we can reap in a single transaction. */
+	unsigned int			max_deferred;
 };
 
 /* Put a block back on the AGFL. */
@@ -148,44 +148,36 @@ xreap_put_freelist(
 }
 
 /* Are there any uncommitted reap operations? */
-static inline bool xreap_dirty(const struct xreap_state *rs)
+static inline bool xreap_is_dirty(const struct xreap_state *rs)
 {
-	if (rs->force_roll)
-		return true;
-	if (rs->deferred)
-		return true;
-	if (rs->invalidated)
-		return true;
-	if (rs->total_deferred)
-		return true;
-	return false;
+	return rs->nr_binval > 0 || rs->nr_deferred > 0;
 }
 
 #define XREAP_MAX_BINVAL	(2048)
 
 /*
- * Decide if we want to roll the transaction after reaping an extent.  We don't
- * want to overrun the transaction reservation, so we prohibit more than
- * 128 EFIs per transaction.  For the same reason, we limit the number
- * of buffer invalidations to 2048.
+ * Decide if we need to roll the transaction to clear out the the log
+ * reservation that we allocated to buffer invalidations.
  */
-static inline bool xreap_want_roll(const struct xreap_state *rs)
+static inline bool xreap_want_binval_roll(const struct xreap_state *rs)
 {
-	if (rs->force_roll)
-		return true;
-	if (rs->deferred > XREP_MAX_ITRUNCATE_EFIS)
-		return true;
-	if (rs->invalidated > XREAP_MAX_BINVAL)
-		return true;
-	return false;
+	return rs->nr_binval >= rs->max_binval;
 }
 
-static inline void xreap_reset(struct xreap_state *rs)
+/* Reset the buffer invalidation count after rolling. */
+static inline void xreap_binval_reset(struct xreap_state *rs)
 {
-	rs->total_deferred += rs->deferred;
-	rs->deferred = 0;
-	rs->invalidated = 0;
-	rs->force_roll = false;
+	rs->nr_binval = 0;
+}
+
+/*
+ * Bump the number of invalidated buffers, and return true if we can continue,
+ * or false if we need to roll the transaction.
+ */
+static inline bool xreap_inc_binval(struct xreap_state *rs)
+{
+	rs->nr_binval++;
+	return rs->nr_binval < rs->max_binval;
 }
 
 #define XREAP_MAX_DEFER_CHAIN		(2048)
@@ -194,25 +186,36 @@ static inline void xreap_reset(struct xreap_state *rs)
  * Decide if we want to finish the deferred ops that are attached to the scrub
  * transaction.  We don't want to queue huge chains of deferred ops because
  * that can consume a lot of log space and kernel memory.  Hence we trigger a
- * xfs_defer_finish if there are more than 2048 deferred reap operations or the
- * caller did some real work.
+ * xfs_defer_finish if there are too many deferred reap operations or we've run
+ * out of space for invalidations.
  */
-static inline bool
-xreap_want_defer_finish(const struct xreap_state *rs)
+static inline bool xreap_want_defer_finish(const struct xreap_state *rs)
 {
-	if (rs->force_roll)
-		return true;
-	if (rs->total_deferred > XREAP_MAX_DEFER_CHAIN)
-		return true;
-	return false;
+	return rs->nr_deferred >= rs->max_deferred;
 }
 
+/*
+ * Reset the defer chain length and buffer invalidation count after finishing
+ * items.
+ */
 static inline void xreap_defer_finish_reset(struct xreap_state *rs)
 {
-	rs->total_deferred = 0;
-	rs->deferred = 0;
-	rs->invalidated = 0;
-	rs->force_roll = false;
+	rs->nr_deferred = 0;
+	rs->nr_binval = 0;
+}
+
+/*
+ * Bump the number of deferred extent reaps.
+ */
+static inline void xreap_inc_defer(struct xreap_state *rs)
+{
+	rs->nr_deferred++;
+}
+
+/* Force the caller to finish a deferred item chain. */
+static inline void xreap_force_defer_finish(struct xreap_state *rs)
+{
+	rs->nr_deferred = rs->max_deferred;
 }
 
 /*
@@ -297,14 +300,13 @@ xreap_agextent_binval(
 		while ((bp = xrep_bufscan_advance(mp, &scan)) != NULL) {
 			xfs_trans_bjoin(sc->tp, bp);
 			xfs_trans_binval(sc->tp, bp);
-			rs->invalidated++;
 
 			/*
 			 * Stop invalidating if we've hit the limit; we should
 			 * still have enough reservation left to free however
 			 * far we've gotten.
 			 */
-			if (rs->invalidated > XREAP_MAX_BINVAL) {
+			if (!xreap_inc_binval(rs)) {
 				*aglenp -= agbno_next - bno;
 				goto out;
 			}
@@ -416,7 +418,7 @@ xreap_agextent_iter(
 		trace_xreap_dispose_unmap_extent(pag_group(sc->sa.pag), agbno,
 				*aglenp);
 
-		rs->force_roll = true;
+		xreap_force_defer_finish(rs);
 
 		if (rs->oinfo == &XFS_RMAP_OINFO_COW) {
 			/*
@@ -443,7 +445,7 @@ xreap_agextent_iter(
 	 */
 	xreap_agextent_binval(rs, agbno, aglenp);
 	if (*aglenp == 0) {
-		ASSERT(xreap_want_roll(rs));
+		ASSERT(xreap_want_binval_roll(rs));
 		return 0;
 	}
 
@@ -463,7 +465,7 @@ xreap_agextent_iter(
 		if (error)
 			return error;
 
-		rs->force_roll = true;
+		xreap_force_defer_finish(rs);
 		return 0;
 	}
 
@@ -474,7 +476,7 @@ xreap_agextent_iter(
 		if (error)
 			return error;
 
-		rs->force_roll = true;
+		xreap_force_defer_finish(rs);
 		return 0;
 	}
 
@@ -489,8 +491,8 @@ xreap_agextent_iter(
 	if (error)
 		return error;
 
-	rs->deferred++;
-	if (rs->deferred % 2 == 0)
+	xreap_inc_defer(rs);
+	if (rs->nr_deferred % 2 == 0)
 		xfs_defer_add_barrier(sc->tp);
 	return 0;
 }
@@ -531,11 +533,11 @@ xreap_agmeta_extent(
 			if (error)
 				return error;
 			xreap_defer_finish_reset(rs);
-		} else if (xreap_want_roll(rs)) {
+		} else if (xreap_want_binval_roll(rs)) {
 			error = xrep_roll_ag_trans(sc);
 			if (error)
 				return error;
-			xreap_reset(rs);
+			xreap_binval_reset(rs);
 		}
 
 		agbno += aglen;
@@ -556,6 +558,8 @@ xrep_reap_agblocks(
 		.sc			= sc,
 		.oinfo			= oinfo,
 		.resv			= type,
+		.max_binval		= XREAP_MAX_BINVAL,
+		.max_deferred		= XREAP_MAX_DEFER_CHAIN,
 	};
 	int				error;
 
@@ -566,7 +570,7 @@ xrep_reap_agblocks(
 	if (error)
 		return error;
 
-	if (xreap_dirty(&rs))
+	if (xreap_is_dirty(&rs))
 		return xrep_defer_finish(sc);
 
 	return 0;
@@ -628,7 +632,7 @@ xreap_fsmeta_extent(
 			if (error)
 				goto out_agf;
 			xreap_defer_finish_reset(rs);
-		} else if (xreap_want_roll(rs)) {
+		} else if (xreap_want_binval_roll(rs)) {
 			/*
 			 * Hold the AGF buffer across the transaction roll so
 			 * that we don't have to reattach it to the scrub
@@ -639,7 +643,7 @@ xreap_fsmeta_extent(
 			xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
 			if (error)
 				goto out_agf;
-			xreap_reset(rs);
+			xreap_binval_reset(rs);
 		}
 
 		agbno += aglen;
@@ -668,6 +672,8 @@ xrep_reap_fsblocks(
 		.sc			= sc,
 		.oinfo			= oinfo,
 		.resv			= XFS_AG_RESV_NONE,
+		.max_binval		= XREAP_MAX_BINVAL,
+		.max_deferred		= XREAP_MAX_DEFER_CHAIN,
 	};
 	int				error;
 
@@ -678,7 +684,7 @@ xrep_reap_fsblocks(
 	if (error)
 		return error;
 
-	if (xreap_dirty(&rs))
+	if (xreap_is_dirty(&rs))
 		return xrep_defer_finish(sc);
 
 	return 0;
@@ -778,7 +784,7 @@ xreap_rgextent_iter(
 				*rglenp);
 
 		xfs_refcount_free_cow_extent(sc->tp, true, rtbno, *rglenp);
-		rs->deferred++;
+		xreap_inc_defer(rs);
 		return 0;
 	}
 
@@ -799,7 +805,7 @@ xreap_rgextent_iter(
 	if (error)
 		return error;
 
-	rs->deferred++;
+	xreap_inc_defer(rs);
 	return 0;
 }
 
@@ -855,11 +861,11 @@ xreap_rtmeta_extent(
 			if (error)
 				goto out_unlock;
 			xreap_defer_finish_reset(rs);
-		} else if (xreap_want_roll(rs)) {
+		} else if (xreap_want_binval_roll(rs)) {
 			error = xfs_trans_roll_inode(&sc->tp, sc->ip);
 			if (error)
 				goto out_unlock;
-			xreap_reset(rs);
+			xreap_binval_reset(rs);
 		}
 
 		rgbno += rglen;
@@ -886,6 +892,8 @@ xrep_reap_rtblocks(
 		.sc			= sc,
 		.oinfo			= oinfo,
 		.resv			= XFS_AG_RESV_NONE,
+		.max_binval		= XREAP_MAX_BINVAL,
+		.max_deferred		= XREAP_MAX_DEFER_CHAIN,
 	};
 	int				error;
 
@@ -896,7 +904,7 @@ xrep_reap_rtblocks(
 	if (error)
 		return error;
 
-	if (xreap_dirty(&rs))
+	if (xreap_is_dirty(&rs))
 		return xrep_defer_finish(sc);
 
 	return 0;
@@ -922,6 +930,8 @@ xrep_reap_metadir_fsblocks(
 		.sc			= sc,
 		.oinfo			= &oinfo,
 		.resv			= XFS_AG_RESV_NONE,
+		.max_binval		= XREAP_MAX_BINVAL,
+		.max_deferred		= XREAP_MAX_DEFER_CHAIN,
 	};
 	int				error;
 
@@ -930,12 +940,11 @@ xrep_reap_metadir_fsblocks(
 	ASSERT(xfs_is_metadir_inode(sc->ip));
 
 	xfs_rmap_ino_bmbt_owner(&oinfo, sc->ip->i_ino, XFS_DATA_FORK);
-
 	error = xfsb_bitmap_walk(bitmap, xreap_fsmeta_extent, &rs);
 	if (error)
 		return error;
 
-	if (xreap_dirty(&rs)) {
+	if (xreap_is_dirty(&rs)) {
 		error = xrep_defer_finish(sc);
 		if (error)
 			return error;


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/9] xfs: convert the ifork reap code to use xreap_state
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
  2025-08-28 14:28 ` [PATCH 1/9] xfs: prepare reaping code for dynamic limits Darrick J. Wong
@ 2025-08-28 14:28 ` Darrick J. Wong
  2025-09-02  6:24   ` Christoph Hellwig
  2025-08-28 14:28 ` [PATCH 3/9] xfs: use deferred intent items for reaping crosslinked blocks Darrick J. Wong
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:28 UTC (permalink / raw)
  To: cem, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Convert the file fork reaping code to use struct xreap_state so that we
can reuse the dynamic state tracking code.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/reap.c |   78 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 46 insertions(+), 32 deletions(-)


diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index 7877ef427990eb..db46f6cd4112f3 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -91,9 +91,21 @@
 struct xreap_state {
 	struct xfs_scrub		*sc;
 
-	/* Reverse mapping owner and metadata reservation type. */
-	const struct xfs_owner_info	*oinfo;
-	enum xfs_ag_resv_type		resv;
+	union {
+		struct {
+			/*
+			 * For AG blocks, this is reverse mapping owner and
+			 * metadata reservation type.
+			 */
+			const struct xfs_owner_info	*oinfo;
+			enum xfs_ag_resv_type		resv;
+		};
+		struct {
+			/* For file blocks, this is the inode and fork. */
+			struct xfs_inode		*ip;
+			int				whichfork;
+		};
+	};
 
 	/* Number of invalidated buffers logged to the current transaction. */
 	unsigned int			nr_binval;
@@ -964,13 +976,12 @@ xrep_reap_metadir_fsblocks(
  */
 STATIC int
 xreap_bmapi_select(
-	struct xfs_scrub	*sc,
-	struct xfs_inode	*ip,
-	int			whichfork,
+	struct xreap_state	*rs,
 	struct xfs_bmbt_irec	*imap,
 	bool			*crosslinked)
 {
 	struct xfs_owner_info	oinfo;
+	struct xfs_scrub	*sc = rs->sc;
 	struct xfs_btree_cur	*cur;
 	xfs_filblks_t		len = 1;
 	xfs_agblock_t		bno;
@@ -984,7 +995,8 @@ xreap_bmapi_select(
 	cur = xfs_rmapbt_init_cursor(sc->mp, sc->tp, sc->sa.agf_bp,
 			sc->sa.pag);
 
-	xfs_rmap_ino_owner(&oinfo, ip->i_ino, whichfork, imap->br_startoff);
+	xfs_rmap_ino_owner(&oinfo, rs->ip->i_ino, rs->whichfork,
+			imap->br_startoff);
 	error = xfs_rmap_has_other_keys(cur, agbno, 1, &oinfo, crosslinked);
 	if (error)
 		goto out_cur;
@@ -1047,21 +1059,19 @@ xreap_buf_loggable(
  */
 STATIC int
 xreap_bmapi_binval(
-	struct xfs_scrub	*sc,
-	struct xfs_inode	*ip,
-	int			whichfork,
+	struct xreap_state	*rs,
 	struct xfs_bmbt_irec	*imap)
 {
+	struct xfs_scrub	*sc = rs->sc;
 	struct xfs_mount	*mp = sc->mp;
 	struct xfs_perag	*pag = sc->sa.pag;
-	int			bmap_flags = xfs_bmapi_aflag(whichfork);
+	int			bmap_flags = xfs_bmapi_aflag(rs->whichfork);
 	xfs_fileoff_t		off;
 	xfs_fileoff_t		max_off;
 	xfs_extlen_t		scan_blocks;
 	xfs_agblock_t		bno;
 	xfs_agblock_t		agbno;
 	xfs_agblock_t		agbno_next;
-	unsigned int		invalidated = 0;
 	int			error;
 
 	/*
@@ -1088,7 +1098,7 @@ xreap_bmapi_binval(
 		struct xfs_bmbt_irec	hmap;
 		int			nhmaps = 1;
 
-		error = xfs_bmapi_read(ip, off, max_off - off, &hmap,
+		error = xfs_bmapi_read(rs->ip, off, max_off - off, &hmap,
 				&nhmaps, bmap_flags);
 		if (error)
 			return error;
@@ -1129,14 +1139,13 @@ xreap_bmapi_binval(
 				xfs_buf_stale(bp);
 				xfs_buf_relse(bp);
 			}
-			invalidated++;
 
 			/*
 			 * Stop invalidating if we've hit the limit; we should
 			 * still have enough reservation left to free however
-			 * much of the mapping we've seen so far.
+			 * far we've gotten.
 			 */
-			if (invalidated > XREAP_MAX_BINVAL) {
+			if (!xreap_inc_binval(rs)) {
 				imap->br_blockcount = agbno_next - bno;
 				goto out;
 			}
@@ -1158,12 +1167,11 @@ xreap_bmapi_binval(
  */
 STATIC int
 xrep_reap_bmapi_iter(
-	struct xfs_scrub		*sc,
-	struct xfs_inode		*ip,
-	int				whichfork,
+	struct xreap_state		*rs,
 	struct xfs_bmbt_irec		*imap,
 	bool				crosslinked)
 {
+	struct xfs_scrub		*sc = rs->sc;
 	int				error;
 
 	if (crosslinked) {
@@ -1184,10 +1192,10 @@ xrep_reap_bmapi_iter(
 		 * deferred log intents in this function to control the exact
 		 * sequence of metadata updates.
 		 */
-		xfs_bmap_unmap_extent(sc->tp, ip, whichfork, imap);
-		xfs_trans_mod_dquot_byino(sc->tp, ip, XFS_TRANS_DQ_BCOUNT,
+		xfs_bmap_unmap_extent(sc->tp, rs->ip, rs->whichfork, imap);
+		xfs_trans_mod_dquot_byino(sc->tp, rs->ip, XFS_TRANS_DQ_BCOUNT,
 				-(int64_t)imap->br_blockcount);
-		xfs_rmap_unmap_extent(sc->tp, ip, whichfork, imap);
+		xfs_rmap_unmap_extent(sc->tp, rs->ip, rs->whichfork, imap);
 		return 0;
 	}
 
@@ -1208,7 +1216,7 @@ xrep_reap_bmapi_iter(
 	 * transaction is full of logged buffer invalidations, so we need to
 	 * return early so that we can roll and retry.
 	 */
-	error = xreap_bmapi_binval(sc, ip, whichfork, imap);
+	error = xreap_bmapi_binval(rs, imap);
 	if (error || imap->br_blockcount == 0)
 		return error;
 
@@ -1217,8 +1225,8 @@ xrep_reap_bmapi_iter(
 	 * intents in this function to control the exact sequence of metadata
 	 * updates.
 	 */
-	xfs_bmap_unmap_extent(sc->tp, ip, whichfork, imap);
-	xfs_trans_mod_dquot_byino(sc->tp, ip, XFS_TRANS_DQ_BCOUNT,
+	xfs_bmap_unmap_extent(sc->tp, rs->ip, rs->whichfork, imap);
+	xfs_trans_mod_dquot_byino(sc->tp, rs->ip, XFS_TRANS_DQ_BCOUNT,
 			-(int64_t)imap->br_blockcount);
 	return xfs_free_extent_later(sc->tp, imap->br_startblock,
 			imap->br_blockcount, NULL, XFS_AG_RESV_NONE,
@@ -1231,18 +1239,17 @@ xrep_reap_bmapi_iter(
  */
 STATIC int
 xreap_ifork_extent(
-	struct xfs_scrub		*sc,
-	struct xfs_inode		*ip,
-	int				whichfork,
+	struct xreap_state		*rs,
 	struct xfs_bmbt_irec		*imap)
 {
+	struct xfs_scrub		*sc = rs->sc;
 	xfs_agnumber_t			agno;
 	bool				crosslinked;
 	int				error;
 
 	ASSERT(sc->sa.pag == NULL);
 
-	trace_xreap_ifork_extent(sc, ip, whichfork, imap);
+	trace_xreap_ifork_extent(sc, rs->ip, rs->whichfork, imap);
 
 	agno = XFS_FSB_TO_AGNO(sc->mp, imap->br_startblock);
 	sc->sa.pag = xfs_perag_get(sc->mp, agno);
@@ -1257,11 +1264,11 @@ xreap_ifork_extent(
 	 * Decide the fate of the blocks at the beginning of the mapping, then
 	 * update the mapping to use it with the unmap calls.
 	 */
-	error = xreap_bmapi_select(sc, ip, whichfork, imap, &crosslinked);
+	error = xreap_bmapi_select(rs, imap, &crosslinked);
 	if (error)
 		goto out_agf;
 
-	error = xrep_reap_bmapi_iter(sc, ip, whichfork, imap, crosslinked);
+	error = xrep_reap_bmapi_iter(rs, imap, crosslinked);
 	if (error)
 		goto out_agf;
 
@@ -1285,6 +1292,12 @@ xrep_reap_ifork(
 	struct xfs_inode	*ip,
 	int			whichfork)
 {
+	struct xreap_state	rs = {
+		.sc		= sc,
+		.ip		= ip,
+		.whichfork	= whichfork,
+		.max_binval	= XREAP_MAX_BINVAL,
+	};
 	xfs_fileoff_t		off = 0;
 	int			bmap_flags = xfs_bmapi_aflag(whichfork);
 	int			error;
@@ -1312,13 +1325,14 @@ xrep_reap_ifork(
 		 * can in a single transaction.
 		 */
 		if (xfs_bmap_is_real_extent(&imap)) {
-			error = xreap_ifork_extent(sc, ip, whichfork, &imap);
+			error = xreap_ifork_extent(&rs, &imap);
 			if (error)
 				return error;
 
 			error = xfs_defer_finish(&sc->tp);
 			if (error)
 				return error;
+			xreap_defer_finish_reset(&rs);
 		}
 
 		off = imap.br_startoff + imap.br_blockcount;


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/9] xfs: use deferred intent items for reaping crosslinked blocks
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
  2025-08-28 14:28 ` [PATCH 1/9] xfs: prepare reaping code for dynamic limits Darrick J. Wong
  2025-08-28 14:28 ` [PATCH 2/9] xfs: convert the ifork reap code to use xreap_state Darrick J. Wong
@ 2025-08-28 14:28 ` Darrick J. Wong
  2025-09-02  6:25   ` Christoph Hellwig
  2025-08-28 14:28 ` [PATCH 4/9] xfs: compute per-AG extent reap limits dynamically Darrick J. Wong
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:28 UTC (permalink / raw)
  To: cem, djwong; +Cc: stable, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

When we're removing rmap records for crosslinked blocks, use deferred
intent items so that we can try to free/unmap as many of the old data
structure's blocks as we can in the same transaction as the commit.

Cc: <stable@vger.kernel.org> # v6.6
Fixes: 1c7ce115e52106 ("xfs: reap large AG metadata extents when possible")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/reap.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)


diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index db46f6cd4112f3..33272729249f64 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -430,8 +430,6 @@ xreap_agextent_iter(
 		trace_xreap_dispose_unmap_extent(pag_group(sc->sa.pag), agbno,
 				*aglenp);
 
-		xreap_force_defer_finish(rs);
-
 		if (rs->oinfo == &XFS_RMAP_OINFO_COW) {
 			/*
 			 * If we're unmapping CoW staging extents, remove the
@@ -440,11 +438,14 @@ xreap_agextent_iter(
 			 */
 			xfs_refcount_free_cow_extent(sc->tp, false, fsbno,
 					*aglenp);
+			xreap_force_defer_finish(rs);
 			return 0;
 		}
 
-		return xfs_rmap_free(sc->tp, sc->sa.agf_bp, sc->sa.pag, agbno,
-				*aglenp, rs->oinfo);
+		xfs_rmap_free_extent(sc->tp, false, fsbno, *aglenp,
+				rs->oinfo->oi_owner);
+		xreap_inc_defer(rs);
+		return 0;
 	}
 
 	trace_xreap_dispose_free_extent(pag_group(sc->sa.pag), agbno, *aglenp);


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/9] xfs: compute per-AG extent reap limits dynamically
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
                   ` (2 preceding siblings ...)
  2025-08-28 14:28 ` [PATCH 3/9] xfs: use deferred intent items for reaping crosslinked blocks Darrick J. Wong
@ 2025-08-28 14:28 ` Darrick J. Wong
  2025-09-02  6:26   ` Christoph Hellwig
  2025-08-28 14:29 ` [PATCH 5/9] xfs: compute data device CoW staging " Darrick J. Wong
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:28 UTC (permalink / raw)
  To: cem, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Calculate the maximum number of extents that can be reaped in a single
transaction chain, and the number of buffers that can be invalidated in
a single transaction.  The rough calculation here is:

nr_extents = (logres - reservation used by any one step) /
		(space used by intents per extent +
		 space used per binval)

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/trace.h |   42 +++++++++++++++
 fs/xfs/scrub/reap.c  |  140 ++++++++++++++++++++++++++++++++++++++++++++++----
 fs/xfs/scrub/trace.c |    1 
 3 files changed, 171 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index a8187281eb96b9..d39da0e67024fb 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -2000,6 +2000,48 @@ DEFINE_REPAIR_EXTENT_EVENT(xreap_agextent_binval);
 DEFINE_REPAIR_EXTENT_EVENT(xreap_bmapi_binval);
 DEFINE_REPAIR_EXTENT_EVENT(xrep_agfl_insert);
 
+DECLARE_EVENT_CLASS(xrep_reap_limits_class,
+	TP_PROTO(const struct xfs_trans *tp, unsigned int per_binval,
+		 unsigned int max_binval, unsigned int step_size,
+		 unsigned int per_intent,
+		 unsigned int max_deferred),
+	TP_ARGS(tp, per_binval, max_binval, step_size, per_intent, max_deferred),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(unsigned int, log_res)
+		__field(unsigned int, per_binval)
+		__field(unsigned int, max_binval)
+		__field(unsigned int, step_size)
+		__field(unsigned int, per_intent)
+		__field(unsigned int, max_deferred)
+	),
+	TP_fast_assign(
+		__entry->dev = tp->t_mountp->m_super->s_dev;
+		__entry->log_res = tp->t_log_res;
+		__entry->per_binval = per_binval;
+		__entry->max_binval = max_binval;
+		__entry->step_size = step_size;
+		__entry->per_intent = per_intent;
+		__entry->max_deferred = max_deferred;
+	),
+	TP_printk("dev %d:%d logres %u per_binval %u max_binval %u step_size %u per_intent %u max_deferred %u",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->log_res,
+		  __entry->per_binval,
+		  __entry->max_binval,
+		  __entry->step_size,
+		  __entry->per_intent,
+		  __entry->max_deferred)
+);
+#define DEFINE_REPAIR_REAP_LIMITS_EVENT(name) \
+DEFINE_EVENT(xrep_reap_limits_class, name, \
+	TP_PROTO(const struct xfs_trans *tp, unsigned int per_binval, \
+		 unsigned int max_binval, unsigned int step_size, \
+		 unsigned int per_intent, \
+		 unsigned int max_deferred), \
+	TP_ARGS(tp, per_binval, max_binval, step_size, per_intent, max_deferred))
+DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_agextent_limits);
+
 DECLARE_EVENT_CLASS(xrep_reap_find_class,
 	TP_PROTO(const struct xfs_group *xg, xfs_agblock_t agbno,
 		 xfs_extlen_t len, bool crosslinked),
diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index 33272729249f64..929ea3c453d313 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -36,6 +36,12 @@
 #include "xfs_metafile.h"
 #include "xfs_rtgroup.h"
 #include "xfs_rtrmap_btree.h"
+#include "xfs_extfree_item.h"
+#include "xfs_rmap_item.h"
+#include "xfs_refcount_item.h"
+#include "xfs_buf_item.h"
+#include "xfs_bmap_item.h"
+#include "xfs_bmap_btree.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
 #include "scrub/trace.h"
@@ -230,6 +236,15 @@ static inline void xreap_force_defer_finish(struct xreap_state *rs)
 	rs->nr_deferred = rs->max_deferred;
 }
 
+/* Maximum number of fsblocks that we might find in a buffer to invalidate. */
+static inline unsigned int
+xrep_binval_max_fsblocks(
+	struct xfs_mount	*mp)
+{
+	/* Remote xattr values are the largest buffers that we support. */
+	return xfs_attr3_max_rmt_blocks(mp);
+}
+
 /*
  * Compute the maximum length of a buffer cache scan (in units of sectors),
  * given a quantity of fs blocks.
@@ -239,12 +254,8 @@ xrep_bufscan_max_sectors(
 	struct xfs_mount	*mp,
 	xfs_extlen_t		fsblocks)
 {
-	int			max_fsbs;
-
-	/* Remote xattr values are the largest buffers that we support. */
-	max_fsbs = xfs_attr3_max_rmt_blocks(mp);
-
-	return XFS_FSB_TO_BB(mp, min_t(xfs_extlen_t, fsblocks, max_fsbs));
+	return XFS_FSB_TO_BB(mp, min_t(xfs_extlen_t, fsblocks,
+				       xrep_binval_max_fsblocks(mp)));
 }
 
 /*
@@ -442,6 +453,7 @@ xreap_agextent_iter(
 			return 0;
 		}
 
+		/* t1: unmap crosslinked metadata blocks */
 		xfs_rmap_free_extent(sc->tp, false, fsbno, *aglenp,
 				rs->oinfo->oi_owner);
 		xreap_inc_defer(rs);
@@ -482,7 +494,7 @@ xreap_agextent_iter(
 		return 0;
 	}
 
-	/* Put blocks back on the AGFL one at a time. */
+	/* t3: Put blocks back on the AGFL one at a time. */
 	if (rs->resv == XFS_AG_RESV_AGFL) {
 		ASSERT(*aglenp == 1);
 		error = xreap_put_freelist(sc, agbno);
@@ -494,7 +506,7 @@ xreap_agextent_iter(
 	}
 
 	/*
-	 * Use deferred frees to get rid of the old btree blocks to try to
+	 * t4: Use deferred frees to get rid of the old btree blocks to try to
 	 * minimize the window in which we could crash and lose the old blocks.
 	 * Add a defer ops barrier every other extent to avoid stressing the
 	 * system with large EFIs.
@@ -510,6 +522,110 @@ xreap_agextent_iter(
 	return 0;
 }
 
+/* Configure the deferral and invalidation limits */
+static inline void
+xreap_configure_limits(
+	struct xreap_state	*rs,
+	unsigned int		fixed_overhead,
+	unsigned int		variable_overhead,
+	unsigned int		per_intent,
+	unsigned int		per_binval)
+{
+	struct xfs_scrub	*sc = rs->sc;
+	unsigned int		res = sc->tp->t_log_res - fixed_overhead;
+
+	/* Don't underflow the reservation */
+	if (sc->tp->t_log_res < (fixed_overhead + variable_overhead)) {
+		ASSERT(sc->tp->t_log_res >=
+				(fixed_overhead + variable_overhead));
+		xfs_force_shutdown(sc->mp, SHUTDOWN_CORRUPT_INCORE);
+		return;
+	}
+
+	rs->max_deferred = res / variable_overhead;
+	res -= rs->max_deferred * per_intent;
+	rs->max_binval = per_binval ? res / per_binval : 0;
+}
+
+/*
+ * Compute the maximum number of intent items that reaping can attach to the
+ * scrub transaction given the worst case log overhead of the intent items
+ * needed to reap a single per-AG space extent.  This is not for freeing CoW
+ * staging extents.
+ */
+STATIC void
+xreap_configure_agextent_limits(
+	struct xreap_state	*rs)
+{
+	struct xfs_scrub	*sc = rs->sc;
+	struct xfs_mount	*mp = sc->mp;
+
+	/*
+	 * In the worst case, relogging an intent item causes both an intent
+	 * item and a done item to be attached to a transaction for each extent
+	 * that we'd like to process.
+	 */
+	const unsigned int	efi = xfs_efi_log_space(1) +
+				      xfs_efd_log_space(1);
+	const unsigned int	rui = xfs_rui_log_space(1) +
+				      xfs_rud_log_space();
+
+	/*
+	 * Various things can happen when reaping non-CoW metadata blocks:
+	 *
+	 * t1: Unmapping crosslinked metadata blocks: deferred removal of rmap
+	 * record.
+	 *
+	 * t3: Freeing to AGFL: roll and finish deferred items for every block.
+	 * Limits here do not matter.
+	 *
+	 * t4: Freeing metadata blocks: deferred freeing of the space, which
+	 * also removes the rmap record.
+	 *
+	 * For simplicity, we'll use the worst-case intents size to determine
+	 * the maximum number of deferred extents before we have to finish the
+	 * whole chain.  If we're trying to reap a btree larger than this size,
+	 * a crash midway through reaping can result in leaked blocks.
+	 */
+	const unsigned int	t1 = rui;
+	const unsigned int	t4 = rui + efi;
+	const unsigned int	per_intent = max(t1, t4);
+
+	/*
+	 * For each transaction in a reap chain, we must be able to take one
+	 * step in the defer item chain, which should only consist of EFI or
+	 * RUI items.
+	 */
+	const unsigned int	f1 = xfs_calc_finish_efi_reservation(mp, 1);
+	const unsigned int	f2 = xfs_calc_finish_rui_reservation(mp, 1);
+	const unsigned int	step_size = max(f1, f2);
+
+	/* Largest buffer size (in fsblocks) that can be invalidated. */
+	const unsigned int	max_binval = xrep_binval_max_fsblocks(mp);
+
+	/* Maximum overhead of invalidating one buffer. */
+	const unsigned int	per_binval =
+		xfs_buf_inval_log_space(1, XFS_B_TO_FSBT(mp, max_binval));
+
+	/*
+	 * For each transaction in a reap chain, we can delete some number of
+	 * extents and invalidate some number of blocks.  We assume that btree
+	 * blocks aren't usually contiguous; and that scrub likely pulled all
+	 * the buffers into memory.  From these assumptions, set the maximum
+	 * number of deferrals we can queue before flushing the defer chain,
+	 * and the number of invalidations we can queue before rolling to a
+	 * clean transaction (and possibly relogging some of the deferrals) to
+	 * the same quantity.
+	 */
+	const unsigned int	variable_overhead = per_intent + per_binval;
+
+	xreap_configure_limits(rs, step_size, variable_overhead, per_intent,
+			per_binval);
+
+	trace_xreap_agextent_limits(sc->tp, per_binval, rs->max_binval,
+			step_size, per_intent, rs->max_deferred);
+}
+
 /*
  * Break an AG metadata extent into sub-extents by fate (crosslinked, not
  * crosslinked), and dispose of each sub-extent separately.
@@ -571,14 +687,13 @@ xrep_reap_agblocks(
 		.sc			= sc,
 		.oinfo			= oinfo,
 		.resv			= type,
-		.max_binval		= XREAP_MAX_BINVAL,
-		.max_deferred		= XREAP_MAX_DEFER_CHAIN,
 	};
 	int				error;
 
 	ASSERT(xfs_has_rmapbt(sc->mp));
 	ASSERT(sc->ip == NULL);
 
+	xreap_configure_agextent_limits(&rs);
 	error = xagb_bitmap_walk(bitmap, xreap_agmeta_extent, &rs);
 	if (error)
 		return error;
@@ -693,6 +808,8 @@ xrep_reap_fsblocks(
 	ASSERT(xfs_has_rmapbt(sc->mp));
 	ASSERT(sc->ip != NULL);
 
+	if (oinfo != &XFS_RMAP_OINFO_COW)
+		xreap_configure_agextent_limits(&rs);
 	error = xfsb_bitmap_walk(bitmap, xreap_fsmeta_extent, &rs);
 	if (error)
 		return error;
@@ -943,8 +1060,6 @@ xrep_reap_metadir_fsblocks(
 		.sc			= sc,
 		.oinfo			= &oinfo,
 		.resv			= XFS_AG_RESV_NONE,
-		.max_binval		= XREAP_MAX_BINVAL,
-		.max_deferred		= XREAP_MAX_DEFER_CHAIN,
 	};
 	int				error;
 
@@ -952,6 +1067,7 @@ xrep_reap_metadir_fsblocks(
 	ASSERT(sc->ip != NULL);
 	ASSERT(xfs_is_metadir_inode(sc->ip));
 
+	xreap_configure_agextent_limits(&rs);
 	xfs_rmap_ino_bmbt_owner(&oinfo, sc->ip->i_ino, XFS_DATA_FORK);
 	error = xfsb_bitmap_walk(bitmap, xreap_fsmeta_extent, &rs);
 	if (error)
diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c
index 2450e214103fed..987313a52e6401 100644
--- a/fs/xfs/scrub/trace.c
+++ b/fs/xfs/scrub/trace.c
@@ -22,6 +22,7 @@
 #include "xfs_parent.h"
 #include "xfs_metafile.h"
 #include "xfs_rtgroup.h"
+#include "xfs_trans.h"
 #include "scrub/scrub.h"
 #include "scrub/xfile.h"
 #include "scrub/xfarray.h"


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/9] xfs: compute data device CoW staging extent reap limits dynamically
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
                   ` (3 preceding siblings ...)
  2025-08-28 14:28 ` [PATCH 4/9] xfs: compute per-AG extent reap limits dynamically Darrick J. Wong
@ 2025-08-28 14:29 ` Darrick J. Wong
  2025-09-03  6:02   ` Christoph Hellwig
  2025-08-28 14:29 ` [PATCH 6/9] xfs: compute realtime " Darrick J. Wong
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:29 UTC (permalink / raw)
  To: cem, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Calculate the maximum number of CoW staging extents that can be reaped
in a single transaction chain.  The rough calculation here is:

nr_extents = (logres - reservation used by any one step) /
		(space used by intents per extent +
		 space used for a few buffer invalidations)

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/trace.h |    1 +
 fs/xfs/scrub/reap.c  |   88 +++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 84 insertions(+), 5 deletions(-)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index d39da0e67024fb..a9da22f50534cb 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -2041,6 +2041,7 @@ DEFINE_EVENT(xrep_reap_limits_class, name, \
 		 unsigned int max_deferred), \
 	TP_ARGS(tp, per_binval, max_binval, step_size, per_intent, max_deferred))
 DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_agextent_limits);
+DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_agcow_limits);
 
 DECLARE_EVENT_CLASS(xrep_reap_find_class,
 	TP_PROTO(const struct xfs_group *xg, xfs_agblock_t agbno,
diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index 929ea3c453d313..aaef7e6771a045 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -443,7 +443,7 @@ xreap_agextent_iter(
 
 		if (rs->oinfo == &XFS_RMAP_OINFO_COW) {
 			/*
-			 * If we're unmapping CoW staging extents, remove the
+			 * t0: Unmapping CoW staging extents, remove the
 			 * records from the refcountbt, which will remove the
 			 * rmap record as well.
 			 */
@@ -475,7 +475,7 @@ xreap_agextent_iter(
 	}
 
 	/*
-	 * If we're getting rid of CoW staging extents, use deferred work items
+	 * t2: To get rid of CoW staging extents, use deferred work items
 	 * to remove the refcountbt records (which removes the rmap records)
 	 * and free the extent.  We're not worried about the system going down
 	 * here because log recovery walks the refcount btree to clean out the
@@ -626,6 +626,84 @@ xreap_configure_agextent_limits(
 			step_size, per_intent, rs->max_deferred);
 }
 
+/*
+ * Compute the maximum number of intent items that reaping can attach to the
+ * scrub transaction given the worst case log overhead of the intent items
+ * needed to reap a single CoW staging extent.  This is not for freeing
+ * metadata blocks.
+ */
+STATIC void
+xreap_configure_agcow_limits(
+	struct xreap_state	*rs)
+{
+	struct xfs_scrub	*sc = rs->sc;
+	struct xfs_mount	*mp = sc->mp;
+
+	/*
+	 * In the worst case, relogging an intent item causes both an intent
+	 * item and a done item to be attached to a transaction for each extent
+	 * that we'd like to process.
+	 */
+	const unsigned int	efi = xfs_efi_log_space(1) +
+				      xfs_efd_log_space(1);
+	const unsigned int	rui = xfs_rui_log_space(1) +
+				      xfs_rud_log_space();
+	const unsigned int	cui = xfs_cui_log_space(1) +
+				      xfs_cud_log_space();
+
+	/*
+	 * Various things can happen when reaping non-CoW metadata blocks:
+	 *
+	 * t0: Unmapping crosslinked CoW blocks: deferred removal of refcount
+	 * record, which defers removal of rmap record
+	 *
+	 * t2: Freeing CoW blocks: deferred removal of refcount record, which
+	 * defers removal of rmap record; and deferred removal of the space
+	 *
+	 * For simplicity, we'll use the worst-case intents size to determine
+	 * the maximum number of deferred extents before we have to finish the
+	 * whole chain.  If we're trying to reap a btree larger than this size,
+	 * a crash midway through reaping can result in leaked blocks.
+	 */
+	const unsigned int	t0 = cui + rui;
+	const unsigned int	t2 = cui + rui + efi;
+	const unsigned int	per_intent = max(t0, t2);
+
+	/*
+	 * For each transaction in a reap chain, we must be able to take one
+	 * step in the defer item chain, which should only consist of CUI, EFI,
+	 * or RUI items.
+	 */
+	const unsigned int	f1 = xfs_calc_finish_efi_reservation(mp, 1);
+	const unsigned int	f2 = xfs_calc_finish_rui_reservation(mp, 1);
+	const unsigned int	f3 = xfs_calc_finish_cui_reservation(mp, 1);
+	const unsigned int	step_size = max3(f1, f2, f3);
+
+	/* Largest buffer size (in fsblocks) that can be invalidated. */
+	const unsigned int	max_binval = xrep_binval_max_fsblocks(mp);
+
+	/* Overhead of invalidating one buffer */
+	const unsigned int	per_binval =
+		xfs_buf_inval_log_space(1, XFS_B_TO_FSBT(mp, max_binval));
+
+	/*
+	 * For each transaction in a reap chain, we can delete some number of
+	 * extents and invalidate some number of blocks.  We assume that CoW
+	 * staging extents are usually more than 1 fsblock, and that there
+	 * shouldn't be any buffers for those blocks.  From the assumptions,
+	 * set the number of deferrals to use as much of the reservation as
+	 * it can, but leave space to invalidate 1/8th that number of buffers.
+	 */
+	const unsigned int	variable_overhead = per_intent +
+							(per_binval / 8);
+
+	xreap_configure_limits(rs, step_size, variable_overhead, per_intent,
+			per_binval);
+
+	trace_xreap_agcow_limits(sc->tp, per_binval, rs->max_binval, step_size,
+			per_intent, rs->max_deferred);
+}
+
 /*
  * Break an AG metadata extent into sub-extents by fate (crosslinked, not
  * crosslinked), and dispose of each sub-extent separately.
@@ -800,15 +878,15 @@ xrep_reap_fsblocks(
 		.sc			= sc,
 		.oinfo			= oinfo,
 		.resv			= XFS_AG_RESV_NONE,
-		.max_binval		= XREAP_MAX_BINVAL,
-		.max_deferred		= XREAP_MAX_DEFER_CHAIN,
 	};
 	int				error;
 
 	ASSERT(xfs_has_rmapbt(sc->mp));
 	ASSERT(sc->ip != NULL);
 
-	if (oinfo != &XFS_RMAP_OINFO_COW)
+	if (oinfo == &XFS_RMAP_OINFO_COW)
+		xreap_configure_agcow_limits(&rs);
+	else
 		xreap_configure_agextent_limits(&rs);
 	error = xfsb_bitmap_walk(bitmap, xreap_fsmeta_extent, &rs);
 	if (error)


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/9] xfs: compute realtime device CoW staging extent reap limits dynamically
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
                   ` (4 preceding siblings ...)
  2025-08-28 14:29 ` [PATCH 5/9] xfs: compute data device CoW staging " Darrick J. Wong
@ 2025-08-28 14:29 ` Darrick J. Wong
  2025-09-02  6:27   ` Christoph Hellwig
  2025-08-28 14:29 ` [PATCH 7/9] xfs: compute file mapping " Darrick J. Wong
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:29 UTC (permalink / raw)
  To: cem, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Calculate the maximum number of CoW staging extents that can be reaped
in a single transaction chain.  The rough calculation here is:

nr_extents = (logres - reservation used by any one step) /
		(space used by intents per extent)

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/trace.h |    1 +
 fs/xfs/scrub/reap.c  |   71 +++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 68 insertions(+), 4 deletions(-)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index a9da22f50534cb..1a994d339c42cf 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -2042,6 +2042,7 @@ DEFINE_EVENT(xrep_reap_limits_class, name, \
 	TP_ARGS(tp, per_binval, max_binval, step_size, per_intent, max_deferred))
 DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_agextent_limits);
 DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_agcow_limits);
+DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_rgcow_limits);
 
 DECLARE_EVENT_CLASS(xrep_reap_find_class,
 	TP_PROTO(const struct xfs_group *xg, xfs_agblock_t agbno,
diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index aaef7e6771a045..b2f089e2c49daa 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -984,7 +984,7 @@ xreap_rgextent_iter(
 	rtbno = xfs_rgbno_to_rtb(sc->sr.rtg, rgbno);
 
 	/*
-	 * If there are other rmappings, this block is cross linked and must
+	 * t1: There are other rmappings; this block is cross linked and must
 	 * not be freed.  Remove the forward and reverse mapping and move on.
 	 */
 	if (crosslinked) {
@@ -999,7 +999,7 @@ xreap_rgextent_iter(
 	trace_xreap_dispose_free_extent(rtg_group(sc->sr.rtg), rgbno, *rglenp);
 
 	/*
-	 * The CoW staging extent is not crosslinked.  Use deferred work items
+	 * t2: The CoW staging extent is not crosslinked.  Use deferred work
 	 * to remove the refcountbt records (which removes the rmap records)
 	 * and free the extent.  We're not worried about the system going down
 	 * here because log recovery walks the refcount btree to clean out the
@@ -1017,6 +1017,69 @@ xreap_rgextent_iter(
 	return 0;
 }
 
+/*
+ * Compute the maximum number of intent items that reaping can attach to the
+ * scrub transaction given the worst case log overhead of the intent items
+ * needed to reap a single CoW staging extent.  This is not for freeing
+ * metadata blocks.
+ */
+STATIC void
+xreap_configure_rgcow_limits(
+	struct xreap_state	*rs)
+{
+	struct xfs_scrub	*sc = rs->sc;
+	struct xfs_mount	*mp = sc->mp;
+
+	/*
+	 * In the worst case, relogging an intent item causes both an intent
+	 * item and a done item to be attached to a transaction for each extent
+	 * that we'd like to process.
+	 */
+	const unsigned int	efi = xfs_efi_log_space(1) +
+				      xfs_efd_log_space(1);
+	const unsigned int	rui = xfs_rui_log_space(1) +
+				      xfs_rud_log_space();
+	const unsigned int	cui = xfs_cui_log_space(1) +
+				      xfs_cud_log_space();
+
+	/*
+	 * Various things can happen when reaping non-CoW metadata blocks:
+	 *
+	 * t1: Unmapping crosslinked CoW blocks: deferred removal of refcount
+	 * record, which defers removal of rmap record
+	 *
+	 * t2: Freeing CoW blocks: deferred removal of refcount record, which
+	 * defers removal of rmap record; and deferred removal of the space
+	 *
+	 * For simplicity, we'll use the worst-case intents size to determine
+	 * the maximum number of deferred extents before we have to finish the
+	 * whole chain.  If we're trying to reap a btree larger than this size,
+	 * a crash midway through reaping can result in leaked blocks.
+	 */
+	const unsigned int	t1 = cui + rui;
+	const unsigned int	t2 = cui + rui + efi;
+	const unsigned int	per_intent = max(t1, t2);
+
+	/*
+	 * For each transaction in a reap chain, we must be able to take one
+	 * step in the defer item chain, which should only consist of CUI, EFI,
+	 * or RUI items.
+	 */
+	const unsigned int	f1 = xfs_calc_finish_rt_efi_reservation(mp, 1);
+	const unsigned int	f2 = xfs_calc_finish_rt_rui_reservation(mp, 1);
+	const unsigned int	f3 = xfs_calc_finish_rt_cui_reservation(mp, 1);
+	const unsigned int	step_size = max3(f1, f2, f3);
+
+	/*
+	 * The only buffer for the rt device is the rtgroup super, so we don't
+	 * need to save space for buffer invalidations.
+	 */
+	xreap_configure_limits(rs, step_size, per_intent, per_intent, 0);
+
+	trace_xreap_rgcow_limits(sc->tp, 0, 0, step_size, per_intent,
+			rs->max_deferred);
+}
+
 #define XREAP_RTGLOCK_ALL	(XFS_RTGLOCK_BITMAP | \
 				 XFS_RTGLOCK_RMAP | \
 				 XFS_RTGLOCK_REFCOUNT)
@@ -1100,14 +1163,14 @@ xrep_reap_rtblocks(
 		.sc			= sc,
 		.oinfo			= oinfo,
 		.resv			= XFS_AG_RESV_NONE,
-		.max_binval		= XREAP_MAX_BINVAL,
-		.max_deferred		= XREAP_MAX_DEFER_CHAIN,
 	};
 	int				error;
 
 	ASSERT(xfs_has_rmapbt(sc->mp));
 	ASSERT(sc->ip != NULL);
+	ASSERT(oinfo == &XFS_RMAP_OINFO_COW);
 
+	xreap_configure_rgcow_limits(&rs);
 	error = xrtb_bitmap_walk(bitmap, xreap_rtmeta_extent, &rs);
 	if (error)
 		return error;


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 7/9] xfs: compute file mapping reap limits dynamically
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
                   ` (5 preceding siblings ...)
  2025-08-28 14:29 ` [PATCH 6/9] xfs: compute realtime " Darrick J. Wong
@ 2025-08-28 14:29 ` Darrick J. Wong
  2025-09-02  6:27   ` Christoph Hellwig
  2025-08-28 14:29 ` [PATCH 8/9] xfs: remove static reap limits Darrick J. Wong
  2025-08-28 14:30 ` [PATCH 9/9] xfs: use deferred reaping for data device cow extents Darrick J. Wong
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:29 UTC (permalink / raw)
  To: cem, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Reaping file fork mappings is a little different -- log recovery can
free the blocks for us, so we only try to process a single mapping at a
time.  Therefore, we only need to figure out the maximum number of
blocks that we can invalidate in a single transaction.

The rough calculation here is:

nr_extents = (logres - reservation used by any one step) /
		(space used per binval)

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/trace.h |    1 
 fs/xfs/scrub/reap.c  |  109 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 105 insertions(+), 5 deletions(-)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 1a994d339c42cf..39ea651cbb7510 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -2043,6 +2043,7 @@ DEFINE_EVENT(xrep_reap_limits_class, name, \
 DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_agextent_limits);
 DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_agcow_limits);
 DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_rgcow_limits);
+DEFINE_REPAIR_REAP_LIMITS_EVENT(xreap_bmapi_limits);
 
 DECLARE_EVENT_CLASS(xrep_reap_find_class,
 	TP_PROTO(const struct xfs_group *xg, xfs_agblock_t agbno,
diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index b2f089e2c49daa..d58fd57aaebb43 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -542,7 +542,7 @@ xreap_configure_limits(
 		return;
 	}
 
-	rs->max_deferred = res / variable_overhead;
+	rs->max_deferred = per_intent ? res / variable_overhead : 0;
 	res -= rs->max_deferred * per_intent;
 	rs->max_binval = per_binval ? res / per_binval : 0;
 }
@@ -1446,7 +1446,7 @@ xrep_reap_bmapi_iter(
 				imap->br_blockcount);
 
 		/*
-		 * Schedule removal of the mapping from the fork.  We use
+		 * t0: Schedule removal of the mapping from the fork.  We use
 		 * deferred log intents in this function to control the exact
 		 * sequence of metadata updates.
 		 */
@@ -1479,8 +1479,8 @@ xrep_reap_bmapi_iter(
 		return error;
 
 	/*
-	 * Schedule removal of the mapping from the fork.  We use deferred log
-	 * intents in this function to control the exact sequence of metadata
+	 * t1: Schedule removal of the mapping from the fork.  We use deferred
+	 * work in this function to control the exact sequence of metadata
 	 * updates.
 	 */
 	xfs_bmap_unmap_extent(sc->tp, rs->ip, rs->whichfork, imap);
@@ -1491,6 +1491,105 @@ xrep_reap_bmapi_iter(
 			XFS_FREE_EXTENT_SKIP_DISCARD);
 }
 
+/* Compute the maximum mapcount of a file buffer. */
+static unsigned int
+xreap_bmapi_binval_mapcount(
+	struct xfs_scrub	*sc)
+{
+	/* directory blocks can span multiple fsblocks and be discontiguous */
+	if (sc->sm->sm_type == XFS_SCRUB_TYPE_DIR)
+		return sc->mp->m_dir_geo->fsbcount;
+
+	/* all other file xattr/symlink blocks must be contiguous */
+	return 1;
+}
+
+/* Compute the maximum block size of a file buffer. */
+static unsigned int
+xreap_bmapi_binval_blocksize(
+	struct xfs_scrub	*sc)
+{
+	switch (sc->sm->sm_type) {
+	case XFS_SCRUB_TYPE_DIR:
+		return sc->mp->m_dir_geo->blksize;
+	case XFS_SCRUB_TYPE_XATTR:
+	case XFS_SCRUB_TYPE_PARENT:
+		/*
+		 * The xattr structure itself consists of single fsblocks, but
+		 * there could be remote xattr blocks to invalidate.
+		 */
+		return XFS_XATTR_SIZE_MAX;
+	}
+
+	/* everything else is a single block */
+	return sc->mp->m_sb.sb_blocksize;
+}
+
+/*
+ * Compute the maximum number of buffer invalidations that we can do while
+ * reaping a single extent from a file fork.
+ */
+STATIC void
+xreap_configure_bmapi_limits(
+	struct xreap_state	*rs)
+{
+	struct xfs_scrub	*sc = rs->sc;
+	struct xfs_mount	*mp = sc->mp;
+
+	/* overhead of invalidating a buffer */
+	const unsigned int	per_binval =
+		xfs_buf_inval_log_space(xreap_bmapi_binval_mapcount(sc),
+					    xreap_bmapi_binval_blocksize(sc));
+
+	/*
+	 * In the worst case, relogging an intent item causes both an intent
+	 * item and a done item to be attached to a transaction for each extent
+	 * that we'd like to process.
+	 */
+	const unsigned int	efi = xfs_efi_log_space(1) +
+				      xfs_efd_log_space(1);
+	const unsigned int	rui = xfs_rui_log_space(1) +
+				      xfs_rud_log_space();
+	const unsigned int	bui = xfs_bui_log_space(1) +
+				      xfs_bud_log_space();
+
+	/*
+	 * t1: Unmapping crosslinked file data blocks: one bmap deletion,
+	 * possibly an EFI for underfilled bmbt blocks, and an rmap deletion.
+	 *
+	 * t2: Freeing freeing file data blocks: one bmap deletion, possibly an
+	 * EFI for underfilled bmbt blocks, and another EFI for the space
+	 * itself.
+	 */
+	const unsigned int	t1 = (bui + efi) + rui;
+	const unsigned int	t2 = (bui + efi) + efi;
+	const unsigned int	per_intent = max(t1, t2);
+
+	/*
+	 * For each transaction in a reap chain, we must be able to take one
+	 * step in the defer item chain, which should only consist of CUI, EFI,
+	 * or RUI items.
+	 */
+	const unsigned int	f1 = xfs_calc_finish_efi_reservation(mp, 1);
+	const unsigned int	f2 = xfs_calc_finish_rui_reservation(mp, 1);
+	const unsigned int	f3 = xfs_calc_finish_bui_reservation(mp, 1);
+	const unsigned int	step_size = max3(f1, f2, f3);
+
+	/*
+	 * Each call to xreap_ifork_extent starts with a clean transaction and
+	 * operates on a single mapping by creating a chain of log intent items
+	 * for that mapping.  We need to leave enough reservation in the
+	 * transaction to log btree buffer and inode updates for each step in
+	 * the chain, and to relog the log intents.
+	 */
+	const unsigned int	per_extent_res = per_intent + step_size;
+
+	xreap_configure_limits(rs, per_extent_res, per_binval, 0, per_binval);
+
+	trace_xreap_bmapi_limits(sc->tp, per_binval, rs->max_binval,
+			step_size, per_intent, 1);
+}
+
 /*
  * Dispose of as much of this file extent as we can.  Upon successful return,
  * the imap will reflect the mapping that was removed from the fork.
@@ -1554,7 +1653,6 @@ xrep_reap_ifork(
 		.sc		= sc,
 		.ip		= ip,
 		.whichfork	= whichfork,
-		.max_binval	= XREAP_MAX_BINVAL,
 	};
 	xfs_fileoff_t		off = 0;
 	int			bmap_flags = xfs_bmapi_aflag(whichfork);
@@ -1564,6 +1662,7 @@ xrep_reap_ifork(
 	ASSERT(ip == sc->ip || ip == sc->tempip);
 	ASSERT(whichfork == XFS_ATTR_FORK || !XFS_IS_REALTIME_INODE(ip));
 
+	xreap_configure_bmapi_limits(&rs);
 	while (off < XFS_MAX_FILEOFF) {
 		struct xfs_bmbt_irec	imap;
 		int			nimaps = 1;


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 8/9] xfs: remove static reap limits
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
                   ` (6 preceding siblings ...)
  2025-08-28 14:29 ` [PATCH 7/9] xfs: compute file mapping " Darrick J. Wong
@ 2025-08-28 14:29 ` Darrick J. Wong
  2025-09-02  6:28   ` Christoph Hellwig
  2025-08-28 14:30 ` [PATCH 9/9] xfs: use deferred reaping for data device cow extents Darrick J. Wong
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:29 UTC (permalink / raw)
  To: cem, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Get rid of two of the static limits, and move the third to the one file
that uses it.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/repair.h |    8 --------
 fs/xfs/scrub/newbt.c  |    7 +++++++
 fs/xfs/scrub/reap.c   |    4 ----
 3 files changed, 7 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index 9c04295742c85f..2bb125c4f9bf2b 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -18,14 +18,6 @@ static inline int xrep_notsupported(struct xfs_scrub *sc)
 
 #ifdef CONFIG_XFS_ONLINE_REPAIR
 
-/*
- * This is the maximum number of deferred extent freeing item extents (EFIs)
- * that we'll attach to a transaction without rolling the transaction to avoid
- * overrunning a tr_itruncate reservation.
- */
-#define XREP_MAX_ITRUNCATE_EFIS	(128)
-
-
 /* Repair helpers */
 
 int xrep_attempt(struct xfs_scrub *sc, struct xchk_stats_run *run);
diff --git a/fs/xfs/scrub/newbt.c b/fs/xfs/scrub/newbt.c
index 1588ce971cb8e1..f00ec278fdc53e 100644
--- a/fs/xfs/scrub/newbt.c
+++ b/fs/xfs/scrub/newbt.c
@@ -27,6 +27,13 @@
 #include "scrub/repair.h"
 #include "scrub/newbt.h"
 
+/*
+ * This is the maximum number of deferred extent freeing item extents (EFIs)
+ * that we'll attach to a transaction without rolling the transaction to avoid
+ * overrunning a tr_itruncate reservation.
+ */
+#define XREP_MAX_ITRUNCATE_EFIS	(128)
+
 /*
  * Estimate proper slack values for a btree that's being reloaded.
  *
diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index d58fd57aaebb43..82910188111dd7 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -171,8 +171,6 @@ static inline bool xreap_is_dirty(const struct xreap_state *rs)
 	return rs->nr_binval > 0 || rs->nr_deferred > 0;
 }
 
-#define XREAP_MAX_BINVAL	(2048)
-
 /*
  * Decide if we need to roll the transaction to clear out the the log
  * reservation that we allocated to buffer invalidations.
@@ -198,8 +196,6 @@ static inline bool xreap_inc_binval(struct xreap_state *rs)
 	return rs->nr_binval < rs->max_binval;
 }
 
-#define XREAP_MAX_DEFER_CHAIN		(2048)
-
 /*
  * Decide if we want to finish the deferred ops that are attached to the scrub
  * transaction.  We don't want to queue huge chains of deferred ops because


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 9/9] xfs: use deferred reaping for data device cow extents
  2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
                   ` (7 preceding siblings ...)
  2025-08-28 14:29 ` [PATCH 8/9] xfs: remove static reap limits Darrick J. Wong
@ 2025-08-28 14:30 ` Darrick J. Wong
  2025-09-02  6:28   ` Christoph Hellwig
  8 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-08-28 14:30 UTC (permalink / raw)
  To: cem, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Don't roll the whole transaction after every extent, that's rather
inefficient.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/reap.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
index 82910188111dd7..07f5bb8a642124 100644
--- a/fs/xfs/scrub/reap.c
+++ b/fs/xfs/scrub/reap.c
@@ -445,7 +445,7 @@ xreap_agextent_iter(
 			 */
 			xfs_refcount_free_cow_extent(sc->tp, false, fsbno,
 					*aglenp);
-			xreap_force_defer_finish(rs);
+			xreap_inc_defer(rs);
 			return 0;
 		}
 
@@ -486,7 +486,7 @@ xreap_agextent_iter(
 		if (error)
 			return error;
 
-		xreap_force_defer_finish(rs);
+		xreap_inc_defer(rs);
 		return 0;
 	}
 


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/9] xfs: prepare reaping code for dynamic limits
  2025-08-28 14:28 ` [PATCH 1/9] xfs: prepare reaping code for dynamic limits Darrick J. Wong
@ 2025-09-02  6:23   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-02  6:23 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/9] xfs: convert the ifork reap code to use xreap_state
  2025-08-28 14:28 ` [PATCH 2/9] xfs: convert the ifork reap code to use xreap_state Darrick J. Wong
@ 2025-09-02  6:24   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-02  6:24 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/9] xfs: use deferred intent items for reaping crosslinked blocks
  2025-08-28 14:28 ` [PATCH 3/9] xfs: use deferred intent items for reaping crosslinked blocks Darrick J. Wong
@ 2025-09-02  6:25   ` Christoph Hellwig
  2025-09-02 22:22     ` Darrick J. Wong
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-02  6:25 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, stable, linux-xfs

On Thu, Aug 28, 2025 at 07:28:38AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> When we're removing rmap records for crosslinked blocks, use deferred
> intent items so that we can try to free/unmap as many of the old data
> structure's blocks as we can in the same transaction as the commit.

Shouldn't this be at the start of the series?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/9] xfs: compute per-AG extent reap limits dynamically
  2025-08-28 14:28 ` [PATCH 4/9] xfs: compute per-AG extent reap limits dynamically Darrick J. Wong
@ 2025-09-02  6:26   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-02  6:26 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 6/9] xfs: compute realtime device CoW staging extent reap limits dynamically
  2025-08-28 14:29 ` [PATCH 6/9] xfs: compute realtime " Darrick J. Wong
@ 2025-09-02  6:27   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-02  6:27 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, linux-xfs

On Thu, Aug 28, 2025 at 07:29:26AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Calculate the maximum number of CoW staging extents that can be reaped
> in a single transaction chain.  The rough calculation here is:
> 
> nr_extents = (logres - reservation used by any one step) /
> 		(space used by intents per extent)

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 7/9] xfs: compute file mapping reap limits dynamically
  2025-08-28 14:29 ` [PATCH 7/9] xfs: compute file mapping " Darrick J. Wong
@ 2025-09-02  6:27   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-02  6:27 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 8/9] xfs: remove static reap limits
  2025-08-28 14:29 ` [PATCH 8/9] xfs: remove static reap limits Darrick J. Wong
@ 2025-09-02  6:28   ` Christoph Hellwig
  2025-09-02 22:32     ` Darrick J. Wong
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-02  6:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, linux-xfs

On Thu, Aug 28, 2025 at 07:29:57AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Get rid of two of the static limits, and move the third to the one file
> that uses it.

Removing the two now unused one is obvious, but could still be stated
here to not puzzle the future git log reader.  But why move the third
one?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 9/9] xfs: use deferred reaping for data device cow extents
  2025-08-28 14:30 ` [PATCH 9/9] xfs: use deferred reaping for data device cow extents Darrick J. Wong
@ 2025-09-02  6:28   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-02  6:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/9] xfs: use deferred intent items for reaping crosslinked blocks
  2025-09-02  6:25   ` Christoph Hellwig
@ 2025-09-02 22:22     ` Darrick J. Wong
  0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2025-09-02 22:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: cem, stable, linux-xfs

On Tue, Sep 02, 2025 at 08:25:45AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 28, 2025 at 07:28:38AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > When we're removing rmap records for crosslinked blocks, use deferred
> > intent items so that we can try to free/unmap as many of the old data
> > structure's blocks as we can in the same transaction as the commit.
> 
> Shouldn't this be at the start of the series?

Yeah, will move it.

> Otherwise looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks!

--D

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 8/9] xfs: remove static reap limits
  2025-09-02  6:28   ` Christoph Hellwig
@ 2025-09-02 22:32     ` Darrick J. Wong
  2025-09-03  6:03       ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2025-09-02 22:32 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: cem, linux-xfs

On Tue, Sep 02, 2025 at 08:28:29AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 28, 2025 at 07:29:57AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Get rid of two of the static limits, and move the third to the one file
> > that uses it.
> 
> Removing the two now unused one is obvious, but could still be stated
> here to not puzzle the future git log reader.  But why move the third
> one?

It's only needed in newbt.c.  What if I change the commit message to:

"xfs: remove static reap limits from repair.h

"Delete XREAP_MAX_BINVAL and XREAP_MAX_DEFER_CHAIN because the reap code
now calculates those limits dynamically, so they're no longer needed.

"Move the third limit (XREP_MAX_ITRUNCATE_EFIS) to the one file that
uses it.  Note that the btree rebuilding code should reserve exactly the
number of blocks needed to rebuild a btree, so it is rare that the newbt
code will need to add any EFIs to the commit transaction.  That's why
that static limit remains."

Would that make it clearer?

--D

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/9] xfs: compute data device CoW staging extent reap limits dynamically
  2025-08-28 14:29 ` [PATCH 5/9] xfs: compute data device CoW staging " Darrick J. Wong
@ 2025-09-03  6:02   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-03  6:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 8/9] xfs: remove static reap limits
  2025-09-02 22:32     ` Darrick J. Wong
@ 2025-09-03  6:03       ` Christoph Hellwig
  2025-09-03  6:27         ` Darrick J. Wong
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2025-09-03  6:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, cem, linux-xfs

On Tue, Sep 02, 2025 at 03:32:03PM -0700, Darrick J. Wong wrote:
> It's only needed in newbt.c.  What if I change the commit message to:
> 
> "xfs: remove static reap limits from repair.h
> 
> "Delete XREAP_MAX_BINVAL and XREAP_MAX_DEFER_CHAIN because the reap code
> now calculates those limits dynamically, so they're no longer needed.
> 
> "Move the third limit (XREP_MAX_ITRUNCATE_EFIS) to the one file that
> uses it.  Note that the btree rebuilding code should reserve exactly the
> number of blocks needed to rebuild a btree, so it is rare that the newbt
> code will need to add any EFIs to the commit transaction.  That's why
> that static limit remains."
> 
> Would that make it clearer?

Yes. With that:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 8/9] xfs: remove static reap limits
  2025-09-03  6:03       ` Christoph Hellwig
@ 2025-09-03  6:27         ` Darrick J. Wong
  0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2025-09-03  6:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: cem, linux-xfs

On Wed, Sep 03, 2025 at 08:03:11AM +0200, Christoph Hellwig wrote:
> On Tue, Sep 02, 2025 at 03:32:03PM -0700, Darrick J. Wong wrote:
> > It's only needed in newbt.c.  What if I change the commit message to:
> > 
> > "xfs: remove static reap limits from repair.h
> > 
> > "Delete XREAP_MAX_BINVAL and XREAP_MAX_DEFER_CHAIN because the reap code
> > now calculates those limits dynamically, so they're no longer needed.
> > 
> > "Move the third limit (XREP_MAX_ITRUNCATE_EFIS) to the one file that
> > uses it.  Note that the btree rebuilding code should reserve exactly the
> > number of blocks needed to rebuild a btree, so it is rare that the newbt
> > code will need to add any EFIs to the commit transaction.  That's why
> > that static limit remains."
> > 
> > Would that make it clearer?
> 
> Yes. With that:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Cool, thank you!

--D

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2025-09-03  6:27 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-28 14:28 [PATCHSET] xfs: improve online repair reap calculations Darrick J. Wong
2025-08-28 14:28 ` [PATCH 1/9] xfs: prepare reaping code for dynamic limits Darrick J. Wong
2025-09-02  6:23   ` Christoph Hellwig
2025-08-28 14:28 ` [PATCH 2/9] xfs: convert the ifork reap code to use xreap_state Darrick J. Wong
2025-09-02  6:24   ` Christoph Hellwig
2025-08-28 14:28 ` [PATCH 3/9] xfs: use deferred intent items for reaping crosslinked blocks Darrick J. Wong
2025-09-02  6:25   ` Christoph Hellwig
2025-09-02 22:22     ` Darrick J. Wong
2025-08-28 14:28 ` [PATCH 4/9] xfs: compute per-AG extent reap limits dynamically Darrick J. Wong
2025-09-02  6:26   ` Christoph Hellwig
2025-08-28 14:29 ` [PATCH 5/9] xfs: compute data device CoW staging " Darrick J. Wong
2025-09-03  6:02   ` Christoph Hellwig
2025-08-28 14:29 ` [PATCH 6/9] xfs: compute realtime " Darrick J. Wong
2025-09-02  6:27   ` Christoph Hellwig
2025-08-28 14:29 ` [PATCH 7/9] xfs: compute file mapping " Darrick J. Wong
2025-09-02  6:27   ` Christoph Hellwig
2025-08-28 14:29 ` [PATCH 8/9] xfs: remove static reap limits Darrick J. Wong
2025-09-02  6:28   ` Christoph Hellwig
2025-09-02 22:32     ` Darrick J. Wong
2025-09-03  6:03       ` Christoph Hellwig
2025-09-03  6:27         ` Darrick J. Wong
2025-08-28 14:30 ` [PATCH 9/9] xfs: use deferred reaping for data device cow extents Darrick J. Wong
2025-09-02  6:28   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).