public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* fix recovery of allocator ops after a growfs
@ 2024-10-14  6:04 Christoph Hellwig
  2024-10-14  6:04 ` [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag Christoph Hellwig
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-14  6:04 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

Hi all,

auditing the perag code for the generic groups feature found an issue
where recovery of an extfree intent without a logged done entry will
fail when the log also contained the transaction that added the AG
to the extent is freed to because the file system geometry in the
superblock is only updated updated and the perag structures are only
created after log recovery has finished.

This version now also ensures the transactions using the new AGs
are not in the same CIL checkpoint as the growfs transaction.

Changes since v1:
 - rename old_agcount to orig_agcount in xlog_do_recover
 - remove a redundant check in xfs_initialize_perag
 - remove xlog_recover_update_agcount and fold it into the only caller
 - add a new patch to update the pag values for the last AG
 - add more detailed comments
 - dropped the changes to ensure post-growfs transactions touching the
   superblock are forced into their own checkpoint for now.  Based on
   reviews this needs a lot more work, and isn't needed to fix the known
   failing test. 

Diffstat:
 libxfs/xfs_ag.c        |   75 ++++++++++++++++++-------------------------------
 libxfs/xfs_ag.h        |   23 +++++++++------
 libxfs/xfs_alloc.c     |    4 +-
 xfs_buf_item_recover.c |   70 +++++++++++++++++++++++++++++++++++++++++++++
 xfs_fsops.c            |   20 +++++--------
 xfs_log_recover.c      |    7 ----
 xfs_mount.c            |    9 ++---
 7 files changed, 129 insertions(+), 79 deletions(-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag
  2024-10-14  6:04 fix recovery of allocator ops after a growfs Christoph Hellwig
@ 2024-10-14  6:04 ` Christoph Hellwig
  2024-10-15 13:11   ` Brian Foster
  2024-10-14  6:04 ` [PATCH 2/6] xfs: merge the perag freeing helpers Christoph Hellwig
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-14  6:04 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

Currently only the new agcount is passed to xfs_initialize_perag, which
requires lookups of existing AGs to skip them and complicates error
handling.  Also pass the previous agcount so that the range that
xfs_initialize_perag operates on is exactly defined.  That way the
extra lookups can be avoided, and error handling can clean up the
exact range from the old count to the last added perag structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_ag.c   | 28 ++++++----------------------
 fs/xfs/libxfs/xfs_ag.h   |  5 +++--
 fs/xfs/xfs_fsops.c       | 18 ++++++++----------
 fs/xfs/xfs_log_recover.c |  5 +++--
 fs/xfs/xfs_mount.c       |  4 ++--
 5 files changed, 22 insertions(+), 38 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index 5f0494702e0b55..464f682eab4690 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -296,27 +296,16 @@ xfs_free_unused_perag_range(
 int
 xfs_initialize_perag(
 	struct xfs_mount	*mp,
-	xfs_agnumber_t		agcount,
+	xfs_agnumber_t		old_agcount,
+	xfs_agnumber_t		new_agcount,
 	xfs_rfsblock_t		dblocks,
 	xfs_agnumber_t		*maxagi)
 {
 	struct xfs_perag	*pag;
 	xfs_agnumber_t		index;
-	xfs_agnumber_t		first_initialised = NULLAGNUMBER;
 	int			error;
 
-	/*
-	 * Walk the current per-ag tree so we don't try to initialise AGs
-	 * that already exist (growfs case). Allocate and insert all the
-	 * AGs we don't find ready for initialisation.
-	 */
-	for (index = 0; index < agcount; index++) {
-		pag = xfs_perag_get(mp, index);
-		if (pag) {
-			xfs_perag_put(pag);
-			continue;
-		}
-
+	for (index = old_agcount; index < new_agcount; index++) {
 		pag = kzalloc(sizeof(*pag), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
 		if (!pag) {
 			error = -ENOMEM;
@@ -353,21 +342,17 @@ xfs_initialize_perag(
 		/* Active ref owned by mount indicates AG is online. */
 		atomic_set(&pag->pag_active_ref, 1);
 
-		/* first new pag is fully initialized */
-		if (first_initialised == NULLAGNUMBER)
-			first_initialised = index;
-
 		/*
 		 * Pre-calculated geometry
 		 */
-		pag->block_count = __xfs_ag_block_count(mp, index, agcount,
+		pag->block_count = __xfs_ag_block_count(mp, index, new_agcount,
 				dblocks);
 		pag->min_block = XFS_AGFL_BLOCK(mp);
 		__xfs_agino_range(mp, pag->block_count, &pag->agino_min,
 				&pag->agino_max);
 	}
 
-	index = xfs_set_inode_alloc(mp, agcount);
+	index = xfs_set_inode_alloc(mp, new_agcount);
 
 	if (maxagi)
 		*maxagi = index;
@@ -381,8 +366,7 @@ xfs_initialize_perag(
 out_free_pag:
 	kfree(pag);
 out_unwind_new_pags:
-	/* unwind any prior newly initialized pags */
-	xfs_free_unused_perag_range(mp, first_initialised, agcount);
+	xfs_free_unused_perag_range(mp, old_agcount, index);
 	return error;
 }
 
diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h
index d9cccd093b60e0..69fc31e7b84728 100644
--- a/fs/xfs/libxfs/xfs_ag.h
+++ b/fs/xfs/libxfs/xfs_ag.h
@@ -146,8 +146,9 @@ __XFS_AG_OPSTATE(agfl_needs_reset, AGFL_NEEDS_RESET)
 
 void xfs_free_unused_perag_range(struct xfs_mount *mp, xfs_agnumber_t agstart,
 			xfs_agnumber_t agend);
-int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t agcount,
-			xfs_rfsblock_t dcount, xfs_agnumber_t *maxagi);
+int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t old_agcount,
+		xfs_agnumber_t agcount, xfs_rfsblock_t dcount,
+		xfs_agnumber_t *maxagi);
 int xfs_initialize_perag_data(struct xfs_mount *mp, xfs_agnumber_t agno);
 void xfs_free_perag(struct xfs_mount *mp);
 
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 3643cc843f6271..de2bf0594cb474 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -87,6 +87,7 @@ xfs_growfs_data_private(
 	struct xfs_mount	*mp,		/* mount point for filesystem */
 	struct xfs_growfs_data	*in)		/* growfs data input struct */
 {
+	xfs_agnumber_t		oagcount = mp->m_sb.sb_agcount;
 	struct xfs_buf		*bp;
 	int			error;
 	xfs_agnumber_t		nagcount;
@@ -94,7 +95,6 @@ xfs_growfs_data_private(
 	xfs_rfsblock_t		nb, nb_div, nb_mod;
 	int64_t			delta;
 	bool			lastag_extended = false;
-	xfs_agnumber_t		oagcount;
 	struct xfs_trans	*tp;
 	struct aghdr_init_data	id = {};
 	struct xfs_perag	*last_pag;
@@ -138,16 +138,14 @@ xfs_growfs_data_private(
 	if (delta == 0)
 		return 0;
 
-	oagcount = mp->m_sb.sb_agcount;
-	/* allocate the new per-ag structures */
-	if (nagcount > oagcount) {
-		error = xfs_initialize_perag(mp, nagcount, nb, &nagimax);
-		if (error)
-			return error;
-	} else if (nagcount < oagcount) {
-		/* TODO: shrinking the entire AGs hasn't yet completed */
+	/* TODO: shrinking the entire AGs hasn't yet completed */
+	if (nagcount < oagcount)
 		return -EINVAL;
-	}
+
+	/* allocate the new per-ag structures */
+	error = xfs_initialize_perag(mp, oagcount, nagcount, nb, &nagimax);
+	if (error)
+		return error;
 
 	if (delta > 0)
 		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata,
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index d6a3ff24c327c3..60d46338f51792 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3346,6 +3346,7 @@ xlog_do_recover(
 	struct xfs_mount	*mp = log->l_mp;
 	struct xfs_buf		*bp = mp->m_sb_bp;
 	struct xfs_sb		*sbp = &mp->m_sb;
+	xfs_agnumber_t		orig_agcount = sbp->sb_agcount;
 	int			error;
 
 	trace_xfs_log_recover(log, head_blk, tail_blk);
@@ -3393,8 +3394,8 @@ xlog_do_recover(
 	/* re-initialise in-core superblock and geometry structures */
 	mp->m_features |= xfs_sb_version_to_features(sbp);
 	xfs_reinit_percpu_counters(mp);
-	error = xfs_initialize_perag(mp, sbp->sb_agcount, sbp->sb_dblocks,
-			&mp->m_maxagi);
+	error = xfs_initialize_perag(mp, orig_agcount, sbp->sb_agcount,
+			sbp->sb_dblocks, &mp->m_maxagi);
 	if (error) {
 		xfs_warn(mp, "Failed post-recovery per-ag init: %d", error);
 		return error;
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 1fdd79c5bfa04e..6fa7239a4a01b6 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -810,8 +810,8 @@ xfs_mountfs(
 	/*
 	 * Allocate and initialize the per-ag data.
 	 */
-	error = xfs_initialize_perag(mp, sbp->sb_agcount, mp->m_sb.sb_dblocks,
-			&mp->m_maxagi);
+	error = xfs_initialize_perag(mp, 0, sbp->sb_agcount,
+			mp->m_sb.sb_dblocks, &mp->m_maxagi);
 	if (error) {
 		xfs_warn(mp, "Failed per-ag init: %d", error);
 		goto out_free_dir;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/6] xfs: merge the perag freeing helpers
  2024-10-14  6:04 fix recovery of allocator ops after a growfs Christoph Hellwig
  2024-10-14  6:04 ` [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag Christoph Hellwig
@ 2024-10-14  6:04 ` Christoph Hellwig
  2024-10-14  6:04 ` [PATCH 3/6] xfs: update the file system geometry after recoverying superblock buffers Christoph Hellwig
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-14  6:04 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

There is no good reason to have two different routines for freeing perag
structures for the unmount and error cases.  Add two arguments to specify
the range of AGs to free to xfs_free_perag, and use that to replace
xfs_free_unused_perag_range.

The addition RCU grace period for the error case is harmless, and the
extra check for the AG to actually exist is not required now that the
callers pass the exact known allocated range.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_ag.c | 40 ++++++++++------------------------------
 fs/xfs/libxfs/xfs_ag.h |  5 ++---
 fs/xfs/xfs_fsops.c     |  2 +-
 fs/xfs/xfs_mount.c     |  5 ++---
 4 files changed, 15 insertions(+), 37 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index 464f682eab4690..8ace2cc200a60e 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -185,17 +185,20 @@ xfs_initialize_perag_data(
 }
 
 /*
- * Free up the per-ag resources associated with the mount structure.
+ * Free up the per-ag resources  within the specified AG range.
  */
 void
-xfs_free_perag(
-	struct xfs_mount	*mp)
+xfs_free_perag_range(
+	struct xfs_mount	*mp,
+	xfs_agnumber_t		first_agno,
+	xfs_agnumber_t		end_agno)
+
 {
-	struct xfs_perag	*pag;
 	xfs_agnumber_t		agno;
 
-	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
-		pag = xa_erase(&mp->m_perags, agno);
+	for (agno = first_agno; agno < end_agno; agno++) {
+		struct xfs_perag	*pag = xa_erase(&mp->m_perags, agno);
+
 		ASSERT(pag);
 		XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0);
 		xfs_defer_drain_free(&pag->pag_intents_drain);
@@ -270,29 +273,6 @@ xfs_agino_range(
 	return __xfs_agino_range(mp, xfs_ag_block_count(mp, agno), first, last);
 }
 
-/*
- * Free perag within the specified AG range, it is only used to free unused
- * perags under the error handling path.
- */
-void
-xfs_free_unused_perag_range(
-	struct xfs_mount	*mp,
-	xfs_agnumber_t		agstart,
-	xfs_agnumber_t		agend)
-{
-	struct xfs_perag	*pag;
-	xfs_agnumber_t		index;
-
-	for (index = agstart; index < agend; index++) {
-		pag = xa_erase(&mp->m_perags, index);
-		if (!pag)
-			break;
-		xfs_buf_cache_destroy(&pag->pag_bcache);
-		xfs_defer_drain_free(&pag->pag_intents_drain);
-		kfree(pag);
-	}
-}
-
 int
 xfs_initialize_perag(
 	struct xfs_mount	*mp,
@@ -366,7 +346,7 @@ xfs_initialize_perag(
 out_free_pag:
 	kfree(pag);
 out_unwind_new_pags:
-	xfs_free_unused_perag_range(mp, old_agcount, index);
+	xfs_free_perag_range(mp, old_agcount, index);
 	return error;
 }
 
diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h
index 69fc31e7b84728..6e68d6a3161a0f 100644
--- a/fs/xfs/libxfs/xfs_ag.h
+++ b/fs/xfs/libxfs/xfs_ag.h
@@ -144,13 +144,12 @@ __XFS_AG_OPSTATE(prefers_metadata, PREFERS_METADATA)
 __XFS_AG_OPSTATE(allows_inodes, ALLOWS_INODES)
 __XFS_AG_OPSTATE(agfl_needs_reset, AGFL_NEEDS_RESET)
 
-void xfs_free_unused_perag_range(struct xfs_mount *mp, xfs_agnumber_t agstart,
-			xfs_agnumber_t agend);
 int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t old_agcount,
 		xfs_agnumber_t agcount, xfs_rfsblock_t dcount,
 		xfs_agnumber_t *maxagi);
+void xfs_free_perag_range(struct xfs_mount *mp, xfs_agnumber_t first_agno,
+		xfs_agnumber_t end_agno);
 int xfs_initialize_perag_data(struct xfs_mount *mp, xfs_agnumber_t agno);
-void xfs_free_perag(struct xfs_mount *mp);
 
 /* Passive AG references */
 struct xfs_perag *xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno);
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index de2bf0594cb474..b247d895c276d2 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -229,7 +229,7 @@ xfs_growfs_data_private(
 	xfs_trans_cancel(tp);
 out_free_unused_perag:
 	if (nagcount > oagcount)
-		xfs_free_unused_perag_range(mp, oagcount, nagcount);
+		xfs_free_perag_range(mp, oagcount, nagcount);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 6fa7239a4a01b6..25bbcc3f4ee08b 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1048,7 +1048,7 @@ xfs_mountfs(
 		xfs_buftarg_drain(mp->m_logdev_targp);
 	xfs_buftarg_drain(mp->m_ddev_targp);
  out_free_perag:
-	xfs_free_perag(mp);
+	xfs_free_perag_range(mp, 0, mp->m_sb.sb_agcount);
  out_free_dir:
 	xfs_da_unmount(mp);
  out_remove_uuid:
@@ -1129,8 +1129,7 @@ xfs_unmountfs(
 	xfs_errortag_clearall(mp);
 #endif
 	shrinker_free(mp->m_inodegc_shrinker);
-	xfs_free_perag(mp);
-
+	xfs_free_perag_range(mp, 0, mp->m_sb.sb_agcount);
 	xfs_errortag_del(mp);
 	xfs_error_sysfs_del(mp);
 	xchk_stats_unregister(mp->m_scrub_stats);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/6] xfs: update the file system geometry after recoverying superblock buffers
  2024-10-14  6:04 fix recovery of allocator ops after a growfs Christoph Hellwig
  2024-10-14  6:04 ` [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag Christoph Hellwig
  2024-10-14  6:04 ` [PATCH 2/6] xfs: merge the perag freeing helpers Christoph Hellwig
@ 2024-10-14  6:04 ` Christoph Hellwig
  2024-10-15 13:11   ` Brian Foster
  2024-10-15 16:19   ` Darrick J. Wong
  2024-10-14  6:04 ` [PATCH 4/6] xfs: error out when a superblock buffer update reduces the agcount Christoph Hellwig
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-14  6:04 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

Primary superblock buffers that change the file system geometry after a
growfs operation can affect the operation of later CIL checkpoints that
make use of the newly added space and allocation groups.

Apply the changes to the in-memory structures as part of recovery pass 2,
to ensure recovery works fine for such cases.

In the future we should apply the logic to other updates such as features
bits as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf_item_recover.c | 52 +++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_log_recover.c      |  8 ------
 2 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index 09e893cf563cb9..edf1162a8c9dd0 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -22,6 +22,9 @@
 #include "xfs_inode.h"
 #include "xfs_dir2.h"
 #include "xfs_quota.h"
+#include "xfs_alloc.h"
+#include "xfs_ag.h"
+#include "xfs_sb.h"
 
 /*
  * This is the number of entries in the l_buf_cancel_table used during
@@ -684,6 +687,49 @@ xlog_recover_do_inode_buffer(
 	return 0;
 }
 
+/*
+ * Update the in-memory superblock and perag structures from the primary SB
+ * buffer.
+ *
+ * This is required because transactions running after growfs may require the
+ * updated values to be set in a previous fully commit transaction.
+ */
+static int
+xlog_recover_do_primary_sb_buffer(
+	struct xfs_mount		*mp,
+	struct xlog_recover_item	*item,
+	struct xfs_buf			*bp,
+	struct xfs_buf_log_format	*buf_f,
+	xfs_lsn_t			current_lsn)
+{
+	struct xfs_dsb			*dsb = bp->b_addr;
+	xfs_agnumber_t			orig_agcount = mp->m_sb.sb_agcount;
+	int				error;
+
+	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
+
+	/*
+	 * Update the in-core super block from the freshly recovered on-disk one.
+	 */
+	xfs_sb_from_disk(&mp->m_sb, dsb);
+
+	/*
+	 * Initialize the new perags, and also update various block and inode
+	 * allocator setting based off the number of AGs or total blocks.
+	 * Because of the latter this also needs to happen if the agcount did
+	 * not change.
+	 */
+	error = xfs_initialize_perag(mp, orig_agcount,
+			mp->m_sb.sb_agcount, mp->m_sb.sb_dblocks,
+			&mp->m_maxagi);
+	if (error) {
+		xfs_warn(mp, "Failed recovery per-ag init: %d", error);
+		return error;
+	}
+	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
+	return 0;
+}
+
 /*
  * V5 filesystems know the age of the buffer on disk being recovered. We can
  * have newer objects on disk than we are replaying, and so for these cases we
@@ -967,6 +1013,12 @@ xlog_recover_buf_commit_pass2(
 		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
 		if (!dirty)
 			goto out_release;
+	} else if ((xfs_blft_from_flags(buf_f) & XFS_BLFT_SB_BUF) &&
+			xfs_buf_daddr(bp) == 0) {
+		error = xlog_recover_do_primary_sb_buffer(mp, item, bp, buf_f,
+				current_lsn);
+		if (error)
+			goto out_release;
 	} else {
 		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
 	}
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 60d46338f51792..08b8938e4efb7d 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3346,7 +3346,6 @@ xlog_do_recover(
 	struct xfs_mount	*mp = log->l_mp;
 	struct xfs_buf		*bp = mp->m_sb_bp;
 	struct xfs_sb		*sbp = &mp->m_sb;
-	xfs_agnumber_t		orig_agcount = sbp->sb_agcount;
 	int			error;
 
 	trace_xfs_log_recover(log, head_blk, tail_blk);
@@ -3394,13 +3393,6 @@ xlog_do_recover(
 	/* re-initialise in-core superblock and geometry structures */
 	mp->m_features |= xfs_sb_version_to_features(sbp);
 	xfs_reinit_percpu_counters(mp);
-	error = xfs_initialize_perag(mp, orig_agcount, sbp->sb_agcount,
-			sbp->sb_dblocks, &mp->m_maxagi);
-	if (error) {
-		xfs_warn(mp, "Failed post-recovery per-ag init: %d", error);
-		return error;
-	}
-	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
 
 	/* Normal transactions can now occur */
 	clear_bit(XLOG_ACTIVE_RECOVERY, &log->l_opstate);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/6] xfs: error out when a superblock buffer update reduces the agcount
  2024-10-14  6:04 fix recovery of allocator ops after a growfs Christoph Hellwig
                   ` (2 preceding siblings ...)
  2024-10-14  6:04 ` [PATCH 3/6] xfs: update the file system geometry after recoverying superblock buffers Christoph Hellwig
@ 2024-10-14  6:04 ` Christoph Hellwig
  2024-10-14  6:04 ` [PATCH 5/6] xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag Christoph Hellwig
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-14  6:04 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

XFS currently does not support reducing the agcount, so error out if
a logged sb buffer tries to shrink the agcount.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_buf_item_recover.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index edf1162a8c9dd0..a839ff5dcaa908 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -713,6 +713,11 @@ xlog_recover_do_primary_sb_buffer(
 	 */
 	xfs_sb_from_disk(&mp->m_sb, dsb);
 
+	if (mp->m_sb.sb_agcount < orig_agcount) {
+		xfs_alert(mp, "Shrinking AG count in log recovery not supported");
+		return -EFSCORRUPTED;
+	}
+
 	/*
 	 * Initialize the new perags, and also update various block and inode
 	 * allocator setting based off the number of AGs or total blocks.
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/6] xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag
  2024-10-14  6:04 fix recovery of allocator ops after a growfs Christoph Hellwig
                   ` (3 preceding siblings ...)
  2024-10-14  6:04 ` [PATCH 4/6] xfs: error out when a superblock buffer update reduces the agcount Christoph Hellwig
@ 2024-10-14  6:04 ` Christoph Hellwig
  2024-10-14  6:04 ` [PATCH 6/6] xfs: update the pag for the last AG at recovery time Christoph Hellwig
  2024-10-26  7:27 ` fix recovery of allocator ops after a growfs Carlos Maiolino
  6 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-14  6:04 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

__GFP_RETRY_MAYFAIL increases the likelyhood of allocations to fail,
which isn't really helpful during log recovery.  Remove the flag and
stick to the default GFP_KERNEL policies.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_ag.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index 8ace2cc200a60e..25cec9dc10c941 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -286,7 +286,7 @@ xfs_initialize_perag(
 	int			error;
 
 	for (index = old_agcount; index < new_agcount; index++) {
-		pag = kzalloc(sizeof(*pag), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
+		pag = kzalloc(sizeof(*pag), GFP_KERNEL);
 		if (!pag) {
 			error = -ENOMEM;
 			goto out_unwind_new_pags;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/6] xfs: update the pag for the last AG at recovery time
  2024-10-14  6:04 fix recovery of allocator ops after a growfs Christoph Hellwig
                   ` (4 preceding siblings ...)
  2024-10-14  6:04 ` [PATCH 5/6] xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag Christoph Hellwig
@ 2024-10-14  6:04 ` Christoph Hellwig
  2024-10-15 13:11   ` Brian Foster
  2024-10-26  7:27 ` fix recovery of allocator ops after a growfs Carlos Maiolino
  6 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-14  6:04 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

Currently log recovery never updates the in-core perag values for the
last allocation group when they were grown by growfs.  This leads to
btree record validation failures for the alloc, ialloc or finotbt
trees if a transaction references this new space.

Found by Brian's new growfs recovery stress test.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_ag.c        | 17 +++++++++++++++++
 fs/xfs/libxfs/xfs_ag.h        |  1 +
 fs/xfs/xfs_buf_item_recover.c | 19 ++++++++++++++++---
 3 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index 25cec9dc10c941..5ca8d01068273d 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -273,6 +273,23 @@ xfs_agino_range(
 	return __xfs_agino_range(mp, xfs_ag_block_count(mp, agno), first, last);
 }
 
+int
+xfs_update_last_ag_size(
+	struct xfs_mount	*mp,
+	xfs_agnumber_t		prev_agcount)
+{
+	struct xfs_perag	*pag = xfs_perag_grab(mp, prev_agcount - 1);
+
+	if (!pag)
+		return -EFSCORRUPTED;
+	pag->block_count = __xfs_ag_block_count(mp, prev_agcount - 1,
+			mp->m_sb.sb_agcount, mp->m_sb.sb_dblocks);
+	__xfs_agino_range(mp, pag->block_count, &pag->agino_min,
+			&pag->agino_max);
+	xfs_perag_rele(pag);
+	return 0;
+}
+
 int
 xfs_initialize_perag(
 	struct xfs_mount	*mp,
diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h
index 6e68d6a3161a0f..9edfe0e9643964 100644
--- a/fs/xfs/libxfs/xfs_ag.h
+++ b/fs/xfs/libxfs/xfs_ag.h
@@ -150,6 +150,7 @@ int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t old_agcount,
 void xfs_free_perag_range(struct xfs_mount *mp, xfs_agnumber_t first_agno,
 		xfs_agnumber_t end_agno);
 int xfs_initialize_perag_data(struct xfs_mount *mp, xfs_agnumber_t agno);
+int xfs_update_last_ag_size(struct xfs_mount *mp, xfs_agnumber_t prev_agcount);
 
 /* Passive AG references */
 struct xfs_perag *xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno);
diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index a839ff5dcaa908..5180cbf5a90b4b 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -708,6 +708,11 @@ xlog_recover_do_primary_sb_buffer(
 
 	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
 
+	if (orig_agcount == 0) {
+		xfs_alert(mp, "Trying to grow file system without AGs");
+		return -EFSCORRUPTED;
+	}
+
 	/*
 	 * Update the in-core super block from the freshly recovered on-disk one.
 	 */
@@ -718,15 +723,23 @@ xlog_recover_do_primary_sb_buffer(
 		return -EFSCORRUPTED;
 	}
 
+	/*
+	 * Growfs can also grow the last existing AG.  In this case we also need
+	 * to update the length in the in-core perag structure and values
+	 * depending on it.
+	 */
+	error = xfs_update_last_ag_size(mp, orig_agcount);
+	if (error)
+		return error;
+
 	/*
 	 * Initialize the new perags, and also update various block and inode
 	 * allocator setting based off the number of AGs or total blocks.
 	 * Because of the latter this also needs to happen if the agcount did
 	 * not change.
 	 */
-	error = xfs_initialize_perag(mp, orig_agcount,
-			mp->m_sb.sb_agcount, mp->m_sb.sb_dblocks,
-			&mp->m_maxagi);
+	error = xfs_initialize_perag(mp, orig_agcount, mp->m_sb.sb_agcount,
+			mp->m_sb.sb_dblocks, &mp->m_maxagi);
 	if (error) {
 		xfs_warn(mp, "Failed recovery per-ag init: %d", error);
 		return error;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag
  2024-10-14  6:04 ` [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag Christoph Hellwig
@ 2024-10-15 13:11   ` Brian Foster
  2024-10-15 13:35     ` Christoph Hellwig
  2024-10-16  6:25     ` Christoph Hellwig
  0 siblings, 2 replies; 15+ messages in thread
From: Brian Foster @ 2024-10-15 13:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Darrick J. Wong, linux-xfs

On Mon, Oct 14, 2024 at 08:04:50AM +0200, Christoph Hellwig wrote:
> Currently only the new agcount is passed to xfs_initialize_perag, which
> requires lookups of existing AGs to skip them and complicates error
> handling.  Also pass the previous agcount so that the range that
> xfs_initialize_perag operates on is exactly defined.  That way the
> extra lookups can be avoided, and error handling can clean up the
> exact range from the old count to the last added perag structure.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/libxfs/xfs_ag.c   | 28 ++++++----------------------
>  fs/xfs/libxfs/xfs_ag.h   |  5 +++--
>  fs/xfs/xfs_fsops.c       | 18 ++++++++----------
>  fs/xfs/xfs_log_recover.c |  5 +++--
>  fs/xfs/xfs_mount.c       |  4 ++--
>  5 files changed, 22 insertions(+), 38 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> index 5f0494702e0b55..464f682eab4690 100644
> --- a/fs/xfs/libxfs/xfs_ag.c
> +++ b/fs/xfs/libxfs/xfs_ag.c
> @@ -296,27 +296,16 @@ xfs_free_unused_perag_range(
>  int
>  xfs_initialize_perag(
>  	struct xfs_mount	*mp,
> -	xfs_agnumber_t		agcount,
> +	xfs_agnumber_t		old_agcount,
> +	xfs_agnumber_t		new_agcount,

What happened to using first/end or whatever terminology here like is
done in one of the later patches? I really find old/new unnecessarily
confusing in this context.

Otherwise looks good:

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  	xfs_rfsblock_t		dblocks,
>  	xfs_agnumber_t		*maxagi)
>  {
>  	struct xfs_perag	*pag;
>  	xfs_agnumber_t		index;
> -	xfs_agnumber_t		first_initialised = NULLAGNUMBER;
>  	int			error;
>  
> -	/*
> -	 * Walk the current per-ag tree so we don't try to initialise AGs
> -	 * that already exist (growfs case). Allocate and insert all the
> -	 * AGs we don't find ready for initialisation.
> -	 */
> -	for (index = 0; index < agcount; index++) {
> -		pag = xfs_perag_get(mp, index);
> -		if (pag) {
> -			xfs_perag_put(pag);
> -			continue;
> -		}
> -
> +	for (index = old_agcount; index < new_agcount; index++) {
>  		pag = kzalloc(sizeof(*pag), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
>  		if (!pag) {
>  			error = -ENOMEM;
> @@ -353,21 +342,17 @@ xfs_initialize_perag(
>  		/* Active ref owned by mount indicates AG is online. */
>  		atomic_set(&pag->pag_active_ref, 1);
>  
> -		/* first new pag is fully initialized */
> -		if (first_initialised == NULLAGNUMBER)
> -			first_initialised = index;
> -
>  		/*
>  		 * Pre-calculated geometry
>  		 */
> -		pag->block_count = __xfs_ag_block_count(mp, index, agcount,
> +		pag->block_count = __xfs_ag_block_count(mp, index, new_agcount,
>  				dblocks);
>  		pag->min_block = XFS_AGFL_BLOCK(mp);
>  		__xfs_agino_range(mp, pag->block_count, &pag->agino_min,
>  				&pag->agino_max);
>  	}
>  
> -	index = xfs_set_inode_alloc(mp, agcount);
> +	index = xfs_set_inode_alloc(mp, new_agcount);
>  
>  	if (maxagi)
>  		*maxagi = index;
> @@ -381,8 +366,7 @@ xfs_initialize_perag(
>  out_free_pag:
>  	kfree(pag);
>  out_unwind_new_pags:
> -	/* unwind any prior newly initialized pags */
> -	xfs_free_unused_perag_range(mp, first_initialised, agcount);
> +	xfs_free_unused_perag_range(mp, old_agcount, index);
>  	return error;
>  }
>  
> diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h
> index d9cccd093b60e0..69fc31e7b84728 100644
> --- a/fs/xfs/libxfs/xfs_ag.h
> +++ b/fs/xfs/libxfs/xfs_ag.h
> @@ -146,8 +146,9 @@ __XFS_AG_OPSTATE(agfl_needs_reset, AGFL_NEEDS_RESET)
>  
>  void xfs_free_unused_perag_range(struct xfs_mount *mp, xfs_agnumber_t agstart,
>  			xfs_agnumber_t agend);
> -int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t agcount,
> -			xfs_rfsblock_t dcount, xfs_agnumber_t *maxagi);
> +int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t old_agcount,
> +		xfs_agnumber_t agcount, xfs_rfsblock_t dcount,
> +		xfs_agnumber_t *maxagi);
>  int xfs_initialize_perag_data(struct xfs_mount *mp, xfs_agnumber_t agno);
>  void xfs_free_perag(struct xfs_mount *mp);
>  
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 3643cc843f6271..de2bf0594cb474 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -87,6 +87,7 @@ xfs_growfs_data_private(
>  	struct xfs_mount	*mp,		/* mount point for filesystem */
>  	struct xfs_growfs_data	*in)		/* growfs data input struct */
>  {
> +	xfs_agnumber_t		oagcount = mp->m_sb.sb_agcount;
>  	struct xfs_buf		*bp;
>  	int			error;
>  	xfs_agnumber_t		nagcount;
> @@ -94,7 +95,6 @@ xfs_growfs_data_private(
>  	xfs_rfsblock_t		nb, nb_div, nb_mod;
>  	int64_t			delta;
>  	bool			lastag_extended = false;
> -	xfs_agnumber_t		oagcount;
>  	struct xfs_trans	*tp;
>  	struct aghdr_init_data	id = {};
>  	struct xfs_perag	*last_pag;
> @@ -138,16 +138,14 @@ xfs_growfs_data_private(
>  	if (delta == 0)
>  		return 0;
>  
> -	oagcount = mp->m_sb.sb_agcount;
> -	/* allocate the new per-ag structures */
> -	if (nagcount > oagcount) {
> -		error = xfs_initialize_perag(mp, nagcount, nb, &nagimax);
> -		if (error)
> -			return error;
> -	} else if (nagcount < oagcount) {
> -		/* TODO: shrinking the entire AGs hasn't yet completed */
> +	/* TODO: shrinking the entire AGs hasn't yet completed */
> +	if (nagcount < oagcount)
>  		return -EINVAL;
> -	}
> +
> +	/* allocate the new per-ag structures */
> +	error = xfs_initialize_perag(mp, oagcount, nagcount, nb, &nagimax);
> +	if (error)
> +		return error;
>  
>  	if (delta > 0)
>  		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata,
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index d6a3ff24c327c3..60d46338f51792 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3346,6 +3346,7 @@ xlog_do_recover(
>  	struct xfs_mount	*mp = log->l_mp;
>  	struct xfs_buf		*bp = mp->m_sb_bp;
>  	struct xfs_sb		*sbp = &mp->m_sb;
> +	xfs_agnumber_t		orig_agcount = sbp->sb_agcount;
>  	int			error;
>  
>  	trace_xfs_log_recover(log, head_blk, tail_blk);
> @@ -3393,8 +3394,8 @@ xlog_do_recover(
>  	/* re-initialise in-core superblock and geometry structures */
>  	mp->m_features |= xfs_sb_version_to_features(sbp);
>  	xfs_reinit_percpu_counters(mp);
> -	error = xfs_initialize_perag(mp, sbp->sb_agcount, sbp->sb_dblocks,
> -			&mp->m_maxagi);
> +	error = xfs_initialize_perag(mp, orig_agcount, sbp->sb_agcount,
> +			sbp->sb_dblocks, &mp->m_maxagi);
>  	if (error) {
>  		xfs_warn(mp, "Failed post-recovery per-ag init: %d", error);
>  		return error;
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 1fdd79c5bfa04e..6fa7239a4a01b6 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -810,8 +810,8 @@ xfs_mountfs(
>  	/*
>  	 * Allocate and initialize the per-ag data.
>  	 */
> -	error = xfs_initialize_perag(mp, sbp->sb_agcount, mp->m_sb.sb_dblocks,
> -			&mp->m_maxagi);
> +	error = xfs_initialize_perag(mp, 0, sbp->sb_agcount,
> +			mp->m_sb.sb_dblocks, &mp->m_maxagi);
>  	if (error) {
>  		xfs_warn(mp, "Failed per-ag init: %d", error);
>  		goto out_free_dir;
> -- 
> 2.45.2
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/6] xfs: update the file system geometry after recoverying superblock buffers
  2024-10-14  6:04 ` [PATCH 3/6] xfs: update the file system geometry after recoverying superblock buffers Christoph Hellwig
@ 2024-10-15 13:11   ` Brian Foster
  2024-10-15 16:19   ` Darrick J. Wong
  1 sibling, 0 replies; 15+ messages in thread
From: Brian Foster @ 2024-10-15 13:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Darrick J. Wong, linux-xfs

On Mon, Oct 14, 2024 at 08:04:52AM +0200, Christoph Hellwig wrote:
> Primary superblock buffers that change the file system geometry after a
> growfs operation can affect the operation of later CIL checkpoints that
> make use of the newly added space and allocation groups.
> 
> Apply the changes to the in-memory structures as part of recovery pass 2,
> to ensure recovery works fine for such cases.
> 
> In the future we should apply the logic to other updates such as features
> bits as well.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_buf_item_recover.c | 52 +++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_log_recover.c      |  8 ------
>  2 files changed, 52 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> index 09e893cf563cb9..edf1162a8c9dd0 100644
> --- a/fs/xfs/xfs_buf_item_recover.c
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -22,6 +22,9 @@
>  #include "xfs_inode.h"
>  #include "xfs_dir2.h"
>  #include "xfs_quota.h"
> +#include "xfs_alloc.h"
> +#include "xfs_ag.h"
> +#include "xfs_sb.h"
>  
>  /*
>   * This is the number of entries in the l_buf_cancel_table used during
> @@ -684,6 +687,49 @@ xlog_recover_do_inode_buffer(
>  	return 0;
>  }
>  
> +/*
> + * Update the in-memory superblock and perag structures from the primary SB
> + * buffer.
> + *
> + * This is required because transactions running after growfs may require the
> + * updated values to be set in a previous fully commit transaction.
> + */
> +static int
> +xlog_recover_do_primary_sb_buffer(
> +	struct xfs_mount		*mp,
> +	struct xlog_recover_item	*item,
> +	struct xfs_buf			*bp,
> +	struct xfs_buf_log_format	*buf_f,
> +	xfs_lsn_t			current_lsn)
> +{
> +	struct xfs_dsb			*dsb = bp->b_addr;
> +	xfs_agnumber_t			orig_agcount = mp->m_sb.sb_agcount;
> +	int				error;
> +
> +	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
> +
> +	/*
> +	 * Update the in-core super block from the freshly recovered on-disk one.
> +	 */
> +	xfs_sb_from_disk(&mp->m_sb, dsb);
> +
> +	/*
> +	 * Initialize the new perags, and also update various block and inode
> +	 * allocator setting based off the number of AGs or total blocks.
> +	 * Because of the latter this also needs to happen if the agcount did
> +	 * not change.
> +	 */
> +	error = xfs_initialize_perag(mp, orig_agcount,
> +			mp->m_sb.sb_agcount, mp->m_sb.sb_dblocks,
> +			&mp->m_maxagi);
> +	if (error) {
> +		xfs_warn(mp, "Failed recovery per-ag init: %d", error);
> +		return error;
> +	}
> +	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
> +	return 0;
> +}
> +
>  /*
>   * V5 filesystems know the age of the buffer on disk being recovered. We can
>   * have newer objects on disk than we are replaying, and so for these cases we
> @@ -967,6 +1013,12 @@ xlog_recover_buf_commit_pass2(
>  		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
>  		if (!dirty)
>  			goto out_release;
> +	} else if ((xfs_blft_from_flags(buf_f) & XFS_BLFT_SB_BUF) &&
> +			xfs_buf_daddr(bp) == 0) {
> +		error = xlog_recover_do_primary_sb_buffer(mp, item, bp, buf_f,
> +				current_lsn);
> +		if (error)
> +			goto out_release;
>  	} else {
>  		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
>  	}
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 60d46338f51792..08b8938e4efb7d 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3346,7 +3346,6 @@ xlog_do_recover(
>  	struct xfs_mount	*mp = log->l_mp;
>  	struct xfs_buf		*bp = mp->m_sb_bp;
>  	struct xfs_sb		*sbp = &mp->m_sb;
> -	xfs_agnumber_t		orig_agcount = sbp->sb_agcount;
>  	int			error;
>  
>  	trace_xfs_log_recover(log, head_blk, tail_blk);
> @@ -3394,13 +3393,6 @@ xlog_do_recover(
>  	/* re-initialise in-core superblock and geometry structures */
>  	mp->m_features |= xfs_sb_version_to_features(sbp);
>  	xfs_reinit_percpu_counters(mp);
> -	error = xfs_initialize_perag(mp, orig_agcount, sbp->sb_agcount,
> -			sbp->sb_dblocks, &mp->m_maxagi);
> -	if (error) {
> -		xfs_warn(mp, "Failed post-recovery per-ag init: %d", error);
> -		return error;
> -	}
> -	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
>  
>  	/* Normal transactions can now occur */
>  	clear_bit(XLOG_ACTIVE_RECOVERY, &log->l_opstate);
> -- 
> 2.45.2
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] xfs: update the pag for the last AG at recovery time
  2024-10-14  6:04 ` [PATCH 6/6] xfs: update the pag for the last AG at recovery time Christoph Hellwig
@ 2024-10-15 13:11   ` Brian Foster
  2024-10-15 13:37     ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Brian Foster @ 2024-10-15 13:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Darrick J. Wong, linux-xfs

On Mon, Oct 14, 2024 at 08:04:55AM +0200, Christoph Hellwig wrote:
> Currently log recovery never updates the in-core perag values for the
> last allocation group when they were grown by growfs.  This leads to
> btree record validation failures for the alloc, ialloc or finotbt
> trees if a transaction references this new space.
> 
> Found by Brian's new growfs recovery stress test.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Thanks for tracking this down. The test now passes here as well. I'll
try to get it polished up and posted soon.

>  fs/xfs/libxfs/xfs_ag.c        | 17 +++++++++++++++++
>  fs/xfs/libxfs/xfs_ag.h        |  1 +
>  fs/xfs/xfs_buf_item_recover.c | 19 ++++++++++++++++---
>  3 files changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> index 25cec9dc10c941..5ca8d01068273d 100644
> --- a/fs/xfs/libxfs/xfs_ag.c
> +++ b/fs/xfs/libxfs/xfs_ag.c
> @@ -273,6 +273,23 @@ xfs_agino_range(
>  	return __xfs_agino_range(mp, xfs_ag_block_count(mp, agno), first, last);
>  }
>  

Comment please. I.e.,

/*
 * Update the perag of the previous tail AG if it has been changed
 * during recovery (i.e. recovery of a growfs).
 */

> +int
> +xfs_update_last_ag_size(
> +	struct xfs_mount	*mp,
> +	xfs_agnumber_t		prev_agcount)
> +{
> +	struct xfs_perag	*pag = xfs_perag_grab(mp, prev_agcount - 1);
> +
> +	if (!pag)
> +		return -EFSCORRUPTED;
> +	pag->block_count = __xfs_ag_block_count(mp, prev_agcount - 1,
> +			mp->m_sb.sb_agcount, mp->m_sb.sb_dblocks);
> +	__xfs_agino_range(mp, pag->block_count, &pag->agino_min,
> +			&pag->agino_max);
> +	xfs_perag_rele(pag);
> +	return 0;
> +}
> +
>  int
>  xfs_initialize_perag(
>  	struct xfs_mount	*mp,
> diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h
> index 6e68d6a3161a0f..9edfe0e9643964 100644
> --- a/fs/xfs/libxfs/xfs_ag.h
> +++ b/fs/xfs/libxfs/xfs_ag.h
> @@ -150,6 +150,7 @@ int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t old_agcount,
>  void xfs_free_perag_range(struct xfs_mount *mp, xfs_agnumber_t first_agno,
>  		xfs_agnumber_t end_agno);
>  int xfs_initialize_perag_data(struct xfs_mount *mp, xfs_agnumber_t agno);
> +int xfs_update_last_ag_size(struct xfs_mount *mp, xfs_agnumber_t prev_agcount);
>  
>  /* Passive AG references */
>  struct xfs_perag *xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno);
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> index a839ff5dcaa908..5180cbf5a90b4b 100644
> --- a/fs/xfs/xfs_buf_item_recover.c
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -708,6 +708,11 @@ xlog_recover_do_primary_sb_buffer(
>  
>  	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
>  
> +	if (orig_agcount == 0) {
> +		xfs_alert(mp, "Trying to grow file system without AGs");
> +		return -EFSCORRUPTED;
> +	}
> +
>  	/*
>  	 * Update the in-core super block from the freshly recovered on-disk one.
>  	 */
> @@ -718,15 +723,23 @@ xlog_recover_do_primary_sb_buffer(
>  		return -EFSCORRUPTED;
>  	}
>  
> +	/*
> +	 * Growfs can also grow the last existing AG.  In this case we also need

It can shrink the last AG as well, FWIW.

> +	 * to update the length in the in-core perag structure and values
> +	 * depending on it.
> +	 */
> +	error = xfs_update_last_ag_size(mp, orig_agcount);
> +	if (error)
> +		return error;
> +
>  	/*
>  	 * Initialize the new perags, and also update various block and inode
>  	 * allocator setting based off the number of AGs or total blocks.
>  	 * Because of the latter this also needs to happen if the agcount did
>  	 * not change.
>  	 */
> -	error = xfs_initialize_perag(mp, orig_agcount,
> -			mp->m_sb.sb_agcount, mp->m_sb.sb_dblocks,
> -			&mp->m_maxagi);
> +	error = xfs_initialize_perag(mp, orig_agcount, mp->m_sb.sb_agcount,
> +			mp->m_sb.sb_dblocks, &mp->m_maxagi);

Seems like this should be folded into an earlier patch?

With the nits addressed:

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  	if (error) {
>  		xfs_warn(mp, "Failed recovery per-ag init: %d", error);
>  		return error;
> -- 
> 2.45.2
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag
  2024-10-15 13:11   ` Brian Foster
@ 2024-10-15 13:35     ` Christoph Hellwig
  2024-10-16  6:25     ` Christoph Hellwig
  1 sibling, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-15 13:35 UTC (permalink / raw)
  To: Brian Foster
  Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Tue, Oct 15, 2024 at 09:11:06AM -0400, Brian Foster wrote:
> >  xfs_initialize_perag(
> >  	struct xfs_mount	*mp,
> > -	xfs_agnumber_t		agcount,
> > +	xfs_agnumber_t		old_agcount,
> > +	xfs_agnumber_t		new_agcount,
> 
> What happened to using first/end or whatever terminology here like is
> done in one of the later patches? I really find old/new unnecessarily
> confusing in this context.

I though you only wanted that for the caller.  I can fix it up.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] xfs: update the pag for the last AG at recovery time
  2024-10-15 13:11   ` Brian Foster
@ 2024-10-15 13:37     ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-15 13:37 UTC (permalink / raw)
  To: Brian Foster
  Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Tue, Oct 15, 2024 at 09:11:41AM -0400, Brian Foster wrote:
> > index 25cec9dc10c941..5ca8d01068273d 100644
> > --- a/fs/xfs/libxfs/xfs_ag.c
> > +++ b/fs/xfs/libxfs/xfs_ag.c
> > @@ -273,6 +273,23 @@ xfs_agino_range(
> >  	return __xfs_agino_range(mp, xfs_ag_block_count(mp, agno), first, last);
> >  }
> >  
> 
> Comment please. I.e.,
> 
> /*
>  * Update the perag of the previous tail AG if it has been changed
>  * during recovery (i.e. recovery of a growfs).
>  */

Sure.

> > +	/*
> > +	 * Growfs can also grow the last existing AG.  In this case we also need
> 
> It can shrink the last AG as well, FWIW.

Indeed, I keep forgetting about the weird special case partial shrink
that we support.

> > -	error = xfs_initialize_perag(mp, orig_agcount,
> > -			mp->m_sb.sb_agcount, mp->m_sb.sb_dblocks,
> > -			&mp->m_maxagi);
> > +	error = xfs_initialize_perag(mp, orig_agcount, mp->m_sb.sb_agcount,
> > +			mp->m_sb.sb_dblocks, &mp->m_maxagi);
> 
> Seems like this should be folded into an earlier patch?

Yes.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/6] xfs: update the file system geometry after recoverying superblock buffers
  2024-10-14  6:04 ` [PATCH 3/6] xfs: update the file system geometry after recoverying superblock buffers Christoph Hellwig
  2024-10-15 13:11   ` Brian Foster
@ 2024-10-15 16:19   ` Darrick J. Wong
  1 sibling, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2024-10-15 16:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Carlos Maiolino, Brian Foster, linux-xfs

On Mon, Oct 14, 2024 at 08:04:52AM +0200, Christoph Hellwig wrote:
> Primary superblock buffers that change the file system geometry after a
> growfs operation can affect the operation of later CIL checkpoints that
> make use of the newly added space and allocation groups.
> 
> Apply the changes to the in-memory structures as part of recovery pass 2,
> to ensure recovery works fine for such cases.
> 
> In the future we should apply the logic to other updates such as features
> bits as well.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good to me, thanks for making the name changes I requested. :)
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_buf_item_recover.c | 52 +++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_log_recover.c      |  8 ------
>  2 files changed, 52 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> index 09e893cf563cb9..edf1162a8c9dd0 100644
> --- a/fs/xfs/xfs_buf_item_recover.c
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -22,6 +22,9 @@
>  #include "xfs_inode.h"
>  #include "xfs_dir2.h"
>  #include "xfs_quota.h"
> +#include "xfs_alloc.h"
> +#include "xfs_ag.h"
> +#include "xfs_sb.h"
>  
>  /*
>   * This is the number of entries in the l_buf_cancel_table used during
> @@ -684,6 +687,49 @@ xlog_recover_do_inode_buffer(
>  	return 0;
>  }
>  
> +/*
> + * Update the in-memory superblock and perag structures from the primary SB
> + * buffer.
> + *
> + * This is required because transactions running after growfs may require the
> + * updated values to be set in a previous fully commit transaction.
> + */
> +static int
> +xlog_recover_do_primary_sb_buffer(
> +	struct xfs_mount		*mp,
> +	struct xlog_recover_item	*item,
> +	struct xfs_buf			*bp,
> +	struct xfs_buf_log_format	*buf_f,
> +	xfs_lsn_t			current_lsn)
> +{
> +	struct xfs_dsb			*dsb = bp->b_addr;
> +	xfs_agnumber_t			orig_agcount = mp->m_sb.sb_agcount;
> +	int				error;
> +
> +	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
> +
> +	/*
> +	 * Update the in-core super block from the freshly recovered on-disk one.
> +	 */
> +	xfs_sb_from_disk(&mp->m_sb, dsb);
> +
> +	/*
> +	 * Initialize the new perags, and also update various block and inode
> +	 * allocator setting based off the number of AGs or total blocks.
> +	 * Because of the latter this also needs to happen if the agcount did
> +	 * not change.
> +	 */
> +	error = xfs_initialize_perag(mp, orig_agcount,
> +			mp->m_sb.sb_agcount, mp->m_sb.sb_dblocks,
> +			&mp->m_maxagi);
> +	if (error) {
> +		xfs_warn(mp, "Failed recovery per-ag init: %d", error);
> +		return error;
> +	}
> +	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
> +	return 0;
> +}
> +
>  /*
>   * V5 filesystems know the age of the buffer on disk being recovered. We can
>   * have newer objects on disk than we are replaying, and so for these cases we
> @@ -967,6 +1013,12 @@ xlog_recover_buf_commit_pass2(
>  		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
>  		if (!dirty)
>  			goto out_release;
> +	} else if ((xfs_blft_from_flags(buf_f) & XFS_BLFT_SB_BUF) &&
> +			xfs_buf_daddr(bp) == 0) {
> +		error = xlog_recover_do_primary_sb_buffer(mp, item, bp, buf_f,
> +				current_lsn);
> +		if (error)
> +			goto out_release;
>  	} else {
>  		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
>  	}
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 60d46338f51792..08b8938e4efb7d 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3346,7 +3346,6 @@ xlog_do_recover(
>  	struct xfs_mount	*mp = log->l_mp;
>  	struct xfs_buf		*bp = mp->m_sb_bp;
>  	struct xfs_sb		*sbp = &mp->m_sb;
> -	xfs_agnumber_t		orig_agcount = sbp->sb_agcount;
>  	int			error;
>  
>  	trace_xfs_log_recover(log, head_blk, tail_blk);
> @@ -3394,13 +3393,6 @@ xlog_do_recover(
>  	/* re-initialise in-core superblock and geometry structures */
>  	mp->m_features |= xfs_sb_version_to_features(sbp);
>  	xfs_reinit_percpu_counters(mp);
> -	error = xfs_initialize_perag(mp, orig_agcount, sbp->sb_agcount,
> -			sbp->sb_dblocks, &mp->m_maxagi);
> -	if (error) {
> -		xfs_warn(mp, "Failed post-recovery per-ag init: %d", error);
> -		return error;
> -	}
> -	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
>  
>  	/* Normal transactions can now occur */
>  	clear_bit(XLOG_ACTIVE_RECOVERY, &log->l_opstate);
> -- 
> 2.45.2
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag
  2024-10-15 13:11   ` Brian Foster
  2024-10-15 13:35     ` Christoph Hellwig
@ 2024-10-16  6:25     ` Christoph Hellwig
  1 sibling, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-16  6:25 UTC (permalink / raw)
  To: Brian Foster
  Cc: Christoph Hellwig, Carlos Maiolino, Darrick J. Wong, linux-xfs

On Tue, Oct 15, 2024 at 09:11:06AM -0400, Brian Foster wrote:
> What happened to using first/end or whatever terminology here like is
> done in one of the later patches? I really find old/new unnecessarily
> confusing in this context.

Looking into it first/end is bad because it is the acount and not
the indices, i.e. each of them are the last index + 1.  So I'll switch
to using orig instead of old as inthe caller here, and keep new.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: fix recovery of allocator ops after a growfs
  2024-10-14  6:04 fix recovery of allocator ops after a growfs Christoph Hellwig
                   ` (5 preceding siblings ...)
  2024-10-14  6:04 ` [PATCH 6/6] xfs: update the pag for the last AG at recovery time Christoph Hellwig
@ 2024-10-26  7:27 ` Carlos Maiolino
  6 siblings, 0 replies; 15+ messages in thread
From: Carlos Maiolino @ 2024-10-26  7:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

On Mon, 14 Oct 2024 08:04:49 +0200, Christoph Hellwig wrote:
> auditing the perag code for the generic groups feature found an issue
> where recovery of an extfree intent without a logged done entry will
> fail when the log also contained the transaction that added the AG
> to the extent is freed to because the file system geometry in the
> superblock is only updated updated and the perag structures are only
> created after log recovery has finished.
> 
> [...]

Applied to for-next, thanks!

[1/6] xfs: pass the exact range to initialize to xfs_initialize_perag
      commit: 82742f8c3f1a93787a05a00aca50c2a565231f84
[2/6] xfs: merge the perag freeing helpers
      commit: aa67ec6a25617e36eba4fb28a88159f500a6cac6
[3/6] xfs: update the file system geometry after recoverying superblock buffers
      commit: 6a18765b54e2e52aebcdb84c3b4f4d1f7cb2c0ca
[4/6] xfs: error out when a superblock buffer update reduces the agcount
      commit: b882b0f8138ffa935834e775953f1630f89bbb62
[5/6] xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag
      commit: 069cf5e32b700f94c6ac60f6171662bdfb04f325
[6/6] xfs: update the pag for the last AG at recovery time
      commit: 4a201dcfa1ff0dcfe4348c40f3ad8bd68b97eb6c

Best regards,
-- 
Carlos Maiolino <cem@kernel.org>


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-10-26  7:27 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-14  6:04 fix recovery of allocator ops after a growfs Christoph Hellwig
2024-10-14  6:04 ` [PATCH 1/6] xfs: pass the exact range to initialize to xfs_initialize_perag Christoph Hellwig
2024-10-15 13:11   ` Brian Foster
2024-10-15 13:35     ` Christoph Hellwig
2024-10-16  6:25     ` Christoph Hellwig
2024-10-14  6:04 ` [PATCH 2/6] xfs: merge the perag freeing helpers Christoph Hellwig
2024-10-14  6:04 ` [PATCH 3/6] xfs: update the file system geometry after recoverying superblock buffers Christoph Hellwig
2024-10-15 13:11   ` Brian Foster
2024-10-15 16:19   ` Darrick J. Wong
2024-10-14  6:04 ` [PATCH 4/6] xfs: error out when a superblock buffer update reduces the agcount Christoph Hellwig
2024-10-14  6:04 ` [PATCH 5/6] xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag Christoph Hellwig
2024-10-14  6:04 ` [PATCH 6/6] xfs: update the pag for the last AG at recovery time Christoph Hellwig
2024-10-15 13:11   ` Brian Foster
2024-10-15 13:37     ` Christoph Hellwig
2024-10-26  7:27 ` fix recovery of allocator ops after a growfs Carlos Maiolino

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox