From: <gregkh@linuxfoundation.org>
To: amir73il@gmail.com,catherine.hoang@oracle.com,chandan.babu@oracle.com,chandanbabu@kernel.org,dchinner@redhat.com,djwong@kernel.org,gregkh@linuxfoundation.org,leah.rumancik@gmail.com,osandov@fb.com,xfs-stable@lists.linux.dev
Cc: <stable-commits@vger.kernel.org>
Subject: Patch "xfs: fix internal error from AGFL exhaustion" has been added to the 6.1-stable tree
Date: Thu, 30 Jan 2025 09:40:59 +0100
Message-ID: <2025013059-approval-crumpled-d672@gregkh>
In-Reply-To: <20250129184717.80816-15-leah.rumancik@gmail.com>
This is a note to let you know that I've just added the patch titled
xfs: fix internal error from AGFL exhaustion
to the 6.1-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
xfs-fix-internal-error-from-agfl-exhaustion.patch
and it can be found in the queue-6.1 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From stable+bounces-111225-greg=kroah.com@vger.kernel.org Wed Jan 29 19:47:57 2025
From: Leah Rumancik <leah.rumancik@gmail.com>
Date: Wed, 29 Jan 2025 10:47:12 -0800
Subject: xfs: fix internal error from AGFL exhaustion
To: stable@vger.kernel.org
Cc: xfs-stable@lists.linux.dev, amir73il@gmail.com, chandan.babu@oracle.com, catherine.hoang@oracle.com, Omar Sandoval <osandov@fb.com>, "Darrick J. Wong" <djwong@kernel.org>, Dave Chinner <dchinner@redhat.com>, Chandan Babu R <chandanbabu@kernel.org>, Leah Rumancik <leah.rumancik@gmail.com>
Message-ID: <20250129184717.80816-15-leah.rumancik@gmail.com>
From: Omar Sandoval <osandov@fb.com>
[ Upstream commit f63a5b3769ad7659da4c0420751d78958ab97675 ]
We've been seeing XFS errors like the following:

    XFS: Internal error i != 1 at line 3526 of file fs/xfs/libxfs/xfs_btree.c. Caller xfs_btree_insert+0x1ec/0x280
    ...
    Call Trace:
     xfs_corruption_error+0x94/0xa0
     xfs_btree_insert+0x221/0x280
     xfs_alloc_fixup_trees+0x104/0x3e0
     xfs_alloc_ag_vextent_size+0x667/0x820
     xfs_alloc_fix_freelist+0x5d9/0x750
     xfs_free_extent_fix_freelist+0x65/0xa0
     __xfs_free_extent+0x57/0x180
    ...

This is the XFS_IS_CORRUPT() check in xfs_btree_insert() when
xfs_btree_insrec() fails.
After converting this into a panic and dissecting the core dump, I found
that xfs_btree_insrec() is failing because it's trying to split a leaf
node in the cntbt when the AG free list is empty. In particular, it's
failing to get a block from the AGFL _while trying to refill the AGFL_.
If a single operation splits every level of the bnobt and the cntbt (and
the rmapbt if it is enabled) at once, the free list will be empty. Then,
when the next operation tries to refill the free list, it allocates
space. If the allocation does not use a full extent, it will need to
insert records for the remaining space in the bnobt and cntbt. And if
those new records go in full leaves, the leaves (and potentially more
nodes up to the old root) need to be split.
Fix it by accounting for the additional splits that may be required to
refill the free list in the calculation for the minimum free list size.
P.S. As far as I can tell, this bug has existed for a long time -- maybe
back to xfs-history commit afdf80ae7405 ("Add XFS_AG_MAXLEVELS macros
...") in April 1994! It requires a very unlucky sequence of events, and
in fact we didn't hit it until a particular sparse mmap workload updated
from 5.12 to 5.19. But this bug existed in 5.12, so it must've been
exposed by some other change in allocation or writeback patterns. It's
also much less likely to be hit with the rmapbt enabled, since that
increases the minimum free list size and is unlikely to split at the
same time as the bnobt and cntbt.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/libxfs/xfs_alloc.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2273,16 +2273,37 @@ xfs_alloc_min_freelist(
 
 	ASSERT(mp->m_alloc_maxlevels > 0);
 
+	/*
+	 * For a btree shorter than the maximum height, the worst case is that
+	 * every level gets split and a new level is added, then while inserting
+	 * another entry to refill the AGFL, every level under the old root gets
+	 * split again. This is:
+	 *
+	 *   (full height split reservation) + (AGFL refill split height)
+	 * = (current height + 1) + (current height - 1)
+	 * = (new height) + (new height - 2)
+	 * = 2 * new height - 2
+	 *
+	 * For a btree of maximum height, the worst case is that every level
+	 * under the root gets split, then while inserting another entry to
+	 * refill the AGFL, every level under the root gets split again. This is
+	 * also:
+	 *
+	 *   2 * (current height - 1)
+	 * = 2 * (new height - 1)
+	 * = 2 * new height - 2
+	 */
+
 	/* space needed by-bno freespace btree */
 	min_free = min_t(unsigned int, levels[XFS_BTNUM_BNOi] + 1,
-				       mp->m_alloc_maxlevels);
+				       mp->m_alloc_maxlevels) * 2 - 2;
 	/* space needed by-size freespace btree */
 	min_free += min_t(unsigned int, levels[XFS_BTNUM_CNTi] + 1,
-				       mp->m_alloc_maxlevels);
+				       mp->m_alloc_maxlevels) * 2 - 2;
 	/* space needed reverse mapping used space btree */
 	if (xfs_has_rmapbt(mp))
 		min_free += min_t(unsigned int, levels[XFS_BTNUM_RMAPi] + 1,
-						mp->m_rmap_maxlevels);
+						mp->m_rmap_maxlevels) * 2 - 2;
 
 	return min_free;
 }
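
For readers who want to check the arithmetic outside the kernel, the
reservation can be sketched as a standalone userspace program. This is only
an illustration of the formula in the comment above, not the kernel code:
the helper names (min_u, agfl_min_blocks_old, agfl_min_blocks_for_tree,
agfl_min_freelist) and the heights passed to them are hypothetical, and the
maximum heights are plain parameters rather than values read from a mount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's min_t(unsigned int, a, b). */
static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Per-btree reservation before the patch: one block per level of the new height. */
unsigned int agfl_min_blocks_old(unsigned int cur_levels, unsigned int max_levels)
{
	assert(cur_levels >= 1 && cur_levels <= max_levels);
	return min_u(cur_levels + 1, max_levels);
}

/*
 * Per-btree reservation after the patch: 2 * new height - 2, where
 * new height = min(current height + 1, maximum height).
 */
unsigned int agfl_min_blocks_for_tree(unsigned int cur_levels, unsigned int max_levels)
{
	assert(cur_levels >= 1 && cur_levels <= max_levels);
	return min_u(cur_levels + 1, max_levels) * 2 - 2;
}

/* Sum over bnobt and cntbt, plus rmapbt only when it is enabled. */
unsigned int agfl_min_freelist(unsigned int bno_levels, unsigned int cnt_levels,
			       unsigned int rmap_levels, bool has_rmapbt,
			       unsigned int alloc_maxlevels,
			       unsigned int rmap_maxlevels)
{
	unsigned int min_free;

	min_free = agfl_min_blocks_for_tree(bno_levels, alloc_maxlevels);
	min_free += agfl_min_blocks_for_tree(cnt_levels, alloc_maxlevels);
	if (has_rmapbt)
		min_free += agfl_min_blocks_for_tree(rmap_levels, rmap_maxlevels);
	return min_free;
}
```

With made-up heights of 2 for both the bnobt and the cntbt, a maximum of 5,
and no rmapbt, the old calculation reserves 3 + 3 = 6 blocks and the new one
reserves 4 + 4 = 8. For a single-level btree both formulas give 2, so the
reservation only grows once a tree has at least two levels.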
Patches currently in stable-queue which might be from leah.rumancik@gmail.com are
queue-6.1/xfs-allow-read-io-and-ficlone-to-run-concurrently.patch
queue-6.1/xfs-hoist-freeing-of-rt-data-fork-extent-mappings.patch
queue-6.1/xfs-make-sure-maxlen-is-still-congruent-with-prod-when-rounding-down.patch
queue-6.1/xfs-only-remap-the-written-blocks-in-xfs_reflink_end_cow_extent.patch
queue-6.1/xfs-dquot-recovery-does-not-validate-the-recovered-dquot.patch
queue-6.1/xfs-clean-up-dqblk-extraction.patch
queue-6.1/xfs-abort-intent-items-when-recovery-intents-fail.patch
queue-6.1/xfs-up-ic_sema-if-flushing-data-device-fails.patch
queue-6.1/xfs-fix-internal-error-from-agfl-exhaustion.patch
queue-6.1/xfs-factor-out-xfs_defer_pending_abort.patch
queue-6.1/xfs-fix-units-conversion-error-in-xfs_bmap_del_extent_delay.patch
queue-6.1/xfs-bump-max-fsgeom-struct-version.patch
queue-6.1/xfs-handle-nimaps-0-from-xfs_bmapi_write-in-xfs_alloc_file_space.patch
queue-6.1/xfs-rt-stubs-should-return-negative-errnos-when-rt-disabled.patch
queue-6.1/xfs-clean-up-fs_xflag_realtime-handling-in-xfs_ioctl_setattr_xflags.patch
queue-6.1/xfs-respect-the-stable-writes-flag-on-the-rt-device.patch
queue-6.1/xfs-introduce-protection-for-drop-nlink.patch
queue-6.1/xfs-prevent-rt-growfs-when-quota-is-enabled.patch
queue-6.1/xfs-inode-recovery-does-not-validate-the-recovered-inode.patch