From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, linux-xfs@vger.kernel.org,
"Darrick J. Wong" <djwong@kernel.org>,
Christoph Hellwig <hch@lst.de>,
Catherine Hoang <catherine.hoang@oracle.com>
Subject: [PATCH 6.6 034/124] xfs: use dontcache for grabbing inodes during scrub
Date: Mon, 21 Oct 2024 12:23:58 +0200
Message-ID: <20241021102258.047180264@linuxfoundation.org>
In-Reply-To: <20241021102256.706334758@linuxfoundation.org>
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Darrick J. Wong" <djwong@kernel.org>
commit b27ce0da60a523fc32e3795f96b2de5490642235 upstream.
[backport: resolve conflict due to missing iscan.c]
Back when I wrote commit a03297a0ca9f2, I had thought that we'd be doing
users a favor by only marking inodes dontcache at the end of a scrub
operation, and only if there's only one reference to that inode. This
was more or less true back when I_DONTCACHE was an XFS iflag and the
only thing it did was change the outcome of xfs_fs_drop_inode to 1.
Note: If there are dentries pointing to the inode when scrub finishes,
the inode will have positive i_count and stay around in cache until
dentry reclaim.
But now we have d_mark_dontcache, which causes the inode *and* the
dentries attached to it all to be marked I_DONTCACHE, which means that
we drop the dentries ASAP, which drops the inode ASAP.
This is bad if scrub found problems with the inode, because now it can
be scheduled for inactivation, which can cause inodegc to trip on it and
shut down the filesystem.
Even if the inode isn't bad, this is still suboptimal because phases 3-7
each initiate inode scans. Dropping the inode immediately during phase
3 is silly because phase 5 will reload it and drop it immediately, etc.
It's fine to mark the inodes dontcache, but if there have been accesses
to the file that set up dentries, we should keep them.
I validated this by setting up ftrace to capture xfs_iget_recycle*
tracepoints and running xfs/285 for 30 seconds. With current djwong-wtf I
saw ~30,000 recycle events. I then dropped the d_mark_dontcache calls
and set XFS_IGET_DONTCACHE, and the recycle events dropped to ~5,000 per
30 seconds.
Therefore, grab the inode with XFS_IGET_DONTCACHE, which only has the
effect of setting I_DONTCACHE for cache misses. Remove the
d_mark_dontcache call that can happen in xchk_irele.
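The cache-miss-only behavior described above can be illustrated with a
small userspace model. This is only a sketch, not kernel code: the
toy_* names, the TOY_* flag values, and the fixed-size slot cache are
all hypothetical and exist purely to show that a lookup passing a
DONTCACHE-style flag marks an inode only when it had to be loaded, and
leaves an already-cached inode untouched.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this toy model; NOT the kernel's values. */
#define TOY_IGET_UNTRUSTED	(1u << 0)
#define TOY_IGET_DONTCACHE	(1u << 1)
#define TOY_XCHK_IGET_FLAGS	(TOY_IGET_UNTRUSTED | TOY_IGET_DONTCACHE)

#define TOY_CACHE_SLOTS		8

/* Toy stand-in for an in-core inode; not struct xfs_inode. */
struct toy_inode {
	unsigned long	ino;
	bool		in_cache;
	bool		dontcache;
};

static struct toy_inode toy_cache[TOY_CACHE_SLOTS];

/*
 * Look up an inode in the toy cache.  On a hit, return it unchanged.
 * On a miss, load it and apply the dontcache hint from the flags.
 */
static struct toy_inode *toy_iget(unsigned long ino, unsigned int flags)
{
	struct toy_inode *ip = &toy_cache[ino % TOY_CACHE_SLOTS];

	/* Cache hit: someone else wanted this inode, leave its state alone. */
	if (ip->in_cache && ip->ino == ino)
		return ip;

	/* Cache miss: (re)load the slot and honor the dontcache hint. */
	ip->ino = ino;
	ip->in_cache = true;
	ip->dontcache = (flags & TOY_IGET_DONTCACHE) != 0;
	return ip;
}

/* Exercise the hit and miss paths of the toy model. */
static void toy_demo(void)
{
	/* A regular user loads inode 3 first, without any hint. */
	struct toy_inode *user = toy_iget(3, 0);
	assert(!user->dontcache);

	/* A scrub-style lookup hits the cache and does not mark the inode. */
	struct toy_inode *scrub_hit = toy_iget(3, TOY_XCHK_IGET_FLAGS);
	assert(scrub_hit == user && !scrub_hit->dontcache);

	/* A scrub-style lookup that misses the cache does mark the inode. */
	struct toy_inode *scrub_miss = toy_iget(5, TOY_XCHK_IGET_FLAGS);
	assert(scrub_miss->dontcache);
}
```

In this model, an inode that was already in cache because of earlier
file accesses keeps its state, while an inode pulled in only for the
scrub-style lookup is flagged for early eviction.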
Fixes: a03297a0ca9f2 ("xfs: manage inode DONTCACHE status at irele time")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/scrub/common.c | 12 +++---------
fs/xfs/scrub/scrub.h | 7 +++++++
2 files changed, 10 insertions(+), 9 deletions(-)
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -735,7 +735,7 @@ xchk_iget(
{
ASSERT(sc->tp != NULL);
- return xfs_iget(sc->mp, sc->tp, inum, XFS_IGET_UNTRUSTED, 0, ipp);
+ return xfs_iget(sc->mp, sc->tp, inum, XCHK_IGET_FLAGS, 0, ipp);
}
/*
@@ -786,8 +786,8 @@ again:
if (error)
return error;
- error = xfs_iget(mp, tp, inum,
- XFS_IGET_NORETRY | XFS_IGET_UNTRUSTED, 0, ipp);
+ error = xfs_iget(mp, tp, inum, XFS_IGET_NORETRY | XCHK_IGET_FLAGS, 0,
+ ipp);
if (error == -EAGAIN) {
/*
* The inode may be in core but temporarily unavailable and may
@@ -994,12 +994,6 @@ xchk_irele(
spin_lock(&VFS_I(ip)->i_lock);
VFS_I(ip)->i_state &= ~I_DONTCACHE;
spin_unlock(&VFS_I(ip)->i_lock);
- } else if (atomic_read(&VFS_I(ip)->i_count) == 1) {
- /*
- * If this is the last reference to the inode and the caller
- * permits it, set DONTCACHE to avoid thrashing.
- */
- d_mark_dontcache(VFS_I(ip));
}
xfs_irele(ip);
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -17,6 +17,13 @@ struct xfs_scrub;
#define XCHK_GFP_FLAGS ((__force gfp_t)(GFP_KERNEL | __GFP_NOWARN | \
__GFP_RETRY_MAYFAIL))
+/*
+ * For opening files by handle for fsck operations, we don't trust the inumber
+ * or the allocation state; therefore, perform an untrusted lookup. We don't
+ * want these inodes to pollute the cache, so mark them for immediate removal.
+ */
+#define XCHK_IGET_FLAGS (XFS_IGET_UNTRUSTED | XFS_IGET_DONTCACHE)
+
/* Type info and names for the scrub types. */
enum xchk_type {
ST_NONE = 1, /* disabled */