From: Amir Goldstein <amir73il@gmail.com>
To: "Darrick J . Wong" <djwong@kernel.org>
Cc: Leah Rumancik <leah.rumancik@gmail.com>,
Chandan Babu R <chandan.babu@oracle.com>,
linux-xfs@vger.kernel.org, fstests@vger.kernel.org,
Dave Chinner <dchinner@redhat.com>,
Frank Hofmann <fhofmann@cloudflare.com>,
"Darrick J . Wong" <darrick.wong@oracle.com>,
Dave Chinner <david@fromorbit.com>
Subject: [PATCH 5.10 CANDIDATE 6/7] xfs: reorder iunlink remove operation in xfs_ifree
Date: Sun, 28 Aug 2022 15:46:13 +0300
Message-ID: <20220828124614.2190592-7-amir73il@gmail.com>
In-Reply-To: <20220828124614.2190592-1-amir73il@gmail.com>
From: Dave Chinner <dchinner@redhat.com>
commit 9a5280b312e2e7898b6397b2ca3cfd03f67d7be1 upstream.
[backport for 5.10.y]
The O_TMPFILE creation implementation creates a specific order of
operations for inode allocation/freeing and unlinked list
modification. Currently both are serialised by the AGI, so the order
doesn't strictly matter as long as they are both in the same
transaction.
However, if we want to move the unlinked list insertions largely out
from under the AGI lock, then we have to be concerned about the
order in which we do unlinked list modification operations.
O_TMPFILE creation tells us this order is inode allocation/free,
then unlinked list modification.
Change xfs_ifree() to use this same ordering on unlinked list
removal. This way we always guarantee that when we enter the
iunlinked list removal code from this path, we already have the AGI
locked and we don't have to worry about lock nesting AGI reads
inside unlink list locks because it's already locked and attached to
the transaction.
We can do this safely as the inode freeing and unlinked list removal
are done in the same transaction and hence are atomic operations
with respect to log recovery.
Reported-by: Frank Hofmann <fhofmann@cloudflare.com>
Fixes: 298f7bec503f ("xfs: pin inode backing buffer to the inode log item")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
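Note for readers of the backport: the ordering argument above can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions, not kernel code. struct model_trans and the
model_*() helpers are hypothetical names invented for illustration;
the only behaviour they borrow from the patch is that freeing the
inode (xfs_difree) takes the AGI lock, and that unlinked list removal
(xfs_iunlink_remove) would otherwise have to take it itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transaction; NOT the kernel's struct xfs_trans. */
struct model_trans {
	bool agi_locked;                 /* AGI locked and attached to the transaction */
	bool inode_freed;                /* inode returned to the free list */
	bool on_unlinked_list;           /* inode still on the AGI unlinked list */
	bool agi_locked_on_unlink_entry; /* was the AGI held when list removal began? */
};

/* Models xfs_difree(): freeing the inode locks the AGI as a side effect. */
static int model_difree(struct model_trans *tp)
{
	tp->agi_locked = true;
	tp->inode_freed = true;
	return 0;
}

/*
 * Models xfs_iunlink_remove(): record whether the AGI was already held on
 * entry. If it was not, the removal path has to take it here itself.
 */
static int model_iunlink_remove(struct model_trans *tp)
{
	tp->agi_locked_on_unlink_entry = tp->agi_locked;
	tp->agi_locked = true;
	tp->on_unlinked_list = false;
	return 0;
}

/*
 * Models the body of xfs_ifree(). free_first == true is the order after
 * this patch; free_first == false is the order before it.
 */
static struct model_trans model_ifree(bool free_first)
{
	struct model_trans tp = { .on_unlinked_list = true };

	if (free_first) {
		model_difree(&tp);
		model_iunlink_remove(&tp);
	} else {
		model_iunlink_remove(&tp);
		model_difree(&tp);
	}
	return tp;
}
```

Calling model_ifree(true) leaves agi_locked_on_unlink_entry set, which
is the property the patch relies on; model_ifree(false) leaves it
clear. Both orders end with the inode freed and off the unlinked list,
which mirrors why the reorder is safe within a single transaction.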
fs/xfs/xfs_inode.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 1f61e085676b..929ed3bc5619 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2669,14 +2669,13 @@ xfs_ifree_cluster(
}
/*
- * This is called to return an inode to the inode free list.
- * The inode should already be truncated to 0 length and have
- * no pages associated with it. This routine also assumes that
- * the inode is already a part of the transaction.
+ * This is called to return an inode to the inode free list. The inode should
+ * already be truncated to 0 length and have no pages associated with it. This
+ * routine also assumes that the inode is already a part of the transaction.
*
- * The on-disk copy of the inode will have been added to the list
- * of unlinked inodes in the AGI. We need to remove the inode from
- * that list atomically with respect to freeing it here.
+ * The on-disk copy of the inode will have been added to the list of unlinked
+ * inodes in the AGI. We need to remove the inode from that list atomically with
+ * respect to freeing it here.
*/
int
xfs_ifree(
@@ -2694,13 +2693,16 @@ xfs_ifree(
ASSERT(ip->i_d.di_nblocks == 0);
/*
- * Pull the on-disk inode from the AGI unlinked list.
+ * Free the inode first so that we guarantee that the AGI lock is going
+ * to be taken before we remove the inode from the unlinked list. This
+ * makes the AGI lock -> unlinked list modification order the same as
+ * used in O_TMPFILE creation.
*/
- error = xfs_iunlink_remove(tp, ip);
+ error = xfs_difree(tp, ip->i_ino, &xic);
if (error)
return error;
- error = xfs_difree(tp, ip->i_ino, &xic);
+ error = xfs_iunlink_remove(tp, ip);
if (error)
return error;
--
2.25.1
Thread overview: 9+ messages
2022-08-28 12:46 [PATCH 5.10 CANDIDATE 0/7] xfs stable candidate patches for 5.10.y (from v5.18+) Amir Goldstein
2022-08-28 12:46 ` [PATCH 5.10 CANDIDATE 1/7] xfs: remove infinite loop when reserving free block pool Amir Goldstein
2022-08-28 12:46 ` [PATCH 5.10 CANDIDATE 2/7] xfs: always succeed at setting the reserve pool size Amir Goldstein
2022-08-28 12:46 ` [PATCH 5.10 CANDIDATE 3/7] xfs: fix overfilling of reserve pool Amir Goldstein
2022-08-28 12:46 ` [PATCH 5.10 CANDIDATE 4/7] xfs: fix soft lockup via spinning in filestream ag selection loop Amir Goldstein
2022-08-28 12:46 ` [PATCH 5.10 CANDIDATE 5/7] xfs: revert "xfs: actually bump warning counts when we send warnings" Amir Goldstein
2022-08-28 12:46 ` Amir Goldstein [this message]
2022-08-28 12:46 ` [PATCH 5.10 CANDIDATE 7/7] xfs: validate inode fork size against fork format Amir Goldstein
2022-08-29 14:21 ` [PATCH 5.10 CANDIDATE 0/7] xfs stable candidate patches for 5.10.y (from v5.18+) Darrick J. Wong