From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, zlang@redhat.com
Subject: Re: [PATCH 2/2] xfs: recovery should not clear di_flushiter unconditionally
Date: Fri, 10 Nov 2023 11:32:55 -0800 [thread overview]
Message-ID: <20231110193255.GK1205143@frogsfrogsfrogs> (raw)
In-Reply-To: <20231110044500.718022-3-david@fromorbit.com>
On Fri, Nov 10, 2023 at 03:33:14PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Because on v3 inodes, di_flushiter doesn't exist. It overlaps with
> zero padding in the inode, except when NREXT64=1 configurations are
> in use and the zero padding is no longer padding but holds the 64
> bit extent counter.
>
> This manifests obviously on big endian platforms (e.g. s390) because
> the log dinode is in host order and the overlap is the LSBs of the
> extent count field. It is not noticed on little endian machines
> because the overlap is at the MSB end of the extent count field and
> we need to get more than 2^^48 extents in the inode before it
> manifests. i.e. the heat death of the universe will occur before we
> see the problem in little endian machines.
>
> This is a zero-day issue for NREXT64=1 configuraitons on big endian
> machines. Fix it by only clearing di_flushiter on v2 inodes during
> recovery.
>
> Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers")
> cc: stable@kernel.org # 5.19+
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_inode_item_recover.c | 32 +++++++++++++++++---------------
> 1 file changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> index f4c31c2b60d5..dbdab4ce7c44 100644
> --- a/fs/xfs/xfs_inode_item_recover.c
> +++ b/fs/xfs/xfs_inode_item_recover.c
> @@ -371,24 +371,26 @@ xlog_recover_inode_commit_pass2(
> * superblock flag to determine whether we need to look at di_flushiter
> * to skip replay when the on disk inode is newer than the log one
> */
> - if (!xfs_has_v3inodes(mp) &&
> - ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> - /*
> - * Deal with the wrap case, DI_MAX_FLUSH is less
> - * than smaller numbers
> - */
> - if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> - ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> - /* do nothing */
> - } else {
> - trace_xfs_log_recover_inode_skip(log, in_f);
> - error = 0;
> - goto out_release;
> + if (!xfs_has_v3inodes(mp)) {
> + if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> + /*
> + * Deal with the wrap case, DI_MAX_FLUSH is less
> + * than smaller numbers
> + */
> + if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> + ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> + /* do nothing */
> + } else {
> + trace_xfs_log_recover_inode_skip(log, in_f);
> + error = 0;
> + goto out_release;
> + }
> }
> +
> + /* Take the opportunity to reset the flush iteration count */
> + ldip->di_flushiter = 0;
Hmm. Well this fixes the zeroday problem, so thank you for getting the
root of this!
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Though hch did suggest reducing the amount of indenting here by
compressing the if tests together. I can't decide if it's worth
rearranging that old V4 code since none of it's scheduled for removal
until 2030, but it /is/ legacy code that maybe we just don't care to
touch?
<shrug>
--D
> }
>
> - /* Take the opportunity to reset the flush iteration count */
> - ldip->di_flushiter = 0;
>
> if (unlikely(S_ISREG(ldip->di_mode))) {
> if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
> --
> 2.42.0
>
next prev parent reply other threads:[~2023-11-10 19:36 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-10 4:33 [PATCH 0/2] xfs: fix recovery corruption on s390 w/ nrext64 Dave Chinner
2023-11-10 4:33 ` [PATCH 1/2] xfs: inode recovery does not validate the recovered inode Dave Chinner
2023-11-10 19:27 ` Darrick J. Wong
2023-11-10 20:40 ` Dave Chinner
2023-11-17 22:19 ` Darrick J. Wong
2023-11-10 4:33 ` [PATCH 2/2] xfs: recovery should not clear di_flushiter unconditionally Dave Chinner
2023-11-10 19:32 ` Darrick J. Wong [this message]
2023-11-10 20:36 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231110193255.GK1205143@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
--cc=zlang@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox