public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, zlang@redhat.com
Subject: Re: [PATCH 2/2] xfs: recovery should not clear di_flushiter unconditionally
Date: Sat, 11 Nov 2023 07:36:59 +1100	[thread overview]
Message-ID: <ZU6Ua7JLAioBP7PY@dread.disaster.area> (raw)
In-Reply-To: <20231110193255.GK1205143@frogsfrogsfrogs>

On Fri, Nov 10, 2023 at 11:32:55AM -0800, Darrick J. Wong wrote:
> On Fri, Nov 10, 2023 at 03:33:14PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Because on v3 inodes, di_flushiter doesn't exist. It overlaps with
> > zero padding in the inode, except when NREXT64=1 configurations are
> > in use and the zero padding is no longer padding but holds the 64
> > bit extent counter.
> > 
> > This manifests obviously on big endian platforms (e.g. s390) because
> > the log dinode is in host order and the overlap is the LSBs of the
> > extent count field. It is not noticed on little endian machines
> > because the overlap is at the MSB end of the extent count field and
> > we need to get more than 2^^48 extents in the inode before it
> > manifests. i.e. the heat death of the universe will occur before we
> > see the problem in little endian machines.
> > 
> > This is a zero-day issue for NREXT64=1 configuraitons on big endian
> > machines. Fix it by only clearing di_flushiter on v2 inodes during
> > recovery.
> > 
> > Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers")
> > cc: stable@kernel.org # 5.19+
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/xfs_inode_item_recover.c | 32 +++++++++++++++++---------------
> >  1 file changed, 17 insertions(+), 15 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> > index f4c31c2b60d5..dbdab4ce7c44 100644
> > --- a/fs/xfs/xfs_inode_item_recover.c
> > +++ b/fs/xfs/xfs_inode_item_recover.c
> > @@ -371,24 +371,26 @@ xlog_recover_inode_commit_pass2(
> >  	 * superblock flag to determine whether we need to look at di_flushiter
> >  	 * to skip replay when the on disk inode is newer than the log one
> >  	 */
> > -	if (!xfs_has_v3inodes(mp) &&
> > -	    ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> > -		/*
> > -		 * Deal with the wrap case, DI_MAX_FLUSH is less
> > -		 * than smaller numbers
> > -		 */
> > -		if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> > -		    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> > -			/* do nothing */
> > -		} else {
> > -			trace_xfs_log_recover_inode_skip(log, in_f);
> > -			error = 0;
> > -			goto out_release;
> > +	if (!xfs_has_v3inodes(mp)) {
> > +		if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> > +			/*
> > +			 * Deal with the wrap case, DI_MAX_FLUSH is less
> > +			 * than smaller numbers
> > +			 */
> > +			if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> > +			    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> > +				/* do nothing */
> > +			} else {
> > +				trace_xfs_log_recover_inode_skip(log, in_f);
> > +				error = 0;
> > +				goto out_release;
> > +			}
> >  		}
> > +
> > +		/* Take the opportunity to reset the flush iteration count */
> > +		ldip->di_flushiter = 0;
> 
> Hmm.  Well this fixes the zeroday problem, so thank you for getting the
> root of this!
> 
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> 
> Though hch did suggest reducing the amount of indenting here by
> compressing the if tests together.  I can't decide if it's worth
> rearranging that old V4 code since none of it's scheduled for removal
> until 2030, but it /is/ legacy code that maybe we just don't care to
> touch?

I'd prefer not to touch it. It's now all isolated in a
"!xfs_has_v3inodes()" branch, so I think we can largely ignore the
grot for now. If we have to do a larger rework of this code in
future, then we can look at reworking it.

But right now, I really don't feel like risking breaking something
else by doing unnecessary cleanups on code we haven't needed to
touch in over a decade.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

      reply	other threads:[~2023-11-10 20:37 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-10  4:33 [PATCH 0/2] xfs: fix recovery corruption on s390 w/ nrext64 Dave Chinner
2023-11-10  4:33 ` [PATCH 1/2] xfs: inode recovery does not validate the recovered inode Dave Chinner
2023-11-10 19:27   ` Darrick J. Wong
2023-11-10 20:40     ` Dave Chinner
2023-11-17 22:19       ` Darrick J. Wong
2023-11-10  4:33 ` [PATCH 2/2] xfs: recovery should not clear di_flushiter unconditionally Dave Chinner
2023-11-10 19:32   ` Darrick J. Wong
2023-11-10 20:36     ` Dave Chinner [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZU6Ua7JLAioBP7PY@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=zlang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox