All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: allow sunit mount option to repair bad primary sb stripe values
Date: Wed, 13 Mar 2024 17:05:16 +1100	[thread overview]
Message-ID: <ZfFCHHWsQ3ucGo3C@dread.disaster.area> (raw)
In-Reply-To: <20240313045634.GK1927156@frogsfrogsfrogs>

On Tue, Mar 12, 2024 at 09:56:34PM -0700, Darrick J. Wong wrote:
> On Wed, Mar 13, 2024 at 10:30:06AM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > If a filesystem has a busted stripe alignment configuration on disk
> > (e.g. because broken RAID firmware told mkfs that swidth was smaller
> > than sunit), then the filesystem will refuse to mount due to the
> > stripe validation failing. This failure is triggering during distro
> > upgrades from old kernels lacking this check to newer kernels with
> > this check, and currently the only way to fix it is with offline
> > xfs_db surgery.
> > 
> > This runtime validity checking occurs when we read the superblock
> > for the first time and causes the mount to fail immediately. This
> > prevents the rewrite of stripe unit/width via
> > mount options that occurs later in the mount process. Hence there is
> > no way to recover this situation without resorting to offline xfs_db
> > rewrite of the values.
> > 
> > However, we parse the mount options long before we read the
> > superblock, and we know if the mount has been asked to re-write the
> > stripe alignment configuration when we are reading the superblock
> > and verifying it for the first time. Hence we can conditionally
> > ignore stripe verification failures if the mount options specified
> > will correct the issue.
> > 
> > We validate that the new stripe unit/width are valid before we
> > overwrite the superblock values, so we can ignore the invalid config
> > at verification and fail the mount later if the new values are not
> > valid. This, at least, gives users the chance of correcting the
> > issue after a kernel upgrade without having to resort to xfs-db
> > hacks.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/libxfs/xfs_sb.c | 40 +++++++++++++++++++++++++++++++---------
> >  fs/xfs/libxfs/xfs_sb.h |  3 ++-
> >  2 files changed, 33 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > index d991eec05436..f51b1efa2cae 100644
> > --- a/fs/xfs/libxfs/xfs_sb.c
> > +++ b/fs/xfs/libxfs/xfs_sb.c
> > @@ -530,7 +530,8 @@ xfs_validate_sb_common(
> >  	}
> >  
> >  	if (!xfs_validate_stripe_geometry(mp, XFS_FSB_TO_B(mp, sbp->sb_unit),
> > -			XFS_FSB_TO_B(mp, sbp->sb_width), 0, false))
> > +			XFS_FSB_TO_B(mp, sbp->sb_width), 0,
> > +			xfs_buf_daddr(bp) == XFS_SB_DADDR, false))
> >  		return -EFSCORRUPTED;
> >  
> >  	/*
> > @@ -1323,8 +1324,10 @@ xfs_sb_get_secondary(
> >  }
> >  
> >  /*
> > - * sunit, swidth, sectorsize(optional with 0) should be all in bytes,
> > - * so users won't be confused by values in error messages.
> > + * sunit, swidth, sectorsize(optional with 0) should be all in bytes, so users
> > + * won't be confused by values in error messages. This returns false if a value
> > + * is invalid and it is not the primary superblock that going to be corrected
> > + * later in the mount process.
> 
> Hmm, I found this last sentence a little confusing.  How about:
> 
> "This function returns false if the stripe geometry is invalid and no
> attempt will be made to correct it later in the mount process."

Sure.

.....
> > @@ -1375,9 +1379,27 @@ xfs_validate_stripe_geometry(
> >  			xfs_notice(mp,
> >  "stripe width (%lld) must be a multiple of the stripe unit (%lld)",
> >  				   swidth, sunit);
> > -		return false;
> > +		goto check_override;
> >  	}
> >  	return true;
> > +
> > +check_override:
> > +	if (!primary_sb)
> > +		return false;
> > +	/*
> > +	 * During mount, mp->m_dalign will not be set unless the sunit mount
> > +	 * option was set. If it was set, ignore the bad stripe alignment values
> > +	 * and allow the validation and overwrite later in the mount process to
> > +	 * attempt to overwrite the bad stripe alignment values with the values
> > +	 * supplied by mount options.
> 
> What catches the case of if m_dalign/m_swidth also being garbage values?
> Is it xfs_check_new_dalign?  Should that fail the mount if the
> replacement values are also garbage?

xfs_validate_new_dalign().

It returns -EINVAL is the new striped alignment cannot be used (i.e.
not aligned to block or ag sizes) and that causes the mount to fail.

> > diff --git a/fs/xfs/libxfs/xfs_sb.h b/fs/xfs/libxfs/xfs_sb.h
> > index 67a40069724c..58798b9c70ba 100644
> > --- a/fs/xfs/libxfs/xfs_sb.h
> > +++ b/fs/xfs/libxfs/xfs_sb.h
> > @@ -35,7 +35,8 @@ extern int	xfs_sb_get_secondary(struct xfs_mount *mp,
> >  				struct xfs_buf **bpp);
> >  
> >  extern bool	xfs_validate_stripe_geometry(struct xfs_mount *mp,
> 
> This declaration might as well lose the extern here too.
> 
> > -		__s64 sunit, __s64 swidth, int sectorsize, bool silent);
> > +		__s64 sunit, __s64 swidth, int sectorsize, bool primary_sb,
> > +		bool silent);
> 
> What should value for @primary_sb should mkfs pass into
> xfs_validate_stripe_geometry from calc_stripe_factors?

Userspace problem, really. i.e. mkfs is already abusing this
function by passing a NULL xfs_mount and so will crash if the
function tries to dereference mp. Hence it can pass false for
"primary_sb" so it doesn't run any of the kernel side primary sb
recovery code that dereferences mp because it doesn't need to
(can't!) run it.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2024-03-13  6:05 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-12 23:30 [PATCH] xfs: allow sunit mount option to repair bad primary sb stripe values Dave Chinner
2024-03-13  4:56 ` Darrick J. Wong
2024-03-13  6:05   ` Dave Chinner [this message]
2024-03-13 15:48     ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZfFCHHWsQ3ucGo3C@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.