From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: allow sunit mount option to repair bad primary sb stripe values
Date: Wed, 13 Mar 2024 17:05:16 +1100 [thread overview]
Message-ID: <ZfFCHHWsQ3ucGo3C@dread.disaster.area> (raw)
In-Reply-To: <20240313045634.GK1927156@frogsfrogsfrogs>
On Tue, Mar 12, 2024 at 09:56:34PM -0700, Darrick J. Wong wrote:
> On Wed, Mar 13, 2024 at 10:30:06AM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > If a filesystem has a busted stripe alignment configuration on disk
> > (e.g. because broken RAID firmware told mkfs that swidth was smaller
> > than sunit), then the filesystem will refuse to mount due to the
> > stripe validation failing. This failure is triggering during distro
> > upgrades from old kernels lacking this check to newer kernels with
> > this check, and currently the only way to fix it is with offline
> > xfs_db surgery.
> >
> > This runtime validity checking occurs when we read the superblock
> > for the first time and causes the mount to fail immediately. This
> > prevents the rewrite of stripe unit/width via
> > mount options that occurs later in the mount process. Hence there is
> > no way to recover this situation without resorting to offline xfs_db
> > rewrite of the values.
> >
> > However, we parse the mount options long before we read the
> > superblock, and we know if the mount has been asked to re-write the
> > stripe alignment configuration when we are reading the superblock
> > and verifying it for the first time. Hence we can conditionally
> > ignore stripe verification failures if the mount options specified
> > will correct the issue.
> >
> > We validate that the new stripe unit/width are valid before we
> > overwrite the superblock values, so we can ignore the invalid config
> > at verification and fail the mount later if the new values are not
> > valid. This, at least, gives users the chance of correcting the
> > issue after a kernel upgrade without having to resort to xfs-db
> > hacks.
> >
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> > fs/xfs/libxfs/xfs_sb.c | 40 +++++++++++++++++++++++++++++++---------
> > fs/xfs/libxfs/xfs_sb.h | 3 ++-
> > 2 files changed, 33 insertions(+), 10 deletions(-)
> >
> > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > index d991eec05436..f51b1efa2cae 100644
> > --- a/fs/xfs/libxfs/xfs_sb.c
> > +++ b/fs/xfs/libxfs/xfs_sb.c
> > @@ -530,7 +530,8 @@ xfs_validate_sb_common(
> > }
> >
> > if (!xfs_validate_stripe_geometry(mp, XFS_FSB_TO_B(mp, sbp->sb_unit),
> > - XFS_FSB_TO_B(mp, sbp->sb_width), 0, false))
> > + XFS_FSB_TO_B(mp, sbp->sb_width), 0,
> > + xfs_buf_daddr(bp) == XFS_SB_DADDR, false))
> > return -EFSCORRUPTED;
> >
> > /*
> > @@ -1323,8 +1324,10 @@ xfs_sb_get_secondary(
> > }
> >
> > /*
> > - * sunit, swidth, sectorsize(optional with 0) should be all in bytes,
> > - * so users won't be confused by values in error messages.
> > + * sunit, swidth, sectorsize(optional with 0) should be all in bytes, so users
> > + * won't be confused by values in error messages. This returns false if a value
> > + * is invalid and it is not the primary superblock that going to be corrected
> > + * later in the mount process.
>
> Hmm, I found this last sentence a little confusing. How about:
>
> "This function returns false if the stripe geometry is invalid and no
> attempt will be made to correct it later in the mount process."
Sure.
.....
> > @@ -1375,9 +1379,27 @@ xfs_validate_stripe_geometry(
> > xfs_notice(mp,
> > "stripe width (%lld) must be a multiple of the stripe unit (%lld)",
> > swidth, sunit);
> > - return false;
> > + goto check_override;
> > }
> > return true;
> > +
> > +check_override:
> > + if (!primary_sb)
> > + return false;
> > + /*
> > + * During mount, mp->m_dalign will not be set unless the sunit mount
> > + * option was set. If it was set, ignore the bad stripe alignment values
> > + * and allow the validation and overwrite later in the mount process to
> > + * attempt to overwrite the bad stripe alignment values with the values
> > + * supplied by mount options.
>
> What catches the case of if m_dalign/m_swidth also being garbage values?
> Is it xfs_check_new_dalign? Should that fail the mount if the
> replacement values are also garbage?
xfs_validate_new_dalign().
It returns -EINVAL is the new striped alignment cannot be used (i.e.
not aligned to block or ag sizes) and that causes the mount to fail.
> > diff --git a/fs/xfs/libxfs/xfs_sb.h b/fs/xfs/libxfs/xfs_sb.h
> > index 67a40069724c..58798b9c70ba 100644
> > --- a/fs/xfs/libxfs/xfs_sb.h
> > +++ b/fs/xfs/libxfs/xfs_sb.h
> > @@ -35,7 +35,8 @@ extern int xfs_sb_get_secondary(struct xfs_mount *mp,
> > struct xfs_buf **bpp);
> >
> > extern bool xfs_validate_stripe_geometry(struct xfs_mount *mp,
>
> This declaration might as well lose the extern here too.
>
> > - __s64 sunit, __s64 swidth, int sectorsize, bool silent);
> > + __s64 sunit, __s64 swidth, int sectorsize, bool primary_sb,
> > + bool silent);
>
> What should value for @primary_sb should mkfs pass into
> xfs_validate_stripe_geometry from calc_stripe_factors?
Userspace problem, really. i.e. mkfs is already abusing this
function by passing a NULL xfs_mount and so will crash if the
function tries to dereference mp. Hence it can pass false for
"primary_sb" so it doesn't run any of the kernel side primary sb
recovery code that dereferences mp because it doesn't need to
(can't!) run it.
-Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2024-03-13 6:05 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-12 23:30 [PATCH] xfs: allow sunit mount option to repair bad primary sb stripe values Dave Chinner
2024-03-13 4:56 ` Darrick J. Wong
2024-03-13 6:05 ` Dave Chinner [this message]
2024-03-13 15:48 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZfFCHHWsQ3ucGo3C@dread.disaster.area \
--to=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox