linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eryu Guan <guaneryu@gmail.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org, stable@vger.kernel.org,
	"Darrick J. Wong" <darrick.wong@oracle.com>
Subject: Re: [PATCH v2] xfs: detect agfl count corruption and reset agfl
Date: Sun, 25 Mar 2018 14:02:31 +0800	[thread overview]
Message-ID: <20180325060231.GB30836@localhost.localdomain> (raw)
In-Reply-To: <20180323114622.GA20654@bfoster.bfoster>

On Fri, Mar 23, 2018 at 07:46:22AM -0400, Brian Foster wrote:
> On Fri, Mar 23, 2018 at 01:11:38PM +0800, Eryu Guan wrote:
> > Hi Brian,
> > 
> > On Thu, Mar 15, 2018 at 01:45:39PM -0400, Brian Foster wrote:
> > > The struct xfs_agfl v5 header was originally introduced with
> > > unexpected padding that caused the AGFL to operate with one less
> > > slot than intended. The header has since been packed, but the fix
> > > left an incompatibility for users who upgrade from an old kernel
> > > with the unpacked header to a newer kernel with the packed header
> > > while the AGFL happens to wrap around the end. The newer kernel
> > > recognizes one extra slot at the physical end of the AGFL that the
> > > previous kernel did not. The new kernel will eventually attempt to
> > > allocate a block from that slot, which contains invalid data, and
> > > cause a crash.
> > > 
> > > This condition can be detected by comparing the active range of the
> > > AGFL to the count. While this detects a padding mismatch, it can
> > > also trigger false positives for unrelated flcount corruption. Since
> > > we cannot distinguish a size mismatch due to padding from unrelated
> > > corruption, we can't trust the AGFL enough to simply repopulate the
> > > empty slot.
> > > 
> > > Instead, avoid unnecessarily complex detection logic and and use a
> > > solution that can handle any form of flcount corruption that slips
> > > through read verifiers: distrust the entire AGFL and reset it to an
> > > empty state. Any valid blocks within the AGFL are intentionally
> > > leaked. This requires xfs_repair to rectify (which was already
> > > necessary based on the state the AGFL was found in). The reset
> > > mitigates the side effect of the padding mismatch problem from a
> > > filesystem crash to a free space accounting inconsistency. The
> > > generic approach also means that this patch can be safely backported
> > > to kernels with or without a packed struct xfs_agfl.
> > > 
> > > Check the AGF for an invalid freelist count on initial read from
> > > disk. If detected, set a flag on the xfs_perag to indicate that a
> > > reset is required before the AGFL can be used. In the first
> > > transaction that attempts to use a flagged AGFL, reset it to empty,
> > > warn the user about the inconsistency and allow the freelist fixup
> > > code to repopulate the AGFL with new blocks. The xfs_perag flag is
> > > cleared to eliminate the need for repeated checks on each block
> > > allocation operation.
> > > 
> > > Fixes: 96f859d52bcb ("libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct")
> > > CC: <stable@vger.kernel.org>
> > > Suggested-by: Dave Chinner <david@fromorbit.com>
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > Reviewed-by Dave Chiluk <chiluk+linuxxfs@indeed.com>
> > > ---
> > > 
> > > v2:
> > > - Function rename and logic cleanup.
> > > - CC stable.
> > > v1: https://marc.info/?l=linux-xfs&m=152104784709048&w=2
> > > - Use a simplified AGFL reset mechanism.
> > > - Trigger on AGFL fixup rather than get/put.
> > > - Various refactors, cleanups and comments.
> > > rfc: https://marc.info/?l=linux-xfs&m=152045069616434&w=2
> > > 
> > >  fs/xfs/libxfs/xfs_alloc.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++
> > >  fs/xfs/xfs_mount.h        |  1 +
> > >  fs/xfs/xfs_trace.h        |  9 ++++-
> > >  3 files changed, 103 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > index 3db90b707fb2..39387bdd225d 100644
> > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > @@ -2063,6 +2063,93 @@ xfs_alloc_space_available(
> > >  }
> > >  
> > >  /*
> > > + * Check the agfl fields of the agf for inconsistency or corruption. The purpose
> > > + * is to detect an agfl header padding mismatch between current and early v5
> > > + * kernels. This problem manifests as a 1-slot size difference between the
> > > + * on-disk flcount and the active [first, last] range of a wrapped agfl. This
> > > + * may also catch variants of agfl count corruption unrelated to padding. Either
> > > + * way, we'll reset the agfl and warn the user.
> > > + *
> > > + * Return true if a reset is required before the agfl can be used, false
> > > + * otherwise.
> > > + */
> > > +static bool
> > > +xfs_agfl_needs_reset(
> > > +	struct xfs_mount	*mp,
> > > +	struct xfs_agf		*agf)
> > > +{
> > > +	uint32_t		f = be32_to_cpu(agf->agf_flfirst);
> > > +	uint32_t		l = be32_to_cpu(agf->agf_fllast);
> > > +	uint32_t		c = be32_to_cpu(agf->agf_flcount);
> > > +	int			agfl_size = xfs_agfl_size(mp);
> >                                             ^^^^^^^^^^^^^
> > Should this be XFS_AGFL_SIZE(mp)? Otherwise I got compile error (based
> > on 4.15-rc5 kernel):
> > 
> 
> This patch was rebased onto for-next which now includes commit
> a78ee256c325e ("xfs: convert XFS_AGFL_SIZE to a helper function"). That
> commit refactors the macro into a helper function.

Ah, thanks! I did search the list (roughly) to see if there's any patch
introduced xfs_agfl_size but apparently missed this patch.

Thanks,
Eryu

      reply	other threads:[~2018-03-25  6:02 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-15 17:45 [PATCH v2] xfs: detect agfl count corruption and reset agfl Brian Foster
2018-03-15 19:31 ` Darrick J. Wong
2018-03-16 12:00   ` Brian Foster
2018-03-16 15:49     ` Darrick J. Wong
2018-03-23  5:11 ` Eryu Guan
2018-03-23 11:46   ` Brian Foster
2018-03-25  6:02     ` Eryu Guan [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180325060231.GB30836@localhost.localdomain \
    --to=guaneryu@gmail.com \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).