Re: [PATCH v2] xfs: test agfl reset on bad list wrapping

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Eryu Guan <guaneryu@gmail.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>,
	linux-xfs@vger.kernel.org, david@fromorbit.com,
	fstests <fstests@vger.kernel.org>
Subject: Re: [PATCH v2] xfs: test agfl reset on bad list wrapping
Date: Wed, 21 Mar 2018 22:45:32 +0800	[thread overview]
Message-ID: <20180321144532.GU30836@localhost.localdomain> (raw)
In-Reply-To: <20180321031746.GF4866@magnolia>

On Tue, Mar 20, 2018 at 08:17:46PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> From the kernel patch that this test examines ("xfs: detect agfl count
> corruption and reset agfl"):
> 
> "The struct xfs_agfl v5 header was originally introduced with
> unexpected padding that caused the AGFL to operate with one less
> slot than intended. The header has since been packed, but the fix
> left an incompatibility for users who upgrade from an old kernel
> with the unpacked header to a newer kernel with the packed header
> while the AGFL happens to wrap around the end. The newer kernel
> recognizes one extra slot at the physical end of the AGFL that the
> previous kernel did not. The new kernel will eventually attempt to
> allocate a block from that slot, which contains invalid data, and
> cause a crash.
> 
> "This condition can be detected by comparing the active range of the
> AGFL to the count. While this detects a padding mismatch, it can
> also trigger false positives for unrelated flcount corruption. Since
> we cannot distinguish a size mismatch due to padding from unrelated
> corruption, we can't trust the AGFL enough to simply repopulate the
> empty slot.
> 
> "Instead, avoid unnecessarily complex detection logic and and use a
> solution that can handle any form of flcount corruption that slips
> through read verifiers: distrust the entire AGFL and reset it to an
> empty state. Any valid blocks within the AGFL are intentionally
> leaked. This requires xfs_repair to rectify (which was already
> necessary based on the state the AGFL was found in). The reset
> mitigates the side effect of the padding mismatch problem from a
> filesystem crash to a free space accounting inconsistency."
> 
> This test exercises the reset code by mutating a fresh filesystem to
> contain an agfl with various list configurations of correctly wrapped,
> incorrectly wrapped, not wrapped, and actually corrupt free lists; then
> checks the success of the reset operation by fragmenting the free space
> btrees to exercise the agfl.  Kernels without this reset fix will shut
> down the filesystem with corruption errors.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  common/rc         |    6 +
>  tests/xfs/709     |  254 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/709.out |   13 +++
>  tests/xfs/group   |    1 
>  4 files changed, 274 insertions(+)
>  create mode 100755 tests/xfs/709
>  create mode 100644 tests/xfs/709.out
> 
> diff --git a/common/rc b/common/rc
> index 2c29d55..8f048f1 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -3440,6 +3440,12 @@ _get_device_size()
>  	grep `_short_dev $1` /proc/partitions | awk '{print $3}'
>  }
>  
> +# check dmesg log for a specific string
> +_check_dmesg_for() {
> +	dmesg | tac | sed -ne "0,\#run fstests $seqnum at $date_time#p" | \
> +		tac | egrep -q "$1"

Hmm, searching dmesg log for a specific test this way requires a
writable /dev/kmsg, we have checked it in 'check', otherwise we won't
write such logs to dmesg. Need a _require_check_dmesg or something?

And it seems this "dmesg | tac ... | tac" sequence can be factored out
to a helper and reused in _check_dmesg too.

Thanks,
Eryu

next prev parent reply	other threads:[~2018-03-21 14:45 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-21  3:17 [PATCH v2] xfs: test agfl reset on bad list wrapping Darrick J. Wong
2018-03-21 12:30 ` Brian Foster
2018-03-21 16:08   ` Darrick J. Wong
2018-03-21 14:45 ` Eryu Guan [this message]
2018-03-21 16:17   ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180321144532.GU30836@localhost.localdomain \
    --to=guaneryu@gmail.com \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=fstests@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.