From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: fstests@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH v2] generic/032: add xfs unwritten extent data corruption reproducer
Date: Tue, 23 Sep 2014 11:26:56 +1000 [thread overview]
Message-ID: <20140923012656.GP4267@dastard> (raw)
In-Reply-To: <1411414836-64598-1-git-send-email-bfoster@redhat.com>
On Mon, Sep 22, 2014 at 03:40:36PM -0400, Brian Foster wrote:
> XFS had a data corruption problem where writeback of pages to unwritten
> extents would fail to run unwritten extent conversion at I/O completion.
> This causes subsequent reads of written, but unconverted regions to
> return zeroes. This occurs on sub-page block size filesystems when
> writeback contends for the inode lock (e.g., with a file writer).
>
> Add a test that creates the conditions to reproduce the data corruption
> and detect it by looking for unwritten extents after all said extents
> have been overwritten.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
I still think the error handling for the unwritten extent case is
wrong. Failure debug is exactly what $seqres.full is for, so that's
where failure information should go. If we detect a failure case
and have to abort immediately, then _fail() should be used. And
_fail() leaves a message to look at $seqres.full for details.
So:
> + # Check for unwritten extents. We should have none since we wrote over
> + # the entire preallocated region and ran fsync.
> + $XFS_IO_PROG \
> + -c "fiemap -v" \
> + $SCRATCH_MNT/file | _filter_fiemap | \
> + grep unwritten >> $seqres.full 2>&1
> + if [ $? == 0 ]; then
> + # data corruption! dump the extent list and break out to dump
> + # the file content
> + $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/file
> + break
> + fi
Can simply be:
$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/file | \
tee -a $seqres.full | \
filter_fiemap | grep unwritten
[ $? == 0 ] && _fail "Unwritten extents found!"
and will result in the output:
generic/032 0s.... [failed, exit status 1] - output mismatch (see ....results/generic/032.bad)
And results/generic/032.bad will contain:
....
Unwritten extents found!
(see ..../results/generic/032.full for details)
And the complete fiemap output will be in results/generic/032.full.
> +done
> +
> +echo $iters iterations
> +
> +kill $syncpid
> +wait
> +
> +# clear page cache and dump the file
> +_scratch_remount
> +hexdump $SCRATCH_MNT/file
> +
> +_scratch_unmount
No need to unmount. check does that when checking the filesystem,
and if not the next _require_scratch call will do it....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: fstests@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH v2] generic/032: add xfs unwritten extent data corruption reproducer
Date: Tue, 23 Sep 2014 11:26:56 +1000 [thread overview]
Message-ID: <20140923012656.GP4267@dastard> (raw)
In-Reply-To: <1411414836-64598-1-git-send-email-bfoster@redhat.com>
On Mon, Sep 22, 2014 at 03:40:36PM -0400, Brian Foster wrote:
> XFS had a data corruption problem where writeback of pages to unwritten
> extents would fail to run unwritten extent conversion at I/O completion.
> This causes subsequent reads of written, but unconverted regions to
> return zeroes. This occurs on sub-page block size filesystems when
> writeback contends for the inode lock (e.g., with a file writer).
>
> Add a test that creates the conditions to reproduce the data corruption
> and detect it by looking for unwritten extents after all said extents
> have been overwritten.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
I still think the error handling for the unwritten extent case is
wrong. Failure debug is exactly what $seqres.full is for, so that's
where failure information should go. If we detect a failure case
and have to abort immediately, then _fail() should be used. And
_fail() leaves a message to look at $seqres.full for details.
So:
> + # Check for unwritten extents. We should have none since we wrote over
> + # the entire preallocated region and ran fsync.
> + $XFS_IO_PROG \
> + -c "fiemap -v" \
> + $SCRATCH_MNT/file | _filter_fiemap | \
> + grep unwritten >> $seqres.full 2>&1
> + if [ $? == 0 ]; then
> + # data corruption! dump the extent list and break out to dump
> + # the file content
> + $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/file
> + break
> + fi
Can simply be:
$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/file | \
tee -a $seqres.full | \
filter_fiemap | grep unwritten
[ $? == 0 ] && _fail "Unwritten extents found!"
and will result in the output:
generic/032 0s.... [failed, exit status 1] - output mismatch (see ....results/generic/032.bad)
And results/generic/032.bad will contain:
....
Unwritten extents found!
(see ..../results/generic/032.full for details)
And the complete fiemap output will be in results/generic/032.full.
> +done
> +
> +echo $iters iterations
> +
> +kill $syncpid
> +wait
> +
> +# clear page cache and dump the file
> +_scratch_remount
> +hexdump $SCRATCH_MNT/file
> +
> +_scratch_unmount
No need to unmount. check does that when checking the filesystem,
and if not the next _require_scratch call will do it....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2014-09-23 1:27 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-22 19:40 [PATCH v2] generic/032: add xfs unwritten extent data corruption reproducer Brian Foster
2014-09-22 19:40 ` Brian Foster
2014-09-23 1:26 ` Dave Chinner [this message]
2014-09-23 1:26 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140923012656.GP4267@dastard \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=fstests@vger.kernel.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.