* [PATCH v2] generic/032: add xfs unwritten extent data corruption reproducer
From: Brian Foster @ 2014-09-22 19:40 UTC
To: fstests; +Cc: xfs
XFS had a data corruption problem where writeback of pages to unwritten
extents would fail to run unwritten extent conversion at I/O completion.
This causes subsequent reads of written, but unconverted regions to
return zeroes. This occurs on sub-page block size filesystems when
writeback contends for the inode lock (e.g., with a file writer).
Add a test that creates the conditions to reproduce the data corruption
and detect it by looking for unwritten extents after all said extents
have been overwritten.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
v2:
- Converted to generic test.
- Use fiemap instead of xfs_bmap.
- Added to rw group.
- Various fixups: init/clean $tmp, loop syntax, redirect output to
$seqres.full, use _scratch_remount.
v1: http://oss.sgi.com/archives/xfs/2014-09/msg00296.html
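
For readers following along, the write pattern used to set up the delalloc
blocks can be sketched on its own. This is only an illustration derived from
the loop in the test below; page_offsets is a hypothetical helper name, and
4k pages are assumed, as in the test itself:

```shell
#!/bin/bash
# Sketch only: page_offsets is a hypothetical helper, not part of xfstests.
# It prints the byte offset of each single-byte write the test issues:
# one write per 4k page (assumed page size) across the first 64k of the
# file, each placed 0xc00 bytes into its page.
page_offsets()
{
	local pgoff
	for pgoff in $(seq 0 $((0x1000)) $((0xf000))); do
		printf '0x%x\n' $((pgoff + 0xc00))
	done
}

page_offsets
```

Each of the 16 offsets touches one block per page, so the later falloc of the
first 64k leaves unwritten extents around already-dirty blocks.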
tests/generic/032 | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++
tests/generic/032.out | 5 +++
tests/generic/group | 1 +
3 files changed, 125 insertions(+)
create mode 100755 tests/generic/032
create mode 100644 tests/generic/032.out
diff --git a/tests/generic/032 b/tests/generic/032
new file mode 100755
index 0000000..deaeba8
--- /dev/null
+++ b/tests/generic/032
@@ -0,0 +1,119 @@
+#! /bin/bash
+# FS QA Test No. 032
+#
+# This test implements a data corruption scenario on XFS filesystems with
+# sub-page sized blocks and unwritten extents. Inode lock contention during
+# writeback of pages to unwritten extents leads to failure to convert those
+# extents on I/O completion. This causes data corruption as unwritten extents
+# are always read back as zeroes.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2014 Red Hat, Inc. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+ cd /
+ kill -9 $syncpid > /dev/null 2>&1
+ wait
+ rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/punch
+
+# real QA test starts here
+rm -f $seqres.full
+
+_syncloop()
+{
+ while [ true ]; do
+ sync
+ done
+}
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+
+_scratch_mkfs >/dev/null 2>&1
+_scratch_mount
+
+# run background sync thread
+_syncloop &
+syncpid=$!
+
+for iters in $(seq 1 100)
+do
+ rm -f $SCRATCH_MNT/file
+
+ # create a delalloc block in each page of the first 64k of the file
+ for pgoff in $(seq 0 0x1000 0xf000); do
+ offset=$((pgoff + 0xc00))
+ $XFS_IO_PROG -f \
+ -c "pwrite $offset 0x1" \
+ $SCRATCH_MNT/file >> $seqres.full 2>&1
+ done
+
+ # preallocate the first 64k and overwrite, writing past 64k to contend
+ # with writeback
+ $XFS_IO_PROG \
+ -c "falloc 0 0x10000" \
+ -c "pwrite 0 0x100000" \
+ -c "fsync" \
+ $SCRATCH_MNT/file >> $seqres.full 2>&1
+
+ # Check for unwritten extents. We should have none since we wrote over
+ # the entire preallocated region and ran fsync.
+ $XFS_IO_PROG \
+ -c "fiemap -v" \
+ $SCRATCH_MNT/file | _filter_fiemap | \
+ grep unwritten >> $seqres.full 2>&1
+ if [ $? == 0 ]; then
+ # data corruption! dump the extent list and break out to dump
+ # the file content
+ $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/file
+ break
+ fi
+done
+
+echo $iters iterations
+
+kill $syncpid
+wait
+
+# clear page cache and dump the file
+_scratch_remount
+hexdump $SCRATCH_MNT/file
+
+_scratch_unmount
+
+status=0
+exit
diff --git a/tests/generic/032.out b/tests/generic/032.out
new file mode 100644
index 0000000..ca5376d
--- /dev/null
+++ b/tests/generic/032.out
@@ -0,0 +1,5 @@
+QA output created by 032
+100 iterations
+0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
+*
+0100000
diff --git a/tests/generic/group b/tests/generic/group
index bdcfd9d..8e0c22a 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -31,6 +31,7 @@
026 acl quick auto
027 auto enospc
028 auto quick
+032 auto quick rw
053 acl repair auto quick
062 attr udf auto quick
068 other auto freeze dangerous stress
--
1.8.3.1
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: [PATCH v2] generic/032: add xfs unwritten extent data corruption reproducer
From: Dave Chinner @ 2014-09-23 1:26 UTC
To: Brian Foster; +Cc: fstests, xfs
On Mon, Sep 22, 2014 at 03:40:36PM -0400, Brian Foster wrote:
> XFS had a data corruption problem where writeback of pages to unwritten
> extents would fail to run unwritten extent conversion at I/O completion.
> This causes subsequent reads of written, but unconverted regions to
> return zeroes. This occurs on sub-page block size filesystems when
> writeback contends for the inode lock (e.g., with a file writer).
>
> Add a test that creates the conditions to reproduce the data corruption
> and detect it by looking for unwritten extents after all said extents
> have been overwritten.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
I still think the error handling for the unwritten extent case is
wrong. Failure debug is exactly what $seqres.full is for, so that's
where failure information should go. If we detect a failure case
and have to abort immediately, then _fail() should be used. And
_fail() leaves a message to look at $seqres.full for details.
So:
> + # Check for unwritten extents. We should have none since we wrote over
> + # the entire preallocated region and ran fsync.
> + $XFS_IO_PROG \
> + -c "fiemap -v" \
> + $SCRATCH_MNT/file | _filter_fiemap | \
> + grep unwritten >> $seqres.full 2>&1
> + if [ $? == 0 ]; then
> + # data corruption! dump the extent list and break out to dump
> + # the file content
> + $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/file
> + break
> + fi
Can simply be:
	$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/file | \
		tee -a $seqres.full | \
		_filter_fiemap | grep unwritten
	[ $? == 0 ] && _fail "Unwritten extents found!"
and will result in the output:
generic/032 0s.... [failed, exit status 1] - output mismatch (see ....results/generic/032.bad)
And results/generic/032.bad will contain:
....
Unwritten extents found!
(see ..../results/generic/032.full for details)
And the complete fiemap output will be in results/generic/032.full.
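
For illustration only, the tee-then-grep pattern above can be tried outside
of xfstests. In this sketch, fake_fiemap and check_extents are hypothetical
names, and the extent lines are made up rather than real fiemap output:

```shell
#!/bin/bash
# Sketch only: fake_fiemap is a hypothetical stand-in for
# `$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/file | _filter_fiemap`.
# The extent lines it prints are invented for the example.
fake_fiemap()
{
	echo "0: [0..127]: data"
	echo "1: [128..255]: $1"
}

# Append the complete extent list to the log file ($2), then report via
# the exit status whether any extent is unwritten. Without pipefail, a
# pipeline's exit status is that of its last command, here grep.
check_extents()
{
	fake_fiemap "$1" | tee -a "$2" | grep unwritten > /dev/null
}

full=$(mktemp)
if check_extents unwritten "$full"; then
	echo "Unwritten extents found!"
fi
echo "complete extent list kept in the log:"
cat "$full"
rm -f "$full"
```

The log receives every extent line whether or not grep matches, which is
what keeps the failure details out of the golden output.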
> +done
> +
> +echo $iters iterations
> +
> +kill $syncpid
> +wait
> +
> +# clear page cache and dump the file
> +_scratch_remount
> +hexdump $SCRATCH_MNT/file
> +
> +_scratch_unmount
No need to unmount. check does that when checking the filesystem,
and if not the next _require_scratch call will do it....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com