* [PATCH] xfs/062: add xfs unwritten extent data corruption reproducer
@ 2014-09-19 18:13 Brian Foster
2014-09-20 23:45 ` Dave Chinner
0 siblings, 1 reply; 3+ messages in thread
From: Brian Foster @ 2014-09-19 18:13 UTC (permalink / raw)
To: fstests; +Cc: xfs
XFS had a data corruption problem where writeback of pages to unwritten
extents would fail to run unwritten extent conversion at I/O completion.
This causes subsequent reads of written, but unconverted regions to
return zeroes. This occurs on sub-page block size filesystems when
writeback contends for the inode lock (e.g., with a file writer).
Add a test that creates the conditions to reproduce the data corruption
and detect it by looking for unwritten extents after all said extents
have been overwritten.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
Hi all,
Here's a test for the data corruption issue I sent a fix for yesterday:
http://oss.sgi.com/archives/xfs/2014-09/msg00259.html
It took a little time to improve the effectiveness of the test, but I've
been able to run it in current form for 90+ iterations without a false
negative (e.g., test passes when it shouldn't) on my setup. It runs in
~30s when there is no failure.
Brian
tests/xfs/062 | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
tests/xfs/062.out | 5 +++
tests/xfs/group | 1 +
3 files changed, 120 insertions(+)
create mode 100755 tests/xfs/062
create mode 100644 tests/xfs/062.out
diff --git a/tests/xfs/062 b/tests/xfs/062
new file mode 100755
index 0000000..61f8cf4
--- /dev/null
+++ b/tests/xfs/062
@@ -0,0 +1,114 @@
+#! /bin/bash
+# FS QA Test No. 062
+#
+# This test implements a data corruption scenario on XFS filesystems with
+# sub-page sized blocks and unwritten extents. Inode lock contention during
+# writeback of pages to unwritten extents leads to failure to convert those
+# extents on I/O completion. This causes data corruption as unwritten extents
+# are always read back as zeroes.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2014 Red Hat, Inc. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+ cd /
+ kill -9 $syncpid > /dev/null 2>&1
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/punch
+
+# real QA test starts here
+
+_syncloop()
+{
+ while [ true ]
+ do
+ sync
+ done
+}
+
+# Modify as appropriate.
+_supported_fs xfs
+_supported_os Linux
+_require_scratch
+_require_xfs_io_command "falloc"
+
+_scratch_mkfs >/dev/null 2>&1
+_scratch_mount
+
+# run background sync thread
+_syncloop &
+syncpid=$!
+
+for iters in $(seq 1 100)
+do
+ rm -f $SCRATCH_MNT/file
+
+ # create a delalloc block in each page of the first 64k of the file
+ for pgoff in $(seq 0 0x1000 0xf000)
+ do
+ offset=$((pgoff + 0xc00))
+ $XFS_IO_PROG -fc "pwrite $offset 0x1" $SCRATCH_MNT/file \
+ > /dev/null 2>&1
+ done
+
+ # preallocate the first 64k and overwite, writing past 64k to contend
+ # with writeback
+ $XFS_IO_PROG \
+ -c "falloc 0 0x10000" \
+ -c "pwrite 0 0x100000" \
+ -c "fsync" \
+ $SCRATCH_MNT/file > /dev/null 2>&1
+
+ # Check for unwritten extents. We should have none since we wrote over
+ # the entire preallocated region and ran fsync.
+ xfs_bmap -v $SCRATCH_MNT/file | _filter_bmap | grep unwritten \
+ > /dev/null 2>&1
+ if [ $? == 0 ]
+ then
+ xfs_bmap -v $SCRATCH_MNT/file
+ break
+ fi
+done
+
+echo $iters iterations
+
+kill $syncpid
+wait
+
+# clear page cache and dump the file
+_scratch_unmount
+_scratch_mount
+hexdump $SCRATCH_MNT/file
+
+_scratch_unmount
+_check_scratch_fs
+
+status=0
+exit
diff --git a/tests/xfs/062.out b/tests/xfs/062.out
new file mode 100644
index 0000000..420f2e4
--- /dev/null
+++ b/tests/xfs/062.out
@@ -0,0 +1,5 @@
+QA output created by 062
+100 iterations
+0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
+*
+0100000
diff --git a/tests/xfs/group b/tests/xfs/group
index 09bce15..685cbe7 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -57,6 +57,7 @@
059 dump ioctl auto quick
060 dump ioctl auto quick
061 dump ioctl auto quick
+062 auto quick
063 dump attr auto quick
064 dump auto
065 dump auto
--
1.8.3.1
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] xfs/062: add xfs unwritten extent data corruption reproducer
2014-09-19 18:13 [PATCH] xfs/062: add xfs unwritten extent data corruption reproducer Brian Foster
@ 2014-09-20 23:45 ` Dave Chinner
2014-09-22 19:40 ` Brian Foster
0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2014-09-20 23:45 UTC (permalink / raw)
To: Brian Foster; +Cc: fstests, xfs
On Fri, Sep 19, 2014 at 02:13:05PM -0400, Brian Foster wrote:
> XFS had a data corruption problem where writeback of pages to unwritten
> extents would fail to run unwritten extent conversion at I/O completion.
> This causes subsequent reads of written, but unconverted regions to
> return zeroes. This occurs on sub-page block size filesystems when
> writeback contends for the inode lock (e.g., with a file writer).
>
> Add a test that creates the conditions to reproduce the data corruption
> and detect it by looking for unwritten extents after all said extents
> have been overwritten.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>
> Hi all,
>
> Here's a test for the data corruption issue I sent a fix for yesterday:
>
> http://oss.sgi.com/archives/xfs/2014-09/msg00259.html
>
> It took a little time to improve the effectiveness of the test, but I've
> been able to run it in current form for 90+ iterations without a false
> negative (e.g., test passes when it shouldn't) on my setup. It runs in
> ~30s when there is no failure.
>
> Brian
>
> tests/xfs/062 | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> tests/xfs/062.out | 5 +++
> tests/xfs/group | 1 +
> 3 files changed, 120 insertions(+)
> create mode 100755 tests/xfs/062
> create mode 100644 tests/xfs/062.out
There's nothing XFS specific about this test. Yes, it exposes a bug
on XFS, but every filesystem should be able to run and pass this
test.
>
> diff --git a/tests/xfs/062 b/tests/xfs/062
> new file mode 100755
> index 0000000..61f8cf4
> --- /dev/null
> +++ b/tests/xfs/062
> @@ -0,0 +1,114 @@
> +#! /bin/bash
> +# FS QA Test No. 062
> +#
> +# This test implements a data corruption scenario on XFS filesystems with
> +# sub-page sized blocks and unwritten extents. Inode lock contention during
> +# writeback of pages to unwritten extents leads to failure to convert those
> +# extents on I/O completion. This causes data corruption as unwritten extents
> +# are always read back as zeroes.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2014 Red Hat, Inc. All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
Always need to set $tmp here. It's used by things like
_scratch_mkfs.
> +
> +_cleanup()
> +{
> + cd /
> + kill -9 $syncpid > /dev/null 2>&1
wait
rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/punch
> +
> +# real QA test starts here
> +
> +_syncloop()
> +{
> + while [ true ]
> + do
> + sync
> + done
while [....]; do
.....
done
> +}
> +
> +# Modify as appropriate.
> +_supported_fs xfs
> +_supported_os Linux
> +_require_scratch
> +_require_xfs_io_command "falloc"
_require_xfs_io_command "fiemap"
> +
> +_scratch_mkfs >/dev/null 2>&1
> +_scratch_mount
> +
> +# run background sync thread
> +_syncloop &
> +syncpid=$!
> +
> +for iters in $(seq 1 100)
> +do
> + rm -f $SCRATCH_MNT/file
> +
> + # create a delalloc block in each page of the first 64k of the file
> + for pgoff in $(seq 0 0x1000 0xf000)
> + do
> + offset=$((pgoff + 0xc00))
> + $XFS_IO_PROG -fc "pwrite $offset 0x1" $SCRATCH_MNT/file \
> + > /dev/null 2>&1
> + done
for .... ; do
.....
done
is the normal way of doing this. Also it's worth splitting the
xfs_io command across multiple lines as you have done later on to
make it more readable. Maybe, also, redirect the output to
$seqres.full so it's there for debugging failures rather than
/dev/null.
> +
> + # preallocate the first 64k and overwite, writing past 64k to contend
> + # with writeback
> + $XFS_IO_PROG \
> + -c "falloc 0 0x10000" \
> + -c "pwrite 0 0x100000" \
> + -c "fsync" \
> + $SCRATCH_MNT/file > /dev/null 2>&1
> +
> + # Check for unwritten extents. We should have none since we wrote over
> + # the entire preallocated region and ran fsync.
> + xfs_bmap -v $SCRATCH_MNT/file | _filter_bmap | grep unwritten \
> + > /dev/null 2>&1
Use $XFS_IO_PROG -c "fiemap -v" and dump the output to $seqres.full.
> + if [ $? == 0 ]
> + then
> + xfs_bmap -v $SCRATCH_MNT/file
> + break
> + fi
if [...]; then.
That's also a failure, right? So shouldn't it set status=1 and exit,
leaving the cleanup function to kill and wait for everything to
finish?
> +done
> +
> +echo $iters iterations
> +
> +kill $syncpid
> +wait
> +
> +# clear page cache and dump the file
> +_scratch_unmount
> +_scratch_mount
_scratch_remount
> +hexdump $SCRATCH_MNT/file
> +
> +_scratch_unmount
> +_check_scratch_fs
Don't need to do that anymore.
> +
> +status=0
> +exit
> diff --git a/tests/xfs/062.out b/tests/xfs/062.out
> new file mode 100644
> index 0000000..420f2e4
> --- /dev/null
> +++ b/tests/xfs/062.out
> @@ -0,0 +1,5 @@
> +QA output created by 062
> +100 iterations
> +0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
> +*
> +0100000
> diff --git a/tests/xfs/group b/tests/xfs/group
> index 09bce15..685cbe7 100644
> --- a/tests/xfs/group
> +++ b/tests/xfs/group
> @@ -57,6 +57,7 @@
> 059 dump ioctl auto quick
> 060 dump ioctl auto quick
> 061 dump ioctl auto quick
> +062 auto quick
and the rw group, too.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH] xfs/062: add xfs unwritten extent data corruption reproducer
2014-09-20 23:45 ` Dave Chinner
@ 2014-09-22 19:40 ` Brian Foster
0 siblings, 0 replies; 3+ messages in thread
From: Brian Foster @ 2014-09-22 19:40 UTC (permalink / raw)
To: Dave Chinner; +Cc: fstests, xfs
On Sun, Sep 21, 2014 at 09:45:00AM +1000, Dave Chinner wrote:
> On Fri, Sep 19, 2014 at 02:13:05PM -0400, Brian Foster wrote:
> > XFS had a data corruption problem where writeback of pages to unwritten
> > extents would fail to run unwritten extent conversion at I/O completion.
> > This causes subsequent reads of written, but unconverted regions to
> > return zeroes. This occurs on sub-page block size filesystems when
> > writeback contends for the inode lock (e.g., with a file writer).
> >
> > Add a test that creates the conditions to reproduce the data corruption
> > and detect it by looking for unwritten extents after all said extents
> > have been overwritten.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >
> > Hi all,
> >
> > Here's a test for the data corruption issue I sent a fix for yesterday:
> >
> > http://oss.sgi.com/archives/xfs/2014-09/msg00259.html
> >
> > It took a little time to improve the effectiveness of the test, but I've
> > been able to run it in current form for 90+ iterations without a false
> > negative (e.g., test passes when it shouldn't) on my setup. It runs in
> > ~30s when there is no failure.
> >
> > Brian
> >
> > tests/xfs/062 | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > tests/xfs/062.out | 5 +++
> > tests/xfs/group | 1 +
> > 3 files changed, 120 insertions(+)
> > create mode 100755 tests/xfs/062
> > create mode 100644 tests/xfs/062.out
>
> There's nothing XFS specific about this test. Yes, it exposes a bug
> on XFS, but every filesystem should be able to run and pass this
> test.
>
Indeed, converted to a generic test.
> >
> > diff --git a/tests/xfs/062 b/tests/xfs/062
> > new file mode 100755
> > index 0000000..61f8cf4
> > --- /dev/null
> > +++ b/tests/xfs/062
...
> > + if [ $? == 0 ]
> > + then
> > + xfs_bmap -v $SCRATCH_MNT/file
> > + break
> > + fi
>
> if [...]; then.
>
> That's also a failure, right? So shouldn't it set status=1 and exit,
> leaving the cleanup function to kill and wait for everything to
> finish?
>
It's the main/expected failure. The intent was to use the hexdump as the
determination for success/failure, but I had to determine when to break
out of the loop since the reproducer requires several iterations. I
figured the raw extent map output would be useful (the output of which
also calls out something is wrong when compared to the golden output) as
well as the content diff, so both are printed in the failure case.
I've added an additional comment here for the v2 I'm about to post. I
can just dump the extent map here and exit with status=1 if that is
preferred, but it seems appropriate to include the hexdump in the
failure scenario as well.
Brian
> > +done
> > +
> > +echo $iters iterations
> > +
> > +kill $syncpid
> > +wait
> > +
> > +# clear page cache and dump the file
> > +_scratch_unmount
> > +_scratch_mount
>
> _scratch_remount
>
> > +hexdump $SCRATCH_MNT/file
> > +
> > +_scratch_unmount
> > +_check_scratch_fs
>
> Don't need to do that anymore.
>
> > +
> > +status=0
> > +exit
> > diff --git a/tests/xfs/062.out b/tests/xfs/062.out
> > new file mode 100644
> > index 0000000..420f2e4
> > --- /dev/null
> > +++ b/tests/xfs/062.out
> > @@ -0,0 +1,5 @@
> > +QA output created by 062
> > +100 iterations
> > +0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
> > +*
> > +0100000
> > diff --git a/tests/xfs/group b/tests/xfs/group
> > index 09bce15..685cbe7 100644
> > --- a/tests/xfs/group
> > +++ b/tests/xfs/group
> > @@ -57,6 +57,7 @@
> > 059 dump ioctl auto quick
> > 060 dump ioctl auto quick
> > 061 dump ioctl auto quick
> > +062 auto quick
>
> and the rw group, too.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2014-09-22 19:40 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-19 18:13 [PATCH] xfs/062: add xfs unwritten extent data corruption reproducer Brian Foster
2014-09-20 23:45 ` Dave Chinner
2014-09-22 19:40 ` Brian Foster
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox