* [PATCH] xfs/053: test for stale data exposure via falloc/writeback interaction
@ 2014-09-26 18:32 Brian Foster
From: Brian Foster @ 2014-09-26 18:32 UTC (permalink / raw)
To: fstests; +Cc: xfs
XFS buffered I/O writeback has a subtle race condition that leads to
stale data exposure if the filesystem happens to crash after delayed
allocation blocks are converted on disk and before data is written back
to said blocks.
Use file allocation commands to attempt to reproduce a related, but
slightly different variant of this problem. The associated falloc
commands can lead to partial writeback that converts an extent larger
than the range affected by falloc. If the filesystem crashes after the
extent conversion but before all other cached data is written to the
extent, stale data can be exposed.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
This fell out of a combination of a conversation with Dave about XFS
writeback and buffer/cache coherency and some hacking I'm doing on the
XFS zero range implementation. Note that fpunch currently fails the
test. Also, this test is XFS specific primarily due to the use of
godown.
Brian
tests/xfs/053 | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
tests/xfs/053.out | 10 ++++++
tests/xfs/group | 1 +
3 files changed, 112 insertions(+)
create mode 100755 tests/xfs/053
create mode 100644 tests/xfs/053.out
diff --git a/tests/xfs/053 b/tests/xfs/053
new file mode 100755
index 0000000..4fba127
--- /dev/null
+++ b/tests/xfs/053
@@ -0,0 +1,101 @@
+#! /bin/bash
+# FS QA Test No. 053
+#
+# Test stale data exposure via writeback using various file allocation
+# modification commands. The presumption is that such commands result in partial
+# writeback and can convert a delayed allocation extent, that might be larger
+# than the range affected by fallocate, to a normal extent. If the fs happens
+# to crash sometime between when the extent modification is logged and writeback
+# occurs for dirty pages within the extent but outside of the fallocated range,
+# stale data exposure can occur.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2014 Red Hat, Inc. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+ cd /
+ rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+rm -f $seqres.full
+
+_crashtest()
+{
+ cmd=$1
+ img=$SCRATCH_MNT/$seq.img
+ mnt=$SCRATCH_MNT/$seq.mnt
+ file=$mnt/file
+
+ # Create an fs on a small, initialized image. The pattern is written to
+ # the image to detect stale data exposure.
+ $XFS_IO_PROG -f -c "truncate 0" -c "pwrite 0 25M" $img \
+ >> $seqres.full 2>&1
+ $MKFS_XFS_PROG $MKFS_OPTIONS $img >> $seqres.full 2>&1
+
+ mkdir -p $mnt
+ mount $img $mnt
+
+ echo $cmd
+
+ # write, run the test command and shutdown the fs
+ $XFS_IO_PROG -f -c "pwrite -S 1 0 64k" -c "$cmd 60k 4k" $file | \
+ _filter_xfs_io
+ ./src/godown -f $mnt
+
+ umount $mnt
+ mount $img $mnt
+
+ # we generally expect a zero-sized file (this should be silent)
+ hexdump $file
+
+ umount $mnt
+}
+
+# Modify as appropriate.
+_supported_fs xfs
+_supported_os Linux
+_require_scratch
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "fzero"
+
+_scratch_mkfs >/dev/null 2>&1
+_scratch_mount
+
+_crashtest "falloc -k"
+_crashtest "fpunch"
+_crashtest "fzero -k"
+
+status=0
+exit
diff --git a/tests/xfs/053.out b/tests/xfs/053.out
new file mode 100644
index 0000000..c777fe2
--- /dev/null
+++ b/tests/xfs/053.out
@@ -0,0 +1,10 @@
+QA output created by 053
+falloc -k
+wrote 65536/65536 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+fpunch
+wrote 65536/65536 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+fzero -k
+wrote 65536/65536 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
diff --git a/tests/xfs/group b/tests/xfs/group
index 09bce15..408e617 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -49,6 +49,7 @@
050 quota auto quick
051 auto log metadata
052 quota db auto quick
+053 auto quick rw
054 quota auto quick
055 dump ioctl remote tape
056 dump ioctl auto quick
--
1.8.3.1
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: [PATCH] xfs/053: test for stale data exposure via falloc/writeback interaction
From: Dave Chinner @ 2014-09-29 3:32 UTC (permalink / raw)
To: Brian Foster; +Cc: fstests, xfs
On Fri, Sep 26, 2014 at 02:32:29PM -0400, Brian Foster wrote:
> XFS buffered I/O writeback has a subtle race condition that leads to
> stale data exposure if the filesystem happens to crash after delayed
> allocation blocks are converted on disk and before data is written back
> to said blocks.
>
> Use file allocation commands to attempt to reproduce a related, but
> slightly different variant of this problem. The associated falloc
> commands can lead to partial writeback that converts an extent larger
> than the range affected by falloc. If the filesystem crashes after the
> extent conversion but before all other cached data is written to the
> extent, stale data can be exposed.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>
> This fell out of a combination of a conversation with Dave about XFS
> writeback and buffer/cache coherency and some hacking I'm doing on the
> XFS zero range implementation. Note that fpunch currently fails the
> test. Also, this test is XFS specific primarily due to the use of
> godown.
.....
> +_crashtest()
> +{
> + cmd=$1
> + img=$SCRATCH_MNT/$seq.img
> + mnt=$SCRATCH_MNT/$seq.mnt
> + file=$mnt/file
> +
> + # Create an fs on a small, initialized image. The pattern is written to
> + # the image to detect stale data exposure.
> + $XFS_IO_PROG -f -c "truncate 0" -c "pwrite 0 25M" $img \
> + >> $seqres.full 2>&1
> + $MKFS_XFS_PROG $MKFS_OPTIONS $img >> $seqres.full 2>&1
> +
> + mkdir -p $mnt
> + mount $img $mnt
> +
> + echo $cmd
> +
> + # write, run the test command and shutdown the fs
> + $XFS_IO_PROG -f -c "pwrite -S 1 0 64k" -c "$cmd 60k 4k" $file | \
> + _filter_xfs_io
So at this point the file is correctly 64k in size in memory.
> + ./src/godown -f $mnt
And here you tell godown to flush the log, so any transaction in the
log that sets the inode size to 64k makes it to disk.
> + umount $mnt
> + mount $img $mnt
Then log recovery will set the file size to 64k, and:
> +
> + # we generally expect a zero-sized file (this should be silent)
> + hexdump $file
This comment is not actually correct. I'm actually seeing 64k length
files after recovery in 2 of 3 cases being tested, so I don't think
this is a correct observation.
Some clarification of what is actually being tested is needed
here.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH] xfs/053: test for stale data exposure via falloc/writeback interaction
From: Brian Foster @ 2014-09-29 14:09 UTC (permalink / raw)
To: Dave Chinner; +Cc: fstests, xfs
On Mon, Sep 29, 2014 at 01:32:44PM +1000, Dave Chinner wrote:
> On Fri, Sep 26, 2014 at 02:32:29PM -0400, Brian Foster wrote:
> > XFS buffered I/O writeback has a subtle race condition that leads to
> > stale data exposure if the filesystem happens to crash after delayed
> > allocation blocks are converted on disk and before data is written back
> > to said blocks.
> >
> > Use file allocation commands to attempt to reproduce a related, but
> > slightly different variant of this problem. The associated falloc
> > commands can lead to partial writeback that converts an extent larger
> > than the range affected by falloc. If the filesystem crashes after the
> > extent conversion but before all other cached data is written to the
> > extent, stale data can be exposed.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >
> > This fell out of a combination of a conversation with Dave about XFS
> > writeback and buffer/cache coherency and some hacking I'm doing on the
> > XFS zero range implementation. Note that fpunch currently fails the
> > test. Also, this test is XFS specific primarily due to the use of
> > godown.
> .....
> > +_crashtest()
> > +{
> > + cmd=$1
> > + img=$SCRATCH_MNT/$seq.img
> > + mnt=$SCRATCH_MNT/$seq.mnt
> > + file=$mnt/file
> > +
> > + # Create an fs on a small, initialized image. The pattern is written to
> > + # the image to detect stale data exposure.
> > + $XFS_IO_PROG -f -c "truncate 0" -c "pwrite 0 25M" $img \
> > + >> $seqres.full 2>&1
> > + $MKFS_XFS_PROG $MKFS_OPTIONS $img >> $seqres.full 2>&1
> > +
> > + mkdir -p $mnt
> > + mount $img $mnt
> > +
> > + echo $cmd
> > +
> > + # write, run the test command and shutdown the fs
> > + $XFS_IO_PROG -f -c "pwrite -S 1 0 64k" -c "$cmd 60k 4k" $file | \
> > + _filter_xfs_io
>
> So at this point the file is correctly 64k in size in memory.
>
> > + ./src/godown -f $mnt
>
> And here you tell godown to flush the log, so any transaction in the
> log that sets the inode size to 64k makes it to disk.
>
> > + umount $mnt
> > + mount $img $mnt
>
> Then log recovery will set the file size to 64k, and:
>
> > +
> > + # we generally expect a zero-sized file (this should be silent)
> > + hexdump $file
>
> This comment is not actually correct. I'm actually seeing 64k length
> files after recovery in 2 of 3 cases being tested, so I don't think
> this is a correct observation.
>
> Some clarification of what is actually being tested is needed
> here.
>
What output is dumped for the file? I normally see either a zero length
file or data that was never written to the file. For example, punch
fails with this:
+0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
+*
+000f000 0000 0000 0000 0000 0000 0000 0000 0000
+*
+0010000
I suppose it could be possible to see a non-zero length file with valid
data, but I've not seen that occur.
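(Aside for readers decoding the hexdump output above: hexdump's default format prints 16-bit little-endian words, with `*` suppressing repeated lines, and the leading `+` characters are golden-output diff markers, not part of the hexdump output. The stale pattern in the failing fpunch case — 60KiB of the 0xcd fill written by `pwrite -S 1`, then 4KiB of zeroes where the punch landed — can be reconstructed on an ordinary file, no XFS image required; the temp-file path is illustrative:)

```shell
# Rebuild the layout from the failing fpunch case: 60KiB (61440 bytes) of
# 0xcd followed by 4KiB of zeroes, 64KiB total.
f=$(mktemp)
head -c 61440 /dev/zero | tr '\0' '\315' > "$f"   # 0xcd is octal 315
head -c 4096 /dev/zero >> "$f"
out=$(hexdump "$f")
echo "$out"
rm -f "$f"
```

The offsets line up with the quoted output: 0x000f000 (61440 = 60KiB) is where the zeroed punch range begins, and 0x0010000 (64KiB) is EOF.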
Brian
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com