* [PATCH] generic: test eofblocks race with file extending aio dio writes
@ 2017-04-20 15:21 Zorro Lang
2017-04-20 18:22 ` Brian Foster
0 siblings, 1 reply; 4+ messages in thread
From: Zorro Lang @ 2017-04-20 15:21 UTC (permalink / raw)
To: fstests
It's possible for post-eof blocks to end up being used for direct I/O
writes. dio write performs an upfront unwritten extent allocation, sends
the dio and then updates the inode size (if necessary) on write
completion. If a file release occurs while a file extending dio write is
in flight, it is possible to mistake the post-eof blocks for speculative
preallocation and incorrectly truncate them from the inode. This means
that the resulting dio write completion can discover a hole and allocate
new blocks rather than perform unwritten extent conversion.
A kernel warning can be reproduced by generic/299 on XFS:
XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
file: fs/xfs//xfs_trans.c, line: 309
The root cause is that xfs_free_eofblocks() uses i_size to truncate
post-eof blocks from the inode, but async, file extending direct writes
do not update i_size until write completion, long after inode locks are
dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode
to the incorrect size.
For cover this filesystem corruption testing, write this new case to
check data integrality manually, not only depend on a kernel warning.
Signed-off-by: Zorro Lang <zlang@redhat.com>
---
Hi,
This case is similar with generic/114 which use same aio-dio-eof-race.c
program. I was planning to write an aio writer, but I found there's
already a good program written by Eric, so I turn to use it directly :)
This case is different with g/114, this case runs aio-dio-eof-race
with an "open & close" process together. This multi-processes
maybe trigger a free_eofblocks race with file extending dio write.
Thanks,
Zorro
tests/generic/426 | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++
tests/generic/426.out | 2 ++
tests/generic/group | 1 +
3 files changed, 90 insertions(+)
create mode 100755 tests/generic/426
create mode 100644 tests/generic/426.out
diff --git a/tests/generic/426 b/tests/generic/426
new file mode 100755
index 0000000..ff022ad
--- /dev/null
+++ b/tests/generic/426
@@ -0,0 +1,87 @@
+#! /bin/bash
+# FS QA Test 426
+#
+# Try to trigger a race of free eofblocks and file extending dio writes.
+# A known bug of XFS has been fixed by "e4229d6 xfs: fix eofblocks race
+# with file extending async dio writes"
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017 Red Hat Inc. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+ cd /
+ rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_aiodio aio-dio-eof-race
+
+# limit the filesystem size, to save the time of filling filesystem
+_scratch_mkfs_sized $((1024 * 1024 * 1024)) >>$seqres.full 2>&1
+_scratch_mount
+
+# try to write more bytes than filesystem size to fill the filesystem
+$XFS_IO_PROG -f -c "pwrite -S 0x55 0 $((1024 * 1024 * 1024 * 2))" \
+ $SCRATCH_MNT/fillfs-$seq 2>/dev/null
+rm -f $SCRATCH_MNT/fillfs-$seq
+
+# start a background aio writer, which does several extending loops
+# internally and check data integrality
+$AIO_TEST $SCRATCH_MNT/tst-aio-dio-eof-race.$seq &
+aio_pid=$!
+
+# open & close the file frequently, to trigger xfs_free_eofblocks
+while true; do
+ cat $SCRATCH_MNT/tst-aio-dio-eof-race.$seq >/dev/null 2>&1
+done &
+open_close_pid=$!
+
+wait $aio_pid
+status=$?
+kill $open_close_pid
+wait $open_close_pid
+if [ $status -ne 0 ]; then
+ od -t x1 $SCRATCH_MNT/tst-aio-dio-eof-race.$seq >> $seqres.full
+ exit
+fi
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/426.out b/tests/generic/426.out
new file mode 100644
index 0000000..ad7a01a
--- /dev/null
+++ b/tests/generic/426.out
@@ -0,0 +1,2 @@
+QA output created by 426
+Success, all done.
diff --git a/tests/generic/group b/tests/generic/group
index 6d6e4f6..70c36f9 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -428,3 +428,4 @@
423 auto quick
424 auto quick
425 auto quick attr
+426 auto aio rw
--
2.7.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] generic: test eofblocks race with file extending aio dio writes
2017-04-20 15:21 [PATCH] generic: test eofblocks race with file extending aio dio writes Zorro Lang
@ 2017-04-20 18:22 ` Brian Foster
2017-04-24 5:32 ` Zorro Lang
0 siblings, 1 reply; 4+ messages in thread
From: Brian Foster @ 2017-04-20 18:22 UTC (permalink / raw)
To: Zorro Lang; +Cc: fstests
On Thu, Apr 20, 2017 at 11:21:03PM +0800, Zorro Lang wrote:
> It's possible for post-eof blocks to end up being used for direct I/O
> writes. dio write performs an upfront unwritten extent allocation, sends
> the dio and then updates the inode size (if necessary) on write
> completion. If a file release occurs while a file extending dio write is
> in flight, it is possible to mistake the post-eof blocks for speculative
> preallocation and incorrectly truncate them from the inode. This means
> that the resulting dio write completion can discover a hole and allocate
> new blocks rather than perform unwritten extent conversion.
>
> A kernel warning can be reproduced by generic/299 on XFS:
> XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
> file: fs/xfs//xfs_trans.c, line: 309
>
It might be important to point out that this test also exposes file
corruption and/or stale data exposure. E.g., the verification of written
data is an important distinction between this test and generic/299.
> The root cause is that xfs_free_eofblocks() uses i_size to truncate
> post-eof blocks from the inode, but async, file extending direct writes
> do not update i_size until write completion, long after inode locks are
> dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode
> to the incorrect size.
>
> For cover this filesystem corruption testing, write this new case to
> check data integrality manually, not only depend on a kernel warning.
>
> Signed-off-by: Zorro Lang <zlang@redhat.com>
> ---
>
> Hi,
>
> This case is similar with generic/114 which use same aio-dio-eof-race.c
> program. I was planning to write an aio writer, but I found there's
> already a good program written by Eric, so I turn to use it directly :)
>
Well, that is convenient. ;) A few quick notes...
> This case is different with g/114, this case runs aio-dio-eof-race
> with an "open & close" process together. This multi-processes
> maybe trigger a free_eofblocks race with file extending dio write.
>
> Thanks,
> Zorro
>
>
> tests/generic/426 | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++
> tests/generic/426.out | 2 ++
> tests/generic/group | 1 +
> 3 files changed, 90 insertions(+)
> create mode 100755 tests/generic/426
> create mode 100644 tests/generic/426.out
>
> diff --git a/tests/generic/426 b/tests/generic/426
> new file mode 100755
> index 0000000..ff022ad
> --- /dev/null
> +++ b/tests/generic/426
> @@ -0,0 +1,87 @@
> +#! /bin/bash
> +# FS QA Test 426
> +#
> +# Try to trigger a race of free eofblocks and file extending dio writes.
> +# A known bug of XFS has been fixed by "e4229d6 xfs: fix eofblocks race
> +# with file extending async dio writes"
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2017 Red Hat Inc. All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> + cd /
> + rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# remove previous $seqres.full before test
> +rm -f $seqres.full
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs generic
> +_supported_os Linux
> +_require_scratch
> +_require_aiodio aio-dio-eof-race
> +
It looks like this tool doesn't do a ton of writing... up to 8MB. Has
that proven to be a reliable reproducer in your testing so far? I'm kind
of wondering if we should update that tool with a flag to accept a
configurable file size, but that's not a big deal if the reproducer is
reliable.
> +# limit the filesystem size, to save the time of filling filesystem
> +_scratch_mkfs_sized $((1024 * 1024 * 1024)) >>$seqres.full 2>&1
> +_scratch_mount
> +
That's 1G... I wonder if we can get away with smaller (a couple hundred
MB perhaps?), since we have to fill the whole thing first.
> +# try to write more bytes than filesystem size to fill the filesystem
> +$XFS_IO_PROG -f -c "pwrite -S 0x55 0 $((1024 * 1024 * 1024 * 2))" \
> + $SCRATCH_MNT/fillfs-$seq 2>/dev/null
> +rm -f $SCRATCH_MNT/fillfs-$seq
> +
> +# start a background aio writer, which does several extending loops
> +# internally and check data integrality
> +$AIO_TEST $SCRATCH_MNT/tst-aio-dio-eof-race.$seq &
> +aio_pid=$!
> +
> +# open & close the file frequently, to trigger xfs_free_eofblocks
> +while true; do
> + cat $SCRATCH_MNT/tst-aio-dio-eof-race.$seq >/dev/null 2>&1
> +done &
> +open_close_pid=$!
> +
We probably want to start the open/close loop before the writer to
reduce the chances of the writer doing much work before the open/close
loop actually starts.
I also wonder if something like the following would be better than using
cat:
while true; do
exec 3<>$SCRATCH_MNT/tst-aio-dio-eof-race.$seq
exec 3<>-
done
... because we can skip useless reads of the file. Something like
'$XFS_IO -c open <file>' may be another option as well.
Brian
> +wait $aio_pid
> +status=$?
> +kill $open_close_pid
> +wait $open_close_pid
> +if [ $status -ne 0 ]; then
> + od -t x1 $SCRATCH_MNT/tst-aio-dio-eof-race.$seq >> $seqres.full
> + exit
> +fi
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/426.out b/tests/generic/426.out
> new file mode 100644
> index 0000000..ad7a01a
> --- /dev/null
> +++ b/tests/generic/426.out
> @@ -0,0 +1,2 @@
> +QA output created by 426
> +Success, all done.
> diff --git a/tests/generic/group b/tests/generic/group
> index 6d6e4f6..70c36f9 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -428,3 +428,4 @@
> 423 auto quick
> 424 auto quick
> 425 auto quick attr
> +426 auto aio rw
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] generic: test eofblocks race with file extending aio dio writes
2017-04-20 18:22 ` Brian Foster
@ 2017-04-24 5:32 ` Zorro Lang
2017-04-24 12:16 ` Brian Foster
0 siblings, 1 reply; 4+ messages in thread
From: Zorro Lang @ 2017-04-24 5:32 UTC (permalink / raw)
To: Brian Foster; +Cc: fstests
On Thu, Apr 20, 2017 at 02:22:30PM -0400, Brian Foster wrote:
> On Thu, Apr 20, 2017 at 11:21:03PM +0800, Zorro Lang wrote:
> > It's possible for post-eof blocks to end up being used for direct I/O
> > writes. dio write performs an upfront unwritten extent allocation, sends
> > the dio and then updates the inode size (if necessary) on write
> > completion. If a file release occurs while a file extending dio write is
> > in flight, it is possible to mistake the post-eof blocks for speculative
> > preallocation and incorrectly truncate them from the inode. This means
> > that the resulting dio write completion can discover a hole and allocate
> > new blocks rather than perform unwritten extent conversion.
> >
> > A kernel warning can be reproduced by generic/299 on XFS:
> > XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
> > file: fs/xfs//xfs_trans.c, line: 309
> >
>
> It might be important to point out that this test also exposes file
> corruption and/or stale data exposure. E.g., the verification of written
> data is an important distinction between this test and generic/299.
Sure, I'll add that. Hah, Sorry I pasted your commit log at here directly :)
>
> > The root cause is that xfs_free_eofblocks() uses i_size to truncate
> > post-eof blocks from the inode, but async, file extending direct writes
> > do not update i_size until write completion, long after inode locks are
> > dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode
> > to the incorrect size.
> >
> > For cover this filesystem corruption testing, write this new case to
> > check data integrality manually, not only depend on a kernel warning.
> >
> > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > ---
> >
> > Hi,
> >
> > This case is similar with generic/114 which use same aio-dio-eof-race.c
> > program. I was planning to write an aio writer, but I found there's
> > already a good program written by Eric, so I turn to use it directly :)
> >
>
> Well, that is convenient. ;) A few quick notes...
Yeah, but I've written one, then I found this one T_T..
>
> > This case is different with g/114, this case runs aio-dio-eof-race
> > with an "open & close" process together. This multi-processes
> > maybe trigger a free_eofblocks race with file extending dio write.
> >
> > Thanks,
> > Zorro
> >
> >
> > tests/generic/426 | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > tests/generic/426.out | 2 ++
> > tests/generic/group | 1 +
> > 3 files changed, 90 insertions(+)
> > create mode 100755 tests/generic/426
> > create mode 100644 tests/generic/426.out
> >
> > diff --git a/tests/generic/426 b/tests/generic/426
> > new file mode 100755
> > index 0000000..ff022ad
> > --- /dev/null
> > +++ b/tests/generic/426
> > @@ -0,0 +1,87 @@
> > +#! /bin/bash
> > +# FS QA Test 426
> > +#
> > +# Try to trigger a race of free eofblocks and file extending dio writes.
> > +# A known bug of XFS has been fixed by "e4229d6 xfs: fix eofblocks race
> > +# with file extending async dio writes"
> > +#
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2017 Red Hat Inc. All Rights Reserved.
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> > +#-----------------------------------------------------------------------
> > +#
> > +
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +
> > +here=`pwd`
> > +tmp=/tmp/$$
> > +status=1 # failure is the default!
> > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > + cd /
> > + rm -f $tmp.*
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +# remove previous $seqres.full before test
> > +rm -f $seqres.full
> > +
> > +# real QA test starts here
> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> > +_supported_os Linux
> > +_require_scratch
> > +_require_aiodio aio-dio-eof-race
> > +
>
> It looks like this tool doesn't do a ton of writing... up to 8MB. Has
> that proven to be a reliable reproducer in your testing so far? I'm kind
> of wondering if we should update that tool with a flag to accept a
> configurable file size, but that's not a big deal if the reproducer is
> reliable.
Yes, compare with the original reproducer of this bug, aio-dio-eof-race
do leass writing. For reproduce this bug, I increase the frequence of
open & close (no sleep 1), then the bug still can be reproduced as my
testing with very high reproducible.
I was thinking if I should change aio-dio-eof-race.c, because I don't know
if it always can be 100% reproduced on high performance machine.
>
> > +# limit the filesystem size, to save the time of filling filesystem
> > +_scratch_mkfs_sized $((1024 * 1024 * 1024)) >>$seqres.full 2>&1
> > +_scratch_mount
> > +
>
> That's 1G... I wonder if we can get away with smaller (a couple hundred
> MB perhaps?), since we have to fill the whole thing first.
Sure, how about 512MB or 256MB? What's the minimum hint size of XFS?
>
> > +# try to write more bytes than filesystem size to fill the filesystem
> > +$XFS_IO_PROG -f -c "pwrite -S 0x55 0 $((1024 * 1024 * 1024 * 2))" \
> > + $SCRATCH_MNT/fillfs-$seq 2>/dev/null
> > +rm -f $SCRATCH_MNT/fillfs-$seq
> > +
> > +# start a background aio writer, which does several extending loops
> > +# internally and check data integrality
> > +$AIO_TEST $SCRATCH_MNT/tst-aio-dio-eof-race.$seq &
> > +aio_pid=$!
> > +
> > +# open & close the file frequently, to trigger xfs_free_eofblocks
> > +while true; do
> > + cat $SCRATCH_MNT/tst-aio-dio-eof-race.$seq >/dev/null 2>&1
> > +done &
> > +open_close_pid=$!
> > +
>
> We probably want to start the open/close loop before the writer to
> reduce the chances of the writer doing much work before the open/close
> loop actually starts.
>
> I also wonder if something like the following would be better than using
> cat:
>
> while true; do
> exec 3<>$SCRATCH_MNT/tst-aio-dio-eof-race.$seq
> exec 3<>-
> done
>
> ... because we can skip useless reads of the file. Something like
> '$XFS_IO -c open <file>' may be another option as well.
Sure, that's better !
>
> Brian
>
> > +wait $aio_pid
> > +status=$?
> > +kill $open_close_pid
> > +wait $open_close_pid
> > +if [ $status -ne 0 ]; then
> > + od -t x1 $SCRATCH_MNT/tst-aio-dio-eof-race.$seq >> $seqres.full
> > + exit
> > +fi
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/generic/426.out b/tests/generic/426.out
> > new file mode 100644
> > index 0000000..ad7a01a
> > --- /dev/null
> > +++ b/tests/generic/426.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 426
> > +Success, all done.
> > diff --git a/tests/generic/group b/tests/generic/group
> > index 6d6e4f6..70c36f9 100644
> > --- a/tests/generic/group
> > +++ b/tests/generic/group
> > @@ -428,3 +428,4 @@
> > 423 auto quick
> > 424 auto quick
> > 425 auto quick attr
> > +426 auto aio rw
> > --
> > 2.7.4
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe fstests" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] generic: test eofblocks race with file extending aio dio writes
2017-04-24 5:32 ` Zorro Lang
@ 2017-04-24 12:16 ` Brian Foster
0 siblings, 0 replies; 4+ messages in thread
From: Brian Foster @ 2017-04-24 12:16 UTC (permalink / raw)
To: Zorro Lang; +Cc: fstests
On Mon, Apr 24, 2017 at 01:32:42PM +0800, Zorro Lang wrote:
> On Thu, Apr 20, 2017 at 02:22:30PM -0400, Brian Foster wrote:
> > On Thu, Apr 20, 2017 at 11:21:03PM +0800, Zorro Lang wrote:
> > > It's possible for post-eof blocks to end up being used for direct I/O
> > > writes. dio write performs an upfront unwritten extent allocation, sends
> > > the dio and then updates the inode size (if necessary) on write
> > > completion. If a file release occurs while a file extending dio write is
> > > in flight, it is possible to mistake the post-eof blocks for speculative
> > > preallocation and incorrectly truncate them from the inode. This means
> > > that the resulting dio write completion can discover a hole and allocate
> > > new blocks rather than perform unwritten extent conversion.
> > >
> > > A kernel warning can be reproduced by generic/299 on XFS:
> > > XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
> > > file: fs/xfs//xfs_trans.c, line: 309
> > >
> >
> > It might be important to point out that this test also exposes file
> > corruption and/or stale data exposure. E.g., the verification of written
> > data is an important distinction between this test and generic/299.
>
> Sure, I'll add that. Hah, Sorry I pasted your commit log at here directly :)
>
> >
> > > The root cause is that xfs_free_eofblocks() uses i_size to truncate
> > > post-eof blocks from the inode, but async, file extending direct writes
> > > do not update i_size until write completion, long after inode locks are
> > > dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode
> > > to the incorrect size.
> > >
> > > For cover this filesystem corruption testing, write this new case to
> > > check data integrality manually, not only depend on a kernel warning.
> > >
> > > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > > ---
> > >
> > > Hi,
> > >
> > > This case is similar with generic/114 which use same aio-dio-eof-race.c
> > > program. I was planning to write an aio writer, but I found there's
> > > already a good program written by Eric, so I turn to use it directly :)
> > >
> >
> > Well, that is convenient. ;) A few quick notes...
>
> Yeah, but I've written one, then I found this one T_T..
>
> >
> > > This case is different with g/114, this case runs aio-dio-eof-race
> > > with an "open & close" process together. This multi-processes
> > > maybe trigger a free_eofblocks race with file extending dio write.
> > >
> > > Thanks,
> > > Zorro
> > >
> > >
> > > tests/generic/426 | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > tests/generic/426.out | 2 ++
> > > tests/generic/group | 1 +
> > > 3 files changed, 90 insertions(+)
> > > create mode 100755 tests/generic/426
> > > create mode 100644 tests/generic/426.out
> > >
> > > diff --git a/tests/generic/426 b/tests/generic/426
> > > new file mode 100755
> > > index 0000000..ff022ad
> > > --- /dev/null
> > > +++ b/tests/generic/426
> > > @@ -0,0 +1,87 @@
> > > +#! /bin/bash
> > > +# FS QA Test 426
> > > +#
> > > +# Try to trigger a race of free eofblocks and file extending dio writes.
> > > +# A known bug of XFS has been fixed by "e4229d6 xfs: fix eofblocks race
> > > +# with file extending async dio writes"
> > > +#
> > > +#-----------------------------------------------------------------------
> > > +# Copyright (c) 2017 Red Hat Inc. All Rights Reserved.
> > > +#
> > > +# This program is free software; you can redistribute it and/or
> > > +# modify it under the terms of the GNU General Public License as
> > > +# published by the Free Software Foundation.
> > > +#
> > > +# This program is distributed in the hope that it would be useful,
> > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > > +# GNU General Public License for more details.
> > > +#
> > > +# You should have received a copy of the GNU General Public License
> > > +# along with this program; if not, write the Free Software Foundation,
> > > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> > > +#-----------------------------------------------------------------------
> > > +#
> > > +
> > > +seq=`basename $0`
> > > +seqres=$RESULT_DIR/$seq
> > > +echo "QA output created by $seq"
> > > +
> > > +here=`pwd`
> > > +tmp=/tmp/$$
> > > +status=1 # failure is the default!
> > > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > > +
> > > +_cleanup()
> > > +{
> > > + cd /
> > > + rm -f $tmp.*
> > > +}
> > > +
> > > +# get standard environment, filters and checks
> > > +. ./common/rc
> > > +. ./common/filter
> > > +
> > > +# remove previous $seqres.full before test
> > > +rm -f $seqres.full
> > > +
> > > +# real QA test starts here
> > > +
> > > +# Modify as appropriate.
> > > +_supported_fs generic
> > > +_supported_os Linux
> > > +_require_scratch
> > > +_require_aiodio aio-dio-eof-race
> > > +
> >
> > It looks like this tool doesn't do a ton of writing... up to 8MB. Has
> > that proven to be a reliable reproducer in your testing so far? I'm kind
> > of wondering if we should update that tool with a flag to accept a
> > configurable file size, but that's not a big deal if the reproducer is
> > reliable.
>
> Yes, compare with the original reproducer of this bug, aio-dio-eof-race
> do leass writing. For reproduce this bug, I increase the frequence of
> open & close (no sleep 1), then the bug still can be reproduced as my
> testing with very high reproducible.
>
> I was thinking if I should change aio-dio-eof-race.c, because I don't know
> if it always can be 100% reproduced on high performance machine.
>
Ok, well it may be worth a try on a different environment, though I
would expect it to still work.
> >
> > > +# limit the filesystem size, to save the time of filling filesystem
> > > +_scratch_mkfs_sized $((1024 * 1024 * 1024)) >>$seqres.full 2>&1
> > > +_scratch_mount
> > > +
> >
> > That's 1G... I wonder if we can get away with smaller (a couple hundred
> > MB perhaps?), since we have to fill the whole thing first.
>
> Sure, how about 512MB or 256MB? What's the minimum hint size of XFS?
>
IMO, the smallest you can possibly use, the better. I'm not sure we have
a specific min size, it may depend on what mkfs options are provided.
256MB sounds more reasonable to me...? This is something that is
probably going to run during every auto xfstests run, so the less time
spent doing unnecessary test preparation like stamping unused fs blocks
with a pattern, the easier it is to keep the overall runtime down.
Brian
> >
> > > +# try to write more bytes than filesystem size to fill the filesystem
> > > +$XFS_IO_PROG -f -c "pwrite -S 0x55 0 $((1024 * 1024 * 1024 * 2))" \
> > > + $SCRATCH_MNT/fillfs-$seq 2>/dev/null
> > > +rm -f $SCRATCH_MNT/fillfs-$seq
> > > +
> > > +# start a background aio writer, which does several extending loops
> > > +# internally and check data integrality
> > > +$AIO_TEST $SCRATCH_MNT/tst-aio-dio-eof-race.$seq &
> > > +aio_pid=$!
> > > +
> > > +# open & close the file frequently, to trigger xfs_free_eofblocks
> > > +while true; do
> > > + cat $SCRATCH_MNT/tst-aio-dio-eof-race.$seq >/dev/null 2>&1
> > > +done &
> > > +open_close_pid=$!
> > > +
> >
> > We probably want to start the open/close loop before the writer to
> > reduce the chances of the writer doing much work before the open/close
> > loop actually starts.
> >
> > I also wonder if something like the following would be better than using
> > cat:
> >
> > while true; do
> > exec 3<>$SCRATCH_MNT/tst-aio-dio-eof-race.$seq
> > exec 3<>-
> > done
> >
> > ... because we can skip useless reads of the file. Something like
> > '$XFS_IO -c open <file>' may be another option as well.
>
> Sure, that's better !
>
> >
> > Brian
> >
> > > +wait $aio_pid
> > > +status=$?
> > > +kill $open_close_pid
> > > +wait $open_close_pid
> > > +if [ $status -ne 0 ]; then
> > > + od -t x1 $SCRATCH_MNT/tst-aio-dio-eof-race.$seq >> $seqres.full
> > > + exit
> > > +fi
> > > +
> > > +# success, all done
> > > +status=0
> > > +exit
> > > diff --git a/tests/generic/426.out b/tests/generic/426.out
> > > new file mode 100644
> > > index 0000000..ad7a01a
> > > --- /dev/null
> > > +++ b/tests/generic/426.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 426
> > > +Success, all done.
> > > diff --git a/tests/generic/group b/tests/generic/group
> > > index 6d6e4f6..70c36f9 100644
> > > --- a/tests/generic/group
> > > +++ b/tests/generic/group
> > > @@ -428,3 +428,4 @@
> > > 423 auto quick
> > > 424 auto quick
> > > 425 auto quick attr
> > > +426 auto aio rw
> > > --
> > > 2.7.4
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe fstests" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe fstests" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-04-24 12:16 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-04-20 15:21 [PATCH] generic: test eofblocks race with file extending aio dio writes Zorro Lang
2017-04-20 18:22 ` Brian Foster
2017-04-24 5:32 ` Zorro Lang
2017-04-24 12:16 ` Brian Foster
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox