From: Brian Foster <bfoster@redhat.com>
To: Eryu Guan <guaneryu@gmail.com>
Cc: fstests@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH] generic/033: add xfs delalloc indirect block depletion reproducer
Date: Thu, 25 Sep 2014 11:14:24 -0400
Message-ID: <20140925151424.GC47304@bfoster.bfoster>
In-Reply-To: <20140925035416.GC13950@dhcp-13-216.nay.redhat.com>

On Thu, Sep 25, 2014 at 11:54:16AM +0800, Eryu Guan wrote:
> On Wed, Sep 24, 2014 at 02:47:05PM -0400, Brian Foster wrote:
> > XFS allocates extra indirect blocks for delayed allocation extents at
> > write time. When delalloc extents are split, the existing indirect block
> > reservation was historically divided up evenly among the new extents
> > even though the overall requirement for two extents could exceed the
> > requirement for the original. Repeated delalloc extent splits ultimately
> > lead to extents with 0 indirect blocks and in turn to assert
> > failures in XFS.
> > 
> > Add a test to stress indirect block reservation for delayed allocation
> > extents. The test converts a single delalloc extent to many and operates
> > on the remaining extents to detect or trigger potential problems.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> > 
> > Here's a simple reproducer for the indirect block reservation problem
> > called out here:
> > 
> > http://oss.sgi.com/archives/xfs/2014-09/msg00337.html
> > 
> > It reproduces the assert failures described therein:
> > 
> > XFS: Assertion failed: startblockval(del.br_startblock) > 0, file: fs/xfs/libxfs/xfs_bmap.c, line: 5281
> > 
> > Note that this test also unintentionally fails on XFS. The test file
> > ends up zero-sized after the remount and thus hexdump doesn't produce
> > any output. This doesn't occur on ext4, I suspect due to the fact that
> > the range being zeroed is flushed beforehand, though I could be wrong
> > about that.
> 
> Tested with ext4 and xfs, and ext4 passes/xfs fails the test as
> described here.
> 
> Reviewed-by: Eryu Guan <eguan@redhat.com>
> 
> With one nitpick below.
> 
> > 
> > In any event, this calls out a separate bug in XFS: if appended data
> > is dropped from cache by zero range before it is written back (eof is
> > page aligned), we lose the on-disk inode size update and the inode size
> > changes unexpectedly across the remount (assuming nothing else changes
> > the size, of course).
> > 
> > Brian
> > 
> >  tests/generic/033     | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/033.out |  4 +++
> >  tests/generic/group   |  1 +
> >  3 files changed, 93 insertions(+)
> >  create mode 100755 tests/generic/033
> >  create mode 100644 tests/generic/033.out
> > 
> > diff --git a/tests/generic/033 b/tests/generic/033
> > new file mode 100755
> > index 0000000..41198b7
> > --- /dev/null
> > +++ b/tests/generic/033
> > @@ -0,0 +1,88 @@
> > +#! /bin/bash
> > +# FS QA Test No. 033
> > +#
> > +# This test stresses indirect block reservation for delayed allocation extents.
> > +# XFS reserves extra blocks for deferred allocation of delalloc extents. These
> > +# reserved blocks can be divided among more extents than anticipated if the
> > +# original extent for which the blocks were reserved is split into multiple
> > +# delalloc extents. If this scenario repeats, eventually some extents are left
> > +# without any indirect block reservation whatsoever. This leads to assert
> > +# failures and possibly other problems in XFS.
> > +#
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2014 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> > +#-----------------------------------------------------------------------
> > +#
> > +
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +
> > +here=`pwd`
> > +tmp=/tmp/$$
> > +status=1	# failure is the default!
> > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > +	cd /
> > +	rm -f $tmp.*
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +# real QA test starts here
> > +rm -f $seqres.full
> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> > +_supported_os Linux
> > +_require_scratch
> > +_require_xfs_io_command "fzero"
> > +
> > +_scratch_mkfs >/dev/null 2>&1
> > +_scratch_mount
> > +
> > +file=$SCRATCH_MNT/file.$seq
> > +bytes=$((64 * 1024))
> > +
> > +# create sequential delayed allocation
> > +$XFS_IO_PROG -f -c "pwrite 0 $bytes" $file | _filter_xfs_io \
> > +	>> $seqres.full 2>&1
> 
> The output of xfs_io is redirected to $seqres.full for debugging
> purposes only, so it does not need to be filtered.
> 
> The same applies to the following two xfs_io calls.
> 

Indeed, I'll post v2. Thanks for the review.

Brian
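
For reference, a minimal sketch of which offsets the two fzero loops in
the quoted test visit. It only reproduces the seq arithmetic from the
patch and does not call xfs_io; the variable names first_pass and
second_pass are made up for illustration.

```shell
#!/bin/bash
# Same values as in the quoted test.
bytes=$((64 * 1024))
endoff=$((bytes - 4096))

# First loop: every other 4k range starting at 0 (0, 8k, 16k, ...).
first_pass=$(seq 0 8192 $endoff | tr '\n' ' ')

# Second loop: the opposite set, starting at 4k (4k, 12k, 20k, ...).
second_pass=$(seq 4096 8192 $endoff | tr '\n' ' ')

echo "first pass:  $first_pass"
echo "second pass: $second_pass"
```

Each loop visits eight 4k ranges, so together they cover the whole 64k
file, which is consistent with the final hexdump offset of 0010000
(hex) in 033.out.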

> Thanks,
> Eryu
> 
> > +
> > +# Zero every other 4k range to split the larger delalloc extent into many more
> > +# smaller extents. Use zero instead of hole punch because the former does not
> > +# force writeback (and hence delalloc conversion). It can simply discard
> > +# delalloc blocks and convert the ranges to unwritten.
> > +endoff=$((bytes - 4096))
> > +for i in $(seq 0 8192 $endoff); do
> > +	$XFS_IO_PROG -c "fzero -k $i 4k" $file | _filter_xfs_io \
> > +		>> $seqres.full 2>&1
> > +done
> > +
> > +# now zero the opposite set to remove remaining delalloc extents
> > +for i in $(seq 4096 8192 $endoff); do
> > +	$XFS_IO_PROG -c "fzero -k $i 4k" $file | _filter_xfs_io	\
> > +		>> $seqres.full 2>&1
> > +done
> > +
> > +_scratch_remount
> > +hexdump $file
> > +
> > +status=0
> > +exit
> > diff --git a/tests/generic/033.out b/tests/generic/033.out
> > new file mode 100644
> > index 0000000..419d831
> > --- /dev/null
> > +++ b/tests/generic/033.out
> > @@ -0,0 +1,4 @@
> > +QA output created by 033
> > +0000000 0000 0000 0000 0000 0000 0000 0000 0000
> > +*
> > +0010000
> > diff --git a/tests/generic/group b/tests/generic/group
> > index 8e0c22a..1227408 100644
> > --- a/tests/generic/group
> > +++ b/tests/generic/group
> > @@ -32,6 +32,7 @@
> >  027 auto enospc
> >  028 auto quick
> >  032 auto quick rw
> > +033 auto quick rw
> >  053 acl repair auto quick
> >  062 attr udf auto quick
> >  068 other auto freeze dangerous stress
> > -- 
> > 1.8.3.1
> > 
> > _______________________________________________
> > xfs mailing list
> > xfs@oss.sgi.com
> > http://oss.sgi.com/mailman/listinfo/xfs
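
The depletion described in the commit message can be illustrated with a
toy model. This is only a sketch under stated assumptions, not the XFS
reservation code: the starting value of 4 blocks is hypothetical, and
an "even split" is modelled as plain integer division while following
one of the two resulting extents.

```shell
#!/bin/bash
# Toy model only; none of these names exist in XFS.
indlen=4	# hypothetical indirect block reservation of the original extent
splits=0

while [ "$indlen" -gt 0 ]; do
	# Split the extent in two and divide the reservation evenly;
	# keep tracking one of the halves.
	indlen=$((indlen / 2))
	splits=$((splits + 1))
	echo "after split $splits: $indlen indirect block(s) reserved"
done
```

Under these assumptions the tracked extent is left with no reservation
after a handful of splits, which is the condition the assertion on
startblockval(del.br_startblock) > 0 checks for.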
> 
