public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Sandeen <sandeen@sandeen.net>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: punch all delalloc blocks beyond EOF on write failure.
Date: Wed, 25 Apr 2012 23:21:45 -0500	[thread overview]
Message-ID: <4F98CD59.5010207@sandeen.net> (raw)
In-Reply-To: <1335396189-3830-1-git-send-email-david@fromorbit.com>

On 4/25/12 6:23 PM, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> I've been seeing regular ASSERT failures in xfstests when running
> fsstress based tests over the past month. xfs_getbmap() has been
> failing this test:
> 
> XFS: Assertion failed: ((iflags & BMV_IF_DELALLOC) != 0) ||
> (map[i].br_startblock != DELAYSTARTBLOCK), file: fs/xfs/xfs_bmap.c,
> line: 5650
> 
> where it is encountering a delayed allocation extent after writing
> all the dirty data to disk and then walking the extent map
> atomically by holding the XFS_IOLOCK_SHARED to prevent new delayed
> allocation extents from being created.
> 
> Test 083 on a 512 byte block size filesystem was used to reproduce
> the problem, because it only had a 5s run timeand would usually fail
> every 3-4 runs. This test is exercising ENOSPC behaviour by running
> fsstress on a nearly full filesystem. The following trace extract
> shows the final few events on the inode that tripped the assert:
> 
>  xfs_ilock:             flags ILOCK_EXCL caller xfs_setfilesize
>  xfs_setfilesize:       isize 0x180000 disize 0x12d400 offset 0x17e200 count 7680
> 
> file size updated to 0x180000 by IO completion
> 
>  xfs_ilock:             flags ILOCK_EXCL caller xfs_iomap_write_delay
>  xfs_iext_insert:       state  idx 3 offset 3072 block 4503599627239432 count 1 flag 0 caller xfs_bmap_add_extent_hole_delay
>  xfs_get_blocks_alloc:  size 0x180000 offset 0x180000 count 512 type  startoff 0xc00 startblock -1 blockcount 0x1
>  xfs_ilock:             flags ILOCK_EXCL caller __xfs_get_blocks
> 
> delalloc write, adding a single block at offset 0x180000
> 
>  xfs_delalloc_enospc:   isize 0x180000 disize 0x180000 offset 0x180200 count 512
> 
> ENOSPC trying to allocate a dellalloc block at offset 0x180200
> 
>  xfs_ilock:             flags ILOCK_EXCL caller xfs_iomap_write_delay
>  xfs_get_blocks_alloc:  size 0x180000 offset 0x180200 count 512 type  startoff 0xc00 startblock -1 blockcount 0x2
> 
> And succeeding on retry after flushing dirty inodes.
> 
>  xfs_ilock:             flags ILOCK_EXCL caller __xfs_get_blocks
>  xfs_delalloc_enospc:   isize 0x180000 disize 0x180000 offset 0x180400 count 512
> 
> ENOSPC trying to allocate a dellalloc block at offset 0x180400
> 
>  xfs_ilock:             flags ILOCK_EXCL caller xfs_iomap_write_delay
>  xfs_delalloc_enospc:   isize 0x180000 disize 0x180000 offset 0x180400 count 512
> 
> And failing the retry, giving a real ENOSPC error.
> 
>  xfs_ilock:             flags ILOCK_EXCL caller xfs_vm_write_failed
>                                                 ^^^^^^^^^^^^^^^^^^^
> The smoking gun - the write being failed and cleaning up delalloc
> blocks beyond EOF allocated by the failed write.
> 
>  xfs_getattr:
>  xfs_ilock:             flags IOLOCK_SHARED caller xfs_getbmap
>  xfs_ilock:             flags ILOCK_SHARED caller xfs_ilock_map_shared
> 
> And that's where we died almost immediately afterwards.
> xfs_bmapi_read() found delalloc extent beyond current file in memory
> file size. Some debug I added to xfs_getbmap() showed the state just
> before the assert failure:
> 
>  ino 0x80e48: off 0xc00, fsb 0xffffffffffffffff, len 0x1, size 0x180000
>  start_fsb 0x106, end_fsb 0x638
>  ino flags 0x2 nex 0xd bmvcnt 0x555, len 0x3c58a6f23c0bf1, start 0xc00
>  ext 0: off 0x1fc, fsb 0x24782, len 0x254
>  ext 1: off 0x450, fsb 0x40851, len 0x30
>  ext 2: off 0x480, fsb 0xd99, len 0x1b8
>  ext 3: off 0x92f, fsb 0x4099a, len 0x3b
>  ext 4: off 0x96d, fsb 0x41844, len 0x98
>  ext 5: off 0xbf1, fsb 0x408ab, len 0xf
> 
> which shows that we found a single delalloc block beyond EOF (first
> line of output) when we were returning the map for a length
> somewhere around 10^16 bytes long (second line), and the on-disk
> extents showed they didn't go past EOF (last lines).
> 
> Further debug added to xfs_vm_write_failed() showed this happened
> when punching out delalloc blocks beyond the end of the file after
> the failed write:
> 
> [  132.606693] ino 0x80e48: vwf to 0x181000, sze 0x180000
> [  132.609573] start_fsb 0xc01, end_fsb 0xc08
> 
> It punched the range 0xc01 -> 0xc08, but the range we really need to
> punch is 0xc00 -> 0xc07 (8 blocks from 0xc00) as this testing was
> run on a 512 byte block size filesystem (8 blocks per page).
> the punch from is 0xc00. So end_fsb is correct, but start_fsb is
> wrong as we punch from start_fsb for (end_fsb - start_fsb) blocks.
> Hence we are not punching the delalloc block beyond EOF in the case.
> 
> The fix is simple - it's a silly off-by-one mistake in calculating
> the range. It's especially silly because the macro used to calculate
> the start_fsb already takes into account the case where the inode
> size is an exact multiple of the filesystem block size...
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

tl;dr, but I get the gist ;)  I find these macros confusing.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

Thanks for a patch & can grok & review ;)

> ---
>  fs/xfs/xfs_aops.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index b66766a..64ed87a 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -1431,7 +1431,7 @@ xfs_vm_write_failed(
>  		 * Check if there are any blocks that are outside of i_size
>  		 * that need to be trimmed back.
>  		 */
> -		start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size) + 1;
> +		start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size);
>  		end_fsb = XFS_B_TO_FSB(ip->i_mount, to);
>  		if (end_fsb <= start_fsb)
>  			return;

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2012-04-26  4:21 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-25 23:23 [PATCH] xfs: punch all delalloc blocks beyond EOF on write failure Dave Chinner
2012-04-26  4:21 ` Eric Sandeen [this message]
2012-04-26  6:47   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F98CD59.5010207@sandeen.net \
    --to=sandeen@sandeen.net \
    --cc=david@fromorbit.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox