From: Mark Tinguely <tinguely@sgi.com>
To: stable@vger.kernel.org
Cc: xfs@oss.sgi.com
Subject: [3.0-stable PATCH 15/36] xfs: punch all delalloc blocks beyond EOF on write failure.
Date: Mon, 03 Dec 2012 17:42:23 -0600 [thread overview]
Message-ID: <20121203144310.186918896@sgi.com> (raw)
In-Reply-To: 20121203144208.143464631@sgi.com
[-- Attachment #1: 073 --]
[-- Type: text/plain, Size: 5471 bytes --]
From: Dave Chinner <dchinner@redhat.com>
Upstream commit: 01c84d2dc1311fb76ea217dadfd5b3a5f3cab563
I've been seeing regular ASSERT failures in xfstests when running
fsstress based tests over the past month. xfs_getbmap() has been
failing this test:
XFS: Assertion failed: ((iflags & BMV_IF_DELALLOC) != 0) ||
(map[i].br_startblock != DELAYSTARTBLOCK), file: fs/xfs/xfs_bmap.c,
line: 5650
where it is encountering a delayed allocation extent after writing
all the dirty data to disk and then walking the extent map
atomically by holding the XFS_IOLOCK_SHARED to prevent new delayed
allocation extents from being created.
Test 083 on a 512 byte block size filesystem was used to reproduce
the problem, because it only had a 5s run timeand would usually fail
every 3-4 runs. This test is exercising ENOSPC behaviour by running
fsstress on a nearly full filesystem. The following trace extract
shows the final few events on the inode that tripped the assert:
xfs_ilock: flags ILOCK_EXCL caller xfs_setfilesize
xfs_setfilesize: isize 0x180000 disize 0x12d400 offset 0x17e200 count 7680
file size updated to 0x180000 by IO completion
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_iext_insert: state idx 3 offset 3072 block 4503599627239432 count 1 flag 0 caller xfs_bmap_add_extent_hole_delay
xfs_get_blocks_alloc: size 0x180000 offset 0x180000 count 512 type startoff 0xc00 startblock -1 blockcount 0x1
xfs_ilock: flags ILOCK_EXCL caller __xfs_get_blocks
delalloc write, adding a single block at offset 0x180000
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180200 count 512
ENOSPC trying to allocate a dellalloc block at offset 0x180200
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_get_blocks_alloc: size 0x180000 offset 0x180200 count 512 type startoff 0xc00 startblock -1 blockcount 0x2
And succeeding on retry after flushing dirty inodes.
xfs_ilock: flags ILOCK_EXCL caller __xfs_get_blocks
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180400 count 512
ENOSPC trying to allocate a dellalloc block at offset 0x180400
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180400 count 512
And failing the retry, giving a real ENOSPC error.
xfs_ilock: flags ILOCK_EXCL caller xfs_vm_write_failed
^^^^^^^^^^^^^^^^^^^
The smoking gun - the write being failed and cleaning up delalloc
blocks beyond EOF allocated by the failed write.
xfs_getattr:
xfs_ilock: flags IOLOCK_SHARED caller xfs_getbmap
xfs_ilock: flags ILOCK_SHARED caller xfs_ilock_map_shared
And that's where we died almost immediately afterwards.
xfs_bmapi_read() found delalloc extent beyond current file in memory
file size. Some debug I added to xfs_getbmap() showed the state just
before the assert failure:
ino 0x80e48: off 0xc00, fsb 0xffffffffffffffff, len 0x1, size 0x180000
start_fsb 0x106, end_fsb 0x638
ino flags 0x2 nex 0xd bmvcnt 0x555, len 0x3c58a6f23c0bf1, start 0xc00
ext 0: off 0x1fc, fsb 0x24782, len 0x254
ext 1: off 0x450, fsb 0x40851, len 0x30
ext 2: off 0x480, fsb 0xd99, len 0x1b8
ext 3: off 0x92f, fsb 0x4099a, len 0x3b
ext 4: off 0x96d, fsb 0x41844, len 0x98
ext 5: off 0xbf1, fsb 0x408ab, len 0xf
which shows that we found a single delalloc block beyond EOF (first
line of output) when we were returning the map for a length
somewhere around 10^16 bytes long (second line), and the on-disk
extents showed they didn't go past EOF (last lines).
Further debug added to xfs_vm_write_failed() showed this happened
when punching out delalloc blocks beyond the end of the file after
the failed write:
[ 132.606693] ino 0x80e48: vwf to 0x181000, sze 0x180000
[ 132.609573] start_fsb 0xc01, end_fsb 0xc08
It punched the range 0xc01 -> 0xc08, but the range we really need to
punch is 0xc00 -> 0xc07 (8 blocks from 0xc00) as this testing was
run on a 512 byte block size filesystem (8 blocks per page).
the punch from is 0xc00. So end_fsb is correct, but start_fsb is
wrong as we punch from start_fsb for (end_fsb - start_fsb) blocks.
Hence we are not punching the delalloc block beyond EOF in the case.
The fix is simple - it's a silly off-by-one mistake in calculating
the range. It's especially silly because the macro used to calculate
the start_fsb already takes into account the case where the inode
size is an exact multiple of the filesystem block size...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
---
fs/xfs/linux-2.6/xfs_aops.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: b/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -1421,7 +1421,7 @@ xfs_vm_write_failed(
* Check if there are any blocks that are outside of i_size
* that need to be trimmed back.
*/
- start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size) + 1;
+ start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size);
end_fsb = XFS_B_TO_FSB(ip->i_mount, to);
if (end_fsb <= start_fsb)
return;
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2012-12-03 23:40 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-03 23:42 [3.0-stable PATCH 00/36] Proposed 3.0-stable bug patches Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 01/36] xfs: fix possible overflow in xfs_ioc_trim() Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 02/36] xfs: fix allocation length overflow in xfs_bmapi_write() Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 03/36] xfs: mark the xfssyncd workqueue as non-reentrant Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 04/36] xfs: xfs_trans_add_item() - dont assign in ASSERT() when compare is intended Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 05/36] xfs: only take the ILOCK in xfs_reclaim_inode() Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 06/36] xfs: fallback to vmalloc for large buffers in xfs_attrmulti_attr_get Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 07/36] xfs: fallback to vmalloc for large buffers in xfs_getbmap Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 08/36] xfs: fix deadlock in xfs_rtfree_extent Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 09/36] xfs: Fix open flag handling in open_by_handle code Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 10/36] xfs: Account log unmount transaction correctly Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 11/36] xfs: fix fstrim offset calculations Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 12/36] xfs: dont fill statvfs with project quota for a directory Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 13/36] xfs: Ensure inode reclaim can run during quotacheck Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 14/36] xfs: use shared ilock mode for direct IO writes by default Mark Tinguely
2012-12-03 23:42 ` Mark Tinguely [this message]
2012-12-03 23:42 ` [3.0-stable PATCH 16/36] xfs: page type check in writeback only checks last buffer Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 17/36] xfs: punch new delalloc blocks out of failed writes inside EOF Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 18/36] xfs: dont assert on delalloc regions beyond EOF Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 19/36] xfs: limit specualtive delalloc to maxioffset Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 20/36] xfs: Use preallocation for inodes with extsz hints Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 21/36] xfs: Dont allocate new buffers on every call to _xfs_buf_find Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 22/36] xfs: clean up buffer allocation Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 23/36] xfs: fix buffer lookup race on allocation failure Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 24/36] xfs: use iolock on XFS_IOC_ALLOCSP calls Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 25/36] xfs: Properly exclude IO type flags from buffer flags Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 26/36] xfs: flush outstanding buffers on log mount failure Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 27/36] xfs: protect xfs_sync_worker with s_umount semaphore Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 28/36] xfs: fix memory reclaim deadlock on agi buffer Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 29/36] xfs: xfs_vm_writepage clear iomap_valid when Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 30/36] xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 31/36] xfs: shutdown xfs_sync_worker before the log Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 32/36] xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 33/36] xfs: check for stale inode before acquiring iflock on push Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 34/36] xfs: stop the sync worker before xfs_unmountfs Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 35/36] xfs: zero allocation_args on the kernel stack Mark Tinguely
2012-12-03 23:42 ` [3.0-stable PATCH 36/36] xfs: only update the last_sync_lsn when a transaction completes Mark Tinguely
2012-12-04 21:44 ` [3.0-stable PATCH 00/36] Proposed 3.0-stable bug patches Ben Myers
2012-12-05 21:45 ` Dave Chinner
2012-12-06 17:27 ` Mark Tinguely
2012-12-07 10:06 ` Dave Chinner
2012-12-07 21:15 ` Ben Myers
2012-12-08 12:06 ` Christoph Hellwig
2012-12-08 19:12 ` Greg KH
2012-12-10 0:24 ` Dave Chinner
2012-12-10 22:03 ` Ben Myers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121203144310.186918896@sgi.com \
--to=tinguely@sgi.com \
--cc=stable@vger.kernel.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.