From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 073/102] xfs: punch all delalloc blocks beyond EOF on write failure.
Date: Thu, 23 Aug 2012 15:02:31 +1000 [thread overview]
Message-ID: <1345698180-13612-74-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1345698180-13612-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
Upstream commit: 01c84d2dc1311fb76ea217dadfd5b3a5f3cab563
I've been seeing regular ASSERT failures in xfstests when running
fsstress based tests over the past month. xfs_getbmap() has been
failing this test:
XFS: Assertion failed: ((iflags & BMV_IF_DELALLOC) != 0) ||
(map[i].br_startblock != DELAYSTARTBLOCK), file: fs/xfs/xfs_bmap.c,
line: 5650
where it is encountering a delayed allocation extent after writing
all the dirty data to disk and then walking the extent map
atomically by holding the XFS_IOLOCK_SHARED to prevent new delayed
allocation extents from being created.
Test 083 on a 512 byte block size filesystem was used to reproduce
the problem, because it only had a 5s run timeand would usually fail
every 3-4 runs. This test is exercising ENOSPC behaviour by running
fsstress on a nearly full filesystem. The following trace extract
shows the final few events on the inode that tripped the assert:
xfs_ilock: flags ILOCK_EXCL caller xfs_setfilesize
xfs_setfilesize: isize 0x180000 disize 0x12d400 offset 0x17e200 count 7680
file size updated to 0x180000 by IO completion
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_iext_insert: state idx 3 offset 3072 block 4503599627239432 count 1 flag 0 caller xfs_bmap_add_extent_hole_delay
xfs_get_blocks_alloc: size 0x180000 offset 0x180000 count 512 type startoff 0xc00 startblock -1 blockcount 0x1
xfs_ilock: flags ILOCK_EXCL caller __xfs_get_blocks
delalloc write, adding a single block at offset 0x180000
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180200 count 512
ENOSPC trying to allocate a dellalloc block at offset 0x180200
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_get_blocks_alloc: size 0x180000 offset 0x180200 count 512 type startoff 0xc00 startblock -1 blockcount 0x2
And succeeding on retry after flushing dirty inodes.
xfs_ilock: flags ILOCK_EXCL caller __xfs_get_blocks
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180400 count 512
ENOSPC trying to allocate a dellalloc block at offset 0x180400
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180400 count 512
And failing the retry, giving a real ENOSPC error.
xfs_ilock: flags ILOCK_EXCL caller xfs_vm_write_failed
^^^^^^^^^^^^^^^^^^^
The smoking gun - the write being failed and cleaning up delalloc
blocks beyond EOF allocated by the failed write.
xfs_getattr:
xfs_ilock: flags IOLOCK_SHARED caller xfs_getbmap
xfs_ilock: flags ILOCK_SHARED caller xfs_ilock_map_shared
And that's where we died almost immediately afterwards.
xfs_bmapi_read() found delalloc extent beyond current file in memory
file size. Some debug I added to xfs_getbmap() showed the state just
before the assert failure:
ino 0x80e48: off 0xc00, fsb 0xffffffffffffffff, len 0x1, size 0x180000
start_fsb 0x106, end_fsb 0x638
ino flags 0x2 nex 0xd bmvcnt 0x555, len 0x3c58a6f23c0bf1, start 0xc00
ext 0: off 0x1fc, fsb 0x24782, len 0x254
ext 1: off 0x450, fsb 0x40851, len 0x30
ext 2: off 0x480, fsb 0xd99, len 0x1b8
ext 3: off 0x92f, fsb 0x4099a, len 0x3b
ext 4: off 0x96d, fsb 0x41844, len 0x98
ext 5: off 0xbf1, fsb 0x408ab, len 0xf
which shows that we found a single delalloc block beyond EOF (first
line of output) when we were returning the map for a length
somewhere around 10^16 bytes long (second line), and the on-disk
extents showed they didn't go past EOF (last lines).
Further debug added to xfs_vm_write_failed() showed this happened
when punching out delalloc blocks beyond the end of the file after
the failed write:
[ 132.606693] ino 0x80e48: vwf to 0x181000, sze 0x180000
[ 132.609573] start_fsb 0xc01, end_fsb 0xc08
It punched the range 0xc01 -> 0xc08, but the range we really need to
punch is 0xc00 -> 0xc07 (8 blocks from 0xc00) as this testing was
run on a 512 byte block size filesystem (8 blocks per page).
the punch from is 0xc00. So end_fsb is correct, but start_fsb is
wrong as we punch from start_fsb for (end_fsb - start_fsb) blocks.
Hence we are not punching the delalloc block beyond EOF in the case.
The fix is simple - it's a silly off-by-one mistake in calculating
the range. It's especially silly because the macro used to calculate
the start_fsb already takes into account the case where the inode
size is an exact multiple of the filesystem block size...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
---
fs/xfs/linux-2.6/xfs_aops.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
index 2ce0e07..d2d3c84 100644
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -1482,7 +1482,7 @@ xfs_vm_write_failed(
* Check if there are any blocks that are outside of i_size
* that need to be trimmed back.
*/
- start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size) + 1;
+ start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size);
end_fsb = XFS_B_TO_FSB(ip->i_mount, to);
if (end_fsb <= start_fsb)
return;
--
1.7.10
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2012-08-23 5:04 UTC|newest]
Thread overview: 117+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-23 5:01 [RFC, PATCH 0/102]: xfs: 3.0.x stable kernel update Dave Chinner
2012-08-23 5:01 ` [PATCH 001/102] xfs: don't serialise adjacent concurrent direct IO appending writes Dave Chinner
2012-08-23 5:01 ` [PATCH 002/102] xfs: remove dead ENODEV handling in xfs_destroy_ioend Dave Chinner
2012-08-23 5:01 ` [PATCH 003/102] xfs: defer AIO/DIO completions Dave Chinner
2012-08-23 5:01 ` [PATCH 004/102] xfs: reduce ioend latency Dave Chinner
2012-08-23 5:01 ` [PATCH 005/102] xfs: wait for I/O completion when writing out pages in xfs_setattr_size Dave Chinner
2012-08-23 5:01 ` [PATCH 006/102] xfs: improve ioend error handling Dave Chinner
2012-08-23 5:01 ` [PATCH 007/102] xfs: Check the return value of xfs_buf_get() Dave Chinner
2012-08-23 5:01 ` [PATCH 008/102] xfs: Check the return value of xfs_trans_get_buf() Dave Chinner
2012-08-23 5:01 ` [PATCH 009/102] xfs: dont ignore error code from xfs_bmbt_update Dave Chinner
2012-08-23 5:01 ` [PATCH 010/102] xfs: fix possible overflow in xfs_ioc_trim() Dave Chinner
2012-08-23 5:01 ` [PATCH 011/102] xfs: XFS_TRANS_SWAPEXT is not a valid flag for Dave Chinner
2012-08-23 5:01 ` [PATCH 012/102] xfs: Don't allocate new buffers on every call to _xfs_buf_find Dave Chinner
2012-08-23 5:01 ` [PATCH 013/102] xfs: reduce the number of log forces from tail pushing Dave Chinner
2012-08-23 5:01 ` [PATCH 014/102] xfs: optimize fsync on directories Dave Chinner
2012-08-23 5:01 ` [PATCH 015/102] xfs: clean up buffer allocation Dave Chinner
2012-08-23 5:01 ` [PATCH 016/102] xfs: clean up xfs_ioerror_alert Dave Chinner
2012-08-23 5:01 ` [PATCH 017/102] xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks Dave Chinner
2012-08-23 5:01 ` [PATCH 018/102] xfs: do not flush data workqueues in xfs_flush_buftarg Dave Chinner
2012-08-23 5:01 ` [PATCH 019/102] xfs: add AIL pushing tracepoints Dave Chinner
2012-08-23 5:01 ` [PATCH 020/102] xfs: warn if direct reclaim tries to writeback pages Dave Chinner
2012-08-27 18:17 ` Christoph Hellwig
2012-09-05 11:32 ` Mel Gorman
2012-09-13 20:12 ` Mark Tinguely
2012-09-14 9:45 ` Mel Gorman
2012-09-14 12:56 ` Mark Tinguely
2012-09-14 16:18 ` Mark Tinguely
2012-08-23 5:01 ` [PATCH 021/102] xfs: fix force shutdown handling in xfs_end_io Dave Chinner
2012-08-23 5:01 ` [PATCH 022/102] xfs: fix allocation length overflow in xfs_bmapi_write() Dave Chinner
2012-08-23 5:01 ` [PATCH 023/102] xfs: fix the logspace waiting algorithm Dave Chinner
2012-08-23 5:01 ` [PATCH 024/102] xfs: untangle SYNC_WAIT and SYNC_TRYLOCK meanings for xfs_qm_dqflush Dave Chinner
2012-08-23 5:01 ` [PATCH 025/102] xfs: make sure to really flush all dquots in xfs_qm_quotacheck Dave Chinner
2012-08-23 5:01 ` [PATCH 026/102] xfs: simplify xfs_qm_detach_gdquots Dave Chinner
2012-08-23 5:01 ` [PATCH 027/102] xfs: mark the xfssyncd workqueue as non-reentrant Dave Chinner
2012-08-23 5:01 ` [PATCH 028/102] xfs: make i_flags an unsigned long Dave Chinner
2012-08-23 5:01 ` [PATCH 029/102] xfs: remove the i_size field in struct xfs_inode Dave Chinner
2012-08-23 5:01 ` [PATCH 030/102] xfs: remove the i_new_size " Dave Chinner
2012-08-23 5:01 ` [PATCH 031/102] xfs: always return with the iolock held from Dave Chinner
2012-08-23 5:01 ` [PATCH 032/102] xfs: cleanup xfs_file_aio_write Dave Chinner
2012-08-23 5:01 ` [PATCH 033/102] xfs: pass KM_SLEEP flag to kmem_realloc() in Dave Chinner
2012-08-23 5:01 ` [PATCH 034/102] xfs: show uuid when mount fails due to duplicate uuid Dave Chinner
2012-08-23 5:01 ` [PATCH 035/102] xfs: xfs_trans_add_item() - don't assign in ASSERT() when compare is intended Dave Chinner
2012-08-23 5:01 ` [PATCH 036/102] xfs: split tail_lsn assignments from log space wakeups Dave Chinner
2012-08-23 5:01 ` [PATCH 037/102] xfs: do exact log space wakeups in xlog_ungrant_log_space Dave Chinner
2012-08-23 5:01 ` [PATCH 038/102] xfs: remove xfs_trans_unlocked_item Dave Chinner
2012-08-23 5:01 ` [PATCH 039/102] xfs: cleanup xfs_log_space_wake Dave Chinner
2012-08-23 5:01 ` [PATCH 040/102] xfs: remove log space waitqueues Dave Chinner
2012-08-23 5:01 ` [PATCH 041/102] xfs: add the xlog_grant_head structure Dave Chinner
2012-08-23 5:02 ` [PATCH 042/102] xfs: add xlog_grant_head_init Dave Chinner
2012-08-23 5:02 ` [PATCH 043/102] xfs: add xlog_grant_head_wake_all Dave Chinner
2012-08-23 5:02 ` [PATCH 044/102] xfs: share code for grant head waiting Dave Chinner
2012-08-23 5:02 ` [PATCH 045/102] xfs: share code for grant head wakeups Dave Chinner
2012-08-23 5:02 ` [PATCH 046/102] xfs: share code for grant head availability checks Dave Chinner
2012-08-23 5:02 ` [PATCH 047/102] xfs: split and cleanup xfs_log_reserve Dave Chinner
2012-08-23 5:02 ` [PATCH 048/102] xfs: only take the ILOCK in xfs_reclaim_inode() Dave Chinner
2012-08-23 5:02 ` [PATCH 049/102] xfs: use per-filesystem I/O completion workqueues Dave Chinner
2012-08-23 5:02 ` [PATCH 050/102] xfs: do not require an ioend for new EOF calculation Dave Chinner
2012-08-23 5:02 ` [PATCH 054/102] xfs: make xfs_inode_item_size idempotent Dave Chinner
2012-08-23 5:02 ` [PATCH 055/102] xfs: split in-core and on-disk inode log item fields Dave Chinner
2012-08-23 5:02 ` [PATCH 056/102] xfs: reimplement fdatasync support Dave Chinner
2012-08-23 5:02 ` [PATCH 057/102] xfs: fallback to vmalloc for large buffers in xfs_attrmulti_attr_get Dave Chinner
2012-08-23 5:02 ` [PATCH 058/102] xfs: fallback to vmalloc for large buffers in xfs_getbmap Dave Chinner
2012-08-23 5:02 ` [PATCH 059/102] xfs: fix deadlock in xfs_rtfree_extent Dave Chinner
2012-08-23 5:02 ` [PATCH 060/102] xfs: Fix open flag handling in open_by_handle code Dave Chinner
2012-08-23 5:02 ` [PATCH 061/102] xfs: introduce an allocation workqueue Dave Chinner
2012-08-23 5:02 ` [PATCH 062/102] xfs: trace xfs_name strings correctly Dave Chinner
2012-08-23 5:02 ` [PATCH 063/102] xfs: Account log unmount transaction correctly Dave Chinner
2012-08-23 5:02 ` [PATCH 064/102] xfs: fix fstrim offset calculations Dave Chinner
2012-08-23 5:02 ` [PATCH 065/102] xfs: add lots of attribute trace points Dave Chinner
2012-08-23 5:02 ` [PATCH 066/102] xfs: don't fill statvfs with project quota for a directory Dave Chinner
2012-08-23 5:02 ` [PATCH 067/102] xfs: Ensure inode reclaim can run during quotacheck Dave Chinner
2012-08-23 5:02 ` [PATCH 068/102] xfs: avoid taking the ilock unnessecarily in xfs_qm_dqattach Dave Chinner
2012-08-23 5:02 ` [PATCH 069/102] xfs: reduce ilock hold times in xfs_file_aio_write_checks Dave Chinner
2012-08-23 5:02 ` [PATCH 070/102] xfs: reduce ilock hold times in xfs_setattr_size Dave Chinner
2012-08-23 5:02 ` [PATCH 071/102] xfs: push the ilock into xfs_zero_eof Dave Chinner
2012-08-23 5:02 ` [PATCH 072/102] xfs: use shared ilock mode for direct IO writes by default Dave Chinner
2012-08-23 5:02 ` Dave Chinner [this message]
2012-08-23 5:02 ` [PATCH 074/102] xfs: using GFP_NOFS for blkdev_issue_flush Dave Chinner
2012-08-23 5:02 ` [PATCH 075/102] xfs: page type check in writeback only checks last buffer Dave Chinner
2012-08-23 5:02 ` [PATCH 076/102] xfs: punch new delalloc blocks out of failed writes inside Dave Chinner
2012-08-23 5:02 ` [PATCH 077/102] xfs: prevent needless mount warning causing test failures Dave Chinner
2012-08-23 5:02 ` [PATCH 078/102] xfs: don't assert on delalloc regions beyond EOF Dave Chinner
2012-08-23 5:02 ` [PATCH 079/102] xfs: limit specualtive delalloc to maxioffset Dave Chinner
2012-08-23 5:02 ` [PATCH 080/102] xfs: Use preallocation for inodes with extsz hints Dave Chinner
2012-08-23 5:02 ` [PATCH 081/102] xfs: fix buffer lookup race on allocation failure Dave Chinner
2012-08-23 5:02 ` [PATCH 082/102] xfs: check for buffer errors before waiting Dave Chinner
2012-08-23 5:02 ` [PATCH 083/102] xfs: fix incorrect b_offset initialisation Dave Chinner
2012-08-23 5:02 ` [PATCH 084/102] xfs: use kmem_zone_zalloc for buffers Dave Chinner
2012-08-23 5:02 ` [PATCH 085/102] xfs: use iolock on XFS_IOC_ALLOCSP calls Dave Chinner
2012-08-23 5:02 ` [PATCH 086/102] xfs: Properly exclude IO type flags from buffer flags Dave Chinner
2012-08-23 5:02 ` [PATCH 087/102] xfs: flush outstanding buffers on log mount failure Dave Chinner
2012-08-23 5:02 ` [PATCH 088/102] xfs: protect xfs_sync_worker with s_umount semaphore Dave Chinner
2012-08-23 5:02 ` [PATCH 089/102] xfs: fix memory reclaim deadlock on agi buffer Dave Chinner
2012-08-23 5:02 ` [PATCH 090/102] xfs: add trace points for log forces Dave Chinner
2012-08-23 5:02 ` [PATCH 091/102] xfs: switch to proper __bitwise type for KM_... flags Dave Chinner
2012-08-23 5:02 ` [PATCH 092/102] xfs: xfs_vm_writepage clear iomap_valid when Dave Chinner
2012-08-23 5:02 ` [PATCH 093/102] xfs: fix debug_object WARN at xfs_alloc_vextent() Dave Chinner
2012-08-23 5:02 ` [PATCH 094/102] xfs: m_maxioffset is redundant Dave Chinner
2012-08-23 5:02 ` [PATCH 095/102] xfs: make largest supported offset less shouty Dave Chinner
2012-08-23 5:02 ` [PATCH 096/102] xfs: kill copy and paste segment checks in xfs_file_aio_read Dave Chinner
2012-08-23 5:02 ` [PATCH 097/102] xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near Dave Chinner
2012-08-23 5:02 ` [PATCH 098/102] xfs: shutdown xfs_sync_worker before the log Dave Chinner
2012-08-23 5:02 ` [PATCH 099/102] xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near Dave Chinner
2012-08-23 5:02 ` [PATCH 100/102] xfs: don't defer metadata allocation to the workqueue Dave Chinner
2012-08-23 5:02 ` [PATCH 101/102] xfs: prevent recursion in xfs_buf_iorequest Dave Chinner
2012-08-23 5:03 ` [PATCH 102/102] xfs: handle EOF correctly in xfs_vm_writepage Dave Chinner
2012-08-23 21:54 ` [RFC, PATCH 0/102]: xfs: 3.0.x stable kernel update Dave Chinner
2012-08-23 22:14 ` Ben Myers
2012-08-23 22:23 ` Matthias Schniedermeyer
2012-09-01 23:10 ` Christoph Hellwig
2012-09-03 6:04 ` Dave Chinner
2012-09-04 21:13 ` Ben Myers
2012-09-05 4:24 ` Dave Chinner
2012-09-13 18:32 ` Mark Tinguely
2012-09-18 13:59 ` Mark Tinguely
2012-09-18 23:50 ` Dave Chinner
2012-09-19 13:14 ` Mark Tinguely
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1345698180-13612-74-git-send-email-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.