From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 073/102] xfs: punch all delalloc blocks beyond EOF on write failure.
Date: Thu, 23 Aug 2012 15:02:31 +1000 [thread overview]
Message-ID: <1345698180-13612-74-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1345698180-13612-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
Upstream commit: 01c84d2dc1311fb76ea217dadfd5b3a5f3cab563
I've been seeing regular ASSERT failures in xfstests when running
fsstress based tests over the past month. xfs_getbmap() has been
failing this test:
XFS: Assertion failed: ((iflags & BMV_IF_DELALLOC) != 0) ||
(map[i].br_startblock != DELAYSTARTBLOCK), file: fs/xfs/xfs_bmap.c,
line: 5650
where it is encountering a delayed allocation extent after writing
all the dirty data to disk and then walking the extent map
atomically by holding the XFS_IOLOCK_SHARED to prevent new delayed
allocation extents from being created.
Test 083 on a 512 byte block size filesystem was used to reproduce
the problem, because it only had a 5s run timeand would usually fail
every 3-4 runs. This test is exercising ENOSPC behaviour by running
fsstress on a nearly full filesystem. The following trace extract
shows the final few events on the inode that tripped the assert:
xfs_ilock: flags ILOCK_EXCL caller xfs_setfilesize
xfs_setfilesize: isize 0x180000 disize 0x12d400 offset 0x17e200 count 7680
file size updated to 0x180000 by IO completion
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_iext_insert: state idx 3 offset 3072 block 4503599627239432 count 1 flag 0 caller xfs_bmap_add_extent_hole_delay
xfs_get_blocks_alloc: size 0x180000 offset 0x180000 count 512 type startoff 0xc00 startblock -1 blockcount 0x1
xfs_ilock: flags ILOCK_EXCL caller __xfs_get_blocks
delalloc write, adding a single block at offset 0x180000
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180200 count 512
ENOSPC trying to allocate a dellalloc block at offset 0x180200
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_get_blocks_alloc: size 0x180000 offset 0x180200 count 512 type startoff 0xc00 startblock -1 blockcount 0x2
And succeeding on retry after flushing dirty inodes.
xfs_ilock: flags ILOCK_EXCL caller __xfs_get_blocks
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180400 count 512
ENOSPC trying to allocate a dellalloc block at offset 0x180400
xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay
xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180400 count 512
And failing the retry, giving a real ENOSPC error.
xfs_ilock: flags ILOCK_EXCL caller xfs_vm_write_failed
^^^^^^^^^^^^^^^^^^^
The smoking gun - the write being failed and cleaning up delalloc
blocks beyond EOF allocated by the failed write.
xfs_getattr:
xfs_ilock: flags IOLOCK_SHARED caller xfs_getbmap
xfs_ilock: flags ILOCK_SHARED caller xfs_ilock_map_shared
And that's where we died almost immediately afterwards.
xfs_bmapi_read() found delalloc extent beyond current file in memory
file size. Some debug I added to xfs_getbmap() showed the state just
before the assert failure:
ino 0x80e48: off 0xc00, fsb 0xffffffffffffffff, len 0x1, size 0x180000
start_fsb 0x106, end_fsb 0x638
ino flags 0x2 nex 0xd bmvcnt 0x555, len 0x3c58a6f23c0bf1, start 0xc00
ext 0: off 0x1fc, fsb 0x24782, len 0x254
ext 1: off 0x450, fsb 0x40851, len 0x30
ext 2: off 0x480, fsb 0xd99, len 0x1b8
ext 3: off 0x92f, fsb 0x4099a, len 0x3b
ext 4: off 0x96d, fsb 0x41844, len 0x98
ext 5: off 0xbf1, fsb 0x408ab, len 0xf
which shows that we found a single delalloc block beyond EOF (first
line of output) when we were returning the map for a length
somewhere around 10^16 bytes long (second line), and the on-disk
extents showed they didn't go past EOF (last lines).
Further debug added to xfs_vm_write_failed() showed this happened
when punching out delalloc blocks beyond the end of the file after
the failed write:
[ 132.606693] ino 0x80e48: vwf to 0x181000, sze 0x180000
[ 132.609573] start_fsb 0xc01, end_fsb 0xc08
It punched the range 0xc01 -> 0xc08, but the range we really need to
punch is 0xc00 -> 0xc07 (8 blocks from 0xc00) as this testing was
run on a 512 byte block size filesystem (8 blocks per page).
the punch from is 0xc00. So end_fsb is correct, but start_fsb is
wrong as we punch from start_fsb for (end_fsb - start_fsb) blocks.
Hence we are not punching the delalloc block beyond EOF in the case.
The fix is simple - it's a silly off-by-one mistake in calculating
the range. It's especially silly because the macro used to calculate
the start_fsb already takes into account the case where the inode
size is an exact multiple of the filesystem block size...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
---
fs/xfs/linux-2.6/xfs_aops.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
index 2ce0e07..d2d3c84 100644
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -1482,7 +1482,7 @@ xfs_vm_write_failed(
* Check if there are any blocks that are outside of i_size
* that need to be trimmed back.
*/
- start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size) + 1;
+ start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size);
end_fsb = XFS_B_TO_FSB(ip->i_mount, to);
if (end_fsb <= start_fsb)
return;
--
1.7.10
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2012-08-23 5:04 UTC|newest]
Thread overview: 117+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-23 5:01 [RFC, PATCH 0/102]: xfs: 3.0.x stable kernel update Dave Chinner
2012-08-23 5:01 ` [PATCH 001/102] xfs: don't serialise adjacent concurrent direct IO appending writes Dave Chinner
2012-08-23 5:01 ` [PATCH 002/102] xfs: remove dead ENODEV handling in xfs_destroy_ioend Dave Chinner
2012-08-23 5:01 ` [PATCH 003/102] xfs: defer AIO/DIO completions Dave Chinner
2012-08-23 5:01 ` [PATCH 004/102] xfs: reduce ioend latency Dave Chinner
2012-08-23 5:01 ` [PATCH 005/102] xfs: wait for I/O completion when writing out pages in xfs_setattr_size Dave Chinner
2012-08-23 5:01 ` [PATCH 006/102] xfs: improve ioend error handling Dave Chinner
2012-08-23 5:01 ` [PATCH 007/102] xfs: Check the return value of xfs_buf_get() Dave Chinner
2012-08-23 5:01 ` [PATCH 008/102] xfs: Check the return value of xfs_trans_get_buf() Dave Chinner
2012-08-23 5:01 ` [PATCH 009/102] xfs: dont ignore error code from xfs_bmbt_update Dave Chinner
2012-08-23 5:01 ` [PATCH 010/102] xfs: fix possible overflow in xfs_ioc_trim() Dave Chinner
2012-08-23 5:01 ` [PATCH 011/102] xfs: XFS_TRANS_SWAPEXT is not a valid flag for Dave Chinner
2012-08-23 5:01 ` [PATCH 012/102] xfs: Don't allocate new buffers on every call to _xfs_buf_find Dave Chinner
2012-08-23 5:01 ` [PATCH 013/102] xfs: reduce the number of log forces from tail pushing Dave Chinner
2012-08-23 5:01 ` [PATCH 014/102] xfs: optimize fsync on directories Dave Chinner
2012-08-23 5:01 ` [PATCH 015/102] xfs: clean up buffer allocation Dave Chinner
2012-08-23 5:01 ` [PATCH 016/102] xfs: clean up xfs_ioerror_alert Dave Chinner
2012-08-23 5:01 ` [PATCH 017/102] xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks Dave Chinner
2012-08-23 5:01 ` [PATCH 018/102] xfs: do not flush data workqueues in xfs_flush_buftarg Dave Chinner
2012-08-23 5:01 ` [PATCH 019/102] xfs: add AIL pushing tracepoints Dave Chinner
2012-08-23 5:01 ` [PATCH 020/102] xfs: warn if direct reclaim tries to writeback pages Dave Chinner
2012-08-27 18:17 ` Christoph Hellwig
2012-09-05 11:32 ` Mel Gorman
2012-09-13 20:12 ` Mark Tinguely
2012-09-14 9:45 ` Mel Gorman
2012-09-14 12:56 ` Mark Tinguely
2012-09-14 16:18 ` Mark Tinguely
2012-08-23 5:01 ` [PATCH 021/102] xfs: fix force shutdown handling in xfs_end_io Dave Chinner
2012-08-23 5:01 ` [PATCH 022/102] xfs: fix allocation length overflow in xfs_bmapi_write() Dave Chinner
2012-08-23 5:01 ` [PATCH 023/102] xfs: fix the logspace waiting algorithm Dave Chinner
2012-08-23 5:01 ` [PATCH 024/102] xfs: untangle SYNC_WAIT and SYNC_TRYLOCK meanings for xfs_qm_dqflush Dave Chinner
2012-08-23 5:01 ` [PATCH 025/102] xfs: make sure to really flush all dquots in xfs_qm_quotacheck Dave Chinner
2012-08-23 5:01 ` [PATCH 026/102] xfs: simplify xfs_qm_detach_gdquots Dave Chinner
2012-08-23 5:01 ` [PATCH 027/102] xfs: mark the xfssyncd workqueue as non-reentrant Dave Chinner
2012-08-23 5:01 ` [PATCH 028/102] xfs: make i_flags an unsigned long Dave Chinner
2012-08-23 5:01 ` [PATCH 029/102] xfs: remove the i_size field in struct xfs_inode Dave Chinner
2012-08-23 5:01 ` [PATCH 030/102] xfs: remove the i_new_size " Dave Chinner
2012-08-23 5:01 ` [PATCH 031/102] xfs: always return with the iolock held from Dave Chinner
2012-08-23 5:01 ` [PATCH 032/102] xfs: cleanup xfs_file_aio_write Dave Chinner
2012-08-23 5:01 ` [PATCH 033/102] xfs: pass KM_SLEEP flag to kmem_realloc() in Dave Chinner
2012-08-23 5:01 ` [PATCH 034/102] xfs: show uuid when mount fails due to duplicate uuid Dave Chinner
2012-08-23 5:01 ` [PATCH 035/102] xfs: xfs_trans_add_item() - don't assign in ASSERT() when compare is intended Dave Chinner
2012-08-23 5:01 ` [PATCH 036/102] xfs: split tail_lsn assignments from log space wakeups Dave Chinner
2012-08-23 5:01 ` [PATCH 037/102] xfs: do exact log space wakeups in xlog_ungrant_log_space Dave Chinner
2012-08-23 5:01 ` [PATCH 038/102] xfs: remove xfs_trans_unlocked_item Dave Chinner
2012-08-23 5:01 ` [PATCH 039/102] xfs: cleanup xfs_log_space_wake Dave Chinner
2012-08-23 5:01 ` [PATCH 040/102] xfs: remove log space waitqueues Dave Chinner
2012-08-23 5:01 ` [PATCH 041/102] xfs: add the xlog_grant_head structure Dave Chinner
2012-08-23 5:02 ` [PATCH 042/102] xfs: add xlog_grant_head_init Dave Chinner
2012-08-23 5:02 ` [PATCH 043/102] xfs: add xlog_grant_head_wake_all Dave Chinner
2012-08-23 5:02 ` [PATCH 044/102] xfs: share code for grant head waiting Dave Chinner
2012-08-23 5:02 ` [PATCH 045/102] xfs: share code for grant head wakeups Dave Chinner
2012-08-23 5:02 ` [PATCH 046/102] xfs: share code for grant head availability checks Dave Chinner
2012-08-23 5:02 ` [PATCH 047/102] xfs: split and cleanup xfs_log_reserve Dave Chinner
2012-08-23 5:02 ` [PATCH 048/102] xfs: only take the ILOCK in xfs_reclaim_inode() Dave Chinner
2012-08-23 5:02 ` [PATCH 049/102] xfs: use per-filesystem I/O completion workqueues Dave Chinner
2012-08-23 5:02 ` [PATCH 050/102] xfs: do not require an ioend for new EOF calculation Dave Chinner
2012-08-23 5:02 ` [PATCH 054/102] xfs: make xfs_inode_item_size idempotent Dave Chinner
2012-08-23 5:02 ` [PATCH 055/102] xfs: split in-core and on-disk inode log item fields Dave Chinner
2012-08-23 5:02 ` [PATCH 056/102] xfs: reimplement fdatasync support Dave Chinner
2012-08-23 5:02 ` [PATCH 057/102] xfs: fallback to vmalloc for large buffers in xfs_attrmulti_attr_get Dave Chinner
2012-08-23 5:02 ` [PATCH 058/102] xfs: fallback to vmalloc for large buffers in xfs_getbmap Dave Chinner
2012-08-23 5:02 ` [PATCH 059/102] xfs: fix deadlock in xfs_rtfree_extent Dave Chinner
2012-08-23 5:02 ` [PATCH 060/102] xfs: Fix open flag handling in open_by_handle code Dave Chinner
2012-08-23 5:02 ` [PATCH 061/102] xfs: introduce an allocation workqueue Dave Chinner
2012-08-23 5:02 ` [PATCH 062/102] xfs: trace xfs_name strings correctly Dave Chinner
2012-08-23 5:02 ` [PATCH 063/102] xfs: Account log unmount transaction correctly Dave Chinner
2012-08-23 5:02 ` [PATCH 064/102] xfs: fix fstrim offset calculations Dave Chinner
2012-08-23 5:02 ` [PATCH 065/102] xfs: add lots of attribute trace points Dave Chinner
2012-08-23 5:02 ` [PATCH 066/102] xfs: don't fill statvfs with project quota for a directory Dave Chinner
2012-08-23 5:02 ` [PATCH 067/102] xfs: Ensure inode reclaim can run during quotacheck Dave Chinner
2012-08-23 5:02 ` [PATCH 068/102] xfs: avoid taking the ilock unnessecarily in xfs_qm_dqattach Dave Chinner
2012-08-23 5:02 ` [PATCH 069/102] xfs: reduce ilock hold times in xfs_file_aio_write_checks Dave Chinner
2012-08-23 5:02 ` [PATCH 070/102] xfs: reduce ilock hold times in xfs_setattr_size Dave Chinner
2012-08-23 5:02 ` [PATCH 071/102] xfs: push the ilock into xfs_zero_eof Dave Chinner
2012-08-23 5:02 ` [PATCH 072/102] xfs: use shared ilock mode for direct IO writes by default Dave Chinner
2012-08-23 5:02 ` Dave Chinner [this message]
2012-08-23 5:02 ` [PATCH 074/102] xfs: using GFP_NOFS for blkdev_issue_flush Dave Chinner
2012-08-23 5:02 ` [PATCH 075/102] xfs: page type check in writeback only checks last buffer Dave Chinner
2012-08-23 5:02 ` [PATCH 076/102] xfs: punch new delalloc blocks out of failed writes inside Dave Chinner
2012-08-23 5:02 ` [PATCH 077/102] xfs: prevent needless mount warning causing test failures Dave Chinner
2012-08-23 5:02 ` [PATCH 078/102] xfs: don't assert on delalloc regions beyond EOF Dave Chinner
2012-08-23 5:02 ` [PATCH 079/102] xfs: limit specualtive delalloc to maxioffset Dave Chinner
2012-08-23 5:02 ` [PATCH 080/102] xfs: Use preallocation for inodes with extsz hints Dave Chinner
2012-08-23 5:02 ` [PATCH 081/102] xfs: fix buffer lookup race on allocation failure Dave Chinner
2012-08-23 5:02 ` [PATCH 082/102] xfs: check for buffer errors before waiting Dave Chinner
2012-08-23 5:02 ` [PATCH 083/102] xfs: fix incorrect b_offset initialisation Dave Chinner
2012-08-23 5:02 ` [PATCH 084/102] xfs: use kmem_zone_zalloc for buffers Dave Chinner
2012-08-23 5:02 ` [PATCH 085/102] xfs: use iolock on XFS_IOC_ALLOCSP calls Dave Chinner
2012-08-23 5:02 ` [PATCH 086/102] xfs: Properly exclude IO type flags from buffer flags Dave Chinner
2012-08-23 5:02 ` [PATCH 087/102] xfs: flush outstanding buffers on log mount failure Dave Chinner
2012-08-23 5:02 ` [PATCH 088/102] xfs: protect xfs_sync_worker with s_umount semaphore Dave Chinner
2012-08-23 5:02 ` [PATCH 089/102] xfs: fix memory reclaim deadlock on agi buffer Dave Chinner
2012-08-23 5:02 ` [PATCH 090/102] xfs: add trace points for log forces Dave Chinner
2012-08-23 5:02 ` [PATCH 091/102] xfs: switch to proper __bitwise type for KM_... flags Dave Chinner
2012-08-23 5:02 ` [PATCH 092/102] xfs: xfs_vm_writepage clear iomap_valid when Dave Chinner
2012-08-23 5:02 ` [PATCH 093/102] xfs: fix debug_object WARN at xfs_alloc_vextent() Dave Chinner
2012-08-23 5:02 ` [PATCH 094/102] xfs: m_maxioffset is redundant Dave Chinner
2012-08-23 5:02 ` [PATCH 095/102] xfs: make largest supported offset less shouty Dave Chinner
2012-08-23 5:02 ` [PATCH 096/102] xfs: kill copy and paste segment checks in xfs_file_aio_read Dave Chinner
2012-08-23 5:02 ` [PATCH 097/102] xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near Dave Chinner
2012-08-23 5:02 ` [PATCH 098/102] xfs: shutdown xfs_sync_worker before the log Dave Chinner
2012-08-23 5:02 ` [PATCH 099/102] xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near Dave Chinner
2012-08-23 5:02 ` [PATCH 100/102] xfs: don't defer metadata allocation to the workqueue Dave Chinner
2012-08-23 5:02 ` [PATCH 101/102] xfs: prevent recursion in xfs_buf_iorequest Dave Chinner
2012-08-23 5:03 ` [PATCH 102/102] xfs: handle EOF correctly in xfs_vm_writepage Dave Chinner
2012-08-23 21:54 ` [RFC, PATCH 0/102]: xfs: 3.0.x stable kernel update Dave Chinner
2012-08-23 22:14 ` Ben Myers
2012-08-23 22:23 ` Matthias Schniedermeyer
2012-09-01 23:10 ` Christoph Hellwig
2012-09-03 6:04 ` Dave Chinner
2012-09-04 21:13 ` Ben Myers
2012-09-05 4:24 ` Dave Chinner
2012-09-13 18:32 ` Mark Tinguely
2012-09-18 13:59 ` Mark Tinguely
2012-09-18 23:50 ` Dave Chinner
2012-09-19 13:14 ` Mark Tinguely
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1345698180-13612-74-git-send-email-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox