* [PATCH] xfs: don't BUG() on mixed direct and mapped I/O
@ 2016-10-31 14:14 Brian Foster
2016-10-31 15:46 ` Christoph Hellwig
0 siblings, 1 reply; 3+ messages in thread
From: Brian Foster @ 2016-10-31 14:14 UTC (permalink / raw)
To: linux-xfs
We've had reports of generic/095 causing XFS to BUG() in
__xfs_get_blocks() due to the existence of delalloc blocks on a direct
I/O read. generic/095 issues a mix of various types of I/O, including
direct and memory mapped I/O to a single file. This is clearly not
supported behavior and is known to lead to such problems. E.g., the lack
of exclusion between the direct I/O and write fault paths means that a
write fault can allocate delalloc blocks in a region of a file that was
previously a hole after the direct read has attempted to flush/inval the
file range, but before it actually reads the block mapping. In turn, the
direct read discovers a delalloc extent and cannot proceed.
While the appropriate solution here is to not mix direct and memory
mapped I/O to the same regions of the same file, the current BUG_ON()
behavior is probably overkill as it can crash the entire system.
Instead, localize the failure to the I/O in question by returning an
error for a direct I/O that cannot be handled safely due to delalloc
blocks. Be careful to allow the case of a direct write to post-eof
delalloc blocks. This can occur due to speculative preallocation and is
safe as post-eof blocks are not accompanied by dirty pages in pagecache
(conversely, preallocation within eof must have been zeroed, and thus
dirtied, before the inode size could have been increased beyond said
blocks).
Finally, provide an additional warning if a direct I/O write occurs
while the file is memory mapped. This may not catch all problematic
scenarios, but provides a hint that some known-to-be-problematic I/O
methods are in use.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
FWIW, this survived xfstests and a weekend of a 16x fsstress run.
Brian
fs/xfs/xfs_aops.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 3e57a56..2693ba8 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1361,6 +1361,26 @@ __xfs_get_blocks(
if (error)
goto out_unlock;
+ /*
+ * The only time we can ever safely find delalloc blocks on direct I/O
+ * is a dio write to post-eof speculative preallocation. All other
+ * scenarios are indicative of a problem or misuse (such as mixing
+ * direct and mapped I/O).
+ *
+ * The file may be unmapped by the time we get here so we cannot
+ * reliably fail the I/O based on mapping. Instead, fail the I/O if this
+ * is a read or a write within eof. Otherwise, carry on but warn as a
+ * precaution if the file happens to be mapped.
+ */
+ if (direct && imap.br_startblock == DELAYSTARTBLOCK) {
+ if (!create || offset < i_size_read(VFS_I(ip))) {
+ WARN_ON_ONCE(1);
+ error = -EIO;
+ goto out_unlock;
+ }
+ WARN_ON_ONCE(mapping_mapped(VFS_I(ip)->i_mapping));
+ }
+
/* for DAX, we convert unwritten extents directly */
if (create &&
(!nimaps ||
@@ -1450,8 +1470,6 @@ __xfs_get_blocks(
(new || ISUNWRITTEN(&imap))))
set_buffer_new(bh_result);
- BUG_ON(direct && imap.br_startblock == DELAYSTARTBLOCK);
-
return 0;
out_unlock:
--
2.7.4
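
For readers following the logic of the new check, here is a minimal user-space sketch of the decision it makes. It is not kernel code: the function name dio_delalloc_check() and its scalar parameters are hypothetical stand-ins for the state that __xfs_get_blocks() examines (direct, create, whether imap.br_startblock == DELAYSTARTBLOCK, the I/O offset, and the inode size). The separate WARN_ON_ONCE() for mapped files is left out.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the check added by the patch above.
 *
 * direct   - the request is direct I/O
 * create   - the request is a write (block allocation allowed)
 * delalloc - the mapping found is a delayed allocation extent
 * offset   - file offset of the I/O
 * isize    - current inode size (eof)
 *
 * Returns 0 if the I/O may proceed, -EIO if it must be failed.
 */
int dio_delalloc_check(bool direct, bool create, bool delalloc,
                       long long offset, long long isize)
{
    /* Buffered I/O, or no delalloc extent found: nothing to check. */
    if (!direct || !delalloc)
        return 0;

    /* A dio read, or a dio write within eof, cannot be handled safely. */
    if (!create || offset < isize)
        return -EIO;

    /* A dio write to post-eof speculative preallocation is allowed. */
    return 0;
}
```

Under these assumptions, a direct read that finds delalloc blocks fails with -EIO regardless of offset, while a direct write fails only when it starts within eof.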
* Re: [PATCH] xfs: don't BUG() on mixed direct and mapped I/O
From: Christoph Hellwig @ 2016-10-31 15:46 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-xfs
On Mon, Oct 31, 2016 at 10:14:28AM -0400, Brian Foster wrote:
> We've had reports of generic/095 causing XFS to BUG() in
> __xfs_get_blocks() due to the existence of delalloc blocks on a direct
> I/O read. generic/095 issues a mix of various types of I/O, including
> direct and memory mapped I/O to a single file.
Can you explain the scenario in which case this happens in a little
more detail? The patch looks fine to me, but I'd really like to
understand how this happens.
* Re: [PATCH] xfs: don't BUG() on mixed direct and mapped I/O
From: Brian Foster @ 2016-10-31 16:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Oct 31, 2016 at 08:46:42AM -0700, Christoph Hellwig wrote:
> On Mon, Oct 31, 2016 at 10:14:28AM -0400, Brian Foster wrote:
> > We've had reports of generic/095 causing XFS to BUG() in
> > __xfs_get_blocks() due to the existence of delalloc blocks on a direct
> > I/O read. generic/095 issues a mix of various types of I/O, including
> > direct and memory mapped I/O to a single file.
>
> Can you explain the scenario in which case this happens in a little
> more detail? The patch looks fine to me, but I'd really like to
> understand how this happens.
Sure... the case I reproduced is a race between a direct I/O read and a
mapped write to a hole in a file. The direct read gets through
xfs_file_dio_aio_read() and down to __xfs_get_blocks() while the region
is still a hole. Before the xfs_bmapi_read() call from
__xfs_get_blocks(), a mapped write occurs and allocates delalloc blocks
in the associated file range. xfs_bmapi_read() then returns a delalloc
mapping for a dio read and falls through to the BUG_ON().
FWIW, the specific reproducer was a tweaked variant of generic/095 to up
the iodepth (1024), iodepth_batch (60), and numjobs (20) fio params. It
was also on a ppc64 box with a 64k page size, so that might have also
improved the chances of a race. This can be manufactured on demand with a
hack to delay the dio read in __xfs_get_blocks(), however. E.g., stick a
'if (!create && direct) ssleep(N);' right before xfs_bmapi_read(), run a
single block dio read to a hole in the file, and then a single block
mapped write to the same offset as the read while it is delayed.
Brian
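
The user-space half of the on-demand reproducer described above could look roughly like the following. This is a sketch under stated assumptions, not the program that was actually used: the function name mixed_io_race(), the 4096-byte block size, and the delay parameter are all hypothetical, and the race is only expected to trigger on a kernel carrying the ssleep() hack in __xfs_get_blocks(). A child process issues the single-block read of the hole; the parent waits briefly and then performs the single-block mapped write to the same offset.

```c
#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define BLKSZ 4096              /* assumed filesystem block size */

/*
 * Read one block of a hole in a child process while the parent writes the
 * same block through a shared mapping. Pass O_DIRECT in read_flags to make
 * the read a direct I/O. Returns 0 if both I/Os completed, 1 if the read
 * failed or came up short, -1 on a setup error.
 */
int mixed_io_race(const char *path, int read_flags, unsigned delay_ms)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* Extend the file without writing so that block 0 is a hole. */
    if (ftruncate(fd, BLKSZ) < 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: single-block read of the hole at offset 0. */
        void *buf;
        int rfd = open(path, O_RDONLY | read_flags);
        if (rfd < 0 || posix_memalign(&buf, BLKSZ, BLKSZ) != 0)
            _exit(2);
        _exit(pread(rfd, buf, BLKSZ, 0) == BLKSZ ? 0 : 1);
    }

    /* Parent: give the read time to start, then dirty the block via mmap. */
    struct timespec ts = { delay_ms / 1000, (long)(delay_ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);

    int rc = -1;
    char *map = mmap(NULL, BLKSZ, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map != MAP_FAILED) {
        memset(map, 0xab, BLKSZ);   /* the write fault happens here */
        munmap(map, BLKSZ);
        rc = 0;
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        rc = -1;
    close(fd);
    if (rc < 0 || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}
```

On a hacked kernel one would call it with O_DIRECT and a delay shorter than the N seconds of the injected sleep, for example mixed_io_race("/mnt/test/file", O_DIRECT, 1000), where the path is a placeholder for a file on the XFS filesystem under test. With the patch applied, the expectation is that the read fails with an error (return value 1) rather than the kernel hitting the BUG_ON().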