linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Eric Whitney <enwlinux@gmail.com>,
	linux-ext4@vger.kernel.org, jack@suse.cz
Subject: Re: [PATCH] ext4: revert i_data_sum locking cleanups for dioread_nolock
Date: Thu, 18 Feb 2016 23:09:56 +0100	[thread overview]
Message-ID: <20160218220956.GA24219@quack.suse.cz> (raw)
In-Reply-To: <20160216050840.GA3426@thunk.org>

[-- Attachment #1: Type: text/plain, Size: 2283 bytes --]

On Tue 16-02-16 00:08:40, Ted Tso wrote:
> On Fri, Feb 12, 2016 at 01:25:06PM -0500, Eric Whitney wrote:
> > Commit 2bcba4781fa3 ("ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flag")
> > can cause a kernel panic when xfstest generic/300 is run on a file
> > system mounted with dioread_nolock.  The panic is typically triggered
> > from check_irqs_on() (fs/buffer.c: 1272), and happens because
> > ext4_end_io_dio() is being called in an interrupt context rather than
> > from a workqueue for AIO.  This suggests that buffer_defer_completion
> > may not be set properly when creating an unwritten extent for async
> > direct I/O.
> > 
> > Revert the locking changes until this problem can be resolved.  Patch
> > applied to 4.5-rc3 and tested with a full xfstest-bld auto group run.
> > 
> > Signed-off-by: Eric Whitney <enwlinux@gmail.com>
> 
> Applied, thanks.
> 
> I was able to reliably reproduce the problem without this revert patch
> using a 32-bit x86 kvm-xfstests test appliance:
> 
>    ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests/testing/root_fs.img.i386
> 
> Using the command: "kvm-xfstests -c dioread_nolock generic/300"
> 
> With this patch, the problem goes away, so reverting the patch is
> clearly the right thing for now.  There is something screwy going on,
> since the original commit shouldn't have made a difference, but let's
> revert it first, and figure it out when we have time to investigate
> more deeply.

OK, I had a look into this. So I'm not 100% what has happened but the
following looks likely: Current io_end handling can overwrite io_end
pointer in the inode in dioread_nolock mode (nothing prevents unlocked DIO
to overwrite pointer of locked DIO and then clear it out). I suspect that
the change in i_data_sem locking made this race more visible. Attached
patch should fix the issue (I don't see failures of generic/300 with it in
dioread_nolock mode). Can you consider this instead of a revert Eric sent?

I have also a more complete rewrite of io_end handling which makes the code
more comprehensible and avoids storing io_end pointer in the inode (thus
avoids similar pitfalls in future) but that is a 4.6 matter. I'll submit
the rewrite once xfstests runs complete.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-ext4-Fix-crashes-in-dioread_nolock-mode.patch --]
[-- Type: text/x-patch, Size: 2609 bytes --]

>From 09eae61bb0433f8424f606fde00db35dbd72615d Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Thu, 18 Feb 2016 22:50:07 +0100
Subject: [PATCH] ext4: Fix crashes in dioread_nolock mode

Competing overwrite DIO in dioread_nolock mode will just overwrite
pointer to io_end in the inode. This may result in data corruption or
extent conversion happening from IO completion interrupt because we
don't properly set buffer_defer_completion() when unlocked DIO races
with locked DIO to unwritten extent.

Since unlocked DIO doesn't need io_end for anything, just avoid
allocating it and corrupting pointer from inode for locked DIO.
A cleaner fix would be to avoid these games with io_end pointer from the
inode but that requires more intrusive changes so we leave that for
later.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 83bc8bfb3bea..ee8ca1ff4023 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3253,29 +3253,29 @@ static ssize_t ext4_ext_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 	 * case, we allocate an io_end structure to hook to the iocb.
 	 */
 	iocb->private = NULL;
-	ext4_inode_aio_set(inode, NULL);
-	if (!is_sync_kiocb(iocb)) {
-		io_end = ext4_init_io_end(inode, GFP_NOFS);
-		if (!io_end) {
-			ret = -ENOMEM;
-			goto retake_lock;
-		}
-		/*
-		 * Grab reference for DIO. Will be dropped in ext4_end_io_dio()
-		 */
-		iocb->private = ext4_get_io_end(io_end);
-		/*
-		 * we save the io structure for current async direct
-		 * IO, so that later ext4_map_blocks() could flag the
-		 * io structure whether there is a unwritten extents
-		 * needs to be converted when IO is completed.
-		 */
-		ext4_inode_aio_set(inode, io_end);
-	}
-
 	if (overwrite) {
 		get_block_func = ext4_get_block_overwrite;
 	} else {
+		ext4_inode_aio_set(inode, NULL);
+		if (!is_sync_kiocb(iocb)) {
+			io_end = ext4_init_io_end(inode, GFP_NOFS);
+			if (!io_end) {
+				ret = -ENOMEM;
+				goto retake_lock;
+			}
+			/*
+			 * Grab reference for DIO. Will be dropped in
+			 * ext4_end_io_dio()
+			 */
+			iocb->private = ext4_get_io_end(io_end);
+			/*
+			 * we save the io structure for current async direct
+			 * IO, so that later ext4_map_blocks() could flag the
+			 * io structure whether there is a unwritten extents
+			 * needs to be converted when IO is completed.
+			 */
+			ext4_inode_aio_set(inode, io_end);
+		}
 		get_block_func = ext4_get_block_write;
 		dio_flags = DIO_LOCKING;
 	}
-- 
2.6.2


  reply	other threads:[~2016-02-18 22:09 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-12 18:25 [PATCH] ext4: revert i_data_sum locking cleanups for dioread_nolock Eric Whitney
2016-02-16  5:08 ` Theodore Ts'o
2016-02-18 22:09   ` Jan Kara [this message]
2016-02-19  5:30     ` Theodore Ts'o
2016-02-19 14:19       ` Eric Whitney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160218220956.GA24219@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=enwlinux@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).