linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Hansen <dave.hansen@linux.intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Anvin <hpa@zytor.com>, Theodore Ts'o <tytso@mit.edu>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [REGRESSION] 998ef75ddb and aio-dio-invalidate-failure w/ data=journal
Date: Mon, 5 Oct 2015 16:33:17 -0700	[thread overview]
Message-ID: <561308BD.5090304@linux.intel.com> (raw)
In-Reply-To: <CA+55aFz9hH26=iQa832J5789Kf1ufYEJPt5L_D1wcDhCycUCFQ@mail.gmail.com>

> On Mon, Oct 5, 2015 at 10:18 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Your ext4 patch may well fix the issue, and be the right thing to do
>> (_regardless_ of the revert, in fact - while it might make the revert
>> unnecessary, it might also be a good idea even if we do revert).
> 
> Thinking a bit more about your patch, I actually am getting more and
> more convinced that it's the wrong thing to do.
> 
> Why?
> 
> Because the whole "Setting copied=0 will tell the upper layers to
> repeat the write" just seems a nasty layering violation, where the
> low-level filesystem uses a magic return code to introduce a special
> case at the upper layers. But the upper layers are actually already
> *aware* of the special case, and in fact have a comment about it.

It's nasty, but it's widespread.  I'm fairly sure some version of that
code shows up in all these places:

	ext4_journalled_write_end()
	generic_write_end()->block_write_end()
	ocfs2_write_end_inline()
	ocfs2_write_end_nolock()
	logfs_write_end()
	ext3_journalled_write_end()
	btrfs_copy_from_user()
	reiserfs_write_end()

> The ext4 side still worries me, though. You made that
> "page_zero_new_buffers()" conditional on "copied" being non-zero, but
> I'm not convinced it can be conditional. Even if we retry, that retry
> may end up failing (for example, because the source isn't mapped, so
> we return -EFAULT rather than re-doing the write), but we have those
> new buffers that got allocated in write_begin(), and now nobody has
> ever written any data to them at all, so they have random stale
> contents.

Yeah, I guess they're not uptodate and nobody is on the hook to make
them uptodate.

> So I do think this needs more thought. Or at least somebody should
> explain to me better why it's all ok.
> 
> I'm attaching the "copied = 0" special case thing at the
> generic_perform_write() level as a patch for comments. But for now I
> still think that reverting would seem to be the safer thing (which
> still possibly leaves things buggy with a racy unmap, but at least
> it's the old bug that we've never hit in practice).

FWIW, I'm not objecting at all to a revert.  It's obviously the safe
thing to do.

> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2493,6 +2493,20 @@ again:
>  		pagefault_enable();
>  		flush_dcache_page(page);
>  
> +		/*
> +		 * If we didn't successfully copy all the data from user space,
> +		 * and the target page is not up-to-date, we will have to prefault
> +		 * the source. And if the page wasn't up-to-date before the write,
> +		 * the "write_end()" may need to *make* it up-to-date, and thus
> +		 * overwrite our partial copy.
> +		 *
> +		 * So for that case, thow away the whole thing and force a full
> +		 * restart (see comment above, and iov_iter_fault_in_readable()
> +		 * below).
> +		 */
> +		if (copied < bytes && !PageUptodate(page))
> +			copied = 0;
> +
>  		status = a_ops->write_end(file, mapping, pos, bytes, copied,
>  						page, fsdata);

Did you mean that as a cleanup, or to fix the regression?

Since the page isn't faulted in yet, iov_iter_copy_from_user_atomic()
had already set copied=0, so that has no effect on the ext4 code being
called and it spits out the same warning Ted originally reported.

Ted, do we really just need to tell ext4 that this dirtying operation is
OK?  Or is there really a risk of filesystem corruption that we're not
seeing?

  reply	other threads:[~2015-10-05 23:33 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-05 15:22 [REGRESSION] 998ef75ddb and aio-dio-invalidate-failure w/ data=journal Theodore Ts'o
2015-10-05 15:58 ` Linus Torvalds
2015-10-05 16:23   ` Dave Hansen
2015-10-05 20:22     ` Linus Torvalds
2015-10-05 20:48       ` Dave Hansen
2015-10-05 21:18         ` Linus Torvalds
2015-10-05 21:55           ` Linus Torvalds
2015-10-05 23:33             ` Dave Hansen [this message]
2015-10-06  9:01               ` Linus Torvalds
2015-10-05 20:49       ` H. Peter Anvin
2015-10-06  7:56         ` Ingo Molnar
2015-10-06  9:10           ` Linus Torvalds
2015-10-06  9:27             ` Ingo Molnar
2015-10-06 13:29               ` Linus Torvalds
2015-10-06 13:42                 ` Ingo Molnar
2015-10-05 16:03 ` Dave Hansen
2015-10-05 18:04 ` Dave Hansen
2015-10-07  3:34   ` Theodore Ts'o
2015-10-07  7:32     ` Linus Torvalds
2015-10-07 15:43       ` Theodore Ts'o
2015-10-09  4:01         ` [PATCH] ext4: use private version of page_zero_new_buffers() for data=journal mode Theodore Ts'o
2015-10-13  6:06           ` Leonid V. Fedorenchik
2015-10-15 11:17           ` Jan Kara
2025-01-26 17:01           ` Mateusz Guzik
2025-01-26 18:48             ` Linus Torvalds
2025-01-26 19:49               ` Mateusz Guzik
2025-01-26 22:03                 ` Linus Torvalds
2025-01-26 22:45                   ` Mateusz Guzik
2025-01-27 20:52                     ` Dave Hansen
2025-01-27 21:46                       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=561308BD.5090304@linux.intel.com \
    --to=dave.hansen@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=hpa@zytor.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).