Re: [PATCH] ext4: use private version of page_zero_new_buffers() for data=journal mode

public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed

From: Dave Hansen <dave.hansen@intel.com>
To: Mateusz Guzik <mjguzik@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: Theodore Ts'o <tytso@mit.edu>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Linux Kernel Developers List <linux-kernel@vger.kernel.org>,
	akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] ext4: use private version of page_zero_new_buffers() for data=journal mode
Date: Mon, 27 Jan 2025 12:52:51 -0800	[thread overview]
Message-ID: <76b80fff-0f62-4708-95e6-87de272f35a5@intel.com> (raw)
In-Reply-To: <CAGudoHHyEQ1swCJkFDicb8hYYSMCXyMUcRVrtWkbeYwSChCmpQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1557 bytes --]

On 1/26/25 14:45, Mateusz Guzik wrote:
>>
>> So if you don't get around to it, and _if_ I remember this when the
>> merge window is open, I might do it in my local tree, but then it will
>> end up being too late for this merge window.
>>
> The to-be-unreverted change was written by Dave (cc'ed).
> 
> I had a brief chat with him on irc, he said he is going to submit an
> updated patch.

I poked at it a bit today. There's obviously been the page=>folio churn
and also iov_iter_fault_in_readable() got renamed and got some slightly
new semantics.

Additionally, I'm doubting the explicit pagefault_disable(). The
original patch did this:

+      pagefault_disable();
       copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
+      pagefault_enable();

and the modern generic_perform_write() is using:

       copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);

But the "atomic" copy is using kmap_atomic() internally which has a
built-in pagefault_disable(). It wouldn't be super atomic if it were
handling page faults of course.

So I don't think generic_perform_write() needs to do its own
pagefault_disable().

I actually still had the original decade-old test case sitting around
compiled on my test box. It still triggers the issue and _will_ livelock
if fault_in_iov_iter_readable() isn't called somewhere.

Anyway, here's a patch that compiles, boots and doesn't immediately fall
over on ext4 in case anyone else wants to poke at it. I'll do a real
changelog, SoB, etc.... and send it out for real tomorrow if it holds up.

[-- Attachment #2: generic_perform_write-1.patch --]
[-- Type: text/x-patch, Size: 1692 bytes --]


index 4f476411a9a2d..98b37e4c6d43c 100644

---

 b/mm/filemap.c |   25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff -puN mm/filemap.c~generic_perform_write-1 mm/filemap.c
--- a/mm/filemap.c~generic_perform_write-1	2025-01-27 09:53:13.219120969 -0800
+++ b/mm/filemap.c	2025-01-27 12:28:40.333920434 -0800
@@ -4027,17 +4027,6 @@ retry:
 		bytes = min(chunk - offset, bytes);
 		balance_dirty_pages_ratelimited(mapping);
 
-		/*
-		 * Bring in the user page that we will copy from _first_.
-		 * Otherwise there's a nasty deadlock on copying from the
-		 * same page as we're writing to, without it being marked
-		 * up-to-date.
-		 */
-		if (unlikely(fault_in_iov_iter_readable(i, bytes) == bytes)) {
-			status = -EFAULT;
-			break;
-		}
-
 		if (fatal_signal_pending(current)) {
 			status = -EINTR;
 			break;
@@ -4055,6 +4044,11 @@ retry:
 		if (mapping_writably_mapped(mapping))
 			flush_dcache_folio(folio);
 
+		/*
+		 * This needs to be atomic because actually handling page
+		 * faults on 'i' can deadlock if the copy targets a
+		 * userspace mapping of 'folio'.
+		 */
 		copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
 		flush_dcache_folio(folio);
 
@@ -4080,6 +4074,15 @@ retry:
 				bytes = copied;
 				goto retry;
 			}
+			/*
+			 * 'folio' is now unlocked and faults on it can be
+			 * handled. Ensure forward progress by trying to
+			 * fault it in now.
+			 */
+                        if (fault_in_iov_iter_readable(i, bytes) == bytes) {
+                                status = -EFAULT;
+                                break;
+                        }
 		} else {
 			pos += status;
 			written += status;
_

next prev parent reply	other threads:[~2025-01-27 20:52 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20151007154303.GC24678@thunk.org>
     [not found] ` <1444363269-25956-1-git-send-email-tytso@mit.edu>
2025-01-26 17:01   ` [PATCH] ext4: use private version of page_zero_new_buffers() for data=journal mode Mateusz Guzik
2025-01-26 18:48     ` Linus Torvalds
2025-01-26 19:49       ` Mateusz Guzik
2025-01-26 22:03         ` Linus Torvalds
2025-01-26 22:45           ` Mateusz Guzik
2025-01-27 20:52             ` Dave Hansen [this message]
2025-01-27 21:46               ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=76b80fff-0f62-4708-95e6-87de272f35a5@intel.com \
    --to=dave.hansen@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mjguzik@gmail.com \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox