linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <muchun.song@linux.dev>,
	James Houghton <jthoughton@google.com>,
	Peter Xu <peterx@redhat.com>, Gavin Guo <gavinguo@igalia.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/5] mm,hugetlb: Document the reason to lock the folio in the faulting path
Date: Fri, 13 Jun 2025 21:57:23 +0200
Message-ID: <ffeeb3d2-0e45-43d1-b2e1-a55f09b160f5@redhat.com>
In-Reply-To: <aEw0dxfc5n8v1-Mp@localhost.localdomain>

On 13.06.25 16:23, Oscar Salvador wrote:
> On Fri, Jun 13, 2025 at 03:56:15PM +0200, David Hildenbrand wrote:
>> On 12.06.25 15:46, Oscar Salvador wrote:
>>> -	/* hugetlb_wp() requires page locks of pte_page(vmf.orig_pte) */
>>> +	/*
>>> +	 * We need to lock the folio before calling hugetlb_wp().
>>> +	 * Either the folio is in the pagecache and we need to copy it over
>>> +	 * to another file, so it must remain stable throughout the operation,
>>
>> But as discussed, why is that the case? We don't need that for ordinary
>> pages, and existing folio mappings can already concurrently modify the page?
> 
> Normal faulting path takes the lock when we fault in a file read-only or
> to map it privately.
> That is done via __do_fault or cow_fault, in __do_fault()->vma->vm_ops->fault().
> E.g. filemap_fault() will locate the page and lock it.
> And it will hold it during the entire operation, note that we unlock it
> after we have called finish_fault().
> 
> The page can't go away because filemap_fault also gets a reference on
> it, so I guess it's to hold it stable.
> 

What I meant is:

Assume we have a pagecache page mapped into our page tables R/O 
(MAP_PRIVATE mapping).

During a write fault on such a pagecache page, we end up in
do_wp_page()->wp_page_copy(), where we perform the copy via
__wp_page_copy_user() without the folio lock.

In wp_page_copy(), we retake the pt lock, to make sure that the page is 
still mapped (pte_same). If the page is no longer mapped, we retry the 
fault.

In that case, we only want to make sure that the folio is still mapped 
after possibly dropping the page table lock in between.

As we are holding an additional folio reference in 
do_wp_page()->wp_page_copy(), the folio cannot get freed concurrently.
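
As a rough illustration of the pattern described above, here is a minimal userspace sketch. It is not kernel code: `model_folio`, `model_pt`, `model_wp_copy()` and `model_racing_unmap()` are hypothetical names invented for this example, a pthread mutex stands in for the page table lock, and a pointer comparison stands in for the pte_same() check. The optional hook only simulates a racing unmap while the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct model_folio {
	int refcount;
	char data[MODEL_PAGE_SIZE];
};

struct model_pt {
	pthread_mutex_t ptl;		/* plays the role of the page table lock */
	struct model_folio *pte;	/* the "PTE": which folio is mapped, or NULL */
};

/* Optional hook run while the lock is dropped, to simulate a racing change. */
typedef void (*race_hook_t)(struct model_pt *pt);

/* Simulated concurrent unmap: clears the "PTE" and drops the mapping's reference. */
static void model_racing_unmap(struct model_pt *pt)
{
	pthread_mutex_lock(&pt->ptl);
	if (pt->pte) {
		pt->pte->refcount--;
		pt->pte = NULL;
	}
	pthread_mutex_unlock(&pt->ptl);
}

/*
 * Copy the mapped folio into new_folio without any folio lock, then
 * recheck under the lock that the mapping is unchanged.
 * Returns true if the copy was installed, false if the fault must be retried.
 */
static bool model_wp_copy(struct model_pt *pt, struct model_folio *new_folio,
			  race_hook_t race)
{
	struct model_folio *orig;
	bool same;

	pthread_mutex_lock(&pt->ptl);
	orig = pt->pte;
	if (!orig) {
		pthread_mutex_unlock(&pt->ptl);
		return false;
	}
	orig->refcount++;		/* extra reference: folio cannot be freed */
	pthread_mutex_unlock(&pt->ptl);

	if (race)
		race(pt);		/* window in which the mapping may change */

	/* The copy itself runs with neither the lock nor a folio lock held. */
	memcpy(new_folio->data, orig->data, MODEL_PAGE_SIZE);

	pthread_mutex_lock(&pt->ptl);
	same = (pt->pte == orig);	/* pte_same()-style recheck */
	if (same) {
		pt->pte = new_folio;
		new_folio->refcount++;	/* reference held by the new mapping */
		orig->refcount--;	/* old mapping's reference goes away */
	}
	pthread_mutex_unlock(&pt->ptl);

	assert(orig->refcount > 0);	/* our extra reference kept it alive */
	orig->refcount--;		/* drop the extra reference */
	return same;
}
```

In this model, a false return corresponds to "retry the fault": the copy is simply discarded, and the only thing that kept the original folio from disappearing during the copy was the extra reference, not a folio lock.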


There is indeed the do_cow_fault() path where we avoid faulting in the
pagecache page in the first place. So no page table reference, and I can
understand why we would need the folio lock there.


Regarding hugetlb_no_page(): I think we could drop the folio lock for a 
pagecache folio after inserting the folio into the page table. Just like 
do_wp_page()->wp_page_copy(), we would have to verify again under PTL if 
the folio is still mapped

... which we already do through pte_same() checks?

-- 
Cheers,

David / dhildenb



Thread overview: 26+ messages
2025-06-12 13:46 [PATCH 0/5] Misc rework on hugetlb_fault Oscar Salvador
2025-06-12 13:46 ` [PATCH 1/5] mm,hugetlb: Change mechanism to detect a COW on private mapping Oscar Salvador
2025-06-13 13:52   ` David Hildenbrand
2025-06-12 13:46 ` [PATCH 2/5] mm,hugetlb: Document the reason to lock the folio in the faulting path Oscar Salvador
2025-06-13 13:56   ` David Hildenbrand
2025-06-13 14:23     ` Oscar Salvador
2025-06-13 19:57       ` David Hildenbrand [this message]
2025-06-13 21:47         ` Oscar Salvador
2025-06-14  9:07           ` Oscar Salvador
2025-06-16  9:22             ` David Hildenbrand
2025-06-16 14:10               ` Oscar Salvador
2025-06-16 14:41                 ` David Hildenbrand
2025-06-17 10:03                   ` Oscar Salvador
2025-06-17 11:27                     ` David Hildenbrand
2025-06-17 12:04                       ` Oscar Salvador
2025-06-17 12:08                         ` David Hildenbrand
2025-06-17 12:10                           ` Oscar Salvador
2025-06-17 12:50                             ` Oscar Salvador
2025-06-17 13:42                               ` David Hildenbrand
2025-06-17 14:00                                 ` Oscar Salvador
2025-06-19 11:52                                 ` Oscar Salvador
2025-06-12 13:46 ` [PATCH 3/5] mm,hugetlb: Conver anon_rmap into boolean Oscar Salvador
2025-06-13 13:48   ` David Hildenbrand
2025-06-12 13:47 ` [PATCH 4/5] mm,hugetlb: Drop obsolete comment about non-present pte and second faults Oscar Salvador
2025-06-12 13:47 ` [PATCH 5/5] mm,hugetlb: Drop unlikelys from hugetlb_fault Oscar Salvador
2025-06-13  8:55 ` [PATCH 0/5] Misc rework on hugetlb_fault Oscar Salvador
