Re: [RFC PATCH 1/3] mm, hugetlb: Clean up locking in hugetlb_fault and hugetlb_wp

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>, Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <muchun.song@linux.dev>,
	James Houghton <jthoughton@google.com>,
	Gavin Guo <gavinguo@igalia.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/3] mm, hugetlb: Clean up locking in hugetlb_fault and hugetlb_wp
Date: Tue, 3 Jun 2025 17:08:55 +0200	[thread overview]
Message-ID: <fd10b2b3-064c-4ee2-ad7d-e30a0e194533@redhat.com> (raw)
In-Reply-To: <aD8NUSUV5zA4yNY3@x1.local>

On 03.06.25 16:57, Peter Xu wrote:
> On Tue, Jun 03, 2025 at 03:50:54PM +0200, Oscar Salvador wrote:
>> On Mon, Jun 02, 2025 at 05:30:19PM -0400, Peter Xu wrote:
>>> Right, and thanks for the git digging as usual.  I would agree hugetlb is
>>> more challenge than many other modules on git archaeology. :)
>>>
>>> Even if I mentioned the invalidate_lock, I don't think I thought deeper
>>> than that. I just wished whenever possible we still move hugetlb code
>>> closer to generic code, so if that's the goal we may still want to one day
>>> have a closer look at whether hugetlb can also use invalidate_lock.  Maybe
>>> it isn't worthwhile at last: invalidate_lock is currently a rwsem, which
>>> normally at least allows concurrent fault, but that's currently what isn't
>>> allowed in hugetlb anyway..
>>>
>>> If we start to remove finer grained locks that work will be even harder,
>>> and removing folio lock in this case in fault path also brings hugetlbfs
>>> even further from other file systems.  That might be slightly against what
>>> we used to wish to do, which is to make it closer to others.  Meanwhile I'm
>>> also not yet sure the benefit of not taking folio lock all across, e.g. I
>>> don't expect perf would change at all even if lock is avoided.  We may want
>>> to think about that too when doing so.
>>
>> Ok, I have to confess I was not looking things from this perspective,
>> but when doing so, yes, you are right, we should strive to find
>> replacements wherever we can for not using hugetlb-specific code.
>>
>> I do not know about this case though, not sure what other options do we
>> have when trying to shut concurrent faults while doing other operation.
>> But it is something we should definitely look at.
>>
>> Wrt. to the lock.
>> There were two locks, old_folio (taken in hugetlb_fault) and
>> pagecache_folio one.
> 
> There're actually three places this patch touched, the 3rd one is
> hugetlb_no_page(), in which case I also think we should lock it, not only
> because file folios normally does it (see do_fault(), for example), but
> also that's exactly what James mentioned I believe on possible race of
> !uptodate hugetlb folio being injected by UFFDIO_CONTINUE, along the lines:
> 
> 		folio = alloc_hugetlb_folio(vma, vmf->address, false);
>                  ...
> 		folio_zero_user(folio, vmf->real_address);
> 		__folio_mark_uptodate(folio);
> 
>> The thing was not about worry as how much perf we leave on the table
>> because of these locks, as I am pretty sure is next to 0, but my drive
>> was to understand what are protection and why, because as the discussion
>> showed, none of us really had a good idea about it and it turns out that this
>> goes back more than ~20 years ago.
>>
>> Another topic for the lock (old_folio, so the one we copy from),
>> when we compare it to generic code, we do not take the lock there.
>> Looking at do_wp_page(), we do __get__ a reference on the folio we copy
>> from, but not the lock, so AFAIU, the lock seems only to please
> 
> Yes this is a good point; for CoW path alone maybe we don't need to lock
> old_folio.
> 
>> folio_move_anon_rmap() from hugetlb_wp.
>>
>> Taking a look at do_wp_page()->wp_can_reuse_anon_folio() which also
>> calls folio_move_anon_rmap() in case we can re-use the folio, it only
>> takes the lock before the call to folio_move_anon_rmap(), and then
>> unlocks it.
> 
> IMHO, do_wp_page() took the folio lock not for folio_move_anon_rmap(), but
> for checking swapcache/ksm stuff which needs to be serialized with folio
> lock.
> 
> So I'm not 100% confident on the folio_move_anon_rmap(), but I _think_ it
> deserves a data_race() and IIUC it only work not because of the folio lock,
> but because of how anon_vma is managed as a tree as of now, so that as long
> as WRITE_ONCE() even a race is benign (because the rmap walker will either
> see a complete old anon_vma that includes the parent process's anon_vma, or
> the child's).  What really protects the anon_vma should really be anon_vma
> lock.. That can definitely be a separate topic.  I'm not sure whether you'd
> like to dig this part out, but if you do I'd also be more than happy to
> know whether my understanding needs correction here.. :)
> 
> In general, I still agree with you that if hugetlb CoW path can look closer
> to do_wp_page then it's great.


As stated elsewhere, the mapcount check + folio_move_anon_rmap need the 
folio lock.

-- 
Cheers,

David / dhildenb

next prev parent reply	other threads:[~2025-06-03 15:09 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-02 14:16 [RFC PATCH 0/3] Clean up locking in hugetlb faulting code Oscar Salvador
2025-06-02 14:16 ` [RFC PATCH 1/3] mm, hugetlb: Clean up locking in hugetlb_fault and hugetlb_wp Oscar Salvador
2025-06-02 15:14   ` Peter Xu
2025-06-02 20:47     ` Oscar Salvador
2025-06-02 21:30       ` Peter Xu
2025-06-03 13:50         ` Oscar Salvador
2025-06-03 14:57           ` Peter Xu
2025-06-03 15:08             ` David Hildenbrand [this message]
2025-06-03 15:46               ` Peter Xu
2025-06-03 17:19                 ` David Hildenbrand
2025-06-03 19:11                   ` Peter Xu
2025-06-03 18:31             ` Peter Xu
2025-06-10 14:13               ` Oscar Salvador
2025-06-10 15:57                 ` Peter Xu
2025-06-03 15:12           ` David Hildenbrand
2025-06-02 14:16 ` [RFC PATCH 2/3] mm, hugetlb: Update comments in hugetlb_fault Oscar Salvador
2025-06-02 14:16 ` [RFC PATCH 3/3] mm, hugetlb: Drop unlikelys from hugetlb_fault Oscar Salvador
2025-06-16  3:21 ` [RFC PATCH 0/3] Clean up locking in hugetlb faulting code Gavin Guo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fd10b2b3-064c-4ee2-ad7d-e30a0e194533@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=gavinguo@igalia.com \
    --cc=jthoughton@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=muchun.song@linux.dev \
    --cc=osalvador@suse.de \
    --cc=peterx@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).