From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Dev Jain <dev.jain@arm.com>,
akpm@linux-foundation.org, ljs@kernel.org, hughd@google.com,
chrisl@kernel.org, kasong@tencent.com
Cc: riel@surriel.com, liam@infradead.org, vbabka@kernel.org,
harry@kernel.org, jannh@google.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, qi.zheng@linux.dev,
shakeel.butt@linux.dev, baohua@kernel.org,
axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
rppt@kernel.org, surenb@google.com, mhocko@suse.com,
baolin.wang@linux.alibaba.com, shikemeng@huaweicloud.com,
nphamcs@gmail.com, bhe@redhat.com, youngjun.park@lge.com,
pfalcato@suse.de, ryan.roberts@arm.com,
anshuman.khandual@arm.com
Subject: Re: [PATCH v3 2/9] mm/rmap: refactor hugetlb pte clearing in try_to_unmap_one
Date: Mon, 11 May 2026 10:59:35 +0200
Message-ID: <f3ec1b78-e405-4b58-8588-faa97070cb3b@kernel.org>
In-Reply-To: <d2f11bbd-93ec-4a7e-9de3-ed4541914ad9@arm.com>
On 5/11/26 10:53, Dev Jain wrote:
>
>
> On 11/05/26 12:40 pm, David Hildenbrand (Arm) wrote:
>> On 5/6/26 11:44, Dev Jain wrote:
>>> Simplify the code by refactoring the folio_test_hugetlb() branch into
>>> a new function.
>>>
>>> While at it, convert BUG helpers to WARN helpers.
>>>
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>> mm/rmap.c | 117 ++++++++++++++++++++++++++++++++----------------------
>>> 1 file changed, 69 insertions(+), 48 deletions(-)
>>>
>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> index a5f067a09de0f..a98acdea0530a 100644
>>> --- a/mm/rmap.c
>>> +++ b/mm/rmap.c
>>> @@ -1978,6 +1978,68 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
>>> FPB_RESPECT_WRITE | FPB_RESPECT_SOFT_DIRTY);
>>> }
>>>
>>> +/* Returns false if unmap needs to be aborted */
>>> +static inline bool unmap_hugetlb_folio(struct vm_area_struct *vma,
>>
>> I'm wondering whether we should make it clearer that this belongs to the
>> try_to_unmap family by calling it
>>
>> ttu_hugetlb_folio
>
> Yes I had suggested a ttu_ prefix somewhere else in the first version,
> Lorenzo didn't like it (or probably he didn't like that specific use
> of ttu):
>
> https://lore.kernel.org/all/a8b06f36-98e1-435c-881f-67242bc4304a@lucifer.local/
>
> Don't know about a better name other than "commit_ttu_lazyfree_folio" in
> that case, but for the hugetlb case, I like ttu_hugetlb_folio.

Yes, in particular once we handle all of the hugetlb oddities in there.

I don't really care about the exact name, as long as it makes clear that this
is not a generic helper.

[...]
>>
>>
>>> + /*
>>> + * huge_pmd_unshare may unmap an entire PMD page.
>>> + * There is no way of knowing exactly which PMDs may
>>> + * be cached for this mm, so we must flush them all.
>>> + * start/end were already adjusted above to cover this
>>> + * range.
>>> + */
>>> + flush_cache_range(vma, range->start, range->end);
>>> +
>>> + /*
>>> + * To call huge_pmd_unshare, i_mmap_rwsem must be
>>> + * held in write mode. Caller needs to explicitly
>>> + * do this outside rmap routines.
>>> + *
>>> + * We also must hold hugetlb vma_lock in write mode.
>>> + * Lock order dictates acquiring vma_lock BEFORE
>>> + * i_mmap_rwsem. We can only try lock here and fail
>>> + * if unsuccessful.
>>> + */
>>> + if (!folio_test_anon(folio)) {
>>> + struct mmu_gather tlb;
>>> +
>>> + VM_WARN_ON(!(flags & TTU_RMAP_LOCKED));
>>> + if (!hugetlb_vma_trylock_write(vma)) {
>>> + *exit_walk = true;
>>> + return false;
>>> + }
>>> +
>>> + tlb_gather_mmu_vma(&tlb, vma);
>>> + if (huge_pmd_unshare(&tlb, vma, pvmw->address, pvmw->pte)) {
>>> + hugetlb_vma_unlock_write(vma);
>>> + huge_pmd_unshare_flush(&tlb, vma);
>>> + tlb_finish_mmu(&tlb);
>>> + /*
>>> + * The PMD table was unmapped,
>>> + * consequently unmapping the folio.
>>> + */
>>> + *exit_walk = true;
>>> + return true;
>>> + }
>>> + hugetlb_vma_unlock_write(vma);
>>> + tlb_finish_mmu(&tlb);
>>> + }
>>> + *pteval = huge_ptep_clear_flush(vma, pvmw->address, pvmw->pte);
>>> + if (pte_dirty(*pteval))
>>> + folio_mark_dirty(folio);
>>> +
>>> + *exit_walk = false;
>>> + return true;
>>
>>
>> Can we instead introduce some enum that tells the caller how to proceed?
>>
>> I assume we have
>>
>> (a) Abort walk (ret = false + page_vma_mapped_walk_done())
>>
>> (b) Walk done (ret = true + page_vma_mapped_walk_done())
>>
>> (c) Continue walk (call page_vma_mapped_walk())
>>
>> enum ttu_walk_result {
>> 	TTU_WALK_CONTINUE,
>> 	TTU_WALK_ABORT,
>> 	TTU_WALK_DONE,
>> };
>
> I had replied to such a suggestion here:
>
> https://lore.kernel.org/all/caa7c455-7472-48eb-a5dc-145e587d67ba@arm.com/
>
> Probably we don't have any other solution : )

That looks like the right way to go. The boolean return is just nasty.

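For reference, the three outcomes listed above map onto the
(return value, *exit_walk) pair of the v3 helper roughly as sketched below.
This is a standalone userspace sketch, not kernel code: the enum is the one
proposed earlier in this thread, while ttu_result_from_bool() is a
hypothetical name used only for illustration.

```c
#include <stdbool.h>

/* Outcome names follow the enum proposed in this thread. */
enum ttu_walk_result {
	TTU_WALK_CONTINUE,	/* (c) continue with the next walk step */
	TTU_WALK_ABORT,		/* (a) ret = false, end the walk */
	TTU_WALK_DONE,		/* (b) ret = true, end the walk */
};

/*
 * Hypothetical helper: fold the bool return value and the *exit_walk
 * out-parameter of the v3 helper into a single result.
 *
 * Assumption: (ret == false, exit_walk == false) does not occur in the
 * quoted hunk; it is mapped to CONTINUE here only to keep the function
 * total.
 */
static enum ttu_walk_result ttu_result_from_bool(bool ret, bool exit_walk)
{
	if (!exit_walk)
		return TTU_WALK_CONTINUE;
	return ret ? TTU_WALK_DONE : TTU_WALK_ABORT;
}
```

With that mapping, the failed hugetlb_vma_trylock_write() case becomes
TTU_WALK_ABORT, the successful huge_pmd_unshare() case becomes TTU_WALK_DONE,
and the plain huge_ptep_clear_flush() path becomes TTU_WALK_CONTINUE.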
>>> - */
>>> - goto walk_done;
>>> - }
>>> - hugetlb_vma_unlock_write(vma);
>>> - tlb_finish_mmu(&tlb);
>>> + ret = unmap_hugetlb_folio(vma, folio, &pvmw, subpage,
>>> + flags, &pteval, &range,
>>> + &exit_walk);
>>> + if (exit_walk) {
>>> + page_vma_mapped_walk_done(&pvmw);
>>> + break;
>>
>> In the old walk_abort case you wouldn't set ret = false?
>
> ret will be set appropriately in unmap_hugetlb_folio.
Ah, right. Confusing ;)
>>
>> When returning the enum you could simply do something like
>>
>> switch (ret) {
>> case TTU_WALK_ABORT:
>> 	goto walk_abort;
>> case TTU_WALK_DONE:
>> 	goto walk_done;
>> default:
>> 	break;
>> }
>>
>>
>> While I like this patch, can we please just move all the hugetlb shite into this
>> helper function?
>>
>> Essentially, get rid of hugetlb special casing in the remainder of the function.
>>
>> That also makes the function name clearer (right now it's only doing a part of
>> hugetlb folio unmapping).
>
> Okay, I can try that. That would mean splitting the pvmw walk for hugetlb and
> non-hugetlb, but I suspect it would involve very little code duplication.

Right, and it would also make it clearer that the hugetlb path is really only
used for hwpoison handling.

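The switch-based dispatch quoted above can be modelled in userspace as
follows. This is only a sketch under assumptions: ttu_walk_model() and the
steps[] array are hypothetical stand-ins for the page_vma_mapped_walk() loop
and the per-entry helper, and walk_abort is assumed to set ret = false and
then fall through to walk_done, matching outcomes (a) and (b) above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Outcome names follow the enum proposed in this thread. */
enum ttu_walk_result {
	TTU_WALK_CONTINUE,
	TTU_WALK_ABORT,
	TTU_WALK_DONE,
};

/*
 * Hypothetical model of the caller: steps[i] plays the role of the value
 * a per-entry helper would return for the i-th walk step. *visited
 * reports how many steps were examined before the walk ended.
 */
static bool ttu_walk_model(const enum ttu_walk_result *steps, size_t nr,
			   size_t *visited)
{
	bool ret = true;
	size_t i;

	*visited = 0;
	for (i = 0; i < nr; i++) {
		(*visited)++;

		switch (steps[i]) {
		case TTU_WALK_ABORT:
			goto walk_abort;
		case TTU_WALK_DONE:
			goto walk_done;
		default:
			break;
		}

		/* Generic per-entry work would go here. */
		continue;
walk_abort:
		ret = false;
walk_done:
		/* The kernel calls page_vma_mapped_walk_done() at this point. */
		break;
	}

	return ret;
}
```

The caller no longer has to inspect a separate exit_walk flag: the single
return value selects between continuing, ending the walk successfully, and
ending it with failure.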
--
Cheers,
David
Thread overview: 26+ messages
2026-05-06 9:44 [PATCH v3 0/9] Optimize anonymous large folio unmapping Dev Jain
2026-05-06 9:44 ` [PATCH v3 1/9] mm/rmap: initialize nr_pages to 1 at loop start in try_to_unmap_one Dev Jain
2026-05-11 6:48 ` David Hildenbrand (Arm)
2026-05-11 8:18 ` Dev Jain
2026-05-11 8:32 ` David Hildenbrand (Arm)
2026-05-06 9:44 ` [PATCH v3 2/9] mm/rmap: refactor hugetlb pte clearing " Dev Jain
2026-05-11 7:10 ` David Hildenbrand (Arm)
2026-05-11 8:53 ` Dev Jain
2026-05-11 8:59 ` David Hildenbrand (Arm) [this message]
2026-05-06 9:44 ` [PATCH v3 3/9] mm/rmap: refactor some code around lazyfree folio unmapping Dev Jain
2026-05-11 7:28 ` David Hildenbrand (Arm)
2026-05-06 9:44 ` [PATCH v3 4/9] mm/memory: Batch set uffd-wp markers during zapping Dev Jain
2026-05-11 7:37 ` David Hildenbrand (Arm)
2026-05-06 9:45 ` [PATCH v3 5/9] mm/rmap: batch unmap folios belonging to uffd-wp VMAs Dev Jain
2026-05-11 7:41 ` David Hildenbrand (Arm)
2026-05-06 9:45 ` [PATCH v3 6/9] mm/swapfile: Add batched version of folio_dup_swap Dev Jain
2026-05-11 7:45 ` David Hildenbrand (Arm)
2026-05-06 9:45 ` [PATCH v3 7/9] mm/swapfile: Add batched version of folio_put_swap Dev Jain
2026-05-11 8:07 ` David Hildenbrand (Arm)
2026-05-06 9:45 ` [PATCH v3 8/9] mm/rmap: Add batched version of folio_try_share_anon_rmap_pte Dev Jain
2026-05-11 8:13 ` David Hildenbrand (Arm)
2026-05-11 8:14 ` David Hildenbrand (Arm)
2026-05-06 9:45 ` [PATCH v3 9/9] mm/rmap: enable batch unmapping of anonymous folios Dev Jain
2026-05-11 8:16 ` David Hildenbrand (Arm)
2026-05-08 23:38 ` [PATCH v3 0/9] Optimize anonymous large folio unmapping Andrew Morton
2026-05-11 6:21 ` Dev Jain