From: Lance Yang <lance.yang@linux.dev>
To: "David Hildenbrand (Arm)" <david@kernel.org>,
	akpm@linux-foundation.org, ljs@kernel.org
Cc: ziy@nvidia.com, baolin.wang@linux.alibaba.com,
	liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com,
	dev.jain@arm.com, baohua@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH mm-unstable 1/1] mm/khugepaged: fix PMD collapse swap PTE accounting
Date: Wed, 10 Jun 2026 00:28:24 +0800
Message-ID: <9e548d7d-480a-42cb-8912-80f415cd0bb3@linux.dev>
In-Reply-To: <7d081256-5b30-4e3c-b948-85ba76ad0e1d@kernel.org>



On 2026/6/9 21:16, David Hildenbrand (Arm) wrote:
> On 6/9/26 14:04, Lance Yang wrote:
>> From: Lance Yang <lance.yang@linux.dev>
>>
>> mthp_collapse() uses mthp_present_ptes to decide whether a range has
>> enough occupied PTEs to try collapse. Swap PTEs accepted by
>> collapse_scan_pmd() are counted in unmapped, but are not represented in
>> mthp_present_ptes.
>>
>> When lower orders are enabled, collapse_scan_pmd() relaxes max_ptes_none
>> so the scan can cover the whole PMD and build the bitmap. mthp_collapse()
>> then checks the PMD-order candidate using the bitmap.
>>
>> With max_ptes_none set to 0, a range with 511 present PTEs and one swap
>> PTE no longer reaches collapse_huge_page(), even though PMD collapse can
>> handle swap PTEs up to max_ptes_swap.
>>
>> Account unmapped PTEs only for PMD order. PMD collapse supports swap PTEs
>> through max_ptes_swap, while lower-order mTHP collapse does not currently
>> support non-present PTEs. Keep non-present PTEs out of the lower-order
>> eligibility check.
>>
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>> ---
>> Sent separately, as discussed in [1], to spell out the PMD-order swap PTE
>> case. Patch [2] is still only in mm-unstable, so no Fixes: tag.
>>
>> [1] https://lore.kernel.org/linux-mm/CAA1CXcD7WAiA1b9GTLAuNZ+kHaFx0SzZwpBkqAZ=s+RHsTUaow@mail.gmail.com/
>> [2] https://lore.kernel.org/linux-mm/20260605161422.213817-12-npache@redhat.com/
>>
>>   mm/khugepaged.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index b12187709f6d..617bca76db49 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -1508,6 +1508,14 @@ static enum scan_result mthp_collapse(struct mm_struct *mm,
>>   		nr_occupied_ptes = bitmap_weight_from(cc->mthp_present_ptes, offset,
>>   						      offset + nr_ptes);
>>   
>> +		/*
>> +		 * Swap PTEs accepted during the scan are counted in @unmapped,
>> +		 * not in the present-PTE bitmap. Account them for the PMD-order
>> +		 * candidate.
>> +		 */
>> +		if (is_pmd_order(order))
>> +			nr_occupied_ptes += unmapped;
>> +
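
For anyone following along, the 511 + 1 case from the changelog can be
modelled in userspace roughly as below. This is only a sketch, not the
kernel code: range_is_candidate() and its parameters are hypothetical
names, and the comparison "occupied >= nr_ptes - max_ptes_none" is my
assumption about the shape of the eligibility check.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the eligibility check, not kernel code.
 * Assumption: a range is a candidate when the number of occupied PTEs is
 * at least nr_ptes - max_ptes_none.
 */
static bool range_is_candidate(unsigned int nr_ptes, unsigned int nr_present,
			       unsigned int nr_unmapped,
			       unsigned int max_ptes_none,
			       bool pmd_order, bool account_unmapped)
{
	unsigned int nr_occupied = nr_present;

	assert(max_ptes_none <= nr_ptes);

	/* Swap PTEs count as occupied only for the PMD-order candidate. */
	if (account_unmapped && pmd_order)
		nr_occupied += nr_unmapped;

	return nr_occupied >= nr_ptes - max_ptes_none;
}
```

With nr_ptes = 512, 511 present PTEs, one swap PTE and max_ptes_none = 0,
the model rejects the PMD-order range without the accounting and accepts
it with the accounting. A lower-order range with 15 present PTEs and one
swap PTE out of 16 is rejected either way.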
> 
> LGTM, there is a bit of opportunity for cleanup in the future :)

Yes, follow-up cleanup material :)

> Acked-by: David Hildenbrand (Arm) <david@kernel.org>

Thanks!

> For example, as we no longer have the VMA here, collapse_max_ptes_none is
> imprecise in uffd VMAs. We might try collapsing where there sure is nothing to
> collapse.

Oh, good catch. We may end up trying a collapse that cannot really
go anywhere ... One for a follow-up.

> We could likely handle the userfaultfd_armed() part easier: some indication that
> we must not have any pte_none() would be sufficient.

Right. By the time we get to mthp_collapse(), we probably only need
to carry that as a small "no pte_none" constraint for the candidate
range.
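
A very rough sketch of what I mean, purely for illustration. Nothing
here exists in the tree: struct candidate_constraints, its fields and
effective_max_ptes_none() are made-up names, and it assumes the flag
would be set during the scan when the VMA is uffd-armed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-candidate state carried from the scan. */
struct candidate_constraints {
	bool no_pte_none;		/* e.g. set when the VMA is uffd-armed */
	unsigned int max_ptes_none;	/* limit that would otherwise apply */
};

/*
 * Hypothetical helper: with the constraint set, no pte_none() entries
 * are tolerated in the candidate range, whatever the tunable says.
 */
static unsigned int
effective_max_ptes_none(const struct candidate_constraints *c)
{
	assert(c);

	return c->no_pte_none ? 0 : c->max_ptes_none;
}
```

The point is only that mthp_collapse() would not need the VMA itself,
just this one bit for the range it is looking at.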

> Also, I don't see a good reason why uffd would not be allowed to collapse with
> zeropages ... it's really just about missing faults due to pte_none().

Makes sense to me. I'll take a look when I get a chance. And yeah,
as Lorenzo said, better to clean up the khugepaged mess first
before piling more on top :)

Cheers, Lance

