From: Lance Yang <lance.yang@linux.dev>
To: Nico Pache <npache@redhat.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
	aarcange@redhat.com, akpm@linux-foundation.org,
	anshuman.khandual@arm.com, apopple@nvidia.com, baohua@kernel.org,
	baolin.wang@linux.alibaba.com, byungchul@sk.com,
	catalin.marinas@arm.com, cl@gentwo.org, corbet@lwn.net,
	dave.hansen@linux.intel.com, dev.jain@arm.com, gourry@gourry.net,
	hannes@cmpxchg.org, hughd@google.com, jack@suse.cz,
	jackmanb@google.com, jannh@google.com, jglisse@google.com,
	joshua.hahnjy@gmail.com, kas@kernel.org, liam@infradead.org,
	ljs@kernel.org, mathieu.desnoyers@efficios.com,
	matthew.brost@intel.com, mhiramat@kernel.org, mhocko@suse.com,
	peterx@redhat.com, pfalcato@suse.de, rakie.kim@sk.com,
	raquini@redhat.com, rdunlap@infradead.org,
	richard.weiyang@gmail.com, rientjes@google.com,
	rostedt@goodmis.org, rppt@kernel.org, ryan.roberts@arm.com,
	shivankg@amd.com, sunnanyong@huawei.com, surenb@google.com,
	thomas.hellstrom@linux.intel.com, tiwai@suse.de,
	usamaarif642@gmail.com, vbabka@suse.cz, vishal.moola@gmail.com,
	wangkefeng.wang@huawei.com, will@kernel.org, willy@infradead.org,
	yang@os.amperecomputing.com, ying.huang@linux.alibaba.com,
	ziy@nvidia.com, zokeefe@google.com
Subject: Re: [PATCH mm-unstable v19 11/14] mm/khugepaged: Introduce mTHP collapse support
Date: Tue, 9 Jun 2026 19:01:51 +0800
Message-ID: <4fa58f7c-f3c5-484e-a8ca-7dfef50a1679@linux.dev>
In-Reply-To: <CAA1CXcD7WAiA1b9GTLAuNZ+kHaFx0SzZwpBkqAZ=s+RHsTUaow@mail.gmail.com>



On 2026/6/9 18:50, Nico Pache wrote:
> On Tue, Jun 9, 2026 at 4:37 AM Lance Yang <lance.yang@linux.dev> wrote:
>>
>>
>>
>> On 2026/6/9 17:32, Nico Pache wrote:
>>> On Tue, Jun 9, 2026 at 3:26 AM David Hildenbrand (Arm) <david@kernel.org> wrote:
>>>>
>>>> On 6/9/26 11:06, Nico Pache wrote:
>>>>> On Mon, Jun 8, 2026 at 8:57 AM David Hildenbrand (Arm) <david@kernel.org> wrote:
>>>>>>
>>>>>> On 6/6/26 12:28, Lance Yang wrote:
>>>>>>>
>>>>>>>
>>>>>>> Looks broken for swap PTEs in PMD collapse ...
>>>>>>>
>>>>>>> collapse_scan_pmd() allows them up to max_ptes_swap and records them in
>>>>>>> unmapped, but they don't get a bit in mthp_present_ptes. And then
>>>>>>> mthp_collapse() does the check above:
>>>>>>
>>>>>> Right. I assumed this is implicitly handled by the optimization in collapse_scan_pmd:
>>>>>>
>>>>>>           if (enabled_orders != BIT(HPAGE_PMD_ORDER))
>>>>>>                   max_ptes_none = KHUGEPAGED_MAX_PTES_LIMIT;
>>>>>>
>>>>>> But we perform the check a second time.
>>>>>>
>>>>>>>
>>>>>>> nr_occupied_ptes >= nr_ptes - max_ptes_none
>>>>>>>
>>>>>>> So max_ptes_none=0 + 511 present PTEs + one allowed swap PTE won't even
>>>>>>> call collapse_huge_page() for PMD order.
>>>>>>>
>>>>>>> Shouldn't we account for them in the PMD-order check? Something like:
>>>>>>>
>>>>>>> if (is_pmd_order(order))
>>>>>>>         nr_occupied_ptes += unmapped;
>>>>>
>>>>> This solution seems good for a temporary fixup, but long-term we may
>>>>> want something else. I'm still not sure how we plan on supporting
>>>>> swapin without causing creep. So I'd be ok with adding a fix for
>>>>> legacy PMD behavior until we know how to handle mTHP creep correctly.
>>>>>
>>>>>> As an alternative, we could either 1) skip the check there for
>>>>>> pmd order (as the check was already done); or 2) introduce+maintain
>>>>>> a bitmap that tracks non-present PTEs.
>>>>>>
>>>>>> @@ -1475,7 +1477,9 @@ static enum scan_result mthp_collapse(struct mm_struct *mm,
>>>>>>                   nr_occupied_ptes = bitmap_weight_from(cc->mthp_present_ptes, offset,
>>>>>>                                                         offset + nr_ptes);
>>>>>>
>>>>>> -               if (nr_occupied_ptes >= nr_ptes - max_ptes_none) {
>>>>>> +               /* Check was already done in the caller. */
>>>>>> +               if (is_pmd_order(order) ||
>>>>>> +                   nr_occupied_ptes >= nr_ptes - max_ptes_none) {
>>>>>>                           enum scan_result ret;
>>>>>>
>>>>>>                           collapse_address = address + offset * PAGE_SIZE;
>>>>>>
>>>>>> 2) would probably be cleanest long-term.
>>>>>
>>>>> That would be best for future swapin support in mTHP, but I still
>>>>> don't think it solves the creep issue.
>>>>
>>>> It wouldn't, we'd simply maintain the state we collect + rely on in separate
>>>> bitmaps. On swapin, we'd have to update/refresh bitmaps I guess.
>>>
>>> Yeah, I'm saying for the future, it obviously solves this issue here
>>> as well, but if we have positional tracking of the swapout, shared,
>>> and none PTEs, I think we can use this to determine whether the
>>> collapse would lead to creep. If we detect creep would happen it may
>>> be best to automatically collapse to the N+1 (or greater) candidate.
>>> Just thinking out loud here.
>>>
>>>>
>>>>> Perhaps we could combine the
>>>>> two bitmaps to determine if it would make the future collapse eligible
>>>>> again? Not sure, but I'll start thinking about it.
>>>>>
>>>>> Should I send a fixup for this using Lance's solution? Or does Lance
>>>>> want to send a patch out with the fixes tag?
>>>>
>>>> If Lance could send a fixup, explaining the situation, that would be nice.
>>
>> Sure, happy to send a fixup :P
>>
>> Should I send it as a fixup to be folded into this patch, or as a
>> separate patch with a Fixes: tag?
> 
> I'd assume a separate patch so you can keep credit for the discovery :)

Okay :D

> Thank you for all the review you provided on this series, it's been
> really helpful!

Appreciate it!

Nice work getting it this far. Nice one, Nico :P
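
For reference, a minimal userspace sketch of the accounting issue discussed
above. It is not kernel code: the function names below are hypothetical and
only mirror the quoted check (nr_occupied_ptes >= nr_ptes - max_ptes_none)
and the proposed "nr_occupied_ptes += unmapped" fixup for PMD order. It
assumes 512 PTEs per PMD, as with 4 KiB base pages and 2 MiB PMDs.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 512 PTEs per PMD (4 KiB base pages, 2 MiB PMD). */
#define NR_PTES_PMD 512

/*
 * Hypothetical model of the check as quoted in the thread: only PTEs that
 * have a bit set in the "present" bitmap count as occupied.
 */
bool passes_check_present_only(int nr_present, int nr_ptes, int max_ptes_none)
{
	return nr_present >= nr_ptes - max_ptes_none;
}

/*
 * Hypothetical model of the proposed fixup: for PMD order, swap PTEs that
 * the scan already accepted (counted in "unmapped") are added to the
 * occupied count before the comparison.
 */
bool passes_check_with_unmapped(int nr_present, int unmapped, int nr_ptes,
				int max_ptes_none, bool pmd_order)
{
	int nr_occupied = nr_present;

	if (pmd_order)
		nr_occupied += unmapped;

	return nr_occupied >= nr_ptes - max_ptes_none;
}

/* The case from the thread: max_ptes_none=0, 511 present PTEs, 1 swap PTE. */
void swap_pte_example(void)
{
	/* 511 >= 512 - 0 is false, so the PMD collapse is never attempted. */
	assert(!passes_check_present_only(511, NR_PTES_PMD, 0));

	/* 511 + 1 >= 512 - 0 is true once the swap PTE is accounted for. */
	assert(passes_check_with_unmapped(511, 1, NR_PTES_PMD, 0, true));
}
```

Calling swap_pte_example() from a test harness exercises both cases. For
non-PMD orders the sketch leaves the count untouched, matching the
is_pmd_order() guard in the proposed fixup.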

