From: Lorenzo Stoakes <ljs@kernel.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Lance Yang <lance.yang@linux.dev>,
akpm@linux-foundation.org, ziy@nvidia.com,
baolin.wang@linux.alibaba.com, liam@infradead.org,
npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
baohua@kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH mm-unstable 1/1] mm/khugepaged: fix PMD collapse swap PTE accounting
Date: Tue, 9 Jun 2026 15:33:40 +0100
Message-ID: <aigj8wehALMDySTC@lucifer>
In-Reply-To: <7d081256-5b30-4e3c-b948-85ba76ad0e1d@kernel.org>

On Tue, Jun 09, 2026 at 03:16:10PM +0200, David Hildenbrand (Arm) wrote:
> On 6/9/26 14:04, Lance Yang wrote:
> > From: Lance Yang <lance.yang@linux.dev>
> >
> > mthp_collapse() uses mthp_present_ptes to decide whether a range has
> > enough occupied PTEs to try collapse. Swap PTEs accepted by
> > collapse_scan_pmd() are counted in unmapped, but are not represented in
> > mthp_present_ptes.
> >
> > When lower orders are enabled, collapse_scan_pmd() relaxes max_ptes_none
> > so the scan can cover the whole PMD and build the bitmap. mthp_collapse()
> > then checks the PMD-order candidate using the bitmap.
> >
> > With max_ptes_none set to 0, a range with 511 present PTEs and one swap
> > PTE no longer reaches collapse_huge_page(), even though PMD collapse can
> > handle swap PTEs up to max_ptes_swap.
> >
> > Account unmapped PTEs only for PMD order. PMD collapse supports swap PTEs
> > through max_ptes_swap, while lower-order mTHP collapse does not currently
> > support non-present PTEs. Keep non-present PTEs out of the lower-order
> > eligibility check.
> >
> > Signed-off-by: Lance Yang <lance.yang@linux.dev>
> > ---
> > Sent separately, as discussed in [1], to spell out the PMD-order swap PTE
> > case. Patch [2] is still only in mm-unstable, so no Fixes: tag.
> >
> > [1] https://lore.kernel.org/linux-mm/CAA1CXcD7WAiA1b9GTLAuNZ+kHaFx0SzZwpBkqAZ=s+RHsTUaow@mail.gmail.com/
> > [2] https://lore.kernel.org/linux-mm/20260605161422.213817-12-npache@redhat.com/
> >
> > mm/khugepaged.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index b12187709f6d..617bca76db49 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1508,6 +1508,14 @@ static enum scan_result mthp_collapse(struct mm_struct *mm,
> > nr_occupied_ptes = bitmap_weight_from(cc->mthp_present_ptes, offset,
> > offset + nr_ptes);
> >
> > + /*
> > + * Swap PTEs accepted during the scan are counted in @unmapped,
> > + * not in the present-PTE bitmap. Account them for the PMD-order
> > + * candidate.
> > + */
> > + if (is_pmd_order(order))
> > + nr_occupied_ptes += unmapped;
> > +
>
> LGTM, there is a bit of opportunity for cleanup in the future :)

From my point of view, accepting the mTHP khugepaged changes was essentially a
big compromise on how much it adds to the mess of the existing code base, and
AFAIC we shouldn't accept any further major changes until we actually sort this
mess out :)

>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>
>
> For example, as we no longer have the VMA here, collapse_max_ptes_none is
> imprecise in uffd VMAs. We might try collapsing where there sure is nothing to
> collapse.
>
> We could likely handle the userfaultfd_armed() part easier: some indication that
> we must not have any pte_none() would be sufficient.
>
> Also, I don't see a good reason why uffd would not be allowed to collapse with
> zeropages ... it's really just about missing faults due to pte_none().

Ugh uffd.

>
> --
> Cheers,
>
> David

Cheers, Lorenzo
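
For anyone following the thread, the accounting gap described in the commit
message can be illustrated with a small userspace model. This is only a
sketch, not the kernel code: range_eligible(), its parameters, the
NR_PTES_PER_PMD constant (which assumes 4 KiB pages and 2 MiB PMDs) and the
exact form of the threshold comparison are all assumptions made for
illustration. Only the rule "add the unmapped count for the PMD-order
candidate" comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB base pages and 2 MiB PMDs, so 512 PTEs per PMD. */
#define NR_PTES_PER_PMD 512u

/*
 * Hypothetical userspace model of the eligibility check; not the kernel code.
 * A range qualifies when at most max_ptes_none of its nr_ptes entries are
 * unoccupied. count_swap selects behaviour without (false) or with (true)
 * the patch's extra accounting.
 */
bool range_eligible(unsigned int nr_present, unsigned int nr_swap,
		    unsigned int nr_ptes, unsigned int max_ptes_none,
		    bool pmd_order, bool count_swap)
{
	unsigned int nr_occupied = nr_present;

	/* Mirrors the patch: swap PTEs only count for the PMD-order candidate. */
	if (count_swap && pmd_order)
		nr_occupied += nr_swap;

	return nr_occupied + max_ptes_none >= nr_ptes;
}

/* The 511 present + 1 swap case from the commit message, max_ptes_none = 0. */
void check_commit_message_case(void)
{
	/* Without the patch the swap PTE is not counted, so the range is rejected. */
	assert(!range_eligible(511, 1, NR_PTES_PER_PMD, 0, true, false));
	/* With the patch the swap PTE counts, so the range is eligible. */
	assert(range_eligible(511, 1, NR_PTES_PER_PMD, 0, true, true));
	/* Lower orders still ignore swap PTEs: 15 present + 1 swap of 16 fails. */
	assert(!range_eligible(15, 1, 16, 0, false, true));
}
```

In this model the max_ptes_swap limit is assumed to have been enforced
earlier, during the scan, so the sketch only covers the occupancy count.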