linux-doc.vger.kernel.org archive mirror
From: Nico Pache <npache@redhat.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
	 linux-kernel@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org,  david@redhat.com,
	ziy@nvidia.com, lorenzo.stoakes@oracle.com,
	 Liam.Howlett@oracle.com, ryan.roberts@arm.com, dev.jain@arm.com,
	 corbet@lwn.net, rostedt@goodmis.org, mhiramat@kernel.org,
	 mathieu.desnoyers@efficios.com, akpm@linux-foundation.org,
	baohua@kernel.org,  willy@infradead.org, peterx@redhat.com,
	wangkefeng.wang@huawei.com,  usamaarif642@gmail.com,
	sunnanyong@huawei.com, vishal.moola@gmail.com,
	 thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
	 kirill.shutemov@linux.intel.com, aarcange@redhat.com,
	raquini@redhat.com,  anshuman.khandual@arm.com,
	catalin.marinas@arm.com, tiwai@suse.de,  will@kernel.org,
	dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
	 jglisse@google.com, surenb@google.com, zokeefe@google.com,
	hannes@cmpxchg.org,  rientjes@google.com, mhocko@suse.com,
	rdunlap@infradead.org, hughd@google.com
Subject: Re: [PATCH v9 13/14] khugepaged: add per-order mTHP khugepaged stats
Date: Fri, 18 Jul 2025 15:00:25 -0600
Message-ID: <CAA1CXcDQeiMjVhxVjnCvBuTQLSBQh0ea7FJXg52ebNFDHfXm1g@mail.gmail.com>
In-Reply-To: <94c8899a-f116-4b6a-94d3-f8295ee3f535@linux.alibaba.com>

On Thu, Jul 17, 2025 at 11:05 PM Baolin Wang
<baolin.wang@linux.alibaba.com> wrote:
>
>
>
> On 2025/7/14 08:32, Nico Pache wrote:
> > With mTHP support in place, let's add the per-order mTHP stats for
> > exceeding NONE, SWAP, and SHARED.
> >
> > Signed-off-by: Nico Pache <npache@redhat.com>
> > ---
> >   Documentation/admin-guide/mm/transhuge.rst | 17 +++++++++++++++++
> >   include/linux/huge_mm.h                    |  3 +++
> >   mm/huge_memory.c                           |  7 +++++++
> >   mm/khugepaged.c                            | 15 ++++++++++++---
> >   4 files changed, 39 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > index 2c523dce6bc7..28c8af61efba 100644
> > --- a/Documentation/admin-guide/mm/transhuge.rst
> > +++ b/Documentation/admin-guide/mm/transhuge.rst
> > @@ -658,6 +658,23 @@ nr_anon_partially_mapped
> >          an anonymous THP as "partially mapped" and count it here, even though it
> >          is not actually partially mapped anymore.
> >
> > +collapse_exceed_swap_pte
> > +       The number of anonymous THP which contain at least one swap PTE.
> > +       Currently khugepaged does not support collapsing mTHP regions that
> > +       contain a swap PTE.
> > +
> > +collapse_exceed_none_pte
> > +       The number of anonymous THP which have exceeded the none PTE threshold.
> > +       With mTHP collapse, a bitmap is used to gather the state of a PMD region
> > +       and is then recursively checked from largest to smallest order against
> > +       the scaled max_ptes_none count. This counter indicates that the next
> > +       enabled order will be checked.
> > +
> > +collapse_exceed_shared_pte
> > +       The number of anonymous THP which contain at least one shared PTE.
> > +       Currently khugepaged does not support collapsing mTHP regions that
> > +       contain a shared PTE.
> > +
> >   As the system ages, allocating huge pages may be expensive as the
> >   system uses memory compaction to copy data around memory to free a
> >   huge page for use. There are some counters in ``/proc/vmstat`` to help
> > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > index 4042078e8cc9..e0a27f80f390 100644
> > --- a/include/linux/huge_mm.h
> > +++ b/include/linux/huge_mm.h
> > @@ -141,6 +141,9 @@ enum mthp_stat_item {
> >       MTHP_STAT_SPLIT_DEFERRED,
> >       MTHP_STAT_NR_ANON,
> >       MTHP_STAT_NR_ANON_PARTIALLY_MAPPED,
> > +     MTHP_STAT_COLLAPSE_EXCEED_SWAP,
> > +     MTHP_STAT_COLLAPSE_EXCEED_NONE,
> > +     MTHP_STAT_COLLAPSE_EXCEED_SHARED,
> >       __MTHP_STAT_COUNT
> >   };
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e2ed9493df77..57e5699cf638 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -632,6 +632,10 @@ DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
> >   DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
> >   DEFINE_MTHP_STAT_ATTR(nr_anon, MTHP_STAT_NR_ANON);
> >   DEFINE_MTHP_STAT_ATTR(nr_anon_partially_mapped, MTHP_STAT_NR_ANON_PARTIALLY_MAPPED);
> > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_swap_pte, MTHP_STAT_COLLAPSE_EXCEED_SWAP);
> > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_none_pte, MTHP_STAT_COLLAPSE_EXCEED_NONE);
> > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_shared_pte, MTHP_STAT_COLLAPSE_EXCEED_SHARED);
> > +
> >
> >   static struct attribute *anon_stats_attrs[] = {
> >       &anon_fault_alloc_attr.attr,
> > @@ -648,6 +652,9 @@ static struct attribute *anon_stats_attrs[] = {
> >       &split_deferred_attr.attr,
> >       &nr_anon_attr.attr,
> >       &nr_anon_partially_mapped_attr.attr,
> > +     &collapse_exceed_swap_pte_attr.attr,
> > +     &collapse_exceed_none_pte_attr.attr,
> > +     &collapse_exceed_shared_pte_attr.attr,
> >       NULL,
> >   };
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index d0c99b86b304..8a5873d0a23a 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -594,7 +594,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >                               continue;
> >                       } else {
> >                               result = SCAN_EXCEED_NONE_PTE;
> > -                             count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
> > +                             if (order == HPAGE_PMD_ORDER)
> > +                                     count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
> > +                             else
> > +                                     count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_NONE);
>
> Please follow the same logic as other mTHP statistics, meaning there is
> no need to filter out PMD-sized orders, because mTHP also supports
> PMD-sized orders. So logic should be:
>
> if (order == HPAGE_PMD_ORDER)
>         count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
>
> count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_NONE);
Good point, I will fix that!
>
> >                               goto out;
> >                       }
> >               }
> > @@ -623,8 +626,14 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >               /* See khugepaged_scan_pmd(). */
> >               if (folio_maybe_mapped_shared(folio)) {
> >                       ++shared;
> > -                     if (order != HPAGE_PMD_ORDER || (cc->is_khugepaged &&
> > -                         shared > khugepaged_max_ptes_shared)) {
> > +                     if (order != HPAGE_PMD_ORDER) {
> > +                             result = SCAN_EXCEED_SHARED_PTE;
> > +                             count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_SHARED);
> > +                             goto out;
> > +                     }
>
> Ditto.
Thanks!

There is also the SWAP counter, which is slightly different: it is
calculated during the scan phase and, in the mTHP case, in the swap-in
faulting code. I am not sure whether the scan phase should also
increment the per-order counter for the PMD order, or whether we should
leave it as a general vm_event counter, since it is not attributed to
an order during the scan. I believe the latter is the correct approach:
only attribute an order to it in __collapse_huge_page_swapin() when it
is an mTHP collapse.
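
To make the two accounting schemes concrete, here is a rough userspace
sketch. It is not the kernel code: the counter arrays, the
account_exceed_*() helpers, and the HPAGE_PMD_ORDER value of 9 are
stand-ins assumed for illustration, and count_vm_event() and
count_mthp_stat() are reduced to plain array increments. The NONE helper
follows the logic suggested in the review; the SWAP helpers follow the
approach proposed above, which is still open for discussion.

```c
#include <assert.h>

/* Assumption: PMD order for 4K base pages and 2M PMDs. */
#define HPAGE_PMD_ORDER 9

/* Reduced stand-ins for the kernel enums; only the items used here. */
enum vm_event_item {
	THP_SCAN_EXCEED_NONE_PTE,
	THP_SCAN_EXCEED_SWAP_PTE,
	NR_VM_EVENTS
};

enum mthp_stat_item {
	MTHP_STAT_COLLAPSE_EXCEED_SWAP,
	MTHP_STAT_COLLAPSE_EXCEED_NONE,
	__MTHP_STAT_COUNT
};

/* Hypothetical counter storage; the kernel uses per-CPU counters. */
static unsigned long vm_events[NR_VM_EVENTS];
static unsigned long mthp_stats[HPAGE_PMD_ORDER + 1][__MTHP_STAT_COUNT];

static void count_vm_event(enum vm_event_item item)
{
	vm_events[item]++;
}

static void count_mthp_stat(int order, enum mthp_stat_item item)
{
	assert(order >= 0 && order <= HPAGE_PMD_ORDER);
	mthp_stats[order][item]++;
}

/*
 * NONE case, as suggested in the review: the per-order stat is always
 * bumped, and the global vm_event is bumped in addition for PMD order.
 */
static void account_exceed_none(int order)
{
	if (order == HPAGE_PMD_ORDER)
		count_vm_event(THP_SCAN_EXCEED_NONE_PTE);

	count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_NONE);
}

/* SWAP case, scan phase: no order is known yet, so only the vm_event. */
static void account_exceed_swap_scan(void)
{
	count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
}

/* SWAP case, swap-in phase: attribute an order only for mTHP collapse. */
static void account_exceed_swap_swapin(int order)
{
	if (order != HPAGE_PMD_ORDER)
		count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_SWAP);
}
```

With this split, a PMD-order NONE failure shows up in both the vm_event
and the per-order stat, while a SWAP failure is either unattributed
(scan) or attributed to a sub-PMD order (swap-in).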
>
> > +
> > +                     if (cc->is_khugepaged &&
> > +                             shared > khugepaged_max_ptes_shared) {
> >                               result = SCAN_EXCEED_SHARED_PTE;
> >                               count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
> >                               goto out;
>

