public inbox for linux-doc@vger.kernel.org
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Nico Pache <npache@redhat.com>
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	david@redhat.com, ziy@nvidia.com, baolin.wang@linux.alibaba.com,
	Liam.Howlett@oracle.com, ryan.roberts@arm.com, dev.jain@arm.com,
	corbet@lwn.net, rostedt@goodmis.org, mhiramat@kernel.org,
	mathieu.desnoyers@efficios.com, akpm@linux-foundation.org,
	baohua@kernel.org, willy@infradead.org, peterx@redhat.com,
	wangkefeng.wang@huawei.com, usamaarif642@gmail.com,
	sunnanyong@huawei.com, vishal.moola@gmail.com,
	thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
	kirill.shutemov@linux.intel.com, aarcange@redhat.com,
	raquini@redhat.com, anshuman.khandual@arm.com,
	catalin.marinas@arm.com, tiwai@suse.de, will@kernel.org,
	dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
	jglisse@google.com, surenb@google.com, zokeefe@google.com,
	hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
	rdunlap@infradead.org, hughd@google.com
Subject: Re: [PATCH v10 12/13] khugepaged: add per-order mTHP khugepaged stats
Date: Thu, 11 Sep 2025 12:14:28 +0100
Message-ID: <d090c532-cdbb-434e-b573-c90a98224e27@lucifer.local>
In-Reply-To: <CAA1CXcD+E8kSA2P5JchayQb5wrwmUVPRCK+W-qiD=Xtz03ti+Q@mail.gmail.com>

One small nit: it would be super helpful if you could add a newline before/after
your reply or something like this, as

>> blah blah blah
> blah
>> blah blah blah

Is harder to read than:

>> blah blah blah
>
> blah
>
>> blah blah blah

Thanks :)

On Tue, Sep 09, 2025 at 12:36:54AM -0600, Nico Pache wrote:
> On Thu, Aug 21, 2025 at 8:49 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Tue, Aug 19, 2025 at 08:16:10AM -0600, Nico Pache wrote:
> > > With mTHP support in place, let's add the per-order mTHP stats for
> > > exceeding NONE, SWAP, and SHARED.
> > >
> >
> > This is really not enough of a commit message. Exceeding what, where, why,
> > how? What does 'exceeding' mean here, etc. etc. More words please :)
> Ok I will add more in the next version

Thanks

> >
> > > Signed-off-by: Nico Pache <npache@redhat.com>
> > > ---
> > >  Documentation/admin-guide/mm/transhuge.rst | 17 +++++++++++++++++
> > >  include/linux/huge_mm.h                    |  3 +++
> > >  mm/huge_memory.c                           |  7 +++++++
> > >  mm/khugepaged.c                            | 16 +++++++++++++---
> > >  4 files changed, 40 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > > index 7ccb93e22852..b85547ac4fe9 100644
> > > --- a/Documentation/admin-guide/mm/transhuge.rst
> > > +++ b/Documentation/admin-guide/mm/transhuge.rst
> > > @@ -705,6 +705,23 @@ nr_anon_partially_mapped
> > >         an anonymous THP as "partially mapped" and count it here, even though it
> > >         is not actually partially mapped anymore.
> > >
> > > +collapse_exceed_swap_pte
> > > +       The number of anonymous THP which contain at least one swap PTE.
> >
> > The number of anonymous THP what? Pages? Let's be specific.
> ack

Thanks

> >
> > > +       Currently khugepaged does not support collapsing mTHP regions that
> > > +       contain a swap PTE.
> >
> > Wait what? So we have a counter for something that's unsupported? That
> > seems not so useful?
> The current implementation does not support swapped-out or shared
> pages. However, these counters allow us to monitor when an mTHP collapse
> fails due to exceeding the threshold (i.e. 0, so hitting any swapped-out
> or shared page)

So the collapse counters are not measuring collapses? That seems a bit
confusing. Or actually is this implied in the 'exceed' bit? Because that'd make
sense actually.

But let's obviously document this carefully.


> >
> > > +
> > > +collapse_exceed_none_pte
> > > +       The number of anonymous THP which have exceeded the none PTE threshold.
> >
> > THP pages. What's the 'none PTE threshold'? Do you mean
> > /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none ?
> ack, I will expand these descriptions

Thanks.

> >
> > Let's spell that out please, this is far too vague.
> >
> > > +       With mTHP collapse, a bitmap is used to gather the state of a PMD region
> > > +       and is then recursively checked from largest to smallest order against
> > > +       the scaled max_ptes_none count. This counter indicates that the next
> > > +       enabled order will be checked.
> >
> > I think you really need to expand upon this as this is confusing and vague.
> >
> > I also don't think saying 'recursive' here really benefits anything. Just
> > saying that we try to collapse the largest mTHP size we can in each
> > instance, and then give a more 'words-y' explanation as to how
> > max_ptes_none is (in effect) converted to a ratio of a PMD, and then that
> > ratio is applied to the mTHP sizes.
> >
> > You can then go on to say that this counter measures the number of
> > occasions in which this occurred.
> ack I will clean it up

Thanks
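
To make the 'ratio of a PMD' point concrete, here is a rough sketch of how I
read the scaling. This is illustrative only and is not the code from the
series: the helper names are made up, HPAGE_PMD_ORDER == 9 assumes 4 KiB base
pages with a 2 MiB PMD, and scaling by a right shift is my assumption of what
'scaled max_ptes_none' means.

```c
#include <stdbool.h>

/* Assumption: 4 KiB base pages and a 2 MiB PMD, i.e. 512 PTEs per PMD. */
#define HPAGE_PMD_ORDER 9

/*
 * Hypothetical helper: treat max_ptes_none as a fraction of a PMD-sized
 * range and apply the same fraction to a smaller order by shifting.
 */
unsigned int scaled_max_ptes_none(unsigned int max_ptes_none,
				  unsigned int order)
{
	return max_ptes_none >> (HPAGE_PMD_ORDER - order);
}

/*
 * Hypothetical helper: true if a range of the given order with nr_none
 * empty PTEs is within the scaled limit. A false result is the kind of
 * event collapse_exceed_none_pte would count for that order.
 */
bool within_none_limit(unsigned int max_ptes_none, unsigned int order,
		       unsigned int nr_none)
{
	return nr_none <= scaled_max_ptes_none(max_ptes_none, order);
}
```

With the default max_ptes_none of 511 this gives 511 for order 9, 255 for
order 8, and 15 for order 4 (a 16-PTE range), so the tolerated proportion of
none PTEs stays roughly constant across orders.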

> >
> > > +
> > > +collapse_exceed_shared_pte
> > > +       The number of anonymous THP which contain at least one shared PTE.
> >
> > anonymous THP pages right? :)
> regions?

I don't understand what regions would mean here?

So what are you actually measuring? The number of anonymous THP mappings? If so
I think 'mappings' is probably better. Or 'The number of anonymous THP page
table ranges...' perhaps?

> >
> > > +       Currently khugepaged does not support collapsing mTHP regions that
> > > +       contain a shared PTE.
> >
> > Again I don't really understand the purpose of creating a counter for
> > something we don't support.

> see above

Ack

> >
> > Let's add it when we support it.
> >
> > I also in this case and the exceed swap case don't understand what you mean
> > by exceed here, you need to spell this out clearly.
> >
> > Perhaps the context missing here is that you _also_ count THP events in
> > these counters.
> >
> > But again, given we have THP_... counters for the stats mTHP doesn't do
> > yet, I'd say adding these is pointless.
> >
> > > +
> > >  As the system ages, allocating huge pages may be expensive as the
> > >  system uses memory compaction to copy data around memory to free a
> > >  huge page for use. There are some counters in ``/proc/vmstat`` to help
> > > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > > index 4ada5d1f7297..6f1593d0b4b5 100644
> > > --- a/include/linux/huge_mm.h
> > > +++ b/include/linux/huge_mm.h
> > > @@ -144,6 +144,9 @@ enum mthp_stat_item {
> > >       MTHP_STAT_SPLIT_DEFERRED,
> > >       MTHP_STAT_NR_ANON,
> > >       MTHP_STAT_NR_ANON_PARTIALLY_MAPPED,
> > > +     MTHP_STAT_COLLAPSE_EXCEED_SWAP,
> > > +     MTHP_STAT_COLLAPSE_EXCEED_NONE,
> > > +     MTHP_STAT_COLLAPSE_EXCEED_SHARED,
> >
> > Why do we put 'collapse' here but not in the THP equivalents?

> to indicate they come from the collapse functionality. I can shorten
> it by removing COLLAPSE if you'd like

Hmm, if this is actually giving information then fine to keep.

> >
> > >       __MTHP_STAT_COUNT
> > >  };
> > >
> > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > > index 20d005c2c61f..9f0470c3e983 100644
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -639,6 +639,10 @@ DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
> > >  DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
> > >  DEFINE_MTHP_STAT_ATTR(nr_anon, MTHP_STAT_NR_ANON);
> > >  DEFINE_MTHP_STAT_ATTR(nr_anon_partially_mapped, MTHP_STAT_NR_ANON_PARTIALLY_MAPPED);
> > > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_swap_pte, MTHP_STAT_COLLAPSE_EXCEED_SWAP);
> > > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_none_pte, MTHP_STAT_COLLAPSE_EXCEED_NONE);
> > > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_shared_pte, MTHP_STAT_COLLAPSE_EXCEED_SHARED);
> > > +
> > >
> > >  static struct attribute *anon_stats_attrs[] = {
> > >       &anon_fault_alloc_attr.attr,
> > > @@ -655,6 +659,9 @@ static struct attribute *anon_stats_attrs[] = {
> > >       &split_deferred_attr.attr,
> > >       &nr_anon_attr.attr,
> > >       &nr_anon_partially_mapped_attr.attr,
> > > +     &collapse_exceed_swap_pte_attr.attr,
> > > +     &collapse_exceed_none_pte_attr.attr,
> > > +     &collapse_exceed_shared_pte_attr.attr,
> > >       NULL,
> > >  };
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index c13bc583a368..5a3386043f39 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -594,7 +594,9 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> > >                               continue;
> > >                       } else {
> > >                               result = SCAN_EXCEED_NONE_PTE;
> > > -                             count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
> >
> > Hm so wait you were miscounting statistics in patch 10/13 when you turned
> > all this on? That's not good.
> >
> > This should be in place _first_ before enabling the feature.
> Ok I can move them around.

Thanks

> >
> > > +                             if (order == HPAGE_PMD_ORDER)
> > > +                                     count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
> > > +                             count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_NONE);
> > >                               goto out;
> > >                       }
> > >               }
> > > @@ -633,10 +635,17 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> > >                        * shared may cause a future higher order collapse on a
> > >                        * rescan of the same range.
> > >                        */
> > > -                     if (order != HPAGE_PMD_ORDER || (cc->is_khugepaged &&
> > > -                         shared > khugepaged_max_ptes_shared)) {
> > > +                     if (order != HPAGE_PMD_ORDER) {
> >
> > Hm wait what? I don't understand what's going on here? You're no longer
> > actually doing any check except order != HPAGE_PMD_ORDER?... am I missing
> > something?
> >
> > Again why we are bothering to maintain a counter that doesn't mean anything
> > I don't know? I may be misinterpreting somehow however.

I guess answered by rest.

> >
> > > +                             result = SCAN_EXCEED_SHARED_PTE;
> > > +                             count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_SHARED);
> > > +                             goto out;
> > > +                     }
> > > +
> > > +                     if (cc->is_khugepaged &&
> > > +                         shared > khugepaged_max_ptes_shared) {
> > >                               result = SCAN_EXCEED_SHARED_PTE;
> > >                               count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
> > > +                             count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_SHARED);
> > >                               goto out;
> > >                       }
> > >               }
> > > @@ -1084,6 +1093,7 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
> > >                * range.
> > >                */
> > >               if (order != HPAGE_PMD_ORDER) {
> > > +                     count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_SWAP);
> >
> > This again seems surely to not be testing for what it claims to be
> > tracking? I may again be missing context here.
> We are bailing out of the mTHP collapse because the range contains a swap
> PTE, which in turn exceeds our threshold of 0.

OK.
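
For the documentation, the 'threshold of 0' accounting could be spelled out
along these lines. This is a simplified userspace model of the shared-PTE
hunk quoted above, not kernel code: struct collapse_stats and
shared_pte_exceeds() are hypothetical names, and PMD_ORDER == 9 assumes 4 KiB
base pages. As in the hunk, it models the point where a shared PTE has
already been seen (so shared >= 1).

```c
#include <stdbool.h>

/* Assumption: 4 KiB base pages, so the PMD order is 9. */
enum { PMD_ORDER = 9, NR_ORDERS = PMD_ORDER + 1 };

/* Hypothetical stand-in for the vmstat event and the per-order mTHP stat. */
struct collapse_stats {
	unsigned long thp_scan_exceed_shared_pte;
	unsigned long mthp_exceed_shared[NR_ORDERS];
};

/*
 * Model of the check made once a shared PTE has been found in the range.
 * Returns true if the collapse attempt at this order is abandoned.
 */
bool shared_pte_exceeds(struct collapse_stats *stats, unsigned int order,
			unsigned int shared, unsigned int max_ptes_shared,
			bool is_khugepaged)
{
	/* Below PMD order the effective limit is 0: any shared PTE bails. */
	if (order != PMD_ORDER) {
		stats->mthp_exceed_shared[order]++;
		return true;
	}

	/* At PMD order the max_ptes_shared limit applies (khugepaged only). */
	if (is_khugepaged && shared > max_ptes_shared) {
		stats->thp_scan_exceed_shared_pte++;
		stats->mthp_exceed_shared[order]++;
		return true;
	}

	return false;
}
```

The swap case in the quoted hunk follows the first branch only: a swap PTE in
a sub-PMD range bumps the per-order counter and the attempt is abandoned.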

>
> Cheers,
> -- Nico
> >
> > >                       pte_unmap(pte);
> > >                       mmap_read_unlock(mm);
> > >                       result = SCAN_EXCEED_SWAP_PTE;
> > > --
> > > 2.50.1
> > >
> >
>
