From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: Dev Jain <dev.jain@arm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Nico Pache <npache@redhat.com>,
linux-mm@kvack.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
david@redhat.com, ziy@nvidia.com, baolin.wang@linux.alibaba.com,
ryan.roberts@arm.com, corbet@lwn.net, rostedt@goodmis.org,
mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
akpm@linux-foundation.org, baohua@kernel.org,
willy@infradead.org, peterx@redhat.com,
wangkefeng.wang@huawei.com, usamaarif642@gmail.com,
sunnanyong@huawei.com, vishal.moola@gmail.com,
thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
kirill.shutemov@linux.intel.com, aarcange@redhat.com,
raquini@redhat.com, anshuman.khandual@arm.com,
catalin.marinas@arm.com, tiwai@suse.de, will@kernel.org,
dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
jglisse@google.com, surenb@google.com, zokeefe@google.com,
hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
rdunlap@infradead.org, hughd@google.com
Subject: Re: [PATCH v10 00/13] khugepaged: mTHP support
Date: Thu, 21 Aug 2025 12:38:26 -0400
Message-ID: <bdhwa7x5zys3qnvocluyrsi2rwxgzd5ia4pzw3qlblngezrbjb@ke6jydmjm3ad>
In-Reply-To: <38b37195-28c8-4471-bd06-951083118efd@arm.com>
* Dev Jain <dev.jain@arm.com> [250821 11:14]:
>
> On 21/08/25 8:31 pm, Lorenzo Stoakes wrote:
> > OK so I noticed in patch 13/13 (!) where you change the documentation that you
> > essentially state that the whole method used to determine the ratio of PTEs to
> > collapse to mTHP is broken:
> >
> > khugepaged uses max_ptes_none scaled to the order of the enabled
> > mTHP size to determine collapses. When using mTHPs it's recommended
> > to set max_ptes_none low-- ideally less than HPAGE_PMD_NR / 2 (255
> > on 4k page size). This will prevent undesired "creep" behavior that
> > leads to continuously collapsing to the largest mTHP size; when we
> > collapse, we are bringing in new non-zero pages that will, on a
> > subsequent scan, cause the max_ptes_none check of the +1 order to
> > always be satisfied. By limiting this to less than half the current
> > order, we make sure we don't cause this feedback
> > loop. max_ptes_shared and max_ptes_swap have no effect when
> > collapsing to a mTHP, and mTHP collapse will fail on shared or
> > swapped out pages.
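
For anyone reading this without the linked threads to hand, the "creep"
described in the quoted text can be sketched roughly as below. This is
only a toy model, not the code from the series: it assumes the per-order
limit is max_ptes_none shifted right by (HPAGE_PMD_ORDER - order), it
tracks only which PTEs are populated, and scaled_max_ptes_none(),
scan_once() and simulate() are hypothetical names made up for the
illustration.

```python
HPAGE_PMD_ORDER = 9                 # assumes 4K base pages and a 2M PMD
HPAGE_PMD_NR = 1 << HPAGE_PMD_ORDER


def scaled_max_ptes_none(max_ptes_none, order):
    # Assumed scaling: shrink the PMD-level limit in proportion to the order.
    return max_ptes_none >> (HPAGE_PMD_ORDER - order)


def scan_once(present, max_ptes_none, orders):
    """One toy scan pass over a single PMD range.

    Tries the largest order first. A region collapses when it has at least
    one populated PTE and its count of none PTEs is within the scaled limit.
    Collapsing populates the whole region. Returns True if anything changed.
    """
    changed = False
    for order in sorted(orders, reverse=True):
        size = 1 << order
        limit = scaled_max_ptes_none(max_ptes_none, order)
        for start in range(0, len(present), size):
            populated = sum(present[start:start + size])
            none = size - populated
            if populated == 0 or none == 0 or none > limit:
                continue
            present[start:start + size] = [True] * size
            changed = True
    return changed


def simulate(max_ptes_none, initially_present,
             orders=range(2, HPAGE_PMD_ORDER + 1)):
    """Repeat scans until nothing changes; return (populated PTEs, passes)."""
    present = [i < initially_present for i in range(HPAGE_PMD_NR)]
    passes = 0
    while scan_once(present, max_ptes_none, orders):
        passes += 1
    return sum(present), passes


if __name__ == "__main__":
    # Limit of exactly half: every collapse satisfies the next order up.
    print(simulate(256, 2))
    # Limit just under half: the first collapse does not feed the next one.
    print(simulate(255, 3))
```

In this model a limit of 256 lets two populated PTEs grow to the full PMD
range one order per scan, while 255 stops after the first order-2
collapse, which is the behaviour the "less than HPAGE_PMD_NR / 2" advice
is trying to get.
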
> >
> > This seems to me to suggest that using
> > /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none as some means
> > of establishing a 'ratio' to do this calculation is fundamentally flawed.
> >
> > So surely we ought to introduce a new sysfs tunable for this? Perhaps
> >
> > /sys/kernel/mm/transparent_hugepage/khugepaged/mthp_max_ptes_none_ratio
> >
> > Or something like this?
> >
> > It's already questionable that we are taking a value that is expressed
> > essentially in terms of PTE entries per PMD and then use it implicitly to
> > determine the ratio for mTHP, but to then say 'oh but the default value is
> > known-broken' is just a blocker for the series in my opinion.
> >
> > This really has to be done a different way I think.
> >
> > Cheers, Lorenzo
>
> FWIW this was my version of the documentation patch:
> https://lore.kernel.org/all/20250211111326.14295-18-dev.jain@arm.com/
>
> The discussion about the creep problem started here:
> https://lore.kernel.org/all/7098654a-776d-413b-8aca-28f811620df7@arm.com/
>
> and the discussion continuing here:
> https://lore.kernel.org/all/37375ace-5601-4d6c-9dac-d1c8268698e9@redhat.com/
>
> ending with a summary I gave here:
> https://lore.kernel.org/all/8114d47b-b383-4d6e-ab65-a0e88b99c873@arm.com/
>
> This should help you with the context.

Thanks for hunting this down. The context should be referenced in the
change log, or at least in the cover letter, so we can find it more
easily in the future (and now).

The way the change log in the cover letter is written makes it
exceedingly long. Could you switch to listing the changes from v9 and
links to v1-8 (+RFCs if there are any)? Well, I guess that means the v10
changes and the v1-9 URLs.

At the length it is now, it's most likely a tl;dr for most. If you're
starting to review this at v10, then you'd probably appreciate not
rehashing discussions, and if you're coming from v9 then you already have
an idea of what v10 should have changed.

Said another way, the change log is more useful with context, and context
is difficult to find without a lore link.

I am having issues tracking down the context for many of the items
here, and it'll only get worse as time moves on. We do our best to keep
change logs with the necessary details, but having bread crumbs to
follow is extremely helpful for review and in the long run.

Thanks,
Liam