From: Usama Arif <usamaarif642@gmail.com>
To: David Hildenbrand <david@redhat.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Hugh Dickins <hughd@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>,
akpm@linux-foundation.org, ziy@nvidia.com,
Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
dev.jain@arm.com, baohua@kernel.org, zokeefe@google.com,
shy828301@gmail.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled
Date: Wed, 25 Jun 2025 12:03:17 +0100
Message-ID: <f366ce31-582c-4f90-bc32-05ddf3e71fa6@gmail.com>
In-Reply-To: <008ec97f-3b33-4b47-a112-9cef8c1d7f58@redhat.com>
On 25/06/2025 08:34, David Hildenbrand wrote:
>>>
>>> We would all prefer a less messy world of THP tunables. I certainly
>>> find plenty to dislike there too; and wish that a less assertive name
>>> than "never" had been chosen originally for the default off position.
>>>
>>> But please don't break the accepted and documented behaviour of
>>> MADV_COLLAPSE now.
>>
>> Again see above, I absolutely disagree this is documented _clearly_. And
>> that's the underlying issue here.
>>
>> I feel like if you polled 100 system administrators (assuming they knew
>> about THP) as to how you globally disable THP, probably all 100 would say
>> you do it via:
>>
>> # echo never > /sys/kernel/mm/transparent_hugepage/enabled
>>
>
> Yes. One big problem is that the documentation was not updated.
>
> Changing the meaning of "entirely disabled" to "entirely disabled automatically (page faults, khugepaged)"
>
>> So shouldn't 'never break userspace' be based on practical reality rather
>> than a theorised interpretation of documents that sadly are not clear
>> enough?
>
> I think the problem is that there might indeed be more users out there relying on "never+MADV_COLLAPSE" to now place THPs than "never+MADV_COLLAPSE" to not place THPs.
>
> What is the harm when not placing THPs? Performance degradation for some apps?
>
I think a bigger issue than performance degradation is someone upgrading the kernel,
finding that MADV_COLLAPSE no longer works as it has since the beginning, and not knowing
that it's due to a kernel change.

I feel transparent_hugepage/enabled is too messed up, and it's difficult to fix it without
breaking it for someone. I still find it weird that we can set transparent_hugepage/enabled
to never and transparent_hugepage/hugepages-2048kB/enabled to madvise and still get hugepages.
(And we actually use this configuration in production for our ARM servers).
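
To make the override concrete, here is a toy model of how the global and per-size "enabled" files combine. It is a sketch, not kernel code: the function names are made up for illustration, and it only models the "enabled" values for anonymous memory on the page fault and khugepaged paths. It ignores prctl, VM_NOHUGEPAGE, the shmem settings and MADV_COLLAPSE.

```python
# Toy model only, not kernel code. It mirrors the documented values of the
# global and per-size "enabled" sysfs files.

def effective_mode(global_enabled: str, size_enabled: str) -> str:
    """Return the mode that applies to one THP size.

    global_enabled: value of transparent_hugepage/enabled
                    ("always", "madvise" or "never").
    size_enabled:   value of transparent_hugepage/hugepages-<size>kB/enabled
                    ("always", "inherit", "madvise" or "never").
    """
    if size_enabled == "inherit":
        # Only "inherit" defers to the global file.
        return global_enabled
    # Any explicit per-size value wins over the global one.
    return size_enabled


def may_use_thp(global_enabled: str, size_enabled: str, madvised: bool) -> bool:
    """Whether a page fault or khugepaged may place a THP of this size."""
    mode = effective_mode(global_enabled, size_enabled)
    if mode == "always":
        return True
    if mode == "madvise":
        # Only regions marked with MADV_HUGEPAGE qualify.
        return madvised
    return False
```

Under this model, may_use_thp("never", "madvise", madvised=True) is true, which is the configuration described above: the global file says never, yet a madvised region can still get a 2M THP.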
Introducing deny for both the global and per-size settings will, I feel, overcomplicate it because of the issue in
the previous paragraph: the per-size setting overrides the global setting. So even if
transparent_hugepage/enabled is deny, we might still get a THP if the per-size setting is not.
Someone would need to set every file to deny, which is the same as setting every file to never.
So I just wanted to throw another bad idea into the mix: what if we introduce another sysfs file
(I hate introducing sysfs files :)), something like /sys/kernel/mm/thp_allowed (or some other name),
which defaults to 1?

Once someone sets it to 0, no one can ever get a THP, no matter what future changes we make. Whether it's
MADV_COLLAPSE, bpf THPs, cgroup THPs, prctls, syscalls... never will mean never.
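
The intended semantics can be sketched in a few lines. This is purely hypothetical: /sys/kernel/mm/thp_allowed does not exist in any kernel, and the function and parameter names below are invented for this illustration.

```python
# Hypothetical sketch of the proposed semantics. /sys/kernel/mm/thp_allowed
# is only the idea floated in this mail, not an existing interface.

def thp_permitted(thp_allowed: int, other_knobs_permit: bool) -> bool:
    """Combine the proposed master switch with every other THP control.

    thp_allowed:        proposed /sys/kernel/mm/thp_allowed value (default 1).
    other_knobs_permit: outcome of everything else (the files under
                        transparent_hugepage/, MADV_COLLAPSE, prctl, ...).
    """
    if thp_allowed == 0:
        # Master switch off: nothing below it can bring THPs back.
        return False
    # Master switch on: the existing controls decide, exactly as today.
    return other_knobs_permit
```

With thp_allowed left at 1 the function simply returns what the existing knobs decided, so under this model current setups would behave as they do now. Only an explicit 0 changes anything.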
Notice that it's /sys/kernel/mm/thp_allowed and not /sys/kernel/mm/transparent_hugepage/thp_allowed.
Having it one directory above will make it look uglier, but it highlights that whatever you
set in /sys/kernel/mm/transparent_hugepage/ won't matter if /sys/kernel/mm/thp_allowed is set to 0.
Ideally this would be /sys/kernel/mm/transparent_hugepage/enabled=never if we were developing this
from scratch.

Not pushing for this idea, just throwing it out there.

Thanks,
Usama