linux-mm.kvack.org archive mirror
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	akpm@linux-foundation.org, ziy@nvidia.com,
	Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
	dev.jain@arm.com, baohua@kernel.org, zokeefe@google.com,
	shy828301@gmail.com, usamaarif642@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled
Date: Wed, 25 Jun 2025 09:22:58 +0100
Message-ID: <f36e64f2-f3d1-407e-862f-ceccc89ac9a8@lucifer.local>
In-Reply-To: <28051538-d3ea-4064-aef3-89f6dd98b119@redhat.com>

On Wed, Jun 25, 2025 at 10:16:46AM +0200, David Hildenbrand wrote:
> On 25.06.25 09:49, David Hildenbrand wrote:
> > I think the whole use case of using MADV_COLLAPSE to completely control
> > THP allocation in a system is otherwise pretty hard to achieve, if there
> > is no other way to tame THP allocation through page faults+khugepaged.
>
> Just want to add: for an app itself, it's doable in "madvise" mode perfectly
> fine.
>
> If your app does a MADV_HUGEPAGE, it can get a THP during page-fault +
> khugepaged.
>
> If your app does not do a MADV_HUGEPAGE, it can get a THP through
> MADV_COLLAPSE.
>
> So the "madvise" mode actually works.

Right, but for me MADV_COLLAPSE is more about 'I want THPs _now_ (if available),
not when khugepaged decides to give me some'.
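
As a minimal sketch of that 'now' semantic (not taken from the patch series):
a process can call madvise() with MADV_COLLAPSE on a populated range and get a
synchronous, best-effort collapse attempt. The helper name collapse_now() is
hypothetical, and the fallback value 25 for MADV_COLLAPSE is the asm-generic
UAPI value, defined here only in case the libc headers predate it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* assumption: asm-generic UAPI value (Linux 6.1+) */
#endif

/*
 * Hypothetical helper: ask the kernel to collapse [addr, addr + len) into
 * THPs synchronously, instead of waiting for khugepaged to get to it.
 * Returns 0 on success or a negative errno value on failure.
 */
int collapse_now(void *addr, size_t len)
{
	assert(addr != NULL);

	if (madvise(addr, len, MADV_COLLAPSE) == 0)
		return 0;
	return -errno;
}
```

The call can fail with a negative errno (for example -EINVAL on a kernel that
does not know MADV_COLLAPSE), so a caller should treat it as a best-effort
request and not as a guarantee.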

So we have multiple semantics at work here, unfortunately.

>
> The problem appears as soon as we want to control other processes that might
> be setting MADV_HUGEPAGE, and we actually want to control the behavior using
> process_madvise(MADV_COLLAPSE), to say "well, the MADV_HUGEPAGE should be
> ignored".

This is a _very_ specialist use.

I'd argue for a 'manual' mode to be added to sysfs to cover this case, with
'never' having the 'actually means never' semantics.
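
To make the sysfs side concrete: the existing mode files, such as
/sys/kernel/mm/transparent_hugepage/enabled, list the available modes with the
active one in square brackets, e.g. "always [madvise] never". A 'manual' mode
does not exist today; it is only the proposal above. The sketch below, with a
hypothetical helper thp_active_mode(), extracts the active token from such a
line so a tool could tell which semantics are in force.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the bracketed (active) token of a sysfs mode
 * line such as "always [madvise] never" into out as a NUL-terminated string.
 * Returns 0 on success, -1 if there is no well-formed bracketed token or it
 * does not fit in out.
 */
int thp_active_mode(const char *line, char *out, size_t outlen)
{
	const char *open, *close;
	size_t n;

	assert(line != NULL && out != NULL);

	open = strchr(line, '[');
	if (!open)
		return -1;
	close = strchr(open + 1, ']');
	if (!close)
		return -1;

	n = (size_t)(close - open - 1);
	if (n == 0 || n >= outlen)
		return -1;

	memcpy(out, open + 1, n);
	out[n] = '\0';
	return 0;
}
```

A caller would read one line from the sysfs file and pass it in; the parser
does not care which tokens exist, so it would cope with an extra mode being
added later.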

You might argue that could confuse things, but it'd retain the 'de facto'
understanding nearly everybody has about what these flags mean, while giving
whatever user is out there that needs this the ability to continue doing what
they want.

And we get into philosophy about not 'breaking' userland; I'm not sure we have
a TLB/page fault/folio allocation efficiency contract with userland :)

No program will break with this patch applied. At worst, one might see
performance degradation in a very, very specialist case.

>
> Then, you configure "never" system-wide and use
> process_madvise(MADV_COLLAPSE) to drive it all manually.
>
> Curious to learn if there is such a user out there.

Oh me too :)

>
> --
> Cheers,
>
> David / dhildenb
>

