From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Usama Arif <usama.arif@linux.dev>
Cc: Nico Pache <npache@redhat.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
yuzhao@google.com, usamaarif642@gmail.com, lance.yang@linux.dev,
baohua@kernel.org, dev.jain@arm.com, ryan.roberts@arm.com,
liam@infradead.org, baolin.wang@linux.alibaba.com,
ziy@nvidia.com, ljs@kernel.org, akpm@linux-foundation.org
Subject: Re: [RFC] mm: restrict zero-page remapping to underused THP splits
Date: Mon, 11 May 2026 15:42:44 +0200
Message-ID: <574fc329-bf2d-4686-9f15-b1709432326e@kernel.org>
In-Reply-To: <608bef55-44d1-47f1-a201-4a6bd7be137d@linux.dev>
On 5/11/26 15:10, Usama Arif wrote:
>
>
> On 11/05/2026 07:36, David Hildenbrand (Arm) wrote:
>>
>>>
>>> Hello!
>>
>>
>> Hi!
>>
>>>
>>> I think (3) definitely makes sense.
>>>
>>> I have not had a deep look at KSM up until just now, so might be dumb
>>> to say all of below.. :)
>>>
>>> What I see is that KSM scans THPs as 512 individual 4K subpages and splits the
>>> THP whenever it actually wants to merge a single 4K chunk. That seems like a
>>> lot of work for a single 4K?
>>
>> Yes, but that's what the users ask for: if there is a chance to deduplicate
>> memory, it shall be deduplicated asap.
>>
>>>
>>> One thing that came to my mind is to have a separate tree for THPs and only
>>> merge the THPs that have the same content, but the possibility of encountering
>>> 2M pages with the same content is extremely low? So this is probably a bad idea.
>>
>> Right, the probability is low, and it would change existing semantics, breaking
>> existing users.
>>
>> In addition, we would have to add large folio support for KSM, which I would
>> rather avoid.
>>
>>>
>>> An alternative is, does it even make sense to process and split THPs by KSM
>>> in the way it works now? IMO this is a lot of work for a single 4K merge.
>>> The shrinker is designed to release memory when it's needed, i.e. reclaim, at
>>> which point IMO free memory is more important than performance. But KSM runs
>>> all the time, so constantly splitting THPs every time a single 4K can be
>>> merged just hurts performance all the time.
>>
>> Right, but that's what you get with KSM: bad performance if there is a chance to
>> deduplicate :)
>>
>> (and bad performance from scanning overhead)
>>
>>> If someone cares about memory,
>>> they should be running the shrinker.
>>
>> It's not just the zero page, but really any page content. The zero page is
>> currently only "special" after we added conditional support to deduplicate to
>> the shared zeropage in KSM. The shrinker doesn't help for any other page content
>> besides zero-filled.
>>
>> Further, the shrinker is something system-wide, whereas KSM is usually only
>> enabled for selected VMAs (with some exceptions nowadays).
>>
>> Also note that KSM deduplicates independent of the folio size: not just THPs,
>> but really any (large) folio. Yes, it splits large folios, but that's really
>> just to keep the T in THP.
>>
>>> Is a better alternative that KSM skips
>>> THPs, THP shrinker splits THPs into 4K subpages when memory is needed, and
>>> only then KSM gets those 4K subpages?
>>>
>>> Above sounds like reworking KSM, but just wanted to put it out there.
>>
>> Right, and it makes KSM more THP aware. Which is something I would avoid right now.
>>
>>>
>>> (2) + (3) sounds like a good solution, but I wonder if above alternative
>>> of KSM just skipping THPs might be better?
>>
>> That would change the semantics: for example, we currently expect that memory
>> was deduplicated after a KSM run.
>>
>> VMs (where KSM is usually employed) are expected to be mostly backed by THPs:
>> except where we can deduplicate memory. Skipping THPs would essentially break
>> the main use case for KSM :)
>>
>> Does that make sense?
>>
>
> Yes, all of above makes sense. But I feel like this means someone should not
> set THP policy to always and enable KSM together.
IIRC, QEMU will, by default, set MADV_HUGEPAGE and MADV_MERGEABLE :)
(KSM itself later has to be enabled manually at the system level.)
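To illustrate the combination mentioned above, here is a minimal sketch of a process opting an anonymous mapping into both THP and KSM via madvise(2). This is not QEMU's actual code: the helper name map_guest_ram() is hypothetical, and both calls are treated as best-effort hints, since they may fail on kernels built without THP or KSM support.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only, not taken from QEMU):
 * map anonymous memory and advise the kernel that the range may be
 * backed by THPs and may be scanned by KSM.
 */
void *map_guest_ram(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;

	/* Hint: prefer THP backing for this range. Failure is non-fatal. */
	if (madvise(p, len, MADV_HUGEPAGE) != 0)
		perror("madvise(MADV_HUGEPAGE)");

	/*
	 * Hint: mark the range as a KSM candidate. Nothing is merged
	 * until KSM is also enabled system-wide (typically by writing 1
	 * to /sys/kernel/mm/ksm/run).
	 */
	if (madvise(p, len, MADV_MERGEABLE) != 0)
		perror("madvise(MADV_MERGEABLE)");

	return p;
}
```

With both hints set on the same range, KSM may later split a THP in that range to merge a single 4K subpage, which is the interaction discussed in this thread.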
> In general I feel like KSM
> is not something that should be run on big servers, as hopefully you are
> not managing memory as 4K chunks for big machines and using a lot of THPs.
Right. But the 4k chunks are movable and compaction can move them around to
create THPs elsewhere.
--
Cheers,
David