The Linux Kernel Mailing List
From: Usama Arif <usama.arif@linux.dev>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Nico Pache <npache@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	yuzhao@google.com, usamaarif642@gmail.com, lance.yang@linux.dev,
	baohua@kernel.org, dev.jain@arm.com, ryan.roberts@arm.com,
	liam@infradead.org, baolin.wang@linux.alibaba.com,
	ziy@nvidia.com, ljs@kernel.org, akpm@linux-foundation.org
Subject: Re: [RFC] mm: restrict zero-page remapping to underused THP splits
Date: Mon, 11 May 2026 14:10:35 +0100
Message-ID: <608bef55-44d1-47f1-a201-4a6bd7be137d@linux.dev>
In-Reply-To: <8838114e-5b6a-4f3d-932c-9e97e51216ae@kernel.org>



On 11/05/2026 07:36, David Hildenbrand (Arm) wrote:
> 
>>>
>>> I tend to like (2), and maybe (3) on top. Opinions?
>>>
>>
>> Hello!
> 
> 
> Hi!
> 
>>
>> I think (3) definitely makes sense.
>>
>> I have not had a deep look at KSM until just now, so what follows might
>> be dumb.. :)
>>
>> What I see is that KSM scans a THP as 512 individual 4K subpages and splits the
>> THP whenever it actually wants to merge a single 4K chunk. That seems like a
>> lot of work for a single 4K merge?
> 
> Yes, but that's what the users ask for: if there is a chance to deduplicate
> memory, it shall be deduplicated asap.
> 
>>
>> One thing that came to my mind is to have a separate tree for THPs and only
>> merge the THPs that have the same content, but the probability of encountering
>> 2M pages with the same content is extremely low, so this is probably a bad idea.
> 
> Right, the probability is low, and it would change existing semantics, breaking
> existing users.
> 
> In addition, we would have to add large folio support for KSM, which I would
> rather avoid.
> 
>>
>> An alternative is: does it even make sense for KSM to process and split THPs
>> the way it does now? IMO this is a lot of work for a single 4K merge.
>> The shrinker is designed to release memory when it's needed, i.e. at reclaim,
>> at which point IMO free memory is more important than performance. But KSM
>> runs all the time.. so constantly splitting THPs every time a single 4K page
>> can be merged just hurts performance all the time.
> 
> Right, but that's what you get with KSM: bad performance if there is a chance to
> deduplicate :)
> 
> (and bad performance from scanning overhead)
> 
>> If someone cares about memory,
>> they should be running the shrinker.
> 
> It's not just the zero page, but really any page content. The zero page is
> currently only "special" after we added conditional support to deduplicate to
> the shared zeropage in KSM. The shrinker doesn't help for any other page content
> besides zero-filled.
> 
> Further, the shrinker is something system-wide, whereas KSM is usually only
> enabled for selected VMAs (with some exceptions nowadays).
> 
> Also note that KSM deduplicates independent of the folio size: not just THPs,
> but really any (large) folio. Yes, it splits large folios, but that's really
> just to keep the T in THP.
> 
>> Would a better alternative be that KSM skips THPs, the THP shrinker splits
>> THPs into 4K subpages when memory is needed, and only then KSM gets those
>> 4K subpages?
>>
>> Above sounds like reworking KSM, but just wanted to put it out there.
> 
> Right, and it makes KSM more THP-aware, which is something I would avoid right now.
> 
>>
>> (2) + (3) sounds like a good solution, but I wonder if the above alternative
>> of KSM just skipping THPs might be better?
> 
> That would change the semantics: for example, we expect that memory
> was deduplicated after a KSM run.
> 
> VMs (where KSM is usually employed) are expected to be mostly backed by THPs,
> except where we can deduplicate memory. Skipping THPs would essentially break
> the main use case for KSM :)
> 
> Does that make sense?
> 

Yes, all of the above makes sense. But I feel like this means one should not
set the THP policy to "always" and enable KSM at the same time. In general,
I feel like KSM is not something that should be run on big servers, where you
are hopefully not managing memory in 4K chunks and are using a lot of THPs.
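
To make the combination concrete, below is a minimal userspace sketch (my
own illustration, nothing from an actual patch; the region size and the
0x5a fill pattern are arbitrary) of a single VMA that is both THP-eligible
and KSM-mergeable, i.e. exactly the situation discussed above:

/*
 * One VMA, both THP-eligible and KSM-mergeable. With
 * /sys/kernel/mm/ksm/run set to 1, ksmd scans this region in 4K steps
 * and splits a THP before merging any single subpage out of it.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define SZ (4UL << 20)			/* room for two 2M THPs */

int main(void)
{
	char *buf = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Ask for THP backing (redundant under the "always" policy). */
	if (madvise(buf, SZ, MADV_HUGEPAGE))
		perror("madvise(MADV_HUGEPAGE)");

	/* Make the same VMA mergeable, so ksmd will scan it. */
	if (madvise(buf, SZ, MADV_MERGEABLE))
		perror("madvise(MADV_MERGEABLE)");

	/* Identical content everywhere: every 4K page is a merge candidate. */
	memset(buf, 0x5a, SZ);

	/*
	 * Park here and watch AnonHugePages in /proc/self/smaps_rollup
	 * shrink while /sys/kernel/mm/ksm/pages_sharing grows.
	 */
	pause();
	return 0;
}

Running this with ksmd enabled should show the trade-off I am worried
about: AnonHugePages drops as pages_sharing climbs.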



Thread overview: 12+ messages
2026-05-08 17:05 [RFC] mm: restrict zero-page remapping to underused THP splits Nico Pache
2026-05-08 21:32 ` David Hildenbrand (Arm)
2026-05-09  8:25   ` Lance Yang
2026-05-10 11:39   ` Usama Arif
2026-05-11  6:36     ` David Hildenbrand (Arm)
2026-05-11 13:10       ` Usama Arif [this message]
2026-05-11 13:42         ` David Hildenbrand (Arm)
2026-05-11 13:44           ` David Hildenbrand (Arm)
2026-05-11 14:15             ` Usama Arif
2026-05-11 18:40   ` Nico Pache
2026-05-09  3:21 ` Lance Yang
2026-05-11 18:42   ` Nico Pache
