The Linux Kernel Mailing List
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Usama Arif <usama.arif@linux.dev>
Cc: Nico Pache <npache@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	yuzhao@google.com, usamaarif642@gmail.com, lance.yang@linux.dev,
	baohua@kernel.org, dev.jain@arm.com, ryan.roberts@arm.com,
	liam@infradead.org, baolin.wang@linux.alibaba.com,
	ziy@nvidia.com, ljs@kernel.org, akpm@linux-foundation.org
Subject: Re: [RFC] mm: restrict zero-page remapping to underused THP splits
Date: Mon, 11 May 2026 08:36:55 +0200
Message-ID: <8838114e-5b6a-4f3d-932c-9e97e51216ae@kernel.org>
In-Reply-To: <20260510114001.600681-1-usama.arif@linux.dev>


>>
>> I tend to like (2), and maybe (3) on top. Opinions?
>>
> 
> Hello!


Hi!

> 
> I think (3) definitely makes sense.
> 
> I have not had a deep look at KSM up until just now, so might be dumb
> to say all of below.. :)
> 
> What I see is that KSM scans THPs as 512 individual 4K subpages and splits the
> THP whenever it actually wants to merge a single 4K chunk. That seems like a
> lot of work for a single 4K?

Yes, but that's what the users ask for: if there is a chance to deduplicate
memory, it shall be deduplicated asap.

> 
> One thing that came to my mind is to have a separate tree for THPs and only
> merge the THPs that have the same content, but the possibility of encountering
> 2M pages with the same content is extremely low, so this is probably a bad idea.

Right, the probability is low, and it would change existing semantics, breaking
existing users.

In addition, we would have to add large folio support for KSM, which I would
rather avoid.

> 
> An alternative is, does it even make sense to process and split THPs by KSM
> in the way it works now? IMO this is a lot of work for a single 4K merge.
> Shrinker is designed to release memory when it's needed, i.e. reclaim, at
> which point IMO free memory is more important than performance. But KSM runs
> all the time.. so constantly splitting THPs every time a single 4K can be
> merged just hurts performance all the time.

Right, but that's what you get with KSM: bad performance if there is a chance to
deduplicate :)

(and bad performance from scanning overhead)

> If someone cares about memory,
> they should be running the shrinker.

It's not just the zero page, but really any page content. The zero page is
currently only "special" after we added conditional support to deduplicate to
the shared zeropage in KSM. The shrinker doesn't help for any other page content
besides zero-filled.
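
To make the difference concrete, here is a toy userspace model (not kernel
code). It walks one 2M region as 512 4K subpages and counts which subpages are
zero-filled, which duplicate content already seen, and which are unique. The
names classify_subpages() and build_example_thp() are hypothetical, and SHA-256
is only a stand-in for KSM's real content comparison.

```python
import hashlib

PAGE_SIZE = 4096                  # 4K base page
THP_SIZE = 2 * 1024 * 1024        # 2M PMD-sized THP (x86-64 assumption)
SUBPAGES = THP_SIZE // PAGE_SIZE  # 512 subpages per THP


def classify_subpages(thp, known_digests):
    """Count zero-filled, duplicate, and unique 4K subpages (toy model)."""
    zero_page = bytes(PAGE_SIZE)
    counts = {"zero": 0, "duplicate": 0, "unique": 0}
    for off in range(0, len(thp), PAGE_SIZE):
        page = thp[off:off + PAGE_SIZE]
        if page == zero_page:
            counts["zero"] += 1       # the only case a zero-page remap covers
            continue
        digest = hashlib.sha256(page).digest()
        if digest in known_digests:
            counts["duplicate"] += 1  # any-content dedup candidate
        else:
            counts["unique"] += 1
            known_digests.add(digest)
    return counts


def build_example_thp():
    """Hypothetical layout: 100 zero pages, 200 identical pages, 212 unique."""
    pages = []
    for i in range(SUBPAGES):
        if i < 100:
            pages.append(bytes(PAGE_SIZE))
        elif i < 300:
            pages.append(b"A" * PAGE_SIZE)
        else:
            pages.append(i.to_bytes(4, "little") * (PAGE_SIZE // 4))
    return b"".join(pages)


counts = classify_subpages(build_example_thp(), set())
print(counts)
```

In this made-up layout, zero-page handling alone would cover the 100 zero-filled
subpages, while content-based deduplication would also cover the 199 repeats of
the "A" page.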

Further, the shrinker is something system-wide, whereas KSM is usually only
enabled for selected VMAs (with some exceptions nowadays).
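
As an illustration of the per-VMA opt-in, a minimal userspace sketch: a process
marks a private anonymous mapping as mergeable with madvise(MADV_MERGEABLE).
The helper name mark_mergeable() is hypothetical; the sketch assumes Linux and
Python 3.8+, and the call is expected to fail on kernels built without KSM.
ksmd only scans such ranges once /sys/kernel/mm/ksm/run is set to 1.

```python
import mmap


def mark_mergeable(length):
    """Map private anonymous memory and ask KSM to consider it (sketch)."""
    advice = getattr(mmap, "MADV_MERGEABLE", None)
    if advice is None or not hasattr(mmap.mmap, "madvise"):
        return "unsupported"          # constant or method missing here
    buf = mmap.mmap(-1, length,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        buf.madvise(advice)           # opt only this VMA into KSM scanning
        return "mergeable"
    except OSError:
        return "rejected"             # e.g. kernel built without KSM
    finally:
        buf.close()


status = mark_mergeable(2 * 1024 * 1024)
print(status)
```

Only the advised range becomes a KSM candidate; the rest of the address space
is left alone, which is the contrast with a system-wide shrinker.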

Also note that KSM deduplicates independently of the folio size: not just THPs,
but really any (large) folio. Yes, it splits large folios, but that's really
just to keep the T in THP.

> Is a better alternative that KSM skips
> THPs, THP shrinker splits THPs into 4K subpages when memory is needed, and
> only then KSM gets those 4K subpages?
> 
> Above sounds like reworking KSM, but just wanted to put it out there.

Right, and it would make KSM more THP-aware, which is something I would avoid
right now.

> 
> (2) + (3) sounds like a good solution, but I wonder if above alternative
> of KSM just skipping THPs might be better?

That would change the semantics: for example, we expect that memory was
deduplicated after a KSM run.

VMs (where KSM is usually employed) are expected to be mostly backed by THPs,
except where we can deduplicate memory. Skipping THPs would essentially break
the main use case for KSM :)

Does that make sense?

-- 
Cheers,

David
