linux-mm.kvack.org archive mirror
From: Byungchul Park <byungchul@sk.com>
To: Gregory Price <gourry@gourry.net>
Cc: Matthew Wilcox <willy@infradead.org>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
	linux-cxl@vger.kernel.org, Honggyu Kim <honggyu.kim@sk.com>,
	kernel_team@skhynix.com
Subject: Re: [LSF/MM/BPF TOPIC] Restricting or migrating unmovable kernel allocations from slow tier
Date: Fri, 7 Feb 2025 19:14:49 +0900
Message-ID: <20250207101449.GA35103@system.software.com>
In-Reply-To: <Z6XLCeU0vjIOYGKe@gourry-fedora-PF4VCD3F>

On Fri, Feb 07, 2025 at 03:57:45AM -0500, Gregory Price wrote:
> On Fri, Feb 07, 2025 at 04:20:24PM +0900, Byungchul Park wrote:
> > On Sat, Feb 01, 2025 at 02:04:17PM +0000, Matthew Wilcox wrote:
> > 
> > We can start with the easiest objects
> 
> > e.g. page table
> 
> It's more efficient and easier to change page sizes than it is to make
> page tables migratable.

You are misunderstanding.  I didn't say 'do not change page sizes'.  Nor
did I say that making page tables migratable is easier than changing
page sizes.  I said that *both* changing page sizes and making page
tables migratable could reduce ZONE_NORMAL cost.

> It's also easier to reclaim cold pages eating up significantly more
> memory than the page table (which describes pages at ~8 bytes per page).

Same.  We should keep reclaiming cold pages that eat up memory.  Why
would we give up reclaiming cold pages if page tables become migratable?
I really don't understand why you are trying to pick only one effort for
that purpose.
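
For reference, the ~8 bytes per page figure quoted above can be turned
into a rough estimate.  The sketch below is only a back-of-the-envelope
model: it counts leaf entries, ignores upper page-table levels, and the
1 TiB mapped size and the helper name leaf_pte_bytes() are assumptions
made for illustration.

```python
PTE_BYTES = 8  # ~8 bytes per mapped page, as quoted above


def leaf_pte_bytes(mapped_bytes: int, page_size: int) -> int:
    """Bytes of leaf page-table entries needed to map `mapped_bytes`."""
    pages = -(-mapped_bytes // page_size)  # ceiling division
    return pages * PTE_BYTES


TIB = 1 << 40  # hypothetical amount of mapped memory
for name, page_size in (("4 KiB", 4 << 10), ("2 MiB", 2 << 20)):
    mib = leaf_pte_bytes(TIB, page_size) / (1 << 20)
    print(f"{name} pages: {mib:.0f} MiB of leaf entries per TiB mapped")
```

Under these assumptions, 1 TiB mapped with 4 KiB pages needs about
2 GiB of leaf entries, while 2 MiB pages need about 4 MiB.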

> Also, there's quite a bit of literature that shows page tables landing
> on remote nodes (cross-socket) has negative performance impacts.

Exactly.  That's the motivation for suggesting this topic.  That's why
we are asking about kernel object migratability.  Of course, we try our
best to place kernel objects in DRAM in the first place.  However, the
problem arises when that becomes impossible.  It's a comparison between
'premature reclaim and death (= OOM)' and 'slight degradation of
performance'.

> Putting them on CXL makes the problem worse.

No.  A higher chance of dying is worse.

> > struct page,
> 
> `struct page` is a structure that describes a physically addressed page.
> 
> It is common to access it by simply doing `pfn_to_page()`, which is a
> fairly simple conversion (bit more complex in sparsemem w/ sections)
> 
> This is used in a lockless manner to acquire page references all over
> the kernel.
> 
> Making that migratable is... ambitious, to say the least.

Yes.  I don't think it's easy.
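
For reference, the simple flat case of the lookup described above can be
modeled as plain address arithmetic.  This is only a sketch: the 64-byte
struct page size, the 4 KiB page size, and the base address are
assumptions for illustration (real values depend on architecture and
config), and sparsemem sections are not modeled.  The 20 TiB figure
mirrors the large system mentioned in this thread.

```python
PAGE_SIZE = 4096        # assumed base page size in bytes
STRUCT_PAGE_BYTES = 64  # assumed sizeof(struct page); varies by config
VMEMMAP_BASE = 0xFFFFEA0000000000  # illustrative base of the struct page array


def pfn_to_page_addr(pfn: int) -> int:
    """Model of a flat lookup: base + pfn * sizeof(struct page)."""
    return VMEMMAP_BASE + pfn * STRUCT_PAGE_BYTES


def memmap_bytes(mem_bytes: int) -> int:
    """Bytes of struct page needed to describe `mem_bytes` of memory."""
    return (mem_bytes // PAGE_SIZE) * STRUCT_PAGE_BYTES


TIB = 1 << 40
print(hex(pfn_to_page_addr(1)))
print(memmap_bytes(20 * TIB) // (1 << 30), "GiB of struct page for 20 TiB")
```

With these assumed sizes, the struct page array costs 64 bytes per
4 KiB page, i.e. about 320 GiB for 20 TiB of memory.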

> > and kernel stack,
> 
> The default kernel stack size is like 16KB.  You'd need like 100,000
> threads to eat up 1.5GB, and 2048 threads only eat like 32MB.
> 
> It's not an interesting amount of memory if you have a 20TB system.

The kernel stack is just an example.  We can skip it and look for a
better candidate.
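
For reference, the stack numbers quoted above can be checked with a
short sketch.  It assumes 16 KiB per kernel stack, as quoted, and the
helper name stack_footprint_mib() is made up for illustration.

```python
STACK_BYTES = 16 * 1024  # assumed 16 KiB per kernel stack, as quoted above


def stack_footprint_mib(threads: int, stack_bytes: int = STACK_BYTES) -> float:
    """Return the total kernel stack footprint in MiB for `threads` threads."""
    return threads * stack_bytes / (1024 * 1024)


for threads in (2048, 100_000):
    print(f"{threads} threads -> {stack_footprint_mib(threads):.1f} MiB")
```

This gives 32 MiB for 2048 threads and roughly 1.5 GiB (1562.5 MiB) for
100,000 threads, consistent with the quoted estimate.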

> > When it comes to this topic, the most important thing is the collected
> > *direction* from the community so that we can start the work under the
> > *direction*.
> > 
> 
> My thoughts here are that memory tiering is the wrong tool for the
> problem you are trying to solve.

I think all valid efforts can be considered at the same time.  Is there
any reason why efforts in a tiering environment should be excluded?

	Byungchul

> Maybe there's a world in which we propose a ZONE_MEMDESC which is
> exclusively used for `struct page` for a node. 
> 
> At least then you could design CXL capacities *around* that.
> 
> ~Gregory


