linux-mm.kvack.org archive mirror
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
	Aaron Lu <aaron.lu@intel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [LSF/MM/BPF TOPIC] reducing direct map fragmentation
Date: Sun, 19 Feb 2023 08:07:59 +0000	[thread overview]
Message-ID: <Y/HY3y4toae8/nmQ@localhost> (raw)
In-Reply-To: <Y9qqLZz3bFsgE0Kn@kernel.org>

On Wed, Feb 01, 2023 at 08:06:37PM +0200, Mike Rapoport wrote:
> Hi all,

Hi Mike, I'm interested in this topic and hope to discuss it with you
at LSF/MM/BPF.
 
> There are use-cases that need to remove pages from the direct map or at least
> map them at PTE level. These use-cases include vfree, module loading, ftrace,
> kprobe, BPF, secretmem and generally any caller of set_memory/set_direct_map
> APIs.
> 
> Remapping pages at PTE level causes split of the PUD and PMD sized mappings
> in the direct map which leads to performance degradation.
>
> To reduce the performance hit caused by the fragmentation of the direct
> map, it makes sense to group and/or cache the base pages removed from the
> direct map so that most of the base pages created during a split of a large
> page will be consumed by users requiring PTE level mappings.

How much of a performance difference did you see in your tests when the
direct map was fragmented, and is there a way to measure this difference?

> Last year the proposal to use a new migrate type for such cache received
> strong pushback and the suggested alternative was to try to use slab
> instead.
> 
> I've been thinking about it (yeah, it took me a while) and I believe slab
> is not appropriate because use cases require at least page size allocations
> and some would really benefit from higher order allocations, and in the
> most cases the code that allocates memory excluded from the direct map
> needs the struct page/folio. 
>
> For example, caching allocations of text in 2M pages would benefit from
> reduced iTLB pressure and doing kmalloc() from vmalloc() will be way more
> intrusive than using some variant of __alloc_pages().
>
> Secretmem and potentially PKS protected page tables also need struct
> page/folio.
> 
> My current proposal is to have a cache of 2M pages close to the page
> allocator and use a GFP flag to make allocation request use that cache. On
> the free() path, the pages that are mapped at PTE level will be put into
> that cache.

I would like to discuss not only having a cache layer for such pages but also
how direct map entries could be merged back correctly and efficiently.

I vaguely recall that Aaron Lu sent an RFC series about this and Kirill A.
Shutemov's feedback was to batch the merge operations. [1]

Also, a CPA API called by the cache layer could merge fragmented mappings
from 4K pages back to 2M [2], but that won't work for merging 2M mappings
into 1G mappings.

At that time I didn't follow the later discussions (e.g. execmem_alloc()),
so maybe I'm missing some points.

[1] https://lore.kernel.org/linux-mm/20220809100408.rm6ofiewtty6rvcl@box

[2] https://lore.kernel.org/linux-mm/YvfLxuflw2ctHFWF@kernel.org
 
> The cache is internally implemented as a buddy allocator so it can satisfy
> high order allocations, and there will be a shrinker to release free pages
> from that cache to the page allocator.
> 
> I hope to have a first prototype posted Really Soon.

Looking forward to that!
I wonder what shape it will take.

> 
> -- 
> Sincerely yours,
> Mike.




Thread overview: 8+ messages
2023-02-01 18:06 [LSF/MM/BPF TOPIC] reducing direct map fragmentation Mike Rapoport
2023-02-19  8:07 ` Hyeonggon Yoo [this message]
2023-02-19 18:09   ` Mike Rapoport
2023-02-20 14:43     ` Hyeonggon Yoo
2023-02-24 14:45       ` Mike Rapoport
2023-04-21  9:05 ` [Lsf-pc] " Michal Hocko
2023-04-21  9:47   ` Mike Rapoport
2023-04-21 12:41     ` Michal Hocko
