All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>,
	Alexander Graf <graf@amazon.com>,
	Changyuan Lyu <changyuanl@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Baoquan He <bhe@redhat.com>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] kho: initialize tail pages for higher order folios properly
Date: Tue, 10 Jun 2025 19:41:25 +0300	[thread overview]
Message-ID: <aEhgNU80Dr9iRwoD@kernel.org> (raw)
In-Reply-To: <CA+CK2bDXOWzrTsNtbCCK2rabNysf0JFQT5VfAjYrX5oSoiLmSQ@mail.gmail.com>

On Tue, Jun 10, 2025 at 07:20:23AM -0400, Pasha Tatashin wrote:
> On Tue, Jun 10, 2025 at 1:44 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Mon, Jun 09, 2025 at 04:07:50PM -0400, Pasha Tatashin wrote:
> > > On Mon, Jun 9, 2025 at 3:36 PM Mike Rapoport <rppt@kernel.org> wrote:
> > > >
> > > > Hi Pratyush,
> > > >
> > > > On Fri, Jun 06, 2025 at 06:23:06PM +0200, Pratyush Yadav wrote:
> > > > > Hi Mike,
> > > > >
> > > > > On Fri, Jun 06 2025, Mike Rapoport wrote:
> > > > >
> > > > > > On Thu, Jun 05, 2025 at 07:11:41PM +0200, Pratyush Yadav wrote:
> > > > > >> From: Pratyush Yadav <ptyadav@amazon.de>
> > > > > >>
> > > > > >> --- a/kernel/kexec_handover.c
> > > > > >> +++ b/kernel/kexec_handover.c
> > > > > >> @@ -157,11 +157,21 @@ static int __kho_preserve_order(struct kho_mem_track *track, unsigned long pfn,
> > > > > >>  }
> > > > > >>
> > > > > >>  /* almost as free_reserved_page(), just don't free the page */
> > > > > >> -static void kho_restore_page(struct page *page)
> > > > > >> +static void kho_restore_page(struct page *page, unsigned int order)
> > > > > >>  {
> > > > > >> -  ClearPageReserved(page);
> > > > > >
> > > > > > So now we don't clear PG_Reserved even on order-0 pages? ;-)
> > > > >
> > > > > We don't need to. As I mentioned in the commit message as well,
> > > > > PG_Reserved is never set for KHO pages since they are reserved with
> > > > > MEMBLOCK_RSRV_NOINIT, so memmap_init_reserved_pages() skips over them.
> > > >
> > > > You are right, I missed it.
> > > >
> > > > > That said, while reading through some of the code, I noticed another
> > > > > bug: because KHO reserves the preserved pages as NOINIT, with
> > > > > CONFIG_DEFERRED_STRUCT_PAGE_INIT == n, all the pages get initialized
> > > > > when memmap_init_range() is called from setup_arch (paging_init() on
> > > > > x86). This happens before kho_memory_init(), so the KHO-preserved pages
> > > > > are not marked as reserved to memblock yet.
> > > > >
> > > > > With deferred page init, some pages might not get initialized early, and
> > > > > get initialized after kho_memory_init(), by which time the KHO-preserved
> > > > > pages are marked as reserved. So, deferred_init_maxorder() will skip
> > > > > over those pages and leave them uninitialized.
> > > > >
> > > > > So we need to either also call init_deferred_page(), or remove the
> > > > > memblock_reserved_mark_noinit() call in deserialize_bitmap(). And TBH, I
> > > > > am not sure why KHO pages even need to be marked noinit in the first
> > > > > place. Probably the only benefit would be if a large chunk of memory is
> > > > > KHO-preserved, the pages can be initialized later on-demand, reducing
> > > > > bootup time a bit.
> > > >
> > > > One benefit is performance indeed, because in not deferred case the
> > > > initialization of reserved pages in memmap_init_reserved_pages() is really
> > > > excessive.
> > > >
> > > > But more importantly, if we remove memblock_reserved_mark_noinit(), with
> > > > CONFIG_DEFERRED_STRUCT_PAGE_INIT we'd loose page->private because the
> > > > struct page will be cleared after kho_mem_deserialize().
> > > >
> > > > > What do you think? Should we drop noinit or call init_deferred_page()?
> > > > > FWIW, my preference is to drop noinit, since init_deferred_page() is
> > > > > __meminit and we would have to make sure it doesn't go away after boot.
> > > >
> > > > We can't drop noinit and calling init_deferred_page() after boot just won't
> > > > work because it uses memblock to find the page's node and memblock is gone
> > > > after init.
> > > >
> > > > The simplest short-term solution is to disable KHO when
> > > > CONFIG_DEFERRED_STRUCT_PAGE_INIT is set and then find an efficient way to
> > > > make it all work together.
> > >
> > > This is what I've done in LUOv3 WIP:
> > > https://github.com/soleen/linux/commit/3059f38ac0a39a397873759fb429bd5d1f8ea681
> >
> > I think it should be the other way around, KHO should depend on
> > !DEFERRED_STRUCT_PAGE_INIT.
> 
> Agreed, and this is what I first tried, but that does not work, there
> is some circular dependency breaking the build. If you feel
> adventurous you can try that :-)

Hmm, weird, worked for me :/
 
> > > We will need to teah KHO to work with deferred struct page init. I
> > > suspect, we could init preserved struct pages and then skip over them
> > > during deferred init.
> >
> > We could, but with that would mean we'll run this before SMP and it's not
> > desirable. Also, init_deferred_page() for a random page requires
> 
> We already run KHO init before smp_init:
> start_kernel() -> mm_core_init() -> kho_memory_init() ->
> kho_restore_folio() -> struct pages must be already initialized here!
> 
> While deferred struct pages are initialized:
> start_kernel() -> rest_init() -> kernel_init() ->
> kernel_init_freeable() -> page_alloc_init_late() ->
> deferred_init_memmap()
> 
> If the number of preserved pages that is needed during early boot is
> relatively small, that it should not be an issue to pre-initialize
> struct pages for them before deferred struct pages are initialized. We
> already pre-initialize some  "struct pages" that are needed during
> early boot before the reset are initialized, see deferred_grow_zone()

deferred_grow_zone() takes a chunk in the beginning of uninitialized range,
with kho we are talking about some random pages. If we preinit them early,
deferred_init_memmap() will overwrite them.

Anyway, I'm going to look into it, hopefully I'll have something Really
Soon (tm).
 
> Pasha

-- 
Sincerely yours,
Mike.


  reply	other threads:[~2025-06-10 20:09 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-05 17:11 [PATCH] kho: initialize tail pages for higher order folios properly Pratyush Yadav
2025-06-05 20:13 ` Andrew Morton
2025-06-06  8:04 ` Mike Rapoport
2025-06-06 16:23   ` Pratyush Yadav
2025-06-09 19:36     ` Mike Rapoport
2025-06-09 20:07       ` Pasha Tatashin
2025-06-10  5:44         ` Mike Rapoport
2025-06-10 11:20           ` Pasha Tatashin
2025-06-10 16:41             ` Mike Rapoport [this message]
2025-06-10 22:33               ` Pasha Tatashin
2025-06-11 13:06                 ` Pratyush Yadav
2025-06-11 13:14                   ` Pasha Tatashin
2025-06-11 13:35                     ` Mike Rapoport
2025-06-11 14:01                       ` Pratyush Yadav
2025-06-11 14:36                         ` Mike Rapoport
2025-06-13 14:22                           ` Pratyush Yadav
2025-06-13 16:21                             ` Mike Rapoport
2025-06-11 13:38                     ` Pratyush Yadav

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aEhgNU80Dr9iRwoD@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=changyuanl@google.com \
    --cc=graf@amazon.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=pratyush@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.