From: Pratyush Yadav <ptyadav@amazon.de>
To: Mike Rapoport <rppt@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
	Changyuan Lyu <changyuanl@google.com>,
	<linux-kernel@vger.kernel.org>, <graf@amazon.com>,
	<akpm@linux-foundation.org>, <luto@kernel.org>,
	<anthony.yznaga@oracle.com>, <arnd@arndb.de>,
	<ashish.kalra@amd.com>, <benh@kernel.crashing.org>,
	<bp@alien8.de>, <catalin.marinas@arm.com>,
	<dave.hansen@linux.intel.com>, <dwmw2@infradead.org>,
	<ebiederm@xmission.com>, <mingo@redhat.com>, <jgowans@amazon.com>,
	<corbet@lwn.net>, <krzk@kernel.org>, <mark.rutland@arm.com>,
	<pbonzini@redhat.com>, <pasha.tatashin@soleen.com>,
	<hpa@zytor.com>, <peterz@infradead.org>, <robh+dt@kernel.org>,
	<robh@kernel.org>, <saravanak@google.com>,
	<skinsburskii@linux.microsoft.com>, <rostedt@goodmis.org>,
	<tglx@linutronix.de>, <thomas.lendacky@amd.com>,
	<usama.arif@bytedance.com>, <will@kernel.org>,
	<devicetree@vger.kernel.org>, <kexec@lists.infradead.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-doc@vger.kernel.org>, <linux-mm@kvack.org>,
	<x86@kernel.org>
Subject: Re: [PATCH v5 09/16] kexec: enable KHO support for memory preservation
Date: Fri, 4 Apr 2025 16:15:28 +0000
Message-ID: <mafs0cydrq4wv.fsf@amazon.de>
In-Reply-To: <Z-_kSXrHWU5Bf3sV@kernel.org>

Hi Mike,

On Fri, Apr 04 2025, Mike Rapoport wrote:

[...]
> As for the optimizations of memblock reserve path, currently it is what
> hurts the most in my and Pratyush's experiments. They are not very
> representative, but still, preserving lots of pages/folios spread all over
> would have its toll on the mm initialization. And I don't think invasive
> changes to how buddy and memory map initialization work are the best way
> to move forward and optimize that. Quite possibly we'd want to be able to
> minimize the number of *ranges* that we preserve.
>
> So from the three alternatives we have now (xarrays + bitmaps, tables +
> bitmaps and maple tree for ranges) maple tree seems to be the simplest and
> efficient enough to start with.

But you'd need to somehow serialize the maple tree ranges into some
format. So you would either end up going back to the kho_mem ranges we
had, or have to invent something more complex. The sample code you wrote
is pretty much going back to having kho_mem ranges.
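
To make the serialization point concrete, here is a minimal sketch of
what flattening a sorted set of ranges could look like. struct
range_entry and serialize_ranges() are names made up for illustration;
this is not the kho_mem layout from the earlier revisions. The input is
assumed to be sorted and non-overlapping, as an in-order tree walk would
produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical serialized form: one entry per preserved physical range. */
struct range_entry {
	uint64_t start;	/* first byte of the range */
	uint64_t size;	/* length in bytes */
};

/*
 * Flatten sorted, non-overlapping [start, end) pairs into out[], merging
 * ranges that touch. Returns the number of entries written, or -1 if
 * out[] is too small.
 */
static int serialize_ranges(const uint64_t in[][2], size_t n,
			    struct range_entry *out, size_t cap)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t start = in[i][0], end = in[i][1];

		assert(start < end);

		/* Extend the previous entry when this range starts where it ends. */
		if (used && out[used - 1].start + out[used - 1].size == start) {
			out[used - 1].size += end - start;
			continue;
		}
		if (used == cap)
			return -1;
		out[used].start = start;
		out[used].size = end - start;
		used++;
	}
	return (int)used;
}
```

Whatever the in-memory structure is, the handover format ends up being
an array of (start, size) pairs like this one.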

And if you say that we should minimize the number of ranges, the table +
bitmaps is still a fairly good data structure. You can very well have a
higher order table where your entire range is a handful of bits. This
lets you track a small number of ranges fairly efficiently -- both in
terms of memory and in terms of CPU. I think the only place where it
doesn't work as well as a maple tree is if you want to merge or split a
lot of ranges quickly. But if you say that you only want to have a handful
of ranges, does that really matter?
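
As a rough illustration of the "handful of bits" argument, the sketch
below keeps one bitmap per order and records a range using the largest
naturally aligned blocks that fit. It is not the code from the patches:
the pt_* names, PT_MAX_ORDER and PT_MAX_PFN are assumptions, and flat
arrays stand in for the sparse table levels a real tracker would use.

```c
#include <assert.h>
#include <stdint.h>

#define PT_MAX_ORDER 18U		/* hypothetical: 1 GiB blocks of 4 KiB pages */
#define PT_MAX_PFN   (1UL << 20)	/* hypothetical: 4 GiB of address space */

/*
 * One bitmap per order. Bit i of bitmap o covers pfns [i << o, (i + 1) << o).
 * Every row is sized for order 0 to keep the sketch short; higher orders
 * only need PT_MAX_PFN >> o bits.
 */
static uint8_t pt_bits[PT_MAX_ORDER + 1][PT_MAX_PFN / 8];

static void pt_set(unsigned int order, unsigned long pfn)
{
	unsigned long idx = pfn >> order;

	pt_bits[order][idx / 8] |= (uint8_t)(1U << (idx % 8));
}

static int pt_test(unsigned int order, unsigned long pfn)
{
	unsigned long idx = pfn >> order;

	return (pt_bits[order][idx / 8] >> (idx % 8)) & 1;
}

/*
 * Record pfns [start, end). Returns the number of bits set, i.e. the
 * tracking cost of the range.
 */
static unsigned int pt_preserve_range(unsigned long start, unsigned long end)
{
	unsigned int nbits = 0;

	assert(start <= end && end <= PT_MAX_PFN);

	while (start < end) {
		unsigned int order = PT_MAX_ORDER;

		/* Shrink until the block is aligned at start and fits before end. */
		while (order > 0 &&
		       ((start & ((1UL << order) - 1)) ||
			(1UL << order) > end - start))
			order--;

		pt_set(order, start);
		start += 1UL << order;
		nbits++;
	}
	return nbits;
}
```

With these assumptions an aligned 1 GiB range (pfns 0 to 1 << 18) costs
a single bit at order 18, while a sparse single-page preservation is
just as cheap: one bit at order 0.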

Also, I think the allocation pattern depends on which use case you have
in mind. For hypervisor live update, you might very well only have a
handful of ranges. The use case I have in mind is for taking a userspace
process, quickly checkpointing it by dumping its memory contents to a
memfd, and restoring it after KHO. For that, the ability to do random
sparse allocations quickly helps a lot.

So IMO the table works well for both sparse and dense allocations. So
why have a data structure that only solves one problem when we can have
one that solves both? And honestly, I don't think the table is that much
more complex either -- both in terms of understanding the idea and in
terms of code -- the whole thing is like 200 lines.

Also, I think changes to buddy initialization _are_ the way to optimize
boot times. Having maple tree ranges and moving them around into
memblock ranges does not really scale very well for anything other than
a handful of ranges, and we shouldn't limit ourselves to that without
good reason.

>
> Preserving folio orders with it is really straightforward and until we see
> some real data of how the entire KHO machinery is used, I'd prefer simple
> over anything else.

-- 
Regards,
Pratyush Yadav

