From: Mike Rapoport <rppt@kernel.org>
To: Jason Miu <jasonmiu@google.com>
Cc: Alexander Graf <graf@amazon.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    Baoquan He <bhe@redhat.com>,
    Changyuan Lyu <changyuanl@google.com>,
    David Matlack <dmatlack@google.com>,
    David Rientjes <rientjes@google.com>,
    Jason Gunthorpe <jgg@nvidia.com>,
    Pasha Tatashin <pasha.tatashin@soleen.com>,
    Pratyush Yadav <pratyush@kernel.org>,
    kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [PATCH v7 1/2] kho: Adopt radix tree for preserved memory tracking
Date: Tue, 20 Jan 2026 19:57:37 +0200
Message-ID: <aW_CEV-Qqrj2dvEb@kernel.org>
In-Reply-To: <20260116034432.1520731-2-jasonmiu@google.com>

Hi Jason,

On Thu, Jan 15, 2026 at 07:44:31PM -0800, Jason Miu wrote:
> Introduce a radix tree implementation for tracking preserved memory
> pages and switch the KHO memory tracking mechanism to use it. This
> lays the groundwork for a stateless KHO implementation that eliminates
> the need for serialization and the associated "finalize" state.
>
> This patch introduces the core radix tree data structures and
> constants to the KHO ABI. It adds the radix tree node and leaf
> structures, along with documentation for the radix tree key encoding
> scheme that combines a page's physical address and order.
>
> To support broader use by other kernel subsystems, such as hugetlb
> preservation, the core radix tree manipulation functions are exported
> as a public API.
>
> The xarray-based memory tracking is replaced with this new radix tree
> implementation. The core KHO preservation and unpreservation functions
> are wired up to use the radix tree helpers. On boot, the second kernel
> restores the preserved memory map by walking the radix tree whose root
> physical address is passed via the FDT.
>
> The ABI `compatible` version is bumped to "kho-v2" to reflect the
> structural changes in the preserved memory map and sub-FDT property
> names. This includes renaming "fdt" to "preserved-data" to better
> reflect that preserved state may use formats other than FDT.
>
> Signed-off-by: Jason Miu <jasonmiu@google.com>
...
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 49bf2cecab12..06adaf56cd69 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -5,6 +5,7 @@
> * Copyright (C) 2025 Microsoft Corporation, Mike Rapoport <rppt@kernel.org>
> * Copyright (C) 2025 Google LLC, Changyuan Lyu <changyuanl@google.com>
> * Copyright (C) 2025 Pasha Tatashin <pasha.tatashin@soleen.com>
> + * Copyright (C) 2025 Google LLC, Jason Miu <jasonmiu@google.com>

It's already 2026 ;-)

> */
>
> #define pr_fmt(fmt) "KHO: " fmt
...
> +int kho_radix_add_page(struct kho_radix_tree *tree,
> + unsigned long pfn, unsigned int order)
> +{
> + /* Newly allocated nodes for error cleanup */
> + struct kho_radix_node *intermediate_nodes[KHO_TREE_MAX_DEPTH] = { 0 };
> + unsigned long key = kho_radix_encode_key(PFN_PHYS(pfn), order);
> + struct kho_radix_node *new_node, *anchor_node;
> + struct kho_radix_node *node = tree->root;
> + unsigned int i, idx, anchor_idx;
> + struct kho_radix_leaf *leaf;
> + int err = 0;
> +
> + if (WARN_ON_ONCE(!tree->root))
> + return -EINVAL;
> +
> + might_sleep();
> +
> + guard(mutex)(&tree->lock);
> +
> + /* Go from high levels to low levels */
> + for (i = KHO_TREE_MAX_DEPTH - 1; i > 0; i--) {
> + idx = kho_radix_get_table_index(key, i);
> +
> + if (node->table[idx]) {
> + node = phys_to_virt(node->table[idx]);
> + continue;
> + }
> +
> + /* Next node is empty, create a new node for it */
> + new_node = (struct kho_radix_node *)get_zeroed_page(GFP_KERNEL);
> + if (!new_node) {
> + err = -ENOMEM;
> + goto err_free_nodes;
> + }
> +
> + node->table[idx] = virt_to_phys(new_node);
> +
> + /*
> + * Capture the node where the new branch starts for cleanup
> + * if allocation fails.
> + */
> + if (!anchor_node) {

anchor_node is declared without an initializer, so this check reads an
indeterminate value. I think it should be initialized to NULL for this
to work.

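To illustrate, here is a minimal userspace sketch of the same
walk-and-rollback pattern with the pointer initialized. This is not the
kernel code: struct node, DEPTH, FANOUT, node_alloc() and the fail_at
hook are made up for the example, and plain pointers stand in for the
physical addresses stored in the real table.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEPTH  3	/* hypothetical: levels 2..1 are tables, level 0 is the leaf */
#define FANOUT 4	/* hypothetical table size */

struct node {
	struct node *table[FANOUT];
};

/* Hypothetical test hook: fail the Nth allocation, 0 disables it. */
static int fail_at;
static int alloc_count;

static struct node *node_alloc(void)
{
	if (fail_at && ++alloc_count == fail_at)
		return NULL;
	return calloc(1, sizeof(struct node));
}

/* Walk from the root towards the leaf, creating missing nodes along path[]. */
static int tree_add(struct node *root, const unsigned int path[DEPTH])
{
	struct node *new_nodes[DEPTH] = { 0 };
	struct node *anchor_node = NULL;	/* must start out as NULL */
	struct node *node = root, *new_node;
	unsigned int i, idx, anchor_idx = 0;

	for (i = DEPTH - 1; i > 0; i--) {
		idx = path[i];

		if (node->table[idx]) {
			node = node->table[idx];
			continue;
		}

		new_node = node_alloc();
		if (!new_node)
			goto err_free_nodes;

		node->table[idx] = new_node;

		/* Remember where the new branch hangs off the existing tree. */
		if (!anchor_node) {
			anchor_node = node;
			anchor_idx = idx;
		}
		new_nodes[i] = new_node;
		node = new_node;
	}

	return 0;

err_free_nodes:
	/* Free only what this call allocated, then unlink the branch. */
	for (i = DEPTH - 1; i > 0; i--)
		free(new_nodes[i]);
	if (anchor_node)
		anchor_node->table[anchor_idx] = NULL;

	return -ENOMEM;
}
```

When the very first allocation fails, anchor_node is still NULL and
nothing is unlinked, which is exactly the case that depends on the
initializer.
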
> + anchor_node = node;
> + anchor_idx = idx;
> + }
> + intermediate_nodes[i] = new_node;
> +
> + node = new_node;
> + }
> +
> + /* Handle the leaf level bitmap (level 0) */
> + idx = kho_radix_get_bitmap_index(key);
> + leaf = (struct kho_radix_leaf *)node;
> + __set_bit(idx, leaf->bitmap);
> +
> + return 0;
> +
> +err_free_nodes:
> + for (i = KHO_TREE_MAX_DEPTH - 1; i > 0; i--) {
> + if (intermediate_nodes[i])
> + free_page((unsigned long)intermediate_nodes[i]);
> + }
> + if (anchor_node)
> + anchor_node->table[anchor_idx] = 0;
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(kho_radix_add_page);
...
> + if (WARN_ON(!node->table[idx]))
> + return;
> +
> + node = phys_to_virt((phys_addr_t)node->table[idx]);

No need for the cast, kho_radix_add_page() above already calls
phys_to_virt(node->table[idx]) without one.

> + shift = ((level - 1) * KHO_TABLE_SIZE_LOG2) +
> + KHO_BITMAP_SIZE_LOG2;
> + key = start | (i << shift);
> +
> + node = phys_to_virt((phys_addr_t)root->table[i]);

Ditto.

> @@ -1466,12 +1489,6 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
> goto out;
> }
>
> - mem_map_phys = kho_get_mem_map_phys(fdt);
> - if (!mem_map_phys) {
> - err = -ENOENT;
> - goto out;
> - }

I think we should keep the logic that skips scratch initialization if there
were no memory preservations, like Pasha implemented here:

https://lkml.kernel.org/r/20251223140140.2090337-1-pasha.tatashin@soleen.com

(commit e1c3bfd091f3 ("kho: validate preserved memory map during
population") in today's mm tree)

We should just update the validation to work with the radix tree.

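One possible shape for that check, as a rough userspace sketch only: it
assumes that "no preservations" shows up as a root node whose table is
all zeroes, which I have not verified against the final layout. struct
radix_node, TABLE_ENTRIES and radix_root_has_entries() are hypothetical
names, and in kho_populate() the root page would have to be mapped
first, the same way scratch is mapped below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_ENTRIES 512	/* hypothetical: one 4 KiB page of 64-bit entries */

struct radix_node {
	uint64_t table[TABLE_ENTRIES];	/* physical addresses of child nodes */
};

/*
 * Return true if the root references at least one child, i.e. something
 * was preserved. A NULL root is treated as "nothing preserved".
 */
static bool radix_root_has_entries(const struct radix_node *root)
{
	size_t i;

	if (!root)
		return false;

	for (i = 0; i < TABLE_ENTRIES; i++) {
		if (root->table[i])
			return true;
	}

	return false;
}
```

The caller would bail out before touching scratch when this returns
false, like the mem_map_phys check removed above did.
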
> scratch = early_memremap(scratch_phys, scratch_len);
> if (!scratch) {
> pr_warn("setup: failed to memremap scratch (phys=0x%llx, len=%lld)\n",
> @@ -1512,7 +1529,6 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
>
> kho_in.fdt_phys = fdt_phys;
> kho_in.scratch_phys = scratch_phys;
> - kho_in.mem_map_phys = mem_map_phys;
> kho_scratch_cnt = scratch_cnt;
> pr_info("found kexec handover data.\n");
>
> --
> 2.52.0.457.g6b5491de43-goog
>
--
Sincerely yours,
Mike.