From: Pratyush Yadav <pratyush@kernel.org>
To: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Alexander Graf <graf@amazon.com>, Baoquan He <bhe@redhat.com>,
Changyuan Lyu <changyuanl@google.com>,
Chris Li <chrisl@kernel.org>, Jason Gunthorpe <jgg@nvidia.com>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Pratyush Yadav <pratyush@kernel.org>,
kexec@lists.infradead.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 2/4] kho: replace kho_preserve_phys() with kho_preserve_pages()
Date: Thu, 18 Sep 2025 12:32:08 +0200 [thread overview]
Message-ID: <mafs05xdggifr.fsf@kernel.org> (raw)
In-Reply-To: <20250917174033.3810435-3-rppt@kernel.org>
Hi Mike,
On Wed, Sep 17 2025, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>
> to make it clear that KHO operates on pages rather than on a random
> physical address.
>
> The kho_preserve_pages() will be also used in upcoming support for
> vmalloc preservation.
>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
> include/linux/kexec_handover.h | 5 +++--
> kernel/kexec_handover.c | 25 +++++++++++--------------
> mm/memblock.c | 4 +++-
> 3 files changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h
> index 348844cffb13..cc5c49b0612b 100644
> --- a/include/linux/kexec_handover.h
> +++ b/include/linux/kexec_handover.h
> @@ -18,6 +18,7 @@ enum kho_event {
>
> struct folio;
> struct notifier_block;
> +struct page;
>
> #define DECLARE_KHOSER_PTR(name, type) \
> union { \
> @@ -42,7 +43,7 @@ struct kho_serialization;
> bool kho_is_enabled(void);
>
> int kho_preserve_folio(struct folio *folio);
> -int kho_preserve_phys(phys_addr_t phys, size_t size);
> +int kho_preserve_pages(struct page *page, unsigned int nr_pages);
> struct folio *kho_restore_folio(phys_addr_t phys);
> int kho_add_subtree(struct kho_serialization *ser, const char *name, void *fdt);
> int kho_retrieve_subtree(const char *name, phys_addr_t *phys);
> @@ -65,7 +66,7 @@ static inline int kho_preserve_folio(struct folio *folio)
> return -EOPNOTSUPP;
> }
>
> -static inline int kho_preserve_phys(phys_addr_t phys, size_t size)
> +static inline int kho_preserve_pages(struct page *page, unsigned int nr_pages)
> {
> return -EOPNOTSUPP;
> }
> diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
> index f421acc58c1f..3ad59c5f9eaa 100644
> --- a/kernel/kexec_handover.c
> +++ b/kernel/kexec_handover.c
> @@ -698,26 +698,23 @@ int kho_preserve_folio(struct folio *folio)
> EXPORT_SYMBOL_GPL(kho_preserve_folio);
>
> /**
> - * kho_preserve_phys - preserve a physically contiguous range across kexec.
> - * @phys: physical address of the range.
> - * @size: size of the range.
> + * kho_preserve_pages - preserve contiguous pages across kexec
> + * @page: first page in the list.
> + * @nr_pages: number of pages.
> *
> - * Instructs KHO to preserve the memory range from @phys to @phys + @size
> - * across kexec.
> + * Preserve a contiguous list of order 0 pages. Must be restored using
> + * kho_restore_page() on each order 0 page.
This is not true. The pages are preserved with the maximum order
possible.
while (pfn < end_pfn) {
const unsigned int order =
min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));
err = __kho_preserve_order(track, pfn, order);
[...]
So four 0-order pages will be preserved as one 2-order page. Restoring
them as four 0-order pages is wrong. And my proposed patch for checking
the magic [0] will uncover this exact bug.
I think you should either change the logic to always preserve at order
0, or maybe add a kho_restore_pages() that replicates the same order
calculation.
[0] https://lore.kernel.org/lkml/20250917125725.665-2-pratyush@kernel.org/
> *
> * Return: 0 on success, error code on failure
> */
> -int kho_preserve_phys(phys_addr_t phys, size_t size)
> +int kho_preserve_pages(struct page *page, unsigned int nr_pages)
> {
> - unsigned long pfn = PHYS_PFN(phys);
> + struct kho_mem_track *track = &kho_out.ser.track;
> + const unsigned long start_pfn = page_to_pfn(page);
> + const unsigned long end_pfn = start_pfn + nr_pages;
> + unsigned long pfn = start_pfn;
> unsigned long failed_pfn = 0;
> - const unsigned long start_pfn = pfn;
> - const unsigned long end_pfn = PHYS_PFN(phys + size);
> int err = 0;
> - struct kho_mem_track *track = &kho_out.ser.track;
> -
> - if (!PAGE_ALIGNED(phys) || !PAGE_ALIGNED(size))
> - return -EINVAL;
>
> while (pfn < end_pfn) {
> const unsigned int order =
> @@ -737,7 +734,7 @@ int kho_preserve_phys(phys_addr_t phys, size_t size)
>
> return err;
> }
> -EXPORT_SYMBOL_GPL(kho_preserve_phys);
> +EXPORT_SYMBOL_GPL(kho_preserve_pages);
>
> /* Handling for debug/kho/out */
>
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 117d963e677c..6ec3eaa4e8d1 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -2516,8 +2516,10 @@ static int reserve_mem_kho_finalize(struct kho_serialization *ser)
>
> for (i = 0; i < reserved_mem_count; i++) {
> struct reserve_mem_table *map = &reserved_mem_table[i];
> + struct page *page = phys_to_page(map->start);
> + unsigned int nr_pages = map->size >> PAGE_SHIFT;
>
> - err |= kho_preserve_phys(map->start, map->size);
> + err |= kho_preserve_pages(page, nr_pages);
Unrelated to this patch, but since there is no
kho_restore_{phys,pages}(), won't the reserve_mem memory end up with
uninitialized struct pages, since preserved pages are
memblock_reserved_mark_noinit()?
That would also be a case for kho_restore_pages() I suppose?
--
Regards,
Pratyush Yadav
next prev parent reply other threads:[~2025-09-18 10:32 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-17 17:40 [PATCH v4 0/4] kho: add support for preserving vmalloc allocations Mike Rapoport
2025-09-17 17:40 ` [PATCH v4 1/4] kho: check if kho is finalized in __kho_preserve_order() Mike Rapoport
2025-09-18 10:12 ` Pratyush Yadav
2025-09-17 17:40 ` [PATCH v4 2/4] kho: replace kho_preserve_phys() with kho_preserve_pages() Mike Rapoport
2025-09-18 10:32 ` Pratyush Yadav [this message]
2025-09-18 11:04 ` Mike Rapoport
2025-09-17 17:40 ` [PATCH v4 3/4] kho: add support for preserving vmalloc allocations Mike Rapoport
2025-09-17 21:15 ` Andrew Morton
2025-09-17 21:21 ` Jason Gunthorpe
2025-09-18 10:33 ` Pratyush Yadav
2025-09-17 17:40 ` [PATCH v4 4/4] lib/test_kho: use kho_preserve_vmalloc instead of storing addresses in fdt Mike Rapoport
2025-09-18 10:36 ` Pratyush Yadav
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mafs05xdggifr.fsf@kernel.org \
--to=pratyush@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=bhe@redhat.com \
--cc=changyuanl@google.com \
--cc=chrisl@kernel.org \
--cc=graf@amazon.com \
--cc=jgg@nvidia.com \
--cc=kexec@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=pasha.tatashin@soleen.com \
--cc=rppt@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.