All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pratyush Yadav <pratyush@kernel.org>
To: Breno Leitao <leitao@debian.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>,
	 akpm@linux-foundation.org, rppt@kernel.org,  graf@amazon.com,
	 linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	 linux-mm@kvack.org,  pratyush@kernel.org,
	ricardo.neri-calderon@linux.intel.com,  kernel-team@meta.com
Subject: Re: [PATCH v4] kho: validate preserved memory map during population
Date: Fri, 16 Jan 2026 16:21:28 +0000	[thread overview]
Message-ID: <2vxzo6mt7cl3.fsf@kernel.org> (raw)
In-Reply-To: <jixxiqeajngwox3nh2u3pwf2rivhmqxxmpyttqxu32pixshyde@4jdn5jvg5hvk> (Breno Leitao's message of "Thu, 8 Jan 2026 10:03:14 -0800")

Hi Breno,

On Thu, Jan 08 2026, Breno Leitao wrote:

> Hello Pasha,
>
> On Tue, Dec 23, 2025 at 09:01:40AM -0500, Pasha Tatashin wrote:
>> If the previous kernel enabled KHO but did not call kho_finalize()
>> (e.g., CONFIG_LIVEUPDATE=n or userspace skipped the finalization step),
>> the 'preserved-memory-map' property in the FDT remains empty/zero.
>> 
>> Previously, kho_populate() would succeed regardless of the memory map's
>> state, reserving the incoming scratch regions in memblock. However,
>> kho_memory_init() would later fail to deserialize the empty map. By that
>> time, the scratch regions were already registered, leading to partial
>> initialization and subsequent list corruption (freeing scratch area
>> twice) during kho_init().
>
> While trying my new patchset [0] on top of this patch, I got the
> following issue:
>
> 	[    0.000000] KHO: disabling KHO revival: -2
>
> Trying to solve it, I come up with a change in kho_get_mem_map_phys() to
> distinguish no memory and error, see the patch attached later.
>
> This is what I used to test [0] on top of linux-next. Is this useful?
>
> Link: https://lore.kernel.org/all/20260108-kho-v3-1-b1d6b7a89342@debian.org/ [0]
>
> thanks
> --breno
>
> commit 5d7855fede8110d74942e1b67056ba589a1cb54a
> Author: Breno Leitao <leitao@debian.org>
> Date:   Thu Jan 8 07:44:08 2026 -0800
>
>     kho: allow KHO to work when no memory is preserved
>     
>     Fix KHO initialization failing when no memory pages were preserved by
>     the previous kernel.
>     
>     Commit eda79a683a0a ("kho: validate preserved memory map during
>     population") introduced kho_get_mem_map_phys() which returns the physical
>     address of the preserved memory map directly as its return value. The
>     caller then validates it with:
>     
>         mem_map_phys = kho_get_mem_map_phys(fdt);
>         if (!mem_map_phys) {
>             err = -ENOENT;
>             goto out;
>         }
>     
>     This creates an ambiguity: physical address 0 is used both as an error
>     indicator (property missing/malformed) and as a valid value (property
>     exists with value 0, meaning no memory was preserved).
>     
>     "No memory preserved" is a legitimate state. KHO provides features beyond
>     memory page preservation, such as previous kernel version tracking and
>     kexec count tracking. When the previous kernel enables KHO but doesn't
>     preserve any memory pages, it sets 'preserved-memory-map' to 0. This is
>     semantically different from "KHO not initialized" - it means "KHO is
>     active, there's just nothing in the memory map."

This isn't true. If you hand over _any_ state, you will at least need
the KHO FDT. And the KHO FDT is preserved memory (see the
kho_alloc_preserve() call in kho_init()). So I don't see how you can
ever have valid KHO with no memory.

mem_map_phys _can_ be 0, but only when KHO was enabled but not used. And
that is of course also a valid use case.

We want to treat mem_map_phys == 0 the same as the error path, just
without the error print. This lets us discard all previous scratch areas
since they don't have anything useful anyway, and have a fresh start.

So while you are seeing this error message, I don't think it should
break anything and KHO should still be working fine. You can
double-check this by inspecting /sys/kernel/debug/kho/out.

So I think the patch is certainly a useful fix, it just needs some
re-wording and fixups.

Some comments on the code below.

>     
>     Before eda79a683a0a, the code handled this gracefully in
>     kho_mem_deserialize():
>     
>         chunk = mem ? phys_to_virt(mem) : NULL;
>         if (!chunk)
>             return false;  // No pages, but KHO could still work
>     
>     After eda79a683a0a, the early validation conflated "no property" with
>     "property value is 0", causing KHO to be completely disabled in both
>     cases.
>     
>     Fix this by changing kho_get_mem_map_phys() to return an error code and
>     pass the physical address via pointer. This allows distinguishing between:
>      - Property missing/malformed: return -ENOENT (KHO fails)
>      - Property exists with value 0: return 0 (KHO succeeds, no memory to
>        restore)
>     
>     Fixes: eda79a683a0a ("kho: validate preserved memory map during population")
>     Signed-off-by: Breno Leitao <leitao@debian.org>
>
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 271d90198a08..3cf2dc6840c9 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -471,8 +471,8 @@ static void __init deserialize_bitmap(unsigned int order,
>  	}
>  }
>  
> -/* Returns physical address of the preserved memory map from FDT */
> -static phys_addr_t __init kho_get_mem_map_phys(const void *fdt)
> +/* Returns 0 on success and stores physical address in *phys_out */
> +static int __init kho_get_mem_map_phys(const void *fdt, phys_addr_t *phys_out)
>  {
>  	const void *mem_ptr;
>  	int len;
> @@ -480,10 +480,11 @@ static phys_addr_t __init kho_get_mem_map_phys(const void *fdt)
>  	mem_ptr = fdt_getprop(fdt, 0, KHO_FDT_MEMORY_MAP_PROP_NAME, &len);
>  	if (!mem_ptr || len != sizeof(u64)) {
>  		pr_err("failed to get preserved memory bitmaps\n");
> -		return 0;
> +		return -ENOENT;
>  	}
>  
> -	return get_unaligned((const u64 *)mem_ptr);
> +	*phys_out = get_unaligned((const u64 *)mem_ptr);
> +	return 0;
>  }
>  
>  static void __init kho_mem_deserialize(struct khoser_mem_chunk *chunk)
> @@ -1439,7 +1440,7 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
>  			 phys_addr_t scratch_phys, u64 scratch_len)
>  {
>  	struct kho_scratch *scratch = NULL;
> -	phys_addr_t mem_map_phys;
> +	phys_addr_t mem_map_phys = 0;
>  	void *fdt = NULL;
>  	int err = 0;
>  	unsigned int scratch_cnt = scratch_len / sizeof(*kho_scratch);
> @@ -1466,11 +1467,9 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
>  		goto out;
>  	}
>  
> -	mem_map_phys = kho_get_mem_map_phys(fdt);
> -	if (!mem_map_phys) {
> -		err = -ENOENT;
> +	err = kho_get_mem_map_phys(fdt, &mem_map_phys);
> +	if (err)

This will break when mem_map_phys == 0. As I explained earlier, when
that happens we want to discard all previous scratch info and start with
a clean slate.

Making this if (err || !mem_map_phys) should do the trick. The if (err)
check before the print should make sure the error message is not printed
when we have a valid property but its value is 0.

>  		goto out;
> -	}
>  
>  	scratch = early_memremap(scratch_phys, scratch_len);
>  	if (!scratch) {

-- 
Regards,
Pratyush Yadav


  reply	other threads:[~2026-01-16 16:21 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-23 14:01 [PATCH v4] kho: validate preserved memory map during population Pasha Tatashin
2025-12-23 14:11 ` Pratyush Yadav
2026-01-08 18:03 ` Breno Leitao
2026-01-16 16:21   ` Pratyush Yadav [this message]
2026-01-18  6:11     ` Mike Rapoport
2026-01-20 15:11       ` Pratyush Yadav
2026-01-19 18:54     ` Breno Leitao
2026-01-20 15:10       ` Pratyush Yadav
2026-01-19  7:04 ` Zhu Yanjun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2vxzo6mt7cl3.fsf@kernel.org \
    --to=pratyush@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=graf@amazon.com \
    --cc=kernel-team@meta.com \
    --cc=kexec@lists.infradead.org \
    --cc=leitao@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=ricardo.neri-calderon@linux.intel.com \
    --cc=rppt@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.