From: Mike Rapoport <rppt@kernel.org>
To: Breno Leitao <leitao@debian.org>
Cc: Alexander Graf <graf@amazon.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Pratyush Yadav <pratyush@kernel.org>,
	linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	linux-mm@kvack.org, usamaarif642@gmail.com, rmikey@meta.com,
	clm@fb.com, riel@surriel.com, kernel-team@meta.com,
	SeongJae Park <sj@kernel.org>
Subject: Re: [PATCH v4] kho: kexec-metadata: track previous kernel chain
Date: Thu, 22 Jan 2026 12:57:50 +0200
Message-ID: <aXICriRKbYB5f3li@kernel.org>
In-Reply-To: <20260121-kho-v4-1-5c8fe77b6804@debian.org>

Hi Breno,

On Wed, Jan 21, 2026 at 06:50:38AM -0800, Breno Leitao wrote:
> Use Kexec Handover (KHO) to pass the previous kernel's version string
> and the number of kexec reboots since the last cold boot to the next
> kernel, and print it at boot time.
> 
> Example output:
>     [    0.000000] KHO: exec from: 6.19.0-rc4-next-20260107 (count 1)
> 
> Motivation
> ==========
> 
> Bugs that only reproduce when kexecing from specific kernel versions
> are difficult to diagnose. These issues occur when a buggy kernel
> kexecs into a new kernel, with the bug manifesting only in the second
> kernel.
> 
> Recent examples include the following commits:
> 
>  * eb2266312507 ("x86/boot: Fix page table access in 5-level to 4-level paging transition")
>  * 77d48d39e991 ("efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption")
>  * 64b45dd46e15 ("x86/efi: skip memattr table on kexec boot")
> 
> As kexec-based reboots become more common, these version-dependent bugs
> are appearing more frequently. At scale, correlating crashes to the
> previous kernel version is challenging, especially when issues only
> occur in specific transition scenarios.
> 
> Implementation
> ==============
> 
> The kexec metadata is stored as a plain C struct (struct kho_kexec_metadata)
> rather than FDT format, for simplicity and direct field access. It is
> registered via kho_add_subtree() as a separate subtree, keeping it
> independent from the core KHO ABI. This design choice:
> 
>  - Keeps the core KHO ABI minimal and stable
>  - Allows the metadata format to evolve independently
>  - Avoids requiring version bumps for all KHO consumers (LUO, etc.)
>    when the metadata format changes
> 
> The struct kho_kexec_metadata contains two fields:
>  - previous_release: The kernel version that initiated the kexec
>  - kexec_count: Number of kexec boots since last cold boot
> 
> On cold boot, kexec_count starts at 0 and increments with each kexec.
> The count helps identify issues that only manifest after multiple
> consecutive kexec reboots.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Acked-by: SeongJae Park <sj@kernel.org>
> ---
> Changes in v4:
> - Squashed everything into a single commit
> - Moved from FDT to C structs (Pratyush)
> - Use subtrees instead of FDT directly (Pratyush)
> - Renamed a bunch of variables and functions.
> - Link to v3: https://patch.msgid.link/20260108-kho-v3-0-b1d6b7a89342@debian.org
> 
> Changes in v3:
> - Remove the extra CONFIG for this feature.
> - Reworded some identifiers, properties and printks.
> - Better documented the questions raised during v2.
> - Link to v2: https://patch.msgid.link/20260102-kho-v2-0-1747b1a3a1d6@debian.org
> 
> Changes from v1 (RFC) to v2:
> - Track the number of kexecs since cold boot (Pasha)
> - Change the printk() order compared to KHO
> - Rewording of the commit summary
> - Link to RFC: https://patch.msgid.link/20251230-kho-v1-1-4d795a24da9e@debian.org
> ---
>  include/linux/kho/abi/kexec_handover.h | 29 +++++++++++++++
>  kernel/liveupdate/kexec_handover.c     | 65 ++++++++++++++++++++++++++++++++++
>  2 files changed, 94 insertions(+)
> 
> diff --git a/include/linux/kho/abi/kexec_handover.h b/include/linux/kho/abi/kexec_handover.h
> index 285eda8a36e45..e18022a4e664d 100644
> --- a/include/linux/kho/abi/kexec_handover.h
> +++ b/include/linux/kho/abi/kexec_handover.h
> @@ -11,6 +11,7 @@
>  #define _LINUX_KHO_ABI_KEXEC_HANDOVER_H
>  
>  #include <linux/types.h>
> +#include <linux/utsname.h>
>  
>  /**
>   * DOC: Kexec Handover ABI
> @@ -84,6 +85,34 @@
>  /* The FDT property for sub-FDTs. */
>  #define KHO_FDT_SUB_TREE_PROP_NAME "fdt"
>  
> +/**
> + * DOC: Kexec Metadata ABI
> + *

It would be nice to link it from Documentation/ as well ;-)

> + * The "kexec-metadata" subtree stores optional metadata about the kexec chain.
> + * It is registered via kho_add_subtree(), keeping it independent from the core
> + * KHO ABI. This allows the metadata format to evolve without affecting other
> + * KHO consumers.
> + *
> + * The metadata is stored as a plain C struct rather than FDT format for
> + * simplicity and direct field access.
> + */
> +
> +/**
> + * struct kho_kexec_metadata - Kexec metadata passed between kernels
> + * @previous_release: Kernel version string that initiated the kexec
> + * @kexec_count: Number of kexec boots since last cold boot
> + *
> + * This structure is preserved across kexec and allows the new kernel to
> + * identify which kernel it was booted from and how many kexec reboots
> + * have occurred.
> + */
> +struct kho_kexec_metadata {
> +	char previous_release[__NEW_UTS_LEN + 1];
> +	u32 kexec_count;
> +} __packed;
> +
> +#define KHO_METADATA_NODE_NAME "kexec-metadata"
> +
>  /**
>   * DOC: Kexec Handover ABI for vmalloc Preservation
>   *
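
For readers following the layout: below is a minimal userspace sketch of the
packed struct above, not the kernel code itself. It assumes __NEW_UTS_LEN is
64 (its value in the uapi utsname header) and a GCC/Clang-style packed
attribute. The names kexec_metadata_sketch, SKETCH_UTS_LEN and fill_metadata()
are hypothetical, and "current count plus one" is only my reading of the
commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: mirrors __NEW_UTS_LEN (64) from the uapi utsname header. */
#define SKETCH_UTS_LEN 64

/* Hypothetical userspace mirror of struct kho_kexec_metadata. */
struct kexec_metadata_sketch {
	char previous_release[SKETCH_UTS_LEN + 1];	/* 65 bytes, NUL-terminated */
	uint32_t kexec_count;				/* unaligned, at offset 65 */
} __attribute__((packed));

/*
 * Hypothetical helper: build the metadata handed to the next kernel.
 * Assumes the next kernel should see the current count plus one, so a
 * cold-booted kernel (count 0) hands over count 1.
 */
static void fill_metadata(struct kexec_metadata_sketch *md,
			  const char *release, uint32_t current_count)
{
	memset(md, 0, sizeof(*md));
	/* snprintf() truncates if needed and always NUL-terminates. */
	snprintf(md->previous_release, sizeof(md->previous_release),
		 "%s", release);
	md->kexec_count = current_count + 1;
}
```

With the packed attribute there is no padding, so the struct is 69 bytes and
kexec_count is not naturally aligned. That is worth keeping in mind if the
layout is ever extended.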
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c

...

>  static __init int kho_init(void)
>  {
>  	const void *fdt = kho_get_fdt();
> @@ -1357,6 +1413,15 @@ static __init int kho_init(void)
>  	if (err)
>  		goto err_free_fdt;
>  
> +	if (fdt)
> +		kho_process_kexec_metadata();

Can't we move it into the existing if (fdt) below?
 
> +
> +	/* Populate kexec metadata for the possible next kexec */
> +	err = kho_populate_kexec_metadata();
> +	if (err)
> +		pr_warn("failed to initialize kexec-metadata subtree: %d\n",
> +			err);

Please follow the if (err) goto err_... pattern already used in this function.

kho_populate_kexec_metadata() failure essentially means that we failed to
allocate memory. This shouldn't happen that early in boot, but if it did,
then something is utterly wrong.
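
To illustrate the shape I mean, here is a rough userspace sketch, not the
actual kho_init(). The names populate_metadata(), free_fdt(), init_sketch()
and the fail_populate/fdt_freed flags are hypothetical stand-ins; the only
point is that a failure jumps to the shared unwind label instead of being
logged and ignored.

```c
#include <errno.h>

/* Hypothetical test knobs, not part of any kernel interface. */
static int fail_populate;	/* set to 1 to simulate an allocation failure */
static int fdt_freed;		/* records whether the unwind path ran */

/* Stand-in for a populate step that can only fail with -ENOMEM. */
static int populate_metadata(void)
{
	return fail_populate ? -ENOMEM : 0;
}

/* Stand-in for the cleanup done under the error label. */
static void free_fdt(void)
{
	fdt_freed = 1;
}

/* Init function with a single unwind label shared by failing steps. */
static int init_sketch(void)
{
	int err;

	fdt_freed = 0;

	err = populate_metadata();
	if (err)
		goto err_free_fdt;	/* fatal: unwind instead of pr_warn() */

	return 0;

err_free_fdt:
	free_fdt();
	return err;
}
```

On success the function returns 0 and the cleanup never runs; on failure it
returns the negative errno after unwinding.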

> +
>  	if (fdt) {
>  		kho_in_debugfs_init(&kho_in.dbg, fdt);
>  		return 0;

-- 
Sincerely yours,
Mike.


