public inbox for linux-kernel@vger.kernel.org
From: Baoquan He <bhe@redhat.com>
To: Eric DeVolder <eric.devolder@oracle.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	kexec@lists.infradead.org, ebiederm@xmission.com,
	dyoung@redhat.com, vgoyal@redhat.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, nramas@linux.microsoft.com,
	thomas.lendacky@amd.com, robh@kernel.org, efault@gmx.de,
	rppt@kernel.org, david@redhat.com, sourabhjain@linux.ibm.com,
	konrad.wilk@oracle.com, boris.ostrovsky@oracle.com
Subject: Re: [PATCH v22 6/8] crash: hotplug support for kexec_load()
Date: Mon, 8 May 2023 13:13:33 +0800
Message-ID: <ZFiE/TXDtrt/y73w@MiWiFi-R3L-srv>
In-Reply-To: <20230503224145.7405-7-eric.devolder@oracle.com>

On 05/03/23 at 06:41pm, Eric DeVolder wrote:
> The hotplug support for kexec_load() requires coordination with
> userspace, and therefore a little extra help from the kernel to
> facilitate the coordination.
> 
> Without this patch, if a kdump capture kernel is loaded via the
> kexec_load() syscall,
> then the crash hotplug logic would find the segment containing the
> elfcorehdr, and upon a hotplug event, rewrite the elfcorehdr. While
> generally speaking that is the desired behavior and outcome, a
> problem arises from the fact that if the kdump image includes a
> purgatory that performs a digest checksum, then that check would
> fail (because the elfcorehdr was changed), and the capture kernel
> would fail to boot and no kdump would occur.
> 
> Therefore, what is needed is for the userspace kexec-tools to
> indicate to the kernel whether or not the supplied kdump image/
> elfcorehdr can be modified (because the kexec-tools excludes the
> elfcorehdr from the digest, and sizes the elfcorehdr memory buffer
> appropriately).
> 
> To solve these problems, this patch introduces:
>  - a new kexec flag KEXEC_UPDATE_ELFCOREHDR to indicate that it is
>    safe for the kernel to modify the elfcorehdr (because kexec-tools
>    has excluded the elfcorehdr from the digest).
>  - the /sys/kernel/crash_elfcorehdr_size node to communicate to
>    kexec-tools what the preferred size of the elfcorehdr memory buffer
>    should be in order to accommodate hotplug changes.
>  - The sysfs crash_hotplug nodes (ie.
>    /sys/devices/system/[cpu|memory]/crash_hotplug) are now dynamic in
>    that they examine kexec_file_load() vs kexec_load(), and when
>    kexec_load(), whether or not KEXEC_UPDATE_ELFCOREHDR is in effect.
>    This is critical so that udev rule processing of crash_hotplug
>    gives the correct indication (i.e. whether the userspace
>    unload-then-load of the kdump image can be skipped or not).
> 
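
For illustration only (this is not part of the patch): a minimal C sketch of
how a userspace helper might interpret the two kinds of sysfs node described
above. The helper names read_node(), crash_hotplug_enabled() and
parse_elfcorehdr_size() are hypothetical, and the sketch assumes that a
crash_hotplug node contains "0" or "1" and that crash_elfcorehdr_size
contains a decimal byte count.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style text node into buf and strip the
 * trailing newline. Returns 0 on success, -1 if the node cannot be read.
 */
int read_node(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Assumption: "1" means the kernel updates the elfcorehdr itself, so the
 * userspace unload-then-load can be skipped. Anything else means reload.
 */
bool crash_hotplug_enabled(const char *value)
{
	return value && strcmp(value, "1") == 0;
}

/*
 * Assumption: the size node holds a plain decimal byte count.
 * Returns 0 when the text is not a clean decimal number.
 */
unsigned long parse_elfcorehdr_size(const char *value)
{
	char *end;
	unsigned long n;

	if (!value || !isdigit((unsigned char)value[0]))
		return 0;
	n = strtoul(value, &end, 10);
	return (*end == '\0') ? n : 0;
}
```

A udev-triggered helper could, for example, call read_node() on
/sys/devices/system/cpu/crash_hotplug and skip the reload only when
crash_hotplug_enabled() returns true for the value it read.
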
> With this patch in place, I believe the following statements to be true
> (with local testing to verify):
> 
>  - For systems which have these kernel changes in place, but not the
>    corresponding changes to the crash hot plug udev rules and
>    kexec-tools (i.e. "older" systems), those systems will continue to
>    unload-then-load the kdump image, as has always been done. The
>    kexec-tools will not set KEXEC_UPDATE_ELFCOREHDR.
>  - For systems which have these kernel changes in place and the proposed
>    udev rule changes in place, but not the kexec-tools changes in place:
>     - the use of kexec_load() will not set KEXEC_UPDATE_ELFCOREHDR and
>       so the unload-then-reload of kdump image will occur (the sysfs
>       crash_hotplug nodes will show 0).
>     - the use of kexec_file_load() will permit sysfs crash_hotplug nodes
>       to show 1, and the kernel will modify the elfcorehdr directly. And
>       with the udev changes in place, the unload-then-load will not occur!
>  - For systems which have these kernel changes as well as the udev and
>    kexec-tools changes in place, then the user/admin has full control
>    over whether crash hotplug support is enabled, whether via
>    kexec_file_load() or kexec_load().
> 
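
The three cases above can be condensed into a small decision model. This is
a sketch of my reading of the statements, not kernel or kexec-tools code;
the enum and both function names are hypothetical, and it assumes the
architecture supports crash hotplug in the first place.

```c
#include <stdbool.h>

/* Which syscall loaded the kdump image (names are illustrative only). */
enum kdump_load_method {
	VIA_KEXEC_LOAD,
	VIA_KEXEC_FILE_LOAD,
};

/*
 * Models what the sysfs crash_hotplug nodes would show: 1 for
 * kexec_file_load(), and for kexec_load() only when kexec-tools
 * passed KEXEC_UPDATE_ELFCOREHDR.
 */
bool crash_hotplug_shows_one(enum kdump_load_method method,
			     bool update_elfcorehdr_flag)
{
	if (method == VIA_KEXEC_FILE_LOAD)
		return true;
	return update_elfcorehdr_flag;
}

/*
 * Models whether userspace still performs the unload-then-load of the
 * kdump image on a hotplug event. Old udev rules never consult
 * crash_hotplug, so they always reload.
 */
bool userspace_reloads_kdump(bool udev_rules_updated,
			     enum kdump_load_method method,
			     bool update_elfcorehdr_flag)
{
	if (!udev_rules_updated)
		return true;
	return !crash_hotplug_shows_one(method, update_elfcorehdr_flag);
}
```

Under this model the reload is skipped only when the updated udev rules are
present and either kexec_file_load() was used or kexec_load() was called
with KEXEC_UPDATE_ELFCOREHDR set.
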
> Said differently, as kexec_load() was/is widely in use, these changes
> permit it to continue to be used as-is (retaining the current unload-then-
> reload behavior) until such time as the udev and kexec-tools changes can
> be rolled out as well.
> 
> I've intentionally kept the changes related to userspace coordination
> for kexec_load() separate as this need was identified late; the
> rest of this series has been generally reviewed and accepted. Once
> this support has been vetted, I can refactor if needed.
> 
> Suggested-by: Hari Bathini <hbathini@linux.ibm.com>
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>

LGTM,

Acked-by: Baoquan He <bhe@redhat.com>


Thread overview: 19+ messages
2023-05-03 22:41 [PATCH v22 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
2023-05-03 22:41 ` [PATCH v22 1/8] crash: move a few code bits to setup support of crash hotplug Eric DeVolder
2023-05-03 22:41 ` [PATCH v22 2/8] crash: add generic infrastructure for crash hotplug support Eric DeVolder
2023-05-03 22:41 ` [PATCH v22 3/8] kexec: exclude elfcorehdr from the segment digest Eric DeVolder
2023-05-03 22:41 ` [PATCH v22 4/8] crash: memory and CPU hotplug sysfs attributes Eric DeVolder
2023-05-03 22:41 ` [PATCH v22 5/8] x86/crash: add x86 crash hotplug support Eric DeVolder
2023-05-09 22:52   ` Thomas Gleixner
2023-05-10 22:49     ` Eric DeVolder
2023-05-03 22:41 ` [PATCH v22 6/8] crash: hotplug support for kexec_load() Eric DeVolder
2023-05-08  5:13   ` Baoquan He [this message]
2023-05-09  6:15   ` Sourabh Jain
2023-05-10 22:52     ` Eric DeVolder
2023-05-09  6:56   ` Sourabh Jain
2023-05-09 20:35     ` Eric DeVolder
2023-05-03 22:41 ` [PATCH v22 7/8] crash: change crash_prepare_elf64_headers() to for_each_possible_cpu() Eric DeVolder
2023-05-03 22:41 ` [PATCH v22 8/8] x86/crash: optimize CPU changes Eric DeVolder
2023-05-09 22:39   ` Thomas Gleixner
2023-05-10 22:49     ` Eric DeVolder
2023-05-04  6:22 ` [PATCH v22 0/8] crash: Kernel handling of CPU and memory hot un/plug Hari Bathini
