Kexec Archive on lore.kernel.org
From: Baoquan He <bhe@redhat.com>
To: Eric DeVolder <eric.devolder@oracle.com>
Cc: Sourabh Jain <sourabhjain@linux.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	kexec@lists.infradead.org, ebiederm@xmission.com,
	dyoung@redhat.com, vgoyal@redhat.com, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	nramas@linux.microsoft.com, thomas.lendacky@amd.com,
	robh@kernel.org, efault@gmx.de, rppt@kernel.org,
	david@redhat.com, konrad.wilk@oracle.com,
	boris.ostrovsky@oracle.com
Subject: Re: [PATCH v18 5/7] kexec: exclude hot remove cpu from elfcorehdr notes
Date: Thu, 2 Mar 2023 18:51:55 +0800
Message-ID: <ZAB/y/pUU/xhY2k9@MiWiFi-R3L-srv>
In-Reply-To: <536c69d0-9ebd-8356-ebcb-680562bcd277@oracle.com>

On 03/01/23 at 09:48am, Eric DeVolder wrote:
...... 
> From b56aa428b07d970f26e3c3704d54ce8805f05ddc Mon Sep 17 00:00:00 2001
> From: Eric DeVolder <eric.devolder@oracle.com>
> Date: Tue, 28 Feb 2023 14:20:04 -0500
> Subject: [PATCH v19 3/7] crash: change crash_prepare_elf64_headers() to
>  for_each_possible_cpu()
> 
> The function crash_prepare_elf64_headers() generates the elfcorehdr
> which describes the cpus and memory in the system for the crash kernel.
> In particular, it writes out ELF PT_LOADs for the memory regions and
> PT_NOTEs for the processors in the system.
> 
> With respect to the cpus, the current implementation utilizes
> for_each_present_cpu() which means that as cpus are added and removed,
> the elfcorehdr must again be updated to reflect the new set of cpus.
> 
> The reasoning behind the change to use for_each_possible_cpu() is:
> 
> - At kernel boot time, all percpu crash_notes are allocated for all
>   possible cpus; that is, crash_notes are not allocated dynamically
>   when cpus are plugged/unplugged. Thus the crash_notes for each
>   possible cpu are always available.
> 
> - The crash_prepare_elf64_headers() creates an ELF PT_NOTE per cpu.
>   Changing to for_each_possible_cpu() is valid as the crash_notes
>   pointed to by each cpu PT_NOTE are present and always valid.
> 
> Furthermore, examining a common crash processing path of:
> 
>  kernel panic -> crash kernel -> makedumpfile -> 'crash' analyzer
>            elfcorehdr      /proc/vmcore     vmcore
> 
> reveals how the ELF cpu PT_NOTEs are utilized:
> 
> - Upon panic, each cpu is sent an IPI and shuts itself down, recording
>  its state in its crash_notes. When all cpus are shut down, the
>  crash kernel is launched with a pointer to the elfcorehdr.
> 
> - The crash kernel via linux/fs/proc/vmcore.c does not examine or
>  use the contents of the PT_NOTEs; it only exposes them via /proc/vmcore.
> 
> - The makedumpfile utility uses /proc/vmcore and reads the cpu
>  PT_NOTEs to craft a nr_cpus variable, which is reported in a
>  header but otherwise generally unused. Makedumpfile creates the
>  vmcore.
> 
> - The 'crash' dump analyzer does not appear to reference the cpu
>  PT_NOTEs. Instead it looks up the cpu_[possible|present|online]_mask
>  symbols and directly examines those structure contents from vmcore
>  memory. From that information it is able to determine which cpus
>  are present and online, and locate the corresponding crash_notes.
>  Said differently, it appears to me that 'crash' analyzer does not
>  rely on the ELF PT_NOTEs for cpus; rather it obtains the information
>  directly via kernel symbols and the memory within the vmcore.
> 
> (There may be other vmcore generating and analysis tools that do use
> these PT_NOTEs, but 'makedumpfile' and 'crash' seem to me to be the
> most common solution.)
> 
> This change results in the benefit of having all cpus described in
> the elfcorehdr, and therefore reducing the need to re-generate the
> elfcorehdr on cpu changes, at the small expense of an additional
> 56 bytes per PT_NOTE for not-present-but-possible cpus.
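
A rough userspace sketch of where the 56-byte figure above comes from,
assuming it refers to one Elf64_Phdr per cpu PT_NOTE (that is my reading,
the patch does not spell it out). The helper name extra_note_phdr_bytes()
is made up for illustration and is not kernel code:

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* On a 64-bit ELF target one program header occupies 56 bytes. */
static_assert(sizeof(Elf64_Phdr) == 56, "unexpected Elf64_Phdr size");

/*
 * Hypothetical helper: extra elfcorehdr bytes needed when every possible
 * cpu gets a PT_NOTE program header instead of only the present ones.
 */
size_t extra_note_phdr_bytes(unsigned int possible_cpus,
			     unsigned int present_cpus)
{
	/* Guard against inconsistent input; present is a subset of possible. */
	if (possible_cpus < present_cpus)
		return 0;

	return (size_t)(possible_cpus - present_cpus) * sizeof(Elf64_Phdr);
}
```

For example, with 64 possible and 16 present cpus,
extra_note_phdr_bytes(64, 16) comes to 48 * 56 = 2688 bytes.
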
> 
> On systems where the kexec_file_load() syscall is utilized, all of the
> above is valid. On systems where the kexec_load() syscall is utilized,
> the elfcorehdr may need to be regenerated once. The reason is that some
> archs only populate the 'present' cpus in the /sys/devices/system/cpu
> entries, which the userspace 'kexec' utility uses to generate the
> userspace-supplied elfcorehdr. In this situation, one memory or cpu
> change will rewrite the elfcorehdr via the
> crash_prepare_elf64_headers() function, and from then on all possible
> cpus will be described, just as with the kexec_file_load() syscall.

So, with for_each_possible_cpu(), we don't need to respond to cpu
hotplug events, right? If so, it does bring a benefit, although
kexec_load won't benefit from it. So far, this doesn't look bad.
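
To make the point concrete, here is a minimal userspace model, not kernel
code. It assumes a 64-bit bitmap can stand in for cpu_possible_mask and
cpu_present_mask, and the names count_cpu_notes() and note_set_changed()
are invented for this sketch:

```c
#include <stdint.h>

#define SKETCH_NR_CPUS 64

/*
 * Count the PT_NOTE entries that would be emitted when iterating over
 * the given mask: one note per set bit, mirroring a for_each_*_cpu() loop.
 */
unsigned int count_cpu_notes(uint64_t mask)
{
	unsigned int notes = 0;

	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (mask & (UINT64_C(1) << cpu))
			notes++;
	}
	return notes;
}

/*
 * The set of cpu notes only has to be rewritten if the mask that was
 * iterated differs before and after the hotplug event.
 */
int note_set_changed(uint64_t mask_before, uint64_t mask_after)
{
	return mask_before != mask_after;
}
```

With a possible mask of 0xF and cpu 2 unplugged, the present mask goes
from 0xF to 0xB, so note_set_changed(0xF, 0xB) is true and a header built
from present cpus is stale. The possible mask stays 0xF across the event,
so note_set_changed(0xF, 0xF) is false and the cpu notes need no update.
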

> 
> Suggested-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  kernel/crash_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index dba4b75f7541..537b199a8774 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -365,7 +365,7 @@ int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map,
>  	ehdr->e_phentsize = sizeof(Elf64_Phdr);
> 
>  	/* Prepare one phdr of type PT_NOTE for each present CPU */
> -	for_each_present_cpu(cpu) {
> +	for_each_possible_cpu(cpu) {
>  		phdr->p_type = PT_NOTE;
>  		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
>  		phdr->p_offset = phdr->p_paddr = notes_addr;
> -- 
> 2.31.1
> 


