From: Sourabh Jain <sourabhjain@linux.ibm.com>
To: Eric DeVolder <eric.devolder@oracle.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	kexec@lists.infradead.org, ebiederm@xmission.com,
	dyoung@redhat.com, bhe@redhat.com, vgoyal@redhat.com
Cc: mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, nramas@linux.microsoft.com,
	thomas.lendacky@amd.com, robh@kernel.org, efault@gmx.de,
	rppt@kernel.org, david@redhat.com, konrad.wilk@oracle.com,
	boris.ostrovsky@oracle.com
Subject: Re: [PATCH v18 5/7] kexec: exclude hot remove cpu from elfcorehdr notes
Date: Fri, 10 Feb 2023 00:13:00 +0530
Message-ID: <d465173e-a31a-c4d6-af51-59d9ff0c2edc@linux.ibm.com>
In-Reply-To: <7580421a-648a-2c4b-3c33-82e7622d9585@oracle.com>

Hello Eric,

On 09/02/23 23:01, Eric DeVolder wrote:
>
>
> On 2/8/23 07:44, Thomas Gleixner wrote:
>> Eric!
>>
>> On Tue, Feb 07 2023 at 11:23, Eric DeVolder wrote:
>>> On 2/1/23 05:33, Thomas Gleixner wrote:
>>>
>>> So my latest solution is to introduce two new CPUHP states,
>>> CPUHP_AP_ELFCOREHDR_ONLINE for onlining and CPUHP_BP_ELFCOREHDR_OFFLINE
>>> for offlining. I'm open to better names.
>>>
>>> The CPUHP_AP_ELFCOREHDR_ONLINE needs to be placed after
>>> CPUHP_BRINGUP_CPU. My attempts at locating this state failed when
>>> inside the STARTING section, so I located this just inside the ONLINE
>>> section. The crash hotplug handler is registered on this state as the
>>> callback for the .startup method.
>>>
>>> The CPUHP_BP_ELFCOREHDR_OFFLINE needs to be placed before
>>> CPUHP_TEARDOWN_CPU, and I placed it at the end of the PREPARE section.
>>> This crash hotplug handler is also registered on this state as the
>>> callback for the .teardown method.
>>
>> TBH, that's still overengineered. Something like this:
>>
>> bool cpu_is_alive(unsigned int cpu)
>> {
>>     struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
>>
>>     return data_race(st->state) <= CPUHP_AP_IDLE_DEAD;
>> }
>>
>> and use this to query the actual state at crash time. That spares all
>> those callback heuristics.
>>
>>> I'm making my way through percpu crash_notes, elfcorehdr, vmcoreinfo,
>>> makedumpfile and (the consumer of it all) the userspace crash utility,
>>> in order to understand the impact of moving from for_each_present_cpu()
>>> to for_each_online_cpu().
>>
>> Is the packing actually worth the trouble? What's the actual win?
>>
>> Thanks,
>>
>>          tglx
>>
>>
>
> Thomas,
> I've investigated the passing of crash notes through the vmcore. What
> I've learned is that:
>
> - linux/fs/proc/vmcore.c (which makedumpfile references to do its job)
>   does not care what the contents of cpu PT_NOTES are, but it does
>   coalesce them together.
>
> - makedumpfile will count the number of cpu PT_NOTES in order to
>   determine its nr_cpus variable, which is reported in a header, but
>   otherwise unused (except for sadump method).
>
> - the crash utility, for the purposes of determining the cpus, does not
>   appear to reference the elfcorehdr PT_NOTEs. Instead it locates the
>   various cpu_[possible|present|online]_mask and computes nr_cpus from
>   that, and also of course which are online. In addition, when crash
>   does reference the cpu PT_NOTE, to get its prstatus, it does so by
>   using a percpu technique directly in the vmcore image memory, not via
>   the ELF structure. Said differently, it appears to me that crash
>   utility doesn't rely on the ELF PT_NOTEs for cpus; rather it obtains
>   them via kernel cpumasks and the memory within the vmcore.
>
> With this understanding, I did some testing. Perhaps the most telling
> test was that I changed the number of cpu PT_NOTEs emitted in the
> crash_prepare_elf64_headers() to just 1, hot plugged some cpus, then
> also took a few offline sparsely via chcpu, then generated a vmcore.
> The crash utility had no problem loading the vmcore, it reported the
> proper number of cpus and the number offline (despite only one cpu
> PT_NOTE), and changing to a different cpu via 'set -c 30' and the
> backtrace was completely valid.
>
> My takeaway is that crash utility does not rely upon ELF cpu PT_NOTEs,
> it obtains the cpu information directly from kernel data structures.
> Perhaps at one time crash relied upon the ELF information, but no more.
> (Perhaps there are other crash dump analyzers that might rely on the
> ELF info?)
>
> So, all this to say that I see no need to change
> crash_prepare_elf64_headers(). There is no compelling reason to move
> away from for_each_present_cpu(), or modify the list for
> online/offline.
>
> Which then leaves the topic of the cpuhp state on which to register.
> Perhaps reverting back to the use of CPUHP_BP_PREPARE_DYN is the right
> answer. There does not appear to be a compelling need to accurately
> track whether the cpu went online/offline for the purposes of creating
> the elfcorehdr, as ultimately the crash utility pulls that from kernel
> data structures, not the elfcorehdr.
>
> I think this is what Sourabh has known and has been advocating for an
> optimization path that allows not regenerating the elfcorehdr on cpu
> changes (because the percpu structs are all laid out). I do think it
> best to leave that as an arch choice.
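
For reference, the PT_NOTE counting described in the quoted summary can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the actual makedumpfile code: count_pt_notes() is a
hypothetical helper, and it assumes the ELF64 program header table has
already been read into memory and that the userspace <elf.h> definitions
are available.

```c
#include <elf.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from makedumpfile or the kernel): count the
 * PT_NOTE entries in an in-memory ELF64 program header table. In an
 * elfcorehdr that carries one PT_NOTE per cpu, this count is what a
 * consumer could use as its cpu count.
 */
static size_t count_pt_notes(const Elf64_Phdr *phdrs, size_t phnum)
{
	size_t notes = 0;

	for (size_t i = 0; i < phnum; i++) {
		if (phdrs[i].p_type == PT_NOTE)
			notes++;
	}

	return notes;
}
```

A consumer that derives its cpu count this way would see a different value
when fewer PT_NOTEs are emitted, while a consumer that reads the kernel
cpumasks from the vmcore memory would not.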

Since things are clear on how the PT_NOTEs are consumed in the kdump kernel
[fs/proc/vmcore.c], makedumpfile, and the crash tool, I need your opinion on
this:

Do we really need to regenerate the elfcorehdr for CPU hotplug events?
If yes, can you please list the elfcorehdr components that change due to
CPU hotplug?

From what I understood, crash notes are prepared for possible CPUs as the
system boots and could be used to create a PT_NOTE section for each possible
CPU while generating the elfcorehdr during the kdump kernel load.

Now, once the elfcorehdr is loaded with PT_NOTEs for every possible CPU,
there is no need to regenerate it for CPU hotplug events. Or is there?
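
To make the question concrete, here is a small userspace model of the idea.
It is a sketch only, not kernel code: NR_CPUS_SKETCH, struct
cpu_masks_sketch, count_notes(), hot_add() and hot_remove() are all
hypothetical names, and the boolean arrays merely stand in for the
possible and present cpumasks.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 8	/* hypothetical number of cpu slots */

/* Stand-ins for the possible and present cpumasks (assumption). */
struct cpu_masks_sketch {
	bool possible[NR_CPUS_SKETCH];
	bool present[NR_CPUS_SKETCH];
};

/* Number of PT_NOTEs emitted if one note is created per cpu in @mask. */
static size_t count_notes(const bool *mask)
{
	size_t n = 0;

	for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (mask[cpu])
			n++;
	}

	return n;
}

/* Hot add changes only the present set; the possible set stays fixed. */
static void hot_add(struct cpu_masks_sketch *m, size_t cpu)
{
	if (cpu < NR_CPUS_SKETCH && m->possible[cpu])
		m->present[cpu] = true;
}

/* Hot remove likewise leaves the possible set untouched. */
static void hot_remove(struct cpu_masks_sketch *m, size_t cpu)
{
	if (cpu < NR_CPUS_SKETCH)
		m->present[cpu] = false;
}
```

In this model the note count taken over the possible set is the same before
and after hot_add()/hot_remove(), while the count taken over the present set
changes. That is the reason a header built over possible CPUs would not need
to be regenerated on CPU hotplug.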

Thanks,
Sourabh Jain

