Kexec Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Eric DeVolder <eric.devolder@oracle.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	kexec@lists.infradead.org, ebiederm@xmission.com,
	dyoung@redhat.com, bhe@redhat.com, vgoyal@redhat.com
Cc: mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, nramas@linux.microsoft.com,
	thomas.lendacky@amd.com, robh@kernel.org, efault@gmx.de,
	rppt@kernel.org, david@redhat.com, sourabhjain@linux.ibm.com,
	konrad.wilk@oracle.com, boris.ostrovsky@oracle.com
Subject: Re: [PATCH v18 5/7] kexec: exclude hot remove cpu from elfcorehdr notes
Date: Tue, 7 Feb 2023 11:23:58 -0600	[thread overview]
Message-ID: <dd03f47a-0017-6239-04e9-e796dca03c0c@oracle.com> (raw)
In-Reply-To: <87sffpzkle.ffs@tglx>



On 2/1/23 05:33, Thomas Gleixner wrote:
> Eric!
> 
> On Tue, Jan 31 2023 at 17:42, Eric DeVolder wrote:
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -366,6 +366,14 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>>   
>>   	/* Prepare one phdr of type PT_NOTE for each present CPU */
>>   	for_each_present_cpu(cpu) {
>> +#ifdef CONFIG_CRASH_HOTPLUG
>> +		if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
>> +			/* Skip the soon-to-be offlined cpu */
>> +			if ((image->hp_action == KEXEC_CRASH_HP_REMOVE_CPU) &&
>> +				(cpu == image->offlinecpu))
>> +				continue;
>> +		}
>> +#endif
> 
> I'm failing to see how the above is correct in any way. Look at the
> following sequence of events:
> 
>       1) Offline CPU$N
> 
>          -> Prepare elf headers with CPU$N excluded
> 
>       2) Another hotplug operation != 'Online CPU$N'
> 
>          -> Prepare elf headers with CPU$N included
> 
> Also in case of loading the crash kernel in the situation where not all
> present CPUs are online (think boot time SMT disable) then your
> resulting crash image will contain all present CPUs and none of the
> offline CPUs are excluded.
> 
> How does that make any sense at all?
> 
> This image->hp_action and image->offlinecpu dance is engineering
> voodoo. You just can do:
> 
>          for_each_present_cpu(cpu) {
>              if (!cpu_online(cpu))
>              	continue;
>              do_stuff(cpu);
> 
> which does the right thing in all situations and can be further
> simplified to:
> 
>          for_each_online_cpu(cpu) {
>              do_stuff(cpu);
> 
> without the need for ifdefs or whatever.
> 
> No?
> 
> Thanks,
> 
>          tglx

Thomas,

I've been re-examining the cpuhp framework and understand a bit better its
operation.

Up until now, this patch series has been using either CPUHP_AP_ONLINE_DYN
or more recently CPUHP_BP_PREPARE_DYN with the same handler for both the
startup and teardown callbacks. This resulted in the cpu state, as seen by
my handler, as being incorrect in one direction or the other. For example,
when using CPUHP_AP_ONLINE_DYN, cpu_online() always resulted in 1 for the
cpu in my callback, even during tear down. For CPUHP_BP_PREPARE_DYN,
cpu_online() always resulted in 0. Thus the offlinecpu voodoo.

But no more!

The reason, as I now understand, is simple. A cpu will not show as online
until after state CPUHP_BRINGUP_CPU (when working from CPUHP_OFFLINE towards
CPUHP_ONLINE). And a cpu will not show as offline until after state
CPUHP_TEARDOWN_CPU (when working reverse order from CPUHP_ONLINE to
CPUHP_OFFLINE).

The CPUHP_BRINGUP_CPU is the last state of the PREPARE section, and boots
the new cpu. It is code running on the booting cpu that marks itself as
online.

  CPUHP_BRINGUP_CPU
    .startup()
      bringup_cpu()
        __cpu_up()
         smp_ops.cpu_up()
          native_cpu_up()
           do_boot_cpu()
            ===== on new cpu! =====
            start_secondary()
             set_cpu_online(true)

There are quite a few CPUHP_..._STARTING states before the cpu is in a productive state.

The CPUHP_TEARDOWN_CPU is the last state in the STARTING section, and takes the cpu down.
Work/irqs are removed from this cpu and re-assigned to others.

  CPUHP_TEARDOWN_CPU
    .teardown()
     takedown_cpu()
      take_cpu_down()
       __cpu_disable()
        smp_ops.cpu_disable()
         native_cpu_disable()
          cpu_disable_common()
           remove_cpu_from_maps()
            set_cpu_online(false)

So my latest solution is introduce two new CPUHP states, CPUHP_AP_ELFCOREHDR_ONLINE
for onlining and CPUHP_BP_ELFCOREHDR_OFFLINE for offlining. I'm open to better names.

The CPUHP_AP_ELFCOREHDR_ONLINE needs to be placed after CPUHP_BRINGUP_CPU. My
attempts at locating this state failed when inside the STARTING section, so I located
this just inside the ONLINE sectoin. The crash hotplug handler is registered on
this state as the callback for the .startup method.

The CPUHP_BP_ELFCOREHDR_OFFLINE needs to be placed before CPUHP_TEARDOWN_CPU, and I
placed it at the end of the PREPARE section. This crash hotplug handler is also
registered on this state as the callback for the .teardown method.

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 6c6859bfc454..52d2db4d793e 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -131,6 +131,7 @@ enum cpuhp_state {
     CPUHP_ZCOMP_PREPARE,
     CPUHP_TIMERS_PREPARE,
     CPUHP_MIPS_SOC_PREPARE,
+   CPUHP_BP_ELFCOREHDR_OFFLINE,
     CPUHP_BP_PREPARE_DYN,
     CPUHP_BP_PREPARE_DYN_END        = CPUHP_BP_PREPARE_DYN + 20,
     CPUHP_BRINGUP_CPU,
@@ -205,6 +206,7 @@ enum cpuhp_state {

     /* Online section invoked on the hotplugged CPU from the hotplug thread */
     CPUHP_AP_ONLINE_IDLE,
+   CPUHP_AP_ELFCOREHDR_ONLINE,
     CPUHP_AP_SCHED_WAIT_EMPTY,
     CPUHP_AP_SMPBOOT_THREADS,
     CPUHP_AP_X86_VDSO_VMA_ONLINE,

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 8a439b6d723b..e1a3430f06f4 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c

+   if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
+       result = cpuhp_setup_state_nocalls(CPUHP_AP_ELFCOREHDR_ONLINE,
+                          "crash/cpuhp_online", crash_cpuhp_online, NULL);
+       result = cpuhp_setup_state_nocalls(CPUHP_BP_ELFCOREHDR_OFFLINE,
+                          "crash/cpuhp_offline", NULL, crash_cpuhp_offline);
+   }

With the above, there is no need for offlinecpu, as the crash hotplug handler
callback now observes the correct cpu_online() state in both online and offline
activities.

Which leads me to the next item. Thomas you suggested

           for_each_online_cpu(cpu) {
               do_stuff(cpu);

I've been looking into this further, and don't yet have conclusion.
In light of Sourabh's comments/concerns about packing PT_NOTES, I
need to determine if my introduction of

        if (IS_ENABLED(CONFIG_CRASH_HOTPLUG)) {
            if (!cpu_online(cpu)) continue;
        }

does not cause other downstream issues. My testing was focused on
hot plug/unplugging cpus in a last-on-first-off manner, where as
I now realize cpus can be onlined/offlined sparsely (thus the PT_NOTE
packing concern).

I'm making my way though percpu crash_notes, elfcorehdr, vmcoreinfo,
makedumpfile and (the consumer of it all) the userspace crash utility,
in order to understand the impact of moving from for_each_present_cpu()
to for_each_online_cpu().

At any rate, I wanted to at least put forth the introduction of the
two new CPUHP states and solicit feedback there while I investigate
the for_each_online_cpu() matter.

Thanks for pushing me on this topic!
eric


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  parent reply	other threads:[~2023-02-07 17:24 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-31 22:42 [PATCH v18 0/7] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
2023-01-31 22:42 ` [PATCH v18 1/7] crash: move a few code bits to setup support of crash hotplug Eric DeVolder
2023-01-31 22:42 ` [PATCH v18 2/7] crash: prototype change for crash_prepare_elf64_headers() Eric DeVolder
2023-01-31 22:42 ` [PATCH v18 3/7] crash: add generic infrastructure for crash hotplug support Eric DeVolder
2023-02-09 19:10   ` Sourabh Jain
2023-02-10 16:51     ` Eric DeVolder
2023-01-31 22:42 ` [PATCH v18 4/7] kexec: exclude elfcorehdr from the segment digest Eric DeVolder
2023-01-31 22:42 ` [PATCH v18 5/7] kexec: exclude hot remove cpu from elfcorehdr notes Eric DeVolder
2023-02-01 11:33   ` Thomas Gleixner
2023-02-06  8:12     ` Sourabh Jain
2023-02-06 13:03       ` Thomas Gleixner
2023-02-07 17:23     ` Eric DeVolder [this message]
2023-02-08 13:44       ` Thomas Gleixner
2023-02-09 17:31         ` Eric DeVolder
2023-02-09 18:43           ` Sourabh Jain
2023-02-09 19:39             ` Eric DeVolder
2023-02-10  6:29               ` Sourabh Jain
2023-02-11  0:35                 ` Eric DeVolder
2023-02-13  4:40                   ` Sourabh Jain
2023-02-13 12:52                     ` Thomas Gleixner
2023-02-15  2:53                       ` Sourabh Jain
2023-02-28 12:44                     ` Baoquan He
2023-02-28 18:52                       ` Eric DeVolder
2023-03-01 15:48                         ` Eric DeVolder
2023-03-02 10:51                           ` Baoquan He
2023-03-02  5:23                         ` Sourabh Jain
2023-02-23 20:34                 ` Eric DeVolder
2023-02-24  8:34                   ` Sourabh Jain
2023-02-24 20:16                     ` Eric DeVolder
2023-02-27  6:11                       ` Sourabh Jain
2023-02-28 21:50                         ` Eric DeVolder
2023-03-01  6:22                           ` Sourabh Jain
2023-03-01 14:16                             ` Eric DeVolder
2023-01-31 22:42 ` [PATCH v18 6/7] crash: memory and cpu hotplug sysfs attributes Eric DeVolder
2023-01-31 22:42 ` [PATCH v18 7/7] x86/crash: add x86 crash hotplug support Eric DeVolder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dd03f47a-0017-6239-04e9-e796dca03c0c@oracle.com \
    --to=eric.devolder@oracle.com \
    --cc=bhe@redhat.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=dyoung@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=efault@gmx.de \
    --cc=hpa@zytor.com \
    --cc=kexec@lists.infradead.org \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nramas@linux.microsoft.com \
    --cc=robh@kernel.org \
    --cc=rppt@kernel.org \
    --cc=sourabhjain@linux.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=vgoyal@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox