From: Eric DeVolder <eric.devolder@oracle.com>
To: Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org, x86@kernel.org,
kexec@lists.infradead.org, ebiederm@xmission.com,
dyoung@redhat.com, bhe@redhat.com, vgoyal@redhat.com
Cc: mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
hpa@zytor.com, nramas@linux.microsoft.com,
thomas.lendacky@amd.com, robh@kernel.org, efault@gmx.de,
rppt@kernel.org, david@redhat.com, sourabhjain@linux.ibm.com,
konrad.wilk@oracle.com, boris.ostrovsky@oracle.com
Subject: Re: [PATCH v17 3/6] crash: add generic infrastructure for crash hotplug support
Date: Fri, 20 Jan 2023 15:24:32 -0600
Message-ID: <92fd9892-e1cf-79f6-529c-c4e5b1516802@oracle.com>
In-Reply-To: <878rhyi53l.ffs@tglx>
On 1/19/23 15:31, Thomas Gleixner wrote:
> Eric!
>
> On Wed, Jan 18 2023 at 16:35, Eric DeVolder wrote:
>> CPU and memory change notifications are received in order to
>> regenerate the elfcorehdr.
>>
>> To support cpu hotplug, a callback is registered to capture the
>> CPUHP_AP_ONLINE_DYN online and offline events via
>> cpuhp_setup_state_nocalls().
>
> This sentence does not make sense. The callback is not registered to
> capture CPUHP_AP_ONLINE_DYN events.
>
> What this does is: It installs a dynamic CPU hotplug state with
> callbacks for online and offline. These callbacks store information
> about a CPU coming up and going down. Right?
I agree, the wording is wrong; this code taps into that state, as you suggest, in order to handle
the online and offline events.
>
> But why are they required and what's the value?
>
> This changelog tells WHAT it does and not WHY. I can see the WHAT from
> the patch itself.
>
> Don't tell me the WHY is in the cover letter. The cover letter is not
> part of the commits and changelogs have to be self contained.
>
> Now let me cite from your cover letter:
>
>> When the kdump service is loaded, if a CPU or memory is hot
>> un/plugged, the crash elfcorehdr, which describes the CPUs
>> and memory in the system, must also be updated, else the resulting
>> vmcore is inaccurate (eg. missing either CPU context or memory
>> regions).
I'll work to improve the wording, and to explain the why, in the next iteration.
>
> The CPU hotplug state you are using for this is patently inaccurate
> too. With your approach the CPU is tracked as online very late in the
> hotplug process and tracked as offline very early on unplug.
>
> So if the kernel crashes before/after the plug/unplug tracking event
> then your recorded state is bogus and given the amount of callbacks
> between the real online/offline and the recording point there is a
> pretty large window.
>
> You can argue that this is better than the current state and considered
> good enough for whatever reason, but such information wants to be in the
> changelog, no?
I agree! I admit that CPUHP_AP_ONLINE_DYN is probably not the best choice. I did spend time looking at
the CPU hotplug infrastructure, but did not find a better (or more correct) way. FWIW, from:
https://lore.kernel.org/lkml/20211118174948.37435-1-eric.devolder@oracle.com/:
"The second problem is the use of CPUHP_AP_ONLINE_DYN. The
cpuhp_setup_state_nocalls() is invoked with parameter
CPUHP_AP_ONLINE_DYN. While this works, when a CPU is being unplugged,
the CPU still shows up in foreach_present_cpu() during the
regeneration of the elfcorehdr, thus the need to explicitly check and
exclude the soon-to-be offlined CPU in crash_prepare_elf64_headers().
Perhaps if value(s) new/different than CPUHP_AP_ONLINE_DYN to
cpuhp_setup_state() was utilized, then the offline cpu would no longer
be in foreach_present_cpu(), and this change could be eliminated. I do
not understand cpuhp_setup_state() well enough to choose, or create,
appropriate value(s)."
The problem described there (and worked around in this patch series) is the behavior/window you point out.
I'd prefer to narrow the window, if possible. The states/values I tried did not work; any
suggestions for a more appropriate state/value would be most welcome!
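
To make the window concrete, here is a minimal user-space sketch in C. It is
only a model under stated assumptions: the step names, struct model_cpu and
the model_*() helpers are hypothetical and are not the kernel's
enum cpuhp_state or any kernel API. It assumes a single stand-in step for all
the callbacks that run between the real online/offline point and the
dynamic-state callback that does the recording.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified bring-up order. These are NOT the kernel's
 * cpuhp states; they only mimic the relative ordering discussed above.
 */
enum model_step {
	MODEL_OFFLINE = 0,
	MODEL_BRINGUP,		/* CPU is being started */
	MODEL_REAL_ONLINE,	/* CPU is really online from here on */
	MODEL_OTHER_CALLBACKS,	/* stand-in for the callbacks in between */
	MODEL_ONLINE_DYN,	/* late dynamic state: recording happens here */
	MODEL_STEP_COUNT
};

struct model_cpu {
	int step;		/* current enum model_step value */
	bool really_online;	/* ground truth */
	bool recorded_online;	/* what the crash bookkeeping believes */
};

/* Advance one step on plug; callbacks run when a step is entered. */
static void model_step_up(struct model_cpu *c)
{
	if (c->step + 1 >= MODEL_STEP_COUNT)
		return;
	c->step++;
	if (c->step == MODEL_REAL_ONLINE)
		c->really_online = true;
	if (c->step == MODEL_ONLINE_DYN)
		c->recorded_online = true;
}

/* Go back one step on unplug; teardown runs for the step being left. */
static void model_step_down(struct model_cpu *c)
{
	if (c->step == MODEL_OFFLINE)
		return;
	if (c->step == MODEL_ONLINE_DYN)
		c->recorded_online = false;	/* recorded offline very early */
	if (c->step == MODEL_REAL_ONLINE)
		c->really_online = false;	/* real offline comes later */
	c->step--;
}

/* True while a crash would capture a record that disagrees with reality. */
static bool model_record_is_stale(const struct model_cpu *c)
{
	return c->really_online != c->recorded_online;
}
```

In this model the record is stale from the real online step until the dynamic
step is reached, and again from the start of unplug until the real offline
step; picking a recording step closer to MODEL_REAL_ONLINE narrows both.
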
>
> Thanks,
>
> tglx
>
> Hint: The requirements for changelogs are well documented in Documentation/process/
>
>
Thomas, thank you for looking at this!
eric