* [PATCHv3] arm64:kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores
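From: Hoeun Ryu @ 2017-08-17 2:24 UTC
To: Catalin Marinas, Will Deacon, James Morse, Mark Rutland, AKASHI Takahiro, Robin Murphy, Ard Biesheuvel, Ingo Molnar, Peter Zijlstra (Intel), Mark Salter, Suzuki K Poulose, David Daney, Rob Herring, Dmitry Torokhov, Thomas Gleixner
Cc: Hoeun Ryu, linux-arm-kernel, linux-kernel
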
Commit 0ee5941 ("x86/panic: replace smp_send_stop() with kdump friendly
version in panic path") introduced crash_smp_send_stop(), a weak function
that architecture code can override, to fix the side effect caused by
commit f06e515 ("kernel/panic.c: add "crash_kexec_post_notifiers"
option").

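For readers unfamiliar with weak symbols, the override mechanism can be
sketched in user space. This is only an illustration, assuming GCC or Clang
on an ELF target; demo_crash_stop_impl() and demo_panic_path() are
hypothetical names, not kernel functions. The kernel's __weak macro expands
to the same attribute.

```c
/*
 * Generic fallback, playing the role of the weak crash_smp_send_stop().
 * Hypothetical name; it only reports which implementation was linked in.
 */
__attribute__((weak)) const char *demo_crash_stop_impl(void)
{
	return "generic";
}

/* Caller that does not know (or care) which definition the linker chose. */
const char *demo_panic_path(void)
{
	return demo_crash_stop_impl();
}
```

If another object file in the same link defines demo_crash_stop_impl()
without the attribute, the linker picks that strong definition instead.
That is how an architecture replaces the generic crash_smp_send_stop().
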
The ARM64 architecture uses the weak version. The problem is that the weak
function simply calls smp_send_stop(), which takes the other CPUs offline
and removes the chance to save crash information for nonpanic CPUs in
machine_crash_shutdown() when the crash_kexec_post_notifiers kernel option
is enabled.

Calling smp_send_crash_stop() in machine_crash_shutdown() is useless in
this case because smp_send_stop() has already taken all nonpanic CPUs
offline and smp_send_crash_stop() only works against online CPUs.

The result is that the secondary CPUs' registers are not saved by
crash_save_cpu() and the vmcore file misreports these CPUs as being
offline.

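The failure can be reproduced in a toy user-space model. This is a sketch
under stated assumptions, not kernel code: MODEL_NR_CPUS, the two arrays
and every model_* function are made up for illustration, and real CPUs are
stopped by IPIs rather than by a loop.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count; CPU 0 is the panic CPU */

/* Toy stand-ins for kernel state, not the kernel's real data structures. */
static bool model_online[MODEL_NR_CPUS];
static bool model_regs_saved[MODEL_NR_CPUS];

/* Models smp_send_stop(): the other CPUs go offline, nothing is saved. */
static void model_smp_send_stop(int self)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (cpu != self)
			model_online[cpu] = false;
}

/* Models the old smp_send_crash_stop(): it only reaches online CPUs. */
static void model_smp_send_crash_stop(int self)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (cpu != self && model_online[cpu])
			model_regs_saved[cpu] = true;
}

/*
 * Runs the pre-patch panic path and returns how many nonpanic CPUs had
 * their registers saved.
 */
int model_old_panic_path(bool post_notifiers)
{
	const int panic_cpu = 0;
	int saved = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_online[cpu] = true;
		model_regs_saved[cpu] = false;
	}

	/* With the option set, panic() reaches the weak function first. */
	if (post_notifiers)
		model_smp_send_stop(panic_cpu);

	/* machine_crash_shutdown() then asks the others to save state. */
	model_smp_send_crash_stop(panic_cpu);

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		saved += model_regs_saved[cpu];

	return saved;
}
```

In this model, model_old_panic_path(false) saves all three nonpanic CPUs,
while model_old_panic_path(true) saves none of them because they are
already offline when the crash stop runs.
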
crash_smp_send_stop() is implemented to fix this problem by replacing the
existing smp_send_crash_stop() and adding a check for multiple calls to
the function. The function (strong symbol version) saves crash information
for nonpanic CPUs, and machine_crash_shutdown() tries to save crash
information for nonpanic CPUs only when the crash_kexec_post_notifiers
kernel option is disabled.

 * crash_kexec_post_notifiers : false

  panic()
    __crash_kexec()
      machine_crash_shutdown()
        crash_smp_send_stop()    <= save crash dump for nonpanic cores

 * crash_kexec_post_notifiers : true

  panic()
    crash_smp_send_stop()        <= save crash dump for nonpanic cores
    __crash_kexec()
      machine_crash_shutdown()
        crash_smp_send_stop()    <= just return.

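The two call sequences above can be checked with a small user-space model
of the guarded function. Again this is only a sketch: the model_* names and
MODEL_NR_CPUS are hypothetical, and the flag lives at file scope here
(rather than as a function-local static, as in the patch) purely so that
each run of the model can reset it.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count; CPU 0 is the panic CPU */

static bool model_regs_saved[MODEL_NR_CPUS];
static int model_cpus_stopped;	/* same role as cpus_stopped in the patch */
static int model_stop_runs;	/* times the body ran past the guard */

/* Models the strong crash_smp_send_stop(): save state, but only once. */
static void model_crash_smp_send_stop(int self)
{
	if (model_cpus_stopped)
		return;

	model_cpus_stopped = 1;
	model_stop_runs++;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (cpu != self)
			model_regs_saved[cpu] = true;
}

/*
 * Runs the patched panic path and returns how many nonpanic CPUs had
 * their registers saved.
 */
int model_new_panic_path(bool post_notifiers)
{
	const int panic_cpu = 0;
	int saved = 0;

	model_cpus_stopped = 0;
	model_stop_runs = 0;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_regs_saved[cpu] = false;

	/* With the option set, panic() calls the function directly. */
	if (post_notifiers)
		model_crash_smp_send_stop(panic_cpu);

	/* machine_crash_shutdown() always calls it; may just return. */
	model_crash_smp_send_stop(panic_cpu);

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		saved += model_regs_saved[cpu];

	return saved;
}

/* Reports how often the last run got past the guard. */
int model_stop_run_count(void)
{
	return model_stop_runs;
}
```

Both model_new_panic_path(false) and model_new_panic_path(true) save the
three nonpanic CPUs, and model_stop_run_count() reports a single pass
through the body in either case.
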
Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
---

v3:
  - fix typos in the commit log.
  - modify commit log about the result of this problem.
  - add Tested-by/Reviewed-by: James Morse.
v2:
  - replace the existing smp_send_crash_stop() with crash_smp_send_stop()
    and add called-twice logic to it.
  - modify the commit message.

 arch/arm64/include/asm/smp.h      |  2 +-
 arch/arm64/kernel/machine_kexec.c |  2 +-
 arch/arm64/kernel/smp.c           | 12 +++++++++++-
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 55f08c5..f82b447 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -148,7 +148,7 @@ static inline void cpu_panic_kernel(void)
  */
 bool cpus_are_stuck_in_kernel(void);
 
-extern void smp_send_crash_stop(void);
+extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 
 #endif /* ifndef __ASSEMBLY__ */
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 481f54a..11121f6 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -252,7 +252,7 @@ void machine_crash_shutdown(struct pt_regs *regs)
 	local_irq_disable();
 
 	/* shutdown non-crashing cpus */
-	smp_send_crash_stop();
+	crash_smp_send_stop();
 
 	/* for crashing cpu */
 	crash_save_cpu(regs, smp_processor_id());
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index dc66e6e..73d8f5e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -977,11 +977,21 @@ void smp_send_stop(void)
 }
 
 #ifdef CONFIG_KEXEC_CORE
-void smp_send_crash_stop(void)
+void crash_smp_send_stop(void)
 {
+	static int cpus_stopped;
 	cpumask_t mask;
 	unsigned long timeout;
 
+	/*
+	 * This function can be called twice in panic path, but obviously
+	 * we execute this only once.
+	 */
+	if (cpus_stopped)
+		return;
+
+	cpus_stopped = 1;
+
 	if (num_online_cpus() == 1)
 		return;
 
--
2.7.4

* Re: [PATCHv3] arm64:kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores
From: Catalin Marinas @ 2017-08-21 16:35 UTC
To: Hoeun Ryu
Cc: Will Deacon, James Morse, Mark Rutland, AKASHI Takahiro, Robin Murphy, Ard Biesheuvel, Ingo Molnar, Peter Zijlstra (Intel), Mark Salter, Suzuki K Poulose, David Daney, Rob Herring, Dmitry Torokhov, Thomas Gleixner, linux-kernel, linux-arm-kernel

On Thu, Aug 17, 2017 at 11:24:27AM +0900, Hoeun Ryu wrote:
> Commit 0ee5941 : (x86/panic: replace smp_send_stop() with kdump friendly
> version in panic path) introduced crash_smp_send_stop() which is a weak
> function and can be overridden by architecture codes to fix the side effect
> caused by commit f06e515 : (kernel/panic.c: add "crash_kexec_post_
> notifiers" option).
>
> ARM64 architecture uses the weak version function and the problem is that
> the weak function simply calls smp_send_stop() which makes other CPUs
> offline and takes away the chance to save crash information for nonpanic
> CPUs in machine_crash_shutdown() when crash_kexec_post_notifiers kernel
> option is enabled.
>
> Calling smp_send_crash_stop() in machine_crash_shutdown() is useless
> because all nonpanic CPUs are already offline by smp_send_stop() in this
> case and smp_send_crash_stop() only works against online CPUs.
>
> The result is that secondary CPUs registers are not saved by
> crash_save_cpu() and the vmcore file misreports these CPUs as being
> offline.
>
> crash_smp_send_stop() is implemented to fix this problem by replacing the
> existing smp_send_crash_stop() and adding a check for multiple calling to
> the function. The function (strong symbol version) saves crash information
> for nonpanic CPUs and machine_crash_shutdown() tries to save crash
> information for nonpanic CPUs only when crash_kexec_post_notifiers kernel
> option is disabled.
>
>  * crash_kexec_post_notifiers : false
>
>   panic()
>     __crash_kexec()
>       machine_crash_shutdown()
>         crash_smp_send_stop()    <= save crash dump for nonpanic cores
>
>  * crash_kexec_post_notifiers : true
>
>   panic()
>     crash_smp_send_stop()        <= save crash dump for nonpanic cores
>     __crash_kexec()
>       machine_crash_shutdown()
>         crash_smp_send_stop()    <= just return.
>
> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
> Reviewed-by: James Morse <james.morse@arm.com>
> Tested-by: James Morse <james.morse@arm.com>

Queued for 4.14. Thanks.

-- 
Catalin