From: Hari Bathini <hbathini@linux.ibm.com>
To: mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org
Cc: Hari Bathini <hbathini@linux.ibm.com>,
sourabhjain@linux.ibm.com, mahesh@linux.ibm.com,
npiggin@gmail.com
Subject: [PATCH] ppc64/fadump: fix inaccurate CPU state info in vmcore generated with panic
Date: Thu, 11 Nov 2021 00:31:43 +0530 [thread overview]
Message-ID: <20211110190143.186346-1-hbathini@linux.ibm.com> (raw)
In panic path, fadump is triggered via a panic notifier function.
Before calling panic notifier functions, smp_send_stop() gets called,
which stops all CPUs except the panic'ing CPU. Commit 8389b37dffdc
("powerpc: stop_this_cpu: remove the cpu from the online map.") and
again commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()")
started marking CPUs as offline while stopping them. So, if a kernel
has either of the above commits, vmcore captured with fadump via panic
path would show only the panic'ing CPU as online. Sample output of
crash-utility with such vmcore:
# crash vmlinux vmcore
...
KERNEL: vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 1
DATE: Wed Nov 10 09:56:34 EST 2021
UPTIME: 00:00:42
LOAD AVERAGE: 2.27, 0.69, 0.24
TASKS: 183
NODENAME: XXXXXXXXX
RELEASE: 5.15.0+
VERSION: #974 SMP Wed Nov 10 04:18:19 CST 2021
MACHINE: ppc64le (2500 Mhz)
MEMORY: 8 GB
PANIC: "Kernel panic - not syncing: sysrq triggered crash"
PID: 3394
COMMAND: "bash"
TASK: c0000000150a5f80 [THREAD_INFO: c0000000150a5f80]
CPU: 1
STATE: TASK_RUNNING (PANIC)
crash> p -x __cpu_online_mask
__cpu_online_mask = $1 = {
bits = {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
}
crash>
crash>
crash> p -x __cpu_active_mask
__cpu_active_mask = $2 = {
bits = {0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
}
crash>
While this has been the case since fadump was introduced, the issue
was not identified for two probable reasons:
- In general, the bulk of the vmcores analyzed were from crash
due to exception.
- The above did change since commit 8341f2f222d7 ("sysrq: Use
panic() to force a crash") started using panic() instead of
deferencing NULL pointer to force a kernel crash. But then
commit de6e5d38417e ("powerpc: smp_send_stop do not offline
stopped CPUs") stopped marking CPUs as offline till kernel
commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()")
reverted that change.
To avoid vmcore from showing only one CPU as online in panic path,
skip marking non panic'ing CPUs as offline while stopping them
during fadump crash.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
---
arch/powerpc/kernel/smp.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index c23ee842c4c3..20555d5d5966 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -61,6 +61,7 @@
#include <asm/cpu_has_feature.h>
#include <asm/ftrace.h>
#include <asm/kup.h>
+#include <asm/fadump.h>
#ifdef DEBUG
#include <asm/udbg.h>
@@ -626,7 +627,8 @@ static void nmi_stop_this_cpu(struct pt_regs *regs)
/*
* IRQs are already hard disabled by the smp_handle_nmi_ipi.
*/
- set_cpu_online(smp_processor_id(), false);
+ if (!(oops_in_progress && should_fadump_crash()))
+ set_cpu_online(smp_processor_id(), false);
spin_begin();
while (1)
@@ -650,7 +652,8 @@ static void stop_this_cpu(void *dummy)
* to know other CPUs are offline before it breaks locks to flush
* printk buffers, in case we panic()ed while holding the lock.
*/
- set_cpu_online(smp_processor_id(), false);
+ if (!(oops_in_progress && should_fadump_crash()))
+ set_cpu_online(smp_processor_id(), false);
spin_begin();
while (1)
--
2.31.1
next reply other threads:[~2021-11-10 19:02 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-10 19:01 Hari Bathini [this message]
2021-11-11 6:14 ` [PATCH] ppc64/fadump: fix inaccurate CPU state info in vmcore generated with panic Michael Ellerman
2021-11-11 12:06 ` Hari Bathini
2021-11-11 13:20 ` Nicholas Piggin
2021-11-17 6:09 ` Hari Bathini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211110190143.186346-1-hbathini@linux.ibm.com \
--to=hbathini@linux.ibm.com \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mahesh@linux.ibm.com \
--cc=mpe@ellerman.id.au \
--cc=npiggin@gmail.com \
--cc=sourabhjain@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).