linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] powernv: Avoid checkstop on HMI and MCE
@ 2017-10-24  9:20 Michael Neuling
  2017-10-25 10:16 ` Michael Ellerman
  2017-10-25 12:56 ` Balbir Singh
  0 siblings, 2 replies; 5+ messages in thread
From: Michael Neuling @ 2017-10-24  9:20 UTC (permalink / raw)
  To: mpe; +Cc: linuxppc-dev, mikey, benh, anton, Mahesh Salgaonkar,
	Vipin K Parashar

On an unrecoverable HMI or MCE, only generate a checkstop (via the
PLATFORM ERROR OPAL reboot call) when panic_on_oops is set.

We currently generate a checkstop in an attempt to get the FSP to grab
a dump and then reboot us. Unfortunately this never works: no one I've
talked to has ever seen a resulting dump, let alone got useful
information from one.

Even worse, the checkstop gets in the way of debugging real
problems. If we hit a software bug that results in an unrecoverable
HMI or MCE, we get no opportunity to debug it live. Similarly, if the
bug is due to hardware that is not covered by the dump (say PCI or an
NVLINK GPU), the dump tells us nothing about that hardware.

So let's skip the checkstop unless someone sets panic_on_oops.

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
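For reviewers, here is a minimal user-space sketch of the decision this
patch adds to both handlers. The enum and function names are
hypothetical and exist only for illustration; the real code calls
die() or pnv_platform_error_reboot() directly, as in the diff.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not kernel identifiers. */
enum error_action {
	ACTION_DIE_SIGBUS,      /* oops via die(); no checkstop is raised */
	ACTION_PLATFORM_REBOOT, /* checkstop via the PLATFORM ERROR OPAL reboot call */
};

/*
 * Models the check added for an unrecoverable HMI or MCE:
 * only take the checkstop path when panic_on_oops is set.
 */
enum error_action unrecoverable_error_action(bool panic_on_oops)
{
	if (!panic_on_oops)
		return ACTION_DIE_SIGBUS;

	return ACTION_PLATFORM_REBOOT;
}
```

With panic_on_oops clear the sketch picks the die() path, which is
the case where we want a chance to debug the machine live.
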
 arch/powerpc/platforms/powernv/opal-hmi.c | 6 ++++++
 arch/powerpc/platforms/powernv/opal.c     | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-hmi.c b/arch/powerpc/platforms/powernv/opal-hmi.c
index c9e1a4ff29..23780970d0 100644
--- a/arch/powerpc/platforms/powernv/opal-hmi.c
+++ b/arch/powerpc/platforms/powernv/opal-hmi.c
@@ -29,6 +29,7 @@
 #include <asm/opal.h>
 #include <asm/cputable.h>
 #include <asm/machdep.h>
+#include <asm/bug.h>
 
 #include "powernv.h"
 
@@ -284,6 +285,11 @@ static void hmi_event_handler(struct work_struct *work)
 			print_hmi_event_info(hmi_evt);
 		}
 
+		if (!panic_on_oops) {
+			die("Unrecoverable HMI exception", NULL, SIGBUS);
+			return;
+		}
+
 		pnv_platform_error_reboot(NULL, "Unrecoverable HMI exception");
 	}
 }
diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index 65c79ecf5a..f208f222e4 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -523,6 +523,10 @@ int opal_machine_check(struct pt_regs *regs)
 	if (opal_recover_mce(regs, &evt))
 		return 1;
 
+	if (!panic_on_oops) {
+		die("Unrecoverable Machine check", regs, SIGBUS);
+		return 0;
+	}
 	pnv_platform_error_reboot(regs, "Unrecoverable Machine Check exception");
 }
 
-- 
2.14.1

Thread overview: 5+ messages
2017-10-24  9:20 [PATCH] powernv: Avoid checkstop on HMI and MCE Michael Neuling
2017-10-25 10:16 ` Michael Ellerman
2017-10-25 10:59   ` Michael Neuling
2017-10-25 11:24     ` Nicholas Piggin
2017-10-25 12:56 ` Balbir Singh