* [PATCH] sparc64: Normalize NMI watchdog logging and behavior.
@ 2014-05-04  5:27 David Miller
  2014-05-04  7:04 ` Sam Ravnborg
  2014-05-04 18:24 ` David Miller
From: David Miller @ 2014-05-04  5:27 UTC
  To: sparclinux


Bring this code in line with the perf based generic NMI watchdog
in kernel/watchdog.c (which we should convert over to at some
point).

In particular, don't do anything super fancy when the watchdog
triggers, and specifically don't do a do_exit() which only makes
things worse.

Either panic() or WARN().  The latter performs all of the usual
actions, such as giving us a stack backtrace.

Signed-off-by: David S. Miller <davem@davemloft.net>
---

I noticed this while trying to debug various kinds of hangs I can
trigger in 3.15.  Hopefully these adjustments make debugging easier
for other people as well.

Committed to 'sparc' GIT.

 arch/sparc/kernel/nmi.c | 21 +++++----------------
 1 file changed, 5 insertions(+), 16 deletions(-)

diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
index 6479256..3370945 100644
--- a/arch/sparc/kernel/nmi.c
+++ b/arch/sparc/kernel/nmi.c
@@ -68,27 +68,16 @@ EXPORT_SYMBOL(touch_nmi_watchdog);
 
 static void die_nmi(const char *str, struct pt_regs *regs, int do_panic)
 {
+	int this_cpu = smp_processor_id();
+
 	if (notify_die(DIE_NMIWATCHDOG, str, regs, 0,
 		       pt_regs_trap_type(regs), SIGINT) == NOTIFY_STOP)
 		return;
 
-	console_verbose();
-	bust_spinlocks(1);
-
-	printk(KERN_EMERG "%s", str);
-	printk(" on CPU%d, ip %08lx, registers:\n",
-	       smp_processor_id(), regs->tpc);
-	show_regs(regs);
-	dump_stack();
-
-	bust_spinlocks(0);
-
 	if (do_panic || panic_on_oops)
-		panic("Non maskable interrupt");
-
-	nmi_exit();
-	local_irq_enable();
-	do_exit(SIGBUS);
+		panic("Watchdog detected hard LOCKUP on cpu %d", this_cpu);
+	else
+		WARN(1, "Watchdog detected hard LOCKUP on cpu %d", this_cpu);
 }
 
 notrace __kprobes void perfctr_irq(int irq, struct pt_regs *regs)
-- 
1.8.1.2
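
For readers following the thread: the control flow that die_nmi() is left
with after this patch can be sketched in plain userspace C.  This is only
an illustration, not kernel code.  lockup_policy() and the lockup_action
enum are hypothetical names, and the notify_die() result is reduced to a
boolean.

```c
#include <stdbool.h>

/* Hypothetical names for this userspace sketch; none of this is kernel API. */
enum lockup_action { LOCKUP_IGNORED, LOCKUP_WARN, LOCKUP_PANIC };

/*
 * Mirror the decision die_nmi() makes after the patch:
 *  - a notifier that returned NOTIFY_STOP swallows the event,
 *  - otherwise panic if either do_panic or panic_on_oops is set,
 *  - otherwise only warn (with a backtrace in the kernel) and keep running.
 */
enum lockup_action lockup_policy(bool notifier_stopped, bool do_panic,
				 bool panic_on_oops)
{
	if (notifier_stopped)
		return LOCKUP_IGNORED;
	if (do_panic || panic_on_oops)
		return LOCKUP_PANIC;
	return LOCKUP_WARN;
}
```

The point of the change is that the handler no longer tries to do_exit()
the interrupted task; it either stops the machine or logs a warning and
carries on.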



* Re: [PATCH] sparc64: Normalize NMI watchdog logging and behavior.
  2014-05-04  5:27 [PATCH] sparc64: Normalize NMI watchdog logging and behavior David Miller
@ 2014-05-04  7:04 ` Sam Ravnborg
  2014-05-04 18:24 ` David Miller
From: Sam Ravnborg @ 2014-05-04  7:04 UTC
  To: sparclinux

On Sun, May 04, 2014 at 01:27:10AM -0400, David Miller wrote:
> 
> Bring this code in line with the perf based generic NMI watchdog
> in kernel/watchdog.c (which we should convert over to at some
> point).
> 
> In particular, don't do anything super fancy when the watchdog
> triggers, and specifically don't do a do_exit() which only makes
> things worse.

It is always good when we can use more of the generic functionality.
Should we do something remotely similar for sparc32?

It looks like the sun4m_nmi() function is also used for sun4d + leon,
but I need to look again to make sure.

	Sam

Something like this:

diff --git a/arch/sparc/kernel/sun4m_irq.c b/arch/sparc/kernel/sun4m_irq.c
index 8bb3b3f..7c8ad6f 100644
--- a/arch/sparc/kernel/sun4m_irq.c
+++ b/arch/sparc/kernel/sun4m_irq.c
@@ -308,28 +308,28 @@ static void sun4m_clear_clock_irq(void)
 void sun4m_nmi(struct pt_regs *regs)
 {
 	unsigned long afsr, afar, si;
+	char *reason = "unknown";
 
-	printk(KERN_ERR "Aieee: sun4m NMI received!\n");
 	/* XXX HyperSparc hack XXX */
 	__asm__ __volatile__("mov 0x500, %%g1\n\t"
 			     "lda [%%g1] 0x4, %0\n\t"
 			     "mov 0x600, %%g1\n\t"
 			     "lda [%%g1] 0x4, %1\n\t" :
 			     "=r" (afsr), "=r" (afar));
-	printk(KERN_ERR "afsr=%08lx afar=%08lx\n", afsr, afar);
+
 	si = sbus_readl(&sun4m_irq_global->pending);
 	printk(KERN_ERR "si=%08lx\n", si);
 	if (si & SUN4M_INT_MODULE_ERR)
-		printk(KERN_ERR "Module async error\n");
+		reason = "Module async error";
 	if (si & SUN4M_INT_M2S_WRITE_ERR)
-		printk(KERN_ERR "MBus/SBus async error\n");
+		reason = "MBus/SBus async error";
 	if (si & SUN4M_INT_ECC_ERR)
-		printk(KERN_ERR "ECC memory error\n");
+		reason = "ECC memory error";
 	if (si & SUN4M_INT_VME_ERR)
-		printk(KERN_ERR "VME async error\n");
-	printk(KERN_ERR "you lose buddy boy...\n");
-	show_regs(regs);
-	prom_halt();
+		reason = "VME async error";
+
+	panic("sun4m NMI received (%s), afsr=%08lx afar=%08lx\n",
+	      reason, afsr, afar);
 }
 
 void sun4m_unmask_profile_irq(void)
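
One property of the chain of if statements in the proposed sun4m_nmi()
change is worth spelling out: when several bits are pending, each match
overwrites reason, so only the last one reaches the panic message.  Below
is a minimal userspace sketch of that behavior, next to a variant that
keeps every match.  The EX_INT_* values are placeholders, not the real
SUN4M_INT_* masks, and both function names are made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Placeholder bit values for illustration; NOT the real SUN4M_INT_* masks. */
#define EX_INT_MODULE_ERR	0x8ul
#define EX_INT_M2S_WRITE_ERR	0x4ul
#define EX_INT_ECC_ERR		0x2ul
#define EX_INT_VME_ERR		0x1ul

/* Same shape as the proposed code: a later match overwrites an earlier one. */
const char *nmi_reason_last(unsigned long si)
{
	const char *reason = "unknown";

	if (si & EX_INT_MODULE_ERR)
		reason = "Module async error";
	if (si & EX_INT_M2S_WRITE_ERR)
		reason = "MBus/SBus async error";
	if (si & EX_INT_ECC_ERR)
		reason = "ECC memory error";
	if (si & EX_INT_VME_ERR)
		reason = "VME async error";
	return reason;
}

/*
 * Variant that appends every matching reason, comma separated.
 * Returns the number of reasons that fit completely into buf.
 */
int nmi_reasons_all(unsigned long si, char *buf, size_t len)
{
	static const struct { unsigned long bit; const char *name; } tbl[] = {
		{ EX_INT_MODULE_ERR,	"Module async error" },
		{ EX_INT_M2S_WRITE_ERR,	"MBus/SBus async error" },
		{ EX_INT_ECC_ERR,	"ECC memory error" },
		{ EX_INT_VME_ERR,	"VME async error" },
	};
	size_t used = 0;
	int count = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
		int n;

		if (!(si & tbl[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     count ? ", " : "", tbl[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* did not fit; stop counting here */
		used += (size_t)n;
		count++;
	}
	return count;
}
```

With both the module and ECC bits set, nmi_reason_last() reports only the
ECC error, while nmi_reasons_all() keeps both.  Whether that matters for
real sun4m hardware is an open question this sketch does not answer.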


* Re: [PATCH] sparc64: Normalize NMI watchdog logging and behavior.
  2014-05-04  5:27 [PATCH] sparc64: Normalize NMI watchdog logging and behavior David Miller
  2014-05-04  7:04 ` Sam Ravnborg
@ 2014-05-04 18:24 ` David Miller
From: David Miller @ 2014-05-04 18:24 UTC
  To: sparclinux

From: Sam Ravnborg <sam@ravnborg.org>
Date: Sun, 4 May 2014 09:04:37 +0200

> On Sun, May 04, 2014 at 01:27:10AM -0400, David Miller wrote:
>> 
>> Bring this code in line with the perf based generic NMI watchdog
>> in kernel/watchdog.c (which we should convert over to at some
>> point).
>> 
>> In particular, don't do anything super fancy when the watchdog
>> triggers, and specifically don't do a do_exit() which only makes
>> things worse.
> 
> It is always good when we can use more of the generic functionality.
> Should we do something remotely similar for sparc32?
> 
> It looks like the sun4m_nmi() function is also used for sun4d + leon,
> but I need to look again to make sure.

The sun4m NMI function is just for hard asynchronous errors, rather
than a periodic event generated by perf counters.

So it serves a different purpose, but it could use some cleanups
nonetheless.  I wrote that code when I was a coding cowboy of
sorts :-)
