Linux IA64 platform development
* [patch] Increase severity of MCA recovery messages
@ 2006-02-27 20:12 Russ Anderson
  2006-02-27 21:40 ` Luck, Tony
  2006-02-27 22:15 ` Russ Anderson
  0 siblings, 2 replies; 3+ messages in thread
From: Russ Anderson @ 2006-02-27 20:12 UTC
  To: linux-ia64

[patch] Increase severity of MCA recovery messages

The MCA recovery messages are currently KERN_DEBUG,
so they don't show up in /var/log/messages (by default).
Increase the severity to KERN_CRIT, which is the 
severity used when the kernel kills out of memory 
processes.

Signed-off-by: Russ Anderson (rja@sgi.com)

---
 arch/ia64/kernel/mca_drv.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

Index: test/arch/ia64/kernel/mca_drv.c
===================================================================
--- test.orig/arch/ia64/kernel/mca_drv.c	2006-02-24 14:01:34.000000000 -0600
+++ test/arch/ia64/kernel/mca_drv.c	2006-02-27 13:49:21.476589856 -0600
@@ -124,7 +124,7 @@ mca_page_isolate(unsigned long paddr)
 void
 mca_handler_bh(unsigned long paddr, void *iip, unsigned long ipsr)
 {
-	printk(KERN_DEBUG "OS_MCA: process [cpu %d, pid: %d, uid: %d, "
+	printk(KERN_CRIT "OS_MCA: process [cpu %d, pid: %d, uid: %d, "
 		"iip: %p, psr: 0x%lx,paddr: 0x%lx](%s) encounters MCA.\n",
 		raw_smp_processor_id(), current->pid, current->uid,
 		iip, ipsr, paddr, current->comm);
@@ -132,10 +132,10 @@ mca_handler_bh(unsigned long paddr, void
 	spin_lock(&mca_bh_lock);
 	switch (mca_page_isolate(paddr)) {
 	case ISOLATE_OK:
-		printk(KERN_DEBUG "Page isolation: ( %lx ) success.\n", paddr);
+		printk(KERN_CRIT "Page isolation: ( %lx ) success.\n", paddr);
 		break;
 	case ISOLATE_NG:
-		printk(KERN_DEBUG "Page isolation: ( %lx ) failure.\n", paddr);
+		printk(KERN_CRIT "Page isolation: ( %lx ) failure.\n", paddr);
 		break;
 	default:
 		break;

* RE: [patch] Increase severity of MCA recovery messages
  2006-02-27 20:12 [patch] Increase severity of MCA recovery messages Russ Anderson
@ 2006-02-27 21:40 ` Luck, Tony
  2006-02-27 22:15 ` Russ Anderson
  1 sibling, 0 replies; 3+ messages in thread
From: Luck, Tony @ 2006-02-27 21:40 UTC
  To: linux-ia64

> The MCA recovery messages are currently KERN_DEBUG,
> so they don't show up in /var/log/messages (by default).
> Increase the severity to KERN_CRIT, which is the 
> severity used when the kernel kills out of memory 
> processes.

-	printk(KERN_DEBUG "OS_MCA: process [cpu %d, pid: %d, uid: %d, "
+	printk(KERN_CRIT "OS_MCA: process [cpu %d, pid: %d, uid: %d, "

This one definitely needs a much bigger severity than DEBUG ... but is
it really as high as CRIT?  The whole point of the error recovery code
is that we (the system) do in fact recover (though at the expense of
killing a process).  Perhaps KERN_ERR?  But I'd like to hear opinions
on this.

-		printk(KERN_DEBUG "Page isolation: ( %lx ) success.\n", paddr);
+		printk(KERN_CRIT "Page isolation: ( %lx ) success.\n", paddr);

-		printk(KERN_DEBUG "Page isolation: ( %lx ) failure.\n", paddr);
+		printk(KERN_CRIT "Page isolation: ( %lx ) failure.\n", paddr);

But these ones ... I'm not so sure about.  We have already printed the first
message ... and don't take any different action whether we succeed or fail
at isolating the page.  Perhaps failure to isolate is a big problem, but
successfully isolating isn't?  Though getting the physical address logged
would seem to be pretty useful (maybe it should be in the first printk?)

-Tony

* Re: [patch] Increase severity of MCA recovery messages
  2006-02-27 20:12 [patch] Increase severity of MCA recovery messages Russ Anderson
  2006-02-27 21:40 ` Luck, Tony
@ 2006-02-27 22:15 ` Russ Anderson
  1 sibling, 0 replies; 3+ messages in thread
From: Russ Anderson @ 2006-02-27 22:15 UTC
  To: linux-ia64

Tony Luck wrote:
>
> > The MCA recovery messages are currently KERN_DEBUG,
> > so they don't show up in /var/log/messages (by default).
> > Increase the severity to KERN_CRIT, which is the
> > severity used when the kernel kills out of memory
> > processes.
> 
> -	printk(KERN_DEBUG "OS_MCA: process [cpu %d, pid: %d, uid: %d, "
> +	printk(KERN_CRIT "OS_MCA: process [cpu %d, pid: %d, uid: %d, "
> 
> This one definitely needs a much bigger severity than DEBUG ... but is
> it really as high as CRIT?  The whole point of the error recovery code
> is that we (the system) do in fact recover (though at the expense of
> killing a process).  Perhaps KERN_ERR?  But I'd like to hear opinions
> on this.

In side discussions, the argument for KERN_CRIT was that a process is
killed.

The argument for KERN_ERR is that it is a hardware error, even though
the system did not crash.

The argument for KERN_WARNING is that recovery is "normal".

A closer look at __oom_kill_task() shows it uses KERN_ERR when killing
a process.

> -		printk(KERN_DEBUG "Page isolation: ( %lx ) success.\n", paddr);
> +		printk(KERN_CRIT "Page isolation: ( %lx ) success.\n", paddr);
> 
> -		printk(KERN_DEBUG "Page isolation: ( %lx ) failure.\n", paddr);
> +		printk(KERN_CRIT "Page isolation: ( %lx ) failure.\n", paddr);
> 
> But these ones ... I'm not so sure about.  We have already printed the first
> message ... and don't take any different action whether we succeed or fail
> at isolating the page.  Perhaps failure to isolate is a big problem, but
> successfully isolating isn't?  Though getting the physical address logged
> would seem to be pretty useful (maybe it should be in the first printk?)

The success message is clearly lower priority than the failure.
The failure message is a big problem because you will go down,
though the failure to recover is not the root cause.

My opinion is the first should be at least KERN_ERR, following
the example of __oom_kill_task().  The success message should
be KERN_WARNING and the failure message KERN_CRIT because the
system will be going down.
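
For illustration only, the severity split proposed above could be sketched
in userspace as follows. This is not the revised patch. The level strings
are redefined locally with the "<N>" values that 2.6-era <linux/kernel.h>
uses, the warning level is spelled KERN_WARNING (the actual macro name),
and the enum values and the helpers isolation_level(), msg_level() and
format_isolation_msg() are hypothetical names invented for this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Level prefixes with the values 2.6-era <linux/kernel.h> gives them,
 * redefined here only so the sketch builds outside the kernel. */
#define KERN_CRIT    "<2>"
#define KERN_ERR     "<3>"
#define KERN_WARNING "<4>"

/* Hypothetical stand-in for the result of mca_page_isolate();
 * the numeric values are an assumption of this sketch. */
enum isolate_status { ISOLATE_NG = 0, ISOLATE_OK = 1 };

/* Proposed mapping: success is a warning, failure is critical
 * because the system is expected to go down. */
static const char *isolation_level(enum isolate_status st)
{
	switch (st) {
	case ISOLATE_OK:
		return KERN_WARNING;
	case ISOLATE_NG:
		return KERN_CRIT;
	}
	return "";
}

/* Return the numeric severity encoded in a "<N>" prefix, or -1 if
 * the message carries no level prefix. Lower numbers are more severe. */
static int msg_level(const char *msg)
{
	if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
		return msg[1] - '0';
	return -1;
}

/* Build the page isolation message the way printk() would see it:
 * level prefix first, then the text from the existing messages. */
static int format_isolation_msg(char *buf, size_t len,
				enum isolate_status st, unsigned long paddr)
{
	return snprintf(buf, len, "%sPage isolation: ( %lx ) %s.\n",
			isolation_level(st), paddr,
			st == ISOLATE_OK ? "success" : "failure");
}
```

Under the same assumptions, the first message in mca_handler_bh() would
simply take KERN_ERR in place of KERN_DEBUG, so msg_level() on it would
report 3, one step less severe than the isolation failure at 2.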



-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@sgi.com
