* [PATCH] oops_in_progress on MCA/INIT
@ 2006-07-11 19:12 Russ Anderson
2006-07-18 2:54 ` Hidetoshi Seto
0 siblings, 1 reply; 2+ messages in thread
From: Russ Anderson @ 2006-07-11 19:12 UTC (permalink / raw)
To: linux-ia64
Keith Owens wrote:
>
> The existing 'oops_in_progress' code is working pretty well. It does
> leave nasty bits behind if the MCA is recoverable, but that problem is
> not bad enough to justify a completely separate print mechanism plus
> changes to external programs. Instead we should fix the unwanted side
> effects of oops_in_progress.
One problem is that oops_in_progress gets set in MCA/INIT but
does not get cleared if the MCA is recovered (or after the INIT
stack trace prints). The result is that subsequent messages do
not get to /var/log/messages, due to release_console_sem() not
waking up klogd. Thanks to Keith Owens for his analysis of
this problem.
This patch does not address the larger issue of printing from
MCA/INIT context.
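For readers following along, here is a minimal userspace sketch of the symptom described above. It is not the kernel code: model_release_console_sem(), model_recovered_mca() and the klogd_wakeups counter are hypothetical names invented for illustration, and the only behaviour modelled is that the klogd wakeup is skipped while oops_in_progress is set.

```c
#include <stdbool.h>

/* Toy stand-ins for kernel state. Userspace model only. */
int oops_in_progress;
int klogd_wakeups;	/* hypothetical counter: times klogd would be woken */

/*
 * Models only the last step of release_console_sem(): the wakeup of
 * klogd is skipped whenever oops_in_progress is non-zero.
 */
void model_release_console_sem(bool new_messages)
{
	if (new_messages && !oops_in_progress)
		klogd_wakeups++;
}

/*
 * Models a recovered MCA. clear_flag selects the behaviour without
 * (false) or with (true) the clearing that this patch adds.
 */
void model_recovered_mca(bool clear_flag)
{
	oops_in_progress = 1;
	model_release_console_sem(true);	/* printk in the handler: no wakeup */
	if (clear_flag)
		oops_in_progress = 0;
}
```

With clear_flag false, every later call to model_release_console_sem() leaves klogd_wakeups unchanged, which corresponds to messages never reaching /var/log/messages. With clear_flag true, the next call wakes klogd again.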
Signed-off-by: Russ Anderson (rja@sgi.com)
---------------------------------------------------------
---
arch/ia64/kernel/mca.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
Index: test/arch/ia64/kernel/mca.c
===================================================================
--- test.orig/arch/ia64/kernel/mca.c 2006-06-22 16:37:35.000000000 -0500
+++ test/arch/ia64/kernel/mca.c 2006-07-11 13:17:39.765023019 -0500
@@ -132,6 +132,17 @@ static int cmc_polling_enabled = 1;
* necessary for debugging.
*/
static int cpe_poll_enabled = 1;
+static int loglevel_save = -1;
+
+#define SAVE_LOGLEVEL(__console_loglevel) \
+ if (loglevel_save < 0) \
+ loglevel_save = __console_loglevel
+
+#define RESTORE_LOGLEVEL(__console_loglevel) \
+ if (loglevel_save >= 0) { \
+ __console_loglevel = loglevel_save; \
+ loglevel_save = -1; \
+ }
extern void salinfo_log_wakeup(int type, u8 *buffer, u64 size, int irqsafe);
@@ -1028,6 +1039,7 @@ ia64_mca_handler(struct pt_regs *regs, s
struct ia64_mca_notify_die nd =
{ .sos = sos, .monarch_cpu = &monarch_cpu };
+ SAVE_LOGLEVEL(console_loglevel);
oops_in_progress = 1; /* FIXME: make printk NMI/MCA/INIT safe */
console_loglevel = 15; /* make sure printks make it to console */
printk(KERN_INFO "Entered OS MCA handler. PSP=%lx cpu=%d monarch=%ld\n",
@@ -1067,6 +1079,8 @@ ia64_mca_handler(struct pt_regs *regs, s
rh->severity = sal_log_severity_corrected;
ia64_sal_clear_state_info(SAL_INFO_TYPE_MCA);
sos->os_status = IA64_MCA_CORRECTED;
+ RESTORE_LOGLEVEL(console_loglevel);
+ oops_in_progress = 0;
}
if (notify_die(DIE_MCA_MONARCH_LEAVE, "MCA", regs, (long)&nd, 0, recover)
== NOTIFY_STOP)
@@ -1358,6 +1372,7 @@ ia64_init_handler(struct pt_regs *regs,
struct ia64_mca_notify_die nd =
{ .sos = sos, .monarch_cpu = &monarch_cpu };
+ SAVE_LOGLEVEL(console_loglevel);
oops_in_progress = 1; /* FIXME: make printk NMI/MCA/INIT safe */
console_loglevel = 15; /* make sure printks make it to console */
@@ -1442,6 +1457,8 @@ ia64_init_handler(struct pt_regs *regs,
ia64_mca_spin(__FUNCTION__);
printk("\nINIT dump complete. Monarch on cpu %d returning to normal service.\n", cpu);
atomic_dec(&monarchs);
+ RESTORE_LOGLEVEL(console_loglevel);
+ oops_in_progress = 0;
set_curr_task(cpu, previous_current);
monarch_cpu = -1;
return;
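As a side note on the save/restore pair used in the patch, the following is a userspace sketch of the same idea, not a replacement for the patch. It wraps each macro in do { } while (0) so that it behaves as a single statement after an unbraced if/else, which is an assumption about how one might harden it. The stand-in console_loglevel variable and the model_handler_enter()/model_handler_exit() helpers are hypothetical. The guard on loglevel_save presumably exists so that a nested entry (for example INIT arriving during MCA handling) does not overwrite the saved level with 15.

```c
/* -1 means "nothing saved yet". */
int loglevel_save = -1;

/* Stand-in for the kernel variable of the same name. */
int console_loglevel = 7;

/* Save the level only on the first (outermost) entry. */
#define SAVE_LOGLEVEL(level)				\
	do {						\
		if (loglevel_save < 0)			\
			loglevel_save = (level);	\
	} while (0)

/* Restore the saved level once, then mark the slot empty. */
#define RESTORE_LOGLEVEL(level)				\
	do {						\
		if (loglevel_save >= 0) {		\
			(level) = loglevel_save;	\
			loglevel_save = -1;		\
		}					\
	} while (0)

/* Hypothetical helper: what a handler does on entry. */
void model_handler_enter(void)
{
	SAVE_LOGLEVEL(console_loglevel);
	console_loglevel = 15;
}

/* Hypothetical helper: what a handler does before returning. */
void model_handler_exit(void)
{
	RESTORE_LOGLEVEL(console_loglevel);
}
```

In this model, two nested entries followed by two exits leave console_loglevel at its original value. Note that the first exit already restores the level, so the outer handler keeps running at the restored level rather than at 15.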
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
* Re: [PATCH] oops_in_progress on MCA/INIT
2006-07-11 19:12 [PATCH] oops_in_progress on MCA/INIT Russ Anderson
@ 2006-07-18 2:54 ` Hidetoshi Seto
0 siblings, 0 replies; 2+ messages in thread
From: Hidetoshi Seto @ 2006-07-18 2:54 UTC (permalink / raw)
To: linux-ia64
Russ Anderson wrote:
> Keith Owens wrote:
>> The existing 'oops_in_progress' code is working pretty well. It does
>> leave nasty bits behind if the MCA is recoverable, but that problem is
>> not bad enough to justify a completely separate print mechanism plus
>> changes to external programs. Instead we should fix the unwanted side
>> effects of oops_in_progress.
>
> One problem is that oops_in_progress gets set in MCA/INIT but
> does not get cleared if the MCA is recovered (or after the INIT
> stack trace prints). The result is that subsequent messages do
> not get to /var/log/messages, due to release_console_sem() not
> waking up klogd. Thanks to Keith Owens for his analysis of
> this problem.
>
> This patch does not address the larger issue of printing from
> MCA/INIT context.
There are still larger issues...
Here is the related code from kernel/printk.c (2.6.17):
static void zap_locks(void)
{
	static unsigned long oops_timestamp;

	if (time_after_eq(jiffies, oops_timestamp) &&
	    !time_after(jiffies, oops_timestamp + 30 * HZ))
		return;

	oops_timestamp = jiffies;

	/* If a crash is occurring, make sure we can't deadlock */
	spin_lock_init(&logbuf_lock);
	/* And make sure that we print immediately */
	init_MUTEX(&console_sem);
}

[...]

asmlinkage int vprintk(const char *fmt, va_list args)
{
	unsigned long flags;
	int printed_len;
	char *p;
	static char printk_buf[1024];
	static int log_level_unknown = 1;

	preempt_disable();
	if (unlikely(oops_in_progress) && printk_cpu == smp_processor_id())
		/* If a crash is occurring during printk() on this CPU,
		 * make sure we can't deadlock */
		zap_locks();

	/* This stops the holder of console_sem just where we want him */
	spin_lock_irqsave(&logbuf_lock, flags);
	printk_cpu = smp_processor_id();
It seems that at least two problems are not solved yet.
- zap_locks() re-initializes console_sem, but it doesn't wake up waiters.
- It allows two holders of logbuf_lock to exist if the interrupted
original holder resumes after spin_lock_init(&logbuf_lock).
You'll see mixed messages like: inrterecruovepteredd
These larger issues are more critical and need to be solved before
returning from the MCA/INIT handlers saying "recovered".
They also apply regardless of whether the kernel is really in the
middle of an oops.
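To make the two points above concrete, here is a small single-threaded userspace model. It is only a sketch: toy_lock, toy_sem, holders_after_zap() and interleave() are hypothetical names and have nothing to do with the real spinlock_t or struct semaphore implementations. It only shows the bookkeeping consequence of re-initializing a lock that somebody still holds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy lock; the real logbuf_lock is a spinlock_t. */
struct toy_lock { bool held; };

void toy_lock_init(struct toy_lock *l) { l->held = false; }

bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

/* Hypothetical toy semaphore: a count plus the number of sleeping waiters. */
struct toy_sem { int count; int sleepers; };

/* Models the re-initialization: the count is reset, nobody is woken. */
void toy_sem_reinit(struct toy_sem *s) { s->count = 1; }

/* Returns how many contexts believe they own the lock after a zap. */
int holders_after_zap(void)
{
	struct toy_lock logbuf;
	int holders = 0;

	toy_lock_init(&logbuf);
	if (toy_trylock(&logbuf))	/* context A takes it, then is interrupted */
		holders++;
	toy_lock_init(&logbuf);		/* zap: lock forcibly re-initialized */
	if (toy_trylock(&logbuf))	/* the handler's printk takes it too */
		holders++;
	return holders;			/* A resumes still believing it holds it */
}

/* Alternates characters from two writers, as two unserialized holders might. */
void interleave(const char *a, const char *b, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz == 0)
		return;
	while ((*a || *b) && n + 1 < outsz) {
		if (*a)
			out[n++] = *a++;
		if (*b && n + 1 < outsz)
			out[n++] = *b++;
	}
	out[n] = '\0';
}
```

In this model holders_after_zap() reports two holders, and toy_sem_reinit() leaves the sleepers field untouched, i.e. the waiters stay asleep. The strict alternation in interleave() is just one possible ordering; real output depends on timing.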
H.Seto