* [RFC][PATCH] introduce panic_gently
@ 2007-07-05 23:07 Bodo Eggert
2007-07-06 7:57 ` Clemens Koller
2007-07-06 12:05 ` Andi Kleen
0 siblings, 2 replies; 8+ messages in thread
From: Bodo Eggert @ 2007-07-05 23:07 UTC
To: linux-kernel
If the boot process fails to find init or the root fs, the cause has
usually scrolled off the screen, and because of the panic, it can't be
reached anymore.
This patch introduces panic_gently, which allows using the scrollback
buffer and rebooting, but it can't be called from an unsafe context.
Signed-Off-By: Bodo Eggert <7eggert@gmx.de>
---
This patch seems to work correctly on bochs/i386, except for the qemu
BIOS hanging after a ctrl_alt_del, but I did run qemu using -kernel and
-initrd, which might have caused this behaviour.
Is this function useful outside init code?
Should it be disabled on non-console systems/archs?
diff -X dontdiff -pruN 2.6.21.ori/include/linux/kernel.h 2.6.21/include/linux/kernel.h
--- 2.6.21.ori/include/linux/kernel.h 2007-07-06 00:13:03.000000000 +0200
+++ 2.6.21/include/linux/kernel.h 2007-07-05 23:35:46.000000000 +0200
@@ -96,6 +96,8 @@ extern struct atomic_notifier_head panic
extern long (*panic_blink)(long time);
NORET_TYPE void panic(const char * fmt, ...)
__attribute__ ((NORET_AND format (printf, 1, 2)));
+NORET_TYPE void panic_gently(const char * fmt, ...)
+ __attribute__ ((NORET_AND format (printf, 1, 2)));
extern void oops_enter(void);
extern void oops_exit(void);
extern int oops_may_print(void);
diff -X dontdiff -pruN 2.6.21.ori/init/do_mounts.c 2.6.21/init/do_mounts.c
--- 2.6.21.ori/init/do_mounts.c 2006-11-29 22:57:37.000000000 +0100
+++ 2.6.21/init/do_mounts.c 2007-07-05 23:55:35.000000000 +0200
@@ -315,7 +315,7 @@ retry:
root_device_name, b);
printk("Please append a correct \"root=\" boot option\n");
- panic("VFS: Unable to mount root fs on %s", b);
+ panic_gently("VFS: Unable to mount root fs on %s", b);
}
printk("No filesystem could mount root, tried: ");
@@ -325,7 +325,7 @@ retry:
#ifdef CONFIG_BLOCK
__bdevname(ROOT_DEV, b);
#endif
- panic("VFS: Unable to mount root fs on %s", b);
+ panic_gently("VFS: Unable to mount root fs on %s", b);
out:
putname(fs_names);
}
diff -X dontdiff -pruN 2.6.21.ori/init/main.c 2.6.21/init/main.c
--- 2.6.21.ori/init/main.c 2007-07-06 00:13:03.000000000 +0200
+++ 2.6.21/init/main.c 2007-07-05 23:43:15.000000000 +0200
@@ -579,7 +579,7 @@ asmlinkage void __init start_kernel(void
*/
console_init();
if (panic_later)
- panic(panic_later, panic_param);
+ panic_gently(panic_later, panic_param);
lockdep_info();
@@ -769,7 +769,7 @@ static int noinline init_post(void)
run_init_process("/bin/init");
run_init_process("/bin/sh");
- panic("No init found. Try passing init= option to kernel.");
+ panic_gently("No init found. Try passing init= option to kernel.");
}
static int __init init(void * unused)
diff -X dontdiff -pruN 2.6.21.ori/kernel/panic.c 2.6.21/kernel/panic.c
--- 2.6.21.ori/kernel/panic.c 2007-07-06 00:13:03.000000000 +0200
+++ 2.6.21/kernel/panic.c 2007-07-05 23:48:28.000000000 +0200
@@ -139,7 +139,64 @@ NORET_TYPE void panic(const char * fmt,
}
}
+NORET_TYPE void panic_gently(const char * fmt, ...)
+{
+ long i;
+ static char buf[1024];
+ va_list args;
+#if defined(CONFIG_S390)
+ unsigned long caller = (unsigned long) __builtin_return_address(0);
+#endif
+
+ va_start(args, fmt);
+ vsnprintf(buf, sizeof(buf), fmt, args);
+ va_end(args);
+ printk(KERN_EMERG "Kernel panic - not syncing: %s\n",buf);
+
+ atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
+
+ if (!panic_blink)
+ panic_blink = no_blink;
+
+ if (panic_timeout > 0) {
+ /*
+ * Delay timeout seconds before rebooting the machine.
+ * We can't use the "normal" timers since we just panicked..
+ */
+ printk(KERN_EMERG "Rebooting in %d seconds..",panic_timeout);
+ for (i = 0; i < panic_timeout*1000; ) {
+ touch_nmi_watchdog();
+ i += panic_blink(i);
+ mdelay(1);
+ i++;
+ }
+ /* This will not be a clean reboot, with everything
+ * shutting down. But if there is a chance of
+ * rebooting the system it will be rebooted.
+ */
+ kernel_restart(NULL);
+ }
+#ifdef __sparc__
+ {
+ extern int stop_a_enabled;
+ /* Make sure the user can actually press Stop-A (L1-A) */
+ stop_a_enabled = 1;
+ printk(KERN_EMERG "Press Stop-A (L1-A) to return to the boot prom\n");
+ }
+#endif
+#if defined(CONFIG_S390)
+ disabled_wait(caller);
+#endif
+ for (i = 0;;) {
+ touch_softlockup_watchdog();
+ i += panic_blink(i);
+ msleep(1);
+ i++;
+ }
+}
+
EXPORT_SYMBOL(panic);
+EXPORT_SYMBOL(panic_gently);
/**
* print_tainted - return a string to represent the kernel taint state.
--
knghtbrd:<JHM> AIX - the Unix from the universe where Spock has a beard.
* Re: [RFC][PATCH] introduce panic_gently
2007-07-05 23:07 [RFC][PATCH] introduce panic_gently Bodo Eggert
@ 2007-07-06 7:57 ` Clemens Koller
2007-07-06 12:03 ` Andi Kleen
2007-07-06 18:00 ` Chuck Ebbert
2007-07-06 12:05 ` Andi Kleen
1 sibling, 2 replies; 8+ messages in thread
From: Clemens Koller @ 2007-07-06 7:57 UTC
To: Bodo Eggert; +Cc: linux-kernel
Bodo Eggert wrote:
> If the boot process fails to find init or the root fs, the cause has
> usually scrolled off the screen, and because of the panic, it can't be
> reached anymore.
>
> This patch introduces panic_gently, which allows using the scrollback
> buffer and rebooting, but it can't be called from an unsafe context.
In the case where you introduced panic_gently() there is IMHO no reason
to panic() at all. There is no bug which got hit, the machine just needs
user intervention because of wrong boot parameters (in most cases).
What about asking the user for the correct root= or init= parameters
and just retry/continue the boot process?
The 180-second reboot timeout also doesn't make sense here. The problem
won't go away after a reboot without user interaction.
Regards,
--
Clemens Koller
_______________________________
R&D Imaging Devices
Anagramm GmbH
Rupert-Mayer-Str. 45/1
81379 Muenchen
Germany
http://www.anagramm-technology.com
Phone: +49-89-741518-50
Fax: +49-89-741518-19
* Re: [RFC][PATCH] introduce panic_gently
2007-07-06 7:57 ` Clemens Koller
@ 2007-07-06 12:03 ` Andi Kleen
2007-07-06 20:52 ` Oleg Verych
2007-07-06 18:00 ` Chuck Ebbert
1 sibling, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2007-07-06 12:03 UTC
To: Clemens Koller; +Cc: Bodo Eggert, linux-kernel
Clemens Koller <clemens.koller@anagramm.de> writes:
> Bodo Eggert wrote:
> > If the boot process fails to find init or the root fs, the cause
> > has usually scrolled off the screen, and because of the panic, it
> > can't be reached anymore.
> > This patch introduces panic_gently, which allows using the
> > scrollback buffer and rebooting, but it can't be called from an
> > unsafe context.
>
> In the case where you introduced panic_gently() there is IMHO no reason
> to panic() at all. There is no bug which got hit, the machine just needs
> user intervention because of wrong boot parameters (in most cases).
>
> What about asking the user for the correct root= or init= parameters
> and just retry/continue the boot process?
BSDs have a boot shell for this, but I'm not sure it's a good idea.
On EFI systems it might be better to just drop back to the boot environment,
where a "re-boot" could be initiated.
> The 180-second reboot timeout also doesn't make sense here. The problem
> won't go away after a reboot without user interaction.
It will if lilo -R or grubonce was used, or if the NFS server fixed itself
for an NFS root, or ... There can be many reasons for it being a good idea.
-Andi
* Re: [RFC][PATCH] introduce panic_gently
2007-07-06 12:03 ` Andi Kleen
@ 2007-07-06 20:52 ` Oleg Verych
0 siblings, 0 replies; 8+ messages in thread
From: Oleg Verych @ 2007-07-06 20:52 UTC
To: linux-kernel
* Andi Kleen (06 Jul 2007 14:03:05 +0200)
>> The 180-second reboot timeout also doesn't make sense here. The problem
>> won't go away after a reboot without user interaction.
>
> It will if lilo -R or grubonce was used, or if the NFS server fixed itself
> for an NFS root, or ... There can be many reasons for it being a good idea.
I hope Peter, with his 16-bit C rewrite, will implement this in the early
boot code, which is the thing I was crying about.
____
* Re: [RFC][PATCH] introduce panic_gently
2007-07-06 7:57 ` Clemens Koller
2007-07-06 12:03 ` Andi Kleen
@ 2007-07-06 18:00 ` Chuck Ebbert
2007-07-08 17:24 ` Clemens Koller
1 sibling, 1 reply; 8+ messages in thread
From: Chuck Ebbert @ 2007-07-06 18:00 UTC
To: Clemens Koller; +Cc: Bodo Eggert, linux-kernel
On 07/06/2007 03:57 AM, Clemens Koller wrote:
> Bodo Eggert wrote:
>> If the boot process fails to find init or the root fs, the cause has
>> usually scrolled off the screen, and because of the panic, it can't be
>> reached anymore.
>>
>> This patch introduces panic_gently, which allows using the
>> scrollback buffer and rebooting, but it can't be called from an unsafe
>> context.
>
> In the case where you introduced panic_gently() there is IMHO no reason
> to panic() at all. There is no bug which got hit, the machine just needs
> user intervention because of wrong boot parameters (in most cases).
>
> What about asking the user for the correct root= or init= parameters
> and just retry/continue the boot process?
>
> The 180-second reboot timeout also doesn't make sense here. The problem
> won't go away after a reboot without user interaction.
What about GRUB fallbacks? The fallback image/kernel will start
automatically on reboot if you use that feature.
* Re: [RFC][PATCH] introduce panic_gently
2007-07-06 18:00 ` Chuck Ebbert
@ 2007-07-08 17:24 ` Clemens Koller
0 siblings, 0 replies; 8+ messages in thread
From: Clemens Koller @ 2007-07-08 17:24 UTC
To: Chuck Ebbert; +Cc: Bodo Eggert, linux-kernel
Chuck Ebbert wrote:
> On 07/06/2007 03:57 AM, Clemens Koller wrote:
>> The 180-second reboot timeout also doesn't make sense here. The problem
>> won't go away after a reboot without user interaction.
>
> What about GRUB fallbacks? The fallback image/kernel will start
> automatically on reboot if you use that feature.
Yes, you are right.
Regards,
--
Clemens Koller
_______________________________
R&D Imaging Devices
Anagramm GmbH
Rupert-Mayer-Str. 45/1
81379 Muenchen
Germany
http://www.anagramm-technology.com
Phone: +49-89-741518-50
Fax: +49-89-741518-19
* Re: [RFC][PATCH] introduce panic_gently
2007-07-05 23:07 [RFC][PATCH] introduce panic_gently Bodo Eggert
2007-07-06 7:57 ` Clemens Koller
@ 2007-07-06 12:05 ` Andi Kleen
2007-07-06 15:24 ` Bodo Eggert
1 sibling, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2007-07-06 12:05 UTC
To: Bodo Eggert; +Cc: linux-kernel
Bodo Eggert <7eggert@gmx.de> writes:
> If the boot process fails to find init or the root fs, the cause has
> usually scrolled off the screen, and because of the panic, it can't be
> reached anymore.
>
> This patch introduces panic_gently, which allows using the scrollback
> buffer and rebooting, but it can't be called from an unsafe context.
The implementation certainly has too much duplicated code. If anything
it needs some common functions.
The problem with keeping interrupts on is that the system might continue
to route packets. This is sometimes quite unexpected for users.
Arguably that's unlikely to be already enabled for missing root,
but in theory initrd could have done it.
I think I would prefer if the normal panic() tried to detect the situations
where this is
It couldn't detect spinlocks, but it could detect interrupts off, interrupt
context, etc.
-Andi
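
The detection idea above could look roughly like the following. This is a minimal userspace sketch, not code from this thread: struct panic_ctx, enum panic_mode and choose_panic_mode() are hypothetical names, and the two flags stand in for what the kernel's irqs_disabled() and in_interrupt() would report at the call site. As noted above, held spinlocks cannot be detected this way, so a "gentle" result is only a best guess.

```c
#include <stdbool.h>

/*
 * Hypothetical snapshot of the caller's context. In the kernel these
 * values would come from irqs_disabled() and in_interrupt().
 */
struct panic_ctx {
	bool irqs_disabled;
	bool in_interrupt;
};

enum panic_mode { PANIC_HARD, PANIC_GENTLE };

/*
 * Pick the gentle path only when nothing detectable forbids sleeping.
 * Held spinlocks are NOT visible here, so this check is incomplete.
 */
static enum panic_mode choose_panic_mode(const struct panic_ctx *ctx)
{
	if (ctx->irqs_disabled || ctx->in_interrupt)
		return PANIC_HARD;
	return PANIC_GENTLE;
}
```

With such a check, the callers in init/ would not need a separate entry point; panic() itself would fall back to the hard path whenever either flag is set.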
* Re: [RFC][PATCH] introduce panic_gently
2007-07-06 12:05 ` Andi Kleen
@ 2007-07-06 15:24 ` Bodo Eggert
0 siblings, 0 replies; 8+ messages in thread
From: Bodo Eggert @ 2007-07-06 15:24 UTC
To: Andi Kleen; +Cc: Bodo Eggert, linux-kernel
On Fri, 6 Jul 2007, Andi Kleen wrote:
> Bodo Eggert <7eggert@gmx.de> writes:
>
> > If the boot process fails to find init or the root fs, the cause has
> > usually scrolled off the screen, and because of the panic, it can't be
> > reached anymore.
> >
> > This patch introduces panic_gently, which allows using the scrollback
> > buffer and rebooting, but it can't be called from an unsafe context.
>
> The implementation certainly has too much duplicated code. If anything
> it needs some common functions.
There are common parts, but they have subtle differences.
I'd rather make that function __init and not worry about those 200 bytes.
Maybe some parts can be skipped, too.
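
For illustration only, one way the two wait loops in the patch could share code despite their differences (mdelay() with touch_nmi_watchdog() in the countdown, msleep() with touch_softlockup_watchdog() in the final loop) is to pass the differing pieces in as callbacks. The sketch below is a userspace model, not part of the patch: panic_wait_loop(), the callback typedefs and the stub_* helpers are hypothetical names.

```c
/*
 * Hypothetical callback types. In the kernel the candidates would be
 * mdelay()/msleep(), touch_nmi_watchdog()/touch_softlockup_watchdog()
 * and panic_blink.
 */
typedef void (*delay_fn)(unsigned int ms);
typedef void (*watchdog_fn)(void);
typedef long (*blink_fn)(long time);

/*
 * Shared blink-and-wait loop (hypothetical helper, not in the patch).
 * Runs until limit_ms is reached; a negative limit_ms loops forever,
 * like the final loop of panic_gently(). Returns the elapsed counter.
 */
static long panic_wait_loop(long limit_ms, blink_fn blink,
			    delay_fn delay, watchdog_fn touch)
{
	long i = 0;

	while (limit_ms < 0 || i < limit_ms) {
		touch();
		i += blink(i);
		delay(1);
		i++;
	}
	return i;
}

/* Userspace stand-ins so the sketch can be exercised outside the kernel. */
static int delay_calls, touch_calls;
static void stub_delay(unsigned int ms) { delay_calls += (int)ms; }
static void stub_touch(void) { touch_calls++; }
static long stub_no_blink(long time) { (void)time; return 0; }
```

Under these assumptions the countdown would pass panic_timeout*1000 as the limit, and the final loop would pass a negative limit together with the sleeping delay.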
> The problem with keeping interrupts on is that the system might continue
> to route packets. This is sometimes quite unexpected for users.
> Arguably that's unlikely to be already enabled for missing root,
> but in theory initrd could have done it.
If initrd set up a router, it should also do the mounts and call
init, shouldn't it? In this case, the panic() won't ever happen.
(At least the kernel panic()s if I run rdinit=/bin/ash and exit that
shell, therefore I can't depend on the kernel to execute init.)
> I think I would prefer if the normal panic() tried to detect the situations
> where this is
>
> It couldn't detect spinlocks, but it could detect interrupts off, interrupt
> context, etc.
I assumed that to be the greater challenge, taking several months to get
right instead of a few minutes. (I'm not sure I really got this right, but
it happens to work for me.-)
--
Funny quotes:
36. You never really learn to swear until you learn to drive.