* Total system lockup with Alt-SysRQ-L
@ 2001-12-23 17:58 Russell King
2001-12-24 2:34 ` Alan Cox
0 siblings, 1 reply; 10+ messages in thread
From: Russell King @ 2001-12-23 17:58 UTC (permalink / raw)
To: linux-kernel
Ok, alt-sysrq-l is a pretty major thing to do, as it has the effect of
killing everything, including init.
When pid1 exits (maybe due to a kill signal), we lockup hard in (iirc)
exit_notify. I don't remember the details I'm afraid.
Back in 2.3, I had a go at fixing this, Linus rejected the patch saying
that it was doing the wrong thing. To this day, the kernel still suffers
from this, and I've not had the inclination to spend any more time on it.
So, I'm just letting people know that alt-sysrq-l is rather fatal,
especially if you want to do the following sequence to avoid a fsck:
alt-sysrq-l
alt-sysrq-s
alt-sysrq-u
alt-sysrq-b
IMHO either alt-sysrq-l should be removed, or someone who knows the logic
behind the linking of tasks together needs to fix exit_notify so it doesn't
enter an infinite loop when init exits.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
^ permalink raw reply [flat|nested] 10+ messages in thread

* Re: Total system lockup with Alt-SysRQ-L
From: Alan Cox @ 2001-12-24 2:34 UTC (permalink / raw)
To: Russell King; +Cc: linux-kernel

> When pid1 exits (maybe due to a kill signal), we lockup hard in (iirc)
> exit_notify. I don't remember the details I'm afraid.

pid1 ends up trying to kill pid1 and it goes deeply down the toilet from
that point onwards. The Unix traditional world reboots when pid 1 dies.
* Re: Total system lockup with Alt-SysRQ-L
From: Russell King @ 2001-12-24 8:37 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-kernel

On Mon, Dec 24, 2001 at 02:34:20AM +0000, Alan Cox wrote:
> > When pid1 exits (maybe due to a kill signal), we lockup hard in (iirc)
> > exit_notify. I don't remember the details I'm afraid.
>
> pid1 ends up trying to kill pid1 and it goes deeply down the toilet from
> that point onwards. The Unix traditional world reboots when pid 1 dies.

The problem was definitely in the exit_notify code, where it manipulated
the task links indefinitely. (I think it was cptr never becomes null,
so the loop never terminates).

However, if we're saying that "pid1 must not die" then maybe we should get
rid of the 'killall' sysrq option since it serves no useful purpose, and
add a suitable panic in the do_exit path?

I'll generate a patch for that if there's interest.

--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
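The non-terminating loop described in the message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `toy_task`, `toy_adopt`, `toy_exit_notify` and `toy_demo` are hypothetical names, the data structure is a simplified stand-in for the kernel's task links, and the iteration limit exists purely so the demonstration returns. It is not the kernel's exit_notify code.

```c
#include <stddef.h>

/* Toy task with simplified parent/child links (hypothetical, not task_struct). */
struct toy_task {
	struct toy_task *parent;       /* current parent */
	struct toy_task *first_child;  /* head of this task's child list */
	struct toy_task *next_sibling; /* next entry in the parent's child list */
};

/* Link child at the front of parent's child list. */
static void toy_adopt(struct toy_task *parent, struct toy_task *child)
{
	child->parent = parent;
	child->next_sibling = parent->first_child;
	parent->first_child = child;
}

/*
 * Hand every child of `exiting` over to `reaper`, looping until the child
 * list is empty. Returns the number of iterations, or -1 if `limit`
 * iterations pass without the list becoming empty.
 */
static int toy_exit_notify(struct toy_task *exiting, struct toy_task *reaper,
			   int limit)
{
	int iterations = 0;

	while (exiting->first_child != NULL) {
		struct toy_task *child = exiting->first_child;

		if (iterations == limit)
			return -1;			/* would spin forever */
		exiting->first_child = child->next_sibling;	/* unlink */
		toy_adopt(reaper, child);			/* relink under reaper */
		iterations++;
	}
	return iterations;
}

/*
 * Give the exiting task three children, then exit it. The reaper is always
 * the toy "init"; exiting_is_reaper selects whether init itself exits.
 */
int toy_demo(int exiting_is_reaper)
{
	struct toy_task init = {0}, shell = {0}, kids[3] = {{0}};
	struct toy_task *exiting = exiting_is_reaper ? &init : &shell;
	int i;

	for (i = 0; i < 3; i++)
		toy_adopt(exiting, &kids[i]);
	return toy_exit_notify(exiting, &init, 1000);
}
```

In this model, `toy_demo(0)` finishes after three iterations because each child moves to a different task. `toy_demo(1)` hits the limit: every child unlinked from the exiting task is immediately relinked onto the same list, so the list head never becomes null.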
* Re: Total system lockup with Alt-SysRQ-L
From: Denis Oliver Kropp @ 2001-12-24 11:48 UTC (permalink / raw)
To: Russell King; +Cc: Alan Cox, linux-kernel

Quoting Russell King (rmk@arm.linux.org.uk):
> On Mon, Dec 24, 2001 at 02:34:20AM +0000, Alan Cox wrote:
> > > When pid1 exits (maybe due to a kill signal), we lockup hard in (iirc)
> > > exit_notify. I don't remember the details I'm afraid.
> >
> > pid1 ends up trying to kill pid1 and it goes deeply down the toilet from
> > that point onwards. The Unix traditional world reboots when pid 1 dies.
>
> The problem was definitely in the exit_notify code, where it manipulated
> the task links indefinitely. (I think it was cptr never becomes null,
> so the loop never terminates).
>
> However, if we're saying that "pid1 must not die" then maybe we should get
> rid of the 'killall' sysrq option since it serves no useful purpose, and
> add a suitable panic in the do_exit path?

Another annoying thing that happens sometimes is that I accidentally press
'L' or 'E' instead of 'K' or 'R', the SysRQs I use most.

An additional modifier for the harmful actions would be useful, e.g. Shift.
So pressing Alt-SysRQ-E would do nothing until Shift is pressed, too.

--
Best regards,
Denis Oliver Kropp
.------------------------------------------.
| DirectFB - Hardware accelerated graphics |
| http://www.directfb.org/                 |
"------------------------------------------"
convergence integrated media GmbH
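The Shift-modifier idea suggested above could be sketched roughly as follows. Everything here is an assumption made for illustration: `toy_sysrq_allowed` and `toy_guarded_keys` are hypothetical names, the guarded set contains only the two keys the message names ('e' and 'l'), and nothing like this appears in the sysrq code discussed in the thread.

```c
#include <stdbool.h>
#include <string.h>

/* Keys treated as destructive in this sketch (assumed set: 'e' and 'l'). */
static const char toy_guarded_keys[] = "el";

/*
 * Decide whether a SysRq key press should be acted on.
 * Guarded keys are honoured only while Shift is held; every other key
 * behaves as before. A NUL key is rejected because strchr() would
 * otherwise match the string terminator.
 */
bool toy_sysrq_allowed(char key, bool shift_down)
{
	if (key == '\0')
		return false;
	if (strchr(toy_guarded_keys, key) != NULL)
		return shift_down;
	return true;
}
```

With this guard a stray Alt-SysRQ-E is ignored, while Alt-SysRQ-K and Alt-SysRQ-R keep working without the extra modifier.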
* Re: Total system lockup with Alt-SysRQ-L
From: Russell King @ 2001-12-24 12:26 UTC (permalink / raw)
To: linux-kernel

On Mon, Dec 24, 2001 at 08:37:52AM +0000, Russell King wrote:
> The problem was definitely in the exit_notify code, where it manipulated
> the task links indefinitely. (I think it was cptr never becomes null,
> so the loop never terminates).
>
> However, if we're saying that "pid1 must not die" then maybe we should get
> rid of the 'killall' sysrq option since it serves no useful purpose, and
> add a suitable panic in the do_exit path?

Ok, can someone explain *why* it is desirable to attempt to kill pid1
given that doing so will completely lockup the machine? (should we
rename it to "Lockup" instead of "killalL"? 8)

We do have some tests in the do_exit() path to panic if/when init dies,
which rely on the init PID being '1'. Unfortunately, these don't trigger
because of the following bogosity in drivers/char/sysrq.c:

	if (p->pid == 1 && even_init)
		/* Ugly hack to kill init */
		p->pid = 0x8000;

So, I propose we get rid of this "ugly hack", and the alt-sysrq-l
option altogether - it would appear to serve no useful purpose.

Here is a patch that does just this. It should apply to 2.4.17 and 2.5.1
kernels fine (generated on 2.5.1).
--- orig/drivers/char/sysrq.c	Wed Dec 12 11:37:40 2001
+++ linux/drivers/char/sysrq.c	Mon Dec 24 12:19:58 2001
@@ -284,24 +284,20 @@
 
 /* signal sysrq helper function
  * Sends a signal to all user processes */
-static void send_sig_all(int sig, int even_init)
+static void send_sig_all(int sig)
 {
 	struct task_struct *p;
 
 	for_each_task(p) {
-		if (p->mm) { /* Not swapper nor kernel thread */
-			if (p->pid == 1 && even_init)
-				/* Ugly hack to kill init */
-				p->pid = 0x8000;
-			if (p->pid != 1)
-				force_sig(sig, p);
-		}
+		if (p->mm && p->pid != 1)
+			/* Not swapper, init nor kernel thread */
+			force_sig(sig, p);
 	}
 }
 
 static void sysrq_handle_term(int key, struct pt_regs *pt_regs,
 		struct kbd_struct *kbd, struct tty_struct *tty) {
-	send_sig_all(SIGTERM, 0);
+	send_sig_all(SIGTERM);
 	console_loglevel = 8;
 }
 static struct sysrq_key_op sysrq_term_op = {
@@ -312,7 +308,7 @@
 
 static void sysrq_handle_kill(int key, struct pt_regs *pt_regs,
 		struct kbd_struct *kbd, struct tty_struct *tty) {
-	send_sig_all(SIGKILL, 0);
+	send_sig_all(SIGKILL);
 	console_loglevel = 8;
 }
 static struct sysrq_key_op sysrq_kill_op = {
@@ -321,17 +317,6 @@
 	action_msg:	"Kill All Tasks",
 };
 
-static void sysrq_handle_killall(int key, struct pt_regs *pt_regs,
-		struct kbd_struct *kbd, struct tty_struct *tty) {
-	send_sig_all(SIGKILL, 1);
-	console_loglevel = 8;
-}
-static struct sysrq_key_op sysrq_killall_op = {
-	handler:	sysrq_handle_killall,
-	help_msg:	"killalL",
-	action_msg:	"Kill All Tasks (even init)",
-};
-
 /* END SIGNAL SYSRQ HANDLERS BLOCK */
 
 
@@ -366,7 +351,7 @@
 #else
 /* k */	NULL,
 #endif
-/* l */	&sysrq_killall_op,
+/* l */	NULL,
 /* m */	&sysrq_showmem_op,
 /* n */	NULL,
 /* o */	NULL, /* This will often be registered

--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
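To make the effect of the "ugly hack" concrete, here is a user-space model of the two selection rules from the patch above. It is a sketch, not kernel code: `toy_proc` and all `toy_*` functions are hypothetical, and `toy_exit_would_panic` merely assumes that the do_exit() test mentioned in the message is a plain comparison of the pid against 1.

```c
#include <stdbool.h>

/* Minimal stand-in for a task: just the two fields the patch looks at. */
struct toy_proc {
	int pid;
	bool has_mm;	/* false for swapper and kernel threads */
};

/* Assumed shape of the do_exit() check: it fires only when pid == 1. */
bool toy_exit_would_panic(const struct toy_proc *p)
{
	return p->pid == 1;
}

/* Pre-patch rule: with even_init set, init's pid is rewritten first. */
bool toy_old_should_signal(struct toy_proc *p, bool even_init)
{
	if (!p->has_mm)
		return false;
	if (p->pid == 1 && even_init)
		p->pid = 0x8000;	/* the "ugly hack" */
	return p->pid != 1;
}

/* Post-patch rule: never signal swapper, init or kernel threads. */
bool toy_new_should_signal(const struct toy_proc *p)
{
	return p->has_mm && p->pid != 1;
}

/* Returns 1 if the old rule signals init and also hides it from the check. */
int toy_killall_hides_init(void)
{
	struct toy_proc init = { 1, true };
	bool signalled = toy_old_should_signal(&init, true);

	return signalled && !toy_exit_would_panic(&init);
}

/* Returns 1 if the new rule leaves init unsignalled with its pid intact. */
int toy_patched_spares_init(void)
{
	struct toy_proc init = { 1, true };

	return !toy_new_should_signal(&init) && toy_exit_would_panic(&init);
}
```

Under these assumptions the old rule both delivers the signal to init and changes its pid, so a pid-based test later in the exit path no longer recognises it. The patched rule leaves init alone and keeps pid 1 recognisable.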
* Re: Total system lockup with Alt-SysRQ-L
From: Pavel Machek @ 2001-12-25 11:33 UTC (permalink / raw)
To: Russell King; +Cc: linux-kernel

Hi!

> We do have some tests in the do_exit() path to panic if/when init dies,
> which rely on the init PID being '1'. Unfortunately, these don't trigger
> because of the following bogosity in drivers/char/sysrq.c:
>
> 	if (p->pid == 1 && even_init)
> 		/* Ugly hack to kill init */
> 		p->pid = 0x8000;
>
> So, I propose we get rid of this "ugly hack", and the alt-sysrq-l
> option altogether - it would appear to serve no useful purpose.

Ask mj if it was ever useful... But I guess it was not. Kill it.

>
> Here is a patch that does just this. It should apply to 2.4.17 and 2.5.1
> kernels fine (generated on 2.5.1).
>
> --- orig/drivers/char/sysrq.c	Wed Dec 12 11:37:40 2001
> +++ linux/drivers/char/sysrq.c	Mon Dec 24 12:19:58 2001
> @@ -284,24 +284,20 @@
>
>  /* signal sysrq helper function
>   * Sends a signal to all user processes */
> -static void send_sig_all(int sig, int even_init)
> +static void send_sig_all(int sig)
>  {
>  	struct task_struct *p;
>
>  	for_each_task(p) {
> -		if (p->mm) { /* Not swapper nor kernel thread */
> -			if (p->pid == 1 && even_init)
> -				/* Ugly hack to kill init */
> -				p->pid = 0x8000;
> -			if (p->pid != 1)
> -				force_sig(sig, p);
> -		}
> +		if (p->mm && p->pid != 1)
> +			/* Not swapper, init nor kernel thread */
> +			force_sig(sig, p);
>  	}
>  }
>
>  static void sysrq_handle_term(int key, struct pt_regs *pt_regs,
>  		struct kbd_struct *kbd, struct tty_struct *tty) {
> -	send_sig_all(SIGTERM, 0);
> +	send_sig_all(SIGTERM);
>  	console_loglevel = 8;
>  }
>  static struct sysrq_key_op sysrq_term_op = {
> @@ -312,7 +308,7 @@
>
>  static void sysrq_handle_kill(int key, struct pt_regs *pt_regs,
>  		struct kbd_struct *kbd, struct tty_struct *tty) {
> -	send_sig_all(SIGKILL, 0);
> +	send_sig_all(SIGKILL);
>  	console_loglevel = 8;
>  }
>  static struct sysrq_key_op sysrq_kill_op = {
> @@ -321,17 +317,6 @@
>  	action_msg:	"Kill All Tasks",
>  };
>
> -static void sysrq_handle_killall(int key, struct pt_regs *pt_regs,
> -		struct kbd_struct *kbd, struct tty_struct *tty) {
> -	send_sig_all(SIGKILL, 1);
> -	console_loglevel = 8;
> -}
> -static struct sysrq_key_op sysrq_killall_op = {
> -	handler:	sysrq_handle_killall,
> -	help_msg:	"killalL",
> -	action_msg:	"Kill All Tasks (even init)",
> -};
> -
>  /* END SIGNAL SYSRQ HANDLERS BLOCK */
>
>
> @@ -366,7 +351,7 @@
>  #else
>  /* k */	NULL,
>  #endif
> -/* l */	&sysrq_killall_op,
> +/* l */	NULL,
>  /* m */	&sysrq_showmem_op,
>  /* n */	NULL,
>  /* o */	NULL, /* This will often be registered

--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
* Re: Total system lockup with Alt-SysRQ-L
From: M. Edward (Ed) Borasky @ 2001-12-24 14:27 UTC (permalink / raw)
To: Russell King; +Cc: Alan Cox, linux-kernel

On Mon, 24 Dec 2001, Russell King wrote:
> The problem was definitely in the exit_notify code, where it
> manipulated the task links indefinitely. (I think it was cptr never
> becomes null, so the loop never terminates).
>
> However, if we're saying that "pid1 must not die" then maybe we should
> get rid of the 'killall' sysrq option since it serves no useful
> purpose, and add a suitable panic in the do_exit path?
>
> I'll generate a patch for that if there's interest.

What would be even better, and I think there may already be such an
option, would be a one-button "sync up all the disks, forbid any more
writes, save as much state as possible (registers, memory) to a swap
partition, set a flag for crash dump processing and reboot" capability.

--
M. Edward Borasky
znmeb@borasky-research.net
http://www.borasky-research.net
If God had meant carrots to be eaten cooked, He would have given rabbits
fire.
* Re: Total system lockup with Alt-SysRQ-L
From: Alan Cox @ 2001-12-24 17:07 UTC (permalink / raw)
To: "M. Edward (Ed) Borasky"; +Cc: Russell King, Alan Cox, linux-kernel

> option, would be a one-button "sync up all the disks, forbid any more
> writes, save as much state as possible (registers, memory) to a swap
> partition, set a flag for crash dump processing and reboot" capability.

Very hard to do - you can't trust the I/O system's state, so the dump code
has to verify it hasn't been corrupted, reconfigure the drive it wishes to
write to, write the data out using its own non-interrupt-driven code and
then halt the box.

There are folks with patches that do a lot of that (lkcd).
* Re: Total system lockup with Alt-SysRQ-L
From: Pavel Machek @ 2001-12-25 11:35 UTC (permalink / raw)
To: Alan Cox; +Cc: "M. Edward (Ed) Borasky", Russell King, linux-kernel

Hi!

> > option, would be a one-button "sync up all the disks, forbid any more
> > writes, save as much state as possible (registers, memory) to a swap
> > partition, set a flag for crash dump processing and reboot" capability.
>
> Very hard to do - you can't trust the I/O systems state so the dump code

Actually... swsusp should be usable for most of this... But swsusp will
not work in a bad state, and I guess that's a showstopper.

--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
* Re: Total system lockup with Alt-SysRQ-L
From: David Woodhouse @ 2001-12-28 1:00 UTC (permalink / raw)
To: Russell King; +Cc: linux-kernel

rmk@arm.linux.org.uk said:
> Ok, can someone explain *why* it is desirable to attempt to kill pid1
> given that doing so will completely lockup the machine? (should we
> rename it to "Lockup" instead of "killalL"? 8)

It's not. I believe SysRq-L was implemented while Linux would still exhibit
sane behaviour upon pid1 dying, and was never removed when the current
brokenness was introduced.

--
dwmw2
end of thread, other threads:[~2001-12-28 1:00 UTC | newest]

Thread overview: 10+ messages
2001-12-23 17:58 Total system lockup with Alt-SysRQ-L Russell King
2001-12-24  2:34 ` Alan Cox
2001-12-24  8:37   ` Russell King
2001-12-24 11:48     ` Denis Oliver Kropp
2001-12-24 12:26     ` Russell King
2001-12-25 11:33       ` Pavel Machek
2001-12-24 14:27     ` M. Edward (Ed) Borasky
2001-12-24 17:07       ` Alan Cox
2001-12-25 11:35         ` Pavel Machek
2001-12-28  1:00     ` David Woodhouse