* Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: Kaz Kylheku @ 2009-06-23 21:45 UTC
To: linux-mips
Hi all,
On kernel 2.6.26, glibc 2.5 (n32), SiByte SB-1 core, the following
program goes into 100% CPU, chewing up about 80% kernel time and
20% user.
#include <stdio.h>
#include <signal.h>

int main(void)
{
    int *deadbeef = (int *) 0xdeadbeef;
    signal(SIGBUS, SIG_IGN);
    printf("*deadbeef == %d\n", *deadbeef);
    return 0;
}
If any fatal exception is ignored, the program should be killed
if that exception happens. 100% CPU is not a useful response.
(If there is a handler, and that handler returns without doing anything
to prevent a recurrence of the exception when the instruction is re-tried,
that's different).
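The parenthetical above distinguishes a handler that merely returns from one that prevents the faulting instruction from being retried. One common way to do the latter is to leave the handler with siglongjmp() instead of returning. The following is a minimal sketch of that idea, not code from this thread: the names probe_read, on_fault and probe_env are hypothetical, the single static jump buffer makes it unsuitable for threads, and SIGSEGV is handled alongside SIGBUS on the assumption that some platforms report a bad address that way.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper state: one jump buffer, so not thread-safe. */
static sigjmp_buf probe_env;

/* Leave the handler without returning to the faulting instruction. */
static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/*
 * Try to read *p. Returns 0 and stores the value in *out on success,
 * or -1 if the load raised SIGBUS or SIGSEGV (or a handler could not
 * be installed).
 */
int probe_read(const volatile int *p, int *out)
{
    struct sigaction sa, old_bus, old_segv;
    int faulted;

    assert(out != NULL);

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    if (sigaction(SIGBUS, &sa, &old_bus) != 0)
        return -1;
    if (sigaction(SIGSEGV, &sa, &old_segv) != 0) {
        sigaction(SIGBUS, &old_bus, NULL);
        return -1;
    }

    /* savemask = 1: the signal mask is restored when we jump back. */
    if (sigsetjmp(probe_env, 1) == 0) {
        *out = *p;      /* may fault; the handler then jumps back here */
        faulted = 0;
    } else {
        faulted = 1;    /* reached only via siglongjmp() */
    }

    /* Put the previous dispositions back. */
    sigaction(SIGBUS, &old_bus, NULL);
    sigaction(SIGSEGV, &old_segv, NULL);
    return faulted ? -1 : 0;
}
```

Because the handler never returns, the load is not re-executed, so the spin described in this message cannot occur on this path.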
* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: David Daney @ 2009-06-23 22:03 UTC
To: Kaz Kylheku; +Cc: linux-mips

Kaz Kylheku wrote:
> Hi all,
>
> On kernel 2.6.26, glibc 2.5 (n32), SiByte SB-1 core, the following
> program goes into 100% CPU, chewing up about 80% kernel time and
> 20% user.
>
> #include <stdio.h>
> #include <signal.h>
>
> int main(void)
> {
>     int *deadbeef = (int *) 0xdeadbeef;
>     signal(SIGBUS, SIG_IGN);
>     printf("*deadbeef == %d\n", *deadbeef);
>     return 0;
> }
>
> If any fatal exception is ignored, the program should be killed
> if that exception happens. 100% CPU is not a useful response.
>
> (If there is a handler, and that handler returns without doing anything
> to prevent a recurrence of the exception when the instruction is re-tried,
> that's different).

I wonder if it is another instance of:

http://www.linux-mips.org/git?p=linux.git;a=commitdiff;h=49cf0e2d68dd98dbb28eaca0284e8460ab6ad86d

David Daney
* RE: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: Kaz Kylheku @ 2009-06-25 3:39 UTC
To: linux-mips

On June 23, 2009 3:03 PM, David Daney [mailto:ddaney@caviumnetworks.com]
> I wonder if it is another instance of:
>
> http://www.linux-mips.org/git?p=linux.git;a=commitdiff;h=49cf0e2d68dd98dbb28eaca0284e8460ab6ad86d

Thanks for digging that up. The patch fixes it for me.

Clearly, it's not just for init. Any process can ignore signals
that shouldn't be ignored. I'm surprised this has only been discovered
so recently. It's amazing how we sometimes find things at around
the same time.

If I may pontificate now, someone said my little program was silly.
Of course; it's a repro test case! Nobody would deliberately block
SIGBUS and then deliberately trigger a bus error. But this situation
happened in a large and complex real application, leaving some of my
developers scratching their heads, and distracting them from looking
for the bad pointer!

``Hey look, we can break this 100% CPU thread with gdb, but it always
stops on the same location, which is an indirect load through a
register containing a bad pointer! And it's spinning mostly in the
kernel. Hmm!''

I gave them that little program to demonstrate how the behavior can
occur. They are now working out how SIGBUS came to be ignored, and,
of course, the cause of the bad pointer.
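For the "how did SIGBUS come to be ignored" question, one cheap diagnostic is to query the current disposition at startup and at suspicious points. An ignored disposition is inherited across fork() and preserved across execve(), so the setting may well originate in a parent process rather than in the application itself. The sketch below is illustrative only; the function name signal_is_ignored is hypothetical and does not come from the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * Report whether signo is currently ignored by this process.
 * Returns 1 if the disposition is SIG_IGN, 0 if it is anything else,
 * and -1 if the disposition could not be queried.
 */
int signal_is_ignored(int signo)
{
    struct sigaction cur;

    /* A NULL new action only queries; nothing is changed. */
    if (sigaction(signo, NULL, &cur) != 0)
        return -1;

    /* With SA_SIGINFO set, a three-argument handler is installed. */
    if (cur.sa_flags & SA_SIGINFO)
        return 0;

    return cur.sa_handler == SIG_IGN;
}
```

Calling signal_is_ignored(SIGBUS) as the first statement of main() would show whether the process started out with SIGBUS already ignored, which narrows the search to the launching environment.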
* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: Kevin D. Kissell @ 2009-06-23 22:44 UTC
To: Kaz Kylheku; +Cc: linux-mips

Kaz Kylheku wrote:
> Hi all,
>
> On kernel 2.6.26, glibc 2.5 (n32), SiByte SB-1 core, the following
> program goes into 100% CPU, chewing up about 80% kernel time and
> 20% user.
>
> #include <stdio.h>
> #include <signal.h>
>
> int main(void)
> {
>     int *deadbeef = (int *) 0xdeadbeef;
>     signal(SIGBUS, SIG_IGN);
>     printf("*deadbeef == %d\n", *deadbeef);
>     return 0;
> }
>
> If any fatal exception is ignored, the program should be killed
> if that exception happens. 100% CPU is not a useful response.

It's not a useful program, so what did you expect? One might argue
that it would be more useful or correct to have the kernel advance the
PC to not endlessly repeat the doomed load, but ignoring SIG_IGN and
silently killing the thread violates the signal API as I've always
understood it.

Regards,

Kevin K.
* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: Ralf Baechle @ 2009-06-25 13:13 UTC
To: Kevin D. Kissell; +Cc: Kaz Kylheku, linux-mips

On Tue, Jun 23, 2009 at 03:44:29PM -0700, Kevin D. Kissell wrote:
>> int main(void)
>> {
>>     int *deadbeef = (int *) 0xdeadbeef;
>>     signal(SIGBUS, SIG_IGN);
>>     printf("*deadbeef == %d\n", *deadbeef);
>>     return 0;
>> }
>>
>> If any fatal exception is ignored, the program should be killed
>> if that exception happens. 100% CPU is not a useful response.
>>
> It's not a useful program, so what did you expect? One might argue
> that it would be more useful or correct to have the kernel advance the
> PC to not endlessly repeat the doomed load, but ignoring SIG_IGN and
> silently killing the thread violates the signal API as I've always
> understood it.

It's not a useful program but valid as a test case. However I agree with
your interpretation of signal semantics but I'll have to round up a copy
of the relevant standard documents; I have vague memories about some small
print for cases like this.

Ralf
* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: Ralf Baechle @ 2009-06-25 13:45 UTC
To: Kevin D. Kissell; +Cc: Kaz Kylheku, linux-mips

On Thu, Jun 25, 2009 at 02:13:00PM +0100, Ralf Baechle wrote:
> >> int main(void)
> >> {
> >>     int *deadbeef = (int *) 0xdeadbeef;
> >>     signal(SIGBUS, SIG_IGN);
> >>     printf("*deadbeef == %d\n", *deadbeef);
> >>     return 0;
> >> }
> >>
> >> If any fatal exception is ignored, the program should be killed
> >> if that exception happens. 100% CPU is not a useful response.
> >>
> > It's not a useful program, so what did you expect? One might argue
> > that it would be more useful or correct to have the kernel advance the
> > PC to not endlessly repeat the doomed load, but ignoring SIG_IGN and
> > silently killing the thread violates the signal API as I've always
> > understood it.
>
> It's not a useful program but valid as a test case. However I agree with
> your interpretation of signal semantics but I'll have to round up a copy
> of the relevant standard documents; I have vague memories about some small
> print for cases like this.

I found this in IRIX 6.5 documentation:

    Caution: Signals raised by the instruction stream, SIGILL,
    SIGEMT, SIGBUS, and SIGSEGV, will cause infinite loops
    if their handler returns, or the action is set to SIG_IGN.

Ralf
* RE: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: Kaz Kylheku @ 2009-06-25 16:00 UTC
To: Ralf Baechle, Kevin D. Kissell; +Cc: linux-mips

Ralf wrote:
> I found this in IRIX 6.5 documentation:
>
> Caution: Signals raised by the instruction stream, SIGILL,
> SIGEMT, SIGBUS, and SIGSEGV, will cause infinite loops
> if their handler returns, or the action is set to SIG_IGN.

The Single Unix Specification (Issue 6) marks the behavior
explicitly undefined.

Bookmark this: http://www.opengroup.org/onlinepubs/009695399

Not the latest set of documents, but that can be regarded
as a virtue. :)

Under pthread_sigmask and sigprocmask, for blocking:

    If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS
    signals are generated while they are blocked,
    the result is undefined, unless the signal
    was generated by the kill() function, the
    sigqueue() function, or the raise() function.

Under ``2.4 Signal Concepts'', for SIG_IGN:

    SIG_IGN

    Ignore signal.

    Delivery of the signal shall have no effect on
    the process. The behavior of a process is undefined
    after it ignores a SIGFPE, SIGILL, SIGSEGV,
    or SIGBUS signal that was not generated by kill(),
    sigqueue(), or raise().

So, as I suspected, there are in fact no requirements
from the applicable spec. Infinite looping or
stopping the process anyway are conforming responses,
as is rebooting or halting the machine with a
``panic'' message.
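The quoted text makes the behavior undefined only for signals that were not generated by kill(), sigqueue() or raise(). The software-generated case stays well defined: an ignored SIGBUS sent with raise() is simply discarded. A small sketch of that defined case follows; the function name raise_ignored_sigbus is hypothetical and chosen only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/*
 * Ignore SIGBUS, send it to ourselves with raise(), then restore the
 * previous disposition. Because the signal comes from raise() and not
 * from a faulting instruction, ignoring it has no effect on the process.
 * Returns the result of raise() (0 on success), or -1 if the disposition
 * could not be changed.
 */
int raise_ignored_sigbus(void)
{
    void (*old)(int) = signal(SIGBUS, SIG_IGN);
    int rc;

    if (old == SIG_ERR)
        return -1;

    rc = raise(SIGBUS);     /* discarded: the action is SIG_IGN */

    signal(SIGBUS, old);    /* put the previous disposition back */
    return rc;
}
```

The function returning at all is the point: execution continues past raise(), in contrast to the hardware-generated fault in the original test program.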
* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: Ralf Baechle @ 2009-06-26 0:45 UTC
To: Kaz Kylheku; +Cc: Kevin D. Kissell, linux-mips

On Thu, Jun 25, 2009 at 09:00:19AM -0700, Kaz Kylheku wrote:
> Ralf wrote:
> > I found this in IRIX 6.5 documentation:
> >
> > Caution: Signals raised by the instruction stream, SIGILL,
> > SIGEMT, SIGBUS, and SIGSEGV, will cause infinite loops
> > if their handler returns, or the action is set to SIG_IGN.
>
> The Single Unix Specification (Issue 6) marks the behavior
> explicitly undefined.

I should have mentioned that the above-mentioned paragraph of IRIX
documentation was in the section on implementation-specific behaviour.

> Bookmark this: http://www.opengroup.org/onlinepubs/009695399
>
> Not the latest set of documents, but that can be regarded
> as a virtue. :)
>
> Under pthread_sigmask and sigprocmask, for blocking:
>
>     If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS
>     signals are generated while they are blocked,
>     the result is undefined, unless the signal
>     was generated by the kill() function, the
>     sigqueue() function, or the raise() function.
>
> Under ``2.4 Signal Concepts'', for SIG_IGN:
>
>     SIG_IGN
>
>     Ignore signal.
>
>     Delivery of the signal shall have no effect on
>     the process. The behavior of a process is undefined
>     after it ignores a SIGFPE, SIGILL, SIGSEGV,
>     or SIGBUS signal that was not generated by kill(),
>     sigqueue(), or raise().
>
> So, as I suspected, there are in fact no requirements
> from the applicable spec. Infinite looping or
> stopping the process anyway are conforming responses,
> as is rebooting or halting the machine with a
> ``panic'' message.

I'd not go quite as far as that but execve("/usr/bin/nethack") certainly
would be acceptable.

Ralf
* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
From: David Daney @ 2009-06-26 0:52 UTC
To: Ralf Baechle; +Cc: Kaz Kylheku, Kevin D. Kissell, linux-mips

Ralf Baechle wrote:
> On Thu, Jun 25, 2009 at 09:00:19AM -0700, Kaz Kylheku wrote:
>
>> Ralf wrote:
>>> I found this in IRIX 6.5 documentation:
>>>
>>> Caution: Signals raised by the instruction stream, SIGILL,
>>> SIGEMT, SIGBUS, and SIGSEGV, will cause infinite loops
>>> if their handler returns, or the action is set to SIG_IGN.
>> The Single Unix Specification (Issue 6) marks the behavior
>> explicitly undefined.
>
> I should have mentioned that the above-mentioned paragraph of IRIX
> documentation was in the section on implementation-specific behaviour.
>
>> Bookmark this: http://www.opengroup.org/onlinepubs/009695399
>>
>> Not the latest set of documents, but that can be regarded
>> as a virtue. :)
>>
>> Under pthread_sigmask and sigprocmask, for blocking:
>>
>>     If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS
>>     signals are generated while they are blocked,
>>     the result is undefined, unless the signal
>>     was generated by the kill() function, the
>>     sigqueue() function, or the raise() function.
>>
>> Under ``2.4 Signal Concepts'', for SIG_IGN:
>>
>>     SIG_IGN
>>
>>     Ignore signal.
>>
>>     Delivery of the signal shall have no effect on
>>     the process. The behavior of a process is undefined
>>     after it ignores a SIGFPE, SIGILL, SIGSEGV,
>>     or SIGBUS signal that was not generated by kill(),
>>     sigqueue(), or raise().
>>
>> So, as I suspected, there are in fact no requirements
>> from the applicable spec. Infinite looping or
>> stopping the process anyway are conforming responses,
>> as is rebooting or halting the machine with a
>> ``panic'' message.
>
> I'd not go quite as far as that but execve("/usr/bin/nethack") certainly
> would be acceptable.

It is kind of moot at this point. The HEAD now kills the process
instead of looping forever.

David Daney
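To check which of the two behaviors a given kernel shows without risking a runaway process, the faulting load can be run in a child with a time limit. The sketch below is an assumption-laden illustration rather than anything posted in the thread: the name run_ignored_fault is hypothetical, the two-second alarm() is an arbitrary bound, and SIGSEGV is ignored as well because some platforms report this bad address as SIGSEGV instead of SIGBUS.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the ignored-fault test in a child process.
 * Returns the number of the signal that terminated the child,
 * 0 if the child exited normally, or -1 on fork/wait failure.
 * A result of SIGALRM suggests the child was spinning until the timer
 * fired; SIGBUS or SIGSEGV means the kernel killed it on the fault.
 */
int run_ignored_fault(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        volatile int *bad = (volatile int *)0xdeadbeefUL;
        int v;

        signal(SIGALRM, SIG_DFL);   /* make sure the timer can kill us */
        alarm(2);                   /* bound any spinning to ~2 seconds */
        signal(SIGBUS, SIG_IGN);
        signal(SIGSEGV, SIG_IGN);
        v = *bad;                   /* the doomed load */
        _exit(v & 1);
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Comparing the return value against SIGALRM distinguishes the looping behavior reported at the top of the thread from the kill-on-fault behavior described here.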