Silly 100% CPU behavior on a SIG

linux-mips.vger.kernel.org archive mirror

* Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
@ 2009-06-23 21:45 Kaz Kylheku
  2009-06-23 21:45 ` Kaz Kylheku
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Kaz Kylheku @ 2009-06-23 21:45 UTC (permalink / raw)
  To: linux-mips

Hi all,

On kernel 2.6.26, glibc 2.5 (n32), SiByte SB-1 core, the following
program goes into 100% CPU, chewing up about 80% kernel time and
20% user.

#include <stdio.h>
#include <signal.h>

int main(void)
{
  int *deadbeef = (int *) 0xdeadbeef;
  signal(SIGBUS, SIG_IGN);
  printf("*deadbeef == %d\n", *deadbeef);
  return 0;
}

If any fatal exception is ignored, the program should be killed
if that exception happens. 100% CPU is not a useful response.

(If there is a handler, and that handler returns without doing anything to
prevent a recurrence of the exception when the instruction is re-tried,
that's different).

* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
  2009-06-23 21:45 Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS Kaz Kylheku
  2009-06-23 21:45 ` Kaz Kylheku
@ 2009-06-23 22:03 ` David Daney
  2009-06-25  3:39   ` Kaz Kylheku
  2009-06-23 22:44 ` Kevin D. Kissell
  2 siblings, 1 reply; 12+ messages in thread
From: David Daney @ 2009-06-23 22:03 UTC (permalink / raw)
  To: Kaz Kylheku; +Cc: linux-mips

Kaz Kylheku wrote:
> Hi all,
> 
> On kernel 2.6.26, glibc 2.5 (n32), SiByte SB-1 core, the following
> program goes into 100% CPU, chewing up about 80% kernel time and
> 20% user.
> 
> #include <stdio.h>
> #include <signal.h>
> 
> int main(void)
> {
>   int *deadbeef = (int *) 0xdeadbeef;
>   signal(SIGBUS, SIG_IGN);
>   printf("*deadbeef == %d\n", *deadbeef);
>   return 0;
> }
> 
> If any fatal exception is ignored, the program should be killed
> if that exception happens. 100% CPU is not a useful response.
> 
> (If there is a handler, and that handler returns without doing anything to
> prevent a recurrence of the exception when the instruction is re-tried,
> that's different).
> 
> 


I wonder if it is another instance of:

http://www.linux-mips.org/git?p=linux.git;a=commitdiff;h=49cf0e2d68dd98dbb28eaca0284e8460ab6ad86d

David Daney

* RE: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
  2009-06-23 22:03 ` David Daney
@ 2009-06-25  3:39   ` Kaz Kylheku
  2009-06-25  3:39     ` Kaz Kylheku
  0 siblings, 1 reply; 12+ messages in thread
From: Kaz Kylheku @ 2009-06-25  3:39 UTC (permalink / raw)
  To: linux-mips

On June 23, 2009 3:03 PM, David Daney [mailto:ddaney@caviumnetworks.com]

> I wonder if it is another instance of:
> 
> http://www.linux-mips.org/git?p=linux.git;a=commitdiff;h=49cf0e2d68dd98dbb28eaca0284e8460ab6ad86d

Thanks for digging that up. The patch fixes it for me.
Clearly, it's not just for init. Any process can
ignore signals that shouldn't be ignored.

I'm surprised this has only been discovered
so recently. It's amazing how we sometimes find
things at around the same time.

If I may pontificate now, someone said my little
program was silly. Of course; it's a repro test case! 
Nobody would deliberately ignore SIGBUS and
then deliberately trigger a bus error.

But this situation happened in a large and complex real
application, leaving some of my developers scratching
their heads, and distracting them from looking for
the bad pointer!

``Hey look, we can break this 100% CPU thread with gdb, but
it always stops on the same location, which is an
indirect load through a register containing a bad pointer!
And it's spinning mostly in the kernel. Hmm!''

I gave them that little program to demonstrate how the
behavior can occur. They are now working out how
SIGBUS came to be ignored, and, of course, the
cause of the bad pointer.

* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
  2009-06-23 21:45 Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS Kaz Kylheku
  2009-06-23 21:45 ` Kaz Kylheku
  2009-06-23 22:03 ` David Daney
@ 2009-06-23 22:44 ` Kevin D. Kissell
  2009-06-25 13:13   ` Ralf Baechle
  2 siblings, 1 reply; 12+ messages in thread
From: Kevin D. Kissell @ 2009-06-23 22:44 UTC (permalink / raw)
  To: Kaz Kylheku; +Cc: linux-mips

Kaz Kylheku wrote:
> Hi all,
>
> On kernel 2.6.26, glibc 2.5 (n32), SiByte SB-1 core, the following
> program goes into 100% CPU, chewing up about 80% kernel time and
> 20% user.
>
> #include <stdio.h>
> #include <signal.h>
>
> int main(void)
> {
>   int *deadbeef = (int *) 0xdeadbeef;
>   signal(SIGBUS, SIG_IGN);
>   printf("*deadbeef == %d\n", *deadbeef);
>   return 0;
> }
>
> If any fatal exception is ignored, the program should be killed
> if that exception happens. 100% CPU is not a useful response.
>   
It's not a useful program, so what did you expect?   One might argue 
that it would be more useful or correct to have the kernel advance the 
PC to not endlessly repeat the doomed load, but ignoring SIG_IGN and 
silently killing the thread violates the signal API as I've always 
understood it.

          Regards,

          Kevin K.

* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
  2009-06-23 22:44 ` Kevin D. Kissell
@ 2009-06-25 13:13   ` Ralf Baechle
  2009-06-25 13:45     ` Ralf Baechle
  0 siblings, 1 reply; 12+ messages in thread
From: Ralf Baechle @ 2009-06-25 13:13 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Kaz Kylheku, linux-mips

On Tue, Jun 23, 2009 at 03:44:29PM -0700, Kevin D. Kissell wrote:

>> int main(void)
>> {
>>   int *deadbeef = (int *) 0xdeadbeef;
>>   signal(SIGBUS, SIG_IGN);
>>   printf("*deadbeef == %d\n", *deadbeef);
>>   return 0;
>> }
>>
>> If any fatal exception is ignored, the program should be killed
>> if that exception happens. 100% CPU is not a useful response.
>>   
> It's not a useful program, so what did you expect?   One might argue  
> that it would be more useful or correct to have the kernel advance the  
> PC to not endlessly repeat the doomed load, but ignoring SIG_IGN and  
> silently killing the thread violates the signal API as I've always  
> understood it.

It's not a useful program, but it is valid as a test case.  I agree with
your interpretation of signal semantics, though I'll have to round up a copy
of the relevant standards documents; I have vague memories of some small
print covering cases like this.

  Ralf

* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
  2009-06-25 13:13   ` Ralf Baechle
@ 2009-06-25 13:45     ` Ralf Baechle
  2009-06-25 16:00       ` Kaz Kylheku
  0 siblings, 1 reply; 12+ messages in thread
From: Ralf Baechle @ 2009-06-25 13:45 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Kaz Kylheku, linux-mips

On Thu, Jun 25, 2009 at 02:13:00PM +0100, Ralf Baechle wrote:

> >> int main(void)
> >> {
> >>   int *deadbeef = (int *) 0xdeadbeef;
> >>   signal(SIGBUS, SIG_IGN);
> >>   printf("*deadbeef == %d\n", *deadbeef);
> >>   return 0;
> >> }
> >>
> >> If any fatal exception is ignored, the program should be killed
> >> if that exception happens. 100% CPU is not a useful response.
> >>   
> > It's not a useful program, so what did you expect?   One might argue  
> > that it would be more useful or correct to have the kernel advance the  
> > PC to not endlessly repeat the doomed load, but ignoring SIG_IGN and  
> > silently killing the thread violates the signal API as I've always  
> > understood it.
> 
> It's not a useful program but valid as a test case.  However I agree with
> your interpretation of signal semantics but I'll have to round up a copy
> of the relevant standard documents; I have vague memories about some small
> print for cases like this.

I found this in IRIX 6.5 documentation:

  Caution: Signals raised by the instruction stream, SIGILL, SIGEMT, SIGBUS,
           and SIGSEGV, will cause infinite loops if their handler returns,
           or the action is set to SIG_IGN. 

  Ralf

* RE: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
  2009-06-25 13:45     ` Ralf Baechle
@ 2009-06-25 16:00       ` Kaz Kylheku
  2009-06-25 16:00         ` Kaz Kylheku
  2009-06-26  0:45         ` Ralf Baechle
  0 siblings, 2 replies; 12+ messages in thread
From: Kaz Kylheku @ 2009-06-25 16:00 UTC (permalink / raw)
  To: Ralf Baechle, Kevin D. Kissell; +Cc: linux-mips

Ralf wrote:
> I found this in IRIX 6.5 documentation:
> 
>   Caution: Signals raised by the instruction stream, SIGILL, 
>   SIGEMT, SIGBUS, and SIGSEGV, will cause infinite loops
>   if their handler returns, or the action is set to SIG_IGN. 

The Single Unix Specification (Issue 6) marks the behavior
explicitly undefined.

Bookmark this: http://www.opengroup.org/onlinepubs/009695399

Not the latest set of documents, but that can be regarded
as a virtue. :)

Under pthread_sigmask and sigprocmask, for blocking:

  If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS
  signals are generated while they are blocked,
  the result is undefined, unless the signal
  was generated by the kill() function, the
  sigqueue() function, or the raise() function.

Under ``2.4 Signal Concepts'', for SIG_IGN:

  SIG_IGN 

  Ignore signal. 

  Delivery of the signal shall have no effect on
  the process. The behavior of a process is undefined
  after it ignores a SIGFPE, SIGILL, SIGSEGV,
  or SIGBUS signal that was not generated by kill(),
  sigqueue(), or raise().

So, as I suspected, there are in fact no requirements
from the applicable spec. Infinite looping or
stopping the process anyway are conforming responses,
as is rebooting or halting the machine with a
``panic'' message. 

* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
  2009-06-25 16:00       ` Kaz Kylheku
  2009-06-25 16:00         ` Kaz Kylheku
@ 2009-06-26  0:45         ` Ralf Baechle
  2009-06-26  0:52           ` David Daney
  1 sibling, 1 reply; 12+ messages in thread
From: Ralf Baechle @ 2009-06-26  0:45 UTC (permalink / raw)
  To: Kaz Kylheku; +Cc: Kevin D. Kissell, linux-mips

On Thu, Jun 25, 2009 at 09:00:19AM -0700, Kaz Kylheku wrote:

> Ralf wrote:
> > I found this in IRIX 6.5 documentation:
> > 
> >   Caution: Signals raised by the instruction stream, SIGILL, 
> >   SIGEMT, SIGBUS, and SIGSEGV, will cause infinite loops
> >   if their handler returns, or the action is set to SIG_IGN. 
> 
> The Single Unix Specification (Issue 6) marks the behavior
> explicitly undefined.

I should have mentioned that the above paragraph of the IRIX documentation
was in the section on implementation-specific behaviour.

> Bookmark this: http://www.opengroup.org/onlinepubs/009695399
> 
> Not the latest set of documents, but that can be regarded
> as a virtue. :)
> 
> Under pthread_sigmask and sigprocmask, for blocking:
> 
>   If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS
>   signals are generated while they are blocked,
>   the result is undefined, unless the signal
>   was generated by the kill() function, the
>   sigqueue() function, or the raise() function.
> 
> Under ``2.4 Signal Concepts'', for SIG_IGN:
> 
>   SIG_IGN 
> 
>   Ignore signal. 
> 
>   Delivery of the signal shall have no effect on
>   the process. The behavior of a process is undefined
>   after it ignores a SIGFPE, SIGILL, SIGSEGV,
>   or SIGBUS signal that was not generated by kill(),
>   sigqueue(), or raise().
> 
> So, as I suspected, there are in fact no requirements
> from the applicable spec. Infinite looping or
> stopping the process anyway are conforming responses,
> as is rebooting or halting the machine with a
> ``panic'' message. 

I'd not go quite as far as that but execve("/usr/bin/nethack") certainly
would be acceptable.

  Ralf

* Re: Silly 100% CPU behavior on a SIG_IGN-ored SIGBUS.
  2009-06-26  0:45         ` Ralf Baechle
@ 2009-06-26  0:52           ` David Daney
  0 siblings, 0 replies; 12+ messages in thread
From: David Daney @ 2009-06-26  0:52 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Kaz Kylheku, Kevin D. Kissell, linux-mips

Ralf Baechle wrote:
> On Thu, Jun 25, 2009 at 09:00:19AM -0700, Kaz Kylheku wrote:
> 
>> Ralf wrote:
>>> I found this in IRIX 6.5 documentation:
>>>
>>>   Caution: Signals raised by the instruction stream, SIGILL, 
>>>   SIGEMT, SIGBUS, and SIGSEGV, will cause infinite loops
>>>   if their handler returns, or the action is set to SIG_IGN. 
>> The Single Unix Specification (Issue 6) marks the behavior
>> explicitly undefined.
> 
> I should have mentioned that the above paragraph of the IRIX documentation
> was in the section on implementation-specific behaviour.
> 
>> Bookmark this: http://www.opengroup.org/onlinepubs/009695399
>>
>> Not the latest set of documents, but that can be regarded
>> as a virtue. :)
>>
>> Under pthread_sigmask and sigprocmask, for blocking:
>>
>>   If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS
>>   signals are generated while they are blocked,
>>   the result is undefined, unless the signal
>>   was generated by the kill() function, the
>>   sigqueue() function, or the raise() function.
>>
>> Under ``2.4 Signal Concepts'', for SIG_IGN:
>>
>>   SIG_IGN 
>>
>>   Ignore signal. 
>>
>>   Delivery of the signal shall have no effect on
>>   the process. The behavior of a process is undefined
>>   after it ignores a SIGFPE, SIGILL, SIGSEGV,
>>   or SIGBUS signal that was not generated by kill(),
>>   sigqueue(), or raise().
>>
>> So, as I suspected, there are in fact no requirements
>> from the applicable spec. Infinite looping or
>> stopping the process anyway are conforming responses,
>> as is rebooting or halting the machine with a
>> ``panic'' message. 
> 
> I'd not go quite as far as that but execve("/usr/bin/nethack") certainly
> would be acceptable.

It is kind of moot at this point.  The HEAD now kills the process 
instead of looping forever.

David Daney
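
To check which of the two behaviors a given kernel shows without tying up a
CPU, the original test case can be run in a forked child with an alarm() as
a safety net. This is only a sketch; the function name
ignored_fault_outcome and the timeout parameter are hypothetical.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the ignored-fault test case in a child process.
 * Returns the number of the signal that terminated the child,
 * 0 if it exited normally, or -1 on error. A result of SIGALRM
 * means the child was still running when the timeout expired. */
int ignored_fault_outcome(unsigned timeout_sec)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        volatile int *bad = (volatile int *) 0xdeadbeefUL;
        int v;

        signal(SIGALRM, SIG_DFL);   /* make sure the alarm can kill us */
        signal(SIGBUS, SIG_IGN);
        signal(SIGSEGV, SIG_IGN);   /* the fault may be reported as either */
        alarm(timeout_sec);
        v = *bad;                   /* the doomed load */
        (void) v;
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a kernel that kills the process, one would expect the result to be the
fault signal itself (SIGBUS or SIGSEGV, depending on the platform); a
looping kernel would instead report SIGALRM once the timeout fires.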
