Re: Dumb question: Why are exceptions such as SIGSEGV not logged

public inbox for linux-kernel@vger.kernel.org

* Re: Dumb question: Why are exceptions such as SIGSEGV not logged
@ 2003-08-18 20:50 Hank Leininger
  2003-08-18 21:02 ` Mike Fedyk
  0 siblings, 1 reply; 12+ messages in thread
From: Hank Leininger @ 2003-08-18 20:50 UTC (permalink / raw)
  To: linux-kernel

On 2003-08-18, Michael Frank <mhf () linuxmail ! org> wrote:

> I tend to see segfaults only when something is broken or when my lapse
> of attention perhaps should be rewarded by said "sucker rod".

As others have said, some apps use "interesting" signals in normal operation.
Probably the most common example is vmware, which sends itself SIGSEGV
all the time (at startup, at least) as part of its memory-management foo:

Aug 12 14:11:23 foo kernel: grsec: signal 11 sent to (vmware-ui:12180) \
	UID(XXXX) EUID(XXXX), parent (vmware:17653) UID(XXXX) EUID(XXXX)
Aug 12 14:11:23 foo kernel: grsec: signal 11 sent to (vmware-mks:25238) \
	UID(XXXX) EUID(XXXX), parent (vmware:17653) UID(XXXX) EUID(XXXX)
Aug 12 14:11:23 foo kernel: grsec: signal 11 sent to (vmware:17653) \
	UID(XXXX) EUID(XXXX), parent (bash:2883) UID(XXXX) EUID(XXXX)

..So not *all* such cases are cause for alarm.  However, if you run one of
the patches enabling logging of this, you quickly learn what's normal for
the apps you run, and can teach your log-auditing tools and/or your brain
to ignore them.

--
Hank Leininger <hlein@progressive-comp.com> 

* Re: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-18 20:50 Dumb question: Why are exceptions such as SIGSEGV not logged Hank Leininger
@ 2003-08-18 21:02 ` Mike Fedyk
  2003-08-18 21:18   ` Hank Leininger
                     ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Mike Fedyk @ 2003-08-18 21:02 UTC (permalink / raw)
  To: Hank Leininger; +Cc: linux-kernel

On Mon, Aug 18, 2003 at 04:50:49PM -0400, Hank Leininger wrote:
> ..So not *all* such cases are cause for alarm.  However, if you run one of
> the patches enabling logging of this, you quickly learn what's normal for
> the apps you run, and can teach your log-auditing tools and/or your brain
> to ignore them.

And why not just catch the ones sent from the kernel?  That's the one that
is killing the program because it crashed, and that's the one the original
poster wants logged...

* Re: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-18 21:02 ` Mike Fedyk
@ 2003-08-18 21:18   ` Hank Leininger
  2003-08-18 21:25     ` Mike Fedyk
  2003-08-18 22:12   ` William Lee Irwin III
  2003-08-18 22:39   ` David Schwartz
  2 siblings, 1 reply; 12+ messages in thread
From: Hank Leininger @ 2003-08-18 21:18 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 18 Aug 2003, Mike Fedyk wrote:

> On Mon, Aug 18, 2003 at 04:50:49PM -0400, Hank Leininger wrote:
> > ..So not *all* such cases are cause for alarm.  However, if you run one of
> > the patches enabling logging of this, you quickly learn what's normal for
> > the apps you run, and can teach your log-auditing tools and/or your brain
> > to ignore them.
>
> And why not just catch the ones sent from the kernel?  That's the one that
> is killing the program because it crashed,

Well, in my case at least, because if a network-listening daemon fell
over with SIGSEGV, SIGILL, etc., I most definitely wanted to know about
it.  But you certainly could make a patch to do only that; it'd be
lower impact and less controversial, but probably still not accepted into
mainline (just a guess).

> and that's the one the original poster wants logged...

Hm, I see Thar Filipau bringing that up specifically, and it does seem
like something that ought to generate some logs.  (But I thought they
should already generate oopses?  Apparently not.)  The OP seemed to be
concerned with any SIGSEGV and SIGILL signals, not just in-kernel ones?

Hank Leininger <hlein@progressive-comp.com>
E407 AEF4 761E D39C D401  D4F4 22F8 EF11 861A A6F1
-----BEGIN PGP SIGNATURE-----

iD8DBQE/QUKRIvjvEYYapvERAn28AJ9ELPYOXKOfcIjvzV88BRzOfde1mACfRbOx
zngdpycDsO4FZgcrilGRMQU=
=X+3w
-----END PGP SIGNATURE-----

* Re: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-18 21:18   ` Hank Leininger
@ 2003-08-18 21:25     ` Mike Fedyk
  0 siblings, 0 replies; 12+ messages in thread
From: Mike Fedyk @ 2003-08-18 21:25 UTC (permalink / raw)
  To: Hank Leininger; +Cc: linux-kernel

On Mon, Aug 18, 2003 at 05:18:09PM -0400, Hank Leininger wrote:
> Hm, I see Thar Filipau bringing that up specifically, and it does seem
> like something that ought to generate some logs.  (But I thought they
> should already generate oopses?  Apparently not.)  The OP seemed to be
> concerned with any SIGSEGV and SIGILL signals, not just in-kernel ones?

No, the crashes are in userspace apps, not the kernel.  But when they
crash they get sent a signal from the kernel.  That is what needs to be
logged, not the signals an app might send to itself.

* Re: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-18 21:02 ` Mike Fedyk
  2003-08-18 21:18   ` Hank Leininger
@ 2003-08-18 22:12   ` William Lee Irwin III
  2003-08-18 22:39   ` David Schwartz
  2 siblings, 0 replies; 12+ messages in thread
From: William Lee Irwin III @ 2003-08-18 22:12 UTC (permalink / raw)
  To: Hank Leininger, linux-kernel

On Mon, Aug 18, 2003 at 04:50:49PM -0400, Hank Leininger wrote:
>> ..So not *all* such cases are cause for alarm.  However, if you run one of
>> the patches enabling logging of this, you quickly learn what's normal for
>> the apps you run, and can teach your log-auditing tools and/or your brain
>> to ignore them.

On Mon, Aug 18, 2003 at 02:02:38PM -0700, Mike Fedyk wrote:
> And why not just catch the ones sent from the kernel?  That's the one that
> is killing the program because it crashed, and that's the one the original
> poster wants logged...

They're almost all sent by the kernel. Very few represent kill(1).


-- wli

* RE: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-18 21:02 ` Mike Fedyk
  2003-08-18 21:18   ` Hank Leininger
  2003-08-18 22:12   ` William Lee Irwin III
@ 2003-08-18 22:39   ` David Schwartz
  2003-08-18 22:44     ` Mike Fedyk
  2003-08-19  6:54     ` Denis Vlasenko
  2 siblings, 2 replies; 12+ messages in thread
From: David Schwartz @ 2003-08-18 22:39 UTC (permalink / raw)
  To: Mike Fedyk, Hank Leininger; +Cc: linux-kernel


> And why not just catch the ones sent from the kernel?  That's the one that
> is killing the program because it crashed, and that's the one the original
> poster wants logged...

	Because sometimes a program wants to terminate. And it is perfectly legal
for a programmer who needs to terminate his program as quickly as possible
to do this:

char *j=NULL;
signal(SIGSEGV, SIG_DFL);
*j++;

	This is a perfectly sensible thing for a program to do with well-defined
semantics. If a program wants to create a child every minute like this and
kill it, that's perfectly fine. We should be able to do that in the default
configuration without a sysadmin complaining that we're DoSing his syslogs.

	DS




* Re: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-18 22:39   ` David Schwartz
@ 2003-08-18 22:44     ` Mike Fedyk
  2003-08-18 22:53       ` David Schwartz
  2003-08-19  6:54     ` Denis Vlasenko
  1 sibling, 1 reply; 12+ messages in thread
From: Mike Fedyk @ 2003-08-18 22:44 UTC (permalink / raw)
  To: David Schwartz; +Cc: Hank Leininger, linux-kernel

On Mon, Aug 18, 2003 at 03:39:15PM -0700, David Schwartz wrote:
> 
> > And why not just catch the ones sent from the kernel?  That's the one that
> > is killing the program because it crashed, and that's the one the original
> > poster wants logged...
> 
> 	Because sometimes a program wants to terminate. And it is perfectly legal
> for a programmer who needs to terminate his program as quickly as possible
> to do this:
> 
> char *j=NULL;
> signal(SIGSEGV, SIG_DFL);
> *j++;
> 
> 	This is a perfectly sensible thing for a program to do with well-defined
> semantics. If a program wants to create a child every minute like this and
> kill it, that's perfectly fine. We should be able to do that in the default
> configuration without a sysadmin complaining that we're DoSing his syslogs.

Are you saying that a signal requested from userspace uses the same code
path as the signal sent when a process has overstepped its bounds?

Surely some flag can be set so that we know the kernel is killing it because
it did something illegal...

* RE: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-18 22:44     ` Mike Fedyk
@ 2003-08-18 22:53       ` David Schwartz
  0 siblings, 0 replies; 12+ messages in thread
From: David Schwartz @ 2003-08-18 22:53 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: Hank Leininger, linux-kernel



> On Mon, Aug 18, 2003 at 03:39:15PM -0700, David Schwartz wrote:

> > > And why not just catch the ones sent from the kernel?  That's the one that
> > > is killing the program because it crashed, and that's the one the original
> > > poster wants logged...

> > 	Because sometimes a program wants to terminate. And it is perfectly legal
> > for a programmer who needs to terminate his program as quickly as possible
> > to do this:

> > char *j=NULL;
> > signal(SIGSEGV, SIG_DFL);
> > *j++;

> > 	This is a perfectly sensible thing for a program to do with well-defined
> > semantics. If a program wants to create a child every minute like this and
> > kill it, that's perfectly fine. We should be able to do that in the default
> > configuration without a sysadmin complaining that we're DoSing his syslogs.

> Are you saying that a signal requested from userspace uses the same code
> path as the signal sent when a process has overstepped its bounds?

	It depends what you mean by "requested".

> Surely some flag can be set so that we know the kernel is killing it because
> it did something illegal...

	It depends what you mean by "illegal".

	Dereferencing a NULL pointer deliberately to induce the kernel to kill your
process is indistinguishable from dereferencing a NULL pointer accidentally
and forcing the kernel to kill your process.

	These "illegal" operations have well-defined semantics that programmers can
use and rely on. Logging every such operation changes their semantics and
breaks programs that currently work -- breaks in the sense that they will
now DoS logs and result in admin complaints.

	The kernel cannot determine whether a SEGV or ILL was the result of a
deliberate attempt on the part of the programmer to create such a signal or
whether it's due to a programming error. Even an uncaught exception can be
used as a good way to terminate a process immediately (is there another
portable way to do that?).

	DS



* Re: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-18 22:39   ` David Schwartz
  2003-08-18 22:44     ` Mike Fedyk
@ 2003-08-19  6:54     ` Denis Vlasenko
  2003-08-19  8:31       ` Proposal (was: Why are exceptions such as SIGSEGV not logged) Jakob Oestergaard
                         ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Denis Vlasenko @ 2003-08-19  6:54 UTC (permalink / raw)
  To: David Schwartz, Mike Fedyk, Hank Leininger; +Cc: linux-kernel

On 19 August 2003 01:39, David Schwartz wrote:
> > And why not just catch the ones sent from the kernel?  That's the one that
> > is killing the program because it crashed, and that's the one the original
> > poster wants logged...
> 
> 	Because sometimes a program wants to terminate. And it is perfectly legal
> for a programmer who needs to terminate his program as quickly as possible
> to do this:
> 
> char *j=NULL;
> signal(SIGSEGV, SIG_DFL);
> *j++;
> 
> 	This is a perfectly sensible thing for a program to do with well-defined
> semantics. If a program wants to create a child every minute like this and
> kill it, that's perfectly fine. We should be able to do that in the default
> configuration without a sysadmin complaining that we're DoSing his syslogs.

I disagree. _exit(2) is the most sensible way to terminate.

Logging kernel-induced SEGVs and ILLs is definitely a help when you hunt
daemons mysteriously crashing. This outweighs the DoS hazard.
--
vda

* Proposal (was: Why are exceptions such as SIGSEGV not logged)
  2003-08-19  6:54     ` Denis Vlasenko
@ 2003-08-19  8:31       ` Jakob Oestergaard
  2003-08-19 14:52       ` Dumb question: Why are exceptions such as SIGSEGV not logged Valdis.Kletnieks
  2003-08-19 18:51       ` David Schwartz
  2 siblings, 0 replies; 12+ messages in thread
From: Jakob Oestergaard @ 2003-08-19  8:31 UTC (permalink / raw)
  To: Denis Vlasenko; +Cc: David Schwartz, Mike Fedyk, Hank Leininger, linux-kernel

On Tue, Aug 19, 2003 at 09:54:17AM +0300, Denis Vlasenko wrote:
> On 19 August 2003 01:39, David Schwartz wrote:
...[snip]...
> > 	This is a perfectly sensible thing for a program to do with well-defined
> > semantics. If a program wants to create a child every minute like this and
> > kill it, that's perfectly fine. We should be able to do that in the default
> > configuration without a sysadmin complaining that we're DoSing his syslogs.
> 
> I disagree. _exit(2) is the most sensible way to terminate.
> 
> Logging kernel-induced SEGVs and ILLs is definitely a help when you hunt
> daemons mysteriously crashing. This outweighs the DoS hazard.

Ok guys - we will never come to an agreement on what would be the
sensible thing to do.

For good reasons, too: the purposes and uses of the systems out there,
and the minds of the people administering them, will be as different as
anything.

This reminds me of the "core naming wars", the "vm overcommit wars", and
other "big" (in the minds of people) issues that were solved to
everyone's satisfaction with an entry in /proc.

May I suggest:
  /proc/sys/kernel/log_signals

Semantics:  Numbers can be written to log_signals - these are signal
numbers that will cause a log entry to be written, when the given signal
is delivered. The file can be read, in which case it will list the
signal numbers that cause log entries to be written.

Examples:

]$ cat /proc/sys/kernel/log_signals
   4
   7
]$ echo +15 > /proc/sys/kernel/log_signals
]$ cat /proc/sys/kernel/log_signals
   4
   7
   15
]$ echo -4 > /proc/sys/kernel/log_signals
]$ cat /proc/sys/kernel/log_signals
   7
   15
]$

Possible extension:

]$ echo '*' > /proc/sys/kernel/log_signals
]$ cat /proc/sys/kernel/log_signals
 ... lists all signals ...
]$ echo '-*' > /proc/sys/kernel/log_signals
]$ cat /proc/sys/kernel/log_signals
]$

In my opinion it does not make sense to distinguish between signals
sent from process to process and those sent from kernel to process.  Some
garbage collectors, for example, depend on the kernel sending SIGSEGV and
do their own handling of it, while for many other processes that
situation indicates a problem.  Better to handle that kind of thing in
user-space log auditing tools.

An implementation of the above is left as an exercise for the reader  :)

Comments?

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

* Re: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-19  6:54     ` Denis Vlasenko
  2003-08-19  8:31       ` Proposal (was: Why are exceptions such as SIGSEGV not logged) Jakob Oestergaard
@ 2003-08-19 14:52       ` Valdis.Kletnieks
  2003-08-19 18:51       ` David Schwartz
  2 siblings, 0 replies; 12+ messages in thread
From: Valdis.Kletnieks @ 2003-08-19 14:52 UTC (permalink / raw)
  To: vda; +Cc: linux-kernel

On Tue, 19 Aug 2003 09:54:17 +0300, Denis Vlasenko said:

> > char *j=NULL;
> > signal(SIGSEGV, SIG_DFL);
> > *j++;

> I disagree. _exit(2) is the most sensible way to terminate.

Not if you want it *dead*, *now*, with a core dump, and with minimal disruption
of program state.  Sometimes (especially when trying to shoot a race condition)
you just can't run the program under gdb - and if it calls _exit() there's not much
wreckage left for gdb to look at....

> Logging kernel-induced SEGVs and ILLs is definitely a help when you hunt
> daemons mysteriously crashing. This outweighs the DoS hazard.

Well, I can *see* the fact it exited with a signal in 'lastcomm' already.  If that's all
the info you're providing, it's of no help.

Now, if you figure out how to read the module's -g data and give me a line number
it died at:

	kprint(DEBUG "Process %d (%s) died on  signal %d at line %d of function %s", ....

but that would involve a lot of file I/O from kernelspace, soo.....


* RE: Dumb question: Why are exceptions such as SIGSEGV not logged
  2003-08-19  6:54     ` Denis Vlasenko
  2003-08-19  8:31       ` Proposal (was: Why are exceptions such as SIGSEGV not logged) Jakob Oestergaard
  2003-08-19 14:52       ` Dumb question: Why are exceptions such as SIGSEGV not logged Valdis.Kletnieks
@ 2003-08-19 18:51       ` David Schwartz
  2 siblings, 0 replies; 12+ messages in thread
From: David Schwartz @ 2003-08-19 18:51 UTC (permalink / raw)
  To: vda, Mike Fedyk, Hank Leininger; +Cc: linux-kernel


> On 19 August 2003 01:39, David Schwartz wrote:

> > > And why not just catch the ones sent from the kernel?  That's the one that
> > > is killing the program because it crashed, and that's the one the original
> > > poster wants logged...

> > 	Because sometimes a program wants to terminate. And it is perfectly legal
> > for a programmer who needs to terminate his program as quickly as possible
> > to do this:
> >
> > char *j=NULL;
> > signal(SIGSEGV, SIG_DFL);
> > *j++;

> > 	This is a perfectly sensible thing for a program to do with well-defined
> > semantics. If a program wants to create a child every minute like this and
> > kill it, that's perfectly fine. We should be able to do that in the default
> > configuration without a sysadmin complaining that we're DoSing his syslogs.

> I disagree. _exit(2) is the most sensible way to terminate.

	Read the documentation for _exit. You will see that it is useless in the
case of a portable program that needs to terminate as quickly as possible
and, in fact, isn't guaranteed to cause program termination at all:

       The function _exit is like exit(), but does not call any
       functions registered with the ANSI C atexit function, nor
       any registered signal handlers. Whether it flushes standard
       I/O buffers and removes temporary files created with
       tmpfile(3) is implementation-dependent. On the other hand,
       _exit does close open file descriptors, and this may cause
       an unknown delay, waiting for pending output to finish. If
       the delay is undesired, it may be useful to call functions
       like tcflush() before calling _exit(). Whether any pending
       I/O is cancelled, and which pending I/O may be cancelled
       upon _exit(), is implementation-dependent.

	One major problem with _exit() is that it touches various structures. If
the program's execution environment is no longer trusted, calling _exit()
can cause an endless loop. In multithreaded programs, _exit() may need to
acquire mutexes. This can take an indeterminate amount of time. Portable
programs cannot rely on _exit() in a case where they need to terminate as
soon as possible.

	Now, if you have a better way for a portable program that needs to
terminate immediately to do so, that's fine, tell me what it is. Otherwise,
you are *forcing* people to DoS your syslog.

	DS


