linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] exec: log when wait_for_dump_helpers aborts due to a signal
@ 2011-10-26  1:07 Mandeep Singh Baines
  2011-10-26 16:54 ` Oleg Nesterov
  2011-10-28 23:26 ` Andrew Morton
  0 siblings, 2 replies; 5+ messages in thread
From: Mandeep Singh Baines @ 2011-10-26  1:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mandeep Singh Baines, Alexander Viro, Neil Horman, Earl Chew,
	Oleg Nesterov, Andi Kleen, Alan Cox, Andrew Morton, linux-fsdevel

To allow coredump pipe readers to look at /proc/<pid> of the crashing
process, the kernel waits for the reader to exit. However, the wait
is silently aborted if the crashing process is signalled.

This patch, logs whenever wait_for_dump_helpers is aborted or in order
to assist in debugging cases where /proc/<pid> is gone.

Alternatively, we may want to consider not aborting on a signal. You
could always break the loop by killing the reader process.

Reference: http://crosbug.com/21559
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Earl Chew <earl_chew@agilent.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/exec.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 25dcbe5..5d4190d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -2030,7 +2030,14 @@ static void wait_for_dump_helpers(struct file *file)
 	pipe->readers++;
 	pipe->writers--;
 
-	while ((pipe->readers > 1) && (!signal_pending(current))) {
+	while (pipe->readers > 1) {
+		if (signal_pending(current)) {
+			pr_info("wait_for_dump_helpers[%d]: "
+				"aborted due to signal\n",
+				task_pid_nr(current));
+			break;
+		}
+
 		wake_up_interruptible_sync(&pipe->wait);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 		pipe_wait(pipe);
-- 
1.7.3.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] exec: log when wait_for_dump_helpers aborts due to a signal
  2011-10-26  1:07 [PATCH] exec: log when wait_for_dump_helpers aborts due to a signal Mandeep Singh Baines
@ 2011-10-26 16:54 ` Oleg Nesterov
  2011-10-28 23:26 ` Andrew Morton
  1 sibling, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2011-10-26 16:54 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Alexander Viro, Neil Horman, Earl Chew, Andi Kleen,
	Alan Cox, Andrew Morton, linux-fsdevel

On 10/25, Mandeep Singh Baines wrote:
>
> To allow coredump pipe readers to look at /proc/<pid> of the crashing
> process, the kernel waits for the reader to exit. However, the wait
> is silently aborted if the crashing process is signalled.

Well, yes... But note that this signal_pending() is only used because
with TIF_SIGPENDING we have the problems anyway.

Oh. I promise myself I'll make do_coredump() killable and cleanup this
all every time I look at this code...

> This patch, logs whenever wait_for_dump_helpers is aborted or in order
> to assist in debugging cases where /proc/<pid> is gone.

I don't really understand why this is useful. The reader process
can complain if it can't collect the data (say, /proc/pid goes away
or EOF doesn't come).



As for the patch itself,

> -	while ((pipe->readers > 1) && (!signal_pending(current))) {
> +	while (pipe->readers > 1) {
> +		if (signal_pending(current)) {
> +			pr_info("wait_for_dump_helpers[%d]: "
> +				"aborted due to signal\n",
> +				task_pid_nr(current));
> +			break;
> +		}
> +
>  		wake_up_interruptible_sync(&pipe->wait);

This can't help in general. If signal_pending() == T, it is quite
possible that pipe_write() already failed before.

Oleg.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] exec: log when wait_for_dump_helpers aborts due to a signal
  2011-10-26  1:07 [PATCH] exec: log when wait_for_dump_helpers aborts due to a signal Mandeep Singh Baines
  2011-10-26 16:54 ` Oleg Nesterov
@ 2011-10-28 23:26 ` Andrew Morton
  2011-10-29 12:42   ` Neil Horman
  2011-10-29 14:43   ` Oleg Nesterov
  1 sibling, 2 replies; 5+ messages in thread
From: Andrew Morton @ 2011-10-28 23:26 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Alexander Viro, Neil Horman, Earl Chew,
	Oleg Nesterov, Andi Kleen, Alan Cox, linux-fsdevel

On Tue, 25 Oct 2011 18:07:08 -0700
Mandeep Singh Baines <msb@chromium.org> wrote:

> To allow coredump pipe readers to look at /proc/<pid> of the crashing
> process, the kernel waits for the reader to exit. However, the wait
> is silently aborted if the crashing process is signalled.
> 
> This patch, logs whenever wait_for_dump_helpers is aborted or in order
> to assist in debugging cases where /proc/<pid> is gone.

You don't really describe what problem you're observing.  What's the
use case?  What are you trying to do?  etc.

Because if that is known, we can perhaps find better solutions.

> Alternatively, we may want to consider not aborting on a signal. You
> could always break the loop by killing the reader process.

Well.  Neil's changelog for 61be228a06dc6e8662 is quite nice and tells
us everything we could possibly want to know, except for why it tests
sgnal_pending() :(

Neil, Oleg: can you remember?

> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -2030,7 +2030,14 @@ static void wait_for_dump_helpers(struct file *file)
>  	pipe->readers++;
>  	pipe->writers--;
>  
> -	while ((pipe->readers > 1) && (!signal_pending(current))) {
> +	while (pipe->readers > 1) {
> +		if (signal_pending(current)) {
> +			pr_info("wait_for_dump_helpers[%d]: "
> +				"aborted due to signal\n",
> +				task_pid_nr(current));
> +			break;
> +		}
> +

argh, printk("i screwed up").  Hopefully we can find a better solution
to whatever-your-problem is than this!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] exec: log when wait_for_dump_helpers aborts due to a signal
  2011-10-28 23:26 ` Andrew Morton
@ 2011-10-29 12:42   ` Neil Horman
  2011-10-29 14:43   ` Oleg Nesterov
  1 sibling, 0 replies; 5+ messages in thread
From: Neil Horman @ 2011-10-29 12:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mandeep Singh Baines, linux-kernel, Alexander Viro, Earl Chew,
	Oleg Nesterov, Andi Kleen, Alan Cox, linux-fsdevel

On Fri, Oct 28, 2011 at 04:26:31PM -0700, Andrew Morton wrote:
> On Tue, 25 Oct 2011 18:07:08 -0700
> Mandeep Singh Baines <msb@chromium.org> wrote:
> 
> > To allow coredump pipe readers to look at /proc/<pid> of the crashing
> > process, the kernel waits for the reader to exit. However, the wait
> > is silently aborted if the crashing process is signalled.
> > 
> > This patch, logs whenever wait_for_dump_helpers is aborted or in order
> > to assist in debugging cases where /proc/<pid> is gone.
> 
> You don't really describe what problem you're observing.  What's the
> use case?  What are you trying to do?  etc.
> 
> Because if that is known, we can perhaps find better solutions.
> 
> > Alternatively, we may want to consider not aborting on a signal. You
> > could always break the loop by killing the reader process.
> 
> Well.  Neil's changelog for 61be228a06dc6e8662 is quite nice and tells
> us everything we could possibly want to know, except for why it tests
> sgnal_pending() :(
> 
> Neil, Oleg: can you remember?
> 
I _think_ it was a way to detect if the helper was still running, or had become
stuck.  I.e. we send the process a SIGIO, and if on the next iteration its still
pending, we can assume something has gone wrong, and abort the dump.  Of course
my memory could be faulty, Oleg do you have better recollection than I?

I'm heading out for a long weekend, but when I get back, I'll dig through the
archives for a better answer.  I recall Oleg and I talking at length about this
code.

> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -2030,7 +2030,14 @@ static void wait_for_dump_helpers(struct file *file)
> >  	pipe->readers++;
> >  	pipe->writers--;
> >  
> > -	while ((pipe->readers > 1) && (!signal_pending(current))) {
> > +	while (pipe->readers > 1) {
> > +		if (signal_pending(current)) {
> > +			pr_info("wait_for_dump_helpers[%d]: "
> > +				"aborted due to signal\n",
> > +				task_pid_nr(current));
> > +			break;
> > +		}
> > +
> 
> argh, printk("i screwed up").  Hopefully we can find a better solution
> to whatever-your-problem is than this!
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] exec: log when wait_for_dump_helpers aborts due to a signal
  2011-10-28 23:26 ` Andrew Morton
  2011-10-29 12:42   ` Neil Horman
@ 2011-10-29 14:43   ` Oleg Nesterov
  1 sibling, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2011-10-29 14:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mandeep Singh Baines, linux-kernel, Alexander Viro, Neil Horman,
	Earl Chew, Andi Kleen, Alan Cox, linux-fsdevel

On 10/28, Andrew Morton wrote:
>
> Well.  Neil's changelog for 61be228a06dc6e8662 is quite nice and tells
> us everything we could possibly want to know, except for why it tests
> sgnal_pending() :(

In short, signal_pending() should not be here. It only reflects the
fact that do_coredump() needs the fixes (and can't resist, I sent the
patch several years ago, but it was ignored ;)

There are 2 reasons. if signal_pending() == T then:

	- pipe_wait() is pointless, it won't block. We do not want
	  a busywait loop.

	- And probably even wait_for_dump_helpers() is pointless,
	  it is quite possible that pipe_write() already failed
	  before and the reader doesn't know this.

What I think we should do:

	- Fix this code, it should not react to signals.

	- But! at the same time the explicit SIGKILL should stop
	  the coredump. It can take a lot of time/resources.

	  This also makes it oom-killable, and this is important.

	- If we dump to the pipe, then perhaps it makes sense to
	  send a signal to the pipe reader in the latter case, but
	  this is a bit offtopic.

I'll try to redo my old patches for 3.2 once I have the time. There
are some nasty problems which I forgot, _iirc_ this is not that
trivial.

Oleg.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-10-29 14:43 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-26  1:07 [PATCH] exec: log when wait_for_dump_helpers aborts due to a signal Mandeep Singh Baines
2011-10-26 16:54 ` Oleg Nesterov
2011-10-28 23:26 ` Andrew Morton
2011-10-29 12:42   ` Neil Horman
2011-10-29 14:43   ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).